Metric embeddings and geometric inequalities

Lectures by Professor Assaf Naor Scribes: Holden Lee and Kiran Vodrahalli

April 27, 2016

Contents

1 The Ribe Program
   1 The Ribe Program
   2 Bourgain's Theorem implies Ribe's Theorem
      2.1 Bourgain's discretization theorem
      2.2 Bourgain's Theorem implies Ribe's Theorem

2 Restricted invertibility principle
   1 Restricted invertibility principle
      1.1 The first restricted invertibility principles
      1.2 A general restricted invertibility principle
   2 Ky Fan maximum principle
   3 Finding big subsets
      3.1 Little Grothendieck inequality
         3.1.1 Tightness of Grothendieck's inequality
      3.2 Pietsch Domination Theorem
      3.3 A projection bound
      3.4 Sauer-Shelah Lemma
   4 Proof of RIP
      4.1 Step 1
      4.2 Step 2

3 Bourgain's Discretization Theorem
   1 Outline
      1.1 Reduction to smooth norm
   2 Bourgain's almost extension theorem
      2.1 Lipschitz extension problem
      2.2 John ellipsoid
      2.3 Proof
   3 Proof of Bourgain's discretization theorem
      3.1 The Poisson semigroup
   4 Upper bound on $\delta_{X\hookrightarrow Y}$
      4.1 Fourier analysis on the Boolean cube and Enflo's inequality
   5 Improvement for $L_p$
   6 Kirszbraun's Extension Theorem

4 Grothendieck's Inequality
   1 Grothendieck's Inequality
   2 Grothendieck's Inequality for Graphs (by Arka and Yuval)
   3 Noncommutative Version of Grothendieck's Inequality (Thomas and Fred)
   4 Improving the Grothendieck constant
   5 Application to Fourier series
   6 ? presentation

5 Lipschitz extension
   1 Introduction and Problem Setup
      1.1 Why do we care?
      1.2 Outline of the Approach
   2 $r$-Magnification
   3 Wasserstein-1 norm
   4 Graph theoretic lemmas
      4.1 Expanders
      4.2 An Application of Menger's Theorem
   5 Main Proof

Introduction

The topic of this course is geometric inequalities with applications to metric embeddings; and we are actually going to do things more general than metric embeddings: metric geometry. Today I will roughly explain what I want to cover and hopefully start proving a first major theorem.

The strategy for this course is to teach novel work. Sometimes topics will be covered in textbooks, but a lot of these things will be a few weeks old, and there are also some things which may not have even been written yet. I want to give you a taste for what's going on in the field. Notes from the course I taught last spring are also available.

One of my main guidelines in choosing topics will be topics that have many accessible open questions; I will mention open questions as we go along. I'm also going to choose topics whose proofs are entirely self-contained. I'm trying to assume nothing; my intention is to make everything completely clear, and there should be no steps you don't understand.

Now this is a huge area. I will explain some of the background today. I'm going to use proofs of some major theorems as excuses to see the lemmas that go into the proofs. Some of the theorems are very famous and major, and you're going to see some improvements, but along the way we will see some lemmas which are immensely powerful. So we will always be proving a concrete theorem, but somewhere along the way there will be lemmas with wide applicability to many, many areas. These are excuses to discuss methods, though the theorems are important.

The course can go in many directions: if some of you have particular interests, we can always change the direction of the course, so express your interests as we go along.


Chapter 1

The Ribe Program

2/1/16

1 The Ribe Program

The main motivation for most of what we will discuss is called the Ribe Program, a research program many hundreds of papers large. We will see some snapshots of it, and it all comes from a theorem from 1975, Ribe's rigidity theorem 1.1.9, which we will state now and prove later in a modern way. This theorem was Martin Ribe's dissertation, which started a whole direction of research, but after he wrote his dissertation he left mathematics; he's apparently a government official in Sweden. The theorem is in the context of Banach spaces: it relates their linear structure to their structure as metric spaces. Now for some terminology.

Definition 1.1.1 (Banach space): A Banach space is a complete, normed vector space. A Banach space is thus equipped with a norm, which defines vector lengths and distances between vectors, and it is complete: every Cauchy sequence converges to a limit inside the space.

Definition 1.1.2: Let $(X,\|\cdot\|_X)$, $(Y,\|\cdot\|_Y)$ be Banach spaces. We say that $X$ is (crudely) finitely representable in $Y$ if there exists some constant $K>0$ such that for every finite-dimensional linear subspace $F\subseteq X$, there is a linear operator $S : F\to Y$ such that for every $x\in F$,
$$\|x\|_X \le \|Sx\|_Y \le K\|x\|_X.$$
Note $K$ is decided once and for all, before the subspace $F$ is chosen. (Some authors use "finitely representable" to mean that this is true for $K = 1+\varepsilon$ for every $\varepsilon>0$. We will not follow this terminology.) Finite representability is important because it allows us to conclude that $X$ has the same finite-dimensional linear properties (local properties) as $Y$. That is, it preserves any invariant that involves finitely many vectors, their lengths, etc.


Let’s introduce some local properties like type. To motivate the definition, consider the triangle inequality, which says

$$\|y_1+\cdots+y_n\|_Y \le \|y_1\|_Y + \cdots + \|y_n\|_Y.$$

In what sense can we improve the triangle inequality? In 퐿1 this is the best you can say. In many spaces there are ways to improve it if you think about it correctly.

For any choice $\varepsilon_1,\dots,\varepsilon_n\in\{\pm1\}$,
$$\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_Y \le \sum_{i=1}^n\|y_i\|_Y.$$

Definition 1.1.3 (type): Say that $X$ has type $p$ if there exists $T>0$ such that for every $n$ and $y_1,\dots,y_n\in X$,
$$\mathbb{E}_{\varepsilon\in\{\pm1\}^n}\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_X \le T\left[\sum_{i=1}^n\|y_i\|_X^p\right]^{1/p}.$$
The $\ell_p$ norm of the sequence of lengths is always at most the $\ell_1$ norm; if the lengths are spread out, this is asymptotically much better. Say $X$ has nontrivial type if $p>1$.

For example, $L_p(\mu)$ has type $\min(p,2)$. Later we'll see a version of "type" for metric spaces. How far the triangle inequality is from being an equality is a common theme in many questions. In the case of normed spaces, this controls a lot of the geometry. Proving a result for some $p>1$ is hugely important.
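As a quick numerical illustration (my own, not from the lecture), take $y_i = e_i$, the standard basis of $\mathbb{R}^n$: in $\ell_2$ the random-sign average matches $(\sum\|e_i\|^2)^{1/2}$ exactly (type 2), while in $\ell_1$ it equals $n$, so no type $p>1$ inequality can hold there.

```python
import numpy as np

# Illustration (mine, not from the lecture). With y_i = e_i in R^n:
# l_2: ||sum_i eps_i e_i||_2 = sqrt(n) for every sign pattern (type 2),
# l_1: the norm is always n, while (sum ||e_i||_1^p)^(1/p) = n^(1/p),
#      so the ratio n^(1-1/p) is unbounded and type p > 1 fails.
rng = np.random.default_rng(0)
n = 100
eps = rng.choice([-1.0, 1.0], size=(2000, n))  # rows are random sign patterns
print("l2 average:", np.linalg.norm(eps, axis=1).mean(), "= sqrt(n) =", np.sqrt(n))
print("l1 average:", np.abs(eps).sum(axis=1).mean(), "= n =", n)
```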

Proposition 1.1.4: If $X$ is finitely representable in $Y$ and $Y$ has type $p$, then $X$ also has type $p$.

Proof. Let $x_1,\dots,x_n\in X$ and let $F = \mathrm{span}\{x_1,\dots,x_n\}$. Finite representability gives us $S : F\to Y$; let $y_i = Sx_i$. What can we say about $\sum_i\varepsilon_iy_i$? On one hand,
$$\mathbb{E}_\varepsilon\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_Y = \mathbb{E}_\varepsilon\left\|S\left(\sum_{i=1}^n\varepsilon_ix_i\right)\right\|_Y \ge \mathbb{E}_\varepsilon\left\|\sum_{i=1}^n\varepsilon_ix_i\right\|_X;$$
on the other hand, since $Y$ has type $p$,
$$\mathbb{E}_\varepsilon\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_Y \le T\left(\sum_{i=1}^n\|Sx_i\|_Y^p\right)^{1/p} \le TK\left(\sum_{i=1}^n\|x_i\|_X^p\right)^{1/p}.$$

Putting these two inequalities together gives the result.


Theorem 1.1.5 (Kahane's inequality). For any normed space $Y$ and $q\ge1$, for all $n$ and $y_1,\dots,y_n\in Y$,
$$\mathbb{E}\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_Y \gtrsim_q \left(\mathbb{E}\left[\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_Y^q\right]\right)^{1/q}.$$

Here $\gtrsim_q$ means "up to a constant"; subscripts say what the constant depends on. The constant here does not depend on the norm of $Y$.

Kahane's Theorem tells us that the LHS of Definition 1.1.3 can be replaced by the $q$-th moment for any $q$, if we change $\le$ to $\lesssim$. We get that having type $p$ is equivalent to
$$\mathbb{E}\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_Y^p \lesssim T^p\sum_{i=1}^n\|y_i\|_Y^p.$$
Recall the parallelogram identity in a Hilbert space $H$:
$$\mathbb{E}\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_H^2 = \sum_{i=1}^n\|y_i\|_H^2.$$
A different way to understand the inequality in the definition of "type" is: how far is a given norm from being a Euclidean norm? The Jordan-von Neumann Theorem says that if the parallelogram identity holds then it's a Euclidean space. What happens if we turn it into an inequality,
$$\mathbb{E}\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_H^2 \le \sum_{i=1}^n\|y_i\|_H^2 \qquad\text{or}\qquad \mathbb{E}\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_H^2 \ge \sum_{i=1}^n\|y_i\|_H^2\,?$$
Either inequality alone still characterizes a Euclidean space. What happens if we add constants or change the power? We recover the definition of type, and of cotype (which has the inequality going the other way).

Definition 1.1.6: Say $Y$ has cotype $q$ if there exists $C>0$ such that
$$\sum_{i=1}^n\|y_i\|_Y^q \le C^q\,\mathbb{E}\left\|\sum_{i=1}^n\varepsilon_iy_i\right\|_Y^q.$$

R. C. James invented the local theory of Banach spaces: the study of geometry involving properties of finitely many vectors ($\forall x_1,\dots,x_n$, $P(x_1,\dots,x_n)$ holds). As a counterexample, reflexivity cannot be characterized using finitely many vectors (this is a theorem). Ribe discovered a link between metric and linear spaces. First, some terminology.


Definition 1.1.7: Two Banach spaces are uniformly homeomorphic if there exists 푓 : 푋 → 푌 that is 1-1 and onto and 푓, 푓 −1 are uniformly continuous.

Without the word "uniformly", if you think of the spaces as topological spaces, all of them are equivalent. Things become interesting when you quantify! "Uniformly" means you're controlling the quantity.

Theorem 1.1.8 (Kadec). Any two infinite-dimensional separable Banach spaces are homeomorphic.

This is an amazing fact: these spaces are all topologically equivalent to Hilbert space! Over time, people found more examples of Banach spaces that are homeomorphic but not uniformly homeomorphic. Ribe's rigidity theorem clarified a big chunk of what was happening.

Theorem 1.1.9 (Rigidity Theorem, Martin Ribe (1975)). Suppose that $X, Y$ are uniformly homeomorphic Banach spaces. Then $X$ is finitely representable in $Y$ and $Y$ is finitely representable in $X$.

For example, for $L_p$ and $L_q$ with $p\ne q$, it's always the case that one is not finitely representable in the other, and hence by Ribe's Theorem, $L_p$ and $L_q$ are not uniformly homeomorphic. (When I write $L_p$, I mean $L_p(\mathbb{R})$.)

Theorem 1.1.10. For every 푝 ≥ 1, 푝 ̸= 2, 퐿푝 and ℓ푝 are finitely representable in each other, yet not uniformly homeomorphic.

(Here ℓ푝 is the sequence space.)

Exercise 1.1.11: Prove the first part of this theorem: 퐿푝 is finitely representable in ℓ푝.

Hint: approximate using step functions. You'll need to remember some measure theory. When $p=2$, $L_p$ and $\ell_p$ are separable and isometric. The theorem in the various cases was proved by:

1. $p=1$: Enflo

2. $1<p<2$: Bourgain

3. $p>2$: Gorelik, applying the Brouwer fixed point theorem (topology)

Every linear property of a Banach space which is local (type, cotype, etc.; involving summing, powers, etc.) is preserved under a general nonlinear deformation. After Ribe's rigidity theorem, people wondered: can we reformulate the local theory of Banach spaces without mentioning anything about the linear structure? Ribe's rigidity theorem is more of an existence statement; it doesn't give an explicit dictionary which maps statements about linear sums into statements about metric spaces. So people started to wonder whether we could reformulate the local theory of Banach spaces in a way that only


looks at distances between pairs of points instead of summing things up. Local theory is one of the hugest subjects in analysis. If you could actually find a dictionary which takes one linear theorem at a time and restates it with only distances, there is huge potential here: once the definition of type only involves distances between points, we can talk about a metric space's type or cotype. So maybe we can use the intuition given by linear arguments, and then state things for metric spaces which, often for very different reasons, remain true outside the linear domain. And then you can apply these arguments to graphs, or groups! We would be able to prove things about much more general metric spaces. Thus we end up applying theorems on linear spaces in situations with a priori nothing to do with linear spaces. This is massively powerful. There are very crucial entries that are still missing in the dictionary; we don't even know how to define many of the properties! This program has many interesting proofs, and some of the most interesting conjectures are about how to define things!

Corollary 1.1.12. If $X, Y$ are uniformly homeomorphic Banach spaces and one of them has type $p$, then so does the other.

This follows from Ribe's Theorem 1.1.9 and Proposition 1.1.4. Can we prove something like this corollary without using Ribe's Theorem 1.1.9? We want to reformulate the definition of type using only distances, so that this becomes self-evident.

Enflo had an amazing idea. Suppose $X$ is a Banach space and $x_1,\dots,x_n\in X$. The type $p$ inequality says
$$\mathbb{E}\left[\left\|\sum_{i=1}^n\varepsilon_ix_i\right\|^p\right] \lesssim_X \sum_{i=1}^n\|x_i\|^p. \tag{1.1}$$
Let's rewrite this in a silly way. Define $f : \{\pm1\}^n\to X$ by
$$f(\varepsilon_1,\dots,\varepsilon_n) = \sum_{i=1}^n\varepsilon_ix_i.$$
Write $\varepsilon = (\varepsilon_1,\dots,\varepsilon_n)$. Multiplying by $2^p$, we can write the inequality (1.1) as
$$\mathbb{E}\left[\|f(\varepsilon)-f(-\varepsilon)\|^p\right] \lesssim_X \sum_{i=1}^n\mathbb{E}\left[\|f(\varepsilon)-f(\varepsilon_1,\dots,\varepsilon_{i-1},-\varepsilon_i,\varepsilon_{i+1},\dots,\varepsilon_n)\|^p\right]. \tag{1.2}$$

This inequality just involves distances between points 푓(휀), so it is the reformulation we seek.

Definition 1.1.13 (Enflo type): A metric space $(X,d_X)$ has Enflo type $p$ if there exists $T>0$ such that for every $n$ and every $f:\{\pm1\}^n\to X$,
$$\mathbb{E}\left[d_X(f(\varepsilon),f(-\varepsilon))^p\right] \le T^p\sum_{i=1}^n\mathbb{E}\left[d_X(f(\varepsilon),f(\varepsilon_1,\dots,\varepsilon_{i-1},-\varepsilon_i,\varepsilon_{i+1},\dots,\varepsilon_n))^p\right].$$


This is bold. It wasn't true before for a general function! The discrete cube (Boolean hypercube) $\{\pm1\}^n$ is the set of all sign vectors $\varepsilon$, of which there are $2^n$. Our function just assigns $2^n$ points arbitrarily. No structure whatsoever. As they are indexed this way, you see nothing; you're free to label them by vertices of the cube however you want, and there are many labelings! In (1.2), the points had to be vertices of a cube, but in Definition 1.1.13 they are arbitrary. The moment you choose the labeling, you impose a cube structure on the points: some pairs of them are diagonals of the cube, some are edges. $\varepsilon$ and $-\varepsilon$ are antipodal points, but the pair $f(\varepsilon), f(-\varepsilon)$ is not really a diagonal: these are just points in the metric space, and the cube structure is a function of how you decided to label them. What the inequality says is that the sum over all diagonals of the length of the diagonal to the power $p$ is at most, up to a constant, the sum over edges (the pairs where exactly one $\varepsilon_i$ differs) of the length to the $p$th power:
$$\sum_{\text{diagonals}}\text{length}^p \lesssim_X \sum_{\text{edges}}\text{length}^p.$$

This is a vast generalization of type; it is not even known whether every Banach space of type $p$ satisfies it. The following is one of my favorite conjectures.

Conjecture 1.1.14 (Enflo). If a Banach space has type $p$ then it also has Enflo type $p$.

This has been open for 40 years. We will prove the following.

Theorem 1.1.15 (Bourgain-Milman-Wolfson, Pisier). If $X$ is a Banach space of type $p>1$ then $X$ also has Enflo type $p-\varepsilon$ for every $\varepsilon>0$.

If you know the type inequality for parallelograms, you get it for arbitrary sets of points, up to $\varepsilon$: you get arbitrarily close to $p$ instead of the exact result. We also know that the conjecture stated before is true for a lot of specific Banach spaces, though we do not yet have the general result. For instance, this is true for the $L_4$ norm: index function values by vertices; some pairs are edges, some are diagonals; then the sum over diagonals is at most that over edges.


How do you go from knowing this for a linear function to deducing it for an arbitrary function? Once you do this, you have a new entry in the hidden Ribe dictionary: if $X$ and $Y$ are uniformly homeomorphic Banach spaces and $Y$ has Enflo type $p$, then so does $X$. The minute you throw away the linear structure, Corollary 1.1.12 becomes easy; it requires only a tiny argument. Now you can take a completely arbitrary function $f:\{\pm1\}^n\to X$. There exists a homeomorphism $\psi : X\to Y$ such that $\psi,\psi^{-1}$ are uniformly continuous, and we want to deduce the inequality in $X$ from the same inequality in $Y$.

Proposition 1.1.16: If $X, Y$ are uniformly homeomorphic Banach spaces and $Y$ has Enflo type $p$, then so does $X$.

This is an example of making the abstract Ribe theorem explicit.

Lemma 1.1.17 (Corson-Klee). If $X, Y$ are Banach spaces and $\psi : X\to Y$ is uniformly continuous, then for every $a>0$ there exists $L(a)$ such that
$$\|x_1-x_2\|_X \ge a \implies \|\psi(x_1)-\psi(x_2)\|_Y \le L(a)\|x_1-x_2\|_X.$$

Proof sketch of Proposition 1.1.16 given Lemma 1.1.17. By definition of uniformly homeomorphic, there exists a homeomorphism $\psi : X\to Y$ such that $\psi,\psi^{-1}$ are uniformly continuous. Lemma 1.1.17 tells us that $\psi$ preserves distances up to a constant. Rescaling so that the smallest nonzero distance you see is at least 1, we get the same inequality in the image and in the preimage.

Proof details. Let $\varepsilon^i$ denote $\varepsilon$ with the $i$th coordinate flipped. We need to prove
$$\mathbb{E}\left(d_X(f(\varepsilon),f(-\varepsilon))^p\right) \le T_f^p\sum_{i=1}^n\mathbb{E}\left(d_X(f(\varepsilon),f(\varepsilon^i))^p\right).$$
Let $\psi : X\to Y$ be such that $\psi,\psi^{-1}$ are uniformly continuous. Because $\psi^{-1}$ is uniformly continuous, there is $C$ such that $d_Y(y_1,y_2)\le1$ implies $d_X(\psi^{-1}(y_1),\psi^{-1}(y_2))<C$. Without loss of generality, by scaling $f$ we may assume that all the points $f(\varepsilon)$ are at distance at least $\max(1,C)$ apart, so that the points $\psi\circ f(\varepsilon)$ are at distance at least 1 apart. ($X$ is a Banach space, so distances scale linearly; this doesn't affect whether the inequality holds.) We know that for any $g : \{\pm1\}^n\to Y$,
$$\mathbb{E}\left(d_Y(g(\varepsilon),g(-\varepsilon))^p\right) \le T_g^p\sum_{i=1}^n\mathbb{E}\left(d_Y(g(\varepsilon),g(\varepsilon^i))^p\right).$$
We apply this to $g = \psi\circ f$, which factors as $\{\pm1\}^n\xrightarrow{f}X\xrightarrow{\psi}Y$, to get
$$\begin{aligned}
\mathbb{E}\left(d_X(f(\varepsilon),f(-\varepsilon))^p\right) &= \mathbb{E}\left(d_X(\psi^{-1}\circ g(\varepsilon),\psi^{-1}\circ g(-\varepsilon))^p\right)\\
&\le L_{\psi^{-1}}(1)^p\,\mathbb{E}\left(d_Y(g(\varepsilon),g(-\varepsilon))^p\right)\\
&\le L_{\psi^{-1}}(1)^p\,T_g^p\sum_{i=1}^n\mathbb{E}\left(d_Y(g(\varepsilon),g(\varepsilon^i))^p\right)\\
&\le L_{\psi^{-1}}(1)^p\,L_\psi(1)^p\,T_g^p\sum_{i=1}^n\mathbb{E}\left(d_X(f(\varepsilon),f(\varepsilon^i))^p\right)
\end{aligned}$$
as needed.

The parallelogram inequality for exponent 1 instead of 2 follows from the triangle inequality, applied along paths of edges bounding each diagonal. Type $p>1$ is a strengthening of the triangle inequality. For which metric spaces does it hold? What's an example of a metric space where the inequality fails for $p>1$? The cube itself (with the $\ell_1$ distance): taking $f$ to be the identity, the diagonal sum grows like $n^p$ while the edge sum grows like $n$, and $n^p\gg n$. I will prove to you that this is the only obstruction: given a metric space that doesn't contain bi-Lipschitz embeddings of arbitrarily large cubes, the inequality holds for some $p>1$. We know an alternative inequality involving distances that is equivalent to type; I can prove it. It is, however, not a satisfactory solution to the Ribe program. There are other situations where we have complete success. We will prove some things, then switch gears, slow down, and discuss Grothendieck's inequality and applications. They will come up in the nonlinear theory later.
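As a sanity check of this cube obstruction, here is a small brute-force computation (my own illustration, not from the lecture; `enflo_ratio` is a hypothetical helper name):

```python
import itertools
import numpy as np

# Brute-force check that Enflo type p > 1 fails on ({+-1}^n, l_1): take
# f = identity, so each diagonal has l_1 length 2n and each edge has length 2.
# The diagonal/edge ratio then equals n^(p-1), which is unbounded in n.
def enflo_ratio(n, p):
    cube = [np.array(e) for e in itertools.product([-1, 1], repeat=n)]
    diag = np.mean([np.abs(2 * e).sum() ** p for e in cube])  # E d(eps, -eps)^p
    edge = n * 2 ** p  # sum over i of E d(eps, eps^i)^p; every edge has length 2
    return diag / edge

for n in [2, 4, 8]:
    print(n, enflo_ratio(n, p=2))  # equals n, so no constant T works for p = 2
```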

2 Bourgain’s Theorem implies Ribe’s Theorem

2-3-16 We will use the Corson-Klee Lemma 1.1.17.

Proof of Lemma 1.1.17. Suppose 푥, 푦 ∈ 푋 with ‖푥 − 푦‖ ≥ 푎. Break up the line segment from 푥 to 푦 into intervals of length at most 푎; let 푥 = 푥0, 푥1, . . . , 푥푘 = 푦 be the endpoints of those intervals, with

‖푥푖+1 − 푥푖‖ ≤ 푎. The modulus of continuity is defined as

푊푓 (푡) = sup {‖푓(푢) − 푓(푣)‖ : 푢, 푣 ∈ 푋, ‖푢 − 푣‖ ≤ 푡} .

Uniform continuity says $\lim_{t\to0}W_f(t) = 0$. The number of intervals is
$$k \le \frac{\|x-y\|}{a} + 1 \le \frac{2\|x-y\|}{a}.$$
Then
$$\|f(x)-f(y)\| \le \sum_{i=1}^k\|f(x_i)-f(x_{i-1})\| \le k\,W_f(a) \le \frac{2W_f(a)}{a}\,\|x-y\|,$$
so we can let $L(a) = \frac{2W_f(a)}{a}$.

2.1 Bourgain's discretization theorem

There are 3 known proofs of Ribe's Theorem:

1. Ribe's original proof, 1987.

2. HK, 1990, a genuinely different proof.

3. Bourgain's proof, a Fourier-analytic proof which gives a quantitative version. This is the version we'll prove. Bourgain uses the Discretization Theorem 1.2.4.

There is an amazing open problem in this context. Saying $\delta$ is big says there is a not-too-fine net, which is enough; therefore we are interested in lower bounds on $\delta$.

Definition 1.2.1 (Discretization modulus): Let $X$ be a finite-dimensional normed space, $\dim(X) = n < \infty$, and let the target space $Y$ be an arbitrary Banach space. Consider the unit ball $B_X$ in $X$ and take a maximal $\delta$-net $\mathcal{N}_\delta$ in $B_X$. Suppose we can embed $\mathcal{N}_\delta$ into $Y$ via $f : \mathcal{N}_\delta\to Y$, and suppose we know that in $Y$,
$$\|x-y\| \le \|f(x)-f(y)\| \le D\|x-y\|$$
for all $x,y\in\mathcal{N}_\delta$. (We say that $\mathcal{N}_\delta$ embeds with distortion $D$ into $Y$.)

You can prove using a nice compactness argument that if this holds for $\delta$ small enough, then the entire space $X$ embeds into $Y$ with roughly the same distortion. Bourgain's discretization theorem 1.2.4 says that you can choose $\delta = \delta_n$ independent of the geometry of $X$ and $Y$: if you embed a $\delta$-approximation of the unit ball of the $n$-dimensional normed space, you succeed in embedding the whole space. I often use this theorem in this way: I use continuous methods to show that embedding $X$ into $Y$ requires big distortion; immediately I get an example with a finite object. Let us now make the notion of distortion more precise.

Definition 1.2.2 (Distortion): Suppose $(X,d_X), (Y,d_Y)$ are metric spaces and $D\ge1$. We say that $X$ embeds into $Y$ with distortion $D$ if there exist $f : X\to Y$ and $s>0$ such that for all $x,y\in X$,
$$s\,d_X(x,y) \le d_Y(f(x),f(y)) \le D\,s\,d_X(x,y).$$


The infimum over those $D\ge1$ such that $X$ embeds into $Y$ with distortion $D$ is denoted $C_Y(X)$. This is a measure of how far $X$ is from being a "subgeometry" of $Y$.

Definition 1.2.3 (Discretization modulus): Let $X$ be an $n$-dimensional normed space, $Y$ any Banach space, and $\varepsilon\in(0,1)$. Let $\delta_{X\hookrightarrow Y}(\varepsilon)$ be the supremum over all those $\delta>0$ such that every $\delta$-net $\mathcal{N}_\delta$ in $B_X$ satisfies
$$C_Y(\mathcal{N}_\delta) \ge (1-\varepsilon)\,C_Y(X).$$
Here $B_X := \{x\in X : \|x\|\le1\}$.

In other words, the distortion of the $\delta$-net is not much smaller than the distortion of the whole space. That is, the discrete $\delta$-net encodes almost all information about the space when it comes to embedding into $Y$: if you got $C_Y(\mathcal{N}_\delta)$ to be small, then the distortion of the entire space is not much larger.

Theorem 1.2.4 (Bourgain's discretization theorem). For every $n$ and $\varepsilon\in(0,1)$, and every $X, Y$ with $\dim X = n$,
$$\delta_{X\hookrightarrow Y}(\varepsilon) \ge e^{-\left(\frac n\varepsilon\right)^{Cn}}.$$
Moreover, for $\delta = e^{-(2n)^{Cn}}$ we have $C_Y(X) \le 2\,C_Y(\mathcal{N}_\delta)$.

Thus there is a $\delta$ depending only on the dimension such that in any $n$-dimensional normed space, a $\delta$-net of the unit ball encodes all the information about embedding $X$ into anything else. The amount you have to discretize is just a function of the dimension, and not of any of the other relevant geometry.

Remark 1.2.5: The inequality only says that you can find a function with the given distortion; the proof will actually give a linear operator.

The best known upper bound is
$$\delta_{X\hookrightarrow Y}\left(\tfrac12\right) \lesssim \frac1n.$$
The latest progress was 1987; there isn't a better bound yet. You have a month to think about it before you get corrupted by Bourgain's proof. There is a better bound when the target space is an $L_p$ space.

Theorem 1.2.6 (Giladi, Naor, Schechtman). For every $p\ge1$, if $\dim X = n$, then
$$\delta_{X\hookrightarrow L_p}(\varepsilon) \gtrsim \frac{\varepsilon^2}{n^{5/2}}.$$

(We still don't know what the right power is.) The case $p=1$ is important for applications. There are larger classes of spaces for which we can write down axioms under which this holds. There are crazy Banach spaces which don't belong to this class, so we're not done. We need more tools to show this: Lipschitz extension theorems, etc.


2.2 Bourgain's Theorem implies Ribe's Theorem

With the "moreover," Bourgain's theorem implies Ribe's Theorem 1.1.9.

Proof of Ribe's Theorem 1.1.9 from Bourgain's Theorem 1.2.4. Let $X, Y$ be Banach spaces that are uniformly homeomorphic. By Corson-Klee 1.1.17, there exists $f : X\to Y$ such that

‖푥 − 푦‖ ≥ 1 =⇒ ‖푥 − 푦‖ ≤ ‖푓(푥) − 푓(푦)‖ ≤ 퐾 ‖푥 − 푦‖ .

(Apply the Corson-Klee lemma for both 푓 and the inverse.) In particular, if 푅 > 1 and 풩 is a 1-net in

푅퐵푋 = {푥 ∈ 푋 : ‖푥‖ ≤ 푅} ,

then $C_Y(\mathcal{N})\le K$. Equivalently, by rescaling, for every $\delta>0$, every $\delta$-net in $B_X$ satisfies $C_Y(\mathcal{N}_\delta)\le K$. If $F\subseteq X$ is a finite-dimensional subspace and $\delta = e^{-(2\dim F)^{C\dim F}}$, then by the "moreover" part of Bourgain's Theorem 1.2.4, there exists a linear operator $T : F\to Y$ such that

‖푥 − 푦‖ ≤ ‖푇 푥 − 푇 푦‖ ≤ 2퐾 ‖푥 − 푦‖

for all $x,y\in F$. This means that $X$ is finitely representable in $Y$; by symmetry, $Y$ is also finitely representable in $X$. The motivation for this program comes in passing from continuous to discrete. The theory has many applications, e.g. to computer science, which cares about finite things. I would like an improvement in Bourgain's Theorem 1.2.4. First we'll prove a theorem that has nothing to do with Ribe's Theorem. It's an easier theorem, and it looks unrelated to metric theory, but there are lemmas in it we will be using later.


Chapter 2

Restricted invertibility principle

1 Restricted invertibility principle

1.1 The first restricted invertibility principles

We take basic facts in linear algebra and make things quantitative. This is the lesson of the course: when you make things quantitative, new mathematics appears.

Proposition 2.1.1: If $A : \mathbb{R}^m\to\mathbb{R}^n$ is a linear operator, then there exists a linear subspace $V\subseteq\mathbb{R}^m$ with $\dim(V) = \mathrm{rank}(A)$ such that $A : V\to A(V)$ is invertible.

What's the quantitative question we want to ask about this? Invertibility just says that an inverse exists. Can we find a large subspace where not only is $A$ invertible, but the inverse has small norm? We insist that the subspace is a coordinate subspace. Let $e_1,\dots,e_m$ be the standard basis of $\mathbb{R}^m$, where $e_j$ has a 1 in the $j$th coordinate and 0 elsewhere. The goal is to find a "large" subset $\sigma\subseteq\{1,\dots,m\}$ such that $A$ is invertible on $\mathbb{R}^\sigma$, where
$$\mathbb{R}^\sigma := \{(x_1,\dots,x_m)\in\mathbb{R}^m : x_i = 0 \text{ if } i\notin\sigma\},$$
and the norm of $A^{-1} : A(\mathbb{R}^\sigma)\to\mathbb{R}^\sigma$ is small. A priori this seems a crazy thing to hope for: take a small multiple of the identity. But we can find conditions that allow us to achieve this goal.

Theorem 2.1.2 (Bourgain-Tzafriri restricted invertibility principle, 1987). Let $A : \mathbb{R}^m\to\mathbb{R}^m$ be a linear operator such that $\|Ae_j\|_2 = 1$ for every $j\in\{1,\dots,m\}$. Then there exists $\sigma\subseteq\{1,\dots,m\}$ such that

1. $|\sigma| \ge \dfrac{cm}{\|A\|^2}$, where $\|A\|$ is the operator norm of $A$;


2. $A$ is invertible on $\mathbb{R}^\sigma$ and the norm of $A^{-1} : A(\mathbb{R}^\sigma)\to\mathbb{R}^\sigma$ is at most $C'$ (i.e., $\|(AJ_\sigma)^{-1}\|_{S_\infty}\le C'$, in the notation introduced below).

Here $c, C'$ are universal constants. Suppose the biggest eigenvalue is at most 100. Then you can always find a coordinate subset of proportional size such that on this subset, $A$ is invertible and the inverse has norm bounded by a universal constant. All of the proofs use something very amazing. This proof is from 3 weeks ago; the theorem has been reproved many times. I'll state a theorem that gives a better bound than the entire history. The result was also extended to rectangular matrices (the extension is nontrivial). Given a linear subspace $V\subseteq\mathbb{R}^m$ with $\dim V = k$ and a linear operator $A : V\to\mathbb{R}^m$, the singular values of $A$,
$$s_1(A)\ge s_2(A)\ge\cdots\ge s_k(A),$$
are the eigenvalues of $(A^*A)^{1/2}$. We can decompose
$$A = UDV,$$
where $D$ is a diagonal matrix with the $s_i(A)$'s on the diagonal and $U, V$ are unitary.

Definition 2.1.3: For $p\ge1$ the Schatten-von Neumann $p$-norm of $A$ is
$$\|A\|_{S_p} := \left(\sum_{i=1}^k s_i(A)^p\right)^{1/p} = \left(\mathrm{Tr}\left((A^*A)^{p/2}\right)\right)^{1/p} = \left(\mathrm{Tr}\left((AA^*)^{p/2}\right)\right)^{1/p}.$$
The cases $p=\infty, 2$ give the operator and Frobenius norms:
$$\|A\|_{S_\infty} = \text{operator norm}, \qquad \|A\|_{S_2} = \left(\mathrm{Tr}(A^*A)\right)^{1/2} = \left(\sum_{i,j}a_{ij}^2\right)^{1/2}.$$

Exercise 2.1.4: $\|\cdot\|_{S_p}$ is a norm on $\mathcal{M}_{n\times m}(\mathbb{R})$. You have to prove that given $A, B$,
$$\left(\mathrm{Tr}\left([(A+B)^*(A+B)]^{p/2}\right)\right)^{1/p} \le \left(\mathrm{Tr}\left((A^*A)^{p/2}\right)\right)^{1/p} + \left(\mathrm{Tr}\left((B^*B)^{p/2}\right)\right)^{1/p}.$$
This requires an idea; note that if $A, B$ commute, it is trivial. Apparently von Neumann wrote a paper called "Metric Spaces" in the 1930s in which he just proves this inequality and doesn't know what to do with it, so it got forgotten for a while until the 1950s, when Schatten wrote books on applications. When I was a student in grad school, I was taking a class on random matrices. There was a two-week break; I was certain the inequality was trivial because the professor had not said it was not, and it completely ruined my break, though I came up with a different proof of it. It's short, but not trivial: it's not typical linear algebra! This is like another triangle inequality, which we may need later on. Spielman and Srivastava have a beautiful theorem.


Definition 2.1.5 (Stable rank): Let $A : \mathbb{R}^m\to\mathbb{R}^n$. The stable rank is defined as
$$\mathrm{srank}(A) = \left(\frac{\|A\|_{S_2}}{\|A\|_{S_\infty}}\right)^2.$$
The numerator is the sum of squares of the singular values, and the denominator is the maximal singular value. Large stable rank means that many singular values are nonzero, and in fact large on average. Many people wanted to get the size of the subset in the Restricted Invertibility Principle to be close to the stable rank.
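Here is a small numerical sketch (my own illustration, not from the notes) computing Schatten norms and the stable rank from the singular values:

```python
import numpy as np

# Schatten norms and stable rank via the SVD (illustration only).
def schatten_norm(A, p):
    s = np.linalg.svd(A, compute_uv=False)  # singular values s_1 >= s_2 >= ...
    return s.max() if np.isinf(p) else (s ** p).sum() ** (1 / p)

def srank(A):
    # srank(A) = (||A||_{S_2} / ||A||_{S_infty})^2 = (sum s_i^2) / (max s_i)^2
    return (schatten_norm(A, 2) / schatten_norm(A, np.inf)) ** 2

A = np.random.default_rng(1).standard_normal((50, 30))
print(srank(A), np.linalg.matrix_rank(A))  # stable rank is at most the rank
```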

Theorem 2.1.6 (Spielman-Srivastava). For every linear operator $A : \mathbb{R}^m\to\mathbb{R}^n$ and $\varepsilon\in(0,1)$, there exists $\sigma\subseteq\{1,\dots,m\}$ with $|\sigma|\ge(1-\varepsilon)\,\mathrm{srank}(A)$ such that
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \frac{\sqrt m}{\varepsilon\,\|A\|_{S_2}}.$$

Here, $J_\sigma$ is the identity function restricted to $\mathbb{R}^\sigma$, $J_\sigma : \mathbb{R}^\sigma\hookrightarrow\mathbb{R}^m$. This is stronger than Bourgain-Tzafriri; in Bourgain-Tzafriri the columns were unit vectors.

Proof of Theorem 2.1.2 from Theorem 2.1.6. Let $A$ be as in Theorem 2.1.2. Then $\|A\|_{S_2} = \sqrt{\mathrm{Tr}(A^*A)} = \sqrt m$ and $\mathrm{srank}(A) = \frac m{\|A\|_{S_\infty}^2}$. We obtain the existence of $\sigma$ with
$$|\sigma| \ge (1-\varepsilon)\,\frac m{\|A\|_{S_\infty}^2}$$
and $\|(AJ_\sigma)^{-1}\|_{S_\infty} \lesssim \frac{\sqrt m}{\varepsilon\|A\|_{S_2}} = \frac1\varepsilon$. This is a sharp dependence on $\varepsilon$. The proof introduced algebraic rather than analytic methods; it was eye-opening. Marcus even got sets bigger than the stable rank by looking at the ratio of $\|A\|_{S_2}$ to $\|A\|_{S_4}$, which is much stronger.

1.2 A general restricted invertibility principle

I'll show a theorem that implies all these intermediate theorems. We use (classical) analysis and geometry instead of algebra. What matters is not the ratio of the norms, but the tail of the distribution of $s_1(A)^2,\dots,s_m(A)^2$.

Theorem 2.1.7. Let $A : \mathbb{R}^m\to\mathbb{R}^n$ be a linear operator. If $k<\mathrm{rank}(A)$, then there exists $\sigma\subseteq\{1,\dots,m\}$ with $|\sigma| = k$ such that
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \min_{k<r\le\mathrm{rank}(A)}\sqrt{\frac{mr}{(r-k)\sum_{i=r}^m s_i(A)^2}}.$$


You have to optimize over $r$. You can get ratios of $\ell_p$ norms of the singular values from the tail bounds, so this implies all the known theorems in restricted invertibility. The subset can be as big as you want up to the rank, and we have sharp control in the entire range. This theorem generalizes Spielman-Srivastava (Theorem 2.1.6), which had generalized Bourgain-Tzafriri (Theorem 2.1.2).

2-8-16

Now we will go backwards a bit, and talk about a less general result. After Theorem 2.1.6, a subsequent theorem gave the same conclusion but, instead of the stable rank, used something better.

Theorem 2.1.8 (Marcus, Spielman, Srivastava). If
$$k < \frac14\left(\frac{\|A\|_{S_2}}{\|A\|_{S_4}}\right)^4,$$
then there exists $\sigma\subseteq\{1,\dots,m\}$, $|\sigma| = k$, such that
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \frac{\sqrt m}{\|A\|_{S_2}}.$$
A lot of these quotients of norms started popping up in people's results. The correct generalization is the following notion.

Definition 2.1.9: For $p>2$, define the stable $p$th rank by
$$\mathrm{srank}_p(A) = \left(\frac{\|A\|_{S_2}}{\|A\|_{S_p}}\right)^{\frac{2p}{p-2}}.$$

Exercise 2.1.10: Show that if 푝 ≥ 푞 > 2, then

srank푝(퐴) ≤ srank푞(퐴).

(Hint: Use Hölder's inequality.)
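A quick numerical sanity check of this monotonicity (my own illustration, not a proof):

```python
import numpy as np

# Check Exercise 2.1.10 numerically: srank_p is nonincreasing in p for p > 2,
# here tested on a random Gaussian matrix.
def srank_p(A, p):
    s = np.linalg.svd(A, compute_uv=False)
    s2 = np.sqrt((s ** 2).sum())    # ||A||_{S_2}
    sp = (s ** p).sum() ** (1 / p)  # ||A||_{S_p}
    return (s2 / sp) ** (2 * p / (p - 2))

A = np.random.default_rng(2).standard_normal((40, 25))
vals = [srank_p(A, p) for p in [2.5, 3, 4, 6, 10, 100]]
print(vals)
assert all(a >= b - 1e-9 for a, b in zip(vals, vals[1:]))  # decreasing in p
```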

Now we would like to show how Theorem 2.1.7 generalizes the previously listed results.

Proof of generalizability of Theorem 2.1.7. Using Hölder's inequality with exponent $\frac p2$,
$$\begin{aligned}
\|A\|_{S_2}^2 &= \sum_{j=1}^m s_j(A)^2 = \sum_{j=1}^{r-1}s_j(A)^2 + \sum_{j=r}^m s_j(A)^2\\
&\le (r-1)^{1-\frac2p}\left(\sum_{j=1}^{r-1}s_j(A)^p\right)^{2/p} + \sum_{j=r}^m s_j(A)^2\\
&\le (r-1)^{1-\frac2p}\,\|A\|_{S_p}^2 + \sum_{j=r}^m s_j(A)^2,
\end{aligned}$$
so
$$\sum_{j=r}^m s_j(A)^2 \ge \|A\|_{S_2}^2\left(1 - (r-1)^{1-\frac2p}\frac{\|A\|_{S_p}^2}{\|A\|_{S_2}^2}\right) = \|A\|_{S_2}^2\left(1 - \left(\frac{r-1}{\mathrm{srank}_p(A)}\right)^{1-\frac2p}\right).$$

Now we can plug the previous calculation into Theorem 2.1.7 to see how the new theorem generalizes the previous results:
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \min_{k+1\le r\le\mathrm{rank}(A)}\sqrt{\frac{mr}{(r-k)\,\|A\|_{S_2}^2\left(1-\left(\frac{r-1}{\mathrm{srank}_p(A)}\right)^{1-\frac2p}\right)}} = \min_{k+1\le r\le\mathrm{rank}(A)}\frac{\sqrt{mr}}{\|A\|_{S_2}\sqrt{(r-k)\left(1-\left(\frac{r-1}{\mathrm{srank}_p(A)}\right)^{1-\frac2p}\right)}}.$$

This equation implies all the earlier theorems. To optimize, fix the stable rank, differentiate in 푟, and set to 0. All theorems in the literature follow from this theorem; in particular, we get all the bounds we got before. There was nothing special about the number 4 in Theorem 2.1.8; this is about the distribution of the eigenvalues.

2 Ky Fan maximum principle

We'll be doing linear algebra. It's mostly mechanical, except we'll need this lemma.

Lemma 2.2.1 (Ky Fan maximum principle). Suppose that $P : \mathbb{R}^m\to\mathbb{R}^m$ is a rank $k$ orthogonal projection. Then
$$\mathrm{Tr}(A^*AP) \le \sum_{i=1}^k s_i(A)^2,$$
where $s_i(A)$ are the singular values, i.e., $s_i(A)^2$ are the eigenvalues of $B := A^*A$. (This material was given in class on 2-15.) The lemma follows from the following general theorem.

Theorem 2.2.2 (A convex function of the diagonal achieves its maximum at an eigenbasis). Let $B : \mathbb{R}^n\to\mathbb{R}^n$ be symmetric with eigenvalues $\lambda_1,\dots,\lambda_n$, and let $f : \mathbb{R}^n\to\mathbb{R}$ be a convex function. Then for every orthonormal basis $u_1,\dots,u_n\in\mathbb{R}^n$, there exists a permutation $\pi$ such that
$$f(\langle Bu_1,u_1\rangle,\dots,\langle Bu_n,u_n\rangle) \le f(\lambda_{\pi(1)},\dots,\lambda_{\pi(n)}). \tag{2.1}$$


Essentially, we’re saying using the eigenbasis maximizes the convex function.

Remark 2.2.3: We can weaken the condition on $f$ to the following: for every $i<j$,
$$t\mapsto f(x_1,\dots,x_{i-1},x_i+t,x_{i+1},\dots,x_{j-1},x_j-t,x_{j+1},\dots,x_n)$$
is convex as a function of $t$. If $f$ is smooth, this is equivalent to the second derivative in $t$ being $\ge0$:
$$\frac{\partial^2f}{\partial x_i^2} + \frac{\partial^2f}{\partial x_j^2} - 2\frac{\partial^2f}{\partial x_i\partial x_j} \ge 0. \tag{2.2}$$
(The left-hand side of (2.2) is the Hessian of $f$ applied to the direction $e_i - e_j$, so this condition for all pairs $i, j$ holds in particular whenever the Hessian is positive semidefinite.)

Proof. We may assume without loss of generality that:

1. $f$ is smooth. (If not, convolve with a good kernel.)

2. Strict inequality holds:
$$\frac{\partial^2f}{\partial x_i^2} + \frac{\partial^2f}{\partial x_j^2} - 2\frac{\partial^2f}{\partial x_i\partial x_j} > 0.$$
To see this, note we can take any $\varepsilon>0$ and perturb $f(x)$ into $f(x)+\varepsilon\|x\|_2^2$. The inequality to prove, (2.1), is only perturbed slightly by this change, and taking $\varepsilon\to0$ gives the desired inequality.

Now let $u_1,\dots,u_n$ be an orthonormal basis at which $f(\langle Bu_1,u_1\rangle,\dots,\langle Bu_n,u_n\rangle)$ attains its maximum. For a pair $u_i, u_j$, we want to rotate in the $i$-$j$ plane by angle $\theta$: recalling the 2-dimensional rotation matrix, replace $u_i, u_j$ by
$$u_i(\theta) = \cos(\theta)u_i + \sin(\theta)u_j, \qquad u_j(\theta) = \sin(\theta)u_i - \cos(\theta)u_j,$$
which is still an orthonormal basis. Then consider
$$g(\theta) = f\big(\langle Bu_1,u_1\rangle,\dots,\langle Bu_i(\theta),u_i(\theta)\rangle,\dots,\langle Bu_j(\theta),u_j(\theta)\rangle,\dots,\langle Bu_n,u_n\rangle\big),$$
where we keep all other dot products the same. By assumption, $g$ attains its maximum at $\theta = 0$, so $g'(0) = 0$ and $g''(0)\le0$. Expanding out the rotated dot products explicitly in $g(\theta)$, the $i$th argument is
$$\cos^2(\theta)\langle Bu_i,u_i\rangle + \sin^2(\theta)\langle Bu_j,u_j\rangle + \sin(2\theta)\langle Bu_i,u_j\rangle$$
and the $j$th argument is
$$\sin^2(\theta)\langle Bu_i,u_i\rangle + \cos^2(\theta)\langle Bu_j,u_j\rangle - \sin(2\theta)\langle Bu_i,u_j\rangle.$$
Then we can mechanically take the derivatives at $\theta = 0$ to get
$$0 = g'(0) = 2\langle Bu_i,u_j\rangle\,(f_{x_i}-f_{x_j}),$$
$$0 \ge g''(0) = 2\left(\langle Bu_j,u_j\rangle-\langle Bu_i,u_i\rangle\right)\underbrace{(f_{x_i}-f_{x_j})}_{=0\text{ if }\langle Bu_i,u_j\rangle\ne0} + 4\langle Bu_i,u_j\rangle^2\underbrace{\left(f_{x_ix_i}+f_{x_jx_j}-2f_{x_ix_j}\right)}_{>0}.$$
This implies that $\langle Bu_i,u_j\rangle = 0$ for all $i\ne j$, which implies that for all $i$, $Bu_i = \mu_iu_i$ for some $\mu_i$. Thus the maximizing basis is an eigenbasis: any such function applied to a vector of dot products is maximized at eigenvalues.

Exercise 2.2.4: If $f : \mathbb{R}^n\to\mathbb{R}$ satisfies the conditions in Theorem 2.2.2 and $(u_1,\dots,u_n)$, $(v_1,\dots,v_n)$ are two orthonormal bases of $\mathbb{R}^n$, then for every $A : \mathbb{R}^n\to\mathbb{R}^n$ there exist $\pi\in S_n$ and $(\varepsilon_1,\dots,\varepsilon_n)\in\{\pm1\}^n$ such that
$$f(\langle Au_1,v_1\rangle,\langle Au_2,v_2\rangle,\dots,\langle Au_n,v_n\rangle) \le f(\varepsilon_1s_{\pi(1)}(A),\dots,\varepsilon_ns_{\pi(n)}(A)).$$
Show that choosing $u, v$ to be the singular vectors maximizes $f$ (over all pairs of orthonormal bases).

To solve this problem, you can rotate both vectors in the same direction and take derivatives, and also rotate them in opposite directions and take derivatives, to get enough information to prove that the singular values are the maximum. Essentially, a lot of the inequalities you find in books follow from this. For instance, if you want to prove that the Schatten $p$-norm is a norm, it follows directly from this fact.

Corollary 2.2.5. Let $\|\cdot\|$ be a norm on $\mathbb{R}^n$ that is invariant under permutations and signs:
$$\|(x_1,\dots,x_n)\| = \|(\varepsilon_1x_{\pi(1)},\dots,\varepsilon_nx_{\pi(n)})\|$$
for all $\varepsilon\in\{\pm1\}^n$ and $\pi\in S_n$ (in the literature, this is called a symmetric norm). This induces a norm on matrices $\mathcal{M}_{m\times n}(\mathbb{R})$ with
$$\|A\| = \|(s_1(A),\dots,s_n(A))\|.$$
Then the triangle inequality holds for matrices $A, B$: $\|A+B\|\le\|A\|+\|B\|$.

Proof. We have by Exercise 2.2.4

$$\begin{aligned}
\|A+B\| &= \max_{(u_i),(v_i)}\left\|\left(\langle(A+B)u_i,v_i\rangle\right)_{i=1}^n\right\|\\
&\le \max_{(u_i),(v_i)}\left\|\left(\langle Au_i,v_i\rangle\right)_{i=1}^n\right\| + \max_{(u_i),(v_i)}\left\|\left(\langle Bu_i,v_i\rangle\right)_{i=1}^n\right\|\\
&\le \|A\| + \|B\|,
\end{aligned}$$
where the maxima run over pairs of orthonormal bases.


Remember Theorem 2.2.2! For many results, you simply need to apply the right convex function. Our lemma follows from setting $f(x) = \sum_{i=1}^kx_i$.

Proof of the Ky Fan maximum principle (Lemma 2.2.1). Take an orthonormal basis $u_1,\dots,u_m$ of $\mathbb{R}^m$ such that $u_1,\dots,u_k$ is a basis of the range of $P$. Then, by Theorem 2.2.2 applied with $f(x)=\sum_{i=1}^kx_i$ (the permutation picks out $k$ eigenvalues, which are at most the $k$ largest),
$$\mathrm{Tr}(BP) = \sum_{j=1}^k\langle Bu_j,u_j\rangle \le \sum_{i=1}^k\lambda_i(B) = \sum_{i=1}^ks_i(A)^2.$$
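A quick numerical check of the lemma (my own illustration, not from the notes):

```python
import numpy as np

# Ky Fan maximum principle: Tr(A* A P) <= sum of the k largest squared singular
# values of A, for a random A and a random rank-k orthogonal projection P.
rng = np.random.default_rng(3)
m, k = 8, 3
A = rng.standard_normal((m, m))
Q, _ = np.linalg.qr(rng.standard_normal((m, k)))  # orthonormal columns
P = Q @ Q.T                                       # rank-k orthogonal projection
lhs = np.trace(A.T @ A @ P)
s = np.linalg.svd(A, compute_uv=False)
rhs = (s[:k] ** 2).sum()
print(lhs, rhs)
assert lhs <= rhs + 1e-9
```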

3 Finding big subsets

We'll present 4 lemmas for finding big subsets with certain properties. We'll put them together at the end.

3.1 Little Grothendieck inequality

Theorem 2.3.1 (Little Grothendieck inequality). Fix $k,m,n\in\mathbb{N}$. Suppose that $T : \mathbb{R}^m\to\mathbb{R}^n$ is a linear operator. Then for every $x_1,\dots,x_k\in\mathbb{R}^m$,
$$\sum_{r=1}^k\|Tx_r\|_2^2 \le \frac\pi2\,\|T\|_{\ell_\infty^m\to\ell_2^n}^2\,\max_{1\le i\le m}\sum_{r=1}^kx_{ri}^2,$$
where $x_r = (x_{r1},\dots,x_{rm})$.
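Before the proof, here is a quick Monte Carlo sanity check (my own illustration, not part of the notes); the $\ell_\infty^m\to\ell_2^n$ norm is computed by brute force over the $2^m$ sign vectors, which is feasible for small $m$:

```python
import itertools
import numpy as np

# Numerical check of the little Grothendieck inequality on random data.
rng = np.random.default_rng(4)
m, n, k = 6, 4, 5
T = rng.standard_normal((n, m))
# ||T||_{l_infty -> l_2} is attained at an extreme point of the l_infty ball,
# i.e., at a sign vector, since x -> ||Tx||_2 is convex.
op = max(np.linalg.norm(T @ np.array(e)) for e in itertools.product([-1, 1], repeat=m))
X = rng.standard_normal((k, m))  # rows are x_1, ..., x_k
lhs = sum(np.linalg.norm(T @ x) ** 2 for x in X)
rhs = (np.pi / 2) * op ** 2 * max((X[:, j] ** 2).sum() for j in range(m))
print(lhs, rhs)
assert lhs <= rhs + 1e-9
```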

Later we will show that $\frac\pi2$ is sharp. If we had only 1 vector, what does this say?
$$\|Tx_1\|_2 \le \sqrt{\frac\pi2}\,\|T\|_{\ell_\infty^m\to\ell_2^n}\,\|x_1\|_\infty.$$
We know the inequality is true for $k=1$ with constant 1, by definition of the operator norm. The theorem is true for arbitrarily many vectors, losing a universal constant ($\frac\pi2$). After we see the proof, the example where $\frac\pi2$ is attained will be natural. We give Grothendieck's original proof. The key claim is the following.

Claim 2.3.2.
$$\sum_{j=1}^m\left(\sum_{r=1}^k(T^*Tx_r)_j^2\right)^{1/2} \le \sqrt{\frac\pi2}\,\|T\|_{\ell_\infty^m\to\ell_2^n}\left(\sum_{r=1}^k\|Tx_r\|_2^2\right)^{1/2}. \tag{2.3}$$


Proof of Theorem 2.3.1. Assuming Claim 2.3.2, we prove the theorem:
$$\begin{aligned}
\sum_{r=1}^k\|Tx_r\|_2^2 &= \sum_{r=1}^k\langle Tx_r,Tx_r\rangle = \sum_{r=1}^k\langle x_r,T^*Tx_r\rangle = \sum_{r=1}^k\sum_{j=1}^mx_{rj}(T^*Tx_r)_j\\
&\le \sum_{j=1}^m\left(\sum_{r=1}^kx_{rj}^2\right)^{1/2}\left(\sum_{r=1}^k(T^*Tx_r)_j^2\right)^{1/2} \qquad\text{by Cauchy-Schwarz}\\
&\le \max_{1\le j\le m}\left(\sum_{r=1}^kx_{rj}^2\right)^{1/2}\,\sum_{j=1}^m\left(\sum_{r=1}^k(T^*Tx_r)_j^2\right)^{1/2}\\
&\le \max_{1\le j\le m}\left(\sum_{r=1}^kx_{rj}^2\right)^{1/2}\sqrt{\frac\pi2}\,\|T\|_{\ell_\infty^m\to\ell_2^n}\left(\sum_{r=1}^k\|Tx_r\|_2^2\right)^{1/2},
\end{aligned}$$
whence
$$\sum_{r=1}^k\|Tx_r\|_2^2 \le \frac\pi2\,\|T\|_{\ell_\infty^m\to\ell_2^n}^2\,\max_{1\le j\le m}\sum_{r=1}^kx_{rj}^2.$$
We bounded the quantity by the square root of a multiple of itself, a bootstrapping argument: in the last step, divide by the square root and square.

Proof of Claim 2.3.2. Let $g_1,\dots,g_k$ be i.i.d. standard Gaussian random variables. For every fixed $j\in\{1,\dots,m\}$, consider
$$\sum_{r=1}^kg_r(T^*Tx_r)_j.$$
This is a Gaussian random variable with mean 0 and variance $\sum_{r=1}^k(T^*Tx_r)_j^2$. Taking the expectation (using $\mathbb{E}|g| = \frac1{\sqrt{2\pi}}\int_{-\infty}^\infty|x|e^{-x^2/2}\,dx = \sqrt{\frac2\pi}$ for a standard Gaussian $g$),
$$\mathbb{E}\left|\sum_{r=1}^kg_r(T^*Tx_r)_j\right| = \sqrt{\frac2\pi}\left(\sum_{r=1}^k(T^*Tx_r)_j^2\right)^{1/2}.$$
Summing these over $j$:
$$\sum_{j=1}^m\left(\sum_{r=1}^k(T^*Tx_r)_j^2\right)^{1/2} = \sqrt{\frac\pi2}\,\sum_{j=1}^m\mathbb{E}\left|\left(T^*\sum_{r=1}^kg_rTx_r\right)_j\right|. \tag{2.4}$$


Define a random sign vector $z\in\{\pm1\}^m$ by
$$z_j = \mathrm{sign}\left(\left(T^*\sum_{r=1}^kg_rTx_r\right)_j\right).$$
Then
$$\sum_{j=1}^m\left|\left(T^*\sum_{r=1}^kg_rTx_r\right)_j\right| = \left\langle z,\,T^*\sum_{r=1}^kg_rTx_r\right\rangle = \left\langle Tz,\,\sum_{r=1}^kg_rTx_r\right\rangle \le \|Tz\|_2\left\|\sum_{r=1}^kg_rTx_r\right\|_2 \le \|T\|_{\ell_\infty^m\to\ell_2^n}\left\|\sum_{r=1}^kg_rTx_r\right\|_2.$$
This is a pointwise inequality. Taking expectations and using Cauchy-Schwarz,
$$\mathbb{E}\left[\sum_{j=1}^m\left|\left(T^*\sum_{r=1}^kg_rTx_r\right)_j\right|\right] \le \|T\|_{\ell_\infty^m\to\ell_2^n}\left(\mathbb{E}\left\|\sum_{r=1}^kg_rTx_r\right\|_2^2\right)^{1/2}. \tag{2.5}$$
What is the second moment? Expand:
$$\mathbb{E}\left\|\sum_{r=1}^kg_rTx_r\right\|_2^2 = \mathbb{E}\left[\sum_{i,j}g_ig_j\langle Tx_i,Tx_j\rangle\right] = \sum_{r=1}^k\|Tx_r\|_2^2. \tag{2.6}$$
Chaining together (2.4), (2.5), and (2.6) gives the result.

Why use Gaussians? The identity we used characterizes the Gaussians via rotation invariance; using other random variables gives other constants that are not sharp. There will be lots of geometric lemmas:

∙ A fact about restricting matrices.

∙ Another geometric argument giving a different method for selecting subsets.

∙ A combinatorial lemma for selecting subsets.

Finally we'll put them together in a crazy induction.

2-10-16: We were in the process of proving three or four subset-selection principles, which we will somehow use to prove the restricted invertibility principle. The little Grothendieck inequality (Theorem 2.3.1) is part of an amazing area of mathematics with many applications. It's little, but very useful. The proof is really Grothendieck's original proof, re-organized. For completeness, we'll show that the inequality is sharp (cannot be improved).


3.1.1 Tightness of Grothendieck's inequality

Corollary 2.3.3. $\pi/2$ is the best constant in Theorem 2.3.1.

From the proof, we reverse engineer vectors that make the inequality sharp. They are given in the following example.

Example 2.3.4: Let $g_1,g_2,\dots,g_k$ be i.i.d. standard Gaussians on the probability space $(\Omega,P)$. Let $T : L_\infty(\Omega,P)\to\ell_2^k$ be
$$Tf = (\mathbb{E}[fg_1],\dots,\mathbb{E}[fg_k]),$$
and let $x_r\in L_\infty(\Omega,P)$ be
$$x_r = \frac{g_r}{\left(\sum_{i=1}^kg_i^2\right)^{1/2}}.$$

Proof. Let $g_1,\dots,g_k$; $T$; $x_1,\dots,x_k$ be as in the example. Note the values $(x_1(\omega),\dots,x_k(\omega))$ are nothing more than vectors on the $k$-dimensional unit sphere, so the $x_r$ are bounded functions on the measure space $\Omega$. We can also write
$$\sum_{r=1}^kx_r(\omega)^2 = \sum_{r=1}^k\frac{g_r(\omega)^2}{\sum_{i=1}^kg_i(\omega)^2} = 1. \tag{2.7}$$

We use the Central Limit Theorem in order to replace $\Omega$ by a discrete space. Let $\varepsilon_{r,i}$ be i.i.d. $\pm1$ random variables. Then letting
$$g_r = \frac{\varepsilon_{r,1}+\cdots+\varepsilon_{r,N}}{\sqrt N}$$
instead, we have that $g_r$ approaches a standard Gaussian in distribution, so the statements we make will be asymptotically true. With this discretization, the random variables $\{g_r\}$ live on $\Omega = \{\pm1\}^{Nk}$, so $L_\infty(\Omega) = \ell_\infty^{2^{Nk}}$, which is of large but finite dimension. So $\omega$ will really be a coordinate in $\Omega$. Now we show two things; they are nothing more than computations.

1. $\|T\|_{L_\infty(\Omega,P)\to\ell_2^k} = \sqrt{2/\pi}$.

2. $\sum_{r=1}^k\|Tx_r\|_2^2 \to 1$ as $k\to\infty$.

From (2.7) and these two items, the little Grothendieck inequality is sharp in the limit.


For (1), we have
$$\begin{aligned}
\|T\|_{L_\infty\to\ell_2^k} &= \sup_{\|f\|_\infty\le1}\left(\sum_{r=1}^k\mathbb{E}[fg_r]^2\right)^{1/2} = \sup_{\|f\|_\infty\le1}\sup_{\sum_r\alpha_r^2=1}\sum_{r=1}^k\alpha_r\mathbb{E}[fg_r]\\
&= \sup_{\sum_r\alpha_r^2=1}\sup_{\|f\|_\infty\le1}\mathbb{E}\left[f\sum_{r=1}^k\alpha_rg_r\right] = \sup_{\sum_r\alpha_r^2=1}\mathbb{E}\left|\sum_{r=1}^k\alpha_rg_r\right| = \mathbb{E}|g_1| = \sqrt{\frac2\pi},
\end{aligned} \tag{2.8}$$
as claimed, since $\sum_r\alpha_r^2 = 1$ implies $\sum_r\alpha_rg_r$ is also a standard Gaussian. Now for (2), the off-diagonal entries of $Tx_r$ vanish by symmetry (see below), so
$$\sum_{r=1}^k\|Tx_r\|_2^2 = \sum_{r=1}^k\left(\mathbb{E}\left[\frac{g_r^2}{\left(\sum_{i=1}^kg_i^2\right)^{1/2}}\right]\right)^2 = k\left(\frac1k\sum_{r=1}^k\mathbb{E}\left[\frac{g_r^2}{\left(\sum_{i=1}^kg_i^2\right)^{1/2}}\right]\right)^2 = \frac1k\left(\mathbb{E}\left[\left(\sum_{i=1}^kg_i^2\right)^{1/2}\right]\right)^2, \tag{2.9}$$
and you can use Stirling to finish: $\sum_ig_i^2$ is just a $\chi^2$-distribution. Here $\mathbb{E}\left[\frac{g_1g_2}{(\sum_ig_i^2)^{1/2}}\right] = \mathbb{E}\left[\frac{g_1(-g_2)}{(\sum_ig_i^2)^{1/2}}\right]$, so this expectation is 0. Also note that if $(g_1,\dots,g_k)\in\mathbb{R}^k$ is a standard Gaussian vector, then $\frac{(g_1,\dots,g_k)}{(\sum_{i=1}^kg_i^2)^{1/2}}$ and $\left(\sum_{i=1}^kg_i^2\right)^{1/2}$ are independent. In other words, the length and angle are independent: this is just polar coordinates; you can check this.
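A Monte Carlo check of these two computations (my own illustration, not from the notes):

```python
import numpy as np

# Check (1): E|g_1| = sqrt(2/pi), and (2): (E[chi_k])^2 / k -> 1 as k grows,
# where chi_k = (g_1^2 + ... + g_k^2)^(1/2).
rng = np.random.default_rng(5)
print(np.abs(rng.standard_normal(10**6)).mean(), np.sqrt(2 / np.pi))
for k in [1, 4, 16, 64]:
    g = rng.standard_normal((10**5, k))
    chi = np.sqrt((g ** 2).sum(axis=1))
    print(k, chi.mean() ** 2 / k)  # approaches 1, so the inequality is sharp
```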

Now, how does this relate to the Restricted Invertibility Problem?

3.2 Pietsch Domination Theorem

Theorem 2.3.5 (Pietsch Domination Theorem). Fix $m,n,k\in\mathbb{N}$ and $M>0$. Suppose that $T : \mathbb{R}^m\to\mathbb{R}^n$ is a linear operator such that for every $x_1,\dots,x_k\in\mathbb{R}^m$ we have
$$\left(\sum_{r=1}^k\|Tx_r\|_2^2\right)^{1/2} \le M\max_{1\le j\le m}\left(\sum_{r=1}^kx_{rj}^2\right)^{1/2}. \tag{2.10}$$
Then there exists $\mu = (\mu_1,\dots,\mu_m)\in\mathbb{R}^m$ with $\mu_i\ge0$ and $\sum_{i=1}^m\mu_i = 1$ such that for every $x\in\mathbb{R}^m$,
$$\|Tx\|_2 \le M\left(\sum_{i=1}^m\mu_ix_i^2\right)^{1/2}. \tag{2.11}$$
The theorem says that you can come up with a probability measure $\mu$ such that $T$, viewed as an operator from $L_2(\mu)$ to $\ell_2^n$, has norm bounded by $M$.

Remark 2.3.6: The theorem is really an "if and only if": (2.11) is a formally stronger statement than (2.10), and in fact they are equivalent.

Proof. Define $K\subseteq\mathbb{R}^m$ by
$$K = \left\{y\in\mathbb{R}^m : y_i = \sum_{r=1}^k\|Tx_r\|_2^2 - M^2\sum_{r=1}^kx_{ri}^2 \text{ for some } k,\ x_1,\dots,x_k\in\mathbb{R}^m\right\}.$$
Basically we cleverly select a convex set: every $k$-tuple of vectors in $\mathbb{R}^m$ gives you a vector in $\mathbb{R}^m$. Let's check that $K$ is convex. We have to check that if $y,z\in K$, then all points on the segment between them are in $K$. $y,z\in K$ means that there exist $(x_i)_{i=1}^k$, $(w_i)_{i=1}^l$ with
$$y_i = \sum_{r=1}^k\|Tx_r\|_2^2 - M^2\sum_{r=1}^kx_{ri}^2, \qquad z_i = \sum_{r=1}^l\|Tw_r\|_2^2 - M^2\sum_{r=1}^lw_{ri}^2$$
for all $i$. Then $\alpha y_i + (1-\alpha)z_i$ comes from the family $(\sqrt\alpha x_1,\dots,\sqrt\alpha x_k,\sqrt{1-\alpha}w_1,\dots,\sqrt{1-\alpha}w_l)$. So by design, $K$ is a convex set. Now, the assumption of the theorem says that
$$\left(\sum_{r=1}^k\|Tx_r\|_2^2\right)^{1/2} \le M\max_{1\le j\le m}\left(\sum_{r=1}^kx_{rj}^2\right)^{1/2},$$
which implies
$$\sum_{r=1}^k\|Tx_r\|_2^2 - M^2\max_{1\le j\le m}\sum_{r=1}^kx_{rj}^2 \le 0,$$
which implies $K\cap(0,\infty)^m = \emptyset$. By the hyperplane separation theorem for two disjoint convex sets in $\mathbb{R}^m$, there exists $0\ne\mu = (\mu_1,\dots,\mu_m)\in\mathbb{R}^m$ with
$$\langle\mu,y\rangle \le \langle\mu,z\rangle$$
for all $y\in K$ and $z\in(0,\infty)^m$. By renormalizing, we may assume $\sum_{i=1}^m\mu_i = 1$. Moreover $\mu$ cannot have any strictly negative coordinate: otherwise you could take $z$ to have arbitrarily large value at that coordinate with zeros everywhere else, so $\langle\mu,z\rangle$ would no longer be bounded from below, a contradiction. Therefore $\mu$ is a probability vector, and since $\langle\mu,z\rangle$ can be made arbitrarily small, for every $y\in K$ we get $\sum_{i=1}^m\mu_iy_i\le0$. Apply this to the single-vector family $(x)$, i.e., write
$$y_i = \|Tx\|_2^2 - M^2x_i^2 \in K.$$
Expanding this out,
$$\|Tx\|_2^2 - M^2\sum_{i=1}^m\mu_ix_i^2 \le 0,$$
which is exactly what we wanted.
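As a numerical sketch of the duality in this proof (my own hypothetical illustration, using scipy): exact domination for all $x$ is a semidefinite condition ($M^2\,\mathrm{diag}(\mu)\succeq T^TT$), but enforcing it on finitely many sampled directions is a linear program, mirroring how the proof separates $K$ from the positive orthant.

```python
import numpy as np
from scipy.optimize import linprog

# Find a probability vector mu with ||T x||^2 <= M^2 * sum_i mu_i * x_i^2
# for a finite sample of test vectors x (a relaxation of Pietsch domination).
rng = np.random.default_rng(6)
m, n = 6, 4
T = rng.standard_normal((n, m)) / np.sqrt(m)
M = np.linalg.norm(T, 2) * np.sqrt(m)  # crude admissible constant (uniform mu works)
X = rng.standard_normal((200, m))      # sampled test vectors
# Constraint per sample x: sum_i mu_i x_i^2 >= ||Tx||^2 / M^2.
A_ub = -(X ** 2)
b_ub = -np.array([np.linalg.norm(T @ x) ** 2 for x in X]) / M ** 2
res = linprog(c=np.zeros(m), A_ub=A_ub, b_ub=b_ub,
              A_eq=np.ones((1, m)), b_eq=[1.0], bounds=[(0, None)] * m)
print(res.status, res.x)  # status 0: a dominating mu exists for these samples
```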

3.3 A projection bound

Lemma 2.3.7. Let $m,n\in\mathbb{N}$, $\varepsilon\in(0,1)$, and let $T : \mathbb{R}^n\to\mathbb{R}^m$ be a linear operator. Then there exists $\sigma\subseteq\{1,\dots,m\}$ with $|\sigma|\ge(1-\varepsilon)m$ such that
$$\left\|\mathrm{Proj}_{\mathbb{R}^\sigma}T\right\|_{S_\infty} \le \sqrt{\frac\pi{2\varepsilon m}}\,\|T\|_{\ell_2^n\to\ell_1^m}.$$
We will find ways to restrict a matrix to a big submatrix. We won't be able to control its operator norm, but we will be able to control the norm from $\ell_2^n$ to $\ell_1^m$. Then we pass to a further subset, on which this becomes an operator norm bound, which is the improvement Grothendieck gives us. This is the first very useful tool to start finding big submatrices.

Proof. We have $T : \ell_2^n\to\ell_1^m$ and $T^* : \ell_\infty^m\to\ell_2^n$. Some abstract nonsense gives us that for Banach spaces, the norm of an operator and its adjoint are equal, i.e., $\|T\|_{\ell_2^n\to\ell_1^m} = \|T^*\|_{\ell_\infty^m\to\ell_2^n}$. This statement follows from the Hahn-Banach theorem (come see me if you haven't seen this before, and I'll tell you what book to read). From the Little Grothendieck inequality (Theorem 2.3.1), $T^*$ satisfies the assumption of the Pietsch domination theorem 2.3.5 with $M = \sqrt{\frac\pi2}\,\|T\|_{\ell_2^n\to\ell_1^m}$ (we're applying it to $T^*$). By the theorem, there exists a probability vector $(\mu_1,\dots,\mu_m)$ such that for every $y\in\mathbb{R}^m$,
$$\|T^*y\|_2 \le M\left(\sum_{i=1}^m\mu_iy_i^2\right)^{1/2}.$$
Define $\sigma = \left\{i\in\{1,\dots,m\} : \mu_i\le\frac1{m\varepsilon}\right\}$; then $|\sigma|\ge(1-\varepsilon)m$ by Markov's inequality. We can also see this by writing
$$1 = \sum_{i=1}^m\mu_i = \sum_{i\in\sigma}\mu_i + \sum_{i\notin\sigma}\mu_i > \sum_{i\in\sigma}\mu_i + \frac{m-|\sigma|}{m\varepsilon},$$
which follows since $\mu_j>\frac1{m\varepsilon}$ for $j\notin\sigma$. Rearranging,
$$|\sigma| > (m\varepsilon)\sum_{i\in\sigma}\mu_i + m(1-\varepsilon),$$
and because $\mu$ is a probability distribution, $(m\varepsilon)\sum_{i\in\sigma}\mu_i\ge0$, so we have
$$|\sigma| \ge m(1-\varepsilon).$$
Now take $x\in\mathbb{R}^n$ and choose $y\in\mathbb{R}^m$ with $\|y\|_2 = 1$. Then
$$\langle y,\mathrm{Proj}_{\mathbb{R}^\sigma}Tx\rangle^2 = \langle T^*\mathrm{Proj}_{\mathbb{R}^\sigma}y,x\rangle^2 \le \left\|T^*\mathrm{Proj}_{\mathbb{R}^\sigma}y\right\|_2^2\,\|x\|_2^2 \le \frac\pi2\,\|T\|_{\ell_2^n\to\ell_1^m}^2\left(\sum_{i\in\sigma}\mu_iy_i^2\right)\|x\|_2^2 \le \frac\pi2\,\|T\|_{\ell_2^n\to\ell_1^m}^2\,\frac1{m\varepsilon}\,\|x\|_2^2$$
by Cauchy-Schwarz. Taking square roots and the supremum over unit $y$ gives the desired result.

In the previous proof, we used a lot of duality to get an interesting subset.

Remark 2.3.8: In Lemma 2.3.7, I think that either the constant $\pi/2$ is sharp (it could come from the Gaussians), or there is a different constant here. If the constant is 1, I think you can optimize the previous argument and get the constant arbitrarily close to 1, which would have some nice applications: in other words, getting the $\frac\pi2$ in the factor $\sqrt{\frac\pi{2\varepsilon m}}$ as close to 1 as possible would be good. I didn't check before class, but you might want to check whether you can carry out this argument using the Gaussian argument we made for the sharpness of $\frac\pi2$ in Grothendieck's inequality (Theorem 2.3.1). It's also possible that there is a different universal constant.

3.4 Sauer-Shelah Lemma

Now we will give another lemma which is very easy and which we will use a lot.

Lemma 2.3.9 (Sauer-Shelah). Take integers $m,n\in\mathbb{N}$ and suppose that we have a set $\Omega\subseteq\{\pm1\}^n$ with
$$|\Omega| > \sum_{k=0}^{m-1}\binom nk.$$
Then there exists $\sigma\subseteq\{1,\dots,n\}$ with $|\sigma| = m$ such that projecting the set onto $\mathbb{R}^\sigma$ gives the entire cube: $\mathrm{Proj}_{\mathbb{R}^\sigma}(\Omega) = \{\pm1\}^\sigma$. That is, for every $\varepsilon\in\{\pm1\}^\sigma$, there exists $\delta = (\delta_1,\dots,\delta_n)\in\Omega$ such that $\delta_j = \varepsilon_j$ for $j\in\sigma$. (In other words, the VC dimension of $\Omega$ is at least $m$.)

Note that Lemma 2.3.9 is used in the proof of the Fundamental Theorem of Statistical Learning Theory.
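Here is a brute-force illustration of the lemma (my own, not from the notes), checking that a random set just above the threshold shatters some $m$ coordinates:

```python
import itertools
import numpy as np
from math import comb

# Any Omega in {+-1}^n with |Omega| > sum_{k<m} C(n,k) shatters some m coordinates.
def shatters(Omega, sigma):
    return {tuple(w[i] for i in sigma) for w in Omega} == set(
        itertools.product([-1, 1], repeat=len(sigma)))

n, m = 6, 3
rng = np.random.default_rng(7)
cube = list(itertools.product([-1, 1], repeat=n))
bound = sum(comb(n, k) for k in range(m))  # = 22 for n=6, m=3
Omega = [cube[i] for i in rng.choice(2 ** n, size=bound + 1, replace=False)]
found = [s for s in itertools.combinations(range(n), m) if shatters(Omega, s)]
print(len(Omega), ">", bound, "->", found[:3])
assert found  # the lemma guarantees at least one shattered sigma of size m
```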

Proof. We prove this by induction on $n$. First denote the collection of shattered sets
$$\mathrm{sh}(\Omega) = \left\{\sigma\subseteq\{1,\dots,n\} : \mathrm{Proj}_{\mathbb{R}^\sigma}\Omega = \{\pm1\}^\sigma\right\}.$$


The claim is that $|\mathrm{sh}(\Omega)|\ge|\Omega|$ (counting the empty set as shattered). The empty set case is trivial. What happens when $n=1$? $\Omega\subseteq\{-1,1\}$: if $\Omega\ne\emptyset$ then $\emptyset\in\mathrm{sh}(\Omega)$, and if $|\Omega| = 2$ then also $\{1\}\in\mathrm{sh}(\Omega)$, so the claim holds. Assume that the claim holds for $n$, and take $\Omega\subseteq\{\pm1\}^{n+1} = \{\pm1\}^n\times\{\pm1\}$. Define
$$\Omega_+ = \{\omega\in\{\pm1\}^n : (\omega,1)\in\Omega\}, \qquad \Omega_- = \{\omega\in\{\pm1\}^n : (\omega,-1)\in\Omega\}.$$
Then, letting $\tilde\Omega_+ = \{(\omega,1)\in\{\pm1\}^{n+1} : \omega\in\Omega_+\}$ and $\tilde\Omega_-$ similarly, we have $|\Omega| = |\tilde\Omega_+| + |\tilde\Omega_-| = |\Omega_+| + |\Omega_-|$. By the inductive hypothesis, $|\mathrm{sh}(\Omega_+)|\ge|\Omega_+|$ and $|\mathrm{sh}(\Omega_-)|\ge|\Omega_-|$. Note that any set shattered by $\Omega_+$ is also shattered by $\Omega$, and likewise for $\Omega_-$; and if a set $\sigma$ is shattered by both $\Omega_+$ and $\Omega_-$, then $\sigma\cup\{n+1\}$ is shattered by $\Omega$. Therefore,
$$\mathrm{sh}(\Omega_+)\cup\mathrm{sh}(\Omega_-)\cup\left\{\sigma\cup\{n+1\} : \sigma\in\mathrm{sh}(\Omega_+)\cap\mathrm{sh}(\Omega_-)\right\} \subseteq \mathrm{sh}(\Omega),$$
where the last collection is disjoint from the first two since its members contain $n+1$. We can now use this inclusion to complete the induction, via inclusion-exclusion:
$$\begin{aligned}
|\mathrm{sh}(\Omega)| &\ge |\mathrm{sh}(\Omega_+)\cup\mathrm{sh}(\Omega_-)| + |\mathrm{sh}(\Omega_+)\cap\mathrm{sh}(\Omega_-)|\\
&= |\mathrm{sh}(\Omega_+)| + |\mathrm{sh}(\Omega_-)| - |\mathrm{sh}(\Omega_+)\cap\mathrm{sh}(\Omega_-)| + |\mathrm{sh}(\Omega_+)\cap\mathrm{sh}(\Omega_-)|\\
&= |\mathrm{sh}(\Omega_+)| + |\mathrm{sh}(\Omega_-)| \ge |\Omega_+| + |\Omega_-| = |\Omega|,
\end{aligned}$$
which completes the induction. Finally, if $|\Omega| > \sum_{k=0}^{m-1}\binom nk$, then since the right-hand side counts the subsets of $\{1,\dots,n\}$ of size at most $m-1$, some shattered set has size at least $m$; as subsets of shattered sets are shattered, there is a shattered set of size exactly $m$.

We will primarily use the theorem through the following corollary, which says that if you have at least half of the points of the cube, you shatter half of the coordinates.

Corollary 2.3.10. If $|\Omega|\ge2^{n-1}$ then there exists $\sigma\subseteq\{1,\dots,n\}$ with $|\sigma|\ge\left\lceil\frac{n+1}2\right\rceil\ge\frac n2$ such that $\mathrm{Proj}_{\mathbb{R}^\sigma}\Omega = \{\pm1\}^\sigma$.

2-15-16: Last time we left off with the proof of the Sauer-Shelah lemma. To remind you, we were finding ways to find interesting subsets where matrices behave well. Now recall we had a linear-algebraic fact which I owe you; I will prove it in an analytic way. The proof has been moved to Section 2.

4 Proof of RIP

4.1 Step 1

Now we need another geometric lemma for the proof of Theorem 2.1.7, the restricted invertibility principle.


Lemma 2.4.1 (Step 1). Fix $m,n,r\in\mathbb{N}$. Let $A : \mathbb{R}^m\to\mathbb{R}^n$ be a linear operator with $\mathrm{rank}(A)\ge r$. For every $\tau\subseteq\{1,\dots,m\}$, denote
$$E_\tau = \left(\mathrm{span}\left((Ae_j)_{j\in\tau}\right)\right)^\perp.$$
Then there exists $\tau\subseteq\{1,\dots,m\}$ with $|\tau| = r$ such that for all $j\in\tau$,
$$\left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}Ae_j\right\|_2 \ge \frac1{\sqrt m}\left(\sum_{i=r}^ms_i(A)^2\right)^{1/2}.$$

Basically, we're taking the projection of the $j$th column onto the orthogonal complement of the span of the other columns in the set, and bounding its norm from below by $\frac1{\sqrt m}$ times the square root of the tail sum of the squared singular values. (This is sharp asymptotically, and may in fact even be sharp as written; I need to check.)

Proof. For every $\tau\subseteq\{1,\dots,m\}$, denote
$$K_\tau = \mathrm{conv}\left(\{\pm Ae_j\}_{j\in\tau}\right).$$
The idea is to make the convex hull have big volume; once we do that, we will get all these inequalities for free. Let $\tau\subseteq\{1,\dots,m\}$ be the subset of size $r$ that maximizes $\mathrm{vol}_r(K_\tau)$. We know that $\mathrm{vol}_r(K_\tau)>0$ since $\mathrm{rank}(A)\ge r$. Observe that for any $\beta\subseteq\{1,\dots,m\}$ of size $r-1$ and $i\notin\beta$, we have
$$K_{\beta\cup\{i\}} = \mathrm{conv}\left(K_\beta\cup\{\pm Ae_i\}\right),$$
which is a double cone. What is the height of this cone? It is $\|\mathrm{Proj}_{E_\beta}Ae_i\|_2$, as $E_\beta$ is the orthogonal complement of the space spanned by $(Ae_j)_{j\in\beta}$. Therefore, the $r$-dimensional volume is given by
$$\mathrm{vol}_r(K_{\beta\cup\{i\}}) = 2\cdot\frac{\mathrm{vol}_{r-1}(K_\beta)\cdot\|\mathrm{Proj}_{E_\beta}Ae_i\|_2}{r}.$$

Because $\tau$ is the subset of size $r$ maximizing the volume, for any $j\in\tau$ and $i\in\{1,\dots,m\}$, choosing $\beta = \tau\setminus\{j\}$, we get
$$\mathrm{vol}_r(K_{\beta\cup\{j\}}) \ge \mathrm{vol}_r(K_{\beta\cup\{i\}}) \implies \left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}Ae_j\right\|_2 \ge \left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}Ae_i\right\|_2$$
for every $j\in\tau$ and $i\in\{1,\dots,m\}$. Summing over $i$,
$$m\left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}Ae_j\right\|_2^2 \ge \sum_{i=1}^m\left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}Ae_i\right\|_2^2 = \left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}A\right\|_{S_2}^2.$$
Then, for all $j\in\tau$,
$$\left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}Ae_j\right\|_2 \ge \frac1{\sqrt m}\left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}A\right\|_{S_2}. \tag{2.12}$$
Let's denote $P = \mathrm{Proj}_{E_{\tau\setminus\{j\}}}$. Note $P$ is an orthogonal projection of rank $n-r+1$, so $I-P$ is an orthogonal projection of rank $r-1$. Then
$$\begin{aligned}
\|PA\|_{S_2}^2 &= \mathrm{Tr}((PA)^*(PA)) = \mathrm{Tr}(A^*P^*PA) = \mathrm{Tr}(A^*PA) = \mathrm{Tr}(AA^*P)\\
&= \mathrm{Tr}(AA^*) - \mathrm{Tr}(AA^*(I-P)) \ge \sum_{i=1}^ms_i(A)^2 - \sum_{i=1}^{r-1}s_i(A)^2 = \sum_{i=r}^ms_i(A)^2,
\end{aligned} \tag{2.13}$$
using the Ky Fan maximum principle 2.2.1, since $I-P$ is a projection of rank $r-1$. Putting (2.12) and (2.13) together gives the result.

4.2 Step 2

That was the first step in our proof of the restricted invertibility principle. Before proving the second step, let me just tell you what it looks like.

Lemma 2.4.2 (Step 2). Let $k,m,n\in\mathbb{N}$ and let $A : \mathbb{R}^m\to\mathbb{R}^n$ with $\mathrm{rank}(A)>k$. Let $\omega\subseteq\{1,\dots,m\}$ with $|\omega| = \mathrm{rank}(A)$ be such that $\{Ae_j\}_{j\in\omega}$ are linearly independent. Denote for every $j\in\omega$
$$F_j = E_{\omega\setminus\{j\}} = \left(\mathrm{span}\left((Ae_i)_{i\in\omega\setminus\{j\}}\right)\right)^\perp.$$
Then there exists $\sigma\subseteq\omega$ with $|\sigma| = k$ such that
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \sqrt{\frac{\mathrm{rank}(A)}{\mathrm{rank}(A)-k}}\cdot\max_{j\in\omega}\frac1{\left\|\mathrm{Proj}_{F_j}Ae_j\right\|_2}.$$

Most of the work is in the second step. We first pass to a subset where we have some control on the smallest of these orthogonal projections, and Step 1 saves us by bounding what this can be. Here we use the little Grothendieck inequality, Sauer-Shelah, etc. Everything. It's simple, but it kills the restricted invertibility principle.


Proof of Theorem 2.1.7 given Steps 1 and 2. Take $A : \mathbb{R}^m\to\mathbb{R}^n$ and fix $k<r\le\mathrm{rank}(A)$. By Step 1 (Lemma 2.4.1), we can find a subset $\tau\subseteq\{1,\dots,m\}$ with $|\tau| = r$ such that for all $j\in\tau$,
$$\left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}Ae_j\right\|_2 \ge \frac1{\sqrt m}\left(\sum_{i=r}^ms_i(A)^2\right)^{1/2}. \tag{2.14}$$
Now we apply Step 2 (Lemma 2.4.2) to $AJ_\tau$, using $\omega = \tau$ (note $\mathrm{rank}(AJ_\tau) = r$), and find a further subset $\sigma\subseteq\tau$ with $|\sigma| = k$ such that
$$\left\|(AJ_\sigma)^{-1}\right\|_{S_\infty} \lesssim \sqrt{\frac r{r-k}}\,\max_{j\in\tau}\frac1{\left\|\mathrm{Proj}_{E_{\tau\setminus\{j\}}}Ae_j\right\|_2} \le \sqrt{\frac{mr}{(r-k)\sum_{i=r}^ms_i(A)^2}},$$
using (2.14). Minimizing over $r$ gives the theorem. It remains to prove the two steps; the proof of Step 2 rests on the following lemmas.

Lemma 2.4.3. Let $A : \mathbb{R}^m\to\mathbb{R}^n$ be such that $\{Ae_j\}_{j=1}^m$ are linearly independent, and let $\sigma\subseteq\{1,\dots,m\}$ and $t\in\mathbb{N}$. Then there exists $\tau\subseteq\sigma$ with
$$|\tau| \ge \left(1-\frac1{2^t}\right)|\sigma|,$$
such that, denoting $\theta = \tau\cup(\{1,\dots,m\}\setminus\sigma)$, $F_j = \left(\mathrm{span}(\{Ae_i\}_{i\ne j})\right)^\perp$, and $M = \max_{1\le j\le m}\frac1{\|\mathrm{Proj}_{F_j}Ae_j\|_2}$, we have that for all $a\in\mathbb{R}^\theta$,
$$\sum_{i\in\tau}|a_i| \le 2^{\frac t2}\,M\sqrt{|\sigma|}\,\left\|\sum_{i\in\theta}a_iAe_i\right\|_2.$$


This is proved by a nice inductive argument. Proof. TODO next time

Lemma 2.4.4. Let $m,n,t\in\mathbb{N}$ and $\beta\subseteq\{1,\dots,m\}$. Let $A : \mathbb{R}^m\to\mathbb{R}^n$ be a linear operator such that $\{Ae_j\}_{j=1}^m$ are linearly independent. Then there exist two subsets $\sigma\subseteq\tau\subseteq\beta$ such that $|\tau|\ge\left(1-\frac1{2^t}\right)|\beta|$, $|\tau\setminus\sigma|\le\frac{|\beta|}4$, and, denoting $\theta = \tau\cup(\{1,\dots,m\}\setminus\beta)$ and $M = \max_{1\le j\le m}\frac1{\|\mathrm{Proj}_{F_j}Ae_j\|_2}$,
$$\left\|\mathrm{Proj}_{\mathbb{R}^\sigma}(AJ_\theta)^{-1}\right\|_{S_\infty} \lesssim 2^{\frac t2}M.$$

Proof of Lemma 2.4.4 from Lemma 2.4.3. Apply Lemma 2.4.3 with $\sigma = \beta$. Basically, we will inductively construct a subset of $\{1,\dots,m\}$ onto which to project, so that we can control the $\ell_1$ norm on the subset $\tau$ by the $\ell_2$ norm on a slightly larger set $\theta$ via Lemma 2.4.3. We find $\tau\subseteq\beta$ with $|\tau|\ge\left(1-\frac1{2^t}\right)|\beta|$ such that
$$\sum_{i\in\tau}|a_i| \le 2^{\frac t2}\,M\sqrt{|\beta|}\,\left\|\sum_{i\in\theta}a_iAe_i\right\|_2.$$
Rewriting, this says that
$$\forall a\in\mathbb{R}^\theta,\quad \left\|\mathrm{Proj}_{\mathbb{R}^\tau}a\right\|_1 \le 2^{\frac t2}M\sqrt{|\beta|}\,\|AJ_\theta a\|_2 \implies \left\|\mathrm{Proj}_{\mathbb{R}^\tau}(AJ_\theta)^{-1}\right\|_{\ell_2^\theta\to\ell_1^\tau} \le 2^{\frac t2}M\sqrt{|\beta|}.$$
Denote $\varepsilon = \frac{|\beta|}{4|\tau|}$. By Lemma 2.3.7, there exists $\sigma\subseteq\tau$ with $|\sigma|\ge(1-\varepsilon)|\tau|$ (so $|\tau\setminus\sigma|\le\varepsilon|\tau| = \frac{|\beta|}4$) such that
$$\left\|\mathrm{Proj}_{\mathbb{R}^\sigma}(AJ_\theta)^{-1}\right\|_{S_\infty} = \left\|\mathrm{Proj}_{\mathbb{R}^\sigma}\mathrm{Proj}_{\mathbb{R}^\tau}(AJ_\theta)^{-1}\right\|_{S_\infty} \le \sqrt{\frac\pi{2\varepsilon|\tau|}}\cdot2^{\frac t2}M\sqrt{|\beta|} \lesssim 2^{\frac t2}M.$$

38 MAT529 Metric embeddings and geometric inequalities

Now, we have basically finished making use of Lemma 2.4.3 in the proof of Lemma 2.4.4. It remains to use Lemma 2.4.4 to prove the full second step in our proof of the Restricted Invertibility Principle.

1 푘 1 Proof of Lemma 2.4.2 (Step 2). Fix an integer 푟 such that 22푟+1 ≤ 1 − 푚 ≤ 22푟 . Proceed inductively as follows. First set

휏0 = {1, . . . , 푚}

휎0 = 휑.

Suppose 푢 ∈ {0, . . . , 푟 + 1} and we constructed 휎푘, 휏푘 ⊆ {1, . . . , 푚} such that if we denote 훽푢 = 휏푢∖휎푢, 휃푢 = 휏푢 ∪ ({1, . . . , 푚}∖훽푢−1), then

1. 휎푢 ⊆ 휏푢 ⊆ 훽푢−1 (︀ 1 2. |휏푢| ≥ 1 − 22푟−푢+4 |훽푢−1|

1 3. |훽푢| ≤ 4 |훽푢−1|

−1 푟− 푢 4. ‖Proj 휎 (퐴퐽 ) ‖ 2 2 푀. R 푢 휃푢 푆∞ .

(︀ 1 Let 퐻 = 2푟 − 푢 + 4. For instance, |휏1| ≥ 1 − 22푟+3 |훽0|. What is the new 훽? For the inductive step,(︀ apply Lemma 2.4.4 on 훽푢−1 with 푡 = 2푟 − 푢 + 4 to get 휎푢 ⊆ 휏푢 ⊆ 1 |훽푢−1| 훽푢−1 such that |휏푢| ≥ 1 − 22푟−푢+4 |훽푢−1|, |휏푢∖휎푢| ≤ 4 As we induct, we are essentially building up more 휎푢 to eventually produce the invertible subset over which we will project. Note that the size of the 휃 set is decreasing as we proceed inductively. ⃦ ⃦ ⃦ −1⃦ 푟−푢/2 eq:rip-s2-1 ⃦ ⃦ ProjR휎푢 (퐴퐽휃푢 ) . 2 푀. (2.15)

39 MAT529 Metric embeddings and geometric inequalities

푚 We know |훽푢−1| ≤ 4푢−1 ,

훽푢−1 = 훽푢 ⊔ 휎푢 ⊔ (훽푢−1∖휏푢)

|훽푢−1| = |훽푢| + |휎푢| + (|훽푢−1| − |휏푢|)

|휎푢| = |훽푢−1| − |훽푢| − (|훽푢−1| − |휏푢|) |훽 | ≥ |훽 | − |훽 | − 푢−1 푢−1 푢 22푟−푢+4 푚 ≥ |훽 | − |훽 | − . 푢−1 푢 22푟+푢+2 Our choice for the invertible subset is

푟ᨁ+1 휎 = 휎푢. 푢=1 Telescoping gives

푟∑︁+1 푚 ∑︁∞ 1 |휎| = |휎푢| ≥ |훽0| − |훽푟+1| − 2푟+2 푢 푢=1 2 푢=1 2 푚 푚 ≥ 푚 − 푟+1 − 2푟+2 (︂ 4 2)︂ 1 푘 = 푚 1 − ≥ 푚 = 푘. 22푟+1 푚 ⋂︀ 푟+1 Observe that 휎 ⊆ 푢=1 휃푢 and for every 푢,

휎푢, . . . , 휎푟+1 ⊆ 휏푢

휎1, . . . , 휎푢−1 ⊆ {1, . . . , 푚}∖훽푢−1

This allows us to use the conclusion for all the 휎푢’s at once. 휎 휃푢 For 푎 ∈ R , 퐽휎푎 ⊆ 퐽휃푢 R ,

−1 ProjR휎푢 (퐴퐽휃푢 ) (퐴퐽휎)푎 = ProjR휎푢 퐽휎푎. since 퐴−1퐴 = 퐼 and projecting a subset onto its containing set is just the subset itself. Then, breaking 퐽휎푎 into orthogonal components,

푟∑︁+1 2 2 ‖퐽휎푎‖2 = ‖ProjR휎푢 퐽휎푎‖2 푢=1 푟+1 ⃦ ⃦ ∑︁ ⃦ ⃦2 ⃦ −1 ⃦ = Proj 휎푢 (퐴퐽휎 ) (퐴퐽휎)푎 R 푢 2 푢=1 푟∑︁+1 2푟−푢 2 2 . 2 푀 ‖퐴퐽휎푎‖2 by (2.15) 푢=1 2푟 2 2 . 2 푀 ‖퐴퐽휎푎‖2

40 MAT529 Metric embeddings and geometric inequalities

푀 2 ≤ ‖퐴퐽 푎‖2 1 − 푘 휎 2 푚 푚 ‖퐽 푎‖ ≤ 푀 ‖퐴퐽 푎‖ 휎 2 푚 − 푘 휎 2

Since this is true for all 푎 ∈ R휎, ⃦ ⃦ ⃦ ⃦ 푚 ⃦ −1⃦ =⇒ (퐴퐽휎) . 푀. 푆∞ 푚 − 푘

An important aspect of this whole proof to realize is the way we took a first subset to give one bound, and then took another subset in order to apply the Little Grothendieck inequality. In other words, we first took a subset where we could control 푙1 to 푙2, and then took a further subset to get the operator norm bounds. Also, notice that in this approach, we construct our “optimal subset” of the space in- ductively: However, doing this in general is inefficient. It would be nice to construct theset greedily instead for algorithmic reasons, if we wanted to come up with a constructive proof. I have a feeling that if we were to avoid this kind of induction, it would involve not using Sauer-Shelah at all. How can we make this theorem algorithmic? The way the Pietsch Domination Theorem 2.3.5 worked was by duality. We look at a certain explicitly defined convex set. We found a separating hyperplane which mustbea probability measure. Then we had a probabilistic construction. This part is fine. The bottleneck for making this an algorithm (I do believe this will become an algorithm) consists of 2 parts: 1. Sauer-Shelah lemma 2.3.9: We have some cloud of points in the boolean cube, and we know there is some large subset of coordinates (half of them) such that when you project to it you see the full cube. I’m quite certain that it’s NP-hard in general. (Work is necessary to formulate the correct algorithmic Sauer-Shelah lemma. How is the set given to you?). In fact, formulating the right question for an algorithmi Sauer-Shelah is the biggest difficulty. We only need to answer the algorithmic question tailored to our sets, which have a simple description: the intersection of an ellipsoid with a cube. There is a good chance that there is a polynomial time algorithm in this case. This question has other applications as well (perhaps one could generalize to the intersection of the cube with other convex bodies). 3

2. The second bottleneck is finding a subset with maximal volume. It’s another place where we chose a subset, the subset that maximizes the volume out of all subsets of a given size (Lemma 2.4.1). Specifically, we want the set of columns

3There could be an algorithm out there, but I couldn’t find one in the literature.

41 MAT529 Metric embeddings and geometric inequalities

of a matrix that maximizes the volume of the convex hull.(︀ Computing the volume of 푛 the convex hull requires some thought. Also, there are 푟 subsets of size 푟; if 푟 = 푛/2 there are exponentially many. We need a way to find subsets with maximum volume fast. There might be a replacement algorithm which approximately maximizes this volume.

2-22: We are at the final lemma, Lemma 2.4.3 in the Restricted Invertiblity Theorem.

The proof of this is an inductive application of the Sauer-Shelah lemma. A very important idea comes from Giannopoulos. If you naively try to use Sauer-Shelah, it won’t work out. We will give a stronger statement of the previous lemma which we can prove by induction.

푚 Lemma 2.4.5 (Stronger version of Lemma 2.4.3). lem:SS-induct-stronger Take 푚, 푛 ∈ N, 퐴 : R → 푛 푚 R a linear operator such that {퐴푒푗}푗=1 are linearly independent. Suppose that 푘 ≥ 0 is an 1 integer and 휎 ⊆ {1, . . . , 푚}. Then there exists 휏 ⊆ 휎 with |휏| ≥ (1 − 2푘 )|휎| such that for every 휃 ⊇ 휏 for all 푎 ∈ R푚 we have (︃ )︃ ⃦ ⃦ 푘 ⃦ ⃦ ∑︁ ∑︁ ⃦∑︁ ⃦ ∑︁ 푟/2 ⃦ ⃦ 푘 eq:rip21-1 |푎푖| ≤ 푀 |휎| 2 ⃦ 푎푖퐴푒푖⃦ + (2 − 1) |푎푖| (2.16) 푖∈휏 푟=1 푖∈휃 2 푖∈휃∩(휎∖휏)

42 MAT529 Metric embeddings and geometric inequalities

Lemma 2.4.3 (all we need to complete the proof of the Restricted Invertibility Principle) is the case where 휃 = 휏 ∪ {1, ··· , 푚} ∖ 휎, 푡 = 푘. We prove the stronger version via induction on 푘. Proof. As 푘 becomes bigger, we’re allowed to take more of the set 휎. The idea is to use Sauer-Shelah, taking half of 휎, and then half of what remains at each step. For 푘 = 0, the statement is vacuous, because we can take 휏 to be the empty set. By induction, assume that the statement holds for 푘: we have found 휏 ⊆ 휎 such that |휏| ≥ 1 (1 − 2푘 )|휎| and (2.16) holds for every 휏 ⊆ 휃. If 휎 = 휏 already, then 휏 satisfies (2.16) for 푘 + 1 as well, so WLOG |휎 ∖ 휏| > 0. Now define 푣푗 is the projection

Proj 퐴푒푗 ⃦ 퐹푗 ⃦ 푣푗 = ⃦ ⃦2 ⃦Proj 퐴푒푗⃦ 퐹푗 2

Then ⟨푣푖, 퐴푒푗⟩ = 훿푖푗 by definition since푣 ( 푖) is a dual basis for the 퐴푒푗’s. Now we want to user Sauer-Shelah so we’re going to define a certain subset of the cube. Define ⎧ ⃦ ⃦ ⎫ ⃦ ⃦ ⎨ ⃦ ∑︁ ⃦ ⎬ 휎∖휏 ⃦ ⃦ Ω = 휖 ∈ {±1} : ⃦ 휖 푣 ⃦ ≤ 푀 2|휎 ∖ 휏| ⎩ ⃦ 푖 푖⃦ ⎭ 푖∈휎∖휏 2 which is an ellipsoid intersected with the cube (it is not a sphere since the 푣푖s are not orthogonal). Then we have ∑︁ 1 ∑︁ 푀 2|휎 ∖ 휏| ≥ = ‖푣 ‖2 ‖Proj 퐴푒 ‖2 푗 2 푖∈휎∖휏 퐹푗 푗 푗∈휎∖휏 ⃦ ⃦ ⃦ ⃦2 ∑︁ ⃦ ∑︁ ⃦ 1 ⃦ ⃦ = ⃦ 휖 푣 ⃦ . |휎∖휏| ⃦ 푗 푗⃦ 2 휎∖휏 휖∈{±1} 푗∈휎∖휏 2 where the last step is true for any vectors (sum the squares and the pairwise correlations disappear). Using Markov’s inequality this is 1 (︀ ≥ 2|휎∖휏| − |Ω| 푀 22|휎 ∖ 휏| 2|휎∖휏| which gives |Ω| > 2|휎∖휏|−1 Then by Sauer-Shelah (Lemma 2.3.9, there exists 훽 ⊆ 휎 ∖ 휏 such that

훽 ProjR훽 Ω = {±1} and 1 |훽| ≥ |휎 ∖ 휏| 2 43 MAT529 Metric embeddings and geometric inequalities

Define 휏 * = 휏 ∪ 훽. We will show that 휏 * satisfies the inductive hypothesis with 푘 + 1. Each time we find a certain set of coordinates to add to what we have before. |휏 *| is the correct size because (︂ )︂ |휎| − |휏| |휏| + |휎| 1 |휏 *| = |휏| + |훽| ≥ |휏| + = ≥ 1 − |휎| 2 2 2푘+1 (︀ 1 * where we used that |휏| ≥ 1 − 2푘 |휎|. So at least 휏 is the right size. Now, suppose 휃 ⊇ 휏 *. For every 푎 ∈ R푚, we claim there exists some 휖 ∈ Ω such that ∀푗 ∈ 훽 such that 휖푗 = sign(푎푗). For any 훽, we can find some vector in the cube that hasthe sign pattern of our given vector 푎. What does being in Ω mean? It means that at least the dual basis is small there. 휖 ∈ Ω says that ∑︁ 푀 2|휎| eq:step21-2 ‖ 휖푖푣푖‖2 ≤ 푀 2|휎 ∖ 휏| ≤ 푘/2 (2.17) 푖∈휎∖휏 2

That was how we chose our ellipsoid. ⃦ ⌋︂ ∑︁ ∑︁ ∑︁ |푎푖| = 푎푖퐴푒푖, 휖푣푖 푖∈훽 푖∈훽 푖∈휎∖휏 because the 푣푖’s are a dual basis so ⟨퐴푒푖, 푣푗⟩ = 훿푖푗, and 훽 ⊆ 휎∖휏. We only know the 휖푖 are the signs when you’re inside 훽. This equals ⃦ ⌋︂ ∑︁ ∑︁ ∑︁ = 푎푖퐴푒푖, 휖푣푖 − 휖푖푎푖 푖∈휃 푖∈휎∖휏 푖∈(휃∖훽)∩(휎∖휏)

* Note that (휃 ∖ 훽) ∩ (휎 ∖ 휏) = 휃 ∩ (휎 ∖ 휏 ). In this set we can’t control the signs 휖푖. By Cauchy-Schwarz and (2.17), this is ⃦ ⃦ ⃦ ⃦ ⃦ ⃦ ⃦∑︁ ⃦ ⃦ ∑︁ ⃦ ∑︁ ⃦ ⃦ ⃦ ⃦ ≤ ⃦ 푎 퐴푒 ⃦ · ⃦ 휖푣 ⃦ + |푎 | ⃦ 푖 푖⃦ ⃦ 푖⃦ 푖 푖∈휃 2 푖∈휎∖휏 푖∈휃∩(휎∖휏 *) ⃦ ⃦ 2 ⃦ ⃦ ⃦∑︁ ⃦ 푀 2|휎| ∑︁ ≤ ⃦ 푎 퐴푒 ⃦ · + |푎 |. ⃦ 푖 푖⃦ 2푘/2 푖 푖∈휃 2 푖∈휃∩(휎∖휏 *)

Because the conclusion of Sauer-Shelah told us nothing about the signs of 휖푖 outside 훽, so we just take the worst possible thing. Summarizing, ⃦ ⃦ ⃦ ⃦ ∑︁ 푀 2|휎| ⃦∑︁ ⃦ ∑︁ |푎 | ≤ ⃦ 푎 퐴푒 ⃦ + |푎 | 푖 2푘/2 ⃦ 푖 푖⃦ 푖 푖∈훽 푖∈휃 2 푖∈휃∖(휎∖휏 *)

44 MAT529 Metric embeddings and geometric inequalities

Using the inductive step, ∑︁ ∑︁ ∑︁ |푎푖| = |푎푖| + |푎푖| * 푖∈휏 푖∈휏 푖∈⃦훽 ⃦ ⃦ ⃦ ⃦∑︁ ⃦ (︀ ∑︁ ∑︁ ⃦ ⃦ 푘 ≤ 푀 |휎|훼푘 ⃦ 푎푖퐴푒푖⃦ + 2 − 1 |푎푖| + |푎푖| ⃦ 푖∈휃 ⃦ 2 푖∈휃∩(휎∖휏) 푖∈훽 ⃦ ⃦ ⃦∑︁ ⃦ (︀ ∑︁ ∑︁ ⃦ ⃦ 푘 푘 = 훼푘 |휎| ⃦ 푎푖퐴푒푖⃦ + 2 − 1 |푎푖| + 2 |푎푖| 푖∈휃 2 푖∈휃∩(휎∖휏 *) 푖∈훽

In the last step, we∑︀ moved 훽 from the second to the third term. Now use what we got before for the bound on 푖∈훽 |푎푖| and plug it in to get ⃦ ⃦ ⃦ ⃦ (︀ ⃦∑︁ ⃦ (︀ ∑︁ (푘+1)/2 ⃦ ⃦ 푘+1 ≤ 훼푘 + 2 |휎| ⃦ 푎푖퐴푒푖⃦ + 2 − 1 |푎푖| 푖∈휃 2 푖∈휃∩(휎∖휏 *) which is exactly the inductive hypothesis.4

Remark 2.4.6 (Algorithmic Sauer-Shelah): We only used Sauer-Shelah 2.3.9 for intersect- ing cubes with an ellipsoid, so to make RIP algorithmic, we need an algorithm for finding these particular intersections.√ Moreover, the ellipsoids are big is because we ask that the 2-norm is at most 2 times the expectation. An efficient algorithm probably exists. Afterwards, wecould ask for higher dimensional shapes. I’ve seen some references that worked for Sauer-Shelah when sets were of a special form, namely of size 표(푛). This is some- thing more geometric. I don’t think there’s literature about Sauer-Shelah for intersection of surfaces with small degree. This is a tiny motivation to do it, but it’s still interesting independently.

4I looked through the original Gianpopoulous paper, and it was clear he tried out many many things to find which inductive hypothesis makes everything go through cleanly. You want to boundan 푙1 sum from above, so you want to use duality, and then use Sauer-Shelah to get signs such that the norm of the dual-basis is small.

45 MAT529 Metric embeddings and geometric inequalities

46 Chapter 3

Bourgain’s Discretization Theorem

1 Outline

We will prove Bourgain’s Discretization Theorem. This will take maybe two weeks, and has many interesting ingredients along the way. By doing this, we will also prove Ribe’s theorem, which is what we stated at the beginning. Let’s remind ourselves of the definition.

Definition (Definition 1.2.3, discretization modulus): (푋, ‖·‖푋 ), (푌, ‖·‖푌 ) are Banach spaces. Let 휖 ∈ (0, 1). Then 훿푋˓→푌 (휖) is the supremum over 훿 > 0 such that for every 훿-net 풩훿 of the unit ball of 푋, the distortion 퐶 (풩 ) 퐶 (푋) ≤ 푌 훿 푌 1 − 휖

퐶푌 is smallest bi-Lipschitz distortion by which you can embed 푋 into 푌 . There are ideas 1 required to get 1 − 휖. For example, for 휖 = 2 , the equation says is that if we succeed in embedding a 훿-net into 푌 , then we succeeded in the full space with a distortion twice as much. A priori it’s not even clear that there exists such a 훿. There’s a nontrivial compactness argument needed to prove its existence. We will just prove bounds on it assuming it exists. Now, Bourgain’s discretization theorem says Theorem (Bourgain’s discretization theorem 1.2.4). If dim(푋) = 푛, dim(푌 ) = ∞, then

퐶푛 −( 푛 ) 훿푋˓→푌 (휖) ≥ 푒 휖 for 퐶 a universal constant.

Remark 3.1.1: It doesn’t matter what the maps are for the 풩훿, the proof will give a linear mapping and we won’t end up needing them. Assuming linearity in the definition is not necessary. Rademacher’s theorem says that any mapping R푛 → R푛 is differentiable almost every- where with bi-Lipschitz derivative. Yu can extend this to 푛 = ∞, but you need some

47 MAT529 Metric embeddings and geometric inequalities

additional properties: 푌 must be the , and the limit needs to be in the weak-* topology. In this topology there exists a sequence of norm-1 vectors that tend to 0. Almost everywhere, this doesn’t happen. The Principle of Local Reflexivity says 푌 ** and 푌 are not the same for infinite dimensional 푌 . The double dual of all sequences which tend to 0 is 퐿∞, a bigger space, but the difference between these never appears in finite dimensional phenomena.

1.1 Reduction to smooth norm

From now on, 퐵푋 = {푥 ∈ 푋 : ‖푋‖푋 ≤ 1} denotes the ball, and 푆푋 = 휕퐵 = {푥 ∈ 푋 : ‖푋‖푋 = 1} denotes the boundary. Later on we will be differentiating things without thinking about it, so we first prove that we can assume ‖·‖푋 is smooth on 푋 ∖ {0}. (︀ 2 푛 Lemma 3.1.2. For all 훿 ∈ (0, 1) there exists some 훿-net of 푆푋 with |풩훿| ≤ 1 + 훿 .

Proof. Let 풩훿 ⊆ 푆푋 be maximal with respect to inclusion such that ‖푥 − 푦‖푋 > 훿 for every distinct 푥, 푦 ∈ 풩훿. Maximality means that if 푧 ∈ 푆푋 , there exists 푥 ∈ 풩훿, ‖푥 − 푦‖푋 ≤ 훿 훿 (otherwise 푁훿 ∪ {푧} is a bigger 훿-net). The balls {푥 + 2 퐵푋 }푥∈풩훿 are pairwise disjoint and 훿 contained in 1 + 2 퐵푋 . Thus 훿 ∑︁ 훿 vol 1 + 퐵 ≥ vol 푥 + 퐵 2 푋 2 푋 푋∈푀훿 훿 푛 훿 푛 1 + vol(퐵 ) = |풩 | vol(퐵 ) 2 푋 훿 2 푋 1

2 푛 Reduction to smooth norm. Let 풩훿 be a 훿-net of 푆푋 with |풩훿| = 푁 ≤ (1 + 훿 ) . Given * 푧 ∈ 풩훿, by Hahn-Banach we can choose 푧 such that 1. ⟨푧, 푧*⟩ = 1

* 2. ‖푧 ‖푋* = 1. Let 푘 be an integer such that 푁 1/(2푘) ≤ 1 + 훿. Then define 1/(2푘) ∑︁ ‖푥‖ := ⟨푧*, 푥⟩2푘

푧∈풩훿

* Each term satisfies |⟨푧 , 푥⟩| ≤ ‖푥‖푋 , so we know

1/2푘 ‖푥‖ ≤ 푁 ‖푥‖푋 ≤ (1 + 훿)‖푥‖푋 .

1There are non-sharp bounds for the smallest size of a 훿-net. There is a lot of literature about the relations between these things, but we just need an upper bound.

48 MAT529 Metric embeddings and geometric inequalities

For the lower bound, if 푥 ∈ 푆푋 , then choose 푧 ∈ 풩훿 such that ‖푥 − 푧‖푋 ≤ 훿. Then 1 − ⟨푧*, 푥⟩ = ⟨푧*, 푧 − 푥⟩ ≤ ‖푧 − 푥‖ ≤ 훿, so ⟨푧*, 푥⟩ ≥ 1 − 훿, giving

‖푥‖ ≥ (1 − 훿) ‖푥‖푋 . Thus any norm is up to 1+훿 some equal to a smooth norm. 훿 was arbitrary, so without loss of generality we can assume in the proof of Bourgain embedding that the norm is smooth.

The strategy to prove Bourgain’s discretization theorem is as follows. Given a 훿-net 풩훿 ⊆ −(푛/휖)퐶푛 1 퐵푋 with 훿 ≤ 푒 , and we know ∃ 푓 : 풩훿 → 푌 such that 퐷 ‖푥−푦‖푋 ≤ ‖푓(푥)−푓(푦)‖푌 ≤ −퐷퐶푛 ‖푥 − 푦‖푋 , i.e., we can embed with distortion 퐷. Our goal is to show that if 훿 ≤ 푒 then −1 this implies that there exists 푇 : 푋 → 푌 linear operator invertible ‖푇 ‖‖푇 ‖ . 퐷. The steps are as follows. 1. Find the correct coordinate system (John ellipsoid), which will give us a dot prod- uct structure and a natural Laplacian. (We will need a little background in convex geometry. ) 2. A priori 푓 is defined on the net. Extend 푓 to the whole space in a nice way (Bourgain’s extension theorem) which doesn’t coincide with the function on the net, but is not too far away from it. 3. Solve the Laplace equation. Start with some initial condition, and then evolve 푓 according to the Poisson semigroup. This extended function is smooth the instant it flows a little bit away from the discrete function. 4. There is a point where the derivative satisfies what we want: The point exists bya pigeonhole style argument, but we won’t be able to give it explicitly. This comes from estimates of the Poisson kernel. We will use Fourier analysis. 2-24

2 Bourgain’s almost extension theorem

Theorem 3.2.1 (Bourgain’s almost extension theorem). Let 푋 be a 푛-dimensional normed space, 푌 a Banach space, 풩훿 ⊆ 푆푋 a 훿-net of 푆푋 , 휏 ≥ 퐶훿. Suppose 푓 : 풩훿 → 푌 is 퐿-Lipschitz. Then there exists 퐹 : 푋 → 푌 such that

훿푛 1. ‖퐹 ‖Lip . (1 + 휏 )퐿.

2. ‖퐹 (푥) − 푓(푥)‖푌 ≤ 휏퐿 for all 푥 ∈ 풩훿.

3. Supp(퐹 ) ⊆ (2 + 휏)퐵푋 . 4. 퐹 is smooth. Parts 3 and 4 will come “for free.”

49 MAT529 Metric embeddings and geometric inequalities

2.1 Lipschitz extension problem Theorem 3.2.2 (Johnson-Lindenstrauss-Schechtman, 1986). Let 푋 be a 푛-dimensional normed space 퐴 ⊆ 푋, 푌 be a Banach space, and 푓 : 퐴 → 푌 be 퐿-Lipschitz. There ex-

ists 퐹 : 푋 → 푌 such that 퐹 |퐴 = 푓 and ‖퐹 ‖Lip . 푛퐿. √ √ We know a lower bound√ of 푛; losing 푛 is sometimes needed. (The lower bound for nets on the whole space is 4 푛.) A big open problem is what the true bound is. This was done a year before Bourgain. Why didn’t he use this theorem? This theorem is not sufficient because the Lipschitz constant grows with 푛. −1 −1 We want to show that ‖푇 ‖‖푇 ‖ . 퐷. We can’t bound the norm of 푇 with anything less than the distortion 퐷, so to prove Bourgain embedding we can’t lose anything in the Lipschitz constant of 푇 ; the Lipschitz constant can’t go to ∞ as 푛 → ∞. Bourgain had the idea of relaxing the requirement that the new function be strictly an extension (i.e., agree with the original function where it is defined). What’s extremely important is that the new function be Lipschitz with a constant independent of 푛.

We need ‖퐹 ‖Lip . 퐿. Let’s normalize so 퐿 = 1. When the parameter is 휏 = 푛훿, we want ‖푓(푥) − 퐹 (푥)‖ . 푛훿. Note 훿 is small (less than the inverse of any polynomial), so losing 푛 is nothing.2 How sharp is Theorem 3.2.1? Given a 1-Lipschitz function on a 훿-net, if we want to almost embed it without losing anything, how close can close can we guarantee it to be from the original function

Theorem 3.2.3. There exists a 푛-dimensional normed space 푋, Banach space 푌 , 훿 > 0,

풩훿 ⊆ 푆푋 a 훿-net, 1-Lipschitz function 푓 : 풩훿 → 푌 such that if 푓 : 푋 → 푌 , ‖퐹 ‖Lip . 1 then there exists 푥 ∈ 풩훿 such that 푛 ‖퐹 (푥) − 푓(푥)‖ & √ . 푒푐 ln 푛 Thus, what Bourgain proved is essentially sharp. This is a fun construction with Grothendieck’s inequality. Our strategy is as follows. Consider 푃푡 * 퐹 , where (︀ 푛+1 퐶푛푡 Γ 2 푃푡(푥) = 2 푛+1 , 퐶푛 = 푛+1 , 2 2 2 (푡 + ‖푥‖2) 휋

′ and 푃푡 is the Poisson kernel. Let (푇 퐹 )(푥) = (푃푡 * 퐹 ) (푥); 푇 is linear and ‖푇 ‖ . 1. As 푡 → 0, this becomes closer to 퐹 . We hope when 푡 → 0 that it is invertible. This is true. We give a proof by contradiction (we won’t actually find what 푡 is) using the pigeonhole principle. The Poisson kernel depends on an Euclidean norm in it. In order for the argument to work, we have to choose the right Euclidean norm.

2If people get 훿 to be polynomial, then we’ll have to start caring about 푛.

50 MAT529 Metric embeddings and geometric inequalities

2.2 John ellipsoid Theorem 3.2.4 (John’s theorem). Let 푋 be a 푛-dimensional normed space, identified with 푛 푛 푛 √1 R . Then there exists a linear operator 푇 : R → R such that 푛 ‖푇 푥‖2 ≤ ‖푥‖푋 ≤ ‖푇 푥‖2 for all 푥 ∈ 푋.

√John’s Theorem says we can always find 푇 √which sandwiches the 푋-norm up to a factor of 푛. “Everything is a Hilbert space up to a 푛 factor.” Proof. Without loss of generality 퐷 ≤ 푛. Let ℰ be an ellipsoid of maximal volume contained in 퐵푋 . ⌋︀ {︀ max vol(푆퐵 푛 ): 푆 ∈ 푀 ( ), 푆퐵 푛 ⊆ 퐵 . ℓ2 푛 R ℓ2 푋 −1 ℰ exists by compactness. Take√푇 = 푆 . The goal is to show that 푛ℰ ⊇ 퐵푋 . We can choose the coordinate system such that 푛 ℰ = 퐵ℓ푛 = 퐵 . 2 2 √ Suppose by way of contradiction that 푛퐵푛 ̸⊇ 퐵 . Then there exists 푥 ∈ 퐵 such that √ 2 푋 푋 ‖푥‖푋 ≤ 1 yet ‖푥‖2 > 푛.

푥 Denote 푑 = ‖푥‖2 , 푦 = 푑 . Then ‖푦‖2 = 1 and 푑푦 ∈ 퐵푋 . By applying a rotation, we can assume WLOG 푦 = 푒1. Claim: Define for 푎 > 1, 푏 < 1 {︃ (︂ )︂ (︂ )︂ }︃ ∑︁푛 2 ∑︁푛 2 푡1 푡푖 ℰ푎,푏 := 푡푖푒푖 : + ≤ 1 . 푖=1 푎 푖=2 푏 Stretching by 푎 and squeezing by 푏 can make the ellipse grow and stay inside the body, contradicting that it is the minimizer. More precisely, we show that there exists 푎 > 1 and 푛 푛−1 푏 < 1 such that ℰ푎,푏 ⊆ 퐵푋 and Vol(ℰ푎,푏) > Vol(퐵2 ). This is equivalent to 푎푏 > 1. We need a lemma with some trigonometry. (︀ 푎2 2 1 Lemma 3.2.5 (2-dimensional lemma). Suppose 푎 > 1, 푏 ∈ (0, 1), and 푑2 + 푏 1 − 푑2 ≤ 1. Then ℰ푎,푏 ⊆ 퐵푋 .

51 MAT529 Metric embeddings and geometric inequalities

√︁ (︀ 푛−1 푑2−푎2 푛−1 푑2−푎2 2 Proof. Let 푏 = 푑2−1 . Then 휓(푎) = 푎푏 = 푎 푑2−1 , 휓(1) = 1. It is enough to show 휓′(1) > 0. Now 푛−1 −1 푑2 − 푎2 2 푑2 − 푛푎2 휓′(푎) = , 푑2 − 1 푑2 − 1 √ which is indeed > 0 for 푑 > 푛. Note this is really a 2-dimensional argument; there is 1 special direction. We shrunk the 푦 direction and expanded the 푥 direction. It’s enough to show the new ellipse ℰ푎,푏 is in the rhombus in the picture. Calculations are left to the reader.

2.3 Proof Proof of Theorem 3.2.1. By translating 푓 (so that it is 0 at some point), without loss of generality we can assume that for all 푥 ∈ 풩훿, ‖푓(푥)‖푌 ≤ 2퐿. Step 1: (Rough extension 퐹1 on 푆푋 .) We show that there exists 퐹1 : 푆푋 → 푌 such that for all 푥 ∈ 풩훿 → 푌 such that for all 푥 ∈ 풩훿,

1. ‖퐹1(푥) − 푓(푥)‖푌 ≤ 2퐿훿.

2. ∀푥, 푦 ∈ 푆푋 , ‖퐹1(푥) − 퐹1(푦)‖ ≤ 퐿(‖푥 − 푦‖푋 + 4훿).

This is a partition of unity argument. Write 풩훿 = {푋1, . . . , 푋푁 }. Consider

푁 {(푥푝 + 2훿퐵푋 ) ∩ 푆푋 }푝=1, which is an open cover of 푆푋 . 푁 Let {휑푝 : 푆푋 → [0, 1]}푝=1 be a partition of unity subordinted to this open cover. This means that

52 MAT529 Metric embeddings and geometric inequalities

1. Supp 휑푝 ⊆ 푥푝 + 2훿퐵푋 , ∑︀ 푁 2. 푝=1 휑푝(푥) = 1 for all 푥 ∈ 푆푋 . ∑︀ 푁 For all 푥 ∈ 푆푋 , define 퐹1(푥) = 푝=1 휑푝(푥)푓(푥푝) ∈ 푌 . Then as 퐹1 is a weighted sum of 푓(푥푝)’s, ‖퐹1‖∞ ≤ 2퐿. If 푥 ∈ 풩훿, because 휑푝(푥) is 0 when |푥 − 푥푝| > 2훿, ⃦ ⃦ ⃦ ⃦ ⃦ ∑︁ ⃦ ∑︁ ⃦ ⃦ ‖퐹 (푥) − 푓(푥)‖ = ⃦ 휑 (푥)(푓(푥 ) − 푓(푥))⃦ ≤ 휑 (푥)퐿 ‖푥 − 푥 ‖ ≤ 2퐿훿. 1 푌 ⃦ 푝 푝 ⃦ 푝 푝 푋 푝:‖푥−푥푝‖푋 ≤2훿 ‖푥−푥푝‖≤2훿

For 푥, 푦 ∈ 푆푋 , ⃦ ⃦ ⃦ ⃦ ⃦ ⃦ ⃦ ⃦ ⃦ ∑︁ ⃦ ⃦ ⃦ ‖퐹 (푥) − 퐹 (푦)‖ = ⃦ (푓(푥 ) − 푓(푥 ))휑 (푥)휑 (푥)⃦ 1 1 ⃦ 푝 푞 푝 푞 ⃦ ⃦ ‖푥 − 푥 ‖ ≤ 2훿 ⃦ ⃦ 푝 ⃦ ‖푦 − 푥푞‖ ≤ 2훿 ∑︁ ≤ 퐿 ‖푥푝 − 푥푞‖ 휑푝(푥)휑푝(푦)

‖푥 − 푥푝‖ ≤ 2훿 ‖푦 − 푥푞‖ ≤ 2훿 ≤ 퐿(‖푥 − 푦‖ + 4훿).

2-29: We continue proving Bourgain’s almost extension theorem. Step 2: Extend 퐹1 to 퐹2 on the whole space such that

1. ∀푥 ∈ 풩훿, ‖퐹2(푥) − 푓(푥)‖푌 . 퐿훿.

2. ‖퐹2(푥) − 퐹2(푦)‖푌 . 퐿(‖푥 − 푦‖푋 + 훿).

3. Supp(퐹2) ⊆ 2퐵푋 .

4. 퐹2 is smooth.

Denote 훼(푡) = max{1 − |1 − 푡|, 0}.

53 MAT529 Metric embeddings and geometric inequalities

Let

퐹2(푥) = 훼(‖푥‖푋 )퐹1 (푥) ‖푥‖푋 .

퐹2 still satisfies condition 1. As for condition 2, ⃦ ⃦ ⃦ ⃦ ⃦ 푥 푦 ⃦ ‖퐹 (푥) − 퐹 (푦)‖ = ⃦훼(‖푥‖ )퐹 − 훼(‖푦‖ )퐹 ⃦ 2 2 푌 ⃦ 푋 1 ‖푥‖ 푋 1 ‖푦‖ ⃦ 푋⃦ ⃦ 푋⃦ ⃦ ⃦ ⃦ ⃦ ⃦ ⃦ 푥 ⃦ ⃦ 푥 푦 ⃦ ≤ |훼(‖푥‖) − 훼(‖푦‖)| ⃦퐹 ⃦ +훼(‖푦‖) ⃦퐹 − 퐹 ⃦ ⃦ 1 ‖푥‖ ⃦ ⃦ 1 ‖푥‖ 1 ‖푦‖ ⃦ ⏟ ⏞ 푋 푋 푋 ≤2퐿(︃⃦ ⃦ )︃ ⃦ ⃦ ⃦ 푥 푦 ⃦ ≤ (‖푥‖ − ‖푦‖)2퐿 + 훼(‖푦‖)퐿 ⃦ − ⃦ + 4훿 ⃦‖푥‖ ‖푦‖⃦ (︃ ⃒ ⃒ )︃ ⃒ ⃒ ⃒ 1 1 ⃒ ‖푥 − 푦‖ ≤ 2퐿 ‖푥 − 푦‖ + 퐿훼(‖푦‖) ‖푥‖ ⃒ − ⃒ + + 4훿 ⃒‖푥‖ ‖푦‖⃒ ‖푦‖ ‖푥 − 푦‖ ‖푥 − 푦‖ ≤ 2퐿 ‖푥 − 푦‖ + 퐿훼(‖푦‖) + + 4훿 ‖푦‖ ‖푦‖ . 퐿(‖푥 − 푦‖ + 훿), where in the last step we used 훼(‖푦‖) ≤ ‖푦‖ and 훼(‖푦‖) ≤ 1.

Note 퐹2 is smooth because the sum for 퐹1 was against a partition of unity and ‖·‖푋 is smooth, although we don’t have uniform bounds on smoothness for 퐹2. Step 3: We make 퐹 smoother by convolving.

Lemma 3.2.6 (Begun, 1999). Let 퐹2 : 푋 → 푌 satisfy ‖퐹2(푥) − 퐹2(푦)‖푌 ≤ 퐿(‖푥 − 푦‖푋 +훿). Let 휏 ≥ 푐훿. Define ∫︁ 1 퐹 (푥) = 퐹2(푥 + 푦) 푑푦. Vol(휏퐵푋 ) 휏퐵푋 Then 훿푛 ‖퐹 ‖ ≤ 퐿 1 + . Lip 2휏

The lemma proves the almost extension theorem as follows. We passed from 푓 : 풩훿 → 푌 to 퐹1 to 퐹2 to 퐹 . If 푥 ∈ 풩훿, ⃦ ⃦ ⃦ ∫︁ ⃦ ⃦ 1 ⃦ ⃦ ⃦ ‖퐹 (푥) − 푓(푥)‖푌 = ⃦ (퐹2(푥 + 푦) − 푓(푥)) 푑푦⃦ Vol(휏퐵푋 ) 퐵푋 ∫︁ 1 ≤ ‖퐹2(푥 + 푦) − 퐹2(푥)‖푌 + ‖퐹2(푥) − 푓(푥)‖푌 푑푦 Vol(휏퐵푋 ) 휏퐵푋 ⏟ ⏞ ∫︁ 훿퐿 1 ≤ (퐿(‖푦‖푋 +훿퐿)) 푑푦 . 퐿휏. Vol(휏퐵푋 ) 휏퐵푋 ⏟ ⏞ ≤휏

Now we prove the lemma.

54 MAT529 Metric embeddings and geometric inequalities

Proof. We need to show 훿푛 ‖퐹 (푥) − 퐹 (푦)‖ ≤ 퐿 1 + ‖푥 − 푦‖ . 푌 2휏 푋

Without loss of generality 푦 = 0, Vol(휏퐵푋 ) = 1. Denote

푀 = 휏퐵푋 ∖(푥 + 휏퐵푋 ) ′ 푀 = (푥 + 휏퐵푋 )∖휏퐵푋 .

We have ∫︁ ∫︁

퐹 (0) − 퐹 (푥) = 퐹푧(푦) 푑푦 − 퐹푧(푦) 푑푦. 푀 푀 ′

Define 휔(푧) to be the Euclidean length of the interval (푧 + R푥) ∩ (휏퐵푋 ). By Fubini, ∫︁

휔(푧) 푑푧 = Vol푛(휏퐵푋 ) = 1. Proj푋⊥ (휏퐵푋 )

Denote

푊 = {푧 ∈ 휏퐵푋 :(푧 + R푥) ∩ (휏퐵푋 ) ∩ (푥 + 휏퐵푋 ) ̸= 휑} 푁 = 휏퐵푋 ∖푊.

Define 퐶 : 푀 → 푀 ′ a shift in direction 푋 on every fiber that maps the interval푧 ( + R푥) ∩ 푀 → (푧 + R푥) ∩ 푀 ′.

55 MAT529 Metric embeddings and geometric inequalities

퐶 is a measure preserving transformation with ⎧ ⎨ ‖푥‖푋 , 푧 ≤ 푁 ‖푧 − 퐶(푧)‖ = ‖푥‖ 푋 ⎩휔(푧) 푋 , 푧 ∈ 푊 ∩ 푀. ‖푥‖2

(In the second case we translate by an extra factor 휔(푧) .) Then ‖푥‖2 ⃦ ⃦ ⃦∫︁ ∫︁ ⃦ ⃦ ⃦ ‖퐹 (0) − 퐹 (푥)‖푌 = ⃦ 퐹2(푦) 푑푦 − 퐹2(푦) 푑푦⃦ ⃦ 푀 푀 ′ ⃦ 푌 ⃦∫︁ ⃦ ⃦ ⃦ = ⃦ (퐹2(푦) − 퐹2(퐶(푦))) 푑푦⃦ ∫︁ 푀 푌

≤ 퐿(‖푦 − 퐶(푦)‖푋 + 훿) 푑푦 ∫︁푀

≤ 퐿(‖푦 − 퐶(푦)‖푋 + 훿) 푑푦 푀 ∫︁

= 퐿훿Vol(푀) + 퐿 ‖푦 − 퐶(푦)‖푋 푑푦 ∫︁ ∫︁ ∫︁ 푀 휔(푦) ‖푥‖푋 ‖푦 − 퐶(푦)‖푋 푑푦 = ‖푥‖푋 푑푦 + 푑푦 푀 푁 푊 ∩푀 ‖푥‖ ∫︁ 2 휔(푧) ‖푥‖푋 = ‖푥‖푋 Vol(푁) + ‖푥‖2 푑푧 orthogonal decomposition Proj(푊 ∩푀) ‖푥‖2

= ‖푥‖푋 Vol(푁) + Vol(휏퐵푋 ∖푁)

= ‖푥‖푋 Vol(휏퐵푋 ) = ‖푥‖푋 .

‖푥‖ We show 푀 = 휏퐵푋 ∖(푥 + 휏퐵푋 ) ⊆ 휏퐵푋 ∖(1 − 휏 )휏퐵푋 . Indeed, for 푦 ∈ 푀, ‖푦 − 푥‖ ≥ 휏 푋 ‖푥‖ ‖푦‖ ≥ 휏 − ‖푥‖ = 1 − 휏. 휏

56 MAT529 Metric embeddings and geometric inequalities

Then ‖푥‖ ‖푥‖ 푛 ‖푥‖ Vol(푀) ≤ Vol(휏퐵 ) − Vol 1 − 휏퐵 = 1 − 1 − 푋 휏 푋 휏 . 휏

Bourgain did it in a more complicated, analytic way avoiding geometry. Begun notices that careful geometry is sufficient. Later we will show this theorem is sharp.

3 Proof of Bourgain’s discretization theorem

At small distances there is no guarantee on the function 푓. Just taking derivatives is danger- ous. It might be true that we can work with the initial function. But the only way Bourgain figured out how to prove the theorem was to make a 1-parameter family of functions.

3.1 The Poisson semigroup

푛 Definition 3.3.1: The Poisson kernel is 푃푡(푥): R → R given by (︀ 푛+1 퐶푛푡 Γ 2 푃푡(푥) = 2 푛+1 , 퐶푛 = 푛+1 . 2 2 2 (푡 + ‖푥‖2) 휋 ∫︀ Proposition 3.3.2 (Properties of Poisson kernel): 1. For all 푡 > 0, R푛 푃푡(푥) 푑푥 = 1.

2. (Semigroup property) 푃푡 * 푃푠 = 푃푡+푠.

̂︁ −2휋‖푥‖ 푡 3. 푃푡(푥) = 푒 2 .

Lemma 3.3.3. Let 퐹 be the function obtained from Bourgain’s almost extension theo- rem 3.2.1. For all 푡 > 0, ‖푃푡 * 퐹 ‖Lip . 1. Proof. We have ∫︁

푃푡 * 퐹 (푥) − 푃푡 * 퐹 (푦) = 푃푡(푧)(퐹 (푥 − 푧) − 퐹 (푥 − 푦)) 푑푧. R푛 Now use the fact that 퐹 is Lipschitz.

Our goal is to show there exists 푡0 > 0, 푥 ∈ 퐵 such that if we define

′ eq:bdt-T푇 = (푃푡0 * 퐹 ) (푥): 푋 → 푌, (3.1)

−1 (i.e., 푇 푎 = 휕푎(푃푡0 *퐹 )(푥)). We showed ‖푇 ‖ . 1. It remains to show ‖푇 ‖ . 퐷. Then 푇 has distortion at most 푂(퐷), and hence 푇 gives the desired extension in Bourgain’s discretizaton theorem 1.2.4. 3-2: Today we continue with Bourgain’s Theorem. Summarizing our progress so far:

57 MAT529 Metric embeddings and geometric inequalities

1. Initially we had a function 푓 : 풩훿 → 푌 and a 훿-net 풩훿 of 퐵푋 .

2. From 푓 we got a new function 퐹 : 푋 → 푌 with supp(퐹 ) ⊆ 3퐵푋 , satisfying the following. (None of the constants matter.)

(a) ‖퐹 ‖퐿푝 ≤ 퐶 for some constant.

(b) For all 푥 ∈ 풩훿, ‖퐹 (푥) − 푓(푥)‖푌 ≤ 푛훿.

푛 Let 푃푡 : R → R be the Poisson kernel. What can be said about the convolution 푃푡 * 퐹 ? We know that for all 푡 and for all 푥, ‖(푃푡 *퐹 )(푥)‖푋→푌 ≤ 1. The goal is to make this function invertible. More precisely, the norm of the inverted operator is not too small. We’ll ensure this one direction at a time by investigating the directional derivatives. For 푛 푔 : R → 푌 , the directional derivative is defined by: for all 푎 ∈ 푆푋 , 푔(푥 + 푡푎) − 푔(푥) 휕푎푔(푥) = lim . 푡→0 푡 There isn’t any particular reason why we need to use the Poisson kernel; there are many other kernels which satisfy this. We definitely need the semigroup property, in addition to all sorts of decay conditions. We need a lot of lemmas.

′ ′′ Lemma 3.3.4 (Key lemma 1). lem:bdt-1 There are universal constants 퐶, 퐶 , 퐶 such that the 1 1 following hold. Suppose 푡 ∈ (0, 2 ], 푅 ∈ (0, ∞), 훿 ∈ (0, 100푛 ) satisfy 푡 log(3/푡) 1 훿 ≤ 퐶 √ ≤ 퐶′ (3.2) 푛 푛5/2퐷2 퐶′′ 퐶′′푛3/2퐷2 log(3/푡) ≤ 푅 ≤ √ . (3.3) 푡 푛

1 Then for all 푥 ∈ 2 퐵푋 , 푎 ∈ 푆푋 , we have 퐶′′′ (‖휕 (푃 * 퐹 )‖ * 푃 )(푥) ≥ 푎 푡 푌 푅푡 퐷

This result says that given a direction 푎 and look at how long the vector 휕푎(푃푡 * 퐹 ) is, it is not only large, but large on average. Here the average is over the time period 푅푡. Note that I haven’t been very careful with constants; there are some points where the constant is missing/needs to be adjusted. The first condition(︀ says is that 푡 is not too small, but the lower bound is extremely small. 1 푘 −푒푛 The upper bound is 푛 for some 푘 whereas the lower bound is something like 푒 , so there’s a huge gap. (︀ 1 The second condition says that log 푡 is still exponential. This lemma will come up later on and it will be clear. From now on, let us assume this key lemma and I will show you how to finish. We will go back and prove it later.

58 MAT529 Metric embeddings and geometric inequalities

Lemma 3.3.5 (Key lemma 2). lem:bdt-2 There’s a constant 퐶1 such that the following holds (we can take 퐶1 = 8). Let 휇 be any Borel probability measure on 푆푋 . For every 푅, 퐴 ∈ (0, ∞), 퐴 there exists (푅+1)푚+1 ≤ 푡 ≤ 퐴 such that ∫︁ ∫︁ ∫︁ ∫︁ 퐶1vol(3퐵푋 ) ‖휕푎(푃푡 * 퐹 )(푥)‖푌 푑푥푑휇(푎) ≤ ‖휕푎(푃(푅+1)푡 * 퐹 )‖푌 푑푥 푑휇(푎) + 푛 푛 푆푋 R 푆푋 ⏟ R ⏞ 푚 (*) Basically, under convolution, we can take the derivative under the integral sign. Thus (*) is an average of what we wrote on the left. Averaging only makes things smaller for norms, because norms are convex. Thus the right integral is less than the left integral, from Jensen’s inequality. The lemma says that adding that small factor, we get a bound in the opposite direction. We will argue that there must be a scale at which Jensen’s inequality stabilizes, i.e., ‖휕푎(푃푡 * 퐹 )‖푌 * 푃푅푡(푥) stabilizes to a constant. Note this is really the pigeonhole argument, geometrically. Proof. Bourgain uses this technique a lot in his papers: find some point where the inequality is equality, and then leverage that. 푡 If the result does not hold, then for every 푡 in the range (푅+1)푚+1 ≤ 푡 ≤ 퐴, the reverse holds. We’ll use the inequality at eveyr point in a geometric series. For all 푘 ∈ {0, . . . , 푚+1}, ∫︁ ∫︁ ∫︁ ∫︁ 퐶1vol(3퐵푋 ) ‖휕푎(푃퐴(푅+1)푘−푚−1 *퐹 )(푥)‖푌 푑푥푑휇(푎) > ‖휕푎(푃퐴(푅+1)푘−푚 *퐹 )(푥)‖푌 푑푥푑휇(푎)+ . 푛 푛 푆푋 R 푆푋 R 푚 Summing up these inequalities and telescoping ∫︁ ∫︁ ∫︁ ∫︁ 8(푚 + 1)vol(3퐵푋 ) ‖휕푎(푃퐴(푅+1)−푚−1 *퐹 )(푥)‖푌 푑푥푑휇(푎) > ‖휕푎(푃퐴(푅+1)*퐹 )(푥)‖푌 푑푥푑휇(푎)+ 푛 푛 푆푋 R 푆푋 R 푚 (3.4) (recall that we’ve been using the semigroup properties of the Poisson kernel this whole time). Now why is this conclusion absurd? Take 퐶1 to be the bound on the Lipschitz constant of 퐹 . Because ‖휕푎퐹 ‖푌 ≤ 8, we have휕푎(푃퐴(푅+1)−푚−1 * 퐹 )(푥) ≤ 퐶1. Since partial derivatives commute with the integral sign, we get ∫︁ ∫︁ ∫︁ ∫︁

‖휕푎(푃퐴(푅+1)−푚−1 * 퐹 )(푥)‖푌 푑푥 푑휇(푎) = ‖(푃퐴(푅+1)−푚−1 * 휕푎퐹 )(푥)‖푌 푑푥 푑휇(푎) 푛 푛 푆푋 R 푆푋 R (3.5) ∫︁ ∫︁

≤ 퐶1 ≤ 퐶1vol(퐵푋 ) (3.6) 푛 푆푋 R because 퐹 is Lipschitz and the Poisson semigroup integrates to unity. Together (3.4) and (3.6) give a contradiction. Now assuming Lemma 3.3.4 (Key Lemma 1), let’s complete the proof of Bourgain’s (︀ 2푛 1 (퐶퐷) discretization theorem. Assume from now on 훿 < 푐퐷 where 퐶 is a large enough constant that we will choose (퐶 = 500 or something).

59 MAT529 Metric embeddings and geometric inequalities

1 Proof of Bourgain’s Discretization Theorem 1.2.4. Let ℱ ⊆ 푆푋 be a -net in 푆푋 . Then 퐶2퐷 푛 |ℱ| ≤ (퐶3퐷) for some 퐶3. We will apply the Lemma 3.3.5 (Key Lemma 2) with 휇 the uniform measure on ℱ,

퐴 = (1/퐶퐷)5푛 푅 + 1 = (퐶퐷)4푛 푚 = ⌈(퐶퐷)푛+1⌉.

Then there exists (1/(퐶퐷))(퐶퐷)2푛 ≤ 푡 ≤ (1/(퐶퐷))5푛 such that ∑︁ ∫︁ ∑︁ ∫︁ 8vol(3퐵푋 ) eq:bdt-pf-lem2 ‖휕푎(푃푡 * 퐹 )(푥)‖푌 푑푥 ≤ ‖휕푎(푃(푅+1)푡 * 퐹 )(푥)‖푌 푑푥 + |ℱ| 푛 푛 푎∈ℱ R 푎∈ℱ R 푚 (3.7) We check the conditions of Lemma 3.3.4 (Key Lemma 1). 1. For (3.2), note 푡 is exponentially small, so the RHS inequality is satisfied. For the LHS 퐶 ln( 1 ) 퐶 ln( 1 ) √ 푡 √ 푡 inequality, note 훿 ≤ 푡 and 푛 ≤ 1. To see the second inequality, note 푛 ≥ 퐶5푛√ln(퐶퐷) 푛 ≥ 1 (for 푛 large enough). (︀ 1 2푛 2. For (3.2), note the LHS is dominated by ln 푡 ≤ (퐶퐷) ln(퐶퐷) which is much less 4푛 1 5푛 than 푅 = (퐶퐷) − 1, and 푡 , the dominating term on the RHS, is ≥ (퐶퐷) .

In my paper (reference?), I wrote what the exact constants are. They’re written in the paper, and are not that important. We’re choosing our constants big enough so that our inequalities hold true with room to spare. Now we can use Key Lemma 1, which says 1 eq:bdt-pf-lem1 (‖휕 (푃 * 퐹 )‖ * 푃 )(푥) ≥ (3.8) 푎 푡 푌 푅푡 퐷

Using 푃(푅+1)푡 = 푃푅푡 * 푃푡 (semigroup property) and Jensen’s inequality on the norm, which is a convex function, we have

eq:bdt-conv‖휕푎(푃(푅+1)푡 *퐹 )(푥)‖푌 = ‖ (휕푎(푃푡 * 퐹 ))*푃푅푡(푥)‖푌 ≤ (‖휕푎(푃푡 * 퐹 )‖푌 * 푃푅푡)(푥). (3.9) Since the norm is a convex function, by Jensen’s we get our inequality. Let (︀ 휓(푥) := (‖휕푎(푃푡 * 퐹 )‖푌 * 푃푅푡)(푥) − ‖휕푎 푃(푅+1)푡 * 퐹 (푥)‖푌 From (3.9), 휓(푥) ≥ 0 pointwise. Using Markov’s inequality, {︂ }︂ ∫︁ 1 1 (︀ vol 푥 ∈ 퐵푋 : 휓(푥) > ≤ 퐷 (‖휕푎(푃푡 * 퐹 )‖푌 * 푃푅푡)(푥) − ‖휕푎 푃(푅+1)푡 * 퐹 (푥)‖푌 )푑푥. 2 퐷 R푛 because we can upper bound the integral over the ball by an integral over R푛. Note we are using the probabilistic method.

60 MAT529 Metric embeddings and geometric inequalities

This inequality was for a fixed 푎. We now use the union bound over ℱ, a 훿-net of 푎’s to get {︂ }︂ 1 vol 푥 ∈ 퐵 : ∃푎 ∈ ℱ, 휓 (푥) > 1/퐷 (3.10) 2 푋 푎 ∑︁ 1 ≤ vol(푥 ∈ 퐵푋 : 휓푎(푥) > 1/퐷) (3.11) 푎∈ℱ 2 ∑︁ ∫︁ (︀ ≤ 퐷 ‖휕푎(푃푡 * 퐹 )‖푌 * 푃푅푡)(푥) − ‖휕푎(푃(푅+1)푡 * 퐹 )(푥)‖푌 푑푥 (3.12) 푛 푎∈ℱ R ∑︁ ∫︁ (︀ = 퐷 ‖휕푎(푃푡 * 퐹 )‖푌 )(푥) − ‖휕푎(푃(푅+1)푡 * 퐹 )(푥)‖푌 푑푥 (3.13) 푛 푎∈ℱ R (︂ )︂ 퐶 vol(3퐵 ) 1 ≤ 1 푋 (퐶 퐷)푛 < vol 퐵 . (3.14) 푚 3 2 푋 where (3.13) follows because when you convolve something with a probability measure, the integral over R푛 does not change, and (3.14) follows from (3.7) and our choice of 푚. We’ve proved that there must exist a point in half the ball such that for every 푎 ∈ ℱ, our net, we have 1 ≥ ‖휕 (푃 * 퐹 )‖ * 푃 (푥) − ‖휕 (푃 * 퐹 )(푥)‖ 퐷 푎 푡 푌 푅푡 푎 (푅+1)푡 푌 1 1 I think we want either this to be 2퐷 , or (3.8) to be 퐷/2 , in order to get the following bound 1 1 (with 2퐷 or 퐷 ). This involves changing some constants in the proof. From this we can conclude that for this 푥,

‖푇 푎‖푌 = ‖휕푎(푃(푅+1)푡 * 퐹 )(푥)‖푌 ≥ 1/퐷 −1 where 푎 ∈ 푆푋 we let 푡0 = (푅 + 1)푡. This shows ‖푇 ‖ ≤ 퐷. For the exact constants, see the paper (reference). Note this is very much a probabiblistic existence statement or result. Usually we estimate by hand the random variable we want to care about. Here we want to prove an existential bound, so we estimate the probability of the bad case. But we estimate the probability of the bad case using another proof by contradiction. It remains to estimate some integrals. 3-7: Finishing Bourgain. There’s a remaining lemma about the Poisson semigroup that we’re going to do today (Lemma 3.3.4); it’s nice, but it’s not as nice as what came before. The last remaining lemma to prove is Lemma 3.3.4. √1 Remember that 푛 ‖푥‖2 ≤ ‖푥‖푋 ≤ ‖푥‖2 for all 푥 ∈ 푋. We need three facts about the Poisson kernel.

Proposition 3.3.6 (Fact 1): pr:pois1 ∫︁ √ 푡 푛 푃푡(푥)푑푥 ≤ . 푛 R ∖(푟퐵푋 ) 푟

61 MAT529 Metric embeddings and geometric inequalities

Proof. First note that the Poisson semigroup has the property that 1 eq:pois-rescale푃 (푥) = 푃 (푥/푡), (3.15) 푡 푡푛 1 i.e., 푃푡 is just a rescaling of 푃1. So it’s enough to prove this for 푡 = 1. We have ∫︁ ∫︁

푃푡(푥) 푑푥 ≤ 푃푡(푥) 푑푥 ‖푥‖푋 ≥푟 ∫︁‖푥‖2≥푟

= 푃1(푥) 푑푥 ‖푥‖2≥푟/푡 by change of variables and (3.15) ∫︁ ∞ 푠푛−1 = 퐶푛푆푛−1 푑푠 푟/푡 (1 + 푠2)(푛+1)/2 푛−1 polar coordinates, where 푆푛−1 = vol(S ) ∫︁ ∞ 1 ≤ 퐶푛푆푛−1 푑푠 푟/푡 푠2 푡 = 퐶푛푆푛−1 푟√ 푡 푛 ≤ , 푟 where the last inequality follows from expanding in terms of the Gamma function and using Stirling’s formula. Above we changed to polar coordinates using ∫︁ ∫︁ 푅 푛−1 푓(‖푥‖) 푑푥 = 푆푛−1 ‖푥‖ 푓(푟) 푑푟. ‖푥‖2≤푅 0

Proposition 3.3.7 (Fact 2): For all 푦 ∈ R푛, ∫︁ √ 푛‖푦‖2 |푃푡(푥) − 푃푡(푥 + 푦)|푑푥 ≤ . R푛 푡 Proof. It’s again enough to prove this when 푡 = 1. ⃒ ⃒ ∫︁ ∫︁ ⃒∫︁ ⃒ ⃒ 1 ⃒ |푃푡(푥) − 푃푡(푥 + 푦)| 푑푥 = ⃒ ⟨∇푃푡(푥 + 푠푦), 푦⟩푑푠⃒ 푑푥 R푛 R푛 ∫︁0

≤ ‖푦‖2 ‖∇푃푡(푥)‖2푑푥 Cauchy-Schwarz 푛 R ∫︁ ‖푥‖2 ≤ ‖푦‖2(푛 + 1)퐶푛 푛+3 푑푥 computing gradient 푛 2 R (1 + ‖푥‖ ) 2 ∫︁ 2 ∞ 푟푛 = ‖푦‖2(푛 + 1)퐶푛푆푛−1 푛+3 푑푟 polar coordinates 0 (1 + 푟2) 2 √ ≤ 푛 ‖푦‖2 .

62 MAT529 Metric embeddings and geometric inequalities

The integral and multiplicative constants perfectly cancels out the 푛 + 1 and becomes 1 (calculation omitted).

1 Proposition 3.3.8 (Fact 3): For all 0 < 푡 < 2 and 푥 ∈ 퐵푋 , we have √ ‖푃푡 * 퐹 (푥) − 퐹 (푥)‖푌 ≤ 푛푡 log(3/푡).

Proof. The LHS equals ⃦ ⃦ ⃦∫︁ ⃦ ∫︁ ∫︁ ⃦ ⃦ ⃦ (퐹 (푥 − 푦) − 퐹 (푥)) 푃푡(푦)푑푦⃦ ≤ ‖퐹 (푥 − 푦) − 퐹 (푥)‖ 푃푡(푦)푑푦 + 푐 푃푡(푦)푑푦 . 푛 푌 푛 R 푌 ⏟ 푥+3퐵푋 ⏞ ⏟ R ∖(푥+3퐵 ⏞푋 ) (3.16) (3.17)

푐 Using ‖퐹 (푥 − 푦) − 퐹 (푥)‖푌 ≤ ‖퐹 ‖Lip 휌 ·‖푦‖푋 and that ‖퐹 ‖Lip is a constant, ∫︁

(3.16) = ‖푦‖푋 푃푡(푦)푑푦 푥+3퐵푋 ∫︁ √ ≤ √ ‖푦‖2푃푡(푦)푑푦 using 푥 ∈ 퐵푋 =⇒ 푥 + 3퐵푋 ⊆ 4 푛퐵푙2 4 푛퐵푙푛 2 √ ∫︁ 4 푛 푛 푡 푠 = 푡퐶푛푆푛−1 푑푠 polar coordinates 0 (1 + 푠2)(푛+1)/2 ∫︁ √ ∫︁ √ √ 푛 1 4 푛/푡 1 ≤ 푡 푛 √ 푑푠 + √ 푑푠 0 푛 푛 푠 (︂ (︂ )︂)︂ √ 4 ≤ 푡 푛 1 + ln , 푡

푠푛 √ 1 where we used the fact that 푛+1 is maximized when 푠 = 푛, and is always ≤ 푠 . (1+푠2) 2 For the second term, note that 2퐵푋 ⊆ 푥 + 3퐵푋 , so ∫︁ √ 푐푡 푛 (3.17) = 푐 푃푡(푦) ≤ 푛 R ∖2퐵푋 2 where we used the tail bound of Fact 1 (Pr. 3.3.7). Adding these two bounds gives the result.

Now we have these three facts, we can finish the proof of the lemma.

Proof of Lemma 3.3.4.

√ 1 Claim 3.3.9 (푃푡 * 퐹 separates points). Let 휃 = 푐퐷 푛푡 log(3/푡). Let 푤, 푦 ∈ 2 퐵푋 with ‖푤 − 푦‖푋 ≥ 휃. Then ‖푃푡 * 퐹 (푤) − 푃푡 * 퐹 (푦)‖푌 ≥ 1/퐷 We do not claim at all that these numbers are sharp.

63 MAT529 Metric embeddings and geometric inequalities

Proof. We can find a point 푝 ∈ 풩훿 which is 훿 away from 푤, and 푞 ∈ 풩훿 which is 훿 away from 푦. We also know that 퐹 is close to 푓, and we can use Fact 3 (Pr. 3.3.8) to bound the error of convolving.

By the triangle inequality

‖푃푡 * 퐹 (푤) − 푃푡 * 퐹 (푦)‖푌 ≥ ‖푓(푝) − 푓(푞)‖ − ‖퐹 (푝) − 푓(푝)‖ − ‖퐹 (푞) − 푓(푞)‖ − ‖퐹 (푤) − 퐹 (푝)‖ − ‖퐹 (푦) − 퐹 (푞)‖

− ‖푃푡 * 퐹 (푤) − 퐹 (푤)‖ − ‖푃푡 * 퐹 (푦) − 퐹 (푦)‖ ‖푝 − 푞‖ √ ≥ − 2푛훿 − 푐훿 − 푛푡 log(3/푡) 퐷 ‖푦 − 푤‖ − 2훿 √ ≥ − 2푛훿 − 푐훿 − 푛푡 log(3/푡). 퐷

Taking √ 휃 = 푐퐷 푛푡 log(3/푡), √ √we find this is at least a constant times ‖푦 − 푤‖/퐷, we need 휃 = 푐퐷 푛푡 log(3/푡). Note that 푛푡 log(3/푡) is the largest error term because by the assumptions in the lemma, it is greater than 2푛훿.

Now consider

‖푃푡 * 퐹 (푧 + 휃푎) − 푃푡 * 퐹 (푧)‖

1 By the claim, if 푧 ∈ 4 퐵푋 , then we know

휃/퐷 ≤ ‖푃푡 * 퐹 (푧 + 휃푎) − 푃푡 * 퐹 (푧)‖ (3.18) ∫︁ 휃 = ‖휕푎(푃푡 * 퐹 )(푧 + 푠푎)‖푌 (3.19) 0

64 MAT529 Metric embeddings and geometric inequalities

1 1 1 Suppose ‖푥 − 푦‖ ≤ 4 (we will apply (3.19) to 푥 − 푦 = 푧), 푥 ∈ 2 퐵푋 , 푦 ∈ 8 퐵푋 . By throwing away part of the integral, ∫︁ ∫︁ 1 휃 ‖휕푎(푃푡 * 퐹 )(푥 + 푠푎 − 푦)‖ 푃푅푡(푦)푑푦 (3.20) 푛 푌 휃 ∫︁0 ∫︁R 1 휃 ≥ ‖휕푎(푃푡 * 퐹 )(푥 − 푦 + 푠푎)‖푌 푃푅푡(푦) 푑푠 푑푦 (3.21) 휃 0 (1/8)퐵 ∫︁ (︂ ∫︁푋 )︂ 1 휃 = ‖휕푎(푃푡 * 퐹 )(푥 − 푦 + 푠푎)‖푌 푑푠 푃푅푡(푦) 푑푦 by Fubini (3.22) (1/8)퐵 휃 0 ∫︁ 푋 1 ≥ 푃푅푡(푦)푑푦 by (3.19) (3.23) 퐷 (1/8)퐵푋 푐 ≥ (3.24) 퐷

for some constant 푐. The calculation above says that the average length in every direction is big, and if you average the average length again, it is still big. How do we get rid of the additional averaging? We care about ∫︁

‖휕푎(푃푡 * 퐹 )‖* 푃푅푡(푥) = ‖휕푎(푃푡 * 퐹 )(푥 − 푦)‖푌 푃푅푡(푦)푑푦 푛 R∫︁ ∫︁ 1 휃 = ‖휕푎(푃푡 * 퐹 )(푥 − 푦 + 푠푎)‖푌 푃푅푡(푦)푑푦 푛 휃 0∫︁ ∫︁R 1 휃 − ‖휕푎(푃푡 * 퐹 )(푥 − 푦)‖푌 (푃푅푡(푦 + 푠푎) − 푃푅푡(푦))푑푦 푛 휃 0 ∫︁R ∫︁ 푐 푐′ 휃 > − |푃푅푡(푦 + 푠푎) − 푃푅푡(푦)|푑푠 푑푦 by (3.24) and what? 퐷 휃 0 푛 ∫︁ √R 푐 푐′ 휃 푛푠‖푎‖ ≥ − 2 푑푠 by Fact 2 (Pr. 3.3.7) 퐷 휃 0 푅푡

This error term is (︀ ∫︁ √ (︂ )︂ 푐′ 휃 푛푠‖푎‖ 푐퐷푛 log 3 1 2 푑푠 = 푡 = 푂 휃 0 푅푡 푅 퐷 (︀ by (3.2), so ‖휕 (푃 *퐹 )‖*푃 (푥) = Ω 1 (if we chose constants correctly), as desired. Factor √ 푎 푡 푅푡 퐷 of 푛?

To summarize: You compute the error such that you have distance 휃. Then you look at the average derivative on each line like this. Then, the average derivative is roughly ‖휕푎(푃푡 *퐹 )‖*푃푅푡(푥). So averaging the derivative in a bigger ball is the same up to constants as taking a random starting point and averaging over the ball. What’s written in the lemma is that the derivative of a given point in this direction, averaged over a little bit bigger ball, it’s not unreasonable to see that at first take starting point and going in direction 휃 and averaging in the bigger ball is equivalent to randomizing over starting points. 3-21: A better bound in Bourgain Discretization

65 MAT529 Metric embeddings and geometric inequalities

4 Upper bound on 훿푋→푌

We give an example where we need a discretization bound 훿 going to 0.

Theorem 3.4.1. There exist Banach spaces 푋, 푌 with dim 푋 = 푛 such that if 훿 < 1 and 1 풩훿 is a 훿-net in the unit ball and 퐶푌 (풩훿) & 퐶푌 (푋), then 훿 . 푛 .

Together with the lower bound we proved, this gives

1 1 < 훿 (1/2) < 휌푛퐶푛 푋→푌 푛

The lower bound would be most interesting to improve. 푛 The example will be 푋 = ℓ1 , 푌 = ℓ2. We need two ingredients. Firstly, we show that (퐿1, ‖푥 − 푦‖1) is isometric to a subset of 퐿2. More generally, we prove the following. Lemma 3.4.2. Given a measure space (Ω, 휇), (퐿1(휇), ‖푥 − 푦‖1) is isometric to a subset of 퐿2(Ω × R, 휇 × 휆), where 휆 is the Lesbegue measure of R.

Note all separable Hilbert spaces are the same, 퐿2(Ω × R, 휇 × 휆) is the same whenever it is separable.

Proof. First we define 푇 : 퐿1(휇) → 퐿2(휇 × 휆) as follows. For 푓 ∈ 퐿1(휇), let ⎧ ⎪ ⎨⎪1 0 ≤ 푓(푤) ≤ 푥 푇 푓(푤, 푥) = −1 푥 ≤ 푓(푥) ≤ 0 ⎪ ⎩0 otherwise

Consider two functions 푓1, 푓2 ∈ 퐿1(휇). The function |푇 푓1 − 푇 푓2| is the indicator of the area between the graph of 푓1 and the graph of 푓2.

66 MAT529 Metric embeddings and geometric inequalities

Thus the 퐿2 norm is, letting 1 be the signed indicator, ∫︁ ∫︁ (︀ 1 2 ‖푇 푓1 − 푇 푓2‖퐿2(휇×휆) = 푓1(푤),푓2(푤) (푥) 푑푥 푑휇 Ω R ∫︁

= |푓1(푤) − 푓2(푤)| 푑휇(푤) Ω

푝/푞 More generally, if 푞 ≤ 푝 < ∞, then (퐿푞, ‖푥 − 푦‖푞 ) is isometric to a subset of 퐿푝. We may prove this later if we need it. The second ingredient is due to Enflo .

Theorem 3.4.3 (Enflo,√ 1969). The best possible embedding of the hypercube into Hilbert space has distortion 푛: 푛 √ 퐶2({0, 1} , ‖·‖1) = 푛

Note that we don’t just calculate 퐶2 up to a constants here; we know it exactly. This is very rare. Proof. We need to show an upper bound and a lower bound. 1. To show the upper bound, we show the identity mapping {0, 1}푛 → ℓ푛 has distortion √ 2 푛. We have (︃ )︃ (︃ )︃ ∑︁푛 1/2 ∑︁푛 1/2 2 ‖푥 − 푦‖2 = |푥푖 − 푦푖| = |푥푖 − 푦푖| = ‖푥 − 푦‖1, 푖=1 푖=1 since the values of 푥, 푦 are only 0, 1. Then 1 √ ‖푥 − 푦‖ ≤ ‖푥 − 푦‖ = ‖푥 − 푦‖ ≤ ‖푥 − 푦‖ 푛 1 2 1 1

푛 √ Thus 퐶2({0, 1} ) ≤ 푛. (So this theorem tells us that if you want to represent the boolean hypercube as a Euclidean geometry, nothing does better than the identity mapping.)

67 MAT529 Metric embeddings and geometric inequalities

푛 √ 2. For the lower bound 퐶2({0, 1} , ‖·‖1) ≥ 푛, we use the following.

푛 Lemma 3.4.4 (Enflo’s inequality). We claim that for every 푓 : {0, 1} → ℓ2,

∑︁ ∑︁푛 ∑︁ 2 2 ‖푓(푥) − 푓(푥 + 푒)‖2 ≤ ‖푓(푥 + 푒푗) − 푓(푥)‖2, 푛 푛 푥∈F2 푗=1 푥∈F2

푛 where 푒 = (1,..., 1) and 푒푖 are standard basis vectors of R . 푛 Here, addition is over F2 .

This is specific to ℓ2. See the next subsection for the proof. In other words, Hilbert space is of Enflo-type 2 (Definition 1.1.13): the sum of the squares of the lengths of the diagonals is at most the sum of the squares of the lengths of the edges. Suppose 1 ‖푥 − 푦‖ ≤ ‖푓(푥) − 푓(푦)‖ ≤ ‖푥 − 푦‖ . 퐷 1 2 1 √ Our goal is to show that 퐷 ≥ 푛. Letting 푦 = 푥 + 푒,

1 푛 ‖푓(푥 + 푒) − 푓(푥)‖ ≥ ‖(푥 + 푒) − 푥‖ = . 2 퐷 1 퐷 Plugging into Enflo’s inequality 3.4.4, we have, since there are 2푛 points in the space,

2푛(푛/퐷)2 ≤ 푛 · 2푛 · 12 =⇒ 퐷2 ≥ 푛

as desired.

푛 Proof of Theorem 3.4.1. Let 풩 be a 훿-net in 퐵 푛 . Let 푇 : ℓ → 퐿 satisfy ‖푇 (푥)−푇 (푦)‖ = 훿 ℓ1 1 2 2 2 ‖푥 − 푦‖1. 푇 restricted to 풩훿 has distortion ≤ 훿 because for 푥, 푦 ∈ 풩훿 with 푥 ̸= 푦, we have 1 1 √ ‖푥 − 푦‖1 ≤ ‖푥 − 푦‖1 ≤ √ ‖푥 − 푦‖1. 2 훿 푛 √ However, the distortion of ℓ1 in ℓ2 is 푛. The condition 퐶푌 (풩훿) & 퐶푌 (푋) means that for some constant 퐾, 2 √ 2 ≥ 퐾 푛 =⇒ 훿 ≤ . 훿 퐾2푛

68 MAT529 Metric embeddings and geometric inequalities

4.1 Fourier analysis on the Boolean cube and Enflo’s inequality We can prove this by induction on 푛 (exercise). Instead, I will show a proof that is Fourier analytic and which generalizes to other situations.

Definition 3.4.5 (Walsh function): Let 퐴 ⊆ {1, . . . , 푛}. Define the Walsh function

푛 푊퐴 : F2 → {±1}

by 푗∈퐴 푊퐴(푥) = (−1) .

Proposition 3.4.6 (Orthonormality): For 퐴, 퐵 ⊆ {1, . . . , 푛}.

푊퐴(푥)푊퐵(푥) = 훿퐴퐵. E 푛 푥∈F2

푛 Thus, {푊퐴}퐴⊆{1,...,푛} is an orthonormal basis of 퐿2(F2 ). Proof. Without loss of generality 푗 ∈ 퐴∖퐵. Then ∑︀ ∑︀ 푥푗 + 푥푗 E (−1) 푗∈퐴 푗∈퐵 = 0.

푛 Corollary 3.4.7. Let 푓 : F2 → 푋 where 푋 is a Banach space. Then ∑︁ ^ 푓 = 푓(푎)푊퐴 퐴⊆{1,··· ,푛} ∑︀ ∑︀ 푥 ^ 1 푗∈퐴 푗 where 푓(퐴) = 푛 푛 푓(푥)(−1) . 2 푥∈F2 Proof. How do we prove two vectors are the same in a Banach space? It is enough to show that any composition with a linear functional gives the same value; thus it suffices to prove the claim for 푋 = R. The case 푋 = R holds by orthonormality.

Proof of Lemma 3.4.4. It is sufficient to prove the claim for 푋 = R; we get the inequality for ℓ2 by summing coordinates. (Note this is a luxury specific to ℓ2.) The summand of the LHS of the inequality is ∑︁ ^ 푓(푥) − 푓(푥 + 푒) = 푓(퐴)(푊퐴(푥) − 푊퐴(푥 + 푒)) 퐴⊆{1,...,푛} ∑︁ (︀ ^ |퐴| = 푓(퐴)푊퐴(푥) 1 − (−1) 퐴⊆{1,...,푛} ∑︁ ^ = 2푓(퐴)푊퐴(푥) 퐴⊆{1,...,푛},|퐴| odd

69 MAT529 Metric embeddings and geometric inequalities

The summand of the RHS of the inequality is ∑︁ ^ −푓(푥 + 푒푗) + 푓(푥) = 푓(퐴)(푊퐴(푥) − 푊퐴(푥 + 푒푗)) 퐴⊆{1,...,푛} ∑︁ ^ = 2푓(퐴)푊퐴(푥). 퐴⊆{1,...,푛},푗∈퐴 Summing gives ⃦ ⃦ ⃦ ⃦2 ∑︁ ⃦ ∑︁ ⃦ 2 푛 ⃦ ⃦ (푓(푥) − 푓(푥 + 푒)) = 2 ⃦ 2푓^(퐴)푊 ⃦ ⃦ 퐴⃦ 푥∈ 푛 F2 |퐴| odd 퐿 ( 푛) ∑︁ 2 F2 = 2푛 4(푓^(퐴))2 |퐴| odd ∑︁푛 ∑︁ ∑︁푛 ∑︁ (︁ )︁ 2 푛 ^ 2 (푓(푥 + 푒푗) − 푓(푥)) = 2 4 푓(퐴) 푗=1 푥∈ 푛 푗=1 퐴:푗∈퐴 F2 ∑︁ ∑︁ = 2푛 4푓^(퐴)2 퐴 푗∈퐴 ∑︁ = 2푛 4|퐴|푓^(퐴)2 퐴 From this we see that the claim is equivalent to ∑︁ ∑︁ 푓^(퐴)2 ≤ |퐴|푓^(퐴)2 퐴⊆{1,...,푛},|퐴| odd 퐴⊆{1,...,푛} which is trivially true. This inequality invites improvements, as this is a large gap. The Fourier analytic proof made this gap clear.

5 Improvement for 퐿푝

Theorem 3.5.1. thm:bourgain-lp Suppose 푝 ≥ 1 and 푋 is an 푛-dimensional normed space. Then 휖2 훿 (휀) 푋˓→퐿푝 & 푛5/2

In the world of embeddings into 퐿푝, we are in the right ballpark for the value of 훿—we know it is a power of 푛, if not exactly 1/푛. (It is an open question what the right power is.) This is a special case of a more general theorem, which I won’t state. This proof doesn’t use many properties of 퐿푝. Proof. This follows from Theorem 3.5.2 and Theorem 3.5.3.

1 We will prove the theorem for 휀 = 2 . The proof needs to be modified for the 휀-version.

70 MAT529 Metric embeddings and geometric inequalities

Theorem 3.5.2. thm:bourgain-lp2 Let 푋, 푌 be Banach spaces, with dim 푋 = 푛 < ∞. Suppose 푐휀2 3 that there exists a universal constant 푐 such that 훿 ≤ 푛5/2 , and that 풩훿 is a 훿-net of 퐵푋 . Then there exists a separable probability space (Ω, 휇)4, a finite dimensional subspace 푍 ⊆ 푌 and a linear operator 푇 : 푋 → 퐿∞(휇, 푍) such that for every 푥 ∈ 푋 with ‖푥‖푋 = 1, ∫︁ 1 − 휀 ≤ ‖푇 푥(휔)‖푌 푑휇(휔) ≤ esssup휔∈Ω‖(푇 푥)휔‖푌 ≤ 1 + 휖 퐷 Ω Note that the inequality can also be written as 1 − 휀 ‖푇 푥‖ ≤ ‖푇 푥‖ . 퐷 퐿1(휇,푍) 퐿∞(휇,푍)

Now what happens when 푌 = 퐿푝([0, 1])? We have

퐿푝(휇, 푍) ⊆ 퐿푝(휇, 푌 ) = 퐿푝(휇, 퐿푝) = 퐿푝(휇 × 휆)

A different way to think of this is that a function 푓 : 푋 → 퐿∞(휇, 퐿푝([0, 1])) can be thought of as a function of 푥 ∈ 푋 and 휔 ∈ [0, 1], 푓(휔, 푥):Ω × [0, 1] → R. This is a very classical fact in measure theory:

Theorem 3.5.3 (Kolmogorov’s representation theorem). Any separable 퐿푝(휇) space is iso- metric to a subset of 퐿푝.

This is an immediate corollary of the fact that if I give you a separable probability space, there is a separable transformation of the probability space (Ω, 휇) ∼= ([0, 1], 휆) where 휆 is Lebesgue measure. So there is a measure preserving isomorphism if the probability measure is atom-free. In general, the possible cases are

∙ (Ω, 휇) ∼= ([0, 1], 휆) if it is atom free.

∙ (Ω, 휇) ∼= [0, 1] × {1, . . . , 푛} if there are finitely many (푛) atoms ∼ ∙ (Ω, 휇) = [0, 1] × N if there are countably many atoms, ∼ ∙ (Ω, 휇) = {1, . . . , 푛} or N if it’s completely atomic. See Halmos’s book for deails. This is just called Kolmogorov’s representation theorem. Now for this to be an improvement of Bourgain’s discretization theorem for 퐿푝, we need that if 푍 ⊂ 푌 is a finite dimensional subspace and 푊 ⊂ 퐿∞(휇, 푍) is a finite dimensional subspace such that on 푊 , the 퐿∞ and 퐿1 norms are equivalent (a very strong restriction), then 푊 embeds in 푌 . That’s the only property of 퐿푝 we’re using. However, not every 푌 has this property.

3 Let me briefly say something∫︀ about vector-valued 퐿푝 spaces. Whenever you write 퐿푝(휇, 푍), this equals 푝 1/푝 all functions 푓 :Ω → 푍 such that ( ‖푓(휔)‖푍 푑휇(휔)) < ∞ (our norm is bounded). 4it will in the end be a uniform measure on half the ball

71 MAT529 Metric embeddings and geometric inequalities

Theorem 3.5.4 (Ostrovskii and Randrianantoanina). Not every 푌 has this property.

You can construct spaces of functions taking values in a finite dimensional subspace of it which cannot be embedded back into 푌 . Nevertheless, you have to work to find bad counterexamples. This is not published yet. Next class we’ll prove Theorem 3.5.2, which implies our first theorem. What is the advantage of formulating the theorem in this way? We can use Bourgain’s almost-extention theorem along the ball. The function is already Lipschitz, so we can already differentitate it. The measure itself will be the unit ball. We’ll look at the derivative of the extended function in direction 휔 at point 푥. That is what our 푇 will be. We will look in all possible directions. Then the lower bound we have here says that the derivative of the function is only invertible on average, not at every point in the sphere. That tends to be big on average. We will work for that, but you see the advantage: In this formulation, we’re going to get an average lower-bound, not an always lower-bound, and in this way we will be able to shave off two exponents. But this is not for Banach spaces in general. The advantage wehavehere is that 퐿푝 of 퐿푝 is 퐿푝. 3/23: We will prove the improvement of Bourgain’s Discretization Theorem.

Proof of Theorem 3.5.2. There exists 푓 : 풩훿 → 푌 such that 1 ‖푥 − 푦‖ ≤ ‖푓(푥) − 푓(푦)‖ ≤ ‖푥 − 푦‖ 퐷 푌 푌 푋 for 푥, 푦 ∈ 풩훿. By Bourgain’s almost extension theorem, letting 푍 = span(푓(풩훿)), there exists 퐹 : 푋 → 푍 such that

1. ‖퐹 ‖ 1. 퐿푝 .

2. For every 푥 ∈ 풩훿, ‖퐹 (푥) − 푓(푥)‖ ≤ 푛훿. 3. 퐹 is smooth.

1 Let Ω = 2 퐵푋 , with 휇 the normalized Lebesgue measure on Ω: (︀ 1 vol(퐴 ∩ 2 퐵푋 ) 휇(퐴) = 1 . vol( 2 퐵푋 )

Define 푇 : 푋 → 퐿∞(휇, 푍) as follows. For all 푦 ∈ 푋,

′ (푇 푦)(푥) = 퐹 (푥)(푦) = 휕푦퐹 (푥).

This function is in 퐿∞ because

′ ′ ‖(푇 푦)(푥)‖ = ‖퐹 (푥)(푦)‖ ≤ ‖퐹 (푥)‖‖푦‖ . ‖푦‖ . This proves the upper bound.

72 MAT529 Metric embeddings and geometric inequalities

We need ∫︁ 1 ′ ‖푦‖ . ‖푇 푦‖퐿 (휇,푍) = ‖퐹 (푥)푦‖ 푑휇(푥) 퐷 1 1 퐵 2 푋 ∫︁ 1 ′ = 1 ‖퐹 (푥)푦‖ 푑푥. 1 퐵 vol( 2 퐵푋 ) 2 푋

We need two simple and general lemmas.

Lemma 3.5.5. There exists an isometric embedding 퐽 : 푋 → ℓ∞.

This is a restatement of the Hahn-Banach theorem.

* * ∞ Proof. 푋 is separable, so let {푥푘}푘=1 be a set of linear functionals that is dense in 푆푋* = 휕퐵푋* . Let * ∞ 퐽푋 = (푥푘(푥))푘=1 ∈ ℓ∞.

* * * For all 푥, |푥푘(푥)| ≤ ‖푥푘‖ ≤ ‖푥푘‖‖푥‖ = ‖푥‖. By Hahn Banach, since every 푥 admits a normalizing functional,

* * ‖퐽푥‖ℓ = sup |푥푘(푥)| = sup |푥 (푥)| = ‖푥‖ . ∞ * 푘 푥 ∈푆푋

This may see a triviality, but there are 4 situations where this theorem will make its appearance.

Remark 3.5.6: Every separable metric space (푋, 푑) admits an isometric embedding into ∞ ℓ∞, given as follows. Take {푥푘}푘=1 dense in 푋, and let

∞ 푓(푥) = (푑(푥, 푥푘) − 푑(푥, 푥0))푘=1.

Proof is left as an exercise.

Lemma 3.5.7 (Nonlinear Hahn-Banach Theorem). Let (푋, 푑) be any metric space and 퐴 ⊆ 푋 any nonempty subset. If 푓 : 퐴 → R is 퐿-Lipschitz then there exists 퐹 : 푋 → R that extends 푓 and

‖퐹 ‖Lip = 퐿 = ‖푓‖Lip . The lemma says we can always extend the function and not lose anything in the Lips- chitz constant. Recall the Hahn-Banach Theorem says that we can extend functionals from subspace and preserve the norm. Extension in vector spaces and extension in R are different. This lemma seems general, but it is very useful. We mimic the proof of the Hahn-Banach Theorem.

73 MAT529 Metric embeddings and geometric inequalities

Proof. We will prove that one can extend 푓 to one additional point while preserving the Lipschitz constant. Then use Zorn’s Lemma on the poset of all extensions to supersets ordered by inclusion (with consistency) to finish. Let 퐴 ⊆ 푋, 푥 ∈ 푋∖퐴. We define 푡 = 퐹 (푥) ∈ R. To make 퐹 is Lipschitz, for all 푎 ∈ 퐴, we need |푡 − 푓(푎)| ≤ 푑푋 (푥, 푎).

Then we need ⋂︁ 푡 ∈ [푓(푎) − 푑푋 (푥, 푎), 푓(푎) + 푑푋 (푥, 푎)]. 푎∈퐴 The existence of 푡 is equivalent to ⋂︁ [푓(푎) − 푑푋 (푥, 푎), 푓(푎) + 푑푋 (푥, 푎)]. 푎∈퐴 By compactness it is enough to check this is true for finite intersections. Because we are on the real line (which is a total order), it is in fact enough to check this is true for pairwise intersections.

[푓(푎) − 푑푋 (푥, 푎), 푓(푎) + 푑푋 (푥, 푎)] ∩ [푓(푏) − 푑푋 (푥, 푏), 푓(푏) + 푑푋 (푥, 푏)].

We need to check

푓(푎) + 푑(푥, 푎) ≥ 푓(푏) − 푑(푥, 푏) ⇐⇒ |푓(푏) − 푓(푎)| ≤ 푑(푥, 푎) + 푑(푥, 푏).

This is true by the triangle inequality:

|푓(푏) − 푓(푎)| ≤ 푑(푎, 푏) ≤ 푑(푥, 푎) + 푑(푥, 푏).

This theorem is used all over the place. Most of the time people just give a closed formula: define 퐹 on all the points at once by

퐹 (푥) = inf {푓(푎) + 푑(푥, 푎): 푎 ∈ 퐴} .

Now just check that this satisfies Lipschitz. From our proof you ses exactly why they define 퐹 this way: it comes from taking the inf of the upper intervals (we can also take the sup of the lower intervals). The extension is not unique; there are many formulas for the extension. There is a whole isometric theory about exact extensions: look at all the possible ex- tensions, what is the best one? 퐹 is the pointwise smallest extension; the other one is the pointwise largest. The theory of absolutely minimizing extensions is related to an interesting nonlinear PDE called the infinite Laplacian. This is different from what we’re doing because we lose constants in many places.

74 MAT529 Metric embeddings and geometric inequalities

Corollary 3.5.8. Let (푋, 푑) be any metric space. Let 퐴 ⊆ 푋 be a nonempty subset 푓 : 퐴 →

ℓ∞ Lipschitz. Then there exists 퐹 : 푋 → ℓ∞ that extends 푓 and ‖퐹 ‖Lip = ‖푓‖Lip.

Proof. Note that saying 푓 : 퐴 → ℓ∞, 푓(푎) = (푓1(푎), 푓2(푎),...) has ‖푓‖Lip = 1 means that 푓푖 is 1-Lipschitz for every 푖.

−1 푓 takes 풩훿 ⊆ 퐵푋 to 푓(풩훿) ⊆ 푍. 푓 takes it back to 푋; 퐽 is the embedding to ℓ∞.

퐵 푍 O푋 O 퐺

−1 ? 푓 ? 푓 |푓(풩 ) 풩 / 푓(풩 ) 훿 / 푋 퐽 / ℓ 훿 훿 < ∞

퐽∘푓 −1| 푓(풩훿)

−1 Note 퐽 ∘ 푓 |푓(풩훿) is 퐷-Lipschitz. By nonlinear Hahn-Banach, there exists 퐺 : 푍 → ℓ∞ with ‖퐺‖Lip ≤ 퐷 such that 퐺(푓(푥)) = 퐽(푥)

for all 푥 ∈ 풩훿. We want 퐺 to be differentiable. We do this by convolving with a smooth bump function with small support to get 퐻 : 푍 → ℓ∞ such that

1. 퐻 is smooth

2. ‖퐻‖Lip ≤ 퐷,

3. for all 푥 ∈ 퐹 (퐵푋 ), ‖퐺(푥) − 퐻(푥)‖ ≤ 푛퐷훿.

Define a linear operator

푆 : 퐿1(휇, 푍) → ℓ∞

1 by for all ℎ ∈ 퐿1(휇, 푍), ℎ : 2 퐵푋 → 푍, ∫︁ 푆ℎ = 퐻′(퐹 (푥))(ℎ(푥)) 푑휇(푥). 1 2 퐵푋

(Type checking: F is a point in 푋, we can differentiate at this point and get linear operator 푍 → ℓ∞. ℎ(푥) is a point in 푍, so we can plug into linear operator and get a point in ℓ∞. This is a vector-valued integration; do the integration coordinatewise. So 푆ℎ is in ℓ∞.)

75 MAT529 Metric embeddings and geometric inequalities

Now ∫︁ ′ ‖푆ℎ‖ℓ ≤ ‖퐻 (퐹 (푥))(ℎ(푥))‖ℓ 푑휇(푥) ∞ 1 퐵 ∞ ∫︁ 2 푋 ≤ ‖퐻′(퐹 (푥))‖ ‖ℎ(푥)‖ 푑휇(푥) 1 ⏟ ⏞ 푍→ℓ∞ 2 퐵푋 ∫︁ =:퐷 ≤ 퐷 ‖ℎ(푥)‖ 1 2 퐵푋 ≤ 퐷 ‖ℎ‖ 퐿1(휇,푍) =⇒ ‖푆‖ ≤ 퐷 퐿1(휇,푍)→ℓ∞ ∫︁ 푆 푇 푦 = 퐻′(퐹 (푥))((푇 푦)(푥)) 푑휇(푥) ⏟ ⏞ 1 2 퐵푋 ℎ ∫︁ = 퐻′(퐹 (푥))(퐹 ′(푥)(푦)) 푑휇(푥) 1 퐵 ∫︁ 2 푋 = (퐻 ∘ 퐹 )′(푥)(푦) 푑휇(푥) chain rule 1 2 퐵푋 Now we show 퐻 ∘ 퐹 is very close to 퐽; it’s close to being invertible. (Recall 퐺(푓(푥)) = 퐽푥.) This does not a priori mean the derivative is close. We use a geometric argument to say that if a function is sufficiently close to being an isometry, then the integral of its derivative on a finite dimensional space is close to being invertible. 퐻 ∘ 퐹 was a proxy to 퐽. Check that 퐻 ∘ 퐹 is close to 퐽. For 푦 ∈ 풩훿, by choice of 퐻,

‖퐻(퐹 (푦)) − 퐽푦‖ ≤ ‖퐻(퐹 (푦)) − 퐺(퐹 (푦))‖ + ‖퐺(퐹 (푦)) − 퐺(푓(푦))‖ ℓ∞ ℓ∞ ℓ∞

≤ 푛퐷훿 + 퐷 ‖퐹 (푦) − 푓(푦)‖푍 . 푛퐷훿.

1 For general 푥 ∈ 2 퐵푋 , there exists 푦 ∈ 풩훿 such that ‖푥 − 푦‖푋 ≤ 2훿. Then ‖퐻(퐹 (푥)) − 퐽푥‖ ≤ 푛퐷훿 + 퐷훿 + 훿 ℓ∞ . 푛퐷훿.

1 For all 푥 ∈ 2 퐵푋 , ‖퐻 ∘ 퐹 (푥) − 퐽푥‖ ≤ 퐶푛퐷훿. Define 푔(푥) = 퐻 ∘ 퐹 (푥) − 퐽푥. Then ‖푔‖ ∞ 1 ≤ 퐶푛퐷훿. 퐿 ( 2 퐵푋 ) We need a geometric lemma. 푛 Lemma 3.5.9. Suppose (푉, ‖·‖푉 ) is any Banach space, 푈 = (R , ‖·‖푈 ) is a 푛-dimensional Banach space. Let 푔 : 퐵푈 → 푉 be continuous on 퐵푈 and differentiable on int(퐵푈 ). Then ⃦ ⃦ ⃦ ∫︁ ⃦ ⃦ 1 ⃦ ⃦ ′ ⃦ 푔 (푢) 푑푢 ≤ 푛 ‖푔‖퐿 (퐵 ) . ⃦ 퐵 ⃦ ∞ 푈 vol(퐵푈 ) 푈 푈→푉

76 MAT529 Metric embeddings and geometric inequalities

3/28: Finishing up Bourgain’s Theorem Previously we had 푋, 푌 Banach spaces with dim(푋) = 푛. Let 푁훿 ⊆ 퐵푋 be a 훿-net, 2 훿 ≤ 푐/푛 퐷, 푓 : 푁훿 → 푌 . We have 1 ‖푥 − 푦‖ ≤ ‖푓(푥) − 푓(푦)‖ ≤ ‖푥 − 푦‖ 퐷

WLOG 푌 = span(푓(푁훿)). The almost extension theorem gave us a smooth map from 퐹 : 푋 → 푌 , ‖퐹 ‖퐿푖푝 . 1 and for all 푥 ∈ 푁훿, ‖퐹 (푥) − 푓(푥)‖ . 푛훿. Letting 휇 be the normalized Lebesgue measure on 1/2퐵푋 , define 푇 : 푋 → 퐿∞(휇, 푌 ) by

(푇 푦)(푥) = 퐹 ′(푥)푦

for all 푦 ∈ 푋. The goal was to prove 1 ‖푇 푦‖ ‖푦‖ 퐿1(휇,푌 ) & 퐷 ∫︀ and the approach was to show for 푦 ∈ 푋 the average 1 ‖퐹 ′(푥)푦‖푑푦 is big. vol(1/2퐵푋 ) 1/2퐵푋 Fix a linear isometric embedding 퐽 : 푋 → 푙∞., we proved that there exists 퐺 : 푌 → 푙∞ such that ∀푥 ∈ 푁훿, 퐺(푓(푥)) = 퐽(푥). We have ‖퐺‖퐿푖푝 ≤ 퐷. Fix 퐻 : 푌 → 푙∞ where 퐻 is smooth, ‖퐻‖퐿푖푝 ≤ 퐷, and ∀푥 ∈ 퐹 (퐵푋 ), ‖퐻(푥) − 퐺(푥)‖ ≤ 푛퐷훿. Define 푆 : 퐿1(휇, 푌 ) → 푙∞ by ∫︁ 푆ℎ = 퐻′(퐹 (푥))(ℎ(푥))푑휇(푥) 1/2퐵푋 ∫︀ for all ℎ ∈ 퐿1(휇, 푌 ). Note that ‖푆‖퐿1(휇,푌 )→푙∞ ≤ 퐷. By the chain rule, 푆푇 푦 = 1/2퐵푋 (퐻 ∘ ′ 퐹 ) (푥)푦푑휇(푥). We checked for every 푥 ∈ 퐵푋 , ‖퐻(퐹 (푥)) − 퐽푥‖푙∞] . 푛퐷훿. Suppose you are on a point on the net. Then you know 퐻 is very close to 퐺. 퐹 was 푛훿 close to 푓, and 퐻 is 퐷-Lipschitz, which is how we get 푛훿퐷 (take a net point, find the closest point on the net, and use the fact that the functions are 퐷-Lipschitz). This is where we stopped. Now how do we finish? We need a small geometric lemma:

Lemma 3.5.10. Geometric Lemma. 푛 푈, 푉 are Banach spaces, dim(푈) = 푛. Let 푈 = (R , ‖·‖푈 ). Suppose that 푔 : 퐵푈 → 푉 is continuous on 퐵푈 and smooth on the interior. Then ⃦ ⃦ ⃦ ∫︁ ⃦ ⃦ 1 ⃦ ⃦ ′ ⃦ 푔 (푢)푑푢 ≤ 푛‖푔‖퐿∞(푆푋 ) ⃦ 퐵 ⃦ vol(퐵푈 ) 푈 푈→푉 where the inside of the LHS norm should be thought of as an operator. Let 푅 be the normalized integral operator. Then ∫︁ 1 푅푥 = 푔′(푢)푥푑푢 vol(퐵푈 ) 퐵푈

77 MAT529 Metric embeddings and geometric inequalities

The way to think of using this is if you have an 퐿∞ bound, you get an 퐿1 bound. Assuming this lemma, let’s apply it to 푔 = 퐻 ∘ 퐹 − 퐽. We get ⃦ ⃦ ⃦∫︁ ⃦ ⃦ ′ ⃦ 2 ⃦ ((퐻 ∘ 퐹 ) − 퐽) 푑휇(푥)⃦ . 푛 퐷훿 1/2퐵 ⃦ 푋 ⃦푋→푙∞ ⃦∫︁ ⃦ ⃦ ′ ⃦ 2 ⃦ (퐻 ∘ 퐹 ) (푥)푑휇(푥) − 퐽⃦ = ‖푆푇 − 퐽‖푋→푙∞ . 푛 퐷훿 1/2퐵 푋 푋→푙∞

Now we want to bound from below ‖푇 푦‖퐿1(휇,푌 ). We have

‖푆푇 푦‖푙∞ ‖푇 푦‖퐿1(휇,푌 ) ≥ ‖푆‖퐿1(휇,푌 )→푙∞ 1 1 ≥ ‖푆푇 푦‖ ≥ (‖퐽푦‖ − ‖푆푇 − 퐽‖ ‖푦‖) 퐷 푙∞ 퐷 푋→푙∞ 1 (︀ = ‖푦‖ − (푛2퐷훿‖푦‖) 퐷 There is a bit of magic in the punchline. We want to bound the operator below, and we understand it as an average of derivatives. We succeeded to show the function itself is small, and there is our geometric lemma which gives a bound on the derivative if you know a bound on the function. Now let’s prove the lemma.

푛 Proof of Lemma 3.5.10. Fix a direction 푦 ∈ R and normalize so that ‖푦‖2 = 1. For every 푢 ∈ Proj푦⊥ (퐵푈 ), let 푎푈 ≤ 푏푈 ∈ R be such that 푢 + R푦 ∩ 퐵푈 = 푢 + [푎푈 , 푏푈 ]푦 (basically, this is the intersection of the projection line with the ball). Using Fubini, ⃦ ⃦ ⃦ ⃦ ⃦ ∫︁ ⃦ ⃦ ∫︁ ∫︁ ⃦ ⃦ 1 ⃦ ⃦ 1 푏푈 푑 ⃦ ⃦ ′ ⃦ ⃦ ⃦ ⃦ 푔 (푢)푑푢⃦ = ⃦ 푔(푢 + 푠푦)푑푠푑푢⃦ vol(퐵 ) 퐵 vol(퐵 ) Proj (퐵 ) 푎 푑푠 푈 푈 푉 ⃦ 푈 푦⊥ 푈 푈 푉 ⃦ ⃦ ∫︁ ⃦ ⃦ 1 ⃦ = ⃦ (푔(푢 + 푏푈 푦) − 푔(푢 + 푎푈 푦)) 푑푢⃦ ⃦ Proj (퐵 ) ⃦ vol(퐵푈 ) 푦⊥ 푈 푉 1 ≤ · 2vol푛−1(Proj푦⊥ (퐵푈 ))‖푔‖퐿∞(푆푋 ,푉 ) vol푛(퐵푈 ) We need to show that (︀ 2vol푛−1 Proj푦⊥ (퐵푈 ) ≤ 푛‖푦‖푈 vol푛(퐵푈 ) The convex hull of 푦 over the projection is the cone over the projection. 푦 1 (︀ vol푛(conv ∪ Proj푦⊥ (퐵푈 ) ) = · vol푛−1 Proj푦⊥ (퐵푈 ) ‖푦‖푈 푛‖푦‖푈 (︁{︁ }︁ )︁ 푦 Letting 퐾 = conv ± ∩ Proj ⊥ (퐵푈 ) , this is the same as saying that vol푛(퐾) ≤ ‖푦‖푈 푦 vol푛(퐵푈 ). Note that 퐾 is the double cone, that is where the factor of 2 got absorbed. This is an application of Fubini’s theorem.

78 MAT529 Metric embeddings and geometric inequalities

Geometrically, the idea is that when we look on the projective lines for whatever part of the cone is not in our ball set, we will be able to fit the outside inside the set. In formulas, 푐푈 is the largest multiple of 푢 which is inside the boundary of the cone. 푐푈 ≥ 1. ⋃︁ 푐 − 1 푐 − 1 퐾 = 푢 + − 푈 , + 푈 푐푈 ‖푦‖푈 푐푈 ‖푦‖푈 푢∈Proj푦⊥ (퐵푈 ) We also have 1 1 푦 (푐푈 푢 + 푎푐푈 푢푦) ± (1 − ) ∈ 퐵푈 푐푈 푐푈 ‖푦‖푈 and we get that 퐾 ⊂ 퐵푈 by Fubini. Thus, we’ve completed the proof of Bourgain’s theorem for the semester. The big re- maining question is can we do something like this in the general case?

6 Kirszbraun’s Extension Theorem

Last time we proved nonlinear Hahn-Banach theorem. I want to prove one more Lipschitz Extension theorem, which I can do in ten minutes which we will need later in the semester.

Theorem 3.6.1 (Kirszbraun’s extension theorem (1934)). Let 퐻1, 퐻2 be Hilbert spaces and 퐴 ⊆ 퐻1. Let 푓 : 퐴 → 퐻2 Lipschitz. Then, there exists a function 퐹 : 퐻1 → 퐻2 that extends

푓 and has the same Lipschitz constant (‖퐹 ‖Lip = ‖푓‖Lip).

We did this for real valued functions and 푙∞ functions. This version is non-trivial, and relates to many open problems.

Proof. There is an equivalent geometric formulation. Let 퐻1, 퐻2 be Hilbert spaces {푥푖}푖∈퐼 ⊆

퐻1 and {푦푖}푖∈퐼 ⊆ 퐻2, {푟푖}푖∈퐼 ⊆ (0, ∞). Suppose that ∀푖, 푗 ∈ 퐼, ‖푦푖 − 푦푗‖퐻2 ≤ ‖푥푖 − 푦푖‖퐻1 . If ⋂︁

퐵퐻1 (푥푖, 푟푖) ̸= ∅ 푖∈퐼

then ⋂︁

퐵퐻1 (푦푖, 푟푖) ̸= ∅ 푖∈퐼 as well. Intuitively, this says the configuration of points in 퐻2 are a squeezed version of the 퐻1 points. Then, we’re just saying something obvious. If there’s some point that intersects all balls in 퐻1, then using the same radii in the squeezed version will also be nonempty. Our geometric formulation will imply extension. We have 푓 : 퐴 → 퐻2, WLOG ‖푓‖퐿푖푝 = 1. For

all 푎, 푏 ∈ 퐴, ‖푓(푎) − 푓(푏)‖퐻2 ≤ ‖푎 −⋂︀푏‖퐻1 . Fix any 푥 ∈ 퐻1 ∖ 퐴. What can we say about the

intersection of the following balls?: 푎∈퐴 퐵퐻1 (푎, ‖푎 − 푥‖퐻1 ). Well by design it is not empty since 푥 is in this set. So the conclusion from the geometric formulation says ⋂︁

퐵퐻2 (푓(푎), ‖푎 − 푥‖퐻1 ) ̸= ∅ 푎∈퐴

79 MAT529 Metric embeddings and geometric inequalities

So take some 푦 in this set. Then ‖푦 − 푓(푎)‖퐻2 ≤ ‖푥 − 푎‖퐻2 ∀푎 ∈ 퐴. Define 퐹 (푥) = 푦. Then we can just do the one-more-point argument with Zorn’s lemma to finish. Let us now prove the geometric formulation. It is enough to prove the geometric formu- lation when |퐼| < ∞ and 퐻1, 퐻2 are finite dimensional. To show all the balls intersect, it is enough to show that finitely many of them intersect (this is the finite intersection prop- erty: balls are weakly compact). Now the minute 퐼 is finite, we can write 퐼 = {1, ··· , 푛}, ′ ′ 퐻1 = span{푥1, ··· , 푥푛}, 퐻2 = span{푦1, ··· , 푦푛}. This reduces everything to finite dimen- sions. We have a nice argument using Hahn-Banach. FINISH THIS

Remark 3.6.2: In other norms, the geometric formulation in the previous proof is just not true. This effectively characterizes Hilbert spaces. Related is the Kneser-Poulsen conjecture, which effectively says the same thing in terms of volumes:

푛 Conjecture 3.6.3 (Kneser-Poulsen). Let 푥1, . . . , 푥푘; 푦1, . . . , 푦푘 ∈ R and ‖푦푖−푦푗‖ ≤ ‖푥푖−푥푗‖ for all 푖, 푗. Then ∀푟푖 ≥ 0 (︃ )︃ (︃ )︃ ⋂︁푘 ⋂︁푘 vol 퐵(푦푖, 푟푖) ≥ vol 퐵(푥푖, 푟푖) 푖=1 푖=1

This is known for 푛 = 2, where volume is area using trigonometry and all kinds of ad-hoc arguments. It’s also known for 푘 = 푛 + 2.

3/30/16 Here is an equivalent formulation of the Kirszbraun extension theorem.

∞ Theorem 3.6.4. Let 퐻1, 퐻2 be Hilbert spaces, {푥푖}푖∈퐼 ⊆ 퐻1, {푦푖}⋂︀푖∈퐼 ⊆ 퐻2, {푟푖}푖=1 ⊆ (0, ∞). Suppose that ‖푦 − 푦 ‖ ≤ ‖푥 − 푥 ‖ for all 푖, 푗 ∈ 퐼. Then 퐵 (푥 , 푟 ) ̸= 휑 implies ⋂︀ 푖 푗 퐻2 푖 푗 퐻2 푖∈퐼 퐻1 푖 푖

푖∈퐼 퐵퐻2 (푦푖, 푟푖) ̸= 휑.

Proof. By weak compactness and orthogonal projection, it is enough to⋂︀ prove this when 푛 퐼 = {1, . . . , 푛} and 퐻1, 퐻2 are both finite dimensional. Fix any 푥 ∈ 푖=1⋂︀퐵퐻1 (푥푖, 푟푖). If 푥 = 푥 for some 푖 ∈ {1, . . . , 푛}, ‖푦 − 푦 ‖ ≤ ‖푥 − 푥 ‖ ≤ 푟 , and 푦 ∈ 푛 퐵 (푦 , 푟 ). 푖0 0 푖0 푖 퐻2 푖0 푖 퐻1 푖 푖0 푖=1 퐻2 푖 푖 Assume 푥 ̸∈ {푥1, . . . , 푥푛}. Define 푓 : 퐻 → R by {︃ }︃ ‖푦 − 푦1‖ ‖푦 − 푦푛‖ 푓(푦) = max 퐻2 ,..., 퐻2 . ‖푥 − 푥 ‖ ‖푥 − 푥 ‖ 1 퐻1 푛 퐻2

Note that 푓 is continuous and lim‖푦‖ →∞ 푓(푦) = ∞. 퐻2 So, the minimum of 푓(푦) over R푛 is attained. Let

푚 = min 푓(푦). 푦∈R푛

Fix 푦 ∈ R푛, 푓(푦) = 푚.

80 MAT529 Metric embeddings and geometric inequalities

Observe that we are done if 푚 ≤ 1. Suppose by way of contradiction 푚 > 1. Let 퐽 be the indices where the ratio is the minimum. {︃ }︃ ‖푦 − 푦푖‖ 퐽 = 푖 ∈ {1, . . . , 푛} : 퐻2 = 푚 . ‖푥 − 푥 ‖ 푖 퐻1 By definition, if 푗 ∈ 퐽, then ‖푦 − 푦 ‖ = 푚 ‖푥 − 푥 ‖ , and for 푗 ̸∈ 퐽, ‖푦 − 푦 ‖ < 푗 퐻2 푗 퐻1 푗 퐻2 푚 ‖푥 − 푥 ‖ . 푗 퐻1

Claim 3.6.5. 푦 ∈ conv({푦푗}푗∈퐽 ). Proof. If not, find a separating hyperplane. If we move 푦 a small enough distance towards ′ conv({푦푗}푗∈퐽 ) perpendicular to the separating hyperplane to 푦 , then for 푗 ∈ 퐽,

‖푦′ − 푦 ‖ < ‖푦 − 푦 ‖ = 푚 ‖푥 − 푥 ‖ . 푗 퐻2 푗 퐻2 푗 퐻1 For 푗 ∈ 퐽, it is still true that ‖푦′ − 푦 ‖ < 푚 ‖푥 − 푥 ‖ . Then 푓(푦′) < 푚, contradicting 푗 퐻2 푗 퐻1 that 푦 is a minimizer. ∑︀ ∑︀ By the claim, there exists {휆푗}푗∈퐽 , 휆푗 ≥ 0, 푗∈퐽 휆푗 = 1, 푦 = 푗∈퐽 휆푗푦푗. We use this the coefficients of this convex combination to define a probability distribution. Define a random vector in 퐻1 by 푋, with

P(푋 = 푥푗) = 휆푗, 푗 ∈ 퐽. ∑︀ ′ For all 푗 ∈ {1, . . . , 푛}, let 푦푗 = ℎ(푥푗). Let E[ℎ(푋)] = 푗∈퐽 휆푗푦푗 = 푦. Let 푋 be an independent copy of 푋. Using ‖ℎ(푋) − 푦‖ = 푚 ‖푋 − 푥‖ , we have 퐻2 퐻1

‖ℎ(푋) − ℎ(푋)‖2 = ‖ℎ(푋) − 푦‖2 (3.25) E E 퐻2 E 퐻2 = 푚2 ‖푋 − 푥‖2 (3.26) E 퐻1 > ‖푋 − 푥‖2 (3.27) E 퐻1 ≥ ‖푋 − 퐸푋‖2 (3.28) E 퐻1 In Hilbert space, it is always true that

2 1 ′ 2 E ‖푋 − 퐸푋‖ = E ‖푋 − 푋 ‖ . 퐻1 2 퐻2 Using this on both sides of the above,

1 ′ 2 1 ′ 2 E ‖ℎ(푋) − ℎ(푋 )‖ > ‖푋 − 푋 ‖ . 2 퐻2 2 퐻1 So far we haven’t used the only assumption on the points. By the assumption ‖푋 − 푋′‖ ≥ 퐻1 ‖ℎ(푋) − ℎ(푋′)‖ . This is a contradiction. 퐻2

81 MAT529 Metric embeddings and geometric inequalities

This is not true in 퐿푝 spaces when 푝 ̸= 2, but there are other theorems one can formulate (ex. with different radii). We have to ask what happens to each of the inequalities inthe proof. As stated the theorem is a very “Hilbertian” phenomenon.

Definition 3.6.6: Let (푋, ‖·‖푋 ) be a Banach space and 푝 ≥ 1. We say that 푋 has Rademacher type 푝 if for every 푛, for every 푥1, . . . , 푥푛 ∈ 푋, (︃ [︃⃦ ⃦ ]︃)︃ (︃ )︃ ⃦ ⃦푝 1 1 ⃦∑︁푛 ⃦ 푝 ∑︁푛 푝 ⃦ ⃦ 푝 E ⃦ 휀푖푋푖⃦ ≤ 푇 ‖푋푖‖푋 . 휀=(휀 ,...,휀 )∈{±1}푛 1 푛 푖=1 푋 푖=1

The smallest 푇 is denoted 푇푝(푋).

Definition (Definition 1.1.13): A metric space (푋, 푑푋 ) is said to have Enflo type 푝 if for all 푓 : {±1}푛 → 푋,

1 1 ∑︁푛 푝 푝 푝 푝 [푑푋 (푓(휀), 푓(−휀)) ] 푋 [푑(푓(휀), 푓(휀1, . . . , 휀푗−1, −휀푗, 휀푗+1, ··· )) ] . E휀 . E휀 푗=1

If 푋 is also a Banach space of Enflo type 푝, then it is also of Rademacher type 푝.

Question 3.6.7: Let 푋 be a Banach space of Rademacher type 푝. Does 푋 also have Enflo type 푝?

We know special Banach spaces for which this is true, like 퐿푝 spaces, but we don’t know the anser in full generality.

Theorem 3.6.8 (Pisier). thm:pisier Let 푋 be a Banach space of Rademacher type 푝. Then 푋 has Enflo type 푞 for every 1 < 푞 < 푝.

We first need the following.

Theorem 3.6.9 (Kahane’s Inequality). For every ∞ > 푝, 푞 ≥ 1, there exists 퐾푝,푞 such that for every Banach space, (푋, ‖·‖푋 ) for every 푥1, . . . , 푥푛 ∈ 푋, (︃ ⃦ ⃦ )︃ (︃ ⃦ ⃦ )︃ ⃦ ⃦푝 1 ⃦ ⃦푞 1 ⃦∑︁푛 ⃦ 푝 ⃦∑︁푛 ⃦ 푞 ⃦ ⃦ ⃦ ⃦ E ⃦ 휀푖푥푖⃦ ≤ 퐾푝,푞 E ⃦ 휀푖푥푖⃦ . 푖=1 푋 푖=1 푋 In the real case this is Khintchine’s inequality. By Kahane’s inequality, 푋 has type 푝 iff (︃ ⃦ ⃦ )︃ (︃ )︃ ⃦ ⃦푞 1 1 ⃦∑︁푛 ⃦ 푞 ∑︁푛 푝 ⃦ ⃦ 푝 E ⃦ 휀푖푥푖⃦ .푋,푝,푞 ‖푥푖‖푋 . 푖=1 푋 푖=1 ∑︀ 푛 푛 푛 Define 푇 : ℓ푝 (푋) → 퐿푝({±1} , 푋) by 푇 (푥1, . . . , 푥푛)(휀) = 푖=1 휀푖푋푖.

82 MAT529 Metric embeddings and geometric inequalities

(︀ ∑︀ 1 1 푝 푝 (Here ‖푓‖ 푛 = 푛 ‖푓(휀)‖ .) 퐿푝({±1} ,푋) 2푛 휀∈{±1} 푋 Rademacher type 푝 mens

‖푇 ‖ 푛 푛 푋,푝,푞 1. ℓ푝 (푋)→퐿푞({±1} ,푋) . For an operator 푇 : 푌 → 푍, the adjoint 푇 * : 푍* → 푌 * satisfies

* ‖푇 ‖푌 →푍 = ‖푇 ‖푍*→푌 * . * 푝 1 1 Let 푝 = 푝−1 be the dual of 푝 ( 푝 + 푝* = 1). Now 푛 * 푛 * 퐿푞({±1} , 푋) = 퐿푞* ({±1} , 푋 ).

* 푛 * 푛 * * 푛 푛 * Here, 푔 : {±1} → 푋 , 푓 : {±1} → 푋, 푔 (푓) = E푔 (휀)(푓(휀)), ℓ푝 (푋) = ℓ푝* (푋 ). Note * ‖푇 ‖ * 푛 * 푛 * 1, 퐿푞 ({±1} ,푋 )→ℓ푝* (푋 ) . * * * * * * For 푇 : 푌 → 푍, 푇 : 푍 → 푌∑︀is defined by 푇 (푧 )(푦) = 푧 (푇 푦). * 푛 * * ̂︀* For 푔 : {±1} → 푋 , 푔 퐴⊆{1,...,푛} 푔 (퐴)푊퐴. We claim 푇 *푔* = (푔̂︀*({1}),..., 푔̂︀({푛})). We check 푇 *푔*(푥 , . . . , 푥 ) = 푔*(푇 (푥 , . . . , 푥 )) 1 푛 1(︃ 푛 )︃ ∑︁푛 * ≤ E푔 (휀) 휀푖푋푖 푖=1 ∑︁푛 * = (E푔 (휀)휀푖)(푋푖) 푖=1 ∑︁푛 ∑︁ ̂︀* = 푔 (퐴)(E푊퐴(휀)휀푖) (푥푖) 푖=1 퐴⊆{1,...,푛} ∑︁푛 = 푔̂︀*({푖})(푥) 푖=1 ̂︀* ̂︀* = (푔 ({1}),..., 푔 ({푛}))(푥1, . . . , 푥푛).

* 푛 * Rademacher type 푝 means for all 푔푖 : {±1} → 푋 , for all 푞 ∈ [1, ∞), (︃ )︃ ∑︁푛 ̂︀* 푝* * ‖푔 ({푖})‖ ‖푔 ‖ 푛 * 푋 . 퐿푞({±1} ,푋 ) 푖=1 Theorem 3.6.10 (Pisier). If 푋 has Rademacher type 푝 and 1 ≤ 푎 < 푝, then for every 푏 > 1, for all 푓 : {±1}푛 → 푋 with E푓 = 0, ⃦ ⃦ ⃦ 1 ⃦ ⃦ 푎 ⃦ ⃦ ∑︁푛 ⃦ ⃦ 푎 ⃦ ‖푓‖푏 - ⃦ ‖휕푗푓‖푋 ⃦ . ⃦ 푗=1 ⃦ 푏

83 MAT529 Metric embeddings and geometric inequalities

푛 푛 Here, for 푓 : {±1} → 푋, 휕푗푓 : {±1} → 푋 is defined by

푓(휀) − 푓(휀 , . . . , 휀 , −휀 , 휀 , . . . 휀 ) ∑︁ 휕 푓(휀) = 1 푗−1 푗 푗+1 푛 = 푓̂︀(퐴)푊 . 푗 2 퐴 퐴 ⊆ {1, . . . , 푛} 푗 ∈ 퐴

2 Note 휕푗 = 휕푗. ∑︀ ∑︀ 푛 푛 2 The Laplacian is Δ = 푗=1 휕푗 = 푗=1 휕푗 . We’ve proved that ∑︁ ̂︀ Δ푓 = |퐴|푓(퐴)푊퐴. 퐴⊆{1,...,푛}

The heat semigroup on the cube5 is for 푡 > 0, ∑︁ −푡Δ −푡|퐴| ̂︀ 푒 푓 = 푒 푓(퐴)푊퐴. 퐴⊆{1,...,푛}

This is a function on the cube

1 ∑︁푛 푎 푎 휀 ↦→ ‖휕푗푓(휀)‖푋 ∈ [0, ∞). 푗=1

For 푏 = 푎, 1 ∑︁푛 푎 푞 ‖푓‖ ‖휕 푓‖ 푛 . 푎 . 푗 퐿푎({±1} ,푋) 푗=1 Let 푓 : {±1}푛 → 푋 be

‖푓(휀) − 푓(−휀)‖푎 ≤ ‖푓(휀) − E푓‖푎 + E ‖E푓 − E푓(−휀)‖푎 1 ∑︁푛 푎 푎 . ‖푓(휀) − 푓(휀1,..., −휀푗, . . . , 휀푛)‖푋 . 푗=1

We need some facts about the heat semigroup.

Fact 3.6.11 (Fact 1): Heat flow is a contraction: ⃦ ⃦ ⃦ −푡Δ ⃦ ⃦푒 푓⃦ ≤ ‖푓‖ 푛 . 푛 퐿푞({±1} ,푋) 퐿푞({±1} ,푋)

Proof. We have ∑︁ ∑︁ −푡Δ −푡|퐴| ̂︀ 푒 푓 = 푒 푓(퐴)푊퐴 = 푅푡 * 푓 퐴⊆{1,...,푛}

5also known as the noise operator

84 MAT529 Metric embeddings and geometric inequalities

∑︀ 1 Here the convolution is defined as 2푛 훿∈{±1}푛 푅푡(휀훿)푓(훿). This is a vector valued function. In the real case it’s Parseval, multiplying the Fourier coefficients. It’s enough to prove for the real line. Identities on the level of vectors. Here

∑︁ −푡|퐴| 푅푡(휀) = 푒 푊퐴(휀); 퐴⊆{1,...,푛}

this is the Riesz product. The function is ∏︁푛 −푡 = (1 + 푒 휀푖) = 0, 푡 > 0. 푖=1

E푅푡 = 1 so 푅푡 is a probability measure and heat flow is an averaging operator.

4/11: Continue Theorem 3.6.8. The dual to having Rademacher type 푝: for all 푔* : (︁∑︀ )︁ 1 푛 * 푛 ̂︀ 푝* 푝* * {±1} → 푋 , 푖=1 ‖푔({푖})‖푋* . ‖푔 ‖퐿푟* ({±1}푛,푋*) for every 1 < 푟 < ∞. Note that for the purpose of proving Theorem 3.6.8 we only need 푟 = 푝 in Kahane’s inequality. However, I recommend that you read up on the full inequality. 푛 푛 푓(휀)−푓(휀1,...,−휀푗 ,...,휀푛) For 푓 : {±1} → 푋,⎧ define 휕푗푓 : {±1} → 푋 by 휕푗푓 = 2 . Then ⎨ 2 * 푊퐴, 푗 ∈ 퐴 휕 = 휕푗 = 휕 and 휕푗푊퐴 = 푗 푗 ⎩0, 푗 ̸∈ 퐴. ∑︀ ∑︀ 푛 푛 2 The Laplacian is Δ = 푗=1 휕푗 = 푗=1 휕푗 ,

∑︁ ̂︀ Δ푓 = |퐴|푓(퐴)푊퐴. 퐴⊆[푛]

If 푡 > 0 then ∑︁ −푡Δ −푡|퐴| 푒 = 푒 푊퐴 퐴⊆[푛]

∏︀ −푡Δ 푛 −푡 is the heat semigroup. It is a contraction. 푒 = ( 푖=1(1 + 푒 휀푖)) * 푓. We have

⃦ ⃦ ⃦ −푡Δ ⃦ −푛푡 ⃦푒 푓⃦ ≥ 푒 ‖푓‖ 푝 푛 (3.29) 푛 퐿 ({±1} ,푋) 퐿푝({±1} ,푋) −푡Δ −푡Δ −푡푛 푒 (푉[푛]푒 푓) = 푒 푊[푛]푓 (3.30)

85 MAT529 Metric embeddings and geometric inequalities

∏︀ ∑︀ 푛 ̂︀ where 푊[푛](휀) = 푖=1 휀푖. For 푓 = 퐴⊆[푛] 푓(퐴)푊퐴, we have ∑︁ −푡Δ −푡|퐴| ̂︀ 푊[푛]푒 푓 = 푒 푓(퐴)푊[푛]∖퐴 (3.31) 퐴⊆[푛] ∑︁ −푡Δ −푡Δ −푡|퐴| ̂︀ −푡(푛−|퐴|) 푒 (푊[푛]푒 푓) = 푒 푓(퐴)푒 (3.32) 퐴⊆[푛] ∑︁ −푡푛 ̂︀ = 푒 푓(퐴)푊[푛]∖퐴 (3.33) 퐴⊆[푛] = 푒−푡푛푊 푓 (3.34) ⃦ ⃦ ⃦ [푛] ⃦ ⃦ ⃦ ⃦ ⃦ 푒−푡푛 ⃦푊 푓⃦ = ⃦푒−푡Δ(푊 (푒−푡Δ푓))⃦ (3.35) [푛] 푛 [푛] 퐿푝({±1} ,푋) ⃦ ⃦ 푝 ⃦ ¨ −푡Δ ⃦ ≤ ⃦푊¨[푛]푒 푓⃦ . (3.36) ¨ 푝 The key claim is the following.

Claim 3.6.12. clm:pis1 Suppose 푋 has Rademacher type 푝. For all 1 < 푟 < ∞, for all 푡 > 0, for all 푔* : {±1}푛 → 푋*, ⃦ ⃦ ⃦ ⃦ ⃦ * 1 ⃦ * 푞 푞* ‖푔 ‖ 푛 * ⃦ ⃦ −푡Δ * ⃦ ⃦ 퐿푟* ({±1} ,푋 ) ⃦ ⃦푒 휕푗푔 (휀)⃦ ⃦ 푋,푟,푝 * . ⃦ 푋* ⃦ . 푝 푛 푡 푞* 퐿푟* ({±1} ) (푒 − 1) Assume the claim. We prove the following.

Claim 3.6.13. clm:pis2 If 푋 has Rademacher type 푝, 1 < 푞 < 푝, and 1 < 푟 < ∞, then for ̂︀ every function 푓 : {±1}푛 → 푋 with E푓 = 0 = 푓(휑) we have ⃦ ⃦ ⃦ 1 ⃦ ⃦ ∑︁푛 푞 ⃦ 1 ⃦ 푞 ⃦ ‖푓‖ 푛 ⃦ ‖휕 푓(휀)‖ ⃦ . 퐿푟({±1} ,푋) . ⃦ 푗 푋 ⃦ 푝 − 푞 ⃦ 푗=1 ⃦ 푛 퐿푟({±1} ,푋) When 푟 = 푞, 푞 ∑︁푛 1 푞 ‖푓 − 푓‖ ‖휕푗푓(휀)‖ (3.37) E 푞 . E휀 푋 푝 − 푞 푗=1 푞 ∑︁푛 1 1 푞 = 푞 ‖푓(휀) − 푓(휀1,..., −휀푗, 휀푛)‖푋 (3.38) 푝 − 푞 2 푗=1

‖푓(휀) − 푓(−휀)‖푞 ≤ (3.39) = 2 ‖푓(휀) − 푓‖ (3.40) E 푞 ∑︁푛 1 2 . E ‖푓(휀) − 푓(휀1,..., −휀푗, 휀푛)‖ (3.41) 푝 − 푞 푗=1 by definition of Enflo type. We show that the Key Claim 3.6.12 implies Claim 3.6.13.

86 MAT529 Metric embeddings and geometric inequalities

Proof. Normalize so that ⃦ ⃦ ⃦ 1 ⃦ ⃦ 푞 ⃦ ⃦ ∑︁푛 ⃦ ⃦ 푞 ⃦ ⃦ ‖휕푗푓(휀)‖푋 ⃦ = 1. ⃦ 푗=1 ⃦ 푛 퐿푟({±1} ,푋) * 푛 * * For every 푠 > 0, by Hahn-Banach, take 푔 : {±1} → 푋 such that ‖푔 ‖ 푛 * = 1 푠 푠 퐿푟* ({±1} ,푋 ) and ⃦ ⃦ ⃦ ⃦ 푔*(휀)(Δ푒−푠Δ푓(휀)) = ⃦Δ푒−푠Δ푓⃦ . E 푠 퐿 ({±1}푛,푋) )︂ 푟 * −푠Δ Write this as 푔푠 , Δ푒 푓 . 푛 * 푛 * 푛 * * Recall that 퐿푟({±1} , 푋) = 퐿푟* ({±1} , 푋 ). Taking ℎ ∈ 퐿푟({±1} , 푋) and 푔 ∈ 푛 * 퐿푟* ({±1} , 푋 ), we have * * * * 푔 (ℎ) = ⟨푔 , ℎ⟩ = E[푔 (휀)ℎ(휀)] = E[⟨푔 , ℎ⟩ (휀)]. We have ∑︁ −푠Δ −푠|퐴| ̂︀ Δ푒 푓 = |퐴|푒 푓(퐴)푊퐴 (3.42) ⃦ ⃦ 퐴⊆[푛] ⃦ ⃦ ∑︁ )︂ ⃦Δ푒−푠Δ푓⃦ = 푔*, 휕2푒−푠Δ푓 (3.43) 퐿푟(푋) 푠 푗 ∑︁푛 )︂ * −푠Δ = 푔푠 , 휕푗푒 휕푗푓 (3.44) 푗=1 ∑︁푛 )︂ −푠Δ * = 푒 휕푗푔푠 , 휕푗푓 (3.45) (︀푗=1 −푠Δ * 푛 푛 = 푒 휕푗푔푠 (휕푗푓)푗=1 . (3.46) ⏟ ⏞ 푗=1 ⏟ ⏞ 푛 * 푛 * ∈퐿푟(ℓ (푋)) ∈퐿푟 (ℓ푞* (푋 )) 푞 푛 * 푛 * Note that (퐿푟(ℓ푞 (푋))) = 퐿푟* (ℓ푞* (푋 )), so we have a pairing here. ⃦ ⃦ ⃦ ⃦ ⃦ 1 ⃦ ⃦ 1 ⃦ 푞 푞 ⃦ ∑︁푛 ⃦ ⃦ * ⃦ ⃦ ∑︁푛 ⃦ ⃦ ⃦ −푠Δ * ⃦푞 ⃦ ⃦ 푞 ⃦ ≤ ⃦ ⃦푒 휕푗푔 (휀)⃦ ⃦ ⃦ ‖휕푗푓(휀)‖ ⃦ (3.47) ⃦ 푠 푋* ⃦ ⃦ 푋 ⃦ ⃦ 푗=1 ⃦ ⃦ 푗=1 ⃦ 퐿푟* (푋푛) 퐿푟(푋) * ‖푔 ‖ * 푠 퐿푟* (푋 ) ≤ 푝* by Key Claim 3.6.12 (푒푡 − 1) 푞* (3.48) ⃦ ⃦ 1 ⃦ −푠Δ ⃦ 1 * . ⃦Δ푒 푓⃦ 푋,푝,푟 * . (3.49) . 푝 퐿 (푋) . 푝 (푒푡 − 1) 푞* 푟 (푒푡 − 1) 푞* Integrating, ∫︁ ∫︁ ∞ ∑︁ ∞ −푠Δ −푠|퐴| ̂︀ Δ푒 푓 = |퐴|푒 푓(퐴)푊퐴. (3.50) 0 0 퐴⊆[푛]

87 MAT529 Metric embeddings and geometric inequalities

Then ⃦∫︁ ⃦ ⃦ ∞ ⃦ ‖푓‖ = ⃦ Δ푒−푠Δ푓 푑푠⃦ (3.51) 퐿푟(푋) ⃦ ⃦ 0 퐿 (푋) ∫︁ ⃦ ⃦ 푟 ∞ ⃦ ⃦ ≤ ⃦Δ푒−푠Δ푓⃦ 푑푠 (3.52) 퐿푟(푋) ∫︁0 ∞ 푑푠 ≤ 푝* < ∞. (3.53) 0 (푒푡 − 1) 푞*

We now prove the key claim 3.6.12. Proof of Key Claim 3.6.12. This is an interpolation argument. We first introduce a natural operation. * 푛 * * 푛 푛 * Given 푔 : {±1} → 푋 , for every 푡 > 0 define a mapping 푔푡 : {±1} × {±1} → 푋 as follows. For 훿 ∈ {±1}푛, ∑︁ ∏︁ * ̂︀* −푡 −푡 푔 (휀, 훿) = 푔 (퐴) (푒 휀푖 + (1 − 푒 )훿푖) (3.54) 퐴⊆[푛] 푖∈퐴 = 푔*(푒−푡휀 + (1 − 푒−푡)훿) (3.55) ∑︀ ∏︀ * ̂︀* 푛 Think of 푔 (휀) = 퐴⊆[푛] 푔 (퐴) 푖∈퐴 휀푖 (extended outside {±1} by interpolation). Observe ∑︁ ∑︁ ∑︁ * −푡|퐵| −푡 푛−|퐵| * 푔 (휀, 훿) = 푒 (1 − 푒 ) 푔 휀푖푒푖 + 훿푖푒푖 (3.56) 푖∈퐵 푖̸∈퐵 퐵⊆[푛] ∑︁ ∑︁ ∑︁ * ̂︁* 푔 휀푖푒푖 + 훿푖푒푖 = 푔 (퐴)푊 (퐴 ∩ 퐵)(휀)푊퐴∖퐵(훿) (3.57) 푖∈퐵 푖̸∈퐵 퐴⊆[푛] ∑︁ ∑︁ ∑︁ −푡|퐵| −푡 푛−|퐵| * 푒 (1 − 푒 ) 푔 ( 휀푖푒푖 + 훿푖푒푖) (3.58) 퐵⊆[푛] 푖∈퐵 푖̸∈퐵 ∑︁ ∑︁ −푡|퐵| −푡 푛−|퐵| ̂︁* = 푒 (1 − 푒 ) 푔 (퐴)푊퐴∩퐵(휀)푊퐴∖퐵(훿) (3.59) 퐵⊆[푛] 퐴⊆[푛] ∑︁ ∑︁ ̂︁* −푡|퐵| −푡 푛−|퐵| = 푔 (퐴) 푒 (1 − 푒 ) 푊퐴∩퐵(휀)푊퐴∖퐵(훿). (3.60) 퐴⊆[푛] 퐵⊆[푛] ⎧ ⎨휀 , w.p. 푒−푡 Let 훾 = 푖 . Then ⎩ −푡 훿푖, w.p. 1 − 푒 . (︃ )︃ ∏︁ ∏︁ −푡 −푡 푡 훾푖 = (푒 + (1 − 푒 )훿푖 (3.61) 푖∈퐴 푖∈퐴(︃ )︃ ∑︁ ∑︁ ̂︁* ̂︁* 푔 (퐴)E푤퐴(훾) = E 푔 (퐴)푊퐴(훾) (3.62) 퐴 ∑︁ 퐴 = 푒−푡|퐵|(1 − 푒−푡)푛−|퐵|푔*(훾) (3.63) 퐵⊆[푛]

88 MAT529 Metric embeddings and geometric inequalities

Instead of a linear interpolation, we did∑︀ a random interpolation. * −푡|퐵|(1−푒−푡)푛−|퐵| What is ‖푔 ‖ 푛 푛 * ≤ 푒 ? 푡 퐿푟* ({±1} ×{±1} ,푋 ) 퐵⊆[푛] 4/13: Continue Theorem 3.6.8. We’re proving a famous and nontrivial theorem, so there are more computations (this is the only proof that’s known). We previously reduced everything to the following “Key Claim” 3.6.12. * 푛 푛 * Fix 푡 > 0. Define 푔푡 : {±1} × {±1} → 푋 . Then 푔*(휀, 훿) = 푔*(푒−푡휀 + (1 − 푒−푡)훿) 푡 ∑︁ ∏︁ (︀ * −푡 −푡 = 푔^ (퐴) 푒 휀푖 + (1 − 푒 )훿푖 퐴⊆{1,··· ,푛} 푖∈퐴

* Then it’s a fact that ‖푔 ‖ 푛 푛 * * which is true for any Banach 푡 퐿푟* ({±1} ×{±1} ,푋 )≤‖푔 ‖퐿푟({±1}푛,푋*) space from a Bernoulli interpretation. * Let us examine the Fourier expansion of the original definition of 푔 in the variable 훿푖. What is the linear part in 훿? We have ∑︁푛 ∑︁ * −푡(|퐴|−1) 푔푡 (휀, 훿) = 훿푖 (1 − 푒 휀푗) + Φ(휀, 훿) 푖=1 퐴⊆{1,··· ,푛},푖∈퐴 where Φ is the remaining part. The way to write this is to write EΦ(휀, 훿)훿푖 = 0, for all 푖 = 1, ··· , 푛 since we know the extra part is orthogonal to all the linear parts, since the Walsh function determines an orthogonal basis. So ∑︁ ∏︁ −푡 −푡|퐴| 푡 * 푡 −푡Δ * (1 − 푒 )푒 푒 푔^ (퐴) = (푒 − 1)휀푖푒 휕푖푔 퐴⊆{1,··· ,푛},푖∈퐴 푗∈퐴∖{푖} * What does 휕푖 do to 푔 ? It only keeps the 푖s which belong to it, and it keeps it with the same coefficient. Then you hit it with 푒−푡Δ, which corresponds to 푒−푡|퐴|. Then the last part is just the Walsh function. ∑︀ * 푡 푛 −푡Δ * All in all, you get that 푔푡 (휀, 훿) = (푒 − 1) 푖=1 훿푖휀푖푒 휕푖푔 (휀) + Φ(휀, 훿). 푛 Now fix 휀 ∈ {±1} . Recall∑︀ the dual formulation of Rademacher type 푝: For every function * 푛 * 푛 * 푝* 1/푝* * ^ 푛 * 푛 * ℎ : {±1} → 푋 , compute ‖( 푖=1 ‖ℎ (푖)‖푋 ) ‖퐿푟* ({±1} ,푋 ) ≤ ‖ℎ ‖퐿푟* ({±1} ,푋 ). * Then, applying this inequality to our situation with 푔푡 (휀, 훿) where we have fixed 휀, we get ⃦ ⃦ ⃦(︃ )︃ * ⃦ ⃦ ∑︁푛 1/푝 ⃦ (︀ ⃦ −푡Δ * 푝* ⃦ 푡 −1 * 푟* ⃦ ‖푒 휕 푔 (휀)‖ * ⃦ (푒 − 1) ‖푔 (휀, 훿)‖ * ⃦ 푖 푋 ⃦ . E훿 푡 푋 푖=1 푛 * 퐿푟* ({±1} ,푋 ) Then, we get ⃦ ⃦ ⃦(︃ )︃ * ⃦ ⃦ ∑︁푛 1/푝 ⃦ ⃦ −푡Δ * 푝* ⃦ 푡 −1 * ⃦ ‖푒 휕 푔 ‖ * ⃦ (푒 − 1) ‖푔 ‖ 푛 푛 * ⃦ 푖 푋 ⃦ . 푡 퐿푟* ({±1} ×{±1} ,푋 ) 푖=1 푛 * 퐿푟* ({±1} ,푋 ) 푡 −1 푛 * . (푒 − 1) ‖푔‖퐿푟* ({±1} ,푋 )

89 MAT529 Metric embeddings and geometric inequalities

−푡Δ * −푡Δ 푛 * Now we want to take 푡 → ∞. Let us look at ‖푒 휕푖푔 ‖퐿∞({±1} ,푋 ). 푒 is a contraction, and so does 휕푖. We proved the first term is an averaging operator, which does not decrease norms. We also have that 휕푖 is averaging a difference of two values, so it also does not decrease norm. Therefore, we can put a max on it and still get our inequality:

−푡Δ * * max ‖푒 휕푖푔 ‖퐿 ({±1}푛,푋*) ≤ ‖푔 ‖퐿 ({±1}푛,푋*) 1≤푖≤푛 ∞ ∞ Then, ⃦ ⃦ ⃦(︃ )︃ ⃦ ⃦ ∑︁푛 1/푎⃦ ⃦ −푡Δ * 푎 ⃦ −푡Δ * 푛 ⃦ ‖푒 휕 푔 ‖ * ⃦ = ‖(‖푒 휕 푔 ‖ * ) ‖ 푛 * ⃦ 푖 푋 ⃦ 푖 푋 푖=1 퐿푏(푙푎 (푋 )) 푖=1 푛 * 퐿푏({±1} ,푋 )

* 푛 * Define 푆 : 퐿푟* (푋 ) → 퐿푟* (푙푝* (푋 )), and it maps

* −푡Δ * 푛 푆(푔 ) = (푒 휕푖푔 (휀))푖=1 The first inequality is the same thing as saying the operator normof 푆 1 ‖푆‖퐿 * (푋*)→퐿 * (푙푛 (푋*)) 푟 푟 푝* . 푒푡 − 1 We also know that

* 푛 * ‖푆‖퐿∞(푋 )→퐿∞(푙∞(푋 )) ≤ 1 We now want to interpolate these two statements (푟* and ∞) to arrive at 푞*, as in the last remaining portion of the proof which is left. 1 휃 1−휃 휃 푝* Define 휃 ∈ [0, 1] by 푞* = 푝* + ∞ = 푝* , so 휃 = 푞* . By the vector-valued Riesz Interpolation Theorem (proof is identical to real-valued func- tions) (this is in textbooks),

1 ‖푆‖퐿 * (푋*)→퐿 * (푙푛 (푋*)) 푎 푎 푞* . (푒푡 − 1)휃

1 휃 1−휃 * 푟* 푟*푞* 푞* * provided 푎* = 푟* + ∞ , or 푎 = 휃 = 푝* . 푝* > 1, as 푟 ranges from 1 → ∞, so does 푟 . In Pisier’s paper, he says we can get any 푎* that we want. However, this seems to be false. But this does not affect us: If we choose 푟 = 푝, then we get 푎 = 푞, which is all we needed to finish the proof. * 푞* So in 3.6.12, we take 푟 > 푝* , and this completes the claim. Later I will either prove or give a reference for the interpolation portion of the argument. Remember what we proved here and the definition of Enflo type 푝. You assign 2푛 points in Banach space, to every vertex of the hyper cube. We deduce the volume of parallelpiped for any number of points. The open question is do we need to pass to this smaller value, and we needed it to get the integral to converge.

90 Chapter 4

Grothendieck’s Inequality

1 Grothendieck’s Inequality

Next week is student presentations, let’s give some facts about Grothendieck’s inequality. Let’s prove the big Grothendieck theorem (this has books and books of consequences). Ap- plications of Grothendieck’s inequality is a great topic for a separate course. The big Grothendieck Inequality is the following:

Theorem 4.1.1. Big Grothendeick. There exists a universal constant 퐾퐺 (the Grothendieck constant) such that the following holds. Let 퐴 = (푎푖푗) ∈ 푀푚×푛(R), and 푥1, . . . , 푥푚, 푦1, . . . , 푦푛 be unit vectors in a Hilbert space 푛 퐻. Then there exist signs 휀1, . . . , 휀푛, 훿1, . . . , 훿푛 ∈ {±1} such that if you look at the bilinear form ∑︁ ∑︁ 푎푖푗⟨푥푖, 푦푗⟩ ≤ 퐾퐺 푎푖푗휀푖훿푗 푖,푗 푖,푗

So whenever you have a matrix, and two sets of high-dimensional vectors in a Hilbert space, there is at most a universal constant (less than 2 or 3) times the same thing but with signs. Let’s give a geometric interpretation. Consider the following convex bodies in (R푛)2. So 퐴 is the convex hull of all matrices of the form 푐표푛푣({(휀푖훿푗): 휀1, ··· , 휀푚, 훿1, ··· , 훿푛 ∈ {±1}}) ⊆ ∼ 푚푛 푀푚×푛(R) = R . These matrices all have entries ±1. This is a polytope. 퐵 = 푐표푛푣({⟨푥푖, 푦푗⟩ : 푥푖, 푦푗 unit vectors in a Hilbert space }) ⊆ 푀푚×푛(R). It’s obvious that 퐵 contains 퐴 since 퐴 is a restriction to ±1 entries.

Grothendieck says the latter half of the containment 퐴 ⊆ 퐵 ⊆ 퐾퐺퐴. If there is a point outside 퐾퐺퐴, you can find a separating hyperplane, which is a matrix. If 퐵 is not in 퐾퐺퐴, then∑︀ there exists ⟨푥푖, 푦푗∑︀⟩ ̸∈ 퐾퐺퐴 which means by separation there exists a matrix 푎푖푗 such that 푎푖푗⟨푥푖, 푦푗 > 퐾퐺 푎푖푗푐푖푗 for all 푐푖푗 ∈ 퐴, which is a contradiction by Grothendieck’s theorem taking 푐푖푗 = 휀푖훿푗. We previously proved 푙∞ → 푙2, the following is a harder theorem:

91 MAT529 Metric embeddings and geometric inequalities

푛 푛 푛 Lemma 4.1.2. Every linear operator 푇 : 푙∞ → 푙1 satisfies for all 푥1, ··· , 푥푚 ∈ ℓ∞. (︃ )︃ (︃ )︃ ∑︁푛 1/2 ∑︁푛 1/2 2 2 ‖푇 푥푖‖푙푛 ≤ 퐾퐺 ·‖푇 ‖푙푛 →푙푛 max ⟨푥, 푥푖⟩ 1 ∞ 1 ‖푥‖ ≤1,푥 ∈푙푛 푖=1 1 푖 1 푖=1 As an exercise, see how the lemma follows from Grothendieck, and how the little Grothendieck inequality follows from the big one in the case where 푎푖푗 is positive definite (so it’s just a special case; there the best constant is 휋/2).

Note that max‖푥‖1≤1⟨푥, 푥푖⟩ just gives ‖푥푖‖∞. But Grothendieck says for any number of 푥1, ··· 푥푚, we can for free improve our norm bound by a constant universal factor. In the little Grothendieck inequality, what we did was prove the above fact for an operator for 푙1 → 푙2, and we deduced Pietch domination from that conclusion:∫︀ There is a probability 푛 2 2 * 2 * measure on the unit ball of ℓ such that ‖푇 푥‖ ≤ 퐾 ‖푇 ‖ 푛 푛 ⟨푦 , 푥⟩ 푑휇(푦 ). If you 1 1 퐺 ℓ∞→ℓ1 퐵ℓ푛 1 know there exists such a probability measure, you know the above lemma, since you just do this for each 푥푖, and this is at most the maximum so you can just multiply. 푛 푛 This is an amazing fact though: Look at 푇 : ℓ∞ → ℓ1 , and look at 퐿2(휇, ). This is saying that if you think of an element in ℓ∞ as a bounded function on the 퐿2 unit ball of its dual. You can think like this for any Banach space. Then, we have a diagram mapping from 푛 푛 ℓ2 → 퐿2(휇) by identity into a Hilbert space subspace 퐻 of 퐿2(휇), and 푆 : 퐻 → ℓ1 . You got these operators to factor through a Hilbert space 퐻, so we form a commutative diagram and the norm of 푆 is at most ‖푆‖ ≤ 퐾퐺‖푇 ‖. This is how this is used. These theorems give you for free a Hilbert space out of nothing, and we use this a lot. After the presentations, from just this duality consequence, I’ll prove one or two results in harmonic analysis and another result in geometry, and it really looks like magic. You can end up getting things like Parseval for free. We already saw the power of this in Restricted Invertibility. Let’s begin a proof of Grothendieck’s inequality.

Proof. We will give a proof getting 퐾 ≤ 휋 √ = 1.7 ··· (this is not from the original 퐺 2 log(1+ 2) theorem). We don’t actually know what 퐾퐺 is. People don’t really care what 퐾퐺 is beyond the first digit. Grothendieck was interested in the worst configuration of points on thesphere, but it would not tell us what the configuration of points is. We want to know the actual number, or some description of the points. We know this is not the best bound since it doesn’t give the worst configuration of points. You do the following: The input was points 푥1, ··· , 푥푚, 푦1, ··· , 푦푛 ∈ unit sphere in Hilbert ′ ′ ′ space 퐻. We will construct a new Hilbert space 퐻 and new unit vectors 푥푖, 푦푗 such that if ′ 푧 is uniform over the sphere of 퐻 according√ to surface area measure, then the expectation ′ ′ 2 log(1+ 2) of Esign(⟨푧, 푥푖⟩) * sign(⟨푧, 푦푗⟩) = 휋 ⟨푥푖, 푦푗⟩ for all 푖, 푗. So we have a sphere with points 푥푖, 푦푗, and there is a way to nonlinearly transform into a ′ ′ sphere in the same dimension with 푥푖, 푦푗. On the new sphere, if you take a uniformly random direction on the sphere, and look at a hyperplane defined by 푧, then the sign just indicates ′ ′ whether 푥 , 푦 are on the same or opposite sides. Let’s call 휀푖 the first “random sign”, and 푖 푗 ∑︀ √ ∑︀ 2 log(1+ 2) 훿푗 the second “random sign”. Then, E 푎푖푗휀푖훿푗 = 휋 푎푖푗⟨푥푖, 푦푗⟩. So there exists

92 MAT529 Metric embeddings and geometric inequalities

∑︀ ∑︀ an instance (random construction) of 휀 , 훿 such that 푎 ⟨푥 , 푦 ⟩ ≤ 휋 √ 푎 ⟨푥 , 푦 ⟩. 푖 푗 푖푗 푖 푗 2 log(1+ 2) 푖푗 푖 푗 If you succeed in making a bigger constant bound when bounding the expectation of the multiplication of the signs, you get a better Grothendieck constant. For the argument we will give, we will see that the exact number we get will be the best constant.

Undergrad Presentations.

2 Grothendieck’s Inequality for Graphs (by Arka and Yuval)

We’re going to talk about upper bounds for the Grothendieck constant for quadratic forms on graphs.

Theorem 4.2.1. Grothendieck’s Inequality.

∑︁푛 ∑︁푚 ∑︁푛 ∑︁푚 푎푖푗⟨푥푖, 푦푖⟩ ≤ 퐾퐺 푎푖푗휀푖훿푗 푖=1 푖=1 푖=1 푖=1 where 푥푖, 푦푗 are on the sphere. This is a bipartite graph where 푥푖s form one side and 푦푗s form the other side. Then you assign arbitrary weights 푎푖푗. So you can consider arbitrary graphs on 푛 matrices.

For every 푥푖, 푦푗, there exist signs 휀푖, 훿푗 such that ∑︁ ∑︁ 푎푖푗⟨푥푖, 푦푗⟩ ≤ 퐶 log 푛 푎푖푗휀푖훿푗 푖,푗∈{1,··· ,푛}

We will prove an exact bound on the complete graph, where the Grothendieck constant is 퐶 log 푛. For a general graph, we will prove that 퐾(퐺) = c푂(log Θ(퐺)), where we will define the Θ function of a graph later.

Definition 4.2.2: 퐾(퐺) is the least constant 퐾 s.t. for all matrices 퐴 : 푉 × 푉 → R, ∑︁ ∑︁ sup푓:푉 →푆푛−1 퐴(푢, 푣)⟨푓(푢), 푓(푣)⟩ ≤ 퐾sup휙:푉 →퐸 퐴(푢, 푣)휙(푢)휙(푣) (푢,푣)∈퐸 (푢,푣)∈퐸

Definition 4.2.3: The Gram representation constant of 퐺. Denote by 푅(퐺) the infimum over constants 푅 s.t. for every 푓 : 푉 → 푆푛−1. There exists [0,1] 퐹 : 푉 → 퐿∞ ∫︀ such that for every 푣 ∈ 푉 we have ‖퐹 (푣)‖∞ ≤ 푅 and ⟨푓(푢), 푓(푣)⟩ = 1 ⟨퐹 (푢), 퐹 (푣)⟩ = 0 퐹 (푢)(푡)퐹 (푣)(푡)푑푡. For (푢, 푣) which is an edge, you find the constant 푅 so that you can embed functions 푓 into 퐹 such that the ℓ∞ norm is less than an explicit constant.

93 MAT529 Metric embeddings and geometric inequalities

Lemma 4.2.4. Let 퐺 be a loopless graph. Then 퐾(퐺) = 푅(퐺)2.

푛−1 Proof. Fix 푅 > 푅(퐺) and 푓 : 푉 → 푆 . Then there exists 퐹 : 푉 → 퐿∞[0, 1] such that for every 푣 ∈ 푉 , we have ‖퐹 (푣)‖∞ ≤ 푅 and for (푢, 푣) ∈ 퐸 ⟨푓(푢), 푓(푣)⟩ ≤ ⟨퐹 (푢), 퐹 (푣)⟩. Then, ∑︁ ∑︁ ∫︁ ∑︁ 퐴(푢, 푣)⟨푓(푢), 푓(푣)⟩ = 퐴(푢, 푣)⟨퐹 (푢), 퐹 (푣)⟩ = 퐴(푢, 푣)퐹 (푢)(푡)퐹 (푣)(푡)푑푡 (푢,푣)∈퐸 (푢,푣)∈퐸 (푢,푣)∈퐸 ∫︁ ∑︁ ≤ sup푔:푉 →[−푅,푅] 퐴(푢, 푣)푔(푢)푔(푣)푑푡

We use definition and linearity to reverse the sum and integral. Now we use the loopless assumption to get to the definition of the right side of the Grothendieck inequality. We just need to fix∑︀ the [−푅, 푅] to [−1, 1]. We then get the last term above is equivalent to 2 푅 sup휙:푉 →[−1,1] (푢,푣)∈퐸 퐴(푢, 푣)휙(푢)휙(푣), which completes the proof. 푛−1 |퐸| Now, for each 푓 : 푉 → 푆 , consider 푀(푓) ⊆ R = (⟨푓(푢), 푓(푣))(푢,푣)∈퐸. For each 휙 휙 : 푉 → {−1, 1}, define 푀(휙) = (휙(푢), 휙(푣))(푢,푣) ∈ R . Now we use the convex geometry interpretation of Grothendieck from the last class. Mimicking it, we write ⌋︀ {︀ conv {푀(휙): 휙 : 푉 → {−1, 1}} ⊆ conv 푀(푓): 푓 : 푉 → 푆푛−1

The implication of Grothendieck’s inequality gives us ⌋︀ {︀ conv 푀(푓): 푓 : 푉 → 푆푛−1 ⊆ 퐾(퐺) · conv {푀(휙): 휙 : 푉 → {−1, 1}} ∑︀ So there exist weights {휆푔 : 푔 : 푉∑︀→ {−1, 1}} which satisfy 푞:푉 →{−1,1} 휆푔 = 1, 휆푔 ≥ 0 and for all (푢, 푣) ∈ 퐸 ⟨푓(푢), 푓(푣)⟩ = 푞:푣→{−1,1} 휆푔푔(푢)푔(푣)퐾(퐺). Now consider 퐹 (푢) = 퐾(푔) · 푔(푢)[휆1 + ··· + 휆푔−1, 휆1 + ··· + 휆푔]

Then, ∑︁ ⟨퐹 (푢), 퐹 (푣)⟩ = 퐾(퐺)푔(푢)푔(푣)휆푔 푔:{푉 →{−1,1}} Then we consider our interval becomes [− 퐾(퐺), 퐾(퐺)], and thus 푅(퐺) ≤ 퐾(퐺) and 푅(퐺)2 ≤ 퐾(퐺). This proof can be modified for graphs with loops, but this is not quite true.

An obvious corollary is that if 퐻 is a subgraph of 퐺, then 푅(퐻) ≤ 푅(퐺), and 퐾(퐻) ≤ 퐾(퐺). This inequality is not obvious from the Grothendieck inequality directly, but is obvious going with our point of view.

∘ ∘ Lemma√ 4.2.5. Let 퐾푛 denote the complete graph on 푛-vertices with loops. Then 푅(퐾푛) = c푂( log 푛).

94 MAT529 Metric embeddings and geometric inequalities

푛−1 Proof.⌈︀ Let 휎 be the normalized√︁ }︀ surface measure on 푆 . By computation, there exists 푐 s.t. 푛−1 log 푛 1 휎 푥 ∈ 푆 : ‖푥‖∞ ≤ 푐 푛 ≥ 1 − 2푛 , which can be calculated through integration. When you get a function 푓 on the sphere to the 푛 − 1 dimensional sphere, you want to find a rotation so that all these vectors have low coordinate vector value. We basically use the union bound. For each of the 푛 vectors, the probability you get a vector with low valued last coordinate is 1 − 1/(2푛), do this for all the vectors and you get probability greater than 1/2. Then you can magnify this to get almost surely. For every 푥 ∈ 푆푛−1, the random variable on the orthogonal group 푂(푛) given by 푈 → 푈푥 푛−1 푛−1 is uniformly distributed on 푆 . Thus√︁ for every 푓 : 푉 → 푆 , there is a rotation 푈 ∈ 푂(푛) log 푛 such that ∀푣 ∈ 푉 ‖푈(푓(푣))‖∞ ≤ 푐 푛 . 푡ℎ We want 퐹 (푣) to be√ equal to the 푗 coordinate of 푈푓(푣) on interval of length 1/푛. I.e., Let 퐹 (푢)(푡) = (푈(푓(푣))) 푛 on 푗−1 ≤ 푡 푗 . √ 푛 푛 Thus 푅 ≤ 푐 log 푛.

Now I will prove the thing mentioned at the beginning:

Theorem 4.2.6. 퐾(퐺) ≤ log 휒(퐺), where 휒(퐺) is the chromatic number.

This theorem generalizes since on bipartite graphs 휒(퐺) is a constant. One thing we want to observe about the Grothendieck inequality is that it only cares about the Hilbert space structure. So we will just prove this in one specific (nice) Hilbert space and then we’ll be done. Fix a probability space (Ω, 푃 ) such that 푔1, ··· , 푔푛 be i.i.d. standard Gaussians on Ω (for instance, Ω is the infinite∑︀ product∑︀ of R). ∞ 2 2 Now define the Gaussian Hilbert space 퐻 = { 푖=1 푎푖푔푖 : 푎푖 < ∞} ⊆ 퐿 (Ω). Every function in here will have mean zero since these are linear combinations of Gaussians. Note that the unit ball 퐵(퐻) consists of all Gaussian distributions with mean zero and variance at most 1 (since the variance is norm squared). ∑︀ Let Γ be the left hand side∑︀ of the Grothendieck inequality: sup푓:푉 →{−1,1} (푢,푣)∈퐸 퐴(푢, 푣)⟨푓(푢), 푓(푣)⟩, and let Δ = sup푓:푉 →{−1,1} (푢,푣)∈퐸 퐴(푢, 푣)휙(푢)휙(푣).

Definition 4.2.7: Truncation. For all 푀 > 0, 휓 ∈ 퐿2(Ω), define the truncation of 휓 at 푀 to be ⎧ ⎪ ⎨⎪휓(푥) |휓(푥)| ≤ 푀 휓푀 (푥) = 푀 휓(푥) ≥ 푀 ⎪ ⎩−푀 휓(푥) ≤ −푀

We’re just cutting things off at the interval [−푀, 푀]. So fix 푓 maximizing Γ.

2 Lemma 4.2.8. There exists Hilbert space 퐻, ℎ : 푉 → 퐻, 푀 > 0 with ‖ℎ(푣)‖퐻 ≤ 1/2 for all∑︀ vertices 푣 ∈ 푉 , 푀 . log 휒∑︀(퐺), and we can now write Γ as a sum over all edges 푀 푀 Γ = (푢,푣) 퐴(푢, 푣)⟨푓(푢) , 푓(푣) ⟩ + (푢,푣)∈퐸 퐴(푢, 푣)⟨ℎ(푢), ℎ(푣)⟩, where the second term is an error term.

95 MAT529 Metric embeddings and geometric inequalities

−1 Proof. Let 푘 = 휒(퐺). A fact for all 푠 : 푉 → 푙2 s.t. ⟨푠(푢), 푠(푣)⟩ = 푘−1 for all (푢, 푣) ∈ 퐸, and ‖푠(푢)‖ = 1. ^ Define another Hilbert space 푈 = 푙2 ⊕ R. Define 푡, 푡, two functions from 푉 → 푈. They are defined as follows: 푘 − 1 1 푡(푢) = ( 푠(푢)) ⊕ (√ 푒1) 푘 푘 푘 − 1 1 푡^(푢) = (− 푠(푢)) ⊕ (√ 푒1) 푘 푘 Now, 푡(푢), 푡^(푢) are unit vectors for all (푢, 푣) ∈ 퐸. Then ⟨푡(푢), 푡(푣)⟩ = ⟨^(푡)(푢), 푡^(푣)⟩ = 0. We ^ 2 also have ⟨푡(푢), 푡(푣)⟩ = 푘 . ′ Our goal is to prove that such a function exists. Set 퐻 = 푈 ⊗ 퐿2(Ω). We now write down a function 1 ℎ(푢) = 푡(푢) ⊗ (푓(푢) + 푓(푢)푀 ) + 푘푡^(푢) ⊗ (푓(푢) − 푓(푢)푀 ) 4 It turns out that this function does everything we want. A couple of key facts: It’s easy to check that the definition of Γ holds, defined in termsof ℎ(푢) (for the error term). Checking 2 2 푀 2 ‖ℎ(푣)‖퐻 ≤ 1/2 holds is also possible. You can bound ‖ℎ(푢)‖ ≤ (1/2 + 푘‖푓(푢) − 푓(푢) ‖) after evaluating inner products. Now we use th eproperty of the Hilbert space. 푓(푢) is in the ball of 퐻, with mean zero and variance at most 1. Then, ‖푓(푣) − 푓(푣)푀 ‖2 is the probability that the Gaussian falls outside the interval [−푀, 푀]. Therefore, from basic probability theory, √ ∞ 2휋 2 2 ‖푓(푣) − 푓(푣)푀 ‖2 ≤ ∫︀ 푥2푒−푥 /2푑푥 ≤ 2푀푒−푀 /2 푀 √ where the last part follows by calculus. This√ is great since if we pick 푀 ∼ log 푘. So this will decay super quickly. Picking 푀 = 8 log 푘 gives that the entire thing will be ≤ 1/2. That proves the lemma, so we’re done.

This lemma is actually all we need. Suppose we have proved∑︀ this. Now we can say the fol- 푀 푀 lowing.∑︀ We can bound the first term by taking expectations: (푢,푣)∈퐸 퐴(푢, 푣)⟨푓(푢) , 푓(푣) ⟩ = 퐴(푢, 푣)푓(푢)푀 푓(푣)푀 ≤ 푀 2Δ by the definition of Δ. E (푢,푣)∈퐸 ∑︀ 2 1 For the second term, we write (푢,푣)∈퐸 퐴(푢, 푣)⟨ℎ(푢), ℎ(푣)⟩ ≤ (max푣∈푉 ‖ℎ(푣)‖ )Γ ≤ 2 Γ. by re-scaling. You need to pull out each√ of these maxima separately, using linearity each time. You√ can also multiply√ each thing by 2 makes it be in 퐵(퐻). You just have to multiply by 2 and divide by 2 and they’re not dependent anymore. By the conditions of the lemma, our ℎ has norm at most 1/2, so the This entire thing is Hilbert space independent, so the Γ 2 1 2 whole thing is < 2 . Thus Γ ≤ 푀 Δ + 2 Γ =⇒ Γ ≤ 2푀 Δ. This implies Γ . log 휒(퐺)Δ, which is exactly what we wanted to say, since the left hand side of the Grothendieck inequality is the first term, and the right hand side of the Grothendieck inequality is the second term with Grothendieck constant log 휒(퐺).

96 MAT529 Metric embeddings and geometric inequalities

3 Noncommutative Version of Grothendieck’s Inequal- ity - Thomas and Fred

First we’ll phrase the classical Grothendieck inequality in terms of an optimization problem. Given 퐴 ∈ 푀푛(R), we can consider ∑︁푛 max 퐴푖푗휀푖훿푗 휀 ,훿 ∈ 푖 푗 푖,푗=1

In general this is hard to solve, so we do a semidefinite relaxation

∑︁푛 sup푑 dimensionssup푥,푦∈(푆푑−1)푛 퐴푖푗⟨푥푖, 푦푗⟩ 푖,푗=1 which implies that it’s polynomially solveable, and Grothendieck inequality ensures you’re only a constant factor off from the best. We can give a generalization to tensors. Given 푀 ∈ 푀푛(푀푛(R)) consider ∑︁푛

sup푢,푣∈푂푛 푀푖푗푘푙푈푖푗푉푘푙 푖,푗,푘,푙=1

Set 푀푖푖푗푗 = 퐴푖푗, then we obtain

∑︁ ∑︁푛 ∑︁ sup 퐴푖푗푈푖푖푉푗푗 = sup 퐴푖푗푥푖푦푗 = max 퐴푖푗휀푖훿푗 푛 휀,훿∈{−1,1}푛 푥,푦∈{−1,1} 푖,푗=1

We relax to SDP over R of 푀: ∑︁푛

sup푑∈N sup 푀푖푗푘푙⟨푋푖푗, 푌푘푙⟩ 푑 푋,푌 ∈푂푛(R ) 푖,푗,푘,푙=1

Recall that 푈 ∈ 푂푛 means that ∑︁푛 ∑︁푛 푈푖푘푈푗푘 = 푈푘푖푈푘푗 = 훿푖푗 푘=1 푘=1

푑 * * If 푋 ∈ 푀푛(R ), let 푋푋 , 푋 푋 ∈ 푀푛(R) defined by ∑︁푛 * (푋푋 )푖푗 = ⟨푋푖푘, 푌푗푘⟩ 푘=1

∑︁푛 * (푋 푋)푖푗 = ⟨푋푘푖, 푌푘푗⟩ 푘=1 ⌋︀ {︀ 푑 푑 * * Then 푂푛(R ) = 푋 : 푀푛(R ): 푋푋 = 푋 푋 = 퐼 .

97 MAT529 Metric embeddings and geometric inequalities

We want to say something about when we relax the search space, we get within a constant factor of the non-relaxed version of the program. We will prove this in the complex case. We will write ∑︁푛 Opt (푀) = sup | 푀 푈 푉 | C 푈,푉 ∈푈푛 푖푗푘푙 푖푗 푘푙 푖,푗,푘,푙=1 The fact that Opt is less than the SDP, Pisier proved thirty years after Grothendieck con- jectured it. What we will actually prove is that the SDP solution is at most twice the optimal:

SDPC(푀) ≤ 2OptC(푀)

Theorem 4.3.1. Fix 푛, 푑 ∈ N and 휀 ∈ (0, 1). Suppose we have 푀 ∈ 푀푛(푀푛(C)) and 푑 푋, 푌 ∈ 푈푛(C ) such that

∑︁푛 | 푀푖푗푘푙⟨푋푖푗, 푌푘푙⟩| ≥ (1 − 휀)SDPC(푀) 푖,푗,푘,푙=1

So what we’re saying is suppose some SDP algorithm gave a solution satisfying this from 푑 input 푋, 푌 ∈ 푈푛(C ). Then we can give a rounding algorithm which will output 퐴, 퐵 ∈ 푈푛 such that ∑︁푛 E| 푀푖푗푘푙퐴푖푗퐵푘푙| ≥ (1/2 − 휀)SDPC(푀) 푖,푗,푘,푙=1 Now what’s the algorithm? It’s a slightly clever version of projection. We have 푋, 푌 ∈ 푑 푈푛(C), and we want to get to actual unitary matrices. First sample 푧 ∈ {1, −1, 푖, −푖} (complex unit cube). Then 푥 = √1 ⟨푋, 푧⟩ (take inner products columnwise), similarly 푧 2 푌 = √1 ⟨푌, 푧⟩. Now we take the Polar Decomposition 퐴 = 푈Σ푉 * where 푈, 푉 are unitary. 푧 2 The polar decomposition is just 퐴 = (푈푉 *)(푉 Σ푉 *), and then the first guy is unitary, and the second guy is PSD (think of it as 푒푖휃 * 푟). 푖푡 −푖푡 Then we have (퐴, 퐵) = (푈푧|푋푧| , 푉푧|푌푧| ) where 푡 is sampled from the hyperbolic secant distribution. It’s very similar to a normal distribution. The PDF is more precisely 1 휋 1 휙(푡) = sech( 푡) = 2 2 푒휋푡/2 + 푒−휋푡/2 When you have a positive definite matrix, you can raise it to imaginary powers, sowe’re rotating only the positive semidefinite part, and keeping the rotation푈푉 side( *). Note that the (퐴, 퐵) are unitary. (You can also think of this as raising diagonals to 푖푡, these necessarily have eigenvalue of magnitude 1). 4-20-16 푑 For 푋, 푌 ∈ 푈푛(C ), do the following. 1. Sample 푧 ∈ {±1, ±푖} uniformly. Sample 푡 from the hyperbolic secant distribution.

2. Let 푋 = √1 ⟨푥, 푧⟩, 푌 = √1 ⟨푌, 푧⟩. 푧 2 푍 2

98 MAT529 Metric embeddings and geometric inequalities

푖푡 −푖푡 3. Let (퐴, 퐵) := (푈푍 |푋푍 | , 푉푍 |푌푍 | ) ∈ 푈푛 × 푈푛 where 푋푍 = 푈푍 |푋푍 |, 푌푍 = 푈푍 |푌푍 | (polar decomposition). ∑︀ 푛 푀(푋, 푌 ) = 푖,푗,푘,푙=1 푀푖푗푘푙 ⟨푋푖푗, 푌푘푙⟩. We want to show (︂ )︂ 1 1 [푀(퐴, 퐵)] ≥ − 휀 푀(푋, 푌 ) ≥ ( − 휀)SDP (푀) E푡 2 2 C We want to show that the rounded solution still has large norm. It’s possible that |푋푍 | has 0 eigenvalues. One solution is to add some Gaussian noise to the original 푋, 푌 because the set of non-invertible matrices is an algebraic set of measure 0. Alternatively, replace the eigenvalues by 휀 → 0. We have ⎡ ⎤ ∑︁푑 ∑︁푛 1 ⎣ ⎦ 1 E푧[푀(푋푧, 푌푧)] = E푧 푧푟푧푠 (푋푖푗)푟(푌푘푙)푠 = 푀(푋, 푌 ). 2 푟,푠=1 푖,푗,푘,푙=1 2 Now we analyze step 3. The characteristic function of the hyperbolic secant distribution is, for 푎 > 0, ∫︁ 2푎 [푎푖푡] = 푎푖푡푒(푡) 푑푡 = E 퐻푎2 by doing a contour integral. Then

[푎푖푡] = 2푎 − [푎2+푖푡] (4.1) E푡 E푡 [퐴푖푡] = 2퐴 − [퐴2+푖푡] (4.2) E푡 E푡 푖푡 푖푡 [퐴 ⊗ 퐵] = [(푈푧|푋푧| ) ⊗ (푉푧|푌푧| )] (4.3) E푡 E푡 푖푡 = (푈푧 ⊗ 푉푧) [(|푋푧| ⊗ |푌푡|) ] (4.4) E푡 2+푖푡 2−푖푡 = 2푋푧 ⊗ 푌푧 − [(푈푧|푋푧| ) ⊗ (푉푧|푌푧| )] (4.5) E푡 We apply 푀. 2+푖푡 2−푖푡 [푀(퐴, 퐵)] = 푀(푋, 푌 ) − [푀(푈푧|푋푧| , 푉푧|푌푧| )] 푧,푡E 푧,푡E

Because 퐴, 퐵 ∈ 푀푛(C), we can write 푀(퐴, 퐵) in terms of the tensor product, ∑︁푛 푀(퐴, 퐵) = 퐴푖푗퐵푘푙 (4.6) 푖,푗,푘,푙=1 ∑︁푛 = 푀푖푗푘푙(퐴 ⊗ 퐵)(푖푗),(푘푙) (4.7) 푖,푗,푘,푙=1

Claim 4.3.2 (Key claim). For all 푡 ∈ R, ⃒ ⃒ ⃒ ⃒ ⃒ 2+푖푡 2−푖푡 ⃒ 1 ⃒ 푀(푈푧|푋푧| , 푉푧|푌푧| ) ⃒ ≤ SDP (푀). E푧 2 C

99 MAT529 Metric embeddings and geometric inequalities

{±1,±푖}푑 Proof. We have 퐹 (푡), 퐺(푡) ∈ 푀푛(C ). where 1 (퐹 (푡) ) = (푈 |푋 |2+푖푡) (4.8) 푗푘 푧 2푑 푧 푧 푗푘 1 1 ∑︁ (퐺(푡) ) = (푉 |푌 |2−푖푡) .푀(퐹 (푡), 퐺(푡)) = 푀(푈 |푋 |2+푖푡, 푉 |푌 |2−푖푡) (4.9) 푗푘 푧 2푑 푧 푧 푗푘 4푑 푧 푧 푧 푧 푧∈{±1,±푖}푑 2+푖푡 2−푖푡 = [푀(푈푧|푋푧| , 푉푧|푌푧| )]. (4.10) E푧

Lemma 4.3.3. 퐹 (푡), 퐺(푡) satisfy

max {‖퐹 (푡)퐹 (푡)*‖ , ‖퐹 (푡)*퐹 (푡)‖ , ‖퐺(푡)퐺(푡)*‖ , ‖퐺(푡)*퐺(푡)‖}

푑 * * * * Lemma 4.3.4. Suppose 푋, 푌 ∈ ℳ푛(C ) and max{‖푋푋 ‖ , ‖푋 푋‖ , ‖푌 푌 ‖ , ‖푌 푌 ‖}. Then 푑+2푛2 there exist 푅, 푆 ∈ 푈푛(C ) such that 푀(푅, 푆) = 푀(푋, 푌 ) ≤ 1 for every 푀 ∈ 푀푛(푀푛(C)). 푑+2푛2 √From the√ two lemmas, there exist 푅(푡), 푆(푡) ∈ 푈푛(C ) such that 푀(푅(푡), 푆(푡)) = 푀( 2퐹 (푡), 2퐺(푡)). Then

1 1 |푀(퐹 (푡), 퐺(푡))| = |푀(푅(푡), 푆(푡))| ≤ 2SDP (푀) (4.11) 2 2 C 1 ∑︁ 퐹 (푡)퐹 (푡)* = 푈 |푋 |*푈 * (4.12) 4푑 푧 푧 푧 푧∈{±1,±푖}푑 * * * = [푈푧|푋푧| 푈2 ] = [(푋푧푋푧 )]. (4.13) E푧 E푧

푑 Claim 4.3.5. For 푊 ∈ 푀푛(C ), define for each 푣 ∈ [푑], (푊푟)푖푗 = (푊푖푗)푟, 푊푧 = ⟨푊, 푧⟩. Then ∑︁푑 * 2 * 2 * * * [(푊푧푊푧 ) ] = (푊 푊 ) + 푊푟(푊 푊 − 푊푟 푊푟)푊푟 . E푧 푟=1 ∑︀ 푑 Note 푊푧 = 푟=1 Σ푟푊푟 and

∑︁푑 ∑︁푑 * * * * 푊 푊 = 푊푟푊푟 푊 푊 = 푊푟 푊푟. (4.14) 푟=1 푟=1 We compute ⎡ ⎤ 푝,푞,푟,푠∑︁ * 2 ⎣ * *⎦ [(푊푧푊푧 ) ] = 푑푧푝푧푞푧푟푧푠푊푝푊푞 푊푟푊푠 (4.15) E푧 E푧 =1 ∑︁푑 ∑︁푛 * * * * * * = 푊푝푊푝 푊푝푊푝 + (푊푝푊푞 푊푞푊푞 + 푊푝푊푞 푊푞푊푝 ) (4.16) 푝=1 푝, 푞 = 1 푝 ̸= 푞

100 MAT529 Metric embeddings and geometric inequalities

∑︁푑 ∑︁푑 ∑︁푑 * * * * * * 푊푝푊푝 푊푞푊푞 + 푊푝푊푞 푊푞푊푝 − 푊푝푊푝 푊푝푊푝 푝,푞=1 푝,푞=1 푝=1 2 ∑︁푑 ∑︁푑 ∑︁푑 ∑︁푑 * * * * * = 푊푝푊푝 + 푊푝 푊푞 푊푞 푊푝 − 푊푝푊푝 푊푝푊푝 . (4.17) 푝=1 푝=1 푞=1 푝=1

Apply the claim with 푊 = √1 푋. Recall that 푋푋* = 푋*푋 = 퐼. Then 2

∑︁푑 ∑︁푑 * 1 * * 1 * 1 * * 퐹 (푡)퐹 (푡) + 푋푟푋푟 푋푟푋푟 = 퐼 = 퐹 (푡)퐹 (푡) + 푋푟 푋푟푋푟 푋푟 4 푟=1 2 4 푟=1 and similarly for 퐺.

Proof of Lemma 2. Let 퐴 = 퐼 − 푋푋*, 퐵 = 퐼 − 푋*푋. Note 퐴, 퐵 ⪰ 0, Tr(퐴) = Tr(퐵). We have

∑︁푛 * 퐴 = 휆푖(푣푖푣푖 ) (4.18) 푖=1 ∑︁푛 * 퐵 = 휇푗(푣푗푣푗 ) (4.19) 푗=1 ∑︁푛 ∑︁푛 휎 = 휆푗 = 휇푗 (4.20) 푖=1 푗=1 푛؁ 휆푖휇푗 * 푑 푛2 푛2 푅 = 푋 ⊕ (푢푖푣 ) ⊕ 푂 푛2 ∈ 푀푛( ⊕ ⊕ ) (4.21) 푗 푀푛(C ) C C C 푖,푗=1 휎 푛؁ 휆푖휇푗 * 푆 = 푌 ⊕ 푂 푛2 ⊕ (푢푖푣 ) (4.22) 푀푛(C ) 푗 푖,푗=1 휎

푑+2푛2 Check 푅 ∈ 푈푛(C ),

푅푅* = 푋푋* + 퐴 (4.23) 푅*푅 = 푋*푋 + 퐵 (4.24) 푀(푅, 푆) = 푀(푋, 푌 ). (4.25)

∑︀ The 휎 disappears because 휇푗 = 휎.

The factor 2 in the noncommutative inequality is sharp. That the answer is 2 (rather than some strange constant) means there is something going on algebraically. The hyperbolic secant is the unique distribution that makes this work. 4-27

101 MAT529 Metric embeddings and geometric inequalities

4 Improving the Grothendieck constant

Theorem 4.4.1 (Krivine (1977)). 휋 퐾퐺 ≤ √ ≤ 1.7... 2 ln(1 + 2)

The strategy is preprocessing.

′ ′ ′ ′ 2푛−1 2푛−1 There exist new vectors 푥1, . . . , 푥푛, 푦1, . . . , 푦푛 ∈ S such that if 푧 ∈ S is chosen uniformly at random, then for all 푖, 푗, √ )︂ 2 ln(1 + 2) (sign ⟨푧, 푥′ ⟩) (sign 푧, 푦′ ) = ⟨푥 , 푦 ⟩ . E ⏟ ⏞ 푖 푗 푖 푗 푧 ⏟ ⏞ ⏟ 휋⏞ 휀 푖 훿푗 푐

Then [︃ ]︃ ∑︁ 휋 ∑︁ 푎푖푗 ⟨푥푖, 푦푗⟩ = E √ 푎푖푗휀푖훿푗 . 2 ln(1 + 2) We will take vectors, transform them in this way, take a random point on the sphere, take a hyperplane orthogonal, and then see which side the points fall on.

푘 푘 Theorem 4.4.2 (Grothendieck’s identity). For 푥, 푦 ∈ S and 푧 uniformly random on S , 2 [sign(⟨푥, 푧⟩) sign(⟨푦, 푧⟩)] = sin−1(⟨푥, 푦⟩). E푧 휋 Proof. This is 2-D plane geometry. The expression is 1 if both 푥, 푦 are on the same side of the line, and −1 if the line cuts between the angle.

Once we have the idea to pre-process, the rest of the proof is natural.

102 MAT529 Metric embeddings and geometric inequalities

Proof. )︂ )︂ ′ ′ 2 −1 ′ ′ sign(⟨푧, 푥푖⟩) sign( 푧, 푦푗 ) = sin ( 푥푖, 푦푗 ) = 푐 ⟨푥푖, 푦푗⟩ (4.26) E푧 휋 )︂ ′ ′ 휋푐 푥푖, 푦푗 = sin ⟨푥푖, 푦푗⟩ . (4.27) ⏟ 2⏞ 푢 We write the Taylor series expansion for sin. )︂ ′ ′ 푥푖, 푦푗 = sin(푢 ⟨푥푖, 푦푗⟩) (4.28) ∑︁∞ (푢 ⟨푥 , 푦 ⟩)2푘+1 = 푖 푗 (−1)푘 (4.29) 푘=0 (2푘 + 1)! ∑︁∞ 푘 2푘+1 ⟨ ⟩ (−1) 푢 ⊗(2푘+1) ⊗(2푘+1) = 푥푖 , 푦푗 . (4.30) 푘=0 (2푘 + 1)! ∑︀ ⊗2 ⊗2 ⊗2 ⊗2 2 (For 푎, 푏 ∈ ℓ2, 푎 ⊗ 푏 = (푎푖푏푗), 푎 = (푎푖푎푗), 푏 = (푏푖푏푗), ⟨푎 , 푏 ⟩ = 푎푖푎푗푏푖푏푗 = ⟨푎, 푏⟩ .) Define an infinite direct sum corresponding to these coordinates. Define ∞ 푘 2푘+1 푢 2 (1−) ؁ 푥′ = 푥⊗2푘+1 (4.31) 푖 (2푘 + 1)! 푖 푘=0 ∞ 2푘+1 ∞ ؁ 2 ؁ ′ 푢 ⊗2푘+1 푚 ⊗2푘+1 푦푗 = 푥푖 ∈ (R ) (4.32) 푘=0 (2푘 + 1)! 푘=1 )︂ ∑︁∞ 2푘+1 )︂ ′ ′ 푘 푢 ⊗2푘+1 ⊗2푘+1 푥푖, 푦푗 = (−1) 푥푖 , 푦푗 . (4.33) 푘=0 (2푘 + 1)! The infinite series expansion of sin generated an infinite space forus. We check

∑︁∞ (2푘+1) 푢 −푢 ′ 2 푢 푒 − 푒 ‖푥푖‖ = = sinh(푢) = = 1. (4.34) 푘=0 (2푘 + 1)! 2

We don’t care about)︂ the construction, we care about the identity. All I want is to find ′ ′ ′ ′ 푥푖, 푦푗 with given 푥푖, 푦푗 ; this can be done by a SDP. We’re using this infinite argument just to show existence. A different way to see this is given by the following picture. Take a uniformly random line and look at the orthogonal projection of points on the line. Is it positive or negative. A priori we have vectors in R푛 that don’t have an order. We want to choose orientations. A natural thing to do is to take a random projection onto R and use the order on R. Krevine conjectured his bound was optimal.

103 MAT529 Metric embeddings and geometric inequalities

What is so special about positive or negative? We can partition it into any 2 measurable sets. Any partition into 2 measurable sets produces a sign. It’s a nontrivial fact that no partition beats this constant. This is a fact about measure theory. The moment you choose a partition, it forced the construction of the 푥′, 푦′. The whole idea is the partition; then the preprocessing is uniquely determined; it’s how we reverse- engineered the theorem. Here’s another thing you can do. What’s so special about a random line? The whole problem is about finding an orientation. Consider a floor (plane) chosen randomly, look at shadows. If you partition the plane into 2 half-planes, you get back the same constant. We can generalize to arbitrary partitions of the plane. This is equivalent to an isoperimetric problem. In many such problems the best thing is a half-space. We can look at higher-dimensional projections. It seems unnatural to do this—except that you gain! In R2 there is a more clever partition that beats the halfspace! Moreover, as you take the increase the dimension, eventually you will converge to the Grothendieck constant. The partitions look like fractal objects!

5 Application to Fourier series

This is a classical theorem about Fourier series. Helgason generalized it to a general abelian group.

Definition 4.5.1: Let 푆 = R/Z. The space of continuous functions is 퐶(푆′). Given 푚 : Z → R (the multiplier), define

′ Λ푚 : 퐶(S ) → ℓ∞ (4.35) ̂︀ Λ푚(푓) = (푚(푛)푓(푛))푛∈Z. (4.36)

1 For which multipliers 푚 is Λ푚(푓) ∈ ℓ1 for every 푓 ∈ 퐶(S )? A more general question is, what are the∑︀ possible Fourier coefficients of a continuous functions? For example, if log 푛 works, then |푓̂︀(푛)||푚(푛)| < ∞. This is a classical topic. An obvious sufficient condition is that 푚 ∈ ℓ2, by Cauchy-Schwarz and Parseval.

∑︁ (︀∑︁ 1 (︀∑︁ 1 |푚(푛)푓̂︀(푛)| ≤ 푚(푛)2 2 |푓̂︀(푛)|2 2 (4.37) ⏟ ⏞ 푛∈Z ‖푓‖2≤‖푓‖∞ ‖Λ (푓)‖ ≤ ‖푚‖ ‖푓‖ . (4.38) 푚 ℓ1 2 ∞ This theorem says the converse.

1 Theorem 4.5.2 (Orlicz-Paley-Sidon). If Λ푚(푓) ∈ ℓ1 for all 푓 ∈ 퐶(S ), then 푚 ∈ ℓ2. We know the fine line of when Fourier coefficients of continuous functions converge.

104 MAT529 Metric embeddings and geometric inequalities

1 We can make this theorem quantitative. Observe that if you know Λ푚 : 퐶(S ) → ℓ1, then Λ푚 has a closed graph (exercise). The closed graph theorem (a linear operator between Banach spaces with closed graph is bounded) says that ‖Λ푚‖ < ∞. ∑︀ ̂︀ Corollary 4.5.3. |푚(푛)푓(푛)| ≤ 퐾 ‖푓‖∞.

We show ‖Λ푚‖ ≤ ‖푚‖2 ≤ 퐾퐺 ‖Λ푚‖. We will use the following consequence of Grothendieck’s inequality (proof omitted).

푛 푚 Corollary 4.5.4. Let 푇 : ℓ∞ → ℓ1 . For all 푥1, . . . , 푥푛 ∈ ℓ∞,

(︃ )︃ 1 ∑︁ 2 ∑︁푛 2 2 ‖푇 푥푖‖1 ≤ 퐾퐺 ‖푇 ‖ sup ⟨푦, 푥푖⟩ . 푖 푦∈ℓ1 푖=1

∫︀ (︀∫︀ 1 2 2 Equivalently, there exists a probability measure 휇 on [푛] such that ‖푇 푥‖ ≤ 퐾퐺 ‖푇 ‖ 퐾퐺 ‖푇 ‖ 푥푗 푑휇(푗) . Proof. Use duality. To get the equivalence, use the same proof as in Piesch Domination.

1 1 Proof. Given Λ푚 : 퐶(S ) → ℓ1, there exists 휇 on S such that for every 푓, letting 푓휃(푥) = 푓(푒푖휃푥), (︀∑︁ ∫︁ ̂︀ 2 2 2 2 |푚(푛)푓휃(푛)| ≤ 퐾퐺 ‖Λ푚‖ (푓휃(푥)) 푑휇(푥) (4.39) 푆1 (︀∑︁ ∫︁ ̂︀ 2 2 2 2 |푚(푛)푓(푛)| ≤ 퐾퐺 ‖Λ푚‖ (푓(푥)) 푑휇(푥). (4.40) 푆1 Apply this to the following trigonometric sum

∑︁푁 푓(푥) = 푚(푛)푒−푖푛휃, 푛=−푁

to get 2 ∑︁푁 ∑︁푁 2 2 2 2 푚(푛) ≤ 퐾퐺 ‖Λ‖푚 푚(푛) . 푛=−푁 푛=−푁

RIP also had this kind of magic.

6 ? presentation

Theorem 4.6.1. Let (푋, 푑) be a metric space with 푋 = 퐴 ∪ 퐵 such that

푎 ∙ 퐴 embeds into ℓ2 with distortion 퐷퐴, and

푏 ∙ 퐵 embeds into ℓ2 with distortion 퐷퐵.

105 MAT529 Metric embeddings and geometric inequalities

푎+푏+1 Then 푋 embeds into ℓ2 with distortion at most 7퐷퐴퐷퐵 + 5(퐷퐴 + 퐷퐵). 푎 푏 Furthermore, given 휓 : 퐴 → ℓ2 and 휙퐵 : 퐵 → ℓ2 with ‖휙퐴‖Lip ≤ 퐷퐴 and ‖휙퐵‖Lip ≤ 퐷퐵, 푎+푏+1 there is an embedding Ψ: 푋 ˓→ ℓ2 with distortion 7퐷퐴퐷퐵 + 5(퐷퐴 + 퐷퐵) such that ‖Ψ(푢) − Ψ(푣)‖ ≥ ‖휙퐴(푢) − 휙퐴(푣)‖ for all 푢, 푣 ∈ 퐴.

For 푎 ∈ 퐴, let 푅푎 = 푑(푎, 퐵) and for 푏 ∈ 퐵, let 푅푏 = 푑(퐴, 푏).

Definition 4.6.2: For 훼 > 0, 퐴′ ⊆ 퐴 is an 훼-cover for 퐴 with respect to 퐵 if

′ ′ ′ 1. For every 푎 ∈ 퐴, there exists 푎 ∈ 퐴 where 푅푎′ ≤ 푅푎 and 푑(푎, 푎 ) ≤ 훼푅푎.

′ ′ ′ ′ ′ 2. For distinct 푎 , 푎 ∈ 퐴 , 푑(푎 , 푎 ) ≥ 훼 min(푅 ′ , 푅 ′ ). 1 2 1 2 푎1 푎2 This is like a net, but the 휀 of the net is a function of the distance to the set. The requirement is weaker for points that are farther away.

Lemma 4.6.3. For all 훼 > 0, there is an 훼-cover 퐴′ for 퐴 with respect to 퐵.

Proof. Induct. The base case is 휑, which is clear. Assume the lemma holds for |퐴| < 푘; we’ll show it holds for |퐴| = 푘. Let 푢 ∈ 퐴 be the ′ point closest to 퐵. Let 푍 = 퐴∖퐵훼푅푢 (푢) =: 퐵; we have |푍| < |퐴|. Let 푍 be a cover of 푍, and let 퐴′ = 푍′ ∪ {푢}. We show that 퐴′ is an 훼-cover. We need to show the two properties.

1. Divide into cases based on whether 푎 ∈ 퐵 or 푎 ̸∈ 퐵.

′ ′ ′ ′ 2. For 푎1, 푎2 ∈ 퐴 , if both are in 푧 , we’re done. ′ ′ ′ Otherwise, without loss of generality, 푎1 = 푢 and 푎2 ∈ 푍 .

Lemma 4.6.4. Define 푓 : 퐴′ → 퐵 to send every point in the 훼-cover to a closest point in ′ ′ 퐵, 푑(푎 , 푓(푎 )) = 푅푎′(︀. 1 Then ‖푓‖Lip ≤ 2 1 + 훼 . By the triangle inequality,

′ ′ ′ ′ ′ ′ ′ ′ 푑(푓(푎1), 푓(푎2)) ≤ 푑(푓(푎1), 푎1) + 푑(푎1, 푎2) + 푑(푎2, 푓(푎2)).

We have

′ ′ ′ ′ ′ ′ 푅푎 + 푅푎 + 푑(푎 , 푎 ) = 2 min(푅 ′ , 푅 ′ ) + |푅 ′ + 푅 ′ + 푑(푎 , 푎 ) (4.41) 1 2 1 2 푎1 푎2 푎1 푎2 1 2 1 ′ ′ ′ ′ ′ ′ ≤ 푑(푎1, 푎2) + 푑(푎1, 푎2) + 푑(푎1, 푎2) (4.42) 훼(︂ )︂ 1 = 2 1 − 푑(훼′ , 훼′ ) (4.43) 훼 1 2

푏 Let 휙퐵 : 퐵 ˓→ ℓ2 be an embedding with ‖휙‖Lip ≤ 퐷퐵; it is noncontracting.

106 MAT529 Metric embeddings and geometric inequalities

푏 Lemma 4.6.5. There is 휓 : 푋 → ℓ2 such that

1. For all 푎1, 푎2 ∈ 퐴, (︂ )︂ 1 ‖휓(푎 ) − 휓(푎 )‖ ≤ 2 1 + 퐷 퐷 푑(훼 , 훼 ) 1 2 훼 퐴 퐵 1 2

2. For all 푏1, 푏2 ∈ 퐵,

푑(푏1, 푏2) ≤ ‖휓(푏1) − 휓(푏2)‖ = ‖휙퐵(푏1) − 휙퐵(푏2)‖ ≤ 퐷퐵푑(푏1, 푏2).

3. For all 푎 ∈ 퐴, 푏 ∈ 퐵, 푑(푎, 푏) − (1 + 훼)(2퐷퐴퐷퐵 + 1)푅푎 ≤ ‖휓(푎) − 휓(푏)‖ ≤ 2(1 + 훼)(퐷퐴퐷퐵 + (2 + 훼)퐷퐵)푑(푎, 푏). −1 Proof. Let 푔 = 휙퐵푓휙퐴 . We have maps 푓 퐴  푐 / 퐴′ / 퐵

휙퐴 휙퐴 휙퐵    o 푐 ′ /  / 푏 휙(퐴) 휙(퐴 ) 휙(퐵) ℓ2.

Now ⃦ ⃦ (︂ )︂ ⃦ −1⃦ 1 ‖푔‖ ≤ ‖휙퐵‖ ‖푓‖ ⃦휙 ⃦ ≤ 퐷퐵2 1 + Lip Lip Lip 퐴 Lip 훼 ̃︀ 푎 푏 By the Kirszbraun extension theorem 3.6.1, construct 푔 : ℓ2 → ℓ2 with (︂ )︂ 1 ‖푔̃︀‖ ≤ 퐷 2 1 + . Lip 퐵 훼 ⎧ ⎨̃︀ 푔(휙퐴(푥)), if 푥 ∈ 퐴 Define 휓(푥) = ⎩ . 휙퐵(푥), if 푥 ∈ 퐵. We show the three parts. 1. ̃︀ ̃︀ ‖휓(푎1) − 휓(푎2)‖ = ‖푔(휙퐴(푎1)) − 푔(휙퐴(푎2))‖ (4.44) ̃︀ ≤ ‖푔‖Lip ‖휙퐴‖Lip 푑(푎1, 푎2). (4.45) 2. This is clear.

′ ′ ′ ′ ′ 3. Let 푏 = 푓(푎 ). Then 푑(푎 , 푏 ) ≤ 푅푎, 휓(푎 ) = 휓(푏 ). We have ‖휓(푎) − 휓(푏)‖ ≤ ‖휓(푎) − 휓(푎′)‖ − ‖휓(푎′) − 휓(푏′)‖ − ‖휓(푏) − 휓(푏′)‖ (4.46) (︂ )︂ 1 ‖휓(푎) − 휓(푏)‖ ≤ 2 1 + 퐷 퐷 푑(푎, 푎′) + 퐷 푑(푏, 푏′) (4.47) 훼 퐴 퐵 퐵 ′ 푑(푎, 푎 ) ≤ 훼푅푎 ≤ 훼푑(푎, 푏) (4.48) 푑(푏, 푏′) ≤ 푑(푏, 푎) + 푑(푎, 푎′) + 푑(푎′, 푏′) (4.49) ≤ (2 + 훼)푑(푎, 푏). (4.50)

107 MAT529 Metric embeddings and geometric inequalities

This shows the first half of (3). For the other inequality, use the triangle inequality against and get

‖휓(푎) − 휓(푏)‖ ≥ ‖휓(푏) − 휓(푏′)‖ − ‖휓(푎′) − 휓(푏′)‖ − ‖휓(푎′) − 휓(푎)‖ (4.51) ′ ≥ 푑(푏, 푏 ) − 2(1 + 훼)퐷퐴퐷퐵푅푎. (4.52)

Let

휓퐵 = 휓 (4.53)

훽 = (1 + 훼)(2퐷퐴퐷퐵 + 1) (4.54) (︂ )︂ 1 훾 = 훽 (4.55) 2 휓Δ : 푋 → R (4.56) 휓퐴(푎) = 훾푅푎, 푎 ∈ 퐴 (4.57)

휓퐴(푏) = −훾푅푏, 푏 ∈ 퐵 (4.58) Ψ: 푋 → ℓ푎+푏+1 (4.59) 푎+푏+1 Ψ(푥) = 휓퐴 ⊕ 휓퐵 ⊕ 휓Δ ∈ ℓ2 . (4.60)

For 푎1, 푎2 ∈ 퐴,

‖Ψ(푎1) − Ψ(푎2)‖ ≥ ‖Ψ퐴(푎1) − Ψ퐴(푎2)‖ ≥ 푑(푎1, 푎2). For 푎 ∈ 퐴, 푏 ∈ 퐵,

2 2 2 2 ‖Ψ(푎) − Ψ(푏)‖ = ‖휓퐴(푎) − 휓퐴(푏)‖ + ‖휓퐵(푎) − 휓퐵(푏)‖ + ‖휓퐴(푎) − 휓퐴(푏)‖ .

‖휓퐴(푎) − 휓퐴(푏)‖ ≥ 푑(푎, 푏) − 훽푅푏 (4.61)

‖휓퐵(푎) − 휓퐵(푏)‖ ≥ 푑(푎, 푏) − 훽푅푎 (4.62)

‖휓퐴(푎) − 휓퐴(푏)‖ = 훾(푅푎 + 푅푏). (4.63)

Claim 4.6.6. We have ‖휓(푎) − 휓(푏)‖ ≥ 푑(푎, 푏).

Proof. Without loss of generality 푅푎 ⊆ 푅푏. Consider 3 cases.

1. 훽푅푏 ≤ 푑(푎, 푏).

2. 훽푅푎 ≤ 푑(푎, 푏) ≤ 훽푅푏.

3. 푑(푎, 푏) ≤ 훽푅푎.

108 MAT529 Metric embeddings and geometric inequalities

Consider case 2. The other cases are similar.

2 2 2 2 ‖휓(푎) − 휓(푏)‖ ≥ (푑(푎, 푏) − 훽푅푎) + 훽 (푅푎 + 푅푏) /2 (4.64) 훽2 = 푑(푎, 푏)2 − 2훽푑(푎, 푏)푅 + (3푅2 + 2푅 푅 + 푅2) (4.65) 푎 2 푎 푎 푏 푏 2 2 훽 2 2 ≥ 푑(푎, 푏) − 2훽푅푎푅푏 + (3푅푎 + 2푅푎푅푏 + 푅푏 ) (4.66) √ 2 √ 2 2 2 = 푑(푎, 푏) + 훽 (( 3푅푎 − 푅푏) + 2( 3 + 푅푎푅푏))/2 (4.67) > 푑(푎, 푏) (4.68)

For 푎1, 푎2 ∈ 푋,

2 2 ‖Ψ(푎1) − Ψ(푎2)‖ = ‖휓퐴(푎1) + 휓퐴(푎2)‖ + ‖휓퐵(푎1) − 휓퐵(푎2)‖ + ‖휓퐵(푎1) − 휓퐴(푎2)‖ (4.69) (︂ )︂ 1 2 ≤ 퐷2 + 4 1 + 퐷2 퐷2 푑(푎 , 푎 )2 + 훾2(푅 − 푅 ) (4.70) 퐴 훼 퐴 퐵 1 2 푎1 푎2 (︂ )︂ 1 2 ≤ 퐷2 + 4 1 + 퐷2 퐷2 + 훾2 푑(푎 , 푎 )2 (4.71) 퐴 훼 퐴 퐵 1 2

|푅푎 − 푅푎2 | ≤ 푑(푎1, 푎2) (4.72) (︂ )︂ 1 2 ‖Ψ(푏 ) − Ψ(푏 )‖2 ≤ 퐷2 + 4 1 + 퐷2 퐷2 + 훾2 푑(푎, 푏)2 (4.73) 1 2 퐵 훼 퐴 퐵 2 2 2 2 |Ψ(푎) − Ψ(푏)| ≤ ‖휓퐴(푎) − 휓퐵(푏)‖ + ‖휓퐵(푎) − 휓퐵(푏)‖ + ‖휓퐴(푎) − 휓(푏)‖ (4.74) 2 2 ≤ · · · 훾 (푅푎 + 푅푏) ≤ · · · (4.75)

푅푎, 푅푏 ≤ 푑(푎, 푏).

109 MAT529 Metric embeddings and geometric inequalities

110 Chapter 5

Lipschitz extension

1 Introduction and Problem Setup

We give a presentation of the paper “On Lipschitz Extension From Finite Subsets”, by Assaf Naor and Yuval Rabani, (2015). For the convenience of the reader referencing the original paper, we have kept the numbering of the lemmas the same.

Consider the setup where we have a metric space (푋, 푑푋 ) and a Banach space (푍, ‖·‖푍 ). For a subset 푆 ⊆ 푋, consider a 1-Lipschitz function 푓 : 푆 → 푍. Our goal is to extend 푓 to 퐹 : 푋 → 푍 without experiencing too much growth in the Lipschitz constant ‖퐹 ‖퐿푖푝 over ‖푓‖퐿푖푝.

Definition 5.1.1: 푒(푋, 푆, 푍) and its variants. Define 푒(푋, 푆, 푍) to be the infimum over the sequence of 퐾 satisfying ‖퐹 ‖퐿푖푝 ≤ 퐾‖푓‖퐿푖푝 (i.e., 푒(푋, 푆, 푍) is the least upper bound for ‖퐹 ‖퐿푖푝 for a particular 푆, 푋, 푍). ‖푓‖퐿푖푝 Then, define 푒(푋, 푍) to be the supremum over all subsets 푆 for 푒(푋, 푆, 푍): So of all subsets, what’s the largest least upper bound for the ratio of Lipschitz constants? We may also want to consider supremums over 푒(푋, 푆, 푍) for 푆 with a fixed size. We can formulate this in two ways. 푒푛(푋, 푍) is the supremum of 푒(푋, 푆, 푍) over all 푆 such that |푆| = 푛. We can also describe 푒휖(푋, 푍) as the supremum of 푒(푋, 푆, 푍) over all 푆 which are 휖-discrete in the sense that 푑푋 (푥, 푦) ≥ 휖 · diam(푆) for distinct 푥, 푦 ∈ 푆 and some 휖 ∈ [0, 1]. Definition 5.1.2: Absolute extendability. We define ae(푛) to be the supremum of 푒푛(푋, 푍) over all possible metric spaces (푋, 푑푋 ) and Banach spaces (푍, ‖·‖푍 ). Identically, ae(휖) is the supremum of 푒휖(푋, 푍) over all (푋, 푑푋 ) and (푍, ‖·‖푍 ).

From now on, we will primarily discuss subsets 푆 ⊆ 푋 with size |푆| = 푛. Bounding the supremum, the absolute extendability ae(푛) < 퐾 allows us to make general claims about the

111 MAT529 Metric embeddings and geometric inequalities

extendability of maps from metric spaces into Banach spaces. Any Banach-space valued 1- Lipschitz function defined on metric space (푀, 푑푀 ) can therefore be extended to any metric space 푀 ′ such that 푀 ′ contains 푀 (up to isometry; as long as you can embed 푀 in 푀 ′ with an injective distance preserving map) such that the Lipschitz constant of the extension is at most 퐾. Therefore, for the last thirty years, it has been of interest to understand upper and lower bounds on ae(푛), as we want to understand the asymptotic behavior as 푛 → ∞. In the 1980s, the following upper and lower bound were given by Johnson and Lindenstrauss and Schechtman: √︃ log 푛 ae(푛) log 푛 log log 푛 . . In 2005, the upper bound was improved: √︃ log 푛 log 푛 ae(푛) log log 푛 . . log log 푛 In this talk, we improve the lower bound for the first time since 1984 to log 푛 log 푛 ae(푛) . . log log 푛

1.1 Why do we care? √ This improvement is of interest primarily not because of the removal of a log log 푛 term in the denominator. It is due to the fact that the approach taken to get the lower bound pro- vided by Johnson-Lindenstrauss 1984 has an inherent limitation. The approach of Johnson- Lindenstrauss to get the lower bound is to prove the nonexistance of linear projections of small norm. By considering a specific case for 푓, 푋, 푆, 푍, we can get a lower bound on ae(푛). Consider a Banach space (푊, ‖·‖푊 ) and let 푌 ⊆ 푊 be a 푘-dimensional linear subspace of 푊 with 푁휖 an 휖-net in the unit sphere of 푌 , and then define 푆휖 = 푁휖 ∪{0}. Fix 휖 ∈ (0, 1/2). We take 푓 : 푆휖 → 푌 to be the identity mapping, and wish to find an extension to 퐹 : 푊 → 푌 . Then, in our setup, we let 푋 = 푊 , 푆 = 푆휖, 푍 = 푌 . We seek to bound the magnitude of the 1 Lipschitz constant of 퐹 , call it 퐿. Johnson-Lindenstrauss prove that for 휖 . 푘2 , there exists a linear projection 푃 : 푊 → 푌 with ‖푃 ‖ . 퐿. We can now proceed to lower bound 퐿 by lower-bounding ‖푃 ‖ for all 푃 . The√ classical Kadec’-Snobar theorem says that there always exists a projection with√ ‖푃 ‖ ≤ 푘. Therefore, the best (largest) possible lower bound we could get will be 퐿 & 푘 by Kadec’-Snobar. But this is bad: log 푛 Taking 푛 = |푆휖|, by bounds on 휖-nets we get 푘 ≍ log(1/휖) which implies √︃ log 푛 퐿 & log(1/휖) √ In order to get the lower bound on ae(푛) of log 푛, we must take 휖 to be a universal constant. However, from a lemma by Benyamini (in our current setting), 퐿 . 푒휖(푋, 푍) . 1/휖 = 푂(1), which means that any lower bound we get on 퐿 will be too small√ (and won’t even tend to ∞). Therefore, we must make use of nonlinear theory to get the 푛 lower bound on ae(푛).


1.2 Outline of the Approach Let us formally state the theorem, and then give the approach to the proof.

Theorem 5.1.3 (Theorem 1). For every 푛 ∈ ℕ we have ae(푛) ≳ √(log 푛).

To prove this, we exhibit a metric space 푋, a Banach space 푍, a subset 푆 ⊆ 푋 with |푆| = 푛, and a function 푓 : 푆 → 푍 such that every extension 퐹 : 푋 → 푍 of 푓 satisfies ‖퐹‖_Lip ≳ √(log 푛) · ‖푓‖_Lip. Let 푉_퐺 be the vertex set of a finite graph 퐺, with the shortest path metric 푑_퐺 in which all edges have length 1.

We define our metric space 푋 = (푉_퐺, 푑_{퐺_푟(푆)}), where 퐺_푟(푆) is the 푟-magnification of the shortest path metric on 푉_퐺 (defined in the next section), and 푆 is an 푛-vertex subset carrying the metric 푑_{퐺_푟(푆)}. Our Banach space 푍 = (ℝ₀^푋, ‖·‖_{푊₁(푋, 푑_{퐺_푟(푆)})}) is equipped with the Wasserstein-1 norm induced by the 푟-magnification of the shortest path metric on the graph. Here ℝ₀^푋 is just the space of weight distributions on the vertices of 푋 which sum to zero. Our 푓 maps 푆 into ℝ₀^푆 ⊆ 푍, and we extend to 퐹 : 푋 → 푍. We will show how to choose 푟 and |푆| optimally to get the result. The rest of my section of the talk will give the requisite definitions and lemmas needed to understand the full proof.

2 푟-Magnification

Definition 5.2.1: 푟-magnification of a metric space. Given a metric space (푋, 푑_푋) and 푟 > 0, for every subset 푆 ⊆ 푋 we define 푋_푟(푆) as the metric space on the points of 푋 equipped with the metric

푑_{푋_푟(푆)}(푥, 푦) = 푑_푋(푥, 푦) + 푟 · |{푥, 푦} ∩ 푆| for 푥 ≠ 푦,

and 푑_{푋_푟(푆)}(푥, 푥) = 0. All this is saying is that for distinct points 푥, 푦 ∈ 푆 the distance is 2푟 + 푑_푋(푥, 푦); when one point is in 푆 and the other is outside, it is 푟 + 푑_푋(푥, 푦); and when both points are outside 푆, the metric is unchanged.
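To make the definition concrete, here is a minimal Python sketch, assuming the metric is given as a dict-of-dicts d[x][y]; the name r_magnified is ours, not from the notes.

    # A minimal sketch of Definition 5.2.1 (assumed representation: a finite
    # metric as a dict-of-dicts d[x][y]; the name r_magnified is ours).
    def r_magnified(d, S, r):
        """Return the r-magnification d_{X_r(S)} of the metric d w.r.t. S."""
        S = set(S)
        out = {x: {} for x in d}
        for x in d:
            for y in d:
                # add r once for each of x, y lying in S (+2r, +r, or +0),
                # except on the diagonal, which stays 0
                out[x][y] = 0.0 if x == y else d[x][y] + r * len({x, y} & S)
        return out

One checks easily that 푑_{푋_푟(푆)} is again a metric: the added term 푟|{푥, 푦} ∩ 푆| itself satisfies the triangle inequality on distinct points, hence so does the sum.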

The significance of this definition is as follows: it becomes easier for functions on 푆 to be Lipschitz (we enlarge the denominator in the Lipschitz ratio) without affecting functions on 푋 ∖ 푆. Thus there are more potential 1-Lipschitz maps 푓 on 푆 to draw from, and these can have extensions with large Lipschitz constant 퐾, since we do not make it any easier to be Lipschitz on 푋 ∖ 푆 (which we must deal with in the extension).

However, we can't make 푟 too large: as 푟 increases, the minimum distance between points of 푆 becomes comparable to diam(푆) under 푟-magnification. Let us assume the minimum distance between distinct points of 푆 is 1 (as it would be for adjacent vertices of an undirected graph under the shortest path metric). In particular, for distinct 푥, 푦 ∈ 푆, since diam(푆, 푑_{푋_푟(푆)}) = 2푟 + diam(푆, 푑_푋),

푑_{푋_푟(푆)}(푥, 푦) ≥ 2푟 + 1 = (2푟 + 1)/(2푟 + diam(푆, 푑_푋)) · diam(푆, 푑_{푋_푟(푆)}).

Then recall that 푒_휖(푋, 푍) is the supremum of 푒(푋, 푆, 푍) over 푆 which are 휖-discrete, where here 휖 = (2푟 + 1)/(2푟 + diam(푆, 푑_푋)). Earlier we saw the bound

푒_휖(푋, 푍) ≲ 1/휖 = (2푟 + diam(푆, 푑_푋))/(2푟 + 1) ≤ 1 + diam(푆, 푑_푋)/푟.

Thus, if we make 푟 too large, we again are bounding 푒_휖(푋, 푍) ≲ 1 = 푂(1), which means our choice of 푋 and 푍 cannot give a large lower bound (again, the bound does not even tend to ∞). Thus we must balance our choice of 푟 appropriately.

3 Wasserstein-1 norm

Now we come to the second part of our choice of 푍. We define ℝ₀^푋 to be the set of functions 푓 : 푋 → ℝ such that ∑_{푥∈푋} 푓(푥) = 0. We use 푒_푥 to denote the indicator weight map with value 1 at the point 푥 and 0 everywhere else.

Definition 5.3.1: Wasserstein-1 norm. The Wasserstein-1 norm is the norm induced by the following origin-symmetric convex body, for a finite metric space (푋, 푑_푋):

퐾_{(푋,푑_푋)} = conv{ (푒_푥 − 푒_푦)/푑_푋(푥, 푦) : 푥, 푦 ∈ 푋, 푥 ≠ 푦 }.

This is the unit ball of a norm on ℝ₀^푋; we denote the induced norm by ‖·‖_{푊₁(푋,푑_푋)}. We can give an equivalent definition of the Wasserstein-1 distance (the equivalence is proven with the Kantorovich–Rubinstein duality theorem):

Definition 5.3.2: Wasserstein-1 distance and norm. Let Π(휇, 휈) be the set of all measures 휋 on 푋 × 푋 such that

∑_{푦∈푋} 휋(푦, 푧) = 휈(푧) for all 푧 ∈ 푋,  and  ∑_{푧∈푋} 휋(푦, 푧) = 휇(푦) for all 푦 ∈ 푋.

Then the Wasserstein-1 (earthmover) distance is

푊₁^{푑_푋}(휇, 휈) = inf_{휋∈Π(휇,휈)} ∑_{푥,푦∈푋} 푑_푋(푥, 푦) 휋(푥, 푦).

In the case that 휇(푋) = 휈(푋) (equal total mass), we automatically have (휇 × 휈)/휇(푋) ∈ Π(휇, 휈) (normalizing the product measure by the common total mass gives both marginals), so Π(휇, 휈) is nonempty. The norm induced by this metric for 푓 ∈ ℝ₀^푋 is

‖푓‖_{푊₁(푋,푑_푋)} = 푊₁^{푑_푋}(푓⁺, 푓⁻)


where 푓⁺ = max(푓, 0) and 푓⁻ = max(−푓, 0) (since ∑_{푥∈푋} 푓(푥) = 0, we split 푓 this way to make sure that 휇, 휈 are both nonnegative measures). Furthermore, we have ∑_{푥∈푋} 푓⁺(푥) = ∑_{푥∈푋} 푓⁻(푥) (the same total mass), which means that Π(푓⁺, 푓⁻) is nonempty.

Definition 5.3.3: ℓ₁(푋) norm. Using standard notation, we can also define an ℓ₁ norm on 푓 by

‖푓‖_{ℓ₁(푋)} = ∑_{푥∈푋} |푓(푥)| = ∑_{푥∈푋} 푓⁺(푥) + ∑_{푥∈푋} 푓⁻(푥).

Thus, for our 푓 ∈ ℝ₀^푋, we have ∑_{푥∈푋} 푓⁺(푥) = ∑_{푥∈푋} 푓⁻(푥) = ‖푓‖_{ℓ₁(푋)}/2.
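Since Π(푓⁺, 푓⁻) is a polytope and the cost is linear, the Wasserstein-1 norm is computable by a small linear program. Here is a hedged Python sketch, assuming SciPy is available; the function name wasserstein1_norm is ours, not the notes'.

    # A sketch of Definition 5.3.2 as a linear program (assumes SciPy).
    import numpy as np
    from scipy.optimize import linprog

    def wasserstein1_norm(f, d):
        """f: dict mapping points to weights summing to 0; d: dict-of-dicts
        metric. Returns ||f||_{W_1(X, d)} = W_1^{d}(f^+, f^-)."""
        pts = list(f)
        n = len(pts)
        fp = [max(f[x], 0.0) for x in pts]    # f^+, the positive part
        fm = [max(-f[x], 0.0) for x in pts]   # f^-, the negative part
        # variable pi[i, j] flattened row-major; cost coefficient d(x_i, x_j)
        c = [d[x][y] for x in pts for y in pts]
        A_eq, b_eq = [], []
        for i in range(n):                    # sum_j pi[i, j] = f^+(x_i)
            row = np.zeros(n * n)
            row[i * n:(i + 1) * n] = 1.0
            A_eq.append(row)
            b_eq.append(fp[i])
        for j in range(n):                    # sum_i pi[i, j] = f^-(x_j)
            col = np.zeros(n * n)
            col[j::n] = 1.0
            A_eq.append(col)
            b_eq.append(fm[j])
        res = linprog(c, A_eq=np.array(A_eq), b_eq=b_eq, bounds=(0, None))
        return res.fun

On 푓 = 푒_푥 − 푒_푦 this returns 푑_푋(푥, 푦), in accordance with part 1 of the lemma below.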

Now we give a simple lemma which gives bounds for the Wasserstein-1 norm, including the norm induced by the 푟-magnification of a metric on 푋.

Lemma 5.3.4 (Lemma 7). For (푋, 푑_푋) a finite metric space, we have

1. ‖푒푥 − 푒푦‖푊1(푋,푑푋 ) = 푑푋 (푥, 푦) for every 푥, 푦 ∈ 푋.

2. For all 푓 ∈ ℝ₀^푋,

(1/2) min_{푥,푦∈푋, 푥≠푦} 푑_푋(푥, 푦) · ‖푓‖_{ℓ₁(푋)} ≤ ‖푓‖_{푊₁(푋,푑_푋)} ≤ (1/2) diam(푋, 푑_푋) · ‖푓‖_{ℓ₁(푋)}.

3. For every 푟 > 0 and 푆 ⊆ 푋, all 푓 ∈ ℝ₀^푆 satisfy

푟 ‖푓‖_{ℓ₁(푆)} ≤ ‖푓‖_{푊₁(푆, 푑_{푋_푟(푆)})} ≤ (푟 + diam(푋, 푑_푋)/2) ‖푓‖_{ℓ₁(푆)}.

Proof. 1. This follows directly from the unit-ball interpretation of the Wasserstein-1 norm, since by the first definition (푒_푥 − 푒_푦)/푑_푋(푥, 푦) lies on the unit ball.

2. Let 푚 = min_{푥≠푦} 푑_푋(푥, 푦) > 0. For distinct 푥, 푦 ∈ 푋, we have

‖(푒_푥 − 푒_푦)/푑_푋(푥, 푦)‖_{ℓ₁(푋)} ≤ ‖푒_푥 − 푒_푦‖_{ℓ₁(푋)}/푚 = 2/푚,

since 0 < 푚 ≤ 푑_푋(푥, 푦) and ‖푒_푥 − 푒_푦‖_{ℓ₁(푋)} = 1 + 1 = 2. Therefore (푒_푥 − 푒_푦)/푑_푋(푥, 푦) ∈ (2/푚) 퐵_{ℓ₁(푋)}. These elements span 퐾_{(푋,푑_푋)}, so we have 퐾_{(푋,푑_푋)} ⊆ (2/푚) 퐵_{ℓ₁(푋)} and we get the first inequality. The second inequality follows from

‖푓‖_{푊₁(푋,푑_푋)} = inf_{휋∈Π(푓⁺,푓⁻)} ∑_{푥,푦∈푋} 푑_푋(푥, 푦) 휋(푥, 푦) ≤ diam(푋, 푑_푋) ∑_{푥∈푋} 푓⁺(푥) = diam(푋, 푑_푋) ‖푓‖_{ℓ₁(푋)}/2.

3. This is a special case of the previous inequality, applied to the space (푆, 푑_{푋_푟(푆)}). There 푚 ≥ 2푟 (so 2/푚 ≤ 1/푟), and (1/2) diam(푆, 푑_{푋_푟(푆)}) ≤ (1/2)(2푟 + diam(푋, 푑_푋)) = 푟 + diam(푋, 푑_푋)/2. Plugging in these estimates gives the inequality.


4 Graph theoretic lemmas

4.1 Expanders

We will need several properties of edge-expanders in our proof of the main theorem. For this section, we fix 푛 and 푑 ≥ 3, and let 퐺 be a connected 푛-vertex 푑-regular graph. You can imagine 푑 = 3 throughout this section; all that matters is that 푑 is fixed. First we record two basic bounds on average distances in the shortest-path metric, denoted 푑_퐺.

Lemma 5.4.1. Average shortest-path metric and 푟-magnified average shortest-path bounds.

1. 푑퐺 lower bound: For nonempty 푆 ⊆ 푉퐺,

(1/|푆|²) ∑_{푥,푦∈푆} 푑_퐺(푥, 푦) ≥ log |푆| / (4 log 푑).

2. 푑_{퐺_푟(푆)} equality: For any 푆 ⊆ 푉_퐺 and 푟 > 0,

(1/|퐸_퐺|) ∑_{(푥,푦)∈퐸_퐺} 푑_{퐺_푟(푆)}(푥, 푦) = 1 + 2푟|푆|/푛.

Proof. 1. The smallest nonzero distance in 퐺 is 1, and 퐺 is connected, so the average is bounded below by |푆|(|푆| − 1)/|푆|² = 1 − 1/|푆| (the extreme case being a complete graph). Since 1 − 1/푎 ≥ (log 푎)/(4 log 3) for 푎 ∈ {1, …, 15} and 푑 ≥ 3, the claim holds for |푆| ≤ 15, so we proceed assuming |푆| ≥ 16. Let's bound the distances appearing in the average. Since 퐺 is 푑-regular, for every 푥 ∈ 푉_퐺 the number of vertices 푦 with 푑_퐺(푥, 푦) ≤ 푘 − 1 is at most ∑_{푖=0}^{푘−1} 푑^푖; the rest of the vertices are farther away. Since 1 + 푑 + ··· + 푑^{푘−2} < 푑^{푘−1}, we have #{푦 : 푑_퐺(푥, 푦) ≤ 푘 − 1} ≤ 2푑^{푘−1}. Choosing 푘 = 1 + ⌊log_푑(|푆|/4)⌋ gives 2푑^{푘−1} ≤ |푆|/2, so for each 푥 ∈ 푆 at least |푆|/2 of the points 푦 ∈ 푆 satisfy 푑_퐺(푥, 푦) ≥ 푘. Therefore

(1/|푆|²) ∑_{푥,푦∈푆} 푑_퐺(푥, 푦) ≥ (1/|푆|²) · |푆| · (|푆|/2) · 푘 ≥ log(|푆|/4)/(2 log 푑) ≥ log |푆|/(4 log 푑),

where the last inequality uses |푆| ≥ 16.

2. Let 퐸₁ be the set of edges completely contained in 푆 and 퐸₂ the set of edges with exactly one endpoint in 푆. Because 퐺 is 푑-regular, 2|퐸₁| + |퐸₂| = 푑|푆| (summing the degree 푑 over the vertices of 푆 counts each edge of 퐸₁ twice and each edge of 퐸₂ once, and this counts exactly the edges with at least one endpoint in 푆). Note also that |퐸_퐺| = 푑푛/2 by double-counting edge endpoints. Now, for each edge in 퐸₁ we add 2푟 to the base length 1, for each edge in 퐸₂ we add 푟, and otherwise we add 0. Therefore,

(1/|퐸_퐺|) ∑_{(푥,푦)∈퐸_퐺} 푑_{퐺_푟(푆)}(푥, 푦) = (1/|퐸_퐺|)((0 + 1)|퐸_퐺 ∖ (퐸₁ ∪ 퐸₂)| + (푟 + 1)|퐸₂| + (2푟 + 1)|퐸₁|) = 1 + 푟(2|퐸₁| + |퐸₂|)/|퐸_퐺| = 1 + 푟푑|푆|/(푑푛/2) = 1 + 2푟|푆|/푛.
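As a quick numerical sanity check of part 2 (not part of the proof), one can verify the identity on a random regular graph, assuming NetworkX is available:

    # Check: average r-magnified edge length equals 1 + 2r|S|/n.
    import random
    import networkx as nx

    n, d, r = 60, 3, 2.0
    G = nx.random_regular_graph(d, n, seed=0)
    S = set(random.Random(0).sample(list(G.nodes), 10))

    def edge_len(x, y):  # d_{G_r(S)} on an edge {x, y}: 1 plus r per endpoint in S
        return 1.0 + r * len({x, y} & S)

    avg = sum(edge_len(x, y) for x, y in G.edges) / G.number_of_edges()
    assert abs(avg - (1 + 2 * r * len(S) / n)) < 1e-9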

Now we introduce the definition of edge expansion.

Definition 5.4.2: Edge expansion 휑(퐺). Let 퐺 be a connected 푛-vertex 푑-regular graph. For disjoint subsets 푆, 푇 ⊆ 푉_퐺, let 퐸_퐺(푆, 푇) ⊆ 퐸_퐺 denote the set of edges with one endpoint in 푆 and the other in 푇. Then the edge expansion 휑(퐺) is defined by

휑(퐺) = sup{ 휑 ∈ [0, ∞) : |퐸_퐺(푆, 푉_퐺 ∖ 푆)| ≥ 휑 · (|푆|(푛 − |푆|)/푛²) · |퐸_퐺| for all 푆 ⊆ 푉_퐺 }.
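For intuition, 휑(퐺) can be computed by brute force on small graphs directly from this definition (exponential in 푛, so only for tiny examples; a sketch assuming NetworkX):

    # Brute-force evaluation of Definition 5.4.2 on a small graph.
    from itertools import combinations
    import networkx as nx

    def edge_expansion(G):
        n, E = G.number_of_nodes(), G.number_of_edges()
        nodes = list(G.nodes)
        phi = float("inf")
        for k in range(1, n):            # all nonempty proper subsets S
            for S in combinations(nodes, k):
                S = set(S)
                cut = sum(1 for u, v in G.edges if (u in S) != (v in S))
                phi = min(phi, cut * n * n / (len(S) * (n - len(S)) * E))
        return phi

    print(edge_expansion(nx.petersen_graph()))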

We give an equivalent formulation of edge expansion via the cut-cone decomposition:

Lemma 5.4.3 (Edge expansion via the cut-cone decomposition of subsets of ℓ₁). 휑(퐺) is the largest 휑 such that for all ℎ : 푉_퐺 → ℓ₁,

(휑/푛²) ∑_{푥,푦∈푉_퐺} ‖ℎ(푥) − ℎ(푦)‖₁ ≤ (1/|퐸_퐺|) ∑_{(푥,푦)∈퐸_퐺} ‖ℎ(푥) − ℎ(푦)‖₁.

Proof. We will assume this for this talk.

Now we combine Lemma 7 and the cut-cone decomposition to get Lemma 8:

Lemma 5.4.4 (Lemma 8). Fix 푛 ∈ ℕ and 휑 ∈ (0, 1]. Suppose 퐺 is an 푛-vertex graph with edge expansion 휑(퐺) ≥ 휑. For all nonempty 푆 ⊆ 푉_퐺 and 푟 > 0, every 퐹 : 푉_퐺 → ℝ₀^푆 satisfies

(1/푛²) ∑_{푥,푦∈푉_퐺} ‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} ≤ ((2푟 + diam(푆, 푑_퐺))/((2푟 + 1)휑)) · (1/|퐸_퐺|) ∑_{(푥,푦)∈퐸_퐺} ‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})}.

Basically, whereas before we were bounding average distances (plain and 푟-magnified) in the domain 푉_퐺, we are now bounding average norms in the image space, using the Wasserstein-1 norm induced by the 푟-magnification of the shortest path metric on the graph.

Proof. First, plug ℎ = 퐹 into the cut-cone decomposition, where the norm is now defined

by ℓ₁(푆). Then we know that diam(푆, 푑_{퐺_푟(푆)}) = 2푟 + diam(푆, 푑_퐺) and the smallest positive distance in (푆, 푑_{퐺_푟(푆)}) is 2푟 + 1. Applying Lemma 7, every 푥, 푦 ∈ 푉_퐺 satisfy

((2푟 + 1)/2) ‖퐹(푥) − 퐹(푦)‖_{ℓ₁(푆)} ≤ ‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} ≤ ((2푟 + diam(푆, 푑_퐺))/2) ‖퐹(푥) − 퐹(푦)‖_{ℓ₁(푆)}.

Plugging these estimates directly into the cut-cone decomposition

(1/푛²) ∑_{푥,푦∈푉_퐺} ‖퐹(푥) − 퐹(푦)‖_{ℓ₁(푆)} ≤ (1/(휑|퐸_퐺|)) ∑_{(푥,푦)∈퐸_퐺} ‖퐹(푥) − 퐹(푦)‖_{ℓ₁(푆)}

gives the desired result.

4.2 An Application of Menger's Theorem

Here, we want to bound from below the number of edge-disjoint paths between two disjoint vertex sets.

Lemma 5.4.5 (Lemma 9). Let 퐺 be an 푛-vertex graph and let 퐴, 퐵 ⊆ 푉_퐺 be disjoint. Fix 휑 ∈ (0, ∞) and suppose 휑(퐺) ≥ 휑. Then

#{edge-disjoint paths joining 퐴 and 퐵} ≥ (휑|퐸_퐺|/(2푛)) · min{|퐴|, |퐵|}.

Proof. Let 푚 be the maximal number of edge-disjoint paths joining 퐴 and 퐵. By Menger's theorem from classical graph theory, there exists a subset of edges 퐸* ⊆ 퐸_퐺 with |퐸*| = 푚 such that every path in 퐺 joining a point 푎 ∈ 퐴 to a point 푏 ∈ 퐵 contains an edge from 퐸*.

Now consider the graph 퐺* = (푉_퐺, 퐸_퐺 ∖ 퐸*). In this graph there are no paths between 퐴 and 퐵. If we let 퐶 ⊆ 푉_퐺 be the union of all connected components of 퐺* containing an element of 퐴, then 퐴 ⊆ 퐶 and 퐵 ∩ 퐶 = ∅. Since 퐶 covers all vertices reachable from 퐴 in 퐺*, all edges between 퐶 and 푉_퐺 ∖ 퐶 lie in 퐸*. Therefore |퐸_퐺(퐶, 푉_퐺 ∖ 퐶)| ≤ |퐸*| = 푚. By the definition of expansion,

푚 ≥ |퐸_퐺(퐶, 푉_퐺 ∖ 퐶)| ≥ 휑 · (max{|퐶|, 푛 − |퐶|} · min{|퐶|, 푛 − |퐶|}/푛²) · |퐸_퐺| ≥ 휑 min{|퐴|, |퐵|} · |퐸_퐺|/(2푛)

since max{|퐶|, 푛 − |퐶|} ≥ 푛/2 and since 퐴 ⊆ 퐶, 퐵 ⊆ 푉퐺 ∖ 퐶, we have min{|퐶|, 푛 − |퐶|} ≥ min{|퐴|, |퐵|}.
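By Menger's theorem this maximum number of edge-disjoint paths equals a minimum cut, so it is computable via max-flow. A sketch assuming NetworkX; the auxiliary super-source/sink construction is ours, not from the notes:

    # Count edge-disjoint paths between vertex sets A and B via max-flow.
    import networkx as nx

    def num_edge_disjoint_paths(G, A, B):
        H = nx.DiGraph()
        for u, v in G.edges:             # each undirected edge gets capacity 1
            H.add_edge(u, v, capacity=1)
            H.add_edge(v, u, capacity=1)
        for a in A:                      # super-source s feeds all of A freely
            H.add_edge("s", a, capacity=len(G.edges))
        for b in B:                      # all of B drains into super-sink t
            H.add_edge(b, "t", capacity=len(G.edges))
        flow, _ = nx.maximum_flow(H, "s", "t")
        return flow

    G = nx.random_regular_graph(3, 20, seed=1)
    print(num_edge_disjoint_paths(G, {0, 1, 2}, {10, 11, 12}))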

5 Main Proof

We fix 푑, 푛 ∈ ℕ and 휑 ∈ (0, 1), and let 퐺 be a 푑-regular graph on 푛 vertices with 휑(퐺) ≥ 휑. We also fix a nonempty subset 푆 ⊆ 푉_퐺 and 푟 > 0, and define a mapping

푓 : (푆, 푑_{퐺_푟(푆)}) → (ℝ₀^푆, ‖·‖_{푊₁(푆, 푑_{퐺_푟(푆)})})  such that  푓(푥) = 푒_푥 − (1/|푆|) ∑_{푧∈푆} 푒_푧  for all 푥 ∈ 푆.
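In coordinates, 푓 sends 푥 to the mean-zero vector 푒_푥 − (1/|푆|) ∑_{푧∈푆} 푒_푧; a small NumPy sketch (the names are ours):

    # The test map f from the main proof, as mean-zero vectors in R_0^S.
    import numpy as np

    def make_f(S):
        S = list(S)
        k = len(S)
        idx = {z: i for i, z in enumerate(S)}
        def f(x):                        # x must lie in S
            v = np.full(k, -1.0 / k)     # -(1/|S|) * sum_z e_z
            v[idx[x]] += 1.0             # + e_x
            return v                     # coordinates sum to 0
        return f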

Then suppose that some 퐹 : 푉_퐺 → ℝ₀^푆 extends 푓 and that for some 퐿 ∈ (0, ∞) we have

‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} ≤ 퐿 · 푑_{퐺_푟(푆)}(푥, 푦)  for all 푥, 푦 ∈ 푉_퐺.


For 푥 ∈ 푉_퐺 and 푠 ∈ (0, ∞) define

퐵_푠(푥) = {푦 ∈ 푉_퐺 : ‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} ≤ 푠},

i.e., 퐵_푠(푥) is the inverse image under 퐹 of the ball (in the Wasserstein-1 norm) of radius 푠 centered at 퐹(푥). By the consequence of Menger's theorem (Lemma 9), since 휑(퐺) ≥ 휑 we have

푚 ≥ (휑푑/4) · min{|푆 ∖ 퐵_푠(푥)|, |퐵_푠(푥)|}   (1)

for the maximal number 푚 of edge-disjoint paths between 푆 ∖ 퐵_푠(푥) and 퐵_푠(푥) (here 휑|퐸_퐺|/(2푛) = 휑푑/4, since |퐸_퐺| = 푑푛/2). That is, we can find path lengths 푘₁, …, 푘_푚 ∈ ℕ and vertices {푧_{푗,1}, …, 푧_{푗,푘_푗}}_{푗=1}^푚 ⊆ 푉_퐺 such that {푧_{푗,1}}_{푗=1}^푚 ⊆ 푆 ∖ 퐵_푠(푥) and {푧_{푗,푘_푗}}_{푗=1}^푚 ⊆ 퐵_푠(푥) (the beginnings and ends of the paths lie in the two disjoint subsets), and such that {{푧_{푗,푖}, 푧_{푗,푖+1}} : 푗 ∈ {1, …, 푚}, 푖 ∈ {1, …, 푘_푗 − 1}} are distinct edges of 퐸_퐺 (edge-disjointness). Now take an index subset 퐽 ⊆ {1, …, 푚} such that the vertices {푧_{푗,1}}_{푗∈퐽} are distinct and {푧_{푗,1}}_{푗∈퐽} = {푧_{푖,1}}_{푖=1}^푚 as sets. For 푗 ∈ 퐽, denote by 푑_푗 the number of 푖 ∈ {1, …, 푚} for which 푧_{푗,1} = 푧_{푖,1}. Since 퐺 is 푑-regular and the edges {{푧_{푖,1}, 푧_{푖,2}}}_{푖=1}^푚 are distinct, max_{푗∈퐽} 푑_푗 ≤ 푑. Since ∑_{푗∈퐽} 푑_푗 = 푚, from (1) we have

|퐽| ≥ 푚/푑 ≥ (휑/4) min{|푆 ∖ 퐵_푠(푥)|, |퐵_푠(푥)|}.   (2)

The quantity |퐽| can be upper bounded as follows:

Lemma 5.5.1 (Lemma 10). |퐽| ≤ max{ 푑^{16(푠−푟)}, (16푛퐿푑 log 푑 / log 푛)(1 + 2푟|푆|/푛) }.

Proof. Assume |퐽| > 푑^{16(푠−푟)} (otherwise we are done). This is equivalent to

푠 − 푟 < log |퐽| / (16 log 푑).

Now since {푧_{푗,1}}_{푗∈퐽} ⊆ 푆, and 퐹(푥) = 푓(푥) for all 푥 ∈ 푆 with 푓 an isometry on (푆, 푑_{퐺_푟(푆)}), the definition of the 푟-magnified metric gives

‖퐹(푧_{푖,1}) − 퐹(푧_{푗,1})‖ = 푑_{퐺_푟(푆)}(푧_{푖,1}, 푧_{푗,1}) = 2푟 + 푑_퐺(푧_{푖,1}, 푧_{푗,1})  for distinct 푖, 푗 ∈ 퐽,

where we abbreviate ‖·‖ = ‖·‖_{푊₁(푆, 푑_{퐺_푟(푆)})} for the rest of this proof. This gives us

∑_{푗∈퐽} ‖퐹(푧_{푗,1}) − 퐹(푧_{푗,푘_푗})‖ = (1/(2|퐽|)) ∑_{푖,푗∈퐽} (‖퐹(푧_{푖,1}) − 퐹(푧_{푖,푘_푖})‖ + ‖퐹(푧_{푗,1}) − 퐹(푧_{푗,푘_푗})‖)
≥ (1/(2|퐽|)) ∑_{푖,푗∈퐽} (‖퐹(푧_{푖,1}) − 퐹(푧_{푗,1})‖ − ‖퐹(푧_{푖,푘_푖}) − 퐹(푧_{푗,푘_푗})‖),

using the triangle inequality.

Since {푧_{푗,푘_푗}}_{푗∈퐽} ⊆ 퐵_푠(푥), by the definition of 퐵_푠(푥) we have

‖퐹(푧_{푖,푘_푖}) − 퐹(푧_{푗,푘_푗})‖ ≤ ‖퐹(푧_{푖,푘_푖}) − 퐹(푥)‖ + ‖퐹(푥) − 퐹(푧_{푗,푘_푗})‖ ≤ 2푠  for all 푖, 푗 ∈ 퐽,


so the previous inequality can be continued as

∑_{푗∈퐽} ‖퐹(푧_{푗,1}) − 퐹(푧_{푗,푘_푗})‖ ≥ (1/(2|퐽|)) ∑_{푖,푗∈퐽} 푑_퐺(푧_{푖,1}, 푧_{푗,1}) − (푠 − 푟)|퐽| ≥ |퐽| log |퐽|/(8 log 푑) − (푠 − 푟)|퐽| > |퐽| log |퐽|/(16 log 푑),

where the second inequality is the expander average-distance bound (Lemma 5.4.1, part 1) applied to the set {푧_{푗,1}}_{푗∈퐽}, and the last step uses 푠 − 푟 < log |퐽|/(16 log 푑). The same quantity can be bounded from above using the Lipschitz condition:

∑_{푗∈퐽} ‖퐹(푧_{푗,1}) − 퐹(푧_{푗,푘_푗})‖ ≤ 퐿 ∑_{푗∈퐽} 푑_{퐺_푟(푆)}(푧_{푗,1}, 푧_{푗,푘_푗}) ≤ 퐿 ∑_{푗∈퐽} ∑_{푖=1}^{푘_푗−1} 푑_{퐺_푟(푆)}(푧_{푗,푖}, 푧_{푗,푖+1}).

By the edge-disjointness of the paths (specifically, since {{푧_{푗,푖}, 푧_{푗,푖+1}} : 푗 ∈ 퐽, 푖 ∈ {1, …, 푘_푗 − 1}} are distinct edges of 퐸_퐺), we have

∑_{푗∈퐽} ∑_{푖=1}^{푘_푗−1} 푑_{퐺_푟(푆)}(푧_{푗,푖}, 푧_{푗,푖+1}) ≤ ∑_{{푢,푣}∈퐸_퐺} 푑_{퐺_푟(푆)}(푢, 푣) = (푛푑/2)(1 + 2푟|푆|/푛),

where the equality is Lemma 5.4.1, part 2.

Everything together gives

(퐿푛푑/2)(1 + 2푟|푆|/푛) ≥ |퐽| log |퐽|/(16 log 푑).

By the simple fact that 푎 log 푎 ≤ 푏 implies 푎 ≤ 2푏/log 푏 for 푎 ∈ [1, ∞), 푏 ∈ (1, ∞) (if 푎 ≥ √푏 then log 푎 ≥ (log 푏)/2, so 푎 ≤ 푏/log 푎 ≤ 2푏/log 푏; if 푎 < √푏 we are done since √푏 ≤ 2푏/log 푏), applied with 푎 = |퐽| and 푏 = 8퐿푛푑 log 푑 (1 + 2푟|푆|/푛) ≥ 푛, we have

|퐽| ≤ (16푛퐿푑 log 푑 / log 푛)(1 + 2푟|푆|/푛),

completing the proof.

The lemma has two corollaries, both depending on the following condition:

푑^{16(푠−푟)} ≤ 휑|푆|/8  and  퐿 ≤ 휑|푆| log 푛 / (128 (1 + 2푟|푆|/푛) 푛푑 log 푑).   (3)

Corollary 5.5.2 (Corollary 11). max_{푥∈푉_퐺} |퐵_푠(푥)| < |푆|/2.

Proof. Pick 푥 ∈ 푉_퐺. If 퐵_푠(푥) ∩ 푆 is nonempty, then again using the average-distance estimate on expander graphs (Lemma 5.4.1, part 1), there exist 푦, 푧 ∈ 퐵_푠(푥) ∩ 푆 such that

푑_퐺(푦, 푧) ≥ log |퐵_푠(푥) ∩ 푆| / (4 log 푑).

Since 푦, 푧 ∈ 푆, and 퐹(푥) = 푓(푥) for all 푥 ∈ 푆 with 푓 an isometry on (푆, 푑_{퐺_푟(푆)}), we have similarly to before that

푑_퐺(푦, 푧) + 2푟 = 푑_{퐺_푟(푆)}(푦, 푧) = ‖퐹(푦) − 퐹(푧)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} ≤ ‖퐹(푦) − 퐹(푥)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} + ‖퐹(푥) − 퐹(푧)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} ≤ 2푠,


where we have used first the triangle inequality and second the fact that 푦, 푧 ∈ 퐵_푠(푥). Combining the last two displays gives log |퐵_푠(푥) ∩ 푆|/(4 log 푑) ≤ 2(푠 − 푟), so

|퐵_푠(푥) ∩ 푆| ≤ 푑^{8(푠−푟)} ≤ √(휑|푆|/8) ≤ 2|푆|/5,

where the second inequality comes from the first assumption in (3). This implies |푆 ∖ 퐵_푠(푥)| ≥ 3|푆|/5, which combined with (2) and Lemma 10 yields

min{3|푆|/5, |퐵_푠(푥)|} < max{ 4푑^{16(푠−푟)}/휑, (64푛퐿푑 log 푑/(휑 log 푛))(1 + 2푟|푆|/푛) }.

3|푆| However, the two original assumptions together imply that 5 is in fact greater than either value in the maximum, so {︃ }︃ 4푑16(푠−푟) 64푛퐿푑 log 푑 2푟|푆| |푆| |퐵 (푥)| ≤ max , 1 + ≤ 푠 휑 휑 log 푛 푛 2

Corollary 5.5.3 (Corollary 12). 퐿 ≥ 휑푠 / (2 (1 + diam(퐺, 푑_퐺)/(2푟)) (1 + 2푟|푆|/푛)).

Proof. By the definition of 퐵_푠(푥), for 푥 ∈ 푉_퐺 and 푦 ∈ 푉_퐺 ∖ 퐵_푠(푥) we have ‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} > 푠, so

(1/푛²) ∑_{푥,푦∈푉_퐺} ‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} ≥ (1/푛²) ∑_{푥∈푉_퐺} ∑_{푦∈푉_퐺∖퐵_푠(푥)} ‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} ≥ (푠/푛²) ∑_{푥∈푉_퐺} (푛 − |퐵_푠(푥)|) ≥ 푠 (1 − max_{푥∈푉_퐺} |퐵_푠(푥)|/푛) ≥ 푠/2,

where the last inequality comes from Corollary 11. On the other hand, Lemma 8 gives

(1/푛²) ∑_{푥,푦∈푉_퐺} ‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})} ≤ ((2푟 + diam(푆, 푑_퐺))/((2푟 + 1)휑)) · (1/|퐸_퐺|) ∑_{(푥,푦)∈퐸_퐺} ‖퐹(푥) − 퐹(푦)‖_{푊₁(푆, 푑_{퐺_푟(푆)})}
≤ (퐿 (1 + diam(퐺, 푑_퐺)/(2푟)) / (휑|퐸_퐺|)) ∑_{{푥,푦}∈퐸_퐺} 푑_{퐺_푟(푆)}(푥, 푦)
= 퐿 (1 + diam(퐺, 푑_퐺)/(2푟)) (1 + 2푟|푆|/푛) / 휑,

where the inequality on the second line uses the Lipschitz condition on 퐹, and the equality is the average edge length in the 푟-magnification of the expander graph 퐺 (Lemma 5.4.1, part 2). Combining these inequalities with the previous ones yields the corollary.


Theorem 5.5.4 (Theorem 13). If 0 < 푟 ≤ diam(퐺, 푑_퐺), then

퐿 ≳ (휑/(1 + 2푟|푆|/푛)) · min{ 휑|푆| log 푛/(푛푑 log 푑), (16푟² log 푑 + 푟 log(휑|푆|/8))/(diam(퐺, 푑_퐺) log 푑) }.

Proof. Assume 16푟 log 푑 + log(휑|푆|/8) > 0 (otherwise the bound is trivially true), and choose 푠 = 푟 + log(휑|푆|/8)/(16 log 푑), so that 푠 > 0 and 푑^{16(푠−푟)} = 휑|푆|/8. Then the first inequality in (3) is satisfied, so either the second one fails, in which case 퐿 exceeds the first expression in the minimum (up to the universal constant), or both are satisfied and Corollary 12 applies; since 푟 ≤ diam(퐺, 푑_퐺) gives 1 + diam(퐺, 푑_퐺)/(2푟) ≲ diam(퐺, 푑_퐺)/푟, the bound of Corollary 12 becomes the second expression.

Theorem 5.5.5 (Theorem 1). For every 푛 ∈ ℕ we have ae(푛) ≳ √(log 푛).

Proof. Substituting 휑 ≍ 1 and diam(퐺, 푑_퐺) ≍ log 푛/log 푑 (as holds for expanders; ≍ indicates equivalence up to universal constants for large 푛), and using the assumption 0 < 푟 ≤ diam(퐺, 푑_퐺), the lower bound given in Theorem 13 becomes

퐿 ≳ (1/(1 + 푟|푆|/푛)) · min{ |푆| log 푛/(푛푑 log 푑), 푟(푟 log 푑 + log |푆|)/log 푛 }.

Taking 푆 ⊆ 푉_퐺 with |푆| = ⌊푛 √(푑 log 푑)/√(log 푛)⌋ (we must have 푑 log 푑 ≲ log 푛 to ensure |푆| ≤ 푛) and 푟 ≍ √(log 푛)/√(푑 log 푑) gives the lower bound 퐿 ≳ √(log 푛)/√(푑 log 푑). Therefore

푒_{|푆|}(퐺_푟(푆), 푊₁(푆, 푑_{퐺_푟(푆)})) ≳ √(log 푛)/√(푑 log 푑),

which completes the proof for fixed 푑.

List of Notations

훿_{푋↪푌}(휀) supremum over all those 훿 > 0 such that for every 훿-net 풩_훿 in 퐵_푋, 퐶_푌(풩_훿) ≥ (1 − 휀)퐶_푌(푋)...... 16

ℝ^휎 {(푥₁, …, 푥_푚) ∈ ℝ^푚 : 푥_푖 = 0 if 푖 ∉ 휎}...... 19
srank(퐴) stable rank, ‖퐴‖²_{푆₂}/‖퐴‖²_{푆∞} ...... 21
‖퐴‖_{푆_푝} Schatten-von Neumann 푝-norm...... 20

퐶_푌(푋) infimum of 퐷 where 푋 embeds into 푌 with distortion 퐷 ...... 16

Index

Bourgain's almost extension theorem, 49
Corson-Klee Lemma, 13
cotype 푞, 9
discretization modulus, 15, 16
distortion, 15
Enflo type, 82
Enflo type 푝, 11
finitely representable, 7
Jordan-von Neumann Theorem, 9
Kadec's Theorem, 10
Kahane's inequality, 9
Ky Fan maximum principle, 23
local properties, 7
nonlinear Hahn-Banach Theorem, 73
nontrivial type, 8
parallelogram identity, 9
Pietsch Domination Theorem, 30
Poisson kernel, 50, 57
preprocessing, 102
Rademacher type, 82
restricted invertibility principle, 19
rigidity theorem, 10
Sauer-Shelah, 33
Schatten-von Neumann 푝-norm, 20
shattering set, 33
stable 푝th rank, 22
stable rank, 21
symmetric norm, 25
type, 8
uniformly homeomorphic, 10
