Some topics in Extremal Combinatorics

Hong Liu

August 9, 2019

The purpose of this note is to give a taste of some topics in extremal combinatorics. These topics are chosen semi-randomly :) When possible, I try to present the proofs “backwards”, or with some intuition here and there. The proofs will be a bit longer than usual, but hopefully they look more natural this way.

The material covered is based on various notes/books/papers; see the main text for references. Comments are welcome; if you spot mistakes, please let me know :)

Contents

1 Extremal graph theory
  1.1 Turán theorem
  1.2 Zykov's symmetrisation
  1.3 Erdős-Stone theorem
  1.4 Stability method

2 Szemerédi's regularity lemma and its applications
  2.1 Taster session
  2.2 Formal setup
  2.3 Key lemmas
    2.3.1 Reduced graph
    2.3.2 Embedding lemma
    2.3.3 Counting lemma
  2.4 Ruzsa-Szemerédi triangle removal lemma
    2.4.1 Cleaning the graph G
    2.4.2 Proof of triangle removal lemma
  2.5 (6, 3)-theorem and Roth's theorem
  2.6 Ramsey-Turán problem for K4
  2.7 Chvátal-Rödl-Szemerédi-Trotter theorem
  2.8 Spectral proof of regularity lemma

3 Pseudorandomness
  3.1 Quasirandom graphs
    3.1.1 Equivalent definitions of quasirandomness
    3.1.2 (Codegree) ⇒ (Induced Subgraph Count)
    3.1.3 (4-cycle Count) ⇒ strongly regular via Cauchy-Schwarz

Chapter 1

Extremal graph theory

In this chapter, we will discuss the classical Turán theorem in extremal graph theory and present some standard techniques such as Zykov's symmetrisation and the stability method.

1.1 Turán theorem

Before we start, let us consider the following puzzle. Suppose we have to choose n irrational numbers $x_1, \ldots, x_n$. How can we maximise the number of pairs $(x_i, x_j)$ such that $x_i + x_j$ is rational?

One of the most classical extremal problems, nowadays called a Turán-type problem, is:

Problem 1.1.1 (Turán-type). How dense can a graph be without containing another (usually small) graph as a subgraph?

More specifically, given a graph H, we say a graph G contains a copy of H, or H is a subgraph of G, or H ⊆ G, if there is an injective map ϕ : V(H) → V(G) that preserves adjacencies, i.e. for any uv ∈ E(H), we have ϕ(u)ϕ(v) ∈ E(G). We call such a map an embedding of H in G. We say G is H-free if it does not contain H as a subgraph. If, in addition, the map also preserves non-adjacencies, then H is an induced subgraph of G. The main parameter we study for Problem 1.1.1 is the extremal number of H,

ex(n, H) = max{e(G): |G| = n and G is H-free},

that is, the maximum size (number of edges) of an n-vertex H-free graph. We call an n-vertex graph G an extremal graph for H if G is H-free of maximum size, i.e. e(G) = ex(n, H).

One of the earliest applications of extremal graph theory, by Erdős, is the construction of a dense multiplicative Sidon set of integers using a graph without 4-cycles. The first result in extremal graph theory is the following theorem of Mantel, which answers Problem 1.1.1 when forbidding triangles as subgraphs.

Theorem 1.1.2 (Mantel 1907). Let G be an n-vertex graph. If G is triangle-free, then

\[ e(G) \le \mathrm{ex}(n, K_3) = \lfloor n^2/4 \rfloor. \]

Exercise 1.1.3. Solve the puzzle at the beginning of this section, i.e. find the maximum number of pairs of irrationals $(x_i, x_j)$ with $x_i + x_j$ being rational.¹

Exercise 1.1.4. Prove that for any tree T, ex(n, T) = O(n), that is, there exists a constant C = C(T) such that ex(n, T) ≤ Cn.²

Mantel's result in fact shows that the extremal graph for the triangle is $K_{\lfloor n/2 \rfloor, \lceil n/2 \rceil}$. This also answers the following natural question for triangles.

Problem 1.1.5 (Extremal structure/Stability). What do H-extremal graphs look like? What about almost extremal graphs³: do they look like extremal ones?

Theorem 1.1.2 was later generalised by Turán to forbidding larger cliques. To state his result, we need to define a special family of graphs. For r ∈ N, the r-partite Turán graph on n vertices, denoted by $T_r(n)$, is the balanced complete r-partite n-vertex graph, i.e. each partite set is of size either ⌊n/r⌋ or ⌈n/r⌉. Clearly, $T_r(n)$ is $K_{r+1}$-free.

Theorem 1.1.6 (Turán 1941). Let r ≥ 2 be an integer and G be an n-vertex graph. If G is $K_{r+1}$-free, then

\[ e(G) \le \mathrm{ex}(n, K_{r+1}) = e(T_r(n)) = \left(1 - \frac{1}{r}\right)\frac{n^2}{2} - O(r). \]

Furthermore, the Turán graph $T_r(n)$ is the unique extremal graph.
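The edge count of the Turán graph is easy to compute directly from its part sizes. The following is a minimal Python sketch (the function names are ours, chosen for illustration) that computes e(T_r(n)), checks that r = 2 recovers Mantel's bound, and checks the upper bound (1 − 1/r)n²/2 for small parameters.

```python
def turan_part_sizes(n, r):
    """Part sizes of T_r(n): each part has floor(n/r) or ceil(n/r) vertices."""
    q, rem = divmod(n, r)
    return [q + 1] * rem + [q] * (r - rem)


def turan_edges(n, r):
    """e(T_r(n)): all vertex pairs minus the pairs lying inside one part."""
    inside = sum(s * (s - 1) // 2 for s in turan_part_sizes(n, r))
    return n * (n - 1) // 2 - inside


for n in range(1, 60):
    # r = 2 recovers Mantel's bound floor(n^2 / 4).
    assert turan_edges(n, 2) == n * n // 4
    for r in range(2, 8):
        # e(T_r(n)) <= (1 - 1/r) n^2 / 2, written without fractions.
        assert 2 * r * turan_edges(n, r) <= (r - 1) * n * n
```

For instance, with n = 10 and r = 3 the parts have sizes 4, 3, 3, so the graph has 45 − (6 + 3 + 3) = 33 edges, just below (1 − 1/3) · 100/2 ≈ 33.3.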

We see from Turán theorem that there is a unique extremal graph $T_r(n)$. The following theorem of Erdős and Simonovits shows that this problem is stable in the sense that every almost extremal graph must be close in structure to the extremal Turán graph, answering Problem 1.1.5 for cliques.

Theorem 1.1.7 (Erdős-Simonovits stability 1966). For every ε > 0, there exists δ > 0 such that the following holds. Let G be an n-vertex $K_{r+1}$-free graph. If

\[ e(G) \ge \mathrm{ex}(n, K_{r+1}) - \delta n^2, \]

then G can be changed to $T_r(n)$ by altering at most $\varepsilon n^2$ adjacencies.

1.2 Zykov’s symmetrisation

There are many proofs of Turán theorem. Here we present one using Zykov's symmetrisation. Zykov's symmetrisation is a process in which we alter the graph, one vertex at a time,

¹ Hint: Build an auxiliary graph and apply Mantel's theorem.
² Prove first that every graph with average degree d contains a subgraph with minimum degree at least d/2.
³ We say G is almost extremal for H if G is H-free and close to maximum size, i.e. $e(G) \ge \mathrm{ex}(n, H) - o(n^2)$.

• without decreasing the number of edges, and

• without increasing the clique number ω(G).⁴

At the end of the process, we arrive at a complete partite graph, which has a much simpler structure to deal with. In particular, if the original graph is $K_{r+1}$-free, then all the graphs during symmetrisation will be $K_{r+1}$-free.

Proof of Turán theorem via Zykov's symmetrisation. Let G be an n-vertex $K_{r+1}$-extremal graph with vertex set V. Pick $v_1 \in V(G)$ with maximum degree and symmetrise all of its non-neighbours to $v_1$. That is, for each u not adjacent to $v_1$, set $N(u) := N(v_1)$. This operation keeps $K_{r+1}$-freeness and, since $v_1$ has maximum degree, the resulting graph $G_1$ has at least as many edges as G. Note that in $G_1$, $V_1 := V \setminus N(v_1)$ is an independent set and completely joined to $N(v_1)$.

We now repeat this operation as follows. Pick $v_2$ of maximum degree in $G_1[N(v_1)]$ and symmetrise all its non-neighbours to $v_2$. Let $G_2$ be the resulting graph; then again $G_2$ is $K_{r+1}$-free and $e(G_2) \ge e(G_1) \ge e(G)$. Note that in $G_2$, $V_2 := V(G_2) \setminus N(v_2)$ is an independent set and completely joined to $N(v_2)$.

Continuing this process, we get at the end a complete partite graph, say $G'$, that is also $K_{r+1}$-free. As the original graph G is an extremal $K_{r+1}$-free graph, together with $e(G') \ge e(G)$, we see that $G'$ must also be extremal, i.e. $e(G') = e(G)$. We leave the uniqueness of the extremal graph as an exercise.
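A single symmetrisation step is simple to carry out explicitly. Below is a minimal Python sketch, assuming a graph stored as a dictionary from vertices to neighbour sets (the representation and the helper names are ours). It performs one step at a maximum-degree vertex of the 5-cycle and checks the two invariants: the edge count does not drop and the clique number does not grow. The clique number is computed by brute force, so this is only meant for tiny graphs.

```python
from itertools import combinations


def edge_count(adj):
    """Number of edges of a graph given as {vertex: set of neighbours}."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def clique_number(adj):
    """Brute-force clique number; only suitable for tiny graphs."""
    vertices = list(adj)
    best = 0
    for k in range(1, len(vertices) + 1):
        if any(all(b in adj[a] for a, b in combinations(s, 2))
               for s in combinations(vertices, k)):
            best = k
        else:
            break  # no clique of size k, hence none of larger size
    return best


def symmetrise(adj, v):
    """Return a new graph in which every non-neighbour u of v gets N(u) := N(v)."""
    nbrs = set(adj[v])
    clones = set(adj) - nbrs  # v together with all its non-neighbours
    new = {}
    for u in adj:
        if u in clones:
            new[u] = set(nbrs)
        else:
            # u keeps its neighbours inside N(v) and is joined to every clone.
            new[u] = (adj[u] & nbrs) | clones
    return new


# The 5-cycle 0-1-2-3-4-0; every vertex has maximum degree, so pick v = 0.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
g1 = symmetrise(c5, 0)
assert edge_count(g1) >= edge_count(c5)
assert clique_number(g1) <= clique_number(c5)
```

Here the 5-cycle turns into the complete bipartite graph with parts {0, 2, 3} and {1, 4}, which has 6 edges and is still triangle-free.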

Exercise 1.2.1. Show that among all $K_{r+1}$-free complete partite graphs, the Turán graph $T_r(n)$ is the unique extremal graph.

Further readings. The symmetrisation trick has been used in various extremal problems. To begin, one can read the linear algebraic version, and a recent generalisation due to Füredi and Maleki that can be applied to multiple graphs simultaneously. See also Pikhurko-Staden-Yilma for another application, to the Erdős-Rothschild problem.

• Motzkin and Straus, Maxima for graphs and a new proof of a theorem of Turán, Canad. J. Math., (1965).

• Füredi and Maleki, The minimum number of triangular edges and a symmetrization method for multiple graphs, Combin. Probab. Comput., (2017).

• Pikhurko, Staden, and Yilma, The Erdős-Rothschild problem on edge-colourings with forbidden monochromatic cliques, Math. Proc. Cambridge Phil. Soc., (2017).

1.3 Erdős-Stone theorem

We have seen that Turán theorem determines the extremal number for cliques and describes the unique extremal structure. The natural next question is: what if we forbid general graphs other than cliques? We shall present in this section a satisfying answer for all non-bipartite graphs. The seminal result of Erdős and Stone shows that the extremal function of a general graph is completely determined by another important graph parameter: the chromatic number. Recall that the chromatic number of a graph H, denoted by χ(H), is the minimum number of colours needed to colour V(H) so that adjacent vertices do not receive the same colour.

⁴ The clique number ω(G) of a graph G is the order of the largest clique contained in G.

Theorem 1.3.1 (Erdős-Stone 1946). Let H be an arbitrary graph. Then⁵

\[ \mathrm{ex}(n, H) = \left(1 - \frac{1}{\chi(H) - 1} + o(1)\right)\frac{n^2}{2}. \]

Note that by the definition of the chromatic number, the (χ(H) − 1)-partite Turán graph is H-free, yielding the lower bound above. The proof of the Erdős-Stone theorem proceeds by building a large (χ(H) − 1)-partite subgraph. We will not present this proof; instead, we shall give a more “modern” proof later on, which is conceptually simpler, using Szemerédi's regularity lemma.

We remark that the Erdős-Stone theorem gives the asymptotics of the extremal number for all non-bipartite H, while for bipartite H it only implies that $\mathrm{ex}(n, H) = o(n^2)$. In fact, it is known that the extremal number for bipartite graphs is polynomially smaller, i.e. for any bipartite H, there exists $c = c_H > 0$ such that

\[ \mathrm{ex}(n, H) = O(n^{2-c}). \]

Further readings. Many bipartite Turán problems are open; we refer the reader to the comprehensive survey of Füredi and Simonovits:

• Füredi and Simonovits, The history of degenerate (bipartite) extremal graph problems, arXiv:1306.5167, (2013).

1.4 Stability method

One standard technique for attacking an extremal problem is the so-called stability method. We have seen in Section 1.1 the Erdős-Simonovits stability theorem. Such stability statements are not only interesting in their own right, but also helpful in obtaining exact results in extremal combinatorics. For instance, the recent result of Liu, Pikhurko and Staden, The exact minimum number of triangles in graphs of given order and size, uses, among others, the stability approach.

Often (but not always), we can tackle an extremal problem with the following three steps:

• Step 1. Obtain asymptotic result;

• Step 2. Obtain stability statement;

⁵ The term o(1) throughout should be understood as a quantity tending to zero as n, the order of the graph, tends to infinity.

• Step 3. Use the stability statement to get exact result.

The stability method refers to Steps 2 and 3. Sometimes, the stability statement in Step 2 can be derived by a more careful analysis of the proof of the asymptotic result in Step 1. We shall illustrate Step 3 via a baby application: determining the extremal number of the pentagon $C_5$.

Theorem 1.4.1. For large n, we have $\mathrm{ex}(n, C_5) = \lfloor n^2/4 \rfloor$.

Note that Step 1 follows from the Erdős-Stone theorem: $\mathrm{ex}(n, C_5) = n^2/4 + o(n^2)$. The stability statement in Step 2 in this case reads as follows.

Lemma 1.4.2. For every ε > 0, there exists δ > 0 such that the following holds for large n. Let G be an n-vertex $C_5$-free graph. If $e(G) \ge n^2/4 - \delta n^2$, then G can be made bipartite by deleting at most $\varepsilon n^2$ edges.

Exercise 1.4.3. Prove the weaker version of Lemma 1.4.2 assuming the stronger condition that δ(G) ≥ (1/2 − δ)n.⁶

Exercise 1.4.4. Deduce Lemma 1.4.2 from the above weaker version.⁷

We now complete Step 3. The idea of the stability method is the following. From Step 2, we already know the asymptotic structure of the extremal configuration. Suppose some unwanted imperfection shows up in the configuration; then we can derive a contradiction as we have good control of the structure thanks to Step 2, proving that there is no imperfection to begin with.

Proof of Theorem 1.4.1. Let G be an n-vertex extremal $C_5$-free graph. As $T_2(n)$ is also $C_5$-free, extremality of G implies $e(G) \ge e(T_2(n)) = \lfloor n^2/4 \rfloor$. We can use the trick of removing low degree vertices in Exercise 1.4.4 and enlarge n to assume additionally that

δ(G) ≥ (1/2 − ε)n.

Let V(G) = X ∪ Y be a max-cut of G.⁸ By Lemma 1.4.2, we have

\[ e(G[X]) + e(G[Y]) \le \varepsilon n^2. \]

⁶ Hint: Take for granted that $\mathrm{ex}(n, C_4) = O(n^{3/2})$. So there are 4-cycles in G. Consider the neighbourhoods of two adjacent vertices in a copy of $C_4$.
⁷ Hint: By deleting vertices of low degree, we can find a subgraph with high minimum degree. Consequently, if the weaker version holds for large n, Lemma 1.4.2 holds for larger n.
⁸ A max-cut of a graph G is a bipartition V(G) = X ∪ Y that maximises the number of cross edges, i.e. edges between the two partite sets X and Y. An important property of a max-cut, which we shall use shortly, is that every vertex in one part, say X, has at least as many neighbours in the other part Y as in its own part X, since otherwise moving this vertex from X to Y would increase the number of cross edges, contradicting the fact that X ∪ Y is a max-cut.

Consequently, this max-cut is almost balanced, i.e.

\[ |X|, |Y| = n/2 \pm 2\sqrt{\varepsilon}\, n. \]

Indeed, otherwise $e(G) \le |X||Y| + e(G[X]) + e(G[Y]) < n^2/4$, a contradiction. We shall show that there is no edge inside X or Y, and so G is bipartite, which together with the extremality of G implies that G has to be $T_2(n)$, as $T_2(n)$ has the maximum size among all bipartite graphs, yielding the desired result.

To get rid of the imperfections (edges in X and Y), we first show that the inner degree is o(n), i.e.

\[ \Delta(G[X]), \Delta(G[Y]) \le 2\sqrt{\varepsilon}\, n. \]

Suppose otherwise that there is some v ∈ X with $d(v, X) \ge 2\sqrt{\varepsilon}\, n$.⁹ As X ∪ Y is a max-cut, $d(v, Y) \ge d(v, X) \ge 2\sqrt{\varepsilon}\, n$. Note that as G is $C_5$-free, the bipartite graph induced between $X_H := N(v, X)$ and $Y_H := N(v, Y)$ in G is $P_4$-free, thus having only O(n) edges (by Exercise 1.1.4). Then for large n, the number of missing edges in G[X, Y]¹⁰ is at least $|X_H||Y_H| - O(n) \ge 3\varepsilon n^2$. So again $e(G) \le |X||Y| - 3\varepsilon n^2 + e(G[X]) + e(G[Y]) < n^2/4$, a contradiction.

With the additional information that the inner degree is sublinear, we are now ready to show that there is not even a tiny bit of imperfection, i.e. not a single edge is allowed in X or Y. Suppose uv is an edge in X. Let w be a third vertex in X. Using that $\Delta(G[X]) \le 2\sqrt{\varepsilon}\, n$, $|X|, |Y| = n/2 \pm 2\sqrt{\varepsilon}\, n$ and δ(G) ≥ (1/2 − ε)n, we see that the common neighbourhood of u, v, w contains almost the entire set Y: $d(u, v, w, Y) \ge (1 - 10\sqrt{\varepsilon})|Y|$. Then two such common neighbours in Y together with u, v, w form a copy of $C_5$, a contradiction. This completes the proof.

⁹ We write N(v, X) := N(v) ∩ X for the set of neighbours of v in X, and d(v, X) = |N(v, X)| for the degree of v in X.
¹⁰ We write G[X, Y] for the bipartite graph induced between X and Y in G.

Chapter 2

Szemerédi's regularity lemma and its applications

Szemerédi's regularity lemma is one of the most important tools in extremal graph theory for dealing with dense graphs (graphs of positive edge-density). Here we give a gentle introduction to this powerful lemma and see some of its applications and other classical results related to it. Roughly speaking, the regularity lemma states that every large graph admits a partition into a bounded number of parts such that between almost all pairs of parts, the induced bipartite subgraphs behave pseudorandomly. The essence of the regularity lemma is:

Approximating large structures by small structures with low complexity.

It usually offers conceptually simple proofs for asymptotic results. For instance, the regularity lemma and its counting lemma together imply that, in terms of subgraph densities, any graph can be approximated by one of a few (weighted) graphs of bounded order (reduced graphs on $O_\varepsilon(1)$ vertices).

2.1 Taster session

To state the regularity lemma rigorously, we need to set up several notions. Before we do so, let us informally describe a common way of applying the regularity lemma:

• Step 1. Reduce an extremal problem A on large graphs to a problem B on small weighted graphs (using the random behaviour of the regular partition, embedding lemma, counting lemma etc.);

• Step 2. Solve problem B (using e.g. classical results in graph theory).

To be more illuminating, let us sketch a proof of the Erdős-Stone theorem, Theorem 1.3.1, to get a taste of how one can carry out this approach. We need some definitions. A map ϕ : V(H) → V(G) is called a homomorphism if it is adjacency preserving, i.e. for any uv ∈ E(H), we have ϕ(u)ϕ(v) ∈ E(G). When there is a surjective homomorphism from H to F, we say that F is a homomorphic image of H. We also say that G contains a homomorphic copy of H if G contains a homomorphic image of H as a subgraph.

Step 1 above in this case can be done using the following consequence of the regularity lemma and counting lemma: for any graph G, there is a (weighted reduced) graph R on O(1) vertices such that

P.1 for any fixed H, the subgraph density of H in R is roughly the same as that in G;¹

P.2 if R contains a homomorphic copy of H, then G contains a copy of H.

By subgraph densities, we mean the following. Denote by Inj(H, G) the number of labelled (not necessarily induced) copies of H in G, and let

\[ t(H, G) = \frac{\mathrm{Inj}(H, G)}{|G|^{|H|}} \]

be the H-density in G.² For the reader's convenience, let us recall the upper bound in the Erdős-Stone theorem:

\[ \mathrm{ex}(n, H) \le \left(1 - \frac{1}{\chi(H) - 1} + o(1)\right)\frac{n^2}{2}. \]

Informal proof of Erdős-Stone theorem. Let G be an n-vertex H-free graph and let R be the corresponding reduced graph.

Step 1. Let r := χ(H) − 1. By P.1 applied to the single edge $K_2$, we just need to bound the edge-density of R, i.e. to show $t(K_2, R) \le 1 - \frac{1}{r}$.

Step 2. Note that $K_{r+1}$ is a homomorphic image of H. Then by P.2, R is $K_{r+1}$-free. The desired bound on the edge-density then follows from Turán's theorem.
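To make the density t(H, G) used in P.1 concrete, here is a minimal brute-force Python sketch. It counts injective adjacency-preserving maps from V(H) to V(G) by trying every injective assignment, so it is only meant for very small graphs; the function names and the edge-list representation are ours.

```python
from fractions import Fraction
from itertools import permutations


def inj(h_vertices, h_edges, g_vertices, g_edges):
    """Number of injective maps V(H) -> V(G) sending edges of H to edges of G."""
    g_edge_set = {frozenset(e) for e in g_edges}
    count = 0
    for image in permutations(g_vertices, len(h_vertices)):
        phi = dict(zip(h_vertices, image))
        if all(frozenset((phi[u], phi[v])) in g_edge_set for u, v in h_edges):
            count += 1
    return count


def h_density(h_vertices, h_edges, g_vertices, g_edges):
    """t(H, G) = Inj(H, G) / |G|^|H| as an exact fraction."""
    total = len(g_vertices) ** len(h_vertices)
    return Fraction(inj(h_vertices, h_edges, g_vertices, g_edges), total)


# Triangles in K4: every injective map of 3 vertices works, so Inj = 4 * 3 * 2.
k3 = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
k4 = ([0, 1, 2, 3], [(a, b) for a in range(4) for b in range(a + 1, 4)])
assert inj(*k3, *k4) == 24
assert h_density(*k3, *k4) == Fraction(3, 8)
```

Note that with this normalisation even the complete graph has triangle density below 1; as n grows the density of $K_3$ in $K_n$ tends to 1.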

Further readings. Before we dive into the details, let us point out a comprehensive survey of Komlós and Simonovits on the regularity lemma:

• Komlós and Simonovits, Szemerédi's regularity lemma and its applications to graph theory, Bolyai Math. Soc., (1996).

2.2 Formal setup

The basic notion in the regularity lemma is that of an ε-regular pair, which measures the pseudorandomness/regularity of the induced bipartite subgraph between the pair. The parameter ε is the precision of the regularity; the smaller ε is, the more random-like the pair is.

¹ As we shall see in the counting lemma, more precisely, here by subgraph density in R, we mean the weighted subgraph homomorphism density.
² This notation is not standard. More commonly, t(H, G) denotes the homomorphism density. However, these two versions differ only by a lower order term (homomorphisms that are not injective).

Definition 2.2.1 (Regular pair). Given G = (V, E) and disjoint vertex subsets X, Y ⊆ V, let e(X, Y) := e(G[X, Y]) and denote by

\[ d(X, Y) := \frac{e(X, Y)}{|X||Y|} \]

the density of the pair (X, Y). For ε > 0, the pair (X, Y) is ε-regular if all A ⊆ X, B ⊆ Y with |A| ≥ ε|X|, |B| ≥ ε|Y| satisfy

|d(A, B) − d(X,Y )| < ε.

Additionally, if d(X,Y ) ≥ δ, for some δ > 0, we say that (X,Y ) is (ε, δ)-regular.
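For tiny examples, Definition 2.2.1 can be checked by brute force. The sketch below is a direct (exponential-time) Python transcription of the definition, using exact fractions; the names `density` and `is_eps_regular` and the representation of edges as ordered pairs (x, y) with x ∈ X, y ∈ Y are our own choices. It is only an illustration of the definition, not a practical test of regularity.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil


def density(edges, A, B):
    """d(A, B) = e(A, B) / (|A||B|) for non-empty A in X and B in Y."""
    e = sum(1 for a in A for b in B if (a, b) in edges)
    return Fraction(e, len(A) * len(B))


def large_subsets(S, min_size):
    """All subsets of S with at least max(min_size, 1) elements."""
    S = list(S)
    for size in range(max(min_size, 1), len(S) + 1):
        yield from combinations(S, size)


def is_eps_regular(edges, X, Y, eps):
    """Check |d(A, B) - d(X, Y)| < eps for all large enough A, B."""
    d = density(edges, X, Y)
    min_a, min_b = ceil(eps * len(X)), ceil(eps * len(Y))
    return all(abs(density(edges, A, B) - d) < eps
               for A in large_subsets(X, min_a)
               for B in large_subsets(Y, min_b))


X, Y = [0, 1, 2, 3], [4, 5, 6, 7]
complete = {(x, y) for x in X for y in Y}
matching = {(x, x + 4) for x in X}

# A complete pair has density 1 on every sub-pair, so it is regular.
assert is_eps_regular(complete, X, Y, Fraction(1, 4))
# A perfect matching has density 1/4, but a single matched edge has density 1.
assert not is_eps_regular(matching, X, Y, Fraction(1, 4))
```

With ε = 1/2 only subsets of size at least 2 are tested, and the matching then passes, which illustrates how the definition ignores small subsets.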

In other words, a regular pair (X,Y ) has “uniform” edge distribution in the sense that the density of any pair of large (ε-proportion) subsets (A, B) is roughly the same as that of (X,Y ).

Definition 2.2.2 (Regular partition). A partition $V = V_0 \cup V_1 \cup \cdots \cup V_r$ is ε-regular if
(i) $|V_0| \le \varepsilon|V|$ ($V_0$ is called the exceptional set);
(ii) $|V_1| = |V_2| = \cdots = |V_r|$;
(iii) all but $\varepsilon r^2$ pairs $(V_i, V_j)$ with 1 ≤ i < j ≤ r are ε-regular.

It is worth making a quick remark that we do not assume that $V_i$, i ∈ [r], is larger than the exceptional set $V_0$. In fact, quite the contrary: most of the time, we take r ≥ m ≥ 1/ε to make the edges inside the $V_i$ negligible.

Another remark is that in the definition of a regular partition, we can also have no exceptional set (by distributing $V_0$ equally among the other parts) and instead have $||V_i| - |V_j|| \le 1$ for all 1 ≤ i ≤ j ≤ r. We will mostly use the version of regular partition with no exceptional set $V_0$, unless otherwise specified. We can now state the lemma.

Theorem 2.2.3 (Szemerédi regularity lemma 1976). Given ε > 0 and m ∈ N, there exists M = M(ε, m) such that any graph G admits an ε-regular partition $V = V_0 \cup V_1 \cup \cdots \cup V_r$ with m ≤ r ≤ M.

Remark 2.2.4. Let us make some remarks about the parameters in the regularity lemma.

• We usually think of ε in the regularity lemma as a very small constant, i.e. o(1).

• Both the lower and upper bounds m ≤ r ≤ M on the number of parts of the partition are meaningful. If there is no lower bound, then the trivial partition V = V consisting of just one part is vacuously a regular partition and clearly this partition is of no use for us. The upper bound on r is also needed as we shall see shortly, the proof of the counting lemma relies crucially on the fact that the reduced graph R we use to approximate the original graph G is of bounded order.

• If the graph G does not have positive edge-density, then the regularity lemma does not say much about G.

• The $\varepsilon r^2$ exceptional irregular pairs are needed. Consider the following example.
Half graph. G = (A ∪ B, E), where A = B = [n]. For any a ∈ A and b ∈ B, put ab ∈ E(G) if and only if a ≥ b. Notice that d(A, B) ≈ 1/2. Let the top half of A be X and the bottom half of B be Y; then d(X, Y) = 0, while d(A − X, B − Y) = 1. There are εr irregular pairs in any partition.

• The upper bound M on the size of the partition coming from the proof of the regularity lemma is rather large: it is a tower of 2s with height $2\varepsilon^{-5}$. Gowers gave a construction showing that a tower of 2s with height $\varepsilon^{-1/16}$ is needed.

We end this section with two simple lemmas. The first one states that between a regular dense pair, almost every vertex has the “correct” degree to any large subset of the other side.

Lemma 2.2.5. Let (X, Y) be an ε-regular pair with density d, and B ⊆ Y with |B| ≥ ε|Y|. Then all but 2ε|X| vertices in X have degree (d ± ε)|B| in B.

Proof. Let A ⊆ X be the set of vertices v with “small” degree in B, i.e.

\[ \frac{d(v, B)}{|B|} < d - \varepsilon. \]

Suppose that |A| > ε|X| and consider the pair (A, B). By the choice of A, we have

\[ d(A, B) = \frac{e(A, B)}{|A||B|} < \frac{|A| \cdot (d - \varepsilon)|B|}{|A||B|} = d - \varepsilon, \]

contradicting (X, Y) being ε-regular. Thus, |A| ≤ ε|X|. Similarly, the same bound holds for the set of vertices of “large” degree in B, i.e. those with d(v, B)/|B| > d + ε.

Given a regular pair (X, Y), one can also show that almost all pairs from one part, say X, have the “correct” codegree to large subsets of the other side.

Exercise 2.2.6. Formulate the above codegree statement rigorously and prove it.

The second lemma states that regularity is inherited by large subsets of pairs (with a slightly worse precision/regularity). This lemma is useful as it implies that we can further refine a regular partition to get additional properties without losing regularity.

Lemma 2.2.7 (Slicing lemma). Let $V_0 \cup V_1 \cup \cdots \cup V_r$ be an ε-regular partition. Further refine each part into s equal parts: $V_i = V_i^1 \cup \cdots \cup V_i^s$. The new partition (with sr + 1 parts) is O(sε)-regular.³

Exercise 2.2.8. Prove the slicing lemma.

³ Note that O(sε)-regular implicitly requires that in the slicing lemma, $s \ll 1/\varepsilon$.
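Returning to the half graph example from Remark 2.2.4, the three densities quoted there are easy to confirm numerically. The short Python sketch below does this for n = 100. It rests on one assumption about the wording: we read the “top half” of A as the smaller labels and the “bottom half” of B as the larger labels, which is the reading under which d(X, Y) = 0 as stated.

```python
from fractions import Fraction


def half_graph_density(A_part, B_part):
    """Density between A_part and B_part in the half graph (ab is an edge iff a >= b)."""
    e = sum(1 for a in A_part for b in B_part if a >= b)
    return Fraction(e, len(A_part) * len(B_part))


n = 100
A = range(1, n + 1)
B = range(1, n + 1)
X = range(1, n // 2 + 1)        # assumed reading of "top half of A"
Y = range(n // 2 + 1, n + 1)    # assumed reading of "bottom half of B"
A_minus_X = range(n // 2 + 1, n + 1)
B_minus_Y = range(1, n // 2 + 1)

assert half_graph_density(A, B) == Fraction(n + 1, 2 * n)   # close to 1/2
assert half_graph_density(X, Y) == 0
assert half_graph_density(A_minus_X, B_minus_Y) == 1
```

The overall density is (n + 1)/(2n), while two linear-sized sub-pairs have densities 0 and 1, so the pair (A, B) is far from regular.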

2.3 Key lemmas

In the taster session, Section 2.1, we have seen how we can use P.1 and P.2 to carry out Step 1. In this section, we shall formally state and prove these two properties. They follow from two key consequences of the regularity lemma: the embedding lemma and the counting lemma. Roughly speaking, the embedding lemma says that we can embed any (appropriate) bounded degree graph (of up to linear size); and the counting lemma says that for any fixed (small) graph H, we can accurately count the number of copies of H in G. We remark that there is a stronger version of the embedding lemma, the blow-up lemma, due to Komlós, Sárközy and Szemerédi, which we will not cover for now. The blow-up lemma states that we can embed any (appropriate) spanning bounded degree graph.

2.3.1 Reduced graph

We first define a notion of reduced graphs that appear in P.1 and P.2 formally.

Definition 2.3.1 (Reduced graph). Given an ε-regular partition V (G) = V0 ∪ V1 ∪ · · · ∪ Vr of G, and δ > 0, the reduced/cluster graph R = R(ε, δ) of G is defined as follows:

• V (R) = [r];

• ij ∈ E(R) if and only if (Vi,Vj) is ε-regular with density at least δ.

We can also think of the reduced graph as a weighted graph, assigning weight $d_{ij} := d(V_i, V_j)$ to the edge ij, and define the weighted degree of a vertex i ∈ V(R) to be $\sum_{j \sim i} d_{ij}$. We will specify it when we treat R as a weighted graph.

Exercise 2.3.2. Show that normalised minimum degree is inherited by the reduced graph R = R(ε, δ), i.e.⁴

\[ \frac{\delta(R) + 1}{r} \ge \frac{\delta(G)}{n} - \delta - \varepsilon \quad \left(= \frac{\delta(G)}{n} - o(1)\right). \]

Exercise 2.3.3. Bound the edge-density of G by that of R:

t(K2,G) ≤ t(K2,R) + o(1).

As we shall soon see in the counting lemma, the notion of reduced graph R(ε, δ) captures essentially the whole (asymptotic) information of G in terms of subgraph densities.

⁴ Here we need a slightly stronger version: when we get a regular partition from the regularity lemma, we can assume that each part $V_i$ is in at most εr irregular pairs.

2.3.2 Embedding lemma

Given a graph F, denote by F(s) the blow-up of F obtained by replacing each vertex u ∈ V(F) by an independent set $I_u$ of size s and making $(I_u, I_v)$ complete bipartite in F(s) if and only if uv ∈ E(F). Observe that saying that the blow-up F(s) contains H as a subgraph (for s large enough, e.g. s ≥ |H|) is the same as saying that F contains a homomorphic copy of H.

We now present the embedding lemma, which is a formal statement of P.2. One thing to notice here is that we can embed appropriate bounded degree graphs of order up to linear size Ω(n) (so think of d, Δ below as constants and |G| = Θ(ℓ), |H| = Θ(s) and s = Ω(ℓ)), which is what we use in proving the Chvátal-Rödl-Szemerédi-Trotter theorem, see Section 2.7.

Lemma 2.3.4 (Embedding lemma). For any d ∈ (0, 1] and Δ ≥ 1, there exists $\varepsilon_0 > 0$ such that the following holds for any graphs G and H with Δ(H) ≤ Δ and any s ∈ N. Let R = R(ε, d) be the reduced graph of G with $\varepsilon \le \varepsilon_0$, and suppose that each part of the corresponding regular partition of G has size $\ell \ge 2s/d^{\Delta}$. Then

H ⊆ R(s) ⇒ H ⊆ G.

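Sketch of proof. Given d, Δ, choose $\varepsilon_0 < d$ such that

\[ (d - \varepsilon_0)^{\Delta} - \varepsilon_0 \Delta \ge \frac{1}{2} d^{\Delta} \ge \varepsilon_0. \]

Let ϕ : V(H) → V(R) be a homomorphism (which exists as H ⊆ R(s)). Order the vertices of H as $u_1, \ldots, u_h$. Initially, set $Y_j = V_j$ for j ∈ [r]. Embed the vertices $u_1, \ldots, u_{i-1}$ one by one and, after embedding $u_{i-1}$, update the set of eligible vertices $Y_{\varphi(u_j)} \subseteq V_{\varphi(u_j)}$ for each $u_j$ with j ≥ i and $u_{i-1}u_j \in E(H)$ by intersecting it with the neighbourhood of the image of $u_{i-1}$, maintaining always $|Y_{\varphi(u_j)}| \ge \varepsilon|V_{\varphi(u_j)}|$. When embedding $u_i$ in $Y_{\varphi(u_i)} \subseteq V_{\varphi(u_i)}$, note that for each j > i with $u_iu_j \in E(H)$, all but $\varepsilon|V_{\varphi(u_i)}|$ vertices u in $V_{\varphi(u_i)}$ satisfy, by Lemma 2.2.5, $d(u, Y_{\varphi(u_j)}) \ge (d - \varepsilon)|Y_{\varphi(u_j)}|$. Since

\[ |V_i|(d - \varepsilon)^{\Delta} - \varepsilon\Delta|V_i| \ge \max\{s, \varepsilon|V_i|\}, \]

we never get stuck.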
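The blow-up F(s) defined at the start of this subsection, and its link to homomorphic copies, can be illustrated with a small Python sketch. The edge-list representation and the helper names `blow_up` and `is_embedding` are our own. The example uses the fact that a single edge $K_2$ is a homomorphic image of the 4-cycle, so $C_4$ should embed into the blow-up $K_2(2)$.

```python
def blow_up(edges, s):
    """Edge set of F(s): vertex u becomes the copies (u, 0), ..., (u, s - 1)."""
    return {frozenset({(u, i), (v, j)})
            for u, v in edges for i in range(s) for j in range(s)}


def is_embedding(h_edges, phi, g_edges):
    """True if phi is injective and maps every edge of H to an edge of G."""
    injective = len(set(phi.values())) == len(phi)
    return injective and all(frozenset({phi[u], phi[v]}) in g_edges
                             for u, v in h_edges)


k2 = [("a", "b")]
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Map the cycle alternately onto copies of a and b inside K2(2).
phi = {0: ("a", 0), 1: ("b", 0), 2: ("a", 1), 3: ("b", 1)}
assert is_embedding(c4, phi, blow_up(k2, 2))
```

Here $K_2(2)$ is the complete bipartite graph $K_{2,2}$, i.e. the 4-cycle itself, and collapsing the two copies of each vertex recovers the homomorphism from $C_4$ onto $K_2$.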

2.3.3 Counting lemma

The formal statement of P.1 is the following counting lemma.

Lemma 2.3.5 (Counting lemma). Let H be a graph and $V_1, \ldots, V_h$ be vertex sets with h = |H| and $|V_i| = n$, such that all pairs $(V_i, V_j)$ are ε-regular and $d(V_i, V_j) = d_{ij} \gg \varepsilon$. Then the number of canonical copies⁵ of H in $V_1, \ldots, V_h$ is at least

\[ \prod_{ij \in E(H)} (d_{ij} - \sqrt{\varepsilon})\, n^h. \]

We skip the proof of the counting lemma, instead leaving the baby case of triangle counting as an exercise.

Exercise 2.3.6. Prove the counting lemma for the special case $H = K_3$.

Exercise 2.3.7. Make the proof of the upper bound of the Erdős-Stone theorem rigorous.

⁵ By canonical copy, we mean a copy of H with exactly one vertex in each $V_i$.

2.4 Ruzsa-Szemerédi triangle removal lemma

In this section, we will present yet another important consequence of the regularity lemma, the removal lemma, due to Ruzsa and Szemerédi, which states that an almost triangle-free graph ($o(n^3)$ triangles) can be made genuinely triangle-free by removing a negligible number of edges ($o(n^2)$ edges).

Lemma 2.4.1 (Ruzsa-Szemerédi triangle removal lemma 1976). Given c > 0, there exists a = a(c) > 0 such that for sufficiently large n the following holds. Let G be an n-vertex graph. If G has at most $an^3$ triangles, then it can be made triangle-free by removing at most $cn^2$ edges.

The contrapositive says if one cannot make a graph triangle-free by removing few edges, then the graph contains lots (positive proportion) of triangles. The removal lemma has many applications, e.g. (6, 3)-theorem and Roth’s theorem.

2.4.1 Cleaning the graph G

Before proving the removal lemma, it is convenient to define the subgraph $G_R$ of G corresponding to a reduced graph R = R(ε, δ), obtained by keeping only edges between (regular and dense) pairs $(V_i, V_j)$ for which ij ∈ E(R). We can obtain the subgraph $G_R$ via the following standard cleaning process, which shows that only a negligible number of edges are deleted:

\[ e(G_R) = e(G) - o(n^2). \]

• Remove inner edges, i.e. edges in $V_i$, i ∈ [r]. By choosing m ≥ 1/ε when applying the regularity lemma to obtain the regular partition corresponding to R(ε, δ), we can guarantee that the number of parts satisfies r ≥ m ≥ 1/ε. Then the number of inner edges is at most

\[ \binom{n/r}{2} \cdot r \le \frac{n^2}{2r} \le \frac{n^2}{2m} \le \frac{1}{2}\varepsilon n^2. \]

• Remove edges between irregular pairs. As there are at most $\varepsilon r^2$ irregular pairs, the number of edges of this kind is at most

\[ \varepsilon r^2 \cdot (n/r)^2 = \varepsilon n^2. \]

• Remove edges between sparse pairs with density at most δ, i.e. $(V_i, V_j)$ with ij ∉ E(R). The number of such edges is at most

\[ \delta \left(\frac{n}{r}\right)^2 \binom{r}{2} \le \frac{1}{2}\delta n^2. \]

Thus, in forming $G_R$, we delete in total at most

\[ \frac{1}{2}(3\varepsilon + \delta)n^2 = O(\varepsilon + \delta)n^2 \]

edges, which is negligible as we usually choose ε, δ sufficiently small. The edges removed in the cleaning process above are exactly the non-essential information we discard when forming the reduced graph. Edges in $G_R$ all lie in regular and dense pairs and so we can employ e.g. the counting lemma, which is how we shall prove the triangle removal lemma.
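The three bounds in the cleaning process can be sanity-checked with exact arithmetic. The Python sketch below evaluates them for illustrative parameter values of our own choosing (n = 1000, r = 10, ε = 1/10, δ = 1/5, with r dividing n and r ≥ 1/ε) and confirms that their sum stays below (3ε + δ)n²/2.

```python
from fractions import Fraction
from math import comb


def cleaning_bounds(n, r, eps, delta):
    """Upper bounds on the edges removed in each cleaning step (assumes r divides n)."""
    part = n // r                               # size of each part V_i
    inner = r * comb(part, 2)                   # edges inside the parts
    irregular = eps * r * r * part * part       # at most eps r^2 irregular pairs
    sparse = delta * part * part * comb(r, 2)   # pairs of density at most delta
    return inner, irregular, sparse


n, r = 1000, 10
eps, delta = Fraction(1, 10), Fraction(1, 5)    # illustrative values only
inner, irregular, sparse = cleaning_bounds(n, r, eps, delta)

assert inner <= eps * n * n / 2
assert irregular == eps * n * n
assert sparse <= delta * n * n / 2
assert inner + irregular + sparse <= (3 * eps + delta) * n * n / 2
```

With ε = c/8 and δ = c/4, as chosen in the proof in Section 2.4.2, the total bound becomes (3c/8 + c/4)n²/2 = 5cn²/16, which is at most cn²/2.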

2.4.2 Proof of triangle removal lemma

Suppose the statement is not true. That is, there is some c > 0 such that for any a there exists a counterexample G, i.e. G has at most $an^3$ triangles, but the removal of any $cn^2$ edges does not make it triangle-free. Apply Szemerédi's regularity lemma with ε = c/8 and m = 1/ε to G to get an ε-regular partition $V(G) = V_1 \cup \ldots \cup V_r$, where M ≥ r ≥ m and $||V_i| - |V_j|| \le 1$ for 1 ≤ i, j ≤ r. Let R = R(ε, c/4) be the reduced graph, and $G_R \subseteq G$ be the cleaned subgraph, as in Section 2.4.1. Then the number of edges deleted is at most $cn^2/2$. By the choice of G, there are still triangles in $G_R$, which can only lie in three sets, say X, Y, Z, that are pairwise regular with density larger than c/4. We can then apply the counting lemma to the tripartite graph $G_R[X, Y, Z]$ to see that there are at least

\[ \left(\frac{c}{4} - \varepsilon\right)^3 \cdot \left(\frac{n}{r}\right)^3 \ge \left(\frac{c}{8r}\right)^3 n^3 \ge \left(\frac{c}{8M}\right)^3 n^3 \]

triangles. Note that M = M(ε, m) in fact depends only on c. Then choosing $a = a(c) < \left(\frac{c}{8M}\right)^3$, we get that G has more than $an^3$ triangles, a contradiction.

Remark 2.4.2. To get more familiar with the argument, let us write a streamlined proof without all the calculations. Let G be an almost triangle-free graph. Then its reduced graph R must be triangle-free, as otherwise, by the counting lemma, G would contain too many triangles. Thus, G can be made triangle-free by removing the few edges not corresponding to R.

2.5 (6, 3)-theorem and Roth’s theorem

In this section, we present the Ruzsa-Szemerédi (6, 3)-theorem and see how it implies Roth's theorem on 3-term arithmetic progressions (3APs). Throughout this section, we will work with 3-uniform hypergraphs H = (V, E), where the edge set $E \subseteq \binom{V}{3}$ consists of triples in V. We say a hypergraph is linear if any two of its edges share at most one vertex. For s, t ∈ N, an (s, t)-configuration (or simply (s, t)) in a hypergraph is a set of s vertices inducing at least t edges. A hypergraph is (s, t)-free if it does not contain any (s, t)-configuration.

Theorem 2.5.1 ((6,3)-theorem). If an n-vertex 3-uniform hypergraph H is (6, 3)-free, then

e(H) = o(n2).

We remark that this upper bound is not very far from optimal: there exists a 3-uniform H that is (6, 3)-free and has $e(H) > n^2 \cdot e^{-c\sqrt{\ln n}}$, which is larger than $n^{2-\varepsilon}$ for any constant ε > 0. We shall give this lower bound construction after Theorem 2.5.3.

Proof of (6,3)-theorem. Suppose to the contrary that there exists c > 0 such that for infinitely many n, there is a (6, 3)-free 3-uniform n-vertex H with $e(H) > cn^2$. By zooming into a subgraph with higher average degree (which is still a counterexample), we may in addition assume that H is maximal in the sense that no subgraph of H has larger average degree than H.

The maximality of H implies that it is linear. Indeed, if there are two edges intersecting in two points, we have a (4, 2)-configuration; then no other edge intersects these four points, as otherwise we get a (6, 3). Thus these two edges form a component by themselves. Then deleting this component results in a subgraph with higher average degree than H, contradicting the maximality of H. Note also that, since H is linear, H is a partial Steiner triple system on n vertices, which is known to have at most $(1/6 + o(1))n^2$ hyperedges.

Let G be the shadow graph of H, obtained by setting V(G) = V(H) and turning every hyperedge in H into a triangle. We say a triangle in G is an H-triangle if it corresponds to a hyperedge in H. Since H is linear, no two H-triangles in G share an edge. Thus, there are at least $e(H) > cn^2$ edge-disjoint H-triangles in G. Consequently, G cannot be made triangle-free by removing at most $cn^2$ edges, and so the removal lemma implies that G contains at least $an^3$ triangles, where a = a(c). For large n, $an^3 > n^2 > e(H)$, meaning that there are triangles in G that do not come from a hyperedge in H. Such a triangle in G corresponds to a (6, 3) in H as H is linear, a contradiction.

Let us now see how the (6, 3)-theorem implies Roth’s theorem on 3APs.

Theorem 2.5.2 (Roth’s theorem). For any $\delta > 0$, there exists $n_0$ such that for $n \ge n_0$, any subset $S \subseteq [n]$ of size at least $\delta n$ contains a three-term arithmetic progression.

Theorem 2.5.3. (6, 3)-theorem ⇒ Roth’s Theorem.

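Proof. Suppose there is a 3AP-free set $A \subseteq [n]$ with $|A| \ge \delta n$. Define a 3-partite 3-uniform hypergraph $H$ as follows: $V(H) = V_1 \cup V_2 \cup V_3$, where $V_1 = [n]$, $V_2 = [2n]$ and $V_3 = [3n]$; for the edge set, for each $x \in [n]$ and $a \in A$, add the hyperedge $(x, x + a, x + 2a)$, whose three vertices lie in $V_1$, $V_2$ and $V_3$ respectively. So

\[ e(H) = |A||V_1| \ge \delta n^2. \]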

Thus, as $H$ has $6n$ vertices, Theorem 2.5.1 implies (for large $n$) that there is a $(6, 3)$ in $H$. Say the three hyperedges in this $(6, 3)$-configuration are $(x, x + a, x + 2a)$, $(y, y + b, y + 2b)$ and $(z, z + c, z + 2c)$. Since two points completely determine an edge in $H$, this $(6, 3)$ has to have two points from each $V_i$, $1 \le i \le 3$. Without loss of generality, say $x = z \ne y$; then $x + a \ne z + c$ and $x + 2a \ne z + 2c$, otherwise the two edges coincide. Again without loss of generality, say $y + 2b = x + 2a$; then similarly $y + b \ne x + a$ and so $y + b = z + c$. Subtracting the last two equalities and using $x = z$ shows that $b + c = 2a$. Note that $a, b, c \in A$ and $a \ne c$ since $x = z$ and $x + a \ne z + c$. Thus $\{b, a, c\} \subseteq A$ forms a 3AP, a contradiction.
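The reduction in the proof above can be checked by brute force on small instances. The following Python sketch is only an illustration under stated assumptions: the helper names `has_3ap`, `build_hypergraph`, `is_linear` and `find_63` are hypothetical, vertices are encoded as pairs (part index, value), and the search over all triples of edges is only feasible for very small n.

```python
from itertools import combinations

def has_3ap(A):
    """True if A contains a non-trivial 3-term arithmetic progression."""
    S = set(A)
    return any((a + c) % 2 == 0 and (a + c) // 2 in S
               for a, c in combinations(sorted(S), 2))

def build_hypergraph(n, A):
    """Hyperedges {x, x + a, x + 2a} with x in V1 = [n], one vertex per part."""
    return [frozenset({(1, x), (2, x + a), (3, x + 2 * a)})
            for x in range(1, n + 1) for a in A]

def is_linear(edges):
    """True if any two hyperedges share at most one vertex."""
    return all(len(e & f) <= 1 for e, f in combinations(edges, 2))

def find_63(edges):
    """Return three hyperedges spanning at most 6 vertices, or None."""
    for trio in combinations(edges, 3):
        if len(frozenset().union(*trio)) <= 6:
            return trio
    return None
```

For instance, with n = 6 the 3AP-free set {1, 2, 4} gives a linear hypergraph with 18 edges in which `find_63` finds nothing, while the set {1, 2, 3}, which is itself a 3AP, produces a (6, 3)-configuration.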

Dense (6,3)-free H. In the above proof, in fact $A$ is 3AP-free if and only if $H$ is (6,3)-free. Behrend constructed a 3AP-free subset of $[n]$ of size $n \cdot e^{-c\sqrt{\log n}}$. The corresponding hypergraph $H$ then is (6,3)-free and has $n^2 \cdot e^{-c\sqrt{\log n}}$ edges.

We end this section with an old conjecture.

Conjecture 2.5.4 (Brown-Erdős-Sós 1973). If a 3-uniform $H$ is $(s + 3, s)$-free, then

\[ e(H) = o(n^2). \]

The simplest open case is (7, 4).

2.6 Ramsey-Turán problem for K4

In this section, we present an application of the regularity lemma to the Ramsey-Turán problem. Recall that Turán’s theorem states that among all $n$-vertex $K_{r+1}$-free graphs, the Turán graph $T_r(n)$ has the largest size. Notice that these Turán graphs have rigid structures; in particular, they contain independent sets of size linear in $n$. It is then natural to ask what happens when there are no such large independent sets. Such problems, first introduced by Sós in 1969, are the substance of Ramsey-Turán theory. Given a graph $H$ and natural numbers $m, n \in \mathbb{N}$, the Ramsey-Turán number for $H$ is:

RT(n, H, m) := max{e(G): |G| = n, α(G) ≤ m, and G is H-free}.
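To make the definition concrete, here is a brute-force sketch for the special case H = K3 and tiny n. It is an illustration only: the function name `ramsey_turan_k3` is hypothetical, and the code enumerates all graphs on n labelled vertices, so it is usable only for n up to about 6. It returns `None` when no graph satisfies both constraints.

```python
from itertools import combinations

def ramsey_turan_k3(n, m):
    """Brute-force RT(n, K3, m): the maximum number of edges of an n-vertex
    triangle-free graph with independence number at most m (None if no such graph)."""
    pairs = list(combinations(range(n), 2))
    best = None
    for mask in range(1 << len(pairs)):
        edges = {pairs[i] for i in range(len(pairs)) if mask >> i & 1}
        # Reject graphs containing a triangle.
        if any({(a, b), (a, c), (b, c)} <= edges
               for a, b, c in combinations(range(n), 3)):
            continue
        # Reject graphs with an independent set of size m + 1.
        if any(all(p not in edges for p in combinations(S, 2))
               for S in combinations(range(n), m + 1)):
            continue
        if best is None or len(edges) > best:
            best = len(edges)
    return best
```

For n = 5 and m = 2 the only admissible graph is the 5-cycle, so the value is 5; for n = 3 and m = 1 no graph qualifies, since independence number 1 forces a triangle.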

The most classical case is when m is sublinear in n, i.e. m = o(n). Formally,

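Definition 2.6.1. Given a graph $H$ and $\delta \in (0, 1)$, let

\[ \varrho(H, \delta) := \lim_{n \to \infty} \frac{\mathrm{RT}(n, H, \delta n)}{n^2} \quad \text{and} \quad \varrho(H) := \lim_{\delta \to 0} \varrho(H, \delta). \]

Define $\mathrm{RT}(n, H, o(n)) = \varrho(H) \cdot n^2 + o(n^2)$.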

Exercise 2.6.2. Prove that $\mathrm{RT}(n, K_3, o(n)) = o(n^2)$.

When there is no restriction on the independence number, recall that $\mathrm{ex}(n, K_4) = n^2/3 \pm O(1)$. In comparison,

Theorem 2.6.3 (Szemerédi). $\mathrm{RT}(n, K_4, o(n)) \le n^2/8 + o(n^2)$.

Sketch of proof. Let $G$ be an $n$-vertex $K_4$-free graph with $\alpha(G) = o(n)$. Let $R$ be a weighted reduced graph of $G$ on $r$ vertices. It suffices to show that $R$ is triangle-free and no edge in $R$ has density larger than $1/2 + o(1)$. Indeed, being $K_3$-free implies that, as a graph, $R$ has at most $r^2/4$ edges; each edge having weight at most $1/2 + o(1)$ implies that, as a weighted graph, $e(R) \le r^2/8 + o(r^2)$, and hence $e(G) \le n^2/8 + o(n^2)$ as desired.

Suppose $R$ has a triangle $ijk$. Consider the corresponding pairwise dense regular triple $V_i, V_j, V_k$ in $G$. We can find two typical adjacent vertices $v_iv_j \in E(G)$ with $v_i \in V_i$ and $v_j \in V_j$, having linear codegree in $V_k$: $d(v_i, v_j, V_k) = \Omega(n)$. As $\alpha(G) = o(n)$, there is an edge in $N(v_i, v_j, V_k)$, yielding a copy of $K_4$, a contradiction.

Suppose $R$ has a chubby edge $ij$, that is, $d(V_i, V_j) \ge 1/2 + \Omega(1)$. Then any two typical vertices in $V_i$ have codegree at least $2(1/2 + \Omega(1))|V_j| - |V_j| = \Omega(|V_j|)$ in $V_j$, which is linear in $n$. This also yields a $K_4$: as almost all vertices (hence linearly many) in $V_i$ are typical and $\alpha(G) = o(n)$, we can find two adjacent ones and pick an edge in their common neighbourhood in $V_j$, again reaching a contradiction.

An ingenious geometric construction of Bollobás and Erdős later yields a matching lower bound:

\[ \mathrm{RT}(n, K_4, o(n)) = \frac{n^2}{8} + o(n^2). \]

The following question is open.

Question 2.6.4. Is $\mathrm{RT}(n, K_{2,2,2}, o(n)) = o(n^2)$?

2.7 Chvátal-Rödl-Szemerédi-Trotter theorem

In this section, we present an application of the regularity lemma in graph Ramsey theory. Recall that the Ramsey number $r(G, G)$ for a graph $G$ is the minimum integer $N$ such that any 2-edge-colouring of $K_N$ contains a monochromatic copy of $G$. The Ramsey number of cliques is exponential in the number of vertices. A theorem of Chvátal-Rödl-Szemerédi-Trotter states that bounded degree graphs have linear Ramsey numbers.

Theorem 2.7.1. Let $d \in \mathbb{N}$ and let $G$ be a graph with $\Delta(G) \le d$. Then

\[ r(G, G) = O_d(|V(G)|). \]

We will make use of the multicolour version of the Szemerédi Regularity Lemma. For a $k$-edge-coloured graph $G$, a partition $V(G) = V_1 \cup \ldots \cup V_r$ is an $\varepsilon$-regular partition if

• for all $ij \in \binom{[r]}{2}$, $\big| |V_i| - |V_j| \big| \le 1$;

• for all but at most $\varepsilon \binom{r}{2}$ choices of $ij \in \binom{[r]}{2}$, the pair $(V_i, V_j)$ is $\varepsilon$-regular in every colour.

Lemma 2.7.2 (Multicolour regularity lemma). For every real $\varepsilon > 0$ and integers $k \ge 1$ and $m$, there exists $M = M(\varepsilon, m, k)$ such that every $k$-edge-coloured graph $G$ with $n \ge m$ vertices admits an $\varepsilon$-regular partition $V(G) = V_1 \cup \ldots \cup V_r$ with $m \le r \le M$.

We can similarly define a reduced graph corresponding to a regular partition, with the only difference that $ij \in E(R)$ if and only if $(V_i, V_j)$ is regular with respect to every colour. The reduced graph inherits a (multi)edge-colouring from $G$: we can assign to each edge $ij \in E(R)$ the set of all colours that are dense in $G[V_i, V_j]$. For the application here, it suffices to just assign the majority colour, i.e. if $ij \in E(R)$ is red, then $(V_i, V_j)$ has red-density at least $1/k - o(1) = \Omega(1)$.

Let us recall Brooks’ theorem, in the weak form given by greedy colouring, which will be needed in the proof.

Theorem 2.7.3 (Brooks’ theorem). Every graph $G$ can be properly vertex-coloured using $\Delta(G) + 1$ colours, i.e. $\chi(G) \le \Delta(G) + 1$.

Proof of Theorem 2.7.1. Let $m \ge 5r(K_{d+1}, K_{d+1})$ be sufficiently large, $\varepsilon = 1/m$ and $C := 2M/(1/2 - \varepsilon)^d$, where $M = M(\varepsilon, m, 2)$ is returned from Lemma 2.7.2. Let $N \ge C|V(G)|$ and fix an arbitrary 2-edge-colouring of $K_N$. We shall find a monochromatic copy of $G$.

Apply the multicolour regularity lemma to the given 2-edge-coloured $K_N$ and let $R$ be the corresponding reduced graph on $r$ vertices. Recall that $R$ is almost complete:

\[ e(R) \ge (1 - 2\varepsilon)\binom{r}{2} > \left(1 - \frac{1}{m/3 - 1}\right)\frac{r^2}{2}, \]

with a 2-edge-colouring indicating the majority colour.

By Turán’s theorem, $R$ contains a clique $K_{m/3}$. As $m \ge 5r(K_{d+1}, K_{d+1})$, in this 2-edge-coloured clique $K_{m/3}$ there is a monochromatic $K_{d+1}$. By Brooks’ theorem, $K_{d+1}$ is a homomorphic image of $G$, as $\Delta(G) \le d$. Then by the embedding lemma, the original 2-edge-coloured $K_N$ contains a monochromatic copy of $G$ as desired.
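The numerical inequality used above to invoke Turán’s theorem can be sanity-checked with exact arithmetic. The sketch below assumes ε = 1/m and r ≥ m as in the proof, and takes m divisible by 3 so that m/3 is an integer; the function names are hypothetical. It compares the lower bound (1 − 2ε)·C(r, 2) on e(R) with the Turán threshold (1 − 1/(m/3 − 1))·r²/2, above which a clique on m/3 vertices is forced.

```python
from fractions import Fraction
from math import comb

def reduced_graph_edge_bound(m, r):
    """Lower bound (1 - 2*eps) * C(r, 2) on e(R), with eps = 1/m."""
    return (1 - Fraction(2, m)) * comb(r, 2)

def turan_threshold(m, r):
    """(1 - 1/(m/3 - 1)) * r^2 / 2; more edges than this force a clique on m/3 vertices."""
    return (1 - 1 / (Fraction(m, 3) - 1)) * Fraction(r * r, 2)

def inequality_holds(m, r):
    """Check the strict inequality used in the proof for one pair (m, r)."""
    return reduced_graph_edge_bound(m, r) > turan_threshold(m, r)
```

The check is of course no substitute for the one-line proof: since r ≥ m, we have (1 − 2/m)(1 − 1/r) ≥ 1 − 3/m > 1 − 3/(m − 3).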

2.8 Spectral proof of regularity lemma

Finally, we give a proof of the regularity lemma. The original proof proceeds by iteratively refining a partition, using an energy increment strategy. Here, we shall give a proof based on the spectral decomposition of the adjacency matrix, given by Tao, and independently by Szegedy. This idea originates from Frieze-Kannan’s proof of the weak regularity lemma. We shall only prove a weaker version in which we do not require an equipartition. One can refine this regular partition further to get an equipartition.

Lemma 2.8.1. Let $G$ be an $n$-vertex graph with vertex set $V$ and let $\varepsilon > 0$. Then there exists a partition $V = V_1 \cup \cdots \cup V_M$, $M \le M(\varepsilon)$, such that apart from an exceptional set $\Sigma \subseteq \binom{[M]}{2}$ with

\[ \sum_{(i,j) \in \Sigma} |V_i||V_j| = O(\varepsilon |V|^2), \]

we have for every $(i, j) \notin \Sigma$, $A \subseteq V_i$ and $B \subseteq V_j$ that

\[ \big| e(A, B) - d_{ij}|A||B| \big| = O(\varepsilon |V_i||V_j|), \]

where $d_{ij}$ is a real number depending only on $i$ and $j$.

Before we dive into the details of the proof, let us sketch briefly how it goes. We write the adjacency matrix $T$ as the sum of the rank-1 matrices from eigenvectors of $T$, with weights being the associated eigenvalues. Then the structure of $T$ is dictated mostly by the part with large eigenvalues (main term), while the part with small eigenvalues is more like noise (error term). Thus, to capture the behaviour of $T$, we shall get a partition in which each eigenvector with large eigenvalue is approximately constant in each part. We need the Cauchy-Schwarz inequality for the proof; let us recall it here.

Lemma 2.8.2 (Cauchy-Schwarz inequality). Let $u, v \in \mathbb{C}^n$. Then

\[ \Big| \sum_{i=1}^{n} u_i \overline{v_i} \Big| = |\langle u, v \rangle| \le \|u\|_2 \|v\|_2 = \sqrt{\sum_{i=1}^{n} |u_i|^2 \sum_{i=1}^{n} |v_i|^2}. \]

Furthermore, equality holds if and only if $u$ and $v$ are linearly dependent.

Proof of the regularity lemma, Lemma 2.8.1. Let $T$ be the adjacency matrix of $G$. As $T$ is a real symmetric matrix, it is self-adjoint and has eigenvalue decomposition (footnote 6):

\[ T = \sum_{i=1}^{n} \lambda_i u_i u_i^*, \]

where $u_1, \ldots, u_n$ form an orthonormal basis of $\mathbb{C}^n$ with eigenvalues $\lambda_i \in \mathbb{R}$ ordered so that $|\lambda_1| \ge \ldots \ge |\lambda_n|$.

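Splitting $T$. As outlined above, we shall split $T = T_1 + T_2 + T_3$ into a main term $T_1$ and error terms $T_2, T_3$. To do so, we need a bound on the eigenvalues. Note that the $(i, j)$-th entry of $T^k$ records the number of $v_i, v_j$-walks of length $k$ in $G$. In particular, each diagonal entry of $T^2$ is the degree of the corresponding vertex. Thus, the trace satisfies

\[ \mathrm{tr}(T^2) = \sum_i d_i = 2e(G) \le n^2, \]

and, since $\mathrm{tr}(T^2) = \sum_{j=1}^{n} |\lambda_j|^2$ and the $|\lambda_j|$ are non-increasing, for each $i \in [n]$ we have

\[ i \cdot |\lambda_i|^2 \le \sum_{j=1}^{n} |\lambda_j|^2 \le n^2 \quad \Rightarrow \quad |\lambda_i| \le \frac{n}{\sqrt{i}}. \tag{2.1} \]

Let $F = F(\varepsilon) : \mathbb{N} \to \mathbb{N}$ be a function to be chosen later with $F(i) \ge i$. By averaging, for some $J \le F^{1/\varepsilon^3}(1)$ (footnote 7), we can take out a piece in the middle with small weight (footnote 8):

\[ \sum_{i \in [J, F(J)]} |\lambda_i|^2 \le \varepsilon^3 n^2. \tag{2.2} \]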

Footnote 6. We treat all vectors here as column vectors.

Footnote 7. We use $F^k = F \circ \cdots \circ F$ for the $k$-fold iteration of $F$.

Footnote 8. Proof: Consider the partition of $[n]$ into intervals $[1, F(1)) \cup [F(1), F^2(1)) \cup [F^2(1), F^3(1)) \cup \cdots$. As $\sum_{i \in [n]} |\lambda_i|^2 \le n^2$, one of the first $1/\varepsilon^3$ intervals should have weight at most $\varepsilon^3 n^2$.

We can now write $T = T_1 + T_2 + T_3$, where

• $T_1 = \sum_{i \le J} \lambda_i u_i u_i^*$ is the “structured” term;

• $T_2 = \sum_{i \in [J, F(J)]} \lambda_i u_i u_i^*$ is the “small” term;

• $T_3 = \sum_{i > F(J)} \lambda_i u_i u_i^*$ is the “pseudorandom” term.
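The pigeonhole choice of J behind (2.2) can be sketched as a short search. This is a minimal illustration under stated assumptions: the function name `find_gap_index` is hypothetical, the input is the list of squared eigenvalue moduli in non-increasing order with total at most n², and the intervals are taken half-open, [J, F(J)), as in the footnote proving (2.2).

```python
import math

def find_gap_index(sq_eigs, F, eps):
    """sq_eigs[i - 1] = |lambda_i|^2 (1-based i), with sum(sq_eigs) <= n^2.
    Return J = F^k(1) for some k < ceil(1 / eps^3) such that the weight of the
    indices in [J, F(J)) is at most eps^3 * n^2."""
    n = len(sq_eigs)
    budget = eps ** 3 * n ** 2
    J = 1
    for _ in range(math.ceil(1 / eps ** 3)):
        hi = F(J)
        # Weight of the 1-based indices J, J + 1, ..., F(J) - 1.
        weight = sum(sq_eigs[J - 1:hi - 1])
        if weight <= budget:
            return J
        J = hi
    # The intervals are disjoint, so ceil(1 / eps^3) heavy intervals would
    # have total weight exceeding n^2.
    raise AssertionError("unreachable when sum(sq_eigs) <= n^2")
```

As a toy usage, take n = 16, squared moduli 128, 64, 32, ... (halving each time), F(i) = 2i and ε = 1/2: the intervals [1, 2) and [2, 4) are too heavy, while [4, 8) has weight 30 ≤ 32, so the search stops at J = 4.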

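Partition for the structured term $T_1$. We now construct a partition of $V(G)$ such that $T_1$ is approximately constant on most blocks. For each $i \le J$, we partition $V(G)$ into $O_{J,\varepsilon}(1)$ parts in which $u_i$ only fluctuates by $O\big(\frac{\varepsilon^{3/2}}{J} n^{-1/2}\big)$, apart from an exceptional part of size $O\big(\frac{\varepsilon}{J} n\big)$ where $|u_i|$ is excessively large (of value at least $\sqrt{J/\varepsilon}\, n^{-1/2}$). Let $u = u_i$ and write $u(j)$ for the $j$-th coordinate of $u$. Recall that $\|u\|_2^2 = \sum_{j \in [n]} u(j)^2 = 1$, so the number of coordinates with absolute value at least $\sqrt{J/\varepsilon}\, n^{-1/2}$ is at most $\frac{\varepsilon}{J} n$. Thus, for the rest of the coordinates, we can partition them according to their values into at most $\sqrt{J/\varepsilon}\, n^{-1/2} \big/ \big(\frac{\varepsilon^{3/2}}{J} n^{-1/2}\big) = O_{J,\varepsilon}(1)$ parts as claimed.

Combining all of these $J$ partitions together, we get $V(G) = V_1 \cup \cdots \cup V_{M-1} \cup V_M$, $M = O_{J,\varepsilon}(1)$, where the exceptional part satisfies $|V_M| \le \varepsilon n$, and on each $V_i$, $1 \le i \le M - 1$, the eigenvectors $u_1, \ldots, u_J$ all fluctuate by at most $O\big(\frac{\varepsilon^{3/2}}{J} n^{-1/2}\big)$. We claim that $T_1$ fluctuates by at most $O(\varepsilon)$ on each block $V_i \times V_j$, for $1 \le i, j \le M - 1$, and consequently, writing $d_{ij}$ for the mean value of the entries of $T_1$ on $V_i \times V_j$, we have for any $A \subseteq V_i$, $B \subseteq V_j$ that

\[ 1_A^* T_1 1_B = d_{ij}|A||B| + O(\varepsilon |V_i||V_j|). \tag{2.3} \]

Indeed, recalling that $|\lambda_i| \le n/\sqrt{i}$, we see that each $V_i \times V_j$-entry of $T_1 = \sum_{i \le J} \lambda_i u_i u_i^*$ fluctuates by at most

\[ \sum_{i \le J} |\lambda_i| \cdot \sqrt{\frac{J}{\varepsilon}}\, n^{-1/2} \cdot O\Big(\frac{\varepsilon^{3/2}}{J} n^{-1/2}\Big) = O\Big(\frac{\varepsilon}{\sqrt{J}} n^{-1}\Big) \cdot \sum_{i \le J} \frac{n}{\sqrt{i}} = O(\varepsilon). \]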
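The partition built from a single eigenvector can be sketched as a bucketing of coordinates by value. The code below is an illustration only, assuming a real unit vector `u` given as a Python list; the function name `bucket_partition` and its return format are hypothetical. Coordinates of absolute value at least √(J/ε)·n^(−1/2) go to the exceptional part, and the remaining ones are grouped into buckets of width ε^(3/2)/J · n^(−1/2).

```python
import math
from collections import defaultdict

def bucket_partition(u, J, eps):
    """Split the coordinates of a real unit vector u into value buckets.
    Returns (parts, exceptional, width): within each part the entries of u
    differ by less than width; exceptional collects the large coordinates."""
    n = len(u)
    big = math.sqrt(J / eps) / math.sqrt(n)      # threshold for "excessively large"
    width = eps ** 1.5 / J / math.sqrt(n)        # allowed fluctuation inside a part
    parts = defaultdict(list)
    exceptional = []
    for j, x in enumerate(u):
        if abs(x) >= big:
            exceptional.append(j)
        else:
            parts[math.floor(x / width)].append(j)
    return list(parts.values()), exceptional, width
```

Taking the common refinement of the J partitions obtained this way, one for each of u_1, ..., u_J, and merging the J exceptional parts gives the partition V_1, ..., V_M used in the text.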

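Bounding error term $T_2$. By the choice of $T_2$ and (2.2), $\mathrm{tr}(T_2^2) \le \varepsilon^3 n^2$. On the other hand, letting $x_{ab}$ be the $(a, b)$-th entry of $T_2$, as $T_2$ is self-adjoint, we have $\mathrm{tr}(T_2^2) = \sum_{a, b \in V(G)} |x_{ab}|^2$. Then by Markov’s inequality, we get

\[ \sum_{a \in V_i, b \in V_j} |x_{ab}|^2 \le \varepsilon^2 |V_i||V_j|, \tag{2.4} \]

for all $1 \le i, j \le M - 1$ apart from an exceptional set $\Sigma' \subseteq \binom{[M-1]}{2}$ with $\sum_{(i,j) \in \Sigma'} |V_i||V_j| \le \varepsilon n^2$. Hence, for any $(i, j) \notin \Sigma'$ and $A \subseteq V_i$, $B \subseteq V_j$, by (2.4) and Cauchy-Schwarz, we have

\[ |1_A^* T_2 1_B| \le \sum_{a \in V_i, b \in V_j} |x_{ab}| \le \Big( \sum_{a \in V_i, b \in V_j} |x_{ab}|^2 \Big)^{1/2} (|V_i||V_j|)^{1/2} = O(\varepsilon |V_i||V_j|). \tag{2.5} \]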

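Bounding error term $T_3$. By the choice of $T_3$ and (2.1), the operator norm of $T_3$ satisfies $\|T_3\|_{op} \le \frac{n}{\sqrt{F(J)}}$. Then by Cauchy-Schwarz, we have

\[ |1_A^* T_3 1_B| = |\langle 1_A, T_3 1_B \rangle| \le \|1_A\|_2 \cdot \|T_3 1_B\|_2 \le \|1_A\|_2 \cdot \|T_3\|_{op} \cdot \|1_B\|_2 = O\Big( \frac{n^2}{\sqrt{F(J)}} \Big). \tag{2.6} \]

Set $\Sigma := \Sigma' \cup \{(i, j) : i \text{ or } j = M\} \cup \{(i, j) : \min\{|V_i|, |V_j|\} \le \varepsilon n/M\}$. Then it is easy to check that $\sum_{(i,j) \in \Sigma} |V_i||V_j| = O(\varepsilon n^2)$. By (2.3), (2.5), (2.6), for every $(i, j) \notin \Sigma$ we have

\[ e(A, B) = 1_A^* T 1_B = \sum_{k=1}^{3} 1_A^* T_k 1_B = d_{ij}|A||B| + O(\varepsilon |V_i||V_j|) + O\Big( \frac{n^2}{\sqrt{F(J)}} \Big). \]

As $|V_i|, |V_j| \ge \varepsilon n/M$, we have $\frac{n^2}{\sqrt{F(J)}} \le \frac{M^2 |V_i||V_j|}{\varepsilon^2 \sqrt{F(J)}}$. To absorb the second error term into the first one, we need $\frac{1}{\sqrt{F(J)}} = O(\varepsilon^3/M^2)$; since $M = O_{J,\varepsilon}(1)$, this can be arranged by choosing $F$ to grow sufficiently fast.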

Remark 2.8.3. The point of having the $T_2$-term is to have local control on the fluctuation of $e(A, B)$, i.e. $O(\varepsilon |V_i||V_j|)$. For the tail $T_3$, we only have a global type control $O\big(\frac{n^2}{\sqrt{F(J)}}\big)$, and we need $\sqrt{F(J)} \ge \frac{M^2}{\varepsilon^3}$ to make it into a local error. Recall that $M \ge J$ when we combine the partitions for each $u_i$, $i \le J$. Thus, we need to create a gap between $F(J)$ and $J$ by splitting out a small term $T_2$ in the middle.

Chapter 3

Pseudorandomness

In the last chapter, we studied Szemerédi’s regularity lemma, which partitions any (large) graph into parts such that almost all pairs of parts induce a random-like bipartite graph. This random-like property then enables us (via the counting lemma and the embedding lemma) to use expectation to approximate some graph parameters (such as subgraph densities). In this chapter, we will take a look at the notion of pseudorandomness, also referred to in other contexts as quasirandomness, regularity or uniformity. Pseudorandomness has played an important role not just in extremal combinatorics, but also in other fields such as number theory, probability, coding theory and theoretical computer science.

3.1 Quasirandom graphs

We will first take a look at quasirandom graphs, introduced in the 80s by Thomason and independently by Chung-Graham-Wilson. We shall define several properties that at first glance seem unrelated to one another but turn out to be equivalent, in the sense of being random-like. One immediate application of this is that we have many different ways of checking whether a graph is quasirandom: if a graph satisfies any one of the equivalent properties, then it satisfies all of them.

We need some notation before stating the equivalent quasirandom properties. Throughout this section, $G$ will be an $n$-vertex graph with edge density $p \in (0, 1)$, i.e. $e(G) = p\binom{n}{2}$. When reading this section, we should compare $G$ with the Erdős-Rényi random graph $G(n, p)$ (footnote 1). We let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of the adjacency matrix $T$ of $G$, ordered by $|\lambda_1| \ge \ldots \ge |\lambda_n|$. We will write $d(u, v) = |N(u) \cap N(v)|$ for the codegree of $u$ and $v$.

Footnote 1. The Erdős-Rényi random graph $G(n, p)$ is the probability space of all graphs on vertex set $[n]$ with $p$-biased measure; equivalently, $G(n, p)$ is a random graph on $[n]$ in which every pair forms an edge with probability $p$ independently of all other pairs.

3.1.1 Equivalent definitions of quasirandomness

We can now state the aforementioned properties:

• (Induced Subgraph Count) For every graph $H$, the number of labelled induced copies of $H$ in $G$ is $\big(p^{e(H)}(1 - p)^{e(\overline{H})} + o(1)\big) n^{|V(H)|}$.

• (Subgraph Count) For every graph $H$, $t(H, G) = p^{e(H)} + o(1)$.

• (4-cycle Count) $t(C_4, G) \le p^4 + o(1)$.

• (Spectral Gap) $|\lambda_2| = o(n)$.

• (Discrepancy) For any $A, B \subseteq V(G)$, $e(A, B) = p|A||B| + o(n^2)$.

• (Codegree) $\sum_{u, v \in V(G)} |d(u, v) - p^2 n| = o(n^3)$.

Notice that these properties hold almost surely in $G(n, p)$. The result of Thomason and Chung-Graham-Wilson states that all the above properties are equivalent. A graph $G$ is called quasirandom if it satisfies any one of the above properties. It is surprising at first that the seemingly weaker property of having the correct $C_4$ count implies the correct density of every subgraph. We shall give a proof for regular graphs.
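To get a feel for these quantities, the following sketch evaluates two of them exactly on a standard example, the Paley graph on 13 vertices. It is an illustration under stated assumptions: t(C4, G) is taken to be the homomorphism density, so that hom(C4, G) is the sum of d(u, v)² over all ordered pairs (u, v), the codegree sum is restricted to u ≠ v, and all function names are hypothetical.

```python
from fractions import Fraction

def paley_graph(q):
    """Adjacency sets of the Paley graph on Z_q (q prime, q = 1 mod 4):
    u ~ v if and only if u - v is a nonzero square modulo q."""
    squares = {(x * x) % q for x in range(1, q)}
    return [{v for v in range(q) if (u - v) % q in squares} for u in range(q)]

def codegree(adj, u, v):
    """d(u, v) = |N(u) & N(v)|."""
    return len(adj[u] & adj[v])

def c4_density(adj):
    """Homomorphism density of C4: sum of d(u, v)^2 over ordered pairs, over n^4."""
    n = len(adj)
    hom = sum(codegree(adj, u, v) ** 2 for u in range(n) for v in range(n))
    return Fraction(hom, n ** 4)

def codegree_deviation(adj):
    """Sum over ordered pairs u != v of |d(u, v) - p^2 n|, with e(G) = p * C(n, 2)."""
    n = len(adj)
    p = Fraction(sum(len(N) for N in adj), n * (n - 1))
    return sum(abs(codegree(adj, u, v) - p * p * n)
               for u in range(n) for v in range(n) if u != v)
```

On the Paley graph with q = 13, every vertex has degree 6 and every pair has codegree 2 or 3, so the codegree deviation is 117, small compared with n³ = 2197; the C4 density evaluates to 114/2197.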

Theorem 3.1.1. Let p ∈ (0, 1) and G be an n-vertex d-regular graph with d = pn, then all the above properties are equivalent.

Proof. We will prove (Induced Subgraph Count) ⇒ (Subgraph Count) ⇒ (4-cycle Count) ⇒ (Spectral Gap) ⇒ (Discrepancy) ⇒ (Codegree). We defer the proof of (Codegree) ⇒ (Induced Subgraph Count) to the next subsection.

• (Induced Subgraph Count) ⇒ (Subgraph Count) Exercise.

• (Subgraph Count) ⇒ (4-cycle Count) By definitions.

• (4-cycle Count) ⇒ (Spectral Gap) This amounts to writing the $C_4$-count using the trace of $T^4$; the correct count of $C_4$ means that the contribution from the non-trivial eigenvalues $\lambda_i$, $i \ge 2$, is negligible.

More precisely, note that $T^k_{u,v}$, the $(u, v)$-th entry of the $k$-th power of the adjacency matrix $T$, is the number of $u, v$-walks of length $k$ in $G$. Then the trace of $T^k$,

\[ \mathrm{tr}(T^k) = \sum_{i \in [n]} \lambda_i^k, \]

counts the number of closed walks of length $k$ in $G$. Among these walks, the non-degenerate ones are copies of $C_k$, while the number of degenerate ones is easily seen to be negligible, $O(n^{k-1})$. Recall that for $d$-regular graphs, $\lambda_1 = d$. Splitting out the first term in $\mathrm{tr}(T^4)$, we see that

\[ p^4 n^4 \pm o(n^4) \ge t(C_4, G) n^4 \pm o(n^4) = \mathrm{tr}(T^4) = (pn)^4 + \sum_{i=2}^{n} \lambda_i^4, \]

implying that $|\lambda_i| = o(n)$ for all $i \ge 2$.

• (Spectral Gap) ⇒ (Discrepancy) Expander mixing lemma.

• (Discrepancy) ⇒ (Codegree) We shall prove a stronger statement that every vertex has small codegree deviation: for any $u$,

\[ \sum_{v : v \ne u} |d(u, v) - p^2 n| = o(n^2). \]

To see this, we split $V(G) \setminus \{u\} = B \cup B'$, where $B := \{v : d(u, v) > p^2 n\}$. This splitting helps us to get rid of the absolute value sign: writing $A := N(u)$, so that $|A| = pn$, we have

\begin{align*}
\sum_{v : v \ne u} |d(u, v) - p^2 n| &= (e(A, B) - p^2 n|B|) + (p^2 n|B'| - e(A, B')) \\
&= (e(A, B) - p|A||B|) + (p|A||B'| - e(A, B')).
\end{align*}

Now applying (Discrepancy) to each of the two terms above finishes the proof.
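The splitting identity used in the (Discrepancy) ⇒ (Codegree) step can be verified exactly on a small regular graph. The sketch below again uses the Paley graph on 13 vertices as a stand-in example; the function names are hypothetical, p is taken to be d/n as in Theorem 3.1.1, and e(A, B) counts pairs (a, b) in A × B that form an edge.

```python
from fractions import Fraction

def paley_graph(q):
    """Adjacency sets of the Paley graph on Z_q (q prime, q = 1 mod 4)."""
    squares = {(x * x) % q for x in range(1, q)}
    return [{v for v in range(q) if (u - v) % q in squares} for u in range(q)]

def e(adj, A, B):
    """Number of pairs (a, b) in A x B with ab an edge."""
    return sum(len(adj[a] & B) for a in A)

def codegree_split(adj, u):
    """Return both sides of the identity for vertex u of a d-regular graph."""
    n = len(adj)
    A = adj[u]
    p = Fraction(len(A), n)                      # d = pn
    codeg = {v: len(A & adj[v]) for v in range(n) if v != u}
    B = {v for v, c in codeg.items() if c > p * p * n}
    B_prime = set(codeg) - B
    lhs = sum(abs(c - p * p * n) for c in codeg.values())
    rhs = ((e(adj, A, B) - p * len(A) * len(B))
           + (p * len(A) * len(B_prime) - e(adj, A, B_prime)))
    return lhs, rhs
```

For u = 0 in the Paley graph on 13 vertices, B is the set of non-neighbours of u (codegree 3) and B' the set of neighbours (codegree 2), and both sides equal 6.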

Exercise 3.1.2. Prove that (Induced Subgraph Count) ⇒ (Subgraph Count).

3.1.2 (Codegree) ⇒ (Induced Subgraph Count).

Proof. Let H be a graph on vertex set {v1, v2, . . . , vs}, and for r ∈ [s], let Hr := H[{v1, . . . , vr}]. We will use induction on 1 ≤ r ≤ s, via building Hr+1 from Hr, to show that G has the correct count of copies of H = Hs. That is, writing Nr for the number of labelled induced copies of Hr in G, we shall show

\[ N_r = (1 + o(1))\, n^r p^{e(H_r)} (1 - p)^{e(\overline{H_r})}. \tag{3.1} \]

The base case r = 1 is clearly true. Assume now it holds for 1 ≤ r < s, we will prove it for r + 1.

Extension function. To count copies of $H_{r+1}$, we will make use of a function that helps us count the number of ways to extend a copy of $H_r$ to $H_{r+1}$. For this purpose, let $\epsilon \in \{0, 1\}^r$ be the 0/1-vector encoding the adjacencies of $v_{r+1}$ to $\{v_1, \ldots, v_r\}$ in $H_{r+1}$, namely,

\[ \epsilon = (\epsilon_1, \ldots, \epsilon_r) \text{ with } \epsilon_i = 1 \text{ if and only if } v_i \sim v_{r+1} \text{ in } H_{r+1}. \]

For $z = (z_1, \ldots, z_r) \in V_{(r)}$ (footnote 2), an ordered set of $r$ distinct vertices in $G$, let

\[ X(z) := \big| \{ v \in V(G) : v \ne z_1, \ldots, z_r, \text{ and } v \sim z_i \text{ if and only if } \epsilon_i = 1, \text{ for } 1 \le i \le r \} \big|. \]

We write $z \cong H_r$ when $z \in V_{(r)}$ induces a copy of $H_r$. Note that for any $z \cong H_r$, the function $X(z)$ counts exactly the number of ways to extend $z$ to a labelled induced copy of $H_{r+1}$.

We can view $X(z)$ probabilistically as follows. Think of $X(z)$ as a random variable $X$ drawn from the space $\Omega := V_{(r)}$ with uniform measure, that is, for any $z \in \Omega$,

\[ \Pr[X = X(z)] = \frac{1}{n_{(r)}}. \]

As observed above, writing $\Omega^* = \{z : z \cong H_r\} \subseteq \Omega$, we can count copies of $H_{r+1}$ by summing up the number of extensions of each $z \cong H_r$ to $H_{r+1}$:

\[ N_{r+1} = \sum_{z \in \Omega^*} X(z). \tag{3.2} \]

Inductive step assuming concentration of $X$. Later, we will bound the variance of $X$ to show that each $X(z)$ is close to the mean $E[X]$, in particular:

\[ \sum_{z \in \Omega^*} X(z) = |\Omega^*| \cdot E[X] + o(n^{r+1}). \tag{3.3} \]

Assuming (3.3) for now, let us see how it finishes the proof. Observe first that $\sum_{z \in \Omega} X(z)$ counts the number of ordered $(r + 1)$-tuples $(u_1, \ldots, u_r, u_{r+1}) \in V_{(r+1)}$ such that the adjacencies of $u_{r+1}$ to $\{u_1, \ldots, u_r\}$ are given by $\epsilon$. We can count this quantity from $u_{r+1}$’s point of view as follows. First choose $u_{r+1}$, for which there are $n$ ways; and then choose an ordered $r$-tuple, among which exactly $|\epsilon|$ are neighbours of $u_{r+1}$ and $r - |\epsilon|$ are non-neighbours of $u_{r+1}$, where $|\epsilon|$ denotes the number of 1s in $\epsilon$. As $G$ is $pn$-regular, the number of such ordered $r$-tuples is

\[ (1 + o(1))\, p^{|\epsilon|} (1 - p)^{r - |\epsilon|} n^r. \]

For simplicity, let $\rho := p^{|\epsilon|} (1 - p)^{r - |\epsilon|}$. Thus, $\sum_{z \in \Omega} X(z) = (1 + o(1)) \rho n^{r+1}$. We can then compute the mean:

\[ E[X] = \sum_{z \in \Omega} \Pr[X = X(z)] \cdot X(z) = \frac{1}{n_{(r)}} \sum_{z \in \Omega} X(z) = (1 + o(1)) \rho n. \]

Recalling the definition of $\Omega^*$, we see that

\[ |\Omega^*| = N_r. \]

Footnote 2. We write $V_{(r)}$ for the set of all ordered $r$-tuples of distinct vertices in $V$, and $|V_{(r)}| = n_{(r)} = n \cdot (n - 1) \cdots (n - r + 1)$ for the $r$-th falling factorial of $n$.

Then, using (3.2) and (3.3), we derive

\begin{align*}
N_{r+1} &= |\Omega^*| \cdot E[X] + o(n^{r+1}) = (1 + o(1)) N_r \rho n + o(n^{r+1}) \\
&= (1 + o(1))\, n^{r+1} p^{e(H_{r+1})} (1 - p)^{e(\overline{H_{r+1}})},
\end{align*}

where the last equality follows from the induction hypothesis, i.e. (3.1), and the facts that $|\epsilon| = e(H_{r+1}) - e(H_r)$ and $r - |\epsilon| = e(\overline{H_{r+1}}) - e(\overline{H_r})$. This concludes the inductive step, hence the proof.

Concentration of $X$ via bounding variance. We are left to prove (3.3). By Lemma 3.1.4 below, it suffices to bound the error term:

\[ \sqrt{|\Omega^*| \cdot \sum_{z \in \Omega} (X(z) - E[X])^2} = \sqrt{|\Omega^*| \cdot \sum_{z \in \Omega} (X(z)^2 - E[X]^2)} = o(n^{r+1}). \]

As $|\Omega^*| = N_r = O(n^r)$, it suffices to show

\[ \sum_{z \in \Omega} X(z)^2 = |\Omega| \cdot E[X]^2 + o(n^{r+2}) = \rho^2 n^{r+2} + o(n^{r+2}). \tag{3.4} \]

To see this, we approximate the second moment $\sum_{z \in \Omega} X(z)^2$ with

\[ T := \sum_{z \in \Omega} X(z)(X(z) - 1), \]

which can be computed by double counting. To do so, we need the following claim: for any integers $k, k' \ge 1$,

\[ \sum_{u \ne v} d_G(u, v)^k\, \overline{d}_G(u, v)^{k'} = (1 + o(1))\, p^{2k} (1 - p)^{2k'} n^{k + k' + 2}, \tag{3.5} \]

where $\overline{d}_G(u, v)$ denotes the number of common non-neighbours of $u$ and $v$. We leave this claim as an exercise, see Exercise 3.1.3.

We can now compute $T$. Note that $T$ counts the number of pairs $(z, (u, v))$, with $z \in V_{(r)}$ and $(u, v) \in V_{(2)}$, such that $u, v \notin z$ and both $u$ and $v$ have the same adjacency $\epsilon$ to $z$. On the other hand, counting such pairs from the perspective of $(u, v)$, for each given $(u, v)$ we can pick $z$ according to $\epsilon$ by choosing $|\epsilon|$ terms from the common neighbourhood $N_G(u, v)$ and $r - |\epsilon|$ terms from the common non-neighbourhood $\overline{N}_G(u, v)$. Then by (3.5), we have

\begin{align*}
T &= (1 + o(1)) \sum_{u \ne v} d_G(u, v)^{|\epsilon|}\, \overline{d}_G(u, v)^{r - |\epsilon|} \\
&= (1 + o(1))\, p^{2|\epsilon|} (1 - p)^{2(r - |\epsilon|)} n^{r+2} = (1 + o(1)) \rho^2 n^{r+2}.
\end{align*}

Consequently, (3.4) follows:

\[ \sum_{z \in \Omega} X(z)^2 = T + \sum_{z \in \Omega} X(z) = T + O(n^{r+1}) = \rho^2 n^{r+2} + o(n^{r+2}), \]

as desired.
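The counting identity (3.2), which builds copies of H_{r+1} from copies of H_r through the extension function X, can be checked by brute force on a small graph. The sketch below is only an illustration: all function names are hypothetical, the vertices of H are indexed from 0, the edges of H are given as pairs (i, j) with i < j, and the Paley graph on 13 vertices serves as a stand-in for a quasirandom graph.

```python
from itertools import permutations

def paley_graph(q):
    """Adjacency sets of the Paley graph on Z_q (q prime, q = 1 mod 4)."""
    squares = {(x * x) % q for x in range(1, q)}
    return [{v for v in range(q) if (u - v) % q in squares} for u in range(q)]

def is_induced_copy(adj, z, H_edges):
    """z[i] plays the role of vertex i of H; edges and non-edges must both match."""
    return all((z[j] in adj[z[i]]) == ((i, j) in H_edges)
               for i in range(len(z)) for j in range(i + 1, len(z)))

def count_induced(adj, s, H_edges):
    """Number of labelled induced copies of an s-vertex graph H, by brute force."""
    return sum(is_induced_copy(adj, z, H_edges)
               for z in permutations(range(len(adj)), s))

def extension_count(adj, z, eps):
    """X(z): vertices v outside z with v ~ z[i] if and only if eps[i] == 1."""
    return sum(all((z[i] in adj[v]) == bool(eps[i]) for i in range(len(z)))
               for v in range(len(adj)) if v not in z)

def count_via_extension(adj, r, H_edges, eps):
    """N_{r+1} as the sum of X(z) over labelled induced copies z of H_r, as in (3.2)."""
    Hr_edges = {edge for edge in H_edges if max(edge) < r}
    return sum(extension_count(adj, z, eps)
               for z in permutations(range(len(adj)), r)
               if is_induced_copy(adj, z, Hr_edges))
```

With H the path on three vertices (edges (0, 1) and (1, 2), so the extension vector is (0, 1)), both counts give 234 on the Paley graph with 13 vertices, to be compared with the heuristic value n³p²(1 − p) = 252 for p = 6/13.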

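Exercise 3.1.3. Prove (3.5) using the (Codegree) property.

It is worth noting that the right hand side in (3.5) is what we would expect from a genuine random graph $G(n, p)$ (footnote 3).

When proving that (Codegree) implies (Induced Subgraph Count), we use the following lemma, which basically says that we can approximate a subset sum by the average, provided that the variance is small.

Lemma 3.1.4. Let $X$ be a random variable defined on a space $\Omega$ with uniform measure, and write $\mu = E[X]$. Let $\Omega^* \subseteq \Omega$. Then

\begin{align*}
\sum_{\omega \in \Omega^*} X(\omega) &= |\Omega^*| \cdot E[X] \pm \sqrt{|\Omega^*| \cdot \sum_{\omega \in \Omega} (X(\omega) - \mu)^2} \\
&= |\Omega^*| \cdot E[X] \pm \sqrt{|\Omega^*| \cdot |\Omega|\, \mathrm{Var}[X]}.
\end{align*}

Proof. Let $\mu = E[X]$. By Cauchy-Schwarz,

\[ \Big| \sum_{\omega \in \Omega^*} X(\omega) - |\Omega^*| \cdot \mu \Big|^2 = \Big| \sum_{\omega \in \Omega^*} (X(\omega) - \mu) \Big|^2 \le |\Omega^*| \cdot \sum_{\omega \in \Omega^*} (X(\omega) - \mu)^2 \le |\Omega^*| \cdot \sum_{\omega \in \Omega} (X(\omega) - \mu)^2, \]

implying that

\[ \Big| \sum_{\omega \in \Omega^*} X(\omega) - |\Omega^*| \cdot \mu \Big| \le \sqrt{|\Omega^*| \cdot \sum_{\omega \in \Omega} (X(\omega) - \mu)^2}. \]

Thus,

\[ \sum_{\omega \in \Omega^*} X(\omega) = |\Omega^*| \cdot \mu \pm \sqrt{|\Omega^*| \cdot \sum_{\omega \in \Omega} (X(\omega) - \mu)^2}, \]

as desired.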
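Lemma 3.1.4 is easy to test numerically. The following sketch, with a hypothetical function name and an arbitrary toy sequence of values, returns the actual deviation of a subset sum from |Ω*|·μ together with the bound given by the lemma, taking Ω to be the index set of the list with uniform measure.

```python
import math

def subset_sum_error(values, subset):
    """Return (actual, bound) for Lemma 3.1.4, where Omega = range(len(values)),
    X(i) = values[i], and subset is the index set Omega*."""
    subset = list(subset)
    mu = sum(values) / len(values)
    actual = abs(sum(values[i] for i in subset) - len(subset) * mu)
    bound = math.sqrt(len(subset) * sum((x - mu) ** 2 for x in values))
    return actual, bound
```

The bound is only useful when the variance of X is small compared with μ², which is exactly what (3.4) provides in the proof above.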

3.1.3 (4-cycle Count) ⇒ strongly regular via Cauchy-Schwarz

We can also prove a stronger version of the (Codegree) property directly from (4-cycle Count).

Proposition 3.1.5. Let $G$ be an $n$-vertex graph with $e(G) = p\binom{n}{2}$. If $t(C_4, G) \le p^4 + o(1)$, then for any $u \ne v \in G$, $d(u) = pn + o(n)$ and $d(u, v) = p^2 n + o(n)$.

Proof. Note first that, by double counting the cherries $K_{1,2}$ and Cauchy-Schwarz,

\begin{align*}
\sum_{u \ne v} d(u, v) &= \sum_{w} d(w)(d(w) - 1) = \sum_{w} d(w)^2 - 2e(G) \\
&\ge \frac{1}{n} \Big( \sum_{w} d(w) \Big)^2 - 2e(G) = (1 + o(1))\, p^2 n^3.
\end{align*}

Footnote 3. Hint: Let $\delta_{uv} = d_G(u, v) - p^2 n$. Show first that for any integer $k \ge 1$, $\sum_{u \ne v} |\delta_{uv}|^k = o(n^{k+2})$, using the fact that $|\delta_{uv}| \le n$ for any $u \ne v$.

Consequently, the number of labelled $C_4$, using Cauchy-Schwarz again, is

\begin{align*}
p^4 n^4 + o(n^4) \ge t(C_4, G) n^4 &= \sum_{u \ne v} d(u, v)(d(u, v) - 1) = \sum_{u \ne v} d(u, v)^2 - \sum_{u \ne v} d(u, v) \\
&\ge \frac{1}{n(n - 1)} \Big( \sum_{u \ne v} d(u, v) \Big)^2 - \sum_{u \ne v} d(u, v) \\
&\ge p^4 n^4 + o(n^4).
\end{align*}

We thus should have equality throughout, up to $o(n^4)$ errors. Equality in the two applications of Cauchy-Schwarz implies that both the degree vector $\{d(w)\}_{w \in V(G)}$ and the codegree vector $\{d(u, v)\}_{(u,v) \in V_{(2)}}$ are (approximately) linearly dependent with the all-1s vector. In other words, the graph is strongly regular, i.e. every vertex has degree $pn + o(n)$ and the codegree of every pair is $p^2 n + o(n)$, as desired.
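The first double-counting step of the proof, counting cherries once by their pair of leaves and once by their centre, can be checked on any small graph. The sketch below is an illustration with a hypothetical function name; it also returns the Cauchy-Schwarz lower bound (Σ_w d(w))²/n − 2e(G) for comparison.

```python
def cherry_counts(n, edges):
    """Count labelled cherries K_{1,2} in two ways.
    Returns (by_pairs, by_centres, cs_lower): the sum of d(u, v) over ordered
    pairs u != v, the sum of d(w)(d(w) - 1) over vertices w, and the
    Cauchy-Schwarz lower bound. Assumes edges lists each edge once."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    by_pairs = sum(len(adj[u] & adj[v])
                   for u in range(n) for v in range(n) if u != v)
    by_centres = sum(len(N) * (len(N) - 1) for N in adj)
    cs_lower = sum(len(N) for N in adj) ** 2 / n - 2 * len(edges)
    return by_pairs, by_centres, cs_lower
```

For the 5-cycle with one chord, both counts equal 18, while the Cauchy-Schwarz bound is 16.8; the gap reflects that this graph is not regular.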

Remark 3.1.6. In retrospect, it is not that surprising that the seemingly weaker property (4-cycle Count) implies (Induced Subgraph Count). As we have seen above, (Codegree) follows from (4-cycle Count) with two applications of Cauchy-Schwarz. And in the proof of (Codegree) ⇒ (Induced Subgraph Count), we count $H$-subgraphs by building $H$ up one vertex at a time, $H_1, \ldots, H_r, \ldots, H_s = H$. Let $H^*_{r+1}$ be the graph obtained from $H_{r+1}$ by adding a new vertex $v^*_{r+1}$, which is a copy of $v_{r+1}$. To count $H_{r+1}$, we need to control its “variance”: the number of copies of $H^*_{r+1}$, which, from the point of view of the twins $v_{r+1}$ and $v^*_{r+1}$, is governed by the (Codegree) property.
