WHAT IS ... a Graphon?

Daniel Glasscock

arXiv:1611.00718v1 [math.CO] 2 Nov 2016

Large graphs are ubiquitous in mathematics, and describing their structure is an important goal of modern combinatorics. One way to study large, finite objects is to pass from sequences of larger and larger such objects to ideal limiting objects. Done properly, properties of the limiting objects reflect properties of the finite objects which approximate them, and vice versa.

Graphons, short for graph functions, are the limiting objects for sequences of large, finite graphs with respect to the so-called cut metric. They were introduced and developed by C. Borgs, J. T. Chayes, L. Lovász, V. T. Sós, B. Szegedy, and K. Vesztergombi in [1] and [2]. Graphons arise naturally wherever sequences of large graphs appear: extremal graph theory, property testing of large graphs, quasi-random graphs, random networks, et cetera.

Let's begin with some definitions and a motivating example. A graph $G$ is a set of vertices $V(G)$ and a set of edges $E(G)$ between the vertices (excluding loops and multiple edges). A graph homomorphism from $H$ to $G$ is a map $\varphi$ from $V(H)$ to $V(G)$ that preserves edge adjacency; that is, for every edge $\{v, w\}$ in $E(H)$, the edge $\{\varphi(v), \varphi(w)\}$ is in $E(G)$. Denote by $\hom(H,G)$ the number of homomorphisms from $H$ to $G$. Writing $K_1$, $K_2$, and $K_3$ for the single vertex, the single edge, and the triangle, and $C_4$ for the 4-cycle, we have for example $\hom(K_1,G) = |V(G)|$, $\hom(K_2,G) = 2|E(G)|$, and $\hom(K_3,G)$ is 6 times the number of triangles in $G$. Normalizing by the total number of possible maps, we get the homomorphism density of $H$ into $G$,

\[
t(H,G) = \frac{\hom(H,G)}{|V(G)|^{|V(H)|}},
\]

the probability that a randomly chosen map from $V(H)$ to $V(G)$ preserves edge adjacency. This number also represents the density of $H$ as a subgraph in $G$ asymptotically as $n = |V(G)| \to \infty$. For example, $t(K_2,G) = 2|E(G)|/n^2$ while the density of edges in $G$ is $2|E(G)|/n(n-1)$; these two expressions are nearly the same when $n$ is large.

Consider the following problem from extremal graph theory:

How many 4-cycles must there be in a graph with edge density at least 1/2?

It is easy to see that there are at most on the order of $n^4$ 4-cycles in any graph; a theorem of Erdős gives that graphs with at least half the number of possible edges have at least on the order of $n^4$ 4-cycles. More specifically, for any graph $G$,

\[
t(C_4,G) \ge t(K_2,G)^4,
\]

meaning that if $t(K_2,G) \ge 1/2$, then $t(C_4,G) \ge 1/16$. In light of this, the problem may be reformulated into a minimization one: Minimize $t(C_4,G)$ over finite graphs $G$ satisfying $t(K_2,G) \ge 1/2$. With some work, it may be shown that no finite graph $G$ with $t(K_2,G) \ge 1/2$ achieves the minimum $t(C_4,G) = 1/16$.

It's useful at this point to draw an analogy with a problem from elementary analysis: Minimize $x^3 - 6x$ over rational numbers $x$ satisfying $x \ge 0$. This polynomial has a unique minimum on $x \ge 0$ at $x = \sqrt{2}$, so the best we may do over the rationals is show that the polynomial achieves values approaching this minimum along a sequence of rationals approaching $\sqrt{2}$. We know well to avoid this complication by completing the rational numbers to the reals and realizing the limit of such a sequence as $\sqrt{2}$.

There is a sequence of finite graphs with edge density at least 1/2 and 4-cycle density approaching 1/16. Let $R_n$ be an instance of a random graph on $n$ vertices where each edge is decided independently with probability 1/2. Throwing away those $R_n$'s for which $t(K_2,R_n) < 1/2$, the 4-cycle density in the remaining graph sequence limits to 1/16 almost surely. Following the $\sqrt{2}$ analogy, we should look to realize the limit of this sequence of finite graphs and understand how it solves the minimization problem at hand.
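The behavior of the random graph sequence can be checked numerically. The following Python sketch is only an illustration, not part of the original article: it assumes NumPy, and the function names `random_graph`, `edge_density`, and `c4_density` are made up for this example. It uses two standard identities for an adjacency matrix $A$: the number of homomorphisms of a single edge into the graph is the sum of the entries of $A$, and the number of homomorphisms of the 4-cycle is the number of closed walks of length 4, which is the trace of $A^4$.

```python
import numpy as np


def random_graph(n, p, rng):
    """Adjacency matrix of a random graph on n vertices.

    Each edge is decided independently with probability p.
    """
    upper = np.triu(rng.random((n, n)) < p, k=1)   # decide each pair i < j once
    return (upper | upper.T).astype(np.float64)    # symmetrize; diagonal stays 0


def edge_density(A):
    """Homomorphism density of a single edge: sum(A) = 2|E(G)|, divided by n^2."""
    n = A.shape[0]
    return float(A.sum()) / n**2


def c4_density(A):
    """Homomorphism density of the 4-cycle: trace(A^4), divided by n^4."""
    n = A.shape[0]
    A2 = A @ A
    # A2 is symmetric, so trace(A2 @ A2) is the sum of the squared entries of A2.
    return float((A2 * A2).sum()) / n**4


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
    for n in (50, 100, 200, 400):
        A = random_graph(n, 0.5, rng)
        # Columns: n, edge density, 4-cycle density, (edge density)^4
        print(n, edge_density(A), c4_density(A), edge_density(A) ** 4)
```

For growing $n$, the printed edge density should settle near 1/2 and the 4-cycle density near 1/16, with the 4-cycle density never falling below the fourth power of the edge density. The sketch does not discard samples whose edge density is below 1/2, as the text does; that filtering step is omitted for brevity.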

What might the limit of the sequence of random graphs $(R_n)_n$ be? From the adjacency matrix of a labeled graph, construct the graph's pixel picture by turning the 1's into black squares, erasing the 0's, and scaling to the unit square $[0,1]^2$.

\[
\begin{pmatrix}
0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0
\end{pmatrix}
\longrightarrow
\text{[pixel picture]}
\]

Pixel pictures may be seen to "converge" graphically; those of larger and larger random graphs with edge probability 1/2, regardless of how they are labeled, seem to converge to a gray square, the constant 1/2 function on $[0,1]^2$.

The constant 1/2 function on $[0,1]^2$ is an example of a labeled graphon. A labeled graphon is a symmetric, Lebesgue-measurable function from $[0,1]^2$ to $[0,1]$ (modulo the usual identification almost everywhere); they may be thought of as edge-weighted graphs on the vertex set $[0,1]$. An unlabeled graphon is a graphon up to re-labeling, where a re-labeling is the result of applying an invertible, measure preserving transformation to the $[0,1]$ interval. Note that any pixel picture is a labeled graphon, meaning that (labeled) graphs are (labeled) graphons.

As another example of this convergence, consider the growing uniform attachment graph sequence $(G_n)_n$ defined inductively as follows. Let $G_1$ be a single vertex. For $n \ge 2$, construct $G_n$ from $G_{n-1}$ by adding one new vertex, then, considering each pair of non-adjacent vertices in turn, drawing an edge between them with probability $1/n$. This sequence almost surely limits to the graphon $1 - \max(x,y)$. (Since matrices are indexed with $(0,0)$ in the top left corner, so too are graphons.)

There are two natural ways to label a complete bipartite graph, and each suggests a different limit graphon for the complete bipartite graph sequence. Both sequences of labeled graphons in fact have the same limit, as indicated in the diagram; the reader is encouraged to return to this example after we define this convergence more precisely.

[Diagram: pixel pictures of the two labelings and their common limit]

Homomorphism densities extend naturally to graphons. For a finite graph $G$, the edge density $t(K_2,G)$ may be computed by giving each vertex of $G$ a mass of $1/n$ and integrating the edge indicator function over all pairs of vertices. In exactly the same way, the edge density $t(K_2,W)$ of a labeled graphon $W$ is

\[
\int_{[0,1]^2} W(x,y) \, dx \, dy,
\]

and the 4-cycle density $t(C_4,W)$ is

\[
\int_{[0,1]^4} W(x_1,x_2) \, W(x_2,x_3) \, W(x_3,x_4) \, W(x_4,x_1) \, dx_1 \, dx_2 \, dx_3 \, dx_4.
\]

It is straightforward from here to write down the expression for the homomorphism density $t(H,W)$ of a finite graph $H$ into a graphon $W$. This allows us to see how the constant graphon $W \equiv 1/2$ solves the minimization problem: $t(K_2,W) = 1/2$ while $t(C_4,W) = 1/16$.

To see the space of graphons as the completion of the space of finite graphs and make graphon convergence precise, define the cut distance $\delta_\square(W,U)$ between two labeled graphons $W$ and $U$ by

\[
\inf_{\varphi,\psi} \, \sup_{S,T} \, \left| \int_{S \times T} W\bigl(\varphi(x),\varphi(y)\bigr) - U\bigl(\psi(x),\psi(y)\bigr) \, dx \, dy \right|,
\]

where the infimum is taken over all re-labelings $\varphi$ of $W$ and $\psi$ of $U$, and the supremum is taken over all measurable subsets $S$ and $T$ of $[0,1]$. The cut distance first measures the maximum discrepancy between the integrals of two labeled graphons over measurable boxes (hence the $\square$) of $[0,1]^2$, then minimizes that maximum discrepancy over all possible re-labelings. (It is possible to define the cut distance between two finite graphs combinatorially, without any analysis, but the definition is quite involved.)
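To make the inner supremum concrete, here is a small brute-force Python sketch for pixel pictures of graphs on the same number of vertices. It is an illustration under stated assumptions, not a construction from the article: it assumes NumPy, the name `labeled_cut_norm` is made up, and it computes only the supremum over boxes for a fixed labeling, not the infimum over re-labelings. For step functions on an $n \times n$ grid of equal squares, the integral over $S \times T$ is bilinear in how much of each step $S$ and $T$ contain, so the supremum is attained when $S$ and $T$ are unions of steps; for each choice of $S$ the best $T$ collects the columns whose sums share a sign.

```python
import itertools

import numpy as np


def labeled_cut_norm(D):
    """sup over S, T of |integral of D over S x T| for a step function D.

    D is an n x n array giving the values of the step function on a grid of
    equal squares (for example, the difference of two pixel pictures).
    The labeling is fixed: no infimum over re-labelings is taken.
    Runs over all 2^n row subsets, so it is only usable for small n.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    best = 0.0
    for mask in itertools.product([False, True], repeat=n):
        rows = np.array(mask, dtype=bool)           # the row set S
        col_sums = D[rows, :].sum(axis=0)           # sum over S of each column
        positive = col_sums[col_sums > 0].sum()     # best T for a positive total
        negative = -col_sums[col_sums < 0].sum()    # best T for a negative total
        best = max(best, positive, negative)
    return best / n**2                              # each square has area 1/n^2


# Two labelings of the complete bipartite graph with two vertices per side.
checker = np.array([[(i + j) % 2 for j in range(4)] for i in range(4)], dtype=float)
blocks = np.kron(np.array([[0.0, 1.0], [1.0, 0.0]]), np.ones((2, 2)))

print(labeled_cut_norm(checker - blocks))           # 0.125: far apart as labeled graphons

perm = [0, 2, 1, 3]                                 # list one side first, then the other
relabeled = checker[np.ix_(perm, perm)]
print(labeled_cut_norm(relabeled - blocks))         # 0.0: identical after re-labeling
```

The two printed values show why the infimum matters: the two labelings differ by 1/8 as labeled graphons, yet a single re-labeling makes them coincide, so their cut distance is 0.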

The infimum in the definition of the cut distance makes it well defined on the space of unlabeled graphons, but it is not yet a metric. Graphons $W$ and $U$ for which $t(H,W) = t(H,U)$ for all finite graphs $H$ are called weakly isomorphic; it turns out that $W$ and $U$ are weakly isomorphic if and only if $\delta_\square(W,U) = 0$. The cut distance becomes a genuine metric on the space $\mathcal{G}$ of unlabeled graphons up to weak isomorphism. The examples of pixel picture convergence above provide examples of convergent sequences and their limits in $\mathcal{G}$ (up to weak isomorphism).

We conclude by highlighting some fundamental results on graphons.

Theorem 1. Every graphon is the $\delta_\square$-limit of a sequence of finite graphs.

To approximate a labeled graphon $W$ by a finite labeled graph, let $S$ be a set of $n$ randomly chosen points from $[0,1]$, then construct a graph on $S$ where the edge $\{s_i, s_j\}$ is included with probability $W(s_i, s_j)$. With high probability (as $|S| \to \infty$), this labeled graph approximates $W$ well in cut distance.

Theorem 2. The space $(\mathcal{G}, \delta_\square)$ is compact.

This implies that $\mathcal{G}$ is complete; combining this fact with Theorem 1, we see that the space of graphons is the completion of the space of finite graphs with the cut metric! This theorem also demonstrates how graphons provide a bridge between different forms of Szemerédi's Regularity Lemma: Theorem 2 may be deduced from a weak form of the lemma, while a stronger regularity lemma follows from the compactness of $\mathcal{G}$.

Theorem 3. For every finite graph $H$, the map $t(H, \cdot) : \mathcal{G} \to [0,1]$ is Lipschitz continuous.

Theorems 2 and 3 combine with elementary analysis to show that minimization problems in extremal graph theory (such as the one considered above) are guaranteed to have solutions in the space of graphons. These graphon solutions provide a "template", via Theorem 1, for approximate solutions in the space of finite graphs.

The interested reader is encouraged to consult L. Lovász's book [3] for more!

References

[1] C. Borgs, J. T. Chayes, L. Lovász, V. T. Sós, and K. Vesztergombi. Convergent sequences of dense graphs. I. Subgraph frequencies, metric properties and testing. Adv. Math., 219(6):1801–1851, 2008.

[2] L. Lovász and B. Szegedy. Limits of dense graph sequences. J. Combin. Theory Ser. B, 96(6):933–957, 2006.

[3] László Lovász. Large networks and graph limits, volume 60 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2012.