Isomorphism and Embedding Problems for Infinite Limits of Scale-Free Graphs

Robert D. Kleinberg∗ Jon M. Kleinberg†

∗Department of Mathematics, MIT, Cambridge MA 02139. Supported by a Fannie and John Hertz Foundation Fellowship.

†Department of Computer Science, Cornell University, Ithaca NY 14853. Supported in part by a David and Lucile Packard Foundation Fellowship and NSF grants 0081334 and 0311333.

Abstract

The study of random graphs has traditionally been dominated by the closely-related models $\mathcal{G}(n, m)$, in which a graph is sampled from the uniform distribution on graphs with $n$ vertices and $m$ edges, and $\mathcal{G}(n, p)$, in which each of the $\binom{n}{2}$ edges is sampled independently with probability $p$. Recently, however, there has been considerable interest in alternate random graph models designed to more closely approximate the properties of complex real-world networks such as the Web graph, the Internet, and large social networks. Two of the most well-studied of these are the closely related "preferential attachment" and "copying" models, in which vertices arrive one-by-one in sequence and attach at random in "rich-get-richer" fashion to $d$ earlier vertices.

Here we study the infinite limits of the preferential attachment process, namely, the asymptotic behavior of finite graphs produced by preferential attachment (briefly, PA graphs), as well as the infinite graphs obtained by continuing the process indefinitely. We are guided in part by a striking result of Erdős and Rényi on countable graphs produced by the infinite analogue of the $\mathcal{G}(n, p)$ model, showing that any two graphs produced by this model are isomorphic with probability 1; it is natural to ask whether a comparable result holds for the preferential attachment process.

We find, somewhat surprisingly, that the answer depends critically on the out-degree $d$ of the model. For $d = 1$ and $d = 2$, there exist infinite graphs $R_d^\infty$ such that a random graph generated according to the infinite preferential attachment process is isomorphic to $R_d^\infty$ with probability 1. For $d \ge 3$, on the other hand, two different samples generated from the infinite preferential attachment process are non-isomorphic with positive probability. The main technical ingredients underlying this result have fundamental implications for the structure of finite PA graphs; in particular, we give a characterization of the graphs $H$ for which the expected number of subgraph embeddings of $H$ in an $n$-node PA graph remains bounded as $n$ goes to infinity.

1 Introduction

For decades, the study of random graphs has been dominated by the closely-related models $\mathcal{G}(n, m)$, in which a graph is sampled from the uniform distribution on graphs with $n$ vertices and $m$ edges, and $\mathcal{G}(n, p)$, in which each of the $\binom{n}{2}$ edges is sampled independently with probability $p$. The first was introduced by Erdős and Rényi in [16], the second by Gilbert in [19]. While these random graphs have remained a central object of study and continue to have many important applications in combinatorics and theoretical computer science, recently there has also been a great deal of interest in alternative random graph models whose properties more closely resemble those of complex real-world networks such as the Web graph, the Internet, and large social networks. Two of the most well-studied of these are the closely related "preferential attachment" and "copying" models; the former was introduced by Barabási and Albert in [3] and subsequently formalized by Bollobás and Riordan in [8], while the latter was introduced by Kumar et al. in [22].

A random graph in the preferential attachment model (henceforth, the PA model) is built up one vertex at a time, with each new vertex $v$ linking to the preceding ones by $d$ new edges, where the out-degree $d$ is a parameter of the model. Roughly, the head of each edge emanating from $v$ is chosen by sampling from the preceding vertices with probabilities weighted according to their total degree (in-degree plus out-degree); this is the preferential, or "rich-get-richer," aspect of the model, since nodes of higher in-degree attract new incoming edges more readily. (We will sometimes use the term "PA graph" as an informal shorthand to refer to a random graph drawn from the distribution defined by the PA model.) As we discuss further below, there has been considerable work aimed at determining fundamental graph-theoretic properties in the PA model, exposing both similarities and contrasts with the classical $\mathcal{G}(n, p)$ model.

In the present paper, we seek to understand the infinite limits of the PA model, namely, the asymptotic behavior of graphs produced by this model as the number of nodes goes to infinity, and the distribution $\mathrm{PA}_d^\infty$ on random graphs with countably many vertices obtained by continuing the PA process indefinitely. We were inspired by the following classical theorem about the "infinite version" of the $\mathcal{G}(n, p)$ model [17].

Theorem 1.1. Let $\mathcal{G}(\infty, p)$ denote the probability distribution on graphs with vertex set $\mathbb{N}$, in which each edge $(i, j)$ is included independently with probability $p$. (Here $p$ is any constant in $(0, 1)$.) There exists an infinite graph $R$, such that a random sample from $\mathcal{G}(\infty, p)$ is isomorphic to $R$ with probability 1.

When one first encounters this theorem, it can seem quite startling: infinite random graphs are not "random" at all; they are almost surely isomorphic to a single fixed graph $R$. A rich theory has developed around the infinite model $\mathcal{G}(\infty, p)$, with connections reaching into mathematical logic, algebra, and a number of other areas (see e.g. [13]).

On the other hand, essentially nothing is known about the infinite version of the PA model. Does something analogous to Theorem 1.1 hold here as well, or is the situation fundamentally different? At a more fine-grained level, we are also interested in understanding what can be said about the local structure of finite graphs produced by the PA model as the number of nodes goes to infinity. As we discuss further below, the only prior work addressing the infinite graphs generated by such processes, as far as we are aware, are some interesting recent papers by Bonato and Janssen [11, 12], which proposed the notion of studying infinite limits of random graph evolution processes related to the copying model of [23]. These papers consider the relationship between such infinite random graphs and certain deterministic adjacency axioms. Some of these axioms have a unique infinite model up to isomorphism, while others are satisfied with probability 1 by the infinite limits of the random graph processes considered in these papers. However, none of their theorems resolve the question of whether an analogue of Theorem 1.1 holds for such infinite random graphs.

Our first result is the following, where again $\mathrm{PA}_d^\infty$ denotes the distribution associated with the infinite PA model.

[...] the random process will almost surely be a tree with countably many nodes, in which each node has infinite degree. For the case of out-degree $d = 2$, the resulting graph $R_2^\infty$ is much more complicated. Its structure can be characterized axiomatically, but it is also possible to give explicit constructions of graphs isomorphic to $R_2^\infty$. For example, it is isomorphic to the graph whose vertices consist of all finite rooted binary trees with integer labels, where the vertex corresponding to a labeled tree $T$ has edges to its left sub-tree and to its right sub-tree.

The global structure of the proof for the case $d = 2$ is a standard "back-and-forth" argument, which will be familiar to readers acquainted with Theorem 1.1. The key step, however, is much more complicated than in the classical case: establishing that there is an adequate supply of vertices to sustain the back-and-forth construction of the isomorphism is hard because the PA process introduces difficult conditioning problems.

One might imagine that for the cases of out-degrees $d = 3, 4, 5, \ldots$ one could establish isomorphisms with probability 1 to increasingly complex graphs $R_3^\infty$, $R_4^\infty$, and so on. But in fact, we have the following result.

Theorem 1.3. For each out-degree $d \ge 3$, it is not the case that two independent random samples from $\mathrm{PA}_d^\infty$ are isomorphic with probability 1.

This contrast between the cases of $d = 2$ and $d \ge 3$ comes to us as something of a surprise, since it does not have an obvious analogue in the prior work on graphs generated according to the PA process. There, typically, the out-degree $d$ has a clear quantitative effect on the underlying graph parameters, but not a qualitative effect of this sort.

This contrasting pair of results is a particularly succinct consequence of one of the main technical components of the paper, which addresses a fundamental structural issue for both the finite and infinite versions of the PA model: a characterization of the graphs $H$ for which the expected number of subgraph embeddings of $H$ in an $n$-node PA graph remains bounded as $n$ goes to infinity. Phrased equivalently as a statement about the infinite model $\mathrm{PA}_d^\infty$, we show that if a finite graph $H$ is equal to its 3-core (i.e. the union of all subgraphs of $H$ of minimum degree 3), then the number of subgraph embeddings of $H$ in a random sample from $\mathrm{PA}_d^\infty$ has a positive finite expectation, while if $H$ is not equal to its 3-core, then the number of embeddings is almost