Statistical Science 2010, Vol. 25, No. 3, 275–288 DOI: 10.1214/10-STS335 © Institute of Mathematical ,2010 Connected Spatial Networks over Random Points and a Route-Length Statistic David J. Aldous and Julian Shun

Abstract. We review mathematically tractable models for connected net- works on random points in the plane, emphasizing the class of proximity graphs which deserves to be better known to applied probabilists and statisti- cians. We introduce and motivate a particular statistic R measuring shortness of routes in a network. We illustrate, via Monte Carlo in part, the trade-off between normalized network length and R in a one-parameter family of prox- imity graphs. How close this family comes to the optimal trade-off over all possible networks remains an intriguing open question. The paper is a write-up of a talk developed by the first author during 2007– 2009. Key words and phrases: Proximity graph, , spatial network, geometric graph.

1. INTRODUCTION Recall that the most studied network model, the ran- dom geometric graph [40]reviewedinSection2.1, The topic called random networks or complex net- does not permit both connectivity and bounded nor- works has attracted huge attention over the last 20 malized length in the n limit. An attractive al- years. Much of this work focuses on examples such as ternative is the class of proximity→∞ graphs,reviewedin social networks or WWW links, in which edges are not Section 2.3,whichinthedeterministiccasehavebeen closely constrained by two-dimensional . In studied within . These graphs contrast, in a spatial network not only are vertices and are always connected. Proximity graphs on random edges situated in two-dimensional space, but also it is points have been studied in only a few papers, but are actual distances, rather than number of edges, that are potentially interesting for many purposes other than of interest. To be concrete, we visualize idealized inter- the specific “short route lengths” topic of this paper city road networks, and a feature of interest is the (min- (see Section 6.5). One could also imagine construc- imum) route length between two given cities. Because tions which depend on points having specifically the we work only in two dimensions, the word spatial may distribution, and one novel such be misleading, but equally the word planar would be network, which we name the Hammersley network,is misleading because we do not require networks to be described in Section 2.5. planar graphs (if edges cross, then a junction is cre- Visualizing idealized road networks, it is natural to ated). take total network length as the “cost” of a network, but Our major purpose is to draw the attention of read- what is the corresponding “benefit”? Primarily we are ers from the applied probability and statistics commu- interested in having short route lengths. Choosing an nities to a particular class of spatial network models. appropriate statistic to measure the latter turns out to be rather subtle, and the (only) technical innovation of this David J. Aldous is Professor, Department of Statistics, paper is the introduction (Section 3.2)andmotivation University of California, 367 Evans Hall # 3860, Berkeley, California 94720, USA (e-mail: [email protected]; of a specific statistic R for measuring the effectiveness URL: www.stat.berkeley.edu/users/aldous). Julian Shun is of a network in providing short routes. Graduate Student, Department, In the theory of spatial networks over random points, Carnegie Mellon University, 5000 Forbes Avenue, it is a challenge to quantify the trade-off between net- Pittsburgh, Pennsylvania 15213, USA (e-mail: work length [precisely, the normalized length L de- [email protected]). fined at (2)] and route length efficiency statistics such

275 276 D. J. ALDOUS AND J. SHUN as R.OurparticularstatisticR is not amenable to ex- First (Sections 2.1–2.3)areschemeswhichusede- plicit calculation even in comparatively tractable mod- terministic rules to define edges for an arbitrary deter- els, but in Section 4 we present the results from Monte ministic configuration of cities; then one just applies Carlo simulations. In particular, Figure 7 shows the these rules to a random configuration. Second, one can trade-off for the particular β-skeleton family of prox- have random rules for edges in a deterministic config- imity graphs. uration (e.g., the probability of an edge between cities Given a normalized network length L,foranyreal- i and j is a function of Euclidean d(xi,xj ),as ization of cities there is some network of normalized in popular small worlds models [39]), and again apply length L which minimizes R.AsindicatedinSec- to a random configuration. Third, and more subtly, one tion 5,bygeneralabstractmathematicalarguments, can have constructions that depend on the randomness there must exist a deterministic function Ropt(L) giv- model for city positions—Section 2.5 provides a novel ing (in the “number of cities ”limitunderthe example. random model) the minimum value→∞ of R over all pos- We work throughout with reference to Euclidean dis- sible networks of normalized length L.Anintriguing tance d(x,y) on the plane, even though many mod- open question is as follows: els could be defined with reference to other metrics (or even when the triangle inequality does not hold, for the how close are the values Rβ-skel(L) from the MST). β-skeleton proximity graphs to the optimum values Ropt(L)? 2.1 The Geometric Graph As discussed in Section 5.3,atfirstsightitlookseasyto In Sections 2.1–2.3 we have an arbitrary configura- design heuristic for networks which should tion x xi of city positions, and a deterministic rule for defining={ } the edge-set .Usuallyingraphtheory improve over the β-skeletons, for example, by intro- E ducing Steiner points, but in practice we have not suc- one imagines a finite configuration, but note that every- ceeded in doing so. thing makes sense for locally finite configurations too. This paper focuses on the random model for city po- Where helpful, we assume “general position,” so that sitions because it seems the natural setting for theoret- intercity distances d(xi,xj ) are all distinct. For the geometric graph one fixes 0

Here are the definitions (for MST and Delaunay, it 2.3 Proximity Graphs is easy to check these are equivalent to more familiar Write v and v for the points ( 1 , 0) and ( 1 , 0). definitions). In each case, we write the criterion for an − + 2 2 The lune is the intersection of the open− discs of radii 1 edge (xi,xj ) to be present: centered at v and v .Sov and v are not in the − + − + (MST) [24]. There does not lune but are on its boundary. Define a template A to be • 2 exist a sequence i k0,k1,...,km j of cities such asubsetofR such that: = = that (i) A is a subset of the lune. (ii) A contains the open line segment (v ,v ). max(d(xk0 ,xk1 ), d(xk1 ,xk2 ), . . . , d(xkm 1 ,xkm )) − + − (iii) A is invariant under the “reflection in the y-

FIG.1. The relative neighborhood graph (left) and Gabriel graph (right) on different realizations of 500 random points. 278 D. J. ALDOUS AND J. SHUN graph is always connected. The Gabriel graph is pla- For a picturesque description, imagine one-eyed nar. But if A is not a superset of the disc centered at the frogs sitting on an infinitely long, thin log, each being origin with radius 1/2, then G might not be a subgraph able to see only the part of the log to their left before of the Delaunay triangulation, and in this case edges the next frog. At random times and positions (precisely, may cross, so G is not planar (e.g., if the vertex-set is as a space–time Poisson point process of rate 1) a fly the four corners of a square, then the diagonals would lands on the log, at which instant the (unique) frog be edges). which can see it jumps left to the fly’s position and eats For a given configuration x,thereisacollectionof it. This defines a continuous time Markov process (the proximity graphs indexed by the template A,soby Hammersley process) whose states are the configura- choosing a monotone one-parameter family of tem- tions of positions of all the frogs. There is a stationary plates, one gets a monotone one-parameter family of version of the process in which, at each time, the posi- graphs, analogous to the one-parameter family c of tions of the frogs form a Poisson (rate 1) point process G geometric graphs. Here is a popular choice [30]in on the line. which β 1givestheGabrielgraphandβ 2gives Now consider the space–time trajectories of all the = = the relative neighborhood graph. frogs, drawn with time increasing upward on the page. See Figure 2.Foreachfrog,thepartofthetrajectory DEFINITION (The β-skeleton family). (i) For 0 < between the completions of two successive jumps con- β<1letAβ be the intersection of the two open discs 1 sists of an upward edge (the frog remains in place as of radius (2β)− passing through v and v . − + time increases) followed by a leftward edge (the frog (ii) For 1 β 2letA be the intersection of β jumps left). the two open≤ discs≤ of radius β/2centeredat( (β Reinterpreting the time axis as a second space axis, 1)/2, 0). ± − and introducing compass directions, that part of the 2.4 Networks Based on Powers of Edge-Lengths trajectory becomes a North edge followed by a West edge. Now replace these two edges by a single North- It is not hard to think of other ways to define one- West straight edge. Doing this procedure for each frog parameter families of networks. Here is one scheme and each pair of successive jumps, we obtain a col- used in, for example, [38]. Fix 1 p< .Given lection of NW paths, that is, a network in which each aconfigurationx,andaroute(sequenceofvertices)≤ ∞ city (the reinterpreted space–time random points) has x ,x ,...,x ,say,thecostoftherouteisthesumof 0 1 k an edge to the NW and an edge to the SE. Finally, we pth powers of the step lengths. Now say that a pair (x, y) is an edge of the network p if the cheapest route from x to y is the one-step route.G As p increases from 1 to , these networks decrease from the com- plete graph∞ to the MST. Moreover, for p 2thenet- ≥ work p is a subgraph of the Gabriel graph. G 2.5 The Hammersley Network There is a quite separate recent literature in theoreti- cal probability [26, 27]definingstructuressuchastrees and matchings directly on the infinite Poisson point process. In this spirit, we observe that the Hammers- ley process studied in [6]canbeusedtodefineanew network on the infinite Poisson point process, which we name the Hammersley network.Thisnetworkisde- signed to have the feature that each vertex has exactly 4edges,indirectionsNE(betweenNorthandEast), NW, SE and SW. The conceptual difference from the networks in the previous section is that there is not such asimple“local”criterionforwhetherapotentialedge (xi,xj ) is in the network. And edges cross, creating junctions. FIG.2. Space–time trajectories in Hammersley’s process. CONNECTED NETWORKS OVER RANDOM POINTS 279

2.6 Normalized Length The notion of normalized network length L is most easily visualized in the setting of an infinite determin- istic network which is “regular” in the sense of con- sisting of a repeated pattern. First choose the unit of length so that cities have an average density of one per unit area. Then define (2) L average network length per unit area, = " average (number of incident edges) (3) ¯ = of cities. Figure 4 shows the values of L and " for some sim- ¯ ple “repeated pattern” networks. Though not directly relevant to our study of the random model, we find Figure 4 helpful for two reasons: as intuition for the interpretation of the different numerical values of L, FIG.3. The Hammersley network on 2500 random points. and because we can make very loose analogies (Sec- tion 6.6)betweenparticularnetworksonrandompoints repeat the construction with the same realization of the and particular deterministic networks. space–time Poisson point process but with frogs jump- ing rightward instead of leftward. This yields a network 3. NORMALIZED LENGTH AND ROUTE-LENGTH on the infinite Poisson point process, which we name EFFICIENCY the Hammersley network.SeeFigure3. 3.1 The Random Model REMARKS.(a)TodrawtheHammersleynetwork For the remainder of the paper we work with “the on random points in a finite square,oneneedsexternal random model” for city positions. The finite model as- randomization to give the initial (time 0) frog positions, sumes n random vertices (cities) distributed indepen- in fact, two independent randomizations for the left- dently and uniformly in a square of area n.Theinfinite ward and the rightward processes. So to be pedantic, model assumes the Poisson point process of rate 1 (per one gets a random network over the given realization unit area) in the plane. The quantities L," above and of cities. However, one can deduce from the theoretical ¯ R below that we discuss may be interpreted as exact results in [6]thattheexternalrandomizationhaseffect values in the infinite model or as n limits in the only near the boundary of the square. finite model; see Section 5.Weusetheword→∞ normal- (b) The property that each vertex has exactly ized as a reminder of the “density 1” convention—we 4edges,indirectionsNE(betweenNorthandEast), choose the normalized unit of distance to make cities NW, SE and SW, is immediate from the construction. have average density 1 per unit area. After this normal- Note, however, that while adjacent NW space–time tra- ization, L is the average network length per unit area. jectories in Figure 2 do not cross, the corresponding di- agonal roads in the Hammersley network may cross, so 3.2 The Route-Length Efficiency Statistic R it is not a , though this has only negligible effect on route lengths. In designing a network, it is natural to regard total (c) Intuition, confirmed by Figure 7 later, says that length as a “cost”. The corresponding “benefit” is hav- the Hammersley network is not very efficient as a road ing short routes between cities. Write #(i, j) for the network. It serves to demonstrate that there do exist route length (length of shortest path) between cities i random networks other than the familiar ones, and pro- and j in a given network, and d(i,j) for Euclidean distance between the cities. So #(i, j) d(i,j),and vides an instance where imposing deterministic con- ≥ straints (the four edges, in this case) on a random net- we write work makes it much less efficient. How general a phe- #(i, j) r(i,j) 1 nomenon is this? = d(i,j) − 280 D. J. ALDOUS AND J. SHUN

FIG.4. Variant square, triangular and hexagonal lattices. Drawn so that the density of cities is the same in each diagram, and ordered by value of L. so that “r(i,j) 0.2” means that route length is 20% unreasonable to characterize the UK rail network as in- longer than straight= line distance. With n cities we get efficient simply because there is no very direct route n 2 such numbers r(i,j);whatisareasonablewayto between Oxford and Cambridge. combine these into a single statistic? Two natural pos- The statistic R has a more subtle drawback. Con- ! " ave sibilities are as follows: sider a network consisting of:

Rmax max r(i,j), the minimum-length connected network (Steiner := j i • (4) =) tree) on given cities; Rave ave(i,j) r(i,j), and a superimposed sparse collection of randomly := • oriented lines (a Poisson line process [45]). where ave(i,j) denotes average over all distinct pairs (i, j).ThestatisticRmax has been studied in the con- See Figure 5.Bychoosingthedensityoflinestobe text of the design of networks [37] sufficiently low, one can make the normalized network where it is called the stretch.However,beingan“ex- length be arbitrarily close to the minimum needed for tremal” statistic Rmax seems unsatisfactory as a de- connectivity. But it is easy to show (see [7]forcareful scriptor of real world networks—for instance, it seems analysis and a stronger result) that one can construct CONNECTED NETWORKS OVER RANDOM POINTS 281

efficiency trade-off [the function Ropt(L) discussed in Section 5], and so, in particular, it makes sense to com- pare the values of R for networks with different n. Advantage 2. A more realistic model for traffic would posit that volume of traffic between two cities γ varies as a power-law d− of distance d,sothatincal- culating Rave it would be more realistic to weight by γ d− .Thismeansthattheoptimalnetwork,whenusing Rave as optimality criterion, would depend on γ .Useof R finesses this issue; the value of γ does not affect R. Arelatedissueisthatvolumeoftrafficbetweentwo cities should depend on their populations. Intuitively, incorporating random population sizes should make the optimal R smaller because the network designer can create shorter routes between larger cities. We see this effect in data [10]; R calculated via population- weighting is typically slightly smaller. But we have not tried theoretical study. Disadvantage.ThestatisticR is tailored to the in- finite model, in which it makes sense to consider two FIG.5. Efficient or inefficient? Rave would judge this network cities at exactly distance d apart (then the other city po- efficient in the n limit. →∞ sitions form a Poisson point process). For finite n we need to discretize. For the empirical data in [10], where such networks so that Rave 0asn .Ofcourse n 20, we average over intervals of width 1 unit (re- no one would build a road→ network looking→∞ like Fig- call= the unit of distance is taken such that the density ure 5 to link cities, because there are many pairs of of cities is 1 per unit area), that is, for d 1, 2,...,5, nearby cities with only very indirect routes between we calculate = them. The disadvantage of Rave as a descriptive sta- ρ(d) mean value of r(i,j) over city-pairs tistic is that (for large n)mostcity-pairsarefarapart, ˜ := (6) with d 1

FIG.6. The function ρ(d) for three theoretical networks on random cities. Irregularities are Monte Carlo random variation.

This characteristic shape has a common-sense in- 4. LENGTH-EFFICIENCY TRADE-OFF FOR terpretation. Any efficient network will tend to place TRACTABLE NETWORKS roads directly between unusually close city-pairs, im- Recall that our overall theme is the trade-off between plying that ρ(d) should be small for d<1. For large network length and route-length efficiency, and that in d the presence of multiple alternate routes helps pre- this paper we focus on n limits in the random vent ρ(d) from growing. At distance 2 3fromatyp- →∞ − model and the particular statistics L and R. ical city i there will be about π32 π22 16 other − ≈ The models described in Section 2 are “tractable” cities j.Forsomeofthesej there will be cities k near in the specific sense that one can find exact analytic the straight line from i to j,sothenetworkdesigner formulas for normalized length L.UnfortunatelyR is can create roads from i to k to j.Thedifficultyarises not amenable to analytic calculation, and we resort to where there is no such intermediate city k:includinga Monte Carlo simulation to obtain values for R.Table1 direct road (xi,xj ) will increase L,butnotincludingit and Figure 7 show the values of (L, R) in the models. will increase ρ(d) for 2

FIG.7. The normalized network length L and the route-length efficiency statistic R for certain networks on random points. The show the beta-skeleton family, with RN the relative neighborhood graph and G the Gabriel graph. The are special models: shows the◦ Delaunay triangulation, shows the network from Section 2.4 and shows the Hammersley network•. , ! G2 ♦

(b) Value of L for MST from Monte Carlo [19]. LEMMA 1. For a proximity graph with template A In principle, one can calculate arbitrarily close bounds on the Poisson point process, [11], but apparently this has never been carried out. Of π 3/2 course, " 2foranytree. ¯ (8) L 3/2 , (c) The Gabriel= graph and the relative neighborhood = 4c π graph fit the assumptions of Lemma 1 with c π/4 (9) " , = ¯ and c 2π √3 ,respectively,andtheirtableentries = c = 3 − 4 for L and " are obtained from Lemma 1, as are the where c area(A). ¯ = values for β-skeletons in Figure 7. PROOF.Takeatypicalcityatpositionx0.Fora (d) For the Hammersley network, every degree city x at distance s the chance that (x0,x) is an edge equals 4, so L 2 (mean edge-length). It follows equals exp( cs2) and so from theory [6]thatatypicaledge,say,NEfrom= × (x, y), − goes to a city at position (x ξx,y ξy),whereξx and mean-degree ∞ exp( cs2)2πsds, + + = 0 − ξy are independent with Exponential(1) distribution. # So mean edge-length equals 1 L ∞ s exp( cs2)2πsds. = 2 0 − ∞ ∞ 2 2 x y # (7) x y e− − dxdy 1.62. 0 0 + ≈ Evaluating the integrals gives (8)and(9). # # $ ! (e) For any triangulation, " 6intheinfinitemodel. One can derive similar integral formulas for other ¯ For the Delaunay triangulation,= L ES where S is “local” characteristics, for example, mean density of the perimeter length of a typical cell,= and it is known triangles and moments of vertex degree. See [18, 20, 32 ([35], page 113) that ES 3π .Note[33]thattheDe- 21, 34]foravarietyofsuchgeneralizationsandspe- launay triangulation is in= general not the minimum- cializations. length triangulation. Our simulation results in Figure 6 4.2 Other Tractable Networks for ρ(d) for the Delaunay triangulation are roughly consistent with a simulation result in [13]sayingthat We do not know any other ways of defining networks ρ(65) 0.05. on random points which are both “natural” and are ≈ tractable in the sense that one can find exact analytic 4.1 A Simple Calculation for Proximity Graphs formulas for L.Inparticular,weknownotractableway Let us give an example of an elementary calculation of defining networks with deliberate junctions as in for proximity graphs over random points. Figure 8.Notealsothat,whileitiseasytomakeadhoc 284 D. J. ALDOUS AND J. SHUN

where R is the discretized version (6)calculatedusing ˜ intervals of some suitable length δn.Applyingthistoa random configuration X in the finite model gives, for each L,arandomvariable

)n(L) Rn(X,L). := One intuitively expects convergence to some determin- istic limit

(12) )n(L) Ropt(L) say, as n . → →∞ The analogous result for Rmax will be proved care- fully in [8], and the same “superadditivity” argument could be used to prove (12). See [43, 44, 47]forgen-

FIG.8. An ad hoc modification of the relative neighborhood eral background to such results. The point is that we graph, introducing junctions. do not have any explicit description of the optimal [i.e., attaining the minimum in (11)] networks in the modifications to the geometric graph to ensure connec- finite or infinite models, so it seems very challenging tivity, these destroy tractability. On the other hand, one to prove the natural stronger supposition that the finite can construct “unnatural” networks (see, e.g., [8]) de- optimal networks themselves converge (in some appro- signed to permit calculation of L. priate sense) to a unique infinite optimal network for which the value R Ropt(L) is attained. = 5. OPTIMAL NETWORKS AND N LIMITS →∞ 5.3 The Curve Ropt(L) 5.1 Tractable Models Every possible network on the infinite Poisson point As mentioned earlier, the quantities L,",R we dis- process defines a pair (L, R),andthecurveR ¯ = cuss may be interpreted as exact values in the infinite Ropt(L) can be defied equivalently as the lower bound- model or as n limits in the finite model. To elab- ary of the set of possible values of (L, R).Thereis orate briefly, in→∞ a realization of the finite model (n cities no reason to believe that proximity graphs are exactly distributed independently and uniformly in a square of optimal, and, indeed, Figure 7 shows that the Delau- area n), a network in Table 1 has a normalized length nay triangulation is slightly more efficient than the cor- 1 Ln n− (network length) and an average degree responding β-skeleton. But our attempts to do better " =which× are random variables, but there is conver- by ad hoc constructions (e.g., by introducing degree-3 ¯ n gence (in probability and in expectation) junctions—see Figure 8 for an example) have been un- successful. And, indeed, the fact that the two special (10) Ln L, "n " as n → ¯ → ¯ →∞ models in Figure 7 lie close to the β-skeleton curve to limit constants definable in terms of the analogous lends credence to the idea that this curve is almost network on the infinite model (rate 1 Poisson point optimal. We therefore speculate that the function Ropt process on the infinite plane). For the proximity graphs looks something like the curve in Figure 9,whichwe or Delaunay triangulation, the network definition ap- now discuss. plies directly to the infinite model and proof of (10)is What can we say about Ropt(L)?Itisapriorinonin- straightforward. For the Hammersley network, (10)is creasing. It is known [47]thatthereexistsaEuclidean implicit in [6], and for the MST detailed arguments can Steiner tree constant LST representing the limit nor- be found in [9, 43]. malized Steiner tree length in the random model, and 5.2 Optimal Networks clearly Ropt(L) for LLST ∞ ; Given a configuration x of n cities in the area-n (13) 1 Ropt(L) 0asL square, and a value of L which is greater than n− → →∞ (length of Steiner tree), one can define a number × are not trivial to prove rigorously, but follow from the R (x,L) min of R over all networks corresponding facts for Rmax proved in [8]. But we are n ˜ (11) = unable to prove rigorously that Ropt(L) is strictly de- on x with normalized length L, creasing or that it is continuous. ≤ CONNECTED NETWORKS OVER RANDOM POINTS 285

FIG.9. Speculative shape for the curve Ropt(L), with and values from tractable networks in Figure 7. ◦ •

6. FINAL REMARKS 6.3 Real-World Trade-Off Between Network Length and Route-Length Efficiency 6.1 Toy Models for Road Networks The idea of using proximity graphs as toy models for Recall that our central theme is seeking to quan- road networks has previously been noted [30]butnot tify the trade-off between normalized network length l investigated very thoroughly. It is an intuitively natural and route-length efficiency R.Figure9 suggests that idea to a network designer: whether or not to place a di- for optimal networks the “law of diminishing returns” rect road from city i to a nearby city j depends (partly) sets in around L 2(forcomparison,thisisthevalue = on whether some other city k is close to the line be- of L corresponding to the square grid network), in tween them. that Ropt(L) decreases rapidly to around 0.13 as L in- As observed by a referee, for the kind of models creases to 2 but decreases only slowly as L increases studied in this paper we expect route length #(i, j) be- further. This suggests a kind of “economic prediction” tween distant cities to be roughly proportional to graph for the lengths of real-world networks which are per- distance (number of edges), which is a more relevant ceived by users to be efficient in providing short routes: quantity in some contexts. However, when one con- siders design of optimal networks, replacing or par- the length of an efficient network linking n tially replacing route length by graph distance leads cities in a region of area A will be roughly to quite different optimal networks [1, 22]. For some 2√An. other cost/benefit functionals leading to yet different Here the √An arises from undoing the normalization optimal networks see [2, 14]. and the “2” is the value of L.Ofcourse,thisisrough: 6.2 Rigorous Proof of Finite R in Random we mean “closer to 2 than to 1 or 3.” Proximity Graphs 6.4 Other Results for the Random Network Models Table 1 presented the Monte Carlo numerical value 0.38 of R for the relative neighborhood graph on ran- There is substantial literature on the networks (MST, dom≈ points. From a rigorous viewpoint, the assertion proximity graphs, Delaunay triangulation) in the de- that a random network has R< is essentially the terministic setting. In the random case, central limit assertion that ρ(d) O(d) as d ∞ .Thisisoften theorems for total network length have been studied nontrivial to prove.= A general sufficient→∞ condition for in many models: for the MST in [29, 31, 32], and for this property, which applies to the relative neighbor- the Delaunay triangulation, Voronoi tessellation, rela- hood graph (and hence all proximity graphs), is proved tive neighborhood and Gabriel graphs in [12, 25, 42]. in [3]. The related fact that the limit limd ρ(d)/d Large deviation estimates for total network length are exists is proved in [4]. →∞ given for the Gabriel graph in [46], Section 11.4, and 286 D. J. ALDOUS AND J. SHUN presumably could be extended to other models. Oth- three cases: erwise the literature for the random case is rather dif- Relative n’hood graph punctured lattice, fuse, with different focuses for different networks. For ↔ instance, work on MSTs has focused on connections Gabriel graph square lattice, ↔ with critical continuum percolation [17]. For the rel- Hammersley network diagonal lattice, ative neighborhood graph and the Gabriel graph, [20] ↔ calculates " and [18]showsthat,inthefinitemodel, Delaunay triangulation triangular lattice. ¯ ↔ in a certain range the β-skeletons have 6.7 Scale Invariant Continuum Networks

(14) Rmax grows as order log n/ log log n Introducing the statistic R can be viewed as one $ approach to resolving the “paradox” from [7], dis- and [21]showsthesameorderformaximumvertex cussed in Section 3.2,thatthemorenaturalstatistic degree in the Gabriel graph. As for the Delaunay tri- Rave does not lead to realistic optimal networks in the angulation, there has been surprisingly little follow-up n limit. This particular approach was prompted to the seminal analysis by Miles [35](variousmaxi- by→∞ visualizing real-world road networks—cf. discus- mal statistics are studied in [16]), though the closely sion in Section 3.3.Letusmentionamathematically related Voronoi tessellation has been studied in more more sophisticated alternative, under study as a work detail [36]. in progress [5]. Instead of a discrete Poisson process of 6.5 Speculative Applications of Random Proximity cities, we imagine a continuum limit. That is, for each finite set (z ,...,z ) of points in the plane, there is a Graphs 1 k random network (z1,...,zk) linking the points, con- Random proximity graphs seem an interesting ob- sistent as more pointsS are added. Mathematically nat- ject of study from many viewpoints, in particular, as ural structural properties for the distribution of such a an attractive alternative to random geometric graphs process are as follows: for modeling spatial networks that are connected by (i) translation and rotation invariance, design. It is remarkable that results such as (14)are (ii) scale invariance, the only nonelementary results about them that we can find in the literature. As well as being natural models where the latter means that routes, as point-sets in R2, for road networks, proximity graphs might be useful are invariant in distribution under Euclidean scaling. in modeling communication networks suffering line of This implies that the quantity ρ(d) analogous to (5), sight interference. assumed finite, is a constant, which we can call R(.The At a more mathematical level, for questions such as analog L( of L is defined by spread-out percolation [41]orcriticalvalueofcontact the expected length of the network on n processes [15], random proximity graphs with small A uniform random points in the area-n square are an interesting alternative to the usual lattice- or ran- grows L(n as n . dom graph-based models. For instance, it is natural to ∼ →∞ In this setting we can study the optimal trade-off be- conjecture that the critical value pA∗ for edge perco- lation on a random proximity graph with template A tween L( and R(,andthekindof“paradoxical”Fig- satisfies ure 5 network cannot arise because it violates scale- invariance. 1 (15) pA∗ π − area(A) as area(A) 0 ∼ → ACKNOWLEDGMENTS [the right side 1/" from (9)] and that the critical = ¯ Aldous’s research supported by NSF Grant DMS- value λA∗ for the contact process has the same asymp- totics. 0704159. We thank three anonymous referees for help- ful comments. 6.6 Analogies Between Deterministic and Random Networks REFERENCES As mentioned earlier, we may make very loose [1] ALDOUS,D.J.(2008).Spatialtransportationnetworkswith analogies between particular networks on random transfer costs: Asymptotic optimality of hub and spoke points and particular deterministic networks in Fig- models. Math. Proc. Cambridge Philos. Soc. 145 471–487. ure 4,basedinpartonexactequalityof" in the latter MR2442138 ¯ CONNECTED NETWORKS OVER RANDOM POINTS 287

[2] ALDOUS,D.J.(2008).Optimalspatialtransportationnet- [20] DEVROYE,L.(1988).Theexpectedsizeofsomegraphs works where link-costs are sublinear in link-capacity. J. Stat. in computational geometry. Comput. Math. Appl. 15 53–64. Mech. 2008 P03006. MR0937563 [3] ALDOUS,D.J.(2009).Whichconnectedspatialnetworks [21] DEVROYE,L.,GUDMUNDSSON,J.andMORIN, P. (2009). on random points have linear route-lengths? Available at On the expected maximum degree of Gabriel and Yao graphs. arXiv:0911.5296v1. Adv. in Appl. Probab. 41 1123–1140. MR2663239 [4] ALDOUS,D.J.(2009).Theshapetheoremforroute-lengths [22] GASTNER, M. T. and NEWMAN,M.E.J.(2006).Shapeand in connected spatial networks on random points. Available at efficiency in spatial distribution networks. J. Stat. Mech. The- arXiv:0911.5301v1. ory Exp. 2006 P01015 (electronic). [5] ALDOUS,D.J.(2010).Scale-invariantrandomspatialnet- [23] GEORGE,P.-L.andBOROUCHAKI, H. (1998). Delau- works. To appear. nay Triangulation and Meshing.EditionsHermès,Paris. [6] ALDOUS,D.J.andDIACONIS,P.(1995).Hammers- MR1686530 ley’s interacting particle process and longest increasing sub- [24] GRAHAM,R.L.andHELL,P.(1985).Onthehistoryofthe sequences. Probab. Theory Related Fields 103 199–213. minimum spanning tree problem. IEEE Ann. History Comput. MR1355056 07 43–57. MR0783327 [7] ALDOUS,D.J.andKENDALL,W.S.(2008).Short-length [25] HEINRICH, L. (1994). Normal approximation for some routes in low-cost networks via Poisson line patterns. Adv. in mean-value estimates of absolutely regular tessellations. Appl. Probab. 40 1–21. MR2411811 Math. Methods Statist. 3 1–24. MR1272628 [8] ALDOUS,D.J.,BHAMIDI,S.andLANDO, T. (2010). The [26] HOLROYD,A.E.,PEMANTLE,R.,PERES,Y.and stretch-length tradeoff in geometric networks: Worst-case and SCHRAMM,O.(2009).Poissonmatching.Ann. Inst. H. average-case study. To appear. Poincaré Probab. Statist. 45 266–287. MR2500239 [9] ALDOUS,D.J.andSTEELE,J.M.(1992).Asymptoticsfor [27] HOLROYD,A.E.andPERES,Y.(2003).Treesandmatch- Euclidean minimal spanning trees on random points. Probab. ings from point processes. Electron. Comm. Probab. 8 17–27 Theory Related Fields 92 247–258. MR1161188 (electronic). MR1961286 [10] ALDOUS,D.J.andCHOI,A.(2009).Aroute-length [28] JAROMCZYK,J.W.andTOUSSAINT,G.T.(1992).Rela- efficiency statistic for road networks. Unpublished manu- tive neighborhood graphs and their relatives. Proc. IEEE 80 script. Available at www.stat.berkeley.edu/~aldous/Spatial/ 1502–1517. paper.pdf. [29] KESTEN,H.andLEE,S.(1996).Thecentrallimittheorem [11] AVRAM,F.andBERTSIMAS,D.(1992).Theminimumspan- for weighted minimal spanning trees on random points. Ann. ning tree constant in geometric probability and under the in- Appl. Probab. 6 495–527. MR1398055 dependent model: A unified approach. Ann. Appl. Probab. 2 [30] KIRKPATRICK,D.G.andRADKE,J.D.(1985).Aframe- 113–130. MR1143395 work for computational morphology. In Computational [12] AVRAM, F. and BERTSIMAS,D.(1993).Oncentrallimitthe- Geometry (G. T. Toussaint, ed.) 217–248. Elsevier, Amster- orems in geometrical probability. Ann. Appl. Probab. 3 1033– dam. 1046. MR1241033 [31] LEE,S.(1997).ThecentrallimittheoremforEuclidean [13] BACCELLI,F.,TCHOUMATCHENKO,K.andZUYEV,S. minimal spanning trees. I. Ann. Appl. Probab. 7 996–1020. (2000). Markov paths on the Poisson–Delaunay graph with MR1484795 applications to routeing in mobile networks. Adv. in Appl. [32] LEE,S.(1999).ThecentrallimittheoremforEuclideanmin- Probab. 32 1–18. MR1765174 imal spanning trees. II. Adv. in Appl. Probab. 31 969–984. [14] BARTHÉLEMY,M.andFLAMMINI,A.(2006).Optimaltraf- MR1747451 fic networks. J. Stat. Mech. Theory Exp. 2006 L07002. [33] LLOYD,E.L.(1977).Ontriangulationsofasetofpointsin [15] BERGER,N.,BORGS,C.,CHAYES,J.T.andSABERI,A. the plane. In 18th Annual Symposium on Foundations of Com- (2005). On the spread of viruses on the . In Pro- puter Science (Providence, RI, 1977) 228–240. IEEE Com- ceedings of the Sixteenth Annual ACM-SIAM Symposium on put. Soc., Long Beach, CA. MR0660693 Discrete Algorithms 301–310 (electronic). ACM, New York. [34] MATULA,D.W.andSOKAL,R.R.(1980).Propertiesof MR2298278 Gabriel graphs relevant to geographical variation research and [16] BERN,M.,EPPSTEIN, D. and YAO,F.(1991).Theexpected the clustering of points in the plane. Geog. Anal. 12 205–222. extremes in a Delaunay triangulation. Internat. J. Comput. [35] MILES,R.E.(1970).OnthehomogeneousplanarPoisson Geom. Appl. 1 79–91. MR1099499 point process. Math. Biosci. 6 85–127. MR0279853 [17] BEZUIDENHOUT,C.,GRIMMETT,G.andLÖFFLER,A. [36] MØLLER, J. (1994). Lectures on Random Vorono˘ıTessel- (1998). Percolation and minimal spanning trees. J. Stat. Phys. lations. Lecture Notes in Statistics 87. Springer, New York. 92 1–34. MR1645627 MR1295245 [18] BOSE,P.,DEVROYE,L.,EVA N S ,W.andKIRKPAT- [37] NARASIMHAN, G. and SMID, M. (2007). Geometric RICK,D.(2006).OnthespanningratioofGabrielgraphsand Spanner Networks. Cambridge Univ. Press, Cambridge. β-skeletons. SIAM J. Discrete Math. 20 412–427 (electronic). MR2289615 MR2257270 [38] NARAYANASWAMY,S.,KAWADIA,V.,SREENIVAS,R.S. [19] CORTINA-BORJA,M.andROBINSON,T.(2000).Estimat- and KUMAR,P.R.(2002).Powercontrolinad-hocnetworks: ing the asymptotic constants of the total length of Euclidean Theory, architecture, and implementation of the minimal spanning trees with power-weighted edges. Statist. COMPOW protocol. In Proc. European Wireless Conference, Probab. Lett. 47 125–128. MR1747099 Florence, Italy. 288 D. J. ALDOUS AND J. SHUN

[39] NEWMAN,M.E.J.(2003).Thestructureandfunctionof [44] STEELE, J. M. (1997). Probability Theory and Combinator- complex networks. SIAM Rev. 45 167–256. MR2010377 ial Optimization. CBMS-NSF Regional Conference Series in [40] PENROSE, M. D. (2003). Random Geometric Graphs.Ox- Applied Math 69.SIAM,Philadelphia,PA.MR1422018 ford Univ. Press, Oxford. MR1986198 [45] STOYAN,D.,KENDALL,W.S.andMECKE, J. (1995). Sto- [41] PENROSE,M.D.(1993).Onthespread-outlimitforbond chastic Geometry and Its Applications,2nded.Wiley,Chich- and continuum percolation. Ann. Appl. Probab. 3 253–276. ester. MR0895588 MR1202526 [46] TALAGRAND,M.(1995).Concentrationofmeasureand [42] PENROSE,M.D.andYUKICH,J.E.(2001).Centrallimit theorems for some graphs in computational geometry. Ann. isoperimetric inequalities in product spaces. Inst. Hautes Appl. Probab. 11 1005–1041. MR1878288 Études Sci. Publ. Math. 81 73–205. MR1361756 [43] PENROSE,M.D.andYUKICH,J.E.(2003).Weaklawsof [47] YUKICH, J. E. (1998). Probability Theory of Classical large numbers in geometric probability. Ann. Appl. Probab. Euclidean Optimization Problems. Lecture Notes in Math. 13 277–303. MR1952000 1675.Springer,Berlin.MR1632875