OFFPRINT boundaries of complex networks

Jia Shao, Sergey V. Buldyrev, Reuven Cohen, Maksim Kitsak, Shlomo Havlin and H. Eugene Stanley EPL, 84 (2008) 48004

Please visit the new website www.epljournal.org TAKE A LOOK AT THE NEW EPL Europhysics Letters (EPL) has a new online home at www.epljournal.org

Take a look for the latest journal news and information on: • reading the latest articles, free! • receiving free e-mail alerts • submitting your work to EPL

www.epljournal.org November 2008 EPL, 84 (2008) 48004 www.epljournal.org doi: 10.1209/0295-5075/84/48004

Fractal boundaries of complex networks

Jia Shao1(a),SergeyV.Buldyrev1,2, Reuven Cohen3, Maksim Kitsak1, Shlomo Havlin4 and H. Eugene Stanley1

1 Center for Polymer Studies and Department of Physics, Boston University - Boston, MA 02215, USA 2 Department of Physics, Yeshiva University - 500 West 185th Street, New York, NY 10033, USA 3 Department of Mathematics, Bar-Ilan University - 52900 Ramat-Gan, Israel 4 Minerva Center and Department of Physics, Bar-Ilan University - 52900 Ramat-Gan, Israel

received 12 May 2008; accepted in final form 16 October 2008 published online 21 November 2008

PACS 89.75.Hc – Networks and genealogical trees PACS 89.75.-k – Complex systems PACS 64.60.aq –Networks

Abstract – We introduce the concept of the boundary of a as the set of nodes at distance larger than the mean distance from a given node in the network. We study the statistical properties of the boundary nodes seen from a given node of complex networks. We find that for both Erd˝os-R´enyi and scale-free model networks, as well as for several real networks, the boundaries have fractal properties. In particular, the number of boundaries nodes B follows a − power law probability density function which scales as B 2. The clusters formed by the boundary nodes seen from a given node are with a fractal dimension df ≈ 2. We present analytical and numerical evidences supporting these results for a broad class of networks.

Copyright c EPLA, 2008

Many complex networks are “small world” due to the role in several scenarios, such as in the spread of viruses very small average distance d between two randomly or information in a human social network. If the virus chosen nodes. Often d ∼ ln N,whereN is the number of (information) spreads from one node to all its nearest nodes [1–6]. Thus, starting from a randomly chosen node neighbors, and from them to all next nearest neighbors following the shortest path, one can reach any other node and further on until d,howmanynodesdonot get the in a very small number of steps. This phenomenon is called virus (information), and what is their distribution with “six degrees of separation” in social networks [4]. That is, respect to the origin of the infection? for most pairs of randomly chosen people, the shortest In this letter, we find theoretically and numerically “distance” between them is not more than six. Many that the nodes at the boundaries, which are of order random network models, such as Erd˝os-R´enyi network N, exhibit similar fractal features for many types of (ER) [1], Watts-Strogatz network (WS) [5] and scale-free networks, including ER and SF models as well as several network (SF) [3,6–8], as well as many real networks, have real networks. Song et al. [9] found that some networks been shown to possess this small-world property. have fractal properties while others do not. Properties Much attention has been devoted to the structural of fractal networks were also studied [10,11]. Here we properties of networks within the average distance d from a show that almost all model and real networks includ- given node. However, almost no attention has been given ing non-fractal networks have fractal features at their to nodes which are at distances greater than d from a boundaries. given node. We define these nodes as the boundaries of Figure 1 demonstrates our approach and analysis. For the network and study the ensemble of boundaries formed each “root” node, we call the nodes at distance ℓ from it by all possible starting nodes. An interesting question is: “nodes in shell ℓ”. We choose a random root node and howmany“friendsoffriendsoffriendsetc....”hasoneat count the number of nodes Bℓ at shell ℓ.Weseethat a distance greater than the average distance d?Whatis B1 = 10, B2 = 11, B3 = 13, etc. . . . We estimate the aver- their probability distribution and what is the structure age distance (diameter) d ≈ 2.9 by averaging the distances of the boundaries? The boundaries have an important between all pairs of nodes. After removing nodes with ℓ

48004-p1 Jia Shao et al.

0 10 l=11 1- 10 l=12 β= 0.1 l 2- 10 l=13 P(B )

-3 10

4- )a( ER 10 0 2 4 6 shell 1 10 10 10 10 Bl

0 shell 2 10

1- 10 cluster of 8 nodes β= 0.1

l 2- 10 P(B ) 3- l=8 Fig. 1: (Color online) Illustration of shells and clusters origi- 10 l=11 nating from a randomly chosen root node, which is shown in l=10 4- l=9 the center (red). Its neighboring nodes are defined as shell 1 10 )b( SF (green), the nodes at distance ℓ are defined as shell ℓ.When 0 2 4 6 10 10 10 10 removing all nodes with ℓ<3, the remaining network (purple) Bl becomesfragmentedinto12clusters. 0 10 l =9 l=10 boundary of the network is always seen from a given node, 1- 10 β= 0.1 thus not a unique set of nodes in the network. l l =11 We begin by simulating ER and SF networks, and l=12

P(B ) 2- then we present analytical proofs. Figure 2a shows 10 simulation results for the number of nodes Bℓ reached from a randomly chosen origin node for an ER network. -3 )c( HEP The results shown are for a single network realization 10 0 1 2 3 4 of size N =106, with average degree k =6 and d ≈ 7.9 10 10 10 10 10 Bl (see footnote 1). For ℓB ,whereB is the maximum typical size of shell ℓ l=4 ℓ ℓ ℓ 1- l=6 l=5 (see footnote 2). However, for ℓ>d, we observe a clear 10 transition to a power law decay behavior, where l β= 0.1 −β -2 l=7 P (Bℓ) ∼ Bℓ ,withβ ≈ 1andthepdfofBℓ is 1P(B ) 0 ˜ −2 P (Bℓ) ≡ dP (Bℓ)/dBℓ ∼ Bℓ . For different networks, the 3- emergence of the power law can occur at shell ℓ = d +1 or 10 (d) AS ℓ = d + 2. Thus, our results suggest a broad “scale-free” 0 1 2 3 4 10 10 10 10 10 distribution for the number of nodes at distances larger B than d. This power law behavior demonstrates that there l is no characteristic size and a broad range of sizes can Fig. 2: (Color online) The cumulative distribution function, appear in a shell at the boundaries. P (Bℓ), for two random network models: (a) ER network In SF networks, the degrees of the nodes, k, follow with N =106 nodes and k = 6, and (b) SF network with −λ a power law distribution function q(k) ∼ k ,wherethe N =106 nodes and λ =2.5, and two real networks: (c) the minimum degree of the network, kmin,ischosentobe2. High Energy Particle (HEP) physics citations network and Figure 2b shows, for SF networks with λ =2.5, similar (d) the Autonomous System (AS) Internet network. The shells −β power law results, P (Bℓ) ∼ Bℓ ,withβ ≈ 1forℓ>d with ℓ>d are marked with their shell number. The thin lines from left to right represent shells ℓ =1, 2, ..., respectively, with ℓd, P (Bℓ) follows a power law distribution 1Different realizations yield similar results. In one realization, − − P (B ) ∼ B β , with β ≈ 1 (corresponding to P˜(B ) ∼ B 2 for a certain fraction of nodes are randomly taken to be origin. The ℓ ℓ ℓ ℓ the pdf). The appearance of a power law decay only happens histogram is obtained from Bℓ belonging to different origin nodes. 2 The behavior of the pdf of Bℓ for ℓ

48004-p2 Fractal boundaries of complex networks

0 10 N=200 8 a( ) ER N=400 10 l=5 1- N=800 l=6 10 N=8000 6 l N=32000 10 θ= .3 0 =7 2- N=128000 6 l 10 N=10 4

l 10

3- n(s ) 10

/N 2 10 4- 10 0 10 )a( SF 5- 10 0 2 4 6 6- -4 2- 0 2 4 6 8 10 10 10 10 l (- lnN nl/ sl

4 8 10 10 ER N=128000 l=5 6 l=7 3 ER N=10 6 l=8 10 SF N=128000 10 θ=3.0 6

1 SF N=10 l 4 2 1l 0 10 ~ k + n(s ) 2 1 10 10 ( )b 0 )b( HEP 0 10 10 0 1 2 3 4 1 2 3 4 5 6 7 8 9 10 11 12 13 10 10 10 10 10 l sl

Fig. 3: (Color online) (a) Normalized average number of nodes Fig. 4: (Color online) The number of clusters of sizes sℓ, n(sℓ), at shell ℓ, Bℓ/N , as a function of ℓ − ln N/ln k for ER as a function of sℓ after removing nodes within shell ℓ for: network with k = 6. For different N, the curves collapse. (a) SF network with N =106 and λ =2.5, (b) HEP citations ˜ 2 (b) kℓ + 1, which is kℓ /kℓ, as a function of ℓ shown for both network. The relation between n(sℓ) and sℓ is characterized −θ ER and SF networks with different N. by a power law, n(sℓ) ∼ sℓ , with θ ≈ 3. In order to show all curves clearly, vertical shifts are made. Note that the points in the tail of the distributions represent the rare occurences of which is similar with ER. We find similar results also for large clusters which are formed by nodes outside shell ℓ − 1. λ>3 (not shown). To test how general is our finding, we also study The branching factor [15] of the network is several real networks (figs. 2c, d), including the High ˜ 2 Energy Particle (HEP) physics citations network [12] and k = k /k−1, the Autonomous System (AS) Internet network [13,14]. where the averages are calculated for the entire network. For HEP network and AS network, d ≈ 4.2 and 3.3, For ER network, k˜ can be proved to be equivalent to respectively. The of HEP network is k. Similarly, we define k˜ ≡k2/k −1, where the not a power law (see fig. 2c), while the AS network shows ℓ ℓ ℓ averages are calculated only for nodes in shell ℓ.Above a power law degree distribution with λ ≈ 2.1 (see fig. 2d). the diameter, k˜ + 1 decreases with ℓ for both ER and Our results suggest that the power law decay behavior ℓ SF networks (fig. 3b). Thus, at the shells where power appears also in both networks, with similar values of β ≈ 1 law behavior of P (B ) appears (fig. 2), the nodes have for ℓ>d (see footnote 3). ℓ much lower k˜ + 1 compared with the entire network. Next we ask: how many nodes are on average at ℓ The approaching of k˜ +1 to 1 (ER network) and 2 (SF the boundaries? Are they a nonzero fraction of N? ℓ network) is consistent with a critical behavior at the We calculate the mean number B  in shell ℓ,andin ℓ boundaries of the network [15]. fig. 3a plot B /N as a function of ℓ − ln N/ln k for ℓ Next, we study the structural properties of the bound- different values of N for ER network. The term ln N/ln k aries. Removing all nodes that are within a distance ℓ (not represents the diameter d of the network [2]. We find that, including shell ℓ)forℓ>d, the network will become frag- for different values of N, the curves collapse, supporting mented into several clusters (see fig. 1). We denote the a relation independent of network size N.SinceB /N is ℓ size of those clusters as s , the number of clusters of size apparently constant and independent of N, it follows that ℓ s as n(s ), and the diameter of the cluster as d . We find B ∼N, i.e., a finite fraction of N nodes appears at each ℓ ℓ ℓ ℓ n(s) ∼ s−θ,withθ ≈ 3.0 (figs. 4a and b). The points in shell including shells with ℓ>d. We find similar behavior the tails of figs. 4a and b represent the rare appearances for SF network with λ =3.5 (not shown). of the large clusters. We find similar results for ER and other real networks. 3We also find similar results (not shown here) for other real The relation between the sizes of the clusters sℓ and networks. their diameters dℓ is shown as scatter plots in figs. 5a

48004-p3 Jia Shao et al.

4 10 for ER, SF with λ =3.5 and several other real networks. l=6 l =7 Root nodes with different degree yield different average 3 10 distances of the rest of the nodes from the root [18,19]. ϕ=1.9 However, using our definition of boundaries the fractal

l 2 S 10 clusters can be observed for both large and small degree roots (see fig. 5c).

1 10 Next we present analytical derivations supporting the )a( SF above numerical results. We denote the degree distribution

0 of a network as q(k). For infinitely large networks we can 10 1 10 100 dl neglect loops for ℓ9 Ave. D si =.t 6.3 Bm distribution of Bm, is the coefficient of x in the Taylor S8 vs d8 k<4 Ave. D si .t =7 1. 3 10 expansion of G˜m(x). For shells with large m which is still smaller than d, ϕ= 8.1 l 2

S 10 it is expected [23] that the number of nodes will increase ˜ m−1 by a factor of k. It is possible to show [21] that G1 (x) 1 ˜m 10 converges to a function of the form f((1 − x)k ) for large c( ) ER m (m ≪ d), where f(x) satisfies the Poincar´efunctional N=10 0,0 00 =6 0 relation 10 1 10 100 G1(f(y))= f(yk˜), (2) dl where y =1− x. The function form of f(y) can be uniquely

Fig. 5: (Color online) The size of clusters, sℓ, shown as scatter determined from eq. (2). plots of the diameters dℓ of the clusters for (a) SF network It is known [21] that f(x) has an asymptotic func- 6 −δ δ with N =10 and λ =2.5, (b) HEP citations network for nodes tional form, f(y)=f∞ + ay +0(y ), where f∞ satisfies outside shell ℓ − 1. Vertical shifts of the curves are made for G (f )=f . It can be shown [22] that f also gives ϕ 1 ∞ ∞ ∞ clarity. sℓ scales with dℓ as sℓ ∼ dℓ , with ϕ ≈ 2. (c) For ER 5 the probability that a link is not connected to the giant network with N =10 and k =6, sℓ as a function of dℓ for component of the network by one of its ends. Expanding small degree (k<4) and large degree (k>9) nodes chosen as both sides of eq. (2), we obtain roots. Here, sℓ as a function of dℓ is shown for ℓ =8. The ′ −δ ˜−δ −δ δ diameter of the entire network is d ≈ 6.6. Depending on the G1(f∞)+G1(f∞)ay = f∞ + ak y +0(y ). (3) degree of the root node, the average distance of all nodes from ′ ˜ the root may change. Large degree roots (k>9) have small Since G1(f∞)=f∞,wehaveδ = − ln G1(f∞)/ ln k. average distances (≈ 6.3) while small degree roots (k<4) have If q(1) = 0 and q(2) =0, from G1(f∞)=f∞,wehave ϕ f =0 and G′ (f )=G′′(0)/k =2q(2)/k.Ifq(2) = large average distances (≈ 7.1). However, sℓ ∼ dℓ , with ϕ ≈ 2, ∞ 1 ∞ 0 can be observed for both small and large degree roots. We q(1) = 0 (B¨ottcher case [21]), then δ = ∞, which indi- ignore the large clusters which appear in the flat regions of cates that f(y) has an exponential singularity. Therefore, fig. 4. networks with minimum degree kmin  3donothavethe power law distribution of Bℓ shown in fig. 2, and therefore have no fractal boundaries. and b, for SF (λ =2.5) and HEP citations networks, Applying Tauberian-like theorems [21,24] to f(y), which respectively. In order to show all curves, vertical shifts are has a power law behavior for y →∞, Dubuc [25] concluded ϕ made. Figures 5a and b show a power law relation, sℓ ∼ dℓ , that the Taylor expansion coefficient of G˜m(x), P˜(Bm), with ϕ ≈ 2, suggesting that the clusters at the boundaries μ ∗ ˜m behaves as Bm with an exponential cutoff at Bm ∼ k , are fractals with fractal dimension df = 2 like where clusters at criticality [16,17]. Here, we ignore the non- δ − 1,q(1) =0 and q(2) =0; fractal large clusters which appear in the flat regions of μ = fig. 4. We find that the fractal dimension is df = ϕ ≈ 2 also 2δ − 1,q(1) = 0 and q(2) =0.

48004-p4 Fractal boundaries of complex networks

0 10 n−1 l=1 where rn =1− ( i=1 Bi)/N is the fraction of nodes l=2 l=3 outside shell n − 1. Note that eq. (5) has almost the 2- 10 l=4  µ= 4.1 -+ 0 .1 l=5 same structure as eq. (1). It can also be shown that the l=6 ˜ l=7 branching factor of nodes outside shell n − 1isk(rn)= 4- l=8 ′′ ′ −1 10 uG0 (u)/G0(u), where u = G0 (rn). PDF For ER networks, eq. (5) yields 6- 10 l ∞ )a( ER ( <_ )d k(rℓ−1) k -8 rℓ+1 = e = q(k)rℓ , (6) 10 0 2 4 6 10 10 10 10 ℓ=0 Bl  0 10 which is valid for all possible ℓ. We test it in fig. 6b for )b( ER ER network. The relation between rℓ+m and rℓ can be obtained by applying eq. (6) m times on rℓ. In fig. 6b we -1 10 m=1 show the fraction of nodes outside shell ℓ + m − 1, rℓ+m,

l+m E 6(.q ) as a function of rℓ for ER network. Different values of m r are tested in the plot. -2 m=2 10 When m ≪ d and n ≫ d, using the same considerations m=3 as we used in eq. (1), one can show that

3- −μ−1 10 0 .0 2 0 4. .0 6 0.8 1 rn =[ak˜(1 − rm)] + r∞, (7) r l where r∞ = G0(f∞) is the fraction of nodes not belonging Fig. 6: (Color online) (a) For ER network, the probability to the giant component of the network, a is a constant. ˜ distribution function P (Bℓ)ofnumberofnodesBℓ in shells Based on eqs. (4) and (7), expressing rm and rn in ℓ  d. For small values of B , P˜(B ) ∼ Bμ,whereμ depends on ℓ ℓ ℓ terms of Bm and Bn, we find that for m ≪ d and n ≫ d, the k of the network (eq. (4)). The slopes of the least-square −μ−1 Bn ∼ B .UsingP˜(Bn)dBn = P˜(Bm)dBm,weobtain fit represented by the straight lines give μ =1.4 ± 0.1, which is m in good agreement with the theoretically predicted value μ = ˜ −1−μ/(μ+1)−1/(μ+1) −2 P (Bn) ∼ Bn = Bn , (8) 1.34. (b) The fraction of nodes outside shell ℓ + m − 1, rℓ+m,as ℓ−1 a function of rℓ for ER network, where rℓ =1− (i=1 Bi)/N supporting the numerical findings in fig. 2. is calculated for any possible ℓ. The (red) lines represent the These results are rigorous when k˜ exists and when the theoretical iteration function (eq. (6)). minimum degree kmin  2. For SF networks with λ<3, k˜ diverges for N →∞. But for finite N, k˜ still exists. Thus Thus the probability distribution function of the number the above results can also be applied to the case of λ<3. of nodes in the shell m with m ≪ d has a power law tail For both ER and SF networks with kmin  3, the power for small values of Bm law of P˜(Bn)withn ≫ d cannot be observed, as we indeed confirm by simulations. P˜(B ) ∼ Bμ . (4) m m Relating our problem to , we can For an ER network, eq. (4) is supported by simulations explain the simulation results of the probability distri- for m  d in fig. 6a. Figure 6a shows for ER network bution of cluster size sℓ. The cluster size distribution that P˜(Bℓ)forℓS) ∼ S exp(−S|p − pc| ). (9) −ln(kf∞)/ln k−1, where f∞ can be obtained numeri- cally from f = ek(f∞−1). In the case of k =6,μ ≈ 1.34, ∞ In the case of random networks the percolation threshold which is close to the result shown in fig. 6a. is given by p =1/k˜. In the exterior of the shell n − 1 The above considerations are correct only for m˜ 1 a giant component. These finite clusters have fractal links in each shell. Writing down the master equation for dimension d = 2 [16,17]. This theoretical prediction is the degree distribution in the outer shells, one can obtain f confirmed in fig. 5. a system of ordinary differential equations, which can The cluster size distribution can be estimated by be solved analytically using the apparatus of generating ∗ introducing a sharp exponential cutoff at s = Sn ∼ functions. Using this solution one can show that 1 ˜ − σ −τ+1 ∗ |k(rn) − 1| ,sothatPn(s>S) ∼ S P (Sn >S), n−m −1 ∗ rn = G0(G1 (G0 (rm))), (5) where P (Sn >S) is the probability for a given shell to have

48004-p5 Jia Shao et al.

∗ 74 Sn >S.Sincern − r∞ has a smooth power law distribution [3] Albert R. and Barabasi´ A.-L., Rev. Mod. Phys., −σ and k˜(r∞) < 1, the probability that |k˜(rn) − 1| S) ∼ S and −τ+1−σ [5] Watts D. J. and Strogatz S. H., Nature (London), 393 Pn(s>S)=S [26]. Therefore the cluster size distribution follows n(s) ∼ s−(τ+σ) = s−θ,thusθ = τ + σ. (1998) 440. Cohen R. Havlin S. 90 For ER networks and SF networks with λ>4, τ =2.5 [6] and , Phys. Rev. Lett., (2003) 058701. and σ =0.5, the above derivations lead to n(s) ∼ s−3. [7] Dorogovtsev S. N. and Mendes J. F. F., Evolution of For SF networks with 2 <λ<4, τ =(2λ − 3)/(λ − 2) and Networks: from Biological Nets to the Internet and WWW σ = |λ − 3|/(λ − 2) [16]. Thus, for SF network with λ>3, (Oxford University Press, New York) 2003. −3 −3 there will be n(s) ∼ s . We conjecture n(s) ∼ s even [8] Pastor-Satorras R. and Vespignani A., Evolution for 2 <λ<3, although in this case k˜(rn) does not exist and Structure of the Internet: Statistical Physics Approach and the above derivations are not valid. Our numerical (Cambridge University Press) 2004. simulations support these results in fig. 4. [9] Song C. et al., Nature (London), 433 (2005) 392; Nat. In summary, we find empirically and analytically that Phy., 2 (2006) 275. the boundaries of a broad class of complex networks [10] Goh K. I. et al., Phys. Rev. Lett., 96 (2006) 018701. 75 including non-fractal networks [9] have fractal features. [11] Kitsak M. et al., Phys. Rev. E., (2007) 056115. The probability distribution function of the number of [12] Derived from the HEP section of arxiv.org; http:// vlado.fmf.uni-lj.si/pub/networks/data/hep-th/hep- nodes in these shells follows a power law, P˜(B ) ∼ B−2, ℓ ℓ th.htm (website of Pajek). and the number of clusters of size sℓ, n(sℓ), scales as Carmi S. 104 −3 [13] et al., Proc. Natl. Acad. Sci. U.S.A., (2007) n(sℓ) ∼ sℓ . The clusters at the boundaries are fractals 11150. with a fractal dimension df ≈ 2. Our findings can be [14] Shavitt Y. and Shir E., DIMES - Letting the Inter- applied to the study of epidemics. They imply that a net Measure Itself, http://www.arxiv.org/abs/cs.NI/ strong decay of the epidemic will happen in the boundaries 0506099. of human network, due to the low degree of nodes. [15] Cohen R. et al., Phys. Rev. Lett., 85 (2000) 4626. [16] Cohen R. et al., Phys. Rev. E, 66 (2002) 036113; Bornholdt S. and Schuster H. G. (Editors), Handbook ∗∗∗ of Graphs and Networks (Wiley-VCH) 2002, Chapt. 4. [17] Bunde A. and Havlin S., Fractals and Disordered System We thank ONR and Israel Science Foundation for (Springer) 1996. financial support. SVB thanks the Office of the Academic [18] Holyst J. et al., Phys. Rev. E, 72 (2005) 026108. 73 Affairs of Yeshiva University for funding the Yeshiva [19] Dorogovtsev S. N. et al., Phys. Rev. E, (2006) University high-performance computer cluster and 056122. [20] Harris T. E., Ann. Math. Stat., 41 (1948) 474; The acknowledges the partial support of this research through Theory of Branching Processes (Springer-Verlag, Berlin) the Dr. Bernard W. Gamson Computational Science 1963. Center at Yeshiva College. [21] Bingham N. H., J. Appl. Probab. A, 25 (1988) 215. [22] Braunstein L. A. et al., Int. J. Bifurcat. Chaos, 17 (2007) 2215; Phys. Rev. Lett., 91 (2003) 168701. REFERENCES [23] Newman M. E. J. et al., Phys. Rev. E, 64 (2001) 026118. [24] Weiss G. H., Aspects and Applications of the Random [1] Erdos˝ P. and Renyi´ A., Publ. Math., 6 (1959) 290; Publ. Walk (North Holland Press, Amsterdam) 1994. Math. Inst. Hung. Acad. Sci., 5 (1960) 17. [25] Dubuc S., Ann. Inst. Fourier, 21 (1971) 171. [2] Bollobas´ B., Random Graphs (Academic, London) 1985. [26] Barabasi´ A. L. et al., Phys. Rev. Lett., 76 (1996) 2192.

48004-p6