<<

RAPID COMMUNICATIONS

PHYSICAL REVIEW E 66, 015104͑R͒͑2002͒

Percolation in directed scale-free networks

N. Schwartz,1 R. Cohen,1 D. ben-Avraham,2 A.-L. Baraba´si,3 and S. Havlin1 1Minerva Center and Department of Physics, Bar-Ilan University, Ramat-Gan, Israel 2Department of Physics, Clarkson University, Potsdam, New York 13699-5820 3Department of Physics, University of Notre Dame, Notre Dame, Indiana 46556 ͑Received 1 May 2002; published 26 July 2002͒ Many complex networks in nature have directed links, a property that affects the network’s navigability and large-scale topology. Here we study the properties of such directed scale-free networks with correlated in and out distributions. We derive a phase diagram that indicates the existence of three regimes, determined by the values of the degree exponents. In the first regime we regain the known mean field exponents. In contrast, the second and third regimes are characterized by anomalous exponents, which we calculate analytically. In the third regime the network is resilient to random dilution, i.e., → the is pc 1.

DOI: 10.1103/PhysRevE.66.015104 PACS number͑s͒: 02.50.Cw, 05.50.ϩq, 64.60.Ak

Recently the topological properties of large complex net- nent ͑GWCC͒ and several finite components. In the GWCC works such as the Internet, the World Wide Web ͑WWW͒,an every site is reachable from every other, provided that the electric power grid, and cellular and social networks have links are treated as bidirectional. The GWCC is further di- drawn considerable attention ͓1,2͔. Some of these networks vided into a giant strongly connected ͑GSCC͒, are directed, for example, in social and economical networks consisting of all sites reachable from each other following if node A gains information or acquires physical goods from directed links. All the sites reachable from the GSCC are node B, it does not necessarily mean that node B gets similar referred to as the giant OUT component, and the sites from input from node A. Likewise, most metabolic reactions are which the GSCC is reachable are referred to as the giant IN one directional, thus changes in the concentration of mol- component. The GSCC is the intersection of the IN and OUT ecule A affect the concentration of its product B, but the components. All sites in the GWCC, but not in the IN and reverse is not true. Despite the directedness of many real OUT components, are referred to as the ‘‘tendrils’’ ͑see Fig. networks, the modeling literature, with few notable excep- 1͒. tions ͓3–5͔, has focused mainly on undirected networks. For a directed random network of arbitrary degree distri- An important property of directed networks can be cap- bution the condition for the existence of a tured by studying their , P( j,k), or the can be deduced in a manner similar to ͓9͔. If a site, b,is probability that an arbitrary node has j incoming and k out- reached following a link pointing to it from site a, then it going edges. Many naturally occurring directed networks, must have at least one outgoing link, on average, in order to such as the WWW, metabolic networks, citation networks, be part of a giant component. This condition can be written etc., exhibit a power-law, or scale-free degree distribution for as the incoming or outgoing links: Ϫ␭ ͒ϭ in(out) у ͑ ͒ ͗ ͉ → ͘ϭ ͑ ͉ → ͒ϭ ͑ ͒ Pin(out)͑l cl , l m, 1 kb a b ͚ kbP jb ,kb a b 1, 2 jb ,kb where m is the minimal connectivity ͑usually taken to be m ϭ1), c is a normalization factor, and ␭ are the in ͑out͒ where j and k are the in and out degrees, respectively, in(out) ͉ → degree exponents characterizing the network ͓6,7͔. An im- P( jb ,kb a b) is the conditional probability given that site ͗ ͉ → ͘ portant property of scale-free networks is their robustness to a has a link leading to b, and kb a b is the conditional random failures, coupled with an increased vulnerability to average. Using Bayes rule we get ͓ ͔ attacks 8–12 . Recently it has been recognized that this fea- P͑ j ,k ͉a→b͒ϭP͑ j ,k ,a→b͒/P͑a→b͒ ture can be addressed analytically in quantitative terms b b b b ͓ ͔ ϭ → ͉ ͒ ͒ → ͒ 9–11 by combining graph theoretical concepts with ideas P͑a b jb ,kb P͑ jb ,kb /P͑a b . from percolation theory ͓13–15͔. Yet, while the percolative properties of undirected networks are much studied, little is known about the effect of node failure in directed networks. As many important networks are directed, it is important to fully understand the implications to their stability. Here we show that directedness has a strong impact on the percolation properties of complex networks and we draw a detailed phase diagram. The structure of a has been characterized in ͓3,4͔, and in the context of the WWW in ͓7͔. In general, a directed graph consists of a giant weakly connected compo- FIG. 1. Structure of a general directed graph.

1063-651X/2002/66͑1͒/015104͑4͒/$20.0066 015104-1 ©2002 The American Physical Society RAPID COMMUNICATIONS

SCHWARTZ, COHEN, ben-AVRAHAM, BARABA´ SI, AND HAVLIN PHYSICAL REVIEW E 66, 015104͑R͒͑2002͒

For random networks P(a→b)ϭ͗k͘/(NϪ1) and P(a → ͉ ϭ Ϫ b jb ,kb) jb /(N 1), where N is the total number of nodes in the network. The above criterion thus reduces to ͓3,4͔ ͗ jk͘у͗k͘. ͑3͒

Suppose a fraction p of the nodes is removed from the network. ͑Alternatively, a fraction qϭ1Ϫp of the nodes is retained.͒ The original degree distribution, P( j,k), becomes ϱ FIG. 2. Phase diagram of the different regimes for the IN com- j0 Ϫ ponent of scale-free correlated directed networks. The boundary PЈ͑ j,k͒ϭ ͚ P͑ j ,k ͒ͩ ͪ ͑1Ϫp͒ jp j0 j 0 0 j between resilient and anomalous exponents is derived from Eq. ͑9͒ j0 ,k0 while that between anomalous exponents and mean field exponents ͑ ͒ ␭Ãϭ k0 Ϫ is given by Eq. 24 for 4. For the diagram of the OUT com- ϫͩ ͪ ͑1Ϫp͒kpk0 k. ͑4͒ ␭ ␭ k ponent in and out change roles.

Ϫ␭ Ϫ␭ ͑ Ϫ ͒ in out In view of this new distribution, Eq. ͑3͒ yields the percola- 1 A Bcinj coutk Ϫ␭ tion threshold ϩ out␦ ϭ P͑ j,k͒ϳͭ BAcoutk j, j(k) , j 0 ͑8͒ ” Ϫ␭ ͗k͘ ͑1ϪB͒c k out, jϭ0, ϭ Ϫ ϭ ͑ ͒ out qc 1 pc , 5 ͗ jk͘ ␭ Ϫ ␭ Ϫ where j(k)ϭk( out 1)/( in 1). With this distribution, any fi- nite fraction BA of fully correlated sites yields a diverging where averages are computed with respect to the original ͗ jk͘ whenever distribution before dilution, P( j,k). Equation ͑5͒ indicates ͑␭ Ϫ2͒͑␭ Ϫ2͒р1, ͑9͒ that in directed scale-free networks if ͗ jk͘ diverges then qc out in →0 and the network is resilient to random breakdown of causing the percolation threshold to vanish ͑see Fig. 2͒. nodes and bonds. In the case of no correlations between the in and the out The term ͗ jk͘ may be dramatically influenced by the ap- degrees, Aϭ0, Eq. ͑8͒ becomes P( j,k)ϭP ( j)P (k). pearance of correlations between the in and out degrees of in out Then the condition for the existence of a giant component is the nodes. In particular, let us consider scale-free distribu- ͗k͘ϭ͗ j͘ϭ1. Moreover, Eq. ͑5͒ reduces to tions for both the in and out degrees Ϫ␭ 1 in ϭ ϭ Ϫ ϭ ͑ ͒ Bcinj , j 0 qc 1 pc ͗ ͘ . 10 P ͑ j͒ϳͭ ” ͑6͒ k in 1ϪB, jϭ0 Applying Eq. ͑10͒ to scale-free networks one concludes that ␭ Ͼ ␭ Ͼ and for out 2 and in 2 a exists at a finite Ϫ␭ qc . Here we concern ourselves with the critical exponents ͑ ͒ϭ out ͑ ͒ Pout k coutk . 7 associated with the percolation transition in scale-free net- ␭ Ͼ ␭ Ͼ work of out 2 and in 2 which is the most relevant re- In Eq. ͑6͒ we choose to add the possible zero value to the in gime ͑Fig. 2͒. degree in order to maintain ͗ j͘ϭ͗k͘. If the in and out de- Percolation of the GWCC can be seen to be similar to grees are uncorrelated, we expect ͗ jk͘ϭ͗ j͗͘k͘. For several percolation in the non-directed graph created from the di- real directed networks this equality does not hold. For ex- rected graph by ignoring the directionality of the links. The ample, the network of Notre Dame University WWW ͓6͔ has threshold is obtained from the criterion ͓9͔ ͗ ͘ϭ͗ ͘Ϸ ͗ ͗͘ ͘ϭ k j 4.6, and thus j k 21.16. In contrast, measur- ͗k͘ ing directly we find ͗ jk͘Ϸ200, about an order of magnitude ϭ ͑ ͒ qc ͗ ͑ Ϫ ͒͘ . 11 larger than the result expected for the uncorrelated case. This k k 1 yields an estimate of q Ϸ0.02, i.e., a very stable directed c Here the connectivity distribution is the convolution of the in network. We obtained similar results also for some metabolic and out distributions networks ͓16͔, indicating that in real directed networks, the k in and out degrees are correlated. PЈ͑k͒ϭ͚ P͑l,kϪl͒. ͑12͒ To address correlations, we model it in the following lϭ0 manner: we first generate the j values for the entire network. ϭ Ј Next, for each site with j ” 0 with probability A we generate Regardless of correlations, P (k) is always dominated by k fully correlated with j, i.e., kϭk( j). Assuming that k( j)is the slower decay exponent, therefore percolation of the a monotonically increasing function then the requirement GWCC is the same as in nondirected scale-free networks, Ϫ␭ Ϫ␭ out ϭ in coutk dk cinj dj—needed to maintain the distribu- with ␭ ϭmin(␭ ,␭ ). Note that the percolation threshold ␭ Ϫ ␭ Ϫ eff in out tions scale-free—leads to k out 1ϭ j in 1. With probability of the GWCC may differ from that of the GSCC and the IN 1ϪA, the degree k is chosen independently from j, and OUT components ͓4͔.

015104-2 RAPID COMMUNICATIONS

PERCOLATION IN DIRECTED SCALE-FREE NETWORKS PHYSICAL REVIEW E 66, 015104͑R͒͑2002͒

͓ ͔ We now use the formalism of generating functions 17,18 H0(1) is the probability to reach an outgoing component to analyze percolation of the GSCC and IN and OUT com- of any finite size choosing a site. Thus, below the percolation ͓ ͔ ϭ ponents. In 3,4 a generating function is built for the joint transition H0(1) 1, while above the transition there is a probability distribution of outgoing and incoming degrees, finite probability to follow a directed link to a site which is a ϭ Ϫ before dilution: root of an infinite outgoing component: Pϱ 1 H0(1). It follows that j k ⌽͑x,y͒ϭ͚ P͑ j,k͒x y . ͑13͒ ϱ k, j ͑ ͒ϭ ͩ Ϫ ͑ ͒ k ͪ ͑ ͒ Pϱ q q 1 ͚ Pout k u , 21 Using the approach of Callaway et al. ͓10͔ let q( j,k) be the k probability that a of degree ( j,k) remains in the net- where uϵH (1) is the smallest positive root of work following dilution. The generating function after dilu- 1 q tion is then ϭ Ϫ ϩ ͒ϩ Ϫ ͒ ͒ k u 1 q ͚ ͓BAj͑k ͑1 A ͗ j͔͘Pout͑k u . ͗ j͘ k G͑x,y͒ϭ͚ P͑ j,k͒q͑ j,k͒x jy k. ͑14͒ ͑22͒ k, j Here Pϱ(q) is the fraction of sites from which an infinite ͑ ͒ From Eq. 14 it is possible to define the generating function number of sites is reachable. Equation ͑22͒ can be solved for the outgoing degrees G0 : numerically and the solution may be substituted into Eq. ͑21͒, yielding the size of the IN component at dilution p ͒ϵ ͒ϭ ͒ ͒ k ͑ ͒ G0͑y G͑1,y ͚ P͑ j,k q͑ j,k y . 15 ϭ Ϫ k, j 1 q. Near criticality, the probability to start from a site and ϳ Ϫ ␤ The probability of reaching a site by following a specific link reach a giant outgoing component follows Pϱ (q qc) . is proportional to jP( j,k), therefore, the probability of For mean-field systems ͑such as infinite-dimensional sys- reaching an occupied site following a specific directed link is tems, random graphs, and Cayley trees͒ it is known that ␤ generated by ϭ1 ͓19͔. This regular mean-field result is not always valid. Instead, following ͓20͔ we study the behavior of Eq. ͑22͒ k ϭ ϭ ͚ jP͑ j,k͒q͑ j,k͒y near q qc ,u 1, and find j,k G ͑y͒ϭ . ͑16͒ 1 à 1 , 2Ͻ␭ Ͻ3 ͚ jP͑ j,k͒ 3Ϫ␭à j,k ␤ϭ ͑ ͒ 1 à 23 Let H (y) be the generating function for the probability , 3Ͻ␭ Ͻ4 1 Ά ␭ÃϪ of reaching an outgoing component of a given size by fol- 3 à lowing a directed link, after a dilution. H1(y) satisfies the 1, ␭ Ͼ4, self-consistent equation: where H ͑y͒ϭ1ϪG ͑1͒ϩyG H ͑y͒ . ͑17͒ 1 1 1„ 1 … ␭ Ϫ␭ ␭Ãϭ␭ ϩ in out ͑ ͒ out ␭ Ϫ . 24 Since G0(y) is the generating function for the outgoing de- in 1 gree of a site, the generating function for the probability that n sites are reachable from a given site is We see that the order parameter exponent ␤ attains its usual à ͒ϭ Ϫ ͒ϩ ͒ ͑ ͒ mean-field value only for ␭ Ͼ4.As ␭ →␭ the correlated H0͑y 1 G0͑1 yG0 H1͑y . 18 out in „ … fraction BA of sites resembles nondirected networks ͓20,21͔ For the case where correlations exist, and assuming random ͑where there is no distinction between incoming and outgo- ͒ ␭Ãϭ␭ ϭ␭ dilution: q( j,k)ϭq, Eqs. ͑17͒ and ͑18͒ reduce to ing degrees . In this case we get out in for any amount of correlation A. The criterion for the existence of a qy ͒ϭ Ϫ ϩ ͒ϩ Ϫ ͒ giant component is then ͗k2͘/͗k͘ϭ1, and not 2 as in the H1͑y 1 q ͚ ͓BAj͑k ͑1 A ͗ j͘ k nondirected case. The difference stems from the fact that in the nondirected case one of the links is used to reach the site, ϫ͗ j͔͘P ͑k͒H ͑y͒k ͑19͒ out 1 while in the directed case there is generally no correlation and between the location of the incoming and outgoing links. Therefore, one more outgoing link is available for leaving ͒ϭ Ϫ ϩ ͒ ͒k ͑ ͒ the site. H0͑y 1 q qy͚ Pout͑k H1͑y . 20 k Without any correlations, Aϭ0, different terms prevail in the analysis and → ϭ If A 0, one expects that H0(y) H1(y), since there is no 1 2Ͻ␭ Ͻ3 correlation between j and k, thus the probability to have k ␭ Ϫ , out ␤ϭͭ out 2 ͑25͒ outgoing edges is Pout(k) whether we choose the site ran- ␭ Ͼ domly or weighted by the incoming edges j. 1, out 3.

015104-3 RAPID COMMUNICATIONS

SCHWARTZ, COHEN, ben-AVRAHAM, BARABA´ SI, AND HAVLIN PHYSICAL REVIEW E 66, 015104͑R͒͑2002͒

␭à ϭ͚ s ͓ ͔ TABLE I. Values of for the different network components for erated by H0, where H0(y) sp(s)y .Asin 20 , H0(y) both correlated and uncorrelated cases. can be expanded from Eq. ͑18͒. In the presence of correla- tions we find Uncorrelated Correlated 1 ϩ Ͻ␭ÃϽ 1 à , 2 4 GWCC min(␭ ,␭ )ϩ1 min(␭ ,␭ ) ␭ Ϫ2 out in out in ␶ϭ ͑27͒ ␭ Ϫ␭ in out Ά 3 à IN ␭ ϩ1 ␭ ϩ ␭ Ͼ out out ␭ Ϫ , 4. in 1 2 ␭ Ϫ␭ à OUT ␭ ϩ1 ␭ ϩ out in The regular mean-field exponents are recovered for ␭ Ͼ4. in in ␭ Ϫ out 1 For the uncorrelated case we get ␭ ␭ ϩ ␭ ␭ GSCC min( out , in) 1 min( out* , in*) 1 1ϩ , 2Ͻ␭ Ͻ3 ␭ Ϫ1 out ␶ϭ out ͑ ͒ à ͭ 28 This is the same as Eq. ͑23͒ but with ␭ ϭ␭ ϩ1. 3 out , ␭ Ͼ3. The GSCC is the intersection of the IN and OUT compo- 2 out nents. Therefore, it behaves as the smaller of the two com- ␤ ϭ ␤ ␤ Now the regular mean-field results are obtained for ␭Ͼ3. ponents: GSCC max( in , out). This can be also derived by applying the same methods as for the IN and OUT compo- In summary, we calculate the percolation properties of nents to the generating function of the GSCC obtained in ͓4͔. directed scale-free networks. We find that the percolation The exponent for the GWCC, on the other hand, is indepen- critical exponents in scale-free networks are strongly depen- dent upon the existence of correlations and upon the degree dent of the exponents of the other components, since the à transition point is different. distribution exponents in the range of 2Ͻ␭ Ͻ4. This regime It is known that for a of arbitrary degree characterizes most naturally occurring networks, such as distribution the finite clusters follow the scaling form metabolic networks or the WWW. The regular mean-field behavior of percolation in infinite dimensions is recovered only for ␭ÃϾ4. A connection is found between nondirected ͑ ͒ϳ Ϫ␶ Ϫs/s* ͑ ͒ n s s e , 26 and directed scale-free percolation exponents for any finite correlation between the in and out degrees. In the uncorre- where s is the cluster size and n(s) is the number of clusters Ϫ␴ lated case, i.e. P( j,k)ϭP ( j)P (k), the probability to of size s. At criticality s*ϳ͉qϪq ͉ diverges and the tail of in out c reach an outgoing component does not bear any dependence the distribution follows a power law. The probability that s sites can be reached from a site by upon Pin( j). The results are summarized in Table I. Ϫ␶ following links at criticality follows p(s)ϳs , and is gen- Support from the NSF is gratefully acknowledged ͑D.bA͒.

͓1͔ R. Albert and A.L. Baraba´si, Rev. Mod. Phys. 74,47͑2002͒. Lett. 86, 3682 ͑2001͒. ͓2͔ S.N. Dorogovtsev and J.F.F. Mendes, Adv. Phys. 51, 1079 ͓12͔ R.V. Sole and J.M. Montoya, Proc. R. Soc. London, Ser. B ͑2002͒. 268, 2039 ͑2001͒. ͓3͔ M.E.J. Newman, S.H. Strogatz, and D.J. Watts, Phys. Rev. E ͓13͔ D. Stauffer and A. Aharony, Introduction to Percolation 64, 026118 ͑2001͒. Theory, 2nd ed. ͑Taylor and Francis, London, 1991͒. ͓4͔ S.N. Dorogovtsev, J.F.F. Mendes, and A.N. Samukhin, Phys. ͓14͔ and Disordered System, edited by A. Bunde, and S. ͑ ͒ Rev. E 64, 025101R 2001 . Havlin ͑Springer, New York, 1996͒. ͓ ͔ ´ ´ ´ 5 A.D. Sanchez, J.M. Lopez, and M.A. Rodrıguez, Phys. Rev. ͓15͔ H. Hinrichsen, Adv. Phys. 49, 815 ͑2000͒. ͑ ͒ Lett. 88, 048701 2002 . ͓16͔ H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Bara- ͓ ͔ ´ 6 A.-L. Barabasi, R. Albert, and H. Jeong, Physica A 281,2115 ba´si, Nature ͑London͒ 407, 651 ͑2000͒. ͑2000͒. ͓17͔ H.S. Wilf, Generatingfunctionology, 2nd ed. ͑Academic Press, ͓7͔ A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajago- London, 1994͒. palan, R. Stata, A. Tomkins, and J. Wiener, Comput. Netw. 33, ͓18͔ G.H. Weiss, Aspects and Applications of the 309 ͑2000͒. ͑North-Holland, Amsterdam, 1994͒. ͓8͔ R. Albert, H. Jeong, and A.L. Baraba´si, Nature ͑London͒ 406, ͓19͔ P. Frojdh, M. Howard, and K.B. Lauritsen, Int. J. Mod. Phys. B 6794 378 ͑2000͒; 406, 378 ͑2000͒. ͑ ͒ ͓9͔ R. Cohen, K. Erez, D. ben-Avraham, and S. Havlin, Phys. Rev. 15, 1761 2001 . ͓ ͔ Lett. 85, 4626 ͑2000͒. 20 R. Cohen, D. ben-Avraham, and S. Havlin, e-print ͓10͔ D.S. Callaway, M.E.J. Newman, S.H. Strogatz, and D.J. Watts, cond-mat/0202259. Phys. Rev. Lett. 85, 5468 ͑2000͒. ͓21͔ R. Pastor-Satorras and A. Vespignani, Phys. Rev. E 63, 066117 ͓11͔ R. Cohen, K. Erez, D. ben-Avraham, and S. Havlin, Phys. Rev. ͑2001͒.

015104-4