The friendship paradox in real and model networks

George T. Cantwell,1, 2, ∗ Alec Kirkley,2 and M. E. J. Newman2, 3 1Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, New Mexico 87501, USA 2Department of Physics, University of Michigan, Ann Arbor, Michigan 48109, USA 3Center for the Study of Complex Systems, University of Michigan, Ann Arbor, Michigan 48109, USA The friendship paradox is the observation that the degrees of the neighbors of a node in any network will, on average, be greater than the of the node itself. In common parlance, your friends have more friends than you do. In this paper we develop the mathematical theory of the friendship paradox, both in general as well as for specific model networks, focusing not only on average behavior but also on variation about the average and using generating function methods to calculate full distributions of quantities of interest. We compare the predictions of our theory with measurements on a large number of real-world network data sets and find remarkably good agreement. We also develop equivalent theory for the generalized friendship paradox, which compares characteristics of nodes other than degree to those of their neighbors.

I. INTRODUCTION ful of outliers? How much variation is there? For how many people will the difference be positive? The answers to such questions all depend on the specific details of the It may appear to you that your friends are more pop- network under study. ular than you are, and if so, you may well be right. It is In this paper we develop the theory of the friendship however not necessarily your fault. It is, rather, a natu- paradox in both its original and generalized versions. We ral consequence of network structure. Feld [1] has shown derive expressions for the full distribution of differences that in any network the average degree (i.e., the number in three progressively more complex network models and of neighbors) of the neighbor of a node is strictly greater find that the observed distributions in real-world net- than the average degree of nodes in the network as a works are in good agreement with our theoretical results. whole. Applied to networks of friendship, this implies that on average your friends have more friends than you do. This phenomenon is known as the friendship paradox. II. THE FRIENDSHIP PARADOX A related phenomenon, the generalized friendship para- dox, describes similar behavior with respect to other at- Informally, the friendship paradox states that people’s tributes of network nodes [2]. Are your friends richer friends tend to be more popular than they themselves than you, for instance, or smarter, or more attractive? are. Stated a little more precisely, nodes in a network Generalized friendship paradoxes arise when such at- tend to have lower degree than their neighbors do. Con- tributes are correlated with node degree. If richer people sider an undirected network of n nodes, labeled by inte- are on average also more popular, then wealth and popu- gers i = 1, . . . , n. For any given node i we can compute larity will be positively correlated and hence the tendency the difference ∆i between the average of its neighbors’ for your friends to be more popular than you could mean degrees and its own degree: they are also richer. Examples of generalized friendship 1 X paradoxes occur for instance with citation counts in col- ∆ = A k − k , (1) i k ij j i laboration networks (your collaborators have more cita- i j tions than you do) [3] and viral content dissemination in online social networks (your online friends receive more where Aij is an element of the adjacency matrix and

arXiv:2012.03991v1 [cs.SI] 7 Dec 2020 P viral content than you do) [4]. ki = j Aij is the degree of node i. (The value of ∆i is undefined for nodes with degree zero, since they have no Understood as a mathematical statement about net- neighbors. We will assume that there are no such nodes work averages, the friendship paradox occurs in all net- in the network, or that all of them have been removed works. How the effect is manifested, however, depends on before the analysis.) the details of a network’s structure. For any given per- One statement of the friendship paradox is that the son we can measure the difference between how popular average of ∆i across all nodes is greater than zero, which their friends are and how popular they are. The results of can be proven as follows. We write the average as Feld [1] tell us that this difference must be positive on av- n erage, but to accurately characterize the effect we should 1 X 1 X 1 X  consider the entire distribution of differences. How large ∆i = Aijkj − ki n n ki will the differences be? Is their average driven by a hand- i=1 i j     1 X kj 1 X kj = A − A = A − 1 . n ij k ij n ij k ij i ij i ∗Email address: [email protected] (2) 2

Exchanging the summation variables i and j and adding that generalized friendship paradoxes arise when an at- the result to Eq. (2), this result can also be written tribute x is positively correlated with degree—it turns out that correlation with the more complex quantity κ n   1 X 1 X kj ki is the crucial behavior. Degree, however, is always posi- ∆ = A + − 2 n i 2n ij k k tively correlated with κ since, combining Eqs. (3) and (8), i=1 ij i j P we have Cov(k, κ) = (1/n) ∆i ≥ 0. While it is not r r 2 i 1 X kj ki mathematically guaranteed, in practice we therefore ex- = Aij − ≥ 0. (3) 2n ki kj pect properties that correlate with degree to also corre- ij late with κ and hence in such situations we can reason- ably expect to observe a generalized friendship paradox. The exact equality holds only when ki = kj for all pairs of neighboring nodes, i.e., when the network is a regular graph (or, technically, when every component is a regular graph). In all other cases, the average difference between III. MODELS the mean degree of a node’s neighbors and its own degree is strictly greater than zero. While these results are illuminating, there is more to For the generalized friendship paradox, which consid- be said on this topic. We have shown that the network ers attributes other than degree, one can define an anal- average of ∆i will always be greater than zero, but we (x) ogous quantity ∆i for an attribute x according to have not said by how much or what the value depends on. Nor have we considered how ∆i varies across a network. (x) 1 X As an example (albeit a somewhat contrived one), con- ∆ = Aijxj − xi, (4) i k sider a 1000-node network consisting of a i j with one edge removed. In such a network almost all of which measures the difference between the average of the the nodes—998 of them—have a degree larger than the attribute for node i’s neighbors and the value for i itself. average of their neighbors, but there are two outliers that When the average of this quantity over all nodes is posi- substantially skew the distribution so that over the whole tive one may say that the generalized friendship paradox network nodes are still less popular than their neighbors holds. In contrast to the case of degree, this is not always on average. In this case, therefore, the average is not very (x) informative. To understand the friendship paradox fully true—the value of ∆i can be zero or negative—but we can write the average as we need to move beyond statements about averages. In doing so, however, the specific structure of the net-   1 X (x) 1 X 1 X work becomes important. A common way to study the ∆i = Aijxj − xi n n ki effects of structure is to examine the behavior of model i i j networks and this is the approach we take in the re-   1 X X Aij = x − x , (5) mainder of this paper, considering the friendship para- n i k i dox for three successively more complex network models, i j j the Poisson random graph (sometimes called the Erd˝os– where the second line again follows from interchanging R´enyi model) [5–7], the configuration model [8, 9], and a summation indices. Defining the new quantity more sophisticated model that incorporates degree cor- relations. X Aij κ = (6) i k j j A. Poisson random graph and noting that The Poisson random graph is the simplest of random 1 X 1 X Aij 1 X 1 X κi = = Aij = 1, (7) graph models and one of the most widely studied. In n n k n k i ij j j j i this model one takes n nodes and connects each pair in- dependently with some probability p. When the net- we can then write work is large and sufficiently sparse—when n → ∞ and 1 X (x) 1 X 1 X 1 X p = λ/(n − 1) with λ growing slower than n so that ∆ = x κ − x κ n i n i i n i n i p → 0—the degrees are Poisson distributed with mean λ, i i i i meaning that the probability pk of a node having degree k = Cov(x, κ). (8) is

Thus we will have a generalized friendship paradox in the λk p = e−λ. (9) sense defined here if (and only if) x and κ are positively k k! correlated. This is perhaps not exactly the result one might Let us compute the complete distribution within this have anticipated when, in the introduction, we argued model of the quantity ∆i defined in Eq. (1). To compute 3 this distribution, we note that X P (∆) = pkP (∆|k), (10) k 0.015

where P (∆|k) denotes the probability that a node has )

∆ 0.010 value ∆ given that it has degree k. For given ∆ and k, ( 10 0 10

Eq. (1) tells us that the sum of the neighboring degrees P P 2 of the node is j Aijkj = k∆ + k . Let us denote this sum by K. Then we have 0.005

P (∆|k) = P (K = k∆ + k2|k). (11) 0.000 10 5 0 5 10 We know that K is an integer with K ≥ k, since each of ∆ the node’s k neighbors necessarily has degree at least 1. Hence the allowed values of ∆ that satisfy K = k∆ + k2 must be rational numbers of the form ∆ = 1 − k + m/k FIG. 1: The distribution of the quantity ∆, defined in Eq. (1), where m is a non-negative integer. for a random graph with mean degree 8. The main panel To evaluate (11) we need to know the distribution of shows the probability mass function while the inset shows a P traditional histogram of the distribution. the sum K = j Aijkj of neighbor degrees. While nodes in general follow the pk, the degrees of neighbors follow a modified distribution. A neighbor, of equally spaced peaks and the overall resulting distri- by definition, is a node arrived at by following an edge bution is quite complicated—both widely dispersed and and each node of degree k is at the end of k edges, so the jagged, even for this simple network model. Note also degree distribution for nodes at the ends of edges goes not that a significant fraction of nodes have ∆ < 0, meaning as p but as kp , which after appropriate normalization k k that they do not satisfy the traditional definition of the gives a probability distribution q for neighbor degrees of k friendship paradox—they have more friends than their the form average neighbor does. kpk It is also possible to calculate the mean and variance qk = P . (12) j jpj of ∆ for the Poisson random graph. We find that

For the Poisson degree distribution of Eq. (9), we then E[∆] = 1; Var(∆) = λ(1 + E[k−1]). (16) find that (Note that again we remove any nodes of degree zero k −λ k−1 kλ e λ −λ before computing E[k−1].) While the expected value of ∆ qk = = e . (13) λk! (k − 1)! is always 1, the variance is larger than the mean degree λ, meaning that more and more nodes have ∆ < 0 as λ In other words qk is also a Poisson distribution, but becomes large, with the fraction tending to a half. For shifted by one, meaning that k − 1 is a Poisson variable example, in Fig. 1, where the mean degree is 8, around with mean λ. 35% of nodes have degree larger than that of their average We can write the sum K as neighbor. This rises to 44% when the mean degree is 64, X X and 49% for a mean degree of 1024. For large λ therefore, K = A k = k + A (k − 1). (14) ij j i ij j no meaningful “friendship paradox” applies. Here, as is j j often the case, looking only at the average value of the P distribution is misleading when there is large variation. But j Aij(kj − 1) ∼ Poisson(kiλ), since the sum of the Poisson variables (kj − 1) is itself a Poisson random variable with mean a factor of ki greater. Hence B. The configuration model 2 (kλ)k∆+k −k P (∆|k) = P (K = k∆ + k2|k) = e−kλ. (k∆ + k2 − k)! The random graph of the previous section is in many (15) respects not a realistic model. In particular, as we have For each value of k, this gives us a distribution over a noted, it has a Poisson degree distribution, which is very discrete set of equally spaced values ∆ = 1−k+m/k with different from the broad degree distributions seen in typ- m = 0 ... ∞, and the full distribution, Eq. (10), is a linear ical real-world networks [10, 11]. We can address this combination of an infinite number of such distributions. shortcoming by using a more sophisticated random graph Figure 1 shows an example of the resulting distribution model that allows for arbitrary degree distributions, the of ∆ computed from Eq. (10) for the case λ = 8. The so-called configuration model [8, 9]. In this model one individual discrete distributions are clearly visible as sets fixes the degree of each of the nodes and then draws a 4 network at random from the set of all networks with the consider the Laplace transform for the distribution of ∆ given degrees. for fixed degree, then we average over degree. Calculations on networks such as the configuration If node i has degree ki, then from Eq. (1) ∆i is model can be greatly simplified by using generating func-   tion methods [12, 13]. For instance, many properties of X kj ∆ = A − 1 . (23) the model can be expressed in terms of the generating i ij k j i function for the degree distribution pk, defined by

∞ Each kj is the degree of a neighbor node which, as before, X k f(z) = pkz . (17) is a random quantity distributed according to qk ∝ kpk. k=0 Let G(s) be the Laplace transform for this distribution: We employ this approach here too, but there is a catch, in ∞ X −sk that generating functions are normally applied to distri- G(s) = qke . (24) butions over integer quantities, like the degree, but the k=0 quantity ∆, whose distribution is our main focus here, Then from Eq. (21) the Laplace transform for kj/ki −1 is can take non-integer values. To allow for this, we make s use of the (two-sided) Laplace transform e G(s/ki) and, since each of the ki nonzero terms in the sum of Eq. (23) is independent, the Laplace transform ∞ s ki Z for the full sum is [e G(s/ki)] by Eq. (22). F (s) = p(x) e−sx dx, (18) This is for a node of degree ki. Since the Laplace trans- −∞ form is linear, we can now simply average over degree to which is the standard extension of the generating func- compute the transform for the full distribution of ∆: tion to a variable x on the real line. X sk k However, the distribution of ∆ is not continuous- F∆(s) = pk e G(s/k) . (25) valued either. It is nonzero on a dense set of rational k values but zero everywhere else—see Fig. 1. To allow for this, we consider functions p(x) that are equal to a Given any degree distribution pk we can use this equation discrete sum of Dirac delta functions. For the degree to compute F∆(s). Inverting the Laplace transform then gives us the density function ρ(x) of ∆ itself: distribution pk, for instance, we would define

∞ X X ρ(x) = P (∆)δ(x − ∆) p(x) = pkδ(x − k). (19) ∆ k=0 Z +∞ 1 isx = F∆(is) e ds, (26) With this definition Eqs. (17) and (18) are essentially 2π −∞ equivalent, since where the sum over ∆ is a sum over all rational Z ∞ X X numbers—all possible values of ∆. F (s) = p δ(x − k) e−sx dx = p e−sk k k Since P (∆) is a rather complicated object, it is in prac- −∞ k k −s tice simpler to integrate ρ(x) to compute the probabil- = f(e ), (20) ity that ∆ falls between any two values—in other words a histogram of ∆. In fact, we can do something more but Eq. (18) also allows for quantities like ∆ that have sophisticated: we can use any kernel we like to calcu- non-integer values. late a kernel density estimate of the distribution of ∆. The Laplace transform has several properties that will For a general kernel function κ(x) with Laplace trans- be useful for our purposes. First, if x is a random variable form Fκ(s) we have whose distribution has Laplace transform Fx(s), then the Laplace transform for the distribution of ax + b is X ρκ(x) = P (∆)κ(x − ∆) −bs ∆ Fax+b(s) = e fx(as). (21) Z +∞ 1 isx Second, if x and y are independent random variables, = F∆(is)Fκ(is) e ds. (27) 2π −∞ then the Laplace transform for their sum is the product of the Laplace transforms for x and y alone: A conventional histogram is equivalent to using a rect- angular (“top hat”) kernel, but for our figures we use Fx+y(s) = Fx(s)Fy(s). (22) a smoother double-exponential kernel (also known as a Laplace distribution): These two results now allow us to calculate F∆(s), the Laplace transform for P (∆). We follow essentially the 1 κ(x) = e−|x|/b, (28) same logic as we did for the Poisson random graph: we 2b 5

The calculation of the distribution of ∆ proceeds in a similar manner to that for the configuration model. If we follow an edge that begins at a node of degree k, it will end up at a node of degree j with probability Qjk/qk, meaning that the Laplace transform for the degrees of neighbors is

X Qjk G (s) = e−sj. (31) k q j k Note that this function depends on the degree k of the node at which we started. Nevertheless, the calcula- tion proceeds essentially as before. The equivalent of Eq. (25) is

X sk k FIG. 2: The distribution of ∆i for three configuration F∆(s) = pk e Gk(s/k) , (32) models with the truncated power-law degree distribution k −α −βk pk ∝ k e and different choices of the parameters α and β. ∆i can take only rational values, but for clarity we and the distribution of ∆ can be calculated from F∆(s) show a kernel density estimate of the distribution, calculated using Eq. (27). from Eq. (27) with a Laplace distribution kernel. The values Degree correlations are commonly quantified using an of α and β were chosen so that the mean degree is always 8, assortativity coefficient r, defined as the Pearson corre- while the variance is 8, 64, or 256 as indicated. lation coefficient of degrees across edges [14]. In terms of the quantities defined here, P where the parameter b sets the width of the distribution. jk(Qjk − qjqk) r = jk , (33) (We arbitrarily pick b = 1/3.) The Laplace transform for 2 σq this choice of κ(x) is 2 where σq is the variance of the distribution qk. To study 1 how the effects of the friendship paradox vary with vary- Fκ(s) = . (29) 1 − b2s2 ing r it is convenient to define a model network that allows us to adjust r, in effect a correlated version of the Figure 2 shows the distribution of ∆ computed in this configuration model, parameterized by its degree distri- way for configuration models with the truncated power- bution and a single extra parameter controlling the as- law degree distribution sortativity. This means choosing a suitable value of Qjk, −α −βk which we do by maximizing the entropy pk ∝ k e (30) X and three different choices of the parameters α and β. S(Q) = − Qjk log Qjk (34) Each example has the same mean degree but, as the fig- jk ure shows, the distributions of ∆ are quite different. P P subject to the constraints j Qjk = qk = kpk/ j jpj P 2 for all k and jk jk(Qjk − qjqk)/σq = r. The maximum C. Random graphs with degree correlations entropy distribution is often considered to be the least biased choice for a given set of constraints, meaning that it makes no assumptions other than those implied by the The configuration model of the previous section im- constraints themselves. proves on the Poisson random graph by allowing arbi- The maximum entropy solution for Q in this case is trary distributions of node degrees, but like the random jk graph it lacks any correlation or assortativity between the eγjk Q = q q , (35) degrees of adjacent nodes. Such correlation is common jk Z Z j k in real-world networks [14–16] and will clearly impact j k friendship paradox phenomena. where γ is a Lagrange multiplier whose value controls the One can create a model network with degree correla- assortativity and the Zk are normalizing constants that tions by fixing not only the degree distribution pk as in satisfy the equations the configuration model, but the joint distribution of ad- q eγjk jacent degrees Qjk, which is the fraction of edges that X j Zk = , (36) join nodes of degrees j and k [14]. Note that the dis- Z j j tribution of degrees at the end of an edge is then given P by qk = j Qjk, so that fixing Qjk also fixes the degree which we solve numerically by iteration. This model de- distribution. fines an ensemble of random networks with a desired level 6

0.90 20 E[∆] P (∆ > 0) 15 0.85 ) 0 ] > ∆ [ ∆

E 10 0.80 ( P

5 0.75

FIG. 3: Distribution for ∆ in a random graphs with pk de- 2 0.6 0.3 0.0 0.3 0.6 fined by Eq. (30). The mean degree hki = 8 and σk = 64. r Three different values of γ were chosen so that the assorta- tivity coefficient r takes the values 0, 0.5, and −0.5.

FIG. 4: Two measures of the magnitude of the friendship of assortativity and allows us to study the generic ef- paradox as a function of assortativity. The degree distribution fects of assortativity on network properties, including the is that of Eq. (30) with α and β chosen so that hki = 8 and σ2 = 64. friendship paradox. k Figure 3, for example, shows the probability distribu- tion of ∆ for fixed degree distribution and three different ison, the standard configuration model, which fixes the choices of the assortativity coefficient r. Note how the av- 2 erage value of ∆ decreases as the networks become more degree distribution only, gives R values of 0.77 and 0.95. assortative, but so too does the variance, leading to a Thus it would be fair to say that the distribution of ∆ more complex picture. This is a good example of why we is fairly accurately captured by the degree distribution should be wary of conclusions based on the average value alone, but that the inclusion of assortativity results in a of ∆ alone. significant improvement. Figure 4 sheds more light on this point, showing the average value along with the expected fraction P (∆ > 0) of nodes with positive ∆, both as a function of r. Both E. Generalized friendship paradox of these quantities can be viewed as measures of the strength of the friendship paradox, but they behave in Before finishing let us return to the generalized friend- different ways. While E[∆] does indeed decrease mono- ship paradox. Recall that for any quantity x defined on tonically with assortativity, the fraction of nodes with the nodes of a network the quantity ∆(x), Eq. (4), mea- positive ∆—in effect, the fraction of nodes that display i sures the difference between the average value of x at the classic friendship paradox behavior—peaks at small node i’s neighbors and i’s own value. We can compute the negative r, and has lower values for both large positive distribution of ∆(x), for instance for the degree-correlated and large negative r. model of Section III C, using the Laplace transform for- malism again. The argument differs from previous devel- D. Comparison with real-world networks opments in some details but remains conceptually simi- lar. One first writes the Laplace transform for the distri- bution of a node’s value of x given its degree k How good a guide are these model calculations to the behavior of real-world networks? To shed light on this Z ∞ −sx question we compare the mean and variance of ∆ in 32 Hk(s) = P (x|k) e dx, (37) real-world social networks with values calculated from −∞ the assortative network model of the previous section with the same r. (The mean and variance in the model where P (x|k) is the probability that a node of degree k can be computed from the first and second derivatives of has value x. Since the neighbors of a degree-k node have Eq. (32).) The results are shown in Fig. 5 and, as we can degree distributed as Qjk/qk, they have a distribution of see, there is remarkably good agreement between theo- x values with Laplace transform retical and empirical results—we find R2 values of 0.93 X Qkj and 0.99 between theory and experiment for the mean G (s) = H (s). (38) k q j and standard deviation of ∆ respectively. By compar- j k 7

will cause the resulting sample of the population to be 25 biased and introduce systematic errors [37]. The formal- ism developed here allows us to quantify these effects µ∆ 24 not only in terms of the average individual but in terms σ∆ of the complete distribution of outcomes over the entire 23 population.

22 IV. CONCLUSIONS 21 In this paper we have quantified the friendship and

Maximum entropy model generalized friendship paradoxes in terms of the differ- 1 2 3 4 5 ence ∆ between the characteristics of a node and the 2 2 2 2 2 average of the same characteristics for the node’s neigh- Observed value bors. Previous studies have examined the mean of this difference but, as we have argued here, to get a full pic- ture one must examine the complete distribution of val- ues. We have performed theoretical calculations of this FIG. 5: Mean and standard deviation of ∆ in 32 real-world distribution for three classes of model networks, the Pois- social networks [17–32] compared to predictions from the son random graph, the configuration model, and a model maximum-entropy model, Eq. (35). of a random degree-assortative network. Among other things, our results indicate that the friendship paradox will tend to be strongest in networks with very hetero- Applying the key properties in Eqs. (21) and (22), we geneous degree distributions and negative assortativity. then arrive at Conversely, the effects will tend to be muted when de-

X k grees are fairly homogeneous and the network is degree F∆(x) (s) = pkGk(s/k) Hk(−s), (39) assortative. On the other hand, we have also seen that k even in simple network models the distribution for ∆ can (x) be widely dispersed, meaning that the average value of- and inverting F (x) gives the distribution for ∆ . ∆ fers an incomplete description of the behavior. As an example, we have tested Eq. (39) for normally We have also compared our results with a selection of distributed x linearly correlated with degree and find be- real-world networks, finding remarkably good agreement havior closely similar to that of Fig. 3. Behavior like this between theoretical predictions and empirical measure- could have an impact in any situation where the gen- ments, particularly in the case of the model that incor- eralized friendship paradox has practical consequences. porates assortativity. For example, it has been found that people’s individual well-being can be substantially affected by the behavior of their network neighbors. People whose acquaintances smoke are more likely to smoke themselves [33]. Peo- Acknowledgments ple whose friends appear to be better off than they are may develop a lower sense of self-worth [34, 35]. If peo- This work was funded in part by the US Department of ple with more acquaintances tend to smoke more, or if Defense NDSEG fellowship program (AK) and by the US well-off people with exciting lives have a lot of friends or National Science Foundation under grants DMS–1710848 followers on , then we may have a generalized and DMS–2005899 (MEJN). friendship paradox in which you are most likely to have This research uses data from Add Health, a pro- contact with precisely those people who would adversely gram project directed by Kathleen Mullan Harris and affect you. designed by J. Richard Udry, Peter S. Bearman, and As another example, it has been shown that polling Kathleen Mullan Harris at the University of North Car- forecasts of election outcomes can be significantly im- olina at Chapel Hill, and funded by grant P01–HD31921 proved by focusing not on how study participants say from the Eunice Kennedy Shriver National Institute they will vote but on how they expect their acquain- of Child Health and Human Development, with co- tances to vote [36], in part because this reduces variance operative funding from 23 other federal agencies and in the estimates of outcomes. If voting intention is sub- foundations. Information on how to obtain the Add ject to the generalized friendship paradox however—if for Health data files is available on the Add Health web- instance partisan inclination is correlated with degree— site (https://addhealth.cpc.unc.edu/). No direct support then the tendency for one’s friends to have high degree was received from grant P01-HD31921 for this analysis. 8

[1] S. Feld, Why your friends have more friends than you do. 2008, Public Use Version 21 (2008). Am. J. Sociol. 96, 1464–1477 (1991). [23] B. F. Maier and D. Brockmann, Cover time for random [2] H.-H. Jo and Y.-H. Eom, Generalized friendship para- walks on arbitrary complex networks. Phys. Rev. E 96, dox in networks with tunable degree-attribute correla- 042307 (2017). tion. Phys. Rev. E 90, 022809 (2014). [24] D. E. Knuth, The Stanford GraphBase: A Platform [3] Y.-H. Eom and H.-H. Jo, Generalized friendship paradox for Combinatorial Computing. Addison-Wesley, Reading, in complex networks: The case of scientific collaboration. MA (1993). Scientific Reports 4, 4603 (2014). [25] A. Beveridge and J. Shan, Network of thrones. Math [4] N. O. Hodas, F. Kooti, and K. Lerman, Friendship para- Horizons 23(4), 18–22 (2016). dox redux: Your friends are more interesting than you. In [26] M. R. Weeks, S. Clair, S. P. Borgatti, K. Radda, and J. J. Proceedings of the 7th International AAAI Conference on Schensul, Social networks of drug users in high-risk sites: Weblogs and Social Media, AAAI Press, Palo Alto, CA Finding the connections. AIDS and Behavior 6, 193–206 (2013). (2002). [5] E. N. Gilbert, Random graphs. Annals of Mathematical [27] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, Statistics 30, 1141–1144 (1959). E. Slooten, and S. M. Dawson, The bottlenose dolphin [6] P. Erd˝osand A. R´enyi, On the evolution of random community of Doubtful Sound features a large propor- graphs. Publications of the Mathematical Institute of the tion of long-lasting associations. Can geographic isola- Hungarian Academy of Sciences 5, 17–61 (1960). tion explain this unique trait? Behavioral Ecology and [7] B. Bollob´as, Random Graphs. Academic Press, New Sociobiology 54, 396–405 (2003). York, 2nd edition (2001). [28] M. Fire, G. Katz, Y. Elovici, B. Shapira, and L. Rokach, [8] B. Bollob´as,A probabilistic proof of an asymptotic for- Predicting student exam’s scores by analyzing social net- mula for the number of labelled regular graphs. European work data. In R. Huang, A. A. Ghorbani, G. Pasi, T. Ya- Journal of Combinatorics 1, 311–316 (1980). maguchi, N. Y. Yen, and B. Jin (eds.), Active Media [9] M. E. J. Newman, S. H. Strogatz, and D. J. Watts, Ran- Technology, pp. 584–595, Springer, Berlin (2012). dom graphs with arbitrary degree distributions and their [29] J. Tang, J. Sun, C. Wang, and Z. Yang, Social influence applications. Phys. Rev. E 64, 026118 (2001). analysis in large-scale networks. In Proceedings of the [10] A.-L. Barab´asiand R. Albert, Emergence of scaling in 15th ACM SIGKDD International Conference on Knowl- random networks. Science 286, 509–512 (1999). edge Discovery and Data Mining, KDD ’09, pp. 807–816, [11] L. A. N. Amaral, A. Scala, M. Barth´elemy, and H. E. ACM, New York, NY (2009). Stanley, Classes of small-world networks. Proc. Natl. [30] G. Edwards, FIFA networks (2015), URL Acad. Sci. USA 97, 11149–11152 (2000). https://sites.google.com/site/ucinetsoftware/ [12] H. Wilf, Generatingfunctionology. Academic Press, Lon- datasets/covert-networks/fifa. don, 2nd edition (1994). [31] C. Seierstad and T. Opsahl, For the few not the many? [13] M. Newman, Networks. Oxford University Press, Oxford, The effects of affirmative action on presence, prominence, 2nd edition (2018). and of women directors in Norway. Scandi- [14] M. E. J. Newman, in networks. Phys. navian Journal of Management 27, 44–54 (2011). Rev. Lett. 89, 208701 (2002). [32] R. Mastrandrea, J. Fournet, and A. Barrat, Contact pat- [15] M. E. J. Newman and J. Park, Why social networks are terns in a high school: A comparison between data col- different from other types of networks. Phys. Rev. E 68, lected using wearable sensors, contact diaries and friend- 036122 (2003). ship surveys. PLOS One 10, e107878 (2015). [16] H.-B. Hu and X.-F. Wang, Disassortative mixing in on- [33] N. A. Cristakis and J. H. Fowler, The collective dynam- line social networks. Europhys. Lett. 86, 18003 (2009). ics of smoking in a large . New England [17] P. Gleiser and L. Danon, Community structure in jazz. Journal of Medicine 358, 2249–2258 (2008). Advances in Complex Systems 6, 565–573 (2003). [34] E. Kross, P. Verduyn, E. Demiralp, J. Park, D. S. Lee, [18] M. Bogu˜n´a, R. Pastor-Satorras, A. D´ıaz-Guilera, and N. Lin, H. Shablack, J. Jonides, and O. Ybarra, Facebook A. Arenas, Models of social networks based on social dis- use predicts declines in subjective well-being in young tance attachment. Phys. Rev. E 70, 056122 (2004). adults. PLOS One 8, e69841 (2013). [19] M. E. J. Newman, Finding community structure in net- [35] K. Lerman, X. Yan, and X.-Z. Wu, The majority illusion works using the eigenvectors of matrices. Phys. Rev. E in social networks. PLOS One 11, e147616 (2016). 74, 036104 (2006). [36] B. Nettasinghe and V. Krishnamurthy, What do your [20] M. E. J. Newman, The structure of scientific collabora- friends think? Efficient polling methods for networks us- tion networks. Proc. Natl. Acad. Sci. USA 98, 404–409 ing friendship paradox. IEEE Transactions on Knowledge (2001). and Data Engineering (2019). [21] R. Guimer`a,L. Danon, A. D´ıaz-Guilera,F. Giralt, and [37] D. Rothschild, Forecasting elections: Comparing pre- A. Arenas, Self-similar community structure in a network diction markets, polls, and their biases. Public Opinion of human interactions. Phys. Rev. E 68, 065103 (2003). Quarterly 73, 895–916 (2009). [22] K. M. Harris and J. R. Udry, National Longitudinal Study of Adolescent to Adult Health (Add Health), 1994–