Edge Direction and the Structure of Networks
Total Page:16
File Type:pdf, Size:1020Kb
Edge direction and the structure of networks Jacob G. Foster1, David V. Foster, Peter Grassberger, and Maya Paczuski Complexity Science Group, Department of Physics and Astronomy, University of Calgary, 2500 University Drive NW, Calgary, AB, Canada T2N 1N4 Edited by H. Eugene Stanley, Boston University, Boston, MA, and approved April 6, 2010 (received for review November 2, 2009) Directed networks are ubiquitous and are necessary to represent In directed networks, an edge from source to target (A → B) complex systems with asymmetric interactions—from food webs represents an asymmetric interaction; for example, that Web site to the World Wide Web. Despite the importance of edge direction A contains a hyperlink to Web site B, or organism A is eaten for detecting local and community structure, it has been disregarded by organism B. Edge direction is essential to evaluate and explain in studying a basic type of global diversity in networks: the tendency local structure in such networks. For instance, motif analysis of nodes with similar numbers of edges to connect. This tendency, (13, 14) identifies local connection patterns that appear more called assortativity, affects crucial structural and dynamic properties frequently in the real-world network than in ensembles of rando- of real-world networks, such as error tolerance or epidemic spread- mized networks. In this context, edge direction distinguishes ing. Here we demonstrate that edge direction has profound effects functional units like feed-forward and feedback loops. Takingedge on assortativity. We define a set of four directed assortativity mea- direction into account also overturns the simple picture of the sures and assign statistical significance by comparison to rando- World Wide Web (WWW) as having a short average distance mized networks. We apply these measures to three network between all Web pages (15) in favor of a richer picture of link classes—online/social networks, food webs, and word-adjacency flow into and out of a dense inner core (16). More recently, networks. Our measures (i) reveal patterns common to each class, attempts to identify communities in directed networks have (ii) separate networks that have been previously classified together, demonstrated that ignoring edge direction misses key organiza- – and (iii) expose limitations of several existing theoretical models. tional features of community structure in networks (17 19). Hence We reject the standard classification of directed networks as purely it is striking that assortativity in directed networks has been studied assortative or disassortative. Many display a class-specific mixture, only by ignoring edge direction entirely (8) or by measuring a likely reflecting functional or historical constraints, contingencies, subset of the four possible degree-degree correlations (9, 20). and forces guiding the system’s evolution. All four degree-degree correlations were addressed in the specific contexts of earthquake recurrences (21) and the WWW (22) using the average neighbor degree, e.g., hk0outi ðkinÞ, as a measure omplex networks reveal essential features of the structure, nn rather than the Pearson correlation. However, it is easier to inter- function, and dynamics of many complex systems (1–4). C pret and assign statistical significance to the Pearson correlation. While networks from diverse fields share various properties Moreover, the average neighbor degree cannot be easily used (3, 5–7) and universal patterns (1, 3), they also display enormous to quantify the diversity of a given network or to compare networks structural, functional, and dynamical diversity. A basic measure of various sizes, unlike the Pearson correlation. Incorporating of diversity is assortativity by degree (hereafter assortativity): the edge direction into familiar assortativity measures based on the tendency of nodes to link to other nodes with a similar number of Pearson correlation is an essential step to better characterize, edges (4, 8, 9). Despite its importance, no disciplined approach to understand, and model directed networks. Indeed, since they scale assortativity in directed networks has been proposed. Here we as OðEÞ, where E is the number of edges in the network, our present such an approach and show that measures of directed directed assortativity measures can be evaluated for large assortativity provide a number of insights into the structure of networks that are beyond the reach of current motif analysis or directed networks and key factors governing their evolution. community detection algorithms. Assortativity is a standard tool in analyzing network structure Here we analyze online and social networks, food webs, and (4) and has a simple interpretation. In assortative networks with word-adjacency networks. Classes of directed networks show symmetric interactions (i.e., undirected networks), high degree common patterns across the four directed assortativity measures: nodes, or nodes with many edges, tend to connect to other high rðout; inÞ; rðin; outÞ; rðout; outÞ; and rðin; inÞ. The first element in degree nodes. Hence, assortative networks remain connected the parentheses labels the degree of the source node of the despite node removal or failure (9), but are hard to immunize directed edge, and the second labels the degree of the target against the spread of epidemics (10). In disassortative networks, node. Thus rðin; outÞ quantifies the tendency of nodes with high conversely, high degree nodes tend to connect to low degree in-degree to connect to nodes with high out-degree, and so on; PHYSICS nodes (8, 9); these networks limit the effects of node failure see Fig. 1. because important nodes (with many edges) are isolated from We compare the real-world network with an ensemble of ran- each other (11). Assortativity has a convenient global measure: domized networks. This comparison allows us to assign statistical the Pearson correlation (r) between the degrees of nodes sharing significance to each measure.* We use that significance to define an edge (8, 9). It ranges from −1 to 1, with (r>0) in assortative an Assortativity Significance Profile for each network. This pro- networks and (r < 0) in disassortative ones. Earlier work pro- file allows us to distinguish between networks grouped together posed a simple classification of networks on the basis of assorta- tivity, in which social networks are assortative and biological and Author contributions: J.G.F., D.V.F., P.G., and M.P. designed research; J.G.F., D.V.F., P.G., and technological networks are disassortative (4, 8, 9). Recent work M.P. performed research; J.G.F., D.V.F., P.G., and M.P. analyzed data; and J.G.F., D.V.F., P.G., suggests that this classification does not hold for undirected and M.P. wrote the paper. networks: Many online social networks are disassortative (12). The authors declare no conflict of interest. We go further, demonstrating that the simple assortative/disas- This article is a PNAS Direct Submission. sortative dichotomy misses fundamental features of networks *To our knowledge, statistical significance has been assigned to assortativity measures in where edge direction plays a crucial role. In fact, we show that only one publication (23). many networks are neither purely assortative nor disassortative, 1To whom correspondence should be addressed. E-mail: [email protected]. but display a mixture of both tendencies. These patterns provide a This article contains supporting information online at www.pnas.org/lookup/suppl/ classification scheme for networks with asymmetric interactions. doi:10.1073/pnas.0912671107/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.0912671107 PNAS ∣ June 15, 2010 ∣ vol. 107 ∣ no. 24 ∣ 10815–10820 Downloaded by guest on September 24, 2021 by other measures; indeed, we find that online and social net- r α; β − r α; β Z α; β rwð Þ h randð Þi : [2] works, which have similar motif structure (14), have substantially ð Þ¼ σ r α; β ½ randð Þ different assortativity profiles. The class-specific profiles point to forces or constraints that may guide the structure, function and This quantifies the difference between the assortativity measure of r α; β growth of that class (14, 24, 25). We also uncover limitations of the real-world network rwð Þ and its average value in the ran- r α; β several theoretical network models. For example, neither of two domized ensemble h randð Þi in units of the standard deviation, σ r α; β Z plausible models of word-adjacency networks [one proposed by ½ randð Þ. Larger networks typically have larger scores (see Milo et al. (14), the other in this paper] can reproduce the direc- Table S2). To compare networks of various sizes, the Z scores are ted assortativity profile we observe in the real-world networks. A normalized (14) by defining an Assortativity Significance Profile 2 1∕2 standard model of the WWW (26) is similarly unsuccessful. On (ASP), where ASPðα; βÞ¼Zðα; βÞ∕½∑α;βZðα; βÞ . This quan- the other hand, the food web models (27) examined here repro- tity is directly related to the Z score, and for a given network duce the pattern of assortativity seen in different food webs. the normalization does not change the relative size of the signifi- Hence our measures provide useful benchmarks to test models cance measures. To separate less significant correlations, we indi- of network formation. cate jZðα; βÞj < 2 in all figures by an appropriately colored Table S1 provides descriptions and sources for all networks asterisk. A positive Zðα; βÞ or ASPðα; βÞ (“Z assortative”) indicates examined in this paper; Table S2 collects the full results including that the real-world network is more assortative in that measure error estimates. than expected based on the degree sequence. A negative Zðα; βÞ or ASPðα; βÞ (“Z disassortative”) means that the original Results and Discussion network is less assortative than expected. Since nodes in directed networks have both an in-degree and an out-degree, we introduce a set of four directed assortativity Online and Social Networks.