Navigating Networks by Using Homophily and Degree

Total Page:16

File Type:pdf, Size:1020Kb

Navigating Networks by Using Homophily and Degree Navigating networks by using homophily and degree O¨ zgu¨r S¸ ims¸ek* and David Jensen Department of Computer Science, University of Massachusetts, Amherst, MA 01003 Edited by Peter J. Bickel, University of California, Berkeley, CA, and approved July 11, 2008 (received for review January 16, 2008) Many large distributed systems can be characterized as networks We show that, across a wide range of synthetic and real-world where short paths exist between nearly every pair of nodes. These networks, EVN performs as well as or better than the best include social, biological, communication, and distribution net- previous algorithms. More importantly, in the majority of cases works, which often display power-law or small-world structure. A where previous algorithms do not perform well, EVN synthesizes central challenge of distributed systems is directing messages to whatever homophily and degree information can be exploited to specific nodes through a sequence of decisions made by individual identify much shorter paths than any previous method. nodes without global knowledge of the network. We present a These results have implications for understanding the func- probabilistic analysis of this navigation problem that produces a tioning of current and prospective distributed systems that route surprisingly simple and effective method for directing messages. messages by using local information. These systems include This method requires calculating only the product of the two social networks routing messages, referral systems for finding measures widely used to summarize all local information. It out- informed experts, and also technological systems for routing performs prior approaches reported in the literature by a large messages on the Internet, ad hoc wireless networking, and margin, and it provides a formal model that may describe how peer-to-peer file sharing. humans make decisions in sociological studies intended to explore the social network as well as how they make decisions in more Formulation naturalistic settings. We formulate the navigation problem as a probabilistic decision- making task in which the objective is to minimize the length of complex networks ͉ search algorithms ͉ social network analysis the path traveled by the message. We assume that each node knows about its immediate neighbors—including their identity, uch of the current interest in small-world networks has degree, and attributes—but is unaware of the rest of the Mdrawn inspiration from the now classic work of Travers network. At the source node, and subsequently at each node and Milgram (1). Their 1969 study asked individuals to deliver along the path, the optimal decision rule (given the limited information) is to forward the message to the neighbor from letters to designated persons by passing them through chains of which the message will reach the target in the smallest number first-name acquaintances. That study, as well as more recent of hops, assuming that all future nodes will make their decision work (2), shows that surprisingly short paths exist between pairs similarly by using local information. Although a recurrence of nodes in real-world social networks. relation may, in principle, govern the optimal decision rule, it is However, the existence of such paths is only one of the not apparent how this can be formulated in a way that would interesting findings of the study. Kleinberg (3) notes that the suggest an effective navigation method. study contains a second finding of fundamental algorithmic Instead, we suggest that an effective (although not necessarily importance: People are able to find short paths even though they optimal) decision would be to forward the message to the know very little about the target individual or the network. Both neighbor with the minimum expected distance to the target, Kleinberg and a variety of subsequent researchers have ad- where distance from node s to node t is the length of the shortest dressed the question of how such network navigation is possible. path from s to t. We can express this quantity, the expected value That work identifies two network characteristics that can guide of the distance dst from neighbor s to target t, as a weighted sum navigation. The first is homophily—the tendency of attributes of of all possible distances: connected nodes to be correlated. People tend to be acquainted with other people who live in the same geographical area or who ͑ ͒ ϭ ͸ ⅐ ͑ ϭ ͒ E dst i P dst i . [1] have the same occupation. The second is the existence of iՆ1 high-degree nodes. Some people have a large number of ac- quaintances and act as hubs that connect different social circles. Computing this expectation is daunting but, fortunately, not Both of these characteristics are widely observed in real-world necessary. Effective decision making requires only identifying networks, and both lead directly to navigation algorithms. Con- the neighbor that minimizes the expectation. To perform this sideration of homophily gives rise to a navigation algorithm that comparison, we propose to use only the first term in the series, passes messages to the neighbor that is the most similar to the which calculates the probability of a distance of one. The relative target node (e.g., an acquaintance who lives in Boston, if the values of this first term may be an accurate estimator of the target person lives in Boston) (3–6), whereas consideration of relative values of the entire expectation, because the terms in the degree gives rise to an algorithm that favors the neighbor with series are correlated, and this correlation increases with increas- the highest degree (7). ing homophily. For example, given a relatively high probability In contrast to previous work, we cast the navigation problem of a distance of one, we expect a relatively high probability of a as a task of decision making under uncertainty, in which the distance of two. In general, the greater the first term, the lower desired decision is to forward the message to the neighbor with the minimum expected distance to the target. We show how the desired decision may be approximated by using a single criteri- Author contributions: O¨ .S¸. and D.J. designed research; O¨ .S¸. and D.J. performed research; ¨ ¨ on—the probability that the neighbor links directly to the O.S¸. analyzed data; and O.S¸. and D.J. wrote the paper. target—which may be estimated by a simple product of the The authors declare no conflict of interest. homophily and degree information. Our formulation directly This article is a PNAS Direct Submission. implies an algorithm that we call expected-value navigation Freely available online through the PNAS open access option. (EVN). Earlier algorithms are special cases of EVN when only *To whom correspondence should be addressed. E-mail: [email protected]. homophily or degree information is present. © 2008 by The National Academy of Sciences of the USA 12758–12762 ͉ PNAS ͉ September 2, 2008 ͉ vol. 105 ͉ no. 35 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0800497105 Downloaded by guest on October 2, 2021 the whole expectation, so the resulting decision rule is to select 120 the neighbor with the highest probability of linking directly to the A Random target. This probability is influenced by both homophily and 100 degree. The more similar the neighbor is to the target and the Degree 80 higher the degree of the neighbor, the greater the probability that the neighbor links directly to the target. In directed net- 60 works, we express this relationship by making the simplifying 40 Similarity assumption that the links originating at s were created indepen- Median path length 20 dently, with the same probability (qst) of terminating at t. When EVN the network exhibits homophily, qst is a function of the similarity Global 0 between nodes s and t. The number of links from s to t (nst) then 0 1 2 3 Homophily parameter follows a binomial distribution with probability of success qst and ks trials, where ks is the out-degree of s. Using the Poisson approximation to the binomial, we can compute the first term of B 120 Eq. 1: 100 ͑ ϭ ͒ ϭ ͑ Ն ͒ 80 P dst 1 P nst 1 ϭ Ϫ ͑ ϭ ͒ 60 1 P nst 0 EVN 40 ks ϭ Ϫ ͑ ͒ ϭ Ϫ ͑ Ϫ ͒ Median path length 1 binomial 0; ks, qst 1 1 qst [2] 20 Global Ϸ 1 Ϫ Poisson͑0; k q ͒ ϭ 1 Ϫ eksqst. [3] s st 0 0 1 2 3 In undirected networks, we obtain the same result, similarly Homophily parameter assuming that the links were placed independently. In this case, Fig. 1. Median path lengths on power-law networks with degree parameter we use qst to denote the probability that, given that a link borders 1.5 (A), on Poisson networks with ␭ ϭ 3.5 (B). The graphs show performance node s, it connects node s to node t. on 5,000 randomly selected search tasks on 30 randomly generated 1,000- Because only a relative ordering is necessary for effective node networks. ‘‘Global’’ indicates the median path lengths obtained with navigation, the resulting decision rule is remarkably simple: complete knowledge of the network. Navigation methods not shown in B Select the neighbor that maximizes the product of a degree term resulted in median path lengths higher than the values shown on the graph. (ks) and a homophily term (qst). If the network shows no homophily, this is equivalent to selecting the neighbor with the highest degree. On the other hand, if all nodes have equal large because two attribute values may be arbitrarily close. In degree, it is equivalent to selecting the neighbor most similar to applying similarity-based navigation, we considered all neigh- the target.
Recommended publications
  • Networkx: Network Analysis with Python
    NetworkX: Network Analysis with Python Salvatore Scellato Full tutorial presented at the XXX SunBelt Conference “NetworkX introduction: Hacking social networks using the Python programming language” by Aric Hagberg & Drew Conway Outline 1. Introduction to NetworkX 2. Getting started with Python and NetworkX 3. Basic network analysis 4. Writing your own code 5. You are ready for your project! 1. Introduction to NetworkX. Introduction to NetworkX - network analysis Vast amounts of network data are being generated and collected • Sociology: web pages, mobile phones, social networks • Technology: Internet routers, vehicular flows, power grids How can we analyze this networks? Introduction to NetworkX - Python awesomeness Introduction to NetworkX “Python package for the creation, manipulation and study of the structure, dynamics and functions of complex networks.” • Data structures for representing many types of networks, or graphs • Nodes can be any (hashable) Python object, edges can contain arbitrary data • Flexibility ideal for representing networks found in many different fields • Easy to install on multiple platforms • Online up-to-date documentation • First public release in April 2005 Introduction to NetworkX - design requirements • Tool to study the structure and dynamics of social, biological, and infrastructure networks • Ease-of-use and rapid development in a collaborative, multidisciplinary environment • Easy to learn, easy to teach • Open-source tool base that can easily grow in a multidisciplinary environment with non-expert users
    [Show full text]
  • Network Analysis of the Multimodal Freight Transportation System in New York City
    Network Analysis of the Multimodal Freight Transportation System in New York City Project Number: 15 – 2.1b Year: 2015 FINAL REPORT June 2018 Principal Investigator Qian Wang Researcher Shuai Tang MetroFreight Center of Excellence University at Buffalo Buffalo, NY 14260-4300 Network Analysis of the Multimodal Freight Transportation System in New York City ABSTRACT The research is aimed at examining the multimodal freight transportation network in the New York metropolitan region to identify critical links, nodes and terminals that affect last-mile deliveries. Two types of analysis were conducted to gain a big picture of the region’s freight transportation network. First, three categories of network measures were generated for the highway network that carries the majority of last-mile deliveries. They are the descriptive measures that demonstrate the basic characteristics of the highway network, the network structure measures that quantify the connectivity of nodes and links, and the accessibility indices that measure the ease to access freight demand, services and activities. Second, 71 multimodal freight terminals were selected and evaluated in terms of their accessibility to major multimodal freight demand generators such as warehousing establishments. As found, the most important highways nodes that are critical in terms of connectivity and accessibility are those in and around Manhattan, particularly the bridges and tunnels connecting Manhattan to neighboring areas. Major multimodal freight demand generators, such as warehousing establishments, have better accessibility to railroad and marine port terminals than air and truck terminals in general. The network measures and findings in the research can be used to understand the inventory of the freight network in the system and to conduct freight travel demand forecasting analysis.
    [Show full text]
  • Social Network Analysis and Information Propagation: a Case Study Using Flickr and Youtube Networks
    International Journal of Future Computer and Communication, Vol. 2, No. 3, June 2013 Social Network Analysis and Information Propagation: A Case Study Using Flickr and YouTube Networks Samir Akrouf, Laifa Meriem, Belayadi Yahia, and Mouhoub Nasser Eddine, Member, IACSIT 1 makes decisions based on what other people do because their Abstract—Social media and Social Network Analysis (SNA) decisions may reflect information that they have and he or acquired a huge popularity and represent one of the most she does not. This concept is called ″herding″ or ″information important social and computer science phenomena of recent cascades″. Therefore, analyzing the flow of information on years. One of the most studied problems in this research area is influence and information propagation. The aim of this paper is social media and predicting users’ influence in a network to analyze the information diffusion process and predict the became so important to make various kinds of advantages influence (represented by the rate of infected nodes at the end of and decisions. In [2]-[3], the marketing strategies were the diffusion process) of an initial set of nodes in two networks: enhanced with a word-of-mouth approach using probabilistic Flickr user’s contacts and YouTube videos users commenting models of interactions to choose the best viral marketing plan. these videos. These networks are dissimilar in their structure Some other researchers focused on information diffusion in (size, type, diameter, density, components), and the type of the relationships (explicit relationship represented by the contacts certain special cases. Given an example, the study of Sadikov links, and implicit relationship created by commenting on et.al [6], where they addressed the problem of missing data in videos), they are extracted using NodeXL tool.
    [Show full text]
  • Maximum and Minimum Degree in Iterated Line Graphs by Manu
    Maximum and minimum degree in iterated line graphs by Manu Aggarwal A thesis submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Master of Science Auburn, Alabama August 3, 2013 Keywords: iterated line graphs, maximum degree, minimum degree Approved by Dean Hoffman, Professor of Mathematics Chris Rodger, Professor of Mathematics Andras Bezdek, Professor of Mathematics Narendra Govil, Professor of Mathematics Abstract In this thesis we analyze two papers, both by Dr.Stephen G. Hartke and Dr.Aparna W. Higginson, on maximum [2] and minimum [3] degrees of a graph G under iterated line graph operations. Let ∆k and δk denote the minimum and the maximum degrees, respectively, of the kth iterated line graph Lk(G). It is shown that if G is not a path, then, there exist integers A and B such that for all k > A, ∆k+1 = 2∆k − 2 and for all k > B, δk+1 = 2δk − 2. ii Table of Contents Abstract . ii List of Figures . iv 1 Introduction . .1 2 An elementary result . .3 3 Maximum degree growth in iterated line graphs . 10 4 Minimum degree growth in iterated line graphs . 26 5 A puzzle . 45 Bibliography . 46 iii List of Figures 1.1 ............................................1 2.1 ............................................4 2.2 : Disappearing vertex of degree two . .5 2.3 : Disappearing leaf . .7 3.1 ............................................ 11 3.2 ............................................ 12 3.3 ............................................ 13 3.4 ............................................ 14 3.5 ............................................ 15 3.6 : When CD is not a single vertex . 17 3.7 : When CD is a single vertex . 18 4.1 ...........................................
    [Show full text]
  • Degrees & Isomorphism: Chapter 11.1 – 11.4
    “mcs” — 2015/5/18 — 1:43 — page 393 — #401 11 Simple Graphs Simple graphs model relationships that are symmetric, meaning that the relationship is mutual. Examples of such mutual relationships are being married, speaking the same language, not speaking the same language, occurring during overlapping time intervals, or being connected by a conducting wire. They come up in all sorts of applications, including scheduling, constraint satisfaction, computer graphics, and communications, but we’ll start with an application designed to get your attention: we are going to make a professional inquiry into sexual behavior. Specifically, we’ll look at some data about who, on average, has more opposite-gender partners: men or women. Sexual demographics have been the subject of many studies. In one of the largest, researchers from the University of Chicago interviewed a random sample of 2500 people over several years to try to get an answer to this question. Their study, published in 1994 and entitled The Social Organization of Sexuality, found that men have on average 74% more opposite-gender partners than women. Other studies have found that the disparity is even larger. In particular, ABC News claimed that the average man has 20 partners over his lifetime, and the av- erage woman has 6, for a percentage disparity of 233%. The ABC News study, aired on Primetime Live in 2004, purported to be one of the most scientific ever done, with only a 2.5% margin of error. It was called “American Sex Survey: A peek between the sheets”—raising some questions about the seriousness of their reporting.
    [Show full text]
  • Assortativity and Mixing
    Assortativity and Assortativity and Mixing General mixing between node categories Mixing Assortativity and Mixing Definition Definition I Assume types of nodes are countable, and are Complex Networks General mixing General mixing Assortativity by assigned numbers 1, 2, 3, . Assortativity by CSYS/MATH 303, Spring, 2011 degree degree I Consider networks with directed edges. Contagion Contagion References an edge connects a node of type µ References e = Pr Prof. Peter Dodds µν to a node of type ν Department of Mathematics & Statistics Center for Complex Systems aµ = Pr(an edge comes from a node of type µ) Vermont Advanced Computing Center University of Vermont bν = Pr(an edge leads to a node of type ν) ~ I Write E = [eµν], ~a = [aµ], and b = [bν]. I Requirements: X X X eµν = 1, eµν = aµ, and eµν = bν. µ ν ν µ Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. 1 of 26 4 of 26 Assortativity and Assortativity and Outline Mixing Notes: Mixing Definition Definition General mixing General mixing Assortativity by I Varying eµν allows us to move between the following: Assortativity by degree degree Definition Contagion 1. Perfectly assortative networks where nodes only Contagion References connect to like nodes, and the network breaks into References subnetworks. General mixing P Requires eµν = 0 if µ 6= ν and µ eµµ = 1. 2. Uncorrelated networks (as we have studied so far) Assortativity by degree For these we must have independence: eµν = aµbν . 3. Disassortative networks where nodes connect to nodes distinct from themselves. Contagion I Disassortative networks can be hard to build and may require constraints on the eµν.
    [Show full text]
  • A Faster Parameterized Algorithm for PSEUDOFOREST DELETION
    A faster parameterized algorithm for PSEUDOFOREST DELETION Citation for published version (APA): Bodlaender, H. L., Ono, H., & Otachi, Y. (2018). A faster parameterized algorithm for PSEUDOFOREST DELETION. Discrete Applied Mathematics, 236, 42-56. https://doi.org/10.1016/j.dam.2017.10.018 Document license: Unspecified DOI: 10.1016/j.dam.2017.10.018 Document status and date: Published: 19/02/2018 Document Version: Accepted manuscript including changes made at the peer-review stage Please check the document version of this publication: • A submitted manuscript is the version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website. • The final author version and the galley proof are versions of the publication after peer review. • The final published version features the final layout of the paper including the volume, issue and page numbers. Link to publication General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal.
    [Show full text]
  • Deep Generative Modeling in Network Science with Applications to Public Policy Research
    Working Paper Deep Generative Modeling in Network Science with Applications to Public Policy Research Gavin S. Hartnett, Raffaele Vardavas, Lawrence Baker, Michael Chaykowsky, C. Ben Gibson, Federico Girosi, David Kennedy, and Osonde Osoba RAND Health Care WR-A843-1 September 2020 RAND working papers are intended to share researchers’ latest findings and to solicit informal peer review. They have been approved for circulation by RAND Health Care but have not been formally edted. Unless otherwise indicated, working papers can be quoted and cited without permission of the author, provided the source is clearly referred to as a working paper. RAND’s R publications do not necessarily reflect the opinions of its research clients and sponsors. ® is a registered trademark. CORPORATION For more information on this publication, visit www.rand.org/pubs/working_papers/WRA843-1.html Published by the RAND Corporation, Santa Monica, Calif. © Copyright 2020 RAND Corporation R® is a registered trademark Limited Print and Electronic Distribution Rights This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited. Permission is given to duplicate this document for personal use only, as long as it is unaltered and complete. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial use. For information on reprint and linking permissions, please visit www.rand.org/pubs/permissions.html. The RAND Corporation is a research organization that develops solutions to public policy challenges to help make communities throughout the world safer and more secure, healthier and more prosperous.
    [Show full text]
  • Random Graph Models of Social Networks
    Colloquium Random graph models of social networks M. E. J. Newman*†, D. J. Watts‡, and S. H. Strogatz§ *Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501; ‡Department of Sociology, Columbia University, 1180 Amsterdam Avenue, New York, NY 10027; and §Department of Theoretical and Applied Mechanics, Cornell University, Ithaca, NY 14853-1503 We describe some new exactly solvable models of the structure of actually connected by a very short chain of intermediate social networks, based on random graphs with arbitrary degree acquaintances. He found this chain to be of typical length of distributions. We give models both for simple unipartite networks, only about six, a result which has passed into folklore by means such as acquaintance networks, and bipartite networks, such as of John Guare’s 1990 play Six Degrees of Separation (10). It has affiliation networks. We compare the predictions of our models to since been shown that many networks have a similar small- data for a number of real-world social networks and find that in world property (11–14). some cases, the models are in remarkable agreement with the data, It is worth noting that the phrase ‘‘small world’’ has been used whereas in others the agreement is poorer, perhaps indicating the to mean a number of different things. Early on, sociologists used presence of additional social structure in the network that is not the phrase both in the conversational sense of two strangers who captured by the random graph. discover that they have a mutual friend—i.e., that they are separated by a path of length two—and to refer to any short path social network is a set of people or groups of people, between individuals (8, 9).
    [Show full text]
  • Group, Graphs, Algorithms: the Graph Isomorphism Problem
    Proc. Int. Cong. of Math. – 2018 Rio de Janeiro, Vol. 3 (3303–3320) GROUP, GRAPHS, ALGORITHMS: THE GRAPH ISOMORPHISM PROBLEM László Babai Abstract Graph Isomorphism (GI) is one of a small number of natural algorithmic problems with unsettled complexity status in the P / NP theory: not expected to be NP-complete, yet not known to be solvable in polynomial time. Arguably, the GI problem boils down to filling the gap between symmetry and regularity, the former being defined in terms of automorphisms, the latter in terms of equations satisfied by numerical parameters. Recent progress on the complexity of GI relies on a combination of the asymptotic theory of permutation groups and asymptotic properties of highly regular combinato- rial structures called coherent configurations. Group theory provides the tools to infer either global symmetry or global irregularity from local information, eliminating the symmetry/regularity gap in the relevant scenario; the resulting global structure is the subject of combinatorial analysis. These structural studies are melded in a divide- and-conquer algorithmic framework pioneered in the GI context by Eugene M. Luks (1980). 1 Introduction We shall consider finite structures only; so the terms “graph” and “group” will refer to finite graphs and groups, respectively. 1.1 Graphs, isomorphism, NP-intermediate status. A graph is a set (the set of ver- tices) endowed with an irreflexive, symmetric binary relation called adjacency. Isomor- phisms are adjacency-preseving bijections between the sets of vertices. The Graph Iso- morphism (GI) problem asks to determine whether two given graphs are isomorphic. It is known that graphs are universal among explicit finite structures in the sense that the isomorphism problem for explicit structures can be reduced in polynomial time to GI (in the sense of Karp-reductions1) Hedrlı́n and Pultr [1966] and Miller [1979].
    [Show full text]
  • Scale- Free Networks in Cell Biology
    Scale- free networks in cell biology Réka Albert, Department of Physics and Huck Institutes of the Life Sciences, Pennsylvania State University Summary A cell’s behavior is a consequence of the complex interactions between its numerous constituents, such as DNA, RNA, proteins and small molecules. Cells use signaling pathways and regulatory mechanisms to coordinate multiple processes, allowing them to respond to and adapt to an ever-changing environment. The large number of components, the degree of interconnectivity and the complex control of cellular networks are becoming evident in the integrated genomic and proteomic analyses that are emerging. It is increasingly recognized that the understanding of properties that arise from whole-cell function require integrated, theoretical descriptions of the relationships between different cellular components. Recent theoretical advances allow us to describe cellular network structure with graph concepts, and have revealed organizational features shared with numerous non-biological networks. How do we quantitatively describe a network of hundreds or thousands of interacting components? Does the observed topology of cellular networks give us clues about their evolution? How does cellular networks’ organization influence their function and dynamical responses? This article will review the recent advances in addressing these questions. Introduction Genes and gene products interact on several level. At the genomic level, transcription factors can activate or inhibit the transcription of genes to give mRNAs. Since these transcription factors are themselves products of genes, the ultimate effect is that genes regulate each other's expression as part of gene regulatory networks. Similarly, proteins can participate in diverse post-translational interactions that lead to modified protein functions or to formation of protein complexes that have new roles; the totality of these processes is called a protein-protein interaction network.
    [Show full text]
  • The Perceived Assortativity of Social Networks: Methodological Problems and Solutions
    The perceived assortativity of social networks: Methodological problems and solutions David N. Fisher1, Matthew J. Silk2 and Daniel W. Franks3 Abstract Networks describe a range of social, biological and technical phenomena. An important property of a network is its degree correlation or assortativity, describing how nodes in the network associate based on their number of connections. Social networks are typically thought to be distinct from other networks in being assortative (possessing positive degree correlations); well-connected individuals associate with other well-connected individuals, and poorly-connected individuals associate with each other. We review the evidence for this in the literature and find that, while social networks are more assortative than non-social networks, only when they are built using group-based methods do they tend to be positively assortative. Non-social networks tend to be disassortative. We go on to show that connecting individuals due to shared membership of a group, a commonly used method, biases towards assortativity unless a large enough number of censuses of the network are taken. We present a number of solutions to overcoming this bias by drawing on advances in sociological and biological fields. Adoption of these methods across all fields can greatly enhance our understanding of social networks and networks in general. 1 1. Department of Integrative Biology, University of Guelph, Guelph, ON, N1G 2W1, Canada. Email: [email protected] (primary corresponding author) 2. Environment and Sustainability Institute, University of Exeter, Penryn Campus, Penryn, TR10 9FE, UK Email: [email protected] 3. Department of Biology & Department of Computer Science, University of York, York, YO10 5DD, UK Email: [email protected] Keywords: Assortativity; Degree assortativity; Degree correlation; Null models; Social networks Introduction Network theory is a useful tool that can help us explain a range of social, biological and technical phenomena (Pastor-Satorras et al 2001; Girvan and Newman 2002; Krause et al 2007).
    [Show full text]