The Perceived Assortativity of Social Networks: Methodological Problems and Solutions
Total Page:16
File Type:pdf, Size:1020Kb
The perceived assortativity of social networks: Methodological problems and solutions David N. Fisher1, Matthew J. Silk2 and Daniel W. Franks3 Abstract Networks describe a range of social, biological and technical phenomena. An important property of a network is its degree correlation or assortativity, describing how nodes in the network associate based on their number of connections. Social networks are typically thought to be distinct from other networks in being assortative (possessing positive degree correlations); well-connected individuals associate with other well-connected individuals, and poorly-connected individuals associate with each other. We review the evidence for this in the literature and find that, while social networks are more assortative than non-social networks, only when they are built using group-based methods do they tend to be positively assortative. Non-social networks tend to be disassortative. We go on to show that connecting individuals due to shared membership of a group, a commonly used method, biases towards assortativity unless a large enough number of censuses of the network are taken. We present a number of solutions to overcoming this bias by drawing on advances in sociological and biological fields. Adoption of these methods across all fields can greatly enhance our understanding of social networks and networks in general. 1 1. Department of Integrative Biology, University of Guelph, Guelph, ON, N1G 2W1, Canada. Email: [email protected] (primary corresponding author) 2. Environment and Sustainability Institute, University of Exeter, Penryn Campus, Penryn, TR10 9FE, UK Email: [email protected] 3. Department of Biology & Department of Computer Science, University of York, York, YO10 5DD, UK Email: [email protected] Keywords: Assortativity; Degree assortativity; Degree correlation; Null models; Social networks Introduction Network theory is a useful tool that can help us explain a range of social, biological and technical phenomena (Pastor-Satorras et al 2001; Girvan and Newman 2002; Krause et al 2007). For instance, network approaches have been used to investigate diverse topics such as the global political and social system (Snyder and Kick 1979; Nemeth and Smith 2010) to the formation of coalitions among-individuals (Kapferer 1969; Zachary 1977; Thurman 1979; Voelkl and Kasper 2009). Networks can be described using a number of local (related to the individual) and global (related to the whole network) measures. One important global measure is degree correlation or assortativity (we use the latter term for brevity), which was formally defined by Newman (2002), although Pastor-Satorras et al. (2001) had calculated an analogous measure previously. The assortativity of a network measures how the probability of a connection between two nodes (individuals) in a network depends on the degrees of those two nodes (the degree being the number of connections each node possesses). The measure quantifies whether those with many connections associate with others with many connections (assortative networks or networks with assortativity), or if “hubs” form where well connected individuals are connected to many individuals with few other connections (disassortative networks or networks with dissassortativity). If the tendency for nodes to be connected is independent of each other’s degrees a network has neutral assortativity. Assortativity is calculated as the Pearson’s correlation coefficient between the degrees of all pairs of connected nodes, and ranges from -1 to 1 (Newman 2002). The Pastor-Satorras method involves plotting the degree of each node against the mean degree of its neighbours, and judging the network assortatative if the slope is positive and disassortatative if the slope is negative (Pastor-Satorras et al 2001). In this article we will focus on the Newman measure, as it has been more commonly used by the scientific community, gives a coefficient bounded between -1 and 1 rather than a slope, and typically is supported by a statistical test, something general absent from the reporting of the Pastor-Satorras method. Assortativity is a key property to consider when understanding how networks function, especially when considering social networks. A network’s robustness to attacks is increased if it is assortative (Jing et al 2007; Hasegawa et al 2012). However, the speed of information transfer and ability to act in synchrony is increased in disassortative networks (Di Bernado et al 2007; Gallos et al 2008). Thus the assortativity of a social network in which an individual is embedded can have a substantial impact on that individual. Networks are typically found to be neutrally or negatively assortative (Newman 2002; Newman and Park 2003; Whitney and Alderson 2008). When considering all networks comprised of eight nodes, Estrada (2011) observed that only 8% of over 11,000 possible networks were assortative. Despite this general trend, social networks are often said to differ from other networks by being assortative (Newman 2003; Newman and Park 2003). This has led to those finding disassortativity in networks of online interactions (Holme et al 2004), mythical stories (Mac Carron and Kenna 2012) or networks of dolphin interactions (Lusseau and Newman 2004) to suggest they are different to typical human social networks. While the generality of assortativities in social networks has been questioned (Whitney and Alderson 2008; Hu and Wang 2009), a wide variety of recent research still states that this is a property typical of social networks (e.g. (Estrada 2011; Palathingal and Chirayath 2012; Mac Carron and Kenna 2013; Litvak and van der Hofstad 2013; Thedchanamoorthy et al 2014; Araújo et al 2014; Furtenbacher et al 2014)). With assortativity being a key network property, and subject of interest in a range of fields, this topic requires clarification. In this paper we review assortativity in the networks literature, with emphasis on social networks. We assess the generality of the hypothesis that social networks tend to be distinct from other kinds of networks in their assortativities and explore whether the precise method of social network construction influences this metric. We go on to show how particular methodologies of social network construction could result in falsely assortative networks, and present a number of solutions to this by drawing on advances in sociological and biological fields. 1. Assortativity in social and other networks Random networks should be neutrally assortative (Newman 2002). However, simulations by Franks et al. (2009) showed that random social networks constructed using group-based methods are assortative unless extensive sampling is carried out. Group-based methods are where links are formed between individuals not when they directly interact, but when they both are found in the same group, or contribute to a joint piece of work e.g. co-author a paper or appear in a band together. We hypothesised that the suggestion that social networks possess assortativity was due to a preponderance of social networks constructed using group- based methods in the early literature. We thus conducted a literature search, recording the (dis)assortativity of networks, network types (social or non-social) and for the social networks, method of construction (direct interactions of group-based). We expect that social networks built using group-based methods would be more assortative than the other two classes of network, which would be similarly assortative. 1.1 Literature search - Method Assortativity has been calculated for a wide range of networks in the last decade. We compiled a list of the assortativity of published networks based on the table in Whitney & Alderson (Whitney and Alderson 2008), literature searches with the terms “degree correlation” and “assortativity”, and examining the articles citing Newman (2002). If it could not be determined how the network was constructed, or how large it was, the network was excluded. We did not include the average assortativity when reported from a range of similar networks when the individual scores were not reported. We also did not include assortativities of networks from studies re-analysing existing datasets, to avoid pseudo- replication. Only undirected networks were considered; see Piraveenan et al. (2012) for a review of assortativities in directed networks. We then classified these networks as non-social networks, social networks constructed using direct interactions, or social networks constructed using group-based methods. We then compared assortativity across network classes using a Kruskal-Wallis test as assortativity scores were not normally distributed. If this revealed significant differences among network classes, we then compared network classes to each other using Wilcoxon rank sum tests and to the a neutral assortativity of zero with Wilcoxon signed rank tests. 1.2 Literature search - Results In published papers we found assortativities for 88 networks that met our criteria for inclusion, see Table 1. 52 of these were social networks, of which 25 were constructed using group-based methods. The assortativities for the network classes are shown as boxplots in Fig. 1. The Kruskal-Wallis tests indicated that there were differences among groups (Kruskal- Wallis χ2 = 26.8, d.f. = 2, p < 0.001). All networks types were different from each other, with the group-based social networks being more assortative than both other classes of network, and the direct social networks being more assortative