Evaluating Balance on Social Networks from Their Simple Cycles
Total Page:16
File Type:pdf, Size:1020Kb
Evaluating balance on social networks from their simple cycles P.-L. Giscard1, P. Rochet2 and R. C. Wilson1 1University of York, Department of Computer Science, UK 2Universit´ede Nantes, Laboratoire de Math´ematiques Jean Leray, France (Dated: June 10, 2016) Signed networks have long been used to represent social relations of amity (+) and enmity (-) between individuals. Group of individuals who are cyclically connected are said to be balanced if the number of negative edges in the cycle is even and unbalanced otherwise. In its most natural formulation, the balance of a social network is thus defined from its simple cycles, cycles which do not visit any vertex more than once. Because of the inherent difficulty associated with finding such cycles on very large networks, social balance has always been studied via other, less-direct means. In this article we present the balance as measured from the simple cycles and primitive orbits of social networks. We use a Monte Carlo implementation of a novel exact formula for counting the simple cycles on any weighted directed graph. We show that social networks exhibit strong inter-edge correlations favoring balanced situations and we determine the corresponding correlation length ξ. For longer simple cycles, the percentage of unbalanced simple cycles undergoes a rapid transition to values expected from an uncorrelated model. Our method is more generally applicable to evaluate arbitrary functions over the simple cycles and simple paths of any weighted directed graph and can also answer vertex-specific questions. PACS numbers: 89.75.Fb, 89.65.Ef I. INTRODUCTION strategies are implemented in this work: i) approximate the balance of the network to within any desired accuracy A. Balance in networks by evaluating the balance on a large sample of subgraphs of the network; or ii) compute the balance exactly from objects which are not simple cycles, but should carry a Relations of amity and enmity between individuals are similar information. well represented by signed networks, where an edge is as- We successfully implemented the first strategy thanks signed a positive value if two individuals are acquainted to a novel exact formula for counting simple cycles on any and in good terms, and a negative one if there are instead (weighted directed) graph in conjunction with a Monte enemies [1{5]. Such networks provide a natural setting to Carlo approach. This method is presented in Section II. study inter-personal relationships and their correlations. It effectively solves the mathematical problem enunciated For example, one could expect that people are friendly to- earlier since the quality of the obtained approximation is wards the friends of their friends, a situation that is said controlled and can be improved at will. For the second to be \balanced". More generally, on signed networks, strategy, we relied on the primitive orbits of the graph, a group of individuals who are cyclically connected|i.e. cycles which contain no backtracking steps or tail, and forming a triangle, a square, a pentagon etc.|are said to are not the multiple of any other cycle. This is presented be balanced if the number of negative edges in the cycle in Section III. The results produced by both approaches is even. Otherwise the cycle is said to be unbalanced. on four social networks are discussed and compared in Sociologists have suggested that such negative cycles are Section IV. the cause of tension and thus, that social networks should evolve into a state where balanced cycles are largely pre- dominant [1, 2, 6]. The question of wether this holds for B. Notation real-social networks and if not, by how much this fails to be true, arose from these considerations in the 1940s [1]. Throughout this article, we consider signed directed Mathematically speaking, this sociological question networks G = (V; E), of which undirected networks are a translates into the following problem: on a signed net- special case. The adjacency matrix of G is denoted AG work G, determine for all ` the percentage of negative or simply A. Each edge of the network is weighted with simple cycles of length `. This problem remains largely a value +1 or -1 indicating a positive or negative interac- unsolved owing to its natural formulation in terms of tion. A cycle is positive if the product of its edge values simple cycles|cycles which do not visit any vertex more is positive, and otherwise it is negative. A cycle is simple than once. Unfortunately, enumerating all the simple cy- if it does not visit any vertex more than once. The start- cles of a network exactly is computationally intractable ing point of a simple cycle is irrelevant but its orientation since, for example, the problem of determining if a Hamil- is retained. For example, v0v1v2v0 and v1v2v0v1 repre- tonian cycle exists in a graph is NP-complete. sent the same triangle, which is however distinct from For this reason, we need to seek more efficient meth- v0v2v1v0. The number of positive and negative simple + − ods, which inevitably lead to some approximations. Two cycles of length ` on G are designated by N` and N` , 2 respectively. + When discussing the balance of a network, we refer to the ratio R` of the number of negatively signed simple cycles of length ` to the total number of simple cycles of length `, i.e. -+ − N` R` := − + : N` + N` FIG. 1: An unbalanced network of three vertices. The dotted In particular, R` = 0 when the network is perfectly bal- line represents a negative relationship. The exponential un- anced for length `, while R` = 1 indicates a totally un- balanced ratio is Uwalks = 0:106 while in fact R2 = 1=3 and balanced situation. R3 = 1. C. Existing approach using walks These quantities only take walks of length ` into account when calculating the balance. Yet, since short cycles and their multiples are typically much more abundant As noted earlier, since counting all the simple cycles than long cycles, the values of U walks and Rwalks are still of a large graph exactly is intractable, one may instead ` ` largely dominated by the contributions from self-loops, count objects which are not simple but carry a similar in- backtracks and triangles. Consequently Rwalks will be formation when it comes to balance. In this vein, Estrada ` depressed as compared to the true balance ratio R`, that and Benzi [9] proposed the use of walks is, R` overestimates the proportion of balanced cy- 1 cles. We demonstrate this concretely in Section IV, Fig- X 1 ` walks D = Tr exp(A) = Tr A ; (1) ures (2) and (3), where we compare R with R` cal- `! ` `=1 culated from the simple cycles on two social networks. While one may empirically argue that long cycles are as a method of counting the number of balanced and un- less relevant than short ones in real social networks balanced cycles, also known as closed walks, in a network. [9, 10], it seems better to offer as detailed a mathemat- They show that, by computing the ratio K = D=jDj, the ical analysis as possible before deciding this issue. For ratio of negative to positive cycles can be obtained as these reasons, we found it necessary to abandon the use of walks and rather recur either to the simple cycles them- 1 − K U walks := : selves or to primitive orbits. 1 + K This is an extremely efficient method which simply re- quires the evaluation of the eigenvalues of the adjacency II. BALANCE FROM SIMPLE CYCLES matrix. Expression (1) counts all closed walks (weighted by a A. Core combinatorial engine factor 1=`! for a walk of length `). Because of this, back- tracking steps and multiple cycles are counted. For exam- One possible strategy to estimate the balance ratios R` 2 ple, the non-simple cycle v0v1v2v0v1v2v0 = (v0v1v2v0) is consists of approximating them from a large sample of part of the sum. Such cycles are positive, and so they do subgraphs of the network under study. The main novelty not upset the balance of an already balanced network, permitting this straightforward approach in practice is but they do have an effect on the (global) balance ra- a recently developed mathematical formula for counting tio(s) for an unbalanced network. In other words, the simple cycles of any length on weighted directed graphs. cycle-sum embodied in (1) contains non-simple cycles at Rather than sampling the simple cycles directly, the for- order 2 and higher effectively mixing the balance ratios mula allows for a rapid and exact evaluation of the bal- R` at all lengths. In addition, the global signature of bal- ance ratios on subgraphs of the original network. We ance Uwalks further mixes the contributions of the various show below that this strategy is much better from a com- cycles lengths. Figure 1 illustrates this issue in a triad. putational standpoint than sampling the simple cycles Whilst the network is completely unbalanced for trian- themselves. gles, we get U walks = 0:106, a number that is not easy to interpret. These difficulties cannot be resolved easily using walks. 1. Formula for counting simple cycles In an attempt to better account for the length depen- ` dency of the balance, we define D` := Tr A , K` := Let P (z) be the ordinary generating function of the walks D`=jD`j, U` := (1 − K`)=(1 + K`) and simple cycles of any weighted directed graph G, that is ` ` X `(c) walks Tr jAj − Tr A P (z) := w(c) z ; R` := : (2) 2 Tr jAj` c: simple cycle 3 where w(c) is the weight of c, that is the product of the ∼ 18 sec, yielding 255; 323; 504; 932 ' 2:5 × 1011 for weights of its edges, z is a formal variable and `(c) is the the total number of simple cycles, which we verify length of c.