The Impact of Influence on the Dynamics of Complex Information Cascades
Influence and Cascades • December 10 2013 • Group 34

Adriana Diakite, Emily Glassberg, Eyuel Tessema
Stanford University

I. Introduction

I. Motivation

How do behaviors, contagions, products, and ideas spread through networks? These questions can all be addressed using the conceptual framework of information cascades. The impact of network characteristics on information cascades is of great interest to sociologists, politicians, public health professionals, marketing executives, and many others. Propagation through networks can be simple, meaning that contact with one "infected" node is enough for transmission. Many diseases spread in this fashion. However, propagation can also be complex, meaning that a certain proportion of a node's neighbors must have adopted a behavior or idea before the node will decide to adopt it as well. Current literature suggests that cascades involving social movements, attitudes, and ideas are likely to be transmitted in this way (Granovetter 1978). As a result, the dynamics of complex information cascades differ from those of simple cascades. Furthermore, the dynamics of social networks are influenced by the specific relationships between individuals within that community. For example, it has been shown that people's behavior in social networks is influenced by an individual's status (Leskovec et al., 2010). Existing models of complex transmission are limited in that they treat the relationships between all pairs of nodes as equal, not accounting for potential effects of homophily, interpersonal relationships, and status on an individual's decision to adopt a behavior or idea (Watts 2001, Centola et al. 2007, Centola 2010). This project aims to extend current research on the dynamics of complex information transmission by adding variation in influence across a navigable, small-world network model. We then explore the impact of variation in influence on real-world information cascades by comparing the results of our simulated cascades to a cascade of retweets on the Twitter network.

II. Prior Work

Current research on complex propagation has focused on random networks, regular lattices, and small-world networks. A 2007 paper by Centola et al. investigated the impact of network topology on the "critical threshold," the threshold above which no cascades occur. The authors found that, in random networks, increasing the degree of each node decreases the critical threshold (inhibits propagation). However, in regular lattices, increasing the degree increases the critical threshold (promotes propagation). Most interestingly, increasing the randomness of regular lattices by rewiring edges decreased the critical threshold, again inhibiting propagation. Previous literature has shown that increasing randomness in the form of long-range links allows for increased transmission between neighborhoods (Granovetter, 1973); however, the above results suggest that these conclusions do not hold in the case of complex transmission. Additionally, a 2010 paper by Centola found that complex transmission occurred more rapidly and with greater success (as determined by the proportion of nodes that adopt the cascading trait) in clustered lattices than in random networks. The author suggests that this difference results from increased social reinforcement due to the high degree of clustering. These findings support the claim that the dynamics of complex propagation are distinct from those of simple transmission. Specifically, these results demonstrate that complex and simple information cascades are impacted differently by network topology.

However, the network topologies currently studied in the context of complex information cascades may be insufficient to capture the dynamics of cascades in real-world networks. For example, random graphs do not reproduce the clustering of real-world networks, regular lattices do not capture the small diameter seen in real-world situations, and small-world networks are not navigable in the way that real-world networks are. Furthermore, these analyses assume that each node in the network exerts equal influence on its neighbors. A significant body of research on the impact of properties of social groups such as status and homophily suggests that this assumption is highly unrealistic in real-world settings. In this analysis, we expand existing research on complex information cascades in two ways. First, we investigate the behavior of complex information cascades on navigable, small-world networks. Second, we explore two models of variable influence and their impact on the rate and success of complex information cascades.

II. Methods

I. Description of Models

a) Basic Model - The basic model we will use is an extension of the small-world network model (Watts and Strogatz, 1998). The network will be represented as a balanced binary tree of height H. The leaves of the tree are the nodes of the network. Nodes will be formed into completely connected "neighborhoods" by placing an edge between every pair of nodes in subtrees of height h. Edges will be added between pairs of nodes in different neighborhoods with probabilities that decrease as a function of the distance between the nodes. The distance between a pair of nodes will be the number of steps to the most recent common ancestor of that pair on the tree. This process will be repeated until r edges have been added from each node to nodes in different neighborhoods (Model I). By varying the size of the neighborhoods (2^h) and the inter-neighborhood connectivity (r), this general model will allow for the investigation of variable influence on a variety of network topologies.

b) Models of Variable Influence - Common sense suggests that, in real-world networks, distance between nodes is likely to affect the influence that one node exerts on another. For example, you are more likely to adopt the behavior of a family member or close friend with whom you are often in contact than the behavior of a distant acquaintance whom you rarely see. Therefore, we model influence as a decreasing function of the tree distance between a target node and its neighbor (Model II). Specifically, the influence of a node v on a node w is calculated as 1/d(v, w), where d(v, w) is the number of steps to the most recent common ancestor of v and w on the tree. It is worth noting that Model I also indirectly incorporates a distance-based effect in terms of the ability of one node to influence another (nodes that are further away from a target node are less likely to have an edge to the target and so are unlikely to influence it). For purposes of comparison, we implement a third model wherein influence is modeled as an increasing function of the tree distance between a target node and its neighbor (Model III). In this case, the influence of a node v on a node w is calculated as d(v, w). This model could be interpreted as a scenario in which, as you become familiar with the people around you, you become immune to their views and no longer adopt behaviors from them. By contrast, when you meet someone new, their behaviors or attitudes are surprising or interesting enough that you are more likely to adopt them.

II. Description of Simulations

We generate a series of graphs, each with H = 10, varying h from 0 to 5 and r from 1 to 10. We then calculate the critical threshold (Tc) for each network topology (as specified by h and r) by simulating complex cascades on each of these graphs.

We simulate an information cascade in a network by seeding a random neighborhood with a new state. At each of one thousand time steps, each node will change state if a certain proportion of its neighbors had previously adopted the new state. This proportion is called the adoption threshold (discussed in greater detail below). Each graph contains 2^10 (1024) nodes and, at every iteration, a cascade will either infect at least one additional node or the cascade will cease permanently. 1000 nodes is approximately 98% of the 1024 nodes in the original graph, so, if a cascade has not succeeded by 1000 iterations, we know that the cascade must have ceased prior to the final iteration. Therefore, performing more than 1000 iterations will have no impact on the final cascade size.

On a single network topology, we simulate information cascades as described above at varying adoption thresholds.

[...] the graph adopt the new state), the adoption threshold is increased by 0.001, meaning that, relative to the previous simulation, nodes require at most one additional neighbor to have adopted a trait in order to adopt the trait themselves. The simulation is then repeated until we reach the adoption threshold at which cascades fail. We define the critical threshold (Tc) as the value of the adoption threshold at the last successful cascade (0.001 below the threshold at the first failed cascade). We repeat this procedure for each of the network topologies described above.

Finally, we repeat this set of experiments using each of the models of variable influence described above. In these simulations, the contributions of each neighbor towards the adoption threshold are weighted by the influence score of that neighbor.

III. Analysis of simulations

To compare the relationships between network topology, critical threshold, and influence, we plot 3-dimensional graphs for each of our models with h, r, and Tc on the axes. This analysis, however, only describes cascades in terms of success or failure.

To explore the behavior of cascades in greater detail, we created heatmaps to repre-