Collective Human Behavior in Cascading System: Discovery, Modeling and Applications
Yunfei Lu Linyun Yu Tianyang Zhang Chengxi Zang Tsinghua University Tsinghua University Tsinghua University Tsinghua University [email protected] [email protected] [email protected] [email protected]
Peng Cui Chaoming Song Wenwu Zhu Tsinghua University University of Miami Tsinghua University [email protected] [email protected] [email protected]
Abstract—The collective behavior, describing spontaneously However, collective behavior underlying information cas- emerging social processes and events, is ubiquitous in both cades in online social media is seldom explored. With the physical society and online social media. The knowledge of rapid growth of various online social media, detailed human collective behavior is critical in understanding and predicting social movements, fads, riots and so on. However, detecting, behavior is documented at large scale, offering great opportu- quantifying and modeling the collective behavior in online social nities to study the collective behavior. The most related works media at large scale are seldom unexplored. In this paper, we try to model the cascading system in social networks, such as examine a real-world online social media with more than 1.7 modeling the popularity dynamics [4] or predicting the final million information spreading records, which explicitly document size of cascades [5]. However, none of them tries to examine the detailed human behavior in this online information cascading system. We observe evident collective behavior in information cas- the collective behavior in cascading system which is embodied cading, and then propose metrics to quantify the collectivity. We in the phenomena that a group of users always participate in find that previous information cascading models cannot capture same cascades collectively. the collective behavior in the real-world and thus never utilize it. Furthermore, we propose a generative framework with a latent user interest layer to capture the collective behavior in cascading system. Our framework achieves high accuracy in modeling the information cascades with respect to popularity, structure and collectivity. By leveraging the knowledge of collective behavior, our model shows the capability of making predictions without temporal features or early-stage information. Our framework can serve as a more generalized one in modeling cascading system, and, together with empirical discovery and applications, advance our understanding of human behavior. Index Terms—Collective Human Behavior; Information Cas- cades; Generative Framework I.INTRODUCTION Fig. 1. Illustration of collective behavior in cascading system. We illustrate three collective groups of followers of a same root user. Followers in the Collective behavior describes the phenomenon that people same collective group tend to participate in the same information cascades, exhibit same behavior in a spontaneous way which do not while followers in different collective groups seldom appear together. Possible factors, including timing, structure, topic of posts and user interests, are all reflect existing social structure [1]. The collective behavior responsible for such collective behavior. In this paper, we try to quantify and lies in various social phenomena, ranging from the worldwide model such collective behavior in cascading system. stock crashes in 2018, the popularity of the billion-view- ¨ video Gangnam Style¨, to inconspicuous ones such as several In this paper, we collect more than 1.7 million information customers having meals in a restaurant at some point. The spreading records from Tencent Weibo, a Twitter style social collective behavior underlying these phenomena cannot be media in China, which explicitly document the detailed human explained by existing social structure, but indicates that they behavior in this online information cascading system. We find share some unknown common points, which might be social- evident collective behavior in real-world cascading system. In economic factors, interests or eating habits, etc. Although Fig.1 we illustrate three collective groups of followers of a different opinions on interpretations, the existence and signif- same root user. Followers in the same collective group tend to icance of collective behavior is widely recognized by public. participate in the same information cascades, while followers The usefulness of the understanding of collective behavior is in different collective groups seldom appear together. The further proven by different research topics, such as predicting factors that influence aforementioned collective behavior are human mobility [2], analyzing human activity patterns [3] and complex. First, timing can play an important role in collective so on. behavior due to the shared daily routine of followers in the same collective group. Second, followers in the same collective However, none has discovered the collective behavior in cas- group can be embedded in a same tightly-knit community, cading system and promote cascade modeling and prediciton indicating the impact of structure. Third, followers in the with this collectivity. same collective group can possibly share similar interests in Cascading system. In social network, a piece of informa- specific topics of tweets. Furthermore, all these factors can tion may get reshared multiple times: one shares the content drive collective behavior simultaneously. Thus, the critical with friends, several of these friends also share it with their problem is: Can we model the collective behavior in cascading respective sets of friends, then information cascade develop system caused by the complex and mixture factors? and this phenomenon happens all the time, which constitutes In order to capture the collective behavior, we first provide cascading system. In recent years, many methods have been metrics to quantify the collectivity of behavior, and then prove proposed to model cascading system and make prediction on that previous information cascading models cannot capture the cascades. Some of them focus on predicting the future size of observed collectivity. Furthermore, we propose a generative a cascade with topological characteristics of the cascade [10] framework to model the collective behavior in cascading social or dynamic information [11]. Among them Cheng et. al [12] systems. Our genreative framework is based on point process indicate the significance of temporal features like retweeting with a latent user interest layer to capture the collective rate among other features. Other methods, mainly based on behavior at behavior level directly in cascading system. With point process [13] or survival analysis [14], attempt to model this framework, we not only successfully capture the collective cascade dynamics and predict the evolution of popularity. behavior, but also model cascading system more accurately on Shen et. al [4] employ reinforcement poisson process to cascade popularity, structure and their correlation. Besides, our model cascades. Kobayashi et. al [15] take human circadian framework shows excellent extensibility and provides unique nature into account. Mishra et. al [16] combine feature-driven capability of predicting the popularity and participants of methods with generative modeling approaches. cascade with knowing only the identity of a few randomly However, these methods mainly treat each cascade as an selected participants. independent process rather than considering the whole cas- In short, we summarize our contributions as follows: cading system as an entirety. Former studies concentrate on • Quantification of collectivity: We discover the ubiqui- popularity but disregard the structure patterns and collective tous collective behavior in cascading system and propose behavior, for which there is still no comprehensive framework metrics to quantify the collectivity of it. that can achieve accuracy in all three aspects of cascades: • A generative framework: We propose a generative popularity, structure and collectivity. framework which is based on point process with a latent III.COLLECTIVE BEHAVIOR IN CASCADING SYSTEM user interest layer to capture the collective behavior in cascading system. Our framework shows excellent Given a newtowrk G = (V,E), where V indicates set of extensibility and unification power. users and E indicates the relations, supposing a user u usually post different kinds of root tweets < t , t , ..., t >, • Accuracy and usefulness: With the knowledge of col- 1,u 2,u n,u lectivity, our framework accurately matches real world where ti,u indicates the ith kind of information, his followers cascading system in popularity, structure and collectiv- are ususally only interested in one or a few kinds among ity, providing capability of making predicitons without them, due to their interest or some other reasons, and thus temporal features. tend to appear in the cascades triggerred by the corresponding roots. Denoting the cluster of users who frequently retweet The outline of the paper is: survey, method, collective ti,u as Ci,u, if there are many cooccurence for users in same behavior in cascading system, experiments, conclusions and clusters but few cooccurence for users in different clusters, future work. we regard this phenomenon as the collective behavior in cascading system. That means when u triggers a cascade, II.RELATED WORK people involved by retweeting are always those interested in As the investigated problem is closely related to collective this kind of information, rather than randomly selected from behavior and information cascades, we mainly review the followers or completely determined by social relationship. related works in these two fields. Users in same cluster share similar interests, which signifies Collective behavior. Collective human behavior is tightly that they are likely to keep this collectivity in future cascades. associated with human culture [6], which has wide applications Also, users with opposite interest present mutual exclusion in in social network researches. Lehmann et al. [7] find the their behavior of cascade participation. dynamic classes of hashtags from the spikes of collective An intuitional method to quantify collective behavior is to attention in Twitter. Tang et al. [8] propose a scalable learning calculate the correlation of behavior between users. For each method for collective behavior based on sparse social dimen- user u, we construct his behavior vector