
Examining Untempered Social Media: Analyzing Cascades of Polarized Conversations

Arunkumar Bagavathi∗, Pedram Bashiri∗, Shannon Reid†, Matthew Phillips†, Siddharth Krishnan∗
∗Dept. of Science, †Dept. of Criminal Justice
University of North Carolina at Charlotte

Abstract—Online social media periodically serves as a platform for cascading polarizing topics of conversation. The inherent community structure of online social networks (homophily) and the advent of fringe outlets like Gab have created online “echo chambers” that amplify the effects of polarization, which fuels detrimental behavior. In October 2018, Gab made headlines when it was revealed that Robert Bowers, the individual behind the Pittsburgh synagogue massacre, was an active member of this social media site and used it to express his anti-Semitic views and discuss conspiracy theories. To address the need for automated, data-driven analyses of such fringe outlets, this research proposes novel methods to discover the topics that are prevalent on Gab and how they cascade within the network. Specifically, using approximately 34 million posts and 3.7 million cascading conversation threads with close to 300k users, we demonstrate that essentially five cascading patterns manifest on Gab, and that the most “viral” ones begin with an echo-chamber pattern and grow out to the entire network. We also show empirically, through two models, viz. Susceptible-Infected and Bass, how the cascades structurally evolve from one of the five patterns to another based on the topic of the conversation, with up to 84% accuracy.

Index Terms—Polarized conversations, conversation cascades, conversation topics, cascade evolution models

ASONAM ’19, August 27-30, 2019, Vancouver, Canada. © 2019 Association for Computing Machinery. ACM ISBN 978-1-4503-6868-1/19/08. http://dx.doi.org/10.1145/3341161.3343695

I. INTRODUCTION

Fringe social media sites, such as Gab, 8chan, and PewTube, have become fertile ground for individuals and groups with far right and extreme far right views to post and share their messages in an unfettered manner and to galvanize supporters for their cause [1]. While most mainstream social media platforms moderate their content and deplatform more extreme users and groups, the emergence of outlets like 8chan and Gab.ai has given radical groups large content delivery networks to broadcast their polarizing messages. The echo chamber effects in these social networks [2], [3] are often amplified by the online conversations and interactions that occur on them. In recent times, we have observed that such interactions, combined with the exploitation of online social networks [4], are an effective strategy to recruit members, instigate the public, and ultimately culminate in riots and violence, as witnessed recently in Charlottesville and Portland. Because of the threat of violence that these groups bring with them, analyzing the online dynamics of conversations and interactions on such social media sites is an important problem that the research community is trying to address [5], [6].

In this work, using the well-known propagation mechanism of information cascades [7], we demonstrate how polarized conversations take shape on Gab. By analyzing 34M posts and 3.7M cascades built from conversations on Gab, we show that there are five different classes of conversation cascades, each with a different level of user participation and response. Along with analyzing structural properties, we examine the post-level details of the cascades with an algorithm that classifies hashtags, and eventually cascades, into topics. We observe that controversial topics are adopted by users that are more strongly connected and that these topics generate larger cascades. By analyzing the structural dynamics of conversation cascades on Gab, we observe that all cascades start with a simple linear pattern and evolve into other patterns when a topic becomes viral and more users join the conversation. We found that an evolution on Gab takes 1.5 days and at least 3 posts on average. We model this evolution of cascades using popular network growth models, namely the Susceptible-Infected model [8] and the Bass model [9]. Our best model fit gives up to 84% accuracy.

Essentially, we answer the following research questions, with our corresponding contributions, in this broad study of Gab:
• What are the types of cascading behavior in Gab conversations? We study conversation patterns and their dynamics in Gab as cascades and provide metrics to measure structural patterns within conversations.
• Can we characterize user relationships and types of topics in Gab conversations? We propose an algorithm to cluster the topics that circulate in conversation cascades, and our results show that polarizing topics gain considerable traction in Gab conversations.
• Can we study the evolution of Gab conversations using the cascades? We give pathways along which Gab conversation cascades evolve over time. To capture these evolution patterns algorithmically, we use the Susceptible-Infected model [8] and the Bass model [9].

II. RELATED WORK

Work related to our research falls into three areas: i) applications of Gab analysis, ii) cascading behavior in social media, and iii) quantitative analysis of topics in social media.

The use of social media to study radicalization [10], discrimination [11], and fringe web communities [12] has gained traction over the past few years. In particular, past studies highlight that the advent of Gab.com creates scope for research on topics like alt-right echo chambers and hate speech [1], [2]. Most notably, recent work studied the spread of hate speech among Gab users with the help of repost cascades and the friend/follower network [6]. Our research adds an extra dimension to existing work on Gab: we analyze its conversations, examine the growth of topics, and give measures and metrics to study conversation structure and evolution from the perspective of cascades.

Cascades in social media account for information dissemination [7], which is relevant to a variety of applications such as fake news [13], viral marketing [14], and emergency management [15]. These information cascades are crucial to machine learning models for applications such as modeling influence propagation in social media [16], predicting the number of reshares using a self-exciting point process [17], and modeling network growth patterns as an alternative to sigmoid models [18]. In this work, we reframe information cascades as conversation cascades and give novel ideas for defining models of cascade evolution types in Gab conversations.

Just as on Twitter [19], hashtags on Gab are a means to add metadata to posts that highlights topics. Some researchers defined the topic (class) of hashtags manually [20], [21]. Lee et al. [22] classify Twitter Trending Topics into 18 general categories using one bag-of-words text-based approach and one network-based approach. Wang et al. [23] propose the Hashtag Graph-based Topic Model (HGTM) to discover topics of tweets. We believe that manual labeling is not scalable and that unsupervised algorithms are not accurate enough due to the nature of hashtags, so we propose a supervised, semi-automated procedure to classify hashtags.

III. PRELIMINARIES

In this section, we briefly describe the dataset and the terminology used in our methods.

A. Dataset Description

Gab.com/Gab.ai is a social media forum, founded in 2016, that lets users connect and share information. Even though the description of the forum looks similar to counterparts like Twitter and Facebook, Gab states that it supports individual liberty and is committed to free speech in the social media community¹. Users of Gab can share information via posts, post replies, and quotes/reshares. We use the data published by [24] and by the data scraping forum pushshift.io². Figure 1 gives an overview of the dataset as a time series of the number of posts, replies, and quotes that appeared on Gab between August 2016 and October 2018. The dataset is a comprehensive collection with 34 million posts, replies, and quotes, about 15,000 groups, and information on about 300,000 public users. Figure 1 shows that our dataset comprises 55% posts, 30% replies, and 15% quotes. The data comes with a complete set of metadata such as time, attachments, likes, dislikes, replies, and quotes, along with post and user details.

Fig. 1: Time series of the frequency of posts, replies, and reshares from the origin of gab.com (August 2016) until the forum went down in the last week of October 2018.

B. Conversation cascades

Microblogging conversations have been widely studied in the context of cascades for a wide spectrum of applications such as emotion analysis [5] and topic modelling [25]. In this work, we give a variety of cascade representations for conversations on Gab and provide an in-depth analysis of their structure, temporal behavior, response rates, and longevity. A conversation cascade is a sequence of posts made as responses or replies to other posts/users, which together act as a conversation. With the available posts, replies, and quotes/reshares from Gab, we construct conversation cascades, where each cascade is one complete conversation. Nodes in each cascade represent an original post, reply, or quote, and edges represent a reply to or quote of a post, reply, or quote.

Conversation cascade and construction process: A conversation cascade is a directed graph $G_c = (V_c, E_c)$, where $V_c$ is a set of posts/replies/quotes and $E_c$ is a set of edges connecting posts ordered by time. We represent a node/post in the cascade as $P(v, t)$, where $v \in V_c$ is a post appearing in the social media at time $t$. There exists a directed edge $(v, v')$ when $P(v, t)$ receives a reply/quote $P(v', t')$, where $t' > t$. This process continues each time a user replies to or quotes a post.

Thus, the root node of a conversation cascade represents the original post, and replies and quotes branch from the root node. Nodes at the second level or deeper branch again if they in turn receive any replies or quotes.
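The construction process described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' pipeline: the record layout `(post_id, parent_id, timestamp)` and the function name `build_cascades` are assumptions made for the example, and a reply whose parent is missing or not strictly earlier is simply treated as the root of a new cascade.

```python
def build_cascades(posts):
    """Group posts into conversation cascades (directed trees).

    posts: iterable of (post_id, parent_id, timestamp), where parent_id is
    None for an original post (hypothetical record layout).
    Returns {root_id: {"nodes": set_of_ids, "edges": [(parent, child), ...]}}.
    """
    root_of = {}   # post_id -> id of the cascade root it belongs to
    time_of = {}   # post_id -> timestamp
    cascades = {}
    # Process posts in time order so a parent is seen before its replies.
    for post_id, parent_id, t in sorted(posts, key=lambda p: p[2]):
        time_of[post_id] = t
        has_parent = parent_id in root_of and time_of[parent_id] < t
        if has_parent:
            # Edge (v, v') exists because the reply arrives at t' > t.
            root = root_of[parent_id]
            cascades[root]["nodes"].add(post_id)
            cascades[root]["edges"].append((parent_id, post_id))
        else:
            # Original post (or orphaned reply): start a new cascade.
            root = post_id
            cascades[root] = {"nodes": {post_id}, "edges": []}
        root_of[post_id] = root
    return cascades


example = [("a", None, 1), ("b", "a", 2), ("c", "a", 3),
           ("d", "b", 4), ("x", None, 5)]
result = build_cascades(example)
print(sorted(result))          # roots of the cascades
print(result["a"]["edges"])    # reply/quote edges of the first cascade
```

Keeping one edge list per root makes it straightforward to compute per-cascade statistics such as depth, volume, or the number of unique users afterwards.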

¹ https://gab.com/
² https://pushshift.io

TABLE I: Basic statistics of depth, volume, number of unique users, and structural virality (Wiener Index) of all cascade types. Overall, cascade type E achieves the highest structural virality even though its number of unique participating users is low.

| Cascade type | Depth Avg. | Depth Std. Dev. | Volume Avg. | Volume Std. Dev. | Users Avg. | Users Std. Dev. | Wiener Index Avg. | Wiener Index Std. Dev. |
|---|---|---|---|---|---|---|---|---|
| A | 2.91 | 2.52 | 2.33 | 1.09 | 2.89 | 0.41 | 0.57 | 0.21 |
| B | NA | NA | 4.1 | 2.38 | 2.27 | 6.41 | 0.38 | 0.22 |
| C | 2.79 | 1.07 | 8.59 | 3.25 | 5.27 | 16.19 | 0.37 | 0.22 |
| D | 7.81 | 2.54 | 24.53 | 16.45 | 11.0 | 16.19 | 0.45 | 0.47 |
| E | 6.9 | 4.9 | 15.72 | 10.76 | 3.17 | 2.88 | 0.93 | 0.7 |

The structural virality of a cascade of size $n$ is computed with the Wiener Index (WI) [26], given in Equation 1, where $d_{ij}$ is the shortest distance between nodes $i$ and $j$.

$$WI = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} \tag{1}$$

From Table I and Figure 3, we find that cascades of type E attain higher volume and depth with very few participating users, and that they play a vital role in Gab conversations.
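As a worked illustration of Equation 1, the sketch below computes the Wiener Index of a single cascade by running a breadth-first search from every node. It reads the equation literally, treating the cascade as an undirected, connected tree given as an edge list; the function name `wiener_index` is an assumption for the example. Since the averages in Table I are below 1, the authors may normalize or aggregate differently than this direct reading.

```python
from collections import defaultdict, deque


def wiener_index(edges):
    """Average shortest-path distance over all ordered node pairs (Eq. 1).

    edges: list of (u, v) pairs of a connected cascade, treated as undirected.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0
    for src in adj:
        # Breadth-first search gives shortest distances from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        total += sum(dist.values())
    return total / (n * (n - 1))


path = [("a", "b"), ("b", "c")]              # linear conversation
star = [("r", "x"), ("r", "y"), ("r", "z")]  # all replies hit the root
print(wiener_index(path), wiener_index(star))
```

Running BFS from every node costs O(n²) per cascade, which is acceptable here because most cascades are small (see the volume column of Table I).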

Fig. 2: (a) Cascade types; (b) number of posts per cascade type. Possible shallow cascading structures in replies and reshares of posts, and the number of posts in each cascade type. As predicted, most of the cascades follow simple patterns (type A). Interestingly, many conversations form cascade type B, in which many responses come directly to the root post and which does not evolve into any other cascade type. A small number of conversations follow cascade type C, which splits at the root, with each branch from the root following a linear pattern. Very few conversations follow cascade type D, which has a highly nested structure at both root level and branch level, or cascade type E, which initially follows a linear pattern and becomes non-linear as it evolves.

In total, we constructed 1,721,441 cascades from the Gab data, which comprise 19,220,059 nodes/posts contributed by 173,581 users and 15,476,852 edges. These cascades show that conversations on Gab follow one of five patterns, which are detailed in Figure 2a. The conversation cascade types give a generic representation of user engagement patterns over time for a post. They are also used to represent linear and non-linear conversation patterns on Gab. Figure 2b gives the number of cascades in each conversation cascade pattern. In Table I, we report statistics of cascade depth, volume, number of unique participating users, and structural virality for all cascade types. Structural virality is measured with the Wiener Index [26] (Equation 1).

Fig. 3: Depth and size distribution of each cascade type. (a) Depth distribution of cascades; (b) size distribution of cascades.

We also analyze the depth and volume (number of nodes) of each cascade type and give the corresponding distributions in Figures 3a and 3b respectively. From Figure 3a we note that users on Gab have longer conversations, which branch further after level 2 of the conversation thread (cascade types D and E). Cascade type B is not shown in Figure 3a because these conversations split at the root node (post) and terminate at the next level. Figure 3b shows the distribution of volume (number of nodes/replies/quotes) across cascade types. The volume distribution is similar to the depth distribution in Figure 3a (cascade types D and E have much more engaged participation). Interestingly, we find that cascades of type B have more participation than cascades of type A, even though they stop at level 1.

C. Cascades across topics

Different topics spread in different ways over networks in terms of speed, number of participants, and the dynamics of their cascades. We analyze how this intuitively accepted notion applies to cascades on Gab and which topics in particular differ significantly from the others. We give a representative set of hashtags and their topics in Table II. We then use the labeled hashtags to classify the cascades in which those hashtags appear. Our analysis shows that more controversial topics tend to result in larger cascades, the majority of them of the same cascade type (type D). We specifically identified three topics, “Antisemitism”, “Anti Islam”, and “White Supremacy”, as noticeably different from other topics with regard to the nature of the cascades in which they are discussed.

TABLE II: Topic categories, a set of example hashtags, the number of cascades, and the average size of cascades in each category.

| Topic | Examples of Hashtags | Number of Cascades | Avg. Cascade Size |
|---|---|---|---|
| Conservatism | #capitalism, #nationalsocialism, #freedom | 7337 | 15.26 |
| Anti Semitism | #kikes, #jewproof, #nazis, #jewsdid911, #110neveragain | 161 | 47.58 |
| White Supremacy | #altright, #whitegenocide, #folkright, #newright, #retribalize | 7934 | 19.64 |
| Freedom of Speech | #speakfreely, #1sta, #censors, #freespeech, #shallnotcensor | 13508 | 14.10 |
| Anti Islam | #banislam, #bansharia, #shariakills | 5379 | 16.01 |
| Conspiracy Theories | #qanon, #pizzagate, #q, #wwg1wga, #deepstate, #thestorm, #latearth | 23921 | 15.30 |

We define the tie strength of users $u_1$ and $u_2$ as the number of times $u_1$ replies to a post from $u_2$ or $u_2$ replies to a post from $u_1$. Figure 4 shows that users who participated in the topics “Antisemitism”, “Anti Islam”, and “White Supremacy” have higher tie strength and are more strongly connected, which supports the theory that more controversial topics are adopted by users with higher tie strength [27]. The ADL³ holds that white supremacist, hateful, and antisemitic bigotry is widespread on Gab. Our findings are aligned with this statement and with other studies that argue that antisemitic and white nationalist topics are openly expressed on Gab and have great similarities in terms of the communities that adopt them [21], [28].

Fig. 4: Average tie strength of users participating in cascades on different topics. Users participating in cascades on polarizing topics like antisemitism, anti Islam, and white supremacy tend to have higher tie strength, i.e. a higher frequency of interactions between two users.

³ www.adl.org

IV. TOPIC DISCOVERY OF CASCADES

We introduce a novel procedure for labeling hashtags with the topics they belong to. We first looked at the top 200 most used hashtags on Gab. These hashtags were then classified by subject matter experts into 6 topics. Unlike other work on Twitter [27], our work does not cover a broad range of topics. The majority of posts on Gab are on political and rather controversial topics, so we went a few steps deeper and classified these political topics into more specific ones. We used tagdef⁴, tagsfinder⁵, Google, and Twitter to find the meaning of hashtags and assigned each to one of the 6 categories. Excluding hashtags that are too broad to be assigned to a specific topic, for instance #gab, #eu, #music, or #welcome, we labeled 126 hashtags.

⁴ www.tagdef.com
⁵ www.tagsfinder.com

In the next step, we used our algorithm to label other hashtags used on Gab based on the 126 hashtags that we labeled manually. As shown in Algorithm 1, the inputs of the method are the network of hashtags, the list of topics with the set of hashtags assigned to each topic, and a constant threshold. We define the network of hashtags as $G = \{V, E\}$, where $V$ is the set of hashtags used on Gab and $E$ is the set of weighted edges. An edge between two vertices indicates that the two hashtags have appeared in the same post at least once, and the weight of the edge is the number of co-occurrences. We build this network of hashtags following [23], but add weights to the edges to underscore the importance of the number of co-occurrences between two hashtags.

Algorithm 1 Topic Discovery of Hashtags
Require: G, T and τ
 1: procedure LABEL_HASHTAGS
 2:   for e ∈ E do
 3:     u = get_node_with_no_topics(e)
 4:     v = e − u
 5:     if u ≠ ∅ then
 6:       topics = get_topics(v)
 7:       for t ∈ topics do
 8:         inc_node_prop(u, t, w)
 9:   for n ∈ V do
10:     for t ∈ T do
11:       p_t = get_property(G, n, t)
12:       if p_t > τ then
13:         add_hashtag_to_topic(G, n, t)

In line 3 of the algorithm, an edge is passed to the get_node_with_no_topics() method, which returns the vertex that is not already assigned to any topic. The method returns null if both vertices of the edge are already assigned topics or if neither has any topic assigned to it: first, we are not interested in labeling hashtags that are already labeled, and second, we cannot label a node if none of its neighbors is labeled. If the method returns a node u, we get in step 6 the topics assigned to v, the other vertex of the edge, and then, for each topic t in topics, we increment the integer property of node u that represents t by w, the weight of edge e. After all edges have been traversed and the properties of their respective vertices are updated, we move to steps 9 to 13, where for each node n in the graph and each topic t in T we get the property p_t. If the value of p_t is greater than the threshold τ, we conclude that node n, and the hashtag it represents, belongs to topic t.

After labeling hashtags with one or more topics, we assign a cascade to topic t1 if hashtags of t1 appear in it more than C times. Note that a cascade can belong to more than one category. Table II shows the 6 topics we identified, some examples of the hashtags in each category, and the number of cascades and average size of cascades on each topic.

TABLE III: Performance of the hashtag labeling algorithm. Our algorithm produced high values for the conventional performance metrics of accuracy, recall, and precision. Low HL (Hamming Loss) and high SA (Subset Accuracy) also strongly indicate that our algorithm performs well in classifying hashtags into topics.

| Metric | Accuracy | Recall | Precision | F1 | HL | SA |
|---|---|---|---|---|---|---|
| Value | 71% | 73% | 91% | 77% | 16% | 84% |

To evaluate our algorithm, we designed a procedure in which we randomly picked 42 hashtags from the set of 126 labeled hashtags and labeled the rest of the hashtags in the set using our algorithm. We then compared the results of our algorithm with our manual labeling. Figure 5 shows the distribution of hashtags among different topics: Figure 5a shows the ground truth, i.e. manually labeled hashtags, and Figure 5b shows the results of our algorithm. Since one hashtag can belong to multiple topics, we used the evaluation metrics for multi-label learning algorithms described in [29].
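The label-propagation idea of Algorithm 1 can be sketched in Python as follows. This is an illustrative reading, not the authors' code: the weighted edge list, the `seed_topics` dictionary, and the function name `label_hashtags` are assumptions made for the example, and the per-node "integer property" of the paper is modeled as a nested dictionary of scores. Only nodes that received some score are checked against the threshold, which matches looping over all nodes as long as τ is non-negative.

```python
from collections import defaultdict


def label_hashtags(edges, seed_topics, tau):
    """Propagate topics from labeled hashtags to unlabeled neighbors.

    edges: iterable of (hashtag_1, hashtag_2, co_occurrence_weight).
    seed_topics: {hashtag: set_of_topics} for manually labeled hashtags.
    tau: threshold a topic score must exceed to be assigned.
    Returns {hashtag: set_of_topics} for newly labeled hashtags.
    """
    score = defaultdict(lambda: defaultdict(int))
    for h1, h2, w in edges:
        labeled_1, labeled_2 = h1 in seed_topics, h2 in seed_topics
        if labeled_1 == labeled_2:
            continue  # both labeled or both unlabeled: nothing to propagate
        # u is the unlabeled endpoint, v the labeled one (steps 3-4).
        u, v = (h2, h1) if labeled_1 else (h1, h2)
        for t in seed_topics[v]:   # steps 6-8
            score[u][t] += w
    labels = {}
    for u, per_topic in score.items():   # thresholding, steps 9-13
        topics = {t for t, s in per_topic.items() if s > tau}
        if topics:
            labels[u] = topics
    return labels


edges = [("#a", "#x", 3), ("#b", "#x", 1), ("#a", "#b", 5),
         ("#x", "#y", 9), ("#c", "#y", 2)]
seeds = {"#a": {"T1"}, "#b": {"T2"}, "#c": {"T1"}}
new_labels = label_hashtags(edges, seeds, tau=2)
print(new_labels)
```

In this toy input, `#x` co-occurs three times with a `T1` hashtag and once with a `T2` hashtag, so with τ = 2 it is assigned only `T1`; `#y` scores exactly 2 for `T1` and therefore stays unlabeled.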
Table III shows how well our algorithm performs in labeling the hashtags with topics. We report accuracy, recall, precision, F1 score, Hamming Loss (HL), and Subset Accuracy (SA).

Fig. 5: Accuracy of the hashtag labeling algorithm. (a) shows the ground truth, i.e. manually labeled hashtags, and (b) shows how our algorithm classified the same hashtags. The overall accuracy of the algorithm is 71%, as given in Table III.

V. CASCADE ANALYSIS

A. Response rate in cascades

With the cascade types proposed in Section III-B, we analyze the response time of each cascade type. With this experiment, we aim to show how fast posts in these cascades get responses that help the cascades grow and evolve. We also provide a notion of response rate to compare the velocity of responses at all levels of a given cascade type. We define the response rate of a cascade type $c$ as the average speed of responses to posts at a given level/depth of that cascade type. Equation 2 gives the response rate $R_c^l$ at a given level/depth $l$ for a given cascade type $c$, where $\Delta_c^l$ is the average response time for posts at level/depth $l$ of cascade type $c$.

$$R_c^l = \frac{1}{\Delta_c^l} \tag{2}$$

The distribution of average response rates of all cascade types at each level is given in Figure 6. Interestingly, we find that all cascade types start with an almost equal, slower response rate and progress to faster responses over time. Importantly, cascades of type C achieve a higher response rate overall.

Fig. 6: Avg. response rate at each level/depth of the cascade. Higher values represent faster average response time. Simpler cascades (types A and C) have higher response rates than complex cascade structures (types D and E).
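Equation 2 amounts to inverting a grouped average, which the following sketch makes explicit. The input layout, a list of `(cascade_type, level, response_time_seconds)` tuples, and the function name `response_rates` are assumptions for illustration; the paper does not state the time unit used for $\Delta_c^l$.

```python
from collections import defaultdict


def response_rates(samples):
    """Compute R_c^l = 1 / (average response time) per (type, level).

    samples: iterable of (cascade_type, level, response_time_seconds),
    one entry per reply (hypothetical layout).
    """
    total_time = defaultdict(float)
    count = defaultdict(int)
    for ctype, level, dt in samples:
        total_time[(ctype, level)] += dt
        count[(ctype, level)] += 1
    # 1 / (sum / count) == count / sum; skip groups with zero total time.
    return {key: count[key] / total_time[key]
            for key in total_time if total_time[key] > 0}


samples = [("A", 1, 10.0), ("A", 1, 30.0), ("C", 1, 5.0), ("C", 2, 2.0)]
rates = response_rates(samples)
print(rates)
```

With these inputs, replies at level 1 of type A take 20 seconds on average, giving a rate of 0.05 responses per second, while type C at level 1 responds four times faster.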

The higher response rate of type C cascades is most visible at earlier levels, even though they do not grow as large or as deep as other cascades. Cascades of type A follow a constant response rate like other cascade types, with spikes as depth and volume increase. We also find that the response rate of larger cascades such as types D and E is inversely proportional to the distribution of the volume of the cascade.

B. Evolution of cascades

Given the cascade types of Gab conversations, we study their growth patterns and evolution. All Gab conversations/cascading patterns, as given in Figure 2, start as type A, and some of them evolve into other cascade types. All possible transformations between the proposed cascade types are given in Figure 7. The figure shows that a conversation cascade reaches its maximum potential by transforming to cascade type D, and that cascades must evolve into other types (B, C, E) before reaching type D. Given these evolution patterns and the number of occurrences of each cascade type from Figure 2b, we find that there is significant evolution of cascades in our data.

Fig. 7: Overview of cascade evolution as a state diagram. All cascades start as type A and can evolve at most to type D.

With these evolution patterns available, we provide basic analyses such as the time (Figure 8a) and number of posts (Figure 8b) required by a cascade to evolve into another. As a summary of these plots, Table IV gives the minimum, maximum, average, and standard deviation of the time and number of posts required for cascades to evolve.

Fig. 8: Summary distributions of the amount of time and number of posts taken by a cascade to evolve into another. (a) Time taken by a cascade type to evolve into another; (b) number of posts for a cascade type to evolve into another. Sudden spikes in the plot are due to anomalies in the data. 30% of cascades in each cascade type require less than 2 minutes, and the number of cascade evolutions decreases as the number of posts increases.

Figure 9 gives a time series of all possible cascade evolutions on Gab. Modeling this evolution helps to study the intensification of conversations on Gab. Conversations in any social media intensify when they create interest or controversy among users about the topic. We use traditional models, namely the Susceptible-Infected [8] and Bass [9] models, to fit the evolution patterns that exist in Gab conversation cascades. We prefer these models because of their ability to fit sigmoid curves, which in turn map exponential growth or exponential fall [18].

Fig. 9: Normalized time series distribution of the number of cascade evolutions for each evolution type.

The goal of our models is to predict the number of new evolutions, $\frac{dn(t)}{dt}$, given a time $t$ and a cumulative sum of user parameters (denoted $\epsilon$ here). Equation 3 gives a modified form of the SI model.

Fig. 10: SI and Bass model fits for the cascade evolution time series: (a) A → B, (b) A → C, (c) A → E, (d) B → C, (e) C → D, (f) E → D. The SI model fits the simple evolution types, while the Bass model captures the sharp spike that occurs near the end of the time series.
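The two growth models compared in Fig. 10, and the MAPE used to score them, can be written down compactly. The sketch below evaluates the right-hand sides of the modified SI equation (Equation 3) and the Bass equation (Equation 4) and is only an illustration under stated assumptions: the function names, the argument `eps` standing for the user parameter, and all numeric values are hypothetical, and no parameter fitting is performed.

```python
import math


def si_rate(n_t, N, beta, alpha, gamma, eps=1.0):
    """Modified SI rate: eps * beta * n(t)^alpha * (N - n(t))^gamma."""
    return eps * beta * (n_t ** alpha) * ((N - n_t) ** gamma)


def bass_rate(t, m, p, q, eps=1.0):
    """Bass rate: eps * m * ((p+q)^2 / p) * e^{-(p+q)t} / (1 + (q/p) e^{-(p+q)t})^2."""
    decay = math.exp(-(p + q) * t)
    return eps * m * ((p + q) ** 2 / p) * decay / (1.0 + (q / p) * decay) ** 2


def mape(actual, predicted):
    """Mean Absolute Percentage Error, skipping zero-valued observations."""
    pairs = [(a, f) for a, f in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


# Illustrative values only; they are not fitted to the Gab data.
print(si_rate(n_t=10, N=100, beta=0.5, alpha=1.0, gamma=1.0))
print(bass_rate(t=0.0, m=100, p=0.1, q=0.4))
print(mape([100.0, 200.0], [90.0, 220.0]))
```

At t = 0 the Bass rate reduces to m · p, which is a quick sanity check for any implementation. In practice the parameters would be estimated from the observed evolution time series, for example with a least-squares curve-fitting routine, before computing the MAPE.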

TABLE IV: Number of posts and time required by a cascade type to evolve into another. Overall, evolution on Gab is slow: cascades need few posts but a long time to evolve.

| Measure | Statistic | A → B | A → C | A → E | B → C | C → D | E → D |
|---|---|---|---|---|---|---|---|
| # replies to evolve | Min. | 2 | 3 | 3 | 1 | 1 | 1 |
| # replies to evolve | Max. | 3 | 43 | 163 | 134 | 208 | 675 |
| # replies to evolve | Avg. | 3 | 4.4 | 4.85 | 2.59 | 3.91 | 3.58 |
| # replies to evolve | Std. Dev. | 0.03 | 1.03 | 1.92 | 3.11 | 4.95 | 7.53 |
| Time to evolve (hrs) | Min. | 5s | 32s | 10.5s | 11.9s | 26s | 63s |
| Time to evolve (hrs) | Max. | 18742 | 18625 | 17069 | 17838 | 18421 | 17036 |
| Time to evolve (hrs) | Avg. | 30.41 | 40 | 38.02 | 38.69 | 55.98 | 52.86 |

In the SI model (Equation 3), we restrict $\alpha, \gamma < 1$ to limit the complete participation of susceptibles $(N - n(t))$ and infected ones $(n(t))$, under the assumption that not all previous evolutions are responsible for the current evolution.

$$\text{(SI)} \quad \frac{dn(t)}{dt} = \epsilon \cdot \beta \cdot n(t)^{\alpha} \cdot (N - n(t))^{\gamma} \tag{3}$$

where
$\beta$ is the evolution rate of cascades,
$N$ is the total number of cascades that evolved,
$n(t)$ is the cumulative sum of cascades evolved at time $t$, in other words the total number of infected ones at time $t$.

In Equation 4, we give the Bass model with the user parameter $\epsilon$. Although the Bass model was introduced to describe how a new business product takes effect in a population, it has also been widely used to understand diffusion and influence patterns in social networks. Like the SI model, the Bass model generates an S-shaped curve that fits exponential growth patterns.

$$\text{(Bass)} \quad \frac{dn(t)}{dt} = \epsilon \cdot m \cdot \frac{(p+q)^2}{p} \cdot \frac{e^{-(p+q)t}}{\left(1 + \frac{q}{p} e^{-(p+q)t}\right)^{2}} \tag{4}$$

where
$m$ is the total number of potential adopters,
$p$ is the parameter that models external influences such as videos and news articles,
$q$ is the parameter that models internal influences such as conversations.

The results of both models in mapping the cascade evolution are given in Figure 10. Each plot in this figure represents its corresponding evolution type; for example, Figures 10a and 10d show the results for the evolution types A → B and B → C respectively. From these results, we note that the performance of the SI model degrades as the evolution types become complex (for example, types C → D and E → D). We evaluate the performance of both models by the Mean Absolute Percentage Error (MAPE). The error rates of our models are given in Figure 11. Although this evolution problem could have its own dedicated model, we leave that to future work.

Fig. 11: Error (%) of the SI and Bass models in predicting the evolution patterns. Our best model fit reaches 84% accuracy (for the A → C cascade evolution).

VI. CONCLUSION AND DISCUSSION

Online extremism has gained momentum in the past decade due to the extensive use of social media, and many works have used Gab as one of the platforms to analyze the problem. In this work, we have given an extensive study of Gab using its conversation patterns and their related topics. We provide cascade templates for user conversations on Gab as conversation cascades. Dissecting Gab conversations into these cascade types gives an intuition for analyzing the more viral and responsive cascades. We provided a variety of analyses that revolve around these cascades and gave models that fit cascade evolution over time. We also studied topics in the form of hashtag co-occurrence and gave an algorithm to cluster hashtags into corresponding topics.

In the future, we plan to incorporate multiple social media forums like Gab, Twitter, and Reddit in the context of polarizing conversations and hate speech. We mainly aim to study information diffusion and mutation patterns across online social media during shock events. Given this problem, there are various interesting areas to work on in the near future. For example, we can engineer temporal features such as response rate, content features like word or sentence embeddings, and features from the ground truth network, such as the follower network, to model such information flow across platforms. We can embed these features in addition to post-level features to predict the amount of hate in a social medium.

REFERENCES

[1] S. Zannettou, B. Bradlyn, E. De Cristofaro, H. Kwak, M. Sirivianos, G. Stringhini, and J. Blackburn, “What is gab: A bastion of free speech or an alt-right echo chamber,” pp. 1007–1014, 2018.
[2] L. Lima, J. C. Reis, P. Melo, F. Murai, L. Araujo, P. Vikatos, and F. Benevenuto, “Inside the right-leaning echo chambers: Characterizing gab, an unmoderated social system,” in 2018 IEEE/ACM ASONAM. IEEE, 2018, pp. 515–522.
[3] K. Garimella, G. De Francisci Morales, A. Gionis, and M. Mathioudakis, “Political discourse on social media: Echo chambers, gatekeepers, and the price of bipartisanship,” in Proc. of the 2018 WWW, 2018, pp. 913–922.
[4] S. González-Bailón, J. Borge-Holthoefer, A. Rivero, and Y. Moreno, “The dynamics of protest recruitment through an online network,” CoRR, vol. abs/1111.5595, 2011. [Online]. Available: http://arxiv.org/abs/1111.5595
[5] S. Kim, J. Bak, and A. H. Oh, “Do you feel what i feel? social aspects of emotions in twitter conversations,” in Sixth International AAAI ICWSM, 2012.
[6] B. Mathew, R. Dutt, P. Goyal, and A. Mukherjee, “Spread of hate speech in online social media,” arXiv preprint arXiv:1812.01693, 2018.
[7] J. Cheng, L. Adamic, P. A. Dow, J. M. Kleinberg, and J. Leskovec, “Can cascades be predicted?” in Proc. of the 23rd International Conf. WWW. ACM, 2014, pp. 925–936.
[8] R. B. Banks, Growth and diffusion phenomena: Mathematical frameworks and applications. Springer Science & Business Media, 2013, vol. 14.
[9] V. Mahajan, E. Muller, and F. M. Bass, “New product diffusion models in marketing: A review and directions for research,” Journal of marketing, vol. 54, no. 1, pp. 1–26, 1990.
[10] M. Rowe and H. Saif, “Mining pro-isis radicalisation signals from social media users,” in ICWSM-16: 10th International AAAI Conf. on Web and Social Media, 2016, pp. 329–338.
[11] R. Ottoni, E. Cunha, G. Magno, P. Bernadina, W. Meira Jr, and V. Almeida, “Analyzing right-wing channels: Hate, violence and discrimination,” in Proc. of the 10th ACM Conf. on Web Science, 2018.
[12] S. Zannettou, T. Caulfield, J. Blackburn, E. De Cristofaro, M. Sirivianos, G. Stringhini, and G. Suarez-Tangil, “On the origins of memes by means of fringe web communities,” arXiv preprint arXiv:1805.12512, 2018.
[13] S. Vosoughi, D. Roy, and S. Aral, “The spread of true and false news online,” Science, vol. 359, no. 6380, pp. 1146–1151, 2018.
[14] Y.-T. Chang, H. Yu, and H.-P. Lu, “Persuasive messages, popularity cohesion, and message diffusion in social media marketing,” Journal of Business Research, vol. 68, no. 4, pp. 777–782, 2015.
[15] J. Kim and M. Hastak, “Social network analysis: Characteristics of online social networks after a disaster,” International Journal of Information Management, vol. 38, no. 1, pp. 86–96, 2018.
[16] Y. Matsubara, Y. Sakurai, B. A. Prakash, L. Li, and C. Faloutsos, “Rise and fall patterns of information diffusion: model and implications,” in Proc. of the 18th ACM SIGKDD, 2012, pp. 6–14.
[17] Q. Zhao, M. A. Erdogdu, H. Y. He, A. Rajaraman, and J. Leskovec, “Seismic: A self-exciting point process model for predicting tweet popularity,” in Proc. of the 21th ACM SIGKDD, 2015, pp. 1513–1522.
[18] C. Zang, P. Cui, and C. Faloutsos, “Beyond sigmoids: The nettide model for social network growth, and its applications,” in Proc. of the 22nd ACM SIGKDD, 2016, pp. 2015–2024.
[19] X. Wang, F. Wei, X. Liu, M. Zhou, and M. Zhang, “Topic sentiment analysis in twitter: A graph-based hashtag sentiment classification approach,” in Proc. of the 20th ACM CIKM ’11. New York, NY, USA: ACM, 2011, pp. 1031–1040.
[20] M. Jeon, S. Jun, and E. Hwang, “Hashtag recommendation based on user tweet and hashtag classification on twitter,” in Proc. of the International Conference on Web-Age Information Management. Springer, 2014, pp. 325–336.
[21] D. M. Romero, B. Meeder, and J. Kleinberg, “Differences in the mechanics of information diffusion across topics: Idioms, political hashtags, and complex contagion on twitter,” in Proc. of the 20th WWW ’11. New York, NY, USA: ACM, 2011, pp. 695–704.
[22] K. Lee, D. Palsetia, R. Narayanan, M. M. A. Patwary, A. Agrawal, and A. Choudhary, “Twitter trending topic classification,” in 2011 IEEE 11th ICDM, Dec 2011, pp. 251–258.
[23] Y. Wang, J. Liu, Y. Huang, and X. Feng, “Using hashtag graph-based topic model to connect semantically-related words without co-occurrence in microblogs,” IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 7, pp. 1919–1933, July 2016.
[24] G. Fair and R. Wesslen, “Shouting into the void: A database of the alternative social media platform gab,” in Proc. of the International AAAI Conference on Web and Social Media, vol. 13, no. 01, 2019, pp. 608–610.
[25] D. Alvarez-Melis and M. Saveski, “Topic modeling in twitter: Aggregating tweets by conversations,” in Tenth AAAI ICWSM, 2016.
[26] S. Goel, A. Anderson, J. Hofman, and D. J. Watts, “The structural virality of online diffusion,” Management Science, vol. 62, no. 1, pp. 180–196, 2015.
[27] J. Finkelstein, S. Zannettou, B. Bradlyn, and J. Blackburn, “A quantitative approach to understanding online antisemitism,” CoRR, vol. abs/1809.01644, 2018.
[28] I. Kalmar, C. Stevens, and N. Worby, “Twitter, gab, and racism: The case of the soros myth,” in Proc. of the SMSociety ’18. New York, NY, USA: ACM, 2018, pp. 330–334.
[29] M. Zhang and Z. Zhou, “A review on multi-label learning algorithms,” IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 8, pp. 1819–1837, Aug 2014.