Uncovering the Wider Structure of Extreme Right Communities
Total Page:16
File Type:pdf, Size:1020Kb
Uncovering the Wider Structure of Extreme Right Communities Spanning Popular Online Networks Derek O’Callaghan Derek Greene Maura Conway University College Dublin University College Dublin Dublin City University Dublin 4, Ireland Dublin 4, Ireland Dublin 9, Ireland [email protected] [email protected] [email protected] Joe Carthy Padraig´ Cunningham University College Dublin University College Dublin Dublin 4, Ireland Dublin 4, Ireland [email protected] [email protected] ABSTRACT murder campaign. While initially the extreme right relied Recent years have seen increased interest in the online pres- upon dedicated websites for the dissemination of their hate ence of extreme right groups. Although originally composed content [7], they have since incorporated the use of social of dedicated websites, the online extreme right milieu now media platforms for community formation around variants of spans multiple networks, including popular social media plat- extreme right ideology [15]. As Twitter’s facility for the dis- forms such as Twitter, Facebook and YouTube. Ideally there- semination of content has been established [2], our previous fore, any contemporary analysis of online extreme right activ- work [27] provided an exploratory analysis of its use by ex- ity requires the consideration of multiple data sources, rather treme right groups. Here we extend this analysis to investi- than being restricted to a single platform. We investigate the gate its potential to act as one possible gateway to commu- potential for Twitter to act as one possible gateway to com- nities located within the online extreme right milieu, which munities within the wider online network of the extreme right, spans multiple platforms such as Twitter, Facebook and You- given its facility for the dissemination of content. A strategy Tube in addition to other websites. Our analysis is focused for representing heterogeneous network data with a single ho- on two case studies, involving the investigation of English mogeneous network for the purpose of community detection and German language online communities. is presented, where these inherently dynamic communities We infer network representations of heterogeneous nodes and are tracked over time. We use this strategy to discover and an- edges to capture the relations between four different extreme alyze persistent English and German language extreme right right online entities: (a) Twitter accounts, (b) YouTube chan- communities. nels, (c) Facebook profiles and (d) all other websites. The in- clusion of such platforms allows us to identify extreme right Author Keywords communities that would otherwise not be apparent when con- Social network analysis; extreme right; heterogeneous online sidering Twitter data alone. As online extreme right networks networks. are inherently dynamic, we are also interested in tracking the evolution of the associated groups over an extended period of ACM Classification Keywords time. Following work by Lehmann et al. [23], we apply the J.4 Social and Behavioral Sciences: Sociology community finding process to a single network representation where all node types are treated as peers. When interpreting arXiv:1302.1726v2 [cs.SI] 16 May 2013 INTRODUCTION the output of the process, the use of heterogeneous data pro- The online presence of the extreme right has come under vides us with a rich insight into the composition of the com- greater scrutiny in recent years, particularly given events such munities, and serves to differentiate between them. as the Oslo killings by Anders Behring Breivik in July 2011, Wade Michael Page’s attack on a Sikh temple in Wisconsin We initially provide a description of related work on the in August 2012 and the uncovering in November 2011 of the online activities of extremist groups, along with discussion German National Socialist Underground (NSU)’s multi-year of relevant network analysis methods. The generation of two data sets using English and German language tweets is then discussed. Next, we describe the methodology used to derive network representations and subsequently track de- tected communities. This is followed by an analysis of se- lected persistent dynamic communities found in both data sets, where we discuss the merits of the network represen- tation and provide characterizations of the membership com- position in terms of the extent to which they span multiple on- 1 line platforms. Finally, the overall conclusions are discussed, current work uses Twitter as a starting point for analyzing on- and some suggestions for future work are made. line extreme right activity, the notion of missing nodes and links is realized with the unavailability of older tweet data, RELATED WORK relevant Twitter accounts that have been suspended, profiles that are not publicly accessible and extreme right entities that Online Extremism do not maintain a presence on Twitter. A number of studies have investigated the online presence of extreme right groups. An early example is that of Our proposed representation of heterogeneous node types Burris et al. [7], where social network analysis of a white with a single homogeneous network is also motivated in part supremacist website network found evidence of decentral- by previous work in cluster analysis and community find- ization, with little division along doctrinal lines. Gersten- ing, where nodes with different semantics have been treated feld et al. [14] analyzed the content of various extremist web- equally during the clustering phase, and subsequently treated sites, and described the potential for forging a stronger sense separately to support the interpretation of the resulting clus- of community between geographically isolated groups. Chau ters. For instance, Dhillon [13] produced a bipartite spectral and Xu [9] studied networks built from users contributing to embedding of words and documents, which allowed both to hate group and racist blogs. Caiani and Wagemann [8] an- be clustered simultaneously. Lehmann et al. [23] proposed alyzed the website networks of German and Italian extreme an extension of the well-known clique percolation method for right groups. community finding to identify communities of nodes in bipar- tite graphs. The use of multiple node types avoids the loss of More recent work has included the study of online social me- important structural information, which can occur when per- dia usage by the extreme right. Bartlett et al. [4] performed forming a one-mode projection of a graph. a survey of European populist party and group supporters on Facebook. Baldauf et al. [3] investigated the use of Facebook For the analysis of communities in temporal networks, a va- by the German extreme right, exploring the strategies em- riety of approaches have been proposed. Greene et al. [18] ployed for promoting associated ideology along with the use described a general model for tracking communities in dy- of certain themes designed to attract new unsuspecting fol- namic networks. Communities are identified on each individ- lowers, such as stated support for freedom of speech and op- ual time step network, which provides a snapshot of the data position to child sex abuse. Goodwin and Ramalingam’s [16] at a particular interval. These communities are then matched analysis of the radical right in Europe discussed the impor- together to form timelines, representing long-lived dynamic tance of online social media, and separately found a shortage communities that exist over multiple successive steps. This of research on non-electoral forms of right-wing extremism. work was extended in [17] to address the problem of clus- A set of reports commissioned by the Swedish Ministry of tering dynamic bipartite networks, where a clustering is pro- Justice and the Institute for Strategic Dialogue [15] analyzed duced on a spectral embedding of both types of nodes simul- the use of social media such as Facebook and YouTube by a taneously at each time step before matching. number of European extreme right groups. DATA These related studies suggest that communities within the on- In our previous analysis [27], we collected a number of Twit- line extreme right milieu span multiple networks, which pro- ter data sets, each associated with a particular country, to fa- vides a motivation for the current work to analyze their struc- cilitate the analysis of contemporary activity by extreme right ture and temporal persistence, while also investigating the po- groups. One of our findings was the influence of linguistic tential for popular social media platforms such as Twitter to proximity on relationships within the detected extreme right act as gateways to this activity. communities, in particular, the use of the English language. Given this, we merged two of our data sets associated with Network Analysis the USA and the United Kingdom respectively. This data set, An appropriate network representation is required in order along with that associated with German extreme right Twitter to analyze the multi-network online activity of the extreme accounts, are the starting points for this current work. Both inference relevance right. The and problems discussed by De data sets were extended to include additional accounts that Choudhary et al. [12] are applicable here, where the former were identified as relevant, using criteria such as profiles con- is attributed to the fact that “real” social ties are not directly taining extreme right symbols or references to known groups; observable and must be inferred from observations of events follower relationships with known relevant accounts; simi- (in our case, tweets from identified extreme right accounts lar Facebook profiles/YouTube channels; extreme right media on Twitter), while the lack of one “true” social network is accounts; a propensity to share links to known extreme right associated with the latter, due to the potential existence of websites. Similar criteria were employed in earlier studies multiple networks each corresponding to a different defini- [3,8], and further details can be found in our previous anal- incompleteness tion of a tie. Separate issues also exist given ysis [27]. In total, 1,267 English language and 430 German the nature of the extreme right.