
Here I Am, Praying to an Egyptian Frog: Exploring Political Fluidity on /pol/

Sal Hendrik Hagen

MA Thesis

Universiteit van Amsterdam
Programme: Media Studies (Research)

Referencing: MLA 8th edition

29 June 2018 Hagen 2

Index

Abstract
Introduction: “No One Can Step in the Same River Twice”
1. Theory: Tarde and Why 4chan Is Not One Person
1.1 Tarde’s imitations and anti-structuralism
1.2 A Tardean view on 4chan
1.2.1 4chan as an incubator for imitations and innovations
1.2.2 4chan and the emergence of publics
2. Method/ology: Tracing and Navigating Imitative Currents
2.1 Approach: Circulating around differently conceived wholes
2.2 Data: 4chan as an archived object
2.2.1 Full sample
2.2.2 Topic sample
2.3 Methods: Digital navigation through text mining
2.3.1 Mapping vocabulary change with tf-idf
2.3.2 Semantic similarities through word embeddings
2.3.3 Ontological associations with word trees
3. Case Study: Digitally Navigating Trump as an Imitated Object
3.1 Text mining per week: Stopping momentarily to test the waters
3.1.1 2015, week 45: “Trump is a meme-wizard”
3.1.2 2016, week 18: “A major zone of sigil magic”
3.1.3 2016, week 42: In the trenches of the Great Meme War
3.1.4 2017, week 21: “Identity first, lulz second”
3.1.5 2017, week 51: “Rationalizations that support emotionally driven conclusions”
3.2 Informed reflection: Lulzy crowds, extremist publics?
Conclusions
Works cited
Appendices
I A primer on 4chan’s infrastructure
II Scoping the amount of users on /pol/
III Database column headers
IV Frequency charts


This text references the online 4chan/pol/ archive archive.4plebs.org. Archived posts are referred to as (Anon #4). Links to the respective posts are included in the “Anonology” underneath the works cited.

Acknowledgements

This thesis is not an isolated unit created in solitude, but rather a coming-together of influences from many people around me. As such, a brief note of gratitude is fitting. My gratitude to Bernhard Rieder for the supervision of this project; I left each meeting with a newfound dose of inspiration. Thanks to my brother Abel for the beautiful cover art representing both my research process and the object of study. Not least, I would like to express my gratitude to the people behind 4plebs.org for dedicating their time to the under-appreciated art of Web archiving and making this research possible. Thanks to my friends and fellow library dwellers who made the daily (and nightly) study sessions bearable, and the breaks too long. Lastly, I am most indebted to my lovely parents, who provided me with all I could have asked for and more: a happy youth, a carefree study career and a well-filled fridge.


Abstract

This research explores change in political sentiments on /pol/ “Politically Incorrect”, a subsection of 4chan. Since its inception in 2003, 4chan gained infamy as a hotspot for offensive humour and trolling campaigns, coalescing into political activism on multiple occasions. While earlier activism emerging from 4chan leaned towards left-libertarianism, the imageboard has been increasingly associated with the far- and radical right after 2014, with /pol/ as the main incubator. This political fluidity, together with 4chan’s anonymity, ephemerality, and subcultural characteristics, has troubled nuanced scrutiny of the space, often leading to generalising equations of the imageboard with “Anonymous”, “trolls” or the “alt-right”. This fluidity means that making claims about 4chan’s political composition requires a perpetual analysis of the sentiments present. This text explores the theoretical and methodological challenges that arise from this task. Theoretically, holistic or metaphysical conceptions fail to grasp the heterogeneity of the imageboard’s users, but 4chan’s anonymity also complicates individualistic accounts. To that end, Gabriel Tarde’s work on imitation offers a suitable framing that shifts the focus from the collective or individual towards repeated “objects of imitation”, without disassociating these objects from subjective relations. Further, Tarde helps characterise 4chan as a particularly “innovative” space, which in turn stimulates the rapid emergence of (political) publics. Methodologically, the Tardean view is also espoused by following his claim that within complex social heterogeneity, dominant imitative patterns can be discerned, legitimising an empirical study into what 4chan “is made of” to demarcate political trends. To study such revelatory “objects of imitation”, this project proposes a set of experimental text mining methods that identify changes in word associations.
To test these theories and methods, the research offers a case study on how the word “trump” provides various “points of view” into political sentiment on /pol/ within five weeks from late 2015 to late 2017. It identifies that the 2016 campaign cycle was particularly marked with semi-humorous slang, Trump became more embedded in (derogatory) vernacular, and ontological associations regarding Trump shifted from him being a “meme candidate” to a target for anti-Semitic hate speech. Together, these results suggest a “crowd-like” collective formation during the 2016 election period, and that /pol/ is not “countercultural” but rather marked by a longer presence of extremist publics. Because associations on the word “trump” changed significantly over time, the case study cements the necessity of “testing the waters” of the political currents on 4chan.

Keywords: 4chan/pol/, political fluidity, Tarde, imitation, Trump, text mining


Introduction: “No One Can Step in the Same River Twice”

I can feel a warmth deep within my blossom. It’s energy. It’s high energy. I am feeding from the iridescent pool of MAGA meme magic. Can you feel it? Once in my life I feel proud of my country. Goddamn /pol/ it feels good. (Anon #1)

9 November 2016 was a special day on /pol/, a subsection of the infamous imageboard 4chan dedicated to “debate and discussion related to politics and current events” (“Rules”). After months of trolling, conspiring and “memeing” during the election cycle, its users realised that the electoral tables were slowly turning in favour of Donald J. Trump. To the delight of many anons, as 4chan’s anonymous users are nicknamed, their continuous pro-Trump, anti-Clinton content stream was seemingly not in vain. These efforts ranged from spreading pro-Trump memes (Beran; Nuzzi) and constructing a conspiracy claiming the Clintons maintained a child network in the basements of pizza parlours (Fisher et al.; Tuters et al.) to creating a pseudo-religious roleplay prophesying Trump’s win (Burton; Lawrence). The former business mogul formed a politically incorrect and, at times, absurd avatar matching the transgressive, comical, and right-wing sentiments common on /pol/. After the eventual results came in, anons both ironically and sincerely claimed they collectively “memed Trump into office” (Ohlheiser; Anon #2). As a self-contained force, the political agency of these pro-Trump anons during the 2016 U.S. Elections was hard to delineate, but likely to be negligible (Faris et al.; Phillips, “Oxygen” 5-6; see appendix II). Nonetheless, a consensus arose in media that the so-called “alt-right” -- often equated with /pol/ -- had successfully won the cultural, metapolitical battle intertwined with the elections (Phillips, “Oxygen” 5). For instance, The Washington Post headlined that “The Only True Winners of this Election Are Trolls” (Dewey). More problematically, 4chan and its malicious users quickly appeared in headlines verging on sensationalism. According to some, the anons, often presented as interchangeable with “trolls”, were “Plotting a GOP Takeover” (Stuart), had “Won the 2017 Election” (Nuzzi) or had already reached the White House (Marantz).
In a much-cited piece, Dale Beran even headlined 4chan as the “Skeleton Key to the Rise of Trump”. Enticing as they might be, these claims are questionable, considering that far-right political narratives could not have catapulted into the broader political agenda without the amplification provided by coverage in the centre-left press, provoked to criticise or capitalise on the misinformation cooked up in nebulous trolling hotspots like /pol/ (Faris et al. 131; Phillips, “Oxygen” 5-6). Even more pressing, generalising statements on far-right publics -- Clinton’s “basket of deplorables”¹ as the most problematic frontrunner -- recognised and unintentionally

1 Clinton described half of the Republican voters as right-wing bigots in a campaign rally speech, saying: “To just be grossly generalistic, you could put half of Trump’s supporters into what I call the basket of deplorables. Right? The racist, sexist, homophobic, xenophobic, Islamophobic -- you name it. […] [Trump] has given voice to their websites that used to only have 11,000 people -- now 11 million. He tweets and retweets their offensive, hateful, mean-spirited rhetoric.” (Blow). Instead of a damning condemnation, deplorables became a shared self-adopted nomenclature for right-wing and far-right actors, Donald Trump Jr. included (Firozi).


strengthened the in-group coherence of those it was meant to disavow (Phillips, This Is Why; Phillips, “Oxygen”; Phillips et al.). A main problem in the coverage of the online far right is the lack of a nuanced understanding of the online spaces these actors inhabit (Phillips, “Oxygen” 18). “Trolling scholars” Whitney Phillips, Jessica Beyer, and Gabriella Coleman, each of whom conducted extensive research on (political) collectives emerging on 4chan, emphasise that such an understanding is paramount. They argue that one should refrain from making claims about a transgressive, dehumanising space like 4chan without first “plotting the landscape” and “safeguarding the actual record”, as not doing so risks the propagation of problematic ambiguities and sweeping statements. For instance, attributing the election of Trump to a vague notion of “4chan” or “trolls” “bestows a kind of atemporal, almost godlike power” to what is actually an “ever-evolving, ever-unstable, ever-reactive anonymous online collective” (Phillips et al.). Lamenting this dynamic, Phillips et al. call for increased scrutiny of politicised, anonymous digital spaces:

Taking the time to map—to accurately map—the repeated, fractured, reconfiguring mobilizations emerging from anonymous and pseudo-anonymous spaces online allows us to understand where we are and how we got here. (Phillips et al.)

As such, Phillips et al. argue that coming to a better cultural and political understanding of an anonymous space like 4chan helps to “stand up to those who attempt to hijack the narrative” (Phillips et al.).

Figure 1: An example of a thread on 4chan/pol/. The post with the picture is the opening post (OP) of a thread, while the brown boxes underneath reply to the OP.

A lack of mainstream understanding of a space like 4chan is perhaps unsurprising considering 4chan’s obscuring affordances² and related (sub)culture. At its core, the setup of the imageboard is straightforward: users can post in one of 4chan’s seventy boards, digital bulletin boards dedicated to

2 The term “affordance” knows many different (mis)uses (Bucher and Helmond), but it is understood here as “functional and relational aspects, which frame, while not determining, the possibilities for agentic action in relation to an object” (Hutchby 444).


specific topics, like /pol/, for political discussion (also referred to as Politically Incorrect). Anyone can create a new thread by making an opening post (OP) or leave a reply in one of the 200 active threads (see fig. 1). These threads consist of simple boxes with text and an (optional) image. No account is needed to participate, meaning almost³ all posts are labelled with the default name: Anonymous. Despite this simplicity, 4chan’s affordances and interrelated cultural norms render the imageboard more arcane than its infrastructure would suggest. Firstly, 4chan’s anonymity prevents tracing a user’s posting activity and impedes traditional classifications by age, gender, identity and so on. Further, anonymity and lax moderation allow (but do not inherently lead to) an environment where “unthinkable” and dark thoughts find their place (Nagle, Kill All Normies 14). Combined with the homogenisation of all users by the shared moniker of Anonymous, these aspects might heighten the outsider’s perception of 4chan as a single, mysterious, and potentially dangerous entity, even though the imageboard consists of a cacophony of conflicting voices (Coleman, Hacker 114-5). Secondly, 4chan’s content is ephemeral: threads can be deleted after only a few minutes⁴, while every thread is permanently deleted from the servers after a few days⁵. This complicates information retrieval for historical research purposes, but, more importantly, it stimulates proactive repetition and remixing of images and texts as a “locus of memory” working against the quickly moving “volume of posts and responses” (Coleman, “Net Wars”). As a corollary, a considerable time investment is required to stay up-to-date with the many intertextual and intracultural references (Beyer 48).
Emboldened by anonymity and ephemerality, anons intentionally and unintentionally use transgressive “gatekeeping” practices to protect 4chan’s reputation as an underground hub, for instance through posting extremist or gory content, covering statements with (often imperceptible) irony, and adopting discourse filled with vernacular terms. As a result, many anons are no strangers to “discourse around ‘normies’⁶ and ‘basic bitches’ who ‘don’t get’ the countercultural styles of the amoral subculture” (Nagle, Kill All Normies 107), and “[embrace] the mantle of being the ‘tastemakers’ of memetic subculture” (Milner 105). However, the ambiguities in decoding 4chan’s posts apply to anons as well, as jokesters might be unaware of the political sincerity of other anons. For example, an oft-seen claim on /pol/ is that its users are merely “roleplaying as Nazis”, even though the board formed an inspiration and recruitment zone for overt and violent neo-Nazis (O’Brien; Phillips, “Oxygen” 23). Combined with conspiratorial distrust, wild claims are made about the covert political agendas of other anons - all impossible to verify (see fig. 2).

3 There is an option to insert a name when posting. However, these names have to be inserted each time and can also be co-opted by other users, meaning there is no consistency between actual users and the names. Further, using a different name than the default “Anonymous” is heavily frowned upon.
4 Bernstein et al. reported in 2011 that the lifetime of a thread on /b/ ‘Random’ had a median of 3.9 minutes and a mean of 9.1 minutes, while the shortest thread expired in 28 seconds and the longest-living thread lasted 6.2 hours. I could not find such metrics on /pol/, but a self-scraped sample of all posts from 15 January 2018 01:30:00 to 16 January 2018 01:30:00 found that an average /pol/-thread lasted 1 hour and 17 minutes, with the longest thread lasting over 23 hours and 9 minutes, and the shortest being archived after merely 13 minutes.
5 The deletion of threads differs per board. /b/ ‘Random’ deletes threads as soon as they fall below the thread-limit, while /pol/ has an archive that stores the threads for a few days (although replying is disabled).
6 Normie is slang for a conventional person who does not understand the unwritten subcultural rules of spaces like 4chan/pol/ (“Normie”).

Figure 2: A post on /pol/ indicating 4chan's intra-ambiguities. Quick glossary: MSM is ‘mainstream media’, an oldfag is a long-term 4chan frequenter, a cuck is a subordinate man, BBC refers to male genitalia, and BTFO is “blown the fuck out” (roughly meaning to desperately lose at the hands of someone else). Derived and screencaptured from archive.4plebs.org.

Just as 4chan’s affordances can obscure the production of cute cat pictures, they can mystify patterns and shifts in political activity (Zuckerman). Both are not without precedent on the imageboard,⁷ but the latter is the focus here. Historically, groups of anons managed to rapidly generate political mascots, ideas, and an activist drive (Coleman, Hacker). This is best exemplified by two movements that emerged from (or are heavily associated with) 4chan: Anonymous and the alt-right. Anonymous, directly named after 4chan’s default user name, emerged from the /b/ “Random” board around 2007. Initially, its loose set of participants was mainly interested in lulz, the joy of eliciting a (preferably emotional) reaction from an unwitting or unwilling audience (Coleman, Hacker 48-9; Phillips, This Is Why 27-8). However, with “the lulz as a behavioural rudder” (Phillips, This Is Why 57), Anonymous gradually drove towards politically motivated activism, mainly focused on liberal or libertarian issues of free speech and anti-corporatism. For instance, its “members” performed DDoS attacks on MasterCard, VISA and PayPal for boycotting Julian Assange’s WikiLeaks, and later partook in the Occupy movement (Coleman, Hacker). By then, Anonymous had “left 4chan in pursuit of activist goals” (Coleman, Hacker 47), but from 2014 onwards, the imageboard gained infamy for a darker side of grassroots political mobilisation, as anons targeted left-wing liberalism, feminism, globalism and multiculturalism. The main hotspot for these sentiments was not /b/, but /pol/ (Phillips, “Oxygen” 23), created in 2011 as a “containment board” for right-wing extremists (Hine et al. 1), suggesting it was filled with politically distasteful content from the outset. Even though one of 4chan’s global rules was to “keep /pol/ in /pol/” (“/pol”), the board drew broad attention during Gamergate, a controversy

7 In the case of cute cat pictures, anons on /b/ would organise “Caturday”, denoting that Saturday was the day to post image macros of cats, also known as lolcats (“Caturday”).


surrounding feminist perspectives on videogames, framed by Angela Nagle as a “galvanizing issue that drew up the battle lines of the culture wars for a younger online generation” (Kill All Normies 24). Gamergate escalated into repugnant online harassment of female journalists, with /pol/ as one of the “troll” hangouts, cementing the board as a “safe space for self-selecting misogynists and racists whose bigotries were an identity first, source of lulz second” (Phillips, “Oxygen” 23). Despite a ban on Gamergate discussion by 4chan’s founder, Christopher “moot” Poole, the event marked an “ideological crystallization” of previously disparate far-right underbellies (Phillips, “Oxygen” 24). Additionally, it prepared /pol/’s grounds just in time for Donald Trump, whose campaign and antics matched /pol/’s misogyny, racism, anti-political correctness, and absurdities. These online far-right publics were referred to as the “alt-right”,⁸ a “broad, and for its adherents, a usefully vague” (Nagle, “Goodbye, Pepe”) moniker for “an amalgam of conspiracy theorists, techno-libertarians, white nationalists, Men’s Rights advocates, trolls, anti-feminists, anti-immigration activists, and bored young people” (Marwick and Lewis 3). It is troublesome to consider both “Anonymous” and “alt-right” as coherent entities, since they merely form generalising monikers for “ever-evolving, ever-unstable, ever-reactive” collectives, but that does not negate that diametrically opposed political collectives emerged from the same space (Phillips et al.). This means that with 4chan, “nobody can step in the same river twice” (Phillips et al.). If I follow Heraclitus’s analogy, exploring how to test the waters of the 4chan river becomes an urgent challenge, both because it could aid in identifying and making sense of the peculiar political shifts that have already transpired, and because it could help to “prepare” for possible influential collectives emerging in the future.
While excellent ethnographical and anthropological studies on 4chan have been conducted (Coleman, Hacker; Beyer; Phillips, This Is Why; Nagle, Kill All Normies), little research has investigated the space from a data-driven approach. This is somewhat surprising, since the imageboard allows virtually unlimited data fetching from all boards, which circumvents the platform’s ephemerality and opens ways to maintain and scrutinise a “data record”. Literal records have been established in the form of archives of various boards.⁹ While I do not argue a data-driven approach is inherently superior to ethnographic work, it can provide empirical handles that answer the call by Phillips et al. to “plot the landscape” and “safeguard the actual record” of 4chan. Such efforts are not wholly absent: Bernstein et al. collected two weeks of data from the /b/ “Random” board, analysing the ephemerality of the various threads, and engaged in a categorisation of post content. Hine et al. analysed two-and-a-half months of /pol/ data from the summer of 2016 and traced the spread of hate speech, amongst others. Zannettou et al. classified thirteen million images from /pol/ and tracked their diffusion to other Internet platforms. However, these studies emphasise the resulting metrics in lieu of the conceptual and methodological

8 The “alt-right” was vaguely inclusive but became more loaded after the violent protests in Charlottesville, Virginia, increasingly becoming a moniker for a hard-core strand of white nationalists - as it was first associated with when it was popularised by the white nationalist Richard Spencer. See e.g. Nagle (“Goodbye, Pepe”).
9 The most coherent and accessible archive of /pol/ is archive.4plebs.org. This archive forms the main source of data for this research and will be highlighted in chapter 2.


complexities of “extracting meaning” from the imageboard. This process of methodological reflection is crucial because a blind trust in “mechanical objectivity” can create more epistemological problems than it solves (Rieder and Röhle 72-3), potentially forming “deterministic models [that] fail to recognize the contingency, creativity, and unpredictability of movement dynamics” (Uitermark 414). Further, most of the above studies focus on momentary snapshots instead of temporal (political) change, which is a necessary lens to understand the fluid dynamics in a space like /pol/. To that end, this thesis forms an exploration into how the employment of quantitative techniques can aid in making sense of the political variability of /pol/. It presents an empirical inquiry into a data archive of /pol/, but the conceptual and methodological challenges in this research process are given prominent stage time. A conceptual challenge is present since 4chan’s anonymity eludes characterisation of emerging collectives as traditionally “individualistic” or “collectivist”. As Coleman notes, ontological characterisation of a phenomenon like Anonymous is complex because “Western philosophy, and in turn, much of Western culture more generally, has posited the self -- the individual -- as the site of epistemic inquiry” (50). Such individualistic characterisations fail to properly conceptualise the decentralised, leaderless (at least initially), anonymous collectives emerging from 4chan (Coleman, Hacker 50). 
This means the dynamics of such movements cannot be reduced to specific individual participants, but simultaneously, neither should such unstable groups be elevated to the status of “hyperobject”, both because this mischaracterises movements that, despite their complexity, are still made up of the actions of human subjects (Coleman, Hacker 114), and because such mystifications can lead to troublesome misnomers like “trolls” and the “alt-right”, as highlighted above (Phillips et al.). As a theoretical handle for these complexities, I draw from the work of the 19th-century sociologist Gabriel Tarde. His social theory is somewhat outdated (Katz), but is useful here because it offers a position in between the “micro” of individuals and the “macro” of collectives by focusing on relational “imitations” between the entities of what he generalised as a “society”. More importantly, this position can be extended to the methodological challenges. A data-driven study of /pol/’s concrete objects does not offer an automated, magical answer to the above conceptual issues, necessitating methodological pondering on how to interpret 4chan’s concrete objects. Tarde’s theory on imitation makes it possible to consider these digital results not as static “objects” or infectious “memes”, but rather as part of subjective “beliefs and desires” (Schmidt 111). As a result, inferring “certain regularities or laws” (Marsden 1177) in this social data can identify what ideas or issues are important to its propagators, and possibly delineate what Tarde saw as a “public”.
Considering political sentiments on /pol/ are in a constant flux, I use Tarde’s thought to propose that instead of thinking of 4chan and its collectives in a structural, metaphysical, top-down sense (“what is 4chan?” or “what is Anonymous?”), it is fruitful to empirically identify the ontic (“what is 4chan made of?”) and from there deploy “as many subjective perspectives as possible” (Venturini, “What Is”) to infer regularities or irregularities in the data, which in turn can potentially identify broader continuities and changes in the imageboard’s political current. However, this also requires practical discussion since 4chan lacks the usual suspects in “natively digital” (Rogers, Digital Methods) objects one can repurpose to illustrate social, cultural or political patterns: neither stable objects of personal characterisation (e.g. names, profiles, friends), nor explicit objects of affect (likes, shares, up- and downvotes) are present in 4chan’s infrastructure. However, the simple images and texts on 4chan embed complex “webs of significance” (Geertz). This research attempts to “untangle” the webs of the latter through employing text mining techniques on a particular word, trump, by drawing from Latour et al.’s reconsideration of Tarde to “digitally navigate” the properties and attributes of such an object to test whether it can identify political shifts on /pol/. Although this text has a conventional structure, the chapters form more of an assemblage than a build-up to the case study in the third chapter, so as to reserve enough room for the how. The first chapter poses the theoretical framing, discussing how Tarde’s anti-structuralist consideration that “the whole is never more than its parts” allows one to de-generalise and untangle the complex social trends emerging from 4chan. Further, his focus on imitations allows one to think of the imageboard as a space that affords particularly “innovative” imitations, creating a lively “shared knowledge reservoir” (Rieder), which in turn can stimulate the rapid emergence of (issue) publics. In the second chapter, the Tardean approach is extended towards a methodology by exploring how this theory allows one to trace an “imitated object” (like the word trump) to construct “different parts of a whole” and from these points of view speculate on grander claims. It discusses a set of text mining methods that map vocabulary change and identify topics animating an issue public (mapping cosine distances and extracting tf-idf terms), the semantic similarities of a word (word2vec with t-SNE), and ontological associations (word trees of “Trump is a…”).
The third and last chapter presents a case study on the temporal change in the relational properties of the word trump. It uses a dataset of nearly all posts on /pol/ since November 2013 and “navigates” five particular weeks in five discursively differing clusters. The findings identify various shifts in word associations. Firstly, the faux-playful, ironic political activity, for instance by associating Trump with a “god-emperor”, rises during the election cycle but quickly evaporates afterwards due to a suspected falling-apart of “crowd-like” formations. Secondly, both Trump’s adversaries as well as the former businessman himself become more embedded in derogatory vernacular, illustrating the fractured, antagonistic content on the board. Thirdly, Trump is first associated with being a “meme candidate”, but changes to become a target for anti-Semitic hate speech. These findings are considered both proof and an experimental exercise in arguing one has to keep “safeguarding the record” (Phillips et al.) of a space like /pol/ before imposing structuralisms. This introduction has illustrated the most relevant aspects of 4chan’s infrastructure, but a more comprehensive understanding can be helpful. To that end, appendix I offers a detailed overview of 4chan’s features.
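As a rough illustration of two of the text mining methods named above (extracting tf-idf terms per week and building word trees around “Trump is a…”), the following Python sketch may be helpful. It is not the implementation used in this research, which is discussed in chapter 2. It assumes a common textbook tf-idf variant (term count divided by the number of tokens in a week, multiplied by the logarithm of the number of weeks over the number of weeks containing the term), a deliberately naive tokeniser, and a flattened word tree that only counts the first word after the prefix. The function names `tokenize`, `tfidf_per_week` and `continuations` are hypothetical.

```python
import math
import re
from collections import Counter

# Naive tokeniser: lowercase runs of letters, digits and apostrophes (an assumption).
TOKEN = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase a post and split it into simple word tokens."""
    return TOKEN.findall(text.lower())


def tfidf_per_week(posts_by_week):
    """Treat all posts of one week as a single document and score its terms.

    Assumed variant: tf = count / tokens in the week, idf = log(N / df),
    where N is the number of weeks and df the number of weeks using the term.
    """
    counts = {
        week: Counter(tok for post in posts for tok in tokenize(post))
        for week, posts in posts_by_week.items()
    }
    n_weeks = len(counts)
    doc_freq = Counter(term for week_counts in counts.values() for term in week_counts)
    scores = {}
    for week, week_counts in counts.items():
        total = sum(week_counts.values())
        scores[week] = {
            term: (n / total) * math.log(n_weeks / doc_freq[term])
            for term, n in week_counts.items()
        }
    return scores


def continuations(posts, prefix="trump is a"):
    """Count the word directly following the prefix (also matching "an")."""
    pattern = re.compile(re.escape(prefix) + r"n?\s+([a-z0-9'-]+)")
    found = Counter()
    for post in posts:
        found.update(pattern.findall(post.lower()))
    return found
```

Under these assumptions, a term used in every week (such as trump itself) receives a score of zero, so the highest-scoring terms are those that distinguish one week’s vocabulary from the others. Passing one week’s posts to `continuations` gives the first branch of a word tree, which can then be compared across weeks.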


Note on ethics and amplification

This study deals with an online space containing racist, sexist, extremist and occasionally violent discourse. Its main concern is textual data on /pol/, meaning the most offensive or gruesome images could be avoided. Still, some posts visualised or otherwise reused in this research contain extremist language. I am well aware that the inclusion of this discourse, “even if it’s done in the service of critical assessment” also “continues their circulation, and therefore may continue to normalize their antagonisms and marginalizations” (Milner 64). Still, I share Whitney Phillips’s statement in her research on Internet trolls that “a certain amount of offensive content is necessary to the coherence and in fact the accuracy of this study” (3). This is especially the case here, as antagonisms and extremisms are unfortunate cornerstones of /pol/ and hateful words form a crucial part of the case study in the third chapter. Ultimately, a case-by-case assessment was made whether the text significantly benefited from the inclusion of hateful content.


1. Theory: Tarde and Why 4chan Is Not One Person

Despite all its twists and turns, collective existence does have a sense (even if not straightforward, unique or simple). (Venturini, “Diving in Magma” 263)

One of the main conceptual challenges for this research is to infer collective meaning from /pol/ without assuming or imposing problematic structuralisms. This chapter discusses how the French sociologist Gabriel Tarde’s idiosyncratic yet inspiring theory of imitation is useful vis-à-vis this problem, and how it offers conceptual alternatives. Adopting his anti-structuralism means neither 4chan nor the collectives emerging from the space should be elevated into the stratum of metaphysics. Simultaneously, as Tarde “refuses to take the individual human agent as the real stuff out of which society is made” (Latour, “Gabriel Tarde” 4), the individual does not form the central point of Tardean analysis either, sidestepping the individualistic epistemologies so problematic in conceptualising collectives like Anonymous (Coleman, Hacker 50). Instead, the Tardean view focuses on his somewhat problematically inclusive concept of “imitation”: the relational influence between various entities. Tarde predicted that analysing the “objects of imitation”, indicative of the “beliefs and desires” of human subjects, could explain how “social assimilation and political consensus [can] be brought about” (Leys 279-80). This bottom-up approach is fruitful here because it eludes generalising structuralisms through emphasising multitudinous influential relations. Considering digital communications networks have increased these interpersonal influences, Tarde is “far more relevant to the conditions that have created the Occupy movement” than to the sovereign or disciplinary powers of his time (Niezen 55-6). His thought is further useful because it can conceptualise how the imageboard is particularly effective in stimulating “imitation” and “innovation”, which in turn explains the rapid emergence of (albeit very loose) publics like Anonymous.
These insights can help empirical research into 4chan move beyond mere metrics and think about what political ideologies or ideas its concrete objects demarcate. Because Tarde’s thought brings quite some conceptual baggage, I first briefly discuss his most relevant theories for this research (section 1.1) before relating them to 4chan (section 1.2).

1.1 Tarde’s imitations and anti-structuralism

In fin de siècle France, the sociologist Gabriel Tarde put forth his theory that imitation and invention are the elementary forces of social life. He posed that social research, instead of making grand, structural claims about societies, should scrutinise these minute imitations between humans to “infer certain regularities or laws that appeared to pattern the social world” (Marsden 1177). Tarde’s thought had a profound impact on questions of interaction, crowds and public opinion (Kinnunen; Katz 264). Yet his work was overshadowed by the structural macro-sociology of his contemporary, Émile Durkheim, leaving Tarde with the dubious label of “precursor” to subsequent thinkers (Latour, “Gabriel Tarde” 82). However, Tarde’s work elicited a reappreciation in the early 21st century, most

notably through revitalisations by Bruno Latour (“Gabriel Tarde”; “Tarde’s Idea”), Andrew Barry and Nigel Thrift. The main reason for this is that Tarde’s century-old theories contain many foresights into contemporary issues on digital information “diffusion” (Barry and Thrift 510). Crucially, his wish to trace innumerable social imitations to distinguish larger social patterns has, to an extent, become attainable with the advent of Internet technologies and the social networks built on top of these wires and protocols (Latour, “Tarde’s Idea” 157-8).

Imitation requires explanation as it is perhaps not the most elegant term (Katz 264). Tarde uses imitation as a general concept denoting relational influence between multiple actors. He sees imitation as the very core of societies, and human subjects as only social insofar as they are “essentially imitative” (Tarde, Ld’I 11).10 This imitative principle does not just apply to the human subject, however: it forms the “basic mechanical fact” of “the communication or any kind of modification of a movement determined by the action of one molecule or mass on another” (Tarde, “Les Deux” 64).11 As such, Tarde poses that the role of imitation between humans in a society is “analogous to that of heredity in organic life or that of vibration among inorganic bodies” (Tarde, Ld’I 11). Consequently, just like repetitious influences between atoms create “atomic societies” (Tarde, M&S 28), Tarde sees human societies not as ontologically distinct from these material constellations, but rather as made up of more complex and innumerable imitative interactions, in their repetitions only imagined as a distinct structure.
The single entity within these societies, be it a rock, a galactic constellation or a person, is not an isolated unit, but rather an always further reducible “point of intersection or interference between diverse lines of imitation” (Barry and Thrift 513).12 As atomic societies are rarely seen as having a metaphysical essence, Tarde argues the idea of a social society is essentially a construct, a label for a whole that is merely the sum of its parts. Such a stance could invite paralysing relativism, but for Tarde, small imitations offer crucial insights into how humans and societies “behave”. He provides the following example:

When a young farmer, facing the sun set, does not know if he should believe his school master asserting that the fall of the day is due to the movement of the earth and not of the sun, or if he should accept as witness his senses that tell him the opposite, in this case, there is one imitative ray, which, through his school master, ties him to Galileo. No matter what, it is enough for his hesitation, his internal strive, to find its origin in the social. (Tarde, LS 87-8)

Instead of thinking of such influences as a “preestablished harmony” or structural “universal laws” (Tarde, LS 56), Tarde argues that the thing which makes the farmer hesitate, and ultimately believe that the earth spins, is an “imitative ray” produced and propagated by other actors – his school teacher,

10 I abbreviate Tarde’s works the following way: Ld’I: Les Lois de l’Imitation, LS: Les Lois Sociales, M&S: Monadologie et Sociologie, PE: Psychologie Économique.
11 Translated to English by Barry and Thrift (514).
12 As indicated by “diverse”, Tarde shies away from generalising the imitative production across various spheres, as he recognises that differing “imitative bodies” generate differing “societies”, taking form in wildly different subjects (Barry and Thrift 512-3).

Galileo, and so on. In other words, that which produces a human “society” is the accumulation of diverse imitations between various subjects. As such, no one can act socially “without the collaboration of a great many other individuals, most of the time ignored” (Tarde, LS 66). It is this relationality between subjects, not the individual or societal structures, which takes centre stage in Tarde’s thought (Barry and Thrift 514). Unsurprisingly, Tarde had little sympathy for considerations of society as a metaphysical entity or organic system (Ld’I 68).13

The word “imitation” is somewhat problematic since it implies “voluntaristic” behaviour (Katz 264) in the sense that humans are consciously mimicking others. However, Tarde’s imitations are broader in scope and can be “willed or not willed, passive or active” (Ld’I xiv), addressing humans as “social ‘somnambules’ who believe whatever they are told, but also as ‘natural’ beings who, to some degree at least, are able to resist those ‘contagious thoughts’ by means of a critical assessment of their truth or falsity” (Schmidt 115). To clear up Tarde’s terminological inclusivity, Katz (266) provides a handy typology (table 1) that better stresses the differences between voluntary and involuntary imitation.

                                     Is A aware that he or she has been influential?
                                     yes                      no
Is B aware that he or she      yes   Persuasion / command     Imitation
has been influenced?           no    Manipulation             Contagion

Table 1: Forms of influence where A has influenced B, derived from Katz (266)

Especially “contagion” is relevant in relation to anonymous online spaces, since it is not the “influencer” but the content that takes centre stage, obscuring a direct awareness of its influential propagators. Yet “contagion” might sound overly biological, potentially downplaying the role of human agency. However, Tarde was “safe from the fatal memetic tendency to model cultural evolution too closely on genetic evolution” and “displac[ing] the ‘self’ by the meme” (Schmidt 111), because, as is evident in the quote on the young farmer, imitation is often met with “hesitation” (Tarde, PE). As such, Tarde eschews the model of “memetic” information as “viral” or “selfish” objects (Schmidt 103), as theorised by e.g. Richard Dawkins and Susan Blackmore. Instead, Tarde’s imitations are bound to the human subject in both conscious and unconscious manners, but are always indicative of subjectivity:

What is imitated is always an idea or a wish, a judgement or a plan, in which a certain amount of belief and desire are expressed, which is the entire soul of the words of a language, the prayers of a religion, the administrations of a government, the paragraphs of a code of law, the duties of a moral system, the work of an industry, the products of an art. (Tarde, Ld’I 157)

13 Such systemic or organic views were at the time proposed by e.g. Herbert Spencer and Émile Durkheim.

Since the “ultimate ‘objects of imitation’ are our beliefs and desires” (Schmidt 111), these objects of imitation are not just “out there”, but also influence and indicate very human urges. These “beliefs and desires” can be discerned through “objects of imitation”, rendering visible internal wishes that are ontologically subjective, only “exist[ing] insofar as they are experienced by human or animal subject” (e.g. pain; Searle 15). Importantly, however, just as Tarde does not place the “meme” central, neither does he foreground the rational individual; if “beliefs and desires are ontologically subjective in the sense that there has to be someone who ‘has’” them, Tarde did not rigidly hold that these beliefs and desires were exclusively and intentionally “hers or his”, but rather that they are shared -- indeed, imitated -- between various actors (Schmidt 112). Therewith, Tarde did not focus on a separation between “the intentional subject and the intentional content” but rather emphasised “the relation between different intentional subjects” (112). The Tardean view thus means an analytical shift away from denoting what objects of imitation “are about”, by whom they are made, or the individual’s intention in their propagation, and towards thinking about what social relations and patterns in “beliefs and desires” these objects demarcate (112) - even when they only indicate “ambivalence” (Phillips and Milner) or superficial personal commitments (Niezen 55).

Especially considering the generativity of 4chan, one could wonder: how does change, originality or creativity emerge when all of “society is imitation” (Tarde, Ld’I 74)? Again, Tarde’s “imitation” is such an inclusive concept that it also denotes “innovative” dynamics. For Tarde, innovations are not purely unique instances of new ideas, but rather the coming-together of different imitations that together create an original combination.
As such, imitation and innovation are bound together: “even the most imitative of all men is innovative in some respect” (qtd. in Schmidt 110-1). However, innovations denote a more generative, creative combination of influences than mere repetitive imitation (and this distinction is used as such in this text). Because minute innovations can be further imitated and innovated upon, and because (as stressed above) actors do not have to voluntarily adopt a norm, this innovative agency holds the potential for a domino effect leading to both progress and rebellion (Kinnunen 434). The fact that Tarde addressed difference and variance through innovations means the association between him and “diffusion” might not be wholly apt, since the latter term suggests an adoptive uniformity in information dissemination (Rieder, “Refraction Chamber”). The concept of refraction might be more fitting, proposed by Rieder (“Refraction Chamber”) to denote “the space between identical reproduction and total heterogeneity”. This echoes Tarde, as he takes this “total heterogeneity” through innovations as a basis, but notes that cumulated imitations create homogeneity:

Heterogeneity, not homogeneity, is at the heart of things. […] Homogeneity is a likeness of parts and all likeness is the outcome of an assimilation which has been produced by the voluntary or non-voluntary repetition of what was in the beginning an individual innovation (Tarde, Ld’I 71-2).

Therewith, it is only when agglomerate entities (like humans) “imitate” and “innovate” enough that a “uniformity of […] repetitions makes an apparent homogeneity, in spite of the internal complexities of each group” (Tarde, Ld’I 18). In other words, imitations are chaotic and heterogeneous, causing not mere “diffusion” but also “refraction”, with a high enough density of communication causing general imitative patterns to emerge.

1.2 A Tardean view on 4chan

Why is Tarde useful for research into the fluidity of political sentiments on 4chan? Firstly, his anti-structuralism suggests an epistemological outlook that refuses to consider the total sum of activity on the imageboard as more than its parts. The subcultural obscurity and sheer breadth of 4chan’s content make it a difficult object of study, but as with any socially complex issue, it is not “inexorably chaotic and therefore impossible to interpret” (Venturini, “Diving in Magma” 263). Though it might house complex meaning, the imageboard’s infrastructure forms a stable network; not stable in the sense of stable constellations of, and interactions between, its users, but because (as noted in the introduction) its infrastructure and mechanics are straightforward and have remained roughly identical over time.14 4chan’s ephemerality ensures the website is “trimmed” to a limited number of nodes, making the platform never wholly unimaginable in terms of size. Of course, the posts and the innumerable webs of meaning within the nodes are complex, but even these are not “wholly random or simply chaotic” (Coleman, Hacker 17) because they form objects of, to some extent, traceable subjective production - not mythical symbols. It is the web of “imitative rays” influencing why and what an anon posts that forms the main complexity, since the intentionality of a post is often obscured by a lack of social cues.15 This does not ask for psychoanalytical analysis, however; from a Tardean approach, the resulting productions (or: “objects of imitation”) in images or texts are not a sole window into the soul of the author, nor distinct objects (i.e. “memes”), but rather demarcate “the relation between different intentional subjects” (Schmidt 112). This allows at least some generalisation into “patterns”, as the beliefs and desires expressed through these objects of imitation are shared between subjects.
Since this can inform what shared political “sentiments”16 are present on /pol/, this outlook will be espoused in the empirical sections of this study.

14 On-site affordances do change sometimes. For instance, Christopher “moot” Poole often went back-and-forth with the settings on IDs, which denote a unique code per user, at times forcing them board-wide, at times removing them completely (see e.g. “Complete History of 4chan”).
15 As the Internet adage “Poe’s Law” states, in the digital sphere it is impossible to irrefutably know the intention behind a statement without explicit social cues like smileys (see e.g. Aigin).
16 I use “sentiment” in relation to political views on 4chan because it can both describe overt political ideologies as well as more ambivalent or emotional beliefs and actions.

1.2.1 4chan as an incubator for imitations and innovations

An unfortunate lack in Tarde’s theory is that he did not stress the co-constitutive nature of media, treating communication technologies like the newspaper “less as a mediator than as an intermediary, transmitting and distributing ideas that then can work as mediators” (Wiedemann 315).17 However, Tarde’s theories can fit alongside accepting the agency of material technology (Latour, “Gabriel Tarde”; Latour et al.; Parikka; Sampson; Thacker). Moreover, Tarde’s thought is not devoid of acknowledging non-human agency, since he argues human societies and e.g. ant societies share similar imitative dynamics (although the former is more complex than the latter), meaning a “nature and society divide is irrelevant for understanding the world of human interactions” (Latour, “Gabriel Tarde” 82). Partly for this reason, Bruno Latour labelled Tarde’s theory of imitation not as the precursor, but the successor to the more recent Actor-Network Theory (“Gabriel Tarde” 3). The similarities are indeed striking, because Tarde proposes that the influential relations between actants “create” social production, behaviour and meaning. As Latour himself attests: “agency plus influence and imitation, is exactly what has been called, albeit with different words, an actor-network” (4). For this research, the material reframing is necessary since 4chan’s mechanics actively stimulate imitation and innovation. As stressed above, its anonymity provides a platform for unconventional thoughts that could be considered “innovative”; not in a progressive sense, but because they can transgress the boundaries of regular norms without consequence. As such, repercussion-less anonymity means one can draw from a larger pool of usually (and often rightfully) unacceptable interpersonal imitations.
At the same time, the cultural norm of anonymity stimulates repetitive imitation because “being a distinct flower in the field of anonymous daisies only leads to punishment”, and to avoid this, “users have to adhere to a very strict code of behavior and ritualized language, which means they are always monitoring their own discourse patterns” (Beyer 47). Similarly, 4chan’s ephemerality stimulates particularly repetitive forms of imitation. For instance, it enforces that discussion on a specific topic is often “renewed” by copy-pasting the same text in a new opening post (OP) after an old one gets purged (see e.g. Tuters et al.). This injection of both “copy-pasted” and fresh content is the basic mode of operation on 4chan, since the only way to participate actively is by posting text or images; OPs even require both text and an image. “Slacktivist” upvotes, likes, shares, retweets, etc. are absent. Combined with the sheer volume of posts (and other aspects I cannot discuss in detail here18), 4chan is an environment that is particularly effective in creating “a reservoir of shared ideas, debates, stereotypes, facts, trivia, and so on” (Rieder, “Refraction Chamber”).

17 For Tarde, imitation was a precondition to information media rather than the other way around: “if people did not talk, it would be futile to publish newspapers. […] They would exercise no durable or profound influence; they would be like a vibrating string without a sounding board” (qtd. in Clark 307).
18 For instance, the “gatekeeping” dynamics mentioned in the introduction also stimulate the creation of a semi-secret code language. See Beyer for more on the dynamics between 4chan’s affordances and “innovative” political activity.

This interplay between imitation and innovation is akin to Deborah Tannen’s “fixity” and “novelty” in language: the dynamics between established knowledge and original remixture that creates “new” shared meaning (49). This “balance between the new and the expected” forms the basis of the creation of shared imitated objects (“memes”) in vernacular spaces like /pol/ (Milner 88-9). This gives some credence to the close relation between “objects of imitation” and ontological subjectivity, since these objects can only become “objectified” when their propagators possess mentally shared knowledge. As a corollary, what are usually referred to as “memes” on /pol/ (I will try to refrain from the term) are not distinct units but propagators and indicators for relational knowledge -- and possibly “beliefs and desires” -- between subjects. That these objects might “spread outwards” (Tarde, Ld’I 140) is of particular relevance to 4chan since it has historically formed a fringe hotbed for vernacular and imagery that later saw mainstream adoption.19 Just as it is often unclear how “a local dialect […] spoken by only certain families, gradually becomes, through imitation, a national idiom” (Tarde, Ld’I 17), many are unaware of the “cultural well” of 4chan, which led Phillips to describe the imageboard as “the most influential cultural force most people didn’t realize they were actually quite familiar with” (“Oxygen” 17). These “tiny obscure subcultural beginnings” have reached “mainstream public and political life”, not only as cat pictures, but in recent years increasingly as politicised messages (Nagle, Kill All Normies 9). One of the most famous examples is Donald Trump tweeting an image of Hillary Clinton in front of a wall of dollar bills, alongside a Star of David labelled with “Most Corrupt Candidate Ever!” - an anti-Semitic image that allegedly sprouted from , another fringe imageboard similar to 4chan (Smith).
However, the “success” of vernacular creativity should not only be measured by whether it reached outside attention; its products also form “intracultural” objects that can evoke complex meaning with simple signs within the community. Exemplary for this is fig. 3, showing a minimalist version of the anti-Semitic “Happy Merchant” meme. Even if it would be completely ambiguous to an outsider, two black lines can make /pol/ anons recall the full image and the shared anti-Semitic, conspiratorial ideas they associate it with. As one anon reacts: “I’m actually impressed with how ingrained that imagine is in my head that I can identify it almost instantly even in this minimalist form” (Anon #3). Particularly because these objects can form extremist dog whistles, this necessitates awareness (but not exposure) of seemingly fringe online spaces.

19 Many famous Internet “memes” continue to originate or popularise on 4chan. Early “successful” Internet memes include , LOLCats, advice animals and rage comics, while newer memes reaching mainstream audiences include and Feels Guy. For research on the cross-platform diffusion of memes, see Zannettou et al.

Figure 3: A minimalist version of the anti-Semitic image “Le Happy Merchant”, posted on /pol/ on 4 January 2017. Derived from archive.4plebs.org.

It is hard to identify whether such images are mainly a corollary of a broader political climate, or if (and if so, to what extent) their spread can significantly influence the beliefs of its recipients. Nonetheless, identifying and analysing their presence on the fringe might explain larger (political) events. This claim is Tardean, exemplified by Deleuze and Guattari in A Thousand Plateaus:

May 1968 in France was molecular, making what led up to it all the more imperceptible from the viewpoint of macropolitics. […] As Gabriel Tarde said, what one needs to know is which peasants, in which areas of the south of France, stopped greeting the local landowners. A very old, outdated landowner can in this case judge things better than a modernist. It was the same with May ’68: those who evaluated it in macropolitical terms understood nothing of the event because something unaccountable was escaping. (Deleuze and Guattari 216)

Drawing the parallels between the ’68 movements and the recent surge of the online right is apt when espousing Nagle’s argument that the same bottom-up, countercultural, transgressive style of the 60s has been adopted by the online right, partly manifested on /pol/. Unfortunately, while the above peasants might have had valid reasons to stop shaking the hands of the land owners, imitations on 4chan are often based on the accumulation of “bullshit” (Tuters et al.). Regardless of whether it is used for the production of bullshit like “race realism” or conspiracies, the shared discourse and in-group behaviours related to these social imitations can change from fringe ideas to accumulate into a “high level of group trust that lays the groundwork for political action” (Beyer 47-8). This observation was underlined when a single post about the “Pizzagate” conspiracy on /pol/ snowballed into a shooting at one of the parlours claimed to be part of the Clintons’ network (Fisher et al.).

1.2.2 4chan and the emergence of publics

Despite the importance of the material infrastructure, shared discourse, or common behaviours, perhaps the most crucial collectivising force on /pol/ is the presence of specific ideas. 4chan’s effectiveness in stimulating “objects of imitation” can as well stimulate the propagation and development of mostly

mental -- i.e. ontologically subjective -- (political) ideas. In turn, this can lead to the emergence of larger publics. Tarde saw publics as “constellations of individuals brought into relationships through networks of communication” (Niezen 53), connected by “the idea or the passion” (Tarde, “The Public” 288). Collectives like Anonymous and the “alt-right”, fractured as they might be, can be conceptualised as publics because they consist of individuals that share “ideas and passions” on specific issues -- be it anti-corporativism, “men’s rights”, or white supremacism -- and find this connection through online platforms. Of course, these constellations are always “comprised of a number of different perspectives and collectives, a cacophony of voices and interests constituting multiple publics” (Phillips and Milner 166) and participation in a “technical medium” like 4chan can form a “passion” in itself (Wiedemann 314). Still, shared beliefs, ideas and interests in a subject matter are arguably one of the basic tenets, if not the basic tenet, holding the dispersed contributors on 4chan together. This is particularly the case for /pol/, since, despite remaining a cacophony of antagonistic voices, political views are homogenised by “polarization effects”; “as skeptical users opt out of these communities [like 4chan], they become echo chambers of like-minded believers without exposure to any differing views” (Marwick and Lewis 18). As noted above, /pol/ might not be a pure “echo chamber” but also a “refraction chamber” (Rieder, “Refraction Chamber”); still, its characterisation as a “containment board” and its infamy for allowing extremist content suggest a degree of like-mindedness, making /pol/ a “safe space for self-selecting misogynists and racists” (Phillips, “Oxygen” 23, emphasis added). Despite their close connection, publics should not be rigidly equated to the medium they connect through, since the former are not fixed to the latter.
Anonymous, for instance, moved further away from its birthplace on 4chan as it “matured” and increasingly relied on IRC chats (Coleman, Hacker 48).20 As such, there is a subtle difference between the network and the publics it enables and co-constitutes. This is relevant because the use of generalising misnomers has historically only strengthened in-group coherence, for instance when a CNN reporter famously referred to iCloud “hackers” with: “Who is this 4chan guy?” (Vincent).21 The idea of the public is further useful here because it can “subjectify” the presence of reoccurring topics on /pol/, and anons often self-organise in threads dedicated to specific issues.22

20 IRC is a protocol that facilitates group messaging. It played a big role in the organisation of Anonymous’s activist efforts (Coleman, Hacker 17).
21 4chan as a singular entity became a meme on multiple occasions. In the early Anonymous-era, Fox News broadcasted a report on Anonymous, presenting the loose activist group as “hackers on steroids”, a “hacker gang” and “Internet Hate Machine”, giving 4chan a national platform as a dangerous, unanimous, well-coordinated group (Phillips, This Is Why 58-9). In 2014, 4chan’s presumed entitativity was presented as even more radical when, after an iCloud hack leading to a leak of nude pictures of celebrities, a CNN reporter alluded to the culprit with the question “who is this 4chan guy?” - to the delight of many anons (Vincent). Many anons gladly joke about this entitativity, as do 4chan’s moderators, as the FAQ states: “Anonymous […] is a god amongst men. Anonymous invented the moon, assassinated former President David Palmer, and is also harder than the hardest metal known the man: diamond. His power level is rumored to be over nine thousand. He currently resides with his auntie and uncle in a town called Bel-Air (however, he is West Philadelphia born and raised). He does not forgive.” (4chan.org/faq)
22 These topic-specific threads are called “Generals”, to which I will return in chapter three.

Inherently, the public forms a simplification that is always further reducible into smaller “parts” (Phillips and Milner 166). However, the fact that anons also adopt this simplification, as I will illustrate here, raises an epistemological question: is the social “whole” still “never more than its parts” when its participants collectivise around a shared imagination of that whole? Humans, in contrast to atoms in “atomic societies” or ants in “ant societies”, can at least partially envision what they perceive as “the ‘whole’ in which they are said to reside” (Latour et al. 604-5). Just like a French person could formulate a public perception on “what it means to be French”, a 4chan or /pol/ visitor can be influenced by their ideas of what the broader culture, politics and customs of the imageboard entail. This imagination is not rare; many anons have found joy in roleplaying -- or actually believing23 -- that their collective effort is more than the sum of its parts. For instance, when 4chan was represented as an “Internet Hate Machine” by Fox News in 2007, the generalisations incentivised /b/ anons to remix the stigma into “larger and ever-more conspicuous structures” (Phillips, This Is Why 58-9). More recently, this structuralist belief even took satirical-religious forms when /pol/ anons claimed to have harnessed magical powers in memes that foretold future events - so-called “meme magic” (Phillips et al.). Despite these imaginations of structures, theorising them as “outside” the human would be folly since the human-perceived “wholes” are no sui generis super-organisms but rather the internalisation of a perception of a whole (Latour et al. 603-4). Still, such internalised imaginations of wholes should not be dismissed since they form extremely interesting “objects of imitation” when researching political activity on 4chan, providing insight into shared beliefs, norms, wishes, ideological associations, and so on - to which I will return later.
If such internalised imaginations are common among anons, this can possibly legitimise characterising the collective as a “public”.

This chapter presented a Tardean perspective on 4chan. His non-structuralist view provides a frame to avoid generalising observations on the imageboard and its users. Further, it suggests thinking of 4chan as a network that is particularly effective in affording innovations, creating obscure subcultural “objects of imitation” that indicate shared beliefs and knowledge, but that can also “spread outwards” into the mainstream. If Tarde proposed to study minute imitations before making grander claims about a “society”, a simple question remains: how to do so? More specifically, how can one “trace” the lines of the constantly shifting political sentiments on /pol/ by analysing what mental worlds are opened by looking at “objects of imitation”? What patterns in ontologically subjective “beliefs and desires” do these objects demarcate? These questions also captured the attention of Tarde, who dreamt of possessing enough data on flows of imitations (Kinnunen 434),24 but could not “turn his intuitions into data”

23 Burton illustrates how a “self-identified active member of the alt-right” told her: “I don’t believe in God. But I say ‘Praise Kek’ more than I’ve ever said anything about God.”
24 Tarde even criticised the French government for lacking “data on people’s values, religious activities, linguistic change or emotional attributes, beliefs or desires” (Kinnunen 434).

(Latour, “Gabriel Tarde” 2). Now that the digital sphere presents an abundance of social data, Latour argues that “many of the argument[s] of Tarde can be turned into sound empirical use” (“Gabriel Tarde” 2). To that end, the next chapter will explore how such an empirical outlook can take form.

2. Method/ology: Tracing and Navigating Imitative Currents

Our problem is to learn why, given one hundred different innovations conceived of at the same time - innovations in the form of words, in mythical ideas, in industrial processes etc. - ten will spread abroad, while ninety will be forgotten. (Tarde, Ld’I 140)

How can empirical analysis on the content of /pol/ denote what broader political sentiments coagulate, move, or disappear? Fortunately, because /pol/ is almost entirely archived, it allows for “digital traceability” that can form a “vindication” for both Tarde’s theory and methodology (Latour, “Tarde’s Idea” 157). As such, this chapter concerns the methodological pondering on how to combine approach, data and methods for the purposes of tracing fluid political sentiments on /pol/, informed by the Tardean insights of the prior chapter. In the first section, approach, I argue that fractured political mobilisations can be studied by analysing multiple “parts” of a whole. This follows the argument by Latour et al. that “aggregates” in data analysis do not offer an epistemologically distinct or fully objective insight, which is particularly the case here since aggregation can further obscure the vernacular intertextuality of the objects of study. To retain some contact with the “dirt” of the data, one can take a particular data object and “[deploy] as many subjective perspectives as possible” (Venturini, “What Is”) to “digitally navigate” attributes indicating what the scrutinised object “is made of”, which in turn potentially offers “points of view” onto shared political sentiments or “beliefs”. In the second section, data, I discuss a database of all posts from /pol/ since November 2013 and how to sample it in relation to a particular text object. In the third section, methods, I discuss five text mining techniques that can repurpose /pol/’s text data to indicate shifts in discourse and associations, each providing its own “point of view” into how the data object is imitated. While many of the reflections and practices in this chapter are generalisable, it also forms an outline for the case study in the next chapter, which takes the word trump as a particular object to longitudinally explore political sentiments on /pol/.
For all of the methods listed below, I wrote scripts in Python, extensively using the libraries pandas, matplotlib, sqlite3, scikit-learn and gensim.25 The full scripts are available on GitHub.26

2.1 Approach: Circulating around differently conceived wholes

How can the Tardean view inform data-driven research into 4chan’s political sentiments? Latour et al. propose a methodological approach that espouses Tarde’s argument that “the whole is equal to no more than the sum of its parts” (Clark 17). They note that, in the social sciences, two epistemological levels form cornerstones for empirical research: the individuals and the aggregates. These are used as conceptual lenses, questioning how one can conceive of a “whole” through studying “individual”

25 See pandas.pydata.org, docs.python.org/2/library/sqlite3.html, scikit-learn.org and radimrehurek.com/gensim/ 26 github.com/salhamander/polthesis


elements by starting at the macro and, from there, diving into the micro, or vice versa (591). For instance, Latour et al. note how Farkas et al. understood the stadium wave (the macro) by characterising individual reactions as either excitable, active or passive (the micro). Though fruitful in this example, humans do not spend most of their time predictably adopting group behaviour in a football stadium -- they do not just “react” but also “refract” (Rieder, “Refraction Chamber”) -- meaning such an approach is usually “incapable of understanding more complex collective dynamics” (Latour et al. 598). This problem also forms a roadblock in researching 4chan. Case in point: Bernstein et al. attempt to categorise the types of posts on /b/ (52),27 which, however insightful, simplifies its complexity by identifying a “100% whole” and labelling its parts with altogether structuralist generalisations like “themed”, “sharing content” or “discussion”. In another example, Rolling Stone “proved” that the use of “racist and fascist terms on [/pol/] has increased since the start of Donald Trump’s presidential campaign” by showing a bar graph with the temporal increase of several hate words (Reitman). I will not argue that radicalisation is absent on /pol/, or that fascism is no source of concern, but Reitman’s claim is problematically limited by inferring structural change in political sentiments (“the rise of white supremacy”) from simple metrics and aggregates (word trends). The blinding particularity of this case is made even clearer because adding normalised word count percentages, omitted in Reitman’s text, shows that the use of these hate words remained stagnant.28 To prevent the problem of inferring “individuals” from “wholes”, or the other way around, Latour et al. ask: “is there a way to define what is a longer lasting social order without making the assumption that there exist two levels?” (591).
This question is informed by Tarde’s anti-structuralist premises that “the whole is never more than its parts” and thus not on a distinct “level”, and that the micro is not an isolated unit either, but rather a whole in itself. From this perspective, concepts like “the individual” are but indicators for a “range located within a more complex social but not individual field, where the regions beneath and beyond the individual have their own domain” (Brighenti 298). As such, the micro and the macro are not ontologically distinct, but rather different epistemologies of the parts they are assumed to consist of. As an alternative, Latour et al. pose that these viewpoints are not the only options: rather than inferring an essence through micros and macros, one can infer properties by employing various particular views. When defining e.g. “the individual”, but also something like “Anonymous”, one swaps this essence for the question of what it is made of, noting that the parts

27 Bernstein et al. categorised the OPs in a sample of /b/-data as “themed” (28%), “sharing content” (19%), “question, advice and recommendation” (10%), “sharing personal information” (9%), “discussion” (8%), “request for item” (8%), “request for action” (7%), “meta” (5%) and “other” (6%). 28 Reitman shows, for instance, the n-word and the anti-Semitic “kike”. Appendix IV contains the normalised counts of these words, showing the former stayed constant, and the latter only slightly increased. Further, they do not correlate with the start of Trump’s presidency. Again, I am not stating that these numbers are not problematic, but rather that they behove more in-depth empirical scrutiny.


are reducible and complex vectors to other parts. As Latour notes, the Tardean ontology swaps being for having:

If essence is the way to define an entity within the ‘To be’ philosophy, for the ‘To have’ philosophy an entity is defined by its properties and also by its avidity... No way to escape from Tarde’s logic: take any monad,29 if you look at what are it’s [sic] properties and its proprietors, you will be led to define the whole cosmos, which would be impossible if you had only tried to define the essence of an isolated identity. (Latour, “Gabriel Tarde” 16)

As such, Latour proposes to work towards an ontology by tracing the ontic. The entity of “a cat”, for example, cannot be defined through metaphysics. Rather, an attempt can be made to define it by noting it is a cat because it has four legs, it has fur, it has the capacity to purr, and so on. However, while this ontic approach eludes metaphysical obscurities, it creates a problem of partiality: even in simple phenomena, one can always trace more properties, leading to further properties, leading to further properties, etc. This slippery epistemological slope also puts the finger on the sore spot with studying 4chan. Endlessly complex work is required not only for generalising questions like “what is 4chan?” or “what is Anonymous?”, but also questions like “what is the dominant ideology of most /pol/-anons?” or “how is X meme used?”. Gabriella Coleman implicitly attests to this by stating that even though Anonymous “simply consist of humans […] meshed together by wires, transistors and wi-fi signals”, each element within this network further points to “a vector for irreducibly unique and complex history” (Hacker 115). Capturing a static state of a public like Anonymous is even further complicated when such publics are “always emergent and in flux” (Coleman, Hacker 115). To return to Heraclitus, one can describe the river by throwing out a net to catch and study that which happens to get trapped, but the knowledge of the river in its totality is always partial. Moreover, as the river’s current keeps flowing and its banks keep shifting, knowledge about its properties is perpetually outdated. Even the catch poses the same epistemological problems as the river, as e.g. a fish is also endlessly reducible to molecules, structure, and so on. This relativism should not mean the Tardean approach is useless in researching a complex network of imitations like 4chan. 
In accepting that perceiving complex “wholes” is unattainable, both for individuals and aggregates, Latour et al. propose that “the notion of structure should be replaced by the circulation of differently conceived wholes” (592). In doing so, they pose empirical research can refrain from suggesting a totality on singular micro-entities as well as macroscopic structuralisms (602), and instead map “a collective phenomenon without ever considering either individual components or structure” (608). As Latour et al. formulate, digital tools allow this: “instead of having to choose and

29 Tarde’s consideration of Leibniz’s monads is a complex and vague one. Latour et al. define it as “a monad is not a part of a whole, but a point of view on all the other entities” (599), while Latour describes it as “a representation, a reflection, or an interiorization of a whole set of other elements borrowed from the world around it” (“Tarde’s Idea” 154).


thus to jump from individuals to wholes, from micro to macro, you occupy all sorts of other positions” (595) by navigating “individual data points to the aggregates and back” (Latour, “Tarde’s Idea” 158). As such, the scrutiny of complex social phenomena becomes an iterative process of data work:

The “whole” is now nothing more than a provisional visualization which can be modified and reversed at will, by moving back to the individual components, and then looking for yet other tools to regroup the same elements into alternative assemblages. (Latour, “Tarde’s Idea” 158)

Nor does this mean that moving from viewpoint to viewpoint will present an objective and “total” view. Instead, it is closer to what Latour calls “second-degree objectivity”, differing from absolute objectivity as it

is obtained by the multiplication of different viewpoints. It is a kind of objectivity that comes from diversity rather than from uniformity. A kind of impartiality that comes from exploring a multitude of partial bias, rather than abstracting from them. (Venturini, “What Is”)

As such, “deploying as many subjective perspectives as possible” (Venturini, “What Is”) provides at least an informed, non-generalising characterisation. It recalls the parable of the blind men and the elephant: individually, the blind men will never fully know what this “elephant” phenomenon is, but by combining their perspectives, they can formulate a pretty good sense of its shape.30 This means that for this research, scrutinising “political fluidity” or “sentiments” becomes not a holistic endeavour, but rather an approach of taking different points of view that can together provide a sense of patterns in ideas and ideologies on /pol/. Latour et al. note such a digital navigation still needs to begin at a somewhat pre-conceived position because it has to start at a particular entity (599). Searching for such entities, they employ Leibniz’s notion of a “monad”: objects that are no coherent “part of a whole” nor a single entity, but rather form a “point of view” on the other entities it deals with “severally and not as a totality” (Latour et al. 598).31 Because it remains disputed what a “monad” exactly meant to both Leibniz and Tarde, who adopted it (Latour et al. 598), I will not import the concept, but, for the purposes of this research, refer to what I call “imitated objects”. The term “imitated object” alludes to the “object of imitation” discussed in the previous chapter, which, from the Tardean view, is no isolated production of the subject (a “meme”), nor the “possession” of the subject, but rather demarcates “the relation between different intentional subjects” (Schmidt 112) through indicating shared beliefs, desires, knowledge, humour, and so on.

30 The parable tells the story of how a group of blind men learns what an elephant is by touching different parts of its body. The man touching its tail thinks of it as a snake, while the one touching its knee thinks of it as a tree. The moral of the story is that ideas of a totality are always based upon partial perception. See e.g. John Godfrey Saxe’s poem in Gardner. The story is not entirely apt to this study, since correlating the political sentiment on 4chan to the elephant would mean the animal is walking, shapeshifting, and generally made of fractured pieces - hence the preference for the metaphor of the changing river. 31 See footnote 29.


However, I would like to stress that the “object” in “imitated object” can be further “objectified” for the sake of digital traceability. To put it more comprehensibly, the digital sphere allows one to pinpoint e.g. a concrete string (a sequence of letters) which can then form a “benchmark” to trace shifts in the properties around it. This objectification is a necessary evil to attain a lens that can unravel what further attributes the object provides insight into. As such, I take an imitated object not as an endpoint but rather as a simplified anchor point to trace further complexities in webs of meaning. What imitated objects could provide “points of view” on the political fluidity of a space like /pol/? Two main contenders are 4chan’s texts and images, since these are most indicative of its users’ subcultural world-building and shared knowledge reservoirs. However interesting, tracing the meaning of images can be complex, both technically, and because their use is often ambiguous or downright random on 4chan,32 meaning they are perhaps better suited to qualitative analysis.33 For instance, the quantitative research by Zannettou et al. traces the cross-spherical diffusion of images originating on /pol/, but did not engage in untangling their (potential change in) meaning, relying on the website Know Your Meme instead. Alternatively, the text in the post body is another hotspot of 4chan’s vernacular and intertextual complexities. In contrast to images, analysing the context of a word in a sentence can go a long way in identifying its meaning. This complies with the approach above since it does not assume structures or individuals, but rather gradually infers meaning from context. Moreover, these “ontic” cooccurrences can be explored from multiple “points of view” with various low-barrier text mining techniques (section 2.3).

As such, to explore political sentiments in the case study, I use the word trump as an imitated object, a “benchmark” of five letters that, at least hypothetically, offers points of view on rich and variable (political) meaning as it embeds a plethora of connotations. Most obviously, trump refers to Donald J. Trump as an individual, a politician, his actions, and associated ideologies.34 In itself, the word trump already suggests temporal instability, both because Trump underwent the transformation from a presidential candidate to the U.S. President, but also because of the multi-faceted debates he provokes. In a purely temporal sense, it offers longitudinal insights, instead of a single “burst” of meaning, which can be the case with events or short-lived memes. More importantly, however, trump can form a point of reflection and identification for /pol/ anons: the political sentiment surrounding the word does not only teach something about their consideration of Trump, but also about those who mention him. trump could be surrounded by both extreme ends of nihilism, irony, and absurdity (considering Trump’s antics), as well as outright supportive and antagonistic activism, each indicating something different about the “beliefs” of the anons.

32 The meaning of an image could be derived by semiotic or visual analysis, but through more automated means, things get complicated quickly. One could, for instance, correlate images to the text they appear with. However, images are often unrelated to the corresponding text - “pic unrelated” is a common meme on 4chan. 33 See e.g. the study on the aesthetics of 4chan’s images by Knuttila (“Trolling Aesthetics”). 34 What some call “Trumpism”. See e.g. Tarnoff.


Of course, data and methods are required to navigate the points of view trump provides, bringing in their own methodological and practical considerations. The rest of this chapter is devoted to these two topics, starting with the data.

2.2 Data: 4chan as an archived object

In the analysis of digital platforms like or , problems of delimitation (i.e. the “selection of subsets to analyse”) are common because of their sheer size and limited accessibility (Rieder and Gerlitz). Similarly, the question whether an imitated object can provide rich webs of meaning is “highly sensitive to the quality and quantity of information available” (Latour et al. 600). These are no barriers in the case of 4chan. The imageboard is simultaneously accessible and inaccessible: while its user perspective might offer a fleeting, contingent, and arcane voyage (Knuttila, “User Unknown”), from a technical perspective, the imageboard is open, transparent, and clear;35 a remnant of a time before the commercialisation of the Internet and its walled data gardens (Rogers, “Post-Demographic Machines” 34-5). 4chan offers an easy-to-use API, allowing data scraping from all its boards. The only -- though significant -- caveat is being swift enough to collect the desired bytes before they are purged from 4chan’s server. Luckily, a third-party archival website, archive.4plebs.org, has done so for /pol/ since November 2013, and made the data publicly available.36 Its text data is “only” 36GB in size, containing virtually37 all comments and posts on /pol/ from 29 November 2013 to 18 March 2018 (see appendix III for column headers). Such a complete dataset opens many methodological roads. First and foremost, tracing an imitated object like trump asks for further data filtering. For the case study, I work with a full sample, a trump posts sample and what I call a trump-dense threads sample.

2.2.1 Full sample

The size and availability of 4plebs’s /pol/ text data (36GB) is manageable as a “full sample” (Rieder, “Refraction Chamber”), meaning no data has to be discarded. Its size does mean the 4plebs dataset (available in comma separated format) should be converted to a relational database, as this prevents memory overload. I converted the dump into an SQLite database for further querying.38 In total, this database contains 140,731,206 posts in 3,635,798 threads. /pol/ has tripled in its amount of activity from

35 For a technical consideration of 4chan and its API, I wrote an unpublished article: salhagen.nl/tracing-4chan-pol.pdf. For more information on the post sorting mechanism, see Hagen. 36 See archive.4plebs.org/_/articles/credits/ 37 The dataset contains nearly all posts, because a limited amount of data was dropped. After archiving, 4plebs deletes data that is reported as illegal content, copyright violations or personal information. Content that was deleted by 4chan moderators after it was already captured by the 4plebs fetcher is included. This deleted content is marked by a non-null value in the timestamp_expired column – these only comprise a small part of all entries and will naturally not be featured in this research. Furthermore, some outages on both 4plebs and 4chan caused fetching issues, resulting in a small amount of dropped data. After extensive work with the dataset, these missing data files were all deemed insignificant. 38 See github.com/salhamander/polthesis/blob/master/createDatabase.py


December 2014 to March 2018, both in terms of posts (fig. 4) and threads (fig. 5), meaning non-normalised and time-independent data from this full sample can have a bias towards later dates (as was the case in Reitman). For some methods, the data was sliced from 1 December 2013 to 1 March 2018 to preserve full months and avoid partial data representations.39
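The conversion from a comma separated dump to an SQLite database can be sketched as follows. This is a minimal illustration rather than the createDatabase.py script referenced in the footnote: it uses only Python's standard library, assumes the dump starts with a header row, and the column names in the toy data (num, thread_num, timestamp, comment) are hypothetical stand-ins for the headers listed in appendix III. The point it shows is that rows are streamed in small batches, so the full file never has to fit into memory.

```python
import csv
import io
import sqlite3


def csv_to_sqlite(csv_file, conn, table="posts", batch_size=1000):
    """Stream rows from an open CSV file into an SQLite table in batches."""
    reader = csv.reader(csv_file)
    header = next(reader)  # assumption: the first row holds the column names
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'

    batch, total = [], 0
    for row in reader:
        batch.append(row)
        if len(batch) >= batch_size:
            # Write a full batch and release it from memory.
            conn.executemany(insert_sql, batch)
            total += len(batch)
            batch = []
    if batch:  # write the final, partially filled batch
        conn.executemany(insert_sql, batch)
        total += len(batch)
    conn.commit()
    return total


# Toy stand-in for the archive dump (hypothetical column names and values).
toy_csv = io.StringIO(
    "num,thread_num,timestamp,comment\n"
    "1,1,1448000000,first post\n"
    "2,1,1448000060,a reply\n"
    "3,3,1448000120,another thread\n"
)
conn = sqlite3.connect(":memory:")
inserted = csv_to_sqlite(toy_csv, conn, batch_size=2)
thread_total = conn.execute(
    "SELECT COUNT(DISTINCT thread_num) FROM posts"
).fetchone()[0]
```

Once the data sits in a database, counts such as the number of posts and threads become single queries instead of full passes over a 36GB file.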

Figure 4 (top): The total amount of posts (OPs and replies) on /pol/, separated per month. Figure 5 (bottom): The total amount of threads on /pol/, separated per month.

39 In specific: the frequency charts and the monthly word2vec models, although the latter were only used for five months.


2.2.2 Topic sample

A full archive of /pol/ opens many roads for analysis, but the sheer size of the data puts the researcher “into a specific and complex epistemological situation that is characterized by a constant production of ‘surplus’ where there is always ‘more’” (Rieder, “Refraction”). This is where the imitated object can be of use, functioning as an anchor point for a more manageable “topic sample” (Rieder, “Refraction”). The simplest way to do so is by creating a dataset that only contains posts mentioning the object. For the case study, all posts containing trump were queried and stored in a separate csv sheet. Prefixes and suffixes were allowed.40 This dataset will from now on be referred to as the trump-posts dataset.
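A query of this kind can be sketched as follows. The table and column names (posts, num, thread_num, comment) and the toy posts are assumptions for illustration, not the schema of the actual database; only the %trump% wildcard pattern is taken from the text.

```python
import sqlite3

# Toy in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (num INTEGER, thread_num INTEGER, comment TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        (1, 1, "Trump is a meme-wizard"),
        (2, 1, "all aboard the trumptrain"),
        (3, 2, "he plays the trumpet"),
        (4, 2, "nothing to see here"),
    ],
)

# The percentage wildcards match any prefix and suffix around the string.
rows = conn.execute(
    "SELECT num, comment FROM posts WHERE lower(comment) LIKE ? ORDER BY num",
    ("%trump%",),
).fetchall()
matched = [num for num, _ in rows]
```

In this toy data the pattern keeps compounds like “trumptrain”, but it also lets through the unrelated “trumpet”, which is the kind of false positive the wildcard approach has to tolerate.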

Figure 6: The amount of posts on /pol/ containing trump, separated per month

While the trump-posts dataset is inclusive, it creates its own bias because posts might be related to Trump, but not directly mention him by his surname. For instance, a thread’s OP could start with a picture of Trump and the question “Do you stand with Israel as much as your God Emperor does?” (Anon #4) and subsequent replies could refer to him with “he” or “that guy”. In this example, the full thread might still be relevant because Trump appears as a common thread throughout it, with the OP “setting

40 Pre- and suffixes were handled with SQL percentage wildcards: %trump%. Allowing prefixes and suffixes allows the inclusion of terms like “nevertrump” or “trumptrain”. Such words comprise a small amount of the data, but most often referred to a representation of Trump, so were conserved. No filter for dates was used, causing a small amount of mentions from before his presidential bid to refer to unrelated words like “trumpet”. However, these amounts are negligible and had no impact on the further analysis.


the agenda” for Trump-related discussion. How can such topic-related threads be identified? One way is to “repurpose” (Rogers, Digital Methods) the self-organisations of anons,41 but another method is to filter on threads that are “dense” with a certain keyword. For the case study, a dataset with trump-dense threads was created, consisting of threads that:

• Appear after the announcement of Trump’s presidential bid (16 June 2015): discussion on Trump was sparse before this date and mostly includes false positives like “trumpet”.
• Contain thirty or more posts (including the OP): this ensures threads casually mentioning trump can be discarded (e.g. a thread with three posts and one mention of trump), leaving only those that evoke conversation.
• Mention trump in at least 15% of posts in the thread (either thread title or post body, including prefixes and suffixes): this threshold ensures the word is central to the thread. 15% was settled upon after qualitative testing; a lower number would import too many unrelated threads, and a higher number would problematically shrink the dataset.42
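The size and density criteria above can be sketched as follows. This is an illustrative reconstruction using plain Python rather than the SQL table described in the footnote; the function name dense_threads and the toy posts are hypothetical, and the date criterion is left out for brevity.

```python
from collections import defaultdict


def dense_threads(posts, keyword="trump", min_posts=30, min_density=0.15):
    """Return {thread_num: density} for threads meeting both thresholds.

    `posts` is an iterable of (thread_num, text) pairs.
    """
    totals = defaultdict(int)  # posts per thread
    hits = defaultdict(int)    # posts per thread that contain the keyword
    for thread_num, text in posts:
        totals[thread_num] += 1
        if keyword in text.lower():  # substring match allows pre- and suffixes
            hits[thread_num] += 1

    selected = {}
    for thread_num, thread_count in totals.items():
        density = hits[thread_num] / thread_count
        if thread_count >= min_posts and density >= min_density:
            selected[thread_num] = density
    return selected


# Toy data: thread 1 is long and dense (8 of 40 posts), thread 2 is long but
# sparse (2 of 40 posts), thread 3 is dense but far too short (3 posts).
posts = (
    [(1, "trump rally tonight")] * 8 + [(1, "bump")] * 32
    + [(2, "nevertrump")] * 2 + [(2, "bump")] * 38
    + [(3, "Trump")] * 3
)
selected = dense_threads(posts)
```

Only the first toy thread passes both thresholds, which mirrors how the criteria discard both casual mentions in long threads and short threads that happen to mention the word.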

Figure 7: The amount of trump-dense threads; threads with 30 posts or more, where the word trump appeared in 15% or more of these posts, and that appeared after Trump’s announcement of his presidential bid.

41 One way of filtering on topic-dense threads is by “repurposing” (Rogers, Digital Methods) the chan-specific phenomenon of “General” threads. These are threads focussing on a topic and continuously reborn by anons copy-pasting a standard text as an OP and setting a recognisable title. Popular Generals include /ptg/ “President Trump General” (see fig. 36), /sg/ “Syria General”, or extremist topics like /nsg/ “National Socialism General”. General threads can be seen as units of self-sorting publics on the imageboard. Querying the OP title for the specific General name can sample a dataset for a topic-specific issue public on 4chan. I will further discuss Generals in the case study. 42 Practically, the trump-dense threads were collected by making a new table in the full database. This table contains all thread numbers where trump appeared. For each thread (i.e. row), three values were added: the total amount of posts (thread_count), the amount of posts containing trump (trump_count), and the percentage of the posts containing trump (trump_density). The thread_count and trump_density were used to subsequently create a csv file with the above parameters (thread count >= 30, trump-density >= 0.15).


Using these parameters, a random selection of five trump-dense threads from July 2015 suggests Trump is indeed a central point of discussion, as illustrated by the OP titles:

• “Based trump/trump daily”
• “Huffington post puts trump details in entertainment section”
• “GET ON FOX, TRUMP IS DESTROYING”
• “RIP in peace Donald Trump's campaign”
• “Dr. Ben Carson defends Donald Trump's Remarks Live on CNN”

The selected parameters resulted in 8.6 million posts in 59,356 trump-dense threads. As the red line in fig. 7 shows, in the periods trump was mentioned most on /pol/ (March and September 2016), the chance was over 8% that any thread was a trump-dense thread, accounting for approximately 15,000 monthly “units of conversation”. Before and after these spikes, this chance is around 3% - still a considerable number. It can thus already be assumed Trump, in large part, animated a vast amount of activity on /pol/.

The full sample, trump-posts sample and trump-dense threads sample reveal further interesting numbers in size alone. Appendix II offers insights into a few “scopes” of these datasets, most notably making an educated guess about the amount of active /pol/ anons.

2.3 Methods: Digital navigation through text mining

In his time, Tarde mostly imagined statistics as the tool for identifying imitative patterns. More than a century later, more experimental technologies are readily available that can identify patterns in the production of meaning, for instance in word contexts. This section discusses a set of experimental techniques that allow for exploring the context of the word trump. Together, they form a toolbox for a “digital navigation through point-to-point datascapes” (Latour, “Tarde’s Idea” 159), each allowing configurations of many “differently conceived wholes” (Latour et al. 592). Their combination does not present a holistic and total overview; the techniques described here should be thought of as (and are used as) exploratory and experimental techniques forming particular lenses, with particular parameters, offering particular gazes, on particular objects.

2.3.1 Mapping vocabulary change with tf-idf

Following Tarde, imitations and innovations can bring about “social assimilation and political consensus” because they formulate and propagate “beliefs and desires” in human societies (Leys 279-80). If it is indeed assumed these beliefs and desires are present in discourse, the vocabulary change of a certain group can perhaps give some insight into this. I do not focus on the grand concepts of “beliefs and desires”, but relational objects can indicate what issues and ideas coagulate through dominant imitations. For /pol/, popular words can identify what topics animate its loose publics. More specifically, narrowed to an imitated object, dominant words can identify the topics this object is most


heavily associated with. For instance, it makes a difference whether trump-dense threads give more prominence to maga or russia, particularly because these words themselves form vectors of further ideas and meanings. To identify these issues, it is necessary to determine what are popular terms. “Popular” is a fickle parameter: in language, common words like “and” are strictly popular, but do not indicate the topics I am after. Luckily, there are many text mining techniques that can identify words that are unusually relevant in a text. Term frequency - inverse document frequency, or tf-idf, is one of the most used methods to extract such word relevancy.43 Specifically, tf-idf calculates what words are common in one body of text by “subtracting” its weight by its frequency in other bodies of text. For each term in a text body, tf-idf can calculate two metrics:

1. Term frequency (tf): The number of occurrences of the term in its document.
2. Inverse document frequency (idf): A measure weighing down a word’s term frequency by how frequently the word appears in other documents (Spärck Jones).44

By weighing down the tf with the idf, it pinpoints terms that are common locally but rare globally. This circumvents issues of generally common words over-dominating (e.g. “because” or “what”). Tf-idf can also take into account a tf might be high simply because the document it appears in is larger than the others.45
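The arithmetic behind these two metrics can be sketched as follows. This is a textbook formulation written out by hand for illustration, not the scikit-learn implementation used for the case study, which by default applies a smoothed idf and normalises the resulting vectors. The function name tf_idf and the toy weekly documents are hypothetical. Dividing the term count by the document length shows one way of compensating for documents that are simply larger than others.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document, using tf * ln(N / df)."""
    n_docs = len(documents)
    term_counts = [Counter(doc.split()) for doc in documents]

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())

    scores = []
    for counts in term_counts:
        length = sum(counts.values())  # normalise tf by document length
        scores.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Three toy "weekly documents".
weeks = [
    "trump maga maga wall",
    "trump russia russia cia",
    "trump recount stein recount",
]
scores = tf_idf(weeks)
top_terms = [max(week, key=week.get) for week in scores]
```

In this toy example, trump occurs in every document, so its idf is ln(3/3) = 0 and its score vanishes, while the terms that are frequent in one week only end up on top.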

Figure 8: Unstructured 4chan text, like “shitposts” or copy-pastas complicate text mining, for instance with tf-idf. Derived and screencaptured from archive.4plebs.org

43 Apart from its PageRank algorithm (Rieder, “What is in PageRank?”), Google was built on the technique, and forms “the core of many, if not most, ranking methods used in search engines” (Robertson 518). 44 For a technical explanation of tf-idf, see Aizawa. Spärck Jones offers an interesting historical perspective, as she was the first to formulate idf. 45 In language, many terms are the same (common grammatical words, stop words, nouns) and thus over-dominate. Additionally, words that are uncommon in general speech might be extremely common on /pol/, (vernacular terms, hate speech, names), forming a second vocabulary dominance. Such terms are not irrelevant, but they problematically homogenise differences between the temporal use of terms. Practically, documents with more words can be attributed more weight than smaller ones, creating a size bias. In regards to /pol/, this is a shame, since its activity is highly animated by commenting on geopolitical events and stories, making much of the posts time-specific.


For the case study, tf-idf calculations were performed on the vocabulary in the trump-dense threads, since this would take into account posts that were made under the umbrella of trump-related discussion, creating a more varied issue space than in e.g. the trump-posts sample. To do so, I created weekly time-separated documents for all posts in the trump-dense threads dataset. This resulted in 143 documents (week 26, 2015 to week 11, 2018). By default, the methods used calculate scores for all terms. However, this resulted in little significant change over time. To attain a higher variance between weeks, I tuned the Tardean lens to focus more on innovations rather than imitations: terms were omitted that appeared in all weekly documents. This also removed frequent yet relevant terms like trump and clinton, but it was considered a worthwhile trade-off; not only because it removed general words like people, but also because it meant all resulting words were “innovative” and highly relevant for the corresponding timespans - there was always another week where a specific word was not imitated. Further, I filtered out some “clutter”: stop words, words with two or fewer characters, URLs, and numbers. Additionally, because 4chan’s text objects are very unstructured (see fig. 8), duplicate words in a post were removed to prevent copy-pasted content from skewing the data.46 Finally, to equate very similar semantics, words were “stemmed”, meaning that only the stem of words was kept (e.g. trumps would be cut off to trump).47 This resulted in a total vocabulary of 127,956 terms. I used the TfidfVectorizer class from the Python library sklearn with the default parameters, save for the ones that managed the word filtering described above.48 I used the resulting sparse matrix of weighted term frequencies to create a new datasheet with the top 100 terms per week, and analysed the first ten words as a basis for the case study (table 2 shows a sample).

2016-45    2016-46    2016-47    2016-48    2016-49
maga       pence      recount    recount    russia
red        bolton     alt-right  pence      russian
energy     bannon     spencer    twitter    cia
ctr        dying      nazi       carrier    hacked
florida    ctr        stein      taiwan     cabinet
blue       soros      ctr        soros      pence
michigan   protests   jill       dying      recount
pence      maga       alt        mnuchin    goldman
dying      cnn        disavowed  goldman    swamped
lads       russia     rigged     ctr        dying
hour       red        pence      cnn        sachs
shadilay   appointed  soros      cabinet    pizzagate

Table 2: A sample of top tf-idf terms per week in trump-dense threads
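The filtering steps described before the table can be sketched as follows. This is a simplified illustration, not the thesis scripts: the token pattern, the four-word stop list and the function names are assumptions, and stemming is left out. It shows the two steps that matter most for the argument: keeping each word only once per post, and discarding terms that occur in every weekly document.

```python
import re

TOKEN = re.compile(r"[a-z0-9-]{3,}")  # hypothetical pattern: three or more characters
STOP_WORDS = {"the", "and", "because", "what"}  # tiny illustrative stop list


def post_tokens(post):
    """Lower-case a post, drop URLs, numbers and stop words, keep each word once."""
    text = re.sub(r"https?://\S+", " ", post.lower())
    seen = []
    for token in TOKEN.findall(text):
        if token.isdigit() or token in STOP_WORDS:
            continue
        if token not in seen:  # duplicate words within one post are ignored
            seen.append(token)
    return seen


def drop_ubiquitous(weekly_docs):
    """Remove terms that occur in every weekly document."""
    everywhere = set.intersection(*(set(doc) for doc in weekly_docs))
    return [[term for term in doc if term not in everywhere] for doc in weekly_docs]


# Two toy weeks, each a list of posts.
weekly_posts = [
    ["TURN THE CAMERAS TURN THE CAMERAS", "trump says maga http://example.com 2016"],
    ["trump and the recount"],
]
weekly_docs = [
    [token for post in week for token in post_tokens(post)]
    for week in weekly_posts
]
filtered = drop_ubiquitous(weekly_docs)
```

Here trump is removed because it appears in both toy weeks, which is the trade-off discussed above: the most imitated term disappears so that week-specific terms become visible.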

46 The post in fig. 8 would thus be translated to “turn the cameras camerasturn”. 47 I stemmed by using the SnowballStemmer class from nltk.stem.snowball. See nltk.org/_modules/nltk/stem/snowball.html. 48 See scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html for the default values. I set max_df to the amount of weeks minus one (142 weeks) to filter out the words that appeared in all weeks (the ‘domain-specific stop words’). Further, I used the following regular expression to delete words that were shorter than three characters or contained special characters (which were mostly irrelevant clutter): [- //a-zA-Z0-9]{3,}


To test whether temporal vocabulary change could offer “points of view” into shifts in political sentiment on /pol/, I searched for a method to map the word similarities of each week. Because the tf-idf matrix basically outputs time-separated, weighted dictionaries, it formed a suitable data source to quantify weekly vocabulary similarities. To do so, I used cosine similarity, a typical approach for denoting similarity between text documents. It takes the cosine of the angle between two document vectors, yielding a score between -1 (absolute dissimilarity) and 1 (absolute similarity) that indicates how similar the components of two documents are, independent of their size (Aggarwal 76). The cosine similarities between all pairs of weeks were calculated and outputted as a csv matrix.49 The results from this matrix can be used for a visual network, using weeks as nodes, and the cosine distances between the weeks as the edges. I visualised this network as a scatterplot50 with the Python library matplotlib. The resulting plot showed a fairly logical representation, but some dispersed nodes made it difficult to read, so I compressed the distances between nodes using the multidimensional scaling (MDS) algorithm (Kruskal).51 This resulted in a circular52 “path” of chronologically increasing nodes (see fig. 9), suggesting gradual vocabulary change over time. This did not, however, indicate which words were causing this longitudinal change - the valuable “points of view” into particular text objects were lost. Therefore, on top of the map, I plotted the most significant terms from the documents therein. To do so, some minor topic extraction had to be done. I used the K-means clustering algorithm (Hartigan and Wong) on the tf-idf matrix to find clusters within the path of nodes and extract the dominant terms from these clusters.
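The pairwise comparison of weeks can be sketched as follows. This is a hand-written illustration of cosine similarity on sparse term-weight dictionaries, not the similarities.py script referenced in the footnote; the week labels are real, but the weights are made-up numbers chosen for easy arithmetic. Because tf-idf weights are never negative, the scores for such documents fall between 0 and 1.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical tf-idf weights for three weekly documents.
weeks = {
    "2016-47": {"recount": 0.8, "stein": 0.6},
    "2016-48": {"recount": 0.6, "carrier": 0.8},
    "2016-49": {"russia": 1.0},
}
labels = sorted(weeks)

# Symmetric matrix with the similarity between every pair of weeks.
matrix = [
    [cosine_similarity(weeks[row], weeks[col]) for col in labels]
    for row in labels
]
```

In this toy matrix, the first two weeks share the term recount and score roughly 0.48, whereas the third week shares no vocabulary with the others and scores 0. Subtracting each score from 1 gives a distance that a scaling method like MDS can place on a two-dimensional plot.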
In simple terms, K-means takes a pre-defined number of central points, “centroids”, and uses these to identify clusters by repeatedly calculating an ideal middle point for each cluster until it “reaches convergence” (Aggarwal 207-8).53 The nodes closest to these centroids are then seen as a cluster. The number of centroids has to be predefined; in my case, five centroids were chosen since this provided distinct differences between nodes. Roughly, the clusters denote weeks falling within the following date ranges:

• 22 July 2015 - 14 February 2016
• 15 February 2016 - 18 July 2016
• 19 July 2016 - 22 January 2017
• 23 January 2017 - 25 September 2017
• 26 September 2017 - 18 March 2018
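The alternation between assigning nodes to centroids and recalculating the centroids can be sketched as follows. This is a simplified plain-Python illustration with fixed starting centroids and invented two-dimensional points, not the sklearn KMeans routine used for the actual clustering, which chooses its own starting positions.

```python
def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per point."""
    labels = []
    for p in points:
        distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)  # keep an empty cluster's centroid in place
            continue
        new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
    return new_centroids


def k_means(points, initial_centroids, max_iter=100):
    """Alternate assignment and update steps until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break  # convergence: no point changed cluster
        labels = new_labels
        centroids = update(points, labels, centroids)
    return labels, centroids


# Toy "weeks" with invented coordinates that form two obvious groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = k_means(points, initial_centroids=[(0.0, 0.0), (1.0, 1.0)])
```

Because the outcome depends on where the centroids start and on how many are requested, different settings can yield different clusters, which is one reason the resulting map is treated as exploratory.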

49 Available at salhagen.nl/thesis/trump-dense-cosine-matrix.csv 50 I chose a scatterplot instead of software like Gephi because edges are irrelevant: each week is connected to all other weeks. Of course, Gephi can also create an “edgeless” graph, but I chose to stay within the Python environment with matplotlib. 51 See the getDocSimilariy function in the following code: github.com/salhamander/polthesis/blob/master/similarities.py For a technical consideration of MDS, see Kruskal and the sklearn documentation: scikit-learn.org/stable/modules/generated/sklearn.manifold.MDS.html 52 The cosine scatterplot is circular because of the MDS compression, not because the vocabulary at the start and end weeks is similar. 53 For a technical description of K-means, see Hartigan and Wong or Aggarwal (ch. 7.2).


To perform this clustering, the KMeans function from the Python library sklearn was used,54 set to the default Expectation Maximisation algorithm. The clustering results were used to colour the nodes (fig. 9). In the legend, I included the top five tf-idf words per cluster to provide specificity on what issues were animating the anons in the trump-dense threads.

Figure 9: The abstracted cosine scatter plot (no labels, legend and title). The distances and positioning denote the cosine distance (scaled with MDS) and the colours the K-means clusters of vocabulary similarity

The resulting cosine scatterplot (fig. 14) forms a map that points to further spots of variance, or, in Tardean terms, points of innovation. However, I want to again raise the exploratory nature of this method. As the parameters and techniques radically determine the visual outcome, the scatterplot should not be seen as an authoritative cartography but rather as a useful point of view onto further particulars. If this is espoused, the method makes the most of the Tardean approach in allowing imitative patterns to emerge organically.

To use the map as an indicator for temporal change between “points of view” on trump, I took the median dates of each cluster as a basis to explore trump in five weeks: weeks 45-2015, 18-2016, 42-2016, 21-2017, and 51-2017.55 In no sense do these five weeks capture the breadth or complexity of the data from the entire clusters. They do, however, form specific entry points to explore how the vocabulary, and possibly the political sentiment, within trump-dense threads has changed. Per week, three further methods were used to “circulate” around trump. Firstly, I used the ten most popular tf-idf terms of that specific week, as explained here, to indicate what issues animate the anons active in these threads. This allows some insight into the wider issue space around trump - both in terms of specific events (e.g. brexit is the top term in week 26-2016) as well as suddenly spiking vernacular (e.g. chess, alluding to the “4D-chess” meme,56 is the second most popular term in week 09-2018). The next two sections are devoted to the second and third methods: word embeddings and word trees.

54 See the sklearn documentation: scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html 55 I attempted to quantitatively determine the most “representative” weeks of their respective cluster by identifying the nodes closest to the cluster centroids. However, as “the effectiveness of a k-means algorithm is highly dependent on the distribution of the attribute values in the underlying data” (208), the centroid positions of the clusters can be fickle: a single outlier can cause considerable change. Therefore, median dates were used.
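The selection of a median week per cluster, as described above, can be sketched with the standard library. The sketch assumes that the median date of a contiguous run of weeks can be approximated by the midpoint between the cluster's first and last date, and that weeks follow ISO numbering; neither assumption is confirmed by the thesis code. The function names are illustrative.

```python
from datetime import date, timedelta


def median_date(start, end):
    """Midpoint between two dates, rounded down to a whole day."""
    return start + timedelta(days=(end - start).days // 2)


def iso_week_label(day):
    """Format a date as 'WW-YYYY' using ISO week numbering."""
    iso_year, iso_week = day.isocalendar()[:2]
    return f"{iso_week:02d}-{iso_year}"


# Date ranges of the first two clusters, as listed in this section.
first_median = median_date(date(2015, 7, 22), date(2016, 2, 14))
second_median = median_date(date(2016, 2, 15), date(2016, 7, 18))
```

Under these assumptions the midpoints fall on 2 November 2015 and 2 May 2016, in line with weeks 45-2015 and 18-2016.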

2.3.2 Semantic similarities through word embeddings

If a Tardean outlook would discuss ontology through defining the ontic, i.e. defining a whole by its parts, this logic could also apply as a way to “define” what trump “has”, and thus “is”. This ontic analysis can be focused on physical properties (“a cat has four legs”), but more interesting for my purposes is how the entity is imitated through language, since this foregrounds the subjectivity “behind” the production of meaning. But what would trump be “made of”, in this case? Arguably, the previous section has already denoted what related topics trump “has”, but the direct context of a word can also identify the construction of its meaning. It matters greatly what words trump “has” around itself, e.g. in the sentences “i think trump is useless” or “all hail trump the god-emperor”. This is especially interesting here, since temporal shifts in word associations can possibly provide insights into changing political sentiments on /pol/. Many techniques attempt to derive word meanings from their context,57 but word embedding algorithms almost literally operationalise the above approach. Their underlying linguistic theory has had a long intellectual history (e.g. Salton et al.), posing that words with similar contexts have similar meanings. Because this statement implies one should trace the properties of an entity’s use to understand it, it echoes Tarde’s non-structuralism and his “to have” philosophy. For instance, an implicit Tardean flavour is found in the work of the linguists Zellig Harris, stating “distribution of an element will be understood as the sum of all its environments” (146), or John Rupert Firth, declaring “you shall know the word by the company it keeps!” (11). However, these linguists were often chasing structural linguistic realities (see e.g. Harris 149), which contrasts with Tarde’s “one-level” anti-structuralism (Latour, “Gabriel Tarde” 6).
Nonetheless, modern word embedding techniques allow employment in a non-structuralist Tardean sense, because attributing them any structural meaning can be seen as a matter of inference. Further, word embedding techniques take the particular context of words and present the relations between these word entities without assuming a pre-defined coherence. For instance, words like “United” and “States” correlate not because of a definitive structure of language, but because the repetitions of “United States” provide them with similar contexts.

56 The 4D-chess meme denotes when someone is planning many steps ahead. On /pol/, it was mostly used to find a reason for Trump’s often incomprehensible policies (“Trump Is Playing 4D Chess”). 57 For instance, N-gram word collocation techniques can denote which n other words a specific word appeared with most, but were deemed too limited because they cannot capture the wider discourse a word is embedded in. Topic modelling techniques like LDA were experimented with, but either failed to provide coherent topics or problematically abstracted the data. When I subsequently found relatively simple neural network techniques could provide more intelligent semantic analyses (Mikolov et al., “Efficient Estimation” 2), I started exploring these paths.


Only recently have word embedding techniques become powerful and readily available. word2vec, released in 2013 by Mikolov et al. of Google, was the first widely adopted word embedding method. Afterwards, other implementations were released, most notably Stanford’s GloVe and Facebook’s FastText, but word2vec is used here for a variety of reasons.58 I will provide a very basic description of how it works (for an accurate consideration, see Mikolov, “Distributed Representation”; “Efficient Estimation”). Word2vec offers two main neural network models: Continuous Bag of Words (CBOW) and skip-gram.59 I opted for the skip-gram technique because it performed better on semantic tasks and training efficiency was of no importance in this study (Mikolov et al., “Efficient Estimation” 7). Skip-gram “slides” over all the sentences in a corpus with a window of a certain size (five in my case). For each middle word, it calculates an “embedding”, a compressed numerical representation based on the words around the word analysed. To limit the complexity of thousands or millions of collocations, this embedding is “compressed” to a certain number of “dimensions” (or: features) using a simple neural network. These embeddings are informed by “output vectors”: a large range of numbers for each word in the vocabulary, each denoting the similarity between its own word context and that of the input word. It is not this output, but rather the word embedding itself that is interesting, since this denotes a simplified representation of its “position” in a larger vector space, considering each word in the corpus. A singular embedding, or vector representation, e.g. for trump in March 2018, is noted as the following:

[-3.857777 1.7559991 0.6310176 1.7210367 -0.7211131 0.8466116 -3.381561 2.4608939 -0.6242869 0.00936217 -1.3582907 0.24295196 0.24723968 -0.18548784 2.8901794 1.999349 1.8424224 -1.8466829 1.4647037 2.903145 2.8759904 -0.3610506 1.0114994 1.3317658 -3.404333 1.892668 -0.4396822 -1.8178911 -0.4573367 -1.3257287 0.52875215 -3.1926286 -1.3008403 -2.8959842 4.7373857 -0.49697646 1.9581877 0.54118526 1.6440973 -0.24264862 3.4413238 2.6657383 -2.7399948 1.4154265 -1.7192115 -0.195145 -0.53809106 -1.5116366 -1.4740981 -1.0984546 0.31347153 1.128983 -1.1819432 2.15623 -1.8193393 1.0150502 -0.04109395 2.5652912 0.85835344 0.60916644 -1.5303342 -0.13864571 3.792254 -1.3716407 0.25061244 -0.02041579 -1.9252301 0.22671886 -3.4493906 -1.1272904 2.0542607 0.5664386 0.81504 0.59916234 -2.302329 0.06729359 3.2095075 3.6359448 2.2336447 -2.4671333 2.093517 -2.053294 -1.6498947 -2.6875484 -2.2431338 -1.9736847 -0.6124684 0.94388443 -0.89074385 -0.20927763 -0.60044533 -0.88564414 0.5244542 4.1841006 -4.402893 -1.2990781 1.5802844 -0.84377706 -0.69516355 0.16152757]

58 I tested all three techniques, but because results were similar, time constraints prevented further tuning, and word2vec had the best documentation in the gensim implementation, I opted for word2vec. What is interesting to know, however, is that FastText is more geared towards the syntactic context of a word, as opposed to the semantic focus of word2vec and GloVe. Analysing data on /pol/ with different word embedding techniques can be an interesting path for future research. 59 Skip-gram trains word representation by starting at a single word and then iterating over its neighbours. For instance, consider the word “fox” in the sentence ‘the quick brown fox jumps over the lazy dog’. If one considers the “window” of the word’s context as two, skip-gram iterates over the two words before and after it: (fox, quick), (fox, brown), (fox, jumps), (fox, over). Conversely, CBOW attempts to infer the context of a word by looking at the neighbours taking “each current word as an input to a log-linear classifier with continuous projection layer, and predict words within a certain range before and after the current word” (Mikolov et al., “Efficient Estimation” 4). For instance, it will train the vector representation of fox by inferring from (brown, jumps). For further explanation, see Mikolov et al. (“Distributed Representation”; “Efficient Estimation”).
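The sliding window of skip-gram can be made concrete with a short sketch that only generates the (centre word, neighbour) training pairs. It leaves out the neural network itself and the refinements of actual word2vec implementations, such as the subsampling of frequent words; the function name is an assumption for illustration.

```python
def skipgram_pairs(tokens, window):
    """Yield (centre, context) pairs for every word and its neighbours."""
    for i, centre in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                yield centre, tokens[j]


sentence = "the quick brown fox jumps over the lazy dog".split()
fox_pairs = [pair for pair in skipgram_pairs(sentence, window=2) if pair[0] == "fox"]
# fox_pairs: [("fox", "quick"), ("fox", "brown"), ("fox", "jumps"), ("fox", "over")]
```

With a window of five, as used for the models in this thesis, each word would be paired with up to ten neighbours.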


The number of values (here: 100) denotes the dimensions, and the values the position of the word in the vector space. Because word2vec generates these word embeddings by just analysing contexts, words with similar neighbours receive similar vector representations. While completely unreadable for humans, such numerical representations allow calculations that provide semantic insights. For instance, the classic example of word embeddings is that they “understand” king - man + woman = queen. Similarly, the models I trained know that frog + kek = pepe and chad + chick = stacy.60 Word embedding techniques can thus provide “points of view” onto subjective production of vernacular intertextuality.
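How such vector arithmetic works can be shown with a toy example. The three-dimensional vectors below are invented by hand purely for illustration; trained embeddings are learned from data and, in this thesis, have 100 dimensions. The helper names are assumptions, not functions of any library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(vectors, positive, negative):
    """Nearest word to sum(positive) - sum(negative), excluding the inputs."""
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + x for t, x in zip(target, vectors[word])]
    for word in negative:
        target = [t - x for t, x in zip(target, vectors[word])]
    candidates = (w for w in vectors if w not in positive and w not in negative)
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Invented toy vectors; the axes loosely stand for "royal", "male", "female".
vectors = {
    "king": (0.9, 0.8, 0.1),
    "man": (0.1, 0.9, 0.1),
    "woman": (0.1, 0.1, 0.9),
    "queen": (0.9, 0.1, 0.9),
    "frog": (0.2, 0.3, 0.2),
}
result = analogy(vectors, positive=["king", "woman"], negative=["man"])
```

Here the result is queen, because subtracting man and adding woman moves the king vector towards the position where queen already sits.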

I created monthly word2vec models with the library gensim using the full /pol/ sample. word2vec models become more “intelligent” the more data they can train on, so using them with the trump-dense threads sample limits their capabilities, as would a model for just the weekly data - hence the full sample. Before the training I did some pre-processing. I deleted non-relevant terms61 and only the words that appeared more than 200 times were considered - infrequent terms cluttered the dataset, but a threshold of 200 still ensured more obscure vernacular was included. I set the window size to the default value of five, meaning five words before and after a specific word were considered in the calculation of the embeddings. This resulted in a 100-dimensional vector space for each month.62 Vector representations of words make insightful semantic calculations possible, but for the purposes of a digital “navigation”, a visual overview of similarities to trump was desired to trace the “continuity of […] names and boundaries” (Latour et al. 650) around words with similar contextual meaning to trump. To visualise this, t-SNE (t-Distributed Stochastic Neighbour Embedding; Maaten and Hinton) offered a solution. t-SNE makes a best attempt at “translating” high-dimensional models like word2vec to a human-observable two- or three-dimensional space by “unpacking” its high-dimensional topological spaces (“manifolds”).63 t-SNE graphs were made of all the months containing the five weeks identified by the cosine scatterplot (November 2015, May 2016, October 2016, May 2017 and December 2017). The perplexity parameter highly impacts how t-SNE unpacks the data, representing a guess on how many close neighbours a single data point has (Wattenberg et al.). By iterating over various settings and looking at the coherence of the graphs,64 I settled on a perplexity of ten.
To ensure the terms visualised were indicative of “patterns in imitation”, I used a parameter to only show the top 4% most used terms of the monthly vocabulary, meaning words had to be used around 350 times per month. After setting these parameters, the t-SNE graphs were visualised as scatterplots with the Python library matplotlib.
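The frequency filter can be sketched as follows. How the 4% threshold was implemented is not specified here, so the sketch assumes it means the most frequent share of distinct terms in a month; the function name and the toy tokens are illustrative.

```python
from collections import Counter


def top_fraction_terms(tokens, fraction):
    """Return the most frequent `fraction` of distinct terms with their counts."""
    counts = Counter(tokens)
    keep = max(1, int(len(counts) * fraction))  # always keep at least one term
    return counts.most_common(keep)


# Invented toy month of tokens: four distinct terms, of which the top 50% is kept.
tokens = ["trump"] * 5 + ["maga"] * 3 + ["kek"] * 2 + ["chess"]
shown = top_fraction_terms(tokens, fraction=0.5)
```

For the actual graphs the fraction would be 0.04 instead of the 0.5 used in this toy case.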

60 “Chads” and “Stacies” refer to a meme on stereotypes of sexually successful, attractive and mainstream men and women. 61 I removed stop words, URLs, numerical values, and strings shorter than three characters. Long strings were sliced to up to twenty characters. Words were not stemmed. See tokeniserAndStemmer in the following code: github.com/salhamander/polthesis/blob/master/similarities.py 62 See the getWord2VecModel in the following code: github.com/salhamander/polthesis/blob/master/similarities.py 63 For a technical discussion of t-SNE, see its original paper (Maaten and Hinton). Wattenberg et al. provide a very informative visual explainer on t-SNE’s parameters. 64 To assess whether clustering makes sense, Van der Maaten suggests to “just look at them!”


Figure 10: A full view on a word2vec t-SNE representation of /pol/ text, November 2017

Fig. 10 gives a reference on all /pol/-words in the t-SNE graphs. This reveals the total vocabulary in /pol/ is not clearly separated, which makes sense since words are not used in isolation. However, some clusters do form, and zooming in reveals word2vec is able to match semantic relations fairly well. For instance, it groups words associated with the Middle East like afghan and isis, and even 4chan “meta” words like lulz and (see fig. 11). For the case study, I zoomed in on the area around trump to explore the words closest to this imitated object. Longitudinally, it thus became possible to explore imitations, repetitions, and variations in semantic relations to trump. I drew “bubbles” in Adobe Photoshop around coherent clusters surrounding trump as indicators on whether semantically coherent groups would appear, disappear or change over the course of the five months. Together, word2vec and t-SNE thus allow a “Tardean”, bottom-up approach of “mapping a collective phenomenon without ever considering either individual components or structure” (Latour et al. 608).

Figure 11: t-SNE word2vec word clusters (left: Middle East, right: 4chan meta)


Again, the use of word2vec and t-SNE should be considered as exploratory. Part of this is anchored in the methods themselves. Any technique dealing with high-dimensional data suffers from what Bellman describes as the “curse of dimensionality”: high-dimensional models cannot be faithfully reduced to human-observable lower-dimensional spaces. The fickleness of these techniques is further made clear because t-SNE takes a random walk between data points each time it runs (Maaten and Hinton). Although in this research multiple runs provided the same general shapes and clusters, slight differences did occur. Additionally, its parameters, especially “perplexity”,65 highly impact how t-SNE will “unpack” the high-dimensional data (Wattenberg et al.). Importantly, in t-SNE graphs, cluster sizes “mean nothing” and the distance between clusters “might not mean anything” (Wattenberg et al.). Both because of the random elements and the sensitivity of these parameters, word2vec and t-SNE challenge how to “transpose our epistemological enterprise into the language of computer science and back” (Rieder and Röhle 76). As such, I do not consider the visualisations as the results of this study; they are not “finished object[s] with (seemingly) apparent interpretation” but rather a “temporary result of an ongoing process” (Rieder and Röhle 74). When espousing this exploratory stance, the t-SNE graphs form insightful maps to further scrutinise the links between familiar terms and unfamiliar vernacular on /pol/.

2.3.3 Ontological associations with word trees

In making sense of language, “repeated elements tell us a great deal about texts”, but as described above “with context more nuances and revealing themes appear” (Wattenberg and Viégas 1). word2vec loses the original word contexts by abstracting from their case-by-case composition. This can be a shame as seemingly uninteresting words like and can form bridges to identify associations. For instance, counting the most common words after “russia and …” on /pol/ shows what subjects anons associate with the country. Using this insight, one can, in a very literal sense, operationalise how digital “objects” define trump, namely by analysing the ontological associations in the sentence “Trump is a …”. “Is a” is perhaps somewhat suggestive, as it invites harsh or affirmative claims, but such explicit associations form a relevant “view” on their own. To indicate and visualise this, word trees were used. Word trees are stylised suffix trees, counting the most frequent words appearing after a certain string, and subsequently visualising all the possible branches that appear in the input text (see fig. 12 for reference). I used the “Word Tree” Web application created by Jason Davies66 based on the work of Wattenberg and Viégas.

For each of the five central weeks in the cosine scatterplot, the posts from the trump-posts dataset were concatenated into a long string and non-alphabetic characters were removed.67 The concatenated string was inserted in the Web application and the word tree for “Trump is a …” was saved.

65 Perplexity represents a guess on how many close neighbours a single data point has. Wattenberg et al. provide a great visual explainer on the effect of this parameter. 66 See jasondavies.com/wordtree/
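The counting that underlies a word tree can be approximated in a few lines. The sketch below only tallies the first branch after a phrase and is not the Word Tree application itself, which builds and draws the full tree; the cleaning step mirrors the description above (keeping letters, spaces and periods), and the example posts are invented.

```python
import re
from collections import Counter


def clean(text):
    """Lowercase and drop everything except letters, spaces and periods."""
    return re.sub(r"[^a-z. ]", "", text.lower())


def continuations(text, phrase, depth=1):
    """Count the `depth` words that follow each occurrence of `phrase`."""
    target = phrase.lower().split()
    counts = Counter()
    for sentence in clean(text).split("."):  # periods mark sentence ends
        words = sentence.split()
        for i in range(len(words) - len(target) - depth + 1):
            if words[i:i + len(target)] == target:
                branch = words[i + len(target):i + len(target) + depth]
                counts[" ".join(branch)] += 1
    return counts


# Invented posts standing in for one week of posts mentioning trump.
posts = [
    "Trump is a meme. Everyone knows it.",
    "Honestly trump is a meme!",
    "Trump is a joke",
]
branches = continuations(" ".join(posts), "trump is a")
```

Raising depth to two or three would return longer branches, which comes closer to what the visual tree shows.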

Figure 12: A word tree with the most used words after “russia and …” in week 9-2014

This chapter described the approach, data and methods one can use to “navigate” around an imitated object like trump on /pol/ to explore its contextual meaning, which in turn could identify political sentiments. While also a prelude to the next chapter, I see this methodological exploration as a proposal to further theorise on how “Tardean” empirical methods can aid in extracting meaning. I stress this as exploratory not only because I consider the proposal as fairly experimental, but also because the computational methods proposed are no rigid statistical tools, an assumption that would risk inviting “mechanical objectivity”: “the impression that machine processing endows results with a higher epistemological status” (Rieder and Röhle 72). As discussed e.g. in relation to t-SNE, the way the above methods are employed is a fickle process resulting in highly variable outcomes. Still, when acknowledging the “second-degree objectivity” of the researcher and employing “as many subjective perspectives as possible” (Venturini, “What Is”), the above methods can, although always partially, identify revealing shifts, relations, and issues in textual data, and possibly, the political currents on /pol/. To put this to the test, I now turn to the case study.

67 Non-alphabetic words were removed because they often broke otherwise logical sentence structures. The period character formed an exception to this, since Word Tree uses it to understand the end of sentences. See the writeToText function in the following code: github.com/salhamander/polthesis/blob/master/substringFilter.py


3. Case Study: Digitally Navigating Trump as an Imitated Object

It’s been a hell of a ride, I laughed a year ago when I heard rumors that Trump was going to run, I was dead set on writing in Ron Paul or going libertarian, but here I am praying to an ancient egyptian frog to give it to the Golden Lion. (Anon #5)

This case study traces the temporal change of word associations around trump to test if this can indicate changes in political sentiment on /pol/. As such, it is not the goal to come to a definitive answer to how Trump is represented on /pol/, but rather to explore how an imitated object like trump can form a point of view into other “objects of imitation” that denote shared sentiments between subjects. Fig. 13 shows the steps taken in this digital navigation. First, the cosine scatterplot is used as a birds-eye overview (without considering it a distinct “macro”) to identify vocabulary change in trump-dense threads over time. The median weeks in the vocabulary clusters form points at which I “stop momentary” and “[move] on to the attributes” (Latour et al. 594) of trump within each of these weeks. Within five subsections in section 3.1, these attributes are explored using the top tf-idf terms (what animates discussions in Trump-dense threads?), the word2vec t-SNE maps (what is semantically similar to Trump?), and the word trees of “Trump is a …” (what is Trump?). To provide a rough context on why the words emerging from the findings were used, I queried the words in the Web environment of archive.4plebs.org in their respective timespans to read how anons referred to them. Finally, section 3.2 attempts to reflect and “infer certain regularities” (Marsden 1177) as well as irregularities from the findings, and relate these back to the theoretical framing.

Figure 13: The rough steps of the digital navigation in this case study.


3.1 Text mining per week: Stopping momentarily to test the waters

Figure 14: A scatterplot showing cosine distances between tf-idf weighted vocabularies of trump-dense threads per week. Compressed with MDS and coloured by five K-means clusters on the underlying tf-idf matrix (excluding terms that appeared in all weeks). Legend shows the top tf-idf terms within these clusters. ‘★’-symbol denotes the median week in the cluster.

Fig. 14 denotes the cosine similarity between weeks and the most relevant terms within five vocabulary clusters.68 To be read counterclockwise (MDS compresses a long “path” into a more condensed circular space), the weekly nodes roughly increase in chronological order, meaning that a gradual “evolution” in vocabulary change is distinguishable. While the clusters are rigidly separated by colours, in reality the change in discourse is gradual. This is illustrated by the fact that weeks in late 2017 are sometimes grouped in the green cluster, and sometimes in the yellow cluster. For instance, 2018-01 is grouped as a green node, and 2017-44 as a yellow one, despite the neighbouring weeks being in the other cluster. This implies that there is a subtle homogenisation in vocabulary during these timeframes (and the number of clusters could perhaps have been reduced to four).69 From this particular birds-eye perspective, some general trends can already be discerned.

Electoral discourse is popular in the trump-dense threads in the earlier two clusters (pink and blue). The Republican adversaries of Trump are especially apparent, namely ben carson, rubio, rato (“El Rato” is a nickname for Ted Cruz on /pol/) and kasich. Further, the top two words in these clusters, stump (referring to “Can’t Stump the Trump”) and maga (referring to “Make America Great Again”), show pro-Trump words are oft-used in the trump-dense threads. clinton is excluded because the word appeared in each week, but the orange cluster shows that she is implicitly present nonetheless: the top term is ctr, an abbreviation of Clinton’s super PAC Correct the Record, which received a 1-million-dollar investment to “engage in online messaging both for Secretary Clinton and to push back against attackers on ” (Foran). Despite these presumably anti-Hillary terms, terms that possibly denote scepticism or doubt towards Trump also emerge, in particular in the last two clusters; drumpf is used as a reminder of the less attractive root of Trump’s family name, while impeached and mueller show the FBI investigation into Russian collusion is well-represented in the trump-dense threads. Of course, I cannot make any judgement on how these terms are used yet (e.g. drumpf can also be used to mock “liberal speech”), but the map offers a suitable starting point.

68 Download a csv with all the top 100 most relevant terms per week via salhagen.nl/thesis/tfidf_top100_weeks.csv 69 In order to determine the median date, I had to choose a starting and ending week for the green and yellow clusters qualitatively, as there was no week that formed the “edge” of the cluster (like 2016-29 for the blue one). Ultimately, 2017-29 was chosen as a split because it was in the middle of six “misplaced” weeks.

Cluster  Week start  Week end   Date start         Date end            Date median
Pink     2015-26     2016-06    22 July 2015       14 February 2016    2 November 2015 (w 45)
Blue     2016-07     2016-29    15 February 2016   18 July 2016        2 May 2016 (w 18)
Orange   2016-30     2017-03    19 July 2016       22 January 2017     20 October 2016 (w 42)
Green    2017-04     ~2017-39   23 January 2017    25 September 2017   25 May 2017 (w 21)
Yellow   ~2017-40    2018-11    26 September 2017  18 March 2018       21 December 2017 (w 51)

Table 3: Time separation of K-means clusters on the tf-idf matrix

Table 3 contains the dates of the five clusters. In the sections below, I dive into the median weeks per cluster, starting with week 45 of 2015 (early November).


3.1.1 2015, week 45: “Trump is a meme-wizard”

Figure 15: The first tf-idf cosine cluster, weeks 2015-26 to 2016-06.

The first vocabulary cluster starts in the week of the announcement of Trump’s presidential bid (2015-26), which is somewhat removed from the other weeks, but quickly, the nodes start to cluster by their cosine similarity. Looking at the words that group these nodes together, a tip of the iceberg shows that stump, carson, rubio, iowa and ben are the most relevant terms overall, showing an unsurprising prominence of the Republican candidates.

What animates discussions in Trump-dense threads?

2015-45    tf-idf score
carson     0.56
stump      0.31
rubio      0.25
kasich     0.13
fiorina    0.13
randlet    0.11
bombs      0.10
ben        0.10
refugees   0.09
tonight    0.08
christmas  0.08

Table 4: The top tf-idf terms in trump-dense threads, week 45, 2015

The median week 2015-45 (2-8 Nov.) also reinforces the prominence of Republican candidates: carson, rubio, kasich and fiorina are four of the five top tf-idf terms in this week. This is rather unsurprising. In November 2015, the US Republican primaries started to kick into a higher gear, but the possible outcomes were still unsure. As such, arguing Trump’s chances vis-à-vis other candidates formed an obvious topic of debate. Early that month, Carson and Trump were leading the polls by a wide margin (Murray). This led to head-to-head comparisons and oppositions on /pol/, especially when Trump took to Twitter and accused the former surgeon of lying about a story from his youth (Gass and Shutt). By November 2015, many anons seemed to have adopted pro-Trump vocabulary: the second most significant tf-idf term is stump, referring to the catchphrase “Can’t Stump the Trump”. Though sounding like a slogan from the official Trump campaign, the phrase was allegedly born on /pol/ when the business mogul was about to announce his campaign in June 2015 (fig. 16; “Can’t Stump the Trump”). Afterwards, it was repeated as a humorous yet political indicator of support and posted almost 5,000 times on /pol/ in November 2015 (see the stump frequency graph in appendix IV).

Figure 16: The first occurrence of ‘Can’t Stump the Trump’ on 4chan

What is semantically similar to Trump?

Figure 17: t-SNE graph on a word2vec model (skip-gram, window size: 5) for all posts on /pol/ in November 2015. Zoomed in on trump. Only showing words appearing more than 342 times. Perplexity: 10.


At first glance, the word2vec similarities for November 2015 (fig. 17) show trump in a fairly evident discourse on politicians and elections. Some interesting anomalies are present on closer inspection. Firstly, supportive terms around trump contrast with the derogatory terms associated with his Democratic adversaries (notably clinton and obama). As both stump and stumped appear near trump, the graph confirms that “Can’t Stump the Trump” was used in similar contexts to trump. Contrastingly, for Clinton, a derogatory nickname implying covert financial motives, shillary, is apparently used in the same contexts as her real name. Additionally, anons brought up Clinton’s controversial role in the attack on the US consulate on 11 September 2012 in Benghazi, as denoted by benghazi and scandal. Further, the t-SNE graph suggests /pol/ anons discuss trump by using different words than his Republican opposition: Marco Rubio, Jeb Bush, Carly Fiorina, John Kasich, Mike Huckabee, and others are lumped together, while Trump was already an “outsider” since he is further away from this bubble. Apart from these oddities, the close semantic context of trump shows a fairly run-of-the-mill, maybe even boring electoral discourse.70

What is Trump?

Figure 18: The word tree for "Trump is a ..." of all posts mentioning trump in week 45, 2015.

The word tree from trump-posts in week 2015-45 (fig. 18) provides an interesting “point of view” on trump’s attributes that the above methods do not uncover. Firstly, the antagonistic and conspiratorial character of posts mentioning trump comes to the fore. Accusations include “retard”, “fraud” and “failure”. Additionally, Trump is referred to as an undercover “shill”, “Hillary plant” and “Hillary Clinton puppet”, framing him as a covert placement to heighten Clinton’s chances of winning the main elections. These references denote suspicious or outright conspiratorial discourse. However, the most prominent reference is most interesting: “Trump is a meme.” This suggests that, even though trump is well embedded in formal electoral discourse in the two cases above, there is also a sentiment that, all considered, he should not be taken all too seriously (see e.g. fig. 19). Other associations like “Trump is a joke” and “Trump is a fucking comic relief” further attest to this consideration. If the above methods show Trump was “serious business”71 on /pol/, the word tree shows that he was not.

70 However, this electoral discourse is not a reason to downplay political extremes and vulgarities, as indicated by the presence of e.g. obongo and heil hitler.

Taken together, the fetched water of the trump-river in this timeframe shows that Trump was associated with supportive vernacular, as indicated by stump. Issues surrounding him predominantly concerned his position vis-à-vis other Republican candidates, and the semantic contexts mostly show formalistic election discourse. At the same time, Trump was not always portrayed as a realistic candidate, since his comic relief was stressed in the characterisation of him as a “meme candidate”, embedding him in “lulzy” discourse (see e.g. fig. 20).

Figure 19: An anon who did not take Trump seriously. Derived and screencaptured from archive.4plebs.org

Figure 20: Trump merging with lulzy discourse. Derived and screencaptured from archive.4plebs.org

71 “The Internet is serious business” is an old saying sarcastically stating that one should take the online space seriously (“The Internet is Serious Business”).


3.1.2 2016, week 18: “A major zone of sigil magic”

Figure 21: The second tf-idf cosine cluster, weeks 2016-07 to 2016-29

During the first half of the second cluster, the race for the Republican nomination was still ongoing, but only Ted Cruz, John Kasich and Trump remained. The prominence of Cruz and Kasich is apparent in the overall cluster, as kasich and rato are among the top terms. delegates and convention denote further electoral discourse, while maga, the top term in the cluster, indicates ongoing pro-Trump sentiment on /pol/.

What animates discussions in Trump-dense threads?

Term (week 2016-18)    tf-idf score
maga                   0.23
kasich                 0.21
ted                    0.19
rato                   0.17
delegates              0.15
indiana                0.15
energy                 0.11
red                    0.10
cnn                    0.10
hat                    0.10
dying                  0.10

Table 5: The top tf-idf terms in trump-dense threads, week 18, 2016
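As a rough illustration of how a ranking like the one in table 5 can be produced, the sketch below scores terms with a textbook tf-idf formula. It assumes that all posts in the trump-dense threads of one week are concatenated into a single document, and that the weighting is relative term frequency multiplied by log(N/df). The toy corpus, the tokeniser and the function names are invented for illustration and need not match the setup used for this thesis.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def top_tfidf_terms(weekly_docs, week, n=3):
    """Return the n highest-scoring (term, score) pairs for one week.

    weekly_docs maps a week label to the concatenated text of that week's
    trump-dense threads (hypothetical input format).
    """
    counts = {w: Counter(tokenize(t)) for w, t in weekly_docs.items()}
    n_docs = len(counts)
    # Document frequency: in how many weeks does each term occur?
    df = Counter(term for c in counts.values() for term in c)
    total = sum(counts[week].values())
    scores = {
        term: (freq / total) * math.log(n_docs / df[term])
        for term, freq in counts[week].items()
    }
    # Sort by descending score; break ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Toy corpus, invented for illustration only.
weekly_docs = {
    "2015-45": "trump cant stump the trump stump stumped",
    "2016-18": "trump maga maga maga kasich delegates the indiana",
    "2016-42": "trump ctr rigged ctr the wikileaks",
}
top = top_tfidf_terms(weekly_docs, "2016-18")
print(top)
```

With this toy input, maga ranks first because it is frequent in one week and absent from the others, while trump and the score zero because they occur in every week. This is also why trump itself never shows up among the top terms of trump-dense threads.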

On the second day of the median week 2016-18 (2-8 May), Trump seized the candidature by winning the Indiana primary. Cruz and Kasich subsequently dropped out of the race, after which the Republican National Committee chairman Reince Priebus announced Donald Trump was the presumptive candidate (Collinson). If anons were “imitating” Trump as a mere “meme candidate” in November 2015, the Trump-fever had arguably transformed into more politically overt forms by May 2016. The highest scoring tf-idf term in week 18, like the overall cluster,72 is maga, for “Make America Great Again”. The term can of course be used insincerely, but it likely indicates Trump-support, as the sentence is explicitly indicative of certain “beliefs and desires”. From a Tardean perspective, the disappearance of stump and the emergence of maga is interesting, as the former originated as an “innovation” on /pol/, while the latter is an “adaptation” of campaign speech. The prominence of maga thus means anons were somewhat receptive to imitating speech by authoritative sources. Most other prominent tf-idf terms, kasich, ted, rato, delegates and indiana, show that most posts in trump-dense threads in this week were, unsurprisingly, animated by the Republican battle for the decisive primary in the Midwestern state. Even though Trump himself transcended meme candidacy by beating the electoral odds, the discourse in trump-dense threads became all the more playful, as illustrated by two tf-idf terms that open up vectors into further “shared knowledge reservoirs” (Rieder, “Refraction”): energy and magic. energy is the most relevant tf-idf term behind indiana, echoing Trump’s own use of the term, which he propelled into the public domain when calling Jeb Bush a “low energy person” (Parker).73 Its appearance as a top tf-idf term suggests /pol/ anons eagerly adopted the term. The thirteenth tf-idf term in this week (falling just outside table 5) is magic, referring to the concept of “meme magic”. The concept was allegedly born on the rival imageboard 8chan to denote a faux-ironic belief that Internet memes hold prophetic and agential powers (“Meme Magic”).
While several events conjured its common usage, it was not until Trump’s presidential campaign that the concept picked up steam: through multiple coincidental relationships, Pepe the Frog, the Egyptian deity Kek, and Donald Trump himself became symbols of a satirical religion: the Cult of Kek (“Cult of Kek”). Through these unlikely interrelations, the Cult forecasted Trump’s presidency, a roleplay Trump energised by tweeting an image of himself as Pepe in October 2015. This led to a (dis)belief amongst many anons whose sincerity is hard to interpret, as illustrated by the following post:

/pol/ has moved away from politics almost entirely and has become a major zone of sigil magic. Ebola Chan, Tay, God Emperor Trump, the power of Kek, ‘Pol is always right’, Stroke Chan... There’s just too much for this to be coincidence. (Anon #6)

While the anon frames these mystical performances as moving “away from politics”, they should perhaps be regarded as a different mode of political activity, since “regardless of their emotional keying, political memes are about making a point – participating in a normative debate about how the world should look and the best way to get there” (Shifman 120) - even if it takes the form of the Cult of Kek. In any case, the tf-idf terms from this week show that trump became enmeshed with lulzy, satirical terms, suggesting a blending of more traditional political discourse (e.g. delegates) and humorous pseudo-politics.

72 The fact that the tf-idf terms in this week are fairly similar to the cluster’s overall top terms suggests that picking the median date of the cluster is a decent method to determine a ‘representative’ week within the larger cluster.

73 ‘Energy’ also references another meme, but one that was rarely used during this timeframe. Still, it likely is in the ‘shared reservoir’ of knowledge of anons, since this meme is fairly popular. In the Japanese anime Dragon Ball Z, protagonist Goku requires the ‘spirit energy’ of people around the globe to load up an ‘energy bomb’ to defeat his adversaries. In combination with the text-face “つ ◕_◕ ༽つ” and the text “take my energy”, it is often used as a sign of support – at minimal transaction cost.

What is semantically similar to Trump?

Figure 22: t-SNE graph on a word2vec model (skip-gram, window size: 5) for all posts on 4chan/pol/ in May 2016. Zoomed in on trump. Only showing words appearing more than 351 times. Perplexity: 10.

The t-SNE graph of May 2016 shows both continuations and shifts in semantic associations compared to the previous time slice. Continuing from November 2015, Trump remains semantically connected to relatively mundane electoral discourse (light green bubble). The Democratic opposition (clinton, sanders) also maintains its close proximity to trump, but the Republicans are not present anymore, perhaps unsurprisingly considering the decisive Indiana primary was at the start of the month. Two notable changes occur. Firstly, instead of only showing shillary, this t-SNE graph attests to a “vernacularisation” of the 2016 Presidential Election, since a new bubble (brown) can be drawn around derogatory nicknames for the main contenders and their followers: drumpf, hilldawg, goofus hillary, hernie, berniefags, and so on. This vernacular innovation indicates the increase of shared knowledge, captured in single terms. For instance, the simple word bernout refers both to the stereotype of Sanders-supporters as “burnouts” asking for social safety nets, and to Sanders’s eventual drop-out from the campaign. Like bernout, most other nicknames are explicitly negative, derogatory or antagonistic, but are present for every candidate. Indeed, the appearance of drumpf and trumpfags is a reminder that most things, Trump included, are actively ridiculed on /pol/. Secondly, a bubble (purple) emerges with terms related to religion. It is important to remember that with t-SNE, “distances between clusters might not mean anything” (Wattenberg et al., emphasis mine), but considering the prominence of the Cult of Kek and meme magic above, it is likely that theological and electoral discourse appeared in similar word contexts. The clearest “proof” of this is the presence of god-emperor in the bubble “religious discourse”, denoting a nickname for Trump portraying him as a divine conqueror.
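Since distances in a t-SNE projection are not always meaningful, one way to cross-check a reading of the graph is to query nearest neighbours directly in the original embedding space with cosine similarity. The sketch below shows the idea with plain Python. The three-dimensional vectors are invented for illustration (a real word2vec model would typically have a few hundred dimensions), and the helper names are hypothetical rather than taken from any library or from the pipeline used here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, vectors, n=2):
    """Return the n words whose vectors are most similar to `word`."""
    target = vectors[word]
    sims = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(sims, key=lambda kv: -kv[1])[:n]


# Invented toy vectors; not values from the actual model.
vectors = {
    "trump": [0.9, 0.1, 0.3],
    "god-emperor": [0.8, 0.2, 0.4],
    "clinton": [0.7, -0.2, 0.1],
    "delegates": [0.1, 0.9, 0.0],
}
neighbours = nearest("trump", vectors)
print(neighbours)
```

If a term such as god-emperor also ranks among the closest neighbours of trump in the full embedding space, its proximity in the two-dimensional graph is less likely to be an artefact of the projection.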

What is Trump?

Figure 23: The word tree for "Trump is a ..." of all posts mentioning trump in week 18, 2016

Again, the word tree provides a different picture, showing Trump’s status was very much contested. It shows that the most frequent association is “Trump is a racist”, likely invigorated by Trump’s comments on immigration during this timeframe, and the subsequent critiques of his remarks. Considering /pol/’s reputation, “Trump is a racist” seems a counterintuitive statement, as one might see accusations of racism by /pol/ anons as hypocritical. Unfortunately, on closer inspection, this hunch is not unwarranted; when reading the particular posts in the word tree, most posts actually react to and dispute the claim that Trump is racist. Fig. 24 lists the specific posts in the word tree, including:

• “I don’t think Trump is a racist”
• “they have been told Trump is a racist”
• “I am a racist but Trump is not”
• “Where is the proof that Trump is a racist”

While some posts in fact do accuse Trump of racial discrimination (“I just find it humorous people like Trump supporters deny Trump is a racist”) the majority of associations challenge the statement. Therefore, the word tree indicates the clause “Trump is a racist” is not a “contagious” or wholly “adopted” (Katz 264) imitation, but rather, a point of discussion and contestation. Indeed, it is a clear example that such “beliefs” are not merely diffused, but refracted (Rieder, “Refraction”).
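The difference between a regular and a reversed word tree can be made concrete with a small sketch. It counts which word directly follows a phrase (the first branch of a regular tree) and which word directly precedes it (the first branch of a reversed tree). The example posts are the ones quoted above; the tokeniser and the function name are assumptions made for illustration and do not describe the tool used to draw the figures.

```python
import re
from collections import Counter


def tokenize(post):
    """Lowercase and keep letters, apostrophes, slashes and hyphens."""
    return re.findall(r"[a-z'/-]+", post.lower())


def branches(posts, phrase, reverse=False):
    """Count the word right after the phrase, or right before it if reverse."""
    target = phrase.lower().split()
    k = len(target)
    counts = Counter()
    for post in posts:
        tokens = tokenize(post)
        for i in range(len(tokens) - k + 1):
            if tokens[i:i + k] != target:
                continue
            j = i - 1 if reverse else i + k
            if 0 <= j < len(tokens):
                counts[tokens[j]] += 1
    return counts


posts = [
    "I don't think Trump is a racist",
    "they have been told Trump is a racist",
    "Where is the proof that Trump is a racist",
    "I just find it humorous people like Trump supporters deny Trump is a racist",
]
following = branches(posts, "Trump is a")
preceding = branches(posts, "Trump is a racist", reverse=True)
print(following)
print(preceding)
```

Read on its own, the regular tree suggests four accusations of racism. The reversed branches (think, told, that, deny) show that the clause is mostly embedded in denial, doubt or reported speech, which is why the individual posts need to be read before interpreting the aggregate.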

The various “parts of a whole” in this timeframe indicate the popularity of Trump on /pol/, particularly through maga, and the influx of fantastical, pseudo-religious yet political sentiment, e.g. posing him as god-emperor. While some anons pointed to Trump’s racism, and drumpf and trumpfags attest that /pol/ never houses a homogenous public, the dominant discourse in May 2016 nonetheless paints a relatively pro-Trump picture.


Figure 24: The regular and reversed word trees for “Trump is a racist” of all posts mentioning trump in week 18, 2016.


3.1.3 2016, week 42: In the trenches of the Great Meme War

Figure 25: The third tf-idf cosine cluster, weeks 2016-30 to 2017-03.

The third cluster represents the middle of the vortex of Trump’s presence on /pol/. The median node, 2016-42 (17-23 October), fell three weeks prior to the week in which U.S. citizens could go to the all-decisive final ballot. Interestingly, the cosine scatterplot shows a gap between the nodes before and after the election (in 2016-45), suggesting the method is fairly effective in separating vocabulary changes (fig. 25). In October, trump was mentioned in a staggering 300,000 posts, just over 7% of all comments (fig. 6), even surpassing the frequency of the word the.74 Wrapped in a roleplaying-cloak under the banner of “The Great Meme War”, /pol/ anons participated in the campaign by waging “meme warfare” (Haddow). In September, Clinton’s campaign reacted to these shenanigans by grouping Trump supporters as a “basket of deplorables” and posting explainers on the “alt-right” and Pepe the Frog on her website (Kozlowska). This denouncement proved counterproductive, since it mostly angered or amused her adversaries (Phillips, “Oxygen”; Roberston), leading one anon to state the common sentiment on /pol/: “did we deplorables just […] win hands down?” (Anon #7). As Phillips formulates, Clinton’s comments “catapult[ed] the group, to the extent that it could be called a cohesive group, onto the national stage” (Phillips, “Oxygen” 4).

74 These numbers are derived from the number of times trump and the appear in the vocabulary of the word2vec model of October 2016. trump appeared over 400,000 times, while the is stuck at 369,458.


What animates discussions in Trump-dense threads?

Term (week 2016-42)    tf-idf score
ctr                    0.41
rigged                 0.25
wikileaks              0.17
keef                   0.15
russia                 0.14
red                    0.13
fraud                  0.12
cnn                    0.11
maga                   0.11
okeefe                 0.10
dying                  0.09

Table 6: The top tf-idf terms in trump-dense threads, week 42, 2016

The collective opposition to Clinton is apparent in the top two tf-idf terms of week 42, 2016 (17-23 Oct.). Firstly, ctr refers to a supposed “information war” between /pol/ and Correct the Record, the pro-Clinton PAC mentioned above. The group not only formed a common enemy to pro-Trump /pol/ frequenters, but also led to suspicion of CTR-members invading the board, further increasing mistrust amongst anons. One can trace the lines of a “defence” against an “invasion” to the fact that Internet trolls, and I would add most netizens of /pol/, consider the Internet as their “personal playground and birth right” where “no one, not lawmakers, not the media, and certainly not other Internet users, should be able to dictate their behaviour” (Phillips, This Is Why 129).75 Conversely, the same actors would feel threatened when this “homestead” is “invaded” (129). In the case of the belief that CTR’s members “astroturfed”76 on /pol/, many defensive reactions indeed ensued, taking the form of suspicions and allegations (see e.g. fig. 26). The second word further attests to this suspicious mode: when reading the /pol/ comments containing rigged in this week, the word commonly refers to accusations that polls and voting machines were manipulated to secure a Clinton victory. This doubt was amplified when, on 20 October, Trump refused to state he would unequivocally accept the results of the elections (Healy and Martin). Other relevant words, keef, veritas and fraud, refer to conservative journalist James O’Keefe’s publication of a series of videos attempting to expose voter fraud amongst Democrats (though these have been critiqued for a lack of context; Detrow). The third-most relevant word, wikileaks, shows /pol/ anons’ commitment to discussing and digging up dirt about Clinton using the database of John Podesta’s leaked e-mails. Taken together, the tf-idf terms show that not trump himself, but rather topics related to the tactical information warfare against the Clinton campaign were at the forefront of trump-related discussion.

75 As Phillips (This Is Why 129) illustrates, this sentiment echoes (but takes to the “most grotesque extreme”) early Web libertarianism, where the online domain was considered a space free from offline bureaucracies (as declared by e.g. Barlow).

76 Astroturfing refers to “the attempt to create an impression of widespread grassroots support for a policy, individual, or product, where little such support exists. Multiple online identities and fake pressure groups are used to mislead the public into believing that the position of the astroturfer is the commonly held view” (Bienkov).

Figure 26: A /pol/ anon suspecting CTR-members of creating threads on /pol/ to obscure information from the Podesta leaks. Derived and screencaptured from archive.4plebs.org.

What is semantically similar to Trump?

Figure 27: t-SNE graph on a word2vec model (skip-gram, window size: 5) for all posts on /pol/ in October 2016. Zoomed in on trump. Only showing words appearing more than 359 times. Perplexity: 10.

The t-SNE picture of October 2016 somewhat solidifies the above interpretation of the tf-idf terms. ctr and shilling are grouped and embedded in a cluster on “4chan meta” discourse, further suggesting anons perceived the PAC’s efforts as a “war” on home turf. With the tensions of the elections rising, the purple bubble (“Clinton-smearing”) attests to the information warfare against Clinton, as it brings up multiple conspiracies and smear campaigns directed at both Hillary and Bill. With affair, monica and lewinsky appearing, Bill Clinton’s bygones are not kept bygones, while the inclusion of seth refers to far-right conspiracy theories falsely claiming that the murder of a DNC staffer, Seth Rich, was retribution for him leaking Clinton’s e-mails to WikiLeaks (Mole). The t-SNE graph further shows a “convergence” between hillary and trump, meaning semantic similarity between the two was extremely high that October. Additionally, the derogatory nicknames for Clinton expanded, with hitlery, killary and hilldawg making their entry. Interestingly, despite this antagonism towards Hillary, another bubble labelled as “anti-Trump vernacular” also attests to anti-Trump sentiments or ridicule, containing drumpfkin, drumpftards and trumpcucks, apparently used in contexts where anons are accused of grasping at straws in defending a Trump win. Another bubble (orange) shows words related to “racism, sexism and deplorables”. This cluster is present because it contains the exact words from Clinton’s “basket of deplorables”-speech, calling the “deplorables” “racist, sexist, homophobic, xenophobic, Islamophobic -- you name it” (Reilly). The presence of this cluster implies Clinton’s speech indeed energised the eager adoption of this characterisation, at least on /pol/.

What is Trump?

Figure 28: The word tree for "Trump is a ..." of all posts mentioning trump in week 42, 2016.


The word tree for 2016-42 does not drastically differ from the prior one in May 2016. “Fucking” overtook “racist” as the most frequent reference, and “Trump is a meme” disappears entirely. “Trump is a good…” does appear, but overall, the word tree shows the framing of “Trump is a …” as largely antagonistic. A more particular view of “Trump is a fucking …” (fig. 29) indicates further opposition, with “retard”, “idiot” and “moron” as the most common words after this phrase. In the previous cluster, “Trump is a racist” was mostly brought up in order to dispute the association, but the “particular” posts in fig. 29 show the insults towards him are more openly oppositional in this week.

Figure 29: The word tree for "Trump is a fucking ..." of all posts mentioning trump in week 42, 2016.

Figure 30: A post on /pol/ opposing Trump, stating “trumpkins” have to accept his electoral defeat. Derived and screencaptured from archive.4plebs.org.

In sum, political activity related to trump in October 2016 can perhaps best be captured with the word antagonistic. This is firstly because, instead of relating to Donald Trump himself, trump provides “points of view” into information warfare against Clinton, as shown by ctr, rigged, wikileaks, lewinsky and seth. Secondly, even the Trump-supportive current is faced with antagonisms, as shown by the anti-Trump vernacular in the t-SNE graph and the negative portrayals of Trump in the word trees. These are not the oblique and semi-humorous antagonisms known from trolling campaigns (Phillips and Milner), but rather more direct oppositions, and sometimes even tactical political endeavours.

3.1.4 2017, week 21: “Identity first, lulz second”

Figure 31: The fourth cluster, weeks 2017-04 to ~2017-39

The fourth cluster, roughly denoting the first half of 2017, entails Trump’s first semester as President. Hypothetically, the transformation from candidate to president would considerably change the imitations of trump on /pol/. Comparing the limited perspectives offered here, this indeed seems true, though with some notable continuities. bannon is the cluster’s overall most relevant tf-idf term, presumably caused by the many episodes in the tumultuous relationship between Trump and the Breitbart manager, culminating in his departure from the White House staff in August 2017 (week 33). drumpf makes an appearance, now as the second-most relevant term. As the Mueller probe started in May 2017, impeached became a common term. shareblue refers to the online political news source Shareblue.com, run by progressive activist David Brock, who, not coincidentally, also founded Correct the Record. Since Shareblue’s “bread-and-butter content is exposing what it considers to be news coverage stacked against” Hillary Clinton (Horowitz), it fills the spot of Correct the Record, which was defunct by 2017. The continuation can be seen as a testament to the paranoid gaze of many /pol/ anons.


What animates discussions in Trump-dense threads?

Term (week 2017-21)    tf-idf score
saudi                  0.25
macron                 0.22
russia                 0.18
shareblue              0.15
russian                0.14
comey                  0.12
arabia                 0.12
islamic                0.11
kushner                0.11
drumpf                 0.11
dying                  0.11

Table 7: The top tf-idf terms in trump-dense threads, week 21, 2017

The top term in week 21 (22-28 May) is saudi. On May 20th, Trump signed a 110 billion-dollar arms deal with Saudi Arabia while on a visit to the Gulf state. In the week afterwards, posts on /pol/ mentioning the country are frequent, yet with varied sentiments. Some anons defended the deal, some lamented a breach of Trump’s promised protectionism, but most posters suspected behind-the-scenes ties between Saudi Arabia and Israel. Present in these discussions on /pol/ is rampant anti-Semitism, as will be touched upon later (perhaps also explaining kushner’s appearance). Looking at other terms, macron naturally refers to the newly elected French president, who assumed office in May 2017. russia and russian are commonly used in trump-dense threads this week, likely because of the country’s involvement in the 2016 Election, catalysed by the Mueller investigation and the testimony of James comey.


What is semantically similar to Trump?

Figure 32: t-SNE graph on a word2vec model (skip-gram, window size: 5) for all posts on /pol/ in May 2017. Zoomed in on trump. Only showing words appearing more than 348 times. Perplexity: 10.

The t-SNE graph for May 2017 foremost illustrates an increase of political actors in close proximity to trump. Names were already included in the previous months, but now, new bubbles can be drawn around varied US politicians and commentators (pink), former presidents (green), European politicians (blue), “alt-right/lite” actors (brown) and miscellaneous names (grey). The increase in names of political actors is perhaps due to a diversification of political topics discussed on /pol/, since the US presidential elections no longer occupied centre stage. The end of the U.S. elections did not mean a halt for electoral discourse, though, but rather a shift of focus: the light green bubble now mentions the US midterms and is close to varied parties and politicians from European countries (wilders, adf, corbyn). clinton and obama remain closely connected to trump, and the derogatory nicknames remain present, though the widespread “vernacularisation” of clinton, trump, and their supporters decreases in prevalence (potentially moving to another part of the map). Further, similar to the tf-idf terms, discourse on impeachment appears, with the yellow bubble containing terms related to the FBI’s investigations (fbi, mueller, manafort). Interestingly, the loose moniker of “alt-right” is also used by /pol/-anons, as the t-SNE graph groups the word alt-right with e-celebs like Stefan molyneux and milo Yiannopoulos, but also jordan peterson (brown bubble). Further, the older generation of white nationalist “academic racists” (Neiwert 221) like kevin macdonald and jared taylor are not grouped in the same bubble, suggesting a different discursive context. Online presence indeed seems to be the dividing line, as the YouTuber pewdiepie also appears in the brown bubble, even though he holds no far-right beliefs (though he was discredited for joking about anti-Semitism and making racial remarks). As such, the alt-right bubble forms an insight into how this fractured collective is imagined on /pol/.

Whether /pol/ anons also share “beliefs” with the actors in this bubble (/pol/ does appear close to alt-right) can be further explored through the light blue bubble, containing the meme /ourguy/. This word is used in the context of debating which actor is most representative of the overall culture and beliefs of /pol/ anons (“is this /ourguy?/”). Naturally, agreement on who best fits the /ourguy/ label is never reached, despite the dominance of far-right sentiment. There is never a distinct “whole” in social collectives, but /ourguy/ forms a very interesting traceable point of view into how anons portray their internalised conceptualisation of the whole, an ontologically subjective imagination of an “us”.77

What is Trump?

Figure 33: The word tree for "Trump is a ..." of all posts mentioning trump in week 21, 2017.

77 In the t-SNE graph, the female equivalent to /ourguy/ is used in close correspondence with brittany venti, an online “livestreamer” famous for the many raids on her channel. A potential reason for the /ourgirl/ label is that she also “trolled” the HEWILLNOTDIVIDE.US (LaBeouf et al.) livestream in January 2017.


The May 2017 word tree of “Trump is a …” shows a continuation of the antagonism from the previous time slice, but through a shocking anti-Semitic radicalisation of the words used. The top spot is now taken over by “kike”, an ethnic slur for a Jewish person. “Jew”, “globalist”, “Zionist” and “jewish puppet” all attest to the same hateful speech. Of course, as the previous “points of view” stress, Trump was never unequivocally praised on /pol/, but this word tree indicates a particularly hateful vilification of the US President.

Taken together, the week in this cluster mostly stands out for its anti-Semitic connections. Most likely, Trump’s visit to Saudi Arabia and Israel added fuel to the anti-Semitic sentiments on /pol/. However, it is hard to tell whether this visit kindled a new anti-Semitic flame, or whether it is indicative of a “fascistic current” that had always been present on /pol/ (Biddle). In any case, the rampant anti-Semitism, as well as the high presence of far-right actors in the t-SNE graph (e.g. jared taylor), imply that, despite the influx of posts during the US election, /pol/ maintained or even strengthened its reputation as a “safe space for self-selecting […] racists whose bigotries were an identity first, source of lulz second” (Phillips, “Oxygen” 23).

3.1.5 2017, week 51: “Rationalizations that support emotionally driven conclusions”

Figure 34: The fifth cluster, weeks ~2017-40 to 2018-11

The final cluster comprises the second half of 2017 and the first weeks of 2018 (up to 18 March). It shows some overlap with the previous cluster in terms of top tf-idf results: drumpf and impeached remain two of the most relevant terms in trump-dense threads. The FBI investigations seep into the posts even further, indicated by comey and a first position for mueller. daca is the second-most relevant tf-idf term, referring to the American policy allowing undocumented immigrants who arrived as children (also known as “dreamers”) to remain in the U.S. The plans by the Trump administration to repeal the policy possibly connect to the dominant presence of anti-immigration discourse on /pol/.

What animates discussions in Trump-dense threads?

Term (week 2017-51)    tf-idf score
mueller                0.36
rubashkin              0.23
bannon                 0.13
russia                 0.13
dying                  0.11
sholom                 0.11
russian                0.11
jerusalem              0.10
chess                  0.10
drumpf                 0.09
shareblue              0.08

Table 8: The top tf-idf terms in trump-dense threads, week 51, 2017

When scrutinising the weekly tf-idf terms in week 51 of 2017 (18-24 Dec.), just like the overall cluster, most anon-activity seems animated by Robert S. Mueller. On 17 December, Trump denied speculation that he was considering firing Mueller to end the investigations into Russian collusion (Rucker et al.). On /pol/, many anons criticised and mocked the mainstream public for propagating the speculation merely out of a wish to impeach Trump for impeding the judicial process. For instance, one anon mocks another poster for “still peddling the ‘Trump will fire Mueller then we’ll impeach him!’ fairy tale” (Anon #8), while another anon claims “Shills are DESPERATE for Trump to fire Mueller” (Anon #9). Next to this anti-liberalism (for lack of a better word), the posts containing mueller in this week bring up yet another conspiracy theory, “The Storm”,78 alluding to “world-building activities” taking form in particularly imaginative “bullshit” (Tuters et al.). As one critical anon comments on fellow posters: “Their rational mind is only used to come up with rationalizations that support emotionally driven conclusions they’ve arrived at” (Anon #10).

sholom rubashkin further attests to the anti-Semitic gaze on /pol/. The name refers to a Jewish businessman convicted in 2009 for money laundering schemes. On 20 December 2017, Trump commuted the sentence of Rubashkin, granting him a supervised release (Hawkins). The reactions of many /pol/-frequenters are easy to predict: filled with disbelief and anti-Semitic sentiments. For instance, one anon asked: “are we going to talk about this, or just ignore it?” (Anon #11), and another: “what the hell is going on?” (Anon #12). Indeed, supporting Trump and being anti-Semitic became an increasingly untenable position - posing a problem for the vilest of anons.

78 Mueller is mentioned in relation to “The Storm”, a conspiracy theory posing that the FBI investigations are, in secret, a cover-up for suing and locking up the liberal establishment, including Clinton and Obama - with Mueller having covertly sided with Trump all along (Martineu). The conspiracy theory is brought up numerous times in comments mentioning mueller this week, though not wholly accepted, as many anons ridicule the story.

What is semantically similar to Trump?

Figure 35: t-SNE graph on a word2vec model (skip-gram, window size: 5) for all posts on /pol/ in December 2017. Zoomed in on trump. Only showing words appearing more than 360 times. Perplexity: 10.

The t-SNE graph of the full sample of December 2017 does not show too many changes compared to the previous time slice. Again, obama and clinton are in close semantic proximity to trump, even a year after the election. Again, US politicians and election discourse are clustered around the “trump” vector representation, though the European names disappear (presumably because no prominent elections were ongoing). Again, a group with alt-right and alt-lite associated actors appears (brown bubble), continuing, though now also related to bandwagon. Interestingly, or perhaps worryingly, the “academic racists” like jared taylor now appear within this alt-right cluster, right next to the /ourguy/ meme. The grey bubble contains a large group of fairly common names without a clear overarching connection; these are presumably versatile names like “David” and “Michael” that are used in similar “name” contexts.


Figure 36: An example of the OP in a /ptg/ President Trump General thread

The purple bubble seemed a mistake at first sight, since some URLs slipped through the filtering and the terms appear random. However, upon inspection, it is caused by so-called “General” threads, specifically “/ptg/ President Trump General”. “Generals” are threads with recurring topics of discussion, as indicated by the OP. For instance, returning Trump-supportive threads on /pol/ are indicated with a subject “/ptg/ President Trump General”. Multiple users repost these opening posts (OPs), so the thread stays “alive” through multiple iterations, and then comment (i.e. bump) to increase its longevity. The OP of a General-thread often includes links to introduce the topics, such as YouTube clips, books or current events. The words in the purple bubble denote the copy-pasted text of this /ptg/ OP. The “trump general” frequency chart in appendix IV shows the /ptg/ and /tg/ (“Trump General”, from before the elections) threads have stayed constant, occurring 1,000 times a month. “Generals” can be seen as units of self-sorting publics on the imageboard and form a reminder that platform-specific practices should not be discounted when using computational methods.


What is Trump?

Figure 37: The word tree for "Trump is a ..." of all posts mentioning trump in week 51, 2017.

In the word tree of 2017-51, the associations of Trump as a “jew” are as vile as, if not viler than, those in the word tree of May 2017, suggesting the earlier week was not an isolated incident and that a “pattern” in imitations arises. Paradoxically, and telling of the cacophony that is /pol/, “nazi” also joins the ranks. “Good” and “great” are the only associations that are not hate speech, though the most used word after “good” is “goy”, again implying prevalent anti-Semitic discourse. Indeed, the fairly innocent utterances of “Trump is a meme” these are not.

The word relations surrounding trump in the final cluster did not deviate too much from the previous timeframe. This continuity highlights that the (anti-Semitic) extremities were no one-off accidents, but rather a continuing presence on /pol/. This can be interpreted as a stabilisation of political ideology on the board, which contrasts with the tumultuous vortex during the election phase. To that end, I turn to the next section, which provides some overarching reflection to “infer certain regularities” (Marsden 1177) as well as irregularities from the results, and to tie these back to the theoretical and methodological setup.

3.2 Informed reflection: Lulzy crowds, extremist publics?

Figure 38: The word trees for “Trump is a …” from week 45, 2015, and week 51, 2017.

When juxtaposing the first and last weeks (fig. 38), the word trees suggest a political radicalisation towards anti-Semitic sentiment. Does this only indicate that trump is increasingly associated with anti-Semitism, or does it also allude to a broader trend of radicalisation on /pol/? This is challenging to answer without user demographics; it could be caused by a radicalisation of the beliefs of the board’s users (as Reiman argues), but also by Trump’s actions, like commuting the sentence of Rubashkin, merely drawing the attention of anti-Semites already present on, or migrating to, /pol/. These difficulties are not merely relevant to this study, but also form a general problem for research into changing (political) sentiments in anonymous online spaces. The above case study might not offer clear answers to this, but in the research process, I found particular objects demarcating congregative and transformative dynamics, and “aggregate” views that attest to these. I argue these form a reason to believe that the 2016 election stimulated the formation of “crowd-like” collectives that obscured the ongoing presence of “extremist publics”. While the notion of publics is useful when interpreting the findings from the case study (e.g. with anons creating General threads), the changes in post volumes and political sentiments surrounding trump suggest even less coherence and consistency than what is captured with the already flexible concept of “publics”. For this reason, the notion of the “crowd” might be useful. To briefly introduce this concept, I would like to return to Tarde:

A crowd is a strange phenomenon. It is a gathering of heterogeneous elements, unknown to one another but as soon as a spark of passion, having flashed out from one of these elements, electrifies this confused mass, there takes place a sort of sudden organization, a spontaneous generation. […] The majority […] had assembled out of pure curiosity, but the fever of some of them soon reached the minds of all, and in all of them there arose a delirium. (Tarde, Penal Philosophy 323)

Tarde adds that “in an excited crowd, imitation is absolutely unconscious and blind and contrary to the habitual character of the person who is subjected to it” (Tarde, Penal Philosophy 302). The crowd is a somewhat problematic concept, since “sudden organization” stresses coordination while describing its members as “unconscious and blind” downplays their agency (e.g. lamented by Katz 266), and since it was understood as a physical gathering (Wiedemann 314). Still, it can be helpful here because it can aid in understanding sudden credulous collectivity; crowds are “disorganized collectives” that are “more unstable, more forgetful”, and their “constituent members” are “more credulous and gullible” -- i.e. prone to “contagion” (Katz 266) -- in their imitations (Brighenti 302).79 Therewith, crowds contrast with publics, the latter of which are collectivised by a more consistent and conscious sharing of “ideas”. The notion of the crowd has been reworked after Tarde,80 but here it suffices as an indication for a “spontaneous gathering” of credulous members kindled by a certain event, rather than a collective based on relatively consistent and conscious ideas or passions.

Why do the “attributes” of trump, as traced above, necessitate this conceptual expansion? Although clear answers are obscured by 4chan’s anonymity, contours of two “crowd-formation” dynamics can be discerned, which I refer to here as transformative and congregative. Firstly, it is likely that anons were influenced by the transformative capacities of the trump-bandwagon during the election cycle, if only by the sheer volume of trump-imitations (recall that at its peak trump was used more than the). Despite it being unclear how many anons were “converted”, the presence of this transformative force is visible when taking one particular: the opening quote of this chapter. In this post, an anon admits he “laughed a year ago when [hearing] rumors that Trump was going to run” but instead ended up desiring the win would go “to the Golden Lion” (Anon #13). The post implies his participation on /pol/ made him more credulous, adopting political sentiments he did not foresee, as indicated by his self-reflection: “here I am, praying to an ancient egyptian frog”. If Coleman’s characterisation of Anonymous as an “emotional” movement (Hacker 396) can be extended to the band of trump-anons, such political transformations are perhaps unsurprising given the capricious state of “emotionally driven” (Anon #10) reasoning so present on /pol/. It suggests enough repetitions can create a domino-effect of “contagious” imitations, potentially “[electrifying a] confused mass” (Tarde, Penal Philosophy 323).

79 This seems to conflict with Tarde’s non-structuralism, but a crowd is still no more than the individuals it is comprised of - the mode of imitation between these individuals just changes to a more contagious form when in close proximity.
80 The concept of the crowd has e.g. been taken up by Le Bon and rethought as “multiplicity” by Deleuze.

Figure 39: Replies by outside-anons “reporting in” the sticky thread announcing Ted Cruz's drop out, 4 May 2016. Derived and screencaptured from archive.4plebs.org

Secondly, the trump-infected crowd dynamics can be seen as congregative to emphasise that, in all likelihood, a significant influx of new members took to /pol/ during the election cycle. Another “particular” illustrates this dynamic: in the second cluster of the case study, a thread appeared on 3 May 2016 announcing “Cruz drops out”. Within this thread, anons indicated they arrived from other boards on 4chan to join the Trump-infused activity on /pol/. For instance, fig. 39 shows an anon replying “/v/ REPORTING IN”, to which sixty other users reacted by also affirming themselves as representatives from other boards (e.g. “/mu/ here. Ready to MAGA”). This implies that Trump’s surprising nomination, as a result of Cruz dropping out, stimulated a migration of users who jumped into the vortex of trump-related discussions on /pol/ - although the volume of this migration is impossible to discern. The homogenisation between different users from different boards aligns with Tarde’s concept of the crowd as it indicates a “gathering of heterogeneous elements” (Tarde, Penal Philosophy 323). If Tarde’s crowd is strictly a physical gathering, anons “reporting in” to /pol/ in the face of a specific event might form a close digital equivalent. This congregational dynamic is further signified by a common claim on /pol/ that the election brought an “invasion” of users, specifically from the pro-Trump subreddit r/The_Donald; anons commonly lament that “4chan is Reddit now”, speaking of “r/The_Donald refugees” and commanding explicitly partisan users to “go back to r/The_Donald”.81 These “beliefs and desires” for /pol/ to stay “pure” are captured in one particular object of imitation encountered through the week-by-week exploration: an anon-made “timeline of /pol/”, interpreting the increase in post activity in 2016 as a period where “Trump shit goes into overdrive, meme shit floods /pol/, /pol/ is now reddit” (fig. 40). Whether these claims hold any truth is difficult to establish, since no one knows exactly how many “new recruits were attracted to trolling spaces [like 4chan], how many existing users quietly stepped away out of concern, or how many stayed put and were subsequently radicalized throughout the election cycle” (Phillips, “Oxygen” 24). Still, the image forms an interesting indication of how a perception of the “whole” is internalised and expressed through the imitations of a particular anon.

81 On /pol/, the phrase “back to reddit” is used in 19,069 posts, and “back to the_donald” in 1,940, as of 29 June 2018.

Figure 40: An image circulated on /pol/ lamenting that "/pol/ is now reddit" by annotating 4plebs statistics. (Recursively) derived and screencaptured from archive.4plebs.org.

It should be stressed that a factual difference between “publics” and “crowds” cannot be established by clear borders. Rather, the concepts point to different areas on the same gradient, as participants in crowds can be drawn by specific ideas and gradually become engrossed by more coherent “public-like” ideas, or vice versa. For instance, the /ptg/ threads illustrate that active, public-like engagement exists amongst more random, contingent crowd-like currents on /pol/. Further, even “bursts” of crowd-like organisation and innovations should not be downplayed as mere contingencies. Since, as described in the first chapter, 4chan affords a particularly innovative mode of participation, even the most fleeting of collectives on /pol/ cannot be wholly captured as “spontaneous” crowds once they tap from (and contribute to) the subcultural webs of meaning. /pol/’s arcane “reservoir of shared ideas” is not conjured out of nothingness but is rather the work of refraction, “the result of labor on different levels and a product rather than an effect pervers, an unintended consequence” (Rieder, “Refraction”).

Figure 41: tf-idf matrix of trump-dense threads, columns denoting weeks and rows denoting the top hundred most relevant terms per week. Download the full tf-idf matrix with all terms from salhagen.nl/thesis/tfidf_trumpdensethreads.csv

Despite these nuances, it is useful to speak of crowd-like “coming together” and “dispersion” since these can be further discerned through specific trump-related imitated objects. Fig. 41 shows the relevance of “meme terms” in trump-dense threads over time.82 The columns denote weeks, and the rows the top hundred tf-idf terms. All terms are greyed-out, except for four terms: emperor, energy, magic and pepe. Though somewhat diffused, a pattern emerges: these keywords, both pseudo-humorous and pseudo-political, flourish in 2016 but have evaporated only a year later. Of course, such words do not hold their cultural capital forever (Milner 47-8), but the concentration and disappearance of these humorous terms indicate that the “turnover rate” of trump-related memes is high (as the frequency graphs of these terms support; see appendix IV). Additionally, the later weeks in 2017 and 2018 rarely contain such memetic terms, and if they do, they imply anti-Trump sentiments (e.g. drumpf). Though speculative, this suggests that the vortex of the election period created a homogenised field for various forms of trump-related activity: one could jump on the “Trump train”83 motivated by explicit ideological desires, but also join just to partake in the faux-ironic “carnival”.84 However, this homogenisation was short-lived; considering fig. 41 and the growing extremisms in the post-election weeks in the case study, a large part of the comical side of trump seems to have faded on /pol/.
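The reading of fig. 41 rests on two steps: scoring terms per week with tf-idf, and checking in which weeks a tracked term reaches the top of that ranking. As a minimal sketch of these steps (not the pipeline used for this thesis; the function names, the toy tokens, the plain tf × log(N/df) weighting and the top-3 cut-off are all assumptions made for illustration), this could look as follows in Python:

```python
import math
from collections import Counter


def tfidf_by_week(weekly_tokens):
    """Score every term per week; each week is treated as one 'document'.

    weekly_tokens: dict mapping a week label to a list of tokens.
    Assumed weighting: (term count / tokens in week) * log(N weeks / weeks containing term).
    """
    n_weeks = len(weekly_tokens)
    counts = {week: Counter(tokens) for week, tokens in weekly_tokens.items()}
    doc_freq = Counter()  # in how many weeks does each term occur?
    for week_counts in counts.values():
        doc_freq.update(week_counts.keys())
    scores = {}
    for week, week_counts in counts.items():
        total = sum(week_counts.values())
        scores[week] = {
            term: (n / total) * math.log(n_weeks / doc_freq[term])
            for term, n in week_counts.items()
        }
    return scores


def top_terms(scores, k=100):
    """Keep the k highest-scoring terms per week (ties broken alphabetically)."""
    return {
        week: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for week, s in scores.items()
    }


def presence_matrix(top, tracked):
    """Rows: tracked terms; columns: weeks in sorted order; cell: term is in that week's top list."""
    weeks = sorted(top)
    return {term: [term in top[week] for week in weeks] for term in tracked}


# Invented toy data; real input would be the tokenised trump-dense threads per week.
weekly = {
    "2016-18": "trump energy magic pepe trump wall".split(),
    "2016-42": "trump emperor pepe debate wall".split(),
    "2017-51": "trump tax bill debate wall".split(),
}
top = top_terms(tfidf_by_week(weekly), k=3)
matrix = presence_matrix(top, ["emperor", "energy", "magic", "pepe"])
```

In this toy input the tracked terms only reach the top list in the two 2016 weeks, which mirrors the kind of "flourish, then evaporate" pattern described above. Terms that occur in every week (here trump and wall) receive a score of zero under this weighting, which is why they do not dominate the ranking.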

This hardening of discourse around trump could attest to Nagle’s characterisation of 4chan as “countercultural” (Kill All Normies). Nagle argues that the Trump-supportive online collectives in 2016 found a “right-wing sensibility” (28) and espoused a transgressive, rebellious character, embodied by “edgy” memes like Pepe the Frog.85 Nagle describes these movements as countercultural since they “[defined] itself against” a “liberal establishment” (28, emphasis mine); similar to how in the ’68 movements, dispersed groups were homogenised by a shared opposition to a hegemony, but in this case of conservatism (Tielbeke). These broad countercultural “anti”-dynamics can also be discerned in a smaller form on /pol/. As the case study showed, the earlier weeks featured more pro-Trump words (stump, maga), but derogatory vernacular increased afterwards (drumpf, drumpfkins, trumptards). While these are perhaps more indicators of perpetual antagonism and trolling than of counterculture, i.e. the practice of attacking anyone “committed to their cause, whatever the politics” (Phillips 2018, 16), the countercultural cloak can be seen being donned in particular “group introspections” by anons.

82 Download the full tf-idf matrix with all terms from salhagen.nl/thesis/tfidf_trumpdensethreads.csv
83 The “Trump train” was a commonly used meme during the 2016 US election, as a metaphor for the ongoing current of Trump-support on the Internet.
84 The politically opaque “ironic politics” of trolls and anons has led to 4chan being associated with Mikhail Bakhtin’s concept of the “carnivalesque” (see also: Nagle, Kill All Normies 37), referring to a Medieval form of transgression against hegemonic dominance, but instead of challenging it through overt political action, the carnival ridicules it through comical relativity. Just like the carnivalesque is not apolitical, I do not suggest that the political activity associated with laughter is apolitical, but rather that it forms a different mode of operation. Illustrating this, Stoehrel and Lindgren note that there is “no opposition between the lulz and political engagement. The lulz can, as we suggest [...] also be understood as the (forbidden) pleasure or joy of fighting for something meaningful, the passion of (political) struggle [the lulz] are then basically fundamental for rebellion. Without joy, or the fantasy of hope, we cannot, imagine an ‘alternative’ to -- and less revolting against -- a given political situation” (257-8).
85 Angela Nagle relates the countercultural style to Peter Stallybrass and Allon White’s The Politics and Poetics of Transgression, who rework Bakhtin’s notion of the carnivalesque.

For instance, the post in fig. 42 states that “4chan is neither left-wing or right-wing, but purely counter-cultural and largely libertarian”, and fig. 43 constructs a “timeline of the future of /pol/”: “four years in […] this place has transformed into a heaven for mocking conservatives and cemented its arguments against rightism. […] Whatever happened was just ‘that edgy phase’ that /pol/ went through”.

Figure 42: An anon describing 4chan's dominant culture as countercultural. Derived and screencaptured from archive.4plebs.org

Figure 43: An anon predicting the "timeline of the future of /pol/", alluding to the political pendulum theory. Derived and screencaptured from archive.4plebs.org.

However, the characterisation of 4chan’s overall political fluidity as countercultural does not fully fit with my empirical findings on /pol/. If the board was truly oppositional, fluctuating between the left and right, the reactions to Trump would crawl increasingly to the left end of the spectrum, as the anon in fig. 43 theorises. In contrast, the associations with trump over the course of the five clusters became more entrenched in right-wing extremes, most problematically coagulating into anti-Semitism. As such, particularly considering the repeated appearance of anti-Semitic hate speech, this suggests a recurring presence of “extremist publics” on /pol/. For instance, one anon in fig. 44, replying to the post in fig. 43, claims “/pol/ has been right[-wing] from the start”, noting that the only reason a significant political shift will occur is if “Trump turns to the kikes” (Anon #14). While anecdotal, there is truth in these statements, as far-right (anti-Semitic) sentiments seem to be deep-rooted within the fabric of /pol/ publics instead of merely contingent on mainstream currents. One of the main reasons to assume this is that the raison d’être of /pol/ was as a far-right containment board, implying a presence of a “core” public that finds in /pol/ a place for their consistent hate, instead of ambivalence. The above findings further buttress this, most notably because of the reoccurrence of anti-Semitism in the word trees. Further, the occurrence of Judaism-related or anti-Semitic hate words has (proportionally) remained stagnant. For instance, fig. 45 shows that kike has grown in late 2017 and 2018 but was already prominent in 2015. Other anti-Semitic keywords show the same patterns (see appendix IV for the frequency graphs of goyim and jew). As such, rather than indicating a sudden anti-Semitic eruption, the findings make it likely that Trump drew the attention of the anti-Semites already manifested on the board, though more “subjective positions” have to be employed to further support this claim.
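The distinction between absolute and proportional frequencies matters here, because post volumes on /pol/ changed over the period studied. A minimal sketch of the proportional measure, assuming posts are available as (month, text) pairs and using the neutral keyword maga with invented posts (the function name and the data layout are hypothetical and not taken from the thesis database):

```python
import re
from collections import defaultdict


def monthly_share(posts, term):
    """Return {month: share of that month's posts mentioning `term` as a whole word}.

    posts: iterable of (month, text) pairs, e.g. ("2017-12", "some post body").
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    totals = defaultdict(int)  # all posts per month
    hits = defaultdict(int)    # posts per month that mention the term
    for month, text in posts:
        totals[month] += 1
        if pattern.search(text):
            hits[month] += 1
    return {month: hits[month] / totals[month] for month in sorted(totals)}


# Invented toy posts: more posts in the later month, but a smaller share mentions the term.
toy_posts = [
    ("2015-11", "MAGA now"),
    ("2015-11", "unrelated post"),
    ("2017-12", "maga maga"),
    ("2017-12", "unrelated post"),
    ("2017-12", "read this magazine"),  # substring only, not counted
    ("2017-12", "another post"),
]
shares = monthly_share(toy_posts, "maga")
```

Dividing by the monthly total is what separates a keyword that merely grows along with the board from one that takes up a larger part of the conversation.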

Figure 44: An anon countering the characterisation of /pol/ as countercultural, stating it will remain far-right instead. Derived and screencaptured from archive.4plebs.org.

Figure 45: The amount of posts on 4chan/pol/ mentioning kike.

The above case study offered a “navigation” around a specific imitated object, trump, to explore whether it could provide points of view into political shifts and continuities. It did; while the week-by-week views attest to the fracturedness of the political sentiments of /pol/ anons, word contexts surrounding trump turned from relatively supportive to more hateful. As differently conceived perspectives of an inexistent “whole”, the case study could never provide, and has never attempted to provide, a holistic view on the political climate on /pol/. However, it did “test the waters” by identifying some changes in the current.

Conclusions

This research has explored various facets of the research practice that is rendering visible and interpreting political sentiments on 4chan/pol/ - from the conceptual to the practical and back. It posed the problem that 4chan, and /pol/ in particular, can be difficult to scrutinise considering its anonymity, ephemerality and subcultural obscurity, but that it is nonetheless necessary to scrutinise its political sentiments, both to moderate /pol/’s political extremities and to prevent problematic generalisations. This is not only a problem of reporting, but an academic challenge as well: how does one accurately characterise a space whose affordances invite generalisation? To that end, the first chapter offered an excursion into the thought of Tarde. His anti-structuralist outlook on social collectives offers handles to refrain from conceptualising 4chan or the publics emerging from it as metaphysical or holistic entities by emphasising that the “whole” of such collectives is merely the sum of its parts. 4chan itself is a fairly simple network, but one that houses complex subcultural meaning. The Tardean view allows for the scrutiny of patterns within these webs of imitations, since he argues that subjective beliefs and desires are not exclusively tied to the human subject, and that objects produced by humans are not just “out there”, but rather that “objects of imitation” can render visible the shared relations between subjects. This elevates research into “what 4chan is made of” from the mere analysis of individuals or objects, and moves it towards scrutiny of what social relations, and by extension, political sentiments are embedded in the concrete data. 4chan affords a particularly high rate of “innovative” objects of imitation, but because this cultural production also deepens the communities’ already opaque knowledge reservoirs, empirically untangling the intertextual meaning from these concrete data objects forms a challenge.
To that end, the second chapter offered theoretical and practical approaches on how to employ quantitative techniques to analyse 4chan’s complex streams of imitation and innovation. Particularly, Latour et al.’s rethinking of Tarde’s approach to refrain from micros and macros provided an armature to “navigate” multiple points of view offered by different methods. This allows one to circulate around “properties and attributes” of a particular object to infer complex cultural meaning without imposing structural preconceptions. Three main methods that accounted for the Tardean outlook were presented: extracting popular terms present in trump-dense discussion, mapping the semantic similarities in word contexts of trump, and visualising what trump is considered to “be”. While these methods are relatively simple text mining techniques, they can provide rich meaning, since the context of words indicates what a particular object “has”, and by extension, what it “is”. As a particularly insightful and “fluid” imitated object, trump was chosen as an anchor point to explore political sentiments on /pol/ using the proposed methodology. With a map of vocabulary change as a rudder, the approach and methods were put to the test in the case study in chapter three. Apart from providing many perspectives on further particulars denoting meaningful relations, the five weeks under scrutiny show the “attributes” of trump changed quite radically. The most prominent findings were that trump was referred to as a “meme candidate” and later moved to form a target for anti-Semitic hate speech; the pre-election weeks indicate supportive terms (stump, maga) and semi-humorous slang (energy), but this changed after the elections to include derogatory vernacular (drumpf). The varying longitudinal results implied that, apart from the public, the notion of the crowd could be useful to denote sudden eruptions of transformative and congregative coagulations of political activity, which likely occurred during the Trump campaign. Despite the fact that semi-humorous roleplay (e.g. god-emperor) played a part in these pre-election activities, multiple markers were found throughout the exploration that imply /pol/ also houses a consistent presence of devoted extremist publics, as indicated by the last two word trees and the fact that anti-Semitic keywords have been constant for multiple years (e.g. kike). In a way, the results of the case study wrap around to the start of this thesis: the shifts in meaning surrounding trump buttress the initial reason this research was deemed urgent, namely that political sentiments on 4chan are ever-evolving and behove constant “safeguarding of the record”. This process should not be taken for granted or simplified to an “automated” effort, as made clear by two perspectives. The first perspective focuses on the object of study. A space like 4chan “changes” on multiple fronts, which complicates research practices. As I have personally experienced, many /pol/ anons use a variety of tactics when realising their community is scrutinised, obscuring and skewing the post data in order to wrong-foot the researcher. The abstract Happy Merchant image illustrated in the first chapter is one example of this. The imageboard also changes infrastructurally, e.g. through the creation of new boards that grow in (political) relevance. On an even higher level, 4chan as a whole might lose relevancy as a site of political and cultural scrutiny because users might migrate to other platforms and chans - as indicated by the alleged post-Gamergate “exodus” to 8chan.
Luckily, the insights offered in this research are somewhat generalisable to other chans, since several are based on the same software and transgressive cultures. The second perspective is introspective, focused on the position and practices of the researcher. Like the fluidity of the river’s current, one’s own perception of the current is far from rock-steady, nor are the fishing nets always catching the same contents. Therefore, it should be remarked that the research presented here is “particular” - both in the sense that it merely describes my singular “gaze”, as well as denoting its unusual nature. As such, I would like to repeat the exploratory nature of this research. Instead of focusing on one question or case, it has mostly removed the lid of a jar containing endless further cultural meanings and references. I think the methods employed here were fairly useful in providing paths for further explorations, but the quantitative approach knows its limits. To a large extent, ethnographical research into these vernacular spaces is crucial, since a first-person experience can help explain what data analysis often fails to capture: the participatory dynamics in a thread, the ebbs and flows of the platform’s contingency, “happenings” as they unfold, the rhetorical battles, the nuances in irony, shifts in transgressions, and so forth. Still, the “navigation” presented here can indicate entry points for interplays between quantitative and qualitative approaches, strengthening the researcher’s “second-degree objectivity”. The fact that this study has revealed particularly qualitative relations through quantitative approaches is, in my view, a strand that should be explored in future academic endeavours. Empirical research into digital communities can repurpose concrete “quantitative” objects, like retweets and likes, but what much of the case study attested to is that “qualitative”, fickle and very social text objects can be repurposed to formulate the “beliefs” of the members within online collectives. These beliefs can be pointed towards an event or person, like trump, but to get an understanding of the online space and the in-group norms, it is particularly interesting when these beliefs are pointed inwards, indicating perceptions of the community itself. Exemplifying this is “/ourguy/”, found through the t-SNE graphs. This single word is a traceable data point that opens up a whole vector of imaginaries, identifications, and internalisations of a social “whole” anons imagine themselves to be inside. Such meta-commentary on /pol/, also seen in self-reflective debates (e.g. on 4chan as “countercultural”), has its own vernacular of “shilling”, “shitposting”, “lurking”, “baiting”, and so on. Apart from natively digital objects, such natively digital discourse can be used as anchor points for social research into online subcultures. This introspective vernacular might blur the outsider’s view on the space, particularly on 4chan, but carefully traced as relational objects, these markers can also illuminate rather than obscure.

Works cited

Aggarwal, Charu C. Data Mining: The Textbook. New York: Springer, 2015.
Aigin, Scott F. “Poe’s Law, Group Polarization, and Argumentative Failure in Religious and Political Discourse.” Social Semiotics, vol. 23, no. 3, 2013, pp. 301-317.
Aizawa, Akiko. “An Information-Theoretic Perspective of Tf–idf Measures.” Information Processing and Management, vol. 39, no. 1, 2002, pp. 45-65.
Bakhtin, Mikhail M. Rabelais and His World. Bloomington: Indiana University Press, 1984.
Barry, Andrew, and Nigel Thrift. “Gabriel Tarde: Imitation, Invention and Economy.” Economy and Society, vol. 36, no. 4, 2007, pp. 509-525.
Bellman, Richard E. Dynamic Programming. Princeton: Princeton University Press, 1957.
Beran, Paul. “4chan: The Skeleton Key to the Rise of Trump.” The Huffington Post, 20 Feb. 2017, huffingtonpost.com/entry/4chan-the-skeleton-key-to-the-rise-of-trump_us_58ab6156e4b0a855d1d8dfe4. Accessed 28 June 2018.
Bernstein, Michael S, et al. “4chan and /b/: An Analysis of Anonymity and Ephemerality in a Large Online Community.” Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, 2011.
Biddle, Sam. “Reddit Is So Racist White Supremacists Are Using it to Recruit.” Gawker, 13 Mar. 2015, gawker.com/reddit-is-so-racist-white-supremacists-are-using-it-to-1691162974. Accessed 28 June 2018.
Bienkov, Adam. “Astroturfing: What Is It and Why Does It Matter?” The Guardian, 8 Feb. 2018, theguardian.com/commentisfree/2012/feb/08/what-is-astroturfing. Accessed 28 June 2018.
Blackmore, Susan. The Meme Machine. New York: Oxford University Press, 1999.
Blow, Charles M. “About the ‘Basket of Deplorables’.” The New York Times, 12 Sep. 2016, nyti.ms/2cGdHj5. Accessed 24 June 2018.
Brighenti, Andrea Mubi. “Tarde, Canetti, and Deleuze on Crowds and Packs.” Journal of Classical Sociology, vol. 10, no. 4, 2010, pp. 291-314.
Bucher, Taina, and Anne Helmond. “The Affordances of Social Media Platforms.” The SAGE Handbook of Social Media, edited by Jean Burgess, Thomas Poell, and Alice Marwick, London: SAGE Publications Ltd., 2017, pp. 233-253.
“Can’t Stump the Trump.” Know Your Meme, 13 Oct. 2015, knowyourmeme.com/memes/cant-stump-the-trump. Accessed 28 June 2018.
“Caturday.” Know Your Meme, 13 Aug. 2009, knowyourmeme.com/memes/caturday. Accessed 28 June 2018.
Clark, Terry N. Gabriel Tarde on Communication and Social Influence. Chicago: University of Chicago Press, 1969.

Coleman, Gabriella. “Net Wars Over Free Speech, Freedom, and Secrecy or How to Understand the Hacker and Lulz Battle Against the Church of Scientology.” Youtube, uploaded by Channel2600, 17 July 2009, youtube.com/watch?v=_ywypPjPVDM. Accessed 28 June 2018.
---. Hacker, Hoaxer, Whistleblower, Spy: The Many Faces of Anonymous. London: Verso, 2014.
Collinson, Stephen. “Donald Trump: Presumptive GOP Nominee; Sanders Takes Indiana.” CNN, 3 May 2016, edition.cnn.com/2016/05/03/politics/indiana-primary-highlights/index.html. Accessed 28 June 2018.
“Complete History of 4chan.” Tanasinn.info, tanasinn.info//Complete_History_of_4chan. Accessed 28 June 2018.
“Cult of Kek.” Know Your Meme, 16 Sep. 2016, knowyourmeme.com/memes/cult-of-kek. Accessed 28 June 2018.
Davies, Jason. “Word Tree.” jasondavies.com, jasondavies.com/wordtree/. Accessed 28 June 2018.
Dawkins, Richard. The Selfish Gene. Oxford: Oxford University Press, 1989.
Deleuze, Gilles. Le Bergsonisme. Paris: Presses Universitaires de France, 1998.
Deleuze, Gilles, and Felix Guattari. Thousand Plateaus: Capitalism and Schizophrenia. London: Athlone, 1987.
Detrow, Scott. “Sting Video Purports to Show Democrats Describing How to Commit Voter Fraud.” National Public Radio, 19 Oct. 2016, npr.org/2016/10/19/498587397/sting-video-purports-to-show-democrats-describing-how-to-commit-voter-fraud. Accessed 28 June 2018.
Dewey, Caitlin. “The Only True Winners of this Election are Trolls.” The Washington Post, 3 Nov. 2016, washingtonpost.com/news/the-intersect/wp/2016/11/03/the-only-true-winners-of-this-election-are-trolls/?noredirect=on&utm_term=.b2074461cc5c. Accessed 28 June 2016.
Faris, Robert, et al. “Partisanship, Propaganda, and Disinformation: Online Media and the 2016 U.S. Presidential Election.” Berkman Klein Center Research Publication, 2017.
Farkas, Illés, et al. “Social Behaviour: Mexican Waves in an Excitable Medium.” Nature, vol. 419, 2002, pp. 131-132.
Firozi, Paulina. “Trump Jr. and Top Supporter Share White Nationalist Image on Social Media.” The Hill, 10 Aug. 2016, thehill.com/blogs/ballot-box/presidential-races/295297-trump-son-white-nationalist-meme. Accessed 24 June 2018.
Firth, John R. “A Synopsis of Linguistic Theory, 1930-1955.” Studies in Linguistic Analysis, Oxford: Basil Blackwell, 1962.
Fisher, Marc, et al. “Pizzagate: From Rumor, to Hashtag, to Gunfire in D.C.” The Washington Post, 6 Dec. 2016, wapo.st/2hfxKqk. Accessed 28 June 2018.
Foran, Clare. “A $1 Million Fight Against Hillary Clinton’s Online Trolls.” The Atlantic, 31 May 2016, theatlantic.com/politics/archive/2016/05/correct-the-record-online-trolls/484847/. Accessed 28 June 2018.
Gardner, Martin. Famous Poems from Bygone Days. New York: Dover Publications, 1995.

Gass, Nick, and Jeniffer Shutt. “Trump Whacks Carson as Neurosurgeon under Fire for Past Remarks.” Politico, 5 Nov. 2015, politico.com/story/2015/11/donald-trump-ben-carson-youth-story-215585. Accessed 26 June 2018.
Geertz, Clifford. The Interpretation of Cultures: Selected Essays. New York: Basic Books, 1973.
Goldberg, Yoav. “Neural network methods for natural language processing.” Synthesis Lectures on Human Language Technologies, vol. 10, no. 1, 2017, pp. 1-309.
Haddow, Douglas. “Meme Warfare: How the Power of Mass Replication has Poisoned the US Election.” The Guardian, 4 Nov. 2016, theguardian.com/us-news/2016/nov/04/political-memes-2016-election-hillary-clinton-donald-trump. Accessed 29 June 2018.
Hagen, Sal. “Rendering Legible the Ephemerality of 4chan/pol/.” OILab, 12 Apr. 2018, oilab.eu/rendering-legible-the-ephemerality-of-4chanpol/. Accessed 20 June 2018.
Harris, Zellig S. “Distributional Structure.” Word, vol. 10, no. 2-3, 1954, pp. 146-162.
Hartigan, John, and Mancheck Wong. “Algorithm AS 136: A K-means Clustering Algorithm.” Journal of the Royal Statistical Society, vol. 28, no. 1, 1979, pp. 100-108.
Hawkins, Derek. “How Trump Came to Commute an Ex-Meatpacking Executive’s 27-Year Prison Sentence.” The Washington Post, 21 Dec. 2017, wapo.st/2kyrcHb. Accessed 20 June 2018.
Healy, Patrick, and Jonathan Martin. “Donald Trump Won’t Say if He’ll Accept Result of Election.” The New York Times, 20 Oct. 2016, nytimes.com/2016/10/20/us/politics/presidential-debate.html. Accessed 28 June 2018.
Hine, Gabriel Emile, et al. “Keks, Cucks, and God Emperor Trump: A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on the Web.” Proceedings of ICWSM ’17, pp. 92-101.
Horowitz, Jason. “Inside Hillary Clinton’s Outrage Machine, Allies Push the Buttons.” The New York Times, 23 Sep. 2016, nytimes.com/2016/09/23/us/politics/hillary-clinton-media-david-brock.html. Accessed 28 June 2016.
Hutchby, Ian. “Technologies, Texts and Affordances.” Sociology, vol. 35, no. 2, 2001, pp. 441-456.
“The Internet is Serious Business.” Know Your Meme, 19 Aug. 2009, knowyourmeme.com/memes/the-internet-is-serious-business. Accessed 26 June 2018.
Katz, Elihu. “Rediscovering Gabriel Tarde.” Political Communication, vol. 23, no. 3, pp. 263-270.
Kinnunen, Jussi. “Gabriel Tarde as a Founding Father of Innovation Diffusion Research.” Acta Sociologica, no. 39, vol. 4, 1996, pp. 431-442.
Kozlowska, Hanna. “Hillary Clinton’s Website Now Has an Explainer about a Frog that Recently Became a Nazi.” Quartz, 13 Sep. 2016, qz.com/780663/hillary-clintons-website-now-has-an-explainer-about-pepe-the-frog-a-white-supremacist-symbol/. Accessed 18 June 2018.
Knuttila, Lee. “User Unknown: 4chan, Anonymity and Contingency.” First Monday, vol. 16, no. 10, 2011.
---. “Trolling Aesthetics: The Lulz as Creative Practice.” Dissertation, York University, 2015.

Kruskal, Joseph B. “Multidimensional Scaling by Optimizing Goodness of Fit to a Nonmetric Hypothesis.” Psychometrika, vol. 29, no. 1, 1964, pp. 1-27.
LaBeouf, Shia, et al. HEWILLNOTDIVIDE.US. Live installation, Museum of the Moving Image, New York, 2017.
Latour, Bruno. “Gabriel Tarde and the End of the Social.” The Social in Question: New Bearings in History and the Social Sciences, edited by Patrick Joyce, London: Routledge, 2002, pp. 117-132.
---. “Tarde’s Idea of Quantification.” The Social After Tarde: Debates and Assessments, edited by Matei Candea, London: Routledge, pp. 145-162.
Latour, Bruno, et al. “‘The Whole Is Always Smaller Than Its Parts’. A Digital Test of Gabriel Tarde’s Monads.” British Journal of Sociology, vol. 63, no. 4, 2012, pp. 590-615.
Lawrence, David. “HOPE Not Hate Explains… The Cult of Kek.” Hope Not Hate, 15 Nov. 2017, hopenothate.org.uk/2017/11/15/hope-not-hate-explains-cult-kek/. Accessed 24 June 2018.
Le Bon, Gustave. The Crowd: A Study of the Popular Mind. London: T. Fisher Unwin, 1908.
Leys, Ruth. “Mead’s Voices: Imitation as Foundation, or, the Struggle against Mimesis.” Critical Inquiry, vol. 19, no. 2, 1993, pp. 277-307.
Maaten, Laurens van der. “t-SNE.” GitHub, lvdmaaten.github.io/tsne/. Accessed 27 June 2018.
Maaten, Laurens van der, and Geoffrey Hinton. “Visualizing Data Using t-SNE.” Journal of Machine Learning Research, vol. 9, 2008, pp. 2579-2605.
“Meme Magic.” Know Your Meme, 8 Feb. 2016, knowyourmeme.com/memes/meme-magic. Accessed 29 June 2018.
Martineau, Paris. “The Storm is the New Pizzagate -- Only Worse.” NYMag, 19 Dec. 2017, nymag.com/selectall/2017/12/qanon-4chan-the-storm-conspiracy-explained.html. Accessed 20 June 2018.
Marsden, Paul. “Forefathers of Memetics: Gabriel Tarde and the Laws of Imitation.” A Memetics Compendium, edited by Robert Finkelstein, 2008, pp. 1176-1180.
Marwick, Alice, and Rebecca Lewis. “Media Manipulation and Disinformation Online.” Data & Society, 2017.
Mikolov, Tomas, et al. “Efficient Estimation of Word Representations in Vector Space.” ICLR Workshop Papers, 2013, pp. 1-12.
Mikolov, Tomas, et al. “Distributed Representations of Words and Phrases and Their Compositionality.” Advances in Neural Information Processing Systems, 2013, pp. 3111-3119.
Milner, Ryan. World Made Meme. Cambridge: MIT Press, 2016.
Mole, Charlie. “Seth Rich: How a Young Man’s Murder Attracted Conspiracy Theories.” BBC, 21 Apr. 2018, bbc.com/news/blogs-trending-43727858. Accessed 27 June 2018.
Murray, Mark. “NBC/WSJ Poll: Carson Surges Into Lead of National GOP Race.” NBC, 4 Nov. 2015, nbcnews.com/politics/2016-election/nbc-wsj-poll-carson-surges-lead-national-gop-race-n456006. Accessed 27 June 2018.

Nagle, Angela. Kill All Normies: Online Culture Wars from 4chan and Tumblr to Trump and the Alt-Right. Winchester: Zero Books, 2017.
---. “Goodbye, Pepe.” The Baffler, 15 Aug. 2017, thebaffler.com/latest/goodbye-pepe. Accessed 17 June 2018.
Neiwert, David. Alt-America: The Rise of the Radical Right in the Age of Trump. London: Verso, 2017.
Niezen, Ronald. “Gabriel Tarde’s Publics.” History of the Human Sciences, vol. 27, no. 2, 2014, pp. 41-59.
“Normie.” Know Your Meme, 27 Feb. 2015, knowyourmeme.com/memes/normie. Accessed 24 June 2018.
Nuzzi, Olivia. “How Internet Trolls Won the 2016 Presidential Election.” NYMag, 16 Sep. 2016, nymag.com/selectall/2016/09/how-internet-trolls-won-the-2016-presidential-election.html. Accessed 29 June 2018.
O’Brien, Luke. “The Making of an American Nazi.” The Atlantic, Dec. 2017, theatlantic.com/magazine/archive/2017/12/the-making-of-an-american-nazi/544119/. Accessed 20 June 2018.
Parikka, Jussi. Digital Contagions: A Media Archaeology of Computer Viruses. New York: Peter Lang, 2007.
Parker, Ashley. “Jeb Bush Sprints to Escape Donald Trump’s ‘Low Energy’ Label.” The New York Times, 30 Dec. 2015, nytimes.com/2015/12/30/us/politics/jeb-bush-sprints-to-escape-donald-trumps-low-energy-label.html. Accessed 29 June 2018.
Phillips, Whitney. This Is Why We Can’t Have Nice Things: Mapping the Relationship between Online Trolling and Mainstream Culture. Cambridge: MIT Press, 2015.
---. “The Oxygen of Amplification: Better Practices for Reporting on Extremists, Antagonists, and Manipulators Online.” Data & Society, 2018.
Phillips, Whitney, and Ryan Milner. The Ambivalent Internet: Mischief, Oddity, and Antagonism Online. Cambridge: Polity Press, 2017.
Reilly, Katie. “Read Hillary Clinton’s ‘Basket of Deplorables’ Remarks About Donald Trump Supporters.” Time, 10 Sep. 2016, time.com/4486502/hillary-clinton-basket-of-deplorables-transcript/. Accessed 28 June 2018.
Reitman, Janet. “All-American Nazis.” Rolling Stone, 2 May 2018, rollingstone.com/politics/news/all-american-nazis-fascist-youth-united-states-w519651. Accessed 17 June 2018.
Rieder, Bernhard. “The Refraction Chamber: Twitter as Sphere and Network.” First Monday, vol. 17, no. 11, 2012.
---. “What Is in PageRank? A Historical and Conceptual Investigation of a Recursive Status Index.” Computational Culture, vol. 2, 2012.

Rieder, Bernhard, and Carolin Gerlitz. “Mining One Percent of Twitter: Collections, Baselines, Samples.” M/C Journal, vol. 16, no. 2, 2013.
Rieder, Bernhard, and Theo Röhle. “Digital Methods: Five Challenges.” Understanding Digital Humanities, edited by David Berry, Basingstoke: Palgrave, 2012, pp. 67-84.
Robertson, Stephen. “Understanding Inverse Document Frequency: On Theoretical Arguments for IDF.” Journal of Documentation, vol. 60, no. 5, 2004, pp. 503-520.
Robertson, Adi. “Hillary Clinton Exposing Pepe the Frog is the Death of Explainers.” The Verge, 15 Sep. 2016, theverge.com/2016/9/15/12926976/hillary-clinton-trump-pepe-the-frog-alt-right-explainer. Accessed 20 June 2018.
Rogers, Richard. “Post-Demographic Machines.” Walled Garden, edited by Annet Dekker and Annette Wolfsberger, 2009, pp. 29-39.
---. Digital Methods. Cambridge: MIT Press, 2013.
Rucker, Phillip, et al. “Trump Says He Won’t Fire Mueller, as Campaign to Discredit Russia Probe Heats Up.” The Washington Post, 17 Dec. 2017, washingtonpost.com/politics/trump-says-he-wont-fire-mueller-as-campaign-to-discredit-russia-probe-heats-up/2017/12/17/801e8cce-e348-11e7-ab50-621fe0588340_story.html?utm_term=.50fa2d97e4c6. Accessed 29 June 2018.
“Rules.” 4chan, 4chan.org/rules. Accessed 10 May 2018.
Salton, G., et al. “A Vector Space Model for Automatic Indexing.” Communications of the ACM, vol. 18, no. 11, 1975, pp. 613-620.
Sampson, Tony. Virality: Contagion Theory in the Age of the Networks. Minneapolis: University of Minnesota Press, 2012.
Schmidt, Hans B. “Evolution by Imitation.” Distinktion: Scandinavian Journal of Social Theory, vol. 5, no. 2, 2004, pp. 103-118.
Searle, John R. “Social Ontology: Some Basic Principles.” Anthropological Theory, vol. 6, no. 1, 2006, pp. 12-29.
Shifman, Limor. Memes in Digital Culture. Cambridge: MIT Press, 2013.
Smith, Anthony. “Donald Trump’s Star of David Hillary Clinton Meme Was Created by White Supremacists.” Mic, 3 July 2016, mic.com/articles/147711/donald-trump-s-star-of-david-hillary-clinton-meme-was-created-by-white-supremacists#.qVBC3iywC. Accessed 26 June 2018.
Spärck Jones, Karen. “A Statistical Interpretation of Term Specificity and Its Application in Retrieval.” Journal of Documentation, vol. 28, no. 1, 1972, pp. 11-21.
Tannen, Deborah. Talking Voices: Repetition, Dialogue, and Imagery in Conversational Discourse. Cambridge: Cambridge University Press, 1989.
Tarde, Gabriel. “Les Deux Éléments de la Sociologie.” Études de psychologie sociale, 1895, pp. 63-94.
---. The Laws of Imitation. Edited and translated by Elsie Clews Parsons, New York: Henry Holt and Company, 1903.

---. Monadology and Sociology. Edited and translated by Theo Lorenc, Melbourne: re.press, 2012.
---. Penal Philosophy. Edited by Edward Lindsey, translated by Rapelje Howell, Boston: Little, Brown and Company, 1912.
---. Psychologie Économique. Paris: Félix Alcan, 1902.
---. “The Public and the Crowd.” Gabriel Tarde: On Communication and Social Influence. Chicago: University of Chicago Press, 1969.
Tarnoff, Ben. “The Triumph of Trumpism: the New Politics That Is Here to Stay.” The Guardian, 9 Nov. 2016, theguardian.com/us-news/2016/nov/09/us-election-political-movement-trumpism. Accessed 20 June 2018.
Thacker, Eugene. “Networks, Swarms, Multitudes, Part Two.” CTheory, 2004, ctheory.net/articles.aspx?id=422. Accessed 28 June 2018.
Tielbeke, Jaap. “Oorlog Tegen de Babyboomers.” De Groene Amsterdammer, 18 Apr. 2018, groene.nl/artikel/oorlog-tegen-de-babyboomers. Accessed 28 June 2018.
“Trump Is Playing 4D Chess.” Know Your Meme, 20 July 2016, knowyourmeme.com/memes/trump-is-playing-4d-chess. Accessed 27 June 2018.
Tuters, Marc, et al. “Theory Delirium: How 4chan/pol/ Cooked-up Pizzagate.” Digital Methods Initiative Wiki, 9 Feb. 2018, digitalmethods.net/Dmi/WinterSchool2018DeepVernacularWebAltRight. Accessed 24 June 2018.
Uitermark, Justus. “Complex Contention: Analyzing Power Dynamics within Anonymous.” Social Movement Studies, vol. 16, no. 4, 2017, pp. 403-417.
Venturini, Tommaso. “Diving in Magma: How to Explore Controversies with Actor-Network Theory.” Public Understanding of Science, vol. 19, no. 3, 2010, pp. 258-273.
---. “What is Second-Degree Objectivity and How Could it Be Represented.” Séminaire W2S, 2011.
Vincent, James. “4chan iCloud ‘Expert’ from CNN Thinks 4chan Is a Person and ‘Pa$$word’ Is a Good Password.” The Independent, 3 Sep. 2014, independent.co.uk/life-style/gadgets-and-tech/4chan-icloud-expert-from-cnn-thinks-4chan-is-a-person-and-paword-is-a-good-password-9707845.html. Accessed 20 June 2018.
Wattenberg, Martin, and Fernanda Viégas. “The Word Tree, an Interactive Visual Concordance.” IEEE Transactions on Visualization and Computer Graphics, vol. 14, no. 6, 2008, pp. 1221-1228.
Wattenberg, Martin, et al. “How to Use t-SNE Effectively.” Distill, 13 Oct. 2016, distill.pub/2016/misread-tsne. Accessed 20 June 2018.
Wiedemann, Carolin. “Between Swarm, Network, and Multitude: Anonymous and the Infrastructures of the Common.” Distinktion: Scandinavian Journal of Social Theory, vol. 15, no. 3, 2014, pp. 309-326.
Zannettou, Savvas, et al. “On the Origins of Memes by Means of Fringe Web Communities.” arXiv:1805.12512, 2018.
Zuckerman, Ethan. “The Cute Cat Theory of Digital Activism.” E-Tech Conference, San Diego, California, USA, March 2008.

Anonology

Anon #1. no.97318582. 4plebs, 9 Nov. 2016, archive.4plebs.org/pol/thread/97317427/#q97318582. Accessed 28 June 2018.
Anon #2. no.97827121. 4plebs, 11 Nov. 2016, archive.4plebs.org/pol/thread/97824845/#97827121. Accessed 28 June 2018.
Anon #3. no.105649189. 4plebs, 4 Jan. 2017, archive.4plebs.org/pol/thread/105649057/#q105649189. Accessed 28 June 2018.
Anon #4. no.152279244. 4plebs, 6 Dec. 2017, archive.4plebs.org/pol/thread/152279244/#152279244. Accessed 28 June 2018.
Anon #5. no.97148985. 4plebs, 9 Nov. 2016, archive.4plebs.org/pol/chunk/97022906/#q97148985. Accessed 28 June 2018.
Anon #6. no.72691022. 4plebs, 2 May 2016, archive.4plebs.org/pol/thread/72675998/#72691022. Accessed 28 June 2018.
Anon #7. no.88263501. 4plebs, 10 Sep. 2016, archive.4plebs.org/pol/thread/88261162/#88263501. Accessed 28 June 2018.
Anon #8. no.154297448. 4plebs, 23 Dec. 2017, archive.4plebs.org/pol/thread/154283198/#154297448. Accessed 28 June 2018.
Anon #9. no.153873352. 4plebs, 20 Dec. 2017, archive.4plebs.org/pol/thread/154283198/#154297448. Accessed 28 June 2018.
Anon #10. no.153741618. 4plebs, 18 Dec. 2017, archive.4plebs.org/pol/thread/153739689/#153741618. Accessed 28 June 2018.
Anon #11. no.154003252. 4plebs, 21 Dec. 2017, archive.4plebs.org/pol/thread/154003252/. Accessed 28 June 2018.
Anon #12. no.153978794. 4plebs, 21 Dec. 2017, archive.4plebs.org/pol/thread/153978794/. Accessed 28 June 2018.
Anon #13. no.97148985. 4plebs, 9 Nov. 2016, archive.4plebs.org/pol/thread/97022906/#97148985. Accessed 28 June 2018.
Anon #14. no.123116321. 4plebs, 28 Apr. 2017, archive.4plebs.org/pol/thread/123115623/#123116321. Accessed 28 June 2018.

Appendices

I A primer on 4chan’s infrastructure

When it was born, 4chan not only carried over the Japanese Futaba channel’s focus on anime and manga culture, but also its open-source imageboard software. The core functionality of 4chan’s infrastructure has remained roughly identical since 2003. It consists of three main building blocks: posts, threads, and boards.

Posts and threads

Posts are the output of a single user contribution on 4chan. A post can contain text, an image, or both, and always contains at least the following metadata: an author name of the user’s choosing, a timestamp, and an integer representing the post’s chronological entry on a board (dubbed no). Posts are inseparable from the sequence of posts they appear in: the thread. A full thread consists of two types of posts: an opening post (OP) and replies. The OP is always the first post in a thread and always contains an image, post text, and a title. The OP initiates the topic of conversation, for instance by posing a question or making a statement (see e.g. fig. 46). Other users can then react by leaving a reply, the second type of post. A reply contains some text, an image, or both (see e.g. the brown boxes underneath the OP in fig. 46).

Figure 46: The beginning of a thread on 4chan/pol/

The conversation on 4chan is held through posting these images and texts. The image is given a prominent position next to the post’s text and is often used as a visual indicator of an emotional state.86 The text in a post (here referred to as the post body) usually consists of a simple text string, but also allows some site-specific notations. Starting a line with a single greater-than sign (>) causes the subsequent text to appear green (see fig. 47). This “greentexting” entails a culture of its own: it is often used as a narrative device for personal stories, but also to summarise, quote or intentionally misquote another post. Further, two greater-than signs (>>) followed by the number of another post in the same thread create a reply to that post. This allows in-thread discussion, as referred-to posts are annotated with hyperlinks to the replies they garnered (e.g. the hyperlinks in the OP and “>>174006730” in the second reply in fig. 46). Longer threads tend to develop smaller sub-discussions through these in-thread replies, sometimes diverging substantially from the OP’s agenda setting (called a “slide thread” in 4chan-speak).

86 For instance, “reaction faces” are a genre of their own on 4chan, with prominent examples being Rage Faces, Pepe the Frog, Feels Guy, Hide the Pain Harold, and more.

Figure 47: An example of a reply on /pol/
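
The post-body notations described above can be made concrete with a short sketch. The Python below is an illustration only and not 4chan’s own parser: the function names parse_post_body and backlinks, as well as the regular expression, are assumptions introduced here, and edge cases such as cross-board links are ignored.

```python
import re

# Assumed notation, following the description in the text: a line starting
# with a single ">" is greentext; ">>" followed by digits refers to another
# post in the same thread.
QUOTE_LINK = re.compile(r">>(\d+)")


def parse_post_body(body: str) -> dict:
    """Split a post body into greentext lines, other lines, and quoted post numbers."""
    greentext, plain = [], []
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith(">") and not stripped.startswith(">>"):
            greentext.append(stripped[1:].strip())
        elif stripped:
            plain.append(stripped)
    quoted = [int(no) for no in QUOTE_LINK.findall(body)]
    return {"greentext": greentext, "plain": plain, "quoted": quoted}


def backlinks(posts: dict) -> dict:
    """Map each referred-to post number to the numbers of the posts replying to it."""
    links = {}
    for no, body in posts.items():
        for target in parse_post_body(body)["quoted"]:
            if target in posts:  # only count links that stay within the thread
                links.setdefault(target, []).append(no)
    return links
```

Given a thread as a dictionary of post numbers and post bodies, backlinks returns the reply annotations that the interface shows as hyperlinks on referred-to posts.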

The fact that no registration is needed to submit a post points to 4chan’s first notable affordance: its anonymity. While there is an option to insert a name before posting, the field is most commonly left blank, meaning the default name “Anonymous” is displayed as the author. Practically speaking, 4chan thus affords an extremely low barrier of entry: the only required form of authentication is a CAPTCHA meant to fend off bots. Still, despite its lack of registration, 4chan is not strictly anonymous. On some boards, like /pol/, an ID is attached to a post, denoting a unique alphanumeric value based on the IP address of the poster (see the green label next to “ID:” in fig. 47). However, the consistency of IDs is usually only maintained within one thread; across threads, IDs differ, restoring the anonymity. The IDs thus prevent impersonation and hidden self-replies, but only within a single chain of discussion. Additionally, full anonymity is not ensured between users and website administrators, who can identify the IP addresses associated with posts. These are sometimes handed over to authorities, according to 4chan’s FAQ, “to comply with court orders or to cooperate with law enforcement agencies when appropriate”.87 Still, from a user perspective, 4chan affords general anonymity since it is nigh impossible to trace the (pseudo-)identity of other contributors, and vice versa.

Boards

A set number of threads dedicated to a specific topic make up the third and largest building block of 4chan: the board. As noted, moot started with /b/ “Random”,88 for random anime discussion, and quickly added other anime-oriented boards like /c/ and /h/. As 4chan’s traffic grew over the years, so did the number of boards and their variation: at the time of writing, it houses 72 boards, with topics varying from music (/mu/) to fitness (/fit/) and origami (/po/). Currently, /pol/ “Politically Incorrect” is the most active board, followed by /v/ “Video Games” and /b/ “Random”.89 A list of all boards is the first view presented upon visiting 4chan. Clicking on one directs to the main overview page, called the index (fig. 48). The index forms the main entry point for browsing through a board’s active threads, usually presenting ten pages with 20 thread previews, which only show the OP and a maximum of five replies. Some boards feature “stickied” threads, which are threads pinned to the top of the board, usually to indicate rules or codes of conduct. For instance, /pol/ displays a sticky post warning about logical fallacies (which anons regularly commit, especially the strawman). From the index overview, threads can be opened in full on a separate page.

87 https://www.4chan.org/faq#personalinfo
88 The abbreviations like /b/ and /pol/ are derived from the board’s URL: /b/ can for instance be found on 4chan.org/b/

Figure 48: The index page of /pol/, showing its two stickied threads

The index is not only relevant because it is the main view for browsing threads; it also offers insight into 4chan’s thread sorting mechanisms. When refreshing the index, the order of threads shuffles: a thread can drop to a lower position at a moment’s notice and reappear higher up upon another refresh.90 Combined with the fact that identities are obscured, the thread sorting creates a contingent experience, an encounter with “a stranger in passing” (Knuttila, “User Unknown”).

89 Data derived from 4stats.io on 4 June 2016, 22:36.
90 The rate of thread position changes is dependent on the total post activity on a board: content will move faster on /pol/ than on /po/ “Papercraft and origami”.

However, the mechanisms driving this contingency are simple. The content of a board essentially consists of a list of threads that quickly change positions. A new, reply-less thread is positioned on top of the thread-list, meaning it appears at the top of the first page of the index. It is subsequently pushed down, because both new threads and older threads that receive a new reply are moved to the top. As such, the rate of replies determines how often a thread is “bumped” to the first position. When a thread drops below the last slot (usually 200), it is removed, or pruned. On some boards (like /b/), the thread is then immediately deleted from the server, while other boards (like /pol/) archive pruned threads for a few days before permanent removal. Since there is only a limited number of slots in the thread-list, a new thread always comes at the expense of another.91 Additionally, popular threads cannot survive eternally by gaining new replies: when a thread garners more replies than the allowed “bump limit” (300 replies on /pol/), it is no longer bumped on a new reply and drifts off to its inevitable purge. Fig. 49 shows a visual representation of the shuffling thread-list: the top pages show high variance, while the lower pages mostly consist of threads that are either no longer bumped because they reached the bump limit (red blocks) or do not provoke other visitors into replying (blue blocks).92 This sorting principle thus uses the reply as the “currency”93 of content visibility, in contrast to the usual affective metrics (e.g. upvotes, hearts or likes). It ensures both equality and creativity in content: new threads are given a chance at viewership, while the pruning mechanism encourages the creation (or reposting) of content instead of a reliance on older posts. The imageboard only “moves” when a contribution has been made that requires more input than a simple click. These posts can be simple, but arguably form a higher barrier of entry than “slacktivist” upvoting or liking.
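
The bumping and pruning mechanism can be illustrated with a toy model. The sketch below is not 4chan’s actual software: the class Board and its method names are hypothetical, the default values (200 slots, a bump limit of 300) are simply the figures mentioned in the text, and the sage flag models the option of replying without bumping that the accompanying footnote on replies as “currency” describes.

```python
from dataclasses import dataclass, field


@dataclass
class Board:
    """Toy model of a bump-ordered thread list (index 0 is the top slot)."""
    max_threads: int = 200  # number of slots ("usually 200" in the text)
    bump_limit: int = 300   # value given for /pol/ in the text
    order: list = field(default_factory=list)    # thread numbers, top first
    replies: dict = field(default_factory=dict)  # thread number -> reply count
    pruned: list = field(default_factory=list)   # threads pushed off the list

    def new_thread(self, no: int) -> None:
        """A new thread enters on top; the thread in the last slot may be pruned."""
        self.order.insert(0, no)
        self.replies[no] = 0
        while len(self.order) > self.max_threads:
            self.pruned.append(self.order.pop())

    def reply(self, no: int, sage: bool = False) -> None:
        """Register a reply and bump the thread unless sage or the bump limit applies."""
        if no not in self.order:
            return  # the thread has already been pruned
        self.replies[no] += 1
        if not sage and self.replies[no] <= self.bump_limit:
            self.order.remove(no)
            self.order.insert(0, no)
```

In this model a thread survives only as long as replies keep returning it to the top faster than new threads and other bumps push it down, which is the sense in which the reply acts as the unit of visibility.
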
While 4chan is often described as a lawless Wild West, it knows various forms of rules and content moderation. 4chan’s rules page features “global rules that apply to all boards unless otherwise noticed”, including prohibition of underage users, “anything that violates local or United States law”, calls for raids or , , advertising, impersonation, bots and proxies. /b/ receives special treatment in that it allows “trolls, flames, racism, off-topic replies, uncalled for catchphrases, macro image replies”. /pol/ also has its own rules, instructing users to “not attack other users” and “keep it civil”, and prohibiting pornography.94 Indeed, /pol/ is at least somewhat moderated: explicit pornographic or gory imagery is sparse, and Hine et al. found that 6% of posts on /pol/ were deleted during the summer of 2016. However, enforcement is lax, as users are often insulted, and certain calls for co-ordinated campaigns (raids) do occasionally appear (e.g. “Operation Google”, discussed in Hine et al.). The task of deleting inappropriate content is taken up by site moderators and janitors. Moderators have site-wide privileges like banning users, stickying posts and other administrative functions, while janitors are volunteers sitting between regular users and moderators, equipped with tools to remove rule-breaking content.

Apart from these rules, boards have other minor variances. Specific to /pol/, a country flag is appended to a post, based on the user’s IP address. Users can swap this geo-located flag for one from a pre-selected collection, such as Gadsden, Hippie or White Supremacist. Further, as mentioned above, /pol/ knows thread-specific IDs per poster. Finally, /pol/ contains an archive page, where pruned threads can be viewed for a few days.

91 This ephemerality also applies to activity within threads: only a thousand replies are allowed, with newer replies pruning the oldest from the thread listing. However, reaching this number of replies rarely occurs. Often, such massive threads are tagged as sticky by the site moderators, pinning them to the top of the first page.
92 The positions of threads were queried with a custom script using the 4chan API (script available on GitHub). This was done every two minutes from 22:20 to 23:15 on 4 April 2018. These time-separated positions were then placed chronologically in columns with Bernhard Rieder’s RankFlow. The height and red hue of a block (i.e. a thread) denote how many comments the thread received. See oilab.eu/rendering-legible-the-ephemerality-of-4chanpol/
93 Replies further work as “currency” because of two other factors. Firstly, a poster can input the word “sage” in the Options field before posting, which allows replying without bumping the thread. As such, sage-posts can be used tactically to make a thread reach its bump limit quicker, without increasing its visibility. Secondly, boards know timeouts for posting (at the time of writing, /pol/ has timeouts of a minute between creating threads and fifteen seconds between replies, see http://a.4cdn.org/boards.json), meaning a single poster can only bump a limited number of times.
94 See 4chan.org/rules

Figure 49: The positions of all /pol/ threads per two minutes, from 22:20 to 23:15 on April 4th 2018. The height and red hue of a block (i.e. a thread) denote how many replies the thread received (the redder, the more replies).

II Scoping the amount of users on /pol/

This appendix provides some metrics on the user data on /pol/ and speculates on how many individual users are active. 4chan itself gives insight into some website-wide statistics. Its “Advertise” page lists that the entire website receives 27.7 million unique visitors per month, 70% male and 30% female, aged between 16 and 34. Half of the visitors are located in the United States and 19% in other English-speaking countries. It states that 900,000 to a million posts are made on the entire imageboard per day.

Figure 50: Statistics provided by 4chan.org/advertise

Comparing /pol/ with demographic data from other Internet platforms can put these numbers into some perspective. While it is not quite apples to apples, the content aggregator Reddit can form a comparison. For February 2018, a month with average activity, /pol/ only makes up 4.1% of the size of Reddit in terms of total post activity. This is not a fair comparison, however, since it compares a full website to a subsection (a board). Alternatively, one can compare all the comments in February 2018 of the subreddit r/The_Donald, often linked to /pol/ for their joint support of Trump. This shows that the subreddit only generated 24.6% of the number of posts on /pol/. Even r/politics, the largest political subreddit, only makes up 50% of all 4chan/pol/ comments in that month. Unfortunately, the statistics provided by 4chan itself are site-wide and not specific to /pol/, and it is unclear how accurate or recent they are.

One can make some further guesses about the number of humans on /pol/. Fig. 51 takes February 2018, a fairly average month on /pol/ in terms of post activity, and plots how many active anons are needed, for a given average number of daily posts per anon, to reach the 3.47 million posts made in that month. If every anon would place, on average, one post every other day, 250,000 users would have to be active in February on /pol/. In the unlikely event that anons on average place one post per day, without daily exceptions, /pol/ would consist of around 125,000 active users, and five posts per day sees the user total drop to 25,000. Note that this excludes lurkers (i.e. frequenters who do not post).

Figure 51: The number of active (i.e. posting) anons needed to reach the monthly post total of February 2018.
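
The estimate behind fig. 51 amounts to dividing the monthly post total by an assumed posting rate. A minimal sketch in Python, in which the function name anons_needed is introduced here for illustration and the post total is the February 2018 figure listed in Table 9:

```python
def anons_needed(posts_in_month: int, posts_per_anon_per_day: float, days_in_month: int) -> int:
    """Posting anons implied by a monthly post total and an assumed average posting rate."""
    return round(posts_in_month / (posts_per_anon_per_day * days_in_month))


FEB_2018_POSTS = 3_468_140  # /pol/ posts in February 2018 (Table 9)
DAYS_IN_FEBRUARY = 28

for rate in (0.5, 1, 5):  # assumed average posts per anon per day
    print(rate, anons_needed(FEB_2018_POSTS, rate, DAYS_IN_FEBRUARY))
# 0.5 247724
# 1 123862
# 5 24772
```

These outcomes correspond to the rounded figures of 250,000, 125,000 and 25,000 users given in the text.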

The comparison with Reddit can further aid in speculating on the total number of active users on /pol/. In February 2018, on r/politics and r/The_Donald, individual Reddit accounts posted respectively 0.45 and 0.67 comments per day (excluding as many bots as possible, see SQL code underneath). If these numbers are in any way reflective of the activity on /pol/, fig. 51 indicates that between 175,000 and 250,000 “unique anons” would be active in February 2018. Judging purely by the activity on /pol/, it seems far-fetched that the board had any impact on the 2016 US elections. If one assumes the average anon posts 0.5 posts a day (implying a high number of anons), and all of these active anons in November 2016 (367,676 individuals) were somehow eligible to vote in the U.S., and all of them actually voted for Trump, this would only generate 0.58% of the total 63 million votes he received. Even adding a fair share of lurkers going to the ballot, a total political mobilisation of anons would have had, in /pol/’s most favourable case, a minimal impact on the elections.
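
The arithmetic of this best-case scenario can be retraced in a few lines. The sketch below only restates the assumptions made in the text (0.5 posts per anon per day, every active anon voting for Trump, a rounded total of 63 million votes); the variable names are introduced here for illustration.

```python
POSTS_NOV_2016 = 5_515_133     # /pol/ posts in November 2016 (Table 9)
DAYS_IN_NOVEMBER = 30
ASSUMED_POSTS_PER_DAY = 0.5    # assumed average posting rate per anon
TRUMP_VOTES = 63_000_000       # rounded vote total used in the text

# Active anons implied by the post total and the assumed posting rate.
active_anons = round(POSTS_NOV_2016 / (ASSUMED_POSTS_PER_DAY * DAYS_IN_NOVEMBER))

# Share of the vote total if every one of them had voted for Trump.
vote_share_pct = 100 * active_anons / TRUMP_VOTES

print(active_anons)              # 367676
print(round(vote_share_pct, 2))  # 0.58
```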

Month      /pol/ posts
11-2016    5,515,133
02-2018    3,468,140

Month      All Reddit comments     Unique accounts    Average posts per account per day
11-2016    69,971,099              3,305,235          0.7
02-2018    84,474,752              4,277,589          0.71

Month      r/politics comments     Unique accounts    Average posts per account per day
11-2016    2,598,981               173,712            0.5
02-2018    1,727,689               135,874            0.45

Month      r/The_Donald comments   Unique accounts    Average posts per account per day
11-2016    2,200,613               104,829            0.7
02-2018    853,571                 44,842             0.67

Table 9: User statistics on /pol/ and Reddit, for comparison

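Generated through Google BigQuery (legacy SQL syntax). The query below counts the comments posted to r/politics in February 2018, excluding accounts whose names end in “bot” or start with “auto”:

    SELECT COUNT(*) AS count
    FROM [fh-bigquery:reddit_comments.2018_02]
    WHERE LOWER(author) NOT LIKE '%bot'
      AND LOWER(author) NOT LIKE 'auto%'
      AND subreddit = 'politics'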

Thanks to Jason Baumgartner of pushshift.io and Felipe Hoffa.

III Database column headers

Column headers in the /pol/ posts csv dump from archive.4plebs.org.

Column header        Description
num                  Unique post number
subnum               Board number (same in full dataset, /pol/ only)
thread_num           Unique thread number
op                   Whether the post is an opening post
timestamp            UNIX timestamp of moment of posting
timestamp_expired    Timestamp of the moment of thread removal
preview_orig         Thumbnail of media file
preview_w            Width of thumbnail
preview_h            Height of thumbnail
media_filename       Name of media file
media_w              Width of media file
media_h              Height of media file
media_size           Size in bytes of media file
media_hash           Hash code of media file
media_orig           The media file
spoiler              Whether a post is marked as a spoiler (irrelevant for /pol/)
deleted              Whether a post was deleted
capcode              Country code of poster
email                Email of poster (irrelevant)
name                 Name of poster (i.e. Anonymous)
trip                 Tripcode of poster
title                Subject title of post
comment              The post body
sticky               Whether the post is stickied
locked               Whether the thread is closed
poster_hash          The hash for the poster (ID)
poster_country       The country flag of the poster
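
As an illustration of how such a dump can be sliced per week, the sketch below counts posts per ISO week from the timestamp column. It rests on assumptions not confirmed by the text: that the CSV file carries a header row with the column names above, and that the timestamps can be read as UTC. The function name posts_per_week and the three-row sample are made up for the example.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone


def posts_per_week(csv_file) -> Counter:
    """Count posts per ISO (year, week) using the 'timestamp' column."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Assumption: timestamps are interpreted as UTC.
        moment = datetime.fromtimestamp(int(row["timestamp"]), tz=timezone.utc)
        year, week, _ = moment.isocalendar()
        counts[(year, week)] += 1
    return counts


# Made-up sample containing only three of the columns listed above.
sample = io.StringIO(
    "num,timestamp,comment\n"
    "1,1446422400,first\n"
    "2,1446508800,second\n"
    "3,1447027200,third\n"
)
```

Calling posts_per_week(sample) groups the first two rows under week 45 of 2015 and the third under week 46; an actual dump would be passed as an opened file object instead.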

IV Frequency charts

[Figures: frequency charts for the following terms: emperor; energy; goyim; jew; kike; magic; nigger; pepe; “Trump general” in the title (/tg/ Trump General and /ptg/ President Trump General); stump.]