“Automated Meme Magic: An Exploration into the Implementations and Imaginings of Bots on Reddit”

Jonathan Murthy | 2018

Acknowledgments

There are several people I would like to acknowledge and thank for their support, encouragement, and insights. First, I would like to thank all the professors and students I have had the pleasure of working with over the course of this program. Marc Tuters, my thesis supervisor, who encouraged me to pursue this research. Sal Hagen, for writing various SQL scripts which helped me find interesting data. My mom, for the lifelong love and support she has given me over the years. To my sisters and brother, who encourage me to follow my dreams. To my father, who is no longer with us, for the unconditional pride he had in me. And to Laura, for talking to me every day, for challenging me, loving me, being patient with me, and for the growth I have experienced because of you.

Abstract

Allegations of bots being used as deceptive, persuasive, manipulative, unseen, networked machines, seeded inside digital environments to control, guide, subvert, or otherwise alter public discourse, are a prevalent topic in new media, political science, human-computer interaction, journalism, computational propaganda, science and technology studies, and many other areas of interest. Recent instances in which bots have caused alarm are typically situated around political elections, but have also been seen in areas related to cryptocurrencies. The effects of bots are most commonly seen on social networking sites, where they are capable of exploiting homophilic algorithms and directing content toward particular groups of people. Essentially, visibility is a means of shifting normative discourse, and both maintaining popularity and controlling the circulation of a particular piece of information are susceptible to manipulation. Visibility is also directly linked to profit.
This thesis will attempt to present the history of bot research, classifications for different bots which display specific attributes, and background on the content aggregator site Reddit.com. A first focal point of this thesis will be the use of social bots, political bots, and Search Engine Optimization (SEO) strategies, which entail the use of marketing techniques that seem at odds with Reddit’s behavioral policy. The second focus will revolve around the particularities of Reddit and how bots are used on the site.
Discussion points will focus on a taxonomy of the observed bots on Reddit, comments and SEO models, Reddit’s culture and internal governance, and the political, economic, and cultural implications of bot and bot-like activity.
Reddit has received less research attention as a site of investigation. Hopefully, this thesis will act as a stepping stone toward further research on an increasingly prevalent online environment and topic.

1 Introduction

Is it possible that I did not write this thesis? I, Jonathan Murthy? Or, perhaps, is it possible that someone else wrote it? Is it possible that something else wrote it? If you had an infinite number of monkeys typing at an infinite number of typewriters, typing words at random, could they not write this thesis? Could my digital profile be fabricated and used to gain credibility? Could my style of writing be derived from a corpus of previously consumed works around a particular area of interest in order to mimic natural, knowledgeable language? How much would it take to convince you that I am a human being presenting credible information? In thinking about these questions (despite their hyperbolic nature), we can then ask, ‘what is required, technically speaking, to mediate exploitable abstractions between you and me’, and (perhaps more importantly) ‘why would I do this?’ Despite these ponderings, sowing seeds of doubt about my own and this thesis’ authenticity is not what this thesis is about; rather, these questions act as an image through which to enter a world of algorithmically mediated communication, automation, and online identities. This world is then compounded by the circulation of misinformation, fake news, visibility manipulation, directed marketing, and other issues which concern public discourse around digital media. While there are other factors that can contribute to these same issues, I will be focusing on what are colloquially referred to as ‘bots’ and ‘botnets’.
But what exactly are bots and what are they capable of doing? In computing, a bot is “an autonomous program on a network (especially the Internet) that can interact with computer systems or users, especially one designed to respond or behave like a player in an adventure game” (Google.com). The term ‘bot’ is a shortening of the word robot, derived from the Czech ‘robota’ meaning “forced labor” (Google.com), and this seems to be in reference to the programmability of bots (Geiger, 2014). These three notions (that bots interact with both humans and computers, that they are designed to mimic human behavior, and that they can automate tasks) make for a precarious state of affairs regarding what we see online and what information gets circulated (Woolley and Howard, 2016; Howard et al., 2017; Forelle, 2015). Various industries and institutions recognize this reliance on automated tasks within networked systems, and there seems to be a growing demand in every market for the ability to automate data-heavy tasks (Geiger, 2014). This translates into how visibility, virality, and amounts of engagement are transformed into profit through the marketing practices described as Search Engine Optimization (SEO) (Heder, 2018). The reasons for deploying bots are not solely economic, however, as there are social, political, experimental, and other reasons for their use. According to the 2016 Incapsula Bot Traffic report, bots make up 51.8% of internet traffic, where 22.9% of total traffic is classified as “good” bots and 28.9% as “bad” bots (www.incapsula.com). But what determines the moral signifier of ‘bad’ or ‘good’?
This thesis will attempt to present, to the best of my knowledge, the ways in which bots are implemented and deployed in online environments. Information regarding bots on Twitter and Facebook will be presented, and I will contribute original research involving bot activity on Reddit. Allegations of bots being used to control and manipulate normative discourse and circulate misinformation are a prevalent topic in many different areas of research and study (Woolley and Howard, 2017). A majority of that research has focused on Facebook and Twitter, especially when we consider events like the 2016 US Presidential Election, where Donald Trump relied heavily on an online campaign and the use of social media (politico.com). Other online spaces, such as Reddit, have not been given the same amount of attention, despite hosting what seems to be equally dubious activity and even what many consider to be the largest gathering space of Trump supporters, the subreddit /r/The_Donald (Zannetou, 2017; qz.com; thehill.com). Reddit has also been observed to have a techno-libertarian sentiment, a particular culture of self-governance, a controversial history, and what many consider to be an easily manipulable voting system, all of which make it an interesting point of observation (Massanari, 2015). For these reasons, I have chosen to focus on bot activity on Reddit in order to explore its particularities and bring an area of study to a less publicized site of observation. The rest of this introduction will cover some of the research questions and objectives of this thesis, as well as an outline of the following chapters and sections.

1.2 Research Questions

While investigations into various methods of manipulation on Facebook and Twitter are more prolific, other sites that have equally, if not more, precarious policies and governance systems are overlooked. The research presented in this thesis is investigative and exploratory in nature, first attempting to answer the question, ‘how do bots operate on Reddit as opposed to Facebook and Twitter?’ This breaks down into the capabilities, functionalities, and imaginings of bots in a broad view. From there we will move into the particularities of how Reddit as a site functions both technically and culturally. What is it about Reddit and bots in general that allows them to operate in the way that they do? What are the implications of their activity and what can be done to lessen harmful abuse? I will now present brief summaries of the theoretical reasoning for choosing Reddit as a site of observation and bots as an object of investigation. Longer, more in-depth backgrounds will be given on both Reddit and bots in later chapters.

1.2.1 Why Reddit

While Reddit does not seem to receive as much attention as social networking sites such as Facebook and Twitter, it is still the 4th most visited site in the U.S. and the 6th most visited site worldwide, with 58.5% of traffic coming from the U.S. followed by 7.6% from the UK and 6.1% from Canada (www.alexa.com). Reddit is also mired in controversial events, including but not limited to Pizzagate, Gamergate, and The Fappening (thenewyorker.com; Massanari, 2015).
Pizzagate consists of a conspiracy theory accusing the Clintons of being part of a child sex trafficking operation centered around a Washington D.C. pizzeria by the name of Comet Ping Pong. The theory developed on 4chan’s /pol/ board after leaked emails from the Clinton campaign circulated, and it eventually made its way onto /r/The_Donald in November of 2016, leading up to the US presidential election (Malmgren, 2017; digitalmethods.net). The event culminated when an unidentified man in his late 20s opened fire on the pizzeria, after which it was revealed that the theory was false (snopes.com). This event demonstrated how the spreading of misinformation can have violent implications, and it prompted examination of the role anonymous and pseudonymous sites like 4chan and Reddit play in the spread of misinformation. Reddit in particular is considered to have a playful and facetious but scientific and logical sentiment, where information is shared quite virally, but can be erroneous, antagonistic, and even unlawful (Massanari, 2015; Milner, 2015).

Gamergate refers to the harassment and alienation of women in the video games industry, which initially began as the harassment of Zoe Quinn but took on a sentiment of misogyny and sexism underneath a growing distrust in video game journalism and the industry as a whole (Massanari, 2015). The Fappening refers to the circulation of private celebrity photos which were stolen from Apple’s iCloud service (Massanari, 2015). Those photos circulated on Reddit with such a high amount of activity that one subreddit moderator described “insane traffic” due to the hack (Massanari, 2015). Both Gamergate and The Fappening will be elaborated on in a later chapter as well, but what these events demonstrate is a proclivity toward spreading information with an ideological identity, one most closely associated with the American alt-right. All three of these events had dedicated subreddits where they were discussed and where information particular to the event was shared. These subreddits have since been banned, but only after a substantial amount of time.
/r/The_Donald is considered to be the largest pro-Trump online community and, along with other subreddits such as /r/bitcoin, has been accused of vote brigading and vote nudging, a phenomenon where a group of individuals vote in a particular way in order to gain visibility on Reddit’s message board (gizmodo.com; medium.com). Vote brigading is strictly against Reddit’s policy, while vote nudging is a more common practice which exploits how Reddit weighs earlier votes against later votes, giving content which is voted on favorably early a longer lifespan (reddit.com). Through this exploit, a piece of content’s lifespan can increase exponentially and garner a great deal of attention. The /r/The_Donald subreddit has also been at odds with the Reddit administration for some time, with a tenuous relationship regarding free speech, hate speech, censorship, and authenticity hanging in the balance. Many calls for the banning of /r/The_Donald have circulated, but Steve Huffman (/u/spez), current CEO of Reddit, has been reluctant to levy the punishment despite having banned other problematic subreddits (vox.com). /u/spez has also received a large degree of hate from /r/The_Donald’s community over censorship, the suspension of accounts, the purposeful downvoting of /r/The_Donald’s content, and the editing of user comments. This last event involved a script which swapped /u/spez’s account name with the name of the person making the comment. This resulted in anger directed toward /u/spez being redirected to the person who first posted, but the event left a bad impression on an already tenuous relationship. The Reddit administrative team has changed some of their interface and homepage to include /r/popular alongside the previous /r/all page in order to satisfy both the calls for removing /r/The_Donald and /r/The_Donald’s claims of censorship. Other filtering options like “best” and “hot” have been implemented for similar reasons, in order to limit or better curate the content that rises to the top of other pages (reddit.com). More on Reddit’s background, history, and structure will be presented in a later chapter.
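The weighting of earlier votes against later votes mentioned in the discussion of vote nudging can be illustrated with a short sketch. It is loosely modeled on the "hot" ranking function found in Reddit's formerly open-sourced codebase; the function name `hot_score`, the two constants, and the sample timestamp are assumptions made for illustration, and Reddit's current ranking may well differ.

```python
from math import log10

# Both constants are assumptions taken from the historical open-source code,
# not a statement about how Reddit ranks content today.
EPOCH_OFFSET = 1134028003  # reference timestamp subtracted from the post time
DECAY_SECONDS = 45000      # 12.5 hours of recency is worth one order of magnitude of votes


def hot_score(ups: int, downs: int, posted_at: float) -> float:
    """Illustrative ranking score: log-scaled net votes plus a recency bonus."""
    net = ups - downs
    # Votes are log-scaled, so each tenfold increase adds the same amount.
    magnitude = log10(max(abs(net), 1))
    sign = 1 if net > 0 else -1 if net < 0 else 0
    # Newer posts start with a higher baseline than older ones.
    recency = (posted_at - EPOCH_OFFSET) / DECAY_SECONDS
    return round(sign * magnitude + recency, 7)


if __name__ == "__main__":
    posted_at = 1_500_000_000.0  # hypothetical Unix timestamp
    for votes in (1, 10, 100, 1000):
        print(votes, hot_score(votes, 0, posted_at))
```

Under these assumptions, moving from 1 to 10 net upvotes raises the score by as much as moving from 10 to 100, which is one way a small coordinated group voting early could have a disproportionate effect on visibility.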
These examples demonstrate the technical aspects of Reddit’s infrastructure and cultural sentiment, how information circulates, and how content can be controlled. Reddit has a history of controversy and seems to be an intermediary space between mainstream media sites and the darker corners of the web such as 4chan (thehill.com; Malmgren, 2017). But it is important to understand how events similar to these occur on Reddit, and what functionalities are vulnerable to exploitation. How do Reddit’s ranking algorithms, technical nature, and culture contribute to and influence the way information is shared? This leads me into the next aspect of this thesis: the role of computation and automation (i.e. bots).

1.2.2 Bots

In order to begin exploring the operationalization of bots in digital spaces, a distinction should be made here about what bots are, what they do, what they are capable of, and where they are placed in the discussion over social media, misinformation, automation, data, and networked activity. Bots are capable of performing infrastructural tasks. These functional bots are helpful in their imaginings. Some of the ways in which bots are used include scraping, crawling, or cleaning websites, capturing and performing web analytics, assisting in customer services (Geiger, 2014), assisting in data research, and many other administrative tasks. But given their useful capacities and the right environment, their abuse seems to follow almost inevitably, especially in cultures where the hacking of systems is a pillar of that culture.
This is where the notion of harmful or malicious bots can begin to be explored within the context of Reddit. The use of automated systems on social networking sites seems to be responsible for purposefully swaying, manipulating, guiding, drowning, or altering the public discourse of political elections online (Woolley, 2017). Even functional bots have been seen inadvertently spreading misinformation during the Boston Marathon Bombing (Cassa et al., 2013). Those bots were considered benign, and their spreading of false information about suspects was more a product of a faulty architecture. Similarly, Microsoft’s experimental Twitter chatbot, Tay, after digesting a corpus of tweets from other users, began tweeting hateful, racist, misogynistic, pro-Hitler content (Neff and Nagy, 2016). More deliberate is the bot which attempts to hide its identity in order to astroturf and present itself as contributing to a normative discussion. Social bots are by definition deceptive, seeming human while in actuality being automated (Ferrara et al., 2014; Boshmaf, 2011). When social bots take on a specifically political sentiment they are considered to be political bots. Political bots have been seen to populate online spaces with partisan ideologies governing what kinds of things they post (Howard and Woolley, 2016). Numerous investigations into the influence of political bots have been undertaken in the wake of the U.S. Presidential election, as well as in other countries around the world, to gauge the extent to which their environments were infiltrated (Forelle et al., 2017; Howard et al., 2017; Schafter et al., 2017; Woolley, 2017).
Bots are not just capable of circulating misinformation, but are also capable of reinforcing cultural sentiments. As Lawrence Lessig argues in his book Code and Other Laws of Cyberspace, code acts as an infrastructure that can enforce social norms and governmental rules in a way that is omnipresent, omnitemporal, and automated (Lessig, 1999). Because an automated process can run indefinitely, its presence can allow a normative discourse to emerge that is encouraged by that process. We will see examples of this in a later chapter on /r/The_Donald.