
“We Believe in Free Expression...”: Reverse-Engineering Twitter’s Content Removal Policies for the Alt-Right


Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:38811534


Contents

The Problem & The Motivation
    Free Speech: Before and After the Internet
    Speech on Twitter
    Defining the Alt-Right
    The Alt-Right on Social Media
    Social Media Reaction to Charlottesville
    Twitter’s Policies for the Alt-Right
    Previous Work
    Structure of this Thesis
Data Collection Methodology
    Looking to King and Berger’s Research
        i. Preliminary Findings
        ii. Pivot to Seed Accounts
    Overarching Data Collection Pipeline
    Technological Constraints
    Data Collection Pipeline in Detail
    Data Collected
Analysis & Results
    Getting a Sense of the Alt-Right on Twitter
    Removed versus Active Users
    Successful Policy Implementations
        1. Promotion of the Alt-Right via Propaganda
        2. Representation of the Alt-Right
        3. Maintaining Free Expression
    Failure to Meet Policies
        1. Overreach: Awareness of Suspension
        2. Violation: Failure to Remove Popular Alt-Right Profiles
        3. Violation: Failure to Stop Recruitment
    Summary of Findings
Conclusion & Future Research
    Speculation of Political Motivation
    Twitter Removal of Antifa
    Further Research
    In Conclusion: The Future of Speech on Twitter
Bibliography


1 The Problem & The Motivation

On November 2, 2017, at approximately seven p.m. Eastern Standard Time, President Donald Trump’s Twitter account was deactivated. People searching for @realDonaldTrump were greeted with a banner: “Sorry, this page doesn’t exist!”2

A mere eleven minutes later, the Commander-in-Chief’s page was reactivated.3 It might have seemed like a comical fluke – and, for many, a relief – but this removal was the work of a disgruntled Twitter employee on his last day of work.4

Given that many argue that Twitter offers direct access to the President’s voice, it is notable that a single employee was able to remove his account.5 This event raised a host of questions about Twitter’s content removal process, many of which went largely unanswered.6 The problem is not Twitter’s alone; for most social media companies, despite published content policies, there are no statistics or publicly available reasoning regarding removal. Given that global society is increasingly reliant on these platforms to communicate and connect, this lack of transparency in content regulation is unsettling. How did social media platforms like Twitter evolve to a point where unchecked content removal is not unusual, and why is it important that these companies clearly define criteria and precedent for the content they remove?

2 Maggie Astor, “Rogue Twitter Employee Briefly Shuts Down Trump’s Account,” New York Times, November 2, 2017, https://www.nytimes.com/2017/11/02/us/politics/trump-twitter-deleted.html. 3 Ibid. 4 Ibid. 5 Jameel Jaffer, “Government Secrecy in the Age of Information Overload,” Salant Lecture, Shorenstein Center, Cambridge, MA, October 17, 2017.

Free Speech: Before and After the Internet

Free speech is deeply rooted in America’s identity, as evidenced by its enshrinement in the First Amendment of the Constitution.7 However, given that companies like Google, Facebook, and Twitter are private corporations, they are under no obligation to follow this model of free speech; they have the right to determine what content should stay or be removed on their platforms. To better understand the effects of these companies on public discourse, this thesis begins by tracing the notion of free speech from its origins two thousand years ago.

The concept of free speech is intertwined with the concept of privacy, which was first articulated by the Greek philosopher Aristotle. Aristotle envisioned privacy as a separation of the public life of the agora from the private life of the home.8,9 According to Aristotle, in the agora, the public space, it is assumed that whatever one says can be heard by others, whereas in the home, one’s words are assumed to be private.

6 Astor, “Rogue Twitter Employee.” 7 The First Amendment: “Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the government for a redress of grievances.”

However, new technologies altered this distinction between public and private life, intertwining the notions of privacy and free speech. In 1888, the invention of the Kodak camera sparked a great deal of panic. Whereas photographs had previously required the subject to sit still to be captured, the Kodak camera allowed instant photography. Photographers began taking pictures of people in public, going about their everyday lives, without their permission.10

Soon, concern grew about the potential risk to personal lives and reputations. This prompted Louis Brandeis, later a Supreme Court Justice, to co-author the famous essay “The Right to Privacy,” which argued that, unlike European law, American constitutional law contains no protection against speech (or photography) that results in an “offense against honor.”11 In his concurring opinion in Whitney v. California (1927), Brandeis clarified his understanding of free speech by referencing Thomas Jefferson and the revolutionaries of 1776:

[Those who won our independence] knew that order cannot be secured merely through fear of punishment for its infraction; that it is hazardous to discourage thought, hope, and imagination; that fear breeds repression; that repression breeds hate; that hate menaces stable government; that the path of safety lies in the opportunity to discuss freely supposed grievances and proposed remedies; and that the fitting remedy for evil counsels is good ones.12

8 Judith DeCew, “Privacy,” Stanford Encyclopedia of Philosophy, 14 May 2002, https://plato.stanford.edu/entries/privacy/. 9 Agora: from ancient Greece, a public space used for assemblies and markets. 10 Steve Lohr, “With 'Brandeis' Project, Darpa Seeks to Advance Privacy Technology,” New York Times, September 14, 2015, https://bits.blogs.nytimes.com/2015/09/14/with-brandeis-project-darpa-seeks-to-advance-privacy-technology/. 11 Samuel Warren and Louis Brandeis, “The Right to Privacy,” Harvard Law Review, vol. 4, no. 5 (1890): pp. 193–220.

Jefferson’s understanding of free discourse was inspired by John Milton’s century-old concept of the marketplace of ideas: through the open discussion of opinions and thoughts, the best idea will “win.”13

For this reason, the Supreme Court has continually ruled in favor of free expression in controversial cases of hate speech. In Brandenburg v. Ohio (1969), which focused on the legality of a Ku Klux Klan rally, the Court ruled that the rally was protected under the right to free speech. The case determined that only speech that is “directed at inciting or producing imminent lawless action” and “likely to incite or produce such action” falls outside the protection of the First Amendment.14 More recently, the ruling in Snyder v. Phelps (2011) declared that the Westboro Baptist Church had a right to protest at a fallen soldier’s funeral. According to the Court, their actions, while offensive, could not be ruled unprotected on the grounds of emotional distress, given that the protesters were in a public setting, speaking about public matters.15

Although Brandeis seemed to have shaped the definitive American view of free speech and privacy, as new technologies entered the market, the basic ideas of the public space and speech became further distorted. When the telephone was invented, conversations over the phone were at first assumed to have the privacy equivalent of a public space, given the presence of operators who checked for sound quality. But when the technology advanced and operators became obsolete, phone conversations came to be considered private; so much so that in Katz v. United States (1967), the Supreme Court ruled that Katz had a reasonable expectation of privacy in the payphone booth because he had closed its door.16

12 Whitney v. California, 274 U.S. 357 (1927). 13 Brian Miller, “There’s No Need to Compel Speech. The Marketplace of Ideas Is Working,” Forbes, December 4, 2017, https://www.forbes.com/sites/briankmiller/2017/12/04/theres-no-need-to-compel-speech-the-marketplace-of-ideas-is-working/. 14 Brandenburg v. Ohio, 395 U.S. 444 (1969). 15 Snyder v. Phelps, 562 U.S. 443 (2011).

Just as the changing technology of phone lines required a reconsideration of the distinction between public and private spaces, so too does the development of the Internet. Given that seven out of ten Americans use social media, perhaps we should consider the Internet to be an extension of Aristotle’s agora.17 However, as aforementioned, the First Amendment has no legal claim over social media sites, given that they are private companies. Yet, the very nature of the First Amendment, which states the limits of government interference in public speech, implies a moral claim over these platforms’ treatment of speech. Operating under this framework, speech on social media platforms should be considered under the same constitutional standards as in-person speech. Or, at the very minimum, users should have a sense of precedent for what is being removed and why.

16 Katz v. United States, 389 U.S. 347, (1967). 17 “Social Media Fact Sheet,” Pew Research, last modified January 12, 2017, http://www.pewinternet.org/fact-sheet/social-media/.


Despite the fact that social media platforms have no legal requirement to comply with the speech standards set out in the Constitution, economic incentives blur the line between private company independence and government interests.

Nicole Wong, former deputy general counsel at Google, was famously nicknamed “The Decider,” as she was the employee woken up in the middle of the night to make removal decisions about content on Google’s platform with respect to different countries with different laws and cultural standards.18 Given the immediacy of social media, these decisions have to be made quickly. If the wrong decision is made, the company could incur significant fines from the aggrieved country. For example, if Google does not remove content that is requested through the European Union’s Right to be Forgotten, and the European Privacy Commissioner disagrees, Google could be fined up to two percent of its earnings.19 This financial pressure may be the motivating factor behind Google’s compliance with forty-three percent of the takedown requests it receives.20 While, from a Western perspective, we might respect the fact that Google complies with European standards, this economic incentive becomes dangerous when the site is pressured to remove content by a nation that might not have the same ethical understandings. For example, if Turkey requested the removal of opposition to President Erdoğan from Google Turkey, there would be less certainty as to the moral righteousness of the company’s actions.

18 Jeffrey Rosen, “The Deciders: The Future of Free Speech in a Digital World,” Salant Lecture, Shorenstein Center, Cambridge, MA, October 13, 2016. 19 Ibid. 20 Ibid.


Beyond the laws and wishes of government bodies, social media sites have a further economic incentive to remove disturbing or harassing content. These sites generate revenue from advertisements and the number of clicks, so there is direct motivation to take down unpleasant or upsetting content that could result in a loss of users. For example, in July 2016, after Philando Castile was shot by a police officer, his girlfriend began streaming to Facebook Live. Facebook removed the video, but then quickly reinstated it, claiming that a technical glitch was at fault.21 While the company perpetuated this story of a “technical glitch,” it seems clear to the informed observer that Facebook removed the video due to its violation of published content policies, which prohibit graphic content. Upon reconsideration of the nature of the video, which served to spread awareness of police brutality, Facebook reinstated it. In this instance, the company preserved the content; but it would be easy for Facebook to make an inconsistent decision.

For decisions that do not require “The Decider” or an equivalent body, most social media companies contract a group of content reviewers (whose group size, educational background, and exact role are unclear) to remove content based on removal policies decided through “internal legislative meetings.”22 Yet, social media sites receive millions of removal requests per day – and if individual content reviewers are deciding what stays and what goes, there is bound to be variation.23

21 Ibid. 22 Monika Bickert, “The Line Between Hate and Debate,” Berkman Klein Center for Internet & Society, Cambridge, MA, September 19, 2017. 23 Rosen, “The Deciders.”


Speech on Twitter

It is clear that all social media companies grapple with the issue of somewhat opaque content review and removal on their platforms, but this thesis focuses specifically on Twitter due to its unique history. When it was first founded, the company prided itself on its model of free speech, calling itself “the free speech wing of the free speech party.”24 Twitter stated that it would only remove content that directly incited violence, which parallels the constitutional model. However, by 2016, this approach to free speech proved to be at odds with its business goals. High-publicity events, including the harassment of actress Leslie Jones, spearheaded by notorious Breitbart journalist Milo Yiannopoulos, resulted in a loss of users from the platform.25 For the sake of its business, Twitter changed its policy to crack down on abuse, hate speech, and harassment.26 As Twitter CEO Jack Dorsey said of the platform’s original approach, “We didn't fully predict or understand the real-world negative consequences. We acknowledge that now, and are determined to find holistic and fair solutions.”27 Yet, published content policies continue to be murky. On the “Twitter Rules” site, more than ten policies are listed, but many are redundant and difficult to distinguish. For example, both the “Hateful Conduct” policy and the “Violent Threats and Glorification of Violence” policy state that users may not make threats or promote violence against any individual on the basis of a protected category (e.g., gender, race, religion, origin).28 Very few categories list specific actions that result in a user’s removal.29

24 Josh Halliday, “Twitter's Tony Wang: ‘We Are the Free Speech Wing of the Free Speech Party,’” The Guardian, March 22, 2012, https://www.theguardian.com/media/2012/mar/22/twitter-tony-wang-free-speech. 25 Rosen, “The Deciders.” 26 Ibid. 27 Kevin Roose, “The Young and Brash of Tech Grow a Bit Older, and Wiser,” New York Times, March 14, 2018, https://www.nytimes.com/2018/03/14/technology/tech-leaders-growing-up.html.

Twitter’s relationship with free speech is complicated by its status as an open platform for many social movements. During the 2011 Arab Spring, Twitter was used to quickly spread information about protests. The 2011 Occupy Wall Street protests began with astonishing speed, also largely due to organization over Twitter.30 But in more recent years, the darker potential of social media platforms has become apparent. Notably, ISIS began to use Twitter for recruitment and to spread their message; however, Twitter has taken publicly announced steps to remove ISIS content, removing almost 400,000 terror-related accounts according to a 2017 announcement.31

Within the last year, the alt-right movement has gained momentum and implanted itself on many social media platforms. These platforms, including Twitter, have been forced to consider how to handle the presence of this population. Before describing the various decisions made by companies to address the alt-right, it is key to understand the movement as a whole, in addition to how its members interact with social media.

28 “The Twitter Rules,” Twitter, accessed November 4, 2017, https://help.twitter.com/en/rules-and-policies/twitter-rules. 29 Note: the basis of policy comparison for the alt-right will be discussed in a later subsection of this chapter. 30 “#Democracy on Fire: Twitter, Social Movements, and the Future of Dissent,” The Institute of Politics, Harvard University, Cambridge, MA, October 27, 2017. 31 Selena Larson, “Twitter Suspends 377,000 Accounts for pro-Terrorism Content,” CNN, March 21, 2017, http://money.cnn.com/2017/03/21/technology/twitter-bans-terrorism-accounts/.


Defining the Alt-Right

Defining the alt-right is a challenge; the definition changes depending on the source. After the destructive August 2017 alt-right rally in Charlottesville, the Associated Press (AP) published a statement explaining that the use of the term “alt-right” is simply a means of disguising racism and populism.32 Their official definition states that the term is “a name currently embraced by some white supremacists and white nationalists to refer to themselves and their ideology, which emphasizes preserving and protecting the white race in the United States in addition to, or over, other traditional conservative positions.”33 The AP distinguishes white nationalists from white supremacists; the former want special political and territorial guarantees for white people, while the latter simply believe that whites should dominate other races. Additionally, the group’s beliefs overlap with nativism, isolationism, anti-feminism, and anti-Semitism.34

Looking at the far-right news outlet Breitbart, the definition of the alt-right is significantly different. In the now-famous article by Milo Yiannopoulos and Allum Bokhari, “An Establishment Conservative’s Guide to the Alt-Right,” the authors claim that the majority of alt-right members are “natural conservatives.”35

According to them, natural conservatives are white, middle American radicals who fervently champion cultural homogeneity. The article argues that the alt-right is a well-educated group (which is why they are threatening), with ideas supported by established thinkers such as Spengler and Mencken; its members value the success of their own culture above economic gains, an order they believe society has strayed away from.36

32 “Writing about the ‘alt-right,’” Associated Press, accessed November 23, 2017, https://blog.ap.org/behind-the-news/writing-about-the-alt-right. 33 Ibid. 34 Ibid. 35 Milo Yiannopoulos and Allum Bokhari, “An Establishment Conservative's Guide To The Alt-Right,” Breitbart, March 30, 2016, http://www.breitbart.com/tech/2016/03/29/an-establishment-conservatives-guide-to-the-alt-right/.

The authors distance the core of the movement from Internet trolls, whom they describe as geeks looking to get a rise out of people, and from Neo-Nazis, who they claim are an unwanted subsection.37

Given the inconsistency of the definition of the alt-right, I examined all cross-sections of the group, which encompass members who align themselves with anti-Semitism, white nationalism, and populism – but particularly those who self-identify as alt-right. This inconsistency informed my methodology, which will be discussed in depth in the following chapters.

The Alt-Right on Social Media

Given that the alt-right is so amorphous, the group’s online activity is helpful in elucidating what it actually is. Like many online communities, the group has created its own jargon, and several pieces of the alt-right’s vernacular have caught the eye of the mainstream media. This jargon ranges in meaning from deeply racist ideologies to ironic meme humor. Terms such as “dindu nuffin,” a racist response to Black Lives Matter protests, and “Ghost Skin,” a term for white supremacists who hide their beliefs in order to blend in, particularly within law enforcement, expose the culture of white supremacy.38 Other terms are not directly pejorative, but are still used exclusively within the alt-right as part of its Internet culture. This includes the term “cuckservative,” a conservative who is sympathetic to the left; the use of the meme “Pepe the Frog”; “Deus vult,” crusader iconography; and “Trumpire,” referring to Trump’s presidency.39

36 Ibid. 37 Ibid.

Previously, this chapter mentioned the role of the Internet in organizing riots, protests, and uprisings. Some of these movements born out of social media have been successful, including the Arab Spring, but others have lived and died on the Internet in a “boom and bust” cycle, such as Occupy Wall Street, resulting in no tangible change.40 While social media can be employed as a tool to rapidly disseminate information, it does not alone build the muscle of organization and trust necessary to create political and societal changes.41 However, the alt-right movement is particularly frightening because it manages to translate real life to the Internet and vice versa, thereby building the muscle key to a successful movement.42 For example, in 2016, alt-righters began to surround Jewish names with triple parentheses, known as “echoes,” in order to direct attention to the presence of Judaism in mainstream culture.43 A podcast from the right-wing blog The Right Stuff inspired this echo pattern. The blog describes that “all Jewish surnames echo throughout history,” meaning that the supposed harm caused by Jewish people has lasted for centuries.44

38 Nikhil Sonnad, “The Alt-Right Is Creating Its Own Dialect. Here’s the Dictionary,” Quartz, October 30, 2017, https://qz.com/1092037/the-alt-right-is-creating-its-own-dialect-heres-a-complete-guide/. 39 Ibid. 40 “#Democracy on Fire,” Institute of Politics. 41 Ibid. 42 Ibid. 43 Cooper Fleishman and Anthony Smith, “Neo-Nazis Are Targeting Victims Online With This Secret Symbol Hidden in Plain Sight,” Mic, June 8, 2016, https://mic.com/articles/144228/echoes-exposed-the-secret-symbol-neo-nazis-use-to-target-jews-online.

The idea of the reverberation of Jewish names throughout history first circulated through “traditional media.” While, of course, the views that The Right Stuff espouses are not the norm, the platform of its content is “traditional”: a news outlet with funding to produce content such as videos and podcasts. The idea was then picked up and put to use on social media, spreading the message. While social media provides the speed to spread ideas, media outlets such as The Right Stuff and Breitbart provide the infrastructure necessary to give these social movements gravitas.

While some of this alt-right terminology circulates on Twitter and Facebook, the group’s unfettered discussion happens on platforms like 4chan, Reddit, and Gab, a social media site with almost no content restrictions.45 Aware that Twitter has begun to crack down on hate speech, many alt-right members refrain from posting unfiltered opinions and particularly racist terminology.46 In fact, in preliminary findings, when I scraped Twitter for select pieces of alt-right terminology, my algorithm returned only a handful of results per day.47 Prominent alt-right members, such as self-proclaimed leader Richard Spencer, provide a polished public image for the group and its goals. According to the Data & Society think tank, whose researchers embed themselves in alt-right chat networks, Twitter is primarily used for recruitment.48 Alt-right members seek out individuals who have expressed opinions that are sympathetic to the alt-right, such as opposition to political correctness or social justice warriors. Then, alt-right members engage the individual in debate, and if they deem his or her views to ring true, they invite that person to join the discussion on 4chan or Gab.49

44 Ibid. 45 Becca Lewis, Skype Conversation, Data & Society, October 14, 2017. 46 Ibid. 47 Note: These results will be discussed further in Chapter Two.

While some may shrug off Internet memes and the darkness of human nature as words behind a computer screen, these tendencies are bleeding into real life. In fact, the Internet is no longer a bubble – and events like Charlottesville prove the reality of the power and potential of groups like the alt-right.

Social Media Reaction to Charlottesville

On August 12, 2017, in Charlottesville, Virginia, when alt-right protestors gathered for the “Unite the Right” rally, an alt-right member drove his car into a crowd of counter-protesters, killing one and injuring nineteen. At the rally, Robert “Azzmador” Ray of The Daily Stormer said, “We’re stepping off the Internet in a big way… people realized we are not atomized individuals, we are part of a larger whole because we have been spreading our memes, organizing on the Internet.”50 Ray is not wrong – the rally was quite well organized: some drove twelve hours to participate, and organizers handed out flags, shields, and helmets.51 This is the organizational muscle that alt-right members are proud to flex, proving that they are more than an Internet body to be dismissed.

48 Ibid. 49 Ibid. 50 “Charlottesville: Race and Terror,” Vice News, last modified August 14, 2017, https://news.vice.com/en_us/article/qvzn8p/vice-news-tonight-full-episode-charlottesville-race-and-terror. 51 Ibid.

While President Donald Trump shied away from taking a strong stance on the horrific events of that day, many social media outlets did. Yielding to pressure, Discord, the favored chat app of the alt-right, which hosted group chats that helped organize the rally, banned these communities.52 When The Daily Stormer posted an article mocking Heather Heyer, the counter-protester who was killed, GoDaddy and Google revoked their hosting services.53 Then, Cloudflare, a website security service, stopped servicing the site.54 Cloudflare has a near monopoly in its business sphere: protecting websites from bad actors who would compromise their security and shut them down. For this reason, Cloudflare’s refusal to service The Daily Stormer took the site completely offline.55

Unlike other online platforms, Twitter made no specific statement addressing the Charlottesville incident. Without any direct response targeting the alt-right, the only point of reference for how Twitter handles this population is found within the Twitter Rules.

52 Kevin Roose, “This Was the Alt-Right’s Favorite Chat App. Then Came Charlottesville,” New York Times, August 15, 2017, https://www.nytimes.com/2017/08/15/technology/discord-chat-app-alt-right.html. 53 Kate Klonick, “The Terrifying Power of Internet Censors,” New York Times, September 13, 2017, https://www.nytimes.com/2017/09/13/opinion/cloudflare-daily-stormer-charlottesville.html. 54 Ibid. 55 Ibid.


Twitter’s Policies for the Alt-Right

Combing through the many different (and often redundant) content policies, it appears that restrictions for the alt-right best fit into Twitter’s rules for violent extremist groups.56 The Twitter Rules defines a violent extremist group to be a group that meets the following criteria:

• identify through their stated purpose, publications, or actions, as an extremist group
• have engaged in, or currently engage in, violence (and/or the promotion of violence) as a means to further their cause
• target civilians in their acts (and/or promotion) of violence.57

At first glance, these rules seem targeted to address ISIS’s presence on the platform, a publicly acknowledged effort for Twitter. However, these rules are also applicable to other groups that use violent means to achieve their goals – particularly the alt-right. As for identifying as an extremist group, “alt” or “alternative” is a thinly disguised term for “extremist.” In regard to engaging in violence as a means to further their cause, Charlottesville resulted in injuries and a fatality. And lastly, in terms of targeting civilians in their acts of violence, the official policy of the alt-right site The Daily Stormer was “Jews should be exterminated.”58 Given this evidence, the alt-right seems to be categorized as a violent extremist group by Twitter’s definition.59

56 “Violent extremist groups,” Twitter, accessed November 4, 2017, https://help.twitter.com/en/rules-and-policies/violent-groups. 57 Ibid. 58 Luke O’Brien, “The Making of an American Nazi,” The Atlantic, December 2017, https://www.theatlantic.com/magazine/archive/2017/12/the-making-of-an-american-nazi/544119/. 59 Ibid.


Twitter also signaled that the violent extremist group policy applied to the alt-right when it expanded this policy on December 18, 2017. Twitter announced that users may not affiliate with organizations that – either on or offline – promote violence against civilians.60 While the alt-right is not shy about speaking hatefully and expressing white supremacist views, the group does not engage in the same systematic online advocacy of violence against civilians as ISIS. However, organized events like Charlottesville demonstrate the advocacy of violence offline. Alt-right users were cognizant of the targeted change; for example, on December 17, one alt-right user collected in my research tweeted “24 hours till the Thought-Crime purge begins.” It is clear that the alt-right believes that the policy expansion targets them. The nature of the changes, as well as the political pressure to address the alt-right’s presence after Charlottesville, suggests that this change did indeed expand Twitter’s policy to include the alt-right. That is not to say that Twitter was not previously removing alt-right users; social media companies often test their content removal policies before publicizing them.61,62

Twitter lists specific behaviors within this definition of a violent extremist group that can result in content removal:

• providing or distributing services (e.g., financial, media/propaganda) in furtherance of progressing a violent extremist group’s stated goals
• engaging in or promoting acts for the violent extremist group
• stating or suggesting that an account represents or is part of a violent extremist group
• recruiting for the violent extremist group.63,64

60 Twitter, “The Twitter Rules.” 61 For this reason, as well as evidence later presented, this change is not substantially considered in the analysis. 62 Bickert, “The Line Between Hate and Debate.”

In order to determine if Twitter adheres to its policies when dealing with the alt-right, I will reverse-engineer the company’s content removal policies and attempt to map quantitative findings to the above rules.65 Thus, my methodology is constructed to verify that Twitter removes users who represent the alt-right, users who promote the alt-right by distributing propaganda, and users who recruit new members, as these actors are in explicit violation of Twitter’s policies.

Previous Work

Very little research has been conducted on the subject of the alt-right on Twitter or on Twitter’s content removal policies. In terms of the alt-right on Twitter, the most similar research comes from George Washington University’s Program on Extremism, where J.M. Berger compared the growth of white nationalist and ISIS social media networks over time.66 Berger found that since 2012, there has been a 600% increase in white nationalist follower presence on Twitter.67 While Berger’s study was longitudinal and comparative, unlike this thesis, his methodology for finding the network of the alt-right on Twitter helped inform my data collection process.

63 Ibid. 64 Note: Order changed for consistency. 65 Given that the alt-right is a complicated mix of racial and political beliefs, its members promote acts for the group by spreading ideologies and propaganda, thus encouraging new members to join. For this reason, the first and second bullet points can be combined. 66 J.M. Berger, “Nazis vs. ISIS on Twitter: A Comparative Study of White Nationalist and ISIS Online Social Media Networks,” George Washington University Program on Extremism (September 2016). 67 Ibid.


In terms of content removal on Twitter, no formal research has studied how consistently Twitter enforces its own policies. However, other research has been conducted about online censorship, most prominently by Harvard’s Gary King, who conducted a quantitative study of social media censorship by the Chinese government.68 King’s methodology provided a structural framework that was helpful in constructing my data collection pipeline. These two pieces of research will be discussed in more detail in relation to my methodology in Chapter Two.

Despite points of methodological inspiration from these two resources, this thesis presents an entirely new subject of quantitative research at the intersection of hate groups’ presence on social media platforms and online censorship, prompting further questions that could help construct a new canon of work.

Structure of this Thesis

Having laid out the need for and importance of this thesis, the next chapters will explain the research itself. In Chapter Two, methodology will be discussed. First, a deeper dive into King and Berger’s research will be presented so as to justify the construction of my data collection pipeline. Then, the chapter will explain the methodology of the pipeline, including approaches used to circumvent difficulties with the Twitter API.

68 Gary King, Jennifer Pan, and Margaret E. Roberts, “How Censorship in China Allows Government Criticism but Silences Collective Expression,” American Political Science Review 107, no. 2 (May 2013): 1-18, http://j.mp/2nxNUhk.


Chapter Three will map the results of the data collected to Twitter’s published policies to determine the company’s degree of consistency in following its own rules. First, the chapter will describe the results of statistical analyses performed on the general language of the alt-right to give the reader a sense of how the group communicates. Then, the chapter will describe the successes of Twitter in honoring its own policies by delineating specific metrics that indicate adherence to published rules. Finally, the chapter will explain the ways in which Twitter is inconsistent in its policies, both by overstepping these rules and by falling short of following them.

Finally, Chapter Four will provide a conclusion for these findings. A short piece of additional research about antifa will be presented to prompt further questions about Twitter’s bias and motivation for content removal. The chapter will put the thesis’ findings into a larger context, provoking future questions for consideration.


2 Data Collection Methodology

This thesis aims to determine whether or not Twitter honors its content removal policies for the alt-right. In order to collect user data about Twitter’s content removal, I scraped the timelines of a large network of alt-right users, as well as the users they mentioned, consistently updating every few hours to determine whether or not any users had been removed. I was then able to analyze this data to test the hypothesis that removed profiles violated Twitter’s rules while active profiles did not. This chapter will describe the methodology behind the data collection process.

Looking to King and Berger’s Research

As aforementioned, King’s research about social media censorship by the Chinese government provides a solid data collection framework for this thesis. In order to determine the factors that qualify a post for censorship, the researchers separated topics by sensitivity according to Chinese political science specialists; for example, Ai Weiwei, a Chinese dissident artist, is a highly sensitive topic, whereas discussion of a popular online video game is less sensitive.69 By defining keywords within these topics, the researchers were able to scrape several social media sites for posts relating to these issues. Given that censorship is conducted manually, whereas data scraping is automatic, the researchers took advantage of this lag time in order to find politically sensitive posts before they were removed.

Through this method, the researchers amassed a collection of censored posts, and thus, were able to determine the qualifications for censorship.

King’s results were quite surprising. It was found that 13% of Chinese social media posts are censored.70 Given that low, medium, and high sensitivity posts had the same likelihood of being removed, the researchers deduced that an explanation other than political sensitivity was accountable for censorship. They discovered that most of the posts that are censored are those regarding collective action.71 Users can denounce the president and express their political views, but when they attempt to congregate, for positive or negative reasons, their posts are removed. Considering China’s history, particularly the violent suppression of the Tiananmen Square protests of 1989, this censoring logic makes sense. Another finding was that users who criticize these online censors, or Internet Police, were often censored themselves; perhaps the government does not want citizens to speculate about why it censors certain content, or more simply, perhaps the Internet censors are more inclined to punish users who have criticized their work.

69 King et al., “Censorship in China,” 2. 70 Ibid, 6. 71 Ibid.


i. Preliminary Findings

Inspired by King’s approach, I denoted low, medium, and high sensitivity terms within alt-right discourse in order to collect tweets, assuming that the most hateful terms would be the most likely to be removed. I used various media outlets and pieces of literature in order to amass a collection of terms used by the alt-right.

In accordance with the Twitter Rules, accounts that incite violence against specific races or populations will be suspended. It logically follows that the highest-risk rhetoric is that of the Neo-Nazi contingent of the alt-right, self-identified as the “1488Rs.” The name combines “14,” a reference to the fourteen-word white supremacist slogan, and “88,” standing for “Heil Hitler,” “H” being the eighth letter of the alphabet. The term “1488R,” anti-Semitic hashtags like “#unbonjuif” (“a good Jew,” a phrase that sparked controversy in France), and the use of triple parentheses to identify Jewish users were all considered likely candidates for removal within this high-risk category.

Discourse that harasses specific races or populations without directly advocating violence is considered medium-risk for removal. These terms are connected to racist and supremacist culture; this includes the phrases “dindu nuffin,” and “Ghost Skin,” as mentioned in Chapter One.

Low-risk for removal is the vocabulary coined by the Internet culture of the alt-right. These terms are not directly pejorative, but are part of the language of the group. This includes the term “cuckservative,” use of the meme “Pepe the Frog,” and “Deus vult.” Many of these terms stem from the Internet troll contingent of the alt-right, but have become prominent within the alt-right’s vernacular, given that they do not obviously allude to racist and supremacist ideologies.

I used Twitter’s API search endpoint to gather tweets that included the above terms. Allowing my code to run for two hours, the algorithm returned only twelve tweets. Four of the twelve tweets employed the terminology to racist and alt-right ends, although the exact context was difficult to understand. The other eight tweets picked up by the filter had nothing to do with the alt-right; for example, one tweet used triple parentheses to emphasize the name of a song that was being released on SoundCloud.
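For concreteness, the term-based collection described here might look roughly like the sketch below, which queries Twitter’s standard v1.1 search endpoint through the tweepy library for each sensitivity tier. The library choice, credential placeholders, and exact query formatting are assumptions; they are not details taken from the thesis.

```python
import tweepy

# Hypothetical credentials; the thesis does not specify which client library it used.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

# A few of the high-, medium-, and low-sensitivity terms listed above.
TERMS = {
    "high":   ["1488R", "#unbonjuif"],
    "medium": ["dindu nuffin", "ghost skin"],
    "low":    ["cuckservative", "deus vult"],
}

def search_terms(terms, per_term=100):
    """Return recent tweets containing any of the given terms."""
    hits = []
    for term in terms:
        # The standard search endpoint only covers roughly the last seven days.
        for status in api.search(q=f'"{term}"', count=per_term, tweet_mode="extended"):
            hits.append({"term": term, "id": status.id, "text": status.full_text})
    return hits

for level, terms in TERMS.items():
    print(level, len(search_terms(terms)), "tweets found")
```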

There are a couple of explanations for the lack of results. First, it is possible that the alt-right is aware of Twitter’s content removal policies, and its members are thus careful not to use language that would be targeted for removal. Second, these results support the idea that alt-right members use mainstream platforms like Twitter for recruitment rather than communication amongst themselves.72 Additionally, even the tweets that did seem to purposefully employ alt-right language have to be called into question. Twitter is a unique platform given its 280-character limit per tweet; this makes analysis of posts without their context particularly difficult. Could a user be quoting an article he or she read, or using a term ironically? It is very difficult to understand a tweet without the context of recent posts, its author, and the author’s network. For these reasons and this preliminary evidence, a different approach is necessary to find the alt-right on Twitter.

72 Lewis, Conversation.


ii. Pivot to Seed Accounts

Given that scraping tweets based on the terms they contain was not a fruitful method of finding the alt-right on Twitter, I used an alternate approach. Inspired by Berger, I sourced the population from the network of users who self-identify as alt-right. To study the growth of white nationalists on Twitter, Berger collected 25,000 users by beginning with a selection of seed accounts, specifically users singled out by the Anti-Defamation League and Southern Poverty Law Center.73 From these seed accounts, he then gathered their friends and followers to expand the network. To cut down the set of users to a manageable size for storage and analysis, I identified just one seed account: Richard Spencer.

Richard Spencer is not only the self-proclaimed leader of the alt-right, but he is also identified by the Southern Poverty Law Center as “one of the country’s most successful young white nationalist leaders.”74 He began using the term “alt-right” in 2008, and in 2011, he founded the online magazine Alternative Right and took over the National Policy Institute.75 The National Policy Institute purposefully bears a vague, bland name to keep a low profile – it is a white-identity think tank that espouses sexist and racist ideas through conferences and publications.76 The Alternative Right is not just a collection of vitriol; it is a well-written, well-organized disguise of vitriol, giving the movement credibility by taking it out of the murky depths of the Internet and its hate-comments. Yet, many users on the site have profile photos that bear the “SS” insignia.77

73 Berger, “Nazis vs. ISIS,” 5. 74 “Richard Bertrand Spencer,” Southern Poverty Law Center, accessed January 13, 2018, https://www.splcenter.org/fighting-hate/extremist-files/individual/richard-bertrand-spencer-0. 75 Graeme Wood, “His Kampf,” The Atlantic, June 12, 2017, https://www.theatlantic.com/magazine/archive/2017/06/his-kampf/524505/. 76 Ibid.

For a long time, Richard Spencer and his beliefs were unknown. But when Donald Trump anchored his campaign to the controversial topic of immigration and “Build the Wall,” a foothold opened for the alt-right to capitalize on the discussion of identity politics. The alt-right aligned itself with Trump, seeing him as a mechanism to bring the political fringes into the mainstream; Trump, for his part, never attempted to distance himself from the movement. While there are other notable figures within the alt-right, including Andrew Anglin of The Daily Stormer and Mike Enoch of The Daily Shoah, these other leaders mostly hide behind their keyboards, mirroring the vile language of online chat-rooms.78 Spencer is not afraid to speak publicly or to reporters, no matter the source.79 His prominence both online and offline allows him not only to be supported by the alt-right and seen as an ideological leader, but also to be a household name to “normies” – the alt-right’s term for non-alt-right members. Spencer is a polished, well-spoken man, which makes it all the more frightening when he leads a chant of “Hail Trump, Hail our People, Hail Victory.”

Overarching Data Collection Pipeline

Given that Richard Spencer is the self-proclaimed leader of the alt-right, he can also be thought of as the center of the alt-right network on Twitter. Imagining Richard Spencer as the central node, we can follow his network several levels of connection deep in order to obtain a wide sample of tweets and profiles. To ensure that popular users like Donald Trump and media outlets like CNN were not the majority of profiles within the Richard Spencer network, I elected to follow only the network of “mutual followings,” users who follow Richard Spencer and whom he also follows. If there is a mutual following, it can be assumed that an established relationship exists between the two users. While the first layer of Richard Spencer’s mutual followings yielded about thirty profiles, the next layer pushed the order of magnitude to the thousands.

77 Ibid. 78 O’Brien, “The Making of an American Nazi.” 79 Wood, “His Kampf.”

For each of these users, I scraped their tweet timelines as far back as the Twitter API would allow: a total of 3,200 tweets.80 Building a word-frequency dictionary of the cumulative tweets, in which each key was a unique word and its value was the corresponding frequency of that word across the entire corpus of gathered tweets, I noticed that a majority of the most frequently used words were usernames – users whom members of the Richard Spencer network were talking to and about. To include them in my data collection, usernames mentioned above a certain frequency by the Richard Spencer network were added to a secondary network, the “mentioned network.” Just as with the Richard Spencer network, I scraped all of these users’ tweets.

Because Twitter employs content reviewers to manually remove tweets and profiles, there is a lag time between when a tweet is posted and when a user is removed, just as with the Chinese Internet Police. Taking advantage of this lag, I used a method similar to King’s by continuously checking whether or not users had been removed from the platform. Also similar to King, upon collection of the data, I conducted various forms of analysis to determine the actual nature of the tweets that were removed.

80 Note: This limitation may have led to some inconsistencies, given that some users tweet more frequently than others, and thus, the 3,200th most recent tweet for each user is unlikely to have been tweeted on the same day.

Technological Constraints

Twitter is one of the few social media companies to provide a fairly robust developer Application Programming Interface (API). I used the API’s standard endpoints to get user information, tweet timelines from specific users, and users’ friend and follower lists. While using the API may seem straightforward, there were many roadblocks to navigate.

The free API imposes severe rate limiting, meaning that only a certain amount of data can be collected over a certain amount of time. For some endpoints, this rate limit is particularly crippling. For example, to hit the endpoint to collect the list of users whom a particular user follows, only fifteen requests are allowed within fifteen minutes. Given the amount of data required for this project, that lag is untenable.

Increasing the rate limit comes at a huge expense. Other web applications with APIs usually require payment to increase rate limits, but those prices are typically offered at an educational discount. Twitter has a different model. It offers no educational discount for the “firehose,” the service that provides all requested data with no rate limiting. The firehose is not part of the API; rather, it is a querying service for which each call – for example, to get all tweets from a set of users – costs $2,500. Given my scope of research, this price was prohibitive.

It is interesting to note that Twitter purposefully makes it difficult to collect large amounts of data. While Twitter wants developers to be able to glean some information for the sake of their business or personal research, accessing too much information too quickly could allow for the reconstruction of research that the company is conducting internally and wants to keep private. Perhaps they do not want researchers to conduct studies like this very thesis.

Given these limitations, I had to find workarounds to the rate limiting imposed by the free API. To mitigate its impact, I used multiple API tokens. When a developer creates an account to use the API, he or she is provided with tokens in order to activate it. These tokens hold key information that uniquely identifies the developer and tracks rate limiting and usage information. I enlisted peers to create developer accounts and used their tokens in addition to my own. When a certain token reached its rate limit, I simply switched to another, allowing the pipeline to fluidly continue data collection. This is somewhat taboo, as it is against the preferred practices of Twitter; but given the price and collection constraints, it was a necessary solution.

Additionally, because rate limiting, though minimized, was still a factor, a key consideration in the structure of my pipeline was the tradeoff between breadth and time. Given that I wanted to consistently check whether or not a profile had been removed, and was operating on the assumption that Twitter removes profiles rather quickly, it was essential that an entire loop of my pipeline occur with low latency. The more users in the network I examined, the faster the rate limit expired, and the longer the pipeline took to complete. Thus, I randomly sampled from the thousands of profiles collected by reaching two levels deep within the Richard Spencer network to gather just 2,500 profiles. This cap still yielded a great number of users, but kept my pipeline length to two hours, sleeping for only ninety minutes during that runtime.

I ran my code for twenty-four hours a day for an entire month, using a server to ensure that my data collection would not be interrupted. The next subsection will describe this pipeline in depth.

Data Collection Pipeline in Detail

Figure 2.1: Flow chart of data collection pipeline.


(1 & 2) Gather users from Richard Spencer network, Scrape tweets

The program begins with the seed of Richard Spencer’s user ID. Whereas a user can change his or her username, the user ID is a permanent unique identifying number. The Twitter API provides lists of Richard Spencer’s followers (users who follow him) and friends (users whom he follows). These lists can be compared to find mutual followers, users who both follow and are followed by Richard Spencer.81 All of these users are added to the network. Only public users are selected for the network, as private users’ tweets are inaccessible through the Twitter API. This did not pose a serious limitation, as most alt-right profiles are public.

To expand the network further, a selection of users two degrees from Richard Spencer is also added to the network. As aforementioned, in order to ensure low latency, the number of nodes in the network is capped at 2,500. Users two degrees from Richard Spencer are randomly chosen to be in the network until this quota is filled, as visualized in Figure 2.2.

81 Note: All future relationships between users discussed in this methodology are mutual followings, unless otherwise stated.


Figure 2.2: Visual of simplified Richard Spencer network collection. Selected users are in blue, unselected are in white.
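As a rough illustration of the network-gathering portion of steps (1 & 2), the sketch below computes mutual followings as the intersection of a user’s follower and friend ID sets, then fills the 2,500-user quota with a random sample of second-degree users. It again assumes tweepy purely for illustration, and the function names are hypothetical; filtering of private accounts is omitted for brevity.

```python
import random
import tweepy

def mutual_followings(api, user_id):
    """IDs of users who both follow user_id and are followed by user_id."""
    followers = set(tweepy.Cursor(api.followers_ids, user_id=user_id).items())
    friends = set(tweepy.Cursor(api.friends_ids, user_id=user_id).items())
    return followers & friends

def collect_network(api, seed_id, cap=2500):
    """Seed's mutual followings plus a random sample of second-degree users."""
    first_degree = mutual_followings(api, seed_id)
    network = set(first_degree)
    second_degree = set()
    for uid in first_degree:
        second_degree |= mutual_followings(api, uid)
    second_degree -= network            # keep only genuinely second-degree users
    remaining = cap - len(network)
    if remaining > 0:
        sample_size = min(remaining, len(second_degree))
        network |= set(random.sample(sorted(second_degree), sample_size))
    return network
```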

Before scraping the tweets of a user, the program checks whether or not the account has been suspended or deactivated, and adds it to a stored list if so.82 If a user is suspended, their account page redirects to a webpage that reads, “Account suspended.” If a user is deactivated, their account page redirects to a search webpage that apologizes, “Sorry, we cannot find this user.” To systematically determine the difference between suspended and deactivated profiles, a helper method issues an HTTP GET request and extracts the URL of the redirected webpage. If the webpage redirects to “https://twitter.com/account/suspended,” the profile has been suspended. Otherwise, it has been deactivated. There is a stark difference between the two. If Twitter suspends a user’s account, there is a possibility that he or she can come back to the site by following the reviewer’s request to remove certain content. If a user is deactivated, that user no longer exists on Twitter: his or her account will be deleted permanently after 30 days. This prompts the question: what type of language leads to deactivation versus suspension? Chapter Three will consider this question.

82 Note: Of course, the first time the pipeline runs, it is unlikely that enough lag time will have passed for an account to have been removed. This point about the pipeline is primarily relevant for future iterations of the pipeline.
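A minimal sketch of such a helper, using the requests library to follow the profile page’s redirect and inspect the final URL, might look like the following. The redirect target is the one quoted above; treating any other non-suspended outcome as “deactivated” versus “active” via the status code is an assumption, since the thesis does not spell out that branch.

```python
import requests

SUSPENDED_URL = "https://twitter.com/account/suspended"

def account_status(screen_name):
    """Classify a profile as 'suspended', 'deactivated', or 'active'
    by following the redirect of its public profile page."""
    resp = requests.get(f"https://twitter.com/{screen_name}",
                        allow_redirects=True, timeout=10)
    if resp.url.startswith(SUSPENDED_URL):
        return "suspended"
    if resp.status_code == 404:        # profile page no longer exists
        return "deactivated"
    return "active"
```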

As each user is collected, the program calls the API endpoint to scrape the user’s tweet timeline (all of the last 3,200 tweets) and dump it in JSON format within a text file labeled by the user’s ID. Because this process is repeated, I had to be cautious of duplicate users and tweets. The program checks whether or not the user already exists, creating a new file if not. If the user already exists, the program checks whether or not each tweet returned by the API endpoint has already been stored (via its unique tweet ID number), only adding new tweets to the file. At this point, we have a completed list of a random set of 2,500 users in the Richard Spencer network, as well as files that correspond to each user containing tweet history.
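The timeline-scraping half of steps (1 & 2) might be sketched as follows, again assuming tweepy and storing one JSON object per line in a file named after the user ID. The line-delimited layout and the function name are illustrative choices, not details taken from the thesis.

```python
import json
import os
import tweepy

def scrape_timeline(api, user_id, out_dir="tweets"):
    """Append any new tweets (up to the ~3,200-tweet API ceiling) to a
    per-user file named by the numeric user ID, skipping duplicates."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{user_id}.txt")

    # Tweet IDs already on disk, so repeated loops only add new tweets.
    seen = set()
    if os.path.exists(path):
        with open(path) as f:
            seen = {json.loads(line)["id"] for line in f if line.strip()}

    cursor = tweepy.Cursor(api.user_timeline, user_id=user_id,
                           count=200, tweet_mode="extended")
    with open(path, "a") as f:
        for status in cursor.items(3200):
            if status.id not in seen:
                f.write(json.dumps(status._json) + "\n")
                seen.add(status.id)
```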

It is important to note that rate limiting posed a serious challenge in writing this code. A “try/except” block surrounds every call to a Twitter API endpoint in order to catch any unforeseen instances of rate limiting. The finicky nature of this rate-limiting system is exemplified by the fact that even the API endpoint for checking the rate limit status of the API has its own rate limit. To ensure that my code never stops running prematurely, any time a token has reached its rate limit, the program redirects to a switch function to change API tokens. Once the switch function iterates through all of the tokens within the cycle of the program, the pipeline sleeps for fifteen minutes, allowing all of the tokens to reset their rate limits. A great deal of trial and error went into ensuring that this complex rate-limit system would not stop my code, given that it needed to run continuously for thirty days.
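The token-switching behavior could be approximated as below, assuming one authenticated tweepy 3.x client per developer token and that tweepy’s RateLimitError is the signal to rotate. The fifteen-minute sleep follows the description above, while the names and structure are an illustrative reconstruction rather than the thesis’s actual code.

```python
import time
import tweepy

def build_clients(credentials):
    """One authenticated client per developer token.
    `credentials` is a list of (key, secret, token, token_secret) tuples."""
    clients = []
    for key, secret, token, token_secret in credentials:
        auth = tweepy.OAuthHandler(key, secret)
        auth.set_access_token(token, token_secret)
        clients.append(tweepy.API(auth))
    return clients

def call_with_rotation(clients, endpoint_name, **kwargs):
    """Call an endpoint, rotating to the next token whenever one is rate
    limited; sleep 15 minutes once every token in the cycle is exhausted."""
    idx, exhausted = 0, 0
    while True:
        try:
            return getattr(clients[idx], endpoint_name)(**kwargs)
        except tweepy.RateLimitError:
            exhausted += 1
            if exhausted >= len(clients):
                time.sleep(15 * 60)    # let every token's window reset
                exhausted = 0
            idx = (idx + 1) % len(clients)

# Hypothetical usage:
# timeline = call_with_rotation(clients, "user_timeline", user_id=657802, count=200)
```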

(3) Gather most frequent words

This method gathers the text of all tweets from all users in the Richard Spencer network and creates a word-frequency dictionary.83 Iterating through every user’s file in the Richard Spencer network, the program loads the JSON of each tweet and extracts its text. Every word is cleaned by removal of non-alphanumeric characters (except “@”, the symbol used to tag a Twitter user) and conversion to lowercase. The word is added to the dictionary, incrementing the value of the key by one if it already exists.

The resulting dictionary is sorted using the built-in Python sorted function, sorting by the frequency of each word. The function returns the sorted dictionary, which is used to find users who are mentioned by members of the Richard Spencer network.
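Assuming the line-delimited tweet files from the earlier sketch, step (3) can be condensed to something like the following; collections.Counter stands in here for the manual dictionary and sorted() call described above and produces the same most-frequent-first ordering.

```python
import json
import os
import re
from collections import Counter

# Strip everything except letters, digits, and "@" (applied after lowercasing).
CLEAN = re.compile(r"[^0-9a-z@]+")

def word_frequencies(tweet_dir="tweets"):
    """Word-frequency dictionary over every stored tweet, most frequent first."""
    counts = Counter()
    for fname in os.listdir(tweet_dir):
        with open(os.path.join(tweet_dir, fname)) as f:
            for line in f:
                tweet = json.loads(line)
                text = tweet.get("full_text") or tweet.get("text", "")
                for word in text.lower().split():
                    cleaned = CLEAN.sub("", word)
                    if cleaned:
                        counts[cleaned] += 1
    return dict(counts.most_common())
```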

(4 & 5) Gather mentioned users, Scrape Tweets

At this stage of the pipeline, the “mentioned” network is created. For every key in the dictionary passed into the method from stage (3), if the word includes an “@” symbol, it can be assumed that the word is the username of a Twitter user.

83 Note: this term will be used throughout the rest of the thesis to describe a dictionary in which each key is a word and the corresponding value is the frequency of that word.


The program checks that the mentioned user is not private or verified. As discussed previously, private profiles are inaccessible to the Twitter API. The verification check is necessary, as I want to capture only users in the alt-right network. Given that Twitter revoked verification of alt-right users in November 2017, it is safe to assume that no alt-right member is verified.84 Thus, each public, unverified user who is mentioned above a certain frequency is added to the mentioned network.

The tweet timeline of each mentioned user is scraped and added to a file named after the user’s ID, which was found by calling the Twitter API user endpoint on the username. Just as with step (2), repetition is avoided, and each user is checked for deactivation or suspension before attempting to scrape their tweets. Just as in step (3), the method returns a word-frequency dictionary sorted by frequency.
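Steps (4 & 5) might be sketched as follows: pull likely usernames out of the word-frequency dictionary, look each one up, and keep only public, unverified accounts. The mention threshold of 50 is purely illustrative, since the thesis does not state the exact cutoff it used.

```python
import tweepy

def mentioned_network(api, word_freq, min_mentions=50):
    """Mentioned users who are public and unverified, keyed by user ID."""
    selected = {}
    for word, freq in word_freq.items():
        if not word.startswith("@") or freq < min_mentions:
            continue
        screen_name = word.lstrip("@")
        try:
            user = api.get_user(screen_name=screen_name)
        except tweepy.TweepError:      # deleted, suspended, or malformed handle
            continue
        if not user.protected and not user.verified:
            selected[user.id] = screen_name
    return selected
```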

(6) Documentation

As the final step of the full loop of the pipeline, the findings of these methods are documented in JSON format within a text file for later reference. This documentation includes:

• Time of completed loop,
• List of suspended users,
• List of deactivated users,
• Dictionary of most frequent words for Richard Spencer network, and
• Dictionary of most frequent words for mentioned network.
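One way to write such a record per loop is sketched below; the field names are chosen for illustration and are not taken from the original code.

```python
import json
import time

def document_loop(suspended, deactivated, spencer_freqs, mentioned_freqs,
                  path="pipeline_log.txt"):
    """Append one JSON record per completed pipeline loop for later analysis."""
    record = {
        "completed_at": time.strftime("%Y-%m-%d %H:%M:%S"),
        "suspended_users": suspended,
        "deactivated_users": deactivated,
        "spencer_word_frequencies": spencer_freqs,
        "mentioned_word_frequencies": mentioned_freqs,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```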

84 Darrell Etherington, "Twitter to revoke verification for some accounts as part of overhaul," TechCrunch, November 15, 2017, https://techcrunch.com/2017/11/15/twitter-to-revoke-verification-for-some-accounts-as-part-of-overhaul/.


This was a key step to record the progress of collection throughout the 30-day running of this program, as well as to conduct later analysis about rate of suspension, rate of deactivation, and use of terminology.

Obtaining the Network

Beyond the data collection pipeline described above, it was also necessary to gather the friends and followers of every user collected to ultimately generate a graph, as will be analyzed in Chapter Three. For every user in the network, the lists of their friends and followers were documented in two different text files named after the user ID: for example, "657802_Friends.txt" and "657802_Followers.txt." The information in these files was later used to generate the nodes and edges of the complete alt-right network collected by the pipeline. Analyzing the resulting graph will illuminate the nature of the network and users' relationships within it.

Data Collected

In total, over six thousand users were collected within the Richard Spencer network, a total of about 6 GB of data. Almost four thousand different users were collected within the mentioned network, a total of about 4 GB of data. By the end of the collection period, six hundred users were found whose accounts had been suspended or deactivated during that time.

Due to the large quantity of tweets collected, this collection procedure yielded a wide and diverse corpus of data. This allowed rigorous data analysis to be conducted, both in terms of natural language processing and network analysis.

We conduct these analyses in Chapter Three.


3 Analysis & Results

Twitter defines its values: "We believe in free expression and think every voice has the power to impact the world."85 Of course, this values statement is not without caveats. As mentioned in Chapter One, in December 2017, Twitter formally expanded its "zero-tolerance policy" for violent extremist groups to encompass the alt-right.86 This expansion was designed to send a message of Twitter's intolerance of the alt-right, both to alt-right members and to the general public.87

According to Twitter’s policies on violent extremist groups, users may be removed from the site if they:

• Promote the alt-right by tweeting content that spreads ideologies and propaganda,

85 Twitter, "Our Values."
86 Once again, note that while this announcement was made during my period of data collection, it simply served as a public message, but did not change the pattern of content removal, as will be discussed in a later subsection.
87 Bob Moser, "How Twitter's Alt-Right Purge Fell Short," Rolling Stone, December 19, 2017, https://www.rollingstone.com/politics/news/how-twitters-alt-right-purge-fell-short-w514444.


• Claim to represent the alt-right, or
• Attempt to recruit new members.

After a review of the general properties of the data collected, this chapter will map quantitative findings to the above content rules to describe instances in which Twitter successfully carries out its policies. Then, this chapter will explore metrics that indicate inconsistency between these policies and the content Twitter removes in practice. Ultimately, this research finds that Twitter is successful in keeping consistent with its policies in some aspects, but oversteps and contradicts itself in others.

Getting a Sense of the Alt-Right on Twitter

Before determining Twitter's success in following its own content removal policies, it is important to gain context about how the alt-right communicates on the platform. To get a sense of the terminology used by the population, I created two word-frequency dictionaries: one for the six hundred removed users and one for a random selection of six hundred active users. Every word from each tweet was added to the appropriate dictionary after being cleaned through conversion to lowercase and removal of all non-alphanumeric symbols (except "@"). This step ensured equality between words used in different grammatical contexts. Additionally, the dictionaries were passed through a filter to eliminate the most common words in the English language ("stop-words"). The removed and active dictionaries were concatenated to create a complete word-frequency dictionary of alt-right terminology, visualized as a word cloud in Figure 3.1.


Figure 3.1: Word cloud of collective alt-right users' (removed and active) terminology.

At first glance, it is apparent that President Trump is a dominant topic of conversation. Both his username and his last name are some of the most prominent words in the word cloud. Other terms and their likely explanations for prominence include:

• "Women": perhaps due to the sexual assault cases that continued to rock Hollywood during the month of December,
• "FBI" and "Flynn": likely due to news about the Russia investigation and Flynn's plea-bargain covered during the period of collection,
• "Christmas": likely due to Trump's campaign to "take back Christmas" and data collection over the holiday season,
• "Black," "white," and "illegal": likely due to discussion about racial identity, a centerpiece of alt-right ideology, and
• "Kate": referencing a white woman, Kate Steinle, shot and killed in San Francisco by an illegal immigrant.88 The man was acquitted in November 2017, resulting in a great deal of national controversy due to San Francisco's status as a sanctuary city.

While certain terms can be identified through a visual search, it is difficult to understand their meaning without extended context. A Latent Dirichlet Allocation (LDA) model helps illuminate the context of these words by determining which words were most often paired in the same tweet. To fit this model to my collection of words, I created a Document Term Matrix (DTM). I began by collecting a corpus of every unique word used in alt-right tweets. Each cleaned word within this corpus served as a column of the DTM, and each row contained information for an individual tweet. The row and column position, corresponding to a word and a tweet, was filled with the number of instances of the word within the tweet. I then fit the LDA to this data, allowing words to be grouped into eight different topics.89 The results shown in Table 3.1 reveal the mental equivalencies of the alt-right.

88 Holly Yan and Dan Simon, "Undocumented immigrant acquitted in Kate Steinle death," CNN, December 1, 2017, https://cnn.com/2017/11/30/us/kate-steinle-murder-trial-verdict/index.html.
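The DTM-plus-LDA fit can be reproduced with scikit-learn; the thesis does not name the LDA implementation it used, so the sketch below is an illustration of the approach rather than the original code.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def fit_lda(tweet_texts, n_topics=8, top_n=7):
    """Fit an LDA topic model to a list of tweet strings and return the
    top words per topic. The document-term matrix has one row per tweet
    and one column per unique word."""
    vectorizer = CountVectorizer(lowercase=True, token_pattern=r"[0-9a-z@]+")
    dtm = vectorizer.fit_transform(tweet_texts)          # rows: tweets, cols: words
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    lda.fit(dtm)

    vocab = vectorizer.get_feature_names_out()
    topics = []
    for weights in lda.components_:                      # one weight vector per topic
        top = weights.argsort()[::-1][:top_n]
        topics.append([vocab[i] for i in top])
    return topics
```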

Table 3.1: Sampled results of LDA on collective (removed and active) alt-right terminology.

Sample Topic 1      Sample Topic 2   Sample Topic 3
@realdonaldtrump    America          @realdonaldtrump
Obama               GOP              Moore
Illegal             Democrat         Liberal
Alien               Jew              Fuck
Muslim              Woman            Shit
Terrorist           Fake             Racist
Antifa              Vote             Kate

Topic 1 reveals that a tweet about both Obama and Trump is likely paired with discussion of immigrants or terrorism. Topic 2 shows that the alt-right connects the American party system, including the act of voting, to women, Jews, and perhaps, "fake" news. Topic 3 reveals that a fair number of expletives are likely to be included in a tweet about Donald Trump, Roy Moore, and the aforementioned Kate Steinle. The LDA illuminates the topics of alt-right discussion, not just the terminology used.

89 Note: I determined this number of topics with a generate-and-test approach: I reviewed the terms grouped together by the LDA under different settings of the topic-number parameter and assessed their logical matching until I found that eight topics resulted in the most logical groupings of terms.

With the general terminology and primary topics of concern for the alt-right made clear, this chapter will now describe the statistical difference in language between removed and active users. This information will elucidate Twitter's approach to content review.

Removed versus Active Users

In total, out of 9,136 collected alt-right users, 600 were removed - a mere 6.5%. Of these removed users, 37% were suspended, and 63% were deactivated.

To get a preliminary understanding of the difference between removed and active user language, I generated two separate word clouds, one for active users and one for removed users (Figures 3.2 & 3.3). The two word clouds are nearly indistinguishable. Again, just as represented in the collective word cloud of Figure 3.1, terms like "FBI," "white," "women," and "America" appear prominently. A slight difference appears in Figure 3.3, in which a few suspended profiles, such as "@lgldeeds" and "@villamarshmello," are included. This difference will be explored in a later subsection.


Figure 3.2: Word cloud of active alt-right users’ terminology.

Figure 3.3: Word cloud of removed alt-right users’ terminology.

While word clouds are helpful to visualize frequently used terms, they do not provide any quantitative understanding of the similarity or dissimilarity between two sets of words. In order to quantify the similarity between the language of active and removed users, I used a cosine similarity equation, a common metric for language processing. In the field of geometry, the equation is used to calculate the angle between two vectors, but can be repurposed to determine the similarity of two input sets of words. The closer the resulting value is to one, the more similar the sets.


$$\cos \theta = \frac{\mathbf{A} \cdot \mathbf{B}}{\lVert \mathbf{A} \rVert \, \lVert \mathbf{B} \rVert} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \; \sqrt{\sum_{i=1}^{n} B_i^2}}$$

The two sets are input into the cosine similarity function as word-frequency dictionaries. In the equation above, the notation indicates that Ai is the frequency of key i in dictionary A, and likewise for dictionary B. The overlapping keys within these dictionaries are found, and the products of their frequencies are summed to create the numerator. The denominator is found by multiplying the square roots of the summed squared frequencies of each of the dictionaries (in vector terms, the magnitude of each dictionary). Dividing the two, the resulting value represents the similarity between the two dictionaries.
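Implemented directly over word-frequency dictionaries, the computation is short; the sketch below mirrors the description above (shared keys in the numerator, dictionary magnitudes in the denominator).

```python
import math

def cosine_similarity(freq_a, freq_b):
    """Cosine similarity between two word-frequency dictionaries.

    Treats each dictionary as a sparse vector indexed by word: the numerator
    sums the frequency products over shared keys, and the denominator is the
    product of the two vector magnitudes.
    """
    shared = set(freq_a) & set(freq_b)
    numerator = sum(freq_a[w] * freq_b[w] for w in shared)
    magnitude_a = math.sqrt(sum(v * v for v in freq_a.values()))
    magnitude_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if magnitude_a == 0 or magnitude_b == 0:
        return 0.0
    return numerator / (magnitude_a * magnitude_b)
```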

I created three separate dictionaries: one for suspended users, one for deactivated users, and one for active users. The results of the cosine similarity values between these dictionaries are quite striking: all of the dictionaries are highly similar, the results hovering around 0.86, as seen in Table 3.2. From these values, it is apparent that language is not the only factor Twitter uses in determining which users to remove – context is key.

Table 3.2: Cosine similarity scores for compared word-frequency dictionaries.

Dictionaries Compared                          Cosine Similarity Value
Suspended & Active                             0.869
Deactivated & Active                           0.849
Suspended & Deactivated                        0.850
Removed (Suspended + Deactivated) & Active     0.880

Having explored the general language of the alt-right as well as similarity between active and removed terminology, this chapter now describes the quantitative findings that successfully map to Twitter’s published policies.


Successful Policy Implementations

1. Promotion of the Alt-Right via Propaganda

Twitter does not provide any examples of actions that would constitute promoting acts or propaganda in support of a violent extremist group. For the alt-right, given the nature of the group's communication laid out in Chapter One, we can assume promotion takes the form of the exchange of ideas that center on supremacist, racist, anti-Semitic, and sexist ideologies. This research finds that Twitter successfully removes content that promotes alt-right ideology by:

• Banning most expletives and racial slurs,
• Preventing the spread of controversial content,
• More aggressively removing users during divisive political events, and
• Removing high frequency tweeters.

Despite the similarity of active and removed user language, some words do account for a slight difference between the corpuses. In order to determine which words were most influential in removal, I used a logistic regression, a statistical method used to describe the relationship between a binary characteristic (here, whether a user is active or removed) and a set of independent variables (here, words).

I combined the active and removed word-frequency dictionaries used for the word cloud and grabbed all of the resulting dictionary's keys to create a basis for the logistic regression. The basis represents the corpus of every unique word within all of the tweets. Then, I iterated through a random selection of tweets from removed and active users. For every tweet, I cleaned each word and compared it to the basis. A new array was created that mirrored the basis; every word that appeared in the tweet was marked with a "1" in its corresponding index in this new array, and every word that did not appear was marked with a "0." For each tweet, the corresponding index in a separate array indicated the status of the tweet's source: "1" for a removed user, "0" for an active user.90

Both arrays were fit to the Python package sklearn’s logistic regression model, returning a corresponding coefficient for each word in the basis. The coefficients ranged from -2.0 to 2.0. Above zero, the greater the coefficient, the more likely that the term was included in tweets from removed users. Below zero, the smaller the coefficient, the more likely that the term was included in tweets from active users. Note that while closeness to these upper and lower bounds does have meaning, the exact difference in coefficient values between terms cannot be interpreted in this analysis.
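A condensed sketch of this regression setup, using scikit-learn's LogisticRegression as named above; the input structure (a list of cleaned-word lists, a label per tweet, and the word basis) is assumed here rather than taken from the original code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def word_coefficients(tweets, labels, basis):
    """Fit a logistic regression of removal status on word presence.

    `tweets` is a list of lists of cleaned words, `labels` marks each tweet's
    source (1 = removed user, 0 = active user), and `basis` is the list of
    every unique word. Returns a {word: coefficient} mapping.
    """
    index = {word: i for i, word in enumerate(basis)}
    X = np.zeros((len(tweets), len(basis)))
    for row, words in enumerate(tweets):
        for word in words:
            if word in index:
                X[row, index[word]] = 1          # presence/absence, not counts
    y = np.array(labels)

    model = LogisticRegression(max_iter=1000)
    model.fit(X, y)
    return dict(zip(basis, model.coef_[0]))
```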

Given that there were over twenty-one thousand unique words in the basis, categorizing the nature of these words is key to understanding the way in which content is considered for removal.

i. Expletives and Racial Slurs

The logistic regression results provide evidence that Twitter removes users who tweet expletives and racial slurs. While the succinctness and context-less nature of tweets make natural language processing difficult, words such as expletives and racial slurs do not require much context to be understood. It can be argued that a slur can be written "ironically" or within a quotation, and thus, context is necessary; however, just as in daily life, it is generally considered unacceptable to use slurs even in those contexts. Thus, this thesis analyzes these words at face value.

90 Note: To simplify the output of this logistic regression, I chose not to differentiate between suspended and deactivated profiles.

Some particularly interesting findings are included in the table below:

Table 3.3: Logistic regression coefficients for a sample of slurs and expletives. From midpoint 0, higher values indicate likelihood of inclusion in removed content, and lower values indicate likelihood of inclusion in active content.

Word         Coefficient Value
"nigga"      -1.44
"nigger"      0.58
"fag"         0.43
""           -0.06
"fucking"     1.02
"jew"         0.63
"hitler"      0.33
""            0.66

One of the most striking results is the difference in coefficient value between “nigga” and “nigger.” While “nigga” is a more widely accepted term, often used in rap lyrics and reclaimed by the black community, “nigger” is still a formal racial slur. It is clear through the data that by tweeting the latter, the user is on Twitter’s radar for removal.

Most of the other pejoratives included in this table are above the zero threshold for influencing removal.91 There is not always consistency, however. For example, the derogatory terms "fag" and "faggot" received fairly distant coefficient scores, despite their equivalent meanings. This inconsistency, taken together with the largely high coefficient scores for most racial slurs and expletives, suggests that Twitter does not suspend or deactivate accounts based solely on slur usage, but uses these terms as an indicator of hateful language that might warrant content removal. It is likely that the company uses a filtering method to detect expletives and bring these controversial tweets to the attention of content reviewers.

91 Note: In reading a large sample of tweets, it is clear that the term "Jew" is primarily used in a derogatory fashion.

ii. Preventing the Spread of Content: Mentioned Profiles & Links

A large proportion of the words that formed the logistic regression basis were usernames, which suggests that much of the discourse within the alt-right network is to and about other users. The characteristics of these mentioned users revealed why tweeting about them resulted in a higher or lower likelihood of removal. I selected the values of 0.5 and -0.5 as the "influential in removal" and "influential in remaining active" thresholds. For every mentioned user whose coefficient was above or below those thresholds, I determined whether the account was removed or active.

Table 3.4: Coefficient threshold comparison for mentioned usernames.

Coefficient threshold           Percent Removed
Greater than 0.5                47%
Less than -0.5                  27%

From Table 3.4, it is apparent that usernames with a higher coefficient value are more likely to have been removed. This data supports the idea that Twitter removes users based on interactions; if User A is about to be removed, content reviewers are more likely to suspend User B given that he or she interacted with User A. In doing so, Twitter is preventing controversial alt-right ideas from spreading between users.

Similar to mentioned profiles, a great number of the terms in the basis were links: namely, retweets and external links. Given that this thesis focuses on Twitter's content removal policies, this analysis examined only retweets. Just as usernames can be removed, so can a link, if it redirects to a removed profile or post. Using the same technique as for usernames, I determined the percentage of removed links relative to coefficient thresholds.

Table 3.5: Coefficient threshold comparison for links.

Coefficient threshold           Percent Removed
Greater than 0.5                35%
Less than -0.5                  7%

Much like the username results, it appears that higher coefficient values correlate to removed links; if a user posts a link that is later suspended, that user is likely to be suspended him or herself. This finding again bolsters the idea that Twitter is targeting profiles to prevent the further spread of alt-right ideologies.

iii. Divisive Current Events

The timing and frequency of content removal reveals a pattern of increased removal rates during divisive current events. Using the cumulative file collected in the "Documentation" step of my pipeline as described in Chapter Two, I created a script to group the number of removed users by day within the month-long collection period.
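A sketch of that grouping script follows. It assumes the illustrative log-record layout from the documentation sketch in Chapter Two (one JSON record per completed loop), not the thesis's exact file structure.

```python
import json
from collections import Counter

def removals_per_day(log_path="pipeline_log.txt"):
    """Count newly observed suspensions/deactivations per calendar day,
    reading the JSON records written by the documentation step."""
    seen = set()
    per_day = Counter()
    with open(log_path) as f:
        for line in f:
            record = json.loads(line)
            day = record["completed_at"][:10]            # "YYYY-MM-DD"
            for user in record["suspended_users"] + record["deactivated_users"]:
                if user not in seen:                     # only count the first sighting
                    seen.add(user)
                    per_day[day] += 1
    return dict(sorted(per_day.items()))
```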


Figure 3.4: Number of alt-right account removals during the month of December 2017.

The number of removals spikes during prominent current events. For example, eighty-four removals occurred on December 2. This was the day after Michael Flynn pleaded guilty to lying to the F.B.I. and committed himself to cooperating with the Russia investigation.92,93 This event fueled alt-right discourse about a corrupt government and Democrats' role in undermining the Trump administration. For example, deactivated user "@resierwilliam" retweeted another user: "RT @RealMAGASteve: A Whistleblower is said to have witnessed a top FBI executive suggest FBI had personal motive in investigating Flynn."

Suspended user "@9th_prestige" retweeted another user: "RT @KamVTV: BAM!!!! BREAKING: Obama Admin Approved General Flynn Calls with Russian Ambassador." Despite referencing a supposed news story, the tweet lacks any sort of link to back up this claim.94

92 Michael D. Shear and Adam Goldman, "Michael Flynn Pleads Guilty to Lying to the F.B.I. and Will Cooperate With Russia Inquiry," New York Times, December 1, 2017, https://nytimes.com/2017/12/01/us/politics/michael-flynn-guilty-russia-investigation.html.
93 Note: These spikes occur a day after the events. As mentioned in Chapter Two, given that humans review tweets in question, there is an expected lag time between the posting of a tweet and its removal.

Another key date is that of December 13: the day after Doug Jones beat Roy Moore for the Alabama Senate seat in a special election. Many alt-right users tweeted about the supposed reasons for Moore's loss, including accusations of ballot stuffing. Suspended user "@draggingglock" retweeted: "RT @NoelCarlson14: Doug Jones won??? He only wins due to fake votes! This has to be investigated!!! Soros created machines that change votes." The tweet references George Soros, the Hungarian billionaire who donates to liberal political campaigns, including that of Hillary Clinton. In fact, the rumor that George Soros owned the electronic voting machine company, Smartmatic, which was used in sixteen states, was one of the most viral pieces of fake news over the election period.95 Another suspended user, "@michaelhill51," wrote in response to accusations that Moore sexually abused underage girls, "Polanski is a Jew. Moore is not. Mystery solved." The user is referring to Roman Polanski, the famed film director who was charged with raping a 13-year-old, but still manages to be accepted by the film community today. This example of anti-Semitism is subtle enough to be ignored by word filters, but must be read in context to be understood.

94 Note: Fake news as perpetuated by the alt-right will be discussed in the next subsection.
95 Caitlin Dewey, "What was fake on the Internet this election," The Washington Post, October 24, 2017, https://washingtonpost.com/news/the-intersect/wp/2016/10/24/what-was-fake-on-the-internet-this-election-george-soross-voting-machines.


Other key dates include December 7: the day after Trump recognized Jerusalem as Israel's capital. The discourse on this event is split. Some users were disdainful of Trump's choice, seeing it as a placation of Jews, and others were railing against Obama for failing to keep his promises to Israel. The spike on December 18 was likely caused by two events. It was the day Twitter enacted its new affiliation rules, expanding policies to formally define the alt-right as a violent extremist group. This announcement created a great deal of uproar from the targeted population, as mentioned in Chapter One. December 18 was also the day after Robert Mueller released a statement defending his team after text messages supporting Hillary Clinton were uncovered. This revelation stirred up fear that Trump would fire Mueller, and the alt-right spared no opportunity to criticize Mueller for his leanings and skewer the left for their fears about the Trump administration.

By removing content more aggressively during divisive current events, periods that usually catalyze online discussion, Twitter ensures that alt-right members do not use these politicized events as footholds to spread their ideologies.

iv. Frequent Tweeters

In addition to the timing of current events, the frequency of a user's activity is a key metric in removal likelihood. For each user, I calculated the average number of tweets per day. Each tweet is time-stamped, so by grouping tweets from the same day and averaging across days, I was able to determine the average "tweets per day" value for a user. The results for removed and active users are presented as a histogram in Figure 3.5.
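The per-user calculation is straightforward given the "created_at" timestamp stored with every tweet; a sketch, again assuming the one-JSON-object-per-line storage used in the earlier sketches:

```python
import json
from collections import Counter
from datetime import datetime

def average_tweets_per_day(user_file):
    """Average daily tweet count for one user, computed from the
    "created_at" timestamp stored with every tweet."""
    per_day = Counter()
    with open(user_file) as f:
        for line in f:
            created = json.loads(line)["created_at"]      # e.g. "Fri Dec 01 18:02:11 +0000 2017"
            stamp = datetime.strptime(created, "%a %b %d %H:%M:%S %z %Y")
            per_day[stamp.date()] += 1
    return sum(per_day.values()) / len(per_day) if per_day else 0.0
```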

Figure 3.5: Histogram of tweets per day for removed and active users.

Both histograms are right-skewed: only a small number of users tweet with high frequency per day. However, the histograms have different spike heights: a greater proportion of active users than removed users have a low "tweets per day" value.

This observation is numerically validated when the averages are calculated. As seen in Table 3.6, on average, removed tweeters tweet twice as often per day compared to active users.

Table 3.6: Average tweets per day comparison for removed and active users.

                 Average # of Tweets per Day
Removed Users    90
Active Users     45

Since the 2016 election, Twitter has been clear about its aggressive stance against bots and fake news. While Twitter was originally concerned about fake news planted by Russian bots, fake news, whether or not the tweeter believes it, can be a helpful tool for the alt-right to undermine traditional government. Thus, the alt-right is a logical target in Twitter's removal of bots and fake news.

However, the process of detection is rather vague in Twitter's official announcements: "When we do detect duplicative, or suspicious activity, we suspend accounts."96 It is difficult to determine the threshold of duplicative or suspicious activity, but tweet frequency is a likely consideration. I primarily used the "tweets per day" value to determine the presence of a bot: if a user tweeted over one hundred times per day, this is a likely indication of automation.97

Among the suspected-bot subset, if the user tweeted words that signaled news, such as “breaking” to indicate a breaking news story, they were suspected of tweeting or retweeting fake news.98

Out of the 600 removed profiles, only 112 tweeted over one hundred times a day, and only 55% of those tweeted or re-tweeted news. 85% of these likely-bot profiles were from the mentioned network, and 68% of them were deactivated (as opposed to suspended). These findings support the idea that while alt-right users may be retweeting bots, few of them are actually bots themselves. However, if a user is suspected of being a fake news bot, Twitter seems not to hesitate to permanently remove the account through deactivation, as its presence only promotes the alt-right message.

96 "Our Approach to Bots & Misinformation," Twitter, accessed February 1, 2018, https://blog.twitter.com/official/en_us/topics/company/2017/Our-Approach-Bots-Misinformation.html.
97 Russell Brandom, "How to spot a Twitter bot," The Verge, August 13, 2017, https://www.theverge.com/2017/8/13/16125852/identify-twitter-bot-botometer-spambot-program.
98 Ibid.


2. Representation of the Alt-Right

The Twitter Rules describe that a user will be removed from the site if they "represent" a violent extremist group. As mentioned in Chapter One, in December 2017, Twitter became clearer and harsher with this concept of representation, announcing that users may not affiliate with organizations that, either on or offline, promote violence against civilians.99 As aforementioned, this change is not substantially considered in the analysis, as there was no significant change in the number of removals after December 18, as shown in Figure 3.4, and there is evidence that social media companies often test policies before announcing them.100 Regarding representation, evidence shows that Twitter removes alt-right users who discuss white identity, thus affiliating with the core of alt-right ideology.

To determine whom the alt-right is talking about, I randomly picked three hundred tweets (some from active users, some from removed users) for hand-review in order to get a more granular sense of the discourse of the group. I noted eight demographics of alt-right discussion: black, Jewish, Muslim, gay and transgender, female, immigrants, Democrats, and white. I associated terms, including straightforward terminology, slurs, and alt-right vernacular, with these demographic categories as I continued to read the tweets.

Furthermore, I adapted the definition of these categories to fit the alt-right perspective. For example, the category of white people is not simply those of Caucasian descent, but rather, the alt-right's definition of an acceptable or admirable white person. Words that fit into this category included "white," "Christian," "alt-right," "Republican," "Ghost Skin," and "KKK." Note that I did not include "Donald Trump" in this category; from data presented earlier in this chapter, it is clear that the President is a major topic of discussion for the alt-right, but I do not want to misconstrue tweeting about the President as an indicator of alt-right identity. This analysis is mostly race-based, but also blurs into some political lines. For instance, the words within the category of "Muslim" are defined by the alt-right's biased understanding of the population; these include terms like "terrorist" and "sharia." Each category contained about twelve words, as shown in Table 3.7:101

99 "The Twitter Rules," Twitter.
100 Moser, "Twitter's Alt-Right Purge."

Table 3.7: Sample terms used for demographic bucketing.102

Category           Sample Terms Used
Black              black, nigger, nigga, negro, …
Jewish             Jew, Hitler, 1488, holocaust, holohoax, …
Muslim             Muslim, Islam, terrorist, sharia, …
Gay/Transgender    Gay, trans, homo, fag, faggot, …
Female             Woman, girl, feminist, bitch, pussy, …
Immigrant          Immigrant, illegal, alien, dreamer, DACA, …
Democrats          Democrat, liberal, Hillary, snowflake, …
White              White, alt-right, nationalist, republican, Christian, …

Using these categories and their corresponding words, I created a script to iterate through another random sample of tweets from both removed and active users to categorize them in these buckets. For every tweet, the program checked if any of the above categorical words were included. If a word was, the tweet was categorized as discussion of the corresponding demographic. Note that the sentiment of the tweet did not matter, given that I simply wanted to measure the volume of discussion from the alt-right. The program created a dictionary to keep track of the categorization of tweets, incrementing the value of each categorical bucket by one if a tweet fit its terminology. This provided a rough breakdown of the comparative discussion of each demographic among the alt-right.

101 Note: My choice of these words may have resulted in skewed results, but I tried to offset this skew by selecting words based on the random sampling of alt-right tweets.
102 Some terms left out for succinctness.
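A sketch of the bucketing script, with abbreviated term lists standing in for the roughly twelve words per category in Table 3.7; whether a tweet may count toward several buckets is an assumption here, as the thesis does not specify.

```python
CATEGORY_TERMS = {
    # abbreviated versions of the buckets in Table 3.7
    "Black": {"black", "negro"},
    "Jewish": {"jew", "holocaust"},
    "Muslim": {"muslim", "islam", "terrorist", "sharia"},
    "Female": {"woman", "girl", "feminist"},
    "Immigrant": {"immigrant", "illegal", "alien", "daca"},
    "Democrats": {"democrat", "liberal", "hillary"},
    "White": {"white", "nationalist", "republican", "christian"},
}

def bucket_tweets(cleaned_tweets):
    """Count how many tweets touch each demographic bucket.

    `cleaned_tweets` is a list of lists of cleaned, lowercased words; a tweet
    is counted toward every bucket whose terms it matches.
    """
    counts = {category: 0 for category in CATEGORY_TERMS}
    for words in cleaned_tweets:
        word_set = set(words)
        for category, terms in CATEGORY_TERMS.items():
            if word_set & terms:
                counts[category] += 1
    return counts
```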

Figure 3.6: Proportion of demographic discussion for removed alt-right users.

These findings show that the removed alt-right population talks primarily about its own identity. This finding is initially surprising, given that the alt-right is a supremacist group known for its racism. However, this hatred is largely fueled by the idea that the white race has been wronged and edged out by other races.

For example, conservative writer Dinesh D'Souza wrote on Twitter during the Charlottesville rally, "CELEBRATE WHO YOU ARE UNLESS YOU'RE WHITE: White nationalists are not Democrats b/c there's no room for them at the multicultural picnic."103 These views underpin the alt-right's ideologies that are fueled by factors such as , DACA, and former President Obama's race.

Twitter can identify alt-right members based on who is talking about and sympathizing with these concepts of white identity. Perhaps Twitter is using key words to find this population and remove them, thus limiting representation of the alt-right. However, there is not complete consistency. As discussed in the next subsection, a fair number of active users, though not the majority, are also discussing white-identity.

3. Maintaining Free Expression

Despite Twitter's policies for violent extremist groups, ultimately, the company is attempting to maintain the integrity of its core values, particularly free expression. Given that more active users talk about Democrats, women, Muslims, and immigrants, Twitter seems to be careful to allow open and free discourse.

Previously, the results for demographic bucketing were only shown for removed users. By comparing results for removed and active users as in Figure 3.7, there is evidence that more users who engage in political discourse remain active. From the bar chart, it is clear that active users tweet about Democrats, women, Muslims, and immigrants in a slightly higher proportion.

103 Dinesh D’Souza (@DineshDSouza), Tweet, August 12, 2017, https://twitter.com/DineshDSouza/status/896414991913549827/photo/1.


Figure 3.7: Proportion of demographic discussion for removed and active alt-right users.

In some ways, this relative proportion seems counter-intuitive, but Twitter is very cautious of backlash for being overly aggressive in its content removal.

The alt-right is complicated in its blending of politics, race, and belief; in order to prevent an uproar about political bias that may affect the business of the company, Twitter seems to be careful not to remove too many users for tweets disparaging the Democratic party or Hillary Clinton. An attempt at political free expression also provides a likely explanation for the higher proportion of active user discussion about women, Muslims, and immigrants. For instance, sexual harassment cases entered the political sphere with Roy Moore's campaign.

Additionally, the very concept of a “Muslim ban” is something that has entered into political discourse with the positions of Donald Trump.

Note that the relative proportions of active versus removed users discussing these demographics differ only slightly. Perhaps, despite trying to maintain free speech, Twitter is ultimately prioritizing removing suspected alt-right members from its platform.

Failure to Meet Policies

Having discussed instances of Twitter’s successful implementation of its published policies, this section will now discuss examples of Twitter’s failures to do so.

1. Overreach: Awareness of Suspension

The Twitter rules for violent extremist groups state that the site will remove users that promote the alt-right, affiliate with the alt-right, or recruit new members; however, the site is also removing users who criticize Twitter for content removal, which is outside the bounds of the company’s policies. As previously mentioned, a logistic regression was used to determine what words were most likely to be included in removed content. As seen in Table 3.8, words about Twitter profiles and content removal all had very high coefficient values.

Of course, all users on Twitter have the power to block other users from viewing their tweets and tweeting at them, so a number of tweets containing these words were about users blocking other users. However, there is evidence that Twitter is mostly removing users who criticize the company's content removal process.

Table 3.8: Logistic regression coefficients of words indicating content removal awareness.

Word        Coefficient Value
"blocked"   1.03
"account"   1.03
"removed"   0.89
"banned"    0.73
"purge"     0.30


As previously described, tweets are meant to be read in context, and no natural language processing toolkit fit the challenge of assessing the subtleties of these tweets. I gathered a random set of tweets from removed users that included words like "banned," "deactivated," "removed," "suspended," and "blocked," and gathered the same collection from active profiles. Reading through each tweet, I encoded the tweets based on their meaning. Tweets that dealt with users blocking other users received a zero, whereas tweets about Twitter banning profiles received a one. Averaging these encodings revealed that 89% of removed tweets were criticisms of Twitter, whereas only 27% of active tweets were criticisms.

Given that the majority of removed tweets containing these terms were criticisms of Twitter’s content removal process, this section will now provide examples of these tweets. Some users complained about Twitter’s removal policy, claiming it is laden with bias, some users expressed frustration about content that had been suspended, and other users exchanged recommendations for how to avoid “rebound suspension,” the term for when a user creates a new account after being suspended and is suspended once more. There also is a great deal of theorizing about who is conducting the removals and why, especially in light of the December policy change. A suspended user, “@ResistCM14,” tweeted the day before policy changes were supposedly enacted: “It’s not ‘Republicans’ who are being banned in the #TwitterPurge. It’s Nationalists and pro-white accounts.”

Another suspended user, @BlondieLives, addressed his followers, presumably all alt-right members, about the "Twitter Purge": "Remember, if (((they))) do mass ban us tomorrow, it's not because "hate speech". It's because we are a threat to their power." The user uses the triple parentheses discussed in Chapter One to signal the presence of Jewish names, implying Jews run Twitter and are threatened by the power that the alt-right has gained on Twitter.

From the data presented, there is evidence that if a user criticizes Twitter's content removal process, they are more likely to be removed altogether. This finding parallels that of Gary King's research on Chinese social media censorship, in which posts that criticized censors were more likely to be removed.104 Given that the Chinese Internet is surrounded by the "Great Firewall," it might not be shocking that censors are removing these criticisms. But Twitter's published rules include no content restriction of this sort. Thus, it is disturbing that an American company that claims to uphold free expression removes content that criticizes the content reviewers and the company itself.

2. Violation: Failure to Remove Popular Alt-Right Profiles

Especially after the company's December 2017 expansion of its affiliation policy to consider offline activities in determining the existence of a violent extremist group, Twitter seems to be sending a message that it is taking a harsh stance against leaders of hate groups like the alt-right. It is surprising, then, that Twitter does not remove alt-right users with a large follower base, even if they are self-identified spokespeople of the group.

To determine how follower number correlates to likelihood of removal, I created a script to iterate through all removed profiles, as well as a random sample of active profiles, to grab their number of followers – information encoded in the JSON dump of each tweet. As shown in Figure 3.8, both active and removed users exhibit a substantial skew in their "long-tail" pattern of followers: there are a lot of users with a relatively small number of followers, and there are very few users with a large number of followers.

104 King et al., "Censorship in China," 5.

Figure 3.8: Histogram of number of followers for removed and active users. Note that because some frequency values are so small, the end of the tail is invisible. Additionally, some values have been clipped for a more detailed visualization.

While both histograms are right-skewed due to their long tails and have similar spike heights, the difference between the two categories becomes clearer when the average number of followers is calculated.

Table 3.9: Comparison of mean followers for removed and active users.

                             Average Value
Removed Users' Followers     9,067
Active Users' Followers      22,136

Active users have a far higher average number of followers – more than double that of removed users. What accounts for such similar histograms, yet such different averages? To assess how the density of data differs, I created a quantile-quantile plot to map the distribution of removed users' followers against active users' followers.
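A quantile-quantile plot of this kind can be produced with NumPy and matplotlib; the sketch below evaluates both distributions at the same set of percentiles, which is one common construction and not necessarily the exact one used for Figure 3.9.

```python
import numpy as np
import matplotlib.pyplot as plt

def follower_qq_plot(removed_followers, active_followers):
    """Quantile-quantile plot of removed vs. active users' follower counts.

    Both inputs are lists of follower counts; quantiles are evaluated at the
    same percentiles so the two distributions can be compared directly.
    """
    percentiles = np.linspace(0, 100, 101)
    q_removed = np.percentile(removed_followers, percentiles)
    q_active = np.percentile(active_followers, percentiles)

    plt.scatter(q_removed, q_active, s=10)
    plt.xlabel("Quantiles of removed users' followers")
    plt.ylabel("Quantiles of active users' followers")
    plt.title("QQ plot of follower counts")
    plt.show()
```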

Figure 3.9: QQ plot comparing removed (quantile 1) and active (quantile 2) users' number of followers.

Plotting the two quantiles against each other, a shelf becomes apparent.

Twitter does not remove any users, regardless of their violation of Twitter policies, who have over 40,000 followers. At first, this is surprising; if Twitter is trying to remove the alt-right presence from its platform, why are they not removing users who are the most influential in this community? Even Richard Spencer, who loudly self-identifies as alt-right, is still active as of the time of this research.105 However, the choice is quite logical from a business perspective.

Ultimately, Twitter's goal is to maximize profits. If the company removes popular users, all of their followers are likely to become skeptical of Twitter's leanings and migrate off of the platform. By removing alt-right users with a slightly smaller following, Twitter can achieve a balance between its two goals – profit and control over content. Yet, this is a clear departure from its Twitter Rules statement about the consequences for violent extremist group members: "We have a zero-tolerance policy for accounts belonging to or affiliated with violent extremist groups and will permanently suspend them."106

105 Note: Richard Spencer's profile has been deactivated in the past, but has always been reinstated.

3. Violation: Failure to Stop Recruitment

Despite Twitter’s statement within its content rules that forbids violent extremist groups from recruiting new members, network analysis results suggest that Twitter does very little to prevent recruitment.

The structure of the network suggests that the alt-right network follows a recruitment model. Of the 9,136 users collected, 65.31% of them came from the Richard Spencer network, 32.66% came from the mentioned network, and only 2.02% were in both the Richard Spencer and mentioned networks. This data shows that the alt-right is not a very closed network; users are communicating with and re-tweeting people beyond their own friends, followers, and larger social network community.

The idea of a recruitment model is further cemented by various network analysis statistics. In order to conduct this network analysis, I used the Python package NetworkX and the network visualization software Gephi. As mentioned in the previous chapter, for every user, I collected the lists of their friends and followers, each recorded in a text file named after the user. In order to represent this information as a graph, I created a program to iterate through every file in the network information folder, read each line from the file, and create a directed edge within the graph between the two users. If the file contained a user's friends, directed edges were created from the namesake user to every user ID in the file; if the file contained a user's followers, directed edges were created from each user ID in the file to the namesake user. Through the creation of this graph, I was able to run analyses on its different network properties, as well as visualize the nodes and connections. However, this process resulted in a network containing twenty million nodes, too large to conduct any meaningful analysis and visualization. To remedy this issue, I adjusted the graph creation program to only create edges between users who were already within the alt-right network. This seemed like a reasonable tradeoff between the size of the network and the accuracy of the group that I am attempting to pinpoint. This adjustment resulted in a network of the collected ten thousand users, a much more reasonable number.

106 "Violent extremist groups," Twitter.
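A sketch of the in-network graph construction using NetworkX, assuming the friends/followers file naming convention described in Chapter Two:

```python
import os
import networkx as nx

def build_network(info_dir, collected_ids):
    """Build the directed follow graph, keeping only edges between users
    already in the collected alt-right network (to keep the graph tractable).

    Each file in `info_dir` is named "<user_id>_Friends.txt" or
    "<user_id>_Followers.txt" and lists one user ID per line.
    """
    graph = nx.DiGraph()
    graph.add_nodes_from(collected_ids)

    for fname in os.listdir(info_dir):
        user_id, kind = fname.rsplit("_", 1)      # e.g. "657802", "Friends.txt"
        if user_id not in collected_ids:
            continue
        with open(os.path.join(info_dir, fname)) as f:
            for line in f:
                other = line.strip()
                if other not in collected_ids:
                    continue                      # drop edges leaving the network
                if kind.startswith("Friends"):
                    graph.add_edge(user_id, other)   # the user follows the friend
                else:
                    graph.add_edge(other, user_id)   # the follower follows the user
    return graph
```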

I was able to calculate various measurements to determine the nature of the graph. I began by calculating the average clustering coefficient, which measures for each node the proportion of its neighbors who are neighbors with each other – an indication of the level of connectedness within the graph. The average clustering coefficient is found by averaging the clustering coefficient for each node in the graph. Given a graph G=(V,E), the equation is as follows:

$$C(v) = \frac{\left| \{ (u, w) \in E : u, w \in N(v) \} \right|}{\binom{d(v)}{2}}$$

where u and w are nodes in the graph, N(v) is the set of neighbors of node v, and d(v) is the degree of node v (the total number of neighbors). The closer the clustering coefficient value is to one, the more neighbors of a node are connected, and thus, the more interconnected the whole graph. I calculated the clustering coefficient for the complete graph and its respective sub-networks.

Table 3.10: Clustering coefficients for subsections of the graph.

Selected Network                              Clustering Coefficient
Complete                                      0.27
Richard Spencer                               0.31
Mentioned                                     0.26
Richard Spencer & Mentioned Intersection      0.40
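These values can be computed directly with NetworkX. The sketch below treats the graph as undirected, which matches the neighbor-pair definition above but is an assumption about how the directed follow graph was handled in the original analysis.

```python
import networkx as nx

def clustering_by_subnetwork(graph, subnetworks):
    """Average clustering coefficient for the full graph and each sub-network.

    `subnetworks` maps a label to the set of user IDs in that sub-network.
    """
    undirected = graph.to_undirected()
    results = {"Complete": nx.average_clustering(undirected)}
    for label, members in subnetworks.items():
        results[label] = nx.average_clustering(undirected.subgraph(members))
    return results
```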

The clustering coefficients are quite low, particularly for the complete graph. This value quantifies the idea that the alt-right network is not highly connected; on average, most followers and friends of a node are not following or friends with each other. This makes sense in the case of recruitment. If alt-right users are attempting to recruit new members, they are unlikely to already be a follower or friends with that individual. The relatively higher clustering coefficient for the intersection between the Richard Spencer and mentioned networks reflects the small subset of more highly connected users within the alt-right network – perhaps this is the "muscle" of the movement, as described in Chapter One.

However, despite being somewhat disjointed, the network grows very quickly. When the entire ten-thousand-node graph was imported into Gephi, the resulting visualization was a hairball of nodes and edges. Such an expansive network is very difficult to analyze visually, so, to understand the rate of expansion, it is best to view the network in levels from the origin node of Richard Spencer. Figure 3.10 presents the graph of all mutual friends and followers of Richard Spencer. Suspended users are marked in red, and deactivated users are marked in orange. Dark green nodes are from the Richard Spencer network, pale yellow from the mentioned network, and light green from both networks. There are a total of 345 users within this layer of the network, scaled by in-degree (number of followers). By filling in the appropriate edges between the users one degree away from Richard Spencer, as in Figure 3.11, the fast entanglement of the network becomes clear.

Figure 3.10: Graph of nodes one degree of separation from Richard Spencer origin, without edges between neighbors.


Figure 3.11: Graph of nodes one degree of separation from Richard Spencer origin, with edges between neighbors.

Figure 3.12: Graph of nodes two degrees of separation from Richard Spencer origin, with edges between neighbors.


Proceeding one level deeper, as in Figure 3.12, the graph expands to 7,322 nodes. Just by gathering all users within at most two degrees of separation from Richard Spencer, almost the entire collected network of users is covered. The average path length, which is the average shortest path distance between nodes, and the diameter, which is the longest shortest path distance between two nodes, provide supporting evidence of the fast growth rate of the network. The diameter can also be thought of as the largest number of hops needed to get from one user to any other. For reference, if every user were connected to every other user, the diameter would be 1; if the users were connected in a single chain, the diameter would be nearly the total number of nodes – nearly 10,000. For this network, the average path length between nodes is 2 and the diameter is 7, both remarkably small, implying that many users are grouped in tight clusters that have tenuous connections between them.

The low clustering coefficient and high community number, yet small average path length and diameter, indicate that while not many users are interconnected, users are frequently mentioning and talking to other users outside of their own personal networks. This evidence suggests that Twitter is used as a recruiting and networking tool by the alt-right.

However, by determining the number of communities within the graph, as well as their properties, there is evidence that Twitter is failing to stunt the growth of these recruitment networks. To partition the users into distinct communities, I used a k-means analysis, a clustering algorithm that attempts to divide a graph into sensible clusters by minimizing an objective function. In this case, the objective function to be minimized was the distance between nodes. Thus, the communities generated by k-means clustering were clustered based on their connectedness. If a user is in a cluster with another user, they are closely connected (i.e. follow one another or have a mutual friend). To conduct this k-means analysis, I began by converting the graph into a structure that contained distance information. To do so, I created an adjacency array of the graph, and then used the procedure Classic Multidimensional Scaling (CMD) in order to convert that adjacency array into an array of relative distances. This relative distance information between nodes was then minimized as the objective function of k-means.

Before doing so, k-means requires the user to input the number of clusters (or distinct communities) to fit the data. In order to determine the number of clusters, I generated an "elbow plot" - the summed objective function of the k-means fit plotted against the input number of communities. The crook of the "elbow" within the graph corresponds to the most plausible number of communities; specifically, the number of communities that minimizes the distance between nodes within each community. This number was 43, a fairly large number of communities, which further demonstrates the disjointedness of the population.
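A sketch of this clustering stage using scikit-learn. Two assumptions: scikit-learn's MDS is metric MDS rather than the classical eigendecomposition named above, and the 0/1 dissimilarity derived from the adjacency matrix is an illustrative choice. The inertia values returned per k are what would be plotted as the elbow curve.

```python
import numpy as np
import networkx as nx
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

def cluster_network(graph, max_k=60):
    """Embed the graph with multidimensional scaling, then k-means it.

    The dissimilarity matrix is a simple 0/1 transform of the adjacency
    matrix (connected pairs are "close"); the inertia values for each k
    form the elbow curve used to choose the number of communities.
    """
    adjacency = nx.to_numpy_array(graph.to_undirected())
    dissimilarity = 1.0 - np.clip(adjacency, 0, 1)
    np.fill_diagonal(dissimilarity, 0.0)

    coords = MDS(n_components=2, dissimilarity="precomputed",
                 random_state=0).fit_transform(dissimilarity)

    inertias = {}
    for k in range(2, max_k + 1):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(coords)
        inertias[k] = km.inertia_        # summed within-cluster distances
    return coords, inertias
```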

Table 3.11 shows the average characteristic breakdown of users within these communities.

Because these users are clustered in a community together based on the similarity of their networks of friends, it is logical that these clusters center around the Richard Spencer network and mentioned network. In contrast, there is no pattern of communities regarding suspension and deactivation.

Table 3.11: Average characteristic properties of k-means detected communities.

Node Type                                           Average Percentage within Communities
Richard Spencer Network                             46%
Mentioned Network                                   50%
Richard Spencer & Mentioned Networks Intersection   46%
Suspended                                           0.87%
Deactivated                                         2.8%

This data suggests that Twitter is not wiping out clusters of users. If a user is following another user, one can assume that it is because they are curious about the other user’s opinions. It follows that clusters of users, at least in the echo chamber of the alt-right, must hold similar opinions. By not removing clusters of users with similar networks of friends and similar opinions, Twitter is only cementing the support network that encourages alt-right members to reach out to other users, both inside and outside of their network, and possibly recruit them. In doing so, Twitter never eradicates the source of recruitment, failing to follow its own rules.

Summary of Findings

Twitter seems to follow its own published policies when it comes to removing users promoting alt-right content and identifying as a part of the alt-right through discussion of white identity. Twitter manages to do this while also attempting to maintain free discourse about politics. Despite these successes, the company overreaches by removing users who criticize content reviewers and Twitter itself. Twitter also fails to remove clearly affiliated alt-right users with large follower bases, possibly for the sake of profit. Additionally, despite claiming to do so, the company fails to stop recruitment of new alt-right members by not wiping out clusters of highly connected users.

Twitter seems to be cautious of criticism of its content removal. Perhaps this is why it overreaches in removing alt-right posts critiquing the company, but it also explains why it is not overly zealous in removal, even if users are clearly alt-right affiliated or are recruiting new members. As discussed in Chapter Four, this caution shows a company that is racing to keep up with political and societal expectations, while still maintaining control over its platform and keeping its values statement intact. Yet, these shifting expectations have only increased Twitter's opaqueness to the general public in terms of the nature of its platform and the content that it hosts.


4 Conclusion & Future Research

This thesis has described the methodology and reviewed the results of reverse-engineering content removal on Twitter for the alt-right to determine if the company follows its own published rules. The evidence discussed in Chapter Three suggests that Twitter is only partially successful in honoring its published content removal policies; in some instances, the company oversteps these rules, and in others, fails to enact them. A mapping of policies to quantitative findings is shown in Table 4.1.


Table 4.1: Summary of thesis findings.

What the policy states: Removal of accounts that promote alt-right propaganda and activities
Quantitative findings:
• Removal of users using expletives/racial slurs
• Removal of users retweeting controversial profiles/links
• Aggressive removal during divisive events
• Removal of frequent tweeters

What the policy states: Removal of accounts that represent the alt-right
Quantitative findings:
• Removal of users discussing white identity

What the policy states: Maintain free expression for the maximum number of users (values statement)
Quantitative findings:
• Keep users discussing politics

Violation: Removal of accounts that represent the alt-right
Quantitative findings:
• Keep users with over 40,000 followers, regardless of representation

Violation: Removal of users recruiting for the alt-right
Quantitative findings:
• Fail to wipe out clusters of alt-right users

Actions taken outside the stated policy
Quantitative findings:
• Removal of users criticizing content policies and reviewers

“We believe in free expression and think every voice has the power to impact the world.”107 As mentioned at the start of this thesis, Twitter’s values statement is not without caveats, but published content limitations are supposedly created with the goal of championing these values for as wide an audience as possible. Yet, by not following its own content policies, Twitter is at risk of failing to uphold its mission. The company is only creating more opaqueness about the nature of its platform and motivations for content removal.

107 "Our Values," Twitter.


Speculation of Political Motivation

The uncertainty of Twitter’s content removal process breeds speculation.

Many individuals, both alt-right and not, speculate that political bias is a guiding force in determining what content to remove, hypothesizing that the far right is particularly singled out. This speculation is not unusual, as Silicon Valley companies are consistently under fire for their liberal tendencies.

For example, in 2010, during the US congressional election, Facebook tested a new newsfeed view for sixty-one million users to encourage voter turnout. Some users saw the original newsfeed, others saw a pop-up that reminded them it was election day, and a third group saw not only the pop-up, but also a list of their friends who had voted. This third group proved to be most responsive, and an estimated 340,000 extra people turned out to vote.108 When this experiment was revealed in 2012, Facebook received massive backlash. Through the user data the company collects, Facebook can determine the political leanings of an individual. Additionally, Facebook CEO Mark Zuckerberg has donated to Democratic campaigns.109 For these reasons, there was speculation that Facebook could alter the site to increase voter turnout only for Democratic candidates. This controversy typifies the power of social media platforms, as well as the association of Silicon Valley companies with liberal politics.

108 Zoe Corbyn, "Facebook experiment boosts US voter turnout," Nature, September 12, 2012, https://nature.com/news/facebook-experiment-boosts-us-voter-turnout-1.11401.
109 Kevin Roose, "The Political Leanings of Silicon Valley," NYMag, January 24, 2013, http://nymag.com/daily/intelligencer/2013/01/political-leanings-of-silicon-valley.html.


Specifically on Twitter, this theory of political motivation in content removal is widely discussed amongst the alt-right. It does not go unnoticed that Louis Farrakhan, leader of the Nation of Islam, is still an active Twitter user (and, in fact, verified) despite posting vehemently anti-Semitic remarks. As one alt-right member, , says:

“It all makes sense once you understand the victim Olympics. Jews are seen as more oppressed than whites, but less oppressed than Palestinians. Thus, anti-Semitism from left-wing groups is allowed, where as right-wing speech is carefully scrutinized or called dog whistles.”110,111

While the political bias of social media platforms receives some discussion in the mainstream media, far-right outlets like Breitbart outright decry companies like Twitter for this reason. By comparing the content removals of the alt-right versus antifa, we can begin to investigate this supposed bias. To do so, I ran my data collection pipeline on the ideological opposite of the alt-right: antifa.

Twitter Removal of Antifa

Antifa, short for “anti-fascist,” is an apt group to test for Twitter’s content removal bias given that they are the left’s answer to the alt-right: both groups advocate for violence in order to reach their goals.112 Just as with the alt-right, to

110 Sean Burch and Jon Levine, “Alt-Right mad that Twitter’s ‘Hate Conduct’ Policy Doesn’t Ban Liberals,” The Wrap, December 18, 2017, https://thewrap.com/twitter-hate-conduct-left-wing-extremists/.
111 Dog whistle: an expression co-opted by the alt-right for a statement that has a secondary meaning intended to be understood by a specific group of people (Merriam-Webster).
112 Katie Bo Williams, “Antifa activists say violence is necessary,” The Hill, September 14, 2017, http://thehill.com/policy/national-security/350524-antifa-activists-say-violence-is-necessary.


seed the pipeline, I needed to find a prominent leader of antifa. Finding a leader proved more difficult than anticipated, given that consolidating power in a single leader runs against the left’s very nature.113 The movement lacks formal structure, instead consisting of dispersed local groups.114 Online research pointed to Yvonne Felarca, a middle-school teacher and activist described by Fox News as an “antifa leader.”115 Felarca was arrested for battery during a rally against the University of California, Berkeley’s free speech week.116 However, she does not have a Twitter account, reflecting the lack of concentrated power and organized action within the antifa movement.

Instead, further research led me to Mike Isaacson, a John Jay College professor turned antifa leader. Isaacson (@VulgarEconomics on Twitter) has received a great deal of criticism from conservative outlets for his far-left social media presence, including one tweet that referred to his capitalism-supporting students as “future dead cops.”117 While Isaacson has only four thousand followers compared to Richard Spencer’s eighty thousand, his prominence online, as well as the attention he has received outside of the antifa community, made him a fair candidate for seeding the pipeline program.

113 “#Democracy on Fire,” Institute of Politics.
114 Williams, “Antifa activists say violence is necessary.”
115 Edmund DeMarche, “Antifa leader, teacher Yvonne Felarca arrested at ‘empathy tent’ Berkley brawl,” Fox News, September 28, 2017, http://foxnews.com/us/2017/09/28/antifa-leader-teacher-yvonne-felarca-arrested-at-empathy-tent-berkeley-brawl.html.
116 Ibid.
117 Ian Miles Cheong, “Criminal Justice Professor Justifies Antifa Violence and Jokes About Dead Cops,” The Daily Caller, September 14, 2017, http://dailycaller.com/2017/09/14/criminal-justice-professor-justifies-antifa-violence-and-jokes-about-dead-cops/.


Running the pipeline for twenty-four days beginning in late January, I found that out of 15,437 antifa users, 531 were suspended or deactivated – a mere 3.4%. This is about half of the 6.5% removal rate found for the alt-right network. From these numbers alone, it is difficult to conclude whether or not Twitter is biased. To understand the factors behind this difference in removal rates, I surveyed the language of the antifa network using an LDA model and a logistic regression model, the two models described in Chapter Three. By clustering terms, the LDA reveals the major discussion topics of antifa, as presented in Table 4.2.

Table 4.2: Sampled results of LDA on collective (removed and active) antifa terminology.

Sample Topic 1    Sample Topic 2    Sample Topic 3
Trump             Activist          Nazi
President         Event             Republican
Hate              Congress          Racism
Job               Rule              Poor
Review            Tweet             Professor
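
The clustering step behind Table 4.2 can be illustrated with a short sketch. This is not the exact pipeline code; the file name, column name, and parameter values (vocabulary cutoffs, number of topics) are assumptions for illustration, and any standard LDA implementation would serve.

```python
# Minimal sketch of the LDA step over collected tweet text.
# "antifa_tweets.csv" and its "text" column are hypothetical names.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = pd.read_csv("antifa_tweets.csv")

# Bag-of-words matrix: drop English stop words plus very rare and very common tokens.
vectorizer = CountVectorizer(stop_words="english", min_df=5, max_df=0.9)
doc_term = vectorizer.fit_transform(tweets["text"].astype(str))

# Fit LDA; the number of topics here is a tunable assumption, not the thesis's setting.
lda = LatentDirichletAllocation(n_components=10, random_state=0)
lda.fit(doc_term)

# Print the highest-weight terms per topic, mirroring the sampled topics above.
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[-5:][::-1]]
    print(f"Topic {i + 1}: {', '.join(top)}")
```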

Analyzing the meaning of these results, and in turn the communication patterns of antifa, could be a thesis in itself. To avoid getting lost in the data, I direct the reader’s attention to one specific feature: the lack of expletives and slurs within these topics. Compared to the slur-heavy language of the alt-right, this absence is striking. The coefficients of the logistic regression also confirm the lack of expletives, suggesting that they are not the cause of antifa content removal. Every curse word had a coefficient below 0.5, indicating that it was not a strong predictor of appearing in removed tweets.118

118 Note: Resulting coefficients spanned from -2.5 to 2.5.
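
The coefficient check can be sketched in the same spirit: fit a logistic regression that predicts whether a tweet’s author was removed, then read off the weights for a list of expletives. The file name, column names, and word list below are placeholders; the actual feature set and labels come from the pipeline described in Chapter Three.

```python
# Sketch of the coefficient inspection for removed (1) vs. active (0) users' tweets.
# "antifa_tweets_labeled.csv", its columns, and the expletive list are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

tweets = pd.read_csv("antifa_tweets_labeled.csv")

vectorizer = CountVectorizer(stop_words="english", min_df=5)
X = vectorizer.fit_transform(tweets["text"].astype(str))
y = tweets["removed"]  # 1 if the tweet's author was later suspended or deactivated

model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# A weight near or below zero means the term does not push a tweet toward the
# "removed" class; in the antifa data, curse words fell below 0.5.
vocab = vectorizer.vocabulary_
for word in ["damn", "hell"]:  # placeholder expletive list
    if word in vocab:
        print(word, round(model.coef_[0][vocab[word]], 2))
```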


These findings provide preliminary evidence that Twitter removes fewer antifa users because they are less violent in their language.119 This evidence supports the idea that Twitter has no defined political agenda, but is simply removing profiles that it views as unfit for the platform. However, the very fact that Twitter leaves so much room for speculation about its content removal policies speaks to the lack of transparency the company perpetuates between itself and its users. Failing to consistently carry out the very policies it publishes certainly does not help matters.

Further Research

The findings of this thesis prompt a set of questions for further research and consideration regarding Twitter’s treatment of the alt-right.

i. Determining an Altered Set of Twitter Rules

If Twitter is not following the content removal policies that it publishes, does it follow an alternate set of unpublished rules? To answer this question, and to determine what that set of rules is, Twitter’s internal structure would need to be investigated, with particular focus on the pipeline through which new rules are proposed and approved. For example, what is the lag time between testing a new rule and publishing it?

ii. Verifying Alt-Right Profiles

Are there metrics that can determine whether or not a profile is affiliated with the alt-right? There was no overlap between the antifa and alt-right networks I collected, suggesting that the two groups are disparate in real life as well as online. However, future research could determine the exact factors that indicate a user is an alt-right member, in order to ensure that the collected network is completely accurate. This could also provide more insight into Twitter’s accuracy in removing alt-right profiles.

119 Other factors relating to the group’s ideologies likely also contribute to removal, but such exploration is saved for future research.
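
The no-overlap observation itself is straightforward to verify by intersecting the two collected user lists; a minimal sketch follows, assuming each network was exported as a plain text file of usernames (the file names are hypothetical).

```python
# Check how many usernames appear in both collected networks.
# "alt_right_users.txt" and "antifa_users.txt" are assumed one-username-per-line exports.
def load_users(path):
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

alt_right = load_users("alt_right_users.txt")
antifa = load_users("antifa_users.txt")

shared = alt_right & antifa
print(f"{len(shared)} users appear in both networks")  # per the finding above, this was 0
```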

iii. Expanding to Other Platforms

How do other social media platforms handle the presence of the alt-right, and how does the alt-right interact on each site? This thesis’s research could be expanded to include Facebook, YouTube, and even Instagram. Additionally, the bias test described in this chapter could be conducted to determine whether there is truth to the claim that these companies act on their political leanings.

iv. Effectiveness of Content Removal

Does removing users affiliated with the alt-right from social media platforms actually reduce the size and power of the overall population, or does it simply drive them to channels where they can remain untouched? There has long been discussion about the most effective way to mitigate a hate group’s presence through online actions. Determining the crossover of removed users into underground chatrooms and platforms such as Gab would help answer this question.

In Conclusion: The Future of Speech on Twitter

The lack of consistency between the quantitative findings and Twitter’s published policies is disturbing due to the free speech concerns laid out in Chapter One. Even though Twitter is a private company that is not held to the standards of free expression set out in the United States Constitution, removing content without fair reason amounts to censoring speech. This is highly contradictory for a company that claims to provide a platform for free expression.

However, Twitter’s lack of consistency with its content policies might be an issue of carelessness rather than malice. There is no evidence that Twitter is a liberal giant attempting to enforce its views by censoring all disagreeable content. There is also no evidence that Twitter is ruthlessly harsh in removing all profiles associated with the alt-right. Rather, Twitter seems to be racing to keep up with political and societal expectations while prioritizing revenue, and in doing so has handled content review irresponsibly.

For example, on September 23, 2017, President Donald Trump tweeted, “Just heard Foreign Minister of North Korea speak at U.N. If he echoes thoughts of Little Rocket Man, they won’t be around much longer!”120 According to many news outlets, this tweet was a direct threat to North Korea. Given that “threats of violence” violate the Twitter Rules, many questioned why the President’s tweet, if not his account, was not removed.121 Twitter responded that the company takes “newsworthiness” into consideration when determining the fate of a profile or tweet, and that in this case the tweet was a matter of public interest.122 Twitter’s rules now include a clause describing factors of consideration regarding content, one of which is that “the behavior is newsworthy and is in legitimate public interest.”123 Because these rules are not dated and there is no record of when amendments were made, it is unclear whether this clause was added only after the September 23 tweet and the ensuing backlash.

120 Donald J. Trump (@realDonaldTrump), Tweet, September 23, 2017, https://twitter.com/realDonaldTrump/status/911789314169823232.
121 Twitter Public Policy (@Policy), Tweet, September 25, 2017, https://twitter.com/Policy/status/912438520429662208.
122 Ibid.

It is no coincidence that this thesis is bookended by two controversial encounters surrounding Donald Trump’s Twitter account. The President has changed the nature of discourse between the government and the people through his use of the platform. His controversial, flippant tweets, sometimes declaring foreign policy positions or firing cabinet members in 280 characters, regularly shock news outlets and citizens alike. Furthermore, hate groups have always existed, but given that alt-right leaders cite Donald Trump’s presidency as the reason they feel comfortable expressing their views so loudly, these groups are unlikely to shrink in the near future.

This shift in communication, and in the standard of what is considered acceptable, has forced Twitter to address some high-profile cases that have put its content review policies in question.

The case mentioned in Chapter One, in which a disgruntled employee deactivated the President’s account, demonstrates the amount of power Twitter and its employees have to determine what content is permitted on the site. The case just described, in which Twitter provided a loophole for its own content policies to keep the President’s tweet up, demonstrates that the company alters its own rules to justify its choices. These facts, especially when strung together, are highly alarming. Given the quantity and frequency of communication on the site, Twitter is an incredibly influential platform and, by the argument laid out in Chapter One, should perhaps be considered under the same free speech framework as in-person speech. Yet Twitter’s content removal is so opaque that the company does not even fully follow its own guidelines. Given its lack of internal consistency and its willingness to change rules on a case-by-case basis without any notion of precedent or transparency to the public, Twitter can effectively justify any content removal decision.

123 Twitter, “Twitter Rules.”

Twitter serves a community of hundreds of millions of users, who use the site to read news, express opinions, and engage in discussion: it is the modern version of Aristotle’s agora. However, if Twitter continues to be inconsistent with its own policies, shifty in its justifications, and largely opaque to the general population, it may alter the very definition of “free expression.”


Bibliography

“#Democracy on Fire: Twitter, Social Movements, and the Future of Dissent.” The Institute of Politics, Harvard University, Cambridge, MA, October 27, 2017.

Associated Press. “Writing about the ‘alt-right.’” Accessed November 23, 2017. https://blog.ap.org/behind-the-news/writing-about-the-alt-right.

Astor, Maggie. “Rogue Twitter Employee Briefly Shuts Down Trump’s Account.” New York Times, November 2, 2017. https://www.nytimes.com/2017/11/02/us/politics/trump-twitter-deleted.html.

Berger, J.M. “Nazis vs. ISIS on Twitter: A Comparative Study of White Nationalist and ISIS Online Social Media Networks.” George Washington University Program on Extremism (September 2016).

Bickert, Monica. “The Line Between Hate and Debate.” Berkman Klein Center for Internet & Society, Cambridge, MA, September 19, 2017.

Brandom, Russell. “How to spot a Twitter bot.” The Verge, August 13, 2017. https://www.theverge.com/2017/8/13/16125852/identify-twitter-bot-botometer-spambot-program.

Burch, Sean and Jon Levine. “Alt-Right mad that Twitter’s ‘Hate Conduct’ Policy Doesn’t Ban Liberals.” The Wrap, December 18, 2017. https://thewrap.com/twitter-hate-conduct-left-wing-extremists/.

Cheong, Ian Miles. “Criminal Justice Professor Justifies Antifa Violence and Jokes About Dead Cops.” The Daily Caller, September 14, 2017. http://dailycaller.com/2017/09/14/criminal-justice-professor-justifies-antifa-violence-and-jokes-about-dead-cops/.

Corbyn, Zoe. “Facebook experiment boosts US voter turnout.” Nature, September 12, 2012. https://nature.com/news/facebook-experiment-boosts-us-voter-turnout-1.11401.

DeCew, Judith. “Privacy.” Stanford Encyclopedia of Philosophy, Stanford University, May 14, 2002. https://plato.stanford.edu/entries/privacy/.

DeMarche, Edmund. “Antifa leader, teacher Yvonne Felarca arrested at ‘empathy tent’ Berkley brawl.” Fox News, September 28, 2017. http://foxnews.com/us/2017/09/28/antifa-leader-teacher-yvonne-felarca-arrested-at-empathy-tent-berkeley-brawl.html.

Dewey, Caitlin. “What was fake on the Internet this election.” The Washington Post, October 24, 2017. https://washingtonpost.com/news/the-intersect/wp/2016/10/24/what-was-fake-on-the-internet-this-election-george-soross-voting-machines.

Etherington, Darrell. “Twitter to revoke verification for some accounts as part of overhaul.” TechCrunch, November 15, 2017. https://techcrunch.com/2017/11/15/twitter-to-revoke-verification-for-some-accounts-as-part-of-overhaul/.

Fleishman, Cooper and Anthony Smith. “Neo-Nazis Are Targeting Victims Online With This Secret Symbol Hidden in Plain Sight.” Mic, June 8, 2016. https://mic.com/articles/144228/echoes-exposed-the-secret-symbol-neo-nazis-use-to-target-jews-online.

Halliday, Josh. “Twitter’s Tony Wang: ‘We Are the Free Speech Wing of the Free Speech Party.’” The Guardian, March 22, 2012. https://www.theguardian.com/media/2012/mar/22/twitter-tony-wang-free-speech.

Jaffer, Jameel. “Government Secrecy in the Age of Information Overload.” Salant Lecture, Shorenstein Center, Cambridge, MA, October 17, 2017.

King, Gary, Jennifer Pan, and Margaret E. Roberts. “How Censorship in China Allows Government Criticism but Silences Collective Expression.” American Political Science Review 107, no. 2 (May 2013): 1-18. http://j.mp/2nxNUhk.

Klonick, Kate. “The Terrifying Power of Internet Censors.” New York Times, September 13, 2017. https://www.nytimes.com/2017/09/13/opinion/cloudflare-daily-stormer-charlottesville.html.

Larson, Selena. “Twitter Suspends 377,000 Accounts for pro-Terrorism Content.” CNN, March 21, 2017. http://money.cnn.com/2017/03/21/technology/twitter-bans-terrorism-accounts/.

Lewis, Becca. Skype Conversation. Data & Society, October 14, 2017.

Lohr, Steve. “With ‘Brandeis’ Project, Darpa Seeks to Advance Privacy Technology.” New York Times, September 14, 2015. https://bits.blogs.nytimes.com/2015/09/14/with-brandeis-project-darpa-seeks-to-advance-privacy-technology/.

Miller, Brian. “There’s No Need to Compel Speech. The Marketplace of Ideas Is Working.” Forbes, December 4, 2017. https://www.forbes.com/sites/briankmiller/2017/12/04/theres-no-need-to-compel-speech-the-marketplace-of-ideas-is-working/#7f7686da4e68.

Moser, Bob. “How Twitter’s Alt-Right Purge Fell Short.” Rolling Stone, December 19, 2017. https://www.rollingstone.com/politics/news/how- -alt-right-purge-fell-short-w514444.

O’Brien, Luke. “The Making of an American Nazi.” The Atlantic, December 2017. https://www.theatlantic.com/magazine/archive/2017/12/the-making-of-an-american-nazi/544119/.

Pew Research. “Social Media Fact Sheet.” Last modified January 12, 2017. http://www.pewinternet.org/fact-sheet/social-media/.

Roose, Kevin. “The Political Leanings of Silicon Valley.” NYMag, January 24, 2013. http://nymag.com/daily/intelligencer/2013/01/political-leanings-of-silicon-valley.html.

Roose, Kevin. “The Young and Brash of Tech Grow a Bit Older, and Wiser.” New York Times, March 14, 2018. https://nytimes.com/2018/03/14/technology/tech-leaders-growing-up.html.

Roose, Kevin. “This Was the Alt-Right’s Favorite Chat App. Then Came Charlottesville.” New York Times, August 15, 2017. https://www.nytimes.com/2017/08/15/technology/discord-chat-app-alt-right.html.

Rosen, Jeffrey. “The Deciders: The Future of Free Speech in a Digital World.” Salant Lecture, Shorenstein Center, Cambridge, MA, October 13, 2016.

Shear, Michael D. and Adam Goldman. “Michael Flynn Pleads Guilty to Lying to the F.B.I. and Will Cooperate With Russia Inquiry.” New York Times, December 1, 2017. https://nytimes.com/2017/12/01/us/politics/michael-flynn-guilty-russia-investigation.html.

Sonnad, Nikhil. “The Alt-Right Is Creating Its Own Dialect. Here’s the Dictionary.” Quartz, October 30, 2017. https://qz.com/1092037/the-alt-right-is-creating-its-own-dialect-heres-a-complete-guide/.

Southern Poverty Law Center. “Alt-Right.” Accessed December 3, 2017. https://www.splcenter.org/fighting-hate/extremist-files/ideology/alt-right.

Southern Poverty Law Center. “Richard Bertrand Spencer.” Accessed January 13, 2018. https://splcenter.org/fighting-hate/extremist-files/individual/richard-bertrand-spencer-0.

Twitter. “Our Approach to Bots & Misinformation.” Accessed February 1, 2018. https://blog.twitter.com/official/en_us/topics/company/2017/Our-Approach-Bots-Misinformation.html.

Twitter. “Our values.” Accessed February 1, 2018. https://about.twitter.com/en_us/values.html.

Twitter. “The Twitter Rules.” Accessed November 4, 2017. https://help.twitter.com/en/rules-and-policies/twitter-rules.

Twitter. “Violent extremist groups.” Accessed November 4, 2017. https://help.twitter.com/en/rules-and-policies/violent-groups.

Vice News. “Charlottesville: Race and Terror.” Last modified August 14, 2017. https://news.vice.com/en_us/article/qvzn8p/vice-news-tonight-full-episode-charlottesville-race-and-terror.

Warren, Samuel and Louis D. Brandeis. “The Right to Privacy.” Harvard Law Review 4, no. 5 (1890): 193–220.

Williams, Katie Bo. “Antifa activists say violence is necessary.” The Hill, September 14, 2017. http://thehill.com/policy/national-security/350524-antifa-activists-say-violence-is-necessary.

Wood, Graeme. “His Kampf.” The Atlantic, June 12, 2017. https://www.theatlantic.com/magazine/archive/2017/06/his-kampf/524505/.

Yan, Holly and Dan Simon. “Undocumented immigrant acquitted in Kate Steinle death.” CNN, December 1, 2017. https://cnn.com/2017/11/30/us/kate-steinle-murder-trial-verdict/index.html.

Yiannopoulos, Milo and Allum Bokhari. “An Establishment Conservative’s Guide To The Alt-Right.” Breitbart, March 30, 2016. http://www.breitbart.com/tech/2016/03/29/an-establishment-conservatives-guide-to-the-alt-right/.