Technical Report UW-CSE-2018-12-02
Technology-Enabled Disinformation: Summary, Lessons, and Recommendations

December 2018

John Akers, Gagan Bansal, Gabriel Cadamuro, Christine Chen, Quanze Chen, Lucy Lin, Phoebe Mulcaire, Rajalakshmi Nandakumar, Matthew Rockett, Lucy Simko, John Toman, Tongshuang Wu, Eric Zeng, Bill Zorn, Franziska Roesner ([email protected])
Paul G. Allen School of Computer Science & Engineering, University of Washington

Abstract

Technology is increasingly used — unintentionally (misinformation) or intentionally (disinformation) — to spread false information at scale, with potentially broad-reaching societal effects. For example, technology enables increasingly realistic false images and videos, and hyper-personal targeting means different people may see different versions of reality. This report is the culmination of a PhD-level special topics course (https://courses.cs.washington.edu/courses/cse599b/18au/) in Computer Science & Engineering at the University of Washington's Paul G. Allen School in the fall of 2018. The goals of this course were to study (1) how technologies and today's technical platforms enable and support the creation and spread of such mis- and disinformation, as well as (2) how technical approaches could be used to mitigate these issues. In this report, we summarize the space of technology-enabled mis- and disinformation based on our investigations, and then surface our lessons and recommendations for technologists, researchers, platform designers, policymakers, and users.

1 Introduction

Misinformation is false information that is shared with no intention of harm. Disinformation, on the other hand, is false information that is shared with the intention of deceiving the consumer [49, 110]. This phenomenon, also referred to as information pollution [110] or false news [54], is a critical and current issue. Though propaganda, rumors, misleading reporting, and similar issues are age-old, our current technological era allows these types of content to be created easily and realistically, and to spread with unprecedented speed and scale. The consequences are serious and far-reaching, including potential effects on the 2016 U.S. election [33], distrust in vaccinations [9], murders in India [16], and impacts on the lives of individuals [50].

In this report, which is the culmination of a PhD-level special topics course in Computer Science & Engineering at the University of Washington's Paul G. Allen School in the fall of 2018, we consider the phenomenon of mis/disinformation from the perspective of computer science. The overall space is complicated and will require study from, and collaborations among, researchers from many different areas, including cognitive science, economics, neuroscience, political science, psychology, public health, and sociology, just to name a few. As technologists, we focused in the course, and thus in this report, on the technology. We ask(ed): what is the role of technology in the recent plague of mis- and disinformation? How have technologies and technical platforms enabled and supported the spread of this kind of content? And, critically, how can technical approaches be used to mitigate these issues — or how do they fall short?

In Section 2, we summarize the state of technology in the mis/disinformation space, including reflecting on what is new and different from prior information pollution campaigns (Section 2.1), surveying work that studies current mis/disinformation ecosystems (Section 2.2), and summarizing the underlying enabling technologies (specifically, false video, targeting and tracking, and bots, in Section 2.3), as well as existing efforts to curb the effectiveness or spread of mis/disinformation (Section 2.4). We then step back to summarize the lessons we learned from our exploration (Section 3.1) and make recommendations particularly for technologists, researchers, and technical platform designers, as well as for policymakers and users (Section 3.2).

Our overarching conclusions are two-fold. First, as technologists, we must be conscious of the potential negative impacts of the technologies we create, recognize that their designs are not neutral, and take responsibility for understanding and minimizing their potential harms. Second, however, the (causes and) solutions to mis/disinformation are not just based in technology. For example, we cannot just "throw some machine learning at the problem". In addition to designing technical defenses — or designing technology to attempt to avoid the problem in the first place — we must also work together with experts in other fields to make progress in a space where there do not seem to be easy answers.

2 Understanding Mis- and Disinformation

A key goal of this report is to provide an overview of technology-enabled mis/disinformation — what is new, what is happening, how technology enables it, and how technology may be able to defend against it — which we do in this section. Others have also written related and valuable summaries that contributed to our understanding of the space and that we recommend the interested reader cross-reference [49, 54, 63, 110].

2.1 What Is New?

While propaganda and its ilk are not a new problem, it is becoming increasingly clear that the rise of technology has served to catalyze the creation, dissemination, and consumption of mis/disinformation at scale. Here we briefly detail, from a technology standpoint, the major factors that have contributed to our current situation:

1. Democratization of content creation. Anyone can start a website, blog, page, account, or similar and begin creating and sharing content. There has been a transformation from a highly centralized system with a few major content producers to a decentralized system with many content producers. This flood of information makes it difficult to distinguish between true and false information (i.e., information pollution). Additionally, tools to create realistic fake content have become available to non-technical actors (e.g., DeepFakes [10, 48]), significantly lowering the barrier to entry in terms of time and resources for people to create extremely realistic fake content.

2. Rapid news cycle and economic incentives. The more clicks a story gets, the more money gets made through ad revenue. Given the rapid news cycle and plethora of information sources, getting more clicks requires producing content that catches consumers' attention — for instance, by appealing to their emotions [74] or playing into their other cognitive biases [6, 89]. In addition, social media platforms that provide free accounts to users make money via targeted advertising. This allows a variety of entities, with intentions ranging from helpful to harmful, to get their messages in front of users.

3. Wide and immediate reach and interactivity. Content created on one side of the planet can be viewed on the other side of the planet nearly instantaneously. Moreover, these content creators receive immediate feedback on how their campaigns are performing — and can easily iterate and A/B test — based on the number of likes, shares, clicks, comments, and other reactions on social media. Contrast the six long years that it took for the Russian disinformation campaign claiming that AIDS had been created by the U.S. to take off in the 1980s [7, 20] with the viral spread of false stories like Pizzagate [5] during the 2016 U.S. election.

4. Organic and intentionally created filter bubbles. In today's web experience, individuals get to choose what content they do or do not want to see, e.g., by customizing their social media feeds or visiting specific websites for news. The resulting echo chambers are often called "filter bubbles" [72] and can help mis/disinformation spread to receptive people who consume it within a limited broader context. Though some recent studies suggest that filter bubbles may not be as extreme as feared (e.g., [39]), others find evidence for increased polarization [62]. Moreover, cognitive biases like the backfire effect (discussed below) suggest that exposure to alternate views is not necessarily effective in helping people change their minds. Worse, entities wishing to spread disinformation can intentionally create filter bubbles by directly and accurately targeting (e.g., via ads or sponsored posts) potentially vulnerable individuals.

5. Algorithmic curation and lack of transparency. Content curation algorithms have become increasingly complex and fine-tuned — e.g., the algorithm that determines which posts, in what order, are displayed in a user's Facebook news feed, or YouTube's recommender algorithms, which have been frequently criticized for pushing people towards more extreme content [102]. At the same time, these algorithms are not transparent for end users (e.g., Facebook's ad explanation feature [3]). This complexity can make it challenging for users to understand why they are having specific viewing experiences online, what the real source of information (e.g., sponsored content) is [64], and even that they are having different experiences from other users.

6. Scale and anonymity in online accounts. Actors wishing to spread disinformation can leverage the weak identity and account management frameworks of many online platforms to create a large number of accounts ("sybils" or bots) in order to, for example, pose as legitimate members of a political movement and/or artificially create the impression that particular content is popular (e.g., [4]). It is not clear how to address this issue, as stricter account verification can have negative consequences for some legitimate users [43], and bot accounts can have legitimate purposes [36, 82].

What has not changed: Cognitive biases of human consumers. Though the above technology trends have contributed to creating the currently thriving mis/disinformation ecosystem, ultimately, the content that is effective relies on something that has not changed: the cognitive biases of human consumers [6, 89]. A prescient article from 2002 [14] discussed how these cognitive biases can be exploited via technology, and such strategies are common in marketing [89] and traditional propaganda. A common such bias is "confirmation bias", by which people pay more attention and lend more credence to information that confirms something they are already inclined to believe. Another example that will recur in this report is the "backfire effect", which suggests that when confronted with facts intended to change a person's opinion or belief about something, this confrontation can have the opposite effect, causing people to more strongly hold their original belief [59, 70]. A full survey of these exploitable human cognitive "vulnerabilities", though outside the scope of this report, is an important component of understanding why mis/disinformation campaigns are effective and how (not) to combat them.

2.2 Measuring the Problem

We begin by surveying work that has studied today's mis/disinformation ecosystems directly, attempting to quantitatively and qualitatively understand the problem. Understanding the problem — who are the adversaries, who are the targets, what are the techniques used, and why are they effective — is crucial for ultimately developing effective defenses.

2.2.1 Ecosystem Studies

A number of academic studies have now examined various aspects of the mis/disinformation ecosystem. Many of these studies are case studies focusing on a particular issue, campaign, or time span (e.g., [4, 40, 92]). Some more general studies or surveys exist as well (e.g., [63, 97, 115]). We summarize some of the key findings from this work here.

Techniques. One repeated theme in this research is that sources of mis/disinformation rely on a range of techniques in terms of sophistication. For example, a study of disinformation campaigns embedded in Twitter conversations around Black Lives Matter suggested that social media accounts that spread mis/disinformation range from low-effort spambots, to slightly more human-seeming, semi-automated accounts with some degree of human supervision, to purely human-run accounts that build large followings before exploiting the trust of those social links to spread mis/disinformation widely [4]. Supporting these results more broadly, Marwick et al. [63] provide a partial survey of groups and movements that have been particularly effective in manipulating the media to spread their messages, using methods such as: impersonation of activists to exploit the trust aspect of social networks, bot promotion of a story on social media, and government-funded media outlets.

Structure. Understanding how mis/disinformation content or information ecosystems differ from other information ecosystems can potentially help detect them. For example, a study of alternate news media coverage of the Syrian White Helmets found that these media often copy content without attribution; clusters of sources which shared articles reveal a densely-linked alternative media cluster which is largely disconnected from reputable news media [92].

Consumers. In addition to understanding disinformation campaigns, ecosystem studies can help us understand the behaviors of human consumers. For example, a study of the geographic and temporal trends of fake news consumption during the 2016 U.S. election found a correlation of aggregate voting behavior with consumption of fake news across states [40] (though the causality is, of course, unclear).

More generally, evidence suggests [97] that users base their trust in an article on their trust of the friend who they saw share it, i.e., information local to them, rather than the source or upstream links in the chain of shares. Further, people mistakenly believe that they take the trustworthiness of the source into account more than they actually do [63]. On some platforms, design choices mean that messages may spread in a way that obscures the source (e.g., WhatsApp) or conceals intermediate links in the chain between source and sharer (e.g., Twitter). These design choices may contribute to the spread of mis/disinformation and can make it difficult for external researchers to study the ecosystem.

2.2.2 Tools for Measurement

A major challenge in conducting measurement studies of mis/disinformation, or of attempting to counter it by fact-checking individual stories, is that it requires significant manual effort. To help with this process, several tools — for researchers, journalists, and/or end users — have been developed to surface the sources and trajectories of shared content or to facilitate fact checking directly.

For example, Hoaxy [71] is a tool that collects and tracks mis/disinformation and corresponding fact checking efforts on Twitter. A study by its creators [84] found that, perhaps unsurprisingly, fact checking content is shared significantly less than the original article. In another study [85], its creators used Hoaxy to demonstrate the central role of social spambots in the mis/disinformation ecosystem — a valuable conclusion, as it suggests that curbing these bots may be crucial in the fight against mis/disinformation (see Section 2.3.3).

TwitterTrails [38] is a related tool that provides information regarding the origin, burst characteristics, and propagation of content on Twitter. The resulting visualizations are intended to help users investigate the veracity of the content. Hamilton 68 [1] similarly provides a dashboard of information about Russian influence operations on Twitter.

Challenges. Though valuable for helping us understand the broader ecosystem, none of these tools is yet a silver bullet. Some (e.g., TwitterTrails and Emergent [87], a seemingly discontinued real-time rumor tracker) suffer from being insufficiently automated. Additionally, in some cases, it is not directly clear how the surfaced information should be used by researchers or end users, and worse, it may be misunderstood [2]. Finally, these tools are all targeted at Twitter because of its public nature; closed platforms like Facebook do not support similar efforts. These limitations are not due to a lack of effort or intention by the tools' creators but rather due to fundamental challenges. Mis/disinformation campaigns vary widely in their techniques, automatically detecting "fake news" is hard (as discussed below), humans suffer from fundamental cognitive biases, and while knowing the source and trajectory of content can help shed light on the dynamics of the ecosystem, measurement may not directly provide a solution. Nevertheless, such tools are a valuable component of a defender's toolbox.

2.3 Enabling Technologies

We now turn to several key enabling technologies that underlie the recent rise of mis/disinformation at scale: extremely realistic and easy-to-create false video (Section 2.3.1), individualized tracking of a user's browsing behaviors and hyper-personal targeting of content based on the results of this tracking (Section 2.3.2), and social media bots (Section 2.3.3).

2.3.1 False Video

False media has existed for as long as there has been media to falsify: forgers have faked documents or works of art, teenagers have faked driver's licenses, etc. With the advent of digital media, the problem has been amplified, with tools like Photoshop making it easy for relatively unskilled actors to perform sophisticated alterations to photographs. More recently, developments in AI have extended that capability to permit the creation of realistic false video.

The most popular technique to produce photorealistic false video is the reanimation of faces. In a typical setup, a target video, such as an announcement by a politician, is reanimated to show the same sequence of facial expressions as a source video, such as a political opponent sitting in front of a webcam in their living room. Usually the reanimation process is implemented with generative adversarial networks, and a significant amount of training data may be required. The resulting false video is often referred to as a "deepfake".

Instantiations. Various implementations exist in the literature and publicly on the internet. The Face2Face tool, published in 2016 [98], is an early example of real-time facial reenactment. In 2017, the term "deepfake" went public on reddit, along with the FakeApp tool [48, 111]. The Deep Video Portraits project from Stanford significantly refined the process in 2018 [51], requiring only a few minutes of video of the target to create a realistic real-time reanimation. Other work from 2017 [95] allows the reanimation to be automatically derived from an audio stream, without the need for a source video (but with the need for significant training video data of the target, e.g., Obama).

Implications. Synthetic, photorealistic video is hardly new, but historically it has been prohibitively expensive to produce, limiting its appearance to scenarios like special effects in movies. Tools like FakeApp democratize this capability, making it possible for individuals to create similarly high-quality synthetic video in a restrictive but growing set of cases. And we can only expect synthetic video technology to improve, given its importance in the entertainment industry, from video games to Snapchat filters. In spite of its entertainment value, the increasing prevalence of synthetic video can have negative effects on our information ecosystem, from spreading mis/disinformation to undermining the credibility of legitimate information (allowing anyone to argue that anything has been faked).

2.3.2 Tracking and Targeting

The economic model of today's web is based on advertising: users receive content for "free" in exchange for seeing advertisements. To optimize these advertisements, they are increasingly targeted at specific consumers; to enable this targeting, widespread and sophisticated tracking of user browsing behaviors has been developed. These technologies of tracking and targeting can be leveraged not only by legitimate traditional advertisers but also by entities wishing to spread mis/disinformation for monetary gain (e.g., viral content to maximize clicks) as well as for directly malicious purposes (e.g., affecting the outcome of an election).

Tracking. There has been significant prior work studying tracking and targeting within the computer security and privacy community, which we very briefly summarize here. At the highest level, so-called third-party tracking is the practice by which companies embed content — like advertising networks, social media widgets, and website analytics scripts — in the first-party sites that users visit directly. This embedding relationship allows the third party to re-identify users across different websites and build a browsing history. In addition to traditional browser cookie based tracking [56, 79], tracking also manifests itself in browser fingerprinting, which relies on browser-specific and OS-specific features and quirks to get a fingerprint of a user's browser that can be used to correlate the user's visits across websites [21, 69]. Though tracking can be used for legitimate purposes — e.g., preventing fraudulent purchases or providing more personalized content — there have been increasing concerns about its privacy implications (e.g., as evidenced by the General Data Protection Regulation, or GDPR, in the EU). At present, though a number of anti-tracking browsers, browser extensions, and other tools exist (e.g., [8, 19, 41, 100]), there exists no fully reliable way for users to avoid tracking.
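To make the fingerprinting idea above concrete, the following minimal sketch (our own illustration, not the code of any real tracker) hashes a set of browser and OS attributes into a stable identifier; the attribute names and values are hypothetical stand-ins for what a script embedded in a page could collect.

    import hashlib
    import json

    def fingerprint(attributes: dict) -> str:
        """Hash a dictionary of browser/OS attributes into a stable identifier."""
        # Canonicalize so the same attributes always yield the same hash.
        canonical = json.dumps(attributes, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    # Two visits to unrelated sites, with no cookies in common, can still be
    # linked if the collected attributes (and therefore the hash) match.
    visit_a = {
        "user_agent": "Mozilla/5.0 (X11; Linux x86_64) ...",
        "screen": "1920x1080",
        "timezone": "America/Los_Angeles",
        "fonts": ["Arial", "DejaVu Sans", "Noto Serif"],
    }
    visit_b = dict(visit_a)  # the same browser visiting a different site

    print(fingerprint(visit_a) == fingerprint(visit_b))  # True: visits linked

Real fingerprinting scripts gather many more signals (canvas rendering, audio stack, plugins), which is part of what makes the resulting identifier hard for users to reset.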

Targeting. Tracking can be and often is ultimately leveraged to target particular content at a user, e.g., via advertisements on Facebook or embedded in other websites (e.g., news sites). Researchers have found that even when advertising platforms like Facebook provide some transparency for users about why certain ads are targeted at them, the information provided is overly broad and vague [3], failing to improve users' understanding of why they are seeing something or what the platform knows about them [22]. The act of targeting itself can raise privacy concerns, because it can allow potentially malicious advertisers to extract private information about users stored in advertising platforms (like Facebook) [31, 105, 106]; it can also be used to intentionally or unintentionally target individuals based on sensitive attributes (e.g., health issues) [55]. These very ecosystems, which underlie the economic model of today's web and thus cannot be easily displaced, can be directly exploited by disinformation campaigns to target specific vulnerable users and to create and leverage filter bubbles.

2.3.3 Social Bots

A common tactic seen in recent disinformation campaigns is the use of social media bots to amplify low-credibility content to give the false impression of significant support or interest [36, 85], or to fan the flames of partisan debate [4]. Indeed, bot accounts are common on social media platforms. Some studies estimate that as of 2017, up to 15% of Twitter accounts are bots [104]; Twitter itself recently suspended bot accounts, resulting in a cumulative 6% drop of followers across the platform [12].

Bot behaviors. Researchers have characterized the evolution of bots over time, referring to the recent iteration as "social spambots" [36]. Earlier spambots used simplistic techniques — such as tweeting spam links at many different accounts [13] — that were relatively easy to detect. By contrast, social spambots use increasingly sophisticated techniques to post context-relevant content based on the community they are attempting to blend into. These bots slowly build trust and credible profiles within online communities [36]. They then either programmatically or manually (via a human operator, in a human-bot combination sometimes called a "cyborg" [104]) disseminate false news and extremist perspectives in order to sow distrust and division among human users [4]. Social bots have been shown to be effective at swaying online discussion as well as at inflating the apparent support for various political campaigns and social movements [36, 85].

Detection and measurement. Characterizing bot interactions and measuring the social bot ecosystem are critical steps towards being able to detect and eliminate malicious bots. Research into bot ecosystems has found, for example, that unsophisticated bots interact with sophisticated bots to give the sophisticated bots more credibility [104], and that accounts spreading articles from low-credibility sources are more likely to be bots [85].

Various efforts have been made to detect and combat social bots online — a nice summary is written by Ferrara et al. [36]. Detection approaches commonly leverage machine learning to find differences between human users and bot accounts. Detection models make use of the social network graph structure, account data and posting metrics, as well as natural language processing techniques to analyze the text content from profiles (a toy sketch of this feature-based approach appears at the end of this subsection). Crowdsourcing techniques have also been attempted [108]. From the platform provider's vantage point, a promising approach relies on monitoring account behavior such as time spent viewing posts and number of friend requests sent [112].

Challenges. Multiple challenges arise in the fight against malicious social bots. First, some bots are benign or beneficial [36, 82], such as bots that help with news aggregation and disaster relief coordination. More generally, using machine learning to detect social bots can be challenging due to the lack of ground truth and real-world training data, especially since some bots or cyborgs can evade detection even by human evaluators [13]. Ultimately, bot detection is a classic arms race between attackers and defenders [13].

Another challenge is the potential misalignment of incentives: by identifying and banning bot accounts, social media platforms risk depressing their valuations when it becomes clear that the number of legitimate users is lower than previously believed [12, 81]. This incentive problem is in tension with the fact that the platforms themselves have the best visibility into the account behaviors that can be effective in detection [112]. Indeed, most academic bot detection work is centered on Twitter, as other platforms are much more difficult to study from an external perspective.

Finally, the current mis/disinformation problem is not solely the result of bots; as we have seen, some disinformation campaigns involve sophisticated human operators [4], and the advertising-based economic model of the web encourages clickbait-style content more generally.

2.4 Defenses

Finally, we summarize existing efforts on defending against the spread of mis/disinformation, either by attempting to detect it directly (Section 2.4.1) and/or by changing the user experience in an information platform (Section 2.4.2).

2.4.1 Fake News Detection

One approach for combating mis/disinformation is to directly tackle the problem as such: detecting fake or misleading news. Such detection could enable (1) social platforms to reduce the spread and influence of fake news, and (2) readers to select reliable sources of information.

Fake news detection is a space where it is tempting to try to just "throw some machine learning at the problem" (e.g., [18, 73]). However, our exploration of the space suggests that breaking fake news detection down into more carefully defined sub-problems may be more realistic and meaningful at this point.

Definition. To detect fake news, we first need a formal, well-defined notion of what is "fake". A variety of definitions exist in the literature, depending on factors such as whether the facts are verifiably false and whether the publisher's intent is malicious [86], and including definitions that include or exclude satire, propaganda, and hoaxes [80]. A commonly used definition [32, 86] of fake news in the context of today's mis/disinformation discussions is information that is both (1) factually false and (2) intentionally misleading.

Fake news detection as a unified task. At first blush, one might approach the problem of fake news detection as a unified task. With machine learning, it is possible to directly train end-to-end models using labeled data (both fake and non-fake articles) [73]. The accessibility of machine learning tools creates a low barrier for attempting this technique [18]. However, posing the problem generally in this way comes with pitfalls. First, it requires large amounts of labeled examples, which can be costly to obtain; past work has used examples from Politifact [75, 109] and Emergent [37, 87] as data sources instead. Additionally, many of these systems are based on neural networks, which lack explanations for their output; this makes it hard to understand why they do or do not do well and limits their usefulness in downstream tasks. Biases in the datasets used can also translate to hidden biases in the classifier, affecting performance when used in real-world situations.

Subtask: Detecting factual inaccuracies. More effective, particularly from the perspective of advancing the underlying state of the art in natural language processing or related fields, is to solve well-defined sub-problems of fake news detection. One such sub-problem is detecting factual inaccuracies and falsehoods, by examining and verifying the facts in the content of the article based on external evidence.

A common method deployed in practice is to use experts (fact-checkers) to manually extract the facts and validate each one using any tools necessary (including actively conducting research); examples include Snopes [90] and Politifact [75]. There are also some platforms which use volunteer crowd work to evaluate evidence, for educational purposes (e.g., Mind Over Media [65]) or to maintain the veracity of a resource (e.g., Wikipedia).

Using humans to fact-check is, of course, time-consuming and expensive. Automated knowledge-based verification exists but is not yet practical, because it is difficult to formalize claims from natural language, and knowledge bases are incomplete and inflexible with novel information. Recent work has thus focused on even further sub-problems, such as identifying whether a claim is backed up by given sources [15, 99, 107] or using unstructured external web information [76]. For example, the Fake News Challenge [32, 44] focused on a stance detection sub-problem (based on a prior dataset [37]) that involves estimating whether a given text agrees with, disagrees with, discusses, or is unrelated to a headline.

Subtask: Detecting misleading style. Another sub-problem of identifying fake news is to infer the intent of the article by analyzing its style [109]. For example, attempts at deception may be found through identifying manipulative wording or weasel words [66, 78], examining the tone of voice (such as identification of clickbait titles), and gauging author inclinations [60] and stance [15, 37]. Earlier work on deception detection (e.g., [66]) is also relevant. All of these analyses can be done either with trained human experts or through machine learning models. While writing style offers a strong hint of an author's intent, an absence of manipulative style does not imply good intentions. Conversely, manipulative styles of writing may be associated with other related phenomena, such as hyper-partisanship [77], rather than fake news directly.

Subtask: Analyzing metadata. Other work has focused on analyzing article metadata rather than article content. For example, when social features, such as shares and likes, are available, they can be used to infer the intent of a collective group. Social features that have been explored include information about the connectivity of the social graph between users, known attributes and the profile of the sharer, and the path of propagation of information [58, 60, 92, 96].

Challenges. While there is promise in automating fake news detection, there are also possible pitfalls. First, the natural (skewed) distribution of fake news can lead to misleading performance numbers for detection systems, as noted in the Fake News Challenge [32] postmortem [44]. The increasing accessibility of machine learning tools, coupled with a lack of understanding during evaluation, can lead to deployment of systems that are not sufficiently accurate in practice and may have unintended consequences. Secondly, bad actors may be able to leverage fake news detection approaches to create better fake content (as in a generative adversarial network, or GAN), though such an arms race is likely inevitable.

Finally, fake news detection alone, even if done perfectly, does not guarantee a defense against mis/disinformation — humans have psychological and cognitive patterns that determine how we actually utilize the information obtained from detection. For example, the backfire effect [59, 70] suggests that giving people evidence that something they believe is false can backfire, leading them to believe it more strongly. Thus, an open but important question is: how should the results of fake news detection best be incorporated into tools or interfaces for end users?
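As an illustration of the stance detection sub-problem discussed above, the sketch below treats it as supervised text-pair classification with bag-of-words features. This is our own minimal baseline over toy data, not the Fake News Challenge reference implementation; a real system would train on the challenge's Emergent-derived dataset [32, 37, 44] and use stronger features and models.

    from scipy.sparse import hstack
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Toy (headline, body, stance) examples standing in for a labeled corpus.
    train = [
        ("Vaccine caused the outbreak, report claims", "Health officials denied any such link ...", "disagree"),
        ("City flooded after storm", "Heavy rain flooded several neighborhoods overnight ...", "agree"),
        ("Celebrity spotted on Mars", "The stock market closed slightly higher today ...", "unrelated"),
        ("New privacy law under debate", "Lawmakers discussed the proposal but did not vote ...", "discuss"),
    ]
    headlines, bodies, stances = zip(*train)

    # One shared vocabulary; headline and body get separate feature blocks.
    vec = TfidfVectorizer().fit(headlines + bodies)

    def featurize(hs, bs):
        return hstack([vec.transform(hs), vec.transform(bs)])

    clf = LogisticRegression(max_iter=1000).fit(featurize(headlines, bodies), stances)

    # Predict the stance of a new body toward a headline.
    print(clf.predict(featurize(
        ["Vaccine caused the outbreak, report claims"],
        ["Experts called the claim false and pointed to contrary evidence ..."],
    ))[0])

Even this simple decomposition makes the evaluation pitfalls noted under Challenges visible: with skewed class distributions, accuracy alone is a misleading metric.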

2.4.2 UI/UX Interventions

Researchers and social media platforms have proposed user experience interventions to help users detect online mis/disinformation and reduce the rate at which it spreads. Some of these changes have already been deployed to users, as we describe here. However, it is unclear to the research community whether these interventions have been effective.

Labeling potential mis/disinformation. One approach is to use the output of fake news detection or similar classifications in order to label low-credibility content. For example, in 2017, Facebook experimented with labeling news articles as "disputed" if they were believed false; however, this approach was abandoned, as Facebook learned from academics that this does not change people's minds [29]. Instead, Facebook began showing "related" articles [27], which may include articles from fact-checking sites like Snopes [90].

More information for users. Because it is difficult to definitively detect mis/disinformation, and because it can be ineffective to surface this fact to users directly, a common strategy is to simply provide more information to users to help them make their own determination. Facebook's related articles feature [27] is one example. Other examples from Facebook include more prominently labeling political ads and their funders [30] and providing users with more context about shared articles [26]. And because hoaxes on WhatsApp have been increasingly shared via forwarded messages, in July 2018, WhatsApp began displaying "Forwarded" before such messages to help users understand that the message may not have been written by the immediate sender [45].

Third-party tools also exist. For example, the SurfSafe browser extension [53, 94] can help users identify doctored images by showing them other websites that the image has been found in — so if an image appears newsworthy, but does not appear on any mainstream news sites, users could interpret it as being faked. SurfSafe also highlights whether an image also appears on fact checking sites. The InVID browser extension [47] is a set of tools for helping people investigate online content, including stepping frame by frame through videos, reverse image searches, viewing photo/video metadata, and other forensic analysis tools.

User education. Another approach Facebook has tried is educating users about how to detect misinformation, and displaying related articles to push users to better educate themselves. In April 2017, Facebook began showing a post titled "Tips for spotting false news" at the top of users' news feeds, which linked to a page about basic principles of online news literacy, like identifying sources, looking for discrepancies in the URL and formatting, and looking at multiple sources [24, 25]. Facebook's related articles feature [27] can also encourage reading articles from multiple sources about controversial topics. Outside of Facebook, researchers have proposed, for example, adding "nutritional labels" for news articles to indicate to users which sources are "healthy" and "unhealthy", like mainstream news versus clickbait [83]. These labels would display "health" metrics like the publication's slant, the tone of the content, the timeliness of the article, and the experience of the author.

Limiting the spread of mis/disinformation. The designs of different social media platforms may encourage the spread of mis/disinformation, e.g., by rewarding attention-grabbing posts. Social media platforms have started making changes to their core products to make it less easy to disseminate mis/disinformation. Twitter is considering removing or redesigning the (heart-shaped) "like" button, out of concern that the incentive to accumulate likes also encourages bad behaviors, which could include spreading sensationalist mis/disinformation [17]. Facebook has made changes to its news feed algorithm to downweight "clickbait" posts that abuse sharing mechanisms to reach a larger audience, such as "share if x" or "tag a friend that x" [28]. This change removes one mechanism that accounts spreading mis/disinformation could use to reach audiences beyond their direct friends and followers.

Challenges. It is difficult to conduct research on UI/UX interventions, especially for academics, because many critical parts of the research require cooperation from the social media platforms, such as deploying interventions at scale and measuring the effects of an intervention in the wild. While Facebook and others have published blog posts about the changes they have made, these posts are not scientifically rigorous and detailed enough to be replicable or generalizable. Social media platforms may also be reluctant to work with researchers or experiment with interventions if these conflict with or do not align with their business interests.

Another challenge is that interventions may depend on the accurate identification of mis/disinformation or untrustworthy sources. Doing so automatically remains a significant challenge (as discussed in Sections 2.3.3 and 2.4.1), so defenses like the nutritional labels and SurfSafe still rely on human raters to identify false information.
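The report does not describe how SurfSafe matches images internally; one common building block for this kind of "where else has this image appeared?" check is a perceptual hash, which changes little under re-compression, resizing, or small edits. Below is a minimal average-hash sketch (our own illustration) using the Pillow imaging library; the file names are placeholders.

    from PIL import Image

    def average_hash(path: str, hash_size: int = 8) -> int:
        """Perceptual hash: shrink, grayscale, then threshold each pixel against the mean."""
        img = Image.open(path).convert("L").resize((hash_size, hash_size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        bits = 0
        for p in pixels:
            bits = (bits << 1) | (1 if p > mean else 0)
        return bits

    def hamming(a: int, b: int) -> int:
        """Number of differing bits between two hashes."""
        return bin(a ^ b).count("1")

    # Hashes that differ in only a few bits likely come from the same picture,
    # even if one copy was re-encoded or slightly edited (placeholder file names).
    h1 = average_hash("original.jpg")
    h2 = average_hash("reposted_copy.jpg")
    print("likely the same image" if hamming(h1, h2) <= 5 else "different images")

An index of such hashes over known news and fact-checking imagery is one plausible way to surface "this photo already appeared elsewhere" signals to users, though robust matching against deliberate manipulation is harder than this sketch suggests.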

3 Discussion

We now step back from our exploration and summary of the mis/disinformation landscape, and the role of technology in that landscape, to consider first the broader lessons we, as a group, took away from this exploration. We then make recommendations for a variety of stakeholders.

3.1 Lessons

Disinformation stems from a variety of motivations and adversarial goals, which may necessitate different solutions. There are many different manifestations of mis/disinformation, and many different actor incentives and intentions [110]. Thus, different solutions may be required for different actors or manifestations. For example, actors with financial incentives may share disinformation to increase click-through profitability, as in the case of Facebook clickbait [28] or fake news enterprises in Macedonia [88]. Identifying this motive enables combating it by demonetizing these behaviors (e.g., limiting the spread of clickbait [28] or erecting barriers in the form of increased manual review [30]). By contrast, state-level actors seeking a political outcome will not be so easily dissuaded; instead, a defensive goal here may be to improve detection and attribution capabilities in order to be able to provide evidence in an international court of law.

Defining the problem is not always straightforward. Given the complexity of this space, the line of what behavior or content is "bad" is not necessarily clear or easy to define. Work attempting to combat the issue, e.g., by automatically detecting fake news, should carefully define what the problem is, e.g., what is meant by "fake news" — and reflect on what these definitions are useful for (and whether they are useful in practice). Moreover, asking platform operators to moderate content is challenging, because in many cases it may be hard to draw a clear line. (There are exceptions, e.g., human rights atrocities [67].)

Solutions cannot and will not just be technical. The motivations, forms, and methods for spreading mis/disinformation are diverse, complicated, and evolving; automated approaches for detecting it seem insufficient. Moreover, mis/disinformation exploits underlying human cognitive vulnerabilities. Thus, we cannot just "throw some machine learning" at the problem but must consider more holistic approaches. That said, some promising technical approaches do exist and should be further researched and implemented. We make some recommendations below.

The disinformation problem is an arms race, and an asymmetrical battle between attackers and defenders. Mis/disinformation is a difficult problem to fight because it is a highly asymmetrical battle. Like insurgents, spreaders of online disinformation have a wide variety of established targets to attack and the freedom to appear, test new strategies, and disappear again with minimal cost and few repercussions. Meanwhile, defenders like fact-checkers have a much more difficult task ahead of them, one that is both hard to define and often contrary to people's cognitive biases. In such a complicated space, it may be unsurprising that we cannot indiscriminately point technology at the problem and make it go away; what is worrying, though, is that it seems to be very easy for adversaries to indiscriminately use technology and make it much worse.

Current adversaries are already sophisticated, and can/will become more so; but they also use simple yet effective techniques. A number of case studies show how sophisticated disinformation campaigns already are in some cases, e.g., creating targeted content for each group of users that resonates with their opinions [4]. Disinformation is becoming more realistic, more personalized, and more widespread — including convincing fake videos and social media personas. At the same time, sometimes the most effective disinformation is not at all sophisticated in terms of technology. For example, fancy deep learning video editing techniques are not required for simple edited photos that effectively spread virally (e.g., a fake photo of an NFL player burning an American flag [61]). However, like spear phishing [34] in the computer security domain, such simple techniques must be taken seriously. Additionally, the simplicity of some of the known techniques — like copying article text [92] — contributes to the impression that this is an arms race, and detecting disinformation now will not mean we can detect disinformation tomorrow.

Incentives of platforms, and even users themselves, are often not aligned with combating mis/disinformation. It is difficult to credibly make recommendations to platform designers who are employed by for-profit companies: these companies (and by extension, the designers) may prioritize profit if it conflicts with combating mis/disinformation. This concern is not just academic: Facebook's and Twitter's user numbers, for example, may directly impact their valuations [12, 81], reducing the incentive to separate legitimate from illegitimate users. Though these platforms are not designed to intentionally support mis/disinformation, often design choices interact badly with human cognitive biases. We discuss this issue further in our recommendations below.

Solutions should be careful about the risk of causing people to be overly skeptical and distrust everything. Sowing distrust and undermining institutions is a key goal of some disinformation campaigns, and solutions must be careful not to "throw the baby out with the bathwater" — that is, users must be taught to and supported in building (legitimate) trust, not just skepticism.

There may be lessons to apply from traditional computer security. For example, lessons from usable security suggest that interrupting users and asking them to make decisions about what is good or bad is not, in and of itself, an effective strategy [113]. As another example, the best technical systems can be undermined by targeted spear phishing attacks [34] that exploit human cognitive vulnerabilities.

3.2 Recommendations

Finally, we surface our recommendations, organized by stakeholder.

3.2.1 Recommendations for Technologists and Other Defenders

Be specific about the goals that a particular technical solution is intended to achieve. If one's goal is to solve a problem, then one should focus on the goal, not the technology used to solve it. A good goal is both useful and possible to achieve. A good example is the work of Starbird et al. [92], which identified a useful task (investigating the ecosystem) and published valuable data and insights as a result. Another possibly instructive comparison is between the Fake News Challenge [44], which pushed forward the state of the art on a specific problem (stance detection), and other, more general attempts at fake news detection with machine learning [18, 73], whose general applicability and ultimate effectiveness may be limited.

Do work on solving technology-related sub-problems. Though technology alone cannot solve the whole problem, as discussed, there are technology-related sub-problems that are useful. Example technology-based directions to investigate (though not an exhaustive list) may include:

• False news detection based on precise definitions of useful, solvable sub-problems; combining these sub-problems into a larger detection pipeline; and leveraging the results of this detection (e.g., to measure ecosystems or effectively surface information to end users).

• "Chain of custody" approaches for information. For example, for photos, see ProofMode [42], TruePic [101], and PhotoProof [68] (a minimal sketch of the underlying signing idea appears at the end of this subsection).

• Platform-based or other UI/UX interventions: both formative research to understand what is (not) effective and how existing designs drive user behaviors, and developing and evaluating new designs.

• Continued work on bot detection and characterization, and on strategies for user verification.

Continue measurement studies to understand the ecosystem. Successful defenses require a deep understanding of the underlying ecosystems creating, spreading, and reacting to mis/disinformation.

Tailor solutions to specific types of users, content, and/or adversaries. Different parts of the space may necessitate different solutions, as discussed above.

Consider what alternate incentive or monetary structures for the web might be, and how to make those a reality. The financial incentives driving platform providers and many adversaries are also core to the functioning of the modern web, e.g., the supporting of free content and services via targeted advertisements. Though challenging, we must consider alternate models that will de-incentivize financially-based mis/disinformation and better align the incentives of platform providers, users, content creators, and mis/disinformation defenders.

Invest in and study user education efforts. Skills like critical thinking and the ability to fact check are good defenses against mis/disinformation in general, but applying them becomes more challenging as technology enables more sophisticated and subtle mis/disinformation. We should explore how to effectively teach users how current technologies could be exploited, and how to spot these issues. For example, this could include:

• Teaching users to spot ads versus organic content.

• Teaching users to be aware of what technology is capable of (e.g., screenshots can be easily fabricated, fake portraits and videos made, accounts hacked).

• Teaching users to look for markers of bot accounts and low-credibility stories and sources.

• Teaching users to spot mis/disinformation via a game.

• Balancing teaching skepticism with teaching how to build up knowledge. The end goal should not be for users to simply distrust everything.

• Bringing discussions of this topic and possible interventions into the "public square", such as via television shows. There may be models for this type of engagement and/or lessons to learn from other attempts, such as the Latvian "Theory of Lies" show on uncovering disinformation [23] or the bilingual "StopFake" show in Ukraine [93].

We also recommend following a scientific process to develop and evaluate educational interventions. Researchers and others should test and evaluate which specific educational interventions actually have a positive effect.

Avoid placing too much burden on users. Though education and surfacing more information to users can be a part of the solution, it can certainly not be the entire solution. Taking lessons from, for example, the usable computer security community (e.g., [113]), solutions that increase the burden on the user and get in the way of a user's primary task (in this case, interacting with their social network and consuming information) may be circumvented, ignored, and/or ultimately ineffective. Thus, user education must be complemented by the other interventions discussed elsewhere in this section.

Consider all users. We need to develop solutions for everyone and not only the technically literate 1%. Fact-checking tools and UI interventions only work if people actually use them. Thus, for example, browser extensions or other solutions that require users to go out of their way will have limited reach. As a result, some burden may be on social media platform designers, news sites and aggregators, and browser vendors themselves to consider what UI/UX interventions may benefit all users. We discuss specific suggestions for (particularly social media) platform designers in the next section. Additionally, different geographic or demographic user groups may interact differently with mis/disinformation and interventions.
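As a small illustration of the "chain of custody" direction listed above, a capture device could sign a hash of a photo at creation time so that any later copy can be checked against the signed original. This is only a sketch of the general idea, not how ProofMode [42], TruePic [101], or PhotoProof [68] actually work; it assumes the third-party Python `cryptography` package and a device key that, in practice, would live in secure hardware.

    import hashlib
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # Generated here for the demonstration; a real device would keep this key
    # in secure hardware and expose only the public half.
    device_key = Ed25519PrivateKey.generate()
    public_key = device_key.public_key()

    photo_bytes = b"...raw image bytes at capture time..."
    signature = device_key.sign(hashlib.sha256(photo_bytes).digest())  # made at capture

    def verify(received_bytes: bytes) -> bool:
        """Check that received bytes match what the device originally signed."""
        try:
            public_key.verify(signature, hashlib.sha256(received_bytes).digest())
            return True
        except InvalidSignature:
            return False

    print(verify(photo_bytes))                  # True: unmodified copy
    print(verify(photo_bytes + b" tampered"))   # False: edited after capture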

3.2.2 Recommendations for Platform Designers

Acknowledge that platform designs are not neutral. Designers should recognize that some design patterns (while useful for user engagement) could have unintended consequences and leave openings for mis/disinformation. Most prominently, patterns for generating revenue (tracking, ad targeting, and so on) can be and are misused by malicious parties. As another example, Twitter's emphasis on a username instead of the unique "@handle" enables attacks in which compromised "verified" accounts can be used to masquerade as other prominent users [46]. Platform designers are in a unique position to directly influence the mis/disinformation landscape. Some concrete recommendations for changes to investigate:

• Platforms should examine their design affordances and study their role in mis/disinformation. For example, consider some of the social signals that their platforms introduce, such as likes, shares, and the selection of comments shown. In their current form, while they show what is popular, these UI features also confer credibility and encourage the spread of falsehoods and propaganda. Some platforms have already begun to make changes to these features, though their impacts are not publicly known.

• Platforms should be more thoughtful about their recommender algorithms, and continue to find ways to downweight content that is extreme and misleading.

• Platforms should continue to provide users with more information and more (genuine) transparency about, for example, the sources of information, the reasons why certain content is targeted at a user, and which content is sponsored versus organic. Prior work suggests that existing efforts to disclose and label advertisements are ineffective (e.g., [64]).

• Consider how identity is handled on the platform. Though there is value in places on the web where things can be shared anonymously, its use — and how illegitimate accounts are identified and handled — should be considered carefully. As a cautionary tale, consider how easily Russian actors impersonated members of the Black Lives Matter movement [4]. One possible mitigation to study is a design where posts from untrusted, anonymous sources are not intermingled with content authored directly by accounts known to a user.

• More generally, platform designers should try to shift the incentives of their organizations to treat combating mis/disinformation as a higher priority than some reports suggest it is today (e.g., [57]), recognizing that this phenomenon, if unaddressed, could ultimately pose an existential threat to these platforms.

Provide ways for users/researchers/others to understand and evaluate what is going on in the platform. One possibility may be teaming up with researchers to experiment with and understand the effects of each feature, both good and bad. Some such efforts already exist, such as Facebook providing data to researchers via SocialScienceOne [91].

3.2.3 Recommendations for Researchers

Make the results of research in this space accessible to and digestible by the general public. We know there is a problem. Research, such as that summarized in Section 2.2, clearly documents the problem. And the results of this research need to continue to move out of academia to the general public, to support broader discussion and awareness.

Think critically about the technologies we develop and how they might be misused. While technologies help enable mis/disinformation, most of them are designed with certain other motivations. For instance, tracking and targeting is primarily for personalization, computer-generated video improves the quality of post-production, and bots can also be benign. We, as researchers, must acknowledge that technological advances have also enabled the creation and spread of mis/disinformation at a massive scale. When developing a system or technology, we must assume bad actors will attempt to exploit or otherwise take advantage of these systems; pretending that people will not misuse technologies, e.g., manipulate videos for propaganda purposes, is naive.

Collaborate across fields to understand and address the problem. It is critical that researchers from multiple (sub)fields, including cognitive science, computer security, behavioral sciences, game theory, human-computer interaction, information science, journalism, law, machine learning, natural language processing, neurology, psychology, sociology, and others work together to understand and address this problem. Otherwise proposed solutions risk being ineffective due to an incomplete understanding of the issues. For example, formulating defenses against mis/disinformation as a supervised binary classification problem may have major limitations in practice — e.g., due to the backfire effect [59, 70], flagging content as fake can make users more likely to believe it, not less. Instead, detection, prevention, and defense against mis/disinformation is an interdisciplinary research problem.

Study different groups of users. Researchers should work to understand how the effects of mis/disinformation, and which offensive/defensive strategies are effective, may differ across different user groups. As with different adversarial motivations, different approaches may be needed for different users. Foundational science is still needed to understand these potential differences and their impacts.

10 Foundational science is still needed to understand these po- Be aware of your own possible cognitive biases and try tential differences and their impacts. to spot when they are being exploited. For example, be cautious if something makes you angry and confirms your 3.2.4 Recommendations for Policymakers existing view about a person or issue. An instructive list of cognitive biases can be found in the references [89]. Consider policies or regulations that might help align the incentives of platform developers to combating the mis/disinformation problem. Making recommendations 4 Conclusion to policymakers was not the goal of our investigation, so we do not intend to present a complete list here. How- The current ecosystem of technology-enabled mis- and dis- ever, as we often came up against the tensions between the information has potentially serious and far-reaching conse- web’s economic model (via advertisements) and the viability quences. As technologists, it is critical that we understand possible platform-based interventions to limit the spread of the potential and actual negative impacts and misuses of the mis/disinformation, we felt this point was important to men- technologies we create; that we deploy defensive technolo- tion. We suggest that policymakers should consider their po- gies at aimed at specific, achievable goals; and that we col- tential role in helping reshape the incentive structures that laborate with other fields to fully understand the problem and lead to this tension — for example, in regulating how users’ solution space. We have compiled this report, the result of an data can be used for targeting or how platforms should re- academic quarter’s investigation into the mis/disinformation spond to known disinformation campaigns. landscape from the perspective of a graduate special topics course in computer science, in the hopes that the interested Ground policy proposals in technical realities. At the reader will benefit from our summaries, lessons, and recom- same time, we caution that policymakers should ground their mendations. Further efforts from researchers, technologists, proposals in a solid understanding of the underlying tech- platform designers, educators, policymakers, and others are nologies. For example, a proposed U.S. Senate bill intend- required to study and combat both intentional disinformation ing to regulate bots by requiring them to self-identify (and and unintentional misinformation. Ultimately this is not a platform providers to enforce it) [35] suffers from an impre- problem that can be eliminated completely — the underlying cise definition of bot [52] and may be technically challeng- human cognitive biases and their exploitation for persuasion, ing or impossible to enforce, sweep up legitimate users or both good and bad, long predate the technologies discussed use cases, fail to positively affect end user behaviors even in this report. But that does not mean that we should throw if enforced, and may thus ultimately burden platforms and up our hands and avoid thinking critically about how to im- users without having the intended benefits. prove the information ecosystems that we use and build.

3.2.5 Recommendations for Users

Additional and more comprehensive recommendations for users can be found online (e.g., [11, 24, 103, 114]). We focus briefly on several high-level recommendations.

Be skeptical of content you see on the web. Content from unknown sources, or that seems suspicious or particularly dramatic or too good or bad to be true, should be taken with a healthy dose of skepticism. Even further consideration should be taken before sharing such content with others. Be aware that even "trusted" sources of information can re-share mis/disinformation; your friends may inadvertently do so, and not all platforms surface information about the original source (e.g., forwarded messages on WhatsApp). Look for corroborating or conflicting evidence from other sources, and be open to changing your mind.

At the same time, don't assume everything is false just because it can be false. Undermining the legitimacy of institutions like journalism is a key goal of some disinformation campaigns. Use strategies to build your trust in content: for example, try to identify the original sources, go directly to news sources you trust, and seek out a diversity of sources.

Be aware of your own possible cognitive biases and try to spot when they are being exploited. For example, be cautious if something makes you angry and confirms your existing view about a person or issue. An instructive list of cognitive biases can be found in the references [89].

4 Conclusion

The current ecosystem of technology-enabled mis- and disinformation has potentially serious and far-reaching consequences. As technologists, it is critical that we understand the potential and actual negative impacts and misuses of the technologies we create; that we deploy defensive technologies aimed at specific, achievable goals; and that we collaborate with other fields to fully understand the problem and solution space. We have compiled this report, the result of an academic quarter's investigation into the mis/disinformation landscape from the perspective of a graduate special topics course in computer science, in the hopes that the interested reader will benefit from our summaries, lessons, and recommendations. Further efforts from researchers, technologists, platform designers, educators, policymakers, and others are required to study and combat both intentional disinformation and unintentional misinformation. Ultimately this is not a problem that can be eliminated completely — the underlying human cognitive biases and their exploitation for persuasion, both good and bad, long predate the technologies discussed in this report. But that does not mean that we should throw up our hands and avoid thinking critically about how to improve the information ecosystems that we use and build.

References

[1] ALLIANCE FOR SECURING DEMOCRACY. Hamilton 68: Tracking Russian Influence Operations on Twitter. http://dashboard.securingdemocracy.org/.
[2] ALLIANCE FOR SECURING DEMOCRACY. How to Interpret the Hamilton 68 Dashboard: Key Points and Clarifications. https://securingdemocracy.gmfus.org/toolbox/how-to-interpret-the-hamilton-68-dashboard-key-points-and-clarifications/.
[3] ANDREOU, A., VENKATADRI, G., GOGA, O., GUMMADI, K. P., LOISEAU, P., AND MISLOVE, A. Investigating Ad Transparency Mechanisms in Social Media: A Case Study of Facebook's Explanations. In Network and Distributed System Security Symposium (NDSS) (2018).
[4] ARIF, A., STEWART, L. G., AND STARBIRD, K. Acting the Part: Examining Information Operations Within #BlackLivesMatter Discourse. Proceedings of the ACM on Human-Computer Interaction (CSCW) 2 (Nov. 2018), 20:1–20:27.
[5] BBC TRENDING. The saga of Pizzagate: The fake story that shows how conspiracy theories spread. BBC News, Dec. 2016. https://www.bbc.com/news/blogs-trending-38156985.

[6] BENSON, B. Cognitive bias cheat sheet, Sept. 2016. https://betterhumans.coach.me/cognitive-bias-cheat-sheet-55a472476b18.
[7] BOGHARDT, T. Soviet Bloc Intelligence and Its AIDS Disinformation Campaign. Studies in Intelligence 53, 4 (2009).
[8] BRAVE. Brave browser. https://brave.com/.
[9] BRONIATOWSKI, D. A., JAMISON, A., QI, S., ALKULAIB, L., CHEN, T., BENTON, A., QUINN, S. C., AND DREDZE, M. Weaponized Health Communication: Twitter Bots and Russian Trolls Amplify the Vaccine Debate. American Journal of Public Health 108, 10 (2018), 1378–1384.
[10] CHESNEY, R., AND CITRON, D. K. Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security. 107 California Law Review (2019, Forthcoming); U of Texas Law, Public Law Research Paper No. 692; U of Maryland Legal Studies Research Paper No. 2018-21 (2018).
[11] COLUMBIA COLLEGE. Real vs. fake news: Detecting lies, hoaxes and clickbait. https://columbiacollege-ca.libguides.com/fake_news/avoiding.
[12] CONFESSORE, N., AND DANCE, G. J. Battling Fake Accounts, Twitter to Slash Millions of Followers, July 2018. https://www.nytimes.com/2018/07/11/technology/twitter-fake-followers.html.
[13] CRESCI, S., PIETRO, R. D., PETROCCHI, M., SPOGNARDI, A., AND TESCONI, M. The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race. In International Conference on World Wide Web (WWW) (2017).
[14] CYBENKO, G., GIANI, A., AND THOMPSON, P. Cognitive hacking: A battle for the mind. IEEE Computer 35 (2002), 50–56.
[15] DERCZYNSKI, L., BONTCHEVA, K., LIAKATA, M., PROCTER, R., HOI, G. W. S., AND ZUBIAGA, A. SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours. In SemEval-2017 (2017).
[16] DIXIT, P., AND MAC, R. How WhatsApp Destroyed A Village. Buzzfeed News, Sept. 2018. https://www.buzzfeednews.com/article/pranavdixit/whatsapp-destroyed-village-lynchings-rainpada-india.
[17] DREYFUSS, E. Jack Dorsey has problems with Twitter, too. Wired, Oct. 2018. https://www.wired.com/story/wired25-jack-dorsey/.
[18] EDELL, A. I trained fake news detection AI with >95% accuracy, and almost went crazy, Jan. 2018. https://towardsdatascience.com/i-trained-fake-news-detection-ai-with-95-accuracy-and-almost-went-crazy-d10589aa57c.
[19] ELECTRONIC FRONTIER FOUNDATION. Privacy Badger. https://www.eff.org/privacybadger.
[20] ELLICK, A. B., AND WESTBROOK, A. Operation InfeKtion: Russian Disinformation from Cold War to Kanye. The New York Times. https://www.nytimes.com/2018/11/12/opinion/russia-meddling-disinformation-fake-news-elections.html.
[21] ENGLEHARDT, S., AND NARAYANAN, A. Online tracking: A 1-million-site measurement and analysis. In ACM Conference on Computer and Communications Security (2016).
[22] ESLAMI, M., KUMARAN, S. R. K., SANDVIG, C., AND KARAHALIOS, K. Communicating algorithmic process in online behavioral advertising. In ACM Conference on Human Factors in Computing Systems (CHI) (2018).
[23] EU VS DISINFO. Debunking in primetime, Jan. 2017. https://euvsdisinfo.eu/debunking-in-primetime/.
[24] FACEBOOK. Tips to Spot False News. https://www.facebook.com/help/188118808357379.
[25] FACEBOOK. A New Educational Tool Against Misinformation, Apr. 2017. https://newsroom.fb.com/news/2017/04/a-new-educational-tool-against-misinformation/.
[26] FACEBOOK. New Test to Provide Context About Articles, Oct. 2017. https://newsroom.fb.com/news/2017/10/news-feed-fyi-new-test-to-provide-context-about-articles/.
[27] FACEBOOK. New Test With Related Articles, Apr. 2017. https://newsroom.fb.com/news/2017/04/news-feed-fyi-new-test-with-related-articles/.
[28] FACEBOOK. New Updates to Reduce Clickbait Headlines, May 2017. https://newsroom.fb.com/news/2017/05/news-feed-fyi-new-updates-to-reduce-clickbait-headlines/.
[29] FACEBOOK. Replacing Disputed Flags With Related Articles, Dec. 2017. https://newsroom.fb.com/news/2017/12/news-feed-fyi-updates-in-our-fight-against-misinformation/.
[30] FACEBOOK. Making Ads and Pages More Transparent, Apr. 2018. https://newsroom.fb.com/news/2018/04/transparent-ads-and-pages/.
[31] FAIZULLABHOY, I., AND KOROLOVA, A. Facebook's advertising platform: New attack vectors and the need for interventions. In Workshop on Consumer Protection (ConPro) (2018).
[32] FAKE NEWS CHALLENGE. Exploring how artificial intelligence technologies could be leveraged to combat fake news. http://www.fakenewschallenge.org/.

[33] FARIS, R. M., ROBERTS, H., ETLING, B., BOURASSA, N., ZUCKERMAN, E., AND BENKLER, Y. Partisanship, Propaganda, and Disinformation: Online Media and the 2016 U.S. Presidential Election. Tech. rep., Berkman Klein Center for Internet & Society Research Paper, 2017.
[34] FAZZINI, K. How the Russians broke into the Democrats' email, and how it could have been avoided, July 2018. https://www.cnbc.com/2018/07/16/how-russians-broke-into-democrats-email-mueller.html.
[35] FEINSTEIN, D. Bot disclosure and accountability act of 2018. S.3127 - 115th Congress, June 2018. https://www.congress.gov/bill/115th-congress/senate-bill/3127.
[36] FERRARA, E., VAROL, O., DAVIS, C. A., MENCZER, F., AND FLAMMINI, A. The rise of social bots. Communications of the ACM 59 (2016), 96–104.
[37] FERREIRA, W., AND VLACHOS, A. Emergent: A novel data-set for stance classification. In NAACL-HLT (2016).
[38] FINN, S., METAXAS, P. T., AND MUSTAFARAJ, E. Investigating Rumor Propagation with TwitterTrails. Tech. rep., arXiv:1411.3550, 2014.
[39] FLAXMAN, S., GOEL, S., AND RAO, J. M. Filter bubbles, echo chambers, and online news consumption. Public Opinion Quarterly 80, S1 (2016), 298–320.
[40] FOURNEY, A., RÁCZ, M. Z., RANADE, G., MOBIUS, M., AND HORVITZ, E. Geographic and Temporal Trends in Fake News Consumption During the 2016 US Presidential Election. In ACM Conference on Information and Knowledge Management (CIKM) (2017).
[41] GHOSTERY. Faster, safer, and smarter browsing. https://www.ghostery.com/.
[42] GUARDIAN PROJECT. Combating "fake news" with a smartphone "proof mode", Feb. 2017. https://guardianproject.info/2017/02/24/combating-fake-news-with-a-smartphone-proof-mode/.
[43] HAIMSON, O. L., AND HOFFMANN, A. L. Constructing and enforcing "authentic" identity online: Facebook, real names, and non-normative identities. First Monday 21 (2016).
[44] HANSELOWSKI, A., AVINESH, P. V. S., SCHILLER, B., CASPELHERR, F., CHAUDHURI, D., MEYER, C. M., AND GUREVYCH, I. A retrospective analysis of the fake news challenge stance-detection task. In COLING (2018).
[45] HATMAKER, T. WhatsApp now marks forwarded messages to curb the spread of deadly misinformation. TechCrunch, July 2018. https://techcrunch.com/2018/07/10/whatsapp-forwarded-messages-india/.
[46] HIGGINS, S. Twitter Promoted a Fake Elon Musk Crypto Giveaway Scam, Oct. 2018. https://www.coindesk.com/twitter-let-a-fake-elon-musk-account-promote-a-crypto-scam.
[47] INVID. InVID Verification Plugin. https://www.invid-project.eu/tools-and-services/invid-verification-plugin/.
[48] IPEROV. DeepFaceLab. https://github.com/iperov/DeepFaceLab.
[49] JACK, C. Lexicon of lies: Terms for problematic information. Data & Society, Aug. 2017.
[50] HILL, K. When a stranger decides to destroy your life, July 2018. https://gizmodo.com/when-a-stranger-decides-to-destroy-your-life-1827546385.
[51] KIM, H., GARRIDO, P., TEWARI, A., XU, W., THIES, J., NIESSNER, M., PÉREZ, P., RICHARDT, C., ZOLLHÖFER, M., AND THEOBALT, C. Deep video portraits. In ACM SIGGRAPH (2018).
[52] LAMO, M. Regulating bots on social media is easier said than done, Aug. 2018. https://slate.com/technology/2018/08/to-regulate-bots-we-have-to-define-them.html.
[53] LAPOWSKY, I. This browser extension is like an antivirus for fake photos. Wired, Aug. 2018. https://www.wired.com/story/surfsafe-browser-extension-save-you-from-fake-photos/.
[54] LAZER, D., BAUM, M. A., BENKLER, Y., BERINSKY, A. J., GREENHILL, K. M., MENCZER, F., METZGER, M. J., NYHAN, B., PENNYCOOK, G., ROTHSCHILD, D., SCHUDSON, M., SLOMAN, S. A., SUNSTEIN, C. R., THORSON, E., WATTS, D. J., AND ZITTRAIN, J. The science of fake news. Science 359 (2018), 1094–1096.
[55] LÉCUYER, M., DUCOFFE, G., LAN, F., PAPANCEA, A., PETSIOS, T., SPAHN, R., CHAINTREAU, A., AND GEAMBASU, R. XRay: Enhancing the web's transparency with differential correlation. In USENIX Security Symposium (2014).
[56] LERNER, A., SIMPSON, A. K., KOHNO, T., AND ROESNER, F. Internet Jones and the Raiders of the Lost Trackers: An Archaeological Study of Web Tracking from 1996 to 2016. In USENIX Security Symposium (2016).
[57] LEVIN, S. 'They don't care': Facebook factchecking in disarray as journalists push to cut ties, Dec. 2018. https://www.theguardian.com/technology/2018/dec/13/they-dont-care-facebook-fact-checking-in-disarray-as-journalists-push-to-cut-ties.
[58] LONG, Y., LU, Q., XIANG, R., LI, M., AND HUANG, C.-R. Fake news detection through multi-perspective speaker profiles. In IJCNLP (2017).
[59] LORD, C. G., ROSS, L., AND LEPPER, M. R. Biased assimilation and attitude polarization: The effects of prior theories on subsequently considered evidence. Journal of Personality and Social Psychology 37 (1979).

[60] MA, J., GAO, W., AND WONG, K.-F. Detect rumors in microblog posts using propagation structure via kernel learning. In ACL (2017).
[61] MALLONEE, L. That flag-burning NFL photo isn't fake news, it's a meme, Oct. 2017. https://www.wired.com/story/that-flag-burning-nfl-photo-isnt-fake-news-its-a-meme/.
[62] MARCHAL, N., NEUDERT, L.-M., KOLLANYI, B., HOWARD, P. N., AND KELLY, J. Polarization, Partisanship and Junk News Consumption on Social Media During the 2018 US Midterm Elections. Tech. Rep. 2018.5, Project on Computational Propaganda, 2018.
[63] MARWICK, A., AND LEWIS, R. Media manipulation and disinformation online. Data & Society, May 2017.
[64] MATHUR, A., NARAYANAN, A., AND CHETTY, M. Endorsements on Social Media: An Empirical Study of Affiliate Marketing Disclosures on YouTube and Pinterest. Proceedings of the ACM on Human-Computer Interaction (CSCW) 2 (Nov. 2018).
[65] MEDIA EDUCATION LAB. Mind over media: Analyzing contemporary propaganda. https://propaganda.mediaeducationlab.com/node/1.
[66] MIHALCEA, R., AND STRAPPARAVA, C. The lie detector: Explorations in the automatic recognition of deceptive language. In ACL-IJCNLP (2009).
[67] MULLIGAN, D. K., AND GRIFFIN, D. S. Rescripting Search to Respect the Right to Truth. 2 GEO. L. TECH. REV. 557 (July 2018).
[68] NAVEH, A., AND TROMER, E. PhotoProof: Cryptographic image authentication for any set of permissible transformations. In IEEE Symposium on Security and Privacy (2016).
[69] NIKIFORAKIS, N., KAPRAVELOS, A., JOOSEN, W., KRÜGEL, C., PIESSENS, F., AND VIGNA, G. Cookieless monster: Exploring the ecosystem of web-based device fingerprinting. IEEE Symposium on Security and Privacy (2013), 541–555.
[70] NYHAN, B., AND REIFLER, J. When corrections fail: The persistence of political misperceptions. Political Behavior 32 (June 2010), 303–330.
[71] OBSERVATORY ON SOCIAL MEDIA. Hoaxy. Indiana University. https://hoaxy.iuni.iu.edu/.
[72] PARISER, E. The Filter Bubble: What the Internet Is Hiding from You. Penguin Press, May 2011.
[73] PÉREZ-ROSAS, V., KLEINBERG, B., LEFEVRE, A., AND MIHALCEA, R. Automatic detection of fake news. In COLING (2018).
[74] PETTY, R. E., DESTENO, D., AND RUCKER, D. D. The role of affect in attitude change. In Handbook of Affect and Social Cognition, J. P. Forgas, Ed. Lawrence Erlbaum Associates Publishers, 2001, pp. 212–233.
[75] POLITIFACT. Fact-checking U.S. politics. https://www.politifact.com/.
[76] POPAT, K., MUKHERJEE, S., YATES, A., AND WEIKUM, G. DeClarE: Debunking fake news and false claims using evidence-aware deep learning. In EMNLP (2018).
[77] POTTHAST, M., KIESEL, J., REINARTZ, K., BEVENDORFF, J., AND STEIN, B. A stylometric inquiry into hyperpartisan and fake news. In ACL (2018).
[78] RASHKIN, H., CHOI, E., JANG, J. Y., VOLKOVA, S., AND CHOI, Y. Truth of varying shades: Analyzing language in fake news and political fact-checking. In EMNLP (2017).
[79] ROESNER, F., KOHNO, T., AND WETHERALL, D. Detecting and defending against third-party tracking on the web. In USENIX Symposium on Networked Systems Design and Implementation (NSDI) (2012).
[80] RUBIN, V. L., CHEN, Y., AND CONROY, N. Deception detection for news: Three types of fakes. In ASIST (2015).
[81] RUSHE, D. Facebook share price slumps below $20 amid fake account flap, Aug. 2012. https://www.theguardian.com/technology/2012/aug/02/facebook-share-price-slumps-20-dollars.
[82] SCHREDER, S. 10 Twitter bots that actually make the internet a better place, Jan. 2018. https://blog.mozilla.org/internetcitizen/2018/01/19/10-twitter-bots-actually-make-internet-better-place/.
[83] SEHAT, C. M. Nutrition labels for the news: a way to support vibrant public discussion, Oct. 2018. https://misinfocon.com/nutrition-labels-for-the-news-a-way-to-support-vibrant-public-discussion-20c64225637c.
[84] SHAO, C., CIAMPAGLIA, G. L., FLAMMINI, A., AND MENCZER, F. Hoaxy: A platform for tracking online misinformation. In Workshop on Social News On the Web (SNOW) (2016).
[85] SHAO, C., CIAMPAGLIA, G. L., VAROL, O., YANG, K., FLAMMINI, A., AND MENCZER, F. The spread of low-credibility content by social bots. In Nature Communications (2018).
[86] SHU, K., SLIVA, A., WANG, S., TANG, J., AND LIU, H. Fake news detection on social media: A data mining perspective. SIGKDD Explorations 19 (2017), 22–36.
[87] SILVERMAN, C. Emergent: A real-time rumor tracker, 2015. http://www.emergent.info/.
[88] SILVERMAN, C., AND ALEXANDER, L. How Teens In The Balkans Are Duping Trump Supporters With Fake News, Nov. 2016. https://www.buzzfeednews.com/article/craigsilverman/how-macedonia-became-a-global-hub-for-pro-trump-misinfo#.tcR3aZa5r.

[89] SMITH, J. 67 ways to increase conversion with cognitive biases. https://www.neurosciencemarketing.com/blog/articles/cognitive-biases-cro.htm#.
[90] SNOPES MEDIA GROUP. Snopes. https://www.snopes.com/.
[91] SOCIAL SCIENCE ONE. Building industry-academic partnerships. https://socialscience.one/.
[92] STARBIRD, K., ARIF, A., WILSON, T., KOEVERING, K. V., YEFIMOVA, K., AND SCARNECCHIA, D. Ecosystem or Echo-System? Exploring Content Sharing across Alternative Media Domains. In International AAAI Conference on Web and Social Media (ICWSM) (2018).
[93] STOPFAKE.ORG. Struggle against fake information about events in Ukraine. https://www.stopfake.org/en/stopfake-209-with-marko-suprun/.
[94] SURFSAFE. SurfSafe: Supercharge your browser to catch fake news. https://www.getsurfsafe.com/.
[95] SUWAJANAKORN, S., SEITZ, S. M., AND KEMELMACHER-SHLIZERMAN, I. Synthesizing Obama: learning lip sync from audio. In ACM SIGGRAPH (2017).
[96] TAMBUSCIO, M., RUFFO, G., FLAMMINI, A., AND MENCZER, F. Fact-checking effect on viral hoaxes: A model of misinformation spread in social networks. In WWW (2015).
[97] THE MEDIA INSIGHT PROJECT. "Who Shared It?": How Americans Decide What News to Trust on Social Media, Mar. 2017. https://www.americanpressinstitute.org/publications/reports/survey-research/trust-social-media/.
[98] THIES, J., ZOLLHÖFER, M., STAMMINGER, M., THEOBALT, C., AND NIESSNER, M. Face2Face: Real-Time Face Capture and Reenactment of RGB Videos. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016).
[99] THORNE, J., VLACHOS, A., CHRISTODOULOPOULOS, C., AND MITTAL, A. FEVER: A large-scale dataset for Fact Extraction and VERification. In NAACL-HLT (2018).
[100] TOR. Tor browser. https://www.torproject.org/projects/torbrowser.html.en.
[101] TRUEPIC. Image authentication for any situation. https://truepic.com/.
[102] TUFEKCI, Z. YouTube, the great radicalizer. The New York Times Opinion, Mar. 2018. https://www.nytimes.com/2018/03/10/opinion/sunday/youtube-politics-radical.html.
[103] UNIVERSITY OF WEST FLORIDA. Fake news. https://libguides.uwf.edu/c.php?g=609513&p=4274530.
[104] VAROL, O., FERRARA, E., DAVIS, C. A., MENCZER, F., AND FLAMMINI, A. Human-bot interactions: Detection, estimation, and characterization. Tech. rep., arXiv:1703.03107, 2017.
[105] VENKATADRI, G., ANDREOU, A., LIU, Y., MISLOVE, A., GUMMADI, K. P., LOISEAU, P., AND GOGA, O. Privacy Risks with Facebook's PII-Based Targeting: Auditing a Data Broker's Advertising Interface. In IEEE Symposium on Security and Privacy (2018).
[106] VINES, P., ROESNER, F., AND KOHNO, T. Exploring ADINT: Using Ad Targeting for Surveillance on a Budget - or - How Alice Can Buy Ads to Track Bob. In Workshop on Privacy in the Electronic Society (WPES) (2017).
[107] VLACHOS, A., AND RIEDEL, S. Fact checking: Task definition and dataset construction. In ACL 2014 Workshop on Language Technologies and Computational Social Science (2014).
[108] WANG, G., MOHANLAL, M., WILSON, C., WANG, X., METZGER, M. J., ZHENG, H., AND ZHAO, B. Y. Social turing tests: Crowdsourcing sybil detection. In Network and Distributed System Security Symposium (NDSS) (2013).
[109] WANG, W. Y. "Liar, liar pants on fire": A new benchmark dataset for fake news detection. In ACL (2017).
[110] WARDLE, C., AND DERAKHSHAN, H. Information disorder: Toward an interdisciplinary framework for research and policymaking. Tech. rep., Council of Europe, Sept. 2017.
[111] WIKIPEDIA. Deepfake. https://en.wikipedia.org/wiki/Deepfake.
[112] YANG, Z., WILSON, C., WANG, X., GAO, T., ZHAO, B. Y., AND DAI, Y. Uncovering social network sybils in the wild. In Internet Measurement Conference (2011).
[113] YEE, K.-P. Aligning security and usability. IEEE Security & Privacy Magazine 2 (2004), 48–55.
[114] ZSR LIBRARY. Fake news: Avoiding fake news. https://guides.zsr.wfu.edu/c.php?g=634067&p=4434417.
[115] ZUBIAGA, A., AKER, A., BONTCHEVA, K., LIAKATA, M., AND PROCTER, R. Detection and resolution of rumours in social media: A survey. ACM Computing Surveys 51 (2018), 32:1–32:36.
