Scalable and Generalizable Social Bot Detection Through Data Selection

Total Page:16

File Type:pdf, Size:1020Kb

Scalable and Generalizable Social Bot Detection Through Data Selection The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20) Scalable and Generalizable Social Bot Detection through Data Selection Kai-Cheng Yang,1 Onur Varol,2 Pik-Mai Hui,1 Filippo Menczer1,3 1Center for Complex Networks and Systems Research, Indiana University, Bloomington, IN, USA 2Center for Complex Networks Research, Northeastern University, Boston, MA, USA 3Indiana University Network Science Institute, Bloomington, IN, USA Abstract is extremely difficult for humans with limited access to so- Efficient and reliable social bot classification is crucial for cial media data to recognize bots, making people vulnerable detecting information manipulation on social media. Despite to manipulation. rapid development, state-of-the-art bot detection models still Many machine learning frameworks for bot detection on face generalization and scalability challenges, which greatly Twitter have been proposed in the past several years (see a limit their applications. In this paper we propose a framework recent survey (Alothali et al. 2018) and the Related Work that uses minimal account metadata, enabling efficient analy- section). However, two major challenges remain: scalability sis that scales up to handle the full stream of public tweets of and generalization. Twitter in real time. To ensure model accuracy, we build a rich collection of labeled datasets for training and validation. We Scalability enables analysis of streaming data with lim- deploy a strict validation system so that model performance ited computing resources. Yet most current methods attempt on unseen datasets is also optimized, in addition to traditional to maximize accuracy by inspecting rich contextual infor- cross-validation. We find that strategically selecting a subset mation about an account’s actions and social connections. of training data yields better model accuracy and generaliza- Twitter API rate limits and algorithmic complexity make it tion than exhaustively training on all available data. Thanks impossible to scale up to real-time detection of manipulation to the simplicity of the proposed model, its logic can be inter- campaigns (Stella, Ferrara, and De Domenico 2018). preted to provide insights into social bot characteristics. Generalization enables the detection of bots that are dif- ferent from those in the training data, which is critical given Introduction the adversarial nature of the problem — new bots are eas- Social bots are social media accounts controlled in part by ily designed to evade existing detection systems. Methods software. Malicious bots serve as important instruments for considering limited behaviors, like retweeting and temporal orchestrated, large-scale opinion manipulation campaigns patterns, can only identify certain types of bots. Even more on social media (Ferrara et al. 2016a; Subrahmanian et al. general-purpose systems are trained on labeled datasets that 2016). Bots have been actively involved in online discus- are sparse, noisy, and not representative enough. As a result, sions of important events, including recent elections in the methods with excellent cross-validation accuracy suffer dra- US and Europe (Bessi and Ferrara 2016; Deb et al. 2019; matic drops in performance when tested on cross-domain Stella, Ferrara, and De Domenico 2018; Ferrara 2017). Bots accounts (De Cristofaro et al. 2018). are also responsible for spreading low-credibility informa- In this paper, we propose a framework to address these tion (Shao et al. 2018) and extreme ideology (Berger and two challenges. By focusing just on user profile information, Morgan 2015; Ferrara et al. 2016b), as well as adding con- which can be easily accessed in bulk, scalability is greatly fusion to the online debate about vaccines (Broniatowski et improved. While the use of fewer features may entail a small al. 2018). compromise in individual accuracy, we gain the ability to Scalable and reliable bot detection methods are needed to analyze a large-volume stream of accounts in real time. estimate the influence of social bots and develop effective We build a rich collection of labeled data by compil- countermeasures. The task is challenging due to the diverse ing all labeled datasets available in the literature and three and dynamic behaviors of social bots. For example, some newly-produced ones. We systematically analyze the fea- bots act autonomously with minimal human intervention, ture space of different datasets and the relationship between others are manually controlled so that a single entity can cre- them; some of the datasets tend to overlap with each other ate the appearance of multiple human accounts (Grimme et while others have contradicting patterns. al. 2017). And while some bots are active continuously, oth- The diversity of the compiled datasets allows us to build ers focus bursts of activity on different short-term targets. It a more robust classifier. Surprisingly, by carefully select- Copyright c 2020, Association for the Advancement of Artificial ing a subset of the training data instead of merging all Intelligence (www.aaai.org). All rights reserved. of the datasets, we achieve better performance in terms of 1096 cross-validation, cross-domain generalization, and consis- that bounds Botometer. Moreover, each tweet collected from tency with a widely-used reference system. We also show Twitter has an embedded user object. This brings two extra that the resulting bot detector is more interpretable. The advantages. First, once tweets are collected, no extra queries lessons learned from the proposed data selection approach are needed for bot detection. Second, while users lookups can be generalized to other adversarial problems. always report the most recent user profile, the user object embedded in each tweet reflects the user profile at the mo- Related Work ment when the tweet is collected. This makes bot detection on archived historical data possible. Most bot detection methods are based on supervised ma- Table 1 lists the features extracted from the user object. chine learning and involve manually labeled data (Ferrara The rate features build upon the user age, which requires the et al. 2016a; Alothali et al. 2018). Popular approaches probe time to be available. When querying the users lookup leverage user, temporal, content, and social network fea- API, the probe time is when the query happens. If the user tures with random forest classifiers (Davis et al. 2016; object is extracted from a tweet, the probe time is the tweet Varol et al. 2017; Gilani, Kochmar, and Crowcroft 2017; creation time (created at field). The user age is defined as Yang et al. 2019). Faster classification can be obtained us- the hour difference between the probe time and the creation ing fewer features and logistic regression (Ferrara 2017; time of the user (created at field in the user object). User Stella, Ferrara, and De Domenico 2018). Some methods ages are associated with the data collection time, an artifact include information from tweet content in addition to the irrelevant to bot behaviors. In fact, tests show that includ- metadata to detect bots at the tweet level (Kudugunta and ing age in the model deteriorates accuracy. However, age is Ferrara 2018). A simpler approach is to test the randomness used to calculate the rate features. Every count feature has a of the screen name (Beskow and Carley 2019). The method corresponding rate feature to capture how fast the account is presented here also adopts the supervised learning frame- tweeting, gaining followers, and so on. In the calculation of work with random forest classifiers, attempting to achieve the ratio between followers and friends, the denominator is both the scalability of the faster methods and the accuracy max(friends count, 1) to avoid division-by-zero errors. of the feature-rich methods. We aim for greater generaliza- The screen name likelihood feature is inspired by the ob- tion than all the models in the literature. servation that bots sometimes have a random string as screen Another strain of bot detection methods goes beyond in- name (Beskow and Carley 2019). Twitter only allows let- dividual accounts to consider collective behaviors. Unsuper- ters (upper and lower case), digits, and underscores in the vised learning methods are used to find improbable sim- screen name field, with a 15-character limit. We collected ilarities among accounts. No human labeled datasets are over 2M unique screen names and constructed the likeli- needed. Examples of this approach leverage similarity in hood of all 3,969 possible bigrams. The likelihood of a post timelines (Chavoshi, Hamooni, and Mueen 2016), ac- screen name is defined by the geometric-mean likelihood tion sequences (Cresci et al. 2016; 2018a), content (Chen of all bigrams in it. We do not consider longer n-grams as and Subramanian 2018), and friends/followers (Jiang et al. they require more resources with limited advantages. Tests 2016). Coordinated retweeting behavior has also been used show the likelihood feature can effectively distinguish ran- (Mazza et al. 2019). Methods aimed at detecting coordina- dom strings from authentic screen names. tion are very slow as they need to consider many pairs of accounts. The present approach is much faster, but consid- Datasets ers individual accounts and so cannot detect coordination. To train and test our model, we collect all public datasets of labeled human and bot accounts and create three new Feature Engineering ones, all available in the bot repository (botometer.org/ All Twitter bot detection methods need to query data be- bot-repository). See overview in Table 2. fore performing any evaluation, so they are bounded by API Let us briefly summarize how the datasets were col- limits. Take Botometer, a popular bot detection tool, as an lected. The caverlee dataset consists of bot accounts example. The classifier uses over 1,000 features from each lured by honeypot accounts and verified human accounts account (Varol et al. 2017; Yang et al. 2019). To extract these (Lee, Eoff, and Caverlee 2011). This dataset is older than the features, the classifier requires the account’s most recent 200 others.
Recommended publications
  • Arxiv:1805.10105V1 [Cs.SI] 25 May 2018
    Effects of Social Bots in the Iran-Debate on Twitter Andree Thieltges, Orestis Papakyriakopoulos, Juan Carlos Medina Serrano, Simon Hegelich Bavarian School of Public Policy Technical University of Munich Abstract Ahmadinejad caused nationwide unrests and protests. As these protests grew, the Iranian government shut down for- 2018 started with massive protests in Iran, bringing back the eign news coverage and restricted the use of cell phones, impressions of the so called “Arab Spring” and it’s revolution- text-messaging and internet access. Nevertheless, Twitter ary impact for the Maghreb states, Syria and Egypt. Many reports and scientific examinations considered online social and Facebook “became vital tools to relay news and infor- mation on anti-government protest to the people inside and networks (OSN’s) such as Twitter or Facebook to play a criti- 1 cal role in the opinion making of people behind those protests. outside Iran” . While Ayatollah Khameini characterized the Beside that, there is also evidence for directed manipulation influence of Twitter as “deviant” and inappropriate on Ira- of opinion with the help of social bots and fake accounts. nian domestic affairs2, most of the foreign news coverage So, it is obvious to ask, if there is an attempt to manipulate hailed Twitter to be “a new and influential medium for social the opinion-making process related to the Iranian protest in movements and international politics” (Burns and Eltham OSN by employing social bots, and how such manipulations 2009). Beside that, there was already a critical view upon will affect the discourse as a whole. Based on a sample of the influence of OSN’s as “tools” to express political opin- ca.
    [Show full text]
  • Political Astroturfing Across the World
    Political astroturfing across the world Franziska B. Keller∗ David Schochy Sebastian Stierz JungHwan Yang§ Paper prepared for the Harvard Disinformation Workshop Update 1 Introduction At the very latest since the Russian Internet Research Agency’s (IRA) intervention in the U.S. presidential election, scholars and the broader public have become wary of coordi- nated disinformation campaigns. These hidden activities aim to deteriorate the public’s trust in electoral institutions or the government’s legitimacy, and can exacerbate political polarization. But unfortunately, academic and public debates on the topic are haunted by conceptual ambiguities, and rely on few memorable examples, epitomized by the often cited “social bots” who are accused of having tried to influence public opinion in various contemporary political events. In this paper, we examine “political astroturfing,” a particular form of centrally co- ordinated disinformation campaigns in which participants pretend to be ordinary citizens acting independently (Kovic, Rauchfleisch, Sele, & Caspar, 2018). While the accounts as- sociated with a disinformation campaign may not necessarily spread incorrect information, they deceive the audience about the identity and motivation of the person manning the ac- count. And even though social bots have been in the focus of most academic research (Fer- rara, Varol, Davis, Menczer, & Flammini, 2016; Stella, Ferrara, & De Domenico, 2018), seemingly automated accounts make up only a small part – if any – of most astroturf- ing campaigns. For instigators of astroturfing, relying exclusively on social bots is not a promising strategy, as humans are good at detecting low-quality information (Pennycook & Rand, 2019). On the other hand, many bots detected by state-of-the-art social bot de- tection methods are not part of an astroturfing campaign, but unrelated non-political spam ∗Hong Kong University of Science and Technology yThe University of Manchester zGESIS – Leibniz Institute for the Social Sciences §University of Illinois at Urbana-Champaign 1 bots.
    [Show full text]
  • Understanding Users' Perspectives of News Bots in the Age of Social Media
    sustainability Article Utilizing Bots for Sustainable News Business: Understanding Users’ Perspectives of News Bots in the Age of Social Media Hyehyun Hong 1 and Hyun Jee Oh 2,* 1 Department of Advertising and Public Relations, Chung-Ang University, Seoul 06974, Korea; [email protected] 2 Department of Communication Studies, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong SAR, China * Correspondence: [email protected] Received: 15 July 2020; Accepted: 10 August 2020; Published: 12 August 2020 Abstract: The move of news audiences to social media has presented a major challenge for news organizations. How to adapt and adjust to this social media environment is an important issue for sustainable news business. News bots are one of the key technologies offered in the current media environment and are widely applied in news production, dissemination, and interaction with audiences. While benefits and concerns coexist about the application of bots in news organizations, the current study aimed to examine how social media users perceive news bots, the factors that affect their acceptance of bots in news organizations, and how this is related to their evaluation of social media news in general. An analysis of the US national survey dataset showed that self-efficacy (confidence in identifying content from a bot) was a successful predictor of news bot acceptance, which in turn resulted in a positive evaluation of social media news in general. In addition, an individual’s perceived prevalence of social media news from bots had an indirect effect on acceptance by increasing self-efficacy. The results are discussed with the aim of providing a better understanding of news audiences in the social media environment, and practical implications for the sustainable news business are suggested.
    [Show full text]
  • Supplementary Materials For
    www.sciencemag.org/content/359/6380/1094/suppl/DC1 Supplementary Materials for The science of fake news David M. J. Lazer,*† Matthew A. Baum,* Yochai Benkler, Adam J. Berinsky, Kelly M. Greenhill, Filippo Menczer, Miriam J. Metzger, Brendan Nyhan, Gordon Pennycook, David Rothschild, Michael Schudson, Steven A. Sloman, Cass R. Sunstein, Emily A. Thorson, Duncan J. Watts, Jonathan L. Zittrain *These authors contributed equally to this work. †Corresponding author. Email: [email protected] Published 9 March 2018, Science 359, 1094 (2018) DOI: 10.1126/science.aao2998 This PDF file includes: List of Author Affiliations Supporting Materials References List of Author Affiliations David M. J. Lazer,1,2*† Matthew A. Baum,3* Yochai Benkler,4,5 Adam J. Berinsky,6 Kelly M. Greenhill,7,3 Filippo Menczer,8 Miriam J. Metzger,9 Brendan Nyhan,10 Gordon Pennycook,11 David Rothschild,12 Michael Schudson,13 Steven A. Sloman,14 Cass R. Sunstein,4 Emily A. Thorson,15 Duncan J. Watts,12 Jonathan L. Zittrain4,5 1Network Science Institute, Northeastern University, Boston, MA 02115, USA. 2Institute for Quantitative Social Science, Harvard University, Cambridge, MA 02138, USA 3John F. Kennedy School of Government, Harvard University, Cambridge, MA 02138, USA. 4Harvard Law School, Harvard University, Cambridge, MA 02138, USA. 5 Berkman Klein Center for Internet and Society, Cambridge, MA 02138, USA. 6Department of Political Science, Massachussets Institute of Technology, Cambridge, MA 02139, USA. 7Department of Political Science, Tufts University, Medford, MA 02155, USA. 8School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47405, USA. 9Department of Communication, University of California, Santa Barbara, Santa Barbara, CA 93106, USA.
    [Show full text]
  • Frame Viral Tweets.Pdf
    BIG IDEAS! Challenging Public Relations Research and Practice Viral Tweets, Fake News and Social Bots in Post-Factual Politics: The Case of the French Presidential Elections 2017 Alex Frame, Gilles Brachotte, Eric Leclercq, Marinette Savonnet [email protected] 20th Euprera Congress, Aarhus, 27-29th September 2018 Post-Factual Politics in the Age of Social Media 1. Algorithms based on popularity rather than veracity, linked to a business model based on views and likes, where novelty and sensationalism are of the essence; 2. Social trends of “whistle-blowing” linked to conspiracy theories and a pejorative image of corporate or institutional communication, which cast doubt on the neutrality of traditionally so-called expert figures, such as independent bodies or academics; 3. Algorithm-fuelled social dynamics on social networking sites which structure publics by similarity, leading to a fragmented digital public sphere where like-minded individuals congregate digitally, providing an “echo-chamber” effect susceptible to encourage even the most extreme political views and the “truths” upon which they are based. Fake News at the Presidential Elections Fake News at the Presidential Elections Fake News at the Presidential Elections Political Social Bots on Twitter Increasingly widespread around the world, over at least the last decade. Kremlin bot army (Lawrence Alexander on GlobalVoices.org; Stukal et al., 2017). DFRLab (Atlantic Council) Social Bots A “social bot” is: “a computer algorithm that automatically produces content and
    [Show full text]
  • Empirical Comparative Analysis of Bot Evidence in Social Networks
    Bots in Nets: Empirical Comparative Analysis of Bot Evidence in Social Networks Ross Schuchard1(&) , Andrew Crooks1,2 , Anthony Stefanidis2,3 , and Arie Croitoru2 1 Computational Social Science Program, Department of Computational and Data Sciences, George Mason University, Fairfax, VA 22030, USA [email protected] 2 Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA 22030, USA 3 Criminal Investigations and Network Analysis Center, George Mason University, Fairfax, VA 22030, USA Abstract. The emergence of social bots within online social networks (OSNs) to diffuse information at scale has given rise to many efforts to detect them. While methodologies employed to detect the evolving sophistication of bots continue to improve, much work can be done to characterize the impact of bots on communication networks. In this study, we present a framework to describe the pervasiveness and relative importance of participants recognized as bots in various OSN conversations. Specifically, we harvested over 30 million tweets from three major global events in 2016 (the U.S. Presidential Election, the Ukrainian Conflict and Turkish Political Censorship) and compared the con- versational patterns of bots and humans within each event. We further examined the social network structure of each conversation to determine if bots exhibited any particular network influence, while also determining bot participation in key emergent network communities. The results showed that although participants recognized as social bots comprised only 0.28% of all OSN users in this study, they accounted for a significantly large portion of prominent centrality rankings across the three conversations. This includes the identification of individual bots as top-10 influencer nodes out of a total corpus consisting of more than 2.8 million nodes.
    [Show full text]
  • Disinformation, and Influence Campaigns on Twitter 'Fake News'
    Disinformation, ‘Fake News’ and Influence Campaigns on Twitter OCTOBER 2018 Matthew Hindman Vlad Barash George Washington University Graphika Contents Executive Summary . 3 Introduction . 7 A Problem Both Old and New . 9 Defining Fake News Outlets . 13 Bots, Trolls and ‘Cyborgs’ on Twitter . 16 Map Methodology . 19 Election Data and Maps . 22 Election Core Map Election Periphery Map Postelection Map Fake Accounts From Russia’s Most Prominent Troll Farm . 33 Disinformation Campaigns on Twitter: Chronotopes . 34 #NoDAPL #WikiLeaks #SpiritCooking #SyriaHoax #SethRich Conclusion . 43 Bibliography . 45 Notes . 55 2 EXECUTIVE SUMMARY This study is one of the largest analyses to date on how fake news spread on Twitter both during and after the 2016 election campaign. Using tools and mapping methods from Graphika, a social media intelligence firm, we study more than 10 million tweets from 700,000 Twitter accounts that linked to more than 600 fake and conspiracy news outlets. Crucially, we study fake and con- spiracy news both before and after the election, allowing us to measure how the fake news ecosystem has evolved since November 2016. Much fake news and disinformation is still being spread on Twitter. Consistent with other research, we find more than 6.6 million tweets linking to fake and conspiracy news publishers in the month before the 2016 election. Yet disinformation continues to be a substantial problem postelection, with 4.0 million tweets linking to fake and conspiracy news publishers found in a 30-day period from mid-March to mid-April 2017. Contrary to claims that fake news is a game of “whack-a-mole,” more than 80 percent of the disinformation accounts in our election maps are still active as this report goes to press.
    [Show full text]
  • Social Bots and Social Media Manipulation in 2020: the Year in Review
    Social Bots and Social Media Manipulation in 2020: The Year in Review Ho-Chun Herbert Chang∗a, Emily Chen∗b,c, Meiqing Zhang∗a, Goran Muric∗b, and Emilio Ferraraa,b,c aUSC Annenberg School of Communication bUSC Information Sciences Institute cUSC Department of Computer Science *These authors contributed equally to this work Abstract The year 2020 will be remembered for two events of global significance: the COVID-19 pandemic and 2020 U.S. Presidential Election. In this chapter, we summarize recent studies using large public Twitter data sets on these issues. We have three primary ob- jectives. First, we delineate epistemological and practical considerations when combin- ing the traditions of computational research and social science research. A sensible bal- ance should be struck when the stakes are high between advancing social theory and concrete, timely reporting of ongoing events. We additionally comment on the compu- tational challenges of gleaning insight from large amounts of social media data. Second, we characterize the role of social bots in social media manipulation around the discourse on the COVID-19 pandemic and 2020 U.S. Presidential Election. Third, we compare re- sults from 2020 to prior years to note that, although bot accounts still contribute to the emergence of echo-chambers, there is a transition from state-sponsored campaigns to do- mestically emergent sources of distortion. Furthermore, issues of public health can be confounded by political orientation, especially from localized communities of actors who spread misinformation. We conclude that automation and social media manipulation arXiv:2102.08436v1 [cs.SI] 16 Feb 2021 pose issues to a healthy and democratic discourse, precisely because they distort repre- sentation of pluralism within the public sphere.
    [Show full text]
  • The Rise of Social Botnets: Attacks and Countermeasures
    1 The Rise of Social Botnets: Attacks and Countermeasures Jinxue Zhang, Student Member, IEEE, Rui Zhang, Member, IEEE, Yanchao Zhang, Senior Member, IEEE, and Guanhua Yan Abstract—Online social networks (OSNs) are increasingly threatened by social bots which are software-controlled OSN accounts that mimic human users with malicious intentions. A social botnet refers to a group of social bots under the control of a single botmaster, which collaborate to conduct malicious behavior while mimicking the interactions among normal OSN users to reduce their individual risk of being detected. We demonstrate the effectiveness and advantages of exploiting a social botnet for spam distribution and digital-influence manipulation through real experiments on Twitter and also trace-driven simulations. We also propose the corresponding countermeasures and evaluate their effectiveness. Our results can help understand the potentially detrimental effects of social botnets and help OSNs improve their bot(net) detection systems. Index Terms—Social bot, social botnet, spam distribution, digital influence F 1 INTRODUCTION NLINE social networks (OSNs) are increasingly threat- two new social botnet attacks on Twitter, one of the most O ened by social bots [2] which are software-controlled OSN popular OSNs with over 302M monthly active users as of June accounts that mimic human users with malicious intentions. 2015 and over 500M new tweets daily. Then we propose two For example, according to a May 2012 article in Bloomberg defenses on the reported attacks, respectively. Our results help Businessweek,1 as many as 40% of the accounts on Facebook, understand the potentially detrimental effects of social botnets Twitter, and other popular OSNs are spammer accounts (or and shed the light for Twitter and other OSNs to improve their social bots), and about 8% of the messages sent via social bot(net) detection systems.
    [Show full text]
  • Supervised Machine Learning Bot Detection Techniques to Identify
    SMU Data Science Review Volume 1 | Number 2 Article 5 2018 Supervised Machine Learning Bot Detection Techniques to Identify Social Twitter Bots Phillip George Efthimion Southern Methodist University, [email protected] Scott aP yne Southern Methodist University, [email protected] Nicholas Proferes University of Kentucky, [email protected] Follow this and additional works at: https://scholar.smu.edu/datasciencereview Part of the Theory and Algorithms Commons Recommended Citation Efthimion, hiP llip George; Payne, Scott; and Proferes, Nicholas (2018) "Supervised Machine Learning Bot Detection Techniques to Identify Social Twitter Bots," SMU Data Science Review: Vol. 1 : No. 2 , Article 5. Available at: https://scholar.smu.edu/datasciencereview/vol1/iss2/5 This Article is brought to you for free and open access by SMU Scholar. It has been accepted for inclusion in SMU Data Science Review by an authorized administrator of SMU Scholar. For more information, please visit http://digitalrepository.smu.edu. Efthimion et al.: Supervised Machine Learning Bot Detection Techniques to Identify Social Twitter Bots Supervised Machine Learning Bot Detection Techniques to Identify Social Twitter Bots Phillip G. Efthimion1, Scott Payne1, Nick Proferes2 1Master of Science in Data Science, Southern Methodist University 6425 Boaz Lane, Dallas, TX 75205 {pefthimion, mspayne}@smu.edu [email protected] Abstract. In this paper, we present novel bot detection algorithms to identify Twitter bot accounts and to determine their prevalence in current online discourse. On social media, bots are ubiquitous. Bot accounts are problematic because they can manipulate information, spread misinformation, and promote unverified information, which can adversely affect public opinion on various topics, such as product sales and political campaigns.
    [Show full text]
  • You Must Be a Bot! How Beliefs Shape Twitter Profile Perceptions
    Disagree? You Must be a Bot! How Beliefs Shape Twitter Profile Perceptions MAGDALENA WISCHNEWSKI, University of Duisburg-Essen, Germany REBECCA BERNEMANN, University of Duisburg-Essen, Germany THAO NGO, University of Duisburg-Essen, Germany NICOLE KRÄMER, University of Duisburg-Essen, Germany In this paper, we investigate the human ability to distinguish political social bots from humans on Twitter. Following motivated reasoning theory from social and cognitive psychology, our central hypothesis is that especially those accounts which are opinion- incongruent are perceived as social bot accounts when the account is ambiguous about its nature. We also hypothesize that credibility ratings mediate this relationship. We asked N = 151 participants to evaluate 24 Twitter accounts and decide whether the accounts were humans or social bots. Findings support our motivated reasoning hypothesis for a sub-group of Twitter users (those who are more familiar with Twitter): Accounts that are opinion-incongruent are evaluated as relatively more bot-like than accounts that are opinion-congruent. Moreover, it does not matter whether the account is clearly social bot or human or ambiguous about its nature. This was mediated by perceived credibility in the sense that congruent profiles were evaluated to be more credible resulting in lower perceptions as bots. CCS Concepts: • Human-centered computing ! Empirical studies in HCI. Additional Key Words and Phrases: motivated reasoning, social bots, Twitter, credibility, partisanship, bias ACM Reference Format: Magdalena Wischnewski, Rebecca Bernemann, Thao Ngo, and Nicole Krämer. 2021. Disagree? You Must be a Bot! How Beliefs Shape Twitter Profile Perceptions. In CHI Conference on Human Factors in Computing Systems (CHI ’21), May 8–13, 2021, Yokohama, Japan.
    [Show full text]
  • Introduction: What Is Computational Propaganda?
    !1 Introduction: What Is Computational Propaganda? Digital technologies hold great promise for democracy. Social media tools and the wider resources of the Internet o"er tremendous access to data, knowledge, social networks, and collective engagement opportunities, and can help us to build better democracies (Howard, 2015; Margetts et al., 2015). Unwelcome obstacles are, however, disrupting the creative democratic applications of information technologies (Woolley, 2016; Gallacher et al., 2017; Vosoughi, Roy, & Aral, 2018). Massive social platforms like Facebook and Twitter are struggling to come to grips with the ways their creations can be used for political control. Social media algorithms may be creating echo chambers in which public conversations get polluted and polarized. Surveillance capabilities are outstripping civil protections. Political “bots” (software agents used to generate simple messages and “conversations” on social media) are masquerading as genuine grassroots movements to manipulate public opinion. Online hate speech is gaining currency. Malicious actors and digital marketers run junk news factories that disseminate misinformation to harm opponents or earn click-through advertising revenue.# It is no exaggeration to say that coordinated e"orts are even now working to seed chaos in many political systems worldwide. Some militaries and intelligence agencies are making use of social media as conduits to undermine democratic processes and bring down democratic institutions altogether (Bradshaw & Howard, 2017). Most democratic governments are preparing their legal and regulatory responses. But unintended consequences from over- regulation, or regulation uninformed by systematic research, may be as damaging to democratic systems as the threats themselves.# We live in a time of extraordinary political upheaval and change, with political movements and parties rising and declining rapidly (Kreiss, 2016; Anstead, 2017).
    [Show full text]