Bots and Cyborgs: Wikipedia's Immune System
Total Page:16
File Type:pdf, Size:1020Kb
SOCIAL COMPUTING Bots and Cyborgs: Wikipedia’s Immune System Aaron Halfaker and John Riedl University of Minnesota Bots and cyborgs are more than tools to better manage content quality on Wikipedia—through their interaction with humans, they’re fundamentally changing its culture. h e n W i k i p e d i a simply couldn’t keep up with the damaging edits that humans alone was young and the volume of incoming edits. couldn’t handle. number of active Community members responded W contributors num- to this problem by developing two Early tools bered in the 10s or 100s, volunteer basic types of computational tools: The first tools to redefine the editors could directly manage its robots, or bots, and cyborgs. Bots way Wikipedia dealt with van- content and processes. Some editors automatically handle repetitive dalism were AntiVandalBot and who have been around the longest will tasks—for example, SpellCheckerBot VandalProof. fondly recall the halcyon days when fixes spelling errors. The sidebar “A AntiVandalBot used a simple set the encyclopedia evolved slowly, and Taxonomy of Wikipedia Bots” provides of rules and heuristics to monitor one person could manually track an overview of the many activities changes made to articles, identify the all of a day’s changes in a matter of bots perform on the site. Cyborgs are most obvious cases of vandalism, and minutes. intelligent user interfaces that humans automatically revert them. Although Those days ended in 2004 when “put on” like a virtual Ironman suit this bot made it possible, for the first Wikipedia began to experience to combine computational power time, for the Wikipedia community exponential growth in new with human reasoning. Huggle, for to protect the encyclopedia from contributors, new articles, and instance, helps users zap vandals’ damage without wasting the time popular media attention. By the edits by the thousands. and energy of good-faith editors, it time growth peaked in 2007, the Together, bots and cyborgs function wasn’t very intelligent and could only encyclopedia was receiving more than as first-line defenses in Wikipedia’s correct the most egregious instances 180 edits per minute. emerging “immune system.” of vandalism. On one hand, Wikipedia was VandalProof, an early cyborg enjoying a rich flow of its lifeblood: COMBATING VANDALISM technology, was a graphical user volunteer contributions. On the other Bots and cyborgs arguably have interface written in Visual Basic that hand, suddenly no human could been most effective at filtering let trusted editors monitor article review all of the changes. Experienced vandalism on Wikipedia. Along edits as fast as they happened in editors spent all of their time watching with the 2004-2007 exponential Wikipedia and revert unwanted for copyright violations, potentially growth in good contributions contributions in one click. libelous articles, and vandalism; they came a torrent of bad content and VandalProof served as a natural 0018-9162/12/$31.00 © 2012 IEEE Published by the IEEE Computer Society MARCH 2012 79 SOCIAL COMPUTING A TAXONOMY OF WIKIPEDIA BOTS shows, these tools have had an increasing role in maintaining article quality on Wikipedia. obots perform a wide range of activities on Wikipedia. These include injecting public R domain data, monitoring and curating content, augmenting the MediaWiki software, ClueBot_NG has replaced and protecting the encyclopedia from malicious activity. AntiVandalBot’s simple rules and The first bots injected data into Wikipedia content from public databases. Rambot, widely heuristics with a highly accurate accepted to be the encyclopedia’s first sanctioned robot, inserted census data into articles neural network/machine learning about countries and cities. Rambot and its cousins act as “force multipliers” by performing approach: editors submit examples repetitive activities hundreds or thousands of times in minutes. of mistakes that ClueBot_NG makes, Many other bots monitor and curate Wikipedia content. For example, SpellCheckerBot and the cyborg’s developers retrain checks recent changes for common spelling mistakes using an international dictionary to its classifier periodically. Similarly, prevent accidental “fixes” to correctly spelled foreign words. Similarly, Helpful Pixie Bot cor- Huggle has replaced VandalProof with rects ISBNs and other structural features of articles such as section capitalization. The largest a slick user interface, configurablity, class of content curators is interlanguage bots, which use graph models of links between dif- and an intelligent system for sorting ferent languages of Wikipedia to identify missing links between articles covering the same edits by vandalistic likelihood to topic in a different language. As of this writing, the English Wikipedia has more than 60 active maximize the efficiency of human interlanguage bots. effort in dealing with those instances Some bots extend Wikipedia functionality by implementing features that the MediaWiki of vandalism that ClueBot_NG doesn’t software doesn’t support. For example, AIV Helperbot turns a simple page into a dynamic catch. priority-based discussion queue to support administrators in their work of identifying and blocking vandals. Similarly, SineBot ensures that every posted comment is signed and dated. Distributed cognition Finally, a series of bots protects the encyclopedia from malicious activity. For example, Today, the combined efforts of ClueBot_NG uses state-of-the-art machine learning techniques to review all contributions to ClueBot_NG and a small group of articles and to revert vandalism, while XLinkBot reverts contributions that create links to Huggle-human cyborgs identify and blacklisted domains as a way of quickly and permanently dealing with spammers. immediately destroy the majority of vandalism before any human editor or reader ever sees it. However, the supplement to AntiVandalBot and Current tools speed and efficiency with which its successors by leaving the obvious Several years and many iterations these tools deal with individual vandalism to bots’ simple decision- later, bots and cyborgs have become instances of vandalism is only one making algorithms and letting more powerful, accurate, and user- aspect of Wikipedia’s antivandal human editors handle the rest. friendly. Consequently, as Figure 1 strategy. R. Stuart Geiger and David Ribes observed that bots and cyborgs form 1.0 a “distributed cognition system” that has reshaped Wikipedia’s process for 0.8 identifying and removing vandals (“The Work of Sustaining Order in Wikipedia: The Banning of a Vandal,” tion 0.6 Proc. 2010 ACM Conf. Computer opor Bots Supported Cooperative Work [CSCW Cyborgs ed pr 10], ACM, 2010, pp. 117-126). 0.4 Humans Stack Although bots and cyborg- editors work independently, they “operationalize each offending edit 0.2 into a social structure through which administrators and editors come to 0 know users as vandals.” In other words, Wikipedia’s bots and cyborgs 2002 2004 2006 2008 2010 Year automatically build a record of new editors’ activities to quickly and Figure 1. Stacked proportion of reverts (rejections of damaging edits) for bots, cyborgs, efficiently identify vandals and block and human editors on Wikipedia. Bots and cyborgs have been taking over an increasing them, thereby conferring immunity role in protecting articles from damage since early 2006. to future damage from that source. 80 COMPUTER BOT SOCIALIZATION While bots are highly productive, performing edits hundreds of times faster than humans, they can also be massively disruptive to the community if they perform inappropriate actions, either because of disagreements between bot authors and other Wikipedians or because of bugs. To control for such issues, the Bot Approvals Group, a band of volunteer members, vets new bot proposals and addresses bot-related grievances. Geiger argues that Wikipedia Figure 2. The Huggle interface makes it easy to review a series of recent revisions by editors view bots not only as tools or filtering them according to the user’s preferences. “force multipliers” but also as social agents (“The Lives of Bots,” Critical Point of View: A Wikipedia Reader, In the discussions about whether administrators and a few thousand G. Lovink and N. Tkacz, eds., Institute HagermanBot should be allowed to other Wikipedia users. of Network Cultures, 2011, pp. 78-93). continue to operate, prolific editor Huggle makes it easy to review This is at least partially due and administrator Rich Farmbrough a series of recent revisions by to the fact that bots interact with argued it was important for users filtering them according to the Wikipedia in largely the same to understand that “bots are better user’s preferences. Figure 2 shows manner as human editors: they edit behaved than people,” and cautioned the main interface. On the left side articles and other pages via user against “botophobia” on Wikipedia. is a list of recent revisions sorted accounts, and they send and receive Bots were to some extent becoming by vandalistic probability. When a messages, usually read by their social agents, and they would need user selects one of these revisions, human maintainer—at least for now. to find a way to peacefully coexist it vanishes from the list for all Tawker, AntiVandalBot’s operator, beside flesh-and-blood contributors. other Huggle users to reduce the even half-jokingly nominated his bot Luckily for HagermanBot and risk of conflict. On the right side for election to the 2006 Arbitration its operator, an opt-out mechanism of