The Use of Software Tools and Autonomous Bots Against Vandalism: Eroding Wikipedia’S Moral Order?

Ethics Inf Technol DOI 10.1007/s10676-015-9366-9 ORIGINAL PAPER The use of software tools and autonomous bots against vandalism: eroding Wikipedia’s moral order? Paul B. de Laat1 Ó The Author(s) 2015. This article is published with open access at Springerlink.com Abstract English-language Wikipedia is constantly Introduction being plagued by vandalistic contributions on a massive scale. In order to fight them its volunteer contributors Communities that thrive on the contributions from their deploy an array of software tools and autonomous bots. respective ‘crowds’ have been with us for several decades After an analysis of their functioning and the ‘coactivity’ in now. The contents involved may be source code, text, or use between humans and bots, this research ‘discloses’ the pictures; the products created may be software, news, ref- moral issues that emerge from the combined patrolling by erence works, videos, maps, and the like. As argued before humans and bots. Administrators provide the stronger tools (de Laat 2012b, using Dutton 2008), the basic parameters only to trusted users, thereby creating a new hierarchical of such open content communities are twofold: on the one layer. Further, surveillance exhibits several troubling fea- hand their technological web design, which may range tures: questionable profiling practices (concerning from sharing (1.0), co-contributing (2.0), to co-creation anonymous users in particular), the use of the controversial (3.0); on the other hand their conditions of admission to the measure of reputation (under consideration), ‘oversurveil- collaborative work process, which may be fully open or lance’ where quantity trumps quality, and a prospective more restricted (say, only for experts). Some telling loss of the required moral skills whenever bots take over examples are Linux, Slashdot, Reddit, NowPublic, Wiki- from humans. The most troubling aspect, though, is that pedia, and YouTube. Wikipedia has become a Janus-faced institution. One face Open channels of communication invite all kinds of is the basic platform of MediaWiki software, transparent to contributions to the collective process. Inevitably, they also all. Its other face is the anti-vandalism system, which, in solicit contents that are disruptive and damaging to the contrast, is opaque to the average user, in particular as a goals of the community: off-topic, inappropriate, improper, result of the algorithms and neural networks in use. Finally offensive, malicious content and the like. Obviously, this it is argued that this secrecy impedes a much needed dis- issue is most urgent in those open-content communities cussion to unfold; a discussion that should focus on a that focus on co-contribution or co-creation without any ‘rebalancing’ of the anti-vandalism system and the devel- restrictions on entry. Most of such ‘open collaboration’ opment of more ethical information practices towards projects (a term coined by Forte and Lampe 2013), there- contributors. fore, have had no choice but to develop systems that detect improper contributions and subsequently eliminate them in Keywords Bots Á Disclosive ethics Á Profiling Á appropriate ways. In short, anti-intrusion systems have Surveillance Á Vandalism Á Wikipedia been introduced. Several varieties can be discerned.1 A default solution is that professional editors of the site police contributions for improper content, before and/or after publication. Most & Paul B. de Laat social news sites and citizen journals are a case in point. As [email protected] 1 University of Groningen, Groningen, The Netherlands 1 The next two paragraphs mostly follow de Laat (2012a). 123 P. B. de Laat a rule, though, the incoming flow of content is simply too certain fields of interest (encyclopedias) or for source code large to be handled in this way alone; therefore the crowds that meets certain technical criteria (open-source software); are called to the rescue. Such help may take various forms. their criteria for propriety obviously differ widely. First, contributors can be asked to scout for improper Correspondingly, coming closer to the ‘truth’ is much content as well. Any purported transgressions are to be harder for the latter types of community than for the for- reported to the managing editors who will deal with them. mer. Further, the size of the community involved matters: If necessary, additional workers are hired for the modera- the larger it becomes, the more tendencies towards strati- tion of reported contents. Facebook and YouTube report- fication and differentiation of power are likely to manifest edly outsource this kind of moderation globally to small themselves. Moreover, a multitude of elites may be forming, companies that all together employ thousands of lowly- each with their own agendas. paid workers. Many of the call centres involved, for Keeping these questions in mind let us return to the example, are located in the Philippines (Chen 2014). Sec- phenomenon of disruptive contributions and focus on their ondly, contributors can be invited to vote on new content: scale. It can safely be asserted that in the largest of all usually a plus or a minus. It was, of course, Digg that open-content encyclopedias, Wikipedia, disruption has pioneered this scheme (‘digging’). As a result of this vot- reached gigantic proportions. For the English language ing, contributions rise to public prominence (or visibility) version of the encyclopedia in particular, estimates hover more quickly, or fade away more quickly. Quality gets around 9000 malicious edits a day. A few reasons behind sorted out. Both reporting and voting are common practice this vandalism on a large scale can easily be identified. On now in social news sites and citizen journals. Thirdly, the one hand, the Wikipedia corpus involved has assumed contributors can be invited to change contributions that are large proportions (almost five million entries), making it a deemed inappropriate. Usually, of course, this means very attractive target for visitors of bad intent; on the other, deleting the content in question. It is common practice in Wikipedia allows co-creation (3.0) with full write-access communities that operate according to a wiki format that for all, a collaborative tool that is very susceptible to dis- allows contributors unrestricted write-access. Cases in ruptive actions by mala fide visitors. point are Citizendium, Wikinews, and Wikipedia. Finally, From the beginning of Wikipedia, in 2001, several contributors can be invited to spot improper contribu- approaches to curb vandalism have been tried. Some of tions—but only those who have proved themselves trust- them have been accepted and endured; others have been worthy are allowed to act accordingly. This hierarchical discarded along the way. In the following, the various tools option is common practice in communities of open-source and approaches are spelled out, with a particular emphasis software: after identifying them as genuine bugs, only on the ones that are still being used. It is shown that within developers (with write-access) may correct such edits by Wikipedia a whole collection of anti-vandalism tools has eliminating or fixing them in the main code tree. been developed. These instruments are being unfolded in a In order to present a unified framework in the above I massive campaign of collective monitoring of edits. This have been talking indiscriminately about improper and/or vigilance is exercised continuously, on a 24/7 basis. In the disruptive content. Such neutral language of course con- process, human Wikipedians are being mobilized as ceals a lot of variety between communities concerning patrollers, from administrators at the top to ordinary users what is to be counted as such. Moreover, impropriety and at the bottom; in addition, several autonomous bots are disruption only exist in the eyes of the particular beholder. doing their part. In a short detour, this system of surveil- These observations immediately raise many associated lance is analysed from the viewpoint of robotic ethics, and questions. What are the definitions of impropriety in use? shown to conform to the recent trend of ‘coactivity’ Who effectively decides on what is to count as such? Is it between humans and bots. the crowds that have been solicited or only a select few? If In the central part of this article, subsequently, the moral the latter, how did the select few obtain their position of questions that emerge from this mixed human/bot surveil- power: by appointment from above or by some democratic lance are discussed. The analysis is to be seen as an procedure? Ultimately the crucial question is: do such exercise in ‘disclosive ethics’ (Brey 2000): uncovering decisions on what counts as proper content bring us any features of the anti-vandalism system in place that are closer to the particular ‘truth’ being sought? largely opaque and appear to be morally neutral—but, in Obviously, the answers are bound to vary depending on actual fact, are not so. Based on a careful reading of which type of community is under scrutiny. Let me just Wikipedian pages that relate to vandalism fighting and a state here a few, select generalities. As concerns the kind of period of participant observation as a practising patroller I content being pursued, some communities are after an draw the following conclusions. As for power, next to exchange of opinions about topical issues (social news, administrators, strong anti-vandalism tools are shown to be citizen journals), others are on a quest for the ‘facts’ about only distributed to trusted users, thus creating dominance 123 The use of software tools and autonomous bots against vandalism: eroding Wikipedia’s moral… for a new hierarchical layer. As far as surveillance is namespace only (the entries themselves). Further, users concerned, filtering of edits becomes questionable when it may compose a list of specific entries they are interested in focuses on editor characteristics like anonymity or repu- (their own personal ‘watchlist’), and then select all recent tation.

The Use of Software Tools and Autonomous Bots Against Vandalism: Eroding Wikipedia’S Moral Order?

Wikipedia and Intermediary Immunity: Supporting Sturdy Crowd Systems for Producing Reliable Information Jacob Rogers Abstract

Initiativen Roger Cloes Infobrief

Community Or Social Movement? Piotr Konieczny

Reliability of Wikipedia 1 Reliability of Wikipedia

Towards Content-Driven Reputation for Collaborative Code Repositories

Checkuser and Editing Patterns

The Case of Wikipedia Jansn

Wikipedia's Labor Squeeze and Its Consequences Eric Goldman Santa Clara University School of Law, [email protected]

A Cultural and Political Economy of Web 2.0

Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features

Editing Wikipedia: a Guide to Improving Content on the Online Encyclopedia

Automatic Vandalism Detection in Wikipedia: Towards a Machine Learning Approach