Tell Me More: an Actionable Quality Model for Wikipedia
Total Page:16
File Type:pdf, Size:1020Kb
Tell Me More: An Actionable Quality Model for Wikipedia Morten Warncke-Wang Dan Cosley John Riedl GroupLens Research Department of Info. Science GroupLens Research Dept. of Comp. Sci. and Eng. Cornell University Dept. of Comp. Sci. and Eng. University of Minnesota Ithaca, NY 14850, USA University of Minnesota Minneapolis, MN 55455, USA [email protected] Minneapolis, MN 55455, USA [email protected] [email protected] ABSTRACT This raises the questions of what it means for a Wikipedia In this paper we address the problem of developing action- article to have quality and what actions are needed to im- able quality models for Wikipedia, models whose features di- prove it. We introduce these topics by surveying the research rectly suggest strategies for improving the quality of a given literature, first to understand the notion of quality and then article. We first survey the literature in order to understand to describe Wikipedia's quality assessment procedures, as the notion of article quality in the context of Wikipedia and well as the improvement actions these procedures recom- existing approaches to automatically assess article quality. mend. We follow most of the prior literature by focusing on We then develop classification models with varying combi- English Wikipedia; many of the concepts would work sim- nations of more or less actionable features, and find that a ilarly in other language editions, though the details would model that only contains clearly actionable features deliv- be different. ers solid performance. Lastly we discuss the implications of these results in terms of how they can help improve the 1.1 Article quality in Wikipedia quality of articles across Wikipedia. The notion of article quality in English Wikipedia is sim- ilar to that of traditional encyclopaedias, as discussed by Stvilia et al. [29]. Accuracy in particular has received re- Categories and Subject Descriptors search attention, aiming to understand how a community H.5 [Information Interfaces and Presentation]: Group process with no central oversight could result in articles with and Organization Interfaces|Collaborative computing, accuracy comparable to traditional encyclopedias [7, 14]. Computer-supported cooperative work, Web-based interac- The notion of article quality in Wikipedia differs across tion both space and time. There are currently 285 language edi- tions, each with their own user community, and research has shown that the notion of quality differs between these Keywords communities [28]. Featured Articles are considered to be Wikipedia, Information Quality, Modelling, Classification, the best articles Wikipedia has to offer. When they first ap- Flaw Detection, Machine Learning peared around April 2002 the only information quality crite- rion listed was \brilliant prose" [31]. Today, they go through a peer review process [34] that checks the articles accord- 1. INTRODUCTION ing to \accuracy, neutrality, completeness, and style."2 This With four million articles in the English language edition change over time is reflected in 1,013 articles having lost and over 25 million articles across 285 languages, Wikipedia their Featured Article status as of March 20133. has succeeded at building a compendium with a large num- Wikipedia has other quality classes in addition to Fea- ber of articles. At Wikimania in 2006, Jimmy Wales held tured Articles. There are seven assessment classes span- an opening plenary where he suggested Wikipedia contribu- ning all levels of quality from high to low: Featured Articles, tors should shift their focus from the number of articles and Good Articles, A-, B-, C-, Start-, and Stub-class4. Each of instead work on improving their quality1. these classes have specific criteria5, for instance how com- pletely the article covers the subject area. Out of the seven 1http://wikimania2006.wikimedia.org/wiki/Opening_ classes only the Featured Articles are regarded as mostly fin- Plenary_(transcript)#Quality_initiative_.2833: ished; the other six classes come with general descriptions of 20.29 how improvements can be made. For example, C-class arti- 2http://en.wikipedia.org/wiki/Wikipedia:Featured_ Permission to make digital or hard copies of all or part of this work for articles personal or classroom use is granted without fee provided that copies are 3http://en.wikipedia.org/wiki/Wikipedia:Former_ not made or distributed for profit or commercial advantage and that copies featured_articles bear this notice and the full citation on the first page. To copy otherwise, to 4 republish, to post on servers or to redistribute to lists, requires prior specific There are also two classes for \list" type articles: Featured permission and/or a fee. List and List-class. In this paper we do not examine list type articles. WikiSym ’13, Aug 05-07 2013, Hong Kong, China 5 Copyright is held by the owner/author(s). Publication rights licensed to http://en.wikipedia.org/wiki/Wikipedia:Version_1. ACM. ACM 978-1-4503-1852-5/13/08 $15.00. 0_Editorial_Team/Assessment#Grades cles have the following suggestion: \Considerable editing is et al. found that articles started by editors with high social needed to close gaps in content and solve cleanup problems." capital more quickly reached high quality status [24]. Complementing the quality assessment classes, Wikipedia Research has also studied article editor networks, using has also developed a set of templates6 for flagging articles the connections between editors and articles in order to un- having specific flaws, such as \this Wikipedia page is out- derstand how those connections affect quality. Brandes et dated" or \does not cite any references or sources". Unlike al. and Wang and Iwaihara add several features describing assessment classes, which provide explicit ratings of arti- editor activity to the edges in a network graph of an article cle quality with implicit suggestions for improvement, these and show that it is possible to predict article quality using templates explicitly express specific needed improvements statistics from this graph [6, 35]. A graph combining editors that in turn implicitly describe a notion of article quality. and articles was used by Wu et al., from which they mined Our goal in this paper is to combine these two types of certain network edge patterns between editors and articles, article quality labelling, by creating what we call an \action- called motifs. They showed that the motifs could be used able"quality model for the English Wikipedia. Such a model to predict article quality with comparable performance to would both accurately assess article quality, and, through other approaches [39]. the features included in the model, explicitly indicate which These editor-based assessments help researchers under- aspects of the article need improvement. stand how characteristics of articles and editors correspond to effective collaboration and could be used to study the 1.2 Assessing quality overall quality of Wikipedia and how it has changed over The research literature has both examples of studies look- time. They might also be directly useful for quality assess- ing at more general notions of quality (e.g., how collabo- ment work in Wikipedia, helping assessors find articles that ration leads to quality articles) and at specific features that are misclassified or that might be candidates for developing make it possible to distinguish between high and low quality into Good or Featured Articles. articles taken from the seven assessment classes (typically Featured Articles and Stub- or Start-class). We organise 1.2.2 Article-based assessment our review by whether the cited paper primarily describes Editor-based metrics, though useful for modelling article editor- or article-based approaches, although some of them quality, do not lead to directly actionable results. Knowing combine elements of both. We also discuss potential use that editor diversity leads to higher article quality suggests cases that each approach might support. that a user should look to collaborate with other editors with certain skills or experience levels, but does not explain 1.2.1 Editor-based assessment how to create collaborations with those editors. Approaches Editor-based approaches primarily investigate properties that look at characteristics of articles (such as length) might of the editors, for instance how much experience an editor be more useful for suggesting specific improvements that has, in order to understand how those properties affect ar- individual editors can make. ticle quality. W¨ohner and Peters investigated the life cycle of articles Studies on editor diversity have looked at how the com- and found that high quality articles have an edit pattern position of editor experience, skills and knowledge, and co- that is distinctly different from low quality articles [38]. A ordination between editors determines article quality. For shift to high quality happens in a burst, suggesting the shift instance it has been shown that high quality articles have requires a coordinated effort, as previously discussed when a large number of editors and edits, with intense coopera- looking at editor diversity [20, 37]. tive behaviour [37]. Kittur and Kraut confirmed those re- Looking instead at article content, research has found that sults and refined them by discovering that article quality metrics describing the writing style of an article, for instance increases faster when a concentrated group of editors work the variety of words used, correlates with quality [21, 40]. together [20]. In that situation, explicit