A Brief Review of Studies of Wikipedia in Peer-Reviewed Journals

Chitu Okoli
John Molson School of Business
Concordia University, Montréal, Canada
[email protected]

Abstract

Since its establishment in 2001, Wikipedia, “the free encyclopedia that anyone can edit,” has become a cultural icon of the unlimited possibilities of the World Wide Web. Thus, it has become a serious subject of scholarly study to objectively and rigorously understand it as a phenomenon. This paper reviews studies of Wikipedia that have been published in peer-reviewed journals. Among the wealth of studies reviewed, major sub-streams of research covered include: how and why Wikipedia works; assessments of the reliability of its content; using it as a data source for various studies; and applications of Wikipedia in different domains of endeavour.

1. Introduction

Although the open source approach has traditionally been applied only to software products, recent years have demonstrated its applicability to the creation of other information products, most notably to the open content Wikipedia, “the free encyclopedia that anyone can edit” (www.wikipedia.org). In just seven years since its establishment in 2001, this comprehensive general encyclopedia has compiled over ten million articles in 253 languages (with over 2.5 million in English). Other open content encyclopedias, though not nearly as well developed, include the Association for Information Systems’ ISPedia (http://ispedia.terry.uga.edu), Enciclopedia Libre Universal en Español (http://enciclopedia.us.es), Wikinfo (http://www.wikinfo.org), and Citizendium (http://www.citizendium.org).

Wikipedia is based on wiki technology [1], a social collaboration Web application that allows viewers to add content to Web pages using minimal technical skills. The submitted content is published under the Free Software Foundation’s Free Documentation License (FDL), the textual complement to the General Public License for open source software. This license permits anyone to freely copy, modify, and distribute the encyclopedia’s content under two conditions: the original source must be cited, and everyone else must be accorded the same right of free modification and distribution of all derivative works via the FDL.

Similar to open source software, Wikipedia aims to create high-quality digital information products through the participation of large numbers of contributors, mostly volunteers, though some might be paid. It is structured to eliminate or minimize individual agendas and strive towards public or industry welfare in setting policies for the development of the encyclopedia. Its organizational structure is generally rather loose, yet there is some central administration that permits the project to survive and flourish. Throughout its history, anecdotal evidence has abounded as to Wikipedia’s quality, mainly by referring to its popularity (it is regularly ranked by Alexa as a global top-ten website), and by noting the status of some of the organizations that regularly cite it, such as the British Broadcasting Corporation.

However, more recently, Wikipedia has begun to attract the attention of scholars who have attempted to more rigorously study the Wikipedia phenomenon from many different angles. This study attempts to build a base for scholarly research on the subject of Wikipedia by reviewing most of the important scholarly work that has been done thus far. Because the focus here is on high-quality scholarly work, the studies reviewed here are limited to those that have been published in peer-reviewed journals. Thus, work that has been published in conference proceedings is not included, nor is other work that has not passed through scholarly peer review.

I begin by reviewing studies concerning how Wikipedia works, and why it works successfully. Then I examine a large body of research that uses various approaches, both comparative and epistemological, to assess the reliability of Wikipedia. Then there are many studies that refer to, depend on, or focus on Wikipedia as a source for information or data in the study. Finally, a few studies propose and examine applications of Wikipedia and the Wikipedia concept in education and economics.

2. How and why Wikipedia works

The earliest peer-reviewed articles on Wikipedia were more or less general introductory reviews, introducing readers to the novel idea of an encyclopedia that anyone can edit. More advanced studies began to dig deeper by investigating the motivations of Wikipedians to contribute, and studying the details of the process of developing articles.

2.1. Reviews of Wikipedia

Remy [2] provided perhaps the first review of Wikipedia in a peer-reviewed journal, just a year after its establishment. She considered the free-editing concept dubious, and referred readers rather to Nupedia, the expert-written online encyclopedia that preceded Wikipedia [3].

Hall [4] and Strategic Finance [5] provided reviews that were generally neutral, mainly stressing the novelty of the concept. Krause [6] briefly discussed Wikipedia in the context of online social collaboration, as a prime example of wiki collaboration. McFedries [7] discussed wikis, referring to Wikipedia as the prime example. Morse [8] interviewed Jimmy Wales, the founder of Wikipedia. Wales mainly discussed the workings and value of wikis for companies, and when they were appropriate or inappropriate. Arter [9] reviewed Wikipedia from the point of view of its peer-review mechanism as a means of ensuring quality.

2.2. Motivations for contributing

In the study of open source software, the question of why people code for free has always been a fascinating subject of study. Similarly, a few studies have explored why Wikipedians contribute their free time to the encyclopedia, especially considering that, unlike in the case of open source software, it is extremely rare for anyone to be paid for their efforts.

In their study of public contribution to Amazon book reviews and to Wikipedia articles, Peddibhotla and Subramani [10] found multiple motives for contribution to Wikipedia, both self-oriented and other-oriented. Extending critical mass theory, they found that the higher the quality of contributions, the lower their quantity, and vice versa. Moreover, the motivations for contributing high-quality articles were different from those for contributing a high quantity of articles. Nov [11] found that Wikipedians are motivated to contribute primarily for the fun of it, and for ideological commitment to the project. However, other hypothesized motivation categories such as social reasons, career advancement, and protection, were not found to be very relevant.

2.3. The editorial process

A number of studies have gone further to try to understand the details of how the editorial process of Wikipedia operates to produce high-quality encyclopedia articles. Brandes and Lerner [12] specifically examined the historical progress of edit wars, where controversies lead to mutual revisions of contributions. They presented visual analysis tools to study the progress of such conflicts. Okoli and Oh [13] studied how Wikipedians’ interactions in both direct article creation and in discussions with each other affected their election to the status of “administrator,” which gives greater editorial privileges. Stvilia et al. [14] examined the article-creation process in Wikipedia to discover how information quality is assured. They found that Wikipedia’s discussion pages provide an extensive resource of documentation on how the processes of error detection and correction occur. Den Besten and Dalle [15] studied the Simple English Wikipedia, a Wikipedia distinct from the English one that limits its vocabulary sense and grammatical structure to facilitate reading by children and learners of English. They investigated the editorial process that implemented the rules instated to keep articles “simple.”

3. The reliability of Wikipedia

So far, perhaps the most abundant body of scholarly work conducted on Wikipedia has been formal studies evaluating the reliability of Wikipedia, variously expressed as trustworthiness, quality, or accuracy. In other words, why trust the contents of an encyclopedia that anyone can edit? The most famous scholarly assessment of Wikipedia is a comparison of selected science articles in Wikipedia and Encyclopaedia Britannica conducted by the journal Nature [16]. The study found Wikipedia’s accuracy comparable to that of Britannica. Among 42 articles, Wikipedia had an average of four errors each, and Britannica had three.

In this section I will first present a large number of negative assessments of Wikipedia’s reliability. Then I will present a number of studies that take an epistemic approach of assessing the reliability of Wikipedia; this perspective unequivocally considers the radical encyclopedia in a very favourable light.

3.1. Criticisms

Denning et al. [17] question whether Wikipedia’s collaborative editing process is capable of producing accurate and authoritative information on a thoroughly comprehensive scope of human knowledge. Gorman [18] contends that Wikipedia has no basis to call itself an encyclopedia, and that without regulation of its

[...] Britannica. With such criteria, he argued that Wikipedia has been repeatedly shown to be quite reliable [16]. Moreover, when compared to its more likely alternate sources on the Web, Wikipedia is strikingly superior as a source of knowledge [26]. He argued that Wikipedia has important epistemological properties (“e.g., power, speed, and fecundity”) that