A Test of Cross-Wiki Search: Helping Users Discover Content on Wikipedia’S Sister Projects
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
1 Wikipedia: an Effective Anarchy Dariusz Jemielniak, Ph.D
Wikipedia: An Effective Anarchy Dariusz Jemielniak, Ph.D. Kozminski University [email protected] Paper presented at the Society for Applied Anthropology conference in Baltimore, MD (USA), 27-31 March, 2012 (work in progress) This paper is the first report from a virtual ethnographic study (Hine, 2000; Kozinets, 2010) of Wikipedia community conducted 2006-2012, by the use of participative methods, and relying on an narrative analysis of Wikipedia organization (Czarniawska, 2000; Boje, 2001; Jemielniak & Kostera, 2010). It serves as a general introduction to Wikipedia community, and is also a basis for a discussion of a book in progress, which is going to address the topic. Contrarily to a common misconception, Wikipedia was not the first “wiki” in the world. “Wiki” (originated from Hawaiian word for “quick” or “fast”, and named after “Wiki Wiki Shuttle” on Honolulu International Airport) is a website technology based on a philosophy of tracking changes added by the users, with a simplified markup language (allowing easy additions of, e.g. bold, italics, or tables, without the need to learn full HTML syntax), and was originally created and made public in 1995 by Ward Cunningam, as WikiWikiWeb. WikiWikiWeb was an attractive choice among enterprises and was used for communication, collaborative ideas development, documentation, intranet, knowledge management, etc. It grew steadily in popularity, when Jimmy “Jimbo” Wales, then the CEO of Bomis Inc., started up his encyclopedic project in 2000: Nupedia. Nupedia was meant to be an online encyclopedia, with free content, and written by experts. In an attempt to meet the standards set by professional encyclopedias, the creators of Nupedia based it on a peer-review process, and not a wiki-type software. -
Modeling Popularity and Reliability of Sources in Multilingual Wikipedia
information Article Modeling Popularity and Reliability of Sources in Multilingual Wikipedia Włodzimierz Lewoniewski * , Krzysztof W˛ecel and Witold Abramowicz Department of Information Systems, Pozna´nUniversity of Economics and Business, 61-875 Pozna´n,Poland; [email protected] (K.W.); [email protected] (W.A.) * Correspondence: [email protected] Received: 31 March 2020; Accepted: 7 May 2020; Published: 13 May 2020 Abstract: One of the most important factors impacting quality of content in Wikipedia is presence of reliable sources. By following references, readers can verify facts or find more details about described topic. A Wikipedia article can be edited independently in any of over 300 languages, even by anonymous users, therefore information about the same topic may be inconsistent. This also applies to use of references in different language versions of a particular article, so the same statement can have different sources. In this paper we analyzed over 40 million articles from the 55 most developed language versions of Wikipedia to extract information about over 200 million references and find the most popular and reliable sources. We presented 10 models for the assessment of the popularity and reliability of the sources based on analysis of meta information about the references in Wikipedia articles, page views and authors of the articles. Using DBpedia and Wikidata we automatically identified the alignment of the sources to a specific domain. Additionally, we analyzed the changes of popularity and reliability in time and identified growth leaders in each of the considered months. The results can be used for quality improvements of the content in different languages versions of Wikipedia. -
Omnipedia: Bridging the Wikipedia Language
Omnipedia: Bridging the Wikipedia Language Gap Patti Bao*†, Brent Hecht†, Samuel Carton†, Mahmood Quaderi†, Michael Horn†§, Darren Gergle*† *Communication Studies, †Electrical Engineering & Computer Science, §Learning Sciences Northwestern University {patti,brent,sam.carton,quaderi}@u.northwestern.edu, {michael-horn,dgergle}@northwestern.edu ABSTRACT language edition contains its own cultural viewpoints on a We present Omnipedia, a system that allows Wikipedia large number of topics [7, 14, 15, 27]. On the other hand, readers to gain insight from up to 25 language editions of the language barrier serves to silo knowledge [2, 4, 33], Wikipedia simultaneously. Omnipedia highlights the slowing the transfer of less culturally imbued information similarities and differences that exist among Wikipedia between language editions and preventing Wikipedia’s 422 language editions, and makes salient information that is million monthly visitors [12] from accessing most of the unique to each language as well as that which is shared information on the site. more widely. We detail solutions to numerous front-end and algorithmic challenges inherent to providing users with In this paper, we present Omnipedia, a system that attempts a multilingual Wikipedia experience. These include to remedy this situation at a large scale. It reduces the silo visualizing content in a language-neutral way and aligning effect by providing users with structured access in their data in the face of diverse information organization native language to over 7.5 million concepts from up to 25 strategies. We present a study of Omnipedia that language editions of Wikipedia. At the same time, it characterizes how people interact with information using a highlights similarities and differences between each of the multilingual lens. -
Librarians As Wikimedia Movement Organizers in Spain: an Interpretive Inquiry Exploring Activities and Motivations
Librarians as Wikimedia Movement Organizers in Spain: An interpretive inquiry exploring activities and motivations by Laurie Bridges and Clara Llebot Laurie Bridges Oregon State University https://orcid.org/0000-0002-2765-5440 Clara Llebot Oregon State University https://orcid.org/0000-0003-3211-7396 Citation: Bridges, L., & Llebot, C. (2021). Librarians as Wikimedia Movement Organizers in Spain: An interpretive inQuiry exploring activities and motivations. First Monday, 26(6/7). https://doi.org/10.5210/fm.v26i3.11482 Abstract How do librarians in Spain engage with Wikipedia (and Wikidata, Wikisource, and other Wikipedia sister projects) as Wikimedia Movement Organizers? And, what motivates them to do so? This article reports on findings from 14 interviews with 18 librarians. The librarians interviewed were multilingual and contributed to Wikimedia projects in Castilian (commonly referred to as Spanish), Catalan, BasQue, English, and other European languages. They reported planning and running Wikipedia events, developing partnerships with local Wikimedia chapters, motivating citizens to upload photos to Wikimedia Commons, identifying gaps in Wikipedia content and filling those gaps, transcribing historic documents and adding them to Wikisource, and contributing data to Wikidata. Most were motivated by their desire to preserve and promote regional languages and culture, and a commitment to open access and open education. Introduction This research started with an informal conversation in 2018 about the popularity of Catalan Wikipedia, Viquipèdia, between two library coworkers in the United States, the authors of this article. Our conversation began with a sense of wonder about Catalan Wikipedia, which is ranked twentieth by number of articles, out of 300 different language Wikipedias (Meta contributors, 2020). -
The Wikipedia Diversity Observatory a Project to Identify and Bridge Content Gaps in Wikipedia
The Wikipedia Diversity Observatory A Project to Identify and Bridge Content Gaps in Wikipedia Marc Miquel-Ribé David Laniado Universitat Pompeu Fabra, Barcelona, Catalonia Eurecat, Centre Tecnològic de Catalunya [email protected] [email protected] ABSTRACT 1 Introduction In this paper we present the Wikipedia Diversity Observatory, Wikipedia is among the largest information repositories on the a project aimed to increase diversity within Wikipedia Internet that are both multilingual and created through language editions. The project includes dashboards with collaborative effort. Its prime objective1 is to "give free access visualizations and tools which show the gaps in terms of to the sum of all human knowledge" and, consequently, it exists concepts not represented or not shared across languages. The in as many as 309 languages. Even though the language dashboards are built on datasets generated for each of the more communities make the projects grow on a constant basis, the than 300 language editions, with features that label each article content does not represent the existing diversity in peoples, according to different categories relevant to overall content places, and cultures of the world; furthermore, there is a gap diversity. Through various examples, we show how the tools between language editions and articles often are not shared, or encourage and help editors to bridge the gaps in Wikipedia remain even unique to one language [1]. The creation of articles content. Finally, we discuss the project's impact on the in Wikipedia language editions is spontaneous and non- communities and implications for the Wikimedia movement, in directed. Several studies showed that cultural and geographical a moment in which covering diversity is considered strategic. -
Jimmy Wales and Larry Sanger, It Is the Largest, Fastest-Growing and Most Popular General Reference Work Currently Available on the Internet
Tomasz „Polimerek” Ganicz Wikimedia Polska WikipediaWikipedia andand otherother WikimediaWikimedia projectsprojects WhatWhat isis Wikipedia?Wikipedia? „Imagine„Imagine aa worldworld inin whichwhich everyevery singlesingle humanhuman beingbeing cancan freelyfreely shareshare inin thethe sumsum ofof allall knowledge.knowledge. That'sThat's ourour commitment.”commitment.” JimmyJimmy „Jimbo”„Jimbo” Wales Wales –– founder founder ofof WikipediaWikipedia As defined by itself: Wikipedia is a free multilingual, open content encyclopedia project operated by the non-profit Wikimedia Foundation. Its name is a blend of the words wiki (a technology for creating collaborative websites) and encyclopedia. Launched in January 2001 by Jimmy Wales and Larry Sanger, it is the largest, fastest-growing and most popular general reference work currently available on the Internet. OpenOpen and and free free content content RichardRichard StallmanStallman definition definition of of free free software: software: „The„The wordword "free""free" inin ourour namename doesdoes notnot referrefer toto price;price; itit refersrefers toto freedom.freedom. First,First, thethe freedomfreedom toto copycopy aa programprogram andand redistributeredistribute itit toto youryour neighbors,neighbors, soso thatthat theythey cancan useuse itit asas wellwell asas you.you. Second,Second, thethe freedomfreedom toto changechange aa program,program, soso ththatat youyou cancan controlcontrol itit insteadinstead ofof itit controllingcontrolling you;you; forfor this,this, thethe sourcesource -
The Case of Croatian Wikipedia: Encyclopaedia of Knowledge Or Encyclopaedia for the Nation?
The Case of Croatian Wikipedia: Encyclopaedia of Knowledge or Encyclopaedia for the Nation? 1 Authorial statement: This report represents the evaluation of the Croatian disinformation case by an external expert on the subject matter, who after conducting a thorough analysis of the Croatian community setting, provides three recommendations to address the ongoing challenges. The views and opinions expressed in this report are those of the author and do not necessarily reflect the official policy or position of the Wikimedia Foundation. The Wikimedia Foundation is publishing the report for transparency. Executive Summary Croatian Wikipedia (Hr.WP) has been struggling with content and conduct-related challenges, causing repeated concerns in the global volunteer community for more than a decade. With support of the Wikimedia Foundation Board of Trustees, the Foundation retained an external expert to evaluate the challenges faced by the project. The evaluation, conducted between February and May 2021, sought to assess whether there have been organized attempts to introduce disinformation into Croatian Wikipedia and whether the project has been captured by ideologically driven users who are structurally misaligned with Wikipedia’s five pillars guiding the traditional editorial project setup of the Wikipedia projects. Croatian Wikipedia represents the Croatian standard variant of the Serbo-Croatian language. Unlike other pluricentric Wikipedia language projects, such as English, French, German, and Spanish, Serbo-Croatian Wikipedia’s community was split up into Croatian, Bosnian, Serbian, and the original Serbo-Croatian wikis starting in 2003. The report concludes that this structure enabled local language communities to sort by points of view on each project, often falling along political party lines in the respective regions. -
Does Wikipedia Matter? the Effect of Wikipedia on Tourist Choices Marit Hinnosaar, Toomas Hinnosaar, Michael Kummer, and Olga Slivko Discus Si On Paper No
Dis cus si on Paper No. 15-089 Does Wikipedia Matter? The Effect of Wikipedia on Tourist Choices Marit Hinnosaar, Toomas Hinnosaar, Michael Kummer, and Olga Slivko Dis cus si on Paper No. 15-089 Does Wikipedia Matter? The Effect of Wikipedia on Tourist Choices Marit Hinnosaar, Toomas Hinnosaar, Michael Kummer, and Olga Slivko First version: December 2015 This version: September 2017 Download this ZEW Discussion Paper from our ftp server: http://ftp.zew.de/pub/zew-docs/dp/dp15089.pdf Die Dis cus si on Pape rs die nen einer mög lichst schnel len Ver brei tung von neue ren For schungs arbei ten des ZEW. Die Bei trä ge lie gen in allei ni ger Ver ant wor tung der Auto ren und stel len nicht not wen di ger wei se die Mei nung des ZEW dar. Dis cus si on Papers are inten ded to make results of ZEW research prompt ly avai la ble to other eco no mists in order to encou ra ge dis cus si on and sug gesti ons for revi si ons. The aut hors are sole ly respon si ble for the con tents which do not neces sa ri ly repre sent the opi ni on of the ZEW. Does Wikipedia Matter? The Effect of Wikipedia on Tourist Choices ∗ Marit Hinnosaar† Toomas Hinnosaar‡ Michael Kummer§ Olga Slivko¶ First version: December 2015 This version: September 2017 September 25, 2017 Abstract We document a causal influence of online user-generated information on real- world economic outcomes. In particular, we conduct a randomized field experiment to test whether additional information on Wikipedia about cities affects tourists’ choices of overnight visits. -
Cultural Bias in Wikipedia Content on Famous Persons
ASI21577.tex 14/6/2011 16: 39 Page 1 Cultural Bias in Wikipedia Content on Famous Persons Ewa S. Callahan Department of Film, Video, and Interactive Media, School of Communications, Quinnipiac University, 275 Mt. Carmel Avenue, Hamden, CT 06518-1908. E-mail: [email protected] Susan C. Herring School of Library and Information Science, Indiana University, Bloomington, 1320 E. 10th St., Bloomington, IN 47405-3907. E-mail: [email protected] Wikipedia advocates a strict “neutral point of view” This issue comes to the fore when one considers (NPOV) policy. However, although originally a U.S-based, that Wikipedia, although originally a U.S-based, English- English-language phenomenon, the online, user-created language phenomenon, now has versions—or “editions,” as encyclopedia now has versions in many languages. This study examines the extent to which content and perspec- they are called on Wikipedia—in many languages, with con- tives vary across cultures by comparing articles about tent and perspectives that can be expected to vary across famous persons in the Polish and English editions of cultures. With regard to coverage of persons in different Wikipedia.The results of quantitative and qualitative con- language versions, Kolbitsch and Maurer (2006) claim that tent analyses reveal systematic differences related to Wikipedia “emphasises ‘local heroes”’ and thus “distorts the different cultures, histories, and values of Poland and the United States; at the same time, a U.S./English- reality and creates an imbalance” (p. 196). However, their language advantage is evident throughout. In conclusion, evidence is anecdotal; empirical research is needed to inves- the implications of these findings for the quality and tigate the question of whether—and if so, to what extent—the objectivity of Wikipedia as a global repository of knowl- cultural biases of a country are reflected in the content of edge are discussed, and recommendations are advanced Wikipedia entries written in the language of that country. -
Master Thesis
MASTER THESIS TITLE : Cultural configuration of Wikipedia: measuring Autoreferentiality in different languages MASTER DEGREE: Master of Science in Telecommunication Engineering & Management AUTHOR: Marc Miquel Ribe´ DIRECTOR: Horacio Rodr´ıguez Hontoria TUTOR: Sebastia` Sallent Ribes DATE: March 31, 2011 T´ıtol : Cultural configuration of Wikipedia: measuring Autoreferentiality in different languages Autor: Marc Miquel Ribe´ Director: Horacio Rodr´ıguez Hontoria Tutor: Sebastia` Sallent Ribes Data: 31 de marc¸de 2011 Resum ”Wikipedia es´ un projecte enciclopedic` multiling¨ue, col·laboratiu, basat en web i sense anim` de lucre impulsat per la Fundacio´ Wikimedia”, aix´ı es´ com s’autodescriu Wikipedia en la definicio´ de l’article que du el seu nom. Aixo` significa que l’enciclopedia` pot ser modificada en qualsevol moment, per qualsevol persona i des de qualsevol lloc. Aquestes premisses i la seva gran participacio´ fan que es tracti d’un excel·lent objecte social d’estudi, que a la vegada, per tractar-se d’un artefacte tecnologic,` permeti tambe´ l’´us de tecniques` de processament llenguatge natural, obtencio´ i mineria de dades. Tanmateix, en la recerca actual hi ha una clara mancanc¸a en software que pugui aproximar-s’hi d’una manera integral. Tenint en compte aquest buit realitzem una caracteritzacio´ de Wikipedia amb l’objectiu de coneixer` a fons quins son´ els elements i estructures d’informacio´ que conte´ i com despres´ poden obtenir-se mitjanc¸ant una eina anal´ıtica. Partim de l’API existent anomenada wikAPIdia, que desenvolupem fins a incloure-hi noves funcionalitats i posar-la apunt per a encarar m´ultiples escenaris i problematiques` de les ciencies` socials. -
A Complete, Longitudinal and Multi-Language Dataset of the Wikipedia Link Networks
WikiLinkGraphs: A Complete, Longitudinal and Multi-Language Dataset of the Wikipedia Link Networks Cristian Consonni David Laniado Alberto Montresor DISI, University of Trento Eurecat, Centre Tecnologic` de Catalunya DISI, University of Trento [email protected] [email protected] [email protected] Abstract and result in a huge conceptual network. According to Wiki- pedia policies2 (Wikipedia contributors 2018e), when a con- Wikipedia articles contain multiple links connecting a sub- cept is relevant within an article, the article should include a ject to other pages of the encyclopedia. In Wikipedia par- link to the page corresponding to such concept (Borra et al. lance, these links are called internal links or wikilinks. We present a complete dataset of the network of internal Wiki- 2015). Therefore, the network between articles may be seen pedia links for the 9 largest language editions. The dataset as a giant mind map, emerging from the links established by contains yearly snapshots of the network and spans 17 years, the community. Such graph is not static but is continuously from the creation of Wikipedia in 2001 to March 1st, 2018. growing and evolving, reflecting the endless collaborative While previous work has mostly focused on the complete hy- process behind it. perlink graph which includes also links automatically gener- The English Wikipedia includes over 163 million con- ated by templates, we parsed each revision of each article to nections between its articles. This huge graph has been track links appearing in the main text. In this way we obtained exploited for many purposes, from natural language pro- a cleaner network, discarding more than half of the links and cessing (Yeh et al. -
Measuring Self-Focus Bias in Community-Maintained Knowledge Repositories
Measuring Self-Focus Bias in Community-Maintained Knowledge Repositories Brent Hecht† and Darren Gergle†‡ †Dept. of Electrical Engineering and Computer Science ‡Dept. of Communication Studies Northwestern University [email protected], [email protected] ABSTRACT In this paper, we explore this question by introducing and Self-focus is a novel way of understanding a type of bias in measuring self-focus, a new way of understanding a type of bias in community-maintained Web 2.0 graph structures. It goes beyond the graph structures that underlie many of these community- previous measures of topical coverage bias by encapsulating both maintained repositories, including one of the largest: Wikipedia. node- and edge-hosted biases in a single holistic measure of an We define self-focus bias as occurring when contributors to a entire community-maintained graph. We outline two methods to knowledge repository encode information that is important and quantify self-focus, one of which is very computationally correct to them and a large proportion of contributors to the same inexpensive, and present empirical evidence for the existence of repository, but not important and correct to contributors of similar self-focus using a “hyperlingual” approach that examines 15 repositories. different language editions of Wikipedia. We suggest applications Self-focus bias is similar to topical coverage biases [8, 11] in that of our methods and discuss the risks of ignoring self-focus bias in it seeks to describe the semantic makeup of knowledge technological applications. repositories. Topical coverage bias studies explicitly or implicitly compare the distribution of articles (or a similar measure) in Categories and Subject Descriptors particular semantic categories in Wikipedia to that of a more H.5.3 [Information Systems]: Group and Organization Interfaces traditional knowledge repository, generally in an effort to show – collaborative computing, computer-supported cooperative work, that Wikipedia describes in more detail semantic areas that are of theory and models.