Wikipedia Research and Tools: Review and Comments
Total pages: 16
File type: PDF, size: 1020 KB
Recommended publications
Cultural Anthropology Through the Lens of Wikipedia: Historical Leader Networks, Gender Bias, and News-Based Sentiment
Peter A. Gloor, Joao Marcos, Patrick M. de Boer, Hauke Fuehres, Wei Lo, Keiichi Nemoto. MIT Center for Collective Intelligence.
Abstract: In this paper we study the differences in historical World View between Western and Eastern cultures, represented through the English, the Chinese, Japanese, and German Wikipedia. In particular, we analyze the historical networks of the World’s leaders since the beginning of written history, comparing them in the different Wikipedias and assessing cultural chauvinism. We also identify the most influential female leaders of all times in the English, German, Spanish, and Portuguese Wikipedia. As an additional lens into the soul of a culture we compare top terms, sentiment, emotionality, and complexity of the English, Portuguese, Spanish, and German Wikinews.
1 Introduction: Over the last ten years the Web has become a mirror of the real world (Gloor et al. 2009). More recently, the Web has also begun to influence the real world: Societal events such as the Arab spring and the Chilean student unrest have drawn a large part of their impetus from the Internet and online social networks. In the meantime, Wikipedia has become one of the top ten Web sites, occasionally beating daily newspapers in the actuality of most recent news. Be it the resignation of German national soccer team captain Philipp Lahm, or the downing of Malaysian Airlines flight 17 in the Ukraine by a guided missile, the corresponding Wikipedia page is updated as soon as the actual event happened (Becker 2012.
A Survey of Orthographic Information in Machine Translation
Machine Translation Journal manuscript. Bharathi Raja Chakravarthi, Priya Rani, Mihael Arcan, John P. McCrae.
Abstract: Machine translation is one of the applications of natural language processing which has been explored in different languages. Recently researchers started paying attention towards machine translation for resource-poor languages and closely related languages. A widespread and underlying problem for these machine translation systems is the variation in orthographic conventions which causes many issues to traditional approaches. Two languages written in two different orthographies are not easily comparable, but orthographic information can also be used to improve the machine translation system. This article offers a survey of research regarding orthography’s influence on machine translation of under-resourced languages. It introduces under-resourced languages in terms of machine translation and how orthographic information can be utilised to improve machine translation. We describe previous work in this area, discussing what underlying assumptions were made, and showing how orthographic knowledge improves the performance of machine translation of under-resourced languages. We discuss different types of machine translation and demonstrate a recent trend that seeks to link orthographic information with well-established machine translation methods. Considerable attention is given to current efforts of cognates information at different levels of machine translation and the lessons that can
Decentralization in Wikipedia Governance
Andrea Forte (GVU Center, College of Computing, Georgia Institute of Technology), Vanessa Larco (Microsoft) and Amy Bruckman (GVU Center, College of Computing, Georgia Institute of Technology). {aforte, asb}@cc.gatech.edu
This is a preprint version of the journal article: Forte, Andrea, Vanessa Larco and Amy Bruckman. (2009) Decentralization in Wikipedia Governance. Journal of Management Information Systems. 26(1) pp 49-72. Publisher: M.E. Sharpe www.mesharpe.com/journals.asp
Abstract: How does “self-governance” happen in Wikipedia? Through in-depth interviews with twenty individuals who have held a variety of responsibilities in the English-language Wikipedia, we obtained rich descriptions of how various forces produce and regulate social structures on the site. Our analysis describes Wikipedia as an organization with highly refined policies, norms, and a technological architecture that supports organizational ideals of consensus building and discussion. We describe how governance on the site is becoming increasingly decentralized as the community grows and how this is predicted by theories of commons-based governance developed in offline contexts. We also briefly examine local governance structures called WikiProjects through the example of WikiProject Military History, one of the oldest and most prolific projects on the site.
1. The Mechanisms of Self-Organization: Should a picture of a big, hairy tarantula appear in an encyclopedia article about arachnophobia? Does it illustrate the point, or just frighten potential readers? Reasonable people might disagree on this question. In a freely editable site like Wikipedia, anyone can add the photo, and someone else can remove it. And someone can add it back, and the process continues.
Assignment of Master's Thesis
ASSIGNMENT OF MASTER’S THESIS
Title: Git-based Wiki System
Student: Bc. Jaroslav Šmolík
Supervisor: Ing. Jakub Jirůtka
Study Programme: Informatics
Study Branch: Web and Software Engineering
Department: Department of Software Engineering
Validity: Until the end of summer semester 2018/19
Instructions: The goal of this thesis is to create a wiki system suitable for community (software) projects, focused on technically oriented users. The system must meet the following requirements:
• All data is stored in a Git repository.
• System provides access control.
• System supports AsciiDoc and Markdown, it is extensible for other markup languages.
• Full-featured user access via Git and CLI is provided.
• System includes a web interface for wiki browsing and management. Its editor works with raw markup and offers syntax highlighting, live preview and interactive UI for selected elements (e.g. image insertion).
Proceed in the following manner:
1. Compare and analyse the most popular F/OSS wiki systems with regard to the given criteria.
2. Design the system, perform usability testing.
3. Implement the system in JavaScript. Source code must be reasonably documented and covered with automatic tests.
4. Create a user manual and deployment instructions.
References: Will be provided by the supervisor.
Ing. Michal Valenta, Ph.D., Head of Department. doc. RNDr. Ing. Marcel Jiřina, Ph.D., Dean. Prague, January 3, 2018.
Czech Technical University in Prague, Faculty of Information Technology, Department of Software Engineering. Master’s thesis: Git-based Wiki System. Bc. Jaroslav Šmolík. Supervisor: Ing. Jakub Jirůtka. 10th May 2018.
Acknowledgements: I would like to thank my supervisor Ing. Jakub Jirůtka for his everlasting interest in the thesis, his punctual constructive feedback and for guiding me, when I found myself in the need for the words of wisdom and experience.
Modeling Popularity and Reliability of Sources in Multilingual Wikipedia
Article in the journal Information. Włodzimierz Lewoniewski, Krzysztof Węcel and Witold Abramowicz. Department of Information Systems, Poznań University of Economics and Business, 61-875 Poznań, Poland. Received: 31 March 2020; Accepted: 7 May 2020; Published: 13 May 2020.
Abstract: One of the most important factors impacting quality of content in Wikipedia is presence of reliable sources. By following references, readers can verify facts or find more details about described topic. A Wikipedia article can be edited independently in any of over 300 languages, even by anonymous users, therefore information about the same topic may be inconsistent. This also applies to use of references in different language versions of a particular article, so the same statement can have different sources. In this paper we analyzed over 40 million articles from the 55 most developed language versions of Wikipedia to extract information about over 200 million references and find the most popular and reliable sources. We presented 10 models for the assessment of the popularity and reliability of the sources based on analysis of meta information about the references in Wikipedia articles, page views and authors of the articles. Using DBpedia and Wikidata we automatically identified the alignment of the sources to a specific domain. Additionally, we analyzed the changes of popularity and reliability in time and identified growth leaders in each of the considered months. The results can be used for quality improvements of the content in different languages versions of Wikipedia.
Strengthening and Unifying the Visual Identity of Wikimedia Projects: a Step Towards Maturity
Guillaume Paumier ([[m:User:guillom]]) and Elisabeth Bauer ([[m:User:Elian]]).
Abstract: In January 2007, the Wikimedian community celebrated the sixth birthday of Wikipedia. Six years of constant evolution have now led to Wikipedia being one of the most visited websites in the world. Other projects developing free content and supported by the Wikimedia Foundation have been expanding rapidly too. The Foundation and its projects are now facing some communication issues due to the difference of scale between the human and financial resources of the Foundation and the success of its projects. In this paper, we identify critical issues in terms of visual identity and marketing. We evaluate the situation and propose several changes, including a redesign of the default website interface.
Introduction: The first Wikipedia project was created in January 2001. In those days, the technical infrastructure was provided by Bomis, a dot-com company. In June 2003, Jimmy Wales, founder of Wikipedia and owner of Bomis, created the Wikimedia Foundation [1] to provide a long-term administrative and technical structure dedicated to free content. Since then, both the projects and the Foundation have been evolving. New projects have been created. All have grown at different rates. Some have gained more fame than others. New financial, technical and communication challenges have arisen. In this paper, we will first identify some of these challenges and issues in terms of global visual identity. We will then analyse logos, website layouts, project names and trademarks so as to provide some hindsight.
D2.8 GLAM-Wiki Collaboration Progress Report 2
Project Acronym: Europeana Sounds
Grant Agreement no: 620591
Project Title: Europeana Sounds
D2.8 GLAM-Wiki Collaboration Progress Report 2
Revision: Final
Date: 30/11/2016
Authors: Brigitte Jansen and Harry van Biessum, NISV
Abstract: Within the Europeana Sounds GLAM-wiki collaboration task, nine edit-a-thons were organised by seven project partners. These edit-a-thons were held in Italy, Denmark, Latvia, England, Greece, France and the Netherlands. This report documents each event, the outcomes and lessons learned during this task.
Dissemination level: Public (X); Confidential, only for the members of the Consortium and Commission Services
Coordinated by the British Library, the Europeana Sounds project is co-funded by the European Union, through the ICT Policy Support Programme as part of the Competitiveness and Innovation Framework Programme (CIP) http://ec.europa.eu/information_society/activities/ict_psp/
Europeana Sounds EC-GA 620591 EuropeanaSounds-D2.8-GLAM-wiki-collaboration-progress-report-2-v1.0.docx 30/11/2016 PUBLIC
Revision history
Version | Status | Name, organisation | Date | Changes
0.1 | ToC | Brigitte Jansen & Harry van Biessum, NISV | 14/10/2016 |
0.2 | Draft | Brigitte Jansen & Harry van Biessum, NISV | 04/11/2016 | First draft
0.3 | Draft | Zane Grosa, NLL | 09/10/2016 | Chapter 3.5
0.4 | Draft | Laura Miles, BL | 15/11/2016 | Chapters 3.4, 3.8, 5.1, 7
0.5 | Draft | Karen Williams, State and University Library Denmark | 17/11/2016 | Chapters 3.9, 7
0.6 | Draft | Marianna Anastasiou, FMS | 17/11/2016 | Chapter 3.6
0.7 | Draft | Brigitte Jansen, Maarten Brinkerink & Harry van Biessum, NISV | 18/11/2016 | Incorporating feedback by reviewer and Europeana Sounds partner
0.8 | Draft | David Haskiya, EF | 28/11/2016 | Added Chapter 3.2.2
0.9 | Final draft | Maarten Brinkerink & Harry van Biessum, NISV | 28/11/2016 | Finalise all chapters
1.0 | Final | Laura Miles & Richard Ranft, BL | 30/11/2016 | Layout, minor changes
Review and approval
Action | Name, organisation | Date
Reviewed by | Sindy Meijer, Wikimedia Chapter Netherland | 16/11/2016
Reviewed by | Liam Wyatt, EF | 24/11/2016
Approved by | Coordinator and PMB | 30/11/2016
Distribution No.
The Culture of Wikipedia
Good Faith Collaboration: The Culture of Wikipedia. Joseph Michael Reagle Jr. Foreword by Lawrence Lessig. The MIT Press, Cambridge, MA. Web edition, Copyright © 2011 by Joseph Michael Reagle Jr. CC-NC-SA 3.0.
Wikipedia's style of collaborative production has been lauded, lambasted, and satirized. Despite unease over its implications for the character (and quality) of knowledge, Wikipedia has brought us closer than ever to a realization of the centuries-old pursuit of a universal encyclopedia. Good Faith Collaboration: The Culture of Wikipedia is a rich ethnographic portrayal of Wikipedia's historical roots, collaborative culture, and much debated legacy.
Contents: Author Bio & Research Blog; Foreword; Preface to the Web Edition; Preface; Extended Table of Contents; 1. Nazis and Norms; 2. The Pursuit of the Universal Encyclopedia; 3. Good Faith Collaboration; 4. The Puzzle of Openness
Praise for Good Faith Collaboration
"Reagle offers a compelling case that Wikipedia's most fascinating and unprecedented aspect isn't the encyclopedia itself; rather, it's the collaborative culture that underpins it: brawling, self-reflexive, funny, serious, and full-tilt committed to the project, even if it means setting aside personal differences. Reagle's position as a scholar and a member of the community makes him uniquely situated to describe this culture." (Cory Doctorow, Boing Boing)
"Reagle provides ample data regarding the everyday practices and cultural norms of the community which collaborates to produce Wikipedia. His rich research and nuanced appreciation of the complexities of cultural digital media research are well presented.
Name of the Tool Citizendium Home Page Logo URL En.Citizendium
Name of the Tool: Citizendium
Home Page Logo: [image]
URL: en.citizendium.org
Subject: Encyclopedias
Accessibility: Free
Language: English
Publisher: CZ: Media Assets Workgroup
Brief History: It is an English-language wiki-based free encyclopedia project launched by Larry Sanger, who had previously co-founded Wikipedia in 2001. It launched as a pilot project on 23 October 2006 and publicly on 25 March 2007.
Scope and Coverage: It serves users all over the world and provides access to, and modification of, information organized under “parent topics” or main topics: Natural Sciences, Social Sciences, Humanities, Arts, Applied arts and sciences, and Recreation. Under each main topic there is a hyperlinked list of subtopics and other related topics.
For example, under the main topic “Natural Sciences”:
Subtopics: Physics, Chemistry, Biology, Astronomy, Earth science, Mathematics
Other related topics: Natural philosophy, Natural history, Applied science, Health science, Geology
Under the main topic “Humanities”:
Subtopics: Classics, History, Literature, Philosophy, Religion, Theology
Other related topics: Art, Applied arts, Education, Law, Music, Science, Social science, Scholarship, Society, Theatre
The encyclopedia included a total of 16,891 articles when last accessed.
Kind of Information: Every article under the subtopics and other related topics is provided with “Talk”, “Related Articles”, “Biography”, “External Links”, “Citable Version” and “Video” sections related to that article. A brief description, the history of the topic, etc. are present in the articles. Coloured images, charts, graphs, etc. are available where applicable. Notes and references are also found after the articles.
An Analysis of Contributions to Wikipedia from Tor
Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor
Chau Tran (Department of Computer Science & Engineering, New York University, New York, USA); Kaylea Champion (Department of Communication, University of Washington, Seattle, USA); Andrea Forte (College of Computing & Informatics, Drexel University, Philadelphia, USA); Benjamin Mako Hill (Department of Communication, University of Washington, Seattle, USA); Rachel Greenstadt (Department of Computer Science & Engineering, New York University, New York, USA)
Abstract: User-generated content sites routinely block contributions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Our analysis suggests that although Tor users who slip through Wikipedia’s ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users.
Fig. 1. Screenshot of the page a user is shown when they attempt to edit the Wikipedia article on “Privacy” while using Tor.
SI 410 Ethics and Information Technology
Author(s): Paul Conway, PhD, 2010
License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/
We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The citation key on the following slide provides information about how you may share and adapt this material. Copyright holders of content included in this material should contact [email protected] with any questions, corrections, or clarification regarding the use of content. For more information about how to cite these materials visit http://open.umich.edu/privacy-and-terms-use.
Any medical information in this material is intended to inform and educate and is not a tool for self-diagnosis or a replacement for medical evaluation, advice, diagnosis or treatment by a healthcare professional. Please speak to your physician if you have questions about your medical condition. Viewer discretion is advised: Some medical content is graphic and may not be suitable for all viewers.
Citation Key (for more information see: http://open.umich.edu/wiki/CitationPolicy)
Use + Share + Adapt { Content the copyright holder, author, or law permits you to use, share and adapt. }
Public Domain – Government: Works that are produced by the U.S. Government. (17 USC § 105)
Public Domain – Expired: Works that are no longer protected due to an expired copyright term.
Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.
Qurrat Ann Kadwani: Still Calling Her Q!
InfiniteBody: art and creative consciousness, by Eva Yaa Asantewaa. Tuesday, May 6, 2014.
Qurrat Ann Kadwani: Still calling her Q!
[Photo: Qurrat Ann Kadwani (photo Bolti Studios)]
Talented and personable Qurrat Ann Kadwani (whose solo show, They Call Me Q!, I wrote about here) is back and, I hope, every bit as "wicked smart and genuinely funny" as I observed back in September. Now she's bringing the show to the Off Broadway St. Luke's Theatre, May 19-June 4, Mondays at 7pm and Wednesdays at 8pm.
THEY CALL ME Q is the story of an Indian girl growing up in the Boogie Down Bronx who gracefully seeks balance between the cultural pressures brought forth by her traditional parents and wanting acceptance into her new culture. Along the journey, Qurrat Ann Kadwani transforms into 13 characters that have shaped her life including her parents, Caucasian teachers, Puerto Rican classmates, and African-American friends. Laden with heart and abundant humor, THEY CALL ME Q speaks to the universal search for identity experienced by immigrants of all nationalities.
Program, schedule and ticket information St.