Language-Independent Methods for Identifying Cross-Lingual Similarity in Wikipedia

Total Page:16

File Type:pdf, Size:1020Kb

Language-Independent Methods for Identifying Cross-Lingual Similarity in Wikipedia Language-Independent Methods for Identifying Cross-Lingual Similarity in Wikipedia Monica Lestari Paramita Information School University of Sheffield A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy February 2019 Declaration I hereby declare that this thesis is a presentation of my original research work. The con- tents of this thesis have not been submitted in whole or in part for consideration for any other degree or qualification in this, or any other university. Wherever contributions of others are involved, every effort is made to indicate this clearly, with due reference to the literature, and acknowledgement of collaborative research and discussions. A number of the chapters in this thesis have been published in academic papers or journals; these publications are specifically indicated at the end of each relevant chapter. Monica Lestari Paramita February 2019 Abstract The diversity and richness of multilingual information available in Wikipedia have in- creased its significance as a language resource. The information extracted from Wikipedia has been utilised for many tasks, such as Statistical Machine Translation (SMT) and sup- porting multilingual information access. These tasks often rely on gathering data from articles that describe the same topic in different languages with the assumption that the contents are equivalent to each other. However, studies have shown that this might not be the case. Given the scale and use of Wikipedia, there is a need to develop an approach to measure cross-lingual similarity across Wikipedia. Many existing similarity measures, however, require the availability of ‘language-dependent’ resources, such as dictionaries or Machine Translation (MT) systems, to translate documents into the same language prior to comparison. This presents some challenges for some language pairs, particu- larly those involving ‘under-resourced’ languages where the required linguistic resources are not widely available. This study aims to present a solution to this problem by first, investigating cross-lingual similarity in Wikipedia, and secondly, developing ‘language- independent’ approaches to measure cross-lingual similarity in Wikipedia. Two main contributions were provided in this work to identify cross-lingual similarity in Wikipedia. The first key contribution of this work is the development of a Wikipedia similarity corpus to understand the similarity characteristics of Wikipedia articles and to evaluate and compare various approaches for measuring cross-lingual similarity. The author elicited manual judgments from people with the appropriate language skills to assess similarities between a set of 800 pairs of interlanguage-linked articles. This corpus vi contains Wikipedia articles for eight language pairs (all pairs involving English and in- cluding well-resourced and under-resourced languages) of varying degrees of similarity. The second contribution of this work is the development of language-independent approaches to measure cross-lingual similarity in Wikipedia. The author investigated the utility of a number of “lightweight” language-independent features in four different experiments. The first experiment investigated the use of Wikipedia links to identify and align similar sentences, prior to aggregating the scores of the aligned sentences to rep- resent the similarity of the document pair. The second experiment investigated the use- fulness of content similarity features (such as char-n-gram overlap, links overlap, word overlap and word length ratio). The third experiment focused on analysing the use of structure similarity features (such as the ratio of section length, and similarity between the section headings). And finally, the fourth experiment investigates a combination of these features in a classification and a regression approach. Most of these features are language-independent whilst others utilised freely avail- able resources (Wikipedia and Wiktionary) to assist in identifying overlapping informa- tion across languages. The approaches proposed are lightweight and can be applied to any languages written in Latin script; non-Latin script languages need to be transliter- ated prior to using these approaches. The performances of these approaches were eval- uated against the human judgments in the similarity corpus. Overall, the proposed language-independent approaches achieved promising results. The best performance is achieved with the combination of all features in a classification and a regression approach. The results show that the Random Forest classifier was able to classify 81.38% document pairs correctly (F1 score=0.79) in a binary classification prob- lem, 50.88% document pairs correctly (F1 score=0.71) in a 5-class classification problem, and RMSE of 0.73 in a regression approach. These results are significantly higher com- pared to a classifier utilising machine translation and cosine similarity of the tf-idf scores. These findings showed that language-independent approaches can be used to measure cross-lingual similarity between Wikipedia articles. Future work is needed to evaluate these approaches in more languages and to incorporate more features. Acknowledgements Firstly, I would like to thank my first supervisor, Professor Paul Clough, for his expertise and continuous support throughout my entire study. Thank you for always finding time to provide valuable feedback on my research and for encouraging me to be a better re- searcher. I would also like to thank my second supervisor, Professor Rob Gaizauskas, for providing insightful comments on my study. My PhD was partially funded by the ACCURAT project supported by the European Commission and I gratefully acknowledge this support. Thank you also to my colleagues in the ACCURAT project who have participated in the evaluation task. To my colleagues, Ahmet, Emina, Emma, Mengdie, Paula, Sophie, Alberto, Nikos, Si- mon, and all the members of the IR and NLP group in the University of Sheffield, and to my wonderful friends from the writing club, Akanimo and Anna, thank you very much for the motivation, encouragement, peer support and all the feedback you have given me. I would also like to thank Kay Guccione, Sarah Barnes (my career mentor) and, especially Hannah Roddie (my thesis mentor), whose advice and encouragement have built up my confidence to finish my thesis. Finally, my profound gratitude goes to my family and friends for their endless support and motivation during my years of study. To my wonderful husband, Paul, who has never stopped believing in me. To my son, Oliver, who always gives me a reason to laugh and be grateful every day, and to my baby bump, whose kicks have kept me accompanied (and awake) during the late night writing sessions. Finally, to my wonderful parents and parents in law who have loved, supported and encouraged me throughout this process. This thesis would not have been possible without their support. Table of contents Table of contents ix List of figures xvii List of tables xxi 1 Introduction1 1.1 Motivation . .1 1.2 Research questions and objectives . .5 1.2.1 Scope of the research . .6 1.3 Main contributions to the research field . .7 1.4 Thesis outline . .8 1.5 Structure of the work . 10 2 Related Work 13 2.1 Measuring similarity . 14 2.2 Defining the concept of similarity . 15 2.3 Identifying monolingual similarity . 23 2.3.1 Lexical similarity . 24 2.3.2 Semantic similarity . 28 2.4 Methods to measure cross-lingual similarity . 34 2.4.1 Language-dependent methods . 35 2.4.2 Language-independent approaches . 43 x Table of contents 2.5 Wikipedia as a linguistic resource . 52 2.5.1 Extraction of similar texts . 52 2.5.2 Assistance for other tasks . 54 2.6 Identifying similarity (or dissimilarity) in Wikipedia . 56 2.6.1 Evaluating similarity in interlanguage-linked articles . 57 2.6.2 Improving similarity in Wikipedia . 59 2.7 Summary........................................ 60 3 Methodology 61 3.1 Overview of methodology . 61 3.2 Initial study on investigating similarity in Wikipedia . 64 3.3 Gathering human judgments on similarity . 65 3.3.1 Selection of languages . 65 3.3.2 Selecting evaluation documents . 66 3.3.3 Assessors . 68 3.3.4 Evaluation tasks . 69 3.4 Development of approaches to measure similarity in Wikipedia . 71 3.4.1 Resources . 71 3.4.2 Language-dependent baseline . 72 3.5 Evaluation . 73 3.6 Wikipedia corpus . 74 3.6.1 Pre-processing of Wikipedia corpus . 74 3.6.2 Corpus statistics . 80 4 Identifying Similarity Features in Wikipedia 85 4.1 Background . 85 4.2 Similarity analysis method . 87 4.3 Dataset . 88 4.4 Initial findings . 89 4.4.1 Similarity between structures . 89 Table of contents xi 4.4.2 Similarity between paragraphs . 93 4.4.3 Similarity between sentences . 93 4.4.4 Relations between quality of articles, word length, and the similarity degree . 99 4.5 Conclusion . 100 5 Evaluation Corpus 103 5.1 Background . 103 5.2 Creating evaluation tasks . 104 5.2.1 Pilot evaluation task . 105 5.2.2 Main evaluation task . 108 5.3 Selecting evaluation documents . 109 5.4 Assessors . 112 5.5 Annotation results . 112 5.5.1 Q1 Results . 113 5.5.2 Q1-Reasons Results . 116 5.5.3 Q2 Results . 122 5.5.4 Q3 Results . 125 5.5.5 Q4 Results . 129 5.5.6 Comparison between evaluation questions . 132 5.6 Discussion . 135 5.6.1 Analysis of disagreements . 135 5.6.2 Relations between ‘similarity’ and ‘comparability’ . 136 5.6.3 Limitations . 137 5.7 Conclusion . 138 6 Anchor Text and Word Overlap Method 141 6.1 Background . 141 6.2 Method . 143 6.2.1 Pre-process documents . 145 xii Table of contents 6.2.2 Create a bilingual lexicon . 145 6.2.3 Anchor text translation . 146 6.2.4 Similarity calculation . 147 6.3 Experiments...................................... 151 6.3.1 Language selection . 151 6.3.2 Corpus . 151 6.3.3 Evaluation . 152 6.4 Results ......................................... 153 6.4.1 Correlation between the automatic methods .
Recommended publications
  • Latvian Sportspeople Representation in English and Latvian Wikipedias
    32 | Rudzinska: LATVIAN SPORTSPEOPLE REPRESENTATION ... ORIGINAL RESEARCH PAPER LATVIAN SPORTSPEOPLE REPRESENTATION IN ENGLISH AND LATVIAN WIKIPEDIAS Ieva Rudzinska Latvian Academy of Sport Education Address: 333 Brivibas Street, Riga, LV-1006, Latvia Phone: 37167543445, fax: +37167543480 E-mail: [email protected] Abstract The goal was to study Latvian sportspeople representation in English and Latvian Wikipedias in 2015. The analyses allowed identifying three main Latvian sportspeople related categories in English Wikipedia: “Latvian sportspeople”, “List of Latvian sportspeople” and “Latvian sports related lists”, a category “Latvijas sportisti” in Latvian Wikipedia. In “Latvian sportspeople” 1018 sportspeople were listed by family names, starting with Artis Ābols and ending with Ainārs Zvirgzdiņš, by sports – from Latvian alpine skiers to Latvian weightlifters. In “List of Latvian sportspeople” were included 99 most notable Latvian sportspeople, representing 24 sports. The largest athlete frequency per sport (14) was in 3 sports: athletics, basketball and luge. From 5 to 10 sportspeople were in 6 sports: rowing, bobsleigh, volleyball, ice hockey, judo and tennis, 15 sports were represented by 1 to 4 athletes. In Latvian Wikipedia in the category “Latvijas sportisti” were 1186 sportspeople from 38 sports. Statistical analysis allowed finding moderate Pearson correlations between the numbers of sportspeople in the category “Latvian sportspeople” and “List of Latvian sportspeople”, EN (0.60; Sig.<0.01); “List of Latvian sportspeople”,
    [Show full text]
  • Universality, Similarity, and Translation in the Wikipedia Inter-Language Link Network
    In Search of the Ur-Wikipedia: Universality, Similarity, and Translation in the Wikipedia Inter-language Link Network Morten Warncke-Wang1, Anuradha Uduwage1, Zhenhua Dong2, John Riedl1 1GroupLens Research Dept. of Computer Science and Engineering 2Dept. of Information Technical Science University of Minnesota Nankai University Minneapolis, Minnesota Tianjin, China {morten,uduwage,riedl}@cs.umn.edu [email protected] ABSTRACT 1. INTRODUCTION Wikipedia has become one of the primary encyclopaedic in- The world: seven seas separating seven continents, seven formation repositories on the World Wide Web. It started billion people in 193 nations. The world's knowledge: 283 in 2001 with a single edition in the English language and has Wikipedias totalling more than 20 million articles. Some since expanded to more than 20 million articles in 283 lan- of the content that is contained within these Wikipedias is guages. Criss-crossing between the Wikipedias is an inter- probably shared between them; for instance it is likely that language link network, connecting the articles of one edition they will all have an article about Wikipedia itself. This of Wikipedia to another. We describe characteristics of ar- leads us to ask whether there exists some ur-Wikipedia, a ticles covered by nearly all Wikipedias and those covered by set of universal knowledge that any human encyclopaedia only a single language edition, we use the network to under- will contain, regardless of language, culture, etc? With such stand how we can judge the similarity between Wikipedias a large number of Wikipedia editions, what can we learn based on concept coverage, and we investigate the flow of about the knowledge in the ur-Wikipedia? translation between a selection of the larger Wikipedias.
    [Show full text]
  • A Topic-Aligned Multilingual Corpus of Wikipedia Articles for Studying Information Asymmetry in Low Resource Languages
    Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pages 2373–2380 Marseille, 11–16 May 2020 c European Language Resources Association (ELRA), licensed under CC-BY-NC A Topic-Aligned Multilingual Corpus of Wikipedia Articles for Studying Information Asymmetry in Low Resource Languages Dwaipayan Roy, Sumit Bhatia, Prateek Jain GESIS - Cologne, IBM Research - Delhi, IIIT - Delhi [email protected], [email protected], [email protected] Abstract Wikipedia is the largest web-based open encyclopedia covering more than three hundred languages. However, different language editions of Wikipedia differ significantly in terms of their information coverage. We present a systematic comparison of information coverage in English Wikipedia (most exhaustive) and Wikipedias in eight other widely spoken languages (Arabic, German, Hindi, Korean, Portuguese, Russian, Spanish and Turkish). We analyze the content present in the respective Wikipedias in terms of the coverage of topics as well as the depth of coverage of topics included in these Wikipedias. Our analysis quantifies and provides useful insights about the information gap that exists between different language editions of Wikipedia and offers a roadmap for the Information Retrieval (IR) community to bridge this gap. Keywords: Wikipedia, Knowledge base, Information gap 1. Introduction other with respect to the coverage of topics as well as Wikipedia is the largest web-based encyclopedia covering the amount of information about overlapping topics.
    [Show full text]
  • D2.8 GLAM-Wiki Collaboration Progress Report 2
    Project Acronym: Europeana Sounds Grant Agreement no: 620591 Project Title: Europeana Sounds D2.8 GLAM-Wiki Collaboration Progress Report 2 Revision: Final Date: 30/11/2016 Authors: Brigitte Jansen and Harry van Biessum, NISV Abstract: Within the Europeana Sounds GLAM-wiki collaboration task, nine edit-a-thons were organised by seven project partners. These edit-a-thons were held in Italy, Denmark, Latvia, England, Greece, France and the Netherlands. This report documents each event, the outcomes and lessons learned during this task. Dissemination level Public X Confidential, only for the members of the Consortium and Commission Services Coordinated by the British Library, the Europeana Sounds project is co-funded by the European Union, through the ICT Policy Support Programme as part of the Competitiveness and Innovation Framework Programme (CIP) http://ec.europa.eu/information_society/activities/ict_psp/ Europeana Sounds EC-GA 620591 EuropeanaSounds-D2.8-GLAM-wiki-collaboration-progress-report-2-v1.0.docx 30/11/2016 PUBLIC Revision history Version Status Name, organisation Date Changes 0.1 ToC Brigitte Jansen & Harry 14/10/2016 van Biessum, NISV 0.2 Draft Brigitte Jansen & Harry 04/11/2016 First draft van Biessum, NISV 0.3 Draft Zane Grosa, NLL 09/10/2016 Chapter 3.5 0.4 Draft Laura Miles, BL 15/11/2016 Chapters 3.4, 3.8, 5.1, 7 0.5 Draft Karen Williams, State 17/11/2016 Chapters 3.9, 7 and University Library Denmark 0.6 Draft Marianna Anastasiou, 17/11/2016 Chapter 3.6 FMS 0.7 Draft Brigitte Jansen, Maarten 18/11/2016 Incorporating feedback by Brinkerink & Harry van reviewer and Europeana Biessum, NISV Sounds partner 0.8 Draft David Haskiya, EF 28/11/2016 Added Chapter 3.2.2 0.9 Final draft Maarten Brinkerink & 28/11/2016 Finalise all chapters Harry van Biessum, NISV 1.0 Final Laura Miles & Richard 30/11/2016 Layout, minor changes Ranft, BL Review and approval Action Name, organisation Date Sindy Meijer, Wikimedia Chapter Netherland 16/11/2016 Reviewed by Liam Wyatt, EF 24/11/2016 Approved by Coordinator and PMB 30/11/2016 Distribution No.
    [Show full text]
  • Modeling Communities in Information Systems: Informal Learning Communities in Social Media"
    "Modeling Communities in Information Systems: Informal Learning Communities in Social Media" Von der Fakultät für Mathematik, Informatik und Naturwissenschaften der RWTH Aachen University zur Erlangung des akademischen Grades einer Doktorin der Naturwissenschaften genehmigte Dissertation vorgelegt von Zinayida Kensche, geb. Petrushyna, MSc. aus Odessa, Ukraine Berichter: Universitätsprofessor Professor Dr. Matthias Jarke Universitätsprofessor Professor Dr. Marcus Specht Privatdozent Dr. Ralf Klamma Tag der mündlichen Prüfung: 17.11.2015 Diese Dissertation ist auf den Internetseiten der Universitätsbibliothek online verfügbar. i c Copyright by 2016 All Rights Reserved ii Abstract Information modeling is required for creating a successful information system while modeling of communities is pivotal for maintaining community information systems (CIS). Online social media, a special case of CIS, have been intensively used but not usually adopted for learning community needs. Thus community stakeholders meet problems by supporting learning communities in social media. Under the prism of Community of Practice theory, such communities have three dimensions that are responsible for community sustainability: mutual engagement, joint enterprises and shared repertoire. Existing modeling solutions use either perspectives of learning theories, or anal- ysis of learner or community data captured in social media but rarely combine both approaches. Therefore, current solutions produce community models that supply only a part of community stakeholders with information that can hardly describe community success and failure. We also claim that community models must be created based on community data analysis integrated with our learning community dimensions. More- over, the models need to be adapted according to environmental changes. This work provides a solution to continuous modeling of informal learning com- munities in social media.
    [Show full text]
  • Same Gap, Different Experiences
    SAME GAP, DIFFERENT EXPERIENCES An Exploration of the Similarities and Differences Between the Gender Gap in the Indian Wikipedia Editor Community and the Gap in the General Editor Community By Eva Jadine Lannon 997233970 An Undergraduate Thesis Presented to Professor Leslie Chan, PhD. & Professor Kenneth MacDonald, PhD. in the Partial Fulfillment of the Requirements for the Degree of Honours Bachelor of Arts in International Development Studies (Co-op) University of Toronto at Scarborough Ontario, Canada June 2014 ACKNOWLEDGEMENTS I would like to express my deepest gratitude to all of those people who provided support and guidance at various stages of my undergraduate research, and particularly to those individuals who took the time to talk me down off the ledge when I was certain I was going to quit. To my friends, and especially Jennifer Naidoo, who listened to my grievances at all hours of the day and night, no matter how spurious. To my partner, Karim Zidan El-Sayed, who edited every word (multiple times) with spectacular patience. And finally, to my research supervisors: Prof. Leslie Chan, you opened all the doors; Prof. Kenneth MacDonald, you laid the foundations. Without the support, patience, and understanding that both of you provided, it would have never been completed. Thank you. i EXECUTIVE SUMMARY According to the second official Wikipedia Editor Survey conducted in December of 2011, female- identified editors comprise only 8.5% of contributors to Wikipedia's contributor population (Glott & Ghosh, 2010). This significant lack of women and women's voices in the Wikipedia community has led to systemic bias towards male histories and culturally “masculine” knowledge (Lam et al., 2011; Gardner, 2011; Reagle & Rhue, 2011; Wikimedia Meta-Wiki, 2013), and an editing environment that is often hostile and unwelcoming to women editors (Gardner, 2011; Lam et al., 2011; Wikimedia Meta- Wiki, 2013).
    [Show full text]
  • Genre Analysis of Online Encyclopedias. the Case of Wikipedia
    Genre analysis online encycloped The case of Wikipedia AnnaTereszkiewicz Genre analysis of online encyclopedias The case of Wikipedia Wydawnictwo Uniwersytetu Jagiellońskiego Publikacja dofi nansowana przez Wydział Filologiczny Uniwersytetu Jagiellońskiego ze środków wydziałowej rezerwy badań własnych oraz Instytutu Filologii Angielskiej PROJEKT OKŁADKI Bartłomiej Drosdziok Zdjęcie na okładce: Łukasz Stawarski © Copyright by Anna Tereszkiewicz & Wydawnictwo Uniwersytetu Jagiellońskiego Wydanie I, Kraków 2010 All rights reserved Książka, ani żaden jej fragment nie może być przedrukowywana bez pisemnej zgody Wydawcy. W sprawie zezwoleń na przedruk należy zwracać się do Wydawnictwa Uniwersytetu Jagiellońskiego. ISBN 978-83-233-2813-1 www.wuj.pl Wydawnictwo Uniwersytetu Jagiellońskiego Redakcja: ul. Michałowskiego 9/2, 31-126 Kraków tel. 12-631-18-81, 12-631-18-82, fax 12-631-18-83 Dystrybucja: tel. 12-631-01-97, tel./fax 12-631-01-98 tel. kom. 0506-006-674, e-mail: [email protected] Konto: PEKAO SA, nr 80 1240 4722 1111 0000 4856 3325 Table of Contents Acknowledgements ........................................................................................................................ 9 Introduction .................................................................................................................................... 11 Materials and Methods .................................................................................................................. 14 1. Genology as a study ..................................................................................................................
    [Show full text]
  • Critical Point of View: a Wikipedia Reader
    w ikipedia pedai p edia p Wiki CRITICAL POINT OF VIEW A Wikipedia Reader 2 CRITICAL POINT OF VIEW A Wikipedia Reader CRITICAL POINT OF VIEW 3 Critical Point of View: A Wikipedia Reader Editors: Geert Lovink and Nathaniel Tkacz Editorial Assistance: Ivy Roberts, Morgan Currie Copy-Editing: Cielo Lutino CRITICAL Design: Katja van Stiphout Cover Image: Ayumi Higuchi POINT OF VIEW Printer: Ten Klei Groep, Amsterdam Publisher: Institute of Network Cultures, Amsterdam 2011 A Wikipedia ISBN: 978-90-78146-13-1 Reader EDITED BY Contact GEERT LOVINK AND Institute of Network Cultures NATHANIEL TKACZ phone: +3120 5951866 INC READER #7 fax: +3120 5951840 email: [email protected] web: http://www.networkcultures.org Order a copy of this book by sending an email to: [email protected] A pdf of this publication can be downloaded freely at: http://www.networkcultures.org/publications Join the Critical Point of View mailing list at: http://www.listcultures.org Supported by: The School for Communication and Design at the Amsterdam University of Applied Sciences (Hogeschool van Amsterdam DMCI), the Centre for Internet and Society (CIS) in Bangalore and the Kusuma Trust. Thanks to Johanna Niesyto (University of Siegen), Nishant Shah and Sunil Abraham (CIS Bangalore) Sabine Niederer and Margreet Riphagen (INC Amsterdam) for their valuable input and editorial support. Thanks to Foundation Democracy and Media, Mondriaan Foundation and the Public Library Amsterdam (Openbare Bibliotheek Amsterdam) for supporting the CPOV events in Bangalore, Amsterdam and Leipzig. (http://networkcultures.org/wpmu/cpov/) Special thanks to all the authors for their contributions and to Cielo Lutino, Morgan Currie and Ivy Roberts for their careful copy-editing.
    [Show full text]
  • The Case of 13 Wikipedia Instances
    Interaction Design and Architecture(s) Journal - IxD&A, N.22, 2014, pp. 34-47 The Impact of Culture On Smart Community Technology: The Case of 13 Wikipedia Instances Zinayida Petrushyna1, Ralf Klamma1, Matthias Jarke1,2 1 Advanced Community Information Systems Group, Information Systems and Databases Chair, RWTH Aachen University, Ahornstrasse 55, 52056 Aachen, Germany 2 Fraunhofer Institute for Applied Information Technology FIT, 53754 St. Augustin, Germany {petrushyna, klamma}@dbis.rwth-aachen.de [email protected] Abstract Smart communities provide technologies for monitoring social behaviors inside communities. The technologies that support knowledge building should consider the cultural background of community members. The studies of the influence of the culture on knowledge building is limited. Just a few works consider digital traces of individuals that they explain using cultural values and beliefs. In this work, we analyze 13 Wikipedia instances where users with different cultural background build knowledge in different ways. We compare edits of users. Using social network analysis we build and analyze co- authorship networks and watch the networks evolution. We explain the differences we have found using Hofstede dimensions and Schwartz cultural values and discuss implications for the design of smart community technologies. Our findings provide insights in requirements for technologies used for smart communities in different cultures. Keywords: Social network analysis, Wikipedia communities, Hofstede dimensions, Schwartz cultural values 1 Introduction People prefer to leave in smart cities where their needs are satisfied [1]. The development of smart cities depends on the collaboration of individuals. The investigation of the flow [1] of knowledge created by the individuals allows the monitoring of city smartness.
    [Show full text]
  • A Wikipedia Reader
    UvA-DARE (Digital Academic Repository) Critical point of view: a Wikipedia reader Lovink, G.; Tkacz, N. Publication date 2011 Document Version Final published version Link to publication Citation for published version (APA): Lovink, G., & Tkacz, N. (2011). Critical point of view: a Wikipedia reader. (INC reader; No. 7). Institute of Network Cultures. http://www.networkcultures.org/_uploads/%237reader_Wikipedia.pdf General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons). Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible. UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl) Download date:05 Oct 2021 w ikipedia pedai p edia p Wiki CRITICAL POINT OF VIEW A Wikipedia Reader 2 CRITICAL POINT OF VIEW A Wikipedia Reader CRITICAL POINT OF VIEW 3 Critical Point of View: A Wikipedia Reader Editors: Geert Lovink
    [Show full text]
  • 34 Understanding Wikipedia Practices Through Hindi, Urdu, and English
    Understanding Wikipedia practices through Hindi, Urdu, and English takes on an evolving regional conflict MOLLY G. HICKMAN, Computer Science, Virginia Tech 34 VIRAL PASAD, Computer Science, Virginia Tech HARSH SANGHAVI, Industrial and Systems Engineering, Virginia Tech JACOB THEBAULT-SPIEKER, The Information School, University of Wisconsin - Madison SANG WON LEE, Computer Science, Virginia Tech Wikipedia is the product of thousands of editors working collaboratively to provide free and up-to-date encyclopedic information to the project’s users. This article asks to what degree Wikipedia articles in three languages — Hindi, Urdu, and English — achieve Wikipedia’s mission of making neutrally-presented, reliable information on a polarizing, controversial topic available to people around the globe. We chose the topic of the recent revocation of Article 370 of the Constitution of India, which, along with other recent events in and concerning the region of Jammu and Kashmir, has drawn attention to related articles on Wikipedia. This work focuses on the English Wikipedia, being the preeminent language edition of the project, as well as the Hindi and Urdu editions. Hindi and Urdu are the two standardized varieties of Hindustani, a lingua franca of Jammu and Kashmir. We analyzed page view and revision data for three Wikipedia articles to gauge popularity of the pages in our corpus, and responsiveness of editors to breaking news events and problematic edits. Additionally, we interviewed editors from all three language editions to learn about differences in editing processes and motivations, and we compared the text of the articles across languages as they appeared shortly after the revocation of Article 370.
    [Show full text]
  • LASE JOURNAL of SPORT SCIENCE Is a Scientific Journal Published Two Times Per Year in Sport Science LASE Journal for Sport Scientists and Sport Experts/Specialists
    LASE JOURNAL OF SPORT SCIENCE is a Scientific Journal published two times per year in Sport Science LASE Journal for sport scientists and sport experts/specialists Published and financially supported by the Latvian Academy of Sport Education in Riga, Latvia p-ISSN: 1691-7669 Editorial Contact Information, e-ISSN: 1691-9912 Publisher Contact Information: ISO 3297 Inta Bula-Biteniece Latvian Academy of Sport Education Language: English Address: 333 Brivibas Street Indexed in IndexCopernicus Evaluation Riga, LV1006, Latvia Ministry of Science and Higher Phone.: +371 67543410 Education, Poland Fax: +371 67543480 De Gruyter Open E-mail: [email protected] DOI (Digital Object Identifiers) Printed in 100 copies The annual subscription (2 issues) is 35 EUR (20 EUR for one issue). Executive Editor: LASE Journal of Sport Inta Bula – Biteniece Science Exemplary order form of Ilze Spīķe subscription is accessible Language Editor: in our website: www.lspa.lv/research Ieva Rudzinska Please send the order to: Printed and bound: “Printspot” Ltd. LASE Journal of Sport Science Cover projects: Uve Švāģers - Griezis Latvijas Sporta pedagoģijas akadēmija Address: 14-36 Salnas Street Address; 333 Brivibas Street Riga, LV1021, Latvia Riga, LV1006, Latvia Phone: +371 26365500 Phone: +371 67543410 e-mail: [email protected] Fax: +371 67543480 website: www.printspot.lv E-mail: [email protected] Method of payment: Please send payments to the account of Latvijas Sporta pedagoģijas akadēmija Nr. 90000055243 Account number: LV97TREL9150123000000 Bank: State Treasury BIC: TRELLV22 Postscript: subscription LASE Journal of Sport Science You are free to: Share — copy and redistribute the material in any medium or format. The licensor cannot revoke these freedoms as long as you follow the license terms.
    [Show full text]