Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Modeling Popularity and Reliability of Sources in Multilingual Wikipedia
information Article Modeling Popularity and Reliability of Sources in Multilingual Wikipedia Włodzimierz Lewoniewski * , Krzysztof W˛ecel and Witold Abramowicz Department of Information Systems, Pozna´nUniversity of Economics and Business, 61-875 Pozna´n,Poland; [email protected] (K.W.); [email protected] (W.A.) * Correspondence: [email protected] Received: 31 March 2020; Accepted: 7 May 2020; Published: 13 May 2020 Abstract: One of the most important factors impacting quality of content in Wikipedia is presence of reliable sources. By following references, readers can verify facts or find more details about described topic. A Wikipedia article can be edited independently in any of over 300 languages, even by anonymous users, therefore information about the same topic may be inconsistent. This also applies to use of references in different language versions of a particular article, so the same statement can have different sources. In this paper we analyzed over 40 million articles from the 55 most developed language versions of Wikipedia to extract information about over 200 million references and find the most popular and reliable sources. We presented 10 models for the assessment of the popularity and reliability of the sources based on analysis of meta information about the references in Wikipedia articles, page views and authors of the articles. Using DBpedia and Wikidata we automatically identified the alignment of the sources to a specific domain. Additionally, we analyzed the changes of popularity and reliability in time and identified growth leaders in each of the considered months. The results can be used for quality improvements of the content in different languages versions of Wikipedia. -
Cultural Anthropology and the Infrastructure of Publishing
CULTURAL ANTHROPOLOGY AND THE INFRASTRUCTURE OF PUBLISHING TIMOTHY W. ELFENBEIN Society for Cultural Anthropology This essay is based on an interview conducted by Matt Thompson, portions of which originally appeared on the blog Savage Minds the week after Cultural Anthropology relaunched as an open-access journal.1 Herein, I elaborate on my original comments to provide a nuts-and-bolts account of what has gone into the journal’s transition. I am particularly concerned to lay bare the publishing side of this endeavor—of what it means for a scholarly society to take on publishing responsibilities—since the challenges of pub- lishing, and the labor and infrastructure involved, are invisible to most authors and readers. Matt Thompson: When did the Society for Cultural Anthropology (SCA) de- cide to go the open-access route and what was motivating them? Tim Elfenbein: Cultural Anthropology was one of the first American Anthropo- logical Association (AAA) journals to start its own website. Kim and Mike Fortun were responsible for the initial site. They wanted to know what extra materials they could put on a website that would supplement the journal’s articles. That experience helped germinate the idea that there was more that the journal could do itself. The Fortuns are also heavily involved in science and technology studies 2 (STS), where discussions about open access have been occurring for a long time. When Anne Allison and Charlie Piot took over as editors of the journal in 2011, they continued to push for an open-access alternative in our publishing program. Although the SCA and others had been making the argument for transitioning at CULTURAL ANTHROPOLOGY, Vol. -
The Future of the Internet and How to Stop It the Harvard Community Has
The Future of the Internet and How to Stop It The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters. Jonathan L. Zittrain, The Future of the Internet -- And How to Citation Stop It (Yale University Press & Penguin UK 2008). Published Version http://futureoftheinternet.org/ Accessed July 1, 2016 4:22:42 AM EDT Citable Link http://nrs.harvard.edu/urn-3:HUL.InstRepos:4455262 This article was downloaded from Harvard University's DASH Terms of Use repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms- of-use#LAA (Article begins on next page) YD8852.i-x 1/20/09 1:59 PM Page i The Future of the Internet— And How to Stop It YD8852.i-x 1/20/09 1:59 PM Page ii YD8852.i-x 1/20/09 1:59 PM Page iii The Future of the Internet And How to Stop It Jonathan Zittrain With a New Foreword by Lawrence Lessig and a New Preface by the Author Yale University Press New Haven & London YD8852.i-x 1/20/09 1:59 PM Page iv A Caravan book. For more information, visit www.caravanbooks.org. The cover was designed by Ivo van der Ent, based on his winning entry of an open competition at www.worth1000.com. Copyright © 2008 by Jonathan Zittrain. All rights reserved. Preface to the Paperback Edition copyright © Jonathan Zittrain 2008. Subject to the exception immediately following, this book may not be reproduced, in whole or in part, including illustrations, in any form (beyond that copying permitted by Sections 107 and 108 of the U.S. -
Web Archiving and You Web Archiving and Us
Web Archiving and You Web Archiving and Us Amy Wickner University of Maryland Libraries Code4Lib 2018 Slides & Resources: https://osf.io/ex6ny/ Hello, thank you for this opportunity to talk about web archives and archiving. This talk is about what stakes the code4lib community might have in documenting particular experiences of the live web. In addition to slides, I’m leading with a list of material, tools, and trainings I read and relied on in putting this talk together. Despite the limited scope of the talk, I hope you’ll each find something of personal relevance to pursue further. “ the process of collecting portions of the World Wide Web, preserving the collections in an archival format, and then serving the archives for access and use International Internet Preservation Coalition To begin, here’s how the International Internet Preservation Consortium or IIPC defines web archiving. Let’s break this down a little. “Collecting portions” means not collecting everything: there’s generally a process of selection. “Archival format” implies that long-term preservation and stewardship are the goals of collecting material from the web. And “serving the archives for access and use” implies a stewarding entity conceptually separate from the bodies of creators and users of archives. It also implies that there is no web archiving without access and use. As we go along, we’ll see examples that both reinforce and trouble these assumptions. A point of clarity about wording: when I say for example “critique,” “question,” or “trouble” as a verb, I mean inquiry rather than judgement or condemnation. we are collectors So, preambles mostly over. -
Web Archiving for Academic Institutions
University of San Diego Digital USD Digital Initiatives Symposium Apr 23rd, 1:00 PM - 4:00 PM Web Archiving for Academic Institutions Lori Donovan Internet Archive Mary Haberle Internet Archive Follow this and additional works at: https://digital.sandiego.edu/symposium Donovan, Lori and Haberle, Mary, "Web Archiving for Academic Institutions" (2018). Digital Initiatives Symposium. 4. https://digital.sandiego.edu/symposium/2018/2018/4 This Workshop is brought to you for free and open access by Digital USD. It has been accepted for inclusion in Digital Initiatives Symposium by an authorized administrator of Digital USD. For more information, please contact [email protected]. Web Archiving for Academic Institutions Presenter 1 Title Senior Program Manager, Archive-It Presenter 2 Title Web Archivist Session Type Workshop Abstract With the advent of the internet, content that institutional archivists once preserved in physical formats is now web-based, and new avenues for information sharing, interaction and record-keeping are fundamentally changing how the history of the 21st century will be studied. Due to the transient nature of web content, much of this information is at risk. This half-day workshop will cover the basics of web archiving, help attendees identify content of interest to them and their communities, and give them an opportunity to interact with tools that assist with the capture and preservation of web content. Attendees will gain hands-on web archiving skills, insights into selection and collecting policies for web archives and how to apply what they've learned in the workshop to their own organizations. Location KIPJ Room B Comments Lori Donovan works with partners and the Internet Archive’s web archivists and engineering team to develop the Archive-It service so that it meets the needs of memory institutions. -
Web Archiving in the UK: Current
CITY UNIVERSITY LONDON Web Archiving in the UK: Current Developments and Reflections for the Future Aquiles Ratti Alencar Brayner January 2011 Submitted in partial fulfillment of the requirements for the degree of MSc in Library Science Supervisor: Lyn Robinson 1 Abstract This work presents a brief overview on the history of Web archiving projects in some English speaking countries, paying particular attention to the development and main problems faced by the UK Web Archive Consortium (UKWAC) and UK Web Archive partnership in Britain. It highlights, particularly, the changeable nature of Web pages through constant content removal and/or alteration and the evolving technological innovations brought recently by Web 2.0 applications, discussing how these factors have an impact on Web archiving projects. It also examines different collecting approaches, harvesting software limitations and how the current copyright and deposit regulations in the UK covering digital contents are failing to support Web archive projects in the country. From the perspective of users’ access, this dissertation offers an analysis of UK Web archive interfaces identifying their main drawbacks and suggesting how these could be further improved in order to better respond to users’ information needs and access to archived Web content. 2 Table of Contents Abstract 2 Acknowledgements 5 Introduction 6 Part I: Current situation of Web archives 9 1. Development in Web archiving 10 2. Web archiving: approaches and models 14 3. Harvesting and preservation of Web content 19 3.1 The UK Web space 19 3.2 The changeable nature of Web pages 20 3.3 The evolution of Web 2.0 applications 23 4. -
Wikimedia Diversity Conference 2013!
Concept VisualEditor Edit-a-Thons Actions Communication WikiProject EditorsOpen Source MediaWiki LGBT Wikidata Open Culture Innovation Behaviour Teahouse Content Women Edit Learning Resources Collaboration Participation Guidelines Encouragement Education Broadening Community Gender Change Strategies ToolsetWikimedia Global South Diversity Conference Dialogue WikiQueer Indigenous Languages Research Volunteers Free Knowledge Reducing Barriers Technical Developments Challenges & Opportunities Outreach IdeaLab Initiatives Equality Feedback Quality Wikipedia November 9 -10, 2013 Berlin WIKIMEDIA DEUTSCHLAND Welcome to THE Wikimedia DiversitY CONfereNce 2013! To empower and engage people around experience what has worked and what the world to collect and develop educa- has not and how these projects can be tional content is the mission of the glob- taken further. Beyond that, we will work al Wikimedia movement. But we are on new ideas on reducing barriers that still in our infancy. Many languages and keep people from taking part in our cultures are not adequately represented projects. We will exchange ideas about in the different Wikimedia projects, many diversity and an inclusive culture in the people shy away from participating in Wikimedia projects during working ses- our projects and sharing their know- sions. Let us start to talk about projects ledge with the world. Particularly striking and let us create new ideas! is the low proportion of women. To gain the knowledge and commitment, for Wikimedia Deutschland is proud to be example of underrepresented groups, the host of the Wikimedia Diversity to increase the pool of knowledge in the Conference. We would like to thank the Wikimedia projects, and to further a Wikimedia Foundation, Wikimedia UK heterogeneous, colorful, diverse com- and Wikimedia Netherlands for their munity, is the goal of our work. -
Jonathan Zittrain's “The Future of the Internet: and How to Stop
The Future of the Internet and How to Stop It The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation Jonathan L. Zittrain, The Future of the Internet -- And How to Stop It (Yale University Press & Penguin UK 2008). Published Version http://futureoftheinternet.org/ Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:4455262 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#LAA YD8852.i-x 1/20/09 1:59 PM Page i The Future of the Internet— And How to Stop It YD8852.i-x 1/20/09 1:59 PM Page ii YD8852.i-x 1/20/09 1:59 PM Page iii The Future of the Internet And How to Stop It Jonathan Zittrain With a New Foreword by Lawrence Lessig and a New Preface by the Author Yale University Press New Haven & London YD8852.i-x 1/20/09 1:59 PM Page iv A Caravan book. For more information, visit www.caravanbooks.org. The cover was designed by Ivo van der Ent, based on his winning entry of an open competition at www.worth1000.com. Copyright © 2008 by Jonathan Zittrain. All rights reserved. Preface to the Paperback Edition copyright © Jonathan Zittrain 2008. Subject to the exception immediately following, this book may not be reproduced, in whole or in part, including illustrations, in any form (beyond that copying permitted by Sections 107 and 108 of the U.S. -
Archiving Web Content Jean-Christophe Peyssard
Archiving Web Content Jean-Christophe Peyssard To cite this version: Jean-Christophe Peyssard. Archiving Web Content. École thématique. Archiving Web Content, American University of Beirut, Beirut, Lebanon, Lebanon. 2019. cel-02130558 HAL Id: cel-02130558 https://halshs.archives-ouvertes.fr/cel-02130558 Submitted on 15 May 2019 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution| 4.0 International License American University of Beirut May 3rd 2019 ARCHIVING WEB CONTENT 2019 Jean-Christophe Peyssard Head of Digital Humanities Institut français du Proche-Orient (Ifpo) [email protected] Who Am I ? Research Ingineer from CNRS at the French Institute of the Near East, Head of Digital Humanities . Ifpo : http://www.ifporient.org/jean-christophe-peyssard/ . Linkedin : http://fr.linkedin.com/pub/jean-christophe-peyssard/22/705/782/ . ORCID : 0000-0002-8503-2217 . Google Scholar : https://scholar.google.com/citations?user=32cZHPsAAAAJ . HAL : https://cv.archives-ouvertes.fr/jcpeyssard Institut français -
Web Archiving & Digital Libraries
Proceedings o f th e 201 8 Web Archiving & Digital Libraries Workshop June 6, 2018 Fort Worth, Texas Martin Klein Edward A. Fox Zhiwu Xie Edito rs Web Archiving And Digital Libraries 2018 W ADL 2018 homepage Web Archiving and Digital Libraries Workshop at JCDL 2018 (http://2018.jcdl.org) Fort Worth, TX, USA Please see the approved WADL 2018 workshop proposal. Please also see last year's WADL 2017 homepage and the homepage of the 2016 edition of WADL. That workshop led in part to a special issue of International Journal on Digital Libraries. We fully intend to publish a very similar IJDL issue based on WADL 2018 contributions. SCHEDULE: Featured Talk by Weigle, Michele C. ([email protected]): Enabling Personal Use of Web Archives W ednesday, June 6, 10:30am-5pm Time Activity, Presenters/Authors Title of Presentation 10:30 Organizers and everyone speaking Opening, Introductions 10:45 Zhiwu Xie et al. IMLS project panel John Berlin, Michael Nelson and Michele Swimming In A Sea Of JavaScript, Or: How I Learned 11:45 Weigle To Stop Worrying And Love High-Fidelity Replay 12:15 Get boxes and return Lunch A Study of Historical Short URLs in Event Collections 1:00 Liuqing Li and Edward Fox of Tweets 1:30 Keynote by Michele Weigle Enabling Personal Use of Web Archives Libby Hemphill, Susan Leonard and Margaret 2:30 Developing a Social Media Archive at ICPSR Hedstrom Posters: Littman Justin; Sawood Alam, Mat Supporting social media research at scale; A Survey of 3:00 Kelly, Michele Weigle and Michael Nelson Archival Replay Banners 3:15 Discussions around posters Break Mohamed Aturban, Michael Nelson and 3:30 It is Hard to Compute Fixity on Archived Web Pages Michele Weigle Mat Kelly, Sawood Alam, Michael Nelson and Client-Assisted Memento Aggregation Using the Prefer 4:00 Michele Weigle Header 4:30 Closing discussion Plans for future activities and collaborations Description: The 2018 edition of the Workshop on Web Archiving and Digital Libraries (WADL) will explore the integration of Web archiving and digital libraries. -
Creating an Online Archive for LGBT History by Ross Burgess
Creating an online archive for LGBT history By Ross Burgess Presented at the third academic conference, ‘What is & How to Do LGBT History: Approaches, Methods & Subjects’ Manchester, February 2016 In July 2011 I heard about a new Wikipedia-style project to record Britain’s LGBT history, and immediately knew it was something I wanted to help with. I had been editing Wikipedia articles for about seven years, so I was very familiar with the wiki approach to building an online database; and I also had an interest in LGBT history: for instance in 2004 my partner and I had staged re-enactments of the mediaeval church’s same-sex union ceremonies – adelphopoiesis – as part of the campaign to support the introduction of civil partnerships.1 So when Jonathan Harbourne asked for volunteer editors for his new LGBT History wiki, I signed up at once. The project’s title includes ‘LGBT’ (lesbian, gay, bisexual, transgender) which is probably the most convenient and widely understood abbreviation for our community. But its scope could be better defined as ‘LGBT+’ – it definitely includes intersex people, those who classify themselves as queer, non-binary, pansexual, etc, etc. Jonathan’s project was a virtual ‘time capsule’, a resource for people and aca- demics in years to come. The image of the Figure 1. The original logo time capsule has been part of the project’s logo from the outset. He said at the time, In another ten or twenty years, I don’t want people to forget the struggle and the fight that won us our equalities and freedoms. -
Web-Archiving 01000100 01010000 Maureen Pennock 01000011 01000100 DPC Technology Watch Report 13-01 March 2013 01010000 01000011 01000100 01010000
01000100 01010000 01000011 Web-Archiving 01000100 01010000 Maureen Pennock 01000011 01000100 DPC Technology Watch Report 13-01 March 2013 01010000 01000011 01000100 01010000 Series editors on behalf of the DPC 01000011 Charles Beagrie Ltd. Principal Investigator for the Series 01000100 Neil Beagrie 01010000 01000011DPC Technology Watch Series © Digital Preservation Coalition 2013 and Maureen Pennock 2013 Published in association with Charles Beagrie Ltd. ISSN: 2048 7916 DOI: http://dx.doi.org/10.7207/twr13-01 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing from the publisher. The moral right of the author has been asserted. First published in Great Britain in 2013 by the Digital Preservation Coalition Foreword The Digital Preservation Coalition (DPC) is an advocate and catalyst for digital preservation, ensuring our members can deliver resilient long-term access to digital content and services. It is a not-for-profit membership organization whose primary objective is to raise awareness of the importance of the preservation of digital material and the attendant strategic, cultural and technological issues. It supports its members through knowledge exchange, capacity building, assurance, advocacy and partnership. The DPC’s vision is to make our digital memory accessible tomorrow. The DPC Technology Watch Reports identify, delineate, monitor and address topics that have a major bearing on ensuring our collected digital memory will be available tomorrow. They provide an advanced introduction in order to support those charged with ensuring a robust digital memory, and they are of general interest to a wide and international audience with interests in computing, information management, collections management and technology.