A General Architecture to Enhance Wiki Systems with Natural Language Processing Techniques


Bahar Sateli

A thesis in the Department of Computer Science and Software Engineering, presented in partial fulfillment of the requirements for the degree of Master of Applied Science in Software Engineering

Concordia University, Montréal, Québec, Canada
April 2012
© Bahar Sateli, 2012

Concordia University, School of Graduate Studies

This is to certify that the thesis prepared
By: Bahar Sateli
Entitled: A General Architecture to Enhance Wiki Systems with Natural Language Processing Techniques
and submitted in partial fulfillment of the requirements for the degree of Master of Applied Science in Software Engineering complies with the regulations of this University and meets the accepted standards with respect to originality and quality.

Signed by the final examining committee:
Chair: Dr. Volker Haarslev
Examiner: Dr. Leila Kosseim
Examiner: Dr. Gregory Butler
Supervisor: Dr. René Witte

Approved: Chair of Department or Graduate Program Director
Dr. Robin A. L. Drew, Dean, Faculty of Engineering and Computer Science

Abstract

Wikis are web-based software applications that allow users to collaboratively create and edit web page content through a Web browser, using a simplified syntax. The ease of use and “open” philosophy of wikis have brought them to the attention of organizations and online communities, leading to widespread adoption as a simple and “quick” way of collaborative knowledge management. However, these characteristics of wiki systems can act as a double-edged sword: when wiki content is not properly structured, it can turn into a “tangle of links”, making navigation, organization and content retrieval difficult for end users.

Since wiki content is mostly written in unstructured natural language, we believe that existing state-of-the-art techniques from the Natural Language Processing (NLP) and Semantic Computing domains can help mitigate these common problems and improve the user experience by introducing new features. The challenge, however, is to find a solution for integrating novel semantic analysis algorithms into the multitude of existing wiki systems without the need to modify their engines. In this research work, we present a general architecture that allows wiki systems to benefit from NLP services made available through the Semantic Assistants framework, a service-oriented architecture for brokering NLP pipelines as web services. Our main contributions in this thesis include an analysis of wiki engines, the development of collaboration patterns between wikis and NLP, and the design of a cohesive integration architecture. As a concrete application, we deployed our integration to MediaWiki, the powerful wiki engine behind Wikipedia, to prove its practicability. Finally, we evaluate the usability and efficiency of our integration through a number of user studies we performed in real-world projects from various domains, including cultural heritage data management, software requirements engineering, and biomedical literature curation.

Acknowledgments

As Isaac Newton once said, “If I have seen further, it is by standing on the shoulders of giants.” My utmost gratitude goes to my thesis supervisor, Dr. René Witte, for his invaluable guidance, endless patience and understanding throughout the course of this thesis. His positive outlook, careful editing and confidence in my research inspired me to do my best.

I would like to express my appreciation to Dr. Marie-Jean Meurs for offering me her kind support and generous contributions to the evaluation of this thesis. Thank you for always being there for me. I am indebted to you for your kind heart and sisterly love.

I would also like to thank my fellow lab mates at the Semantic Software Lab and the scientists of the Concordia Centre for Structural and Functional Genomics for their careful feedback on my research work. I thank Caitlin Murphy and Elian Angius for their time and hard work, as well as the tens of Software Engineering students who participated in the evaluation of this thesis.

Last but not least, I cordially thank my parents and my brother for their everlasting love and encouragement. Words fall short of expressing how grateful I am to feel the warmth of their presence and support at every step of my life. It is to them that this thesis is dedicated.

Table of Contents

List of Figures
List of Tables
List of Acronyms
1 Introduction
  1.1 Motivation
  1.2 Approach
  1.3 Outline
2 Background
  2.1 Wiki Systems
    2.1.1 System Specifications
    2.1.2 Markup Languages
  2.2 Wikis in Practice
    2.2.1 Wikipedia, An Encyclopedia Wiki
    2.2.2 Personal Information and Knowledge Management
    2.2.3 Software Requirements Engineering
    2.2.4 Enterprise Wikis
    2.2.5 Cultural Heritage Data Management
  2.3 Common Problems in Wikis
  2.4 Suggested NLP Solutions
  2.5 The Semantic Assistants Architecture
    2.5.1 System Overview
    2.5.2 System Workflow
    2.5.3 Service Results
  2.6 Résumé
3 Related Work
  3.1 NLP-Enhanced Wikis
    3.1.1 Wikulu, An Intelligent User Interface for Wikis
  3.2 Semantic Wikis
    3.2.1 Semantic MediaWiki, A Semantic Extension to MediaWiki
    3.2.2 IkeWiki, A Semantic Wiki for Collaborative Knowledge Management
    3.2.3 SweetWiki, A Semantic Web Enabled Technology Wiki
    3.2.4 AceWiki, A Natural and Expressive Semantic Wiki
  3.3 Discussion
  3.4 Résumé
4 Requirements Analysis
  4.1 End-User Requirements
  4.2 Wiki Developer Requirements
  4.3 System Requirements
  4.4 Discussion
  4.5 Résumé
5 System Design
  5.1 Design Alternatives
    5.1.1 A Browser Plug-in
    5.1.2 A Wiki Plug-in
    5.1.3 A Semantic Assistants Wiki Component
    5.1.4 A Proxy Server Component
    5.1.5 Summary
  5.2 The Analysis Workflow
    5.2.1 User Interaction
    5.2.2 Service Invocation
    5.2.3 Wiki Communication
  5.3 Transformation of Results
    5.3.1 Semantic Metadata Representation
  5.4 Wiki Independency
    5.4.1 Module-based Architecture
    5.4.2 Semantics-based Architecture
  5.5 Wiki Ontology
  5.6 Developed Solution
  5.7 Résumé
6 Implementation and Application
  6.1 System Overview
  6.2 The Semantic Assistants Servlet
    6.2.1 The User Interface Module
    6.2.2 The Wiki Helper Module
    6.2.3 The Semantic Assistants Broker
    6.2.4 The Wiki Ontology Repository
  6.3 The Semantic Assistants Wiki Plug-in
  6.4 Storing and Presenting Service Results
    6.4.1 Templating Mechanism
    6.4.2 Storing the Markup
  6.5 Service Execution Flow
  6.6 Résumé
7 Evaluation
  7.1 Methodology
  7.2 Wiki-based Cultural Heritage Data Management
    7.2.1 Evaluation Scenario
    7.2.2 Results
  7.3 Wiki-based Collaborative Software Requirements Engineering
    7.3.1 Evaluation Scenario
    7.3.2 Results
  7.4 Wiki-based Biomedical Literature Curation
    7.4.1 Evaluation Scenario
    7.4.2 Results
  7.5 Résumé
8 Conclusions and Future Work
  8.1 Summary
  8.2 Suggestions for Future Work
  8.3 Conclusion
Bibliography
A Wiki Upper Ontology Description
B MediaWiki Ontology Description
C ReqWiki Questionnaire
D Questionnaire Responses
  D.1 Graduate Students Responses
  D.2 Undergraduate Students Responses

List of Figures

1 Wikipedia – a free encyclopedia wiki
2 A subset of MediaWiki text formatting markup
3 A MediaWiki page markup importing the FOAF vocabulary
4 The Semantic Assistants architecture [WG09]
5 The Semantic Assistants service execution workflow [WG09]
6 The DTD for Semantic Assistants response messages
7 The Wikulu system
Recommended publications
  • Assignment of Master's Thesis
    ASSIGNMENT OF MASTER’S THESIS
    Title: Git-based Wiki System
    Student: Bc. Jaroslav Šmolík
    Supervisor: Ing. Jakub Jirůtka
    Study Programme: Informatics
    Study Branch: Web and Software Engineering
    Department: Department of Software Engineering
    Validity: Until the end of summer semester 2018/19
    Instructions
    The goal of this thesis is to create a wiki system suitable for community (software) projects, focused on technically oriented users. The system must meet the following requirements:
    • All data is stored in a Git repository.
    • System provides access control.
    • System supports AsciiDoc and Markdown; it is extensible for other markup languages.
    • Full-featured user access via Git and CLI is provided.
    • System includes a web interface for wiki browsing and management. Its editor works with raw markup and offers syntax highlighting, live preview and interactive UI for selected elements (e.g. image insertion).
    Proceed in the following manner:
    1. Compare and analyse the most popular F/OSS wiki systems with regard to the given criteria.
    2. Design the system, perform usability testing.
    3. Implement the system in JavaScript. Source code must be reasonably documented and covered with automatic tests.
    4. Create a user manual and deployment instructions.
    References
    Will be provided by the supervisor.
    Ing. Michal Valenta, Ph.D., Head of Department
    doc. RNDr. Ing. Marcel Jiřina, Ph.D., Dean
    Prague, January 3, 2018
    Czech Technical University in Prague, Faculty of Information Technology, Department of Software Engineering
    Master’s thesis
    Git-based Wiki System
    Bc. Jaroslav Šmolík
    Supervisor: Ing. Jakub Jirůtka
    10th May 2018
    Acknowledgements
    I would like to thank my supervisor Ing. Jakub Jirůtka for his everlasting interest in the thesis, his punctual constructive feedback and for guiding me when I found myself in need of words of wisdom and experience.
  • Research Article Constrained Wiki: the Wikiway to Validating Content
    Hindawi Publishing Corporation, Advances in Human-Computer Interaction, Volume 2012, Article ID 893575, 19 pages, doi:10.1155/2012/893575
    Research Article
    Constrained Wiki: The WikiWay to Validating Content
    Angelo Di Iorio,1 Francesco Draicchio,1 Fabio Vitali,1 and Stefano Zacchiroli2
    1 Department of Computer Science, University of Bologna, Mura Anteo Zamboni 7, 40127 Bologna, Italy
    2 Université Paris Diderot, Sorbonne Paris Cité, PPS, UMR 7126, CNRS, F-75205 Paris, France
    Correspondence should be addressed to Angelo Di Iorio, [email protected]
    Received 9 June 2011; Revised 20 December 2011; Accepted 3 January 2012
    Academic Editor: Kerstin S. Eklundh
    Copyright © 2012 Angelo Di Iorio et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
    The “WikiWay” is the open editing philosophy of wikis meant to foster open collaboration and continuous improvement of their content. Just like other online communities, wikis often introduce and enforce conventions, constraints, and rules for their content, but do so in a considerably softer way, expecting authors to deliver content that satisfies the conventions and the constraints, or, failing that, having volunteers of the community, the WikiGnomes, fix others’ content accordingly. Constrained wikis is our generic framework for wikis to implement validators of community-specific constraints and conventions that preserve the WikiWay and their open collaboration features. To this end, specific requirements need to be observed by validators and a specific software architecture can be used for their implementation, that is, as independent functions (implemented as internal modules or external services) used in a nonintrusive way.
  • Why Use a Wiki? An Introduction to the Latest Online Publishing Format
    WHY USE A WIKI? An Introduction to the Latest Online Publishing Format
    A WebWorks.com White Paper
    Author: Alan J. Porter, VP-Operations, WebWorks.com, a brand of Quadralay Corporation, [email protected]
    WW_WP0309_WIKIpub
    © 2009 – Quadralay Corporation. All rights reserved.
    NOTE: Please feel free to redistribute this white paper to anyone you feel may benefit. If you would like an electronic copy for distribution, just send an e-mail to [email protected]
    CONTENTS
    Overview
    What is a Wiki?
    Open Editing = Collaborative Authoring
    Wikis in More Detail
    Wikis Are Everywhere
    Why Use a Wiki
    Getting People to Use Wikis
    Populating the Wiki
    WebWorks ePublisher and Wikis
  • Public Wikis: Sharing Knowledge Over the Internet
    Technical Communication Wiki : Chapter 2 This page last changed on Apr 15, 2009 by kjp15. Public Wikis: Sharing Knowledge over the Internet What is a public wiki? A public wiki refers to a wiki that everyone can edit, that is, it is open to the general public on the Internet with or without a free login. Anyone is welcome to add information to the wiki, but contributors are asked to stay within the subject area, policies, and standards of the particular wiki. Content that does not relate to the subject or is not of good quality can and will be deleted by other users. Also, if a contributor makes a mistake or a typo, other users will eventually correct the error. Open collaboration by anyone means that multiple articles can be written very fast simultaneously. And, while there may be some vandalism or edit wars, quality of the articles will generally improve over time after many editors have contributed. The biggest public wiki in the world, with over 75,000 contributors, is Wikipedia, "the free encyclopedia that anyone can edit" [1]. Wikipedia has over 2.8 million articles in the English version, which includes over 16 million wiki pages, and is one of the top ten most popular websites in the United States [2]. There are versions of Wikipedia in ten different languages, and articles in more than 260 languages. Wikipedia is operated by the non-profit Wikimedia Foundation [3] which pays a small group of employees and depends mainly on donations. There are also over 1,500 volunteer administrators with special powers to keep an eye on editors and make sure they conform to Wikipedia's guidelines and policies.
  • Trabajo De Grado Wiki Semántica Usando Folksonomies
    Trabajo de Grado (Undergraduate Thesis): Wiki Semántica usando Folksonomies
    Diego Torres
    Advisor: Alicia Díaz
    November 27, 2009
    Trabajo de Grado - Construcción Colaborativa de Ontologías
    Acknowledgments
    Many people helped me complete my undergraduate education. First, I want to thank my parents, Denise and Eduardo, for giving me everything I needed to develop in the things I liked most, for the education they gave me, and for the values they always instilled in me. I extend this to the rest of my family: my sister Denise, my grandparents, my aunts and uncles, my cousins, and my brother-in-law. Also to Mariana, the love of my life and tireless companion. Thank you for giving me strength, joy, and love every day. To Alicia Díaz, for trusting in me and believing that I could do research. Thank you for so much patience and dedication in teaching me day after day. Thank you for your affection. To Hilda, Gustavo, Fede, Fran, and Yaya. To Alicia Zingoni, for showing me how to honor life. To LIFIA, for giving me the space and the opportunities. To the directors and to my colleagues. To the people at LTF, who gave me my first lessons in this wonderful thing that is research. Thank you, Fede Naso, for giving so much and asking for nothing. Thanks to Diego, Nando, Richard, and Casco. Thanks to my colleagues in objects and multimedia. There are many of them; to all, more and more thanks. And of course to my friends, for the strength they gave me to start, to continue, and to finish.
  • Feasibility Study of a Wiki Collaboration Platform for Systematic Reviews
    Methods Research Report
    Feasibility Study of a Wiki Collaboration Platform for Systematic Reviews
    Prepared for: Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services, 540 Gaither Road, Rockville, MD 20850, www.ahrq.gov
    Contract No. 290-02-0019
    Prepared by: ECRI Institute Evidence-based Practice Center, Plymouth Meeting, PA
    Investigator: Eileen G. Erinoff, M.S.L.I.S.
    AHRQ Publication No. 11-EHC040-EF, September 2011
    This report is based on research conducted by the ECRI Institute Evidence-based Practice Center in 2008 under contract to the Agency for Healthcare Research and Quality (AHRQ), Rockville, MD (Contract No. 290-02-0019 -I). The findings and conclusions in this document are those of the author(s), who are responsible for its content, and do not necessarily represent the views of AHRQ. No statement in this report should be construed as an official position of AHRQ or of the U.S. Department of Health and Human Services. The information in this report is intended to help clinicians, employers, policymakers, and others make informed decisions about the provision of health care services. This report is intended as a reference and not as a substitute for clinical judgment. This report may be used, in whole or in part, as the basis for the development of clinical practice guidelines and other quality enhancement tools, or as a basis for reimbursement and coverage policies. AHRQ or U.S. Department of Health and Human Services endorsement of such derivative products or actions may not be stated or implied.
  • Personal Knowledge Models with Semantic Technologies
    Max Völkel
    Personal Knowledge Models with Semantic Technologies
    Bibliographic information: Detailed bibliographic data are available on the Internet at http://pkm.xam.de.
    Cover design: Stefanie Miller
    Production and publisher: Books on Demand GmbH, Norderstedt
    © 2010 Max Völkel, Ritterstr. 6, 76133 Karlsruhe
    This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
    Dissertation approved by the Faculty of Economics (Fakultät für Wirtschaftswissenschaften) of the Karlsruhe Institute of Technology (KIT) for the academic degree of Doctor of Economics (Dr. rer. pol.), by Dipl.-Inform. Max Völkel.
    Date of oral examination: July 14, 2010
    Referee: Prof. Dr. Rudi Studer
    Co-referee: Prof. Dr. Klaus Tochtermann
    Examiner: Prof. Dr. Gerhard Satzger
    Chair of the examination committee: Prof. Dr. Christine Harbring
    Abstract
    Following the ideas of Vannevar Bush (1945) and Douglas Engelbart (1963), this thesis explores how computers can help humans to be more intelligent. More precisely, the idea is to reduce limitations of cognitive processes with the help of knowledge cues, which are external reminders about previously experienced internal knowledge. A knowledge cue is any kind of symbol, pattern or artefact, created with the intent to be used by its creator, to re-evoke a previously experienced mental state, when used. The main processes in creating, managing and using knowledge cues are analysed. Based on the resulting knowledge cue life-cycle, an economic analysis of costs and benefits in Personal Knowledge Management (PKM) processes is performed.
  • Wiki Technology, Community Dynamics and Community Culture
    Wiki Technology and Community Culture
    By Mingli Yuan, July 2008
    Released under GFDL
    Contents
    Introduction − concept / history / jargons / a simple classification / organizations & companies / conferences
    Technology − implementations / features / principles / easy at first glance / syntax & parser / version control / wysiwyg / adventure of ideas
    Community Culture − openness & agf / npov / consensus / deletionism vs. inclusionism / controversy
    Introduction – concept
    A wiki is web pages; anyone who accesses it can contribute or modify content; a simplified markup language
    Introduction – history
    World Wide
    − 1994: Ward Cunningham, WikiWikiWeb
      (1994?) Patrick Mueller, the first WikiWikiClone
    − 2000: Sunir Shah, MeatballWiki
    − 2001: January 15, Jimmy Wales, Wikipedia
    Introduction – history cont.
    Mainland China
    − 2001-12-27: Softme Studio / 索秘软件工作室, jWiki as a sub-project of WebPM
    − 2002 / 5: 中蟒大杂院
    − 2002 / 10: Chinese Wikipedia
    − 2002 / 11: 贸大 Wiki
    Taiwan
    − Schee / 徐子涵
    − hlb / 薛良斌
    − Newzilla
    Early Blogosphere
    − Cnblog.org
    − Chinese Blogger Conference / 中文网志年会
    Introduction – jargons
    Basics
    − Sandbox
    − CamelCase
    − Wikify
    − RecentChanges
    − DocumentMode / ThreadMode
    − Namespace: Article / Talk / User / Category
    − Signature
    − BackLinks
    − InterWiki
    Community
    − EditWar
    − AGF
    − NPOV
    − Consensus / Vote
    − Deletionist / Inclusionism
    Copyright / Copyleft
    − PD
    − GFDL / Free
    − CC family
    − Fair use
    Introduction – a simple classification
    Tech related sites
    − c2.com / wikiwikiweb
    − meatball / usemode
    Wikimedia Family
    − wikiversity
    − wiktionary
  • Wiki Content Templating
    Wiki Content Templating
    Angelo Di Iorio, Computer Science Department, University of Bologna, Mura Anteo Zamboni, 7, 40127 Bologna, ITALY, [email protected]
    Fabio Vitali, Computer Science Department, University of Bologna, Mura Anteo Zamboni, 7, 40127 Bologna, ITALY, [email protected]
    Stefano Zacchiroli, Computer Science Department, University of Bologna, Mura Anteo Zamboni, 7, 40127 Bologna, ITALY, [email protected]
    ABSTRACT
    Wiki content templating enables reuse of content structures among wiki pages. In this paper we present a thorough study of this widespread feature, showing how its two state of the art models (functional and creational templating) are sub-optimal. We then propose a third, better, model called lightly constrained (LC) templating and show its implementation in the Moin wiki engine. We also show how LC templating implementations are the appropriate technologies to push forward semantically rich web pages on the lines of (lowercase) semantic web and microformats.
    semantic wikis (e.g. [11, 16, 17]) exploit the content mediation performed by the wiki engine to deliver semantically rich web pages and give to authors simple interfaces (sometimes even invisible interfaces based on syntactic quirks!) to semantically annotate their content or to import semantic metadata.
    Though (lowercase) semantic web and semantic wikis fill a technological gap inhibiting authors to provide metadata, the question of how to actually foster the creation of semantically rich web pages is still open. By analogy, we observe that a similar goal has been traditionally pursued by wiki content templating mechanisms which have been available since the first wiki clones under the name of seeding pages.
  • A Grammar for Standardized Wiki Markup
    A Grammar for Standardized Wiki Markup
    Martin Junghans, Dirk Riehle, Rama Gurram, Matthias Kaiser, Mário Lopes, Umit Yalcinalp
    SAP Research, SAP Labs LLC, 3475 Deer Creek Rd, Palo Alto, CA, 94304 U.S.A., +1 (650) 849 4087
    [email protected], [email protected], {[email protected]}
    ABSTRACT
    Today’s wiki engines are not interoperable. The rendering engine is tied to the processing tools which are tied to the wiki editors. This is an unfortunate consequence of the lack of rigorously specified standards. This paper discusses an EBNF-based grammar for Wiki Creole 1.0, a community standard for wiki markup, and demonstrates its benefits. Wiki Creole is being specified using prose, so our grammar revealed several categories of ambiguities, showing the value of a more formal approach to wiki markup specification. The formalization of Wiki Creole using a grammar shows performance problems that today’s regular-expression-based wiki parsers might face when scaling up. We present an implementation of a wiki markup parser and demonstrate our test
    1 INTRODUCTION
    Wikis were invented in 1995 [9]. They have become a widely used tool on the web and in the enterprise since then [4]. In the form of Wikipedia, for example, wikis are having a significant impact on society [21]. Many different wiki engines have been implemented since the first wiki was created. All of these wiki engines integrate the core page rendering engine, its storage backend, the processing tools, and the page editor in one software package.
    Wiki pages are written in wiki markup. Almost all wiki engines define their own markup language. Different software components of the wiki engine like the page rendering part are tied to that particular markup language.
  • Urobe: a Prototype for Wiki Preservation
    UROBE: A PROTOTYPE FOR WIKI PRESERVATION
    Niko Popitsch, Robert Mosser, Wolfgang Philipp
    University of Vienna, Faculty of Computer Science
    ABSTRACT
    More and more information that is considered for digital long-term preservation is generated by Web 2.0 applications like wikis, blogs or social networking tools. However, there is little support for the preservation of these data today. Currently they are preserved like regular Web sites without taking the flexible, lightweight and mostly graph-based data models of the underlying Web 2.0 applications into consideration. By this, valuable information about the relations within these data and about links to other data is lost. Furthermore, information about the internal structure of the data, e.g., expressed by wiki markup languages is not preserved entirely.
    We argue that this currently neglected information is
    By this, some irrelevant information (like e.g., automatically generated pages) is archived while some valuable information about the semantics of relationships between these elements is lost or archived in a way that is not easily processable by machines. For example, a wiki article is authored by many different users and the information who authored what and when is reflected in the (simple) data model of the wiki software. This information is required to access and integrate these data with other data sets in the future. However, archiving only the HTML version of a history page in Wikipedia makes it hard to extract this information automatically.
    Another issue is that the internal structure of particular core elements (e.g., wiki articles) is currently not preserved adequately.
  • Wikimedia Free Videos Download
    wikimedia free videos download
    Celebrate 20 years of Wikipedia and the people who make it possible.
    In this time of uncertainty, our enduring commitment is to provide reliable and neutral information for the world. The nonprofit Wikimedia Foundation provides the essential infrastructure for free knowledge. We host Wikipedia, the free online encyclopedia, created, edited, and verified by volunteers around the world, as well as many other vital community projects. All of which is made possible thanks to donations from individuals like you. We welcome anyone who shares our vision to join us in collecting and sharing knowledge that fully represents human diversity.
    1 Wikimedia projects belong to everyone. You made it. It is yours to use. For free. That means you can use it, adapt it, or share what you find on Wikimedia sites. Just do not write your own bio, or copy/paste it into your homework.
    2 We respect your data and privacy. We do not sell your email address or any of your personal information to third parties. More information about our privacy practices is available at the Wikimedia Foundation privacy policy, donor privacy policy, and data retention guidelines.
    3 People like you keep Wikipedia accurate. Readers verify the facts. Articles are collaboratively created and edited by a community of volunteers using reliable sources, so no single person or company owns a Wikipedia article. The Wikimedia Foundation does not write or edit, but you and everyone you know can help.
    4 Not all wikis are Wikimedia.