Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems


A thesis submitted to the University of Dublin, Trinity College in fulfilment of the requirements of the degree of Doctor of Philosophy

Mostafa Bayomi
Knowledge and Data Engineering Group (KDEG)
School of Computer Science and Statistics
Trinity College, Dublin, Ireland

Supervised by Prof. Séamus Lawless

2019

Declaration

I declare that this thesis has not been submitted as an exercise for a degree at this or any other university and it is entirely my own work. I agree to deposit this thesis in the University's open access institutional repository or allow the library to do so on my behalf, subject to Irish Copyright Legislation and Trinity College Library conditions of use and acknowledgement.

Signature ___________________________ Date ___________________
Mostafa Bayomi

Permission to lend or copy

I, the undersigned, agree that the Trinity College Library may lend or copy this thesis upon request.

Signature ___________________________ Date ___________________
Mostafa Bayomi

Dedication

To my mother's soul

Acknowledgements

First and foremost, all thanks and praise are due to God, who has granted me this great success in my PhD and has bestowed upon me all the necessary strength, health, wits, patience, and perseverance to complete it.

I would like to express my gratitude to the numerous people who have helped me to complete my work over the last number of years. In particular, I would like to express my utmost gratitude to my supervisor, Prof. Séamus Lawless, for all his guidance, patience, feedback, and encouragement, and for his great support, personally and professionally, throughout my PhD journey. I could not have done it without him.

I would like to thank all my colleagues in KDEG and ADAPT who have directly and indirectly supported me, both personally and academically, during the years of the PhD.

My utmost gratefulness and sincerest thanks to my father (Mohamed Bayomi), my brothers (Ahmed and Amr) and my sister (Rasha), who have always believed in me, and for all their care, support and encouragement. To my lovely wife Alaa, for all her love, devotion, and never-ending support and encouragement. To my kids, Hamza and Omar, who are the joy of my life.

I would like to thank all my Egyptian friends in Dublin who have been a major part of this journey. A special thanks to two very special people, Ramy Shosha and Rami Ghorab; words of gratitude and thanks can never do them justice.

And I've saved the best for last: my utmost gratefulness and sincerest thanks to my mother (Boushra Shams), who I wish could be with me at this moment. She has been, and will always be, the greatest supportive person in my life, especially in my PhD journey. You have left this world, but you will always be with me. Thanks, Mum, for everything; words will never express my feelings for you. May God bless your soul.

Abstract

The volume of digital content resources written as text documents is growing every day at an unprecedented rate. Because this content is generally not structured as easy-to-handle units, it can be very difficult for users to find the information they are interested in, or the information that would help them accomplish their tasks. This, in turn, has increased the need for producing tailored content that can be adapted to the needs of individual users.
A key challenge in producing such tailored content lies in the ability to understand how this content is structured. Hence, the efficient analysis and understanding of unstructured text content has become increasingly important, which has led to the growing use of Natural Language Processing (NLP) techniques to help process unstructured text documents. Amongst the different NLP techniques, text segmentation is specifically used to understand the structure of textual documents. However, current approaches to text segmentation typically rely on lexical and/or syntactic representations to build a structure from unstructured text documents, whereas the relationship between segments may be semantic rather than lexical or syntactic. Furthermore, text segmentation research has primarily focused on techniques for processing text documents, not on how these techniques can be utilised to produce tailored content that can be adapted to the needs of individual users.

In contrast, the field of Adaptive Systems has inherently focused on the challenges associated with dynamically adapting and delivering content to individual users. However, adaptive systems have primarily focused upon the techniques of adapting content, not on how to understand and structure that content. Even systems that have focused on structuring content are limited in that they rely upon the original structure of the content resource, which reflects the perspective of its author. These systems therefore do not deeply "understand" the structure of the content, which in turn limits their capability to discover and supply appropriate content for use in defined contexts, and limits the content's amenability for reuse within various independent adaptive systems.

In order to utilise the strength of NLP techniques to overcome the challenges of understanding unstructured text content, this thesis investigates how NLP techniques can be utilised to enhance the supply of content to adaptive systems. Specifically, the contribution of this thesis is concerned with addressing the challenges associated with hierarchical text segmentation techniques, and with content discoverability and reusability for adaptive systems.

Firstly, this research proposes a novel hierarchical text segmentation approach, named C-HTS, that builds a structure from text documents based on the semantic representation of text. Semantic representation is a method that replaces keyword-based text representation with concept-based features, where the meaning of a piece of text is represented as a vector of knowledge concepts automatically extracted from massive human knowledge repositories such as Wikipedia. Using this approach, C-HTS represents the content of a document as a tree-like hierarchy. Structuring a document in this way yields a hierarchically coherent tree that is useful for supporting a variety of search methods, as it provides different levels of granularity for the underlying content.

Secondly, this research proposes a novel content-supply service named CROCC. The aim of CROCC is to utilise the structure produced by C-HTS to overcome the limitations of state-of-the-art content-supply approaches. Finally, this research conducts an evaluation of the extent to which the CROCC service enhances content discoverability and reusability for adaptive systems.
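To make the abstract's two key ideas concrete, the following is a minimal, illustrative Python sketch, not the thesis's implementation: the tiny three-concept inventory and the term-overlap weighting are hypothetical stand-ins for the Wikipedia-scale concept space and relatedness measure the abstract refers to. It represents each text segment as a vector of concepts rather than keywords, and then merges the most semantically related adjacent segments bottom-up, in the spirit of C-HTS, into a tree-like hierarchy.

```python
from collections import Counter
from math import sqrt

# Hypothetical mini "knowledge repository": each concept is described by a
# handful of terms. In the thesis, concepts are drawn from Wikipedia instead.
CONCEPTS = {
    "Machine learning": {"model", "training", "data", "learning"},
    "Linguistics": {"language", "syntax", "semantics", "text"},
    "Information retrieval": {"search", "query", "document", "index"},
}

def concept_vector(text):
    """Represent text as weights over knowledge concepts, not raw keywords."""
    tokens = Counter(text.lower().split())
    vec = {}
    for concept, terms in CONCEPTS.items():
        weight = sum(n for tok, n in tokens.items() if tok in terms)
        if weight:
            vec[concept] = float(weight)
    return vec

def relatedness(a, b):
    """Cosine similarity between two concept vectors (semantic relatedness)."""
    dot = sum(a[c] * b[c] for c in a.keys() & b.keys())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_hierarchy(segments):
    """Bottom-up sketch: repeatedly merge the most related *adjacent* segments,
    producing a binary tree whose levels give coarser or finer granularity."""
    # Each node is (subtree, merged text, concept vector); leaves are segments.
    nodes = [(seg, seg, concept_vector(seg)) for seg in segments]
    while len(nodes) > 1:
        i = max(range(len(nodes) - 1),
                key=lambda j: relatedness(nodes[j][2], nodes[j + 1][2]))
        left, right = nodes[i], nodes[i + 1]
        text = left[1] + " " + right[1]
        nodes[i:i + 2] = [((left[0], right[0]), text, concept_vector(text))]
    return nodes[0][0]  # nested tuples encode the tree-like hierarchy

if __name__ == "__main__":
    doc = [
        "training data improves the model",
        "learning from data",
        "search engines index every document",
    ]
    # The two machine-learning segments merge first; the retrieval segment
    # joins only at the root, giving two semantic levels of granularity.
    print(build_hierarchy(doc))
```

Each level of the resulting nested structure exposes the underlying content at a different granularity, which is what makes such a hierarchy useful for discovering and reusing content at the right scope for a given adaptive system.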
Table of Contents

Abstract
1. Introduction
   1.1 Motivation
   1.2 Research Question
       1.2.1 Research Objectives
   1.3 Research Contributions
   1.4 Research Methodology
   1.5 Thesis Overview
2. State of the Art
   2.1 Introduction
   2.2 Natural Language Processing
       2.2.1 Overview
       2.2.2 Low-level NLP Tasks
       2.2.3 High-level NLP Tasks
       2.2.4 Summary
   2.3 Text Segmentation
       2.3.1 Overview
       2.3.2 Content-based and Discourse-based
       2.3.3 Supervised and Unsupervised
       2.3.4 Borderline Sentence Detection Methods
       2.3.5 Linear and Hierarchical
       2.3.6 Hierarchical Text Segmentation Techniques
       2.3.7 Summary
   2.4 Adaptive Systems