Processing XML and TEI Into What? a Free-For-All Pair of Workshops

Total Page:16

File Type:pdf, Size:1020Kb

Processing XML and TEI Into What? a Free-For-All Pair of Workshops Processing XML and TEI into What? A Free-for-all Pair of Workshops DHSI 2021, Course #12: Elisa Beshero-Bondar Coursepack contents (coauthored by Elisa Beshero-Bondar and David J. Birnbaum) A. Contents ..................................................................................................................................................................... 1 B. About this course ................................................................................................................................................... 3 C. Syllabus (2019) ......................................................................................................................................................................5 D. Resources (bibliography and links) ............................................................................................................. 8 E. Exercises and tutorials (links) ..................................................................................................................... 10 F. XPath 1. What can XPath do for me? .................................................................................................................... 11 2. The XPath functions we use most ....................................................................................................... 21 G. Regular expressions (regex) 1. Autotagging with regular expressions (regex) ............................................................................. 25 H. XSLT 1. Introduction to XSLT ................................................................................................................................. 33 2. Attribute value templates (AVT) ......................................................................................................... 41 3. The XSLT identity transformation a. Tutorial ..................................................................................................................................................... 45 b. Exercise .................................................................................................................................................... 49 4. Modal XSLT .................................................................................................................................................... 52 5. XSLT, part 2: advanced features .......................................................................................................... 55 6. Using <xsl:analyze-string> ..................................................................................................................... 63 I. Schematron 1. Guide to schema writing with Schematron .................................................................................... 66 2. Validating references with Schematron........................................................................................... 69 3. Coding with unique identifiers and Schematron ......................................................................... 73 J. What’s new in XSLT 3.0 and XPath 3.1 (under development) ...................................................... 76 K. Obdurodon exercises and tests 1. Regular expressions (regex) .................................................................................................................. 81 2. XPath ................................................................................................................................................................. 95 3. XQuery ........................................................................................................................................................... 103 4. XSLT................................................................................................................................................................ 108 5. Schematron ................................................................................................................................................. 129 L. Newtfire exercises and tests 1. Regular expressions (regex) ............................................................................................................... 134 2. XPath .............................................................................................................................................................. 136 3. XQuery ........................................................................................................................................................... 144 4. XSLT................................................................................................................................................................ 147 5. Schematron ................................................................................................................................................. 163 M. Mulberry guides and quick references 1. Guide to using the Oxygen XML Editor (v20.0) ......................................................................... 176 2. XQuery 1.0 and XPath 2.0 functions and operators quick reference .............................. 189 3. Regular expressions in XSLT 2.0, XQuery 1.0 and XPath 2.0 quick reference ............ 191 4. ISO Schematron quick reference ...................................................................................................... 193 5. XPath 2.0 quick reference .................................................................................................................... 195 6. XQuery 1.0 quick reference ................................................................................................................. 197 7. XSLT 2.0 quick reference...................................................................................................................... 199 N. Supplemental readings 1. XQuery (Priscilla Walmsley; sample) ............................................................................................. 202 2. XPath 2.0 and XSLT 2.0 programmer’s reference (Michael Kay; table of contents) .. 232 Instructors: Elisa Beshero-Bondar and David J. Birnbaum | XPath for Document Archaeology and Project Management 3/16/19, 7)38 PM XPath for processing XML and managing projects DHSI 2019 Course 49 (Week 2, 10–14 June, 2019) Course description Syllabus Resources and references Course Pack View on GitHub Instructors: Elisa Beshero-Bondar and David J. Birnbaum Description: Learn XPath intensively and gain superpowers with XML processing! Whether you’ve recently learned XML and want to build something with it, or whether you’ve worked with XPath before but are rusty, new and experienced coders alike will benefit from our course. XPath is usually not the center of a DHSI class, and people often gain hasty “ad hoc” experience with it when learning it only along the way to doing something else. Concentrating intensively for a week on XPath will “power up” what you can do with XML, and will help you refine the way you code your documents. Our course will assist XML coders (whether beginners or experienced) with complex processing of information from markup and from plain text. Our goals are 1) to increase our participants’ confidence and fluency in reading and extracting information coded in XML archives and databases, and 2) to share strategies for systematically reviewing, designing, and building those archives and databases. Because we can “dig” latent information out of the document “strata” of texts, we think of working with XPath as something like planning an archaeology project, turning an XML project into a carefully managed digital dig site for cultural data! In our course you’ll gain experience with writing precise and powerful XPath to illuminate information that isn’t obvious on a human reading. For example, we’ll write XPath to calculate how frequently you have marked a certain phenomenon, or locate which names of persons are mentioned together in the same chapter, paragraph, sentence, stanza, footnote, or other structural unit. We’ll apply XPath to check for accuracy of text encoding—to write schema rules to manage your coding (or your project team’s coding). You will learn how XPath can help you to pull data from your documents into lists, tables, and graphic visualizations. XPath is the center of the course, but we will explore how it applies in multiple XML processing contexts so that you learn how these work similarly and how these are used, respectively, to validate documents and to transform them for publication and other reuse. Thus we devote serious, sustained attention to writing and applying XPath by surveying how it is expressed in a variety of frameworks (including XSLT, XQuery, and Schematron), with a variety of materials (including XML and plain-text documents), and involving a variety of task types (such as date arithmetic to calculate how much time elapsed between dates and string surgery to look for and manipulate patterns inside your coded elements). You’ll gain fluency with XPath expressions and https://ebeshero.github.io/UpTransformation/dhsi-XPath_CourseDescription.html Page 1 of 2 Instructors: Elisa Beshero-Bondar and David J. Birnbaum | XPath for Document Archaeology and Project Management 3/16/19, 7)38 PM patterns, including predicates, operators, functions (from the core library and user-defined), regular expressions, and other features, and we’ll practice these in different XML-related contexts, starting with XQuery, and moving to XSLT and Schematron). Whether you are an XML beginner or a more experienced coder, you’ll find that XPath will
Recommended publications
  • Fun Factor: Coding with Xquery a Conversation with Jason Hunter by Ivan Pedruzzi, Senior Product Architect for Stylus Studio
    Fun Factor: Coding With XQuery A Conversation with Jason Hunter by Ivan Pedruzzi, Senior Product Architect for Stylus Studio Jason Hunter is the author of Java Servlet Programming and co-author of Java Enterprise Best Practices (both O'Reilly Media), an original contributor to Apache Tomcat, and a member of the expert groups responsible for Servlet, JSP, JAXP, and XQJ (XQuery API for Java) development. Jason is an Apache Member, and as Apache's representative to the Java Community Process Executive Committee he established a landmark agreement for open source Java. He co-created the open source JDOM library to enable optimized Java and XML integration. More recently, Jason's work has focused on XQuery technologies. In addition to helping on XQJ, he co-created BumbleBee, an XQuery test harness, and started XQuery.com, a popular XQuery development portal. Jason presently works as a Senior Engineer with Mark Logic, maker of Content Interaction Server, an XQuery-enabled database for content. Jason is interviewed by Ivan Pedruzzi, Senior Product Architect for Stylus Studio. Stylus Studio is the leading XML IDE for XML data integration, featuring advanced support for XQuery development, including XQuery editing, mapping, debugging and performance profiling. Ivan Pedruzzi: Hi, Jason. Thanks for taking the time to XQuery behind a J2EE server (I have a JSP tag library talk with The Stylus Scoop today. Most of our readers for this) but I’ve found it easier to simply put XQuery are familiar with your past work on Java Servlets; but directly on the web. What you do is put .xqy files on the could you tell us what was behind your more recent server that are XQuery scripts to produce XHTML interest and work in XQuery technologies? pages.
    [Show full text]
  • Oracle® Big Data Connectors User's Guide
    Oracle® Big Data Connectors User's Guide Release 4 (4.12) E93614-03 July 2018 Oracle Big Data Connectors User's Guide, Release 4 (4.12) E93614-03 Copyright © 2011, 2018, Oracle and/or its affiliates. All rights reserved. Primary Author: Frederick Kush This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency- specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs.
    [Show full text]
  • Bibliography of Erik Wilde
    dretbiblio dretbiblio Erik Wilde's Bibliography References [1] AFIPS Fall Joint Computer Conference, San Francisco, California, December 1968. [2] Seventeenth IEEE Conference on Computer Communication Networks, Washington, D.C., 1978. [3] ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Los Angeles, Cal- ifornia, March 1982. ACM Press. [4] First Conference on Computer-Supported Cooperative Work, 1986. [5] 1987 ACM Conference on Hypertext, Chapel Hill, North Carolina, November 1987. ACM Press. [6] 18th IEEE International Symposium on Fault-Tolerant Computing, Tokyo, Japan, 1988. IEEE Computer Society Press. [7] Conference on Computer-Supported Cooperative Work, Portland, Oregon, 1988. ACM Press. [8] Conference on Office Information Systems, Palo Alto, California, March 1988. [9] 1989 ACM Conference on Hypertext, Pittsburgh, Pennsylvania, November 1989. ACM Press. [10] UNIX | The Legend Evolves. Summer 1990 UKUUG Conference, Buntingford, UK, 1990. UKUUG. [11] Fourth ACM Symposium on User Interface Software and Technology, Hilton Head, South Carolina, November 1991. [12] GLOBECOM'91 Conference, Phoenix, Arizona, 1991. IEEE Computer Society Press. [13] IEEE INFOCOM '91 Conference on Computer Communications, Bal Harbour, Florida, 1991. IEEE Computer Society Press. [14] IEEE International Conference on Communications, Denver, Colorado, June 1991. [15] International Workshop on CSCW, Berlin, Germany, April 1991. [16] Third ACM Conference on Hypertext, San Antonio, Texas, December 1991. ACM Press. [17] 11th Symposium on Reliable Distributed Systems, Houston, Texas, 1992. IEEE Computer Society Press. [18] 3rd Joint European Networking Conference, Innsbruck, Austria, May 1992. [19] Fourth ACM Conference on Hypertext, Milano, Italy, November 1992. ACM Press. [20] GLOBECOM'92 Conference, Orlando, Florida, December 1992. IEEE Computer Society Press. http://github.com/dret/biblio (August 29, 2018) 1 dretbiblio [21] IEEE INFOCOM '92 Conference on Computer Communications, Florence, Italy, 1992.
    [Show full text]
  • QUERYING JSON and XML Performance Evaluation of Querying Tools for Offline-Enabled Web Applications
    QUERYING JSON AND XML Performance evaluation of querying tools for offline-enabled web applications Master Degree Project in Informatics One year Level 30 ECTS Spring term 2012 Adrian Hellström Supervisor: Henrik Gustavsson Examiner: Birgitta Lindström Querying JSON and XML Submitted by Adrian Hellström to the University of Skövde as a final year project towards the degree of M.Sc. in the School of Humanities and Informatics. The project has been supervised by Henrik Gustavsson. 2012-06-03 I hereby certify that all material in this final year project which is not my own work has been identified and that no work is included for which a degree has already been conferred on me. Signature: ___________________________________________ Abstract This article explores the viability of third-party JSON tools as an alternative to XML when an application requires querying and filtering of data, as well as how the application deviates between browsers. We examine and describe the querying alternatives as well as the technologies we worked with and used in the application. The application is built using HTML 5 features such as local storage and canvas, and is benchmarked in Internet Explorer, Chrome and Firefox. The application built is an animated infographical display that uses querying functions in JSON and XML to filter values from a dataset and then display them through the HTML5 canvas technology. The results were in favor of JSON and suggested that using third-party tools did not impact performance compared to native XML functions. In addition, the usage of JSON enabled easier development and cross-browser compatibility. Further research is proposed to examine document-based data filtering as well as investigating why performance deviated between toolsets.
    [Show full text]
  • Scraping HTML with Xpath Stéphane Ducasse, Peter Kenny
    Scraping HTML with XPath Stéphane Ducasse, Peter Kenny To cite this version: Stéphane Ducasse, Peter Kenny. Scraping HTML with XPath. published by the authors, pp.26, 2017. hal-01612689 HAL Id: hal-01612689 https://hal.inria.fr/hal-01612689 Submitted on 7 Oct 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Scraping HTML with XPath Stéphane Ducasse and Peter Kenny Square Bracket tutorials September 28, 2017 master @ a0267b2 Copyright 2017 by Stéphane Ducasse and Peter Kenny. The contents of this book are protected under the Creative Commons Attribution-ShareAlike 3.0 Unported license. You are free: • to Share: to copy, distribute and transmit the work, • to Remix: to adapt the work, Under the following conditions: Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license. For any reuse or distribution, you must make clear to others the license terms of this work.
    [Show full text]
  • Supporting SPARQL Update Queries in RDF-XML Integration *
    Supporting SPARQL Update Queries in RDF-XML Integration * Nikos Bikakis1 † Chrisa Tsinaraki2 Ioannis Stavrakantonakis3 4 Stavros Christodoulakis 1 NTU Athens & R.C. ATHENA, Greece 2 EU Joint Research Center, Italy 3 STI, University of Innsbruck, Austria 4 Technical University of Crete, Greece Abstract. The Web of Data encourages organizations and companies to publish their data according to the Linked Data practices and offer SPARQL endpoints. On the other hand, the dominant standard for information exchange is XML. The SPARQL2XQuery Framework focuses on the automatic translation of SPARQL queries in XQuery expressions in order to access XML data across the Web. In this paper, we outline our ongoing work on supporting update queries in the RDF–XML integration scenario. Keywords: SPARQL2XQuery, SPARQL to XQuery, XML Schema to OWL, SPARQL update, XQuery Update, SPARQL 1.1. 1 Introduction The SPARQL2XQuery Framework, that we have previously developed [6], aims to bridge the heterogeneity issues that arise in the consumption of XML-based sources within Semantic Web. In our working scenario, mappings between RDF/S–OWL and XML sources are automatically derived or manually specified. Using these mappings, the SPARQL queries are translated on the fly into XQuery expressions, which access the XML data. Therefore, the current version of SPARQL2XQuery provides read-only access to XML data. In this paper, we outline our ongoing work on extending the SPARQL2XQuery Framework towards supporting SPARQL update queries. Both SPARQL and XQuery have recently standardized their update operation seman- tics in the SPARQL 1.1 and XQuery Update Facility, respectively. We have studied the correspondences between the update operations of these query languages, and we de- scribe the extension of our mapping model and the SPARQL-to-XQuery translation algorithm towards supporting SPARQL update queries.
    [Show full text]
  • XML Transformations, Views and Updates Based on Xquery Fragments
    Faculteit Wetenschappen Informatica XML Transformations, Views and Updates based on XQuery Fragments Proefschrift voorgelegd tot het behalen van de graad van doctor in de wetenschappen aan de Universiteit Antwerpen, te verdedigen door Roel VERCAMMEN Promotor: Prof. Dr. Jan Paredaens Antwerpen, 2008 Co-promotor: Dr. Ir. Jan Hidders XML Transformations, Views and Updates based on XQuery Fragments Roel Vercammen Universiteit Antwerpen, 2008 http://www.universiteitantwerpen.be Permission to make digital or hard copies of portions of this work for personal or classroom use is granted, provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice. Copyrights for components of this work owned by others than the author must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission of the author. Research funded by a Ph.D. grant of the Institute for the Promotion of Innovation through Science and Technology in Flan- ders (IWT-Vlaanderen). { Onderzoek gefinancierd met een specialisatiebeurs van het Instituut voor de Aanmoediging van Innovatie door Wetenschap en Technologie in Vlaanderen (IWT-Vlaanderen). Grant number / Beurs nummer: 33581. http://www.iwt.be Typesetting by LATEX Acknowledgements This thesis is the result of the contributions of many friends and colleagues to whom I would like to say \thank you". First and foremost, I want to thank my advisor Jan Paredaens, who gave me the opportunity to become a researcher and teached me how good research should be performed. I had the honor to write several papers in collaboration with him and will always remember the discussions and his interesting views on research, politics and gastronomy.
    [Show full text]
  • SVG-Based Knowledge Visualization
    MASARYK UNIVERSITY FACULTY}w¡¢£¤¥¦§¨ OF I !"#$%&'()+,-./012345<yA|NFORMATICS SVG-based Knowledge Visualization DIPLOMA THESIS Miloš Kaláb Brno, spring 2012 Declaration Hereby I declare, that this paper is my original authorial work, which I have worked out by my own. All sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source. Advisor: RNDr. Tomáš Gregar Ph.D. ii Acknowledgement I would like to thank RNDr. Tomáš Gregar Ph.D. for supervising the thesis. His opinions, comments and advising helped me a lot with accomplishing this work. I would also like to thank to Dr. Daniel Sonntag from DFKI GmbH. Saarbrücken, Germany, for the opportunity to work for him on the Medico project and for his supervising of the thesis during my erasmus exchange in Germany. Big thanks also to Jochen Setz from Dr. Sonntag’s team who worked on the server background used by my visualization. Last but not least, I would like to thank to my family and friends for being extraordinary supportive. iii Abstract The aim of this thesis is to analyze the visualization of semantic data and sug- gest an approach to general visualization into the SVG format. Afterwards, the approach is to be implemented in a visualizer allowing user to customize the visualization according to the nature of the data. The visualizer was integrated as an extension of Fresnel Editor. iv Keywords Semantic knowledge, SVG, Visualization, JavaScript, Java, XML, Fresnel, XSLT v Contents Introduction . .3 1 Brief Introduction to the Related Technologies ..........5 1.1 XML – Extensible Markup Language ..............5 1.1.1 XSLT – Extensible Stylesheet Lang.
    [Show full text]
  • XPATH in NETCONF and YANG Table of Contents
    XPATH IN NETCONF AND YANG Table of Contents 1. Introduction ............................................................................................................3 2. XPath 1.0 Introduction ...................................................................................3 3. The Use of XPath in NETCONF ...............................................................4 4. The Use of XPath in YANG .........................................................................5 5. XPath and ConfD ...............................................................................................8 6. Conclusion ...............................................................................................................9 7. Additional Resourcese ..................................................................................9 2 XPath in NETCONF and YANG 1. Introduction XPath is a powerful tool used by NETCONF and YANG. This application note will help you to understand and utilize this advanced feature of NETCONF and YANG. This application note gives a brief introduction to XPath, then describes how XPath is used in NETCONF and YANG, and finishes with a discussion of XPath in ConfD. The XPath 1.0 standard was defined by the W3C in 1999. It is a language which is used to address the parts of an XML document and was originally design to be used by XML Transformations. XPath gets its name from its use of path notation for navigating through the hierarchical structure of an XML document. Since XML serves as the encoding format for NETCONF and a data model defined in YANG is represented in XML, it was natural for NETCONF and XML to utilize XPath. 2. XPath 1.0 Introduction XML Path Language, or XPath 1.0, is a W3C recommendation first introduced in 1999. It is a language that is used to address and match parts of an XML document. XPath sees the XML document as a tree containing different kinds of nodes. The types of nodes can be root, element, text, attribute, namespace, processing instruction, and comment nodes.
    [Show full text]
  • Pearls of XSLT/Xpath 3.0 Design
    PEARLS OF XSLT AND XPATH 3.0 DESIGN PREFACE XSLT 3.0 and XPath 3.0 contain a lot of powerful and exciting new capabilities. The purpose of this paper is to highlight the new capabilities. Have you got a pearl that you would like to share? Please send me an email and I will add it to this paper (and credit you). I ask three things: 1. The pearl highlights a capability that is new to XSLT 3.0 or XPath 3.0. 2. Provide a short, complete, working stylesheet with a sample input document. 3. Provide a brief description of the code. This is an evolving paper. As new pearls are found, they will be added. TABLE OF CONTENTS 1. XPath 3.0 is a composable language 2. Higher-order functions 3. Partial functions 4. Function composition 5. Recursion with anonymous functions 6. Closures 7. Binary search trees 8. -- next pearl is? -- CHAPTER 1: XPATH 3.0 IS A COMPOSABLE LANGUAGE The XPath 3.0 specification says this: XPath 3.0 is a composable language What does that mean? It means that every operator and language construct allows any XPath expression to appear as its operand (subject only to operator precedence and data typing constraints). For example, take this expression: 3 + ____ The plus (+) operator has a left-operand, 3. What can the right-operand be? Answer: any XPath expression! Let's use the max() function as the right-operand: 3 + max(___) Now, what can the argument to the max() function be? Answer: any XPath expression! Let's use a for- loop as its argument: 3 + max(for $i in 1 to 10 return ___) Now, what can the return value of the for-loop be? Answer: any XPath expression! Let's use an if- statement: 3 + max(for $i in 1 to 10 return (if ($i gt 5) then ___ else ___))) And so forth.
    [Show full text]
  • Access Control Models for XML
    Access Control Models for XML Abdessamad Imine Lorraine University & INRIA-LORIA Grand-Est Nancy, France [email protected] Outline • Overview on XML • Why XML Security? • Querying Views-based XML Data • Updating Views-based XML Data 2 Outline • Overview on XML • Why XML Security? • Querying Views-based XML Data • Updating Views-based XML Data 3 What is XML? • eXtensible Markup Language [W3C 1998] <files> "<record>! ""<name>Robert</name>! ""<diagnosis>Pneumonia</diagnosis>! "</record>! "<record>! ""<name>Franck</name>! ""<diagnosis>Ulcer</diagnosis>! "</record>! </files>" 4 What is XML? • eXtensible Markup Language [W3C 1998] <files>! <record>! /files" <name>Robert</name>! <diagnosis>! /record" /record" Pneumonia! </diagnosis> ! </record>! /name" /diagnosis" <record …>! …! </record>! Robert" Pneumonia" </files>! 5 XML for Documents • SGML • HTML - hypertext markup language • TEI - Text markup, language technology • DocBook - documents -> html, pdf, ... • SMIL - Multimedia • SVG - Vector graphics • MathML - Mathematical formulas 6 XML for Semi-Structered Data • MusicXML • NewsML • iTunes • DBLP http://dblp.uni-trier.de • CIA World Factbook • IMDB http://www.imdb.com/ • XBEL - bookmark files (in your browser) • KML - geographical annotation (Google Maps) • XACML - XML Access Control Markup Language 7 XML as Description Language • Java servlet config (web.xml) • Apache Tomcat, Google App Engine, ... • Web Services - WSDL, SOAP, XML-RPC • XUL - XML User Interface Language (Mozilla/Firefox) • BPEL - Business process execution language
    [Show full text]
  • Web and Semantic Web Query Languages: a Survey
    Web and Semantic Web Query Languages: A Survey James Bailey1, Fran¸coisBry2, Tim Furche2, and Sebastian Schaffert2 1 NICTA Victoria Laboratory Department of Computer Science and Software Engineering The University of Melbourne, Victoria 3010, Australia http://www.cs.mu.oz.au/~jbailey/ 2 Institute for Informatics,University of Munich, Oettingenstraße 67, 80538 M¨unchen, Germany http://pms.ifi.lmu.de/ Abstract. A number of techniques have been developed to facilitate powerful data retrieval on the Web and Semantic Web. Three categories of Web query languages can be distinguished, according to the format of the data they can retrieve: XML, RDF and Topic Maps. This ar- ticle introduces the spectrum of languages falling into these categories and summarises their salient aspects. The languages are introduced us- ing common sample data and query types. Key aspects of the query languages considered are stressed in a conclusion. 1 Introduction The Semantic Web Vision A major endeavour in current Web research is the so-called Semantic Web, a term coined by W3C founder Tim Berners-Lee in a Scientific American article describing the future of the Web [37]. The Semantic Web aims at enriching Web data (that is usually represented in (X)HTML or other XML formats) by meta-data and (meta-)data processing specifying the “meaning” of such data and allowing Web based systems to take advantage of “intelligent” reasoning capabilities. To quote Berners-Lee et al. [37]: “The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.” The Semantic Web meta-data added to today’s Web can be seen as advanced semantic indices, making the Web into something rather like an encyclopedia.
    [Show full text]