Xpath Processing in a Nutshell∗

Total Page:16

File Type:pdf, Size:1020Kb

Xpath Processing in a Nutshell∗ XPath Processing in a Nutshell∗ Georg Gottlob, Christoph Koch, and Reinhard Pichler Database and Artificial Intelligence Group Technische Universit¨at Wien, A-1040 Vienna, Austria fgottlob, [email protected], [email protected] Abstract size and intricacy of the queries. (Database theoreti- cians refer to this as combined complexity.) We provide a concise yet complete formal We present main-memory XPath processing algo- definition of the semantics of XPath 1 and rithms that run in time guaranteed to be polynomial summarize efficient algorithms for processing in the size of the complete input, i.e. query and data. queries in this language. Our presentation is As experimentally verified in [3], where these and other intended both for the reader who is looking results where first presented, contemporary XPath en- for a short but comprehensive formal account gines do not have this property, and require time ex- of XPath as well as the software developer in ponential in the size of queries at worst. need of material that facilitates the rapid im- We present polynomial-time XPath evaluation algo- plementation of XPath engines. rithms in two flavors, bottom-up and top-down2. Both algorithms have the same worst-case polynomial-time 1 Introduction bounds. The former approach comes with a clear and XPath [10] is a practical language for selecting nodes simple intuition why our algorithms are polynomial, from XML document trees. Its importance stems from while the latter requires to compute considerably fewer its suitability as an XML query language per se and it useless intermediate results and consequently performs being at the core of several other XML-related tech- much better in practice. nologies, such as XSLT, XPointer, and XQuery. The structure of this paper is as follows. Sec- Previous semantics definitions of XPath have either tion 2 introduces XPath axes. Section 3 presents the been overly lengthy [10, 12], or were restricted to frag- data model of XPath and auxiliary functions used ments of the language (e.g. [6, 7]). In this paper, we throughout the paper. Section 4 defines the seman- provide a formal semantics definition of full XPath, in- tics of XPath in a concise way. Section 5 houses the tended for the implementor or the reader who values bottom-up semantics definition and algorithm. Sec- a concise, formal, and functional description. As an tion 6 comes up with the modifications to obtain a extension to [3], we also provide a brief but complete top-down algorithm. We conclude with Section 7. definition of the XPath data model. Given a query language, the prototypical algorith- 2 XPath Axes mic problem is query evaluation, and its hardness needs to be formally studied. The query evaluation In XPath, an XML document is viewed as an unranked problem was not addressed for full XPath in any depth (i.e., nodes may have a variable number of children), until recently [3, 2, 4]1. We summarize the main ideas ordered, and labeled tree. Before we make the data of that work in this paper, with an emphasis on use- model used by XPath precise (which distinguishes be- fulness to a wide audience. tween several types of tree nodes) in Section 3, we Since XPath and related technologies will be tested introduce the main mode of navigation in document in ever-growing deployment scenarios, implementa- trees employed by XPath { axes { in the abstract, ig- tions of XPath engines need to scale well both with noring node types. We will point out how to deal with respect to the size of the XML data and the growing different node types in Section 3. All of the artifacts of this and the next section ∗ This work was supported by the Austrian Science Fund (FWF) under project No. Z29-INF. All methods and algorithms are defined in the context of a given XML document. presented in this paper are covered by a pending patent. Fur- Given a document tree, let dom be the set of its nodes, ther resources, updates, and possible corrections are available and let us use the two functions at http://www.xmltaskforce.com. 1The novelty of our results is a bit surprising, as considerable effort has been made in the past to understand the XPath query firstchild, nextsibling : dom ! dom; containment problem (e.g. [1, 5, 8]), which { with its relevance to query optimization { is only a means to the end of efficient 2By this we refer to the mode of traversal of the expression query evaluation. tree of the query while computing the query result. ∗ child := firstchild:nextsibling function −1 ∗ −1 evalχ1[χ2 (S) := evalχ1 (S) [ evalχ2 (S). parent := (nextsibling ) :firstchild ∗ ∗ function eval(R1[:::[Rn) (S) begin descendant := firstchild:(firstchild [ nextsibling) 0 0 −1 −1 ∗ −1 S := S; /* S is represented as a list */ ancestor := (firstchild [ nextsibling ) :firstchild while there is a next element x in S0 do descendant-or-self := descendant [ self append fRi(x) j 1 ≤ i ≤ n; Ri(x) 6= null; ancestor-or-self := ancestor [ self 0 0 Ri(x) 62 S g to S ; following := ancestor-or-self:nextsibling: 0 nextsibling∗:descendant-or-self return S ; preceding := ancestor-or-self:nextsibling−1: end; −1 ∗ (nextsibling ) :descendant-or-self where S ⊆ dom is a set of nodes of an XML document, following-sibling := nextsibling:nextsibling∗ 1 1 e and e are regular expressions, R, R , : : :, Rn are preceding-sibling := (nextsibling− )∗:nextsibling− 1 2 1 primitive relations, χ1 and χ2 are axes, and χ is an axis other than \self". Table 1: Axis definitions in terms of \primitive" tree relations “firstchild", \nextsibling", and their inverses. Clearly, some axes could have been defined in a simpler way in Table 1 (e.g., ancestor equals ∗ 3 parent.parent ). However, the definitions, which use to represent its structure . “firstchild" returns the a limited form of regular expressions only, allow to first child of a node (if there are any children, i.e., compute χ(S) in a very simple way, as evidenced by the node is not a leaf), and otherwise \null". Let Algorithm 2.2. n ; : : : ; nk be the children of some node in document 1 The function eval ∗ essentially computes order. Then, nextsibling(n ) = n , i.e., \nextsib- (R1[:::[Rn) i i+1 graph reachability (not transitive closure). It can be ling" returns the neighboring node to the right, if it implemented to run in linear time in terms of the data exists, and \null" otherwise (if i = k). We define the in a straightforward manner; (non)membership in S0 is functions firstchild−1 and nextsibling−1 as the inverses checked in constant time using a direct-access version of the former two functions, where \null" is returned if of S0 maintained in parallel to its list representation no inverse exists for a given node. Where appropriate, (naively, this could be an array of bits, one for each we will use binary relations of the same name instead member of dom, telling which nodes are in S0). of the functions. (fhx; f(x)i j x 2 dom; f(x) 6= nullg is the binary relation for function f.) Lemma 2.3 Let S ⊆ dom be a set of nodes of an The axes self , child, parent, descendant, ancestor, XML document and χ be an axis. Then, (1) χ(S) = descendant-or-self , ancestor-or-self , following, preced- evalχ(S) and (2) Algorithm 2.2 runs in time O(jdomj). ing, following-sibling, and preceding-sibling are binary relations χ ⊆ dom × dom. Let self := fhx; xi j x 2 3 Data Model domg. The other axes are defined in terms of our \primitive" relations “firstchild" and \nextsibling" as Let dom be the set of nodes in the document tree as ∗ shown in Table 1 (cf. [10]). R1:R2, R1 [R2, and R1 de- introduced in the previous section. Each node is of one note the concatenation, union, and reflexive and tran- of seven types, namely root, element, text, comment, sitive closure, respectively, of binary relations R1 and attribute, namespace, and processing instruction. As R2. Let E(χ) denote the regular expression defining in DOM [9], the root node of the document is the only χ in Table 1. It is important to observe that some one of type \root", and is the parent of the document axes are defined in terms of other axes, but that these element node of the XML document. The main type of definitions are acyclic. non-terminal node is \element", the other node types are self-explaining (cf. [10]). Nodes of all types besides Definition 2.1 (Axis Function) Let χ denote an \text" and \comment" have a name associated with it. XPath axis relation. We define the function χ : A node test is an expression of the form τ() (where 2dom ! 2dom as χ(X ) = fx j 9x 2 X : x χxg τ is a node type or the wildcard \node", matching any 0 0 0 0 type) or τ(n) (where n is a node name and τ is a type (and thus overload the relation name χ), where X0 ⊆ dom is a set of nodes. whose nodes have a name). τ(∗) is equivalent to τ(). We define a function T which maps each node test Algorithm 2.2 (Axis Evaluation) to the subset of dom that satisfies it. For instance, Input: A set of nodes S and an axis χ T (node()) = dom and T (attribute(href)) returns all Output: χ(S) attribute nodes labeled \href". Method: evalχ(S) Example 3.1 Consider a document consisting of six function eval (S) := S. nodes { the root node r, which is the parent of self the document element node a (labeled \a"), and function evalχ(S) := evalE(χ)(S).
Recommended publications
  • XML a New Web Site Architecture
    XML A New Web Site Architecture Jim Costello Derek Werthmuller Darshana Apte Center for Technology in Government University at Albany, SUNY 1535 Western Avenue Albany, NY 12203 Phone: (518) 442-3892 Fax: (518) 442-3886 E-mail: [email protected] http://www.ctg.albany.edu September 2002 © 2002 Center for Technology in Government The Center grants permission to reprint this document provided this cover page is included. Table of Contents XML: A New Web Site Architecture .......................................................................................................................... 1 A Better Way? ......................................................................................................................................................... 1 Defining the Problem.............................................................................................................................................. 1 Partial Solutions ...................................................................................................................................................... 2 Addressing the Root Problems .............................................................................................................................. 2 Figure 1. Sample XML file (all code simplified for example) ...................................................................... 4 Figure 2. Sample XSL File (all code simplified for example) ....................................................................... 6 Figure 3. Formatted Page Produced
    [Show full text]
  • Bibliography of Erik Wilde
    dretbiblio dretbiblio Erik Wilde's Bibliography References [1] AFIPS Fall Joint Computer Conference, San Francisco, California, December 1968. [2] Seventeenth IEEE Conference on Computer Communication Networks, Washington, D.C., 1978. [3] ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Los Angeles, Cal- ifornia, March 1982. ACM Press. [4] First Conference on Computer-Supported Cooperative Work, 1986. [5] 1987 ACM Conference on Hypertext, Chapel Hill, North Carolina, November 1987. ACM Press. [6] 18th IEEE International Symposium on Fault-Tolerant Computing, Tokyo, Japan, 1988. IEEE Computer Society Press. [7] Conference on Computer-Supported Cooperative Work, Portland, Oregon, 1988. ACM Press. [8] Conference on Office Information Systems, Palo Alto, California, March 1988. [9] 1989 ACM Conference on Hypertext, Pittsburgh, Pennsylvania, November 1989. ACM Press. [10] UNIX | The Legend Evolves. Summer 1990 UKUUG Conference, Buntingford, UK, 1990. UKUUG. [11] Fourth ACM Symposium on User Interface Software and Technology, Hilton Head, South Carolina, November 1991. [12] GLOBECOM'91 Conference, Phoenix, Arizona, 1991. IEEE Computer Society Press. [13] IEEE INFOCOM '91 Conference on Computer Communications, Bal Harbour, Florida, 1991. IEEE Computer Society Press. [14] IEEE International Conference on Communications, Denver, Colorado, June 1991. [15] International Workshop on CSCW, Berlin, Germany, April 1991. [16] Third ACM Conference on Hypertext, San Antonio, Texas, December 1991. ACM Press. [17] 11th Symposium on Reliable Distributed Systems, Houston, Texas, 1992. IEEE Computer Society Press. [18] 3rd Joint European Networking Conference, Innsbruck, Austria, May 1992. [19] Fourth ACM Conference on Hypertext, Milano, Italy, November 1992. ACM Press. [20] GLOBECOM'92 Conference, Orlando, Florida, December 1992. IEEE Computer Society Press. http://github.com/dret/biblio (August 29, 2018) 1 dretbiblio [21] IEEE INFOCOM '92 Conference on Computer Communications, Florence, Italy, 1992.
    [Show full text]
  • QUERYING JSON and XML Performance Evaluation of Querying Tools for Offline-Enabled Web Applications
    QUERYING JSON AND XML Performance evaluation of querying tools for offline-enabled web applications Master Degree Project in Informatics One year Level 30 ECTS Spring term 2012 Adrian Hellström Supervisor: Henrik Gustavsson Examiner: Birgitta Lindström Querying JSON and XML Submitted by Adrian Hellström to the University of Skövde as a final year project towards the degree of M.Sc. in the School of Humanities and Informatics. The project has been supervised by Henrik Gustavsson. 2012-06-03 I hereby certify that all material in this final year project which is not my own work has been identified and that no work is included for which a degree has already been conferred on me. Signature: ___________________________________________ Abstract This article explores the viability of third-party JSON tools as an alternative to XML when an application requires querying and filtering of data, as well as how the application deviates between browsers. We examine and describe the querying alternatives as well as the technologies we worked with and used in the application. The application is built using HTML 5 features such as local storage and canvas, and is benchmarked in Internet Explorer, Chrome and Firefox. The application built is an animated infographical display that uses querying functions in JSON and XML to filter values from a dataset and then display them through the HTML5 canvas technology. The results were in favor of JSON and suggested that using third-party tools did not impact performance compared to native XML functions. In addition, the usage of JSON enabled easier development and cross-browser compatibility. Further research is proposed to examine document-based data filtering as well as investigating why performance deviated between toolsets.
    [Show full text]
  • XML Information Set
    XML Information Set XML Information Set W3C Recommendation 24 October 2001 This version: http://www.w3.org/TR/2001/REC-xml-infoset-20011024 Latest version: http://www.w3.org/TR/xml-infoset Previous version: http://www.w3.org/TR/2001/PR-xml-infoset-20010810 Editors: John Cowan, [email protected] Richard Tobin, [email protected] Copyright ©1999, 2000, 2001 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Abstract This specification provides a set of definitions for use in other specifications that need to refer to the information in an XML document. Status of this Document This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C. This is the W3C Recommendation of the XML Information Set. This document has been reviewed by W3C Members and other interested parties and has been endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the file:///C|/Documents%20and%20Settings/immdca01/Desktop/xml-infoset.html (1 of 16) [8/12/2002 10:38:58 AM] XML Information Set Web. This document has been produced by the W3C XML Core Working Group as part of the XML Activity in the W3C Architecture Domain.
    [Show full text]
  • Scraping HTML with Xpath Stéphane Ducasse, Peter Kenny
    Scraping HTML with XPath Stéphane Ducasse, Peter Kenny To cite this version: Stéphane Ducasse, Peter Kenny. Scraping HTML with XPath. published by the authors, pp.26, 2017. hal-01612689 HAL Id: hal-01612689 https://hal.inria.fr/hal-01612689 Submitted on 7 Oct 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Scraping HTML with XPath Stéphane Ducasse and Peter Kenny Square Bracket tutorials September 28, 2017 master @ a0267b2 Copyright 2017 by Stéphane Ducasse and Peter Kenny. The contents of this book are protected under the Creative Commons Attribution-ShareAlike 3.0 Unported license. You are free: • to Share: to copy, distribute and transmit the work, • to Remix: to adapt the work, Under the following conditions: Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license. For any reuse or distribution, you must make clear to others the license terms of this work.
    [Show full text]
  • SVG-Based Knowledge Visualization
    MASARYK UNIVERSITY FACULTY}w¡¢£¤¥¦§¨ OF I !"#$%&'()+,-./012345<yA|NFORMATICS SVG-based Knowledge Visualization DIPLOMA THESIS Miloš Kaláb Brno, spring 2012 Declaration Hereby I declare, that this paper is my original authorial work, which I have worked out by my own. All sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source. Advisor: RNDr. Tomáš Gregar Ph.D. ii Acknowledgement I would like to thank RNDr. Tomáš Gregar Ph.D. for supervising the thesis. His opinions, comments and advising helped me a lot with accomplishing this work. I would also like to thank to Dr. Daniel Sonntag from DFKI GmbH. Saarbrücken, Germany, for the opportunity to work for him on the Medico project and for his supervising of the thesis during my erasmus exchange in Germany. Big thanks also to Jochen Setz from Dr. Sonntag’s team who worked on the server background used by my visualization. Last but not least, I would like to thank to my family and friends for being extraordinary supportive. iii Abstract The aim of this thesis is to analyze the visualization of semantic data and sug- gest an approach to general visualization into the SVG format. Afterwards, the approach is to be implemented in a visualizer allowing user to customize the visualization according to the nature of the data. The visualizer was integrated as an extension of Fresnel Editor. iv Keywords Semantic knowledge, SVG, Visualization, JavaScript, Java, XML, Fresnel, XSLT v Contents Introduction . .3 1 Brief Introduction to the Related Technologies ..........5 1.1 XML – Extensible Markup Language ..............5 1.1.1 XSLT – Extensible Stylesheet Lang.
    [Show full text]
  • XPATH in NETCONF and YANG Table of Contents
    XPATH IN NETCONF AND YANG Table of Contents 1. Introduction ............................................................................................................3 2. XPath 1.0 Introduction ...................................................................................3 3. The Use of XPath in NETCONF ...............................................................4 4. The Use of XPath in YANG .........................................................................5 5. XPath and ConfD ...............................................................................................8 6. Conclusion ...............................................................................................................9 7. Additional Resourcese ..................................................................................9 2 XPath in NETCONF and YANG 1. Introduction XPath is a powerful tool used by NETCONF and YANG. This application note will help you to understand and utilize this advanced feature of NETCONF and YANG. This application note gives a brief introduction to XPath, then describes how XPath is used in NETCONF and YANG, and finishes with a discussion of XPath in ConfD. The XPath 1.0 standard was defined by the W3C in 1999. It is a language which is used to address the parts of an XML document and was originally design to be used by XML Transformations. XPath gets its name from its use of path notation for navigating through the hierarchical structure of an XML document. Since XML serves as the encoding format for NETCONF and a data model defined in YANG is represented in XML, it was natural for NETCONF and XML to utilize XPath. 2. XPath 1.0 Introduction XML Path Language, or XPath 1.0, is a W3C recommendation first introduced in 1999. It is a language that is used to address and match parts of an XML document. XPath sees the XML document as a tree containing different kinds of nodes. The types of nodes can be root, element, text, attribute, namespace, processing instruction, and comment nodes.
    [Show full text]
  • Pearls of XSLT/Xpath 3.0 Design
    PEARLS OF XSLT AND XPATH 3.0 DESIGN PREFACE XSLT 3.0 and XPath 3.0 contain a lot of powerful and exciting new capabilities. The purpose of this paper is to highlight the new capabilities. Have you got a pearl that you would like to share? Please send me an email and I will add it to this paper (and credit you). I ask three things: 1. The pearl highlights a capability that is new to XSLT 3.0 or XPath 3.0. 2. Provide a short, complete, working stylesheet with a sample input document. 3. Provide a brief description of the code. This is an evolving paper. As new pearls are found, they will be added. TABLE OF CONTENTS 1. XPath 3.0 is a composable language 2. Higher-order functions 3. Partial functions 4. Function composition 5. Recursion with anonymous functions 6. Closures 7. Binary search trees 8. -- next pearl is? -- CHAPTER 1: XPATH 3.0 IS A COMPOSABLE LANGUAGE The XPath 3.0 specification says this: XPath 3.0 is a composable language What does that mean? It means that every operator and language construct allows any XPath expression to appear as its operand (subject only to operator precedence and data typing constraints). For example, take this expression: 3 + ____ The plus (+) operator has a left-operand, 3. What can the right-operand be? Answer: any XPath expression! Let's use the max() function as the right-operand: 3 + max(___) Now, what can the argument to the max() function be? Answer: any XPath expression! Let's use a for- loop as its argument: 3 + max(for $i in 1 to 10 return ___) Now, what can the return value of the for-loop be? Answer: any XPath expression! Let's use an if- statement: 3 + max(for $i in 1 to 10 return (if ($i gt 5) then ___ else ___))) And so forth.
    [Show full text]
  • Renderx XEP User Guide XEP User Guide
    RenderX XEP User Guide XEP User Guide © Copyright 2005-2019 RenderX, Inc. All rights reserved. This documentation contains proprietary information belonging to RenderX, and is provided under a license agreement containing restrictions on use and disclosure. It is also protected by international copyright law. Because of continued product development, the information contained in this document may change without notice.The information and intellectual property contained herein are confidential and remain the exclusive intellectual property of RenderX. If you find any problems in the documentation, please report them to us in writing. RenderX does not warrant that this document is error- free. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means - electronic, mechanical, photocopying, recording or otherwise - without the prior written permission of RenderX. RenderX Telephone: 1 (650) 328-8000 Fax: 1 (650) 328-8008 Website: http://renderx.com Email: [email protected] Table of Contents 1. Preface ................................................................................................................... 9 1.1. What©s in this Document? .............................................................................. 9 1.2. Prerequisites ................................................................................................ 9 1.3. Acronyms ................................................................................................... 10 1.4. Technical Support
    [Show full text]
  • XSLT Can Be One of the Following Five Types
    XPath Quick Reference Node Types Nodes can be of the following types: NodeType Name String Value root Concatenation of alI text nodes in the document element Element name Concatenation of alI text nodes descended from the element attribute Attribute name Attribute value text Text cornrnent Text held within the comment processing Processing Text in the processing instruction after the space after the instruction target targetname narnespace Namespace prefix Namespace URI Object Types Values in XSLT can be one of the following five types: Object Type Description boolean true or false number A floating-point number: a double-precision 64-bit format IEEE 754 value Table continued on following page Appendix A Object Type Description string A sequence of XML characters node set An unordered collection of nodes result tree fragment The root node of a generated node tree The various object types can be converted to each other by the following rules. Object Boolean Number String boolean 1 if the boolean is , true' if the boolean is true, O if the true, 'false' if the boolean is false boolean is false number false ifthe A string version of the number is O or number with no leading or NaN, otherwise trailing zeros; 'NaN' if the true number is NaN, , Infinity' if the number is infinity, and '-Infinity' ifthe number is negative infinity string false ifthe The number held string is an empty by the string, as string, otherwise long as it is a valid true XPath number (which doesn't include exponents or grouping separators) ; otherwise NaN node set true if the node The result of The string value of the first set holds nodes; converting the node in the node set false ifthe string value of the node set is empty first node in the node set to a number result tree true The result of The string value of the re suit fragment converting the tree fragment string value of the re suIt tree fragment to a number Booleans, numbers, and strings cannot be converted to node sets or result tree fragments.
    [Show full text]
  • Decentralized Identifier WG F2F Sessions
    Decentralized Identifier WG F2F Sessions Day 1: January 29, 2020 Chairs: Brent Zundel, Dan Burnett Location: Microsoft Schiphol 1 Welcome! ● Logistics ● W3C WG IPR Policy ● Agenda ● IRC and Scribes ● Introductions & Dinner 2 Logistics ● Location: “Spaces”, 6th floor of Microsoft Schiphol ● WiFi: SSID Publiek_theOutlook, pwd Hello2020 ● Dial-in information: +1-617-324-0000, Meeting ID ● Restrooms: End of the hall, turn right ● Meeting time: 8 am - 5 pm, Jan. 29-31 ● Breaks: 10:30-11 am, 12:30-1:30 pm, 2:30-3 pm ● DID WG Agenda: https://tinyurl.com/didwg-ams2020-agenda (HTML) ● Live slides: https://tinyurl.com/didwg-ams2020-slides (Google Slides) ● Dinner Details: See the “Dinner Tonight” slide at the end of each day 3 W3C WG IPR Policy ● This group abides by the W3C patent policy https://www.w3.org/Consortium/Patent-Policy-20040205 ● Only people and companies listed at https://www.w3.org/2004/01/pp-impl/117488/status are allowed to make substantive contributions to the specs ● Code of Conduct https://www.w3.org/Consortium/cepc/ 4 Today’s agenda 8:00 Breakfast 8:30 Welcome, Introductions, and Logistics Chairs 9:00 Level setting Chairs 9:30 Security issues Brent 10:15 DID and IoT Sam Smith 10:45 Break 11:00 Multiple Encodings/Different Syntaxes: what might we want to support Markus 11:30 Different encodings: model incompatibilities Manu 12:00 Abstract data modeling options Dan Burnett 12:30 Lunch (brief “Why Are We Here?” presentation) Christopher Allen 13:30 DID Doc Extensibility via Registries Mike 14:00 DID Doc Extensibility via JSON-LD Manu
    [Show full text]
  • Prevent Xpath and CSS Based Scrapers by Using Markup Randomization
    The Islamic University of Gaza الجـامعـــــــــ ـة اﻹســـــﻻميـ ـة بغـ ـزة عم ا ة البح ل العلم والد اس ات العلي ا Deanship of Research and Graduate Studies كـليــــ ـة تكنولوجي ا المعلوم ات Faculty of Information Technology ماجس تير تكنولوجي ا المعلوم ات Master of Information Technology Prevent XPath and CSS Based Scrapers by Using Markup Randomization منع جمع المعلومات بالطريقة المعتمدة على XPath و CSS باستخدام عشوائية الترميز By Ahmed Mustafa Ibrahim Diab Supervised by Dr. Tawfiq S. Barhoom Associate Prof. of Applied Computer Technology A thesis submitted in partial fulfilment of the requirements for the degree of Master of Information Technology September/2018 إقـــــــــــــرار أنا الموقع أدناه مقدم الرسالة التي تحمل العنوان: Prevent XPath and CSS Based Scrapers by Using Markup Randomization منع جمع المعلومات بالطريقة المعتمدة على XPath وCSS باستخدام عشوائية الترميز أقر بأن ما اشتملت عليه هذه الرسالة إنما هو نتاج جهدي الخاص، باستثناء ما تمت اﻹشارة إليه حيثما ورد، وأن هذه الرسالة ككل أو أي جزء منها لم يقدم من قبل اﻻخرين لنيل درجة أو لقب علمي أو بحثي لدى أي مؤسسة تعليمية أو بحثية أخرى. Declaration I understand the nature of plagiarism, and I am aware of the University’s policy on this. The work provided in this thesis, unless otherwise referenced, is the researcher's own work, and has not been submitted by others elsewhere for any other degree or qualification. احمد مصطفى إبراهيم دياب :Student's name اسم الطالب: Signature: التوقيع: Date: التاريخ: 28/08/2018 I I Abstract Web Scraping is a useful technique can be used in an ethical way such as climate and many researching fields, on the other hand, unethical way such as exploit content privacy, which is Data Theft.
    [Show full text]