Content Processing Framework Guide (PDF)

Total Page:16

File Type:pdf, Size:1020Kb

Content Processing Framework Guide (PDF) MarkLogic Server Content Processing Framework Guide 2 MarkLogic 9 May, 2017 Last Revised: 9.0-7, September 2018 Copyright © 2019 MarkLogic Corporation. All rights reserved. MarkLogic Server Version MarkLogic 9—May, 2017 Page 2—Content Processing Framework Guide MarkLogic Server Table of Contents Table of Contents Content Processing Framework Guide 1.0 Overview of the Content Processing Framework ..........................................7 1.1 Making Content More Useful .................................................................................7 1.1.1 Getting Your Content Into XML Format ....................................................7 1.1.2 Striving For Clean, Well-Structured XML .................................................8 1.1.3 Enriching Content With Semantic Tagging, Metadata, etc. .......................8 1.2 Access Internal and External Web Services ...........................................................8 1.3 Components of the Content Processing Framework ...............................................9 1.3.1 Domains ......................................................................................................9 1.3.2 Pipelines ......................................................................................................9 1.3.3 XQuery Functions and Modules .................................................................9 1.3.4 Pre-Commit and Post-Commit Triggers ...................................................10 1.3.5 Creating Custom Applications With the Content Processing Framework 11 1.4 Default Conversion Option ...................................................................................11 1.4.1 Microsoft Office and Adobe PDF Conversion .........................................11 1.4.2 HTML Conversion and Enrichment .........................................................11 2.0 Getting Started with a Simple CPF Application ..........................................13 2.1 Overview of Example CPF Application ...............................................................14 2.2 Create the Action Modules ...................................................................................15 2.2.1 add-copyright.xqy .....................................................................................15 2.2.2 add-last-updated.xqy .................................................................................16 2.3 Insert the Action Modules into the Modules Database .........................................16 2.4 Create the Pipeline ................................................................................................17 2.5 Configure a Database for Content Processing ......................................................19 2.6 Insert and Update a Document in the Database ....................................................22 2.7 View the Properties Document .............................................................................23 2.8 Extend the CPF Application .................................................................................24 3.0 Understanding and Using Domains .............................................................29 3.1 Overview of Domains ...........................................................................................29 3.2 Domain Scope and Code Evaluation Context .......................................................29 3.2.1 Domain Scope ...........................................................................................30 3.2.2 Evaluation Context ...................................................................................30 3.2.3 Domain Scope Can Encapsulate Processing Logic ..................................31 3.3 Rules for Domains ................................................................................................31 3.3.1 Do Not Overlap Domains .........................................................................31 3.3.2 Collection Domain Scope Notes ...............................................................32 3.4 Creating and Modifying Domains ........................................................................32 MarkLogic 9—May, 2017 Content Processing Framework Guide—Page 3 MarkLogic Server Version MarkLogic 9—May, 2017 Table of Contents 3.5 Attaching and Detaching Pipelines to Domains ...................................................33 4.0 Understanding and Using Pipelines .............................................................35 4.1 Pipeline Architecture ............................................................................................35 4.1.1 Automatic Status (Event) Handling ..........................................................35 4.1.2 Transitioning Between States ...................................................................36 4.1.3 Pipelines Can Flow Through Other Pipelines ...........................................37 4.2 Viewing Pipelines in the Admin Interface ............................................................38 4.3 Loading Pipelines With the Admin Interface .......................................................39 4.4 XML Format of a Pipeline ....................................................................................40 4.4.1 Sample Pipeline XML Document .............................................................41 4.4.2 Success-Action and Failure-Action ..........................................................42 4.4.3 Status Transitions ......................................................................................42 4.4.3.1 Status .........................................................................................43 4.4.3.2 On Success and On Failure .......................................................43 4.4.3.3 Default Action ...........................................................................43 4.4.3.4 Priority .......................................................................................43 4.4.3.5 Execute ......................................................................................44 4.4.4 State Transitions .......................................................................................44 4.4.4.1 State ...........................................................................................44 4.4.4.2 On Success and On Failure .......................................................45 4.4.4.3 Default Action ...........................................................................45 4.4.4.4 Priority .......................................................................................45 4.4.4.5 Execute ......................................................................................46 4.5 XQuery Functions to Manage Pipelines ...............................................................46 5.0 Developing Modules to Process Content .....................................................47 5.1 Overview of Modules ...........................................................................................47 5.2 Loading Modules Into the Database .....................................................................47 5.3 External Variables Available to Modules .............................................................48 5.4 Design Patterns and Rules ....................................................................................49 5.4.1 Condition Modules Must Return a Boolean .............................................49 5.4.2 Condition Modules Should Not Update Documents ................................49 5.4.3 Action Modules Use try/catch With cpf:success and cpf:failure ..............49 5.4.4 Action Modules Operate On a Single Document .....................................50 5.4.5 Action Modules Must Be a Single Transaction ........................................51 5.4.6 Using XSLT Stylesheets Instead of Action Modules ...............................51 6.0 Using the Framework to Create Custom Applications ................................53 6.1 Install Content Processing Framework in Database .............................................53 6.2 Decide on the Stages and Logic for Your Processing ..........................................54 6.3 Create and Load Your Pipelines ...........................................................................55 6.4 Create and Load Your Modules ............................................................................55 6.5 Design Patterns For Content Applications ............................................................55 6.6 Microsoft Office 2007 and Later Documents .......................................................56 Page 4—Content Processing Framework Guide MarkLogic Server Table of Contents 6.7 Other CPF Applications Included With MarkLogic Server .................................56 7.0 Security Considerations With Content Processing ......................................57 7.1 Security Requirements for Users Who Create or Modify Documents .................57 7.2 Security Requirements When Modules Perform Privileged Operations ..............57 7.3 Security Roles for Managing Content Processing ................................................58 8.0 Debugging and Recovering from Error Conditions .....................................59 8.1 Database Online Events ........................................................................................59 8.2 Disabling Content Processing Triggers ................................................................59 8.3 Content Processing Framework Trace Events ......................................................61 8.3.1 List of Trace Events ..................................................................................61
Recommended publications
  • Semantics Developer's Guide
    MarkLogic Server Semantic Graph Developer’s Guide 2 MarkLogic 10 May, 2019 Last Revised: 10.0-8, October, 2021 Copyright © 2021 MarkLogic Corporation. All rights reserved. MarkLogic Server MarkLogic 10—May, 2019 Semantic Graph Developer’s Guide—Page 2 MarkLogic Server Table of Contents Table of Contents Semantic Graph Developer’s Guide 1.0 Introduction to Semantic Graphs in MarkLogic ..........................................11 1.1 Terminology ..........................................................................................................12 1.2 Linked Open Data .................................................................................................13 1.3 RDF Implementation in MarkLogic .....................................................................14 1.3.1 Using RDF in MarkLogic .........................................................................15 1.3.1.1 Storing RDF Triples in MarkLogic ...........................................17 1.3.1.2 Querying Triples .......................................................................18 1.3.2 RDF Data Model .......................................................................................20 1.3.3 Blank Node Identifiers ..............................................................................21 1.3.4 RDF Datatypes ..........................................................................................21 1.3.5 IRIs and Prefixes .......................................................................................22 1.3.5.1 IRIs ............................................................................................22
    [Show full text]
  • Bibliography of Erik Wilde
    dretbiblio dretbiblio Erik Wilde's Bibliography References [1] AFIPS Fall Joint Computer Conference, San Francisco, California, December 1968. [2] Seventeenth IEEE Conference on Computer Communication Networks, Washington, D.C., 1978. [3] ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Los Angeles, Cal- ifornia, March 1982. ACM Press. [4] First Conference on Computer-Supported Cooperative Work, 1986. [5] 1987 ACM Conference on Hypertext, Chapel Hill, North Carolina, November 1987. ACM Press. [6] 18th IEEE International Symposium on Fault-Tolerant Computing, Tokyo, Japan, 1988. IEEE Computer Society Press. [7] Conference on Computer-Supported Cooperative Work, Portland, Oregon, 1988. ACM Press. [8] Conference on Office Information Systems, Palo Alto, California, March 1988. [9] 1989 ACM Conference on Hypertext, Pittsburgh, Pennsylvania, November 1989. ACM Press. [10] UNIX | The Legend Evolves. Summer 1990 UKUUG Conference, Buntingford, UK, 1990. UKUUG. [11] Fourth ACM Symposium on User Interface Software and Technology, Hilton Head, South Carolina, November 1991. [12] GLOBECOM'91 Conference, Phoenix, Arizona, 1991. IEEE Computer Society Press. [13] IEEE INFOCOM '91 Conference on Computer Communications, Bal Harbour, Florida, 1991. IEEE Computer Society Press. [14] IEEE International Conference on Communications, Denver, Colorado, June 1991. [15] International Workshop on CSCW, Berlin, Germany, April 1991. [16] Third ACM Conference on Hypertext, San Antonio, Texas, December 1991. ACM Press. [17] 11th Symposium on Reliable Distributed Systems, Houston, Texas, 1992. IEEE Computer Society Press. [18] 3rd Joint European Networking Conference, Innsbruck, Austria, May 1992. [19] Fourth ACM Conference on Hypertext, Milano, Italy, November 1992. ACM Press. [20] GLOBECOM'92 Conference, Orlando, Florida, December 1992. IEEE Computer Society Press. http://github.com/dret/biblio (August 29, 2018) 1 dretbiblio [21] IEEE INFOCOM '92 Conference on Computer Communications, Florence, Italy, 1992.
    [Show full text]
  • Ontology Matching • Semantic Social Networks and Peer-To-Peer Systems
    The web: from XML to OWL Rough Outline 1. Foundations of XML (Pierre Genevès & Nabil Layaïda) • Core XML • Programming with XML Development of the future web • Foundations of XML types (tree grammars, tree automata) • Tree logics (FO, MSO, µ-calculus) • Expressing information ! Languages • A taste of research: introduction to some grand challenges • Manipulating it ! Algorithms 2. Semantics of knowledge representation on the web (Jérôme Euzenat & • in the most correct, efficient and ! Logic Marie-Christine Rousset) meaningful way ! Semantics • Semantic web languages (URI, RDF, RDFS and OWL) • Querying RDF and RDFS (SPARQL) • Querying data though ontologies (DL-Lite) • Ontology matching • Semantic social networks and peer-to-peer systems 1 / 8 2 / 8 Foundations of XML Semantic web We will talk about languages, algorithms, and semantics for efficiently and meaningfully manipulating formalised knowledge. We will talk about languages, algorithms, and programming techniques for efficiently and safely manipulating XML data. You will learn about: You will learn about: • Expressing formalised knowledge on the semantic web (RDF) • Tree structured data (XML) ! Syntax and semantics ! Tree grammars & validation You will not learn about: • Expressing ontologies on the semantic • XML programming (XPath, XSLT...) You will not learn about: web (RDFS, OWL, DL-Lite) • Tagging pictures ! Queries & transformations ! Syntax and semantics • Hacking CGI scripts ! Reasoning • Sharing MP3 • Foundational theory & tools • HTML • Creating facebook ! Regular expressions
    [Show full text]
  • Access Control Models for XML
    Access Control Models for XML Abdessamad Imine Lorraine University & INRIA-LORIA Grand-Est Nancy, France [email protected] Outline • Overview on XML • Why XML Security? • Querying Views-based XML Data • Updating Views-based XML Data 2 Outline • Overview on XML • Why XML Security? • Querying Views-based XML Data • Updating Views-based XML Data 3 What is XML? • eXtensible Markup Language [W3C 1998] <files> "<record>! ""<name>Robert</name>! ""<diagnosis>Pneumonia</diagnosis>! "</record>! "<record>! ""<name>Franck</name>! ""<diagnosis>Ulcer</diagnosis>! "</record>! </files>" 4 What is XML? • eXtensible Markup Language [W3C 1998] <files>! <record>! /files" <name>Robert</name>! <diagnosis>! /record" /record" Pneumonia! </diagnosis> ! </record>! /name" /diagnosis" <record …>! …! </record>! Robert" Pneumonia" </files>! 5 XML for Documents • SGML • HTML - hypertext markup language • TEI - Text markup, language technology • DocBook - documents -> html, pdf, ... • SMIL - Multimedia • SVG - Vector graphics • MathML - Mathematical formulas 6 XML for Semi-Structered Data • MusicXML • NewsML • iTunes • DBLP http://dblp.uni-trier.de • CIA World Factbook • IMDB http://www.imdb.com/ • XBEL - bookmark files (in your browser) • KML - geographical annotation (Google Maps) • XACML - XML Access Control Markup Language 7 XML as Description Language • Java servlet config (web.xml) • Apache Tomcat, Google App Engine, ... • Web Services - WSDL, SOAP, XML-RPC • XUL - XML User Interface Language (Mozilla/Firefox) • BPEL - Business process execution language
    [Show full text]
  • There's a Lot to Know About XML, and It S Constantly Evolving. but You Don't
    < Day Day Up > • Table of Contents • Index • Reviews • Reader Reviews • Errata • Academic XML in a Nutshell, 3rd Edition By Elliotte Rusty Harold, W. Scott Means Publisher: O'Reilly Pub Date: September 2004 ISBN: 0-596-00764-7 Pages: 712 There's a lot to know about XML, and it s constantly evolving. But you don't need to commit every syntax, API, or XSLT transformation to memory; you only need to know where to find it. And if it's a detail that has to do with XML or its companion standards, you'll find it--clear, concise, useful, and well-organized--in the updated third edition of XML in a Nutshell. < Day Day Up > < Day Day Up > • Table of Contents • Index • Reviews • Reader Reviews • Errata • Academic XML in a Nutshell, 3rd Edition By Elliotte Rusty Harold, W. Scott Means Publisher: O'Reilly Pub Date: September 2004 ISBN: 0-596-00764-7 Pages: 712 Copyright Preface What This Book Covers What's New in the Third Edition Organization of the Book Conventions Used in This Book Request for Comments Acknowledgments Part I: XML Concepts Chapter 1. Introducing XML Section 1.1. The Benefits of XML Section 1.2. What XML Is Not Section 1.3. Portable Data Section 1.4. How XML Works Section 1.5. The Evolution of XML Chapter 2. XML Fundamentals Section 2.1. XML Documents and XML Files Section 2.2. Elements, Tags, and Character Data Section 2.3. Attributes Section 2.4. XML Names Section 2.5. References Section 2.6. CDATA Sections Section 2.7.
    [Show full text]
  • Skyfire: Data-Driven Seed Generation for Fuzzing
    Skyfire: Data-Driven Seed Generation for Fuzzing Junjie Wang, Bihuan Chen†, Lei Wei, and Yang Liu Nanyang Technological University, Singapore {wang1043, bhchen, l.wei, yangliu}@ntu.edu.sg †Corresponding Author Abstract—Programs that take highly-structured files as inputs Syntax Semantic normally process inputs in stages: syntax parsing, semantic check- Features Rules ing, and application execution. Deep bugs are often hidden in the <?xml version="1.0" application execution stage, and it is non-trivial to automatically encoding="utf- pass pass pass 8"?><xsl:stylesheet version="1.0" Syntax Semantic Application xmlns:xsl="http://www.w3 .org/1999/XSL/Transform" generate test inputs to trigger them. Mutation-based fuzzing gen- ><xsl:output xsl:use- √ attribute- Parsing Checking Execution erates test inputs by modifying well-formed seed inputs randomly sets=""/></xsl:stylesheet> Parsing Semantic or heuristically. Most inputs are rejected at the early syntax pars- Inputs Crashes ing stage. Differently, generation-based fuzzing generates inputs Errors Violations from a specification (e.g., grammar). They can quickly carry the ! ! X fuzzing beyond the syntax parsing stage. However, most inputs fail to pass the semantic checking (e.g., violating semantic rules), Fig. 1: Stages of Processing Highly-Structured Inputs which restricts their capability of discovering deep bugs. In this paper, we propose a novel data-driven seed generation approach, named Skyfire, which leverages the knowledge in the analysis [8, 9] that identifies those interesting bytes to mutate, vast amount of existing samples to generate well-distributed seed symbolic execution [10, 11, 12] that relies on constraint solving inputs for fuzzing programs that process highly-structured inputs.
    [Show full text]
  • Session 7 - Main Theme XML Information Rendering (Part I)
    XML for Java Developers G22.3033-002 Session 7 - Main Theme XML Information Rendering (Part I) Dr. Jean-Claude Franchitti New York University Computer Science Department Courant Institute of Mathematical Sciences 1 Agenda Summary of Previous Session Extensible Stylesheet Language Transformation (XSL-T) Extensible Stylesheet Language Formatting Object (XSL-FO) XML and Document/Content Management Introduction to XML Application Servers Working with XSLT-T and XSL-FO Processors Assignment 4a+4b (due in two week) 2 1 Summary of Previous Session Summary of Previous Session Document Object Model (DOM) Advanced XML Parser Technology JDOM: Java-Centric API for XML JAXP: Java API for XML Processing Parsers comparison Latest W3C APIs and Standards for Processing XML XML Infoset, DOM Level 3, Canonical XML XML Signatures, XBase, XInclude XML Schema Adjuncts Java-Based XML Data Processing Frameworks Assignment #3 3 XML-Based Rendering Development XML Software Development Methodology Language + Stepwise Process + Tools Rational Unified Process (RUP) vs. “XML Unified Process” XML Application Development Infrastructure Metadata Management (e.g., XMI) XSLT, XPath XSL-FO APIs (JAXP, JAXB, JDOM, SAX, DOM) XML Tools (e.g., XML Editors, Apache’s FOP, Antenna House’s XSL Formatter, HTML/CSS1/2/3, XHTML, XForms, WCAG XML App. Components Involved in the Rendering Phase: Application(s) of XML XML-based applications/services (markup language mediators) MOM, POP, Other Services (e.g., persistence) 4 Application Infrastructure Frameworks
    [Show full text]
  • Multi-Model Databases: a New Journey to Handle the Variety of Data
    0 Multi-model Databases: A New Journey to Handle the Variety of Data JIAHENG LU, Department of Computer Science, University of Helsinki IRENA HOLUBOVA´ , Department of Software Engineering, Charles University, Prague The variety of data is one of the most challenging issues for the research and practice in data management systems. The data are naturally organized in different formats and models, including structured data, semi- structured data and unstructured data. In this survey, we introduce the area of multi-model DBMSs which build a single database platform to manage multi-model data. Even though multi-model databases are a newly emerging area, in recent years we have witnessed many database systems to embrace this category. We provide a general classification and multi-dimensional comparisons for the most popular multi-model databases. This comprehensive introduction on existing approaches and open problems, from the technique and application perspective, make this survey useful for motivating new multi-model database approaches, as well as serving as a technical reference for developing multi-model database applications. CCS Concepts: Information systems ! Database design and models; Data model extensions; Semi- structured data;r Database query processing; Query languages for non-relational engines; Extraction, trans- formation and loading; Object-relational mapping facilities; Additional Key Words and Phrases: Big Data management, multi-model databases, NoSQL database man- agement systems. ACM Reference Format: Jiaheng Lu and Irena Holubova,´ 2019. Multi-model Databases: A New Journey to Handle the Variety of Data. ACM CSUR 0, 0, Article 0 ( 2019), 38 pages. DOI: http://dx.doi.org/10.1145/0000000.0000000 1.
    [Show full text]
  • Exploring the XML World
    Lars Strandén SP Swedish National Testing and Research Institute develops and transfers Exploring the XML World technology for improving competitiveness and quality in industry, and for safety, conservation of resources and good environment in society as a whole. With - A survey for dependable systems Swedens widest and most sophisticated range of equipment and expertise for technical investigation, measurement, testing and certfi cation, we perform research and development in close liaison with universities, institutes of technology and international partners. SP is a EU-notifi ed body and accredited test laboratory. Our headquarters are in Borås, in the west part of Sweden. SP Swedish National Testing and Research Institute SP Swedish National Testing SP Electronics SP REPORT 2004:04 ISBN 91-7848-976-8 ISSN 0284-5172 SP Swedish National Testing and Research Institute Box 857 SE-501 15 BORÅS, SWEDEN Telephone: + 46 33 16 50 00, Telefax: +46 33 13 55 02 SP Electronics E-mail: [email protected], Internet: www.sp.se SP REPORT 2004:04 Lars Strandén Exploring the XML World - A survey for dependable systems 2 Abstract Exploring the XML World - A survey for dependable systems The report gives an overview of and an introduction to the XML world i.e. the definition of XML and how it can be applied. The content shall not be considered a development tutorial since this is better covered by books and information available on the Internet. Instead this report takes a top-down perspective and focuses on the possibilities when using XML e.g. concerning functionality, tools and supporting standards. Getting started with XML is very easy and the threshold is low.
    [Show full text]
  • Xml2’ April 23, 2020 Title Parse XML Version 1.3.2 Description Work with XML files Using a Simple, Consistent Interface
    Package ‘xml2’ April 23, 2020 Title Parse XML Version 1.3.2 Description Work with XML files using a simple, consistent interface. Built on top of the 'libxml2' C library. License GPL (>=2) URL https://xml2.r-lib.org/, https://github.com/r-lib/xml2 BugReports https://github.com/r-lib/xml2/issues Depends R (>= 3.1.0) Imports methods Suggests covr, curl, httr, knitr, magrittr, mockery, rmarkdown, testthat (>= 2.1.0) VignetteBuilder knitr Encoding UTF-8 Roxygen list(markdown = TRUE) RoxygenNote 7.1.0 SystemRequirements libxml2: libxml2-dev (deb), libxml2-devel (rpm) Collate 'S4.R' 'as_list.R' 'xml_parse.R' 'as_xml_document.R' 'classes.R' 'init.R' 'paths.R' 'utils.R' 'xml_attr.R' 'xml_children.R' 'xml_find.R' 'xml_modify.R' 1 2 R topics documented: 'xml_name.R' 'xml_namespaces.R' 'xml_path.R' 'xml_schema.R' 'xml_serialize.R' 'xml_structure.R' 'xml_text.R' 'xml_type.R' 'xml_url.R' 'xml_write.R' 'zzz.R' R topics documented: as_list . .3 as_xml_document . .4 download_xml . .4 read_xml . .5 url_absolute . .8 url_escape . .9 url_parse . .9 write_xml . 10 xml2_example . 11 xml_attr . 11 xml_cdata . 13 xml_children . 13 xml_comment . 14 xml_document-class . 15 xml_dtd . 15 xml_find_all . 16 xml_name . 18 xml_new_document . 18 xml_ns . 19 xml_ns_strip . 20 xml_path . 21 xml_replace . 21 xml_serialize . 22 xml_set_namespace . 23 xml_structure . 23 xml_text . 24 xml_type . 25 xml_url . 25 xml_validate . 26 Index 27 as_list 3 as_list Coerce xml nodes to a list. Description This turns an XML document (or node or nodeset) into the equivalent R list. Note that this is as_list(), not as.list(): lapply() automatically calls as.list() on its inputs, so we can’t override the default. Usage as_list(x, ns = character(), ...) Arguments x A document, node, or node set.
    [Show full text]
  • Interactive Mathematical Videos
    Interactive Mathematical Videos Hans Cuypers & Jan Willem Knopper Department of Mathematics Eindhoven University of Technology June 21, 2013 Abstract We have added interaction from and to videos in our MathDox exercise system, by using the Popcorn.js library from the Mozilla foundation. Inthis weay we have realized interactive mathematical videos. Our approach is still in an experimental phase, but in the near future we will investigate the possiblities offered by this approach, both on technological and educational issues. 1 Introduction Over the last ten to fifteen years video has become an important means of communications through the world wide web. The video tools incorporated in mobile phones and other devices make it easy to create videos. These videos are then easily distribute via YouTube [43], Vimeo [37] or other channels. Also, in education of mathematics videos and screencasts are playing a more and more prominent role, witnessed by the popularity of the Kahn Academy [20]. On the other hand, mathematical software for online and blended education has become mature. Nowadays there exist various software packages, both commercial and open source, that provide the user various ways of interactive mathematics, ranging from interactive and dynamic geometry packages like Geogebra [12], Cinderella [5] and The Geometer's SketchPad [13], to automatically generated and graded exercises in Webworks [38], STACK [35] or MapleTA [21] and systems combining these functionalities like ActiveMath [1] and MathDox [22]. In this note we report on some experiments in which we have tried to combine the use of video with the interactive mathematical software package MathDox to obtain interactive mathematical videos.
    [Show full text]
  • XML and Applications 2014/2015 Lecture 12 – 19.01.2015 Standards for Inter-Document Relations
    Some more XML applications and XML-related standards (XLink, XPointer, XForms) Patryk Czarnik XML and Applications 2014/2015 Lecture 12 – 19.01.2015 Standards for inter-document relations XPointer – addressing documents and their fragments XInclude – logical inclusion of documents within other documents XLink – declarative relations between documents and their fragments 2 / 22 XPointer The standard defines addressing XML documents and their fragments using standard URI syntax: http://www.sejm.gov.pl/ustawa.xml#def-las 3 W3C recommendations dated 2002-2003: XPointer Framework http://www.w3.org/TR/xptr-framework/ XPointer element() Scheme http://www.w3.org/TR/xptr-element/ XPointer xmlns() Scheme http://www.w3.org/TR/xptr-xmlns/ XPointer xpointer() Scheme http://www.w3.org/TR/xptr-xpointer/ (neverending?) Working Draft 3 / 22 XPointer – xpointer scheme xpointer scheme allows to address elements using XPath: http://www.sejm.gov.pl/ustawa.xml#xpointer(/art[5]/par[2]) xmlns scheme adds namespace declarations to the above: ustawa.xml#xmlns(pr=http://www.sejm.gov.pl/prawo) xpointer(/pr:art[5]/pr:par[2]) 4 / 22 XPointer – element scheme Element carrying ID attribute with given value: document.xml#element(def-las) Element with given position (absolute or relative to element carrying ID with given value): document.xml#element(/1/4/3) document.xml#element(def-las/2/3) Short syntax: document.xml#def-las document.xml#/1/4/3 document.xml#def-las/2/3 5 / 22 XInclude Including external XML documents (or their fragments) in another XML document. Similar to entities, but: normal element markup, no special syntax, no need to declare anything in DTD, nor to have DTD at all Main capabilities: including complete documents (identified by URL) or their fragments (pointed by XPointer) including XML tree (default) or raw text defining content to be used in case of an error Supported by many parsers, including Java (JAXP).
    [Show full text]