XML: Tools for Parsing and Generating XML Within R and S-Plus

Total Page:16

File Type:pdf, Size:1020Kb

XML: Tools for Parsing and Generating XML Within R and S-Plus Package ‘XML’ September 17, 2021 Version 3.99-0.8 Title Tools for Parsing and Generating XML Within R and S-Plus Depends R (>= 4.0.0), methods, utils Suggests bitops, RCurl SystemRequirements libxml2 (>= 2.6.3) Description Many approaches for both reading and creating XML (and HTML) documents (including DTDs), both local and accessible via HTTP or FTP. Also offers access to an 'XPath' ``interpreter''. URL http://www.omegahat.net/RSXML License BSD_3_clause + file LICENSE Collate AAA.R DTD.R DTDClasses.R DTDRef.R SAXMethods.R XMLClasses.R applyDOM.R assignChild.R catalog.R createNode.R dynSupports.R error.R flatTree.R nodeAccessors.R parseDTD.R schema.R summary.R tangle.R toString.R tree.R version.R xmlErrorEnums.R xmlEventHandler.R xmlEventParse.R xmlHandler.R xmlInternalSource.R xmlOutputDOM.R xmlNodes.R xmlOutputBuffer.R xmlTree.R xmlTreeParse.R htmlParse.R hashTree.R zzz.R supports.R parser.R libxmlFeatures.R xmlString.R saveXML.R namespaces.R readHTMLTable.R reflection.R xmlToDataFrame.R bitList.R compare.R encoding.R fixNS.R xmlRoot.R serialize.R xmlMemoryMgmt.R keyValueDB.R solrDocs.R XMLRErrorInfo.R xincludes.R namespaceHandlers.R tangle1.R htmlLinks.R htmlLists.R getDependencies.R getRelativeURL.R xmlIncludes.R simplifyPath.R NeedsCompilation yes Author CRAN Team [ctb, cre] (de facto maintainer since 2013), Duncan Temple Lang [aut] (<https://orcid.org/0000-0003-0159-1546>), Tomas Kalibera [ctb] Maintainer CRAN Team <[email protected]> Repository CRAN Date/Publication 2021-09-17 06:19:17 UTC 1 2 R topics documented: R topics documented: addChildren . .4 addNode . .8 append.xmlNode . .9 asXMLNode . 10 asXMLTreeNode . 12 catalogLoad . 13 catalogResolve . 15 coerceNodes . 16 compareXMLDocs . 17 docName . 18 Doctype . 19 Doctype-class . 20 dtdElement . 21 dtdElementValidEntry . 22 dtdIsAttribute . 23 dtdValidElement . 24 ensureNamespace . 26 findXInclude . 27 free ............................................. 28 genericSAXHandlers . 29 getChildrenStrings . 30 getEncoding . 31 getHTMLLinks . 32 getLineNumber . 33 getNodeSet . 34 getRelativeURL . 42 getSibling . 44 getXIncludes . 45 getXMLErrors . 47 isXMLString . 48 length.XMLNode . 49 libxmlVersion . 50 makeClassTemplate . 51 names.XMLNode . 52 newXMLDoc . 53 newXMLNamespace . 59 parseDTD . 60 parseURI . 63 parseXMLAndAdd . 64 print.XMLAttributeDef . 65 processXInclude . 67 readHTMLList . 69 readHTMLTable . 70 readKeyValueDB . 73 readSolrDoc . 74 removeXMLNamespaces . 75 R topics documented: 3 replaceNodeWithChildren . 76 saveXML . 77 SAXState-class . 79 schema-class . 81 setXMLNamespace . 82 startElement.SAX . 83 supportsExpat . 84 toHTML . 85 toString.XMLNode . 86 xmlApply . 87 XMLAttributes-class . 88 xmlAttributeType . 89 xmlAttrs . 90 xmlChildren . 91 xmlCleanNamespaces . 92 xmlClone . 93 XMLCodeFile-class . 94 xmlContainsEntity . 96 xmlDOMApply . 97 xmlElementsByTagName . 98 xmlElementSummary . 100 xmlEventHandler . 101 xmlEventParse . 102 xmlGetAttr . 110 xmlHandler . 111 xmlHashTree . 112 XMLInternalDocument-class . 115 xmlName . 116 xmlNamespace . 117 xmlNamespaceDefinitions . 119 xmlNode . 120 XMLNode-class . 122 xmlOutputBuffer . 123 xmlParent . 126 xmlParseDoc . 128 xmlParserContextFunction . 129 xmlRoot . 130 xmlSchemaValidate . 132 xmlSearchNs . ..
Recommended publications
  • Differential Fuzzing the Webassembly
    Master’s Programme in Security and Cloud Computing Differential Fuzzing the WebAssembly Master’s Thesis Gilang Mentari Hamidy MASTER’S THESIS Aalto University - EURECOM MASTER’STHESIS 2020 Differential Fuzzing the WebAssembly Fuzzing Différentiel le WebAssembly Gilang Mentari Hamidy This thesis is a public document and does not contain any confidential information. Cette thèse est un document public et ne contient aucun information confidentielle. Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Technology. Antibes, 27 July 2020 Supervisor: Prof. Davide Balzarotti, EURECOM Co-Supervisor: Prof. Jan-Erik Ekberg, Aalto University Copyright © 2020 Gilang Mentari Hamidy Aalto University - School of Science EURECOM Master’s Programme in Security and Cloud Computing Abstract Author Gilang Mentari Hamidy Title Differential Fuzzing the WebAssembly School School of Science Degree programme Master of Science Major Security and Cloud Computing (SECCLO) Code SCI3084 Supervisor Prof. Davide Balzarotti, EURECOM Prof. Jan-Erik Ekberg, Aalto University Level Master’s thesis Date 27 July 2020 Pages 133 Language English Abstract WebAssembly, colloquially known as Wasm, is a specification for an intermediate representation that is suitable for the web environment, particularly in the client-side. It provides a machine abstraction and hardware-agnostic instruction sets, where a high-level programming language can target the compilation to the Wasm instead of specific hardware architecture. The JavaScript engine implements the Wasm specification and recompiles the Wasm instruction to the target machine instruction where the program is executed. Technically, Wasm is similar to a popular virtual machine bytecode, such as Java Virtual Machine (JVM) or Microsoft Intermediate Language (MSIL).
    [Show full text]
  • XML a New Web Site Architecture
    XML A New Web Site Architecture Jim Costello Derek Werthmuller Darshana Apte Center for Technology in Government University at Albany, SUNY 1535 Western Avenue Albany, NY 12203 Phone: (518) 442-3892 Fax: (518) 442-3886 E-mail: [email protected] http://www.ctg.albany.edu September 2002 © 2002 Center for Technology in Government The Center grants permission to reprint this document provided this cover page is included. Table of Contents XML: A New Web Site Architecture .......................................................................................................................... 1 A Better Way? ......................................................................................................................................................... 1 Defining the Problem.............................................................................................................................................. 1 Partial Solutions ...................................................................................................................................................... 2 Addressing the Root Problems .............................................................................................................................. 2 Figure 1. Sample XML file (all code simplified for example) ...................................................................... 4 Figure 2. Sample XSL File (all code simplified for example) ....................................................................... 6 Figure 3. Formatted Page Produced
    [Show full text]
  • Document Object Model
    Document Object Model DOM DOM is a programming interface that provides a way for the values and structure of an XML document to be accessed and manipulated. Tasks that can be performed with DOM . Navigate an XML document's structure, which is a tree stored in memory. Report the information found at the nodes of the XML tree. Add, delete, or modify elements in the XML document. DOM represents each node of the XML tree as an object with properties and behavior for processing the XML. The root of the tree is a Document object. Its children represent the entire XML document except the xml declaration. On the next page we consider a small XML document with comments, a processing instruction, a CDATA section, entity references, and a DOCTYPE declaration, in addition to its element tree. It is valid with respect to a DTD, named root.dtd. <!ELEMENT root (child*)> <!ELEMENT child (name)> <!ELEMENT name (#PCDATA)> <!ATTLIST child position NMTOKEN #REQUIRED> <!ENTITY last1 "Dover"> <!ENTITY last2 "Reckonwith"> Document Object Model Copyright 2005 by Ken Slonneger 1 Example: root.xml <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE root SYSTEM "root.dtd"> <!-- root.xml --> <?DomParse usage="java DomParse root.xml"?> <root> <child position="first"> <name>Eileen &last1;</name> </child> <child position="second"> <name><![CDATA[<<<Amanda>>>]]> &last2;</name> </child> <!-- Could be more children later. --> </root> DOM imagines that this XML information has a document root with four children: 1. A DOCTYPE declaration. 2. A comment. 3. A processing instruction, whose target is DomParse. 4. The root element of the document. The second comment is a child of the root element.
    [Show full text]
  • Extensible Markup Language (XML) and Its Role in Supporting the Global Justice XML Data Model
    Extensible Markup Language (XML) and Its Role in Supporting the Global Justice XML Data Model Extensible Markup Language, or "XML," is a computer programming language designed to transmit both data and the meaning of the data. XML accomplishes this by being a markup language, a mechanism that identifies different structures within a document. Structured information contains both content (such as words, pictures, or video) and an indication of what role content plays, or its meaning. XML identifies different structures by assigning data "tags" to define both the name of a data element and the format of the data within that element. Elements are combined to form objects. An XML specification defines a standard way to add markup language to documents, identifying the embedded structures in a consistent way. By applying a consistent identification structure, data can be shared between different systems, up and down the levels of agencies, across the nation, and around the world, with the ease of using the Internet. In other words, XML lays the technological foundation that supports interoperability. XML also allows structured relationships to be defined. The ability to represent objects and their relationships is key to creating a fully beneficial justice information sharing tool. A simple example can be used to illustrate this point: A "person" object may contain elements like physical descriptors (e.g., eye and hair color, height, weight), biometric data (e.g., DNA, fingerprints), and social descriptors (e.g., marital status, occupation). A "vehicle" object would also contain many elements (such as description, registration, and/or lien-holder). The relationship between these two objects—person and vehicle— presents an interesting challenge that XML can address.
    [Show full text]
  • Rdfa in XHTML: Syntax and Processing Rdfa in XHTML: Syntax and Processing
    RDFa in XHTML: Syntax and Processing RDFa in XHTML: Syntax and Processing RDFa in XHTML: Syntax and Processing A collection of attributes and processing rules for extending XHTML to support RDF W3C Recommendation 14 October 2008 This version: http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014 Latest version: http://www.w3.org/TR/rdfa-syntax Previous version: http://www.w3.org/TR/2008/PR-rdfa-syntax-20080904 Diff from previous version: rdfa-syntax-diff.html Editors: Ben Adida, Creative Commons [email protected] Mark Birbeck, webBackplane [email protected] Shane McCarron, Applied Testing and Technology, Inc. [email protected] Steven Pemberton, CWI Please refer to the errata for this document, which may include some normative corrections. This document is also available in these non-normative formats: PostScript version, PDF version, ZIP archive, and Gzip’d TAR archive. The English version of this specification is the only normative version. Non-normative translations may also be available. Copyright © 2007-2008 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply. Abstract The current Web is primarily made up of an enormous number of documents that have been created using HTML. These documents contain significant amounts of structured data, which is largely unavailable to tools and applications. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites, and allowing browsing applications to improve the user experience: an event on a web page can be directly imported - 1 - How to Read this Document RDFa in XHTML: Syntax and Processing into a user’s desktop calendar; a license on a document can be detected so that users can be informed of their rights automatically; a photo’s creator, camera setting information, resolution, location and topic can be published as easily as the original photo itself, enabling structured search and sharing.
    [Show full text]
  • Bibliography of Erik Wilde
    dretbiblio dretbiblio Erik Wilde's Bibliography References [1] AFIPS Fall Joint Computer Conference, San Francisco, California, December 1968. [2] Seventeenth IEEE Conference on Computer Communication Networks, Washington, D.C., 1978. [3] ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Los Angeles, Cal- ifornia, March 1982. ACM Press. [4] First Conference on Computer-Supported Cooperative Work, 1986. [5] 1987 ACM Conference on Hypertext, Chapel Hill, North Carolina, November 1987. ACM Press. [6] 18th IEEE International Symposium on Fault-Tolerant Computing, Tokyo, Japan, 1988. IEEE Computer Society Press. [7] Conference on Computer-Supported Cooperative Work, Portland, Oregon, 1988. ACM Press. [8] Conference on Office Information Systems, Palo Alto, California, March 1988. [9] 1989 ACM Conference on Hypertext, Pittsburgh, Pennsylvania, November 1989. ACM Press. [10] UNIX | The Legend Evolves. Summer 1990 UKUUG Conference, Buntingford, UK, 1990. UKUUG. [11] Fourth ACM Symposium on User Interface Software and Technology, Hilton Head, South Carolina, November 1991. [12] GLOBECOM'91 Conference, Phoenix, Arizona, 1991. IEEE Computer Society Press. [13] IEEE INFOCOM '91 Conference on Computer Communications, Bal Harbour, Florida, 1991. IEEE Computer Society Press. [14] IEEE International Conference on Communications, Denver, Colorado, June 1991. [15] International Workshop on CSCW, Berlin, Germany, April 1991. [16] Third ACM Conference on Hypertext, San Antonio, Texas, December 1991. ACM Press. [17] 11th Symposium on Reliable Distributed Systems, Houston, Texas, 1992. IEEE Computer Society Press. [18] 3rd Joint European Networking Conference, Innsbruck, Austria, May 1992. [19] Fourth ACM Conference on Hypertext, Milano, Italy, November 1992. ACM Press. [20] GLOBECOM'92 Conference, Orlando, Florida, December 1992. IEEE Computer Society Press. http://github.com/dret/biblio (August 29, 2018) 1 dretbiblio [21] IEEE INFOCOM '92 Conference on Computer Communications, Florence, Italy, 1992.
    [Show full text]
  • Exploring and Extracting Nodes from Large XML Files
    Exploring and Extracting Nodes from Large XML Files Guy Lapalme January 2010 Abstract This article shows how to deal simply with large XML files that cannot be read as a whole in memory and for which the usual XML exploration and extraction mechanisms cannot work or are very inefficient in processing time. We define the notion of a skeleton document that is maintained as the file is read using a pull- parser. It is used for showing the structure of the document and for selecting parts of it. 1 Introduction XML has been developed to facilitate the annotation of information to be shared between computer systems. Because it is intended to be easily generated and parsed by computer systems on all platforms, its format is based on character streams rather than internal binary ones. Being character-based, it also has the nice property of being readable and editable by humans using standard text editors. XML is based on a uniform, simple and yet powerful model of data organization: the generalized tree. Such a tree is defined as either a single element or an element having other trees as its sub-elements called children. This is the same model as the one chosen for the Lisp programming language 50 years ago. This hierarchical model is very simple and allows a simple annotation of the data. The left part of Figure 1 shows a very small XML file illustrating the basic notation: an arbitrary name between < and > symbols is given to a node of a tree. This is called a start-tag.
    [Show full text]
  • RDF/XML: RDF Data on the Web
    Developing Ontologies • have an idea of the required concepts and relationships (ER, UML, ...), • generate a (draft) n3 or RDF/XML instance, • write a separate file for the metadata, • load it into Jena with activating a reasoner. • If the reasoner complains about an inconsistent ontology, check the metadata file alone. If this is consistent, and it complains only when also data is loaded: – it may be due to populating a class whose definition is inconsistent and that thus must be empty. – often it is due to wrong datatypes. Recall that datatype specification is not interpreted as a constraint (that is violated for a given value), but as additional knowledge. 220 Chapter 6 RDF/XML: RDF Data on the Web • An XML representation of RDF data for providing RDF data on the Web could be done straightforwardly as a “holds” relation mapped according to SQLX (see ⇒ next slide). • would be highly redundant and very different from an XML representation of the same data • search for a more similar way: leads to “striped XML/RDF” – data feels like XML: can be queried by XPath/Query and transformed by XSLT – can be parsed into an RDF graph. • usually: provide RDF/XML data to an agreed RDFS/OWL ontology. 221 A STRAIGHTFORWARD XML REPRESENTATION OF RDF DATA Note: this is not RDF/XML, but just some possible representation. • RDF data are triples, • their components are either URIs or literals (of XML Schema datatypes), • straightforward XML markup in SQLX style, • since N3 has a term structure, it is easy to find an XML markup. <my-n3:rdf-graph xmlns:my-n3="http://simple-silly-rdf-xml.de#"> <my-n3:triple> <my-n3:subject type="uri">foo://bar/persons/john</my-n3:subject> <my-n3:predicate type="uri">foo://bar/meta#name</my-n3:predicate> <my-n3:object type="http://www.w3.org/2001/XMLSchema#string">John</my-n3:object> </my-n3 triple> <my-n3:triple> ..
    [Show full text]
  • Ontology Matching • Semantic Social Networks and Peer-To-Peer Systems
    The web: from XML to OWL Rough Outline 1. Foundations of XML (Pierre Genevès & Nabil Layaïda) • Core XML • Programming with XML Development of the future web • Foundations of XML types (tree grammars, tree automata) • Tree logics (FO, MSO, µ-calculus) • Expressing information ! Languages • A taste of research: introduction to some grand challenges • Manipulating it ! Algorithms 2. Semantics of knowledge representation on the web (Jérôme Euzenat & • in the most correct, efficient and ! Logic Marie-Christine Rousset) meaningful way ! Semantics • Semantic web languages (URI, RDF, RDFS and OWL) • Querying RDF and RDFS (SPARQL) • Querying data though ontologies (DL-Lite) • Ontology matching • Semantic social networks and peer-to-peer systems 1 / 8 2 / 8 Foundations of XML Semantic web We will talk about languages, algorithms, and semantics for efficiently and meaningfully manipulating formalised knowledge. We will talk about languages, algorithms, and programming techniques for efficiently and safely manipulating XML data. You will learn about: You will learn about: • Expressing formalised knowledge on the semantic web (RDF) • Tree structured data (XML) ! Syntax and semantics ! Tree grammars & validation You will not learn about: • Expressing ontologies on the semantic • XML programming (XPath, XSLT...) You will not learn about: web (RDFS, OWL, DL-Lite) • Tagging pictures ! Queries & transformations ! Syntax and semantics • Hacking CGI scripts ! Reasoning • Sharing MP3 • Foundational theory & tools • HTML • Creating facebook ! Regular expressions
    [Show full text]
  • XML for Java Developers G22.3033-002 Course Roadmap
    XML for Java Developers G22.3033-002 Session 1 - Main Theme Markup Language Technologies (Part I) Dr. Jean-Claude Franchitti New York University Computer Science Department Courant Institute of Mathematical Sciences 1 Course Roadmap Consider the Spectrum of Applications Architectures Distributed vs. Decentralized Apps + Thick vs. Thin Clients J2EE for eCommerce vs. J2EE/Web Services, JXTA, etc. Learn Specific XML/Java “Patterns” Used for Data/Content Presentation, Data Exchange, and Application Configuration Cover XML/Java Technologies According to their Use in the Various Phases of the Application Development Lifecycle (i.e., Discovery, Design, Development, Deployment, Administration) e.g., Modeling, Configuration Management, Processing, Rendering, Querying, Secure Messaging, etc. Develop XML Applications as Assemblies of Reusable XML- Based Services (Applications of XML + Java Applications) 2 1 Agenda XML Generics Course Logistics, Structure and Objectives History of Meta-Markup Languages XML Applications: Markup Languages XML Information Modeling Applications XML-Based Architectures XML and Java XML Development Tools Summary Class Project Readings Assignment #1a 3 Part I Introduction 4 2 XML Generics XML means eXtensible Markup Language XML expresses the structure of information (i.e., document content) separately from its presentation XSL style sheets are used to convert documents to a presentation format that can be processed by a target presentation device (e.g., HTML in the case of legacy browsers) Need a
    [Show full text]
  • FME® Desktop Copyright © 1994 – 2018, Safe Software Inc. All Rights Reserved
    FME® Desktop Copyright © 1994 – 2018, Safe Software Inc. All rights reserved. FME® is the registered trademark of Safe Software Inc. All brands and their product names mentioned herein may be trademarks or registered trademarks of their respective holders and should be noted as such. FME Desktop includes components licensed as described below: Autodesk FBX This software contains Autodesk® FBX® code developed by Autodesk, Inc. Copyright 2016 Autodesk, Inc. All rights, reserved. Such code is provided “as is” and Autodesk, Inc. disclaims any and all warranties, whether express or implied, including without limitation the implied warranties of merchantability, fitness for a particular purpose or non-infringement of third party rights. In no event shall Autodesk, Inc. be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of such code. Autodesk Libraries Contains Autodesk® RealDWG by Autodesk, Inc., Copyright © 2017 Autodesk, Inc. All rights reserved. Home page: www.autodesk.com/realdwg Belge72/b.Lambert72A NTv2 Grid Copyright © 2014-2016 Nicolas SIMON and validated by Service Public de Wallonie and Nationaal Geografisch Instituut. Under Creative Commons Attribution license (CC BY). Bentley i-Model SDK This software includes some components from the Bentley i-Model SDK. Copyright © Bentley Systems International Limited CARIS CSAR GDAL Plugin CARIS CSAR GDAL Plugin is owned by and copyright © 2013 Universal Systems Ltd.
    [Show full text]
  • Linux from Scratch 版本 R11.0-36-中⽂翻译版 发布于 2021 年 9 ⽉ 21 ⽇
    Linux From Scratch 版本 r11.0-36-中⽂翻译版 发布于 2021 年 9 ⽉ 21 ⽇ 由 Gerard Beekmans 原著 总编辑:Bruce Dubbs Linux From Scratch: 版本 r11.0-36-中⽂翻译版 : 发布于 2021 年 9 ⽉ 21 ⽇ 由 由 Gerard Beekmans 原著和总编辑:Bruce Dubbs 版权所有 © 1999-2021 Gerard Beekmans 版权所有 © 1999-2021, Gerard Beekmans 保留所有权利。 本书依照 Creative Commons License 许可证发布。 从本书中提取的计算机命令依照 MIT License 许可证发布。 Linux® 是Linus Torvalds 的注册商标。 Linux From Scratch - 版本 r11.0-36-中⽂翻译版 ⽬录 序⾔ .................................................................................................................................... viii i. 前⾔ ............................................................................................................................ viii ii. 本书⾯向的读者 ............................................................................................................ viii iii. LFS 的⽬标架构 ............................................................................................................ ix iv. 阅读本书需要的背景知识 ................................................................................................. ix v. LFS 和标准 ..................................................................................................................... x vi. 本书选择软件包的逻辑 .................................................................................................... xi vii. 排版约定 .................................................................................................................... xvi viii. 本书结构 .................................................................................................................
    [Show full text]