XML Processing with Python.Pdf


Python & XML
Christopher A. Jones
Fred L. Drake, Jr.
Publisher: O'Reilly
First Edition January 2002
ISBN: 0-596-00128-2, 384 pages

Python is an ideal language for manipulating XML, and this new volume gives you a solid foundation for using these two languages together. Complete with practical examples that highlight common application tasks, the book starts with the basics then quickly progresses to complex topics like transforming XML with XSLT and querying XML with XPath. It also explores more advanced subjects, such as SOAP and distributed web services.

Copyright © 2002 O'Reilly & Associates, Inc. All rights reserved. Printed in the United States of America.

Published by O'Reilly & Associates, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly & Associates books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safari.oreilly.com). For more information contact our corporate/institutional sales department: 800-998-9938 or [email protected].

Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks of O'Reilly & Associates, Inc. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly & Associates, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps. The association between the image of elephant shrews and Python and XML is a trademark of O'Reilly & Associates, Inc.

While every precaution has been taken in the preparation of this book, the publisher assumes no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

IT-SC book 1

Dedication
Preface
    Audience
    Organization
    Conventions Used in This Book
    How to Contact Us
    Acknowledgments
1. Python and XML
    1.1 Key Advantages of XML
    1.2 The XML Specifications
    1.3 The Power of Python and XML
    1.4 What Can We Do with It?
2. XML Fundamentals
    2.1 XML Structure in a Nutshell
    2.2 Document Types and Schemas
    2.3 Types of Conformance
    2.4 Physical Structures
    2.5 Constructing XML Documents
    2.6 Document Type Definitions
    2.7 Canonical XML
    2.8 Going Beyond the XML Specification
3. The Simple API for XML
    3.1 The Birth of SAX
    3.2 Understanding SAX
    3.3 Reading an Article
    3.4 Searching File Information
    3.5 Building an Image Index
    3.6 Converting XML to HTML
    3.7 Advanced Parser Factory Usage
    3.8 Native Parser Interfaces
4. The Document Object Model
    4.1 The DOM Specifications
    4.2 Understanding the DOM
    4.3 Python DOM Offerings
    4.4 Retrieving Information
    4.5 Changing Documents
    4.6 Building a Web Application
    4.7 Going Beyond SAX and DOM
5. Querying XML with XPath
    5.1 XPath at a Glance
    5.2 Where Is XPath Used?
    5.3 Location Paths
    5.4 XPath Arithmetic Operators
    5.5 XPath Functions
    5.6 Compiling XPath Expressions
6. Transforming XML with XSLT
    6.1 The XSLT Specification
    6.2 XSLT Processors
    6.3 Defining Stylesheets
    6.4 Using XSLT from the Command Line
    6.5 XSLT Elements
    6.6 A More Complex Example
    6.7 Embedding XSLT Transformations in Python
    6.8 Choosing a Technique
7. XML Validation and Dialects
    7.1 Working with DTDs
    7.2 Validation at Runtime
    7.3 The BillSummary Example
    7.4 Dialects, Frameworks, and Workflow
    7.5 What Does ebXML Offer?
8. Python Internet APIs
    8.1 Connecting Web Sites
    8.2 Working with URLs
    8.3 Opening URLs
    8.4 Connecting with HTTP
    8.5 Using the Server Classes
9. Python, Web Services, and SOAP
    9.1 Python Web Services Support
    9.2 The Emerging SOAP Standard
    9.3 Python SOAP Options
    9.4 Example SOAP Server and Client
    9.5 What About XML-RPC?
10. Python and Distributed Systems Design
    10.1 Sample Application and Flow Analysis
    10.2 Understanding the Scope
    10.3 Building the Database
    10.4 Building the Profiles Access Class
    10.5 Creating an XML Data Store
    10.6 The XML Switch
    10.7 Running the XML Switch
    10.8 A Web Application
A. Installing Python and XML Tools
    A.1 Installing Python
    A.2 Installing PyXML
    A.3 Installing 4Suite
B. XML Definitions
    B.1 XML Definitions
C. Python SAX API
D. Python DOM API
    D.1 4DOM Extensions
E. Working with MSXML3.0
    E.1 Setting Up MSXML3.0
    E.2 Basic DOM Operations
    E.3 MSXML3.0 Support for XSLT
    E.4 Handling Parsing Errors
    E.5 MSXML3.0 Reference
F. Additional Python XML Tools
    F.1 Pyxie
    F.2 Python XML Tools
    F.3 XML Schema Validator
    F.4 Sab-pyth
    F.5 Redfoot
    F.6 XML Components for Zope
    F.7 Online Resources
Colophon

Dedication

We would like to dedicate this book to Frank Willison, O'Reilly Editor-in-Chief and Python Champion.
-- Christopher A. Jones and Fred L. Drake, Jr.

Frank will be remembered in the Python community for the several great Python books that he made possible, memories of his participation in many Python conferences, and his Frankly Speaking columns. The Python world (and the world at large) won't be the same without Frank.
-- Guido van Rossum, Python creator

Preface

This book comes to you as a result of the collaboration of two authors who became interested in the topic in very different ways. Hopefully our motivations will help you understand what we each bring to the book, and perhaps prove to be at least a little entertaining as well.

Chris Jones started using XML several years ago, and began using Python more recently. As a consultant for major companies in the Seattle area, he first used XML as the core data format for web site content in a home-grown publishing system in 1997. But he really became an XML devotee when developing an open source engine, which eventually became the key technology for Planet 7 Technologies. As a consultant, he continues to use XML on an almost daily basis for everything from configuration files to document formats.

Chris began dabbling in Python because he thought it was a clean, object-oriented alternative to Perl. A long-time Unix user (but one who frequently finds himself working with Windows in Seattle), he has grown accustomed to scripting languages that place the full Unix API in the hands of developers. Having used far too much Java and ASP in web development over the years, he found Python a refreshing way to keep object-orientation while still accessing Unix sockets and threads, all with the convenience of a scripting language.

The combination of Python and XML brings great power to the developer. While XML is a potent technology, it requires the programmer to use objects, interfaces, and strings. Python does so as well, and therefore provides an excellent playpen for XML development. The number of XML tools for Python is growing all the time, and Chris can produce an XML solution in far less time using Python than he can with Java or C++. Of course, the cross-platform nature of Python keeps our work consistently usable whether we're developing on Windows, Linux, or a Unix variant, a combination we both seem to find powerful.

Fred Drake came to Python and XML from a different avenue, arriving at Python before XML. He discovered Python while in graduate school experimenting with a number of programming languages. After recognizing Python as an excellent language for rapid development, he convinced his advisors that he should be able to write his master's project using Python. In the course of developing the project, he became increasingly interested in the Python community. He then made his first contributions to the Python standard library, and in so doing became noticed by a group of Python programmers working on distributed systems projects at the research organization of CNRI. The group was led by Guido van Rossum, the creator of Python. Fred joined the team and learned more about distributed systems and gluing systems together than he ever expected possible, and he loved it.

While still in graduate school, Fred argued that Python's documentation should be converted to a more structured language called SGML. After a few years at CNRI, he began to do just that, and was able to sink his teeth into the documentation more vigorously. The SGML migration path eventually changed to an XML migration path as XML acceptance grew. Though that goal has not yet been achieved (he is still working on it), Fred has substantially changed the way the documentation is maintained, and it now represents one of the most structured applications of the typesetting and document markup system developed by Donald Knuth and Leslie Lamport.

Over time, the team from CNRI became increasingly focused on the development of Python, and moved on to form PythonLabs. Fred remained active in XML initiatives around Python and pushed to add XML support to the standard library. Once this was achieved, he returned to the task of migrating the Python documentation to XML, and hopes to complete this project soon.

Audience

This book is for anyone interested in learning about using Python to build XML applications. The bulk of the material is suited for programmers interested in using XML as a data interchange format or as a transformable format for web content, but the first half of the book is also useful to those interested in building more document-oriented applications.

Recommended publications
  • On the Integrity and Trustworthiness of Web Produced Data
    On the Integrity and Trustworthiness of Web Produced Data. Luís A. Maia. Integrated Master's in Network and Information Systems Engineering (Mestrado Integrado em Engenharia de Redes e Sistemas Informáticos), Departamento de Ciência de Computadores, 2013. Supervisor: Professor Doutor Manuel Eduardo Carvalho Duarte Correia, Assistant Professor (Professor Auxiliar), Departamento de Computadores, Faculdade de Ciências da Universidade do Porto. Provided by the Open Repository of the University of Porto via core.ac.uk. All corrections determined by the jury, and only those, have been made. The President of the Jury, Porto, ______/______/_________
    Acknowledgments: I would like to express my appreciation for the help of my supervisor in researching and bringing different perspectives and to thank my family, for their support and dedication.
    Abstract: Information Systems have been a key tool for the overall performance improvement of administrative tasks in academic institutions. While most systems intend to deliver a paperless environment to each institution, document integrity and accountability often still rely on traditional methods such as producing physical documents for signing and archiving. While this method delivers an inefficient workflow and has a real monetary cost, it is still the common method to provide a degree of integrity and accountability on the data contained in the databases of the information systems. The evaluation of a document signature is not a straightforward process: it requires the recipient to have a copy of the signer's signature for comparison, and training beyond the scope of any office employee training. This leads to a serious compromise on the trustability of each document's integrity and makes the verification based entirely on trust in the information's origin, which is not enough to provide non-repudiation to the institutions.
  • XML Signature/Encryption — the Basis of Web Services Security
    Special Issue on Security for Network Society: Falsification Prevention and Protection Technologies and Products. XML Signature/Encryption — the Basis of Web Services Security. By Koji MIYAUCHI*
    ABSTRACT: XML is spreading quickly as a format for electronic documents and messages. As a consequence, greater importance is being placed on the XML security technology. Against this background research and development efforts into XML security are being energetically pursued. This paper discusses the W3C XML Signature and XML Encryption specifications, which represent the fundamental technology of XML security, as well as other related technologies originally developed by NEC.
    KEYWORDS: XML security, XML signature, XML encryption, Distributed signature, Web services security
    1. INTRODUCTION: XML is an extendible markup language, the specification of which has been established by the W3C (WWW Consortium). It is spreading quickly because of its flexibility and its platform-independent technology, which freely allows authors to decide on document structures. Various XML-based standard formats have been developed including: ebXML and RosettaNet, which are standard specifications for e-commerce transactions, TravelXML, which is an EDI (Electronic Data Interchange) standard for travel agencies, and NewsML, which is a standard specification for new distribution formats.
    2. XML SIGNATURE, 2.1 Overview: XML Signature is an electronic signature technology that is optimized for XML data. The practical benefits of this technology include Partial Signature, which allows an electronic signature to be written on specific tags contained in XML data, and Multiple Signature, which enables multiple electronic signatures to be written. The use of XML Signature can solve security problems, including falsification, spoofing, and repudiation.
  • Sams Teach Yourself XML in 21 Days
    Steven Holzner. Sams Teach Yourself XML in 21 Days, Third Edition. 800 East 96th Street, Indianapolis, Indiana, 46240 USA.
    Copyright © 2004 by Sams Publishing. All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. No patent liability is assumed with respect to the use of the information contained herein. Although every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions. Nor is any liability assumed for damages resulting from the use of the information contained herein.
    International Standard Book Number: 0-672-32576-4. Library of Congress Catalog Card Number: 2003110401. Printed in the United States of America. First Printing: October 2003. 06 05 04 03 4 3 2 1
    Trademarks: All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Sams Publishing cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.
    Warning and Disclaimer: Every effort has been made to make this book as complete and as accurate as possible, but no warranty or fitness is implied.
    Associate Publisher: Michael Stephens; Acquisitions Editor: Todd Green; Development Editor: Songlin Qiu; Managing Editor: Charlotte Clapp; Project Editor: Matthew Purcell; Indexer: Mandie Frank; Proofreader: Paula Lowell; Technical Editor: Chris Kenyeres; Team Coordinator: Cindy Teeters; Interior Designer: Gary Adair; Cover Designer: Gary Adair; Page Layout:
  • Bibliography of Erik Wilde
    Erik Wilde's Bibliography (dretbiblio), http://github.com/dret/biblio (August 29, 2018)
    References
    [1] AFIPS Fall Joint Computer Conference, San Francisco, California, December 1968.
    [2] Seventeenth IEEE Conference on Computer Communication Networks, Washington, D.C., 1978.
    [3] ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Los Angeles, California, March 1982. ACM Press.
    [4] First Conference on Computer-Supported Cooperative Work, 1986.
    [5] 1987 ACM Conference on Hypertext, Chapel Hill, North Carolina, November 1987. ACM Press.
    [6] 18th IEEE International Symposium on Fault-Tolerant Computing, Tokyo, Japan, 1988. IEEE Computer Society Press.
    [7] Conference on Computer-Supported Cooperative Work, Portland, Oregon, 1988. ACM Press.
    [8] Conference on Office Information Systems, Palo Alto, California, March 1988.
    [9] 1989 ACM Conference on Hypertext, Pittsburgh, Pennsylvania, November 1989. ACM Press.
    [10] UNIX - The Legend Evolves. Summer 1990 UKUUG Conference, Buntingford, UK, 1990. UKUUG.
    [11] Fourth ACM Symposium on User Interface Software and Technology, Hilton Head, South Carolina, November 1991.
    [12] GLOBECOM'91 Conference, Phoenix, Arizona, 1991. IEEE Computer Society Press.
    [13] IEEE INFOCOM '91 Conference on Computer Communications, Bal Harbour, Florida, 1991. IEEE Computer Society Press.
    [14] IEEE International Conference on Communications, Denver, Colorado, June 1991.
    [15] International Workshop on CSCW, Berlin, Germany, April 1991.
    [16] Third ACM Conference on Hypertext, San Antonio, Texas, December 1991. ACM Press.
    [17] 11th Symposium on Reliable Distributed Systems, Houston, Texas, 1992. IEEE Computer Society Press.
    [18] 3rd Joint European Networking Conference, Innsbruck, Austria, May 1992.
    [19] Fourth ACM Conference on Hypertext, Milano, Italy, November 1992. ACM Press.
    [20] GLOBECOM'92 Conference, Orlando, Florida, December 1992. IEEE Computer Society Press.
    [21] IEEE INFOCOM '92 Conference on Computer Communications, Florence, Italy, 1992.
  • QUERYING JSON and XML Performance Evaluation of Querying Tools for Offline-Enabled Web Applications
    QUERYING JSON AND XML Performance evaluation of querying tools for offline-enabled web applications Master Degree Project in Informatics One year Level 30 ECTS Spring term 2012 Adrian Hellström Supervisor: Henrik Gustavsson Examiner: Birgitta Lindström Querying JSON and XML Submitted by Adrian Hellström to the University of Skövde as a final year project towards the degree of M.Sc. in the School of Humanities and Informatics. The project has been supervised by Henrik Gustavsson. 2012-06-03 I hereby certify that all material in this final year project which is not my own work has been identified and that no work is included for which a degree has already been conferred on me. Signature: ___________________________________________ Abstract This article explores the viability of third-party JSON tools as an alternative to XML when an application requires querying and filtering of data, as well as how the application deviates between browsers. We examine and describe the querying alternatives as well as the technologies we worked with and used in the application. The application is built using HTML 5 features such as local storage and canvas, and is benchmarked in Internet Explorer, Chrome and Firefox. The application built is an animated infographical display that uses querying functions in JSON and XML to filter values from a dataset and then display them through the HTML5 canvas technology. The results were in favor of JSON and suggested that using third-party tools did not impact performance compared to native XML functions. In addition, the usage of JSON enabled easier development and cross-browser compatibility. Further research is proposed to examine document-based data filtering as well as investigating why performance deviated between toolsets.
  • Scraping HTML with Xpath Stéphane Ducasse, Peter Kenny
    Scraping HTML with XPath. Stéphane Ducasse, Peter Kenny. To cite this version: Stéphane Ducasse, Peter Kenny. Scraping HTML with XPath. published by the authors, pp.26, 2017. hal-01612689. HAL Id: hal-01612689, https://hal.inria.fr/hal-01612689. Submitted on 7 Oct 2017.
    HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
    Scraping HTML with XPath. Stéphane Ducasse and Peter Kenny. Square Bracket tutorials. September 28, 2017. master @ a0267b2
    Copyright 2017 by Stéphane Ducasse and Peter Kenny. The contents of this book are protected under the Creative Commons Attribution-ShareAlike 3.0 Unported license. You are free to Share (to copy, distribute and transmit the work) and to Remix (to adapt the work), under the following conditions. Attribution: You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). Share Alike: If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license. For any reuse or distribution, you must make clear to others the license terms of this work.
  • FME® Desktop Copyright © 1994 – 2018, Safe Software Inc. All Rights Reserved
    FME® Desktop Copyright © 1994 – 2018, Safe Software Inc. All rights reserved. FME® is the registered trademark of Safe Software Inc. All brands and their product names mentioned herein may be trademarks or registered trademarks of their respective holders and should be noted as such. FME Desktop includes components licensed as described below: Autodesk FBX This software contains Autodesk® FBX® code developed by Autodesk, Inc. Copyright 2016 Autodesk, Inc. All rights, reserved. Such code is provided “as is” and Autodesk, Inc. disclaims any and all warranties, whether express or implied, including without limitation the implied warranties of merchantability, fitness for a particular purpose or non-infringement of third party rights. In no event shall Autodesk, Inc. be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of such code. Autodesk Libraries Contains Autodesk® RealDWG by Autodesk, Inc., Copyright © 2017 Autodesk, Inc. All rights reserved. Home page: www.autodesk.com/realdwg Belge72/b.Lambert72A NTv2 Grid Copyright © 2014-2016 Nicolas SIMON and validated by Service Public de Wallonie and Nationaal Geografisch Instituut. Under Creative Commons Attribution license (CC BY). Bentley i-Model SDK This software includes some components from the Bentley i-Model SDK. Copyright © Bentley Systems International Limited CARIS CSAR GDAL Plugin CARIS CSAR GDAL Plugin is owned by and copyright © 2013 Universal Systems Ltd.
  • Linux From Scratch, Version r11.0-36, Chinese Translation, Published September 21, 2021
    Linux From Scratch, version r11.0-36, Chinese translation. Published September 21, 2021. Originally written by Gerard Beekmans. Managing editor: Bruce Dubbs. Copyright © 1999-2021 Gerard Beekmans. All rights reserved. This book is released under the Creative Commons License. Computer commands extracted from this book are released under the MIT License. Linux® is a registered trademark of Linus Torvalds.
    Table of Contents. Preface: i. Foreword; ii. Intended Audience of This Book; iii. LFS Target Architectures; iv. Background Knowledge Needed to Read This Book; v. LFS and Standards; vi. Rationale for the Packages Chosen in This Book; vii. Typographical Conventions; viii. Structure of This Book
  • SVG-Based Knowledge Visualization
    MASARYK UNIVERSITY, FACULTY OF INFORMATICS. SVG-based Knowledge Visualization. DIPLOMA THESIS. Miloš Kaláb. Brno, spring 2012.
    Declaration: Hereby I declare, that this paper is my original authorial work, which I have worked out by my own. All sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source. Advisor: RNDr. Tomáš Gregar Ph.D.
    Acknowledgement: I would like to thank RNDr. Tomáš Gregar Ph.D. for supervising the thesis. His opinions, comments and advising helped me a lot with accomplishing this work. I would also like to thank Dr. Daniel Sonntag from DFKI GmbH, Saarbrücken, Germany, for the opportunity to work for him on the Medico project and for his supervising of the thesis during my Erasmus exchange in Germany. Big thanks also to Jochen Setz from Dr. Sonntag's team who worked on the server background used by my visualization. Last but not least, I would like to thank my family and friends for being extraordinarily supportive.
    Abstract: The aim of this thesis is to analyze the visualization of semantic data and suggest an approach to general visualization into the SVG format. Afterwards, the approach is to be implemented in a visualizer allowing the user to customize the visualization according to the nature of the data. The visualizer was integrated as an extension of Fresnel Editor.
    Keywords: Semantic knowledge, SVG, Visualization, JavaScript, Java, XML, Fresnel, XSLT
    Contents: Introduction; 1 Brief Introduction to the Related Technologies; 1.1 XML – Extensible Markup Language; 1.1.1 XSLT – Extensible Stylesheet Lang.
  • XPATH in NETCONF and YANG Table of Contents
    XPATH IN NETCONF AND YANG
    Table of Contents: 1. Introduction; 2. XPath 1.0 Introduction; 3. The Use of XPath in NETCONF; 4. The Use of XPath in YANG; 5. XPath and ConfD; 6. Conclusion; 7. Additional Resources
    1. Introduction: XPath is a powerful tool used by NETCONF and YANG. This application note will help you to understand and utilize this advanced feature of NETCONF and YANG. This application note gives a brief introduction to XPath, then describes how XPath is used in NETCONF and YANG, and finishes with a discussion of XPath in ConfD. The XPath 1.0 standard was defined by the W3C in 1999. It is a language which is used to address the parts of an XML document and was originally designed to be used by XML Transformations. XPath gets its name from its use of path notation for navigating through the hierarchical structure of an XML document. Since XML serves as the encoding format for NETCONF and a data model defined in YANG is represented in XML, it was natural for NETCONF and YANG to utilize XPath.
    2. XPath 1.0 Introduction: XML Path Language, or XPath 1.0, is a W3C recommendation first introduced in 1999. It is a language that is used to address and match parts of an XML document. XPath sees the XML document as a tree containing different kinds of nodes. The types of nodes can be root, element, text, attribute, namespace, processing instruction, and comment nodes.
  • Feasibility and Performance Evaluation of Canonical XML
    Feasibility and Performance Evaluation of Canonical XML. Student Research Project. Student: Manuel Binna. E-mail: [email protected]. Matriculation Number: 108004202162. Supervisor: Dipl.-Inf. Meiko Jensen. Period: 20.07.2010 - 19.10.2010. Chair for Network and Data Security, Prof. Dr. Jörg Schwenk, Faculty of Electrical Engineering and Information Technology, Ruhr University Bochum.
    Abstract: Within the boundaries of the XML specification, XML documents can be formatted in various ways without losing the logical equivalence of their content within the scope of the application. However, some applications like XML Signature cannot deal with this flexibility, thus needing a definite textual representation in order to distinguish changes which do or do not alter the logical equivalence of XML content. Canonical XML provides a method to transform textually different yet logically equivalent XML content into a single definite textual representation. This work evaluates the upcoming new major version Canonical XML Version 2.0 with respect to feasibility and performance.
    Declaration: I hereby declare that the content of this thesis is a work of my own and that it is original to the best of my knowledge, except where indicated by references to other sources. ____________________________ (Location, Date) ______________________________________ (Signature)
    Table of Contents: 1. Introduction; 1.1. XML; 1.2. Canonicalization; 1.3. History; 1.4. Canonicalization and XML Signature; 2.
  • Pearls of XSLT/Xpath 3.0 Design
    PEARLS OF XSLT AND XPATH 3.0 DESIGN
    PREFACE: XSLT 3.0 and XPath 3.0 contain a lot of powerful and exciting new capabilities. The purpose of this paper is to highlight the new capabilities. Have you got a pearl that you would like to share? Please send me an email and I will add it to this paper (and credit you). I ask three things: 1. The pearl highlights a capability that is new to XSLT 3.0 or XPath 3.0. 2. Provide a short, complete, working stylesheet with a sample input document. 3. Provide a brief description of the code. This is an evolving paper. As new pearls are found, they will be added.
    TABLE OF CONTENTS: 1. XPath 3.0 is a composable language; 2. Higher-order functions; 3. Partial functions; 4. Function composition; 5. Recursion with anonymous functions; 6. Closures; 7. Binary search trees; 8. -- next pearl is? --
    CHAPTER 1: XPATH 3.0 IS A COMPOSABLE LANGUAGE. The XPath 3.0 specification says this: XPath 3.0 is a composable language. What does that mean? It means that every operator and language construct allows any XPath expression to appear as its operand (subject only to operator precedence and data typing constraints). For example, take this expression: 3 + ____ The plus (+) operator has a left-operand, 3. What can the right-operand be? Answer: any XPath expression! Let's use the max() function as the right-operand: 3 + max(___) Now, what can the argument to the max() function be? Answer: any XPath expression! Let's use a for-loop as its argument: 3 + max(for $i in 1 to 10 return ___) Now, what can the return value of the for-loop be? Answer: any XPath expression! Let's use an if-statement: 3 + max(for $i in 1 to 10 return (if ($i gt 5) then ___ else ___)) And so forth.
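
Several of the entries above concern XPath, which addresses parts of an XML document using path notation over a tree of nodes. As a rough illustration only, the sketch below uses Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath 1.0 (not XPath 3.0). The sample document, its element names, and the variable names are all made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element and attribute names are
# assumptions chosen for illustration, not taken from any listed work.
SAMPLE = """<library>
  <book year="2002"><title>Python &amp; XML</title></book>
  <book year="2004"><title>Teach Yourself XML in 21 Days</title></book>
</library>"""

root = ET.fromstring(SAMPLE)

# Location path: every <title> that is a child of a <book> under the root.
all_titles = [t.text for t in root.findall("./book/title")]

# Attribute predicate: only books whose year attribute equals '2002'.
titles_2002 = [t.text for t in root.findall("./book[@year='2002']/title")]

print(all_titles)
print(titles_2002)
```

ElementTree is used here only because it ships with Python; a full XPath 1.0 or later engine would need a third-party library.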