Perl and XML.Pdf

Total Page:16

File Type:pdf, Size:1020Kb

Perl and XML.Pdf Perl and XML XML is a text-based markup language that has taken the programming world by storm. More powerful than HTML yet less demanding than SGML, XML has proven itself to be flexible and resilient. XML is the perfect tool for formatting documents with even the smallest bit of complexity, from Web pages to legal contracts to books. However, XML has also proven itself to be indispensable for organizing and conveying other sorts of data as well, thus its central role in web services like SOAP and XML-RPC. As the Perl programming language was tailor-made for manipulating text, few people have disputed the fact that Perl and XML are perfectly suited for one another. The only question has been what's the best way to do it. That's where this book comes in. Perl & XML is aimed at Perl programmers who need to work with XML documents and data. The book covers all the major modules for XML processing in Perl, including XML::Simple, XML::Parser, XML::LibXML, XML::XPath, XML::Writer, XML::Pyx, XML::Parser::PerlSAX, XML::SAX, XML::SimpleObject, XML::TreeBuilder, XML::Grove, XML::DOM, XML::RSS, XML::Generator::DBI, and SOAP::Lite. But this book is more than just a listing of modules; it gives a complete, comprehensive tour of the landscape of Perl and XML, making sense of the myriad of modules, terminology, and techniques. This book covers: • parsing XML documents and writing them out again • working with event streams and SAX • tree processing and the Document Object Model • advanced tree processing with XPath and XSLT Most valuably, the last two chapters of Perl & XML give complete examples of XML applications, pulling together all the tools at your disposal. All together, Perl & XML is the single book that gives you a solid grounding in XML processing with Perl. Table of Contents Preface ......................................................................................................................................... 1 Assumptions ..........................................................................................................................................................1 How This Book Is Organized ................................................................................................................................1 Resources...............................................................................................................................................................2 Font Conventions...................................................................................................................................................2 How to Contact Us ................................................................................................................................................2 Acknowledgments .................................................................................................................................................3 Chapter 1. Perl and XML ............................................................................................................ 4 1.1 Why Use Perl with XML? ...............................................................................................................................4 1.2 XML Is Simple with XML::Simple.................................................................................................................4 1.3 XML Processors ..............................................................................................................................................7 1.4 A Myriad of Modules ......................................................................................................................................8 1.5 Keep in Mind...................................................................................................................................................8 1.6 XML Gotchas ..................................................................................................................................................9 Chapter 2. An XML Recap........................................................................................................ 11 2.1 A Brief History of XML ................................................................................................................................11 2.2 Markup, Elements, and Structure ..................................................................................................................13 2.3 Namespaces ...................................................................................................................................................15 2.4 Spacing ..........................................................................................................................................................16 2.5 Entities...........................................................................................................................................................17 2.6 Unicode, Character Sets, and Encodings .......................................................................................................19 2.7 The XML Declaration....................................................................................................................................19 2.8 Processing Instructions and Other Markup....................................................................................................19 2.9 Free-Form XML and Well-Formed Documents ............................................................................................21 2.10 Declaring Elements and Attributes ..............................................................................................................22 2.11 Schemas.......................................................................................................................................................22 2.12 Transformations...........................................................................................................................................24 Chapter 3. XML Basics: Reading and Writing.......................................................................... 28 3.1 XML Parsers..................................................................................................................................................28 3.2 XML::Parser ..................................................................................................................................................34 3.3 Stream-Based Versus Tree-Based Processing ...............................................................................................38 3.4 Putting Parsers to Work.................................................................................................................................39 3.5 XML::LibXML..............................................................................................................................................41 3.6 XML::XPath ..................................................................................................................................................43 3.7 Document Validation.....................................................................................................................................44 3.8 XML::Writer..................................................................................................................................................46 3.9 Character Sets and Encodings........................................................................................................................50 Chapter 4. Event Streams .......................................................................................................... 55 4.1 Working with Streams ...................................................................................................................................55 4.2 Events and Handlers ......................................................................................................................................55 4.3 The Parser as Commodity..............................................................................................................................57 4.4 Stream Applications.......................................................................................................................................57 4.5 XML::PYX ....................................................................................................................................................58 4.6 XML::Parser ..................................................................................................................................................60 Chapter 5. SAX.......................................................................................................................... 64 5.1 SAX Event Handlers......................................................................................................................................64 5.2 DTD Handlers................................................................................................................................................70 5.3 External Entity Resolution.............................................................................................................................73 5.4 Drivers for Non-XML Sources......................................................................................................................74 5.5 A Handler Base Class ....................................................................................................................................76
Recommended publications
  • Bibliography of Erik Wilde
    dretbiblio dretbiblio Erik Wilde's Bibliography References [1] AFIPS Fall Joint Computer Conference, San Francisco, California, December 1968. [2] Seventeenth IEEE Conference on Computer Communication Networks, Washington, D.C., 1978. [3] ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Los Angeles, Cal- ifornia, March 1982. ACM Press. [4] First Conference on Computer-Supported Cooperative Work, 1986. [5] 1987 ACM Conference on Hypertext, Chapel Hill, North Carolina, November 1987. ACM Press. [6] 18th IEEE International Symposium on Fault-Tolerant Computing, Tokyo, Japan, 1988. IEEE Computer Society Press. [7] Conference on Computer-Supported Cooperative Work, Portland, Oregon, 1988. ACM Press. [8] Conference on Office Information Systems, Palo Alto, California, March 1988. [9] 1989 ACM Conference on Hypertext, Pittsburgh, Pennsylvania, November 1989. ACM Press. [10] UNIX | The Legend Evolves. Summer 1990 UKUUG Conference, Buntingford, UK, 1990. UKUUG. [11] Fourth ACM Symposium on User Interface Software and Technology, Hilton Head, South Carolina, November 1991. [12] GLOBECOM'91 Conference, Phoenix, Arizona, 1991. IEEE Computer Society Press. [13] IEEE INFOCOM '91 Conference on Computer Communications, Bal Harbour, Florida, 1991. IEEE Computer Society Press. [14] IEEE International Conference on Communications, Denver, Colorado, June 1991. [15] International Workshop on CSCW, Berlin, Germany, April 1991. [16] Third ACM Conference on Hypertext, San Antonio, Texas, December 1991. ACM Press. [17] 11th Symposium on Reliable Distributed Systems, Houston, Texas, 1992. IEEE Computer Society Press. [18] 3rd Joint European Networking Conference, Innsbruck, Austria, May 1992. [19] Fourth ACM Conference on Hypertext, Milano, Italy, November 1992. ACM Press. [20] GLOBECOM'92 Conference, Orlando, Florida, December 1992. IEEE Computer Society Press. http://github.com/dret/biblio (August 29, 2018) 1 dretbiblio [21] IEEE INFOCOM '92 Conference on Computer Communications, Florence, Italy, 1992.
    [Show full text]
  • FME® Desktop Copyright © 1994 – 2018, Safe Software Inc. All Rights Reserved
    FME® Desktop Copyright © 1994 – 2018, Safe Software Inc. All rights reserved. FME® is the registered trademark of Safe Software Inc. All brands and their product names mentioned herein may be trademarks or registered trademarks of their respective holders and should be noted as such. FME Desktop includes components licensed as described below: Autodesk FBX This software contains Autodesk® FBX® code developed by Autodesk, Inc. Copyright 2016 Autodesk, Inc. All rights, reserved. Such code is provided “as is” and Autodesk, Inc. disclaims any and all warranties, whether express or implied, including without limitation the implied warranties of merchantability, fitness for a particular purpose or non-infringement of third party rights. In no event shall Autodesk, Inc. be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of such code. Autodesk Libraries Contains Autodesk® RealDWG by Autodesk, Inc., Copyright © 2017 Autodesk, Inc. All rights reserved. Home page: www.autodesk.com/realdwg Belge72/b.Lambert72A NTv2 Grid Copyright © 2014-2016 Nicolas SIMON and validated by Service Public de Wallonie and Nationaal Geografisch Instituut. Under Creative Commons Attribution license (CC BY). Bentley i-Model SDK This software includes some components from the Bentley i-Model SDK. Copyright © Bentley Systems International Limited CARIS CSAR GDAL Plugin CARIS CSAR GDAL Plugin is owned by and copyright © 2013 Universal Systems Ltd.
    [Show full text]
  • Linux from Scratch 版本 R11.0-36-中⽂翻译版 发布于 2021 年 9 ⽉ 21 ⽇
    Linux From Scratch 版本 r11.0-36-中⽂翻译版 发布于 2021 年 9 ⽉ 21 ⽇ 由 Gerard Beekmans 原著 总编辑:Bruce Dubbs Linux From Scratch: 版本 r11.0-36-中⽂翻译版 : 发布于 2021 年 9 ⽉ 21 ⽇ 由 由 Gerard Beekmans 原著和总编辑:Bruce Dubbs 版权所有 © 1999-2021 Gerard Beekmans 版权所有 © 1999-2021, Gerard Beekmans 保留所有权利。 本书依照 Creative Commons License 许可证发布。 从本书中提取的计算机命令依照 MIT License 许可证发布。 Linux® 是Linus Torvalds 的注册商标。 Linux From Scratch - 版本 r11.0-36-中⽂翻译版 ⽬录 序⾔ .................................................................................................................................... viii i. 前⾔ ............................................................................................................................ viii ii. 本书⾯向的读者 ............................................................................................................ viii iii. LFS 的⽬标架构 ............................................................................................................ ix iv. 阅读本书需要的背景知识 ................................................................................................. ix v. LFS 和标准 ..................................................................................................................... x vi. 本书选择软件包的逻辑 .................................................................................................... xi vii. 排版约定 .................................................................................................................... xvi viii. 本书结构 .................................................................................................................
    [Show full text]
  • Open Source Software Attributions for ASCET-DEVELOPER 7.3.0
    ASCET-DEVELOPER 7.3.0 OSS Attributions Open Source Software Attributions for ASCET-DEVELOPER 7.3.0 This document is provided as part of the fulfillment of OSS license conditions and does not require users to take any action before or while using the product. Page 1 of 102 ASCET-DEVELOPER 7.3.0 OSS Attributions Table of Contents Contents 1 List of used Open Source Components. ................................................................................................ 3 2 Appendix - License Text ................................................................................................................. 15 2.1 ANTLR Software Rights Notice .................................................................................................. 15 2.2 ASM License ......................................................................................................................... 16 2.3 Apache License 1.1 ................................................................................................................. 17 2.4 Apache License 2.0 ................................................................................................................. 18 2.5 BSD (Three Clause License) ...................................................................................................... 21 2.6 BSD 4-clause "Original" or "Old" License ..................................................................................... 22 2.7 Common Development and Distribution License 1.0.....................................................................
    [Show full text]
  • Veritas Infoscale™ Third-Party Software License Agreements Last Updated: 2017-11-06 Legal Notice Copyright © 2017 Veritas Technologies LLC
    Veritas InfoScale™ Third-Party Software License Agreements Last updated: 2017-11-06 Legal Notice Copyright © 2017 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. This product may contain third party software for which Veritas is required to provide attribution to the third party (“Third Party Programs”). Some of the Third Party Programs are available under open source or free software licenses. The License Agreement accompanying the Software does not alter any rights or obligations you may have under those open source or free software licenses. Refer to the third party legal notices document accompanying this Veritas product or available at: https://www.veritas.com/about/legal/license-agreements The product described in this document is distributed under licenses restricting its use, copying, distribution, and decompilation/reverse engineering. No part of this document may be reproduced in any form by any means without prior written authorization of Veritas Technologies LLC and its licensors, if any. THE DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. VERITAS TECHNOLOGIES LLC SHALL NOT BE LIABLE FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH THE FURNISHING, PERFORMANCE, OR USE OF THIS DOCUMENTATION. THE INFORMATION CONTAINED IN THIS DOCUMENTATION IS SUBJECT TO CHANGE WITHOUT NOTICE.
    [Show full text]
  • Open Source Acknowledgements
    This document acknowledges certain third‐parties whose software is used in Esri products. GENERAL ACKNOWLEDGEMENTS Portions of this work are: Copyright ©2007‐2011 Geodata International Ltd. All rights reserved. Copyright ©1998‐2008 Leica Geospatial Imaging, LLC. All rights reserved. Copyright ©1995‐2003 LizardTech Inc. All rights reserved. MrSID is protected by the U.S. Patent No. 5,710,835. Foreign Patents Pending. Copyright ©1996‐2011 Microsoft Corporation. All rights reserved. Based in part on the work of the Independent JPEG Group. OPEN SOURCE ACKNOWLEDGEMENTS 7‐Zip 7‐Zip © 1999‐2010 Igor Pavlov. Licenses for files are: 1) 7z.dll: GNU LGPL + unRAR restriction 2) All other files: GNU LGPL The GNU LGPL + unRAR restriction means that you must follow both GNU LGPL rules and unRAR restriction rules. Note: You can use 7‐Zip on any computer, including a computer in a commercial organization. You don't need to register or pay for 7‐Zip. GNU LGPL information ‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐ This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You can receive a copy of the GNU Lesser General Public License from http://www.gnu.org/ See Common Open Source Licenses below for copy of LGPL 2.1 License.
    [Show full text]
  • Foreign Library Interface by Daniel Adler Dia Applications That Can Run on a Multitude of Plat- Forms
    30 CONTRIBUTED RESEARCH ARTICLES Foreign Library Interface by Daniel Adler dia applications that can run on a multitude of plat- forms. Abstract We present an improved Foreign Function Interface (FFI) for R to call arbitary na- tive functions without the need for C wrapper Foreign function interfaces code. Further we discuss a dynamic linkage framework for binding standard C libraries to FFIs provide the backbone of a language to inter- R across platforms using a universal type infor- face with foreign code. Depending on the design of mation format. The package rdyncall comprises this service, it can largely unburden developers from the framework and an initial repository of cross- writing additional wrapper code. In this section, we platform bindings for standard libraries such as compare the built-in R FFI with that provided by (legacy and modern) OpenGL, the family of SDL rdyncall. We use a simple example that sketches the libraries and Expat. The package enables system- different work flow paths for making an R binding to level programming using the R language; sam- a function from a foreign C library. ple applications are given in the article. We out- line the underlying automation tool-chain that extracts cross-platform bindings from C headers, FFI of base R making the repository extendable and open for Suppose that we wish to invoke the C function sqrt library developers. of the Standard C Math library. The function is de- clared as follows in C: Introduction double sqrt(double x); We present an improved Foreign Function Interface The .C function from the base R FFI offers a call (FFI) for R that significantly reduces the amount of gate to C code with very strict conversion rules, and C wrapper code needed to interface with C.
    [Show full text]
  • Oracle Database Licensing Information for Additional Information
    Oracle® Exadata Database Machine Licensing Information 11g Release 2 (11.2) E25431-04 November 2013 Oracle Exadata Database Machine Licensing Information, 11g Release 2 (11.2) E25431-04 Copyright © 2012, 2013, Oracle and/or its affiliates. All rights reserved. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs.
    [Show full text]
  • Sun Microsystems, Inc. Binary Code License Agreement for the JAVA 2 PLATFORM STANDARD EDITION RUNTIME ENVIRONMENT 5.0 SUN MICROS
    Sun Microsystems, Inc. Binary Code License Agreement for the JAVA 2 PLATFORM STANDARD EDITION RUNTIME ENVIRONMENT 5.0 SUN MICROSYSTEMS, INC. ("SUN") IS WILLING TO LICENSE THE SOFTWARE IDENTIFIED BELOW TO YOU ONLY UPON THE CONDITION THAT YOU ACCEPT ALL OF THE TERMS CONTAINED IN THIS BINARY CODE LICENSE AGREEMENT AND SUPPLEMENTAL LICENSE TERMS (COLLECTIVELY "AGREEMENT"). PLEASE READ THE AGREEMENT CAREFULLY. BY DOWNLOADING OR INSTALLING THIS SOFTWARE, YOU ACCEPT THE TERMS OF THE AGREEMENT. INDICATE ACCEPTANCE BY SELECTING THE "ACCEPT" BUTTON AT THE BOTTOM OF THE AGREEMENT. IF YOU ARE NOT WILLING TO BE BOUND BY ALL THE TERMS, SELECT THE "DECLINE" BUTTON AT THE BOTTOM OF THE AGREEMENT AND THE DOWNLOAD OR INSTALL PROCESS WILL NOT CONTINUE. 1. DEFINITIONS. "Software" means the identified above in binary form, any other machine readable materials (including, but not limited to, libraries, source files, header files, and data files), any updates or error corrections provided by Sun, and any user manuals, programming guides and other documentation provided to you by Sun under this Agreement. "Programs" mean Java applets and applications intended to run on the Java 2 Platform Standard Edition (J2SE platform) platform on Java-enabled general purpose desktop computers and servers. 2. LICENSE TO USE. Subject to the terms and conditions of this Agreement, including, but not limited to the Java Technology Restrictions of the Supplemental License Terms, Sun grants you a non-exclusive, non-transferable, limited license without license fees to reproduce and use internally Software complete and unmodified for the sole purpose of running Programs. Additional licenses for developers and/or publishers are granted in the Supplemental License Terms.
    [Show full text]
  • Red Hat Jboss Core Services 2.4.23 Apache HTTP Server 2.4.23 Release Notes
    Red Hat JBoss Core Services 2.4.23 Apache HTTP Server 2.4.23 Release Notes For use with Red Hat JBoss middleware products. Last Updated: 2018-05-23 Red Hat JBoss Core Services 2.4.23 Apache HTTP Server 2.4.23 Release Notes For use with Red Hat JBoss middleware products. Legal Notice Copyright © 2018 Red Hat, Inc. The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/ . In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version. Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law. Red Hat, Red Hat Enterprise Linux, the Shadowman logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries. Linux ® is the registered trademark of Linus Torvalds in the United States and other countries. Java ® is a registered trademark of Oracle and/or its affiliates. XFS ® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries. MySQL ® is a registered trademark of MySQL AB in the United States, the European Union and other countries. Node.js ® is an official trademark of Joyent.
    [Show full text]
  • Foils USB Seminar.Msw
    To Transpose or Not to Transpose; That is the Question Performance Issues in the use of Java-based XML Software for Real-World Problems Herbert J. Bernstein Bernstein + Sons, Bellport, NY Design and Analysis Research Seminar Computer Science Department, SUNY at Stony Brook Wed., Sept. 26, 2001 2-3pm, CS Seminar Room 1306 Work supported in part by NSF via NDB project at Rutgers Bernstein: To Transpose or Not to Transpose, 26 Sep 01 Page - 1 Preface • Started in Bioinformatics • Managing large structural data sets • Drawing representations of molecules • Older representations were fixed field, order dependent • Natural for arrays and procedural languages • Newer representations are free field, order independent • Natural for trees and object-oriented languages • Experience revealed serious performance issues • Apply to any language with dynamic memory allocation • Current impact is in high performance graphics • Likely to be a general issue as use of XML for large complex documents spreads Bernstein: To Transpose or Not to Transpose, 26 Sep 01 Page - 2 Typical drawing from fixed field file (1CRN) Bernstein: To Transpose or Not to Transpose, 26 Sep 01 Page - 3 Drawing from XML version of 1CRN using Jmol Bernstein: To Transpose or Not to Transpose, 26 Sep 01 Page - 4 Introduction • The question: Java performance for large XML documents • How the question arose • Data representation in Bioinformatics • PDB format, CIF, CML and XML • XML is seeing more use • Document publication markup • Tree-oriented data representation • XHTML • Java is often
    [Show full text]
  • ATTACHMENT a to the LICENSE AGREEMENT for TRESYS XD AIR SOFTWARE END USER LICENSE AGREEMENT
    ATTACHMENT A to the LICENSE AGREEMENT FOR TRESYS XD AIR SOFTWARE END USER LICENSE AGREEMENT RED HAT® ENTERPRISE LINUX® AND RED HAT APPLICATIONS This end user license agreement (“EULA”) governs the use of any of the versions of Red Hat Enterprise Linux, any Red Hat Applications (as set forth at www.redhat.com/licenses/products), and any related updates, source code, appearance, structure and organization (the “Programs”), regardless of the delivery mechanism. 1. License Grant. Subject to the following terms, Red Hat, Inc. (“Red Hat”) grants to you (“User”) a perpetual, worldwide license to the Programs pursuant to the GNU General Public License v.2. The Programs are either a modular operating system or an application consisting of hundreds of software components. With the exception of certain image files identified in Section 2 below, the license agreement for each software component is located in the software component's source code and permits User to run, copy, modify, and redistribute (subject to certain obligations in some cases) the software component, in both source code and binary code forms. This EULA pertains solely to the Programs and does not limit User's rights under, or grant User rights that supersede, the license terms of any particular component. 2. Intellectual Property Rights. The Programs and each of their components are owned by Red Hat and others and are protected under copyright law and under other laws as applicable. Title to the Programs and any component, or to any copy, modification, or merged portion shall remain with the aforementioned, subject to the applicable license.
    [Show full text]