Lxmldoc-4.5.0.Pdf

lxml
2020-01-29

Contents

I lxml
  1 lxml
    Introduction
    Documentation
    Download
    Mailing list
    Bug tracker
    License
    Old Versions
  2 Why lxml?
    Motto
    Aims
  3 Installing lxml
    Where to get it
    Requirements
    Installation
      MS Windows
      Linux
      MacOS-X
    Building lxml from dev sources
    Using lxml with python-libxml2
    Source builds on MS Windows
    Source builds on MacOS-X
  4 Benchmarks and Speed
    General notes
    How to read the timings
    Parsing and Serialising
    The ElementTree API
      Child access
      Element creation
      Merging different sources
      deepcopy
      Tree traversal
    XPath
    A longer example
    lxml.objectify
      ObjectPath
      Caching Elements
      Further optimisations
  5 ElementTree compatibility of lxml.etree
  6 lxml FAQ - Frequently Asked Questions
    General Questions
      Is there a tutorial?
      Where can I find more documentation about lxml?
      What standards does lxml implement?
      Who uses lxml?
      What is the difference between lxml.etree and lxml.objectify?
      How can I make my application run faster?
      What about that trailing text on serialised Elements?
      How can I find out if an Element is a comment or PI?
      How can I map an XML tree into a dict of dicts?
      Why does lxml sometimes return ’str’ values for text in Python 2?
      Why do I get XInclude or DTD lookup failures on some systems but not on others?
      How do namespaces work in lxml?
    Installation
      Which version of libxml2 and libxslt should I use or require?
      Where are the binary builds?
      Why do I get errors about missing UCS4 symbols when installing lxml?
      My C compiler crashes on installation
    Contributing
      Why is lxml not written in Python?
      How can I contribute?
    Bugs
      My application crashes!
      My application crashes on MacOS-X!
      I think I have found a bug in lxml. What should I do?
      How do I know a bug is really in lxml and not in libxml2?
    Threading
      Can I use threads to concurrently access the lxml API?
      Does my program run faster if I use threads?
      Would my single-threaded program run faster if I turned off threading?
      Why can’t I reuse XSLT stylesheets in other threads?
      My program crashes when run with mod_python/Pyro/Zope/Plone/...
    Parsing and Serialisation
      Why doesn’t the pretty_print option reformat my XML output?
      Why can’t lxml parse my XML from unicode strings?
      Can lxml parse from file objects opened in unicode/text mode?
      What is the difference between str(xslt(doc)) and xslt(doc).write() ?
      Why can’t I just delete parents or clear the root node in iterparse()?
      How do I output null characters in XML text?
      Is lxml vulnerable to XML bombs?
      How do I use lxml safely as a web-service endpoint?
      How can I sort the attributes?
    XPath and Document Traversal
      What are the findall() and xpath() methods on Element(Tree)?
      Why doesn’t findall() support full XPath expressions?
      How can I find out which namespace prefixes are used in a document?
      How can I specify a default namespace for XPath expressions?

II Developing with lxml
  7 The lxml.etree Tutorial
    The Element class
      Elements are lists
      Elements carry attributes as a dict
      Elements contain text
      Using XPath to find text
      Tree iteration
      Serialisation
    The ElementTree class
    Parsing from strings and files
      The fromstring() function
      The XML() function
      The parse() function
      Parser objects
      Incremental parsing
      Event-driven parsing
    Namespaces
    The E-factory
    ElementPath
  8 APIs specific to lxml.etree
    lxml.etree
    Other Element APIs
    Trees and Documents
    Iteration
    Error handling on exceptions
    Error logging
    Serialisation
      C14N
      Pretty printing
      XML declaration
    Incremental XML generation
    CDATA
    XInclude and ElementInclude
  9 Parsing XML and HTML with lxml
    Parsers
      Parser options
      Error log
      Parsing HTML
      Doctype information
    The target parser interface
    The feed parser interface
    Incremental event parsing
      Event types
      Modifying the tree
      Selective tag events
      Comments and PIs
      Events with custom targets
    iterparse and iterwalk
      iterwalk
    Python unicode strings
      Serialising to Unicode strings
  10 Validation with lxml
    Validation at parse time
    DTD
    RelaxNG
    XMLSchema

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    512 pages
