
DOM and XPath
J. Schneeberger
University of Applied Sciences Deggendorf

Overview
• DOM – Document Object Model
• XPath
  – step and location path
  – axis
  – other concepts
  – node types, abbreviations, and data types
  – functions
• XLink and XPointer

DOM / XPath / XQuery
• idea: uniform query language for parts of XML trees
• DOM
  – first implementation in Netscape 2
  – different in different browsers
  – standard by W3C
  – platform independent
• XPath
  – Version 1: 1999
  – Version 2 draft: 2004
  – Version 2 standard: 2007
  – Version 2 is supported by only a few tools

DOM
Document Object Model

Document Object Model (DOM)
• W3C specification
• represents XML/HTML documents as a tree structure
• programming interface (API) in JavaScript for HTML and XML documents
  – Core DOM – base model for HTML and XML documents
  – XML DOM – model for XML documents
  – HTML DOM – model for HTML documents

DOM nodes
• the whole document is a node
• each XML element is a node
• the text within an XML element is a node
• each attribute is an (attribute) node
• comments are (comment) nodes

JavaScript DOM (Netscape)
[Figure]

In JavaScript / Browser
• Load an XML file

    // synchronous request; the parsed document is available in responseXML
    var xhttp = new window.XMLHttpRequest();
    xhttp.open("GET", "books.xml", false);
    xhttp.send("");
    var xmlDoc = xhttp.responseXML;

• Load an XML string

    // wrapper function (name chosen for illustration) so that "return" is valid
    function loadXMLString(txt) {
      var xmlDoc;
      try {
        // Internet Explorer
        xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
        xmlDoc.async = false;
        xmlDoc.loadXML(txt);
      } catch (e) {
        // other browsers
        var parser = new DOMParser();
        xmlDoc = parser.parseFromString(txt, "text/xml");
      }
      return xmlDoc;
    }

DOM: properties and methods
• Examples of properties, if x is a node:
  – x.nodeName – the name of x
  – x.nodeValue – the value of x
  – x.parentNode – the parent node of x
  – x.childNodes – the child nodes of x
  – x.attributes – the attribute nodes of x
• Examples of methods:
  – x.getElementsByTagName(name) returns all elements with the name name
  – x.appendChild(node) appends a child node below x
  – x.removeChild(node) removes a child node from x

Another JavaScript example

    txt = xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue

• xmlDoc: the DOM node generated by the parser
• getElementsByTagName("title")[0]: the first element from the array of all <title> elements
• childNodes[0]: the first element in the array of child nodes
• nodeValue: the value of a node (e.g. some text)

Accessing DOM nodes
• by getElementsByTagName()
• traversing the DOM tree
• navigating the tree using the node relations

Event handler
• Many HTML elements can have event handlers attached
  – e.g. onAbort, onClick, onFocus, onLoad, onMouseover, onSelect, onSubmit, etc.
• Event handlers may specify JavaScript expressions which are evaluated when the specified event occurs.
• The evaluation order of event handlers may be tricky
  – e.g. if a button specifies all of the following: onClick, onFocus, onMouseover, onSubmit

Firefox DOM Inspector
[Figure]
G. Görz, FAU, Informatik 8

DOM specifications
• Level 1: Core
  – HTML and XML document model
  – navigation in trees, tree modifications
• Level 2: Style Sheet
  – change of format specs at tree nodes
  – event handler
  – functions for interaction
  – XML namespaces
• Level 3:
  – load and save of documents
  – DTD and Schema support
  – views and formatting
• Further Levels: windows and interaction
[http://xml.coverpages.org/dom.html]

XPath

Overview
• DOM – Document Object Model
• XPath
  – step and location path
  – axis
  – other concepts
  – node types, abbreviations, and data types
  – functions
• XLink and XPointer

Source
• Parts from: Anders Møller, Michael I. Schwartzbach, “An Introduction to XML and Web Technologies”, Addison-Wesley, January 2006
• http://www.brics.dk/ixwt/

What is XPath?
• a notation to describe parts of trees
• a way to navigate in trees
• is used by:
  – XSLT – programming language to transform XML
  – XML Schema (to define the uniqueness and the scope of elements)
  – XLink and XPointer

XPath
• combination of path expressions (like those of a command shell) and simple programming language expressions
  – shell: *.xml – all files with the extension “.xml”
  – XPath, e.g.: /body/table[@border="1"]
• XSLT style sheets use XPath expressions in match and select attributes

    <xsl:template match="/">
    <xsl:value-of select="."/>
    <xsl:apply-templates select="/recipe/incredients/item"/>

Spaghetti

    <recipe>
      <title>Spaghetti Carbonara</title>
      <incredients>
        <item weight="250g">spaghetti</item>
        <item weight="10g">butter</item>
        <item amount="3">egg</item>
        <item>garlic</item>
      </incredients>
      <preparation>
        Spaghetti carbonara is the classical ..
      </preparation>
      <info>
        <difficulty>2</difficulty>
        <duration min="20" work="preparation"/>
        <duration min="20" work="total"/>
      </info>
    </recipe>

... as XML tree
[Figure: the recipe document drawn as a tree of element, attribute, and text nodes]

Location Path / Step

Location Path (1)
• two kinds of location paths
• relative path
  – one or more steps (from left to right), connected by “/”
  – each step selects a set of nodes
  – relative to the context (i.e. the start node)
  – each node of this set becomes a context node for the next step
  – multiple result sets of a step are combined (set union)
• absolute path
  – an absolute path consists of “/” followed by a relative path
  – the absolute path “/” on its own selects the root node
[http://www.w3.org/TR/xpath#location-paths]

Location Path (2)
• a location path
  – evaluates to a sequence of nodes
  – the sequence is sorted in document order
  – the sequence will never contain duplicates
• general form: step1 / step2 / ... / stepN

Evaluating a Location Path
• A step maps a context node into a sequence
• This also maps sequences to sequences
  – each node is used as context node
  – and is replaced with the result of applying the step
• The path then applies each step in turn.

Example: descendant::C
[Figure: example tree with nodes labelled A to F, context node marked]

Example: descendant::C/child::E
[Figure: the same example tree]

Example: descendant::C/child::E/child::F
[Figure: the same example tree, annotated with context, step 1, step 2, and step 3 / result]

Context
• The context of an XPath evaluation consists of
  – a context node (a node in an XML tree)
  – a context position and size (two nonnegative integers)
  – a set of variable bindings
  – a function library
  – a set of namespace declarations
• The application determines the initial context
• If the path starts with ‘/’ then
  – the initial context node is the root
  – the initial position and size are 1

XPath data model
• Starting point is the XML Information Set (Infoset), i.e. the information found in a valid and parsed XML document.
• In addition, the following holds for XPath:
  – All data types of XML Schema are supported (complex and simple types).
  – Collection elements and complex values.
  – Typed atomic values.
  – Ordered and heterogeneous sequences.
• http://www.w3.org/TR/xpath-datamodel/

Location Step
• The location path is a sequence of steps
• A location step consists of
  – an axis
  – a node test
  – zero or more predicates

    axis :: nodetest [expr1] [expr2]

Axis

Axis
• An axis is a sequence of nodes
• An axis is evaluated relative to a context
• XPath names 12 axes:
  child, parent, self, attribute, ancestor, descendant, ancestor-or-self, descendant-or-self, preceding-sibling, following-sibling, preceding, following

Axis Direction
• Each axis has a direction
• forward – in document order
  – child, descendant, following-sibling, following, self, descendant-or-self
• backward – in reverse document order
  – parent, ancestor, preceding-sibling, preceding
• without direction – depends on the implementation
  – attribute

parent axis [Figure: example tree, nodes labelled A to F]
child axis [Figure: example tree, nodes labelled A to F]
descendant axis [Figure: example tree, nodes labelled A to F]
ancestor axis [Figure: example tree, nodes labelled A to F]
following-sibling axis [Figure: example tree, nodes labelled A to F]
preceding-sibling axis [Figure: example tree, nodes labelled A to F]
following axis [Figure: example tree, nodes labelled A to F]
preceding axis [Figure: example tree, nodes labelled A to F]

axis
[Figure labels: ancestor, self, preceding, following, descendant]

axis
[Figure labels: ancestor, following-sibling, self, preceding-sibling, preceding, following, descendant]

Node Types, Abbreviations, and Data Types

Node Types
• element node – a node in the tree that corresponds to an element
• text node – text in the XML tree with no further subelements
• attribute node – represents an attribute (with name and value)

Node Test

    node test                  selection
    text()                     a text node
    comment()                  a comment node <!-- -->
    processing-instruction()   a processing instruction node <? ... ?>
    node()                     any node, e.g. element nodes and the text nodes between them
    *                          elements with arbitrary names
    QName                      elements with the given qualified name
    *:NCName                   elements with an arbitrary namespace and the given local name
    NCName:*                   elements with the given namespace and an arbitrary local name

Abbreviations for XPath expressions
• attributes: @
• the context node: .
• the parent of the context node: ..
• the descendant axis: // (strictly, short for /descendant-or-self::node()/)
• the child axis: (... by omission)

Abbreviations
[Figure]
[http://www.brics.dk/ixwt/]

More Examples

    Abbreviation        Long form
    *                   child::element()
    text()              child::text()
    @maker              attribute::maker
    @*                  attribute::*
    x//y                child::x/descendant::y
    .                   self::node()
    ..                  parent::node()
    ../@maker           parent::node()/attribute::maker
    car[5]              child::car[position()=5]
    car[@maker="US"]    child::car/self::node()[attribute::maker="US"]
    car[milage]         child::car/self::node()[child::milage]

Atomization
• A sequence may be atomized
• This results in a sequence of atomic values
• For element nodes this
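
The evaluation rule from the "Evaluating a Location Path" slide (each step is applied to every context node, and the results are merged without duplicates in document order) can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the tree model, the helper names node, documentOrder, step and evaluate, and the example tree are all made up here and are neither the browser DOM nor the tree from the figures. Only a few axes, simple name tests and relative paths are covered; predicates and abbreviations are left out.

```javascript
// Hypothetical minimal tree model (not the browser DOM): every node has a
// name, a parent link and an ordered list of children.
function node(name, ...children) {
  const n = { name, parent: null, children };
  children.forEach(c => { c.parent = n; });
  return n;
}

// Document order corresponds to a pre-order traversal of the tree.
function documentOrder(root) {
  const out = [];
  (function walk(n) { out.push(n); n.children.forEach(walk); })(root);
  return out;
}

// A few of the axes; each maps one context node to a list of nodes.
const axes = {
  self: n => [n],
  child: n => n.children,
  parent: n => (n.parent ? [n.parent] : []),
  descendant: n => documentOrder(n).slice(1),
  ancestor: n => (n.parent ? [n.parent, ...axes.ancestor(n.parent)] : []),
};

// One step: apply axis and name test to every context node, then merge
// the results without duplicates, sorted in document order.
function step(root, contexts, axis, name) {
  const hits = new Set();
  for (const ctx of contexts) {
    for (const n of axes[axis](ctx)) {
      if (name === "*" || n.name === name) hits.add(n);
    }
  }
  return documentOrder(root).filter(n => hits.has(n));
}

// A relative path such as "descendant::C/child::E" applies its steps
// from left to right, starting with the single context node.
function evaluate(root, context, path) {
  let current = [context];
  for (const s of path.split("/")) {
    const [axis, name] = s.split("::");
    current = step(root, current, axis, name);
  }
  return current;
}

// Example tree, invented for this sketch.
const f1 = node("F"), f2 = node("F");
const e1 = node("E", f1), e2 = node("E"), e3 = node("E", f2);
const c1 = node("C", e1, e2), c2 = node("C", e3);
const a = node("A", node("B", c1), node("B", c2, node("D")));

const result = evaluate(a, a, "descendant::C/child::E/child::F");
console.log(result.map(n => n.name));
```

Collecting the hits in a Set and then filtering the document-order list is one simple way to obtain both properties named on the "Location Path (2)" slide (document order, no duplicates); a real XPath engine would use a more efficient strategy.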