INFO/CS 4302 Web Informaon Systems FT 2012 Week 5: Web Architecture: Structured Formats – Part 4 (DOM, JSON/YAML) (Lecture 9)

Theresa Velden

Haslhofer & Velden COURSE PROJECTS Q&A Example Web Informaon System Architecture

User Interface REST / Linked Data API

Web Application

Raw combine, (Relational) Database / Dataset(s), clean, In-memory store / refine Web API(s) File-based store RECAP XML & Related Technologies overview

Purpose Structured Define Document Access Transform content Structure Document Document Items

XML XML Schema XPath XSLT JSON RELAX NG DOM

YAML DTD XSLT • A transformaon in the XSLT language is expressed in the form of an XSL stylesheet – root element: – an document using the XSLT namespace, i.e. tags are in the namespace hp://www.w3.org/1999/XSL/Transform • The body is a set of templates or rules – The ‘match’ aribute specifies an XPath of elements in source – Body of template specifies contribuon of source elements to result tree • You need an XSLT processor to apply the style sheet to a source XML document

XSLT – In-class Exercise Recap • Example 1 (empty stylesheet – default behavior) • Example 2 (output format text, local_name()) • Example 3 (pulll model, one template only) • Example 4 (push model, modular design)

XSLT: Condional Instrucons • Programming languages typically provide ‘if- then, else’ constructs • XSLT provides – If-then: – If-then-(elif-then)*-else:

XML Source Document xsl:if xsl:if

? xsl:if

Movies

The Others (English tle)

HTML Result Document xsl:choose xsl:choose

Movies

The Others (English tle)

Les Autres (French tle)

HTML Result Document xsl:choose xsl:choose

Movies

The Others (en.)

Les Autres (fr.)

HTML Result Document DOM Well-formed XML Documents

• If an xml document is well-formed it can be transformed determiniscally (parsed) to a single infoset (DOM) DOM – Document Object Model

• API for manipulang document tree – Language independent; implemented in lots of languages – Suitable for XML and other document formats – Core noon of a node – Using DOM for a requires definion of interfaces and classes (“language bindings” that provide a DOM conformant API) • W3C Specificaons – Level 1: funconality and navigaon within a document – Level 2: modules and opons for specific content models (XML, HTML, CSS) – Level 3: further content model facilies (e.g. XML validaon handlers) DOM (Core) Level 1

DOM Node Objects • Root must be Document • Root children: one node and oponally Processing Instrucons, DocumentType, Comment, Text • Children of element must be either element, comment or text • Aributes are properes of elements (not their children), and are not member of the DOM tree • Children of aribute must be text W3C DOM Node Types W3C DOM Node Types View of DOM Architecture

hp://www.w3.org/TR/DOM-Level-3-Core/introducon.html DOM Architecture and Conformance

• The Core module, which contains the fundamental interfaces that must be implemented by all DOM conformant implementaons, is associated with the feature name "Core"; • The XML module, which contains the interfaces that must be implemented by all conformant XML 1.0 [XML 1.0] (and higher) DOM implementaons, is associated with the feature name "XML". Parsing XML Document Model (DOM)

• Complete in-memory representaon of enre tree

hp://docstore.mik.ua/orelly/xml/jxml/ch05_01.htm Java, Python & DOM • JDOM – Basic navigaon funconality • Parent • Child (all, specific, filtered) • Descendants • Aributes (all, specific, filtered) – Basic tree manipulaon • Adding, replacing, removing contents and aributes) • Text modificaon • Maintains well-formedness • Python: hp://docs.python.org/library/ xml.dom.minidom.html Alternave Support for Parsing of XML

• SAX (open source soware) – Event driven, sequenal – Mostly hierarchical, no’ sibling’ concept – More memory efficient – Good for quick, less intensive parsing that does not require easy-to-use, clean interface

hp://docstore.mik.ua/orelly/xml/jxml/ch05_01.htm Simple SAX Example Document Events

startDocument() startElement(“books”) startElement(“book”) characters(“War and Peace”) War and Peace endElement(“book”) endElement(“books”) endDocument()interface

Lagoze Fall 2012) JSON/YAML Document/Data Formats JSON

• “JavaScript Object Notaon (JSON) is a lightweight, text- based, language-independent data interchange format. […] JSON defines a small set of formang rules for the portable representaon of structured data.” [RFC 4627 hp://tools.ie.org/html/rfc4627]

• popularizaon and formal specificaon defined by Douglas Crockford (senior JavaScript architect at PayPal) • JSON Homepage: hp://www..org/

JSON - an open standard for data interchange

• text-based & human { readable "firstName": "Nicole", "lastName": "Kidman", "phoneNumber": [ • machine readable and { easy to parse "type": "home", "number": "212 555-1234" }, • not a mark-up language { but 'self-descripve' "type": "fax", "number": "646-555-4567" } • not extensible (different ]} from XML) JSON = JavaScript Object Notation

• programming language independent • maps to two universal structures:

1) An unordered collecon of name value pairs: {"firstName": "Nicole","lastName": "Kidman"} in other languages: object, record, struct, diconary, hash table, keyed list, or associave array

2) An ordered list of values: ["Monday", "Tuesday", "Wednesday"] in other languages: array, vector, list, or sequence JSON Syntax

• Primive types: 1. Strings 2. Numbers 3. Booleans (true/false) 4. Null • Structured Types 1. Objects 2. Array JSON Syntax

a collection of string value pairs

an ordered list of values

hp://www.json.org/ JSON Syntax

hp://www.json.org/ JSON Syntax

hp://www.json.org/ JSON Syntax

hp://www.json.org/ JSON Example: De-constructed

{ "firstName": "Julia", "lastName": "Chen", "phoneNumber": [ { "type": "home", "number": "212 555-1234" }, { "type": "fax", "number": "646-555-4567" } ]} JSON Example: De-constructed

{ an object with three string "firstName": "Julia", "lastName": value pairs "Chen", "phoneNumber": [ { "type": "home", "number": "212 555-1234" }, { "type": "fax", "number": "646-555-4567" } ]} JSON Example: De-constructed

{ an object with three string "firstName": "Julia", "lastName": value pairs "Chen", "phoneNumber": [ { "type": "home", the value of the third pair is "number": "212 555-1234" }, an array { "type": "fax", "number": "646-555-4567" } ]} JSON Example: De-constructed

{ an object with three string "firstName": "Julia", "lastName": value pairs "Chen", "phoneNumber": [ { "type": "home", the value of the third pair is "number": "212 555-1234" }, an array {

"type": "fax", holding two objects, each "number": "646-555-4567" } with two string value pairs ]}

JSON formatting and syntax rules hp://www.freeformaer.com/json-formaer.html

Syntax rules such as:

• "Objects are encapsulated within opening and closing { } • An empty object can be represented by { } • Arrays are encapsulated within opening and closing square brackets [ ] • An empty array can be represented by [ ] • A member is represented by a key-value pair • The key of a member should be contained in double quotes. (JavaScript does not require this. JavaScript and some parsers will tolerate single- quotes) [...]" JSON to XML comparison

From: "JSON: The fat free alternave to XML" hp:// www.json.org/xml.html

"JSON is a beer format. XML is a beer document exchange format. Use the right tool for the right job."

• JSON mime type: applicaon/json • formally defined in RFC 4627

JSON Schema

• defines structure of JSON data • a contract what JSON data to provide for an applicaon and how to interact with it • mime-type: applicaon/schema+json • Expired IETF dra: hp://tools.ie.org/html/dra-zyp-json-schema-03 • rarely used and lack of tool support JSON data & JSON Schema

{ schema excerpt: "catalogue": [ e.g. an object has a name (the string) and { properes defining the value part of the "movie": { object that can include further objects… "ID": 24, "tle": "Star Wars Episode VI:", {"name": "movie", "actors": [ "properes":{ { "ID": 10, "ID":{ "name": "Mark Hamill", "type": "number", "birth_date": "09-25-1951" "descripon": "movie idenfier", }, "required": true { "ID": 12, }, "name": "Anthony Daniels", … "birth_date": "02-21-1946" “actors”:{ } “type”: “object”, … ]}}]} } } JSON & AJAX Using the XMLHpRequest API in Javascript

var my_JSON_object = {}; var hp_request = new XMLHpRequest(); returns data in hp_request.open("GET", url, true); JSON format hp_request.onreadystatechange = funcon () { from server and var done = 4, ok = 200; evaluates it on if (hp_request.readyState == done && the client side as hp_request.status == ok) { a JavaScript my_JSON_object = JSON.pars(hp_request.responseText); object }}; hp_request.send(null);

[source: hp://en.wikipedia.org/wiki/JSON#Schema]

Same origin policy: the URL replying to the request must reside within the same DNS domain as the server that hosts the page containing the request; need to use JSONP (JSON with padding) to request objects from different servers, see hp://en.wikipedia.org/wiki/JSONP

Writing JSON in Python

JSON result file:

{"catalogue": {"movies": [{"movie": [{"tle": "The Others"}, {"actors": [{"actor": {"name": "Nicole Mary Kidman"}}, {"actor": {"name": "Elaine Cassidy"}}, {"actor": {"name": "Christopher Eccleston"}}, {"actor": {"name": "Alakina Mann"}}, {"actor": {"name": "Eric Sykes"}}, {"actor": {"name": "Fionnula Flanagan"}}]}]}, …]}} Writing JSON in Python Good news: Starng with python 2.6 the JSON module is part of the standard python library Parsing JSON versus XML

"Speeding up AJAX with JSON" by Jean Kelly (2006) at hp:// www.developer.com/lang/jscript/arcle.php/3596836

YAML = “YAML Ain’t

• is a human-friendly data serializaon language • design goals in decreasing priority according to specificaon at hp://www..org/spec/1.2/spec.html: – YAML is easily readable by humans. – YAML data is portable between programming languages. – YAML matches the nave data structures of agile languages. – YAML has a consistent model to support generic tools. – YAML supports one-pass processing. – YAML is expressive and extensible. – YAML is easy to implement and use. • Relaon to JSON – every JSON file is a valid YAML file – different design goals: YAML (human-friendly), JSON (simplicity, universality) – YAML more complex to generate and parse – YAML has a more complete informaon model • Relaon to XML: “YAML is the result of lessons learned from XML and other technologies” YAML

Source: hp://www.yaml.org/spec/1.2/spec.html YAML

Source: hp://www.yaml.org/spec/1.2/spec.html Next Week:

• Tuesday: recap of first four weeks of classes – Topics • Internet Architecture • Web Architecture • Structured Data Formats (XML & related Technologies) – Use piazza to suggest topics to recap (areas of fuzziness/lingering doubt) • Thursday: ‘Internet Surveillance’ – Note: readings will be announced later today and homework 4 will be responses to these readings – We want you to come prepared, so homework 4 will be due by 10 AM on Thursday 9/27 (i.e. before class!)