Practical Use Of
CERN – European Organization for Nuclear Research IT Department – e–Business Section
PracticalPractical UseUse ofof XMLXML
Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland XMLXML
eeXXtensibletensibleMMarkuparkup LLanguageanguage
l SGML (ISO standard, 1986) Mainly for technical documentation l XML (W3C recommendation, 1998) Simplification and enhancement of SGML, wide area of use
CERN e–Business WhyWhy Markup?Markup?
CERN e–Business XMLXML:: RulesRules
l Header
Some rules l Element names are case-sensitive l Every opening tag should have a closing tag l Tags cannot intersect () l Attributes values – in quotes or apostrophes CERN e–Business XMLXML:: DataData TransferTransfer
l Platform and language independent
l Easy to write, easy to process
l Understandable for humans and computers
l Open standard
– Many libraries exist – Lots of literature available
– Specialized XML-editors
l Possibility to check the document structure
CERN e–Business XMLXML:: DataData TransferTransfer (2)(2)
Example: EDH Transport Request
XML
External EDH Program
l Automatic form generation from external programs
l XML as data transfer format
l Schema checkup as a warranty of data consistency CERN e–Business WebWeb ServicesServices
l Data transfer between programs on Internet l Open Standard l Platform and language independent (Java, .Net, …)
Web XML service
SOAPSOAP XML
WSDLWSDL WSDL – Web Service Definition Language SOAP – Simple Object Access Protocol
CERN e–Business XMLXML:: DataData StorageStorage
l Data structure is kept together with the data
l Object “addendum” to relational RDBMS
l Structure checkup
l Supported by many modern RDBMS
– Microsoft SQL Server 2005, Oracle 9i +,
– XML Data Type
– XML indexes
– XML Queries (XQuery etc.)
– Data output in XML format
CERN e–Business XMLXML:: DataData StorageStorage (2)(2)
Example: EDH Search System Problem: Effective search using arbitrary number of criteria is problematic
Our solution:
l All documents are stored in XML
l Context-specific XML search (Oracle InterMedia) Example: «Find documents created by Slava»:
Select DOC_ID from DOC_XML where Contains(XML, “Slava within creator”) > 0;
CERN e–Business XMLXML:: DataData TransformationsTransformations
l XML can be transformed into HTML, text, PDF, ...
– No need for special program solutions
– Commercial visual editors exist
– Platform independent
CERN e–Business XMLXML--basedbased StandardsStandards
l Possibility to formally define the structure
l Platform and language independent
l Understandable for humans and computers
l Possibility to use XML technologies (XSLT transformations, XQuery queries)…
– WSDL (Web Services Definition Language) – SOAP (Simple Object Access Protocol) – XHTML (HTML that complies to XML rules) – SVG (Scalable Vector Graphics) – ebXML (XML for e-Business) – … CERN e–Business FormalFormal StructureStructure DefinitionDefinition
l There are ways to define XML structure formally
ObsoleteObsolete!! NotNot for for new new developmentdevelopment • DTD (Document Type Definition)
• XML Schema
CERN e–Business XMLXML SchemaSchema:: PossibilitiesPossibilities
l Check element presence and their order l Sequences and choices l Number of repetitions for elements and groups l Attributes and their presence l Type of elements and attributes l Restrictions for elements and attributes l Default values l Unique constraints l ...
CERN e–Business XMLXML--schemaschema:: whenwhen itit isis neededneeded??
l Formal structure definition for future reference
l Programmers may rely on data consistence
l Authors may check XML validness in advance
CERN e–Business XMLXML--schemaschema:: whenwhen NOTNOT neededneeded??
l When we know in advance that XML is valid
l When we do not care about document validness
l When maximum processing speed is required
l Small “throw away” projects
CERN e–Business XPathXPath:: XMLXML NavigationNavigation
l Access to XML elements
l Result of an XPATH-expression can be:
l XML Node l String l Node Set l Number l Boolean l Empty Set
C:\presentation\author\firstname /presentation/author/firstname
CERN e–Business XXPathPath:: ExamplesExamples
CERN e–Business XPathXPath:: ExamplesExamples ((88))
Example: Event Handling System
Handling System Events Subscriptions
XML XML XPath XPath Check events against XPath XML
Notifications
«I want to see all documents for more than 600 CHF»
/ document [amount > 600]
CERN e–Business XPathXPath:: ProgramProgram UseUse
XPath
System.out.println(((XMLDocument)xml).selectSingleNode( "/config/report[@name='Slava']/title/text()").getNodeValue());
DOM Model
Element root = xml.getDocumentElement(); Node child; for (child = root.getFirstChild(); child != null; child = child.getNextSibling()) if (child.getNodeName().equals("report") && ( (Element)child ).getAttribute ("name").equals("Slava")) break; for (child = ((Element)child). getFirstChild(); child != null; child = child.getNextSibling()) { if (child.getNodeName().equals("title") ) { for (Node child2 = child.getFirstChild(); child2 != null; child2 = child2. getNextSibling()) if ( child2 instanceof Text ) System.out.println(( (Text)child2 ).getData ().trim()); } CERN } e–Business XQuerXQueryy ––XMLXML QueryQuery LanguageLanguage
l XQuery is SQL for XML
– Database independent
– Easy to use
l Supported by popular RDBMS (Microsoft SQL Server 2005, Oracle 9i and10g)
l Based on XPath, supports document sets
CERN e–Business XSLT:XSLT: XMLXML TransformationsTransformations
l Transforms XML to HTML, text or other XML l XSLT 1.0 (Current), XSLT 2.0 (Draft) l XSLT is a “Human Interface” to XML l Supported by Web Browsers
XSLT
CERN e–Business XSLT:XSLT: SimplifiedSimplified StructureStructure
l XSLT is an XML file l Active usage of XPath expressions
xsl:stylesheet Evaluate XPath and print value
xsl:value-of … xsl:template xsl:value-of… …
xsl:apply-templates xsl:template … Apply templates Apply a template to other elements to the given element CERN e–Business XSLT:XSLT: PossibilitiesPossibilities
l Conditions (
XSLT 2.0 (Draft) l XPath 2.0 l Custom functions l Regular expressions l Date and time formatting l Groupings CERN e–Business XSLT:XSLT: ExampleExample
Author:
Table of Contents
CERN e–Business XSLTXSLT:: WebWeb “Skins”“Skins” -- 22
XSLT
CERN e–Business XSLTXSLT:: UserUser InterfacesInterfaces
CERN Stores Catalog
l Data loaded through XML
l Data stored in XML
l XSLT for data output
l 150000 items
l +10000 users
l ~15-20K XML for each page
l Custom formatting (through XSLT redefinition)
CERN e–Business XSLT:XSLT: XMLXML toto TextText
Example: l Automatic code generation
XML-description
Did you know… BusinessBusiness Logic Logic that 1 EDH document is: ... l At least 20 source files (code, HTML templates, resources, SQL, …) SQLSQL
l About 250K of source code
CERN e–Business XSLT:XSLT: XMLXML toto XMLXML
l Generate XML from another XML source
l “Configuration files update”
l XSL:FO
CERN e–Business XSLXSL--FO:FO: FormattingFormatting ObjectsObjects
l FO: XML-description of document layout
l XSL-FO: XSLT transformation of XML document to FO document
l FO Processor: program that converts the FO definition into a printable format (PDF, PS, ...)
XML FO PDF Document Document Document
XSL:FO
CERN e–Business XSLXSL--FO:FO: FormattingFormatting ObjectsObjects
FO has all capabilities of modern text editors:
l Fonts l Pagination l Headers and footers l Page numbering l Odd/even page distinction l Margins and intervals l Keep paragraphs together FO Processor: Apache FOP l Hangout lines l Tables l Graphics l …
CERN e–Business XSLXSL--FO:FO: ExampleExample
Web Interface e-MAPS
XSLT XMLXML
XSL:FO
Printable Version FOP Processor
l No extra code required l RTF to XSL:FO converters are good l Can be written by a student l Output format independent
CERN e–Business XMLXML EditorsEditors
l Specially designed for XML editing l XML well-formedness and validity check l DTD and Schema visual editing XMLSpy 2005 l XML generation accordingly to DTD/Schema l Creation and debugging of XSLT and XSL:FO l Visual XSLT editing
Example: Altova XML Spy (www.xmlspy.com) - Available from NICE - License can be obtained from the SDT service CERN e–Business XMLXML:: ProgramProgram HandlingHandling
l DOM (Document Object Model)
– Tree building SAXSAX --muchmuch faster faster, , l SAX DOMDOM – –moremore versatile versatile – Event handling – startElement() – endElement()
– Apache Xalan – Built-in support – Oracle XML Parser
CERN ... e–Business NewNew TechnologiesTechnologies
l InfoPath 2003 – Corporate system for electronic form handling – XML-based – Business rules defined by XML schema – Data validation using XML schemas l Adobe Intellegent Document Platform – Similar ideas
CERN e–Business ConclusionConclusion
««XMLXML isis oneone ofof thethe biggestbiggest inventionsinventions inin ITIT areaarea inin thethe lastlast fewfew yearsyears.. ThereThere isis aa lotlot ofof XMLXML applicationsapplications aroundaround thethe worldworld todaytoday,, andand thisthis amountamount willwill growgrow everyevery yearyear»»
W3C Consortium Web Site: http://www.w3c.org
Questions: [email protected]
CERN e–Business