Quick viewing(Text Mode)

Practical Use Of

Practical Use Of

CERN – European Organization for Nuclear Research IT Department – e–Business Section

PracticalPractical UseUse ofof XMLXML

Rostislav Titov IT-AIS-EB (e-Business) Section CERN – Geneva, Switzerland XMLXML

eeXXtensibletensibleMMarkuparkup LLanguageanguage

l SGML (ISO standard, 1986) Mainly for technical documentation l XML (W3C recommendation, 1998) Simplification and enhancement of SGML, wide area of use

CERN e–Business WhyWhy Markup?Markup?

MarkupMarkup allowsallows toto addadd Introduction???????? informationinformation about about data data

Text?????
structurestructure
Markup????????
More???????????? document?? markup ?????? ? ????????
Reserved??????????????? attributes??
????????
? Processing????????? instructions ?? ?????????

CERN e–Business XMLXML:: RulesRules

l Header l One root element Rostislav l Tag hierarchy Titov l Attributes l Text elements XML (Extensible Markup Language) is … l Empty elements

Some rules l Element names are case-sensitive l Every opening tag should have a closing tag l Tags cannot intersect () l Attributes values – in quotes or apostrophes CERN e–Business XMLXML:: DataData TransferTransfer

l Platform and language independent

l Easy to write, easy to process

l Understandable for humans and computers

l Open standard

– Many libraries – Lots of literature available

– Specialized XML-editors

l Possibility to check the document structure

CERN e–Business XMLXML:: DataData TransferTransfer (2)(2)

Example: EDH Transport Request

XML

External EDH Program

l Automatic form generation from external programs

l XML as data transfer format

l Schema checkup as a warranty of data consistency CERN e–Business WebWeb ServicesServices

l Data transfer between programs on Internet l Open Standard l Platform and language independent (, .Net, …)

Web XML service

SOAPSOAP XML

WSDLWSDL WSDL – Web Service Definition Language SOAP – Simple Object Access Protocol

CERN e–Business XMLXML:: DataData StorageStorage

l Data structure is kept together with the data

l Object “addendum” to relational RDBMS

l Structure checkup

l Supported by many modern RDBMS

SQL Server 2005, Oracle 9i +,

– XML Data Type

– XML indexes

– XML Queries (XQuery etc.)

– Data output in XML format

CERN e–Business XMLXML:: DataData StorageStorage (2)(2)

Example: EDH Search System Problem: Effective search using arbitrary number of criteria is problematic

Our solution:

l All documents are stored in XML

l Context-specific XML search (Oracle InterMedia) Example: «Find documents created by Slava»:

Select DOC_ID from DOC_XML where Contains(XML, “Slava within creator”) > 0;

CERN e–Business XMLXML:: DataData TransformationsTransformations

l XML can be transformed into HTML, text, PDF, ...

– No need for special program solutions

– Commercial visual editors exist

– Platform independent

CERN e–Business XMLXML--basedbased StandardsStandards

l Possibility to formally define the structure

l Platform and language independent

l Understandable for humans and computers

l Possibility to use XML technologies (XSLT transformations, XQuery queries)…

– WSDL (Web Services Definition Language) – SOAP (Simple Object Access Protocol) – XHTML (HTML that complies to XML rules) – SVG () – ebXML (XML for e-Business) – … CERN e–Business FormalFormal StructureStructure DefinitionDefinition

l There are ways to define XML structure formally

ObsoleteObsolete!! NotNot for for new new developmentdevelopment • DTD (Document Type Definition)

• XML Schema

CERN e–Business XMLXML SchemaSchema:: PossibilitiesPossibilities

l Check element presence and their order l Sequences and choices l Number of repetitions for elements and groups l Attributes and their presence l Type of elements and attributes l Restrictions for elements and attributes l Default values l Unique constraints l ...

CERN e–Business XMLXML--schemaschema:: whenwhen itit isis neededneeded??

l Formal structure definition for future reference

l may rely on data consistence

l Authors may check XML validness in advance

CERN e–Business XMLXML--schemaschema:: whenwhen NOTNOT neededneeded??

l When we know in advance that XML is valid

l When we do not care about document validness

l When maximum processing speed is required

l Small “throw away” projects

CERN e–Business XPathXPath:: XMLXML NavigationNavigation

l Access to XML elements

l Result of an XPATH-expression can be:

l XML l String l Node Set l Number l Boolean l Empty Set

C:\presentation\author\firstname /presentation/author/firstname

CERN e–Business XXPathPath:: ExamplesExamples

R. Aymar l Find the DG’s name W-D. Schlatter /cern/dg/person/text() W. von Rueden l Find all departments R. Martens /cern/department/@name D. Myers l Find all people A. Pace //person l Find the name of DH of IT /cern/department[@name=“IT”]/dh/person/text() l Find how many groups has a department where R. Martens works count(//gl/person[starts-with(., 'R. Martens')]/../../../group)

CERN e–Business XPathXPath:: ExamplesExamples ((88))

Example: Event Handling System

Handling System Events Subscriptions

XML XML XPath XPath Check events against XPath XML

Notifications

«I want to see all documents for more than 600 CHF»

/ document [amount > 600]

CERN e–Business XPathXPath:: ProgramProgram UseUse

XPath

System.out.println(((XMLDocument)).selectSingleNode( "/config/report[@name='Slava']/title/text()").getNodeValue());

DOM Model

Element root = xml.getDocumentElement(); Node child; for (child = root.getFirstChild(); child != null; child = child.getNextSibling()) if (child.getNodeName().equals("report") && ( (Element)child ).getAttribute ("name").equals("Slava")) break; for (child = ((Element)child). getFirstChild(); child != null; child = child.getNextSibling()) { if (child.getNodeName().equals("title") ) { for (Node child2 = child.getFirstChild(); child2 != null; child2 = child2. getNextSibling()) if ( child2 instanceof Text ) System.out.println(( (Text)child2 ).getData ().trim()); } CERN } e–Business XQuerXQueryy ––XMLXML QueryQuery LanguageLanguage

l XQuery is SQL for XML

independent

– Easy to use

l Supported by popular RDBMS (Microsoft SQL Server 2005, Oracle 9i and10g)

l Based on XPath, supports document sets

CERN e–Business XSLT:XSLT: XMLXML TransformationsTransformations

l Transforms XML to HTML, text or other XML l XSLT 1.0 (Current), XSLT 2.0 (Draft) l XSLT is a “Human Interface” to XML l Supported by Web Browsers

XSLT

CERN e–Business XSLT:XSLT: SimplifiedSimplified StructureStructure

l XSLT is an XML file l Active usage of XPath expressions

:stylesheet Evaluate XPath and print value xsl:value-of … xsl:template xsl:value-of

… …

xsl:apply-templates xsl:template … Apply templates Apply a template to other elements to the given element CERN e–Business XSLT:XSLT: PossibilitiesPossibilities

l Conditions () l Loops () l Variables () l Sorting () l Numbering [1., 1.1., 1.1.?, 2.,] () l Number formatting (format-number()) l Multiple step processing (mode) l String manipulations (via XPath)

XSLT 2.0 (Draft) l XPath 2.0 l Custom functions l Regular expressions l Date and time formatting l Groupings CERN e–Business XSLT:XSLT: ExampleExample

Author:

Table of Contents



Chapter .



.
CERN e–Business XSLTXSLT:: WebWeb “Skins”“Skins”

Person Search

Full Name
Maksym TITOV 71169 40-3-C08 Oleg TITOV EXT4

CERN e–Business XSLTXSLT:: WebWeb “Skins”“Skins” -- 22

XSLT

CERN e–Business XSLTXSLT:: UserUser InterfacesInterfaces

CERN Stores Catalog

l Data loaded through XML

l Data stored in XML

l XSLT for data output

l 150000 items

l +10000 users

l ~15-20K XML for each page

l Custom formatting (through XSLT redefinition)

CERN e–Business XSLT:XSLT: XMLXML toto TextText

Example: l Automatic code generation

XML-description Program Interface … Interface

Did you know… BusinessBusiness Logic Logic that 1 EDH document is: ... l At least 20 source files (code, HTML templates, resources, SQL, …) SQLSQL

l About 250K of source code

CERN e–Business XSLT:XSLT: XMLXML toto XMLXML

l Generate XML from another XML source

l “Configuration files update”

l XSL:FO

CERN e–Business XSLXSL--FO:FO: FormattingFormatting ObjectsObjects

l FO: XML-description of document layout

l XSL-FO: XSLT transformation of XML document to FO document

l FO Processor: program that converts the FO definition into a printable format (PDF, PS, ...)

XML FO PDF Document Document Document

XSL:FO FO Transformation <fo:page-sequence> Processor <title> <fo:flow> XXX <fo:flow> XXX ... ...

CERN e–Business XSLXSL--FO:FO: FormattingFormatting ObjectsObjects

FO has all capabilities of modern text editors:

l Fonts l l Headers and footers l Page numbering l Odd/even page distinction l Margins and intervals l Keep paragraphs together FO Processor: Apache FOP l Hangout lines l Tables l Graphics l …

CERN e–Business XSLXSL--FO:FO: ExampleExample

Web Interface e-MAPS

XSLT XMLXML

XSL:FO

Printable Version FOP Processor

l No extra code required l RTF to XSL:FO converters are good l Can be written by a student l Output format independent

CERN e–Business XMLXML EditorsEditors

l Specially designed for XML editing l XML well-formedness and validity check l DTD and Schema visual editing XMLSpy 2005 l XML generation accordingly to DTD/Schema l Creation and debugging of XSLT and XSL:FO l Visual XSLT editing

Example: Altova XML Spy (www..com) - Available from NICE - License can be obtained from the SDT service CERN e–Business XMLXML:: ProgramProgram HandlingHandling

l DOM ()

building SAXSAX --muchmuch faster faster, , l SAX DOMDOM – –moremore versatile versatile – Event handling – startElement() – endElement()

Java, ++: , .Net:

– Built-in support – Oracle XML Parser

CERN ... e–Business NewNew TechnologiesTechnologies

l InfoPath 2003 – Corporate system for electronic form handling – XML-based – Business rules defined by XML schema – Data validation using XML schemas l Adobe Intellegent Document Platform – Similar ideas

CERN e–Business ConclusionConclusion

««XMLXML isis oneone ofof thethe biggestbiggest inventionsinventions inin ITIT areaarea inin thethe lastlast fewfew yearsyears.. ThereThere isis aa lotlot ofof XMLXML applicationsapplications aroundaround thethe worldworld todaytoday,, andand thisthis amountamount willwill growgrow everyevery yearyear»»

W3C Consortium Web Site: http://www.w3c.org

Questions: [email protected]

CERN e–Business