UNIT – V

XML: Defining data for web applications, basic XML, document type definition, presenting XML, . Web Services

1.Defining data for web applications

Extensible Markup Language(XML) is used to describe data. The XML standard is a flexible way to create information formats and electronically share structured data via the public Internet, as well as via corporate networks.

XML code, a formal recommendation from the World Wide Consortium (W3C), is similar to Hypertext Markup Language (HTML). Both XML and HTML contain markup symbols to describe page or file contents. HTML code describes web page content (mainly text and graphic image) only in terms of how it is to be displayed and interacted with.

XML data is known as self-describing or self –defining, meaning that the structure of the data is embedded with the data, thus when the data arrives there is no need to pre-build the structure to store the data. It is dynamically understood within the XML. The XML format can be used by any individual or group of individuals or companies that want to share information in a consistent way. XML is actually a simpler and easier-to-use subset of the Standard Generalized Markup Language (SGML), which is the standard to create a document structure.

The basic building block of an XML document is an element, defined by tags. An element has a beginning and an ending tag. All elements in an XML document are contained in an outermost elements known as the root element. XML can also support nested elements, or elements within elements. This ability allows XML to support hierarchical structures. Element names describe the content of the element, and the structure describes the relationship between the elements.

An XML documents is considered to be “Well formed” (that is, able to be read and understood by an XML parser) if its format complies with XML specification, if it is properly marked up, and if elements are properly nested. XML also supports the ability to define attributes for elements and describe the characteristics of the elements in the beginning tag of an element.

For example, XML documents can be very simple, such as the following,

Hello World ! Have a nice day…

Applications for XML are endless. For example, computer makers might agree upon a standard or common way to describe the information about a computer product(processor speed, memory size etc.) and then describe the product information format with XML code. Such a standard way of describing data would enable a user to send an intelligent agent(a program) to each computer maker’s website ,gather data, and then make a valid comparison.

XML’s power resides in its simplicity. It can take large chunks of information and consolidate them into an XML document – meaningful pieces that provide structure and organization to the information.

2.Basic XML

In computing, Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. There are three important characteristics of XML that make it useful in a variety of systems and solutions:

• XML was designed to store and transport data. • XML was designed to be self-descriptive tags. • XML is a W3C Recommendation.

What is MarkUp ?

Markup is information added to a document that enhances its meaning in certain ways, in that it identifies the parts and how they relate to each other. More specifically, a markup language is a set of symbols that can be placed in the text of a document to demarcate and label the parts of that document.

Following example shows how XML markup looks, when embedded in a piece of text −

Hello, world!

XML Syntax:- The simple syntax rules to write an XML document. Following is a complete XML document −

Rahul

Designer World

(011) 123-4567

You can notice there are two kinds of information in the above example −

• Markup, like

• The text, or the character data, Designer World and (040) 123-4567.

The following diagram depicts the syntax rules to write different types of markup and text in an XML document.

Let us see each component of the above diagram in detail.

XML Declaration:- The XML document can optionally have an XML declaration. It is written as follows −

Where version is the XML version and encoding specifies the character encoding used in the document.

Syntax Rules for XML Declaration • The XML declaration is case sensitive and must begin with "" where "xml" is written in lower- case.

• If document contains XML declaration, then it strictly needs to be the first statement of the XML document.

• The XML declaration strictly needs be the first statement in the XML document.

• An HTTP protocol can override the value of encoding that you put in the XML declaration.

Tags and Elements:- An XML file is structured by several XML-elements, also called XML-nodes or XML-tags. The names of XML- elements are enclosed in triangular brackets < > as shown below −

Syntax Rules for Tags and Elements Element Syntax − Each XML-element needs to be closed either with start or with end elements as shown below

.... (or)

Nesting of Elements − An XML-element can contain multiple XML-elements as its children, but the children elements must not overlap. i.e., an end tag of an element must have the same name as that of the most recent unmatched start tag.

The Following example shows incorrect nested tags −

Designer World

The Following example shows correct nested tags −

Designer World

Root Element − An XML document can have only one root element. For example, following is not a correct XML document, because both the x and y elements occur at the top level without a root element −

...

...

The Following example shows a correctly formed XML document −

...

...

Case Sensitivity − The names of XML-elements are case-sensitive. That means the name of the start and the end elements need to be exactly in the same case.

For example, is different from

XML Attributes:- An attribute specifies a single property for the element, using a name/value pair. An XML-element can have one or more attributes. For example −

Designer World! Here href is the attribute name and http://www.tutorialspoint.com/ is attribute value.

Syntax Rules for XML Attributes • Attribute names in XML (unlike HTML) are case sensitive. That is, HREF and href are considered two different XML attributes.

• Same attribute cannot have two values in a syntax. The following example shows incorrect syntax because the attribute b is specified twice

....

• Attribute names are defined without quotation marks, whereas attribute values must always appear in quotation marks. Following example demonstrates incorrect xml syntax

....

In the above syntax, the attribute value is not defined in quotation marks.

3.Document Type Definition (DTD)

The XML Document Type Declaration, commonly known as DTD, is a way to describe XML language precisely. DTDs check vocabulary and validity of the structure of XML documents against grammatical rules of appropriate XML language. An XML DTD can be either specified inside the document, or it can be kept in a separate document and then liked separately.

Syntax Basic syntax of a DTD is as follows −

[

declaration1

declaration2

...... ]>

In the above syntax,

• The DTD starts with

• An element tells the parser to parse the document from the specified root element.

• DTD identifier is an identifier for the document type definition, which may be the path to a file on the system or URL to a file on the internet. If the DTD is pointing to external path, it is called External Subset.

• The square brackets [ ] enclose an optional list of entity declarations called Internal Subset. Two Types of Document Type Definition:- 1.Internal Document Type Definition and 2.External Document Type Definition.

1. Internal DTD:- A DTD is referred to as an internal DTD if elements are declared within the XML files. To refer it as internal DTD, standalone attribute in XML declaration must be set to yes. This means, the declaration works independent of an external source.

Syntax Following is the syntax of internal DTD −

where root-element is the name of root element and element-declarations is where you declare the elements.

Example Following is a simple example of internal DTD −

]>

Rahul

Designer World

(011) 123-4567

Let us go through the above code − Start Declaration − Begin the XML declaration with the following statement.

DTD − Immediately after the XML header, the document type declarationfollows, commonly referred to as the DOCTYPE −

Several elements are declared here that make up the vocabulary of the document. defines the element name to be of type "#PCDATA". Here #PCDATA means parse-able text data. End Declaration − Finally, the declaration section of the DTD is closed using a closing bracket and a closing angle bracket (]>). This effectively ends the definition, and thereafter, the XML document follows immediately.

Rules: • The document type declaration must appear at the start of the document (preceded only by the XML header) − it is not permitted anywhere else within the document.

• Similar to the DOCTYPE declaration, the element declarations must start with an exclamation mark.

• The Name in the document type declaration must match the element type of the root element.

2. External DTD:- In external DTD elements are declared outside the XML file. They are accessed by specifying the system attributes which may be either the legal .dtd file or a valid URL. To refer it as external DTD, standalone attribute in the XML declaration must be set as no. This means, declaration includes information from the external source.

Syntax Following is the syntax for external DTD −

where file-name is the file with .dtd extension.

Example The following example shows external DTD usage −

Rahul

Designer World

(011) 123-4567

The content of the DTD file address.dtd is as shown −

Note:- Types You can refer to an external DTD by using either system identifiers or public identifiers.

System Identifiers A system identifier enables you to specify the location of an external file containing DTD declarations. Syntax is as follows −

As you can see, it contains keyword SYSTEM and a URI reference pointing to the location of the document.

Public Identifiers Public identifiers provide a mechanism to locate DTD resources and is written as follows −

As you can see, it begins with keyword PUBLIC, followed by a specialized identifier. Public identifiers are used to identify an entry in a catalog. Public identifiers can follow any format, however, a commonly used format is called Formal Public Identifiers, or FPIs.

4. Presenting XML

Representing data with XML opens up new possibilities for transport and distribution. XML presentation technologies provide a modular way to deliver and display content to a variety of devices.

Here we examine some technologies for display, including CSS, XSL, Xforms, and VoiceXML.

1.CSS:- CSS is used to control document display.

➢ Cascading style sheets is an XML-supporting technology for adding style display properties such as fonts, colors, or spacing to Web documents. ➢ CSS origins may be traced to the SGML world, which used a style sheet technology called DSSSL to control the display of SGML documents. ➢ Style sheet technology is important because it lets developers separate presentation from content, which greatly enhances software's longevity.

As Figure shows, a style sheet tells a browser or other display engine how to display content.

Figure HTML or XML may be delivered to a browser with CSS, which controls how data is presented on the screen.

Each rule is made up of a selector ”typically an element name such as an HTML heading ( H1 ) or paragraph ( P ), or a user -defined XML element ( Book ) ”and the style to be applied to the selector.

The CSS specification defines numerous properties ( color , font style, point size , and so on) that may be defined for an element. Each property takes a value which describes how the selector should be presented.

2. XSL:- eXtensible Stylesheet Language XSL 1.0 is a W3C Recommendation that provides users with the ability to describe how XML data and documents are to be formatted. XSL does this by defining "formatting objects," such as footnotes, headers, or columns .

1.XSL began as an effort to provide a better CSS - The XSL Working Group split into two subgroups, one focused on trying to build a better display-oriented style sheet technology and a second group trying to define a transformation language that could be used to transform XML into a variety of target languages including HTML, other dialects of XML, or any text document, including a program.

2.XSL is based on applying rules or templates to an XML document –

➢ The patterns are similar to CSS's selectors, but the action part may create an arbitrary number of "objects." The action part of the rule is called the "template" in XSL, and a template and a pattern together are referred to as a "template rule." ➢ The result of applying all matching patterns to a document recursively is a tree of objects, which is then interpreted top-down according to the definition of each object.

Figure: Options for using CSS and XSLT with XML and HTML.

3. XForms:- Forms are widely used in all aspects of E-commerce:-

➢ XForms is an XML approach that overcomes the limitations of HTML forms. ➢ XForms is a GUI toolkit for creating user interfaces and delivering the results in XML. ➢ Because XForms separates what the form does from how it looks, XForms can work with a variety of standard or proprietary user interfaces, providing a standard set of visual controls that replaces the primitive forms controls in HTML and XHTML. ➢ Included in XForms are a variety of buttons , scrollbars, and menus integrated into a single execution model that generates XML form data as output. Figure: XForms provides a standard way to collect form data through a variety of device interfaces.

4. XHTML:- eXtensible HyperText Markup Language

➢ XTML bring HTML into Conformance with XML ➢ XHTML Modularization – It modules build a base for the future. ➢ It allows specialized markup languages to be developed.

Figure: The structure of XHTML.

5. VoiceXML:- ➢ VoiceXML uses XML text to drive voice dialogs. ➢ VoiceXML is an emerging standard for speech-enabled applications. Its XML syntax defines elements to control a sequence of interaction dialogs between a user and an implementation platform. ➢ The elements defined as part of VoiceXML control dialogs and rules for presenting information to and extracting information from an end-user using speech. ➢ For ZwiftBooks, VoiceXML opens up options for extending its service to voice through cellular networks. Figure: VoiceXML documents are used to drive voice interactions over conventional or wireless phones.

5. Document Object Model ( DOM )

"The W3C Document Object Model (DOM) is a platform and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of a document."

DOM defines the objects and properties and methods (interface) to access all XML elements. It is separated into 3 different parts / levels −

• Core DOM − standard model for any structured document

• XML DOM − standard model for XML documents

• HTML DOM − standard model for HTML documents

The HTML DOM defines a standard way for accessing and manipulating HTML documents. It presents an HTML document as a tree-structure.

The XML DOM defines a standard way for accessing and manipulating XML documents. It presents an XML document as a tree-structure.

Fig: Tree Structure of XML DOM All XML elements can be accessed through the XML DOM.

The XML DOM is:

• A standard object model for XML • A standard programming interface for XML • Platform- and language-independent • A W3C standard

In other words: The XML DOM is a standard for how to get, change, add, or delete XML elements.

Get the Value of an XML Element

This code retrieves the text value of the first element in an XML document: txt = xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue; </p><p>Loading an XML File </p><p>The XML file used in the examples below is books.xml. </p><p>This example reads "books.xml" into xmlDoc and retrieves the text value of the first <title> element in books.xml: </p><p> example reads "books.xml" into xmlDoc and retrieves the text value of the first <title> element in books.xml: </p><p>Example <!DOCTYPE html> <html> <body> <p id="demo"></p> <script> var parser, xmlDoc; var text = "<bookstore><book>" + "<title>Everyday Italian" + "Giada De Laurentiis" + "2005" + ""; parser = new DOMParser(); xmlDoc = parser.parseFromString(text,"text/xml"); document.getElementById("demo").innerHTML = xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue; Example Explained

• xmlDoc - the XML DOM object created by the parser. • getElementsByTagName("title")[0] - get the first element • childNodes[0] - the first child of the <title> element (the text node) • nodeValue - the value of the node (the text itself) </p><p>Programming Interface </p><p>The DOM models XML as a set of node objects. The nodes can be accessed with JavaScript or other programming languages. In this tutorial we use JavaScript. </p><p>The programming interface to the DOM is defined by a set standard properties and methods. </p><p>Properties are often referred to as something that is (i.e. nodename is "book"). </p><p>Methods are often referred to as something that is done (i.e. delete "book"). </p><p>XML DOM Properties </p><p>These are some typical DOM properties: </p><p>• x.nodeName - the name of x • x.nodeValue - the value of x • x.parentNode - the parent node of x • x.childNodes - the child nodes of x • x.attributes - the attributes nodes of x </p><p>Note: In the list above, x is a node object. </p><p>XML DOM Methods </p><p>• x.getElementsByTagName(name) - get all elements with a specified tag name • x.appendChild(node) - insert a child node to x • x.removeChild(node) - remove a child node from x </p><p>Note: In the list above, x is a node object. </p><p>Advantages of XML DOM:- • XML DOM is language and platform independent. </p><p>• XML DOM is traversable - Information in XML DOM is organized in a hierarchy which allows developer to navigate around the hierarchy looking for specific information. </p><p>• XML DOM is modifiable - It is dynamic in nature providing the developer a scope to add, edit, move or remove nodes at any point on the tree. </p><p>Disadvantages of XML DOM:- • It consumes more memory (if the XML structure is large) as program written once remains in memory all the time until and unless removed explicitly. </p><p>• Due to the extensive usage of memory, its operational speed, compared to SAX is slower. 6. Web Services </p><p>Definition:- </p><p>A web service is any piece of software that makes itself available over the internet and uses a standardized XML messaging system. XML is used to encode all communications to a web service. For example, a client invokes a web service by sending an XML message, then waits for a corresponding XML response. As all communication is in XML, web services are not tied to any one operating system or programming language—Java can talk with Perl; Windows applications can talk with Unix applications. </p><p>(or) </p><p>Web services are XML-based information exchange systems that use the Internet for direct application- to-application interaction. These systems can include programs, objects, messages, or documents. </p><p>How Does Web Services Work? </p><p>A web service enables communication among various applications by using open standards such as HTML, XML, WSDL, and SOAP. A web service takes the help of: </p><p>➢ XML to tag the data ➢ SOAP to transfer a message ➢ WSDL to describe the availability of service. </p><p>You can build a Java-based web service on Solaris that is accessible from your Visual Basic program that runs on Windows. </p><p>You can also use C# to build new web services on Windows that can be invoked from your web application that is based on JavaServer Pages (JSP) and runs on Linux. </p><p>Components of Web Services:- </p><p>The basic web services platform is XML + HTTP. All the standard web services work using the following components: </p><p>1. SOAP (Simple Object Access Protocol) 2. WSDL (Web Services Description Language) 3. RDF ( Resource Description Framework ) 4. RSS (Really Simple Syndication) and 5. UDDI (Universal Description, Discovery and Integration) and </p><p>1. SOAP:- The best way to communicate between applications is over HTTP, because HTTP is supported by all Internet browsers and servers. SOAP was created to accomplish this. </p><p>✓ SOAP stands for Simple Object Access Protocol ✓ SOAP is an application communication protocol ✓ SOAP is a format for sending and receiving messages ✓ SOAP is platform independent ✓ SOAP is based on XML ✓ SOAP is a W3C recommendation Example:- </p><p><?xml version="1.0"?> </p><p><soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope/" soap:encodingStyle="http://www.w3.org/2003/05/soap-encoding"> </p><p><soap:Header> ... </soap:Header> </p><p><soap:Body> ... <soap:Fault> ... </soap:Fault> </soap:Body> </p><p></soap:Envelope> </p><p>2. WSDL:- WSDL is an XML – based language for describing web services and how to access them. </p><p>✓ WSDL stands for Web Services Description Language. ✓ WSDL was developed jointly by Microsoft and IBM ✓ WSDL is an XML based protocol for information exchange in decentralized and distributed environments. ✓ WSDL is the standard format for describing a web service ✓ WSDL is an integral part of UDDI, and XML-based worldwide business registry. </p><p>Document look like this, </p><p><definitions> </p><p><types> data type definitions...... </types> </p><p><message> definition of the data being communicated.... </message> </definitions> </p><p>3. RDF:- RDF was designed to provide a common way to describe information so it can be read and understood by computer applications. </p><p>✓ RDF stands for Resource Description Framework ✓ RDF is a framework for describing resources on the web ✓ RDF is designed to be read and understood by computers ✓ RDF is not designed for being displayed to people ✓ RDF is written in XML ✓ RDF is a part of the W3C's Semantic Web Activity </p><p>Example:- <?xml version="1.0"?> </p><p><RDF> <Description about="https://www.designerworld.com/rdf"> <author>Rahul</author> <homepage>https://www.designerworld.com</homepage> </Description> </RDF> </p><p>4. RSS:- With RSS it is possible to distribute up-to-date web content from one web site to thousands of other web sites around the world. RSS allows fast browsing for news and updates. </p><p>✓ RSS stands for Really Simple Syndication ✓ RSS allows you to syndicate your site content ✓ RSS defines an easy way to share and view headlines and content ✓ RSS files can be automatically updated ✓ RSS allows personalized views for different sites ✓ RSS is written in XML </p><p>Example: </p><p><?xml version="1.0" encoding="UTF-8" ?> <rss version="2.0"> <channel> <title>Designer World https://www.designerworld.com Free web page designing

5. UDDI:- UDDI is an XML-based standard for describing, publishing and finding web services.

✓ UDDI stands for Universal Description, Discovery and Integration. ✓ UDDI is a specification for a distributed registry of web services. ✓ UDDI is platform independent, open framework. ✓ UDDI can communicate via SOAP, CORBA and Java RMI Protocol and ✓ UDDI uses WSDL to describe interfaces to web services.

Web Services Security:-

Security is critical to web services. However, neither XML-RPC nor SOAP specifications make any explicit security or authentication requirements. There are three specific security issues with web services:

1. Confidentiality 2. Authentication and 3. Network Security

Web Service Benefits:-

1. Exposing the Existing Function on the Network - Once it is exposed on the network, other applications can use the functionality of your program. 2. Interoperability - Web services allows various applications to talk to each other and share data and services among themselves. 3. Standardized Protocol - It gives the wide range of choices, reduction in the cost due to competition and increase in the quality. 4. Low Cost Communication - Web services use SOAP over HTTP protocol, so you can use your existing low- cost internet for implementing web services.

Web Services Characteristics:-

1. Loosely Coupled 2. Ability to be Synchronous (or) Asynchronous 3. Supports Remote Procedure Calls ( RPCs ) and 4. Supports Document Exchange

Difference Between HML and XML:-

HTML XML

HTML is an abbreviation for HyperText Markup XML stands for eXtensible Markup Language. Language.

HTML was designed to display data with focus on how XML was designed to be a software and hardware data looks. independent tool used to transport and store data, with focus on what data is.

HTML is a markup language itself. XML provides a framework for defining markup languages.

HTML is a presentation language. XML is neither a programming language nor a presentation language.

HTML is case insensitive. XML is case sensitive.

HTML is used for designing a web-page to be rendered XML is used basically to transport data between the on the client side. application and the database.

HTML has it own predefined tags. While what makes XML flexible is that custom tags can be defined and the tags are invented by the author of the XML document.

HTML is not strict if the user does not use the closing XML makes it mandatory for the user the close each tag tags. that has been used.

HTML does not preserve white space. XML preserves white space.

HTML is about displaying data,hence static. XML is about carrying information,hence dynamic.