Chapter 2 Creating Web Pages: XHTML
A Web page is a document, identified by a URL, that can be retrieved on the Web. Typically, a Web page is written in HTML, the Hypertext Markup Language. When a Web browser receives an HTML document, it can format and render the content for viewing, listening, or printing. The user can also follow embedded hyperlinks, or simply links, to visit other Web pages.

HTML enables you to structure and organize text, graphics, pictures, sound, video, and other media content for processing and display by browsers. HTML supports headings, paragraphs, lists, tables, links, images, forms, frames, and so on. The major part of a website is usually a set of HTML documents. Learning and understanding HTML is fundamental to Web Design and Programming.

To create HTML files you may use any standard text editor such as vi, emacs, word (MS/Windows), and SimpleText (Mac/OS). Specialized tools for creating and editing HTML pages are also widely available. After creating an HTML document and saving it in a file, you can open that file (by double-clicking the file or using the browser File>Open File menu option) and look at the page.

XHTML (Extensible Hypertext Markup Language) is a modern version of HTML that is recommended for creating new Web pages. Having evolved from version 2.0 to 4.01, HTML has been reformulated in XML (Extensible Markup Language) to become XHTML 1.0. XML-conforming documents follow strict XML syntax rules and are therefore easily manipulated by programs of all kinds, a great advantage. XHTML 1.0 is the basis for the further evolution of HTML. The HTML codes in this book follow XHTML 1.0. Unless noted otherwise, we shall use the terms HTML and XHTML interchangeably.

The basics of HTML are introduced in this chapter. Chapter 3 continues to cover more advanced aspects of HTML. The two chapters combine to provide a comprehensive and in-depth introduction to HTML.
Other aspects of HTML are described when needed in later chapters.

2.1 HTML Basics

HTML is a markup language that provides tags for you to organize information for the Web. By inserting HTML tags into a page of text and other content, you mark which part of the page is what, thereby giving structure to the document. Following the structure, user agents such as browsers can perform on-screen rendering or other processing. Thus, browsers process and present HTML documents based on the marked-up structure. The exact rendering is defined by the browser and may differ for different browsers. For example, common visual browsers such as Internet Explorer (IE) and Netscape Navigator (NN) render Web pages on screen. A browser for the blind, on the other hand, will voice the content according to its markup.

Hence, a Web page in HTML contains two parts: markup tags and content. HTML tags are always enclosed in angle brackets (< >). This way, they are easily distinguished from the contents of the page.

It is recommended that you create Web pages with XHTML 1.0, the current version of HTML. An XHTML document in English¹ has the following basic form:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>Company XYZ: home page</title>
    </head>
    <body>
    <!-- page content begin -->
    .
    <!-- page content end -->
    </body>
    </html>

¹ See Section 3.20 for Web pages in other languages.

Brooks/Cole book/January 28, 2003

The xml line specifies the version of XML and the character encoding used (Section 3.1). The DOCTYPE line actually indicates the version of HTML used, XHTML 1.0 Strict in this case, and the URL of its DTD. Next comes the html line, which indicates the default XML namespace used. An important advantage of XHTML is the ability to use tags defined in other namespaces.
These three initial lines tell browsers how to process the document. In most situations, you can use the above template verbatim for creating your HTML files. Simply place the page content between the <body> and </body> tags. Comments in HTML source begin with <!-- and end with -->.

In Chapter 1, we have seen some simple HTML code in Figure 1.6. Generally, HTML tags come in pairs, a start tag and an end tag. They work just like open and close parentheses. Add a slash (/) prefix to a start tag name to get the end tag name. A pair of start and end tags delimits an HTML element. Some tags have end tags and others don't. For browser compatibility, it is best to use the suffix " />" (a space followed by />) for any element without an end tag. For example, write the "line break" element in the form <br />.

The head element contains informational elements for the entire document. For example, the title element (always required) specifies a page title which is

1. displayed in the title bar of the browser window
2. used in making a bookmark for the page

The body element organizes the content of the document.

2.2 Creating Your First Web Page

Let's create a very simple Web page (Ex: FirstPage)² following the template from the previous section (Section 2.1). Using your favorite editor, type in the following

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>My First Web Page</title>
    </head>
    <body style="background-color: cyan">
    <p>Hello everyone!</p>
    <p>My Name is (put your name here) and today is
       (put in the date).</p>
    <p>HTML is cool.</p>
    </body>
    </html>

and save it into a file named firstpage.html. The content of body consists of three short paragraphs given by the p element. The page background color is set to cyan.
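
As a small, optional variation (not part of the original example), the fragment below sketches how the empty-element form from Section 2.1 might be tried out inside the body of firstpage.html. The wording of the text is made up for illustration only.

```xhtml
<!-- p has a start tag and an end tag; br has no end tag -->
<p>Hello everyone!<br />
This sentence starts on a new line within the same paragraph.</p>
```

Here <p> and </p> form a pair that delimits one element, while <br /> stands alone and is written with the space followed by /> suffix.
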
From your favorite browser, select the Open File option on the File menu and open the file firstpage.html. Now you should see the display of your first Web page (Figure 2.1). For more complicated Web pages, all you need is to know more HTML elements and practice how to use them.

Figure 2.1: First Web Page

² Examples available online are labeled like this for easy cross-reference.

2.3 Elements and Entities

HTML provides over 90 different elements. Generally, they fall into these categories:

Top-level elements: html, head, and body.

Head elements: elements placed inside head, including title (page title), style (rendering style), link (related documents), meta (data about the document), base (URL of document), and script (client-side scripting).

Block-level elements: elements behaving like paragraphs, including h1-h6 (headings), p (paragraph), pre (pre-formatted text), div (designated block), ul, ol, dl (lists), table (tabulation), and form (user input forms). When displayed, a block-level (or simply block) element always starts a new line and any element immediately after the block element will also begin on a new line.

Inline elements: elements behaving like words, characters, or phrases within a block, including a (anchor or hyperlink), br (line break), img (picture or graphics), em (emphasis), strong (strong emphasis), sub (subscript), sup (superscript), code (computer code), var (variable name), kbd (text for user input), samp (sample output), span (designated inline scope).

When an element is placed inside another, the containing element is the parent and the contained element is the child.

Comments in an HTML page are given as <!-- a sample comment -->. Text and HTML elements inside a comment tag are ignored by browsers. Be sure not to put two consecutive dashes (--) inside a comment.
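
The categories above can be seen together in a short sketch. The body below is a hypothetical illustration, not one of the book's examples: the text, the list items, and the file name catalog.html are assumptions made up for this purpose, and li (list item) is the standard child element of ul.

```xhtml
<body>
<!-- h1 and p are block elements, each begins on a new line -->
<h1>Company XYZ</h1>
<!-- em, strong, and a are inline children of the parent p -->
<p>Our <em>new</em> catalog is <strong>online</strong>;
   see the <a href="catalog.html">product list</a>.</p>
<!-- ul is the parent of its li children -->
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>
</body>
```

The inline elements flow within the paragraph like words, while the heading, the paragraph, and the list each occupy their own block. None of the comments contains two consecutive dashes.
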
It is good practice to include comments in HTML pages as notes, reminders, or documentation to make maintenance easier.

In an HTML document certain characters, such as < and &, are used for markup and must be escaped to appear literally. Other characters you may need are not available on the keyboard. HTML provides entities (escape sequences) to introduce such characters into a Web page. For example, the entity &lt; gives < and &divide; gives ÷. Section 3.2 describes characters and entities in more detail.

2.4 A Brief History of HTML

In 1989, Tim Berners-Lee at the European Organization for Nuclear Research (CERN) defined a very simple version of HTML based on SGML, the Standard Generalized Markup Language, as part of his effort to create a network-based system to share documents via text-only browsers. The simplicity of HTML makes it easy to learn and publish. It caught on. In 1992-93, a group at NCSA (National Center for Supercomputing Applications, USA) developed the Mosaic visual/graphical browser. Mosaic added support for images, nested lists, as well as forms and fueled the explosive growth of the Web. Several people from the Mosaic project later, in 1994, helped start Netscape. At the same time, the W3 Consortium (W3C) was formed and housed at MIT as an industry-supported organization for the standardization and development of the Web.

The first common standard for HTML is HTML 3.2 (1997). HTML 4.01 became a W3C recommendation in December 1999.