5. Writing for the web: HTML and XHTML

Dr. Dave Parker

Information and the Web, 2014/15

1 Today

• XML/DTDs (and Assm. 1) – brief recap/questions

• Writing for the web – XHTML vs HTML; XHTML basics

2 XML/DTDs (and Assm. 1)

• XML/DTDs – general tips
  – use a clear and sensible document structure
  – follow good principles
    • don't mix information, avoid redundancy, …
  – think about:
    • normalisation, attributes vs. child elements
  – think ahead to larger, more complex data sets

• Assessment 1
  – don't forget to validate (XML and DTD)
  – look at the exercises (and their solutions)

3 Today

• Writing for the web – XHTML vs HTML; XHTML basics

• Good practices:
  – structured documents: semantics, styling
  – content vs. presentation
  – adding metadata

4 HTML & XHTML

• HTML: Hypertext Markup Language
  – basic language for writing web pages/documents
  – World Wide Web Consortium (W3C) standard
  – various versions: 1.1 (early 90s), …, 4.01 (1999), 5 (latest)

• XHTML: Extensible HTML
  – reformulation of HTML 4 in XML (i.e. same elements, attributes, …)
  – cleaner, stricter syntax + additional rules
  – goals: extensibility, interoperability with other formats

5 Example: HTML

A Simple Example

Some Simple HTML

This page contains some examples of elements that are commonly used in HTML documents, including:

  • Headings
  • Paragraphs
  • Unnumbered lists
  • Bold Text

6 Example: HTML

• Does this display correctly in a browser? Yes
• Is it XML? Is it XHTML? No: 8 (types of) errors

8 HTML vs XHTML

• Syntactic differences
  – XHTML needs all elements to be closed (even empty ones)
  – XHTML needs quotation marks around attribute values
  – XHTML is case-sensitive (e.g. start/end tags must match)
  – XHTML requires lower-case element/attribute names
  – XHTML requires: <html>, <head>, <body>, DOCTYPE
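The first three rules above are plain XML well-formedness rules, so an XML parser can check them mechanically. The sketch below is one way to see this, using Python's standard `xml.etree.ElementTree` module. The fragments and the helper name `is_well_formed` are made up for illustration and are not taken from the lecture's example page.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments that a lenient HTML browser would still display.
html_style = {
    "unclosed empty element": "<p>First<br>Second</p>",
    "unquoted attribute value": "<p class=intro>Text</p>",
    "start/end tags differ in case": "<P>Text</p>",
}

# The same fragments rewritten to follow the XHTML rules.
xhtml_style = {
    "closed empty element": "<p>First<br />Second</p>",
    "quoted attribute value": '<p class="intro">Text</p>',
    "matching lower-case tags": "<p>Text</p>",
}

for label, fragment in {**html_style, **xhtml_style}.items():
    print(f"{label}: well-formed = {is_well_formed(fragment)}")
```

A well-formedness check covers only part of the list. `<P>Text</P>` is well-formed XML, but it is still not valid XHTML, because the lower-case and required-element rules come from the XHTML DTD and need a validating parser.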

9 HTML vs XHTML

• Which is best?

• Consistency between (many) browsers?
  – XHTML has cleaner, stricter syntax
  – but until recently, not fully supported in some browsers
• Easier to parse?
  – web browsers are very lenient anyway
  – but also: search engine crawlers, screen readers, …
• Easier for extending/embedding?
  – e.g. MathML, SVG (vector graphics)
  – e.g. RDFa (semantic web)

10 HTML 5

• Official next-generation HTML standard
  – currently a W3C Candidate Recommendation
  – i.e. not yet official, and no browser fully supports it
• Aims:
  – improve support for multimedia
  – complex web applications (without Flash, etc.)
  – cross-platform mobile applications
  – single markup language writable in both HTML/XHTML
    • combines both HTML (v.4) and XHTML (v.1)
• Notable syntactic additions
  – …

11 XHTML: Basic structure

<?xml ... ?>
<!DOCTYPE ... >
<html ... xml:lang="en">
  <head> ... </head>
  <body> ... </body>
</html>

• XML declaration (<?xml ... ?>) is optional
  – can cause some problems with old browsers
• DOCTYPE specification
  – 3 types for (X)HTML: Strict, Transitional, Frameset
  – we will use Strict (e.g. no presentation markup)

12 XHTML: Basic structure

• Root element: <html>
  – namespace (needed), language (optional)
• Two child elements (both needed)
  – <head>: page info (not shown in page)
  – <body>: main content of page
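To make the structure concrete, the sketch below parses a small XHTML page with Python's standard XML parser and inspects the root element and its two children. The page itself is hypothetical: the title and paragraph text are invented. The DOCTYPE and namespace URI are the standard XHTML 1.0 Strict ones, not copied from the slide.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace behind the xml: prefix

# Hypothetical minimal page; title and paragraph text are made up.
PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>A minimal page</title>
  </head>
  <body>
    <p>Hello, web.</p>
  </body>
</html>
"""

# Parse from bytes so the encoding named in the XML declaration is honoured.
root = ET.fromstring(PAGE.encode("utf-8"))

# ElementTree reports namespaced names as "{namespace}local-name".
children = [child.tag for child in root]
language = root.get(f"{{{XML_NS}}}lang")

print(root.tag)   # root element, qualified by the XHTML namespace
print(children)   # the two required children, in document order
print(language)   # value of the optional xml:lang attribute
```

The parser only confirms that the page is well-formed XML. It does not fetch the DTD named in the DOCTYPE, so checking the page against the Strict rules still needs a validator.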

13 <head>

• Typical contents
  – title, metadata, external file links, JavaScript
• Example (<title> is required)

<head>
  <title>BBC News - Home</title>
  ... css" />
</head>

14 XHTML: Main elements

• 2 types of elements: block-level, inline
• Block-level elements (new line before/after)
  – headings (e.g. <h1>), paragraphs (<p>), horizontal rule (<hr />), lists (<ul>, <ol>, <li>), pre-formatted (<pre>), tables (<table>), …
  – also: <div> (structure, semantics, layout, style)
• Inline elements (within text)
  – formatting/semantics (<em>, <strong>, …), links (<a>), images (<img />), table cells (<td>), line breaks (<br />), code (<code>), …
  – also: <span> (semantics, style)
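The block-level/inline split can be illustrated by walking a small fragment and labelling each element. This is only a sketch: the two sets below contain a handful of common element names and are not exhaustive, and the fragment and the function name `classify` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Illustrative, non-exhaustive sets of element names.
BLOCK = {"h1", "p", "hr", "ul", "ol", "li", "pre", "table", "div"}
INLINE = {"em", "strong", "a", "img", "br", "code", "span"}

# Hypothetical fragment mixing block-level and inline elements.
FRAGMENT = """<div>
  <h1>News</h1>
  <p>Read the <a href="story.html">full <em>story</em></a>.<br />More soon.</p>
  <ul><li>First item</li></ul>
</div>"""


def classify(markup):
    """Return (tag, kind) pairs for every element, in document order."""
    result = []
    for element in ET.fromstring(markup).iter():
        if element.tag in BLOCK:
            kind = "block"
        elif element.tag in INLINE:
            kind = "inline"
        else:
            kind = "unknown"
        result.append((element.tag, kind))
    return result


for tag, kind in classify(FRAGMENT):
    print(f"{tag}: {kind}")
```

In the fragment, the inline elements (`a`, `em`, `br`) all sit inside a block-level `p`, which is the usual nesting: block-level elements give the page its structure and inline elements mark up runs of text within them.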

15 XHTML continued…

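• Core XHTML element attributes
  – id, class, style, title, xml:lang, dir
• Comments in XHTML
  – <!-- comment text -->
• Whitespace (spaces, tabs, newlines): ignored, mostly
• Formatting/presentation
  – no: some text (wrapped in presentational markup)
  – yes: text (in an inline element, styled separately)
  – yes: text (in a block-level element, styled separately)
• Tables
  – for tabulated data, not for website layout

16 Summary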

• Writing for the web
  – XHTML vs HTML
  – XHTML basics

• Good practices:
  – adding metadata
  – structured documents: semantics, styling
  – content vs. presentation

• Next time
  – style sheets (CSS)
