Hidden Meaning
Total Page:16
File Type:pdf, Size:1020Kb
FEATURES Microdata and Microformats Kit Sen Chin, 123RF.com Chin, Sen Kit Working with microformats and microdata Hidden Meaning Programs aren’t as smart as humans when it comes to interpreting the meaning of web information. If you want to maximize your search rank, you might want to dress up your HTML documents with microformats and microdata. By Andreas Möller TML lets you mark up sections formats and microdata into your own source code for the website shown in of text as headings, body text, programs. Figure 1 – an HTML5 document with a hyperlinks, and other web page business card. The Heading text block is H elements. However, these defi- Microformats marked up with the element h1. The text nitions have nothing to do with the HTML was originally designed for hu- for the business card is surrounded by meaning of the data: Does the text refer mans to read, but with the explosive the div container element, and <br/> in- to a person, an organization, a product, growth of the web, programs such as troduces a line break. or an event? Microformats [1] and their search engines also process HTML data. It is easy for a human reader to see successor, microdata [2] make the mean- What do programs that read HTML data that the data shown in Figure 1 repre- ing a bit more clear by pointing to busi- typically find? Listing 1 shows the HTML sents a business card – even if the text is ness cards, product descriptions, offers, and event data in a machine-readable LISTING 1: HTML5 Document with Business Card way. 01 <!DOCTYPE html> In this article, I describe some impor- 02 <html lang="es"> tant microformats, such as hCard, hCal, 03 <head></head> hProduct, hReview, and Geo. You’ll also 04 <body> get an introduction to microdata, and 05 <h1>Pedro Miguel Díaz ‑ el Arte Guitarra Flamenca</h1> you’ll learn about some open source 06 <div> tools for integrating awareness of micro- 07 <img alt="" src="images/el-fumador.jpg"/> 08 <a href="http://el‑fumador.info">Pedro Miguel Díaz</a> ‑ el Fumador<br/> AUTHOR 09 Flamenco Artist<br/> Andreas Möller, Dipl. Phys. [http:// 10 C/ Peral 43<br/> pamoller.com], has focused on 11 E-41002 Sevilla<br/> developing Internet-based software for 12 España<br/> more than 10 years. His development 13 [email protected]<br/> work includes database applications, 14 +34 954 88 24 37 web applications, and software in the 15 </div> field of single-source publishing. He currently works as a freelance consultant 16 </body> and author. 17 </html> 62 APRIL 2012 ISSUE 137 LINUX-MAGAZINE.COM | LINUXPROMAGAZINE.COM 062-068_Semantic.indd 62 2/14/12 11:58:06 AM FEATURES Microdata and Microformats in a different language, but a computer program would not know the difference between this data and any other data en- closed in HTML tags. Microformats pro- vide a means of showing the purpose of the data. Listing 2 shows the microfor- mat markup for a typical business card. As you can see, microformats mainly use the class attribute from the HTML vo- cabulary. In this case, the hCard micro- format describes a business card. In Listing 2, every property is enclosed by the span element and described by the universal class attribute. For example, class="vcard" in line 6 refers to a busi- ness card object. The class="fn url" as- signment describes the contents of the Figure 1: A website with the most important data for a musician – hosting service providers hyperlink, Pedro Miguel Díaz, as a name also refer to this as a web business card. (fn, “Full name”) and the URL in the href attribute as the matching website LISTING 3: Event Markup with hCalendar (url). 01 <div class="vevent"> The address object class="adr" in line 02 <h3 class="summary">Feria de Abril</h3> 11 comprises multiple properties. It con- 03 <ab br class="dtstart" title="2012-03-23T21:00:00Z">el 23 de marzo de 2012, tains the type, street‑address, 21:00</abbr> postal‑code, locality, and country of the 04 en <span class="location">Sevilla</span> address. The business card’s markup fol- 05 <div class="geo"> lows the hCard [3] microformat, which 06 <ab br class="latitude" title="37.3844">37° 38′ N</abbr> implements the Internet Mail Consor- 07 <abb r class="longitude" title="‑5.9888">5° 98′ W</abbr> tium’s vCard specification [4] in HTML. 08 </div> For each line of text, vCard stores the 09 </div> key/ value pair and additional options. The hCard microformat translates this Listing 3 shows how to use the hCal- Besides grammar rules, microformats vocabulary into HTML by translating the endar microformat, which was derived are characterized by their reusability. keys to values of the class attribute. from iCalendar [6]. The property/ value Listing 3 uses the Geo microformat to Microformats exist for several other pair class="vevent" describes the event, specify the location for the event. The uses in addition to business cards. For which must contain a summary and a start format contains properties for the lati‑ example, you can describe events using (dtstart). In contrast, the location infor- tude and longitude. Again, the machine- hCalendar, geographical information mation is optional. readable variant is given in the title with Geo, and products and product re- The following rules apply for microfor- property of the abbr element. views with hReview and hReview-aggre- mats: if you want to provide a date in a Listing 4 provides another example of gate. For an overview of microformats machine-readable format, it has to be the re-usability of microformats. Line 1 and versions, check out the wiki at Mi- given in the title property of the abbr el- marks of the summary of a product re- croformats.org [5]. ement (line 3). view using the hReview-aggregate micro- LISTING 2: Business Card with hCard Markup 01 <!DOCTYPE html> 13 <span class="street-address">C/ Peral 43</span><br/> 02 <html> 14 <span class="postal-code">E-41002</span> 03 <he ad><link rel="profile" 15 <span class="locality">Sevilla</span><br/> href="http://microformats.org/profile/hcard"/></head> 16 <span class="country-name">España</span> 04 <body> 17 </div> 05 <h1>Pedro Miguel Díaz ‑ el Arte Guitarra Flamenca</h1> 18 <span class="email">[email protected]</span><br/> 06 <div id="me" class="vcard"> 19 <div class="tel"> 07 <img alt="" class="photo" src="images/el-fumador.jpg"/> 20 <span class="type" style="display:None">Home</span> 08 <a class="url fn" href="http://el-fumador.info">Pedro Miguel Díaz</a> 21 <span class="value">+34 954 88 24 37</span> 09 - <span class="nickname">el Fumador</span><br/> 22 </div> 10 <span class="title">Flamenco Artist</span> 23 </div> 11 <div class="adr"> 24 </body> 12 <span style="display:None" class="type">Home</span> 25 </html> LINUX-MAGAZINE.COM | LINUXPROMAGAZINE.COM ISSUE 137 APRIL 2012 63 062-068_Semantic.indd 63 2/14/12 11:58:08 AM FEATURES Microdata and Microformats LISTING 4: Product Markup with hProduct HTML versions. group properties beyond container logic. The time element To allow this to happen, itemref uses the 01 <li class="hreview-aggregate"> is used to mark up universal ID attribute to point to other 02 <span class="item"> dates and times; elements. 03 <span class="hproduct"> the meter element 04 <a class="fn url" href="www.reseller. signifies measured com?isbn=9-874-444-333-85-1">El Arte Flamenco</a> Rich Snippet Tool values. To make The significance of microformats doesn’t 05 <span class="price">33.90€</span> this semantic data become apparent until you actually view 06 <span class="identifier"> more understand- the data prepared with them. For exam- 07 <span class="type">ISBN</span> able for programs, ple, you can use microformats to opti- 08 <span class="value">9-874-444-333-85-1</span> HTML5 adapts the mize the results obtained by search en- 09 </span> microformat con- gines. Figure 2 shows the effect of the 10 </span> cept into the form hCard microformats on the Google 11 </span> known as micro- search engine using the Rich Snippet 12 <ab br class="rating" title="4.9"><sup>☆☆ data. The HTML5 Tool [10]. The Rich Snippet Tool picks up ☆☆☆</sup></abbr> microdata system the microformats from the page and 13 <span class="votes">200</span> defines a set of shows which information the search en- 14 </li> properties – item‑ gine evaluates. For the website shown in scope, itemtype, Figure 1, the search engine picks up format. hProduct is used to state the itemprop, and itemref – that provides the Pedro Miguel’s location (Sevilla) and the product as an item. The rating refers to function of the class attribute in micro- job designation (Flamenco Artist). Re- the overall score, and votes to the num- formats. views based on hReview or hReview-ag- ber of reviews. Listing 5 shows the business card data gregate are also highlighted in the search Microformats use the rel attribute in from Listing 2 as HTML5 microdata. In- results. The grades from 1 through 5 are hyperlinks to express the relationship to stead of a class property, microdata uses encoded graphically as the number of the target resource; for example, the link: the itemprop attribute to express proper- stars. ties. The itemscope attribute is used to It is difficult to estimate how wide- <a rel="met friend colleague" U delimit units, as in lines 10 through 17. spread the more recent microdata has href="http://esoleares.es">U The itemtype attribute (line 5) desig- become compared with microformats; Estrella Soleares - la Pinta</a> nates the vocabu- lary for the section refers to the homepage belonging to Es- it introduces.