XSL Formatting Objects (XSL-FO), Part 1: Get the Basics of XSL-FO Techniques Convert HTML Documents Into Formatting Objects and Then Into PDF Files

Skill Level: Introductory

Doug Tidwell
developerWorks Cyber Evangelist
IBM

04 Feb 2003

© Copyright IBM Corporation 1994, 2008. All rights reserved. developerWorks®, ibm.com/developerWorks

Through examples and illustrations, this tutorial for developers teaches the basics of working with XSL Formatting Objects (XSL-FO), a powerful, flexible XML vocabulary for formatting data that is often used with XSLT to convert XML and HTML documents to PDF (Portable Document Format). This tutorial, Part 1 of a two-part series, shows how to use XSLT to convert XML documents into formatting objects and then how to use the Apache XML Project's FOP (Formatting Objects to PDF) tool to convert those formatting objects into PDF files. The examples include many XSL-FO samples, XSLT templates, and some Java commands for the processing.

Section 1. Tutorial introduction and preparation

What this tutorial covers

The XSL Formatting Objects specification, an official recommendation of the W3C that is commonly known as XSL-FO, defines a number of XML tags that describe how something should be rendered. Although XSL-FO contains elements that describe how to render text in nonprint formats such as spoken text, this tutorial focuses on creating Portable Document Format (PDF) files -- the most common use of XSL-FO.

The tutorial provides a brief overview of the XSL-FO document structure, as well as the elements that define page sizes, fonts, and margins. It also explains the basics of text and graphic formatting and demonstrates the fundamentals of converting a formatting object file to PDF. Downloadable code samples make it easy to experiment on your own.

When you've completed this introductory tutorial, you'll understand what XSL-FO is and how it works.
You'll be able to adapt the basic samples provided to create simple FO documents of your own. You'll be ready to move on to the second tutorial in this series to find out how to control text formatting in detail and how to convert HTML elements to formatting objects. Then you will be able to create your own XML applications that use formatting objects to generate high-quality printable documents.

What you need to know to benefit from this tutorial

This tutorial assumes that you already understand the Extensible Markup Language (XML) and how to work with it and its related technologies, such as Extensible Stylesheet Language Transformations (XSLT). You don't need to know anything about XSL-FO yet, but to work with formatting objects, you need a little experience working with XSLT. The tools used for the examples are written in the Java language, but you don't have to understand Java to use them.

What you need to know about the software and standards

Figure 1. FOP project logo

Although you can use other XSL-FO rendering engines, this tutorial is written for the Apache XML Project's FOP (Formatting Objects to PDF) translator. The examples in this tutorial work with FOP Version 0.20.4, which was released on July 5, 2002. If you try them with other versions of FOP, they may or may not work. The XSL-FO spec became an official recommendation of the W3C on 15 October 2001; the FOP tool supports most of the final spec.

We use the FOP tool at developerWorks for two reasons:

• It's written in the Java language, so it runs on all the platforms that we care about.
• It's a no-cost, open-source product, so anyone can afford it.

If you want to immerse yourself in XSL-FO, you can go directly to the source for the spec at the W3C's site (see Resources). Be aware that this is one of the longest documents at the W3C (roughly 400 pages), although most of it is reference information for the many elements and attributes in the XSL-FO tag set.
The reference sections -- particularly appendixes B, C, and D -- are very useful for looking up property names and values. Remember, as of this writing, FOP does not completely support the XSL-FO spec. Certain property names and values defined by the spec might not be supported by the tool, or they might be supported with slightly different names and values.

What tools you'll need for the tutorial and how to configure them

To go through the exercises in this tutorial, you'll need a Java Development Kit (JDK) Version 1.3 or later, as well as the FOP package from the Apache XML Project. Download the latest version of the FOP package and unzip it.

Once you have the JDK and FOP installed, you need to set the classpath. If you want to follow the examples in the tutorial without remembering to adapt them, put the FOP package at c:\fop-0.20.4rc and then set the classpath like this (except all on one line, of course; the line is broken here only to fit within the text column):

    set classpath=.;c:\fop-0.20.4rc\build\fop.jar;
        c:\fop-0.20.4rc\lib\avalon-framework-cvs-20020315.jar;
        c:\fop-0.20.4rc\lib\batik.jar;
        c:\fop-0.20.4rc\lib\xalan-2.3.1.jar;
        c:\fop-0.20.4rc\lib\xercesImpl-2.0.1.jar;
        c:\fop-0.20.4rc\lib\xml-apis.jar;

If you unzip the FOP package somewhere else, you'll need to change the command accordingly. If you're running Linux, use the command

    export CLASSPATH=/usr/bin/fop-0.20.4rc/build/fop.jar:/usr/bin/fop-...

and so on. Note that on Linux the variable name is case-sensitive, so it must be written as CLASSPATH, and the entries are separated by colons rather than semicolons.

Section 2.
XSL-FO document function and structure

XSL-FO document overview

An XSL-FO document defines several things that are important when producing high-quality printable documents:

• Information about the physical size of the page (letter, A4, and so on)
• Information about margins (top, left, bottom, and right), running headers and footers, and other properties of the page
• Information about fonts, font sizes, colors, and other characteristics of the text
• The actual text to be printed, marked up with elements that describe paragraphs, highlighting, tables, and similar things

This section of the tutorial covers the paper size, margins, and other page properties, along with the XSL-FO elements used to describe them. Before going on to the actual elements, let's look at the process used to convert an XML document into a PDF file.

Converting XML documents to PDF files

Converting an XML document to a PDF file takes two basic steps:

1. Use an XSLT stylesheet to transform the XML document into a file of XSL-FO elements. To perform the transformation, you invoke the XSLT processor with the XML document and the stylesheet. (Part 2 of this tutorial includes an XSLT stylesheet that converts XHTML elements into formatting objects.)
2. Use a rendering engine (for example, FOP, which is used in the tutorial examples) to convert the XSL-FO elements into a PDF file. This part is even simpler: You invoke the FOP tool, giving it the name of the XSL-FO file and the name of the PDF file.

Here's a picture that outlines the process:

Figure 2. FOP process diagram

XSL-FO document structure at a glance

This picture illustrates how an XSL-FO document is structured:

Figure 3. Structure of an XSL-FO document
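Before turning to that structure in detail, here is a minimal sketch of step 1 of the conversion process described above: applying an XSLT stylesheet to an XML document to produce XSL-FO. It uses the standard javax.xml.transform API that ships with the JDK rather than any FOP-specific class. The class name XmlToFo, the tiny <greeting> input document, and the inline stylesheet are all hypothetical and exist only for illustration; they are not part of the tutorial's sample files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: runs an XSLT stylesheet over an XML string.
class XmlToFo {

    // Illustrative stylesheet (an assumption, not the tutorial's own):
    // it wraps the text of a <greeting> element in a one-block FO document.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:template match='/greeting'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='main'"
        + " page-width='8.5in' page-height='11in'>"
        + "<fo:region-body margin-top='50pt' margin-bottom='50pt'/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='main'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<fo:block font-size='14pt'><xsl:value-of select='.'/></fo:block>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML text and returns the serialized result.
    static String transform(String xml, String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers do not need to handle a checked exception.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints the generated formatting objects to standard output.
        System.out.println(transform("<greeting>Hello, XSL-FO!</greeting>", STYLESHEET));
    }
}
```

In practice you would read the XML document and stylesheet from files and write the result to a .fo file, which the rendering engine then converts to PDF in step 2. With FOP 0.20.x on the classpath, that second step is usually a single command that names the .fo input and the .pdf output; treat the exact invocation as something to confirm against the FOP documentation for your version.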
The <fo:root> element contains a <fo:layout-master-set> and a <fo:page-sequence>. The <fo:layout-master-set> normally contains information about page layouts, while the <fo:page-sequence> contains the actual content you're formatting.

XSL-FO document structure in detail

Start by looking at a simple XSL-FO document, simple.fo (also in the download file, x-xslfo-tutorial-samples.zip), and the tags and attributes it contains. Although it may look complicated, don't be intimidated -- most of the things in this file never change. Normally you don't design page layouts for every project; you create a set that works and then reuse it.

Look first at the <fo:root>, <fo:layout-master-set>, <fo:simple-page-master>, <fo:region-body>, <fo:page-sequence>, and <fo:flow> elements; all of these define the structure and layout of the document. Everything else contains the content -- the part that changes from one document to another. Here's the file:

    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <fo:layout-master-set>
        <fo:simple-page-master master-name="main"
            margin-top="36pt" margin-bottom="36pt"
            page-width="8.5in" page-height="11in"
            margin-left="72pt" margin-right="72pt">
          <fo:region-body margin-bottom="50pt" margin-top="50pt"/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="main">
        <fo:flow flow-name="xsl-region-body">
          <fo:block font-size="14pt" line-height="17pt">
            This is a paragraph of text. Notice that as
            <fo:inline font-style="italic">this meaningless
            prose</fo:inline> drones on and on, the FOP software
            automatically calculates line breaks for us.
