Get Going With DocBook

Notes for Hackers

by Mark Galassi

Copyright © 1998 by Mark Galassi

This document can be freely redistributed according to the terms of the GNU General Public License.

Revision History

Revision 0.0, 1997-10-11, revised by: [email protected]
    Initial revision; mostly just an outline.

Revision 0.1, 1997-11-01, revised by: [email protected]
    Firmed up the outline, based on the evolution of Cygnus’s DocBook effort.

Revision 0.2, 1997-11-09, revised by: [email protected]
    Have enough meat in there that I have announced it.

Revision 0.3, 1997-12-31, revised by: [email protected]
    Separated the tutorial (this document) from the style guide for my project at Cygnus. This tutorial can now be distributed on its own.

Revision 0.4, 1998-02-14, revised by: [email protected]
    Added some blurbs on descriptions and made the first web release of this document.

Revision 0.5, 1998-06-12, revised by: [email protected]
    Added Sara Mitchell’s brief essay on SGML marked sections as its own chapter.

Revision 0.6, 1998-10-19, revised by: [email protected]
    Mentioned that this document is GPLed, did a bit of cleanup, and added mention of my FrameMaker+SGML EDD.

Table of Contents

1. What is this?
    1.1. A tutorial for hacker–writers
    1.2. An example of how to use DocBook structure
    1.3. Not an example for all situations
2. Get going
    2.1. Hello, world
    2.2. A slightly more complex example
    2.3. From here on
3. A tour of some DocBook features
    3.1. Global markups
        3.1.1. Books
        3.1.2. Articles
    3.2. Examples and screen snapshots
        3.2.1. Examples involving plain text
        3.2.2. Describing GUIs
        3.2.3. Code samples
    3.3. Describing an API
    3.4. Tables
    3.5. Reference things
4. Concepts
    4.1. Your world view
    4.2. Markup based on content
    4.3. Explanation of SGML–related terms
        4.3.1. SGML: a framework for defining markup languages
        4.3.2. What about the appearance on output media?
5. How you should structure your documents
    5.1. Structure of a book
    5.2. Structure of a chapter
    5.3. Structure of sections
6. DocBook Resources
7. SGML entities
    7.1. BASIC STRUCTURE OF AN SGML DOCUMENT
    7.2. USING ENTITIES TO CONNECT OTHER FILES
    7.3. IDENTIFYING FILES WITH FORMAL PUBLIC IDS
    7.4. USING ENTITIES FOR SHARED TEXT
    7.5. USING MARKED SECTIONS TO HANDLE CONDITIONAL CONTENT
8. Emacs PSGML mode tips
A. Obtaining and installing DocBook tools
    A.1. UNIX
    A.2. Win32
    A.3. FrameMaker+SGML
Glossary

List of Tables

3-1. Convergence of dynamical equations and constraints for various choices of constrained edges

List of Examples

2-1. Bare bones DocBook document: the source
2-2. Fleshier DocBook document: the source
3-1. Documenting a typical ftp session
3-2. Program listings in DocBook
3-3. A simple C program
3-4. Describing a function in a C library API
3-5. A simple informal table (no title)
3-6. A more complex table

Chapter 1. What is this?

1.1. A tutorial for hacker–writers

This booklet has two main goals. The first is to present a tutorial on writing documentation that will be used in a particular project at Cygnus. The second is for me to clarify my thoughts on how the books we ship should be structured. A third goal, which is not as pressing, is that this booklet should become a tutorial for all people at Cygnus (and elsewhere; the GNOME project, for example, is using DocBook) who will be writing notes for future incorporation into Cygnus documentation. If you just want the tutorial with no further background information, please jump right to Chapter 2.

1.2. An example of how to use DocBook structure

This booklet is a valid demonstration of how to use DocBook elements for writing Cygnus documentation. As the Cygnus tag team refines the Cygnus style and typical tag usage, this document will be updated to reflect that style and usage. Whenever I mention Cygnus, please note that the Cygnus stylesheets and all the tools we use are being made available to the world at large, so projects outside of Cygnus can also use these stylesheets, our tools, and this tutorial. If you are a hacker working on a project and you need to write an essay, or if you would like to document your portion of the project, you can use this booklet as an example of how to structure your document and what DocBook elements you should use to mark up your text.

1.3. Not an example for all situations

This booklet is a tutorial written in quasi–slapstick style — I use the words I and you liberally. You should not treat it as an example of how to write a reference manual, or as an example of how to write a more stuffy tutorial.

Chapter 2. Get going

Here is a brutal sequence of steps that will get you started writing DocBook documents. If you actually want to understand what is going on, you might want to read Chapter 4 first and then come back to this chapter. This chapter will not tell you how to get the tools installed; it is assumed that your system administrator has done that for you. If she or he has not done so, there is an appendix that gives instructions for getting DocBook editing and processing tools. It has sections for UNIX (Section A.1, with free tools), Win32 (Section A.2, with free tools) and FrameMaker+SGML (Section A.3).

2.1. Hello, world

Here’s a simple DocBook document to get going. I will show you how to write this document using explicit element tags, but if you are using an authoring tool that provides a high level interface, just choose the same tags using that interface.

Example 2-1. Bare bones DocBook document: the source

    1997-10-11
    My first booklet
    it even has a subtitle
    My first chapter
    Here’s a paragraph of text because it is stylistically poor to start a section right after the chapter title.
    A section in that first chapter
    All I need is a single paragraph of text to make the section valid.
    Remaining details
    Although this booklet is quite complete, here I will mention some details I never got to.
    Use of the word dude
    Here’s an example of how to say dude: DUDE.
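A document with this content might be marked up roughly as in the sketch below. The element names (book, bookinfo, subtitle, chapter, sect1, appendix, para) are assumptions based on DocBook V3.0 and on the doctype line used elsewhere in this chapter, and the emphasis markup around "dude" is a guess; the sketch has not been validated against the DTD.

```sgml
<!doctype book PUBLIC "-//Davenport//DTD DocBook V3.0//EN" []>
<book>
  <bookinfo>
    <date>1997-10-11</date>
    <title>My first booklet</title>
    <subtitle>it even has a subtitle</subtitle>
  </bookinfo>
  <chapter>
    <title>My first chapter</title>
    <para>Here's a paragraph of text because it is stylistically
    poor to start a section right after the chapter title.</para>
    <sect1>
      <title>A section in that first chapter</title>
      <para>All I need is a single paragraph of text to make the
      section valid.</para>
    </sect1>
  </chapter>
  <appendix>
    <title>Remaining details</title>
    <para>Although this booklet is quite complete, here I will
    mention some details I never got to.</para>
    <sect1>
      <!-- the emphasis element here is an assumption -->
      <title>Use of the word <emphasis>dude</emphasis></title>
      <para>Here's an example of how to say dude:
      <emphasis>DUDE</emphasis>.</para>
    </sect1>
  </appendix>
</book>
```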

Brief explanation

This example, and the others in this chapter, are meant to get you used to SGML/DocBook markup, but a minimal explanation should be given here.

The first line, with the doctype declaration, names the document type (book), and the top level element of the document is opened by the book tag. These must agree.

The special words wrapped with < and > symbols are called tags, and they are used to delimit elements. Elements are structural parts of the document, like chapters, titles, paragraphs and so forth.

Another feature of interest in this brief document is the fact that the chapter and sect1 tags have to be followed by a title element.

Process and view it

You might be curious to see how your SGML document will be rendered in both a hardcopy–oriented typesetting output, and in online hypertext output.

I will outline the steps you should carry out on UNIX to generate PostScript and HTML output using the program jade and the DSSSL stylesheets we use at Cygnus. Assuming your file is called myfile.sgml, you can type the following steps:

1. Use jade with the transformation DSSSL to convert DocBook to HTML.

       $ jade -d /usr/lib/sgml/stylesheets/dbtohtml.dsl -t sgml myfile.sgml > myfile.html

   The free tools also provide a utility shell script db2html which does the same. So you can just run:

       $ db2html myfile.sgml

2. Use jade with the real DSSSL to convert DocBook to TeX.

       $ jade -d /usr/lib/sgml/stylesheets/docbook.dsl -t tex myfile.sgml > myfile.tex

3. Use TeX with the jadetex macros to process myfile.tex.

       $ jadetex myfile.tex

4. Use dvips to generate a PostScript file from myfile.dvi.

       $ dvips myfile.dvi

   Note: The steps to create the PostScript file can be replaced by:

       $ db2ps myfile.sgml

5. You can now view the HTML file with your favourite browser, and you can print the PostScript file, or view it with ghostscript, or its front end gv.

2.2. A slightly more complex example

Let us add some body to the trivial example. The beefed–up example begins to demonstrate how exaggeratedly strict you should be when you write your documents. If this makes you nervous, Chapter 4 will try to reassure you that it is a good thing to do(TM).

Example 2-2. Fleshier DocBook document: the source

    <!doctype book PUBLIC "-//Davenport//DTD DocBook V3.0//EN" [ ]>
    <book>
    <bookinfo>
    <date>1997-10-11</date>
    <title>My first booklet
    it even has a subtitle
    Joe “dude” Smith
    &cygnus-copyright;
    &cygnus-legal-notice;
    My first chapter
    Here’s a paragraph of text because it is stylistically poor to start a section right after the chapter title.
    A section in that first chapter
    A section
    All I need is a single paragraph of text to make the section valid.
    Remaining details
    Although this booklet is quite complete, here I will mention some details I never got to.
    Use of the word dude
    Here’s an example of how to say dude: DUDE.
    Saying dude too often can make your brain soft.

Brief explanation

You will notice that I added several new elements in the document preamble (the bookinfo element). A lot of the information in bookinfo is meta-information, mostly to be used for classification and indexing, and which might or might not appear when the document is rendered.

I introduced some entities (sort of like C preprocessor macros) such as &cygnus-copyright;. These are supplied by someone else, and you should use them to get up to date boilerplate information in your document, such as the copyright statement and the legal notice.

I also added attributes to some of the elements, for example the id attribute in the chapter and sect1 tags. An element’s attributes allow you to pigeonhole extra information that may be useful when that element is processed. In this case, the id attribute identifies the chapters or sections for possible future cross–referencing.
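To make the bookinfo elements, the entities and the id attributes concrete, here is an abbreviated sketch of how such a document might be marked up. The entity file names, the id values and the choice of author sub-elements are all made up for illustration; only the entity names, the doctype line and the visible text come from the example above.

```sgml
<!doctype book PUBLIC "-//Davenport//DTD DocBook V3.0//EN" [
<!-- hypothetical file names; use whatever your project supplies -->
<!entity cygnus-copyright SYSTEM "cygnus-copyright.sgml">
<!entity cygnus-legal-notice SYSTEM "cygnus-legal-notice.sgml">
]>
<book>
  <bookinfo>
    <date>1997-10-11</date>
    <title>My first booklet</title>
    <subtitle>it even has a subtitle</subtitle>
    <authorgroup>
      <author>
        <firstname>Joe</firstname>
        <othername>dude</othername>
        <surname>Smith</surname>
      </author>
    </authorgroup>
    &cygnus-copyright;
    &cygnus-legal-notice;
  </bookinfo>
  <!-- the id values below are invented for this sketch -->
  <chapter id="first-chapter">
    <title>My first chapter</title>
    <para>Here's a paragraph of text because it is stylistically
    poor to start a section right after the chapter title.</para>
    <sect1 id="first-section">
      <title>A section in that first chapter</title>
      <para>All I need is a single paragraph of text to make the
      section valid.</para>
    </sect1>
  </chapter>
</book>
```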


2.3. From here on

Examples from now on will not give the entire book or article structure: you are supposed to use your powers of analogy to fit future fragments into one of the above top level structures. Lengthy examples will, however, contain an initial doctype statement so that they will constitute a valid SGML file. Remember to strip that away if you include the fragment in a larger document.

Chapter 3. A tour of some DocBook features

This chapter gives a tour of the various DocBook features I expect you will need to use, and tells you how to use them. In some cases DocBook allows you to do something in more than one way. In those cases, I will tell you how we have agreed to do those things at Cygnus.

3.1. Global markups

Your DocBook document will always have a top level element. In the examples above it was book, but it can be article, set, part, chapter, or one of many more tags. I suspect that as a hacker, rather than a professional doc writer, you will usually be in one of two situations: either you are writing an article or essay yourself (in which case you will probably use article as a top level element) or you will be working within the structure of an existing book or set of books (in which case you will probably be using book or chapter as your top level element).

3.1.1. Books

A book will be structured in the following way:

    book
      meta information
      chapter
        sect1
        sect1
      chapter
        sect1
      appendix
        sect1
      appendix
        sect1
      . . .
      glossary

3.1.2. Articles

An article will be structured in the following way:

    article
      meta information
      sect1
      sect1
        sect2
      sect1
      . . .
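As a rough illustration of the article outline, a skeleton might look like the sketch below. It assumes DocBook V3.0, where the article's meta information goes in an artheader element; the titles and date are placeholders.

```sgml
<!doctype article PUBLIC "-//Davenport//DTD DocBook V3.0//EN" []>
<article>
  <!-- meta information -->
  <artheader>
    <title>A short essay</title>
    <date>1998-10-19</date>
  </artheader>
  <sect1>
    <title>First section</title>
    <para>Text of the first section.</para>
  </sect1>
  <sect1>
    <title>Second section</title>
    <para>Text of the second section.</para>
    <sect2>
      <title>A subsection</title>
      <para>Text of the subsection.</para>
    </sect2>
  </sect1>
</article>
```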

3.2. Examples and screen snapshots

You will frequently want to report a typical session on the command line (sort of the way I do in Appendix A), or describe how to interact with a GUI.

3.2.1. Examples involving plain text


Command line examples are rather straightforward, since they present a linear progression. The main elements to keep in mind are example, programlisting, screen, literal, prompt and userinput. Program listings will be discussed in Section 3.2.3. As an example, here’s how I wrote the beginning of the anonymous ftp session in Section A.1:

Example 3-1. Documenting a typical ftp session

    $ ncftp ftp://ftp.cygnus.com/pub/home/rosalia/
    ncftp> cd docware/RPMS/i386/
    ncftp> mget *.rpm
    ncftp> quit

If you process and view this snippet, you will see that the HTML code has a nice gray shading for screenshots. For now the TeX/postscript output does not have a similar cartouche, but it does use a typewriter font.
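One plausible way to mark up such a session with the elements listed above is sketched here. The exact nesting is an assumption (screen inside example, with prompt and userinput on each line), and the > in the ncftp prompt is written as the &gt; entity to be safe.

```sgml
<example>
  <title>Documenting a typical ftp session</title>
  <screen>
<prompt>$</prompt> <userinput>ncftp ftp://ftp.cygnus.com/pub/home/rosalia/</userinput>
<prompt>ncftp&gt;</prompt> <userinput>cd docware/RPMS/i386/</userinput>
<prompt>ncftp&gt;</prompt> <userinput>mget *.rpm</userinput>
<prompt>ncftp&gt;</prompt> <userinput>quit</userinput>
  </screen>
</example>
```

Line breaks inside screen are significant, which is why the session lines start at the left margin.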

3.2.2. Describing GUIs

I have never described a GUI using screen shots or anything like that, so I will skip this one for now. All I know is that you use tags like callout. I’ll take a screen dump from xv running on a Dilbert strip and experiment.

3.2.3. Code samples


In writing technical documentation you frequently need to show pieces of program source code. You can do this with DocBook’s programlisting tag:

Example 3-2. Program listings in DocBook

    A simple C program
    #include <stdio.h>

    int main(void)
    {
      /* print the greeting and report success */
      printf("Hello, world!\n");
      return 0;
    }

Here’s what it would look like:

Example 3-3. A simple C program

    #include <stdio.h>

    int main(void)
    {
      /* print the greeting and report success */
      printf("Hello, world!\n");
      return 0;
    }
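The markup behind a listing like this might look as follows. This is a sketch that assumes an example wrapper with a title around the programlisting element; note that the angle brackets around stdio.h have to be written as entities so that the SGML parser does not take them for a tag.

```sgml
<example>
  <title>A simple C program</title>
  <programlisting>
#include &lt;stdio.h&gt;

int main(void)
{
  printf("Hello, world!\n");
  return 0;
}
  </programlisting>
</example>
```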

3.3. Describing an API

DocBook has a rather detailed way of marking up descriptions of function behaviour. The tag that introduces it is funcsynopsis. Here is an example:


Example 3-4. Describing a function in a C library API

    #include <stdlib.h>
    double atof
    const char *nptr

Here is how it looks:

    #include <stdlib.h>

    double atof(const char *nptr);
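A funcsynopsis for atof might be written roughly like this. The sketch assumes the DocBook V3.0 content model, in which funcdef and paramdef sit directly inside funcsynopsis (later DocBook versions wrap them in a funcprototype element).

```sgml
<funcsynopsis>
  <funcsynopsisinfo>#include &lt;stdlib.h&gt;</funcsynopsisinfo>
  <funcdef>double <function>atof</function></funcdef>
  <paramdef>const char *<parameter>nptr</parameter></paramdef>
</funcsynopsis>
```

The stylesheet, not the author, supplies the parentheses and the semicolon in the rendered prototype.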

3.4. Tables

For now I am just going to play around with some examples of tables (so look at the source to see how they are done), after which I will present a few simple ways of doing tables. These tables work well, and they might cover most situations in which you will need tables.

Example 3-5. A simple informal table (no title)

    A fictitious description of compiler features
    Architecture    Company    Native code support    Max optimization
    i386            Intel      yes                    -O4
    alpha           DEC        yes                    -O3
    Z80             Zilog      no                     -O1

    Architecture    Company    Native code    Max optimization
    i386            Intel      yes            -O4
    alpha           DEC        yes            -O3
    Z80             Zilog      no             -O1
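For reference, an informal table like this one could be marked up along the following lines. This is a sketch assuming the usual CALS table elements that DocBook uses (tgroup, thead, tbody, row, entry); only the first two data rows are shown.

```sgml
<informaltable>
  <!-- A fictitious description of compiler features -->
  <tgroup cols="4">
    <thead>
      <row>
        <entry>Architecture</entry>
        <entry>Company</entry>
        <entry>Native code support</entry>
        <entry>Max optimization</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>i386</entry>
        <entry>Intel</entry>
        <entry>yes</entry>
        <entry>-O4</entry>
      </row>
      <row>
        <entry>alpha</entry>
        <entry>DEC</entry>
        <entry>yes</entry>
        <entry>-O3</entry>
      </row>
    </tbody>
  </tgroup>
</informaltable>
```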

The following table has a title and uses some mathematical symbols (like the ∑ entity). It also uses the align attribute to specify alignment based on the decimal point in the floating point numbers.

Example 3-6. A more complex table

    Convergence of dynamical equations and constraints for various choices of constrained edges

    Constrained edges    Model              # of iter.    ∑E_i^2           ∑C_i^2
    AD, AF, AG, AA’      Flat space         5             1.15 × 10^-24    1.77 × 10^-19
    AD, AF, AG, AA’      Kasner universe    6             1.43 × 10^-19    1.77 × 10^-8
    AB, AC, AE, AA’      Flat space         4             1.15 × 10^-24    4.09 × 10^-24
    AB, AC, AE, AA’      Kasner universe    7             1.33 × 10^-19    8.07 × 10^-7

This is how that table would be rendered in the output mode you are viewing now:

Table 3-1. Convergence of dynamical equations and constraints for various choices of constrained edges

    Constrained edges    Model              # of iter.    ∑E_i^2           ∑C_i^2
    AD, AF, AG, AA’      Flat space         5             1.15 × 10^-24    1.77 × 10^-19
    AD, AF, AG, AA’      Kasner universe    6             1.43 × 10^-19    1.77 × 10^-8
    AB, AC, AE, AA’      Flat space         4             1.15 × 10^-24    4.09 × 10^-24
    AB, AC, AE, AA’      Kasner universe    7             1.33 × 10^-19    8.07 × 10^-7
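A table of this kind might be marked up as sketched below, showing the header and the first data row only. The colspec lines are an assumption about how the decimal-point alignment was requested (CALS uses align="char" together with a char attribute), and the entity names &sum; and &times; are the ISO entity names I would expect to be used for the symbols.

```sgml
<table>
  <title>Convergence of dynamical equations and constraints
  for various choices of constrained edges</title>
  <tgroup cols="5">
    <!-- align columns 4 and 5 on the decimal point -->
    <colspec colnum="4" align="char" char=".">
    <colspec colnum="5" align="char" char=".">
    <thead>
      <row>
        <entry>Constrained edges</entry>
        <entry>Model</entry>
        <entry># of iter.</entry>
        <entry>&sum;E<subscript>i</subscript><superscript>2</superscript></entry>
        <entry>&sum;C<subscript>i</subscript><superscript>2</superscript></entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>AD, AF, AG, AA'</entry>
        <entry>Flat space</entry>
        <entry>5</entry>
        <entry>1.15 &times; 10<superscript>-24</superscript></entry>
        <entry>1.77 &times; 10<superscript>-19</superscript></entry>
      </row>
    </tbody>
  </tgroup>
</table>
```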

3.5. Reference things

library function reference things, man page reference things

Chapter 4. Concepts

Now that you have written your first DocBook document, and you have processed it, you might actually want to understand a little of what goes on in this world. You have probably heard an incoherent babble of terms like SGML, DTD, DSSSL, DocBook, HTML, and so forth, and you might be wondering how they should all fit in to your world view. I will give you both a top–down explanation, illustrating what your world view might be, and a bottom–up explanation of what all the individual SGML–related concepts mean.

4.1. Your world view

Most people who do word processing or typesetting use a WYSIWYG word processor or a typesetting system in which they type explicit markup instructions which tell the typesetter how to position text on the page (such as TeX and troff). Both of these approaches suffer from a few serious problems. The biggest one is longevity of the document: eternal information (the profound things you type) is interspersed with information that will be obsolete (the typesetting information).

Another big problem with this old approach is lack of structure: the markup does not express content, but rather page layout. Let’s say you are interested in indexing a bunch of papers written in TeX. It would be rather easy to index all occurrences of boldface text, but that’s not interesting at all! Instead, it would be really useful to index all function names in an API. With old typesetting approaches you would need artificially intelligent software that could understand the text and say “aha! this must be the definition of a function in the API”.

So your old world view of writing a document and having the main challenge be how to mark it up to look good on paper is a poor one. Your challenge should be how to mark your document up to emphasize semantic content.


4.2. Markup based on content

So how do you mark your documents such that useful information can be extracted and indexed? The approach in DocBook is to provide a very rich set of markup tags that all relate to the structure and nature of the document’s content. To give you a couple of examples of tags that could help with generating automatic indices: command and attribution. If you have a large body of documentation (for example, all Sun software and hardware is documented with DocBook) you can do a very easy search for any document that discusses a command called mount, or a quote attributed to Ken Thompson. On top of that, with such a structured search you would only find occurrences of mount when it is a command name, and of Thompson when he is the author of a quote.

Now imagine for a moment what would happen if the entire World Wide Web used a rich content–based markup language instead of HTML: a search engine would give you the information you need without all the extra references which just happen to use those words casually. As things stand, a search for mount on the web would almost certainly not find you references on the UNIX mount command.

So a rich markup language like DocBook is a good idea from many points of view, but it can also be difficult to use. DocBook has hundreds of tags (as opposed to just a few in HTML), so you might find the learning curve steep. That is true, and the only way around that is to write documentation on how to use DocBook! On the other hand, once you are quite familiar with DocBook it will not slow you down too much to type in markup all the time. Keep in mind that most of the time a person is not writing, but rather worrying about meta–level problems with their document. If you use DocBook well you will spend a bit more time writing and a lot less time worrying about other issues like the layout on paper. (There is nothing you can do about it anyway!)
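As a small illustration of content-based markup, the fragment below tags a command name and the author of a quotation. The sentences are placeholders invented for this sketch, and it assumes that attribution is used inside a blockquote, as in DocBook V3.0.

```sgml
<para>Use <command>mount</command> to attach a file system
to the directory tree.</para>

<blockquote>
  <attribution>Ken Thompson</attribution>
  <para>The text of the quotation goes here.</para>
</blockquote>
```

An indexing tool can then look for mount only inside command elements, and for Thompson only inside attribution elements.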

4.3. Explanation of SGML–related terms


You have probably already heard many SGML–related terms, but they are seldom used carefully, so people end up with misconceptions which can be annoying.

4.3.1. SGML: a framework for defining markup languages

First of all: SGML (which stands for Standard Generalized Markup Language) is not a markup language in itself! It is a framework for describing individual markup languages (such as DocBook or HTML), so it is really a very different beast. Kind of like the difference between a suitcase factory and a suitcase. DocBook and HTML are specific instantiations of SGML, sometimes called SGML applications. So when people say that they are writing documents in SGML they are being quite imprecise. To be precise they could say (for example) “We are writing our documentation in DocBook, which is an SGML–based markup language”, or something on those lines.

The way you define a particular markup language in the SGML formalism is by writing up a Document Type Definition (usually referred to as a DTD). The DTD specifies what tags can be used in the markup language, and in some cases it also specifies the hierarchy of those tags. For example, you have probably noticed that in DocBook you can only put a title tag immediately after certain tags (such as chapter, sect1 and some others).
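To give a feel for what a DTD contains, here is a toy fragment in SGML declaration syntax. It is a drastically simplified, hypothetical stand-in and not the real DocBook definitions; it only shows how a DTD can say that a chapter must start with a title.

```sgml
<!-- A toy DTD fragment, not the real DocBook definitions -->
<!-- "- O" means: start tag required, end tag may be omitted -->
<!ELEMENT chapter - O (title, para+, sect1*)>
<!ELEMENT sect1   - O (title, para+)>
<!ELEMENT title   - O (#PCDATA)>
<!ELEMENT para    - O (#PCDATA)>

<!-- an optional id attribute for cross-referencing -->
<!ATTLIST chapter id ID #IMPLIED>
```

With these declarations a parser would reject a chapter whose first child is a para, because the content model lists title first.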

4.3.2. What about the appearance on output media?

Chapter 5. How you should structure your documents

I will just leave fillers for most of this chapter for now, but my scope in this chapter is to describe a few writing style issues that are specific to computer manuals. English writing style is described well in many books.

5.1. Structure of a book

dude

5.2. Structure of a chapter

dude

5.3. Structure of sections

dude

Chapter 6. DocBook Resources

• Norman Walsh’s quick reference card on DocBook

Chapter 7. SGML entities

This introduction is provided to help explain some basic concepts of SGML to writers who are not yet very familiar with it, and to illustrate some of the strategies from SGML that you may use to handle common writing issues. It assumes that you already understand the basic concept of marking up documents in an SGML language that indicates what the content is and not what it should look like, and that you know a few acronyms, like DTD (document type definition) or ISO (International Organization for Standardization).

7.1. BASIC STRUCTURE OF AN SGML DOCUMENT

So what does an SGML document look like? Well, there are at least three different pieces: a main document file, the DTD for that type of document, and the SGML Declaration for that type of document. The main document file can also use any number of other files containing text marked up in SGML or other types of information like graphics or multimedia files. The SGML declaration contains a lot of esoteric information, but basically describes SGML conventions that are followed in documents of that type. The DTD describes the structure for documents of that type and the set of tags (the language) that mark up the document to define the content and structure. SGML systems need all of this information to read the SGML document correctly. When SGML systems work with SGML documents, they start with the main document:

1. The main document defines the start and end of your document and identifies the DTD and SGML declaration to use with the document. It also identifies any other files with content that are included in your document. The main document starts with a line identifying the DTD for this document that looks something like this:


Tags are enclosed in start/end tag characters. The most common characters for this are < (start) and > (end). The !DOCTYPE tells SGML systems that this tag identifies the document type for this document. MyDocuments is the name of the document type and is always the name of the beginning and ending tags for documents that use this DTD. All of the content for the document comes between these tags. PUBLIC and the long strange name in quotes identify the file for the DTD and for the SGML declaration. The quoted name is called a formal public ID, or FPI. An explanation of FPIs comes later.

2. Inside the document type declaration, a document can also have other declarations of information that can be used in the document. This is known as the internal subset and is contained inside brackets [ and ] after the DTD identifier and before the end tag character. It would then look like this:

3. After the document type declaration comes the document. All of the content for your document must be enclosed in a beginning and ending tag that matches the document type. In our example, this looks like: <MyDocuments> ... the content of your document ... </MyDocuments>. Tags that enclose contents have a start tag like <MyDocuments> and an end tag like </MyDocuments> that uses the same name; the slash identifies the end tag. Tag names are also known as elements or generic identifiers (gi’s).
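Putting the three steps together, a main document might be laid out as in this sketch. The document type name MyDocuments comes from the text; the formal public ID is a made-up placeholder, since the real one depends on who owns the DTD.

```sgml
<!-- 1. document type declaration; the FPI below is hypothetical -->
<!DOCTYPE MyDocuments PUBLIC "-//Example Org//DTD MyDocuments V1.0//EN" [
  <!-- 2. internal subset: declarations local to this document -->
]>
<!-- 3. the document itself, enclosed in the matching tags -->
<MyDocuments>
  ... the content of your document ...
</MyDocuments>
```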


7.2. USING ENTITIES TO CONNECT OTHER FILES

SGML uses something called a general entity to include other files inside a document. There are other types of entities in SGML, but this is the most common. A general entity is identified by an entity declaration in the internal subset of your main document. It looks like this:

... ]>

The

    ... ]>
    My First Novel
    Joan Duvall
    &Chapter1;


This document has only one chapter, which is included where &Chapter1; appears. Any included file that contains text and SGML tags must also fit properly within the structure of the document, as defined in the DTD. Included files must also be balanced: any tag that starts in a separate file must end in that file. In this example, the file chp1.sgm contains text marked up in SGML, and its contents are all enclosed within <Chapter> ... </Chapter> tags, as the Chapter element is valid content at this point within the document.
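A main document that pulls in chp1.sgm through a general entity might look like the sketch below. The entity name Chapter1 and the file name come from the text; the Title and Author element names and the formal public ID are assumptions made up for the illustration.

```sgml
<!DOCTYPE MyDocuments PUBLIC "-//Example Org//DTD MyDocuments V1.0//EN" [
<!-- declare a general entity that stands for an external file -->
<!ENTITY Chapter1 SYSTEM "chp1.sgm">
]>
<MyDocuments>
<Title>My First Novel</Title>
<Author>Joan Duvall</Author>
<!-- the contents of chp1.sgm are included at this point -->
&Chapter1;
</MyDocuments>
```

Here chp1.sgm would contain a complete, balanced <Chapter> ... </Chapter> element and no doctype line of its own.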

7.3. IDENTIFYING FILES WITH FORMAL PUBLIC IDS

You can always identify files to include by their name and path information (a system identifier). But there are several reasons why you may want to use public IDs, or FPIs, instead. The biggest reason is to make it easier to move files without having to change your documents. Another reason is to make it easier to exchange your files with other people or groups where the directories on their system may be different. FPIs also allow you or your company to claim ownership of important information, such as your DTD.

With FPIs, you identify a file by an abstract name in your document and then supply the location of that file in a catalog, sometimes called a mapping file or entity manager. The catalog is another, separate file from your document. If a file that is used in your document moves, you simply change its location in the catalog rather than changing the location in your document or any other document that uses it. If you exchange files with someone else, or simply move the files to a new computer with different directories, you only have to change the location information once in the catalog.

FPIs must have a specific structure. Two slashes are used to mark the separation between each part of the structure, such as:

"Registration//Owner//Keyword Description//Language"

Registration
    The first character indicates whether the FPI is formally registered (+) or not (-) with an ISO approved registration service. If you define your own FPIs and don’t register them, use the hyphen.

Owner
    The owner of the file is the second part of the FPI. This can be a company, an organization, or a person.

Keyword
    There are several keywords that indicate the type of information in the file. Some of the most common keywords are DTD, ELEMENT, and TEXT. DTD is used only for DTD files, ELEMENT is usually used for DTD fragments that contain only entity or element declarations. TEXT is used for SGML content (text and tags).

Description
    Any description you want to supply for the contents of this file. This may include version numbers or any short text that is meaningful to you and unique for the SGML system.

Language
    This is an ISO two-character code that identifies the native language for the file. EN is used for English.
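As a worked case, the identifier used for DocBook in Chapter 2, "-//Davenport//DTD DocBook V3.0//EN", breaks down as: not registered (-), owner Davenport, keyword DTD, description DocBook V3.0, language EN. A catalog entry that maps it to a file might look like the sketch below; the path is hypothetical and the syntax assumed is that of SGML Open style catalogs, where comments are delimited by pairs of hyphens.

```sgml
-- map the DocBook public identifier to a local file (path is made up) --
PUBLIC "-//Davenport//DTD DocBook V3.0//EN" "/usr/lib/sgml/docbook.dtd"
```

If the DTD moves, only the second string in this entry changes; the documents themselves stay as they are.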

7.4. USING ENTITIES FOR SHARED TEXT

Entities can be used for other purposes. Another common usage is to define text that is repeated in lots of places that you don’t want to type every time, or that you want to be able to easily change everywhere it is used. These are general entities that you define in the internal subset of your document and then use within your document.


For example, you are writing a document that frequently refers to the ANSI X.12 ASC 835 standard. This phrase must appear exactly as it is shown. Rather than type it out every single time, you define this entity:

... ]>

Within the document, you use this entity every time you need to refer to the standard name. For example:

This interface follows the exchange protocols for &ansi835; version 3070 for remittance information.

When the document is printed or processed by an SGML system for any type of output, the &ansi835; characters are replaced with ANSI X.12 ASC 835 instead.
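A declaration and use of such a text entity might look like this sketch. The entity name and its replacement text come from the example above; the document type, its formal public ID and the Para element are placeholders.

```sgml
<!DOCTYPE MyDocuments PUBLIC "-//Example Org//DTD MyDocuments V1.0//EN" [
<!-- shared text: change it here and it changes everywhere -->
<!ENTITY ansi835 "ANSI X.12 ASC 835">
]>
<MyDocuments>
<Para>This interface follows the exchange protocols for &ansi835;
version 3070 for remittance information.</Para>
</MyDocuments>
```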

7.5. USING MARKED SECTIONS TO HANDLE CONDITIONAL CONTENT

Sometimes you need to have different versions of the content for different purposes. There are several ways to do this using SGML, one of which is called marked sections. A simple example of conditional content might be the description of keys used in a software program where they appear in boxes in the printed manual but are blue inside brackets on the web site or CD. Rather than have two separate versions of the conventions for each output of the manual, you can use marked sections to keep both variations in the same document. There are different types of marked sections, but the types that allow you to control conditional content are ignore/include sections. These markers act like on/off switches to allow content to be included or ignored in different situations.


The marker for an include marked section looks like this:

    F1 ]]>

while an ignore marked section looks like this:

F1 ]]>
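In full, a pair of marked sections for the F1 key convention might look like the sketch below. The marked section delimiters and keywords are standard SGML; the para and keycap elements and the wording of the sentences are assumptions added for illustration.

```sgml
<!-- print version: the key is shown as a key cap -->
<![ INCLUDE [
<para>Press <keycap>F1</keycap> for help.</para>
]]>

<!-- electronic version: the key is shown inside brackets -->
<![ IGNORE [
<para>Press [F1] for help.</para>
]]>
```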

INCLUDE and IGNORE are the keywords that tell the SGML system what to include or skip. As this example shows, marked sections can contain both text and tags as long as the tags within the markers are balanced (if a tag starts inside a marker, then it ends inside the same marker). In this example, you leave the markers for the print version as INCLUDE and the markers for the electronic version as IGNORE when you print a master copy. When you create the electronic book or HTML for the web site, you change the markers for the print version to IGNORE and the markers for the electronic version to INCLUDE. This works just fine, unless you have several different sections you need to include or ignore together - it’s cumbersome to change each one manually and you can easily make a mistake. So instead, you can define parameter entities under any names you want and then change the entities to turn the include/ignore switches. To do this, you add parameter entity declarations in the internal subset at the top of your master SGML document. For example:

...]>

You then use the names for each marked section, like this:

    F1 ]]>

    F1 ]]>
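A complete sketch of the parameter entity approach is shown here. The entity names hardcopy and softcopy are the ones used in this section; the document type, its formal public ID, and the para and keycap elements are placeholders.

```sgml
<!DOCTYPE MyDocuments PUBLIC "-//Example Org//DTD MyDocuments V1.0//EN" [
<!-- switches: swap the two values to produce the electronic version -->
<!ENTITY % hardcopy "INCLUDE">
<!ENTITY % softcopy "IGNORE">
]>
<MyDocuments>
<![ %hardcopy; [
<para>Press <keycap>F1</keycap> for help.</para>
]]>
<![ %softcopy; [
<para>Press [F1] for help.</para>
]]>
</MyDocuments>
```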

To print the master copy, you leave the entity declarations as they are shown above. The SGML system interprets each %hardcopy; it finds as INCLUDE and includes those marked sections. The %softcopy; is interpreted as IGNORE and those sections are skipped. When you’re ready to produce the electronic version, you only have to change the entity declarations at the top of the file, like this:

...]>

With this single change, the electronic versions are included and the printed versions are skipped.

Marked sections can be simple, but they are not always the best choice to manage conditional text. They work best if you use them sparingly and in very clear situations, where it’s easy to figure out when to use them and when to change the INCLUDE/IGNORE switches. Some of the problems that they can create include:

• Marked sections can be nested (a marked section inside another marked section), but the result may not be the effect you want. For example, an included marked section inside an ignored marked section is still ignored.

• If you use lots of differently named sections, it’s easy to lose track of the content, and your SGML document can become invalid if some required structures are set to IGNORE.

• Finally, they are not supported in document content in XML, the new standard subset of SGML that will be viewable directly on the World Wide Web. Moving to XML could be difficult if you use marked sections a lot.
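
The nesting problem in the first item can be sketched as follows. This is only an illustration: it reuses the %hardcopy; and %softcopy; entities from this chapter, assumes they are currently declared as IGNORE and INCLUDE respectively, and uses made-up paragraph text. Under SGML's rules, as far as I understand them, the IGNORE on the outer section wins, so the inner section is skipped even though its own keyword is INCLUDE:

```sgml
<!-- Assumed declarations: hardcopy is "IGNORE", softcopy is "INCLUDE" -->
<![ %hardcopy; [
<para>This paragraph is meant for the printed book only.</para>
<![ %softcopy; [
<para>This paragraph was meant for the online version, but it sits
inside an ignored section and is skipped along with it.</para>
]]>
]]>
```

Keeping conditional sections flat, one after the other, avoids having to reason about cases like this.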

Chapter 8. Emacs PSGML mode tips

From: Mark Galassi
Sender: [email protected]
To: [email protected]
Subject: Re: dtd parser!
Date: Fri, 23 Oct 1998 09:19:01 -0600 (MDT)

Jose> My question is: Is there any available tool (parser) to
Jose> analyse a dtd and return the same informations that is
Jose> contained in the reference manual?

There are a few. The one I use is emacs with PSGML mode. It has emacs-style completion *on tags*!! It’s really cool.

If you type "C-c C-e" it will prompt you for an element, and offer as completions only the valid elements at that point.

Once it inserts the element, it inserts it with any required following elements along with a comment saying which ones you could put later on.

As an example, I just went to a DocBook buffer and typed

C-c C-e variab

and it inserted this text in the buffer:
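
A sketch of the kind of skeleton this produces, assuming the DocBook rule that a variablelist needs at least one varlistentry containing a term and a listitem. The exact whitespace and the comments PSGML adds will differ, and para is only one of several valid choices inside listitem:

```sgml
<variablelist>
  <varlistentry>
    <term></term>
    <listitem>
      <para></para>
    </listitem>
  </varlistentry>
</variablelist>
```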


Another example:

C-c C-e i

and it shows me the following completions:

---

Click mouse-2 on a completion to select it. In this buffer, type RET to select the completion near point.

Possible completions are: important indexterm informalequation informalexample informaltable itemizedlist

---

Appendix A. Obtaining and installing DocBook tools

Follow these instructions for your favourite platform, either UNIX (free tools), Win32 (free tools) or FrameMaker+SGML.

A.1. UNIX

On UNIX we are distributing these tools as RPM packages, providing a binary distribution for Intel-based GNU/Linux systems, and a source distribution for others [FIXME: must clear this up a bit]. Here is an example of a UNIX session to get and install the free DocBook tool set on a GNU/Linux system. The tools can be found at this location [FIXME: the DSSSL for printed output does the wrong thing with ulink.]

$ ncftp ftp://ftp.cygnus.com/pub/home/rosalia/
ncftp> cd docware/RPMS/i386/
ncftp> mget *.rpm
ncftp> quit
$ su
Password: ultra-secure
# rpm --install sgml-common*.rpm
# rpm --install docbook*.rpm
# rpm --install stylesheets*.rpm
# rpm --install psgml*.rpm
# rpm --install jade*.rpm
# rpm --install jadetex*.rpm
# rpm --install sgml-demo*.rpm

Note: The order of installing packages is important.


Note: When you are upgrading, rather than installing for the first time, replace each rpm --install step with rpm --upgrade.

You are now ready to edit SGML/DocBook documents.

A.2. Win32

For Win32 we are distributing these tools as a standard Windows InstallShield self-extracting package.

A.3. FrameMaker+SGML

FrameMaker+SGML is a powerful WYSIWYG[1] editor that lets you edit SGML documents with all the rendering and other features of FrameMaker. In FrameMaker+SGML the correspondence between SGML tags and printed output is implemented with a document called the EDD[2], which maps SGML elements to FrameMaker formatting instructions.

FrameMaker+SGML ships with an outdated DocBook implementation, based on the DocBook 2.2.1 DTD (the current version is 3.0) provided by Lynne Price. Lynne Price then upgraded the DTD to 3.0 under contract to Cygnus Solutions but left most of the elements undefined in the EDD. In March of 1998 I started filling in the gaps by defining the output format for many tags in the EDD. Later that year Alyce Gershenson and Frederick Geers modified the EDD and the FrameMaker template file to match the Cygnus publication styles. In September of 1998 I defined many more tags in the EDD, to the point where I now think that Cygnus’s EDD is a useful tool for FrameMaker+SGML users.

I have finally made Cygnus’s FrameMaker+SGML EDD available by anonymous ftp at the address ftp://ftp.cygnus.com/pub/home/rosalia/docware/framemaker/ Please keep in mind that there are still some serious unresolved problems in the read/write rules and the FrameMaker API client, which I have not yet sat down to study.

To use the FrameMaker+SGML support you should do the following:

1. Grab all the FrameMaker+SGML files from the anonymous ftp site listed above. Put them in a directory (or folder on Windows) of their own.

2. Start up FrameMaker+SGML and load the file Sgmlapps.fm.

3. Double-click on a line that says [FIXME: must look it up] and change the variable to match the location of the folder you created in step 1.

4. Go to the File menu and select Developer's tools, and in there select Reread SGML application. It will ask you to select a file to “read in”, and you should pick Sgmlapps.fm.

5. Go to the File menu and select Set SGML application. You will get a couple of choices. You should choose DocBook-3.0-cygnus (or whatever it was). Do not choose DocBook: this is the obsolete version that Adobe ships.

6. Create a new document. When FrameMaker+SGML asks you for a template, you should select the Template.fm file that you downloaded.

If all went well you should now have a DocBook document open. You can bring up the “structure view” and “element catalog” windows to help you edit your DocBook document.

Note: I would love it if some FrameMaker+SGML users were to collaborate with me on the DocBook EDD. If you are interested in doing so, please send email to

Notes

1. WYSIWYG stands for “what you see is what you get”.

2. Element Definition Document

Glossary

ASCII

(American Standard Code for Information Interchange) This standard character encoding scheme is used extensively in data transmission.

ANSI

(American National Standards Institute) This group is the U.S. member organization that belongs to the ISO, the International Organization for Standardization.

attribute

An attribute provides more information about an element such as classification level, unique reference identifiers, or formatting information.
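
As a rough illustration, here is how an attribute might be declared in a DTD and then set in a document. The security attribute and its values are hypothetical, chosen only to mirror the "classification level" case above; this is not the real DocBook declaration for para:

```sgml
<!-- Hypothetical declaration: names and values are for illustration only -->
<!ATTLIST para
          id        ID                   #IMPLIED
          security  (public | internal)  public>

<!-- In the document, attributes are set in the start tag -->
<para id="intro" security="internal">Some text.</para>
```

Here id is optional (#IMPLIED), while security falls back to public when it is left out.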

CCITT Group 4

(International Telegraph and Telephone Consultative Committee) This CALS standard for raster graphics incorporates tiling, which divides a large image into smaller tiles. You can exchange graphic files in CCITT/4 format in a compressed state so they take up much less file space.

CITIS

(Contractor Integrated Technical Information Service) As part of CALS Phase II, CITIS is a draft functional specification for services. DoD acquisition managers designed CITIS as a plan to gain access to product-related digital technical information.

CGM

(Computer Graphics Metafile) CGM is one of the CALS standard formats for representing 2–D technical illustrations. CGM is an object-oriented graphic format.

DSSSL

(Document Style Semantics and Specification Language) This draft international standard (DIS 10179) applies to the specification of processing information for SGML documents. DSSSL is expected to become an international standard.

DTD

(Document Type Definition) A DTD is the formal definition of the elements, structures, and rules for marking up a given type of SGML document. You can store a DTD at the beginning of a document or externally in a separate file.
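
Both storage options can appear in one document, as in this minimal sketch. It assumes the DocBook 3.0 public identifier and a catalog that resolves it to the external DTD file; the draftnote entity and the book content are made up for illustration:

```sgml
<!DOCTYPE book PUBLIC "-//Davenport//DTD DocBook V3.0//EN" [
<!-- Internal subset: declarations stored at the beginning of the document -->
<!ENTITY draftnote "This is a hypothetical local declaration.">
]>
<book>
  <title>Example</title>
  <chapter>
    <title>First chapter</title>
    <para>&draftnote;</para>
  </chapter>
</book>
```

The public identifier points at the external DTD, and anything between the square brackets is read as part of the DTD as well.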

EDI

(Electronic Data Interchange) This is a set of computer interchange standards for business documents such as invoices, bills, and purchase orders.

element

An element is a piece of data within a document, such as a paragraph or a chapter, that may contain either text or other subelements.


element declaration

A statement in the DTD defining an element and declaring the order in which it may appear in the document and what other elements it may include.
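
A small sketch of what such statements look like. The memo DTD below is hypothetical and far simpler than DocBook; it only shows the shape of a declaration:

```sgml
<!-- Hypothetical memo DTD, not part of DocBook -->
<!ELEMENT memo  - - (to, body+)>
<!ELEMENT to    - - (#PCDATA)>
<!ELEMENT body  - - (#PCDATA)>
```

The group in parentheses says that a memo contains exactly one to followed by one or more body elements, and the two hyphens say that neither the start tag nor the end tag may be omitted.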

entity

An entity is a self-contained piece of data that can be referenced as a unit. You can refer to an entity by a symbolic name in the DTD or the document. An entity can be a string of characters, a symbol character (unavailable on a standard keyboard), a separate text file, or a separate graphic file.

entity declaration

A statement in the DTD or document that assigns an SGML name to an entity so you can reference it.
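
For example, the two declarations below name a string of characters and a separate text file. The entity names and the file name chap1.sgml are hypothetical:

```sgml
<!-- Hypothetical names and file name, for illustration only -->
<!ENTITY projname "Example Project">
<!ENTITY chap1 SYSTEM "chap1.sgml">
```

In the document you would then write &projname; wherever the project name should appear, and &chap1; at the point where the contents of the separate file should be pulled in.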

FOSI

(Formatting Output Specification Instance) A FOSI is used for formatting SGML documents for printing and other outputs. It is a separate file that contains formatting information for each element in a document.

HTML

(HyperText Markup Language) This is the format of files published on the World Wide Web. HTML is an application of SGML; to author in HTML using SGML-based authoring software, you simply need the HTML DTD.

IGES


(Initial Graphics Exchange Specification) The IGES standard for engineering, product design, and manufacturing drawings is one of the CALS standard graphics formats.

Internet

The Internet is a worldwide communications network originally developed by the U.S. Department of Defense as a distributed system with no single point of failure. The Internet has seen an explosion in commercial use since the development of easy-to-use software for accessing the Internet.

ISO

(International Organization for Standardization) The ISO is an industry-supported organization that establishes worldwide standards for everything from data interchange formats to film speed specifications.

markup

Markup is anything added to the content of the document that describes the text.

parser

A parser is a specialized software program that recognizes SGML markup in a document. A parser that reads a DTD and checks and reports on markup errors is a validating SGML parser. A parser can be built into an SGML editor to prevent incorrect tagging and to check whether a document contains all the required elements.

PDES/STEP


(Product Data Exchange Standard/Standard for the Exchange of Product Model Data). PDES/STEP are standards under development for communicating a complete product model with sufficient information content that advanced CAD/CAM applications can interpret. PDES is under development as a national standard and STEP is under development as its international counterpart.

tag

In the world of SGML, a tag is a marker embedded in a document that indicates the purpose or function of the element. Each element has a beginning tag and an end tag.

World Wide Web

Often referred to as WWW or the Web, this usually refers to information available on the Internet that can be easily accessed with software usually called a “browser.” Organizations publish their information on the Web in a format known as HTML; this information is usually referred to as their “home page” or “web site”.
