Microsoft Office Word 2003 Rich Text Format (RTF) Specification
White Paper
Published: April 2004

Table of Contents

Introduction
RTF Syntax
Conventions of an RTF Reader
Formal Syntax
Contents of an RTF File
    Header
    Document Area
East Asian Support
    Escaped Expressions
    Character Set
    Character Mapping
    Font Family
Appendix A: Sample RTF Reader Application
    How to Write an RTF Reader
    A Sample RTF Reader Implementation
    Notes on Implementing Other RTF Features
    Other Problem Areas in RTF
Appendix B: Index of RTF Control Words
    Special Characters and A–
Appendix C: Control Words Introduced by Other Microsoft Products
    Pocket Word
    Exchange (Used in RTF<->HTML Conversions)

For the latest information, please see http://www.microsoft.com/office/

For Microsoft® MS-DOS®, Windows®, and Apple Macintosh Applications
Version: RTF Version 1.8
Microsoft Technical Support
Subject: Rich Text Format (RTF) Specification
Specification Contents: 214 Pages
10/2003 – Word 2003 RTF Specification

Introduction

Rich Text Format (RTF) is a method of encoding formatted text and graphics for use within applications or for data and formatting transfer between applications. Currently, users depend on special translation software to move word-processing documents between various applications developed by different companies. RTF serves as a standard for data transfer between word processing software and for document formatting, and as a means of migrating content from one operating system to another.
This document specifies the format used by RTF for text and graphics interchange. RTF uses 7-bit ASCII (the lower byte range) or the ANSI, PC-8, Macintosh, or IBM PC character sets to represent the formatting of a document. RTF files created in Microsoft Word 6.0 (and later) for the Macintosh and Power Macintosh have a file type of “RTF.”

Earlier versions of Word do not necessarily support all the RTF commands noted in this specification. To determine which RTF commands were supported by a release of Word developed prior to Word 2003, you must consult the prior version of this document for that release. However, files previously created with an earlier version of Word using RTF should be read without problem by newer versions of Word.

Software that can convert a file to RTF is called an RTF writer. An RTF writer separates the application's control information from the actual text and writes a new file containing the text and the RTF command groups associated with that text. Software that reads an RTF file and is capable of displaying the selected text on the screen with its formatting commands applied (WYSIWYG) is called an RTF reader.

A sample RTF parsing reader application is available (see Appendix A: Sample RTF Reader Application in this document). This sample RTF parsing reader is designed for use in conjunction with this document to assist those interested in developing their own RTF readers. This application and its use are described in Appendix A. The sample RTF reader is not a for-sale product, and Microsoft does not provide technical support or any other kind of support for the sample RTF parsing reader code or this document.

RTF version 1.7 included many new control words introduced specifically for Microsoft Word for Windows 95 version 7.0, Microsoft Word 97 for Windows, Microsoft Word 98 for the Macintosh, Microsoft Word 2000 for Windows, and Microsoft Word 2002 for Windows, as well as other Microsoft products.
Version 1.8 includes new command extensions specifically for use with new features available in Microsoft Word 2003.

RTF Syntax

RTF files are plain text, usually 7-bit ASCII, and consist of clear text control words, control symbols, and groups. RTF files are easily transmitted between most PC-based operating systems because of their 7-bit ASCII characters. However, converters that communicate with Microsoft Word for Windows or Microsoft Word for the Macintosh should expect data transfer as 8-bit characters. Unlike most clear text files, there is no set maximum line length for an RTF file before a carriage return/line feed is expected. In fact, a carriage return/line feed is never expected to be found in an RTF file, and some RTF readers ignore it when it is found in clear text segments.

Control Word

An RTF control word is a specially formatted command used to mark characters for display on a monitor or characters destined for a printer. A control word cannot be longer than 32 characters. A control word commonly takes the following form:

\LetterSequence<Delimiter>

Example: \par

Note  A backslash begins each control word, and control words are case sensitive.

The LetterSequence is made up of alphabetic characters (a through z or A through Z). Control words (also known as keywords) originally did not contain any uppercase characters; however, in recent years uppercase characters have begun to appear in some newer control words.

A Delimiter commonly marks the end of an RTF control word, and can be one of the following:

• A space.

• A numeric digit or a hyphen (-), which indicates that a numeric parameter is associated with the control word. The subsequent digit sequence is then delimited by a space or any character other than a letter or a digit (commonly another control word, which begins with a backslash). The parameter can be a positive or negative number. The range of the values for the number is generally -32767 through 32767. However, Word tends to restrict the range to -31680 through 31680 and also allows values in the range -2,147,483,648 to 2,147,483,648 for a small number of keywords (specifically \bin, \revdttm, and some picture properties). An RTF parser must allow an arbitrary string of digits as a legal value for a keyword (provided it does not exceed the value ranges noted earlier). The control word can then be delimited by a space, a nonalphabetic or nonnumeric character, or a backslash “\” in the same manner as any other control word.

• Any character other than a letter or a digit. In this case, the delimiting character terminates the control word but is not actually part of the control word. Such as a
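
The control word form and delimiter rules described above can be sketched as a small parser. This is a minimal illustration in Python, not the sample reader from Appendix A. The names parse_control_word, CONTROL_WORD, and MAX_NAME_LENGTH are hypothetical. Two points are assumptions: the 32-character limit is applied to the letter sequence only, and a delimiting space is consumed together with the control word, whereas any other delimiter is left in the input. Range checking of the numeric parameter is left out.

```python
import re

# Assumption: the 32-character limit is applied to the letter sequence only.
MAX_NAME_LENGTH = 32

# Backslash, a letter sequence, an optional signed digit sequence,
# and an optional single delimiting space.
CONTROL_WORD = re.compile(r"\\([A-Za-z]+)(-?\d+)?( )?")


def parse_control_word(text, pos=0):
    """Parse one control word that starts at text[pos].

    Returns (name, parameter, next_pos). The parameter is an int, or None
    when the control word carries no numeric parameter. next_pos is the
    index of the first character after the control word; a delimiting
    space is skipped (assumption), any other delimiter is left in place.
    """
    match = CONTROL_WORD.match(text, pos)
    if match is None:
        raise ValueError(f"no control word at position {pos}")
    name, digits, _space = match.groups()
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"control word longer than {MAX_NAME_LENGTH} letters")
    # Case is preserved because control words are case sensitive.
    parameter = int(digits) if digits is not None else None
    return name, parameter, match.end()
```

For example, parse_control_word(r"\li-720\par") returns ("li", -720, 7): the hyphen introduces a negative parameter, and the backslash of the following control word ends the digit sequence without being consumed.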