Word Processing in Classical Languages
Latin, Germanic, Greek
O corve, avis nimis nitida et splendida, que avis est tibi similis in pennis nisi solus cignus? et super omnia placet michi ymno si solum cantum tuum audire possem, inter ceteras aves te utique extollerem.
David J. Perry
DRAFT FOR COMMENT #2: NOT FINAL ii Word Processing in Classical Languages
Rye High School, Rye, New York
This Draft for Comment may be obtained from
This document is set up like a printed book; even-numbered pages should be on the right and odd-numbered pages on the left. If you print out the document before reading it, turn each even-numbered page over, print down, and back it up with the preceding odd-numbered page. Then punch for a three-ring binder or staple at the spine.
Body text of this book is set in Cardo, a Unicode font by David Perry; major heads are in Lithos and subheads in CG Omega.
The Latin quotation on the cover is from a prose version of the fable of the fox and the crow. These prose versions are found in the Wolfenbüttel manuscript of the fables attributed to ‘Walter of England’, where they were added to help students struggling with the verse originals. “O corve, avis nimis nitida et splendida, que avis est tibi similis in pennis nisi solus cignus? et super omnia placet michi ymno si solum cantum tuum audire possem, inter ceteras aves te utique extollerem.” It is set in the Beowulf font by Peter Baker.
This book refers to a number of company names and product names which are trademarks. These references are used in an editorial fashion to provide readers with information about the products mentioned, and no trademark infringement is intended. All trademarks are the property of their respective owners.
Copyright © by David J. Perry.
Information in this book is provided to help users find appropriate ways to prepare their documents. It is the responsibility of each user to evaluate any product mentioned to see whether it is suitable for his or her needs. In no event shall David J. Perry be liable for difficulties with or damage to any computer system caused by use of any product or procedure mentioned in this book.
Second draft for comment, printed 1/12/01 with various corrections and additions to the draft of August 2000.
Contents
List of Tables and Figures vi
Acknowledgements vi
Introduction 1
Part I. Font and Keyboard Basics
1. Fonts, Character Sets and Unicode 5
2. Keyboard Entry and Other Useful Information 11
Part II. The Present
3. Latin 17
4. Interlude: Using Unicode Characters with Microsoft Word 25
5. Germanic 29
6. Interlude: What If I Need Characters That Aren’t in My Font? 33
7. Greek 37
8. Epigraphy 49
9. Metrics 53
10. Setting Type 55
11. Sharing Documents with Others 59
Part III. The Future
12. The Need for Standardization 65
13. OpenType 69
Part IV. Resources
Sources of Information 71
Works Cited 72
Appendices
Appendix 1. Macintosh character set 76
Appendix 2. Windows character set 77
Appendix 3. Windows 2000 Polytonic Greek Keyboard 78
Appendix 4. ISO Language Codes 83
List of Tables and Figures
Table 1. Selected combining diacritical marks in Unicode. 10
Table 2. Unicode characters for classical and medieval Latin. 23
Table 3. Medieval Germanic characters in Unicode. 31
Table 4. The Greek and Coptic block of Unicode. 43
Table 5. The Greek Extended block of Unicode. 44
Table 6. Epigraphic characters in Unicode. 50
Figure 1. Adding languages and keyboard layouts. 13
Figure 2. The US-International keyboard. 14
Figure 3. Word’s Insert/Symbol dialog box. 25
Figure 4. [add new figure 4 (PDF screenshot) here] p. 57
Sample 1. Printing with CL Fonts. 20
Acknowledgements
I am grateful to the following people for comments and suggestions: Rob Latousek. Any errors or infelicities which remain are mine.
Introduction
About This Book
Audience and Purpose
This book is intended for anyone who works with text in classical or medieval Latin, medieval Germanic languages, or polytonic Greek.1 This includes teachers at all levels who produce materials for students and authors of textbooks as well as those who prepare scholarly articles or editions. This material may also be of help to some people outside the academic professions, such as type designers, font manufacturers, and typesetters or desktop publishers who sometimes need to work with classical languages. Both Microsoft Windows and the Macintosh operating system (Mac OS) are covered.
The book is intended, first and foremost, to provide practical help for users in getting the characters that they need in their work. A secondary purpose is to educate academic users about some key concepts and issues concerning the use of type on computers. I have also provided some information about how to get non-English characters used in modern Western languages, partly because scholars frequently work in several languages and partly because some of the concepts apply to the information about classical languages.
Although little of this information is original with the present author, it has never been available in one place before. I hope that this compilation will be a convenient source of help for the community of classical language users. Some may be surprised to find medieval Germanic languages (Old English, Old Norse, etc.) treated together with Latin and Greek. You will find, however, that users of these languages need many of the same characters that are used in classical and medieval Latin, particularly vowels with macra and brevia. Hebrew could also be added here, since biblical scholars often use it along with Greek and some software publishers provide support for both languages. However, since I do not know Hebrew, I have not included much information about it.2
Origin of This Book
This book is the outgrowth of an interest dating back years in the problems faced by classicists and others who need special characters. While developing the CL Fonts package for Latinists (described below on page 20) I became frustrated with the limitations of 256-character fonts and began to investigate Unicode. This research has convinced me that we need to educate ourselves about this technology and take advantage of it to solve some of the problems that we have faced over the years.
1 I use the term polytonic rather than classical Greek. Although many readers of this book will be classical scholars, the information will be of use to anyone who needs to represent a Greek text (ancient, Byzantine, or modern) that contains the various diacritics used prior to the promulgation of the monotonic system in .
2 The WinGreek, Son of WinGreek, Silver Mountain and Antioch packages discussed in the Greek section also support Hebrew.
I have no financial interest in any of the products mentioned here; comments are my own opinions based on my experience with the various products. I welcome corrections or information about additional products. Email me at
This document contains many
Finding What You Need
Because some users of this book may be relatively unfamiliar with the issues discussed, while others may be highly sophisticated computer users, I have tried to make it easy to find the information you need and to skip material you do not need to deal with.
Part I of this book presents basic information about fonts used on computers, character sets, Unicode, and keyboard entry. Part II describes solutions that are available right now (January ) for users of Latin, Germanic languages, and Greek. The chapter for each language begins with an Overview that summarizes the characters needed for that language and provides any other necessary general information. Immediately after the Overview comes a How-To section that provides practical help for people who need to find out how to get macra, Greek characters, etc. in their documents. The How-To section is broken down into two parts: one that will apply to all users, discussing traditional 256-character fonts, and the other specific to users who have a Unicode-capable word processor such as Microsoft Word . If you do not understand the reasons for this distinction, you need to read the section below on fonts and Unicode.
Part III goes beyond the question of practical help with documents and presents some information about where we will be in the future with characters and fonts. This section will be of interest to anyone who is seriously interested in these issues and is important for anyone who plans to work with classical languages in the future. There is a pressing need for standards which should be of concern to anyone in the profession.
Full references for all the books and other items referred to in the body of this book will be found in the Works Cited section, while the Resources section provides additional places to go for information.
Part I. Font and Keyboard Basics
1. Fonts, Character Sets, and Unicode
Note: Part I introduces some basic concepts and terms that will be used in the following pages. If you are already knowledgeable about computer character sets and have a general acquaintance with Unicode, you can skim this part, stopping to focus only on things that are new to you.
Font Formats
Personal computers mainly use two types of fonts. PostScript is a font format developed by Adobe Systems and built into some laser printers and all PC-based typesetting systems. It was the first widely used format that allowed the user to scale type to any point size without losing quality. PostScript fonts are sometimes also referred to as Type 1 fonts. TrueType is a font format originally promoted by Apple and Microsoft as an alternative to PostScript; it is built into Windows ., Windows / and into the Mac OS from version .. on. Nowadays most Windows and Mac users rely primarily on TrueType fonts, although the publishing industry still employs PostScript extensively. In order to use PostScript fonts on a Windows or Mac computer, you need to install Adobe Type Manager (ATM), which can be downloaded for free from Adobe’s web site
Microsoft and Adobe have recently created the OpenType format, supported in Windows and Mac OS . OpenType allows font developers to make new fonts based on either TrueType or PostScript outlines, and it adds some features that are important for multilingual users; more will be said about it in Chapter 13. One may also occasionally encounter bitmapped fonts, which are designed to be used at a specific point size and cannot be enlarged or reduced without significant loss of quality. Bitmapped fonts are used mainly to display text on screen.
Character Sets and Code Pages
Many years ago, a standard for fonts used on a Windows PC or a Mac was established with 256 characters. The first 128 positions (numbered 0–127) contain the letters, numbers, and punctuation marks we see on the keyboard, plus a few others that are used internally by the computer. Characters with accent marks, non-English characters, and many additional symbols are found in the range 128–255 and are referred to as extended or upper-order characters. Windows uses the ANSI (American National Standards Institute) character set, and each character may be referred to by its ANSI number; for example, the upside-down question mark used in Spanish is ANSI 191. Apple Computer established its own arrangement for upper-order characters on the Macintosh; like the Windows set, it includes accented characters and other symbols, but not exactly the same ones as in Windows or in the same order. See Appendices 1 and 2, pp. 76–77, for the standard arrangement of upper-order characters as found in the two operating systems.
However, it is obvious that 256 character positions cannot contain even all the characters used in languages that employ the Latin script, to say nothing of other writing systems such as Greek, Cyrillic, Hebrew, Arabic, or Japanese. Computer manufacturers created code pages to support Latin characters not in the standard Windows or Mac sets as well as non-Latin scripts. A code page is an arrangement of characters designed to support one or more languages and assigned to the available positions in a standard font. For example, Windows has a Central European code page to support the diacritics needed in Polish, Czech, etc., as well as a Russian code page and many others. Code pages work well enough except that a user may encounter problems mixing languages in a document or exchanging files with a machine set up for a different code page. The advent of TrueType fonts helped this situation somewhat, since TrueType fonts can contain more than 256 characters, but the user could still only access 256 of them at a time.
Neither the Mac OS nor Windows has ever provided direct support for many of the scholarly characters needed in various languages, so a number of individuals and companies developed fonts that support Latin, Greek, and the medieval Germanic languages. Until quite recently, none could support more than 256 characters (which can be a problem for Greek). Since no standard exists, these fonts contain characters in different orders and are mutually incompatible; see discussion of this problem below.
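The effect of competing code pages can be illustrated with a few lines of Python. This is a sketch only: it assumes that Python's built-in codecs `cp1252`, `mac_roman`, and `cp1250` are reasonable stand-ins for the Windows ANSI set, the standard Macintosh set, and the Windows Central European code page. It takes the stored value 191 (the upside-down question mark in the Windows ANSI set) and shows what each arrangement makes of it.

```python
# One stored byte, three code pages, three different characters.
raw = bytes([191])  # position 191 in a 256-character font

as_windows = raw.decode("cp1252")     # Windows ANSI (Western) arrangement
as_mac = raw.decode("mac_roman")      # standard Macintosh arrangement
as_central = raw.decode("cp1250")     # Windows Central European code page

for label, ch in [("Windows ANSI", as_windows),
                  ("Macintosh", as_mac),
                  ("Central European", as_central)]:
    print(f"{label:18} position 191 -> {ch!r} (U+{ord(ch):04X})")

# The same character sits at a different position in the Macintosh set.
mac_position = "\u00bf".encode("mac_roman")[0]
print("Macintosh position of the upside-down question mark:", mac_position)
```

A file typed on one system and opened on another without conversion therefore shows the wrong character wherever an upper-order position is used, which is the incompatibility described above.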
The Arrival of Unicode
In , two projects were undertaken to develop a standard that would cover all the writing systems commonly used in the world. One was begun by the International Standards Organization (ISO) and the other by the Unicode Working Group, a private organization supported by hardware and software manufacturers. The two groups soon realized that it would be a waste of time and effort to develop two such standards and so they agreed to work together. Today the Unicode Standard, sponsored by the Unicode Consortium, Inc., and ISO/IEC-, the standard promulgated by the ISO, cover the same characters in the same order. The two standards are developed together and any characters added to one are also added to the other. The differences between the two are technical and of concern only to software developers; in this document, I will refer to the Unicode Standard for convenience.3 Those who want more information about the history and development of the two standards should refer to Graham.
Unicode contains blocks of characters for every commonly used script, including Latin, Greek, Hebrew, Arabic, and various Indic scripts as well as the ideographs used in Chinese, Japanese, and Korean; it can handle about , characters plus others through an extension mechanism. Unicode does make an effort to support the historical scripts and historical characters that are needed by scholars, but scholarly use also has its own problems that are not addressed in Unicode and about which more will be said below. Fonts designed under the Unicode standard normally contain a subset of the full Unicode repertoire, since including all the , characters found in the current version of Unicode makes for a very large font which consumes significant computer resources. Unicode also reserves a block called the Private Use Area (PUA) which will never have characters assigned to it by the Unicode Standard. Instead, this area is available to users for their own needs, and more will be said about this below. Unicode characters are identified by the letters “U+” followed by a hexadecimal (base 16) number; for example, the character Œ in Unicode is U+0152, LATIN CAPITAL LIGATURE OE (official Unicode names may contain only capital letters, hyphens, and spaces; they may be printed in small capitals when mixed with running text, as in this paragraph).
3 Since the Unicode Consortium is a private company, it is sometimes preferable or necessary to make reference to the ISO standard when dealing with certain government agencies or other groups that prefer to work with publicly defined standards.
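The "U+" notation and the official character names can be looked up programmatically. The following minimal sketch uses Python's standard `unicodedata` module, which is not discussed in this book and is brought in here purely for illustration; the helper name `describe` is hypothetical.

```python
import unicodedata

def describe(ch):
    """Return the U+ notation (hexadecimal) and official name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

oe = describe("\u0152")  # the ligature OE
print(oe)

# Codepoints in the Private Use Area carry the general category "Co"
# (private use) and have no official name assigned by the standard.
pua_category = unicodedata.category("\ue000")
print(pua_category)
```

The four hexadecimal digits after "U+" are simply the character's position written in base 16, so U+0152 is position 338 in decimal.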
Windows and allow applications to use Unicode, but do not provide direct support through the operating system. Microsoft Word was the first widely used piece of consumer software that could take advantage of Unicode fonts, albeit in a way that was awkward for users, and some people began using Greek Unicode texts soon after it was released. We now have Unicode support built directly into Windows NT, Windows and Mac OS . and . More and more applications are being released that can handle Unicode, and there is no question that Unicode is the way of the future. The official reference to Unicode is The Unicode Standard, Version , but I recommend that users who are new to the subject start with Graham, which is designed as an introduction to the subject and which also includes some information that is not appropriate to be included with the official standard. If you want access to the complete Unicode character list but don’t have access to the printed standard, you can download the data from the Unicode web site
The Unicode Model: Characters not Glyphs
A fundamental principle of Unicode is to encode characters not glyphs. The Latin letter “a” is a character which may take many different shapes: a, a, a, a, and so forth. The various shapes or outlines of “a” are referred to as glyphs, while the underlying abstraction (the Platonic Idea of “a”, so to speak) is the character. Unicode could not separately encode an italic “a,” an uncial “a,” a script “a,” and so forth; there aren’t enough codepoints and doing so would make things much more complicated than they should be. It is understood that the shape of glyphs will vary from one font to another, and Unicode does not prescribe this sort of thing.
This model does pose some problems for scholarly users, however. Take the case of the symbol for the sestertius, a Roman coin, which is not yet included in Unicode. Most often this appears as Š, but is also found as the letters HS, as II with a horizontal bar, as IS with a horizontal bar or even S with a horizontal bar. If we were to propose adding the sestertius to Unicode and the proposal were accepted (as it might well be, since it certainly is a character that classicists, epigraphers, and numismatists use) it would be added as one codepoint. That is, the character “sestertius” might be added, but five separate glyphs would not be. But scholars usually wish to preserve this kind of information when they prepare editions. There are some solutions, such as using the Private Use Area, about which more will be said below, but the fact remains that sometimes the Unicode character-not-glyph model is not ideal for scholars although it works well enough in everyday use with modern languages.
The Issue of Precomposed Characters
The original intention in Unicode was to encode any base character followed by diacritic(s) as a sequence of separate characters (e.g., the letter e followed by a macron rather than as one combined unit consisting of e with a macron over it). A combination of base character plus diacritic is referred to as a precomposed character, while the sequence of two separate characters is referred to as decomposed. Diacritics that are designed to be placed over a base character are called combining marks. See Table 1, page 10, for a sample of the combining marks found in Unicode. Avoiding precomposed characters dramatically reduces the number of codepoints that need to be used. (A codepoint is a slot, identified by number, into which a Unicode character may be assigned.) As the standard was developed, however, a large number of precomposed characters were included, mainly to ensure that text could be converted correctly from various existing national standards into Unicode and back again into the original encoding. In addition to the standard Windows character set, whose arrangement is followed by Unicode for the first 256 codepoints, there are three blocks of Latin characters: Latin Extended-A, Extended-B, and Extended Additional. Characters of interest to classicists are scattered throughout these blocks.
The presence of the precomposed characters has implications for scholarly use, as we shall see below. Their presence also may obscure the true design and intentions of Unicode for someone who is just learning about it. When I first saw a list of all the characters in Unicode, my reaction was “Wow! Look at all those characters!” Soon I saw, however, that while a great many precomposed combinations are defined, many are not. For example, a classicist might want to put a dot under a letter to show a doubtful reading. Unicode contains precomposed forms for letters plus underdot, as shown in Table 6, but not for the other seven. (It does have, of course, a combining underdot.) Another example is the case of Lithuanian, which requires a few letters with diacritics that are not provided as precomposed combinations in Unicode .. Some Lithuanian users are upset about this, particularly when they learn that it is very unlikely that any more precomposed forms will be added to Unicode. The response from the Unicode powers is that Unicode does support Lithuanian, since all the necessary diacritics are included as combining marks; Lithuanian users just need applications that will take a sequence of base character + diacritic and display it properly. Unfortunately, there are no such applications for Western scripts at the present time, which may make someone who needs such combinations in her or his work feel that such needs are being ignored. If you try to use the combining diacritics in a Unicode font with today’s word processors, you will find that the diacritics usually don’t line up properly over or under the base character. This is a great nuisance to adjust manually, even if your word processor allows this kind of typographic control (see Chapter ). OpenType will provide a solution, as explained in Chapter 13.
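The difference between precomposed and decomposed forms, and the fact that only some combinations exist in precomposed form, can be checked directly. The sketch below relies on Python's standard `unicodedata.normalize` function and the Unicode normalization forms NFC and NFD; these are tools brought in for illustration, not something this book describes. It composes e + macron, which has a precomposed form, and c + underdot, which does not.

```python
import unicodedata

# Decomposed sequences: a base letter followed by a combining mark.
e_macron = "e\u0304"     # e + COMBINING MACRON
c_underdot = "c\u0323"   # c + COMBINING DOT BELOW

# NFC replaces a sequence with a precomposed character where one exists.
composed_e = unicodedata.normalize("NFC", e_macron)
composed_c = unicodedata.normalize("NFC", c_underdot)

print(len(e_macron), "->", len(composed_e))      # a precomposed e-macron exists
print(len(c_underdot), "->", len(composed_c))    # no precomposed c-underdot exists

# NFD goes the other way: precomposed character back to base + mark.
decomposed = unicodedata.normalize("NFD", "\u0113")
print([f"U+{ord(ch):04X}" for ch in decomposed])
```

Where no precomposed form exists, the text stays as two characters and correct display depends entirely on the application and font positioning the mark.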
These remarks about the limitations of combining characters apply to situations where a user wishes to view a document in a form appropriate for reading, as in a word processor or on a web page. There is a different situation where the use of combining marks is highly appropriate and has some important advantages: the storage of texts in machine-readable form, such as the databases of the Thesaurus Linguae Graecae and the Packard Humanities Institute. It is easier to perform operations such as searching and sorting if the data make use of combining marks which come after the base character. Both Beta Code and Unicode prescribe this method of storing characters. For such data to be viewed on a web page, for instance, we need software that will replace the vowel+combining mark combinations with the equivalent precomposed Unicode characters. Such software is not yet readily available in web browsers or word processors, although the specialized programs that are used to access the TLG or PHI databases do perform this function. The Perseus Project offers viewers an option of “Unicode” display but they simply replace the Beta diacritics with the corresponding Unicode combining characters, with the result that accents are not properly positioned and, if two accents come over a single vowel, they sometimes blot each other out.
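One reason searching is easier on decomposed data is that the marks can simply be skipped. The following is a minimal sketch of a search that ignores diacritics, assuming Python's standard `unicodedata` module; the helper names `strip_marks` and `find_ignoring_marks` are hypothetical and do not come from the TLG or PHI software.

```python
import unicodedata

def strip_marks(text):
    """Decompose the text, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def find_ignoring_marks(needle, haystack):
    """Position of needle in the mark-free haystack, or -1 if absent."""
    return strip_marks(haystack).find(strip_marks(needle))

line = "arma virumque can\u014d"          # final o carries a macron
position = find_ignoring_marks("cano", line)
print(position)

greek_plain = strip_marks("\u03bc\u1fc6\u03bd\u03b9\u03bd")   # the word with its circumflex
print(greek_plain)
```

Note that the position returned refers to the text with marks removed; a real application would need to map it back to the original string.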
The case of polytonic Greek is also interesting in this regard. The first version of Unicode contained only basic letters and combining accent marks. Unicode ., partly as a result of being combined with the ISO standard, added precomposed forms for most (but not quite all) the combinations of letters and accents/breathings. As soon as Word , which supported Unicode fonts, was released, classicists began to use Greek Unicode text since they were eager for an international standard to replace the various systems that were then available. And so the precomposed Greek combinations, which in the Unicode view existed only for compatibility with existing standards and were not intended for regular use in the future, are now commonly employed for polytonic Greek. This is not due to any flaw in the design of Unicode; rather, software that supports combining characters has become available much more slowly than was anticipated. (Anyone interested in using Unicode Greek should be sure to read the Unicode section of Chapter 7, which contains some important information about various issues connected with Greek. The standard has been modified over the years and some questionable characters included; this information, hard to find elsewhere, is clearly set out starting on page 41.)
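The Greek situation can be seen in miniature with the same normalization tools. This sketch again assumes Python's standard `unicodedata` module. It composes alpha + smooth breathing + acute into a single precomposed character, and then shows one combination (alpha with macron and acute) for which no single precomposed character is available.

```python
import unicodedata

# alpha + COMBINING COMMA ABOVE (smooth breathing) + COMBINING ACUTE ACCENT
decomposed = "\u03b1\u0313\u0301"
precomposed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), "->", len(precomposed))
print(f"U+{ord(precomposed):04X}", unicodedata.name(precomposed))

# alpha + COMBINING MACRON + COMBINING ACUTE ACCENT: the macron composes,
# but the acute has to remain a separate combining mark.
long_accented = unicodedata.normalize("NFC", "\u03b1\u0304\u0301")
print([f"U+{ord(ch):04X}" for ch in long_accented])
```

Text that mixes precomposed characters with leftover combining marks in this way is exactly where the display problems described earlier tend to appear.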
Printing Issues with Unicode
One caveat is in order here. You may encounter a printer driver that can’t handle fonts with more than 256 characters. This happened to me with a DeskJet C, which substituted a lower-numbered character when I tried to print using a Unicode font that contained a large character set. In such a case, you may be able to get an updated driver from the manufacturer’s web site, or you may have to use a different printer. Some printers also let you control how TrueType fonts are handled. [add more details on this?] In Windows, choose File/Print and click on the Properties button for the printer. Adjusting these settings will sometimes solve printing problems.
Conclusion
Unicode provides the most internationally recognized and standardized way to include more than 256 characters in a font. This is clearly beneficial to those who mix languages in their documents, who need to use a wide variety of characters and diacritics even in one language, or who wish to exchange documents with other users without running into incompatible arrangements of characters. For scholars, Greek is an obvious beneficiary. However, I would encourage Latinists and medievalists to become acquainted with the characters of interest to them that are found in Unicode, of which there are quite a number (see Tables , and for complete lists) and to consider seriously the benefits of such standardization. More Unicode fonts with these characters will become available as greater numbers of users ask for them.
Table 1. Selected Combining Diacritical Marks in Unicode.
GLYPH   UNICODE   CHARACTER NAME
◌̀       U+0300    COMBINING GRAVE ACCENT
◌́       U+0301    COMBINING ACUTE ACCENT
◌̂       U+0302    COMBINING CIRCUMFLEX ACCENT
◌̃       U+0303    COMBINING TILDE
◌̄       U+0304    COMBINING MACRON
◌̅       U+0305    COMBINING OVERLINE
◌̆       U+0306    COMBINING BREVE
◌̇       U+0307    COMBINING DOT ABOVE
◌̈       U+0308    COMBINING DIAERESIS
◌̉       U+0309    COMBINING HOOK ABOVE
◌̊       U+030A    COMBINING RING ABOVE
◌̋       U+030B    COMBINING DOUBLE ACUTE
◌̌       U+030C    COMBINING CARON
◌̓       U+0313    COMBINING COMMA ABOVE (= smooth breathing)
◌̔       U+0314    COMBINING REVERSED COMMA ABOVE (= rough)
◌̣       U+0323    COMBINING DOT BELOW
◌̥       U+0325    COMBINING RING BELOW
◌̦       U+0326    COMBINING COMMA BELOW
◌̧       U+0327    COMBINING CEDILLA
◌̨       U+0328    COMBINING OGONEK
◌̯       U+032F    COMBINING INVERTED BREVE BELOW
◌̵       U+0335    COMBINING SHORT STROKE OVERLAY
◌̶       U+0336    COMBINING LONG STROKE OVERLAY
◌͂       U+0342    COMBINING GREEK PERISPOMENI
◌ͅ       U+0345    COMBINING GREEK YPOGEGRAMMENI
Note: the Combining Marks block of Unicode is found at U+0300–U+036F. The characters in this table are meant to give a sense of the types of things included; those not relevant to classical languages have been omitted. By convention, combining marks in Unicode charts are printed over a broken circle ◌ which represents whatever character the mark would go over.
2. Keyboard Entry and Other Useful Information
Beta Code
Because of the limited character repertoire supported by standard fonts, various individuals, organizations, and software companies have created their own methods of handling scholarly text. The most important one for classicists is Beta Code, which was created by the Thesaurus Linguae Graecae (TLG) to enable the storage of classical texts in machine-readable form. Today Beta Code is used for the Greek texts on the TLG CD-ROM and the Latin texts available from the Packard Humanities Institute (PHI). In addition, some users employ Beta Code as a convenient way of representing Greek using Latin script (see Verbrugghe). It should be clearly understood that Beta Code is designed to store texts, not to represent them properly on screen or paper. For example, a Greek alpha with a smooth breathing and an acute accent is represented as three separate characters [give example in Beta] in the data file. Specialized software and fonts are required to read the text and present the diacritics in their proper position over the base letter. Information about such software can be obtained from
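What such software does can be suggested by a very small sketch. It assumes the usual Beta Code conventions for lowercase Greek (a Latin letter for each Greek letter, with `)` for smooth breathing, `(` for rough breathing, `/` for acute, `\` for grave, `=` for circumflex, `|` for iota subscript, and `+` for diaeresis, each written after the base letter). The function name `beta_to_unicode` is hypothetical, and the sketch deliberately ignores capitals, final sigma, punctuation, and everything else a real converter must handle.

```python
import unicodedata

# Latin letter -> Greek letter (lowercase only, simplified).
LETTERS = dict(zip("abgdezhqiklmncoprstufxyw",
                   "αβγδεζηθικλμνξοπρστυφχψω"))

# Beta Code diacritic -> Unicode combining mark.
MARKS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex (perispomeni)
    "|": "\u0345",   # iota subscript
    "+": "\u0308",   # diaeresis
}

def beta_to_unicode(beta):
    """Convert a simplified Beta Code string to precomposed Unicode Greek."""
    pieces = [LETTERS.get(ch) or MARKS.get(ch) or ch for ch in beta.lower()]
    # First build base letter + combining marks, then let NFC pick the
    # precomposed character wherever Unicode provides one.
    return unicodedata.normalize("NFC", "".join(pieces))

alpha = beta_to_unicode("a)/")
word = beta_to_unicode("mh=nin")
print(alpha, word)
```

The three stored characters `a)/` become one displayed character, which is the conversion step that general-purpose word processors and browsers do not perform on their own.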
Entering Multilingual Characters from the Keyboard
In any discussion of multilingual computing, the issue of keyboard entry of characters comes up sooner or later. This section presents the ways that one can access the multilingual characters found in standard 256-character Mac and Windows fonts (entry of Unicode characters is discussed separately in Chapter 4, page 25). Issues specific to classical language characters will be discussed below, building on the principles outlined here. If you already know the various ways to access non-English characters, you can skip this section.
The Mac OS from its early years has had better support for multilingual characters than Windows. The Mac OS provides access to all the upper-order characters through the OPTION key. To get an umlaut/diaeresis, type OPTION-u (see note 4), let go, and type a vowel; likewise OPTION-e for acute, OPTION-i for circumflex, OPTION-` for grave, and OPTION-~ for tilde, all followed by an appropriate vowel (or n with tilde). These keystrokes work in all applications and never conflict with anything. Appendix 1 (page 76) shows all the standard Mac characters with their keystrokes. Mac OS includes an extended Roman keyboard which allows entry of many of the precomposed combinations found in Unicode, again with the OPTION key. You must have installed a Language Pack before this keyboard will be available. [confirm this] You can also use third-party utilities such as Günther Blaschek’s well known PopChar to check out and enter characters.
4 The hyphen is a sign that the first key is to be held down while the second one is pressed; do not type a hyphen.
To install an additional keyboard on a Mac, from the Apple menu open the Control Panels, then the Keyboard control panel. You will see a list of all the keyboards available to your system; pick the one you want and then close the Keyboard control panel. If you have received a third-party (non-Apple) keyboard you must drag it into the System folder before you can access it through the control panel, and the Mac will ask your permission to install it into the System file.
Windows, by contrast, provides two system-wide methods for accessing non-English characters, both of them clumsy. You can turn on the NUM LOCK key, hold down the ALT key, and type, on the numeric keypad at the right of the keyboard, the four-digit ANSI number of any character. For example, to get an upside-down question mark in Spanish, you would type ALT-0191 (after checking to make sure the NUM LOCK key is depressed). You must use the numeric keypad, not the regular numerals on the top row of the keyboard. You have to keep a chart of ANSI numbers handy, of course. You can also use the Character Map applet that comes with Windows, but since you have to leave your application, copy the character you want, and then return to your application and paste it in, this is a slow business.
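If no chart of ANSI numbers is at hand, the four-digit number can be worked out from the character itself. The sketch below assumes that Python's `cp1252` codec matches the Windows ANSI character set; the function name `alt_code` is hypothetical.

```python
def alt_code(ch):
    """Return the four-digit ALT sequence for one character of the ANSI set."""
    position = ch.encode("cp1252")[0]   # position of the character, 0-255
    return f"ALT-{position:04d}"        # pad to four digits with leading zeros

question = alt_code("\u00bf")   # upside-down question mark
e_acute = alt_code("\u00e9")    # e with acute accent
print(question, e_acute)
```

A character outside the ANSI set (a Greek letter, for instance) raises an encoding error here, which mirrors the fact that the ALT method cannot reach it in a standard font.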
Because Windows itself provides such poor support for entering upper-order characters, some applications have provided their own methods. Microsoft Word, for example, uses the CONTROL key to access accented characters; CTRL-: followed by a vowel gets you an umlaut/diaeresis, CTRL-‘ produces an acute, and CTRL-^ a circumflex. Unfortunately these keystrokes do not work across all the Microsoft Office applications. Word also provides an Insert/Symbol dialog box from which you can insert any character in a font, even if the character does not have a keyboard shortcut. This is fine for rarely used characters but too slow for things you use all the time. See the screen shot on page 25 for an illustration of Insert/Symbol.
If your application does not provide its own keystrokes for non-English characters, or if you are looking for a set of keystrokes that will work in all your Windows applications, an excellent alternative is to install the US-International keyboard. This keyboard provides easy access to a great many multilingual characters and some symbols. To install it, activate the Start menu, choose Settings, open the Control Panel, and double-click the Keyboard item. Choose the Language tab at the upper left of the dialog box; if you are an American, you will see English as the first language. Click the Properties button and you will be able to change the keyboard layout associated with English. Choose US-International and click OK; you will probably need to have your Windows / CD-ROM in the drive so that Windows can copy the keyboard file. The US-International keyboard does have one minor drawback: if you type an opening quotation mark followed by a vowel, you will get an umlauted vowel. The solution is to hit the spacebar after typing the opening quotation mark. Some of the characters available via this keyboard are shown in the screenshot on page 14 below.
And, finally, if you work in one particular language extensively (I’ll use German as an example) and learned at some point to type on a German typewriter, you can install a German keyboard. Follow the procedure given in the previous paragraph, but click on Add instead of Properties and select German. Be sure the German keyboard is associated with the language and you will then find the accented characters in their usual position on a German keyboard. The same procedure works for any language supported by Windows; be sure that the Switch languages and Enable indicator on taskbar features are checked. This way you can see what language is in effect at any time and can easily switch from one to another.
Figure 1. Adding languages and keyboard layouts in Windows.
Many of the Microsoft keyboards make use of the AltGr (Alternate Graphic) key to obtain various accents and international characters. This is the right ALT on most keyboards; if you have only one ALT key, you can get the same results by holding down ALT and CTRL together. The screenshot below shows how the AltGr key is used in the US-International keyboard.
A very helpful utility is the Microsoft Visual Keyboard, designed to work with Office and available as a free download from Microsoft’s Office website
Figure 2. US-International Keyboard.
The US-International keyboard as shown in Microsoft Visual Keyboard. The AltGr key has been pressed, showing some of the international characters available. The white key is a deadkey for entering the acute accent.
The information in the preceding paragraphs applies to standard Windows fonts. If you are dealing with a font that has been modified so that some characters are replaced with others for a specific language, you will need to consult the documentation that came with the font to find out how to enter the characters. Let’s say that all the vowels with grave accents in a font have been replaced by vowels with brevia. You would use whatever keystrokes you normally use to get grave accents to obtain the vowels with brevia. Any font with non-standard characters must include a chart showing what’s been replaced with what; some fonts include macros or special keyboard drivers, but most simply let you type the special characters with the keystrokes for their equivalents in a standard font. Word’s Insert/Symbol dialog box shows the keystrokes that will enter a highlighted character at the bottom right of the box.
Finally, it should be noted that there are a variety of third-party utilities that you can use to customize your keyboard if you can’t find anything available that suits your needs. The best one I have found for Windows is Tavultesoft Keyboard Manager (Keyman), available as shareware from
Part II. The Present
γ g g ג
3. Latin
Overview
Scholarly editions of standard classical Latin texts (Oxford, Loeb, Budé, Teubner) usually require no special characters beyond those in standard computer fonts. Beginners’ textbooks require, at a minimum, vowels with macra. The macron indicates a lengthened vowel; the distinction between long and short vowels is essential to the phonetics of classical Latin, although it was indicated in writing only sporadically.5 Textbooks and reference books may also employ brevia, stress marks (sometimes in combination with a breve or macron), and sometimes tildes to represent nasal vowels (e.g., incolã for incola [acc.] written without the final -m). Ancient Romans sometimes wrote a horizontal line over numerals to distinguish them more readily from the surrounding Latin words. Modern editors place dots under letters to indicate uncertainty about the text and employ the obelus or dagger (†) to mark corrupt passages. Letters accidentally omitted by a scribe are marked with angle brackets 〈a〉; if a font does not contain true angle brackets, it may be more attractive to use the single guillemets ‹a› rather than the greater-than and less-than signs for this purpose. In addition, editions of a few classical authors require specialized signs, mainly for various units of measurement and currency. About such signs are found in the Latin authors on the CD-ROM produced by the Packard Humanities Institute (PHI), which contains all the extant texts of classical Latin authors. Additional characters are needed when dealing with epigraphy and metrics; these are discussed separately in Chapter below.
Medieval and Renaissance Latin texts present additional requirements, chief among them the large number of abbreviations found in these texts. There are over , abbreviations in the Brepols database of medieval Latin texts, and it is unlikely that there ever will be (or should be) a font with all of them. A number of them are very common, however, and were used in early printed books as well as in manuscripts; these need to be available to scholars and librarians. [add ISO standard for libraries here] Of particular note is the very common use of a horizontal stroke over a letter to denote the omission of letters, most often m. (This is not to be confused with the macron over long vowels.) There are also symbols used in liturgical texts, such as the versicle and response signs ™ š.
None of the characters mentioned above except the dagger has been readily available on either Windows or Mac systems. The standard Mac character set includes both the macron and the breve, and the Windows set has the macron. However, it is very awkward to make use of these with standard word processors6 and so specialized solutions have been required, although that is somewhat changed now that Unicode support is widely available.
5 See below in the Epigraphy section for more information about how the ancient Romans marked vowel length. During the 18th and early 19th centuries, a circumflex accent was used to mark a vowel as long, and this usage is still occasionally encountered in Latin as well as in Welsh and other languages.
6 The procedure is given below, page 33, for those who are interested.
How-to -character Fonts
Since there has never been a standardized solution to the needs of Latinists, various ad hoc methods have been employed, particularly by teachers who are eager to prepare materials with macra for their students. Under both Windows and the Mac OS, only the diaeresis/umlaut appears over all six vowels in standard fonts (and can therefore be entered from the keyboard). Some teachers have simply used umlauts to stand in place of macra. Those who are willing to spend the time and effort have modified their own fonts, replacing the umlauted vowels with macra. (TypeTool from FontLab
For those who want macra and brevia, perhaps in combination with a stress mark, the first commercial solution was the TransRoman package ($. and up) from Linguists Software; information available at
The CL Fonts package, developed by the present author with the support of the Classical Association of the Empire State, is specifically designed to meet the needs of Latin teachers. It provides macra, brevia and stress marks for regular text printing, plus common metrical signs and a small selection of epigraphical and medieval characters. The regular weight face can be downloaded from the CAES homepage
The Old English Font Pack from Dr. Peter Baker, although designed for scholars of Anglo-Saxon, contains about medieval Latin abbreviations as well as vowels with macra and brevia. [confirm number] Download it from
A larger selection of characters that will be of particular interest to medievalists or Renaissance scholars is contained in the font Garamond from Tiro Typeworks. These fonts are exquisitely crafted, although somewhat expensive for those who don’t use them frequently. Visit Tiro’s homepage at
Unicode Solutions
Unicode . contains all vowels with macra and brevia, except Y/y with breve, in precomposed form as well as a combining macron and combining breve which can (if your software cooperates; at the moment, none does) be placed over any character. Anyone with a Unicode-friendly word processor7 can take advantage of these precomposed characters; this may be a useful solution for someone who needs only macra and brevia, since you do not need to purchase any additional software beyond the Times New Roman and Arial fonts that ship with Windows . See Interlude , page 25, for information about using Unicode characters in Word. Unicode . also contains a number of other characters that are of interest to Latinists. See Table (page 23) for a complete list of these. Table in the epigraphy chapter (page 50) contains vowels with underdots and other characters that may be useful to those who are preparing critical editions.
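The relationship between a precomposed character and its combining-mark equivalent can be seen with a few lines of Python, using the standard unicodedata module. This is only an illustrative sketch; the helper names compose and decompose are made up here, and the result reflects the Unicode data bundled with your Python, not what any word processor can display.

```python
import unicodedata

COMBINING_MACRON = "\u0304"
COMBINING_BREVE = "\u0306"


def compose(text):
    """Merge base letters and combining marks into precomposed characters (NFC)."""
    return unicodedata.normalize("NFC", text)


def decompose(text):
    """Split precomposed characters into base letter plus combining marks (NFD)."""
    return unicodedata.normalize("NFD", text)


# "a" followed by a combining macron is equivalent to the single character U+0101.
a_macron = compose("a" + COMBINING_MACRON)
print(a_macron, f"U+{ord(a_macron):04X}", unicodedata.name(a_macron))

# Going the other way, U+014F (o with breve) splits into "o" plus U+0306.
print([f"U+{ord(c):04X}" for c in decompose("\u014F")])
```

Software that normalizes text in this way treats the two spellings as the same character, which is why either form may turn up in files you receive.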
At the moment, there are two Unicode fonts that may be useful for Latinists since they contain more characters than those found in the Times New Roman and Arial typefaces. The Titus Project from the University of Frankfurt, whose main focus is getting ancient texts on the Web, has produced a Unicode font that they will make available for non-commercial use upon request; see the Titus home page at
You can also download a Beta version of the Cardo font from
7 At the moment, this means Microsoft Word; version 8.4 and later of WordPerfect will open Unicode files, but WordPerfect provides no way for users to enter Unicode text easily.
Sample 1. Printing with CL Fonts
The following is a sample of the characters included in the CL Fonts package developed by David Perry and the Classical Association of the Empire State specifically for Latinists. Samples are at 12 point.
Vowels with macra and brevia Ä Ë Ï Ö Ü Ÿ ä ë ï ö ü ÿ Â Ê Î Ô Û Ñ â ê î ô û ñ À È Ì Ò Ù Þ à è ì ò ù þ Å ¼ Ð Ø ½ ß ã å ð õ ÷ ø
Sample of connected prose Clärörum virörum facta mörësque posterïs trädere, antïquitus üsitätum, në nostrïs quidem tempori- bus quamquam incüriösa suörum aetäs omïsit, quotiëns magna aliqua ac nöbilis virtüs vïcit ac supergressa est vitium parvïs magnïsque cïvitätibus commüne, ignörantiam rëctï et invidiam. —Tacitus, De vita Iulii Agricolae liber i.1–2
Word Stress pátrës, étiam, diffícilis, pópulus, púerum, cýathus c‚dere, cƒsa, „gepae, ob…diëns, cˆque cantâre, vidêre, dormîre, senätôrem, virtûtem, papñrus cãntant, tåneö, inð tium, õrior, h÷milis, cøathus
Roman numerals with lines ¡ ¢£¤¥¬−
Inscriptions H¾ C·SITVS·EST·C·I¿LIVS·BARÓ·QV¾ ·V¾ XIT·ANN·¤£ doubtful letters: ˜ a ˜ b ˜ c ˜ p ‹a› ‹m› and abbreviations: º Pº P, º Tº R denarius ‰ and sestertius Š
Nasal vowels & dieresis (lower-case only) fëminª a, mïlitª e, part« , cª osul, virª u, ª y; ¨ a, ¨ e, © , ¨ o, ¨ u, ¨ y
Ligatures Æ, æ , Ç, ç, Œ, œ
Medieval & religious symbols ™ , š, Ã and Õ = et, bar above any letter for abbreviations: » q, » n, » c, » m
Poetry metrical schemes: × × | ¯ µ µ | ¯ µ ¯ µ | ¯ × (hendecasyllabics) ¯ ² / ¯ ² / ¯ ¦ ² / ¯ ² / ¯ µ µ / ¯ × ¯ ² / ¯ ² / ¯ ¦ ¯ µ µ / ¯ µ µ / × (elegiac couplet) vowel quantities and syllabic quantities: ¯ µ µ / ¯ µ µ / ¯ ¦ ¯/ ¯ ¯ / ¯ µ µ / ¯ ¯ Arma virumque canö, Troiae quï prïmus ab örïs ¯ µ / ¯ ¯ /¯ ¦ µ µ /¯ µ ¯ /¯ Fürï¹et Aurëlï, comitës Catullï syllabic quantities only: änte¹ömnësquè Lèlëx, ànìmö mätürùs et » aevö synizesis: ant® ehäc plus ± and ³ and marks for ictus: ´ ¸
Standard publishing characters em- and en-dashes: —, – bullet: • dagger and double dagger: † ‡ section and paragraph markers: § ¶ curly quotes: “She said, ‘You shouldn’t stand out here in the cold!’ but he ignored her.” single guillemets: ‹ ›
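Text typed for a font of this kind can be moved to Unicode mechanically. The sketch below assumes, on the evidence of the connected prose sample above, that the font shows vowels with diaeresis as vowels with macra; the table UMLAUT_TO_MACRON and the function name are invented for this illustration and are not part of the CL Fonts package. The other rows of the vowel sample would need similar tables, which are not attempted here.

```python
# Assumed mapping: each vowel with diaeresis stands for the same vowel with macron.
UMLAUT_TO_MACRON = str.maketrans("ÄËÏÖÜŸäëïöüÿ", "ĀĒĪŌŪȲāēīōūȳ")


def to_unicode_macrons(text):
    """Replace diaeresis vowels with the corresponding Unicode macron vowels."""
    return text.translate(UMLAUT_TO_MACRON)


print(to_unicode_macrons("Clärörum virörum facta mörësque posterïs trädere"))
```

Note that a blanket replacement like this also changes any genuine diaeresis in the text, so the result should be proofread.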
Issues with Unicode
Unicode encodes the Roman numerals I–XII plus L, D, C, and M in both upper- and lowercase forms (U+2160–U+217F). These were included for compatibility with some East Asian standards and I do not see any point in using them in classical texts. The only advantage might be that one could search for numerals without the possibility of finding regular Latin letters; however, since Roman numerals are normally printed separated from surrounding words by spaces, they are easy to find even when regular letters are used to represent them. Furthermore, not all Unicode fonts will contain these characters, and they only handle a very small fraction of the possible Roman numerals; so I would not bother with them. (These comments of course do not apply to the Unicode characters for the Roman numerals , and ,, which are tabulated in Table 6 in the Epigraphy section below.)
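If you receive a text that does use these compatibility numerals, they can be turned back into ordinary letters. The following Python sketch relies on the compatibility mappings in the standard unicodedata module; the function name plain_numeral is hypothetical.

```python
import unicodedata


def plain_numeral(text):
    """Replace Roman-numeral characters in U+2160-U+217F with ordinary letters.

    Only characters in that range are touched; everything else is left alone.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if 0x2160 <= ord(ch) <= 0x217F else ch
        for ch in text
    )


# Show the name of a few of these characters and the letters they map to.
for cp in (0x2160, 0x216B, 0x216C, 0x217F):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.name(ch), "->", plain_numeral(ch))
```

After this conversion the numerals can be found with an ordinary search, which is the approach recommended above.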
The main difficulty with Unicode, as far as Latinists are concerned, is that a number of useful precomposed combinations are not included, and there is very little chance of their being added.8 Y/y + breve is probably the most important of these. Unicode includes all letters with an underdot except C/c, F/f, G/g, J/j, P/p, Q/q and X/x; there is also a combining underdot, which at the moment is not useful with normal software. Finally, Unicode includes only two instances of the combination of macron plus acute accent (E/e and O/o) and only one of breve plus acute (A/a, U+1EAE and U+1EAF, included for Vietnamese); these combinations are sometimes needed to show word stress. See Chapter , page 33, for ways to deal with this. The other problem with Unicode at the present time is that there is considerable inconsistency in which characters are included. Font designers tend to omit characters that they think will not be used, which includes many characters that scholars are interested in. This situation will improve with time.
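The gaps described above can be checked against the Unicode data itself. In this Python sketch a combination counts as "precomposed" if normalization (NFC) can reduce the base letter plus its marks to a single character; has_precomposed is a name invented for the illustration, and the answer depends on the Unicode data shipped with your Python rather than on any font.

```python
import string
import unicodedata

DOT_BELOW = "\u0323"
MACRON = "\u0304"
BREVE = "\u0306"
ACUTE = "\u0301"


def has_precomposed(base, marks):
    """True if base + marks composes to one single code point under NFC."""
    return len(unicodedata.normalize("NFC", base + marks)) == 1


# Lowercase letters that have no precomposed form with an underdot.
missing_underdot = "".join(
    c for c in string.ascii_lowercase if not has_precomposed(c, DOT_BELOW)
)
print("no underdot form:", missing_underdot)

# Vowels that do have a precomposed macron-plus-acute form.
macron_acute = "".join(v for v in "aeiouy" if has_precomposed(v, MACRON + ACUTE))
print("macron + acute:", macron_acute)

print("y + breve:", has_precomposed("y", BREVE))
```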
8 The decision not to allow any additional precomposed combinations after Version 3.0 of Unicode was made because of the increasing reliance on Unicode by web applications and other software that change decomposed into precomposed forms in order to display text properly; such applications will have to be constantly updated if additional precomposed combinations continue to be added.
Table 2. Unicode Characters for Classical and Medieval Latin
See also Table 6, page 50, for epigraphic characters.
GLYPH  UNICODE  CHARACTER NAME
Ā  U+0100  LATIN CAPITAL LETTER A WITH MACRON
ā  U+0101  LATIN SMALL LETTER A WITH MACRON
Ă  U+0102  LATIN CAPITAL LETTER A WITH BREVE
ă  U+0103  LATIN SMALL LETTER A WITH BREVE
Ē  U+0112  LATIN CAPITAL LETTER E WITH MACRON
ē  U+0113  LATIN SMALL LETTER E WITH MACRON
Ĕ  U+0114  LATIN CAPITAL LETTER E WITH BREVE
ĕ  U+0115  LATIN SMALL LETTER E WITH BREVE
Ī  U+012A  LATIN CAPITAL LETTER I WITH MACRON
ī  U+012B  LATIN SMALL LETTER I WITH MACRON
Ĭ  U+012C  LATIN CAPITAL LETTER I WITH BREVE
ĭ  U+012D  LATIN SMALL LETTER I WITH BREVE
Ō  U+014C  LATIN CAPITAL LETTER O WITH MACRON
ō  U+014D  LATIN SMALL LETTER O WITH MACRON
Ŏ  U+014E  LATIN CAPITAL LETTER O WITH BREVE
ŏ  U+014F  LATIN SMALL LETTER O WITH BREVE
Ū  U+016A  LATIN CAPITAL LETTER U WITH MACRON
ū  U+016B  LATIN SMALL LETTER U WITH MACRON
Ŭ  U+016C  LATIN CAPITAL LETTER U WITH BREVE
ŭ  U+016D  LATIN SMALL LETTER U WITH BREVE
Ȳ  U+0232  LATIN CAPITAL LETTER Y WITH MACRON [3.0]
ȳ  U+0233  LATIN SMALL LETTER Y WITH MACRON [3.0]
Æ  U+00C6  LATIN CAPITAL LETTER AE
æ  U+00E6  LATIN SMALL LETTER AE
Œ  U+0152  LATIN CAPITAL LIGATURE OE
œ  U+0153  LATIN SMALL LIGATURE OE
ſ  U+017F  LATIN SMALL LETTER LONG S
⁊  U+204A  TIRONIAN SIGN ET
†  U+2020  DAGGER
‡  U+2021  DOUBLE DAGGER
§  U+00A7  SECTION SIGN
¶  U+00B6  PILCROW SIGN
⁋  U+204B  REVERSED PILCROW SIGN
℟  U+211F  RESPONSE
℣  U+2123  VERSICLE
※  U+203B  REFERENCE MARK
⁁  U+2041  CARET INSERTION POINT
⁂  U+2042  ASTERISM
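A table of this kind is easy to mistype, and the entries can be checked against the Unicode character database. The Python sketch below uses the standard unicodedata module; the rows are a hand-typed sample for illustration, and the function name mismatches is hypothetical.

```python
import unicodedata

# A few (code point, expected name) pairs typed in by hand for illustration.
SAMPLE_ROWS = [
    ("0100", "LATIN CAPITAL LETTER A WITH MACRON"),
    ("0103", "LATIN SMALL LETTER A WITH BREVE"),
    ("012C", "LATIN CAPITAL LETTER I WITH BREVE"),
    ("0232", "LATIN CAPITAL LETTER Y WITH MACRON"),
    ("204A", "TIRONIAN SIGN ET"),
    ("2123", "VERSICLE"),
]


def mismatches(rows):
    """Return (code, expected, actual) for every row whose name does not match."""
    bad = []
    for hex_code, expected in rows:
        actual = unicodedata.name(chr(int(hex_code, 16)), "<no name>")
        if actual != expected:
            bad.append((hex_code, expected, actual))
    return bad


print(mismatches(SAMPLE_ROWS))  # an empty list means every row agrees

# A deliberately wrong code point is reported together with its real name.
print(mismatches([("021C", "LATIN CAPITAL LETTER I WITH BREVE")]))
```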
4. Interlude: Using Unicode Characters with Microsoft Word
This section tells how to enter a character from a Unicode font using Word’s built-in facilities. These are fine for occasionally inserting characters to supplement those found on standard keyboards; however, if you work regularly in Greek or another non-Latin script, you will normally need to get a specially programmed solution, such as Antioch or MultiKey (see the Greek section for information); MultiKey is the only package I know of at the moment that provides keyboard support for entering Latin Unicode characters. See also the discussion of UniPad in the Resources section (page 72).
In Word, put your text into a Unicode font, then choose Insert/Symbol from the menu. In the Subset pulldown menu at the upper right of the dialog box, choose Latin Extended-A and you will see an A-macron as the first character in that group. Click the character you want, then click Insert, and the character will be added to your document. See the screen shot below, where the macron and breve characters have been highlighted in yellow.
Figure 3. Word’s Insert/Symbol dialog box.
You can use the scroll bar at the right to view all the characters in the font. However, if you know the Unicode range where the character is located, using the pulldown list provides a faster way to get to the character you want. If your font is a PostScript (Type ) font, not a TrueType font, the Insert/Symbol dialog displays a peculiar behavior: if you just choose Insert/Symbol, all you see are blank spaces. If you highlight a character in your document to which the Type font has been applied and then choose Insert/Symbol, you will see the characters. This is true for Word and ATM .; whether it is true for all versions of Word and ATM I am not certain.
There is another way to get a Unicode character. Open a Word dialog such as Edit/Find or File/Open. Enter the four-digit Unicode value in hexadecimal and then type ALT-x. The four digits will be replaced by the corresponding Unicode character; if that character doesn’t exist in your system font (the default font that Windows uses for dialog boxes and such), an empty rectangle will be displayed instead. Highlight the character or rectangle and copy it to the clipboard (CTRL-C), hit ESCAPE to close the dialog, and then paste (CTRL-V) the character at the appropriate spot in your document. Follow this procedure even if you see a rectangle; the character will display properly once it is pasted into a document formatted in a font that does contain it. This technique is useful on those occasions when you know a character’s Unicode value but just can’t find it in the small chart that appears when you choose Insert/Symbol.
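What ALT-x does is simply a conversion between a hexadecimal number and a character. The same conversion, in both directions, can be sketched in Python; this is an illustration of the arithmetic, not of anything inside Word, and the two function names are made up here.

```python
def char_from_hex(hex_value):
    """Return the character for a Unicode value written in hexadecimal."""
    return chr(int(hex_value, 16))


def hex_from_char(ch):
    """Return the four-digit (or longer) hexadecimal Unicode value of a character."""
    return f"{ord(ch):04X}"


print(char_from_hex("0101"))   # a with macron
print(hex_from_char("ŏ"))      # o with breve
```

The second function is useful for identifying a character that you cannot find in a chart.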
If you receive a Unicode-based document that contains characters which do not exist in any font you have (and so are represented by the generic “missing character” character, usually a small rectangle), you can identify them by following the reverse of the procedure given in the previous paragraph: highlight and copy the unknown character, select Edit/Find, and paste it into the “Find what” field. Then type SHIFT-ALT-x and the unknown character will be replaced by its hexadecimal value which you can look up in a Unicode chart. You can also visit
A very useful set of Word macros to help with Unicode questions is available from [need url]. These macros allow you to identify unknown Unicode characters, insert characters if you know their number, search for Unicode characters, and other things. You can also create your own macros to insert Unicode characters, if you are comfortable with Word’s macro language.
It is also possible to use Word’s AutoCorrect function to enter characters. AutoCorrect was designed to correct common typing errors; for example, “htis” is automatically changed to “this.” It can also be used as a shorthand method of entering text, however. Choose Insert/Symbol, highlight the character you want and then click the AutoCorrect button in the lower left; the character you chose will automatically appear in the AutoCorrect dialog box, and you can enter whatever key combination you wish to use for the selected character. The limitation with using AutoCorrect this way is that Word treats each entry as a separate word; that is, the AutoCorrect function is not activated until you enter a space, a period, a colon, or a semicolon. So this doesn’t work well for entering a single character in the middle of a word. Ralph Hancock has developed a set of AutoCorrect entries for common Greek words which is included with the Antioch package (see the Greek section for information).
You can use the Find command on Word’s Edit menu to look for a Unicode character. Type the character’s decimal number (not hexadecimal as in most Unicode charts9) preceded by ^u, for example ^u947 to find γ (lower case Greek gamma, U+03B3). This does not work with Replace.
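Since Unicode charts give hexadecimal values and the Find box wants decimal, a small conversion helps. In this Python sketch the function names are hypothetical; lower case Greek gamma, U+03B3, comes out as 947 in decimal.

```python
def find_code_from_hex(hex_value):
    """Build a ^u search code from a hexadecimal Unicode value."""
    return "^u" + str(int(hex_value, 16))


def find_code_from_char(ch):
    """Build a ^u search code directly from a character."""
    return "^u" + str(ord(ch))


print(find_code_from_hex("03B3"))
print(find_code_from_char("ā"))
```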
Remember that not all Unicode-based fonts contain all the characters defined in the Unicode Standard; there is no way to tell which characters a font contains (unless some documentation comes with the font) except by looking at the characters in Word’s Insert/Symbol dialog or in a separate font viewing utility.10 The Times New Roman and Arial that ship with Windows do contain most of the Unicode characters for western languages. Note that Unicode . (the version that preceded the current .) did not include macra on Y/y. Most available Unicode fonts, as of this writing, still have not caught up to version .; you can see in the screen shot above that Y-macron and y-macron are missing.11 This also affects documents you share with other users. The person you send a document to needs to have a Unicode font on his/her system with the macron or breve characters.
For more about keyboard entry of characters, see Chapter .
9 The Windows Calculator accessory can convert between number systems for you; start Calculator and choose View/Scientific, click on Hex, type the hex value and click Decimal.
10 Microsoft’s Font Properties Extension 2.1, which can be downloaded from
5. Germanic
Overview
Scholars who work in Old English and Old Norse have some special needs that are not met in standard fonts. Old English requires at a minimum the letters eth, thorn, and ash which are part of the standard Windows character set (since they are still used in Icelandic), but the first two are not in the regular Mac set. In addition, macra, brevia, and several other diacritics are sometimes needed as well as a number of specialized characters such as yogh and wynn. Unicode now provides most of the characters that are needed for Germanic languages and they are shown in Table below along with their Unicode values. Those characters that are not found in Unicode are listed on page 30.
How-to -character Fonts
There are a few good solutions for Old English and Old Norse. Edlund by Carl Anderson is a Mac font with a great many characters for northern European languages; supplementary fonts with true small capitals and insular letterforms are also available. It is free for scholars and can be downloaded from the Edlund Project’s page
A Windows font for the Gothic language is available from Dr. Berlin’s Foreign Font Archives
Several Runic fonts are downloadable from Coron’s Sources of Fonts
Unicode Solutions
Table below lists all the Germanic characters currently in Unicode. Vowels with macra and brevia are tabulated in the Latin section, page 20. “[.]” indicates that the character was added in the recent version . of Unicode; many existing fonts do not have these characters. Unicode . also added a block devoted to Runic letters, U+16A0–U+16F0. For ways to use the Germanic characters that are in Unicode with currently available software, see the discussion in the previous chapter, page 25. And, finally, we should note that the Gothic script has been accepted by Unicode for inclusion in version ., although it has not yet received final approval from ISO. Gothic will be located in Plane ( characters; 10330–1034B) and the characters will be accessed through the use of surrogate pairs (for an explanation of this, see below [crossreference]).
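For readers curious about what a surrogate pair is: a character beyond U+FFFF is stored in 16-bit software as two special 16-bit values. The following Python sketch shows the standard UTF-16 arithmetic for the first Gothic code point mentioned above; the function name surrogate_pair is made up for the illustration.

```python
def surrogate_pair(code_point):
    """Return the (high, low) UTF-16 surrogate values for a code point above U+FFFF."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use surrogate pairs")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)     # top ten bits of the offset
    low = 0xDC00 + (offset & 0x3FF)    # bottom ten bits of the offset
    return high, low


high, low = surrogate_pair(0x10330)
print(f"U+10330 -> {high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder.
print(chr(0x10330).encode("utf-16-be").hex().upper())
```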
The following characters are not included in Unicode .: Ã, , ¬ = thæt, and the precomposed combinations /, ·/ª, ›/¢, ⁄/¡, «, /, ḗ ṓ , and r.̥ There are also no metrical signs such as . See Interlude , page 33, for some ways to get these combinations. need to check Edlund Mac docs; some chars didn’t translate properly to Windows font, I think
The only Unicode font for medievalists that I am aware of is Junicode from Peter Baker, downloadable from
As mentioned above, Unicode now contains a block devoted to runes. I have not yet seen a font designed around this block; until one is published, users will have to continue with the -character fonts now available and discussed on the previous page. Using the Unicode Runic block will provide considerable advantages in terms of standardization and interchangeability, and so I hope that we will soon see some Unicode runic fonts. However, anyone contemplating developing such a font must read the statements in The Unicode Standard, Version (pages –), particularly the remarks regarding the shape of the glyphs used in the Unicode chart; simply looking at the charts does not provide all the necessary information.
A block of medieval superscript letter diacritics has been proposed for Unicode 3.2. Although used mainly in Germanic manuscripts, they are found sometimes in other languages and occasionally appeared as late as the 19th century. If given final approval, these letters will be added to the combining diacritical marks range; they are shown in magenta in the chart below.
Table 3: Medieval Germanic Characters in Unicode 3.0 (see also Table 2 for vowels with macra and brevia)
GLYPH  UNICODE  CHARACTER NAME
Æ  U+00C6  LATIN CAPITAL LETTER AE
æ  U+00E6  LATIN SMALL LETTER AE
Ǽ  U+01FC  LATIN CAPITAL LETTER AE WITH ACUTE
ǽ  U+01FD  LATIN SMALL LETTER AE WITH ACUTE
Ǣ  U+01E2  LATIN CAPITAL LETTER AE WITH MACRON
ǣ  U+01E3  LATIN SMALL LETTER AE WITH MACRON
Ą  U+0104  LATIN CAPITAL LETTER A WITH OGONEK
ą  U+0105  LATIN SMALL LETTER A WITH OGONEK
ƀ  U+0180  LATIN SMALL LETTER B WITH STROKE (Old Saxon)
Ċ  U+010A  LATIN CAPITAL LETTER C WITH DOT ABOVE
ċ  U+010B  LATIN SMALL LETTER C WITH DOT ABOVE
đ  U+0111  LATIN SMALL LETTER D WITH STROKE
Ę  U+0118  LATIN CAPITAL LETTER E WITH OGONEK
ę  U+0119  LATIN SMALL LETTER E WITH OGONEK
Ġ  U+0120  LATIN CAPITAL LETTER G WITH DOT ABOVE
ġ  U+0121  LATIN SMALL LETTER G WITH DOT ABOVE
ħ  U+0127  LATIN SMALL LETTER H WITH STROKE
Ƕ  U+01F6  LATIN CAPITAL LETTER HWAIR [3.0] (Gothic transcription)
ƕ  U+0195  LATIN SMALL LETTER HV
ĸ  U+0138  LATIN SMALL LETTER KRA
Ŋ  U+014A  LATIN CAPITAL LETTER ENG
ŋ  U+014B  LATIN SMALL LETTER ENG
Ʀ  U+01A6  LATIN LETTER YR (Old Norse)
ʀ  U+0280  LATIN LETTER SMALL CAPITAL R (used for lowercase ýr)
Œ  U+0152  LATIN CAPITAL LIGATURE OE
œ  U+0153  LATIN SMALL LIGATURE OE
Ǿ  U+01FE  LATIN CAPITAL LETTER O WITH STROKE AND ACUTE
ǿ  U+01FF  LATIN SMALL LETTER O WITH STROKE AND ACUTE
Ǫ  U+01EA  LATIN CAPITAL LETTER O WITH OGONEK (Old Icelandic)
ǫ  U+01EB  LATIN SMALL LETTER O WITH OGONEK
Ǭ  U+01EC  LATIN CAPITAL LETTER O WITH OGONEK AND MACRON (Old Icelandic)
ǭ  U+01ED  LATIN SMALL LETTER O WITH OGONEK AND MACRON
ſ  U+017F  LATIN SMALL LETTER LONG S
Ð  U+00D0  LATIN CAPITAL LETTER ETH
ð  U+00F0  LATIN SMALL LETTER ETH
Þ  U+00DE  LATIN CAPITAL LETTER THORN
þ  U+00FE  LATIN SMALL LETTER THORN
Ƿ  U+01F7  LATIN CAPITAL LETTER WYNN [3.0]
ƿ  U+01BF  LATIN [SMALL] LETTER WYNN
Ȝ  U+021C  LATIN CAPITAL LETTER YOGH [3.0]
ȝ  U+021D  LATIN SMALL LETTER YOGH [3.0]
Ƶ  U+01B5  LATIN CAPITAL LETTER Z WITH STROKE
ƶ  U+01B6  LATIN SMALL LETTER Z WITH STROKE
Ȥ  U+0224  LATIN CAPITAL LETTER Z WITH HOOK [3.0] (Middle High German)
ȥ  U+0225  LATIN SMALL LETTER Z WITH HOOK [3.0]
Ʒ  U+01B7  LATIN CAPITAL LETTER EZH
ʒ  U+0292  LATIN SMALL LETTER EZH
χ  U+03C7  GREEK SMALL LETTER CHI
⁋  U+204B  REVERSED PILCROW SIGN [3.0]
⁊  U+204A  TIRONIAN SIGN ET [3.0]
◌  U+0363  COMBINING LATIN SMALL LETTER A
◌  U+0364  COMBINING LATIN SMALL LETTER E
◌  U+0365  COMBINING LATIN SMALL LETTER I
◌  U+0366  COMBINING LATIN SMALL LETTER O
◌  U+0367  COMBINING LATIN SMALL LETTER U
◌  U+0368  COMBINING LATIN SMALL LETTER C
◌  U+0369  COMBINING LATIN SMALL LETTER D
◌  U+036A  COMBINING LATIN SMALL LETTER H
◌  U+036B  COMBINING LATIN SMALL LETTER M
◌  U+036C  COMBINING LATIN SMALL LETTER R
◌  U+036D  COMBINING LATIN SMALL LETTER T
◌  U+036E  COMBINING LATIN SMALL LETTER V
◌  U+036F  COMBINING LATIN SMALL LETTER X
Note about ezh and yogh: in Unicode . these characters were unified at one codepoint, but they were disunified and new codepoints added for Yogh/yogh in version .. This was done in recognition of the fact that the two are historically distinct characters.
6. Interlude: What if I really need characters that aren’t in my font?
If you’ve tried without success to locate a font whose design is suitable for your project and which contains the characters you need, you have two choices.
Adding or modifying characters
The first option is to create the missing characters with a font editor. Fontographer, now published by Macromedia
Mention should also be made of the Font Creator Program by Erwin Denissen of High-Logic in the Netherlands. This is a shareware font editor which, while not offering all the bells and whistles of the more expensive products, may do what you need for simple modifications such as adding accents. And, since it is shareware, you can try it before you purchase. See the High-Logic page at
If you have a special font that you want to convert from Mac to Windows format or vice versa, FontLab makes TransType, a conversion utility for just this purpose. You can also open the font with Fontographer and export it as a file for the other platform.
It is not difficult to learn to use Fontographer or TypeTool to add accents onto existing letters since you can copy and paste the accents onto existing character outlines. To go beyond this, however (i.e., to create an actual letter or symbol that is missing from the font), you will need to acquire some serious type design skills. You may be better off having this kind of work done for you. There are a number of small type companies that will do custom work at reasonable prices; the big companies such as Monotype will also do custom work, but the prices may be prohibitive for scholars. Some font developers are much more knowledgeable about non-English characters than others, which may affect the quality of the work they do for you. Also, custom-made or modified fonts may not look as good on screen as products from the large companies such as Monotype. This is because the large companies can do special ultra-sophisticated hinting which improves the appearance of characters on a monitor; the software required for such super-hinting is beyond the reach of the smaller companies. This does not affect printed output.
The books by Cavanaugh and Moye are excellent resources on digital type design for those who want to learn to create or modify fonts.
A note about legalities: most font makers allow you to modify for your own use, or to have modified for you, a font for which you have purchased the appropriate license. A few require the person doing the modification to purchase a copy in addition to the end user, while a few do not allow any modification. So check with the vendor in case of doubt. However, in no case are you legally able to distribute a modified font to others who have not purchased a license for the original. Font development is a painstaking, time-consuming task and developers deserve to be compensated for use of their work, unless they have chosen to make it freely available; some have done so, and we owe them great thanks for their efforts. If a font or keyboard utility is distributed as shareware, you do have an obligation to pay the registration fee if you use the product to do any significant work.
If you are modifying a 256-character font, you will have to decide which characters you will replace with your customized ones, since no more than 256 characters can be accessed at once. If you are working with a Unicode font, you can place your customized characters in the Private Use Area (U+E000–U+F8FF) if there are no codepoints already assigned to them. It is definitely, absolutely not a good idea to put any characters in the codepoints that the Unicode Standard marks as “reserved” and which are shown in the charts with diagonal lines (see the Greek charts on pages 43–45 for examples of this); use the Private Use Area for such characters. In the future, OpenType fonts may make it possible to use the combining diacritics that Unicode provides; see Chapter for more discussion of this.
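Before assigning a custom character to a codepoint, it can help to confirm whether that codepoint is in the Private Use Area or is merely reserved. The following is a minimal sketch in Python, assuming the standard unicodedata module; the function name codepoint_status is invented for this illustration. It relies on the general categories Co (private use) and Cn (unassigned, which covers the reserved slots shown with diagonal lines in the charts).

```python
import unicodedata

def codepoint_status(cp):
    """Classify a codepoint as 'private use', 'unassigned', or 'assigned'."""
    category = unicodedata.category(chr(cp))
    if category == "Co":
        return "private use"   # safe home for customized characters
    if category == "Cn":
        return "unassigned"    # reserved: do not put characters here
    return "assigned"

print(codepoint_status(0xE000))  # first codepoint of the Private Use Area
print(codepoint_status(0x03A2))  # reserved slot in the Greek block
print(codepoint_status(0x03B1))  # GREEK SMALL LETTER ALPHA
```

The answer for a given codepoint reflects the Unicode version built into the Python installation, so a slot reported as unassigned today may be filled in a later version of the standard. That is one more reason to stay out of the reserved areas.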
Using diacritics that are built into a font
If you wish to use your special characters in a database, spreadsheet, or on the web, the customized font route is the only way to go. However, if you are preparing a text for publication in print, and if all you need are combinations of base letters plus accents, you may be able to get the results you want by using a program that allows you to control exactly how characters are placed next to each other. Sophisticated word processors do this to some extent, although you get much greater control with a high-end page layout program such as PageMaker, QuarkXPress, or Adobe InDesign.12
12 The power of page layout programs comes at a price, however. First, unless you have access to one through your institution, you will have to purchase the software, and these programs are expensive. Second, you will have to learn to use the program; this requires a significant investment of time if you want to get professional-looking results. You may wish to employ a desktop publishing service that can import your word-processor file into a page layout program. In either case, check early on to be sure that the font you plan to use has the diacritics you need.
If the font you are using contains (let’s say) an underdot and a macron, you could get an o with an underdot and macron as follows: type the o and then use whatever facilities your program provides to insert special characters to add the underdot and macron. You will then have to move the two diacritics into proper position, usually by kerning each one with a negative value. This process is tedious to carry out many times, so you can use a character that won’t occur in your document (let’s say o-circumflex) when you are initially typing the material. Then create the first instance of the base character + diacritic combination, copy it to the clipboard, and then replace all the o-circumflexes with it.
Here’s an illustration. I typed an o and then used Word’s Insert/Symbol command to add the breve and the acute accent: o˘´. I then highlighted the o and used Word’s Format/Font/CharacterSpacing feature to move the breve back over the o; Word calls this “condensed” character spacing and it should be applied to the first character of the pair. Finally, I highlighted the breve and condensed its spacing in order to move the acute back; then highlighted the acute and moved it up (“raised” positioning, in Word’s terms) and the result looks like this: o˘´. Note that your cursor will appear to behave strangely in this scenario, and you may not be able to use the mouse to select things. You will have to use the arrow keys and hold down SHIFT to highlight the item you want, even if you can’t really see what you’re highlighting (it’s still there if you typed it and saw it earlier in the process). Sometimes the cursor may appear not to move when you run it over the combined letter+diacritic combination. You’ll get used to this.
You can see that this is not something you want to do very often, since it is quite fussy and awkward. You have to keep trying until you find just the right amount to move the accent back or up. And, of course, it only works if the font already contains the accents you need. But, for the occasional rare combination, it’s one possibility. (For example, if you have a Unicode font that is missing the seven precomposed combinations with underdot [see page 8], this would be one way to get the missing items.) It is clearly much better to have any precomposed combination that you will use regularly built into the font.
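Where software does support Unicode combining diacritics, an o with underdot and macron can instead be stored as a base letter followed by combining marks, leaving the positioning to the font and the rendering software. The following is a minimal sketch in Python, assuming the standard unicodedata module; it shows only how the text is stored, and whether the result looks acceptable still depends on the font.

```python
import unicodedata

# o + COMBINING DOT BELOW + COMBINING MACRON
O_DOT_MACRON = "o\u0323\u0304"

# NFC normalization replaces base + mark sequences with precomposed
# characters wherever Unicode defines one.
composed = unicodedata.normalize("NFC", O_DOT_MACRON)

for ch in composed:
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))
```

Unicode has a precomposed o with dot below but none that also carries the macron, so the macron remains a separate combining character after normalization. This is exactly the kind of rare combination for which a font must either position combining marks well or supply a custom glyph.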
Choosing a font
Scholars often have little choice about fonts if they need special characters; one has to take what’s available unless one is willing to customize a font for oneself or have the work done. However, if you do have a choice, either because you are creating your own special characters or because you are able to access the diacritics built into various commercial fonts, you should think about the following factors.
Fonts generally fall into serif and sans-serif categories. Times New Roman is a serif font since it has the little horizontal elements at the top and bottom of such letters as h and M, while Arial is a sans-serif font since it lacks such elements:
h M (serif) h M (sans serif)
Serif fonts are generally considered more legible for blocks of text such as are regularly found in scholarly books, so you will probably want to choose a serif font for the body text of your publication. It is also a very common practice to use a serif font for the body and a sans-serif font for headings.
You will also probably want to pick a font of a traditional design, such as Caslon, Garamond, Baskerville, or a modern one such as Palatino whose letterforms are closely related to the traditional faces. These have been used for centuries for books and so convey the sense of solidity and accuracy that one wants in a scholarly publication. There are a great many fonts that work fine in a flyer or advertisement but are not the best choice for scholarly work.
The remarks in the previous paragraph assume that you are preparing a print publication. If you are preparing a document that users will spend long periods of time reading on a computer screen, some additional factors come into play. Examples of such documents include pages to be posted on the web, CD-ROMs containing reference materials, or instructional software. Keep in mind the essential point that computer screens operate at a much lower resolution (less than dots per inch) than laser printers (normally or dpi, at most dpi) or imagesetters ( or dpi).
There are two related factors that determine the suitability of type for extended on-screen use: the design of the typeface and the quality of its digital implementation. Many traditional book faces such as Garamond and Caslon were designed with noticeable contrasts between thick and thin strokes as well as subtle curves. Such designs don’t look their best on a low-resolution screen. This is particularly so if the person who digitized the font didn’t know how (or didn’t bother) to create the best, most efficient outlines; unfortunately, this is the case with some of the cheap imitation fonts on the market. A font with a simpler design may stand up better at low resolutions, and some fonts on the market today have been designed specifically for digital use. But any font, whether a traditional book design or new digital face, will not look good on screen unless it has been quite carefully hinted. Such hinting is very time-consuming and many fonts have not been given such treatment. Some of the fonts distributed with Microsoft products (including Times New Roman, Arial, Verdana, and Georgia) do feature this high quality hinting but do not contain all the characters that scholars may need. Compare the appearance of these fonts with the ones you are considering for your project in order to establish a point of reference. In any event, before settling on a typeface for an on-screen document, be sure to test it with a variety of readers for an extended period.
For more on the important topic of choosing appropriate typefaces, see the excellent book by Bringhurst or one of the many available books on desktop publishing.
7. Greek
Overview
Greek grammarians in ancient Alexandria developed a system of three accent marks (acute, grave, and circumflex) to be placed over vowels as well as two breathing marks (rough and smooth) for vowels at the beginning of a word.13 In addition, the diaeresis, macron, breve, iota-subscript, and coronis (a contraction sign) are called for on occasion in scholarly work. Given all the possible combinations of these signs, traditional Greek typesetting is not a simple affair. In 1982 the Greek government decreed the end of this traditional polytonic system, which was no longer needed except for classical texts, and replaced it with a monotonic system. Monotonic Greek uses one accent mark, the tonos, similar in shape to the acute (but usually with a slightly more vertical orientation14) plus the diaeresis which may appear on iota or upsilon.
The Mac was the first computer of choice for Hellenists because it provided the ability to use non-Latin fonts long before Windows did, and GreekKeys on the Mac was the first widely used font package for classicists. Windows caught up with the introduction of TrueType fonts in Windows 3.1, and there are now several solutions available on this platform. However, there has never been an established standard for either the arrangement of characters in a font or for the entry of characters from the keyboard; Unicode now provides for the former, but the latter remains an issue. Both GreekKeys and WinGreek have been widely used and have to some extent constituted de facto standards. These older packages still work, but I strongly recommend that anyone starting out now with Greek text adopt a Unicode-based solution instead, since Unicode will certainly be the dominant standard in the future. In the interest of completeness, I will nevertheless provide information on the earlier packages. Sean Redmond has written a utility that will convert from earlier Greek layouts into Unicode; see
13 The purpose of the accents was to indicate rising or falling pitch of a syllable, while the rough breathing indicated an h sound and a smooth breathing its absence. The rules about accents are complex and may be obtained from any Greek grammar. For more about the pitch accent of ancient Greek, see Allen, p. 116 ff.
14 The Greek font used to print Version 2 of The Unicode Standard showed vowels with a tonos that was pointing straight up, 90° to the baseline, which is not the usual practice (and has been corrected in Version 3). Some font developers not well versed in Greek typography followed these samples, with the result that you may encounter fonts with such vertical tonoi.
An excellent source of information on Greek fonts and typesetting is the article by Yannis Haralambous titled From Unicode to Typography, A Case Study: The Greek Script (available online). Anyone with a serious interest in Greek typography should consult the wonderful book Greek Letters: From Tablets to Pixels which includes a chapter by Jeffrey Rusten on the history of Greek fonts on personal computers. A standard history of Greek typefaces is that of Scholderer.
How-to
256-character Fonts
The first widely used package for classicists was GreekKeys. Created by George B. Walsh and Jeffrey Rusten for the Macintosh, and for a while available in a Windows version, it became a kind of standard, particularly among Mac users. It is still available for the Mac; for up-to-date information, see the FAQ page at
Allotype Typographics offers some very nice looking Greek fonts, Kadmos and Bosporos, in their own layout and in GreekKeys layout; both Windows and Mac versions are available. See their home page at
The Summer Institute of Linguistics (SIL) at
SP Fonts, a collection of freeware fonts for Biblical studies from Jimmy Adair, are available in both Mac and Windows at
WinGreek, by Peter Gentry and Andrew Fountain, supports Greek, Coptic, and Hebrew; it provides fonts and a utility called Beta that enables the user to enter Greek characters and to type Hebrew from right to left. It was written for Windows 3.1 and is still available for use on that platform; it does not work under Windows 95 or later.
Son of WinGreek is an updated version of WinGreek designed to work on Windows 95/98; this is probably the most widely used Windows Greek package at the present time, although many users are now moving to Unicode. Son of WinGreek uses the same fonts and keyboard layout as WinGreek but has been updated in a variety of ways. Several other fonts are also available that follow the WinGreek arrangement. All the various WinGreek/Son of WinGreek materials can be found at
MultiKey supports both 256-character fonts and Unicode fonts under Windows; see below in the Unicode section for details.
Agfa Monotype
Unicode Solutions
Unicode has been a great benefit for users of polytonic Greek because it finally provides a standardized, internationally recognized method of storing and exchanging Greek text (although some issues remain unresolved; see below). There are currently several packages available that enable users to prepare Unicode Greek text. Each requires, of course, a Unicode-friendly word processor such as Word 97 or 2000.
It should be clearly understood that, while Unicode provides a standard for fonts, it does not address the issue of keyboard entry. For instance, one can install the modern Greek keyboard (see Part I) and be able to type the standard Greek letters and the single acute-style accent used in modern Greek, but there is no provision for entering the polytonic accents. Windows 95/98 keyboard drivers can access only 256 characters at a time and don’t deal well with Unicode, so specialized programming is still required for polytonic. This situation is improved in Windows 2000, which is Unicode-native (see below and also Chapter ).
As mentioned in Part I (page 9), the first version of Unicode supported only contemporary monotonic Greek along with combining polytonic accents (which no software can yet readily take advantage of). The second version added the precomposed combinations which made polytonic Unicode Greek usable with today’s word processors. As a result, Unicode contains two blocks of Greek characters: the Greek and Coptic (sometimes referred to as Basic Greek) and the Greek Extended areas. Users should be aware that some Unicode fonts that claim to support Greek contain only the Basic Greek characters and so will not be adequate for polytonic users. (Fonts mentioned in this section do contain the necessary Greek Extended characters.) Version 3.0 of Unicode added lowercase forms of the letters stigma, digamma, koppa, and sampi plus the kai symbol. The charts at the end of this chapter show the contents of the two blocks of Greek characters.
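One practical consequence of the two-block arrangement is that you can test mechanically whether a text needs a font with Greek Extended coverage. The following is a minimal sketch in Python; the block ranges are those of the Unicode charts (Greek and Coptic at U+0370–U+03FF, Greek Extended at U+1F00–U+1FFF), while the function name needs_greek_extended is invented for this illustration.

```python
GREEK_AND_COPTIC = range(0x0370, 0x0400)  # "Basic Greek"
GREEK_EXTENDED = range(0x1F00, 0x2000)    # precomposed polytonic forms

def needs_greek_extended(text):
    """True if any character of text lies in the Greek Extended block."""
    return any(ord(ch) in GREEK_EXTENDED for ch in text)

# "menin" with circumflex on eta (U+1FC6): polytonic, needs Greek Extended
print(needs_greek_extended("\u03BC\u1FC6\u03BD\u03B9\u03BD"))
# "menyma" with tonos on eta (U+03AE): monotonic, Basic Greek is enough
print(needs_greek_extended("\u03BC\u03AE\u03BD\u03C5\u03BC\u03B1"))
```

A check of this kind says nothing about whether a particular font actually contains the glyphs; it only tells you which block the text draws on.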
Antioch is a package by Ralph Hancock and Neil Bashoori that includes Unicode fonts and keyboard drivers for Windows 95/98 (no Mac version, unfortunately). It is designed to work with Word 97 or Word 2000 and installs a small command bar on the Word toolbar that enables one to switch quickly between English and Greek or Hebrew; the Greek portion also supports Coptic. For typing Greek letters, users have the choice of the WinGreek transliterating keyboard or a slightly modified standard modern Greek keyboard; accents and breathings are usually handled through the numeric keypad, although a version for laptop users that relies on the numerals found on the top row of the keyboard can be installed. A demo version can be downloaded from
Cardo, the font in which this book is set, contains the complete Unicode Greek character set plus many other characters useful to classicists and medievalists. I have also written a Greek keyboard using Tavultesoft’s Keyman; as soon as Keyman 5 is officially released, I will post it on my web page. See
Matthew Robinson has written a set of Word macros to enter Unicode Greek characters; see
You can download the Athena Roman Unicode font by Jeffrey Rusten from Sean Redmond’s web site
MultiKey is a shareware package that provides support for Greek (both Unicode and several 256-character fonts), Hebrew, and several other languages under Windows. Check out
The Titus Project’s font includes Greek as well as Latin characters; see page 19 for more details.
Production First Software has several multilingual typefaces which include Unicode Greek characters. See
Unitype offers Global Writer and Global Office. The former is a word processor specifically designed for multilingual applications, while the latter is an add-in for Microsoft Office that enables the user to enter text in a variety of languages easily. Both Greek (modern) and Ancient Greek are supported; fonts and keyboard drivers are included. See the Unitype homepage at
The Greek company Magenta
Windows 2000, like its predecessor Windows NT 4, was designed from the ground up around Unicode. As a result, it includes some features useful to multilingual users; in particular, its keyboard drivers can select any Unicode character (Windows 95/98 keyboards can only select from 256 characters at a time). Win2000 ships with a polytonic Greek keyboard and the excellent Palatino Linotype font (see a review of this font by Jeffrey Rusten at
The Win2000 polytonic Greek keyboard follows the standard (pre-monotonic) Greek typewriter layout with accents on the right side; as a result, it will be most useful to those who have acquired typing habits with this layout, or who work with Greek enough that they want to invest the time in learning this arrangement. The keystrokes are not mnemonic, and one must learn separate keystrokes for the various combinations (e.g., acute, acute with smooth, acute with rough, etc.). In my opinion, systems such as that found in the GreekKeys Mac keyboard or in Antioch, where the accents “accumulate” (that is, there is one key for a smooth breathing and one for an acute; if the two are typed in sequence, the software chooses the correct precomposed character), are easier to remember. However, if you have Win2000 you don’t have to purchase anything else, which is certainly an advantage.15 It must also be said that if one does invest the time to memorize the keystrokes of the Win2000 driver, typing may be faster because fewer keystrokes are needed than with the more mnemonic methods. This layout is shown in Appendix .
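The idea of accents that “accumulate” can be illustrated in a few lines of code. The following is only a sketch in Python and is not how GreekKeys or Antioch are actually implemented; the key assignments in ACCENT_KEYS and the function name accumulate are hypothetical, chosen for this example. Each accent key contributes one combining mark, and Unicode normalization (NFC, from the standard unicodedata module) then picks the precomposed character.

```python
import unicodedata

# Hypothetical key assignments, one key per diacritic.
ACCENT_KEYS = {
    ")": "\u0313",   # COMBINING COMMA ABOVE (smooth breathing, psili)
    "(": "\u0314",   # COMBINING REVERSED COMMA ABOVE (rough breathing, dasia)
    "/": "\u0301",   # COMBINING ACUTE ACCENT
    "\\": "\u0300",  # COMBINING GRAVE ACCENT
    "=": "\u0342",   # COMBINING GREEK PERISPOMENI (circumflex)
}

def accumulate(base, keys):
    """Attach the diacritics named by keys to base and compose the result."""
    marks = "".join(ACCENT_KEYS[k] for k in keys)
    return unicodedata.normalize("NFC", base + marks)

alpha = "\u03B1"
result = accumulate(alpha, ")/")   # smooth breathing, then acute
print("U+%04X %s" % (ord(result), unicodedata.name(result)))
```

With this approach the typist learns five accent keys rather than a separate keystroke for every combination, at the cost of an extra keystroke whenever two diacritics are needed.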
Issues with Unicode
There are still a few missing Greek letters in Unicode, including uppercase lunate sigma, alphabetic koppa Ϟ (the koppa glyphs added to Unicode 3.0 have the lightning-bolt shape ϟ used in Greece today as a numeral; these alphabetic koppa characters have been proposed for a forthcoming version of Unicode but not yet officially adopted), uppercase rho with smooth breathing, uppercase upsilon with smooth breathing and accents, uppercase yod (for all-caps typesetting) and the inverted iota and upsilon with circumflex sometimes used in modern Greek printing. In addition, combinations needed when transcribing inscriptions but not found in classical literary texts (e.g., epsilon and omega with circumflex) are not defined in Unicode nor are acrophonic numerals.
Unicode uses the modern Greek terms for accents. For reference, here is a list of the corresponding English terms (with stress marks to show correct pronunciation):
oxía = acute
varía = grave
perispoméni = circumflex
vrachý = breve
dialytiká = diaeresis
psilí = smooth (British, lenis)
dasía = rough (British, asper)
hypogegramméni = iota subscript
prosgegramméni = iota adscript [lowercase]
áno teleía = Greek colon
15 Should you upgrade from Windows 95/98 to Windows 2000? Almost all business-oriented applications that work with Windows 95/98 will also work on Win2000, but there are a few that won’t or that will have to be upgraded, so check the ones that you need before making this decision. However, many games and multimedia applications do not work under Win2000, which makes it less than ideal for some home users. One alternative, if you have the hard disk space, is to set up a dual-boot system so that you can run either Windows 2000 or an earlier version on the same machine. See Livingston and Brown, pages 31–40, or similar books for instructions on how to do this. A new consumer-oriented version of Windows is due out in late 2001 or early 2002. Unlike Windows 95/98/Me, it will be based on the same underlying computer code as Windows 2000 and will, presumably, offer direct Unicode support.
Unicode contains several precomposed forms consisting of a capital alpha, eta, or omega plus a lowercase iota adscript (e.g., U+1F88, GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI). The inclusion of such combinations is not absolutely necessary, since they can of course be represented by a capital followed by a normal iota; however, their presence makes it easier for programmers to adjust capitalization automatically. According to Haralambous (page ), printers in Greece sometimes print the iota subscript under a capital instead of adscript; this is regarded simply as an alternate method of printing such diphthongs. Version 2 of The Unicode Standard showed these characters with the iota subscript, to the surprise of many classicists, although version 3 shows them adscript and states (page ) that lowercase iota is “normally” written adscript next to a capital. You may encounter some Greek fonts, based on the Version 2 charts, with the capitals + iota subscript. Those who want both sets of glyphs in one font will have to use the Private Use Area for one or the other, or else use an OpenType font that can accommodate both sets of glyphs.
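The point about automatic capitalization can be seen in any programming environment that uses the Unicode case mappings. The following is a minimal sketch in Python, assuming only the built-in string methods and the standard unicodedata module; because the capital with prosgegrammeni is a single character, it maps to a single lowercase character with iota subscript.

```python
import unicodedata

capital = "\u1F88"       # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI
small = capital.lower()  # one character in, one character out

print("U+%04X %s" % (ord(capital), unicodedata.name(capital)))
print("U+%04X %s" % (ord(small), unicodedata.name(small)))
```

Had the same letter been stored as a capital alpha with smooth breathing followed by an ordinary iota, software would have no way to know that the two letters form one unit, which is the convenience the precomposed forms provide.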
Unicode provides separate precomposed combinations for vowels with the accent mark used in modern Greek as well as the acute accent found in polytonic texts (e.g., U+03AC, GREEK SMALL LETTER ALPHA WITH TONOS vs. U+1F71, GREEK SMALL LETTER ALPHA WITH OXIA). This makes it convenient to have one font that can be used for any Greek text, since the monotonic tonos usually has a more vertical shape than the acute accent. The word tonos, however, essentially refers to an acute accent, and some fonts, such as the Palatino Unicode font, do use the same glyph for both. Haralambous in fact argues that there is no need for any distinction between tonos and acute, but certainly the vowels with tonos are not going to be removed from Unicode at this point. You can see the distinction between tonos and oxia in the charts on the following pages.
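A practical caution follows from the existence of two codepoints for what is essentially the same accented vowel: a search for one will not find the other unless the software treats them as equivalent. The following is a minimal sketch in Python, assuming the standard unicodedata module and the character data of a current Unicode version, in which the oxia forms are defined as canonically equivalent to the tonos forms; it describes present-day library behavior, not anything in the word processors discussed in this chapter.

```python
import unicodedata

tonos = "\u03AC"  # GREEK SMALL LETTER ALPHA WITH TONOS
oxia = "\u1F71"   # GREEK SMALL LETTER ALPHA WITH OXIA

# As raw codepoints the two characters are different...
print(tonos == oxia)

# ...but NFC normalization replaces the oxia form with the tonos form.
print(unicodedata.normalize("NFC", oxia) == tonos)
```

If text passes through software that normalizes in this way, the oxia codepoints are silently replaced by their tonos counterparts, so a font that draws the two differently may show a change in appearance.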
Unicode encodes alternative forms for a few Greek letters: β/ϐ, θ/ϑ, φ/ϕ, κ/ϰ, π/ϖ, ρ/ϱ, and Y/ϒ (U+03D0–U+03D6 and U+03F0–U+03F2). Some of these were included for compatibility with earlier standards (ϐ and ϒ) and others to provide appropriate forms when the Greek letters are used as technical symbols. The “script” kappa and “curly” rho are simply stylistic alternates, as are the two forms of pi (the “closed” form ϖ is rarely used in American scholarly texts, but is frequently seen in Greece and in texts printed in France). I strongly suggest avoiding these alternate forms in Greek texts, since they may not sort, analyze, or display properly; use them only as they were intended, for technical symbols.
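If the alternate forms have already crept into a Greek text, they can be replaced with the ordinary letters mechanically. The following is a minimal sketch in Python, assuming the standard unicodedata module; the function name fold_symbol_forms is invented for this illustration. It uses the compatibility mappings that Unicode defines for these symbol characters and applies them only to the seven characters listed above, leaving everything else in the text untouched.

```python
import unicodedata

# Symbol forms of beta, theta, phi, kappa, pi, rho, and upsilon with hook.
SYMBOL_FORMS = "\u03D0\u03D1\u03D5\u03F0\u03D6\u03F1\u03D2"

def fold_symbol_forms(text):
    """Replace Greek symbol variants with the ordinary letters."""
    return "".join(
        unicodedata.normalize("NFKC", ch) if ch in SYMBOL_FORMS else ch
        for ch in text
    )

folded = fold_symbol_forms(SYMBOL_FORMS)
for before, after in zip(SYMBOL_FORMS, folded):
    print("U+%04X -> U+%04X %s" % (ord(before), ord(after), unicodedata.name(after)))
```

Applying the compatibility normalization to the whole text would be simpler but would also alter other characters, which is why the sketch restricts it to the listed symbol forms.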
(Discussion of alternate characters continues on page 41.)