
Word Processing in Classical Languages

Latin, Germanic, Greek

corue auis nimis nitida & splendida oque auis est tibi similis in pennis nisi solus cignusµ & super omnia places michi ymno si sol¯ cantu¯ tu¯ u audire posse¯, inter ceteras aues utique extollere¯ ¸

David J. Perry

DRAFT FOR COMMENT #2: NOT FINAL ii Word Processing in Classical Languages

[back of cover]


Word Processing in Classical Languages

Latin, Germanic, Greek

David J. Perry

Rye High School, Rye, New York


This Draft for Comment may be obtained from . Please send comments or corrections to .

This document is set up like a printed book; even-numbered pages should be on the right and odd-numbered pages on the left. If you print out the document before reading it, turn each even-numbered page over, print down, and back it up with the preceding odd-numbered page. Then punch for a three-ring binder or staple at the spine.

Body text of this book is set in Cardo, a font by David Perry; major heads are in Lithos and subheads in CG .

The Latin quotation on the cover is from a prose version of the version of the fable of the fox and the crow. These prose versions are found in the Wolfenbüttel manuscript of the fables attributed to ‘Walter of England’ where they were added to help students struggling with the verse originals. “ corve, avis nimis nitida et splendida, que avis est tibi similis in pennis nisi solus cignus? et super omnia placet michi ymno si solum cantum tuum audire possem, inter ceteras aves te utique extollerem.” It is set in the Beowulf font by Peter Baker.

This book refers to a number of company names and product names which are trademarks. These references are used in an editorial fashion to provide readers with information about the products mentioned, and no trademark infringement is intended. All trademarks are the property of their respective owners.

Copyright © by David J. Perry.

Information in this book is provided to help users find appropriate ways to prepare their documents. It is the responsibility of each user to evaluate any product mentioned to see whether it is suitable for his or her needs. In no event shall David J. Perry be liable for difficulties with or damage to any computer system caused by use of any product or procedure mentioned in this book.

Second draft for comment, printed 1/12/01 with various corrections and to the draft of August 2000.


Contents

List of Tables and Figures vi
Acknowledgements vi
Introduction 1

Part I. Font and Keyboard Basics
1. Fonts, Character Sets, and Unicode 5
2. Keyboard Entry and Other Useful Information 11

Part II. The Present
3. Latin 17
4. Interlude: Using Unicode Characters with Microsoft Word 25
5. Germanic 29
6. Interlude: What If I Need Characters That Aren’t in My Font? 33
7. Greek 37
8. 49
9. Metrics 53
10. Setting Type 55
11. Sharing Documents with Others 59

Part III. The Future
12. The Need for Standardization 65
13. OpenType 69

Part IV. Resources
Sources of Information 71
Works Cited 72

Appendices
Appendix 1. Macintosh character set 76
Appendix 2. Windows character set 77
Appendix 3. Windows 2000 Polytonic Greek Keyboard 78
Appendix 4. ISO Language Codes 83


List of Tables and Figures


Table 1. Selected combining diacritical marks in Unicode. 10
Table 2. Unicode characters for classical and medieval Latin. 23
Table 3. Medieval Germanic characters in Unicode. 31
Table 4. The block of Unicode. 43
Table 5. The block of Unicode. 44
Table 6. Epigraphic characters in Unicode. 50

Figure 1. Adding languages and keyboard layouts. 13
Figure 2. The US-International keyboard. 14
Figure 3. Word’s Insert/Symbol dialog box. 25
Figure 4. add new figure 4 (PDF screenshot) here p. 57

Sample 1. Printing with CL Fonts. 20

Acknowledgements

I am grateful to the following people for comments and suggestions: Rob Latousek. Any errors or infelicities which remain are mine.


Introduction

About This Book

Audience and Purpose

This book is intended for anyone who works with text in classical or medieval Latin, medieval Germanic languages, or polytonic Greek.1 This includes teachers at all levels who produce materials for students and authors of textbooks as well as those who prepare scholarly articles or editions. This material may also be of help to some people outside the academic professions, such as type designers, font manufacturers, and typesetters or desktop publishers who sometimes need to work with classical languages. Both Windows and the Macintosh (Mac OS) are covered.

The book is intended, first and foremost, to provide practical help for users in getting the characters that they need in their work. A secondary purpose is to educate academic users about some key concepts and issues concerning the use of type on computers. I have also provided some information about how to get non-English characters used in modern Western languages, partly because scholars frequently work in several languages and partly because some of the concepts apply to the information about classical languages.

Although little of this information is original with the present author, it has never been available in one place before. I hope that this compilation will be a convenient source of help for the community of classical language users. Some may be surprised to find medieval Germanic languages (Old English, Old Norse, etc.) treated together with Latin and Greek. You will find, however, that users of these languages need many of the same characters that are used in classical and medieval Latin, particularly vowels with macra and brevia. Hebrew could also be added here, since biblical scholars often use it along with Greek and some software publishers provide support for both languages. However, since I do not know Hebrew, I have not included much information about it.2

Origin of This Book

This book is the outgrowth of an interest dating back  years in the problems faced by classicists and others who need special characters. While developing the CL Fonts package for Latinists (described below on page 20) I became frustrated with the limitations of 256-character fonts and began to investigate Unicode. This research has convinced me that we need to educate ourselves about this technology and take advantage of it to solve some of the problems that we have faced over the years.

1 I use the term polytonic rather than classical Greek. Although many readers of this book will be classical scholars, the information will be of use to anyone who needs to represent a Greek text (ancient, Byzantine, or modern) that contains the various used prior to the promulgation of the monotonic system in .

2 The WinGreek, Son of WinGreek, Silver Mountain and Antioch packages discussed in the Greek section also support Hebrew.


I have no financial interest in any of the products mentioned here; comments are my own opinions based on my experience with the various products. I welcome corrections or information about additional products. Email me at .

This document contains many hyperlinks (printed in blue between ) to Internet sites. If you are viewing the document on screen, you can click on a hyperlink and your browser will open and take you to the site. All links were valid as of January .

Finding What You Need

Because some users of this book may be relatively unfamiliar with the issues discussed, while others may be highly sophisticated computer users, I have tried to make it easy to find the information you need and to skip material you do not need to deal with.

Part I of this book presents basic information about fonts used on computers, character sets, Unicode, and keyboard entry. Part II describes solutions that are available right now (January ) for users of Latin, Germanic languages, and Greek. The chapter for each language begins with an Overview that summarizes the characters needed for that language and provides any other necessary general information. Immediately after the Overview comes a How-To section that provides practical help for people who need to find out how to get macra, Greek characters, etc. in their documents. The How-To section is broken down into two parts: one that will apply to all users, discussing traditional -character fonts, and the other specific to users who have a Unicode-capable word processor such as Microsoft Word . If you do not understand the reasons for this distinction, you need to read the section below on fonts and Unicode.

Part III goes beyond the question of practical help with documents and presents some information about where we will be in the future with characters and fonts. This section will be of interest to anyone who is seriously interested in these issues and is important for anyone who plans to work with classical languages in the future. There is a pressing need for standards which should be of concern to anyone in the profession.

Full references for all the books and other items referred to in the body of this book will be found in the Works Cited section, while the Resources section provides additional places to go for information.


Part I. Font and Keyboard Basics

β b ב


1. Fonts, Character Sets, and Unicode

Note: Part I introduces some basic concepts and terms that will be used in the following pages. If you are already knowledgeable about computer character sets and have a general acquaintance with Unicode, you can skim this part, stopping to focus only on things that are new to you.

Font Formats

Personal computers mainly use two types of fonts. PostScript is a font format developed by Adobe Systems and built into some laser printers and all pc-based systems. It was the first widely used format that allowed the user to scale type to any size without losing quality. PostScript fonts are sometimes also referred to as Type 1 fonts. TrueType is a font format originally promoted by Apple and Microsoft as an alternative to PostScript; it is built into Windows ., Windows / and into the Mac OS from version .. on. Nowadays most Windows and Mac users rely primarily on TrueType fonts, although the publishing industry still employs PostScript extensively. In order to use PostScript fonts on a Windows or Mac computer, you need to install Adobe Type Manager (ATM), which can be downloaded for free from Adobe’s web site . (Exception: Windows  has built-in support for PostScript fonts and so does not require ATM.)

Microsoft and Adobe have recently created the OpenType format, supported in Windows  and Mac OS . OpenType allows font developers to make new fonts based on either TrueType or PostScript outlines, and it adds some features that are important for multilingual users; more will be said about it in Chapter 13. One may also occasionally encounter bitmapped fonts, which are designed to be used at a specific point size and cannot be enlarged or reduced without significant loss of quality. Bitmapped fonts are used mainly to display text on screen.

Character Sets and Code Pages

Many years ago, a standard for fonts used on a Windows PC or a Mac was established with 256 characters. The first 128 positions (numbered 0–127) contain the letters, numbers, and marks we see on the keyboard, plus a few others that are used internally by the computer. Characters with accent marks, non-English characters, and many additional symbols are found in the range 128–255 and are referred to as extended or upper-order characters. Windows uses the ANSI (American National Standards Institute) character set, and each character may be referred to by its ANSI number; for example, the upside-down question mark used in Spanish is ANSI 191. Apple Computer established its own arrangement for upper-order characters on the Macintosh; like the Windows set, it includes accented characters and other symbols, but not exactly the same ones as in Windows or

in the same order. See Appendices 1 and 2, pp. 76–77, for the standard arrangement of upper-order characters as found in the two operating systems.
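The difference between the two arrangements can be checked with a few lines of Python. This is only an illustrative sketch: it assumes that Python's `cp1252` and `mac_roman` codecs are acceptable stand-ins for the Windows ANSI set and the standard Macintosh set, and the helper name `position` is invented for the example.

```python
def position(ch: str, codec: str) -> int:
    """Return the position (0-255) of ch in a single-byte character set."""
    encoded = ch.encode(codec)
    if len(encoded) != 1:
        raise ValueError(f"{ch!r} does not occupy a single position in {codec}")
    return encoded[0]


# Lower-order characters agree; upper-order characters generally do not.
for ch in "A¿éÆ":
    windows = position(ch, "cp1252")     # assumed stand-in for Windows ANSI
    macintosh = position(ch, "mac_roman")  # assumed stand-in for the Mac set
    print(f"{ch}: Windows {windows}, Macintosh {macintosh}")
```

The same letter on screen can therefore be stored as a different number on each platform, which is why a file moved between the two systems without conversion may show the wrong upper-order characters.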

However, it is obvious that 256 character positions cannot contain even all the characters used in languages that employ the Latin alphabet, to say nothing of other writing systems such as Greek, Cyrillic, Hebrew, Arabic, or Japanese. Computer manufacturers created code pages to support Latin characters not in the standard Windows or Mac sets as well as non-Latin scripts. A code page is an arrangement of characters designed to support one or more languages and assigned to the 256 available positions in a standard font. For example, Windows has a Central European code page to support the diacritics needed in Polish, Czech, etc., as well as a Russian code page and many others. Code pages work well enough except that a user may encounter problems mixing languages in a document or exchanging files with a machine set up for a different code page. The advent of TrueType fonts helped this situation somewhat, since TrueType fonts can contain more than 256 characters, but the user could still only access 256 of them at a time.
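A short Python sketch may make the code page problem concrete. It assumes that the codecs `cp1252`, `cp1250` and `cp1251` correspond to the Western, Central European and Russian (Cyrillic) Windows code pages; the dictionary labels and the function name are made up for the illustration.

```python
# One stored byte, three different characters, depending on the code page
# the receiving machine assumes.
CODE_PAGES = {
    "Western": "cp1252",
    "Central European": "cp1250",
    "Cyrillic": "cp1251",
}


def interpretations(data: bytes) -> dict:
    """Decode the same bytes under each code page in CODE_PAGES."""
    return {label: data.decode(codec) for label, codec in CODE_PAGES.items()}


stored = bytes([0xE6])  # position 230 in a 256-character font
for label, text in interpretations(stored).items():
    print(f"{label}: {text}")
```

Nothing in the file itself records which code page was intended, so the reader's system has to guess or be told.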

Neither the Mac OS nor Windows has ever provided direct support for many of the scholarly characters needed in various languages, so a number of individuals and companies developed fonts that support Latin, Greek, and the medieval Germanic languages. Until quite recently, none could support more than 256 characters (which can be a problem for Greek). Since no standard exists, these fonts contain characters in different orders and are mutually incompatible; see discussion of this problem below.

The Arrival of Unicode In , two projects were undertaken to develop a standard that would cover all the writing systems commonly used in the world. One was begun by the International Standards Organization (ISO) and the other by the Unicode Working Group, a private organization supported by hardware and software manufacturers. The two groups soon realized that it would be a waste of time and effort to develop two such standards and so they agreed to work together. Today the Unicode Standard, spon- sored by the Unicode Consortium, Inc., and ISO/IEC-, the standard promulgated by the ISO, cover the same characters in the same order. The two standards are developed together and any char- acters added to one are also added to the other. The differences between the two are technical and of concern only to software developers; in this document, I will refer to the Unicode Standard for con- venience.3 Those who want more information about the history and development of the two stan- dards should refer to Graham.

Unicode contains blocks of characters for every commonly used script, including Latin, Greek, Hebrew, Arabic, and various Indic scripts as well as the ideographs used in Chinese, Japanese, and Korean; it can handle about , characters plus others through an extension mechanism. Unicode does make an effort to support the historical scripts and historical characters that are needed by scholars, but scholarly use also has its own problems that are not addressed in Unicode and about which more will be said below. Fonts designed under the Unicode standard normally contain a subset of

3 Since The Unicode Consortium is a private company, it is sometimes preferable or necessary to make reference to the ISO standard when dealing with certain government agencies or other groups that prefer to work with publicly defined standards.

the full Unicode repertoire, since including all the , characters found in the current version of Unicode makes for a very large font which consumes significant computer resources. Unicode also reserves a block called the Private Use Area (PUA) which will never have characters assigned to it by the Unicode Standard. Instead, this area is available to users for their own needs, and more will be said about this below. Unicode characters are identified by the letters “U+” followed by a hexadecimal (base 16) number; for example, the character Œ in Unicode is U+0152, LATIN CAPITAL LIGATURE OE (official Unicode names may contain only capital letters, , and spaces; they may be printed in small capitals when mixed with running text, as in this paragraph).
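The “U+” notation is easy to reproduce. The following Python sketch uses the standard `unicodedata` module, which ships with a later version of the Unicode data than the one described in this book; the helper names `u_plus` and `from_u_plus` are invented for the example.

```python
import unicodedata


def u_plus(ch: str) -> str:
    """Format a character's codepoint in the U+XXXX style (hexadecimal)."""
    return f"U+{ord(ch):04X}"


def from_u_plus(label: str) -> str:
    """Turn a label such as 'U+0152' back into the character it names."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"not a U+ label: {label!r}")
    return chr(int(label[2:], 16))


ch = "Œ"
print(u_plus(ch), unicodedata.name(ch))
print(from_u_plus("U+0152"))
```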

Windows  and  allow applications to use Unicode, but do not provide direct support through the operating system. Microsoft Word  was the first widely used piece of consumer software that could take advantage of Unicode fonts, albeit in a way that was awkward for users, and some people began using Greek Unicode texts soon after it was released. We now have Unicode support built di- rectly into Windows NT, Windows  and Mac OS . and . More and more applications are being released that can handle Unicode, and there is no question that Unicode is the way of the fu- ture. The official reference to Unicode is The Unicode Standard, Version , but I recommend that users who are new to the subject start with Graham, which is designed as an introduction to the subject and which also includes some information that is not appropriate to be included with the official standard. If you want access to the complete Unicode character list but don’t have access to the printed standard, you can download the data from the Unicode web site . It is a plain text file that you can import into a database or spreadsheet and search as you please. The Unicode website also contains a great deal of other information that you can browse. The full text of the standard has recently been put on line at the Unicode web site in PDF form. It should also be noted that incremental updates to the standard issued between major revisions are posted on the web site. Draft versions of . and . have already been posted; once these have obtained final approval, they will become official parts of the standard.

The Unicode Model: Characters not Glyphs

A fundamental principle of Unicode is to encode characters, not glyphs. The Latin letter “a” is a character which may take many different shapes: a, a, a, a, and so forth. The various shapes or outlines of “a” are referred to as glyphs, while the underlying abstraction (the Platonic Idea of “a”, so to speak) is the character. Unicode could not separately encode an italic “a,” an uncial “a,” a script “a,” and so forth; there aren’t enough codepoints and doing so would make things much more complicated than they should be. It is understood that the shape of glyphs will vary from one font to another, and Unicode does not prescribe this sort of thing.

This model does pose some problems for scholarly users, however. Take the case of the symbol for the sestertius, a Roman coin, which is not yet included in Unicode. Most often this appears as Š, but is also found as the letters HS, as II with a horizontal bar, as IS with a horizontal bar or even S with a horizontal bar. If we were to propose adding the sestertius to Unicode and the proposal were accepted (as it might well be, since it certainly is a character that classicists, epigraphers, and numismatists use) it would be added as one codepoint. That is, the character “sestertius” might be added, but five separate glyphs would not be. But scholars usually wish to preserve this kind of information when they prepare editions. There are some solutions, such as using the Private Use Area, about

which more will be said below, but the fact remains that sometimes the Unicode character-not-glyph model is not ideal for scholars although it works well enough in everyday use with modern languages.

The Issue of Precomposed Characters

The original intention in Unicode was to encode any base character followed by diacritic(s) as a sequence of separate characters (i.e., the letter e followed by a macron, rather than as one combined unit consisting of e with a macron over it). A combination of base character plus diacritic is referred to as a precomposed character, while the sequence of two separate characters is referred to as decomposed. Diacritics that are designed to be placed over a base character are called combining marks. See Table 1, page 10, for a sample of the combining marks found in Unicode. Avoiding precomposed characters dramatically reduces the number of codepoints that need to be used. (A codepoint is a slot, identified by number, into which a Unicode character may be assigned.) As the standard was developed, however, a large number of precomposed characters were included, mainly to ensure that text could be converted correctly from various existing national standards into Unicode and back again into the original encoding. In addition to the standard Windows character set, whose arrangement is followed by Unicode for the first 256 codepoints, there are three blocks of Latin characters: Latin Extended-A, Latin Extended-B, and Latin Extended Additional. Characters of interest to classicists are scattered throughout these blocks.

The presence of the precomposed characters has implications for scholarly use, as we shall see below. Their presence also may obscure the true design and intentions of Unicode for someone who is just learning about it. When I first saw a list of all the characters in Unicode, my reaction was “Wow! Look at all those characters!” Soon I saw, however, that while a great many precomposed combinations are defined, many are not. For example, a classicist might want to put a dot under a letter to show a doubtful reading. Unicode contains precomposed forms for 19 letters plus underdot, as shown in Table 6, but not for the other seven. (It does have, of course, a combining underdot.) Another example is the case of Lithuanian, which requires a few letters with diacritics that are not provided as precomposed combinations in Unicode .. Some Lithuanian users are upset about this, particularly when they learn that it is very unlikely that any more precomposed forms will be added to Unicode. The response from the Unicode powers is that Unicode does support Lithuanian, since all the necessary diacritics are included as combining marks; Lithuanian users just need applications that will take a sequence of base character + diacritic and display it properly. Unfortunately, there are no such applications for Western scripts at the present time, which may make someone who needs such combinations in her or his work feel that such needs are being ignored. If you try to use the combining diacritics in a with today’s word processors, you will find that the diacritics usually don’t line up properly over or under the base character. This is a great nuisance to adjust manually, even if your word processor allows this kind of typographic control (see Chapter ). OpenType will provide a solution, as explained in Chapter 13.
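The distinction between precomposed and decomposed forms, and the uneven coverage of the underdot, can be explored with Python's standard `unicodedata` module. This is a sketch under stated assumptions: the module reflects a later version of Unicode than this book describes, and the helper name `has_precomposed` is invented for the example.

```python
import string
import unicodedata

MACRON = "\u0304"     # COMBINING MACRON
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW


def has_precomposed(base: str, mark: str) -> bool:
    """True if base + mark composes to a single precomposed character."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


# Decomposed e + macron composes to one codepoint, and back again.
composed = unicodedata.normalize("NFC", "e" + MACRON)
print(f"U+{ord(composed):04X}", unicodedata.name(composed))
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", composed)])

# Which lowercase letters lack a precomposed form with underdot?
missing = [c for c in string.ascii_lowercase
           if not has_precomposed(c, DOT_BELOW)]
print("no precomposed underdot:", " ".join(missing))
```

For the letters in `missing`, the only encoding available is the two-character sequence, which is exactly the case where correct positioning of the mark depends on the software and font.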

These remarks about the limitations of combining characters apply to situations where a user wishes to view a document in a form appropriate for reading, as in a word processor or on a web page. There is a different situation where the use of combining marks is highly appropriate and has some important advantages: the storage of texts in machine-readable form, such as the databases of the Thesaurus Linguae Graecae and the Packard Humanities Institute. It is easier to perform operations

such as searching and sorting if the data make use of combining marks which come after the base character. Both Beta Code and Unicode prescribe this method of storing characters. For such data to be viewed on a web page, for instance, we need software that will replace the vowel+combining mark combinations with the equivalent precomposed Unicode characters. Such software is not yet readily available in web browsers or word processors, although the specialized programs that are used to access the TLG or PHI databases do perform this function. The Perseus Project offers viewers an option of “Unicode” display but they simply replace the Beta diacritics with the corresponding Unicode combining characters, with the result that accents are not properly positioned and, if two accents come over a single vowel, they sometimes blot each other out.
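Both halves of this argument can be sketched in Python with the standard `unicodedata` module: decomposed storage makes accent-insensitive searching simple, and a separate step produces precomposed characters for display. The function names are invented for the illustration, and the sample word is supplied by me rather than taken from any database.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose the text and drop every combining mark (for searching)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def compose_for_display(text: str) -> str:
    """Replace base + combining mark sequences with precomposed forms."""
    return unicodedata.normalize("NFC", text)


# The word anthropos stored as base letters followed by combining marks:
# alpha + smooth breathing + acute, then the remaining letters.
stored = "\u03b1\u0313\u0301\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03c2"

print(len(stored), "codepoints as stored")
print(len(compose_for_display(stored)), "codepoints for display")
print("\u03b8\u03c1\u03c9\u03c0" in strip_marks(stored))  # accent-free search
```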

The case of polytonic Greek is also interesting in this regard. The first version of Unicode contained only basic letters and combining accent marks. Unicode ., partly as a result of being combined with the ISO standard, added precomposed forms for most (but not quite all) the combinations of letters and accents/breathings. As soon as Word , which supported Unicode fonts, was released, classicists began to use Greek Unicode text since they were eager for an international standard to replace the various systems that were then available. And so the precomposed Greek combinations, which in the Unicode view existed only for compatibility with existing standards and were not intended for regular use in the future, are now commonly employed for polytonic Greek. This is not due to any flaw in the design of Unicode; rather, software that supports combining characters has become available much more slowly than was anticipated. (Anyone interested in using Unicode Greek should be sure to read the Unicode section of Chapter 7, which contains some important information about various issues connected with Greek. The standard has been modified over the years and some questionable characters included; this information, hard to find elsewhere, is clearly set out starting on page 41.)

Printing Issues with Unicode

One caveat is in order here. You may encounter a printer driver that can’t handle fonts with more than 256 characters. This happened to me with a DeskJet , which substituted a lower-numbered character when I tried to print using a Unicode font that contained a large character set. In such a case, you may be able to get an updated driver from the manufacturer’s web site, or you may have to use a different printer. Some printers also let you control how TrueType fonts are handled. add more details on this? In Windows, choose File/Print and click on the Properties button for the printer. Adjusting these settings will sometimes solve printing problems.

Conclusion

Unicode provides the most internationally recognized and standardized way to include more than 256 characters in a font. This is clearly beneficial to those who mix languages in their documents, who need to use a wide variety of characters and diacritics even in one language, or who wish to exchange documents with other users without running into incompatible arrangements of characters. For scholars, Greek is an obvious beneficiary. However, I would encourage Latinists and medievalists to become acquainted with the characters of interest to them that are found in Unicode, of which there are quite a number (see Tables ,  and  for complete lists) and to consider seriously the benefits of such standardization. More Unicode fonts with these characters will become available as greater numbers of users ask for them.


Table 1. Selected Combining Diacritical Marks in Unicode.

GLYPH  UNICODE  CHARACTER NAME
◌̀      U+0300   COMBINING GRAVE ACCENT
◌́      U+0301   COMBINING ACUTE ACCENT
◌̂      U+0302   COMBINING CIRCUMFLEX ACCENT
◌̃      U+0303   COMBINING TILDE
◌̄      U+0304   COMBINING MACRON
◌̅      U+0305   COMBINING OVERLINE
◌̆      U+0306   COMBINING BREVE
◌̇      U+0307   COMBINING DOT ABOVE
◌̈      U+0308   COMBINING DIAERESIS
◌̉      U+0309   COMBINING HOOK ABOVE
◌̊      U+030A   COMBINING RING ABOVE
◌̋      U+030B   COMBINING DOUBLE ACUTE ACCENT
◌̌      U+030C   COMBINING CARON
◌̓      U+0313   COMBINING COMMA ABOVE (= smooth)
◌̔      U+0314   COMBINING REVERSED COMMA ABOVE (= rough)
◌̣      U+0323   COMBINING DOT BELOW
◌̥      U+0325   COMBINING RING BELOW
◌̦      U+0326   COMBINING COMMA BELOW
◌̧      U+0327   COMBINING CEDILLA
◌̨      U+0328   COMBINING OGONEK
◌̯      U+032F   COMBINING INVERTED BREVE BELOW
◌̵      U+0335   COMBINING SHORT STROKE OVERLAY
◌̶      U+0336   COMBINING LONG STROKE OVERLAY
◌͂      U+0342   COMBINING GREEK PERISPOMENI
◌ͅ      U+0345   COMBINING GREEK YPOGEGRAMMENI

Note: the Combining Diacritical Marks block of Unicode is found at U+0300–U+036F. The characters in this table are meant to give a sense of the types of things included; those not relevant to classical languages have been omitted. By convention, combining marks in Unicode charts are printed over a broken circle ◌ which represents whatever character the mark would go over.
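Rows in the style of Table 1 can be generated rather than typed. The sketch below uses Python's standard `unicodedata` module (which carries a later version of the Unicode data than this book describes); the function name `table_row` and the choice of three sample codepoints are mine.

```python
import unicodedata

DOTTED_CIRCLE = "\u25cc"  # the broken circle used in Unicode charts


def table_row(codepoint: int) -> str:
    """Build one table row: glyph over a dotted circle, U+ number, name."""
    mark = chr(codepoint)
    return f"{DOTTED_CIRCLE}{mark}  U+{codepoint:04X}  {unicodedata.name(mark)}"


for cp in (0x0304, 0x0306, 0x0323):
    print(table_row(cp))
```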


2. Keyboard Entry and Other Useful Information

Beta Code

Because of the limited character repertoire supported by standard fonts, various individuals, organizations, and software companies have created their own methods of handling scholarly text. The most important one for classicists is Beta Code, which was created by the Thesaurus Linguae Graecae (TLG) to enable the storage of classical texts in machine-readable form. Today Beta Code is used for the Greek texts on the TLG CD-ROM and the Latin texts available from the Packard Humanities Institute (PHI). In addition, some users employ Beta Code as a convenient way of representing Greek using Latin script (see Verbrugghe). It should be clearly understood that Beta Code is designed to store texts, not to represent them properly on screen or paper. For example, a Greek vowel with a smooth breathing and an acute accent is represented as three separate characters [give example in Beta] in the data file. Specialized software and fonts are required to read the text and present the diacritics in their proper position over the base letter. Information about such software can be obtained from as can documentation about Beta Code.
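To show the kind of work such software does, here is a deliberately small Python sketch that turns lowercase Beta Code into Unicode. The letter and diacritic assignments are the Beta Code conventions as I understand them and should be checked against the TLG documentation; the sketch ignores capitals (marked with an asterisk in Beta Code), final sigma, punctuation and everything else a real converter must handle, and the function name is invented.

```python
import unicodedata

# Assumed Beta Code letter assignments, in Greek alphabetical order.
LETTERS = dict(zip("ABGDEZHQIKLMNCOPRSTUFXYW",
                   "αβγδεζηθικλμνξοπρστυφχψω"))

# Assumed Beta Code diacritics mapped to Unicode combining marks.
MARKS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex (perispomeni)
    "+": "\u0308",   # diaeresis
    "|": "\u0345",   # iota subscript
}


def beta_to_unicode(beta: str) -> str:
    """Convert simple lowercase Beta Code to precomposed Unicode Greek."""
    pieces = []
    for ch in beta.upper():
        # Unknown characters (spaces, punctuation) pass through unchanged.
        pieces.append(LETTERS.get(ch) or MARKS.get(ch) or ch)
    # Marks already follow their base letter, so composing is one step.
    return unicodedata.normalize("NFC", "".join(pieces))


print(beta_to_unicode("MH=NIN"))
print(beta_to_unicode("A)/"))
```

Because Beta Code puts each diacritic after its base letter, the conversion is a character-for-character substitution followed by composition; the difficult part in practice is display, not storage.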

Entering Multilingual Characters from the Keyboard

In any discussion of multilingual computing, the issue of keyboard entry of characters comes up sooner or later. This section presents the ways that one can access the multilingual characters found in standard 256-character Mac and Windows fonts (entry of Unicode characters is discussed separately in Chapter 4, page 25). Issues specific to classical language characters will be discussed below, building on the principles outlined here. If you already know the various ways to access non-English characters, you can skip this section.

The Mac OS from its early years has had better support for multilingual characters than Windows. The Mac OS provides access to all the upper-order characters through the OPTION key. To get an umlaut/diaeresis, type OPTION-u4, let go, and type a vowel; likewise OPTION-e for acute, OPTION-i for circumflex, OPTION-` for grave, and OPTION-n for tilde, all followed by an appropriate vowel (or n with tilde). These keystrokes work in all applications and never conflict with anything. Appendix 1 (page 76) shows all the standard Mac characters with their keystrokes. Mac OS  includes an extended Roman keyboard which allows entry of many of the precomposed combinations found in Unicode, again with the OPTION key. You must have installed a Language Pack before this keyboard will be available. confirm this You can also use third-party utilities such as Günther Blaschek’s well known PopChar to check out and enter characters.

4 The hyphen is a sign that the first key is to be held down while the second one is pressed; do not type a hyphen.


To install an additional keyboard on a Mac, from the Apple menu open the Control Panels, then the Keyboard control panel. You will see a list of all the keyboards available to your system; pick the one you want and then close the Keyboard control panel. If you have received a third-party (non-Apple) keyboard you must drag it into the System folder before you can access it through the control panel, and the Mac will ask your permission to install it into the System file.

Windows, by contrast, provides two system-wide methods for accessing non-English characters, both of them clumsy. You can turn on the NUM LOCK key, hold down the ALT key, and type, on the numeric keypad at the right of the keyboard, the four-digit ANSI number of any character. For example, to get an upside-down question mark in Spanish, you would type ALT- (after checking to make sure the NUM is depressed). You must use the numeric keypad, not the regular numerals on the top row of the keyboard. You have to keep a chart of ANSI numbers handy, of course. You can also use the Character Map applet that comes with Windows, but since you have to leave your application, copy the character you want, and then return to your application and paste it in, this is a slow business.
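If you do not have a chart of ANSI numbers handy, a few lines of Python can serve as one. This is only a sketch: it assumes that the "ANSI" set on your system is the Western European code page that Python calls cp1252, and the function name ansi_char is my own invention.

```python
def ansi_char(code: int) -> str:
    """Return the character for a Windows ANSI number (assumed: code page 1252)."""
    if not 0 <= code <= 255:
        raise ValueError("ANSI numbers run from 0 to 255")
    # A handful of positions are unassigned in cp1252 and raise UnicodeDecodeError.
    return bytes([code]).decode("cp1252")

# Print a tiny chart in the four-digit form typed on the numeric keypad.
for code in (191, 228, 233):
    print(f"ALT-{code:04d} -> {ansi_char(code)}")
```

The leading zero in the printed form matters on Windows; the sketch simply pads every number to four digits.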

Because Windows itself provides such poor support for entering upper-order characters, some applications have provided their own methods. Microsoft Word, for example, uses the CTRL key to access accented characters; CTRL-: followed by a vowel gets you an umlaut/diaeresis, CTRL-‘ produces an acute, and CTRL-^ a circumflex. Unfortunately these keystrokes do not work across all the Microsoft Office applications. Word also provides an Insert/Symbol dialog box from which you can insert any character in a font, even if the character does not have a . This is fine for rarely used characters but too slow for things you use all the time. See the screen shot on page 25 for an illustration of Insert/Symbol.

If your application does not provide its own keystrokes for non-English characters, or if you are looking for a set of keystrokes that will work in all your Windows applications, an excellent alternative is to install the US-International keyboard. This keyboard provides easy access to a great many multilingual characters and some symbols. To install it, activate the Start menu, choose Settings, open the Control Panel, and double-click the Keyboard item. Choose the Language tab at the upper left of the dialog box; if you are an American, you will see English as the first language. Click the Properties button and you will be able to change the associated with English. Choose US-International and click OK; you will probably need to have your Windows / CD-ROM in the drive so that Windows can copy the keyboard file. The US-International keyboard does have one minor drawback: if you type an opening quotation mark followed by a vowel, you will get an umlauted vowel. The solution is to hit the spacebar after typing the opening quotation mark. Some of the characters available via this keyboard are shown in the screenshot on page 14 below.
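The behavior of a deadkey, including the quotation-mark drawback just described, can be imitated in a few lines of Python. This is a rough sketch and not the actual keyboard driver: the table DEAD_KEYS and the function dead_key are my own names, and the sketch combines an accent with any letter for which Unicode has a precomposed form, which is more generous than the real US-International layout.

```python
import unicodedata

# Deadkeys and the combining marks they stand for (illustrative subset).
DEAD_KEYS = {
    '"': "\u0308",  # combining diaeresis
    "'": "\u0301",  # combining acute
    "`": "\u0300",  # combining grave
    "^": "\u0302",  # combining circumflex
    "~": "\u0303",  # combining tilde
}

def dead_key(accent: str, letter: str) -> str:
    """Return what a deadkey layout would produce for accent followed by letter."""
    if accent not in DEAD_KEYS:
        raise ValueError(f"{accent!r} is not a deadkey in this sketch")
    if letter == " ":
        # Spacebar after a deadkey yields the plain accent character itself.
        return accent
    composed = unicodedata.normalize("NFC", letter + DEAD_KEYS[accent])
    if len(composed) == 1:
        return composed
    # No precomposed form exists: emit both keys unchanged.
    return accent + letter

print(dead_key('"', "a"))   # quotation mark + vowel: umlauted vowel
print(dead_key('"', " "))   # quotation mark + spacebar: plain quotation mark
```

The second call shows why hitting the spacebar is the cure: the deadkey has to be "used up" on something before the next letter is typed.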

And, finally, if you work in one particular language extensively (I’ll use German as an example) and learned at some point to type on a German typewriter, you can install a German keyboard. Follow the procedure given in the previous paragraph, but click on Add instead of Properties and select German. Be sure the German keyboard is associated with the language and you will then find the accented characters in their usual position on a German keyboard. The same procedure works for any language supported by Windows; be sure that the Switch languages and Enable indicator on taskbar features are checked. This way you can see what language is in effect at any time and can easily switch from one to another.

Figure 1. Adding languages and keyboard layouts in Windows.

Many of the Microsoft keyboards make use of the AltGr (Alternate Graphic) key to obtain various accents and international characters. This is the right ALT on most keyboards; if you have only one ALT key, you can get the same results by holding down ALT and CTRL together. The screenshot below shows how the AltGr key is used in the US-International keyboard.

A very helpful utility is the Microsoft Visual Keyboard, designed to work with Office  and available as a free download from Microsoft’s Office website . You can use it to see the keyboard layout of any language you have installed. You can configure it to remain visible while you type in your word processor and to send keystrokes into your word processor. The screenshot below is taken from the Microsoft Visual Keyboard.


Figure 2. US-International Keyboard.

The US-International keyboard as shown in Microsoft Visual Keyboard. The AltGr key has been pressed, showing some of the international characters available. The white key is a deadkey for entering the acute accent.

The information in the preceding paragraphs applies to standard Windows fonts. If you are dealing with a font that has been modified so that some characters are replaced with others for a specific language, you will need to consult the documentation that came with the font to find out how to enter the characters. Let’s say that all the vowels with grave accents in a font have been replaced by vowels with brevia. You would use whatever keystrokes you normally use to get grave accents to obtain the vowels with brevia. Any font with non-standard characters must include a chart showing what’s been replaced with what; some fonts include macros or special keyboard drivers, but most simply let you type the special characters with the keystrokes for their equivalents in a standard font. Word’s Insert/Symbol dialog box shows the keystrokes that will enter a highlighted character at the bottom right of the box.

Finally, it should be noted that there are a variety of third-party utilities that you can use to customize your keyboard if you can’t find anything available that suits your needs. The best one I have found for Windows is Tavultesoft Keyboard Manager (Keyman), available as shareware from . Version  of Keyman, currently under development, will support both ANSI and Unicode fonts. I don’t know of an equivalent product for the Mac although there probably are some. Technically sophisticated users have long been in the habit of using ResEdit (a powerful but potentially dangerous program that allows one to customize various features of the Mac OS) to adapt Macintosh keyboards to their own needs.


Part II. The Present

γ g g ג


3. Latin

Overview

Scholarly editions of standard classical Latin texts (Oxford, Loeb, Budé, Teubner) usually require no special characters beyond those in standard computer fonts. Beginners’ textbooks require, at a minimum, vowels with macra. The macron indicates a lengthened vowel; the distinction between long and short vowels is essential to the of classical Latin, although it was indicated in writing only sporadically.5 Textbooks and reference books may also employ brevia, marks (sometimes in combination with a breve or macron), and sometimes to represent nasal vowels (e.g., incolã for incola [acc.] written without the final -). Ancient Romans sometimes wrote a horizontal line over numerals to distinguish them more readily from the surrounding Latin words. Modern editors place dots under letters to indicate uncertainty about the text and employ the obelus or dagger (†) to mark corrupt passages. Letters accidentally omitted by a scribe are marked with angle brackets 〈a〉; if a font does not contain true angle brackets, it may be more attractive to use the single guillemets ‹a› rather than the greater-than and less-than signs for this purpose. In addition, editions of a few classical authors require specialized signs, mainly for various units of measurement and currency. About  such signs are found in the Latin authors on the CD-ROM produced by the Packard Humanities Institute (PHI), which contains all the extant texts of classical Latin authors. Additional characters are needed when dealing with epigraphy and metrics; these are discussed separately in Chapter  below.

Medieval and Renaissance Latin texts present additional requirements, chief among them the large number of abbreviations found in these texts. There are over , abbreviations in the Brepols database of medieval Latin texts, and it is unlikely that there ever will be (or should be) a font with all of them. A number of them are very common, however, and were used in early printed books as well as in manuscripts; these need to be available to scholars and librarians. [add ISO standard for libraries here] Of particular note is the very common use of a horizontal stroke over a letter to denote the omission of letters, most often m. (This is not to be confused with the macron over long vowels.) There are also symbols used in liturgical texts, such as the versicle and response signs ™ š.

None of the characters mentioned above except the dagger has been readily available on either Windows or Mac systems. The standard Mac character set includes both the macron and the breve, and the Windows set has the macron. However, it is very awkward to make use of these with standard word processors6 and so specialized solutions have been required, although that is somewhat changed now that Unicode support is widely available.

5 See below in the Epigraphy section for more information about how the ancient Romans marked vowel . During the 18th and early 19th centuries, a circumflex accent was used to mark a vowel as long, and this usage is still occasionally encountered in Latin as well as in Welsh and other languages.

6 The procedure is given below, page 33, for those who are interested.


How-to -character Fonts

Since there has never been a standardized solution to the needs of Latinists, various ad hoc methods have been employed, particularly by teachers who are eager to prepare materials with macra for their students. Under both Windows and the Mac OS, only the diaeresis/umlaut appears over all six vowels in standard fonts (and can therefore be entered from the keyboard). Some teachers have simply used umlauts to stand in place of macra. Those who are willing to spend the time and effort have modified their own fonts, replacing the umlauted vowels with macra. (TypeTool from FontLab is designed for exactly this kind of simple font modification; see page 33 for more about modifying fonts.) David Meadows has made such a font, similar to the Arial , available on his web site [he will send new url]; it includes macra, brevia, and with lines. Others have used a font package designed for Hawai’ian and other Polynesian languages; these languages use macra to represent long vowels, but do not use the letter with a macron which we sometimes need in Latin. Some Hawai’ian fonts are available from .
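Text prepared with the umlaut-for-macron workaround can later be converted to true Unicode macra. The following Python sketch assumes that every umlauted vowel in the text is a stand-in for a long vowel, so any genuine diaeresis would be converted wrongly; the table UMLAUT_TO_MACRON and the function name are my own and do not come from any font package.

```python
# Each umlauted vowel (left) is mapped to the Unicode vowel with macron (right).
UMLAUT_TO_MACRON = str.maketrans({
    "\u00c4": "\u0100",  # A
    "\u00cb": "\u0112",  # E
    "\u00cf": "\u012a",  # I
    "\u00d6": "\u014c",  # O
    "\u00dc": "\u016a",  # U
    "\u0178": "\u0232",  # Y
    "\u00e4": "\u0101",  # a
    "\u00eb": "\u0113",  # e
    "\u00ef": "\u012b",  # i
    "\u00f6": "\u014d",  # o
    "\u00fc": "\u016b",  # u
    "\u00ff": "\u0233",  # y
})

def umlauts_to_macra(text: str) -> str:
    """Replace stand-in umlauted vowels by vowels with macra."""
    return text.translate(UMLAUT_TO_MACRON)

print(umlauts_to_macra("Cl\u00e4r\u00f6rum vir\u00f6rum facta m\u00f6r\u00ebsque"))
```

The result still needs a Unicode font that contains the vowels with macra before it will display or print correctly.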

For those who want macra and brevia, perhaps in combination with a stress mark, the first commercial solution was the TransRoman package ($. and up) from Linguists Software; information available at . This package has received good reports from users, and does provide the diacritics that Latin teachers need along with assistance with keyboard input for the characters. It is, however, a proprietary solution; one must use the keyboard driver and fonts that Linguists Software supplies. TransRoman also contains diacritics for many other Latin-script languages, which may be of interest to some users. Unitype provides similar functionality in their Global Writer and Global Office packages, which are described more fully below in the section on Greek.

The CL Fonts package, developed by the present author with the support of the Classical Association of the Empire State, is specifically designed to meet the needs of Latin teachers. It provides macra, brevia and stress marks for regular text printing, plus common metrical signs and a small selection of epigraphical and medieval characters. The regular weight face can be downloaded from the CAES homepage and evaluated free of charge; the full package, with italic, bold, and bold italic fonts plus a detailed manual and keyboard drivers, can be purchased for $ through the CAES homepage or from other sources. The fonts were produced to professional standards and are available in both Mac and Windows versions. See pages 20–21 for a sample printout of all the characters in the CL Fonts arrangement.

The Old English Font Pack from Dr. Peter Baker, although designed for scholars of Anglo-Saxon, contains about  medieval Latin abbreviations as well as vowels with macra and brevia. [confirm number] Download it from .

A larger selection of characters that will be of particular interest to medievalists or Renaissance scholars is contained in the font  from Tiro Typeworks. These fonts are exquisitely crafted, although somewhat expensive for those who don’t use them frequently. Visit Tiro’s homepage at .

Unicode Solutions

Unicode . contains all vowels with macra and brevia, except Y/y with breve, in precomposed form as well as a combining macron and combining breve which can (if your software cooperates; at the moment, none does) be placed over any character. Anyone with a Unicode-friendly word processor7 can take advantage of these precomposed characters; this may be a useful solution for someone who needs only macra and brevia, since you do not need to purchase any additional software beyond the Times New Roman and Arial fonts that ship with Windows . See Interlude , page 25, for information about using Unicode characters in Word. Unicode . also contains a number of other characters that are of interest to Latinists. See Table  (page 23) for a complete list of these. Table  in the epigraphy chapter (page 50) contains vowels with underdots and other characters that may be useful to those who are preparing critical editions.
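The difference between precomposed characters and combining marks can be seen with Python's standard unicodedata module, which composes a base letter and a combining mark into a single character whenever Unicode defines one. This is a minimal sketch; the helper name compose is mine, and the results reflect whatever version of Unicode your Python installation carries.

```python
import unicodedata

MACRON, BREVE = "\u0304", "\u0306"  # combining macron, combining breve

def compose(base: str, mark: str) -> str:
    """Join base + combining mark, using a precomposed character if one exists."""
    return unicodedata.normalize("NFC", base + mark)

for base in "aeiouy":
    for mark_name, mark in (("macron", MACRON), ("breve", BREVE)):
        result = compose(base, mark)
        kind = "precomposed" if len(result) == 1 else "combining sequence"
        codes = " ".join(f"U+{ord(c):04X}" for c in result)
        print(f"{base} + {mark_name}: {codes} ({kind})")
```

A result of two code points means that the letter can only be shown correctly by software and fonts that position combining marks.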

At the moment, there are two Unicode fonts that may be useful for Latinists since they contain more characters than those found in the Times New Roman and Arial . The Titus Project from the University of Frankfurt, whose main focus is getting ancient texts on the Web, has produced a Unicode font that they will make available for non-commercial use upon request; see the Titus home page at . The excellent package from Dr. Peter Baker contains a number of medieval Latin abbreviations as well as vowels with macra and brevia, even though it does need to be updated to conform with Unicode .. Download from . We hope to produce a Unicode version of the CL Fonts package in the not-too-distant future. Also, the TLG and PHI Workplace programs from Silver Mountain Software , which are designed to access the TLG and PHI CD-ROMs respectively, include Silver Humana, a Unicode font with Latin, Greek, and Hebrew characters (not available separately).

You can also download a Beta version of the Cardo font from . Cardo is a Unicode font that contains all the characters discussed in this book plus many others. It was specifically designed for scholars and is used to set the body of this book.

7 At the moment, this means Microsoft Word; version 8.4 and later of WordPerfect will open Unicode files, but WordPerfect provides no way for users to enter Unicode text easily.


Sample 1. Printing with CL Fonts

The following is a sample of the characters included in the CL Fonts package developed by David Perry and the Classical Association of the Empire State specifically for Latinists. Samples are at 12 point.

Vowels with macra and brevia
Ä Ë Ï Ö Ü Ÿ ä ë ï ö ü ÿ Â Ê Î Ô Û â ê î ô û ñ À È Ì Ò Ù Þ à è ì ò ù þ Å ¼ Ð Ø ½ ß ã å ð õ ÷ ø

Sample of connected prose
Clärörum virörum facta mörësque posterïs trädere, antïquitus üsitätum, në nostrïs quidem temporibus quamquam incüriösa suörum aetäs omïsit, quotiëns magna aliqua ac nöbilis virtüs vïcit ac supergressa est vitium parvïs magnïsque cïvitätibus commüne, ignörantiam rëctï et invidiam. —, vita Iulii Agricolae liber i.1–2

Word Stress
pátrës, étiam, diffícilis, pópulus, púerum, cýathus
c‚dere, cƒsa, „gepae, ob…diëns, cˆque
cantâre, vidêre, dormîre, senätôrem, virtûtem, papñrus
cãntant, tåneö, inð tium, õrior, ÷milis, cøathus

Roman numerals with lines
¡ ¢£¤¥¬−

Inscriptions
H¾ C·SITVS·EST·C·I¿LIVS·BARÓ·QV¾ ·V¾ XIT·ANN·¤£
doubtful letters: ˜ a ˜ b ˜ c ˜ p ‹a› ‹m› and abbreviations: º Pº P, º Tº
denarius ‰ and sestertius Š

Nasal vowels & dieresis (lower-case only)
fëminª a, mïlitª e, part« , cª osul, virª u, ª y; ¨ a, ¨ e, © , ¨ o, ¨ u, ¨ y


Ligatures
Æ, æ , , ç, Œ, œ

Medieval & religious symbols
™ , š, Ã and Õ = et, bar above any letter for abbreviations: » , » n, » c, » m

Poetry
metrical schemes:
× × | ¯ µ µ | ¯ µ ¯ µ | ¯ × (hendecasyllabics)
¯ ² / ¯ ² / ¯ ¦ ² / ¯ ² / ¯ µ µ / ¯ ×
¯ ² / ¯ ² / ¯ ¦ ¯ µ µ / ¯ µ µ / × (elegiac couplet)
vowel quantities and syllabic quantities:
¯ µ µ / ¯ µ µ / ¯ ¦ ¯/ ¯ ¯ / ¯ µ µ / ¯ ¯
Arma virumque canö, Troiae quï prïmus ab örïs
¯ µ / ¯ ¯ /¯ ¦ µ µ /¯ µ ¯ /¯
Fürï¹et Aurëlï, comitës Catullï
syllabic quantities only: änte¹ömnësquè Lèlëx, ànìmö mätürùs et » aevö
synizesis: ant® ehäc
plus ± and ³ and marks for ictus: ´ ¸

Standard publishing characters
- and -: —, – : •
dagger and double dagger: † ‡
section and paragraph markers: § ¶
curly quotes: “She said, ‘You shouldn’t stand out here in the cold!’ but he ignored her.”
single guillemets: ‹ ›


Issues with Unicode

Unicode encodes the Roman numerals I–XII plus , , C, and M in both upper- and lowercase forms (U+2160–U+217F). These were included for compatibility with some East Asian standards and I do not see any point in using them in classical texts. The only advantage might be that one could search for numerals without the possibility of finding regular Latin letters; however, since Roman numerals are normally printed separated from surrounding words by spaces, they are easy to find even when regular letters are used to represent them. Furthermore, not all Unicode fonts will contain these characters, and they only handle a very small of the possible Roman numerals; so I would not bother with them. (These comments of course do not apply to the Unicode characters for the Roman numerals , and ,, which are tabulated in Table 6 in the Epigraphy section below.)
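If you receive a text in which these compatibility numerals have been used, they can be turned back into ordinary letters. The Python sketch below relies on the compatibility decompositions recorded in the Unicode database and applies them only to the range U+2160–U+217F; the function name plain_roman is my own.

```python
import unicodedata

def plain_roman(text: str) -> str:
    """Replace Roman-numeral compatibility characters (U+2160-U+217F) by letters."""
    return "".join(
        # NFKC replaces a compatibility character by its ordinary-letter spelling.
        unicodedata.normalize("NFKC", ch) if 0x2160 <= ord(ch) <= 0x217F else ch
        for ch in text
    )

# U+2167 is the single character ROMAN NUMERAL EIGHT.
print(plain_roman("liber \u2167"))
```

Restricting the normalization to this one range leaves every other character in the text, including ligatures and superscripts, untouched.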

The main difficulty with Unicode, as far as Latinists are concerned, is that a number of useful precomposed combinations are not included, and there is very little chance of their being added.8 Y/y + breve is probably the most important of these. Unicode includes all letters with an underdot except C/c, F/f, G/g, J/j, P/p, Q/q and X/x; there is also a combining underdot, which at the moment is not useful with normal software. Finally, Unicode includes only two instances of the combination of macron plus acute accent (E/e and O/o) and none of acute + breve, which are sometimes needed to show word stress. See Chapter , page 33, for ways to deal with this. The other problem with Unicode at the present time is that there is considerable inconsistency in which characters fonts include. Font designers tend to omit characters that they think will not be used, which includes many characters that scholars are interested in. This situation will improve with time.
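The list of letters that lack a precomposed underdot can be derived rather than memorized. The following Python sketch asks the Unicode database, for each capital letter, whether the letter plus COMBINING DOT BELOW (U+0323) composes into a single character; the helper name lacks_precomposed is my own, and the answer depends on the Unicode version built into your Python.

```python
import string
import unicodedata

DOT_BELOW = "\u0323"  # COMBINING DOT BELOW

def lacks_precomposed(letter: str, mark: str) -> bool:
    """True if letter + mark has no single precomposed character."""
    return len(unicodedata.normalize("NFC", letter + mark)) > 1

missing = [c for c in string.ascii_uppercase if lacks_precomposed(c, DOT_BELOW)]
print("No precomposed underdot:", " ".join(missing))

# The same test works for stacked accents, e.g. macron followed by acute.
for vowel in "aeiou":
    stacked = unicodedata.normalize("NFC", vowel + "\u0304\u0301")
    print(vowel, "macron + acute:", len(stacked), "code point(s)")
```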

8 The decision not to allow any additional precomposed combinations after Version 3.0 of Unicode was made because of the increasing reliance on Unicode by web applications and other software that change decomposed into precomposed forms in order to display text properly; such applications will have to be constantly updated if additional precomposed combinations continue to be added.


Table 2. Unicode Characters for Classical and Medieval Latin
See also Table 6, page 50, for epigraphic characters.

GLYPH  UNICODE  CHARACTER NAME
Ā  U+0100  LATIN CAPITAL LETTER A WITH MACRON
ā  U+0101  LATIN SMALL LETTER A WITH MACRON
Ă  U+0102  LATIN CAPITAL LETTER A WITH BREVE
ă  U+0103  LATIN SMALL LETTER A WITH BREVE
Ē  U+0112  LATIN CAPITAL LETTER E WITH MACRON
ē  U+0113  LATIN SMALL LETTER E WITH MACRON
Ĕ  U+0114  LATIN CAPITAL LETTER E WITH BREVE
ĕ  U+0115  LATIN SMALL LETTER E WITH BREVE
Ī  U+012A  LATIN CAPITAL LETTER I WITH MACRON
ī  U+012B  LATIN SMALL LETTER I WITH MACRON
Ĭ  U+012C  LATIN CAPITAL LETTER I WITH BREVE
ĭ  U+012D  LATIN SMALL LETTER I WITH BREVE
Ō  U+014C  LATIN CAPITAL LETTER O WITH MACRON
ō  U+014D  LATIN SMALL LETTER O WITH MACRON
Ŏ  U+014E  LATIN CAPITAL LETTER O WITH BREVE
ŏ  U+014F  LATIN SMALL LETTER O WITH BREVE
Ū  U+016A  LATIN CAPITAL LETTER U WITH MACRON
ū  U+016B  LATIN SMALL LETTER U WITH MACRON
Ŭ  U+016C  LATIN CAPITAL LETTER U WITH BREVE
ŭ  U+016D  LATIN SMALL LETTER U WITH BREVE
Ȳ  U+0232  LATIN CAPITAL LETTER Y WITH MACRON [3.0]
ȳ  U+0233  LATIN SMALL LETTER Y WITH MACRON [3.0]
Æ  U+00C6  LATIN CAPITAL LETTER AE
æ  U+00E6  LATIN SMALL LETTER AE
Œ  U+0152  LATIN CAPITAL LIGATURE OE
œ  U+0153  LATIN SMALL LIGATURE OE
ſ  U+017F  LATIN SMALL LETTER LONG S
⁊  U+204A  TIRONIAN SIGN ET
†  U+2020  DAGGER
‡  U+2021  DOUBLE DAGGER
§  U+00A7  SECTION SIGN
¶  U+00B6  PILCROW SIGN
⁋  U+204B  REVERSED PILCROW SIGN
℟  U+211F  RESPONSE
℣  U+2123  VERSICLE
※  U+203B  REFERENCE MARK
⁁  U+2041  CARET INSERTION POINT
⁂  U+2042  ASTERISM
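A list of code points and names such as Table 2 is easy to mistype, and it can be checked mechanically against the Unicode database that ships with Python. The sketch below tests only a handful of rows; the list ROWS and the function check are illustrative names of my own, and the names returned are those of the Unicode version in your Python installation.

```python
import unicodedata

# (code point, expected character name): a few sample rows only.
ROWS = [
    (0x0100, "LATIN CAPITAL LETTER A WITH MACRON"),
    (0x012C, "LATIN CAPITAL LETTER I WITH BREVE"),
    (0x0233, "LATIN SMALL LETTER Y WITH MACRON"),
    (0x204A, "TIRONIAN SIGN ET"),
    (0x2123, "VERSICLE"),
]

def check(rows):
    """Return the rows whose name disagrees with Python's Unicode database."""
    return [
        (cp, name)
        for cp, name in rows
        if unicodedata.name(chr(cp), "<unassigned>") != name
    ]

for cp, name in ROWS:
    print(f"{chr(cp)}  U+{cp:04X}  {unicodedata.name(chr(cp))}")
print("mismatches:", check(ROWS))
```

A row that pairs a name with the wrong code point shows up in the list of mismatches, which makes transposed digits easy to catch.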


4. Interlude: Using Unicode Characters with Microsoft Word

This section tells how to enter a character from a Unicode font using Word’s built-in facilities. These are fine for occasionally inserting characters to supplement those found on standard keyboards; however, if you work regularly in Greek or another non-Latin script, you will normally need to get a specially programmed solution, such as Antioch or MultiKey (see the Greek section for information); MultiKey is the only package I know of at the moment that provides keyboard support for entering Latin Unicode characters. See also the discussion of UniPad in the Resources section (page 72).

In Word, put your text into a Unicode font, then choose Insert/Symbol from the menu. In the Subset pulldown menu at the upper right of the dialog box, choose Latin Extended-A and you will see an A-macron as the first character in that group. Click the character you want, then click Insert, and the character will be added to your document. See the screen shot below, where the macron and breve characters have been highlighted in yellow.

Figure 3. Word’s Insert/Symbol dialog box.

You can use the scroll bar at the right to view all the characters in the font. However, if you know the Unicode range where the character is located, using the pulldown list provides a faster way to get to the character you want. If your font is a PostScript (Type ) font, not a TrueType font, the Insert/Symbol dialog displays a peculiar behavior: if you just choose Insert/Symbol, all you see are blank spaces. If you highlight a character in your document to which the Type  font has been applied and then choose Insert/Symbol, you will see the characters. This is true for Word  and ATM .; whether it is true for all versions of Word and ATM I am not certain.

There is another way to get a Unicode character. Open a Word dialog such as Edit/Find or File/Open. Enter the four-digit Unicode value in and then type ALT-x. The four digits will be replaced by the corresponding Unicode character; if that character doesn’t exist in your system font (the default font that Windows uses for dialog boxes and such), an empty rectangle will be displayed instead. Highlight the character or rectangle and copy it to the clipboard (CTRL-C), hit ESCAPE to close the dialog, and then paste (CTRL-V) the character at the appropriate spot in your document. This technique is useful on those occasions when you know a character’s Unicode value but just can’t find it in the small chart that appears when you choose Insert/Symbol. If you see a rectangle, go ahead and follow the procedure anyway; the character will display properly when you paste it into a document that is formatted in a font which does contain the character you want.

If you receive a Unicode-based document that contains characters which do not exist in any font you have (and so are represented by the generic “missing character” character, usually a small rectangle), you can identify them by following the reverse of the procedure given in the previous paragraph: highlight and copy the unknown character, select Edit/Find, and paste it into the “Find what” field. Then type SHIFT-ALT-x and the unknown character will be replaced by its hexadecimal value which you can look up in a Unicode chart. You can also visit and use the facilities provided there to find the identity of any unknown Unicode character.
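The two conversions just described, from a four-digit hexadecimal value to a character and from a character back to its value, can also be done outside Word. Here is a minimal Python sketch; the function names are my own.

```python
def char_from_hex(hex_value: str) -> str:
    """Return the character for a hexadecimal Unicode value such as '0101'."""
    return chr(int(hex_value, 16))

def hex_from_char(ch: str) -> str:
    """Return the four-digit (or longer) hexadecimal value of one character."""
    return f"{ord(ch):04X}"

print(char_from_hex("0101"))       # a with macron
print(hex_from_char("\u012d"))     # value to look up in a Unicode chart
```

This is handy for identifying an unknown character pasted from a document, since the value can then be looked up in a Unicode chart.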

A very useful set of Word macros to help with Unicode questions is available from [need url]. These macros allow you to identify unknown Unicode characters, insert characters if you know their number, search for Unicode characters, and other things. You can also create your own macros to insert Unicode characters, if you are comfortable with Word’s macro language.

It is also possible to use Word’s AutoCorrect function to enter characters. AutoCorrect was designed to correct common typing errors; for example, “htis” is automatically changed to “this.” It can also be used as a shorthand method of entering text, however. Choose Insert/Symbol, highlight the character you want and then click the AutoCorrect button in the lower left; the character you chose will automatically appear in the AutoCorrect dialog box, and you can enter whatever key combination you wish to use for the selected character. The limitation with using AutoCorrect this way is that Word treats each entry as a separate word; that is, the AutoCorrect function is not activated until you enter a , a period, a , or a . So this doesn’t work well for entering a single character in the middle of a word. Ralph Hancock has developed a set of AutoCorrect entries for common Greek words which is included with the Antioch package (see the Greek section for information).


You can use the Find command on Word’s Edit menu to look for a Unicode character. Type the character’s decimal number (not hexadecimal as in most Unicode charts9) preceded by ^u, for example ^u947 to find γ (lower case Greek gamma, U+03B3). This does not work with Replace.
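Since Unicode charts give hexadecimal values and Word's Find box wants decimal ones, the conversion is worth automating. A minimal Python sketch follows; the function name word_find_code is my own.

```python
def word_find_code(hex_value: str) -> str:
    """Turn a hexadecimal Unicode value into the ^u code typed in Word's Find box."""
    return f"^u{int(hex_value, 16)}"

# Lower case Greek gamma is U+03B3 in the Unicode charts.
print(word_find_code("03B3"))
```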

Remember that not all Unicode-based fonts contain all the characters defined in the Unicode Standard; there is no way to tell which characters a font contains (unless some documentation comes with the font) except by looking at the characters in Word’s Insert/Symbol dialog or in a separate font viewing utility.10 The Times New Roman and Arial that ship with Windows  do contain most of the Unicode characters for western languages. Note that Unicode . (the version that preceded the current .) did not include macra on Y/y. Most available Unicode fonts, as of this writing, still have not caught up to version .; you can see in the screen shot above that Y-macron and y-macron are missing.11 This also affects documents you share with other users. The person you send a document to needs to have a Unicode font on his/her system with the macron or breve characters.

For more about keyboard entry of characters, see Chapter .

9 The Windows Calculator accessory can convert between number systems for you; start Calculator and choose View/Scientific, click on Hex, type the hex value and click Decimal.

10 Microsoft’s Font Properties Extension 2.1, which can be downloaded from , shows a good deal of useful information about a TrueType or OpenType font when you right-click on the font file name and choose the Properties item. It shows the ranges supported by a font (e.g., Basic Greek, Latin Extended-A) but does not show every character. One font viewer that does show every character is written by Arjan Mels; see .

11 In Unicode 3.0, Y-macron and y-macron are located at the very end of the Latin Extended-B block (U+0232 and U+0233).


5. Germanic

Overview

Scholars who work in Old English and Old Norse have some special needs that are not met in standard fonts. Old English requires at a minimum the letters eth, thorn, and ash which are part of the standard Windows character set (since they are still used in Icelandic), but the first two are not in the regular Mac set. In addition, macra, brevia, and several other diacritics are sometimes needed as well as a number of specialized characters such as yogh and wynn. Unicode now provides most of the characters that are needed for Germanic languages and they are shown in Table  below along with their Unicode values. Those characters that are not found in Unicode are listed on page 30.

How-to -character Fonts

There are a few good solutions for Old English and Old Norse. Edlund by Carl Anderson is a Mac font with a great many characters for northern European languages; supplementary fonts with true small capitals and insular letterforms are also available. It is free for scholars and can be downloaded from the Edlund Project’s page . Some Mac users have also obtained Icelandic versions of Times which contain eth and thorn. The Old English Font Pack from Dr. Peter Baker of the University of Virginia contains two complete fonts (Junius and Beowulf) with all the necessary characters as well as some supplementary characters designed to match Times New Roman plus some decorative Anglo-Saxon capitals. Both Mac and Windows versions are available. See Dr. Baker’s web page at . From the University of Leeds in the there is LeedsTimes Medieval4 (plain and bold versions; no italics, and apparently only for Windows), part of a family of several LeedsTimes fonts that accommodate various languages. These are free for non-commercial use, but you must advise the University if you use the font in a publication. While not quite as complete as Edlund or Dr. Baker’s fonts, it does include a couple of characters not in either of the other two. See the homepage at . See also the discussion above in the Latin section for ways to get macra and brevia, both with -character fonts and with Unicode fonts.

A Windows font for the Gothic language is available from Dr. Berlin’s Foreign Font Archives ; the download file is called gothic.zip.

Several Runic fonts are downloadable from Coron’s Sources of Fonts . Since I do not work with Runic, I am not in a position to comment on their scholarly quality.


Unicode Solutions

Table  below lists all the Germanic characters currently in Unicode. Vowels with macra and brevia are tabulated in the Latin section, page 20. “[.]” indicates that the character was added in the re- version . of Unicode; many existing fonts do not have these characters. Unicode . also added a block devoted to Runic letters, U+16A0–U+16F0. For ways to use the Germanic characters that are in Unicode with currently available software, see the discussion in the previous chapter, page 25. And, finally, we should note that the Gothic script has been accepted by Unicode for inclusion in version ., although it has not yet received final approval from ISO. Gothic will be located in Plane  ( characters; 10330–1034B) and the characters will be accessed through the use of surrogate pairs (for an explanation of this, see below [crossreference]).
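For readers who want to see what a surrogate pair looks like in practice, the arithmetic is short. The Python sketch below computes the two 16-bit values that represent a character beyond U+FFFF in UTF-16, using the first Gothic code point given above as the example; the function name surrogate_pair is my own.

```python
from typing import Tuple

def surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Return the UTF-16 high and low surrogates for a code point above U+FFFF."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use surrogate pairs")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top ten bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom ten bits of the offset
    return high, low

high, low = surrogate_pair(0x10330)
print(f"U+10330 -> {high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder.
print(chr(0x10330).encode("utf-16-be").hex())
```

The cross-check at the end confirms that the hand calculation agrees with what software actually writes to a UTF-16 file.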

The following characters are not included in Unicode .: Ã, , ¬ = thæt, and the precomposed combinations /, ·/ª, ›/¢, ⁄/¡, «, /,   ḗ  ṓ  ,        and r.̥ There are also no metrical signs such as         . See Interlude , page 33, for some ways to get these combinations. need to check Edlund Mac docs; some chars didn’t translate properly to Windows font, I think

The only Unicode font for medievalists that I am aware of is Junicode from Peter Baker, downloadable from . This is an excellent font with a very large selection of characters (many of the characters in the preceding paragraph are printed in Junicode). See also the section on page 33 ff. for ways to get special characters that you can’t find in any available font.

As mentioned above, Unicode now contains a block devoted to Runic. I have not yet seen a font designed around this block; until one is published, users will have to continue with the 256-character fonts now available and discussed on the previous page. Using the Unicode Runic block will provide considerable advantages in terms of standardization and interchangeability, and so I hope that we will soon see some Unicode runic fonts. However, anyone contemplating developing such a font must read the statements in The Unicode Standard, Version 3.0 (pages –), particularly the remarks regarding the shape of the glyphs used in the Unicode chart; simply looking at the charts does not provide all the necessary information.

A block of medieval superscript letter diacritics has been proposed for Unicode 3.2. Although used mainly in Germanic manuscripts, they are found sometimes in other languages and occasionally appeared as late as the 19th century. If given final approval, these letters will be added to the combining diacritical marks range; they are shown in magenta in the chart below.


Table 3: Medieval Germanic Characters in Unicode 3.0 (see also Table 2 for vowels with macra and brevia)

GLYPH  UNICODE  CHARACTER NAME
Æ      U+00C6   LATIN CAPITAL LETTER AE
æ      U+00E6   LATIN SMALL LETTER AE
Ǽ      U+01FC   LATIN CAPITAL LETTER AE WITH ACUTE
ǽ      U+01FD   LATIN SMALL LETTER AE WITH ACUTE
Ǣ      U+01E2   LATIN CAPITAL LETTER AE WITH MACRON
ǣ      U+01E3   LATIN SMALL LETTER AE WITH MACRON
Ą      U+0104   LATIN CAPITAL LETTER A WITH OGONEK
ą      U+0105   LATIN SMALL LETTER A WITH OGONEK
ƀ      U+0180   LATIN SMALL LETTER B WITH STROKE (Old Saxon)
Ċ      U+010A   LATIN CAPITAL LETTER C WITH DOT ABOVE
ċ      U+010B   LATIN SMALL LETTER C WITH DOT ABOVE
đ      U+0111   LATIN SMALL LETTER D WITH STROKE
Ę      U+0118   LATIN CAPITAL LETTER E WITH OGONEK
ę      U+0119   LATIN SMALL LETTER E WITH OGONEK
Ġ      U+0120   LATIN CAPITAL LETTER G WITH DOT ABOVE
ġ      U+0121   LATIN SMALL LETTER G WITH DOT ABOVE
ħ      U+0127   LATIN SMALL LETTER H WITH STROKE
Ƕ      U+01F6   LATIN CAPITAL LETTER HWAIR [3.0] (Gothic transcription)
ƕ      U+0195   LATIN SMALL LETTER HV
ĸ      U+0138   LATIN SMALL LETTER KRA
Ŋ      U+014A   LATIN CAPITAL LETTER ENG
ŋ      U+014B   LATIN SMALL LETTER ENG
Ʀ      U+01A6   LATIN LETTER YR (Old Norse)
ʀ      U+0280   LATIN LETTER SMALL CAPITAL R (used for lowercase ýr)
Œ      U+0152   LATIN CAPITAL LIGATURE OE
œ      U+0153   LATIN SMALL LIGATURE OE
Ǿ      U+01FE   LATIN CAPITAL LETTER O WITH STROKE AND ACUTE
ǿ      U+01FF   LATIN SMALL LETTER O WITH STROKE AND ACUTE
Ǫ      U+01EA   LATIN CAPITAL LETTER O WITH OGONEK (Old Icelandic)
ǫ      U+01EB   LATIN SMALL LETTER O WITH OGONEK
Ǭ      U+01EC   LATIN CAPITAL LETTER O WITH OGONEK AND MACRON (Old Icelandic)
ǭ      U+01ED   LATIN SMALL LETTER O WITH OGONEK AND MACRON
ſ      U+017F   LATIN SMALL LETTER LONG S
Ð      U+00D0   LATIN CAPITAL LETTER ETH
ð      U+00F0   LATIN SMALL LETTER ETH
Þ      U+00DE   LATIN CAPITAL LETTER THORN
þ      U+00FE   LATIN SMALL LETTER THORN
Ƿ      U+01F7   LATIN CAPITAL LETTER WYNN [3.0]
ƿ      U+01BF   LATIN [SMALL] LETTER WYNN
Ȝ      U+021C   LATIN CAPITAL LETTER YOGH [3.0]
ȝ      U+021D   LATIN SMALL LETTER YOGH [3.0]
Ƶ      U+01B5   LATIN CAPITAL LETTER Z WITH STROKE
ƶ      U+01B6   LATIN SMALL LETTER Z WITH STROKE
Ȥ      U+0224   LATIN CAPITAL LETTER Z WITH HOOK [3.0] (Middle High German)
ȥ      U+0225   LATIN SMALL LETTER Z WITH HOOK [3.0]
Ʒ      U+01B7   LATIN CAPITAL LETTER EZH
ʒ      U+0292   LATIN SMALL LETTER EZH
χ      U+03C7   GREEK SMALL LETTER CHI
⁋      U+204B   REVERSED PILCROW SIGN [3.0]
⁊      U+204A   TIRONIAN SIGN ET [3.0]
◌ͣ      U+0363   COMBINING LATIN SMALL LETTER A
◌ͤ      U+0364   COMBINING LATIN SMALL LETTER E
◌ͥ      U+0365   COMBINING LATIN SMALL LETTER I
◌ͦ      U+0366   COMBINING LATIN SMALL LETTER O
◌ͧ      U+0367   COMBINING LATIN SMALL LETTER U
◌ͨ      U+0368   COMBINING LATIN SMALL LETTER C
◌ͩ      U+0369   COMBINING LATIN SMALL LETTER D
◌ͪ      U+036A   COMBINING LATIN SMALL LETTER H
◌ͫ      U+036B   COMBINING LATIN SMALL LETTER M
◌ͬ      U+036C   COMBINING LATIN SMALL LETTER R
◌ͭ      U+036D   COMBINING LATIN SMALL LETTER T
◌ͮ      U+036E   COMBINING LATIN SMALL LETTER V
◌ͯ      U+036F   COMBINING LATIN SMALL LETTER X
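Readers who want to verify a code point from a table such as this one can look it up in the Unicode character database. The following is only a sketch: it uses the `unicodedata` module that ships with Python, which reflects a later version of Unicode than 3.0, and the helper name `describe` is my own.

```python
import unicodedata


def describe(code_point):
    """Return 'U+XXXX NAME' for a code point, or a placeholder if it has no name."""
    name = unicodedata.name(chr(code_point), "<no name in this Unicode version>")
    return f"U+{code_point:04X} {name}"


# A few of the Germanic characters discussed in this chapter.
for cp in (0x00DE, 0x00FE, 0x01F7, 0x021C, 0x204A):
    print(chr(cp), describe(cp))
```

A check of this kind is a quick way to catch a transposed digit or a swapped capital and small form before a character table goes to print.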

Note about ezh and yogh: in Unicode . these characters were unified at one codepoint, but they were disunified and new codepoints added for Yogh/yogh in version 3.0. This was done in recognition of the fact that the two are historically distinct characters.


6. Interlude: What if I really need characters that aren’t in my font?

If you’ve tried without success to locate a font whose design is suitable for your project and which contains the characters you need, you have two choices.

Adding or modifying characters

The first option is to create the missing characters with a font editor. Fontographer, now published by Macromedia , was the first general-purpose font editor for personal computers; it still works well, although it has not been updated in quite some time. More recent products come from the FontLab Group: FontLab, a very powerful, full-featured font editor, and TypeTool, designed to make small modifications to existing fonts. If you are working with TrueType fonts, the FontLab products have a significant advantage over Fontographer: they preserve the hinting, whereas Fontographer strips out existing hints and substitutes its own. Hints are special instructions built into the font that enable it to look its best at various sizes under low-resolution conditions. Hinting is very important for fonts which will be used on screen or printed at less than  dots per inch (early laser printers were 300 dpi, more recent ones ). If the font was well hinted by its designer, Fontographer’s automatically generated hints are usually not as good as those in the original. The FontLab home page is at . Both Macromedia and FontLab offer reduced prices to qualified academic users.

Mention should also be made of the Font Creator Program by Erwin Denissen of High-Logic in the Netherlands. This is a shareware font editor which, while not offering all the bells and whistles of the more expensive products, may do what you need for simple modifications such as adding accents. And, since it is shareware, you can try it before you purchase. See the High-Logic page at .

If you have a special font that you want to convert from Mac to Windows format or vice versa, FontLab makes TransType, a conversion utility for just this purpose. You can also open the font with Fontographer and export it as a file for the other platform. add other utils here

It is not difficult to learn to use Fontographer or TypeTool to add accents onto existing letters since you can copy and paste the accents onto existing character outlines. To go beyond this, however (i.e., to create an actual letter or symbol that is missing from the font), you will need to acquire some serious type design skills. You may be better off having this kind of work done for you. There are a number of small type companies that will do custom work at reasonable prices; the big companies such as Monotype will also do custom work, but the prices may be prohibitive for scholars. Some font developers are much more knowledgeable about non-English characters than others, which may affect the quality of the work they do for you. Also, custom-made or modified fonts may not look as good on screen as products from the large companies such as Monotype. This is because the large companies can do special ultra-sophisticated hinting which improves the appearance of characters on a monitor; the software required for such super-hinting is beyond the reach of the smaller companies. This does not affect printed output.

The books by Cavanaugh and Moye are excellent resources on digital type design for those who want to learn to create or modify fonts.

A note about legalities: most font makers allow you to modify for your own use, or to have modified for you, a font for which you have purchased the appropriate license. A few require the person doing the modification to purchase a copy in addition to the end user, while a few do not allow any modification. So check with the vendor in case of doubt. However, in no case are you legally able to distribute a modified font to others who have not purchased a license for the original. Font development is a painstaking, time-consuming task and developers deserve to be compensated for use of their work, unless they have chosen to make it freely available; some have done so, and we owe them great thanks for their efforts. If a font or keyboard utility is distributed as shareware, you do have an obligation to pay the registration fee if you use the product to do any significant work.

If you are modifying a 256-character font, you will have to decide which characters you will replace with your customized ones, since no more than 256 characters can be accessed at once. If you are working with a Unicode font, you can place your customized characters in the Private Use Area (U+E000–U+F8FF) if there are no codepoints already assigned to them. It is definitely, absolutely not a good idea to put any characters in the codepoints that the Unicode Standard marks as “reserved” and which are shown in the charts with diagonal lines (see the Greek charts on pages 43–45 for examples of this); use the Private Use Area for such characters. In the future, OpenType fonts may make it possible to use the combining diacritics that Unicode provides; see Chapter  for more discussion of this.
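The distinction between the Private Use Area and reserved codepoints can be tested mechanically. The sketch below assumes the `unicodedata` module bundled with Python (which tracks a later Unicode version than the one discussed here); the function `classify` and its labels are hypothetical names used only for illustration.

```python
import unicodedata

PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF  # Private Use Area range cited above


def classify(code_point):
    """Say whether a code point is private use, reserved, or already assigned."""
    if PUA_FIRST <= code_point <= PUA_LAST:
        return "private use"          # safe home for customized characters
    if unicodedata.category(chr(code_point)) == "Cn":
        return "reserved"             # unassigned: do not put glyphs here
    return "assigned"


for cp in (0xE000, 0x03A2, 0x03B1):
    print(f"U+{cp:04X}: {classify(cp)}")
```

U+03A2 is one of the reserved slots in the Greek block (between capital rho and capital sigma), so it makes a convenient test case.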

Using diacritics that are built into a font

If you wish to use your special characters in a database, spreadsheet, or on the web, the customized font route is the only way to go. However, if you are preparing a text for publication in print, and if all you need are combinations of base letters plus accents, you may be able to get the results you want by using a program that allows you to control exactly how characters are placed next to each other. Sophisticated word processors do this to some extent, although you get much greater control with a high-end page layout program such as PageMaker, QuarkXPress, or Adobe InDesign.12

If the font you are using contains (let’s say) an underdot and a macron, you could get an o with an underdot and macron as follows: type the o and then use whatever facilities your program provides to insert special characters to add the underdot and macron. You will then have to move the two diacritics into proper position, usually by kerning each one with a negative value. This process is tedious to carry out many times, so when you are initially typing the material you can use a placeholder character that won’t occur in your document (let’s say o-circumflex). Then create the first instance of the base character + diacritic combination, copy it to the clipboard, and replace all the o-circumflexes with it.

12 The power of page layout programs comes at a price, however. First, unless you have access to one through your institution, you will have to purchase the software, and these programs are expensive. Second, you will have to learn to use the program; this requires a significant investment of time if you want to get professional-looking results. You may wish to employ a service that can import your word-processor file into a page layout program. In either case, check early on to be sure that the font you plan to use has the diacritics you need.

Here’s an illustration. I typed an o and then used Word’s Insert/Symbol command to add the breve and the acute accent: o˘´. I then highlighted the o and used Word’s Format/Font/CharacterSpacing feature to move the breve back over the o; Word calls this “condensed” character spacing and it should be applied to the first character of the pair. Finally, I highlighted the breve and condensed its spacing in order to move the acute back; then highlighted the acute and moved it up (“raised” positioning, in Word’s terms) and the result looks like this: o˘´. Note that your cursor will appear to behave strangely in this scenario, and you may not be able to use the mouse to select things. You will have to use the arrow keys and hold down SHIFT to highlight the item you want, even if you can’t really see what you’re highlighting (it’s still there if you typed it and saw it earlier in the process). Sometimes the cursor may appear not to move when you run it over the letter+diacritic combination. You’ll get used to this.

You can see that this is not something you want to do very often, since it is quite fussy and awkward. You have to keep trying until you find just the right amount to move the accent back or up. And, of course, it only works if the font already contains the accents you need. But, for the occasional rare combination, it’s one possibility. (For example, if you have a Unicode font that is missing the seven precomposed combinations with underdot [see page 8], this would be one way to get the missing items.) It is clearly much better to have any precomposed combination that you will use regularly built into the font.
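Before resorting to manual positioning, it is worth checking whether Unicode already defines a precomposed character for the combination you need. The sketch below is one way to do so, assuming Python’s bundled `unicodedata` module (which reflects a later Unicode version than the one described in this book); the names `COMBINING`, `compose`, and `has_precomposed` are mine.

```python
import unicodedata

# Combining diacritics from the Combining Diacritical Marks block.
COMBINING = {
    "macron": "\u0304",
    "breve": "\u0306",
    "acute": "\u0301",
    "dot below": "\u0323",
}


def compose(base, *marks):
    """Attach the named marks to a base letter and normalize to composed form (NFC)."""
    sequence = base + "".join(COMBINING[m] for m in marks)
    return unicodedata.normalize("NFC", sequence)


def has_precomposed(base, *marks):
    """True if the whole combination collapses to a single precomposed character."""
    return len(compose(base, *marks)) == 1


print(has_precomposed("a", "breve", "acute"))   # a single character exists
print(has_precomposed("o", "breve", "acute"))   # no single character exists
```

If the function reports that no single character exists, the combination has to be built by one of the methods described in this chapter, or by software that can position combining marks.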

Choosing a font

Scholars often have little choice about fonts if they need special characters; one has to take what’s available unless one is willing to customize a font for oneself or have the work done. However, if you do have a choice, either because you are creating your own special characters or because you are able to access the diacritics built into various commercial fonts, you should think about the following factors.

Fonts generally fall into serif and sans-serif categories. Times New Roman is a serif font since it has the little horizontal elements at the top and bottom of such letters as h and M, while Arial is a sans-serif font since it lacks such elements:

h M (serif)    h M (sans serif)

Serif fonts are generally considered more legible for blocks of text such as are regularly found in scholarly books, so you will probably want to choose a serif font for the body text of your publication. It is also a very common practice to use a serif font for the body and a sans-serif font for headings.


You will also probably want to pick a font of a traditional design, such as Garamond or Baskerville, or a modern one such as Palatino whose letterforms are closely related to the traditional faces. These have been used for centuries for books and so convey the sense of solidity and accuracy that one wants in a scholarly publication. There are a great many fonts that work fine in a flyer or advertisement but are not the best choice for scholarly work.

The remarks in the previous paragraph assume that you are preparing a print publication. If you are preparing a document that users will spend long periods of time reading on a computer screen, some additional factors come into play. Examples of such documents include pages to be posted on the web, CD-ROMs containing reference materials, or instructional software. Keep in mind the essential point that computer screens operate at a much lower resolution (less than  dots per inch) than laser printers (normally  or  dpi, at most  dpi) or imagesetters ( or  dpi).

There are two related factors that determine the suitability of type for extended on-screen use: the design of the typeface and the quality of its digital implementation. Many traditional book faces such as Garamond and Caslon were designed with noticeable contrasts between thick and thin strokes as well as subtle curves. Such designs don’t look their best on a low-resolution screen. This is particularly so if the person who digitized the font didn’t know how (or didn’t bother) to create the best, most efficient outlines; unfortunately, this is the case with some of the cheap imitation fonts on the market. A font with a simpler design may stand up better at low resolutions, and some fonts on the market today have been designed specifically for digital use. But any font, whether a traditional book design or new digital face, will not look good on screen unless it has been quite carefully hinted. Such hinting is very time-consuming and many fonts have not been given such treatment. Some of the fonts distributed with Microsoft products (including Times New Roman, Arial, Verdana, and Georgia) do feature this high quality hinting but do not contain all the characters that scholars may need. Compare the appearance of these fonts with the ones you are considering for your project in order to establish a point of reference. In any event, before settling on a typeface for an on-screen document, be sure to test it with a variety of readers for an extended period.

For more on the important topic of choosing appropriate typefaces, see the excellent book by Bringhurst or one of the many available books on desktop publishing.


7. Greek

Overview

Greek grammarians in antiquity developed a system of three accent marks (acute, grave, and circumflex) to be placed over vowels as well as two breathing marks (rough and smooth) for vowels at the beginning of a word.13 In addition, the diaeresis, macron, breve, iota-subscript, and coronis (a contraction sign) are called for on occasion in scholarly work. Given all the possible combinations of these signs, traditional Greek typesetting is not a simple affair. In  the Greek government decreed the end of this traditional polytonic system, which was no longer needed except for classical texts, and replaced it with a monotonic system. Monotonic Greek uses one accent mark, the tonos, similar in shape to the acute (but usually with a slightly more vertical orientation14) plus the diaeresis which may appear on iota or upsilon.

The Mac was the first computer of choice for Hellenists because it provided the ability to use non-Latin fonts long before Windows did, and GreekKeys on the Mac was the first widely used font package for classicists. Windows caught up with the introduction of TrueType fonts in Windows . and there are now several solutions available on this platform. However, there has never been an established standard for either the arrangement of characters in a font or for the entry of characters from the keyboard; Unicode now provides for the former, but the latter remains an issue. Both GreekKeys and WinGreek have been widely used and have to some extent constituted de facto standards; but I strongly recommend that anyone starting out now with Greek text adopt a Unicode-based solution rather than one of the older packages, although the latter still work, since Unicode will certainly be the dominant standard in the future. However, in the interest of completeness, I will provide information on the earlier packages. Sean Redmond has written a utility that will convert from earlier Greek layouts into Unicode; see . This program can convert to and from GreekKeys, WinGreek, and Beta Code and is an excellent way to move your documents into a Unicode format. issue of formatting loss
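To give a sense of what converting a legacy layout to Unicode involves, here is a deliberately small Python sketch that handles a subset of Beta Code (plain letters, the asterisk for capitals, and the common accent and breathing signs). It is not the utility mentioned above and makes no claim to cover the full Beta Code specification; the names `LETTERS`, `MARKS`, and `beta_to_unicode` are my own.

```python
import re
import unicodedata

# Beta Code letter -> Greek letter (lowercase).
LETTERS = dict(zip("abgdezhqiklmncoprstufxyw", "αβγδεζηθικλμνξοπρστυφχψω"))

# Beta Code sign -> Unicode combining mark.
MARKS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex (perispomeni)
    "+": "\u0308",   # diaeresis
    "|": "\u0345",   # iota subscript
}


def beta_to_unicode(beta):
    """Convert a simple Beta Code string to precomposed Unicode Greek."""
    out = []
    pending = None  # marks typed between '*' and the capital letter they belong to
    for ch in beta.lower():
        if ch == "*":
            pending = []
        elif ch in MARKS:
            (pending if pending is not None else out).append(MARKS[ch])
        elif ch in LETTERS:
            if pending is not None:
                out.append(LETTERS[ch].upper() + "".join(pending))
                pending = None
            else:
                out.append(LETTERS[ch])
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    text = unicodedata.normalize("NFC", "".join(out))
    return re.sub(r"σ(?!\w)", "ς", text)  # sigma at the end of a word becomes final


print(beta_to_unicode("mh=nin a)/eide qea/"))
```

The conversion relies on normalization to pick the precomposed characters, so the same few lines serve for every combination of breathing and accent that Unicode provides.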

An excellent source of information on Greek fonts and typesetting is the paper by Yannis Haralambous titled From Unicode to Typography, A Case Study: The Greek Script (available online). Anyone with a serious interest in Greek typography should consult the wonderful book Greek Letters: From Tablets to Pixels which includes a chapter by Jeffrey Rusten on the fonts on personal computers. A standard history of Greek typefaces is that of Scholderer.

13 The purpose of the accents was to indicate rising or falling pitch of a syllable, while the rough breathing indicated an h sound and a smooth breathing its absence. The rules about accents are complex and may be obtained from any Greek grammar. For more about the pitch accent, see Allen, p. 116 ff.

14 The Greek font used to print Version 2 of The Unicode Standard showed vowels with a tonos that was pointing straight up, 90° to the baseline, which is not the usual practice (and has been corrected in Version 3). Some font developers not well versed in Greek typography followed these samples, with the result that you may encounter fonts with such vertical tonoi.

How-to -character Fonts The first widely used package for classicists was GreekKeys. Created by George B. Walsh and Jeffrey Rusten for the Macintosh, and for a while available in a Windows version, it became a kind of stan- dard, particularly among Mac users. It is still available for the Mac; for up to date information, see the FAQ page at .

Allotype Typographics offers some very nice looking Greek fonts, Kadmos and Bosporos, in their own layout and in GreekKeys layout; both Windows and Mac versions are available. See their home page at . The Greek Font Society makes some excellent polytonic fonts. Two of these, GFS Bodoni and GFS Didot, are available from Scholars’ Press in a layout that works with GreekKeys (Scholars Press Software,  Houston Mill Road, Atlanta GA , USA Telephone: (toll free)   ; or    (Fax   ).); samples are available online at .

The Summer Institute of Linguistics (SIL) at makes available a set of fonts and keyboard utilities designed for Biblical scholars who work in Greek, but of course it will work for Attic or as well. It is free and comes in both Mac and Windows versions.

SP Fonts, a collection of freeware fonts for Biblical studies from Jimmy Adair, are available in both Mac and Windows at . The fonts include no precomposed combinations of vowel with accents and breathings, which limits their utility.

WinGreek, by Peter Gentry and Andrew Fountain, supports Greek, Coptic, and Hebrew; it provides fonts and a utility called Beta that enables the user to enter Greek characters and to type Hebrew from right to left. It was written for Windows . and is still available for use on that platform; it does not work under Windows  or later.

Son of WinGreek is an updated version of WinGreek designed to work on Windows 95/98; this is probably the most widely used Windows Greek package at the present time, although many users are now moving to Unicode. Son of WinGreek uses the same fonts and keyboard layout as WinGreek but has been updated in a variety of ways. Several other fonts are also available that follow the WinGreek arrangement. All the various WinGreek/Son of WinGreek materials can be found at .

MultiKey supports both 256-character fonts and Unicode fonts under Windows; see below in the Unicode section for details.


Agfa Monotype , corporate successor to the venerable Monotype Corporation, sells several polytonic Greek fonts in Type 1 format: Porson, New Hellenic, Andale Mono, and several others. As you would expect from this source, the outlines are very high quality and well hinted. The catch is that they come in Monotype’s own encoding; you have to use Fontographer or one of the FontLab products to reencode them into the layout you use (WinGreek, GreekKeys, even Unicode). This, I suppose, reflects the fact that there never was a standardized encoding for polytonic Greek until the arrival of Unicode.

Unicode Solutions

Unicode has been a great benefit for users of polytonic Greek because it finally provides a standardized, internationally recognized method of storing and exchanging Greek text (although some issues remain unresolved; see below). There are currently several packages available that enable users to prepare Unicode Greek text. Each requires, of course, a Unicode-friendly word processor such as Word  or .

It should be clearly understood that, while Unicode provides a standard for fonts, it does not address the issue of keyboard entry. For instance, one can install the keyboard (see Part I) and be able to type the standard Greek letters and the single acute-style accent used in modern Greek, but there is no provision for entering the polytonic accents. Windows  keyboard drivers can access only 256 characters at a time and don’t deal well with Unicode, so specialized programming is still required for polytonic. This situation is improved in Windows 2000, which is Unicode-native (see below and also Chapter ).

As mentioned in Part I (page 9), the first version of Unicode supported only contemporary monotonic Greek along with combining polytonic accents (which no software can yet readily take advantage of). The second version added the precomposed combinations which made polytonic Unicode Greek usable with today’s word processors. As a result, Unicode contains two blocks of Greek characters: the Greek and Coptic (sometimes referred to as Basic Greek) and the Greek Extended areas. Users should be aware that some Unicode fonts that claim to support Greek contain only the Basic Greek characters and so will not be adequate for polytonic users. (Fonts mentioned in this section do contain the necessary Greek Extended characters.) Version 3.0 of Unicode added lowercase forms of the letters stigma, digamma, koppa, and sampi plus the kai symbol. The charts at the end of this chapter show the contents of the two blocks of Greek characters.
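A quick way to tell whether a given text needs a font with the Greek Extended characters is to see which of the two blocks its characters fall in. The following Python sketch assumes the standard block boundaries (Greek and Coptic at U+0370–U+03FF, Greek Extended at U+1F00–U+1FFF) and precomposed text; the function names are illustrative only.

```python
GREEK_AND_COPTIC = range(0x0370, 0x0400)  # "Basic Greek"
GREEK_EXTENDED = range(0x1F00, 0x2000)    # precomposed polytonic combinations


def greek_blocks_used(text):
    """Return the set of Greek block names that the characters of text belong to."""
    blocks = set()
    for ch in text:
        cp = ord(ch)
        if cp in GREEK_AND_COPTIC:
            blocks.add("Greek and Coptic")
        elif cp in GREEK_EXTENDED:
            blocks.add("Greek Extended")
    return blocks


def needs_polytonic_font(text):
    """True if the text uses any character from the Greek Extended block."""
    return "Greek Extended" in greek_blocks_used(text)


print(needs_polytonic_font("μῆνιν"))   # contains eta with circumflex
print(needs_polytonic_font("λόγος"))   # monotonic: omicron with tonos only
```

Text stored with separate combining accents would need a different test, since the accents then live outside both Greek blocks.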

Antioch is a package by Ralph Hancock and Neil Bashoori that includes Unicode fonts and keyboard drivers for Windows 95/98 (no Mac version, unfortunately). It is designed to work with Word  or Word  and installs a small command bar on the Word toolbar that enables one to switch quickly between English and Greek or Hebrew; the Greek portion also supports Coptic. For typing Greek letters, users have the choice of the WinGreek transliterating keyboard or a slightly modified standard modern Greek keyboard; accents and breathings are usually handled through the numeric keypad, although a version for laptop users that relies on the numerals found on the top row of the keyboard can be installed. A demo version can be downloaded from ; registration costs US$. This is an excellent “all in one” product for those making the move to Unicode Greek.


Cardo, the font in which this book is set, contains the complete Unicode Greek character set plus many other characters useful to classicists and medievalists. I have also written a Greek keyboard using Tavultesoft’s Keyman; as soon as Keyman 5 is officially released, I will post it on my web page. See for the latest updates.

Matthew Robinson has written a set of Word macros to enter Unicode Greek characters; see . Another set of similar macros by Manuel Lopez used to be available online but apparently no longer is.

You can download the Athena Roman Unicode font by Jeffrey Rusten from Sean Redmond’s web site .

Multikey is a shareware package that provides support for Greek (both Unicode and several 256-character fonts), Hebrew, and several other languages under Windows. Check out and go to the Multikey link on the left. Multikey was created by Stefan Hagel, author of the unique Classical Text Editor (see page 55).

The Titus Project’s font includes Greek as well as Latin characters; see page 19 for more details.

Production First Software has several multilingual typefaces which include Unicode Greek characters. See .

Unitype offers Global Writer and Global Office. The former is a word processor specifically designed for multilingual applications, while the latter is an add-in for Microsoft Office that enables the user to enter text in a variety of languages easily. Both Greek (modern) and Ancient Greek are supported; fonts and keyboard drivers are included. See the Unitype homepage at .

The Greek company Magenta specializes in fonts, dictionaries, and keyboard products (including add-ins for Word  and ). Their Polytonistis (Πολυτονιστής) package provides fonts and keyboard drivers; these are good products, but the keyboard drivers do follow the old Greek polytonic keyboard arrangement (see next paragraph).

Windows , like its predecessor Windows NT , was designed from the ground up around Uni- code. As a result, it includes some features useful to multilingual users; in particular, its keyboard drivers can select any Unicode character (Windows / keyboards can only select from  characters at a time). Win ships with a polytonic Greek keyboard and the excellent Palatino Linotype font (see a review of this font by Jeffrey Rusten at ; unfortunately, the font does not seem to be available separately, and it does not contain the new characters added in Unicode .).

The Win polytonic Greek keyboard follows the standard (pre-monotonic) Greek typewriter layout with accents on the right side; as a result, it will be most useful to those who have acquired typing habits with this layout, or who work with Greek enough that they want to invest the time

DRAFT FOR COMMENT #2: NOT FINAL 7. greek 41 learning this arrangement. The keystrokes are not mnemonic, and one must learn separate keystrokes for the various combinations (e.g., acute, acute with smooth, acute with rough, etc.). In my opinion, systems such as that found in the GreekKeys Mac keyboard or in Antioch, where the accents “accumulate” (that is, there is one key for a smooth breathing and one for an acute; if the two are typed in sequence, the software chooses the correct precomposed character) are easier to remember. However, if you have Win you don’t have to purchase anything else, which is certainly an advantage.15 It must also be said that if one does invest the time to memorize the keystrokes of the Win driver, typing may be faster because fewer keystrokes are needed. This layout is shown in Appendix . It must be said that the Win layout requires fewer keystrokes than the more mnemonic methods.

Issues with Unicode

There are still a few missing Greek letters in Unicode, including uppercase lunate sigma, alphabetic koppa Ϟ (the koppa glyphs added to Unicode 3.0 have the lightning-bolt shape ϟ used today as a numeral; these alphabetic koppa characters have been proposed for version . of Unicode but not yet officially adopted), uppercase with smooth breathing, uppercase upsilon with smooth breathing and accents, uppercase yod (for all-caps typesetting) and the inverted iota and upsilon with circumflex sometimes used in modern Greek printing. In addition, combinations needed when transcribing inscriptions but not found in classical literary texts (e.g., and omega with circumflex) are not defined in Unicode, nor are acrophonic numerals.

Unicode uses the modern Greek terms for accents. For reference, here is a list of the corresponding English terms (with stress marks to show correct pronunciation):

oxía = acute
varía = grave
perispoméni = circumflex
vrachý = breve
dialytiká = diaeresis
psilí = smooth (British, lenis)
dasía = rough (British, asper)
hypogegramméni = iota subscript
prosgegramméni = iota adscript [lowercase]
áno teleía = Greek colon

Unicode contains several precomposed forms consisting of a capital alpha, eta, or omega plus a lowercase iota adscript (e.g., U+1F88, GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI). The inclusion of such combinations is not absolutely necessary, since they can of course be represented by a capital followed by a normal iota; however, their presence makes it easier for programmers to adjust automatically. According to Haralambous (page ), printers in Greece sometimes print the iota subscript under a capital instead of adscript; this is regarded simply as an alternate method of printing such . Version  of The Unicode Standard showed these characters with the iota subscript, to the surprise of many classicists, although version  shows them adscript and states (page ) that lowercase iota is “normally” written adscript next to a capital. You may encounter some Greek fonts, based on the Version  charts, with the capitals + iota subscript. Those who want both sets of glyphs in one font will have to use the Private Use Area for one or the other, or else use an OpenType font that can accommodate both sets of glyphs.

15 Should you upgrade from Windows 95/98 to Windows 2000? Almost all business-oriented applications that work with Windows 95/98 will also work on Win2000, but there are a few that won’t or that will have to be upgraded, so check the ones that you need before making this decision. However, many games and multimedia applications do not work under Win2000, which makes it less than ideal for some home users. One alternative, if you have the hard disk space, is to set up a dual-boot system so that you can run either Windows 2000 or an earlier version on the same machine. See Livingston and Brown, pages 31–40, or similar books for instructions on how to do this. A new consumer-oriented version of Windows is due out in late 2001 or early 2002. Unlike Windows 95/98/Me, it will be based on the same underlying computer code as Windows 2000 and will, presumably, offer direct Unicode support.
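For readers who program, the following is a minimal sketch of how the precomposed capitals with iota adscript behave under automatic case conversion. It assumes Python 3 and its bundled Unicode data, which are not part of this book's toolset; the helper name `describe` is hypothetical.

```python
import unicodedata


def describe(text):
    """Return 'U+XXXX NAME' for every character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# Small alpha with smooth breathing and iota subscript, and the
# precomposed capital with iota adscript mentioned in the text.
small = "\u1f80"
capital = "\u1f88"

# Lowercasing the precomposed capital gives back the small letter,
# and titlecasing the small letter gives the precomposed capital.
round_trip_ok = capital.lower() == small and small.title() == capital

# Full uppercasing (all caps) produces a capital alpha followed by
# a separate capital iota instead.
all_caps = small.upper()

if __name__ == "__main__":
    print(describe(capital))
    print(round_trip_ok)
    print(describe(all_caps))
```

The point of the sketch is only that a single code point for "capital plus adscript" lets software move between cases without inspecting the following letter.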

Unicode provides separate precomposed combinations for vowels with the accent mark used in modern Greek as well as the acute accent found in polytonic texts (e.g., U+03AC, GREEK SMALL LETTER ALPHA WITH TONOS vs. U+1F71, GREEK SMALL LETTER ALPHA WITH OXIA). This makes it convenient to have one font that can be used for any Greek text, since the monotonic tonos usually has a more vertical shape than the acute accent, even though the word tonos essentially refers to an acute accent, and some fonts do use the same glyph for both, as the Palatino Unicode font does. Haralambous in fact argues that there is no need for any distinction between tonos and acute, but certainly the vowels with tonos are not going to be removed from Unicode at this point. You can see the distinction between tonos and oxia in the charts on the following pages.
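The two code points can be inspected programmatically. The sketch below assumes Python 3 and its `unicodedata` module (an assumption, not something this book relies on); the function name `same_after_nfc` is hypothetical. In the Unicode data shipped with current Python versions the oxia form is canonically equivalent to the tonos form, so normalizing a text replaces one with the other.

```python
import unicodedata

TONOS_ALPHA = "\u03ac"  # GREEK SMALL LETTER ALPHA WITH TONOS
OXIA_ALPHA = "\u1f71"   # GREEK SMALL LETTER ALPHA WITH OXIA


def same_after_nfc(a, b):
    """True if two strings are identical once both are normalized to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    for ch in (TONOS_ALPHA, OXIA_ALPHA):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
    # The raw code points differ, but they compare equal after normalization.
    print(TONOS_ALPHA == OXIA_ALPHA, same_after_nfc(TONOS_ALPHA, OXIA_ALPHA))
```

A practical consequence is that searching or comparing Greek text is safer after both sides have been normalized the same way.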

Unicode encodes alternative forms for a few Greek letters: β/ϐ, θ/ϑ, φ/ϕ, κ/ϰ, π/ϖ, ρ/ϱ, and Y/ϒ (U+03D0–U+03D6 and U+03F0–U+03F2). Some of these were included for compatibility with earlier standards (ϐ and ϒ) and others to provide appropriate forms when the Greek letters are used as technical symbols. The “script” and “curly” rho are simply stylistic alternates, as are the two forms of pi (the “closed” form ϖ is rarely used in American scholarly texts but is frequently seen in Greece and in texts printed in France). I strongly suggest avoiding these alternate forms in Greek texts, since they may not sort, analyse, or display properly; use them only as they were intended, for technical symbols.
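If a text has already been keyed with the symbol forms, they can be folded back to the ordinary letters before sorting or searching. This is a sketch only, assuming Python 3; it relies on the compatibility mappings in Python's `unicodedata` module, and the names `SYMBOL_FORMS` and `fold_symbol_forms` are hypothetical.

```python
import unicodedata

# Symbol variants discussed in the text: beta, theta, phi, kappa, pi, rho,
# and the upsilon with hook.
SYMBOL_FORMS = "\u03d0\u03d1\u03d5\u03f0\u03d6\u03f1\u03d2"

# Map each variant to the ordinary letter given by its compatibility
# (NFKC) mapping; every other character is left untouched.
_FOLD_TABLE = {ord(ch): unicodedata.normalize("NFKC", ch) for ch in SYMBOL_FORMS}


def fold_symbol_forms(text):
    """Replace Greek symbol variants with the regular alphabetic letters."""
    return text.translate(_FOLD_TABLE)


if __name__ == "__main__":
    print(fold_symbol_forms(SYMBOL_FORMS))
```

Only the listed variants are touched; applying NFKC to a whole document would also change unrelated characters, which is why the sketch builds a small table instead.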

(Discussion of alternate characters continues on page 46.)

The charts on the following pages show all the characters in the Greek and Coptic block (page 43) followed by the characters in the Greek Extended block (pages 44–45). These charts have been set up in the same way as those in The Unicode Standard, Version  (pages  and –). The combining accent marks are not shown, since they come in the Combining Diacritics block, not in the Greek block (see Table 1, page 10 for the combining diacritics). Note that the coronis is U+1FBD while the smooth breathing is U+1FBF; U+0374 and U+0375 are the numerical markers. You probably won’t need these charts often, since packages such as GreekKeys or Antioch provide keyboard drivers that take care of the details for you, but they are here just in case.

For modern Greek, also note the drachma sign ₯ at U+20AF.
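Charts of this kind can also be generated on demand. The sketch below assumes Python 3 and its `unicodedata` module; the function name `block_chart` is hypothetical. It reflects whatever Unicode version your Python installation ships with, so it will show characters added after this book was written.

```python
import unicodedata


def block_chart(first_col, last_col):
    """Build a text chart with rows 0-F and one column per 16 code points.

    first_col and last_col are column numbers such as 0x1F0 and 0x1F7
    (the code point without its last hex digit). A dash marks a code
    point that has no assigned character.
    """
    columns = range(first_col, last_col + 1)
    lines = ["    " + "  ".join(f"{col:03X}" for col in columns)]
    for row in range(16):
        cells = []
        for col in columns:
            ch = chr(col * 16 + row)
            assigned = unicodedata.name(ch, "") != ""
            cells.append(f" {ch} " if assigned else " - ")
        lines.append(f"{row:X}   " + "  ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    # First half of the Greek Extended block.
    print(block_chart(0x1F0, 0x1F7))
```

Whether the output lines up neatly depends on the display font; the chart is meant as a quick reference, not as typeset copy.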


Table 4. The Greek and Coptic Block of Unicode. The column heading plus the row digit gives the code point (e.g., column 03B, row 1 is U+03B1); a dash marks an unassigned code point.

     037  038  039  03A  03B  03C  03D  03E  03F
0    -    -    ΐ    Π    ΰ    π    ϐ    Ϡ    ϰ
1    -    -    Α    Ρ    α    ρ    ϑ    ϡ    ϱ
2    -    -    Β    -    β    ς    ϒ    Ϣ    ϲ
3    -    -    Γ    Σ    γ    σ    ϓ    ϣ    ϳ
4    ʹ    ΄    Δ    Τ    δ    τ    ϔ    Ϥ    ϴ
5    ͵    ΅    Ε    Υ    ε    υ    ϕ    ϥ    ϵ
6    -    Ά    Ζ    Φ    ζ    φ    ϖ    Ϧ    ϶
7    -    ·    Η    Χ    η    χ    ϗ    ϧ    -
8    -    Έ    Θ    Ψ    θ    ψ    Ϙ    Ϩ    -
9    -    Ή    Ι    Ω    ι    ω    ϙ    ϩ    -
A    ͺ    Ί    Κ    Ϊ    κ    ϊ    Ϛ    Ϫ    -
B    -    -    Λ    Ϋ    λ    ϋ    ϛ    ϫ    -
C    -    Ό    Μ    ά    μ    ό    Ϝ    Ϭ    -
D    -    -    Ν    έ    ν    ύ    ϝ    ϭ    -
E    ;    Ύ    Ξ    ή    ξ    ώ    Ϟ    Ϯ    -
F    -    Ώ    Ο    ί    ο    -    ϟ    ϯ    -

Marked [3.0] in the chart: U+03D7, U+03DB, U+03DD, U+03DF, U+03E1.
Marked [3.2] in the chart: U+03D8, U+03D9, U+03F4, U+03F5, U+03F6.


Table 5. The Greek Extended Block of Unicode. The column heading plus the row digit gives the code point (e.g., column 1F0, row 1 is U+1F01); a dash marks an unassigned code point.

     1F0  1F1  1F2  1F3  1F4  1F5  1F6  1F7
0    ἀ    ἐ    ἠ    ἰ    ὀ    ὐ    ὠ    ὰ
1    ἁ    ἑ    ἡ    ἱ    ὁ    ὑ    ὡ    ά
2    ἂ    ἒ    ἢ    ἲ    ὂ    ὒ    ὢ    ὲ
3    ἃ    ἓ    ἣ    ἳ    ὃ    ὓ    ὣ    έ
4    ἄ    ἔ    ἤ    ἴ    ὄ    ὔ    ὤ    ὴ
5    ἅ    ἕ    ἥ    ἵ    ὅ    ὕ    ὥ    ή
6    ἆ    -    ἦ    ἶ    -    ὖ    ὦ    ὶ
7    ἇ    -    ἧ    ἷ    -    ὗ    ὧ    ί
8    Ἀ    Ἐ    Ἠ    Ἰ    Ὀ    -    Ὠ    ὸ
9    Ἁ    Ἑ    Ἡ    Ἱ    Ὁ    Ὑ    Ὡ    ό
A    Ἂ    Ἒ    Ἢ    Ἲ    Ὂ    -    Ὢ    ὺ
B    Ἃ    Ἓ    Ἣ    Ἳ    Ὃ    Ὓ    Ὣ    ύ
C    Ἄ    Ἔ    Ἤ    Ἴ    Ὄ    -    Ὤ    ὼ
D    Ἅ    Ἕ    Ἥ    Ἵ    Ὅ    Ὕ    Ὥ    ώ
E    Ἆ    -    Ἦ    Ἶ    -    -    Ὦ    -
F    Ἇ    -    Ἧ    Ἷ    -    Ὗ    Ὧ    -


Table 5 continued.

     1F8  1F9  1FA  1FB  1FC  1FD  1FE  1FF
0    ᾀ    ᾐ    ᾠ    ᾰ    ῀    ῐ    ῠ    -
1    ᾁ    ᾑ    ᾡ    ᾱ    ῁    ῑ    ῡ    -
2    ᾂ    ᾒ    ᾢ    ᾲ    ῂ    ῒ    ῢ    ῲ
3    ᾃ    ᾓ    ᾣ    ᾳ    ῃ    ΐ    ΰ    ῳ
4    ᾄ    ᾔ    ᾤ    ᾴ    ῄ    -    ῤ    ῴ
5    ᾅ    ᾕ    ᾥ    -    -    -    ῥ    -
6    ᾆ    ᾖ    ᾦ    ᾶ    ῆ    ῖ    ῦ    ῶ
7    ᾇ    ᾗ    ᾧ    ᾷ    ῇ    ῗ    ῧ    ῷ
8    ᾈ    ᾘ    ᾨ    Ᾰ    Ὲ    Ῐ    Ῠ    Ὸ
9    ᾉ    ᾙ    ᾩ    Ᾱ    Έ    Ῑ    Ῡ    Ό
A    ᾊ    ᾚ    ᾪ    Ὰ    Ὴ    Ὶ    Ὺ    Ὼ
B    ᾋ    ᾛ    ᾫ    Ά    Ή    Ί    Ύ    Ώ
C    ᾌ    ᾜ    ᾬ    ᾼ    ῌ    -    Ῥ    ῼ
D    ᾍ    ᾝ    ᾭ    ᾽    ῍    ῝    ῭    ´
E    ᾎ    ᾞ    ᾮ    ι    ῎    ῞    ΅    ῾
F    ᾏ    ᾟ    ᾯ    ᾿    ῏    ῟    `    -


Moreover, the shapes of the glyphs in the Unicode charts are not meant to constrain font designers in inappropriate ways. The designer of a typeface for use in Greek text will choose the form or that he/she considers most appropriate given the overall design of the characters; for example, the Vusillus Old Face font uses the open theta and the script kappa in the regular : α β γ δ ε ζ η θ ι κ λ. A font such as Vusillus Old Face thus may have two instances of open theta, one at U+03B8 and another at U+03D1. This illustrates why it is important for designers to have excellent knowledge both of the language for which they are designing a typeface and of Unicode. It is not enough simply to copy the glyphs in The Unicode Standard, which might lead one to believe that closed theta was required for regular Greek text while open theta was always used as a technical symbol.16

The cases of phi and Upsilon require special comment. Version 2 of The Unicode Standard showed the circle-plus-vertical line form φ as the standard for text and the script form ϕ as the scientific symbol; however, technical usage is in fact the opposite, and this was changed in Version . But most existing Unicode fonts still have the original arrangement, and of course either form may be found in a Greek text.

The alternate form of Upsilon (U+03D2–U+03D4) is meant to have asymmetrical arms and was encoded because it was present in a previous standard. The Upsilon that is found in the regular run of the alphabet (U+03A5) may be designed either with straight arms like a Roman Y or with a rams- shape.

Except for theta and phi in mathematical contexts, I would avoid using any of these alternate forms. If you prefer a curly rho or script kappa, then locate a font that uses the shape you like for the regular alphabetic run of letters. However, font designers should include glyphs for all the alternate characters at the appropriate codepoints, even if this only seems to duplicate characters; many do not (e.g., the Palatino Linotype font uses the curly rho at U+03C1, the regular alphabetic rho, and has no glyph at U+03F1, presumably since the curly rho is already in the font). Otherwise users may receive a document and apply the font, only to discover missing glyphs. Since the characters are in the standard, we have to assume people will sometimes use them, even if inadvisedly or unnecessarily. An ideal solution for including alternate glyphs in one font would be to use an OpenType font with alternate letter forms (see Chapter ).

Unicode includes a Greek question mark at U+037E for compatibility with earlier standards, but the regular Latin semicolon is the preferred character. The same applies with the combining coronis (U+0343)—one should normally use U+0313, COMBINING COMMA ABOVE.
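A text that already contains the compatibility characters can be cleaned up with a simple substitution. The sketch below assumes Python 3; the names `PREFERRED` and `use_preferred_characters` are hypothetical. The substitutions match the canonical mappings in the Unicode data bundled with Python, so running a text through NFC normalization has the same effect on these two characters.

```python
import unicodedata

# Compatibility character -> preferred character, as recommended in the text.
PREFERRED = {
    "\u037e": "\u003b",  # GREEK QUESTION MARK -> SEMICOLON
    "\u0343": "\u0313",  # COMBINING GREEK KORONIS -> COMBINING COMMA ABOVE
}

_TABLE = str.maketrans(PREFERRED)


def use_preferred_characters(text):
    """Replace the Greek question mark and combining coronis in text."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    question = "\u03c4\u03af\u037e"  # the word 'ti' followed by U+037E
    cleaned = use_preferred_characters(question)
    print([f"U+{ord(ch):04X}" for ch in cleaned])
    # NFC makes the same replacement for a lone question mark.
    print(unicodedata.normalize("NFC", "\u037e") == ";")
```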

Because the circumflex accent in Greek can take either a tilde shape ˜ or a rounded circumflex shape, Unicode has included the COMBINING GREEK PERISPOMENI (U+0342) in order to provide a character that is distinct from both the French-style circumflex ˆ and the tilde. At the present time, virtually all users of polytonic Greek are using the precomposed forms, so this character is not often needed. But if you are designing an application that uses the combining marks, do use U+0342 instead of the regular combining circumflex. This character is shown with a tilde shape in the Unicode charts, but a font designer may give it either shape.

16 An unfortunate example of this tendency to copy glyphs without adequate understanding occurred with Version 2 of The Unicode Standard, whose chart for monotonic Greek showed vowels with a straight-up accent. This is not the standard design of the tonos and was corrected in Version 3, but you will still find some Unicode Greek fonts with this absolutely vertical tonos.
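The difference between U+0342 and the ordinary combining circumflex shows up as soon as text is normalized. The sketch below assumes Python 3 and its `unicodedata` module; the function name `with_perispomeni` is hypothetical.

```python
import unicodedata

PERISPOMENI = "\u0342"        # COMBINING GREEK PERISPOMENI
LATIN_CIRCUMFLEX = "\u0302"   # COMBINING CIRCUMFLEX ACCENT


def with_perispomeni(letter):
    """Add a Greek circumflex to a letter, precomposed where possible."""
    return unicodedata.normalize("NFC", letter + PERISPOMENI)


if __name__ == "__main__":
    alpha, epsilon = "\u03b1", "\u03b5"
    # Alpha + U+0342 composes to the single precomposed character U+1FB6.
    print([f"U+{ord(ch):04X}" for ch in with_perispomeni(alpha)])
    # Alpha + U+0302 stays as two code points: no precomposed form exists.
    mixed = unicodedata.normalize("NFC", alpha + LATIN_CIRCUMFLEX)
    print([f"U+{ord(ch):04X}" for ch in mixed])
    # Epsilon with circumflex (needed for inscriptions) has no precomposed
    # form, so it remains a base letter plus combining mark.
    print([f"U+{ord(ch):04X}" for ch in with_perispomeni(epsilon)])
```

An application that stores the wrong combining mark will therefore never match the precomposed letters that most polytonic Greek texts contain.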

[Also explain deprecated Greek characters in Unicode 3.0. Also explain spacing diacritics at end of block.]

Five additional Greek characters are included in the forthcoming Unicode .2; they have been accepted by the Unicode Consortium but have not yet been given final approval by ISO. They include the q-shaped koppa (U+03D8, GREEK LETTER ARCHAIC KOPPA and U+03D9, GREEK SMALL LETTER ARCHAIC KOPPA), discussed above, and three additional symbols: U+03F4, GREEK CAPITAL THETA SYMBOL; U+03F5, GREEK LUNATE EPSILON SYMBOL; and U+03F6, GREEK REVERSED LUNATE EPSILON SYMBOL. I have not yet been able to obtain any additional information about these last three additions, but they seem to be technical symbols of no concern to classicists.

A word about the Coptic characters might be in order. The Copts (Egyptian Christians) adopted the Greek alphabet and added several characters to represent sounds that did not occur in Greek; these Coptic-unique characters are located in Unicode at U+03E2–U+03EF. For working in Coptic, you will want to obtain a font in which the Greek characters have been designed with the shapes that are traditional for Coptic rather than for regular Greek text. The Coptic characters in Unicode are unified with the Greek characters rather than being given separate codepoints, as is appropriate since the Coptic alphabet is unquestionably the Greek alphabet with a few additions. However, it is expected that users will employ a font with appropriately shaped glyphs when dealing with Coptic. This is another illustration of the distinction between character and glyph that was mentioned in Chapter 1.


8. Epigraphy

Overview

Epigraphy, the study of inscriptions, requires a number of specialized characters that are not found in conventional computer fonts. In Latin, the most commonly needed characters are the I-longa (a capital I that is taller than usual and signifies a long vowel); the apex, another sign of vowel length that looks like an acute accent or like a sort of hook; the interpunct, a dot to separate words; Roman numerals with lines over them to distinguish them from text; bars over or through certain letters to mark abbreviations; and various ligatures.

For Greek, the most essential characters are the acrophonic numerals plus epsilon and omega with a circumflex. If one wants to reproduce the actual look of inscriptions, as opposed to transcribing them into standard Greek type, one needs a font that reproduces the various early Greek alphabets. I currently know of no such font that is available.

There are several punctuation marks that epigraphers conventionally employ regardless of the language of the text: parentheses for filled-out abbreviations, the underdot to mark characters that are uncertain, brackets for missing letters about which there is no doubt, angle brackets for letters accidentally omitted, double brackets for letters deliberately erased, and several others. Genuine angle brackets 〈a〉 are missing from most computer fonts, although they are present in Unicode (see table below). Using the single guillemets ‹a› (found in standard Windows and Mac fonts) instead of the greater-than and less-than signs may be an acceptable alternative in a text setting. For double brackets, one can of course simply use two regular brackets [[ ]], but some fonts do have specially designed double characters 〚〛 which look nicer.

More information about epigraphy can be found in the books by Woodhead, Gordon, Keppie, Cook, and Page.

Solutions -character Fonts The CL Fonts package contains the most common characters needed for transcribing Latin inscrip- tions; see page 20 for a sample. I know of no other font specifically designed for Latin epigraphy. Standard Windows and Mac fonts contain a raised dot ∙ (ANSI number 3; Mac number , entered with OPTION-) that works very nicely as an interpunct, and regular Latin vowels with an acute do fine for apices (no V with acute in standard fonts, of course).

The SymbolAthenian font that is part of the GreekKeys package has a number of epigraphical symbols including acrophonic numerals. As its name implies, it is designed to be used along with the GreekKeys Athenian font; it also works with the Athena Unicode font, since the Greek characters in that font have the same design as they do in Athenian (they were all designed by Jeffrey Rusten).

Of the punctuation marks, the underdot is the most problematic since it is not found in most fonts; even if there is an underdot character, one encounters problems positioning it appropriately under the various letters. A lowered period could be used. See the section on “Using Diacritics That Are Built into a Font” (page 34) for a discussion of how to position diacritics manually; this is awkward but can work for printed materials.

Unicode Solutions

Unicode defines a number of characters helpful to epigraphers. These include the Roman numerals one, five, and ten thousand plus a reversed  as well as a turned capital F and that work for two of the Claudian letters. The combining turned comma above (U+0312) might be used for the sicilicus.

As discussed in Chapter , Unicode now has a Runic block that will be useful to medieval epigra- phers. A block of Old Italic characters for Oscan, Ubrian, and Etruscan has been proposed for inclusion in Unicode ., U+10300–U+1032F, although this block has not yet received final approval from the ISO. Note that the Old Italic block will be located in Plane  (see ??? for information about this).

Unicode contains two sets of angle brackets, U+2329/232A and U+3008/09. The former are intended as technical symbols, and the latter as punctuation for Asian characters. I use the latter for indicating uncertain text but there is no official consensus yet. Both are included here for completeness.

Table 6. Epigraphic Characters in Unicode.

‹ U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
› U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
〈 U+2329 LEFT-POINTING ANGLE BRACKET
〉 U+232A RIGHT-POINTING ANGLE BRACKET
〈 U+3008 LEFT ANGLE BRACKET
〉 U+3009 RIGHT ANGLE BRACKET
《 U+300A LEFT DOUBLE ANGLE BRACKET
》 U+300B RIGHT DOUBLE ANGLE BRACKET
〚 U+301A LEFT WHITE SQUARE BRACKET
〛 U+301B RIGHT WHITE SQUARE BRACKET
Ↄ U+2183 ROMAN NUMERAL REVERSED ONE HUNDRED (3.0)
∞ U+221E ROMAN NUMERAL ONE THOUSAND = INFINITY
ↀ U+2180 ROMAN NUMERAL ONE THOUSAND
ↁ U+2181 ROMAN NUMERAL FIVE THOUSAND
ↂ U+2182 ROMAN NUMERAL TEN THOUSAND
Ⅎ U+2132 TURNED CAPITAL F (CLAUDIAN )
Ɔ U+0186 CAPITAL LETTER OPEN O (CLAUDIAN BS/PS)
ɔ U+0254 SMALL LETTER OPEN O (SMALL CLAUDIAN BS/PS)
̣ U+0323 COMBINING DOT BELOW
Ạ U+1EA0 LATIN CAPITAL LETTER A WITH DOT BELOW
ạ U+1EA1 LATIN SMALL LETTER A WITH DOT BELOW
Ḅ U+1E04 LATIN CAPITAL LETTER B WITH DOT BELOW
ḅ U+1E05 LATIN SMALL LETTER B WITH DOT BELOW
Ḍ U+1E0C LATIN CAPITAL LETTER D WITH DOT BELOW
ḍ U+1E0D LATIN SMALL LETTER D WITH DOT BELOW
Ẹ U+1EB8 LATIN CAPITAL LETTER E WITH DOT BELOW
ẹ U+1EB9 LATIN SMALL LETTER E WITH DOT BELOW
Ḥ U+1E24 LATIN CAPITAL LETTER H WITH DOT BELOW
ḥ U+1E25 LATIN SMALL LETTER H WITH DOT BELOW
Ị U+1ECA LATIN CAPITAL LETTER I WITH DOT BELOW
ị U+1ECB LATIN SMALL LETTER I WITH DOT BELOW
Ḳ U+1E32 LATIN CAPITAL LETTER K WITH DOT BELOW
ḳ U+1E33 LATIN SMALL LETTER K WITH DOT BELOW
Ḷ U+1E36 LATIN CAPITAL LETTER L WITH DOT BELOW
ḷ U+1E37 LATIN SMALL LETTER L WITH DOT BELOW
Ṃ U+1E42 LATIN CAPITAL LETTER M WITH DOT BELOW
ṃ U+1E43 LATIN SMALL LETTER M WITH DOT BELOW
Ṇ U+1E46 LATIN CAPITAL LETTER N WITH DOT BELOW
ṇ U+1E47 LATIN SMALL LETTER N WITH DOT BELOW
Ọ U+1ECC LATIN CAPITAL LETTER O WITH DOT BELOW
ọ U+1ECD LATIN SMALL LETTER O WITH DOT BELOW
Ṛ U+1E5A LATIN CAPITAL LETTER R WITH DOT BELOW
ṛ U+1E5B LATIN SMALL LETTER R WITH DOT BELOW
Ṣ U+1E62 LATIN CAPITAL LETTER S WITH DOT BELOW
ṣ U+1E63 LATIN SMALL LETTER S WITH DOT BELOW
Ṭ U+1E6C LATIN CAPITAL LETTER T WITH DOT BELOW
ṭ U+1E6D LATIN SMALL LETTER T WITH DOT BELOW
Ụ U+1EE4 LATIN CAPITAL LETTER U WITH DOT BELOW
ụ U+1EE5 LATIN SMALL LETTER U WITH DOT BELOW
Ṿ U+1E7E LATIN CAPITAL LETTER V WITH DOT BELOW
ṿ U+1E7F LATIN SMALL LETTER V WITH DOT BELOW
Ẉ U+1E88 LATIN CAPITAL LETTER W WITH DOT BELOW
ẉ U+1E89 LATIN SMALL LETTER W WITH DOT BELOW
Ỵ U+1EF4 LATIN CAPITAL LETTER Y WITH DOT BELOW
ỵ U+1EF5 LATIN SMALL LETTER Y WITH DOT BELOW
Ẓ U+1E92 LATIN CAPITAL LETTER Z WITH DOT BELOW
ẓ U+1E93 LATIN SMALL LETTER Z WITH DOT BELOW
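As Table 6 shows, only some letters have a precomposed form with dot below; the rest need the combining dot (U+0323). The sketch below, which assumes Python 3 and uses the hypothetical function name `mark_uncertain`, adds the underdot to every letter in a string and lets normalization pick the precomposed character where one exists.

```python
import unicodedata

COMBINING_DOT_BELOW = "\u0323"


def mark_uncertain(letters):
    """Add an underdot to each letter; other characters pass through."""
    marked = []
    for ch in letters:
        marked.append(ch)
        if ch.isalpha():
            marked.append(COMBINING_DOT_BELOW)
    # NFC replaces letter + U+0323 with a precomposed character when
    # Unicode defines one, and leaves the sequence alone otherwise.
    return unicodedata.normalize("NFC", "".join(marked))


if __name__ == "__main__":
    for sample in ("AB", "c", "\u03b1"):
        result = mark_uncertain(sample)
        print(sample, [f"U+{ord(ch):04X}" for ch in result])
```

How well the combining dot is positioned under letters without a precomposed form still depends on the font and the rendering software.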


9. Metrics

Overview

The ancient Greeks organized their poetry through elaborate patterns of long and short syllables, and many of these patterns were copied by the Romans after they became familiar with Greek literature. The most important signs needed for metrics, therefore, are the macron and breve. Also important are the signs for caesura (a pause, often indicated by a double line) and for elision (the dropping of a vowel at the end of a word when followed by a word beginning with a vowel), indicated with an undertie ‿. The advanced study of Greek metrics requires a large number of special symbols; West’s Greek Metre, the standard work in English, contains over , along with a number of abbreviations and signs that are composed of standard letters. A subset of these symbols suffices for printing texts aimed at intermediate students as well as for classical Latin, since Latin metrics are somewhat less complex than Greek. It should also be mentioned that, at a very advanced level, not all metricians use symbols in exactly the same way, which complicates the issue further.

Medieval Latin poetry, on the other hand, is built around patterns of stressed and unstressed syllables and uses the acute accent to mark stressed syllables. [more here--check sources for medieval poetry] In Anglo-Saxon poetry, which is likewise organized around stressed and unstressed syllables, stressed syllables are indicated by an acute and unstressed syllables by the sign × that is used for syllabae ancipites in Greek and Latin.17 The acute and breve may also be printed over a macron that indicates a naturally long vowel.

Such metrical characters are hard to find in computer fonts, and no standard exists for their placement or use.

How-to

Fonts

To my knowledge, there is only one font specially designed for the needs of professional metricians in Classics that is commonly available: Anaxiphorminx by Dr. I. L. Pfeijffer of the University of Leiden. A Mac version can be downloaded from , while a Windows version, converted from the Mac version and posted with Dr. Pfeijffer’s permission, is available at (follow the link to the CL Fonts project and scroll down).

17 The mathematical times sign found in standard fonts is exactly the shape needed and is very useful if you don’t have a special metrical font.


The CL Fonts package contains symbols that are adequate for printing common metrical schemes encountered by intermediate students. It is not designed for advanced students or professional metricians. See page 21 for a sample of poetry printed with CL Fonts. The Symbol Athenian font included in the GreekKeys package also includes a number of common metrical symbols.

The Junicode font from Peter Baker contains metrical symbols for Anglo-Saxon poetry; see page 30.

Printing Scanned Poetry

There are two ways to print scanned poetry: by printing a long or short mark directly over each vowel or by printing all the scansion marks on a separate line above the line of text, as in the following examples:

īllĕ mī pār ēssĕ dĕō vĭdētūr

¯  ˘  ¯  ¯  ¯  ˘  ˘¯  ˘ ¯ ¯
ille mi par esse deo videtur

The second method is, I think, generally preferable because it is much easier to indicate caesurae and because it allows the marking of vowels that are long by nature, if desired, in addition to the metrically long and short syllables; this helps make clearer the distinction between long vowels (a phonemic feature of Latin and Greek) and long or heavy syllables (a feature of prosody). Both methods, however, are in use in scholarly texts, and the first may provide better appearance if the metrical citation is located within a regular run of text. Which method is easier will depend on the fonts and software you have available; for the second method, you either need a font with metrical symbols positioned at the baseline, or will need to adjust line spacing in your word processor or page layout program. You will also need to fuss with the spacing between metrical characters so that they line up as well as they can over the vowels below. In traditional typography, there are a variety of space characters, including the punctuation space, the thin space, and the hair space (Unicode U+2008, 2009, and 200A respectively). If your font and page layout program support these spaces, you might use them to get the metrical characters better lined up. If you are using the first method, you will occasionally need to position a macron over diphthongs such as ae or au; see page 34 for ways to do this. Metrical schemes are also sometimes printed separately, without accompanying text (see page 21 for examples).
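Both layouts can be produced mechanically from a plain line of text and a scansion pattern. The sketch below assumes Python 3; the function names and the ASCII pattern notation ("-" for long, "u" for short) are hypothetical conveniences. It makes two simplifying assumptions: every vowel letter receives exactly one mark (so diphthongs and elisions are not handled), and the second layout is viewed in a monospaced font.

```python
import unicodedata

VOWELS = set("aeiouyAEIOUY")
SPACING = {"-": "\u00af", "u": "\u02d8"}     # spacing macron and breve
COMBINING = {"-": "\u0304", "u": "\u0306"}   # combining macron and breve


def _vowel_positions(text, pattern):
    """Return the index of each vowel, checking it matches the pattern."""
    positions = [i for i, ch in enumerate(text) if ch in VOWELS]
    if len(positions) != len(pattern):
        raise ValueError("pattern must have one mark per vowel")
    return positions


def mark_vowels(text, pattern):
    """First method: put a macron or breve directly over each vowel."""
    marks = dict(zip(_vowel_positions(text, pattern), pattern))
    pieces = [ch + COMBINING[marks[i]] if i in marks else ch
              for i, ch in enumerate(text)]
    return unicodedata.normalize("NFC", "".join(pieces))


def scansion_line(text, pattern):
    """Second method: build a separate line of marks above the text."""
    line = [" "] * len(text)
    for pos, mark in zip(_vowel_positions(text, pattern), pattern):
        line[pos] = SPACING[mark]
    return "".join(line).rstrip()


if __name__ == "__main__":
    verse = "ille mi par esse deo videtur"
    pattern = "-u---uu-u--"
    print(mark_vowels(verse, pattern))
    print(scansion_line(verse, pattern))
    print(verse)
```

With a proportional font the marks of the second layout will drift, which is where the narrow space characters mentioned above come in.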

[give exx of medieval Latin and Anglo-Saxon poetry?]


10. Setting Type

Overview

This section is directed to people in the publishing business who may be called upon to work with books or articles containing text in Greek or Latin as well as to classicists. It does not pretend to be a complete tutorial on typesetting and page layout; there are many excellent books devoted to that purpose available. Rather, its focus is on issues specific to classical languages, the kind of information not readily available elsewhere. However, I would urge people in the academic professions to become knowledgeable about the basics of good typography, since we live in an era when scholars are often called on to supply camera-ready copy. If you are not in a position to learn to do such things yourself, you should seek out a service bureau whose employees are acquainted with the world of scholarly publishing, which is somewhat different from the production of magazines or other popular literature. For suggestions on choosing a typeface, see Chapter .

Before turning to general typesetting issues, however, I want to mention a unique product, the Classical Text Editor by Stefan Hagel (see his page at ). This is designed specifically to help scholars prepare critical editions. It includes features to facilitate the use of apparatus criticus, variant readings, sigla, and many other things along with the production of camera-ready copy. I have not used it myself, but I would suggest that anyone beginning such a project check it out.

Generally typesetters have followed the practices used in their native language when setting type for classical works. Thus books with Latin text produced in England or the United States use English-style quotation marks, those manufactured in Germany use German-style marks, and so forth. Modern editors insert punctuation for the convenience of their readers. This practice is almost universally accepted, although in many cases the punctuation employed indicates the editor’s view of how the text should be interpreted and alert readers sometimes notice situations where a different punctuation would lead to a different meaning. If you are preparing a document for publication but do not have training in the language, you can generally follow good typographical practices in your native language (or the language of the intended audience, if the two are different) and be assured that the results will be acceptable. There are some exceptions, particularly for Greek, which are discussed in the following paragraphs. And, of course, if you are dealing with a project that attempts to reproduce the look of an ancient book or one published, for instance, during the Renaissance, you will be guided by the instructions of the editor. For information about arranging the printing of scanned poetry, see Chapter  above.

For many of the issues discussed below, there are no absolute right or wrong answers. Various publishing houses have to some extent set their own conventions, and you may be instructed to follow one set or another of these. The overriding concern, of course, is to make it as easy as possible for the reader to interpret the text on the page.


Capitalization

In both Greek and Latin, capitalization is sometimes not used at the beginning of each sentence, only at the beginning of each paragraph or chapter; this is because the ancients did not distinguish the beginnings of sentences this way. The practice of capitalizing the first word in a paragraph may show the influence of the large initial letters found in medieval manuscripts (of which the modern dropped capital is a descendant) or may simply be a concession to modern sensibilities which would rebel at seeing a small letter at the opening of a chapter. Proper names are always capitalized, and the treatment of proper adjectives depends on the customs of the country where the book was produced [confirm this]. Latin texts produced in Germany do not capitalize every noun as is done in German. The Chicago Manual of Style specifies that only the first word of a title in Latin should be capitalized.

It is usually not advisable to put extra space after a period or semicolon in electronic texts, although many of us were taught in typing class to hit the spacebar twice after these keys. However, Bringhurst suggests (page ??) that it is useful to put two spaces at the end of a sentence if the following sentence is not capitalized, to help alert the reader to the beginning of a new sentence.

Mixing Languages

Sometimes it is necessary to mix Latin or Greek words into blocks of text in a modern language, for example, if a text contains notes for students. A very common practice is to use boldface for the Latin or Greek words intermingled into English text. This practice clearly separates the two languages for the reader, and is particularly effective with notes that are keyed to words in the Latin or Greek text since the reader can see clearly where each note begins and ends. For an illustration, see [need examples here]. Using bold this way also preserves italics for emphasis or for phrases in modern languages; the only drawback is that a page with a great deal of boldface may appear too dark overall. In this case, the use of a font family that contains a semibold weight (noticeably darker than the regular weight but not as dark as the actual boldface) may be helpful.

Issues with Latin

The ancient Romans used one symbol V that stood for both a vowel sound (modern u) and a semivowel (represented by w in English). Likewise, one symbol I represented both the vowel i and the semivowel /j/ (as in English “yellow”). Textbooks for beginners or intermediate students usually differentiate the first one by using U/u for the vowel and V/v for the semivowel, and the second one with I/i for the vowel and J/j for the semivowel. The use of J/j, while common in American books from the early  century until the s, has recently fallen out of favor. From the strictly logical point of view, there is no reason to drop the one while maintaining the other, since there is no more justification for writing vivit with a v than there is for writing jam with a j. The most scholarly modern treatment is to use u and i in lowercase and V and I in uppercase, which leads to such spellings as uiuit (but Viuit as the first word in a paragraph or VIVIT in all caps). This style is used in the Oxford Classical Text series.

This issue does have some implications for electronic storage of texts. Ideally, perhaps, one would store all words with u and give the user the option to display either as u or v, and likewise for i/j. This could be done with OpenType fonts. It is interesting to note that the texts on the PHI CD-ROM use v (perhaps simply copied from the editions they used as the basis for their texts).
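Converting a text that distinguishes u/v and i/j into the style that does not is a simple character substitution; going the other way is harder, because software would have to know which letters are consonantal. The sketch below, assuming Python 3 and using the hypothetical name `to_oct_style`, handles only the easy direction and is offered as an illustration rather than a finished tool.

```python
# Lowercase v and j become u and i; uppercase U and J become V and I.
_OCT_TABLE = str.maketrans({"v": "u", "j": "i", "U": "V", "J": "I"})


def to_oct_style(text):
    """Rewrite Latin text with u/i in lowercase and V/I in uppercase."""
    return text.translate(_OCT_TABLE)


if __name__ == "__main__":
    for word in ("vivit", "Vivit", "VIVIT", "jam", "Julius"):
        print(word, "->", to_oct_style(word))
```

The same table could be used to normalize a search term, so that a reader typing vivit still finds uiuit in a text stored in the scholarly style.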

Issues with Greek

There are some special considerations for Greek. First, Greek does not use boldface for emphasis; instead, the space between words is sometimes increased, as is done in some modern languages. This convention is normally reserved for modern texts, however, since any indication of emphasis in an ancient text would be an editorial interpolation.

Italics were adopted as a subordinate companion style for Roman text. No such companion face was developed for Greek in traditional typography. Some Greek fonts (chiefly members of large families that also contain Roman characters) do come with italic versions. However, it is not possible to use italics as a complement for a Greek font, such as the familiar Porson Greek used in the Oxford Classical Texts, that is designed with slanted letters.

When printing polytonic Greek, choose your typeface with care. It is essential that the reader be able to distinguish, for example, the smooth and rough breathings without difficulty. Some typefaces are more legible than others when it comes to diacritics, and some do better than others at low resolutions, so you must consider the output device and the printing method also.

11. Sharing Documents with Others

The wide availability of computers and the rise of email and the World Wide Web ought to make it easier for scholars to share information, and this is indeed happening in a variety of ways. However, if one needs to use fonts with characters that are different from the standard Windows or Mac arrangement, things get complicated. This chapter will suggest some ways to deal with these problems.

Direct Sharing of Documents

By “direct sharing” I mean giving someone else a copy of a document either by handing him or her a diskette or by emailing the file as an attachment; no other step, such as posting the document to a web site, is involved.

Exchanging Files on the Same Platform

If the person you are sending the document to uses the same platform (i.e., Mac or Windows) and has the same fonts that you used to create the document, then things should work fine. For documents that use a non-standard  character font, there’s no alternative except to make sure that the recipient has the font (or one that uses the same character arrangement; for example, there are several fonts available that follow the WinGreek layout). Things are a little easier if you’re using Unicode fonts. As mentioned in previous chapters, not all Unicode fonts contain all characters defined in Unicode, especially those added in Version . If your document makes extensive use of such characters, you may need to help the recipient locate a suitable font. If the recipient’s font does not contain a particular character, he or she will see the generic “missing character” glyph instead. This usually looks like a rectangle, but the font designer may employ another shape; for instance, the Palatino Linotype font in which this text is set uses a whorl shape. Seeing the missing character glyph is certainly better than seeing the wrong character, and things will look fine when a font containing the needed characters is applied. Unicode Greek is sufficiently standardized at this point that it should work quite smoothly; only those few relatively rare characters added in Version . or later of Unicode may cause problems.

Note: the comments above apply only to making sure that the right characters appear in the recipient’s document. Using a font different from the one used to create the document will almost certainly cause line breaks, page breaks, etc. to change.

Remember that you can’t just give out copies of fonts unless the font designer has chosen to make the font freely available. If it is legal to pass along a font you used, you can create a compressed archive (such as a ZIP file in Windows or a SIT file on the Mac) that contains both the font and your document to send via email. If the file is large, make sure that the recipient’s email system can handle big attachments. It’s also a good idea, due to current concerns about viruses that can be carried on attachments, to make sure that the recipient is expecting the file before you send it. You should also, of course, have an antivirus program installed and make sure to update its virus definition files regularly.
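
For readers who prefer to script the packaging step, here is a minimal sketch using Python's standard `zipfile` module. The function name `bundle_for_email` and the file names in the commented usage line are hypothetical; the sketch assumes you have already confirmed that the font may legally be passed along.

```python
import zipfile
from pathlib import Path

def bundle_for_email(archive_path, *files):
    """Pack the given files into one compressed ZIP archive.

    Returns the size of the finished archive in bytes, which is worth
    checking against the recipient's attachment limit.
    """
    with zipfile.ZipFile(archive_path, "w",
                         compression=zipfile.ZIP_DEFLATED) as archive:
        for f in files:
            # Store only the file name, not the full folder path.
            archive.write(f, arcname=Path(f).name)
    return Path(archive_path).stat().st_size

# Hypothetical file names, for illustration only:
# size = bundle_for_email("chapter.zip", "chapter.doc", "MyGreek.ttf")
```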

Cross-Platform File Sharing

Sharing a file between a Mac user and a Windows user is harder. To begin with, chances of success are much greater if author and recipient have the same software (e.g., Microsoft Word or PageMaker, both of which exist in Mac and Windows versions). You may also be able to save the document in a particular file format for cross-platform exchange. For example, in Word  for Windows I can save a document as Mac Word . or .. This is not necessary if you are using recent versions on both platforms, such as Word  in Windows and Word  on the Mac, which can automatically read each other’s documents. If transferring files by diskette, use a Windows disk since the Mac can read and write to Windows disks, but the reverse is not true without installing special software. Even if you do have such software, the transfer is more accurate, in my experience, if done with Windows diskettes.

If at all possible, use Unicode-capable applications and Unicode fonts; Unicode characters should translate without problem, although of course the recipient will need a Unicode font with the right characters installed on his or her system. When dealing with -character fonts, Word and other cross-platform applications can normally handle correctly characters that exist in both the standard Windows and Mac sets, such as e-acute, a-circumflex, etc. Characters that exist in one set but not the other, such as the ¼ ½ ¾ that are found in Windows but not on the Mac, will cause problems. If you are customizing a -character font and anticipate sharing documents with users on the other platform, make alterations only to characters that are present in both sets (you can find such characters by consulting the charts at the end of this book).
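
The comparison of the two sets can also be done mechanically. The sketch below assumes that Python's built-in `cp1252` and `mac_roman` codecs are acceptable stand-ins for the standard Windows and Macintosh sets; their tables may differ in small details from a particular system version. The helper name `upper_half` is hypothetical.

```python
def upper_half(codec: str) -> set:
    """Return the characters a one-byte codec assigns to bytes 128-255."""
    chars = set()
    for byte in range(128, 256):
        try:
            chars.add(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            pass  # this byte is unassigned in the character set
    return chars

windows = upper_half("cp1252")     # stand-in for the Windows set
mac = upper_half("mac_roman")      # stand-in for the Macintosh set

safe = windows & mac               # present on both platforms
windows_only = windows - mac
mac_only = mac - windows

print("é" in safe, "â" in safe)                # True True
print(all(c in windows_only for c in "¼½¾"))   # True
```

Characters in `safe` are the ones to alter when customizing a font for use on both platforms; anything in `windows_only` or `mac_only` is likely to cause trouble.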

Using PDF Files

The Portable Document Format (PDF) was invented by Adobe Systems to make it easier to share documents among computer users. It makes use of Adobe’s PostScript language to produce documents that can be viewed on several operating systems: Mac, Windows, or Unix. PDF documents can be read by anyone who downloads the free Acrobat Reader from Adobe’s website. (Adobe Acrobat is the application that creates PDFs; you must purchase it in order to make PDFs, although the Reader, which lets users view your PDFs, is free.) PDF has become a popular file format; even the IRS has downloadable tax forms in PDF on its web site. It is an excellent way to share documents if you can’t be sure that the recipient will have the same software that you used to create the files. You can also add many additional features to PDFs, such as hyperlinks, navigation links (to enable the reader to move quickly from one part of the document to another), and lots of others. Complete instruction in how to prepare PDFs is outside the scope of this book (there are several books available that will teach you this), but I will mention some issues of particular concern to users of classical languages.

NOTE: all information given here is true for version  of Acrobat. I have not yet had a chance to experiment with the more recent version .

If you want to use any characters outside the standard Mac and Windows character sets, you have to deal with font issues. When you create PDFs, you have the option of embedding fonts into the document, and you should do so if they contain non-standard characters (see note 18). You must embed any TrueType fonts in your document by checking the appropriate box in the printer driver when you create the PDF, since Acrobat uses PostScript Type  fonts by default and ignores TrueType fonts unless you specify otherwise. Embedding for PostScript fonts is performed by [need more here; make screenshot and put here]

To keep your PDFs small and minimize download time, you want to embed fonts only if they contain special characters or if you wish to preserve the exact appearance of the document when it is read or printed. Make sure to use at least version  of Acrobat to create your PDF, since version  produced much larger files when fonts were embedded. (If you do not embed fonts, the Acrobat Reader will use standard system fonts such as Times or Helvetica/Arial.)

PDF documents are designed to be cross-platform compatible, and they are as long as only standard Mac or Windows characters are used. Customized -character fonts may work with cross-platform PDFs, or they may not; there is no choice but to experiment. Chances for success are better if the customization is performed only on characters that do exist in both Mac and Windows, as explained above. I have not yet tested Unicode fonts in PDFs. If the font developer named the characters according to Adobe’s conventions, there’s a good chance that they will work. The new version  of Acrobat may also facilitate such cross-platform use of Unicode. (Plea to font developers: do use the Adobe naming conventions; see the Adobe web site [add url here] for details. It makes things simpler in several ways.)

18 TrueType font files contain information that indicates whether or not the font creator has allowed the font to be embedded in documents; if this is not allowed, the font will not embed. You can determine this by checking the font file with Microsoft’s Font Properties Extension or Arjun Mel’s Font Viewer; see note 10, page 27, for information about these products.

Web Publishing

Setting up documents so that they can be read in a standard browser by anyone who visits the web site is a topic of considerable interest to many scholars. It is a terrific way to make information available to a wide audience without the expense and other constraints of traditional publishing. It is also a very complex topic. I began to research it while writing this book and soon realized that doing justice to it would require a large investment of time that would seriously delay the first version of this text. Character and font issues are, as you would expect by now, prominent among the things to be discussed. The next version of this book will include substantial information about web publishing. In the meantime, note that both Internet Explorer and Netscape Navigator can, with the appropriate plugins loaded, directly access PDF files.

Those interested in using Greek on the web should consult Patrick Rourke’s very useful page at . There you will find an excellent discussion of the problems of getting Greek to display properly in web browsers, and the solutions to them, as well as a discussion of most of the available Unicode Greek fonts.

It is worth noting that Perseus and the Suda On Line projects have an option to display Greek as “Unicode.” However, they have chosen to make use of the combining diacritics rather than the precomposed forms. As explained above, given the font technology in use today it is simply not possible to get a good display using combining diacritics. OpenType fonts will solve this problem, but it is likely to be some time before these are widely supported; see Chapter 13 for more about OpenType.
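
The difference between the two representations can be made concrete with Unicode normalization. The sketch below uses Python's standard `unicodedata` module to convert a sequence of base letter plus combining diacritics into the precomposed form (normalization form NFC) and back again (NFD). It is an illustration only and says nothing about how the projects named above actually process their texts; the function name `to_precomposed` is hypothetical.

```python
import unicodedata

def to_precomposed(text: str) -> str:
    """Replace letter + combining marks with precomposed characters
    wherever Unicode defines them (normalization form NFC)."""
    return unicodedata.normalize("NFC", text)

# alpha + combining smooth breathing (U+0313) + combining acute (U+0301)
combining = "\u03b1\u0313\u0301"
precomposed = to_precomposed(combining)

print(len(combining), len(precomposed))   # 3 1
print(f"U+{ord(precomposed):04X}")        # U+1F04
print(unicodedata.name(precomposed))
# GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA

# NFD goes the other way, back to base letter plus combining marks.
assert unicodedata.normalize("NFD", precomposed) == combining
```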

Another very useful site for those interested in electronic publishing is The Stoa: A Consortium for Electronic Publishing in the Humanities, organized by Ross Scaife and Anne Mahoney.

Part III. The Future

12. The Need for Standardization

As I have thought about many issues connected with characters and computers over the last few years, I have concluded that the scholarly professions need to address standardization in two crucial areas: keyboard entry of characters and what to do about characters that are not included in the Unicode standard.

Keyboard Entry

As explained in Chapter , Windows  and  provide only rudimentary facilities for entering extended characters; Windows  does not improve on this, even though Win users have direct access to thousands of Unicode characters. The Mac OS is better, and has provided an Extended Roman keyboard that gives access to most of the Western characters defined in Unicode ., but this by no means includes all the characters of interest to classicists and medievalists. In addition, there has been no standard for entry of polytonic Greek characters (except the old polytonic typewriter layout in Greece, about which see page 40) or of medieval characters such as yogh and wynn.

An ideal keyboard for classicists and medievalists would

• allow easy entry of all modern Latin-script languages plus Greek and Coptic without interfering too much with regular English typing, and also support common publishing characters such as the em-dash, section sign, and curly quotes
• provide maximum consistency among languages; e.g., an acute is entered with the same keystroke in Greek, Latin, or any other language
• attempt to unify all our requirements in one place
• be usable with various technologies; e.g., a standard -character font or a Unicode font with many additional characters. The programming of the keyboard driver would of course be different in these two scenarios, but the keystrokes themselves would remain constant
• take advantage of any symbols that are present on the keyboard (e.g., tilde and grave)
• make all common diacritics available with one keystroke; the AltGr key may be used for less common ones which are visually similar to another (e.g., ^ for circumflex, AltGr-^ for caron)
• have an arrangement that is as mnemonic as possible
• use “accumulating” accents to get combinations such as macron+acute or smooth breathing+circumflex; this reduces the memory burden on the user significantly
• avoid the use of the numeric keypad: laptop users will have problems since they have no keypad, and moving one’s hands from the regular keyboard to the keypad slows down typing considerably
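
The “accumulating” accents in the list above can be sketched in a few lines of Python. This is only an illustration of the principle, not the layout proposed in this chapter: the key assignments in `ACCENT_KEYS` and the function name `type_keys` are hypothetical, and a real keyboard driver would be written with a tool such as Keyman rather than in Python.

```python
import unicodedata

# Hypothetical keystroke assignments, for illustration only.
ACCENT_KEYS = {
    "/": "\u0301",   # combining acute
    "`": "\u0300",   # combining grave
    "_": "\u0304",   # combining macron
    ")": "\u0313",   # combining smooth breathing
    "(": "\u0314",   # combining rough breathing
}

def type_keys(keys: str) -> str:
    """Turn a keystroke sequence into text, accumulating accent keys
    until a letter arrives and then attaching them all to that letter."""
    output = []
    pending = ""                      # combining marks waiting for a letter
    for key in keys:
        if key in ACCENT_KEYS:
            pending += ACCENT_KEYS[key]
        else:
            # Compose letter + marks into one character where one exists.
            output.append(unicodedata.normalize("NFC", key + pending))
            pending = ""
    return "".join(output)            # marks with no following letter are dropped

print(type_keys("_/e"))    # e with macron and acute (U+1E17)
print(type_keys(")/α"))    # alpha with smooth breathing and acute (U+1F04)
```

The user needs to remember only one key per diacritic; combinations follow from typing the accent keys one after another, in the order the marks sit on the letter.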

I have designed a keyboard layout built on the above principles and am currently building a Win driver using Tavultesoft’s Keyman program (the most sophisticated keyboard editor for Windows that I know). Check my web site for more details on this. I would like to circulate the layout in the hope of starting discussions about this important issue; we need to deal with it now, or else we will continue to have a variety of incompatible solutions for a much larger repertoire of Unicode characters. [print this proposal in appendix??]

There is another possibility that applies to Mac users only: to create a customized keyboard driver that builds on the keystrokes in Apple’s Extended Roman driver. For example, Apple’s Extended Roman keyboard defines OPTION-a as the macron (applicable to A/a, E/e, I/i, O/o, and U/u) and OPTION-b as the breve (applicable to A/a, E/e, G/g, O/o, and U/u). A customized version of this keyboard would add macra on Y/y and brevia on I/i and Y/y, and other diacritics could be handled in similar fashion. This preserves a good Mac-centric approach to keyboards and avoids reinventing the wheel for Mac users.

Characters Not Defined in Unicode

The characters that classicists and medievalists need and that are not included in Unicode fall into two categories. The first consists of precomposed combinations of letters and diacritics whose elements are already defined in Unicode; for example, upsilon + smooth breathing for poetry in the Aeolic dialect, Y/y + breve, etc. As explained in Chapter , it is extremely unlikely that any additional precomposed combinations will be added to Unicode in the future.
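
One quick way to find out whether a given combination falls into this first category is to ask a Unicode library to compose it. The sketch below uses Python's standard `unicodedata` module; the function name `has_precomposed` is hypothetical, the capital upsilon is taken as the example letter, and the answers depend on the version of Unicode built into your copy of Python. A few precomposed characters are deliberately excluded from composition by the standard, so a result of False is a strong hint rather than proof.

```python
import unicodedata

def has_precomposed(base: str, marks: str) -> bool:
    """True if NFC composes base + marks into a single code point."""
    return len(unicodedata.normalize("NFC", base + marks)) == 1

BREVE, MACRON, SMOOTH = "\u0306", "\u0304", "\u0313"

print(has_precomposed("a", MACRON))        # True  (U+0101)
print(has_precomposed("y", BREVE))         # False
print(has_precomposed("\u03a5", SMOOTH))   # False (capital upsilon)
```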

The second category consists of items that cannot be formed from characters already in Unicode. Such characters might be added to Unicode. This is a lengthy process, particularly since new proposals must be approved by both the Unicode Consortium and the group that maintains ISO-, the document promulgated by the International Organization for Standardization (ISO) that covers the same character set. Unicode requires evidence that characters proposed for inclusion in the standard are actually in use, and if the characters are accepted the person or organization submitting the proposal must also submit a font to print the proposed characters. Although proposals from individuals are allowed, it is most appropriate for scholarly organizations to come to consensus about which characters are needed in their fields and make the proposal on behalf of their members. Some organizations are already working on such proposals.

The original Unicode space of about , characters, referred to as the Basic Multilingual Plane (BMP), is now almost filled up. As a result, additional scripts, particularly those used by scholars, are now being allocated to a second set of , characters called Plane . (Imagine the original Unicode codepoints as a big sheet of paper; when it is filled, another sheet is placed on top of it, and when that is filled, another. These various sheets of paper are planes.) Characters assigned to these additional planes will be accessed through the use of surrogate characters, special pairs of characters that will allow applications to retrieve characters from Plane  and higher. Etruscan, Gothic, and are some of the scripts that are being implemented in Plane . Information about scripts and characters proposed for inclusion in Unicode may be obtained from ; directions for submitting a proposal (including required documentation for historical characters) are at .
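
The arithmetic behind surrogate pairs is simple enough to show in full. The following Python sketch splits a code point beyond the BMP into the two 16-bit values used to represent it in UTF-16; the function name `surrogate_pair` is hypothetical, and the Gothic letter is chosen only as a convenient example of a character beyond the BMP.

```python
def surrogate_pair(code_point: int) -> tuple:
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points beyond the BMP use surrogates")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)      # top ten bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom ten bits
    return high, low

gothic_ahsa = 0x10330                   # GOTHIC LETTER AHSA
high, low = surrogate_pair(gothic_ahsa)
print(f"{high:04X} {low:04X}")          # D800 DF30

# Python's own UTF-16 encoder produces the same two 16-bit units.
assert chr(gothic_ahsa).encode("utf-16-be") == bytes([0xD8, 0x00, 0xDF, 0x30])
```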

We are left with the questions of what to do until proposals for additional characters can be made and accepted and of how to handle precomposed combinations that are not eligible for addition to the standard. The answer is the Private Use Area. As explained in Chapter , this is a special area of Unicode that has been left open to provide users with a way to make use of characters not defined in the Unicode standard that are required for their particular purposes.
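
Because Private Use Area characters mean nothing without the font (and the agreement) that defines them, it is useful to know whether a text contains any before passing it to someone else. A minimal sketch in Python follows; the function name `private_use_chars` is hypothetical, and the code points in the sample string are arbitrary placeholders, not assignments from any actual proposal.

```python
import unicodedata

def private_use_chars(text: str) -> list:
    """List the distinct Private Use Area characters found in text."""
    found = []
    for ch in text:
        # General category "Co" marks private-use code points,
        # including the BMP range U+E000 to U+F8FF.
        if unicodedata.category(ch) == "Co" and ch not in found:
            found.append(ch)
    return [f"U+{ord(ch):04X}" for ch in found]

sample = "sestertius \ue000 and \ue001, then \ue000 again"
print(private_use_chars(sample))   # ['U+E000', 'U+E001']
```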

Agreement about the Private Use Area

It is of the utmost importance that scholars interested in font issues and their organizations come to a consensus about how to utilize the PUA; failure to do so will result in many unnecessary incompatibilities and roadblocks to the easy exchange of information. The time to do this is now, while scholars are beginning to make use of Unicode. Trying to do so later, after various incompatible fonts have been produced and gone into use, will be an exercise in frustration. One has only to look at the number of Greek fonts with many different encodings to see why this is not a desirable situation.

The fact that some characters initially placed by scholars in the PUA may later be officially accepted into Unicode should not be seen as a problem. If any such characters are accepted, we will of course start using the officially designated codepoints; but given the uncertainties, it is better not to risk tripping over ourselves by failing to devise a plan for the PUA at this time. In fact, the Unicode Consortium encourages organizations to use the PUA while developing their official proposals for characters and waiting for them to be approved. There is plenty of room in the PUA; we do not need to worry about excluding some characters to provide room for others.

It is also important that scholars from different fields should work together. For example, medievalists and classicists should cooperate since they both make use of Latin. Again, failure to do so will make life unnecessarily hard. The various scholarly organizations should give comprehensive proposals their imprimatur and promote the use of such designs by their members.

Both precomposed combinations and unique characters should be included in proposals for the PUA. It is possible that in the future OpenType fonts may make it feasible to use the combining marks already in Unicode, which will make it unnecessary to have separate codepoints for such characters in the PUA (see Chapter ). However, given the great uncertainty about this, it is almost certainly best to include such characters in any proposals that are developed.

I have written the first draft of a proposal that includes all the characters found on the TLG and PHI CD-ROMs plus some epigraphic, medieval and metrical characters. This proposal is much too long to include here, but I welcome inquiries from anyone interested.

13. OpenType

Overview

As mentioned in Chapter , Adobe and Microsoft have jointly developed the OpenType font format. This format is designed to put an end to the “font wars” between PostScript and TrueType, since a font outline in either format can be the basis for an OpenType (OT) font. OT fonts are directly supported in Windows  and will also work under Windows  if Adobe Type Manager, version . or greater, is installed. On the Mac you need ATM . or greater to use OT fonts. The first batch of OT fonts was released by Adobe early in August  and you can see samples on the Adobe web site; search for “OpenType” or “Pro.” The second batch of OT fonts was released in September and includes Minion Pro, an excellent book face which is the only OT font so far with the complete set of polytonic Greek characters.

What OpenType Fonts Can Do

OpenType fonts contain a great many new features. Some of these make it easier for graphics professionals to produce high-quality typography while others support complex scripts (see note 19) or enable users to do things with simpler scripts (Latin, Cyrillic, Greek) that have not been possible before. Here is one example of the advanced typographical features: an application can automatically substitute old-style figures 1234567890, if they are present in the font, in place of the usual lining figures 1234567890, or replace pairs of letters such as fi and fl with the corresponding ligatures. The first application that can take advantage of OT features is InDesign, Adobe’s new page layout program.

It will be the language support features, however, that will probably interest readers of this book the most. All OT fonts are built around Unicode and support an almost unlimited number of characters; the -character limit is gone. OT fonts can also contain information about how characters are placed next to one another, particularly diacritics. This will finally make it possible to use the combining diacritics that are defined in Unicode. For example, instead of having precomposed glyphs for a-macron, e-macron, and so forth, the user would type a vowel followed by the combining macron and the font would place the macron correctly over the vowel, even raising it if necessary to go over a capital letter. If the user typed an i followed by a macron, the font would know enough to use a special narrow macron and place it over a dotless i. Information can also be placed in the font about what should happen if two diacritics are used over one vowel, as in Greek. OT fonts can also contain language-specific features. For example, if a user defined a run of text as being in Coptic (using Word’s Tools/Language/Set Language feature), the font would automatically use Coptic-style glyphs, whereas Greek text in the same document would have the usual Greek letterforms (cf. the discussion of Coptic at the end of Chapter ).

19 In typographic terms, a “complex” script is one in which letters change their form or their position depending on what precedes or follows. Arabic, Devanagari (the script used to write Hindi), and Lao are all complex scripts.

Finally, OT fonts can contain alternative letterforms. For example, in an italic font a type designer might include, in addition to the regular “A,” an “A” glyph with fancy swashes designed to be used as the first letter in a word. An OT-aware application would allow the user to choose the alternate glyph but still keep the underlying character an “A” so that a spell checker wouldn’t flag it as misspelled. This kind of thing could be useful, for example, in regard to the sestertius character that was mentioned in Chapter . We could have one codepoint for the sestertius but the user could select from several possible glyphs. Using alternative glyphs would also be a way to provide, for example, a straight rho and a curly rho in one Greek font while always storing the rho as U+03C1 and avoiding the use of the curly rho Unicode character (see Chapter , page 41 ff., for discussion of this issue).

What’s (Not) Happening Now

Microsoft has made the modifications to Windows that are necessary to support complex scripts via OpenType and is already using this technology in versions of Windows sold in southeast Asia. But in the Windows  that is sold in the West, Microsoft has provided only very limited support for OT fonts: Windows recognizes them and will install them, but provides no support for the high-level things that one can do with them. The code for the higher level features apparently exists in all the versions of Windows  but is turned off except in Asian versions. A software application that chooses to use any of these features (so far, only InDesign) will have to do so entirely on its own, without help from the operating system. I have tried to find out, so far in vain, when such support will be provided in Windows. This is a frustrating situation for anyone whose needs are not already supported through well-implemented international standards, whether they are Western scholars or users of minority languages in various parts of the world who need support for diacritics in Latin-script languages.

Apple Computer has created a new component of the Mac OS called ATSUI (Apple Type Services for Unicode Imaging), which will provide support for complex scripts and other refinements in cooperation with Unicode and OpenType fonts. So far no commercial applications take advantage of this feature of the Mac OS.

Despite the frustrations, this is an interesting and hopeful time for anyone interested in type and in greater support for a variety of languages. Things are progressing, even if more slowly than we might wish. Stay tuned.

Part IV. Resources

Sources of Information

Lists and Such

In addition to the Works Cited on the following pages, the following are useful sources:

Luc Devroye of McGill University in Montréal maintains a very extensive collection of links to font resources, including many Greek fonts: .

If you are seriously interested in Unicode, an excellent source of information is the Unicode mailing list. You can subscribe through the Unicode web site. There is also a mailing list for OpenType. [give subscription info]

Other Miscellaneous Information

UniPad

One alternative to using Word for inputting Unicode text is UniPad from Sharmahd Computing. This is a plain text editor (i.e., no fancy layouts or fonts) but you can customize the keyboard to suit yourself and it is specifically designed to work with Unicode. Once the text is keyed in, you can import the file into a word processor or page layout program to prepare it for printing.

Works Cited

Allen, W. Sidney. Vox Graeca: The Pronunciation of Classical Greek. rd edition . Cambridge: Cambridge University Press.

Bringhurst, Robert. The Elements of Typographic Style. nd edition . Vancouver, BC and Point Roberts, WA: Hartley and Marks Publishers. An absolutely superb book that teaches one how to look at type and suggests the strong and weak points of many common typefaces. One thing that differentiates it from some other books is Bringhurst’s knowledge of non-English characters; he treats Greek fonts on pages – and provides a glossary with many useful entries. Some of the material is too advanced for beginners (e.g., the mathematical proportions of page layouts), but this is easily skipped. Another strong point of this book for scholars is its focus on traditional book typography (many discussions of desktop publishing devote much attention to magazines or advertisements).

Cavanaugh, Sean. Digital Type Design Guide. Hayden Books, , ISBN: -0-0-. OP, but an updated edition is in preparation as of August . This is a useful book covering all aspects of digital type.

Cook, B.F. Greek Inscriptions. Berkeley: University of California Press, . Volume  in the Reading the Past series. Designed for a less advanced reader than Woodhead, but still useful and nicely illustrated.

Gordon, Arthur E. Illustrated Introduction to Latin Epigraphy. Berkeley: University of California Press, . A very scholarly and thorough treatment of the subject; a standard resource.

Graham, Tony. Unicode: A Primer. Foster City, CA: M&T Books, an imprint of IDG Books Worldwide, . Provides just about everything you could want to know about Unicode and related topics in a form that is more accessible to beginners than The Unicode Standard, the official publication.

Haralambous, Yannis. From Unicode to Typography, A Case Study: The Greek Script. Available online at . This is a three-megabyte download, so don’t try it with a slow modem, but it is full of information about Greek typesetting practices that is hard to find elsewhere. You need the free Adobe Acrobat reader on your system to read this document (see Chapter ).

Keppie. Understanding Roman Inscriptions. [need rest of bibliography] Aimed at a less specialized audience than Gordon, but still very useful.

Livingston, Brian and Brown, Bruce. Windows 2000 Secrets. Foster City, CA: IDG Books Worldwide, .

Macrakis, Michael S., ed. Greek Letters: From Tablets to Pixels. New Castle, Delaware: Oak Knoll Press, .

Moye, Stephen. Fontographer: Type by Design. MIS Press, . OP. Although written as a companion to Macromedia’s Fontographer program, this book provides a great deal of information about how to create a digital typeface. It is worth the search for anyone who wants to learn how to digitize fonts.

Page, R.I. Runes. Berkeley: University of California Press, . Volume  in the Reading the Past series.

Scholderer, Victor. Greek Printing Types 1465-1927. London: [need publisher], . OP. By the designer of New Hellenic, the Greek face used at the Cambridge University Press.

Unicode Consortium. The Unicode Standard, Version 3. Reading, Massachusetts: Addison-Wesley, . The full text of the standard, as well as the character tables, are now available from the web site .

Verbrugghe, Gerald P. “Transliteration or Transcription of Greek.” Classical World volume , number  (July/August ), pp. –.

West, M.L. Greek Metre. Oxford: The Clarendon Press, . West’s Introduction to Greek Metre () uses the same symbols in a book aimed at intermediate students.

Woodhead, A.G. The Study of Greek Inscriptions. nd edition, now apparently OP. Cambridge: Cambridge University Press, . The standard scholarly work on the subject.

Appendices

Appendix 1: The Macintosh Character Set

This chart shows the standard Macintosh character set in numerical order, with the keystrokes. Examples of how to type in a character:

O-r means hold down the OPTION key and type an “r” (don’t type the hyphen)
SO-8 means hold down the SHIFT and OPTION keys together and type an “8”
Ou-SA means hold down OPTION and type a “u”, let go, and hold down SHIFT and type an “A”

Ä  128  Ou-SA       †  160  O-t         ¿  192  SO-/        ‡  224  SO-7
Å  129  SO-A        °  161  SO-8        ¡  193  O-1         ·  225  SO-9
Ç  130  SO-C        ¢  162  O-4         ¬  194  O-l         ‚  226  SO-0
É  131  Oe-SE       £  163  O-3         √  195  O-v         „  227  SO-W
Ñ  132  On-SN       §  164  O-6         ƒ  196  O-f         ‰  228  SO-R
Ö  133  Ou-SO       •  165  O-8         ≈  197  O-x         Â  229  Oi-SA
Ü  134  Ou-SU       ¶  166  O-7         ∆  198  O-j         Ê  230  Oi-SE
á  135  Oe-a        ß  167  O-s         «  199  O-\         Á  231  Oe-SA
à  136  O`-a        ®  168  O-r         »  200  SO-\        Ë  232  Ou-SE
â  137  Oi-a        ©  169  O-g         …  201  O-;         È  233  O`-SE
ä  138  Ou-a        ™  170  O-2            202  non-br sp   Í  234  Oe-SI
ã  139  On-a        ´  171  Oe-spcbr    À  203  O`-SA       Î  235  Oi-SI
å  140  O-a         ¨  172  Ou-spcbr    Ã  204  On-SA       Ï  236  Ou-SI
ç  141  O-c         ≠  173  O-=         Õ  205  On-SO       Ì  237  O`-SI
é  142  Oe-e        Æ  174  SO-’        Œ  206  SO-q        Ó  238  Oe-SO
è  143  O`-e        Ø  175  SO-O†       œ  207  O-q         Ô  239  Oi-SO
ê  144  Oi-e        ∞  176  O-5         –  208  O- -           240  SO-k (Apple logo)
ë  145  Ou-e        ±  177  SO-=        —  209  SO- -       Ò  241  O`-SO
í  146  Oe-i        ≤  178  O-,         “  210  O-[         Ú  242  Oe-SU
ì  147  O`-i        ≥  179  O-.         ”  211  SO-[        Û  243  Oi-SU
î  148  Oi-i        ¥  180  O-y         ‘  212  O-]         Ù  244  O`-SU
ï  149  Ou-i        µ  181  O-m         ’  213  SO-]        ı  245  SO-B
ñ  150  On-n        ∂  182  O-d         ÷  214  O-/         ˆ  246  SO-i¹
ó  151  Oe-o        ∑  183  O-w         ◊  215  SO-v        ˜  247  SO-n²
ò  152  O`-o        ∏  184  SO-P        ÿ  216  Ou-y        ¯  248  SO-,
ô  153  Oi-o        π  185  O-p         Ÿ  217  Ou-SY       ˘  249  SO-.
ö  154  Ou-o        ∫  186  O-b         ⁄  218  SO-1        ˙  250  O-h
õ  155  On-o        ª  187  O-9         ¤† 219  SO-2        ˚  251  O-k*
ú  156  Oe-u        º  188  O-0         ‹  220  SO-3        ¸  252  SO-z*
ù  157  O`-u        Ω  189  O-z         ›  221  SO-4        ˝  253  SO-g*
û  158  Oi-u        æ  190  O-’         fi  222  SO-5        ˛  254  SO-x*
ü  159  Ou-u        ø  191  O-o†        fl  223  SO-6        ˇ  255  SO-t*

¹ SO-N under System 6
² SO-M under System 6
* not available under System 
† Euro symbol € in System 8.5 and later.


Appendix 2: The Windows Character Set

To enter an upper-order character, make sure that NUM LOCK is turned on, then hold down ALT and type a zero followed by the three-digit number shown before the character you want. Type on the numeric keypad, not the row of numerals at the top of the keyboard. See Chapter  for more about entering characters.

128  €*              160  nonbrk space    192  À    224  à
129  [not used]      161  ¡               193  Á    225  á
130  ‚               162  ¢               194  Â    226  â
131  ƒ               163  £               195  Ã    227  ã
132  „               164  ¤               196  Ä    228  ä
133  …               165  ¥               197  Å    229  å
134  †               166  ¦               198  Æ    230  æ
135  ‡               167  §               199  Ç    231  ç
136  ˆ               168  ¨               200  È    232  è
137  ‰               169  ©               201  É    233  é
138  Š               170  ª               202  Ê    234  ê
139  ‹               171  «               203  Ë    235  ë
140  Œ               172  ¬               204  Ì    236  ì
141  [not used]      173  -               205  Í    237  í
142  Ž*              174  ®               206  Î    238  î
143  [not used]      175  ¯ macron        207  Ï    239  ï
144  [not used]      176  °               208  Ð    240  ð
145  ‘ curly quote   177  ±               209  Ñ    241  ñ
146  ’ curly quote   178  ²               210  Ò    242  ò
147  “ curly quote   179  ³               211  Ó    243  ó
148  ” curly quote   180  ´               212  Ô    244  ô
149  • bullet        181  µ               213  Õ    245  õ
150  – en-dash       182  ¶               214  Ö    246  ö
151  — em-dash       183  ·               215  ×    247  ÷
152  ˜               184  ¸               216  Ø    248  ø
153  ™               185  ¹               217  Ù    249  ù
154  š               186  º               218  Ú    250  ú
155  ›               187  »               219  Û    251  û
156  œ               188  ¼               220  Ü    252  ü
157  [not used]      189  ½               221  Ý    253  ý
158  ž*              190  ¾               222  Þ    254  þ
159  Ÿ               191  ¿               223  ß    255  ÿ

* Added in Windows 98SE (to be confirmed).
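The ALT-plus-number method described above can be imitated in a few lines of Python. This is a hedged sketch rather than a description of how Windows itself works: it assumes that Python's cp1252 codec corresponds to the chart, and the function name alt_code_char is invented for the example. Positions marked [not used] in the chart have no character in that codec, so the function returns an empty string for them.

```python
def alt_code_char(keypad_digits: str) -> str:
    """Character produced by ALT plus a zero-prefixed keypad number such as '0233'.

    Assumption: Python's 'cp1252' codec matches the chart above.
    """
    if not (keypad_digits.startswith("0") and keypad_digits.isdecimal()):
        raise ValueError("expected a zero followed by digits, e.g. '0233'")
    code = int(keypad_digits)
    if code > 255:
        raise ValueError("code must not exceed 255")
    try:
        return bytes([code]).decode("cp1252")
    except UnicodeDecodeError:
        # Position marked [not used] in the chart
        return ""


if __name__ == "__main__":
    for digits in ("0198", "0233", "0129"):
        print(digits, repr(alt_code_char(digits)))
```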


Appendix 3. The Windows 2000 Polytonic Greek Keyboard

This is the layout of the polytonic keyboard that ships with Windows 2000. As stated in Chapter 7, it is based on the traditional, pre-monotonic typewriter layout used in Greece, with some additions. It accommodates both monotonic and polytonic accents. The keystrokes are not mnemonic, although some make sense given what is printed on the keys of a Greek keyboard. There are some duplications, as the chart below shows. Remember that the AltGr key is the right Alt key.

The alphabetic keys on a Greek typewriter/computer keyboard are analogous to those of a U.S. keyboard with the following exceptions:

    V/v = Ω/ω
    U/u = Θ/θ
    J/j = Ξ/ξ
    W/w = Σ/ς
    Q/q = ¨ / ‘

Modern
    ;          tonos
    :          diaeresis (also AltGr-Q)
    AltGr-;    diaeresis + tonos
    W          diaeresis + tonos

Accents alone
    q          acute
    ]          grave
    [          circumflex
    :          diaeresis (also AltGr-Q)
    {          iota subscript
    -          macron
    _          breve
    `          diaeresis + acute
    AltGr-`    diaeresis + circumflex
    ~          diaeresis + grave

Breathings alone
    ’          smooth ()
    ”          rough

Breathings with accents
    AltGr-’    smooth + iota subscript
    /          smooth + acute
    AltGr-/    smooth + acute + iota subs.
    ?          rough + acute
    AltGr-?    rough + acute + iota subs.
    \          smooth + grave
    AltGr-\    smooth + grave + iota subs.
    |          rough + grave
    AltGr-|    rough + iota subscript
    =          smooth + circumflex
    +          rough + circumflex
    AltGr-=    smooth + circumflex + iota subs.
    AltGr-+    rough + circumflex + iota subs.

Other ways to get a diaeresis: Q, AltGr-Q
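One way to picture how such accent keys combine with a letter is to treat each key as standing for one or more Unicode combining marks and to let Unicode normalization build the finished character. The sketch below is an illustration under that assumption only; it makes no claim about how the Windows 2000 keyboard driver actually works. The names DEAD_KEYS and compose are invented for the example, and only a handful of the keys from the chart are included.

```python
import unicodedata

# Unicode combining marks used in polytonic Greek
SMOOTH, ROUGH = "\u0313", "\u0314"
ACUTE, GRAVE, CIRCUMFLEX = "\u0301", "\u0300", "\u0342"
IOTA_SUB = "\u0345"

# A subset of the accent keys from the chart (hypothetical table for illustration)
DEAD_KEYS = {
    "/": SMOOTH + ACUTE,
    "AltGr-/": SMOOTH + ACUTE + IOTA_SUB,
    "?": ROUGH + ACUTE,
    "\\": SMOOTH + GRAVE,
    "|": ROUGH + GRAVE,
    "=": SMOOTH + CIRCUMFLEX,
    "+": ROUGH + CIRCUMFLEX,
    "[": CIRCUMFLEX,
    "]": GRAVE,
    "{": IOTA_SUB,
}


def compose(accent_key: str, letter: str) -> str:
    """Combine an accent key with a Greek letter into one precomposed character."""
    marks = DEAD_KEYS[accent_key]
    # NFC normalization merges the letter and its marks where Unicode allows it
    return unicodedata.normalize("NFC", letter + marks)


if __name__ == "__main__":
    for key, letter in (("/", "α"), ("AltGr-/", "α"), ("+", "α"), ("[", "ω")):
        result = compose(key, letter)
        print(key, letter, result, unicodedata.name(result))
```

Printing the Unicode name alongside each result makes it easy to confirm that the breathing, accent and iota subscript all ended up on the letter.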


Appendix 4. Classical Keyboards

Below are printed two suggested keyboard layouts, one for Latin and one for Greek. The two keyboards

I. The Latin Keyboard

Character         Keystrokes  Comments

A. Diacritics
acute             '           USIK, but some interference with regular typing; use AltGr-' [USIK]?
grave             `           USIK and visually logical
circumflex        ^           USIK and visually logical
caron             AltGr-^     alternative/visually related version of item above
diaeresis         :           don’t use USIK double-quote because it interferes with regular typing
tilde above       ~           the regular tilde; visually logical
tilde through     AltGr-~     alternative/visually related version of item above
macron            /           also use = in Greek keyboard (GreekKeys); too much disruption of regular typing for non-Latinists?
breve             ?           shifted version of the above
line through      AltGr- -    logical, unless needed for macron
line over         AltGr- _    shifted version of item above
cedilla           ,           does not interfere with regular typing, since comma is always followed by a space
ogonek            AltGr-,     visually related to above; a “reversed” cedilla, so mapped to AltGr-,
hook over         !           = Latin epigraphy sicilicus; no interference with regular typing, since ! is always followed by a space (except when used to represent a click in African languages)
dot under         AltGr-.     USIK uses this for …
dot over          AltGr->     shifted version of item above
ring              @           visually logical, sort of
double acute      AltGr-"     visually logical (USIK uses for umlaut)


Character         Keystroke   Comments

B. Latin chars.
eth               AltGr-D/d   USIK and visually logical
thorn             AltGr-T/t   USIK and visually logical
o-slash           AltGr-L/l   USIK and visually logical, plus leaves the slash free for other uses
sharp S           AltGr-s     USIK and visually logical
long s            AltGr-f     visually logical
a-ring            @-A/a       USIK uses AltGr-W/w
Euro              AltGr-e     this is the position preferred by the European Union and used in many European keyboards; USIK has it on AltGr-5
tironian et       AltGr-7/&   check the Irish national keyboard; allows for two different forms

C. Punctuation & Symbols
section           AltGr-S     USIK
paragraph         AltGr-P     USIK uses AltGr-;
bullet            AltGr-B     uppercase for consistency with the previous two
double guillemets AltGr-[/]   USIK
single guillemets AltGr-{/}   shifted form of above
en-dash           AltGr-n     used between figures
em-dash           AltGr-m
long dash         AltGr-M     used to introduce quoted text in French and Spanish; ≠ em-dash
upside down !     AltGr-!     used in Spanish; ! is SHIFT-1
upside down ?     AltGr-?     used in Spanish; ? is SHIFT-/
©                 AltGr-c     USIK and visually logical
®                 AltGr-r     USIK and visually logical

D. Ligatures
                  + as prefix includes æ and œ as well as epigraphical ligatures; ff-type ligatures can be handled by the GSUB table as well, if one desires them automatically
Æ/æ               +AE/+ae     USIK has AltGr-Z/z, which is not logical
Œ/œ               +OE/+oe     not on USIK
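To make the proposal concrete, the sketch below simulates a few of the keystrokes in the tables above. It is an illustration only, resting on two assumptions not spelled out in the tables: that a diacritic key is typed before the letter it modifies, and that it acts as a diacritic only when a letter follows (which is why a comma followed by a space stays a comma). The names DIACRITIC_KEYS, LIGATURES and type_keys are invented for the example, and only unshifted, non-AltGr keys are covered.

```python
import unicodedata

# Diacritic keys from section A, mapped to Unicode combining marks
DIACRITIC_KEYS = {
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    ":": "\u0308",  # diaeresis
    "~": "\u0303",  # tilde above
    "/": "\u0304",  # macron
    "?": "\u0306",  # breve
    ",": "\u0327",  # cedilla
    "@": "\u030a",  # ring
}

# Ligatures from section D, typed with + as a prefix
LIGATURES = {"AE": "Æ", "ae": "æ", "OE": "Œ", "oe": "œ"}


def type_keys(keys: str) -> str:
    """Turn a sequence of keystrokes into text under the assumptions above."""
    out = []
    i = 0
    while i < len(keys):
        key = keys[i]
        if key == "+" and keys[i + 1:i + 3] in LIGATURES:
            # Prefix + followed by a two-letter ligature name
            out.append(LIGATURES[keys[i + 1:i + 3]])
            i += 3
        elif key in DIACRITIC_KEYS and i + 1 < len(keys) and keys[i + 1].isalpha():
            # Diacritic key followed by a letter: combine and normalize
            out.append(unicodedata.normalize("NFC", keys[i + 1] + DIACRITIC_KEYS[key]))
            i += 2
        else:
            # Anything else is passed through unchanged
            out.append(key)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for sample in ("/am/o", "+AEneas", "c+aelum", "a, b"):
        print(sample, "->", type_keys(sample))
```

A sequence such as "and/or" shows the cost noted in the comments on the macron key: under these assumptions the slash would combine with the following "o".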


II. The Greek & Coptic Keyboard

The Greek national keyboard is followed for all keys Α–Ω and α–ω. The only ones that do not appear on keys that are analogous to Latin characters are:

    Σ/ς on W/w
    Ψ/ψ on C/c
    Θ/θ on U/u
    Ω/ω on V/v
    Ξ/ξ on J/j
    :/; on Q/q (cf. issue of which Unicode value to use)

It would also be possible to use a transliterating keyboard as in WinGreek, although this is not implemented here since we propose the Greek national keyboard as the basis for our standard.

Character         Keystroke   Comments

A. Diacritics
acute             '           USIK, but some interference with regular typing; use AltGr-' [USIK]?
grave             `           USIK and visually logical
circumflex        ^           USIK and visually logical
diaeresis         :           don’t use USIK double-quote because it interferes with regular typing
tilde above       ~           the regular tilde; visually logical**
tilde through     AltGr-~     alternative/visually related version of item above**
macron            /           also use = in Greek keyboard (GreekKeys); too much disruption of regular typing for non-Latinists?
breve             ?           shifted version of the above
line through      AltGr- -    logical, unless needed for macron
line over         AltGr- _    shifted version of item above
dot under         AltGr-.     USIK uses this for …
dot over          AltGr->     shifted version of item above**
rough/dasia       [           visually logical; slight interference with regular typing
smooth/psili      ]           visually logical
tonos             ;           the position on a standard Greek keyboard
koronis           AltGr-'     missing from Win2000 Greek Polytonic keyboard, ut videtur
iota subscript    AltGr-ι     alternative/visually related form of iota (same keystroke as AltGr-i)

** These characters are not needed for Greek, as far as I know; need to confirm this. If so, there is no need to implement in the Greek keyboard; remove blue shading from these accents in the Latin keyboard.


B. Add’l Greek Characters
question mark ;   q           the position on a standard Greek keyboard
single dot colon  Q           the position for a modern two-dot colon on a standard Greek keyboard; we substitute the single-dot ano teleia
two dot colon :   W           replaces the extra Sigma from the standard Greek keyboard
lunate sigma      AltGr-C/c   uppercase is not in Unicode 3.0
digamma           AltGr-W/w   phonetically logical, not visually
koppa numeric     AltGr-Z/z   sort of visually logical, given the shape of this glyph in Unicode
koppa alphabetic  AltGr-Q/q   visually logical; this char. proposed but not yet in Unicode
sampi             AltGr-P/p
stigma            AltGr-T/t
yod               AltGr-J/j   uppercase is not in Unicode 3.0
numeric low       AltGr-N     bigger numbers = CAPITAL
numeric high      AltGr-n     smaller numbers = lowercase
kai ligature      AltGr-7     lowercase of &; analogous to Tironian et sign on Latin keyboard

Coptic shei       AltGr-X/x   ϣ   not logical but not much left
Coptic fei        AltGr-F/f   ϥ
Coptic khei       AltGr-K/k   ϧ
Coptic hori       AltGr-H/h   ϩ
Coptic gangia     AltGr-G/g   ϫ
Coptic shima      AltGr-O/o   ϭ   vaguely resembles the shape; not much else left
Coptic dei        AltGr-D/d   ϯ

C. Punctuation & Symbols
section           AltGr-S     USIK
paragraph         AltGr-P     USIK uses AltGr-;
bullet            AltGr-B
double guillemets AltGr-[/]   USIK; same as on Greek national keyboard
single guillemets AltGr-{/}   shifted form of above
en-dash           AltGr-n     used between figures
em-dash           AltGr-m
long dash         AltGr-M     is this ever used in Greek?? confirm

D. Ligatures
                  + as prefix if any are needed for Greek

Appendix 5. ISO Language Codes

ISO standard 639-1 specifies codes for languages using a two-letter format; to be eligible for one of these codes, a language must meet a number of stringent criteria which show it to be widely used and have official government sponsorship. The more recent ISO 639-2 uses a three-letter format and requires a much more modest level of documentation; it includes a number of historical languages. For additional information (including application requirements) and codes not listed here, see the ISO 639 web site at .

This table is alphabetized by the name of the language, which may not always begin with the same letter as the code. ISO 639-2/B is the bibliographic code and ISO 639-2/T is the terminology code. ISO 639-1 is the Alpha-2 code. [To do: explain these more; explain use on the web as well as in OT, etc.]

639-2/B  639-2/T  639-1  Language Name (English)
arc      arc             Aramaic
cop      cop             Coptic
eng      eng      en     English
enm      enm             English, Middle (1100-1500)
ang      ang             English, Old (ca.450-1100)
fre      fra      fr     French
frm      frm             French, Middle (ca.1400-1600)
fro      fro             French, Old (842-ca.1400)
gla      gla      gd     Gaelic (Scots)
ger      deu      de     German
gmh      gmh             German, Middle High (ca.1050-1500)
goh      goh             German, Old High (ca.750-1050)
nds      nds             Low German; Low Saxon; German, Low; Saxon, Low
got      got             Gothic
grc      grc             Greek, Ancient (to 1453)
gre      ell      el     Greek, Modern (1453-)
heb      heb      he     Hebrew
gle      gle      ga     Irish
mga      mga             Irish, Middle (900-1200)
sga      sga             Irish, Old (to 900)
lat      lat      la     Latin
non      non             Norse, Old
nor      nor      no     Norwegian
nob      nob      nb     Norwegian Bokmål
nno      nno      nn     Norwegian Nynorsk
oci      oci      oc     Occitan (post 1500); Provençal
pro      pro             Provençal, Old (to 1500)
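As a small illustration of how the codes might be used, the sketch below looks up a language by its English name and returns a code suitable for tagging text, for example in the lang attribute of a web page. It follows one common convention, assumed here rather than stated in this appendix: use the two-letter ISO 639-1 code where one exists, and otherwise fall back on the three-letter ISO 639-2/T code. The names LANGUAGES and language_tag are invented for the example, and only a few rows of the table are included.

```python
# (639-2/B, 639-2/T, 639-1 or None, English name): a few rows from the table
LANGUAGES = [
    ("cop", "cop", None, "Coptic"),
    ("fre", "fra", "fr", "French"),
    ("ger", "deu", "de", "German"),
    ("got", "got", None, "Gothic"),
    ("grc", "grc", None, "Greek, Ancient (to 1453)"),
    ("lat", "lat", "la", "Latin"),
]


def language_tag(name_prefix: str) -> str:
    """Return the 639-1 code if the language has one, else the 639-2/T code."""
    for _bibliographic, terminology, alpha2, name in LANGUAGES:
        if name.startswith(name_prefix):
            return alpha2 or terminology
    raise KeyError(name_prefix)


if __name__ == "__main__":
    for language in ("Latin", "Gothic", "Greek, Ancient"):
        tag = language_tag(language)
        print(f'{language}: <span lang="{tag}">...</span>')
```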
