PERSPECTIVES ON TECHNOLOGY

Hebrew in Bits and Bytes: An Introduction to Coding and Formatting of Hebrew Electronic Resources

Heidi Lerner

During the past two decades, many Hebraic language electronic resources became available in both Hebrew and Latin scripts. These texts, databases, and bibliographic tools either required researchers to transliterate (romanize) the text using Latin characters, or to use a proprietary and stand-alone software program that displays Hebrew characters using special fonts and add-on features. Romanization of Hebrew is problematic at best: There are many different schemes in use today, and to provide correct vocalization, a strong knowledge of Hebrew grammar is required. Most of us have experienced great frustration in trying to locate Hebraica materials in library catalogs and periodical indexes that are in the Latin alphabet only. Academic journals and encyclopedias vary in their requirements for transliteration of Hebraica. Diacritics are required to represent certain Hebrew characters. Additional diacritics are employed to represent special consonants employed by the many and diverse Jewish languages written in Hebrew characters.

At the same time, users of Hebrew script software have been faced with difficulties when attempting to share their work with anyone, communicate via e-mail, or transfer files between programs and operating systems. Today’s rapidly evolving research environment requires that Jewish studies specialists know the basics of multilingual and multiscript computing.

ASCII versus Unicode

A computer records text as a sequence of numbers in binary form. One of the earliest standards for numerically encoding the Latin alphabet was the American Standard Code for Information Interchange (ASCII). This 7-bit code (a “bit” is a single “binary digit” with a value of 0 or 1) only covered 128 characters, consisting of the English alphabet, numbers, punctuation and some symbols. ASCII was later extended to 8 bits to include accented characters for other Western European languages. Other international standards were developed to include character sets such as Greek and Hebrew. However, 8-bit character encoding was limited to 256 characters, and these standards were usually inadequate when users wanted to work in more than one language at once. Applications using such standards are forced to switch between character sets to obtain characters or symbols not provided by the normally used default set. To make matters more complicated, earlier DOS and Apple programs used character sets that did not comply with international standards.

Unicode Version 4.0 encodes over 95,000 characters, covering most modern and historic scripts. Unicode has the potential to encode over a million characters. The Unicode Standard also gives specifications for the presentation of bi-directional text: Hebrew, Arabic, etc., are properly output as right-to-left.

Hebrew and MS Windows

Unicode support is provided in Windows 2000 and Windows XP, and, in a more limited scope, in Windows 98, NT4, and ME. What this means is that users can create and disseminate documents that are directly readable, searchable, and printable in Hebrew and Latin scripts. Scholars can cut and paste Hebrew text directly from Unicode-based resources into Word, send Hebrew e-mail in Outlook and Outlook Express, and mix scripts within documents. Windows XP and Windows 2000 also support bi-directionality, allowing users to use most software both from left-to-right and right-to-left (provided that the programs allow for bi-directional use). In addition, Microsoft Proofing Tools offers special editing tools for Hebrew: thesauri, spelling and grammar checkers, a translation dictionary, and specialized fonts. The other more proprietary and often incompatible formats do not allow the same ease for interchanging data between databases and application software. Hebrew support is not yet available for Macintosh versions of Internet Explorer or Office. The Netscape 7 Web browser is Unicode-compatible.

Most of the Hebrew fonts bundled with Windows (Times New Roman, Arial, Tahoma, Courier New, Arial Unicode, Lucida Sans Unicode, David, and Miriam) do not support cantillation (te’amim) or even some of the non-standard nikud. For these characters, special Unicode fonts are required that support Hebrew fully. Some that are already available include SIL Ezra Hebrew Unicode Fonts (freeware produced and distributed by the Summer Institute of Linguistics [www.sil.org/computing/catalog/show_software.asp?id=76]), and Code2000 and Code2001 fonts (shareware, produced by James Kass [home.att.net/~jameskass/code2001.htm]).

Hebraica Resources

A number of important electronic resources in Jewish studies now use Unicode. These include: the most recent editions of the Bar-Ilan Judaic Library, some publications from Mechon Mamre, the Penn/Cambridge Genizah Fragment Project based at the University of Pennsylvania’s Schoenberg Center for Electronic Text and Image, and bibliographic databases including the Eureka interface to the RLIN database, the Index to Hebrew Periodicals (IHP), the Index to Periodicals in Jewish Studies (RAMBI), and the Israel Union Catalog (ULI) and Union List of Serials (ULS) in Israeli libraries. Unfortunately, many other electronic resources still rely on older 7-bit and 8-bit encoding. These include the Historical Dictionary of the Hebrew Language, Otzar ha-Poskim, Takdin, Bibliography of the Hebrew Book, Dead Sea Scrolls Electronic Reference Library, and the Henkind Talmud Text Databank. One hopes that publishers of these resources will adopt the Unicode standard, a step that would greatly enhance their scholarly utility.

DEFINITIONS:

Bi-directional Display (BIDI): The process or result of mixing left-to-right oriented text and right-to-left oriented text in a single line.

Character: The minimal unit of encoding for a character set. A character often corresponds to a single graphic sign of a writing system, e.g., a letter or a punctuation mark.

Character Set: A table that assigns codes to characters so that the characters can be stored and manipulated in computer applications.

Code point: A numerical index (or position) in an encoding table used for encoding characters.

Diacritic: A small mark added above, below, or after a base character to change its pronunciation.

Encoding: The process of assigning characters to available code points so that the characters can be represented in computer applications.

Font: A collection of glyphs used for the visual depiction of character data.

Glyph: An image used in the visual depiction of characters. Often, for a given font, there is a one-to-one relationship between an encoded character and a glyph. But in languages with complex writing, one character may correspond to several glyphs, or several characters to one glyph.

Logical order: Order in which characters are typed on a keyboard.

Nikud/Te’am: See “Diacritic.”

Visual order: Order of characters as they are presented for reading.

FAQs: What do I do if...

Hebrew appears backwards or displays as gibberish in Internet Explorer? Open the View menu and choose Encoding. If Hebrew (Windows) or Hebrew (ISO-Logical) is selected, click on Hebrew (ISO-Visual). If Hebrew (ISO-Visual) is selected, click on Hebrew (Windows). You can also try Hebrew (ISO-Logical).

Hebrew appears backwards or displays as gibberish in Netscape 7? Open the View menu and choose Character Coding. If Hebrew (Windows-1255) or Hebrew (ISO-8859-8-I) is selected, click on Hebrew Visual (ISO-8859-8). If Hebrew Visual (ISO-8859-8) is selected, click on Hebrew (Windows-1255). You can also try Hebrew (ISO-8859-8-I).

I need to insert a special diacritic or symbol in Word? Select a Unicode font, all of which offer a full array of diacritics. You can:
(A) Click where in the document you want to insert the character. Open the Insert menu and click Symbol. Click the Special Characters tab and double-click the character you want to insert.
(B) Use Character Map by opening Start menu/Programs/Accessories/System Tools/Character Map. Windows 2000/XP users may check Advanced View; set Character Set to Unicode; and group by Unicode Subrange. Next, choose Hebrew to display the full array of Hebrew characters in the selected font. Double-click on the selected character, or highlight a character and click on Select. The character(s) can then be pasted into Word.

Heidi Lerner is the Hebraica/Judaica Cataloger at Stanford University.

Resources:

1. Unicode home page: www.unicode.org
2. Hebrew Computing on Windows (Web site, maintained by Tsuguya Sasaki): www.jewish-languages.org/windows.html
3. Issues in the Representation of Pointed Hebrew in Unicode (3rd draft, Peter Kirk, August 2003): www.qaya.org/academic/hebrew/Issues-Hebrew-Unicode.html
4. Enabling International Support in Windows 2000: www.microsoft.com/globaldev/handson/user/2kintlsupp.mspx
5. Working with Non-Roman Script Text in MS Windows Applications: www.lib.umich.edu/area/Near.East/NonRomanDemo.pdf
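The ideas of code point, encoding, and logical versus visual order, along with the "gibberish" and "backwards" symptoms in the FAQs, can be illustrated with a short Python sketch. It is not part of the original article. The helper names (code_points, misread, can_encode, to_visual_order) are hypothetical and chosen only for illustration; the codec names cp1255, iso8859_8, latin-1, and utf-8 are those of Python's standard library. The visual-order helper is deliberately naive: it assumes a single run of Hebrew letters with no numerals, Latin text, or diacritics, and it is not an implementation of the Unicode bi-directional algorithm.

```python
# The Hebrew word "shalom" in logical (typing) order: shin, lamed, vav,
# final mem. Escapes keep this source file plain ASCII.
SHALOM = "\u05e9\u05dc\u05d5\u05dd"


def code_points(text):
    """List the Unicode code point of each character as 'U+XXXX'."""
    return [f"U+{ord(ch):04X}" for ch in text]


def misread(text, written_as, read_as):
    """Encode text with one character set, then decode it with another."""
    return text.encode(written_as).decode(read_as)


def can_encode(text, codec):
    """Report whether every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def to_visual_order(logical):
    """Naive logical-to-visual conversion: reverse a pure Hebrew-letter run.

    Applying it twice restores the original order.
    """
    return logical[::-1]


if __name__ == "__main__":
    # Each character has one Unicode code point, whatever the byte encoding.
    print(code_points(SHALOM))

    # An 8-bit Hebrew character set stores one byte per letter;
    # UTF-8 stores two bytes per Hebrew letter.
    print(SHALOM.encode("cp1255").hex(), SHALOM.encode("utf-8").hex())

    # "Gibberish": Hebrew bytes read with a Western European character set.
    print(misread(SHALOM, "cp1255", "latin-1"))

    # "Backwards": text stored in visual order but displayed as logical.
    print(code_points(to_visual_order(SHALOM)))

    # A vowel point (qamats, U+05B8) exists in Windows-1255 but not in
    # ISO-8859-8, which has no nikud.
    print(can_encode("\u05b8", "cp1255"), can_encode("\u05b8", "iso8859_8"))
```

Switching the browser's encoding menu, as the FAQs describe, corresponds to changing the read_as argument of misread until the bytes are decoded with the character set they were written in.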
