Advancing the Encoding Model for Book Pahlavi Letters
Total Page:16
File Type:pdf, Size:1020Kb
L2/21-090 2021-04-23 Advancing the encoding model for Book Pahlavi letters Anshuman Pandey [email protected] April 23, 2021 1 Introduction This document aims to provide a model for representing the letters of the Book Pahlavi script. Combining signs and numbers are not discussed here. 2 Description of the Letters The basic repertoire for Book Pahlavi contains the following 14 letters: ரளஷிூ்ௐௗ aleph beth gimel, he waw, zayin kaph lamedh mem, samekh pe sadhe shin taw heth daleth, ayin, qoph yodh nun, resh It has 11 supplemental letters, which are alternate and historical forms: ழஹீுைோௌನ ‘curled’ ‘old’ ‘old’ final ‘stroked’ ‘looped’ ‘hooked’ ‘old’ ‘tall’ ’Indian’ ‘curled’ gimel, daleth kaph ‘old’ lamedh lamedh lamedh lamedh samekh samekh shin daleth, kaph yodh The repertoire also contains the following 3 atomic ligatures, which behave as letters: X1 X2 yʾ There names for letters are based on Aramaic analogues; however, there is no formal convention for de- scribing alternate forms. The descriptors given above, such as ‘old’, ‘curled’, etc., have been assigned by the proposal author as a means for convenient reference. 1 Advancing the encoding model for Book Pahlavi letters Anshuman Pandey 2.1 Directionality The script is written from right to left, with lines that advance from top to bottom. Letters are written along a baseline, which is not readily apparent, but may be identified as • the bottom of ர, ழ, , ௌ, • the resting spot for the heads of ி, ௐ, , , ௗ • the cross-bar of and ். The following shows the alignment of all letters with the baseline (gray). The head-height is measured by the tops of these letters, while the below-base is determined by ி, , ௗ, etc. The swash terminals of ள, ஹ, ு run just under the below-base level, in order to accomodate letters that are written within the terminal. ௗನௐ்ௌோைூீுிஹஷழளர 2.2 Joining behavior Book Pahlavi is a cursive abjad script whose letters are dual-joining, right-joining, or non-joining: dual-joining ನ்ோைூஷ ழர right-joining ௗௐௌுிஹள non-joining ீ During joining, a letter may be represented using a contextual or modified form, which is determined by its position within the cursive string, by adjacent letters, and in many cases, by both. In the example below, the first line shows a string of isolated letters, the second line shows the contextual forms of those letters, and the third line shows the shaped forms of the letters: ௦ௗள ரர ௦ரூழர ரர ௦ரரர ௦ௐரௗ ௦ௗఞ ఒఔಢ ௦ఢఐ ఒఔಢ ௦హఔఔಢ ௦ಗಅఐௗಢ ௦ௗఞ ఒఔಢ ௦హఢఐ ఒఔಢ ௦హఔఔಢ ௦ಗಅఐಫಥ <wštʾsp′ šʾhʾn′ šʾh w ʾylʾn′ šʾh bwt′> wištāsp šāhān šāh ud ērān šāh būd Wištāsp was the king of kings and the king of the Iranians. The table below shows the contextual forms of Book Pahlavi letters, grouped according to joining patterns as analyzed by the proposal author. Two tables follow that, which show the joining behavior of dual-joining and right-joining letters. 2 Advancing the encoding model for Book Pahlavi letters Anshuman Pandey Xn Xf Xm Xi beth ள ఝ- — — ‘old’ daleth ஹ ஹ- — — kaph ி ౄ- , - — — ‘old’ kaph ீ ే- — — ‘hooked’ lamedh ோ - -- - ‘old’ lamedh ௌ - — — Minimal or no change samekh - - - - in shape ಐ ಏ ಎ ‘tall’ samekh ಇ- -ಇ- -಄ ‘Indian’ samekh ಔ- -ಓ- -ಒ taw ௗ ௗ- — — X1 - — — X2 - — — y-h ligature - — — waw-nun-ayin-resh - , హ- — — Height zayin -ా , ఼- ి, -ా , -఼- ి, -ా , - adjustments for below-base lamedh ூ - , ౌ- , -- , -ొ- , -్ , -ై or baseline connections ‘stroked’ lamedh ౡ- , ౝ- ౢ, -ౠ- , -- ౢ, - , -ౙ ‘looped’ lamedh ை ౭- , ౩- ౮, -౬- , -౧- ౮, -౪ , - - Vertical he ష- , ಃ — — positioning of body - mem-qoph ் ಀ- , ౿- -౺- , ౼- , -ಃ ౸ , - Truncation of pe ௐ ಗ- , ಖ- , ಕ- — — strokes, or no change sadhe - , ಘ- — — aleph-heth ர ఒ- ఔ- , -క- , -- ఓ , -క , -ఐ gimel-daleth-yodh త ధ- -ళ- , -ద- -ల , -థ Descent of terminal ‘curled’ gimel-daleth-yodh ழ ణ- భ- , -ప- , -డ- బ , - , -ఠ shin ಝ- ಧ- , ಣ- , -ಜ- , -ಛ- ಧ- , ಢ , -ಚ ‘curled’ shin ನ ಝ- ಧ- , ಣ- , -ಠ- ಧ- , ಢ , - 3 Advancing the encoding model for Book Pahlavi letters Anshuman Pandey Dual-joining letters ் ோ ூ ழ ர ಔఐ ಝఐ ಆఐ ಃఙ ఐ ౝఐ ౌఐ ఽఐ ధఓ ఒఓ ர ಜక ణఓ ణఐ ಔఠ ಝఠ ಆఠ ಃన — ౝఠ ౌఠ ఽఠ ణఠ ఒబ ழ ಇ ణథ ణబ ధబ ಔ ಝ ಆ ಃూ — ౝ ౌ ఽ ణ ఒ ಞా ణి ఒి ಔై ಝై ಆై ಃౘ — ౝై ౌై ఽై ణై ఒై ூ ಞ్ ధ ఒ ಔౙ ಝౙ ಆౙ ಃ — ౝౙ ౌౙ ఽౙ ణౙ ఒౙ ಞ ధౢ ఒౢ ————————— ఒ ோ ಔ ಝ ಆ ౿ — ౝ ౌ ఽ ణ ఒ ் ಐ ధ ఒ౸ ణ౸ ధ౸ ಔ಄ ಞ಄ ಇ಄ ಃ — ౡ಄ ಄ ృ಄ ణಈ ఒಈ ౝಎ ౌಎ ఽಎ ధಈ ణಎ ధಎ ಔಒ ಝಒ ಆಒ ౿ಒ — ౝಒ ౌಒ ఽಒ ణಒ ఒಒ ಔಚ ಞಧ ಆಚ ౿ಚ — ౝಢ ౌಢ ఽಢ ణಢ ఒಢ 4 Advancing the encoding model for Book Pahlavi letters Anshuman Pandey Right-joining letters ௗ ௐ ె ி ௌ ஹ ள ಫగ ಕక ಕక ేఐ ౄఐ ఐ హఐ శఙ వఐ ఝఐ ர ಙఐ ಖఐ ಯఐ ಗక ಫయ ಕ ಕ ేఠ ల — హఠ శన వఠ ఝఠ ழ ಙఠ ಖఠ ౄథ హథ ಗ ౄఠ ಯథ ಫు ಕా ಖ ే ౄ — హ శూ వ ఝ ಙ ಗా ా ಕా ಫ ಕ్ ಖై ేై ౄై — హై శౘ వై ఝై ூ ಙై ಗ్ ಫౣ ಕ ಖౙ ేౙ ౄౙ — హౙ శ వౙ ఝౙ ಙౙ ಗ ಪ ಙ ಖ ే ౄ — హ ష వ ఝ ் ಫ౹ ಫಊ ಕ಄ ಗ಄ ేಎ ಄ — హಎ శ వಎ ఝಎ ಪಎ ಙಎ ಖಎ ౄಎ ಕ಄ ಪಒ ಙಒ ಖಒ ేಒ ౄಒ — హಒ షಒ వಒ ఝಒ ಫಥ ಙಚ ಖಚ ేಚ ౄಚ — హಚ షಚ వಚ ఝಚ ಕಧ ಕಧ 5 Advancing the encoding model for Book Pahlavi letters Anshuman Pandey 3 Summary of Graphemes The Book Pahlavi letters may be categorized as those that have ‘simple’ or ‘complex’ graphemes. A summary of the distinctive graphemes for each letter, as shown in the above tables, is provided below: Simple Letters Complex Letters beth ள aleph-heth ఓ , క , ఐ ‘old’ daleth ஹ a-y lig waw-nun-ayin-resh , gimel-daleth-yodh ల , థ zayin ి , ా , ‘curled’ gimel-daleth-yodh బ , , ఠ kaph ி he ( + ்) ‘old’ kaph ீ samekh (ழ + ழ) final ‘old’ kaph ு ‘tall’ samekh ( + ) lamedh , ్ , ై ‘Indian’ samekh (త + బ) ‘stroked’ lamedh ౢ , , ౙ pe ಕ , ௐ ‘looped’ lamedh ౮ , ౪ , sadhe ಕ , ‘hooked’ lamedh ோ shin ಧ , ಢ , ‘old’ lamedh ௌ ‘curled’ shin ನ mem-qoph ் taw ௗ X1 X2 y-h ligature ’Simple’ letters have graphical identities that are largely preserved while joining, despite elongation or short- ening. ‘Complex’ letters may resemble sequences of other letters, or may be obscured when joined with certain adjacent letters, resulting in ambigious representations. The above graphemes could be used to reproduce Book Pahlavi text from transliterated sources. If a user understand basic rules of the script, these characters could also be used unambigiously to reproduce a Book Pahlavi document as encoded text. The accuracy of the latter, however, is dependently on the care and precision with which a source was written or printed. 6 Advancing the encoding model for Book Pahlavi letters Anshuman Pandey 4 Complexities of the script The shaping behavior of Book Pahlavi letters presents several challenges for developing an encoding model. Some palaeographically distinct letters resemble shaped sequences of other letters: ர aleph-heth ధ + త daleth-gimel-yodh + daleth-gimel-yodh he + ் mem-qoph + height-adjusted waw-nun-ayin-resh samekh ழ + ழ ‘curled’ daleth-gimel-yodh + ‘curled’ daleth-gimel-yodh ‘tall’ samekh + descending ‘curled’ daleth-gimel-yodh + descending ‘curled’ daleth-gimel-yodh ‘Indian’ samekh ೠ + ೳ descending ‘curled’ daleth-gimel-yodh + daleth-gimel-yodh ನ ‘curled’ shin ர + ೳ descending ‘curled’ daleth-gimel-yodh + aleph-heth Some palaeographical letters have distinctive forms in some sources, while in other sources they may re- semble shaped sequences of other letters: ‘curled’ shin ఒబ ர + ೳ descending ‘curled’ daleth-gimel-yodh + aleph-heth + short descending ‘curled’ daleth-gimel-yodh + back-sloping aleph- heth Some letter combinations have alternate, multiple valid representations, which may occur concurrently. Some forms are used in preserved spellings. Usage of a particular form generally cannot be predicted. aleph + gimel-daleth-yodh దఓ , డఓ , దఐ , డఐ , పఐ (medial) aleph + shin ಝఐ , ಜక aleph + pe ಖఐ , ಕక , ಗక gimel-daleth-yodh + aleph-heth ఒబ , gimel-daleth-yodh + kaph ల , ౄథ , ౄఠ gimel-daleth-yodh + pe ಖఠ , ಗ , ಕ zayin + aleph ఒ , ఒి zayin + kaph ౄ , ా lamedh + aleph ఒై , ఒ 7 Advancing the encoding model for Book Pahlavi letters Anshuman Pandey lamedh + pe ಖై , ಗ్ lamedh + sadhe ಙై , ಗ్ mem-qoph + waw-nun-ayin-resh హ , samekh + pe ಗ಄ , ಕ಄ shin + pe ಖಚ , ಕಧ shin + sadhe ಙಚ , ಕಧ Some letter sequences may be analyzed in multiple ways: త + ர aleph-heth + daleth-gimel-yodh ர + త daleth-gimel-yodh + aleph ಕక ௐ + ர aleph + pe + ர aleph + sadhe ಇ + ழ ‘curled’ daleth-gimel-yodh + samekth త + samekh + medial daleth-gimel-yodh ಕ ௐ + త daleth-gimel-yodh + pe + త daleth-gimel-yodh + sadhe హ + ನ shin + waw-nun-ayin-rest + ர + ழ daleth-gimel-yodh + aleph + waw-nun-ayin-rest ಕ಄ ௐ + samekh + pe ௐ + + descending ‘curled’ daleth-gimel-yodh + descending ‘curled’ daleth-gimel-yodh + pe + samekh + sadhe + ழ + ழ daleth-gimel-yodh + daleth-gimel-yodh + sadhe ಕಧ ௐ + shin + pe + shin + sadhe 8 Advancing the encoding model for Book Pahlavi letters Anshuman Pandey 5 Challenges for Defining the Encoding Model In a typical ‘palaeographic’ encoding model for cursive joining scripts, all of the distinctive letters of a script would be encoded as characters, while the contextual forms would be handled at the font level. While many Book Pahlavi letters have straight-forward rules for joining, which are predictable, there are a handful of letters that have multiple representations when they are adjacent to certain other letter.