Advancing the Encoding Model for Book Pahlavi Letters

L2/21-090 2021-04-23

Anshuman Pandey
April 23, 2021

1 Introduction

This document aims to provide a model for representing the letters of the Book Pahlavi script. Combining signs and numbers are not discussed here.

2 Description of the Letters

The basic repertoire for Book Pahlavi contains the following 14 letters:

ர aleph, heth
ள beth
ஷ gimel, daleth, yodh
஺ he
஻ waw, ayin, nun, resh
஼ zayin
ி kaph
ூ lamedh
் mem, qoph
಍ samekh
ௐ pe
௒ sadhe
௓ shin
ௗ taw

It has 11 supplemental letters, which are alternate and historical forms:

ழ ‘curled’ gimel, daleth, yodh
ஹ ‘old’ daleth
ீ ‘old’ kaph
ு final ‘old’ kaph
௅ ‘stroked’ lamedh
ை ‘looped’ lamedh
ோ ‘hooked’ lamedh
ௌ ‘old’ lamedh
௎ ‘tall’ samekh
௏ ‘Indian’ samekh
ನ ‘curled’ shin

The repertoire also contains the following 3 atomic ligatures, which behave as letters:

௘ X1
௙ X2
௛ yʾ

The names for letters are based on Aramaic analogues; however, there is no formal convention for describing alternate forms. The descriptors given above, such as ‘old’, ‘curled’, etc., have been assigned by the proposal author as a means for convenient reference.

2.1 Directionality

The script is written from right to left, with lines that advance from top to bottom. Letters are written along a baseline, which is not readily apparent, but may be identified as

• the bottom of ர, ழ, ஼, ௌ, ಍
• the resting spot for the heads of ி, ௐ, ௒, ௓, ௗ
• the cross-bar of ஺ and ்.

The following shows the alignment of all letters with the baseline (gray). The head-height is measured by the tops of these letters, while the below-base is determined by ி, ௎, ௗ, etc. The swash terminals of ள, ஹ, ு run just under the below-base level, in order to accommodate letters that are written within the terminal.
௛௙௘ௗನ௓௒ௐ௏௎಍்ௌோை௅ூீுி஼஻஺ஹஷழளர

2.2 Joining behavior

Book Pahlavi is a cursive abjad script whose letters are dual-joining, right-joining, or non-joining:

dual-joining ನ௓௏௎಍்ோை௅ூ஼ஷ ழர
right-joining ௙௘௛ௗ௒ௐௌுி஻஺ஹள
non-joining ீ

During joining, a letter may be represented using a contextual or modified form, which is determined by its position within the cursive string, by adjacent letters, or, in many cases, by both. In the example below, the first line shows a string of isolated letters, the second line shows the contextual forms of those letters, and the third line shows the shaped forms of the letters:

௦ௗ஻ள ரர௓ ௦஻ரூழர ஻ ரர௓ ௦஻ரரர௓ ௦ௐ௎ரௗ௓஻
௦ௗ஻ఞ ఒఔಢ ௦஻఑౒ఢఐ ஻ ఒఔಢ ௦హ఑ఔఔಢ ௦ಗಅఐௗಢ஻
௦ௗ஻ఞ ఒఔಢ ௦హ఑౒ఢఐ ஻ ఒఔಢ ௦హ఑ఔఔಢ ௦ಗಅఐಫಥ஻

<wštʾsp′ šʾhʾn′ šʾh w ʾylʾn′ šʾh bwt′>
wištāsp šāhān šāh ud ērān šāh būd
Wištāsp was the king of kings and the king of the Iranians.

The table below shows the contextual forms of Book Pahlavi letters, grouped according to joining patterns as analyzed by the proposal author. Two further tables follow it, showing the joining behavior of dual-joining and right-joining letters.
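The three joining classes above can be turned into a rough positional classification. The following is a minimal sketch, not the proposal's shaping logic: it assumes that Xn, Xf, Xm and Xi denote the isolated, final, medial and initial positions, that a letter connects to the preceding letter only when that preceding letter is dual-joining, and it uses letter names rather than glyphs. The names `Joining`, `JOINING` and `positional_forms` are hypothetical. It only reports the positional class; choosing the actual glyph also depends on the adjacent letters, which is not modelled here.

```python
from enum import Enum


class Joining(Enum):
    DUAL = "dual-joining"
    RIGHT = "right-joining"
    NONE = "non-joining"


# Joining classes for a few letters, read off the lists in section 2.2.
# Only the letters needed for the example word are included.
JOINING = {
    "aleph-heth": Joining.DUAL,
    "shin": Joining.DUAL,
    "'tall' samekh": Joining.DUAL,
    "waw-nun-ayin-resh": Joining.RIGHT,
    "taw": Joining.RIGHT,
    "pe": Joining.RIGHT,
}


def positional_forms(letters):
    """Return 'Xn', 'Xf', 'Xm' or 'Xi' for each letter, in logical order."""
    types = [JOINING[name] for name in letters]
    # Letter i connects back to letter i-1 only if letter i-1 is dual-joining
    # and letter i is not non-joining (assumed rule, see lead-in).
    joins_prev = [
        i > 0 and types[i - 1] is Joining.DUAL and types[i] is not Joining.NONE
        for i in range(len(types))
    ]
    forms = []
    for i in range(len(types)):
        joins_next = i + 1 < len(types) and joins_prev[i + 1]
        if joins_prev[i] and joins_next:
            forms.append("Xm")
        elif joins_prev[i]:
            forms.append("Xf")
        elif joins_next:
            forms.append("Xi")
        else:
            forms.append("Xn")
    return forms


# <wštʾsp> from the example sentence, without the final stroke.
WISTASP = ["waw-nun-ayin-resh", "shin", "taw", "aleph-heth", "'tall' samekh", "pe"]
print(positional_forms(WISTASP))
# ['Xn', 'Xi', 'Xf', 'Xi', 'Xm', 'Xf']
```

Because waw and taw are right-joining, each of them ends a joined run, so the word splits into three runs: waw alone, shin + taw, and aleph + samekh + pe.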
Contextual forms (columns: letter, Xn, Xf, Xm, Xi)

Minimal or no change in shape
beth ள ఝ- — —
‘old’ daleth ஹ ஹ- — —
kaph ி ౄ- , ౅- — —
‘old’ kaph ீ ే- — —
‘hooked’ lamedh ோ ౴- -౳- -౲
‘old’ lamedh ௌ ౱- — —
samekh ಍ ಐ- -ಏ- -ಎ
‘tall’ samekh ௎ ಇ- -ಇ- -಄
‘Indian’ samekh ௏ ಔ- -ಓ- -ಒ
taw ௗ ௗ- — —
X1 ௘ ௘- — —
X2 ௙ ௙- — —
y-h ligature ௛ ௛- — —

Height adjustments for below-base or baseline connections
waw-nun-ayin-resh ஻ ఺- , హ- — —
zayin ஼ -ా , ఼- ి, -ా , -఼- ి, -ా , -఻
lamedh ூ ౐- , ౌ- ౑, -౏- , -ొ- ౑, -్ , -ై
‘stroked’ lamedh ௅ ౡ- , ౝ- ౢ, -ౠ- , -౛- ౢ, -౞ , -ౙ
‘looped’ lamedh ை ౭- , ౩- ౮, -౬- , -౧- ౮, -౪ , -౥

Vertical positioning of body
he ஺ ష- , ಃ — —
mem-qoph ் ಀ- , ౿- -౺- , ౼- , -ಃ ౸ , -౶

Truncation of strokes, or no change
pe ௐ ಗ- , ಖ- , ಕ- — —
sadhe ௒ ௒- , ಘ- — —

Descent of terminal
aleph-heth ர ఒ- ఔ- , -క- , -఑- ఓ , -క , -ఐ
gimel-daleth-yodh త ధ- -ళ- , -ద- -ల , -థ
‘curled’ gimel-daleth-yodh ழ ణ- భ- , -ప- , -డ- బ , -఩ , -ఠ
shin ௓ ಝ- ಧ- , ಣ- , -ಜ- , -ಛ- ಧ- , ಢ , -ಚ
‘curled’ shin ನ ಝ- ಧ- , ಣ- , -ಠ- ಧ- , ಢ , -಩

Dual-joining letters

Columns: ௏ ௓ ௎ ் ோ ௅ ூ ஼ ழ ர
Each row ends with its row letter; alternate forms recorded for that row follow in parentheses.

ಔఐ ಝఐ ಆఐ ಃఙ ౴ఐ ౝఐ ౌఐ ఽఐ ధఓ ఒఓ ர (ಜక ణఓ ణఐ ௚)
ಔఠ ಝఠ ಆఠ ಃన — ౝఠ ౌఠ ఽఠ ణఠ ఒబ ழ (ಇ఩ ణథ ௛ ణబ ధబ)
ಔ఻ ಝ఻ ಆ఻ ಃూ — ౝ఻ ౌ఻ ఽ఻ ణ఻ ఒ఻ ஼ (ಞా ణి ఒి)
ಔై ಝై ಆై ಃౘ — ౝై ౌై ఽై ణై ఒై ூ (ಞ్ ధ౑ ఒ౑)
ಔౙ ಝౙ ಆౙ ಃ౤ — ౝౙ ౌౙ ఽౙ ణౙ ఒౙ ௅ (ಞ౞ ధౢ ఒౢ)
————————— ఒ౲ ோ
ಔ౶ ಝ౶ ಆ౶ ౿౶ — ౝ౶ ౌ౶ ఽ౶ ణ౶ ఒ౶ ் (ಐ౶ ధ౶ ఒ౸ ణ౸ ధ౸)
ಔ಄ ಞ಄ ಇ಄ ಃ಑ — ౡ಄ ౐಄ ృ಄ ణಈ ఒಈ ௎ (ౝಎ ౌಎ ఽಎ ధಈ ణಎ ధಎ)
ಔಒ ಝಒ ಆಒ ౿ಒ — ౝಒ ౌಒ ఽಒ ణಒ ఒಒ ௏
ಔಚ ಞಧ ಆಚ ౿ಚ — ౝಢ ౌಢ ఽಢ ణಢ ఒಢ ௓

Right-joining letters

Columns: ௗ ௒ ௐ ె ி ௌ ஻ ஺ ஹ ள
Each row ends with its row letter; alternate forms recorded for that row follow in parentheses.

ಫగ ಕక ಕక ేఐ ౄఐ ౱ఐ హఐ శఙ వఐ ఝఐ ர (ಙఐ ಖఐ ಯఐ ಗక)
ಫయ ಕ఩ ಕ఩ ేఠ ౅ల — హఠ శన వఠ ఝఠ ழ (ಙఠ ಖఠ ౄథ హథ ಗ఩ ౄఠ ಯథ)
ಫు ಕా ಖ఻ ే఻ ౄ఻ — హ఻ శూ వ఻ ఝ఻ ஼ (ಙ఻ ಗా ౅ా ಕా)
ಫ౔ ಕ్ ಖై ేై ౄై — హై శౘ వై ఝై ூ (ಙై ಗ్)
ಫౣ ಕ౞ ಖౙ ేౙ ౄౙ — హౙ శ౤ వౙ ఝౙ ௅ (ಙౙ ಗ౞)
ಪ౶ ಙ౶ ಖ౶ ే౶ ౄ౶ — హ౶ ష౶ వ౶ ఝ౶ ்
ಫ౹ ஺
ಫಊ ಕ಄ ಗ಄ ేಎ ౅಄ — హಎ శ಑ వಎ ఝಎ ௎ (ಪಎ ಙಎ ಖಎ ౄಎ ಕ಄)
ಪಒ ಙಒ ಖಒ ేಒ ౄಒ — హಒ షಒ వಒ ఝಒ ௏
ಫಥ ಙಚ ಖಚ ేಚ ౄಚ — హಚ షಚ వಚ ఝಚ ௓ (ಕಧ ಕಧ)

3 Summary of Graphemes

The Book Pahlavi letters may be categorized as those that have ‘simple’ or ‘complex’ graphemes. A summary of the distinctive graphemes for each letter, as shown in the above tables, is provided below:

Simple Letters
beth ள
‘old’ daleth ஹ
waw-nun-ayin-resh ఺ , ஻
zayin ి , ా , ఻
kaph ி
‘old’ kaph ீ
final ‘old’ kaph ு
lamedh ౑ , ్ , ై
‘stroked’ lamedh ౢ , ౞ , ౙ
‘looped’ lamedh ౮ , ౪ , ౥
‘hooked’ lamedh ோ
‘old’ lamedh ௌ
mem-qoph ்
taw ௗ
X1 ௘
X2 ௙
y-h ligature ௛

Complex Letters
aleph-heth ఓ , క , ఐ
a-y lig ௚
gimel-daleth-yodh ల , థ
‘curled’ gimel-daleth-yodh బ , ఩ , ఠ
he (఺ + ்) ஺
samekh (ழ + ழ) ಍
‘tall’ samekh (఩ + ఩) ௎
‘Indian’ samekh (త + బ) ௏
pe ಕ , ௐ
sadhe ಕ , ௒
shin ಧ , ಢ , ௓
‘curled’ shin ನ

‘Simple’ letters have graphical identities that are largely preserved while joining, despite elongation or shortening. ‘Complex’ letters may resemble sequences of other letters, or may be obscured when joined with certain adjacent letters, resulting in ambiguous representations.

The above graphemes could be used to reproduce Book Pahlavi text from transliterated sources. If a user understands the basic rules of the script, these characters could also be used unambiguously to reproduce a Book Pahlavi document as encoded text. The accuracy of the latter, however, depends on the care and precision with which a source was written or printed.

4 Complexities of the script

The shaping behavior of Book Pahlavi letters presents several challenges for developing an encoding model.
Some palaeographically distinct letters resemble shaped sequences of other letters:

ர aleph-heth: ధ + త daleth-gimel-yodh + daleth-gimel-yodh
஺ he: ఺ + ் mem-qoph + height-adjusted waw-nun-ayin-resh
಍ samekh: ழ + ழ ‘curled’ daleth-gimel-yodh + ‘curled’ daleth-gimel-yodh
௎ ‘tall’ samekh: ఩ + ఩ descending ‘curled’ daleth-gimel-yodh + descending ‘curled’ daleth-gimel-yodh
௏ ‘Indian’ samekh: ೠ + ೳ descending ‘curled’ daleth-gimel-yodh + daleth-gimel-yodh
ನ ‘curled’ shin: ர + ೳ descending ‘curled’ daleth-gimel-yodh + aleph-heth

Some palaeographical letters have distinctive forms in some sources, while in other sources they may resemble shaped sequences of other letters:

‘curled’ shin
ఒబ: ர + ೳ descending ‘curled’ daleth-gimel-yodh + aleph-heth
೸೺: ೷ + ೹ short descending ‘curled’ daleth-gimel-yodh + back-sloping aleph-heth

Some letter combinations have alternate, multiple valid representations, which may occur concurrently. Some forms are used in preserved spellings. Usage of a particular form generally cannot be predicted.
aleph + gimel-daleth-yodh దఓ , డఓ , దఐ , డఐ , పఐ (medial)
aleph + shin ಝఐ , ಜక
aleph + pe ಖఐ , ಕక , ಗక
gimel-daleth-yodh + aleph-heth ఒబ , ௛
gimel-daleth-yodh + kaph ౅ల , ౄథ , ౄఠ
gimel-daleth-yodh + pe ಖఠ , ಗ఩ , ಕ఩
zayin + aleph ఒ఻ , ఒి
zayin + kaph ౄ఻ , ౅ా
lamedh + aleph ఒై , ఒ౑
lamedh + pe ಖై , ಗ్
lamedh + sadhe ಙై , ಗ్
mem-qoph + waw-nun-ayin-resh హ౶ , ఺౶
samekh + pe ಗ಄ , ಕ಄
shin + pe ಖಚ , ಕಧ
shin + sadhe ಙಚ , ಕಧ

Some letter sequences may be analyzed in multiple ways:

௚
  త + ர aleph-heth + daleth-gimel-yodh
  ர + త daleth-gimel-yodh + aleph
ಕక
  ௐ + ர aleph + pe
  ௒ + ர aleph + sadhe
ಇ఩
  ௎ + ழ ‘curled’ daleth-gimel-yodh + samekh
  త + ௎ samekh + medial daleth-gimel-yodh
ಕ఩
  ௐ + త daleth-gimel-yodh + pe
  ௒ + త daleth-gimel-yodh + sadhe
హ಩
  ஻ + ನ shin + waw-nun-ayin-resh
  ஻ + ர + ழ daleth-gimel-yodh + aleph + waw-nun-ayin-resh
ಕ಄
  ௐ + ௎ samekh + pe
  ௐ + ఩ + ఩ descending ‘curled’ daleth-gimel-yodh + descending ‘curled’ daleth-gimel-yodh + pe
  ௒ + ௎ samekh + sadhe
  ௒ + ழ + ழ daleth-gimel-yodh + daleth-gimel-yodh + sadhe
ಕಧ
  ௐ + ௓ shin + pe
  ௒ + ௓ shin + sadhe

5 Challenges for Defining the Encoding Model

In a typical ‘palaeographic’ encoding model for cursive joining scripts, all of the distinctive letters of a script would be encoded as characters, while the contextual forms would be handled at the font level. While many Book Pahlavi letters have straightforward, predictable rules for joining, a handful of letters have multiple representations when they are adjacent to certain other letters.
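The multiple analyses listed in section 4 mean that reading a shaped form back into letters is one-to-many. The sketch below illustrates this with three of the ambiguous forms from that list. It is an illustration under stated assumptions, not part of the proposed model: the keys of `ANALYSES` are hypothetical labels standing in for the shaped forms, the letter names are written in reading order, and `readings` is a hypothetical helper.

```python
from itertools import product

# Candidate letter sequences for three ambiguous shaped forms from section 4.
# Keys are hypothetical labels for the shaped forms, not encoded characters.
ANALYSES = {
    "aleph+pe/sadhe": [
        ("aleph", "pe"),
        ("aleph", "sadhe"),
    ],
    "shin+pe/sadhe": [
        ("shin", "pe"),
        ("shin", "sadhe"),
    ],
    "samekh+pe/sadhe": [
        ("samekh", "pe"),
        ("descending curled daleth-gimel-yodh",
         "descending curled daleth-gimel-yodh", "pe"),
        ("samekh", "sadhe"),
        ("daleth-gimel-yodh", "daleth-gimel-yodh", "sadhe"),
    ],
}


def readings(shaped_forms):
    """Enumerate every letter sequence consistent with the shaped forms."""
    options = [ANALYSES[form] for form in shaped_forms]
    # Each combination picks one analysis per form; concatenate the tuples.
    return [sum(combo, ()) for combo in product(*options)]


if __name__ == "__main__":
    for reading in readings(["aleph+pe/sadhe", "samekh+pe/sadhe"]):
        print(" + ".join(reading))
```

Two adjacent ambiguous forms with two and four analyses already yield eight candidate letter sequences, which is why a font-level treatment alone cannot recover the intended letters.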
