Universal Shaping Engine

Making fonts for the Universal Shaping Engine
John Hudson, Tiro Typeworks Ltd • TYPO Labs, Berlin, 10 May 2016
Version 1.1, 23 May 2016

This paper, based on a presentation delivered at the inaugural TYPO Labs font technology conference in Berlin, concerns making a particular kind of OpenType font to work with a new shaping engine for complex script layout. If you’re not involved in making fonts for complex scripts, I hope you might still find some interest in the conceptual problems and solutions involved, and also in the insights these provide into the architecture and history of OpenType Layout.

Let me begin by defining what we mean by ‘complex script’. These are scripts that require processing beyond a simple display of the default encoded glyph for each character in order to correctly present text in an acceptably readable form. This processing typically involves character string analysis and manipulation, as well as glyph substitution and positioning. There are, of course, instances in which a font for any script may assume complex behaviours (ligation, contextual substitutions, dynamic mark positioning), but an inherently complex script is one for which the plain text encoded character sequence will be unreadable without additional processing.

[Figure: What is a complex script? Sample words: تعقيد (Arabic), मिश्रित (Devanagari), ಸಂಕೀರ್ಣ (Kannada).]

Complex scripts tend to fall into one of two broad categories: those, like Arabic, involving joining behaviour that requires knowledge about adjacent characters and substitution of appropriate forms to display connected lettergroups, and those, like the many Brahmi-derived scripts of South and Southeast Asia, in which the orthographic unit is a cluster that may consist of multiple consonant letters plus dependent vowel sign and additional modifier marks.
Scripts in the latter category also tend to involve reordering behaviours, in which there is a distinction between the graphical order of signs in the cluster and their phonetically encoded ordering.

In the OpenType model, complex script layout is handled collaboratively by a shaping engine, residing at the operating system or application level, and the layout in a font. This is a somewhat simplified diagram of that collaboration, and I’m not going to discuss it in a step-by-step way. [For more detailed discussion, see my Unicode conference presentation from 2015: http://tiro.com/John/Hudson_IUC39_Beyond_Shaping.pdf]

[Figure: OpenType Layout simplified collaborative model. Columns: Layout services, Shaping engine, Font. Stages, in order: [Bidi algorithm], Script itemisation, Run segmentation, Run analysis, Cluster segmentation, [Split vowels], [Initial reordering], Cluster shaping, Basic shaping GSUB features, Final reordering, Standard GSUB features, Conditional GSUB features, GPOS features, Line breaking, Justification.]

Complex script handling was Microsoft’s primary goal in developing a smart font format in the mid-1990s. Microsoft developed Arabic, Hebrew, and Thai shaping engines for TrueType Open, the immediate precursor to OpenType, and in early 1999 shipped the first version of the Unicode Script Processor for Complex Scripts (Uniscribe) with Internet Explorer 5.01. Subsequent versions of Uniscribe have shipped with all versions of Windows, Office, and Microsoft browsers, often leapfrogging each other in support for additional scripts and languages.

Other companies have produced their own OpenType Layout engines for complex scripts, notably the open source Harfbuzz shaper (maintained by Behdad Esfahbod), Adobe’s World Ready Composer, and Apple’s CoreText engine.

The assignment of scripts to processing by a particular engine generally depends on similarities in shaping needs.
This leads to predictable groupings, such as the handling of numerous South Asian Brahmi-derived scripts in a common Indic shaping engine, and occasionally to strange bedfellows, such as assignment of the Thaana script of the Maldive Islands to Uniscribe’s Hebrew shaping engine.

The current Windows 10 version of Uniscribe includes nine engines, each of which is responsible for shaping one or more scripts. An engine may also support more than one version of shaping for a given script, mapped to different OpenType script tags, for example the old Windows XP Indic shaping and the new ‘Indic2’ model introduced in Windows Vista. This enables continued support for older fonts while allowing improved implementations to emerge.

Uniscribe shaping engines as of Windows 10 (RS1)

Arabic engine      Arabic, Syriac
Generic engine     Cyrillic, Greek, Latin, etc. (non-complex scripts)
Hangul engine      Hangul, Old Hangul
Hebrew engine      Hebrew, Thaana
Indic engine       Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, etc.
Khmer engine       Khmer
Myanmar engine     Myanmar (Burmese)
Thai/Lao engine    Lao, Thai
Universal engine   Balinese, Batak, Brahmi, Buginese, Buhid, Chakma, Cham, Duployan, Egyptian Hieroglyphs, Grantha, Hanunoo, Javanese, Kaithi, Kayah Li, etc. (45 total)

In case you are unfamiliar with the kinds of things that a shaping engine does with a script, I’ll take a moment to discuss a step-by-step example of typical script-specific shaping for a mock character sequence (not a real word). Layout services will have identified this as Bengali, based on the Unicode script property of the characters involved, and will have passed the run to the appropriate Indic shaping engine. The shaping engine has determined that the font supports the Indic2 shaping model using the <bng2> script tag, so is going to apply that shaping model.
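The choice between the older and newer shaping models for a script can be pictured as a preference-ordered lookup over the script tags a font declares. The following is only a minimal sketch under that assumption: the table and the function name choose_script_tag are hypothetical illustrations, not part of Uniscribe or any other engine, and only the OpenType script tags themselves (bng2, beng, dev2, deva) are real.

```python
# Hypothetical preference table: the newer ("Indic2") tag is tried first,
# then the older tag, so that older fonts keep working.
SCRIPT_TAG_PREFERENCE = {
    "Bengali": ["bng2", "beng"],
    "Devanagari": ["dev2", "deva"],
}


def choose_script_tag(script, font_script_tags):
    """Return the first preferred tag that the font supports, or None."""
    for tag in SCRIPT_TAG_PREFERENCE.get(script, []):
        if tag in font_script_tags:
            return tag
    return None


# A font that declares both tags is shaped with the newer model.
print(choose_script_tag("Bengali", {"beng", "bng2"}))  # prints: bng2
# An older font that only declares the old tag still gets shaped.
print(choose_script_tag("Bengali", {"beng"}))          # prints: beng
```

Trying the newer tag first is what lets improved implementations emerge without breaking fonts built for the older model.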
Step-by-step example (Bengali <bng2> shaping)

Input character run:
ক ◌ো ল ◌্ ম ◌ু র ◌্ ত ◌্ ক ◌ি
0995 09CB 09B2 09CD 09AE 09C1 09B0 09CD 09A4 09CD 0995 09BF
ka o la [x] ma u ra [x] ta [x] ka i

The shaping engine analyses the character run, and segments it into three orthographic units, in this case clusters consisting of one or more consonants with explicit vowel signs. The small diagonal mark (U+09CD) is a vowel killer, indicating that the preceding and following consonants are part of the same cluster.

The first step is to split any two-part vowel signs into their constituent elements. This is a buffered character level operation, made possible because both elements are also atomically encoded as characters in Unicode.

After splitting two-part vowel signs:
ক ◌ে ◌া ল ◌্ ম ◌ু র ◌্ ত ◌্ ক ◌ি
0995 09C7 09BE 09B2 09CD 09AE 09C1 09B0 09CD 09A4 09CD 0995 09BF

The second step in this shaping model is initial reordering. In our example, this involves moving the left-side vowel signs in the first and third clusters. Again, this is a buffered character level operation: there’s no interaction with the font layout tables up to this stage.

After initial reordering:
◌ে ক ◌া ল ◌্ ম ◌ু ◌ি র ◌্ ত ◌্ ক
09C7 0995 09BE 09B2 09CD 09AE 09C1 09BF 09B0 09CD 09A4 09CD 0995
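The character-level steps described so far (cluster segmentation, splitting two-part vowel signs, and initial reordering) can be sketched in a few lines. This is a deliberately simplified illustration that only reproduces the mock sequence used here; the function names are hypothetical, the consonant test is a rough code point range, and a real shaping engine applies considerably more detailed rules.

```python
HALANT = 0x09CD  # the "vowel killer" sign

# Rough consonant range KA..HA; it includes a few unassigned code points,
# which does not matter for this illustration.
CONSONANTS = set(range(0x0995, 0x09BA))

# Two-part vowel signs and their separately encoded elements.
TWO_PART_VOWELS = {
    0x09CB: (0x09C7, 0x09BE),  # o  -> e + aa
    0x09CC: (0x09C7, 0x09D7),  # au -> e + au length mark
}

# Vowel signs drawn to the left of the consonant(s) they follow in encoding.
LEFT_SIDE_VOWELS = {0x09BF, 0x09C7, 0x09C8}


def segment_clusters(cps):
    """Start a new cluster at each consonant not preceded by the vowel killer."""
    clusters = []
    for cp in cps:
        joined = bool(clusters) and clusters[-1][-1] == HALANT
        if not clusters or (cp in CONSONANTS and not joined):
            clusters.append([cp])
        else:
            clusters[-1].append(cp)
    return clusters


def split_vowels(cluster):
    """Replace each two-part vowel sign with its constituent elements."""
    out = []
    for cp in cluster:
        out.extend(TWO_PART_VOWELS.get(cp, (cp,)))
    return out


def initial_reorder(cluster):
    """Move left-side vowel signs to the front of their cluster."""
    left = [cp for cp in cluster if cp in LEFT_SIDE_VOWELS]
    rest = [cp for cp in cluster if cp not in LEFT_SIDE_VOWELS]
    return left + rest


def prepare_run(cps):
    result = []
    for cluster in segment_clusters(cps):
        result.extend(initial_reorder(split_vowels(cluster)))
    return result


RUN = [0x0995, 0x09CB, 0x09B2, 0x09CD, 0x09AE, 0x09C1,
       0x09B0, 0x09CD, 0x09A4, 0x09CD, 0x0995, 0x09BF]

print(len(segment_clusters(RUN)))  # prints: 3
print(" ".join(f"{cp:04X}" for cp in prepare_run(RUN)))
# prints: 09C7 0995 09BE 09B2 09CD 09AE 09C1 09BF 09B0 09CD 09A4 09CD 0995
```

Everything here operates on code points in a buffer, which mirrors the point made above: no font tables are consulted until the substitution features are applied.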
That interaction begins in the third step: application of basic shaping glyph substitution features. These may include precomposition of letter plus nukta forms, and formation of akhand ligatures (a kind of pseudo-letter). In our example, the shaping engine applies the Reph Forms <rphf> feature to the sequence of cluster-initial Ra plus the vowel killer character in the third cluster, substituting the repha mark glyph found in the substitution lookup.

After Reph Forms <rphf>:
◌ে ক ◌া ল ◌্ ম ◌ু ◌ি [repha] ত ◌্ ক
09C7 0995 09BE 09B2 09CD 09AE 09C1 09BF 09B0+09CD 09A4 09CD 0995

If we had other characters that take special forms in particular situations, these would also be substituted during this phase. Some fonts might substitute half forms of other letter plus vowel killer sequences, although I generally don’t do this in Bengali.

The next step in our example is substitution of consonant ligatures in the Conjunct Forms <cjct> feature. Note that in our example, only the conjunct in the second cluster takes a ligature form; this is because the font does not contain a ligature form for the conjunct in the third cluster, which instead will display with an explicit vowel killer sign. This seldom happens in Bengali text, but is the sort of thing that happens when English or other foreign loanwords are transliterated in an Indian script, producing character sequences that don’t occur in the local language.
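The effect of the <rphf> and <cjct> substitutions on the example can be sketched as ligature lookups applied to a glyph list. This is an illustrative assumption, not how any particular engine or font is implemented: the glyph names, the lookup contents, and the function apply_ligatures are all hypothetical, and the sketch does not model the restriction of <rphf> to cluster-initial Ra (the mock run contains only one Ra plus vowel killer sequence).

```python
def apply_ligatures(glyphs, lookup):
    """Replace glyph sequences found in `lookup`, trying longest matches first."""
    longest = max((len(key) for key in lookup), default=0)
    out = []
    i = 0
    while i < len(glyphs):
        for n in range(longest, 1, -1):
            if i + n > len(glyphs):
                continue  # not enough glyphs left for a match of this length
            key = tuple(glyphs[i:i + n])
            if key in lookup:
                out.append(lookup[key])
                i += n
                break
        else:
            # No ligature starts here; pass the glyph through unchanged.
            out.append(glyphs[i])
            i += 1
    return out


# Hypothetical lookups for a font like the one in the example.
RPHF = {("ra", "halant"): "reph"}
# The font has a la+ma conjunct but no ta+ka conjunct.
CJCT = {("la", "halant", "ma"): "la_ma"}

# Glyph run after initial reordering (one glyph per character).
glyphs = ["e", "ka", "aa", "la", "halant", "ma", "u",
          "i", "ra", "halant", "ta", "halant", "ka"]

after_rphf = apply_ligatures(glyphs, RPHF)
after_cjct = apply_ligatures(after_rphf, CJCT)
print(after_cjct)
# prints: ['e', 'ka', 'aa', 'la_ma', 'u', 'i', 'reph', 'ta', 'halant', 'ka']
```

Because the lookup has no entry for ta plus vowel killer plus ka, that sequence passes through untouched and displays with an explicit vowel killer sign, as described above.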
After Conjunct Forms <cjct>:
◌ে ক ◌া [la+ma ligature] ◌ু ◌ি [repha] ত ◌্ ক
09C7 0995 09BE 09B2+09CD+09AE 09C1 09BF 09B0+09CD 09A4 09CD 0995

At this stage, basic shaping features are complete, and the next step is final reordering, this time performed at the glyph level, taking output from features such as Reph Forms <rphf> and tracking their position in the glyph string.