Proposal to Encode Combining Glagolitic Letters in Unicode
Ponomar Project Slavonic Computing Initiative

Aleksandr Andreev*, Heinz Miklas, Yuri Shardt

Section 1. Introduction

Glagolitic, also known as “Glagolitsa”, is an alphabetic writing system used to record Church Slavonic and other Slavic languages. Originating in the 9th century, it is the earliest known Slavonic alphabet. The creation of the alphabet is attributed to the younger of the teachers of the Slavs, St. Cyril. Glagolitic writing may be found in mediæval manuscripts and in printed liturgical books, mostly of Croatian origin. In Bulgaria, Glagolitic was gradually replaced by the Cyrillic alphabet, which was subsequently adopted by other Slavs as well. For its part, the Glagolitic script has been preserved by some communities in Croatia up to the present. Extant Glagolitic texts are of enormous value to linguists, palæographers, and scholars of liturgy.

Support for Glagolitic in the Unicode standard is required for two purposes. First, contemporary specialists need to be able to typographically represent mediæval texts written in the Glagolitic script, both in printed matter (such as academic publications) and in an electronic format (for use with computer analysis, such as string comparison, wordlist generation and searching). To this end, computer fonts that contain the repertoire of Glagolitic characters must be created. Second, owing to the close relationship between the Cyrillic and Glagolitic writing systems, scholars have traditionally represented Glagolitic texts also in Cyrillic transcription.
To facilitate the transliteration process, an encoding model that parallels the model for the Cyrillic script needs to be available for Glagolitic.

The base repertoire of Glagolitic characters has been included in the Unicode standard since version 4.1. Nonetheless, this repertoire is incomplete because it lacks combining Glagolitic letters. Such combining letters exist in the Glagolitic script and play a function that is analogous to their role in Cyrillic: they are used in abbreviations, either as space-saving devices (for example, commonly written words are often abbreviated) or in nomina sacra. For full support of the Glagolitic writing system in Unicode, as well as for proper interoperability between the implementations of the Glagolitic and Cyrillic scripts, we propose the encoding of these combining characters in an additional block entitled Glagolitic Extended.

Section 2. Proposed Characters

The following table contains examples of combining Glagolitic letters that occur in various Glagolitic manuscripts and in printed literature. We propose to encode the characters as one block, in the same codepoint order as the base Glagolitic letters encoded at U+2C00 and following. This allows for simple computer manipulation of Glagolitic characters, and leaves some encoding positions empty for use in the unlikely instance that additional combining characters are discovered by researchers and need to be encoded. Note that since a Glagolitic Extension is not in the Roadmaps to Unicode, all of the indicated codepoints in this proposal are provisional codepoints in the Private Use Area (PUA).

* Corresponding author, aleksandr.andreev@gmail.com.

| Name | Codepoint | Appearance | Location in Sources |
|---|---|---|---|
| Combining Glagolitic Letter Azu | U+E000 | ◌◌ | Srez., p. 4;> MissSin, ?@v?A, 55r?BC?D, 5Ev;6> PsDem, ;DvB, 4;rD, 1?ArE |
| Combining Glagolitic Letter Buki | U+E001 | ◌◌ | Srez., p. 2E5> MissSin, @Ar?@, 5+v;6 |
| Combining Glagolitic Letter Vede | U+E002 | ◌◌ | Srez., p. 5+> MissSin, Dr;, ;;r?6, 46v?6, E;r?D> PsDem, 165rA |
| Combining Glagolitic Letter Glagoli | U+E003 | ◌◌ | Srez., p. 2E5> PsDem, ??Dr? |
| Combining Glagolitic Letter Dobro | U+E004 | ◌◌ | Srez., p. 2;A> MissSin, 55v?5> PsDem, 2?v?, 5;rA, 7Ar?+, 1@?r+ |
| Combining Glagolitic Letter Yestu | U+E005 | ◌◌ | Srez., p. E+> MissSin, 2+r;6, 4Br?; |
| Combining Glagolitic Letter Zhivete | U+E006 | ◌◌ | Srez., p. 85 |
| Combining Glagolitic Letter Zemlja | U+E008 | ◌◌ | EuchSinV, 16@r?Bm |
| Combining Glagolitic Letter Izhe | U+E009 | ◌◌ | Srez., p. A; |
| Combining Glagolitic Letter Initial Izhe | U+E00A | ◌◌ | PsSinV, 1DDr?A |
| Combining Glagolitic Letter I | U+E00B | ◌◌ | Srezn., p. A; |
| Combining Glagolitic Letter Djervi | U+E00C | ◌◌ | Srezn., p. 85> MissSin, 1AvB |
| Combining Glagolitic Letter Kako | U+E00D | ◌◌ | Srezn., p. 2;5> PsDem, 1;Br; |
| Combining Glagolitic Letter Ljudie | U+E00E | ◌◌ | Srezn., p. 46, p. 4;> PsDem, Ev?> MissSin, @6v;;, 5?r?E |
| Combining Glagolitic Letter Myslite | U+E00F | ◌◌ | Srezn., p. 2;5> PsDem, 56v?E, 16Er?+, ??@v@> MissSin, 1@r?A |
| Combining Glagolitic Letter Nashi | U+E010 | ◌◌ | Srezn., p. 5+> PsDem, 2?v@> MissSin, 26v;@ |
| Combining Glagolitic Letter Onu | U+E011 | ◌◌ | PsDem, 1;Br;)> MissSin, 4@v?5, 4@v;?C; |
| Combining Glagolitic Letter Pokoji | U+E012 | ◌◌ | Srezn., p. 4;, p. 25A> MissSin, 1@r?A, 1@v?6 |
| Combining Glagolitic Letter Ritsi | U+E013 | ◌◌ | Srezn., p. 4;> MissSin, 3Ev?E |
| Combining Glagolitic Letter Slovo | U+E014 | ◌◌ | Srezn., p. 2;A> PsDem, 1;Br;5, 1;AvD, ?;Br;5, 1;Bv?A |
| Combining Glagolitic Letter Tvrido | U+E015 | ◌◌ | Srezn., p. @B, p. 5;, E+; PsDem, ??Ar4> MissSin, ?Av@, ?Dv?;, 5Ev;6, 5Br1+ |
| Combining Glagolitic Letter Uku | U+E016 | ◌◌ | MissSin, 4@v?5 |
| Combining Glagolitic Letter Fritu | U+E017 | ◌◌ | MissSin, 2Er?@, 2+r?5, 3@r;@, 46r?E, 2;r?6 |
| Combining Glagolitic Letter Heru | U+E018 | ◌◌ | MissSin, 1+r?;, 1@v?E, 1Dv?5 |
| Combining Glagolitic Letter Shta | U+E01B | ◌◌ | MissSin, 5ArB, 7;rGvA |
| Combining Glagolitic Letter Tsi | U+E01C | ◌◌ | Srezn., p. 5+> MissSin, 1+r?; |
| Combining Glagolitic Letter Chrivi | U+E01D | ◌◌ | MissSin, 5@rE, 4Er+C?6, 3+v?@ |
| Combining Glagolitic Letter Sha | U+E01E | ◌◌ | Srezn., p. 2;5> MissSin, 2?r?;, 1@r?, 1Dr?A, ;;v?A, 25r;@, 253?E4vA |
| Combining Glagolitic Letter Yeru | U+E01F | ◌◌ | MissSin, 1AvB |
| Combining Glagolitic Letter Yeri | U+E020 | ◌◌ | Srezn., p. 4; |
| Combining Glagolitic Letter Yati | U+E021 | ◌◌ | Srezn., p. 25A> PsDem, 8@r?B> MissSin, 56r?E |
| Combining Glagolitic Letter Yu | U+E023 | ◌◌ | Srezn., p. 2E5 |
| Combining Glagolitic Letter Small Yus | U+E024 | ◌◌ | EuchSinV, 3;v?D, 5?r?? |
| Combining Glagolitic Letter Yo | U+E026 | ◌◌ | Does not exist as a single character, but is a component of U+E029 |
| Combining Glagolitic Letter Iotated Small Yus | U+E027 | ◌◌ | Srezn., p. 25A> MissSin, 1+r?; |
| Combining Glagolitic Letter Big Yus | U+E028 | ◌◌ | "ansvetov, p. @B; (given by "ansvetov in Cyrillic transcription only) |
| Combining Glagolitic Letter Iotated Big Yus | U+E029 | ◌◌ | EuchSinV, 5r?6, 1Dr?, 1Dv?A, 2Av@, 2+r?A etc. |
| Combining Glagolitic Letter Fita | U+E02A | ◌◌ | Assem, 1;EvE, 15+v;E> (EuchSinV, 8@v?5, A5v?E, 8Ev5 for /-G) |

The following entries are proposed for addition to UnicodeData.txt (note that all codepoints are provisional):

E000;COMBINING GLAGOLITIC LETTER AZU;Mn;230;NSM;;;;;N;;;;;
E001;COMBINING GLAGOLITIC LETTER BUKI;Mn;230;NSM;;;;;N;;;;;
E002;COMBINING GLAGOLITIC LETTER VEDE;Mn;230;NSM;;;;;N;;;;;
E003;COMBINING GLAGOLITIC LETTER GLAGOLI;Mn;230;NSM;;;;;N;;;;;
E004;COMBINING GLAGOLITIC LETTER DOBRO;Mn;230;NSM;;;;;N;;;;;
E005;COMBINING GLAGOLITIC LETTER YESTU;Mn;230;NSM;;;;;N;;;;;
E006;COMBINING GLAGOLITIC LETTER ZHIVETE;Mn;230;NSM;;;;;N;;;;;
E008;COMBINING GLAGOLITIC LETTER ZEMLJA;Mn;230;NSM;;;;;N;;;;;
E009;COMBINING GLAGOLITIC LETTER IZHE;Mn;230;NSM;;;;;N;;;;;
E00A;COMBINING GLAGOLITIC LETTER INITIAL IZHE;Mn;230;NSM;;;;;N;;;;;
E00B;COMBINING GLAGOLITIC LETTER I;Mn;230;NSM;;;;;N;;;;;
E00C;COMBINING GLAGOLITIC LETTER DJERVI;Mn;230;NSM;;;;;N;;;;;
E00D;COMBINING GLAGOLITIC LETTER KAKO;Mn;230;NSM;;;;;N;;;;;
E00E;COMBINING GLAGOLITIC LETTER LJUDIE;Mn;230;NSM;;;;;N;;;;;
E00F;COMBINING GLAGOLITIC LETTER MYSLITE;Mn;230;NSM;;;;;N;;;;;
E010;COMBINING GLAGOLITIC LETTER NASHI;Mn;230;NSM;;;;;N;;;;;
E011;COMBINING GLAGOLITIC LETTER ONU;Mn;230;NSM;;;;;N;;;;;
E012;COMBINING GLAGOLITIC LETTER POKOJI;Mn;230;NSM;;;;;N;;;;;
E013;COMBINING GLAGOLITIC LETTER RITSI;Mn;230;NSM;;;;;N;;;;;
E014;COMBINING GLAGOLITIC LETTER SLOVO;Mn;230;NSM;;;;;N;;;;;
E015;COMBINING GLAGOLITIC LETTER TVRIDO;Mn;230;NSM;;;;;N;;;;;
E016;COMBINING GLAGOLITIC LETTER UKU;Mn;230;NSM;;;;;N;;;;;
E017;COMBINING GLAGOLITIC LETTER FRITU;Mn;230;NSM;;;;;N;;;;;
E018;COMBINING GLAGOLITIC LETTER HERU;Mn;230;NSM;;;;;N;;;;;
E01B;COMBINING GLAGOLITIC LETTER SHTA;Mn;230;NSM;;;;;N;;;;;
E01C;COMBINING GLAGOLITIC LETTER TSI;Mn;230;NSM;;;;;N;;;;;
E01D;COMBINING GLAGOLITIC LETTER CHRIVI;Mn;230;NSM;;;;;N;;;;;
E01E;COMBINING GLAGOLITIC LETTER SHA;Mn;230;NSM;;;;;N;;;;;
E01F;COMBINING GLAGOLITIC LETTER YERU;Mn;230;NSM;;;;;N;;;;;
E020;COMBINING GLAGOLITIC LETTER YERI;Mn;230;NSM;;;;;N;;;;;
E021;COMBINING GLAGOLITIC LETTER YATI;Mn;230;NSM;;;;;N;;;;;
E023;COMBINING GLAGOLITIC LETTER YU;Mn;230;NSM;;;;;N;;;;;
E024;COMBINING GLAGOLITIC LETTER SMALL YUS;Mn;230;NSM;;;;;N;;;;;
E026;COMBINING GLAGOLITIC LETTER YO;Mn;230;NSM;;;;;N;;;;;
E027;COMBINING GLAGOLITIC LETTER IOTATED SMALL YUS;Mn;230;NSM;;;;;N;;;;;
E028;COMBINING GLAGOLITIC LETTER BIG YUS;Mn;230;NSM;;;;;N;;;;;
E029;COMBINING GLAGOLITIC LETTER IOTATED BIG YUS;Mn;230;NSM;;;;;N;;;;;
E02A;COMBINING GLAGOLITIC LETTER FITA;Mn;230;NSM;;;;;N;;;;;

The following entries are proposed for addition to Scripts.txt:

E000..E02A ; Glagolitic # Mn [43] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER FITA

Section 3. Collation

It is proposed that the default collation order given by the DUCET for Glagolitic characters mimic the default collation order for Cyrillic characters, as follows:

ⰰ <<< ◌◌ <<< Ⰰ
< ⰱ <<< ◌◌ <<< Ⰱ
< ⰲ <<< ◌◌ <<< Ⰲ
< ⰳ <<< ◌◌ <<< Ⰳ
< ⰴ <<< ◌◌ <<< Ⰴ
< ⰵ <<< ◌◌ <<< Ⰵ
< ⰶ <<< ◌◌ <<< Ⰶ
< ⰷ <<< Ⰷ
< ⰸ <<< ◌◌ <<< Ⰸ
< ⰹ <<< ◌◌ <<< Ⰹ
< ⰺ <<< ◌◌ <<< Ⰺ
< ⰻ <<< ◌◌ <<< Ⰻ
< ⰼ <<< ◌◌ <<< Ⰼ
< ⰽ <<< ◌◌ <<< Ⰽ
< ⰾ <<< ◌◌ <<< Ⰾ
< ⰿ <<< ◌◌ <<< Ⰿ
< ⱀ <<< ◌◌ <<< Ⱀ
< ⱁ <<< ◌◌ <<< Ⱁ
< ⱂ <<< ◌◌ <<< Ⱂ
< ⱃ <<< ◌◌ <<< Ⱃ
< ⱄ <<< ◌◌ <<< Ⱄ
< ⱅ <<< ◌◌ <<< Ⱅ
< ⱆ <<< ◌◌ <<< Ⱆ
< ⱇ <<< ◌◌ <<< Ⱇ
< ⱈ <<< ◌◌ <<< Ⱈ
< ⱉ <<< Ⱉ
< ⱊ <<< Ⱊ
< ⱋ <<< ◌◌ <<< Ⱋ
< ⱌ <<< ◌◌ <<< Ⱌ
< ⱍ <<< ◌◌ <<< Ⱍ
< ⱎ <<< ◌◌ <<< Ⱎ
< ⱏ <<< ◌◌ <<< Ⱏ
< ⱐ <<< ◌◌ <<< Ⱐ
< ⱑ <<< ◌◌ <<< Ⱑ
< ⱒ <<< Ⱒ
< ⱓ <<< ◌◌ <<< Ⱓ
< ⱔ <<< ◌◌ <<< Ⱔ
< ⱕ <<< Ⱕ
< ⱖ <<< ◌◌ <<< Ⱖ
< ⱗ <<< ◌◌ <<< Ⱗ
< ⱘ <<< ◌◌ <<< Ⱘ
< ⱙ <<< ◌◌ <<< Ⱙ
< ⱚ <<< ◌◌ <<< Ⱚ
< ⱛ <<< Ⱛ
< ⱜ <<< Ⱜ
< ⱝ <<< Ⱝ
< ⱞ <<< Ⱞ

Section 4.