ISO/IEC JTC1/SC2/WG2 N4608 Date: 2014-08-20 Ponomar Project Slavonic Computing Initiative Proposal to Encode Combining Glagolitic Letters in Unicode
Total Page:16
File Type:pdf, Size:1020Kb
ISO/IEC JTC1/SC2/WG2 N4608 Date: 2014-08-20 Ponomar Project Slavonic Computing Initiative Proposal to Encode Combining Glagolitic Letters in Unicode Aleksandr Andreev* Heinz Miklas Yuri S%ardt Section 1. Introduction Glagolitic also kno&n as “Glagolitsa” is an alp%abetic writing s)stem used to record C%urc% Slavonic *IS+ 6-./0 code cu1 and other Slavic languages2 Originating in the 9t% centur) it is t%e earliest kno&n Slavonic alp%abet. The creation o4 t%e alp%abet is attributed to t%e younger o4 the teac%ers o4 the Slavs St. C)ril2 Glagolitic writing may be found in medi5val manuscripts and in printed liturgical books mostl) o4 a Croatian origin. In Bulgaria, Glagolitic was graduall) replaced b) t%e C)rillic alp%abet and t%is C)rillic alp%abet was subse7uentl) used also b) other Slavs2 For its part, t%e Glagolitic script has been preserved b) some communities in Croatia even up to the present. Extant Glagolitic te9ts are o4 enormous value to linguists pal5ograp%ers and sc%olars o4 liturgy2 Support for Glagolitic in t%e Unicode standard is re7uired for t&o purposes2 First, contemporar) specialists need to be able to typograp%icall) represent medi5val te9ts written in t%e Glagolitic script, both in printed matter (suc% as academic publications1 and in an electronic format (4or use &ith computer anal)sis suc% as string comparison wordlist generation and searc%ing12 To this end computer fonts t%at contain t%e repertoire o4 Glagolitic c%aracters must be created2 Second o&ing to the close relations%ip between the C)rillic and Glagolitic writing s)stems sc%olars have traditionall) represented Glagolitic te9ts also in C)rillic transcription. To facilitate the transliteration process an encoding model that parallels the model for the C)rillic script needs to be available for Glagolitic2 3e base repertoire o4 Glagolitic c%aracters has been included in the Unicode standard since version ;2<2 Nonet%eless this repertoire is incomplete because it lacks combining Glagolitic letters2 Suc% combining letters e9ist in t%e Glagolitic script and play a function t%at is analogous to their role in C)rillic – t%at is t%e) are used in abbreviations t%at are either space saving devices (4or e9ample commonl) written words are o?en abbreviated1 or in nomina sacra2 For full support o4 t%e Glagolitic &riting s)stem in Unicode as well as for proper interoperability between the implementations o4 t%e Glagolitic and C)rillic scripts we propose the encoding o4 t%ese combining c%aracters in an additional block entitled Glagolitic Extended-A2 Section 2. Proposed Characters 3e follo&ing table contains e9amples o4 combining Glagolitic letters that occur in various Glagolitic manuscripts and in printed literature2 We propose to encode t%e c%aracters as one block in the same codepoint order as the base Glagolitic letters encoded at UA0C<< and follo&ing. This allo&s for simple computer manipulation o4 Glagolitic c%aracters as well as leaving some encoding positions empty to be used in t%e unlikel) instance that additional combining c%aracters are discovered b) researc%ers and need to be encoded2 * Corresponding aut%or aleksandr.andreevBgmail.com2 Name Codepoint Appearance Location in Sources Combining Glagolitic UACE0<< ◌◌ Srez. p2 40D MissSin, C-vCE 4;rC,/CF Letter Azu ;Gv0<D PsDem 2Fv, 40rF 1CErG Combining Glagolitic UACE0<C ◌◌ Srez2 p2 2G;D MissSin, -ErC- ;.v0< Letter Buki Combining Glagolitic UACE0<0 ◌◌ Srez2 p2 5.D MissSin 7r0 20rC< 4<vC< Letter Vede G0rCFD PsDem 1<;rE Combining Glagolitic UACE0<- ◌◌ Srez. p2 2G;D PsDem, CCFrC Letter Glagoli Combining Glagolitic UACE0<; ◌◌ Srez2 p2 20ED MissSin 5;vC;D PsDem 2CvC Letter Dobro ;0rE 7ErC. 1-Cr. Combining Glagolitic UACE0<G ◌◌ Srez2 p2 G.D MissSin 2.r0< 4,rC0 Letter Yestu Combining Glagolitic UACE0<, ◌◌ Srez2 p2 8; Letter Zhivete Combining Glagolitic UACE0<E ◌◌ EuchSinV 1<-rC,m Letter Zemlja Combining Glagolitic UACE0<. ◌◌ Sre"2 p2 80 Letter Izhe Combining Glagolitic UACE0<A ◌◌ PsSinV 1FFrCE Letter Initial Izhe Combining Glagolitic UACE0<6 ◌◌ Srezn. p2 80 Letter I Combining Glagolitic UACE0<C ◌◌ Srezn. p2 8;D MissSin 1Ev, Letter Djervi Combining Glagolitic UACE0<H ◌◌ Srezn. p2 20;D PsDem 10,r0 Letter Kako Combining Glagolitic UACE0<E ◌◌ Srezn. p2 4< p2 40D PsDem GvCD MissSin2 Letter Ljudie -<v00 5CrCG Combinign Glagolitic UACE0<8 ◌◌ Srezn. p2 20;D PsDem 5<vCG 1<GrC. Letter Myslite CC-v-D MissSin 1-rCE Combining Glagolitic UACE0C< ◌◌ Srezn. p2 5.D PsDem 2Cv-D MissSin 2<v0- Letter Nashi Combining Glagolitic UACE0CC ◌◌ PsDem 10,r02D MissSin 4-vC; 4-v0C/0 Letter Onu Combinign Glagolitic UACE0C0 ◌◌ Srezn. p2 40 p2 2;ED MissSin 1-rCE 1-vC< Letter Pokoji Combining Glagolitic UACE0C- ◌◌ Srezn. p2 40D MissSin 3GvCG Letter Ritsi Combining Glagolitic UACE0C; ◌◌ Srezn. p2 20ED PsDem 10,r0; 10EvF Letter Slovo C0,r0; 10,vCE Combinign Glagolitic UACE0CG ◌◌ Sre"n2 p2 -, p2 ;0 G.; PsDem CCEr4D MissSin Letter Tvrido CEv- CFvC0 ;Gv0< ;,r1. Name Codepoint Appearance Location in Sources Combining Glagolitic UACE0C, ◌◌ MissSin 4-vC; Letter Uku Combining Glagolitic UACE0CF ◌◌ MissSin 2GrC- 2.rC; 3-r0- 4<rCG 20rC< Letter Fritu Combining Glagolitic UACE0CE ◌◌ MissSin 1.rC0 1-vCG 1FvC; Letter Heru Combining Glagolitic UACE0C6 ◌◌ MissSin 5Er, 70rIvE Letter Shta Combining Glagolitic UACE0CC ◌◌ Srezn. p2 5.D MissSin 1.rC0 Letter Tsi Combining Glagolitic UACE0CH ◌◌ MissSin 5-rG 4Gr./C< 3.vC- Letter Chrivi Combining Glagolitic UACE0CE ◌◌ Srezn. p2 20;D MissSin 2CrC0 1-rC 1FrCE Letter Sha 00vCE 2;r0- 2;*CG1vE Combining Glagolitic UACE0C8 ◌◌ MissSin 1Ev, Letter Yeru Combinign Glagolitic UACE00< ◌◌ Srezn. p2 40 Letter Yeri Combining Glagolitic UACE00C ◌◌ Srezn. p2 2;ED PsDem 8-rC,D MissSin Letter Yati ;<rCG Combining Glagolitic UACE00- ◌◌ Srezn. p2 2G; Letter Yu Combining Glagolitic UACE00; ◌◌ EuchSinV 30vCF 5CrCC Letter Small Yus Combining Glagolitic UACE00, ◌◌ Hoes not exist as a single c%aracter but is a Letter Yo component o4 UAE<0.2 Combining Glagolitic UACE00F ◌◌ Srezn. p2 2;ED MissSin 1.rC0 Letter Iotated Small Yus Combining Glagolitic UACE00E ◌◌ #ansvetov p2 -,0 *given b) #ansvetov in Letter Big Yus C)rillic transcription onl)1 Combining Glagolitic UACE00. ◌◌ EuchSinV 5rC< 1FrC 1FvCE 2Ev- 2.rCE Letter Iotated Big etc2 Yus Combining Glagolitic UACE00A ◌◌ Assem 10GvG 1;.v0GD (EuchSinV 8-vC; Letter Fita E;vCG 8Gv; for /4I1 3e follo&ing entries are proposed for addition to UnicodeHata.txt: 1E000;COMBINING GLAGOLITIC LETTER AZU;Mn;230;NSM;;;;;N;;;;; 1E001;COMBINING GLAGOLITIC LETTER BUKI;Mn;230;NSM;;;;;N;;;;; 1E002;COMBINING GLAGOLITIC LETTER VEDE;Mn;230;NSM;;;;;N;;;;; 1E003;COMBINING GLAGOLITIC LETTER GLAGOLI;Mn;230;NSM;;;;;N;;;;; 1E004;COMBINING GLAGOLITIC LETTER DOBRO;Mn;230;NSM;;;;;N;;;;; 1E005;COMBINING GLAGOLITIC LETTER YESTU;Mn;230;NSM;;;;;N;;;;; 1E006;COMBINING GLAGOLITIC LETTER ZHIVETE;Mn;230;NSM;;;;;N;;;;; 1E008;COMBINING GLAGOLITIC LETTER ZEMLJA;Mn;230;NSM;;;;;N;;;;; 1E009;COMBINING GLAGOLITIC LETTER IZHE;Mn;230;NSM;;;;;N;;;;; 1E00A;COMBINING GLAGOLITIC LETTER INITIAL IZHE;Mn;230;NSM;;;;;N;;;;; 1E00B;COMBINIGN GLAGOLITIC LETTER I;Mn;230;NSM;;;;;N;;;;; 1E00C;COMBINING GLAGOLITIC LETTER DJERVI;Mn;230;NSM;;;;;N;;;;; 1E00D;COMBINING GLAGOLITIC LETTER KAKO;Mn;230;NSM;;;;;N;;;;; 1E00E;COMBINING GLAGOLITIC LETTER LJUDIE;Mn;230;NSM;;;;;N;;;;; 1E00F;COMBINING GLAGOLITIC LETTER MYSLITE;Mn;230;NSM;;;;;N;;;;; 1E010;COMBINING GLAGOLITIC LETTER NASHI;Mn;230;NSM;;;;;N;;;;; 1E011;COMBINING GLAGOLITIC LETTER ONU;Mn;230;NSM;;;;;N;;;;; 1E012;COMBINING GLAGOLITIC LETTER POKOJI;Mn;230;NSM;;;;;N;;;;; 1E013;COMBINING GLAGOLITIC LETTER RITSI;Mn;230;NSM;;;;;N;;;;; 1E014;COMBINING GLAGOLITIC LETTER SLOVO;Mn;230;NSM;;;;;N;;;;; 1E015;COMBINING GLAGOLITIC LETTER TVRIDO;Mn;230;NSM;;;;;N;;;;; 1E016;COMBINING GLAGOLITIC LETTER UKU;Mn;230;NSM;;;;;N;;;;; 1E017;COMBINING GLAGOLITIC LETTER FRITU;Mn;230;NSM;;;;;N;;;;; 1E018;COMBINING GLAGOLITIC LETTER HERU;Mn;230;NSM;;;;;N;;;;; 1E01B;COMBINING GLAGOLITIC LETTER SHTA;Mn;230;NSM;;;;;N;;;;; 1E01C;COMBINING GLAGOLITIC LETTER TSI;Mn;230;NSM;;;;;N;;;;; 1E01D;COMBINING GLAGOLITIC LETTER CHRIVI;Mn;230;NSM;;;;;N;;;;; 1E01E;COMBINING GLAGOLITIC LETTER SHA;Mn;230;NSM;;;;;N;;;;; 1E01F;COMBINING GLAGOLITIC LETTER YERU;Mn;230;NSM;;;;;N;;;;; 1E020;COMBINING GLAGOLITIC LETTER YERI;Mn;230;NSM;;;;;N;;;;; 1E021;COMBINING GLAGOLITIC LETTER YATI;Mn;230;NSM;;;;;N;;;;; 1E023;COMBINING GLAGOLITIC LETTER YU;Mn;230;NSM;;;;;N;;;;; 1E024;COMBINING GLAGOLITIC LETTER SMALL YUS;Mn;230;NSM;;;;;N;;;;; 1E026;COMBINING GLAGOLITIC LETTER YO;Mn;230;NSM;;;;;N;;;;; 1E027;COMBINING GLAGOLITIC LETTER IOTATED SMALL YUS;Mn;230;NSM;;;;;N;;;;; 1E028;COMBINING GLAGOLITIC LETTER BIG YUS;Mn;230;NSM;;;;;N;;;;; 1E029;COMBINING GLAGOLITIC LETTER IOTATED BIG YUS;Mn;230;NSM;;;;;N;;;;; 1E02A;COMBINING GLAGOLITIC LETTER FITA;Mn;230;NSM;;;;;N;;;;; 3e follo&ing entries are proposed for addition to Scripts2txt: 1E000..1E02A ; Glagolitic # Mn [43] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER FITA Section 3. Collation It is proposed that t%e de4ault collation order given b) the DUCET for Glagolitic c%aracters mimic the de4ault collation order for C)rillic c%aracters as follo&sJ ⰰ <<< ◌◌ <<< Ⰰ < ⰱ <<< ◌◌ <<< Ⰱ < ⰲ <<< ◌◌ <<< Ⰲ < ⰳ <<< ◌◌ <<< Ⰳ < ⰴ <<< ◌◌ <<< Ⰴ < ⰵ <<< ◌◌ <<< Ⰵ < ⰶ <<< ◌◌ <<< Ⰶ < ⰷ <<< Ⰷ < ⰸ <<< ◌◌ <<< Ⰸ < ⰹ <<< ◌◌ <<< Ⰹ < ⰺ <<< ◌◌ <<< Ⰺ < ⰻ <<< ◌◌ <<< Ⰻ < ⰼ <<< ◌◌ <<< Ⰼ < ⰽ <<< ◌◌ <<< Ⰽ < ⰾ <<< ◌◌ <<< Ⰾ < ⰿ <<< ◌◌ <<< Ⰿ < ⱀ <<< ◌◌ <<< Ⱀ < ⱁ <<< ◌◌ <<< Ⱁ < ⱂ <<< ◌◌ <<< Ⱂ < ⱃ <<< ◌◌ <<< Ⱃ < ⱄ <<< ◌◌ <<< Ⱄ < ⱅ <<< ◌◌ <<< Ⱅ < ⱆ <<< ◌◌ <<< Ⱆ < ⱇ <<< ◌◌ <<< Ⱇ < ⱈ <<< ◌◌ <<< Ⱈ < ⱉ <<< Ⱉ < ⱊ <<< Ⱊ < ⱋ <<< ◌◌ <<< Ⱋ < ⱌ <<< ◌◌ <<< Ⱌ < ⱍ <<< ◌◌ <<<