
Title: Encoding Finno-Ugric Phonetic Alphabet (FUPA) in the UCS
Source: Michael Everson and Klaas Ruppel
Version: 2.0
Date: 1998-11-02

The Finno-Ugric Phonetic Alphabet (FUPA)

0. Introduction

This paper presents a collection of characters, diacritics, and notation marks used in the FUPA scheme. This transcription has been, and still is, used creatively by different scientists in different countries at different times (Lagercrantz probably made the most baroque use of the system); therefore the phonetic or phonematic values of the elements of the FUPA are intentionally left aside here. However, all the users of this scheme have one thing in common: they use the FUPA in a technical way, as described below. We prefer the term "FUPA" to "FUT" (Finno-Ugric Transcription) because of its analogy to the IPA (International Phonetic Alphabet).

It is not the intention of this paper to encourage exact (or over-exact) transcription or the extensive use of diacritics. The intention is rather to facilitate the use of the FUPA by the Uralicist community in the context of the Universal Character Set (UCS), by ensuring that a comprehensive analysis of the system leads to the possibility of encoding FUPA texts, past and present, with the UCS. In the exploratory versions of this document, firm advice on how to use FUPA will not be given; when the study is complete, however, advice on which characters in the UCS refer to which letters used in the FUPA system will be given. An example of this, we believe, will be the advice for the LATIN SMALL LETTER ENG to be used in coded representations of FUPA texts, past and present, for the velar nasal, in preference to the GREEK SMALL LETTER ETA, although a great many printed texts use an eta glyph (η) and not an eng glyph (ŋ).
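As an illustration of what such advice could mean for coded text, the following minimal Python sketch replaces eta with eng. The helper name `normalize_velar_nasal` is hypothetical, and the sketch assumes the input is known to be FUPA transcription, since a blanket replacement would damage genuine Greek text.

```python
# Hypothetical helper, not part of any standard: it maps
# GREEK SMALL LETTER ETA (U+03B7) to LATIN SMALL LETTER ENG (U+014B).
ETA_TO_ENG = str.maketrans({"\u03b7": "\u014b"})


def normalize_velar_nasal(text: str) -> str:
    """Replace every eta with eng; all other characters are left untouched."""
    return text.translate(ETA_TO_ENG)


if __name__ == "__main__":
    sample = "a\u03b7a"  # a + eta + a, as an older printed text might be keyed
    print([f"{ord(c):04X}" for c in normalize_velar_nasal(sample)])
```

A translation table keeps the mapping explicit and easy to extend if further glyph-level substitutions are agreed on.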

This project will lead to a standardization and normalization of the FUPA which will enable Uralicists to use the UCS unambiguously to represent their texts, past, present, and future.

It is well known that this collection is not yet complete and may have other faults. Comment is invited; the point of this project is to arrive at a consensus on the FUPA in the UCS for all future work in Uralics.

Characters are identified by their glyphs, their UCS positions, and their names. If no UCS position (no hexadecimal code) is given, the character is not presently found in the UCS (ISO/IEC 10646 and Unicode), and is therefore a candidate for inclusion. It is intended that missing characters be proposed for addition to the UCS.
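Because each encoded character is identified by both a hexadecimal position and a name, the two can be checked against each other mechanically. The sketch below uses Python's standard `unicodedata` module; the function name `check_entry` is an assumption made for this example, and the result depends on the Unicode version shipped with the Python interpreter.

```python
import unicodedata


def check_entry(code_hex: str, name: str) -> bool:
    """Return True if the UCS position and the character name agree."""
    # unicodedata.name() returns the default ("") for unnamed code points.
    actual = unicodedata.name(chr(int(code_hex, 16)), "")
    return actual == name


if __name__ == "__main__":
    print(check_entry("014B", "LATIN SMALL LETTER ENG"))
    # U+03B4 is GREEK SMALL LETTER DELTA, so this pairing is reported as wrong.
    print(check_entry("03B4", "GREEK SMALL LETTER EPSILON"))
```

A check of this kind catches transposed or duplicated code positions in a character list before it is submitted.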

1.0. Basic elements. The basic elements of the FUPA are SMALL LETTERs. The normal display of FUPA examples is italic. Full texts may be presented in plain style. Letters crossed out here indicate that they have not yet been identified as having been used in any Uralicist source. Letters followed by an asterisk * indicate problematic characters which need further clarification and discussion. Further definitions are given below. The glyphs below are given in plain style, not in italics.

NOTE: In order to facilitate discussion of this document, the following comment with regard to the asterisked characters should be noted: The IPA and the FUPA are different systems but there is some overlap between them. The IPA does not use the curly g, for instance, but instead always uses the script ɡ to represent the voiced velar stop. No FUPA text has yet been found which uses both at the same time for different purposes. The question is: shall we, in standardizing the FUPA, abolish the distinction between g and ɡ, or shall we ensure that both characters are defined so that the choice of one or the other is left to the discretion of the Uralicist? In ordinary text this is not so important, since both characters are already in the UCS, but their superscript forms are not. Both of these superscript characters must, if the distinction is considered important, be proposed to the responsible committees.

This distinction is even more acutely questionable with regard to the letters alpha and epsilon; that is, with regard to the choice between the IPA alpha and open e, and the Greek alpha and epsilon. A choice must be made with regard to FUPA practice as to which of the available characters should be used. Texts do differentiate between a round script Latin ɑ and a round crossed α (Greek alpha), but does this mean that one (in non-italics) is Latin a and one is Latin ɑ, or is the distinction between Latin ɑ and Greek α? Or is one Latin a, with an italic ɑ, and the other Latin ɑ, with an italic α? Latin open e is always single-humped ɛ, but Greek epsilon may be double-humped ε or single-humped ϵ. Is the latter acceptable in FUPA transcription?

This issue is probably the first which the URA-LIST should undertake with regard to discussion of the present paper.

1.1. Base character. A character with no diacritics attached to it. Base characters can be added to the UCS.

1.2. Precomposed character. A character with one or more marks attached to it. Precomposed characters include, for example, ä, ü, and å (regardless of their use as basic letters in Germanic and Finnic alphabets). There are many precomposed characters already encoded in the UCS, but it is proposed that, for FUPA support, no additional precomposed characters should be added to the UCS. The FUPA must make use of Level 3 (combining character) technology, precisely because it is a dynamic and productive system.
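The relationship between a precomposed character and its base-plus-mark equivalent can be inspected with canonical decomposition. The following Python sketch relies only on the standard `unicodedata` module; `is_precomposed` is a hypothetical helper that treats any character with a multi-character canonical decomposition as precomposed.

```python
import unicodedata


def is_precomposed(ch: str) -> bool:
    """True if the character canonically decomposes into several code points."""
    return len(unicodedata.normalize("NFD", ch)) > 1


if __name__ == "__main__":
    precomposed = "\u00e4"  # LATIN SMALL LETTER A WITH DIAERESIS
    decomposed = unicodedata.normalize("NFD", precomposed)
    # Show the base letter followed by the combining mark.
    for c in decomposed:
        print(f"{ord(c):04X}", unicodedata.name(c))
```

A FUPA implementation that adds no new precomposed characters would store most letter-plus-diacritic units in the decomposed form shown here.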

1.3. Combining characters. A combining character as defined in the UCS is the same as what among Uralicists is called a diacritic or a diacritical mark.

1.4. The repertoire of base characters. In version 2 of this document, citations for attested characters here and a complete bibliography will be given.

a 0061 LATIN SMALL LETTER A * ...... Sovijärvi & Peltola 1977:3
ɐ 0250 LATIN SMALL LETTER TURNED A * ...... Itkonen 1986:7
ɑ 0251 LATIN SMALL LETTER ALPHA * ...... Sovijärvi & Peltola 1977:4
ɒ 0252 LATIN SMALL LETTER TURNED ALPHA * ...... Sovijärvi & Peltola 1977:4
æ 00E6 LATIN SMALL LETTER AE ...... Sovijärvi & Peltola 1977:3
LATIN SMALL LETTER TURNED AE ...... Lehtisalo 1956:cvii
LATIN SMALL LETTER SIDEWAYS AE ...... Sovijärvi & Peltola 1977:3
b 0062 LATIN SMALL LETTER B ...... Itkonen 1986:7
ƀ 0180 LATIN SMALL LETTER B WITH STROKE ...... Itkonen 1992:15
c 0063 LATIN SMALL LETTER C ...... Itkonen 1986:7
d 0064 LATIN SMALL LETTER D ...... Sovijärvi & Peltola 1977:3
đ 0111 LATIN SMALL LETTER D WITH STROKE ...... Itkonen 1986:7
ð 00F0 LATIN SMALL LETTER ETH ...... Sovijärvi & Peltola 1977:3
e 0065 LATIN SMALL LETTER E ...... Sovijärvi & Peltola 1977:3
ɛ 025B LATIN SMALL LETTER OPEN E * ...... Itkonen 1986:7
ə 0259 LATIN SMALL LETTER SCHWA ...... Sovijärvi & Peltola 1977:4
LATIN SMALL LETTER TURNED OPEN E * ...... Itkonen 1958:xxxiii
f 0066 LATIN SMALL LETTER F ...... Sovijärvi & Peltola 1977:3
g 0067 LATIN SMALL LETTER G * ...... Sovijärvi & Peltola 1977:3, Itkonen 1986:7
ǥ 01E5 LATIN SMALL LETTER G WITH STROKE ...... Itkonen 1992:15
ɡ 0261 LATIN SMALL LETTER SCRIPT G * ...... Toivonen 1948:xxvii, Itkonen 1958:xxxii
ɣ 0263 LATIN SMALL LETTER GAMMA *
h 0068 LATIN SMALL LETTER H ...... Sovijärvi & Peltola 1977:3
0068 LATIN SMALL LETTER H WITH ...... Sovijärvi & Peltola 1977:4
i 0069 LATIN SMALL LETTER I ...... Sovijärvi & Peltola 1977:3
LATIN SMALL LETTER TURNED I * ...... Sovijärvi & Peltola 1977:4
j 006A LATIN SMALL LETTER J ...... Sovijärvi & Peltola 1977:3
k 006B LATIN SMALL LETTER K ...... Sovijärvi & Peltola 1977:3
l 006C LATIN SMALL LETTER L ...... Sovijärvi & Peltola 1977:3
ł 0142 LATIN SMALL LETTER L WITH STROKE ...... Sovijärvi & Peltola 1977:3
m 006D LATIN SMALL LETTER M ...... Sovijärvi & Peltola 1977:3
n 006E LATIN SMALL LETTER N ...... Sovijärvi & Peltola 1977:3
ŋ 014B LATIN SMALL LETTER ENG ...... Itkonen 1986:7
o 006F LATIN SMALL LETTER O ...... Sovijärvi & Peltola 1977:3
LATIN SMALL LETTER SIDEWAYS O ...... Sovijärvi & Peltola 1977:4
LATIN SMALL LETTER SIDEWAYS DIAERESIZED O
ø 00F8 LATIN SMALL LETTER O WITH STROKE ...... Sovijärvi & Peltola 1977:3
LATIN SMALL LETTER SIDEWAYS O WITH STROKE ...... Sovijärvi & Peltola 1977:4
ɵ 0275 LATIN SMALL LETTER BARRED O ...... Itkonen 1958:xxx
ɔ 0254 LATIN SMALL LETTER OPEN O ...... Sovijärvi & Peltola 1977:4
LATIN SMALL LETTER SIDEWAYS OPEN O ...... Sovijärvi & Peltola 1977:4
œ 0153 LATIN SMALL LIGATURE OE ...... Sovijärvi & Peltola 1977:3
LATIN SMALL LETTER TURNED OE ...... Sovijärvi & Peltola 1977:4
LATIN SMALL LETTER UK ...... Sovijärvi & Peltola 1977:10
LATIN SMALL LETTER TOP HALF O ...... Itkonen 1958:xxxii
LATIN SMALL LETTER BOTTOM HALF O ...... Lagercrantz 1939:146
p 0070 LATIN SMALL LETTER P ...... Sovijärvi & Peltola 1977:3
q 0071 LATIN SMALL LETTER Q ...... Sovijärvi & Peltola 1977:3
r 0072 LATIN SMALL LETTER R ...... Sovijärvi & Peltola 1977:3
ɹ 0279 LATIN SMALL LETTER TURNED R ...... Sovijärvi & Peltola 1977:4
s 0073 LATIN SMALL LETTER S ...... Sovijärvi & Peltola 1977:3
ʃ 0283 LATIN SMALL LETTER ESH ...... Benkő 1993:xviii
ß 00DF LATIN SMALL LETTER SHARP S ...... Benkő 1993:xviii
t 0074 LATIN SMALL LETTER T ...... Sovijärvi & Peltola 1977:3
ŧ 0167 LATIN SMALL LETTER T WITH STROKE ...... Sinor 1988:276
u 0075 LATIN SMALL LETTER U ...... Sovijärvi & Peltola 1977:3
LATIN SMALL LETTER SIDEWAYS U ...... Sovijärvi & Peltola 1977:4
LATIN SMALL LETTER SIDEWAYS DIAERESIZED U ...... Sovijärvi & Peltola 1977:4
ɯ 026F LATIN SMALL LETTER TURNED M ...... Sovijärvi & Peltola 1977:4
LATIN SMALL LETTER SIDEWAYS TURNED M ...... Lehtisalo 1956:cvi
v 0076 LATIN SMALL LETTER V ...... Itkonen 1986:7
ʌ 028C LATIN SMALL LETTER TURNED V * ...... Sovijärvi & Peltola 1977:4
w 0077 LATIN SMALL LETTER W ...... Sovijärvi & Peltola 1977:3
ʍ 028D LATIN SMALL LETTER TURNED W
x 0078 LATIN SMALL LETTER X ...... Sovijärvi & Peltola 1977:3
y 0079 LATIN SMALL LETTER Y ...... Sovijärvi & Peltola 1977:3
ʎ 028E LATIN SMALL LETTER TURNED Y
z 007A LATIN SMALL LETTER Z ...... Sovijärvi & Peltola 1977:3
ʒ 0292 LATIN SMALL LETTER EZH ...... Itkonen 1986:7
þ 00FE LATIN SMALL LETTER THORN ...... Itkonen 1992:15
ʔ 0294 LATIN LETTER GLOTTAL STOP ...... Sovijärvi & Peltola 1977:4
LATIN LETTER VOICED LARYNGEAL ASPIRATE ...... Sovijärvi & Peltola 1977:4
LATIN LETTER AIN ...... Lagercrantz 1939:1230
α 03B1 GREEK SMALL LETTER ALPHA * ...... Sovijärvi & Peltola 1977:3
GREEK SMALL LETTER TURNED ALPHA * ...... Sovijärvi & Peltola 1977:4
β 03B2 GREEK SMALL LETTER BETA ...... Sovijärvi & Peltola 1977:3
γ 03B3 GREEK SMALL LETTER GAMMA * ...... Sovijärvi & Peltola 1977:3
δ 03B4 GREEK SMALL LETTER DELTA ...... Sovijärvi & Peltola 1977:3
ε 03B5 GREEK SMALL LETTER EPSILON * ...... Sovijärvi & Peltola 1977:3
GREEK SMALL LETTER TURNED EPSILON * ...... Sovijärvi & Peltola 1977:4
η 03B7 GREEK SMALL LETTER ETA * ...... Sovijärvi & Peltola 1977:3
θ 03B8 GREEK SMALL LETTER THETA * ...... Lagercrantz 1939:1215
κ 03BA GREEK SMALL LETTER KAPPA * ...... Sovijärvi & Peltola 1977:3
π 03C0 GREEK SMALL LETTER PI ...... Sovijärvi & Peltola 1977:3
ρ 03C1 GREEK SMALL LETTER RHO ...... Sovijärvi & Peltola 1977:3
τ 03C4 GREEK SMALL LETTER TAU ...... Sovijärvi & Peltola 1977:3
φ 03C6 GREEK SMALL LETTER PHI ...... Sovijärvi & Peltola 1977:3
χ 03C7 GREEK SMALL LETTER CHI ...... Sovijärvi & Peltola 1977:3
ψ 03C8 GREEK SMALL LETTER PSI ...... Sovijärvi & Peltola 1977:3
ω 03C9 GREEK SMALL LETTER OMEGA ...... Toivonen 1948:xxx
ϑ 03D1 GREEK THETA SYMBOL * ...... Sovijärvi & Peltola 1977:3, Lagercrantz 1939:1238
л 043B CYRILLIC SMALL LETTER EL ...... Itkonen 1958:xxxi

2. Modification of basic elements

Basic elements can be altered by a modification of case, and/or a modifying placement, and/or by the addition of one or more modifying diacritics.

2.1 Modifying case

2.1.1 Small capitals

LATIN LETTER SMALL CAPITAL A ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL TURNED A
LATIN LETTER SMALL CAPITAL AE ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL TURNED AE
ʙ 0299 LATIN LETTER SMALL CAPITAL B ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL BARRED B ...... Lehtisalo 1956:cvii
LATIN LETTER SMALL CAPITAL C ...... Lagercrantz 1939:243
LATIN LETTER SMALL CAPITAL D ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL ETH * ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL BARRED D ...... Lehtisalo 1956:cvii
LATIN LETTER SMALL CAPITAL E ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL TURNED E
LATIN LETTER SMALL CAPITAL F
ɢ 0262 LATIN LETTER SMALL CAPITAL G ...... Sovijärvi & Peltola 1977:3
ʜ 029C LATIN LETTER SMALL CAPITAL H ...... Itkonen 1986:9
ɪ 026A LATIN LETTER SMALL CAPITAL I ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL J ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL K ...... Lagercrantz 1939:146
ʟ 029F LATIN LETTER SMALL CAPITAL L ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL L WITH STROKE ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL TURNED L
LATIN LETTER SMALL CAPITAL M ...... Sovijärvi & Peltola 1977:3
ɴ 0274 LATIN LETTER SMALL CAPITAL N ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL O ...... Lagercrantz 1939:1220
LATIN LETTER SMALL CAPITAL OPEN O ...... Sovijärvi & Peltola 1977:4
LATIN LETTER SMALL CAPITAL SIDEWAYS OPEN O
LATIN LETTER SMALL CAPITAL P ...... Sovijärvi & Peltola 1977:4
LATIN LETTER SMALL CAPITAL Q
ʀ 0280 LATIN LETTER SMALL CAPITAL R ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL TURNED R ...... Sovijärvi & Peltola 1977:4
LATIN LETTER SMALL CAPITAL S
LATIN LETTER SMALL CAPITAL T ...... Itkonen 1958:609
LATIN LETTER SMALL CAPITAL U ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL V ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL W ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL X
ʏ 028F LATIN LETTER SMALL CAPITAL Y ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL Z ...... Sovijärvi & Peltola 1977:3
LATIN LETTER SMALL CAPITAL EZH ...... Lagercrantz 1939:1230
LATIN LETTER SMALL CAPITAL TWO ...... Benkő 1993:xvii
GREEK LETTER SMALL CAPITAL GAMMA ...... Toivonen 1948:xxvii
GREEK LETTER SMALL CAPITAL LAMDA * ...... Toivonen 1948:xxvii
CYRILLIC LETTER SMALL CAPITAL I * (or rev N?) ...... Lagercrantz 1939:1230

2.1.2 Capitals

C 0043 LATIN CAPITAL LETTER C ...... Koponen 1998:136
V 0056 LATIN CAPITAL LETTER V ...... Sinor 1988:10
Ʀ 01A6 LATIN LETTER YR ...... Itkonen 1992:15
Γ 0393 GREEK CAPITAL LETTER GAMMA ...... Rédei 1988:xxiii
Π 03A0 GREEK CAPITAL LETTER PI ...... Sovijärvi & Peltola 1977:4
Ψ 03A8 GREEK CAPITAL LETTER PSI ...... Sovijärvi & Peltola 1977:3
Л 041B CYRILLIC CAPITAL LETTER EL ...... Sovijärvi & Peltola 1977:3

2.2 Modifying placement

2.2.1 Superscript characters. There are already a number of superscript characters in the UCS. For consistency, the names here follow the convention used in the UCS (MODIFIER LETTER ...).

MODIFIER LETTER CAPITAL A ...... Itkonen 1986:41
MODIFIER LETTER CAPITAL AE ...... Lehtisalo 1956:352
MODIFIER LETTER CAPITAL B ...... Lagercrantz 1939:1236
MODIFIER LETTER CAPITAL C
MODIFIER LETTER CAPITAL D ...... Itkonen 1958:xxviii
MODIFIER LETTER CAPITAL E ...... Sovijärvi & Peltola 1977:4
MODIFIER LETTER CAPITAL REVERSED E ...... Lehtisalo 1956:cvii
MODIFIER LETTER CAPITAL F
MODIFIER LETTER CAPITAL G ...... Lagercrantz 1939:1236
MODIFIER LETTER CAPITAL H ...... Lagercrantz 1939:146
MODIFIER LETTER CAPITAL I ...... Itkonen 1958:339
MODIFIER LETTER CAPITAL J ...... Itkonen 1958:xxix
MODIFIER LETTER CAPITAL K ...... Lagercrantz 1939:146
MODIFIER LETTER CAPITAL L ...... Lagercrantz 1939:1232
MODIFIER LETTER CAPITAL M ...... Lagercrantz 1939:146
MODIFIER LETTER CAPITAL N ...... Lagercrantz 1939:146
MODIFIER LETTER REVERSED CAPITAL N * (not cyr i)
MODIFIER LETTER CAPITAL O ...... Lagercrantz 1939:1235
MODIFIER LETTER CAPITAL P ...... Lagercrantz 1939:1236
MODIFIER LETTER CAPITAL Q
MODIFIER LETTER CAPITAL R ...... Itkonen 1958:xxxi
MODIFIER LETTER CAPITAL S
MODIFIER LETTER CAPITAL T ...... Lagercrantz 1939:1236
MODIFIER LETTER CAPITAL U ...... Lagercrantz 1939:146
MODIFIER LETTER CAPITAL V
MODIFIER LETTER CAPITAL W ...... Lagercrantz 1939:1231
MODIFIER LETTER CAPITAL X
MODIFIER LETTER CAPITAL Y
MODIFIER LETTER CAPITAL Z
MODIFIER LETTER SMALL A ...... Sovijärvi & Peltola 1977:6
MODIFIER LETTER SMALL TURNED A ...... Itkonen 1986:17
MODIFIER LETTER SMALL TURNED AE ...... Lehtisalo 1956:360
MODIFIER LETTER SMALL B ...... Itkonen 1986:17
MODIFIER LETTER SMALL C * ...... Rédei 1988:119 (but it is not a c!)
MODIFIER LETTER SMALL D ...... Itkonen 1958:xxviii
MODIFIER LETTER SMALL E ...... Sovijärvi & Peltola 1977:6
MODIFIER LETTER SMALL SCHWA ...... Itkonen 1958:xxxii
MODIFIER LETTER SMALL OPEN E ...... Itkonen 1958:xxxii
MODIFIER LETTER SMALL F
MODIFIER LETTER SMALL G ...... Rédei 1988:390
MODIFIER LETTER SMALL SCRIPT G ...... Itkonen 1958:xxviii
MODIFIER LETTER SMALL GAMMA *
ʰ 02B0 MODIFIER LETTER SMALL H
MODIFIER LETTER SMALL I ...... Sovijärvi & Peltola 1977:6
MODIFIER LETTER SMALL TURNED I ...... Toivonen 1948:1061
ʲ 02B2 MODIFIER LETTER SMALL J ...... Sovijärvi & Peltola 1977:4
MODIFIER LETTER SMALL K ...... Itkonen 1958:xxviii
ˡ 02E1 MODIFIER LETTER SMALL L
MODIFIER LETTER SMALL M ...... Itkonen 1958:xxxvii
ⁿ 207F SUPERSCRIPT LATIN SMALL LETTER N ...... Itkonen 1958:xxxi
MODIFIER LETTER SMALL ENG ...... Lehtisalo 1956:325
MODIFIER LETTER SMALL O ...... Sovijärvi & Peltola 1977:4
MODIFIER LETTER SMALL OPEN O ...... Itkonen 1958:xxviii
MODIFIER LETTER TOP HALF O ...... Rédei 1988:xxiv
MODIFIER LETTER BOTTOM HALF O ...... Lagercrantz 1939:1235
MODIFIER LETTER SMALL P ...... Itkonen 1958:xxviii
MODIFIER LETTER SMALL Q
ʳ 02B3 MODIFIER LETTER SMALL R
ˢ 02E2 MODIFIER LETTER SMALL S ...... Lehtisalo 1956:64
MODIFIER LETTER SMALL T ...... Sovijärvi & Peltola 1977:6
MODIFIER LETTER SMALL U ...... Sovijärvi & Peltola 1977:4
MODIFIER LETTER SMALL SIDEWAYS U ...... Lehtisalo 1956:[eng-&-vokalisch]
MODIFIER LETTER SMALL V ...... Itkonen 1958:xxx
ʷ 02B7 MODIFIER LETTER SMALL W ...... Itkonen 1958:xxx
ˣ 02E3 MODIFIER LETTER SMALL X ...... Itkonen 1986:11
ʸ 02B8 MODIFIER LETTER SMALL Y
MODIFIER LETTER SMALL Z
ˀ 02C0 MODIFIER LETTER GLOTTAL STOP ...... Sovijärvi & Peltola 1977:5
MODIFIER LETTER SMALL AIN ...... Lagercrantz 1939:1236
MODIFIER LETTER SMALL BETA ...... Lehtisalo 1956:352
MODIFIER LETTER SMALL GREEK GAMMA ...... Itkonen 1958:xxix
MODIFIER LETTER SMALL DELTA ...... Itkonen 1958:xxxii
MODIFIER LETTER SMALL PHI ...... Rédei 1988:xxiii
MODIFIER LETTER SMALL CHI ...... Lagercrantz 1939:1236

2.2.2 Subscript

LATIN SUBSCRIPT CAPITAL LETTER A
LATIN SUBSCRIPT CAPITAL LETTER AE
LATIN SUBSCRIPT CAPITAL LETTER B
LATIN SUBSCRIPT CAPITAL LETTER C
LATIN SUBSCRIPT CAPITAL LETTER D
LATIN SUBSCRIPT CAPITAL LETTER E
LATIN SUBSCRIPT CAPITAL LETTER REVERSED E
LATIN SUBSCRIPT CAPITAL LETTER F
LATIN SUBSCRIPT CAPITAL LETTER G
LATIN SUBSCRIPT CAPITAL LETTER H
LATIN SUBSCRIPT CAPITAL LETTER I
LATIN SUBSCRIPT CAPITAL LETTER J
LATIN SUBSCRIPT CAPITAL LETTER K
LATIN SUBSCRIPT CAPITAL LETTER L
LATIN SUBSCRIPT CAPITAL LETTER M
LATIN SUBSCRIPT CAPITAL LETTER N
LATIN SUBSCRIPT CAPITAL LETTER O
LATIN SUBSCRIPT CAPITAL LETTER P
LATIN SUBSCRIPT CAPITAL LETTER Q
LATIN SUBSCRIPT CAPITAL LETTER R
LATIN SUBSCRIPT CAPITAL LETTER S
LATIN SUBSCRIPT CAPITAL LETTER T
LATIN SUBSCRIPT CAPITAL LETTER U
LATIN SUBSCRIPT CAPITAL LETTER V
LATIN SUBSCRIPT CAPITAL LETTER W
LATIN SUBSCRIPT CAPITAL LETTER X
LATIN SUBSCRIPT CAPITAL LETTER Y
LATIN SUBSCRIPT CAPITAL LETTER Z
LATIN SUBSCRIPT SMALL LETTER A
LATIN SUBSCRIPT SMALL LETTER B
LATIN SUBSCRIPT SMALL LETTER C
LATIN SUBSCRIPT SMALL LETTER D
LATIN SUBSCRIPT SMALL LETTER E
LATIN SUBSCRIPT SMALL LETTER F
LATIN SUBSCRIPT SMALL LETTER G
LATIN SUBSCRIPT SMALL LETTER SCRIPT G
LATIN SUBSCRIPT SMALL LETTER H
LATIN SUBSCRIPT SMALL LETTER I ...... Sovijärvi & Peltola 1977:5
LATIN SUBSCRIPT SMALL LETTER J
LATIN SUBSCRIPT SMALL LETTER K
LATIN SUBSCRIPT SMALL LETTER L
LATIN SUBSCRIPT SMALL LETTER M
LATIN SUBSCRIPT SMALL LETTER N ...... Lehtisalo 1956:208
LATIN SUBSCRIPT SMALL LETTER O * ...... Sovijärvi & Peltola 1977:5
LATIN SUBSCRIPT SMALL LETTER P
LATIN SUBSCRIPT SMALL LETTER Q
LATIN SUBSCRIPT SMALL LETTER R ...... Sovijärvi & Peltola 1977:5
LATIN SUBSCRIPT SMALL LETTER S
LATIN SUBSCRIPT SMALL LETTER T
LATIN SUBSCRIPT SMALL LETTER U ...... Sovijärvi & Peltola 1977:5
LATIN SUBSCRIPT SMALL LETTER V
LATIN SUBSCRIPT SMALL LETTER W
LATIN SUBSCRIPT SMALL LETTER X ...... Toivonen 1948:[vokale]
LATIN SUBSCRIPT SMALL LETTER Y
LATIN SUBSCRIPT SMALL LETTER Z
GREEK SUBSCRIPT SMALL LETTER BETA
GREEK SUBSCRIPT SMALL LETTER GAMMA ...... Toivonen 1948:xxvi
GREEK SUBSCRIPT SMALL LETTER RHO ...... Lehtisalo 1956:[1st page]
GREEK SUBSCRIPT SMALL LETTER PHI ...... Rédei 1988:xxiii
GREEK SUBSCRIPT SMALL LETTER CHI ...... Toivonen 1948:xxvi

2.3 Modifying diacritics

In the UCS, multiple diacritics are stacked vertically, above and/or below the base character. For example, the voiceless postalveolar fricative is LATIN SMALL LETTER S + COMBINING CARON. To indicate palatalization, a COMBINING ACUTE ACCENT is added above the caron. With some base characters (b, d, f, h, k, l, t) the palatalization stroke does not look like an acute accent. At least for those letters, a combining palatalization stroke is proposed. A palatalized long l could then be represented as LATIN SMALL LETTER L + COMBINING PALATALIZATION STROKE + COMBINING MACRON. In this case, the macron sits above the l and the palatalization stroke beside the l, as expected.
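A combining sequence of this kind can be assembled directly from UCS character names. The Python sketch below uses `unicodedata.lookup`; the helper `build` is hypothetical. The proposed COMBINING PALATALIZATION STROKE is not a UCS character, so it cannot be looked up and is left out of the example.

```python
import unicodedata


def build(*names: str) -> str:
    """Concatenate the characters with the given UCS names, in order."""
    return "".join(unicodedata.lookup(name) for name in names)


# Palatalized voiceless postalveolar fricative: s + caron + acute.
palatalized_sh = build(
    "LATIN SMALL LETTER S",
    "COMBINING CARON",
    "COMBINING ACUTE ACCENT",
)

# Long l: l + macron (the palatalization stroke is not yet encodable).
long_l = build("LATIN SMALL LETTER L", "COMBINING MACRON")

if __name__ == "__main__":
    for text in (palatalized_sh, long_l):
        print(" ".join(f"{ord(c):04X}" for c in text))
```

Building sequences from names rather than from literal glyphs keeps the intended code points visible even when a font cannot render the stack.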

In principle, the FUPA stacks diacritics in the same way as the UCS: vertically. However there are exceptions 1) when there are more than two diacritics or 2) for typographical considerations. The latter is the case when two diacritics are attached to a wide letter or when stacking in vertical direction would look bad. We have not yet determined whether or not this kind of placement of diacritics affects the phonetic meaning. If more than two diacritic marks are needed above or below the basic character, it seems to be common to place the first two side by side and the third one above (when they sit above the letter) or below (when they sit below the letter). An example is a long palatalized voiceless postalveolar fricative:

LATIN SMALL LETTER S + COMBINING CARON AND ACUTE ABOVE + COMBINING MACRON (caron and acute side by side, macron above)

Note that a different order of these elements would produce a different result:

SMALL LETTER S + COMBINING CARON + COMBINING ACUTE ABOVE + COMBINING MACRON = š́̄
SMALL LETTER S + COMBINING CARON + COMBINING MACRON + COMBINING ACUTE ABOVE = š̄́
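That the order of marks is significant can be confirmed against Unicode normalization. In the Python sketch below, the three marks involved share the same canonical combining class, so normalization does not reorder them and the two sequences remain distinct. The variable names are chosen for this example only, and the acute is represented by COMBINING ACUTE ACCENT (U+0301).

```python
import unicodedata

caron_acute_macron = "s\u030c\u0301\u0304"  # s + caron + acute + macron
caron_macron_acute = "s\u030c\u0304\u0301"  # s + caron + macron + acute

# Canonical combining class of each mark in the first sequence.
classes = [unicodedata.combining(c) for c in caron_acute_macron[1:]]

# Compare the two sequences after canonical composition.
same_after_nfc = (
    unicodedata.normalize("NFC", caron_acute_macron)
    == unicodedata.normalize("NFC", caron_macron_acute)
)

if __name__ == "__main__":
    print(classes)
    print(same_after_nfc)
```

Marks in the same class are stacked outward from the base in the order they are written, which is why a text encoder must preserve the author's sequence.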

Due to the technology of the UCS, diacritics side-by-side should probably be encoded as separate composed units. Combinations discovered so far are:

COMBINING CARON AND ACUTE ABOVE
COMBINING GRAVE AND ACUTE ABOVE ...... Lehtisalo 1956:325
COMBINING BREVE AND ACUTE ABOVE ...... Lehtisalo 1956:325
COMBINING GREATER-THAN AND BELOW ...... Toivonen 1948:xxv
COMBINING GREATER-THAN AND CIRCUMFLEX BELOW ...... Toivonen 1948:xxvi

If it is true that more than three diacritics are never found above or below a basic character (i.e. if together above and below a basic character more than six diacritics are never found), then this approach may be appropriate.

2.3.1 Combining characters above the base character

◌̀ 0300 COMBINING GRAVE ACCENT ...... Sovijärvi & Peltola 1977:7
◌́ 0301 COMBINING ACUTE ACCENT ...... Sovijärvi & Peltola 1977:5
◌̂ 0302 COMBINING CIRCUMFLEX ACCENT ...... Itkonen 1986:8
◌̃ 0303 COMBINING TILDE ...... Sovijärvi & Peltola 1977:5
◌̄ 0304 COMBINING MACRON ...... Sovijärvi & Peltola 1977:4
◌̆ 0306 COMBINING BREVE ...... Sovijärvi & Peltola 1977:6
◌̇ 0307 COMBINING DOT ABOVE ...... Sovijärvi & Peltola 1977:5
◌̈ 0308 COMBINING DIAERESIS ...... Sovijärvi & Peltola 1977:5
◌̊ 030A COMBINING RING ABOVE ...... Itkonen 1958:xxx
◌̋ 030B COMBINING DOUBLE ACUTE ACCENT ...... Munkácsi 1986:24
◌̌ 030C COMBINING CARON ...... Sovijärvi & Peltola 1977:5
◌̑ 0311 COMBINING INVERTED BREVE ...... Itkonen 1986:8
COMBINING PALATALIZATION STROKE * ...... Sovijärvi & Peltola 1977:7
COMBINING LESS-THAN ABOVE
COMBINING GREATER-THAN ABOVE ...... Sovijärvi & Peltola 1977:6
COMBINING LEFT HALF RING ABOVE ...... Toivonen 1948:xxvii
COMBINING CARON AND ACUTE ABOVE
COMBINING GRAVE AND ACUTE ABOVE ...... Lehtisalo 1956:325
COMBINING BREVE AND ACUTE ABOVE ...... Lehtisalo 1956:325
COMBINING SMALL LETTER C ABOVE ...... Benkő 1993:xvii
COMBINING SMALL LETTER E ABOVE ...... Benkő 1993:xvii
COMBINING INVERTED ...... Lagercrantz 1939:1235
COMBINING ONE ABOVE ...... Sovijärvi & Peltola 1977:8
COMBINING TWO ABOVE ...... Sovijärvi & Peltola 1977:8
COMBINING THREE ABOVE ...... Sovijärvi & Peltola 1977:8
COMBINING FOUR ABOVE ...... Sovijärvi & Peltola 1977:8

2.3.2 Combining characters below the base character

◌̜ 031C COMBINING LEFT HALF RING BELOW ...... Itkonen 1986:10
◌̟ 031F COMBINING PLUS SIGN BELOW ...... Sovijärvi & Peltola 1977:6
◌̣ 0323 COMBINING DOT BELOW ...... Sovijärvi & Peltola 1977:5
◌̤ 0324 COMBINING DIAERESIS BELOW ...... Sovijärvi & Peltola 1977:5
◌̥ 0325 COMBINING RING BELOW ...... Sovijärvi & Peltola 1977:5
◌̦ 0326 COMBINING COMMA BELOW ...... Sovijärvi & Peltola 1977:6
◌̨ 0328 COMBINING OGONEK ...... Sovijärvi & Peltola 1977:6
◌̬ 032C COMBINING CARON BELOW ...... Sovijärvi & Peltola 1977:5
◌̭ 032D COMBINING CIRCUMFLEX ACCENT BELOW ...... Sovijärvi & Peltola 1977:5
◌̮ 032E COMBINING BREVE BELOW ...... Sovijärvi & Peltola 1977:6
◌̯ 032F COMBINING INVERTED BREVE BELOW ...... Sovijärvi & Peltola 1977:6
◌̰ 0330 COMBINING TILDE BELOW ...... Sovijärvi & Peltola 1977:5
◌̱ 0331 COMBINING MACRON BELOW ...... Sovijärvi & Peltola 1977:4
◌̺ 033A COMBINING INVERTED BRIDGE BELOW
COMBINING X BELOW ...... Sovijärvi & Peltola 1977:6
COMBINING LESS-THAN BELOW ...... Sovijärvi & Peltola 1977:5
COMBINING GREATER-THAN BELOW ...... Sovijärvi & Peltola 1977:5
COMBINING GREATER-THAN AND CIRCUMFLEX BELOW ...... Toivonen 1948:xxv
COMBINING LESS-THAN AND CIRCUMFLEX BELOW ...... Toivonen 1948:xxvi
COMBINING OPEN BOX BELOW

3. Notation and punctuation

! 0021 EXCLAMATION MARK ...... Itkonen 1986:40
( 0028 LEFT PARENTHESIS ...... Itkonen 1986:339
) 0029 RIGHT PARENTHESIS ...... Itkonen 1986:339
* 002A ASTERISK ...... Kettunen 1938:xix
+ 002B PLUS SIGN ...... Lagercrantz 1939:1321
- 002D HYPHEN-MINUS ...... Sovijärvi & Peltola 1977:6
/ 002F SOLIDUS ...... Sovijärvi & Peltola 1977:7
: 003A COLON ...... Sovijärvi & Peltola 1977:7
< 003C LESS-THAN SIGN ...... Itkonen 1986:9
= 003D EQUALS SIGN ...... Kettunen 1986:41
> 003E GREATER-THAN SIGN ...... Kettunen 1938:xix
? 003F QUESTION MARK ...... Itkonen 1986:16
[ 005B LEFT SQUARE BRACKET ...... Rédei 1988:5
] 005D RIGHT SQUARE BRACKET ...... Rédei 1988:5
| 007C VERTICAL LINE ...... Kettunen 1938:428
· 00B7 MIDDLE DOT ...... Sovijärvi & Peltola 1977:7
ʻ 02BB MODIFIER LETTER TURNED COMMA ...... Itkonen 1986:11
ʼ 02BC MODIFIER LETTER APOSTROPHE ...... Sovijärvi & Peltola 1977:6
ʽ 02BD MODIFIER LETTER REVERSED COMMA * ...... Sovijärvi & Peltola 1977:5
ʾ 02BE MODIFIER LETTER RIGHT HALF RING ...... Itkonen 1958:xxix
ʿ 02BF MODIFIER LETTER LEFT HALF RING * ...... Itkonen 1958:xxix
ˆ 02C6 MODIFIER LETTER CIRCUMFLEX ACCENT ...... Sovijärvi & Peltola 1977:8
ˇ 02C7 CARON ...... Sovijärvi & Peltola 1977:8
ˈ 02C8 MODIFIER LETTER VERTICAL LINE ...... Sovijärvi & Peltola 1977:7
ˉ 02C9 MODIFIER LETTER MACRON ...... Sovijärvi & Peltola 1977:8
ˊ 02CA MODIFIER LETTER ACUTE ACCENT ...... Koponen 1998:136
ˋ 02CB MODIFIER LETTER GRAVE ACCENT ...... Koponen 1998:136
ˌ 02CC MODIFIER LETTER LOW VERTICAL LINE ...... Sovijärvi & Peltola 1977:7
ˎ 02CE MODIFIER LETTER LOW GRAVE ACCENT ...... Toivonen 1948:xxix
ː 02D0 MODIFIER LETTER TRIANGULAR COLON
ˑ 02D1 MODIFIER LETTER HALF TRIANGULAR COLON
˘ 02D8 BREVE ...... Lagercrantz 1939:1216
˜ 02DC SMALL TILDE (or is it middle tilde?) ...... Sovijärvi & Peltola 1977:8
(02DF) MODIFIER LETTER CROSS ACCENT ...... Kettunen 1938:xix
(02EC) MODIFIER LETTER VOICING (low caron?) ...... Sovijärvi & Peltola 1977:5
MODIFIER LETTER LOW CIRCUMFLEX ...... Sovijärvi & Peltola 1977:5
MODIFIER LETTER LESS-THAN ...... Lagercrantz 1939:243, Itkonen 1992:15
MODIFIER LETTER GREATER-THAN ...... Lagercrantz 1939:1212
MODIFIER LETTER LABIALIZATION ...... Itkonen 1992:15, Itkonen 1986:8
MODIFIER LETTER MIDDLE GRAVE ...... Sovijärvi & Peltola 1977:6
MODIFIER LETTER MIDDLE ACUTE ...... Sovijärvi & Peltola 1977:8
MODIFIER LETTER LOW TILDE (need sample text) ...... Lagercrantz 1939:1215
MODIFIER LETTER ??? ...... Lagercrantz 1939:xiii, 1214
MODIFIER LETTER RAISED COLON ...... Munkácsi 1986:9
MODIFIER LETTER HALF-VOICING * ...... Lehtisalo 1956:cvi
MODIFIER LETTER HIGH TONE * ...... Lagercrantz 1939:1219
(02EB) MODIFIER LETTER YANG DEPARTING TONE MARK (low) ...... Lagercrantz 1939:1219
• 2022 BULLET ...... Munkácsi 1986:9
SUPERSCRIPT SQUARE ROOT ...... Koponen 1998:136
† 2020 DAGGER ...... Itkonen 1986:339
‿ 203F UNDERTIE ...... Sovijärvi & Peltola 1977:6
INVERTED UNDERTIE ...... Sovijärvi & Peltola 1977:8
INVERTED UNDERTIE WITH DOT ...... Sovijärvi & Peltola 1977:8
⁺ 207A SUPERSCRIPT PLUS SIGN ...... Sovijärvi & Peltola 1977:8
⁰ 2070 SUPERSCRIPT ZERO ...... Sovijärvi & Peltola 1977:8
¹ 00B9 SUPERSCRIPT ONE ...... Sovijärvi & Peltola 1977:8
² 00B2 SUPERSCRIPT TWO ...... Sovijärvi & Peltola 1977:8
³ 00B3 SUPERSCRIPT THREE ...... Sovijärvi & Peltola 1977:8
⁴ 2074 SUPERSCRIPT FOUR ...... Sovijärvi & Peltola 1977:8
≠ 2260 NOT EQUAL TO
≡ 2261 IDENTICAL TO ...... Lagercrantz 1939:1215
≪ 226A MUCH LESS-THAN
≫ 226B MUCH GREATER-THAN
≮ 226E NOT LESS-THAN
≯ 226F NOT GREATER-THAN
≶ 2276 LESS-THAN OR GREATER-THAN
≷ 2277 GREATER-THAN OR LESS-THAN
␣ 2423 OPEN BOX ...... Sovijärvi & Peltola 1977:6
OPEN SHELF ...... Sovijärvi & Peltola 1977:6
⁽ 207D SUPERSCRIPT LEFT PARENTHESIS ...... Lagercrantz 1939:1235
⁾ 207E SUPERSCRIPT RIGHT PARENTHESIS ...... Lagercrantz 1939:1235
₍ 208D SUBSCRIPT LEFT PARENTHESIS ...... Sovijärvi & Peltola 1977:6
SWUNG DASH ...... Sovijärvi & Peltola 1977:8
♭ 266D MUSIC FLAT SIGN ...... Sovijärvi & Peltola 1977:8
♮ 266E MUSIC NATURAL SIGN ...... Sovijärvi & Peltola 1977:8
♯ 266F MUSIC SHARP SIGN ...... Sovijärvi & Peltola 1977:8
「 300C LEFT CORNER BRACKET ...... Sovijärvi & Peltola 1977:10
RIGHT TOP CORNER BRACKET ...... Sovijärvi & Peltola 1977:10

4. Structure and use of the FUPA

Technically, the FUPA can be described as a system of 1) base characters, 2) combining characters, and 3) methods for combination and modification.

4.1 Base characters

The base characters of the FUPA are LATIN SMALL LETTERS. Base characters also include a number of GREEK SMALL LETTERS and a few CYRILLIC SMALL LETTERS.

4.2 Combining characters

Combining characters are used to modify the meaning (the sound value) of a base character in combination with it.

4.3 Methods of representation

4.3.1 Combination

Combining characters are non-spacing and can be located above or below a base character. Some characters with the same appearance as combining characters are, however, spacing characters (called MODIFIER LETTERs), following a base character. Multiple combining characters are allowed, but in practice the maximum is six (three above and three below the basic element).
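The practical limit of three marks above and three below can be checked for a given base-plus-marks cluster. The Python sketch below classifies marks by their canonical combining class (230 for marks above, 220 for marks below); the function names are hypothetical, and attached marks such as the ogonek (class 202) are deliberately not counted, which is a simplifying assumption of this sketch.

```python
import unicodedata

ABOVE, BELOW = 230, 220  # canonical combining classes: above / below


def count_marks(cluster: str) -> tuple[int, int]:
    """Return (marks above, marks below) for one base-plus-marks cluster."""
    above = sum(1 for c in cluster if unicodedata.combining(c) == ABOVE)
    below = sum(1 for c in cluster if unicodedata.combining(c) == BELOW)
    return above, below


def within_fupa_limit(cluster: str, limit: int = 3) -> bool:
    """True if the cluster has at most `limit` marks above and below."""
    above, below = count_marks(cluster)
    return above <= limit and below <= limit


if __name__ == "__main__":
    # s + caron + acute + dot below
    print(count_marks("s\u030c\u0301\u0323"))
```

A validator along these lines could flag clusters that exceed what has so far been observed in printed FUPA sources.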

4.3.2 Modification

By giving a base character a special modification, the sound value it represents can be altered. The FUPA employs three modifying methods.

4.3.2.1 Case

SMALL LETTERS
SMALL CAPITALS
(CAPITAL LETTERS)

4.3.2.2 Feature

turned letters
sideways letters
distorted letters

4.3.2.3 Placement

SUPERSCRIPT
SUBSCRIPT

In UCS encoding, each of these modifications requires a separate, new base character to be employed, since plain-text representation of these characters is required. These modifications are not, in principle, productive, but a set of modified characters has been identified here in clause 2.
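The need for separate characters in plain text can be illustrated with the already encoded MODIFIER LETTER SMALL H. In the Python sketch below, canonical normalization (NFC) preserves the superscript letter, while compatibility normalization (NFKC) folds it to an ordinary h and so loses the distinction. The helper `survives` is an assumption made for this example.

```python
import unicodedata


def survives(ch: str, form: str) -> bool:
    """True if the character is unchanged by the given normalization form."""
    return unicodedata.normalize(form, ch) == ch


sup_h = "\u02b0"       # MODIFIER LETTER SMALL H
small_cap_b = "\u0299"  # LATIN LETTER SMALL CAPITAL B

if __name__ == "__main__":
    print(unicodedata.decomposition(sup_h))   # compatibility mapping to h
    print(survives(sup_h, "NFC"), survives(sup_h, "NFKC"))
    print(survives(small_cap_b, "NFKC"))
```

Software that processes FUPA text would therefore need to avoid compatibility normalization wherever superscript placement carries phonetic meaning.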

4.4 Open system

In principle all combinations of basic elements and diacritics are possible. The number of possible combinations is very high. Even if only the number of meaningful combinations (from a scientific point of view) is considered, the number of possible combinations remains high. This paper does not attempt to investigate which combinations are meaningful.
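To give a rough sense of scale, the number of theoretically possible clusters can be bounded with simple counting. The Python sketch below is only an upper-bound estimate under stated assumptions: marks may repeat, order matters, at most three marks sit above and three below, and the inventory sizes passed in are illustrative parameters, not figures established by this paper.

```python
def count_mark_sequences(inventory_size: int, max_marks: int = 3) -> int:
    """Ordered mark sequences of length 0..max_marks, repetition allowed."""
    return sum(inventory_size ** k for k in range(max_marks + 1))


def count_clusters(bases: int, above: int, below: int, max_marks: int = 3) -> int:
    """Upper bound on base + marks-above + marks-below combinations."""
    return (
        bases
        * count_mark_sequences(above, max_marks)
        * count_mark_sequences(below, max_marks)
    )


if __name__ == "__main__":
    # Illustrative inventory sizes only.
    print(count_clusters(bases=100, above=20, below=20))
```

Even modest inventories yield numbers far beyond what could be encoded as precomposed characters, which is consistent with treating the FUPA as an open system.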

4.5 Possible modifications

4.5.1 Case (SMALL, SMALL CAPITAL, (CAPITAL))
Feature (turned, sideways, distorted)
Placement (SUPERSCRIPT, SUBSCRIPT)

4.5.2 1 combining character above
1 combining character below

2 combining characters above
2 combining characters below

3 combining characters above
3 combining characters below

1 combining character above and 1 combining character below
2 combining characters above and 1 combining character below
3 combining characters above and 1 combining character below
1 combining character above and 2 combining characters below
2 combining characters above and 2 combining characters below
3 combining characters above and 2 combining characters below
1 combining character above and 3 combining characters below
2 combining characters above and 3 combining characters below
3 combining characters above and 3 combining characters below
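The placement patterns listed in 4.5.2 can be enumerated mechanically. The following is a minimal sketch in Python, assuming the practical maximum of three combining characters on each side; the names `MAX_PER_SIDE`, `patterns` and `describe` are hypothetical.

```python
from itertools import product

MAX_PER_SIDE = 3  # practical maximum above and below a base character

# Every (above, below) count from 0 to 3, excluding the bare base character.
patterns = [(a, b)
            for a, b in product(range(MAX_PER_SIDE + 1), repeat=2)
            if (a, b) != (0, 0)]

def describe(above, below):
    """Render one pattern as a short label such as '2 above and 1 below'."""
    parts = []
    if above:
        parts.append(f"{above} above")
    if below:
        parts.append(f"{below} below")
    return " and ".join(parts)

labels = [describe(a, b) for a, b in patterns]
```

The enumeration gives 4 × 4 − 1 = 15 patterns: three with marks above only, three with marks below only, and nine with both.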

4.5.3 Case + combining character(s)
Feature + combining character(s)
Placement + combining character(s)

Case + feature
Case + placement
Feature + placement

Case + feature + placement

Case + feature + combining character(s)
Case + placement + combining character(s)
Feature + placement + combining character(s)

Case + feature + placement + combining character(s)
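The list in 4.5.3 amounts to every way of applying two or more of the four methods at once. A minimal sketch in Python, with the hypothetical names `methods` and `combos`, reproduces that set (in a different order from the list above):

```python
from itertools import combinations

methods = ["Case", "feature", "placement", "combining character(s)"]

# Every selection of two, three or all four methods, in the listed order.
combos = [" + ".join(chosen)
          for size in range(2, len(methods) + 1)
          for chosen in combinations(methods, size)]
```

This yields 6 + 4 + 1 = 11 entries, matching the eleven lines of 4.5.3.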

5 Bibliography

Benkő Loránd et al. 1993-. Etymologisches Wörterbuch des Ungarischen 1-2. Budapest. ISBN 963-05-6227-8
Collinder, Björn. 1957. Survey of the Uralic languages. Stockholm: Almqvist & Wiksell.
Itkonen, Erkki. 1986. Inarilappisches Wörterbuch. (Lexica Societatis Fenno-ugricae; 20,1) Helsinki: Suomalais-ugrilainen seura.
Itkonen, Erkki, et al. 1992. Suomen sanojen alkuperä: etymologinen sanakirja. 1. A-K, 1992. 2. L-P, 1995. (Suomalaisen Kirjallisuuden Seuran toimituksia; 556. Kotimaisten kielten tutkimuskeskuksen julkaisuja; 62) Helsinki: Suomalaisen Kirjallisuuden Seura; Kotimaisten kielten tutkimuskeskus.
Itkonen, T. I. 1958. Koltan- ja kuolanlapin sanakirja 1-2 = Wörterbuch des Kolta- und Kolalappischen. (Lexica Societatis Fenno-ugricae; 15) Helsinki: Suomalais-ugrilainen seura.
Kettunen, Lauri. 1938. Livisches wörterbuch mit grammatischer einleitung. (Lexica Societatis Fenno-ugricae; 5) Helsinki: Suomalais-ugrilainen seura.
Koponen, Eino. 1998. Eteläviron murteen sanaston alkuperä: itämerensuomalaista etymologiaa. (Suomalais-ugrilaisen seuran toimituksia = Mémoires de la Société Finno-ugrienne) Helsinki: Suomalais-ugrilainen seura.
Lagercrantz, Eliel. 1939. Lappischer Wortschatz 1-2. (Lexica Societatis Fenno-ugricae; 6) Helsinki: Suomalais-ugrilainen seura.
Lehtisalo, T. 1956. Juraksamojedisches Wörterbuch. (Lexica Societatis Fenno-ugricae; 13) Helsinki: Suomalais-ugrilainen seura.
Munkácsi Bernát. 1986. Wogulisches Wörterbuch. Ed. Kálmán Béla. Budapest: Akadémiai Kiadó.
Posti-Itkonen, ed. 1973. FU-transkription yksinkertaistaminen = Az FU-átírás egyszerűsítése = Zur Vereinfachung der FU-Transkription = On simplifying of the FU transcription. (Castrenianumin toimitteita; 7) Helsinki.
Rédei Károly et al. 1988. Uralisches etymologisches Wörterbuch 1-2. Wiesbaden: Otto Harrassowitz.
Setälä, E. N. 1902. Über transskription der finnisch-ugrischen sprachen. Finnisch-Ugrische Forschungen 1. Pp. 13-52. Helsinki.
Sinor, Denis. 1988. The Uralic languages: description, history and foreign influences. (Handbook of Uralic Studies; 1) Leiden: E. J. Brill.
Sovijärvi, Antti, & Reino Peltola, eds. 1977. Suomalais-ugrilainen tarkekirjoitus. (Helsingin Yliopiston Fonetiikan Laitoksen Julkaisuja; 9) Helsinki: Publicationes Instituti Phonetici Universitatis Helsingiensis. ISBN 951-45-1019-4
Steinitz, Wolfgang. 1966. Dialektologisches und etymologisches Wörterbuch der ostjakischen Sprache - 1. Lieferung = Диалектологический и этимологический словарь хантыйского языка - выпуск первый. Berlin: Akademie Verlag.
Toivonen, Y. H. 1948. K. F. Karjalainens ostjakisches wörterbuch. (Lexica Societatis Fenno-ugricae; 10) Helsinki: Suomalais-ugrilainen seura.
