Spanish Adaptation of SAMPA and Automatic Phonetic Transcription

Llisterri, J., & Mariño, J. B. (1993). Spanish adaptation of SAMPA and automatic phonetic transcription. SAM-A/UPC/001/V1. ESPRIT Project 6819 SAM-A, Speech Technology Assessment in Multilingual Applications. http://liceu.uab.cat/~joaquim/publicacions/SAMPA_Spanish_93.pdf

ESPRIT PROJECT 6819 (SAM-A) Speech Technology Assessment in Multilingual Applications

Report Title: Spanish adaptation of SAMPA and automatic phonetic transcription

Document No: SAM-A/UPC/001/V1

Status: Final

Date: 19.2.1993 Revised: 20.4.1993

Source: Joaquim Llisterri Universidad Autónoma de Barcelona

Tel 34 3 581 12 16 Fax 34 3 581 16 86

José B. Mariño Universidad Politécnica de Cataluña

Tel 34 3 401 64 37 Fax 34 3 401 64 47

Note: This work was partially supported by Spanish Government TIC 91-1488-C06-02

1. The Spanish adaptation of SAMPA

1.1. Phonetic and phonological inventories of Spanish

1.1.1. The phonetic inventory

The traditional description of the inventory of phonetic segments used by Peninsular Spanish speakers is found in Navarro (1918). His early descriptive work can be completed with the detailed list of allophones compiled by Canellada - Kuhlman (1987). The number of allophones quoted by these two sources amounts to 20 vocalic elements and 43 consonantal segments. The causes of the allophonic variability according to these traditional sources can be summarized as follows:

Assimilations of place of articulation

• Interdental allophones of //, // and //, the labiodental allophone of //, dental allophones of /n/, // and /l/ and the palatal and velar allophones of /n/ are included in this category. The occurrence of these allophones is always conditioned by the place of articulation of the following consonant.

• Vowel quality can be modified according to the following consonant: /a/ has a palatal allophone and a velar one; the quality of the other vowels is also changed by the following consonant in the same syllable, producing open and close varieties.

Changes in manner of articulation

• The approximant allophones of /b/, /d/ and /g/ and the affricate allophone of /y/ appear according to the character of the preceding consonant.

Devoicing and voicing

• /b/, / /, // and /g/ have devoiced allophones in syllable-final position before a voiceless consonant.

• /θ/ and /s/ have voiced allophones when they are followed by a voiced consonant.

Position in the syllable and syllabic type

• According to Navarro (1918), the Spanish vowels have close and open allophones depending on the structure of the syllable in which they appear; vowels tend to be close in CV syllables and open in CVC syllables.

• The vowels /i/ and /u/ have allophonic variants, known as semiconsonants or semivowels, according to their nuclear or peripheral position in the syllable.

• The approximant allophones of /b/, /d/ and /g/ and the affricate allophone of /y/ are also conditioned by their position in the syllable.

Position in the word

• Initial vs. non-initial position in the word is another factor that controls the appearance of the approximant allophones of /b/, /d/ and /g/ and the affricate allophone of /y/.

• Traditional phoneticians such as Navarro (1918) distinguish lax allophones of the vowels depending on their position within the word.

Stress

• As well as the position in the word, the situation with respect to the main stress also results in lax allophones of the vowels. The affricate allophone of /y/ is also conditioned by stress.

1.1.2. The phonological inventory

On the other hand, the inventory of phonological units proposed by classical authors such as Alarcos (1950) consists of 5 vowels and 19 consonants. The phonological segments identified by Alarcos are the following (transcribed according to IPA conventions):

Phoneme (IPA)   Description
p               voiceless labial plosive
b               voiced labial plosive
f               voiceless labiodental fricative
t               voiceless dental plosive
d               voiced dental plosive
θ               voiceless interdental fricative
tʃ              voiceless palatal affricate
ʝ               voiced palatal fricative
s               voiceless alveolar fricative
k               voiceless velar plosive
g               voiced velar plosive
x               voiceless velar fricative
m               voiced labial nasal
n               voiced alveolar nasal
ɲ               voiced palatal nasal
l               voiced alveolar lateral
ʎ               voiced palatal lateral
ɾ               voiced alveolar tap
r               voiced alveolar trill
a               central open vowel
e               front mid vowel
i               front close vowel
o               back mid rounded vowel
u               back close rounded vowel

In order to be able to use a manageable number of units, but also to ensure a certain amount of phonetic detail, a compromise has been sought between the maximal number of allophones and the relatively short list of phonological units.

1.2. Statistical study of the occurrence of Spanish allophones

To arrive at such a compromise, a statistical study of the frequency of occurrence of the Spanish allophones has been undertaken. Since no data on the distribution of the allophones were available¹, it was decided to undertake a pilot experiment to evaluate the frequency of occurrence of the set of allophones described in the literature.

Three native Spanish speakers aged between 20 and 40 were interviewed by one experimenter for around one hour to obtain a large sample of speech. The interviewer restricted their interventions to a minimum, so that semi-spontaneous guided interviews were obtained. The recordings took place in an acoustically controlled environment using professional recording equipment. An orthographic transcription was made, introducing punctuation according to prosodic, syntactic and semantic criteria. This transcription was the input to an automatic grapheme-to-allophone conversion programme, which generated a phonetic output with most of the allophones described in the literature. A sample of more than 100,000 segments was obtained, and the frequency of occurrence of each allophone, as well as other parameters, was computed.
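The frequency computation described above can be illustrated with a minimal Python sketch. It is not the programme used in the study: the function name `allophone_frequencies` and the toy input are assumptions made for illustration, and the input is taken to be a list of already transcribed SAMPA segments.

```python
from collections import Counter


def allophone_frequencies(segments):
    """Return the percentage of occurrence of each symbol in a segment list."""
    counts = Counter(segments)
    total = sum(counts.values())
    if total == 0:
        return {}  # an empty sample has no frequencies to report
    return {symbol: 100.0 * n / total for symbol, n in counts.items()}


# Hypothetical toy input: the SAMPA segments of "lago" and "cada".
sample = ["l", "a", "G", "o", "k", "a", "D", "a"]
freqs = allophone_frequencies(sample)
print(freqs["a"])  # 3 of the 8 segments are "a"
```

With a real corpus the list would hold the full output of the grapheme-to-allophone conversion, one entry per segment.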

1.3. Final inventory for Spanish

A final inventory was established by eliminating all the allophones with a frequency of occurrence below 0.10% in the corpus analyzed. Following this procedure, 31 segments were retained. The following table shows the IPA transcription for each allophone, its phonetic definition, the frequency of occurrence in the analyzed corpus and the frequency of occurrence quoted by Rojo (1991) when available.

IPA   Description                          % of occurrence in     % of occurrence according
                                           the corpus analyzed    to Rojo (1991)
p     voiceless bilabial plosive           2.6                    2.66
b     voiced bilabial plosive              0.45                   2.66
t     voiceless dental plosive             4.63                   4.48
d     voiced dental plosive                0.76                   4.79
k     voiceless velar plosive              4.04                   3.98
g     voiced velar plosive                 0.11                   0.95
m     voiced bilabial nasal                3.63                   3.09
n     voiced alveolar nasal                7.02                   6.99
ɲ     voiced palatal nasal                 0.27                   0.19
ŋ     voiced velar nasal                   0.46                   included in /n/
tʃ    voiceless palatal affricate          0.40                   0.28
β     voiced bilabial approximant          2.47                   included in /b/
f     voiceless labiodental fricative      0.51                   0.68
θ     voiceless interdental fricative      1.53                   1.68
ð     voiced dental approximant            3.20                   included in /d/
s     voiceless alveolar fricative         6.95                   7.58
z     voiced alveolar fricative            1.33                   included in /s/
ʝ     voiced palatal fricative             0.19                   0.22
x     voiceless velar fricative            0.63                   0.73
ɣ     voiced velar approximant             0.79                   included in /g/
l     voiced alveolar lateral              4.25                   5.08
ʎ     voiced palatal lateral               0.54                   0.38
r     voiced alveolar trill                0.40                   0.79
ɾ     voiced alveolar tap                  4.25                   5.67
i     front close vowel                    4.29                   7.5
j     voiced palatal approximant           2.60                   included in /i/
e     front mid vowel                      13.72                  13.51
a     central open vowel                   13.43                  13.40
o     back mid rounded vowel               10.37                  9.57
u     back close rounded vowel             1.98                   3.16
w     voiced labial-velar approximant      1.35                   included in /u/

¹ Previously published studies were carried out considering only phonological segments (see, for example, Rojo (1991)).

Thus, the final inventory contains the 24 phonemes defined by Alarcos (1950) plus 7 segments traditionally considered allophones: the three approximant variants of the voiced plosives ([β ð ɣ]), the voiced allophone of /s/ ([z]), the velar allophone of /n/ ([ŋ]) and the two semiconsonants or semivowels ([j w]).
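The selection criterion of this section (discard every allophone whose frequency of occurrence is below 0.10%) can be sketched as follows. The function name `select_inventory` is hypothetical; the values for [s] and [g] are those reported for the analyzed corpus, while the value given for [dZ] is invented purely to show a rejected segment, since the report does not quote its frequency.

```python
def select_inventory(frequencies, threshold=0.10):
    """Keep the symbols whose percentage of occurrence is not below the threshold."""
    return sorted(s for s, pct in frequencies.items() if pct >= threshold)


corpus_percentages = {
    "s": 6.95,   # voiceless alveolar fricative, value from the analyzed corpus
    "g": 0.11,   # voiced velar plosive, value from the analyzed corpus
    "dZ": 0.05,  # illustrative value only, not taken from the report
}
retained = select_inventory(corpus_percentages)
print(retained)
```

Note that [g], at 0.11%, is only just above the cut-off and is therefore retained.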

1.4. Phonetic notation in Spanish using SAMPA

In our Spanish adaptation of SAMPA we have taken into account the proposals made by Wells (1989: 52-53), which are summarized below:

Approximant allophones of / b d g /

Since [D] (IPA [ð]) and [G] (IPA [ɣ]) already exist in SAMPA, only [B] is needed to represent the approximant [β].

Alveolar trill

The digraph [rr] can be used to represent the alveolar trill.

Affricate allophone of /y/

The affricate allophone of /y/ can be symbolized by [dZ] (IPA [dʒ]). However, this allophone has not been retained in our basic inventory due to its low frequency of occurrence.

Palatal fricative consonant

According to Wells (1989: 52) the palatal fricative consonant /y/ can be considered an allophone of [j]. Alarcos (1950: § 98) offers convincing arguments in favor of the phonological status of /y/ based on functional grounds and cites minimal pairs contrasting /y/ and the other consonants of the phonological system. His solution is widely accepted in the literature on Spanish and, moreover, it does not seem to run counter to native speakers' intuitions. It is widely accepted that /y/ can be realized as a fricative, as an approximant, and also as an affricate under certain conditions. Following Wells' suggestion (personal communication), this phoneme will be represented in SAMPA by the digraph /jj/.

The following table summarizes the set of symbols that can be used in SAMPA to transcribe the phonemes and allophones of Spanish selected according to the previously described criteria.

IPA   SAMPA   Description                       Example   Transcription
p     p       voiceless bilabial plosive        pala      "pala
b     b       voiced bilabial plosive           bala      "bala
t     t       voiceless dental plosive          tala      "tala
d     d       voiced dental plosive             dar       dar
k     k       voiceless velar plosive           cala      "kala
g     g       voiced velar plosive              gala      "gala
m     m       voiced bilabial nasal             mala      "mala
n     n       voiced alveolar nasal             nada      "naDa
ɲ     J       voiced palatal nasal              caña      "kaJa
ŋ     N       voiced velar nasal                hongo     "oNgo
tʃ    tS      voiceless palatal affricate       chico     "tSiko
β     B       voiced bilabial approximant       lava      "laBa
f     f       voiceless labiodental fricative   falso     "falso
θ     T       voiceless interdental fricative   zona      "Tona
ð     D       voiced dental approximant         cada      "kaDa
s     s       voiceless alveolar fricative      sala      "sala
z     z       voiced alveolar fricative         desde     "dezDe
ʝ     jj      voiced palatal fricative          ayer      a"jjer
x     x       voiceless velar fricative         jamón     xa"mon
ɣ     G       voiced velar approximant          lago      "laGo
l     l       voiced alveolar lateral           la        la
ʎ     L       voiced palatal lateral            llana     "Lana
r     rr      voiced alveolar trill             carro     "karro
ɾ     r       voiced alveolar tap               caro      "karo
i     i       front close vowel                 tila      "tila
j     j       voiced palatal approximant        labio     "laBjo
e     e       front mid vowel                   tela      "tela
a     a       central open vowel                tal       tal
o     o       back mid rounded vowel            todo      "toDo
u     u       back close rounded vowel          tul       tul
w     w       voiced labial-velar approximant   agua      "aGwa

If there is a need to represent other allophones not present in the set of segments described, the following SAMPA symbols are available:

IPA   SAMPA   Description                       Example    Transcription
dʒ    dZ      voiced palatal affricate          conyugal   kondZu"Gal
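As an illustration of how the symbol set above might be handled in software, the following Python sketch converts a SAMPA string into IPA. The mapping follows the standard IPA values of the phonetic descriptions given in the tables; the names `SAMPA_TO_IPA` and `sampa_to_ipa` are hypothetical, and the longest-match strategy for the digraphs [tS], [dZ], [rr] and [jj] is an assumption of this sketch rather than part of the SAMPA proposal.

```python
# SAMPA symbol -> IPA symbol, following the phonetic descriptions in the tables.
SAMPA_TO_IPA = {
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    "m": "m", "n": "n", "J": "ɲ", "N": "ŋ", "tS": "tʃ", "dZ": "dʒ",
    "B": "β", "f": "f", "T": "θ", "D": "ð", "s": "s", "z": "z",
    "jj": "ʝ", "x": "x", "G": "ɣ", "l": "l", "L": "ʎ",
    "rr": "r", "r": "ɾ",
    "i": "i", "j": "j", "e": "e", "a": "a", "o": "o", "u": "u", "w": "w",
    '"': "ˈ",  # the SAMPA stress mark precedes the stressed syllable
}


def sampa_to_ipa(text):
    """Convert a SAMPA string to IPA, trying two-character symbols first."""
    out = []
    i = 0
    while i < len(text):
        for length in (2, 1):
            chunk = text[i:i + length]
            if chunk in SAMPA_TO_IPA:
                out.append(SAMPA_TO_IPA[chunk])
                i += len(chunk)
                break
        else:
            out.append(text[i])  # unknown characters (e.g. spaces) pass through
            i += 1
    return "".join(out)


print(sampa_to_ipa('"kaJa'))   # caña
print(sampa_to_ipa('"karro'))  # carro
```

Trying the digraphs first is what keeps [rr] (trill) apart from a single [r] (tap) and [jj] (fricative) apart from [j] (approximant).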

2. Automatic phonetic transcription for Spanish: generating SAMPA representations from orthographic representations

2.1. Grapheme to allophone correspondences

In order to produce an automatic transcription, the correspondences between the graphemes and the SAMPA symbols have to be established. The following table summarizes some of the main correspondences that have been taken into account to design the transcription algorithm.

Grapheme  Rules for the transcription to SAMPA                            Examples
<a>       a
<b>       after a pause: b
          after <m> or <n>: b                                             comba: "komba
          other cases: B                                                  labio: "laBjo
<c>       followed by <e> or <i>: T                                       celo: "Telo
          in word-final position followed by <…> or <…>: G
          followed by <…>, <…>, <…> (preceding <…>, <…>, <…>),
          <…>, <…>, <ñ> or <…>: G                                         acné: aG"ne
          other cases: k                                                  tacto: "takto
                                                                          coro: "koro
                                                                          tecla: "tekla
<ch>      tS                                                              chelo: "tSelo
<d>       after a pause: d
          after <…>, <…> or <…>: d                                        caldo: "kaldo
          other cases: D                                                  codo: "koDo
<e>       e
<f>       f                                                               cofia: "kofja
<g>       after a pause and followed by <…>, <…>, <…>, <…> or <…>: g
          after <…> or <…> and followed by <…>, <…> or <…>: g             tongo: "tongo
          followed by <e> or <i>: x                                       genio: "xenjo
          other cases: G                                                  tigre: "tiGre
                                                                          lago: "laGo
<h>       in word-initial position followed by <…>: jj                    hierba: "jjerBa
          other cases: no sound                                           halo: "alo
<i>       in nuclear position in the syllable: i                          tipo: "tipo
          in non-nuclear position in the syllable: j                      cielo: "Tjelo
<j>       x                                                               jarana: xa"rana
<k>       k                                                               kiosko: "kjosko
<l>       l                                                               lote: "lote
<ll>      L                                                               tallo: "taLo
<m>       m                                                               arma: "arma
<n>       followed by <p>, <b>, <v>, <f> or <m>: m                        ánfora: "amfora
          other cases: n                                                  cono: "kono
<ñ>       J                                                               uña: "uJa
<o>       o
<p>       p                                                               perro: "perro
<q>       always followed by <u>: k                                       queso: "keso
<r>       in word-initial position: rr                                    rama: "rrama
          preceded by <n>, <l> or <s>: rr                                 honra: "onrra
          other cases: r                                                  arpa: "arpa
                                                                          trampa: "trampa
                                                                          pera: "pera
                                                                          amor: a"mor
<rr>      rr                                                              carro: "karro
<s>       s                                                               rasgo: "rrasGo
                                                                          casa: "kasa
                                                                          trasto: "trasto
<t>       in syllable-final position: D                                   atleta: aD"leta
          other cases: t                                                  toro: "toro
<u>       without diaeresis, preceded by <q> or <g>: no sound             queso: "keso
          in non-nuclear position in the syllable: w                      cigüeña: Ti"GweJa
          in nuclear position in the syllable: u                          lujo: "luxo
<v>       after a pause: b
          after <m> or <n>: b                                             con velo: kom "belo
          other cases: B                                                  calvo: "kalBo
<w>       in foreign words: like gu, gü or like a <b>                     whisky: "gwiski
                                                                          kiwi: "kiBi
<x>       in non-word-initial position and followed by a vowel: Gs        examen: eG"samen
          other cases: s                                                  externo: es"terno
<y>       in initial position of a syllable with two or more sounds: jj   yunque: "jjunke
                                                                          cónyuge: "konjjuxe
          other cases: processed like an <i>                              dos y dos: dos i "Dos
                                                                          muy: mwi
<z>       T                                                               zarza: "TarTa
                                                                          tizne: "tiTne
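A small part of these correspondences can be sketched in Python as follows. This is only an illustrative approximation, not the transcription algorithm of the project. It treats each word as if it were spoken after a pause and produces no stress marks or glides. It does not handle accented vowels, <gu> before <e> or <i>, syllable-final rules, or the graphemes <x>, <y> and <w>. The function name `transcribe` is hypothetical, and the left contexts assumed for the plosive [d] (<n>, <l>) and for [rr] (<n>, <l>, <s>) follow the usual description of Spanish rather than the report itself.

```python
def transcribe(word):
    """Transcribe one lowercase word, taken as spoken after a pause, into SAMPA."""
    w = word.lower()
    out = []
    i = 0
    while i < len(w):
        ch = w[i]
        prev = w[i - 1] if i > 0 else ""          # "" stands for a pause
        nxt = w[i + 1] if i + 1 < len(w) else ""
        step = 1
        if w.startswith("ch", i):
            sym, step = "tS", 2
        elif w.startswith("ll", i):
            sym, step = "L", 2
        elif w.startswith("rr", i):
            sym, step = "rr", 2
        elif w.startswith("qu", i):
            sym, step = "k", 2                    # <u> after <q> is silent
        elif ch in ("b", "v"):
            sym = "b" if prev in ("", "m", "n") else "B"
        elif ch == "d":
            sym = "d" if prev in ("", "n", "l") else "D"   # assumed contexts
        elif ch == "g":
            if nxt in ("e", "i"):
                sym = "x"
            else:
                sym = "g" if prev in ("", "n") else "G"
        elif ch == "c":
            sym = "T" if nxt in ("e", "i") else "k"
        elif ch == "z":
            sym = "T"
        elif ch == "j":
            sym = "x"
        elif ch == "ñ":
            sym = "J"
        elif ch == "h":
            sym = ""                              # <h> has no sound
        elif ch == "r":
            sym = "rr" if prev in ("", "n", "l", "s") else "r"  # assumed contexts
        else:
            sym = ch                              # a e i o u f k l m n p s t
        out.append(sym)
        i += step
    return "".join(out)


for example in ("lago", "codo", "caldo", "comba", "queso", "rama"):
    print(example, transcribe(example))
```

A complete implementation would additionally need syllabification (for the glides [j] and [w]), stress assignment and the treatment of word boundaries within an utterance.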

3. References

ALARCOS, E. (1950) Fonología española. Madrid: Gredos (Biblioteca Románica Hispánica, Manuales 1), 1965 4ª ed. aumentada y revisada.

CANELLADA, M. J. - KUHLMAN MADSEN, J. (1987) Pronunciación del español. Lengua hablada y literaria. Madrid: Castalia.

NAVARRO TOMÁS, T. (1918) Manual de pronunciación española. Madrid: Consejo Superior de Investigaciones Científicas, Instituto Miguel de Cervantes (Publicaciones de la Revista de Filología Española, III). 21ª edición, 1982.

ROJO, G. (1991) "Frecuencia de fonemas en español actual", in BREA, M. - FERNANDEZ REI, F. (Coord.) Homenaxe ó profesor Constantino García. Santiago de Compostela: Universidade de Santiago de Compostela, Servicio de Publicación e Intercambio Científico. Pp. 451-467.

WELLS, J. C. (1989) "Computer-coded phonemic notation of individual languages of the European Community", Journal of the International Phonetic Association 19,1: 31-54.
