

GESTURES AND LINGUISTIC FUNCTION IN LEARNING RUSSIAN: PRODUCTION AND PERCEPTION STUDIES OF RUSSIAN PALATALIZED CONSONANTS

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree Doctor of Philosophy in the Graduate

School of The Ohio State University

By

Erin Elizabeth Diehm, M.A.

The Ohio State University 1998

Dissertation Committee:

Professor Anelya Rugaleva, Co-Adviser

Professor Keith Johnson, Co-Adviser

Professor Ernest Scatton

Approved by: Co-Advisers, Department of Slavic and East European Languages and Literatures

UMI Number: 9833968

UMI Microform 9833968 Copyright 1998, by UMI Company. All rights reserved.

This microform edition is protected against unauthorized copying under Title 17, United States Code.

UMI, 300 North Zeeb Road, Ann Arbor, MI 48103

Copyright by Erin Elizabeth Diehm 1998

ABSTRACT

Drawing on the inextricably intertwined nature of production and perception abilities, this dissertation provides a dynamic account of adult second-language learners’ perceptual and production accents. To this end, I demonstrate with acoustic-phonetic studies native speakers’ and second-language learners’ differing patterns in production and perception of sound sequences that are particularly problematic for adult second-language learners: sequences containing the palatalized consonants of Russian. In particular, I investigate four C-V sequences of Russian, /CV/, /CʲV/, /CʲjV/, and /CʲijV/, with one acoustic production study and one acoustic perception study. Resulting data are statistically analyzed and discussed in light of articulatory-phonological theory, as well as phonetic theories of acoustic salience, linguistic knowledge and acoustic strategies. In the production study native-Russians make more distinctions than learners, thereby indicating Russians’ “better” production performance. On the other hand, in the perception study learners make more distinctions than native-Russians, thereby indicating learners’ “better” perception performance. In other words, Russians do not distinguish in perception differences that they make in production, while learners do not distinguish in production differences that they make in perception. Theoretical explanations for native-Russians’ and learners’ differing performances are offered and developed.

Chapter 1 presents several theories of first-language (L1) and second-language (L2) acquisition. In particular, the Critical Age Hypothesis is discussed in light of L2 learners’ virtually inevitable permanent accent. However, I focus on recent claims that while adult L2 learners may never attain native-like proficiency of L2 phonetics and phonology, they, nonetheless, can continue to reduce their L2 accent with continued study of and exposure to L2. Most importantly, learners’ improvements in L2 phonetics are seen to occur not in a categorical, discrete manner, but in a gradual, continuous manner.

Chapter 2 presents an overview of the Russian sound system. The primary organizational role of Russian palatalization is emphasized in light of adult second language acquisition of Russian. Descriptions of articulatory and acoustic properties of Russian palatalized consonants, or “soft” consonants, are given; secondary tongue body raising and its associated raised second formant (F2) frequency are presented. Finally, I discuss several proposed sources of learners’ accented productions of the soft consonants, including orthography, acoustic saliency, phonological interpretation and phoneme mapping, which results in learners’ incorrect mapping of Russian simple-palatalized, /CʲV/, and palatalized-yod sequences, /CʲjV/, onto “similar” English sequences of /C/+/j/.

In an endeavor to establish a phonological model that is suitable for the purposes of this study, chapter 3 briefly reviews several major phonological frameworks. With their categorical discrete features that are externally timed, both linear and non-linear models are rejected. I argue, instead, that Browman and Goldstein’s (1989) Gestural Model, with its phonological unit of the gesture and intrinsic timing approach, provides an ideal framework for describing learners’ gradual stages of acquisition.

In chapter 4, I report the results of an acoustic production study. Statistical analyses of F2 duration and frequency data indicate differing patterns of productions for native-Russian speakers and learners of Russian. Russians distinguish /CʲV/ and /CʲjV/ in both F2 frequency and duration while learners produce the two sequences identically, thereby adding support for claims of learners’ proposed phonological interpretation. Both Russians and learners distinguish /CʲjV/ and /CʲijV/ in F2 duration but not F2 frequency. Comparative analyses of learners’ productions of native-English /C/+/j/+/V/ and L2 Russian /CʲV/, /CʲjV/ and /CʲijV/ indicate that learners produce English palatal sequences with a Tongue Body gesture that is later than in their Russian productions, thereby indicating that learners have, to some degree, acquired the more sequential nature of relevant gestural timings. Insofar as second formant data reflect lingual movements, acoustic findings are then associated with their articulatory sources via Gestural Phonology. Gestural Scores of Russians’ “native” productions and learners’ “accented” productions are given. Incorrect inter-articulatory timings and incorrect gestural amplitudes and durations are proposed as a source of learners’ accents.

Using as its source native-Russian productions from chapter 4, chapter 5 reports an acoustic perception study that investigates Russians’ and learners’ perception of the aforementioned four C-V sequences. As with the production study, Russians and learners exhibit differing general perceptual patterns. Most importantly, learners’ “better” performance indicates the presence of a modified near-merger in Russian that, due to their incomplete linguistic knowledge, learners disregard. I also investigate the effect of vowel type and offer initial supporting evidence that temporal cues are more acoustically salient than spectral cues. Finally, dividing learners into those with less linguistic experience and more linguistic experience, I provide evidence of advanced learners’ improved perceptual performance. Learners’ improved, more Russian-like performance provides initial evidence of L2 acquisition of both phonetic and functional knowledge.

Finally, in chapter 6, I summarize the major findings of preceding chapters and propose several related future phonetic studies.

Dedicated to my mother, who encouraged me to see the world, and to my father, who gave me the tools to succeed.

ACKNOWLEDGMENTS

I must first express deep gratitude to my dissertation committee members: Dr. Anelya Rugaleva from the Department of Slavic and East European Languages and Literatures at The Ohio State University, Dr. Keith Johnson from the Department of Linguistics at The Ohio State University and Dr. Ernest Scatton from the Department of Languages, Literatures and Cultures at SUNY Albany. This dissertation would have been impossible without their unflagging support and encouragement.

Dr. Rugaleva believed in me from the beginning—especially during my initial years of graduate study—and, perhaps unknowingly, instilled in me a true appreciation of the in its broader humanistic context. Dr. Scatton, who was involved in the project from its inception, contributed greatly in the presentation of my research and provided much appreciated support. His encyclopedic knowledge of Russian linguistics, and in particular phonology, added much depth to the content of this dissertation, and his keen sense for language contributed greater stylistic facility. I cannot express in words the extent of my indebtedness to Dr. Johnson. At the professional level, he sparked my interest in the field of phonetics and guided me with infinite patience, careful thought and wisdom throughout the entire dissertation process. At a more personal level, Keith (as he preferred to be called) was a true mentor, who always found time for thoughtful discussion (even at a second’s notice). His vast and profound knowledge of phonetics, coupled with a down-to-earth style and natural humility, kept me humble and made me realize how much I still had to learn. The old cliché that “this work couldn’t have been written without him” is no exaggeration.

The Department of Slavic and East European Languages and Literatures at The Ohio State University also deserves special mention. The many years of teaching and research assistantships released me from any financial worries and allowed me to focus my energies on research and scholarship.

I would also like to thank the faculty, staff and graduate students from the Department of Linguistics at The Ohio State University for their kindness and generosity. They gave me complete access to all equipment and allowed me to monopolize the CSL machine for days at a time. Always making me feel welcome in their “phonies” world, they provided me with friendship, thoughtful conversation and expert assistance on numerous occasions.

I would also like to thank my family members, Mom, Dad, Deirdre and Pat, who always supported me and listened to me during my many years as a “professional student”. My mother’s wisdom and empathetic ear provided me with lessons valuable at any stage of life. A natural teacher, she has been exceptional in her enthusiastic and undying support throughout this arduous project. She understood the value of the graduate education as lying in the process itself and never lost sight of the real purpose of the Ph.D.: broadening one’s mind, exploring the world through travel and , while developing one’s mental capacities. To be able to call such a wise person one’s mother is truly a blessing.

I must also thank my dear friends, Carol, Glenn, Lena, Misha, Robert and Ruth (in alphabetical order!), who endured countless hours of self-centered conversation and bolstered my spirits in moments of self-doubt. Their constant support and genuine interest gave me the strength to complete my degree. I honestly could not have finished this project without my “comrades in arms”. My dear friend Robert, the “computer guru,” was always patient, offered expert help with all computer matters and gave much-appreciated encouragement in the form of telephone cheerleading. Lena provided invaluable support in the form of warm, concerned daily e-mail messages during the final months of writing. Carol, my lane #1 buddy, was always there when I needed understanding, encouragement or just someone to listen. Ruth, the Russian instructor, has always made me feel that I have much to offer students of Russian. Glenn, a fellow “seeker,” has been a true friend who was always ready for good conversation and, in the final stages of my writing, offered exceptional editing help. Finally, Misha was a true friend from the first day of graduate school. Having twice traveled through Europe and Russia together, we shared many special moments as we grew during the graduate-student experience. It is no surprise that it was Misha who gave me the initial “umph” that set the ball rolling in the summer of ’97.

Finally, I would like to express my love and appreciation to Dennis, a kind soul and trustworthy captain. While I was not always the most pleasant person to be around during this past year, Dennis never failed to shower me with love and kindness. Thank you.

This research was supported by a Graduate Student Alumni Research Award from the graduate school of The Ohio State University.

VITA

November 12, 1965 ...... Born - Oklahoma City, Oklahoma, U.S.A.

1989 ...... B.S. Chemistry, Oklahoma State University

1991 ...... M.A. Russian Language, Literature and Linguistics, The Ohio State University

1989 - present ...... Graduate Teaching Associate and Research Associate, The Ohio State University

PUBLICATIONS

1. Diehm, Erin and Keith Johnson. 1997. Near-merger in Russian palatalization. In Kim Ainsworth-Darnell and Mariapaola D’Imperio (eds.), OSU Working Papers in Linguistics, vol. 50.

FIELDS OF STUDY

Major Field: Slavic and East European Languages and Literatures

Specialization: Slavic linguistics

TABLE OF CONTENTS

ABSTRACT...... ii

DEDICATION...... v

ACKNOWLEDGMENTS...... vi

VITA...... ix

LIST OF TABLES...... xiii

LIST OF FIGURES...... xvii

CHAPTER 1. Introduction...... 1

1.1 Preface ...... 1
1.2 Opening remarks ...... 1
1.3 The L1 acquisition process: A brief review ...... 4
1.4 The L2 acquisition process: Review of past theories and where we stand today ...... 6
1.5 Attaining a good L2 phonetics: A teacher’s justification ...... 10
1.6 The present study: The phonetics of L1 acquisition and Russian palatalized sequences ...... 12

CHAPTER 2. An overview of the Russian consonantal system ...... 15

2.1 Introduction ...... 15
2.2 Overview of the Russian consonantal system ...... 16
2.2.1 Orthographic representation of Russian palatalization ...... 19
2.2.2 Orthography and phonetic realizations of the palatal glide ...... 21
2.3 Russian palatalization and L2 acquisition ...... 23
2.3.1 General approaches to teaching Russian palatalization ...... 23
2.3.2 Accented productions of the palatalized consonants ...... 24
2.3.3 Russian palatalization: An uncommon phenomenon ...... 25
2.4 Articulation of palatalization ...... 29
2.4.1 Terminology: Co-articulation vs. ...... 29
2.4.2 General articulatory characteristics of palatalized consonants ...... 30
2.4.3 The active articulator of secondary palatalization ...... 33
2.4.4 The passive articulator of secondary palatalization in light of /j/ ...... 36
2.4.5 /CʲV/ vs. /CʲjV/: Palatalization vs. palatals ...... 38
2.5 Acoustics of palatalization ...... 41
2.6 Summary ...... 44

CHAPTER 3. An overview of phonological models and palatalization ...... 46

3.1 Traditional phonological models ...... 46
3.1.1 Palatalization in Distinctive Feature Theory ...... 46
3.1.2 Application of the JFH Distinctive Feature Theory to Russian: The Sound Pattern of Russian ...... 47
3.1.3 Palatalization in the SPE framework ...... 48
3.1.4 Palatalization in a Non-Linear framework ...... 49
3.1.5 Summary: Rejecting previous phonological models ...... 54
3.2 Gestural Phonology ...... 56
3.2.1 Introduction to Gestural Phonology ...... 56
3.2.2 The dynamic and compensatory nature of Gestural Phonology ...... 57
3.2.3 Gestural units and their organization ...... 59
3.2.4 Gestures and their specified degrees of overlap ...... 62
3.2.5 Modifying traditional gestural overlap for L2 acquisition: Phase Windows ...... 65
3.3 Summary ...... 67

CHAPTER 4. Articulatory implications of an acoustic production study of Russian palatalized sequences: The gestural model and interphonetics ...... 69

4.1 Introduction ...... 69
4.2 Methods ...... 70
4.2.1 Subjects ...... 70
4.2.2 Materials: Determining C, V and placement ...... 72
4.2.3 Speech sample: Establishing contrasting word groups ...... 75
4.2.4 Procedures ...... 76
4.2.5 Data analysis ...... 77
4.2.5.1 Spectrograph procedures and settings ...... 77
4.2.5.2 Discussion of Weismer, et al. (1992) ...... 78
4.2.5.3 Sample spectrograms ...... 78
4.3 Results ...... 85
4.3.1 F2 trajectories for native Russian speakers ...... 86
4.3.1.1 Russians’ means of F2 frequencies and durations ...... 86
4.3.1.2 Russians’ time-aligned F2 trajectories ...... 88
4.3.1.3 Graphs of time-aligned trajectories for female and male Russians ...... 89
4.3.2 F2 trajectories for learners ...... 92
4.3.2.1 Learners’ means of F2 frequencies and durations ...... 92
4.3.2.2 Learners’ time-aligned F2 trajectories ...... 94
4.3.2.3 Graphs of time-aligned trajectories for female and male learners ...... 94
4.3.3 Statistical analyses of F2-at-transition-onset: Russians ...... 97
4.3.4 Statistical analysis of F2-at-transition-onset: Learners ...... 105
4.3.5 F2 steady-state duration: Russians ...... 110
4.3.6 F2 steady-state duration: Learners ...... 114
4.3.7 Comparing Russians’ and learners’ absolute mean values of F2-steady-state durations ...... 116
4.4 Summary of statistical analyses of Russians’ and learners’ F2 formant trajectories ...... 119
4.5 Acoustic measurements and their associated articulations ...... 122
4.6 Learners’ productions of AE /C/ + /j/ + /u/ versus /CʲV/, /CʲjV/ and /CʲijV/ ...... 123
4.6.1 Methods ...... 124
4.6.1.1 Subjects ...... 124
4.6.1.2 Procedures ...... 124
4.6.1.3 Data analysis ...... 125
4.6.2 Results ...... 125
4.6.3 Discussion ...... 136
4.7 Native and nonnative palatalization in the Gestural Phonology framework ...... 137

CHAPTER 5. An acoustic perception study: Evidence for differing perceptual patterns, acoustic strategies and effect of linguistic experience ...... 146

5.1 Introduction and discussion of relevant SLA theory ...... 146
5.2 Methods ...... 149
5.2.1 Subjects ...... 149
5.2.2 Stimuli ...... 154
5.2.3 Procedure ...... 155
5.3 Results: /a/, /u/ and /i/ combined ...... 157
5.3.1 Presentation of data ...... 157
5.3.2 Discussion ...... 166
5.4 Results: Effect of vowel context ...... 169
5.4.1 Presentation of data ...... 169
5.4.2 Discussion ...... 178
5.5 Results: Learners’ range of experience ...... 183
5.5.1 Presentation of data ...... 184
5.5.2 Discussion ...... 189
5.6 Summary ...... 191

CHAPTER 6. Conclusion...... 195

APPENDIX A. Ordered word-list: C-V sequences, phonetic transcriptions and glosses ...... 205

APPENDIX B. Random order word-list read by participants ...... 208

BIBLIOGRAPHY...... 210

LIST OF TABLES

Table 2.1. Set of minimal pairs contrasting for Russian palatalization ...... 17

Table 2.2. The paired consonants of Russian. (From Avanesov, 1972:34.) ...... 17

Table 2.3. Classification of the Russian consonants by place and manner of articulation. The following abbreviations are used: vd. = “voiced,” vls. = “voiceless.” (Adapted from Avanesov, 1972:37.) ...... 18

Table 2.4. Orthographic representations of [j]. (Adapted from Hamilton, 1980:54-55.) ...... 22

Table 2.5. Phonetic realizations of /j/. (Adapted from Antonova, 1988:137.) ...... 23

Table 2.6. Articulatory divisions of the tongue. (From De Armond, 1966:110.) ...... 33

Table 2.7. Articulatory divisions of the tongue. (From Recasens, 1990.) ...... 35

Table 3.1. The acoustically-based distinctive features for the palatalized and non-palatalized voiced bilabial stops. (Adapted from Jakobson, Fant and Halle, 1963:43.) ...... 47

Table 3.2. SPE features for palatalized/non-palatalized pairs. (Adapted from Chomsky and Halle, 1968:307.) ...... 49

Table 3.3. Tract variables. (From Browman and Goldstein, 1989:344.) ...... 59

Table 3.4. Gestural symbols (From Browman and Goldstein, 1989:344.) ...... 60

Table 4.1. Russians’ means and other data for the five F2 measurements for the three C-V environments organized according to gender. The following abbreviations are used in the table: “N. of Cases” = the number of total observations for that specific measurement, “Min.” = the minimum value among all observations, “Max.” = the maximum value among all observations, “Stan. Dev.” = the standard deviation from the mean value among all observations...... 87

Table 4.2. Table of Russians’ time-aligned F2 frequency and duration measurements. The following abbreviations are used: /CʲV/ = simple-palatalized sequences, /CʲjV/ = palatalized-yod sequences, /CʲijV/ = palatalized-i-yod sequences ...... 89

Table 4.3. Learners’ means and other data for the five F2 measurements for the three C-V environments according to gender. The following abbreviations are used in the table: “N. of Cases” = the number of total observations for that specific measurement, “Min.” = the minimum value among all observations, “Max.” = the maximum value among all observations, “Stan. Dev.” = the standard deviation among the observations ...... 93

Table 4.4. Table of learners’ time-aligned F2 frequency and duration measurements. The following abbreviations are used: /CʲV/ = simple-palatalized sequences, /CʲjV/ = palatalized-yod sequences, /CʲijV/ = palatalized-i-yod sequences ...... 94

Table 4.5. ANOVA for both groups of Russian speakers. All four C-V environments included in calculations...... 98

Table 4.6. ANOVA for both groups of Russians. Only the three C-V environments containing a palatalized consonant are considered ...... 100

Table 4.7. Tukey post-hoc test for both Russian groups which indicates reliable differences in F2-at-transition-onset frequencies between the three C-V environments ...... 101

Table 4.8. Generalization of the reliable differences between the F2-at-transition-onset for both Russian groups for the three environments CʲV, CʲjV and CʲijV ...... 101

Table 4.9. Tukey post-hoc test of effect of consonant type and C-V environment on the F2-at-transition-onset frequency. The following abbreviations are used: B = labials (/bʲ/, /mʲ/, /vʲ/), D = dentals (/dʲ/, /zʲ/), L = lateral (/lʲ/) and R = trill (/rʲ/). The top table gives data for female Russians, the bottom table is for male Russians ...... 103

Table 4.10. Summarizing results of Tukey post-hoc test which tested for the effects of C-V environment and consonant type on the F2-at-transition-onset frequency. Results for female and male Russians combined ...... 104

Table 4.11. ANOVA for both groups of learners ...... 105

Table 4.12. Tukey post-hoc test for both learner groups which indicates reliable differences in F2-at-transition-onset frequencies between the three C-V environments ...... 106

Table 4.13. Generalization of the differences between the F2-at-transition-onset for both learner groups for the three environments CʲV, CʲjV and CʲijV ...... 106

Table 4.14. Tukey post-hoc analysis focusing on the effect of consonant type and C-V environment on the F2-at-transition-onset frequency. Only the three C-V environments containing a palatalized consonant are included. The following abbreviations are used: B = labials (/bʲ/, /mʲ/, /vʲ/), D = dentals (/dʲ/, /zʲ/), L = lateral (/lʲ/) and R = trill (/rʲ/). The top table is for female learners, the bottom table is for male learners ...... 108

Table 4.15. Summarizing results of Tukey post-hoc test which tested for the effects of C-V environment and consonant type on the F2-at-transition-onset frequency. Results for female and male learners combined ...... 109

Table 4.16. Summary comparison of F2-at-transition-onset frequencies for the three C-V environments for learners and native Russians ...... 110

Table 4.17. The SCS Test of Fixed Effects on the dependent variable f2ssdur ...... 112

Table 4.18. Russians’ F2-steady-state duration means (in sec.) according to consonant type and C-V environment. FR = female Russians, MR = male Russians ...... 112

Table 4.19. Statistically reliable (p<0.01) non-equivalencies of Russians’ mean F2-steady-state durations according to consonant type and C-V environment. Mean durations are given in seconds. FR = female Russians, MR = male Russians...... 114

Table 4.20. Learners’ F2-steady-state duration means (in sec.) according to consonant type and C-V environment. FE = female learners, ME = male learners...... 114

Table 4.21. Statistically reliable equivalencies and non-equivalencies (at p<0.01, unless otherwise noted) of learners’ F2 mean steady-state durations by gender, consonant type and C-V environment. FE = female learners, ME = male learners...... 116

Table 4.22. Summary comparison of F2-steady-state durations for the three C-V environments for learners and native Russians...... 116

Table 4.23. Comparison of learners’ and Russians' absolute F2-steady-state durations at p<0.05. The absolute difference between Russians’ and learners’ mean durations is given in milliseconds under the “greater than,” “less than,” or “equal” signs ...... 117

Table 4.24. Summary of statistical analyses of F2 trajectories for native Russians and learners ...... 119

Table 4.25. Table of four advanced learners’ time-aligned F2 frequency and duration measurements of “similar” English and Russian sequences. The following abbreviations are used: /C/+/j/+/V/ = American-English sequences, /CʲV/ = Russian simple-palatalized sequences, /CʲjV/ = Russian palatalized-yod sequences, /CʲijV/ = Russian palatalized-i-yod sequences. Means represent values averaged over the three labial consonants /b/, /m/, /v/ for the vowel /u/ ...... 132

Table 5.1. Initial raw data results from the perception study. The upper table (a) gives data for native-Russian speakers. The lower table (b) gives raw data for learners ...... 159

Table 5.2. Results from the perception study given in percentages. The upper table (a) gives percentages for Russians. The lower table (b) gives percentage results for learners ...... 160

Table 5.3. Raw data results from the perception study where effect of vowel type is considered. The upper table (a) gives data for native Russian speakers; the lower table (b) gives raw data for learners ...... 170

Table 5.4. Results from the perception study given in percentages where effect of vowel type is considered. The upper table (a) gives percentages for Russians. The lower table (b) gives percentage results for learners ...... 172

Table 5.5. Excerpt from overall percentage results in Table 5.4 for learners for /CʲV/ stimuli ...... 181

Table 5.6. Excerpt from overall percentage results in Table 5.4 for learners for /CʲjV/ stimuli ...... 182

Table 5.7. Excerpt from overall percentage results for learners for /CʲijV/ stimuli ...... 183

Table 5.8. Raw data counts for learners' performance in the perception experiment. Both vowel type and linguistic experience are considered. Vowel type is indicated as either /a/+/u/ or /i/. Data from learners having less linguistic experience are indicated with the abbreviation "UG" (undergraduate) while data from learners having more linguistic experience are indicated with the abbreviation "G" (graduate student) ...... 185

Table 5.9. Data percentages for learners' performance in the perception experiment according to vowel type and range of experience. Vowel type is indicated as either /a/+/u/ or /i/. Data from learners having less linguistic experience are indicated with the abbreviation "UG" (undergraduate learner), while data from learners having more linguistic experience are indicated with the abbreviation "G" (graduate learner) ...... 186

Table 5.10. Absolute percentage differences between undergraduate and graduate learners' performance in the perception experiment. The upper table gives results for sequences containing the vowels /a/ and /u/. The lower table gives results for sequences containing the vowel /i/. Outlined cells indicate where stimulus and response correspond. Double-underlined data indicate an especially interesting effect of greater linguistic experience ...... 188

LIST OF FIGURES

Figure 3.1. Representative tree structure of Non-linear Phonology, specifically, the Halle-Sagey Articulator Model. (From Kenstowicz, 1994:452.) ...... 51

Figure 3.2. Examples of non-linear representations of Russian palatalized [bʲ] in abbreviated forms of both the Halle-Sagey and Clements-Hume models ...... 52

Figure 3.3. Example of native versus nonnative productions of Russian palatalized /bʲ/, expressed in a simplified version of the Clements-Hume framework ...... 53

Figure 3.4. Symbolic gestural score for hypothetical "palm." (From Browman and Goldstein, 1988:345.) ...... 61

Figure 3.5. Hypothetical gestural representation and associated trajectories for "palm." Note that closure is indicated by lowering. (Adapted from Browman and Goldstein, 1988:345.) ...... 61

Figure 4.1. Female and male AE learners’ number of years of study of Russian both in the United States and in Russia. Female and male data are combined and presented in increasing order of duration of study in the United States followed by duration of study in Russia...... 72

Figure 4.2. Spectrograms illustrating contrasting patterns in F2 for non-palatalized and palatalized consonants. The upper spectrogram is for the non-palatalized sequence /bu/; the lower spectrogram is for the palatalized sequence /bʲu/. The x-axis indicates time while the y-axis gives frequency (in Hz.) ...... 80

Figure 4.3. Sample spectrogram of [bʲju] illustrating the five F2 measurements ...... 82

Figure 4.4. Spectrograms for the four contrasting sequences [bu] - [bʲu] - [bʲju] - [bʲiju]. Time is indicated on the x-axis. Frequency (in Hz.) is indicated on the y-axis ...... 83

Figure 4.5. Time-aligned F2 trajectories for native Russians for the three C-V environments. The upper graph gives F2 trajectories for female Russians; the lower graph is for male Russians. Palatalization is indicated with an apostrophe and not IPA superscript ‘j’ ...... 90

Figure 4.6. Time-aligned F2 trajectories for learners for the three C-V environments. Female and male trajectories presented separately. The upper graph gives F2 trajectories for female learners; the lower graph is for male learners. Palatalization is indicated with an apostrophe and not IPA superscript ‘j’ ...... 95

Figure 4.7. Russians’ F2-steady-state mean durations for each of the three C-V environments. Within each C-V environment gender and consonant type are accounted for. Palatalization is indicated with an apostrophe. The C-V environment abbreviations are: C’V = simple-palatalized, C’jV = palatalized-yod, C’ijV = palatalized-i-yod; D = dental ([d’], [z’]), B = labial ([b’], [m’], [v’]), L = lateral ([l’]), R = trill ([r’]); FR = female Russians, MR = male Russians ...... 113

Figure 4.8. Learners’ F2-steady-state mean durations for each of the three C-V environments. Within each C-V environment gender and consonant type are accounted for. Palatalization is indicated with an apostrophe. The C-V environment abbreviations are: C’V = simple-palatalized, C’jV = palatalized-yod, C’ijV = palatalized-i-yod; D = dental ([d’], [z’]), B = labial ([b’], [m’], [v’]), L = lateral ([l’]), R = trill ([r’]); FE = female learners, ME = male learners ...... 115

Figure 4.9. F2 trajectories contrasting representative production of the three C-V sequences. The upper graph is for female Russians, the lower graph is for female learners ...... 120

Figure 4.10. Sample spectrograms comparing advanced learners’ productions of AE /b/+/j/+/u/ with their productions of Russian /bʲu/, /bʲju/ and /bʲiju/. Spectrograms are aligned according to onset of vowel transition, indicated by the vertical line. The top four spectrograms are from the productions of one male graduate-student learner; the bottom four spectrograms are for one female graduate-student learner. Time is indicated on the x-axis. Frequency (in Hertz) is indicated on the y-axis. Phonetic transcriptions of the sequences are given below each spectrogram. Palatalization is indicated with an apostrophe and not IPA superscript ‘j’ ...... 126

Figure 4.11. Time-aligned F2 trajectories for four advanced learners for the three Russian palatalized C-V environments and English /C/+/j/+/V/. The upper two graphs give F2 trajectories for two male learners; the lower two graphs give F2 trajectories for two female learners. Russian palatalization is indicated with an apostrophe and not IPA superscript ‘j’. The following abbreviations are used: /C/+/j/ = AE sequence /C/+/j/+/V/, /C’V/ = Russian simple-palatalized sequences, /C’jV/ = Russian palatalized-yod sequences, /C’ijV/ = Russian palatalized-i-yod sequences. Trajectories indicate data averaged over the three labial consonants /b/, /m/, /v/ where the vowel is /u/ ...... 133

Figure 4.12. Abbreviated Gestural model of Russian /b’/ as produced by native speakers...... 138

Figure 4.13. Abbreviated Gestural Model of “traditional” nonnatives’ highly-accented Russian [b’] ...... 139

Figure 4.14. Abbreviated Gestural Model of our learners’ accented production of [b’] ...... 139

Figure 4.15. Abbreviated Gestural model of Russian [b’j] as produced by native speakers...... 140

Figure 4.16. Abbreviated Gestural model of Russian [b’ij] as produced by native Russians ...... 141

Figure 4.17. Abbreviated Gestural model of Russian /b’ij/ as produced by learners ...... 141

Figure 4.18. Abbreviated Gestural models for Russians’ and learners’ production of the three palatalized sequences under investigation. The height of the rectangles indicates gestural magnitude, or degree of closure. The length of the rectangles indicates gestural duration...... 142

Figure 4.19. The Gestural Model illustrates gradual stages of the L2 acquisition process of the Russian palatalized consonants ...... 144

Figure 5.1. Learners’ number of years of Russian study, both in the U.S. and in-country (in Russia). Data for both female and male participants are presented. Data are not grouped according to gender, but are sorted and presented in order of increasing number of years of study in the U.S. followed by number of years of study in Russia. Study in the U.S. is indicated by the gray columns, while in-country study is indicated by black columns. The “m” or “f” following subject numbers on the x-axis indicates subjects’ gender: “m” = male, “f” = female...... 151

Figure 5.2. Learners’ self-rating of Russian pronunciation. A rating of 1 indicates weakest Russian phonetics while a 6 indicates native-like phonetics. As in Figure 5.1 above, data for male and female subjects are presented in order of number of years of study in the U.S., then number of years of study in-country. The “m” or “f” following subject numbers on the x-axis indicates subjects’ gender: “m” = male, “f” = female ...... 152

Figure 5.3. Graph of learners’ self-ratings vs. total number of years of study (a combination of formal study in the U.S. plus time in-country). X-axis indicates self-rating, where 1 is weakest (heavy accent) and 6 is best (no accent, native-like). Y-axis indicates the total duration of study of Russian (in years)...... 153

Figure 5.4. Excerpt from the forced-choice answer sheet for the perception study ...... 156

Figure 5.5. An example bar-graph for “perfect” perception for the four C-V sequences. Palatalization is indicated with an apostrophe ...... 161

Figure 5.6. Russians’ and learners’ identification of the four C-V sequences. The upper figure is for Russians; the lower figure is for learners. The x-axis indicates stimuli. The y-axis indicates subjects’ response ...... 163

Figure 5.7. Russians’ and learners’ identification of the four C-V sequences. Within each language group, results are summed over gender and consonant. Vowel quality is, however, accounted for. Graphs (a) and (c) are for sequences containing the vowels /a/ and /u/. Graphs (b) and (d) are for sequences containing the vowel /i/. The x-axis indicates stimuli; the y-axis indicates subjects’ distribution of responses in percents ...... 173

CHAPTER 1

INTRODUCTION

1.1 Preface

The intended audience of this dissertation consists of specialists in several academic areas of expertise including, but not limited to: phonetics, general linguistics, and Slavic, as well as Russian and foreign language pedagogy. To ensure that the present exposition is clear to all readers, I make minimal assumptions about background knowledge of the various specific fields addressed here. Insofar as it may seem at times that I am explaining the obvious, I ask for the reader’s patience and hope that s/he finds something of value in this dissertation project.

1.2 Opening remarks

The capacity to acquire language distinguishes humans from all other creatures on the planet.

Yet, even with our seemingly highly developed innate linguistic capabilities, if we begin study of a second language later than about seven years of age, the majority of us can never completely acquire native fluency in the target language. The various aspects of the target language, however, do not equally elude adult second-language (L2) learners. For example, research has shown that with extended exposure to the target language, adult L2 learners can attain native-like mastery of L2 syntax, semantics, vocabulary, and pragmatics. However, mastering the L2 sound system (its phonetics and phonology) is not so easy.¹ To the great disappointment of learners, regardless of the hundreds of hours devoted to language practice, native speakers of the target language most often immediately identify them as nonnative, as one of “them” and not one of “us.” Why is this? Why does acquisition of the sound system of a second language elude virtually all but children? More importantly, what properties of the L2 sound system must adult learners acquire in order to sound like a native speaker? Such general questions on the acquisition of the sound system of a second language motivated the present study.

¹ “I suggest that the critical period constrains only a subset of linguistic knowledge. It impedes specific phonetic and phonotactic abilities but does not affect some of the more globally functional principles” (Weinberger, 1994:285).

In order to discuss L2 acquisition, it seems logical that we must first address acquisition of one’s native language (L1). Our understanding of the L1 acquisition process, however, is far from complete. In fact, while much research has already addressed the L1 acquisition process, many questions remain unanswered. And while L1 acquisition studies have historically given little credence to L2 investigations, often relegating the latter to concerns of classroom practice only, the truth is that L2 studies have much to offer L1 research. Second language acquisition (SLA) research is also valuable to the field of general linguistics since SLA findings can often be extended to theories of general linguistics and acquisition of one’s native language. Finally, investigations into the L1-L2 interface are especially fruitful. For example, the statement, “the interrelationship between perception, production, and underlying representation in native speakers is far from clear and has been a source of debate for decades. In nonnative speakers the picture is even less settled...” (Major, 1994:191) indicates the need for continued investigations in L1 and L2 acquisition.

This dissertation discusses the Russian sound system, adopts a suitable phonological model and reports results from perception and production studies on C-V sequences containing Russian palatalized consonants. Discussing general sound patterns of Russian, chapter 2 provides readers with an overview of the Russian system; specific articulatory and acoustic patterns of the palatalized consonants of Russian are of particular interest. In chapter 3, I describe and defend the framework that I will use in this dissertation: Gestural Phonology (Browman and Goldstein, 1989). Unlike more conventional systems of description, Gestural Phonology permits scalar changes in the representation of phonetic entities. In this manner, the Gestural framework can account for the continuous nature of speech dynamics. We will see in the course of the dissertation that this is an important feature for the description of phonetic considerations of L2 language learning. The acoustic phonetic study of chapter 4 provides quantitative evidence of dynamic “native” and “accented” inter-gestural articulatory timing. Investigating the relationship between perception and production capabilities for L1 and L2 speakers, as well as native Russians’ and learners’ differing perceptual patterns, chapter 5 reports the results of a speech perception study.

Deep, underlying differences between the sound systems of L1 and L2 complicate the L2 acquisition process. For example, it is well known that language sound systems are characterized by a great deal of language-specific temporal and acoustic overlap of the underlying discrete segments, which results in continuous and non-discrete speech output. In fact, based on the acoustic signal alone, it is often difficult to recover the original segmental boundaries. In spite of this reality, virtually all existing research on L2 acquisition is limited by a focus on only single segments, e.g., the differences in Voice Onset Time (VOT) between native French speakers’ production of French /d/ and English /d/. The viewpoint that I adopt in this dissertation is that for L2 learners to acquire L2 phonetics, they must internalize not only static segmental properties but also the dynamic and language-specific manner in which the segments interact. There are no linguistic studies which investigate dynamic aspects of inter-segmental interactions during L2 acquisition. This dissertation provides new insights into dynamic segmental timings and overlap in light of L1 production and L2 acquisition.

In this case study I investigate Russian palatalized consonants. That the palatalized consonants of Russian are problematic for adult L2 learners is well-established. However—and more important for this study—there are certain sequences containing palatalized consonants that are particularly difficult for learners and are almost never mastered. I hypothesize that the protracted difficulties encountered by learners and their resulting accents are due to basic underlying differences between the language-specific gestural timings of English and Russian. In other words, for native American-English (AE) speakers, many of the gestural timings of Russian are truly “foreign.”

Selinker’s (1972) proposal of L2 learners’ INTERLANGUAGE can aid us in our attempt to describe and understand the source of learners’ accented productions. Selinker proposes that during the L2 acquisition process, L2 learners create their own “imperfect” version of L2. This real, functioning, intermediate linguistic system is called INTERLANGUAGE and results from a mixture of L1, L2 and language universals. (It might be useful to view learners’ INTERLANGUAGE as their own accented “[…]” of L2.) For the purposes of this study it is important to understand that each learner’s INTERLANGUAGE is characterized by its own specific underlying INTERPHONOLOGY. To this end, I view learners’ resulting accented productions as a kind of INTERPHONETICS that exhibits properties of L1, L2 and language universals. For example, AE learners’ L2 productions of Russian would exhibit properties of sounds from L1 English and L2 Russian, as well as some phonetic properties that are associated with universals. Learners’ INTERLANGUAGES are not fixed, unalterable systems. Therefore, with time, effort and extended exposure to L2, learners can continue to reshape, adjust and improve their INTERPHONETICS, gradually attaining greater phonetic proficiency in L2.

1.3 The L1 acquisition process: A brief review

Because second language acquisition (SLA) research sometimes draws on existing theories of L1 acquisition, I briefly address here a few relevant developments in L1 acquisition theory. Current theory states that humans are born with an innate ability to distinguish phonetic differences among sounds of all languages of the world. In this manner infants can be viewed as ‘language-universal’ perceivers who perceptually differentiate phonetic contrasts regardless of their phonological status or even their occurrence in the adult language to which they are exposed (Strange, 1995:8). However, soon after birth our malleable perceptual system begins to be modified and shaped by the ambient language community, a process called PHONOLOGIZATION. With time, children begin to focus on and become attuned only to those sounds which are meaningful in their surrounding language. Phonologization has been described as a “loss in the ability to differentiate phonetic categories perceptually that are not phonologically distinctive in...[the] native language, while native contrasts may become more highly differentiated” (Strange, 1995:15). The process of PHONOLOGIZATION begins quite early in humans’ linguistic development, some time during the latter months of the first year of life. For example, Japanese infants from 6 to 8 months of age and from 10 to 12 months were tested for their ability to discriminate the nonnative English [r] vs. [l] contrast, a contrast that does not exist in Japanese. Still exhibiting qualities of ‘language-universal’ perceivers, the infants from 6 to 8 months of age were able to distinguish the [ra] vs. [la] contrast. On the other hand, the older infants from 10 to 12 months of age could not distinguish the contrast, thereby indicating that they had already begun to internalize the Japanese phonological system and could no longer distinguish all sounds of the world’s languages. “Therefore, their [the infants’] pattern of responding to the nonnative contrasts is the kind showing a decline in sensitivity toward the end of the first year” (Jusczyk, 1997:820).

However, during the process of PHONOLOGIZATION and linguistic development, children’s abilities to “hear” and to “speak” their native language do not develop equally; children’s perception capabilities noticeably precede their production capabilities. As a result of this discrepancy, children’s first word productions exhibit characteristic deviations from targets in the adult language. In general, children’s productions are often a simplification of the adult word “in terms of the numbers and kinds of phonetic segments that are produced” (Jusczyk, 1997:185). (See also Gerken, 1994; Ingram 1974, 1978; Macken, 1979; Menn, 1978; Smith, 1973.) Substitution, where the substituted segment has similar phonetic properties to the replaced segment, is common. Sometimes children simply leave out some of the sounds from the adult word, a change called deletion. For example, a child may pronounce “banana” as [næna] (Jusczyk, 1997:185). A similar process is cluster reduction, where a child reduces a cluster by deleting one or more of the component consonants. Metathesis, where a child reorders some of the phonetic material in the adult word, is also common. For example, a child might produce “spaghetti” as [pazgeti].

Children’s relatively unequal development of perception and production capabilities is evident in the following scenario: A child might know that s/he is having spaghetti for dinner, but when asking for a portion of food, s/he asks for pasghetti [pazgeti] instead of spaghetti [spageti]. If the parent repeats, “Oh, you want some pasghetti,” the child recognizes that the word is incorrect and repeats the request.² While children know how the word is supposed to sound (and, therefore, evidence their developed perceptual capabilities), their muscular control of the articulatory organs (production capabilities) has not yet developed to a level that allows them to execute the complex articulations that produce the intended sound sequences.

² Borden and Harris (1980) provide an additional example of the perception-precedes-production phenomenon: “A child protests when others imitate his misarticulation, ‘I didn’t say wabbit, I said wabbit.’ This phenomenon can be interpreted as evidence that perception is ahead of production. When the child hears the adult say ‘wabbit,’ he perceives the mistake but is unable to produce an /r/ and fails to detect the mistake in his own speech” (199).

Moreover, as children mature, their capacity to acquire the sound system of a language also changes. Language acquisition research has found that the time preceding puberty is characterized by a heightened sensitivity to language learning. (The terms “optimal,” “critical” and “sensitive” are often used interchangeably to refer to this time period.) In fact, if language acquisition begins “too late” (somewhere around the time of puberty or later), it is very unlikely, if not impossible, to attain native acquisition of the target language’s sound system. Humans’ loss of ability to attain native-like proficiency in a language’s sound system due to age-related maturational constraints is associated with this “Critical Age Hypothesis.” (See Flege, 1991; Lenneberg, 1967; Long, 1990; Scovel, 1988; Singleton, 1989; Thompson, 1991.) In other words, there seems to be a biological, chronologically-associated basis, or limitation, for learning the sounds of a language. While the critical period was originally thought to end with the onset of puberty, recent research indicates that the loss of sensitivity may begin earlier than previously thought, somewhere around 6 to 8 years of age. To this end, Long (1990) notes:

There are sensitive periods governing the ultimate level of first or second language attainment possible in different linguistic domains, not just phonology, with cumulative declines in learning capacity, not a catastrophic one-time loss, and beginning as early as age 6 in many individuals, not at puberty as is often claimed (255).

In other words, humans’ innate capacity to acquire a language declines gradually with time. In addition, “[t]he critical period applies to both perception and production and especially to the ability to relate perception to production” (Borden and Harris, 1980:204). We will see in the next section that critical age is also relevant to L2 acquisition theory.

1.4 The L2 acquisition process: Review of past theories and where we stand today

Second language phonetics acquisition research also notes the significance of the critical period since “[a]fter puberty a language learner rarely can overcome a phonetic foreign accent” (Scovel, 1988 in Weinberger, 1994:285). In general, the critical age hypothesis implies that virtually no adult L2 learners can attain native-like proficiency in the L2 sound system, a rather disheartening claim for our college-level foreign language students, indeed! Language acquisition literature has proposed that permanent changes in physiology, neurology and/or processing that occur during the years leading up to puberty are the source of L2 learners’ difficulties. For example, Trubetzkoy (1939/1969) noted the significant role of abstract processing skills, stating that the “phonology of L1 causes L2 learners to ‘filter out’ perceptually acoustic differences that are not phonemically relevant in L1” (Flege, 1987:48-49). In other words, adult learners perceive L2 phonetics and phonology through their L1 “sieve,” causing them to have a “perceptual foreign accent” (Strange, 1995:39).

However, while adult L2 learners may be unable to attain “perfect” native-like mastery of the L2 sound system, they can, nonetheless, continue to adjust and improve their L2 productions. In other words, if adult learners regularly dedicate time and attention to acquiring the L2 sound system, in most cases, they can continue to reduce their degree of accent: “Perceptual studies of adult L2 learners do provide encouraging evidence that, at any age, modification of phonetic perceptual patterns is possible” (Strange, 1995:35). (See also Flege 1980, 1981, 1987 and Wode 1994.) So that adult L2 learners do not establish persistent, incorrect, accented L2 articulations from the outset of study, instructors should dedicate time to acquisition of L2 phonetics early in students’ study of L2. For example, to establish and improve their L2 phonetics learners should: (1) be made aware of the differences and similarities of L1 and L2 phonological spaces; (2) be sensitized to the subtle and not-so-subtle acoustic differences between L1 and L2 phones, including new articulations and articulatory timings; (3) be provided accompanying training on the articulations required to produce the different L2 phones. Copious exposure to native productions of the target language, of course, maximizes learners’ progress.

What is important for the purposes of this study is that when acquiring L2 articulations, learners must be made aware not only of static articulatory properties (i.e., which part of the tongue approaches or touches what part of the oral cavity) but also, how the articulations are timed with respect to one another.

However, language-specific inter-articulatory timings represent very subtle linguistic properties that are often not especially acoustically or articulatorily salient. Therefore, dedicated study of these dynamic language-specific properties is beneficial in the L2 acquisition process since it can help reduce learners’ accents. Finally, because the L2 acquisition process is influenced by the established L1 phonological organization, with its own articulations and timings, learning new L2 inter-articulatory timings is especially complex, requiring learners to retrain or modify basic well-established L1 articulatory patterns.

What is the source of learners’ L2 phonetic errors? The theory of contrastive analysis originally attributed mistakes in all aspects of L2 production to direct transfer from L1, or negative transfer. (See Gass and Selinker, 1994:59-66.) Negative transfer was assumed to be discrete; partial transfer from L1 was not allowed. In other words, AE learners of Russian produce Russian /b/ either as a native Russian would or as they would when speaking English. Current SLA research challenges this rather simplistic explanation; it takes a broader approach, based more on general linguistic theory, including Universal Grammar, developmental processes, sociolinguistic variables and markedness theory, to name a few (Major, 1994:184-185). Therefore, accented L2 productions are a result of numerous processes.

Learners’ productions often seem to represent some kind of intermediate between L1 and L2. For example, “...a learner’s mispronunciations will not always match sounds found in L1 and L2. In fact, language learners frequently produce a range of different phonetic variants...for a single L2 phoneme, sometimes even producing sounds which are not typically found in either L1 or L2” (Flege, 1980:117). These statements bring to mind Selinker’s (1972) aforementioned proposal of INTERLANGUAGE where “[i]nterlanguages are in fact languages” (Major, 1994:187), a type of intermediate language that indicates a blending of L1, L2 and language universals. In this dissertation I propose that L2 accented productions stem from learners’ INTERPHONOLOGY and accented INTERPHONETICS. I suggest that learners’ accented INTERPHONETICS result from, but are not limited to, incorrect executions of static articulations and incorrect dynamic articulatory timings.

It is important to understand that learners modify their INTERPHONETICS by retraining their L2 articulations and relative timings in a gradual manner. The proposal of continuously evolving and successive stages of L2 phonetic structures (both motor control and perceptual) is fairly recent because linguistic research has traditionally striven to locate discrete, categorizable entities (especially in phonological description). However, if in our analyses we give particular weight to phonetic reality with its inherently continuous nature, a real need for alternative descriptive methods becomes apparent: “[i]n much previous research, especially that done within a phonemic framework, the L2 sounds produced by a language learner have often been viewed as discrete entities which are produced either correctly or incorrectly instead of as a continuum of approximations to phonetically accurate L2 sounds” (Flege, 1980:119).³ Scant research investigates and demonstrates learners’ partial and gradual acquisition of L2 sounds, and the descriptive frameworks usually employed in this research have no way to account for the continuous phases of acquisition.

³ “Like the child learning a first language, an adult L2 learner must slowly learn to articulate unfamiliar sounds and to extend production of already familiar sounds to new phonetic contexts. But

Before discussing the aforementioned theories of retraining dynamic articulations in a gradual manner in light of the L2 acquisition of Russian phonetics, we should first consider one more relevant aspect of the L2 acquisition process: mapping L2 phones onto L1 phonetic and phonological space. Defining the relationship between L1 and L2 phonetic categories is significant since “[i]t is commonly accepted that L2 learners ‘identify’ L2 phones in terms of native language (L1) categories and, as a result, use articulatory patterns established during L1 acquisition to realize those L2 phones” (Flege, 1987:48). To this end, L2 learners can evaluate L2 phones as “new” or “similar.” A “new” L2 phone has no equivalent in the L1 and presents minimal difficulty. A “similar” L2 phone, however, is similar to an existing L1 phone, either perceptually or articulatorily, an association which greatly complicates its acquisition (Flege, 1987).

Best and Strange (1992) have also noted that certain types of nonnative contrasts are easily discriminated by English-speaking adults while other nonnative contrasts prove to be quite difficult. To account for these differences they propose the Perceptual Assimilation Model (PAM), which describes how the specific assimilation of L2 phones affects their discrimination. To describe how two L2 phones can be mapped into L1 space,⁴ PAM proposes four possible mapping strategies:

1. Two Category (TC): two L2 phones are mapped into two different L1 categories; learners display excellent discrimination.
2. Single Category (SC): two L2 phones are mapped equally into one L1 category; learners display the lowest level of discrimination.
3. Category Goodness (CG): two L2 phones are mapped into one L1 category, but one L2 item is perceived to be a better exemplar of the L1 phone than the other; learners display moderate discrimination.
4. Uncategorizable (UC): both L2 items are uncategorizable, neither is associated with an L1 category; learners display moderate discrimination (Best et al., 1996; Best and Strange, 1992).
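The four mapping strategies can be read as a small decision procedure. The Python sketch below is only an illustration and is not part of PAM as published: the names `Assimilation` and `pam_type`, the representation of a mapping as an optional L1 category label plus a goodness-of-fit rating, and the tolerance that decides when two ratings count as “equal” are all assumptions introduced here.

```python
from typing import NamedTuple, Optional


class Assimilation(NamedTuple):
    """How a listener maps one L2 phone into L1 space (hypothetical representation)."""
    l1_category: Optional[str]  # L1 category label, or None if uncategorizable
    goodness: float = 0.0       # assumed goodness-of-fit rating, 0.0 to 1.0


def pam_type(a: Assimilation, b: Assimilation, tolerance: float = 0.1) -> str:
    """Classify a pair of L2 phones as TC, SC, CG or UC."""
    if a.l1_category is None and b.l1_category is None:
        return "UC"      # neither phone is associated with an L1 category
    if a.l1_category is None or b.l1_category is None:
        return "other"   # mixed case, not among the four types listed above
    if a.l1_category != b.l1_category:
        return "TC"      # two different L1 categories
    if abs(a.goodness - b.goodness) <= tolerance:
        return "SC"      # same category, roughly equal fit (tolerance is assumed)
    return "CG"          # same category, one phone is the better exemplar


# Discrimination levels as stated in the list above.
PREDICTED_DISCRIMINATION = {
    "TC": "excellent",
    "CG": "moderate",
    "UC": "moderate",
    "SC": "lowest",
}
```

For instance, two L2 phones that a learner hears as equally good instances of a single L1 category would be classified as SC, the case for which the lowest level of discrimination is predicted.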

unlike the child, the adult learner must also modify certain well-established articulatory habits in order to produce L2 phonemes according to phonetic norms of the target language” (Flege, 1980:117).

⁴ It must be noted that the PAM model focuses on monolingual adults’ initial perceptions of nonnative phones and does not directly address adults’ long-term L2 acquisition process. Nonetheless, some of PAM’s basic approaches and ideas can be fruitfully extended to describe the difficulties our adult learners of Russian experience with learning palatalized sequences.

PAM predicts that L2 learners will display the strongest discrimination between two L2 phones (and, therefore, the best acquisition) if they are mapped into two categories (TC) and the weakest discrimination (and, therefore, the worst acquisition) if the two L2 phones are mapped equally well into a single category (SC). Elucidating the relation between L2 phonetic categories and learners’ native phonological system, PAM clearly illustrates the source of adult learners’ “perceptual foreign accents” (Strange, 1995:40). “Adult and child learners of a foreign language may retain the same kind of phonetic learning ability evident in early childhood and yet still speak with an accent because phonological translation provides a two-language source of phonetic input that may ultimately limit progress in learning to pronounce a foreign language” (Flege 1981:443).

Since perception and production skills are inextricably linked, in order to present a complete

picture of the L2 acquisition process, it is necessary to define L2 perception. Recall that during first-

language acquisition perception markedly precedes production. Limited L2 research in this area indicates

a different pattern for adult learners, one where production precedes perceptual competence. (See Goto,

1971; Strange, 1995.) Moreover, illustrating the inseparable nature of production-perception capabilities,

Catford and Pisoni (1970) suggest that explicit articulatory instruction aids perception in L2 acquisition.

In spite of these initial indications of different patterns for L1 and L2 acquisition, few rigorous studies

address adult learners’ production-perception competency. Moreover, all previous research has

investigated only single L2 segments. This dissertation offers new insights into this problem, since it

investigates learners’ production and perception competency with problematic L2 segmental sequences.

1.5 Attaining a good L2 phonetics: A teacher’s justification

The main purpose of this dissertation is to report a case-study that has broader implications for

general theories of linguistics and L2 acquisition. However, the origins of my research arose from much

more practical matters—being “in the trenches” of the Russian language classroom, both as instructor and

student. I have been struck by the protracted difficulty that our students have with the Russian phonetic

system. I myself struggled continually to improve my own Russian pronunciation. My students’ and my

own difficulties prompted numerous questions. For example, what abstract linguistic processes cause adult learners’ accented productions? What specific qualities of nonnatives’ productions make them

noticeably accented? Segmental inaccuracies alone? Or do subtle inaccuracies in our gestural timings

also give us away as a “foreigner”? More in line with instructors' reasoning, why should foreign language

instructors dedicate precious class time to L2 phonetics? Don’t they already have enough to teach in their

courses? And how do we justify requiring our students to regularly spend time working on pronunciation

in the language laboratory?

L2 learners should strive to acquire phonetics of the target language since native speakers of that

language evaluate nonnatives’ overall level of fluency not only on the basis of learners’ mastery of

grammar, syntax and vocabulary, but perhaps to a greater degree on their mastery of L2 phonetics. In

fact, learners who speak the L2 with a heavy accent may find it difficult to locate willing native-L2

interlocutors. For, compared to native L2 productions, accented L2 productions place greater demands on

native-L2 listeners; they cannot sit back and merely passively “hear,” but must actively “listen,”

continually deciphering and adjusting for learners’ incorrect productions. Moreover, accented speech can

produce overall negative evaluations of learners’ general intelligence and personality: "a person whose

speech is accented will be rated less favorably (along subjective rating scales such as intelligent vs. stupid

or kind vs. cruel) than speakers who talk without a foreign accent....Unfortunately for most foreign

language learners, the greater the degree of perceived accent, the more unfavorably a speaker will be

evaluated” (Flege, 1981:447). These statements provide pragmatic support not only for allotting time to

teach L2 phonetics but also for further research on the acquisition of L2 phonetics. Perhaps if we can

better define some of the processes and origins of accented speech, we might be more able to effectively

impart to our students the means to eradicate their foreign accent.

What qualities of learners’ productions make them sound accented to native-L2 listeners? The

most obvious mistakes are segmental substitutions, e.g., when native-Russian speakers produce English

“thank you” as “sank you” (because they have not acquired the interdental “th” [θ]), or produce “big” as

“beeg” (because they have not yet mastered the AE front lax high vowel [ɪ]). Incorrect intonations are also a common source of noticeably accented speech. However, more subtle qualities of learners’ speech can also convey accentedness, such as incorrect articulatory timings, which affect perceived rhythmic and acoustic qualities. For example, voicing in Russian bilabials is known to begin earlier than in English

bilabials, giving Russian bilabials a highly resonant quality to the English speaker’s ear. If learners of

Russian transfer their native English articulatory timings for bilabials to Russian bilabials, native

Russians will sense the accentedness of learners' speech. Because this mistake involves subtle speech

timings, native Russians may not be able to identify exactly why the bilabial sounds wrong. Nonetheless,

they will instinctively know that it is not native speech—a judgment that may result in negative

evaluations of learners, possibly causing native Russians to avoid them. Learners must, therefore,

endeavor to acquire the language-specific dynamic phonetic characteristics of the target language.

It has been demonstrated that complete acquisition of L2 phonetics requires learners to be aware

of numerous differences between L1 and L2, including phonological, phonetic, and subtle articulatory

differences. However, even if learners are aware of the myriad differences, they might still retain an

accent. For the process of retraining their articulators is an arduous one, one which requires modified

motor control patterns. It is clear that in order to acquire L2 phonetics, learners must not only gain

control of a new abstract phonology and phonetic system, they must also "acquire complex new sets of

highly automatic articulatory gestures" (Flege, 1980:118).

1.6 The present study: The phonetics of L1 acquisition and Russian palatalized sequences

Previous sections of this chapter present the following fundamental established views on the L1

and L2 acquisition processes:

• Humans are born with the innate capacity to recognize all sounds of the world's languages. Learning one's native language involves enhanced discrimination of L1 phones and loss of discrimination capabilities for non-L1 phones.

• Changes that occur prior to puberty make complete acquisition of a second language a significant challenge for adults.

• Adult L2 learners perceive L2 by reference to L1, a process sometimes referred to as "phonological translation" that results in a perceptual foreign accent.

• In spite of changes that occur by puberty, adult L2 learners can continually improve their L2 phonetics.

• The relationship between production and perception capabilities is not well understood for L1 and is even less clear for L2. Scant research investigates the L2 production-perception relationship.

In addition to modifying the articulatory timings that produce more resonant Russian bilabials, AE learners of Russian must retrain their relative articulations so as not to aspirate word-initial stops, to produce most AE alveolars as Russian dentals, and to produce simultaneous palatalization, to name a few.

• Adult L2 learners' productions of L2 reflect their own INTERLANGUAGE, with its specific INTERPHONOLOGY and INTERPHONETICS.

• Segmental substitutions, incorrect intonation and incorrect relative articulatory timings characterize accented L2 productions.

• To acquire L2 phonetics, learners must not only acquire a new phonology and make new phonetic distinctions between similar phones but must also retrain motor control of their articulators, including place, manner and relative timings.

• "Similar" L2 phones present greater difficulty to learners than "new" L2 phones. The Perceptual Assimilation Model provides a convenient tool for describing this phenomenon.

• Acquisition of L2 articulations proceeds in a gradual, continuous manner, yet SLA research offers no model for describing this gradual process.

For adult AE learners of Russian, acquiring the palatalized consonants of Russian (/Cʲ/, or

“simple-palatalized”) is very difficult. Moreover, certain Russian sequences that contain both a

palatalized consonant and a front palatal glide (/Cʲj/, or “palatalized-yod”) are particularly difficult. I

hypothesize that learners’ difficulties originate from misperceptions in which the simple-palatalized and

palatalized-yod sequences are not distinguished. The closest English approximation to the Russian

palatalized sounds occurs in words such as ‘music’, ‘beautiful’, and ‘view’. English, however, realizes

these initial sounds as a sequence of two segments, /C/ + /j/. To account for learners’ difficulties in

distinguishing /Cʲ/ and /Cʲj/, I propose a modified version of the PAM, where both Russian simple-

palatalized and palatalized-yod sequences are mapped onto English sequences of /C/+/j/. Learners

perceive the Russian-English phonetic relationship to be an example of "similar" phones. Learners’

discrimination capabilities are, therefore, very low. I hypothesize that because learners incorrectly map

the two Russian environments onto one AE sequence, they will produce the two identically and, therefore, incorrectly. Learners’ incorrect productions will display acoustic and articulatory timing properties somewhere between Russian /Cʲ/ and English /C/+/j/. Learners' incorrect productions, therefore, serve as an example of INTERPHONETICS. The acoustic production study of chapter 4 presents relevant data which show L2 productions that are accented due to incorrect articulatory dynamics.

Moreover, I accept the proposition that adult learners improve their L2 phonetics gradually, in

continuous stages. In reaction to the absence of an appropriate framework to account for this gradual

acquisition, I propose to use Browman and Goldstein's Gestural Phonology (Browman and Goldstein,

1989). Based on results from the acoustic production study, chapter 4 also illustrates in detail Russians'

correct and learners' incorrect productions of palatalized sequences in terms of “gestural scores.” In this

manner Gestural Phonology accounts for the different possible language-specific articulatory (inter-

gestural) timings. A hypothetical model of learners' gradual acquisition of Russian timings is also

offered.

The acoustic perception study presented in chapter 5 explores learners' and natives' discrimination of Russian palatalized sequences. I hypothesize that speakers’ production and perception capabilities should agree, i.e., if they can produce a difference (as indicated in chapter 4), they should be able to hear a difference. Likewise, if they cannot produce a difference, then they should not be able to perceive a difference. Results of the perception study indicate that such a proposed direct relationship between L2 production and perception is an over-simplification. Moreover, the fact that learners performed better on the perception task than did native-Russian speakers indicates native speakers' and learners' differing perceptual patterns and associated linguistic knowledge. Chapter 5 also examines

Russians' and learners' acoustic strategies, providing initial evidence that formant duration cues are more salient than formant frequency cues. Finally, taking into account learners' duration of Russian language study, chapter 5 demonstrates advanced learners' improved—more Russian-like—perceptual patterns.

Finally, drawing conclusions from the combined results of chapters 4 and 5 in light of the theories presented in chapters 2 and 3, I hope to expand our understanding of: (1) the relationship between L1 and L2 production and perception abilities, (2) the acquisition of dynamic qualities of L2 articulations, (3) dynamic qualities of accented and native speech, (4) the hypothesized gradual stages of learning, and (5) natives' and learners' differing perceptual patterns.

CHAPTER 2

AN OVERVIEW OF THE RUSSIAN CONSONANTAL SYSTEM

2.1 Introduction

This chapter presents an overview of the Russian consonantal system from several theoretical points of view. I briefly review well-established, traditional descriptions of the Russian sound system, with the aim of incorporating previous views into current theories of phonetics and second language acquisition. These theories will serve as the foundation for discussions of the phonetic experiments reported in subsequent chapters of the dissertation.

I first discuss the typology of the Russian consonantal system, including both its orthographic and phonetic realizations. Palatalized and non-palatalized consonants are compared and contrasted. In agreement with terminology used in Slavic linguistics, I refer to the non-palatalized and palatalized consonants as 'hard' and 'soft', respectively. Particular attention is given to comparing secondary palatalization (i.e., the [ʲ] in [Cʲ]) and the palatal glide ([j]).*

Having discussed the typology of Russian, I briefly contrast the Russian and English sound systems, mentioning issues of L2 acquisition. The Russian and English sound systems exhibit deep underlying differences in their phonetic and phonological organization. These basic differences are often the source of adult L2 learners' difficulties, which result in obviously nonnative phonetics. Common

conform to International Phonetic Alphabet (IPA) norms. (See Ladefoged, 1993 for a complete table of the IPA.) However, in accordance with traditional Russian phonetic transcription (Avanesov, 1972:33) I transcribe Russian /o/ and /a/ in first pre-tonic position as [ʌ]. I use forward-slashes, //, to indicate broader transcription which focuses more on phonemic reality. To indicate a closer transcription that focuses more on phonetic reality I use brackets, [ ]. In most instances, I transcribe palatalization with IPA superscript 'j', e.g., /ndat/, although, due to font and

accented productions of Russian are discussed and explanations for nonnative speech are offered. In the

second half of the chapter I consider in detail articulatory and acoustic properties of the palatalized and

non-palatalized consonants and comment on how these properties are related to L2 acquisition.

Cross-language studies of second-language acquisition have noted that certain nonnative phones

are more difficult to acquire than others. For example, according to Strange, "[i]t is well known that adult

L2 learners have difficulty learning to produce some nonnative phonetic segments, which leads to the

persistence of accented pronunciation in the L2" (1995:19). For adult learners of Russian the palatalized

consonants and the palatal glide are often problematic. This fact is widely known to Russian language

instructors, as Kenstowicz notes: "Teachers of Russian report that their English-speaking students have

difficulty articulating palatalized dental consonants, pronouncing Na[dʲ]a and Volo[dʲ]a as either [nadya]

or [nadja], [volodya] or [volodja]" (1994:544). With its focus on phonetic and phonological theory,

this chapter serves as a point of departure in an attempt to provide a dynamically-based description of

Russian language learners’ accented productions of Russian palatalized sequences.

2.2 Overview of the Russian consonantal system

We begin with general phonetic properties of the Russian consonantal system. Traditional descriptions draw attention to two salient organizing contrasts: (1) softness vs. hardness (palatalized vs. non-palatalized), and (2) voiced vs. voiceless (voicing). Palatalization is a highly developed and regular system in Russian. Table 2.1 provides just one set of minimal pairs illustrating phonemically contrasting palatalization:

graphing constraints, I sometimes indicate palatalization with an apostrophe. I have chosen to use curly brackets, { }, to indicate orthography.

Environment   Russian Example   Phonetic Transcription   Gloss
CVC           мат               [mat]                    'checkmate'
CʲVC          мят               [mʲat]                   'rumpled'
CVCʲ          мать              [matʲ]                   'mother'
CʲVCʲ         мять              [mʲatʲ]                  'to rumple'

Table 2.1. Set of minimal pairs contrasting for Russian palatalization.

Because similar contrasting minimal pairs pervade the Russian language, in order to attain an adequate level of phonetic proficiency, L2 learners of Russian must develop their ability to manipulate the palatalization opposition.

There are a total of 39 phonemes in the Russian language, 34 consonants and 5 vowels. Of the

34 consonants, 24 are paired for hardness-softness and 14 are paired for voicing. Table 2.2 illustrates the organization of and pairings between the consonant phonemes.

п [p] – б [b]        ф [f] – в [v]        т [t] – д [d]        с [s] – з [z]
   |        |           |        |           |        |           |        |
п' [pʲ] – б' [bʲ]    ф' [fʲ] – в' [vʲ]    т' [tʲ] – д' [dʲ]    с' [sʲ] – з' [zʲ]

ш [ʃ] – ж [ʒ]        щ [ʃʲː] – ж' [ʒʲː]        к [k] – г [g]

л [l]        м [m]        н [n]        р [r]
   |            |            |            |
л' [lʲ]      м' [mʲ]      н' [nʲ]      р' [rʲ]

х [x]        ц [ts]        ч [tʃʲ]        [j]

Table 2.2. The paired consonants of Russian. (From Avanesov, 1972:34.)

Consonants are organized within boxes according to voicing and palatalization. Voicing contrasts are indicated horizontally, while palatalization is indicated vertically. It is apparent that not all consonants are paired for both contrasts. While some consonants contrast for both voicing and palatalization (e.g., the stops [b]-[bʲ]-[p]-[pʲ]), others are paired only for palatalization (e.g., the sonorants [m]-[mʲ]), some only for voicing (e.g., the [ʃ]-[ʒ]), while others have no paired counterparts (e.g., the affricate [tʃʲ]).

Those sounds which are not paired for palatalization are either 'hard' (ш [ʃ], ж [ʒ], ц [ts]) or 'soft' (ч

[tʃʲ], щ [ʃʲː], [ʒʲː], [j]). Given the phonemic system in Table 2.2, it is evident that Russian is a

consonantal language. This is evident not only in the high number of consonantal phonemes but also in

the pervasiveness of consonant clusters and the strong acoustic influence of consonants on neighboring

vowels. As with most , “the semantic load [in Russian] rests with the consonants”

(Caflisch, 1983: 33). The fact that a large portion of the functional load in Russian is carried by the consonants gives additional support for in-depth study of consonantal features, especially as concerns

their L2 acquisition.

Table 2.3 provides additional information on the typology of the Russian consonantal system.

Here the consonantal phonemes are organized according to articulatory parameters, including place and manner of articulation, as well as the features of voicing and palatalization.

MANNER                     PLACE: segments
OBSTRUENTS
  STOPS        vd.         Bilabial [b] [bʲ]; Dental [d] [dʲ]; Velar [g]
               vls.        Bilabial [p] [pʲ]; Dental [t] [tʲ]; Velar [k]
  AFFRICATES   vls.        Dental [ts]; Alveo-palatal [tʃʲ]
  FRICATIVES   vd.         Labio-dental [v] [vʲ]; Dental [z] [zʲ]; Alveo-palatal [ʒ] [ʒʲː]
               vls.        Labio-dental [f] [fʲ]; Dental [s] [sʲ]; Alveo-palatal [ʃ] [ʃʲː]; Velar [x]
RESONANTS
  NASALS       vd.         Bilabial [m] [mʲ]; Dental [n] [nʲ]
  LATERALS     vd.         Dental [l] [lʲ]
  TRILLS       vd.         [r] [rʲ]
  GLIDE        vd.         Palatal [j]

Table 2.3. Classification of the Russian consonants by place and manner of articulation. The following abbreviations are used: vd. = “voiced,” vls. = “voiceless.” (Adapted from Avanesov, 1972:37.)

Table 2.3 clearly demonstrates that the majority of Russian phones are articulated at the alveo-palatal region and forward. Among these segments palatalization is quite regular, with only two (the affricates

[ts] and [tʃʲ]) not paired for palatalization. In contrast, consonants articulated behind the alveo-palatal

zone do not exhibit such systematic pairing’. The three velar segments are non-palatalized. Moreover,

the reader should note the unusual position of the palatal glide, or yod, [j], within the system. The only

glide, it contrasts for neither voicing nor palatalization; it is voiced and inherently ‘soft’. While

secondary palatal articulations are common in the form of palatalization, a primary palatal closure made

with the mid-portion of the tongue is unique to [j].

2.2.1 Orthographic representation of Russian palatalization

Adult students of Russian encounter significant problems in acquiring both the palatalized consonants and the palatal glide. In an attempt to establish possible complicating factors of these sound environments as they relate to L2 acquisition, we begin where most adult L2 learners begin—with the

Russian spelling system. Here, both palatalized consonants and the palatal glide can be expressed through more than one orthographic symbol. These multiple orthographic representations associated with a single underlying segmental category are but one complicating factor for adult learners.

In general, students quickly become familiar with the means the orthography uses to indicate the palatalized nature of paired consonants:

1) If the palatalized consonant is in word-final position or followed by another consonant, the soft sign {ь} is used*:

’ Table 2.3 represents a conservative view of the Russian phonemic system since it does not include the soft velars. The status of palatalized velars in Russian is a much debated topic. However, since a definitive position on their status is not germane to this research, I have chosen to disregard them. * The soft sign does not always indicate a preceding soft consonant, however. It is sometimes used simply as "an orthographical convention, a spelling rule" (Levin, 1978:11). As a remnant of their historically soft quality, the so-called hushers (ж [ʒ], ш [ʃ], щ [ʃʲː], ч [tʃʲ]) are written with a succeeding "soft sign" {ь} in certain environments (regardless of their current status of hardness or softness). These instances are: 1) the second person singular, nonpast tense, of all Russian verbs,

кладёшь [klʌ'dʲoʃ] 'put, place'    говоришь [gavʌ'rʲiʃ] 'talk, speak'

2) the imperative,

режь, режьте [rʲeʃ], ['rʲeʃt'i] 'cut' (2nd sg. imperative, 2nd pl. imperative)    плачь, плачьте [platʃʲ], ['platʃʲt'i] 'cry' (2nd sg. imperative, 2nd pl. imperative)

3) the nominative and accusative singular of feminine nouns ending with a "husher",

Soft consonant in word-final position:

брать [brat'] 'to take'    уголь ['ugalʲ] 'coal'

Soft consonant followed by a consonant:

только ['tolʲka] 'only'    горько ['gor'ka] 'bitter'

2) If a palatalized consonant is followed by a vowel, then the following vowel grapheme indicates the palatalized nature of the preceding consonant. There are ten vowel graphemes which correspond to the five underlying Russian vowel phonemes. The vowel graphemes are divided into two sets, five "hard series" vowel graphemes and five "soft series" vowel graphemes.

The five underlying vowel phonemes    /a/   /e/   /i/   /o/   /u/
Hard series graphemes                 а     э     ы     о     у
Soft series graphemes                 я     е     и     ё     ю

Thus, while the functional load of the sound system is carried by the consonants, in the orthography it

seems to be carried to a greater extent by the vowel symbols. For example, combining the paired bilabial consonants /b-bʲ/ with each of the five vowel phonemes, we get the following ten orthographic

combinations:

          /a/         /e/         /i/         /o/         /u/
/CV/      ба [ba]     бэ [be]     бы [bɨ]     бо [bo]     бу [bu]
/CʲV/     бя [bʲa]    бе [bʲe]    би [bʲi]    бё [bʲo]    бю [bʲu]
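The two spelling conventions described in this section can be sketched in a few lines of Python. This is only a minimal illustration, assuming nothing beyond the hard-series and soft-series graphemes listed above; the function names `spell_cv` and `spell_c_no_vowel` are hypothetical, and the sketch deliberately ignores the unpaired consonants and the spelling exceptions for the hushers.

```python
# Illustrative sketch only: paired consonants, no spelling exceptions.
# Vowel phonemes are keyed by Latin letters purely for convenience.
HARD_SERIES = {"a": "а", "e": "э", "i": "ы", "o": "о", "u": "у"}
SOFT_SERIES = {"a": "я", "e": "е", "i": "и", "o": "ё", "u": "ю"}
SOFT_SIGN = "ь"


def spell_cv(consonant: str, vowel: str, soft: bool) -> str:
    """Spell a paired consonant followed by a vowel phoneme.

    The consonant letter never changes; hardness or softness is shown
    only by the choice of vowel letter.
    """
    series = SOFT_SERIES if soft else HARD_SERIES
    return consonant + series[vowel]


def spell_c_no_vowel(consonant: str, soft: bool) -> str:
    """Spell a paired consonant that is word-final or precedes a consonant.

    Softness is marked with the soft sign; hardness is left unmarked.
    """
    return consonant + (SOFT_SIGN if soft else "")


if __name__ == "__main__":
    for soft in (False, True):
        print(" ".join(spell_cv("б", v, soft) for v in "aeiou"))
    # ба бэ бы бо бу
    # бя бе би бё бю
    print("бра" + spell_c_no_vowel("т", soft=True))  # брать
```

Because the consonant letter is identical in both rows of the output, the sketch makes explicit that the orthography records the hard-soft contrast on the following vowel letter (or on the soft sign), not on the consonant itself.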

The Russian orthographic system for indicating palatalization often confuses adult learners since

the consonant grapheme is the same, regardless of its hard or soft status (i.e. б in both hard ба [ba] and

ночь [notʃʲ] 'night'    вещь [v'eʃ'ː] 'thing'

4) the infinitive of certain verbs,

мочь [motʃʲ] 'to be able'    печь [p'etʃ'] 'to bake'

5) certain adverbs,

прочь [protʃʲ] 'away; off; averse to'    сплошь [sploʃ] 'all over; throughout'.

(Adapted from Levin, 1978: 10-11.)

soft бя [bʲa]). Learners can misinterpret the system as if it is the vowels which are paired for

palatalization and not the consonants. Learners are further confused by the soft sign grapheme found at

the end of words, since it does not represent a sound itself, but reflects the status of the preceding

consonant. Granted, instructors can explain to learners the relationship between the orthographic and

phonetic realities. Nonetheless, learners sometimes do not comprehend well the structure of the Russian

sound system, especially during the first year or two of study.

2.2.2 Orthography and phonetic realizations of the palatal glide

Perhaps even more challenging for students of Russian are the four possible orthographic

representations of the palatal glide /j/. The phonetic environment of yod determines which of the graphemes is used; among these the soft sign {ь}, which indicates a preceding palatalized consonant (as discussed above), can also indicate the presence of a yod phoneme. Thus the single soft sign {ь} grapheme can represent two underlying phonemic realities. No doubt the multiple orthographic representations of underlying /j/, the non-uniqueness of the grapheme {ь}, coupled with its articulatory qualities and the 'palatalized' vs. 'palatal' issue all add to learners’ confusions.

Orthography. The four possible orthographic representations of underlying /j/ are: the soft series vowels {я}, {е}, {и}, {ё}, and {ю}; the “short-i” {й}; the soft sign {ь}; or the hard sign {ъ}.

The presence or absence of a following vowel is a primary determining factor in the choice between one of the soft series vowel symbols versus {й}, {ь} or {ъ}. Table 2.4 gives the two main divisions and the three subdivisions of the orthographic norms.

21 ENVIRONMENT EXAMPLE EXAMPLE ORTHO­ (in Russian) (in GRAPHY transcription) 1 . [j#] If /j/ is not followed by a vowel, MOH [moj] it is spelled as {H}: TJiynbtM ['glupÿ] M [jC] jpoHKa [•broika] 2. [jV] When a vowel follows /j/, both jLvia [’jama] the /j/ and the vowel are .VIOfl [mA'ja] soft series indicated by one of the five soft craTbfl [stA't'ja] vowel series vowels: oSbflTHe [Aljfjatfija]

2a. [#jVl In word-initial position, the soft Hjrra [•jalta] series vowels indicate /j7 + en [jel] soft series ! vowel: eacHK [*jo 3 ik] vowel 1 [•jumar] ____ _ KD-VIOp ______1 2 b. TVjV] In inter-vocalic position, the soft .MOfl [eoa ^ ] series vowels indicate /j/ + MOHD [iriA'ju] soft series , vowel; MOH [mAH] o r [mAli] vowel 1 {h } can indicate either [i] or ü iL _ _ _ _ '~2c. EWI When a consonant precedes and jpysbfl [dru'z'ja] a vowel follows the /j/, the soft H3bflTb [i'z*jat*] b/b 1 sign (b) or hard sign (3) is craTbH [stA’t'ji] inserted between the consonant l _ __ and following soft sign vowel:’

Table 2.4. Orthographie representations of [j]. (Adapted from Hamilton, 1980:54-55.)
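The decision procedure summarized in Table 2.4 can be written out as a short Python sketch. It is offered only as an illustration under simplifying assumptions: the function name `spell_yod` and its parameters are hypothetical, vowel phonemes are keyed by Latin letters for convenience, and the prefix-boundary condition for the hard sign follows the morphological rule quoted from Levin (1978:12). The sketch does not model the ambiguity of {и}, which may stand for either [i] or [ji].

```python
from typing import Optional

# Soft-series vowel letters, keyed by vowel phoneme (Latin keys are a
# convenience of this sketch, not a transcription convention).
SOFT_SERIES = {"a": "я", "e": "е", "i": "и", "o": "ё", "u": "ю"}


def spell_yod(next_vowel: Optional[str],
              after_consonant: bool = False,
              prefix_boundary: bool = False) -> str:
    """Return the letter(s) that spell /j/ plus any following vowel.

    next_vowel      -- vowel phoneme after /j/, or None if no vowel follows
    after_consonant -- True if a consonant immediately precedes /j/
    prefix_boundary -- True if that consonant ends a prefix (hard sign case)
    """
    if next_vowel is None:
        # Row 1 of Table 2.4: no following vowel, so "short-i".
        return "й"
    vowel_letter = SOFT_SERIES[next_vowel]
    if not after_consonant:
        # Rows 2a and 2b: the soft-series vowel alone spells /j/ + vowel.
        return vowel_letter
    # Row 2c: a separating sign is inserted before the soft-series vowel.
    separator = "ъ" if prefix_boundary else "ь"
    return separator + vowel_letter


if __name__ == "__main__":
    print("мо" + spell_yod(None))                                   # мой
    print("мо" + spell_yod("a"))                                    # моя
    print("друз" + spell_yod("a", after_consonant=True))            # друзья
    print("от" + spell_yod("e", after_consonant=True,
                           prefix_boundary=True) + "хать")          # отъехать
```

Laid out this way, the learner's difficulty is easy to see: a single phoneme /j/ surfaces as four different spellings, and the soft sign that appears in the third case is the same letter that elsewhere marks only the softness of a consonant.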

’ The use of a soft sign {ь} or a hard sign {ъ} is determined by morphology. The hard sign {ъ} is used to separate a prefix ending in a consonant from a root which begins with /j/.

отъехать [ʌt'jexat'] 'to depart'    prefix = от [ʌt]    root = ехать ['jexat']

съесть [sjest'] 'to eat'    prefix = с [s]    root = есть [jest']

The soft sign {ь} is used to indicate a /j/ which exists between a consonant-vowel sequence that is not separated by the morphological boundary of prefix-root. ("The soft sign is used to separate a consonant from a following j everywhere except at the boundary between prefix and root, where the hard sign is used" (Levin, 1978:12).)

бью [bʲju] 'to beat' (1st per. sg. nonpast)    пьян [p'jan] 'drunk'

Phonetic realization. Just as the surrounding phonetic environment dictates the orthography of underlying /j/, so the stress placement affects the resulting phonetic output. In general, /j/ is fully realized as the front glide [j] only when it is followed by a stressed vowel. In all other environments it tends to be realized as the lax semivowel [j], similar to non-syllabic [i] (Antonova, 1988:136). See Table 2.5 below for supporting examples.

            /j/ followed by stressed vowel        /j/ not followed by stressed vowel
            → realized as glide [j]               → realized as [j]

[#j]        ['juk]         юг                     [jubil'ar]       юбиляр
[VjV]       [mʌ'ju]        мою                    ['moju]          мою
[CjV]       [stʌ't'ju]     статью                 ['kʃʲːast'ju]    к счастью
[jC]                                              ['tʃʲajka]       чайка
[j#]                                              ['tʃʲaj]         чай

Table 2.5. Phonetic realizations of /j/. (Adapted from Antonova, 1988:137.)

2.3 Russian palatalization and L2 acquisition

2.3.1 General approaches to teaching Russian palatalization

Textbooks on teaching Russian phonetics to adult second-language learners emphasize that it is imperative for learners to acquire the palatalized consonants and to understand their general role in the

Russian sound system. To this end, Russian-language pedagogues, who themselves are native Russians, provide useful observations and suggestions on teaching palatalization to foreigners. By reviewing this information, we may gain some insight into how Russians perceive their own sound system as well as how they perceive common "accented" productions. "It is through the mistakes of nonnative speakers that one can better understand the articulatory base and phonological system of Russian" (Bryzgunova,

1963:11).

As pointed out above, two principal distinguishing phonological features of the Russian language are voice and palatalization. Bryzgunova notes that these two categories can be called “pivotal” in the sense that each unites a large group of phonemes. As for teaching palatalization to foreigners, she suggests that learners should focus on acquiring palatalization as a system, rather than as separate sounds

or unrelated entities. In contrast, other phonological features relevant only for individual phonemes can

be mastered on a phoneme-by-phoneme basis (1963:11).¹⁰

Avanesov (1972)¹¹ also notes the prominence and "distinctive specificity" of palatalized

consonants within the Russian system. In his view, when teaching these sounds, one should pay special

attention to contrasting the soft and hard consonants. Because the hard-soft distinction is

meaning-distinguishing, learners must attain a minimal level of mastery.

Thus, L2 instructors of Russian should encourage students to understand the contrastive role of

the hard-soft distinction while simultaneously focusing on uniting elements within the consonantal

inventory.

2.3.2 Accented productions of the palatalized consonants

Attaining a degree of mastery of palatalization which might be described as only a "slight accent" often eludes Russian learners. In fact, it is not uncommon even for advanced students of Russian

(who have taken pronunciation-phonetics courses and studied in-country) to produce the soft consonants incorrectly. Russian linguists offer two common explanations for this problem: (1) the secondary palatalizing articulation, with its simultaneous additional tongue-raising, is "imperceptible" (Bryzgunova,

1963:50); and (2) it is difficult for the nonnatives to internalize the phonemic hard-soft distinction.

Nonnative accented productions of the palatalized consonants are characterized by certain common deviations. Russian phonetics texts and pronunciation handbooks offer four descriptions of these pronunciation errors:

10 Two kinds of phonological features are distinguished in Russian. Some of them can be called the “pivots” of the phonological system, uniting large groups of phonemes: such features are voicelessness–voicing and hardness–softness. If such “pivotal” phonological features are foreign to a nonnative speaker, one has to work not on an individual phoneme but on a whole group of phonemes united by the “pivotal” feature, for example softness. Other phonological features are important only for a single phoneme; violating them leads to distortions characteristic of an individual phoneme (Bryzgunova, 1963:11). 11 One of the most characteristic features of the sound system of Russian is the distinction between hard and soft consonants... However, it is often mastered only with difficulty by non-Russians. Therefore, when non-Russians study Russian, especially serious attention must be paid to the pronunciation of soft consonants as opposed to hard ones. It should be pointed out that hard and soft consonants in Russian serve

1. A palatalized consonant is produced either hard or without enough softness.

2. Soft consonants articulated with the tongue either front or back have excessive frication.

3. Instead of a palatalized consonant [CʲV], learners produce a hard consonant followed by either the front glide [CjV] or an [i]-like vowel [CiV]; for example, either ['pjatka] or [pi'atka] instead of ['pʲatka]. These pronunciations are among the most stubborn “transfer-interference” mistakes.

4. Learners are unable to distinguish sequences of the type [ta]-[tʲa]-[tʲja] (Bryzgunova, 1963:50; Antonova, 1988:28).¹²

The present study is particularly interested in the third and fourth deviations, where learners produce palatalized consonants as two segments instead of one and/or do not distinguish two-segment (CʲV) and three-segment (CʲjV) sequences. Before offering possible explanations for these accented productions, I first discuss some specific qualities of articulation and acoustics of palatalization, and then address L2 acquisition of the palatalized consonants in light of their typological, acoustic and articulatory properties.

2.3.3 Russian palatalization: An uncommon phenomenon

This section provides brief observations on the phenomenon of palatalization as it is realized in the Russian language. Where these observations are associated with the L2 acquisition process, I contrast the Russian and English systems.

The Russian consonantal system with its regular system of paired hard-soft consonants is typologically unusual among the world's languages. The term “palatalization” typically refers to

to distinguish words. Their acquisition therefore takes on particular importance (Avanesov, 1972:100). 12 For many foreigners learning Russian, a major difficulty lies both in the additional articulation itself and in ...

Observed deviations in the pronunciation of soft consonants: 1) hard or insufficiently soft pronunciation of soft consonants, i.e., confusion of hard and soft consonants: y b o m i instead of yBOjiH and vice versa; 2) pronunciation of Cʲ sequences as Cj, i.e., failure to distinguish sequences of the type та–тя–тья; 3) pronunciation of the vowel [i] or of a lax non-syllabic [i] after a soft consonant or before it (Antonova, 1988:28).

restricted changes in certain primary places of articulation, as when velars palatalize to palato-alveolars.... Less often [it] refers to a general secondary articulation across all the primary places in a language, as in Russian, where labials, coronals, and velars can all come in surface contrasting pairs of palatalized versus non-palatalized (Bhat, 1978:67). (See also Keating, 1991:30.)

Moreover, palatalization is a complex process since it does not always produce a single change in the affected consonant. For example, in contrast to traditional explanations which describe palatalization as a single diachronic (or morphophonemic) process, Bhat posits that palatalization can affect consonants in one of two ways:

1. It modifies the primary articulation itself, or,

2. it can add a secondary palatal articulation to the consonant, leaving the main articulation unaltered (Bhat, 1978:50). (See also Ward, 1958:35.)

Given either of these two realizations, palatalization can further result in three distinct processes: tongue fronting, tongue raising, and spirantization, either individually or in combination. Russian palatalized consonants exhibit all three processes, a level of complexity which may hamper Russian L2 learners’ acquisition.

Specifically, in light of the three distinct processes described by Bhat, one might suggest that nonnatives experience difficulty acquiring the palatalized system because there is not a single principle of phonetic correspondence between the hard and soft consonants. For example, consider the hard and soft voiced bilabial stops [b] and [bʲ]. Here the relationship between the paired consonants is straightforward: the palatalized segment is a result of the addition of a secondary palatalizing articulation to the primary non-palatalized bilabial, which itself is not noticeably affected. On the other hand, take the voiced dental stops [d] and [dʲ]. Here the relationship between the hard and soft variants is not as direct: the soft consonant [dʲ] is slightly spirantized (or affricated) and raised in the mouth as compared to the hard one. In addition, the palatalized variant involves an increased amount of linguo-palatal contact.¹³ We encounter yet a different relationship between the voiceless velar stop [k] and [kʲ]. For velars, palatalization affects the primary point of articulation: here the soft variant is characterized by tongue

13 Keating supports Bhat's descriptions with phonetic data from Oliverius (1974). Discussing the relationship between [t] and [tʲ], Keating writes that "the primary coronal articulation is retracted and made laminal by the secondary palatalization. As a result, the stop is generally also affricated" (1993).

fronting (with no simultaneous secondary articulation). In sum, we see that the articulatory consequences of palatalization are not consistent for paired consonants across different points of articulation. This variability may contribute to the problems of learners.

It has also been noted that if L2 has a considerably greater number of vocalic or consonantal phonemes than L1, learners will experience more difficulty acquiring L2 (Bryzgunova, 1963:9-10, referring to Reformatskij, 1960:147-8).* Similarly, Bryzgunova notes that the presence of soft consonants in Russian significantly increases the number of distinctive articulations made in the front portion of the oral cavity. This greater number of “forward” articulations, therefore, complicates AE learners' acquisition process since they must learn to manipulate an increased number of complex lingual articulations in this oral region (Bryzgunova, 1963:52).

Öhman's study (1966) on transconsonantal effects within VCV sequences provides additional evidence on the underlying fundamental differences between the English and Russian consonantal systems and their acoustic realization. (See also Cohn, 1988; Choi and Keating, 1991.) Öhman found that English V₁CV₂ sequences clearly exhibit transconsonantal effects, a type of coarticulation. That is, V₁ influences the properties of V₂. English consonants, therefore, can be said to be somewhat "transparent." Russian stands in distinct contrast since it shows virtually no transconsonantal effects in V₁CV₂ sequences. Instead the consonant, with its possible secondary vowel-like articulation, effectively blocks V-to-V influences.

Purcell's (1979) study on transconsonantal effects in Russian V₁CV₂ sequences provides detailed quantitative support of Öhman's observations. In particular, Purcell investigates the transconsonantal effect of specific vowel features, including: (1) front vs. back, (2) height, (3) place of articulation, and (4) palatalization vs. non-palatalization of consonants. Using step-wise multiple regression analyses on the first (F1) and second (F2) formant frequencies for both vowels in Russian VCV sequences, he demonstrates that Russian shows no transconsonantal effects in either direction (i.e., V₁ does not affect V₂ and vice-versa).

* Bryzgunova’s proposition (that a significant disparity between the number of L1 and L2 vocalic or consonantal phonemes complicates the L2 acquisition process) seems to be somewhat related to Best and Strange's Perceptual Assimilation Model (PAM). PAM provides a strategy for mapping L2 phones on to the L1 phonemic space; in this manner the model accounts for the ease or difficulty with which L2 phones are acquired (Best and Strange, 1992; Polka, 1992). Problems will arise if there is a disparity between the number of “similar” L1 and L2 phones. For example, if L2 has two “similar” phonemes that are virtually indistinguishable to most nonnatives, L2 learners might map the two phones into one L1 category, an example of the highly problematic “single-category” mapping. For example, AE learners of Russian might map the two “similar” Russian fricatives [ʃ] and [ʃʲː] on to the single AE [ʃ] category. In order to truly acquire the two Russian fricatives, most learners require explicit phonetic and articulatory instruction combined with prolonged exposure to the target phones. (See chapter 1 for a more detailed presentation. See also Best and Strange (1992) for a similar discussion of native Japanese speakers learning to distinguish English /y/, /w/, /r/ and /l/.)

Öhman’s and Purcell's results, therefore, show underlying typological acoustic differences between English and Russian since these differences can be related to general articulatory and phonological patterns. Providing a phonological (feature-geometry) analysis of the Russian-English difference, Keating (1985) expands upon Öhman’s and Purcell's findings, further developing basic differences between the two languages’ sound systems. Essentially separate, both C and V tiers can be specified for the vocalic feature [back]. With their secondary vowel-like palatalizing articulations, Russian consonants are specified for vocalic features and, therefore, block cross-segment coarticulation in V₁CV₂ sequences. English consonants, however, being underspecified for vocalic features, can take on the vocalic feature of the following vowel through the process of spreading.

Russian consonants must be specified for vowel features, while English consonants need not be....While data from more languages is required to support the claim, it seems reasonable to conclude that the phonological difference is related to--causes--the acoustic difference. If we assume this, then no other (relevant) differences between the two languages need be posited....The finding of a difference between Russian and English is rarely if ever cited by studies of coarticulation. However, it provides important evidence about the role of language-specific phonological specification in coarticulation (Keating, 1985:6. See also Keating, 1988).

Language-specific properties of coarticulation (that is, the influence between adjacent segments) have been established. It is highly probable that these subtle, and most likely not very salient, qualities of a language are difficult for nonnatives to acquire. We could expand the notion of linguistic specificity to include secondary articulations. Since Russian palatalization is achieved with a secondary articulation, adult L2 learners of Russian must modify their relative articulatory timings to produce the palatalized consonants. While there are English sound environments which involve some degree of palatal glide constriction following a primary articulation (as previously mentioned, 'music', 'beautiful', 'few'), these English sequences have different articulatory properties than Russian palatalized consonants. Therefore, in order to truly acquire the sounds of Russian, native AE-speaking learners must acquire new articulations, new timing relationships between their articulations, and a new phonology, one in which transconsonantal coarticulation is not allowed.
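The blocking account attributed to Keating (1985) in this section can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions, not Keating's formalism: it reduces the analysis to a single binary feature [back], represents underspecification as `None`, and uses hypothetical names (`Consonant`, `blocks_v_to_v`, `surface_back`) and hypothetical feature values for the example segments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Consonant:
    """Toy consonant: `back` is True/False when specified, None when underspecified."""
    symbol: str
    back: Optional[bool]


def blocks_v_to_v(c: Consonant) -> bool:
    """Assumption: a consonant specified for [back] blocks V-to-V coarticulation."""
    return c.back is not None


def surface_back(c: Consonant, following_vowel_back: bool) -> bool:
    """[back] value the consonant surfaces with before a vowel.

    An underspecified consonant takes the value of the following vowel
    (spreading); a specified consonant keeps its own value.
    """
    return following_vowel_back if c.back is None else c.back


# Illustrative segments; the feature values are assumptions made for this sketch.
english_b = Consonant("b", None)          # underspecified for [back]
russian_b_soft = Consonant("bʲ", False)   # palatalized: [-back]
russian_b_hard = Consonant("b", True)     # non-palatalized: [+back]

for c in (english_b, russian_b_soft, russian_b_hard):
    print(c.symbol, "blocks V-to-V:", blocks_v_to_v(c))
```

In this toy version the English consonant is "transparent" because it carries no [back] value of its own, while both members of the Russian hard-soft pair carry one and therefore block vowel-to-vowel influence.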

2.4 Articulation of palatalization

2.4.1 Terminology: Co-articulation vs. secondary articulation

At this point in our discussion it is important to consider terminology associated with Russian palatalized consonants. In the past there has not always been a clear distinction between the terms “co-articulation” and “secondary articulation.” According to Daniloff and Hammarberg (1973), "[c]oarticulation is a term that has been used in many ways by many investigators" (1973:239). For example, for Dunatov "[Russian] palatalization is...a coarticulation" (1963:402). However, more recent linguistic literature is quite uniform in distinguishing the two terms. Coarticulation now refers only to the influence that segments exert on each other, as in the previous discussion of Russian’s transconsonantal effects (Daniloff and Hammarberg, 1973). Coarticulation obtains in connected speech, because at the articulatory level segments are not independently articulated. "Thus, coarticulation is most extensive in running speech" (Daniloff and Hammarberg, 1973:239). Coarticulation can be described as a type of accommodation which,

serves to smooth out the differences between adjacent sounds, differences which would result in transitional sounds between the intended ones if they were articulated in their canonical forms. This appears to be the only sense in which the majority of present-day linguists use the term coarticulation (Chomsky and Halle, 1968:295 and Ladefoged, 1971:19 in Daniloff and Hammarberg, 1973:244).

Thus, coarticulation refers to an inter-segmental phenomenon.

On the other hand, the term “secondary articulation” refers to an intra-segmental phenomenon.

For Ladefoged, "The formal definition of a secondary articulation is that it is an articulation with a lesser degree of closure occurring at the same time as another (primary articulation)" (1993:230). Palatalization is a prime example of a secondary articulation in this sense since it describes the "addition of a...tongue position...to another articulation" (Ladefoged, 1993:230).

Recall that one goal of this study is to investigate learners' inability to distinguish the Russian two- and three-segment sequences, [CʲV] and [CʲjV]. I hypothesize that one way native Russian speakers differentiate primary and secondary palatal articulations in these sequences (i.e., the [j] in [CʲjV] and the [ʲ] in [CʲV]) is through different degrees of stricture. Learners, however, may not produce these subtle distinctions, adding to the accented nature of their productions. I will offer additional acoustic evidence and support for this claim in chapter 4.

2.4.2 General articulatory characteristics of palatalized consonants

There is general agreement regarding (1) the articulatory relationship between Russian hard and soft consonants and (2) the articulatory description of Russian palatalized consonants. Palatalized consonants are characterized by the addition of a secondary, palatalizing closure to the primary one. The palatalizing articulation is achieved by raising the mid part of the tongue upwards and forwards towards the hard palate, resulting in the superimposition of an [i]-like articulation on the primary articulation. The following quotes demonstrate the uniformity of this articulatory description among Russian linguists and scholars:

• "The additional [secondary] articulation of all soft consonants is realized as a raising of the middle part of the tongue, in the front or back portions of the oral cavity, towards the hard palate" (Bryzgunova, 1963:50).

• "...in pronouncing a given [soft] consonant, the middle part of the tongue occupies the position which it occupies during the pronunciation of [i]" (Zinder et al., 1964:28).

• "In palatalization the superimposed subsidiary articulation is [i]-like" (Chomsky and Halle, 1968:306).

• "During the formation of a soft consonant, the tongue occupies a position which is similar to that which is made during the pronunciation of [i] or [j], that is, the middle part of the tongue rises high towards the corresponding part of the palate" (Avanesov, 1972:100-101).

• "For all soft consonants there is a front-mid localization of the tongue dorsum in a convex form as well as movement towards a fronted-raising direction" (Antonova, 1988:11).

It is clear from these descriptions that the [i]-like articulation is achieved during the primary closure, maintained throughout the duration of the consonant and then released "through a palatal approximation...as part of the transition to the next segment" (Clark and Yallop, 1995:65). This articulatory pattern results in a [j]-like, diphthongal transition from the palatalized consonant to the following vowel. "There appears between the consonant and the following vowel a very short sound like Russian [j] or English y (as in yet). It must be emphasized that this intervening sound or 'glide' is very brief and that it must on no account be exaggerated or prolonged..." (Ward, 1958:36). Thus, soft consonants are characterized by an additional fronted articulation as well as an acoustic glide, since the fronting and raising of the tongue changes the general shape and volume of the oral cavity, resulting in the characteristic raised timbre of palatalized consonants (Bryzgunova, 1963:25).

The lingual movements associated with hard consonants are quite different from the fronting and raising movements of the soft ones. Non-palatalized consonants are characterized by a lowering of the mid portion of the tongue* and backing in the pharyngeal area. This lingual movement has traditionally been described as velarization; however, Bolla (1981) claims that his x-ray studies indicate that pharyngealization is a more accurate description. Halle (1971) had also employed the term pharyngealization instead of velarization to describe the Russian hard consonants.

English, however, does not exhibit the paired pattern of palatalization and pharyngealization. Halle proposes that some of the difficulties encountered by AE speakers learning Russian arise from the contrasting relationship between point of articulation and formant frequency transitions for Russian and English. Concerning Russian,

X-ray studies of the pronunciation of these consonants [palatalized (sharp) vs. non-palatalized (plain)] have shown that the most striking changes in vocal tract configuration correlated with these different modes of articulation are a widening vs. narrowing of the pharynx. In some cases the plain consonants are not only velarized, but labialized as well (Halle, 1971:149).

In distinct contrast to this Russian articulatory pattern is the English one:

Whereas in English different configurations of the pharynx are closely correlated with the different points of articulation; in Russian for most points of articulation there are two distinctively different pharyngeal configurations: a widened pharynx for sharped phonemes, and a narrowed or neutral one for plain phonemes (Halle, 1971:151).

Internalizing this 2:1 relationship in Russian between pharyngeal configuration and point of articulation is a substantial obstacle for L2 learners of Russian. In order to attain an adequate degree of phonetic proficiency, AE learners must modify, or "retrain," their lingual articulations so that each point of articulation can be articulated with two different pharyngeal configurations. Given that pharyngeal constrictions are difficult for speakers to sense, it is not surprising that learners struggle with the articulatory patterns.

* Твердые согласные образуются без этой дополнительной «йотовой» артикуляции; средняя часть спинки языка при произношении твердых согласных бывает опущена (Avanesov, 1972:100-101). [Hard consonants are formed without this additional "yod-like" articulation; the middle part of the tongue dorsum is lowered during the pronunciation of hard consonants.]

Thus, Russian consonantal articulations have two basic lingual “shapes” or “forms” [the Russian term уклад, “structure,” is often used here*]. They are:

(1) for palatalized consonants the tongue is raised and fronted, resulting in an overall convex shape,

(2) for non-palatalized consonants the tongue is lowered and backed, resulting in an overall concave shape.

In contrast, American-English does not have regular paired convex-concave lingual articulations. The only similar AE opposition might exist in the two allophones of /l/, clear [l] (as in "Lee" and "less") and dark [ɫ] (as in "sell" and "ill"). Note, however, that AE clear [l] is not as soft as Russian /lʲ/ and AE dark [ɫ] is not as hard as Russian /l/ (Vasil’ev, 1962:54-55). More generally, the convex lingual forms of many Russian consonants result in numerous dorsal articulations; lacking dorsal articulations, English is characterized by numerous apical ones (Vasil’ev, 1962:34).

Identifying general lingual “forms” of an L2 is only one step in acquiring articulatory patterns of L2. Textbooks written in Russian (for Russian learners of English) expand the language-specific lingual “forms” to language-specific “articulatory foundations” (in Russian, артикуляционная база), stating that difficulties in L2 pronunciation are often caused by “differences between the articulatory foundation of their native [Russian] language and English...[and also] by differences in the general form of this foundation/positioning and movement of the speech organs...” (Gal’perina, 1963:16). Thus, in their effort to approximate native-like phonetics of a target language, learners must be actively aware of the similarities and differences between the articulatory bases of L1 and L2. When Russian is the target language, learners should pay special attention to the paired concave-convex lingual forms.

* In a similar light, textbooks written in Russian for Russian learners of English stress the importance of recognizing and acquiring the specific “articulatory foundation” [артикуляционная база] of an L2.

2.4.3 The active articulator of secondary palatalization

Textbooks and handbooks of Russian uniformly describe the secondary articulation of palatalization as a raising and fronting of the middle part of the tongue towards the hard palate. However, describing the active lingual articulator for palatalized consonants as the "middle part of the tongue"* is rather vague. While this description may be sufficient for pedagogical purposes, a more rigorous, precise one is necessary for research. To this end, recent literature in the field of phonetics and phonology has investigated the nature and behavior of the active lingual articulator, specifically the optimal number of specified lingual divisions and their significance to phonological descriptions.

In descriptive linguistic work, the tongue has been divided into either three or four areas. For example, Bloch and Trager (1942) give a three-way division, while Hockett (1958) and Galkina-Fedoruk et al. (1962) suggest four.

Source              Divisions of the tongue
Bloch and Trager    apex, front, dorsum
Hockett             tip, blade, center, dorsum
Galkina-Fedoruk     end, front (perednejazychnyj), center (srednejazychnyj), back (zadnejazychnyj)

Table 2.6. Articulatory divisions of the tongue. (From De Armond, 1966:110.)

For discussion and analysis of the Russian palatalized consonants, the three-way distinction of apex-front-dorsum can be problematic. De Armond (1966) asserts that this errant three-way distinction underlies Dunatov's (1963) mistaken claim that palatal sounds (and, therefore, sounds produced with the tongue blade, the laminals) cannot be palatalized. Since the Russian laminals are paired for palatalization, Dunatov's three-way system is flawed.

A four-way distinction of tip-blade-center-dorsum is preferable, since it allows consonants articulated with the tongue blade (laminals) to acquire a secondary palatalizing articulation realized through raising of the tongue center. For De Armond it is clearly the center of the tongue, and not the blade, which is raised to produce palatalization (De Armond, 1966:67). Bolla also states that, based on his findings from extensive phonetic studies, the secondary palatalizing articulator is indeed the tongue center, or dorsum (1981:70).

* Already at the end of the previous century Bogoroditskij (1884) described the palatalizing articulation of soft consonants as a raising of the middle part of the tongue towards the palate.

More recently, drawing on results from his electro-palatography studies, Recasens claims that "the production of palatal consonants may involve a higher degree of articulatory precision than previously assumed" (1990:267). He also argues for a four-way distinction of both the tongue and palate. As seen in Table 2.7 below, he organizes the four-way lingual division differently than the previous models do. Here "tip" and "blade" are combined to form "laminal," while "center" is divided into "predorsal" and "mediodorsal." In agreement with the most common descriptions of palatalization as raising of the tongue "center" or "dorsum," Recasens indicates the predorsal and/or mediodorsal lingual sections as the active secondary articulator.

Source      Divisions of the tongue
Recasens    laminal (tip + blade), predorsal, mediodorsal, postdorsal

Table 2.7. Articulatory divisions of the tongue. (From Recasens, 1990.)

Drawing on both phonetic and phonological data, recent auto-segmental investigations have been concerned with the “coronal” vs. “dorsal” specification of palatalized consonants. For example, the Halle-Sagey model represents the palatalization of a consonant under the dorsal node (with the feature [back]), suggesting that the tongue dorsum is the active articulator in palatalization. In the Clements-Hume model palatalization is accomplished by the coronal articulator, meaning that it is associated with raising the tongue blade. Subsequent work by Keating, which pays particular attention to phonetic facts, argues that it is the raising and fronting of the tongue body, or dorsum, that is the general characteristic of palatalization. Thus, contrary to Clements-Hume, she argues that it is the dorsal node, with the associated feature [back], that indicates palatalization:

To me as a phonetician, it makes little sense to attribute this palatalization to the tongue blade, even generously construed. The part of the tongue that appears to be attracted to the roof of the mouth is rather far back from the tongue tip, and it is a part of the tongue which is distinct from the part forming primary coronal articulations. If this is all blade, then it’s a blade with two independent sub-parts, and we lose the whole notion of a single coronal articulator. I take Dorsal to be the expression of this articulator behind the tongue blade....We have seen that secondary palatalization has a rather simple articulatory characterization that seems to refer to the tongue body as the active articulator. It is this kind of phonetic evidence that leaves me unenthusiastic for the proposal that palatalization is an articulation of the tongue blade, and more sympathetic towards the kind of tongue body accounts of e.g. Gorecka (1989) or Lahiri and Evers (1991) (Keating, 1993:11-20).

Assertions of a blade-coronal secondary articulator have generally been supplanted by tongue body-dorsum accounts. This shift is particularly apparent when phonetic fact takes precedence over phonological theory. While it must be kept in mind that the relationship between phonetic reality and phonological representation is not direct, familiarity with some phonological theory as it relates to articulatory phonetics can aid us in our attempt to better understand the phenomenon of palatalization.

2.4.4 The passive articulator of secondary palatalization in light of /j/

In general, descriptions of Russian palatalization primarily focus on the active (lingual) articulator. The passive articulator has not been given the same amount of attention.

The passive articulator for the secondary articulation of palatalized consonants is typically said to be the front portion of the hard palate, a rather unspecified designation. One reason for the vagueness is that the hard palate traditionally is divided into only two areas: palatoalveolar, as with [ʃ] and [tʃ], and palatal, as with [ɲ] and [j]. As previously mentioned, Recasens (1990) asserts that palatal consonants may involve a higher degree of articulatory precision than previously assumed; he, therefore, proposes that the palate (like the tongue) should be divided four ways: alveolar, prepalatal, mediopalatal, and postpalatal. While Recasens (1990) does not specifically address the passive articulator of palatalized consonants, he does discuss in detail the passive articulator of [j]: "the degree of lingual contact increases from the postalveolar zone towards the prepalate and mediopalate, and decreases again from that particular location towards the postpalate," and the amount of palatal constriction depends on the "degree of tongue dorsum raising" (Recasens, 1990:274). Data from palatograms indicate that in most cases there is extensive realization of a prepalatal constriction alone or simultaneously with a mediopalatal constriction (ibid.). Recasens, therefore, provides a clear articulatory description of [j]:

ACTIVE ARTICULATOR: predorsum and/or mediodorsum
PASSIVE ARTICULATOR: prepalatal and/or mediopalatal

Having seen how a nonnative speaker of Russian describes a palatal articulation, let us also consider Russians' own descriptions of /j/. Here there is agreement on both the phonemic classification and the articulatory and acoustic characteristics of Russian [j] (Bryzgunova, 1963; Akishina, 1980; Antonova, 1988). [j] is a voiced oral sonorant produced by raising the middle part of the tongue towards the middle portion of the hard palate. The tongue tip is lowered towards the back of the lower teeth and the sides of the tongue are pressed against the upper and lower teeth. Raising the tongue in this way creates a narrow slit through which the air stream passes. The air flowing through the approximate closure is described as having the quality of “weak noise” or as “weak air flow.” “In form and placement of the tongue, the sound approaches that of the vowel [i]; however, unlike [i], the tongue is raised higher and the tension created for [j] is concentrated in the middle part of the tongue”* [trans. mine - E.D.] (Antonova, 1988:136).

Since "noise" and "tension" are frequently used to describe [j], a certain degree of frication seems to be important to its production. In fact, this quality of "tension" seems to be a key element in correctly producing [j]. According to Bryzgunova, common incorrect articulations result in a weak [j], due to inadequate tongue raising and tension, and resulting in lower air pressure** (Bryzgunova, 1963:53).

Ward's description of [j], however, contradicts Bryzgunova's: "[yod] is quite weakly articulated in Russian and tends to disappear when it occurs between two vowels and the second is unstressed." In contrast, the "English y-sound is sometimes pronounced with quite audible friction, which is quite wrong in Russian" (Ward, 1958:13). Perhaps Bryzgunova's and Ward's contradictory descriptions of Russian [j] reflect either its low acoustic and articulatory saliency or its positional variants. It is, therefore, not surprising that students experience difficulty acquiring its correct articulation.

In general, however, articulatory properties of /j/ have been rather well-established, including both active and passive articulators. I have not yet found data which describes in similar detail the passive articulator for the secondary gesture of the palatalized consonants.

If we recall and further develop Bhat's ideas about the multiple articulatory realizations of palatalization (see section 2.3.3 above), one complicating factor in defining the passive articulator of palatalized consonants is that it is variable, depending on the primary constriction's place of articulation. For example, the passive articulator of hard and soft velars differs. For non-palatalized velars, the passive articulator is the soft palate. For palatalized velars, instead of there simply being an additional secondary passive articulator in the prepalatal zone, the entire passive articulator is moved forward on the palate. Similarly, there is not a clear "additive" passive articulator for the secondary articulation of soft dentals. Rather, the presence of palatalization seems to back, raise and lengthen the primary dental articulation, resulting in a prepalatal passive articulator. Labials, however, present a simpler case. Here the secondary passive articulator more or less adheres to the prescribed prepalatal/mediopalatal description given to [j], with the addition of the secondary passive articulator to the primary labial closure.

* По форме и положению языка звук приближается к гласному [и], однако в отличие от [и] подъем языка выше и напряженность при [j] фокусируется в средней части спинки языка (Antonova, 1988:136).

** Слабое произношение [j] объясняется недостаточным подъемом и напряжением средней части языка и более слабым, чем в русском языке, напором воздушной струи (Bryzgunova, 1963:53). [The weak pronunciation of [j] is explained by insufficient raising and tension of the middle part of the tongue and by a weaker pressure of the air stream than in Russian.]

We have thus far established the static active and passive articulators for palatalized consonants and [j]. However, these static descriptions represent only part of the "L2 phonetics equation." In order for L2 learners to achieve phonetic competency in Russian they must also master certain dynamic aspects of articulation, in particular, relative segment durations and articulatory timings. Mastery of such articulatory details can differentiate excellent (virtually no accent) and mediocre (highly accented) L2 phonetics. In fact, I propose that these properties of closure degree and duration distinguish the problematic (see section 2.3.2) two- ([CʲV]) and three-segment ([CʲjV]) sequences. The next section will discuss this particular issue in greater detail.

2.4.5 /CʲV/ vs. /CʲjV/: Palatalization vs. palatals

While palatalized consonants present considerable difficulties for nonnatives, there is an even more troublesome sound environment for L2 learners: “[this] final stage in acquiring the sounds of Russian is the ability to distinguish sound combinations of the type [CʲV] and [CʲjV]” (Bryzgunova, 1963:51). Some degree of mastery of these two sound combinations is desirable since they have wide phonological distribution within the Russian language, e.g., лёд - льёт [lʲot]-[lʲjot], зовёт - завьёт [zʌvʲot]-[zʌvʲjot], сел - съел [sʲel]-[sʲjel], живём - живьём [ʒivʲom]-[ʒivʲjom]. "In this [the CʲV-CʲjV contrast] one of the most important characteristics of the phonological system and articulatory base of the Russian language is apparent"* (Bryzgunova, 1963:50). Therefore, when acquiring the soft consonants, learners should actively develop both the non-palatalized vs. simple-palatalized contrast (/CV/-/CʲV/) and the simple-palatalized vs. palatalized-yod contrast (/CʲV/-/CʲjV/).

Recall, however, that native AE-speaking learners often produce the palatalized consonants as a sequence of two segments, /C/ + /j/. The simplest explanation for this phenomenon is that, when producing L2, learners draw on the most similar L1 phonetic realization, resulting in direct phonetic L1 transfer to L2. For example, while English does not have phonemic palatalization, there are certain environments which contain a somewhat similar palatal combination, e.g., as in 'music', 'beautiful', 'coupon', and 'few'. Here the palatal stands as an autonomous segment (/C/+/j/+/V/). Textbooks of English phonetics written specifically for native Russian speakers provide a useful description of English phonetics. These manuals emphasize that English sound combinations (such as those listed above) must be produced as a sequence of two separate phones, /C/ + /j/. Native-Russian speakers are instructed that, when producing words such as 'music' and 'beautiful', “one must not allow the softness [of the [j]] to affect the preceding consonant...the middle portion of the tongue should be raised to the hard palate only after the articulation of the preceding consonant has been completed” (Gal’perina, 1963:92). And while we cannot always assume direct transfer of L1 phonetics, for this study we hypothesize that the American students who have not yet internalized the palatalized consonants produce them as a bisegmental sequence, /C/ + /j/.

* It is not always the case that the consonant in the [CʲjV] environment is palatalized. It may sometimes be realized as non-palatalized. To this end, Bryzgunova divides the consonants into two groups (Bryzgunova, 1963:50):

1. The consonants d, t, n, l are always pronounced soft when in word-internal or word-final position. When in word-initial position at a morpheme boundary, they can be either hard or soft.

Position                       Russian Word    Phonetic Transcription        Gloss
Word-internal or -final        Татьяна         [tʌ'tʲjana]                   'Tatyana' (woman's name)
                               судья           [su'dʲja]                     'judge'
                               статья          [stʌ'tʲja]                    'article'
                               лью             [lʲju]                        'I pour'
                               коньяк          [kʌ'nʲjak]                    'cognac'
Word-initial position at       подъём          [pʌ'djom] or [pʌ'dʲjom]       'lifting, raising'
morpheme boundary              отъезд          [ʌ'tjest] or [ʌ'tʲjest]       'departure'

2. The consonants b, p, v, f, z, s, m, r can be pronounced both hard and soft before [j] [regardless of the position within the word]. In word-final position r is pronounced soft before [j].

Position                       Russian Word    Phonetic Transcription        Gloss
All positions                  объём           [ʌ'bjom] or [ʌ'bʲjom]         'volume'
                               съем            [sjem] or [sʲjem]             'I will eat (up)'
                               семья           [sʲi'mja] or [sʲi'mʲja]       'family'
                               бурьян          [bu'rjan] or [bu'rʲjan]       'tall weeds'
Word-final r                   перья           ['pʲerʲja]                    'feathers'
                               Марья           ['marʲja]                     'Marya' (woman's name)

One goal of this study is to investigate learners’ capacity to distinguish the sequences [CʲV] and [CʲjV] both in perception and production. Given the fact that learners misarticulate Russian [Cʲ], it is not surprising that they experience even greater difficulty with the [Cʲj] combination. If learners produce [Cʲ] as [C]+[j], how do they produce [Cʲj]? As [C]+[j]+[j]? This is an unlikely solution.

              /Cʲ/         /Cʲj/
NATIVE        [Cʲ]         [Cʲj]
NONNATIVE     [C]+[j]      [????]

My hypothesis is that learners do not distinguish /Cʲ/ and /Cʲj/, but produce them both virtually identically, as [C]+[j]. It therefore seems that the acquisition of these two sequences is linked. Perhaps L2 acquisition of these two sound combinations is a two-step process: learners must first retrain themselves to produce Russian /Cʲ/ as [Cʲ] and not [C]+[j]. Only when they have mastered this articulation can they advance to the next stage, producing Russian /Cʲj/ with the "correct" Russian articulatory and acoustic qualities.

As a teacher and student of Russian, I have both observed and experienced the difficulties that the simple-palatalized ([CʲV]) and palatalized-yod ([CʲjV]) environments present to the learner. In addition to L1 interference, are there specific acoustic and articulatory qualities of the simple-palatalized and palatalized-yod sequences that make them difficult for nonnative speakers to distinguish? The phonemic difference between "palatal" and "palatalization" is quite clear: [j] is a phoneme, [ʲ] is not²¹ (Akishina, 1980:68-69). However, while the functional differences are clear, perhaps the acoustic and articulatory differences between [j] and [ʲ] are not easily accessible and obvious to the learner. Is the [CʲjV] sequence produced similarly to a [CʲV] sequence with the secondary palatalizing articulation simply prolonged? Or do the lingual articulations for secondary palatalization and the palatal segment differ considerably? How acoustically salient is the palatal segment? Perhaps the articulatory timing necessary to produce the Russian palatal [j] is too “foreign” for our AE students of Russian?

²¹ "In contradistinction to palatalized consonants, [j] is a palatal...." В отличие от палатализованных согласных [j] является палатальным (Akishina, 1980:68-69).

Moreover, while prescriptive descriptions of Russian phonetics emphasize that the [CʲV] and [CʲjV] are distinctly different sequences, to what degree do Russians really distinguish the two in their speech production and perception? Based on published phonetic data and personal experience, I hypothesize that Russians easily distinguish the two environments, both in production and perception. A primary goal of the acoustic studies reported in chapters 4 and 5 of this dissertation is to test this hypothesis. In addition, I provide new dynamic quantitative accounts of the acoustic and articulatory differences between the [CʲV] and [CʲjV] environments.

In my view, while dynamic acoustic and articulatory differences do exist between [j] and [ʲ], learners often can neither hear nor produce these differences, due to a number of factors. I hypothesize that the segment [j] is produced with greater stricture than the secondary [ʲ] articulation. It follows that this greater stricture might result in impressionistic descriptions of [j] as “tense” and “noisy.” Initial review of acoustic formant frequency data for the palatalized consonants (with [ʲ]) and [j] in The Sound Pattern of Russian (Halle, 1959) does not, however, indicate an obvious static acoustic difference. My hope is that the results of the present acoustic production and perception studies will advance our understanding of the dynamic differences between the primary and secondary articulations [j] and [ʲ].

2.5 Acoustics of palatalization

This section discusses previous studies on the acoustics of palatalized consonants. I will present the information in chronological order, briefly remarking on significant historical advances in our understanding of palatalized consonants. It must be kept in mind that, throughout the history of modern linguistics, contemporary levels of technology have strongly influenced approaches towards and descriptions of speech sounds.

At the turn of the century linguists investigated the sounds and articulations of Russian using techniques and instruments such as linguagrams, palatograms, kymographs, pneumotachs, spirometers and others (Teijaev, 1989:82). As with current phonetic studies of Russian, the palatalized consonants were often the object of investigation. For example, based on palatographic investigations, Ershov argued that “the soft sounds are 'simple' [i.e., not complex], and not simply a combination of a consonant with j” (Tetjaev, 1989:82). The notion that palatalized consonants are merely hard consonants plus “softening” continued to be investigated until only recently, as will be seen below.

At the beginning of the century phoneticians, such as Shcherba, focused on the influence and relationship between soft consonants and following vowels (Teijaev, 1989:82). Based on analyses of various CV combinations (where C = Cʲ and V = a/o/u), investigators noted the i-like transitional sound audible at the consonant offset and vowel onset. Thus, as early as the 1920s, phoneticians understood the essential role of the i-like transition in conveying the softness of a consonant. It was only later in the 1960s, however, with the widespread use of the spectrograph, that this essential [i]-like transitional element could be quantified.

More recently, Russian phonetics research addressed Trubetskoj's (1960; in Bondarko et al., 1965) proposition that Russian soft consonants are simply hard consonant + “something else.” This same view, though stated somewhat more phonologically, is captured in the observation that “soft consonants are marked while hard consonants are unmarked” (Bondarko and Verbitskaja, 1965). In response to these theoretical propositions, other Russian phoneticians (e.g., Zinder, Bondarko, Verbitskaja, Skalozub and Zlatoustova) analyzed spectrographically the hard and soft consonants of Russian. Results of their experiments refuted the claim that soft consonants are simply the addition of “softening” to hard consonants. Rather, they found that while non-palatalized consonants are characterized by a formant frequency band of greater intensity in the 1,000-2,000 Hz range, for soft consonants this band of greater intensity ranges from 2,000-3,000 Hz. Moreover, the relative amplitudes of the first two vocal tract resonances, the 1st and 2nd formant frequencies (F1 and F2), were seen to play a role in the perception of the consonants. Strengthening the amplitude of the band in the 1,000-2,000 Hz range led subjects to perceive fricatives as non-palatalized. In contrast, strengthening the amplitude of the band in the 2,000-3,000 Hz range led them to identify the same segments as palatalized. It was also found that the band ranging from 600-1,000 Hz was important in identifying the consonant as either hard or soft. In sum, the hypothesis that soft consonants are simply C + “softening” is not supported because they differ from hard consonants in both the 1,000-2,000 and 2,000-3,000 Hz ranges (Zinder et al., 1964; Bondarko and Verbitskaja, 1965).
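The band comparison described above can be illustrated with a toy calculation. The sketch below is not the procedure of the cited studies: the function names, the representation of a spectrum as (frequency, amplitude) pairs, and the sample values are all hypothetical, chosen only to show how strengthening one band could flip a simple energy-based decision.

```python
# Toy illustration only. A "spectrum" is a list of (frequency_hz, amplitude)
# pairs; every name and value here is hypothetical, not data from the studies.

def band_energy(spectrum, low_hz, high_hz):
    """Sum of squared amplitudes for components in [low_hz, high_hz)."""
    return sum(amp ** 2 for freq, amp in spectrum if low_hz <= freq < high_hz)

def boost_band(spectrum, low_hz, high_hz, gain):
    """Return a copy of the spectrum with one band's amplitudes scaled by gain."""
    return [(f, a * gain if low_hz <= f < high_hz else a) for f, a in spectrum]

def classify_by_band(spectrum):
    """Compare energy in the 1,000-2,000 Hz and 2,000-3,000 Hz bands."""
    lower = band_energy(spectrum, 1000, 2000)
    upper = band_energy(spectrum, 2000, 3000)
    if upper > lower:
        return "palatalized"
    if lower > upper:
        return "non-palatalized"
    return "ambiguous"

# A made-up fricative spectrum dominated by the lower band.
fricative = [(1200, 1.0), (1800, 0.8), (2500, 0.3)]
print(classify_by_band(fricative))
# Strengthening the upper band reverses the decision for the same segment.
print(classify_by_band(boost_band(fricative, 2000, 3000, 10)))
```

The two-band comparison is a deliberate simplification; the text also reports a role for the 600-1,000 Hz band, which this sketch ignores.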

Acoustic analysis has shown that the palatalized nature of a consonant is often apparent only during the C-V transition and not during the consonant closure itself. Specifically, at the consonant release-vowel onset in a soft consonant-vowel ([CʲV]) sequence, the second formant band (F2) is raised to approximately 2,000 Hz. (It is helpful to remember that F2 and the frontness/backness of the tongue are proportionally related. That is, a higher F2 indicates a fronted tongue position, while a lower F2 indicates a backed tongue position.)

At the [Cʲ] release, the tongue body is raised high and forward in the oral cavity, resulting in a higher F2 of approximately 2,000 Hz. As the tongue body moves away from its raised, fronted, and "palatalized" position towards the tongue position of the vowel, the F2 is lowered. The acoustic result is an F2 with a negative slope. Eventually the F2 flattens out when it has attained a frequency value associated with the vowel. In sum, the most significant acoustic realization of softness occurs during the [i]-like C-V transition. (See also Derkach et al., 1983:70-73.)

Research on the acoustic properties of Russian from the 1950s through the 1980s often employed the sound spectrograph as an analytical tool (Bolla, 1981; Derkach et al., 1970; Derkach, 1975; Halle, 1971). Investigators continued to quantify articulatory and acoustic properties of the palatalized consonants. Results of their studies indicated that all three formant frequencies (F1, F2 and F3) convey the palatalized or non-palatalized quality of a consonant. The three bands, however, are not equally weighted in their significance. F2 is by far the strongest indicator of palatalization, followed by F1, and then F3 (i.e., F2 > F1 > F3).

Additional acoustic studies on the [i]-like transition confirmed previous conclusions that it provides the essential acoustic information in perceiving a consonant as soft. According to Derkach et al. (1970), upon [Cʲ] release, the raised F2 (of approximately 2,000 Hz) is maintained for approximately 150 milliseconds (msec) into the following vowel, after which F2 lowers to the frequency typical for the given vowel. In the same study, Derkach reports that if 100 msec of the F2 of the [i]-like transition is spliced out, participants identify the consonant as non-palatalized. On a similar note, the Russian phoneticians Panov (1967:119)²² and Bryzgunova (1963:25)²³ note that the higher tone associated with the raised F2 of palatalized consonants is the listener’s acoustic cue for the palatalized nature of the consonant. Because the vocalic portion of a CʲV sequence is characterized by a brief initial [i]-like portion followed by a transition and then vowel steady-state, some have described vowels following soft consonants as diphthongoids (Bolla, 1981:61).
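The F2 pattern described above (a raised onset, a falling transition, then a vowel steady state) can be sketched as a piecewise function. This is a minimal sketch under stated assumptions: the 2,000 Hz onset and 150 msec hold come from the text, while the linear shape of the fall, its 50 msec duration, and the 1,200 Hz vowel target are hypothetical values chosen purely for illustration.

```python
# Piecewise sketch of F2 after a palatalized-consonant release.
# From the text: onset of about 2,000 Hz held for about 150 msec.
# Hypothetical: the linear fall, fall_ms, and the vowel target below.

def f2_at(t_ms, vowel_f2_hz, onset_f2_hz=2000.0, hold_ms=150.0, fall_ms=50.0):
    """Return an idealized F2 value (Hz) at t_ms after consonant release."""
    if t_ms <= hold_ms:
        return onset_f2_hz                      # [i]-like portion
    if t_ms >= hold_ms + fall_ms:
        return vowel_f2_hz                      # vowel steady state
    fraction = (t_ms - hold_ms) / fall_ms       # position within the fall
    return onset_f2_hz + fraction * (vowel_f2_hz - onset_f2_hz)

VOWEL_F2_HZ = 1200.0  # hypothetical F2 target for a back vowel

for t in (0, 150, 175, 200, 300):
    print(t, f2_at(t, VOWEL_F2_HZ))
```

Because the vowel target is lower than the onset value, the middle segment has a negative slope, matching the falling F2 described in the text; a real trajectory would of course be smoother than this idealization.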

Halle adduces articulatory x-ray data to explain L2 learners’ perception of the Russian palatalized consonants. Recall that palatalized consonants are characterized by a wide pharynx, non-palatalized ones by a constricted pharynx. Therefore, when a palatalized consonant (widened pharynx) is followed by a front vowel (widened pharynx), there is virtually no transitional effect. Likewise, when a non-palatalized consonant (constricted pharynx) is followed by a back vowel (constricted pharynx), there is little acoustic transitional effect. In contrast, when a palatalized consonant (widened pharynx) is followed by a back vowel (constricted pharynx), the resulting acoustic transition is quite extreme. The same holds true for a non-palatalized consonant (constricted pharynx) followed by a front vowel (widened pharynx). Significant changes in the width of the pharynx result in acoustically salient transitions. Halle notes that “[t]he transitional effects explain why foreigners often perceive Russian sharped [palatalized] consonants as being followed by [j], and plain [non-palatalized] consonants as being followed by a [w]” (Halle, 1971:153). To this end, it has been noted in pedagogical-pronunciation literature that it is easier for the learner to hear palatalization when the palatalized consonant is followed by a back vowel, no doubt because of the greater acoustic salience due to the transition.

²² Каждый мягкий согласный выше по тону, чем его парный твёрдый [Every soft consonant is higher in tone than its paired hard counterpart] (Panov, 1967:119).
²³ Акустический эффект продвинутой вперёд артикуляции гласных после мягких согласных хорошо осознаётся говорящими на русском языке [The acoustic effect of the fronted articulation of vowels after soft consonants is readily perceived by speakers of Russian] (Bryzgunova, 1963:25).

2.6 Summary

This chapter has served to set the stage for our study of Russian sound sequences which are problematic for L2 learners of Russian: the palatalized consonants and the front glide [j]. This chapter discussed:

1. the typology of the Russian consonantal system, especially the role of palatalized consonants and the front glide [j],

2. the numerous orthographic representations of both palatalized consonants and [j],

3. detailed articulatory and acoustic properties of the soft consonants,

4. observations on “accented” productions of the palatalized consonants,

5. hypotheses as to why adult AE learners of Russian experience such difficulty with the simple-palatalized and palatalized-yod sequences.

To better understand learners’ difficulties with the palatalized sequences, this study endeavors to provide a dynamic description of Russians' and learners’ productions and perceptions of Russian C-V sequences which contain palatalized consonants. In order to discuss such dynamic properties, however, an appropriate descriptive system must first be adopted. The next chapter presents several possible phonological descriptive systems. It will emerge, however, that none of these systems is well-suited for describing the results of the acoustic production experiments of chapter 4. Therefore, an alternative phonological system will be proposed and supported.

CHAPTER 3

AN OVERVIEW OF PHONOLOGICAL MODELS AND PALATALIZATION

3.1 Traditional phonological models

This chapter considers the description of the Russian consonantal system in various phonological frameworks. I will present a brief overview of several of the most important phonological models of this century and will offer, for each model discussed, an associated phonological representation of native and accented productions of Russian palatalization. Each model's inherent inability to describe phonetic aspects of the L2 acquisition of the soft consonants will emerge. Finally, I will offer a model which optimally serves the purposes of this study.

3.1.1 Palatalization in Distinctive Feature Theory

Based on visible patterns in spectrographic-acoustic analyses, Jakobson, Fant and Halle (JFH) in Preliminaries to Speech Analysis (1963) propose twelve pairs of binary oppositions, or distinctive features, for describing the sounds of the world’s languages.²⁴ These features make no reference to absolute quantities, i.e., absolute frequency values. Instead, the features are based on relative properties visible in spectrograms. For example, consider the description of the features compact/diffuse: "In the consonants, compactness is displayed by a predominant formant region, centrally located, as opposed to phonemes in which a non-central region pre-dominates" (Jakobson, Fant and Halle, 1963:27).

²⁴ The twelve paired features are: (1) VOCALIC/NON-VOCALIC, (2) CONSONANTAL/NON-CONSONANTAL, (3) INTERRUPTED/CONTINUANT, (4) CHECKED/UNCHECKED, (5) STRIDENT/MELLOW, (6) VOICED/UNVOICED, (7) COMPACT/DIFFUSE, (8) GRAVE/ACUTE, (9) FLAT/PLAIN, (10) SHARP/PLAIN, (11) TENSE/LAX, (12) NASAL/ORAL (Jakobson, Fant and Halle, 1963:40).

In this system, the palatalized/non-palatalized opposition is expressed with the features SHARP/PLAIN (which is their only use in the system). The feature SHARP is realized in the spectrogram as "a slight rise of the second formant and, to some degree, also of the higher formants" (31). The articulation of palatalization is described as “raising a part of the tongue against the palate” (31), thereby reducing the volume of the oral cavity. In addition, the palatalizing articulation is executed simultaneously with the primary articulation and accompanied by a “dilation of the pharyngeal pass” (32). Table 3.1 illustrates the relationship between paired non-palatalized and palatalized segments (here, the paired voiced bilabial stops) and their associated distinctive features, where a "+" in a column means that the segment is characterized by the associated feature.

                   /b/    /bʲ/
1. Vocalic          -      -
2. Consonantal      +      +
3. Compact          -      -
4. Grave            +      +
5. Nasal            -      -
6. Tense            -      -
7. Continuant       -      -
8. Sharp            -      +
9. Voiced           +      +

Table 3.1. The acoustically-based distinctive features for the palatalized and non-palatalized voiced bilabial stops. (Adapted from Jakobson, Fant and Halle, 1963:43.)

Note that the paired soft/hard consonants are distinguished only by the feature SHARP. All other features are identical.
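The observation that the pair differs in a single feature can be checked mechanically. The following is a minimal sketch: the feature values are those of Table 3.1, while the dictionary layout and the helper name differing_features are hypothetical conveniences, not part of the JFH framework.

```python
# Feature bundles for the paired voiced bilabial stops, following Table 3.1.
# True stands for "+", False for "-". Names are illustrative only.

B_PLAIN = {
    "vocalic": False,
    "consonantal": True,
    "compact": False,
    "grave": True,
    "nasal": False,
    "tense": False,
    "continuant": False,
    "sharp": False,
    "voiced": True,
}

# The palatalized counterpart is the same bundle with SHARP switched on.
B_SHARP = {**B_PLAIN, "sharp": True}

def differing_features(a, b):
    """Return the sorted names of features whose values differ."""
    return sorted(name for name in a if a[name] != b[name])

print(differing_features(B_PLAIN, B_SHARP))
```

Each value is a strict Boolean, which also makes the categorical character of the model concrete: the representation has no way to encode a partially SHARP segment.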

The distinctive features and their associated segments are inherently categorical. That is, a segment is either completely CONSONANTAL or it is not. It is either VOICED or it is not. The purpose of the model is to reflect a speaker’s internal phonology and not acoustic reality. With its delineated organization ("beads on a string"), the model cannot account for (the more phonetic) varying degrees of segment overlap and partial acquisition of L2 phones.

3.1.2 Application of the JFH Distinctive Feature Theory to Russian: The Sound Pattern of Russian

In The Sound Pattern of Russian, Halle (1971) also emphasizes the contrastive acoustic elements of speech as they relate to the distinctive features. The paired feature SHARP/PLAIN continues to be used to specify the palatalized nature of Russian consonants. Results of Halle's acoustic investigations confirm the acoustically salient nature of palatalization. In fact, the presence of SHARP affects the acoustics of C-V transitions more than the place of articulation:

In Russian, therefore, the transitions are dependent on two factors, the point of articulation of the stop and the feature of sharping (palatalization). In the body of data that was investigated the formant transitions were most closely correlated with the feature of sharping. They did not show as systematic a correlation with the point of articulation... (1971:135)

3.1.3 Palatalization in the SPE framework

Chomsky and Halle's (1968) seminal work The Sound Pattern of English (SPE) modified earlier distinctive feature theory by basing features on articulatory patterns rather than relative acoustic ones. In addition, they introduced the notion of linear phonology and rule-based . The acoustically-based features were rejected since they were unable to provide satisfactory motivation for some phonological patterns. The twelve paired Jakobsonian features were replaced with fewer articulatorily-based ones.

A major difference between the (previous) Jakobsonian and SPE systems is in their respective use of the distinctive features for consonants and vowels. While JFH used the features uniformly for both vowels and consonants, Chomsky and Halle in SPE argue that "this complete identification of vowel and consonant features seems in retrospect to have been too radical a solution..." (1968:303). Therefore, SPE specifies vowels with the main features [±back] and [±high] (and [±low] if such a distinction is needed), while consonantal strictures are described with the main features [±continuant], [±sonorant], [±anterior], and [±coronal].

Since the vocalic features characterize tongue body placement, they are used also to indicate secondary consonantal articulations.²⁵ For secondary palatalization (and velarization) only the features [±high] and [±back] are necessary. The use of these features to distinguish non-Russian plain, Russian velarized hard, and Russian palatalized soft consonants as well as [j] can be seen in Table 3.2:

²⁵ "The most straight-forward procedure is, therefore, to express these superimposed vowel-like articulation with the help of the features ‘high,’ ‘low,’ and ‘back,’ which are used to characterize the same articulations when they appear in the vowels. We shall say that palatalized consonants are high and nonback..." (Chomsky and Halle, 1968:306).

             [m]       [mˠ]          [mʲ]            [d]       [dˠ]          [dʲ]            [j]
             (plain)   (velarized)   (palatalized)   (plain)   (velarized)   (palatalized)

Anterior      +         +             +               +         +             +               -
Coronal       -         -             -               +         +             +               -

High          -         +             +               -         +             +               +
Back          -         +             -               -         +             -               -

Table 3.2. SPE features for palatalized/non-palatalized pairs. (Adapted from Chomsky and Halle, 1968:307.)

Purely for the sake of comparison, I have included the plain consonants, which are not part of the Russian paired consonantal system. Note that plain consonants are designated with the features [-high] and [-back]. Velarization is accomplished by backing and raising the tongue body and, therefore, is associated with the features [+high] and [+back]. Palatalization is accomplished by raising and fronting the tongue body and is, therefore, expressed with the features [+high] and [-back]. Palatalized consonants and the palatal glide [j] are characterized by the same tongue body features used in the description of the high front vowel /i/. This similarity will be significant in chapter 4, where we compare and contrast native and nonnative productions of /CʲV/ and /CʲjV/ sequences.
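The shared tongue body specification can be made explicit with a small lookup. The sketch below encodes only the [high] and [back] values given in the discussion of Table 3.2; the dictionary, the function names, and the label strings are hypothetical, and the velarization diacritic in the keys is simply one way of writing the velarized segments.

```python
# Tongue body features ([high], [back]) following the discussion of Table 3.2.
# True stands for "+", False for "-". Names and labels are illustrative only.

TONGUE_BODY = {
    "m":  {"high": False, "back": False},   # plain
    "mˠ": {"high": True,  "back": True},    # velarized
    "mʲ": {"high": True,  "back": False},   # palatalized
    "d":  {"high": False, "back": False},   # plain
    "dˠ": {"high": True,  "back": True},    # velarized
    "dʲ": {"high": True,  "back": False},   # palatalized
    "j":  {"high": True,  "back": False},   # palatal glide
}

def tongue_body(symbol):
    """Return the (high, back) pair for a segment."""
    features = TONGUE_BODY[symbol]
    return (features["high"], features["back"])

def tongue_body_label(symbol):
    """Name the tongue body posture implied by [high] and [back]."""
    high, back = tongue_body(symbol)
    if high and back:
        return "raised and backed"
    if high:
        return "raised and fronted"
    return "neutral"

for symbol in TONGUE_BODY:
    print(symbol, tongue_body_label(symbol))
```

In this encoding [mʲ], [dʲ] and [j] receive identical tongue body values, which is exactly why these features alone cannot tell secondary palatalization apart from the palatal glide.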

3.1.4 Palatalization in a Non-Linear framework

In the traditional linear framework there is no specific organization of the segmental features. In other words, the phoneme is “an unstructured bundle” (Kenstowicz, 1994:154). This lack of organization is problematic since "[it] gives the misleading impression that features may freely combine in the construction of a phonemic inventory" (ibid.:145). Moreover, linear frameworks do not adequately reflect the limitations and patterns of linguistic change imposed by articulatory reality. Linear models are, therefore, too powerful. For example, traditional models cannot adequately account for complete assimilation. "Under the earlier generative conception of the features...complete assimilation is rather mysterious. Simply listing all features as changing simultaneously makes for a complex rule.²⁶ Furthermore, if all features can assimilate, then why not all but one or all but two?" (Kenstowicz, 1994:154). Insufficiencies in the traditional linear model led to the development of a feature hierarchy called Feature Geometry.

²⁶ "...previous generative descriptions typically invoked some ad hoc convention such as [n] → [αFs] / ___ [αFs], where [Fs] abbreviated all the features of a segment" (Kenstowicz, 1994:154).

Similar to the traditional linear model, feature geometry is based "on the Jakobsonian idea that phonological segments are...bundles of distinctive features whose behavior can be understood from the elucidation of this internal structure” (Kenstowicz, 1994:451). Unlike the linear model, however, the features are structured hierarchically around six articulators. In this manner, the non-linear model better reflects phonetic reality. The exact organization of the features is controversial. However, because the purpose of this chapter is to present an overview of the general characteristics of the most notable non-linear models, I will not argue in favor of one feature arrangement over another. (See Halle (1992), Sagey (1986), Clements (1991), and Hume (1992) for such discussions.) Instead, I will present an overview of a general non-linear model. While the non-linear model has many inherent advantages over previous linear models, we will nonetheless see that, like the linear models already discussed, it is not the best model for the purposes of this study.

In the Halle-Sagey Articulator Model the features are organized and grouped according to six articulator nodes: GLOTTAL, TONGUE ROOT, SOFT PALATE, LABIAL, CORONAL and DORSAL; see Figure 3.1. Note that the specific features (many familiar from linear models) are located on either the upper left-hand corner or along the bottom of the figure.

[continuant]  [strident]  [lateral]

LARYNGEAL       SUPRALARYNGEAL
                ORAL PLACE

GLOTTAL         TONGUE ROOT   SOFT PALATE   LABIAL    CORONAL         DORSAL
[stiff vf.]     [ATR]         [nasal]       [round]   [anterior]      [high]
[slack vf.]     [RTR]                                 [distributed]   [low]
[spread gl.]                                                          [back]
[constr. gl.]

Figure 3.1. Representative tree structure of Non-linear Phonology, specifically, the Halle-Sagey Articulator Model. (From Kenstowicz, 1994:452.)

The non-linear model is similar to the traditional JFH one since both "consonants and vowels are described with the same set of articulators...The model [therefore] promises to capture consonant-vowel homologies" (Kenstowicz, 1994:145). In this particular organization, vowels are expressed only on the DORSAL node with the three features of [±high], [±back], and [±low]. Because of its vocalic ([i]-like) nature, secondary palatalization is also expressed on the DORSAL node as [+high] [-back].

Clements (1991) and Hume (1992) argue for a somewhat different organization and implementation of the features. Drawing on both phonetic and phonological evidence, they refute the expression of front vowels on the DORSAL node. Instead they propose “the hostile takeover of [back] by the coronal and dorsal articulators. Specifically, back vowels are implemented by the dorsal articulator and front vowels by the coronal articulator” (Kenstowicz, 1994:464). Since this model expresses both front vowels and consonants through the coronal feature, their shared articulatory quality is emphasized. It follows that "front vowels such as [i] should also be [coronal]” (Hume, 1992:45), a proposition that "departs sharply from the traditional SPE view in which front vowels are specified as [-back]" (ibid.:52). In addition to phonological patterning, Hume cites phonetic X-ray tracings to support the use of [coronal] for front vowels.

The DORSAL vs. CORONAL debate, however, is in no way concluded and linguists continue to research and discuss these phonological issues. For example, Keating (1993) has since countered Hume's proposition. Also basing her conclusions on phonetic X-ray data, Keating states that Russian palatalization should be expressed as [-back] on the DORSAL node.

How would these non-linear models represent Russian palatalized consonants? Figure 3.2 illustrates two variations of non-linear expressions for the Russian voiced palatalized bilabial stop, [bʲ], one in the Halle-Sagey Articulator model and the other in the Clements-Hume model. For the purposes of this discussion, I include only the most relevant features.

Halle-Sagey Model               Clements-Hume Model

/bʲ/                            /bʲ/

[+consonantal, -sonorant]       X
Supralaryngeal                  C-Place
Oral Place                      Labial    V-Place
Labial    Dorsal                          coronal
          [+high] [-back]

Figure 3.2. Examples of non-linear representations of Russian palatalized [bʲ] in abbreviated forms of both the Halle-Sagey and Clements-Hume models.

To summarize, in Figure 3.2 we see that the Halle-Sagey model expresses palatalization as [+high] [-back] on the DORSAL tier while the Clements-Hume model expresses palatalization through a coronal articulator under the V-Place node of the consonant.

One strength of the non-linear model is its ability to capture inter-segmental influences. For example, FEATURE SPREADING is expressed formally through association lines drawn from the features or nodes of one segment to another. Thus, the non-linear model deviates sharply from the traditional linear model in that instead of a feature simply being copied from one segment to another, the feature is shared. "There is no longer a one-to-one relation between segments of the string and feature specifications" (Kenstowicz, 1994:310). To this end, the non-linear model better captures articulatory and phonetic reality since simple facts of physiology and articulations often lead to segment overlap and inter-segmental influences in speech production.
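The contrast between copying a feature and sharing it has a close analogue in the difference between duplicating an object and holding two references to the same one. The sketch below is only an analogy under stated assumptions: the class names, the two functions, and the nasal-plus-stop example are hypothetical and are not drawn from Kenstowicz or from the models discussed.

```python
from dataclasses import dataclass

# Analogy only: all class names, function names, and the example segments
# are hypothetical illustrations, not part of any published formalism.

@dataclass
class PlaceNode:
    articulator: str

@dataclass
class Segment:
    label: str
    place: PlaceNode

def assimilate_by_copy(target, trigger):
    """Linear-style change: the target gets its own copy of the value."""
    target.place = PlaceNode(trigger.place.articulator)

def assimilate_by_spreading(target, trigger):
    """Non-linear-style change: both segments link to one and the same node."""
    target.place = trigger.place

nasal_a, stop_a = Segment("N", PlaceNode("coronal")), Segment("b", PlaceNode("labial"))
assimilate_by_copy(nasal_a, stop_a)

nasal_b, stop_b = Segment("N", PlaceNode("coronal")), Segment("b", PlaceNode("labial"))
assimilate_by_spreading(nasal_b, stop_b)

# Equal values in both cases, but only spreading yields a single shared node.
print(nasal_a.place == stop_a.place, nasal_a.place is stop_a.place)
print(nasal_b.place == stop_b.place, nasal_b.place is stop_b.place)
```

After spreading there are two segments but one place specification, which is the sense in which the one-to-one relation between segments and feature specifications no longer holds. Note that the shared node is still all-or-nothing: nothing in this structure records how much, or when, the two segments overlap.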

How are these qualities of non-linear phonology related to our discussion? The present study proposes that beginning L2 learners of Russian produce the palatalized consonants as a sequence of two segments, /C/ + /j/. As seen in Figure 3.3²⁷, the non-linear framework can illustrate and contrast nonnative bisegmental productions with native monosegmental ones:

Native Production          Nonnative Production

/bʲ/                       /b/          /j/

X                          X            X

C-Place                    C-Place      C-Place

Labial    V-Place          Labial       V-Place

          coronal                       coronal

Figure 3.3. Example of native versus nonnative productions of Russian palatalized /bʲ/, expressed in a simplified version of the Clements-Hume framework.

It is interesting to note that non-linear frameworks retain aspects of both the JFH and SPE models. Like the JFH model, non-linear models propose unified features for both consonants and vowels; like the SPE model, they use features that are based on articulators (and not acoustics). The features in linear and non-linear frameworks, however, are still seen as static, categorical on/off switches that can themselves in no way express relative timing properties. And while non-linear models can express inter-segmental influence through processes such as FEATURE SPREADING, the result is a categorical change to the associated segment. Recall that Chapter 1.4 discussed recent propositions that learners’ acquisition of L2 sounds occurs in gradual stages, not categorically. In light of learners’ continuous stages of phonetic acquisition, the next section (3.1.5) discusses why previously discussed linear and non-linear frameworks are inadequate.

²⁷ Note that in the given simplified representation, /j/ would not be distinguishable from the front high vowel /i/. Syllable position must also be referenced in order to differentiate the two. Hume comments on this fact: "Reference to syllable position alone is sufficient to account for the front vowel/palatal glide alternations discussed above. From vowel to glide, the segment changes from syllable nucleus to non-nuclear position" (1992:80).

3.1.5 Summary: Rejecting previous phonological models

Adult learners recognize and gain control of new phonetic L2 articulatory patterns and timings gradually. I, therefore, propose that when acquiring the palatalized consonants, L2 learners progress in gradual stages: from producing two segments, /C/+/j/ (due to negative transfer from L1), to producing the doubly-articulated palatalized segment /Cʲ/. However, it is also possible that some learners never attain native-like phonetic proficiency in producing the palatalized consonants; rather, they reach a level of phonetic acquisition that falls somewhere between the two extremes. With their focus on the categorical, abstract organization of speech segments and externally-timed phonological units, none of the phonological models discussed thus far can satisfactorily describe learners’ gradient stages of phonetic acquisition.

Why are the previous phonological models inadequate for describing gradient processes? The Jakobsonian distinctive feature and SPE models have discrete segments as their phonological units. Therefore, due to the underlying structure of the system, segment classifications are limited to being either completely palatalized ([+sharp] for Jakobsonian; [+high][-back] for SPE) or completely non-palatalized. Non-linear models are less segmental in that they can express segmental overlap through feature spreading. Nonetheless, their features are inherently categorical with no internal dynamics and, as such, “spreading and delinking will always be inadequate for representing noncategorical processes” (Zsiga, 1997:261). Linear and non-linear models, therefore, cannot provide a satisfactory account of partial gradual phonetic acquisition, a noncategorical process. This is partly due to the fact that their surface representations have nonoverlapping timing slots which are superordinate to atemporal phonological features. Thus, "the temporal characteristics of the segment is not a part of its phonological representation; the units are externally timed" (Byrd, 1994:5).

In light of learners’ gradual acquisition process, this study endeavors to provide a dvnamicallv-

oriented description of certain sequences containing palatalized consonants. To this end I analyze and

compare native and nonnative productions of /CV/, /OV/, /CljV/ and /CÜjV/ sequences. I focus on

dynamic gradient properties of the palatal constrictions, particularly those of inter-articulatory timing and constriction degree. I hypothesize that native and accented productions differ fundamentally in these two

properties. By analyzing and comparing their productions of the four sequences, I hope to provide new

insights into the dynamic source of learners’ accents. The proposed analyses, therefore, require a phonological model that can illustrate dynamic properties of speech; a model that describes articulatory timing through an intrinsic-timing approach would be ideal. To this end, I offer the framework of

Gestural Phonology (Browman and Goldstein, 1986; Browman and Goldstein, 1989; Saltzman and Munhall, 1989; Zsiga, 1994; Zsiga, 1997). With its intrinsically dynamic phonological unit of the GESTURE and its focus on phonetic reality, Gestural Phonology is preferable to both linear and non-linear models for the present study. The remaining sections of this chapter present and discuss the Gestural Phonology framework in detail.*

* I realize that, by choosing a specific phonological model mainly for its ability to represent phonetic reality, I am beginning to delve into the ambiguous realm of the phonetics-phonology interface. In fact, with my emphasis on the continuous nature of L2 acquisition, one might argue that I am actually addressing the concrete and quantitatively accessible field of phonetics, and not abstract theories of phonology. It is my goal, however, to provide a description which falls somewhere between the two disciplines: a description which can satisfy, to some degree, both phonetic and phonological requirements. This is not an easy task since the relationship between phonetic reality and phonological theory is not always direct. That is, there is not a simple one-to-one relation between the articulatory and acoustic correlates of features (Halle and Stevens, 1991). Moreover, “[f]inding the proper balance between these phonological and phonetic considerations in an explicit representational scheme for the sounds of language continues to be a central question of linguistic theory” (Kenstowicz, 1994:136). Keeping these issues in mind, I propose that Gestural Phonology presents an optimal balance between phonetic and phonological description for the purposes of this dissertation.

3.2 Gestural Phonology

3.2.1 Introduction to Gestural Phonology

In this section I will give a brief summary of the Gestural Model, drawing closely from

Browman and Goldstein (1989) and Saltzman and Munhall (1989). I will first present some of the framework's basic qualities and features and then discuss its organization. I will then demonstrate how the model accounts for the palatalized consonants of Russian when L2 acquisition is an important consideration. To this end, by comparing and contrasting native and nonnative productions of the palatalized consonants in several environments, this study will provide a unique and detailed account of both productions. Finally, thanks to the inherent dynamics of gestures, the Gestural Model can illustrate the various, continuous stages that L2 learners pass through when acquiring these important and often troublesome sounds of the Russian language.

It would be helpful at the outset to consider the origins of the Gestural Model and how it differs from previous phonological frameworks. An important and often controversial issue in proposing a phonological model is that phonetic reality often does not directly reflect the proposed phonology.

Saltzman and Munhall (1989) comment on this discrepancy in their article "A Dynamic Approach to

Gestural Patterning in Speech Production" stating,

...we attempt to reconcile the linguistic hypothesis that speech involves an underlying sequencing of abstract, discrete, context-independent units, with the empirical observation of continuous, context-dependent interleaving of articulatory movements. (333)

How can one best reconcile traditional linguistic analyses (discrete, context-independent units) with experimental observations of speech articulation and acoustics (continuous, context-dependent flow)? How can one reconcile the hypothesis of underlying invariance with the reality of surface variability? (335)

Discrepancies between phonological description and phonetic reality can be attributed to the inherently different natures of phonetic and phonological descriptions: "Only in the phonology are there discrete and timeless segments characterized by static binary features " (Keating, 1988:4). In contrast to these theoretical quantal segments, the phonetic output is a continuous stream, where proposed discrete

(phonological) segments overlap to varying degrees. Keating (1988) clearly states some important insufficiencies of traditional segmental phonological theory:

Phonological representations involve two idealizations. They idealize in time with segmentation, by positing individual segments which have no duration or internal temporal structure. Temporal information is limited to the linear order of segments and their component features. These idealizations are motivated by the many phonological generalizations that make no reference to quantitative properties of segments, but do make reference to categorical properties. Such generalizations are best stated on representations without the quantitative information, from which more specific and detailed representations can then be derived. Because phonological and phonetic representations are different [emphasis mine - E.D.], the rules that can operate on each must be different. (3-4)

In this same paper, Keating provides additional support for a variable model of speech production, drawing on research on the gradient nature of Arabic tongue backing: "neither segmental feature analysis

(nontraditional or autosegmental) can provide a good account....Clearly categorical phonological rules cannot describe such effects" ( 8 8 ).

In sum, literature on the interface of phonetics and phonology acknowledges that there is often an almost irreconcilable discrepancy between traditional phonological models (with discrete features and their phonological units) and phonetic output. This discrepancy often arises from the dynamic nature of speech. The model of Gestural Phonology, however, can successfully bridge the phonetics-phonology interface.

3.2.2 The dynamic and compensatory nature of Gestural Phonology

Gestural Phonology is an articulatorily-based phonological model founded on an intrinsic-timing approach. Therefore, it accounts for the dynamic properties of speech quite well. The strength of articulatory phonology lies in its ability to describe articulatory dynamics within and between segments.

Able to reference specific moments within a segment, the model can account for phonetic alternations of continuous speech in a systematic and predictable manner by focusing on inter-articulatory timing.

Examples of such alternations are segment insertion, deletion, assimilation and weakening.

Recall that Keating (1988:2) claims that the dynamics and details of coarticulation are language-specific. (See also Byrd, 1994:142.) Similarly, Zsiga (1996, 1997) examines the degree of language-specific CC overlap as it relates to production of the palatalized consonants by native Russian speakers and AE learners of Russian. She examines production of [sʲ] (in Russian) and [s#j] (in both Russian and English, where "#" indicates a word boundary) sequences and finds that the two languages differ in the degree of CC overlap. Moreover, the internalized L1 degree of overlap is generalized to L2 production, thereby "coloring" it. I extend the understanding that coarticulation is language-specific to include language-specific secondary articulation.

When acquiring L2 phonetics, learners must internalize new language-specific (and sometimes segment-internal) articulatory timings. It is proposed that mastery of such L2 timings occurs gradually in a process which cannot be captured in discrete, timeless accounts of phonology. However, because the gestural framework describes speech production and inter-articulatory timing on a continuum, it can clearly account for both native and nonnative productions of the Russian palatalized consonants (with their "correct" and "incorrect" or "accented" timings, respectively).

Similar to other phonological frameworks, the phonological units of the gestural model reflect constraints imposed by the articulatory system. A crucial source for the model's organization is related to recent studies on control of skilled motor movements. Our basic understanding of motor control has evolved from one in which movements were specified by "rigid or hard-wired control of joint and/or muscle variables" (Saltzman and Munhall, 1989:337) to one which views movement as the result of controlled COORDINATIVE STRUCTURES:

...[W]e and others have asserted that several coordinate systems (e.g. articulatory and higher-order, goal-oriented coordinates), and mappings among these coordinate systems, must be involved implicitly in the production of speech (ibid.:335).

Thus, the "goal" of the COORDINATIVE STRUCTURE is

...defined abstractly or functionally in a task-specific, flexible manner.... These attributes of task-specific flexibility, functional definition, and time-invariant dynamics have been incorporated into a task-dynamic model of coordinative structures (ibid.:335).

In simpler terms, a task-dynamic gestural model means that speakers control movements in a coordinated manner so as to achieve an overall articulatory and/or acoustic target or goal. The model focuses on the achievement of the overall goal and allows for variation and flexibility in its actual realization.

The validity of the model has been successfully demonstrated at Haskins Laboratories in speech synthesis simulations. Researchers have input parameters and variables of the model into their software articulatory synthesizer. Because the organization and function of the system is result-oriented, it is able to accommodate real-time unexpected articulatory perturbations.

The model has been used to reproduce experimental data on compensatory articulation, whereby the speech system quickly and automatically reorganizes itself when faced with unexpected mechanical perturbations...[and it can] generate many important aspects of natural articulation (Saltzman and Munhall, 1989:340).

For example, if the task is to produce a bilabial closure, and the simulated jaw is perturbed or frozen in place during the closing gesture, the system is able to compensate for this unexpected change by adjusting other articulators as necessary with no explicit rules anticipating this state of affairs. Thus, while the final articulator configurations may not be typical of an unperturbed closure, the overall goal of a bilabial closure is accomplished.

Finally, the model’s ability to achieve speech goals, regardless of unforeseen perturbations and modifications, strongly supports its validity and evidences the long-standing scientific premise of cohesion within a system. In turn, "when slight variations in any one part occur..., other parts become modified" (Darwin, 1896, cited in Saltzman and Munhall, 1989:362). In Gestural Phonology this general notion of system cohesion is expressed as GESTURAL COHESION.

3.2.3 Gestural units and their organization

In this model the basic phonological unit is the GESTURE, rather than articulatory or acoustic

FEATURES. Gestures are described as having three qualities: they are abstract, discrete, and inherently spatio-temporal. In addition, they are dynamic linguistic units which comprise a task dynamic model of speech. In order to achieve the intended articulation, task variables are controlled. In Gestural

Phonology, these task variables are called the VOCAL TRACT variables, where a single tract variable is often associated with several articulators. Thus, the goal of a labial closure will be expressed through a tract variable of lip aperture rather than by individual specifications for the jaw, lower lip, and upper lip.

Table 3.3 demonstrates the relationship between tract variables and their associated articulators.

TRACT VARIABLE                              ARTICULATORS INVOLVED
LP    lip protrusion                        upper & lower lips, jaw
LA    lip aperture                          upper & lower lips, jaw
TTCL  tongue tip constrict location         tongue tip, body, jaw
TTCD  tongue tip constrict degree           tongue tip, body, jaw
TBCL  tongue body constrict location        tongue body, jaw
TBCD  tongue body constrict degree          tongue body, jaw
VEL   velic aperture                        velum
GLO   glottal aperture                      glottis

Table 3.3. Tract variables. (From Browman and Goldstein, 1989:344.)

The tract variables are arranged into horizontal-vertical pairs, "where both members of a pair refer to the same set of articulators" (Browman and Goldstein, 1989:343). Thus, we have the pair LP-LA, where LP (Lip Protrusion) represents the horizontal element and LA (Lip Aperture) represents the vertical element. It follows that within the dynamical equation for oral gestures, both constriction location (horizontal) and constriction degree (vertical) must be specified. In turn, the paired tract variables are associated with a single gesture. Table 3.4 provides a list of the paired tract variables and their associated gestures. The symbols in the left column denote the different gestures as they are indicated in the final gestural score.

SYMBOL  REFERENT                                             TRACT VARIABLE
i       palatal gesture (narrow)                             TBCD, TBCL
a       pharyngeal gesture (narrow)                          TBCD, TBCL
β       bilabial closing gesture                             LA, LP
τ       alveolar closing gesture                             TTCD, TTCL
σ       alveolar near-closing gesture (permits frication)    TTCD, TTCL
λ       alveolar lateral closing gesture                     TTCD, TTCL
κ       velar closing gesture                                TBCD, TBCL

Table 3.4. Gestural symbols (From Browman and Goldstein, 1989:344.)

The individual gestures are organized relatively within a larger structure, the gestural score. The gestural score is subdivided into three independent articulatory tiers: VELIC, ORAL, and GLOTTAL.

These three articulatory tiers are analogous to many of the organizations posited by phoneticians and autosegmental phonologists. Within this hierarchy, the ORAL TIER can be further divided into three tiers: TONGUE BODY, TONGUE TIP and LIPS which, for the most part, correspond to the traditional groupings of place of articulation into three major sets: dorsal, coronal, and labial. Figure 3.4 illustrates a hypothetical gestural score for the word “palm”, [pʰam].

Tier                    Gestures for [pʰam]

Velic:                  -μ        +μ
Oral:
   Tongue Body               a
   Tongue Tip
   Lips                 β         β
Glottal:                γ

Figure 3.4. Symbolic gestural score for hypothetical "palm." (From Browman and Goldstein, 1988:345.)

Figure 3.5 illustrates the gestures and their associated calculated articulatory trajectories for the four tract variables.

[Figure 3.5 plot: four stacked tract-variable tiers (velic aperture, tongue body, lip aperture, glottal aperture), each with a vertical axis running from open to close, plotted against time.]

Figure 3.5. Hypothetical gestural representation and associated trajectories for "palm." Note that closure is indicated by lowering. (Adapted from Browman and Goldstein, 1988:345.)

In Figure 3.5 rectangles on each tract variable tier indicate gestures. The quantal gestures

(rectangles) are identified with their continuous output through the curved lines—articulatory, or

movement, trajectories—in each tier: Constriction degree is indicated vertically through the height of the

rectangle and each associated “curve shows the changing size of a constriction over time, with a larger

opening being represented by a higher value, and a smaller opening (or zero opening, such as for closure)

being represented by lower values” (Browman and Goldstein, 1988:344). It is interesting to note that for

some tiers, here VELIC APERTURE and LIP APERTURE, the orientation of the trajectories indicates a picture that is inverted with respect to the vertical movement of the major articulator (velum, lower lip).

The horizontal dimension represents time; therefore, the rectangles’ length indicates the duration of a gesture. The initial and final bilabial closures of the word [pʰam] are expressed on the LIP

APERTURE tier as the two (rectangular) gestures. Voicing is indicated on the GLOTTAL APERTURE tier.

Here the single gesture indicates opening of the vocal cords for the production of voiceless [p]. The end of this gesture corresponds to vocal cord closure resulting in voiced [a] and [m]. Finally, note the two gestures on the VELIC APERTURE tier. The first rectangle represents the gesture that corresponds to the closed velum for non-nasal [p]. The second rectangle represents velum lowering for nasalized [m].

3.2.4 Gestures and their specified degrees of overlap

We must also clarify how the gestures themselves are temporally related to one another. Figure 3.5 clearly illustrates the temporal overlap between gestures. In specifying this overlap we make reference to “degrees” since “[t]iming between gestures is specified by coordinating a specific point in one gesture (such as onset, achievement of target, or release of target) with respect to a specific point in some other gesture” (Zsiga, 1997:231). The use of the unit “degrees” reflects this task-dynamic model’s grounding in physics: it is based on a critically damped mass-spring mathematical model. In applying the model to the articulators, the tongue's articulators are viewed as the mass located at the end of the theoretical spring.

In agreement with the basic mass-spring model, the motion of the mass (which is attached to the end of a spring) is characterized by a cyclical oscillatory pattern. Typical of such sinusoidal patterns, the motion is referenced in units of degrees where a complete cycle consists of 360 degrees. Because the

model is critically damped, the number of oscillations is limited to one. Within this single oscillation,

effective achievement of the target is said to occur before the entire cycle is completed; the target is said

to be reached at 240 degrees.
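The phase-angle bookkeeping described above can be illustrated numerically. The following is a minimal sketch and not the Haskins implementation: it assumes a critically damped mass-spring system released from rest, and it assumes that phase is measured against the cycle of the corresponding undamped oscillator (so that 360 degrees equals one undamped period). The function names and the lip-aperture values are hypothetical.

```python
import math


def remaining_fraction(phase_deg):
    """Fraction of the initial distance to the target still left at a given phase.

    Assumption: the articulator starts at rest and follows the closed-form
    solution of a critically damped mass-spring system,
        x(t) = target + (x0 - target) * (1 + w*t) * exp(-w*t),
    where w is the natural frequency. Phase is taken to be w*t, expressed in
    degrees of the undamped cycle (an assumption made for illustration).
    """
    phase_rad = math.radians(phase_deg)
    return (1.0 + phase_rad) * math.exp(-phase_rad)


def position(phase_deg, start, target):
    """Tract-variable value at a given phase, moving from start toward target."""
    return target + (start - target) * remaining_fraction(phase_deg)


if __name__ == "__main__":
    # Hypothetical lip-aperture gesture: 10 mm open, target 0 mm (closure).
    for phase in (0, 90, 180, 240, 360):
        print(phase, round(position(phase, start=10.0, target=0.0), 2))
```

Under these assumptions the mass has covered roughly 92 percent of the distance to its target at 240 degrees and never overshoots it, which is consistent with treating 240 degrees as the point of effective target achievement.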

When we describe the temporal relationship between gestures, relative phase angles are

referenced; i.e., the second gesture is said to start when the first gesture is a designated number of degrees

into its 360 degree cycle. "Gestures are phased with one another such that a particular phase angle in one

gesture corresponds to a particular phase angle in another gesture" (Byrd, 1994:7). Thus, because

gestures have “inherent extent in time” (Zsiga, 1997:77), inter-gestural timing is determined by the

gestures’ relative intrinsic dynamic state rather than timed "by an external clock" (Browman and

Goldstein, 1988:348). This quality of intrinsic timing sets the gestural model apart from most other

models of phonology, in which temporal segmental relationships are expressed through the external

sequencing of segments.
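As a minimal sketch of intrinsic timing, the onset of one gesture can be computed from a phase angle in another gesture rather than from an external clock. The function name, the 150 ms cycle duration, and the choice of phasing the second gesture to the first gesture's target are hypothetical and serve only as illustration.

```python
def onset_from_phase(onset_a_ms, cycle_a_ms, phase_deg):
    """Clock time at which gesture B starts, given a phase of gesture A.

    onset_a_ms: start time of gesture A (ms)
    cycle_a_ms: duration of gesture A's full 360-degree cycle (ms)
    phase_deg:  phase angle of gesture A to which B's onset is coordinated
    """
    return onset_a_ms + (phase_deg / 360.0) * cycle_a_ms


if __name__ == "__main__":
    # Hypothetical: gesture A starts at 0 ms with a 150 ms cycle.
    # B phased to A's onset (0 degrees) starts simultaneously with A;
    # B phased to A's target (240 degrees) starts later.
    print(onset_from_phase(0.0, 150.0, 0))
    print(onset_from_phase(0.0, 150.0, 240))
```

Because the onset of B is expressed in A's own phase, slowing A down (a longer cycle) delays B proportionally without any change to the phasing specification itself.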

According to literature on Gestural Phonology which emphasizes the phonological capabilities

of the model, the number of possible phase angles of gestural overlap is finite. "Browman and Goldstein

assume the synchronized phase angles to belong to a limited set of points in a gesture; specifically, the

onset (0°) and target (240°), and perhaps release (290°)" (Byrd, 1994:7). The phase angle is determined

according to factors such as placement of word boundary, the number of consonants in a cluster as well as

its well-formedness, and the relationship between the vocalic and consonantal tiers. The gestural score

specifies this explicit coordination of the gestures. Browman and Goldstein’s model, with its proposition

of finite degrees of overlap, therefore agrees to some extent with Lindblom’s (1963) hypothesis that

segments are associated with a single articulatory target.*

Depending on the constraints enforced on the allowable degrees of gestural overlap, the

articulatory-gestural framework can account for both phonological and phonetic data; for those studies

* Lindblom expresses his notion of invariant targets in the following: "A target was found to be independent of consonantal context and duration and can thus be looked upon as an invariant attribute of the vowel. Although a vowel phoneme can be realized in a more or less reduced fashion, the talker's 'intention' that underlies the pronunciation of the vowel is always the same, independent of contextual circumstances. A vowel target appears to represent some physiological invariance. The present data support the assumption that the control that the talker exercises over his speech organs in vowel articulation is associated with neural events that are in a one-to-one correspondence with linguistic categories" (1963:1778).

which focus more on phonetic qualities, a significant feature of the gestural model is that the relative

temporal relations of the gestures can be adjusted. In this manner the system can account for continuous

changes in speech production over time. For example, changes in acoustic output due to varying styles of

speech can be explained by different amounts of temporal overlap of the gestures—casual speech is

characterized by more gestural overlap (especially when the gestures share the same tier), while formal

speech is characterized by less overlap. For example, phenomena that traditional frameworks describe as

segment deletion, the articulatory phonology framework attributes to increased gestural overlap; e.g.,

when the sequence “must be” ['mʌst#bi] is produced in formal style all segments in the sequence are audible. In casual speech, however, the output is ['mʌsbi]. Traditional descriptions would label this change in output as deletion of the [t]. However, articulatory studies show that articulatory movement for

the alveolar closure remains. The [t] is, therefore, said to be hidden since the gesture for the bilabial closure [b] (on the LIP APERTURE tier) temporally overlaps, or hides, the gesture for the alveolar closure

[t] (on the TONGUE TIP tier). The notion of temporal gestural overlap has broad implications:

“Thus...examples of consonant assimilation and consonant deletion are all hypothesized to occur as a result of increasing gestural overlap between gestures” (Browman and Goldstein, 1989:361).
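The hiding account of “must be” can be sketched by treating each gesture as an interval on its tier and measuring how much of the alveolar closure is overlapped by the bilabial closure. This is an illustrative simplification only: the class, the function names, and all timing values are hypothetical, and real acoustic hiding depends on more than interval overlap.

```python
from dataclasses import dataclass


@dataclass
class Gesture:
    label: str
    tier: str
    start_ms: float
    end_ms: float


def overlap_ms(a, b):
    """Duration (ms) during which gestures a and b are both active."""
    return max(0.0, min(a.end_ms, b.end_ms) - max(a.start_ms, b.start_ms))


def hidden_fraction(hidden, hider):
    """Share of `hidden`'s duration that is overlapped by `hider`."""
    duration = hidden.end_ms - hidden.start_ms
    return overlap_ms(hidden, hider) / duration


if __name__ == "__main__":
    # Hypothetical timings; only the relative overlap matters here.
    t_closure = Gesture("t", "TONGUE TIP", 200.0, 260.0)
    b_formal = Gesture("b", "LIP APERTURE", 250.0, 320.0)  # starts late
    b_casual = Gesture("b", "LIP APERTURE", 200.0, 280.0)  # starts early
    print("formal:", hidden_fraction(t_closure, b_formal))
    print("casual:", hidden_fraction(t_closure, b_casual))
```

In both styles the alveolar gesture is present in the score with the same duration; only the amount of overlap changes, which is the sense in which the [t] is hidden rather than deleted.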

The propositions that (1) there is a set number of degrees of intergestural overlap (0°, 240° and 330°) and (2) the rate and style of speech result in different amounts of intergestural overlap seem contradictory. How can both claims be supported? First, we must keep in mind the focus of the study (either gestural phonology or gestural phonetics) and the source of the productions being described (natives or nonnatives). Second, recent research proposes that for native speakers (who have obviously acquired their language’s phonology) a set number of degrees of gestural overlap applies to smaller units of speech, e.g., within-segment and within-syllable. In fact, it is these tightly bound fixed degrees of overlapping “bundles of gestures” which result in the positing of a phonological segment: “...it is not the case that segmenthood causes stable timing, but rather that stable timing causes the quality of being a segment” (Byrd, 1994:161). Larger units of speech, however, e.g., across word boundaries, tend to exhibit variation in the degrees of overlap (Zsiga, 1994; Zsiga, 1997; Zsiga, personal communication, Jan. 1998).

Tightly bound bundles of gestures, with their limited degrees of overlap, are characteristic of smaller units of native speech. Can the same limited number of degrees of phase angles apply to

nonnative L2 productions and the L2 acquisition process? Or is it the case that when accounting for L2 acquisition and production, a more variable, less quantal articulatory model is needed? The next section presents a modified version of Browman and Goldstein’s traditional framework, one that, in place of discrete degrees of overlap, proposes allowable PHASE WINDOWS.

3.2.5 Modifying traditional gestural overlap for L2 acquisition: Phase Windows

Byrd (1994) rejects the proposition that the coordination of articulatory movements is governed by invariant degrees of overlap, stating that some variability in intergestural timing needs to be allowed for. To this end, she posits that “timing relationships are constrained LANGUAGE-SPECIFICALLY [emphasis mine - E.D.] to occur within permissible PHASE WINDOWS” (Byrd, 1994:ix). These phase windows have a continuous nature, ranging from narrow to wide, where narrower windows display a more confined range of allowed intergestural timing relations and wider windows allow a larger range of timings. If the actual degree of gestural overlap falls anywhere within the hypothetical window, the target is said to be achieved. The quality of production is not differentiated, i.e., all productions within the phase window are “equally good.” In this manner, even the phase window model has a certain categorical quality: if the overlap falls within the window’s range, the target is categorically achieved (with no reference to quality being made). However, if the overlap does not fall within the window’s range, the target is categorically not achieved. One obvious advantage of a phase window model is that it allows for some degree of “slop” in producing inter-gestural timings.* In general, gestural constellations which

* In reading about the proposition of an allowable range of inter-articulatory speech timings—all of which result in the perception of a single segment—I can't help but think that there is some relation to Stevens's (1989) discussion of the quantal nature of speech. While Byrd's phase window model and Stevens's quantal model address different specific aspects of speech production, nonetheless there does seem to be a common thread in their understanding and description of speech production. Stevens's quantal model states that, due to the physics of articulation and acoustic output within the vocal tract, imprecision in articulation does not always result in loss of information. For example, in producing the vowel [i], the raised tongue can approximate a constriction within a range on the x-axis, with all constriction locations within this range resulting in a perceived [i]. In a similar manner, Byrd's notion of phase windows states that if the relative timing between articulations (or gestures) falls within the allowed windows, then the target is achieved. For example, let’s consider gestures on the glottal and lip aperture tiers which result in the production of a [b]. On the glottal tier, there will be a gesture which results in the closing of the glottis, thereby producing voicing. At the same time, there will be a gesture on the lip aperture tier, which results in a bilabial closure. According to Byrd’s model, the degree of overlap between the glottal and lip aperture gestures is not a singular set number of degrees. Let's say that the lip closure is indicated as a gesture on the lip aperture tier. We could find that, based on empirical evidence, the glottal gesture usually begins at 90° into the lip aperture gesture. However, given

display narrow phasing windows are often characteristic of phonological segments while gestural

constellations having wider windows are often characteristic of inter-segmental overlaps.
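The categorical outcome of a phase window can be sketched as a simple range check. This is an illustration under stated assumptions, not Byrd's formalization: the class and method names are hypothetical, the window boundaries are treated as inclusive by assumption, and the 45 to 120 degree range is the purely illustrative one used in the footnote's glottal and lip-aperture example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseWindow:
    low_deg: float
    high_deg: float

    @property
    def width(self):
        """Size of the window in degrees; narrower means tighter timing."""
        return self.high_deg - self.low_deg

    def achieves_target(self, phase_deg):
        """Categorical outcome: any phase inside the window counts equally."""
        return self.low_deg <= phase_deg <= self.high_deg


if __name__ == "__main__":
    # Illustrative window: glottal onset relative to the lip-aperture gesture.
    voiced_stop = PhaseWindow(45.0, 120.0)
    for phase in (30.0, 45.0, 90.0, 120.0, 150.0):
        print(phase, voiced_stop.achieves_target(phase))
```

The check returns only true or false, so a production at 50 degrees and one at 110 degrees are treated as equally good, while the `width` property captures the narrow-versus-wide distinction between windows.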

For this study, an important feature of the window model is that intergestural timings are

constrained language-specifically (Byrd, 1994:ix), a proposition which suggests that, to acquire a native-like accent, learners must internalize not only new articulations (including active and passive constriction

locations, constriction duration and degree) but also their allowable language-specific relative timings.

The modified windows version of the articulatory model provides an excellent tool for

illustrating the L2 acquisition process. Native-speaker productions of a given segment would be

characterized by a gestural score having rather narrow windows of inter-gestural phasing. Nonnative-

speaker productions of the same segment might exhibit different characteristics. Nonnatives’ gestural

scores could exhibit either of the following features:

the phase window model, the relative gestural timings are not fixed. It might be possible that the glottal gesture could begin anywhere from 45° to 120° relative to the onset of the lip aperture gesture. Glottal gestures that begin anywhere within this range of 45°-120° would, nonetheless, result in a segment which is perceived by native speakers to be a "voiced bilabial stop." Thus we see that while actual productions might display continuous characteristics, their perception is categorical.

1) if the target L2 segment is not associated with a similar L1 phone, learners might not be certain how to correctly articulate the L2 segment and, therefore, produce it with an excessively wide window. The wide window would result in productions which natives perceive as inconsistent, ranging anywhere from “heavily accented” to “understandable yet weakly accented.”

2) if the target L2 segment is similar to an existing L1 phone and is, therefore, mapped onto a similar L1 segment (see Best and Strange, 1992, for discussion of phoneme mapping and their Perceptual Assimilation Model), productions of the L2 phone could be characterized by a narrow yet incorrect phasing window. Natives would consistently perceive learners’ resulting productions as “accented.”

It is hypothesized that, with time and extended exposure to L2, learners gradually acquire the articulatory

timing and coordination for these single segments. “As the learner has more experience with L2, he

changes from mapping L2 phones directly on to L1 phones to creating a separate L2 phoneme category”

(Best and Strange, 1992:89).

The two examples discussed above refer to INTRASEGMENTAL TIMINGS where allowable

phasings tend to be more limited. Let us further extend the model to native and nonnative productions of

palatalized segments. Native productions of single palatalized segments will be characterized by primary

and secondary articulations having a narrow window of possible overlap. Nonnative productions, on the

other hand, exhibit a different pattern. Here the palatalized segment is mapped onto and, therefore,

realized as a sequence of two phonemes, /C/ + /j/. Because these two segments comprise a “larger unit” than a single segment, the phasing windows will be wider, resulting in a production that native Russian speakers perceive as accented. However, with time and continued effort, learners gradually acquire the mono-segmental quality of Russian palatalized consonants; in the process, learners also acquire language-specific timings of Russian palatalized consonants. Learners begin to produce the palatalized consonants with more simultaneous articulations which are associated with a narrower range of phasing windows.

3.3 Summary

Acquiring the phonetics of L2 is a very complex process for adult L2 learners. They must retrain the manipulation of their articulatory organs so as to produce not only new static articulations but also different (and perhaps even more elusive) coordinations and timings. This chapter has demonstrated

that, in general, traditional phonological models (both linear and non-linear) are not well suited for illustrating these more phonetic qualities of L2 acquisition. Specifically, because the phonological unit of traditional models, the feature, cannot express temporal information, qualities of timing are indicated only through the sequential ordering of abstract root nodes. A Gestural Phonology model is better suited for the purposes of this study.

In choosing articulatory phonology as the optimal descriptive framework, I found Zsiga's (1997) thoughts on Gestural Phonology and Gestural Phonetics to be especially relevant and helpful. In her investigation of Igbo vowel harmony and vowel assimilation, Zsiga proposes that the choice of a descriptive framework should depend on the nature of the phenomenon being investigated: either categorical changes or gradient changes. For those changes which are categorical (such as Igbo vowel harmony), feature-based non-linear descriptions are ideal. In contrast, changes which are gradient (such as Igbo vowel assimilation) are best accounted for in the Gestural Model, where the phonological unit, the

GESTURE, has “inherent quantitative specifications, most importantly, exact temporal relations” (Zsiga,

1997:229). And while Pierrehumbert (1990) argues that “phonological representations must be qualitative and symbolic while phonetic representation must be quantitative and physical, and that an account of the mapping between them remains elusive” (Zsiga, 1997:228), a major advantage of articulatory phonology is that since phonological and phonetic representation are the same, no interface between phonology and phonetics is required.

Finally, of particular importance to this study is the notion that Browman and Goldstein’s articulatory model can be incorporated as either a Gestural Phonology or Gestural Phonetics Model

(Zsiga, personal communication, Jan. 1998). In choosing the Gestural Model, I make no specific claims about how L2 learners’ underlying L2 phonology changes but, instead, focus only on gradient phonetic processes.

Moreover, if we adopt a PHASE WINDOWS approach (as opposed to invariant degrees of overlap) to intergestural overlap, AE learners’ accented production and gradual phonetic acquisition of C-V sequences containing the palatalized consonants of Russian can be well accounted for.

The next chapter presents an acoustic production study. It will become evident that the gestural framework describes the production study results in a straightforward manner.

CHAPTER 4

ARTICULATORY IMPLICATIONS OF AN ACOUSTIC PRODUCTION STUDY OF RUSSIAN PALATALIZED SEQUENCES: THE GESTURAL MODEL AND INTERPHONETICS

4.1 Introduction

Cross-language research has established that certain speech sounds present greater difficulty for adult nonnative learners than other L2 sounds (Polka, 1992). For adult American learners of Russian, the palatalized consonants of Russian are especially troublesome. While the fact that our learners experience protracted difficulties with the soft consonants is well-established among teachers of Russian, there are no acoustic studies that investigate dynamic properties of learners’ accented productions. Existing phonetics research on the Russian palatalized consonants is limited to acoustic features of native Russian speech, particularly static properties of the raised second formant (F2) frequency associated with the superimposed i-like articulation. A need exists for acoustic-phonetic studies that compare and contrast, in light of L2 acquisition theory, native and nonnative productions of the palatalized consonants of Russian.

The study described in this chapter lays the foundation for potentially numerous studies in the field of L1-L2 interface of phonetics and phonology, as well as SLA theory.

Cross-language research has much to offer linguistics research of L2 acquisition theory and the L1-L2 interface. By comparing and contrasting L1 and L2 productions of specific target sounds, we define important acoustic properties of “correct” native and “incorrect” accented productions. This chapter presents the results of an acoustic production study of the static and dynamic properties of the three Russian palatalized C-V sequences (/CʲV/, /CʲjV/ and /CʲijV/) as produced by both native Russians and learners of Russian. In particular, this study endeavors to better define: (1) distinguishing properties of inter-articulatory timing in Russians’ productions of these palatalized sequences; (2) given nonnatives’ accented productions of the palatalized consonants as /C/+/j/, the extent to which learners distinguish Russian /CʲV/ and /CʲjV/ sequences; (3) to what degree native Russian speakers produce /CʲV/ and /CʲjV/ differently; (4) distinguishing dynamic properties (e.g., closure duration and associated constriction degree) for palatalization, the palatal segment /j/ and the front vowel /i/. Statistical analysis of the data will reveal the significantly reliable acoustic differences between natives’ and learners’ productions. Finally, the acoustic results will be associated with their source articulations through the Gestural Phonology framework in Gestural models for native and nonnative productions of the three C-V sequences. In this manner the important dynamic differences between native and accented speech will be highlighted.

4.2 Methods

4.2.1 Subjects

All subjects were recruited at The Ohio State University on a voluntary basis; no payment was given for their participation. Sixteen (8 female, 8 male) native speakers of Russian (henceforth, “Russians”) and seventeen (11 female, 6 male) native American-English speaking university-level students of Russian (henceforth, “AE learners,” or “learners”) participated. The following abbreviations will be used to refer to the four subject groups:

Abbreviation    Subject group

FR              female native-Russian speakers
MR              male native-Russian speakers
FE              female native-American-English speakers learning Russian
ME              male native-American-English speakers learning Russian

Russians. All Russian subjects have spoken Russian since birth; Russian is the native language of all their parents. The majority of Russian subjects grew up in or St. Petersburg (Leningrad), although a few grew up in Ukraine or Central Russia. At the time of this study, most were in their twenties or early thirties, with only two subjects being older (one 40 years old, the other 58 years old).

They had been in the United States anywhere from six months to ten years. When rating their own level of Russian fluency, 13 of the 16 indicated no change (4 on a scale of 1-4) in their fluency since leaving

Russia. Three subjects indicated that, while still fluent, there was a slight decrease (3 on a scale of 1-4)

in their language.

Americans. All AE subjects began their study of Russian after the age of 12. While a few had

taken Russian in high school, the majority began their study of Russian in college. One subject was in

his/her forties; all other subjects were in their twenties or early thirties. All six male subjects were

graduate students who had formally studied Russian in the United States for from four to six years and

had lived from two to 16 months in Russia. Of the eleven female subjects, three were undergraduates; eight were graduate students. The undergraduate females had studied Russian formally in the United

States for two to four years and had never traveled to Russia. The graduate females had formally studied Russian in the United States for four to seven years; all had lived in Russia for five to twenty-six months.

Rating their own pronunciation on a scale from 1 (weak, heavy accent) to 6 (fluent, no accent), the undergraduates rated themselves at either a 2 or 3. They, therefore, saw themselves as having considerable difficulties with Russian phonetics. All but one graduate student gave himself/herself a 4 or

5, with one graduate student assessing his/her pronunciation as a 3. The graduate students, therefore, saw themselves as having attained a solid level of phonetic accuracy.

After preliminary analysis of the data and in view of their limited exposure to the target language, it was decided that the data from the three female undergraduates would be excluded from all reported statistical analyses. Therefore, only data for male and female graduate student learners will be further referenced. Figure 4.1 illustrates the graduate-student learners’ number of years of study both in the U.S.A. and in Russia; data for both female and male subjects has been combined and sorted first in order of increasing duration of study in the United States, and second in order of their duration of study in

Russia.

[Figure 4.1: bar chart titled “AE Learners' Study of Russian”; legend: In Russia, In U.S.A.; x-axis: Subject, in the order 3ME, 8FE, 4ME, 5ME, 2ME, 5FE, 7FE, 2FE, 6FE, 1FE, 3FE, 6ME, 1ME, 4FE]

Figure 4.1. Female and male AE learners’ number of years of study of Russian both in the United States and in Russia. Female and male data are combined and presented in increasing order of duration of study in the United States followed by duration of study in Russia.

4.2.2 Materials: Determining C, V and stress placement

Many factors had to be taken into consideration when composing the word list. The goal was to record contrasting /CV/ - /CʲV/ - /CʲjV/ - /CʲijV/ sequences for given C and V. Because some of the target C-V sequences are rare or nonexistent, merely engaging subjects in conversation would not guarantee production of the desired forms. It was, therefore, decided that subjects would read some form of written list.

Because subjects’ proficiency ranged greatly (from second-year students to native speakers), the list would have to accommodate all levels. Less-experienced learners might have difficulty producing sentences containing unfamiliar items. It was decided that instead of sentences containing the desired

words, subjects would therefore read a word list containing target C-V sequences. Even with the rather simplified form of a word list, learners might still experience difficulty producing the words; therefore, learners were given a “second chance” to produce the target C-V sequences by repeating them immediately after the carrier word, e.g., сова-ва [sʌ'va-'va], ревя-вя [rʲi'vʲa-'vʲa]. In contrast, it was hypothesized that native Russian speakers would produce isolated words with ease, while a list of only the target C-V sequences would be unnatural. Thus, native Russian speakers were prompted to produce the target C-V sequences more naturally by first producing the carrier word containing the target

C-V sequences, immediately followed by the isolated C-V target sequences.

The next step was to determine which particular consonants and vowels would be included in the study. All consonants which are indisputably paired for palatalization were initially considered: /b-bʲ/, /p-pʲ/, /v-vʲ/, /f-fʲ/, /d-dʲ/, /t-tʲ/, /z-zʲ/, /s-sʲ/, /l-lʲ/, /r-rʲ/, /n-nʲ/, /m-mʲ/. Since one of the acoustic measurements would be made on an acoustic formant at the C-V transition point, nonnatives’ aspiration of voiceless stops was an important consideration. Because Americans aspirate word-initial voiceless stops and since “formant transitions after voiceless aspirated stops take place during the period of aspiration and are therefore not as apparent on the spectrogram” (Ladefoged, 1993:201), the paired voiceless stops /p-pʲ/ and /t-tʲ/ were eliminated. The remaining voiceless fricatives /f-fʲ/ and /s-sʲ/ were also eliminated. The word list, therefore, was created from the remaining 14 paired consonants:

Consonants in word list       Consonant type

/b-bʲ/, /m-mʲ/, /v-vʲ/        Labial
/d-

Of the five Russian vowel phonemes, the three point vowels were chosen:

Vowels in word list

/i/, /a/, /u/

The choice to use a word list is further supported by the observation of “students’ native-like pronunciation in word lists, but NL [native language] transfer in conversation” (Major, 1994:190). Carlisle (1994) also notes the positive aspects of using a word list as the “highest frequency of the target variant occurred in the reading of word lists, the next highest in the reading of a text, and the least frequent in free speaking” (224).

The four chosen target C-V sound combinations were /CV/, /CʲV/, /CʲjV/, and /CʲijV/. The inclusion of the first three combinations (/CV/, /CʲV/, /CʲjV/), already discussed in chapter 2, concerns issues of L2 acquisition as well as native and accented productions. The final combination, /CʲijV/, was included to investigate differences between high front constrictions which result in either the palatal glide /j/, the vowel /i/, or secondary palatalization.

For a given consonant pair and the three designated vowels, 12 C-V combinations are possible.

For example, for the paired voiced bilabial stops, /b-bJ/, the twelve C-V combinations investigated are:

The four C-V environments

C        V             CV          CʲV          CʲjV          CʲijV

/b-bʲ/   /i/ (ы, и)    /bi/ бы     /bʲi/ би     /bʲji/ бьи    /bʲiji/ бии
         /a/ (а, я)    /ba/ ба     /bʲa/ бя     /bʲja/ бья    /bʲija/ бия
         /u/ (у, ю)    /bu/ бу     /bʲu/ бю     /bʲju/ бью    /bʲiju/ бию
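The count of twelve combinations per consonant pair can be checked with a minimal sketch that crosses the three vowels with the four C-V environments. The names `TEMPLATES`, `VOWELS`, and `cv_combinations` are hypothetical, and the templates are only a notational convenience for generating the phonemic strings.

```python
from itertools import product

# One template per C-V environment: /CV/, /CʲV/, /CʲjV/, /CʲijV/.
TEMPLATES = ("{c}{v}", "{c}ʲ{v}", "{c}ʲj{v}", "{c}ʲij{v}")
VOWELS = ("i", "a", "u")


def cv_combinations(consonant: str) -> list:
    """Return every vowel-by-environment combination for one consonant."""
    return [t.format(c=consonant, v=v) for v, t in product(VOWELS, TEMPLATES)]


combos = cv_combinations("b")
print(len(combos))   # 12
print(combos[:4])    # ['bi', 'bʲi', 'bʲji', 'bʲiji']
```

By the same arithmetic, seven consonant pairs crossed with three vowels give twenty-one sequences per environment.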

In order to ensure that the fullest /j/ segment would be produced in /CʲjV/ and /CʲijV/ sequences, the vowel following /j/ must be stressed.

The consonant j appears in two forms: under stress, with a clearly expressed articulation, which is indicated in transcription as [j], for example as in края́ [krʌ'ja]; in unstressed syllables the consonant j is characterized by a weakened articulation, with a lesser degree of tongue raising, which is indicated in transcription as [i], for example кра́я [kr'aia]. In Russian, the consonant j is used in contexts with both vowels and consonants (Bryzgunova, 1963:125).³²

32 Согласный й выступает в двух разновидностях: под ударением – с отчетливо выраженной артикуляцией, в транскрипции обозначаемой через [j], например края́ [krʌ'ja]; в безударных слогах согласный й характеризуется ослабленной артикуляцией, меньшим подъемом языка и в транскрипции обозначается через [i], например кра́я [kr'aia]. Согласный й в русском языке употребляется в сочетании с гласными и согласными (Bryzgunova, 1963:125). [Bryzgunova’s Russian phonetic transcriptions are rendered here in standard IPA. - E.D.]

Finally, for purposes of consistency and ease of measurement, the C-V sequences were always in word-final position with syllable-final stress. The next task was to determine words containing the /CV/ - /CʲV/ - /CʲjV/ - /CʲijV/ sequences.

4.2.3 Speech sample: Establishing contrasting word groups

Contrasting Russian /CV/ and /CʲV/ sequences in word-final position with syllable-final stress are quite common, e.g.,

General C-V    Specific C-V
sequence       sequence        Russian word    Transcription    Gloss

CV#            [ru#]           беру            [bʲi'ru]         ‘I take’
CʲV#           [rʲu#]          борю            [ba'rʲu]         ‘I conquer’

CV#            [la#]           скала           [skʌ'la]         ‘cliff, crag’
CʲV#           [lʲa#]          земля           [zʲim'lʲa]       ‘earth’

While quite a few Russian words have /CʲjV/ sequences in word-final position with syllable-final stress, for some of the specific C-V combinations, no such words exist. As a result, three of the twenty-one CʲjV sequences in the word-list are in nonce words. Quite a few Russian words have word-final /CʲijV/ sequences; the majority are borrowings such as астрономия [astrʌ'nomʲija] ‘astronomy’, биология [bʲiʌ'logʲija] ‘biology’, физиология [fʲizʲiʌ'logʲija] ‘physiology’. Such borrowings do not, however, have syllable-final stress. In fact, few Russian words contain the /CʲijV/ sequences in word-final position with syllable-final stress. Therefore, the carrier word is a nonce word in seven of the twenty-one instances for /CʲijV/ sequences.

A few Russian words which do comply with the word-final and stress-final requirements are житие [ʒitʲi'je] ‘life, biography’, бытие [bitʲi'je] ‘being, existence’, острие [astrʲi'jo] ‘point, spike’, судия [sudʲi'ja] ‘arch., judge’.

General C-V    Specific C-V
sequence       sequence        Russian word    Transcription    Gloss

CʲjV#          [vʲja#]         сыновья         [sinʌ'vʲja]      ‘sons’
CʲijV#         [vʲija#]        швея            [ʃvʲi'ja]        ‘seamstress’

CʲjV#          [mʲju#]         семью           [sʲi'mʲju]       ‘family’
CʲijV#         [mʲiju#]        змею            [zmʲi'ju]        ‘snake’

CʲjV#          [bʲja#]         воробья         [varʌ'bʲja]      ‘sparrow’
CʲijV#         [bʲija#]        *стабия         [stabʲi'ja]      -

CʲjV#          [zʲju#]         *лазью          [lʌ'zʲju]        -
CʲijV#         [zʲiju#]        *стазию         [stazʲi'ju]      -

See Appendix A for an ordered list of all specific C-V sequences, their carrier words, phonetic

transcriptions and definitions. Finally, the carrier words were randomized. See Appendix B for a copy of

the randomized word list presented to subjects.

4.2.4 Procedures

Recordings were made in a sound-attenuated chamber in the Department of Linguistics at The

Ohio State University using a Marantz PMD 432 tape recorder equipped with a head-mounted

microphone (Shure SM10).

The purpose of the study was not discussed with participants. Subjects were first given the two-page randomized word list and instructed to read through it silently. They were then fitted with the head-mounted microphone. While subjects read several words from the list, the input volume was set at an appropriate level. Subjects were then given the following instructions:

(1) Read through the word list at a comfortable and natural speaking-rate and volume, producing the sounds as “naturally” as possible (i.e., try not to hyper-articulate).

(2) While a few of the words in the list are not real, the majority of them are valid Russian words.

(3) For all words and repeated sequences, the final syllable is stressed.

(4) If, while reading the list, you feel uncomfortable or dissatisfied with your production, feel free to go back and repeat the item as many times as you wish, until satisfied with the production.

Subjects then read through the list aloud. Having read through it once, subjects were asked to

read through the word list a second time at either a slightly faster or slower pace, depending on the speed

of the original reading. Subjects then filled out an information sheet. The entire recording session lasted

approximately forty-five minutes.

4.2.5 Data analysis

4.2.5.1 Spectrograph procedures and settings

Subjects had read the word list two times at different speeds. I listened to and compared both readings, choosing the more natural-sounding one. All acoustic data were then taken from this one

reading. Subjects’ analog recordings were digitized in 45 second increments and spectrographically

analyzed with the Kay Elemetrics Computerized Speech Lab (Model 4300B). Samples were digitized at

10,000 samples/second and 16 bits/sample. Vowel formant frequencies were automatically tracked using autocorrelation-LPC analysis with preemphasis and LPC coefficients.
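The digitization settings above imply some simple figures, worked out in the sketch below. It assumes single-channel, uncompressed samples, which the text does not state; the variable names are hypothetical. The Nyquist limit of half the sampling rate is a general property of sampled signals, not a setting reported in this study.

```python
SAMPLE_RATE_HZ = 10_000      # samples per second, from the text
BITS_PER_SAMPLE = 16         # from the text
SEGMENT_SECONDS = 45         # length of each digitized increment, from the text

# Number of samples in one 45-second increment.
samples_per_segment = SAMPLE_RATE_HZ * SEGMENT_SECONDS

# Storage for one increment, assuming mono and no compression.
bytes_per_segment = samples_per_segment * BITS_PER_SAMPLE // 8

# Highest frequency representable at this sampling rate.
nyquist_hz = SAMPLE_RATE_HZ / 2

print(samples_per_segment)   # 450000
print(bytes_per_segment)     # 900000
print(nyquist_hz)            # 5000.0
```

A 5,000 Hz upper limit leaves room for the first three formants, which the spectrograms in this chapter display on a frequency axis running to 4,000 Hz.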

Recall that the speech sample was composed of a carrier word and then a repetition of the target

C-V sequence. All acoustic measurements were made on the final repeated C-V sequence, i.e., the -ря [rʲa] in моря-ря [mʌ'rʲa-rʲa]. In order to verify that productions of both the carrier word and repeated target sequence had similar acoustic properties, I conducted a preliminary analysis on the productions of one female Russian speaker. Comparing F2 duration and frequency measurements from both the word and isolated productions, I found no difference in the measurements from the two contexts, which supports the decision to use isolated productions. I then made a spectrogram of each isolated C-V sequence (analysis filter bandwidth was 300 Hz) and took F2 frequency and duration measurements. The method of measurement, with its focus on certain F2 frequency and F2 duration values within the entire F2 trajectory, was based on Weismer et al. (1992) as discussed in the next section.

4.2.5.2 Discussion of Weismer et al. (1992)

Weismer et al. (1992) presents results of an acoustic-production intelligibility rating experiment

with “normal” male subjects and male subjects afflicted with amyotrophic lateral sclerosis (ALS), a

degenerative neuromuscular disease that is typically associated with dysarthria. The purpose of their

experiment was to investigate acoustic formant trajectory data as an indicator of speech production

deficit. They measured frequency and duration values of the first and second formant bands (F1 and F2,

respectively). The extent of dysarthria is reflected in the abnormal nature of the formant trajectories;

abnormal trajectories of the afflicted males were then associated with intelligibility deficits.

Several aspects of the Weismer, et al. experiment are applicable to the present production study.

First, their experiment compares productions of “normal” and “imperfect” (those with ALS) speech. In a

similar manner, I compare the speech of “normal” speakers (native Russian speakers) with learners'

“imperfect” or “accented” productions. Second, Weismer et al. focuses on formant trajectory

measurements within the vocalic nucleus, specifically transition slope and steady-state durations.

Similarly, measurements of F2 frequency values at transition points and steady-state durations of palatal

constrictions provide the data for this experiment. Finally, by associating formant trajectories with their

source articulations, Weismer et al. suggest that the relative deviations of ALS subjects’ formant

trajectories may serve as an indication of the severity of speech mechanism dysfunction: “Our previous

work suggests that the study of formant trajectories leads to novel insights and focused physiological

hypotheses concerning articulatory behavior” (Weismer et al., 1992:1097). Similarly, results of acoustic

measurements for both native and nonnative Russian speakers will be used to associate acoustic data with

their source articulations via a modified version of Browman and Goldstein’s Gestural Model (Browman

and Goldstein, 1989), thereby illustrating dynamic differences between native and nonnative productions.

4.2.5.3 Sample spectrograms

What information about palatalization can we see in spectrograms of Russian speech?

Spectrograms are a three-dimensional display of speech in which the x-axis represents time and the y-axis represents frequency, measured in Hertz. Darkness of the bands indicates acoustic intensity. The vocal tract is described as a source-filter model. Therefore, depending on the shape of the oral cavity and place

of constriction, certain frequencies are enhanced, appearing as dark bands in the spectrogram. These dark

bands are called formants.

Because they consistently indicate lingual articulatory patterns, the first two formants are of

particular interest for vowels and vowel-like articulations. The third and fourth formants, however, are

not directly correlated with a consistent articulatory pattern across a group of segments. In fact, it is often the case that the third and fourth formants display specific acoustic patterns for different segments.

Therefore, only the first and second formant frequencies will be discussed.

The first formant frequency (F1) is negatively correlated with vowel height. For example, the low vowel [a] has a relatively higher F1 of approximately 700 Hz, while the high vowels [i] and [u] have relatively lower F1 values of approximately 300 Hz. The second formant frequency (F2) is positively correlated with vowel frontness. That is, a higher F2 indicates that the tongue is in a front position (e.g., the F2 of the vowel [i] is approximately 2300 Hz) while a lower F2 indicates that the tongue is in a back position (e.g., the F2 of the vowel [u] is approximately 850 Hz). Since the constrictions that produce palatals, the front vowel /i/, and secondary palatalization are articulated in the front portion of the oral cavity, these sounds are associated with a high F2. Figure 4.2 gives example spectrograms of non-palatalized (/CV/) and simple-palatalized (/CʲV/) sequences as produced by a male native-Russian speaker, where the specific vowel is [u] and the consonant is the non-palatalized or palatalized voiced bilabial stop, [b] or [bʲ], respectively.
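The two correlations can be restated as a small sketch. It is a deliberate simplification: the cut points are hypothetical midpoints between the approximate formant values quoted above, and a two-way split of each dimension is an assumption made only for illustration. The function and constant names are likewise hypothetical.

```python
# Hypothetical cut points: midpoints between the approximate values in the
# text (F1: 300 vs. 700 Hz; F2: 850 vs. 2300 Hz).
F1_CUT_HZ = (300 + 700) / 2      # 500.0
F2_CUT_HZ = (850 + 2300) / 2     # 1575.0


def height_from_f1(f1_hz: float) -> str:
    """Lower F1 corresponds to a higher vowel (negative correlation)."""
    return "high" if f1_hz < F1_CUT_HZ else "low"


def frontness_from_f2(f2_hz: float) -> str:
    """Higher F2 corresponds to a fronter tongue position (positive correlation)."""
    return "front" if f2_hz > F2_CUT_HZ else "back"


print(height_from_f1(700), height_from_f1(300))          # low high
print(frontness_from_f2(2300), frontness_from_f2(850))   # front back
```

On this simplified reading, an F2 near 2,000 Hz at the C-V transition falls on the front side of the cut point, in line with the high front constriction of palatalization.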

[Figure 4.2: two spectrograms, upper and lower; y-axis: frequency in Hz (ticks at 1,000, 2,000, 3,000, 4,000); x-axis: time; labeled features: vowel transition onset, first formant (F1), second formant (F2), third formant (F3)]

Figure 4.2. Spectrograms illustrating contrasting patterns in F2 for non-palatalized and palatalized consonants. The upper spectrogram is for the non-palatalized sequence /bu/; the lower spectrogram is for the palatalized sequence /bʲu/. The x-axis indicates time while the y-axis gives frequency (in Hz).

IPA transcriptions of the C-V sequences are given below each spectrogram. The spectrograms are aligned according to vowel onset (which for these two sequences coincides with stop release), indicated by the vertical line. Therefore, the stop closures [b] or [bʲ] are associated with the space to the left of the vertical line while the vowel /u/ is visible in the space to the right of the vertical line. Immediately after stop release the non-palatalized sequence [bu] has a prominent dark band centered at approximately 800

Hz. This is the second formant frequency (F2).

Recall that a key acoustic quality of Russian palatalized consonants is a raised F2 of approximately 2,000 Hz at the C-V transition. Note, however, that the palatalized nature of the

consonant is visible at consonant release and not in the stop closure portion, which is to the left of the vertical line marking the instant of stop release. As the palatalizing constriction is released, the F2 goes down in frequency, resulting in a negative slope which eventually levels off when the following vowel articulation is reached. Derkach et al. (1970:18) have shown that this transition carries the essential information on the hard-soft distinction within a word.

Because this study investigates the degree and duration of high front closures in the four C-V sequences, I took measurements of the second formant frequency. In each spectrogram of the target C-V sequences I made as many as five F2 measurements: three F2 frequency values and two F2 duration values. See Figure 4.3, which gives a sample spectrogram of the palatalized-yod sequence as produced by a male native-Russian speaker. The three F2 frequencies were measured at consonant release, C-V transition onset and vowel steady-state onset. Duration of the F2 steady-state associated with a palatal constriction (from consonant release to transition onset) and duration of the F2 transition (from transition onset to vowel steady-state onset) were also measured.
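The five measurements follow directly from three landmarks on the F2 track, as the sketch below shows. The class `F2Landmarks`, its field names, and the token values are all hypothetical; the numbers are invented and are not measurements from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class F2Landmarks:
    """Times (ms) and F2 values (Hz) at the three measurement points."""
    t_release: int
    f2_release: int
    t_transition_onset: int
    f2_transition_onset: int
    t_vowel_steady: int
    f2_vowel_steady: int

    def five_measures(self) -> dict:
        """Three F2 frequencies plus the two durations between landmarks."""
        return {
            "f2_at_release_hz": self.f2_release,
            "f2_at_transition_onset_hz": self.f2_transition_onset,
            "f2_at_vowel_steady_hz": self.f2_vowel_steady,
            # From consonant release to transition onset.
            "steady_state_ms": self.t_transition_onset - self.t_release,
            # From transition onset to vowel steady-state onset.
            "transition_ms": self.t_vowel_steady - self.t_transition_onset,
        }


# Invented token for illustration only.
token = F2Landmarks(0, 2100, 60, 2050, 130, 1200)
measures = token.five_measures()
print(measures["steady_state_ms"], measures["transition_ms"])  # 60 70
```

When consonant release and transition onset coincide, the steady-state duration in this sketch is simply zero.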

[Figure 4.3: spectrogram; y-axis: frequency in Hz; x-axis: time; labeled points: F2 frequency at consonant release, F2 frequency at transition onset, F2 frequency at onset of vowel steady-state; labeled intervals: F2 steady-state duration, F2 transition duration]

Figure 4.3. Sample spectrogram of [bʲju] illustrating the five F2 measurements.

Finally, Figure 4.4 provides spectrograms of four contrasting sample sequences, [bu] - [bʲu] - [bʲju] - [bʲiju]. The speaker is a male native-Russian speaker.

[Figure 4.4: four spectrograms (a-d); y-axis: frequency in Hz; x-axis: time; the vertical line marks vowel transition onset]

Figure 4.4. Spectrograms for the four contrasting sequences [bu] - [bʲu] - [bʲju] - [bʲiju]. Time is indicated on the x-axis. Frequency (in Hz) is indicated on the y-axis.

These four spectrograms are temporally aligned at the onset of the C-V transition (or, vowel onset), which is indicated by the solid vertical line. Therefore, for both the non-palatalized [bu] and simple-palatalized [bʲu] sequences, consonant release and transition onset occur simultaneously, aligned with the vertical line. However, the palatalized-yod [bʲju] and palatalized-i-yod [bʲiju] sequences indicate closure release which is prior to the vertical line while transition onset is aligned with the vertical line. IPA transcriptions of the sequences are given below each spectrogram. As in the previous spectrogram, time is represented on the horizontal axis and frequency (in Hz) on the vertical axis. The white line visible in the F2 band is from the automatic formant tracking algorithm.

Within each formant band, for each point in time, formant tracking indicates the frequency of the peak amplitude in the formant band. The formant tracking feature was especially helpful when determining critical points within the spectrogram, specifically, the C-V transition onset and the vowel steady-state onset, which appear as sharp bends, or “elbows” in the tracking line.

The spectrograms in Figure 4.4 clearly illustrate the raised F2 associated with the high front vowel /i/, the palatal glide /j/ and palatalized consonants, as well as the different F2 steady-states (visible after consonant release and prior to transition onset) of the four sequence types. A few general statements about the spectrograms for each of the C-V sequences can be made.

Figure 4.4a - /CV/. For the non-palatalized sequence, there is no F2 transition state at the beginning of the vowel directly after closure release. This pattern is characteristic of non-palatalized

/CV/ sequences. (Of course, depending on the consonant’s place of articulation and the specific vowel, the F2 at C-V transition can display some degree of negative sloping.) The F2 at consonant release is centered at approximately 800 Hz and is relatively stable, with time decreasing to approximately 500 Hz.

Figure 4.4b - /CʲV/. In the simple-palatalized sequence, [bʲu], the C-V transition onset displays the characteristic F2 frequency of approximately 2,000 Hz. As the tongue pulls away from the palatalizing constriction, F2 lowers, leveling off at approximately 1,200 Hz. Note that even after consonant release, a short F2-steady-state is visible, indicating that the palatal constriction is maintained for a brief period of time after release of the primary closure.

Figure 4.4c - /CʲjV/. The yod segment is quite visible in the spectrogram for [bʲju]. After

consonant closure release, there is an obvious F2 steady-state duration at approximately 2,100 Hz. to the

left of the line indicating transition onset. This band appears to be rather faint because it has lower

acoustic intensity than the following vowel [u]. In many of the native Russian speakers’ spectrograms for

/CʲjV/ sequences, a highly fricated, noisy F2 characterized the [j] portion of the spectrogram.

Figure 4.4d - /CʲijV/. With the addition of the high front vowel [i], the F2-steady-state duration

following consonant release becomes even longer. The F2 band associated with /i/ is darker than the

band for /j/ in Figure 4.4c, indicating the vowel’s greater acoustic intensity. Note that the F2 band for the

immediately following glide segment is not as dark, indicating the glide’s less resonant qualities.

4.3 Results

Data from the acoustic measurements were initially stored as a DOS file. The data were then imported into a spreadsheet program (Microsoft Excel), where they were sorted according to C-V sequence and subject. At this point, all five data points (the three F2 frequency measurements and two F2 duration measurements) were still under consideration. The data were so numerous that it was difficult to see general patterns in the different groups’ acoustic outputs. The spreadsheet was, therefore, imported into a statistics software program (Systat). Means were calculated. Graphical analysis was then accomplished by plotting in superimposition the F2 trajectories across all subjects within a specific language and gender group. In this manner differing acoustic patterns between the Russians and learners (between-language patterns) would become apparent. Similarities among the Russians and among the learners (within-language patterns) would also be illustrated.
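The sorting-and-averaging step can be illustrated with a minimal Python sketch. It is not the procedure actually carried out in Excel and Systat; the group codes, the apostrophe notation for the sequences, and every numeric value below are invented for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (group, C-V sequence, F2 at transition onset in Hz).
# These values are invented; they are not the study's measurements.
records = [
    ("FR", "C'V", 2300.0),
    ("FR", "C'V", 2400.0),
    ("FR", "C'jV", 2700.0),
    ("MR", "C'V", 2000.0),
    ("MR", "C'jV", 2200.0),
    ("MR", "C'jV", 2260.0),
]


def group_means(rows):
    """Return the mean F2 value for each (group, sequence) pair."""
    buckets = defaultdict(list)
    for group, sequence, f2_hz in rows:
        buckets[(group, sequence)].append(f2_hz)
    return {key: mean(values) for key, values in buckets.items()}


means = group_means(records)
for key in sorted(means):
    print(key, round(means[key], 1))
```

Grouping on the pair (group, sequence) mirrors the sorting "according to C-V sequence and subject" followed by averaging within a language and gender group.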

This study endeavors to provide a detailed dynamic description of properties that distinguish the secondary palatalizing closure, the primary palatal glide closure and the closure for the high front vowel /i/. Because the non-palatalized sequences contain none of these three palatal qualities, it was decided at this point in the analysis that data for the non-palatalized sequences would no longer be included. The non-palatalized sequences will therefore no longer be referenced either in the reported data or in the graphs.

4.3.1 F2 trajectories for native Russian speakers

4.3.1.1 Russians’ means of F2 frequencies and durations

Data summarizing various quantitative aspects of Russians’ formant trajectories for the three C-V sequences are reported in Table 4.1.

                    Female Russians             Male Russians
                  /CʲV/   /CʲjV/  /CʲijV/    /CʲV/   /CʲjV/  /CʲijV/

F2 frequency at consonant release (Hz.)
  N. of Cases        96      112      114       78      112      113
  Min.             1711     1693     1935     1711     1797     1745
  Max.             2990     3024     3059     2247     2419     2523
  Mean             2266     2548     2550     1932     2165     2163
  Variance        51024    72743    63611    13820    20916    20930
  Stan. Dev.        226      270      252      118      145      145

F2 frequency at transition onset (Hz.)
  N. of Cases       114      112      113      110      111      111
  Min.             1780     2264     2281     1763     1987     2005
  Max.             2973     3076     3266     2333     2489     2506
  Mean             2362     2704     2778     2012     2233     2268
  Variance        60875    41437    35682    15230     8551    10091
  Stan. Dev.        247      204      189      123       92      100

F2 at vowel steady-state onset (Hz.)
  N. of Cases       114      112      113      110      111      110
  Min.              691      708      674      708      708      639
  Max.             2056     1935     2005     1814     1797     1676
  Mean             1300     1281     1278     1253     1256     1235
  Variance       186769   190050   200089    95296    97234   111048
  Stan. Dev.        432      436      447      309      312      333

F2 steady-state duration (sec.)
  N. of Cases        92      112      114       76      112      112
  Min.            .0040    .0500    .1100    .0033    .0400    .0700
  Max.            .1128    .1800    .2500    .1113    .2200    .2700
  Mean            .0332    .1174    .1706    .0251    .1019    .1535
  Variance        .0005    .0007    .0012    .0003    .0010    .0012
  Stan. Dev.      .0230    .0269    .0350    .0159    .0321    .0344

F2 transition duration (sec.)
  N. of Cases       114      112      113      110      111      110
  Min.            .0591    .0700    .0600    .0124    .0600    .0700
  Max.            .1712    .2400    .2700    .2036    .2300    .2400
  Mean            .1066    .1095    .1140    .1246    .1217    .1129
  Variance        .0007    .0006    .0008    .0010    .0009    .0008
  Stan. Dev.      .0256    .0252    .0283    .0321    .0296    .0260

Table 4.1. Russians’ means and other data for the five F2 measurements for the three C-V environments organized according to gender. The following abbreviations are used in the table: “N. of Cases” = the number of total observations for that specific measurement, “Min.” = the minimum value among all observations, “Max.” = the maximum value among all observations, “Stan. Dev.” = the standard deviation from the mean value among all observations.

Mean values for each of the five F2 measurements for each of the three C-V environments are indicated with a double-lined box. In general, it appears that both the F2-at-transition-onset frequency and F2-steady-state duration values increase with each additional C-V environment (i.e., /CʲijV/ > /CʲjV/ > /CʲV/), thereby indicating that with each successive C-V environment, the closure is produced more forward in the oral cavity and maintained for a longer period of time. Note the overall difference in frequency between female and male speakers. Due to their different vocal tract sizes, females have higher formant frequencies than males.

4.3.1.2 Russians’ time-aligned F2 trajectories

Of particular interest are the F2-steady-state durations and F2-at-transition-onset frequencies for the three C-V environments, since they reflect the degree of closure and duration of closure of the lingual palatalizing and palatal articulations. In order to more clearly illustrate the relationship between the three C-V environments, mean values from Table 4.1 above were reorganized: data for each C-V sequence were time-aligned according to the onset of C-V transition. The C-V transition onset was assigned a relative time of 0.000. The F2-steady-state durations were adjusted accordingly; occurring prior to transition onset, the F2-steady-state duration is expressed with negative duration values. Occurring after the C-V transition onset, F2-transition durations are expressed with positive duration values. Table 4.2 gives the reorganized F2 frequency values and relative duration times for female and male Russians.
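The time alignment just described amounts to a simple change of origin, sketched below in Python. The function name and its output layout are assumptions made for illustration; the input numbers are the female Russians' means for the simple-palatalized sequence as reported in Table 4.1.

```python
def time_align(f2_release, f2_onset, f2_vowel, steady_dur, transition_dur):
    """Return (label, F2 in Hz, time in sec.) points with transition onset at t = 0."""
    return [
        # The steady-state precedes transition onset, so its duration is negated.
        ("consonant release", f2_release, -steady_dur),
        ("transition onset", f2_onset, 0.0),
        # The transition follows onset, so its duration stays positive.
        ("vowel steady-state onset", f2_vowel, transition_dur),
    ]


# Means for female Russians, simple-palatalized sequence (Table 4.1).
points = time_align(2266, 2362, 1300, 0.0332, 0.1066)
for label, f2, t in points:
    print(f"{label:26s} {f2:5d} Hz.  {t:+.3f} sec.")
```

Rounded to three decimals, the relative times come out as -0.033, 0.000 and 0.107 sec.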

Group            C-V sequence  Place of F2 frequency      F2 frequency  Relative time to C-V
                 type          measurement                (Hz.)         transition onset (sec.)

Female Russians  /CʲV/         Consonant release          2266          -0.033
                               Transition onset           2361           0.000
                               Vowel steady-state onset   1299           0.107
                 /CʲjV/        Consonant release          2548          -0.117
                               Transition onset           2704           0.000
                               Vowel steady-state onset   1281           0.110
                 /CʲijV/       Consonant release          2549          -0.171
                               Transition onset           2778           0.000
                               Vowel steady-state onset   1278           0.114

Male Russians    /CʲV/         Consonant release          1931          -0.025
                               Transition onset           2011           0.000
                               Vowel steady-state onset   1252           0.125
                 /CʲjV/        Consonant release          2164          -0.102
                               Transition onset           2232           0.000
                               Vowel steady-state onset   1255           0.122
                 /CʲijV/       Consonant release          2163          -0.153
                               Transition onset           2268           0.000
                               Vowel steady-state onset   1235           0.113

Table 4.2. Table of Russians’ time-aligned F2 frequency and duration measurements. The following abbreviations are used: /CʲV/ = simple-palatalized sequences, /CʲjV/ = palatalized-yod sequences, /CʲijV/ = palatalized-i-yod sequences.

4.3.1.3 Graphs of time-aligned trajectories for female and male Russians

Figure 4.5 gives the graphs of the time-aligned adjusted values from Table 4.2. Data for female and male Russians are graphed separately.

[Figure 4.5 graphs: two panels, “Female Russians” (upper) and “Male Russians” (lower), each plotting the F2 trajectories of /C'V/, /C'jV/ and /C'ijV/ against Time (sec.).]

Figure 4.5. Time-aligned F2 trajectories for native Russians for the three C-V environments. The upper graph gives F2 trajectories for female Russians; the lower graph is for male Russians. Palatalization is indicated with an apostrophe and not IPA superscript ‘j’.

The two sets of F2 trajectories for male and female Russian subjects indicate very similar patterns:

CʲV. For the simple-palatalized sequences there is a brief F2 steady-state duration immediately following closure release. For Russian females, the average duration is 33 msec., for Russian males 25 msec. This brief steady-state indicates the time when the tongue maintains a palatal closure, immediately before lowering from the palate and moving towards the vowel articulation. In agreement with phonetics literature, the F2 value at C-V transition onset for palatalized consonants is raised, here 2361 Hz. for females and 2011 Hz. for males^. For both female and male speakers, the following transition duration is fairly consistent, ranging from 107 to 125 msec.

CʲjV. With the addition of the yod segment the F2 trajectory changes markedly. There is a longer steady-state duration preceding C-V transition onset, now averaging 117 msec. for females and 102 msec. for males. The F2 frequency value at transition onset is much higher, averaging 2704 Hz. for females and 2232 Hz. for males. The higher F2 frequency values of longer duration indicate a palatal constriction that is further forward in the mouth and maintained longer than that of simple-palatalized sequences. The F2 transition duration has not changed noticeably, averaging 110 msec. for females and 122 msec. for males.

CʲijV. With the addition of the vowel segment [i], the only apparent difference between CʲijV and CʲjV is that the steady-state duration has increased, now averaging 171 msec. for females and 153 msec. for males. It appears that the addition of the front high vowel [i] does not markedly change the F2 frequency at transition onset, here 2778 Hz. for females and 2268 Hz. for males. The F2 transition duration does not appear to have lengthened, now 114 msec. for females and 113 msec. for males.

Summing up the F2 trajectory patterns for Russians, we see the three C-V sequences fall into two general patterns: (1) the simple-palatalized sequence, CʲV, with a relatively short steady-state duration and lower F2-at-transition-onset; (2) the palatalized-yod, CʲjV, and palatalized-i-yod, CʲijV, sequences with the longer F2-steady-state durations and higher F2-at-transition-onset values.

^ It is interesting to note that I located no linguistic studies which focus on acoustic properties of Russian as produced by female native Russian speakers. All acoustic data that I have found were based on male sources. Therefore, while contrasting male and female native speakers and learners of Russian was not a primary goal of this study, these data are unique in the literature.

4.3.2 F2 trajectories for learners

4.3.2.1 Learners’ means of F2 frequencies and durations

Let us now turn our attention to the learners’ acoustic outputs. Table 4.3 gives the means, standard deviations and other relevant data for the learners, grouped by C-V type and gender. The data reported are for the six male graduate students and eight female graduate students.

                    Female learners             Male learners
                  /CʲV/   /CʲjV/  /CʲijV/    /CʲV/   /CʲjV/  /CʲijV/

F2 frequency at consonant release (Hz.)
  N. of Cases       104       95      111       80       83       84
  Min.             1538     1884     1780     1590     1572     1745
  Max.             2921     2834     2938     2471     2523     2679
  Mean             2342     2383     2323     2042     2080     2094
  Variance        63073    46315    45236    28978    30189    23326
  Stan. Dev.        251      215      212      170      174      153

F2 frequency at transition onset (Hz.)
  N. of Cases       111      112      111       82       84       83
  Min.             1745     2143     2177     1849     1814     1935
  Max.             2938     2973     3076     2661     2506     2592
  Mean             2505     2492     2594     2153     2114     2206
  Variance        32899    25945    29657    25305    23584    20897
  Stan. Dev.        181      161      172      159      154      145

F2 at vowel steady-state onset (Hz.)
  N. of Cases       110      112      111       82       84       82
  Min.              760      777      795      743      743      725
  Max.             2471     1884     1866     1728     1728     1642
  Mean             1385     1321     1325     1265     1239     1212
  Variance       122940   113173   124109   100213   102366    95327
  Stan. Dev.        350      336      352      316      320      309

F2 steady-state duration (sec.)
  N. of Cases       102       94      111       80       71       84
  Min.            .0061    .0168    .0287    .0062    .0100    .0600
  Max.            .1246    .1543    .2643    .1094    .1200    .2900
  Mean            .0567    .0651    .1185    .0396    .0494    .1535
  Variance        .0006    .0011    .0022    .0006    .0005    .0023
  Stan. Dev.      .0246    .0327    .0466    .0240    .0232    .0482

F2 transition duration (sec.)
  N. of Cases       110      112      111       82       84       82
  Min.            .0153    .0455    .0620    .0657    .0500    .0700
  Max.            .2493    .2454    .3202    .1637    .1800    .2000
  Mean            .1293    .1430    .1444    .1029    .1123    .1259
  Variance        .0017    .0018    .0022    .0005    .0009    .0007
  Stan. Dev.      .0415    .0427    .0470    .0218    .0295    .0267

Table 4.3. Learners’ means and other data for the five F2 measurements for the three C-V environments according to gender. The following abbreviations are used in the table: “N. of Cases” = the number of total observations for that specific measurement, “Min.” = the minimum value among all observations, “Max.” = the maximum value among all observations, “Stan. Dev.” = the standard deviation among the observations.

Mean values for each of the five F2 measurements for each of the three C-V environments are again indicated inside the double-lined boxes. Upon initial inspection, the F2 patterns for the learners seem to differ from those of the Russians: learners’ values for the simple-palatalized and palatalized-yod sequences are similar, while the palatalized-i-yod sequences display higher F2 values and longer steady-state durations.

4.3.2.2 Learners’ time-aligned F2 trajectories

As with the Russians, data from the initial table of means (Table 4.3) were reorganized and time-aligned relative to F2 transition onset. Table 4.4 below gives the reorganized F2 frequency values and relative duration times for female and male learners.

Group            C-V sequence  Place of F2 frequency      F2 frequency  Relative time to C-V
                 type          measurement                (Hz.)         transition onset (sec.)

Female learners  /CʲV/         Consonant release          2316          -0.057
                               Transition onset           2496           0.000
                               Vowel steady-state onset   1388           0.140
                 /CʲjV/        Consonant release          2372          -0.064
                               Transition onset           2485           0.000
                               Vowel steady-state onset   1332           0.151
                 /CʲijV/       Consonant release          2333          -0.118
                               Transition onset           2580           0.000
                               Vowel steady-state onset   1340           0.154

Male learners    /CʲV/         Consonant release          2042          -0.040
                               Transition onset           2152           0.000
                               Vowel steady-state onset   1264           0.103
                 /CʲjV/        Consonant release          2079          -0.049
                               Transition onset           2114           0.000
                               Vowel steady-state onset   1238           0.112
                 /CʲijV/       Consonant release          2094          -0.153
                               Transition onset           2206           0.000
                               Vowel steady-state onset   1211           0.124

Table 4.4. Table of learners’ time-aligned F2 frequency and duration measurements. The following abbreviations are used: /CʲV/ = simple-palatalized sequences, /CʲjV/ = palatalized-yod sequences, /CʲijV/ = palatalized-i-yod sequences.

4.3.2.3 Graphs of time-aligned trajectories for female and male learners

Figure 4.6 gives graphs of the time-aligned F2 trajectories for the data in Table 4.4. Trajectories for female and male learners are graphed separately.

[Figure 4.6 graphs: two panels, “Female learners” (upper) and “Male learners” (lower), each plotting the F2 trajectories of /C'V/, /C'jV/ and /C'ijV/ against Time (sec.).]

Figure 4.6. Time-aligned F2 trajectories for learners for the three C-V environments. Female and male trajectories presented separately. The upper graph gives F2 trajectories for female learners; the lower graph is for male learners. Palatalization is indicated with an apostrophe and not IPA superscript ‘j’.

The two graphs of F2 trajectories for both male and female learners present very similar within-language patterns.

CʲV. F2 trajectories for the simple-palatalized sequences indicate an F2 steady-state immediately following consonant closure release, averaging 57 msec. for females and 40 msec. for males. Upon initial inspection, the learners’ F2-steady-state durations seem to be longer than those of the Russians. (Recall that Russians’ averages were 33 msec. for females and 25 msec. for males.) It could be hypothesized that the learners’ apparently longer steady-state duration corresponds to a separate yod segment following the (non-palatalized) consonant, thereby supporting the traditional description that nonnative productions of palatalized consonants are a sequence of two segments, /C/ + /j/. Only statistical analyses, however, can determine whether a reliable difference between the Russians’ and learners’ F2-steady-state durations does exist.

Learners’ F2 frequency values increased from consonant release to transition onset. Females averaged 2496 Hz. and males averaged 2152 Hz. at transition onset. The F2 transition durations for the two gender groups were 140 msec. for females and 103 msec. for males.

CʲjV. The most notable feature of learners’ F2 trajectories for the palatalized-yod sequences is that they are virtually identical to the trajectories for simple-palatalized sequences. Following primary consonant closure release, there is an F2-steady-state duration of approximately the same length as in CʲV, having means of 64 msec. for females and 49 msec. for males. The mean F2-at-transition-onset frequencies for CʲjV are similar to those of the CʲV sequence, here 2485 Hz. for females and 2114 Hz. for males. (Note that, in fact, males have a lower F2 frequency for CʲjV than for CʲV.) F2 transition durations for CʲjV sequences appear to be somewhat longer, 151 msec. for females and 112 msec. for males.

CʲijV. With this third C-V environment and the addition of the vowel [i], there appears to be a reliable increase in the F2-steady-state duration, with means of 118 msec. for females and 153 msec. for males. The F2-at-transition-onset frequency seems to be somewhat raised, with mean values of 2580 Hz. for females and 2206 Hz. for males.

Summarizing learners’ F2 trajectory patterns, we see the three C-V sequences divided into two groups: (1) the simple-palatalized, CʲV, and palatalized-yod, CʲjV, appear to pattern the same; (2) with its longer steady-state duration and slightly higher F2-at-transition-onset, the palatalized-i-yod sequence, CʲijV, seems to have a different acoustic pattern. Most importantly, Russians’ and learners’ F2 trajectory patterns for the three C-V sequences exhibit different patternings: Russians group together CʲjV and CʲijV, and learners group together CʲV and CʲjV.

4.3.3 Statistical analyses of F2-at-transition-onset: Russians

In the remaining sections, which discuss statistical analyses of the acoustic data, the following abbreviations will be used to identify factors in the statistical models:

SUBJECT = each individual subject from the four gender-nationality groups; thirty possibilities: 8 female learners, 8 female Russians, 6 male learners, 8 male Russians

WHO = one of the four gender-nationality groups; four possibilities: FE, FR, ME, MR

PALENVIR = (< “palatalization environment”) the type of C-V sequence; three possibilities: CʲV, CʲjV, or CʲijV

POACONS = (< “place of articulation of the consonant”) the consonant type of the C in the C-V sequence; four possibilities: labial, dental, lateral or trill

F2SSDUR = (< “F2-steady-state duration”) the duration of the steady-state portion of the second formant which follows consonant closure release and precedes vowel onset and corresponds to a yod-like, or palatalizing, secondary articulation

F2@TRANS = (< “F2-at-transition-onset”) the frequency of the second formant at the onset of transition between the C and V of the three C-V sequences, located at the onset of the vowel
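One way to picture how these factors fit together is as the fields of a single record per measured token. The Python sketch below is purely illustrative: the class name, the field names, the apostrophe notation for the sequences and the example token are assumptions, not a description of how the data were actually stored in Systat.

```python
from dataclasses import dataclass

# Factor levels named in the definitions above (apostrophe marks palatalization).
PALENVIR_LEVELS = ("C'V", "C'jV", "C'ijV")
POACONS_LEVELS = ("labial", "dental", "lateral", "trill")


@dataclass(frozen=True)
class Observation:
    """One measured token; field names mirror the factor labels."""
    subject: str        # SUBJECT: speaker identifier (hypothetical format)
    who: str            # WHO: gender-nationality group code
    palenvir: str       # PALENVIR: type of C-V sequence
    poacons: str        # POACONS: consonant type
    f2ssdur: float      # F2SSDUR: F2 steady-state duration (sec.)
    f2_at_trans: float  # F2@TRANS: F2 at transition onset (Hz.)

    def __post_init__(self):
        # Reject levels that are not part of the factor definitions.
        if self.palenvir not in PALENVIR_LEVELS:
            raise ValueError(f"unknown PALENVIR level: {self.palenvir}")
        if self.poacons not in POACONS_LEVELS:
            raise ValueError(f"unknown POACONS level: {self.poacons}")


# Invented example token, not taken from the study's data.
token = Observation("FR01", "FR", "C'jV", "dental", 0.117, 2704.0)
print(token)
```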

This section presents analyses of the F2-at-transition-onset. The next sections, 4.3.5 and 4.3.6, report statistical analyses of F2-steady-state durations. As stated earlier, Russians’ F2 trajectories in Figure 4.5 for the three C-V sequences indicate that the simple-palatalized sequence differs from the palatalized-yod and palatalized-i-yod sequences in both F2-steady-state duration and F2-at-transition-onset; the palatalized-yod and palatalized-i-yod sequences seem to have differing F2-steady-state durations but equivalent F2-at-transition-onset frequencies. In order to determine which of these apparent differences is statistically reliable, I conducted a balanced repeated measures Analysis of Variance (ANOVA) for each gender group.

It should be noted that this first ANOVA that I report is invalid since it does not take into account complications from between-speaker variability. However, I choose to report this initial simplified ANOVA as a “user-friendly” introduction to the statistical analyses used in this study. In this initial ANOVA, I tested for main effects of the independent variables SUBJECT, PALENVIR and POACONS and their interactions on the dependent variable F2@TRANS. That is, I investigated the cause-and-effect relationship between the subject, C-V environment and consonant type with the F2-at-transition-onset frequency. Do the three different C-V environments alone (with their different types of palatal constrictions) cause reliably different F2-at-transition-onset frequencies? Or, are the different consonant types associated with (or "causing") differing F2-at-transition-onset frequencies? Or, perhaps the F2-at-transition-onset frequency is influenced by a combination (or, interaction) of both the C-V environment and consonant type (i.e., a dental CʲV versus a labial CʲV versus a lateral CʲV; or a dental CʲV versus a dental CʲjV versus a dental CʲijV).

At this point we are most interested in the role of PALENVIR on the F2@TRANS frequency. In this initial analysis, all four C-V environments (CV, CʲV, CʲjV and CʲijV) were included.

ANOVA for all four C-V environments

DEPENDENT VARIABLE: F2@TRANS

                              FEMALE RUSSIANS        MALE RUSSIANS
INDEPENDENT VARIABLES:        F-Ratio        P       F-Ratio        P
SUBJECT                        26.093    0.000         8.996    0.000
POACONS                        27.904    0.000        41.177    0.000
PALENVIR                     1347.330    0.000      1620.859    0.000
SUBJECT*POACONS                 0.532    0.957         0.361    0.996
SUBJECT*PALENVIR                2.832    0.000         2.452    0.000
POACONS*PALENVIR               19.246    0.000        25.771    0.000
SUBJECT*POACONS*PALENVIR        0.646    0.982         0.486    1.000

Table 4.5. ANOVA for both groups of Russian speakers. All four C-V environments included in calculations.

Table 4.5 indicates that all three factors (SUBJECT, POACONS and PALENVIR) have a main effect (p<0.01) on the F2-at-transition-onset frequency. The interactions SUBJECT*PALENVIR and POACONS*PALENVIR are also significant (p<0.01), indicating that the specific C-V environment (CV, CʲV, CʲjV, or CʲijV) for each of the consonant types plays a key role in the resulting F2-at-transition-onset frequency. The interaction SUBJECT*POACONS is not significant.

Upon closer inspection we see that the F-Ratio values for PALENVIR are quite large, thereby indicating that this apparent overall reliable difference could actually be due to an additional factor which has not been taken into consideration. In fact, this initial test is based on incorrect assumptions since we have not factored out between-speaker variation due to the different vocal tract lengths. The larger F-Ratio values of 1347 (females) and 1621 (males) and their statistically significant results might actually be due to these inherent between-speaker differences.

In order to correctly determine if the PALENVIR effect is consistent across all speakers (that is, in order to factor out the effect of the between-speaker variation), I ran an additional test of PALENVIR effect where the ERROR TERM was the interaction SUBJECT*PALENVIR. With between-speaker variation factored out, the factor PALENVIR still has a main effect on F2@TRANS (for females F(3,21)=475.8, p<0.01; for males F(3,21)=661.2, p<0.01).
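The logic of testing an effect against the SUBJECT*PALENVIR interaction can be sketched in a few lines of Python. This is a minimal illustration of the standard repeated-measures F ratio computed from a table of per-speaker cell means, not a reproduction of the Systat analysis; the function name and the small data set (3 speakers by 3 environments) are invented.

```python
def repeated_measures_f(cell_means):
    """F ratio for the condition effect, tested against subject-by-condition.

    cell_means[s][c] is the mean for subject s in condition c.
    Returns (F, df_effect, df_error).
    """
    n_subj = len(cell_means)
    n_cond = len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (n_subj * n_cond)
    subj_means = [sum(row) / n_cond for row in cell_means]
    cond_means = [sum(row[c] for row in cell_means) / n_subj for c in range(n_cond)]

    # Variation between condition means (the effect of interest).
    ss_effect = n_subj * sum((m - grand) ** 2 for m in cond_means)
    # Subject-by-condition interaction: what remains after removing
    # each speaker's overall level and each condition's overall level.
    ss_error = sum(
        (cell_means[s][c] - subj_means[s] - cond_means[c] + grand) ** 2
        for s in range(n_subj)
        for c in range(n_cond)
    )
    df_effect = n_cond - 1
    df_error = (n_subj - 1) * (n_cond - 1)
    return (ss_effect / df_effect) / (ss_error / df_error), df_effect, df_error


# Hypothetical F2-at-transition-onset cell means (Hz.): 3 speakers x 3 environments.
example = [
    [2100.0, 2120.0, 2140.0],
    [2110.0, 2140.0, 2140.0],
    [2120.0, 2130.0, 2170.0],
]
f_ratio, df1, df2 = repeated_measures_f(example)
print(f"F({df1},{df2}) = {f_ratio:.1f}")
```

With eight speakers and four environments the same formulas give 3 and 21 degrees of freedom, which agrees with the F(3,21) values reported above.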

At this point in our discussion, all four C-V environments have been included in the analyses. Non-palatalized sequences, however, have substantially lower F2-at-transition-onset values than the other three palatalized sequences. Their inclusion in the analyses, therefore, contributes to the high F-Ratios and doesn’t permit a very close look at the different types of palatal environments. Similar ANOVA and test of effects analyses which do not include data for the non-palatalized CV sequences need to be calculated.

ANOVA for three palatalized C-V environments: CʲV, CʲjV, CʲijV

DEPENDENT VARIABLE: F2@TRANS

                              FEMALE RUSSIANS        MALE RUSSIANS
INDEPENDENT VARIABLES:        F-Ratio        P       F-Ratio        P
SUBJECT                        40.666    0.000        23.106    0.000
POACONS                         3.514    0.016         9.027    0.000
PALENVIR                      209.332    0.000       298.960    0.000
SUBJECT*POACONS                 1.057    0.396         0.867    0.634
SUBJECT*PALENVIR                2.403    0.004         3.543    0.000
POACONS*PALENVIR                1.045    0.396         5.260    0.000
SUBJECT*POACONS*PALENVIR        0.570    0.985         0.731    0.888

Table 4.6. ANOVA for both groups of Russians. Only the three C-V environments containing a palatalized consonant are considered.

This second ANOVA in Table 4.6 indicates the same significant main effects as when all four C-V environments are included. However, the significant interactions are different: POACONS*PALENVIR is no longer significant for Russian females.

To take into account the fact that repeated measures were taken from the same speakers (with each speaker having different vocal tract lengths and resulting F2s), I ran an additional test of PALENVIR effects analysis where the ERROR TERM was the interaction SUBJECT*PALENVIR. After having factored out between-speaker variability, we see that the sequence type (i.e., CʲV, CʲjV or CʲijV) still has a main effect on the frequency of the second formant at the beginning of C-V transition (for females F(2,14)=85.1, p<0.01; for males F(2,14)=84.3, p<0.01). Thus, the C-V environment seems to cause the differing F2-at-transition-onset frequencies.

While we have established that an overall difference between the three C-V environments for the F2-at-transition-onset frequencies exists, it is not clear if a difference exists between all three environments; perhaps only two environments differ while two have equivalent values. If we return to Figure 4.5, it appears that the simple-palatalized differs from palatalized-yod while palatalized-yod and palatalized-i-yod sequences have equivalent F2-at-transition-onset frequencies, i.e., CʲV < CʲjV = CʲijV. (It seems that the two sequences containing the independent yod segment have equivalent F2-at-transition-onset frequencies and, therefore, have similar lingual articulations.) Table 4.7 gives the results of a Tukey post-hoc test that reveals between which of the three C-V sequences there exist reliably different F2-at-transition-onset frequencies.
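For readers unfamiliar with the Tukey procedure, the following Python sketch shows the pairwise statistic it is built on: the difference between two condition means divided by a standard error derived from the error mean square. It is an illustration under stated assumptions (equal group sizes; invented means, error mean square and group size), and it stops at the studentized range statistic q; converting q to a probability requires the studentized range distribution, which is not attempted here.

```python
from itertools import combinations
from math import sqrt


def tukey_q(means, ms_error, n_per_group):
    """Return {(a, b): q} for every pair of groups, assuming equal group sizes."""
    standard_error = sqrt(ms_error / n_per_group)
    return {
        (a, b): abs(means[a] - means[b]) / standard_error
        for a, b in combinations(means, 2)
    }


# Hypothetical condition means (Hz.), error mean square and group size.
example_means = {"C'V": 2110.0, "C'jV": 2130.0, "C'ijV": 2150.0}
q_values = tukey_q(example_means, ms_error=100.0, n_per_group=3)
for pair, q in q_values.items():
    print(pair, round(q, 3))
```

A larger q for a pair of sequences corresponds to a smaller probability in a table of pairwise comparisons.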

              FEMALE RUSSIANS               MALE RUSSIANS
            CʲV    CʲijV     CʲjV        CʲV    CʲijV     CʲjV
CʲV       1.000                        1.000
CʲijV     0.000    1.000               0.000    1.000
CʲjV      0.000    0.049    1.000      0.000    0.244    1.000

Table 4.7. Tukey post-hoc test for both Russian groups which indicates reliable differences in F2-at-transition-onset frequencies between the three C-V environments.

The double-outlined cells in Table 4.7 provide the key information. In support of our earlier hypotheses based on Figure 4.5, both male and female Russian speakers produce simple-palatalized and palatalized-yod sequences with different F2-at-transition-onset frequencies (p<0.01). Native Russian speakers do not, however, distinguish the F2-at-transition-onset frequencies of palatalized-yod and palatalized-i-yod sequences. Table 4.8 summarizes these results:

Female and male native Russian speakers
Comparison of F2-at-transition-onset frequencies (POACONS not taken into consideration)

CʲV < CʲjV = CʲijV

Table 4.8. Generalization of the reliable differences between the F2-at-transition-onset for both Russian groups for the three environments CʲV, CʲjV and CʲijV.

This analysis can be taken one step further. While I have established the F2 frequency equivalencies (and non-equivalencies) for the three C-V environments, we have not yet considered how each consonant type, i.e., labial, dental, lateral and trill, affects the F2-at-transition-onset frequency. Do all four kinds of articulations follow the CʲV < CʲjV = CʲijV pattern? Or, do only some of them exhibit this pattern while others do not?

To test how the consonant type for each of the three C-V environments affects F2-at-transition-onset frequency, I ran another Tukey post-hoc test. Between-speaker variability was factored out by designating the effect to be POACONS*PALENVIR and the error term SUBJECT*POACONS*PALENVIR. See Table 4.9 for the test results. Double-outlined cells indicate key statistical results.

Female Russians

                D      D      D      B      B      B      L      L      L      R      R      R
              CʲV  CʲijV   CʲjV    CʲV  CʲijV   CʲjV    CʲV  CʲijV   CʲjV    CʲV  CʲijV   CʲjV
D CʲV       1.000
D CʲijV     0.000  1.000
D CʲjV      0.000  0.181  1.000
B CʲV       0.998  0.000  0.000  1.000
B CʲijV     0.000  0.986  0.735  0.000  1.000
B CʲjV      0.000  0.365  1.000  0.000  0.950  1.000
L CʲV       0.622  0.000  0.000  0.929  0.000  0.000  1.000
L CʲijV     0.000  0.997  0.070  0.000  0.702  0.144  0.000  1.000
L CʲjV      0.000  0.088  0.999  0.000  0.400  0.952  0.000  0.033  1.000
R CʲV       0.610  0.000  0.000  0.160  0.000  0.000  0.033  0.000  0.000  1.000
R CʲijV     0.000  0.700  0.004  0.000  0.128  0.009  0.000  0.999  0.003  0.000  1.000
R CʲjV      0.000  0.997  0.970  0.000  1.000  0.999  0.000  0.849  0.748  0.000  0.319  1.000

Male Russians

                D      D      D      B      B      B      L      L      L      R      R      R
              CʲV  CʲijV   CʲjV    CʲV  CʲijV   CʲjV    CʲV  CʲijV   CʲjV    CʲV  CʲijV   CʲjV
D CʲV       1.000
D CʲijV     0.000  1.000
D CʲjV      0.000  0.860  1.000
B CʲV       0.014  0.000  0.000  1.000
B CʲijV     0.000  1.000  0.810  0.000  1.000
B CʲjV      0.000  0.262  0.999  0.000  0.162  1.000
L CʲV       0.000  0.000  0.000  0.000  0.000  0.000  1.000
L CʲijV     0.000  1.000  0.877  0.000  1.000  0.435  0.000  1.000
L CʲjV      0.000  0.477  0.998  0.000  0.417  1.000  0.000  0.533  1.000
R CʲV       0.999  0.000  0.000  0.621  0.000  0.000  0.000  0.000  0.000  1.000
R CʲijV     0.000  0.972  0.277  0.000  0.948  0.049  0.000  0.999  0.111  0.000  1.000
R CʲjV      0.000  1.000  0.955  0.000  1.000  0.582  0.000  1.000  0.668  0.000  0.991  1.000

Table 4.9. Tukey post-hoc test of effect of consonant type and C-V environment on the F2-at-transition-onset frequency. The following abbreviations are used: B = labials (/bʲ/, /mʲ/, /vʲ/), D = dentals (/dʲ/, /zʲ/), L = lateral (/lʲ/) and R = trill (/rʲ/). The top table gives data for female Russians, the bottom table is for male Russians.

For each of the four consonant types the simple-palatalized and palatalized-yod sequences have different F2-at-transition-onset frequencies (p<0.01), while the palatalized-yod and palatalized-i-yod sequences display the same F2-at-transition-onset frequencies for each consonant type (no reliable difference at p<0.01).

The results of this analysis support the previous general conclusions: for both male and female Russians, the F2-at-transition-onset frequency of simple-palatalized and palatalized-yod sequences differs, while the palatalized-yod and palatalized-i-yod are reliably the same. Table 4.10 summarizes the results of Table 4.9.

Female and male native Russian speakers
Comparison of F2-at-transition-onset frequencies (POACONS taken into consideration)

            CʲV        CʲjV        CʲijV

DENTALS     DʲV    <   DʲjV    =   DʲijV
LABIALS     BʲV    <   BʲjV    =   BʲijV
LATERAL     LʲV    <   LʲjV    =   LʲijV
TRILL       RʲV    <   RʲjV    =   RʲijV

Table 4.10. Summarizing results of Tukey post-hoc test which tested for the effects of C-V environment and consonant type on the F2-at-transition-onset frequency. Results for female and male Russians combined.

In summary, this section has demonstrated that, for both female and male native Russian speakers, the environment following a palatalized consonant does affect the F2-at-transition-onset frequency in the three C-V sequences. Specifically, a simple-palatalized environment, CʲV (one not followed by an additional palatal segment), has a lower F2 frequency than a sequence containing the palatal segment yod, e.g., CʲjV and CʲijV. Because the second formant band is associated with the front-back position, and to some extent height, of the tongue in the oral cavity, the statistical results of this section provide acoustic evidence that the palatalized-yod segment is produced with a higher, more forward tongue position than the secondary palatalizing articulation of soft consonants. (Whether or not these differing acoustic realizations are due to different underlying gestures or to the same underlying gestures which, because of time constraints, are not fully realized with CʲV, but are fully realized with CʲjV sequences, will be discussed later.)

4.3.4 Statistical analysis of F2-at-transition-onset: Learners

We now turn our attention to learners’ F2-at-transition-onset frequencies. Do they exhibit the same statistically reliable patterns that native speakers do? According to our initial hypotheses based on Figure 4.6, learners’ speech exhibits different F2 patterns than Russians’ speech: learners produce CʲV and CʲjV identically and differentiate CʲjV and CʲijV. This section will present the same series of statistical tests that were conducted on the Russians’ data in the previous section. Since the previous section gave a detailed step-by-step account of the analyses of the Russians’ data, this section will present the statistical tests and results of the Americans’ data more briefly. Furthermore, because the non-palatalized sequence is not the focus of this study (as discussed in the previous section), CV sequences are eliminated from the calculations. We will focus only on those sequences that contain a palatalized consonant: CʲV, CʲjV or CʲijV.

We begin with an ANOVA that tests for main effects of the factors SUBJECT, POACONS, and PALENVIR and their interactions on the dependent variable F2@TRANS.

For three C-V environments: CʲV, CʲjV, CʲijV

DEPENDENT VARIABLE: F2@TRANS

                            FEMALE LEARNERS      MALE LEARNERS
INDEPENDENT VARIABLES:      F-Ratio    P         F-Ratio    P
SUBJECT                     32.427     0.000     69.469     0.000
POACONS                     15.559     0.000     14.582     0.000
PALENVIR                    26.062     0.000     25.788     0.000
SUBJECT*POACONS              1.865     0.014      2.159     0.009
SUBJECT*PALENVIR             1.859     0.032      2.525     0.007
POACONS*PALENVIR             1.408     0.212      3.055     0.007
SUBJECT*POACONS*PALENVIR     1.610     0.015      1.010     0.460

Table 4.11. ANOVA for both groups of learners.

Based on the results in Table 4.11, all three factors, SUBJECT, POACONS, and PALENVIR, exhibit a main effect (p<0.01) on the dependent variable F2@TRANS. The interactions SUBJECT*POACONS and SUBJECT*PALENVIR are significant at p<0.05 for both groups (and at p<0.01 for males), while (like the Russian females) the interaction POACONS*PALENVIR is not significant for females but is for males. It is interesting to compare the F-Ratios for the Russians and learners in this first ANOVA. For the Russians, F-Ratio values for the factors SUBJECT, POACONS and PALENVIR differ considerably. (See Table 4.6.) POACONS exhibits the smallest value (3.514/9.027, female/male respectively); SUBJECT has a larger value (40.666/23.106), while PALENVIR displays a much larger ratio (209.332/298.960). For these same three factors the learners’ F-Ratios do not differ as much: SUBJECT (32.427/69.469), POACONS (15.559/14.582) and PALENVIR (26.062/25.788). These initial results might indicate that for Russians PALENVIR is overwhelmingly the factor that determines F2-at-transition-onset frequency, while for Americans it is not.
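The comparison of F-Ratios above can be made concrete with a small calculation. The following is only an illustrative sketch: it takes the F-Ratios quoted in the text (Table 4.6 for Russians, Table 4.11 for learners) and expresses PALENVIR relative to the largest of the other two main effects. The function name `palenvir_dominance` is hypothetical, and a ratio of F-Ratios is used here purely as a descriptive aid, not as a formal effect-size measure.

```python
# F-Ratios as (female, male), quoted in the text:
# Russians from Table 4.6, learners from Table 4.11.
f_ratios = {
    "Russians": {
        "SUBJECT": (40.666, 23.106),
        "POACONS": (3.514, 9.027),
        "PALENVIR": (209.332, 298.960),
    },
    "Learners": {
        "SUBJECT": (32.427, 69.469),
        "POACONS": (15.559, 14.582),
        "PALENVIR": (26.062, 25.788),
    },
}


def palenvir_dominance(group):
    """Return PALENVIR's F-Ratio divided by the largest other main-effect
    F-Ratio, as a (female, male) tuple. Hypothetical descriptive helper."""
    table = f_ratios[group]
    ratios = []
    for g in (0, 1):  # 0 = female, 1 = male
        largest_other = max(table["SUBJECT"][g], table["POACONS"][g])
        ratios.append(table["PALENVIR"][g] / largest_other)
    return tuple(ratios)


for group in f_ratios:
    female, male = palenvir_dominance(group)
    print(f"{group}: female {female:.2f}, male {male:.2f}")
```

For Russians the ratio is well above 1 for both genders, whereas for learners it falls below 1, which is the asymmetry described in the paragraph above.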

Factoring out between-speaker variability (due to learners’ different vocal tract lengths) and the repeated-measures design of the experiment, we see that PALENVIR still has a main effect on F2@TRANS: for females F(2,14)=14.0, p<0.01; for males F(2,10)=12.2, p<0.01.

Results of the Tukey post-hoc test in Table 4.12 indicate between which three C-V environments a difference in F2-at-transition-onset frequency exists.

            FEMALE LEARNERS                MALE LEARNERS
         CʲV     CʲijV   CʲjV           CʲV     CʲijV   CʲjV
CʲV      1.000                          1.000
CʲijV    0.001   1.000                  0.019   1.000
CʲjV     1.000   0.001   1.000          0.611   0.004   1.000

Table 4.12. Tukey post-hoc test for both learner groups which indicates reliable differences in F2-at-transition-onset frequencies between the three C-V environments.
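As a reading aid for Table 4.12, the pairwise p-values can be filtered against a chosen significance level. This is a minimal sketch, not part of the original analysis: the p-values are transcribed from the table, environments are written with an apostrophe for palatalization (C'V, C'jV, C'ijV), and the helper `reliable_pairs` is a hypothetical name.

```python
# Pairwise Tukey p-values transcribed from Table 4.12 (lower triangle).
tukey_p = {
    "female learners": {
        ("C'V", "C'jV"): 1.000,
        ("C'V", "C'ijV"): 0.001,
        ("C'jV", "C'ijV"): 0.001,
    },
    "male learners": {
        ("C'V", "C'jV"): 0.611,
        ("C'V", "C'ijV"): 0.019,
        ("C'jV", "C'ijV"): 0.004,
    },
}


def reliable_pairs(p_values, alpha=0.01):
    """Return the environment pairs whose p-value is below alpha."""
    return [pair for pair, p in p_values.items() if p < alpha]


for group, p_values in tukey_p.items():
    print(group, reliable_pairs(p_values))
```

At alpha = 0.01 neither group separates C'V from C'jV. Note that for male learners the C'V versus C'ijV comparison (p = 0.019) passes only at alpha = 0.05.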

Thus, learners exhibit a pattern opposite to that of native Russians. Both female and male learners produced the sequences CʲV and CʲjV with the same F2-at-transition-onset frequency, while the sequence containing the vowel /i/, CʲijV, was produced with a higher F2 frequency (p<0.01). Table 4.13 illustrates this finding.

Female and male learners
Comparison of F2-at-transition-onset frequencies (POACONS not taken into consideration)

CʲV  =  CʲjV  <  CʲijV

Table 4.13. Generalization of the differences between the F2-at-transition-onset for both learner groups for the three environments CʲV, CʲjV and CʲijV.

Let us consider how the four consonant types affect the F2-at-transition-onset frequency for each of the three C-V environments. Perhaps the apparent overall differences in F2@TRANS between the CʲjV and CʲijV environments are actually due to the different consonant types with their specific places and manners of articulation. A Tukey post-hoc test with the effect POACONS*PALENVIR and the error term SUBJECT*POACONS*PALENVIR gave the following results:

Female learners

          D      D      D      B      B      B      L      L      L      R      R      R
          CʲV    CʲijV  CʲjV   CʲV    CʲijV  CʲjV   CʲV    CʲijV  CʲjV   CʲV    CʲijV  CʲjV
D CʲV     1.000
D CʲijV   0.537  1.000
D CʲjV    1.000  0.194  1.000
B CʲV     0.762  1.000  0.330  1.000
B CʲijV   0.085  1.000  0.015  0.926  1.000
B CʲjV    0.999  0.917  0.907  0.993  0.302  1.000
L CʲV     0.998  0.229  1.000  0.370  0.038  0.852  1.000
L CʲijV   0.364  1.000  0.140  0.989  1.000  0.721  0.147  1.000
L CʲjV    1.000  0.363  1.000  0.548  0.075  0.949  1.000  0.232  1.000
R CʲV     0.257  0.003  0.545  0.005  0.000  0.041  0.910  0.003  0.813  1.000
R CʲijV   1.000  0.967  0.997  0.997  0.649  1.000  0.983  0.829  0.997  0.231  1.000
R CʲjV    0.755  0.026  0.960  0.045  0.002  0.253  0.999  0.020  0.995  1.000  0.658  1.000

Male learners

          D      D      D      B      B      B      L      L      L      R      R      R
          CʲV    CʲijV  CʲjV   CʲV    CʲijV  CʲjV   CʲV    CʲijV  CʲjV   CʲV    CʲijV  CʲjV
D CʲV     1.000
D CʲijV   0.899  1.000
D CʲjV    0.776  0.057  1.000
B CʲV     0.431  1.000  0.004  1.000
B CʲijV   0.221  0.994  0.001  1.000  1.000
B CʲjV    1.000  0.860  0.639  0.311  0.132  1.000
L CʲV     0.021  0.001  0.425  0.000  0.000  0.011  1.000
L CʲijV   0.830  1.000  0.083  1.000  1.000  0.797  0.002  1.000
L CʲjV    0.046  0.002  0.645  0.000  0.000  0.024  1.000  0.003  1.000
R CʲV     0.758  0.103  1.000  0.019  0.008  0.666  0.830  0.110  0.945  1.000
R CʲijV   0.996  1.000  0.338  0.999  0.987  0.995  0.008  1.000  0.016  0.352  1.000
R CʲjV    0.732  0.093  1.000  0.016  0.007  0.636  0.848  0.101  0.955  1.000  0.332  1.000

Table 4.14. Tukey post-hoc analysis focusing on the effect of consonant type and C-V environment on the F2-at-transition-onset frequency. Only the three C-V environments containing a palatalized consonant are included. The following abbreviations are used: B = labials (/bʲ/, /mʲ/, /vʲ/), D = dentals (/dʲ/, /zʲ/), L = lateral (/lʲ/) and R = trill (/rʲ/). The top table is for female learners, the bottom table is for male learners.

Rather surprisingly, Table 4.14 indicates a pattern different from the general pattern of Table 4.13 (averaged over consonant type). When the data are broken down by consonant type, we find that, except for one case, the pattern found in the average PALENVIR data is not reliable for the places of articulation considered separately. In other words, the PALENVIR effect is weaker for learners; with the smaller number of observations available to each of these by-consonant comparisons, the statistical test does not find the differences. Male and female learners produce both the simple-palatalized and palatalized-yod (CʲV and CʲjV) and the palatalized-yod and palatalized-i-yod (CʲjV and CʲijV) sequences with F2-at-transition-onset frequencies that do not differ reliably (at p<.01).

Thus the more general Tukey post-hoc of Table 4.13 indicates that the F2@TRANS frequencies were different for the three C-V environments because the place of articulation affects learners’ acoustic properties, i.e., because of its labial closure, a labial palatalized consonant has lower formant frequencies than dental palatalized consonants.

The only exception to learners’ new CʲV = CʲjV = CʲijV pattern is seen in the lateral productions of male learners. Here we see that male learners do distinguish the F2@TRANS for the LʲjV and LʲijV environments. Table 4.15 summarizes the results of Table 4.14. The male learners’ non-equivalent production of the laterals LʲjV and LʲijV is indicated with “≠”.

Female and male learners
Comparison of F2-at-transition-onset frequencies (POACONS taken into consideration)

            CʲV       CʲjV      CʲijV
Dentals     DʲV   =   DʲjV   =  DʲijV
Labials     BʲV   =   BʲjV   =  BʲijV
Lateral     LʲV   =   LʲjV   ≠  LʲijV
Trill       RʲV   =   RʲjV   =  RʲijV

Table 4.15. Summarizing results of Tukey post-hoc test which tested for the effects of C-V environment and consonant type on the F2-at-transition-onset frequency. Results for female and male learners combined.

In summary, this section has demonstrated that learners’ F2-at-transition-onset frequencies for the sequences CʲV, CʲjV and CʲijV exhibit a pattern different from that of native Russians. While native speakers exhibit a CʲV < CʲjV = CʲijV pattern, male and female learners do not distinguish the palatalized, palatalized-yod and palatalized-i-yod sequences, resulting in CʲV = CʲjV = CʲijV, with a weak tendency to produce CʲijV with a higher F2-at-transition-onset frequency. Table 4.16 summarizes the results.

Russians: CʲV < CʲjV = CʲijV

Learners: CʲV = CʲjV = CʲijV

Table 4.16. Summary comparison of F2-at-transition-onset frequencies for the three C-V environments for learners and native Russians.

In section 4.5 I discuss these results as they relate to general articulatory patterns.

Discussion of F2-at-transition-onset frequency addresses only the static domain of our F2 trajectories; it will be seen later that this static domain can be associated with articulatory targets. In order to present a more complete description of native and nonnative productions of Russian palatalized sequences, we must also consider the dynamic features of F2 steady-state durations; it will be seen later that these dynamic qualities can be associated with inter-articulatory timings. By taking into account both the static and dynamic properties of F2, the articulatory targets and their inter-articulatory timings can be illustrated in detail in the Gestural Phonology Model.

4.3.5 F2 steady-state duration: Russians

If we return to the Russians’ F2 trajectories in Figure 4.5, it appears that the simple-palatalized sequence, CʲV, has the shortest F2-steady-state durations, while the duration for the palatalized-yod sequence, CʲjV, is longer, and that for the palatalized-i-yod sequence, CʲijV, is longer yet. To determine if these apparent differences are reliable, statistical tests must be conducted on the duration values. The statistical analyses for F2-steady-state duration, however, are different from those conducted on the F2-at-transition-onset frequencies in the previous sections. This is because, when originally making acoustic measurements on both Russian and AE subjects’ spectrograms, I was not always able to confidently locate the beginning of the yod segment. This was especially true with non-stop consonants (including /lʲ/, /rʲ/ and palatalized affricated /dʲ/). Therefore, many data cells for the F2-steady-state duration were missing, resulting in unbalanced repeated measures data. I could not independently conduct ANOVAs on such data. (My statistics software, Systat 5.2.1, would act as if it were calculating the ANOVA but would not produce an ANOVA table. Instead it noted that there were 37 missing cases.) I, therefore, enlisted the help of the Statistical Consulting Service (SCS) at The Ohio State University. All means and statistical results reported on the F2-steady-state durations are from SCS calculations.

Regarding the statistical methods used, I quote from the final SCS report: "To analyze the unbalanced repeated measures data for duration of sounds a mixed-effects ANOVA was fit using restricted maximum likelihood estimation (REML). A covariance structure that accounted for correlation of durations within a subject, and a higher correlation of durations within an environment for a subject was more consistent with the data than the simpler uniform correlation structure. Our model is:

Y_{ijklm} = \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\alpha\beta\gamma)_{ijk} + [...] + v_{m(l)} + [...]

where:

i = 1...4 indicates the nationality-gender combination, i.e., FE, FR, ME, MR

j = 1...3 indicates the environment, i.e., CʲV, CʲjV, CʲijV

k = 1...4 indicates the sound, or consonant type, i.e., Lab, Dent, Lat, Trill

l = 1...n_i indicates the l-th person among the n_i in the i-th nationality-gender group, i.e., a total of 30 subjects: 8 FE, 6 FR, 8 ME, 8 MR

m = 1...3*N indicates the environment effect nested within person

An initial ANOVA Tests of Fixed Effects determined that all terms were significant (p<0.01).

(See Table 4.17.) So, the relationship of gender, nationality, C-V environment and consonant type to the

F2-steady-state duration is not simple."

DEPENDENT VARIABLE: F2SSDUR

INDEPENDENT VARIABLES:     NDF    DDF     Type III F-Ratio    P
WHO                          3    1035           6.56         0.0002
PALENVIR                     2      52         399.11         0.0001
POACONS                      3    1035           7.78         0.0001
PALENVIR*WHO                 6    1035          21.58         0.0001
POACONS*WHO                  9    1035           3.75         0.0001
POACONS*PALENVIR             6    1035           9.13         0.0001
POACONS*PALENVIR*WHO        18    1035           3.00         0.0001

Table 4.17. The SCS Test of Fixed Effects on the dependent variable F2SSDUR.
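The numerator degrees of freedom in Table 4.17 follow directly from the number of levels of each factor given in the SCS model description (four nationality-gender groups, three C-V environments, four consonant types). The sketch below is only a consistency check under the standard rule that an effect's numerator df is the product of (levels - 1) over the factors it involves; the helper name `numerator_df` is hypothetical.

```python
from math import prod

# Number of levels per factor, as described in the SCS model.
levels = {"WHO": 4, "PALENVIR": 3, "POACONS": 4}

# Numerator degrees of freedom (NDF) as reported in Table 4.17.
reported_ndf = {
    "WHO": 3,
    "PALENVIR": 2,
    "POACONS": 3,
    "PALENVIR*WHO": 6,
    "POACONS*WHO": 9,
    "POACONS*PALENVIR": 6,
    "POACONS*PALENVIR*WHO": 18,
}


def numerator_df(term):
    """Numerator df of a main effect or interaction:
    the product of (levels - 1) over the factors in the term."""
    return prod(levels[factor] - 1 for factor in term.split("*"))


for term, ndf in reported_ndf.items():
    status = "matches" if numerator_df(term) == ndf else "differs from"
    print(f"{term}: computed {numerator_df(term)} {status} reported {ndf}")
```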

In order to determine between which of the twelve (4 x 3) POACONS*PALENVIR environments Russians

distinguish the F2SSDUR, mean duration values for each consonant type, for each gender and for each of

the three C-V environments were calculated. Table 4.18 lists the Russians’ means:

                CʲV     CʲjV    CʲijV
Dental    FR    .047    .114    .164
          MR    .022    .099    .156
Labial    FR    .031    .113    .169
          MR    .025    .097    .151
Lateral   FR    .014    .113    .183
          MR    .029    .104    .148
Trill     FR    .030    .138    .175
          MR    .028    .121    .161

Table 4.18. Russians’ F2-steady-state duration means (in sec.) according to consonant type and C-V environment. FR = female Russians, MR = male Russians.

Figure 4.7 provides corresponding graphs of the duration means given in Table 4.18. For each C-V sequence type (the three groups on the x-axis), each bar indicates the consonant type (labial, dental, lateral, trill) for each gender (female, male).

[Figure 4.7: bar chart titled "Russians' mean F2-steady-state durations by gender and consonant type". Horizontal axis: C-V environment. Legend: D-FR, D-MR, B-FR, B-MR, L-FR, L-MR, R-FR, R-MR.]

Figure 4.7. Russians’ F2-steady-state mean durations for each of the three C-V environments. Within each C-V environment gender and consonant type are accounted for. Palatalization is indicated with an apostrophe. The C-V environment abbreviations are: C’V = simple-palatalized, C’jV = palatalized-yod, C’ijV = palatalized-i-yod; D = dental ([d’], [z’]), B = labial ([b’], [m’], [v’]), L = lateral ([l’]), R = trill ([r’]); FR = female Russians, MR = male Russians.

If we make rough estimates across gender and consonant type for each C-V sequence in Figure 4.7, we see that F2-steady-state durations are approximately 25 msec. for simple-palatalized sequences, 120 msec. for palatalized-yod sequences, and 165 msec. for palatalized-i-yod sequences. Are the mean durations for these three environments statistically different? An additional statistical analysis tested the equivalency (or non-equivalency) of the F2 durations between CʲV-CʲjV and CʲjV-CʲijV (for each gender for each consonant type). In all instances, Russians reliably (p<0.01) distinguished the three environments.
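The rough per-environment estimates can be cross-checked against the cell means of Table 4.18. The sketch below simply averages the eight cell means (four consonant types by two genders) for each environment. It is an unweighted average of means, which is my simplifying assumption and not the SCS calculation; the function name `environment_averages` is hypothetical.

```python
# Cell means from Table 4.18, converted to milliseconds.
# Each tuple is (C'V, C'jV, C'ijV).
russian_means_ms = {
    ("Dental", "FR"): (47, 114, 164),
    ("Dental", "MR"): (22, 99, 156),
    ("Labial", "FR"): (31, 113, 169),
    ("Labial", "MR"): (25, 97, 151),
    ("Lateral", "FR"): (14, 113, 183),
    ("Lateral", "MR"): (29, 104, 148),
    ("Trill", "FR"): (30, 138, 175),
    ("Trill", "MR"): (28, 121, 161),
}


def environment_averages(cells):
    """Unweighted average of the cell means for each of the three
    C-V environments, returned as a (C'V, C'jV, C'ijV) tuple in ms."""
    n = len(cells)
    return tuple(sum(cell[i] for cell in cells.values()) / n for i in range(3))


print(environment_averages(russian_means_ms))
```

The averages come out at roughly 28, 112 and 163 msec., in line with the visual estimates of about 25, 120 and 165 msec. read off Figure 4.7.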

                CʲV          CʲjV         CʲijV
Dental    FR    .047    <    .114    <    .164
          MR    .022    <    .099    <    .156
Labial    FR    .031    <    .113    <    .169
          MR    .025    <    .097    <    .151
Lateral   FR    .014    <    .113    <    .183
          MR    .029    <    .104    <    .148
Trill     FR    .030    <    .138    <    .175
          MR    .028    <    .121    <    .161

Table 4.19. Statistically reliable (p<0.01) non-equivalencies of Russians’ mean F2-steady-state durations according to consonant type and C-V environment. Mean durations are given in seconds. FR = female Russians, MR = male Russians.

The non-equivalent durations of Table 4.19 are indicated in the double-outlined cells. Because each of the three sequence types (CʲV vs. CʲjV vs. CʲijV) contains an additional segment, perhaps these results lend quantitative support for the existence of the phonemes? How do learners compare with the Russians? Do learners also distinguish the three environments by different F2-steady-state durations?

4.3.6 F2 steady-state duration: Learners

Table 4.20 gives learners’ mean F2-steady-state duration values, for consonant type and gender.

                CʲV     CʲjV    CʲijV
Dental    FE    .062    .057    .104
          ME    .033    .048    .128
Labial    FE    .054    .071    .135
          ME    .038    .044    .171
Lateral   FE    .037    .042    .116
          ME    .035    .044    .160
Trill     FE    .067    .072    .099
          ME    .058    .070    .141

Table 4.20. Learners’ F2-steady-state duration means (in sec.) according to consonant type and C-V environment. FE = female learners, ME = male learners.

Figure 4.8 gives a graph of these duration means, for each C-V environment by consonant type and gender.

[Figure 4.8: bar chart titled "Learners' mean F2-steady-state durations by gender and consonant type". Horizontal axis: C-V environment (C’V, C’jV, C’ijV). Legend: D-FE, D-ME, B-FE, B-ME, L-FE, L-ME, R-FE, R-ME.]

Figure 4.8. Learners’ F2-steady-state mean durations for each of the three C-V environments. Within each C-V environment gender and consonant type are accounted for. Palatalization is indicated with an apostrophe. The C-V environment abbreviations are: C’V = simple-palatalized, C’jV = palatalized-yod, C’ijV = palatalized-i-yod; D = dental ([d’], [z’]), B = labial ([b’], [m’], [v’]), L = lateral ([l’]), R = trill ([r’]); FE = female learners, ME = male learners.

If we make rough estimates across gender and consonant type for each C-V sequence in Figure 4.8, we see that learners produce the simple-palatalized sequences with approximately 40 msec. of F2-steady-state duration, palatalized-yod sequences with approximately 50 msec., and palatalized-i-yod sequences with approximately 125 msec.

If we compare Figure 4.7 (for Russians) and Figure 4.8 (for learners), particularly the relative durations for /CʲV/ and /CʲjV/ sequences, it is quite apparent that learners and Russians exhibit very different patterns of production. A final statistical analysis tested the equivalency (or non-equivalency) of learners’ F2 durations between the three C-V environments (for each gender and consonant type). As Table 4.21 indicates, and in agreement with previous predictions, in virtually all instances learners did not distinguish F2-steady-state durations in the simple-palatalized and palatalized-yod sequences but did in the palatalized-yod and palatalized-i-yod sequences.

                CʲV                  CʲjV                 CʲijV
Dental    FE    .062   =             .057   <             .104
          ME    .033   =             .048   <             .128
Labial    FE    .054   < (p<.05)     .071   <             .135
          ME    .038   =             .044   <             .171
Lateral   FE    .037   =             .042   <             .116
          ME    .035   =             .044   <             .160
Trill     FE    .067   =             .072   < (p<.02)     .099
          ME    .058   =             .070   <             .141

Table 4.21. Statistically reliable equivalencies and non-equivalencies (at p<0.01, unless otherwise noted) of learners’ F2 mean steady-state durations by gender, consonant type and C-V environment. FE = female learners, ME = male learners.

It is interesting to note that the one instance where learners (here, the FE group) do distinguish CʲV and CʲjV is with labials. I will comment further on this in the next section.

4.3.7 Comparing Russians’ and learners’ absolute mean values of F2-steady-state durations

The previous within-speaker and within-gender groups tests of F2-steady-state duration

established equivalencies and non-equivalencies of the F2-steady-state durations for the three C-V environments. The tests gave the following overall statistically significant results:

Russians: CʲV < CʲjV < CʲijV

Learners: CʲV = CʲjV < CʲijV

Table 4.22. Summary comparison of F2-steady-state durations for the three C-V environments for learners and native Russians.

We have established that Russians produce each successive C-V environment (each having an additional segment) with a longer F2-steady-state duration. Learners, however, do not successfully reproduce the Russian pattern, producing the simple-palatalized and palatalized-yod environments with the same F2-steady-state duration.

However, an additional interesting comparison test remains. How do the Russians’ and learners’ absolute mean F2-steady-state durations compare? Do the learners produce not only the incorrect relative durations (i.e., learners’ pattern: CʲV = CʲjV < CʲijV vs. Russians’ pattern: CʲV < CʲjV < CʲijV) but also the incorrect absolute duration means (i.e., learners’ F2-steady-state duration in CʲV ≠ Russians’ F2-steady-state duration in CʲV)?

A test of fixed effects compared the absolute F2 steady-state durations of the three C-V

environments for female Russians and learners, and for male Russians and learners. Table 4.23 gives the

results of the statistical test.

Female Russians      Female learners        Male Russians      Male learners

CʲV          <       CʲV                    CʲV          =     CʲV
        (25 msec.)                          (learners’ are 15 msec. longer)

CʲjV         >       CʲjV                   CʲjV         >     CʲjV
        (59 msec.)                                   (53 msec.)

CʲijV        >       CʲijV                  CʲijV        =     CʲijV
        (59 msec.)                          (learners’ are 3 msec. shorter)

Table 4.23. Comparison of learners' and Russians' absolute F2-steady-state durations at p<0.05. The absolute difference between Russians’ and learners’ mean durations is given in milliseconds under the “greater than,” “less than,” or “equal” signs.
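The absolute differences reported in Table 4.23 can be approximated from the cell means in Tables 4.18 and 4.20. The following sketch averages the four consonant-type means within each gender and subtracts the Russians' average from the learners'. This unweighted averaging is an assumption made for illustration; the SCS figures come from the fitted mixed model, so small discrepancies are to be expected. The helper names are hypothetical.

```python
ENVIRONMENTS = ("C'V", "C'jV", "C'ijV")

# Cell means in milliseconds, (C'V, C'jV, C'ijV), from Table 4.18.
russians_ms = {
    "F": {"Dental": (47, 114, 164), "Labial": (31, 113, 169),
          "Lateral": (14, 113, 183), "Trill": (30, 138, 175)},
    "M": {"Dental": (22, 99, 156), "Labial": (25, 97, 151),
          "Lateral": (29, 104, 148), "Trill": (28, 121, 161)},
}

# Cell means in milliseconds, (C'V, C'jV, C'ijV), from Table 4.20.
learners_ms = {
    "F": {"Dental": (62, 57, 104), "Labial": (54, 71, 135),
          "Lateral": (37, 42, 116), "Trill": (67, 72, 99)},
    "M": {"Dental": (33, 48, 128), "Labial": (38, 44, 171),
          "Lateral": (35, 44, 160), "Trill": (58, 70, 141)},
}


def mean_by_environment(cells):
    """Unweighted mean over consonant types for each environment."""
    return [sum(v[i] for v in cells.values()) / len(cells) for i in range(3)]


def learner_minus_russian(gender):
    """Learners' mean minus Russians' mean (ms) per environment.
    Positive values mean the learners' steady state is longer."""
    russian = mean_by_environment(russians_ms[gender])
    learner = mean_by_environment(learners_ms[gender])
    return {env: learner[i] - russian[i] for i, env in enumerate(ENVIRONMENTS)}


for gender in ("F", "M"):
    print(gender, learner_minus_russian(gender))
```

Under this simplification the female learners' differences are about +25, -59 and -59 msec., and the male learners' about +15, -54 and -4 msec., close to the values given in Table 4.23.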

Overall, male learners were more successful than female learners in reproducing “Russian-like” steady-state durations. For example, female learners produce the simple-palatalized sequence with a palatal closure that is too long. Perhaps this indicates the presence of a separate front glide segment, thereby providing quantitative evidence that learners produce Russian /Cʲ/ as two segments, /C/+/j/. In contrast, male learners produce CʲV sequences with a steady-state duration of statistically the same length as Russians’. It is interesting to note, however, that while male learners’ values are not statistically longer, their absolute mean duration value is 15 msec. longer than Russians’. Perhaps male learners’ absolute 15 msec. longer duration is additional quantitative support for the description of accented productions as /C/ + /j/?

Both female and male learners incorrectly produced the palatalized-yod sequences. Interestingly, their absolute deviations are almost identical: 59 msec. for females and 53 msec. for males. How do the learners’ “too short” productions factor into our original hypothesis of nonnative productions of /Cʲ/ as /C/+/j/?^ These findings seem to either (1) negate learners’ attempts to produce a separate /j/ segment, or (2) indicate that Russians’ measurable palatal closure of palatalized-yod is longer than that of learners’ /j/ in /C/+/j/.

With the palatalized-i-yod sequence, male learners again perform better than their female counterparts. Female learners do not produce the sequence with a sufficiently long palatal closure. Their absolute mean duration is 59 msec. shorter than Russians’ absolute mean. Virtually reproducing male Russians’ exact absolute mean duration value (being only 3 msec. shorter), male learners have statistically the same duration as Russians.

Thus, it seems that while the male learners do a better overall job at producing “Russian-like” palatal closure durations for the three C-V environments, in general, learners produce simple-palatalized with a palatal closure that is too long and palatalized-yod with a palatal closure that is too short. This is initial evidence that learners’ productions are somewhere between the extremes of bi-segmental and mono-segmental productions.

^ Traditional categorical phonological descriptions of nonnative soft consonants describe accented production of the palatalized consonants as two segments, /C/ + /j/. Therefore, from a phonological point of view one could hypothesize that the F2-steady-state durations in learners’ production of /CʲV/ and /CʲjV/ represent a distinct front glide segment. However, from an acoustic-phonetic orientation, given the continuous, overlapping nature of speech, absolute invariant segment durations do not exist. It is not possible to set absolute maximums and minimums of duration which would support or reject the existence of a segment, since speech rate directly affects segment duration. In fact, even though a segment may not be realized acoustically due to gestural overlap, there can still be evidence of an intention to produce the segment. Thus, even with the apparent F2 steady-state durations in the Americans' CʲV productions, we cannot state definitively that they are producing a separate yod segment. There is some type of front glide element, but the data cannot support positing a full segment. Moreover, since most of the AE subjects were advanced graduate students, we should assume that they have acquired the Russian soft consonants to some degree and, therefore, do not produce soft consonants as /C/ + /j/. When producing the simple-palatalized sequences, Russian subjects also produced a measurable F2-steady-state duration following primary closure release. We would obviously not posit a separate yod segment for the Russians’ productions. Rather, this steady-state portion can be attributed to the continuous overlapping nature of speech in which the effects of the secondary palatalizing closure are maintained for a short period of time after primary closure release, as the tongue pulls away from the palate, moving towards the articulation of the following vowel.

4.4 Summary of statistical analyses of Russians’ and learners’ F2 formant trajectories

Table 4.24 combines results of all statistical analyses reported in this chapter: learners’ and

Russians’ relative F2-at-transition-onset values, relative F2-steady-state duration values and absolute

F2-steady-state duration values for the three C-V sequences:

Relative F2-at-transition onset

Russians:  CʲV  <  CʲjV  =  CʲijV
Learners:  CʲV  =  CʲjV  =  CʲijV

Relative F2-steady-state duration

Russians:  CʲV  <  CʲjV  <  CʲijV
Learners:  CʲV  =  CʲjV  <  CʲijV

Absolute F2-steady-state duration

Learners        Russians
CʲV      >      CʲV
CʲjV     <      CʲjV
CʲijV    <      CʲijV

Table 4.24. Summary of statistical analyses of F2 trajectories for native Russians and learners.

At this point in our discussion it would be useful to return to graphs which illustrate the statistically reliable differences between the two language groups. Figure 4.9 gives contrasting F2 trajectory patterns of native Russians and learners.

[Figure 4.9: two panels, "Female Russians" (upper) and "Female learners" (lower), each showing F2 trajectories labeled /C’V/, /C’jV/ and /C’ijV/. Horizontal axis: Time (sec.), from -0.2 to 0.2. Vertical axis ticks run from 1100 to 2900.]

Figure 4.9. F2 trajectories contrasting representative production of the three C-V sequences. The upper graph is for female Russians, the lower graph is for female learners.

CʲV vs. CʲjV: Russians clearly distinguish the simple-palatalized and palatalized-yod sequences. Not only is the palatalized-yod sequence characterized by a reliably higher F2-at-transition-onset frequency, but the F2-steady-state duration is reliably longer. These acoustic differences provide excellent support for the claim that the palatalized-yod sequence contains an additional front glide segment, /j/. Even with numerous years of formal study and often substantial time spent in-country, learners have not yet acquired this distinction. In fact, learners produce the simple-palatalized and palatalized-yod sequences with both the reliably same F2-at-transition-onset frequency and the same F2-steady-state duration (which is longer than Russians’ CʲV and shorter than Russians’ CʲjV). There is obviously some quality of the palatalized-yod sequence that eludes learners of Russian.^

CʲjV vs. CʲijV: Learners seem to have greater success mimicking the contrast between palatalized-yod and palatalized-i-yod sequences. Russians produce palatalized-yod and palatalized-i-yod sequences with the reliably same F2-at-transition-onset frequency. This is not surprising since in both sequences the vowel is directly preceded by a yod. Palatalized-i-yod sequences are also characterized by a reliably longer F2-steady-state duration prior to transition onset. This is also not unexpected since palatalized-i-yod sequences contain an additional vocalic segment, [i]. Like the Russians, learners do not distinguish the F2-at-transition-onset frequencies for palatalized-yod and palatalized-i-yod sequences. It should be noted, however, that since learners do not distinguish F2-at-transition-onset for any of the

The issue of orthography must be addressed here. In their study of Russian, most beginning students quickly learn to recognize and approximate the Russian orthographic representation of palatalized consonant + vowel (e.g. бя, бю, тя, etc.). Beginning students, however, often do not understand the orthographic representation of the palatalized + yod + vowel combination (CʲjV), in which the yod is represented orthographically as a soft sign (бья, бью, тья). In fact, it is often the case that students learn the underlying phonemic reality of the CʲjV orthographic sequence only at more advanced levels (e.g., third- or fourth-year or even graduate levels). In this study, all data were from graduate-level learners who had taken a course on Russian phonetics and, therefore, had been exposed to the important orthographic-phonetic-phonemic reality. However, even with their previous training in Russian phonetics, many subjects felt uncomfortable producing CʲjV sequences. In fact, while reading the word list, many of the subjects stopped and commented negatively about their ability to produce the CʲjV combination, stating, for example: "I hate this sound. I never know what to do with it," or "I’ve been told what it’s supposed to sound like but I still can’t hear it and can’t produce it." Thus, even though most of the students had received formal instruction on CʲjV and had had substantial in-country experience, the sequences continue to elude even advanced-level language learners. The quantitative results of this acoustic production experiment, which demonstrate that our learners do not distinguish CʲV and CʲjV, substantiate learners’ feelings of lack of proficiency.

three C-V sequences, we must ask ourselves if learners are not producing all three with the “wrong” (perhaps too low) F2-at-transition-onset frequency. (A direct comparison of F2 frequencies between the Russians and learners for the three C-V environments, however, is not possible since it would require normalization for subjects’ different vocal tract lengths.) Like Russians, the learners produce the palatalized-i-yod sequences with longer F2-steady-state durations than the palatalized-yod sequences. However (while male learners were more successful in reproducing the absolute mean values), in general, learners produced the palatalized-yod and palatalized-i-yod closure with a duration shorter than Russians’. This is especially true for the palatalized-yod environment. Comparing learners’ success in reproducing Russians’ distinction of palatalized-yod and palatalized-i-yod sequences, it seems that the presence of the vowel [i] greatly aids learners in the articulation of these Russian sequences.

4.5 Acoustic measurements and their associated articulations

Recall that the second formant resonant band reflects the front-back position (and to some extent height) of the tongue in the oral cavity. Thus, F2 measurements provide a non-invasive means of determining general articulatory patterns of the tongue body. Also recall that Russian palatalization is articulated by making a primary closure while simultaneously raising the tongue dorsum forwards and up towards the hard palate, resulting in the superimposition of an i-like articulation.

At this point in our discussion, it is important to briefly address the relationship between constriction location (frontness-backness), degree of closure (tongue height) and the resulting F2 acoustic properties. Fant’s Acoustic Theory of Speech Production (1970:84) gives an informative nomogram illustrating this relationship:

Constant constriction degree, variable constriction location. According to the nomogram, for a closure of constant area, or a constant degree of closure (here, a closure having an area of .32 cm²), which is articulated at 4.5, 5.5 and 6.0 cm. from the lip opening, the resulting F2 frequency is 2,300, 2,500, and 2,300 Hz., respectively, so that a closure produced at approximately 5.5 cm. behind the lips results in the highest F2 frequency. A closure produced either forward of or behind this “maximal point” will result in an F2 of a lower frequency.

Variable constriction degree, constant constriction location. If the constriction location is held constant at 5.5 cm. behind the lip opening, for three constrictions having different degrees of closure the F2 varies. For areas of .32, 1.3 and 5.0 cm² (in order of greatest constriction to least constriction) the resulting F2 frequency is 2,500, 2,200 and 1,700 Hz., respectively. Thus a greater degree of constriction results in a higher F2 value.

Finally, the nomogram indicates that a closure with a greater degree of constriction yet further

back in the mouth can have a lower F2 value than a more open constriction which is produced further

forward in the mouth.
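The two relationships read off the nomogram can be checked with a short sketch. It uses only the five F2 values quoted above; the dictionary layout and the function names are illustrative assumptions, not part of Fant's presentation.

```python
# F2 values (Hz) quoted in the text from the nomogram, keyed by
# (constriction area in cm^2, distance from the lip opening in cm).
NOMOGRAM_F2_HZ = {
    (0.32, 4.5): 2300,
    (0.32, 5.5): 2500,
    (0.32, 6.0): 2300,
    (1.3, 5.5): 2200,
    (5.0, 5.5): 1700,
}


def f2_by_location(area_cm2):
    """F2 as a function of constriction location for one fixed area."""
    return {loc: f2 for (area, loc), f2 in NOMOGRAM_F2_HZ.items() if area == area_cm2}


def f2_by_area(location_cm):
    """F2 as a function of constriction area for one fixed location."""
    return {area: f2 for (area, loc), f2 in NOMOGRAM_F2_HZ.items() if loc == location_cm}


# Constant degree of closure (.32 cm^2): which location gives the highest F2?
by_location = f2_by_location(0.32)
peak_location = max(by_location, key=by_location.get)

# Constant location (5.5 cm): F2 listed from tightest to most open constriction.
by_area = f2_by_area(5.5)
f2_tightest_first = [by_area[area] for area in sorted(by_area)]

print("Location of highest F2 (cm):", peak_location)
print("F2 from tightest to most open (Hz):", f2_tightest_first)
```

With only these quoted points, the sketch cannot illustrate the third observation (a tighter but more retracted constriction having a lower F2 than a more open, more forward one); that would require further values from the nomogram itself.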

Having summarized acoustic distinctions of Russians’ and learners’ productions and having proposed that learners’ accented productions stem from negative transfer of L1 /C/+/j/ sequences, in the next section I present acoustic data of learners’ productions of AE words containing /C/+/j/ word-initial sequences.

4.6 Learners’ productions of AE /C/ + /j/ + /u/ versus /CʲV/, /CʲjV/ and /CʲijV/

The next logical step in this investigation is to examine native-English speakers’ productions of AE /C/+/j/ sequences. A useful approach would be to compare learners’ productions of native-English /C/+/j/ sequences with their productions of Russian palatalized C-V sequences. How do learners’ native-English productions compare to their productions of L2 Russian palatalized consonants? Do learners actually distinguish these “similar” productions in the two languages? Or do they produce the L1 English /C/+/j/ sequences and L2 Russian palatalized consonants with similar articulatory patterns, thereby indicating negative transfer?

I have already suggested that, while English /C/+/j/ sequences are bi-segmental, “similar” Russian palatalized consonants are mono-segmental. Previous sections of this chapter have demonstrated that learners produce Russian palatalized sequences with a palatalizing closure that is of incorrect duration (too long for /CʲV/ and too short for /CʲjV/) and of incorrect relative degree of closure.

However, I have also proposed that learners can gradually acquire L2 properties of inter-articulatory timing. If learners have, indeed, internalized some degree of the inter-articulatory timings of Russian palatalization, then we would expect learners to produce English /C/+/j/ with a Tongue Body gesture that is more sequential to the /C/ gesture (i.e., temporally later) than in their productions of Russian palatalization.

4.6.1 Methods

4.6.1.1 Subjects

Subjects were four graduate students (2 females, 2 males) who had participated in the original production study reported in section 2.1 of this chapter. All four subjects were native American-English-speaking graduate students who had begun their study of Russian after the age of 12. All subjects had formally studied Russian in the United States for a minimum of five years and had lived in Russia for 6 to 15 months. (Subjects’ in-country experience was quite substantial, in fact, with all but one having lived there at least one year. The four individual subjects’ duration of time spent in-country was 6, 13, 14, and 15 months, respectively.)

4.6.1.2 Procedures

Recordings were made using the same equipment described in section 2 (above) in the same sound-attenuated chamber in the Department of Linguistics at The Ohio State University. Subjects were each paid $10 for their participation.

The sound sequences under investigation were English word-initial /C/+/j/ sequences where /C/ is a voiced labial: /b/, /m/ or /v/. Similar to the production study of Russian C-V sequences in section 2 of this chapter, subjects read a word list comprised of a carrier item followed immediately by a repetition of the target /C/+/j/+/u/ sequence. The list was comprised of three words: ‘beautiful’, ‘music’, and ‘view’. Therefore, subjects read “beautiful-beau” [bjutifal-bju], “music-mu” [mjuzik-mju], “view-view” [vju-vju]. Subjects were instructed to read through this short word list eight times.

There were, therefore, eight recordings of each labial sequence. Subjects were instructed to speak at a normal rate and to not hyperarticulate. Each recording session lasted approximately 15 minutes. During the session, I asked subjects to speak at slightly faster and slower rates. So that the isolated sequences would be produced as naturally as possible, I asked subjects to try to produce the repeated isolated sequences as they would when producing the beginning of the associated (preceding) authentic word.

4.6.1.3 Data analysis

I listened to subjects’ recordings and chose for each subject the three most natural sounding ones: one at a slightly slower rate, one at an intermediate rate and one at a slightly faster rate. Therefore, for each subject, nine (3 labials x 3 readings) productions were digitized and spectrographically analyzed with the Kay Elemetrics Computerized Speech Lab (Model 5400B). All measurements were made on the repeated isolated sequences. Samples were digitized at 10,000 samples/second and 16 bits/sample. Vowel formant frequencies were automatically tracked using autocorrelation-LPC analysis with preemphasis and LPC coefficients. As in section 4.2.S.3 above, five F2 measurements (3 frequency and 2 duration) were made. Measurements were taken at the same points as with learners’ productions of Russian sequences: F2 frequency at consonant release, F2 frequency at transition onset, F2 frequency at vowel steady-state onset, F2 steady-state duration preceding transition onset, and F2 transition duration. (See Figure 4.3 for associated measurements made on Russian sequences.) While I do not statistically analyze these data as I did in previous sections of this chapter, I do provide comparative spectrograms and F2 trajectories.
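The relationship among the five F2 measurements can be made concrete with a minimal sketch. It assumes that each token is reduced to three time-stamped F2 readings, from which the two durations follow by subtraction; the class name, field names and the sample values are hypothetical and are not taken from the recordings.

```python
from dataclasses import dataclass


@dataclass
class F2Token:
    """Three time-stamped F2 readings for one C-V token (times in sec, F2 in Hz)."""
    t_release: float             # consonant release
    f2_release: float
    t_transition_onset: float    # end of the F2 steady state, start of the transition
    f2_transition_onset: float
    t_vowel_onset: float         # vowel steady-state onset, end of the transition
    f2_vowel_onset: float

    def steady_state_duration(self) -> float:
        """F2 steady-state duration preceding transition onset."""
        return self.t_transition_onset - self.t_release

    def transition_duration(self) -> float:
        """Duration of the F2 transition into the vowel."""
        return self.t_vowel_onset - self.t_transition_onset

    def five_measurements(self) -> dict:
        """The 3 frequency and 2 duration measures named in the text."""
        return {
            "F2 at consonant release (Hz)": self.f2_release,
            "F2 at transition onset (Hz)": self.f2_transition_onset,
            "F2 at vowel steady-state onset (Hz)": self.f2_vowel_onset,
            "F2 steady-state duration (sec)": self.steady_state_duration(),
            "F2 transition duration (sec)": self.transition_duration(),
        }


# Hypothetical token, for illustration only.
token = F2Token(0.120, 2200.0, 0.180, 2150.0, 0.350, 1100.0)
for name, value in token.five_measurements().items():
    print(f"{name}: {value:.3f}")
```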

4.6.2 Results

For purposes of illustration and comparison, I made spectrograms of each subject’s production of English /b/+/j/+/u/. I then returned to the four subjects’ original recordings of the Russian C-V sequences and made spectrograms of their three Russian palatalized sequences /bʲu/, /bʲju/ and /bʲiju/.

Figure 4.10 gives sample spectrograms for the one AE sequence and three Russian sequences for one female subject and one male subject.

Figure 4.10. Sample spectrograms comparing advanced learners’ productions of AE /b/+/j/+/u/ with their productions of Russian /bʲu/, /bʲju/ and /bʲiju/. Spectrograms are aligned according to onset of vowel transition, indicated by the vertical line. The top four spectrograms are from the productions of one male graduate-student learner; the bottom four spectrograms are for one female graduate-student learner. Time is indicated on the x-axis. Frequency (in Hertz) is indicated on the y-axis. Phonetic transcriptions of the sequences are given below each spectrogram. Palatalization is indicated with an apostrophe and not IPA superscript ‘j’.

[Figure 4.10: sample spectrograms of English /C/+/j/ and Russian /C'V/, /C'jV/ and /C'ijV/ for advanced male learner #1 and advanced female learner #1. Panels for each learner, top to bottom: English /b/+/j/+/u/, Russian /b'iju/, Russian /b'ju/, Russian /b'u/. x-axis: time; y-axis: frequency (Hz.), 1,000 to 4,000; the vertical line marks transition onset.]

As in Figure 4.4, the spectrograms in Figure 4.10 are aligned according to onset of vowel transition, indicated by the vertical line. Clearly these two subjects produce these English and Russian sequences with different articulatory patterns of timing, which is especially apparent in the general shape of the F2 transition. In comparison to their productions of Russian /bʲu/, /bʲju/ and /bʲiju/, subjects seem to produce English /b/+/j/+/u/ with an F2 transition of longer duration and shallower slope.

However, before drawing conclusions from a comparison of learners’ productions from the two different recording sessions, learners’ possible differing rates of speech must be considered. If learners were simply speaking more slowly during the recording of AE sequences, then a direct comparison of F2 transition durations is invalid. In fact, if we look at the spectrograms of male learner #1, it does appear that he might have been speaking slightly more slowly during the recording of AE sequences; this is apparent in that the vowel /u/ in AE /b/+/j/+/u/ is of a longer duration than the vowel /u/ in his productions of Russian /bʲu/, /bʲju/ and /bʲiju/. On the other hand, female learner #1 seems to have been speaking at virtually the same rate during both recording sessions. This is apparent in the fact that the vowel /u/ in English /b/+/j/+/u/ is of approximately the same duration as in her productions of Russian /bʲu/, /bʲju/ and /bʲiju/. Thus it seems that female learner #1 does distinguish between her English and Russian articulatory timings. For this post-hoc analysis of English sequences containing a palatal glide, I will assume that learners spoke at approximately the same rate during both recording sessions. My assumption is supported by the fact that graphs and final conclusions are based on average values from each learner’s three recordings, where each recording was at a slightly different speed. In this manner, I hope that possible variations in vowel duration due to rate of speech are somewhat accounted for.

Do all four subjects exhibit this same pattern over all three labial consonants? In order to determine learners’ general F2 patterns for the three labials, I returned to the data from the original production study described in section 2 above. I extracted relevant data for each of the four subjects’ five F2 data points for their productions of Russian palatalized C-V sequences where /C/ = /bʲ/, /mʲ/ or /vʲ/ and V = /u/. I then calculated mean values for each of the four subjects, averaging across the three labial consonants (/bʲ/, /mʲ/ and /vʲ/) and for the single vowel, /u/. F2 frequency data were then time-aligned according to the onset of transition. Table 4.25 gives each subject’s mean values.

Speaker            C-V sequence   Place of F2 frequency      F2 frequency   Relative time to C-V
                   type           measurement                (Hz.)          transition onset (sec.)

Female learner #1  /C/+/j/+/V/    Consonant release          2369           -0.054
                                  Transition onset           2283            0.000
                                  Vowel steady-state onset   1375            0.165
                   /CʲV/          Consonant release          2448           -0.054
                                  Transition onset           2371            0.000
                                  Vowel steady-state onset   1112            0.099
                   /CʲjV/         Consonant release          2572           -0.070
                                  Transition onset           2425            0.000
                                  Vowel steady-state onset   1135           -0.115
                   /CʲijV/        Consonant release          2374           -0.156
                                  Transition onset           2431            0.000
                                  Vowel steady-state onset   1031            0.112

Female learner #2  /C/+/j/+/V/    Consonant release          2377           -0.094
                                  Transition onset           2364            0.000
                                  Vowel steady-state onset   1273            0.209
                   /CʲV/          Consonant release          2564           -0.063
                                  Transition onset           2604            0.000
                                  Vowel steady-state onset   1192            0.193
                   /CʲjV/         Consonant release          2541           -0.080
                                  Transition onset           2523            0.000
                                  Vowel steady-state onset   1192            0.226
                   /CʲijV/        Consonant release          2379           -0.107
                                  Transition onset           2581            0.000
                                  Vowel steady-state onset   1019            0.261

Male learner #1    /C/+/j/+/V/    Consonant release          2131           -0.071
                                  Transition onset           2066            0.000
                                  Vowel steady-state onset   1004            0.197
                   /CʲV/          Consonant release          2126           -0.022
                                  Transition onset           2143            0.000
                                  Vowel steady-state onset    881            0.112
                   /CʲjV/         Consonant release          2224           -0.040
                                  Transition onset           2200            0.000
                                  Vowel steady-state onset    829            0.113
                   /CʲijV/        Consonant release          2062           -0.177
                                  Transition onset           2218            0.000
                                  Vowel steady-state onset    864            0.123

Male learner #2    /C/+/j/+/V/    Consonant release          2197           -0.074
                                  Transition onset           2229            0.000
                                  Vowel steady-state onset   1186            0.172
                   /CʲV/          Consonant release          2276           -0.052
                                  Transition onset           2356            0.000
                                  Vowel steady-state onset    967            0.140
                   /CʲjV/         Consonant release          2212           -0.063
                                  Transition onset           2229            0.000
                                  Vowel steady-state onset    910            0.123
                   /CʲijV/        Consonant release          2102           -0.190
                                  Transition onset           2264            0.000
                                  Vowel steady-state onset    893            0.133

Table 4.25. Table of four advanced learners’ time-aligned F2 frequency and duration measurements of “similar” English and Russian sequences. The following abbreviations are used: /C/+/j/+/V/ = American-English sequences, /CʲV/ = Russian simple-palatalized sequences, /CʲjV/ = Russian palatalized-yod sequences, /CʲijV/ = Russian palatalized-i-yod sequences. Means represent values averaged over the three labial consonants /b/, /m/, /v/ for the vowel /u/.
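The time alignment and averaging behind a table of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: each token is a mapping from measurement point to an absolute (time, F2) pair, alignment subtracts the time of transition onset, and the three tokens shown are invented values, not data from the study.

```python
from statistics import mean

POINTS = ("consonant_release", "transition_onset", "vowel_steady_state_onset")


def align_to_transition_onset(token):
    """Re-express all times relative to transition onset (which becomes 0.0)."""
    t0 = token["transition_onset"][0]
    return {point: (t - t0, f2) for point, (t, f2) in token.items()}


def average_tokens(tokens):
    """Average relative time and F2 at each measurement point across tokens."""
    return {
        point: (
            mean(tok[point][0] for tok in tokens),
            mean(tok[point][1] for tok in tokens),
        )
        for point in POINTS
    }


# Hypothetical tokens for the three labials: (absolute time in sec, F2 in Hz).
tokens = [
    {"consonant_release": (0.50, 2100), "transition_onset": (0.55, 2150),
     "vowel_steady_state_onset": (0.67, 900)},    # /b/ token
    {"consonant_release": (0.40, 2200), "transition_onset": (0.43, 2250),
     "vowel_steady_state_onset": (0.54, 850)},    # /m/ token
    {"consonant_release": (0.61, 2000), "transition_onset": (0.65, 2050),
     "vowel_steady_state_onset": (0.75, 950)},    # /v/ token
]

aligned = [align_to_transition_onset(tok) for tok in tokens]
means = average_tokens(aligned)
for point in POINTS:
    rel_time, f2 = means[point]
    print(f"{point}: {f2:.0f} Hz at {rel_time:+.3f} sec")
```

Because both steps are linear, averaging first and aligning second would give the same means.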

Figure 4.11 below provides time-aligned graphs of the mean F2 values from Table 4.25.

Figure 4.11. Time-aligned F2 trajectories for four advanced learners for the three Russian palatalized C-V environments and English /C/+/j/+/V/. The upper two graphs give F2 trajectories for two male learners; the lower two graphs give F2 trajectories for two female learners. Russian palatalization is indicated with an apostrophe and not IPA superscript ‘j’. The following abbreviations are used: /C/+/j/ = AE sequence /C/+/j/+/V/, /CʲV/ = Russian simple-palatalized sequences, /CʲjV/ = Russian palatalized-yod sequences, /CʲijV/ = Russian palatalized-i-yod sequences. Trajectories indicate data averaged over the three labial consonants /b/, /m/, /v/ where the vowel is /u/.

[Figure 4.11: F2 trajectories for three Russian palatalized C-V sequences and English /C/+/j/, one panel each for advanced male learner #1, advanced male learner #2, advanced female learner #1 and advanced female learner #2. Series: C'V, C'jV, C'ijV and C + j. x-axis: time (sec.), -0.2 to 0.3; y-axis: F2 frequency (Hz.).]

The graphs in Figure 4.11 for the four advanced learners present a very interesting pattern. Of particular interest is how male subject #1, male subject #2 and female subject #1 distinguish English /C/+/j/ (indicated with the triangle) from their productions of the three Russian palatalized sequences. In agreement with the sample spectrograms of Figure 4.10, these three subjects produce the English sequences with an F2 transition of longer duration and shallower slope. Female subject #2 does not reproduce this pattern, instead producing both the English and Russian sequences with F2 transitions of approximately the same duration and slope.
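The "longer duration, shallower slope" comparison can be approximated numerically from the Table 4.25 means. The sketch below is an illustration only: it treats the transition as a straight line between transition onset and vowel steady-state onset (a simplifying assumption), and it copies the values for male learner #1 and female learner #2 from the table. The names are hypothetical.

```python
ENGLISH = "/C/+/j/+/V/"

# Per sequence type: (F2 at transition onset in Hz,
#                     F2 at vowel steady-state onset in Hz,
#                     time of vowel steady-state onset in sec, relative to transition onset)
TABLE_4_25 = {
    "Male learner #1": {
        ENGLISH: (2066, 1004, 0.197),
        "C'V": (2143, 881, 0.112),
        "C'jV": (2200, 829, 0.113),
        "C'ijV": (2218, 864, 0.123),
    },
    "Female learner #2": {
        ENGLISH: (2364, 1273, 0.209),
        "C'V": (2604, 1192, 0.193),
        "C'jV": (2523, 1192, 0.226),
        "C'ijV": (2581, 1019, 0.261),
    },
}


def transition_slope(f2_onset, f2_vowel, duration):
    """Straight-line F2 slope in Hz/sec (negative for a falling transition)."""
    return (f2_vowel - f2_onset) / duration


def english_is_longer_and_shallower(rows):
    """True if the English transition is longer and less steep than every Russian one."""
    english = rows[ENGLISH]
    english_steepness = abs(transition_slope(*english))
    russian = [values for name, values in rows.items() if name != ENGLISH]
    return all(
        english[2] > values[2] and english_steepness < abs(transition_slope(*values))
        for values in russian
    )


for speaker, rows in TABLE_4_25.items():
    for name, values in rows.items():
        print(f"{speaker} {name}: {values[2]:.3f} sec, {transition_slope(*values):.0f} Hz/sec")
    print(speaker, "English longer and shallower:", english_is_longer_and_shallower(rows))
```

Under this linear approximation the check holds for male learner #1 and fails for female learner #2, in line with the description of Figure 4.11.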

4.6.3 Discussion

What do the F2 trajectories in Figure 4.11 indicate about advanced learners’ articulatory patterns of production of English /C/+/j/ sequences versus Russian palatalized C-V sequences? The longer and shallower F2 transition of the English sequences can be associated with a palatal (Tongue Body) gesture that is underlyingly either temporally later or less stiff than the gesture used to produce Russian palatalization and the Russian palatal glide. If we attribute the shallower F2 slope to decreased gestural stiffness, then we can say that learners take longer to retract their tongue body from the palatal constriction of English /C/+/j/ than from the palatalizing constriction of Russian palatalization. However, in order to make definitive statements about the role of gestural stiffness, additional studies of tongue kinematics are needed. At the present time I do not have access to such data. I will, therefore, attribute the apparent longer F2 transition in English sequences only to a Tongue Body gesture that has a later, more sequential, gestural timing than in L2 Russian sequences.

Thus, three of the four speakers in Figure 4.11 support my hypothesis that, when acquiring the palatalized consonants of Russian, though learners might begin with their most “similar” L1 sequence, /C/+/j/, they gradually adjust their articulatory timings to better approximate native-Russian timings.

Native-English productions of /C/+/j/ are characterized by a palatal Tongue Body gesture that is produced sequentially (in relation to the bilabial closure) and, therefore, extends longer after the release of the primary consonantal articulation. Quite surprisingly, learners actually seem to be acquiring the subtle articulatory timings of Russian palatalized consonants and have adjusted the palatal gesture so that it occurs earlier in relation to the labial closure than it does in English /C/+/j/+/V/.

Learners produce native-English /C/+/j/ sequences with a “later” Tongue Body gesture than when producing Russian palatalized consonants. Moreover, previous analyses in section 2 of this chapter indicated that learners’ productions of Russian simple-palatalized consonants exhibit a Tongue Body gesture that is still too long, or still too sequential. I, therefore, suggest that the graphs in Figure 4.11 indicate learners’ improved yet still accented productions. I propose that these facts evidence learners’ “imperfect,” “intermediate” L2 interphonetics.

4.7 Native and nonnative palatalization in the Gestural Phonology framework

Gestural Phonology is an excellent tool for associating acoustic data with their articulatory source events. In particular, the Gestural framework can incorporate the results of this study, providing a new detailed non-discrete description of L1 and L2 productions of Russian palatalized sequences.

Recall that Browman and Goldstein (1989) offer a model that is based on hierarchically organized articulatory tiers. (See chapter 3 for a detailed presentation.) The first level is comprised of three tiers: (1) Velic, (2) Oral and (3) Glottal. The Oral tier is further broken down into: (1) Tongue Body, (2) Tongue Tip and (3) Lips. Because Russian palatalization involves raising of the tongue dorsum, the secondary palatalizing articulation is indicated as a gesture on the Tongue Body tier. Depending on the segment’s place of articulation (labial, dental, alveolar, etc.), the primary closure is indicated on any of the existing tiers (Lip Aperture, Tongue Tip, etc.). In the following illustrations, all native productions will be indicated with solid lines; learners’ productions will be indicated with broken lines.

Let us begin with a simplified Gestural account of a Russian palatalized consonant. See Figure 4.12. For ease of presentation, I choose the Russian voiced bilabial palatalized stop consonant, [bʲ]. The choice of a bilabial consonant simplifies the discussion and presentation, since the primary and secondary closures are indicated on separate tiers (the Lip Aperture and Tongue Body tiers, respectively); therefore, inter-gestural timings between the two gestures are evident.

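[Figure 4.12: gestural score showing the bilabial closure gesture (b) and the palatalizing gesture (j) on the Tongue Body tier.]

Figure 4.12. Abbreviated Gestural model of Russian /bʲ/ as produced by native speakers.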

Figure 4.12 illustrates native speakers’ gestures for the primary bilabial closure and secondary raising of the tongue body. Note that the gesture onsets are simultaneous. While my acoustic data do not address this particular property of the palatalized consonants, I have thus far found no literature that counters this proposition. More importantly, note the relative timing of the two gestures’ release. In light of the acoustic data, I have indicated that the palatalizing gesture is released just after the end of the gesture for the primary closure; this is due to the brief but measurable F2-steady-state duration present in the Russians’ spectrograms (34 msec. for females and 25 msec. for males).
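A gestural score of this kind can be represented as timed intervals on tiers, which makes the two timing relations (onset lag and release lag) explicit. The sketch below is a minimal illustration: the class and function names are hypothetical, the 100 msec. lip closure is an invented value, and only the 34 msec. release lag is taken from the female Russians' F2-steady-state duration reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gesture:
    """One gesture on one articulatory tier; times in seconds."""
    tier: str
    onset: float
    release: float


def onset_lag(primary: Gesture, secondary: Gesture) -> float:
    """How much later the secondary gesture begins than the primary one."""
    return secondary.onset - primary.onset


def release_lag(primary: Gesture, secondary: Gesture) -> float:
    """How much later the secondary gesture is released than the primary one."""
    return secondary.release - primary.release


# Native-like [b'] sketch: simultaneous onsets, Tongue Body released shortly
# after the lips. Lip closure duration is hypothetical; the 0.034 sec. lag
# mirrors the female Russians' F2-steady-state duration.
lip_aperture = Gesture("Lip Aperture", onset=0.000, release=0.100)
tongue_body = Gesture("Tongue Body", onset=0.000, release=0.100 + 0.034)

print(f"Onset lag: {onset_lag(lip_aperture, tongue_body):.3f} sec")
print(f"Release lag: {release_lag(lip_aperture, tongue_body):.3f} sec")
```

A fully sequential /b/ + /j/ production would instead have an onset lag equal to the whole duration of the lip closure.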

How would the Gestural Model account for learners’ accented productions of the simple-palatalized consonants? Recall that traditional accounts describe nonnative productions as two segments, /C/+/j/. Assuming this is true, the Gestural Model would account for these bi-segmental accented productions with two sequentially ordered gestures located on separate tiers.

Note: An articulatory study using a technique such as electropalatography would be required to determine the relative onset timings.

[Figure 4.13: gestural score with a Lip Aperture tier and a Tongue Body tier.]

Figure 4.13. Abbreviated Gestural Model of “traditional” nonnatives’ highly-accented Russian [bʲ].

Note that the onset of the Tongue Body gesture begins only at the release of the Lip Aperture gesture. In this manner the sequential and highly accented quality of the articulation is illustrated. The model in Figure 4.13 assumes that learners have not acquired any of the simultaneous properties of Russian palatalization. How would a model of the same [bʲ] segment based on productions of our advanced learners appear?

[Figure 4.14: gestural score with a Lip Aperture tier and a Tongue Body tier.]

Figure 4.14. Abbreviated Gestural Model of our learners’ accented production of [bʲ].

Figure 4.14 provides a representative model of our learners’ production of the simple-palatalized consonants. While onset of the palatalizing gesture does not coincide with the onset of the bilabial gesture (as it did with the Russians), I have indicated here that the advanced learners have acquired some degree of the simultaneous inter-articulatory timing. More importantly, the learners’ Tongue Body gesture release is temporally later than for Russians in Figure 4.12. This accounts for learners’ statistically longer F2-steady-state durations for the simple-palatalized environment. It could also be hypothesized that the later release of the Tongue Body gesture indicates the presence of a separate, sequential /j/ segment. Data from this study can neither confirm nor deny such claims, and this proposition remains to be investigated in future phonetics studies.

How would the Gestural Model illustrate native productions of Russian [bʲj]? The statistical analyses indicated that, in comparison to simple-palatalized sequences, palatalized-yod sequences are produced with a higher F2-at-transition-onset and with a longer F2-steady-state duration. These different acoustic properties are indicated in the Gestural model with a Tongue Body gesture that is of greater amplitude, or height (indicating a greater degree of closure), and of longer duration (indicating a temporally longer closure).

[Figure 4.15: gestural score with a Lip Aperture tier and a Tongue Body tier.]

Figure 4.15. Abbreviated Gestural model of Russian [bʲj] as produced by native speakers.

In Figure 4.15 the additional yod segment is indicated with the thinner solid line. Note that palatalization and yod are indicated here as two separate Tongue Body gestures. It could be hypothesized, however, that there is actually only one underlying gesture with one target for both palatalization and the palatal glide, their final realizations being a result of temporal limitations, so that a yod is actually a “fully realized” gesture, while palatalization is a “not completely realized” version of the same gesture.

Since learners’ productions of CʲV and CʲjV sequences have acoustically the same F2-at-transition-onset and F2-steady-state durations, a gestural model of learners’ production of the [bʲj] sequence would be identical to Figure 4.14.

Finally, we must consider palatalized-i-yod sequences. Our analyses indicated that Russians produce these sequences with an even higher and longer palatal closure than in the palatalized-yod sequences.

[Figure 4.16: gestural score with a Lip Aperture tier and a Tongue Body tier.]

Figure 4.16. Abbreviated Gestural model of Russian [bʲij] as produced by native Russians.

Note that in Figure 4.16 the original palatalizing gesture is still indicated with the thick solid line. The gesture for the following [ij] is indicated with the thin solid line that is longer and of greater amplitude than in Figure 4.15. The corresponding Gestural Model for learners’ productions of the same sequence is given in Figure 4.17.

[Figure 4.17: gestural score with a Lip Aperture tier and a Tongue Body tier.]

Figure 4.17. Abbreviated Gestural model of Russian /bʲij/ as produced by learners.

The learners’ Gestural model for [bʲij] repeats the original Tongue Body gesture for palatalization (which, recall, is the same for the palatalized-yod sequence) and has an additional Tongue Body gesture for the vowel [i]. Note that the [i] gesture does not have the same amplitude as the Russians’. In this manner, we account for their overall relative lower F2-at-transition-onset values.

Figure 4.18 presents all abbreviated Gestural models for both Russians and learners for the three palatalized environments where C=[bʲ]. Gestural amplitude is indicated through the height of the rectangle and is proportional to the closure degree. In this manner, articulations with a greater degree of closure are indicated with a taller rectangle. Again, solid lines indicate Russians’ gestures and broken lines indicate learners’ gestures.

[Figure 4.18: gestural scores, each with a Lip Aperture tier and a Tongue Body tier, plotted against time. Panels: Russians’ /bʲ/; highly accented /bʲ/ = AE /b/ + /j/; learners’ /bʲ/ and /bʲj/; Russians’ /bʲj/; Russians’ /bʲij/; learners’ /bʲij/.]

Figure 4.18. Abbreviated Gestural models for Russians’ and learners’ production of the three palatalized sequences under investigation. The height of the rectangles indicates gestural magnitude, or degree of closure. The length of the rectangles indicates gestural duration.

Finally, it has been suggested that learners acquire target L2 phones in gradual stages. It is often the case that learners never completely acquire certain target sounds. I propose that learners’ partially-acquired sounds evidence the existence of INTERPHONETICS, which is a result of L1 phonetics, L2 phonetics and Language Universals. I have chosen the term INTERPHONETICS based on well-established theories of INTERLANGUAGE and INTERPHONOLOGY (Selinker, 1972; Yavas, 1994). The Gestural model can provide an excellent account of learners’ INTERPHONETICS. Since gestural onsets can begin anywhere on the time continuum and can be of any duration, learners’ gradual acquisition and proposed continually changing (improving) INTERPHONETICS of Russian palatalization can also be accounted for. Here we focus on the gradual L2 acquisition process of Russian palatalized consonants.

At the very initial stages of acquisition, learners draw on their closest L1 production of the palatalized sounds (i.e., /C/+/j/) and produce the two gestures sequentially, with a relatively longer (possibly less stiff) Tongue Body gesture. With learners’ increased exposure to L2, their pronunciation improves (Best and Strange, 1992). For learners of Russian this means that the palatalizing Tongue Body gesture becomes more simultaneous with the Lip Aperture onset; the release of the Tongue Body gesture also “shortens,” becoming more simultaneous with the Lip Aperture release. These various stages of INTERPHONETICS are indicated by having the Tongue Body gesture simply move to the left. STAGE 1 through STAGE 4 in Figure 4.19 illustrate the gradual acquisition process.

[Figure 4.19: gestural scores, each with a Lip Aperture tier and a Tongue Body tier, plotted against time. Panels: Russians’ /bʲ/; learners’ highly accented /bʲ/ = /b/ + /j/ (STAGE 1); learners’ /bʲ/ (STAGE 2); learners’ /bʲ/ (STAGE 3); learners’ slightly accented /bʲ/ (STAGE 4). Annotation: with improved acquisition, the Tongue Body gesture shifts.]

Figure 4.19. The Gestural Model illustrates gradual stages of the L2 acquisition process of the Russian palatalized consonants.
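The leftward shift of the Tongue Body gesture across stages can be sketched as an interpolation between a fully sequential starting point and a native-like target. Everything numeric here is a hypothetical illustration: the gesture times, the native-like target and the per-stage progress values are invented, and linear interpolation is only one simple way to model a gradual shift.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gesture:
    """Onset and release of one gesture, in seconds."""
    onset: float
    release: float


def interpolate(start: Gesture, target: Gesture, progress: float) -> Gesture:
    """Move onset and release a given fraction of the way from start to target."""
    return Gesture(
        onset=start.onset + progress * (target.onset - start.onset),
        release=start.release + progress * (target.release - start.release),
    )


# Hypothetical Lip Aperture gesture for the bilabial closure.
LIP = Gesture(onset=0.000, release=0.100)
# STAGE 1: Tongue Body gesture starts only at lip release (sequential /b/ + /j/).
TB_SEQUENTIAL = Gesture(onset=LIP.release, release=LIP.release + 0.150)
# Hypothetical native-like target: simultaneous onset, release just after the lips.
TB_NATIVE_LIKE = Gesture(onset=LIP.onset, release=LIP.release + 0.030)

# Hypothetical progress per stage; STAGE 4 stops short of the target
# because it is still "slightly accented".
STAGE_PROGRESS = {1: 0.0, 2: 0.3, 3: 0.6, 4: 0.9}

stages = {n: interpolate(TB_SEQUENTIAL, TB_NATIVE_LIKE, p) for n, p in STAGE_PROGRESS.items()}
onset_lags = [stages[n].onset - LIP.onset for n in sorted(stages)]
release_lags = [stages[n].release - LIP.release for n in sorted(stages)]

for n in sorted(stages):
    print(f"STAGE {n}: onset lag {onset_lags[n - 1]:.3f} sec, release lag {release_lags[n - 1]:.3f} sec")
```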

The acoustic study reported in this chapter provides quantitative support for differences and similarities between Russians’ and learners’ acoustic and articulatory patterns of production of Russian palatalized sequences. Overall, Russians make more distinctions in F2 than learners. While Russians clearly distinguish simple-palatalized and palatalized-yod both in F2 frequency and duration, learners produce the two sequences identically. (These results lend some support to traditional descriptions of learners’ accented productions as /C/+/j/.) Similar to Russians, who distinguish palatalized-yod and palatalized-i-yod only in F2 duration but not F2 frequency, learners produce the two sequences with different F2 durations but identical F2 frequencies. Based on the statistical results, learners have obviously not yet mastered the complex articulations of the palatalized consonants of Russian, especially those sequences which contain an additional front glide segment.

Learners’ accented F2 durations and frequencies stem from their incorrect articulatory timings and degrees of closure; that is, because the articulatory timings required to produce Russian CʲV and especially CʲjV sequences are completely unfamiliar to native American-English speakers, they are especially difficult to acquire. Not only are the learners’ timings incorrect, but learners also do not produce the palatal glide /j/ and the vowel /i/ with a high enough closure. By employing the Gestural model, differences in the degree of closure and closure duration can be clearly expressed on a time continuum. I propose that, in general, L2 learners make gradual progress in learning L2 sounds. The Gestural Model can account for these various stages, which represent learners’ stages of INTERPHONETICS.

The next logical question in this investigation of the Russian palatalized consonants is: “How do our Russians and learners perceive the same palatalized sequences?” Do they make the same distinctions in perception that they make in production? Or is there a discrepancy between production and perception capabilities? The next chapter reports a perception experiment which addresses these issues.

CHAPTER 5

AN ACOUSTIC PERCEPTION STUDY: EVIDENCE FOR DIFFERING PERCEPTUAL PATTERNS, ACOUSTIC STRATEGIES AND EFFECT OF LINGUISTIC EXPERIENCE

5.1 Introduction and discussion of relevant SLA theory

Infant and child speech research has clearly defined the relationship between L1 perception and production capabilities: perception precedes production. Research on the adult L2 acquisition process, however, has not defined the developmental relationship between L2 perception and production skills.

Preliminary findings seem to indicate that, unlike the L1 acquisition process, L2 production skills precede perception skills: "...L2 learners may actually produce nonnative contrasts better than they perceive them in their own or other's speech. In this sense, L2 learning in adults follows a very different pattern from L1 acquisition" (Strange, 1995:40). This chapter reports an acoustic perception study in light of the findings from the related acoustic production study of the previous chapter. The acoustic-production study of Chapter 4 found that Russians’ and learners’ patterns of production of palatalized C-V sequences were reliably different: learners do not distinguish as many sequences as Russians do. Will Russians and learners also exhibit different perceptual patterns? In other words, "If you can't say it like a native-speaker, can you hear it like a native-speaker?"

Summarizing results from both the production and perception experiments, this chapter strives to provide additional insights into the complex nature of L2 production and perception. In order to define why learners’ and Russians’ perceptual patterns might differ, I address several relevant issues, including: native-speakers' and learners’ differing perceptual patterns, the issue of acoustic saliency as it relates to perceptual acoustic strategies, effects of phonological interpretations through one's L1, changes in perceptual patterns with increased exposure to the L2, as well as the role of linguistic and functional knowledge.

L2 learners' accentedness can be attributed to several sources. For example, L2 perception is “colored,” often hindered, by the L1 phonological space. Native knowledge of the L1 sound system involves heightened awareness of the distinctive sounds of L1; as a result, phonetic distinctions which are not meaningful in L1 (but which might be meaningful in another language) are ignored perceptually. Increased awareness of L1 phones coupled with diminished discrimination of non-L1 phones adds to adult learners' accents in perception and production. This study provides, to some degree, evidence of learners’ L1 phonological interpretation.

SLA research has also found that certain nonnative phones are especially difficult for adult L2 learners to acquire, while other nonnative phones are very accessible. “Phonemic, phonetic and acoustic factors have been considered important in accounting for this variability” (Polka, 1991:2961).

Investigating the effect that vowel type has on perceptual patterns, the present perception study demonstrates the effect of phonetic context. A close examination and explanation of the effect of vowel type provide evidence of listeners' differing acoustic strategies. Finally, this study demonstrates that, in addition to phonetic context, syllable position and stress placement also play an important role in natives' and nonnatives' perceptual patterns since they are an integral part of linguistic knowledge.

To gain a solid understanding of the interrelation between sound perception and production, we must consider both acoustic output (the sound source) and its processing by the human perceptual system (the sound receiver). Chapter 4 established reliable acoustic differences between palatalized Russian C-V sequences. However, we can in no way assume that these statistically reliable acoustic differences indicate reliably different perceptual cues for the listener. It could be that, when perceptual processing is also considered, a calculated “reliable acoustic difference” turns out to convey no meaningful difference to L1 listeners. Just as speech production is described as a source-filter model (where the vocal cords are the source and the oral cavity is the filter), the perceptual system can be viewed as a source-filter model (where the acoustic source is physically filtered through the cochlea and processed through the abstract L1/L2 phonological space). Therefore, in order to determine which acoustic distinctions are indeed meaningful and acoustically salient to listeners, a related perception study is required. By combining results of the production (chapter 4) and perception (chapter 5) studies of the same Russian C-V sequences, we can determine if numerically significant acoustic differences are indeed perceptually “significant” to Russians and learners.

Data and statistical analysis of Russians' productions of the three palatalized sequences indicate that native Russians distinguish /CʲV/, /CʲjV/ and /CʲijV/ with F2 frequency and F2 steady-state duration. However, while Russians distinguish /CʲV/ and /CʲjV/ with both F2 frequency and F2 steady-state duration, they differentiate /CʲjV/ and /CʲijV/ only with F2 steady-state durations (and not F2 frequency). Do listeners attend equally to differences in F2 steady-state duration and to differences in F2 frequency? The present perception study provides initial (but by no means conclusive) evidence of the acoustic saliency of duration and frequency cues. If listeners distinguish all three C-V sequences, then we might conclude that F2 durations are the acoustically salient and differentiating cues. If, however, listeners distinguish /CʲV/ and /CʲjV/ but not /CʲjV/ and /CʲijV/, then we might surmise that listeners attend more to changes in F2 frequencies than F2 durations. Investigating which specific acoustic parameters native Russians and learners attend to (duration or frequency), the present perception study demonstrates listeners’ differing perceptual strategies.

It would be a mistake, however, to assume that only acoustic phonetic information determines perceptual patterns, for native competency of a language implies both phonetic knowledge and functional knowledge. In other words, not only do L1 listeners know which sounds are permitted in their language, but they also know which sounds occur in what types of environments with what kinds of stress:

Learning the sound system of a language entails more than just learning how to pronounce phonetic elements in words. A learner must not only adequately develop the segmental structure, the syllable structure, and linguistic mechanisms to optimize lexical understandability. Language learners must develop a linguistic knowledge that recognizes the listener....[T]here are two independent types of linguistic knowledge involved in phonological development—a phonetic knowledge and a functional knowledge (Weinberger, 1994:283).

Therefore, learners' perceptual accents result not only from interference of the L1 phonological space and its associated phonetic properties, but also from their proposed lack of functional knowledge of the L2. The present perception study evidences Russians' complete linguistic knowledge of Russian (including both phonetic and functional knowledge) and learners’ imperfect phonetic and functional knowledge.

In sum, this chapter reports the findings of an acoustic perception experiment. Because the test

tokens in the perception experiment are taken from the production experiment of chapter 4, analyses of

Russians' and learners' perception data allow us to explore several topics, including:

• The relationship between sound source (production) and sound receiver (perception). Are significantly different acoustic source parameters recognized as such by the human perceptual system functioning in a specific language? Do adult L2 learners' L2 perception and production capabilities develop equally or does one precede the other?

• Previous claims that it is more difficult for adult L2 learners to acquire nonnative places of articulation than voicings. Since place of articulation is associated with formant frequency cues while voicing is associated with formant duration cues, the perception study explores relative perceptual acoustic saliency of frequency and duration. Do Russians and learners display similar acoustic strategies?

• The extent to which phonetic knowledge and functional knowledge affect listeners' perceptions.

• How increased exposure to the L2 modifies learners' perceptual patterns.

5.2 Methods

5.2.1 Subjects

Forty-six listeners participated in the perception experiment: 18 Russians (8 female, 10 male) and 28 AE learners of Russian (12 female, 16 male). Nine of the learners (5 female, 4 male) who had participated in the production study also participated in this perception study. Subjects were recruited either from The Ohio State University or from the Slavic Department of Indiana University. Subjects were each paid $10 for their participation. (Research funding was provided by the Graduate School at The Ohio State University in the form of a Graduate Student Alumni Research Award.)

Russians: All Russian participants indicated Russian as their native language. In the majority of cases, both of their parents were native Russian speakers; two Russians indicated one parent as bilingual Ukrainian and Russian. The majority of subjects had grown up in Moscow or Central Russia. Three subjects indicated that they had grown up in Ukraine. Most subjects were in their twenties or early thirties; only two subjects were in their forties. Russians' term of residence in the U.S. varied considerably, from two weeks to fifteen years. There was no variability, however, in their self-perceived fluency. Rating their fluency on a scale from 1 (weakest) to 4 (strongest, no change in fluency), all subjects gave themselves 4, indicating no change in their Russian since leaving Russia.

Learners: American subjects (12 females, 16 males) were both undergraduate and graduate students who had studied Russian a minimum of two years. Compared with the Americans who participated in the production study, there was a much greater range of experience among the Americans in this study, ranging from two to six years of formal study in the U.S. Most learners had never studied Russian in-country, especially the sixteen male learners. Figure 5.1 illustrates American subjects’ number of years of study, both in the U.S. and in-country. Data for male and female learners are combined and sorted according to increasing number of years of study, first by study in U.S. and then in Russia. The x-axis gives a subject number which indicates subjects’ gender. Subject numbers followed by “m” indicate male subjects; subject numbers followed by “f” indicate female subjects.

[Figure 5.1: column graph, "Learners' duration of study in U.S. and in Russia." Legend: Study in U.S.; Study in Russia. X-axis: Subject number.]

Figure 5.1. Learners’ number of years of Russian study, both in U.S. and in-country (in Russia). Data for both female and male participants are presented. Data are not grouped according to gender, but are sorted and presented in order of increasing number of years of study in U.S. followed by number of years of study in Russia. Study in the U.S. is indicated by the gray columns, while in-country study is indicated by black columns. The “m” or “f” following subject numbers on the x-axis indicate subjects’ gender: “m” = male, “f” = female.

Note that, except for subjects 25f and 28f, only those subjects who had studied Russian at least five years in the U.S. had also studied in-country.

Learners’ self-perceived mastery of the Russian sound system also varied quite a bit. Learners self-rated their Russian pronunciation proficiency on a scale from 1 (weakest, very heavy accent) to 6 (native-like Russian, no accent). See Figure 5.2. So that any association between number of years of study and pronunciation self-rating might be clearly illustrated, Figure 5.2 presents subjects’ ratings in the same order as in Figure 5.1 above, i.e., in order of increasing numbers of years of study in U.S. followed by years of study in-country.

[Figure 5.2: column graph, "Learners' self-evaluation of phonetic proficiency." X-axis: Subject number.]

Figure 5.2. Learners’ self-rating of Russian pronunciation. A rating of 1 indicates weakest Russian phonetics while a 6 indicates native-like phonetics. As in Figure 5.1 above, data for male and female subjects are presented in order of numbers of years of study in U.S. then number of years of study in-country. The “m” or “f” following subject numbers on the x-axis indicate subjects’ gender: “m” = male, “f” = female.

Comparing Figure 5.1 and Figure 5.2, we do not see an obvious correlation between subjects’ years of study and their self-perceived pronunciation skills, although their ratings do seem to increase somewhat with increased exposure to the language. (This trend becomes a little more apparent if we remove the data for subject 25f, who had spent two years in-country.)

Figure 5.3 more clearly illustrates the relationship between subjects’ self-rating and their total years of Russian language study. In order to calculate learners’ total number of years of study, I added the number of formal years of study in the U.S. and time spent in-country. (Three months of in-country study were calculated as one year of study in the U.S.)
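The weighting described above can be sketched in a few lines. This is only an illustration, assuming that in-country time is recorded in months; the function name `total_years_of_study` and the sample values are hypothetical and are not drawn from the study's data.

```python
def total_years_of_study(us_years: float, in_country_months: float) -> float:
    """Combine formal U.S. study with in-country study.

    Following the weighting stated in the text, three months of
    in-country study count as one year of formal study in the U.S.
    """
    months_per_equivalent_year = 3
    return us_years + in_country_months / months_per_equivalent_year


# Hypothetical learner: 4 years of study in the U.S. plus 6 months in Russia.
print(total_years_of_study(4, 6))  # 6.0
```

Under this weighting, even a single semester in-country shifts a learner noticeably along the "total years" axis, which is worth keeping in mind when reading Figure 5.3.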

[Figure 5.3: graph, "Association between learners' combined years of study and self-rating." X-axis: Learners' self-rating (1 to 6).]

Figure 5.3. Graph of learners’ self-ratings vs. total number of years of study (a combination of formal study in U.S. plus time in-country). X-axis indicates self-rating, where 1 is weakest (heavy accent) and 6 is best (no accent, native-like). Y-axis indicates the total duration of study of Russian (in years).

The regression line indicates that there is a positive correlation between total number of years of study and perceived pronunciation abilities: the longer they study, the better their pronunciation (at least in their opinion). The regression line might have an even steeper slope if the two outliers, one at rating 3 (at 10 years of study) and one at rating 5 (at approximately 3.5 years of study), were removed. Learners, therefore, perceive that their production capabilities improve with increased linguistic exposure. The final sections of this chapter investigate if learners' perception capabilities also follow this pattern of improvement.
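For readers who want to see how a regression line of this kind is obtained, the following is a minimal ordinary least-squares sketch. The individual data points behind Figure 5.3 are not listed in the text, so the values below are hypothetical and serve only to show the computation; the function name `least_squares_line` is likewise an assumption.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line
    y = slope * x + intercept fitted to the paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (self-rating, total years of study) pairs, for illustration only.
ratings = [2, 3, 4, 5]
years = [2.0, 3.0, 4.0, 5.0]

slope, intercept = least_squares_line(ratings, years)
print(slope > 0)  # a positive slope corresponds to a positive association
```

Refitting the line after dropping suspected outliers is one simple way to see how much they influence the slope.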

5.2.2 Stimuli

The speech sample presented to listeners consisted of the productions of one typical male Russian speaker from the production experiment of chapter 4: a 32-year-old male graduate student at The Ohio State University, who had lived in Samara from birth to the age of five and then in Moscow from the age of five to thirty. The subject attended high school and college in Moscow. Both parents were native speakers of Russian. At the time of the study he had lived in the U.S. two years and indicated that there was no change in his level of fluency since coming to the U.S. In order to verify that his production is, indeed, typical of male Russian speech, average formant trajectories from his productions from the study reported in chapter 4 were compared with overall averages for the male Russian speakers. His trajectories exhibited the patterns typical for male native-Russian speakers.

Recall that the word list from the production study of chapter 4 was composed of words containing a target C-V sequence in word-final position with syllable-final stress, followed immediately by a repetition of the target sequence in isolation; acoustic measurements of the production study were taken only from the isolated C-V sequences. In the same manner, the stimuli for the perception study were taken only from these isolated C-V sequences. The subject’s productions in the order of the randomized word list (including both word and isolated C-V sequence) were digitized using the Kay Elemetrics Computerized Speech Lab (Model 4300B). From the waveform, isolated C-V sequences were selected and output onto an analog tape recorder. Isolated C-V tokens on the tape were separated by two-second intervals. Sequences containing the seven paired consonants: /b-bʲ/, /m-mʲ/, /v-vʲ/, /d-

5.2.3 Procedure

The recorded stimuli were presented to perception study participants in language laboratories at either The Ohio State University or Indiana University. At Indiana University, stimuli were presented from the master control board of a Tandberg Educational System 600. Listening through headphones, subjects could adjust the volume level individually at their own station. The laboratory was well-insulated for sound and the equipment new. As the room had been reserved solely for the purpose of this study, there was little background noise which might have interfered with subjects’ performance. Subjects reported that they clearly heard the tokens.

Subjects at The Ohio State University were presented stimuli through headphones via a single

master Wollensak 3M cassette player. Depending on the number of participants, subjects adjusted the

volume either individually (when the subject was the only participant) or as a group (if there was more

than one participant). For most sessions at The Ohio State University the room was quiet, with few or no

other students in the laboratory. In a few cases, however, there was background noise from other

students’ conversation. In spite of the few possible distractions, several participants commented on the

ideal circumstances and clarity of the speech tokens.

Before beginning the listening task, subjects were fit with headphones and given an answer sheet. The answer sheet presented a four-alternative (for the four C-V sequences) or two-alternative (for

the word-final sequences) forced-choice task in . Figure 5.4 provides an excerpt from the answer sheet. The first eight lines are given as they appeared on the answer sheet.

1   ры   ри   рьи   рии
2   об   обь
3   бы   би   бьи   бии
4   ра   ря   рья   рия
5   ав   авь
6   ла   ля   лья   лия
7   му   мю   мью   мию
8   за   зя   зья   зия

Figure 5.4. Excerpt from the forced-choice answer sheet for the perception study.

Subjects were told that they would hear a series of 98 Russian sound sequences (not complete words),

that each stimulus corresponded to one of the provided forms on the answer sheet, that the series of 98

sequences would be presented a total of three times, and that two seconds of silence would separate each

sound sequence. Subjects were told that, since the stimuli were taken either from real words or nonce

words, all four of the C-V sequences on the answer sheet would be presented in the listening test.

Subjects were not, however, informed about the (equal) distribution of the sequence types. Subjects were

told that the entire experiment would take approximately 40 minutes. Because listeners might tire

towards the end of the study, I asked them to please remain alert during the final minutes of the

experiment. Participants took the task and instructions quite seriously. All seemed to maintain a

surprising level of alertness for the entire experiment.

After completing the listening task, subjects filled out a brief questionnaire and then were paid for their participation. Many of the native Russian speakers were extremely curious about the focus of the experiment and freely offered their comments. Many asked if the tape had actually presented any of the /CʲijV/ sequences found in column four of the answer sheet. When I answered that there was an even distribution of the four C-V sequences, several of them were astounded, saying they heard few or no instances of /CʲijV/ sequences. The comments of one Russian participant who had participated in both the production and perception studies were especially interesting. After completing the production experiment discussed in chapter 4 and learning about its focus, she stated quite matter-of-factly that the Russian sequences /CʲV/ and /CʲjV/ are very different for native Russian speakers and that Russians would never confuse the two, neither in perception nor production. (She seemed to imply that the study, therefore, addresses nothing of real interest.) Yet after completing the perception study, the same subject admitted that she didn’t think she was able to distinguish the /CʲV/ and /CʲjV/ sequences. She admitted complete surprise and disappointment in her self-perceived inability.

Learners did not offer as many unsolicited comments as the Russians. In general, learners expressed disappointment with their performance, stating that they were incapable of distinguishing the /CʲV/ and /CʲjV/ sequences. Learners seemed to feel that they “hadn’t done it right,” as if this experiment provided them additional proof of their “non-nativeness.” They seemed acutely aware of and deeply disappointed with their self-perceived phonetic insufficiencies.

5.3 Results: /a/, /u/ and /i/ combined

5.3.1 Presentation of data

Subjects’ answers were entered into a spreadsheet program. The stimulus C-V type was encoded in one column on a scale from 1 to 6, where “1” = [CV], “2” = [CʲV], “3” = [CʲjV], “4” = [CʲijV], “5” = [VC#] and “6” = [VCʲ#].* Subjects’ responses were then recorded in another column using the same numerical encoding system. In an additional column the correspondence between stimulus and response was noted. For those instances where stimulus and response coincided, subjects’ responses were encoded as “correct”; when stimulus and response differed, subjects’ responses were encoded as “incorrect.” In light of results from the production study of chapter 4 (that focused only on distinguishing palatalized C-V sequences and did not address palatalized and non-palatalized consonants in word-final position), it was decided that the word-final sequences, i.e., /VC#/ and /VCʲ#/, would be omitted from further analysis. All further discussion in this chapter addresses only the four C-V sequences which were encoded as either 1, 2, 3, or 4 above.

* As in chapter 4, the four C-V sequences will be referred to as follows: /CV/ = “non-palatalized,” /CʲV/ = “simple-palatalized,” /CʲjV/ = “palatalized-yod,” and /CʲijV/ = “palatalized-i-yod.”

Based on perception-study methodology and analysis from Miller and Nicely (1955), I summarize the stimulus-response correspondences in a confusion matrix. Confusion matrices clearly present in tabular form quantitative associations between presented stimuli and subjects' responses. In this manner the distribution of subjects’ “correct” and “incorrect” identifications is clearly illustrated; confusion matrices reveal those stimuli which are particularly problematic for listeners.
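The encoding and tabulation steps described above can be sketched as follows. This is an illustrative reconstruction and not the spreadsheet procedure actually used; the names `CV_TYPES`, `is_correct` and `confusion_matrix`, and the sample trials, are hypothetical.

```python
from collections import Counter

# Numeric codes for the four C-V types, following the text. Codes 5 and 6
# (the word-final sequences) are left out, as they are in the analysis.
CV_TYPES = {1: "CV", 2: "CʲV", 3: "CʲjV", 4: "CʲijV"}


def is_correct(stimulus: int, response: int) -> bool:
    """A response counts as 'correct' when it coincides with the stimulus."""
    return stimulus == response


def confusion_matrix(trials):
    """Tabulate (stimulus, response) code pairs into a 4x4 matrix.

    Rows are stimuli and columns are responses. Trials involving the
    word-final codes 5 and 6 are skipped.
    """
    counts = Counter(
        (s, r) for s, r in trials if s in CV_TYPES and r in CV_TYPES
    )
    return [[counts[(s, r)] for r in CV_TYPES] for s in CV_TYPES]


# Hypothetical trials, for illustration only: (stimulus code, response code).
trials = [(1, 1), (2, 2), (2, 3), (3, 2), (4, 3), (4, 4), (5, 5)]
matrix = confusion_matrix(trials)
for code, row in zip(CV_TYPES, matrix):
    print(CV_TYPES[code], row)
```

Correct identifications fall on the main diagonal of the resulting matrix; every off-diagonal count is a confusion of one C-V type with another.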

For each of the two language groups (native Russians and AE learners), subjects’ responses were tabulated and presented in confusion matrices. Table 5.1 presents the resulting confusion matrices of the initial raw data. Within each language group (Russians or learners), data are summed over listeners, consonants and vowels. The stimuli are shown in rows. Listeners’ responses are listed in the columns. For example, of the 1134 presentations of non-palatalized consonants the Russian listeners correctly labeled 1130 of them as non-palatalized, while 4 times they heard a non-palatalized token as simple-palatalized. Of the total 1764 non-palatalized stimuli, learners identified 1575 as non-palatalized, 141 as simple-palatalized, 43 as palatalized-yod and 5 as palatalized-i-yod.
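The totals of 1134 and 1764 presentations can be cross-checked with a little arithmetic. The sketch below assumes that each C-V type was represented by the seven paired consonants combined with the three vowels /a/, /u/ and /i/, and that every token was heard three times. This breakdown is inferred from the methods section and is an assumption, not a figure the text states explicitly.

```python
CONSONANTS = 7    # the seven paired consonants of the stimulus set
VOWELS = 3        # /a/, /u/ and /i/
REPETITIONS = 3   # the series of sequences was presented three times


def presentations_per_type(n_listeners: int) -> int:
    """Presentations of one C-V type, summed over all listeners in a group."""
    return n_listeners * CONSONANTS * VOWELS * REPETITIONS


print(presentations_per_type(18))  # Russians: 1134
print(presentations_per_type(28))  # learners: 1764
```

Both products agree with the row totals quoted in the text, which suggests the assumed breakdown is at least consistent with the design.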

a)                      Russians’ responses
Stimuli          CV      CʲV     CʲjV    CʲijV
CV           [1130]        4
CʲV                   [1132]        2
CʲjV                      32   [1082]       16
CʲijV                     10      826    [296]

b)                      Learners’ responses
Stimuli          CV      CʲV     CʲjV    CʲijV
CV           [1575]      141       43        5
CʲV             152   [1288]      306       18
CʲjV             17      526    [908]      310
CʲijV            16      106      264   [1378]

Table 5.1. Initial raw data results from the perception study. The upper table (a) gives data for native-Russian speakers. The lower table (b) gives raw data for learners.

Cell entries that are enclosed in a box indicate correct responses, i.e., where stimulus and response correspond. If there were no responses for a given stimulus-response pair, the cell is left empty. In this manner, subjects' general degree of "confusion" is illustrated. Learners indicated greater confusion than did the Russians. This is apparent in that while there are 6 blank cells in the Russians’ confusion matrix, there are no blank cells in the learners’ confusion matrix.
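The blank-cell comparison can be checked directly from the counts in Table 5.1. In the sketch below a blank cell is written as 0. One placement is an assumption: the column holding the Russians' two errors on simple-palatalized stimuli is taken to be palatalized-yod, which does not affect the count of blank cells. The name `blank_cells` is hypothetical.

```python
# Raw counts from Table 5.1 (rows: stimuli, columns: responses),
# in the order CV, CʲV, CʲjV, CʲijV. Blank cells are written as 0.
RUSSIANS = [
    [1130, 4, 0, 0],
    [0, 1132, 2, 0],      # column of the 2 errors is assumed
    [0, 32, 1082, 16],
    [0, 10, 826, 296],
]
LEARNERS = [
    [1575, 141, 43, 5],
    [152, 1288, 306, 18],
    [17, 526, 908, 310],
    [16, 106, 264, 1378],
]


def blank_cells(matrix):
    """Count stimulus-response pairs that never occurred."""
    return sum(1 for row in matrix for cell in row if cell == 0)


print(blank_cells(RUSSIANS))  # 6
print(blank_cells(LEARNERS))  # 0
```

The number of blank cells is a coarse measure: it shows how many kinds of confusion occurred, not how often each one did.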

Because there were unequal numbers of Russians and learners, it is more informative to present the data in percentages rather than raw counts.

a)                      Russians’ responses
Stimuli          CV      CʲV     CʲjV    CʲijV
CV            99.6%
CʲV                    99.8%     0.2%
CʲjV                            95.4%     1.4%
CʲijV                   1.0%    73.0%    26.1%

b)                      Learners’ responses
Stimuli          CV      CʲV     CʲjV    CʲijV
CV            89.0%     8.0%     2.4%     0.3%
CʲV            8.6%    73.0%    17.3%     1.0%
CʲjV           0.9%    30.0%    51.5%    17.6%
CʲijV          0.9%     6.0%    15.0%    78.0%

Table 5.2. Results from the perception study given in percentages. The upper table (a) gives percentages for Russians. The lower table (b) gives percentage results for learners.
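The conversion from raw counts to percentages amounts to dividing each cell by the total for its stimulus row. A minimal sketch, using the learners' simple-palatalized row from Table 5.1 as the example; the function name `row_percentages` is hypothetical, and rounding to one decimal place is an assumption about how the published percentages were prepared.

```python
def row_percentages(row):
    """Convert one stimulus row of raw counts into percentages of the row total."""
    total = sum(row)
    return [round(100 * count / total, 1) for count in row]


# Learners' responses to simple-palatalized stimuli (Table 5.1b),
# in the order CV, CʲV, CʲjV, CʲijV.
learners_simple_palatalized = [152, 1288, 306, 18]
print(row_percentages(learners_simple_palatalized))  # [8.6, 73.0, 17.3, 1.0]
```

Normalizing by row puts the two listener groups on the same scale despite their different sizes, so corresponding cells can be compared directly.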

The percentage results alone given in Table 5.2 provide interesting information about Russians’ and learners’ different performances. Graphic display of subjects’ performance, however, in the form of column graphs, also aids the process of data interpretation.

Before discussing graphs calculated from subjects' confusion matrices, let us first consider a hypothetical graph of “perfect” perception, where all stimuli are correctly identified. Figure 5.5 provides a column-graph illustration of “perfect” perception which lacks any indication of listeners’ confusion.

[Figure 5.5: column graph, "Example of perfect identification of C-V sequences." X-axis: Stimulus: C-V type (CV, C'V, C'jV, C'ijV). Legend: CV, C'V, C'jV, C'ijV.]

Figure 5.5. An example bar-graph for “perfect” perception for the four C-V sequences. Palatalization is indicated with an apostrophe.

A graph of perfect perception shows that each stimulus C-V type (indicated on the x-axis) is associated with only one response column, which is of 100% height (indicated on the y-axis). Graphs of "perfect" perception, therefore, display only four columns. While interpreting the graphs in this chapter the reader should keep in mind the following two general principles:

1. A smaller number of columns indicates less confusion; a larger number of columns indicates more confusion.

2. Taller columns indicate less confusion; shorter columns indicate greater confusion.

Due to formatting constraints of the software used to produce the graphs, palatalization is indicated in all graphs not by a superscript ‘j’ but by an apostrophe. In the following graphs of this chapter, the four C-V sequences (/CV/, /CʲV/, /CʲjV/ and /CʲijV/) will not be enclosed in forward slashes. Also note that in the column graphs, each C-V response type is consistently indicated with a specific fill pattern: 1) non-palatalized: a clear bar, 2) simple-palatalized: a bar filled with diagonal lines, 3) palatalized-yod: a solid black bar, 4) palatalized-i-yod: a bar filled with horizontal lines.

Let us now consider column graphs of subjects' actual responses. Figure 5.6 provides column graphs of perception study percentage results (from Table 5.1) for Russians and learners. Within each language group, data are summed over gender, consonant and vowel.

[Figure 5.6: two column graphs, "Russians' identification of C-V sequences: /a/, /u/ and /i/ combined" (upper) and "Learners' identification of C-V sequences: /a/, /u/ and /i/ combined" (lower). X-axis: Stimulus: C-V type (CV, C'V, C'jV, C'ijV). Legend: CV, C'V, C'jV, C'ijV.]

Figure 5.6. Russians’ and learners’ identification of the four C-V sequences. The upper figure is for Russians; the lower figure is for learners. The x-axis indicates stimuli. The y-axis indicates subjects’ response.

Russians' and learners' graphs indicate different patterns of perception. In general, note that Russians' minimal confusion is indicated by three dominant “correct” columns of almost 100% height. On the other hand, the graph for learners indicates more confusion since there is a total of 12 columns, the tallest of which is of approximately 90% height. The remaining 11 columns are of lesser height. Comparing Russians’ and learners’ graphs for each of the four C-V sequence types, we can make generalizations about each subject group’s performance.

Non-palatalized. Both Russians and learners identified non-palatalized sequences with a high level of accuracy. Correctly identifying 99.6% of /CV/ sequences, Russians indicate virtually no confusion with the hard consonants. Learners also display a relatively high level of accuracy, correctly identifying 89.0% of the hard consonant stimuli. The non-palatalized sequence stimuli represent learners’ best performance. It is interesting to note that in a small percentage of cases (8.0%, 141 responses) Americans identified the non-palatalized as simple-palatalized;* in even fewer instances (2.4%) learners heard the non-palatalized as palatalized-yod, and in only five instances (0.3%) Americans identified the non-palatalized sequences as palatalized-i-yod. In general, both Russians and learners exhibit very low confusion with the non-palatalized sequences.

Simple-palatalized. As indicated by the single predominant column (99.8%) for the /CʲV/ stimuli, Russians easily distinguish simple-palatalized sequences. On the other hand, the associated three prominent columns of learners indicate their greater confusion and weaker performance. Learners correctly identified only 73% of the simple-palatalized sequences. Their second most-frequent answer was to hear simple-palatalized sequences as palatalized-yod (17.3%). It is also interesting to note that learners identified 8.6% of the /CʲV/ stimuli as /CV/.**

* Since this first analysis combined results for the three vowels, /a/, /u/ and /i/, and given the particular challenge that the front vowel hard-soft combinations /Ci/ vs. /Cʲi/ give learners, I hypothesize that the majority of incorrect responses as simple-palatalized results from learners’ inability to distinguish combinations with the vowel /i/. The next section of this chapter investigates in greater detail the role of vowel type in the identification process.

** As in the previous footnote, I suspect that those instances where learners identify /CʲV/ as /CV/ are probably due mostly to nonnatives' inability to distinguish the front vowel /i/ in /Ci/ and /Cʲi/ sequences.

Palatalized-yod. With the single prominent response column (95.4%), Russians again display virtually no confusion in correctly identifying palatalized-yod sequences. Their remaining 4.6% responses are unevenly split between simple-palatalized and palatalized-i-yod responses, with a preference for simple-palatalized. To judge from the three prominent columns of short height, learners display greatest confusion with /CʲjV/ stimuli. In fact, learners correctly identified the palatalized-yod sequences at just above chance (51.5%). There is obviously something difficult, particularly foreign and elusive, about palatalized-yod sequences for L2 learners. Learners’ remaining responses are split, for the most part, between simple-palatalized (30.0%) and palatalized-i-yod (17.6%).

Palatalized-i-yod. Russians display a very interesting response pattern for /CʲijV/ stimuli. Two prominent response columns above the palatalized-i-yod stimulus indicate Russians’ confusion. Most interestingly, Russians incorrectly heard /CʲijV/ as /CʲjV/ for 73.0% of the /CʲijV/ presentations; Russians correctly identified /CʲijV/ stimuli with only 26.1% accuracy. Learners’ responses indicate a completely different pattern, one in which they perform better than native speakers! While learners do indicate some level of confusion (as indicated by the three columns), they correctly identified /CʲijV/ sequences with higher accuracy than they identified either /CʲV/ or /CʲjV/! In fact, learners correctly identified /CʲijV/ stimuli with 78.0% accuracy. In stark contrast to Russians’ 73.0% response, learners incorrectly identified /CʲijV/ as /CʲjV/ in only 15.0% of the stimuli. These data are unusual among studies of L2 acquisition because they show a contrast that is perceived more accurately by second-language learners than it is by native speakers.

How can we account for the fact that learners attend to a distinction better than native speakers?

In addition, if we compare phonetic distinctions maintained in production (chapter 4) with these perception results, how do we explain the fact that, while American speakers make no distinction between simple-palatalized and palatalized-yod sequences in production, they are able to distinguish them, though imperfectly, in the speech of a Russian native speaker? Russians, on the other hand, clearly distinguish palatalized-yod and palatalized-i-yod sequences in production, but exhibit a different pattern in perception. In other words, learners can hear a difference that they do not produce (/CʲV/ vs. /CʲjV/), while Russians do not hear a difference that they do produce (/CʲjV/ vs. /CʲijV/). How can these seemingly contradictory facts be accounted for?

5.3.2 Discussion

Let us begin our discussion with learners’ data. I focus on learners’ confusion of the /CʲV/ and /CʲjV/ sequences, in particular the fact that learners identified 17.3% of simple-palatalized stimuli as palatalized-yod and 30.0% of palatalized-yod stimuli as simple-palatalized. (See Figure 5.6.) I hypothesize that these results provide new support for traditional descriptions that learners process (and, therefore, phonologically interpret) the palatalized consonants as bi-segmentals. Recall that the English sound environment most similar to Russian palatalized (mono-segmental) consonants is found in words such as ‘beautiful’, ‘music’, and ‘view’. In English, however, these sounds are realized as a bi-segmental sequence, [C]+[j]. (The bi-segmental nature of these English sequences is supported by phonetic descriptions for native-Russian speakers who are learning English. Such phonetic descriptions emphasize that the consonant and palatal must be produced as two sequential elements, where the palatal [j] does not overlap its preceding consonant.) In 2.4.5, I hypothesized that learners’ proposed accented bi-segmental (/C/ + /j/) productions of the Russian palatalized consonants are due to L1 transfer from their native "similar" English sequences. In a similar manner, I propose that learners' production and perception capabilities are intertwined. Learners produce both /Cʲ/ and /Cʲj/ with an F2 steady-state duration that is longer than Russians’ duration values for /Cʲ/ yet shorter than Russians' duration values for /Cʲj/; learners’ data, therefore, seem to indicate the presence of some degree of [j], a tendency towards bi-segmental productions. The fact that learners produce Russian mono-segmental /Cʲ/ with a longer duration than Russians causes them to identify Russian /CʲV/ sequences, with their (learners’) longer F2 steady-state duration, as /CʲjV/.

In light of SLA research that shows “similar” L2 sounds to be more difficult than “new” L2 sounds, it is not surprising that our learners experience the greatest confusion distinguishing /CʲV/ and /CʲjV/. /CʲV/ and /CʲjV/ are both “similar” to English /C/+/j/ sequences. Having reliably longer F2-steady-state durations, Russian palatalized-yod sequences are more acoustically similar to AE /C/+/j/ sequences than simple-palatalized sequences are. I hypothesize that learners attend to the longer F2-steady-state duration values and, therefore, identified 30.0% of palatalized-yod as simple-palatalized. The learners’ results can, therefore, be explained as interpretation through their L1 phonetic and phonological space.

How can Russians’ performance with the palatalized-i-yod stimuli sequences be explained? Why is it that learners correctly identified significantly more /CʲijV/ sequences than native speakers? I hypothesize that Russians’ performance evidences the existence of a modified type of near-merger in Russian. I also propose that, as a result of their differing levels of linguistic knowledge, native speakers and learners display different listening strategies.

Near-mergers are traditionally characterized by maintenance of a contrast in production but loss in perception. For example, a native-English speaker of a New York City dialect acoustically distinguishes in spontaneous speech the words “source” and “sauce.” However, when the same productions of these two words are presented to the original speaker, the speaker-listener is unable to reliably distinguish the two words.

Near-mergers are defined as the result of a sound change where “two word classes that are quite distinct in some come into close approximation in a given dialect” (Labov, 1994:350). The proposition of near-mergers has evoked controversy and disbelief in the field of linguistics since near-mergers challenge “the reasonable belief that speakers could not produce a distinction without having the ability to hear it” (Labov, 1994:355). In other words,

[t]he most difficult problem raised by near-mergers is that from the productive viewpoint, there are two categories; from the perceptual viewpoint, only one....How does a person learn to articulate each member of one category in one way, and each member of the other category in another, if he or she cannot recognize the difference between the categories? This is a substantive issue of some weight (Labov, 1994:368).

For the purposes of this study it is important to keep in mind three additional characteristics of near-mergers: (1) acoustic differences are most often in F2 and not a combination of F1 and F2; (2) “[s]tudies of mergers in progress show that changes in speech perception precede changes in production” (Labov, 1994:355); (3) “[p]honeticians from other areas [in our case, learners] are better able to hear the difference than the native speakers” (Labov, 1994:359).

To explain the Russians’ performance, I modify the traditional definition of a near-merger. In addition to the traditional definition, for our Russian version of near-merger, I include issues of phonetic and functional linguistic knowledge. Native speakers of a language know not only what sounds their L1 allows, but also how the sounds function within the language (including phonotactic constraints, restrictions of stress placement, etc.). Native speakers, therefore, possess “complete” linguistic knowledge which encompasses both functional and phonetic knowledge. On the other hand, most L2 learners have not acquired complete linguistic knowledge. Recall that while Russian sound sequences /CʲijV/ in word-final position with syllable-final stress are extremely rare (see 4.2.3.), the production and resulting perception tokens were designed so that all C-V tokens meet the position and stress requirements. We can, therefore, say that /CʲijV/ sequences with syllable-final stress have low functional load in Russian. As a result of the low functional load of word-final syllable-final stressed palatalized-i-yod sequences, Russians instinctively know that such sound combinations rarely exist in their language, and they, therefore, hear them as the more linguistically acceptable palatalized-yod sequences.

Do L2 learners display equal acquisition of L2 phonetic and functional knowledge? Do learners acquire phonetic and functional knowledge at equal rates? This study indicates that phonetic knowledge of L2 develops before functional knowledge since learners correctly identified 78.0% of /CʲijV/ sequences. Learners’ present phonetic knowledge (and absent functional knowledge) causes them to have a different perceptual strategy than Russians. Not having acquired functional knowledge, learners listen ‘phonetically,’ attending only to acoustic factors and ignoring other linguistic constraints. Learners’ performance in the perception experiment indicates a lack of awareness of the near-merger of the palatalized-yod and the palatalized-i-yod sequences. They attend to a phonetic contrast without regard to its linguistic status. The Russians, on the other hand, treated the difference between /CʲjV/ and /CʲijV/ (which the Americans’ performance shows was perceivable) as a disregardable variation in what is essentially one category encompassing both types of sequence. In other words, native speakers employ both phonetic and functional knowledge while learners base their decisions only on perceptible acoustic parameters. These results suggest that L2 learners may attend at a psychoacoustic level to phonetic phenomena which are ignored by native speakers: native speakers listen ‘linguistically’ while learners tend to listen ‘phonetically’.

5.4 Results: Effect of vowel context

The previous section demonstrated that there was an effect of subject group (Russians or learners) on the perceptual pattern of the C-V sequences under investigation. Within each subject group, is there also an effect of vowel type? That is, do the three vowels /a/, /u/ and /i/, with their characteristic acoustic properties, also determine to any extent subjects’ perceptions? For example, I proposed earlier that a large portion of learners’ overall confusion in distinguishing non-palatalized and simple-palatalized sequences would be due to the vowel /i/. Perhaps the acoustic cues that distinguish hard and soft sequences containing the vowel /i/ are not as acoustically salient as the distinguishing acoustic properties of hard and soft sequences containing the vowels /a/ and /u/. Differing degrees of acoustic saliency might arise from the vowel’s general articulatory classification: /a/ and /u/ as back vs. /i/ as front. To investigate the effect of vowel quality on perception results within each language group, I divided the perception data into two groups: data for sequences containing the vowel /i/ and data for sequences containing the vowels /a/ and /u/. Because front and back vowels have characteristically different acoustic properties, examination of perception results for both Russians and learners according to vowel type provides a window into subjects’ different acoustic strategies.

5.4.1 Presentation of data

Similar to Table 5.1, Table 5.3 presents the confusion matrices for initial raw data. Within each language group (Russians or learners), data are divided into two groups according to vowel type: /i/ and /a/+/u/. Data are summed over listeners and consonants.

a)
Stimuli   V type    Russians’ responses according to vowel type
                    (response columns: CV, CʲV, CʲjV, CʲijV)
CV        /a/+/u/   754  2
          /i/       376  2
CʲV       /a/+/u/   755  1
          /i/       377  1
CʲjV      /a/+/u/   24  720  8
          /i/       8  362  8
CʲijV     /a/+/u/   4  647  105
          /i/       6  179  191

b)
Stimuli   V type    Learners’ responses according to vowel type
                    CV      CʲV     CʲjV    CʲijV
CV        /a/+/u/   1115    34      27
          /i/       460     107     16      5
CʲV       /a/+/u/   133     795     233     15
          /i/       19      493     73      3
CʲjV      /a/+/u/   3       447     577     147
          /i/       14      79      331     163
CʲijV     /a/+/u/   4       69      200     903
          /i/       12      37      64      475

Table 5.3. Raw data results from the perception study where effect of vowel type is considered. The upper table (a) gives data for native Russian speakers; the lower table (b) gives raw data for learners.

Cells that are enclosed in a box indicate "correct" responses, that is, where stimuli and response correspond. Learners again display greater confusion than native speakers; the Russians' confusion matrix has 6 blank cells while the learners’ matrix has only one blank cell. So that the relative weight of subjects’ responses is obvious, the data are presented in percentages rather than raw counts in Table 5.4.

a)
Stimuli   V type    Russians’ responses according to vowel type
                    (response columns: CV, CʲV, CʲjV, CʲijV)
CV        /a/+/u/   99.7%  0.2%
          /i/       99.5%  0.5%
CʲV       /a/+/u/   99.8%  0.1%
          /i/       99.7%  0.3%
CʲjV      /a/+/u/   3.2%  95.7%  1.1%
          /i/       2.1%  95.8%  2.1%
CʲijV     /a/+/u/   0.5%  85.6%  13.9%
          /i/       1.6%  47.5%  50.7%

b)
Stimuli   V type    Learners’ responses according to vowel type
                    CV      CʲV     CʲjV    CʲijV
CV        /a/+/u/   94.8%   2.9%    2.3%
          /i/       78.2%   18.2%   2.7%    0.9%
CʲV       /a/+/u/   11.3%   67.6%   19.8%   1.3%
          /i/       3.2%    83.8%   12.4%   0.5%
CʲjV      /a/+/u/   0.3%    38.1%   49.4%   12.5%
          /i/       2.4%    13.5%   56.4%   27.8%
CʲijV     /a/+/u/   0.3%    5.9%    17.0%   76.8%
          /i/       2.0%    6.3%    10.9%   80.8%

Table 5.4. Results from the perception study given in percentages where effect of vowel type is considered. The upper table (a) gives percentages for Russians. The lower table (b) gives percentage results for learners.
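
The step from the raw counts in Table 5.3 to the percentages in Table 5.4 is a row-wise normalization: each response count is divided by the total number of responses to that stimulus. The following is a minimal Python sketch of that arithmetic, assuming rounding to one decimal place. The function name `row_percentages` is illustrative and not part of the original analysis; the counts are copied from Table 5.3.

```python
def row_percentages(counts):
    """Convert one confusion-matrix row of raw counts to percentages."""
    total = sum(counts)
    if total == 0:
        raise ValueError("row has no responses")
    # Assumption: one-decimal rounding, as in the published percentages.
    return [round(100 * c / total, 1) for c in counts]


# Russians, palatalized-i-yod stimuli with /a/ or /u/ (Table 5.3a):
# the three non-blank cells, in the order they appear in the table.
russians_back = [4, 647, 105]

# Learners, simple-palatalized stimuli with /a/ or /u/ (Table 5.3b):
# responses in the order CV, CʲV, CʲjV, CʲijV.
learners_back = [133, 795, 233, 15]

print(row_percentages(russians_back))  # [0.5, 85.6, 13.9]
print(row_percentages(learners_back))  # [11.3, 67.6, 19.8, 1.3]
```

Both rows reproduce the corresponding entries in Table 5.4. Other rows may differ from the published table in the last decimal place, depending on how the original figures were rounded.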

Table 5.4 indicates that there is a definite effect of vowel within subjects’ responses. Figure 5.7 provides corresponding vertical column graphs of the percentage data from Table 5.4.

Figure 5.7. Russians’ and learners’ identification of the four C-V sequences. Within each language group, results are summed over gender and consonant. Vowel quality is, however, accounted for. Graphs (a) and (c) are for sequences containing the vowels /a/ and /u/. Graphs (b) and (d) are for sequences containing the vowel /i/. The x-axis indicates stimuli; the y-axis indicates subjects’ distribution of responses in percents.

[Figure 5.7, panels (a)-(d): column graphs. (a) Russians' identification of C-V sequences: /a/ and /u/ combined. (b) Russians' identification of C-V sequences: /i/ only. (c) Learners' identification of C-V sequences: /a/ and /u/ combined. (d) Learners' identification of C-V sequences: /i/ only. x-axis: Stimulus: C-V type (CV, CʲV, CʲjV, CʲijV).]

The graphs in Figure 5.7 indicate both Russians’ and learners’ widely differing perceptual patterns as well as an effect of vowel type within each language group.

Russians. Russians’ graphs for /a/+/u/ and /i/ indicate no effect of vowel for the first three C-V sequences, /CV/, /CʲV/ and /CʲjV/. Russians exhibit minimal confusion with these three stimuli groups, as indicated by the single predominant columns for each stimulus type. Response percentages for the three sequences across both vowel types vary little, ranging from 95.7% to 99.8%. The data for /CʲijV/ sequences indicate an opposite pattern. The different relative heights of the columns for the /CʲijV/ stimulus for /a/+/u/ and /i/ evidence an obvious effect of vowel. When the vowel in /CʲijV/ sequences is a back vowel (either /a/ or /u/), Russians hear the palatalized-i-yod sequence as palatalized-yod in a majority of instances (85.6%). On the other hand, when the vowel in /CʲijV/ sequences is the front vowel /i/, Russians identify the sequences as palatalized-yod (47.5%) and palatalized-i-yod (50.7%) with almost equal distribution. Based on the low functional load of syllable-final stressed /CʲijV/ sequences and Russians’ apparent disregard for /CʲijV/ sequences, the previous section established the presence of a near-merger of Russian /CʲjV/ and /CʲijV/ sequences. To explain Russians’ perceptual pattern in light of the effect of vowel type, a new theoretical approach is necessary. The presence of /i/, with its prolonged high F2 steady-state and no dramatic F2 transition (negative slope), must somehow affect Russians’ perception of palatalized-i-yod sequences. The following subsection will address theoretical explanations for Russians’ performance.

Learners. Learners’ graphs for /a/+/u/ and /i/ indicate several apparent effects of vowel. For non-palatalized /CV/ sequences, the quality of the vowel influences learners’ confusion patterns. When the non-palatalized sequence contains a back vowel, /a/ or /u/, learners correctly identify /CV/ sequences (94.8%) with almost the same accuracy as native-Russian speakers (99.7%). As was predicted earlier, we see that the front vowel /i/, however, confuses learners. Learners’ performance in correctly identifying non-palatalized sequences containing the vowel /i/ (78.2%) is much lower than for the vowels /a/ and /u/. When learners incorrectly identified non-palatalized sequences, they most often identified them as simple-palatalized sequences (18.2%).

Based on the three prominent columns above the /CʲV/ stimulus on the x-axis, simple-palatalized sequences containing the back vowels /a/ or /u/ prompt a higher degree of confusion in learners' responses than do sequences containing the front vowel /i/. Learners correctly identified /CʲV/ sequences containing back vowels with only 67.6% accuracy; learners’ remaining responses are divided between palatalized-yod (19.8%) and non-palatalized (11.3%). As indicated by a single tall column above the /CʲV/ stimulus x-axis label, simple-palatalized sequences containing /i/ exhibit a very different pattern, one with less confusion. When the vowel is /i/, learners correctly identify the simple-palatalized stimuli with greater accuracy (83.8%). In their remaining responses learners overwhelmingly favor palatalized-yod sequences (12.4%) over non-palatalized (3.2%).

Palatalized-yod sequences elicit the greatest confusion among learners, as indicated by three prominent columns, two of which are of almost equal height. Palatalized-yod sequences also evidence a strong effect for vowel type. When the sequence contains a back vowel, learners divide their responses almost equally between simple-palatalized (38.1%) and palatalized-yod (49.4%). Palatalized-yod sequences containing /i/ indicate a different pattern. When the vowel is /i/, learners correctly identify the palatalized-yod sequence with slightly better success (56.4% with /i/ vs. 49.4% with /a/+/u/). Learners' second most numerous response is not simple-palatalized (as it was with /a/+/u/), but palatalized-i-yod (27.8%). Thus, back vowels in palatalized-yod sequences seem to perceptually pull learners towards simple-palatalized sequences while front vowels perceptually pull learners towards palatalized-i-yod sequences.

Palatalized-i-yod sequences exhibit virtually no effect of vowel. In addition, as indicated by the single prominent column above the /CʲijV/ stimulus label on the x-axis, /CʲijV/ sequences exhibit markedly less confusion than /CʲjV/ or /CʲV/ sequences. For both front and back vowels, learners correctly identified palatalized-i-yod sequences in the majority of cases: 76.8% for /a/+/u/ and 80.8% for /i/. When learners did not correctly identify palatalized-i-yod sequences, they heard them as palatalized-yod: 17.0% for /a/+/u/ and 10.9% for /i/. Remaining responses indicate simple-palatalized: 5.9% for /a/+/u/ and 6.3% for /i/.

How can we account for obvious effects of vowel type in subjects' perception of the four Russian C-V sequences? To clarify why the different vowel types effect different perception patterns, we must first consider how the three vowels are associated with their particular vocal-tract configurations and the resulting characteristic acoustic signals.

5.4.2 Discussion

To investigate the effect of vowel type, I have divided the three vowels (/a/, /u/ and /i/) into two groups: /a/+/u/ and /i/. The division is based on the vowels’ general vocal-tract configurations and resulting second-formant frequency (F2) patterns. Recall that closures articulated front and high within the oral cavity result in a relatively higher F2. Therefore, articulations of the palatalized consonants (cf. 2.5), the palatal glide /j/, and the front high vowel /i/ are characterized by a raised F2 of approximately 2,000-2,400 Hz. The acoustic signal of a palatalized consonant followed by the vowel /i/ is characterized by a raised F2 (due to the palatalized consonant) that is maintained throughout the following front vowel. From consonant release continuing through the following vowel duration, the tongue remains in a raised and fronted position in the oral cavity. As a result, virtually no F2 transition (or negative slope) is associated with palatalized sequences containing /i/. The F2 remains high and steady throughout. Palatalized sequences with the vowel /i/ are, therefore, characterized by a raised F2 having a long steady-state duration and no abrupt acoustic transition.

In contrast to palatalized sequences containing the high front vowel /i/, palatalized sequences with back vowels /a/ and /u/ have a very different acoustic pattern. Following release of the palatalized consonant, the tongue body briefly maintains the secondary palatalizing articulation, being raised and fronted in the oral cavity. Closure release is, therefore, characterized by a raised F2 of approximately 2,000-2,400 Hz. However, the raised F2 is maintained only briefly, and it soon begins to fall, resulting in a negative slope. F2’s negative transition results from the lowering and backing of the tongue body as it moves towards articulation of the subsequent vowel. In ideal situations the F2 should eventually reach each vowel's steady-state F2 value, approximately 1000-1300 Hz for /a/ and 500-800 Hz for /u/. Palatalized sequences containing the back vowels /a/ and /u/ are, therefore, characterized by a raised F2 of very brief duration followed by an abrupt transition in F2 having a negative slope.

Of what relevance to this study is the notion of prolonged F2 steady-state duration (for /i/) or abrupt frequency transition (for /a/ and /u/)? SLA research has found that adult learners experience a broad range of difficulty perceiving nonnative phonetic contrasts. The reason for the wide range of difficulty is not well understood since perceptually relevant acoustic cues for the different L2 contrasts remain undefined. Results from perceptual training studies, however, provide initial insight into which acoustic parameters are more easily learned. It could be hypothesized that one reason certain parameters are more easily learned is that they are more acoustically salient. The fact that research has suggested that "nonnative voicing contrasts may be easier to train than nonnative place contrasts" (Strange, 1995:36) might indicate that voicing contrasts are more acoustically salient to the human perceptual system than place contrasts. Voicing contrasts are associated with formant duration cues while place of articulation contrasts are associated with formant frequency cues.

We can extend the idea that formant duration cues are more acoustically salient than formant frequency differences to our palatalized Russian sequences and their effect of vowel type. Palatalized C-V sequences containing the vowel /i/ are characterized by a prolonged steady-state duration: formant durations are more acoustically salient, and therefore, listeners are more likely to attend to the prolonged duration. It is almost as if prolonged steady-state durations perceptually lengthen the palatal or palatalized quality of a sequence. On the other hand, palatalized sequences containing /a/ or /u/ are characterized by a change in formant frequency which is less acoustically salient and, therefore, provides less indication of the palatalized nature of a sequence. It appears as if the abrupt change in frequency perceptually shortens any preceding steady-state duration, thereby perceptually diminishing the palatal or palatalized quality of a sequence.

Based on the perception results for effect of vowel type, I propose that vowel types (with their associated F2 steady-state durations and transitions) in palatalized C-V sequences greatly influence the degree to which subjects identify the "palatalized," or fronted, nature of a C-V sequence. For the purposes of this discussion we can order the three palatalized C-V sequences according to increasing degree of front palatal/palatalized closure: /CʲV/ < /CʲjV/ < /CʲijV/. Because a longer raised F2 steady-state perceptually lengthens or increases the degree of palatal/palatalization, subjects are likely to hear sequences containing /i/ as an exemplar of a C-V sequence containing a greater degree of palatal closure, i.e., /CʲjV/ or /CʲijV/. Here a prolonged raised F2 steady-state seems to be the dominant acoustically salient feature. On the other hand, because a brief steady-state duration and abrupt frequency transition perceptually shorten or decrease the perceived degree of palatal/palatalization, I propose that sequences containing /a/ or /u/ will tend to be heard as an exemplar of a C-V sequence containing a lesser degree of palatal closure, i.e., /CʲV/. Here F2 transition seems to be the dominant acoustically salient feature.

Russians' and learners' data in Figure 5.7 support the aforementioned claims that palatalized C-V sequences containing /i/ will tend to be identified as having a greater degree of front closure (/CʲjV/ or /CʲijV/) while sequences containing /a/ or /u/ will tend to be identified as having a lesser degree of front closure (/CʲV/). In light of the effect of vowel type, perceptual “lengthening” and “shortening” due to dominant acoustic parameters and subjects’ resulting perceptual patterns, I return to the data in Table 5.4 and Figure 5.7.

Russians: /CʲijV/ stimuli. Of the four C-V sequence stimuli, only the "most palatal" variant, /CʲijV/, merits comment. When the vowel is /a/ or /u/, Russians incorrectly identified /CʲijV/ stimuli as a "less palatal" sequence /CʲjV/ (85.6%). When the vowel is /i/, however, the quantity of incorrect identifications by Russians of /CʲijV/ as /CʲjV/ is much lower (47.5%). The presence of /i/ perceptually lengthens the degree of palatal closure, and Russians correctly identify the /CʲijV/ sequences with a very slight majority, 50.7%, up from 13.9% with /a/+/u/.*

Learners: /CʲV/ stimuli. In light of both L1 interference and issues of perceptual acoustic saliency, learners' responses for the /CʲV/ stimuli give very interesting results. Table 5.5 repeats the relevant percentage results from Table 5.4.

* Of course, Russians' performance reflects processes of both L1 linguistic (phonetic and functional) knowledge and perceptual acoustic saliency.

Stimuli   V type    Learners’ responses according to vowel type
                    CV      CʲV     CʲjV    CʲijV
CʲV       /a/+/u/   11.3%   67.6%   19.8%   1.3%
          /i/       3.2%    83.8%   12.4%   0.5%

Table 5.5. Excerpt from overall percentage results in Table 5.4 for learners for /CʲV/ stimuli.

When the vowel is a back vowel, /a/ or /u/, learners correctly identify the stimuli with 67.6% accuracy. Upon initial inspection, I found learners' 11.3% identification of simple-palatalized sequences as non-palatalized to be quite surprising. Based on my own experience teaching and learning Russian, /Cʲa/ and /Cʲu/ sequences always seemed perceptually obviously palatalized. However, if we consider that the steep F2 transition perceptually shortens the palatalized qualities of the sequence, learners’ responses here are understandable. Learners hear simple-palatalized sequences with /a/ and /u/ as non-palatalized sequences because of perceptual acoustic saliency. On the other hand, I attribute learners' identification of 19.8% of simple-palatalized stimuli as palatalized-yod to L1 phonological interpretation.

Quite surprisingly, learners had a higher rate of correct identification of simple-palatalized sequences when the vowel was /i/. (I had initially expected learners to identify /Cʲi/ sequences almost equally as "hard" [Ci] and "soft" [Cʲi].) I attribute these results to an increased perceived degree of “palatalization” due to the raised, albeit brief, F2-steady-state duration; the fact that the presence of /i/ perceptually increases the amount of “palatal” closure also explains learners' 12.4% identification of /Cʲi/ sequences as /Cʲji/.

Stimuli   V type    Learners’ responses according to vowel type
                    CV      CʲV         CʲjV        CʲijV
CʲjV      /a/+/u/   0.3%    38.1% <--   49.4%       12.5%
          /i/       2.4%    13.5%       56.4% -->   27.8%

Table 5.6. Excerpt from overall percentage results in Table 5.4 for learners for /CʲjV/ stimuli.

Learners: /CʲjV/ stimuli. Table 5.6 repeats the relevant percentage results from Table 5.4.

Learners’ pattern of incorrect identifications of palatalized-yod sequences provides additional strong support for the claim that the vowel type influences perceptual patterns. When the palatalized-yod sequences contain a back vowel, learners tend to hear the sequence as having "less palatal constriction," evidenced by the 38.1% of /CʲV/ responses. When sequences contain /i/, however, the palatal nature of the constriction is perceptually emphasized, and learners hear the sequence as having "greater palatal constriction," as evidenced by the 27.8% of /CʲijV/ responses. Right-facing and left-facing arrows in Table 5.6 indicate that the learners’ majority of incorrect perceptions agree with proposed effects of vowel type.

Learners: /CʲijV/ stimuli. Learners’ 76.8% and 80.8% responses (for /a/+/u/ and /i/, respectively) indicate that vowel type does not seem to exert a significant effect with palatalized-i-yod sequences. Table 5.7 repeats the relevant palatalized-i-yod percentage results from Table 5.4.

Stimuli   V type    Learners’ responses according to vowel type
                    CV      CʲV     CʲjV    CʲijV
CʲijV     /a/+/u/   0.3%    5.9%    17.0%   76.8%
          /i/       2.0%    6.3%    10.9%   80.8%

Table 5.7. Excerpt from overall percentage results for learners for /CʲijV/ stimuli.

Learners indicate very little confusion identifying the palatalized-i-yod sequences. When the vowel is /i/, however, learners did tend to hear more sequences as having a "longer palatal closure," i.e., /CʲijV/. With back vowels the F2-steady-state duration is shorter, prompting learners to hear the sequence as having a "shorter palatal closure," i.e., /CʲjV/.

In sum, the analyses given in this section demonstrate that there is a definite effect of vowel type on subjects' perception of the four C-V sequences. Sequences containing the vowel /i/ were typically judged as having a greater amount of palatal constriction (where /CʲV/ < /CʲjV/ < /CʲijV/), while sequences containing /a/ or /u/ were typically judged as having a lesser amount of palatal constriction. We can associate palatalization, palatals, and the vowel /i/ with a longer raised F2 steady-state duration. Palatalized sequences containing /a/ and /u/, on the other hand, are associated with a large transition in the second formant frequency. Thus, depending on the sequence and vowel type, subjects tend to attend either to salient duration cues or frequency cues. Furthermore, if we say that perception of voicing involves attending to duration cues and that perception of place of articulation involves attending to frequency cues, then the results of this section provide additional support for previous findings that nonnative distinctions in formant duration are more acoustically salient to L2 learners than nonnative distinctions in formant frequency.

5.5 Results: Learners’ range of experience

How does learners’ range of experience with Russian influence perception performance? Do learners who have had greater exposure to and experience with Russian display a perception pattern more similar to native Russians? Since adult L2 learners can continually modify and improve their L2 phonetic proficiency (Best and Strange, 1992), I hypothesize that linguistically more experienced learners will indeed show improvement over their less experienced counterparts. In order to test the hypothesis, I divided learners' data into two groups. Learners who had studied Russian three years or less were designated as having undergraduate level (UG) experience, while those who had studied Russian four years or more were designated as having graduate level (G) experience. Throughout the following analyses, I continue to take into account the effect of vowel type.

Learners’ confusion matrices were recalculated, this time considering both vowel type and level of experience. The abbreviation "UG" in the following confusion matrices indicates less-experienced students (henceforth "undergraduate learners"), while "G" indicates more-experienced students (henceforth "graduate learners"). As in previous analyses I divided vowel type into two groups: /a/+/u/ and /i/.

5.5.1 Presentation of data

Table 5.8 gives the resulting confusion matrix of the initial raw data. Learners' data are summed over vowel type and according to level of experience. Cells enclosed in a box indicate correct responses. The total quantity of data for undergraduate and graduate learners is almost equal, as indicated by the approximately equal total number of responses for each C-V sequence stimulus. For example, for vowel type /a/+/u/ in C-V sequence type /CV/, there is a total of 630 undergraduate responses and 551 graduate responses. For non-palatalized stimuli of vowel type /i/, there are 315 undergraduate responses and 273 graduate responses. The relative balance between the total quantity of undergraduate and graduate responses gives additional support to the present claims and comparisons of this section.

Stimulus  V type    Group   Learners' responses according to experience and vowel type
                            CV      CʲV     CʲjV    CʲijV
CV        /a/+/u/   UG      580     26      24      0
          /a/+/u/   G       535     8       8       0
          /i/       UG      223     74      15      3
          /i/       G       237     33      1       2
CʲV       /a/+/u/   UG      84      357     174     15
          /a/+/u/   G       49      438     59      0
          /i/       UG      10      258     44      3
          /i/       G       9       235     29      0
CʲjV      /a/+/u/   UG      3       248     280     99
          /a/+/u/   G       0       199     297     48
          /i/       UG      11      41      180     83
          /i/       G       3       38      151     80
CʲijV     /a/+/u/   UG      3       44      82      501
          /a/+/u/   G       1       25      118     402
          /i/       UG      3       24      26      262
          /i/       G       9       13      38      213

Table 5.8. Raw data counts for learners' performance in the perception experiment. Both vowel type and linguistic experience are considered. Vowel type is indicated as either /a/+/u/ or /i/. Data from learners having less linguistic experience are indicated with the abbreviation "UG" (undergraduate) while data from learners having more linguistic experience are indicated with the abbreviation "G" (graduate student).

Table 5.9 presents the preceding raw data counts in percentage according to vowel type. Table 5.9a gives combined data for stimuli sequences containing the vowels /a/ and /u/; Table 5.9b gives data for stimuli sequences containing the vowel /i/. Outlined cells indicate those instances where stimuli and response coincide, i.e., "correct" responses.

a)
Stimulus  Group   Learners' responses according to range of experience and vowel type (/a/+/u/)
                  CV      CʲV     CʲjV    CʲijV
CV        UG      92.1%   4.1%    3.8%
          G       98.0%   1.5%    0.6%
CʲV       UG      13.3%   56.7%   27.6%   2.4%
          G       9.0%    80.2%   10.8%   0
CʲjV      UG      0.5%    39.4%   44.4%   15.7%
          G               36.6%   54.6%   8.8%
CʲijV     UG      0.5%    7.0%    13.0%   79.5%
          G       0.2%    4.6%    21.6%   73.6%

b)
Stimulus  Group   Learners' responses according to range of experience and vowel type (/i/)
                  CV      CʲV     CʲjV    CʲijV
CV        UG      70.8%   23.5%   4.8%    1.0%
          G       86.8%   12.1%   0.4%    0.7%
CʲV       UG      3.2%    81.9%   14.0%   0.9%
          G       3.2%    86.1%   10.6%
CʲjV      UG      3.5%    13.0%   57.1%   26.3%
          G       1.1%    14.0%   55.5%   29.4%
CʲijV     UG      1.0%    7.7%    4.8%    83.2%
          G       3.3%    4.8%    13.9%   78.0%

Table 5.9. Data percentages for learners' performance in the perception experiment according to vowel type and range of experience. Vowel type is indicated as either /a/+/u/ or /i/. Data from learners having less linguistic experience are indicated with the abbreviation "UG" (undergraduate learner), while data from learners having more linguistic experience are indicated with the abbreviation "G" (graduate learner).
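
The step from raw counts (Table 5.8) to row percentages (Table 5.9) can be sketched as follows. This is only an illustrative sketch, not the procedure actually used in the study: the function name `row_percentages` and the list-of-lists layout are assumptions introduced here, and only the undergraduate /a/+/u/ counts are used as input.

```python
# Hypothetical sketch: row-normalize a confusion matrix of raw response counts.
# Rows are stimuli and columns are responses, both ordered CV, CʲV, CʲjV, CʲijV.

STIMULI = ["CV", "CʲV", "CʲjV", "CʲijV"]

# Undergraduate (UG) counts for the vowels /a/+/u/, copied from Table 5.8.
UG_A_U_COUNTS = [
    [580, 26, 24, 0],
    [84, 357, 174, 15],
    [3, 248, 280, 99],
    [3, 44, 82, 501],
]


def row_percentages(counts):
    """Return each cell as a percentage of its row total, rounded to 0.1."""
    table = []
    for row in counts:
        total = sum(row)
        table.append([round(100.0 * cell / total, 1) for cell in row])
    return table


if __name__ == "__main__":
    for stimulus, row in zip(STIMULI, row_percentages(UG_A_U_COUNTS)):
        print(stimulus, row)
```

The diagonal of the returned table (stimulus equals response) corresponds to the "correct" identification rate for each stimulus type.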

In almost all instances the graduate learners demonstrate higher percentages of correct identifications. To more clearly illustrate the absolute difference between undergraduate and graduate learners' percentage scores, I subtracted percentage scores for undergraduate learners from those of graduate learners. Therefore, if graduate learners had higher percentage scores than undergraduates, the difference between their two scores is positive, which, in most instances, indicates improved perception skills. A plus sign "+" before the percentage difference indicates graduate learners' better scores. On the other hand, if graduate learners actually had a lower percentage of "correct" scores than undergraduate learners, then the difference between the two is negative, indicating a decrease in performance. A minus sign "-" before calculated percentage differences indicates graduate learners' weaker scores. Most interestingly, it will be shown that in some instances a negative difference actually indicates improved, more Russian-like, performance. Table 5.10 presents the relevant data. Outlined cells indicate where stimuli and response coincide ("correct" responses). The two percentage differences that are marked with a double underline indicate an especially interesting trend in the graduate learners' data.

a) Change in learners' perception percentages due to linguistic experience for /a/ and /u/; cells in square brackets are "correct" responses

Stimulus        CV        CʲV       CʲjV      CʲijV

CV         [+5.9%]      -2.6%      -3.2%
CʲV          -4.3%    [+23.5%]    -16.8%      -2.4%
CʲjV         -0.5%      -2.8%    [+10.2%]     -6.9%
CʲijV        -0.3%      -2.4%      +8.6%     [-5.9%]

b) Change in learners' perception percentages due to linguistic experience for /i/; cells in square brackets are "correct" responses

Stimulus        CV        CʲV       CʲjV      CʲijV

CV        [+16.0%]     -11.4%      -4.4%      -0.3%
CʲV           0.0%     [+4.2%]    -13.4%      -0.9%
CʲjV         -2.4%      +1.0%     [-1.6%]     +3.1%
CʲijV        -2.3%      -2.9%      +9.1%     [-5.2%]

Table 5.10. Absolute percentage differences between undergraduate and graduate learners' performance in the perception experiment. The upper table gives results for sequences containing the vowels /a/ and /u/. The lower table gives results for sequences containing the vowel /i/. Outlined cells indicate where stimulus and response correspond. Double-underlined data indicate an especially interesting effect of greater linguistic experience.
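
The subtraction behind Table 5.10 can be illustrated with a short sketch. It assumes, for illustration only, that the "correct" (diagonal) percentages for /a/+/u/ are taken as printed in Table 5.9a; the names `experience_differences` and `signed` are hypothetical helpers introduced here and are not part of the original analysis.

```python
# Hypothetical sketch: graduate-minus-undergraduate differences in "correct"
# identification percentages (diagonal cells) for the vowels /a/+/u/.

STIMULI = ["CV", "CʲV", "CʲjV", "CʲijV"]

# Diagonal ("correct") percentages as printed in Table 5.9a.
UG_CORRECT = [92.1, 56.7, 44.4, 79.5]
G_CORRECT = [98.0, 80.2, 54.6, 73.6]


def experience_differences(undergraduate, graduate):
    """Subtract undergraduate scores from graduate scores, cell by cell."""
    return [round(g - u, 1) for u, g in zip(undergraduate, graduate)]


def signed(value):
    """Format a difference with an explicit plus or minus sign."""
    return f"{value:+.1f}%"


if __name__ == "__main__":
    differences = experience_differences(UG_CORRECT, G_CORRECT)
    for stimulus, difference in zip(STIMULI, differences):
        print(stimulus, signed(difference))
```

A positive value means graduate learners chose the "correct" label more often; as the discussion of palatalized-i-yod stimuli shows, a negative value is not necessarily a less Russian-like result.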

Previous analyses of the present perception data support claims about learners' perceptual acoustic strategies and L2 development. First, L2 learners' production and perception capabilities are associated. Both the production data of chapter 4 and normative descriptions of learners' phonetics indicate that learners tend to produce mono-segmental Russian palatalized consonants /Cʲ/ as a sequence of two segments, /C/+/j/. Learners' production strategies, however, also influence perception, since learners also tend to hear simple-palatalized consonants as bi-segmental; this claim is supported by instances where learners identify simple-palatalized stimuli as palatalized-yod. Second, there is an effect of vowel type which indicates that formant duration properties may be more acoustically salient than formant frequencies. Third, while native L1 speakers tend to listen 'linguistically' (incorporating both phonetic and functional knowledge), L2 learners tend to hear L2 'phonetically' (attending only to acoustic differences, disregarding their functional status within the L2). Do the data in Table 5.10 provide additional evidence of the aforementioned strategies? Most importantly, do the data show that increased linguistic exposure to L2 results in more Russian-like perceptual patterns?

Finally, it is important to keep in mind that Russians exhibited almost perfect identification of the non-palatalized, simple-palatalized, and palatalized-yod sequences (99.6%, 99.8% and 95.4%, respectively). Russians' identification of palatalized-i-yod sequences, however, was much lower, 26.0%.

5.5.2 Discussion

Non-palatalized stimuli. For both vowel types /a/+/u/ and /i/, graduate learners demonstrate better performance than undergraduate learners. According to the absolute percentage differences (+5.9% for /a/+/u/, +16.0% for /i/), learners demonstrate greater improvement with sequences containing /i/. The percentage data in Table 5.9 indicate that undergraduate learners already display fairly accurate identification (92.1% for /a/+/u/, 70.8% for /i/) of non-palatalized sequences. There is not much room for improvement with sequences containing a back vowel. An increase of +16% for the vowel /i/, however, indicates modification in learners' L2 phonological space and resulting perceptual patterns.

I attribute graduate learners' better performance to their improved sensitivity to the allophonic realizations of Russian /i/. When /i/ follows soft consonants, it is allophonically realized as the high front vowel [i]. When /i/ follows hard consonants, however, it is allophonically realized as a mid high vowel, [ɨ], with lower F2 frequencies. I suggest that beginning learners of Russian initially perceive both allophonic variants [i] and [ɨ] as two phones that are "similar" to their L1 /i/ segment. Learners, therefore, initially perceptually map both [i] and [ɨ] onto one English category, /i/, an example of category-goodness mapping from Strange's Perceptual Assimilation Model (cf. chapter 3.3, fn. 8). With increased exposure to Russian, learners modify their phonological space, becoming more attuned to distinguishing acoustic qualities of the allophones of /i/.

Simple-palatalized stimuli. For /CʲV/ sequences, the greatest improvement in performance occurs with the vowels /a/ and /u/ (as opposed to /i/ with the preceding non-palatalized sequences). Learners exhibit an increase of +4.2% for sequences containing the front vowel /i/. They exhibit an improvement of +23.5% for sequences containing back vowels /a/+/u/, the largest difference in performance for all sequences over both vowel types. What accounts for this large improvement in identification with the vowels /a/ and /u/? Notice that a majority of the increase in correct /CʲV/ responses corresponds to a large decrease in /CʲjV/ responses. I propose that as learners modify their perceptual strategies and phonological space, they begin to associate a large negative change in F2 frequency (without a noticeable raised F2 steady-state duration) with the phenomenon of simple palatalization. Learners are, therefore, paying more attention to less acoustically salient changes in frequency. As a result, learners cease to perceive Russian simple-palatalized sequences through their L1 sieve, which associates Russian simple palatalization with its most similar English realization, /C/+/j/.

Palatalized-yod stimuli. For sequences containing the vowels /a/ and /u/, graduate learners exhibit a better performance over undergraduates by +10.2%; rather surprisingly, when the vowel is /i/, graduate learners display fewer "correct" identifications than undergraduate learners (-1.6%). In spite of their higher percentages of correct responses, for both vowel types graduate learners still perform at barely above chance (54.6% for /a/+/u/, 55.5% for /i/). It seems that even after 4, 5, or 6 years of formal study plus months of study in-country, palatalized-yod sequences still elude L2 learners.

For sequences containing the vowels /a/+/u/, an increase in correct /CʲjV/ responses was due more to a decrease in /CʲijV/ responses (-6.9%) than in /CʲV/ responses (-2.8%). More importantly, if we consider that even the graduate learners incorrectly identified 36.6% of /CʲjV/ as /CʲV/ (still more than one-third of their responses), it seems that advanced learners continue to experience difficulty hearing /CʲjV/ sequences. Learners hear more than one-third of palatalized-yod stimuli as exemplars of a "less palatal" simple-palatalized sequence. I again propose that the F2 pattern associated with the back vowels /a/ and /u/, with its large negative drop in F2 frequency, perceptually "shortens" the acoustic portion of the signal that conveys the palatal qualities more for learners than for natives.

Palatalized-i-yod. Perhaps results for the palatalized-i-yod sequences give the most surprising findings of absolute comparisons between undergraduate and graduate learners' perception performance. Previous discussion of Russians' results indicated that due to the extremely low functional load of syllable-final stressed /CʲijV/ sequences, Russians identified /CʲijV/ sequences overwhelmingly as /CʲjV/. When the vowel is /a/ or /u/, Russians identified only 13.9% of /CʲijV/ stimuli as /CʲijV/; when the vowel is /i/, Russians identified only 30.7% of /CʲijV/ stimuli as /CʲijV/.

For both vowel types, graduate learners identified fewer /CʲijV/ stimuli as /CʲijV/ than did the undergraduates (-5.9% for /a/+/u/ and -5.2% for /i/). In light of the Russian identification pattern for palatalized-i-yod sequences, graduate learners' negative numbers actually indicate improved performance. Graduate learners better reflect the native-Russian perception pattern of these marginal sequences. It seems that graduate learners' performance indicates improved functional knowledge of Russian. Finally, graduate learners' decrease in the number of identified palatalized-i-yod sequences comes at an increase in the number of palatalized-yod sequences. Graduate learners seem to be retraining their perceptual systems so that they listen not only phonetically but also functionally.

5.6 Summary

Because humans' production and perception skills are inextricably linked, a thorough investigation of language proficiency must address both skills simultaneously. To this end, building on the results of the production study of the preceding chapter, this chapter continues to investigate Russian sound sequences which are especially problematic for adult L2 learners: sequences containing palatalized consonants. Using as its source productions from the recordings described in chapter 4, the perception study provides insights into numerous facets of the "perception-production equation."

The relationship between production and perception capabilities has been established for the L1 acquisition process: perception precedes production. The developmental relationship of adult L2 learners' production-perception capabilities, however, has yet to be defined with such certainty. One significant complicating factor in our quest to define L2 production-perception is that L1 influences the L2 acquisition process. Adult L2 acquisition unavoidably incorporates L1. And because the extent of native-language interference remains undefined, there "is not a 1:1 relationship" (Major, 1994:192) between L2 perception and production. Initial attempts to define the order of L2 acquisition indicate that L2 production precedes perception. (See Goto, 1971.) However, results of this perception study indicate an opposite tendency.

Certain nonnative phones present greater difficulty to L2 learners than other L2 phones. Why is this? Current research offers explanations which focus on phonemic factors, phonetic familiarity, and acoustic saliency. However, these three factors address only the acoustic aspects of speech and completely disregard how our perceptual systems process the sound input. For while certain acoustic parameters may be evident in a spectrogram (and, therefore, assumed to be salient), when the same sounds are filtered through our L1 perceptual system, their functional role within the language affects their final perception. Thus, previously assumed psychoacoustically salient qualities (based on acoustic parameters alone) may not (when the role of the perceptual system is included) always be perceptually salient. Native competency of a language, then, presupposes complete linguistic knowledge: phonetic knowledge and functional knowledge. The degree to which listeners possess these two kinds of knowledge characterizes their perceptual strategies. Those listeners who have incomplete linguistic knowledge and possess only phonetic knowledge, such as L2 learners, will tend to identify sounds based only on acoustic cues, disregarding the sounds' functional role in the language. As a result, nonnative speakers tend to listen 'phonetically.' Native speakers of that language, however, will display a different perceptual pattern, one which draws on phonetic and functional competency. Native speakers can be said to listen 'linguistically.' Complete L2 acquisition, therefore, involves attention not only to properties of L2 sounds but also to the relative frequency of their use. Prior to conducting the present perception study, I assumed that the vast majority of adult L2 learners would not acquire subtle aspects of L2 functional knowledge. However, a comparison of Russians' and learners' perceptual patterns indicates otherwise: L2 learners do acquire L2 functional knowledge.

This chapter presented the data and their analyses in three sections; each section focused on a different aspect of the perceptual process. Section 5.3 investigated Russians' and learners' differing perceptual strategies. Because learners tend to listen 'phonetically', they correctly identify syllable-final stressed /CʲijV/ sequences at a much higher rate than Russians. Knowing inherently that such syllable-final stressed sequences are rare in Russian, native Russians listened 'linguistically' and "incorrectly" identified them as /CʲjV/ (73%). I propose that the relationship between syllable-final stressed /CʲjV/ and /CʲijV/ is, therefore, in a state of near-merger in Russian: Russians maintain a difference in their production (cf. chapter 4) but not in perception. These findings are unusual among studies of L2 acquisition because they show a contrast that is perceived more accurately by second language learners than it is by native speakers. Section 5.3 also provides evidence that L2 learners' perception skills are more developed than their production capacities (at least in the situation we set up). While learners did not distinguish the simple-palatalized and palatalized-yod sequences in production (cf. chapter 4), they were able to distinguish them (although imperfectly) in perception. Interestingly, the contrast that L2 learners failed to produce is not the functionally weak contrast (i.e., /CʲjV/ vs. /CʲijV/) but rather the more important (and perhaps acoustically and gesturally subtle) contrast between the simple-palatalized and the palatalized-yod sequences.

Taking into account the effect of vowel quality, section 5.4 investigated listeners' acoustic strategies. Specifically, this study addressed the relative acoustic saliency of front and back vowels in palatalized C-V sequences. The three vowels of this study were divided into two groups, front /i/ vs. back /a/ and /u/. The prominent acoustic quality of front vowel /i/ is a raised, prolonged F2-steady-state duration, which perceptually increases the degree of frontal closure (palatal or palatalization, where I suggest a hierarchy of /CʲV/ < /CʲjV/ < /CʲijV/) of sequences containing it. The prominent acoustic quality of the back vowels /a/ and /u/ is a large negative F2 transition following release of the palatalized consonant. It seems that this transition perceptually shortens the amount of perceived frontal closure. Thus, the presence of the front vowel /i/ causes listeners (both Russians and learners) to hear sequences as more "palatal," while the back vowels /a/ and /u/ cause listeners to hear sequences as less "palatal."

Focusing on only learners' data, section 5.5 investigated how the range of experience of learners affects their perceptual patterns. Learners' data were divided into two groups: those studying Russian 3 years or less were designated as "undergraduates" (UG); those studying Russian 4 years or more as "graduates" (G). For all four C-V sequences (/CV/, /CʲV/, /CʲjV/, and /CʲijV/) graduate learners displayed improved perception over undergraduate learners. It must be kept in mind, however, that improved perception implies more Russian-like perception. And recall that, because of the state of near-merger between syllable-final stressed /CʲjV/ and /CʲijV/, Russians "correctly" identified /CʲijV/ for only 23% of the presented stimuli. Russian-like perception entails hearing /CʲijV/ as /CʲjV/. Comparison of undergraduate and graduate learners' identification of /CʲijV/ sequences showed that graduate learners "correctly" identified fewer /CʲijV/ than did the undergraduate learners. Quite surprisingly, graduate learners displayed more Russian-like perception of /CʲijV/ sequences, displaying a tendency to hear /CʲijV/ as /CʲjV/. Thus, it seems that our advanced learners have acquired some degree of functional knowledge of a subtle and rare phonetic contrast of Russian.

In sum, analyses of the perception study data have been fruitful and have answered many of the questions posed in this chapter's introduction:

• Learners' perception skills seem to precede production.

• Learners and Russians adopt different perceptual strategies. Russians listen 'linguistically,' while learners listen 'phonetically.'

• Formant duration cues seem to be more acoustically salient than formant frequency cues.

• With increased exposure to L2, learners develop not only L2 phonetic knowledge but also L2 functional knowledge.

CHAPTER 6

CONCLUSION

This dissertation has discussed second language phonetic acquisition in light of several theories of general linguistics. The strength and innovation of this study is its focus on the dynamic nature of native and nonnative speakers' production and perception of sound sequences. I have studied sound sequences that are notoriously problematic for L2 learners of Russian: the palatalized consonants of Russian. Comparing and contrasting native-Russians' and learners' performance in acoustic production and perception studies, I have presented and explained in detail learners' related accented productions and perceptions.

In particular, I have discussed possible sources of adult L2 learners' accented productions. Explanations of orthography, L1 interference and interpretation, and acoustic saliency were offered. To determine significant dynamic acoustic properties of Russians' and learners' productions of Russian palatalized sound sequences, I conducted an acoustic-phonetic study. In agreement with SLA studies, I proposed that learners acquire L2 sounds in gradual continuous stages; however, unlike these previous studies, I proposed a framework, Gestural Phonology, for understanding learners' gradual acquisition and possible resulting "accented" INTERPHONETICS. Providing gestural scores of native-Russians' and learners' productions, I offered detailed illustrations of learners' accented inter-articulatory timings.

Finally, I conducted a speech perception study of listeners' responses to the acoustic properties identified in the production study. Comparing and contrasting the production and perception results, I offered pertinent information on native speakers' and learners' differing production and perception skills, discussed salient properties of the palatalized consonants of Russian, and demonstrated Russians' and learners' different degrees of linguistic knowledge.

Chapter 1 discussed several relevant theories of L1 and L2 acquisition. The vast majority of adult second-language learners who begin their study of L2 somewhere after the age of 7 will never attain native-like phonetics. Why is this? One theory, the Critical Age Hypothesis, attributes adult L2 learners' inevitable accentedness to irreversible changes in neurological patterning or processing that occur by the time of puberty. The Critical Age Hypothesis claims that after puberty we no longer have access to the original linguistic processes that were active during L1 acquisition; therefore, the theory predicts that adult language learners are more or less predestined to produce L2 with a permanent accent. Recent research, however, has focused on adult learners' capacity to continually improve their L2 pronunciations. Thus, while adult L2 learners may not actually be able to attain native-like L2 phonetics, they can continue to reduce their degree of accentedness. Setting the stage for later chapters, the first chapter suggested, in agreement with previous work, that adult L2 phonetic acquisition occurs in gradual continuous stages. The notion that acquisition of L2 phonetics is a gradient, continual, dynamic process, rather than a categorical, discrete one, is of fundamental importance to the approach of this study.

Adult learners perceive L2 sounds through, or in relation to, their abstract L1 phonological space; the process of perceiving L2 in relation to L1 is sometimes referred to as "phonological interpretation" and is a significant source of learners' phonetic inaccuracies. Phonological interpretation is associated with mapping L2 phones onto L1 phonological categories. However, because the relationship between L1 and L2 phones is not 1:1, inaccurate mappings obtain that lead to learners' accentedness. To this end, I adopted Best and Strange's Perceptual Assimilation Model (PAM) as a framework for describing learners' problematic mappings of the palatalized consonants of Russian. In this view we can describe the initial sound sequences in English words such as 'view', 'music', and 'beautiful' as "similar" to both Russian /CʲV/ and /CʲjV/ sequences. I proposed that learners map the two (simple-palatalized and palatalized-yod) Russian sequences onto "similar" English /C/+/j/ sequences; as a result of associating a single sequence type from L1 with two kinds of sequences in L2, learners do not distinguish Russian simple-palatalized and palatalized-yod sequences. Traditional descriptions of L2 learners' accented productions as sequential /C/+/j/ are consistent with this interpretation. It was further proposed that production and perception studies that focus on these particular palatalized sequences would shed light on the proposed model of learners' inaccurate mappings and learners' ability (or inability) to distinguish these two Russian sound sequences.

The relationship between production and perception capabilities was also discussed since their relative development is not well-defined for L2 learners. L1 perception skills precede L1 production (perhaps indicating that before we can produce a sound, we must first know what it is supposed to sound like). The production-perception relationship of L2 acquisition, however, remains unclear. Initial findings have indicated that L2 production precedes perception.

Giving particular emphasis to palatalized consonants, chapter 2 provided an overview of the Russian sound system and descriptions of learners' accented productions. Palatalization was shown to be a dominant organizing feature of the Russian sound system: all but one anterior segment is paired for palatalization. Given the fundamental role of palatalization within the Russian sound system, L2 learners of Russian should acquire palatalization in order to attain any real level of phonetic proficiency. However, in reality, learners experience protracted difficulties acquiring the palatalized consonants. It was also shown that there is an even more difficult sound environment than simple-palatalized for learners to acquire: a palatalized consonant followed by the front glide yod /j/, i.e., /Cʲj/.

Several sources of difficulties in learning palatalized consonants were considered. First, Russian orthography, with its multiple means of graphically expressing underlying consonantal palatalization, confuses learners. Second, the lingual articulation necessary to produce palatalization is physically "imperceptible" and, therefore, difficult to acquire. Third, general articulatory bases of Russian and English are quite different: English is "apical" while Russian is "dorsal". The "apical" lingual patternings of native-English speakers, with the tongue body lowered and further away from the palate, hinder L2 production of Russian palatalized consonants; it is almost as if the tongue body is "not ready" to make the necessary dorsum-to-palate articulation. Moreover, for almost all Russian primary articulations, there can be two associated lingual articulations: one with the tongue fronted and raised (resulting in a general convex shape) or one with the tongue backed and lowered (resulting in a general concave shape). English lingual articulations do not exhibit the paired convex-concave opposition.

In articulatory terms, palatalization is a secondary articulation that involves raising and fronting of the tongue body. In palatalized consonants primary and secondary closure onsets and releases occur almost simultaneously. L2 learners experience particular difficulties mastering the simultaneous nature of Russian palatalization. Learners instead draw on the articulatory timings of "similar" L1 English environments found in words such as "view" and "music"; as a result they produce sequential, rather than simultaneous, articulations. Accented productions are, therefore, phonetically realized as some degree of two segments, /C/+/j/. Thus, the L2 acquisition process requires L2 learners to acquire not only new L2 segments but also new language-specific articulatory timings. Finally, chapter 2 presented the most prominent acoustic properties of palatalized consonants. Here we saw that the second formant frequency (F2) is positively related to tongue height: the higher and fronter the tongue the higher the F2; the lower and backer the tongue the lower the F2. Therefore, palatalized consonants, with their simultaneous secondary fronting of the tongue body, are characterized by a raised F2 band of approximately 2,000-2,400 Hz that is evident at consonant release. Discussions in chapter 2 of the role of language-specific articulatory timings and degree of closure laid the groundwork for interpretation of the data presented in chapter 4.

Chapter 3 presented several phonological frameworks in light of their ability to account for learners' accented productions and their gradual acquisition of L2. The desired descriptive framework would need to be dynamic and non-categorical, one founded on an intrinsic timing approach with phonological units that can interact on a continuum. Based on their external timing and categorically organized features, traditional linear and non-linear phonologies were shown to be inadequate for the purposes of this study. Gestural Phonology, it was argued, is the most useful model for the purposes of this study. The strength of Gestural Phonology is its phonological unit and the unit's relative organization within the system. Instead of traditional acoustic or articulatory "features," Gestural Phonology postulates as its primary phonological unit the GESTURE, where the gestural score gives the relative temporal organization among gestures. Because gestures have intrinsic timing, they can overlap with other gestures in a continuous manner, thereby clearly illustrating native and non-native articulatory timings. Gestural amplitude also varies on a continuum, where greater amplitude is associated with a greater degree of closure and vice versa. It was shown that Russian palatalization is expressed with gestures on the Tongue Body tier. Because the palatalizing Tongue Body gesture can temporally overlap other gestures for associated primary articulations in a continuous manner, learners' gradual stages of phonetic acquisition of the palatalized consonants can be accounted for as a gradual reorganization of the gestural score and gestural specifications.
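
The idea of gestures that overlap on a continuum can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the `Gesture` class, the helper functions, and above all the onset and offset values in milliseconds are hypothetical and are not measurements or gestural scores from this study. Gestural amplitude is left out to keep the sketch short.

```python
# Hypothetical sketch of a two-tier gestural score. All timing values below
# are invented for illustration; they are NOT data from the dissertation.
from dataclasses import dataclass


@dataclass(frozen=True)
class Gesture:
    tier: str          # e.g. "Lip Aperture" or "Tongue Body"
    onset_ms: float    # gesture onset, relative to primary closure onset
    offset_ms: float   # gesture offset, same time base


def overlap_ms(a, b):
    """Duration for which two gestures are active at the same time."""
    return max(0.0, min(a.offset_ms, b.offset_ms) - max(a.onset_ms, b.onset_ms))


def onset_lag_ms(primary, secondary):
    """How long after the primary gesture the secondary gesture begins."""
    return secondary.onset_ms - primary.onset_ms


# Primary labial closure, plus a near-simultaneous (native-like) and a
# delayed (accented, more sequential) palatalizing Tongue Body gesture.
LIP_CLOSURE = Gesture("Lip Aperture", 0.0, 100.0)
TONGUE_BODY_NATIVE = Gesture("Tongue Body", 0.0, 130.0)
TONGUE_BODY_LEARNER = Gesture("Tongue Body", 60.0, 190.0)

if __name__ == "__main__":
    for label, tongue_body in [("native-like", TONGUE_BODY_NATIVE),
                               ("accented", TONGUE_BODY_LEARNER)]:
        print(label,
              "lag:", onset_lag_ms(LIP_CLOSURE, tongue_body),
              "overlap:", overlap_ms(LIP_CLOSURE, tongue_body))
```

Under these invented values, gradual acquisition would correspond to the onset lag shrinking toward zero while the overlap grows, which is a continuous rather than categorical change.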

Chapter 4 reported an acoustic-phonetic study that investigated Russian C-V sequences containing palatalized consonants (/CʲV/, /CʲjV/ and /CʲijV/: simple-palatalized, palatalized-yod and palatalized-i-yod, respectively). Subjects were both native-Russian speakers (Russians) and adult native-English-speaking learners of Russian (learners). Because F2 frequencies reflect the frontness-backness of the tongue body in the oral cavity and because Russian palatalization is articulated by raising and fronting the tongue body, acoustic measurements focused on F2 frequencies and steady-state durations. Graphical and statistical analyses of the data revealed Russians' and learners' differing acoustic patterns and, by inference, learners' accentedness. Russians clearly made more acoustic (and, therefore, articulatory) distinctions than learners. Russians distinguished /CʲV/ and /CʲjV/ both in F2-at-transition-onset frequency and F2-steady-state duration. Learners, on the other hand, did not distinguish /CʲV/ and /CʲjV/, either in F2-at-transition-onset frequency or in F2-steady-state duration. Simple-palatalized and palatalized-yod are clearly very different sound sequences for Russians, while they are essentially the same for learners. Russians produced /CʲjV/ and /CʲijV/ with identical F2-at-transition-onset frequencies but different F2-steady-state durations. Learners produced the same relative pattern of F2 distinctions for /CʲjV/ and /CʲijV/ sequences as did Russians. I proposed that learners' more Russian-like pattern of production was prompted by the familiar and articulatorily salient front vowel /i/.

An additional acoustic-phonetic study compared four advanced graduate-student learners' productions of native-English /C/+/j/+/V/ (in the three words 'beautiful', 'music', and 'view') with their "similar" L2 productions of Russian /CʲV/, /CʲjV/ and /CʲijV/ (where C = /bʲ/, /mʲ/ and /vʲ/ and V = /u/). Sample spectrograms were offered for one female and one male subject. For each subject, time-aligned F2 trajectories of mean F2 frequencies and durations were given. Most significantly, three of the four subjects distinguish, in the articulatory timings, their productions of native-English sequences versus the Russian palatalized sequences; in particular, learners' productions of English /C/+/j/ sequences exhibited an F2 transition of longer duration and of shallower slope. Learners, therefore, produce English /C/+/j/ with a palatal gesture that is phased later (and possibly is less stiff) than the palatal gestures they use to produce Russian /CʲV/, /CʲjV/ and /CʲijV/. It was concluded that, in fact, three of these four learners have actually acquired to some degree the more simultaneous inter-gestural timings of Russian palatalized consonants.

Finally, Russians' and learners' acoustic patterns were associated with hypothetical gestural scores via Gestural Phonology. In this manner, dynamic differences among Russians' and learners' accented inter-articulatory timings and degree of closure were clearly illustrated. Articulatory distinctions among the three palatalized C-V sequences (/CʲV/, /CʲjV/, and /CʲijV/) were all attributed to properties of inter-articulatory timing as well as amplitude and duration of the Tongue Body gesture.

Incorporating production study results, simplified gestural scores were offered for Russians' and learners' productions of a single representative palatalized segment. I chose the voiced bilabial palatalized stop /bʲ/ since its Lip Aperture and Tongue Body gestures are on separate tiers. Inter-gestural timings and gestural duration and amplitude of both the primary labial closure and secondary palatalizing closure could, therefore, be clearly illustrated. The palatalizing Tongue Body gesture of Russian was proposed to begin simultaneously with primary closure onset and, based on acoustic data, to end shortly after primary closure release. The hypothetical gestural score of native-Russians for each of the sequences /bʲ/, /bʲj/ and /bʲij/ exhibited a Tongue Body gesture of increasing duration and amplitude corresponding to the ordering of F2-at-transition-onset frequencies and F2-steady-state durations found in the production study.

The hypothetical gestural scores for learners exhibited a different pattern. In light of the results from native-English productions of /C/+/j/, the fact that absolute F2-steady-state durations in /CʲV/ sequences of learners were longer than those of Russians was interpreted as support for previous claims that learners tend to produce Russian mono-segmental /Cʲ/ as bi-segmental /C/+/j/. In learners' gestural scores, the palatalizing Tongue Body gesture onset and offset were, therefore, indicated as occurring later than those of Russians (in relation to the Lip Aperture gesture). Because learners produced /CʲV/ and /CʲjV/ sequences with identical F2-at-transition-onset frequencies and F2-steady-state durations, the hypothetical gestural score of learners for /bʲj/ is the same as their gestural score for /bʲ/. Learners did, however, distinguish /CʲjV/ and /CʲijV/, producing /CʲijV/ with a higher F2-at-transition-onset frequency and longer F2-steady-state duration, which was reflected in a larger and longer Tongue Body gesture.

L2 learners’ acquisition of Russian palatalized consonants was proposed to occur gradually, beginning with sequential L1 /C/+/j/ and, over time, approaching simultaneous native-Russian /Cʲ/. It was hypothesized that beginning learners (or those learners who have a strong accent) would produce the primary Lip Aperture gesture and secondary Tongue Body gesture sequentially. With increased exposure to Russian, learners slowly internalize language-specific relative articulatory timings.

Improvements in L2 production of the palatalized consonants are seen in a Tongue Body gesture onset that temporally shifts towards the Lip Aperture gesture onset. In the gestural score, learners’ improved articulatory timings are illustrated by shifting the Tongue Body gesture towards the Lip Aperture gestural onset. It was hypothesized that learners' adjustments in the inter-articulatory timings between Lip Aperture gesture and Tongue Body gesture occur in gradual, continuous stages.
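The hypothesized gradual shift of the Tongue Body gesture toward the Lip Aperture onset can be sketched as a simple data structure. This is only an illustration under stated assumptions: the `Gesture` class, the `shift_toward` helper, and every timing and amplitude value are hypothetical and are not taken from the gestural scores of chapter 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gesture:
    """One gesture on one tier of a gestural score (hypothetical structure)."""
    tier: str
    onset_ms: float
    offset_ms: float
    amplitude: float


def onset_lag_ms(primary: Gesture, secondary: Gesture) -> float:
    """How much later the secondary gesture begins than the primary one."""
    return secondary.onset_ms - primary.onset_ms


def shift_toward(primary: Gesture, secondary: Gesture, fraction: float) -> Gesture:
    """Move the secondary gesture earlier by a fraction of its current lag.

    fraction=0.0 leaves the sequential timing unchanged; fraction=1.0 yields
    simultaneous onsets, as proposed for native Russian.
    """
    delta = onset_lag_ms(primary, secondary) * fraction
    return Gesture(secondary.tier,
                   secondary.onset_ms - delta,
                   secondary.offset_ms - delta,
                   secondary.amplitude)


# Invented values for a strongly accented, sequential production of /bʲ/.
lip_aperture = Gesture("Lip Aperture", 0.0, 100.0, 1.0)
tongue_body = Gesture("Tongue Body", 80.0, 200.0, 0.6)

halfway = shift_toward(lip_aperture, tongue_body, 0.5)
print(onset_lag_ms(lip_aperture, halfway))
```

Because `fraction` is continuous, the sketch reflects the hypothesis that the adjustment proceeds in gradual stages and not in a single categorical step.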

Linguistic production and perception capabilities are inextricably intertwined: changes and improvements in one affect the other. For this reason, studies which endeavor to define the complex L2 acquisition process—including the organization and processing of sounds—need to simultaneously address both skills. To this end, chapter 5 reported a speech perception study that investigated Russians’ and learners’ perception of the palatalized C-V sequences from the production experiment of chapter 4.

Perception study data provided interesting insights into several aspects of learners’ and Russians' perceptual patterns.

The perception study consisted of an identification task where subjects were presented C-V sequences (/CV/, /CʲV/, /CʲjV/, and /CʲijV/) and asked to indicate on a four-alternative forced-choice answer sheet what they had heard. Subjects' responses were tabulated and confusion matrices were constructed for the two language groups (Russians and learners). Learners exhibited a greater degree of overall confusion than Russians. Thus, while the production study of chapter 4 demonstrated learners' production accents, the perception study revealed their perceptual accents as well.
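The tabulation of four-alternative forced-choice responses into a confusion matrix can be sketched as follows. The category labels, function names, and the sample trials are assumptions made for illustration; they do not reproduce the study's answer sheets or data.

```python
from collections import Counter

# Hypothetical ASCII labels for the four response alternatives:
# non-palatalized, simple-palatalized, palatalized-yod, palatalized-i-yod.
CATEGORIES = ["CV", "CjV", "CjjV", "CjijV"]


def confusion_matrix(trials):
    """Tabulate (stimulus, response) pairs as matrix[stimulus][response]."""
    counts = Counter(trials)
    return {s: {r: counts[(s, r)] for r in CATEGORIES} for s in CATEGORIES}


def percent_correct(matrix, stimulus):
    """Percentage of presentations of a stimulus that were identified as itself."""
    row = matrix[stimulus]
    total = sum(row.values())
    return 100.0 * row[stimulus] / total if total else 0.0


# Invented responses from one listener.
sample_trials = [
    ("CV", "CV"), ("CV", "CV"),
    ("CjV", "CjV"), ("CjV", "CjjV"),
    ("CjijV", "CjjV"),
]
matrix = confusion_matrix(sample_trials)
print(percent_correct(matrix, "CjV"))
```

Pooling the trials of all listeners in a language group before calling `confusion_matrix` would give one matrix per group, which is the form in which the two groups are compared here.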

Russians showed virtually no confusion in identifying non-palatalized, simple-palatalized and palatalized-yod sequences. Russians’ pattern of distinguishing the three sequences in production was, therefore, positively reflected in their perceptual patterns. Learners identified non-palatalized consonants with a relatively high degree of accuracy (89%). However, as expected, learners displayed much lower accuracy identifying the simple-palatalized (73%) and palatalized-yod (52%) sequences. Most importantly, learners’ perceptual patterns did not coincide with their patterns of production. Learners in no way distinguished /CʲV/ and /CʲjV/ in production but did, albeit imperfectly, distinguish the two sequences in perception. The results of the present study indicate that, at least for the conditions we set up, learners’ perception capabilities precede production capabilities. These findings contrast with previously cited L2 research which found L2 production to precede perception.

Subjects’ identification of palatalized-i-yod sequences provided interesting insights into speakers’ different levels of linguistic knowledge. Native competency in a language entails linguistic knowledge that is both phonetic and functional. Do L2 learners possess the same degrees of linguistic knowledge that native-speakers do? If not, does learners' linguistic knowledge improve with increased exposure to L2? Results of this study indicate that learners do not possess the same kind of linguistic knowledge as Russians. For example, Russians incorrectly identified 73% of palatalized-i-yod stimuli as exemplars of palatalized-yod sequences. Meanwhile, learners correctly identified 78% of palatalized-i-yod stimuli. How can the seemingly better performance of learners be explained? Word-final syllable-final stressed /CʲijV/ sequences are quite rare and, therefore, have relatively low functional weight within the Russian system. The perception study indicated that native-speakers of a language listen ‘linguistically’ (taking into account phonetic properties and relative functional weight of the tokens), while nonnative learners listen ‘phonetically’ (paying attention only to acoustic differences that Russians intuitively feel to be disregardable). Furthermore, because Russians maintained a difference between /CʲjV/ and /CʲijV/ sequences in production, but not in perception, palatalized-yod and palatalized-i-yod sequences were described as being in a state of a modified near-merger in Russian.

Additional analyses of the perception data of chapter 5 provided other insights into native-speakers’ and learners’ perceptual strategies. When the effect of vowel was taken into account, sequences containing the front vowel /i/ were heard as having a greater degree of palatal closure (where /CʲV/ < /CʲjV/ < /CʲijV/), while sequences containing the back vowels /a/ and /u/ were heard as having a lesser degree of palatal closure. It was proposed that the presence of the front vowel /i/, with its associated raised F2-steady-state of considerable duration, perceptually lengthened the amount of palatal closure within the C-V sequences. On the other hand, the presence of the back vowels /a/ and /u/, with their associated brief raised F2-steady-state and abrupt negative F2 frequency transition, perceptually shortened the amount of palatal closure within the C-V sequences. Initial results from other L2 phonetic training studies indicate that nonnative qualities of voicing are easier to learn than nonnative qualities of place of articulation (Strange, 1995:36). Generally speaking, voicing is cued by temporal patterns, and place of articulation is cued by spectral patterns. I further conjectured that the better performance of learners with new voicings (rather than new places of articulation) indicates that properties of formant duration are more acoustically salient than properties of formant frequency. The fact that our subjects tended to identify sequences as having a greater degree of palatal closure when the sequences contained the more acoustically salient raised F2-steady-state duration provides additional support for Strange’s (1995) claims.

Finally, chapter 5 also investigated how learners’ range of linguistic experience with L2 affects their perceptual patterns. The findings here were also quite surprising. Learners’ data were divided into two groups, based on their number of years of formal study of Russian. Those learners who had studied Russian three years or less were designated as “undergraduate learners” (UG); those who had studied Russian four years or more were designated as “graduate learners” (G). Confusion matrices for the two learner groups were compared. In all instances but one (which was of a very small magnitude), graduate learners displayed improved perception over undergraduate learners. In other words, with increased exposure to L2, learners can continue to improve both production and perception skills and to decrease their degree of production and perceptual accents. Most interestingly, graduate learners revealed acquisition of Russian functional knowledge. Graduate learners displayed improved, more Russian-like perception of the palatalized-i-yod sequences. That is, compared with undergraduate learners, they identified fewer palatalized-i-yod stimuli as palatalized-i-yod, tending instead to hear them as palatalized-yod. The perception study results seem to indicate that learners actually acquired, to some degree, this subtle nuance of the Russian language.
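The division of learners by years of formal study can be written as a short Python sketch. The function names and the sample learner records are hypothetical; only the cutoff (three years or less versus four years or more) comes from the text, and years are assumed to be recorded as whole numbers.

```python
def learner_group(years_of_study: int) -> str:
    """Return "UG" for three years or less of formal study, "G" for four or more."""
    if years_of_study < 0:
        raise ValueError("years of study cannot be negative")
    return "UG" if years_of_study <= 3 else "G"


def split_learners(years_by_learner: dict) -> dict:
    """Group learner identifiers under "UG" and "G" according to years of study."""
    groups = {"UG": [], "G": []}
    for learner, years in years_by_learner.items():
        groups[learner_group(years)].append(learner)
    return groups


# Invented learner identifiers and years of study.
sample_learners = {"S1": 2, "S2": 3, "S3": 4, "S4": 7}
print(split_learners(sample_learners))
```

A separate confusion matrix could then be built from each group's pooled responses and the two matrices compared cell by cell.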

In sum, this dissertation addressed several aspects of L2 phonetic acquisition in a study of the acquisition of the Russian palatalized consonant sequences. Several explanations for learners' perceptual accents were offered including, but not limited to, L1 phonological interpretation and resulting incorrect perceptual mappings; deep, underlying differing articulatory patterns of Russian and English; learners’ difficulty in acquiring new Russian articulatory timings; acquisition of linguistic knowledge (including phonetic and functional knowledge); issues of psychoacoustic and perceptual saliency; and the overall relationship between native and nonnative speakers’ production and perception capabilities. Insofar as the theories discussed here extend to general theories of second-language acquisition, this work contributes to our understanding of second language acquisition and general linguistic theories of phonetics.

While the present dissertation has offered insights into many of my original questions, during the course of conducting and reporting the research, a number of additional questions have arisen which suggest additional avenues of research. First, spectrograms do not provide information about the onset and realization of the palatalizing gesture. In order to determine precisely when the palatalizing target is achieved it would be necessary to carry out an Electro-Palatography (EPG) study. Data from an EPG study could be incorporated into the proposed gestural scores, thereby providing more complete descriptions of Russians’ and learners’ productions. Second, from the analysis of Russians’ spectrograms, it seems that frication noise might be an important acoustic parameter for conveying the presence of the front glide, yod segment. Reanalysis of the Russians’ original productions which measures frication noise amplitude from power spectra made at the temporal mid-point during front-glide duration could be very fruitful. While these two supplementary studies will provide additional substantial information, they themselves will surely suggest other topics of study.

APPENDIX A

ORDERED WORD-LIST: C-V SEQUENCE, PHONETIC TRANSCRIPTIONS AND GLOSSES

Sequence  Russian word  Nonsense word  Transcription   Gloss

[ba]      изба          ---            izˈba           hut
[bʲa]     себя          ---            sʲiˈbʲa         self
[bʲja]    воробья       ---            varʌˈbʲja       sparrow (gen. sg.)
[bʲija]   ---           стабия         stabʲiˈja       ---

[va]      сова          ---            sʌˈva           owl
[vʲa]     ревя          ---            rʲiˈvʲa         roaring, howling
[vʲja]    сыновья       ---            sɨnʌˈvʲja       sons
[vʲija]   швея          ---            ʃvʲiˈja         seamstress

[da]      тогда         ---            tʌgˈda          then
[dʲa]     судя          ---            suˈdʲa          judging
[dʲja]    судья         ---            suˈdʲja         judge
[dʲija]   судия         ---            sudʲiˈja        judge (archaic)

[za]      глаза         ---            glʌˈza          eyes
[zʲa]     возя          ---            vʌˈzʲa          carrying
[zʲja]    ---           бразья         brʌˈzʲja        ---
[zʲija]   ---           сазия          sazʲiˈja        ---

[ma]      сама          ---            sʌˈma           (reflexive pronoun) self (fem. nom. sg.)
[mʲa]     стоймя        ---            stajˈmʲa        (adv.) upright
[mʲja]    семья         ---            sʲiˈmʲja        family
[mʲija]   змея          ---            zmʲiˈja         snake

[la]      скала         ---            skʌˈla          cliff, crag
[lʲa]     земля         ---            zʲiˈmlʲa        earth
[lʲja]    тулья         ---            tuˈlʲja         crown (of a hat)
[lʲija]   колея         ---            kalʲiˈja        rut

[ra]      сестра        ---            sʲiˈstra        sister
[rʲa]     моря          ---            mʌˈrʲa          exterminating
[rʲja]    старья        ---            stʌˈrʲja        old things (gen. sg.)
[rʲija]   острия        ---            ʌstrʲiˈja       point, spike, sharp edge (gen. sg.)

[bu]      табу          ---            tʌˈbu           taboo
[bʲu]     ---           мабю           mʌˈbʲu          ---
[bʲju]    убью          ---            uˈbʲju          I will kill
[bʲiju]   ---           клабию         klabʲiˈju       ---

[vu]      реву          ---            rʲiˈvu          I roar/howl
[vʲu]     ревю          ---            rʲiˈvʲu         revue
[vʲju]    интервью      ---            intʲirˈvʲju     interview
[vʲiju]   швею          ---            ʃvʲiˈju         seamstress (acc. sg.)

[du]      в саду        ---            fsʌˈdu          in the garden
[dʲu]     гвоздю        ---            gvʌˈzdʲu        nail (dat. sg.)
[dʲju]    ладью         ---            lʌˈdʲju         (chess) castle, rook (acc. sg.)
[dʲiju]   ---           тадию          tadʲiˈju        ---

[zu]      внизу         ---            vnʲiˈzu         below; downstairs
[zʲu]     стезю         ---            stʲiˈzʲu        path, way (acc. sg.)
[zʲju]    ---           лазью          lʌˈzʲju         ---
[zʲiju]   ---           стазию         stazʲiˈju       ---

[mu]      тому          ---            tʌˈmu           (pron.) that (masc. dat. sg.)
[mʲu]     Камю          ---            kʌˈmʲu          Camus
[mʲju]    семью         ---            sʲiˈmʲju        family (acc. sg.)
[mʲiju]   змею          ---            zmʲiˈju         snake (acc. sg.)

[lu]      в углу        ---            vuˈglu          in the corner
[lʲu]     молю          ---            mʌˈlʲu          I pray
[lʲju]    налью         ---            nʌˈlʲju         I will pour
[lʲiju]   колею         ---            kalʲiˈju        rut (acc. sg.)

[ru]      беру          ---            bʲiˈru          I take
[rʲu]     борю          ---            bʌˈrʲu          I conquer
[rʲju]    старью        ---            stʌˈrʲju        old things (dat. sg.)
[rʲiju]   острию        ---            ʌstrʲiˈju       point, spike, sharp edge (dat. sg.)

[by]      рабы          ---            rʌˈbɨ           slaves
[bʲi]     люби          ---            lʲuˈbʲi         love (2nd sg. imperative)
[bʲji]    воробьи       ---            varʌˈbʲji       sparrows
[bʲiji]   ---           хабии          xabʲiˈji        ---

[vy]      травы         ---            trʌˈvɨ          grass (gen. sg.)
[vʲi]     любви         ---            lʲubˈvʲi        love (gen. sg.)
[vʲji]    муравьи       ---            murʌˈvʲji       ants
[vʲiji]   швеи          ---            ʃvʲiˈji         seamstress (gen. sg.)

[dy]      сады          ---            sʌˈdɨ           gardens
[dʲi]     сиди          ---            sʲiˈdʲi         sit (2nd sg. imperative)
[dʲji]    судьи         ---            suˈdʲji         judge (gen. sg.)
[dʲiji]   судии         ---            sudʲiˈji        (archaic) judge (gen. sg.)

[zy]      возы          ---            vʌˈzɨ           carts, wagons
[zʲi]     вози          ---            vʌˈzʲi          carry (2nd sg. imperative)
[zʲji]    ---           влазьи         vlʌˈzʲji        ---
[zʲiji]   ---           разии          razʲiˈji        ---

[my]      мы            ---            ˈmɨ             we
[mʲi]     ми            ---            ˈmʲi            mi (name of musical note)
[mʲji]    семьи         ---            sʲiˈmʲji        family (gen. sg.)
[mʲiji]   змеи          ---            zmʲiˈji         snake (gen. sg. & nom. pl.)

[ly]      столы         ---            stʌˈlɨ          tables
[lʲi]     земли         ---            zʲiˈmlʲi        earth (gen. sg.)
[lʲji]    тульи         ---            tuˈlʲji         crown of a hat (gen. sg.)
[lʲiji]   колеи         ---            kalʲiˈji        rut (gen. sg.)

[ry]      боры          ---            bʌˈrɨ           coniferous forests
[rʲi]     бери          ---            bʲiˈrʲi         take! (2nd sg. imperative)
[rʲji]    Аму-Дарьи     ---            ʌmudʌˈrʲji      (name of a river) Amu-Darya (gen. sg.)
[rʲiji]   вереи         ---            vʲirʲiˈji       (dial.) gate-post (gen. sg.), (naut.) wherry

APPENDIX B

RANDOM ORDER WORD-LIST READ BY PARTICIPANTS

вереи - реи             боры - ры               доклад - ад
дробь - обь             муравьи - вьи           сазия - зия
воробьи - бьи           швея - вея              интервью - вью
сестра - ра             тетрадь - адь           рабы - бы
прочитав - ав           воробья - бья           судьи - дьи
тулья - лья             товар - ар              мазь - азь
змею - мею              сиди - ди               старья - рья
возя - зя               ладью - дью             разии - зии
борю - рю               стоймя - мя             Аму-Дарьи - рьи
сыновья - вья           острия - рия            тадию - дию
внизу - зу              гвоздю - дю             земли - ли
стабия - бия            налью - лью             колею - лею
тому - му               реву - ву               острию - рию
молю - лю               стазию - зию            старью - рью
ревю - вю               колеи - леи             табу - бу
возы - зы               скала - ла              ревя - вя
семью - мью             клабию - бию            ruiaBb - aBb
влазьи - зьи            январь - арь            сады - ды
судя - дя               Камю - мю               любви - ви
убью - бью              швеи - веи              змеи - меи
хабии - бии             тогда - да              тульи - льи
судии - дии             вози - зи               лазью - зью
бери - ри               семья - мья             ми - ми
колея - лея             показ - аз              глаза - за
семьи - мьи             бразья - зья            в углу - лу
судья - дья             сам - ам                развал - ал
люби - би               раб - аб                беру - ру
травы - вы              столы - лы              себя - бя
изба - ба               моря - ря               в саду - ду
швею - вею              стезю - зю              земля - ля
медаль - аль            мы - мы
мабю - бю               семь - емь
сама - ма               судия - дия
сова - ва               змея - мея

BIBLIOGRAPHY

Akišina, A. A. and S. A. Baranovskaja. 1980. Russkaja fonetika, Moscow: Russkij jazyk.

Allan, James. 1988. The Acquisition of a Second Language Phonology: A Linguistic Theory of Developing Sound Structures, Tübingen: Gunter Narr Verlag.

Antonova, D. N. 1988. Fonetika i intonatsija, Moscow: Russkij jazyk.

Avanesov, R. I. 1956. Fonetika sovremennogo russkogo literaturnogo jazyka, Moscow: Izdatel’stvo Moskovskogo universiteta.

1972. Russkoe literaturnoe proiznošenie, Moscow: Prosveščenie.

Best, Catherine T. and Winifred Strange. 1992. Effects of phonological and phonetic factors on cross-language perception of . Haskins Laboratories Status Report on Speech Research, SR-109/110, 89-108.

Best, Catherine T., Alice Faber and Andrea Levitt. 1996. Perceptual assimilation of non-native vowel contrasts to the American English vowel system. Paper presented at the meeting of the Acoustical Society of America, Indianapolis, May 13-17.

Bloch, B. and G. Trager. 1942. Outline of Linguistic Analysis, Baltimore.

Bolla, K. 1981. A Conspectus of Russian Speech Sounds, Cologne, Vienna: Böhlau Verlag.

Bogoroditskij, V. A. 1884. Glasnye bez udarenija v obščerusskom jazyke, Kazan'.

Bondarko, L. V. and L. A. Verbitskaja. 1965. O markirovannosti priznaka mjagkosti russkix soglasnyx. Zeitschrift für Phonetik, 18, 119-26.

Borden, Gloria and Katherine Harris. 1980. Speech Science Primer: Physiology, Acoustics, and Perception of Speech, Baltimore: Williams & Wilkins.

Bratkowsky, Joan G. 1980. The predictability of palatalization in Russian. Russian Linguistics, 4, 329-336.

Browman, C. P. and L. Goldstein. 1986. Towards an articulatory phonology. Phonology Yearbook 3, 219-252.

1989. Tiers in articulatory phonology, with some implications for casual speech. In J. Kingston and M. E. Beckman (eds). Papers in laboratory phonology I: Between the grammar and the physics of speech, Cambridge, England: Cambridge University Press, 341-376.

1993. Dynamics and articulatory phonology. Haskins Laboratories Status Report on Speech Research, SR-113, 51-62.

Bryzgunova, E. A. 1963. Praktičeskaja fonetika i intonacija russkogo jazyka, Moscow: Izdatel’stvo Moskovskogo universiteta.

1972. Zvuki i intonatsija russkoj reči, Moscow.

Byrd, Dani. 1994. Articulatory timing in English consonant sequences. UCLA Working Papers in Phonetics, 86, 1-196.

Caflisch, Jacob, Sr. 1983. A pedagogical assessment of palatalized segments in Slavic systems. Proceedings of the Kentucky Foreign Language Conference: Slavic Section, 1, 32-39.

Carlisle, Robert S. 1994. Markedness and environment as internal constraints on the variability of interlanguage phonology. In Mehmet Yavas (ed). First and Second Language Phonology, San Diego, California: Singular Publishing Group, Inc., 223-249.

Catford, J. C., and David B. Pisoni. 1970. Auditory vs. articulatory training in exotic sounds. The Modern Language Journal, vol. LIV, 7, Nov.

Choi, John D. and Patricia Keating. 1991. Vowel-to-vowel coarticulation in three Slavic languages. UCLA Working Papers in Phonetics, vol. 78, 78-86.

Chomsky, N. and M. Halle. 1968. The Sound Pattern of English, New York: Harper and Row.

Clark, John and Colin Yallop. 1995. An Introduction to Phonetics and Phonology. Second Edition, Oxford: Blackwell Publishers, Ltd.

Clements, George N. 1991. Place of articulation in consonants and vowels: A unified theory. Working Papers of the Cornell Phonetics Laboratory, 5, 77-123.

Cohn, Abigail C. 1988. Quantitative characterization of degree of coarticulation in CV tokens. UCLA Working Papers in Phonetics, 69, 51-59.

Cooper, Andre M., D. H. Whalen and Carol A. Fowler. 1986. P-centers are unaffected by phonetic categorization. Haskins Laboratories Status Report on Speech Research, SR-85, 115-131.

1988. The syllable's rhyme affects its p-center as a unit. Haskins Laboratories Status Report on Speech Research, SR-93/94, 23-32.

Crothers, Edward and Patrick Suppes. 1967. Experiments in Second-Language Learning, New York: Academic Press.

Daniloff, T.G. and R. E. Hammarberg. 1973. On defining coarticulation. Journal of Phonetics, 1, 239-248.

Darwin, C. 1896. The Origin of Species, New York: Caldwell.

DeArmond, Richard C. 1975. On the phonemic status of [i] and [j] in Russian. Russian Linguistics, vol. 2, 1/2, March/June, 23-35.

Derkach, M. 1975. Acoustic cues of softness in Russian syllables and their application in automatic speech recognition. In G. Fant and M. A. A. Tatham (eds). Auditory Analysis and Perception of Speech, New York: Academic Press.

Derkach, M., G. Fant and A. de Serpa-Leitao. 1970. Phoneme coarticulation in Russian hard and soft VCV-Utterances with voiceless fricatives. STL-QPSR, 2-3.

Derkach, M., R. J. Gumetskij, B. M. Gura, and M. E. Chaban. 1983. Dinamičeskie spektry rečevyx signalov. Lvov: Višča škola.

Eršov, S. I. 1903. Eksperimental’naja fonetika, Kazan'.

Fant, C. G. M. 1960. Acoustic Theory of Speech Production, The Hague: Mouton.

Ferguson, C. and O. Garnica. 1975. Theories of phonological development. In E. H. Lenneberg and E. Lenneberg (eds). Foundations of Language Development, vol. 1, New York: Academic Press.

Flege, J. 1980. Phonetic approximation in second language acquisition. Language Learning, vol. 20, 1, 117-134.

1981. The phonological basis of foreign accent: A hypothesis. TESOL Quarterly, vol. 15,4, 443-457.

1984. The detection of French accent by American listeners. Journal of the Acoustical Society of America, 76, 3, 692-707.

1987. The production of “new” and “similar” phones in a foreign language: Evidence for the effect of equivalence classification. Journal of Phonetics, 15, 47-65.

1991. Age of learning affects the authenticity of voice-onset time (VOT) in stop consonants produced in a second language. Journal of the Acoustical Society of America, 89, 395-411.

1995. Second language speech learning: Theory, findings, and problems. In Winifred Strange (ed). Speech Perception and Linguistic Experience: Issues in Cross-Language Research. Baltimore: York Press, 233-277.

Flege, J. and K. Fletcher. 1992. Talker and listener effects on degree of perceived accent. Journal of the Acoustical Society of America, 91, 370-389.

Flege, J., M. Munro and R. Fox. 1994. Auditory and categorical effects on cross-language vowel perception. Journal of the Acoustical Society of America, 95, 6, 3623-3641.

Flege, J., M. Munro and I. MacKay. 1995. Factors affecting strength of perceived foreign accent in a second language. Journal of the Acoustical Society of America, 97, 5, 3125-3134.

Gal'perina, I. R. (ed). 1963. Učebnik anglijskogo jazyka: Dlja 1-ogo kursa pedagogičeskix institutov i fakul'tetov inostrannyx jazykov, Moscow: Gosudarstvennoe izdatel'stvo Vysšaja škola.

Galkina-Fedoruk, E. M., K. V. Gorshkova, and N. M. Shanskij. 1962. Sovremennyj russkij jazyk, Moscow.

Gass, Susan M. and Larry Selinker. 1994. Second Language Acquisition: An Introductory Course, Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Gaver, William W. 1993. What in the world do we hear?: An ecological approach to auditory event perception. Ecological Psychology, 5, 1, 1-29.

Gerken, L-A. 1994. Child phonology: Past research, present questions, future directions. In M. A. Gernsbacher (ed). Handbook of Psycholinguistics, New York: Academic Press.

Gorecka, A. 1989. Phonology of articulation. Ph.D. dissertation, MIT.

Goto, H. 1971. Auditory perception by normal Japanese adults of the sounds ‘L’ and ‘R’. Neuropsychologia, 9, 317-323.

Halle, Morris. 1971(1959). The Sound Pattern of Russian, The Hague: Mouton.

1992. Phonological features. In W. Bright (ed). International Encyclopedia of Linguistics, Oxford: Oxford University Press, vol. 3, 207-212.

Halle, Morris and Kenneth Stevens. 1971. A note on laryngeal features. Quarterly Progress Report 101, Cambridge, MA: Research Laboratory of Electronics, MIT, 198-212.

1991. Knowledge of language and the sounds of speech. In Sundberg et al. (eds). Music, Language, Speech and Brain, London: Macmillan, 1-19.

Hamilton, William S. 1980. Introduction to Russian Phonology and Word Structure, Columbus: Slavica.

Hockett, Charles F. 1958. A Course in Modern Linguistics, New York.

Hume, Elizabeth. 1992. Front Vowels, Coronal Consonants and Their Interaction in Nonlinear Phonology. Ph.D. dissertation, Cornell University.

Ingram, D. 1974. Phonological rules in young children. Journal of Child Language, 1,49-64.

1978. Phonological patterns in the speech of young children. In P. Fletcher and M. Garman (eds). Language Acquisition: Studies in First Language Development, New York: Cambridge University Press.

Johnson, K., P. Ladefoged and M. Lindau. 1993. Individual differences in vowel production. Journal of the Acoustical Society of America, 94, 2, pt. 1, 701-714.

Jakobson, R., C. G. M. Fant and M. Halle. 1963(1951). Preliminaries to Speech Analysis: The Distinctive Features and Their Correlates, Cambridge, MA: MIT Press.

Jusczyk, P. 1997. The Discovery of Spoken Language, Cambridge, MA: MIT Press.

Keating, Patricia A. 1985. C-V Phonology, experimental phonetics, and coarticulation. UCLA Working Papers in Phonetics, 62, 1-14.

1985. The phonology-phonetics interface. UCLA Working Papers in Phonetics, 62, 14-34.

1988. Coarticulation and timing. UCLA Working Papers in Phonetics, 69, 1-2.

1988. The window model of coarticulation: Articulatory evidence. UCLA Working Papers in Phonetics, 69, 3-27.

1991. Coronal places of articulation. Phonetics and Phonology: Volume 2. The Special Status of Coronals, Academic Press, 29-48.

1993. Phonetic representation of palatalization versus fronting. UCLA Working Papers in Phonetics, 85, 6-21.

Kenstowicz, Michael. 1994. Phonology in Generative Grammar, Cambridge, MA: Blackwell.

Kenstowicz, M. and C. Kisseberth. 1979. Generative Phonology: Description and Theory, New York: Academic Press, Inc.

Koževnikov, V. A. and L. A. Čistovich. 1965. Reč': Artikuljacija i vosprijatie, Nauka.

Kovaleva, L. A. 1981. K voprosu o palatalizatsii. Russkoe jazykoznanie, 3, 103-\Cn.

Labov, W. 1994. Principles of Linguistic Change, Cambridge, MA: Blackwell.

Ladefoged, P. 1971. Preliminaries to Linguistic Phonetics, Chicago: University of Chicago Press.

1993. A Course in Phonetics, Third Edition. Harcourt Brace Jovanovich.

Ladefoged, P. and I. Maddieson. 1996. The Sounds of the World’s Languages, Cambridge, MA: Blackwell Publishers.

Lahiri, A. and V. Evers. 1991. Palatalization and coronality. In Paradis and Prunet (eds). The Special Status of Coronals. Phonetics and Phonology 2, San Diego: Academic Press, 79-100.

Lehiste, Ilse (ed). 1967. Readings in Acoustic Phonetics, Cambridge, MA: MIT Press.

Lenneberg, E. H. 1967. Biological Foundations of Language, New York: Wiley & Sons.

Levin, Maurice I. 1978. and Conjugation, Columbus: Slavica.

Lindblom, B. 1963. Spectrographic study of . Journal of the Acoustical Society of America, vol. 35, 11, 1773-1781.

Long, M. 1990. Maturational constraints on language development. Studies in Second Language Acquisition, 12, 251-285.

Macken, M. A. 1979. Developmental reorganization of phonology: A hierarchy of basic units of acquisition. Lingua, 49, 11-49.

Major, Roy C. 1994. Current trends in interlanguage phonology. In Mehmet Yavas (ed). First and Second Language Phonology, San Diego: Singular Publishing Group, Inc., 181-204.

Miller, G. and P. Nicely. 1955. An analysis of perceptual confusions among some English consonants. Journal of the Acoustical Society of America, 27, 2.

Menn, L. 1978. Phonological units in beginning speech. In J. B. Hooper (ed). Syllables and Segments, Amsterdam: North Holland.

Öhman, Sven E. 1967. Numerical model of coarticulation. Journal of the Acoustical Society of America, vol. 41, 2, 310-320.

Oliverius, Z. F. 1974. Fonetika russkogo jazyka, Prague: Státní pedagogické nakladatelství.

Panov, M. V. 1967. Russkaja fonetika. Moscow: Prosveščenie.

Patkowski, Mark S. 1994. The critical age hypothesis and interlanguage phonology. In Mehmet Yavas (ed). First and Second Language Phonology, San Diego: Singular Publishing Group, Inc., 205-221.

Paufošima, R. F. 1983. Fonetika slova i frazy v severnorusskix govorax. Moscow: Nauka.

Pierrehumbert, Janet. 1990. Phonological and phonetic representation. Journal of Phonetics, 18, 375-394.

Polka, Linda. 1991. Cross-language speech perception in adults: Phonemic, phonetic, and acoustic contributions. Journal of the Acoustical Society of America, 89, 6, 2961-2977.

1992. Characterizing the influence of native language experience on adult speech perception. Perception & Psychophysics, 52, 1, 37-52.

Recasens, Daniel. 1990. The articulatory characteristics of palatal consonants. Journal of Phonetics, 18, 267-280.

Reformatskij, A. A. 1960. Obučenije proiznošeniju i fonologija. Filologičeskije nauki, no. 2, 147-148.

Shvedova, N. Ju. (ed). 1982. Russkaja grammatika, Moscow: Nauka.

Saltzman, Eliot L. and Kevin G. Munhall. 1989. A dynamical approach to gestural patterning in speech production. Ecological Psychology, 1, 4, 333-382.

Sagey, Elizabeth. 1986. The Representation of Features and Relations in Nonlinear Phonology, Ph.D. dissertation, MIT.

Sancier, M. L. and C. A. Fowler. 1997. Gestural drift in a bilingual speaker of Brazilian Portuguese and English. Journal of Phonetics, 25, 421-436.

Scovel, T. 1969. Foreign accent, language acquisition and cerebral dominance. Language Learning, 19, 245-54.

Scovel, T. 1988. A Time to Speak, New York: Newbury House.

Selinker, L. 1972. Interlanguage. International Review of Applied Linguistics, 10, 209-251.

Singleton, D. 1989. Language Acquisition: The Age Factor, Clevedon, England: Multilingual Matters.

Skalozub, L. G. 1979. Dinamika zvukoobrazovanija po dannym kinorentgenografirovanija, Kiev.

Smith, N. V. 1973. The Acquisition of Phonology, London: Cambridge University Press.

Stevens, Kenneth N. 1989. On the quantal nature of speech. Journal of Phonetics, 17, 3-45.

Strange, Winifred. 1995. Cross-language studies of speech perception: A historical review. In Winifred Strange (ed). Speech Perception and Linguistic Experience: Issues in Cross-Language Research, Baltimore: York Press, 3-45.

Terjaev, D. A. 1989. K voprosu o tverdosti-mjagkosti soglasnyx v russkom jazyke: istorija izučenija i éksperimental'nye dannye. Russkoe jazykoznanie, vyp. 18, 19-86.

Thompson, I. 1991. Foreign accents revisited: The English pronunciation of Russian immigrants. Language Learning, 41, 177-204.

Trubetskoj, N. S. 1960. Osnovy fonologii, Moscow.

1969(1939). Grundzüge der Phonologie, Travaux du Cercle Linguistique de Prague, 7. English translation by C. Baltaxe. Berkeley: University of California Press.

Vasil'ev, O. V. 1962. English Phonetics, Leningrad.

Ward, D. 1958. Russian Pronunciation, A Practical Course, Great Britain: Robert Cunningham and Sons, Ltd.

Weinberger, Steven H. 1994. Functional and phonetic constraints on second language phonology. In Mehmet Yavas (ed). First and Second Language Phonology, San Diego, California: Singular Publishing Group, Inc., 283-302.

Weismer, Gary, Ruth Martin, Ray D. Kent and Jane F. Kent. 1992. Formant trajectory characteristics of males with amyotrophic lateral sclerosis. Journal of the Acoustical Society of America, 91, 2, 1085-1098.

Whalen, D.H., Andre M. Cooper and Carol A. Fowler. 1991. P-center judgments are generally insensitive to the instructions given. Haskins Laboratories Status Report on Speech Research, SR-107/108, 141-146.

Wode, Henning. 1994. L1 and L2 phonology: Looking ahead. In Mehmet Yavas (ed). First and Second Language Phonology, San Diego: Singular Publishing Group, Inc., 175-179.

Woods, A., P. Fletcher, and A. Hughes. 1993. Statistics in Language Studies, Cambridge University Press.

Yavas, Mehmet. 1994. Introduction. In Mehmet Yavas (ed). First and Second Language Phonology, San Diego: Singular Publishing Group, Inc., xi-xx.

Zalizn'ak, A. A. 1977. Grammatiâeskij slovar' russkogo jazyka, Moscow: Russkij jazyk.

Zinder, L. R. and L. V. Bondarko. 1980. Problemy i metody eksperimental'no-fonetičeskogo analiza reči, Leningrad: Izdatel’stvo Leningradskogo universiteta.

Zinder, L. R., L. V. Bondarko, and L. A. Verbitskaja. 1964. Akustičeskaja xarakteristika različija tverdyx i mjagkix soglasnyx v russkom jazyke. Učenye zapiski LGU, Serija filologičeskix nauk, vyp. 69, 28-36.

Zsiga, Elizabeth C. 1997. Features, gestures, and Igbo vowels: An approach to the phonology-phonetics interface. Language, vol. 73, 2, 227-274.

Zsiga, Elizabeth C. and Stefan Kaufmann. 1996. Palatalization and gestural overlap in Russian and English. Poster presented at the Fifth Conference on Laboratory Phonology. Chicago, July 6-8.
