Coarticulation in VCV Utterances: Spectrographic Measurements S. E. G. Öhman

Citation: The Journal of the Acoustical Society of America 39, 151 (1966); doi: 10.1121/1.1909864 View online: https://doi.org/10.1121/1.1909864 View Table of Contents: https://asa.scitation.org/toc/jas/39/1 Published by the Acoustical Society of America

Received 13 September 1965

Coarticulation in VCV Utterances: Spectrographic Measurements

S. E. G. Öhman

Speech Transmission Laboratory, Royal Institute of Technology (KTH), Stockholm, Sweden

In this paper, the formant transitions in vowel + stop consonant + vowel utterances spoken by Swedish, American, and Russian talkers are studied spectrographically. The data suggest a physiological model in terms of which the VCV articulations are represented by a basic diphthongal gesture with an independent stop-consonant gesture superimposed on its transitional portion. This interpretation necessitates a reevaluation of the locus theory proposed by the workers of the Haskins Laboratories. Some conclusions about the general nature of the neural instructions behind the VCV gestures are discussed.

INTRODUCTION

Speech production involves two fundamental aspects: the stationary properties of phoneme realization and the dynamic rules governing the fusion of strings of phonemes into connected speech. Recent research has dealt successfully with the acoustic description of static speech sounds. The dynamics of articulation, however, have received much less attention, and reports of quantitative studies in this area are relatively rare. A selection of references is given below (Refs. 1-13).

The present work deals with some of the apparently very lawful rules that describe how voiced stops are coarticulated with vowels in vowel-consonant-vowel (VCV) context. As a preliminary illustration of the type of effects investigated, Fig. 1 shows a pair of VCV words chosen from the material described in more detail later. In both utterances, the initial vowel is /ø/ and the consonant is /g/. The final vowel is different. It is /y/ in the left and /a/ in the right part of the Figure. The talker is a male Swede.

FIG. 1. Sound spectrograms of the utterances /øgy/ (left) and /øga/ (right) as spoken by a male Swedish talker. The formant transitions in the initial vowel are different in the two cases, owing to influence of the final vowel. The lines superimposed on the spectrograms indicate the method of measurement discussed in the text.

When the intervocalic consonant is inspected for changes that might be due to the variation imposed by the vowel context, the following observation presents itself immediately. The second-formant transition in the vowel preceding the stop is rising when the vowel following the stop is /y/, but it is falling when the vowel following the stop is /a/. There is also some variation in the third-formant transition of the initial vowel. The articulatory motion from the initial vowel into the /g/, as mirrored by the formant transitions, is apparently modified by the vowel that is to follow the /g/.

Figure 2 shows a situation that is converse to that of Fig. 1. The final vowel was kept constant here and the initial vowel was varied. The utterance in the left part of the Figure is /ydø/ and that of the right part is /odø/. This time, there is a change in the formant transitions of the final vowel. The second-formant transition of the final vowel is falling when the vowel preceding the stop is /y/, but it is slightly rising when the vowel preceding the stop is /o/. Thus, the initial vowel influences the medial stop-to-final vowel transition across the intervocalic consonant.

In order to obtain a more detailed picture of the interaction of the vowel and the stop-consonant gestures in VCV context, a series of studies has been made involving speakers of Swedish, English, and Russian.

1 R. H. Stetson, Motor Phonetics (North-Holland Publ. Co., Inc., Amsterdam, 1951).
2 P. Menzerath and A. de Lacerda, Koartikulation, Steuerung und Lautabgrenzung (Ferd. Dümmlers Verlag, Berlin, 1933).
3 L. Kaiser, "Some Properties of Speech Muscles and the Influence Thereof on Language," Arch. Néerl. Phon. Exptl. 10, 121-133 (1934).
4 M. Joos, "Acoustic Phonetics," Language 24, 1-136, Chap. 6 (1948).
5 B. Lindblom, "Spectrographic Study of Vowel Reduction," J. Acoust. Soc. Am. 35, 1773-1781 (1963).
6 A. S. House and K. N. Stevens, "Perturbation of Vowel Articulations by Consonantal Context: An Acoustical Study," J. Speech & Hearing Res. 6, 111-128 (1963).
7 P. F. MacNeilage, "Electromyographic and Acoustic Study of the Production of Certain Final Clusters," J. Acoust. Soc. Am. 35, 461-463 (1963).
8 S. E. G. Öhman and K. N. Stevens, "Cinéradiographic Studies of Speech: Procedures and Objectives," J. Acoust. Soc. Am. 35, 1889(A) (1963).
9 O. Fujimura, "Motion-Picture Studies of Articulatory Movements," Mass. Inst. Technol. Res. Lab. Electron. Quart. Progr. Rept. No. 62, 197-202 (15 July 1961).
10 Y. Kato, "Analysis of Liquid Consonants," J. Acoust. Soc. Am. 36, 1989(A) (1964).
11 S. Inomata, "Program for Active Segmentation and Reduction of Phonetic Parameters," Paper B? in Proceedings of the Speech Communication Seminar, 1962 (Speech Transmission Lab., KTH, Stockholm, 1963), Vol. 1; J. Acoust. Soc. Am. 35, 1112(A) (1963).
12 P. Ladefoged, "The Nature of General Phonetic Theories," paper presented at 16th Ann. Round Table Meeting on Linguistics & Language Studies, Georgetown Univ. (26 Mar. 1965).
13 E. Fischer-Jørgensen, "Beobachtungen über den Zusammenhang zwischen Stimmhaftigkeit und intraoralem Luftdruck," Z. Phonetik 16, No. 1-3, 19-36 (1963).

I. F2 TRANSITIONS: INVESTIGATOR'S SPEECH

A. Measurements

A list of VCV "words" was constructed using the three voiced stops /b/, /d/, /g/ and the four vowels /y/, /ø/, /a/, and /u/. These are Swedish phonemes (Ref. 14). The list contained 4×3×4=48 different words.

This material was read three times by the author in an anechoic chamber. A Brüel & Kjær condenser microphone and microphone amplifier, and an Ampex 300 tape recorder were used. The system was adjusted to give an essentially flat response over the 20- to 10 000-cps frequency band.

The words were read in a monotone and both syllables were equally stressed. All vowels were phonemically long. Wide-band spectrograms were then made of all the words recorded, using a Kay Electric Company Sona-Graph with an expanded frequency scale (0-4500 cps approximately). The frequencies of the second formant were measured in the stationary parts of the initial and final vowels and at the beginning and end of the closure of the stop consonants. Thin lines were drawn with a pencil so as to follow the center of the second-formant bars as seen in the spectrograms, and vertical lines were drawn at the beginning and end of the stop closure. The second-formant frequencies were measured where the vertical lines intersected the formant lines. Frequencies measured by this method are accurate within less than 50 cps (Ref. 15).

The beginning and end of the stop-closure intervals were determined with the aid of a number of criteria derived from the acoustic theory of stop-consonant production (Ref. 16). The beginning of the closure is associated with a substantial decrease in amplitude of the second and higher formants and a downward frequency shift in the first formant. The end of the closure is usually marked by a weak burst spike and a rapid increase in the frequency of the first formant. In some cases where it was impossible to assess the location of the second formant at the closure boundaries, an extrapolation rule was adopted: the formant was assumed to move into the stop gap without change in slope from its nearest observable value in the vowel. The vertical lines of Fig. 1 exemplify the segmentation procedure.

14 /b/ and /g/ closely resemble the English /b/ and /g/, but the Swedish /d/ is apico-dental rather than apico-alveolar. The vowels /y/ and /ø/ are, by and large, lip-rounded versions of the English /i/ and /e/, respectively. /a/ is intermediate between American English /a/ and /o/. The Swedish vowels /y/ and /u/ have a narrower tongue constriction than the corresponding English vowels /i/ and /u/.
15 B. E. F. Lindblom, "Accuracy and Limitations of Sona-Graph Measurements," in Proceedings of the Fourth International Congress of Phonetic Sciences, Helsinki 1961 (Mouton & Co., 's-Gravenhage, 1962), pp. 208-213.
16 G. Fant, Acoustic Theory of Speech Production (Mouton & Co., 's-Gravenhage, 1960), p. 169.
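The extrapolation rule used at the closure boundaries in Sec. I-A can be illustrated with a short sketch. It assumes that "without change in slope" means a straight-line continuation through the last two readable points of the formant trace; the function name and the sample readings are hypothetical and are not taken from the paper.

```python
def extrapolate_formant(t1_msec, f1_cps, t2_msec, f2_cps, t_boundary_msec):
    """Continue the straight line through two observed formant points
    (t1, f1) and (t2, f2) up to the closure boundary.

    Assumption: the last observable slope is held constant, which is one
    reading of the extrapolation rule described in the text.
    """
    slope = (f2_cps - f1_cps) / (t2_msec - t1_msec)  # cps per msec
    return f2_cps + slope * (t_boundary_msec - t2_msec)


# Hypothetical readings: F2 is 1600 cps at 40 msec before the closure
# and 1550 cps at 20 msec before it; the boundary is at t = 0.
terminal = extrapolate_formant(-40, 1600.0, -20, 1550.0, 0)
print(terminal)
```

With these made-up readings the formant falls 2.5 cps per msec, so the estimated terminal frequency lies 50 cps below the last observed value.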

FIG. 2. Sound spectrograms of the utterances /ydø/ (left) and /odø/ (right) as spoken by a male Swedish talker. The formant transitions in the final vowel are different in the two cases, owing to influence of the initial vowel.

B. Results

The results are summarized in Table I. Shown here are the differences between the terminal frequencies of the second-formant transitions, measured at the closure boundaries, and the frequencies of the same formant in the stationary parts of the vowels. To read the Table, note that the columns headed by VbV, VdV, and VgV have been split into two halves denoted by VC and CV. The numbers in a VC column represent the extent of a second-formant transition from an initial vowel into a stop consonant when the vowel following the stop is as indicated in the second column of the Table. The initial vowel is found in the first column of the Table. Similarly, the numbers in a CV column represent the extent of a second-formant transition into a final vowel when the vowel preceding the stop is as indicated in the second column of the Table. In this case, the first column of the Table indicates the final vowel. A number is positive if the terminal frequency of the formant is higher than its value in the stationary part of the vowel, and negative if lower. Each number in the Table is an average of three measurements. The average stationary second-formant frequencies of the vowels /y/, /ø/, /a/, and /u/ for this speaker are, respectively, 2000, 1600, 1000, and 650 cps.

TABLE I. Data pertaining to investigator's second-formant transitions in VCV utterances. Each number is the difference, in cps, between the terminal frequency of the formant transition measured in the neighborhood of the stop gap and the frequency of the formant as measured in the most stationary part of the vowel. The vowel in which the transition is observed is indicated in the leftmost column of the Table. The vowel on the opposite side of the stop is shown in the second column from the left. Each number is based on three measurements.

Observed  Opposite       VbV             VdV             VgV
vowel     vowel        VC     CV       VC     CV       VC     CV
y         y          -450   -350     -155   -130      -15   +140
          ø          -515   -370     -290   -200      -30    +55
          a          -775   -370     -370   -280     -255   +115
          u          -650   -390     -200   -190     -325   +100
ø         y          -375    -65      -90    +75        0   +430
          ø          -415   -175     -200     -5      -25   +375
          a          -465    -90     -340    -25     -275   +340
          u          -515   -140     -240    +40     -365   +315
a         y          +100   +205     +265   +520     +175   +330
          ø           +25    +75     +240   +365     +125   +225
          a          -100   -100     +165   +290      +30   +165
          u          -150    -55     +250   +315      -65    +80
u         y          +200   +255     +575   +940     +190   +115
          ø           +80   +325     +530   +840      +65   +255
          a           +15   +200     +490   +370      +15   +190
          u            +5     -5     +490   +690      -50    -15

Table I shows that the second-formant transition in every VC sequence studied is variable and dependent on the formant pattern of the final vowel of the VCV utterance. In the sequence /ag/, for example, the second formant rises by 175 cps when the final vowel is /y/, and falls by 65 cps when the final vowel is /u/.

Similarly, in all the CV sequences, except /by/, there is a different second-formant transition for each preceding vowel. An example is /ba/, in which the second formant falls by 205 cps when the initial vowel of the VCV utterance is /y/ and rises by 100 cps when the initial vowel is /a/. The ranges of coarticulatory variation of the F2 terminal frequencies in the VC and CV sequences are quite large, sometimes as large as 380 cps, the average being 210 cps.
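The "range of coarticulatory variation" of a VC or CV sequence can be computed as the spread of its transition extents across the transconsonantal vowels. The sketch below is an illustration only: it assumes that the four Table I entries for each sequence are ordered by transconsonantal vowel /y/, /ø/, /a/, /u/, and it uses just the three sequences discussed in the text, with values copied from the Table as printed. The dictionary and function names are hypothetical.

```python
# F2 transition extents in cps (terminal minus stationary frequency),
# copied from Table I as printed. Assumed order of the four entries:
# transconsonantal vowel /y/, /ø/, /a/, /u/.
TABLE_I_EXCERPT = {
    "ag (VC)": [175, 125, 30, -65],
    "ba (CV)": [205, 75, -100, -55],
    "by (CV)": [-350, -370, -370, -390],
}


def range_of_variation(extents):
    """Largest difference between any two transition extents of a sequence."""
    return max(extents) - min(extents)


for sequence, extents in TABLE_I_EXCERPT.items():
    print(sequence, range_of_variation(extents), "cps")
```

The small spread for /by/ corresponds to the exception noted in the text, where the CV transition hardly changes with the preceding vowel.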

TABLE II. Vowel-formant frequencies in cps of male Swedish speaker. Values are given corresponding to the initial and the final positions of the VCV utterances and corresponding to the different intervocalic stop-consonant contexts. The numbers in parentheses denote interquintile ranges. Each formant-frequency value is an average of 25 measurements.

Column order in each row: y initial, y final, ø initial, ø final, a initial, a final, o initial, o final, u initial, u final.

VbV
F1 290 330 400 430 670 660 400 420 340 360
   (40) (50) (50) (40) (40) (65) (40) (50) (25) (40)
F2 2000 1890 1650 1580 990 1040 670 730 670 680
   (50) (140) (75) (90) (75) (100) (50) (50) (40) (40)
F3 2350 2320 2320 2380 2670 2620 2450 2430
   (100) (75) (50) (75) (100) (125) (90) (100)

VdV
F1 270 300 390 400 660 650 390 410 330 320
   (45) (35) (40) (60) (45) (65) (40) (35) (40) (45)
F2 1990 1950 1620 1580 970 1030 690 760 660 720
   (75) (75) (70) (55) (95) (100) (40) (45) (40) (75)
F3 2380 2350 2350 2410 2740 2700 2530 2560 2500
   (100) (75) (65) (90) (125) (125) (175) (175) (150)

VgV
F1 320 330 420 420 660 650 430 430 360 360
   (40) (65) (45) (55) (55) (80) (45) (35) (50) (55)
F2 2010 1960 1650 1650 990 1160 780 840 770 790
   (75) (90) (50) (85) (90) (125) (45) (90) (40) (45)
F3 2440 2340 2440 2350 2560 2480 2560 2450 2480 2460
   (75) (100) (115) (75) (150) (165) (115) (145) (115)
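The parenthesized dispersions in Table II are interquintile ranges, i.e., the width of the interval holding the central 60% of a distribution with 20% on either side. A minimal sketch of that measure follows. It assumes the quintile cut points are estimated with Python's `statistics.quantiles` (default "exclusive" method), which is only one of several possible estimators and not necessarily the procedure used for the Table; the 25 sample values are hypothetical.

```python
import statistics


def interquintile_range(values):
    """Distance between the 20% and 80% cut points of a sample."""
    cuts = statistics.quantiles(values, n=5)  # cut points at 20, 40, 60, 80%
    return cuts[-1] - cuts[0]


def round_to_nearest(value, step):
    """Round to the nearest multiple of step (e.g. 5 cps for dispersions)."""
    return step * round(value / step)


# Hypothetical set of 25 second-formant readings in cps, spaced 5 cps apart.
f2_readings = list(range(1940, 2065, 5))

spread = interquintile_range(f2_readings)
print(spread, round_to_nearest(spread, 5))
```

The second helper mirrors the rounding of dispersion values to the nearest 5 cps that is stated for Table II.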

II. F1, F2, AND F3 TRANSITIONS: MALE SWEDISH TALKER

A. Measurements

In order to avoid any bias due to the use of the investigator's own speech, it was decided to repeat the measurements using a phonetically untrained speaker who was ignorant of the purpose of the experiment. The recording equipment was the same as before, but the word material was somewhat enlarged. The VCV word ensemble consisted of all combinations of the voiced stops /b/, /d/, and /g/ with the vowels /y/, /ø/, /a/, /o/, and /u/ in initial and final position. The total number of different words was thus 5×3×5=75. Each of the words was written on five 2×3-in. file cards. The cards were sorted into three decks containing the VbV, VdV, and VgV words, respectively, and finally each deck of cards was shuffled. The three decks contained 125 randomly ordered words each.

The speaker read the words from the card decks in the anechoic chamber, making a pause of about 5 sec between each word pair. He was told to keep the pitch as monotone as possible, to give equal stress to the initial and final vowels, and to make the vowels long. The subject was a relatively low-pitched male speaker of a common Stockholm variety of Swedish. He experienced no difficulty in reading the words in accordance with the instruction.

Spectrograms were made in the same way as before and so were the measurements, except that this time not only the second but also the first and third formants were recorded.

B. Results

The presentation of the results is divided into two parts: the first deals with the stationary segments of the vowels; in the second, the transitions are discussed.

1. Vowel Formants

The formant frequencies of the stationary parts of the vowels are presented in Table II. The data are displayed so as to show the small formant-pattern differences that seem to be associated with the position (initial or final) of the vowel in a VCV word as well as with the identity of the intervocalic consonant.

The physiological nature of these variations cannot be inferred with perfect certainty from the acoustic data. The fact that the formant frequencies of /y/, /ø/, and /a/ in final position seem to approach more central values than those of the same vowels in initial position indicates, however, that the final vowels are somewhat neutralized. This trend is not equally clear in the case of /o/ and /u/.

There is a small influence of the consonants on the vowel-formant frequencies in the stationary portions of the vowels. This influence is observable both in initial and final positions. Thus, the measured frequencies of the second and the third formants of any given vowel preceded by /g/ are consistently somewhat different from the values obtained when /b/ or /d/ precedes that vowel. A difference is also seen in the formant frequencies of /o/ and /u/ before /g/ as compared with the values obtained when these vowels are followed by /b/ or /d/. Figure 3 shows the F2 distributions for /u/ and /o/ before and after /g/ and /d/ as measured in the VCV utterances.

FIG. 3. Histograms representing the distribution of the second-formant frequencies of the vowels /o/ (left) and /u/ (right) in initial (upper) and final (lower) positions of VCV utterances. The two histograms in each portion of the Figure correspond to different medial-consonant contexts (/d/ and /g/). The mean second-formant frequency of any one of the vowels is lower if the medial consonant of the VCV utterance is /d/ than if it is /g/. Abscissa: frequency of F2 in cps.

The numbers given in parentheses below the formant-frequency numbers in Table II represent dispersions. The dispersion measure used is the "interquintile range," which is the length of an interval along the frequency variable that embraces 60% of the distribution and to either side of which half (i.e., 20%) of the remaining part of the distribution is contained (Ref. 17). The formant frequency and dispersion numbers are based on 25 datum points in all cases, except those pertaining to the third formant of /o/ and /u/, which sometimes could not be measured. The formant frequency and dispersion numbers have been rounded off to the nearest 10 and 5 cps, respectively.

The dispersion data show that the variation increases with increasing formant number but is about equal for the same formant of initial and final vowels. The averages are shown in Table III.

TABLE III. Average interquintile ranges of vowel formant frequencies.

Formant   Initial vowel (cps)   Final vowel (cps)
F1                45                   50
F2                60                   80
F3               105                  110

The formant-frequency variability of each of the five vowels, averaged over all positions and contexts and without distinction of formant number, is presented below.

   y      ø      a      o      u
 72.5   60.3   96.0   71.0   67.0  (cps)

Although the average interquintile range of a formant of the vowel /a/ was higher than that of the remaining vowels, it cannot be concluded from this that the /a/ was articulated with less precision, for a large part of the difference is probably due simply to the great sensitivity of F2 of /a/ to small fluctuations in the place of constriction of the vocal tract. The nomograms published by Fant (Ref. 18) relating formant frequencies to place of constriction in four-section straight-tube models for vowels show that the F2 gradient is maximal in the constriction region corresponding to a good /a/ model. It is likely that a similar effect accounts for most of the F-pattern variability of the present speaker's /a/.

In summary, it can be said that the formant-pattern variations in the stationary portions of the vowels in each subdivision of Table II are small. This conclusion is also confirmed by informal auditory analysis in which the vowels of the words recorded were easily identified with those intended by the speaker.

17 H. Cramér, Mathematical Methods of Statistics (Princeton University Press, Princeton, N.J., 1946), p. 213.
18 Ref. 16, p. 72 seq.

2. Formant Transitions

The formant-transition data are presented in Table IV, which is organized in the same way as Table I. A few examples will suffice to illustrate the reading of Table IV. Thus, for instance, the extent and direction of the third-formant transition of the final vowel of the word /ydu/, i.e., of the CV portion of /ydu/, is found in the CV part of the F3 column in the VdV section. To find the row, note that the leftmost column of the Table indicates the vowel in which the formant transition is observed and that the transconsonantal vowel is indicated in the second column from the left. Since, in the present case, we are concerned with transitions in the final vowel, it is the initial vowel (/y/) that is transconsonantal and the final vowel (/u/) is the "observed" vowel. The entry that we are interested in is thus found in the first (/y/) row of the /u/ part of the first column. The entry in question is -20, which means that the starting frequency of F3 in /u/ of /ydu/ is, on the average, 20 cps lower than its frequency in the stationary part of the /u/. In the same way, we find, for instance, that the F2 transition of /u/ in /ugø/ terminates at a frequency 400 cps higher than that of the stationary F2 of /u/, etc. The empty entries of the Table represent formant transitions that could not be measured from the spectrograms.

Table IV shows that there is substantial variation in the formant transitions of any given VC or CV sequence, depending on the identity of the vowel on the opposite side of the stop. This can be seen by selecting any vowel in the observed vowel column and comparing the five formant-frequency values found in the intersection of the row for this vowel and any column of the interior of the Table. The largest difference between any pair of these five values may be called the range of variation of the associated VC or CV formant transitions.

The ranges of variation of all the VC and CV transitions are given in Table V. The maximum ranges for transitions of F1, F2, and F3 are here seen to be, respectively, 130, 540, and 330 cps, and the average ranges are


TABLE V. Ranges of variation in cps of the formant transitions presented in Table IV. Values that are too large to be due to errors of measurement or chance are printed in boldface.

VbV
Observed      F1          F2          F3
vowel       VC   CV     VC   CV     VC   CV
y           30   70    225  155     70  150
ø           30   25    180  220     85   90
a          130   60    305  220    150  135
o           55   45    315  275    240  245
u           30   60    510  300    ...  225

VdV
Observed      F1          F2          F3
vowel       VC   CV     VC   CV     VC   CV
y           30   50    175  205    100  185
ø           45   45    170  265     85  150
a           65  120    240  310    105  140
o           60  120    260  395    ...   90
u           45   90    320  280    ...  ...

VgV
Observed      F1          F2          F3
vowel       VC   CV     VC   CV     VC   CV
y           70   50    540   80    115  110
ø           40   85    470  110    150   75
a           65  120    460  410    295  330
o           20   55    180  285    ...  ...
u           45   85    225  330    ...  ...
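The boldface criterion mentioned in the caption of Table V rests on a critical difference between two sample means. The sketch below reproduces that style of calculation under stated assumptions: the standard deviations (35, 75, and 70 cps for F1, F2, and F3) and the sample size n = 5 are the values quoted in the text, the standard deviation of a difference of two means is taken as s(2/n)^(1/2), and the critical value is assumed to come from a two-sided normal test. The paper does not spell out the last step, so that part is a guess; the results land close to, though not always exactly on, the published critical ranges.

```python
import math
from statistics import NormalDist

# Standard deviations of terminal-frequency deviations (cps), as quoted
# in the text for the fitted normal curves.
STD_DEV_CPS = {"F1": 35.0, "F2": 75.0, "F3": 70.0}
SAMPLE_SIZE = 5  # readings per word


def sd_of_mean_difference(s, n):
    """Standard deviation of the difference of two n-element sample means."""
    return s * math.sqrt(2.0 / n)


def critical_difference(s, n, alpha):
    """Smallest difference significant at level alpha.

    Assumption: two-sided test on a normally distributed difference.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sd_of_mean_difference(s, n)


for formant, s in STD_DEV_CPS.items():
    print(
        formant,
        round(sd_of_mean_difference(s, SAMPLE_SIZE), 1),
        round(critical_difference(s, SAMPLE_SIZE, 0.01)),
        round(critical_difference(s, SAMPLE_SIZE, 0.05)),
    )
```

A range in Table V would then count as significant when it exceeds the critical difference for its formant at the chosen level.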

62, 280, and 153 cps, respectively. In other words, the terminal frequency of, e.g., the second formant in a fixed vowel+stop or a stop+vowel sequence may be expected to vary by 280 cps, on the average, owing to coarticulation with transconsonantal vowels.

The degree of statistical significance of these data can be determined by means of Fig. 4, which shows histograms representing the deviations of the terminal frequencies of the individual transitions from their associated mean values. The fitted curves show that the distributions are approximately normal with standard deviations 35, 75, and 70 cps, respectively. The difference of the means of two samples each containing n elements drawn from some one of these distributions will approach a normal distribution as n becomes large. The approximate standard deviation of this difference for samples of n elements is $s' = s(2/n)^{1/2}$, where s is the standard deviation of the original distribution. Using this formula with n=5, we can calculate the s' corresponding to the first-, second-, and third-formant transitions. They are, respectively, 21, 48, and 45 cps. The critical differences at the 0.01 and 0.05 levels of significance are shown in Table VI.

TABLE VI. Critical ranges of variation of formant transitions at two significance levels.

           Level of significance
Formant      0.01      0.05
F1            57        44
F2           118        93
F3           115        87

In other words, if the terminal frequencies of F2 or F3 of a given CV or VC sequence of Table IV change by more than about 100 cps from one transconsonantal vowel condition to another, then the difference is almost certainly not due to chance. The same holds for F1 when the difference is greater than only about 50 cps. In Table V, those ranges that exceed the critical differences have been printed in boldface. The average ranges for the terminal frequencies of F1, F2, and F3, which, as noted earlier, are 62, 280, and 153 cps, respectively, are all significant. In particular, the range of the terminal frequency of F2 is significant in every VC or CV sequence except /-gy/, where it is only 80 cps. It is interesting that the largest range observed in all of Table V pertains to the second-formant transition of the reverse of /-gy/, i.e., to /yg-/, for which it is 540 cps. A similar relation holds between F2 of /øg-/ and /-gø/.

50% of the F3 ranges of Table V and also 50% of the F1 ranges are greater than the critical values. In the case of /a/, the ranges of the terminal frequencies of all three formants are significant under all conditions. In general, Table V shows that the terminal frequencies of the formant transitions in a vowel+stop or stop+vowel sequence in the speech material studied here are not unique but depend on the F-pattern of the transconsonantal vowel. This is in agreement with the results summarized in Sec. I-B.

In order to emphasize further the complexity of the stop-consonant formant transitions, we discuss certain aspects of the present data in more detail in Secs. II-C, D, and E. In Secs. III and IV, we examine transition data from an American-English and a Russian talker, respectively.

FIG. 4. Histograms representing the deviations of the terminal frequencies of individual formant transitions from their associated mean values. The superimposed curves are the best-fitting normal distributions. Panels: F1, F2, F3. Abscissa: frequency in cps.

C. Nonuniqueness of Terminal Frequencies

Parts of the data of Table IV have been plotted in Figs. 5 and 6. Figure 5 shows all the second-formant transitions observed in the sequences /øC-/ (left part) and /-Cø/ (right part). The second formant of /ø/ has been represented by a thick horizontal bar and the slant lines converging upon it represent the formant transitions. No attempt has been made to indicate the correct time relations (Ref. 19). The symbols along the vertical lines

19 Measurements on the durations of the formant transitions are presented in Roy. Inst. Technol. (Stockholm) Speech Transmission Lab. Quart. Progr. Stat. Rept. 1/1965 (15 Apr. 1965).

FIG. 5. Stylized second-formant transitions observed in the sequences /øC-/ (left) and /-Cø/ (right). The letters at the terminal points identify the medial consonant and transconsonantal vowel of the VCV utterance in which the transition was observed. The vertical lines cover the ranges of variation of the formant transitions associated with the medial consonants /b/, /d/, and /g/.

facing the formant transitions are used to identify the consonant and transconsonantal vowel with which each transition line is associated. The vertical lines cover the frequency ranges of the transitions corresponding to the three stops /b/, /d/, and /g/.

The vertical lines in the left part of Fig. 5 show that the /d/-transition range overlaps completely with the /g/-transition range; i.e., there is no such thing as a unique /d/-transition terminal frequency of the second formant of an /ød-/ sequence for this talker. There is even a frequency region around 1450 cps such that, if a second-formant transition from /ø/ terminates there, it may be associated with any one of the three stops /b/, /d/, and /g/. Here is, accordingly, a case in which the second-formant terminal frequency alone carries no information whatsoever about the place of articulation of the stop.

The right part of Fig. 5 shows a relatively well-defined /g/-transition range, but again the /d/ and /b/ ranges overlap.

The most striking feature of Fig. 5 is perhaps the relatively even spread of the terminal frequencies as one moves along the frequency dimension of the diagram from about 1350 to about 2100 cps. There are no fixed frequencies corresponding to the three stops. This is true not only of the case illustrated in Fig. 5, but of the whole material studied in this experiment (see Table IV).

The overlap of the terminal frequency regions of Fig. 5 can be reduced by taking account of the third formant. This is seen in Figs. 6(a) and (b), which are F2-F3 plots of the data presented in Fig. 5. Here, the vowel /ø/ is represented by a point at F2=1650 cps and F3=2400 cps. The second- and third-formant transition pairs are indicated by straight lines originating from the point representing the vowel /ø/ and terminating at points representing the terminal frequencies of the two formants. The left part of the Figure shows the transitions of the /øC-/ sequences and the right part those of the /-Cø/ sequences. The labels at the terminal points identify the stop and the transconsonantal vowel. The insets show the stylized time course of the F2 and F3 transitions of the four quadrants about the vowel target point.
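The quadrant description of the F2-F3 plane can be made concrete with a small sketch. It takes the vowel target point given in the text (F2 = 1650 cps, F3 = 2400 cps) and assumes the plot layout of Fig. 6, with F3 on the abscissa and F2 on the ordinate. The function name and the example terminal frequencies are hypothetical and are not measurements from the paper.

```python
# Vowel target point in the F2-F3 plane, as stated in the text (cps).
TARGET_F2 = 1650.0
TARGET_F3 = 2400.0


def quadrant(f2_terminal, f3_terminal):
    """Name the quadrant of a transition's terminal point about the target.

    Assumed layout: F3 along the horizontal axis, F2 along the vertical axis.
    """
    vertical = "upper" if f2_terminal >= TARGET_F2 else "lower"
    horizontal = "right" if f3_terminal >= TARGET_F3 else "left"
    return f"{vertical}-{horizontal}"


# Hypothetical terminal frequencies (F2, F3) in cps for illustration only.
examples = {
    "both formants below target": (1400.0, 2200.0),
    "F2 above, F3 below target": (1900.0, 2250.0),
    "both formants above target": (1700.0, 2600.0),
}
for label, (f2, f3) in examples.items():
    print(label, "->", quadrant(f2, f3))
```

Two transitions that share a second-formant terminal frequency can fall in different quadrants once F3 is considered, which is how adding the third formant reduces the overlap seen with F2 alone.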

FIG. 6. (a) F2 vs F3 plot of the formant transitions shown in the left part of Fig. 5. The inset in the upper right corner shows the approximate course of the formant transitions represented by the points in the four quadrants about the location of /ø/ in the F2-F3 plane. (b) F2 vs F3 plot of the formant transitions shown in the right part of Fig. 5. Abscissa: frequency of F3 (cps).

In Fig. 6(b), all the /b/ transitions are located in the lower-left quadrant, the /g/ transitions in the upper-left quadrant, and the /d/ transitions in the upper- and lower-right quadrants. In this plot, unlike that of Fig. 5, there is, hence, no overlap between the /b/ and /g/ transitions. In Fig. 6(a), however, some of the /g/ transitions occupy the same quadrant as the /b/ transitions. Thus, for instance, the F2 and F3 terminal frequencies of the transitions in the initial vowels of /øbø/ and /øgu/ are closely similar.

D. Temporal Features of the Formant Transitions

A qualitative survey of the entire set of data for this speaker is given in Figs. 7-9, which represent spectrograms of VCV words containing the intervocalic stops /b/, /d/, and /g/, respectively. In these Figures, the words in any row have the same final vowel and the words in any column have the same initial vowel. Each of the "spectrograms" is an average of the five readings of each word obtained by tracing on transparent paper the center lines of the formants that are seen on the sound spectrograms. The five superimposed tracings were then used to get a visually determined average corresponding to each word. The accuracy of this method is not much inferior to that involving the statistical calculations and has the advantage of being much faster. The marks along the frequency scales of the graphs are 500 cps apart and the time marks are 100 msec apart.

By reading across rows or columns in Figs. 7-9, one can observe the successive shifts of locus frequencies that are due to the more or less marked influence of the transconsonantal vowel. Several other qualitative observations can also be made. A comparison of /ybo/ and /ygo/, for instance, shows that the CV parts of these two words are practically identical with respect to formant transitions. To judge from the graphs and the spectrograms, the only cues for the distinction are (1) the lower F2 and F3 terminal frequencies of the initial vowel of /ybo/ as compared to those of /ygo/, and (2) the different rates of change of the VC transitions of the two words, F2 being slower and F3 faster in /yb/.

Another interesting feature is the peculiar S-like shape of the F2 transition in the final vowel of some of the VgV displays. This characteristic is particularly clear in words that end in a front vowel (/y/ or /ø/) and begin with a back vowel (/u/ or /o/), i.e., in /ogy/, /ugy/, /ogø/, and /ugø/ (upper right corner of Fig. 9). The F2 transition of the final vowel of /ogø/, for instance, starts with a rise that lasts for about 40 msec. The formant then falls down toward the target frequency for /ø/.

A tentative explanation of this phenomenon may be given with reference to Fant's three-parameter model for simple articulatory constrictions of the vocal tract (Ref. 20). Figure 10, which is derived from this model, shows the frequencies of F2 and F3 for a constant lip-section area (65 mm²) and a constriction of fixed minimum cross-sectional area (65 mm²) located at various places along the tract. The position x of the point of minimum cross section is measured by its distance in centimeters from the glottis. For a mediopalatal constriction (x=13 cm), F2 has a maximum. In a back vowel + /g/ + front vowel utterance, the /g-/ closure may be assumed to start with a relatively posterior point of constriction corresponding approximately to the point x=8 cm of Fig. 10. This is suggested by the low terminal F2 frequency of the initial vowel. During the closure, the configuration of the final vowel is anticipated and this anticipation causes the point of constriction to move

20 Ref. 16, p. 71.

forward so that at the moment of release the minimum cross-sectional area is located slightly behind the F2 maximum of Fig. 10, i.e., at about x = 12 cm. The F2 terminal frequency, therefore, has a higher value at the release than at the moment of closure. The configurations of /y/ and /ø/ have their minimum cross-section points located anteriorly of the F2 maximum. The point of constriction of the transitional configuration sequence following the release of the /g/ must, therefore, pass through the region where F2 has its maximum. In consequence of this, F2 would first rise somewhat and then fall so that the typical S-shaped formant transition would result.

FIG. 10. Frequencies of second and third formants of a tube of length 18 cm and cross-sectional area $A(x) = A_0 \cosh^2[(x - x_0)/h]$ in the neighborhood of the point of minimum cross section $x_0$. The minimum cross-sectional area $A_0$ is assumed to be 65 mm². The laryngeal section has a fixed shape, h is a constant, and the lip-section area is also constant and equal to 65 mm². The abscissa represents $x_0$, the point of minimum cross-sectional area. [After C. G. M. Fant, Acoustic Theory of Speech Production (Mouton & Co., 's-Gravenhage, 1960), p. 80.] Axes: formant frequency (cps) versus distance of constriction from glottis (cm).

Figure 10 suggests that, on the above assumptions, the F3 transition should also be S-shaped, but with a curvature opposite to that of the F2 transition. This is not observed in the spectrograms, however. The reason for this may be that, while Fant's model postulates a catenoidal shape for the vocal-tract constriction, the actual configurations underlying the utterances of Fig. 9 were probably much more complicated. In general, the frequencies of the first and second formants can be estimated from relatively gross approximations of the vocal-tract shape. The dependence of a formant frequency on small irregularities in the shape of the vocal tract increases rapidly with the ordinal number of the formant, however.21 X-ray studies of the utterances of Fig. 9 as spoken by a different talker suggest that the above-mentioned transitions from /g/ to /y/ or /ø/ involve a rapid decrease of the volume and length of the cavity anterior to the constriction. Model experiments by Fant22 indicate that these changes should cause an upward shift of F3 that could be large enough to cancel a possible downward deflection suggested by Fig. 10.

It is interesting that the transition from /g/ to a front vowel after an initial back vowel is not just a time-reversed version of the transition from a front vowel to /g/ followed by a back vowel. This is seen from a comparison of the displays in the lower left corner of Fig. 9 with those of the upper right corner. The former group does not show S-shaped F2 transitions. Apparently, the point of closure in /ygu/, for example, is more anteriorly located than the point of release in /ugy/, and the F2 maximum of Fig. 9 is passed during the closure.

E. Rôle of Lip-Rounding

The coarticulatory variations of the formant transitions that have been discussed above must be mainly the consequence of a corresponding variation of the shape of the tongue during the stop consonants. All the vowels of the word material were rounded and considerable changes in degree of lip-rounding would be expected to take place only in utterances containing both a maximally open vowel, i.e., /u/, and a maximally closed vowel such as /y/. If the difference in the course of the VC formant transitions of the two utterances /ugy/ and /ugu/, for instance, were due simply to a relatively more rapidly decreasing lip-opening area during the initial vowel of /ugy/, then the terminal-formant frequencies of /ug/ before /y/ would be expected to have lower values than those of /ug/ before /u/ (Refs. 23, 24). Inspection of Fig. 1 shows that the actual situation is exactly opposite to this, however. A large number of similar cases can be found in the data. Naturally, the labial gesture will have some influence, but the lingual factor dominates. The same remark holds for the American-English material, to which we turn next.

21 J. M. Heinz and K. N. Stevens, "On the Derivation of Area Functions and Acoustic Spectra from Cinéradiographic Films of Speech," J. Acoust. Soc. Am. 36, 1037-1038(A) (1964).
22 Ref. 16, p. 66.
23 V. Fromkin, "Lip Positions in American English Vowels," Language & Speech 7, 215-225 (1964).
24 Ref. 16, p. 219.

III. AMERICAN-ENGLISH TALKER

In this study, a speaker of standard American English read a list of VCV words consisting of all combinations of the vowels /i/, /a/, /u/ and the consonants /d/ and /g/.

TABLE VII. Frequencies in cps of the second formant measured in the vowels of VCV utterances spoken by an American subject. Each value is based on five measurements.

                  VdV                VgV
            Initial   Final    Initial   Final
  i    i     2260     2340      2290     2325
       a     2340     2280      2240     2280
       u     2290     2260      2280     2280
  a    i     1180     1210      1200     1245
       a     1140     1200      1160     1175
       u     1130     1180      1120     1205
  u    i      830      960       870      960
       a      830      950       860      930
       u      860      950       860      960

The average second-formant frequencies of the stationary parts of the vowels are presented in Table VII and the second-formant transition data appear in
Table VIII. These data are based on five readings of each word.

TABLE VIII. Values in cps of F2t - F2v for the American subject. The Table is arranged in the same way as Table I. Each number is an average of five measurements.

                 VdV               VgV
              VC      CV        VC      CV
  i    i    -130    -380       +10     -65
       a    -190    -390      +100     -80
       u    -250    -370       -40     -20
  a    i    +360    +550      +260    +825
       a    +275    +110      +160    +355
       u    +290    +420       +80    +165
  u    i    +730    +820      +130    +640
       a    +630    +650       +90    +280
       u    +500    +650         0     +75

It is apparent from Table VIII that the same general coarticulation rule holds for the American talker as that which was observed in the Swedish material. The locus shift is most prominent in the CV parts of the pairs /ida/ versus /ada/ and /iga/ versus /uga/.

IV. RUSSIAN TALKER

The palatalized/velarized distinction is an important feature of the Russian consonant system.25 One of its acoustic manifestations is a fixation of the terminal frequencies of the formant transitions. This is illustrated by the following data.

A native speaker of Russian26 read a list of VCV words each of which consisted of the vowel /e/ followed by /b/, /d/, or /g/ or their palatalized counterparts and ending in one of the vowels /i, ы/, /e/, /a/, /o/, or /u/. The words were read in random order and each word was read five times. Table IX shows the ranges of variation of the second-formant transitions as measured in the initial vowel under the different contextual conditions.

TABLE IX. Ranges of variation in cps of the second-formant transitions in the vowel /e/, followed by various CV combinations as spoken by a Russian subject. Primes on the consonants represent palatalization. The dash corresponds to the five vowels /i, ы/, /e/, /a/, /o/, and /u/.

  eb'-    eb-    ed'-    ed-    eg'-    eg-
  120     90     80      60     45      570

The variation is too small to be significant, except in the single case of the unpalatalized /g/. However, the sequence unpalatalized /g/ + front vowel is not admissible inside Russian morphemes. The measurements for /g/ may, therefore, not be representative. If this case is excluded, it appears that the type of coarticulation observed in Swedish and English does not obtain in the Russian of this talker.

An interesting acoustic feature of palatalization was observed when the formant transitions following the palatalized stops were compared with those following the corresponding unpalatalized stops in the same vowel context. The former transitions were usually convex upwards and the latter were convex downwards. With reference to Fig. 10, this is what would be expected in general if the release of the palatalized stops were associated with a forward motion of the point of maximum constriction of the tract and if the unpalatalized variants involved a backward motion.

V. FURTHER OBSERVATIONS

A large number of less systematic observations have been made on various coarticulation effects in the speech of several Swedish and American-English talkers. The results of these studies may be summarized simply by saying that they confirm the results reported in the present paper insofar as stop consonants are concerned. Sample measurements on American-English and Swedish fricatives in VCV context have not exhibited demonstrable locus shifts of the kind seen in connection with the stops, however. It would seem that the coarticulation properties of the former class of sounds are rather different from those of stop consonants.

25 M. Halle, Sound Pattern of Russian (Mouton & Co., 's-Gravenhage, 1959).
26 Male native of Leningrad.
27 P. C. Delattre, A. M. Liberman, and F. S. Cooper, "Acoustic Loci and Transitional Cues for Consonants," J. Acoust. Soc. Am. 27, 769-773 (1955).
28 A. M. Liberman, "Some Results of Research on Speech Perception," J. Acoust. Soc. Am. 29, 117-123 (1957), quotation from p. 118.

VI. GENERAL DISCUSSION

A. Locus Theory

Before turning to the physiological interpretation of the data, it may be appropriate at this point to make a few comments about the locus theory of Delattre, Liberman, Cooper,27 et al., since the results reviewed above seem to differ from those obtained by these workers. In a now classical paper, Liberman ten years ago stated28 (with several reservations): "... there are, for each consonant, characteristic frequency positions, or loci, at which the formant transitions begin, or to which they may be assumed to point. On this basis, the transitions may be regarded simply as movements of the formants from their respective loci to the frequency levels appropriate for the next phone, wherever those levels might be. The spectrographic patterns..., which produce /d/ before /i/, /a/, and /o/, show how this assumption suggests itself for the case of the second-formant transitions. We observe that all of these transitions seem to be pointing to a locus in the vicinity of 1800 cps." The Haskins workers were able to extrapolate fixed-locus frequencies also for American English /b/ and /g/. In the case of /g/, two loci had to
be assumed: a low-frequency locus for contexts involving back vowels and a high-frequency locus for front vowels.

In Fig. 11, a plot is presented that summarizes the CV second-formant locus frequencies of the data contained in Figs. 7 and 8. The /b/ + vowel and /d/ + vowel second-formant transitions were traced on transparent paper for each initial vowel of the VCV words separately. The locus frequency was then estimated in each of the five cases following the rules of Ref. 27. I.e., a common CV second-formant terminal frequency was extrapolated from the five utterances /yCy/, /yCø/, /yCa/, /yCo/, and /yCu/. A second one was obtained from /øCy/, /øCø/, /øCa/, /øCo/, and /øCu/, and so on for each initial vowel (C = b or d). In Fig. 11, these CV loci are plotted along the vertical axis as a function of the mean stationary F2 frequency of the initial vowel of the VCV words (horizontal axis).

FIG. 11. Frequencies of second-formant transition loci of /b/ + vowel and /d/ + vowel sequences for different transconsonantal vowel conditions. Horizontal axis: F2 frequency of initial vowel (cps).

It is apparent from this graph that the CV /b/ and /d/ loci are not independent of the vowel that precedes the CV sequence. Thus, the second formant in a transition from /b/ to a following vowel must originate from a point at 500 cps when the vowel /u/ precedes the stop. When /y/ precedes, the same point must be located at 1300 cps. An analogous statement holds for /d/. However, although the loci are not constant, the estimation of a locus frequency from VCV utterances having the same initial vowel (and consonant) was easy to perform in all cases presented in Fig. 11. In the case of /g/, more than one locus must be assumed even when the initial vowel is fixed (see for instance the second column from the left in Fig. 9).

The locus values presented by Delattre, Liberman, and Cooper were based on spectrographic studies of American-English CV utterances and were tested by means of synthesis experiments. These locus values are apparently close to those that are obtained if the consonants are studied in symmetric VCV contexts. The resulting loci will then be intermediate between the largest and smallest values given in Fig. 11. Thus, the single F2 locus frequency obtainable for each of the consonants /b/ and /d/ by the methods of Ref. 27 will probably be an average of the several values that arise when all VCV contexts are considered.

It is possible, on the basis of the data presented in Figs. 7-9, to synthesize two VCV utterances that have the same VC and CV first-, second-, and third-formant terminal frequencies but that nevertheless differ with respect to the auditory impression of the consonant that they give to a Swedish listener. At least one of the vowels must be chosen differently in the two cases. The consonant of one of the stimuli may be perceived as a /b/ and that of the other stimulus as a /g/. Evidently, in such cases, the terminal frequencies themselves are less important for the perception of the stops than the manner in which these frequencies are approached and left by the formants during the transitional segments. It would thus seem that the perception of the intervocalic stop must be based on an auditory analysis of the entire VCV pattern rather than on any constant formant-frequency cue.

B. Physiological Interpretation

The data that have been presented in this paper suggest that in Swedish and English the variability of the formant transitions in VC sequences is controlled by the postconsonantal vowel (Fig. 1). Since traces of the final vowel are observable already in the transition from the initial vowel to the consonant, it must be concluded that a motion toward the final vowel starts not much later than, or perhaps even simultaneously with, the onset of the stop-consonant gesture. A VCV utterance of the kind studied here can, accordingly, not be regarded as a linear sequence of three successive gestures. We have clear evidence that the stop-consonant gestures are actually superimposed on a context-dependent vowel substrate that is present during all of the consonantal gesture.29

To understand something of the physiology behind these facts, we must consider the articulatory activity of the tongue. In Swedish and English, the invariant features of /d/ and /g/ are, respectively, the apico-alveolar and dorsopalatal contacts. The precise shape of the vocal tract during the stop closure is quite irrelevant, phonemically, in these languages insofar as the identities of the stops are concerned. This shape may, in fact, vary a great deal without affecting the abovementioned characteristic closure features. Indeed the progressive [...] of the stop to the final vowel manifests itself in adjustments of the very shape of the unconstricted parts of the vocal tract. This is demonstrated by the x-ray tracings of Fig. 12 (Ref. 30).

29 The variability of the formant transitions in fixed-CV sequences apparently arises from the circumstance that the final vowel is approached in different ways depending on which preconsonantal vowel configuration the motion starts from.
30 The tracings of Fig. 12 were obtained from x-ray motion pictures of the author's speech. The cooperation of Prof. Negelius, Wenner-Gren Res. Lab., Stockholm, is acknowledged.

The second column from the left shows, from top to bottom, tracings of the vowels /y/, /o/, and /u/. The column to the left of it shows /d/ closures in the environments /y-y/, /u-o/, and /u-u/, and the column to the right shows /g/ closures in the same environments. It is
apparent from these tracings that the unconstricted parts of the vocal tract assimilate the shape of the initial and final vowels. Note also the vowel-dependent variability of the place of constriction in the /g/ column.

FIG. 12. Contour tracings from x-ray motion pictures of VCV utterances. The edges of the hyoid bone, the mandible, and the epiglottis are indicated. The grid in the lower right corner shows the spatial distortion introduced by the x-ray system. The real distances between adjacent lines in the grid are 1 cm.

The rightmost part of the Figure (top) shows a tracing of the /g/ closure in /ygu/ sampled a few milliseconds after the beginning of the closure (solid lines). The superimposed dotted and dashed tongue contours represent the /g/ closure of /aga/ and /ygy/, respectively. It is seen here that the vocal-tract shape delimited by the solid line is intermediate between that of the dotted and dashed lines. This illustrates the fact that the cavities may change their shape even during the stop closure. In other words, the tongue is able to make a distorted vowel gesture, while it is executing the stop consonant.

We may summarize these observations in terms of the following more-hypothesized statement. The production of the vowels and of the apical and dorsal consonants involves activity in three (probably partly overlapping) sets of muscles. These sets have separate neural representations in the motor networks of the speaker's brain. Articulatory commands may be transmitted over the three neural control channels independently of each other. However, the dynamic response of the tongue to a compound instruction is a complex summation (neural, muscular, and probably mechanical also) of the responses to each of the components of the instruction.

In other words, we may probably regard the tongue as three separate articulatory systems that share some muscles. The three systems may be controlled in a purely independent manner. Different languages may impose phonetic restrictions overruling this freedom of control, however. Thus, in Swedish and English, the stop consonants seem to coarticulate relatively freely with the vowels. This implies, in terms of the present model, that these stops are produced almost exclusively by commands over the apical and dorsal channels, leaving certain subsets of the muscles responding to vowel instructions free to anticipate a following vowel. On the other hand, there are languages, such as Russian, in which the instructions for the stop consonants are made up of an apical or dorsal command as in English or Swedish but with the additional feature that the vowel channel must simultaneously receive exactly one of two fixed commands [i] or [ы], where [i] produces palatalization and [ы] velarization.31

To say that the apical and dorsal constriction systems of the tongue may be controlled independently of the vowel activity is, thus, analogous to the statement that nasalization and voicing are independent parameters in speech. This is a universal feature of the basic phonetic competence of all normal speakers. The fact that some languages do not make a distinction between voiced and voiceless nasals, whereas others do,32 must be interpreted as a difference in the linguistic programming of the available physiological apparatus.

31 S. E. G. Öhman, "Note on Palatalization in Russian," Mass. Inst. Technol. Res. Lab. Electron. Quart. Progr. Rept. No. 73, 161-171 (15 April 1964).
32 R. Jakobson, G. Fant, and M. Halle, Preliminaries to Speech Analysis (MIT Press, Cambridge, Mass., 1963), 4th printing, p. 40.

The model discussed thus far may perhaps also shed some light on the dynamic behavior of the lips. Accordingly, we should probably distinguish two physiologically independent types of labial activity in speech: namely (Type 1), the closing motions that take place in the vertical dimension and which are observable, e.g., in the English consonants /p b m f v/; and (Type 2) the rounding-spreading dimension of motion, which is used for the phonetic feature of rounding in vowels.

There are rounded as well as unrounded labial consonants in English (compare the /p/ of "Oh Pooh!" with that of "Yippee!"). This suggests that the vowel component of the total labial system, however nonphonemic in English, in principle could be used to add phonemic distinctions to labial consonants that are otherwise produced by the consonantal (Type 1) activity mentioned above.33 The behavior of the lips would thus seem to be analogous, in this respect, to that of the tongue.34
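
The notion of a stop gesture superimposed on a vowel substrate can be made concrete with a small numerical sketch. What follows is not the model of this paper; it is a minimal illustration under stated assumptions. The tract is reduced to three hypothetical points (tongue tip, dorsum, pharynx), the vowel-to-vowel gesture is taken to be a linear glide, the consonant activation is an assumed raised-cosine pulse, and the blending rule (vowel shape plus activation times weight times the distance to the consonant target) is chosen only for simplicity. All function names and all numerical values are invented for the example.

```python
import math


def vowel_substrate(v_initial, v_final, t):
    """Linear glide from the initial to the final vowel shape, t in [0, 1]."""
    return [a + (b - a) * t for a, b in zip(v_initial, v_final)]


def consonant_activation(t, onset=0.25, offset=0.75):
    """Assumed raised-cosine pulse: 0 outside (onset, offset), 1 at the midpoint."""
    if t <= onset or t >= offset:
        return 0.0
    phase = (t - onset) / (offset - onset)
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * phase))


def vcv_contour(v_initial, v_final, c_target, c_weight, t):
    """Superimpose a stop gesture on the vowel glide at normalized time t.

    Each point moves from its vowel value toward the consonant target in
    proportion to the activation k(t) and to a per-point weight in [0, 1].
    """
    v = vowel_substrate(v_initial, v_final, t)
    k = consonant_activation(t)
    return [vi + k * wi * (ci - vi) for vi, ci, wi in zip(v, c_target, c_weight)]


# Hypothetical constriction heights (arbitrary units) at three points:
# [tongue tip, tongue dorsum, pharynx]. The values are invented.
FRONT_VOWEL = [0.3, 0.4, 1.0]
BACK_VOWEL = [1.0, 0.8, 0.3]
APICAL_TARGET = [0.0, 0.5, 0.5]  # full closure at the tip
APICAL_WEIGHT = [1.0, 0.5, 0.0]  # pharynx left entirely to the vowel channel

if __name__ == "__main__":
    for label, v1 in (("front + stop + back", FRONT_VOWEL),
                      ("back + stop + back", BACK_VOWEL)):
        shape = vcv_contour(v1, BACK_VOWEL, APICAL_TARGET, APICAL_WEIGHT, 0.5)
        print(label, [round(s, 3) for s in shape])
```

At mid-closure the tip value is zero in both contexts, whereas the pharyngeal value still depends on the surrounding vowels. This mirrors, in toy form, the observation that the closure feature is invariant while the unconstricted parts of the tract follow the vowel context.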

C. Timing of the Instruction Components

The model suggested here postulates only three motor instructions corresponding to a Swedish or English vowel + stop + vowel utterance: namely, one for the initial vowel, one for the consonant, and one for the final vowel, the latter being applied before the response to the consonant instruction has died out so that the observed coarticulation effect may result.35 However, the data presented in this paper support only the assertion that some components of the vowel gesture are active already at the onset of the consonant gesture in certain VCV words, but they are inconclusive as to the question whether or not further vowel instruction components are added on during or at the offset of the consonant gesture. To judge from x-ray motion pictures of /ydu/ for instance, the /d/ could not be properly articulated if the back part of the tongue assumed the /u/ position too early for, then, the tip of the tongue would apparently not be able to reach the alveolar ridge. This is also observable in /udu/, where the back part and indeed the whole mass of the tongue is seen to move forward during the /d/ gesture, seemingly for the purpose of making the alveolar contact possible. Thus, in /ydu/, the motion in the direction of the final vowel during the /d/ gesture cannot go to completion. The consonant gesture somehow overrules the vowel gesture if the latter is antagonistic to the former. It is therefore not clear whether the vowel gesture may be said to be the result of one temporally invariant instruction or whether the relative timing of the components of the instruction is variable from context to context. Further physiological measurements are needed to clarify this problem.

The complexity of the relative timing relations is also illustrated by the data of Fig. 3, which show that the formant frequencies during the stationary parts of the initial vowels of the Swedish word material were different for different medial stop consonants. Figure 13 shows two superposed x-ray tracings of the tongue shape during /u/ as recorded several hundred milliseconds before the closures of /g/ and /d/, respectively. It is seen that /u/ before /g/ has a narrower dorsopalatal region than /u/ before /d/. This physiological difference correlates well with the acoustic difference seen in Fig. 3.

FIG. 13. Contour tracings from x-ray motion pictures of the vowel /u/ before /g/ (solid line) and before /d/ (dashed line) in VCV utterances.

Apparently, the articulatory system "prepares" for the medial consonant during all of the initial vowel. The effect resembles the nasalization of vowels before nasal consonants, which in many languages may affect the entire vowel, from the moment it starts, even when the vowel is prolonged.

Evidently, the fact that a consonant may color a prolonged preceding vowel speaks against the notion that the neural instructions that control the stream of articulatory gestures of an utterance may be regarded as a simple sequence of temporally well-defined impulses, each of which corresponds to a single unit of a linguistic transcription of that utterance. A more appropriate model would perhaps be one that operates with a hierarchy of instruction levels corresponding, e.g., to the linguistic units of phoneme, syllable, word, and phrase. The coarticulation phenomena studied in the present paper could then be termed phonemic, since they concern interactions between phoneme gestures. The preparatory effect of consonants upon the vowel of a preceding syllable mentioned in the last paragraph might be the consequence of some higher-level coarticulation rule.

33 According to the authors of Ref. 32, p. 50, this is the case in Circassian.
34 The retroflex vowels of English exemplify the use of consonantal (apical) gestures for the addition of an extra vowel distinction.
35 [...]cal Model for Coarticulation, Using a Computer-Simulated Vocal Tract," J. Acoust. Soc. Am. 36, 1038(A) (1964). A more complete summary is in preparation.

VII. SUMMARY

For the speakers of Swedish and English who were discussed in the present paper, the terminal frequencies of the formants in VCV utterances depend not only on the consonants but on the entire vowel context. The stop-consonant loci are, therefore, not unique. The reason for this fact seems to be that the production of the consonant involves concomitant articulatory adjustments partially anticipating the configuration of the succeeding vowel. Neural commands for the consonant and the following vowel must, hence, be active simultaneously. However, certain components of the final vowel instruction are probably inhibited during the consonant. Furthermore, the medial consonant configuration may be slightly anticipated during the initial vowel. The latter effect is observable even at the beginning of prolonged initial vowels.

The freedom of coarticulation of the Russian stops is small. Apparently, these stops must always be coarticulated with one of two fixed vowels, the palatal [i] or the velar [ы]. Similarly, Swedish and American-English fricatives do not seem to enjoy the same coarticulation properties as the stops.

Probably, these observations may be accounted for in terms of an articulatory speech synthesis model in which (1) apical constrictions, (2) dorsal constrictions, and (3) the position of the body of the tongue are represented by a small number of independent control parameters. The operation of this model would involve a hierarchy of instructions corresponding to the phoneme, syllable, word, and phrase levels of a linguistic representation.

ACKNOWLEDGMENTS

I am deeply indebted to Dr. C. G. M. Fant, Royal Institute of Technology (KTH), Prof. K. N. Stevens, Massachusetts Institute of Technology, and B. E. F. Lindblom, Royal Institute of Technology (KTH), for the invaluable support, encouragement, and intellectual stimulus that I have received from them during my work with the present paper. Stimulating discussions with Professor M. Halle, Massachusetts Institute of Technology, Professor P. Ladefoged, University of California, Los Angeles, and Professor J. M. Heinz, Massachusetts Institute of Technology, have given me useful instruction of a kind that I could not have obtained otherwise. I also want to express my sincere thanks to all members of the Speech Transmission Laboratory, Royal Institute of Technology (KTH), Stockholm, Sweden, the Speech Group, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts, and the Haskins Laboratories, New York, among whom I have had the opportunity to work. I am particularly grateful to S. Felicetti for her expert aid in the preparation of the manuscript.

This work has been supported by the U.S. Air Force, by the U.S. Army, by the National Institutes of Health, U.S. Department of Health, Education, and Welfare, by the Swedish Technical Research Council (Statens Tekniska Forskningsråd), and by the Sweden-America Foundation (Sverige-Amerika Stiftelsen).