Lip-Pellet Positions During Vowels and Labial Consonants
Journal of Phonetics (1997) 25, 405–419

John R. Westbury and Michiko Hashi
Waisman Center and Department of Communicative Disorders, University of Wisconsin-Madison, 1500 Highland Avenue, Madison, WI 53705-2280, U.S.A.

Received 15th July 1996, and in revised form 8th April 1997

Sagittal-plane movements of small markers attached to the upper and lower lips were analyzed for ten speakers of American English, and seven speakers of Japanese. Each speaker produced simple utterances containing vowels and labial consonants. The data were analyzed to better understand: (1) patterns of pellet motions associated with labial consonant production; (2) pellet positions at discrete, acoustically-defined moments during selected speech sounds; and (3) the relationship between midline separation between the lip surfaces and inter-lip-pellet distance. Results from the study provide qualitative information about the dynamics of labial gestures for consonants involving lip closure. The data also indicate that the English and Japanese speakers positioned and moved their lips in generally similar ways during the test sounds analyzed. Finally, results suggest that plausible estimates of mid-line inter-lip separation can be derived from the trajectories of two pellets, one on each lip, as long as the possibility of lip-body deformation is taken into account.

© 1997 Academic Press Limited

1. Introduction

Houde (1968) introduced the phrase “point-parameterized” to refer to records of speech movement that take the form of trajectories of discrete, point-like “markers” (e.g., pellets, coils, light-emitting diodes, reflecting disks, or even bony landmarks) on an articulator. Several modern techniques for studying speech movement provide data of this type for accessible articulators such as the tongue, lips, jaw, and soft palate.
For example, the X-ray microbeam (e.g., Macchi, 1988), electro-magnetic articulometry (e.g., Perkell, Matthies, Svirsky & Jordan, 1993), and certain opto-electrical techniques (e.g., Boyce, 1990), have all been used to examine and describe speech-related motions of the lips in terms of sagittal-plane coordinates of the centers of small markers, one firmly attached in the mid-line at the vermillion border of each lip.¹ The lips pucker and spread and rise and fall during speech, as talkers shape the inter-lip cavity to modify the spectrum of sound radiating from the mouth. In the best of all worlds, a complete understanding of these actions might be inferred from a sagittal-plane representation of the motions of two fleshpoints. However, more must be known about the relationship between lip-marker positions, and the nature and degree of labial constrictions, before broad inferences about lip action can be drawn.

¹ A recent report by Ramsey, Munhall, Gracco & Ostry (1996) describes three-dimensional fleshpoint-kinematic data, recorded from a single talker, using multiple lip markers tracked by means of an optoelectrical system. Those data provide a more complete view of the actions of the lips than the more conventional two-marker, sagittal-plane view considered in this report. However, inferences about size and shape of the lip opening, from both types of data, will probably be constrained in similar ways.

A handful of direct-imaging studies of lip function (e.g., Fujimura, 1961; Fromkin, 1964; Linker, 1982; Abry & Boë, 1986) provide information about time changes in size and shape of the lip opening during speech. However, these studies do not tell us what discrete point motions reveal about labial articulation. An indirect way to begin to address this question is to examine trajectories of lip markers during speech events that limit lip opening.
The labial consonants /p b m/ are events of this type. During their closures, no air exits the mouth, because the lips are fully together and form a tight seal. For this reason, we might expect the positions of lip markers, and the distances between them, to vary little during consonantal closure intervals. The analysis of lip-marker kinematics described in this report was designed, in part, to address these simple expectations, and in general, to provide an improved understanding of two broader topics. The first relates to how midline labial fleshpoints move as lip closures are formed and released for the labial consonants /p b m/. The second relates to the question of whether reliable conclusions about labial constrictions are possible from something so simple as the sagittal-plane trajectories of one marker on each lip.

Figure 1. A stylized tracing of a midsagittal section of the vocal tract. Hypothetical pellets are represented by small solid circles “attached” to the outlines of both lips. Letter symbols and lines are defined in the text.

A practical benefit associated with a better understanding of the articulatory significance of lip-marker positions has to do with representing vocal tract postures for vowels. It is common to think of the vocal tract as a flexible tube, and to define the articulatory posture at any moment during speech in terms of an area function which represents the cross-sectional area of the tube as a function of its length, from the glottis to the lips. A plausible labial termination can easily be “attached” to a tube approximation of the vocal tract if length and degree of the lip constriction are simply related to the measured positions of upper and lower-lip markers. A sketch shown in Fig. 1 suggests a scheme for doing this.
To a first approximation, constriction length might be represented by the length of a line segment such as A, drawn tangent to the lower-most edges of the maxillary incisors, and perpendicular to the segment connecting upper and lower-lip marker positions. Constriction degree reflected by the midsagittal separation between the lip surfaces, analogous to B, might then be represented by the distance C between the two markers, minus some reference measure of the lips’ combined thickness. A “candidate” reference measure might be some distance between markers when the opposing lip surfaces are in contact (e.g., during closure for a labial stop).

2. Methods and materials

Acoustic and labial fleshpoint kinematic data from a sample of 10 speakers of American English, and seven speakers of Japanese, were analyzed. Data from these speakers were not collected specifically for this analysis. Instead, they were available from existing corpora, in which the sound pressure wave had been sampled 21,739 times/s, while each of the speakers produced single examples of /p b m/ in isolated /ɑ – ɑ/ frames, with primary stress on the second vowel; and, single, isolated examples of the five vowels /i ɛ ɑ o u/².  During each brief speech task, sagittal-plane positions of upper-lip (UL) and lower-lip (LL) markers (gold pellets, 2.5 mm in diameter, attached in the midline at the vermillion border), were recorded at 40 and 80 times/s, respectively, using the X-ray microbeam (XRMB) system at the University of Wisconsin³. Pellet positions were expressed relative to cranial axes, defined relative to each speaker’s maxillary occlusal plane (MaxOP) and central maxillary incisors (CMI), according to conventions described elsewhere (Westbury, 1994a).
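The estimate of constriction degree outlined in the Introduction (the distance C between the two markers, minus a reference distance taken while the lips are in contact) can be illustrated with a short sketch. The code below is not the authors' procedure. It assumes, for illustration only, that the UL track (40 samples/s) is brought onto the LL time base (80 samples/s) by linear interpolation before C is computed. All function names and coordinate values are hypothetical.

```python
import math


def interpolate_linear(t, t0, p0, t1, p1):
    """Linearly interpolate an (x, y) position at time t between two samples.

    (t0, p0) and (t1, p1) are the bracketing samples; t0 <= t <= t1 is assumed.
    """
    w = (t - t0) / (t1 - t0)
    return (p0[0] + w * (p1[0] - p0[0]),
            p0[1] + w * (p1[1] - p0[1]))


def inter_pellet_distance(ul, ll):
    """Euclidean distance C (mm) between UL and LL pellet centres."""
    return math.hypot(ul[0] - ll[0], ul[1] - ll[1])


def estimated_separation(distance_c, closure_reference):
    """Distance C minus a reference distance measured during lip closure.

    The result is not clamped: it is negative whenever the pellets are
    closer together than they were at the reference moment.
    """
    return distance_c - closure_reference


# Made-up coordinates (mm) in a cranial frame, for illustration only.
# Reference: both pellets during a labial closure.
reference_mm = inter_pellet_distance((0.0, 12.0), (0.0, 0.0))

# Two consecutive UL samples (40/s, 25 ms apart) bracketing one LL sample (80/s).
ul_mid = interpolate_linear(0.0125, 0.0, (0.0, 12.0), 0.025, (0.0, 14.0))
ll_mid = (5.0, 1.0)

distance_mm = inter_pellet_distance(ul_mid, ll_mid)
separation_mm = estimated_separation(distance_mm, reference_mm)
print(f"C = {distance_mm:.2f} mm, estimated separation = {separation_mm:.2f} mm")
```

Leaving the difference unclamped is a deliberate choice in this sketch: a negative value flags a moment at which the pellets are closer than at the reference, which is one way the lip-body deformation mentioned in the abstract might show up in such an estimate.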
Materials for the English speakers were drawn from the publicly available XRMB Speech Production Database (Westbury, 1994b) [speakers E29 (female), E31(f), E34(f), E35(f), E41 (male), E44(m), E53(m), E54(f), E59(m), E61(m)].⁴

² Throughout the text of this report, the mid-front, low-back, and high-back vowels of Japanese are transcribed phonemically as /ɛ ɑ u/, respectively. These symbols are correct for the English vowels, but possibly misleading for Japanese. Sources cited by Vance (1987) suggest that the mid-front vowel in Japanese may be phonetically “midway” between [e] and [ɛ]; the low-back vowel midway between [a] and [ɑ]; and the high-back vowel closer to [ɯ]. In Fig. 5 the symbols e and a are used to represent the mid-front and low-back vowels produced by both sets of speakers. This usage is due to graphical limitations imposed by plotting software.

³ The XRMB system sometimes fails to track one or more pellets for some or all time samples spanning a speech task. Examples of /m b/ were lost to mistracking for one Japanese speaker (J5). Consequently, only 49 VCV utterances were available for analysis, rather than the 51 that would be expected given one example each of /p b m/ in an [ɑ–ɑ] frame, for seventeen speakers. Two time samples of UL pellet position were also lost to mistracking for one English speaker (E61), during the release gesture for /m/. However, sufficient residual information was available in the latter case to approximate the two missing time samples in the trajectory, using a simple linear interpolation scheme.

⁴ Identification numbers for English speakers are the same as in the publicly available XRMB Speech Production Database, while the speech tasks correspond to vcv and vowel records from that corpus. Waveforms analyzed in the current study were processed and filtered in a slightly different way from those in the public release of the XRMB corpus, though these differences have no material effect on relative magnitudes of positional measures. Materials from these XRMB Database speakers, tasks, and/or waveforms may be used for similar or other purposes, in future work by other investigators, and it may become important to know which of the relevant materials were included in the current analysis. The 17 English and Japanese speakers included in this study are the same group also described in an independent analysis of vowel configurations (Hashi, Westbury & Honda, 1994).

Materials for the Japanese