Adaptation in the Speech Tract
Gaberell Drachman
Ohio State University

This paper will discuss the extreme examples of vicarious articulation arising in the cases of speech without a larynx or without a tongue--whether during speech acquisition as a result of the congenital lack of an organ, or during rehabilitation following extirpation of an organ. It will be suggested that a viable model for production and perception must formalize the notion "ease of articulation" so as to take account of the speech-adaptations obtaining in the grossly defective as well as in the normal speech tract.

Essentially, the vocalisation mechanism in man, as in the primates generally, consists of a source (the larynx) and a variable filter (the supra-laryngeal tract).1 The representation of the innate capabilities of this tract in human speech, and thus also the definition of the Distinctive Features to be utilised in that representation, could therefore, it might seem, be made simply on the basis of anatomical measurements.

But the tract that speaks is not the tract that yawns, blows, or swallows. Just as the body takes up a specific stance or priming before the act of running, so for the far more complex act of speech, a function-specific programme-adaptation is said to override the non-speech mode of functioning of the parts of the vocal tract. This new programme modifies not only the overall configuration of the tract, but also the co-ordination of its parts. It is held to occur measurably before the onset of the speech act, to be context-free (i.e., independent of the nature of the first segment of the utterance to come),2 and even to be carried out in a precise time-sequence (Krmpotić, 1959).

The following are a few of the putative elements of programming for what (cf. Chomsky-Halle, 1968) we shall refer to as the "speech neutral tract".
The whole larynx rises at the beginning of the phrase (Perkell, 1965), then (after fluctuating in height during speech--e.g., undergoing further raising for high vowels or ejectives, or lowering for voiced obstruents or implosives) lowers again to its "rest" position only at the end of a phrase.3 Apart from the still controversial problem of the speech-neutral stance of the vocal cords (Halle-Stevens, 1967),4 the configuration of the internal surfaces of the infra-glottal space of the larynx forms an exponential acoustic transformer between the sub- and supra-glottal cavities during phonation (Fink-Kirschner, 1959). There is also evidence that the lateral pharynx walls contract slightly before the onset of speech, again fluctuating about the new (speech-neutral) norm, and again relaxing fully only at the end of the phrase (Kelsey, et al., 1968).

I mention, though with only a single example here, the second aspect of speech-adaptation: this is the adaptation of the timing and co-ordination programmes that distinguish the use of the elements of the vocal tract in speech from their use in non-speech functions. For example, quite new co-ordinations are required for the use of the velum in speech, as against swallowing, as regards its co-operation with the muscles of the upper pharynx (Bloomer, 1953).

Thus, it seems to be the case that the definition of Distinctive Features must depend on the superordinate speech-neutral configuration of the vocal tract, and not simply on anatomical measurements made with the tract at rest.

Now it is perhaps to be supposed that all of these programme-adaptations are inherited, and thus a part of the innate mechanism of language. However, it is obvious that rather parallel (compensatory) programme-adaptations are being continually made throughout the life of the individual, whether because of differential growth5 or through accidental changes in the tract.
First, notice one respect in which we do not attempt to make compensatory changes of articulation. It is clear (cf. Mol, 1964) that the child attempts neither to imitate the exact position of the formants of the adult vowels that he is matching, nor to retain his childhood vowels in the face of a lengthening tract and subsequently downward-shifting formants. On the other hand, the adolescent boy quickly compensates and regains command of the muscle-co-ordinations required for pitch and register-shift, despite changes in the front-to-back dimensions of the larynx, with attendant maladjustments in the thyro-arytenoid and vocalis muscles, at the comparatively sudden onset of puberty (cf. Weiss, 1950). Adaptation to artificial dentures is of course well-known, and it is also clear that quite large growths in the oral cavity--e.g., within the dome of the palate--are easily compensated for in articulation.

However, there are extreme cases of adaptation in articulation-programming for which "compensation" is no longer appropriate, certainly by any reasonable constraint on the term. These are the cases of alaryngeal and aglossal speech, that is, cases where the larynx or tongue is either immobilised or absent; here, since acoustical rather than physiological approximation is the aim of the speaker, the term "vicarious articulation" is more appropriate.

When the larynx is incapacitated or extirpated, the most common form of speech developed is known as "esophageal speech". Figs. 1(a) and 1(b) make clear the usual anatomical changes involved in total laryngectomy. For the two components of the source, the air source (or activator) is now supplied by the upper part of the esophagus, and the glottis by the musculature at the junction of the pharynx and the esophagus [Fig. 1(c)].

[Figure: three schematic panels, (a) Normal Section, (b) Total Laryngectomy, (c) Esophageal Speech. Labels: Cervical Vertebrae, Glottis, Excised Part, Neo-glottis, Tracheal Stoma, Activator (Lungs), Neo-activator.]

Fig. 1. Esophageal Speech (schematic)

The output programming for such a speaker shows many interesting modifications. First, air must be conveyed to the activator, the upper esophagus. Notice that perhaps the most efficient way of achieving this--by using the tongue as a pump, rather than by simply swallowing air--allows him only some 5 cc (a teaspoonful) per stroke of the pump, normally enough for about five syllables of speech. This is a serious constraint on the speaker's performance, requiring short phrases, together with overlength pauses to mark larger constituents.

The alaryngeal speaker breathes through a hole (a stoma) in his neck, using the (normal) intercostal muscles; but the desire to eliminate breathing-noises at the (tracheal) stoma during speech may lead him to de-synchronise his pulmonary and phonic breathing cycles.6 He must then suspend the delicate balance of the muscles (Ladefoged, 1962) which in the normal speaker guarantee a smooth volume-velocity flow across the glottis despite steadily falling lung volume. For speech, too, he brings thoracic pressure to bear on the neo- (esophageal) activator, in conjunction with the activity of the muscles controlling the junction between the esophagus and the hypo-pharynx7--muscles over which we normally exert no direct control--for these now constitute his glottis (called the neo-glottis).

Last but not least, he must acquire control over the exact position, length and tension of the neo-glottis, so that an aerodynamic vibration will obtain over a reasonable range of frequencies (his new pitch-range). For alaryngeal speech, however, intensity and pitch are barely independent parameters; higher pitch will usually result in greater loudness, and greater intensity is hardly achieved without a concomitant rise in pitch.
But our perception of pitch differences seems to be less acute at the low range of fundamentals he is using (which is about an octave lower than his earlier voice), at least when listening to alaryngeal speech.9 He must thus learn to get the neo-glottis as sharp in cross section as possible, in order to raise his pitch into the range "normal" for his speech-community, balancing this against the fact that the higher pitch requires greater subglottal pressure and consequently entails quicker loss of air.

A good esophageal speaker can utter over twenty syllables within the span of one (apparent) breath; for he injects small amounts of air at frequent intervals, quite noiselessly and unnoticeably. His pitch range may be as much as 3/4 of an octave, so that he can sing a little, although the song is of course badly chopped up. Nevertheless his speech can be perfectly comprehensible, with good intonation, and the listener adjusts almost immediately to the pitch perturbations and high noise content of the neo-glottis tone.

But even when the activating and neo-glottal mechanisms are described, there is more to account for. It is fairly clear that articulation does not proceed in the usual manner after laryngectomy. There is, for example, adaptation in the segment-timing program--and we note that the contact time for labial and dental stops is shorter for the alaryngeal speaker. Diedrich-Youngstrom suggest that in normal speech, "the tongue and lips wait for the driving pressures to be supplied by respiratory musculature," whereas "in esophageal speech, these pressures (intraoral-pharyngeal) must be supplied by the tongue and lips." On the other hand, these short times may reflect changed articulation speeds connected with pressure economy (but compare Kozhevnikov-Chistovich (1965)).10
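The air-supply and pitch figures quoted for the esophageal speaker lend themselves to a rough numerical check. The sketch below is illustrative only: the function names and the 120 Hz pre-operative fundamental are assumptions introduced here, not values from the sources cited. The 5 cc per stroke, five syllables per stroke, one-octave drop and 3/4-octave range are the figures given in the text.

```python
import math

CC_PER_STROKE = 5          # from the text: about 5 cc per tongue-pump stroke
SYLLABLES_PER_STROKE = 5   # from the text: enough for about five syllables


def strokes_needed(syllables: int) -> int:
    """Number of air injections needed to cover a run of syllables."""
    return math.ceil(syllables / SYLLABLES_PER_STROKE)


def shift_by_octaves(f0_hz: float, octaves: float) -> float:
    """Shift a frequency by a (possibly fractional) number of octaves."""
    return f0_hz * 2.0 ** octaves


# Hypothetical pre-operative fundamental; NOT a figure from the paper.
pre_op_f0 = 120.0
low = shift_by_octaves(pre_op_f0, -1.0)   # "about an octave lower"
high = shift_by_octaves(low, 0.75)        # top of a 3/4-octave range

strokes = strokes_needed(20)              # a twenty-syllable phrase
air_cc = strokes * CC_PER_STROKE          # total air injected for that phrase

print(strokes, air_cc)
print(low, round(high, 1))
```

On these assumptions a twenty-syllable phrase implies at least four injections, and a 3/4-octave range multiplies the lowest fundamental by a factor of about 1.68.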
Some sample closure times follow (Diedrich-Youngstrom, 1966),11 for the same subject pre- and post-operatively--taking only stop-initial open syllables:

                  pa    ta    sa    ma
pre-operative:    360   360   280   560   (milliseconds)
post-operative:   240   280   200   120

Notice that the largest proportional difference between the normal and alaryngeal contact times is found with the nasal (4:1, as against 3:2), for which we also characteristically find very short velo-pharyngeal port opening times in alaryngeal speech.
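The proportional differences can be recomputed directly from the closure times listed above. The following is a minimal sketch; the dictionary layout and the function name closure_ratios are illustrative choices made here, not part of the original study.

```python
# Closure times in milliseconds, copied from the figures given above.
pre_op = {"pa": 360, "ta": 360, "sa": 280, "ma": 560}
post_op = {"pa": 240, "ta": 280, "sa": 200, "ma": 120}


def closure_ratios(before: dict, after: dict) -> dict:
    """Pre-/post-operative closure-time ratio for each syllable."""
    return {syl: before[syl] / after[syl] for syl in before}


ratios = closure_ratios(pre_op, post_op)
for syl, ratio in ratios.items():
    print(f"{syl}: {ratio:.2f}")

# Syllable with the largest proportional shortening.
largest = max(ratios, key=ratios.get)
print("largest proportional change:", largest)
```

With the values as printed, pa gives exactly 3:2, while the nasal comes out at roughly 4.7:1, of the same order as the 4:1 cited.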