Evolution of Vowel Production Studies and Observation Techniques
Acoust. Sci. & Tech. 23, 4 (2002)
TUTORIAL
Kiyoshi Honda
ATR Human Information Science Laboratories, 2–2–2 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619–0288 Japan
Keywords: Acoustics, Vowel theories, Observation method, X-ray imaging, Magnetic resonance imaging
PACS number: 43.70.Jt

1. INTRODUCTION

Speech production studies have evolved along with technical advances in observing speech production processes. Therefore, the history of speech production studies can be understood to be the history of the development of observation techniques. As a rule, a new technique supports a discovery of new evidence, and it is evident that some observation techniques have made great contributions to speech research. However, the fact that the speech organs are mostly hidden within the body has been a critical barrier to conducting speech research. This review article, focusing on the special issue of the vowel, summarizes the progress of vowel production studies from the 18th century to the present day, by documenting several previous efforts to overcome methodological difficulties.

2. A SHORT HISTORY OF VOWEL PRODUCTION STUDIES

How far the origin of speech production studies is thought to go back may vary depending on the investigators or the research fields. According to Judson and Weaver [1], who cite an ancient medical text from China in B.C. as the oldest known evidence, experimental speech research began in the 18th century and analytical speech research started in the 19th century. The following two paragraphs are a short summary of speech research conducted in the 18th and 19th centuries.

In the 18th century, Dodart, in 1700–1707, originated the puff theory of vowels and found that the pitch of the voice depends on the tension of the vocal folds. Ferrein, in 1741, originated the partials theory of vowels and studied the larynges of cadavers and of animals. Donders, in 1780, discovered that the resonance cavities of the speech mechanism are tuned to different pitches for different vowels. And in 1791, von Kempelen [2] invented a mechanical speaking machine based on the studies of the human speech mechanism.

In the 19th century, theoretical investigations of the vowel were carried out, as epitomized by the cavity tone theory of Willis and the overtone theory of Wheatstone. (The terms cavity tone and overtone are not frequently used today. The former corresponds to a resonance of a closed-end tube and the latter indicates combined harmonic and non-harmonic partials of complex sounds.) Also, experimental studies made advances, as seen in the work of J. Müller [3], who experimented with a model of the vocal organs and proposed the source-filter theory in 1848. Helmholtz [4], a successor to Müller, conducted studies of sound including making a mechanical model of the larynx and described vowel qualities. He also performed vowel synthesis using tuning forks and a series of Helmholtz resonators. In addition, a few mechanical sound-wave recording devices were invented, such as the phonautograph and kymograph. Also in the 19th century, many instrumental studies of the larynx were carried out by phoneticians and physicists. In 1855, Garcia used a laryngeal mirror to observe the vocal folds during phonation, and in 1878, Oertel applied a stroboscope to visualize vocal fold vibration. Ewald [5] in 1898 proposed a conceptual model of the vocal folds that is roughly equivalent to the one-mass model.

During the florescence of speech research in the 19th century, a dispute on vowel theories arose. It is curious to find that this dispute has almost been forgotten in the present day. The early vowel theories of Willis and Wheatstone were succeeded by many other theories with various names, and they were developed into schools of thought involving many arguments. Fletcher [6] summarized these theories into harmonic and inharmonic theories, as shown in Table 1. The harmonic theory is the one supported by Wheatstone and Helmholtz, and is also known as overtone theory, resonance theory, or relative-pitch theory. The inharmonic theory was claimed by Willis, Hermann, and Scripture, and appeared under the names of cavity tone theory, transient theory, or fixed-pitch theory. Despite the view of Helmholtz, who sought to unify the two groups of theories, Fletcher noted that the parameters for speech coding must depend on which of the two theories is realistic. The dispute was longstanding and even persisted in the beginning of the 20th century. This was partly because a new discussion was instigated regarding the applicability of Fourier analysis to the spectral analysis of vowels [7].

Table 1 Vowel theories summarized by Fletcher [6].
Harmonic Theory: "The vocal cords generate a complex wave having a fundamental and a large number of harmonics. The component frequencies are all exact multiples of the fundamental. ... when these waves pass through the throat, the mouth, and the nasal cavities those frequencies near the resonant frequencies of these cavities are radiated into the air very much magnified, ... These reinforced frequency regions determine the vowel quality."
Inharmonic Theory: "The vocal cords act only as an agent for exciting the transient frequencies which are characteristic of the vocal cavities. A puff of air from the glottis sets the air in these cavities into vibration. This vibration soon diminishes until it is started anew by a second puff. ... the puffs do not necessarily follow each other periodically."

In the 20th century, development of various recording and imaging techniques contributed to rapid advances in speech research. Mechanical sound recording techniques used during this time included Miller's Phonodeik, Edison's phonograph, and Berliner's gramophone. Electric recording techniques made a great contribution to speech analysis as well as acoustic studies in general, and they employed condenser or dynamic microphone, vacuum tube amplifier, oscillograph, sound film, and electrical gramophone recording. Mechanical and electric harmonic analyzers were invented for use in on-line sound analysis methods, and Fourier's harmonic analysis was commonly used for off-line spectral analysis of oscillograph pictures. The development of photographic techniques also contributed to speech research. Stroboscopic photography and cinematography were used for the study of the vocal folds [8,9], and the real-time recording of vocal fold vibration was performed at Bell Laboratories using high-speed cinematography [10].

The discovery of the X-ray by Roentgen in 1895 opened up new areas of research based on imaging techniques. The X-ray was first used in speech research in 1904 by Moeller and Fischer [11], who observed the positions of laryngeal cartilages for different pitch levels. The first X-ray application to vowel study was made by Meyer [12] in 1910. The X-ray observations of the tongue position during vowel production evoked a debate on the vowel chart. At the time, vowels were described by various types of charts that indicated a triangle or quadrilateral distribution of tongue positions. Jones conducted X-ray photography for four cardinal vowels in 1919 and, in 1932, he presented a schematic figure of the tongue positions for the cardinal vowels [13]. He used the highest tongue points to replicate the classic chart of the vowel triangle, as previously seen in Lloyd [14]. Russell [15,16], basing his conclusions on extensive X-ray images of vowels, claimed that the tongue positions for vowels did not follow the cardinal vowel chart at all. In Japan, Chiba [17] conducted an X-ray study of vowels and concluded that the highest points of the tongue for vowels did not form a triangle in Japanese or in English. The vowel chart debate stimulated further X-ray observations of vowels (see Zemlin [18]), which dealt mostly with observed tongue positions during vowel production.

Table 2 Dispute on the vowel chart.
Triangle Account
Lloyd (1890–1891) [14]: A classical triangular vowel chart based on speculation and phonetic knowledge.
Jones (1932) [13]: Quadrilateral distribution of the highest points of the tongue for English cardinal vowels.
Non-triangle Account
Russell (1928) [15]: Varied tongue positions in lateral X-ray depending on speakers and speaking style.
Chiba (1931) [17]: Non-triangle distribution of tongue positions for Japanese and English vowels.

3. THE VOWEL STUDY BY CHIBA AND KAJIYAMA

As noted above, the history of vowel production studies experienced two debates: one regarding vowel theories in the 19th century and the other regarding the vowel chart in the early 20th century. There may be no doubt from a retrospective historical perspective that a study by two Japanese researchers provided the final conclusion to both disputes and further directed the flow of speech research toward modern analytical science [19]. Chiba and Kajiyama [20] employed state-of-the-art instruments of the time and arrived at two definitive conclusions: the acoustic nature of vowels is determined by vocal tract shape, and vowel spectra can be calculated from electrical analogues of vocal tract area function. They further endeavored to solve the problem of vowel normalization by developing a space-pattern account of vowel perception in opposition to [...]

[...] described.

4.1. Cineradiography

Speech research after the end of World War II began with electric analog speech synthesis and was followed by speech production studies using cineradiography [26,27].
The dynamic characteristics of speech movements were studied by Perkell [28], Kent [29], and Wood [30]. Frame-by-frame analysis of the trajectories of metal markers attached to the lips, jaw, and tongue revealed the real activity of articulation, and the films served as good material for students to learn anatomy and physiology. However, the X-ray dosage was significant and this technique therefore gradually fell into disuse.
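
To give a rough idea of what frame-by-frame marker analysis involves, the following minimal Python sketch computes the speed of a single marker between consecutive film frames and the total length of its path. It is an illustration only and does not reproduce the procedure of any of the studies cited above. The marker coordinates, the frame rate, and the function names marker_speeds and path_length are hypothetical.

```python
from math import hypot

def marker_speeds(positions, frame_rate):
    """Speed of one marker between consecutive frames.

    positions  : list of (x, y) coordinates, one per film frame
    frame_rate : frames per second of the film (assumed constant)
    Returns a list with one speed value per frame-to-frame step,
    in coordinate units per second.
    """
    speeds = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        step = hypot(x1 - x0, y1 - y0)    # distance moved in one frame
        speeds.append(step * frame_rate)  # distance per second
    return speeds

def path_length(positions):
    """Total distance travelled by the marker over all frames."""
    return sum(
        hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    )

# Hypothetical tongue-marker coordinates in mm, one pair per frame.
tongue_marker = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (6.0, 8.0)]
FRAME_RATE = 50.0  # frames per second; an assumed value

speeds = marker_speeds(tongue_marker, FRAME_RATE)
total = path_length(tongue_marker)
print(speeds, total)
```

In this sketch the marker moves 5 mm in each of the first two frame intervals and is stationary in the third, so a stretch of near-zero speed would correspond to an articulatory hold.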