Acoust. Sci. & Tech. 37, 2 (2016) ©2016 The Acoustical Society of Japan

Principles of sound resonance in human whistling using physical models of human vocal tract
Toshiro Shigetomi and Mikio Mori
Graduate School of Engineering, University of Fukui, 3–9–1 Bunkyo, Fukui, 910–8507 Japan
(Received 16 June 2015, Accepted for publication 28 October 2015)
Keywords: Whistling, Human vocal tract modeling, Helmholtz resonance, Air-column resonance
PACS number: 43.64.Yp [doi:10.1250/ast.37.83]

1. Introduction
Whistling competitions have been held regularly in Japan in recent years, and some prize-winning whistlers have established music schools to teach musical whistling. However, no theoretical textbook on whistling has been written thus far, and the principles of human whistling sound production are relatively unknown. A good understanding of these principles would be very beneficial to both the trainer and the trainee.

Previously, Rayleigh reported that the whistling frequency is determined by the Helmholtz resonance frequency in the mouth cavity. He also noted that earlier theories linking the sound-production mechanism to the vibration of the lips were inaccurate [1]. Later, Wilson et al. reported that the human whistling resonant frequency is extremely close to the Helmholtz resonator frequency. Through physical-model-based experiments, they determined that the resonator can be excited by a flow through the smooth-edged orifices bounding the resonant cavity [2]. However, the length of the vocal tract in Wilson's model was 1.5–4.5 in (3.8–11.4 cm), which is shorter than the average length of an adult male vocal tract.

Therefore, the principles of whistling in terms of articulation have been reported to be based on the Helmholtz resonance. However, some whistlers can produce high-pitched sounds by blowing more forcefully without changing the capacity of the resonance chamber, which is similar to the production of a high-pitched sound by a wind instrument using air-column resonance.

In this work, we study the principles of human whistling sound production using physical models of a human vocal tract, obtained from an X-ray computed tomography (CT) image of the vocal tract during whistling. We demonstrate that the principles of resonance in human whistling involve not only the Helmholtz resonance but also an air-column resonance when the length of the vocal tract model is increased.

2. Human whistling model
A physical experiment was conducted using a human vocal tract model, specifically, the plate-type model proposed by Arai [3,4]. The model consisted of 10-mm-thick acrylic boards, each containing an open hole, and could imitate and produce the vocal tract forms corresponding to the five Japanese vowels through changes in the plate combination. Because the shape of the mouth at the time of whistling is considered to be close to the mouth structure required for the pronunciation of the vowel /u/, Mori et al. reported adjustments made to physical models to achieve this shape [5]. Following this approach, the vocal cord parts in the model used in this study were detached and various components (described below) were added. By employing simplified models, Mori et al. also demonstrated that the resonant frequency and mode in human whistling involve not only the Helmholtz resonance mode but also an air-column resonance mode [5].

3. Experimental apparatus
3.1. Human whistling model
Figure 1 shows the human whistling model, which comprises a physical model of the human vocal tract. As stated above, this was obtained from an X-ray CT image of the human vocal tract during whistling, the details of which have been given in a previous study [6]. The model is composed of 17 10-mm-thick acrylic board unit plates and one 5-mm-thick acrylic board plate. Figure 2 shows the internal diameter of the human whistling model. The plate-0 acrylic board has an open hole that is equivalent to the lips. While the human whistling sound source is, in general, turbulence created by the orifice formed by the mouth, the sound source in this model is turbulence created by the plate-0 orifice. The hole thickness and diameter are 5 and 4 mm, respectively. The plates numbered 1–17 represent the vocal tract. This model has a maximum length L of 6.7 in (17 cm), but the number of plates can be adjusted to give lower L values.

Fig. 1 Human whistling model comprising a physical model of the human vocal tract.

Fig. 2 Internal diameter of human whistling model. (Axes: internal diameter [mm] versus number of plates, 0–17.)

3.2. Experimental measurement equipment
A photograph of the experimental equipment is shown in Fig. 3. An air jet representing a breath was blown through the model from the side opposite the lips, and the fundamental frequency was measured whenever a whistling sound was produced. A microphone (MM-MC1, condenser microphone) was placed 5 cm from the lips, and the 2,048 central points of the steady-state portion of the output sound were multiplied by a Hanning window. The fundamental frequency was determined from the peak value of the fast Fourier transform (FFT) power spectrum. Because the amplitude and period of the human whistling sound fluctuate only to a minor extent, the sound is close to a pure tone (sine wave). The sampling frequency was 44.1 kHz and the number of quantization bits was 16.

Fig. 3 Photograph of experimental equipment.

3.3. Change in resonance following model shortening
In this study, the length of the model was shortened by removing plates and the resonance frequencies were measured. The first vocal tract model composition is shown in Fig. 2. From this composition, we removed a plate from the glottis section and changed L. Overall, L was varied from 17 to 3 cm by removing plates (from 17 to 3 plates of 1 cm thickness). The resonance was measured for each plate composition.

Figure 4 shows an example of the length variation of the vocal tract model from seven to six pieces by the removal of plate 7. In a physical sense, this corresponds to the movement of the constriction formed by raising the back of the tongue close to the palate. For a constriction-free case, the number of plates is 17. In the case of constriction, where part of the vocal tract acts as a resonance chamber, the number of plates is less than 17. A person blew into the model for each plate configuration. We performed this experiment three times in total and obtained the average and standard deviation of the resonant frequency.

Fig. 4 Sample vocal tract model length varied from seven to six pieces by removing plate 7.

3.4. Calculation of Helmholtz resonance and air-column resonance theoretical frequencies
The theoretical Helmholtz resonance frequency $F_H$ is expressed as

$$F_H = \frac{v}{2\pi}\sqrt{\frac{A}{V_0 L_{eq}}}, \tag{1}$$

where $v$ is the speed of sound in air, $A$ is the cross-sectional area of the neck, $V_0$ is the static volume of the cavity, and $L_{eq}$ is the equivalent length of the neck with end correction. We treated the model shape as being flanged. Thus, $L_{eq}$ was calculated as

$$L_{eq} = L_{neck} + 0.85\,D_{neck}. \tag{2}$$

Here, $L_{neck}$ and $D_{neck}$ are the actual length and diameter of the neck, respectively. In the calculation, the neck part is plate 0, while the cavity part is composed of plates 1–17.

The theoretical frequencies of air-column resonance are given by the transfer function of the vocal tract [7], and the vocal tract of the physical model is treated as a lossless acoustic tube that is closed at one side. The reflection coefficient $r_k$ between adjacent unit plates and the transfer function $V(z)$ of the vocal tract are

$$r_k = \frac{A_{k+1} - A_k}{A_{k+1} + A_k}, \tag{3}$$

$$V(z) = \frac{0.5\,(1 + r_G)\prod_{k=1}^{N}(1 + r_k)\,z^{-N/2}}{D(z)}, \tag{4}$$

$$D(z) = \begin{bmatrix} 1 & -r_G \end{bmatrix}
\begin{bmatrix} 1 & -r_1 \\ -r_1 z^{-a_1} & z^{-a_1} \end{bmatrix}
\begin{bmatrix} 1 & -r_2 \\ -r_2 z^{-a_2} & z^{-a_2} \end{bmatrix}
\cdots
\begin{bmatrix} 1 & -r_N \\ -r_N z^{-a_N} & z^{-a_N} \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \end{bmatrix}. \tag{5}$$
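As a rough illustration of how Eqs. (1)–(3) can be evaluated for a stack of plates, the following Python sketch may help. It is not the authors' code. The neck dimensions (5 mm thick, 4 mm diameter) and the 10 mm plate thickness follow Sect. 3.1, whereas the speed of sound (340 m/s), the list of plate hole diameters, and all function names are placeholder assumptions introduced only for this example; the actual diameters are those plotted in Fig. 2.

```python
import math


def helmholtz_frequency(neck_length, neck_diameter, cavity_volume, speed_of_sound=340.0):
    """Eqs. (1) and (2): Helmholtz frequency of a cavity with a flanged neck (SI units)."""
    neck_area = math.pi * (neck_diameter / 2.0) ** 2
    # Eq. (2): equivalent neck length including the end correction.
    equivalent_length = neck_length + 0.85 * neck_diameter
    # Eq. (1): F_H = v / (2*pi) * sqrt(A / (V0 * L_eq)).
    return (speed_of_sound / (2.0 * math.pi)) * math.sqrt(
        neck_area / (cavity_volume * equivalent_length)
    )


def stacked_plate_volume(hole_diameters, plate_thickness):
    """Cavity volume V0 as the sum of the cylindrical holes of the stacked plates."""
    return sum(math.pi * (d / 2.0) ** 2 * plate_thickness for d in hole_diameters)


def reflection_coefficients(areas):
    """Eq. (3): r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k) for each adjacent pair."""
    return [(a_next - a) / (a_next + a) for a, a_next in zip(areas, areas[1:])]


# Hypothetical hole diameters in metres (NOT the measured values of Fig. 2).
example_diameters = [0.020, 0.025, 0.030, 0.025, 0.020, 0.015, 0.010, 0.015]
plate_thickness = 0.010  # 10-mm unit plates (Sect. 3.1)

v0 = stacked_plate_volume(example_diameters, plate_thickness)
f_h = helmholtz_frequency(neck_length=0.005, neck_diameter=0.004, cavity_volume=v0)
areas = [math.pi * (d / 2.0) ** 2 for d in example_diameters]

print(f"V0 = {v0:.3e} m^3, F_H = {f_h:.1f} Hz")
print("r_k =", [round(r, 3) for r in reflection_coefficients(areas)])
```

Removing entries from the end of `example_diameters` mimics the plate removal of Sect. 3.3: the cavity volume decreases, so the value returned by `helmholtz_frequency` rises.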

In Eqs. (3)–(5), $A_k$ is the cross-sectional area of a given plate hole, $r_G$ is the reflection coefficient of the output side, and $a_k$ is the length of each plate. In the case of a one-side-closed tube, $r_G = 1$.

4. Experimental results
Figure 5 shows the average resonant frequency values obtained using the human whistling model employing physical models of the human vocal tract. The lower and upper lines are standard deviation error bars. We believe that the distribution in these plots indicates two categories, i.e., plates 17–9 and 8–3.

Figure 6 shows the measured results (FM) and calculated results (FH). FH gradually increases from 346 Hz, and when 3–8 plates are used, FM is close to FH.

Figure 7 shows the FM and the theoretical values of the air-column resonance frequency (FA). FA_F1 and FA_F2 are the first and second formant frequencies, respectively. In the 17–9-plate composition region, the measured frequencies were close to FA_F2. For the 8–3 plate configurations, the measured frequencies were close to FA_F1.

Figure 8 shows all the theoretical values of FA_F2 and FM. For the 8–3 plate configurations, FA_F2 diverges from FM.

Figure 9 shows FM, FA_F1, and FH. It is apparent that the calculated FH is closer to the measured FM than the calculated FA_F1.

Fig. 5 Mean and standard deviations of measured fundamental frequencies of human whistling model. (Axes: frequency [Hz] versus number of plates in model, 17–3.)

Fig. 6 Measured fundamental frequency (FM) and Helmholtz resonance frequency (FH). (Axes: frequency [Hz] versus number of plates in model, 17–3.)

Fig. 7 Measured fundamental frequency (FM) and first and second formant frequencies (FA_F1 and FA_F2, respectively). (Axes: frequency [Hz] versus number of plates in model, 17–3.)

Fig. 8 Measured fundamental frequency (FM) and second formant frequency (FA_F2). (Axes: frequency [Hz] versus number of plates in model, 17–3.)

Fig. 9 Measured fundamental frequency (FM), first formant frequency (FA_F1), and Helmholtz resonance frequency (FH). (Axes: frequency [Hz] versus number of plates in model, 17–3.)

5. Discussion
Figure 10 shows FM and the calculated values FA_F1, FA_F2, and FH. The two regions (marked with black ovals) in this figure are explained as follows. The results for the

compositions featuring 8–3 plates occupy a region referred to as Region I, i.e., the Helmholtz resonance region. For these lengths, our human whistling model and Wilson's whistling model behave similarly. For 17–9 plates, Region II is obtained, which is the air-column resonance region. Thus, both Helmholtz and air-column resonances are involved in human whistling sound production.

Fig. 10 Measured fundamental frequency (FM), first and second formant frequencies (FA_F1 and FA_F2, respectively), and Helmholtz resonance frequency (FH). (Axes: frequency [Hz] versus number of plates in model, 17–3.)

6. Conclusions
We examined the principles of human whistling sound production using physical models of a human vocal tract. These models are based on an X-ray CT image of the vocal tract during whistling. We demonstrated that the principle of resonance in human whistling includes not only the Helmholtz resonance but also an air-column resonance when the length of the vocal tract model is increased. It was found that two different resonance regions appear. Region I, when the number of plates used is eight or less (3–8 cm), is the Helmholtz resonance region. Region II, for which the number of plates is greater than eight (9–17 cm), indicates the air-column resonance region. The findings of this study are expected to be useful for engineers because the principles of sound production in wind instruments (including human whistling) are not yet completely known.

References
[1] J. W. S. Rayleigh, The Theory of Sound, Vol. 2 (Dover, New York, 1945), pp. 223–224.
[2] T. A. Wilson, G. S. Beavers, M. A. Decoster, D. K. Holger and M. D. Regenfuss, "Experiments on the fluid mechanics of whistling," J. Acoust. Soc. Am., 50, 366–372 (1971).
[3] T. Arai, "The replication of Chiba and Kajiyama's mechanical models of the human vocal cavity," Phonet. Soc. Jpn., 5, 31–38 (2001).
[4] T. Arai, "Education system in acoustics of speech production using physical models of the human vocal tract," Acoust. Sci. & Tech., 28, 190–201 (2007).
[5] M. Mori, Y. Satomi, M. Ogihara, S. Taniguchi and C. Araki, "Principle of sound production in human whistling studied using physical models of the human vocal tract," ICEE2012, MI-6, 128–132 (2012).
[6] M. Mori and N. Shigekawa, "Reconsideration on the principle of sound production in human whistling using the vocal tract model: Formant and the vocal tract model based on the X-ray CT data," IEEJ Trans. A, 133, 674–675 (2013) (in Japanese).
[7] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals (Prentice-Hall, Englewood Cliffs, N.J., 1978).
