
Dept. for Speech, Music and Hearing Quarterly Progress and Status Report

Perception of singing

Sundberg, J.

journal: STL-QPSR volume: 20 number: 1 year: 1979 pages: 001-048

http://www.speech.kth.se/qpsr

STL-QPSR 1/1979

I. MUSICAL ACOUSTICS

A. PERCEPTION OF SINGING

J. Sundberg

This article reviews investigations of the singing voice which are of interest from a perceptual point of view. Acoustic function, formant frequencies, pitch, and phrasing in singing are discussed. It is found that singing differs from speech in a highly adequate manner, e.g. for the purpose of increasing the loudness of the voice at a minimum cost as regards vocal effort. The terminology used for different types of voice seems to depend on the properties and use of the voice organ rather than on specific acoustic signal characteristics.

partials decrease monotonically with rising frequency. As a rule of thumb, a given partial is 12 dB stronger than a partial located one octave higher. However, for high degrees of vocal effort this source spectrum slopes less than 12 dB/octave. With soft phonation it is steeper than 12 dB/octave. On the other hand, the voice source spectrum slope is generally not dependent on which voiced sound is produced.

The spectral differences between various voiced sounds arise when the sound from the voice source is transferred through the vocal tract, i.e. from the vocal folds to the lip opening. The reason for this is that the ability of the vocal tract to transfer sound is highly dependent on the frequency of the sound being transferred. This ability culminates at certain frequencies, called the formant frequencies. As a consequence of this, those voice source partials which lie closest to the formant frequencies are radiated from the lip opening with greater amplitudes than the other, neighboring partials. Hence, the formant frequencies are manifested as peaks in the spectrum of the radiated sound.
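The rule of thumb for the source spectrum can be illustrated numerically. The following Python sketch is an illustration only and not part of the original study: it assumes an idealized source spectrum falling by exactly 12 dB per octave, and the function names as well as the example values (a 110 Hz fundamental and a 500 Hz formant) are hypothetical.

```python
import math


def source_partial_level_db(n, slope_db_per_octave=-12.0):
    """Level of the n-th partial relative to the fundamental (n = 1), in dB.

    Assumes an idealized source spectrum with a constant slope per octave.
    """
    return slope_db_per_octave * math.log2(n)


def closest_partial(f0_hz, formant_hz):
    """Number of the harmonic partial lying closest to a formant frequency."""
    return max(1, round(formant_hz / f0_hz))


# Hypothetical example: partials 1, 2, 4 and 8 are each one octave apart.
for n in (1, 2, 4, 8):
    print(n, source_partial_level_db(n))

# The partial nearest to an assumed 500 Hz formant for a 110 Hz fundamental.
print(closest_partial(110.0, 500.0))
```

Passing a less negative slope (for loud phonation) or a more negative one (for soft phonation) as `slope_db_per_octave` mimics the dependence on vocal effort described above.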

The formant frequencies vary within rather wide limits in response to changes in the position of the articulators, such as the lips, tongue body, tongue tip, lower jaw, and velum. We can change the two lowest formant frequencies two octaves or more by changing the position of the articulators. The frequencies of these two formants determine the identity of most vowels. The higher formant frequencies cannot be varied as much. They seem to be more relevant as personal voice characteristics. Thus, properties of vowel sounds which are of great importance to the vowel identity can be described in a chart showing the frequencies of the two lowest formants, as is done in Fig. I-A-1. Note that each vowel is represented by a small area rather than by a point in the chart. In other words, these formant frequencies may vary within certain limits without changing the identity of the vowel. This reflects the fact that a given vowel is normally observed to possess higher formant frequencies in a child or in a woman than in a male adult. The reason for such differences is differing vocal tract dimensions, as will be shown later.

In singing, more or less substantial deviations are observed from the vowel ranges shown in Fig. I-A-1. Indeed, a male singer may change the formant frequencies so much that they enter the area of a different vowel. For instance, in the vowel [i]* as sung by a male opera singer the two lowest formant frequencies may be those of the vowel [y] according to Fig. I-A-1. And in female high-pitched opera singing the formant frequencies may be totally different from those found in normal speech. Still we tend to identify such vowels correctly. This shows that the frequencies of the two lowest formants do not determine the vowel identity entirely. Next we will see how and why these deviations from normal speech are made in singing.

III. Resonatory aspects

1. Formant frequencies

A soprano singer is required to sing at fundamental frequencies as high as 1000 or 1400 Hz. In normal female speech the fundamental frequency rarely exceeds ca 350 Hz. The normal value of the first (and in some vowels even the second) formant frequency is far below 1000 Hz, as can be seen in Fig. I-A-1. If the soprano were to use the same articulation in singing a high-pitched tone as in normal speech, the situation illustrated in the upper part of Fig. I-A-2 would occur. The lowest partial in the spectrum, i.e. the fundamental, would appear at a frequency far above that of the first formant. In other words, the vocal tract ability to transfer sound would culminate at a frequency where there is no sound to transfer. It seems that singers tend to avoid this situation. Instead they abandon the formant frequencies of normal speech and move the first formant frequency close to the fundamental. The main articulatory gesture used to achieve this tuning of the first formant is a change of the jaw opening, which is particularly effective for changing the first formant frequency (cf. Lindblom & Sundberg, 1971). This explains why female singers tend to change their jaw opening in a pitch-dependent manner rather than in a vowel-dependent manner, as in normal speech. The acoustic result of this is illustrated in the lower part of the same Fig. I-A-2: the amplitude of the fundamental and hence the sound power of the vowel increases considerably. Note that this gain in sound power results from a resonatory phenomenon. It is obtained [...]

* All letters appearing within [ ] are symbols in the International Phonetic Alphabet.

Fig. I-A-3 shows formant frequencies measured in a soprano singing various vowels at varying pitches (Sundberg 1975). As can be seen in the figure, the vowels maintain their formant frequencies of normal speech up to that pitch where the fundamental coincides with the first formant. Above that frequency the first formant is raised to a frequency close to the fundamental. If the jaw opening is changed, the main effect is observed in the first formant frequency, but the higher formant frequencies also change to some extent. This is illustrated in Fig. I-A-3: all formant frequencies change when the first formant starts to match the fundamental frequency.

Fig. I-A-1. Ranges of the two lowest formant frequencies for different vowels represented by their symbols in the international phonetic alphabet (IPA). Above, the scale of the first formant frequency is translated into musical notation.

Fig. I-A-2. Schematical illustration of the formant strategy in female singing at high pitches. In the upper case the singer has a small jaw opening. The first formant appears at a frequency far below the frequency of the lowest partial of the vowel spectrum. The result is a low amplitude of that partial. In the lower case the jaw opening is widened so that the first formant matches the frequency of the fundamental. The result is a considerable gain in amplitude of that partial. Reprinted from Sundberg (1977).

2. Sound intensity and masking

As was mentioned above, the amplitude of the fundamental increases when the first formant is tuned to that frequency. This results in a gain in overall sound pressure level (SPL). The magnitude of the gain can be seen in Fig. I-A-4, which shows the increase in SPL associated with the formant frequencies plotted in Fig. I-A-3. We can see that the pitch-dependent choice of formant frequencies results in an amplitude gain of almost 30 dB in extreme cases. This corresponds to a thousandfold increase of sound power. A perceptually important conclusion is that the female singer will gain in loudness to a corresponding extent.
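The correspondence between a gain of 30 dB and a thousandfold increase of sound power follows from the definition of the decibel. A minimal Python sketch of the conversion is given here for illustration; the function names are hypothetical and not taken from the original study.

```python
def db_to_power_ratio(gain_db):
    """Ratio of sound powers corresponding to a level difference in dB."""
    return 10.0 ** (gain_db / 10.0)


def db_to_pressure_ratio(gain_db):
    """Ratio of sound pressure amplitudes corresponding to a level difference in dB."""
    return 10.0 ** (gain_db / 20.0)


# A gain of 30 dB, as quoted in the text for extreme cases.
print(db_to_power_ratio(30.0))
print(db_to_pressure_ratio(30.0))
```

The power ratio grows by a factor of ten for every 10 dB, whereas the sound pressure amplitude grows by a factor of ten only for every 20 dB.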

The singers' need for exceptionally high degrees of loudness is of course a consequence of the fact that opera singers are generally accompanied by an orchestra. The average SPL of an orchestra playing loudly in a concert hall is on the order of magnitude of 90 to 100 dB. This is much more than we can expect from a human speaker. The masking effect which the orchestral sound will exert on a singer's voice is determined by the distribution of sound energy along the frequency scale. A long-time-average spectrum of orchestral music shows the average of this distribution. Such a spectrum is shown in Fig. I-A-5. It was obtained from the "Vorspiel" to the first act of Wagner's Meistersinger opera. The frequency scale is based on the Mel unit, which is preferable when masking and spectral pitch are considered (cf. Zwicker & Feldtkeller, 1967). The graph shows that the strongest spectral components are found in the region of 400-500 Hz. The average spectrum level falls off more steeply towards higher frequencies than towards lower frequencies (Sundberg, 1972a).

The masking effect of a noise with the spectrum shown in Fig. I-A-5 can be estimated if we use hearing theory (see Zwicker & Feldtkeller, 1967). Avoiding details we may say that the masking effect will culminate at those frequencies where the masking sound is loudest, and it will decrease as the amplitude of the masker decreases towards higher and lower frequencies. Thus, on the average, the masking effect of the sound of the orchestra will culminate at 400-500 Hz and decrease towards higher and lower frequencies.

What types of spectra does the human voice produce, then? From Fig. I-A-5 we can see that the long-time-average spectrum of normal speech is very similar to that of the orchestra. This suggests that the combination of the sound of an orchestra with that of the human voice during normal speech is probably the most unfortunate one possible. If the sound level of the orchestra is higher than that of the voice, the voice is likely to be completely masked. And, inversely, if the sound of the voice were stronger (which is very unlikely), the orchestra might be masked. From this we can conclude that the acoustic characteristics of the human voice as observed in normal speech are not very useful for solo parts when combined with the sound of an orchestra. Therefore, the acoustic characteristics of the voice have to be modified in such a way that both the singer's voice and the orchestral accompaniment can be loud and independently audible.

Let us now return to the case of female singing. The spectrum will be dominated by the fundamental if the first formant is tuned to the frequency of that partial. We would expect this to occur as soon as the fundamental frequency is higher than the normal frequency value of the first formant. This value is 300 to 1000 Hz, depending on the vowel (cf. Fig. I-A-1). From what was said above about masking we see that all vowels are likely to be masked by the orchestra as long as their first formant is below 500 Hz, approximately. This will be the case for all vowels except [ɑ, a, æ] sung at fundamental frequencies lower than about 500 Hz, which is close to the pitch B4. As soon as the fundamental frequency exceeds this value it will be strong and its frequency will be higher than the strongest partials of the accompaniment. Summarizing we can say that a female singer's voice can be expected to be masked by a strong orchestral accompaniment as soon as the vowel is not [ɑ, a, æ] and the pitch is below B4. This seems to agree with the general experience of female voices in opera singing. They are rarely difficult to hear when they sing at high pitches, even when the orchestral accompaniment is loud.

Fig. I-A-3. Formant frequencies in the vowels indicated (IPA symbols) measured in a professional soprano singer singing at different fundamental frequencies. The lines show schematically how the formant frequencies are changed with fundamental frequency for the vowels indicated by the circled symbols. Adapted from Sundberg (1978).

Fig. I-A-4. The overall sound level of the vowels indicated (IPA symbols), which would result at different fundamental frequencies if the formant frequencies were kept constant at the values observed for the fundamental frequency of 260 Hz in Fig. I-A-3. The arrows show how much these sound levels increase when the formant frequencies are changed with the fundamental frequency in the way indicated by the noncircled vowel symbols in Fig. I-A-3.

3. Vowel intelligibility

Above we have seen that female singers gain considerably in loudness by abandoning the formant frequencies typical of normal speech when they sing at high pitches. On the other hand, the formant frequencies are extremely important to vowel intelligibility. This poses the question of how vowel intelligibility is affected by the high pitches in female singing.

One of the first to study this problem was the phonetician Carl Stumpf (1926), although he probably was not aware of its acoustic background. Stumpf used one professional opera singer and two amateur singers.

Each singer sang various vowels at different pitches with their backs turned to a group of listeners, who tried to identify the vowels. The identifications were found to be better when the vowels were sung by the professional singer. These results are illustrated in Fig. I-A-6a. The percentage of correct identification dropped as far as 50% in several vowels sung at the top pitch of G#5. The identification was far better for most vowels when the vowel was preceded by a consonant, particularly [t]. This shows that vowels are much easier to identify when they contain some transitions. Incidentally, this seems to be a perceptual universal: changing stimuli are easier to process than quasi-stationary stimuli.

Stumpf's results relate to later findings reported by Scotto di Carlo (1972). She found that vowel intelligibility is different for different vowels sung at the same pitch. In a later study she also discovered that the transitions associated with consonants have considerably longer durations and involve wider initial frequency glides when a singer attempts to exaggerate articulation than when she sings normally (Scotto di Carlo, 1976).

The question of how the female singer's deviations from the formant frequencies of normal speech affect vowel intelligibility was studied by Sundberg (1977a). A set of six vowels were synthesized (with vibrato) at different fundamental frequencies ranging from 300 to 1000 Hz. The formant frequencies were kept constant in each of the vowels. The sounds were presented to a group of phonetically trained listeners who tried to identify the sounds as any one of twelve given vowels. The results are shown in Fig. I-A-6b. It can be seen in the figure that, on the average, vowel intelligibility decreases monotonically as pitch rises, although there are exceptions and minor variations. More important, though, is that the percentages of correct identification are much lower than those reported by Stumpf using non-synthetic vowels. A major difference between the synthetic vowels and the vowels used by Stumpf is that presumably the first formant was never lower than the fundamental in Stumpf's case. This being the case we may conclude that the pitch-dependent articulation in high-pitched female singing improves vowel intelligibility as compared with the case where the formant frequencies are kept constant regardless of the pitch.

Fig. I-A-5. Idealized long-time-average spectra showing the mean distribution of sound energy in the "Vorspiel" of act 1 in Wagner's Meistersinger opera (solid line) and in nor- mal speech (dashed line). The dotted line pertains to an opera singer singing with orchestra accompaniment. From Sundberg (1977b).

Fig. I-A-6a. Percentages of correct identification of vowels (IPA symbols) sung by a professional singer according to Stumpf (1926). The solid line represents the average. Note that intelligibility increased when the vowels were preceded by a [t]. Abscissa: fundamental frequency (Hz).

Fig. I-A-6b. Corresponding values obtained by Sundberg (1977a) in an experiment with synthesized vibrato vowels, each of which had the same formant frequencies regardless of the fundamental frequency. The solid line represents the average. Cf. also Fig. I-A-12. Abscissa: fundamental frequency (Hz).

An important point in this connection is the fact that a rise in pitch must be accompanied by a rise in formant frequencies if vowel quality is to be preserved. Slawson (1968) found that maximum similarity in vowel quality was obtained when the formant frequencies were increased by 10% on the average for a one-octave increase of the fundamental frequency. However, Slawson worked with speech-like sounds with a fundamental which never exceeded 270 Hz. In any case, our ears seem to expect a certain increase in the formant frequencies when the fundamental frequency is increased.

The difference in the percentage of correct identification between Stumpf's and Sundberg's investigations may not necessarily depend solely on a difference in the formant frequencies. Other differences between the synthetic and the real vowels may very well have contributed. As was just mentioned, the beginning and ending of a sound are probably very revealing, and presumably the vowels differed in this respect between the two investigations. Therefore, a direct comparison with well-defined synthetic stimuli is needed before we can draw safe conclusions as to whether or not the pitch-dependent choice of formant frequencies in high-pitched female singing really is a positive factor in vowel identification.
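Slawson's observation of a formant increase of about 10% per octave of fundamental frequency can be expressed as a simple scaling rule. The Python sketch below is an illustration under stated assumptions: it assumes that the 10% increase compounds geometrically over fractions and multiples of an octave, which the source does not specify, and the function name and the example formant values are hypothetical.

```python
import math


def scaled_formants(formants_hz, f0_old_hz, f0_new_hz, increase_per_octave=0.10):
    """Scale formant frequencies with a change in fundamental frequency.

    Assumption: the relative increase per octave compounds geometrically,
    so a shift of k octaves multiplies each formant by (1 + increase) ** k.
    """
    octaves = math.log2(f0_new_hz / f0_old_hz)
    factor = (1.0 + increase_per_octave) ** octaves
    return [f * factor for f in formants_hz]


# Hypothetical vowel with formants at 500 and 1500 Hz, raised one octave in pitch.
print(scaled_formants([500.0, 1500.0], 135.0, 270.0))
```

As noted above, Slawson's data concern fundamentals up to 270 Hz only, so applying such a rule to soprano pitches would be an extrapolation.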

B. Male singing

1. The "singer's formant"

The audibility problem appears a bit different for a male singer than for a female singer. The reason for this is the difference in the fundamental frequency ranges. In normal speech the male voice centers around 110 Hz, approximately, whereas the female voice is about one octave higher. The top pitch for a bass, a baritone, and a tenor is generally E4 (330 Hz), G4 (392 Hz), and C5 (523 Hz), respectively. Consulting once more Fig. I-A-1 we find that most vowels have a first formant frequency which is higher than these top fundamental frequencies, at least in the cases of bass and baritone voices. The case where the fundamental frequency is higher than the normal values of the first formant frequency will occur only in the upper part of the tenor and baritone ranges. Therefore, in male singing a pitch-dependent choice of the two lowest formant frequencies is not to be expected except in vowels with a low first formant frequency sung at high pitches by tenors and baritones. Measurements by Sundberg (1973) and Cleveland (1977) confirm this.
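The pitch names and frequencies quoted in this article can be cross-checked with a short computation. The Python sketch below is an illustration only; it assumes equal temperament with A4 = 440 Hz, which the source does not state explicitly, and the function and table names are hypothetical.

```python
# Semitone offsets of the pitch classes above C within one octave.
NOTE_OFFSETS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def note_to_hz(name, a4_hz=440.0):
    """Frequency of a note such as 'E4' or 'G#5' (equal temperament assumed)."""
    pitch_class, octave = name[:-1], int(name[-1])
    # Distance in semitones from A4, which lies 9 semitones above C4.
    semitones_from_a4 = NOTE_OFFSETS[pitch_class] - 9 + 12 * (octave - 4)
    return a4_hz * 2.0 ** (semitones_from_a4 / 12.0)


for note in ("E4", "G4", "C5"):
    print(note, round(note_to_hz(note)))
```

With these assumptions a frequency of 392 Hz corresponds to the pitch G4, whereas C4 lies near 262 Hz.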

The consequence of this seems to be that male singers produce spectra which on the average are similar to the average spectrum of the orchestral accompaniment (cf. Fig. I-A-5). Above we found that such a similarity in the spectrum leads to maximum masking. On the other hand, we know that male voices can be heard readily even when the orchestral accompaniment is loud.

If vowel spectra of normal speech are compared with those produced by male opera and concert singers, there is at least one difference which can be observed almost invariably. Sung vowels contain more sound energy than spoken vowels in the partials falling in the frequency region of 2.5-3 kHz, approximately. Thus the spectrum envelope exhibits a more or less prominent peak in the high frequency region. This peak is generally referred to as the "singer's formant", and it has been observed in most acoustic studies of male singing (see e.g. Bartholomew, 1934; Winckel, 1953; Rzevkin, 1956; Sundberg, 1974; Hollien, Keister & Hollien, 1978). Fig. I-A-7 provides a typical example.

The "singer's formant" has been studied from acoustical and perceptual points of view by Sundberg (1974). There are strong reasons to assume that the "singer's formant" is an acoustic consequence of a clustering of the third, fourth, and fifth formant frequencies. If formants approach each other in frequency, the ability of the vocal tract to transfer sound increases in the corresponding frequency region. Hence, the "singer's formant" seems to be primarily a resonatory phenomenon. Also, it does not seem to depend on one but rather on several formants.

Formant frequencies are determined by the dimensions of the vocal tract, i.e. by articulation. According to Sundberg (1974) an articulatory configuration which clusters the higher formants in such a way that a "singer's formant" is generated involves a wide pharynx, which appears to result from a lowering of the larynx. Such lowering of the larynx is typically observed in male singers. Thus, the "singer's formant" can be interpreted acoustically and articulatorily. It should be mentioned that other articulatory interpretations have also been suggested (Hollien et al., 1978).

2. Audibility

A different question is why male opera singers add a "singer's formant" to their voiced sounds in singing. Probably the reason is perceptual. In a sound illustration contained in Sundberg (1977b) it is demonstrated that a singer's voice is much easier to discern against the background of a noise with the same average spectrum as the sound of an orchestra when the voice has a prominent "singer's formant". This effect is certainly associated with masking. The average spectrum of an orchestra culminates around 400-500 Hz and then decreases towards higher frequencies (see Fig. I-A-5). The mean spectral level at 2.5-3 kHz is about 20 dB below the level at 400-500 Hz. It seems to be an extremely good idea to enhance the spectrum partials in this frequency range. These partials are likely to be perceived without difficulty by the audience, because the competition from the orchestra's partials is moderate at these high frequencies.

articulatory background of these modifications probably is the lowering of the larynx and the widening of the pharynx required for the generation of the "singer's formant". These articulatory characteristics affect not only the third and higher formant frequencies but also the two lowest formant frequencies, which are critical to vowel quality, as mentioned. Sundberg (1970) measured formant frequencies in vowels sung by four singers and compared these frequencies with those reported by Fant (1973) for non-singers (cf. Fig. I-A-8). The figure shows considerable differences. For instance, the second formant does not reach as high in frequency in sung vowels as in spoken vowels. This is the acoustic consequence of a wide pharynx and a low larynx. As a result some vowels do in fact assume formant frequencies typical of a different vowel in singing. This poses the same question as in the case of female singing: can we really identify the sung vowels correctly?

Unfortunately there is no formal evidence available to supply an answer to that question. On the other hand, the vowel quality differences between spoken and sung vowels are well known, at least to singers and singing teachers. Many voice teachers instruct their students to modify an [i] towards a [y], an [e] towards an [œ], an [a] towards an [ɑ], etc. (see e.g. Appelman, 1967). It is considered important that a vowel should not be replaced by but only modified towards another vowel. This must mean that the sung vowels do retain their vowel identity, although the two lowest formant frequencies are clearly "wrong". It is likely that the modification of the two lowest formant frequencies can be compensated for by the higher formants, at least in part.

In summary we can say that the departures from the formant frequencies typical of normal speech lead to a modification of vowel quality. This modification is probably not sufficiently great to shift the vowel identity. Part of the reason for this might be that the "singer's formant" compensates for the effect of the changes in the two lowest formant frequencies. It is also likely that transitions associated with e.g. consonants contribute to produce the same effect.

Before leaving this subject, reference should be made to a study by Simon, Lips & Brock (1972). It concerns spectra of a vowel sung with differing [...] by a professional singer. These measurements show how properties of the spectrum vary when the singer mimics types of singing with different labels such as "Knödel" etc. It seems that formant frequencies explain a good deal of these differences.

Fig. I-A-7. Spectrum contours (envelopes) of the vowel [u] spoken (dashed curve) and sung (solid curve) by a professional opera singer. The amplitudes of the harmonics between 2 and 3 kHz give a marked peak in singing as compared with speech. This peak is called the "singer's formant". It is typical for all voiced sounds in male professional opera singing. Adapted from Sundberg (1978a).

Fig. I-A-8. Average formant frequencies in different vowels as produced by non-singers (dashed curves) according to Fant (1973) and four male singers (solid curve) according to Sundberg (1970). Note that the fourth formant (F4) in non-singers is slightly higher in frequency than the fifth formant (F5) in singing. Reprinted from Sundberg (1974).

1. Bass, baritone, and tenor timbre

As we all know, singing voices are classified in terms of soprano, alto, tenor, baritone, bass. The main criterion for such classification is the pitch range available to the singer. If a singer's range is C3 to C5 (131-523 Hz) his classification is tenor. Pitch ranges of different voice classifications overlap to some extent. In fact, the range C4 to E4 (262-330 Hz) is common to all voices. Still we rarely have any difficulties in hearing if a tone in this range is sung by a male or a female singer, and often we can even judge the voice classification correctly.

Cleveland (1977) studied the acoustic background of this discrimination ability in the case of male singing. He presented five vowels sung at four pitches by eight singers classified as basses, baritones, or tenors to singing teachers who were asked to decide on the voice classification. The normal beginnings and endings of the tones were spliced out. The results revealed that the major acoustic cue in voice classification is the fundamental frequency. Incidentally, the same result was found by Coleman (1976) in a study of maleness and femaleness in voice timbre. The result is not very surprising if we assume that we rely mainly on the most apparent acoustic characteristic in a classification task. By comparing vowels sung at the same pitches Cleveland found that the formant frequencies serve as a secondary cue. The general trend was that the lower the formant frequencies, the lower the pitch range which the singer is assumed to possess. In other words, low formant frequencies seem to be associated with bass singers, and high formant frequencies with tenors. In a subsequent listening test Cleveland verified these results by presenting to the same singing teachers vowels synthesized with formant frequencies that were varied systematically in accordance with his results obtained with real vowel sounds.

Cleveland also speculated about the morphological background of these findings. As has been mentioned above several times, the formant frequencies are determined by the dimensions of the vocal tract. These dimensions are smaller in children and females than in male adults, and the formant frequencies differ accordingly. As a longer tube resonator has lower resonance frequencies than a shorter tube, the formant frequencies in a male are lower than those of a female for a given vowel. The female vocal tract is not a simple small-scale copy of the male vocal tract (Fant 1973). The pharynx-to-mouth length ratio is smaller in females than in males. The acoustic consequence is that certain formant frequencies in certain vowels exhibit greater differences between sexes than others, as can be seen in Fig. I-A-9 (see also Nordström, 1977). The greatest variations are found in the two lowest formant frequencies. In the same figure are shown the corresponding values which Cleveland found when he compared a tenor voice with a bass voice. The trends appear similar in the tenor/bass case and in the female/male case. This finding should be corroborated by X-ray measurements on a number of singers of differing voice classification. As yet, we can only hypothesize that tenors tend to have smaller pharynx-to-mouth ratios than basses.

In summary, experimental support has been found for the following conclusions. In voice classification the fundamental frequency seems to be the main acoustic cue. However, the formant frequencies typically differ between bass, baritone, and tenor voices. This difference, which probably reflects differences in the pharynx-to-mouth length ratios, serves as a secondary cue in voice classification.

2. Alto and tenor timbre

Generally there is a clear difference in voice timbre between alto and tenor voices. As their pitch ranges overlap to a great extent, the fundamental frequency cannot always explain this difference. We have seen that tenor and bass voices differ with respect to the formant frequencies in a way similar to that in which female and male voices differ. This suggests that with respect to formant frequencies a tenor voice is more similar to a female voice than a bass voice is. What are the acoustic differences which account for the timbral differences between alto and tenor voices, then? Ågren & Sundberg (1978) compared two alto and two tenor voices singing the same vowels at the same pitches. Although the number of subjects does not allow general conclusions, and in spite of the fact that no perceptual evaluation of the results was attempted, the results of their investigation have perceptual relevance.

Only the fourth formant frequency showed a consistent difference which could account for the considerable difference in timbre between the two voice types. This formant was observed to have higher frequency in the alto voices than in the tenor voices. This means that the frequency distance between the third and fourth formants was smaller in the tenor voices. There was also a clear difference in the source spectrum: the amplitude of the fundamental was higher in the alto voices. As we shall see later


to roughness. Let us next consider a harmonic spectrum with a higher fundamental frequency. In this case all of the lowest five partials excite different critical bands, and hence, they cannot give rise to roughness. Roughness can occur in such spectra only if one or more pairs of higher partials have high and reasonably equal amplitude.

Let us now return to the alto/tenor case. In the pitch region of relevance, only pairs of partials above the fourth partial can give rise to roughness. If we take into account the fundamental frequency ranges of alto and tenor voices, this leads us to consider partials in the vicinity of the third formant, which is generally located around 2500 Hz. If the frequency distance between the third and fourth formant is of the same order of magnitude as the fundamental frequency, it is likely that these formants will enhance two adjacent partials and thus give rise to roughness. In the Ågren & Sundberg study (1978) the mean frequency distance between these formants was found to be 785 Hz (SD = 222 Hz) in the case of the altos and 439 Hz (SD = 189 Hz) in the case of the tenors. Thus, we find that this distance is of the same order of magnitude as the frequency separation between the partials only in the case of the tenor voices. Therefore we would expect roughness from a tenor voice but not from an alto voice. It seems reasonably safe to conclude that alto and tenor voices differ with respect to roughness because of the difference in the frequency distance between the third and fourth formants.
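The reasoning about adjacent partials can be put in numbers. The following Python sketch is only an illustration under simplifying assumptions: the third formant is placed at 2500 Hz as in the text, the fourth formant is placed above it by the mean distances quoted from the Ågren & Sundberg study, and each formant is taken to enhance the single partial lying closest to it. The function names and the fundamental frequency of 400 Hz are assumptions made for this example and do not come from the study.

```python
def enhanced_partials(f0, f3, f4):
    """Harmonic numbers of the partials lying closest to the third and fourth formant."""
    n3 = max(1, round(f3 / f0))
    n4 = max(1, round(f4 / f0))
    return n3, n4


def adjacent_partials_enhanced(f0, f3, f4):
    """True if the two formants enhance two neighbouring partials (a roughness candidate)."""
    n3, n4 = enhanced_partials(f0, f3, f4)
    return abs(n4 - n3) == 1


F0 = 400.0   # assumed fundamental frequency in Hz, within both pitch ranges
F3 = 2500.0  # third formant, as stated in the text

for voice, distance in (("alto", 785.0), ("tenor", 439.0)):
    n3, n4 = enhanced_partials(F0, F3, F3 + distance)
    rough = adjacent_partials_enhanced(F0, F3, F3 + distance)
    print(f"{voice}: partials {n3} and {n4} enhanced, adjacent: {rough}")
```

With these assumed values only the tenor distance places the two formants on neighbouring partials, which is the situation the text associates with roughness. A fuller treatment would also have to compare the partial spacing with the critical bandwidth in this frequency region.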

IV. Phonation

To this point we have focused primarily upon resonatory phenomena, i.e., on the characteristics associated with formant frequencies. In the present section some aspects of phonation will be presented, i.e., the behavior of the vibrating vocal folds and the acoustic properties of the resulting voice source.

A. Vocal effort and pitch

The voice source characteristics change with vocal effort and with the pitch as related to the pitch range of the individual voice. In normal speech the amplitudes of the higher overtones increase at a faster rate than the amplitude of the fundamental when vocal effort is increased while the reverse is true when pitch is raised (cf. e.g. Fant, 1960). Sundberg (1973) studied the voice source in two professional singers and found the amplitudes of the overtones above 1 kHz to increase at a faster rate not only when the vocal effort was increased but also when pitch was raised.

In a later study Sundberg & Gauffin (1978) measured both waveform and spectrum of the voice source in singers. They used an inverse filter technique ad modum Rothenberg (1972) which allowed them to study the partials up to 1.5 kHz, approximately. The results showed that in this low frequency part of the source spectrum the amplitude relationship between the fundamental and the overtones changed with pitch rather than with vocal effort. When pitch was raised the amplitudes of the overtones increased more than the amplitude of the fundamental. When vocal effort was increased the amplitude of the fundamental was observed to increase at approximately the same rate as the SPL. As the SPL is mainly determined by the amplitude of the partial underlying the first formant, which was an overtone, this indicates that the amplitude of the fundamental increased at about the same rate as the amplitudes of the overtones. However, the amplitudes of the source spectrum partials above 1.5 kHz would increase more rapidly than the amplitudes of the lowest source spectrum partials when vocal effort is increased. This can be inferred from the observations by Sundberg (1973) and Hollien & al (1978), that the amplitude of the "singer's formant" increases faster than the SPL when vocal effort is raised. In addition to these findings Sundberg & Gauffin (1978) also found that "pressed" phonation is characterized by strong overtones as compared with the fundamental.


The above findings may explain why Ågren & Sundberg (1978) found a stronger fundamental in the alto voices than in the tenor voices. In that investigation all subjects sang the vowels at identical fundamental frequencies. Hence the tenors sang in the upper part of their pitch range while the altos sang in the lower part of their pitch range. A similar reasoning could be applied whenever vowel sounds of voices with differing pitch ranges are compared under conditions of identity with respect to fundamental frequency, as e.g. in Cleveland's investigation (1977).

It is likely that voice experts can hear if an individual phonates in the upper, middle, or lower part of his/her pitch range by listening to the voice timbre characteristics associated with the voice source spectrum.

Summarizing, the dominance of the source spectrum fundamental is promoted by low and medium pitch. At high pitch and in "pressed" phonation the dominance of the fundamental is decreased. The amplitudes of the overtones above 1.5 kHz increase more rapidly than the overall SPL when the vocal effort and the pitch are raised.

One phonatory aspect of singing which has been subject to a considerable amount of scientific effort is register (see Large, 1972). Unfortunately there is no good definition available. On the other hand, there is general agreement that a register is a series of adjacent tones on the scale which (a) sound equal in timbre and (b) are felt to be produced in a similar way. Also, it is generally agreed that register differences reflect differences in the mode of vibration of the vocal folds.

Several objections can be raised against this definition as it relies so heavily on subjective impressions. Nevertheless, lacking a definition which is based on physiological facts we accept it for the time being, and rely on our hermeneutic ability in trying to understand it. Thereby it is helpful to contrast two registers, namely the modal (normal) and the falsetto register of the male voice. These are two clear examples of different registers. In the female voice there are three main registers: chest, middle, and head. They cover the lowest, the middle, and the top part of the pitch range, respectively. However, many voice experts speak about modal and falsetto register both in male and female voices.

2. Female chest and middle register

Mostly together with different co-authors Large has published a series of investigations concerning the acoustic characteristics of different registers.

With respect to the physiological background, Large, Iwata & von Leden (1970) found that tones sung in chest register consume more air than tones sung in middle register. They conclude that the conversion of air stream to sound is more efficient in chest register.

Large & Shipp (1969) studied the influence of various parts of the spectrum on the possibilities to discriminate between chest and middle register. The material included the vowel [a] sung by twelve singers at the pitch E4 (330 Hz). The quality of the vowel (but obviously not its timbre) and the acoustic intensity were kept approximately constant by the singers. A test tape was made where the natural beginnings and endings of each tone were spliced out. The vowel sounds were presented with and without low-pass filtering at 1400 Hz to a jury of voice experts who were asked to classify them with respect to register. The results revealed that generally the registers were correctly identified when the vowels were unfiltered. When they were low-pass filtered identification of register became more difficult, but it never dropped as far as the level of mere guessing. The authors conclude that the higher spectrum partials only contribute to register differences.

In a later study Large (1974) returned to the same question in a similar experiment. The results agreed with those of the previous investigation, but this time Large also studied the spectrum of the vowels more closely. The experiment showed typical differences between the registers in the amplitudes of the lower spectrum partials. By and large the chest register vowels were found to possess stronger high partials than the middle register vowels. However, the differences were all very small. Large found that the results support the assumption that register differences reflect differences in the vocal fold vibrations.

Sundberg (1977c) studied the voice source and the formant frequency characteristics underlying the timbre differences between chest and middle register in one soprano singer. The subject sang the same vowel in both registers at the same pitches. The intensity was left to the subject to decide. The results revealed a considerable source spectrum difference in that the relative amplitude of the fundamental was more than 10 dB stronger in the middle register. This is much more than the small differences reported by Large (1974). Probably, the register difference was less pronounced in Large's subjects. Sundberg (1977c) also found formant frequency differences between the registers suggesting that the timbre differences between the registers may depend not only on voice source but also on articulatory differences. In order to test this hypothesis pairs of vowels were synthesized differing with respect to either formant frequencies or source spectrum. A group of singing teachers were asked to identify the registers in these pairs of vowel sounds. The results confirmed that both formant frequencies and source spectrum may contribute to register identification. Thus, part of the spectral differences reported in the above mentioned studies may be due to formant frequency differences. We will return to this question below.


A number of investigations of the differences between the modal and falsetto registers have been published. Although falsetto is rarely used in traditional Western singing, except, perhaps, in counter-tenor singing, the research in this field will be reviewed.

It has been shown that physiologically the vocal folds are longer, stiffer, and thinner in falsetto than in modal register. As a rule, the glottis is never completely closed in falsetto. This is in agreement with the finding of Large, Iwata & von Leden (1972) that falsetto tones consume more air than comparable tones sung in modal register. On the other hand, complete glottal closure may occur in falsetto (see Fig. 35, frame F on page 71 in Vennard, 1967) and inversely, incomplete glottal closure is sometimes observed in modal register phonation.

Part of the literature on falsetto and modal register focuses on the question whether or not listeners can identify the registers from sustained, isolated vowel sounds. Even though difficulties sometimes arise, particularly when the vowels are sung by professional singers, the answer to the question is generally found to be in the affirmative (see e.g. Lerman & Duffy, 1970). The dependence on the voice training of the subjects is not surprising since singers are generally trained to blend registers, i.e. to conceal timbral differences between registers. An experiment by Colton & Hollien (1973) allowed more detailed conclusions. They found vocal registers to be a multidimensional phenomenon: "Under normal conditions it is the combination of pitch, loudness, and quality that an observer utilizes to distinguish two vocal registers. When pitch and loudness are equalized, register discrimination becomes more difficult".

The study by Large & al (1972) used vowels which were recorded under conditions of equality in pitch and acoustic intensity. Under these conditions the falsetto was found to produce weaker high overtones than the modal register. This agrees with the observation made by the same authors that more air is consumed in falsetto singing; the conversion of air stream into sound is less efficient in falsetto than in modal register.

Again equalizing pitch and acoustic intensity, Russo & Large (1978) compared the two registers perceptually and acoustically. Twelve expert listeners judged the similarity in pairs of tones sung in the different registers. The pairs considered as most dissimilar in timbre differed mainly with respect to (a) the amplitudes of the higher spectrum partials, which were lower in falsetto, and (b) the amplitude of the fundamental which tended to be slightly greater in falsetto. Both these observations agree with spectral evidence collected from singers and non-singers which Colton had published earlier (1972).

The studies referred to above have dealt with the amplitudes of spectrum partials. As we have seen before, such amplitudes depend not only on the amplitudes that the partials have in the source spectrum but also on the frequency separation between the individual partial and the formant lying closest to that partial in frequency. Thus, the relationships between the amplitudes of specific partials and identification of registers can be expected to be totally different if a different pitch or a different vowel is chosen. Against this background it seems interesting to explore the voice source properties which characterize the registers.

Monsen & Engebretson (1977) studied the voice source in various types of phonation. In order to eliminate the formants the subjects phonated into a reflectionless tube. The resulting waveforms deviate from other recordings of the voice source obtained by different methods, probably because of a phase distortion in the system they used. Such distortion does not affect amplitudes of partials. Hence, their results as

Fig. I-A-9. Percentage differences in the first (left), second (middle), and third (right) formant frequency in the vowels indicated (horizontal axis: vowel, IPA symbols). Solid curves pertain to a tenor singer as compared with a bass singer, according to Cleveland (1977). Dashed curves show the average over six languages of female non-singers as compared with male non-singers according to Fant (1975). Adapted from Fant (1975).

Fig. I-A-10. Voice source characteristics in modal and falsetto register in three singers as determined by inverse filtering technique ad modum Rothenberg (1972). The left series of curves shows the waveform and the right series shows the corresponding spectrum boosted by 6 dB/oct. The ripple in the modal register waveforms is an artefact due to the particular inverse filter setup used, which could cancel the influence from the two lowest formants only. This caused the "singer's formant" to appear as a ripple. Note that the amplitude difference between the first and second partial is much greater in falsetto than in modal register.


the interval between two tones having the frequency ratio of $1:2^{1/1200}$.)

The physiological background of the vibrato is unclear. In electromyographic measurements on laryngeal muscles, pulsations in synchrony with the vibrato are generally observed (Vennard, Hirano, Ohala & Fritzell, 1970-71). Moreover, the subglottic pressure and the transglottal air flow often undulate in synchrony with the vibrato as can be seen in recordings published by Rubin, Le Cover & Vennard (1967). An observation which may prove relevant has been reported by Weait & Shea (1977) who studied the glottal behavior in a bassoon player. They found that the glottal area varied in synchrony with the vibrato. This can be interpreted as support for the hypothesis that the vibrato originates in the laryngeal muscles while undulations in air-flow and subglottic pressure are secondary effects.

Several aspects of the vibrato have been studied. As early as in the thirties Seashore (1938) summarized, among other things, a series of investigations which he and his co-workers had made on the vibrato. He found the vibrato rate to be fairly constant within a singer but slightly varying between singers. The mean over 29 singers was 6.6 undulations per second (extremes 7.8 and 5.9). The average extent was ± 48 cent (extremes ± 98 and ± 31).
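To give a feeling for what a vibrato extent expressed in cents means in terms of frequency, the following Python sketch converts between cents and frequency ratios, using the standard definition that 1200 cents make up one octave. The function names and the centre frequency of 440 Hz are assumptions chosen for illustration; only the extent of ± 48 cents is taken from the text.

```python
import math


def cents(f1, f2):
    """Interval from f1 up to f2 in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(f2 / f1)


def vibrato_extremes(f_center, extent_cents):
    """Lowest and highest frequency of a vibrato of +/- extent_cents around f_center."""
    ratio = 2.0 ** (extent_cents / 1200.0)
    return f_center / ratio, f_center * ratio


# Assumed example: a tone centred on 440 Hz with the average extent of +/- 48 cents.
low, high = vibrato_extremes(440.0, 48.0)
print(f"F0 swings between {low:.1f} Hz and {high:.1f} Hz")
print(f"total swing: {cents(low, high):.1f} cents")
```

Because the cent is a logarithmic unit, the swing is symmetric around the centre frequency as a ratio, not as a difference in Hz.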

B. Perceptual aspects

1. Vowel intelligibility

As mentioned before the identification of vowels is assumed to be related to the detection of peaks in the spectrum envelope. These peaks signal the frequencies of the formants and the formant frequencies characterize the vowel. If the number of partials is low compared to the number of formants, i.e. if the fundamental frequency is very high, the spectral envelope peaks signalling the formant frequencies would be impossible to detect, because there may not be a partial at every formant frequency. It is not unreasonable to assume that the vibrato plays a role here. If the frequency of a partial is slightly lower than that of a formant, an increase in fundamental frequency will raise the amplitude of that partial. If the partial is slightly higher in frequency than the formant, a decrease of the amplitude would result from the same situation, as is illustrated in Fig. I-A-11. Thus, the phase relationship between the undulations in frequency and amplitude in a vibrato tone actually gives information about the frequency locations of the formants. The question is then whether the ear can detect and use this information. If so, vibrato would facilitate vowel identification in high pitched vowels.
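The phase relationship described above can be illustrated with a small Python sketch. It is a simplification and not the procedure used in any of the studies reviewed: the formant is modelled as a single two-pole resonance, and the formant frequency of 1000 Hz, the bandwidth of 80 Hz, the shift of 30 cents and the function names are all assumptions made for this example.

```python
import math


def formant_gain(f, formant, bandwidth):
    """Magnitude response of a single two-pole resonance used as a simplified formant."""
    return formant ** 2 / math.sqrt((formant ** 2 - f ** 2) ** 2 + (bandwidth * f) ** 2)


def amplitude_trend(partial_hz, formant, bandwidth=80.0, shift_cents=30.0):
    """Return +1 if raising the partial by shift_cents raises its amplitude, else -1."""
    raised = partial_hz * 2.0 ** (shift_cents / 1200.0)
    delta = formant_gain(raised, formant, bandwidth) - formant_gain(partial_hz, formant, bandwidth)
    return 1 if delta > 0 else -1


FORMANT = 1000.0  # assumed formant frequency in Hz
for partial in (950.0, 1050.0):
    trend = amplitude_trend(partial, FORMANT)
    relation = "in phase with" if trend > 0 else "in opposite phase to"
    print(f"partial at {partial:.0f} Hz: amplitude undulates {relation} the frequency")
```

A partial just below the formant gains amplitude when the frequency rises, and a partial just above it loses amplitude, so the sign of the relation indicates on which side of the partial the formant lies.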

This question was studied in the experiment mentioned earlier concerning vowel identification in the soprano pitch range (Sundberg, 1977a). Each vowel in the test was presented both with and without vibrato. The interpretations made by phonetically trained subjects differed considerably. The degree of agreement between the interpretations was measured in the following manner. Each response vowel was ascribed a set of three formant frequencies. Then all responses obtained for a given stimulus vowel could be regarded as a cloud of points in a three-dimensional space, where each dimension corresponds to a formant. The center of this cloud was determined. The mean distance between the individual points and the center was next computed using a formula for perceptual distance between vowels suggested by Plomp (1970). It was assumed that this average distance reflected the difficulty with which a vowel stimulus was identified as a specific vowel. The average distance between responses is shown in Fig. I-A-12. As can be seen in the figure there are no consistent differences between the values pertaining to vibrato tones and those obtained for vibrato-free tones. Therefore, it is reasonable to conclude that vibrato does not facilitate vowel identification. On the other hand, the results may have been a bit different if stimuli had been more like natural vowels sung by sopranos. It is often hard to predict how our ability to identify stimuli is affected when the stimuli do not resemble anything familiar.
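The agreement measure described above can be sketched in Python as follows. The sketch deliberately substitutes a plain Euclidean distance in Hz for the perceptual distance formula suggested by Plomp (1970), which is not reproduced in the text; the function name and the formant values of the example responses are likewise hypothetical.

```python
import math


def mean_distance_to_center(responses):
    """Center of a cloud of (F1, F2, F3) points and the mean distance of the points to it.

    Euclidean distance in Hz is used here as a stand-in for a perceptual distance measure.
    """
    n = len(responses)
    center = tuple(sum(r[i] for r in responses) / n for i in range(3))
    distances = [math.dist(r, center) for r in responses]
    return center, sum(distances) / n


# Hypothetical formant triples (Hz) ascribed to the responses for one stimulus vowel.
responses = [
    (300.0, 2300.0, 3000.0),
    (350.0, 2100.0, 2900.0),
    (600.0, 1000.0, 2500.0),
]
center, spread = mean_distance_to_center(responses)
print("center of the cloud:", center)
print(f"mean distance to the center: {spread:.0f} Hz")
```

A small mean distance corresponds to responses that cluster around one vowel, a large one to a stimulus that was interpreted in widely different ways.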

It is a well established fact that the fundamental frequency determines pitch. In the case of vibrato tones, however, this is not quite true. While the fundamental frequency varies regularly in such tones the pitch we perceive is perfectly constant as long as the vibrato rate and extent are kept within certain limits.

Which are these limits? Ramsdell studied this question at Harvard University in a thesis work which unfortunately was never published. Ramsdell varied the vibrato rate and extent systematically and had listeners decide when the resulting tone possessed an optimum "singleness in pitch". His results for a 500 Hz tone are shown in Fig. I-A-13. Later Gibian (1972) studied vibrato in synthetic vowels. He varied the vibrato rate and extent, and had subjects assess the similarity with human voice vibrato. His results agree closely with Ramsdell's data as can be seen in the same figure. In addition to asking the listeners for the optimum singleness in pitch, Ramsdell also asked for an evaluation of the "richness" in the timbre. His data showed that the optimum as regards singleness in pitch as well as timbral richness corresponds to the values of rate and extent which are typically observed in singers.


It is interesting that Ramsdell's curve approaches a straight line in the neighborhood of seven undulations per second. This implies that the extent is not very critical for singleness in pitch at this rate. In contrast to this, there is a strong opinion among some singing teachers that not only slow, but also fast vibrato rates are tolerable only if the extent is small. It would be interesting to repeat Ramsdell's experiment with modern equipment.

3. Pitch and fundamental frequency

Another perceptual aspect of vibrato is the perceived pitch. Provided that the rate and extent are kept within acceptable limits, what is the pitch we perceive? This question was studied independently by Shonle (1975) and Sundberg (1972, 1978b). Sundberg had musically trained subjects match the pitch of a vibrato tone by adjusting the fundamental frequency of a following vibrato-free tone. The two tones, synthetic sung vowels, were identical except for the vibrato. They were presented repeatedly until the adjustment was completed. The vibrato rate was 6.5 undulations per second and the extent was ± 30 cents. Fig. I-A-14 shows the results. The ear seems to compute the average of the undulating frequency, and perceived pitch corresponds closely to this average. Shonle worked with sinewave stimuli and arrived at practically the same conclusion. He was also able to show that it is the geometric mean, not the arithmetic mean which Sundberg worked with, that determines the pitch at least in the case of sinewave signals. However, the difference between these two means is insignificant in musically acceptable vibratos.
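How small the difference between the two means is can be checked numerically. The Python sketch below assumes a vibrato that is sinusoidal on a logarithmic (cent) frequency scale, which is an idealization and not a description of the stimuli actually used; the centre frequency of 440 Hz and the function names are also assumptions. The rate of 6.5 undulations per second and the extent of ± 30 cents are taken from the text.

```python
import math


def vibrato_means(f_center, rate_hz, extent_cents, n=10000):
    """Arithmetic and geometric mean of F0 over one period of an idealized vibrato."""
    period = 1.0 / rate_hz
    f0 = [
        f_center * 2.0 ** (extent_cents * math.sin(2.0 * math.pi * rate_hz * k * period / n) / 1200.0)
        for k in range(n)
    ]
    arithmetic = sum(f0) / n
    geometric = math.exp(sum(math.log(f) for f in f0) / n)
    return arithmetic, geometric


def cents(f1, f2):
    """Interval from f1 up to f2 in cents."""
    return 1200.0 * math.log2(f2 / f1)


arithmetic, geometric = vibrato_means(440.0, 6.5, 30.0)
print(f"arithmetic mean: {arithmetic:.3f} Hz")
print(f"geometric mean:  {geometric:.3f} Hz")
print(f"difference:      {cents(geometric, arithmetic):.3f} cents")
```

Under these assumptions the arithmetic mean lies above the geometric mean by only a small fraction of a cent, far below the difference limen for frequency, which is in line with the statement that the choice between the two means hardly matters for musically acceptable vibratos.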

It is frequently assumed that the vibrato is useful in musical practice because it reduces the demands on accuracy in fundamental frequency (see e.g. Stevens & Davis, 1938; Winckel, 1967). One possible interpretation of this assumption is that the pitch of a vibrato tone is less accurately perceived than the pitch of a vibrato-free tone. Another interpretation is that the pitch interval between two tones, which sound simultaneously, can be determined with less accuracy when they have vibrato than when they are vibrato-free.

The first interpretation was tested by Sundberg (1972; 1978b). The standard deviations obtained when subjects matched the pitch of a vibrato tone with that of a vibrato-free tone were compared with the standard deviations obtained from similar matchings where both tones lacked vibrato. As can be seen in Fig. I-A-15 the differences between the standard deviations were extremely small and dropped slightly with rising fundamental frequency. This implies that the vibrato reduces pitch perception accuracy slightly for low frequencies. On the other hand, the effects are too small to explain any measurable

effects in musical practice.

The second interpretation has not yet been tested, but it is tempting to speculate about it. If two simultaneous complex tones with harmonic spectra constitute a perfectly tuned consonant interval, some partials of one tone will coincide with some partials of the other tone. Let us consider two tones with fundamental frequencies of 200 and 300 Hz, i.e. a perfect fifth interval. In this case every third partial of the lower tone (frequencies: 600, 1200, 1800 ... Hz) will coincide with every second partial of the upper tone. Let us now mistune the interval by raising the frequency of the upper tone to 300.5 Hz. This frequency shift equals 2.9 cent, which is impossible for almost any listener to detect under any experimental conditions. (The difference limen for frequency is at least 6 cent, but may be considerably higher depending on the experimental method, see Rakowski, 1971.) On the other hand, the partials from the two tones will not coincide any longer. For instance, the fourth partial of the upper tone has a frequency of 4 × 300.5 = 1202 Hz. This partial will give two beats per second with the sixth partial of the lower tone, which has the frequency 1200 Hz. There are no difficulties in detecting such beats, provided that both partials have similar and sufficiently high amplitudes. The point is that these beats will not occur if both tones have vibrato. Thus, if two voices sing perfectly "straight", i.e. without vibrato, the demands on accuracy with respect to the fundamental frequency are higher than if they sing with vibrato. However, the advantage seems to be small. In an unpublished thesis work at the Dept. of Speech Communication, Royal Institute of Technology in Stockholm, Ågren (1976) had musically trained subjects match different intervals between two simultaneous vibrato tones. The intervals were major second, major third, pure fifth, and pure octave. The tones were synthetic sung vowels. Some of the subjects managed to obtain a standard deviation as low as 6 cents in repeated matchings of a given interval. If we may believe that mistunings of this small magnitude can be detected even in musical practice, it would seem that the demands on pitch accuracy are extremely high even when the singers use vibrato. It is likely that the vibrato is accepted and used in singing for other reasons as will be shown later.
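The arithmetic of the mistuned fifth can be retraced with a few lines of Python. The sketch only restates the numbers of the example (200 Hz, 300 Hz, 300.5 Hz, sixth and fourth partial); the function names are chosen here for illustration.

```python
import math


def cents(f1, f2):
    """Interval from f1 up to f2 in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(f2 / f1)


def beat_rate(f_low, n_low, f_high, n_high):
    """Beats per second between partial n_low of the lower and n_high of the upper tone."""
    return abs(n_high * f_high - n_low * f_low)


f_low, f_high, f_mistuned = 200.0, 300.0, 300.5

print(f"mistuning: {cents(f_high, f_mistuned):.1f} cents")
print(f"beats, perfectly tuned fifth: {beat_rate(f_low, 6, f_high, 4):.1f} per second")
print(f"beats, mistuned fifth:        {beat_rate(f_low, 6, f_mistuned, 4):.1f} per second")
```

The mistuning is below the difference limen quoted in the text, yet the pair of nearly coinciding partials turns it into a slow, potentially audible beat.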

Our conclusions are that the pitch of vibrato tones is practically identical with the pitch of a vibrato-free tone with a fundamental frequency equal to the geometric mean of the fundamental frequency of the vibrato tone. Moreover, the consistency with which the pitch of a vibrato tone is perceived is not affected to any appreciable extent by the vibrato.

VI. Pitch accuracy in singing practice

Above, a couple of investigations on the pitch perceived from vibrato tones were reviewed. These investigations were made under well-controlled experimental conditions. Do the results obtained that way apply also to musical practice? A study of the accuracy of fundamental frequency in musical practice is likely to

In a review of a number of investigations Seashore (1938) included a wealth of documentation of fundamental frequency recordings of professional performances of various songs. The general trend is that long notes are sung with an average fundamental frequency which coincides with the theoretically correct value. This is in agreement with the experimental findings reported above. On the other hand, they often "begin slightly flat (about 90 cent on the average) and are gradually corrected during the initial 200 msec. of the tone". Moreover, a great many of the long tones were observed to change their average frequency in various ways during the course of the tone. Bjørklund (1961) found such deviations typical for professional singers as opposed to non-professional singers. One possible interpretation of this is that pitch is used as a means of musical expression.

As regards short tones the relationship between fundamental frequency and pitch seems to be considerably more complicated. The case is illustrated in Fig. I-A-16 showing the fundamental frequency during a coloratura passage as sung by a professional singer, who found this performance acceptable. The registration reveals a careful coordination of amplitude, fundamental frequency, and vibrato. Most notes take exactly one vibrato period, and most of the vibrato periods center around the target frequency. However, if we try to apply what has been shown about pitch perception in vibrato tones, we run into trouble. The average fundamental frequency in a coloratura passage does not change stepwise between the target frequencies corresponding to the pitches we perceive. Rather the average rises and falls monotonically at an approximately constant rate. Thus, we cannot explain why the passage is perceived as a sequence of discrete pitches. A possible explanation is that the average computation process is interrupted and started again each time there is a minimum in the amplitude and/or frequency curve. However, this is a clear case of an ad hoc hypothesis, and no experiments have been made to support it.

Fig. I-A-15. Effect of a vibrato on pitch perception accuracy at different fundamental frequencies (F0). Musically trained subjects first matched the pitch of a vibrato-free stimulus tone by adjusting the fundamental frequency of a subsequent response tone which also lacked vibrato. Then, the same experiment was repeated except that a vibrato was added to the stimulus tone. Δ is the shift of standard deviation thereby obtained. The individual differences are given by the symbols while the heavy solid line shows the group average. Reprinted from Sundberg (1978b).

Fig. I-A-16. Synchronous recording of fundamental frequency (upper graph) and overall intensity (lower graph) as measured in a professional singer performing a coloratura passage (C3, D3, E3, F3, G3, F3, E3, D3, C3, D3, ...). The horizontal dotted lines in the upper graph show the frequencies midway between (i.e. on the geometrical mean of) the scale tone frequencies which were determined from the average frequency of the last note by means of the equally tempered scale.

An investigation of interest in this connection should be mentioned here. It has been shown that a glide is perceived as a pitch corresponding to the geometric mean of the extremes of the glide, provided that the product of the frequency change and the time for the change is not greater than 5 (Nábělek, Nábělek, & Hirsh, 1970). This case will certainly apply to some cases of short notes in singing, but it does not seem to apply to coloratura cases. For instance, the geometric mean of the upward glide does not agree with the geometric mean of the following downward glide in the same tone. Moreover, difficulties seem to occur when the pitch is very high. In this case the pitch changes between the scale tones are wide in terms of Hz. Thus, at high pitches the condition of the product of the change and the time being less than 5 can hardly be fulfilled any longer. We have to conclude that, at present, we cannot explain how a coloratura passage can be perceived as a sequence of discrete pitches.
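
The glide condition can be written out as a short Python sketch. The text gives the limit of 5 without units; the sketch assumes that the frequency change is expressed in Hz and the time in seconds, so that the product is dimensionless. The function name, the glide duration of 0.1 sec, and the equally tempered frequencies (A4 = 440 Hz) are assumptions made for the example.

```python
import math


def glide_pitch(f_start_hz, f_end_hz, duration_s, limit=5.0):
    """Return the geometric mean of the glide extremes if the product of
    frequency change (Hz, assumed) and duration (s, assumed) does not
    exceed `limit`; otherwise return None, since the rule does not apply."""
    product = abs(f_end_hz - f_start_hz) * duration_s
    if product > limit:
        return None
    return math.sqrt(f_start_hz * f_end_hz)


# A whole-tone glide C3 -> D3 lasting 0.1 s: product is about 1.6.
low = glide_pitch(130.81, 146.83, 0.1)

# The same interval two octaves higher, C5 -> D5: product is about 6.4.
high = glide_pitch(523.25, 587.33, 0.1)

print(low, high)
```

With these assumed values the whole-tone glide in the low octave satisfies the condition, while the same musical interval two octaves higher does not, because the step is four times wider in terms of Hz. This is the difficulty at high pitches noted in the text.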

From what was just said, it seems that Seashore (1938) was right in saying that the musical ear is extremely generous and operates in the interpretative mood when it listens to singing. On the other hand, there are certainly limits for this generosity: there is generally an agreement among experts as to what is in tune and what is off pitch. This would lead us to assume that the analyzing properties of the ear are more important to the pitches we perceive from singing than Seashore apparently assumed.

In a thesis work at the Dept. of Musicology, Stockholm University, Lindgren & Sundberg (1972) studied what musically experienced listeners considered off pitch. A tape was prepared including excerpts from phonogram recordings representing different degrees of singing off pitch along with several cases of apparently perfect intonation. A chart with the notation of the excerpts was given to the listeners, who were asked to circle each note which they perceived to be off pitch. The fundamental frequency was analyzed by measuring the frequencies of high overtones in sound spectrograms (sonagrams). The results showed that tones with an average frequency matching the theoretically correct value were mostly accepted as perfect intonation. However, several tones which did not meet this demand were also accepted as correct. Theoretically mistuned tones were accepted remarkably often (1) when they occurred on unstressed position in the bar, (2) when they were a bit sharp, and (3) when they occurred on emotionally prominent places in the text.

This last point again suggests that deviations from the theoretically perfect pitch are used as an expressive means in singing. Support for this assumption can be found in measurements of clarinet playing (Sirker, 1973). Also, it seems typical of music that the composer and the performer build up expectations in the listener as to what might follow. Occasionally, minor deviations from what was expected are made. It is the author's belief that such deviations contribute to the excitement we can perceive when we listen to a good performance.

If deviations from theoretically correct frequencies are used as an expressive means in singing, an important conclusion regarding the benefit of the vibrato can be made. We have seen that vibrato-free representations of mistuned consonant chords give rise to beats, and beats seem to be avoided in most types of music. By adding a vibrato the singer escapes the beats. Consequently the vibrato allows him more freedom in using deviations from theoretically correct frequencies.

This point is illustrated in Fig. I-A-17 showing the distribution of fundamental frequencies in a song as sung by a professional opera singer. The frequency values were averaged using a running time window corresponding to the length of one vibrato cycle, approximately. For comparison a similar registration of the same song performed by a singing synthesizer is shown (cf. Sundberg 1978b). Vibrato rate and extent were the same as in the real singer. The scale tone frequencies appear as peaks, which are considerably wider in the real singer than in the synthesis. This agrees with the above assumption that deliberate deviations from expected pitches are used in singing.
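
The kind of analysis described above, a running average over one vibrato cycle followed by a distribution of the averaged values, might be sketched in Python as follows. This is not the procedure actually used for the figure, whose details are not given; the sampling rate, the 10 cent bin width, and the synthetic 220 Hz vibrato tone are assumptions chosen so that one vibrato cycle spans exactly 100 samples.

```python
import math
from collections import Counter


def running_mean(values, window):
    """Moving average over `window` samples (only fully covered positions)."""
    acc = sum(values[:window])
    out = [acc / window]
    for i in range(window, len(values)):
        acc += values[i] - values[i - window]
        out.append(acc / window)
    return out


def cents_histogram(freqs_hz, ref_hz, bin_cents=10):
    """Count F0 values in bins of `bin_cents`, measured in cents from ref_hz."""
    counts = Counter()
    for f in freqs_hz:
        c = 1200.0 * math.log2(f / ref_hz)
        counts[bin_cents * round(c / bin_cents)] += 1
    return counts


# Synthetic tone: 220 Hz, vibrato of +/- 50 cents at 6 Hz, sampled at 600 Hz,
# so one vibrato cycle corresponds to 100 samples.
f0 = [220.0 * 2.0 ** (50.0 * math.sin(2.0 * math.pi * 6.0 * i / 600.0) / 1200.0)
      for i in range(600)]

raw_hist = cents_histogram(f0, 220.0)
smooth_hist = cents_histogram(running_mean(f0, 100), 220.0)
print(sorted(raw_hist))     # bins spread over the whole vibrato extent
print(sorted(smooth_hist))  # a single narrow peak at the scale tone
```

Averaging over one vibrato cycle removes the undulation itself, so that any width remaining in a scale tone peak reflects how much the average frequency of that scale tone varies from one occurrence to another.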

In the same figure a third distribution is shown. It pertains to the members of a distinguished Barbershop quartet. The vibrato is not used in Barbershop singing. Hence, the chords must be perfectly tuned in order to avoid beats, and the singers have a very small freedom as regards fundamental frequency. Most scale tones are seen to correspond to very narrow peaks. This means that the frequency value corresponding to a given scale tone varies extremely little in Barbershop singing. It is

Fig. I-A-17. Distribution of fundamental frequencies in singing. The upper graph pertains to a professional singer (solid curve) and a singing synthesizer (dashed curve) performing the same song. In both cases the fundamental frequency was averaged with a running time window corresponding to one vibrato cycle, approximately. Thus, the distributions should be identical if the singer was as accurate with respect to fundamental frequency as the synthesizer. The lower graph was obtained from a distinguished Barbershop quartet (Happiness Emporium, Minneapolis, USA) singing a chord progression. Note that the widths of the scale tone peaks are generally much narrower in the Barbershop singers, who lack vibrato, than in the opera singer, who has vibrato. Note also that the pitch A is represented by two peaks in the case of the Barbershop quartet, presumably because it appeared several times in the song with different harmonic functions.

mant" which typically occurs in all voiced sounds in the male singing voice, and (3) the vibrato. In all these three examples we have strong reasons to assume that they serve a specific purpose. The pitch dependent formant frequencies as well as the "singer's formant" are both resonatory phenomena which increase the audibility of the singer's voice when the orchestral accompaniment is loud. As resonatory phenomena occur independent of vocal effort, a purpose in both these cases is vocal economy. The vibrato serves the purpose of allowing the singer a greater freedom in the choice of fundamental frequency as it eliminates beats with the sound of the accompaniment. Thus, in these three cases we see that singing differs from speech in a highly adequate manner. It is tempting to speculate that such characteristics have developed as a result of evolution; the singers who developed them became successful, and hence their technique was copied by other singers.

A second kind of fact about singing which has been discussed in this chapter is the acoustic correlates of various voice classifications which can be assumed to be based on perception. Such classifications are not only tenor, baritone, and bass etc. but also vocal effort (e.g. in terms of piano, mezzopiano etc.) and register. We have seen that in most of these cases it was hard to find a common acoustic denominator, because the acoustic characteristics of the categories vary with vowel and fundamental frequency. Rather, the common denominator exists within the voice organ. In the case of the male voice classification in terms of tenor, baritone and bass, the characteristic formant frequency differences could be assumed to result from morphological differences in the vocal tract. The same is true for vocal effort and register, because they reflect differences in the control and operation of the vocal folds. Therefore we may say that these examples of voice classification seem to rely on the function of the voice organ rather than on the acoustic properties of voice sounds. This is probably revealing as to the way in which we perceive singing voices. We seem to interpret the sounds in terms of how the voice organ was used in producing the sounds.

A final speculation might be added here regarding the emotional aspect of singing. Our experience with communication by means of speech would have taught us that the voice organ is used differently depending on the emotional state of the speaker. This leads to the reasonable conclusion that we can draw conclusions regarding the emotional state of the speaker by listening to the voice, or, more specifically, by identifying the type of phonation and articulation underlying the particular sound of the voice. It seems very likely that this ability is utilized in singing: listening to singing is an interpretative act in which the voice sounds trigger associations to emotional states. If this is true, our acquaintance with speech serves as a reference in decoding the emotional information in the sounds from a singer's voice.

Acknowledgments

Si Felicetti of the Department of Speech Communication, KTH (Royal Institute of Technology), Stockholm is acknowledged for her expert assistance in editing the manuscript and Karin Holmgren of the Department of Linguistics, Stockholm University for valuable comments on the language and the content. The preparation of the manuscript was in part supported by the following funds: Swedish Council for Planning and Coordination of Research, Swedish Council for Research in the Humanities and Social Sciences, and Swedish Natural Science Research Council.

References

Appelman, D. R., The science of , London: Indiana University Press, 1967.

Bartholomew, W. T., A physical definition of 'good voice quality' in the male voice. Journal of the Acoustical Society of America, 1934, 6, 25-33.

Bjørklund, A., Analyses of soprano voices. Journal of the Acoustical Society of America, 1961, 33, 575-582.

Cleveland, T. F., Acoustic properties of voice timbre types and their influence on voice classification. Journal of the Acoustical Society of America, 1977, 61, 1622-1629.

Coleman, R. O., A comparison of the contributions of two voice quality characteristics to the perception of maleness and femaleness in the voice. Journal of Speech and Hearing Research, 1976, 19, 168-180.

Colton, R. H., Spectral characteristics of the modal and falsetto

Colton, R. H. & Hollien, H., Perceptual differentiation of the modal and falsetto registers. Folia Phoniatrica, 1973, 25, 270-280.

Fant, G., Acoustic theory of speech production. The Hague: Mouton, 1960.

Fant, G., Speech sounds and features. Cambridge, Mass.: The MIT Press, 1973.

Fant, G., Non-uniform vowel normalization. Speech Transmission Laboratory, Quarterly Progress and Status Report, 1975, No. 2-3, 1-19.

Flanagan, J. L., Speech analysis synthesis and perception, New York:

Gibian, G. L., Synthesis of sung vowels. Quarterly Progress Report, Massachusetts Institute of Technology, 1972, No. 104, 243-247.

Hollien, H., Keister, E., & Hollien, P., Experimental data on 'singer's formant'. Journal of the Acoustical Society of America, 1978, 64, Suppl. 1, S171 (Abstract).

Large, J. & Shipp, T., The effect of certain parameters on the perception of vocal registers. National Association of Teachers of Singing Bulletin, Oct. 1969, 12-15.


Large, J., Iwata, S., & von Leden, H., The primary register transition in singing. Folia Phoniatrica, 1970, 22, 385-396.

Large, J., Iwata, S., & von Leden, H., The male operatic head register versus falsetto. Folia Phoniatrica, 1972, 24, 19-29.

Large, J., Towards an integrated physiologic-acoustic theory of vocal registers. National Association of Teachers of Singing Bulletin, Febr.-March 1972, 18-36.

Large, J., Acoustic-perceptual evaluation of register equalization. National Association of Teachers of Singing Bulletin, Oct. 1974, 20-41.

Lerman, J. W. & Duffy, R. J., Recognition of falsetto voice quality. Folia Phoniatrica, 1970, 22, 21-27.

Lindblom, B. & Sundberg, J., Acoustical consequences of lip, tongue, jaw, and larynx movement. Journal of the Acoustical Society of America, 1971, 50, 1166-1179.

Lindgren, H. & Sundberg, A., Grundfrekvensförlopp och falsksång. Thesis work, Dept. of Musicology, Stockholm University, 1972 (stencil).

Monsen, R. B. & Engebretson, A. M., Study of vibrations in the male and female glottal wave. Journal of the Acoustical Society of America, 1977, 62, 981-993.

Nábělek, I. V., Nábělek, A. K., & Hirsh, I. J., Pitch of tone bursts of changing frequency. Journal of the Acoustical Society of America, 1970, 48, 536-553.

Nordström, P-E., Female and infant vocal tracts simulated from male area functions. Journal of Phonetics, 1977, 5, 81-92.

Plomp, R., Timbre as a multidimensional attribute of complex tones. In R. Plomp & G. F. Smoorenburg (Eds.), Frequency analysis and periodicity detection in hearing. Leiden: A. W. Sijthoff, 1970, 397-414.

Plomp, R., Continuity effects in the perception of sounds with interfering noise bursts. Paper given at the Symposium sur la Psychoacoustique musicale, IRCAM, Paris, July 1977.

Rakowski, A., Pitch discrimination at the threshold of hearing. Proc. 7th International Congress on Acoustics, Budapest, 1971, 3, 373-376.


Slawson, A. W., Vowel quality and musical timbre as functions of spectrum envelope and fundamental frequency. Journal of the Acoustical Society of America, 1968, 43, 87-101.

Stevens, S. S. & Davis, H., Hearing, its psychology and physiology. New York: John Wiley & Sons, 1938.

Stumpf, C., Die Sprachlaute. Berlin: J. Springer Verlag, 1926.

Sundberg, J., Formant structure and articulation of spoken and sung vowels. Folia Phoniatrica, 1970, 22, 28-48.

Sundberg, J., Production and function of the 'singing formant'. In H. Glahn, S. Sørensen, & P. Ryom (Eds.), Report of the 11th Congress of the International Musicological Society. Copenhagen: Editor Wilhelm Hansen, 1972a, 679-686.

Sundberg, J., Pitch of synthetic sung vowels. Speech Transmission Laboratory, Quarterly Progress and Status Report, 1972, No. 1, 34-44. Later revised and published as Effects of the vibrato and the 'singing formant' on pitch. In Musicologica Slovaca (in Memoriam M. Filip), 1978b, 6, 51-69.

Sundberg, J., The source spectrum in professional singing. Folia Phoniatrica, 1973, 25, 71-90.

Sundberg, J., Articulatory interpretation of the 'singing formant'. Journal of the Acoustical Society of America, 1974, 55, 838-844.

Sundberg, J., Formant technique in a professional female singer. Acustica, 1975, 32, 89-96.

Sundberg, J., Vibrato and vowel identification. Archives of Acoustics (Polish Academy of Sciences), 1977a, 2, 257-266.

Sundberg, J., Singing and timbre. In Music room acoustics, Stockholm: Royal Swedish Academy of Music Publications 17, 1977b, 57-81.

Sundberg, J., Studies of the soprano voice. Journal of Research in Singing, 1977c, 1(1), 25-35.

Sundberg, J., Musikens ljudlära (2nd ed.). Stockholm: Proprius Förlag, 1978a.

Sundberg, J., Synthesis of singing. Swedish Journal of Musicology, 1978c, 60, 107-112.

Sundberg, J. & Gauffin, J., Waveform and spectrum of the glottal voice source. Speech Transmission Laboratory, Quarterly Progress and Status Report, 1978, No. 2-3, 35-50.

Terhardt, E., On the perception of periodic sound fluctuations (roughness). Acustica, 1974, 30, 201-213.

Weait, C. & Shea, J. B., Vibrato: an audio-video-fluorographic investigation of a bassoonist. Applied Radiology, Jan.-Febr. 1977,

Vennard, W., Singing, the mechanism and the technic (2nd ed.). New York: Carl Fischer, Inc., 1967.

Vennard, W., Hirano, M., Ohala, J., & Fritzell, B., A series of four electromyographic studies. National Association of Teachers of Singing Bulletin, Oct. 1970, 16-21; Dec. 1970, 30-37; Febr.-March 1971, 26-32; May-June 1971, 22-30.

Winckel, F., Physikalische Kriterien für objektive Stimmbeurteilung. Folia Phoniatrica, 1953, 5 Separatum, 231-252.

Zwicker, E. & Feldtkeller, R., Das Ohr als Nachrichtenempfänger (2nd ed.). Stuttgart: S. Hirzel Verlag, 1967.

Ågren, K., Alt- och tenorröst och harmoniska intervall mellan dem. Thesis work in speech communication. Dept. of Speech Communication, KTH, Stockholm, 1976.

Ågren, K. & Sundberg, J., An acoustic comparison of alto and tenor voices. Journal of Research in Singing, 1978, 1(3), 26-32.