Assessment, Control and Modification of Oral-nasal Balance in Speech

by

Gillian Lorna de Boer

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Rehabilitation Sciences Institute – Speech-Language Pathology
University of Toronto

© Copyright by Gillian Lorna de Boer 2018

Abstract

In normal speech, oral sounds resonate in the oral cavity, and nasal sounds such as m and n resonate in the nasal cavities. The velopharyngeal sphincter closes for oral sounds and opens for nasal sounds. The studies presented in this thesis sought to advance the assessment, control and modification of oral-nasal balance in speech. While the research was carried out in normal speakers, the intention was to create knowledge that will ultimately improve the assessment and behavioural treatment of speakers with oral-nasal balance disorders due to cleft palate.

The first study (de Boer & Bressmann, 2016a) was a retrospective analysis of normal speakers simulating different disorders of oral-nasal balance (hypernasality, hyponasality and mixed nasality). The recordings were analyzed acoustically using Long Term Averaged Spectra. The simulations produced distinctive spectra, enabling the creation of formulas that predicted the oral-nasal balance well above chance level.

The second study (de Boer & Bressmann, 2017) explored the role of auditory feedback in the regulation of oral-nasal balance in speech. In an altered auditory feedback paradigm, speakers of Canadian English compensated more for increased nasality than decreased nasality. This suggested that the speakers were less critical of a lack of nasality (hyponasality) than excess nasality (hypernasality).

The third study (de Boer, Marino, Berti, Fabron, & Bressmann, 2016) investigated how voice focus affects oral-nasal balance in normal speakers of Brazilian Portuguese. Participants read stimuli with their normal voice, a backward focus and a forward focus. The mean nasalance scores of the stimuli in the backward focus and normal speaking conditions were significantly lower than in the forward focus condition. The results confirmed that speaking focus influences oral-nasal balance in normal speakers, which could be useful for the development of new approaches to behavioural therapy.

The research presented expands our understanding of oral-nasal balance control and lays the groundwork for new ways of clinically assessing and managing oral-nasal balance disorders.

Acknowledgments

Dr. Tim Bressmann has been a supervisor, mentor and friend. Always encouraging, he kept me on track, and rescued me from many intellectual rabbit holes. I continue to marvel at his curiosity which stretches across academia to global cuisine and candy.

Throughout my Master's and doctoral degrees, Drs. Pascal van Lieshout and Gajanan (Kiran) Kulkarni served on my supervisory committee. Their questions and constructive criticisms helped to shape my graduate studies. I am especially grateful for the care and attention they dedicated to earlier versions of this thesis.

I would like to acknowledge the Ontario Graduate Scholarship program for funding much of my doctoral degree. I am also grateful to Mitacs, whose Globalink award enabled me to travel to Brazil and work with leaders in my field, including Drs. Viviane Marino, Jeniffer Dutka and Maria-Ines Pegoraro-Krook. During that trip, I was spoiled by the hospitality of my hosts Dr. Larissa Berti and her family. They served many wonderful Sunday lunches and provided me with the ever-encouraging pineapple metaphor (“It’s a lot of work, but worth it”). I will fondly remember coffees and lunches with Dr. Viviane Marino for years to come. I am also grateful to Dr. Eliana Fabron for her incredible kindness and generosity.

My time in the lab was enriched by the friendships developed with fellow students and researchers, Dr. Larissa Berti, Amanda Ratner, Charlene Santoni, Elke Sapper, Monique Tardif and Gabriella Zuin. I am also grateful to a number of volunteers for their assistance with the second study: Sheetal Ramaprasad, Bianca Cohn, Yaxin Liu, Roubina Sarkissian, Marika Loy, and Karalina Lovkina. Though dedicated to other projects in the lab, the company of Joanna Hunt and Alissa Varlamova was also very much appreciated.

Outside the lab I would like to thank Diane’s Tuesday/Thursday Triple Blast class and Viola’s Dance Like You Mean It studio for keeping me moving and sane. Finally, I would like to thank Michael, my partner, my rock, who’s had my back since the day we met.

Table of Contents

Acknowledgments
List of Tables
List of Figures
List of Appendices
Chapter 1 - General introduction
  1.0 Speech production
    1.0.1 Resonance vs. oral-nasal balance
  1.1 Nasals and nasalized vowels
    1.1.1 Acoustic characteristics of nasals and nasalized vowels
    1.1.2 Velopharyngeal sphincter function
    1.1.3 Velopharyngeal musculature
    1.1.4 Velopharyngeal innervation and control
    1.1.5 Velopharyngeal closure
    1.1.6 The nose and nasal cavities
  1.2 Disordered oral-nasal balance
  1.3 Assessment of disorders of oral-nasal balance
    1.3.1 Perceptual
    1.3.2 Instrumental assessment: Visual assessment of velopharyngeal function
    1.3.3 Instrumental assessment: Acoustic measurement of oral-nasal balance
  1.4 Treatment of oral-nasal balance disorders
    1.4.1 Surgery
    1.4.2 Prosthetics
    1.4.3 Speech therapy
  1.5 Thesis objectives
Chapter 2 - Application of linear discriminant analysis to the Long Term Averaged Spectra of simulated disorders of oral-nasal balance
  2.0 Abstract
  2.2 Methods
    2.2.1 Participants
    2.2.2 Participant Training
    2.2.3 Stimuli
    2.2.4 Recording Procedures
    2.2.5 Simulation Verification
    2.2.6 Acoustic Analysis
    2.2.7 Statistical Analysis
  2.3 Results
    2.3.1 Repeated Measures ANOVA
    2.3.2 Linear discriminant analysis
  2.4 Discussion
  2.5 Conclusion
  2.6 Acknowledgements
Chapter 3 - Influence of altered auditory feedback on oral-nasal balance in speech
  3.0 Abstract
  3.1 Introduction
  3.2 Methods
    3.2.1 Participants
    3.2.2 Recording procedures
    3.2.3 Acoustic impact of changes to multitrack channel level
    3.2.4 Procedures: Change of nasal feedback level in the two experimental groups
    3.2.5 Procedures: Change of overall feedback level in the amplitude control group
    3.2.6 Statistical analysis
  3.3 Results
    3.3.1 Experimental groups
    3.3.2 Amplitude control group
  3.4 Discussion
  3.5 Conclusion
  3.6 Acknowledgements
Chapter 4 - Influence of Voice Focus on Oral-Nasal Balance in Speakers of Brazilian Portuguese
  4.0 Abstract
  4.1 Introduction
  4.2 Methods
    4.2.1 Participants
    4.2.2 Participant training
    4.2.3 Stimuli
    4.2.4 Recording Procedures
    4.2.5 Data Analysis
  4.3 Results
  4.4 Discussion
  4.5 Acknowledgements
  4.6 Disclosure statement
  4.7 Appendix - Stimuli
Chapter 5 - Conclusions
  5.1 Classification of oral-nasal balance with Long-Term Averaged Spectra
    5.1.1 Study summary
    5.1.2 Study implications
    5.1.3 Limitations
    5.1.4 Future directions
  5.2 Effect of altered nasal auditory feedback on oral nasal balance
    5.2.1 Study summary
    5.2.2 Study implications
    5.2.3 Limitations
    5.2.4 Future directions
  5.3 Impact of voice focus on the oral-nasal balance of speakers of Brazilian Portuguese
    5.3.1 Study summary
    5.3.2 Study implications
    5.3.3 Limitations
    5.3.4 Future directions
  5.4 Closing statement
References
Appendix A – Tentative formulas for the classification of oral nasal balance
Appendix B – The stimuli from section 4.7 and their phonetic transcriptions

List of Tables

Table 2.1. Bonferroni multiple-comparison tests (α = 0.05) of z-scores from LTAS frequency bands for oral and nasal stimuli with significant condition interaction effects.

Table 2.2 Canonical discriminant function coefficients derived from ten predictors and four speech conditions (normal, and simulated hypernasal, hyponasal and mixed).

Table 2.3. Function values of group centroids for four speech conditions (normal, and simulated hypernasal, hyponasal and mixed).

Table 3.1 – Combined mean nasalance scores from the two experimental groups (high-to-low and low-to-high) in the 50% baseline, 0% minimum and 100% maximum nasal level feedback conditions (N=20).

Table 4.1. Mean nasalance scores of 3 repetitions of 9 stimuli in 3 conditions (n = 10)

List of Figures

Figure 2.1 Line chart of the LTAS z-transformed mean amplitudes of four conditions for the oral stimulus.

Figure 2.2 Line chart of the LTAS z-transformed mean amplitudes of four conditions for the nasal stimulus.

Figure 2.3. Scatterplot of function values for the linear discriminant analysis with group centroids (centroid labels: N = normal, R = hypernasality, O = hyponasality and X = mixed nasality).

Figure 3.1. Schematic diagram of recording equipment for auditory feedback of oral-nasal balance

Figure 3.2. Line graph of decibels SPL (uncalibrated) by track volume control potentiometer level for a sine wave

Figure 3.3. Spectrogram with intensity trace for the word smelly uttered in the baseline feedback condition and mastered with 0% minimum, 50% baseline and 100% maximum nasal level settings.

Figure 3.4 Error bar plot of consecutive mean nasalance scores as a function of nasal feedback level for the high-to-low experimental group (N=10).

Figure 3.5 Error bar plot of consecutive mean nasalance scores as a function of the nasal feedback level for the low-to-high experimental group (N=10).

Figure 3.6 Error bar plot of consecutive mean nasalance scores as a function of the oral and nasal feedback level for the amplitude control group (N=9).

Figure 4.1 Boxplot of nasalance scores for 3 repetitions of 9 stimuli in 3 conditions (normal, forward focus and backward focus) (N=10)

List of Appendices

Appendix A – Tentative formulas for the classification of oral nasal balance

Appendix B – The stimuli from section 4.7 and their phonetic transcriptions.

Chapter 1 - General introduction

In normal speech, oral sounds resonate in the oral cavity and nasal sounds, such as m, n and ng, resonate in the nasal cavities. The velopharyngeal sphincter closes for oral sounds and opens for nasal sounds. This oral-nasal balance in speech is important. Without a separation between the oral and nasal cavities, speech is hypernasal (all speech sounds are nasalized). Hypernasality affects speech intelligibility and acceptability and is socially stigmatizing. The studies presented here concern the assessment, control and modification of oral-nasal balance in speech. While the research studies were carried out with typical female speakers, this research was intended to ultimately improve the management of hypernasality resulting from structural velopharyngeal dysfunction due to cleft palate.

After a brief review of oral and nasal speech sounds, this introductory chapter describes the anatomy and function of the velopharyngeal sphincter. The sections that follow describe how clefts of the lips and palate are assessed and treated as well as the challenges clinicians face in assessing oral-nasal balance disorders and treating hypernasality with speech therapy.

1.0 Speech production

Speech is an intricate motor process that is commonly divided into three main functional subsystems, i.e., the respiratory, laryngeal (phonation) and supralaryngeal (articulation) systems (Zemlin, 1998). Speech requires the regulation of breath to drive phonation, the propulsion of air through the vocal folds to create a source signal, and the movements of the jaw, tongue and lips to shape this raw sound into differentiated speech sounds. The separation and coupling of air between the oral and nasal cavities that is achieved by the velopharyngeal sphincter is an important aspect of this process. The transformation from breath to speech can be explained by Fant’s (1960) source-filter theory. A source signal is created when a sufficient transglottal pressure difference causes the vocal folds to vibrate. The spectral characteristics of this source signal are then modulated by the aerodynamic changes introduced by the pharynx and oral cavity. If the velopharyngeal sphincter is closed, speech sound will be shaped by only the resonance properties of the pharynx and oral cavity. If the velopharyngeal sphincter is open, sound will also travel to the nasal cavities. The coupling and de-coupling of sound between the oral and nasal cavities regulates the oral-nasal balance during connected speech (Hixon, Weismer, & Hoit, 2008; Peterson-Falzone, Hardin-Jones, & Karnell, 2001).

1.0.1 Resonance vs. oral-nasal balance

Throughout this thesis, the terms “oral-nasal balance” (de Boer & Bressmann, 2015; Jones, 2000; Jones, Morris, & Van Demark, 2004) and “disordered oral-nasal balance” (de Boer & Bressmann, 2015) are applied where the speech-language pathology literature commonly uses the terms “resonance” and “resonance disorders” (Kummer, 2008; Peterson-Falzone et al., 2001). The reasons for this terminological preference were laid out in detail in de Boer & Bressmann (2015). First, the term resonance is confusing because it has multiple different definitions in speech pathology, speech science, vocal paedagogy and physics (McWilliams, Morris, & Shelton, 1990; Titze, 1994). In speech-language pathology, the term “resonance disorder” primarily denotes an imbalance of oral and nasal resonance but sometimes also subsumes other perceptual impressions of vocal tract obstructions, such as “cul-de-sac resonance” (McWilliams et al., 1990). According to de Boer & Bressmann (2015), the term “oral-nasal balance” better reflects the specific continuum between oral and nasal sound transmission. De Boer & Bressmann (2015) argued that defining the concept of oral-nasal balance more rigorously, and separating it from other aspects of vocal tract resonance, could clarify the diagnostic process. Importantly, oral-nasal balance per se can be measured quantitatively. Instruments such as the Nasometer measure the acoustic energy from the nose and mouth and express the proportion of nasal sound (% nasalance = nasal / (nasal + oral) × 100) as a percentage nasalance score (Fletcher, 1976). Disorders of oral-nasal balance include hypernasality (too much nasality), hyponasality (lack of nasality) and mixed nasality (hyper-hyponasality). In contrast, the qualification of other aspects of resonance disorders, such as nasal, oral or pharyngeal cul-de-sac resonance, should be based on perceptual listener judgements. This argument is revisited and further elaborated in sections 1.2 and 1.3.1.
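The nasalance formula quoted above can be illustrated with a short sketch. The function name `nasalance_percent` and its two arguments are assumptions made for this illustration; how an instrument such as the Nasometer derives the nasal and oral values from its microphone signals is not covered here.

```python
def nasalance_percent(nasal_energy: float, oral_energy: float) -> float:
    """Return nasal / (nasal + oral) x 100 as a percentage.

    Both arguments are hypothetical, non-negative measures of acoustic
    energy from the nasal and oral channels.
    """
    if nasal_energy < 0 or oral_energy < 0:
        raise ValueError("energies must be non-negative")
    total = nasal_energy + oral_energy
    if total == 0:
        raise ValueError("no acoustic energy in either channel")
    return 100.0 * nasal_energy / total


# Equal nasal and oral energy gives a score of 50 %.
print(nasalance_percent(2.0, 2.0))
# One part nasal to three parts oral gives 25 %.
print(nasalance_percent(1.0, 3.0))
```

Because the score is a proportion, it is bounded between 0 % (no nasal energy) and 100 % (no oral energy), regardless of overall loudness.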

1.1 Nasals and nasalized vowels

In English, the nasal sounds are the nasal occlusives (or stops) [m], [n] and [ŋ] (Ladefoged & Maddieson, 1996). The term nasal implies the air and sound are flowing through the nasal cavities and an obstruction is preventing airflow out of the mouth. However, the oral occlusion is not always complete. English vowels can be nasalized by coarticulation, when they are adjacent to a nasal, as in the word ant. The term nasalized implies that the sound is traveling through both the nasal and the oral cavities (Ladefoged & Maddieson, 1996). In many languages, such as French and Portuguese, vowels are nasalized not only by coarticulation, but also phonemically, without the production of a nasal stop on either side. Examples from Brazilian Portuguese are lã [ˈlɐ̃] (wool) and sim [sĩ] (yes) (Goodin-Mayeda, 2016).

1.1.1 Acoustic characteristics of nasals and nasalized vowels

Several studies have described the acoustic features of nasality using spectrography. In adults, nasal resonance peaks will occur at 250-300 Hz and 800-1000 Hz (Stevens, 1997). The most noticeable spectral feature of a nasalized vowel is a first formant peak with a wider bandwidth and lower amplitude than its non-nasalized counterpart (Johnson, 2012; Stevens, 1985, 1997). In the spectra of non-nasalized vowels, the first formant has the greatest amplitude, but for a nasalized vowel, the nasal peak can be as high in amplitude as the first oral formant (Chen, 1997).

For nasal occlusives, closing the oral cavity creates a side branch to the primary nasal resonating chamber. The resonating frequency of the oral cavity creates anti-resonances (or anti-formants) which appear as dips in the spectrograms. For [m] and [n], the anti-resonances are near 750 Hz and 1400 Hz respectively (Johnson, 2012).

1.1.2 Velopharyngeal sphincter function

The oral and nasal cavities are separated anteriorly by the hard palate. The two cavities are separated posteriorly when the velopharyngeal sphincter constricts. The velopharyngeal sphincter (also known as the velopharyngeal mechanism) consists of the velum (the soft palate) and the upper lateral and posterior pharyngeal walls. The velopharyngeal sphincter regulates the opening and closing of the velopharyngeal port (Lubker, 1968; Peterson-Falzone et al., 2001). As described above, the velopharyngeal sphincter will open, to varying degrees, for nasal and nasalized speech sounds and close for oral sounds (Lubker, 1968; Yanagisawa, Kmucha, & Estill, 1990). For swallowing, the sphincter closes to prevent food from entering the nasal cavities. During the swallow, the Eustachian tube is opened, equalizing the air pressure in the middle ear. Finally, for nasal breathing the velopharyngeal port will be wide open to allow air to pass from the nostrils to the lungs.

1.1.3 Velopharyngeal musculature

The velopharyngeal sphincter muscles involved in speech production are the levator veli palatini, the musculus uvulae, the palatoglossus, and the superior pharyngeal constrictor. The levator veli palatini is the primary muscle of velopharyngeal closure (Perry, 2011; Perry & Zajac, 2017). A sling-shaped pair of muscles, each half of the levator veli palatini begins at the anterior petrous portion of the temporal bone then meets near the middle of the velum. Within the velum, the fibers fan out and interconnect with the opposing bundle so that the two bundles are fused in the velar midline (Kuehn & Moon, 2005; Perry & Zajac, 2017). When the levator veli palatini contracts, the velum is pulled towards the posterior pharyngeal wall at a 45-degree angle (Perry, 2011). Although the levator veli palatini is considered to be the primary muscle of velopharyngeal closure, there is not a one-to-one relationship between velar elevation and activation of the levator veli palatini (Kuehn, Folkins, & Cutting, 1982). Instead, Kuehn et al. (1982) found a systematic interaction between the levator veli palatini, the palatoglossus and the palatopharyngeus.

The musculus uvulae is intrinsic to the velum. It begins at the palatal aponeurosis, a tendinous sheath between the hard and soft palate about 25% along the length of the velum, and despite its name, ends before the uvula proper (Kuehn & Moon, 2005; Kuehn & Perry, 2009). The musculus uvulae lies near the dorsal (nasal) surface of the velum, and is held by the levator veli palatini on either side (Perry, 2011). Contraction of the musculus uvulae adds bulk to the velum (the velar eminence), assisting with velopharyngeal closure (Perry, 2011).

The palatoglossus attaches to the lateral sides of the velum, courses through the anterior faucial pillars, and inserts into the lateral aspects of the tongue (Kuehn & Azzam, 1978; Kuehn & Perry, 2009; Perry, 2011). Contraction of the palatoglossus can lower the velum, elevate the tongue dorsum and narrow the faucial isthmus, which aids in bolus movement for swallowing (Kuehn & Perry, 2009). Electromyographic measures suggest the muscle is active for certain speech sounds (Kuehn et al., 1982), but it is not known if the palatoglossus muscle activity relates to velar lowering, tongue dorsum elevation or both. Since the anterior faucial pillars contain many elastic fibers, and the length of the palatoglossus resides within them, Kuehn and Azzam (1978) proposed that velar lowering may be accomplished by contraction of the palatoglossus, gravity and/or elastic recoil.

The superior pharyngeal constrictor is a thin fan-shaped muscle forming the walls of the upper pharynx (Kuehn & Perry, 2009). A series of muscle bundles stretch along the lateral walls and meet at the pharyngeal raphe. The superior pharyngeal constrictor is thought to narrow the walls of the pharynx, bringing the pharynx closer to the velum (Iglesias, Kuehn, & Morris, 1980; Kuehn & Perry, 2009). Despite several attempts (Bell-Berti, 1976; Dickson, 1975; Fritzell, 1969; Kuehn et al., 1982; Minifie, Abbs, Tarlow, & Kwaterski, 1974), the function of the superior pharyngeal constrictor has been difficult to verify with electromyography because the muscle is very thin (2 mm), overlaps with other muscles and is hard to access (Kuehn et al., 1982; Kuehn & Perry, 2009).

Three additional pairs of muscles associated with the velopharyngeal sphincter are the palatopharyngeus, the salpingopharyngeus and the tensor veli palatini. The palatopharyngeus consists of the vertically aligned palatothyroideus and the horizontally aligned palatopharyngeus proper (Cassell & Elkadi, 1995). Both sets of muscle fibers attach to the velum. The palatothyroideus courses through the posterior faucial pillars, down the lateral pharyngeal walls and terminates at the thyroid. These vertical muscle fibers coordinate with the levator veli palatini and the palatoglossus to position the velum (Kuehn & Perry, 2009; Moon, Smith, Folkins, Lemke, & Gartlan, 1994). The palatopharyngeus proper attaches to the lateral and posterior pharyngeal walls. The contraction of the transversally oriented muscle fibers is thought to contribute to lateral narrowing of the velopharyngeal port for swallowing (Kuehn & Perry, 2009; Perry, 2011).

The salpingopharyngeus is a small muscle that originates at the pharyngeal opening of the Eustachian tube, courses down the lateral wall and blends with the palatopharyngeus (Perry, 2011; Zajac & Vallino, 2017). In anatomical studies, the salpingopharyngeus has been found in less than half of the individuals investigated (Dickson & Dickson, 1972; Trigos, Ysunza, Vargas, & Vazquez, 1988). Therefore, it is not considered to be important for velopharyngeal function with respect to speech, but when present, the salpingopharyngeus may assist with lateral contraction of the pharynx (Kuehn & Perry, 2009; Perry, 2011; Perry & Zajac, 2017).

Finally, the tensor veli palatini opens the Eustachian tube during swallowing and yawning, thus equalizing the air pressure of the middle ear (Kuehn & Perry, 2009; Perry, 2011; Perry & Zajac, 2017). The tensor veli palatini is a two-bellied muscle: one bundle originates at the cranial base and the second bundle connects to the lateral wall of the Eustachian tube (Abe et al., 2004; Barsoumian, Kuehn, Moon, & Canady, 1998; Perry & Zajac, 2017). The two-bellied muscle runs parallel to the levator veli palatini; its tendons merge as one, wrap around the pterygoid hamulus and terminate at the palatine aponeurosis (Abe et al., 2004; Barsoumian et al., 1998; Heidsieck, Smarius, Oomen, & Breugem, 2016; Perry & Zajac, 2017).

1.1.4 Velopharyngeal innervation and control

It is generally agreed that the tensor veli palatini is innervated by the mandibular nerve (3rd branch of the N. trigeminus - cranial nerve V) and that the remaining velopharyngeal muscles are innervated by the pharyngeal plexus, a network of afferent and efferent fibers from the glossopharyngeal (IX) and vagal (X) cranial nerves (Kuehn & Perry, 2009; Logjes, Bleys, & Breugem, 2016). Recent studies suggest the levator veli palatini is also innervated by the lesser palatine nerve (Kishimoto, Matsuura, Kawai, Yamada, & Suzuki, 2016; Kishimoto, Yamada et al., 2016; Shimokawa, Yi, & Tanaka, 2005). Whether the lesser palatine nerve is derived from the maxillary branch of the trigeminal nerve (V) or the facial nerve (VII) via the pterygopalatine ganglion remains to be determined (Kuehn & Perry, 2009; Shimokawa et al., 2005).

Proprioception is one’s ability to detect limb and bodily posture from the inside (Fridland, 2011). According to Hixon et al. (2008), proprioception and kinesthesis (perception of movement) of the velopharyngeal sphincter are believed to be rudimentary at best. In an early study of velopharyngeal control, normal participants could be trained to raise the velum on command, but without extraneous cues, they could not reliably report when their velum was in a raised position (Shelton, Knox, Elbert, & Johnson, 1970). The proprioception of limbs is influenced by other perceptual modalities such as vision and touch (Botvinick & Cohen, 1998; Fridland, 2011). Likewise, participants have demonstrated a greater degree of velopharyngeal control when performing a speech task with auditory and visual feedback of their performance (e.g., Moon & Jones, 1991).

Compared to the oral cavity, the pharynx has a much lower density of sensory neurons (Kanagasuntheram, Wong, & Chan, 1969; Kuehn & Perry, 2009). Muscle spindles are sensory receptors within a muscle that detect changes in its length and are associated with proprioception. Muscle spindles have been found in the tensor veli palatini, palatoglossus and the levator veli palatini (Kuehn, Templeton, & Maynard, 1990; Liss, 1990) but not the musculus uvulae, the superior pharyngeal constrictor, the palatopharyngeus, or the salpingopharyngeus (De Carlos et al., 2013; Kuehn & Perry, 2009). Although Liss (1990) found muscle spindles in the levator veli palatini, they were reported to be much smaller and shorter than typically found in limb muscles. Yet, muscle spindles have been found in the palatoglossus (Kuehn et al., 1990; Liss, 1990), and there is evidence to suggest that the palatoglossus, the levator veli palatini and the palatopharyngeus work together (Kuehn et al., 1982). The spindles from the palatoglossus may provide sufficient proprioceptive feedback to regulate the position of the levator veli palatini.

In a histological analysis comparing palatal, facial and limb muscles, the palatal muscles were found to be more similar to the facial muscles than the limb muscles (Stål & Lindman, 2000).¹ Unlike typical limb muscles, the palatal muscles attach to hard tissues at one end and the aponeurosis of the soft palate at the other. Facial muscles, such as the zygomaticus major, also have only one skeletal insertion and lack muscle spindles (Stål, Eriksson, Eriksson, & Thornell, 1987, 1990; Stål & Lindman, 2000). The authors suggested that the lack of ordinary muscle spindles “indicate a special proprioceptive control system for the soft palate muscles” (Stål & Lindman, 2000, p. 288). When De Carlos et al. (2013) examined the superior pharyngeal constrictor, they found sensory structures which they called “corpuscule-like”. The authors proposed that these may act as a sensory substitute for muscle spindles. It remains to be determined if similar “corpuscule-like” structures can be found in other muscles of the velopharyngeal sphincter.

¹ Stål and Lindman (2000) differentiate the masticatory muscles, which have numerous spindles, from the facial muscles. The masticatory muscles, like limb muscles, have two skeletal insertions.

Proprioception exists as a continuum between conscious and non-conscious awareness of one’s own body (Fridland, 2011). If the body did not have a mechanism to judge the position of the velopharyngeal musculature, there would be no velopharyngeal control for swallowing or for oral vs. nasal breathing, let alone speech. While the extent or quality of conscious velopharyngeal proprioception is uncertain, it no doubt exists at some level. For example, the electromyographic activity of the levator veli palatini changes for speech tasks between the upright and supine position (Moon & Canady, 1995). The difference in activity levels suggests that the body is aware of a difference in position and is making adjustments to compensate for gravity (Moon & Canady, 1995). The activity level of the levator veli palatini is also influenced by changes in air pressure and the wearing of palatal speech appliances (Ruscello, 2007; Tachimura, Hara, & Wada, 1995; Tachimura, Nohara, Fujita, Hara, & Wada, 2001; Tachimura, Nohara, Hara, & Wada, 1999; Tachimura, Nohara, & Wada, 2000).

When the work towards the present thesis began, the position taken regarding the proprioception of the velopharyngeal sphincter was that there was a lack of, or only rudimentary, proprioception. Since the articles (Chapters 2, 3 and 4) were published, this position requires clarification for the reasons outlined in the paragraphs above. For the present thesis, statements regarding the proprioception of the velopharyngeal sphincter refer to conscious proprioception. Although both facial and some velopharyngeal muscles lack muscle spindles (Stål & Lindman, 2000), the difference is that a smile can be seen, but the velopharyngeal sphincter is hidden from view. Shelton et al. (1970) and clinical experience have shown that it is difficult to make velopharyngeal proprioception conscious for the speaker. While there are likely to be individual differences, the velopharyngeal sphincter does not “lack” proprioception; rather, its proprioception is difficult to access consciously.

The control of velopharyngeal movement and oral-nasal balance in speech remains incompletely understood. One of the objectives of this thesis is to explore the role of auditory feedback for velopharyngeal control and oral-nasal balance in speech. Auditory feedback can play an important role in the control of speech production. This becomes quite evident when auditory feedback is altered in speech perturbation experiments (Elman, 1981; Houde & Jordan, 1998; Lane & Tranel, 1971; Larson, Burnett, Kiran, & Hain, 2000; Purcell & Munhall, 2006; Siegel & Pick, 1974). The altered auditory feedback studies cited above have contributed to the development of neurologically inspired models of speech production. To date, the DIVA model is the most detailed and tested of these models (Guenther, 2014). This model consists principally of a feedforward and a feedback control system (Guenther, 2006; Tourville & Guenther, 2011). The feedforward system sends motor commands for the intended speech sound, and the feedback system monitors that production and generates corrective commands as required. The feedback system consists of the auditory and the somatosensory subsystems. As an example of the importance of somatosensory feedback, Nasir and Ostry (2006) found that participants will compensate for lateral changes in jaw motion (somatosensory feedback) even though the altered jaw motion did not change the acoustic output. The model continues to evolve and expand as more experimental data become available. Although the DIVA model is quite detailed, its articulator position map does not yet account for velopharyngeal movement. The current articulator position map […] “consists of 10 pairs of antagonistic cells that correspond to parameters of the Maeda vocal tract that determine lip protrusion, upper and lower lip height, jaw height, tongue height, tongue shape, tongue body position, tongue tip location, larynx height, and glottal opening and pressure” (Tourville & Guenther, 2011, p. 964).
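The general idea of a feedforward command that is corrected by auditory feedback can be illustrated with a toy simulation. This is only a sketch and not the DIVA model: the function name `simulate_compensation`, the parameter `feedback_gain`, the numeric values, and the assumption that the speaker eventually compensates fully for the perturbation are all hypothetical simplifications introduced for illustration.

```python
def simulate_compensation(target, perturbation, feedback_gain=0.5, n_trials=20):
    """Toy feedforward/feedback loop under altered auditory feedback.

    target       -- intended value of some speech parameter (arbitrary units)
    perturbation -- constant shift added to what the speaker hears
    Returns the list of produced values, one per trial.
    """
    command = target  # the feedforward command starts at the intended target
    produced_history = []
    for _ in range(n_trials):
        produced = command                 # actual production on this trial
        heard = produced + perturbation    # altered auditory feedback
        error = target - heard             # mismatch between intended and heard
        command += feedback_gain * error   # corrective update for the next trial
        produced_history.append(produced)
    return produced_history


# With feedback shifted upward by 10 units, production drifts downward
# from the target of 20 and settles near 10, opposing the perturbation.
history = simulate_compensation(target=20.0, perturbation=10.0)
print(history[0], round(history[-1], 2))
```

In this sketch the produced value moves in the direction opposite to the perturbation. Real speakers may compensate only partially, so the convergence shown here should not be read as a prediction.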

The thesis raises the question of whether oral-nasal balance is influenced by the auditory feedback subsystem in the same way that speakers respond to changes in loudness, pitch and vowel formants in perturbation studies. The role of the auditory feedback subsystem in the control of oral-nasal balance is explored in Chapter 3.

1.1.5 Velopharyngeal closure

In their description of various sound productions, introductory phonetic textbooks will often provide an image of a mid-sagittal section of the vocal tract. The diagrams for oral sounds all show the velum raised and flush with the posterior pharyngeal wall. When the sound is nasal, the diagram will show the velum lowered and a gap for air and sound to travel to the nasal cavities.

These diagrams give the impression that the velum operates like a trap door and that nasality is a binary (+/-, on/off) articulatory feature. Yet, individuals may use four² quite different velopharyngeal closure patterns (Croft, Shprintzen, & Rakoff, 1981; Finkelstein et al., 1993).

² Finkelstein et al. (1993) proposed a fifth closure pattern, “coronal with marked movement of the lateral pharyngeal walls”, to account for the continuum between coronal and circular closure patterns.

About half of speakers achieve coronal closure by elevating the velum and stretching it to reach the posterior pharyngeal wall, which is the pattern of closure typically shown in phonetics textbooks. In contrast, the sagittal closure pattern consists almost entirely of lateral wall movement with minimal velar movement. In the third pattern, circular closure, many speakers elevate the velum and add substantial movement of the lateral walls.

Finally, in the fourth pattern, the circular closure is accompanied by a hypertrophic eminence on the posterior pharyngeal wall; this is known as circular closure with Passavant’s ridge. Circular closure with Passavant’s ridge is typically observed in speakers with cleft palate. Finkelstein et al. (1993) suggested that the variety of closure patterns is due to individual differences in the orientation of the velopharyngeal musculature.

When describing oral sound production in normal speakers, the term “velopharyngeal closure” is not entirely accurate, as it is often “velopharyngeal near-closure”. In a cinefluorographic study of ten normal speakers, Moll (1962) found that small gaps remained in 30% of the productions of isolated vowels and in 13-15% of the productions of vowels in an oral CVC context. For the sake of simplicity, throughout this thesis, the term “velopharyngeal closure” will also imply “near-closure”. In addition, the size of the velopharyngeal gap may not be directly related to the perception of nasality (Jones & Seaver, 2009). Rather, “acoustic studies of (Fant, 1960; House & Stevens, 1956) have demonstrated that nasality is related to the acoustic impedance of the nasal cavity relative to the acoustic impedance of the oral cavity” (Jones & Seaver, 2009, p. 231).

1.1.6 The nose and nasal cavities

The nose is most important for respiration and smell but also serves as a resonating chamber for nasal and nasalized sounds. The nasal cavities are divided anteriorly by the cartilage of the nasal septum, and posteriorly by the ethmoid bone above and the vomer bone below (Hixon et al., 2008). The floor of the nasal cavity consists of the hard palate anteriorly and the soft palate posteriorly. Each nasal cavity has three rows of scrolled tissue, the conchae or turbinates (superior, middle and inferior), which are lined with olfactory and respiratory mucosa. Although the external nasal muscles are active during speech production (Lansing, Pearl-Solomon, Kossev, & Andersen, 1991), the primary role of the nasal (and paranasal) cavities in speech is to act as a resonating chamber for nasal sounds (Hixon et al., 2008).

1.2 Disordered oral-nasal balance

Oral-nasal balance is determined by the coupling or separation of sound between the oral and nasal cavities. If the mechanism is disrupted, this will result in disordered oral-nasal balance.

Such disorders are particularly prevalent in children born with cleft lip and/or palate. The incidence of cleft lip and palate in North America is estimated at about 1 in 800 live births.

Clefts of the lip and palate are congenital anomalies due to incomplete fusion of the embryologic processes between the 5th and 10th weeks of gestation (Zajac & Vallino, 2017). When there is a complete cleft of the palate, there is no separation between the oral and nasal cavities. In addition, because the hard and soft palate did not fuse properly, the muscles of the velopharyngeal sphincter are not attached to their intended sites in the velum. For example, the levator veli palatini muscles, which would normally meet medially to form a sling to elevate the velum, are instead attached to the posterior edges of the hard palate (Perry & Zajac, 2017). The nasal airway may also be affected. Clefts of the lip and alveolus can be unilateral or bilateral.

Complete clefts of the lip and alveolus will alter the skeletal base of the nose. A repaired bilateral cleft lip often leads to a flatter nose with a shorter columella and nasal obstruction. A unilateral cleft lip and alveolus causes the septum to deviate and collapse the nose towards the side with the cleft, also resulting in a reduced nasal air passage (Aras, Olmez, & Dogan, 2012; Trindade, Gomes, Fernandes, Trindade, & Silva Filho, 2015).

Disorders of oral-nasal balance include hypernasality (too much nasality), hyponasality (lack of nasality) and mixed nasality (hyper-hyponasality) (de Boer & Bressmann, 2015, 2016a). The structural causes of hypernasality include not only cleft palate, but also oronasal fistulae, congenitally short velums and lesions from oral cancers or traumatic injury (Kummer, 2008, 2011; Kummer et al., 2012). Neurological disorders that impact the coordination of the velopharyngeal sphincter can also lead to hypernasality (Kummer, 2011; Kummer et al., 2012).

Among individuals with severe hearing impairment (and thus a lack of auditory feedback), hypernasal and/or hyponasal speech may result (Kim et al., 2012; Kummer, 2008; Ysunza & Vazquez, 1993). For example, Ysunza and Vazquez (1993) confirmed with electromyography that their severely hearing-impaired participants had normal velopharyngeal muscle activity for non-speech activities, but that the velopharyngeal coordination during speech tasks was poor.

While nasal congestion (rhinitis) causes temporary hyponasality, chronic hyponasality is almost always due to an anatomical blockage (Kummer, 2011). A posterior blockage prevents air and sound from reaching the nasal cavities. Causes of posterior blockage include hypertrophic tonsils or adenoids, choanal stenosis, and prosthetic speech appliances (D’Antonio & Scherer, 2009; Karnell, Hansen, Hardy, Lavelle, & Markt, 2004; Kummer, 2011).

With an anterior blockage, sound and air can reach the nasal cavity, but very little of it can project to reach the ear of the listener. Anterior blockage can be caused by hypertrophic turbinates, a deviated septum, stenotic nares, and maxillary retrusion (Kummer, 2011).

For mixed nasality (hyper-hyponasality), the patient will present with both hypernasality and hyponasality due to a combination of causes from those listed above (Kummer, 2011; McWilliams et al., 1990; Peterson-Falzone et al., 2001). A typical scenario is that of a speaker with a unilateral cleft lip and palate who has both dysfunction of the velopharyngeal sphincter (leading to hypernasality) and a deviated septum (leading to hyponasality) (Kummer, 2008, 2011; Peterson-Falzone et al., 2001).

Two other types of “resonance disorders” that appear in textbooks (Kummer, 2008; McWilliams et al., 1990; Peterson-Falzone et al., 2001; Peterson-Falzone, Trost-Cardamone, Karnell, & Hardin-Jones, 2006; Zajac & Vallino, 2017) are worth mentioning, namely denasality and cul-de-sac resonance. Hyponasality can be further qualified as “denasality” when the blockage is complete and no sound is heard to escape the nasal cavities (Kummer, 2008; Peterson-Falzone et al., 2006). Based on the construct of oral-nasal balance adopted in this thesis, “denasality” would simply be severe hyponasality (de Boer & Bressmann, 2015). “Cul-de-sac resonance” is often associated with hyponasality and/or mixed nasality (Kummer, 2008, 2011; McWilliams et al., 1990; Peterson-Falzone et al., 2001). It refers to a muffled sound quality that occurs when sound is trapped in a blind pouch. However, “cul-de-sac resonance” may denote an oral, pharyngeal (Kummer, 2011) or nasopharyngeal (Zajac & Vallino, 2017) sound quality. Beyond the perceptually muffled sound quality, there is probably insufficient agreement among clinicians regarding the term “cul-de-sac resonance” (de Boer & Bressmann, 2015; Zajac & Vallino, 2017).

De Boer and Bressmann (2015) argue that cul-de-sac is a perceptual quality that falls outside the construct of (measurable) oral-nasal balance disorders and, when observed, is probably best documented as an additional perceptual feature.

1.3 Assessment of disorders of oral-nasal balance

1.3.1 Perceptual

According to Kuehn and Moller (2000), the gold standard for the assessment of speech is the experienced clinician’s trained ear. Since perceptual judgements are subjective, multiple speech assessment protocols have been developed in an effort to standardize the assessments between individual speech-language pathologists. English-language assessment protocols include the Great Ormond Street Speech Assessment (Sell, Harding, & Grunwell, 1994, 1999), the Universal Parameters (Henningsson et al., 2008) and the Cleft Audit Protocol for Speech-Augmented (John, Sell, Sweeney, Harding-Bell, & Williams, 2006). These protocols involve listening to the patient’s spontaneous speech as well as repetitions of standardized sentences and words in order to tease out the presence of hypernasality, hyponasality, nasal air emission, and compensatory articulations. Typically, oral sentences will be used to assess hypernasality and nasal sentences to assess hyponasality (Dalston et al., 1991a, 1991b). Another phonetic feature to consider is vowel height since hypernasality is perceived more readily on high vowels than low vowels (Andrews & Rutherford, 1972; Kummer, 2008, 2011; Schwartz, 1968). Speakers with hypernasality will have difficulty creating enough intra-oral pressure for high pressure consonants, often leading to compensatory articulations and nasal air emissions (Karnell, 1995; Kummer, 2011). Therefore, the oral stimuli may be controlled for low and high pressure consonant sounds. The severity of the hypernasality is typically rated on 4- to 7-point equal-appearing interval scales, and hyponasality is rated on binary or 3-point scales (Henningsson et al., 2008; John et al., 2006; Sell et al., 1999). The Cleft Audit Protocol for Speech-Augmented has been tested for reliability and, with training, an acceptable level of agreement has been found for most parameters (Chapman et al., 2016; John, 2006; Sell et al., 2009). At this time in North America, none of the speech assessment protocols are used systematically (Gart & Gosain, 2014), and many clinics continue to use their own non-validated protocols (Kummer, Clark, Redle, Thomsen, & Billmire, 2012; Zajac & Vallino, 2017).

Despite being advocated as the gold standard (Kuehn & Moller, 2000), perceptual judgements leave much to be desired. Inter- and intra-rater reliability ranges from poor to excellent depending on the agreement criteria (Baylis, Chapman, & Whitehill, 2015; Brunnegård, Lohmander, & van Doorn, 2012; Chapman et al., 2016; Keuning, Wieneke, Wijngaarden, & Dejonckere, 2002; Lewis, Watterson, & Houghton, 2003; Whitehill & Lee, 2008). Attempts have been made to improve rating scales with direct magnitude estimation (Brunnegård et al., 2012; Whitehill, Lee, & Chun, 2002; Zraick & Liss, 2000), but this has not been shown to markedly increase inter- and intra-rater reliability (Brancamp, Lewis, & Watterson, 2010; Brunnegård et al., 2012). Other measures of diagnostic efficacy such as sensitivity or specificity have not been calculated in these studies. In current clinical practice, the assessment of oral-nasal balance disorders is often supplemented with the instrumental measures described below (section 1.3.2) (Stelck, Boliek, Hagler, & Rieger, 2011).

1.3.2 Instrumental assessment: Visual assessment of velopharyngeal function

The most common instrumental methods of assessing velopharyngeal function are videofluoroscopy, nasoendoscopy and nasometry (Kuehn & Moller, 2000). Videofluoroscopy and nasoendoscopy both allow direct visualization of the velopharyngeal sphincter during speech.

Nasometry provides an indirect acoustic-based measure of oral-nasal balance and is described in section 1.3.3.

Multiview videofluoroscopy is a radiographic method, sometimes supplemented with barium for improved visualisation (Hinton, 2009). In videofluoroscopy, three views are commonly interpreted together to arrive at a complete assessment of structure and function. The lateral view shows anterior-posterior movement and allows visualisation of the velum and posterior pharyngeal wall, the frontal view shows lateral pharyngeal wall movement, and the Townes (or base) view allows visualisation of the horizontal velopharyngeal closure pattern. Measurements can be made on the images to calculate the ratio of velopharyngeal closure and the size of the gap (Golding-Kushner et al., 1990). Due to the exposure to radiation, the examination is usually limited to two minutes (Hinton, 2009). Multiview videofluoroscopy is often the instrument of choice for younger patients or when a more detailed image of the pharynx is needed than can be provided with nasoendoscopy (Karnell, 2011).

In nasoendoscopy, a flexible fiberoptic endoscope is inserted into the nostril to view the velopharyngeal sphincter from above (Zajac & Vallino, 2017). As there is no exposure to radiation, the procedure can be used for longer periods of time and more frequently (Whitehill & Kim, 2008). Although cooperation from younger children can be problematic (Kuehn & Moller, 2000; Moon, 2009), various clinical strategies have been developed to help the children overcome their fears (Kummer, 2008; Zajac & Vallino, 2017). To date, nasoendoscopy is the most commonly used visualization method to diagnose an occult submucous cleft (Kummer, 2008; Hinton, 2009; Zajac & Vallino, 2017). Magnetic Resonance Imaging has also shown great promise for the static and dynamic assessment of velopharyngeal structure and function, but the equipment is very expensive and the technique is still considered experimental (Perry, Sutton, Kuehn, & Gamage, 2014).

1.3.3 Instrumental assessment: Acoustic measurement of oral-nasal balance

To corroborate their perceptual judgements of oral-nasal balance, clinicians will often use acoustic measurement. Nasometry instruments, such as the Nasometer (Kay Pentax, Montvale, NJ), provide a quantitative acoustic assessment of oral-nasal balance. The Nasometer consists of a nasal and an oral microphone mounted to the topside and underside of a metal sound separator plate. A headset secures the placement of the plate between the nose and the upper lip. The accompanying software then computes a nasalance score as the ratio of the nasal sound pressure level to the combined oral and nasal sound pressure levels (% nasalance = nasal / (oral + nasal) × 100) (Fletcher, 1976). Normative scores are available for oral and nasal stimuli in many languages (Kummer, 2008; Zajac & Vallino, 2017).

Higher scores for oral stimuli are associated with hypernasality and lower scores for nasal-loaded stimuli are associated with hyponasality (Dalston, Warren, & Dalston, 1991a, 1991b; Hardin, Van Demark, Morris, & Payne, 1992). The nasalance scores provide a quantitative measure for tracking patients over time (Peterson-Falzone et al., 2001) and can assist in assessment and treatment decisions (Kummer et al., 2012). The primary advantages of the Nasometer are that it is non-invasive and does not require specialized training. However, acquiring a Nasometer can be expensive and, as the velopharyngeal sphincter is not seen, its function can only be inferred (Zajac & Vallino, 2017).
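As an illustration of the nasalance formula given above, the calculation can be sketched in a few lines of Python. This is a minimal sketch and not the Nasometer's actual signal processing: the function name, the assumption that each microphone channel has already been reduced to a single non-negative level, and the example values are all hypothetical.

```python
def nasalance_percent(nasal_level: float, oral_level: float) -> float:
    """Return % nasalance = nasal / (oral + nasal) * 100.

    Both inputs are assumed to be non-negative levels, one per
    microphone channel, measured over the same stretch of speech.
    """
    if nasal_level < 0 or oral_level < 0:
        raise ValueError("levels must be non-negative")
    total = nasal_level + oral_level
    if total == 0:
        raise ValueError("combined oral and nasal level must be positive")
    return 100.0 * nasal_level / total


# Hypothetical channel levels, chosen only for illustration.
mostly_oral = nasalance_percent(nasal_level=1.0, oral_level=3.0)
mostly_nasal = nasalance_percent(nasal_level=3.0, oral_level=1.0)
print(mostly_oral, mostly_nasal)
```

Because the nasal level appears in both the numerator and the denominator, the score is bounded between 0% (no nasal energy) and 100% (no oral energy).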

Since the experienced listener’s trained ear is considered the gold standard (Kuehn & Moller, 2000), many studies have compared nasalance scores to listeners’ perceptual ratings. Simple relationships between nasality ratings and nasalance scores have been assessed by calculating cutoff scores for perceived hypernasality (Dalston et al., 1991a, 1993; Hardin et al., 1992) and hyponasality (Dalston et al., 1991b; Hardin et al., 1992). The correlations between nasalance scores and perceptual measures of hypernasality severity range from poor to strong (Bressmann, Klaiman, & Fischbach, 2006; Brunnegård et al., 2012; Dalston et al., 1991a, 1993; Hardin et al., 1992; Karnell et al., 2004; Keuning et al., 2002; Nellis, Neiman, & Lehman, 1992; Sweeney & Sell, 2008). In de Boer and Bressmann (2015), it was proposed that at least a part of the frequent disagreement between listeners and nasalance scores may be attributed to mixed nasality. To date, the nasalance scores of only two individuals identified as having cul-de-sac (Kummer, Billmire, & Myer, 1993; Van Lierde et al., 2011) and only two identified as having mixed nasality (both hyper- and hyponasal) (Karnell et al., 2004) have been published in the literature.

It has long been recognized that the instrumental assessment of mixed forms of nasality is “long overdue” (McWilliams et al., 1990; Peterson-Falzone et al., 2001).

In de Boer and Bressmann (2015), it was hypothesized that the formulas derived from a linear discriminant analysis would perform better than chance in predicting the oral-nasal balance conditions based on nasalance scores. Nasometric recordings were made of eleven female participants as they read oral and nasal speech stimuli. The nasalance scores of their normal speech and their simulations of hyponasal, hypernasal and mixed nasality were analyzed. (Two recordings were made for the hyponasal and mixed nasality conditions in order to assess the impact of right and left nasal obstruction.) A repeated measures Analysis of Variance revealed an oral-nasal balance condition-stimuli interaction effect (p < .001). A linear discriminant analysis of the participants’ nasalance scores led to formulas correctly classifying 64.4% of the six initial oral-nasal balance conditions. When the hyponasal and mixed nasality conditions with obstruction of the less patent nostril were removed from the analysis, the resultant formulas correctly classified 88.6% of the remaining four oral-nasal balance conditions. The results demonstrated the potential of this approach for the assessment of oral-nasal balance disorders.
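To put the classification rates above in context, the chance level for each analysis can be worked out as a short calculation. This is a sketch under a stated assumption that is not taken from the study itself: it treats the conditions as equally represented, so that guessing yields 1/k correct for k conditions. The function name is hypothetical.

```python
def chance_level_percent(n_conditions: int) -> float:
    """Chance accuracy in percent, assuming equally represented conditions."""
    if n_conditions < 1:
        raise ValueError("need at least one condition")
    return 100.0 / n_conditions


# Reported classification rates from the text: (number of conditions, % correct).
reported = [(6, 64.4), (4, 88.6)]

for n_conditions, accuracy in reported:
    chance = chance_level_percent(n_conditions)
    # Margin above chance in percentage points (balanced-design assumption).
    print(f"{n_conditions} conditions: chance {chance:.1f}%, "
          f"reported {accuracy:.1f}%, margin {accuracy - chance:.1f} points")
```

Under this assumption, both reported rates lie well above the corresponding chance levels, consistent with the hypothesis stated at the start of the paragraph.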

There is a limited understanding of the acoustic aspects of the speech signal that correspond to the perceptual impression of an oral-nasal balance disorder. Attempts have been made to quantify hypernasality based on spectral features, for example, third-octave band spectral analysis³ (Kataoka, Michi, Okabe, Miura, & Yoshida, 1996; Lee, Ciocca, & Whitehill, 2004), but none are commonly used in the clinical setting. Furthermore, virtually nothing is known of the acoustic features of hyponasality and mixed nasality. If the spectrographic characteristics of the different disorders could be more clearly defined, this could lead to the development of new assessment tools based on a good quality single-microphone audio recording and acoustic analysis software (some of which can be downloaded for free). This would be especially beneficial to clinicians who do not have access to a nasometer. The study presented in Chapter 2 (de Boer & Bressmann, 2016a) analyzed the acoustic recordings of the simulated oral-nasal balance disorders from the de Boer and Bressmann (2015) study using Long-Term Averaged Spectra (LTAS). Linear discriminant analysis then classified each recording as normal, hypernasal, hyponasal or mixed nasality.

³ Earlier spectrographic instruments analyzed sound in octave or third-octave frequency bandwidths. For octave bands, the lower frequency is half the upper frequency. When smaller bandwidths were required, the one-third octave bands were used. Kataoka et al. (1996) selected the third-octave bandwidth because it “compared well with the critical bandwidth of the analyzing mechanism utilized by the ear (Pols, van der Kamp & Plomp, 1969)” (p. 2181).
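The octave and third-octave bandwidths mentioned in the footnote on spectral analysis can be made concrete with a short calculation. This is a minimal sketch that assumes the base-2 definition of fractional-octave bands (band edges placed symmetrically around the centre frequency on a logarithmic scale); the function name and the 1000 Hz centre frequency are hypothetical choices for illustration and are not taken from the cited studies.

```python
def band_edges(center_hz: float, fraction: int = 3) -> tuple[float, float]:
    """Lower and upper edges of a 1/fraction-octave band around center_hz.

    Assumes the base-2 definition: upper / lower = 2 ** (1 / fraction).
    """
    if center_hz <= 0 or fraction < 1:
        raise ValueError("center_hz must be positive and fraction >= 1")
    half_step = 2.0 ** (1.0 / (2 * fraction))
    return center_hz / half_step, center_hz * half_step


# Full octave band: the lower edge is half the upper edge.
octave_low, octave_high = band_edges(1000.0, fraction=1)

# Third-octave band: a narrower band around the same centre frequency.
third_low, third_high = band_edges(1000.0, fraction=3)

print(f"octave:       {octave_low:.1f} to {octave_high:.1f} Hz")
print(f"third octave: {third_low:.1f} to {third_high:.1f} Hz")
```

Three adjacent third-octave bands span one full octave, which is why the narrower bands offer finer spectral resolution over the same frequency range.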

1.4 Treatment of oral-nasal balance disorders

Hypernasality can be treated with surgery, prosthodontic devices and/or behavioral speech therapy. As hyponasality is almost always due to a blockage, the treatment is typically surgical (Kuehn & Moller, 2000).

1.4.1 Surgery

Children born with a cleft of the palate will typically undergo primary palatal repair between 8 and 14 months of age (Zajac & Vallino, 2017). About 20-30% of patients will continue to have some degree of hypernasality after the primary surgery (Billmire, 2008; Zajac & Vallino, 2017). The continued hypernasality is due to inadequate lengthening of the velum, or improperly positioned and/or scarred levator veli palatini muscles (Chen, Wu, Chen, & Noordhoff, 1994; Gart & Gosain, 2014). Secondary surgeries for the remediation of hypernasality include Furlow palatoplasty or double opposing Z-plasty to lengthen the velum, and pharyngeal flap or sphincter pharyngoplasty to reduce the size of the opening of the nasopharynx (Gart & Gosain, 2014). In a randomized trial comparing the success of pharyngeal flap and sphincter pharyngoplasty, Åbyholm et al. (2005) found the procedures to be equivalent, both in reducing nasalance scores and the perception of hypernasality about 85% of the time. The specific surgery recommended will depend on the velopharyngeal closure pattern and size of the patient’s velopharyngeal gap (Gart & Gosain, 2014).

1.4.2 Prosthetics

When hypernasality persists after secondary surgery, or surgery is not feasible, a prosthetic may be recommended. The three major types of speech prostheses are palatal obturators, speech-bulbs and palatal lifts (Kuehn & Moller, 2000; Reisberg, 2000). Palatal obturators are used to cover oro-nasal fistulae in the hard palate. A speech-bulb prosthesis (or palatopharyngeal obturator) has a bulb extension which sits inside the velopharyngeal gap and reduces the escape of air and acoustic energy into the nasal cavities. It is used where there is a short velum and a deep pharynx (Kuehn & Moller, 2000; Reisberg, 2000). A palatal lift appliance is used to elevate the velum into the velopharyngeal gap when the velum is sufficiently long but neurologically incompetent (Kuehn & Moller, 2000) or lacking mobility (Reisberg, 2000). Speech bulbs and palatal lifts can be uncomfortable to wear due to their tendency to elicit the gag reflex (Kummer, 2008). The success of prostheses in treating hypernasality is variable (Karnell et al., 2004).

An exotic alternative treatment option is a nasal obturator with a one-way valve, allowing for inhalation, but not exhalation (Beukelman, Fager, Green, Hakel, & Marshall, 2004; Suwaki, Nanba, Ito, Kumakura, & Minagi, 2008). Nasal obturators can modestly increase speech intelligibility (Suwaki et al., 2008). As may be expected, devices that treat hypernasality by restricting airflow into or out of the nasal cavities can also lead to hyponasality (Karnell et al., 2004; Suwaki et al., 2008).

1.4.3 Speech therapy

At least half of children with a repaired cleft palate will go on to receive speech therapy, mostly for articulation disorders (Hardin-Jones & Jones, 2005; Peterson-Falzone et al., 2001). If velopharyngeal closure cannot be achieved, the sounds that require high intra-oral pressure (plosives, fricatives, affricates) will be compromised (Peterson-Falzone et al., 2001). Children with cleft palate have a tendency to develop compensatory articulations as substitutes for the high pressure sounds (Kuehn & Moller, 2000). The compensatory articulations are often placed posteriorly to the intended sound (Ruscello, 2007). A classic example is the use of a glottal stop for other stop consonants (Kuehn & Moller, 2000). The Speech-Language Pathologist (SLP) will work with the client to correct placement of the articulators, while the velopharyngeal closure is typically addressed with surgery or prosthesis (Kummer, 2008; Peterson-Falzone et al., 2001; Zajac & Vallino, 2017).

The efficacy of speech therapy techniques for hypernasality itself will depend on the degree of velopharyngeal closure the patient can achieve. When there is substantial velopharyngeal insufficiency, there is very little to be gained from speech therapy (Kummer, 2008; Peterson-Falzone et al., 2001; Zajac & Vallino, 2017). Even when closure or near closure can be achieved, patients may not be able to benefit from speech therapy. The best candidates for behavioral therapy will have inconsistent mild to moderate hypernasality (Ruscello, 2007; Zajac & Vallino, 2017). In addition, other potential causes of hypernasality, such as oro-nasal fistulae, should be ruled out. However, the range of diagnostic therapeutic techniques is limited. Historic therapeutic activities, such as blowing and sucking, were designed to improve velopharyngeal muscle strength, but have not shown any utility for speech (Powers & Starr, 1974; Ruscello, 1982, 2007). Despite the lack of evidence, some SLPs continue to recommend non-speech oral-motor tasks such as blowing and sucking. For this reason, professional organizations such as the American Speech-Language-Hearing Association are aiming to educate SLPs about the general futility of blowing and sucking exercises for oral-nasal balance in speech (Zajac & Vallino, 2017).

Two treatment modalities that have shown promise are Continuous Positive Airway Pressure (CPAP) and biofeedback techniques (Ruscello, 2007). CPAP was designed to treat obstructive sleep apnea by applying air pressure through the nasal cavity onto the velum. For an individual with sleep apnea, it keeps the nasal airway open. In case studies, CPAP was shown to improve velopharyngeal muscle strength during speech exercises for individuals with normal speech (Kuehn, Moon, & Folkins, 1993), hypernasality (various aetiologies) (Kuehn, 1991) and hypernasality due to cleft palate (Kuehn et al., 1993). When CPAP is used in conjunction with speech, the levator veli palatini muscle must work harder to produce oral sounds (Liss, Kuehn, & Kinkle, 1994). Kuehn et al. (2002) subsequently conducted a multi-center study with 43 participants with cleft palate and a clinical diagnosis of hypernasality (various degrees of severity). After eight weeks of CPAP therapy there was a small but significant decrease in the severity of perceptual ratings of hypernasality, but there was no significant change in nasalance scores.

Since the velopharyngeal sphincter is hidden from view, a number of biofeedback techniques have been developed to help the patient become aware of when the velopharyngeal port is open or closed during speech. One of the simplest techniques is to hold a small mirror beneath the nostrils.

Accumulation of condensation on the mirror during oral speech sounds would indicate nasal airflow and suggest an open velopharyngeal port (Kummer, 2008; Sell & Grunwell, 2001).

Alternatively, a clinician may use the SeeScape lung function trainer (Kummer, 2008). A flexible tube with a nasal olive at one end is attached to a rigid clear vertical tube containing a styrofoam ball (Kummer, 2008; Sell & Grunwell, 2001). Movement of the styrofoam ball in the cylinder of the SeeScape indicates nasal airflow and an open velopharyngeal port. In a very small study, four participants with cleft palate did speech exercises with the precursor to the SeeScape, a “scape-scope” (Shprintzen, McCall, & Skolnick, 1975). At the conclusion of the program, two had achieved velopharyngeal closure, as confirmed with videofluoroscopy and perceptual nasality ratings, one achieved inconsistent closure, while the last participant did not complete the therapy sessions.

For clinicians with access to nasometry, many textbooks will recommend it be used for therapy (Kummer, 2008; Peterson-Falzone et al., 2001; Zajac & Vallino, 2017). The Nasometer and other nasometry systems provide not only a quantitative nasalance score, but also a real-time “nasalance trace”. The precursor to the Nasometer, the TONAR II, was initially designed to be a biofeedback tool (Fletcher, 1972). In therapy studies, Fletcher (1972, 1978) reported that about half the participants were able to substantially decrease their nasalance scores after a series of biofeedback sessions with the TONAR II. The current model Nasometer 6450 includes biofeedback games and multiple nasalance trace display options, but to date, no studies replicating Fletcher’s (1972, 1978) original studies have been published in a scientific journal.

Nasoendoscopy enables direct visualization of the velopharyngeal sphincter. A number of studies report that nasoendoscopy-based biofeedback speech therapy helps patients improve their velopharyngeal closure for speech. A case study by Witzel, Tobe and Salyer (1988) was one of the first to highlight the potential of nasoendoscopy. The authors reported that a ten-year-old girl with phoneme-specific velopharyngeal dysfunction (VPD) was successfully treated in one session. Brunner, Stellzig-Eisenhauer, Pröschel, Verres, and Komposch (2005) used nasoendoscopy to treat 11 clients with VPD. They participated in up to 16 sessions and were treated for up to eight phonemes. The average rate of velopharyngeal closure increased from 5% pre-treatment to 91% post-treatment and this improvement was maintained at 86% at the six-month follow-up assessment (Brunner et al., 2005). In a small randomized controlled trial, Ysunza, Pamplona, Femat, Mayer and Garcia-Velasco (1997) used nasoendoscopy to correct negative movements of the lateral pharyngeal walls (NMLPW), a widening of the velopharyngeal port, during speech. The 17 patients with cleft palate had VPD and NMLPW. Nine participants were assigned to a control group given conventional therapy and eight participants were assigned to an experimental group and given conventional therapy supplemented with nasoendoscopic visual feedback. One of the nine in the control group was able to modify NMLPW after 12 weeks of therapy, while all eight receiving visual feedback had learned to correct their NMLPW. When the control group was subsequently given the visual feedback treatment, their NMLPW was also corrected. Despite the promise of direct visual biofeedback, many SLPs outside a hospital setting do not have access to a nasoendoscope (Stelck et al., 2011), in part due to high costs.

As Fridland states, “[…] in order to learn an embodied skill we must go from a stage of explicit proprioceptive representation to a stage where one’s proprioceptive awareness becomes recessive” (Fridland, 2011, p. 537). Since the velopharyngeal sphincter is hidden from view, consciously linking proprioception to its movements is a challenge. Thus, correcting hypernasality with traditional speech therapy is difficult. However, it may be possible to influence oral-nasal balance in speech by changing global vocal tract settings, such as the voice focus (Boone, McFarlane, Von Berg, & Zraick, 2010; Kummer, 2008). The concept of voice focus describes how the settings of the larynx, pharynx and tongue influence the perceived sound of the voice (Boone, 1997). Physiologically, a forward focus is due to a shortened and narrowed vocal tract, achieved by raising the larynx together with a forward tongue carriage and a narrowed pharynx. The resulting sound is described as thin and juvenile (Boone et al., 2010). A backward focus comes from a lengthened and widened vocal tract, achieved by lowering the larynx together with a posterior tongue carriage and an expanded pharynx. The backed sound is described as a dark and throaty ‘country bumpkin voice’ (Titze, 1994). Altering the length of the vocal tract, by raising or lowering the larynx, has been shown to change vowel formants and the characteristics of the long-term averaged spectrum (Sundberg & Nordström, 1976). Yet, the specific impact of forward or backward voice focus on oral-nasal balance in speech has not been investigated to date. The voice focus and the acoustics of speech are the result of vocal tract settings, such as lip rounding, tongue position, mandibular opening, pharyngeal width and larynx height (Laver, 1980). Vocal tract settings are altered and adapted all the time when speaking. This can become particularly noticeable when a speaker switches languages (Gick, Wilson, Koch, & Cook, 2004; Wilson & Gick, 2014), speaks with the mouth full (Mayer, Gick, & Ferch, 2009), or voluntarily contorts the tongue for a specific effect (Bressmann, 2012).

It has been observed that vowel height influences nasalance scores so that stimuli with high front vowels have higher nasalance scores than those with low back vowels (Awan, Omlor, & Watts, 2011; Gildersleeve-Neumann & Dalston, 2001; Kummer, 2005; Lewis, Watterson, & Quint, 2000). The increase in scores for high vowels may be due to transpalatal sound transmission and/or increased resistance to oral airflow (Gildersleeve-Neumann & Dalston, 2001). Computer models suggest that an expanded pharynx and a more anterior tongue position may reduce the perception of hypernasality for the vowel /i/ (Rong & Kuehn, 2012). Conversely, Bressmann, Anderson, Carmichael, and Mellies (2012) described the case of a speaker who reduced her perceived hypernasality and her nasalance scores by speaking with an extreme forward focus (raised larynx and constricted pharynx). Here, a constriction of the pharynx may have improved velopharyngeal closure, leading to the drop in hypernasality and nasalance scores. Chapter 4 describes the impact of global vocal tract settings, or voice focus, on the oral-nasal balance of normal speakers of Brazilian Portuguese (de Boer, Marino, Berti, Fabron, & Bressmann, 2016). The study expanded on a pilot study (de Boer & Bressmann, 2016b) with English speakers. The effect of voice focus on oral-nasal balance was explored with the intent that it could eventually become a therapy technique.

1.5 Thesis objectives

In summary, there are many outstanding issues affecting our ability to manage hypernasality. The present thesis pursued three different lines of investigation related to the assessment, control and remediation of hypernasality. The first problem addressed was the quantitative assessment of hypernasality (and other oral-nasal balance disorders). The perceptual assessment of oral-nasal balance disorders is the domain of speech-language pathologists. However, even experienced clinicians may disagree with themselves or with each other (Brunnegård et al., 2012; Keuning, Wieneke, & Dejonckere, 2004; Whitehill & Lee, 2008). The first objective of this thesis was to investigate a new quantitative acoustic assessment procedure for hypernasality and other oral-nasal balance disorders. This is addressed in Chapter 2. The proposal to improve the assessment involved a linear discriminant analysis (LDA) classification algorithm based on quantitative acoustic measures derived from nasalance and spectrographic measures. This approach was previously tested with nasalance scores of simulated disorders of oral-nasal balance (de Boer & Bressmann, 2015). Chapter 2 describes a subsequent study where an analysis of the Long Term Averaged Spectra of simulated disorders of oral-nasal balance achieved high classification accuracy (de Boer & Bressmann, 2016 a).

The second issue was the question of how oral-nasal balance is controlled in speech. According to current speech models, speech is learned through auditory and somatosensory feedback. The role of auditory feedback has been demonstrated for loudness, pitch and vowel formants (Elman, 1981; Houde & Jordan, 1998; Lane & Tranel, 1971; Larson, Burnett, Kiran, & Hain, 2000; Purcell & Munhall, 2006; Siegel & Pick, 1974). The second objective of this thesis was to begin investigating the role of auditory feedback in the control of oral-nasal balance in speech. This is the topic of Chapter 3.

The final issue addressed here concerned the modification of oral-nasal balance (with the potential of a new speech therapy intervention for hypernasality). Teaching velopharyngeal closure is very difficult. The mechanism is hidden from view; thus, consciously linking proprioception to its movements is a challenge. As an alternative approach that may be useful for speech therapy, it was investigated whether it is possible to facilitate velopharyngeal closure by changing global vocal tract settings, namely by using a forward vs. backward voice focus. The third objective of the thesis was to explore the effect of voice focus adjustments on oral-nasal balance. Pilot work with normal speakers of English (de Boer & Bressmann, 2016 b) suggested that vocal tract adjustments affect nasality. The research has since been replicated and expanded in normal speakers of Brazilian Portuguese (de Boer et al., 2016). This last study is addressed in Chapter 4.


Taken together, the studies aimed to contribute to the long-term goals of advancing the assessment and treatment of hypernasality, as well as our basic understanding of the control of oral-nasal balance in speech.


Chapter 2 - Application of linear discriminant analysis to the Long Term Averaged Spectra of simulated disorders of oral-nasal balance

Contents of this chapter have been published in the Cleft Palate-Craniofacial Journal, Allen Press Publishing Services:

De Boer, G., & Bressmann, T. (2016 a). Application of linear discriminant analysis to the long term averaged spectra of simulated disorders of oral-nasal balance. The Cleft Palate-Craniofacial Journal, 53(5), e163-e171. doi.org/10.1597/14-236

A link to the published paper can be found at http://www.cpcjournal.org/doi/abs/10.1597/14-236?code=acpa-premdev


2.0 Abstract

Objective: Acoustic studies of oral-nasal balance disorders to date have focused on hypernasality. However, in patients with cleft palate, nasal obstruction may also be present, so that hypernasality and hyponasality co-occur. In this study, normal speakers simulated different disorders of oral-nasal balance. Linear discriminant analysis was used to create a tentative diagnostic formula based on the Long Term Averaged Spectra (LTAS) of the speech stimuli.

Materials and methods: Eleven female participants were recorded while reading non-nasal and nasal speech stimuli. LTAS of the recordings were run for their normal oral-nasal balance and their simulations of hyponasal, hypernasal and mixed oral-nasal balance. The amplitude values (in decibels) were extracted in 100 Hz intervals over a range of 4 kHz.

Results: A repeated measures Analysis of Variance of the normalized amplitudes revealed a resonance condition - frequency band amplitude interaction effect (p < .001). A linear discriminant analysis of the participants’ LTAS led to formulas correctly classifying 80.7% of the oral-nasal balance conditions.

Conclusion: The simulations produced distinctive spectra enabling the creation of formulas that predicted the oral-nasal balance above chance level. Future research with speakers with oral-nasal balance disorders will be needed to investigate the potential of this approach for the clinical diagnosis of disorders of oral-nasal balance.

Keywords: hypernasality, hyponasality, mixed nasality, spectrography, acoustics


2.1 Introduction

Disorders of oral-nasal balance, especially hypernasality, affect acceptability as well as intelligibility of speech (Whitehill, Gotze, & Hodge, 2013). Oral-nasal balance disorders are perceptually salient and can be socially stigmatizing (Kummer, 2008). However, the acoustic correlates of the different types of oral-nasal balance disorders are not completely understood.

The present study investigated the application of linear discriminant analysis to long-term average spectra of connected speech in order to develop a tentative algorithm for the classification of disorders of oral-nasal balance based on simulations. In order to set the stage for the presentation of the research, it is necessary to first explain our specific understanding of disorders of oral-nasal balance and their assessment. Then, previous attempts of acoustic spectral analysis of hypernasality and the current new approach will be presented.

Disorders of oral-nasal balance are often called resonance disorders (McWilliams et al., 1990). However, the term resonance disorder can be confusing because vocal tract resonance encompasses many other aspects of speech production above and beyond oral-nasal balance. The term oral-nasal balance better describes the phenomenon of excessive or insufficient sound transmission through the nasal passages (de Boer & Bressmann, 2015; McDonald & Baker, 1951; McWilliams et al., 1990). De Boer and Bressmann (2015) argue that disorders of oral-nasal balance can be subsumed under the categories of hypernasality (excess nasal resonance), hyponasality (reduced nasal resonance) and mixed nasality (Hixon et al., 2008; Kummer, 2008, 2011). Beyond the oral-nasal balance per se, there are additional important perceptual impressions such as cul-de-sac resonance, nasal turbulence, nasal emission and visible facial grimace, which should be documented separately.


While velopharyngeal function can be visualized with videofluoroscopy and nasendoscopy, the diagnostic assessment of oral-nasal balance disorders relies primarily on auditory-perceptual evaluations by speech-language pathologists (Kummer, 2008; Kuehn & Moller, 2000). However, the perceptual measures are subjective and not always reliable within and between observers (Keuning et al., 2004; Whitehill & Lee, 2008). The scaling method for the perceptual assessment can influence outcomes considerably. Baylis et al. (2015) demonstrated how the choice of visual analogue versus equal appearing interval scaling can influence listener agreement. It has been argued that the reliability of these assessments needs to be improved (Kummer et al., 2012).

Many clinicians will supplement their perceptual assessments with nasometry (Kuehn & Moller, 2000). Instruments such as the Nasometer (KayPentax, Montvale, New Jersey) provide a quantitative nasalance score which reflects the proportion of nasal energy in speech. High scores for stimuli with only oral vowels and consonants are associated with hypernasality and low scores for stimuli loaded with nasal consonants are associated with hyponasality (Kummer, 2008; Dalston et al., 1991a, 1991b). The correlation between listener perception and nasalance scores ranges from poor to strong (Bressmann et al., 2006; Brunnegård et al., 2012; Dalston et al., 1991a, 1993; Hardin et al., 1992; Karnell et al., 2004; Keuning et al., 2002; Nellis et al., 1992; Sweeney & Sell, 2008). The variability found has been attributed to listener training (Nellis et al., 1992), listener experience (Brunnegård et al., 2012), as well as language and scales used (Dalston et al., 1993). Nasalance scores of speakers perceived to have mild or moderate hypernasality often overlap with those of speakers with normal resonance (Bressmann et al., 2006; Dalston et al., 1993). To date, the nasometer has been used almost exclusively for the assessment of hypernasality, with only very few studies examining hyponasality (Brunnegård et al., 2012; Dalston et al., 1991b; Dalston & Seaver, 1992; Hardin et al., 1992). De Boer and Bressmann (2015) argued that nasalance values of oral and nasal speech stimuli should be analyzed together because features of both hyper- and hyponasality can be present in the same speaker. In de Boer and Bressmann (2015), normal speakers were recorded with nasometry. The speakers read an oral and a nasal stimulus in their normal voice and while simulating hypernasality, hyponasality, and mixed nasality. For the oral stimulus, mean nasalance scores from the hypernasal and mixed conditions were higher than normal, and scores from the hyponasal condition were lower than normal. For the nasal stimulus, scores from the hypernasal condition were higher than normal, and scores from the mixed and hyponasal conditions were lower than normal. A linear discriminant analysis of the nasalance scores correctly classified 88.6% of the data sets into the four speaking conditions: normal, hyponasal, hypernasal and mixed nasality.

The results from the above nasometric study were promising. Nevertheless, it is important to obtain more information about the frequency characteristics of different oral-nasal balance disorders. The nasometer uses a radical bandpass filter and provides a coarse quantitative assessment of the sound pressure balance between oral and nasal sound transmission (de Boer & Bressmann, 2014). It does not provide information on the frequencies associated with disorders of oral-nasal balance. As a result, we have a limited understanding of the acoustic aspects of the speech signal that correspond to our perceptual impression of an oral-nasal balance disorder. If the spectrographic characteristics of the different disorders could be more clearly defined, this could lead to the development of new assessment tools that would not rely on the expensive nasometer headset and hardware. Instead, it might be possible to use single-microphone recordings, which would enable more clinicians to use acoustic analysis to corroborate their auditory-perceptual analysis of oral-nasal balance disorders.


Several studies have described the acoustic features of nasality using spectrography. During the production of nasal consonants, the oral cavity is closed and sound is transmitted through the nose. The sound resonates at frequencies based on the combined length of the vocal and nasal tracts. In adults, resonance peaks will occur at 250-300 Hz and 800-1000 Hz (Stevens, 1997). There are also anti-resonances, or zeros, from the point of constriction to the velopharyngeal opening. During the production of nasalized vowels, the resonances associated with nasalization are coupled with the oral resonances. The most noticeable spectral feature of a nasalized vowel is a flattened first formant peak (Johnson, 2012; Stevens, 1985, 1997). This is thought to occur because the nasal mucosa dampens the sound, reducing the amplitude, while the additional nasal formants broaden the bandwidth of the first formant (Johnson, 2012; Stevens, 1985).

Chen (1995, 1997) referred to the extra nasal resonances as nasal peaks, labelling the nasal peak below the first formant as P0 and the nasal peak above the first formant as P1. In the spectra of non-nasalized vowels, the first formant has the greatest amplitude, but for a nasalized vowel, the nasal peak P0 below the first formant (near 250 Hz) can be as high as, or higher than, the first formant (Chen, 1997). Chen (1995) examined the amplitude of the first formant peak A1 and the nasal peak P1 between the first and second formant of vowels produced by hearing impaired speakers with hypernasality. The difference in amplitude A1-P1 was highly correlated with the perception of hypernasality, except where the frequencies of the first formant and the nasal peak P1 were close (Chen, 1995).

Using third octave spectral analysis of the vowel /i/, Kataoka et al. (1996) found that the perception of hypernasality in speakers with cleft palate was associated with additional energy between the first and second formants (near 1000 Hz) and a lack of energy between the second and third formants. When Lee, Ciocca and Whitehill (2004) expanded third octave analysis to other vowels, /i/ and /ɔ/ were found to be the most suitable vowels for distinguishing hypernasal from normal oral-nasal balance. Another acoustic approach to measuring hypernasality is the voice low tone high tone ratio (VLHR) (Lee et al., 2009). The VLHR measures the ratio of the energy of vowels at the low end of the spectrum to the energy at the high end. As hypernasality adds lower frequency energy, a higher ratio is meant to reflect hypernasality. The success of this ratio in differentiating between normal and hypernasal speakers has been mixed (Lee et al., 2009; Vogel et al., 2009).

An alternative acoustic approach to the assessment of hypernasality was based on the “cul-de-sac test”, which involves pinching the nares shut with the fingers. When producing oral sounds, the occlusion of the nostrils can change the vowel spectrum if the speaker is hypernasal (Bzoch, 1989; Haapanen, 1991; McWilliams et al., 1990). Using pattern recognition with linear predictive coefficients, Haapanen et al. (1996) showed that the cul-de-sac test with /i/ and /u/ could differentiate hypernasal speakers from normal speakers. An examination of spectral features showed that the nose-pinching decreased the energy of /i/ between 300 Hz and 700 Hz but increased spectral energy between 2000 Hz and 5000 Hz. For /u/, the energy of this upper band was reduced in the pinched condition.

While the above studies have taught us much about the speech acoustics of hypernasality, there are shortcomings that merit additional research. One such shortcoming is the use of sustained vowels (Kataoka et al., 1996, 2001; Lee et al., 2009) or vowel spectra cut from short words or consonant-vowel-consonant sequences (Chen, 1995, 1997; Lee et al., 2004). If speech is assessed based on vowels, syllables or other non-connected speech tasks, this can be problematic because such stimuli may not accurately reflect the speaker’s capabilities or allow the listener to form a valid impression of the patient’s connected speech (Lohmander et al., 2009; Moll, 1964; Weismer, 2006). Indeed, all four listeners in Kataoka et al.’s (2001) study reported that “rating the isolated vowel was more difficult than rating connected or conversational speech because the number of cues was limited” (p. 2182). A measure of oral-nasal balance should correspond to listener impressions, and listener impressions in conversation are based on connected speech. Segmenting vowels from words or nonsense syllables for analysis is not practical in a clinical setting because it can be time-consuming and require specific expertise.

As with the nasometry studies summarized above, the focus of the acoustic studies has been on hypernasality. Little is known of the acoustic consequences of hyponasality (Warren, Dalston, & Mayo, 1993) or mixed nasality (de Boer & Bressmann, 2015). While oral vowels may be suitable for the assessment of hypernasality, hyponasality is detected clinically during the production of nasal consonants and nasalized vowels.

A practical and useful global measure of the frequency distribution of the speech signal is the long term average spectrum (LTAS). The LTAS averages the spectral distribution of the speech signal over time. It has been used in studies of singers’ voices (Sundberg & Nordstrom, 2001) and language-specific articulatory settings (Mennen et al., 2010). The LTAS has also been used in clinical populations to study global features of disordered speech such as voice quality (Löfqvist & Mandersson, 1987; Lowell et al., 2011) and severity (Tjaden et al., 2010). The LTAS should be an appropriate tool for the analysis of disorders of oral-nasal balance because the extra- and anti-resonances related to the different disorders should have a constant effect on the acoustic spectrum.


In the present study, the influence of oral-nasal balance on the acoustic profile was assessed using LTASs of oral and nasal sentences. Simulated oral-nasal balance disorders were used for the acoustic analysis. The LTASs of normal resonance and simulated disorders of oral-nasal balance were compared. There were a number of specific expectations that guided the data analysis. It was expected that the hypernasal simulations would result in nasalization of all the vowels of the stimuli, and as per Chen’s (1997) study, this would result in additional energy near 250 Hz compared to the normal condition. The investigation by Haapanen et al. (1996) suggested that the mixed condition, a partial “cul-de-sac” test, would have less amplitude between 300 Hz and 700 Hz than the hypernasal condition. As both Chen (1997) and Haapanen et al. (1996) used oral vowels, these differences were expected to show in the LTAS of the oral stimulus. For the nasal stimulus, it was predicted that the simulated hyponasal condition would show a reduced nasal formant and thus less energy than the normal condition at 250 Hz. Likewise, the mixed condition would have less energy at 250 Hz than the hypernasal condition. Finally, it was expected that the spectral differences between the speaking conditions would be salient enough that a linear discriminant analysis would be able to classify the data accurately based on the acoustic measurements.

2.2 Methods

2.2.1 Participants

The recordings of the simulated disorders of oral-nasal balance were acquired during the study described in de Boer and Bressmann (2015). Sixteen normal speaking females were recruited from the student population of the University of Toronto. They were between 22 and 30 years of age (mean 24.1, SD 2.2) and spoke English with the accent common to Southern Ontario. The participants reported normal hearing, no history of cleft lip and palate, no resonance disorder and no nasal congestion. The participant consenting and research procedures were reviewed and approved by the Research Ethics Board at the University of Toronto.

2.2.2 Participant Training

The first author explained the nature of disorders of oral-nasal balance and demonstrated to the participants how to simulate them. Normal oral-nasal balance was discussed with the participants but no practice was necessary. To simulate hypernasality, the velum was lowered, nasalizing all speech sounds. To simulate hyponasality, one nostril was closed with the index finger. Mixed nasality was simulated by speaking with the velum lowered and one nostril closed. The participants practiced their hypernasal and mixed resonance with the test stimuli until the first author decided they were ready to proceed with the recordings.

According to estimates, up to 80% of individuals experience a nasal cycle where one nostril is more patent than the other at various times throughout the day (Hixon et al., 2008; Principato & Osenberger, 1970; Stoksted, 1953). To account for this phenomenon, the hyponasal and mixed resonance conditions were repeated for both nostrils so that the higher and lower patency nostrils could be identified based on the nasalance values from the nasometric recordings. In the further analysis, only the sound files with the blockage of the more patent nostril were used (de Boer & Bressmann, 2015).

2.2.3 Stimuli

The stimuli consisted of oral and nasal sentences. The first two sentences of the Zoo Passage (“Look at this book with us. It’s a story about a zoo”) and the first sentence of the Nasal Sentences (“Mama made some lemon jam”) (Fletcher, 1976) were used. The order of the stimuli was randomized, and each item was read twice in each resonance condition. If a participant made a reading error, she simply repeated the item.

2.2.4 Recording Procedures

All recordings took place in a quiet room with acoustic panelling. High quality audio recordings were made using a Zoom Q3 Handy Video Recorder (Zoom, Tokyo, Japan). The device’s internal directional stereo microphone had a signal resolution of 16 bit and a sampling rate of 44.1 kHz. The gain was set to “high” and the recorder was placed 40 cm from the participant’s mouth. The recordings were saved as *.wav files. At the time of the recordings, the participants were also wearing a nasometer headset.

2.2.5 Simulation Verification

The two authors verified the accuracy of the participants’ portrayal of different oral-nasal balance disorders by listening to the audio recordings of the sessions. For each speaker, a consensus decision was first made about the success of the simulation of hypernasality. Once it had been determined that the speaker sounded hypernasal, the mixed nasality recording condition was reviewed. As a result of this verification step, five participants were excluded, leaving 11 data sets in the study (de Boer & Bressmann, 2015). No separate verification of the hyponasal condition was undertaken as the nasal occlusion was monitored by the first author during the data collection.

2.2.6 Acoustic Analysis

Using the Goldwave audio editor (Goldwave Inc, St John’s, Newfoundland), the files of the recordings were segmented into their individual sentences and saved as pulse code modulated 16 bit mono files in *.wav format. The spectrograms and LTAS analyses were obtained using Praat version 5.3.63 (Boersma & Weenink, 2014). The LTAS bandwidths were set to 100 Hz. A script was run to obtain the amplitude (dB) of each LTAS frequency bin up to 4000 Hz.
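The binning described above can be illustrated with a short sketch. It is not the Praat script used in the study and does not reproduce Praat's LTAS algorithm. It is a rough approximation, assuming NumPy is available, that averages Hann-windowed power spectra over time and reports the mean level in each 100 Hz bin up to 4000 Hz. The function name `ltas_db` and the frame length are assumptions made for illustration.

```python
import numpy as np


def ltas_db(signal, sample_rate, bin_width_hz=100.0, max_hz=4000.0, frame_len=2048):
    """Approximate long-term average spectrum: mean level (dB) per frequency bin.

    Illustrative only; Praat's LTAS uses its own estimation and scaling.
    """
    signal = np.asarray(signal, dtype=float)
    hop = frame_len // 2
    n_frames = 1 + (len(signal) - frame_len) // hop
    if n_frames < 1:
        raise ValueError("signal is shorter than one analysis frame")

    # Average the power spectrum of overlapping Hann-windowed frames.
    window = np.hanning(frame_len)
    power = np.zeros(frame_len // 2 + 1)
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len] * window
        power += np.abs(np.fft.rfft(frame)) ** 2
    power /= n_frames

    # Pool FFT components into fixed-width bins (0-100 Hz, 100-200 Hz, ...).
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    edges = np.arange(0.0, max_hz + bin_width_hz, bin_width_hz)
    centers = edges[:-1] + bin_width_hz / 2.0
    levels = np.empty(len(centers))
    for k in range(len(centers)):
        in_bin = (freqs >= edges[k]) & (freqs < edges[k + 1])
        # The small constant avoids log(0) for empty or silent bins.
        levels[k] = 10.0 * np.log10(power[in_bin].mean() + 1e-12)
    return centers, levels
```

With a 44.1 kHz recording this yields 40 bins centered at 50 Hz, 150 Hz, and so on up to 3950 Hz, which matches the bin centers reported in the results (for example 250 Hz and 450 Hz). The absolute dB values depend on the scaling convention and should not be compared with Praat output.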

2.2.7 Statistical Analysis

The amplitudes of the frequency bins from the oral and nasal stimuli across the four oral-nasal balance conditions were analysed using the Number Cruncher Statistical System 8.0 software (NCSS LLC, Kaysville, Utah). As some speakers are louder than others and the damping effects of nasal mucosa can make hypernasal speech quieter than normal speech, the decibel values from the LTASs were converted to z scores. The normality of the distribution of z scores within each combination of frequency bin, condition and stimulus was confirmed visually and with skewness and kurtosis calculations. The effects of oral-nasal balance condition on the amplitude of the frequency bins were assessed with repeated measures Analysis of Variance (ANOVA). With 40 frequency bins to analyse for each stimulus, controlling for type I error required a conservative approach. The Holm-Bonferroni method was used with an alpha of .05. Where the F-test was significant, a Bonferroni multiple-comparison test with alpha set to .05 followed. In order to obtain a classification formula based on the amplitudes from the LTAS, the z scores were analysed with linear discriminant analysis.
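The z-score conversion can be sketched as follows. This is a minimal illustration, assuming that each recording's 40 decibel values are standardized against that recording's own mean and standard deviation. The use of the sample standard deviation is an assumption, as the text does not specify it. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def ltas_to_z(levels_db):
    """Standardize one recording's LTAS levels (dB) across its frequency bins."""
    levels = [float(v) for v in levels_db]
    centre = mean(levels)
    spread = stdev(levels)  # sample SD; an assumption, not stated in the text
    if spread == 0:
        raise ValueError("all bins have the same level; z scores are undefined")
    return [(v - centre) / spread for v in levels]


# Hypothetical dB values for a short spectrum, and the same spectrum 6 dB louder.
quiet = [52.0, 60.0, 55.0, 41.0, 38.0]
loud = [v + 6.0 for v in quiet]

z_quiet = ltas_to_z(quiet)
z_loud = ltas_to_z(loud)
```

Under this assumption a uniform change in level leaves the z scores unchanged, so `z_quiet` and `z_loud` agree. This is the property that makes spectra comparable between louder and quieter speakers.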

2.3 Results

Line charts of the mean z scores for each speaking condition by frequency can be found in Figure 2.1 for the oral stimulus and in Figure 2.2 for the nasal stimulus. A visual inspection of the averaged results for the oral stimulus demonstrated that the simulated hypernasal and mixed nasality conditions had their highest peak at 250 Hz, while the normal and hyponasal conditions had a prominent spectral peak at 450 Hz. The hyponasal condition had greater amplitude than the others at 2450 Hz and the mixed condition appeared highest at 3950 Hz.

Figure 2.1 Line chart of the LTAS z-transformed mean amplitudes of four conditions for the oral stimulus.

A visual inspection of the averaged results for the nasal stimulus demonstrated that all the conditions had their peak amplitude at 250 Hz. The mean z scores of the mixed and hypernasal conditions were higher than the normal condition, while the hyponasal condition had the lowest peak. The mixed condition had lower z scores between 550 Hz and 850 Hz and higher z scores for the band centered at 3850 Hz.


Figure 2.2 Line chart of the LTAS z-transformed mean amplitudes of four conditions for the nasal stimulus.

2.3.1 Repeated Measures ANOVA

Repeated Measures ANOVAs for the four speaking conditions, eleven participants, two repetitions and forty LTAS 100 Hz bins were run for the oral stimulus and the nasal stimulus separately. As the LTAS intensity values of each sound file had been converted to z scores, there were no main effects for condition or repetition. With 40 sets of interaction effects to evaluate, the p values required from the Holm-Bonferroni method with an alpha of .05 were .00125, .00128, .00131, and then .00135 before an F-test was no longer significant.
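The step-down thresholds follow from the standard Holm-Bonferroni rule, in which the k-th smallest p value is compared with alpha / (m - k + 1) for m tests. The sketch below illustrates that general rule and is not the NCSS procedure used in the study. The function names are hypothetical.

```python
def holm_thresholds(alpha, n_tests):
    """Thresholds for the ordered p values: alpha/m, alpha/(m-1), ..., alpha/1."""
    return [alpha / (n_tests - k) for k in range(n_tests)]


def holm_reject(p_values, alpha=0.05):
    """Return one True/False flag per test, in the original order of p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p values are retained
    return reject


thresholds = holm_thresholds(0.05, 40)
```

For 40 tests and an alpha of .05, the first threshold is .05 / 40 = .00125, and each later threshold is slightly more lenient than the one before it.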

For the oral stimulus, significant condition-frequency bin interaction effects were found for the bands centered at 250 Hz (F(3,30) = 24.57, p < .000001), 450 Hz (F(3,30) = 7.50, p = .000691), 550 Hz (F(3,30) = 10.26, p = .000083) and 650 Hz (F(3,30) = 10.97, p = .000050). For the nasal stimulus, significant condition-frequency bin interaction effects were found for the bands centered at 250 Hz (F(3,30) = 12.54, p = .000017), 750 Hz (F(3,30) = 8.37, p = .000342) and 1950 Hz (F(3,30) = 7.42, p = .000739). For each frequency band with a significant F test, the mean z scores by condition were assessed using Bonferroni multiple comparison tests. The significant results of these tests appear in Table 2.1. There were no significant repetition-frequency bin interaction effects.

For both the oral and the nasal stimuli, the Bonferroni multiple comparison post-hoc tests showed that the amplitudes of the bin centered at 250 Hz were significantly different for the hypernasal vs. non-hypernasal conditions. The hypernasal and mixed conditions had significantly higher z scores than the hyponasal and normal conditions. At 450 Hz for the oral stimulus and 750 Hz for the nasal stimulus, the mixed condition had significantly lower values than the hyponasal or normal conditions. At 550 Hz, the normal and hyponasal conditions of the oral stimulus had higher scores than the hypernasal or mixed conditions. At 650 Hz for the oral stimulus, the hyponasal and normal conditions had significantly higher z scores than the mixed condition, and the hyponasal condition had significantly higher z scores than the hypernasal condition. For the nasal stimulus, at 1950 Hz, the normal condition had significantly higher z scores than the hypernasal and mixed conditions.


Table 2.1. Bonferroni multiple-comparison tests (α = 0.05) of z-scores from LTAS frequency bands for oral and nasal stimuli with significant condition interaction effects.

Stimulus   Bin frequency   Condition    Mean   Differs from
Oral       250 Hz          Normal       1.58   Hyper, Mixed
Oral       250 Hz          Hypo         1.49   Hyper, Mixed
Oral       250 Hz          Hyper        2.25   Normal, Hypo
Oral       250 Hz          Mixed        2.30   Normal, Hypo
Oral       450 Hz          Normal       2.01   Mixed
Oral       450 Hz          Hypo         2.09   Mixed
Oral       450 Hz          Hyper        1.72
Oral       450 Hz          Mixed        1.45   Normal, Hypo
Oral       550 Hz          Normal       1.78   Hyper, Mixed
Oral       550 Hz          Hypo         1.86   Hyper, Mixed
Oral       550 Hz          Hyper        1.34   Normal, Hypo
Oral       550 Hz          Mixed        1.10   Normal, Hypo
Oral       650 Hz          Normal       1.44   Mixed
Oral       650 Hz          Hypo         1.61   Hyper, Mixed
Oral       650 Hz          Hyper        1.15   Hypo
Oral       650 Hz          Mixed        0.92   Normal, Hypo
Nasal      250 Hz          Normal       2.06   Hyper, Mixed
Nasal      250 Hz          Hypo         1.82   Hyper, Mixed
Nasal      250 Hz          Hyper        2.57   Normal, Hypo
Nasal      250 Hz          Mixed        2.51   Normal, Hypo
Nasal      750 Hz          Normal       1.05   Mixed
Nasal      750 Hz          Hypo         1.15   Mixed
Nasal      750 Hz          Hyper        0.80
Nasal      750 Hz          Mixed        0.56   Normal, Hypo
Nasal      1950 Hz         Normal      -0.40   Hyper, Mixed
Nasal      1950 Hz         Hypo        -0.08
Nasal      1950 Hz         Hyper        0.00   Normal
Nasal      1950 Hz         Mixed        0.18   Normal


2.3.2 Linear discriminant analysis

The repeated measures ANOVA demonstrated that the amplitudes of the conditions differed significantly at certain frequencies. In the next step, it was investigated whether the resonance conditions could be further distinguished by a combination of frequencies. A linear discriminant analysis was performed to obtain a classification formula based on the amplitudes of the stimuli at various frequencies. Stepwise regression was used to select the most salient frequency bands of the oral and nasal stimuli. Initially, the stepwise regression was set to run up to 50 iterations with the probabilities to enter and remove set to .05 and .10 respectively. This produced three formulas with 19 independent variables, almost twice the number of participants in the study. To reduce the number of variables, the stepwise regression was repeated with the probabilities to enter and remove dropped to .025 and .05 respectively. The ten remaining variables represented six frequency bins from the oral stimulus and four frequency bins from the nasal stimulus. The first discriminant function had a Wilks’ lambda Λ of 0.095394, p < .0001 and accounted for 77.3% of the variance in the oral-nasal balance conditions. The second discriminant function had a Wilks’ lambda Λ of 0.445923, p < .0001 and accounted for 18.8% of the variance. The third discriminant function did not meet significance (Λ = 0.843231, p = .0920).

The canonical discriminant function coefficients are displayed in Table 2.2. Higher function 1 values were obtained when the z scores at 250 Hz for the oral stimulus and at 2150 Hz for the nasal stimulus were higher than average, and when the z scores of the oral stimulus at 350 Hz, 650 Hz and 1950 Hz were lower than average. High function 1 values were associated with the presence of hypernasality. Function 2 values were higher when the z scores of the oral stimulus at 250 Hz and at 1950 Hz were lower than average and the z scores at 3850 Hz for the nasal stimulus were higher than average. High function 2 values were associated with the presence of hyponasality.

Table 2.2 Canonical discriminant function coefficients derived from ten predictors and four speech conditions (normal, and simulated hypernasal, hyponasal and mixed).

Function 1 Function 2

Constant 0.97 10.31

Oral 250Hz 2.54 -2.05

Oral 350Hz -1.30 -1.38

Oral 650Hz -1.16 -1.18

Oral 1050Hz 0.70 -1.20

Oral 1550Hz 1.00 -0.10

Oral 1950Hz -1.32 -2.71

Nasal 750Hz -0.70 -1.03

Nasal 1450Hz -0.45 0.91

Nasal 2150Hz 1.74 1.07

Nasal 3850Hz 1.04 1.99

The canonical variate group centroids appear in Table 2.3. The highest function 1 centroid values were for the mixed and hypernasal conditions and the lowest were for the normal and hyponasal conditions. The highest function 2 centroid value was obtained for the mixed condition and the lowest was for the hypernasal condition, but the values for the normal and hyponasal condition were relatively close (0.40 and 0.20 respectively).

Table 2.3. Function values of group centroids for four speech conditions (normal, and simulated hypernasal, hyponasal and mixed).

Condition Function1 Function2

Hypernasal 0.92 -1.53

Hyponasal -2.06 0.20

Mixed nasality 2.61 0.92

Normal -1.47 0.40

Each participant’s set of z scores produced a pair of function values. The minimal Mahalanobis distance between those function values and those of the condition centroids determines which condition the set of scores is predicted to belong to. When the discriminant formulas were applied to the z scores of the eleven frequency bands, 80.7% were classified correctly. Of the 22 sets of z scores for the hypernasal condition, one was misclassified as mixed and one was misclassified as hyponasal. For the hyponasal condition, seven were misclassified as normal. Two from the mixed condition were misclassified as hypernasal. Finally, five of the normal condition sets of z scores were misclassified as hyponasal and one was misclassified as hypernasal. A scatterplot of the function values and the condition centroids is shown in Figure 2.3.
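As a rough illustration of the classification step described above, the following Python sketch computes the two function values from a set of ten z scores using the coefficients in Table 2.2 and assigns the nearest centroid from Table 2.3. It rests on two assumptions that are not stated in the text: that the tabulated coefficients are unstandardized, and that in canonical space the Mahalanobis distance reduces to the Euclidean distance (pooled within-group covariance equal to the identity). The function and variable names are hypothetical, and this is not the statistical software used in the study.

```python
import numpy as np

# Coefficients copied from Table 2.2, in table order:
# oral 250, 350, 650, 1050, 1550, 1950 Hz; nasal 750, 1450, 2150, 3850 Hz.
CONSTANT = np.array([0.97, 10.31])
COEFFICIENTS = np.array([
    [2.54, -2.05],
    [-1.30, -1.38],
    [-1.16, -1.18],
    [0.70, -1.20],
    [1.00, -0.10],
    [-1.32, -2.71],
    [-0.70, -1.03],
    [-0.45, 0.91],
    [1.74, 1.07],
    [1.04, 1.99],
])

# Group centroids copied from Table 2.3 (function 1, function 2).
CENTROIDS = {
    "hypernasal": np.array([0.92, -1.53]),
    "hyponasal": np.array([-2.06, 0.20]),
    "mixed": np.array([2.61, 0.92]),
    "normal": np.array([-1.47, 0.40]),
}


def function_values(z_scores):
    """Return the (function 1, function 2) pair for ten z scores."""
    z = np.asarray(z_scores, dtype=float)
    if z.shape != (10,):
        raise ValueError("expected ten z scores in Table 2.2 order")
    return CONSTANT + z @ COEFFICIENTS


def classify(values):
    """Assign the condition whose centroid is nearest (Euclidean distance,
    assumed here to stand in for the Mahalanobis distance)."""
    v = np.asarray(values, dtype=float)
    return min(CENTROIDS, key=lambda name: np.linalg.norm(v - CENTROIDS[name]))
```

A set of z scores would be classified with `classify(function_values(z_scores))`.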

Figure 2.3. Scatterplot of function values for the linear discriminant analysis with group centroids (centroid labels: N = normal, R = hypernasality, O = hyponasality and X = mixed nasality).

2.4 Discussion

The accurate quantitative acoustic diagnosis of disorders of oral-nasal balance presents a persistent problem in the care of patients with cleft palate. The present study investigated the properties of the LTAS in connected speech in different simulated disorders of oral-nasal balance. As anticipated from Chen’s (1997) study, the repeated measures ANOVA demonstrated that the hypernasal condition had significantly higher z scores (more spectral energy) than the normal condition in the frequency band centred at 250 Hz for the oral stimulus. Indeed, both the hypernasal and mixed nasality conditions had more acoustic energy than either the normal or hyponasal conditions for the oral and nasal stimuli at 250 Hz. Next, based on Haapanen’s (1996) study, the mixed nasality condition was expected to have less energy than the hypernasal condition between 300 and 700 Hz of the oral stimulus. No such trend was evident for the oral stimulus. While the line graph of the nasal stimulus suggested the mixed nasality condition had less energy between 400 and 900 Hz, none of its frequency bands were significantly lower than the simulated hypernasal condition. However, unlike Haapanen’s (1996) “cul-de-sac” test (both nostrils completely pinched shut with fingers) with hypernasal speakers, in the mixed condition only one nostril was blocked, so it was not a complete “cul-de-sac” speaking condition. Finally, for the band centred at 250 Hz of the nasal stimulus, the hyponasal condition was expected to have less energy than the normal condition and the mixed condition was expected to have less energy than the hypernasal condition. These expectations were not confirmed. The repeated measures ANOVA found seven frequency bands where at least one speaking condition had significantly more acoustic energy than the others. All but one of these differences were found in frequency bands below 800 Hz. Yet, all the significant differences in z scores were along a hypernasality present or absent axis. None of the individual bins differentiated the presence or absence of hyponasality, as none of the post hoc comparisons distinguished hypernasal from mixed, or hyponasal from normal.

The linear discriminant analysis had promising results. The success of the classification formula (80.7%) was roughly comparable to that of the earlier nasometry study (88.6%) (de Boer & Bressmann, 2015). However, it also underlined that distinguishing different oral-nasal balance conditions from each other based on the acoustic signal is a rather complex task, even when the same speakers produced the different conditions. The discriminant formulas were particularly successful at identifying the hypernasal and mixed conditions. However, of the 17 items that were misclassified, 12 were confusions between hyponasal and normal. By blocking just one nostril, the speakers may not have simulated a hyponasality severe enough to cause a consistent change to the LTAS. Alternatively, individual differences in the size and shape of the nasal cavity may have diluted the acoustic effects across the participants.

The primary limitation of the present study was that the discriminant functions were derived exclusively from the acoustic spectra of normal adult females simulating disorders of oral-nasal balance. Different discriminant functions would be expected for children and adult males, and the functions for clinical participants with disorders of oral-nasal balance may be altogether different. For example, blocking a nostril anteriorly still allows sound to resonate in the posterior part of the blocked nasal passage as well as in the unblocked side of the nose. Hyponasality caused by a posterior blockage (e.g., caused by enlarged adenoids or a pharyngeal flap) would prevent sound from entering the nasal cavity and resonating in the nasal passage. Future research with clinical populations will allow us to better study the acoustic effects of different types of hyponasality arising from different etiologies. While a nasometric assessment will provide a relatively robust representation of the oral-nasal balance, the LTAS may be more vulnerable to acoustic features such as breathy or hoarse voice quality, which can change the spectral characteristics (Löfqvist & Mandersson, 1987; Lowell et al., 2011). We can only speculate how other features of speech such as the compensatory articulations noted in many speakers with cleft palate (Kummer, 2008) could affect the frequency distribution of the LTAS. It would also be interesting to investigate how much the LTAS of individual speakers varies from day to day.

Another potential limitation is that the participants were wearing a nasometer headset throughout the audio recordings, so the sound separator plate could have influenced the sound recorded by the audio recorder’s directional stereo microphone. However, the participants read from a stationary clipboard and any head movement should have been minimal. The 120º recording angle of the audio recorder’s microphone would have further mitigated any possible effects of head movement.

2.5 Conclusion

Despite these limitations, the results of the present study demonstrated that the LTAS could provide useful information for clinical assessment. More simulation data from normal-speaking male and female speakers, both paediatric and adult, would have to be collected to develop tentative classification formulas for different groups of speakers. These could then be fine-tuned using data from a sufficiently large number of clinical speech samples. Eventually, it may be possible to develop a robust linear discriminant function that can corroborate a clinician’s auditory-perceptual impression for an individual patient. We believe that the results of a classification using a method such as linear discriminant analysis will be improved if features of both hypernasality and nasal obstruction are taken into account.

2.6 Acknowledgements

This research was supported by an Operating Grant from the Canadian Institutes of Health Research (grant fund number 485680). We gratefully acknowledge Ms. Huiwen Goy, who made the Praat script for the LTAS analyses available. We thank Drs. Alexei Kochetov, Gajanan Kulkarni and Pascal Van Lieshout for their advice for this study.

Chapter 3 - Influence of altered auditory feedback on oral-nasal balance in speech

The contents of this chapter have been published in the Journal of Speech, Language and Hearing Research, American Speech-Language-Hearing Association (ASHA).

De Boer, G., & Bressmann, T. (2017). Influence of altered auditory feedback on oral-nasal balance in speech. Journal of Speech, Language and Hearing Research, 60, 3135-3143. doi:10.1044/2017_JSLHR-S-16-0390

A link to the published paper can be found at http://jslhr.pubs.asha.org/article.aspx?articleid=2660934&resultClick=3

3.0 Abstract

Purpose – This study explored the role of auditory feedback in the regulation of oral-nasal balance in speech.

Method – Twenty typical female speakers wore a Nasometer headset and headphones while continuously repeating a sentence with oral and nasal sounds. Oral-nasal balance was quantified with nasalance scores. The signals from two additional oral and nasal microphones were played back to the participants through the headphones. The relative loudness of the nasal channel in the mix was gradually changed, so that the speakers heard themselves as more or less nasal. An additional amplitude control group of 9 female speakers completed the same task while hearing themselves louder or softer in the headphones.

Results – A Repeated Measures ANOVA of the mean nasalance scores of the stimulus sentence at baseline, minimum, and maximum nasal feedback conditions demonstrated a significant effect of nasal feedback condition. Post hoc analyses found that the mean nasalance scores were lowest for the maximum nasal feedback condition. The scores of the minimum nasal feedback condition were significantly higher than two of three baseline feedback conditions. The amplitude control group did not show any effects of volume changes on nasalance scores.

Conclusions – Increased nasal feedback led to a compensatory adjustment in the opposite direction, confirming that oral-nasal balance is regulated by auditory feedback. However, a lack of nasal feedback did not lead to a consistent compensatory response of a similar magnitude.

3.1 Introduction

Auditory feedback plays a crucial role in speech. This becomes evident when auditory feedback is altered in speech perturbation experiments. Speakers compensate when they detect a discrepancy between intended and perceived loudness, i.e., they speak up when the auditory feedback volume decreases, and they speak more quietly when the feedback volume increases (Lane & Tranel, 1971; Siegel & Pick, 1974). When a speaker’s fundamental frequency is altered up or down electronically, and played back to him or her in real time, this leads to a compensatory adjustment in the opposite direction (Elman, 1981; Larson et al., 2000). A similar effect is observed when the vowel formants are gradually shifted (Houde & Jordan, 1998; Purcell & Munhall, 2006). Although these experiments have shown individual differences between speakers, the general pattern for groups of speakers is one of compensation. Speakers will compensate for altered auditory feedback, even when instructed not to, which suggests that the response is automatic (Munhall et al., 2009). The adaptation studies cited above (among many others) have contributed to the notion of feedback and feedforward control in current speech production models (e.g., the Directions Into Velocities of Articulators (DIVA) model (Guenther, 2006; Tourville & Guenther, 2011) and the State Feedback Control (SFC) model (Hickok, Houde, & Rong, 2011)).

The regulation of oral-nasal balance in speech, i.e., the control of nasalization, is an aspect of speech production that is not completely understood. Oral-nasal balance in speech (the degree of coupling of the oral and nasal speech signal) is determined by the degree of opening and closing of the velopharyngeal sphincter (Kummer, 2008). As the velopharyngeal sphincter offers very little proprioception (Hixon et al., 2008), one would expect that the auditory feedback subsystem must play a substantial role in the control of oral-nasal balance. However, whether and to what extent this is true is not yet known. It was the goal of the present study to investigate whether oral-nasal balance is governed by the auditory feedback subsystem in a compensatory manner similar to loudness, pitch, and vowel formants (Elman, 1981; Houde & Jordan, 1998; Lane & Tranel, 1971; Larson et al., 2000; Purcell & Munhall, 2006; Siegel & Pick, 1974).

A commonly used quantitative measure to assess oral-nasal balance in speech is called nasalance (Kuehn & Moller, 2000; Fletcher, 1976). Nasometry instruments, such as the Nasometer (KayPentax, Montvale, New Jersey), are commonly used in clinics specialized in the treatment of patients with cleft lip and palate (Kummer, 2008). The Nasometer has microphones mounted on the top and bottom of a sound separation plate. The upper microphone records the nasal signal and the lower microphone records the oral signal. The software calculates the ratio of the sound energy coming from the nose to that coming from the nose and mouth and expresses this ratio as a percentage (N/(N+O)*100). The measurement is repeated in 8 ms intervals and the resulting averaged percentage figure for a speech stimulus is called the nasalance score. Speech stimuli without nasal sounds have low nasalance scores and speech stimuli loaded with nasal sounds have high nasalance scores (Fletcher, 1976). Based on what is known from other speech adaptation studies (e.g., Elman, 1981; Houde & Jordan, 1998; Lane & Tranel, 1971; Larson et al., 2000; Purcell & Munhall, 2006; Siegel & Pick, 1974), one could expect that speakers who receive auditory feedback with the nasality of their speech artificially increased would adapt by closing their velopharyngeal sphincter more tightly, thereby reducing the oral-nasal coupling and decreasing their nasalance scores. Likewise, when the feedback minimizes the nasal component of speech, the speakers could be expected to compensate by opening their velopharynx, increasing their oral-nasal coupling and their nasalance scores. It is not known whether typical speakers have the same automatic and unconscious control over oral-nasal balance as they have over loudness, pitch and vowel formants.
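The nasalance formula N/(N+O)*100 described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the per-frame nasal and oral energies are taken as given (how the Nasometer filters the signals and computes energy is not described here), frames with no measurable energy are skipped (an assumption), and the function name is hypothetical.

```python
def nasalance_score(nasal_energy, oral_energy):
    """Average the per-frame nasalance, N / (N + O) * 100, over a stimulus.

    nasal_energy and oral_energy are equal-length sequences holding one
    non-negative energy value per analysis frame (e.g., per 8 ms interval).
    Frames where both channels are silent are skipped (an assumption).
    """
    if len(nasal_energy) != len(oral_energy):
        raise ValueError("channels must have the same number of frames")
    frame_scores = []
    for nasal, oral in zip(nasal_energy, oral_energy):
        total = nasal + oral
        if total > 0:
            frame_scores.append(nasal / total * 100.0)
    if not frame_scores:
        raise ValueError("no frames with measurable energy")
    return sum(frame_scores) / len(frame_scores)
```

With equal energy in both channels, every frame scores 50; with a silent nasal channel, every frame scores 0.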

The present study sought to test the following hypotheses:

H1 - When the nasal component in the auditory feedback increases, the speakers will reduce their oral-nasal coupling by closing their velopharyngeal sphincter more tightly and the average nasalance scores will decrease.

H2 - When the nasal component in the auditory feedback decreases, the speakers will increase their oral-nasal coupling by opening their velopharyngeal sphincter more and the average nasalance scores will increase.

Manipulating the auditory signal, so that the amount of audible nasalization increases or decreases, can lead to overall loudness changes. An additional research question addressed whether such changes in loudness by themselves affect oral-nasal balance.

3.2 Methods

3.2.1 Participants

Thirty-three females with a mean age of 21.49 years (SD 2.27) were recruited from the University of Toronto. They spoke English with the accent common to Southern Ontario and had normal hearing (self-report). Participants were excluded if they reported a history of hyper- or hyponasality (e.g., cleft palate, severely deviated septum), or nasal congestion at the time of recording. The recordings of three participants were excluded from analysis. The first excluded participant was perceived to have notable hypernasality, the second reported developing nasal congestion during the data collection, and the third’s vocal intensity was too low for nasalance to be measured. Data from a fourth participant were lost due to experimenter error. All subsequent data analyses were based on the 29 participants with complete data sets. The study was approved by the research ethics board of the University of Toronto. The participants were sequentially assigned to two experimental groups (both N=10) and one amplitude control group (N=9).

3.2.2 Recording procedures

The recordings took place in a quiet room with acoustic paneling. The participants were initially informed that the purpose of the experiment was to evaluate the stability of velopharyngeal function over time. They were instructed to produce the stimulus as regularly as possible. The participants were seated facing a printed copy of the stimulus sentence. The stimulus sentence contained both oral and nasal phonemes: My hamper was damp so the towels are smelly. The participants were asked to repeat the stimulus continuously in their normal speaking voice. Over the course of the experiment, the participants uttered the stimulus over 200 times. At the conclusion of the data collection, the participants were debriefed about the true purpose of the experiment and the nature of the sound manipulation that they had experienced.

During the recording session, the participants wore the Nasometer 6450 headset (KayPentax, Montvale NJ). The output from the Nasometer headset microphones was recorded using the Nasometer software on a computer. Two additional small tie-clip style stereo microphones (Sony ECM-CS3, Sony Canada, Toronto ON) were attached to the Nasometer sound separator plate. One microphone was placed on the nasal surface of the separator plate to record nasal sound and one microphone was placed on the oral side to record oral sound. For both ECM-CS3 microphones, the left channel was oriented towards the sound source, and only this channel was used for recording. The signals from the additional microphones were boosted by a NasalView stereo pre-amplifier model T-02 (Tiger DRS, Seattle WA) before being fed into a digital multitrack recorder (Tascam DP-008, TEAC America, Montebello CA). The oral and nasal signals from the two tie-clip microphones were assigned to separate tracks. The two channels were centred in the stereo panorama so that the speaker perceived the two input channels from the mouth and nose as a mono signal in headphones. The participants wore headphones (SHL3000RD, Philips Canada, Markham, ON) that were connected to the output of the multitrack recorder (see Figure 3.1). Throughout the recordings, the multitrack recorder’s input gain and master output levels were left in the same setting for all participants. In order to address hypotheses 1 and 2 about the impact of increased or decreased nasal feedback on nasalance scores, the track level for the nasal channel was manually adjusted to change its contribution to the output signal. In order to address the additional research question of whether loudness changes affect nasalance scores, the track levels for the oral and nasal channels were changed in conjunction.

Figure 3.1. Schematic diagram of recording equipment for auditory feedback of oral-nasal balance

As the participants repeated the stimulus sentence, their oral and nasal speech signals were recorded with the multi-track recorder and the Nasometer. The multi-track recordings were saved continuously as .wav files with a sampling rate of 44.1 kHz and a signal resolution of 16 bits. The Nasometer can only record 100 seconds at a time, so it was necessary to save files and recommence the Nasometer recordings during the experiment, which interrupted the Nasometer recordings for ca. 5 seconds each time. To ensure that the participants received uninterrupted auditory feedback, they were asked to continuously repeat the stimulus. The participants were given opportunities to rest and drink water after nasalance recordings that ended with the nasal level potentiometer at the 50% position (see below).

3.2.3 Acoustic impact of changes to multitrack channel level

To determine the impact of changing the nasal channel level from the 50% baseline midpoint to the 100% maximum and 0% minimum on the overall signal volume, a stereo sine wave (188.5 Hz) was master recorded with various “nasal” channel level settings. The two channels were mixed to the central position in the stereo panorama so that the combined output from both channels could be assessed. The “oral” channel level was kept at the 50% volume setting of the channel potentiometer. The sound pressure levels (SPL) were analyzed in Praat (version 5.3.63, Boersma & Weenink, 2014) and are reported in uncalibrated dB. When the “nasal” level was at the 50% volume setting, the combined output SPL for the two channels with the sine wave signal was measured at 76.64 dB. At the maximum (100%) nasal channel volume setting, the combined output from both channels was 84.45 dB SPL, while at the minimum (0%) nasal channel volume setting it was 70.76 dB SPL. It was noted that the maxima and minima were reached when the track volume control potentiometer of the multi-track recorder reached 85% and 15%, respectively (see Figure 3.2).
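To put the sine wave measurements above in perspective, the short Python sketch below expresses the maximum and minimum settings as level changes relative to the 50% baseline, which works out to about 7.81 dB above and 5.88 dB below. It also converts a level change to a sound pressure amplitude ratio using the standard relation ratio = 10^(dB/20). The conversion is added here for illustration only and was not part of the reported analysis; the function names are hypothetical.

```python
# Combined output levels reported for the sine wave (uncalibrated dB).
BASELINE_DB = 76.64  # nasal channel at the 50% setting
MAXIMUM_DB = 84.45   # nasal channel at the 100% setting
MINIMUM_DB = 70.76   # nasal channel at the 0% setting


def level_change_db(level_db, reference_db):
    """Difference between a level and a reference level, in dB."""
    return level_db - reference_db


def amplitude_ratio(change_db):
    """Sound pressure amplitude ratio corresponding to a dB change."""
    return 10 ** (change_db / 20)


increase = level_change_db(MAXIMUM_DB, BASELINE_DB)
decrease = level_change_db(MINIMUM_DB, BASELINE_DB)
print(f"maximum vs baseline: {increase:+.2f} dB (x{amplitude_ratio(increase):.2f})")
print(f"minimum vs baseline: {decrease:+.2f} dB (x{amplitude_ratio(decrease):.2f})")
```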

Figure 3.2. Line graph of decibels SPL (uncalibrated) by track volume control potentiometer level for a sine wave

[Line graph. Vertical axis: Decibels, 60 to 90. Horizontal axis: Nasal level, from MIN through MID to MAX in steps of 10.]

In an attempt to illustrate the acoustic effects of increasing or decreasing the nasal channel level for the listener, Figure 3.3 shows a spectrogram of the word smelly uttered by a participant in the baseline nasal feedback condition (with the nasal channel level at 50%). The recording was master recorded three times, with the 0% minimum, 50% baseline and 100% maximum nasal channel level settings. Overlaid on the spectrogram is the intensity curve. In the 0% minimum nasal channel volume setting, there is very little intensity for the nasal segment [m] (37.65 dB). In the 50% baseline nasal channel level condition the [m] shows higher intensity (51.01 dB). The 100% maximum nasal channel volume setting shows that the [m] has even more intensity (57.43 dB). For comparison purposes, the amplitude of the word smelly as a whole rose from 44.19 dB (minimum nasal setting) to 51.81 dB (baseline nasal setting) to 55.12 dB (maximum nasal setting). It should be emphasized that the spectrograms show copies of one production in the control feedback condition master recorded with the multitrack recorder. Therefore, the spectrograms of Figure 3.3 do not show what compensations a speaker might make in response to the altered auditory feedback.

Figure 3.3. Spectrogram with intensity trace for the word smelly uttered in the baseline feedback condition and mastered with 0% minimum, 50% baseline and 100% maximum nasal level settings.

3.2.4 Procedures: Change of nasal feedback level in the two experimental groups

For the two experimental groups (both N=10), only the nasal channel level was changed during the experiment. For the first experimental group (high-to-low), the first 18 repetitions of the stimulus sentence were recorded with the nasal channel level at the 50% baseline setting. The nasal channel level was then gradually increased (in 5% increments for every 3 repetitions of the stimulus) to the 100% maximum. The participants were kept at the 100% maximum setting for six repetitions. Then, the nasal level setting was decreased in 5% decrements every 3 sentence repetitions to the 50% baseline nasal channel setting. After six repetitions of the sentence at the baseline setting, the nasal channel signal was reduced in 5% decrements to the 0% minimum, held there for six sentence repetitions and gradually returned to the 50% baseline nasal channel setting for 18 final repetitions of the sentence. To ensure there was no effect of order of manipulation, for the second experimental group (low-to-high), the order of the changes to the nasal channel level was reversed. Following the same procedures, the nasal channel level was decreased from the 50% baseline to the 0% minimum, then brought back to the 50% baseline, further increased to the 100% maximum and finally returned to the 50% baseline.
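One possible reading of the level schedule described above can be written out as a list with one nasal channel level per sentence repetition. The Python sketch below is hypothetical: it assumes that every intermediate 5% step was held for exactly three repetitions in both directions, which the text states explicitly only for some of the ramps, so the exact number of repetitions in the actual protocol may have differed. The function names are illustrative.

```python
def ramp(start, stop, step=5, reps_per_step=3):
    """Levels strictly between start and stop, each held for reps_per_step
    repetitions (assumed here to apply to every ramp)."""
    direction = 1 if stop > start else -1
    levels = range(start + direction * step, stop, direction * step)
    return [level for level in levels for _ in range(reps_per_step)]


def high_to_low_schedule():
    """Nasal channel level (%) for each repetition, high-to-low group."""
    schedule = [50] * 18                      # initial baseline
    schedule += ramp(50, 100) + [100] * 6     # rise to and hold the maximum
    schedule += ramp(100, 50) + [50] * 6      # return to and hold the baseline
    schedule += ramp(50, 0) + [0] * 6         # fall to and hold the minimum
    schedule += ramp(0, 50) + [50] * 18       # return to the final baseline
    return schedule


def low_to_high_schedule():
    """Same schedule with the order of the manipulations reversed."""
    return [100 - level for level in high_to_low_schedule()]
```

Mirroring each level around the 50% baseline yields the low-to-high order, in which the minimum is visited before the maximum.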

3.2.5 Procedures: Change of overall feedback level in the amplitude control group

For the amplitude control group (N=9), both the nasal and oral channel levels were changed in conjunction during the experiment. The purpose of the amplitude control group was to assess whether changes in oral-nasal balance could be caused by overall feedback volume changes without a change to the relative loudness of the nasal channel in the mix. Like the experimental groups, the participants were initially held at the 50% baseline setting for the oral and nasal channels for 18 repetitions. The volume was increased in potentiometer increments of 2.5% for both channels to the maximum level of 75% (louder) and then decreased in the same increments to the lowermost level of 25% (quieter).

3.2.6 Statistical analysis

Nasalance scores were obtained for the individual repetitions within the nasalance recordings. For statistical analysis, the mean nasalance scores of six consecutive repetitions in the same nasal feedback condition were calculated. The average nasalance scores were calculated from the last six repetitions of the baseline condition nasalance files, where the nasal level potentiometer was 50% throughout, and the six scores from each pass through the 0% minimum, 50% baseline and 100% maximum nasal feedback conditions. For the amplitude control group, the average scores were calculated for six recordings each in the 50% baseline and the 25% minimum and 75% maximum amplitude feedback conditions. Because of the 100 second recording limit of the Nasometer, saving and restarting of the recordings was necessary throughout the data collection. On five occasions, this led to the incomplete recording of an individual stimulus repetition. These items were discarded and the average nasalance scores were calculated based on the remaining five of six sentence repetitions. On one occasion, a file was overwritten due to experimenter error; in this instance, the average nasalance score was calculated based on the remaining three of six sentence repetitions of the feedback condition. As a result, the analysis of the averaged nasalance scores in five feedback conditions for 29 participants was based on 862 individual nasalance scores instead of 870. The mean nasalance scores were analyzed in NCSS version 8 (NCSS Inc., Kaysville, UT 84037) with repeated measures ANOVAs. Where sphericity was violated, Greenhouse-Geisser adjustments to the probability levels were used. The p-values for the post-hoc paired t-tests were adjusted with the Holm-Bonferroni method. For all analyses, alpha was set to .05.

3.3 Results

For the experimental groups, error bar plots of the nasalance values as a function of nasal levels are displayed in Figure 3.4 (high-to-low) and Figure 3.5 (low-to-high). For the amplitude control group, an error bar plot of the nasalance values as a function of both oral and nasal volume levels is displayed in Figure 3.6.

Figure 3.4 Error bar plot of consecutive mean nasalance scores as a function of nasal feedback level for the high-to-low experimental group (N=10).

Figure 3.5 Error bar plot of consecutive mean nasalance scores as a function of the nasal feedback level for the low-to-high experimental group (N=10).

Figure 3.6 Error bar plot of consecutive mean nasalance scores as a function of the oral and nasal feedback level for the amplitude control group (N=9).

3.3.1 Experimental groups

A repeated measures ANOVA was run for the mean nasalance scores of the twenty participants of the experimental groups. The between-subject factor was group (high-to-low and low-to-high) and the within-subject variable was auditory nasal feedback condition (minimum, maximum and three times at baseline). There was no effect of group (F(1,18) = 0.57, p = .4587), a highly significant effect of condition (F(4,72) = 21.48, p < .0001) and no group-condition interaction effect (F(4,72) = 1.00, p = .4125).

Post-hoc paired t-tests

As there was no effect of group, and no group-condition interaction effect, the mean nasalance scores of the high-to-low and low-to-high groups were combined for the five nasal feedback conditions. The means of the 0% minimum, 100% maximum and three 50% baseline nasal level feedback conditions are displayed in Table 3.1. The means were compared with 10 Holm-Bonferroni corrected paired t-tests. The mean nasalance scores of the 100% maximum nasal level feedback condition were significantly lower than all the other conditions (all p < .0001, Cohen’s d between 0.84 and 1.02). The scores of the 0% minimum nasal level feedback condition were significantly higher than the second 50% baseline nasal level feedback condition (p = .0041, Cohen’s d = 0.39). The difference in nasalance scores between the 0% minimum nasal level feedback and final 50% nasal level feedback baseline conditions was 2.20 and had a p-value of .0102, while the Holm-Bonferroni criterion for that comparison was p = .01. None of the remaining comparisons were significantly different.
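The step-down logic behind the criterion mentioned above can be sketched in Python as follows. With ten comparisons and alpha = .05, the sorted p-values are compared against .05/10, .05/9 and so on, and testing stops at the first comparison that fails its criterion; the sixth smallest p-value is therefore compared against .05/5 = .01. This is a generic sketch of the standard Holm procedure, not the NCSS implementation. In the example, the first six p-values follow the reported results (with .00005 standing in for "p < .0001"), while the last four are hypothetical placeholders because the text does not report them.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Holm step-down procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, index in enumerate(order):
        criterion = alpha / (m - rank)  # .05/10, .05/9, ..., .05/1
        if p_values[index] <= criterion:
            rejected[index] = True
        else:
            break  # all larger p-values are retained as well
    return rejected


# Four comparisons at p < .0001, then .0041 and .0102 as reported;
# the remaining four values are hypothetical placeholders.
EXAMPLE_P_VALUES = [0.00005, 0.00005, 0.00005, 0.00005,
                    0.0041, 0.0102, 0.2, 0.3, 0.5, 0.8]
EXAMPLE_DECISIONS = holm_bonferroni(EXAMPLE_P_VALUES)
```

Under these inputs the comparison with p = .0041 passes its criterion of .05/6, whereas p = .0102 narrowly misses the criterion of .01.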

Table 3.1 – Combined mean nasalance scores from the two experimental groups high-to-low and low-to-high in the 50% baseline, 0% minimum and 100% maximum nasal level feedback conditions (N=20).

Nasal feedback Mean SD

Baseline_1 34.29 6.30

Baseline_2 32.95 6.44

Baseline_3 33.53 6.97

Maximum 28.30 5.45

Minimum 35.73 7.78

3.3.2 Amplitude control group

The mean nasalance scores of the nine participants of the amplitude control group in the 75% maximum, the 25% minimum and the three 50% baseline amplitude feedback conditions ranged from 30.09 (SD 7.17) to 32.56 (SD 6.30). A repeated measures ANOVA with Greenhouse-Geisser adjustments found no effect of amplitude feedback condition (F(4,32) = 2.51, p = .1167).

3.4 Discussion

The present study was designed to investigate how speakers would respond to altered auditory feedback of their oral-nasal balance in speech. The outcome measure for the listener response was the nasalance score. It was expected that an increase in nasal auditory feedback would lead to a drop in nasalance scores and that a decrease in nasal feedback would lead to an increase in nasalance scores. Our results supported the first hypothesis: Increasing nasal feedback led to lower nasalance scores than the baseline condition. The results only partially supported the second hypothesis: Decreasing nasal feedback brought about an inconsistent increase in nasalance scores, of lower magnitude than the increased nasal feedback condition. In addition, the results for the amplitude control group demonstrated that a change of overall amplitude of the auditory feedback did not lead to a change in nasalance scores.

A repeated measures ANOVA of the nasalance scores in the experimental groups (high-to-low and low-to-high) found a significant effect of condition, demonstrating that altered auditory nasal feedback influenced oral-nasal balance. There was no effect of group, so the average nasalance scores from the high-to-low and low-to-high groups were equivalent. More importantly, there was no group*condition interaction effect, which means that the nasal feedback condition effect was not influenced by the order in which the nasal level changes were presented. The nasalance scores from the two experimental groups in their respective 50% baseline, 0% minimum and 100% maximum nasal level feedback conditions were combined for the post hoc t-tests. The mean nasalance scores of the 100% maximum nasal level feedback condition were 4.65 to 5.99 nasalance points lower than the three passes through the 50% baseline nasal level feedback. The effect of reduced nasal feedback was less consistent. Of the three comparisons between the 0% minimum and the 50% baseline nasal level feedback conditions, one reached significance and a second narrowly missed the Holm-Bonferroni criterion. The mean nasalance scores in the 0% minimum nasal level feedback condition were 2.20 to 2.78 points higher than the second and third 50% baseline nasal level conditions. This demonstrated that altered nasal feedback led to a compensatory response and that the compensation had a higher magnitude in terms of nasalance scores for increased nasal feedback than for reduced nasal feedback.

In order to put these shifts in nasalance scores in perspective, the mean value of the hamper sentence in the baseline condition was between 32.95 and 34.29. For Canadian English speakers, the mean nasalance score of a sentence with entirely oral sounds (no nasal consonants or nasalized vowels) is 13.12 (de Boer & Bressmann, 2014). Therefore, in the increased nasal feedback condition, a drop of five nasalance points does not indicate a complete elimination of nasality.

The procedures used to change the nasal level changed the overall amplitude of the signal the participants heard through the headphones. In order to address this potential confound, an amplitude control group was included. The repeated measures ANOVA for the amplitude control group found no difference in nasalance scores between the baseline, lowermost 25% (quiet) and uppermost 75% (loud) conditions. Therefore, in the analysis of the experimental groups, it can be assumed that significant changes to nasalance scores can be attributed to the proportion of nasal level in the auditory feedback, rather than changes in overall amplitude.


The significant drop in nasalance scores in the 100% maximum nasal feedback condition indicated that the participants in the present study showed a consistent pattern of change. More research will be needed to investigate whether and how participants change their degree of velopharyngeal closure to react to a perceptual change in their oral-nasal balance, such as in the present study. However, it appears unlikely that such a consistent adjustment can be made by changing just pitch, loudness or tempo. Van Lierde et al. (2010) found small but significant 1-2 point drops in nasalance scores when participants changed their pitch or spoke louder, while quiet speech generated nasalance scores 4 points higher than the control condition. In contrast, Watterson, York and McFarlane (1994) did not find that speaking softly or loudly led to a significant change in nasalance values. Speaking faster or slower has not been shown to have a significant effect on nasalance scores (Gauster, Yunusova, & Zajac, 2010). Neither pitch, loudness, nor tempo of the participants’ productions were measured in the present study.

In some previous pitch-shift feedback studies, researchers observed occasional participants who “followed” the altered stimulus. These individuals appear to imitate the altered feedback, rather than adjusting their production to counteract the effect (Burnett et al., 1998; Larson et al., 2000). This behaviour was not noted in the present study. Every single one of the 20 participants in the high-to-low and low-to-high groups had numerically lower mean nasalance scores in the 100% maximum nasal level feedback condition than their average of the three 50% baseline nasal level feedback conditions. The lack of “followers” in the data set may have been an effect of the small sample size. In pitch-shift feedback, “followers” typically represent a minority of responses (Burnett et al., 1998; Larson et al., 2000; Patel et al., 2014). Much more research will have to be undertaken before the proportion of “following” responses due to altered oral-nasal feedback can be sensibly compared to those of pitch-shift feedback studies.

As summarized in various theoretical papers, studies of speech motor control have demonstrated how changed auditory feedback leads to sensorimotor adaptation and changes speakers’ planning of subsequent speech gestures (Guenther, 2006; Hickok, Houde, & Rong, 2011; Perkell, 2013; Tourville & Guenther, 2011). The control process that is assumed to underlie such sensorimotor adaptation operates based on a combination of feedback and feedforward mechanisms, where the central nervous system processes sensory feedback to adjust the feedforward planning of motor processes to reduce a mismatch between the expected and the actual result of a given aspect of speech production, such as pitch, loudness, or vowel formants (Elman, 1981; Houde & Jordan, 1998; Lane & Tranel, 1971; Larson et al., 2000; Purcell & Munhall, 2006; Siegel & Pick, 1974).

In contrast to these previous studies which described bi-directional compensatory reactions, the compensatory response observed in the present study occurred predominantly in the form of lower nasalance values in response to increased nasal feedback. The nasalance scores for the 0% minimum nasal level feedback condition were significantly higher than the second 50% baseline nasal level feedback condition and the difference between scores for the 0% minimum nasal level feedback and final 50% nasal level feedback baseline condition only narrowly missed significance. However, the magnitudes of the increases in nasalance scores in response to decreased nasal level feedback were smaller than the magnitudes of the decreases in nasalance scores in response to increased nasal level feedback. It can only be speculated what caused the participants to react so selectively. The analysis of the loudness levels with the sine wave indicated that increasing the nasal level to the maximum setting had a slightly greater impact on the uncalibrated decibel values than decreasing the nasal level to the minimum (an additional uncalibrated 8 dB versus a drop of 6 dB). However, this does not explain the largely absent compensatory effect when the nasal signal was decreased. It is possible that the speakers may have relied on vibro-tactile sensations from the face and nasal passages to assure themselves that the nasal resonance was still present.

An alternative explanation for the difference in compensatory reactions to increased versus decreased nasal feedback is that listeners are less sensitive to hyponasality than to hypernasality. In clinical practice in Speech-Language Pathology, hyponasality is generally considered to be a less salient and less disabling feature of speech than hypernasality (Peterson-Falzone et al., 2001). As stated by Shprintzen et al. (1979, p. 54): “While hyponasal speech is not normal, it is far more desirable than hypernasal speech since the majority of consonant phonemes in the English language have no nasal resonance.” However, more research is needed to assess whether listeners’ supposed indifference to hyponasality in other speakers equally applies to their own speech.

Speakers of languages with more differentiated nasalization rules, such as French or Portuguese, may possibly show a stronger adaptive reaction to decreased nasality in auditory feedback. This would also be of interest to investigate in future research. Such research would demonstrate whether the relative prominence of nasalization in the phonology of a language influences a speaker’s ability to control his or her oral-nasal balance based on auditory feedback.

The current study had a number of limitations. Hearing level and nasal congestion were based on self-report. The speakers’ loudness, and by extension the amplitude they experienced through the headphones, was not regulated. In addition, while all the participants were fluent speakers of English, some of them may have spoken additional languages with different patterns of nasalization. It has been observed in pitch-shift speech adaptation research that there can be fluctuation in individual responses over time, while the group trends show a consistent accommodation reaction (Burnett et al., 1998; Larson et al., 2000; Patel et al., 2014). Since the present study outlined a speech adaptation effect that had not been previously described, the analysis was limited to group effects. The variability of individual reactions to changed auditory feedback of oral-nasal balance should be explored in future research. Finally, while the experimental protocol manipulated the oral-nasal balance through the headphones, the participants’ overall speaking volume, pitch and tempo were not monitored. More research will be needed to further clarify the effects found.

3.5 Conclusion

The present study provided initial evidence that auditory feedback plays a role in the control of oral-nasal balance in speech. The mostly one-sided response to the altered auditory feedback may indicate that speakers are more sensitive to an excess than to an absence of nasality in their speech. More research is needed to investigate the sensory-motor underpinnings of the observed effects in more detail and to investigate whether the speakers’ linguistic background has an impact on the magnitudes and the directions of the compensatory reaction.

3.6 Acknowledgements

The authors wish to thank Sheetal Ramaprasad, Bianca Cohn, Yaxin Liu, Roubina Sarkissian, Marika Loy and Karalina Lovkina for their assistance in extracting the nasalance scores.

Chapter 4 - Influence of Voice Focus on Oral-Nasal Balance in Speakers of Brazilian Portuguese

The contents of this chapter have been published in Folia Phoniatrica et Logopaedica, © 2016, S. Karger AG, Basel

De Boer, G., Marino, V., Berti, L., Fabron, E., & Bressmann, T. (2016). Influence of voice focus on oral-nasal balance in speakers of Brazilian Portuguese. Folia Phoniatrica et Logopaedica, 68(3), 152-158. doi: 10.1159/000452245

Contributions: G. de Boer was responsible for analyzing the data, as well as preparing and revising the manuscript. V. Marino, L. Berti and E. Fabron obtained local ethics approval, recruited the participants, collected the data and assisted with manuscript preparation and revisions. T. Bressmann supervised the project and aided with manuscript preparation and revision.

A link to the published paper can be found at: https://www.karger.com/Article/FullText/452245


4.0 Abstract

Objectives – The study investigated whether a change in speaking voice focus affects oral-nasal balance. The investigation was undertaken with different phonetic materials in speakers of Brazilian Portuguese, which features phonological and phonetic vowel nasalization.

Methods – Ten females read oral, balanced oral-nasal and nasal loaded sentences in their normal voice, and with a backward focus and a forward focus. Nasalance scores were collected with a Nasometer 6400.

Results – A Repeated Measures ANOVA of the nasalance scores demonstrated a significant main effect of speaking condition [F(2,18) = 12.87, p < .001]. The mean nasalance scores across the stimuli in the backward focus and normal speaking conditions were 36.85 (16.85) and 40.18 (18.02) respectively, both significantly lower than the forward focus condition at 45.38 (18.90).

Conclusion – The results demonstrated that speaking focus influences oral-nasal balance in normal speakers. In future research, it should be investigated whether voice focus can also modify oral-nasal balance in hypernasal speakers with cleft palate and other disorders.


4.1 Introduction

Voice focus is a concept from singing pedagogy that has been adapted to voice therapy. The shape and length of the vocal tract determine the timbre of the voice (Boone, 1997). A shortened vocal tract with a raised larynx, a forward tongue carriage and a narrowed pharynx results in a bright and juvenile vocal quality (forward focus). A vocal tract lengthened by lowering the larynx, carrying the tongue more posteriorly and widening the pharynx results in a dark and throaty voice quality (backward focus) (Boone, McFarlane, Von Berg & Zraick, 2010). The lengthening or shortening of the vocal tract changes the frequency distribution in the long-term average spectrum, in particular the second vowel formant (Sundberg & Nordström, 1976). In voice therapy, a balanced, central focus is the goal (Boone & McFarlane, 2000).

In a previous study, we investigated whether the speaking focus of the voice influences oral-nasal balance in speech. Oral-nasal balance is regulated by the velopharyngeal mechanism (Fritzell, 1969; Moon, Smith, Folkins, Lemke & Gartlan, 1994). Velopharyngeal closure is a task-dynamic process. The degree of velopharyngeal elevation can vary for different tasks such as swallowing, speaking, whistling or blowing (Moll, 1965; Flowers & Morris, 1973; Shprintzen, McCall, Skolnick & Lencione, 1975). The oral-nasal balance of different vowels varies systematically (Lewis, Watterson & Quint, 2000; Kummer, 2005; Gildersleeve-Neumann & Dalston, 2001; Awan, Omlar & Watts, 2011). Different speakers may also vary in their velopharyngeal closure patterns (Croft, Shprintzen & Rakoff, 1981).

If there is velopharyngeal dysfunction related to a structural or neurological disorder, the velopharyngeal mechanism may not close properly, leading to hypernasality. In hypernasal speech, too much air and sound are emitted through the nose, affecting intelligibility and acceptability of speech (Kummer, 2008). Velopharyngeal movement is difficult to influence voluntarily because the velopharyngeal sphincter offers no proprioception (Hixon et al., 2008). This poses a challenge in the behavioural speech therapy for patients with hypernasality. As a result, speech-language pathologists often refer patients with more than mild hypernasality for further surgical management with a pharyngeal flap surgery or for prosthodontic management with a speech bulb or palatal lift prosthesis (Sweeney, 2013).

Precisely because speakers have little voluntary or proprioceptive control over the state of their velopharyngeal closure mechanism (Kuehn & Moon, 1998), it is of interest whether global vocal tract adjustments such as voice focus can change oral-nasal balance. If consistent changes could be demonstrated, this could potentially open up new possibilities for speech therapy interventions for selected patients with hypernasality. This could be helpful for patients with mild degrees of hypernasality as well as patients learning to use a new pharyngeal flap or speech prosthesis. Based on a study involving computer modeling, Rong and Kuehn (2012) speculated that an expanded pharynx and a more anterior tongue position would improve the oral-nasal balance and the perception of hypernasality for the vowel /i/. Conversely, Bressmann, Anderson, Carmichael and Mellies (2012) described a speaker who reduced her hypernasality and her nasalance scores by adopting a forward focus (raised larynx and narrowed pharynx). The authors speculated that a narrowing of the pharynx may have facilitated velopharyngeal closure.

In a previous study (de Boer & Bressmann, 2016b), sixteen female normal speakers produced six test sentences without nasal sounds and one test sentence loaded with nasal sounds. The speakers produced the stimuli with their normal voice and with a backward focus and a forward focus. Audio-recordings and nasometry measurements were made (de Boer & Bressmann, 2016b). Based on a perceptual evaluation, which was corroborated with long-term average spectra, nine of the participants were able to complete the task successfully. A repeated-measures ANOVA demonstrated that the nasalance scores were influenced by the stimuli type (oral vs. nasal stimuli), which was expected. A follow-up ANOVA found a condition effect for the nasal stimulus only, showing that nasalance scores of the backward focus were lower, and those of the forward focus were higher than in the normal condition. This indicated that the selection of test sentences was not ideal because the effect of the voice focus appeared to be most pronounced on the stimulus including nasal speech sounds. It could be argued that a speaker opens his or her velopharyngeal sphincter intermittently for a nasal sentence. Therefore, a change in voice focus can affect the distribution of sound between the oral and the nasal cavities during these opening gestures. However, the velopharyngeal sphincter remains closed for the whole duration of an oral stimulus, so a forward or backward voice focus would only affect transpalatal resonances.

The goal of the present study was to expand on the previous research (de Boer & Bressmann, 2016b). In particular, we were interested in including more stimuli loaded with nasal sounds as well as with a balanced content of oral and nasal sounds. We also decided to direct the investigation to a different language, namely, Brazilian Portuguese. Brazilian Portuguese is characterized by both phonetic and phonological vowel nasalization. While English vowels tend to be nasalized only by assimilation (when they are adjacent to a nasal phoneme), Brazilian Portuguese vowels can be nasalized by assimilation in words such as “janela” [ʒɐ̃’nɛlɐ] (window) and “caneta” [kɐ̃’netɐ] (pen) as well as in otherwise oral phoneme sequences in words such as “lã” [´lɐ̃] (wool) and “pão” [´pɐ̃w] (bread) (Silva, 2007). If oral-nasal balance were influenced by voice focus, this effect should be particularly pronounced in Brazilian Portuguese. Based on the findings of the previous study (de Boer & Bressmann, 2016b) and the model predictions by Rong and Kuehn (2012), the first hypothesis was that the forward focus condition would yield higher nasalance scores than the normal speaking condition. The second hypothesis was that the backward focus condition would yield lower nasalance scores than the normal speaking condition.

Apart from the research hypotheses, it was expected, based on normative nasalance values, that nasal stimuli would have higher scores than balanced stimuli, which in turn would have higher scores than oral stimuli (Kummer, 2005). Nasalance scores were expected to be consistent across repetitions.

4.2 Methods

4.2.1 Participants

Ten female participants with a mean age of 22 y 6 m (SD 1.38) were recruited from the student population of the Fonoaudiologia program at the Universidade Estadual Paulista "Júlio de Mesquita Filho" in Marília, São Paulo State, Brazil. According to their self-report, the speakers had normal hearing, no history of hyper- or hyponasality, and no nasal congestion at the time of data collection. The absence of voice and speech disorders was verified perceptually by three experienced Speech-Language Pathologists (the second, third and fourth authors). All participants spoke Brazilian Portuguese with the accent common to Western São Paulo State. The research procedures were reviewed and approved by the Research Ethics Board at the UNESP Marília.

Participant sample size was determined based on previous research (de Boer & Bressmann, 2016b; Marino et al., 2016). Assuming a mean nasalance score of 49% (SD 5) for the nasal loaded passage O nenê in the normal speaking condition and a score of 54% (SD 5) for the forward focus, we determined a minimum sample size of 8 to achieve a power of .8 at a one-sided alpha of .05. A group of 10 speakers therefore appeared sufficient for the purposes of the present study.
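The sample size estimate can be approximated with a short calculation. The sketch below is not necessarily the procedure that was used: it assumes a one-sided paired comparison in which the standard deviation of the within-speaker differences is also 5 nasalance points (giving a standardized effect of 1.0), and it uses the normal approximation with a commonly used small-sample correction of z_alpha² / 2. The function name paired_sample_size is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(mean_diff, sd_diff, alpha=0.05, power=0.80):
    """Approximate n for a one-sided paired t-test (assumed design)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = z.inv_cdf(power)
    effect = mean_diff / sd_diff  # standardized mean difference
    # Normal approximation plus a small-sample correction term.
    n = ((z_alpha + z_beta) / effect) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


# 54% (forward focus) versus 49% (normal); SD of differences assumed to be 5.
print(paired_sample_size(mean_diff=54 - 49, sd_diff=5))  # 8
```

Under these assumptions the approximation gives 8 speakers, in line with the minimum sample size stated above. A different assumed correlation between conditions would change the result.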

4.2.2 Participant training

All recordings were made by the second, third and fourth authors. They demonstrated forward focus and backward focus and explained the concept of vocal tract settings. Particular attention was paid to constant speaking pitch in the different conditions. To achieve a forward focus, the participants were instructed to bring their tongue forward, raise their larynx, and narrow the pharynx. The backward focus required the participants to retract their tongue, lower the larynx and widen the pharynx, which was demonstrated and facilitated with a yawn-sigh (Boone et al., 2010; Boone & McFarlane, 2000). The participants were provided an opportunity to practice. The data collection began once the second, third and fourth authors found that the voice focus was being produced correctly.

4.2.3 Stimuli

The stimuli consisted of three sentences without nasal sounds, three sentences with a balanced content of oral and nasal sounds, and three sentences loaded with nasal consonants (see Appendix). The sentences were shorter versions of stimuli designed for the purpose of clinical nasalance assessment of oral-nasal balance in Brazilian Portuguese. Seven of these sentences were taken from stimuli designed by the second author (Marino et al., 2016) and the remaining two sentences were from Trindade et al. (1997). The shortened stimuli were used to make the task more manageable for the participants. The minimum length of stimuli for reliable nasalance assessment has been estimated to be six syllables (Watterson, Lewis & Foley-Homan, 1999). In clinical practice, higher than normative scores for oral stimuli (without nasal sounds) suggest hypernasality, while lower than normative scores for nasal loaded stimuli suggest hyponasality. For nasometric assessment, phonetically balanced stimuli are of limited use. However, they are a good indicator of oral-nasal balance in the speaker’s normal day-to-day connected speech. Therefore, balanced stimuli with oral and nasal sounds were also included in the present study.

All nine stimuli were shown to the participants on a computer screen presented at eye-level for easy reading. The order of the stimuli was randomized, and they were read three times for each speaking condition. The participants were asked to read the stimuli in the normal condition first. The order of the two remaining conditions (forward and backward focus) was randomized. If a participant made an error on a stimulus, they were asked to read it again.
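The presentation scheme described above can be sketched in a few lines. This is only an illustration: it assumes that each of the three repetitions within a condition used a fresh random order of the nine stimuli, which the text does not specify, and the function name build_schedule and the seed argument are hypothetical. The stimulus labels follow Table 4.1.

```python
import random

STIMULI = [
    "Oral 1", "Oral 2", "Oral 3",
    "Balanced 1", "Balanced 2", "Balanced 3",
    "Nasal 1", "Nasal 2", "Nasal 3",
]


def build_schedule(stimuli, repetitions=3, seed=None):
    """Return a list of (condition, repetition, stimulus) reading trials."""
    rng = random.Random(seed)
    # Normal voice always comes first; the two focus conditions follow in random order.
    conditions = ["normal"] + rng.sample(["forward", "backward"], k=2)
    schedule = []
    for condition in conditions:
        for repetition in range(1, repetitions + 1):
            order = list(stimuli)
            rng.shuffle(order)  # assumed: new random order for every repetition
            schedule.extend((condition, repetition, s) for s in order)
    return schedule


schedule = build_schedule(STIMULI, seed=1)
print(len(schedule))  # 81 trials: 3 conditions x 3 repetitions x 9 stimuli
```

Repeated readings after an error are not modelled here; they would simply be additional trials for the same stimulus.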

4.2.4 Recording Procedures

All the recordings took place in a sound-treated speech laboratory at the UNESP Marília. The participants were seated with their head in an ultrasound transducer stabilizer (Probe Stabilization Headset, Articulate Instruments, Edinburgh, UK). The separation plate of the headset of the Nasometer 6400 (Kay Pentax, Montvale, NJ) was attached to a custom holder and placed on the speaker’s prolabium. The nasometer was calibrated according to the manufacturer’s instructions on each day of recording. The nasalance sound recordings for each condition were saved to hard disk and measured after the session. The mean nasalance scores for the different test items were recorded.

During the recording, the ultrasound probe was held in a constant position under the participant’s chin. The video-feed of the participants’ midsagittal tongue was recorded using the Advanced Articulate Assistant hardware and software package (Articulate Instruments, Edinburgh, UK). During the live recording, the ultrasound videos served to confirm that the tongue was being held in a protruded (forward focus) or retracted (backward focus) position. For the purposes of the present study, no measurement of the ultrasound recordings was undertaken.

4.2.5 Data Analysis

Statistical analyses were completed using NCSS version 8.0 (NCSS, Kaysville, Utah). The nasalance scores were analyzed with repeated measures ANOVAs. Where Mauchly’s test statistic was significant, the Greenhouse-Geisser adjustment was used. Post-hoc testing was carried out with Bonferroni tests. The p-value for significance was .05.

4.3 Results

Due to a technical error, the nasalance scores of one participant for one of the oral sentences (Dudu) were missing. Below, the descriptive results include all available data while the inferential statistics were based on the remaining eight sentences with complete data for all participants. Table 4.1 shows the mean nasalance scores for the stimuli by speaking condition as well as averaged across speaking conditions. The box plots in Figure 4.1 illustrate the findings for the three types of stimuli in the three speaking conditions. Based on visual inspection, the scores from the forward condition appeared higher and the scores from the backward condition appeared lower than the normal speaking condition.

Table 4.1. Mean nasalance scores of 3 repetitions of 9 stimuli in 3 conditions (n = 10)

Stimuli                    Average          Nasalance by condition
                           nasalance        normal          forward          backward

Oral 1 (Dudu)              17.84 (7.51)     16.83 (4.92)    21.43 (10.03)    14.96 (4.75)a
Oral 2 (Gostou)            18.67 (9.15)     18.10 (6.60)    23.53 (11.42)    14.37 (6.26)
Oral 3 (Viu)               15.54 (8.04)     14.33 (4.63)    18.93 (11.43)    13.37 (5.29)

Balanced 1 (Arruma)        43.31 (8.34)     41.83 (5.67)    49.33 (8.60)     38.77 (6.82)
Balanced 2 (O cachorro)    38.62 (6.18)     39.07 (3.95)    42.03 (6.67)     34.77 (5.43)
Balanced 3 (Flavinho)      35.39 (7.08)     32.77 (5.21)    40.10 (7.61)     33.30 (5.87)

Nasal 1 (Miriam)           50.29 (8.24)     48.77 (6.09)    55.50 (9.68)     46.60 (5.75)
Nasal 2 (Monica)           63.98 (7.18)     64.70 (5.22)    68.10 (7.03)     59.13 (6.26)
Nasal 3 (O nene)           60.60 (7.44)     61.83 (5.63)    65.47 (5.61)     54.50 (6.50)

a n = 9, in Oral 1 in backward focus.
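The “Average nasalance” column of Table 4.1 appears to pool the three speaking conditions. A quick check, assuming the average is weighted by the number of speakers contributing to each condition (an assumption that is not stated in the text), reproduces the tabulated values for Oral 1, where one backward-focus data set is missing, and for Oral 2. The function name pooled_mean is hypothetical.

```python
def pooled_mean(means, ns):
    """Mean across conditions, weighted by the number of speakers per condition."""
    return sum(m * n for m, n in zip(means, ns)) / sum(ns)


# Condition means (normal, forward, backward) from Table 4.1.
oral_1 = pooled_mean([16.83, 21.43, 14.96], [10, 10, 9])   # n = 9 for backward focus
oral_2 = pooled_mean([18.10, 23.53, 14.37], [10, 10, 10])

print(round(oral_1, 2), round(oral_2, 2))  # 17.84 18.67
```

An unweighted mean of the three Oral 1 condition means would give 17.74 instead, which suggests that the missing data set was accounted for in the tabulated average.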

Figure 4.1 Boxplot of nasalance scores for 3 repetitions of 9 stimuli in 3 conditions (normal, forward focus and backward focus) (N=10)


A three-way repeated measures ANOVA of the nasalance scores was run for three repetitions of the eight stimuli across the three speaking conditions. There were significant main effects for stimuli F (9,63) = 240.29, p < .0001, and speaking condition F (2,18) = 12.87, p = 0.0021, but not for repetition F (1,9) = 0.48, p = 0.6263. There were no interaction effects.

Across conditions and repetitions, post hoc Bonferroni all-pairwise multiple comparison tests indicated that the stimuli within each oral-nasal balance category (oral, balanced, nasal) were significantly different from the stimuli of the other oral-nasal balance categories. In addition, balanced sentence 3 (Flavinho) was significantly lower than balanced sentence 1 (Arruma), and nasal sentence 1 (Miriam) was significantly lower than both nasal sentence 2 (Monica) and nasal sentence 3 (O nene) (all p<0.05).

Across stimuli and repetitions, the post hoc Bonferroni all-pairwise multiple comparison test found that the backward focus and normal focus conditions generated the lowest mean nasalance scores of 38.85% (SD 16.85) and 40.18% (SD 18.02) respectively, while the forward focus had the highest mean nasalance score of 45.38% (SD 18.90). The backward and normal conditions were both significantly different from the forward condition but not from each other (all p<0.05).

4.4 Discussion

The current study expanded a previous investigation of the effect of voice focus on oral-nasal balance (de Boer & Bressmann, 2016b). The study was guided by two hypotheses. The first hypothesis stated that the forward focus condition would yield higher nasalance scores than the normal speaking condition. This hypothesis was confirmed for all three sets of stimuli (oral, balanced and nasal).

The second hypothesis stated that the backward focus condition would yield lower nasalance scores than the normal speaking condition. This hypothesis was not confirmed. The post-hoc tests for the repeated measures ANOVA indicated the difference between the normal and the backward focus speaking conditions did not reach a p-value of 0.05. However, the mean nasalance values of the stimuli for the backward focus speaking condition were numerically lower. To determine the p-value of the difference between the normal and backward focus speaking conditions, a separate t-test was run. It suggested a trend towards significance (p<0.07). It is possible that this difference would have been significant in a larger group of speakers.

As expected, there was a stimulus effect whereby nasal stimuli had higher scores than balanced stimuli, which in turn had higher scores than oral stimuli. Marino et al. (2016) provided normative nasalance scores by age and gender. The data from the current study could therefore be compared to the normative data for young adult females. In the normal condition, the mean nasalance scores for the oral stimuli ranged from 14.33 (4.63) to 18.10 (6.60). This was slightly higher than the normative score of 14.04 (3.82) reported for a comparable oral passage (Marino et al., 2016). The mean scores for the balanced oral-nasal stimuli ranged from 32.77 (5.21) to 41.83 (5.67), which was greater than the reported normative mean of 26.90 (4.00) for a balanced oral-nasal stimulus. The nasal stimuli means ranged from 48.77 (6.09) to 64.70 (5.22), which was higher than the normative mean nasalance score of 49.37 (4.69) reported for a nasal stimulus. The differences can possibly be attributed in part to differences in the sentence stimuli, which were used in shortened versions in the present study. It has also been noted that nasalance group means show some intrinsic variation between studies and even within speakers (de Boer & Bressmann, 2014, 2015). While the nasalance values measured in the present study appeared higher than the suggested norms, it is unlikely that this difference would have affected the quality of the results. Within the repeated measures design, the participants’ nasalance values in the normal speaking condition served as their own control condition for the comparison with the forward and backward speaking conditions. The results of the ANOVA suggested that the changes in nasalance based on the speaking condition were consistent across speakers.

The participants of this study and those of the previous pilot study were all female. Although males have larger vocal tracts than females, the differences in nasalance scores are not always significant (de Boer & Bressmann, 2015; Hirschberg et al., 2006; Lee & Brown, 2013; Prathanee, Thanaviratananich, Pongjunyakul & Rengpatanakij, 2003; Sweeney, Sell & O’Regan, 2004; Trindade et al., 1997). When significant differences are found by gender, they tend to be small and not clinically meaningful (Brunnegard & Van Doorn, 2009; Marino et al., 2016; Mayo & Mayo, 2011; Mishima, Sugii, Yamada, Imura & Sugahara, 2008; Seaver, Dalston, Leeper & Adams, 1991; Van Lierde, Wuyts, De Bodt & Van Cauwenberge, 2001). Nevertheless, more research will be needed to confirm the observed effect for male speakers.

The results expanded the previous findings from a group of Canadian English speakers (de Boer & Bressmann, 2016b) to participants speaking Brazilian Portuguese. An important improvement of the current study was the inclusion of speech stimuli with a balanced phonetic content. Nasalance testing of clinical speakers usually focuses on oral and nasally loaded stimuli, which represent phonetic extremes. However, normal connected speech has a more balanced phonetic content. The oral-nasal balance requirements of conversational speech are therefore more accurately represented by the balanced stimuli.

In the previous study (de Boer & Bressmann, 2016b), the success of the speakers in moving their speaking focus forward or backwards was analyzed using long-term average spectra, which demonstrated an upward or downward movement, respectively, of the first spectral peak, similar to the spectral changes described by Sundberg and Nordström (1976) for a single trained singer. In the present study, the speakers’ productions were not analyzed acoustically. The participants all had training in phonetics and vocal tract anatomy. In addition, their productions were verified perceptually by the three experienced speech-language pathologists carrying out the experiment. Since the changes in nasalance values were consistent across the participants for the different speaking conditions, no additional external validation was undertaken. In future research, it would be interesting to investigate the acoustic consequences of changes of the speaking focus in more detail.

These limitations notwithstanding, the present study successfully corroborated the findings from the pilot study (de Boer & Bressmann, 2016b). The findings confirmed that voice focus can affect oral-nasal balance, especially in speech stimuli that are balanced or loaded with nasal sounds. The effect has now been shown in two languages, one with and one without phonological vowel nasalization. Future studies are needed to explore the effect of voice focus in other languages, as well as to verify that a similar effect is observed in male speakers. As a next step, it will be of interest to explore to what extent speaking focus adjustments could be used to improve the oral-nasal balance of hypernasal speakers with cleft palate. Based on the findings from this study as well as the previous study (de Boer & Bressmann, 2016b) and a computer model (Rong & Kuehn, 2012), it would be expected that the backward speaking focus should reduce nasalance scores and perceived hypernasality. It remains to be seen whether it is possible for hypernasal speakers with cleft palate to use speaking focus adjustments to achieve changes in oral-nasal balance that would be measurable and perceptually relevant.


4.5 Acknowledgements

The study was supported by São Paulo Research Foundation (FAPESP, grants No. 2012/23899-6 and 2016/01583-8). The authors wish to thank the participants of this study.

4.6 Disclosure statement

The authors declare no conflicts of interest.

4.7 Appendix - Stimuli

Speech stimuli based on Marino et al. (2016) and Trindade et al. (1997) (indicated with *).

Oral

Dudu visitou o bosque.

Viu o pulo do sapo.

Gostou do peixe.

Balanced

O cachorro do Nino.

Arruma seu bercinho.

Flavinho chamou o João.*

Nasal

Monica mima o nenê.

Miriam lambeu o limão.*

O nenê mama.


Chapter 5 - Conclusions

The studies presented in this thesis aimed to advance the assessment and treatment of oral-nasal balance disorders and our understanding of oral-nasal balance control.

5.1 Classification of oral-nasal balance with Long-Term Averaged Spectra

5.1.1 Study summary

The reliability of perceptual assessments of oral-nasal balance is known to be problematic. The objective of the first study (de Boer & Bressmann, 2016a) was to investigate a new quantitative acoustic assessment procedure for hypernasality and other oral-nasal balance disorders. Audio recordings of normal speech and simulated hypernasal, hyponasal and mixed nasality speech were analyzed with Long-Term Averaged Spectra and then with a linear discriminant analysis.

The resulting formulas were able to successfully classify 80.7% of the speech samples based on their z-score corrected amplitudes at specific frequencies. This was slightly lower but comparable to the classification success of the discriminant functions based on nasalance scores (de Boer & Bressmann, 2015). Most of the drop in classification accuracy was between the normal speech condition and the simulation for hyponasal speech.
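The z-score correction mentioned above can be illustrated with a minimal Python sketch. It assumes that the LTAS of each recording is available as a list of bin levels in dB, and that the z-scores are computed across the bins of a single recording using the population standard deviation. Both choices, the function name `zscore_ltas` and the example levels are illustrative assumptions and not the procedure reported in de Boer and Bressmann (2016a).

```python
from statistics import mean, pstdev


def zscore_ltas(bin_levels_db):
    """Return z-scores for the LTAS bin levels (dB) of one recording."""
    mu = mean(bin_levels_db)
    sigma = pstdev(bin_levels_db)  # assumption: population SD across bins
    if sigma == 0:
        raise ValueError("flat spectrum: z-scores are undefined")
    return [(level - mu) / sigma for level in bin_levels_db]


# Hypothetical LTAS levels (dB) in five frequency bins for one recording,
# and the same spectral shape recorded 6 dB louder.
quiet = [52.0, 48.0, 41.0, 35.0, 30.0]
loud = [level + 6.0 for level in quiet]

z_quiet = zscore_ltas(quiet)
z_loud = zscore_ltas(loud)

# After the correction, the two recordings have the same profile.
same_profile = all(abs(a - b) < 1e-9 for a, b in zip(z_quiet, z_loud))
```

Under these assumptions, the overall recording level drops out and only the shape of the spectrum remains as input for a classifier.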

5.1.2 Study implications

For LTAS to be a useful adjunct to clinical assessment, more simulation data would need to be acquired, including males and children. These could then be compared to clinical speech samples from speakers with cleft lip and/or palate. Be it with nasometry or LTAS, we believe that a classification system that monitors both hypernasality and hyponasality could add important quantitative information to a clinician’s auditory-perceptual impression. The primary advantage of a quantitative acoustic measure of oral-nasal balance is that norms can be established and the process standardized. The same classification approach could be applied in different cleft centres. The acoustic measure would not be influenced as easily as the auditory-perceptual assessment by extraneous factors such as the facial appearance of the speaker (Glass & Starr, 1979; Podol & Salvia, 1976), experience of the listener (Brunnegård et al., 2012; Chapman et al., 2016; John et al., 2006; Lee et al., 2009) or general effects of fatigue over the course of the work day (Danziger, Levav, & Avnaim-Pesso, 2011).

5.1.3 Limitations

As is the case for all the studies included in this thesis, all speakers were adult females. The formulas are therefore only applicable to other adult females. The acoustic analysis was based on simulated hyper- and/or hyponasality, and the number of participants was small.

In a small explorative follow-up study, six male speakers were recruited. Only three of them could produce a consistent hypernasal resonance. Therefore, the six speakers’ recordings were instead analyzed for acoustic features which differentiated their normal speech from their simulations of hyponasality. The expected key feature of the hyponasal speaking condition is reduced amplitude for nasal sounds. It is possible that by normalizing the amplitude of the LTAS frequency bins with z-scores, the differentiating features between the hyponasal and normal conditions may have been diminished. Albeit based on only six participants, the best differentiator that could be found was amplitude variation. The hyponasal condition had greater amplitude variation (as measured with standard deviations of the uncalibrated decibels) than the normal condition. On a phonemic level, visual inspection of the waveforms showed that the amplitude of the stimuli of the hyponasal condition dropped every time a nasal sound was produced. The acoustic features of hyponasality should be explored in more research in order to better characterize this condition.
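The amplitude variation measure described above can be sketched as the standard deviation of frame-by-frame levels in uncalibrated decibels. The following Python example is a minimal illustration under stated assumptions: the frame length, the function names and the synthetic test signals are hypothetical and do not reproduce the analysis settings of the follow-up study.

```python
import math
from statistics import stdev


def frame_levels_db(samples, frame_len):
    """RMS level (uncalibrated dB) of consecutive non-overlapping frames."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        levels.append(20.0 * math.log10(max(rms, 1e-12)))  # floor avoids log(0)
    return levels


def amplitude_variation_db(samples, frame_len=100):
    """Standard deviation of the frame levels, in dB."""
    return stdev(frame_levels_db(samples, frame_len))


def tone(amplitudes, frame_len=100, period=20):
    """Synthetic sine wave whose amplitude is set frame by frame."""
    out = []
    for amp in amplitudes:
        out.extend(amp * math.sin(2 * math.pi * n / period)
                   for n in range(frame_len))
    return out


# Hypothetical signals: a steady level versus a level that dips in every
# second frame, mimicking amplitude drops on nasal sounds.
steady = tone([1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
dipping = tone([1.0, 0.2, 1.0, 0.2, 1.0, 0.2])

steady_variation = amplitude_variation_db(steady)
dipping_variation = amplitude_variation_db(dipping)
```

In this toy example, the signal with recurring amplitude drops yields the larger standard deviation, which is the direction of the difference observed between the hyponasal and normal conditions.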

5.1.4 Future directions

The classification of simulated oral-nasal balance disorders with nasometry (de Boer & Bressmann, 2015; Master’s study) and with LTAS (de Boer & Bressmann, 2016a; thesis study 1) achieved high classification accuracy in both cases, 88.6% and 80.7% respectively. Currently, classification with nasometry scores is the better of the two options. Nasalance norms are available in multiple languages and there are minimal (if any) effects of gender or age. In addition, nasometry can distinguish between normal and hyponasal oral-nasal balance.

The expansion of an acoustic classification approach to clinical populations with oral-nasal balance disorders due to clefts of the lip and/or palate has begun with a retrospective study of nasometry scores from the Hospital for the Rehabilitation of Craniofacial Anomalies in Bauru, São Paulo state, Brazil. The collaborators for this project were Dr. Viviane Marino at the State University of São Paulo, and Drs. Jeniffer Dutka and Maria-Ines Pegoraro-Krook at the University of São Paulo. Speakers were classified as normal, hypernasal, hyponasal and mixed based solely on their nasalance scores, using the Brazilian cutoff scores for hyper- and hyponasality (Trindade et al., 1997). The speakers’ audio recordings were then rated for presence and severity of oral-nasal balance disorders by three experienced and three inexperienced listeners. Preliminary analyses indicate that the agreement between the nasalance categories and the listener perceptual categories was low, as was the inter-listener perceptual category agreement. The manuscript is currently in preparation.
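A cutoff-based classification of the kind described above could be sketched as follows. The decision rule is an assumption made for illustration: a score on an oral stimulus above a hypernasality cutoff is treated as hypernasal, a score on a nasal stimulus below a hyponasality cutoff is treated as hyponasal, and both together as mixed. The cutoff values and the function name are placeholders and are not the values published by Trindade et al. (1997).

```python
def classify_oral_nasal_balance(oral_score, nasal_score,
                                hyper_cutoff, hypo_cutoff):
    """Assign a category from two nasalance scores (in percent).

    oral_score:  nasalance for an oral stimulus
    nasal_score: nasalance for a nasal-loaded stimulus
    """
    too_nasal = oral_score > hyper_cutoff
    not_nasal_enough = nasal_score < hypo_cutoff
    if too_nasal and not_nasal_enough:
        return "mixed"
    if too_nasal:
        return "hypernasal"
    if not_nasal_enough:
        return "hyponasal"
    return "normal"


# Placeholder cutoffs, chosen only to make the example concrete.
HYPER_CUTOFF = 30.0
HYPO_CUTOFF = 40.0

# Hypothetical speakers: (oral stimulus score, nasal stimulus score).
speakers = {
    "A": (12.0, 55.0),
    "B": (45.0, 60.0),
    "C": (10.0, 25.0),
    "D": (38.0, 35.0),
}

categories = {
    name: classify_oral_nasal_balance(oral, nasal, HYPER_CUTOFF, HYPO_CUTOFF)
    for name, (oral, nasal) in speakers.items()
}
```

A rule of this form has no listener in the loop, so any disagreement with perceptual ratings reflects the ratings and the cutoffs rather than variability in the classification step itself.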


With low inter-listener agreement, the case for a classification system based on acoustics, such as nasometry, is much stronger. Nasometry has very high test-retest reliability, even between Nasometer models (Awan et al., 2011; Bressmann et al., 2005; de Boer & Bressmann, 2014).

More research using nasometry scores to classify oral-nasal balance will be needed. Future studies should be prospective and include a greater number of listeners. Once a large number of audio and nasometric recordings associated with each oral-nasal balance category have been assembled, the audio files can be analyzed with LTAS. In turn, LTAS can become a clinically efficient assessment tool.

5.2 Effect of altered nasal auditory feedback on oral-nasal balance

5.2.1 Study summary

Based on current speech models, such as the DIVA model (Guenther, 2006; Tourville & Guenther, 2011), speech is partially controlled through a combination of auditory and somatosensory feedback. The objective of the second study was to begin investigating the role of auditory feedback in the control of oral-nasal balance in speech. Participants were asked to repeat a sentence continuously, while their speech was played back to them over headphones. When the proportion of nasal sound through the headphones increased, the participants’ nasalance scores decreased. When the proportion of nasal sound through the headphones decreased, there was a smaller and less consistent increase in nasalance scores.
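For readers less familiar with nasometry, a nasalance score is commonly computed as the nasal share of the combined oral and nasal acoustic energy, expressed as a percentage. The short Python sketch below shows this ratio and a simple way of expressing a compensatory response as the change from a speaker's own baseline. The function names and all numbers are hypothetical and do not represent data from the second study.

```python
def nasalance(nasal_energy, oral_energy):
    """Nasalance in percent: nasal / (nasal + oral) * 100."""
    total = nasal_energy + oral_energy
    if total <= 0:
        raise ValueError("no acoustic energy recorded")
    return 100.0 * nasal_energy / total


def compensation(baseline_score, altered_score):
    """Change in nasalance relative to the speaker's own baseline."""
    return altered_score - baseline_score


# Hypothetical speaker: baseline production, then production while hearing
# feedback with an increased proportion of nasal sound.
baseline = nasalance(nasal_energy=3.0, oral_energy=7.0)
with_more_nasal_feedback = nasalance(nasal_energy=2.0, oral_energy=8.0)

# A negative value means the speaker became less nasal, i.e. the response
# opposed the direction of the feedback change.
change = compensation(baseline, with_more_nasal_feedback)
```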

5.2.2 Study implications

The second study provided initial evidence about the importance of auditory feedback for the control of oral-nasal balance. The reduced compensatory response to decreased nasal feedback was unexpected. There are several possible explanations. One is that speakers of English, or speakers in general, are less perceptually aware of or sensitive to an absence of nasality in their speech than to an excess of nasality. This is also reflected in clinical practice where SLPs tend to focus on hypernasality in speech disorders while hyponasality is not considered as important a problem (Shprintzen, Lewin, & Croft, 1979). With regard to the first study in this thesis, this finding underlines the point that if listeners have difficulty perceiving hyponasality, a quantitative acoustic measure of oral-nasal balance could be helpful to SLPs in clinical practice.

While this study focused on the role of auditory feedback, speech is mediated by both auditory and somatosensory feedback (Guenther, 2006; Hickok et al., 2011; Tourville & Guenther, 2011).

The inconsistent and minimal response to the reduced nasal feedback condition may be due to somatosensory information which overrode the auditory signal coming through the headphones.

The speakers may have felt the low frequency vibrations from their nasal sound productions in their face and nasal cavities. Alternatively, as the levator veli palatini has been shown to respond to changes in nasal airflow and/or intra-oral pressure (Tachimura et al., 1995), the increase in nasal airflow and drop in intra-oral air pressure during the nasal sound productions may have been sufficient feedback to override the perception of reduced nasality in the auditory feedback.

Finally, the speakers may have had enough proprioceptive awareness of their velopharyngeal sphincter to determine the port was sufficiently open for nasal sound production when the reduced nasality in the auditory feedback implied it was not. However, it is then unclear why the same mechanism would not apply to the response to increased nasal feedback.


5.2.3 Limitations

The participants were all adult females. However, other studies using altered feedback have relied on mostly or exclusively female participants (Jones & Munhall, 2003; Larson, Burnett, Kiran, & Hain, 2000; Mitsuya, MacDonald, Purcell, & Munhall, 2011; Mitsuya, Samson, Ménard, & Munhall, 2013; Munhall, MacDonald, Byrne, & Johnsrude, 2009; Patel et al., 2014; Purcell & Munhall, 2006). Even when the genders are evenly represented, effects of gender are sometimes not considered relevant for the analysis (Ghosh et al., 2010; Larson, Altman, Liu, & Hain, 2008; Lui, Chen, Larson, Huang, & Liu, 2010; Villacorta, Perkell, & Guenther, 2007). An exception to this trend is found in studies of delayed auditory feedback (e.g., Swink & Stuart, 2012). Another limitation was that the normal hearing inclusion criterion was based on self-report. None of the participants remarked that the sound coming through the headphones was too loud or not loud enough. Nevertheless, it may be worthwhile to include more stringent hearing testing in future research. In addition, while the participants spoke English, there was no control for any additional languages they may have spoken, i.e., languages with phonological nasalization. We do not yet have any evidence that competence in additional languages would have altered the participants’ response to altered perception of oral-nasal balance.

The participants’ speaking amplitude was not compared to the changes in amplitude in the headphones. Based on the sine wave experiment, the overall volume would be lower in the minimal nasal feedback condition and higher in the maximum nasal feedback condition. It is not known if the participants spoke louder or quieter to compensate for these changes (Lane & Tranel, 1971; Siegel & Pick, 1974). If there was no change in speaking amplitude, the amplitude through the headphones in the minimal nasal feedback condition would have been lower than in the maximum feedback condition. This may explain in part why the compensatory response was smaller and less consistent than in the maximum nasal feedback condition.

5.2.4 Future directions

The second study showed a partial compensatory response to altered oral-nasal feedback, opening many possible avenues for future investigation. Since all the changes to feedback were gradual, there was no demonstration of motor learning and adaptation. To test for adaptation in altered auditory feedback studies, the experimenters will typically return the altered feedback abruptly from the extreme feedback condition to the baseline-control condition or replace the altered feedback with noise (Houde & Jordan, 1998; Jones & Munhall, 2003; Mitsuya et al., 2013; Mitsuya et al., 2011). If the first few subsequent utterances are similar to those before the shift in feedback, then adaptation to the altered feedback has been demonstrated (Houde & Jordan, 1998). Linguistic awareness of nasality is also a factor of interest. The study has since been repeated with male and female speakers of Brazilian Portuguese, a language with phonological vowel nasalization. The results were similar to those reported in this thesis. A significant effect of auditory feedback condition with a compensatory response to increased nasality and a smaller and inconsistent response to decreased nasality were found. In addition, there was no significant effect of gender, nor were there gender interaction effects.
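The adaptation test outlined above can be made concrete with a small Python sketch. It uses a deliberately simple decision rule that is an assumption of this example and not a published criterion: the mean nasalance of the first few utterances after the abrupt return to unaltered feedback is compared with the mean at the end of the altered feedback phase and with the baseline mean. All names and scores are hypothetical.

```python
from statistics import mean


def shows_after_effect(baseline, end_of_shift, after_return, n_first=3):
    """True if the first utterances after the feedback returns to normal
    are closer to the end-of-shift productions than to the baseline."""
    if len(after_return) < n_first:
        raise ValueError("not enough utterances after the return")
    early = mean(after_return[:n_first])
    return abs(early - mean(end_of_shift)) < abs(early - mean(baseline))


# Hypothetical nasalance scores (percent) per utterance.
baseline = [30.0, 31.0, 29.0]
end_of_shift = [22.0, 21.0, 23.0]

# Speaker 1 stays near the shifted productions before drifting back.
speaker_1 = [23.0, 24.0, 25.0, 29.0, 30.0]
# Speaker 2 snaps back to baseline as soon as the feedback is restored.
speaker_2 = [30.0, 29.0, 31.0, 30.0, 30.0]

adapted_1 = shows_after_effect(baseline, end_of_shift, speaker_1)
adapted_2 = shows_after_effect(baseline, end_of_shift, speaker_2)
```

A real analysis would use a statistical comparison rather than a distance rule, but the sketch shows which three sets of utterances such a test needs to contrast.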

The phonetic make-up of the stimulus may also have had an influence on the results. In the control condition, the “hamper sentence” generated nasalance scores similar to the Rainbow Passage, which is considered to be representative of the nasality of day-to-day English speech (Fairbanks, 1960; de Boer & Bressmann, 2014). However, this sentence (and those used with the speakers of Brazilian Portuguese) may not have been nasal enough. A stronger response to decreased auditory nasal feedback might be found with a stimulus loaded with even more nasal sounds.

It will also be important to find out why the compensation to increased auditory nasal feedback was greater than to decreased auditory nasal feedback. The digital recordings from the Brazilian Portuguese speakers are currently being analyzed for relative amplitude changes. Future experimental methods should ensure that the amplitude perceived through the headphones remains constant. This would account for the possibility that the decreased and inconsistent response to the minimal nasal feedback condition was influenced by lower amplitude of the signal through the headphones.

Finally, there is a possibility that vibro-tactile sensations from the nasal sound productions contributed to a somatosensory oral-nasal balance feedback mechanism. This hypothesis could be tested by repeating the experimental protocol with the application of vibratory “noise” to the face. If somatosensory feedback is a significant component of the motor control of oral-nasal balance, then blocking that feedback by adding somatosensory noise should result in greater reliance on auditory feedback.

As for clinical applications, it would be interesting to investigate whether altered oral-nasal balance feedback could serve as a treatment for hypernasality. Hypothetically speaking, an individual with hypernasality who receives additional auditory nasal feedback might reduce the nasality of their speech and their nasalance scores and be perceived to have milder hypernasality.

It would be important to determine whether such an effect would be limited to clients who are physically capable of velopharyngeal closure (or near-closure) or whether speakers with more marked velopharyngeal dysfunction could also benefit.

5.3 Impact of voice focus on the oral-nasal balance of speakers of Brazilian Portuguese

5.3.1 Study summary

The standard treatment for hypernasality due to cleft palate is surgery (Gart & Gossain, 2014; Zajac & Vallino, 2017). However, for clients with inconsistent mild to moderate hypernasality, speech therapy may be an option (Ruscello, 2007; Zajac & Vallino, 2017). Speech therapy for hypernasality is hampered, in part, by the speakers’ limited access to conscious proprioception of their velopharyngeal sphincter, whose actions are hidden from view. The objective of the third study was to explore how changes to the vocal tract settings, with forward and backward voice focus, would affect oral-nasal balance. A pilot study with normal speakers of Canadian English (de Boer & Bressmann, 2016b) suggested that, depending on the stimuli, voice focus impacts oral-nasal balance. The third study replicated the pilot work (de Boer & Bressmann, 2016b) with female speakers of Brazilian Portuguese and an even number of oral, balanced and nasal-loaded stimuli. Brazilian Portuguese features phonological and phonetic vowel nasalization. The ten speakers read the stimuli into a Nasometer 6400 with their normal voice, and with a backward focus and a forward focus. Across all the stimuli, the forward focus speaking condition produced the highest nasalance scores. The scores from the backward focus condition were numerically lower than the normal condition but did not reach statistical significance (p=.07).

5.3.2 Study implications

The third study demonstrated that voice focus affected oral-nasal balance in typical adult female speakers. The forward focus increased nasalance scores and the backward focus numerically decreased scores. It is known that nasalance scores of different vowels vary systematically. High vowels generate higher nasalance scores and low vowels generate lower nasalance scores (Awan et al., 2011; Gildersleeve-Neumann & Dalston, 2001; Lewis et al., 2000). With the forward focus and a typical vowel /i/, a fronted tongue reduces the volume of the oral cavity, increasing oral impedance, and directing more acoustic energy to the nasal cavities (Rong, Kuehn, & Shosted, 2016).

For the normal speakers, significant increases in nasalance were achieved with a forward focus, which is not clinically useful. Therapists and their clients are seeking strategies to reduce nasality. The current study, as well as the model by Rong and Kuehn (2012), indicates that backward focus may be a promising strategy to pursue. However, in a case study of a speaker with hypernasality, it was an extreme forward focus that reduced nasalance scores and the perception of hypernasality (Bressmann et al., 2012). It will be interesting to see which voice focus adjustment will prove to be more advantageous to speakers with hypernasality.

5.3.3 Limitations

Although the group was small, it was enough to demonstrate a proof of concept. The ten participants were adult females training to become Speech-Language Pathologists. Therefore, they likely had more knowledge and awareness of their vocal tracts than a typical speaker. In addition, for their data collection, three experimenters were present (LB, VM, EF). The participant training may have been more comprehensive than in the voice focus pilot study where the first author alone provided the training (de Boer & Bressmann, 2016b).

While the overall effect of voice focus with children and adult males is expected to be similar, this has not yet been confirmed. There were audio and nasalance recordings of the speech and ultrasound recordings of tongue movement, but any possible effect of voice focus on velopharyngeal closure can only be inferred.

5.3.4 Future directions

The logical next steps would be to repeat the study with male speakers, to rule out an effect of gender. More research will also be needed with children, as they are the most likely recipients of speech therapy using voice focus adjustments. In a current collaborative study, Dr. Viviane Marino has begun teaching voice focus to clients with hypernasality due to cleft palate. So far, nasalance scores have been collected from four participants aged 9 to 50. In the normal condition, the participants’ scores were higher than normative values. Three of the participants showed higher scores for forward focus. For the backward focus, scores similar to the normal condition or lower scores were found. These three participants followed the same pattern as found in the third study (de Boer et al., 2016). The fourth participant had lower scores in the forward focus condition, as was seen in Bressmann et al.’s (2012) case study. Perhaps, for this fourth participant and Bressmann et al.’s (2012) case study, the narrowing of the larynx involved lateral pharyngeal wall movement aiding velopharyngeal closure (Ysunza et al., 1997). Future research should include nasoendoscopy, so that the impact of voice focus adjustments on velopharyngeal closure can be monitored.

5.4 Closing statement

Taken together, the studies completed towards this thesis addressed a number of open research questions regarding oral-nasal balance in speech. A prototype for a new inexpensive quantitative assessment procedure of oral-nasal balance was developed. It was confirmed that auditory feedback has a potential role in controlling oral-nasal balance, in particular for stimuli that sound more nasal than normal. Last, but not least, voice focus was revealed to modify oral-nasal balance. Many new avenues for further research in typical speakers and speakers with oral-nasal balance disorders have been identified as a result.


References

Abe, M., Murakami, G., Noguchi, M., Kitamura, S., Shimada, K., & Kohama, G. I. (2004). Variations in the tensor veli palatini muscle with special reference to its origin and insertion. The Cleft Palate-Craniofacial Journal, 41(5), 474-484.

Åbyholm, F., D'Antonio, L., Davidson Ward, S. L., Kjøll, L., Saeed, M., Shaw, W., … Wyatt, R. (2005). Pharyngeal flap and sphincterplasty for velopharyngeal insufficiency have equal outcome at 1 year postoperatively: Results of a Randomized Trial. The Cleft Palate-Craniofacial Journal, 42(5), 501-511. doi: 10.1597/03-148.1

Andrews, J., & Rutherford, D. (1972). Contribution of nasally emitted sound to the perception of hypernasality of vowels. The Cleft Palate Journal, 9(2), 147–156.

Aras, I., Olmez, S., & Dogan, S. (2012). Comparative evaluation of nasopharyngeal airways of unilateral cleft lip and palate patients using three-dimensional and two-dimensional methods. The Cleft Palate-Craniofacial Journal, 49(6), e75-e81. doi: 10.1597/12-004

Awan, S. N., Omlar, K., & Watts, C. R. (2011). Effects of computer system and vowel loading on measures of nasalance. Journal of Speech Language and Hearing Research, 54, 1284-1294. doi: 10.1044/1092-4388(2011/10-0201)

Barsoumian, R., Kuehn, D. P., Moon, J. B., & Canady, J. W. (1998). An anatomic study of the tensor veli palatini and dilatator tubae muscles in relation to eustachian tube and velar function. The Cleft Palate-Craniofacial Journal, 35(2), 101-110. doi: 10.1597/1545-1569(1998)035<0101:AASOTT>2.3.CO;2

Baylis, A., Chapman, K., & Whitehill, T. L. (2015). The Americleft Speech Group. Validity and reliability of visual analog scaling for assessment of hypernasality and audible nasal emission in children with repaired cleft palate. The Cleft Palate-Craniofacial Journal, 52(6), 660-70. doi: 10.1597/14-040

Bell-Berti, F. (1976). An electromyographic study of velopharyngeal function in speech. Journal of Speech, Language, and Hearing Research, 19(2), 225-240. doi: 10.1044/jshr.1902.225

Beukelman, D. R., Fager, S., Green, J., Hakel, M., & Marshall, J. (2004). Nasal obturator for velopharyngeal dysfunction in dysarthria: Technical report on a one-way valve. Journal of Medical Speech-Language Pathology, 12(4), 155-159.

Billmire, D. A. (2008). Surgical management of clefts and velopharyngeal dysfunction. In A. Kummer (Ed.), Cleft palate and craniofacial anomalies – Effects on speech and resonance (2nd ed.) (pp. 508-540). New York, NY: Delmar Cengage Learning.

Boersma, P., & Weenink, D. (2014). Praat: doing phonetics by computer (Version 5.3.63) [Computer software]. Retrieved from http://www.praat.org/

Boone, D. R. (1997). Is your voice telling on you? (2nd ed). San Diego, CA: Singular.

Boone, D. R., & McFarlane, S. C. (2000). The voice and voice therapy (6th ed). Boston, MA: Allyn and Bacon.

Boone, D. R., McFarlane, S. C., Von Berg, S. L., & Zraick, R. I. (2010). The voice and voice therapy (8th ed). New York, NY: Allyn & Bacon.

Botvinick, M., & Cohen, J. (1998). Rubber hands 'feel' touch that eyes see. Nature, 391(6669), 756.

Brancamp, T. U., Lewis, K. E., & Watterson, T. (2010). The relationship between nasalance scores and nasality ratings obtained with equal appearing interval and direct magnitude estimation scaling methods. The Cleft Palate-Craniofacial Journal, 47, 631–637. doi: 10.1597/09-106

Bressmann, T. (2012). An ultrasonographic study of lingual contortion speech. Journal of Phonetics, 40, 830-836. doi: 10.1016/j.wocn.2012.08.002

Bressmann, T., Anderson, J. D., Carmichael, R. P., & Mellies, C. (2012). Prosthodontic management of hypernasality: Two very different cases. Canadian Journal of Speech-Language Pathology and Audiology, 36(1), 50-57.

Bressmann, T., Klaiman, P., & Fischbach, S. (2006). Same noses, different nasalance scores: Data from normal subjects and cleft palate speakers for three systems for nasalance analysis. Clinical Linguistics and Phonetics, 20, 163-170. doi: 10.1080/02699200500270689

Brunnegård, K., & van Doorn, J. (2009). Normative data on nasalance scores for Swedish as measured on the Nasometer: influence of dialect, gender, and age. Clinical Linguistics and Phonetics, 23, 58–69. doi: 10.1080/02699200802491074

Brunnegård, K., Lohmander, A., & van Doorn, J. (2012). Comparison between perceptual assessments of nasality and nasalance scores. International Journal of Language & Communication Disorders, 47, 556-566. doi: 10.1111/j.1460-6984.2012.00165.x

Brunner, M., Stellzig-Eisenhauer, A., Pröschel, U., Verres, R., & Komposch, G. (2005). The effect of nasopharyngoscopic biofeedback in patients with cleft palate and velopharyngeal dysfunction. The Cleft Palate-Craniofacial Journal, 42(6), 649-657. doi: 10.1597/03-044.1

Burnett, T. A., Freedland, M. B., Larson, C. R., & Hain, T. C. (1998). Voice F0 response to manipulations in pitch feedback. Journal of the Acoustical Society of America, 103, 3153-3161. doi: 10.1121/1.423073

Burns, R. P., & Burns, R. (2009). Business Research Methods and Statistics using SPSS. [companion website]. Retrieved from http://www.uk.sagepub.com/burns/website%20material/Chapter%2025%20-%20Discriminant%20Analysis.pdf

Bzoch, K. R. (1989). Measurement and assessment of categorical aspects of cleft palate language, voice and speech disorders. In K. R. Bzoch (Ed.), Communicative disorders related to cleft lip and palate (3rd ed) (pp. 137-173). Boston, MA: Little Brown.

Cassell, M. D., & Elkadi, H. (1995). Anatomy and physiology of the palate and velopharyngeal structures. In R. J. Shprintzen & J. Bardach (Eds.), Cleft palate speech management: A multidisciplinary approach (pp. 45-58). St Louis, MO: Mosby.

Chapman, K. L., Baylis, A., Trost-Cardamone, J., Cordero, K. N., Dixon, A., Dobbelsteyn, C., … Sell, D. (2016). The Americleft Speech Project: A training and reliability study. The Cleft Palate-Craniofacial Journal, 53(1), 93-108. doi: 10.1597/14-027

Chen, M. Y. (1995). Acoustic parameters of nasalized vowels in hearing-impaired and normal hearing speakers. Journal of the Acoustical Society of America, 98(5), 2443-2453. doi: 10.1121/1.414399

Chen, M. Y. (1997). Acoustic correlates of English and French nasalized vowels. Journal of the Acoustical Society of America, 102(4), 2360-2370. doi: 10.1121/1.419620

Chen, P. K., Wu, J. T., Chen, Y. R., & Noordhoff, M. S. (1994). Correction of secondary velopharyngeal insufficiency in cleft palate patients with the Furlow palatoplasty. Plastic and Reconstructive Surgery, 94(7), 933-943.

Croft, C. B., Shprintzen, R. J., & Rakoff, R. J. (1981). Patterns of velopharyngeal valving in normal and cleft palate subjects: A multiview videofluoroscopic and nasendoscopic study. The Laryngoscope, 91(2), 265-271. doi: 10.1288/00005537-198102000-00015

D’Antonio, L. L., & Scherer, N. L. (2009). Communication disorders associated with cleft palate. In J. E. Losee & R. E. Kirschner (Eds.), Comprehensive cleft care (pp. 569-88). New York, NY: McGraw Hill Medical.

Dalston, R. M., & Seaver, E. (1992). Relative values of various standardized passages in the nasometric assessment of patients with velopharyngeal impairment. The Cleft Palate-Craniofacial Journal, 29, 17-21. doi: 10.1597/1545-1569(1992)029<0017:RVOVSP>2.3.CO;2

Dalston, R. M., Neiman, G., & Gonzalez-Landa, G. (1993). Nasometric sensitivity and specificity: A cross-dialect and cross-culture study. The Cleft Palate-Craniofacial Journal, 30, 285-291. doi: 10.1597/1545-1569(1993)030<0285:NSASAC>2.3.CO;2

Dalston, R. M., Warren, D. W., & Dalston, E. T. (1991a). Use of nasometry as a diagnostic tool for identifying patients with velopharyngeal impairment. The Cleft Palate-Craniofacial Journal, 28, 184-188. doi: 10.1597/1545-1569(1991)028<0184:UONAAD>2.3.CO;2

Dalston, R. M., Warren, D. W., & Dalston, E. T. (1991b). A preliminary investigation concerning the use of nasometry in identifying patients with hyponasality and/or nasal airway obstruction. Journal of Speech Language and Hearing Research, 34, 11-18.

Danziger, S., Levav, J., & Avnaim-Pesso, L. (2011). Extraneous factors in judicial decisions. Proceedings of the National Academy of Sciences, 108(17), 6889-6892. doi: 10.1073/pnas.1018033108

De Boer, G., & Bressmann, T. (2014). Comparison of nasalance scores obtained with the Nasometers 6200 and 6450. The Cleft Palate-Craniofacial Journal, 51, 90-97. doi: 10.1597/12-202

De Boer, G., & Bressmann, T. (2015). Application of linear discriminant analysis to the

nasometric assessment of resonance disorders: A pilot study. The Cleft Palate-

Craniofacial Journal, 52, 173-182. doi: 10.1597/13-109

De Boer, G., & Bressmann, T. (2016 a). Application of linear discriminant analysis to the long

term averaged spectra of simulated disorders of oral-nasal balance. The Cleft-Palate

Craniofacial Journal, 53(5), e163-e171. doi: 10.1597/14-236

De Boer, G., & Bressmann, T. (2016 b). Influence of voice focus on oral-nasal balance in

speech. Journal of Voice, 30(6), 705-710. doi: 10.1016/j.jvoice.2015.08.021

De Boer, G., & Bressmann, T. (2017). Influence of altered auditory feedback on oral-nasal

balance in speech. Journal of Speech, Language and Hearing Research, 60, 3135-3143.

doi:10.1044/2017_JSLHR-S-16-0390

De Boer, G., Marino, V., Berti, L., Fabron, E., & Bressmann, T. (2016). Influence of voice focus

on oral-nasal balance in speakers of Brazilian Portuguese. Folia Phoniatrica et

Logopaedica, 68(3), 152-158. doi: 10.1159/000452245

de Carlos, F., Cobo, J., Macías, E., Feito, J., Cobo, T., Calavia, M. G., … Vega, J. A. (2013). The

sensory innervation of the human pharynx: Searching for mechanoreceptors. The

Anatomical Record, 296(11), 1735-1746. doi:10.1002/ar.22792

Dickson, D. R. (1975). Anatomy of the normal velopharyngeal mechanism. Clinics in Plastic

Surgery, 2(2), 235-248.

Dickson, D. R., & Dickson, W. M. (1972). Velopharyngeal anatomy. Journal of Speech,

Language and Hearing Research, 15(2), 372-381.

Elman, J. L. (1981). Effects of frequency-shifted feedback on the pitch of vocal productions.

Journal of the Acoustical Society of America, 70(1), 45-50.

Fant, G. (1960). Acoustic theory of speech production. The Hague: Mouton.

Finkelstein, Y., Lerner, M. A., Ophir, D., Nachmani, A., Hauben, D. J., & Zohar, Y. (1993).

Nasopharyngeal profile and velopharyngeal valve mechanism. Plastic and Reconstructive

Surgery, 92(4), 603-614.

Fletcher, S. G. (1972). Contingencies for bioelectronic modification of nasality. Journal of

Speech and Hearing Disorders, 37(3), 329-346.

Fletcher, S. G. (1976). Nasalance vs. listener judgements of nasality. The Cleft Palate Journal,

13(1), 31-44.

Flowers, C. R., & Morris, H. L. (1973). Oral-pharyngeal movements during swallowing and

speech. The Cleft Palate Journal, 10(2), 181-191.

Fridland, E. (2011). The case for proprioception. Phenomenology and the Cognitive Sciences,

10(4), 521-540. doi: 10.1007/s11097-011-9217-z

Fritzell, B. (1969). The velopharyngeal muscles in speech: An electromyographic and

cineradiographic study. Acta Oto-Laryngologica, Suppl 250, 1-81.

Gart, M. S., & Gosain, A. K. (2014). Surgical management of velopharyngeal insufficiency.

Clinics in Plastic Surgery, 41, 253-270. doi: 10.1016/j.cps.2013.12.010

Gauster, A., Yunusova, Y., & Zajac, D. (2010). The effect of speaking rate on velopharyngeal

function in healthy speakers. Clinical Linguistics and Phonetics, 24, 576-88. doi:

10.3109/02699200903581042

Ghosh, S. S., Matthies, M. L., Maas, E., Hanson, A., Tiede, M., Ménard, L., ... Perkell, J. S.

(2010). An investigation of the relation between sibilant production and somatosensory

and auditory acuity. The Journal of the Acoustical Society of America, 128(5), 3079-

3087. doi: 10.1121/1.3493430

Gick, B., Wilson, I., Koch, K., & Cook, C. (2004). Language-specific articulatory settings:

Evidence from inter-utterance rest position. Phonetica, 61(4), 220-233. doi:

10.1159/000084159

Gildersleeve-Neumann, C. E., & Dalston, R. M. (2001). Nasalance scores in noncleft

individuals: Why not zero? The Cleft Palate-Craniofacial Journal, 38, 106-111. doi:

10.1597/1545-1569(2001)038<0106:NSINIW>2.0.CO;2

Glass, L., & Starr, C. D. (1979). A study of relationships between judgments of speech and

appearance of patients with orofacial clefts. The Cleft Palate Journal, 16(4), 436-440.

Golding-Kushner, K. J. (1990). Standardization for the reporting of nasopharyngoscopy and

multiview videofluoroscopy: A report from an international working group. The Cleft

Palate-Craniofacial Journal, 27(4), 337-348. doi:10.1597/1545-

1569(1990)027<0337:SFTRON>2.3.CO;2

Goodin-Mayeda, C. E. (2016). Nasals and nasalization in Spanish and Portuguese: Perception,

phonetics and phonology. Amsterdam, NL: John Benjamins. doi: 10.1075/ihll.9

Goy, H. (2012). Praat scripts: LTAS and drawing. Retrieved March 2014 from

http://individual.utoronto.ca/huiwen_goy/LTAS_and_drawing.html

Guenther, F. H. (2006). Cortical interactions underlying the production of speech sounds.

Journal of Communication Disorders, 39(5), 350-365. doi:

10.1016/j.jcomdis.2006.06.013

Haapanen, M.-L. (1991). A simple clinical method of evaluating perceived hypernasality. Folia

Phoniatrica et Logopaedica, 43, 122-132. doi: 10.1159/000266181

Haapanen, M.-L. (1996). Cul-de-sac hypernasality test with pattern recognition and LPC indices.

Folia Phoniatrica et Logopaedica, 48, 35-43. doi: 10.1159/000266380

Hardin, M., Van Demark, D., Morris, H., & Payne, M. (1992). Correspondence between

nasalance scores and listener judgements of hypernasality and hyponasality. The Cleft

Palate-Craniofacial Journal, 29, 346-351. doi: 10.1597/1545-1569(1992)029<0346:

CBNSAL>2.3.CO;2

Hardin-Jones, M. A., & Jones, D. L. (2005). Speech production of preschoolers with cleft palate.

The Cleft Palate-Craniofacial Journal, 42(1), 7-13. doi: 10.1597/03-134.1

Heidsieck, D. S., Smarius, B. J., Oomen, K. P., & Breugem, C. C. (2016). The role of the tensor

veli palatini muscle in the development of cleft palate-associated middle ear problems.

Clinical Oral Investigations, 20(7), 1389-1401. doi: 10.1007/s00784-016-1828-x

Henningsson, G., Kuehn, D. P., Sell, D., Sweeney, T., Trost-Cardamone, J. E., & Whitehill, T. L.

(2008). Universal parameters for reporting speech outcomes in individuals with cleft

palate. The Cleft Palate-Craniofacial Journal, 45(1), 1-17. doi: 10.1597/06-086.1

Hickok, G., Houde, J., & Rong, F. (2011). Sensorimotor integration in speech processing:

Computational basis and neural organization. Neuron, 69(3), 407-422. doi:

10.1016/j.neuron.2011.01.019

Hinton, V. A. (2009). Instrumental measures of velopharyngeal function. In J. E. Losee & R. E.

Kirschner (Eds.), Comprehensive cleft care (pp. 607-618). New York, NY: McGraw Hill Medical.

Hirschberg, J., Bók, S., Juhász, M., Trenovszki, Z., Votisky, P., & Hirschberg, A. (2006).

Adaptation of nasometry to Hungarian language and experiences with its clinical

application. International Journal of Pediatric Otorhinolaryngology, 70(5), 785–798.

doi: 10.1016/j.ijporl.2005.09.017

Hixon, T. J., Weismer, G., & Hoit, J. D. (2008). Preclinical speech science: Anatomy, physiology,

acoustics, perception. San Diego, CA: Plural Publishing.

Houde, J. F., & Jordan, M. I. (1998). Sensorimotor adaptation in speech production. Science,

279, 1213-1216.

House, A. S., & Stevens, K. N. (1956). Analog studies of the nasalization of vowels. Journal of

Speech and Hearing Disorders, 21(2), 218-232. doi: 10.1044/jshd.2102.218

Iglesias, A., Kuehn, D. P., & Morris, H. L. (1980). Simultaneous assessment of pharyngeal wall

and velar displacement for selected speech sounds. Journal of Speech and Hearing

Research, 23, 429–446. doi: 10.1044/jshr.2302.429

John, A., Sell, D., Sweeney, T., Harding-Bell, A., & Williams, A. (2006). The cleft audit

protocol for speech—augmented: A validated and reliable measure for auditing cleft

speech. The Cleft Palate-Craniofacial Journal, 43(3), 272-288. doi: 10.1597/04-141.1

Johnson, K. (2012). Acoustic and auditory phonetics (2nd ed). Oxford, UK: Blackwell.

Jones, D. L. (2000). The relationship between temporal aspects of oral-nasal balance and

classification of velopharyngeal status in speakers with cleft palate. The Cleft Palate-

Craniofacial Journal, 37(4), 363-369. doi: 10.1597/1545-1569(2000)037<0363:

TRBTAO>2.3.CO;2

Jones, D. L., & Seaver, E. J. (2009). Anatomy and physiology of the normal and cleft palate

speech mechanism. In K. T. Moller & L. E. Glaze (Eds.), Cleft Lip and Palate:

Interdisciplinary Issues and Treatment (2nd ed.) (pp. 209-240). Austin, TX: PRO-ED.

Jones, D. L., Morris, H. L., & Van Demark, D. R. (2004). A comparison of oral-nasal balance

patterns in speakers who are categorized as “Almost but Not Quite” and “Sometimes but

Not Always”. The Cleft Palate-Craniofacial Journal, 41(5), 526-534. doi: 10.1597/03-

075.1

Jones, J. A., & Munhall, K. G. (2003). Learning to produce speech with an altered vocal tract:

The role of auditory feedback. The Journal of the Acoustical Society of America, 113(1),

532-543. doi: 10.1121/1.1529670

Kanagasuntheram, R., Wong, W., & Chan, H. (1969). Some observations on the innervations of

the human nasopharynx. Journal of Anatomy, 104, 361-376.

Karnell, M. P. (1995). Nasometric discrimination of hypernasality and turbulent nasal airflow.

The Cleft Palate-Craniofacial Journal, 32, 145-148. doi: 10.1597/1545-

1569(1995)032<0145:NDOHAT>2.3.CO;2

Karnell, M. P. (2011). Instrumental assessment of velopharyngeal closure for speech. Seminars

in Speech and Language, 32(2), 168-178. doi: 10.1055/s-0031-1277719

Karnell, M. P., Hansen, J., Hardy, J. C., Lavelle, W. L., & Markt, J. (2004). Nasality ratings and

nasalance measurements as outcome indices for palatal lift management. Journal of

Medical Speech Pathology, 12, 21-29.

Kataoka, R., Michi, K.-I., Okabe, K., Miura, T., & Yoshida, H. (1996). Spectral properties and

quantitative evaluation of hypernasality in vowels. The Cleft Palate-Craniofacial

Journal, 33, 43-50. doi: 10.1597/1545-1569(1996)033<0043:SPAQEO>2.3.CO;2

Kataoka, R., Warren, D. W., Zajac, D. J., Mayo, R., & Lutz, R. W. (2001). The relationship

between spectral characteristics and perceived hypernasality in children. Journal of the

Acoustical Society of America, 109(5), 2181-2189. doi: 10.1121/1.1360717

Keuning, K. H., Wieneke, G. H., & Dejonckere, P. H. (2004). Correlation between the perceptual

rating of speech in Dutch patients with velopharyngeal insufficiency and composite

measures derived from mean nasalance scores. Folia Phoniatrica et Logopaedica, 56,

157–164. doi: 10.1159/000076937

Keuning, K. H., Wieneke, G. H., Van Wijngaarden, H. A., & Dejonckere, P. H. (2002). The

correlation between nasalance and a differentiated perceptual rating of speech in Dutch

patients with velopharyngeal insufficiency. The Cleft Palate-Craniofacial Journal, 39,

277–284. doi: 10.1597/1545-1569(2002)039<0277:TCBNAA>2.0.CO;2

Kim, E. Y., Yoon, M. S., Kim, H. H., Nam, C. M., Park, E. S., & Hong, S. H. (2012).

Characteristics of nasal resonance and perceptual rating in prelingual hearing impaired

adults. Clinical and Experimental Otorhinolaryngology, 5(1), 1-9. doi:

10.3342/ceo.2012.5.1.1

Kishimoto, H., Matsuura, Y., Kawai, K., Yamada, S., & Suzuki, S. (2016). The lesser palatine

nerve innervates the levator veli palatini muscle. Plastic and Reconstructive Surgery -

Global Open, 4(9), e1044. doi: 10.1097/GOX.0000000000001044

Kishimoto, H., Yamada, S., Kanahashi, T., Yoneyama, A., Imai, H., Matsuda, T., ... Suzuki, S.

(2016). Three-dimensional imaging of palatal muscles in the human embryo and fetus:

Development of levator veli palatini and clinical importance of the lesser palatine nerve.

Developmental Dynamics, 245, 123–131. doi:10.1002/dvdy.24364

Kuehn, D. P. (1991). New therapy for treating hypernasal speech using continuous positive

airway pressure (CPAP). Plastic and Reconstructive Surgery, 88, 959-966.

Kuehn, D. P., & Azzam, N. A. (1978). Anatomical characteristics of palatoglossus and the

anterior faucial pillar. The Cleft Palate Journal, 15(4), 349-359.

Kuehn, D. P., & Moller, K. T. (2000). Speech and language issues in the cleft palate population:

the state of the art. The Cleft Palate-Craniofacial Journal, 37, 348-383. doi:

10.1597/1545-1569(2000)037<0348:SALIIT>2.3.CO;2

Kuehn, D. P., & Moon, J. B. (1998). Velopharyngeal closure force and levator veli palatini

activation levels in varying phonetic contexts. Journal of Speech Language and Hearing

Research, 41(1), 51-62.

Kuehn, D. P., & Moon, J. B. (2005). Histologic study of intravelar structures in normal human

adult specimens. The Cleft Palate-Craniofacial Journal, 42(5), 481-489. doi: 10.1597/04-

125R.1

Kuehn, D. P., & Perry, J. L. (2009). Anatomy and physiology of the velopharynx. In J. E. Losee

& R. E. Kirschner (Eds.), Comprehensive Cleft Care (pp. 557-68). New York: McGraw

Hill Medical.

Kuehn, D. P., Folkins, J. W., & Cutting, C. B. (1982). Relationships between muscle activity and

velar position. The Cleft Palate Journal, 19(1), 25-35.

Kuehn, D. P., Imrey, P. B., Tomes, L., Jones, D. L., O'Gara, M. M., Seaver, E. J., ... Wachtel, J.

M. (2002). Efficacy of continuous positive airway pressure for treatment of

hypernasality. The Cleft Palate-Craniofacial Journal, 39, 267-276. doi: 10.1597/1545-

1569(2002)039<0267:EOCPAP>2.0.CO;2

Kuehn, D. P., Moon, J. B., & Folkins, J. W. (1993). Levator veli palatini muscle activity in

relation to intranasal air pressure variation. The Cleft Palate-Craniofacial Journal, 30(4),

361-368.

Kuehn, D. P., Templeton, P. J., & Maynard, J. A. (1990). Muscle spindles in the velopharyngeal

musculature of humans. Journal of Speech and Hearing Research, 33(3), 488-493.

Kummer, A. (2008). Cleft palate and craniofacial anomalies – Effects on speech and resonance

(2nd ed). New York, NY: Delmar Cengage Learning.

Kummer, A. (2011). Disorders of resonance and airflow secondary to cleft palate and/or

velopharyngeal dysfunction. Seminars in Speech and Language, 32(2), 141-149. doi:

10.1055/s-0031-1277716

Kummer, A. W. (2005). Simplified nasometric assessment procedures (SNAP): Nasometer test-

revised. Lincoln Park, NJ: Kay Elemetrics.

Kummer, A. W., Billmire, D. A., & Myer, C. M. III (1993). Hypertrophic tonsils: The effect on

resonance and velopharyngeal closure. Plastic and Reconstructive Surgery, 91, 608-611.

Kummer, A., Clark, S. L., Redle, E. E., Thomsen, L. L., & Billmire, D. A. (2012). Current

practice in assessing and reporting speech outcomes of cleft palate and velopharyngeal

surgery: a survey of cleft palate/craniofacial professionals. The Cleft Palate-Craniofacial

Journal, 49, 146-152. doi: 10.1597/10-285

Ladefoged, P., & Maddieson, I. (1996). Sounds of the world's languages. Oxford, UK: Blackwell.

Lane, H., & Tranel, B. (1971). The Lombard sign and the role of hearing in speech. Journal of

Speech and Hearing Research, 14, 677-709. doi: 10.1044/jshr.1404.677

Lansing, R. W., Pearl Solomon, N., Kossev, A. R., & Andersen, A. B. (1991). Recording single

motor unit activity of human nasal muscles with surface electrodes: applications for

respiration and speech. Electroencephalography and Clinical Neurophysiology, 81(3),

167-175.

Larson, C. R., Altman, K. W., Liu, H., & Hain, T. C. (2008). Interactions between auditory and

somatosensory feedback for voice F0 control. Experimental Brain Research, 187(4), 613-

621. doi: 10.1007/s00221-008-1330-z

Larson, C. R., Burnett, T. A., Kiran, S., & Hain, T. C. (2000). Effects of pitch-shift velocity on

voice F0 responses. Journal of the Acoustical Society of America, 107(1), 559-564. doi:

10.1121/1.428323

Laver, J. (1980). The phonetic description of voice quality. Cambridge Studies in Linguistics, 31,

1-186.

Lee, A. S.-Y., Ciocca, V., & Whitehill, T. (2004). Spectral analysis of hypernasality. Journal of

Medical Speech-Language Pathology, 12, 173-177.

Lee, A., & Browne, U. (2013). Nasalance scores for normal Irish-English speaking adults: a

cross-gender comparative study. Logopedics Phoniatrics Vocology, 38(4), 167-172. doi:

10.3109/14015439.2012.679965

Lee, A., Whitehill, T. L., & Ciocca, V. (2009). Effect of listener training on perceptual

judgement of hypernasality. Clinical Linguistics and Phonetics, 23(5), 319-334. doi:

10.1080/02699200802688596

Lee, G.-S., Wang, C.-P., & Fu, S. (2009). The evaluation of hypernasality in vowels using voice

low tone and high tone ratio. The Cleft Palate-Craniofacial Journal, 46, 47-52. doi:

10.1597/07-184.1

Lewis, K. E., Watterson, T. L., & Houghton, S. M. (2003). The influence of listener experience

and academic training on ratings of nasality. Journal of Communication Disorders, 36,

49-58. doi: 10.1016/S0021-9924(02)00134-X

Lewis, K. E., Watterson, T., & Quint, T. (2000). The effect of vowels on nasalance scores. The

Cleft Palate-Craniofacial Journal, 37, 584-589. doi: 10.1597/1545-

1569(2000)037<0584:TEOVON>2.0.CO;2

Liss, J. (1990). Muscle spindles in the human levator veli palatini and palatoglossus muscles.

Journal of Speech and Hearing Research, 33, 736–746. doi: 10.1044/jshr.3304.736

Liss, J. M., Kuehn, D. P., & Hinkle, K. P. (1994). Direct training of the velopharyngeal

musculature. National Center for Voice and Speech Status and Progress Reports, 6, 43-

52.

Liu, P., Chen, Z., Larson, C. R., Huang, D., & Liu, H. (2010). Auditory feedback control of

voice fundamental frequency in school children. The Journal of the Acoustical Society of

America, 128(3), 1306-1312. doi: 10.1121/1.3467773

Löfqvist, A., & Mandersson, B. (1987). Long-time average spectrum of speech and voice

analysis. Folia Phoniatrica, 39, 221-229. doi: 10.1159/000265863

Logjes, R. J., Bleys, R. L., & Breugem, C. C. (2016). The innervation of the soft palate muscles

involved in cleft palate: a review of the literature. Clinical Oral Investigations, 20(5),

895-901. doi: 10.1007/s00784-016-1791-6

Lohmander, A., Willadsen, E., Persson, C., Henningsson, G., Bowden, M., & Hutters, B. (2009).

Methodology for speech assessment in the Scandcleft project - An international

randomized clinical trial on palatal surgery: Experiences from a pilot study. The Cleft

Palate-Craniofacial Journal, 46, 347-362. doi: 10.1597/08-039.1

Lowell, S. Y., Colton, R. H., Kelley, R. T., & Hahn, Y. C. (2011). Spectral- and cepstral-based

measures during continuous speech: capacity to distinguish dysphonia and consistency

within a speaker. Journal of Voice, 25(5), e223-32. doi: 10.1016/j.jvoice.2010.06.007

Lubker, J. (1968). An electromyographic-cinefluorographic investigation of velar function

during normal speech production. Cleft Palate Journal, 5(1), 1–17.

Marino, V., Dutka, J., de Boer, G., Cardoso, V., Ramos, R., & Bressmann, T. (2016). Normative

nasalance scores for Brazilian Portuguese using new speech stimuli. Folia Phoniatrica et

Logopaedica, 67(5), 238-244. doi: 10.1159/000441976

Mayer, C., Gick, B., & Ferch, E. (2009). Talking while chewing: Speaker response to natural

perturbation of speech. Canadian Acoustics, 37(3), 144-145.

Mayo, C. M., & Mayo, R. (2011). Normative nasalance values across languages. ECHO, 6, 22–

32.

McDonald, E., & Baker, H. K. (1951). Cleft palate speech: an integration of research and clinical

observation. Journal of Speech and Hearing Disorders, 16, 9-20.

McWilliams, B. J., Morris, H. L., & Shelton, R. L. (1990). Cleft palate speech (2nd ed).

Burlington, ON: B.C. Decker Inc.

Mennen, I., Scobbie, J., deLeeuw, E., Schaeffler, S., & Schaeffler, F. (2010). Measuring

language-specific phonetic settings. Second Language Research, 26(1), 13-41. doi:

10.1177/0267658309337617

Minifie, F. D., Abbs, J. H., Tarlow, A., & Kwaterski, M. (1974). EMG activity within the

pharynx during speech production. Journal of Speech, Language and Hearing Research,

17(3), 497-504. doi: 10.1044/jshr.1703.497

Mishima, K., Sugii, A., Yamada, T., Imura, H., & Sugahara, T. (2008). Dialectal and gender

differences in nasalance scores in a Japanese population. Journal of Cranio-

Maxillofacial Surgery, 36, 8–10. doi: 10.1016/j.jcms.2007.07.008

Mitsuya, T., MacDonald, E. N., Purcell, D. W., & Munhall, K. G. (2011). A cross-language

study of compensation in response to real-time formant perturbation. The Journal of the

Acoustical Society of America, 130(5), 2978-2986. doi:10.1121/1.3643826

Mitsuya, T., Samson, F., Ménard, L., & Munhall, K. G. (2013). Language dependent vowel

representation in speech production. The Journal of the Acoustical Society of America,

133(5), 2993-3003. doi: 10.1121/1.4795786

Moll, K. L. (1962). Velopharyngeal closure on vowels. Journal of Speech, Language and

Hearing Research, 5(1), 30-37. doi: 10.1044/jshr.0501.30

Moll, K. L. (1964). Cineradiography in research and clinical studies of the velopharyngeal

mechanism. The Cleft Palate Journal, 30, 391-397.

Moll, K. L. (1965). A cinefluorographic study of velopharyngeal function in normals during

various activities. The Cleft Palate Journal, 2(2), 112-122.

Moon, J. B. (2009). Evaluation of velopharyngeal function. In K. T. Moller & L. E. Glaze (Eds.),

Cleft Lip and Palate: Interdisciplinary Issues and Treatment (2nd ed.) (pp. 313-76).

Austin, TX: Pro-ed Publishing.

Moon, J. B., Smith, A. E., Folkins, J. W., Lemke, J. H., & Gartlan, M. (1994). Coordination of

velopharyngeal muscle activity during positioning of the soft palate. The Cleft Palate-

Craniofacial Journal, 31(1), 45-55. doi: 10.1597/1545-

1569(1994)031<0045:COVMAD>2.3.CO;2

Moon, J., & Canady, J. (1995). Effects of gravity on velopharyngeal muscle activity during

speech. The Cleft Palate-Craniofacial Journal, 32(5), 371-375. doi:10.1597/1545-

1569(1995)032<0371:EOGOVM>2.3.CO;2

Munhall, K. G., MacDonald, E. N., Byrne, S. K., & Johnsrude, I. (2009). Talkers alter vowel

production in response to real-time formant perturbation even when instructed not to

compensate. The Journal of the Acoustical Society of America, 125(1), 384-390. doi:

10.1121/1.3035829

Nasir, S. M., & Ostry, D. J. (2006). Somatosensory precision in speech production. Current

Biology, 16(19), 1918-1923. doi: 10.1016/j.cub.2006.07.069

Nellis, J. L., Neiman, G. S., & Lehman, J. A. (1992). Comparison of Nasometer and listener

judgments of nasality in the assessment of velopharyngeal function after pharyngeal flap

surgery. The Cleft Palate-Craniofacial Journal, 29, 157–163. doi: 10.1597/1545-

1569(1992)029<0157:CONALJ>2.3.CO;2

Patel, S., Nishimura, C., Lodhavia, A., Korzyukov, O., Parkinson, A., Robin, D. A., & Larson, C.

R. (2014). Understanding the mechanisms underlying voluntary responses to pitch-

shifted auditory feedback. Journal of the Acoustical Society of America, 135(5), 3036-

3044. doi: 10.1121/1.4870490

Perry, J. L. (2011). Anatomy and physiology of the velopharyngeal mechanism. Seminars in

Speech and Language, 32(2), 83-92. doi: 10.1055/s-0031-1277712

Perry, J. L., Sutton, B. P., Kuehn, D. P., & Gamage, J. K. (2014). Using MRI for assessing

velopharyngeal structures and function. The Cleft Palate-Craniofacial Journal, 51(4),

476-485. doi: 10.1597/12-083

Perry, J., & Zajac, D. J. (2017). Orofacial and velopharyngeal structure and function. In D. J.

Zajac & L. D. Vallino (Eds.), Evaluation and Management of Cleft Lip and Palate: A

Developmental Perspective (pp. 3-21). San Diego, CA: Plural Publishing.

Peterson-Falzone, S. J., Hardin-Jones, M. A., & Karnell, M. P. (2001). Cleft palate speech (3rd

ed). St Louis, MO: Mosby, Inc.

Peterson-Falzone, S. J., Trost-Cardamone, J. E., Karnell, M. P., & Hardin-Jones, M. A. (2006).

The clinician's guide to treating cleft palate speech. St. Louis, MO: Mosby-Elsevier.

Podol, J., & Salvia, J. (1976). Effects of visibility of a prepalatal cleft on the evaluation of

speech. The Cleft Palate Journal, 13(4), 361-366.

Pols, L. C. W., van der Kamp, L. J. Th., & Plomp, R. (1969). Perceptual and physical space of

vowel sounds. The Journal of the Acoustical Society of America, 46, 458-467. doi:

10.1121/1.1911711

Powers, G. L., & Starr, C. D. (1974). The effects of muscle exercises on velopharyngeal gap and

nasality. The Cleft Palate Journal, 11(1), 28-35.

Prathanee, B., Thanaviratananich, S., Pongjunyakul, A., & Rengpatanakij, K. (2003). Nasalance

scores for speech in normal Thai children. Scandinavian Journal of Plastic and

Reconstructive Surgery and Hand Surgery, 37, 351–355. doi:

10.1080/02844310310005892

Principato, J., & Osenberger, J. (1970). Cyclical changes in nasal resistance. Archives of

Otolaryngology, 91, 71-77.

Purcell, D. W., & Munhall, K. G. (2006). Compensation following real-time manipulation of

formants in isolated vowels. Journal of the Acoustical Society of America, 119(4), 2288-

2297. doi: 10.1121/1.2173514

Reisberg, D. J. (2000). Dental and prosthodontic care for patients with cleft or craniofacial

conditions. The Cleft Palate-Craniofacial Journal, 37(6), 534-537. doi: 10.1597/1545-

1569(2000)037<0534:DAPCFP>2.0.CO;2

Rong, P., & Kuehn, D. (2012). The effect of articulatory adjustment on reducing hypernasality.

Journal of Speech Language and Hearing Research, 55(5), 1438-1448. doi:

10.1044/1092-4388(2012/11-0142)

Ruscello, D. (2007). Treatment of velopharyngeal closure for speech: Discussion and

implications for management. The Journal of Speech and Language Pathology – Applied

Behavior Analysis, 2(1), 55-75. doi: 10.1037/h0100212

Schwartz, S. (1968). The acoustics of normal and nasal vowel production. The Cleft Palate

Journal, 9(2), 125–139.

Seaver, E. J., Dalston, R. M., Leeper, H. A., & Adams, L. E. (1991). A study of nasometric

values for normal nasal resonance. Journal of Speech and Hearing Research, 30, 522–529.

Sell, D. A., & Grunwell, P. (2001). Speech assessment and therapy. In A. C. H. Watson, D. A.

Sell & P. Grunwell (Eds.), Management of Cleft Lip and Palate (pp. 227-57). London,

UK: Whurr Publishing.

Sell, D., Harding, A., & Grunwell, P. (1994). A screening assessment of cleft palate speech

(Great Ormond Street Speech Assessment). International Journal of Language &

Communication Disorders, 29(1), 1-15. doi: 10.3109/13682829409041477

Sell, D., Harding, A., & Grunwell, P. (1999). GOS.SP.ASS.’98: an assessment for speech

disorders associated with cleft palate and/or velopharyngeal dysfunction (revised).

International Journal of Language and Communication Disorders, 34(1), 17–33. doi:

10.1080/136828299247595

Sell, D., John, A., Harding‐Bell, A., Sweeney, T., Hegarty, F., & Freeman, J. (2009). Cleft Audit

Protocol for Speech (CAPS‐A): a comprehensive training package for speech analysis.

International Journal of Language & Communication Disorders, 44(4), 529-548. doi:

10.1080/13682820802196815

Shelton, R. L., Knox, A. W., Elbert, M., & Johnson, T. S. (1970). Palate awareness and

nonspeech voluntary palate movement. In J. F. Bosma (Ed.), Second Symposium on Oral

Sensation and Perception (pp. 416-431). Springfield, IL: Charles C. Thomas Publishing.

Shimokawa, T., Yi, S., & Tanaka, S. (2005). Nerve supply to the soft palate muscles with special

reference to the distribution of the lesser palatine nerve. The Cleft Palate-Craniofacial

Journal, 42(5), 495-500. doi: 10.1597/04-142R.1

Shprintzen, R. J., Lewin, M. L., & Croft, C. B. (1979). A comprehensive study of pharyngeal

flap surgery: tailor made flaps. The Cleft Palate Journal, 16(1), 46-55.

Shprintzen, R. J., McCall, G. N., & Skolnick, M. L. (1975). A new therapeutic technique for the

treatment of velopharyngeal incompetence. Journal of Speech and Hearing Disorders,

40, 69-83. doi:10.1044/jshd.4001.69

Shprintzen, R. J., McCall, G. N., Skolnick, M. L., & Lencione, R. M. (1975). Selective

movement of the lateral aspects of the pharyngeal walls during velopharyngeal closure

for speech, blowing, and whistling in normals. The Cleft Palate Journal, 12(1), 51-58.

Siegel, G., & Pick, H. (1974). Auditory feedback in the regulation of voice. Journal of the

Acoustical Society of America, 65, 1618-1624. doi: 10.1121/1.1903486

Silva, T. C. (2007). Fonética e fonologia do português: roteiro de estudos e guia de exercícios.

9. ed. [Portuguese phonetics and phonology: Study manual and exercise guide (9th

ed.)] São Paulo, BR: Contexto.

Stål, P. S., & Lindman, R. (2000). Characterisation of human soft palate muscles with respect to

fibre types, myosins and capillary supply. Journal of Anatomy, 197(2), 275-290. doi:

10.1046/j.1469-7580.2000.19720275.x

Stål, P., Eriksson, P. O., Eriksson, A., & Thornell, L. E. (1987). Enzyme-histochemical

differences in fibre-type between the human major and minor zygomatic and the first

dorsal interosseus muscles. Archives of Oral Biology, 32(11), 833-841. doi:

10.1016/0003-9969(87)90011-2

Stål, P., Eriksson, P. O., Eriksson, A., & Thornell, L. E. (1990). Enzyme-histochemical and

morphological characteristics of muscle fibre types in the human buccinator and

orbicularis oris. Archives of Oral Biology, 35(6), 449-458. doi: 10.1016/0003-

9969(90)90208-R

Stelck, E. H., Boliek, C. A., Hagler, P. H., & Rieger, J. M. (2011). Current practices for

evaluation of resonance disorders in North America. Seminars in Speech and Language,

32(1), 58-68. doi: 10.1055/s-0031-1271975

Stevens, J. P. (2002). Applied multivariate statistics for the social sciences (4th ed). Mahwah,

NJ: Lawrence Erlbaum Associates, Inc.

Stevens, K. N. (1985). Spectral prominences and phonetic distinctions in language. Speech

Communication, 4, 137-144. doi: 10.1016/0167-6393(85)90041-X

Stevens, K. N. (1997). Articulatory-acoustic-auditory relationships. In W. J. Hardcastle & J.

Laver (Eds.), The handbook of phonetic sciences (pp. 462-506). Oxford, UK: Blackwell

Publishing.

Stoksted, P. (1953). Rhinometric measurements for determination of the nasal cycle. Acta

Otolaryngologica, 109, Suppl:1-159.

Sundberg, J., & Nordstrom, P. (1976). Raised and lowered larynx – The effect on vowel formant

frequencies. Speech Transmission Laboratory. Quarterly Progress and Status Report. 2-

3, 35-39.

Suwaki, M., Nanba, K., Ito, E., Kumakura, I., & Minagi, S. (2008). Nasal speaking valve: a device

for managing velopharyngeal incompetence. Journal of Oral Rehabilitation, 35(1), 73-

78. doi: 10.1111/j.1365-2842.2007.01800.x

Sweeney, T. (2013). Nasality—assessment and intervention. In S. Howard & A. Lohmander

(Eds.), Cleft palate speech: Assessment and intervention (pp. 199–220). Hoboken, NJ:

Wiley.

Sweeney, T., & Sell, D. (2008). Relationship between perceptual ratings of nasality and

nasometry in children/adolescents with cleft palate and/or velopharyngeal dysfunction.

International Journal of Language and Communication Disorders, 43, 265–282. doi:

10.1080/13682820701438177

Sweeney, T., Sell, D., & O’Regan, M. (2004). Nasalance scores for normal Irish-speaking

children. The Cleft Palate-Craniofacial Journal, 41, 168–174. doi: 10.1597/02-094

Swink, S., & Stuart, A. (2012). The effect of gender on the N1–P2 auditory complex while

listening and speaking with altered auditory feedback. Brain and Language, 122(1), 25-

33. doi: 10.1016/j.bandl.2012.04.007

Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics. Boston, MA:

Pearson/Allyn & Bacon.

Tachimura, T., Hara, H., & Wada, T. (1995). Oral air pressure and nasal air flow rate on levator

veli palatini muscle activity in patients wearing a speech appliance. The Cleft Palate-

Craniofacial Journal, 32(5), 382-389. doi: 10.1597/1545-1569(1995)032<0382:

OAPANA>2.3.CO;2

Tachimura, T., Nohara, K., & Wada, T. (2000). Effect of placement of a speech appliance on

levator veli palatini muscle activity during speech. The Cleft Palate-Craniofacial

Journal, 37(5), 478-482. doi: 10.1597/1545-1569(2000)037<0478:EOPOAS>2.0.CO;2

Tachimura, T., Nohara, K., Fujita, Y., Hara, H., & Wada, T. (2001). Change in levator veli

palatini muscle activity of normal speakers in association with elevation of the velum

using an experimental palatal lift prosthesis. The Cleft Palate-Craniofacial Journal, 38(5),

449-454. doi: 10.1597/1545-1569(2001)038<0449:CILVPM>2.0.CO;2

Tachimura, T., Nohara, K., Hara, H., & Wada, T. (1999). Effect of placement of a speech

appliance on levator veli palatini muscle activity during blowing. The Cleft Palate-

Craniofacial Journal, 36(3), 224-232. doi: 10.1597/1545-1569(1999)036<0224:

EOPOAS>2.3.CO;2

Titze, I. R. (1994). Principles of voice production. Englewood Cliffs, NJ: Prentice Hall.

Tjaden, K., Sussman, J., Liu, G., & Wilding, G. (2010). Long term average spectral measures of

dysarthria and their relationship to perceived severity. Journal of Medical Speech-

Language Pathology, 18, 125-132. doi: 10.1016/j.jcomdis.2011.06.003

Tourville, J. A., & Guenther, F. H. (2011). The DIVA model: A neural theory of speech

acquisition and production. Language and Cognitive Processes, 26(7), 952-981. doi:

10.1080/01690960903498424

Tourville, J. A., Reilly, K. J., & Guenther, F. H. (2008). Neural mechanisms underlying auditory

feedback control of speech. NeuroImage, 39(3), 1429-1443. doi:

10.1016/j.neuroimage.2007.09.054

Trigos, I., Ysunza, A., Vargas, D., & Vazquez, M. C. (1988). The San Venero Roselli

pharyngoplasty: an electromyographic study of the palatopharyngeus muscle. The Cleft

Palate Journal, 25(4), 385-388.

Trindade, I. E. K., Genaro, K. F., & Dalston, R. M. (1997). Nasalance scores of normal Brazilian

Portuguese speakers. Brazilian Journal of Dysmorphology and Speech-Hearing

Disorders, 1, 23–34.

Trindade, I. E. K., Gomes, A. D. O. C., Fernandes, M. D. B. L., Trindade, S. H. K., & Silva

Filho, O. G. D. (2015). Nasal airway dimensions of children with repaired unilateral cleft

lip and palate. The Cleft Palate-Craniofacial Journal, 52(5), 512-516. doi: 10.1597/14-

103

Van Lierde, K. M., Luyten, A., Mortier, G., Tijskens, A., Bettens, K., & Vermeersch, H. (2011).

Overall intelligibility, articulation, resonance, voice and language in a child with Nager

syndrome. International Journal of Pediatric Otorhinolaryngology, 75(2), 270-276. doi:

10.1016/j.ijporl.2010.11.017

Van Lierde, K. M., Van Borsel, J., Cardinael, A., Reeckmans, S., & Bonte, K. (2010). The

impact of vocal intensity and pitch modulation on nasalance scores: A pilot study. Folia

Phoniatrica et Logopaedica, 63(1), 21-26. doi: 10.1159/000319733

Van Lierde, K. M., Wuyts, F. L., De Bodt, M., & Van Cauwenberge, P. (2001). Nasometric

values for normal nasal resonance in the speech of Flemish adults. The Cleft Palate-

Craniofacial Journal, 38, 112–118. doi: 10.1597/1545-1569(2001)038<0112:

NVFNNR>2.0.CO;2

Villacorta, V. M., Perkell, J. S., & Guenther, F. H. (2007). Sensorimotor adaptation to feedback

perturbations of vowel acoustics and its relation to perception. The Journal of the

Acoustical Society of America, 122(4), 2306-2319. doi: 10.1121/1.2773966

Vogel, A. P., Ibrahim, H. M., Reilly, S., & Kilpatrick, N. (2009). A comparative study of two

acoustic measures of hypernasality. Journal of Speech, Language and Hearing Research,

52, 1640-1651. doi: 10.1044/1092-4388(2009/08-0161)

Warren, D. W., Dalston, R. M., & Mayo, R. (1993). Aerodynamics of nasalization. In M. K.

Huffman & R. A. Krakow (Eds.), Nasals, nasalization and the velum (pp. 119-146). San

Diego, CA: Academic Press, Inc.

Watterson, T., Lewis, K. E., & Foley-Homan, N. (1999). Effect of stimulus length on nasalance

scores. The Cleft Palate-Craniofacial Journal, 36(3), 243-247. doi: 10.1597/1545-

1569(1999)036<0243:EOSLON>2.3.CO;2

Watterson, T., York, S. L., & McFarlane, S. C. (1994). Effects of vocal loudness on nasalance

measures. Journal of Communication Disorders, 27(3), 257-262. doi: 10.1016/0021-

9924(94)90004-3

Weismer, G. (2006). Philosophy of research in motor speech disorders. Clinical Linguistics and

Phonetics, 20, 315-349. doi: 10.1080/02699200400024806

Whitehill, T. L., & Lee, A. S.-Y. (2008). Instrumental analysis of resonance in speech

impairment. In M. J. Ball, M. R. Perkins, N. Müller & S. Howard (Eds.), The handbook

of clinical linguistics (pp. 332-343). Oxford, UK: Blackwell Publishing.

Whitehill, T. L., Gotzke, C. L., & Hodge, M. (2013). Speech intelligibility. In S. Howard & A.

Lohmander (Eds.), Cleft palate speech: Assessment and intervention (pp. 293–304).

Chichester, UK: Wiley-Blackwell.

Whitehill, T. L., Lee, A. S. Y., & Chun, J. C. (2002). Direct magnitude estimation and interval

scaling of hypernasality. Journal of Speech, Language and Hearing Research, 45, 80-88.

doi: 10.1044/1092-4388(2002/006)

Wilson, I., & Gick, B. (2014). Bilinguals use language-specific articulatory settings. Journal of

Speech, Language and Hearing Research, 57(2), 361-373. doi: 10.1044/2013_JSLHR-S-

12-0345

Witzel, M. A., Tobe, J., & Salyer, K. (1988). The use of nasopharyngoscopy biofeedback therapy

in the correction of inconsistent velopharyngeal closure. International Journal of

Pediatric Otorhinolaryngology, 15(2), 137-142.

Yanagisawa, E., Kmucha, S. T., & Estill, J. (1990). Role of the soft palate in laryngeal functions

and selected voice qualities. Simultaneous velolaryngeal videoendoscopy. The Annals of

Otology, Rhinology, and Laryngology, 99(1), 18-28.

Ysunza, A., & Vazquez, M. C. (1993). Velopharyngeal sphincter physiology in deaf individuals.

The Cleft Palate-Craniofacial Journal, 30(2), 141-143. doi: 10.1597/1545-1569(1993)030<0141:VSPIDI>2.3.CO;2

Ysunza, A., Pamplona, M., Femat, T., Mayer, I., & Garcia-Velasco, M. (1997).

Videonasopharyngoscopy as an instrument for visual biofeedback during speech in cleft

palate patients. International Journal of Pediatric Otorhinolaryngology, 41(3), 291-298.

doi: 10.1016/S0165-5876(97)00096-7

Zajac, D. J., & Vallino, L. D. (2017). Evaluation and management of cleft lip and palate: A

developmental perspective. San Diego, CA: Plural Publishing.

Zemlin, W. R. (1998). Speech and Hearing Science: Anatomy and Physiology. Boston, MA:

Allyn and Bacon.

Zraick, R. I., & Liss, J. M. (2000). A comparison of equal-appearing interval scaling and direct

magnitude estimation of nasal voice quality. Journal of Speech, Language and Hearing

Research, 43(4), 979-988. doi: 10.1044/jslhr.4304.979

Appendix A – Tentative formulas for the classification of oral-nasal balance

This appendix contains step-by-step instructions on how to convert the values of Tables 2.2 and 2.3 into tentative formulas for classifying the oral-nasal balance of the speech of adult females.

As the values are based on the Long-Term Averaged Spectra (LTAS) from adult females, they are not transferable to children or adult males. In addition, the values are derived from normal speech and simulations of hypernasal, hyponasal and mixed nasality. The formulas have not been validated with a clinical population. Therefore, the formulas in their current form should not be used in a clinical setting.

Procedure

Record the participant saying the oral stimulus (Look at this book with us it’s a story about a zoo) and the nasal stimulus (Mama made some lemon jam). With an acoustic analysis software program, such as Praat, analyze the sound files for the oral and nasal stimuli separately with LTAS. Set the LTAS frequency limits from 0 to 4000 Hz with frequency bins of 100 Hz. With a script or individual queries, obtain the uncalibrated decibel values for the 40 frequency bins for each stimulus. The first bin will represent 1 to 100 Hz with a central frequency of 50 Hz and the fortieth bin will represent 3901 to 4000 Hz with a central frequency of 3950 Hz. For each sound file, convert the uncalibrated decibel values into z-scores. Insert the relevant z-scores from the oral and nasal stimuli into the formulas for the function values as detailed below.
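The z-score step of the procedure can be sketched in Python as follows. This is only an illustrative sketch, not the script used in the study: the function name ltas_z_scores is hypothetical, the 40 decibel values are assumed to have been exported from the LTAS already (for example from Praat), and the use of the sample standard deviation across the 40 bins of one sound file is an assumption, since the text does not specify which standard deviation was used.

```python
import statistics


def ltas_z_scores(db_values):
    """Convert 40 uncalibrated LTAS dB values (100 Hz bins, 0 to 4000 Hz)
    into z-scores keyed by the central frequency of each bin in Hz.

    Assumption: z-scores are computed within one sound file, using the
    mean and the sample standard deviation of its 40 bin values.
    """
    if len(db_values) != 40:
        raise ValueError("expected 40 frequency bins of 100 Hz each")
    mean = statistics.fmean(db_values)
    sd = statistics.stdev(db_values)  # sample SD (assumption, see lead-in)
    if sd == 0:
        raise ValueError("all bins have the same value; z-scores are undefined")
    # Central frequencies: 50 Hz for the first bin up to 3950 Hz for the fortieth.
    centres = [50 + 100 * i for i in range(40)]
    return {centre: (value - mean) / sd for centre, value in zip(centres, db_values)}


if __name__ == "__main__":
    # Made-up placeholder values, for demonstration only (not real LTAS data).
    placeholder_db = [30.0 - 0.5 * i for i in range(40)]
    z = ltas_z_scores(placeholder_db)
    print(z[250], z[3850])
```

The function would be called once for the oral stimulus and once for the nasal stimulus, giving two separate sets of z-scores.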

Calculation

The first function value = 0.97 + 2.54(z-score oral 250 Hz) - 1.30(z-score oral 350 Hz) - 1.16(z-score oral 650 Hz) + 0.70(z-score oral 1050 Hz) + 1.00(z-score oral 1550 Hz) - 1.32(z-score oral 1950 Hz) - 0.70(z-score nasal 750 Hz) - 0.45(z-score nasal 1450 Hz) + 1.74(z-score nasal 2150 Hz) + 1.04(z-score nasal 3850 Hz)

The second function value = 10.31 - 2.05(z-score oral 250 Hz) - 1.38(z-score oral 350 Hz) - 1.18(z-score oral 650 Hz) - 1.20(z-score oral 1050 Hz) - 0.10(z-score oral 1550 Hz) - 2.71(z-score oral 1950 Hz) - 1.03(z-score nasal 750 Hz) + 0.91(z-score nasal 1450 Hz) + 1.07(z-score nasal 2150 Hz) + 1.99(z-score nasal 3850 Hz)
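As an illustration, the two formulas above could be evaluated with a short Python function. The constants and coefficients are copied from the formulas; the function name function_values and the representation of the z-scores as dictionaries keyed by the central frequency of each bin (in Hz) are assumptions made for this sketch only.

```python
# (stimulus, central frequency in Hz) for each term, in the order of the formulas.
TERMS = [
    ("oral", 250), ("oral", 350), ("oral", 650), ("oral", 1050),
    ("oral", 1550), ("oral", 1950),
    ("nasal", 750), ("nasal", 1450), ("nasal", 2150), ("nasal", 3850),
]

# Constants and coefficients copied from the two formulas in this appendix.
F1_CONSTANT = 0.97
F1_WEIGHTS = [2.54, -1.30, -1.16, 0.70, 1.00, -1.32, -0.70, -0.45, 1.74, 1.04]
F2_CONSTANT = 10.31
F2_WEIGHTS = [-2.05, -1.38, -1.18, -1.20, -0.10, -2.71, -1.03, 0.91, 1.07, 1.99]


def function_values(oral_z, nasal_z):
    """Return (first function value, second function value).

    oral_z and nasal_z are assumed to be dictionaries that map the central
    frequency of a bin (Hz) to the z-score of that bin for the oral and
    nasal stimulus, respectively.
    """
    z_by_stimulus = {"oral": oral_z, "nasal": nasal_z}
    z_scores = [z_by_stimulus[stimulus][hz] for stimulus, hz in TERMS]
    first = F1_CONSTANT + sum(w * z for w, z in zip(F1_WEIGHTS, z_scores))
    second = F2_CONSTANT + sum(w * z for w, z in zip(F2_WEIGHTS, z_scores))
    return first, second


if __name__ == "__main__":
    # Placeholder input: every z-score set to zero, so only the constants remain.
    zeros = {hz: 0.0 for hz in range(50, 4000, 100)}
    print(function_values(zeros, zeros))
```

Keeping the coefficients in lists that follow the order of the terms makes it easy to check them against the written formulas.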

Classification

The first and second function values can be plotted as coordinates and compared with the centroid values from Table 2.3 for the normal (-1.47, 0.40), simulated hypernasal (0.92, -1.53), hyponasal (-2.06, 0.20) and mixed nasality (2.61, 0.92) conditions. The closer the coordinates are to a particular centroid, the shorter the Mahalanobis distance and the greater the probability that the sound samples will be classified as that centroid’s category. Some statistical software programs (e.g., SPSS), once provided with the canonical discriminant function coefficients and the function values of the group centroids, can calculate the probabilities automatically.
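The comparison with the centroids can be sketched in Python as a nearest-centroid lookup. The centroid values are those quoted above from Table 2.3. The sketch uses the plain Euclidean distance in the plane of the two function values as a stand-in for the distance to each centroid; this is an assumption, and the sketch does not reproduce the classification probabilities that a statistics package would report. The function name nearest_centroid and the category labels are likewise chosen for illustration.

```python
import math

# Group centroids (first function value, second function value).
CENTROIDS = {
    "normal": (-1.47, 0.40),
    "hypernasal": (0.92, -1.53),
    "hyponasal": (-2.06, 0.20),
    "mixed": (2.61, 0.92),
}


def nearest_centroid(first, second):
    """Return the label of the closest centroid and all distances.

    Assumption: Euclidean distance between the coordinates and each
    centroid is used as the measure of closeness.
    """
    distances = {
        label: math.hypot(first - c1, second - c2)
        for label, (c1, c2) in CENTROIDS.items()
    }
    closest = min(distances, key=distances.get)
    return closest, distances


if __name__ == "__main__":
    # Hypothetical function values, for demonstration only.
    label, distances = nearest_centroid(-1.2, 0.3)
    print(label)
    for name, distance in sorted(distances.items(), key=lambda item: item[1]):
        print(f"{name}: {distance:.2f}")
```

Because the normal and hyponasal centroids lie close together, samples falling between them will have similar distances to both, so the full list of distances is returned rather than the label alone.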

Appendix B – The stimuli from section 4.7 and their phonetic transcriptions.

Seven of the speech stimuli are based on Marino et al. (2016) and two (indicated with *) are based on Trindade et al. (1997). The phonetic transcriptions were provided by Gabriella Zuin Ferreira.

Oral

Dudu visitou o bosque. (dudu vizitow u bɔski)

Viu o pulo do sapo. (viw u pulu du sapu)

Gostou do peixe. (gɔstow du pejʃi)

Balanced

O cachorro do Nino. (u kaʃoxu du ninu)

Arruma seu bercinho. (axumə sew behsiŋu)

Flavinho chamou o João.* (flaviŋu ʃamow u ʒoãw)

Nasal

Monica mima o nenê. (monikə mimə u nene)

Miriam lambeu o limão.* (miriã lãbew u limãw)

O nene mama. (u nene mamə)
