
Copyright by Niamh Eileen Kelly 2015

The Dissertation Committee for Niamh Eileen Kelly certifies that this is the approved version of the following dissertation:

An Experimental Approach to the Production and Perception of Norwegian Tonal Accent

Committee:

Rajka Smiljanić, Supervisor

Scott Myers

Megan Crowhurst

Harvey Sussman

Gjert Kristoffersen

An Experimental Approach to the Production and Perception of Norwegian Tonal Accent

by

Niamh Eileen Kelly, B.A., M.A.

DISSERTATION Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY

THE UNIVERSITY OF TEXAS AT AUSTIN
May 2015

You don’t set out to build a wall! You don’t say, “I’m going to build the biggest, baddest, greatest wall that’s ever been built.” You don’t start there. You say, “I’m going to lay this brick as perfectly as a brick can be laid.” And you do that every single day and soon you have a wall.
- Will Smith

Acknowledgments

The work that resulted in this dissertation was a collaborative effort, and I have many people to thank. I could not have done this without the support of my committee, whom I sincerely thank: Rajka, for unfaltering support and encouragement, not to mention extensive comments and guidance in all aspects of the research and writing, as well as understanding, affirmation, and belief in me. For constant confidence and help, whenever needed. Scott, for solid advice on experiments and writing, and for always being available to redirect when necessary! Megan, for very beneficial experiences as a research assistant, for very practical advice about academia, and for inspired perspectives on this work. Harvey, for enthusiasm and willingness to learn about Norwegian tones, also for being a wonderful example of an academic who truly enjoys teaching. Gjert, for constant guidance and insight on the relevant literature, for enthusiasm beginning with the very first ideas for this work, for extensive help in finding target words and creating the stimuli sentences, and for confidence in my abilities. To all in the Department of Linguistics; professors, colleagues, staff: thank you for such a supportive, motivating, warm environment. I am also very grateful for funding in the form of TAships, RAships, and AIships. These have provided me with invaluable experience and also allowed me to develop my teaching skills, something that I have enjoyed tremendously. To all at NTNU who kindly allowed me to use their facilities, helped with participant recruitment and also made me feel very welcome: Wim van Dommelen, Jacques Koreman, Terje Lohndal, Dawn Behne. I would like to thank Allison Wetterlin, Arnold Dalen, Thorstein Fretheim, Randi Nilsen, Stian Hårstad and Jørn Almberg for their advice and guidance. Thanks to all participants in the experiments. I also thank Johan, Hilde and Olve for their hospitality in Trondheim.
I had a lot of assistance with translations of fliers and posters and consent forms, for which I especially thank Johan and Øystein. I would also like to thank Miquel Simonet for guidance on how to use PsychoPy, Katrin Schneider for her advice on designing perception experiments, and Grzegorz Dogil for the opportunity to spend time at the University of Stuttgart. I could not have done this without the unwavering support of my family. I am thankful to my parents for unending love and encouragement, to Deirdre, Triona and Eoin and to my friends in Ireland and all over the world, for love and support. My life in Austin would not have been as joyful without the companionship, support and laughter shared with so many wonderful friends. Stacy, Stephanie, Robyn, Lauren, Sean, Cindy, Alex, Whitney, Aimee, Justin, Brian, Taylor, Oren, Brooks, Megan, and all who have been part of my life here, thank you for all the good times. My second family, the Bennetts, thank you for always being there for me. Finally, I thank the National Science Foundation for supporting the research used as the basis for this dissertation (Doctoral Dissertation Research Improvement Grant No. 1322700).

An Experimental Approach to the Production and Perception of Norwegian Tonal Accent

Publication No.

Niamh Eileen Kelly, Ph.D. The University of Texas at Austin, 2015

Supervisor: Rajka Smiljanić

This dissertation examines the lexical tonal accent contrast of the Trøndersk dialect of East Norwegian from the perspective of both production and perception. The goal of the production study was to conduct an in-depth investigation of the tonal accent realization in this understudied dialect, as well as to examine how the lexical accents are impacted by pragmatic focus and sentential intonation. The Trøndersk dialect is unusual typologically in that it exhibits a tonal contrast on monosyllabic words. Therefore, the current study examines the contrast on disyllabic and monosyllabic words. Ten speakers were recorded producing target monosyllabic and disyllabic words representing each accent, in noncontrastive and contrastive focus, and also at the right edge of an accent phrase (AP). The goal of the perception study was to determine what cues listeners use to identify the accents. The results of the acoustic analysis revealed that the main correlate of the disyllabic accent distinction in this dialect was in the timing of the F0 contour, with accent 2 having a later alignment of F0 landmarks and a higher F0 minimum than accent 1. In contrastive focus, the accent contrast was found to be enhanced. Accent 1 showed an expanded pitch range and accent 2 an even later alignment of the HL contour compared to noncontrastive focus. When produced at the end of an AP, both accents had a higher F0 minimum and lower AP boundary compared to AP-medial position. The AP-final position also had an influence on segment duration, such that the stressed vowels were shorter and final vowels were longer compared to the AP-medial position. The results of the production experiments thus revealed that contrastive focus and AP-final position both affected pitch cues even though these cues are primarily used to distinguish the lexical pitch contrasts. However, the variation in pitch contour introduced by these factors did not diminish the lexical contrast. In fact, the asymmetrical impact of focus on accent 1 and accent 2 words enhanced the distinction between the two accents.

For the monosyllabic contrast, the results revealed that in a noncontrastive focus realization, words with the circumflex accent have a wider HL contour compared to the unmarked accent. In contrastive focus, both accents have a wider pitch range and later low tone alignment. Unlike the effect of contrastive focus on disyllabic words, where it increased the timing difference between the accents, the timing of the monosyllabic accents changed in the same direction in contrastive focus. Phonologically long vowels were also lengthened in this condition.

Based on the production results, a categorization of stimuli with manipulated pitch contours was conducted. This experiment tested which acoustic cues (height and alignment of F0 minimum, and alignment of F0 maximum and turning point from maximum to minimum) are necessary for the perception of the tonal contrast. The results are consistent with the production findings in that changes in all of the examined acoustic cues contributed to the shift in accent categorization. The later timing of the main F0 landmarks (F0 maximum, F0 minimum and turning point from maximum to minimum) induced accent 2 identification. Raising F0 minimum height also led to more accent 2 responses. The analysis of the perception patterns furthermore revealed that the effect of a later timing of F0 minimum was weak unless combined with a later timing of the other F0 landmarks, or a higher F0 minimum level, all of which contributed to more accent 2 responses. These results indicate that accent 1 is characterized by an early fall, and accent 2 by a salient initial high tone.

This comprehensive investigation provided an in-depth description of the monosyllabic and disyllabic accents in this understudied, more conservative dialect that is being replaced by less conservative urban varieties. This contributes to the literature on Scandinavian accentology. Furthermore, this study adds to the literature on the realization of focus in tonal accent languages, and how prosodically marked focus and sentence intonation interact with lexical accents. Finally, this work provides insights into how production and perception constraints shape processing of pitch variation.

Table of Contents

Acknowledgments

Abstract

List of Tables

List of Figures

Chapter 1. Introduction

Chapter 2. Background
    2.1 Previous Research into Scandinavian Tonal Accent
        2.1.1 The Trøndersk Variety of Norwegian
    2.2 The Tonal Accent Contrast
        2.2.1 The Disyllabic Accent Contrast
        2.2.2 The Monosyllabic Accent Contrast
        2.2.3 The Effect of Sentence-Level Intonation
        2.2.4 The Prosodic Effect of Focus
    2.3 The Perception of F0
        2.3.1 Perception of Lexical Tonal Accents
    2.4 Goals and Research Questions
    2.5 Outline

Chapter 3. Experiment 1: Disyllabic Accent Realization in Broad Focus and Contrastive Focus
    3.1 Methods
        3.1.1 Participants
        3.1.2 Materials
        3.1.3 Procedure
        3.1.4 Measurements and Analysis
    3.2 Results
    3.3 Discussion

Chapter 4. Experiment 2: Interaction of Disyllabic Accent Realization with Higher Level Intonation
    4.1 Methods
        4.1.1 Materials
        4.1.2 Measurements and Analysis
    4.2 Results
    4.3 Discussion

Chapter 5. Experiment 3: Monosyllabic Accent Realization in Broad Focus and Contrastive Focus
    5.1 Methods
        5.1.1 Materials
        5.1.2 Measurements and Analysis
    5.2 Results
    5.3 Discussion

Chapter 6. Experiment 4: Perception of Disyllabic Accents
    6.1 Methods
        6.1.1 Materials
        6.1.2 Listeners
        6.1.3 Procedure
        6.1.4 Analysis
    6.2 Results
    6.3 Discussion

Chapter 7. General Discussion and Conclusions

Appendices

Appendix A. Test Sentences

Appendix B. Tables of Disyllabic Raw Results

Appendix C. Tables of Monosyllabic Raw Results

Bibliography

List of Tables

2.1 Summary of previous analyses
2.2 Experiments

3.1 Speakers
3.2 Disyllabic target words
3.3 Disyllabic dependent variables
3.4 Raw results
3.5 Statistical results
3.6 Raw results
3.7 Statistical results
3.8 Raw results
3.9 Statistical results
3.10 Raw results
3.11 Statistical results
3.12 Raw results
3.13 Statistical results
3.14 Raw results
3.15 Statistical results
3.16 Raw results
3.17 Statistical results
3.18 Raw results
3.19 Statistical results
3.20 Raw results
3.21 Statistical results
3.22 Raw results
3.23 Statistical results
3.24 Raw results
3.25 Statistical results
3.26 Raw results
3.27 Statistical results
3.28 Differences between disyllabic accents
3.29 Distance of F0 landmarks from segments

4.1 Raw results
4.2 Statistical results
4.3 Raw results
4.4 Statistical results
4.5 Raw results
4.6 Statistical results
4.7 Raw results
4.8 Statistical results
4.9 Raw results
4.10 Statistical results
4.11 Raw results
4.12 Statistical results
4.13 Raw results
4.14 Statistical results
4.15 Raw results
4.16 Statistical results
4.17 Raw results
4.18 Statistical results
4.19 Raw results
4.20 Statistical results
4.21 Raw results
4.22 Statistical results
4.23 Raw results
4.24 Statistical results

5.1 Monosyllabic target words
5.2 Monosyllabic dependent variables
5.3 Raw results
5.4 Statistical results
5.5 Raw results
5.6 Statistical results
5.7 Raw results
5.8 Statistical results
5.9 Raw results
5.10 Statistical results
5.11 Raw results
5.12 Statistical results
5.13 Raw results
5.14 Statistical results
5.15 Raw results
5.16 Statistical results
5.17 Raw results
5.18 Statistical results
5.19 Raw results
5.20 Statistical results
5.21 Raw results
5.22 Statistical results
5.23 Raw results
5.24 Statistical results
5.25 Differences between monosyllabic accents

6.1 Manipulation steps
6.2 Listeners
6.3 Logistic regression results
6.4 Majority response crossover points

List of Figures

2.1 Stockholm Swedish accents
2.2 Map of dialect region
2.3 East Norwegian accents
2.4 East Norwegian intonation

3.1 Disyllabic pitch contour and labels
3.2 Praat example pitch track
3.3 Disyllabic alignment cues
3.4 Disyllabic broad focus contours
3.5 Disyllabic contrastive focus contours
3.6 Phonological analysis

4.1 Disyllabic contours in AP-medial position
4.2 Disyllabic contours in AP-final position

5.1 Monosyllabic pitch contour and labels
5.2 Monosyllabic alignment cues
5.3 Monosyllabic broad focus contours
5.4 Monosyllabic contrastive focus contours

6.1 Accents showing landmarks that were manipulated
6.2 F0 maximum and minimum alignment steps
6.3 HTP alignment steps
6.4 F0 minimum height and alignment steps
6.5 Responses for F0 Maximum alignment
6.6 Responses for HTP
6.7 Responses for F0 Minimum alignment

B.1 Disyllabic means for F0 maximum
B.2 Disyllabic means for F0 minimum
B.3 Disyllabic means for slope of the rise
B.4 Disyllabic means for F0 maximum alignment
B.5 Disyllabic means for F0 minimum alignment
B.6 Disyllabic means for HTP alignment
B.7 Disyllabic means for slope of the fall
B.8 Disyllabic means for AP H% height
B.9 Disyllabic means for boundary slope
B.10 Disyllabic means for final vowel duration
B.11 Disyllabic means for AP H% timing
B.12 Disyllabic means for stressed vowel duration (long)
B.13 Disyllabic means for stressed vowel duration (short)
B.14 Disyllabic means for consonant duration (long)
B.15 Disyllabic means for consonant duration (short)

C.1 Monosyllabic means for F0 maximum
C.2 Monosyllabic means for F0 minimum
C.3 Monosyllabic means for vowel onset
C.4 Monosyllabic means for slope of the rise
C.5 Monosyllabic means for F0 maximum alignment
C.6 Monosyllabic means for F0 minimum alignment
C.7 Monosyllabic means for slope of the fall
C.8 Monosyllabic means for AP H% height
C.9 Monosyllabic means for boundary slope
C.10 Monosyllabic means for AP H% timing
C.11 Monosyllabic means for stressed vowel duration (long)
C.12 Monosyllabic means for stressed vowel duration (short)
C.13 Monosyllabic means for consonant duration (long)
C.14 Monosyllabic means for consonant duration (short)

Chapter 1

Introduction

In spoken language, pitch can be employed in a number of ways. All languages, including English, use intonation - pitch changes across the course of a phrase - to express emotions such as surprise or anger, and to distinguish between different utterance types, such as questions or statements, and pragmatic information, such as contrastive focus. At least 42% (Maddieson, 2011) of the world’s languages also use pitch changes within words to change the meaning of the word. Tone languages, such as Mandarin Chinese, may do this on every syllable, while tonal accent (or “pitch accent”) languages, such as Norwegian and Lithuanian, do this only on stressed syllables (Hayes, 1995). Languages with such lexical pitch changes (tone languages and tonal accent languages) not only have specific pitch contours on words, but they also use pitch across the sentence to express utterance type or pragmatic information. The question arises, then, as to how the lexical level and the sentence level interact. Research into the interaction of lexical and post-lexical tones has been conducted on a variety of languages (e.g., Bruce, 1977; Pierrehumbert and Beckman, 1988; Gussenhoven and Bruce, 1999; Xu, 1999; Ma et al., 2006; Scholz, 2012). European languages with lexical pitch contrasts tend to have simpler intonation systems than languages that do not have such a contrast (Gussenhoven and van der Vliet, 1999). Furthermore, while languages such as English and Dutch can express pragmatic focus by a change in peak height or alignment (e.g., Pierrehumbert, 1980; Cooper et al., 1985; Peters et al., 2014), this is restricted in languages with lexical pitch contrasts, which tend to use an increased pitch range for this purpose (Pierrehumbert and Beckman, 1988; Xu, 1999; Remijsen, 2002; Fournier et al., 2006). Norwegian is a tonal accent language, with two contrasting accents, accent 1 and accent 2 (Storm, 1884; Vanvik, 1957; Fintoft, 1970; Elstad, 1978).

The tonal makeup and phonetic realization of the accents differ across the dialects of Norwegian (Gårding, 1973; Fintoft, 1970). Despite these differences, Fintoft makes the generalization that (1) the main peak is always earlier in accent 1, and (2) accent 1 never has more peaks than accent 2. He also notes that these changes do not occur abruptly as one moves through the country; rather, there are gradual changes in “the relative position of the peak(s) and/or the frequency difference between the peaks” (Fintoft, 1987, p.44). The overlap in contours and the gradual changes across dialects lead to questions about what exactly characterizes the tonal accent contrast for each variety, as well as how much any changes due to sentence intonation and pragmatic context can modify the tonal contours without jeopardizing the contrast.

This dissertation is an experimental analysis of the tonal accent contrast from the perspective of both production and perception. Trøndersk, a variety of East Norwegian spoken in the Trøndelag region in central Norway, will function as a test case for examining the acoustic cues that distinguish the accents in continuous speech. This dialect has not been subjected to a large-scale quantitative analysis, particularly in terms of the interaction of the accents with higher level intonation. Some varieties of Trøndersk, furthermore, have the unusual feature of exhibiting a tonal contrast on monosyllabic words, something that is not common in Norwegian or Swedish (Kristoffersen, 1992). Both the disyllabic and monosyllabic contrasts are examined, as are their interactions with pragmatic focus (for both word lengths) and sentence intonation (for disyllabic words). Finally, perception experiments explore which cues listeners are sensitive to when identifying the tonal accents in this dialect. Next I will provide the background motivating this research, followed by the main goals and hypotheses.

Chapter 2

Background

Tonal accent is a prosodic pattern found on stressed syllables (e.g., Beckman, 1986; Hyman, 2009). Yip (2002) describes accentual languages as “a particular type of language in which tone is used in a rather limited way, with one (or perhaps two) tone melodies, either lexically linked to particular TBUs [tone bearing units] or perhaps attracted to a syllable selected as prominent by rhythmic principles” (p.260). In such a language, two segmentally identical words can thus be distinguished by the tonal contour only: bønder “farmers” (accent 1) and bønner “beans” (accent 2) are a minimal pair in the variety of Norwegian (the segments of both words are pronounced /ˈbønːəɾ/). Hualde (2012) defines them further as “a class of languages where words contrast in the tonal melody that is associated with the stressed syllable” (p.1335). Hyman (2006) argues that what are generally referred to as “pitch accent” languages are not a homogeneous group; rather, they tend to pick and choose how they instantiate this characteristic and may combine features of stress accent languages and tonal languages. As such, the meaning of the term tonal accent as relevant to Scandinavian will be described below. Varying instantiations of this phenomenon are found in Scandinavian languages (e.g., Bruce, 1977; Gårding, 1973; Fintoft, 1987), Lithuanian (Senn, 1966), Latvian (Karins, 1996; Derksen, 1966), Japanese (Pierrehumbert and Beckman, 1988), and some varieties of Korean (e.g., Kim, 1988), Basque (Hualde, 1991), Serbian and Croatian (Smiljanić, 2006), and Dutch and German (Gussenhoven, 2004).

2.1 Previous Research into Scandinavian Tonal Accent

Tonal accent is also known as “word accent”, particularly in reference to Swedish (Bruce, 1977; Gårding, 1973), while Fintoft (1987) refers to it as a “toneme” system in his description of Norwegian. In this paper I will refer to it as “tonal accent”, as in Kristoffersen (2000). The reason for this is to differentiate it from the pitch accent of intonation, and also to show that in Scandinavian languages, the prominence is not just independently one of pitch but is dependent on primary stress, and is, in fact, a means of indicating primary stress.¹ Finally, the term “tonal accent” seems clearer than “word accent” since the morphological word is not necessarily the domain of the accent, at least in some varieties of Norwegian, where the accent phrase² is in fact the domain of the accent (Kristoffersen, 2000).

The tonal accent found today in most varieties of Norwegian and Swedish is thought to have arisen historically from a tonal contrast between monosyllables and polysyllables in Old Norse (Oftedal, 1952; Kristoffersen, 2000). When monosyllables ending in an obstruent-sonorant sequence became disyllabic due to vowel insertion, they retained the tonal contour of monosyllables, thus creating a contrast in tonal contour rather than in syllable number. Another analysis, whereby stress was replaced with a lexical accent (Kock, 1884/85; Riad, 2003), has also been proposed.

The tonal makeup and phonetic realization of the accents differ across the Scandinavian language varieties (Gårding, 1973; Fintoft, 1970). For example, in Norway, some dialects have the tonal makeup of a high-low contour where others have a low-high contour (e.g., Almberg, 2004). The two contrasting accents are referred to as accent 1 and accent 2. The first comprehensive acoustic analysis of the tonal accent contrast in Stockholm Swedish revealed that the difference between accent 1 and accent 2 was in the timing of the F0 fall in relation to the stressed syllable, whereby it was later for accent 2 than accent 1, as seen in Figure 2.1 (Bruce, 1977). The nature of the contrast has been studied extensively, both impressionistically and experimentally, for a number of dialects of Swedish and Norwegian (e.g., Storm, 1884; Bjerrum, 1948; Vanvik, 1957; Fintoft, 1970; Gårding, 1973; Gårding and Lindblad, 1975; Bruce, 1977; Elstad, 1978; Lorentz, 1981; Riad, 1998; Kristoffersen, 2000; Van Dommelen, 2002; Van Dommelen and Nilsen, 2003; Segerup, 2003, 2004; Almberg, 2004; Gussenhoven, 2004; Riad, 2006).

¹ In this way it is similar to Serbian and Croatian and different from Japanese, and could be called a stress language with a lexical contrast in the alignment of pitch contours.
² The accent phrase is described in section 2.2.3.

Figure 2.1: Tonal Accents of Stockholm Swedish (Bruce, 1977). The beginning of the fall is marked by blue circles.

The dialect focused on in the current study is Trøndersk, an East Norwegian variety spoken in central Norway. This variety was chosen because it has not been described as extensively in large-scale analyses as other varieties, so the current analysis contributes to the typological literature. Also, one unusual feature of this variety is that, unlike most varieties of Norwegian, it has a tonal accent contrast on monosyllabic words, with the more complex contour known as the circumflex accent (e.g., Kristoffersen, 1992; Almberg, 2001; Kristoffersen, 2011).

2.1.1 The Trøndersk Variety of Norwegian

There are two main varieties of Norwegian, East Norwegian and West Norwegian. East Norwegian comprises a group of dialects spoken in the southeast and central regions of Norway (Kristoffersen, 2000). Figure 2.2 shows the Trøndelag region, where Trøndersk is spoken, highlighting Trondheim (the capital city of the region). East Norwegian is generally referred to as a low-tone dialect, where accent 1 is a low tone (L) and accent 2 is a high-low melody, HL, and West Norwegian as a high-tone dialect, where accent 1 is H and accent 2 is a low-high melody, LH (e.g., Kristoffersen, 2000; Almberg, 2004). However, within these regions there is further variability, for example, in terms of which accent has a higher F0 peak.

Figure 2.2: Map showing the Trøndelag region (Bookcoverimgs.com, 2012)

2.2 The Tonal Accent Contrast

2.2.1 The Disyllabic Accent Contrast

The tonal accent contrast in the Trøndersk variety has been examined in some detail. The contrast has been described as a difference in tonal makeup, with accent 1 being L and accent 2 HL, thus aligning the Trøndersk variety with other varieties of East Norwegian (Nilsen, 1992), such as the Oslo variety (Fintoft, 1970; Kristoffersen, 2006b). In contrast, other studies have suggested that the difference lies in the alignment of the F0 contour, with both accents having a HL lexical tonal accent (Fintoft, 1970; Kristoffersen, 2006b). In a study analyzing recordings of six disyllabic word pairs spoken (in sentence-final position, preceded by a pause) by 13 male speakers from Trondheim (the capital city of the region where Trøndersk is spoken), Fintoft (1970) found that accent 1 reaches its F0 minimum (L target) in the (initial) stressed vowel, while accent 2 has its initial H tone in the middle of the stressed vowel, and falls from there, as shown in Figure 2.3. In addition, the unstressed (second) vowel tends to be significantly longer in accent 2 words (Fintoft, 1970). Also examining the Trondheim variety, Wetterlin (2010) found both accents to have just an L contour, although accent 1 has a steeper fall than accent 2. In this variety, the L tone is found earlier in accent 1, where it occurs during the first syllable, while in accent 2 it occurs in the second syllable. A similar difference in alignment was described for a variety spoken in the west of the Trøndelag region (Van Dommelen and Nilsen, 2003). While the two accents had a similar overall contour, the difference between them was in the timing of the F0 fall and rise, which both occurred earlier for accent 1. Examination of the tonal accent contrast in the south of the Trøndelag region, Oppdal, revealed that both accents have a HL contour, with an earlier alignment in accent 1 than in accent 2 (Kristoffersen, 2006b). In Oppdal and Trondheim, the initial H is in the stressed syllable for both accents, while the following L is associated with the stressed syllable in accent 1 and the post-stressed syllable in accent 2 (Fintoft, 1970; Kristoffersen, 2006b).

These results show the conflicting analyses of the tonal accent contrast in Trøndersk (see Table 2.1), which center on whether the contrast is one of timing or tonal makeup. The key to this lies in whether there is an initial H target in accent 1. If the presence of such a target can be discerned, both accents could be argued to have a HL contour. If not, accent 1 is L and accent 2 HL, and the contrast is in the tonal makeup. This question will be addressed in the current study. It should be noted that the current study does not examine the variety spoken in Trondheim, as other studies did; rather, it examines those spoken in towns around this city.
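The timing-versus-makeup question can be made concrete with a small illustrative sketch. It is not the measurement procedure used in the experiments; it only shows, under stated assumptions, how the alignment of an F0 maximum and the following F0 minimum relative to the stressed-vowel onset could be read off a sampled contour. The function name `f0_landmarks`, the sampling grid, and both contours are invented for illustration.

```python
def f0_landmarks(times_ms, f0_hz, vowel_onset_ms):
    """Locate the F0 maximum and the F0 minimum that follows it.

    Alignment values are reported in ms relative to the onset of the
    stressed vowel (an assumed reference point for this sketch).
    """
    # Index of the highest F0 sample (the candidate H target).
    i_max = max(range(len(f0_hz)), key=lambda i: f0_hz[i])
    # Search for the minimum only from the peak onward, so that the
    # low point of the fall (the candidate L target) is found.
    i_min = min(range(i_max, len(f0_hz)), key=lambda i: f0_hz[i])
    return {
        "f0_max": f0_hz[i_max],
        "max_align": times_ms[i_max] - vowel_onset_ms,
        "f0_min": f0_hz[i_min],
        "min_align": times_ms[i_min] - vowel_onset_ms,
    }


# Invented contours, sampled every 50 ms; the vowel onset is placed at 50 ms.
times_ms = [0, 50, 100, 150, 200, 250, 300]
accent1_hz = [200, 210, 190, 170, 160, 175, 190]  # hypothetical early fall
accent2_hz = [195, 205, 215, 210, 195, 180, 185]  # hypothetical later fall

a1 = f0_landmarks(times_ms, accent1_hz, vowel_onset_ms=50)
a2 = f0_landmarks(times_ms, accent2_hz, vowel_onset_ms=50)
print(a1)
print(a2)
```

On contours like these, a pure timing contrast would show up as similar landmark heights with later `max_align` and `min_align` values for accent 2, whereas a contrast in tonal makeup would show up as the absence of a discernible initial peak in accent 1.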

Figure 2.3: Trøndersk disyllabic accents, based on average contours: “smilet” (accent 1) and “smile” (accent 2). (Fintoft, 1970)

                                               Tone categories       Phonetic
Author                         Dialect          Accent 1   Accent 2   Difference
Fintoft (1970)                 Trondheim        HL         HL         Timing; Onset?
Nilsen (1992)                  Trøndersk        L          HL         H tone
Van Dommelen & Nilsen (2003)   West Trøndersk   HL         HL         Timing
Wetterlin (2010)               Trondheim        L          L          Timing; F0 Range?
Kristoffersen (2006b)          Oppdal           HL         HL         Timing

Table 2.1: Summary of previous production findings on Trøndelag dialects

2.2.2 The Monosyllabic Accent Contrast

In the majority of Norwegian and Swedish dialects the accent contrast is only found on polysyllabic words. One explanation for this is that since the accent 2 contour has a later alignment and/or an extra tone in comparison to accent 1, it needs a second syllable in order for the later tones to surface (e.g., Haugen and Joos, 1952). Another explanation is that accent 2 derives from words in Old Norse that had at least one syllable following the main stress (Kristoffersen, p.c.). Some analyses regard all monosyllabic words as carrying accent 1 (Haugen, 1983; Felder et al., 2009). However, a small number of dialects, including Trøndersk, have instances of a tonal contrast surfacing on monosyllabic words, in this case due to apocope (Elstad, 1978; Kristoffersen, 1992, 2011).

The tonal contrast on monosyllabic words in Trøndersk is realized as a difference between the circumflex accent and the ‘unmarked’ accent (Almberg, 2001; Kristoffersen, 2011). The circumflex accent occurs on words in which the final vowel is deleted, but which are disyllabic in other varieties of Norwegian. This accent can surface on words that were originally either accent 1 or 2 (Almberg, 2001). The circumflex accent also occurs in the Nordland dialect of Norwegian (Almberg, 2001; Kristoffersen, 2011), but while the Trøndersk version can form from polysyllabic words of either accent, the Nordland form can only form from accent 2 (Elstad, 1982). Unmarked monosyllabic words in East Norwegian have been described as being characterized only by an L tone (Dalen, 1985). A phonetic analysis of a small set of circumflex words found that this accent has a HL contour, with a longer vowel and a higher F0 at vowel onset than the unmarked monosyllabic accent (Almberg, 2001). The circumflex contour has also been described as a temporally displaced version of the unmarked contour (Almberg, 2001), but since circumflex is HL, this would suggest that the unmarked accent is also HL. Since the unmarked contour was previously described as just L (e.g., Dalen, 1985), Almberg (2001) suggests that in order to determine whether this is the case, the F0 contour before the monosyllabic accents must be examined. Since Almberg (2001) appears to be the only acoustic analysis of the monosyllabic accents, it is worth investigating further whether the ‘displaced’ theory holds up.

The goal of the current study is to examine the monosyllabic contrast in contrastive focus. It has been noted that the circumflex accent is moribund and therefore rare among young speakers (Dalen et al., 2008), so an acoustic analysis of it is crucial before it is lost. This will help elucidate the features of each monosyllabic accent and how pragmatic focus affects them. The current analysis also examines the anacrusis, that is, the unstressed syllables before the target word, which has not been examined before, to determine whether the circumflex accent is a displaced version of the unmarked accent, thus providing new evidence which will hopefully contribute toward resolving the earlier conflicting findings.

2.2.3 The Effect of Sentence-Level Intonation

The interaction of lexical pitch with sentence intonation has been examined in a variety of languages (e.g., Bruce, 1977; Pierrehumbert and Beckman, 1988; Gussenhoven and Bruce, 1999; Riad, 2006). In the current study, the goal is to examine how sentential intonation affects the accent contours in Trøndersk. Work on other languages has shown that the pitch level and alignment of a lexical tone can be affected by sentential intonation (Ma et al., 2006). This can occur in a variety of ways. For example, Gussenhoven and van der Vliet (1999) found that the lexical tonal accents of the Dutch dialect of Venlo have different pitch contours depending on the utterance type, position and pragmatic context. High boundary tones can induce a higher pitch on lexical tones close to them (Myers, 2004). The realization of Cantonese tones in different positions was examined by Vance (1976), who found that the lexical tones were lowered in sentence-final position compared to medial position, due to sentence-final lowering. In Kammu, a language spoken in Laos, when a lexical high-low tone is followed by the sentence-final boundary H tone, the sentence-final boundary tone is not fully realized, and instead surfaces as a level or falling contour (Karlsson et al., 2010). Here, the authors suggest that the realization of the lexical tone supersedes the sentence intonation tones. In Thai, on the other hand, Abramson (1979) observed that the lexical tones were affected by sentence intonation but the contrasts were still preserved. In Mandarin, Lin (2004) found that lexical tones and sentence intonation affect different dimensions of the F0 contour, whereby lexical tones were distinguished by their F0 contour and sentence intonation was expressed through F0 range.

In order to tease apart the lexical accents from higher level intonation effects, it is first necessary to examine descriptions of intonation in Norwegian. East Norwegian sentence intonation is extensively described by the Trondheim Model (Fretheim, 1981, 1982). An utterance is composed of intonational phrases (IP), which are further composed of accent phrases (AP), specified for accent 1 or 2 depending on the accent of the word that heads the AP (e.g., Haugen and Joos, 1952; Fretheim, 1987a, 1991; Fretheim and Nilsen, 1989). Each AP starts with a primary stressed syllable at the left edge and includes any number of unaccented syllables before the next stressed syllable, which is the head of the following AP. The right edge of the AP is delimited by a high boundary tone (H%) (Fretheim, 1987b; Nilsen, 1989; Kristoffersen, 2000), as shown in Figure 2.4 from Fretheim (1987b) (he uses AU where I have used AP).
While previous work provides important information about East Norwegian prosodic structure, none of these studies examined how IP- and AP-level boundary tones interact with the lexical tonal accents and whether this interaction impacts their realization. Borgstrøm (1962) noted that in the Oslo variety of East Norwegian, accent 1 may be more affected than accent 2 by higher-level intonation, leading to the tonal accent contrast being “somewhat reduced” (p. 36) in falling intonation. Work on Swedish showed that while the range of the rise or fall in tonal accent contours can be affected by sentence intonation, the contrast is preserved (Hadding-Koch, 1961, 1962).

One brief mention of the effect of the number of syllables in the AP in East Norwegian is in Teig (2001), who states that the contour of a two-syllable AP differs from that of a one-syllable AP. Although this was not the focus of the study, the pitch track shows a more marked drop to the lexical L when there is a second, unstressed syllable in the AP, “something there is no time for in the...contour with only one syllable in the [AP]” (p. 224). This is an indication that the AP tones do indeed affect the lexical accent tones, something that will be explored in the current study.

Figure 2.4: Sentence intonation of East Norwegian, broken into Accent Phrases (Fretheim, 1987b). The high boundary tone can be seen in the raised contour at the right edge of each AP.

2.2.4 The Prosodic Effect of Focus

While there are various ways in which focus can be defined (related both to its meaning and scope), here the term ‘contrastive focus’ is employed to denote a specific type of narrow focus (Chafe, 1976; Rooth, 1985; Gussenhoven, 2005; Selkirk, 2008; Katz and Selkirk, 2011). In this sense, a constituent under contrastive focus relates to a set of alternatives that are shared between the interlocutors. Numerous studies across prosodically different languages have documented the effect of narrow and contrastive focus on the realization of the F0 contour and segmental duration (Ladd, 1978, 1996; Gussenhoven, 1984; Beckman and Edwards, 1994; Sluijter and van Heuven, 1996; Campbell and Beckman, 1997; Remijsen and van Heuven, 2005; Zhang et al., 2006; Arvaniti et al., 2006; Prieto, 2014; Peters et al., 2014). In German, for instance, in narrow focus, intonational pitch accents are lowered in prenuclear position and deaccented in postnuclear position (Féry and Kügler, 2008). Narrow focus in English and Dutch is realized by a higher F0 peak and longer segments (Pierrehumbert, 1980; Cooper et al., 1985; Eefting, 1991; Cambier-Langeveld and Turk, 1999; Xu and Xu, 2005). A later alignment of the tonal targets in narrow focus was found for some varieties of Dutch and German (Peters et al., 2014).

Narrow focus can also change the shape of the lexical tonal accents. In languages with lexical pitch, an expanded pitch range (Pierrehumbert and Beckman, 1988; Xu, 1999; Remijsen, 2002; Fournier et al., 2006; Scholz, 2012) and greater articulatory force (Chen, 2010) are often used to mark narrow focus. In Swedish, single-peaked dialects expand the pitch range on the target word to signal narrow focus, while double-peaked dialects add a pitch gesture after the stressed syllable (Bruce, 2005). In Serbian and Croatian, Smiljanić (2003) found that narrow focus was indicated by the use of an expanded pitch range, a change in peak alignment, and vowel lengthening. The change in peak alignment was restricted, however, in the dialect with the lexical tonal accent. Interestingly, even closely related tonal languages exhibit differences in focus realization, such that post-focal F0 range compression is found in Beijing Mandarin but not in Taiwan Mandarin or Taiwanese (Chen et al., 2009).

Narrow focus impacts segmental durations, as mentioned above for English and Dutch (Cooper et al., 1985; Eefting, 1991). Similar segmental lengthening was found in dialects of Dutch and German (Peters et al., 2014). In a dialect of West Limburgian, Peters (2007) found that durational differences between the tonal accents (one accent had consistently longer syllables than the other accent) were increased in nuclear position. In Swedish, phonologically long segments were lengthened more than short segments in focus, thus exaggerating the phonological contrast between short and long vowels and consonants (Bruce, 1977; Bannert, 1979; Bruce, 1981).
Also in Swedish, the unstressed syllable following the stressed syllable was lengthened under focus (Heldner and Strangert, 2001).

Similar to the findings that phonologically short and long vowels were more distinct in narrow focus, tonal accent contrasts can also be enhanced. Smiljanić (2003) found that in the Belgrade variety of Serbian, narrow focus caused asymmetric changes in the alignment of a low tonal target between the lexical accents, leading to a greater contrast between the accents in this condition. In a slightly different way, focus enhances the lexical tonal accent contrast in the Venlo dialect of Dutch, where the contrast only surfaces when target words are focused or final (Gussenhoven and van der Vliet, 1999).

In East Norwegian, narrow focus is marked by a high tone at the right edge of the AP (Fretheim, 1987b; Nilsen, 1989; Kristoffersen, 2000). This tone contributes to the lexical item being perceived as focused even though the focus marker is a few syllables beyond the focused word (Abrahamsen, 2004). A similar pattern whereby the focus tone is not realized on the focused word is found in languages such as Bengali (Hayes and Lahiri, 1991) and Greek (Arvaniti et al., 2006). For Norwegian, the original H% of the AP and the focus H tone combine, causing the H% at the right edge of the AP to have a higher F0 (Fretheim, 1987b; Fretheim and Nilsen, 1989; Kristoffersen, 2000). Accent 1 words in East Norwegian were found to signal narrow focus with increased duration of the vowel, syllable and word (Mixdorff et al., 2010). An earlier AP H% alignment was found for accent 1 words in narrow focus compared to broad focus (Koreman et al., 2009), while Mixdorff et al. (2010) found the earlier AP H% alignment in both accents. It appears there is no description, however, of how narrow focus impacts the height or alignment of the lexical tones, and whether narrow focus enhances or reduces the tonal accent contrast. These questions are examined in the current study by using contrastive focus on the target words.

2.3 The Perception of F0

The fact that a speaker produces certain acoustic cues does not mean that the listener attends to all of them. For example, when multiple acoustic cues are available, listeners can weight one cue more heavily than others (Francis et al., 2008b), so the presence of a particular cue in production is not evidence for its use in perception. Examining perception of high and rising tones in Korean, Chang (2013) found that the cues for perception lined up well with the descriptions of the tone production.

Work on a variety of languages has found an effect of systematic manipulations of F0 on the perception of both tone and intonation, and this approach highlights what characteristics are necessary for the listener to perceive a particular feature (e.g., Shen, 1993; Gósy and Terken, 1994; Almberg and Husby, 2000; Gussenhoven and Chen, 2000; Francis et al., 2003; Shattuck-Hufnagel et al., 2004; Xu et al., 2006; Francis et al., 2008b; Shport, 2011; Chang, 2013; Liu, in press). The goal of the perception study conducted here is to examine further what cues listeners attend to in distinguishing the accents. Following the approach of the studies mentioned above, the results of the production study will be a starting point for determining what probable cues are used to distinguish tones in Trøndersk. The perception experiments will use stimuli with artificially manipulated cues in order to pinpoint the most salient cues for accent identification.

2.3.1 Perception of Lexical Tonal Accents

While a number of studies have examined how the contrasts are realized acoustically in various Scandinavian dialects, few studies have looked at the perception of the tonal accent contrast. In one study, Segerup (2004) found that listeners could correctly identify naturally produced tonal minimal pairs 96% of the time for one variety of West Swedish. With regard to which cues listeners use in lexical accent identification, one study (Efremova et al., 1963) using gating experiments found that the shape of the F0 contour in the initial, stressed syllable contains important cues for distinguishing the accents in disyllabic words in Swedish. Similarly, Norwegian listeners were able to identify the two accents accurately even when presented with portions of the words up to the end of the initial, stressed vowel (Fintoft, 1970). Using synthesized contours, Bruce (1977) examined perception of the accent contrast in Stockholm Swedish and confirmed that listeners used the timing of the F0 contour in relation to the stressed syllable as the main cue in differentiating between the two accents, thus aligning production and perception findings closely. Specifically, Swedish listeners identified accent 2 as long as the fall began 25% of the way into the vowel, or later.

With regard to the perception of Norwegian tonal accents, two studies tested which aspects of the F0 contours were salient indicators of tonal categories. In a small-scale study, Fintoft and Mártony (1964) manipulated F0 peak height, alignment and the slope of the rise and fall to examine accent identification in the Oslo dialect. With just the first consonant-vowel syllable of the manipulated disyllabic words played to the listeners, a level or rising F0 in the stressed vowel was identified as accent 1, and a falling F0 at the end of the stressed vowel was identified as accent 2. In another perception study, Fintoft (1970) used synthesized stimuli composed of sine wave signals with manipulated frequency contours, superimposed on segments. The results also showed that Norwegian listeners identified a level or rising frequency contour at the end of the stressed vowel as accent 1, and a falling contour at that point as accent 2. These results indicate that the alignment of the F0 contour is a salient cue for the perception of the lexical pitch contrasts in a variety of Scandinavian dialects.

In terms of perception, narrow focus has been found to affect accuracy in the perception of the tonal accents. Listeners were better at distinguishing Norwegian words that had been produced in isolation than those excised from context, presumably since the speakers emphasized the F0 contours when no context was present (Van Dommelen, 2002). Speakers of the Roermond dialect of Dutch were more accurate at distinguishing the accents when they had been excised from a narrow focus context than from pre-nuclear or post-nuclear contexts (Fournier et al., 2006). Chen (2010) mentions a pilot study where focused Chinese words were excised from context and presented to listeners, who identified them with an accuracy rate of 90%. (This was compared to tones in post-focal position, which had an accuracy rate of 65%.) Combined, these results suggest that narrow focus exaggerates cues to the tonal contrasts and therefore contributes to enhanced word recognition. The current study examines the perception of accent 1 and accent 2 words with the goal of determining which cues are salient markers of the accentual distinction for listeners.

2.4 Goals and Research Questions

As described above, focus and sentence-level prosody can further affect the tonal contours. They impact segmental duration and determine the distribution and identity of tonal events as well as their exact realization. Little work has directly examined how pragmatic focus and higher level sentential intonation impact the accent contrast in Norwegian. The current study examines the nature of the tonal accent contrasts in the Trøndersk variety. It expands on previous research by conducting detailed acoustic analyses examining a number of acoustic cues (F0 maximum and minimum height and alignment, alignment of the F0 fall, F0 slope, vowel duration, accent phrase tone height and alignment) in a larger number of sentences and speakers. The stimuli used in this study control for the effect of sentence intonation, thereby investigating the lexical and intonation effects on the tonal contours separately. The question of interest is how much variation in the features that define the phonological tonal contrast can be allowed due to contrastive focus and sentence intonation while preserving the lexical contrast itself.

Accordingly, the first goal of the production studies is to examine how the tonal contrast is implemented in Trøndersk. A second goal is to investigate how sentence intonation (position in utterance) and focus impact the realization of the tonal accent contrast. Experiment 1 (Chapter 3) examines the tonal accent contrast in disyllabic words and the effect of contrastive focus on this contrast. Experiment 2 (Chapter 4) examines the effect of sentential intonation on the tonal accents in disyllabic words. Experiment 3 (Chapter 5) examines the accent contrast on monosyllabic words in both broad (noncontrastive) focus and contrastive focus.

Based on previous research, it is hypothesized that both disyllabic accents will have a HL contour and the tones in accent 2 will have a later alignment in relation to the segmental string than accent 1. Through this investigation, the current study will provide further insight into the question of whether accent 1 is L or HL (cf. Kristoffersen, 2006b) by examining the F0 contour of the sentence-initial words (the anacrusis) that precede the target words. It is predicted that given enough segmental material, the initial H of the accent 1 HL tonal accent will be observed. In comparing how the contrast is realized in AP-medial versus AP-final position, it is hypothesized that the AP-final boundary H% tone will be realized on the target word when in AP-final position, thus adding an extra tone to this unit. It is expected that the closer presence of the AP-final H% tone will cause an earlier alignment of the lexical tones in relation to the syllable, compared to AP-medial position.

This immediately following H tone may also cause the lexical L tone (F0 minimum) to not be as low as in the non-AP-final condition.

It is hypothesized that in contrastive focus, both accents will have a wider pitch range, earlier and higher AP H% tone, and longer segments. If indeed contrastive focus causes an enhancement of the tonal accent contrast, it is hypothesized that for disyllabic words, accent 1 will have an earlier alignment of F0 landmarks in contrastive focus than in broad focus, while accent 2 will have a later alignment in contrastive focus. For the monosyllabic words, it is hypothesized that in broad focus, the circumflex accent will have a wider pitch range and later F0 minimum alignment than the unmarked monosyllabic accent. As in other languages with lexical F0 contrasts on monosyllabic words, it is also expected that contrastive focus will induce an expanded pitch range (e.g., Xu, 1999), and also a higher and earlier AP H%.

The goal of the current perception study is to examine in detail which acoustic cues are important for lexical accent identification in Trøndersk. The current investigation expands on previous perception studies by systematically manipulating more cues (F0 minimum height and alignment, F0 maximum alignment, and alignment of the turning point from maximum to minimum). This is done in order to determine which cues trigger categorical shifts between the two accents and whether any one of these F0 cues or a combination of them will shift the responses. A perception experiment was designed to determine whether listeners use the differences in these cues in making word identification decisions, and their relative importance. Listener responses will provide an insight into how these acoustic dimensions are perceived and used in processing of lexical pitch contrasts. This study further builds on previous perception work by examining a large number of listeners and by carefully controlling for sentence intonation effects on the lexical tonal contrast by focusing on words that were produced in sentence-medial positions and with neutral intonation. Experiment 4 (Chapter 6) presented listeners with manipulated tokens and assessed which cues were used to differentiate between the two accents. For the manipulated contours, it is hypothesized that listeners will pay attention to the alignment of the F0 fall and F0 minimum, and the height of the F0 minimum. Specifically, an earlier and lower F0 minimum is expected to induce more accent 1 responses, and a later and higher F0 minimum, more accent 2 responses.
This dissertation examines the production and perception of fundamental frequency (F0), or pitch, in Trøndersk. Table 2.2 lays out the experiments conducted. The following research questions form the focus of this investigation:

(1) Which acoustic cues characterize tonal accent distinctions in this understudied dialect? Does the implementation of the lexical contrast differ for monosyllabic and disyllabic words? (Experiments 1, 3)
(2) What is the impact of sentence-level intonation and position within the Accent Phrase (medial vs. final) on the lexical tonal contrast? (Experiment 2)
(3) What is the impact of pragmatic contrastive focus on the lexical tonal contrast? (Experiments 1, 3)
(4) Which acoustic cues are listeners sensitive to in differentiating the lexical tonal accents? (Experiment 4)

The systematic examination of the acoustic F0 patterns will provide an insight into what cues characterize the accents. In order to examine this, the current study involves carrier sentences designed to compare F0 and duration cues in broad focus and contrastive focus of target words, and also sentences designed to examine these cues when a target word is at the right edge of an Accent Phrase versus a number of syllables before this point. Perception experiments examine perception of naturally produced words excised from sentence context, as well as manipulated versions of target words, to determine what acoustic features consistently distinguish the accents.

Production experiments:
       Word Length     Accents                Condition
(1)    Disyllabic      1, 2                   Broad & Contrastive Focus
(2)    Disyllabic      1, 2                   AP-medial & AP-final
(3)    Monosyllabic    Unmarked, Circumflex   Broad & Contrastive Focus

Perception experiment:
       Word Length     Accents                Condition
(4)    Disyllabic      1, 2                   Manipulated Contours

Table 2.2: Experiments

2.5 Outline

The outline of chapters is as follows: Chapters 3 to 5 describe the production experiments examining the monosyllabic and disyllabic accents in Trøndersk and their interaction with higher level intonation. Chapter 6 reports the experiments on the perception of the disyllabic accents. Chapter 7 provides a general discussion on the findings and how they relate to the literature on tonal accent and intonation.

Chapter 3

Experiment 1: Disyllabic Accent Realization in Broad Focus and Contrastive Focus

The first goal of this experiment was to examine how the two lexical tonal accents are realized in the Trøndersk dialect. The second goal was to examine how contrastive focus affects the accent contrast. It was hypothesized that in broad (noncontrastive) focus, both accents would have a HL contour, with accent 2 having a later alignment of tones in relation to the segmental string than accent 1 (Fintoft, 1970; Kristoffersen, 2006b). In contrastive focus, it was hypothesized that both accents would have longer segments, a higher and earlier AP H% tone (Koreman et al., 2009; Mixdorff et al., 2010), a higher F0 maximum and a lower F0 minimum (e.g., Xu, 1999). It may also be the case that accent 1 has an earlier alignment of tones, and accent 2 a later alignment, in contrastive focus than in broad focus.

3.1 Methods

3.1.1 Participants

Ten native speakers (6 female, 4 male) of the Trøndersk dialect, aged 18-45, participated in the experiment. They were recruited by posters and fliers around the campus of the Norwegian University of Science and Technology (NTNU), Trondheim, and were paid for their participation. Before the recording session, they filled out a language background questionnaire. The results confirmed that they were all from towns south and west¹ of Trondheim and had all grown up speaking Trøndersk at home. Their parents were also native speakers of this dialect.

¹ These towns were chosen because the circumflex accent still occurs here.

Speaker   Sex   Age range   Hometown
01        F     36-40       Tingvoll
02        F     25-30       Oppdal
03        F     30-35       Tingvoll
04        M     18-24       Øksendal
05        M     36-40       Rennebu
06        F     18-24       Surnadal
07        F     18-24       Sunndal
08        M     36-40       Halsa
09        M     36-40       Ålvundeid
10        F     18-24       Surnadal

Table 3.1: Speaker details

3.1.2 Materials

The target words were disyllabic and had initial stress. The stressed vowel was always /i/, to control for intrinsic pitch (Whalen and Levitt, 1995) and duration (Lindblom et al., 1981) differences. Only sonorant consonants appeared next to the stressed vowel, for example, ˈlimet “the glue” (accent 1) and ˈminne “memory” (accent 2). There were five target words for accent 1, each produced three times, giving 15 tokens. There were two target words² for accent 2, each produced seven or eight times, also giving 15 tokens. This gave 15 tokens per accent per speaker for each condition (noncontrastive and contrastive focus), for a total of 600 tokens (15 tokens x 2 accents x 2 conditions x 10 speakers). For accent 1, four of the five target words contained a phonologically long vowel followed by a short consonant (V:C), and one of the five had a short vowel followed by a long consonant (VC:). For accent 2, one of the target words had V:C and one VC:. These segment duration differences were compensated for in the timing measurements (see below).

The target words (shown in Table 3.2) were produced in sentences (listed in Appendix A) to elicit either a noncontrastive or contrastive focus reading. For the noncontrastive focus condition, a content word a number of syllables after the target word was contrasted with a word at the end of the sentence. This ensured that the target word did not receive contrastive focus. In the contrastive focus condition, the target word was contrasted with a word at the end of the sentence. In both conditions, the target word was preceded by two or three unstressed syllables, which were outside of any AP (Kristoffersen, 2006b). The target words were also followed by two unstressed syllables in the same AP, to ensure that the target word did not carry the H% boundary tone that marks the right edge of the AP in East Norwegian (Fretheim, 1987a). Example sentences for all conditions are below, with the target words highlighted in bold. (AP = accent phrase, IP = intonational phrase, IU = intonational utterance, based on the Trondheim Model.)

² Due to the constraints on vowel, stress, word class and consonant type, only two target words could be found for accent 2.

Accent 1, broad focus:
Det var glimtet i en film, men ikke i et stykke.
(((Det var (¹glimtet-i-en)AP (¹FILM)AP)IP, men itj i et (¹STYKKE)AP)IP)IU
“There was the flash in a film, but not in a play.”

Accent 1, contrastive focus:
Det var glimtet i en film, men ikke brannen.
(((Det var (¹GLIMTET-i-en)AP)IP ((¹film)AP, men itj (¹BRANNEN)AP)IP)IU
“There was the flash in a film, but not the fire.”

Accent 2, broad focus:
Det var et minne i en film, men ikke i et stykke.
(((Det var et (²minne-i-en)AP (¹FILM)AP)IP, men itj i et (¹STYKKE)AP)IP)IU
“There was a memory in a film, but not in a play.”

Accent 2, contrastive focus:
Det var et minne i en film, men ikke en drøm.
(((Det var et (²MINNE-i-en)AP)IP ((¹film)AP, men itj en (¹DRØM)AP)IP)IU
“There was a memory in a film, but not a dream.”

Accent 1   Gloss             Accent 2   Gloss
limet      the glue          minne      memory
linet      the flax/linen    Line       girl’s name
smilet     the smile
slimet     the mucus
glimtet    the flash

Table 3.2: Disyllabic target words
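The token counts given in Section 3.1.2 can be checked with a short enumeration of the design. The following is a minimal sketch in Python; the variable names are illustrative, and the split of accent 2 into seven plus eight productions is an assumption that is consistent with the stated total of 15 tokens.

```python
from itertools import product

# Tokens per accent, per speaker, per focus condition (Section 3.1.2).
tokens_per_accent = {
    "accent 1": 5 * 3,  # five target words, three productions each
    "accent 2": 7 + 8,  # two target words; assumed seven and eight productions
}
focus_conditions = ("noncontrastive", "contrastive")
speakers = range(1, 11)  # ten speakers

# Sum the tokens over every accent x condition x speaker cell.
total_tokens = sum(
    tokens_per_accent[accent]
    for accent, _focus, _speaker in product(tokens_per_accent, focus_conditions, speakers)
)
print(total_tokens)  # prints 600
```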

3.1.3 Procedure

The sentences were presented in slide format in a randomized order, which was the same for all participants, with the different focus conditions interspersed. The participants were in control of when to move to the next slide. The sentences were written in the standard Bokmål orthography and also in a transcription of Trøndersk, to encourage the participants to use the Trøndersk dialect. They were instructed to speak in a casual manner, as they would at home. The recordings were made using Adobe Audition at a sampling rate of 44.1 kHz. The experiments took place in the phonetics studio at NTNU. The production experiments took 30-45 minutes per participant, and participants were paid 170 NOK (approx. US$30). All production experiments (Experiments 1-3) took place in the same sitting.
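One way to obtain a randomized order that is nevertheless identical for every participant is to shuffle the sentence list once with a fixed seed. The sketch below illustrates this idea in Python; it is not a description of how the slides for this study were actually prepared, and the seed value, function name and item labels are all hypothetical.

```python
import random

def fixed_presentation_order(sentences, seed=2015):
    """Return a shuffled copy of the list that is the same on every call."""
    order = list(sentences)             # copy, so the input list is untouched
    random.Random(seed).shuffle(order)  # private generator with a fixed seed
    return order

# Hypothetical item labels: (accent, focus condition, repetition).
items = [
    (accent, focus, repetition)
    for accent in ("accent 1", "accent 2")
    for focus in ("noncontrastive", "contrastive")
    for repetition in range(1, 4)
]

order_for_participant_a = fixed_presentation_order(items)
order_for_participant_b = fixed_presentation_order(items)
```

Because the generator is seeded, both calls return the same sequence, with the focus conditions interspersed rather than blocked.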

3.1.4 Measurements and Analysis A number of measurements were taken using Praat (Boersma and

Weenink, 2011) to examine the F0 contours and segment durations in detail. The tonal and segmental landmarks labels are shown in Figure 3.1. Figure 3.2 shows a pitch track of part of a sentence from Praat, comparing the accent 2 word ‘Line’ in noncontrastive focus and contrastive focus. Table 3.3 shows the calculations made for duration and pitch measure- ments. All measurements were made on the target word, except for AP H% measurements, which were made on the final syllable in the AP. AP H% timing was measured in milliseconds from the AP H% tone to the AP boundary. This

23 Figure 3.1: Accent 1 contour of the word ‘linet’ and the two following un- stressed syllables in the AP showing measurement points. S = beginning of the sentence; B = beginning of f0 rise; C1 = onset of target word; C2 = onset of second consonant (if present); V1 = vowel onset; C3 = onset of post-vocalic consonant; V2 = unstressed vowel onset; W = end of target word; AP = end of AP; H = F0 maximum, HTP = turning point from F0 maximum; L = F0 minimum; LTP = turning point from F0 minimum; APH = AP boundary tone.

was measured to determine whether it is higher and earlier, as expected, in the contrastive focus condition. The alignment of F0 minimum and high turn- ing point (henceforth ‘HTP’, the point where F0 starts to fall, which occurs after a high plateau) were measured from vowel onset and then divided by the combined duration of the vowel and post-vocalic consonant. This was done to control for speaking rate differences. The duration of the vowel and follow- ing consonant were combined because some target words had V:C and some VC:, so combining these allowed for pooling of timing measures regardless of phonological . Because F0 maximum often occurred before word onset or vowel onset, especially for accent 1 words, the timing of this measure was divided by word duration. This was to compensate for speaking rate but also for onset type differences, since some target words had a complex onset, and some did not. These F0 landmarks were measured to determine whether, as hypothesized, accent 2 has a later tonal alignment than accent 1, and to examine whether contrastive focus affects tonal height and alignment. Slope

24 Figure 3.2: Example pitch track highlighting accent 2 word ‘Line’ in noncon- trastive focus (top) and contrastive focus (bottom).

of the rise was the pitch difference between the beginning of the F0 rise and the F0 maximum, divided by the duration between these two points. This was measured to determine whether both accents had a rise to an initial H tone. Slope of the fall was the pitch difference from H to L (F0 maximum to minimum), divided by the duration between the two points. Boundary slope was the F0 difference between the low turning point (LTP, where the contour turns from L toward the AP H%) and the AP H%, divided by the duration between these two tonal events. This was measured in order to examine whether contrastive focus affects the pitch contour leading to the AP H% tone. Stressed vowel duration was examined separately for long and short vowels. Figure 3.3 indicates where the alignment cues occur on the accents.

Vowel onset was determined by the beginning of periodicity in the waveform, higher intensity than the surrounding sonorants and consistent in the spectrogram. F0 maximum and minimum were determined by examining the pitch track for the highest or lowest point. When this was unclear, the region was selected and the Praat function for choosing local maxima and minima was used. HTP was where the F0 height was equal to the F0 maximum height and began to drop, as determined by examining the pitch track.

Variables                              Calculation                     Units
Pitch contour:
  F0 maximum                                                           semitones
  F0 minimum                                                           semitones
  AP H% height                                                         semitones
Duration:
  Stressed vowel duration              C3-V1                           msec
  Unstressed (final) vowel duration    W-V2                            msec
  AP H% timing from AP boundary        AP time - APH time              msec
Calculations:
  Slope of the rise                    (H-B)F0 / (H-B)dur.             relative
  F0 maximum alignment                 (H-V1)dur. / Word dur.          relative
  F0 minimum alignment                 (L-V1)dur. / VC dur.            relative
  F0 max. turning point                (HTP-V1)dur. / VC dur.          relative
  Slope of the fall                    (H-L)F0 / (H-L)dur.             relative
  Boundary slope                       (APH-LTP)F0 / (APH-LTP)dur.     relative

Table 3.3: Variables

Figure 3.3: Trøndersk accent 1 (solid) & accent 2 (dashed), showing F0 maximum, F0 minimum and high turning point.

A mixed model multiple linear regression analysis was conducted using the lmerTest package in R (R Development Core Team, 2008). The independent variables were accent (1 or 2) and focus realization (broad or contrastive), giving the model Accent * Realization, and the dependent variables were the measures listed in Table 3.3. Speaker and word were included as random effects.
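As an illustration of how the slope and relative alignment measures in Table 3.3 are calculated, the following Python sketch computes them for a single token. The function name, the argument names and the landmark values are hypothetical and are not taken from the recordings; the measurements themselves were made in Praat. Note that (H-L)dur. is negative because L follows H, so the slope of the fall comes out negative.

```python
def derived_measures(b_time, b_f0, h_time, h_f0, l_time, l_f0,
                     htp_time, v1_onset, word_dur, vc_dur):
    """Compute the calculated measures of Table 3.3 for one token.

    Times are in msec, F0 values in semitones. B is the beginning of the
    F0 rise, H the F0 maximum, L the F0 minimum, HTP the high turning
    point and V1 the onset of the stressed vowel.
    """
    return {
        # (H-B)F0 / (H-B)dur.
        "slope_rise": (h_f0 - b_f0) / (h_time - b_time),
        # (H-L)F0 / (H-L)dur.; the duration term is negative since L follows H
        "slope_fall": (h_f0 - l_f0) / (h_time - l_time),
        # (H-V1)dur. / Word dur.; negative when H precedes vowel onset
        "f0_max_alignment": (h_time - v1_onset) / word_dur,
        # (L-V1)dur. / VC dur.
        "f0_min_alignment": (l_time - v1_onset) / vc_dur,
        # (HTP-V1)dur. / VC dur.
        "htp_alignment": (htp_time - v1_onset) / vc_dur,
    }


# Invented landmark values, for illustration only.
measures = derived_measures(
    b_time=0, b_f0=90, h_time=50, h_f0=92, l_time=200, l_f0=86,
    htp_time=80, v1_onset=40, word_dur=400, vc_dur=240,
)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

An F0 maximum that occurs before vowel onset yields a negative relative alignment, which is how the negative values for accent 1 in the alignment results should be read.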

3.2 Results

Tables of raw results by speaker can be found in Appendix B. Figure 3.4 shows the target words for the two accents in broad focus and Figure 3.5 shows the two accents in contrastive focus, based on average measurements, for one speaker. In each figure, the accents are time normalized with respect to one another. In broad focus, the accent contours are similar but accent 2 has a later alignment and a high plateau across most of the vowel. It can also be seen that accent 2 has a higher F0 minimum than accent 1. In contrastive focus, there is a wider pitch range and later timing of accent 2, and a longer vowel (by 20 msec), as in Figure 3.5.

Figure 3.4: Contour of the accents in broad focus based on average measurements for one speaker. The darker line is accent 2.

Figure 3.5: Contour of the accents in contrastive focus based on average measurements for one speaker. The darker line is accent 2.

The mixed model multiple linear regression had accent and focus realization as the independent variables. Accent refers to the difference in any measure between accent 1 and accent 2. Focus realization refers to the difference between the broad and contrastive focus productions. The reference level for accent is accent 1, and the reference level for focus realization is broad focus. The polarity of the coefficient shows whether accent 2 or the contrastive focus realization has a higher or lower value for a particular measure, in comparison to accent 1, or the broad focus realization, respectively. If the coefficient is negative, accent 2 or the contrastive focus realization has a lower average value (or earlier alignment) for this measure than accent 1 or the broad focus realization. The interaction refers to whether the two variables interact. These interactions were further investigated with pairwise posthoc tests using the lsmeans package in R. The results for each independent variable are presented in turn.
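The role of the reference levels can be made concrete with a small Python sketch. It assumes treatment (dummy) coding, which is the default for unordered factors in R, and it ignores the random effects entirely, so it reproduces the logic of the coefficients rather than the fitted values reported in the tables below. The function name and the cell means are hypothetical.

```python
def treatment_coefficients(cell_means):
    """Fixed-effect coefficients implied by four cell means.

    cell_means maps (accent, focus) to a mean. The reference levels are
    accent 1 and broad focus, as in the analysis described above.
    """
    intercept = cell_means[(1, "broad")]
    # Accent coefficient: accent 2 minus accent 1 at the reference focus level.
    accent = cell_means[(2, "broad")] - intercept
    # Focus coefficient: contrastive minus broad at the reference accent level.
    focus = cell_means[(1, "contrastive")] - intercept
    # Interaction: how much the focus effect for accent 2 differs from
    # the focus effect for accent 1.
    interaction = (cell_means[(2, "contrastive")] - cell_means[(2, "broad")]) - focus
    return {"intercept": intercept, "accent": accent,
            "focus": focus, "interaction": interaction}


# Invented cell means, for illustration only.
coefs = treatment_coefficients({
    (1, "broad"): 10.0, (1, "contrastive"): 11.0,
    (2, "broad"): 12.0, (2, "contrastive"): 14.5,
})
print(coefs)
```

In this invented case all three coefficients are positive: accent 2 is higher than accent 1, contrastive focus is higher than broad focus, and the focus effect is larger for accent 2 than for accent 1.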

            Broad    Contrastive
Accent 1    93.6     93.7
Accent 2    93.7     93.5

Table 3.4: Average raw results for F0 maximum (semitones) by accent and focus realization

               Coef.    t-value    p-value
Accent         -0.05    -0.2       p=0.88
Focus          0.23     2.1        p<0.05*
Interaction    0.24     1.6        p=0.12

Table 3.5: Statistical results for F0 maximum examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

F0 Maximum

It was hypothesized that F0 maximum would not differ between the accents but that it would be higher in contrastive focus than in broad focus. The average raw results are shown in Table 3.4. The statistical results (Table 3.5) show that accent did not have a significant effect on this measure, while for both accents, F0 maximum was higher in contrastive focus.

F0 Minimum

It was hypothesized that F0 minimum would be higher for accent 2 than accent 1, and that for both accents it would be lower in contrastive focus than in broad focus. The average raw results are shown in Table 3.6. The statistical results (Table 3.7) show that accent 2 had a higher F0 minimum than accent 1, as predicted, and that in contrastive focus both accents had a lower F0 minimum than in broad focus, also as predicted.

            Broad    Contrastive
Accent 1    86.8     86.2
Accent 2    88.1     87.2

Table 3.6: Average raw results for F0 minimum (semitones) by accent and focus realization

               Coef.    t-value    p-value
Accent         1.2      6          p<0.01*
Focus          -0.35    -3.1       p<0.01*
Interaction    0.25     1.5        p=0.134

Table 3.7: Statistical results for F0 minimum examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

F0 Maximum Alignment

It was hypothesized that accent 2 would have a later F0 maximum alignment than accent 1 and that in contrastive focus, accent 1 would have an earlier alignment and accent 2 a later alignment. The average raw results are shown in Table 3.8. The statistical results (Table 3.9) show a main effect of accent and a significant interaction between accent and focus. The pairwise results revealed that accent 2 had a later alignment than accent 1, and in contrastive focus, accent 2 had a later alignment than in broad focus.

            Broad    Contrastive
Accent 1    -0.09    -0.07
Accent 2    0.03     0.11

Table 3.8: Average raw results for F0 maximum alignment (relative to word length) by accent and focus realization

               Coef.    t-value    p-value
Accent         0.12     4.2        p<0.01*
Focus          0.01     0.95       p=0.34
Interaction    0.07     3.9        p<0.001*

Table 3.9: Statistical results for F0 maximum alignment examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

            Broad    Contrastive
Accent 1    0.64     0.66
Accent 2    1        1

Table 3.10: Average raw results for F0 minimum alignment (relative to VC length) by accent and focus realization

               Coef.    t-value    p-value
Accent         0.39     7.1        p<0.001*
Focus          0.02     1.4        p=0.174
Interaction    -0.04    -1.88      p=0.06

Table 3.11: Statistical results for F0 minimum alignment examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

F0 Minimum Alignment

It was hypothesized that accent 2 would have a later F0 minimum alignment than accent 1 and that in contrastive focus, accent 1 would have an earlier alignment and accent 2 a later alignment. The average raw results are shown in Table 3.10. The statistical results (Table 3.11) show that accent 2 has a later alignment than accent 1, but no significant effect of focus.

HTP Alignment

It was hypothesized that accent 2 would have a later HTP alignment than accent 1 and that in contrastive focus, accent 1 would have an earlier alignment and accent 2 a later alignment. The average raw results are shown in Table 3.12. The statistical results (Table 3.13) show that accent 2 has a later alignment than accent 1, but no significant effect of focus.

            Broad    Contrastive
Accent 1    0.06     0.04
Accent 2    0.5      0.5

Table 3.12: Average raw results for HTP alignment (relative to VC length) by accent and focus realization

               Coef.    t-value    p-value
Accent         0.44     6.5        p=0.001*
Focus          -0.01    -1         p=0.306
Interaction    0.016    0.96       p=0.337

Table 3.13: Statistical results for HTP alignment examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

Slope of the Rise

It was hypothesized that slope of the rise would not be significantly different between accents (based on the hypothesis that both accents have a HL contour) and conditions. The average raw results are shown in Table 3.14. The statistical results (Table 3.15) show no effects, as predicted.

            Broad    Contrastive
Accent 1    0.02     0.02
Accent 2    0.018    0.017

Table 3.14: Average raw results for slope of the rise by accent and focus realization

               Coef.     t-value    p-value
Accent         -0.003    -1.7       p=0.14
Focus          0.0003    0.3        p=0.782
Interaction    -0.002    -1.2       p=0.224

Table 3.15: Statistical results for slope of the rise examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

            Broad    Contrastive
Accent 1    -0.04    -0.04
Accent 2    -0.03    -0.03

Table 3.16: Average raw results for slope of the fall by accent and focus realization

               Coef.     t-value    p-value
Accent         0.01      7          p<0.001*
Focus          0.0006    0.7        p=0.5
Interaction    -0.006    -4.1       p<0.001*

Table 3.17: Statistical results for slope of the fall examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

Slope of the Fall

It was hypothesized that slope of the fall would be steeper for accent 2 than accent 1 due to a later alignment of the HTP. It was also hypothesized that it would be steeper for accent 2 in contrastive focus, due to a later alignment. The average raw results are shown in Table 3.16. The statistical results (Table 3.17) show a main effect of accent and an interaction. The pairwise results revealed that accent 2 had a steeper slope than accent 1, and that in contrastive focus, accent 2 had a steeper slope than in broad focus, as predicted.

AP H% Height

It was hypothesized that AP H% height would not differ between the accents but would be higher in contrastive focus, based on previous work (Koreman et al., 2009; Mixdorff et al., 2010). The average raw results are shown in Table 3.18. The statistical results (Table 3.19) showed no difference between the accents and a higher result for this measure in contrastive focus, as predicted.

            Broad    Contrastive
Accent 1    92.1     93.7
Accent 2    92.3     93.2

Table 3.18: Average raw results for AP H% height (semitones) by accent and focus realization

               Coef.    t-value    p-value
Accent         0.11     0.5        p=0.624
Focus          1.8      10.3       p<0.001*
Interaction    -0.2     -0.9       p=0.385

Table 3.19: Statistical results for AP H% height examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

AP H% Timing

It was hypothesized that AP H% timing would not differ between the accents but would be earlier in contrastive focus. The average raw results are shown in Table 3.20. The statistical results (Table 3.21) showed no difference between the accents and that the AP H% tone was earlier in contrastive focus, as predicted.

            Broad    Contrastive
Accent 1    30       41
Accent 2    31       40

Table 3.20: Average raw results for AP H% timing (msec) by accent and focus realization

               Coef.    t-value    p-value
Accent         1.1      0.6        p=0.565
Focus          12.8     5.4        p<0.001*
Interaction    -2.4     -0.7       p=0.467

Table 3.21: Statistical results for AP H% timing examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

Boundary Slope

It was hypothesized that boundary slope would not differ between the accents but would be steeper in contrastive focus, due to the AP H% being earlier and higher. The average raw results are shown in Table 3.22. The statistical results (Table 3.23) showed an effect of accent and focus condition and an interaction. The pairwise results revealed that accent 2 had a steeper boundary slope than accent 1, and that both accents had a steeper boundary slope in contrastive focus.

            Broad    Contrastive
Accent 1    0.02     0.04
Accent 2    0.02     0.03

Table 3.22: Average raw results for boundary slope by accent and focus realization

               Coef.     t-value    p-value
Accent         0.005     3.4        p<0.001*
Focus          0.02      10.8       p<0.001*
Interaction    -0.007    -3.3       p<0.001*

Table 3.23: Statistical results for boundary slope examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

Stressed Vowel Duration

It was hypothesized that there would be no effect of accent on vowel duration, but that the vowel would be longer in contrastive focus. The average raw results are shown in Table 3.24. The statistical results (Table 3.25) show a longer stressed vowel in contrastive focus.

            Broad    Contrastive
Accent 1    162      182
Accent 2    170      190

Table 3.24: Average raw results for long stressed vowels (msec) by accent and focus realization

               Coef.    t-value    p-value
Accent         6.2      0.5        p=0.67
Focus          17.2     5.1        p<0.001*
Interaction    4        0.7        p=0.48

Table 3.25: Statistical results for stressed vowel duration examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

            Broad    Contrastive
Accent 1    63       80
Accent 2    70       90

Table 3.26: Average raw results for unstressed vowels (msec) by accent and focus realization

Unstressed Vowel Duration

It was hypothesized that accent 2 would have a longer final vowel than accent 1, based on Fintoft (1970), and that contrastive focus may induce a longer vowel. The average raw results are shown in Table 3.26. The statistical results (Table 3.27) show no difference between the accents for this measure, and that this vowel is longer in contrastive focus.

               Coef.    t-value    p-value
Accent         7.6      1.4        p=0.219
Focus          16.1     5.4        p<0.001*
Interaction    1.9      0.4        p=0.657

Table 3.27: Statistical results for unstressed vowel duration examining the effect of accent (1 or 2) and focus realization (broad or contrastive), and their interaction.

In sum, the results revealed that in broad focus, accent 2 had a later F0 maximum alignment, higher F0 minimum, later F0 minimum alignment, later HTP and steeper slope of the fall than accent 1. In contrastive focus, a higher F0 maximum, lower F0 minimum, higher and earlier AP H% tone, steeper boundary slope and longer vowels were found for both accents. Only accent 2 had a later alignment in contrastive focus as compared to broad focus. Accent 2 also had a significantly steeper slope in broad focus than in contrastive focus, but focus did not affect the slope for accent 1. These results demonstrate that contrastive focus affected some cues to the accent contrast in the same way for each accent while some cues were affected differently by contrastive focus. Both accents had a wider pitch range, higher and earlier AP H% tone, steeper boundary slope and longer vowels.

Only accent 2 had a later F0 maximum alignment in contrastive focus, while the timing of accent 1 was not affected. The results also revealed that in contrastive focus, accent 2 words had a higher F0 minimum, later F0 minimum and maximum alignment and later HTP alignment compared to accent 1 words. This is the same set of distinctions between the two accents that was found in broad focus. However, in contrastive focus, accent 2 words no longer had a steeper slope of the fall or boundary slope than accent 1 words. Overall, then, the accents differed from one another in fewer cues in contrastive focus than in broad focus.

However, examining the ‘distance’ for those measures that were significantly different for the two accents in both focus realizations reveals that the magnitude of the difference between the accents is greater in contrastive focus for three of the four measures (see Table 3.28). For example, F0 minimum shows the difference in semitones between the two accents in broad focus versus in contrastive focus. The alignment distinctions are the relative timing measures from the current experiment. (The mean for one accent was subtracted from the mean for the other accent to calculate the difference.) F0 minimum alignment shows that the difference in relative timing of this measure (from vowel onset) between the accents in broad focus was 0.35, while in contrastive focus this difference increased to 0.4, meaning that the timing for accent 2 is later than for accent 1 in both focus realizations, but that difference is larger in contrastive focus. The only measure where the difference is not greater in contrastive focus is F0 minimum height. For the other three measures, the difference between the accents is somewhat enlarged in contrastive focus. This suggests that even though the number of cues differentiating accent 1 and 2 in contrastive focus is reduced compared to broad focus, some of the remaining cues are enhanced in a way that exaggerates the contrast.

Measure                        Broad    Contrastive
F0 Minimum (st)                1.3      1.14
F0 Max alignment (relative)    0.12     0.19
F0 Min alignment (relative)    0.35     0.4
HTP (relative)                 0.41     0.46

Table 3.28: Magnitude of differences between the accents in broad and contrastive focus.

Another finding was that a model including vowel length as an independent variable was a better predictor of two of the alignment measures (F0 minimum alignment (p<0.001) and HTP (p<0.001)) than a model with just accent and realization. This was determined by a likelihood ratio test using the anova function in R, which compares models to determine which one explains more variance in the data. This finding will be discussed below.
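The logic of a likelihood ratio test between two nested models can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis that was run: it assumes that vowel length enters the larger model as a single extra parameter, so that the test statistic is compared to a chi-square distribution with one degree of freedom, and the log-likelihood values are invented. The function name is hypothetical.

```python
import math


def likelihood_ratio_test_df1(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic and its p-value from a chi-square
    distribution with 1 degree of freedom.
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    if statistic < 0:
        raise ValueError("the full model cannot fit worse than the reduced model")
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented log-likelihoods, for illustration only.
stat, p = likelihood_ratio_test_df1(loglik_reduced=-1010.0, loglik_full=-1000.0)
print(f"chi-square(1) = {stat:.1f}, p = {p:.2g}")
```

A small p-value indicates that the additional predictor improves the fit by more than would be expected by chance.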

3.3 Discussion

Describing the accent contours

The results of the acoustic analyses revealed that the main differences between the accents were that accent 2 had a higher F0 minimum, a later alignment of F0 minimum, F0 maximum and HTP, and a steeper slope of the fall than accent 1. These findings are in line with previous work describing the tonal contrasts as one of alignment (Fintoft, 1970; Kristoffersen, 2006b).

In order to further substantiate the presence of an initial H tone in accent 1 (cf. Nilsen, 1992), additional analyses were conducted. To that end, the anacrusis (the unstressed sentence-initial syllables) was marked whenever the H occurred before word or vowel onset. Unlike accent 2, accent 1 words frequently appeared with an early (before word onset) F0 maximum. This was found to occur before vowel onset for 64% of accent 1 words, and 15% of accent 2 words. Recall that there is no difference in F0 maximum height between the two accents. The results thus suggest an analysis where both accents contain an initial H tonal target, which for accent 1 words often occurs before word onset. The presence of this initial H could reflect a phonological H tone, or it may be a phonetic device whereby speakers start off at a higher pitch in order to highlight the important L tone for accent 1 words. However, this latter analysis seems typologically unusual, and also, in the Oslo dialect accent 1 is just L, and has not been described as having any fall to this target (Fintoft, 1970; Kristoffersen, 2006b). This suggests that an initial phonetic high is not necessary to highlight the L tone.

Next, the alignment of the tone with the segmental or syllabic boundary to which it is closest was examined (Schepman et al., 2006; Remijsen, 2013). While this will not conclusively determine if the initial H is phonetic or phonological, examining alignment of the tonal targets can provide a description of which ones characterize the contour of each accent by being closely aligned with a segment or syllable boundary. Table 3.29 shows the distance in milliseconds that the F0 landmarks (F0 maximum, F0 minimum, HTP) are from vowel onset³ and syllable offset (end of the vowel for long vowels and halfway through the total duration of the geminate consonant for short vowels), for each accent.

With regard to the status of the initial H, these alignment data show that the F0 maximum in accent 1 words occurred on average 29 milliseconds before vowel onset and the HTP 16 milliseconds after vowel onset, which results in the perception of an initial fall. Accent 2, in contrast, had its F0 maximum on average 18 milliseconds into the vowel, and its HTP 106 milliseconds after vowel onset, resulting in a high plateau across much of the vowel. These alignment data provide evidence that both accents have an initial H tonal target, with significantly different alignment.

³ Vowel onset was chosen instead of syllable onset because some words had a simple onset and some had a complex onset.

It was posited that accent 1 in this dialect has a HL contour, as suggested by Kristoffersen (2006b). The results here indicate that indeed accent 1 has an initial H tone, because the differences between the accents lie in the alignment of the contour, rather than in the height of the F0 maximum or the initial rise to this target. If accent 1 did not have an initial H tone, it is likely that it would have a shallower initial rise, or no rise, and a lower F0 maximum than accent 2. An experiment using intonation modeling also found no difference in the slope of the rise between the accents in this dialect (Kelly and Schweitzer, 2015).

                   Accent 1                      Accent 2
                   F0 Max.   F0 Min.   HTP       F0 Max.   F0 Min.   HTP
Vowel onset        -29       159       16        18        232       106
Syllable offset    -194      -6        -149      -135      79        -49

Table 3.29: Distance of F0 landmarks from vowel onset and syllable offset. The lowest number of milliseconds for each accent is highlighted. Negative numbers mean the F0 landmark occurs before the relevant segmental boundary.

Examining the distances of the F0 landmarks from segment boundaries further, it can be seen that for accent 1 words, the F0 minimum (L tone) is aligned with the end of the syllable (only 6 milliseconds before it). For accent 2, the F0 maximum (H tone) is most closely aligned with vowel onset (only 18 milliseconds after it). This suggests that accent 1 is right-aligned and accent 2 is left-aligned. For accent 1, the HTP being only 16 milliseconds after vowel onset is likely due to the alignment of the F0 minimum with the end of the syllable. This is in line with results in Fintoft (1970) where the L tone was found to be at vowel offset in V:C words and halfway into the geminate consonant in VC: words. However, he found that accent 2 words have their H tone halfway into the vowel, but in the current experiment, the H tone was found to be earlier than that. However, recall that Fintoft examined the dialect spoken in the city of Trondheim, which was not the case in the current production study. The current results indicate that for accent 1, the fall is the most stable part of the contour but for accent 2, the alignment of the initial H, rather than the fall, seems to be more stable.
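The comparison described above amounts to finding, for each accent, the F0 landmark with the smallest absolute distance to a boundary. A minimal Python sketch using the values from Table 3.29 is given below; the data structure and the function name are hypothetical and serve only to make the comparison explicit.

```python
# Distances in milliseconds from Table 3.29, keyed by (landmark, boundary).
# Negative values mean that the landmark precedes the boundary.
DISTANCES_MS = {
    "accent 1": {
        ("F0 max", "vowel onset"): -29,
        ("F0 min", "vowel onset"): 159,
        ("HTP", "vowel onset"): 16,
        ("F0 max", "syllable offset"): -194,
        ("F0 min", "syllable offset"): -6,
        ("HTP", "syllable offset"): -149,
    },
    "accent 2": {
        ("F0 max", "vowel onset"): 18,
        ("F0 min", "vowel onset"): 232,
        ("HTP", "vowel onset"): 106,
        ("F0 max", "syllable offset"): -135,
        ("F0 min", "syllable offset"): 79,
        ("HTP", "syllable offset"): -49,
    },
}


def closest_landmark(distances):
    """Return ((landmark, boundary), distance) with the smallest absolute distance."""
    return min(distances.items(), key=lambda item: abs(item[1]))


for accent, distances in DISTANCES_MS.items():
    (landmark, boundary), dist = closest_landmark(distances)
    print(f"{accent}: {landmark} is {abs(dist)} ms from {boundary}")
```

For accent 1 this picks out the F0 minimum at syllable offset, and for accent 2 the F0 maximum at vowel onset, which is the basis for describing the accents as right-aligned and left-aligned, respectively.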

Figure 3.6: Phonological analysis of Trøndersk disyllabic accents

Combined, these results suggest that phonologically, in addition to the presence of an initial H, accent 1 is characterized by an L tone that is associated with the right edge of the syllable while accent 2 is characterized by a H tone associated with the left edge of the syllable. Work by Prieto et al. (2005), Morén and Zsiga (2006) and Remijsen (2013) shows analyses of lexical tonal languages where a tone can be linked with either edge of the tone-bearing unit. Kristoffersen (2000) interprets similar results for East Norwegian as the L tone in accent 1, and the H tone in accent 2, linking with the stressed syllable. Analyses where the tones associate with the stressed syllable are also given for Stockholm Swedish by Gussenhoven and Bruce (1999) and for Swedish and Norwegian by Riad (1998). The results here also coincide with those found for Oppdal and Trondheim, where the L tone was described as being associated with the stressed syllable in accent 1 and the post-stressed syllable in accent 2 (Fintoft, 1970; Kristoffersen, 2003, 2006b). This would be exemplified as in Figure 3.6. In this analysis, the initial H in accent 1 is timed with respect to the following L (H+L* in the terminology of Pierrehumbert, 1980).

Another alternative, as proposed by Kristoffersen (2006a) for the North Gudbrandsdal dialect (spoken south of the Trøndelag region), is that the initial H in accent 1 is actually associated with the syllable preceding the stressed syllable. Recall that for 64% of accent 1 words, the F0 maximum occurred before vowel onset. If this pre-stress syllable analysis were to hold, though, it is likely that for all accent 1 target words, the initial F0 maximum would occur before word onset.

As found in Dutch, for instance, by Schepman et al. (2006), segmental makeup and syllable type can shift tonal alignment. For example, they found an earlier alignment of the H tone in long vowels than in short vowels. In terms of syllable type, the H was found midway through the vowel for long vowels but late in the vowel for short vowels. In the current study, in the case of V:C words, the C is the onset of the final syllable. In the case of VC:, the second part of the geminate consonant is the onset of the final syllable. The average length of the vowel and consonant is 240 milliseconds, while a long vowel alone is 160 milliseconds and a short vowel 80 milliseconds. Accent 1 words have their F0 minimum at 150 milliseconds, which is at vowel offset in V:C words and is about halfway through the consonant in VC: words. Accent 2 words have their F0 maximum 18 milliseconds into the vowel, their HTP 106 milliseconds after vowel onset, and their F0 minimum 230 milliseconds after vowel onset, which is into the second syllable. As such, for accent 1 words the L tone in Trøndersk occurs at the end of the vowel in words with a phonologically long vowel, but into the following consonant in words with a short vowel.

Another interesting question to examine is whether the different syllable types would lead to a different F0 contour, due to truncation or compression. While all sonorants were used next to the target vowels, vowel length and syllable structure can affect F0 contour realization (Van Santen and Hirschberg, 1994; Schepman et al., 2006; Jilka and Möbius, 2006; Möbius and Jilka, 2007; Ladd et al., 2009). The current experiment was not set up to examine this, but since half of accent 2 words were V:C and half of them VC:, and 80% of accent 1 words V:C and 20% VC:, the effect of vowel length can be examined. Since the number of tokens with the different syllable types is not balanced for accent 1 words, these results are considered only preliminary, and vowel length was not included as a variable in the overall analysis. Likelihood ratio tests indicate that a model with accent, realization and vowel length is significantly better than a model with only accent and realization as predictors for both HTP alignment and L alignment⁴. Examination of mean raw measurements for these indicates that for both accents, words with short vowels (VC:) have an earlier alignment of L. Prieto and Torreira (2007) found that in Spanish, F0 peaks occurred earlier in the syllable for closed than open syllables. One possible explanation for the current results is that in VC: words, the ambisyllabic C is not split evenly across the syllables, so the rime of this syllable type is shorter than that of a syllable with a long vowel. As a result, the alignment is shifted earlier in VC: words. L and H remain the relevant landmarks for accents 1 and 2, respectively, because even though there are differences between the two syllable structures for the L tone, this is still the F0 landmark closest to a syllabic boundary for accent 1 words. Therefore, including vowel length as an independent variable does not change the analysis of the accent difference or the effect of focus. It leaves questions for future research into syllable structure with a vowel length contrast and its interaction with F0 alignment.

⁴ The alignment measures for L and HTP are the only significant measures wherein including vowel length as an independent variable significantly improved the model. For these, accent is still significant within the new model, and the direction of the difference between the two accents remains the same. The effects of contrastive focus realization on the accents also remain unchanged.

Finally, Fintoft (1970) described accent 2 as having a longer final vowel than accent 1, but this was not found here. The different findings of Fintoft (1970) might be due to his target words all being from one position (sentence-final), whereas in the current experiment, both the accent contrast and its interaction with focus are examined. When a subset of the current data was analyzed solely to examine the accent contrast in broad focus, accent 2 was found to have a longer final vowel than accent 1 (Kelly and Smiljanić, 2014). However, this difference was no longer significant in the current, larger dataset. In this vein, the results may suggest that accent 2 has a slightly longer unstressed vowel than accent 1, but when examined in the context of a larger dataset that also examines the effect of pragmatic focus, this slight difference is no longer significant. This result may also suggest that the difference is so small that it gets lost when higher level intonation affects the accents, in particular when contrastive focus increases the length of this final vowel.

The accent contrast in contrastive focus

The overall effects of contrastive focus were that both accents had a wider pitch range, higher and earlier AP H% tone, steeper boundary slope and longer vowels. Accent 2 also had a later F0 maximum alignment in contrastive focus than in broad focus.

The higher and earlier AP H% is in line with previous findings (Koreman et al., 2009; Mixdorff et al., 2010). The higher AP H% tone likely arises from the combined boundary H% tone and the focal H tone (Fretheim, 1987b; Nilsen, 1989; Kristoffersen, 2000). The steeper boundary slope is likely a result of this higher and earlier boundary tone. The results also revealed that both accents had an expanded pitch range, a higher F0 maximum and lower F0 minimum, on the contrastive focus words. This is in line with findings on contrastive focus across a number of languages (e.g., Xu, 1999; Scholz, 2012; Peters et al., 2014). The current study also found that contrastive focus affected segmental durations. In contrastive focus, phonologically long vowels were lengthened but short vowels were not. Such asymmetrical lengthening (for long vowels but not short) was found in a number of other languages, such as Swedish (Bruce, 1977; Bannert, 1979) and Serbian and Croatian (Smiljanić, 2002). Final unstressed vowels were also lengthened in contrastive focus, as was found for Swedish (Heldner and Strangert, 2001). The lengthening patterns suggest that contrastive focus impacts the whole target word, not just the stressed syllable. Furthermore, the asymmetrical impact of focus on phonologically short and long vowels enhances the distinction between them.

The results also revealed an asymmetrical effect of contrastive focus on the pitch contours of the two accents. Accent 2 words had a later alignment of the F0 maximum in contrastive focus compared to broad focus. However, the alignment of the tones in accent 1 words did not change in contrastive focus. This suggests that, similar to the vowel length differences, the alignment difference between the accents is enhanced in contrastive focus. A later alignment of tones in accent 1 in contrastive focus could result in the maintenance of the contrast, i.e., the same magnitude of alignment differences in broad and contrastive focus, or potentially even diminish the contrast. However, this was avoided since no F0 alignment differences between broad and contrastive focus were found for accent 1 words. The results here thus provide evidence that the presence of the lexical tonal accent contrast limits how the pitch cues are used to express pragmatic focus. Accent 1 words, therefore, express contrastive focus by expanding vertically, rather than changing alignment.

Finally, Table 3.28 showed that although the accents differed from one another in fewer cues in contrastive focus than in broad focus, the magnitude of these differences was greater in contrastive focus. It was hypothesized that accent 1 would have an earlier alignment and accent 2 a later alignment in contrastive focus, but only the latter of these was found. Nevertheless, the finding that the difference in timing between the accents is larger in contrastive focus means that an earlier timing of accent 1 is likely not necessary. These results, however, have to be taken with caution. Even though an increase in differences between accent 1 and accent 2 for some of these cues is noted, their magnitude is rather small. Whether this difference is sufficient to affect listeners’ perception of the contrast needs to be examined through perception. Perception tests examining accent identification accuracy will be discussed in Chapter 6 (although a direct impact of the magnitude of the differences for these cues will be examined further in the future).
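The comparison of magnitudes in Table 3.28 can be restated as a short Python sketch that takes the differences between the accents as given in that table and checks, measure by measure, whether the difference is larger in contrastive focus. The variable names are hypothetical; the values are those of Table 3.28.

```python
# Difference between the accents (accent 2 vs. accent 1) from Table 3.28,
# stored as (broad focus, contrastive focus).
ACCENT_DIFFERENCES = {
    "F0 minimum (st)": (1.3, 1.14),
    "F0 max alignment (relative)": (0.12, 0.19),
    "F0 min alignment (relative)": (0.35, 0.4),
    "HTP (relative)": (0.41, 0.46),
}

# Measures whose between-accent difference is larger in contrastive focus.
enhanced = [
    measure
    for measure, (broad, contrastive) in ACCENT_DIFFERENCES.items()
    if contrastive > broad
]

print(f"{len(enhanced)} of {len(ACCENT_DIFFERENCES)} measures show a larger "
      f"difference in contrastive focus: {', '.join(enhanced)}")
```

The three timing measures show a larger difference in contrastive focus, while F0 minimum height does not. The sketch says nothing about whether differences of this size are perceptible, which is the question taken up by the perception experiments.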

Chapter 4

Experiment 2: Interaction of Disyllabic Accent Realization with Higher Level Intonation

This experiment examines how the accent contrast is realized in different positions within the accent phrase. Specifically, I compare the accents when they are AP-medial (as in the broad focus condition in Experiment 1, where the target words are followed by two unstressed syllables in the same AP) to when they are AP-final. This will allow for an examination of how higher-level (AP) intonation affects the duration and pitch cues that are used to distinguish the accents. It was hypothesized that the accents would differ in the same ways in AP-medial position as in broad focus (both HL, accent 2 having a later alignment). In AP-final position, it was hypothesized that the AP H% tone occurring on the target word would cause tonal crowding, that is, too many tones competing for space, leading to an earlier alignment of the lexical tones, and a higher F0 minimum, in both accents (Fretheim, 1981, 1982; Pierrehumbert, 2000; Teig, 2001).

4.1 Methods

Participants and procedure were identical to Experiment 1. The recordings for all production experiments were made in the same sitting for each participant. The sentences for all of the different conditions (broad focus, contrastive focus, AP-final) were interspersed.

4.1.1 Materials

The sentences had a similar structure to the broad focus sentences, except that for the AP-final condition, the target word occurred at the right edge of the AP (there were no unstressed syllables following the target word). The word immediately following the target word was focused, to ensure that a new AP began at this point and also that the target word was not focused.

Accent 1, AP-medial:
Det var glimtet i en film, men ikke i et stykke.
(((Det var (¹glimtet-i-en)AP (¹FILM)AP)IP, men itj i et (¹STYKKE)AP)IP)IU
“There was the flash in a film, but not in a play.”

Accent 1, AP-final:
Det var glimtet før, men ikke nå.
(((Det var (¹glimtet)AP (FØR)AP)IP, men itj (NÅ)AP)IP)IU
“There was the flash before, but not now.”

Accent 2, AP-medial:
Det var et minne i en film, men ikke i et stykke.
(((Det var et (²minne-i-en)AP (¹FILM)AP)IP, men itj i et (¹STYKKE)AP)IP)IU
“There was a memory in a film, but not in a play.”

Accent 2, AP-final:
Det var et minne før, men ikke nå.
(((Det var et (²minne)AP (FØR)AP)IP, men itj (NÅ)AP)IP)IU
“There was a memory before, but not now.”

4.1.2 Measurements and Analysis

The measurements and analysis were identical to Experiment 1 (see Figure 3.3), except that for words in AP-final position, the AP H% tone occurred on the target word rather than two syllables after it.

4.2 Results

Tables of raw results can be found in Appendix B. Figure 4.1 shows the full AP (target word and two following unstressed syllables) for the two accents in AP-medial position, based on average measurements, for one speaker. Figure 4.2 shows the two accents in AP-final position for the same speaker. In each figure, the accents are time normalized with respect to one another. From the figures, it can be seen that the two accents are distinct from each other in both positions. Furthermore, in AP-final position, the final vowel in the target word is lengthened (by 8 msec).

Figure 4.1: Stylized average contour of the accents in AP-medial position based on average measurements for one speaker. The darker line is accent 2.

The mixed model multiple linear regression had accent and position as the independent variables. Accent refers to the differences between the two accents. Position refers to the difference between the AP-medial and AP-final productions. The reference level is the AP-medial production.
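As a reading aid for the coefficient tables that follow, the sketch below shows how treatment (reference-level) coding relates an intercept, two simple effects and an interaction to four cell means. It uses the raw F0 maximum means from Table 4.1. Two things are assumptions made only for this illustration: that accent 1 is the reference level for accent (the text states only that AP-medial is the reference level for position), and the function name `treatment_contrasts`. The terms computed here come from raw cell means, so they need not equal the reported coefficients, which were estimated by a mixed model.

```python
# Cell means for F0 maximum in semitones, copied from Table 4.1.
CELL_MEANS = {
    ("accent 1", "AP-medial"): 93.6,
    ("accent 1", "AP-final"): 93.4,
    ("accent 2", "AP-medial"): 93.7,
    ("accent 2", "AP-final"): 93.4,
}


def treatment_contrasts(means, ref_accent="accent 1", ref_position="AP-medial"):
    """Decompose four cell means into treatment-coded terms.

    ref_position follows the text (AP-medial is the reference level).
    ref_accent="accent 1" is an assumption made for this illustration.
    """
    (other_accent,) = {accent for accent, _ in means} - {ref_accent}
    (other_position,) = {position for _, position in means} - {ref_position}

    # Intercept: mean of the reference cell.
    intercept = means[(ref_accent, ref_position)]
    # Simple effect of accent, evaluated at the reference position.
    accent = means[(other_accent, ref_position)] - intercept
    # Simple effect of position, evaluated at the reference accent.
    position = means[(ref_accent, other_position)] - intercept
    # Interaction: what the non-reference cell adds beyond both simple effects.
    interaction = means[(other_accent, other_position)] - (
        intercept + accent + position
    )
    return {
        "intercept": round(intercept, 2),
        "accent": round(accent, 2),
        "position": round(position, 2),
        "interaction": round(interaction, 2),
    }


if __name__ == "__main__":
    for term, value in treatment_contrasts(CELL_MEANS).items():
        print(f"{term:12s}{value:7.2f}")
```

Under this coding, the accent and position terms are simple effects at the reference level of the other factor, which is why a significant interaction is followed up with pairwise tests in the sections below.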

Figure 4.2: Stylized average contour of the accents in AP-final position based on average measurements for one speaker. The darker line is accent 2.

            AP-medial   AP-final
Accent 1    93.6        93.4
Accent 2    93.7        93.4

Table 4.1: Average raw results for F0 maximum (semitones) by accent and position

              Coef.    t-value   p-value
Accent        -0.04    -0.1      p=0.916
Position      0.03     0.3       p=0.767
Interaction   -0.17    -1.1      p=0.261

Table 4.2: Statistical results for F0 maximum examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

F0 Maximum

It was hypothesized that F0 maximum would not differ between the accents or conditions. The average raw results are shown in Table 4.1. The statistical results (Table 4.2) also show no difference between accents or conditions.
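F0 height is reported in semitones throughout these tables. The following is a minimal sketch of the standard Hz-to-semitone conversion, included only to help interpret the magnitudes. The reference frequency is not restated in this chapter, so the 1 Hz reference used here is an assumption, as are the function names.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=1.0):
    """Convert F0 in Hz to semitones relative to ref_hz.

    ref_hz=1.0 is an assumption; this chapter does not restate the reference.
    """
    if f0_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    # One octave (a doubling of frequency) spans 12 semitones.
    return 12.0 * math.log2(f0_hz / ref_hz)


def semitones_to_hz(semitones, ref_hz=1.0):
    """Inverse conversion: semitones relative to ref_hz back to Hz."""
    return ref_hz * 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    # Accent 1, AP-medial F0 maximum from Table 4.1.
    print(round(semitones_to_hz(93.6), 1))
```

Under the 1 Hz assumption, a value of 93.6 semitones corresponds to roughly 223 Hz, and a difference of one semitone corresponds to a frequency ratio of about 1.06.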

F0 Minimum

It was hypothesized that F0 minimum would be higher for accent 2 than accent 1, and that for both accents it would be higher in AP-final position than AP-medial position. The average raw results are shown in Table 4.3. The statistical results (Table 4.4) show main effects of accent and condition, and an interaction. Pairwise tests revealed that F0 minimum was higher for accent 2 than accent 1, as predicted, and that both accents had a higher F0 minimum in AP-final position, also as predicted.

F0 Maximum Alignment

It was hypothesized that F0 maximum alignment would not differ between accents or positions. The average raw results are shown in Table 4.5.

            AP-medial   AP-final
Accent 1    86.8        87
Accent 2    88.1        90

Table 4.3: Average raw results for F0 minimum (semitones) by accent and position

              Coef.    t-value   p-value
Accent        1.2      5.4       p<0.01*
Position      0.3      2.6       p=0.01*
Interaction   0.8      5.2       p<0.001*

Table 4.4: Statistical results for F0 minimum examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

            AP-medial   AP-final
Accent 1    -0.09       0.02
Accent 2    0.03        -0.01

Table 4.5: Average raw results for F0 maximum alignment (relative to word length) by accent and position

              Coef.    t-value   p-value
Accent        0.12     10.6      p<0.001*
Position      0.1      10.4      p<0.001*
Interaction   -0.14    -10       p<0.001*

Table 4.6: Statistical results for F0 maximum alignment examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

The statistical results (Table 4.6) show main effects of accent and condition, and an interaction. Pairwise tests revealed that F0 maximum alignment did not differ between the accents but was earlier for accent 2 in AP-final than in AP-medial position.

            AP-medial   AP-final
Accent 1    0.64        0.63
Accent 2    1           1

Table 4.7: Average raw results for F0 minimum alignment (relative to VC length) by accent and position

              Coef.    t-value   p-value
Accent        0.39     7.3       p<0.001*
Position      -0.01    -1.1      p=0.276
Interaction   -0.06    -3.2      p=0.001*

Table 4.8: Statistical results for F0 minimum alignment examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

F0 Minimum Alignment

It was hypothesized that F0 minimum alignment would be later for accent 2 than accent 1, and that for both accents it would be earlier in AP-final position than AP-medial position. The average raw results are shown in Table 4.7. The statistical results (Table 4.8) show a main effect of accent and an interaction of accent and condition. Pairwise tests revealed that F0 minimum alignment was later for accent 2 than accent 1, as predicted, and that only accent 2 had an earlier F0 minimum in AP-final position.

HTP Alignment

It was hypothesized that HTP alignment would be later for accent 2 than accent 1, and that position would have no effect. The average raw results are shown in Table 4.9. The statistical results (Table 4.10) show a main effect of accent, where accent 2 has a later alignment than accent 1, and no effect of position, as predicted.

            AP-medial   AP-final
Accent 1    0.06        0.06
Accent 2    0.5         0.5

Table 4.9: Average raw results for HTP alignment (relative to VC length) by accent and position

              Coef.    t-value   p-value
Accent        0.44     5.6       p=0.01*
Position      0.003    0.2       p=0.817
Interaction   0.01     0.7       p=0.463

Table 4.10: Statistical results for HTP alignment examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

            AP-medial   AP-final
Accent 1    0.02        0.02
Accent 2    0.018       0.02

Table 4.11: Average raw results for slope of the rise by accent and position

              Coef.    t-value   p-value
Accent        -0.003   -1.7      p=0.14
Position      0.0001   0.2       p=0.88
Interaction   0.001    1         p=0.339

Table 4.12: Statistical results for slope of the rise examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

Slope of the Rise

It was hypothesized that slope of the rise would not be significantly different between accents and conditions. The average raw results are shown in Table 4.11. The statistical results (Table 4.12) show no effects, as predicted.

            AP-medial   AP-final
Accent 1    -0.04       -0.04
Accent 2    -0.03       -0.03

Table 4.13: Average raw results for slope of the fall by accent and position

              Coef.    t-value   p-value
Accent        0.009    6.1       p<0.05*
Position      -0.002   -2.2      p<0.001*
Interaction   0.004    3.2       p<0.01*

Table 4.14: Statistical results for slope of the fall examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

Slope of the Fall

It was hypothesized that slope of the fall would be steeper for accent 2 than accent 1. It was also hypothesized that it would be steeper in AP-final position for both accents, due to an earlier alignment of tones in this position. The average raw results are shown in Table 4.13. The statistical results (Table 4.14) show main effects of accent and position and an interaction. The pairwise results revealed that accent 2 had a steeper slope than accent 1, but there was no effect of position.

AP H% Height

It was hypothesized that AP H% height would not differ between the accents but may be lower in AP-final position. The average raw results are shown in Table 4.15. The statistical results (Table 4.16) showed no difference between the accents and that the AP H% tone was lower in AP-final position.

AP H% Timing

It was hypothesized that AP H% timing would not differ between the accents but may be earlier in AP-final position. The average raw results are shown in Table 4.17. The statistical results (Table 4.18) showed no difference between the accents and that the AP H% tone was earlier in AP-final position.

            AP-medial   AP-final
Accent 1    92.1        90.7
Accent 2    92.3        91.6

Table 4.15: Average raw results for AP H% height (semitones) by accent and position

              Coef.    t-value   p-value
Accent        0.1      0.6       p=0.585
Position      -1.2     -8.6      p<0.001*
Interaction   -0.2     -0.9      p=0.356

Table 4.16: Statistical results for AP H% height examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

            AP-medial   AP-final
Accent 1    30          26
Accent 2    31          16

Table 4.17: Average raw results for AP H% timing (msec) by accent and position

              Coef.    t-value   p-value
Accent        1.2      0.7       p=0.511
Position      -4.3     -2.5      p<0.05*
Interaction   -9.3     -3.6      p<0.001*

Table 4.18: Statistical results for AP H% timing examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

Boundary Slope

It was hypothesized that boundary slope would not differ between the accents but may be steeper in AP-final position. The average raw results are shown in Table 4.19. The statistical results (Table 4.20) showed an effect of accent and position and an interaction. The pairwise results revealed that accent 2 had a steeper boundary slope than accent 1, and that both accents had a steeper boundary slope in AP-final position than in AP-medial position.

            AP-medial   AP-final
Accent 1    0.02        0.04
Accent 2    0.02        0.04

Table 4.19: Average raw results for boundary slope by accent and position

              Coef.    t-value   p-value
Accent        0.005    3.3       p<0.001*
Position      0.02     14.1      p<0.001*
Interaction   -0.01    -4.7      p<0.001*

Table 4.20: Statistical results for boundary slope examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

            AP-medial   AP-final
Accent 1    162         146
Accent 2    170         145

Table 4.21: Average raw results for long stressed vowels (msec) by accent and position

Stressed Vowel Duration

It was hypothesized that there would be no effect of accent or position on vowel duration. The average raw results are shown in Table 4.21. The statistical results (Table 4.22) show that the stressed vowel was shorter in AP-final position.

              Coef.    t-value   p-value
Accent        6.1      0.4       p=0.688
Position      -16.6    -5.4      p<0.001*
Interaction   -7.6     -1.5      p=0.13

Table 4.22: Statistical results for stressed vowel duration examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

            AP-medial   AP-final
Accent 1    63          72
Accent 2    70          82

Table 4.23: Average raw results for unstressed vowels (msec) by accent and position

              Coef.    t-value   p-value
Accent        7.5      1.8       p=0.127
Position      8.5      3.2       p=0.001*
Interaction   3.1      0.8       p=0.397

Table 4.24: Statistical results for unstressed vowel duration examining the effect of accent (1 or 2) and position (AP-medial or AP-final), and their interaction.

Unstressed Vowel Duration

It was hypothesized that accent 2 would have a longer final vowel than accent 1, based on Fintoft (1970), and that there would be no significant effect of position on this vowel. The average raw results are shown in Table 4.23. The statistical results (Table 4.24) show no difference between the accents for this measure, and that this vowel is longer in AP-final position.

In sum, the results showed that, as expected, the cues that differentiate the two accents are consistent with the previous experiment. In AP-medial position, accent 2 had a later alignment and higher F0 minimum than accent 1. Regarding the effect of AP-level intonation on accent realization, the main effects show that the accents in AP-final position had a lower and later AP H% tone, shorter stressed vowel (when phonologically long), and shorter consonant duration (when phonologically short). Both accents also had a higher F0 minimum, steeper boundary slope and longer final vowel in AP-final position. Accent 2 had an earlier F0 minimum alignment and an earlier F0 maximum alignment in AP-final position.

4.3 Discussion

In AP-final position, the higher-level AP tones were affected in that for both accents, the AP H% tone was lower in height, and it was aligned later in the phrase (closer to the end of the AP). The lower and later AP H% and the resulting steeper boundary slope can be attributed to the fewer syllables available to reach the AP H% in this condition compared to the AP-medial position, in which two syllables followed the target word. A similar result was found by Gårding (1993) for West Swedish. That study examined the effect of focus tones on lexical tones. The results revealed that increasing the number of unstressed syllables between the final lexical L tone and the phrase-final H focus tone made the contour between these two points shallower, similar to what was found here for the AP tones. The later AP H% tone here, then, is likely caused by tonal crowding, that is, the lexical tones push it closer to the AP boundary. This is in line with other research on the interaction of lexical tones and intonation, where lexical tones take precedence.

With regard to the main question of the impact of sentence-level intonation on the lexical tonal contrast, the results showed that both accents had a higher F0 minimum (the L after the initial H) in AP-final position. Since the AP is shorter in the AP-final condition, it is likely that there is not enough time to reach as low an F0 as there is in AP-medial position. In AP-medial position, the lexical tones surface on the target word and the AP tone is realized two syllables after the target word. It is likely that in AP-final position there is not enough time to reach the targets, so the L target is undershot. A similar result was reported in Teig (2001), where a one-syllable AP had a shallower drop to the L tone than a two-syllable AP. The results of the current study also revealed that accent 2 words had an earlier F0 maximum and F0 minimum alignment in AP-final position.
This fits with the results for F0 minimum height and the lower and later AP H%, in that the reduced segmental material leads to undershooting of the targets as well as compression of the contour. Compression refers to the tonal contour being realized on a smaller amount of segmental material (Bannert and Bredvad-Jensen, 1975, 1977; Grabe et al., 2000). The compression of the contour due to less segmental material in the AP-final condition likely causes the F0 minimum to be higher in both accents, and F0 maximum and minimum to be aligned earlier in accent 2. Research on other languages has provided evidence of boundary tones affecting the alignment of pitch accent tonal targets, for example in English (Silverman and Pierrehumbert, 1990; Pierrehumbert, 2000), Spanish (Prieto et al., 1995) and Greek (Arvaniti et al., 2006). The fact that alignment only changes for accent 2 words might be due to the fact that accent 1 words already have earlier tonal alignment. Since accent 2 has a later alignment than accent 1, its lexical tones are closer to the AP boundary, so they are more likely to be affected than the tones in accent 1.

The overall effect of position on segments was that both accents had a shorter stressed vowel whenever that vowel was phonologically long, and a longer final (unstressed) vowel in the target word, in AP-final position. The longer final vowel duration is likely an instantiation of phrase-final lengthening (Beckman and Pierrehumbert, 1986; Beckman and Edwards, 1990; Gussenhoven and Bruce, 1999). This suggests that the domain of phrase-final lengthening is the boundary-adjacent vowel, since no other segments in the word are lengthened. The stressed vowel in the target word was shortened in AP-final position, but only for phonologically long vowels. In such words, the consonants are phonologically short. These short consonants are also shortened in AP-final position. In words with a short vowel, neither the vowel nor the consonant was affected. It seems that when the vowel is phonologically short, it does not shorten further in AP-final position. Since the medial geminate consonant in C[VC.C]V words is ambisyllabic, the part of it in the stressed syllable is not considered long, so this might explain why it does not shorten either. It is unclear why the stressed vowel would shorten in AP-final position.
Compensatory shortening of the stressed syllable is an unsatisfactory explanation since the shortening only occurs in words with a long vowel, and not those with a long consonant. The segmental results revealed that syllable structure did not affect phrase-final lengthening, but it did affect lengthening and shortening patterns in the stressed syllable itself. Only words with V:C syllables underwent shortening in AP-final position, and these same words were those that underwent vowel lengthening in contrastive focus (Chapter 3).

In summary, the results revealed that the phrasal intonation had minimal acoustic impact on the lexical tones of accent 1, but affected the alignment of tones in accent 2. Comparing the results of this experiment with the previous one, it was found that contrastive focus affected the phonetic realization of the accents more than higher level intonation. It was argued above that contrastive focus made the accents acoustically more different from one another, that is, it enhanced the lexical tonal contrast. In particular, tonal alignment for accent 2 became later in contrastive focus, but alignment of accent 1 was unchanged, leading to a greater difference between the accents (Chapter 3).

The effect of position in the AP actually made the alignment of the F0 maximum and minimum in accent 2 earlier. This means that there is a smaller difference between the two accents in this measure in AP-final position. For this accent, the lexical tones move away from the AP boundary, but this also means that they are more similar to where the tones lie in an accent 1 word. This indicates that the contrast is diminished in some way; however, a perception test would be necessary to determine whether this impacts identification of the accents. Accent 2 retains a later HTP alignment than accent 1, but nevertheless, the impact of sentential intonation differs markedly from that of pragmatic focus.

Chapter 5

Experiment 3: Monosyllabic Accent Realization in Broad Focus and Contrastive Focus

The first goal of this experiment was to examine how the accent contrast is realized on monosyllabic words. The second goal was to examine how contrastive focus affects contrast implementation. It was hypothesized that in broad (noncontrastive) focus, the circumflex accent would have a higher F0 maximum and a lower and later F0 minimum than the unmarked accent (Dalen, 1985; Almberg, 2001). It was expected that contrastive focus would induce a higher F0 maximum and lower F0 minimum (e.g., Xu, 1999; Scholz, 2012), and a higher and earlier AP H% tone (Koreman et al., 2009; Mixdorff et al., 2010), for both accents. Finally, the results for the effect of contrastive focus on monosyllabic and disyllabic words are compared.

5.1 Methods

Participants, procedure and measurements were identical to the previous experiments. The recordings for all production experiments were made in the same sitting for each participant.

5.1.1 Materials

The target words (shown in Table 5.1) were monosyllabic and had either the unmarked accent or the circumflex accent. Four of the five word pairs were minimal pairs. There were five target words for each accent, each produced three times, giving 15 tokens per accent per speaker per condition, a total of 600 tokens (5 words x 3 repetitions x 2 accents x 10 speakers x 2 conditions).
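The token counts stated above can be checked with a short calculation. This is only an arithmetic sketch of the design as described in the text; the dictionary keys are labels chosen for illustration.

```python
import math

# Design factors as stated in the text.
design = {
    "words": 5,
    "repetitions": 3,
    "accents": 2,
    "speakers": 10,
    "conditions": 2,
}

# Tokens per accent, per speaker, per condition: words x repetitions.
tokens_per_cell = design["words"] * design["repetitions"]

# Total tokens: product over all design factors.
total_tokens = math.prod(design.values())

if __name__ == "__main__":
    print(tokens_per_cell, total_tokens)
```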

The words were controlled for vowel (only the vowel /i/ was included, to control for intrinsic pitch differences), which was surrounded by sonorant consonants, e.g., smil “smile” (unmarked) and smile “to smile” (circumflex) (note that both words are pronounced as monosyllables since the final e in smile is apocopated). The sentences were set up as in Experiment 1, to elicit broad (noncontrastive) focus and contrastive focus readings.

Unmarked accent, broad focus:
Det var et glimt i en film, men ikke i et stykke.
(((Det var et (¹glimt-i-en)AP (¹FILM)AP)IP, men itj i et (¹STYKKE)AP)IP)IU
“There was a flash in a film, but not in a play.”

Unmarked accent, contrastive focus:
Det var et glimt i en film, men ikke en brann.
(((Det var et (¹GLIMT-i-en)AP)IP ((¹film)AP, men itj en (¹BRANN)AP)IP)IU
“There was a flash in a film, but not a fire.”

Circumflex accent, broad focus:
Jeg vil smile i en film, men ikke i et bilde.
(((Jeg vil (¹smile-i-en)AP (¹FILM)AP)IP, men itj i et (¹BILDE)AP)IP)IU
“I want to smile in a film, but not in a photo.”

Circumflex accent, contrastive focus:
Jeg vil smile i en film, men ikke fnise.
(((Jeg vil (¹SMILE-i-en)AP)IP ((¹film)AP, men itj (¹FNISE)AP)IP)IU
“I want to smile in a film, but not frown.”

5.1.2 Measurements and Analysis

The F0 contour of the target words was examined using the landmarks shown in Figure 5.1. From these landmarks, the pitch and duration measurements were derived. The measurements differed from those for the disyllabic words in several ways. First, the monosyllabic words contained no unstressed vowel. Second, as well as measuring F0 maximum in the word, F0 height at vowel onset was also measured. This was done because previous work (Almberg, 2001) indicated that this may be part of the contrast between the accents. Finally, HTP was not measured because, on examining the recordings, it was noted that only one F0 maximum was present, suggesting that HTP and F0 maximum may constitute the same target in the case of monosyllabic words. The measurements for monosyllabic words are presented in Table 5.2.

Unmarked   Gloss        Circumflex   Gloss
lim        glue         lime         to glue
lin        flax/linen   kline        to kiss
smil       smile        smile        to smile
slim       mucus        slime        to cough up
glimt      flash        glimte       gleam

Table 5.1: Monosyllabic target words

Variables                           Calculation                   Units
Pitch contour:
  F0 maximum                                                      semitones
  F0 minimum                                                      semitones
  F0 height at vowel onset                                        semitones
  AP H% height                                                    semitones
Duration:
  Vowel duration                    C3-V1                         msec
  Postvocalic consonant duration    W-C3                          msec
  AP H% timing from AP boundary     AP time - APH time            msec
Calculations:
  Slope of the rise                                               relative
  F0 maximum alignment              (H-V1)dur./Word dur.          relative
  F0 minimum alignment              (L-V1)dur./VC dur.            relative
  Slope of the fall                 (H-L)F0 /(H-L)dur.            relative
  Boundary slope                    (APH-LTP)F0 /(APH-LTP)dur.    relative

Table 5.2: Variables
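The derived measures listed in Table 5.2 can be sketched in code as follows. This is a minimal illustration, not the analysis script used for the dissertation. The landmark names follow Figure 5.1, but the `Landmarks` class, the function name, and the token values are hypothetical. Two denominators are assumptions: word duration is taken as W - C1, and VC duration as W - V1 (vowel plus post-vocalic consonant). Slope of the rise is omitted because no calculation for it is given in Table 5.2.

```python
from dataclasses import dataclass


@dataclass
class Landmarks:
    """Times in msec and F0 in semitones for one token (names follow Figure 5.1)."""
    c1: float        # onset of target word
    v1: float        # vowel onset
    c3: float        # onset of post-vocalic consonant
    w: float         # end of target word
    ap: float        # end of AP
    h_time: float    # time of F0 maximum (H)
    h_f0: float
    l_time: float    # time of F0 minimum (L)
    l_f0: float
    ltp_time: float  # turning point after the F0 minimum (LTP)
    ltp_f0: float
    aph_time: float  # AP boundary tone (APH)
    aph_f0: float


def derived_measures(m):
    """Compute the Table 5.2 measures that have an explicit calculation."""
    word_dur = m.w - m.c1  # assumption: word duration = W - C1
    vc_dur = m.w - m.v1    # assumption: VC duration = vowel + post-vocalic consonant
    return {
        "vowel_duration": m.c3 - m.v1,
        "postvocalic_consonant_duration": m.w - m.c3,
        "aph_timing_from_boundary": m.ap - m.aph_time,
        # Negative alignment values mean the target precedes the vowel onset.
        "f0_max_alignment": (m.h_time - m.v1) / word_dur,
        "f0_min_alignment": (m.l_time - m.v1) / vc_dur,
        # F0 change divided by elapsed time; negative for a fall.
        "slope_of_fall": (m.h_f0 - m.l_f0) / (m.h_time - m.l_time),
        "boundary_slope": (m.aph_f0 - m.ltp_f0) / (m.aph_time - m.ltp_time),
    }


if __name__ == "__main__":
    # Hypothetical token, for illustration only (not data from the study).
    token = Landmarks(c1=0, v1=100, c3=250, w=350, ap=600,
                      h_time=80, h_f0=93.0, l_time=200, l_f0=89.0,
                      ltp_time=300, ltp_f0=89.0, aph_time=560, aph_f0=92.0)
    for name, value in derived_measures(token).items():
        print(f"{name:32s}{value:9.3f}")
```

Expressing alignment relative to word or VC duration, as in Table 5.2, makes the values comparable across tokens of different lengths.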

It was found that 40% of unmarked words did not have a discernible F0 maximum, so for these tokens this measure was omitted. Figure 5.2 indicates where the alignment cues occur on the accents. The statistical analysis was the same as in Experiments 1 and 2.

Figure 5.1: Unmarked monosyllabic F0 contour of the word ‘smil’ showing measurement points. S = beginning of the sentence; B = beginning of F0 rise; C1 = onset of target word; C2 = onset of second consonant (if present); V1 = vowel onset; C3 = onset of post-vocalic consonant; W = end of target word; AP = end of AP; H = F0 maximum; L = F0 minimum; LTP = turning point from F0 minimum; APH = AP boundary tone.

5.2 Results

Tables of raw results can be found in Appendix C. Figures 5.3 and 5.4 show unmarked and circumflex contours for the monosyllabic words in broad and contrastive focus, based on average measurements, for one speaker. The unmarked accent has a relatively level contour, while the circumflex accent has a clear HL contour and therefore a wider pitch range than the unmarked accent.

The mixed model multiple linear regression had accent and focus realization as the independent variables. Accent refers to the difference in any measure between the unmarked and circumflex accents. Focus realization refers to the difference between the broad and contrastive focus productions. The reference level for accent is the unmarked accent, and the reference level for focus realization is broad focus.

Figure 5.2: Trøndersk unmarked & circumflex (darker line) accents, showing F0 at vowel onset, F0 maximum and F0 minimum.

Figure 5.3: One time-normalized F0 contour for unmarked and circumflex (darker line) accents in broad focus.

Figure 5.4: One time-normalized F0 contour for unmarked and circumflex (darker line) accents in contrastive focus.

              Broad   Contrastive
Unmarked      92.6    92.8
Circumflex    93      93.3

Table 5.3: Average raw results for F0 maximum (semitones) by accent and focus realization

F0 Maximum

It was hypothesized that F0 maximum would be higher for the circumflex accent than the unmarked accent and that it would be higher for both accents in contrastive focus than in broad focus. The average raw results are shown in Table 5.3. The statistical results (Table 5.4) show that the circumflex accent had a higher F0 maximum than the unmarked accent, but no effect of focus condition on this measure.

              Coef.    t-value   p-value
Accent        1.2      5.9       p<0.001*
Focus         0.08     0.4       p=0.655
Interaction   0.07     0.3       p=0.762

Table 5.4: Statistical results for F0 maximum examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

              Broad   Contrastive
Unmarked      90.5    90.8
Circumflex    88.9    89.1

Table 5.5: Average raw results for F0 minimum (semitones) by accent and focus realization

              Coef.    t-value   p-value
Accent        -0.5     -2.9      p<0.01*
Focus         -0.19    -1        p=0.3
Interaction   0.32     1.4       p=0.174

Table 5.6: Statistical results for F0 minimum examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

F0 Minimum

It was hypothesized that F0 minimum would be lower for the circumflex accent than the unmarked accent, and that for both accents it would be lower in contrastive focus than in broad focus. The average raw results are shown in Table 5.5. The statistical results (Table 5.6) show that the circumflex accent had a lower F0 minimum than the unmarked accent, as predicted, but no effect of focus condition.

              Broad   Contrastive
Unmarked      90.8    91
Circumflex    91.1    92.2

Table 5.7: Average raw results for F0 at vowel onset (semitones) by accent and focus realization

              Coef.    t-value   p-value
Accent        2.1      7.7       p<0.001*
Focus         0.23     1         p=0.315
Interaction   -0.2     -0.7      p=0.486

Table 5.8: Statistical results for F0 at vowel onset examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

F0 at Vowel Onset

It was hypothesized that F0 at vowel onset would be higher for the circumflex accent than the unmarked accent and that it would be higher for both accents in contrastive focus than in broad focus. The average raw results are shown in Table 5.7. The statistical results (Table 5.8) show that the circumflex accent had a higher F0 at vowel onset than the unmarked accent, but no effect of focus condition on this measure.

F0 Maximum Alignment

It was hypothesized that F0 maximum alignment would be later for the circumflex accent in contrastive focus than in broad focus. The average raw results are shown in Table 5.9. The statistical results (Table 5.10) show an effect of accent and an interaction with focus condition. The pairwise results revealed that the circumflex accent had a later F0 maximum alignment than the unmarked accent, and that the unmarked accent had a later F0 maximum alignment in contrastive focus than in broad focus.

              Broad   Contrastive
Unmarked      -0.37   -0.4
Circumflex    -0.14   -0.14

Table 5.9: Average raw results for F0 maximum alignment (relative to word length) by accent and focus realization

              Coef.    t-value   p-value
Accent        0.2      3.3       p<0.001*
Focus         -0.18    -4.2      p=0.34
Interaction   0.15     2.7       p<0.01*

Table 5.10: Statistical results for F0 maximum alignment examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

              Broad   Contrastive
Unmarked      0.1     -0.005
Circumflex    0.45    0.3

Table 5.11: Average raw results for F0 minimum alignment (relative to VC length) by accent and focus realization

F0 Minimum Alignment

It was hypothesized that F0 minimum alignment would be later for the circumflex accent than the unmarked accent and that it would be earlier for the unmarked accent, and later for the circumflex accent, in contrastive focus than in broad focus. The average raw results are shown in Table 5.11. The statistical results (Table 5.12) show that the circumflex accent had a later F0 minimum alignment than the unmarked accent, as predicted, but that both accents had a later alignment in contrastive focus.

              Coef.    t-value   p-value
Accent        0.34     5.2       p<0.001*
Focus         -0.17    -2.8      p<0.01*
Interaction   0.01     0.3       p=0.748

Table 5.12: Statistical results for F0 minimum alignment examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

              Broad   Contrastive
Unmarked      0.018   0.02
Circumflex    0.017   0.016

Table 5.13: Average raw results for slope of the rise by accent and focus realization

              Coef.    t-value   p-value
Accent        0        0.02      p=0.986
Focus         0.007    2.9       p<0.01*
Interaction   -0.006   -2        p=0.05

Table 5.14: Statistical results for slope of the rise examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

Slope of the Rise

It was hypothesized that slope of the rise would be steeper for the circumflex accent than the unmarked accent and that it would be steeper for both accents in contrastive focus than in broad focus. The average raw results are shown in Table 5.13. The statistical results (Table 5.14) show that both accents had a steeper slope in contrastive focus.

              Broad    Contrastive
Unmarked      -0.018   -0.018
Circumflex    -0.028   -0.03

Table 5.15: Average raw results for slope of the fall by accent and focus realization

              Coef.    t-value   p-value
Accent        -0.01    -3        p=0.01*
Focus         0.001    0.6       p=0.53
Interaction   -0.001   -0.6      p=0.57

Table 5.16: Statistical results for slope of the fall examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

Slope of the Fall

It was hypothesized that slope of the fall would be steeper for the circumflex accent than the unmarked accent and that it would be steeper for both accents in contrastive focus than in broad focus. The average raw results are shown in Table 5.15. The statistical results (Table 5.16) show that the circumflex accent in fact had a shallower slope than the unmarked accent and no effect of focus condition.

AP H% Height

It was hypothesized that there would be no difference in AP H% height between the accents and that both accents would have a higher result for this measure in contrastive focus than in broad focus. The average raw results are shown in Table 5.17. The statistical results (Table 5.18) show an effect of accent and condition, and an interaction. The pairwise results revealed that only the circumflex accent had a higher result for this measure in contrastive focus than in broad focus.

              Broad     Contrastive
Unmarked      93.7      94.3
Circumflex    91.9      94.8

Table 5.17: Average raw results for AP H% height (semitones) by accent and focus realization

              Coef.     t-value   p-value
Accent        -1        -2.6      p<0.05*
Focus         0.7       2.3       p<0.05*
Interaction   1.9       4.9       p<0.001*

Table 5.18: Statistical results for AP H% height examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

              Broad     Contrastive
Unmarked      32        298
Circumflex    39        102

Table 5.19: Average raw results for AP H% timing (msec) by accent and focus realization

AP H% Timing
It was hypothesized that there would be no difference in AP H% timing between the accents and that both accents would have an earlier result for this measure in contrastive focus than in broad focus. The average raw results are shown in Table 5.19. The statistical results (Table 5.20) show that both accents had an earlier timing in contrastive focus than in broad focus.

              Coef.     t-value   p-value
Accent        6.4       1.2       p=0.246
Focus         20.4      2.3       p<0.05*
Interaction   -11.5     -1.2      p=0.226

Table 5.20: Statistical results for AP H% timing examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

Boundary Slope
It was hypothesized that there would be no difference in boundary slope between the accents and that both accents would have a steeper slope in contrastive focus than in broad focus. The average raw results are shown in Table 5.21. The statistical results (Table 5.22) show that both accents had a steeper boundary slope in contrastive focus than in broad focus.

              Broad     Contrastive
Unmarked      0.01      0.02
Circumflex    0.02      0.03

Table 5.21: Average raw results for boundary slope by accent and focus realization

              Coef.     t-value   p-value
Accent        0.02      0.9       p=0.39
Focus         0.002     6.7       p<0.001*
Interaction   -0.002    0.7       p<0.475

Table 5.22: Statistical results for boundary slope examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

Stressed Vowel Duration
It was hypothesized that the circumflex accent would have a longer vowel than the unmarked accent, based on previous work (Almberg, 2001; Kristoffersen, 2011), and that both accents would have a longer vowel in contrastive focus than in broad focus. The average raw results are shown in Table 5.23. The statistical results (Table 5.24) show no difference between the accents but that both accents had a longer vowel in contrastive focus than in broad focus.

              Broad     Contrastive
Unmarked      152       181
Circumflex    167       197

Table 5.23: Average raw results for long stressed vowels (msec) by accent and focus realization

              Coef.     t-value   p-value
Accent        0.26      0.04      p=0.972
Focus         32.3      5         p<0.001*
Interaction   10.5      1.3       p=0.194

Table 5.24: Statistical results for stressed vowel duration examining the effect of accent (unmarked or circumflex) and focus realization (broad or contrastive), and their interaction.

In summary, the results showed that the circumflex accent had a higher F0 at vowel onset and F0 maximum, lower F0 minimum, later alignment of both F0 maximum and minimum tonal targets and shallower slope between these targets than the unmarked accent. Contrastive focus induced a later F0 minimum alignment, steeper boundary slope and slope of the rise, and earlier AP H% timing for both accents. Furthermore, the circumflex accent had a higher AP H% tone in contrastive focus than in broad focus. The unmarked accent had a later F0 maximum alignment in contrastive focus than in broad focus. In contrast to the findings for disyllabic words, including vowel length as an independent variable did not improve the model for any alignment measure. This was determined by comparing a model containing accent and focus realization as independent variables with one containing accent, focus realization and vowel length, using a likelihood ratio test.
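The model comparison described above can be illustrated with a minimal sketch of a likelihood ratio test. It is not the analysis code used in this work: the log-likelihood values are hypothetical, the function name is invented for illustration, and the sketch assumes that the two nested models differ by exactly one parameter (one degree of freedom), which may not match the models actually fitted.

```python
import math


def likelihood_ratio_test_1df(loglik_reduced, loglik_full):
    """Compare two nested models that differ by one parameter.

    Returns the test statistic and its p-value under a chi-square
    distribution with 1 degree of freedom (an assumption of this sketch).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods: model with accent and focus only,
# versus a model that also contains vowel length.
statistic, p_value = likelihood_ratio_test_1df(-512.4, -511.9)
print(f"chi-square(1) = {statistic:.2f}, p = {p_value:.3f}")
```

A large p-value in such a comparison means that the extra predictor does not improve the fit enough to justify keeping it, which is the sense in which vowel length "did not improve the model".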

Examining the F0 maximum, the results revealed, as noted above, that 40% of unmarked words did not have a discernible H. Of these, in 53% the H occurred before word onset. 60% of circumflex words had their F0 maximum before word onset. The rest of the unmarked words and the circumflex words had the F0 maximum after vowel onset. The timing of this high tone, regardless of whether it occurred before or after word onset, was compared across the accents. This was aligned significantly later for the circumflex accent compared to the unmarked accent. The F0 height was significantly higher for the circumflex accent than the unmarked accent.

5.3 Discussion

Similar to Almberg (2001), pitch range and pitch contour alignment were found to differentiate the unmarked and circumflex accents. The circumflex accent had a later alignment of both F0 maximum and F0 minimum and a wider pitch range. The presence of a tonal contrast in monosyllabic words is not found in many varieties of Norwegian and Swedish (Elstad, 1978, 1982; Kristoffersen, 1992; Bruce and Hermans, 1999).

In terms of the shapes of each accent, the unmarked accent has an early F0 minimum or L tone, and gradually rises from there to the AP H% tone. The circumflex accent has a HL contour, which is the same contour found for both of the disyllabic accents.

The results for F0 maximum alignment and height indicate that the unmarked accent does not have a H tone. In instances where the H tone is discernible, it is much lower compared to the circumflex accent. The results suggest that the circumflex accent is not a later-aligned version of the unmarked accent. These data suggest that the circumflex accent has HL tonal targets, while the unmarked accent only has an L tonal target. This is in line with Dalen (1985) who described the unmarked accent as just L, but in contrast with Almberg (2001) who states that the unmarked and circumflex accents may have the same HL contour.

One surprising result in light of previous analyses of the monosyllabic contrast (Christiansen, 1947; Killingbergtrø, 1969; Almberg, 2001; Kristoffersen, 2011) was that the circumflex accent did not have a longer vowel than the unmarked accent. This conflicts with the phonological analysis by Kristoffersen (2011) on the Oppdal dialect of Trøndersk. For that variety, the circumflex accent (HL contour) was analyzed as surfacing on a trimoraic vowel. It may be that the Oppdal variety is different from other varieties within Trøndersk, as Kristoffersen (2011) suggests.

With regard to the expression of pragmatic focus, the results revealed that contrastive focus affected F0 minimum alignment, boundary slope, slope of the rise and AP H% timing for both accents. Compared to broad focus, the circumflex accent had a higher AP H% tone and the unmarked accent a later F0 maximum alignment. The expected higher AP H% tone was only found for circumflex words. The higher AP H% tone is in line with the description of contrastive focus in East Norwegian, whereby another H tone is added to the AP H% tone, increasing its height (Nilsen, 1989; Fretheim, 1991; Kristoffersen, 2000). The reasons for this only occurring in the circumflex accent are unclear.

Contrastive focus did not in fact affect the pitch range for the monosyllabic accents. Instead, alignment was affected, with both accents having a later F0 minimum alignment in contrastive focus, and the unmarked accent also having a later F0 maximum alignment (when this occurred). In contrast, for disyllabic words, contrastive focus induced a wider pitch range for both accents and accent 2 also had a later F0 maximum alignment. The results for disyllabic words showed that the accent contrast was enhanced through exaggerated F0 cues in contrastive focus. In the monosyllabic contrast, the pairwise comparison revealed that in contrastive focus, the accents differed from one another in a subset of the measures that differentiated them in broad focus.

In contrastive focus, the accents differed significantly in F0 maximum height and alignment and in slope from H to L, but not in F0 minimum height or alignment, or vowel onset. However, for the measures that differed between the accents in both focus conditions, the magnitude of the difference between them was larger in the contrastive focus condition, similar to the results for disyllabic words (Chapter 3). Table 5.25 shows the average differences between the accents in each focus condition. It can be seen that for each measure, the difference between the unmarked accent and the circumflex accent is greater in contrastive focus than in broad focus. The monosyllabic accent contrast is therefore enhanced, although only for a subset of cues. Perception tests need to be conducted to determine whether these focus modifications lead to improved accent identification.

Measure                        Broad     Contrastive
F0 Maximum (st)                0.5       0.78
F0 Max alignment (relative)    0.21      0.56
Slope                          0.0104    0.012

Table 5.25: Magnitude of differences between the monosyllabic accents in broad and contrastive focus.

Focus also affected segment durations, with phonologically long vowels lengthened and phonologically short consonants shortened. This was similar to the findings for disyllabic words (Chapter 3). Previous work on Swedish has also shown that phonologically long segments are lengthened further under contrastive focus (Bruce, 1977; Bannert, 1979; Bruce, 1981; Heldner and Strangert, 2001). Phonologically short vowels were also lengthened in contrastive focus, but only in circumflex words. Short vowels and long consonants were not lengthened in contrastive focus in words with the unmarked accent (note that the number of VC: tokens was much smaller than V:C words - only one fifth of target words were VC:).

In addition to the finding that contrastive focus modified some F0 cues, acoustic analyses revealed that some acoustic characteristics remained unchanged. F0 minimum height and alignment and vowel onset height differed between the accents but were not affected by focus. Although a perception experiment would be necessary to determine which characteristics are necessary for each accent to be perceived, these findings suggest that the circumflex accent has to maintain a stable F0 minimum to be distinguished from the unmarked accent.

Chapter 6

Experiment 4: Perception of Disyllabic Accents

In the current investigation, we focus on the acoustic correlates that were found to be varying for the two accents in the production study. The main findings were that both accents had a HL contour, where accent 2 had a higher F0 minimum and later alignment of F0 minimum, F0 maximum and HTP than accent 1. Based on these findings, the manipulations involved in this perception experiment include the cues that will allow us to examine features of the contrast in more detail. Specifically, the manipulations involve the height of F0 minimum, alignment of F0 maximum and minimum, and alignment of the turning point from F0 maximum to F0 minimum. For ease of reading, a schematic representation of the two accents from Chapter 3 is shown again below (Figure 6.1). In order to investigate whether listeners use these cues in perception to allow them to make accentual distinctions, these cues are systematically manipulated. The effect of these manipulations on accent identification is examined. Based on the production results, it is hypothesized that the manipulated stimuli with a lower and earlier F0 minimum and earlier turning point will be identified as accent 1. Stimuli with a higher and later F0 minimum and a later turning point will be identified more often as accent 2.

When both are changed (F0 minimum height and alignment) it is hypothesized that the alignment of the F0 minimum will be more important than its height. If found to be correct, this will indicate the importance of the F0 alignment dimension over the F0 range/height dimension in the accentual distinction. Overall, this categorization experiment will allow me to determine the perceptually salient aspects of the accent 1 and accent 2 lexical tonal accents. This will further allow me to draw a link between production and perception.

6.1 Methods

6.1.1 Materials

One token of an accent 1 word (linet /ˈliːnə/) produced by a female native speaker of Trøndersk, aged 20, excised from a broad focus position was chosen as the template for the cue manipulations. This token was chosen as it was a natural production typical of accent 1 in broad focus. Based on the measurements from the production study, words with stylized pitch contours were created. These manipulations were done using the PSOLA program in Praat (Boersma and Weenink, 2011). Four cues (as seen in Figure 6.1) were manipulated: (1) F0 maximum alignment, (2) high turning point (HTP) alignment, (3) F0 minimum height and (4) F0 minimum alignment. F0 maximum alignment is the alignment of the initial F0 maximum in relation to stressed vowel onset. HTP alignment is the alignment of the drop from the F0 maximum in relation to stressed vowel onset. F0 minimum height is the height of the lowest point in the contour after the initial F0 maximum. F0 minimum alignment is the alignment of the F0 minimum in relation to stressed vowel onset.

These cues were chosen because the differences found between the accents in the production study were in F0 alignment and the height of the F0 minimum. The end points for each cue were determined based on the actual productions of the accent 1 and accent 2 versions of the minimal pair (in milliseconds or Hertz). Table 6.1 shows the incremental changes for each acoustic cue. The range for each cue was then divided into five or six equal steps, continuing for one further step beyond each endpoint. This was done to examine if listeners would perceive more extreme values as belonging to the same accent category. Also, the extreme endpoints for F0 maximum alignment and HTP were similar to narrow focus productions of these cues. The cue conditions have different numbers of steps to avoid having too large jumps between steps.

F0 maximum alignment and HTP were each manipulated in six equal steps. The alignment of the HTP was manipulated in steps of 19 milliseconds each (Figure 6.3). For these stimuli, F0 minimum alignment was kept constant. Each of the six F0 maximum alignment steps was 22 milliseconds. For the manipulations of F0 maximum alignment, F0 minimum alignment was manipulated in conjunction so as to keep the slope between them constant (Figure 6.2).¹

Figure 6.1: Accent 1 (solid) & accent 2 (dashed), showing F0 maximum, F0 minimum and high turning point.

                              Steps
Measure                       1      2      3      4      5      6
F0 Maximum Timing (msec)      -76    -54    -32    -10    12     34
HTP (msec)                    11     30     48     67     86     105
F0 Minimum (Hz)               196    204    212    220    228
F0 Minimum Timing (msec)      170    190    210    230    250

Table 6.1: The F0 (Hz) and duration (msec from vowel onset) at each step of the manipulated stimuli. The bold numbers represent the original accent 1 (earlier steps) and accent 2 (later steps) values.

Figure 6.2: Six steps for F0 maximum alignment & F0 minimum alignment

Those characteristics that were not systematically manipulated as part of the experiment were set at midpoints between the two accents (based on average values), so that they would not influence listeners’ judgments towards either of the accents. These were as follows: the final vowel duration was set to 50 msec and the stressed vowel to 170 msec, and F0 at vowel onset was set to 97 semitones (264 Hz).

F0 minimum and its alignment with vowel onset were changed in five equal steps. Each step for F0 minimum alignment was 20 milliseconds. For F0 minimum height, each of the five steps equaled 8 Hz. Each of the 5 F0 minimum height steps was combined with each of the 5 F0 minimum alignment steps, resulting in 25 stimuli. Schematic representations of two sets of these manipulations - those with the lowest and second highest (for the purposes of clarity in the figure) levels of F0 minimum - are shown in Figure 6.4.

In order to check whether the created stimuli sounded natural, two Norwegian speakers (one a linguist, one not) who did not participate in the experiment provided written comments. They confirmed that the stimuli sounded natural.
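The step values in Table 6.1 and the 25 combined F0 minimum stimuli can be reproduced with a short sketch. This is an illustration only, not the Praat procedure used to build the stimuli: the function and variable names are invented here, the HTP values are copied from Table 6.1 rather than generated (they are rounded, so the spacing is not exactly 19 msec), and the semitone conversion assumes a reference of 1 Hz, which the text does not state explicitly.

```python
import math
from itertools import product


def continuum(start, step, n_steps):
    """Return n_steps equally spaced values beginning at start."""
    return [start + i * step for i in range(n_steps)]


# Equal-step continua, using the first value and step size given in the text.
f0_max_alignment_ms = continuum(-76, 22, 6)   # msec from vowel onset
f0_min_height_hz = continuum(196, 8, 5)       # Hz
f0_min_alignment_ms = continuum(170, 20, 5)   # msec from vowel onset

# HTP values copied from Table 6.1 (rounded, roughly 19 msec apart).
htp_alignment_ms = [11, 30, 48, 67, 86, 105]

# Every F0 minimum height is paired with every F0 minimum alignment.
f0_min_grid = list(product(f0_min_height_hz, f0_min_alignment_ms))


def hz_to_semitones(hz, reference_hz=1.0):
    """Convert Hz to semitones relative to reference_hz (assumed 1 Hz)."""
    return 12.0 * math.log2(hz / reference_hz)


print(f0_max_alignment_ms)
print(len(f0_min_grid), "combined F0 minimum stimuli")
print(round(hz_to_semitones(264), 1), "semitones at vowel onset")
```

Under the 1 Hz reference assumption, 264 Hz comes out at roughly 96.5 semitones, which is consistent with the rounded value of 97 semitones reported above.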

¹ These conditions each include manipulations of two measures because it is not possible to manipulate F0 minimum on its own: manipulating it means either changing the slope or changing maximum alignment along with it. The two conditions thus ensure that in each case, one of these is kept constant.

Figure 6.3: Six steps for manipulations of the high turning point (HTP)

Figure 6.4: Five steps for F0 minimum alignment, shown at two levels of F0 minimum height. (The first level of alignment is marked at both levels of height, for clarity.)

6.1.2 Listeners

The participants were 28 native speakers (19 females, 9 males) of the Trøndersk dialect, aged 18-40 (Table 6.2). They were recruited by posters and fliers around campus. Participants had to have grown up speaking the Trøndersk dialect. They filled out a language background questionnaire. Four subjects were disregarded because it turned out that Trøndersk was not their native dialect. This gave 24 subjects (16 females, 8 males) left for analysis.

Listener   Sex   Age range   Hometown
01         M     18-24       Trondheim
03         F     25-30       Oppdal
04         M     31-35       Trondheim
05         M     18-24       Trondheim
06         F     18-24       Trondheim
07         F     18-24       Trondheim
08         M     25-30       Trondheim
09         F     18-24       Trondheim
10         F     18-24       Levanger
11         F     25-30       Agdenes
13         M     18-24       Sunndal
14         F     18-24       Namsos
15         F     18-24       Nord-Trøndelag
16         F     25-30       Trondheim
17         F     25-30       Stjordal
19         M     25-30       Stjordal
20         M     18-24       Steinkjer
21         M     18-24       Trondheim
22         F     25-30       Stjordal
23         F     36-40       Helhus
25         F     25-30       Trondheim
26         F     36-40       Tingvoll
27         F     36-40       Rennebu
28         F     25-30       Trondheim

Table 6.2: Listener details

6.1.3 Procedure

Participants were seated at a desktop computer and wore headphones, in which sound was played to both ears. The stimuli were presented in the PsychoPy program (Peirce, 2007). Experiments took place in the sound laboratory at NTNU. The words linet (accent 1) and Line (accent 2) appeared on the screen. The order of these was switched for half of the participants. They heard one word at a time and were asked to choose which of two words on-screen they had heard, in a forced-choice task. The response for one trial initiated the following trial. There was a blink of the screen to indicate that a new trial had begun. If there was no response within 1.5 seconds after the target word, the next trial began automatically.

There were 37 manipulations in total ((5 F0 minimum x 5 F0 minimum alignment) + 6 F0 maximum alignment + 6 HTP). Each stimulus was presented 10 times, for a total of 370 trials. The 37 unique stimuli were divided into two blocks of 18 and 19 stimuli respectively, so as not to have any block too long. This means that all the stimuli were heard, without repetition, every two blocks. Within each block, the order of stimuli was randomized. There were 20 blocks in total. A pause screen appeared after each block, and participants pressed a key when they were ready to continue to the next block. There were three practice trials. The total time for all the perception tasks was 40 minutes per participant and they were paid 110 NOK (approx. US$20).
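The trial structure described above can be sketched as follows. This is a hypothetical reconstruction, not the PsychoPy script used in the experiment: the stimulus labels and function names are invented, and the sketch assumes that the split into blocks of 18 and 19 stimuli was drawn afresh for each pass through the stimulus set, which the description does not specify.

```python
import random


def make_stimulus_labels():
    """Build labels for the 37 unique stimuli (labels are illustrative)."""
    labels = [f"f0min_h{h}_a{a}" for h in range(1, 6) for a in range(1, 6)]
    labels += [f"f0max_{step}" for step in range(1, 7)]
    labels += [f"htp_{step}" for step in range(1, 7)]
    return labels


def build_blocks(stimuli, repetitions=10, first_block_size=18, seed=None):
    """Return a list of blocks; each consecutive pair covers all stimuli once."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(repetitions):
        shuffled = list(stimuli)
        rng.shuffle(shuffled)  # random order, and a random split (assumption)
        blocks.append(shuffled[:first_block_size])
        blocks.append(shuffled[first_block_size:])
    return blocks


stimuli = make_stimulus_labels()
blocks = build_blocks(stimuli, seed=1)
print(len(stimuli), "unique stimuli")
print(len(blocks), "blocks,", sum(len(b) for b in blocks), "trials in total")
```

Shuffling the full set before splitting it gives each pass a random order within blocks while guaranteeing that every stimulus is heard exactly once per pair of blocks.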

6.1.4 Analysis

Responses were analyzed by comparing how each manipulation affected identification of the accents, following Chang (2013). For each stimulus repetition, the number of accent 1 and accent 2 responses for each manipulation was obtained. In order to determine whether any of the four manipulated cues significantly affected identification of an accent, the responses were subjected to a logistic regression analysis with the dependent variable being the response (accent 1 or 2) and the independent variables being each of the manipulations (F0 minimum height and alignment, F0 maximum alignment, HTP), using the glmer function in R (R Development Core Team, 2008). Accent 1 was coded as 0 and accent 2 as 1. Each manipulation was coded 1 (lowest F0 height or earliest alignment) to 5 or 6 (highest F0 height or latest alignment). Listener and block were included as random effects.

Figure 6.5: Percent Accent 2 responses for all listeners when F0 Maximum alignment was manipulated. The x-axis shows the alignment step, with 1 being the earliest (accent 1-like) and 6 the latest (accent 2-like).

6.2 Results

Figures 6.5 - 6.7 show the identification results for each manipulation across all listeners. In all graphs, it can be seen that tokens with a later alignment (to the right of the graphs) and higher F0 minimum level are perceived as accent 2. It can also be seen that the responses from accent 1 to accent 2 change gradually rather than in a more categorical manner. Figures 6.5 and 6.6 show gradually more accent 2 responses from left to right (early to later alignment). Figure 6.7 shows gradually more accent 2 responses from left to right within each graph and across all panels from left to right, corresponding to more accent 2 responses as F0 minimum gets higher and later.

Figure 6.6: Percent Accent 2 responses for all listeners when HTP was manipulated. The x-axis shows alignment steps, with 1 being the earliest (accent 1-like) and 6 the latest (accent 2-like).

Figure 6.7: Percent Accent 2 responses for all listeners when F0 Minimum alignment was manipulated. The x-axis shows the manipulations of F0 minimum alignment relative to vowel onset, 1 being the earliest and 5 being the latest. Each panel shows a different F0 minimum height level.

A stepwise logistic regression test was used to determine how well the response (accent 1 or 2) could be predicted from each of the manipulations. (Recall that accent 1 was coded as 0 and accent 2 as 1. Each manipulation was coded 1 (lowest F0 height or earliest alignment) to 5 or 6 (highest F0 height or latest alignment).) All parameters significantly affected the responses, as shown in Table 6.3. Each of the manipulations significantly impacted accent identification, in the expected directions: those with a higher or later F0 minimum, later HTP or later F0 maximum were identified more often as accent 2. There was also a significant interaction between F0 minimum height and its alignment.

Condition                                    Coef.    p-value
F0 maximum alignment                         0.5      p<0.001*
HTP                                          0.47     p<0.001*
F0 min. height                               0.3      p<0.001*
F0 min. alignment                            0.26     p<0.001*
Interaction (F0 min. height & alignment)     -0.05    p<0.01*

Table 6.3: Logistic regression results
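As an aid to reading Table 6.3, the coefficients can be converted to odds ratios. In a logistic regression each coefficient is the change in the log-odds of an accent 2 response per one-step increase in the cue, so exponentiating it gives the factor by which the odds are multiplied. The sketch below only re-expresses the reported coefficients; it does not refit the model, and the dictionary keys are simply the row labels of the table.

```python
import math

# Coefficients copied from Table 6.3 (log-odds of an accent 2 response per step).
coefficients = {
    "F0 maximum alignment": 0.5,
    "HTP": 0.47,
    "F0 min. height": 0.3,
    "F0 min. alignment": 0.26,
    "Interaction (F0 min. height & alignment)": -0.05,
}

# exp(coefficient) is the multiplicative change in the odds per step.
odds_ratios = {name: math.exp(coef) for name, coef in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds of accent 2 multiplied by {ratio:.2f} per step")
```

Because the model contains an interaction term, the odds ratios for F0 minimum height and alignment are conditional on the value of the other cue and should not be read as effects that hold uniformly across all steps.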

To examine the interaction further, the results for F0 minimum height and its alignment were examined separately, using a logistic regression. The effect of F0 minimum height was examined separately for each level of alignment, and the effect of F0 minimum alignment was examined separately for each level of height (as in Figure 6.7). The results showed that the effect of height was significant at all levels of alignment (p<0.001 at each alignment), and the effect of alignment was significant at all levels of height (p<0.01 at each height level). These results suggest that these two factors - how high the F0 minimum was and how early/late it was aligned with the segmental string - in conjunction contributed to accent identification. That is, the later and higher the F0 minimum was, the more likely the listeners were to identify accent 2. Table 6.4 lists the steps in the acoustic cue continuum at which the shift in accent identification response from accent 1 to accent 2 occurs (more than 50% accent 2 responses).

Condition                     Crossover point is between steps:
HTP alignment                 3 and 4
F0 max. alignment             4 and 5
F0 min. alignment step 1      F0 min. height steps 4 and 5
F0 min. alignment step 2      F0 min. height steps 4 and 5
F0 min. alignment step 3      F0 min. height steps 3 and 4
F0 min. alignment step 4      F0 min. height steps 2 and 3
F0 min. alignment step 5      all accent 2
F0 min. height step 1         F0 min. alignment steps 4 and 5
F0 min. height step 2         F0 min. alignment steps 4 and 5
F0 min. height step 3         F0 min. alignment steps 3 and 4
F0 min. height step 4         F0 min. alignment steps 2 and 3
F0 min. height step 5         all accent 2

Table 6.4: Majority response crossover points for each condition.
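The notion of a majority response crossover point can be made concrete with a small sketch. The percentages below are hypothetical and are not the experimental results; the function name is likewise invented. The sketch finds the first pair of adjacent steps between which the proportion of accent 2 responses moves from at or below 50% to above 50%.

```python
from typing import Optional, Sequence, Tuple


def crossover_steps(
    percent_accent2: Sequence[float], threshold: float = 50.0
) -> Optional[Tuple[int, int]]:
    """Return the 1-indexed pair of adjacent steps that straddles threshold.

    Returns None when no crossover exists, for example when the majority
    response is accent 2 at every step.
    """
    for i in range(len(percent_accent2) - 1):
        if percent_accent2[i] <= threshold < percent_accent2[i + 1]:
            return (i + 1, i + 2)
    return None


# Hypothetical percent accent 2 responses for a six-step continuum.
example = [20.0, 35.0, 45.0, 62.0, 75.0, 88.0]
print(crossover_steps(example))
```

A continuum for which the function returns None and whose first step is already above the threshold corresponds to the rows marked "all accent 2" in Table 6.4.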

With regard to the interaction between F0 minimum alignment and F0 minimum height, for the earliest step (step 1) in alignment, the responses change from accent 1 to accent 2 between F0 minimum height steps 4 and 5. When F0 minimum alignment is latest (step 5), majority responses are accent 2. Likewise, when F0 minimum height is lowest (step 1) and F0 minimum alignment is manipulated, the majority of the responses change from accent 1 to accent 2 between F0 minimum alignment steps 3 and 4.

The results for F0 minimum and its alignment provide an insight into the features on which listeners may base their accent judgments. The lowest F0 minimum values elicit a majority of accent 1 responses regardless of how late the alignment is. This can be seen in Figure 6.7. Only at height level 3 when the alignment is late (stimuli 4 and 5, 80% of the way into the following consonant, or 95% of the way into the VC unit, or later), do listeners give 50% accent 2 responses. By height levels 4 and 5, we get consistently more accent 2 responses (at height level 4, alignment step 1 has almost 50% accent 2 responses, and this increases as alignment gets later for both height levels 4 and 5). Likewise, when alignment is early (steps 1 and 2 - the end of the vowel or 70-78% into the VC unit), there is a tendency towards accent 1 responses. Only when alignment gets later and height increases are there more accent 2 responses. Figure 6.4 shows that when the height of the F0 minimum is at a high level (step 4), the fall from H to L is very shallow. When height is at its lowest level (step 1), the fall is much steeper. These results are in accord with the production study where the F0 minimum of accent 2 was significantly higher than that of accent 1. At the highest level of F0 minimum, listeners could be perceiving a high tone, especially as alignment gets later, whereas when it is at its lowest level, no high tone is perceived, regardless of how late the alignment becomes.

The hypotheses were thus supported in that a higher and later F0 minimum were perceived as accent 2, but it was in fact a combination of a higher and later F0 minimum that induced more accent 2 responses, rather than either of these cues alone. Examining the alignment of the high turning point in the contour (HTP) (Figure 6.3) shows that accent 2 responses increase as the HTP is delayed, that is, aligned later (40% through the vowel, step 4) (Figure 6.6).

This is the case even though the F0 minimum alignment is not later. Thus if the initial drop from H to L gets late enough in the vowel the perception of accent 2 is induced. This shift occurs regardless of the F0 minimum alignment. Figure 6.3 shows that making the HTP later creates a high plateau across most of the stressed vowel which induces accent 2 identification as well. It seems that listeners do not use the H tone that is needed for accent 2 identification unless it is aligned late enough into the stressed vowel. In these responses, this occurred around stimulus 3 or 4, where the HTP is 48 msec (28%) into the vowel. This is in line with findings on the perception of F0 changes at various points in the segmental string (House, 1990, 1996, 2004). During the transition from a consonant to a following vowel there is a high level of spectral change, which decreases listeners’ sensitivity to F0 changes at these points. House showed that for the F0 to be perceived as a high tone, the high F0 has to occur past the point of spectral change, and into the middle of the vowel. If a tone is to be perceived as falling, the fall needs to start 30-50 ms after vowel onset. It is the F0 during the middle of the vowel that determines what the listener perceives. If this is the case, the initial H is only perceived in accent 2; in accent 1 it does not occur far enough into the vowel to be a salient cue for perception.

Finally, we turn to the F0 maximum alignment series. Once again, the later the F0 maximum is aligned, the more accent 2 responses there are. The alignment of the HTP in the HTP manipulation steps 1 and 2 is comparable to the timing of the HTP in the F0 maximum alignment steps 5 and 6 (Figure 6.2). However, comparing responses, F0 maximum alignment steps 5 and 6 have many more accent 2 responses (50-70%) than HTP stimuli 1 and 2 (30-40%). The main difference between these stimuli is that in these F0 maximum alignment stimuli, the F0 minimum is also later, past the end of the vowel. It appears that although neither F0 maximum alignment nor F0 minimum alignment alone can change the perception of the accents, when the whole contour is later, accent 2 is most likely to be identified.

This finding - that a later alignment of more than one F0 landmark increases accent 2 responses more than a later alignment of just one F0 landmark - provides an insight into the difference in responses between F0 minimum alignment steps 1 and 2 (Figure 6.7) and F0 maximum alignment steps 5 and 6 (Figure 6.5, with F0 maximum 7% and 20%, respectively, into the vowel). Even though these stimuli have comparable F0 minimum alignment, the later F0 maximum alignment shifts majority responses to accent 2. The different responses suggest again that F0 minimum alignment alone is not sufficient to induce a change in responses. F0 minimum alignment can be used by listeners to aid in accent identification but it seems to be salient only in combination with another cue, the timing of F0 maximum. Combined, these responses suggest that the initial H (as in HTP stimuli 4-6) is a more salient cue compared to the F0 minimum alignment cue, further supporting the notion that the initial H is only perceptible in accent 2. This is discussed further in the following section. Listeners can perceive this initial high tone when the slope is shallow as in F0 minimum height level 5 or timing level 5, or when there is a high plateau (as in HTP stimuli 4-6). Accent 2, then, is perceived when there is an initial high tone, or a later contour (as in maximum alignment stimuli 5 and 6). Overall, the cue interactions suggest that accent 1 is characterized by an early F0 fall (a fall that begins at or before vowel onset), and furthermore, that the F0 fall has to be steep enough and/or the F0 minimum level has to be low enough for it to be perceived as a fall.

6.3 Discussion

The aim of the perception experiment conducted here was to examine which acoustic cues, or combination of these, were perceptually salient for accent 1 and accent 2 identification. Specifically, the study sought to determine whether changes in the acoustic cues of F0 minimum and F0 maximum alignment, F0 minimum height, and high turning point (HTP) alignment contribute to the shifts in accent identification. The production study found that in broad focus, accent 2 has a higher and later F0 minimum, later F0 maximum and later HTP compared to accent 1. Based on these findings, the perception experiment here manipulated the F0 alignment and height cues in order to explore whether the cues that were consistently produced as different in the two accents were also used in their perception. The perception results indicate that the most salient cue for accent 1 identification is an early F0 fall. An initial salient F0 maximum led to a majority of accent 2 identifications. A higher F0 minimum and later F0 minimum alignment, or a later alignment of HTP or F0 maximum, all led to more accent 2 responses.

Although each cue impacted accent responses, a later F0 minimum alignment alone did not consistently increase the likelihood of accent 2 responses. While at every level of height there was a significant effect of F0 minimum alignment on responses, there was still a majority of accent 1 responses for the lowest two levels of height, regardless of how late alignment was. The apparent lack of importance of the alignment of the F0 minimum on its own in perception of the contrast may be surprising in light of production studies that focus on this as the main correlate of the contrast in the Trondheim variety of Trøndersk. For example, Wetterlin (2010) describes the timing of the L as the main difference between the accents, and the current study found the alignment of the F0 minimum in accent 1 to be the landmark most closely aligned with a syllable boundary. Previous research has found that while a phonological contrast may have multiple acoustic cues, these are not all equally salient in the perception of the contrast (Haggard et al., 1970; Abramson and Lisker, 1985; Whalen et al., 1993; Francis et al., 2008a). While F0 minimum alignment is consistently different between the accents, this does not mean that it is the most important cue that listeners use. The perception results here indicate that indeed F0 minimum alignment alone is not the sole difference between the accents and that listeners appear to use other cues to differentiate them.

The F0 minimum height and alignment results also provide an insight regarding the impact of AP-final position (that is, higher level intonation) on F0 maximum alignment and F0 minimum alignment (Chapter 4). Recall that there it was found that the higher-level sentence intonation caused accent 2 to have an earlier and higher F0 minimum and an earlier F0 maximum. While accent 2 generally has a higher F0 minimum than accent 1, an earlier alignment is characteristic of accent 1, so the fact that AP-final position induces an earlier alignment in accent 2 means that the contrast might be diminished in this position. Although this occurred, the alignment of the HTP was unaffected. The current results, that the alignment of F0 minimum alone did not have a very strong effect on accent identification, indicate then that in AP-final position, since there is still a high F0 plateau, this would likely maintain the perception of accent 2. The later alignment of the F0 minimum in accent 2 could simply be a consequence of the presence of a high tone before it, or of a later alignment of the entire HL contour for accent 2. The results here highlight that finding consistent acoustic correlates of the tonal contrasts in production does not necessarily mean that listeners use them in perception. The results further suggest that one of the most salient aspects for accent 2 identification in Trøndersk is the combined late F0 minimum alignment with a higher F0 minimum.

The perception results further indicated that an early fall to the F0 minimum was the most salient cue for accent 1 identification, while a later alignment of the entire HL contour was important for accent 2 identification. The results for accent 2 are somewhat similar to what Bruce (1977) found for Stockholm Swedish, where the F0 fall had to start at least 25% into the vowel to induce accent 2 responses. In the current study, the perception of an initial high tone (due to a high plateau or a very shallow fall) was also necessary for accent 2 identification. While it is tempting to use these perception results to conclude that only accent 2 has an initial phonological high tone and that the contrast is privative and not one of alignment alone, this conclusion may be premature. The perception results simply indicate that since accent 1 has an earlier fall, it will be perceived differently from accent 2; they do not necessarily prove that accent 1 lacks an initial high tone. However, the findings for accent 1 here may be interpreted in the context of what House (1990) calls “recoding from movement to levels” (p. 75). If the onset of a fall takes place before the area of maximum spectral change (the transition from a consonant to a vowel, up to about 25 msec into the vowel), the fall could be perceived as a low tone. Similarly, Remijsen (2013) notes that “[i]f an F0 change...sets in during the onset consonant or at the beginning of the vowel...it would be perceived in terms of level targets, with the end target likely to predominate” (p. 324). Following this, the current results suggest that the initial, early-aligned H is not perceived as indicative of accent 1; rather, it is the early F0 fall that listeners use to identify this accent. The question arises then of whether the initial fall in accent 1 is a fall from an early phonological high tone (as posited by Kristoffersen, 2007), making the contrast one of timing where both accents are HL, or simply a fall from a phonetic high that is used by speakers to make the low target salient (the explanation favored by Randi Nilsen, p.c.; Nilsen, 1992). This latter explanation would make the accent contrast in this dialect a privative one between L and HL. The current data do not allow us to distinguish between these possibilities, because both of them include an early F0 maximum, whether it is phonological or phonetic. However, these alternatives need not contradict one another. If the L is the most salient feature of accent 1, and the fall enhances this salience, this does not preclude the presence of a high tonal target occurring before it. The fact remains that the initial F0 of accents 1 and 2 is perceived as qualitatively different, so while accent 2 undoubtedly has an initial H, it is an initial fall that seems to be most relevant for listeners to perceive accent 1.

With regard to the identification curves shown in Figures 6.5–6.7, it is important to note that there are no 100% accent 1 responses or accent 2 responses. This is likely due to the fact that other cues that were not being manipulated for the experiment, such as vowel length, were set to the mid-point between the two accents, thus possibly making the stimuli ambiguous in these respects. Also, all of the identification curves appear to be more continuous than categorical. The results are not sigmoid curves with an abrupt change from one accent category to the other, as found for consonant contrasts, for instance; rather, they are more gradual slopes. Previous work on the perception of pitch in language has shown mixed results.
Some research has found that lexical tone is perceived categorically, for example in Mandarin (Hallé et al., 2004; Xu et al., 2006). Francis et al. (2003) showed that in Cantonese, level tones were perceived more continuously, similar to vowels, while contour tones were perceived more like consonants, that is, more categorically. In a study related to the current investigation, the results of a discrimination experiment (Kelly and Dogil, under review) also support an analysis where the accents are perceived continuously rather than categorically. Generally, at the crossover point from one category to another (determined by an identification task), there is a peak in discrimination accuracy, indicating that listeners distinguish stimuli across the category boundary better than within the category (Repp, 1984). However, work on the perception of lexical tone does not always find such correspondences between identification and discrimination (Francis et al., 2003; DiCanio, 2012). The results from a brief discrimination task conducted with a subset of the listeners from this experiment indicated that accuracy was not higher across the category boundaries. Therefore, both the identification and discrimination results on this dialect indicate a continuous perception of the lexical accents.

Finally, note that the stimuli used in this experiment were created on just one token base (accent 1) rather than from both accents. This was done so that the F0 contour would be the only manipulation differentiating the stimuli. Another, more prosaic reason was the time limitation during the brief data collection field trip to Norway, as well as access to a limited number of participants. This choice, however, has important consequences for the interpretation of the results. The contribution of other, non-F0 factors may be biasing listeners toward one response type over another. One way of minimizing this was through setting the segmental and F0 variables not under investigation to a mid-point for the two accents. This strategy was intended to affect responses to both accents equally.
The confound still remaining is whether there were other correlates of the accent contrast, such as quality, which would result in listener bias toward accent 1. However, during data collection, a small number of tokens was created from an accent 2 base, and these were tested on five participants. The same controls and manipulations were applied to this base, and for these participants, responses for the accent 2 base did not differ from responses for the accent 1 base for the comparable manipulations. This indicates that using accent 1 as a base token for all manipulations likely did not bias responses. While a future investigation will include a full set of tokens created from both accent 1 and accent 2 words, the results here nonetheless provide some suggestions about the nature of the lexical tonal contrast in perception as well as some potential insights into the phonological characterization of the contrast.

In summary, the results of this investigation provided insight into the accent contrast in Trøndersk, approaching it from the perspective of perception. The results suggest that accent 1 is perceived when there is an initial fall, and accent 2 when there is the perception of an initial H. The results also indicate that the accents are perceived continuously rather than categorically. The production results on the accent contrast guided the manipulation of stimuli to examine which cues listeners use to distinguish the accents. This work indicates how production and perception can inform one another. Previous work described F0 minimum alignment as the main cue to the contrast, but the perception experiment here revealed that this alone is not the most relevant cue; rather, the timing of the F0 fall and the initial F0 height are what listeners use to distinguish the accents.
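
The difference between categorical and continuous identification discussed in this section can be made concrete with a small numerical sketch. The code below is illustrative only: the seven-step continuum, the boundary location, and the slope values are hypothetical and are not taken from the experiment, and the logistic curve is simply an assumed shape for an identification function. The discrimination prediction uses the classic labelling-based formula for an ABX task; the task type used in the discrimination experiment mentioned above is not specified here, so ABX is also an assumption.

```python
import math


def identification_curve(steps, boundary, slope):
    """Proportion of accent 2 responses at each step, modelled as a logistic.

    A large slope gives an abrupt (categorical-looking) crossover;
    a small slope gives a gradual (continuous-looking) one.
    """
    return [1.0 / (1.0 + math.exp(-slope * (s - boundary))) for s in steps]


def predicted_abx_accuracy(p_a, p_b):
    """Labelling-based prediction of ABX accuracy for two stimuli.

    p_a and p_b are the proportions of accent 2 responses to each stimulus.
    Identical labelling predicts chance performance (0.5).
    """
    return 0.5 + 0.5 * (p_a - p_b) ** 2


def adjacent_predictions(curve):
    """Predicted ABX accuracy for each pair of neighbouring steps."""
    return [predicted_abx_accuracy(a, b) for a, b in zip(curve, curve[1:])]


# Hypothetical 7-step F0 continuum with the crossover at step 4.
steps = list(range(1, 8))
categorical = identification_curve(steps, boundary=4.0, slope=4.0)
gradual = identification_curve(steps, boundary=4.0, slope=0.5)

for name, curve in (("categorical", categorical), ("gradual", gradual)):
    ident = " ".join(f"{p:.2f}" for p in curve)
    discr = " ".join(f"{p:.2f}" for p in adjacent_predictions(curve))
    print(f"{name:12s} identification: {ident}")
    print(f"{name:12s} predicted ABX:  {discr}")
```

Under these assumptions, the steep curve predicts a discrimination peak for the pairs straddling the crossover, whereas the shallow curve predicts no marked peak anywhere on the continuum, which is in line with the absence of a boundary peak reported above.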

Chapter 7

General Discussion and Conclusions

The current investigation examined the lexical tonal accent contrast in disyllabic and monosyllabic words in the Trøndersk variety of East Norwegian and how it is affected by pragmatic focus and phrasal intonation. The goal of the dissertation was to conduct an in-depth analysis of the disyllabic accent contrast in this dialect, and to address the question of whether the accent contrast is one of tonal makeup or alignment differences. A perception experiment was set up to examine which cues listeners use to identify each accent. A further goal was to examine the impact of pragmatic focus and phrasal intonation on the accents and how cues used to distinguish the accents may also be utilized to express higher-level information. Finally, another goal was to conduct an acoustic analysis of the monosyllabic contrast in this dialect, and of how this is affected by pragmatic focus. These questions were investigated using controlled experiments and acoustic measurements to describe the cues that differentiate the accents and express intonation. Ten native speakers of Trøndersk were recorded reading controlled sentences to allow for an experimental analysis to answer these questions.

The disyllabic accent contrast (Chapter 3) was found to be one of timing, with both accents having a HL contour but accent 2 a later timing of the F0 landmarks in relation to the segmental string. Accent 2 words were also found to have a higher F0 minimum than accent 1 words. The lack of a difference in height of the H tone, as well as the finding that many accent 1 words had an F0 peak prior to word onset, led to a description of accent 1 as a HL contour. Accent 1 was found to have an L tone aligned with the end of the syllable, and accent 2 a H tone aligned with the beginning of the syllable. In phonological terms, accent 1 was described as HL with the L associated with the initial, stressed syllable, and the preceding H timed with respect to this (H+L*).
Accent 2 was described as HL with the initial H associated with the initial, stressed syllable and the L associated with the following, unstressed syllable. Words with phonologically short vowels had an earlier relative timing of the L tone and the HTP than words with long vowels, for both accents. The alignment of the F0 minimum (L tone) with the end of the syllable in accent 1 words is in line with the results of the perception tests with manipulated contours (Chapter 6), which showed that an early fall is necessary for the perception of accent 1. This early fall is what allows the L tone to be reached by the end of the syllable. As Remijsen and Ayoker (2014) note: “When the [F0] drop is aligned early in or at the center of the vowel...then the result is a falling F0 contour over the vowel, the part of the syllable with greatest intensity, so that the falling contour is likely to be perceptually salient in some way” (p. 1). In this context, the early fall in accent 1 is likely to be salient to listeners, which allows them to distinguish it from accent 2. It is posited that it is the presence of the early H tone in accent 1 that allows the initial fall to occur. Therefore, both production and perception indicated that the identifying characteristic of accent 1 is an initial fall (correlated with a right-aligned low tone) and that of accent 2 is a salient initial high tone. These results further suggest that the Trøndersk dialect does not fit well into the general typology, which divides dialects into those with accent 1 as either a low tone or a high tone. The contrast here is one of alignment rather than a tonal makeup difference between the accents. This also means that this dialect is more different from the Oslo variety, where accent 1 is L and accent 2 is HL, than previously described.

Both disyllabic accents have a wider pitch range in contrastive focus, but accent 2 also has a later timing. The accents thus use different strategies under focus. One explanation is that if accent 1 had a later timing, it would sound too similar to accent 2. Phonologically long vowels were also lengthened under contrastive focus, while phonologically long consonants were not. Some cues that distinguish the accents in broad focus are also used to distinguish them in contrastive focus, but the magnitude of the difference between them is greater in contrastive focus.
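
The timing differences summarized above are expressed as relative alignment measures; the appendix captions describe F0 minimum and HTP alignment as measured from vowel onset, relative to VC length. A minimal sketch of such a measure follows. The function name and the example times are hypothetical, chosen only to illustrate the arithmetic, and the exact computation used in the analysis may differ.

```python
def relative_alignment(landmark_time, vowel_onset, interval_end):
    """Landmark time from vowel onset, as a proportion of the reference interval.

    All times are in seconds. The reference interval runs from vowel onset
    to interval_end (for example, the end of the VC sequence).
    """
    interval = interval_end - vowel_onset
    if interval <= 0:
        raise ValueError("interval_end must be later than vowel_onset")
    return (landmark_time - vowel_onset) / interval


# Hypothetical times (not measured values): vowel onset at 0.100 s,
# end of the VC sequence at 0.300 s.
early_minimum = relative_alignment(0.290, 0.100, 0.300)  # just before the VC end
late_minimum = relative_alignment(0.360, 0.100, 0.300)   # after the VC end

print(f"early F0 minimum: {early_minimum:.2f}")
print(f"late F0 minimum:  {late_minimum:.2f}")
```

A value below 1 places the landmark inside the VC sequence and a value above 1 places it after the end of that sequence, so a later-aligned F0 minimum yields a larger value regardless of the absolute duration of the word.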

Higher-level phrasal intonation was found to affect the lexical F0 contour of disyllabic words (Chapter 4), mainly in the height of the F0 minimum, which did not reach as low as in AP-medial position due to tonal crowding by the AP H% surfacing on the accented word. All of the other effects only impacted the AP H% tone and the boundary slope leading up to this. The F0 maximum and F0 minimum were also earlier for accent 2 words, likely because of the later timing of accent 2 tones, meaning that they are close to the AP boundary and thus more vulnerable to being affected by a change in the duration of the AP. Therefore, contrastive focus and position have opposite effects on the timing of the F0 maximum and minimum in accent 2 words.

The acoustic analysis found support for the monosyllabic contrast in this dialect (Chapter 5), with the unmarked accent being characterized as an L tone and the circumflex accent as HL, with a wider pitch range than the unmarked accent. It was concluded that the circumflex accent is therefore not a displaced version of the unmarked accent. No vowel length difference was found between the monosyllabic accents, in contrast to previous analyses where the circumflex accent was described as having a trimoraic vowel (Kristoffersen, 2011). Similar to the results for disyllabic words, the monosyllabic contrast was found to be enhanced (in production) in contrastive focus, due to a greater difference between the two accents in some cues. The results for disyllabic words seem somewhat at odds with the findings for the expression of contrastive focus on monosyllabic words. In contrastive focus, the tonal targets of the two monosyllabic accents were affected in a similar way, that is, no asymmetrical alignment effects were found. Both accents have a later L timing. It is possible that since the monosyllabic accents have a different tonal makeup from one another, the effect of contrastive focus on timing can be similar for both monosyllabic accents because this will not jeopardize the contrast.

These results contribute to the typological literature on Swedish and Norwegian accents, and also provide a new acoustic analysis of the monosyllabic contrast, an unusual feature across language varieties of the region.
These results have shown the interaction of focus and phrasal intonation with a lexical pitch contrast, and also the effect of this on identification of the accents. It was found that syllable structure has an effect on these interactions, indicating the importance of examining duration cues as well as pitch cues, in the context of the syllable. This dissertation highlights the importance of perception studies in examining phonetic cues, since the presence of a particular acoustic cue in production is not sufficient evidence for its use in perception.

Finally, through the course of examining the characteristics of this tonal contrast, some limitations of the study lead to questions for future research. In order to conclusively determine the effect of vowel length and syllable type on the tonal accent contrast, target words would need to be balanced between V:C and VC:. Perception experiments on the monosyllabic contrast would be useful for determining which acoustic cues listeners use to distinguish the unmarked accent and the circumflex accent. Further perception tests on the disyllabic words could also be conducted. The importance of an initial fall for the perception of accent 1 was found here. It would be useful to create contours that begin with a low target rather than a fall, to see if listeners would still identify this as accent 1, or if an initial fall from an F0 peak is necessary.

Future work might also examine how the phonetic realization of the accents in a particular dialect affects listeners’ perception of manipulations in the F0 contour. Work on dialect variation in the phonetic realization of the pitch contours thus far has shown that changes in the contours occur gradually rather than abruptly as one moves through the country. The result of this variation is that the same F0 contour may be interpreted as accent 1 in one dialect and as accent 2 in another. Examining the acoustics of the accent contrast in a number of dialects will provide new insights into how much variation in F0 contours is present across these different dialects, thereby adding to our understanding of prosodic typology. It will further provide insights into how the realization of focus varies cross-dialectally. It would also be insightful to examine the realization of focus in those dialects of Norwegian that do not have the lexical accent contrast, such as those in the north of the country, for example, Finnmark. The dialect of East Norwegian spoken in Oslo is characterized as having accent 1 as L and accent 2 as HL. Examining how the contrast in this dialect is impacted by pragmatic focus and sentence intonation would extend our understanding of the possible pitch variation within the language. Finally, conducting perception experiments with listeners of this dialect should reveal which acoustic cues are relevant for the perception of the contrast. It may be, for instance, that timing of the F0 minimum would be more important in the Oslo contrast than in the Trøndersk contrast. Importantly, examination of pitch variation across these two dialects of East Norwegian, and the West Norwegian dialects that have very different tonal contours, should reveal how pitch variation due to focus and sentential intonation found in one dialect impacts lexical decisions in another. Comprehensive examination of the production and perception of the tonal accents in a variety of dialects will allow for a higher-level analysis both of the accent contrast and of the processing of the accent contrasts. This work will add to the literature on prosodic typology and intonational phonology. This cross-dialectal examination will also provide new insights into how one’s native language or dialect impacts pitch processing.

Appendices

Appendix A

Test Sentences

Word: limet. Gloss: the glue. Acc.: 1. Syll.: 2.
  Broad: Det var limet i en film, men ikke i et stykke.
    There was the glue in a film, but not in a play.
  AP-final: Det var limet før, men ikke nå.
    There was the glue before, but not now.
  Contrastive: Det var limet i en film, men ikke saks.
    There was the glue in a film, but not scissors.

Word: smilet. Gloss: the smile. Acc.: 1. Syll.: 2.
  Broad: Det var smilet i en film, men ikke i et stykke.
    There was the smile in a film, but not in a play.
  AP-final: Det var smilet før, men ikke nå.
    There was the smile before, but not now.
  Contrastive: Det var smilet i en film, men ikke tåren.
    There was the smile in a film, but not the tear.

Word: glimtet. Gloss: the flash. Acc.: 1. Syll.: 2.
  Broad: Det var glimtet i en film, men ikke i et stykke.
    There was the flash in a film, but not in a play.
  AP-final: Det var glimtet før, men ikke nå.
    There was the flash before, but not now.
  Contrastive: Det var glimtet i en film, men ikke brannen.
    There was the flash in a film, but not the fire.

Word: slimet. Gloss: the mucus. Acc.: 1. Syll.: 2.
  Broad: Det var slimet i en film, men ikke i et stykke.
    There was the mucus in a film, but not in a play.
  AP-final: Det var slimet før, men ikke nå.
    There was the mucus before, but not now.
  Contrastive: Det var slimet i en film, men ikke blodet.
    There was the mucus in a film, but not the blood.

Word: linet. Gloss: the flax. Acc.: 1. Syll.: 2.
  Broad: Det var linet i en film, men ikke i et stykke.
    There was the flax in a film, but not in a play.
  AP-final: Det var linet før, men ikke nå.
    There was the flax before, but not now.
  Contrastive: Det var linet i en film, men ikke gresset.
    There was the flax in a film, but not the grass.

Word: (et) minne. Gloss: memory. Acc.: 2. Syll.: 2.
  Broad: Det var et minne i en film, men ikke i et stykke.
    There was a memory in a film, but not in a play.
  AP-final: Det var et minne før, men ikke nå.
    There was a memory before, but not now.
  Contrastive: Det var et minne i en film, men ikke en drøm.
    There was a memory in a film, but not a dream.

Word: Line. Gloss: (name). Acc.: 2. Syll.: 2.
  Broad: Det var Line i en film, men ikke i et stykke.
    Line was in a film, but not in a play.
  AP-final: Det var Line før, men ikke nå.
    Line was there before, but not now.
  Contrastive: Det var Line i en film, men ikke Anna.
    Line was in a film, but not Anna.

Word: (et) lim. Gloss: glue. Acc.: Unmarked. Syll.: 1.
  Broad: Det var et lim i en film, men ikke i et stykke.
    There was glue in a film, but not in a play.
  AP-final: Det var et lim før, men ikke nå.
    There was glue before, but not now.
  Contrastive: Det var et lim i en film, men ikke saks.
    There was glue in a film, but not scissors.

Word: (et) smil. Gloss: smile. Acc.: Unmarked. Syll.: 1.
  Broad: Det var et smil i en film, men ikke i et stykke.
    There was a smile in a film, but not in a play.
  AP-final: Det var et smil før, men ikke nå.
    There was a smile before, but not now.
  Contrastive: Det var et smil i en film, men ikke en tåre.
    There was a smile in a film, but not a tear.

Word: (et) glimt. Gloss: flash. Acc.: Unmarked. Syll.: 1.
  Broad: Det var et glimt i en film, men ikke i et stykke.
    There was a flash in a film, but not in a play.
  AP-final: Det var et glimt før, men ikke nå.
    There was a flash before, but not now.
  Contrastive: Det var et glimt i en film, men ikke en brann.
    There was a flash in a film, but not a fire.

Word: (et) slim. Gloss: mucus. Acc.: Unmarked. Syll.: 1.
  Broad: Det var et slim i en film, men ikke i et stykke.
    There was mucus in a film, but not in a play.
  AP-final: Det var et slim før, men ikke nå.
    There was mucus before, but not now.
  Contrastive: Det var et slim i en film, men ikke blod.
    There was mucus in a film, but not blood.

Word: (et) lin. Gloss: flax. Acc.: Unmarked. Syll.: 1.
  Broad: Det var et lin i en film, men ikke i et stykke.
    There was flax in a film, but not in a play.
  AP-final: Det var et lin før, men ikke nå.
    There was flax before, but not now.
  Contrastive: Det var et lin i en film, men ikke et gress.
    There was flax in a film, but not grass.

Word: å lime. Gloss: to glue. Acc.: Circumflex. Syll.: 1.
  Broad: Jeg vil lime på ei avis, men ikke kort.
    I want to glue to a newspaper, but not card.
  AP-final: Jeg vil lime snart, men ikke nå.
    I want to glue soon, but not now.
  Contrastive: Jeg vil lime på ei avis, men ikke lese den.
    I want to glue to a newspaper, but not read it.

Word: å smile. Gloss: to smile. Acc.: Circumflex. Syll.: 1.
  Broad: Jeg vil smile i en film, men ikke i et bilde.
    I want to smile in a film, but not in a photo.
  AP-final: Jeg vil smile snart, men ikke nå.
    I want to smile soon, but not now.
  Contrastive: Jeg vil smile i en film, men ikke rynke.
    I want to smile in a film, but not frown.

Word: å glimte. Gloss: to gleam. Acc.: Circumflex. Syll.: 1.
  Broad: Jeg vil glimte i en film, men ikke i et bilde.
    I want to gleam in a film, but not in a photo.
  AP-final: Jeg vil glimte snart, men ikke nå.
    I want to gleam soon, but not now.
  Contrastive: Jeg vil glimte i en film, men ikke rynke.
    I want to gleam in a film, but not frown.

Word: å kline. Gloss: to kiss. Acc.: Circumflex. Syll.: 1.
  Broad: Jeg vil kline i en film, men ikke i et bilde.
    I want to kiss in a film, but not in a photo.
  AP-final: Jeg vil kline snart, men ikke nå.
    I want to kiss soon, but not now.
  Contrastive: Jeg vil kline i en film, men ikke klemme.
    I want to kiss in a film, but not hug.

Word: å slime. Gloss: to produce phlegm. Acc.: Circumflex. Syll.: 1.
  Broad: Jeg vil slime i en drøm, men ikke i verkeligheten.
    I want to cough up in a dream, but not in reality.
  AP-final: Jeg vil slime snart, men ikke nå.
    I want to cough up soon, but not now.
  Contrastive: Jeg vil slime i en drøm, men ikke smile.
    I want to cough up in a dream, but not smile.

Appendix B

Tables of Disyllabic Raw Results

Figures B.1–B.15 show mean results for disyllabic accents in broad focus, narrow focus and AP-final position, by accent and speaker. (The speakers marked F are female and M are male.)

Figure B.1: F0 Maximum (disyllabic)

Figure B.2: F0 Minimum (disyllabic)

Figure B.3: Slope of the rise (disyllabic)

Figure B.4: F0 maximum alignment (disyllabic) (from vowel onset, relative to word length)

Figure B.5: F0 minimum alignment (disyllabic) (from vowel onset, relative to VC length)

Figure B.6: HTP alignment (disyllabic) (from vowel onset, relative to VC length)

Figure B.7: Slope of the fall (disyllabic)

Figure B.8: AP H% height (disyllabic)

Figure B.9: Boundary slope (disyllabic)

Figure B.10: Final vowel duration (disyllabic)

Figure B.11: AP H% timing (disyllabic)

Figure B.12: Stressed vowel duration for long vowels (disyllabic)

Figure B.13: Stressed vowel duration for short vowels (disyllabic)

Figure B.14: Postvocalic consonant duration for long consonants (disyllabic)

Figure B.15: Postvocalic consonant duration for short consonants (disyllabic)

Appendix C

Tables of Monosyllabic Raw Results

Figures C.1 - C.14 show mean results for monosyllabic accents in broad focus and narrow focus by accent and speaker. (The speakers marked F are female and M are male.)

Figure C.1: F0 Maximum (monosyllabic)

Figure C.2: F0 Minimum (monosyllabic)

Figure C.3: F0 at vowel onset (monosyllabic)

Figure C.4: Slope of the rise (monosyllabic)

Figure C.5: F0 maximum alignment (monosyllabic) (from vowel onset, relative to word length)

Figure C.6: F0 minimum alignment (monosyllabic) (from vowel onset, relative to VC length)

Figure C.7: Slope of the fall (monosyllabic)

Figure C.8: AP H% height (monosyllabic)

Figure C.9: Boundary slope (monosyllabic)

Figure C.10: AP H% timing (monosyllabic)

Figure C.11: Stressed vowel duration for long vowels (monosyllabic)

Figure C.12: Stressed vowel duration for short vowels (monosyllabic)

Figure C.13: Postvocalic consonant duration for long consonants (monosyllabic)

Figure C.14: Postvocalic consonant duration for short consonants (monosyllabic)

Bibliography

Abrahamsen, Jarder E. 2004. Focus in the Herøy Dialect. In Nordic Prosody IX, ed. G. Bruce and M. Horne, 9–18. Lund: Peter Lang.

Abramson, A. S. 1979. Lexical tone and sentence prosody in Thai. In Proceedings of the International Congress of Phonetic Sciences, 380–387. Copenhagen.

Abramson, A. S., and L. Lisker. 1985. Relative power of cues: F0 shift versus voice timing. In Phonetic linguistics: Essays in honor of Peter Ladefoged, ed. V. Fromkin, 25–33. New York: Academic Press.

Almberg, Jørn. 2001. The circumflex tone in a Norwegian dialect. In Nordic Prosody: Proceedings of the VIIIth Conference, Trondheim, August 19-21, 2000, ed. Wim A. Van Dommelen and Thorstein Fretheim, 9–21. Frankfurt, Germany: Peter Lang.

Almberg, Jørn. 2004. Tonal differences between four Norwegian dialect regions - some acoustic findings. In Nordic Prosody IX, ed. G. Bruce and M. Horne, 19–28. Lund: Peter Lang.

Almberg, Jørn, and Olaf Husby. 2000. The relevance of some acoustic parameters for the perception of a foreign accent. In New sounds 2000: Proceedings of the Fourth International Symposium on the Acquisition of Second-Language Speech, 1–10. Klagenfurt: University of Klagenfurt.

Arvaniti, A., D. R. Ladd, and I. Mennen. 2006. Phonetic effects of focus and “tonal crowding” in intonation: Evidence from Greek polar questions. Speech Communication 48(6):667–696.

Bannert, R. 1979. The effect of sentence accent on quantity. In Proceedings of the 9th International Congress of Phonetic Sciences, 253–259. Copenhagen.

Bannert, R., and A.-C. Bredvad-Jensen. 1975. Temporal organization of Swedish tonal accent: the effect of vowel duration. In Working Papers 10, 1–36. Dept. of Linguistics, Lund University.

Bannert, R., and A.-C. Bredvad-Jensen. 1977. Temporal organization of Swedish tonal accents: the effect of vowel duration in the Gotland dialect. In Working Papers 15, 122–138. Dept. of Linguistics, Lund University.

Beckman, M. E., and J. Edwards. 1990. Lengthenings and shortenings and the nature of prosodic constituency. In Papers in Laboratory Phonology I, ed. J. Kingston and M. Beckman, 179–200. Cambridge University Press.

Beckman, M. E., and J. Edwards. 1994. Articulatory evidence for differentiating stress categories. In Phonological Structure and Phonetic Form: Papers in Laboratory Phonology III, ed. P. A. Keating, 7–33. Cambridge University Press.

Beckman, Mary E. 1986. Stress and Non-Stress Accent. Netherlands: Foris.

Beckman, Mary E., and Janet B. Pierrehumbert. 1986. Intonational structure in English and Japanese. Phonology Yearbook 3:255–310.

Bjerrum, M. 1948. Felstedmaalets tonale accenter. Aarhus: Universitetsforlaget i Aarhus.

Boersma, Paul, and David Weenink. 2011. Praat: doing phonetics by computer [Computer program]. Version 5.3.03. http://www.praat.org/.

Bookcoverimgs.com. 2012. Map of Regions of Norway. Digital Image. http://norwords.com/settigang2/grammar/regions/index.html/.

Borgstrøm, C. H. 1962. Tonemes and Phrase Intonation in South-East Standard Norwegian. In Studia Linguistica, 34–37.

Bruce, G. 1977. Swedish word accents in sentence perspective. Travaux de L’Institut de Linguistique de Lund 12.

Bruce, G. 1981. Tonal and temporal interplay. In Working Papers 21, 49–60. Dept. of Linguistics, Lund University.

Bruce, G. 2005. Intonational Prominence in Varieties of Swedish Revisited. In Prosodic Typology: The Phonology of Intonation and Phrasing, ed. S.-A. Jun, 410–429. Oxford: Oxford University Press.

Bruce, G., and B. Hermans. 1999. Word tone in Germanic languages. In Word Prosodic Systems in the Languages of Europe, ed. H. van der Hulst, 605–658. Berlin: Mouton de Gruyter.

Cambier-Langeveld, T., and A. E. Turk. 1999. A cross-linguistic study of accentual lengthening: Dutch vs. English. Journal of Phonetics 27:255–280.

Campbell, N., and M. Beckman. 1997. Accent, stress and spectral tilt. Journal of the Acoustical Society of America 101(5):3195.

Chafe, Wallace L. 1976. Givenness, contrastiveness, definiteness, subjects, topics and point of view. In Subject and Topic, ed. Charles N. Li, 27–55. New York: Academic Press.

Chang, S-E. 2013. Effects of fundamental frequency and duration variation on the perception of South Kyungsang Korean tones. Language and Speech 56(2).

Chen, S.-w., B. Wang, and Y. Xu. 2009. Closely related languages, different ways of realizing focus. 1007–1010. Brighton, UK.

Chen, Yiya. 2010. Post-focus F0 compression: Now you see it, now you don’t. Journal of Phonetics 38:517–525.

Christiansen, Hallfrid. 1947. Stavingskontraksjon og tonelag. In Festskrift til professor Olaf Broch på hans 80-årsdag, ed. Christian S. Stang, Erik Krag, and Arne Gallis, 49–55. Oslo.

Cooper, W. E., S. J. Eady, and P. R. Mueller. 1985. Acoustical aspects of contrastive stress in question-answer contexts. Journal of the Acoustical Society of America 77(6):2142–2156.

Dalen, Arnold. 1985. Skognamålet. Ein fonologisk analyse. Oslo.

Dalen, Arnold, Jan Ragnar Hagland, Stian Hårstad, Håkan Rydving, and Ola Stemshaug. 2008. Trøndersk språkhistorie. Språkforhold i ein region. Trondheim: Tapir.

Derksen, R. 1966. Metatony in Baltic. Amsterdam: Rodopi.

DiCanio, C. T. 2012. Cross-linguistic Perception of Itunyoso Trique Tone. Journal of Phonetics 40(5):672–688.

Van Dommelen, Wim A. 2002. Toneme realization in two North Norwegian dialects. Proceedings of Fonetik 44(1):21–24.

Van Dommelen, Wim A., and R. A. Nilsen. 2003. Toneme realization in two East Norwegian dialects. Proceedings of Fonetik, PHONUM 9:21–24.

Eefting, W. 1991. The effect of “information value” and “accentuation” on the duration of Dutch words, syllables, and segments. Journal of the Acoustical Society of America 89(1):412–424.

Efremova, I. B., K. Fintoft, and H. Ormestad. 1963. Intelligibility of Tonic Accents. Phonetica 10:203–212.

Elstad, K. 1978. Det nordnorske cirkumsflekstonemet. In Nordic Prosody, ed. E. Gårding, G. Bruce, and R. Bannert. Lund.

Elstad, K. 1982. Borgfjerdingsmål 1-2. Del 2: Tonelag i Borgfjerdingsmål. Oslo.

Felder, V., E. Jönsson-Steiner, C. Eulitz, and A. Lahiri. 2009. Asymmetric processing of lexical tonal contrast in Swedish. Attention, Perception, & Psychophysics 71(8):1890–1899.

Féry, Caroline, and Frank Kügler. 2008. Pitch accent scaling on given, new and focused constituents in German. Journal of Phonetics 36:680–703.

Fintoft, K., and J. Mártony. 1964. Word Accent in East Norwegian. STL-QPSR 3:8–15.

Fintoft, Knut. 1970. Acoustical Analysis and Perception of Tonemes in Some Norwegian Dialects. Oslo: Universitetsforlaget.

Fintoft, Knut. 1987. Toneme Patterns in Norwegian and in Swedish Dialects. In In Honor of Ilse Lehiste, ed. R. Channon and L. Shockey, 33–50. The Netherlands: Foris Publications.

Fournier, R., J. Verhoeven, M. Swerts, and C. Gussenhoven. 2006. Perceiving word prosodic contrasts as a function of sentence prosody in two Dutch Limburgian dialects. Journal of Phonetics 34:29–48.

Francis, A. L., V. Ciocca, L. Ma, and K. Fenn. 2008a. Perceptual learning of Cantonese lexical tones by tone and non-tone language speakers. Journal of Phonetics 36:268–294.

Francis, A. L., V. C. Ciocca, and B. K. C. Ng. 2003. On the (non)categorical perception of lexical tones. Perception and Psychophysics 65:1029–1044.

Francis, A. L., N. Kaganovich, and C. J. Driscoll-Huber. 2008b. Cue-specific effects of categorization training on the relative weighting of acoustic cues to consonant voicing in English. Journal of the Acoustical Society of America 124(2):1234–1251.

Fretheim, Thorstein. 1981. Intonational Phrasing in Norwegian. Nordic Journal of Linguistics 4:111–137.

Fretheim, Thorstein. 1982. Norwegian intonation patterns in discourse perspective: is there a neutral intonation? In Papers from the sixth Scandinavian conference of linguistics, 193–204. Trondheim: Tapir.

Fretheim, Thorstein. 1987a. Phonetically Low Tone-Phonologically High tone, and Vice Versa. Nordic Journal of Linguistics 10:35–58.

Fretheim, Thorstein. 1987b. Pragmatics and intonation. In The pragmatic perspective, ed. Jef Verschueren and Bertuccelli-Papi, 395–420. John Benjamins.

Fretheim, Thorstein. 1991. Intonational phrases and syntactic focus domains. In Levels of Linguistic Adaptation, ed. Jef Verschueren, 81–112. Amsterdam: John Benjamins.

Fretheim, Thorstein, and Randi Alice Nilsen. 1989. Terminal rise and rise-fall tunes in East Norwegian intonation. Nordic Journal of Linguistics 12:155–181.

Gårding, E., and P. Lindblad. 1975. Constancy and variation in Swedish word accent patterns. In Working Papers 7, 36–100. Phonetics Laboratory, Lund University.

Gårding, Eva. 1973. The Scandinavian word accents. In Working Papers 8. Phonetics Laboratory, Lund University.

Gårding, Eva. 1993. Focal domains and their tonal manifestations in some Swedish dialects. In Nordic Prosody VI, ed. B. Granström and L. Nord, 65–76. Stockholm: Almqvist & Wiksell.

Gósy, M., and J. Terken. 1994. Question marking in Hungarian: Timing and height of pitch peaks. Journal of Phonetics 22:269–281.

Grabe, E., B. Post, F. Nolan, and F. Farrar. 2000. Pitch accent realization in four varieties of British English. Journal of Phonetics 27:161–185.

Gussenhoven, C., and G. Bruce. 1999. Word prosody and intonation. In Word Prosodic Systems in the Languages of Europe, ed. H. van der Hulst, 233–271. Berlin: Mouton de Gruyter.

Gussenhoven, C., and A. Chen. 2000. Universal and language-specific effects in the perception of question intonation. In Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP), 91–94. Beijing.

Gussenhoven, Carlos. 1984. On the grammar and semantics of sentence accents. Dordrecht: Foris.

Gussenhoven, Carlos. 2004. The Phonology of Tone and Intonation. Cambridge: Cambridge University Press.

Gussenhoven, Carlos. 2005. Semantics of prosody. In Encyclopedia of Language and Linguistics, 2nd Ed., ed. Keith Brown, 170–173. Oxford: Elsevier.

Gussenhoven, Carlos, and Peter van der Vliet. 1999. The phonology of tone and intonation in the Dutch dialect of Venlo. Journal of Linguistics 35:99–135.

Hadding-Koch, K. 1961. Acoustico-Phonetic Studies in the Intonation of Southern Swedish. Lund: Gleerup.

Hadding-Koch, K. 1962. Notes on the Swedish word tones. In 4th International Congress of Phonetic Sciences, 630–638. Helsinki 1961.

Haggard, M., S. Ambler, and M. Callow. 1970. Pitch as a voicing cue. Journal of the Acoustical Society of America 47:613–617.

Hallé, P. A., Y. C. Chang, and C. T. Best. 2004. Identification and discrimination of Mandarin Chinese tones by Mandarin Chinese vs. French listeners. Journal of Phonetics 32(3):395–421.

Haugen, Einar. 1983. Pitch accent and tonemic juncture in Scandinavian. In Prosodi/Prosody, 277–281. Oslo: Novus Forlag.

Haugen, Einar, and Martin Joos. 1952. Tone and Intonation in East Norwegian. Acta Philologica Scandinavica 22:41–64.

Hayes, B., and A. Lahiri. 1991. Bengali intonational phonology. Natural Language and Linguistic Theory 9:47–96.

Hayes, Bruce. 1995. Metrical stress theory: principles and case studies. Chicago: University of Chicago Press.

Heldner, M., and E. Strangert. 2001. Temporal effects of focus in Swedish. Journal of Phonetics 29(3):329–361.

House, David. 1990. Tonal perception in speech. Lund: Lund University Press.

House, David. 1996. Differential perception of tonal contours through the syllables. In Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP), 2048–2051. Philadelphia, USA.

House, David. 2004. Pitch and Alignment in the Perception of Tone and intonation: Pragmatic Signals and Biological codes. In Proceedings of Tonal Aspects of Languages (TAL) 2004, 93–96. Beijing, China.

Hualde, J. I. 1991. Basque phonology. London: Routledge.

Hualde, José I. 2012. Two Basque accentual systems and the notion of pitch-accent language. Lingua 122:1335–1351.

Hyman, Larry. 2006. Word-Prosodic Typology. Phonology 23(2):225–257.

Hyman, Larry. 2009. How (not) to do phonological typology: the case of pitch-accent. Language Sciences 31:213–238.

Jilka, Matthias, and Bernd Möbius. 2006. Towards a Comprehensive Investigation of Factors Relevant to Peak Alignment Using a Unit Selection Corpus. 2054–2057. Pittsburgh.

Karins, A. Krisjanis. 1996. The prosodic structure of Latvian. Doctoral Dissertation, University of Pennsylvania.

Karlsson, Anastasia, David House, Jan-Olaf Svantesson, and Damrong Tayanin. 2010. Influence of lexical tones on intonation in Kammu. In INTERSPEECH, 1740–1743. Chiba, Japan.

Katz, Jonah, and Elisabeth Selkirk. 2011. Contrastive focus vs. discourse-new: Evidence from phonetic prominence in English. Language 87:771–816.

Kelly, Niamh E., and Grzegorz Dogil. under review. Accent specification in East Norwegian: The perceptual magnet effect.

Kelly, Niamh E., and Katrin Schweitzer. 2015. Examining Lexical Tonal Contrast in Norwegian Using Intonation Modelling. In To be presented at the 18th International Congress of Phonetic Sciences. Glasgow, August 2015.

Kelly, Niamh E., and Rajka Smiljanić. 2014. The Effect of Focus on Norwegian Tonal Accent. In Proceedings of Tonal Aspects of Languages (TAL) 2014, 95–99. Nijmegen, The Netherlands.

Killingbergtrø, Laurits. 1969. Dativ i Leksvik-målet. (Unpublished manuscript.) Universitetet i Oslo.

Kim, G.-R. 1988. The pitch accent system of the Taegu dialect of Korean with emphasis on tone sandhi at the phrasal level. Doctoral Dissertation, University of Hawaii.

Kock, Axel. 1884/85. Språkhistoriska undersoekningar om svensk akcent. [Historical investigations of Swedish tonal accent from a linguistic perspective] vol. II. Lund: Gleerup.

Koreman, J., B. Andreeva, W. J. Barry, W. Van Dommelen, and R. Sikveland. 2009. Cross-language differences in the production of phrasal prominence in Norwegian and German. In Nordic Prosody: Proceedings of the Xth Conference, Helsinki, August 4-6, 2008.

Kristoffersen, Gjert. 1992. Cirkumflekstonelaget i norske dialekter, med særlig vekt på nordnorsk. Maal og Minne 2:37–61.

Kristoffersen, Gjert. 2000. The Phonology of Norwegian. Oxford: Oxford University Press.

Kristoffersen, Gjert. 2003. The tone bearing unit in Swedish and Norwegian tonology. In Take Danish - for instance. Linguistic studies in honour of Hans Basbøll, ed. Henrik Galberg Jacobsen, Dorthe Bleses, Thomas O. Madsen, and Pia Thomsen, 189–198. Odense: University Press of Southern Denmark.

Kristoffersen, Gjert. 2006a. Dialect variation in East Norwegian tone. (Unpublished manuscript.)

Kristoffersen, Gjert. 2006b. Tonal melodies and tonal alignment in East Norwegian. In Nordic Prosody IX, ed. G. Bruce and M. Horne. Frankfurt am Main: Peter Lang.

Kristoffersen, Gjert. 2007. Dialect variation in East Norwegian tone. In Tones and Tunes, Vol. 1: Typological and Comparative Studies in Word and Sentence Prosody, ed. T. Riad and C. Gussenhoven, 91–111. Berlin: Mouton de Gruyter.

Kristoffersen, Gjert. 2011. Cirkumflekstonelaget i Oppdal. Norsk Lingvistisk Tidsskrift 29:221–262.

Ladd, D. R. 1978. The Structure of Intonational Meaning. Cornell University.

Ladd, D. R. 1996. Intonational Phonology. Cambridge University Press.

Ladd, D. R., A. Schepman, L. White, L. M. Quarmby, and R. Stackhouse. 2009. Structural and dialectal effects on pitch peak alignment in two varieties of British English. Journal of Phonetics 37:145–161.

Lin, Maocan. 2004. On production and perception of boundary tone in Chinese intonation. In Proceedings of Tonal Aspects of Languages (TAL) 2004, 125–130. Beijing, China.

Lindblom, B., B. Lyberg, and K. Holmgren. 1981. Durational Patterns of : Do They Reflect Short-Term Memory Processes?. Bloomington: IULC.

Liu, Chang. in press. Just Noticeable Difference of Tone Pitch Contour Change for English- and Chinese-Native Listeners. Journal of the Acoustical Society of America.

Lorentz, O. 1981. Adding tone to tone in Scandinavian dialects. In Nordic Prosody II, ed. T. Fretheim, 166–80. Trondheim: Tapir.

Ma, Joan K-Y, Valter Ciocca, and Tara L. Whitehill. 2006. Effect of intonation on Cantonese lexical tones. Journal of the Acoustical Society of America 120(6):3978–3987.

Maddieson, I. 2011. Tone. In The World Atlas of Language Structures Online, ed. Matthew S. Dryer and Martin Haspelmath. Munich: Max Planck Digital Library. URL http://wals.info/chapter/13.

Mixdorff, H., B. Andreeva, and J. Koreman. 2010. Quantitative modelling of Norwegian tonal accents in different focus conditions. In Proceedings of Speech Prosody 2010. Chicago, USA.

Möbius, Bernd, and Matthias Jilka. 2007. Effects of syllable structure and nuclear pitch accents on peak alignment: A corpus-based analysis. In Proceedings of the International Congress of Phonetic Sciences, 1173–1176. Saarbrücken.

Morén, B., and E. Zsiga. 2006. The lexical and post-lexical phonology of Thai tones. Natural Language and Linguistic Theory 24:113–178.

Myers, Scott. 2004. The effects of boundary tones on the f0 scaling of lexical tones. In TAL 2004: International Symposium on Tonal Aspects of Language, 147–150. Institute of Linguistics, Chinese Academy of Social Sciences, Beijing.

Nilsen, Randi Alice. 1989. On prosodically marked information structure in spoken Norwegian. Trondheim: University of Trondheim Working Papers in Linguistics 7.

Nilsen, Randi Alice. 1992. Intonasjon i interaksjon - sentrale spørsmål i norsk intonologi. Doctoral Dissertation, University of Trondheim.

Oftedal, M. 1952. On the origin of the Scandinavian tone distinction. In NTS XVI, 201–25.

Peirce, J. W. 2007. PsychoPy - Psychophysics software in Python. Journal of Neuroscience Methods 162(1-2):8–13.

Peters, J. 2007. Bitonal lexical pitch accents in the Limburgian dialect of Borgloon. In Tones and Tunes, Vol. 1: Typological and Comparative Studies in Word and Sentence Prosody, ed. T. Riad and C. Gussenhoven, 167–198. Berlin: Mouton de Gruyter.

Peters, J., J. Hanssen, and C. Gussenhoven. 2014. The phonetic realization of focus in West Frisian, Low Saxon, High German, and three varieties of Dutch. Journal of Phonetics 46:185–209.

Pierrehumbert, J. 1980. The phonology and phonetics of English intonation. Doctoral Dissertation, MIT.

Pierrehumbert, J. 2000. Tonal elements and their alignment. In Prosody: Theory and Experiment. Studies Presented to Gösta Bruce, 11–26. Dordrecht: Kluwer.

Pierrehumbert, Janet, and Mary Beckman. 1988. Japanese tone structure. Cambridge: MIT Press.

Prieto, P. 2014. The Intonational Phonology of Catalan. In Prosodic Typology 2. The Phonology of Intonation and Phrasing, 43–80. Oxford: Oxford University Press.

Prieto, P., M. D’Imperio, and B. Gili Fivela. 2005. Pitch accent alignment in Romance: Primary and secondary associations with metrical structure. Language and Speech 48:359–396.

Prieto, P., and F. Torreira. 2007. The segmental anchoring hypothesis revisited: Syllable structure and speech rate effects on peak timing in Spanish. Journal of Phonetics 35(4):473–500.

Prieto, Pilar, Jan van Santen, and Julia Hirschberg. 1995. Tonal Alignment Patterns in Spanish. Journal of Phonetics 23:429–451.

Remijsen, B., and Vincent J. van Heuven. 2005. Stress, tone and discourse prominence in the Curaçao dialect of Papiamentu. Phonology 22(2):205–235.

Remijsen, Bert. 2002. Word-prosodic systems of Raja Ampat languages. Doctoral Dissertation, Leiden University.

Remijsen, Bert. 2013. Tonal alignment is contrastive in falling contours in Dinka. Language 89:297–327.

Remijsen, Bert, and Otto Gwado Ayoker. 2014. Evidence for contrastive tonal alignment in Shilluk. In Proceedings of Tonal Aspects of Languages (TAL) 2014, 6–9. Nijmegen, The Netherlands.

Repp, Bruno. 1984. Categorical Perception: Issues, methods and findings. In Speech and Language: Advances in Basic Research and Practice (10), 243–335. New York: Academic Press.

Riad, Tomas. 1998. Towards a Scandinavian accent typology. In Phonology and morphology of the Germanic languages (Linguistische Arbeiten 386), 77–109. Tübingen: Niemeyer.

Riad, Tomas. 2003. Diachrony of the Scandinavian accent typology. In Development in Prosodic Systems (Vol. 58), 91–144. Berlin/New York: Mouton de Gruyter.

Riad, Tomas. 2006. Scandinavian accent typology. Sprachtyp. Univ. Forsch. (STUF), Berlin 59(1):36–55.

Rooth, M. 1985. Association with Focus. Doctoral Dissertation, University of Massachusetts, Amherst.

Van Santen, J., and J. Hirschberg. 1994. Segmental effects on timing and height of pitch contours. In Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP), 719–722. Yokohama.

Schepman, Astrid, Robin Lickley, and D. Robert Ladd. 2006. Effects of vowel length and “right context” on the alignment of Dutch nuclear accents. Journal of Phonetics 34(1):1–28.

Scholz, Franziska. 2012. Tone sandhi, prosodic phrasing, and focus marking in Wenzhou Chinese. Doctoral Dissertation, Leiden University.

Segerup, My. 2003. Word accent gestures in West Swedish. In Proceedings from Fonetik 2003; Phonum 9, ed. M. Heldner, 25–28. Univ. Umeå.

Segerup, My. 2004. Gothenburg Swedish word accents: a fine distinction. In Proceedings from Fonetik 2004, ed. P. Branderud and H. Traunmüller, 28–31. University of Stockholm.

Selkirk, Elisabeth. 2008. Contrastive focus, givenness and the unmarked status of “discourse-new”. Acta Linguistica Hungarica 55:331–346.

Senn, A. 1966. Handbuch der litauischen Sprache. Heidelberg: Carl Winter: Universitätsverlag.

Shattuck-Hufnagel, S., L. Dilley, N. Veilleux, A. Brugos, and R. Speer. 2004. F0 peaks and valleys aligned with non-prominent syllables can influence perceived prominence in adjacent syllables. In Proceedings of Speech Prosody 2004. Nara, Japan.

Shen, X-N. 1993. Relative duration as a perceptual cue to stress in Mandarin. Language and Speech 36(4):415–433.

Shport, Irina. 2011. Cross-linguistic perception and learning of Japanese lexical prosody by English listeners. Doctoral Dissertation, University of Oregon.

Silverman, K., and J. Pierrehumbert. 1990. The timing of prenuclear high accents in English. In Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech, ed. J. Kingston and M. Beckman, 72–106. Cambridge: Cambridge University Press.

Sluijter, Agaath M. C., and Vincent J. van Heuven. 1996. Acoustic correlates of linguistic stress and accent in Dutch and American English. In Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP).

Smiljanić, Rajka. 2002. Lexical, pragmatic and positional effects on prosody in two dialects of Croatian and Serbian: An acoustic study. Doctoral Dissertation, University of Illinois at Urbana-Champaign.

Smiljanić, Rajka. 2003. Lexical and pragmatic effects on pitch range and low tone alignment in two dialects of Serbian and Croatian. Proceedings of the 39th Regional Meeting of the Chicago Linguistic Society 39(1):520–539.

Smiljanić, Rajka. 2006. Early vs. late focus: Pitch-peak alignment in two dialects of Serbian and Croatian. In Papers in laboratory phonology (8), ed. L. Goldstein, D. H. Whalen, and C. Best, 495–518. Berlin: Mouton.

Storm, J. 1884. Norvegia. Tidsskrift for det norske folks maal og minder. Kristiania: Grøndahl and Søn.

R Development Core Team. 2008. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org, ISBN 3-900051-07-0.

Teig, Andreas Hilmo. 2001. Can phonetically high pitch represent phonologically low tone? Evidence from East Norwegian raised background domains. In Nordic Prosody: Proceedings of the VIIIth Conference, Trondheim, August 19-21, 2000, ed. Wim A. Van Dommelen and Thorstein Fretheim, 213–225. Frankfurt, Germany: Peter Lang.

Vance, T. J. 1976. An Experimental Investigation of Tone and Intonation in Cantonese. Phonetica 33:368–392.

Vanvik, A. 1957. Norske tonelag. Maal og Minne 92–102.

Wetterlin, Allison. 2010. Tonal Accents in Norwegian: Phonology, Morphology and Lexical Specification. Walter de Gruyter.

Whalen, D., and A. Levitt. 1995. The universality of intrinsic f0 of vowels. Journal of Phonetics 23:349–366.

Whalen, D. H., A. S. Abramson, L. Lisker, and M. Mody. 1993. F0 gives voicing information even with unambiguous voice onset times. Journal of the Acoustical Society of America 93:2152–2159.

Xu, Y. 1999. Effects of tone and focus on the formation and alignment of f0 contours. Journal of Phonetics 27:55–105.

Xu, Y., J. Gandour, and A. Francis. 2006. Effects of language experience and stimulus complexity on categorical perception of pitch direction. Journal of the Acoustical Society of America 120(2):1063–1074.

Xu, Y., and C. X. Xu. 2005. Phonetic realization of focus in English declarative intonation. Journal of Phonetics 33:159–197.

Yip, Moira. 2002. Tone. Cambridge: Cambridge University Press.

Zhang, L., Y.-Q. Zu, and R.-Q. Yan. 2006. Focus, lexical stress and boundary tone: Interaction of 3 prosodic features. In Chinese Spoken Language Processing: Proceedings of the 5th International Symposium (ISCSLP 2006). Singapore.
