<<

Journal of Phonetics 46 (2014) 185–209

Contents lists available at ScienceDirect

Journal of Phonetics

journal homepage: www.elsevier.com/locate/phonetics

Research Article The phonetic realization of focus in West Frisian, Low Saxon, High German, and three varieties of Dutch ⁎ Jörg Peters a, , Judith Hanssen b, Carlos Gussenhoven b a Carl von Ossietzky Universität Oldenburg, 26111 Oldenburg, b Radboud University Nijmegen, PO Box 9103, 6500 HD Nijmegen, The

ARTICLE INFO ABSTRACT

Article history: This study examines the effects of different kinds of focus and of focus constituent size on the phonetic realization Received 14 May 2013 of accent peaks in declarative sentences in varieties of continental West Germanic. Speakers were drawn from six Received in revised form populations along the coastal line of the Netherlands, covering Dutch, Hollandic Dutch, West Frisian, Dutch 10 July 2014 Low Saxon, German Low Saxon, and Northern High German. Our findings suggest that focus structure has systematic Accepted 17 July 2014 effects on segmental durations, the scaling and timing of the accentual f0 gesture, and on the alignment of f0 targets relative to the beginning of the accented syllable. However, the difference between neutral focus and corrective focus has Keywords: more systematic effects than variation of the size of the focused constituents in corrective focus. In addition, speakers Dutch from different places were found to adopt different strategies in signaling these focus structures. Speakers of Hollandic West Frisian Dutch and West Frisian expanded the pitch span on the accented word, whereas speakers of Low and High German Low Saxon rescaled single targets of the accentual f gesture, and speakers of Zeelandic Dutch mixed both strategies. German 0 & 2014 Elsevier Ltd. All rights reserved. Focus Information structure Intonation Fundamental frequency Segmental duration

1. Introduction

Focus is one of the main triggers of intonational events in West Germanic. It determines the distribution or identity of pitch accents, and may additionally affect the phonetic realization of the intonation contour. Variation in the focus of the sentence has two dimensions. One is the size of the focus constituent. In John bought eggs, for example, the focus constituent may be the object eggs, the verb phrase bought eggs, or the whole sentence (cf. Ladd's (1980, chap. IV) ‘broad focus’ and ‘narrow focus’). The other dimension concerns the specificfocus meaning that applies to the focus constituent. The terms ‘information focus’ (Kiss, 1998), ‘presentational focus’ or ‘discourse-new’ (Katz & Selkirk, 2011; Selkirk, 2008) have been used for information provided by speakers either in response to the hearer's request or otherwise equivalently without the hearer's prompting (Baart, 1987). In addition, the focus may be contrastive. In our terminology, contrastive focus relates the focus constituent to a restricted set of alternatives that are accessible to the addressee (Katz & Selkirk, 2011; Rooth, 1985, 1992; Selkirk, 2008;cf.Chafe, 1976). If the contrastive focus is used to reject an alternative or a set of alternatives known to the addressee, this can be further classified as ‘corrective’ (Gussenhoven, 2005). The response in (1a) illustrates information focus for the whole sentence. The response in (1b) illustrates narrow information focus on eggs. It relates what is said to an unrestricted set of alternatives (‘Of the things that John may have bought, he did buy eggs’). The response in (1c) illustrates narrow contrastive focus, and relates what is said to a restricted set of alternatives, (‘John bought eggs, but not vegetables’). The response in (1d) represents narrow corrective focus, used to reject an alternative proposition stated in the preceding sentence. The size of the focus constituent, indicated by square brackets, and type of focus meaning are thus seen as orthogonal dimensions in the specification of focus.

(1) a. What happened? − [John bouɡht eggs]. b. What did John buy? − John bought [eggs]. c. Did John buy eggs or vegetables? − John bought [eggs]. d. John bought vegetables. − John bought [eggs].

⁎ Corresponding author. Tel.: +49 441 798 4589; fax: +49 441 798 2953. E-mail address: [email protected] (J. Peters).

0095-4470/$ - see front matter & 2014 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.wocn.2014.07.004 186 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209

Experimental studies have shown that focus structure may affect segmental durations and f0 of focused or non-focused constituents (e.g., Chen, 2006; Chen & Gussenhoven, 2008; Xu, 1999 for Standard Chinese, and Xu & Xu, 2005 for American English). In addition, it has been shown that those effects may differ even among closely related languages, as illustrated by postfocal compression of the f0 range in Beijing Mandarin, which is missing in both Taiwanese and Taiwanese Mandarin (Chen, Wang, & Xu, 2009). The majority of experimental production studies on focus deal with single, standard varieties, such as Standard American English or Standard Mandarin Chinese. In the investigation reported here we deviate from this practice by investigating six non-standard varieties of Dutch and German spoken along the coast. Whereas there is a growing body of knowledge on the regional variation of prosodic features of semantically equivalent utterances even among closely related varieties of the same language (e.g., Grabe, Post, Nolan, & Farrar, 2000; Atterer & Ladd, 2004; Dalton & Ní Chasaide, 2005; Grabe, 2004; Gilles, 2005; Ladd, Schepman, White, Quarmby, & Stackhouse, 2009; van Leyden, 2004), little is known about regional variation in the phonetic realization of focus. A better understanding of the regional variation in the realization of focus may throw light on the variation that has been reported among speakers of standard languages and may indicate that the latter kind may in part reflect regional variation among the participants, for instance, when these are recruited from student populations of a single university. The varieties we investigated fall within the German-Dutch-Frisian continuum and are spoken in locations that form an arc along the North Sea coast. They belong to different dialectal subgroups and entertain heteronomy relations to different standard languages (cf. Chambers & Trudgill, 1998). Three , one Zeelandic (Zuid-Beveland) and two Hollandic dialects (Rotterdam, a Southern Hollandic dialect, and Amsterdam, a Northern Hollandic dialect), belong to the Low and are spoken in the west of the Netherlands. Their speakers relate to Standard Dutch as the autonomous language. A West Frisian (Grou) intervenes between this group and two Low Saxon dialects, one spoken in the Netherlands, where Standard Dutch is the autonomous language (Winschoten) and one in Germany, where Northern is the autonomous language (Weener). There is thus a non-trivial relation between the geographical continuum and the variety continuum in that it cross-sects an area dominated by the three standard continental West German, Frisian and Dutch. Since West Germanic standard languages have very similar intonation systems (Bolinger, 1989: 43f; de Pijper, 1983) and dialectal variation in non-tonal Dutch intonation has been characterized as small (’t Hart, 1998: 108), we may expect to find phonetic gradience along the geographical arc along which the dialects are situated, despite their different language groupings.1 In our study, we investigated a contour that all dialects have in common, a declarative rising-falling pitch accent on a non-final syllable of the intonational phrase and for which no regional variation has so far been reported. Finally, we address the issue whether dialect speakers who also speak a local variety of the for their area vary systematically in the way they realize their intonation contours between their standard and dialectal speech. We chose Weener as the location where speakers were recorded both in the indigenous variety, Weener , and in the standard variety, Weener High German. It is conceivable that the Weener speakers use the same prosody in the two varieties, but equally that they adjust the realization of their intonation contours in the direction of the standard language. Because this is an exploratory question and also because a number of research findings on Standard German are available, we decided to forgo the recruitment of a Standard German control group. We know of no previous research that might provide a reference for this specific question. The details in the chronology and proportions of exposure to the two varieties will vary across families in Weener. A sociolinguistic perspective would suggest that speakers adapt their speech in different degrees to the phonetic features of the standard language, but we cannot predict with confidence that such adaptations are stronger in their dialectal speech than in their standard speech. A bilingualism perspective does not provide clear predictions either, but does allow for the expectation that no difference will be found. According to Flege's Speech Learning Model (2007), new phonetic categories are acquired if these are sufficiently different from categories in the L1, but that smaller phonetic differences are treated as belonging to a single phonetic category, whereby the degree to which the bilingual speaker's production is similar to either the L1 or the L2 is determined by the exposure balance between the two languages. It is reasonable to assume that the Weener speakers will equate the dialectal and standard categories, since both varieties have rising-falling pitch accents to express the kind of intonational meanings that our experiments were concerned with. If that is correct, the prediction is that no differences are to be found between the two varieties as spoken by these speakers. The aim of our study is thus to examine the effects of dialect variation on the phonetic realization of focus, specifically the phonetic realization of pitch accents as used in sentences with corrective focus on constituents of different lengths, whereby a non-corrective wide focus is used as a baseline. In the remainder of this introduction, we will summarize findings on the prosodic effects of focus structure on segmental durations, the scaling and timing of the f0 contour, and the synchronization of the f0 contour with the segmental string in English, Dutch, and German.

1.1. Segmental lengthening effects

In American English, accented mono- and disyllabic words in declarative sentences are lengthened when occurring under narrow information focus (Cooper, Eady, & Mueller, 1985; Eady & Cooper, 1986; Eady, Cooper, Klouda, Mueller, & Lotts, 1986). Xu and Xu (2005) observed similar lengthening effects under narrow information focus on words containing up to three syllables. In addition, Eady and Cooper (1986) reported lengthening effects in questions, and Eady et al. (1986) in statements with dual focus. In all these studies, the degree of lengthening depended on the position of the focus constituent in the sentence. Lengthening effects on focused

1 For tonal varieties, see e.g. Gussenhoven (2004). J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 187 words in sentence-initial position appear to be stronger than in sentence-final position (Eady & Cooper, 1986; Eady et al., 1986). Under equivalent locations of the pitch accent, focus constituents consisting of a single word are affected more than focus constituents consisting of a word group in sentence final position, but not in sentence-medial position (Eady et al., 1986). For British English, Sityaev and House (2003) found small increases of word duration in contrastive and non-contrastive narrow focus, which varied depending on speaker and on the position of the accented word in the sentence. In some speakers, word duration in both contrastive and non-contrastive narrow focus was larger than in wide information focus. One speaker lengthened the accented word in contrastive narrow focus when compared to non-contrastive wide and narrow focus. In Standard German, accented words have been found to be lengthened in both contrastive and non-contrastive narrow focus (Baumann, Grice, & Steindamm, 2006; Kügler, 2008; Steindamm & Susanne, 2005; see also Féry & Kügler, 2008). For Standard Dutch, Hanssen, Peters, and Gussenhoven (2008) compared the phonetic realization of wide information focus, narrow information focus and narrow corrective focus and found that both types of narrow focus lengthened the onset of the accented syllable when compared to wide focus. The lengthening of the coda consonant in the corrective focus condition failed to reach statistical significance. In order to examine whether Standard Dutch allows for focusing segments smaller than the syllable, van Heuven (1994) ranaperceptionandaproductiontestin which subjects listened to or produced sentences of the type Ik heb niet X, maar gezegd ‘I did not say X, but Y’, where X and Y were monosyllabic words in which the focus constituent varied from the whole syllable to segments in them, as shown in (2).

(2) a. Ik heb niet [veer], maar [boon] gezegd. b. Ik heb niet []oon, maar [b]oon gezegd. c. Ik heb niet b[ee]n, maar b[oo]n gezegd. d. Ik heb niet boo[m], maar boo[n] gezegd.

In all the four examples, the pitch accent in the target clause is on the monosyllabic word boon (‘bean’) on account of a corrective focus, but the focus constituent varies from the syllable (2a) to the onset (2b), the vowel (2c) and the coda (2d). In the production experiment, van Heuven found unanticipated pitch effects (see below), but neither domain-specific durational differences nor any other lengthening effects. To summarize, in American English, British English, Standard German and Standard Dutch, narrow focus was found to lengthen the accented word and the accented syllable. Few if any durational differences, however, were found between different types of narrow focus and among corrective focus conditions with focus domains ranging from the phoneme to the accented word.

1.2. Pitch effects

For American English, Xu and Xu (2005) report increased rise excursions on accented syllables in narrowly focused words in both nuclear and prenuclear positions. This increase was accomplished by raising the f0 peak rather than by lowering the beginning of the rise (the initial f0 minimum), which is in line with the results of a listening test reported by Bartels and Kingston (1994). As raising of the f0 peak was not accompanied by a proportional peak delay, the steepness of the slope increased as well. In post-nuclear syllables, including those inside the narrowly focused word, f0 was lower than in comparable locations in sentences with neutral focus, whereas the prefocal f0 level remained unchanged. Hence, narrow focus was found to expand the pitch range on the focused word and to compress the post-focal pitch range. Eady et al. (1986) report f0 peak raising on focused words in sentence-final position only. No raising effects were found for narrowly focused phrases containing more than a single word or for single focus words in pre-

final position. Both Cooper et al. (1985) and Eady and Cooper (1986) report a sharp post-focal f0 drop, again showing that post-focus compression (Xu & Xu, 2005) in English is both a phonological and a phonetic reality. A point to be noted here is that Eady et al. (1986) found that post-focus compression was suspended when another focused word followed in the same intonational phrase. That is, the pitch range compression is strictly post-nuclear. Increased pitch range under narrow focus has likewise been reported for Standard German (e.g., Batliner, 1989; Baumann, Becker, Grice, & Mücke, 2007; Féry & Kügler, 2008, Kügler & Gollrad, 2011; Uhmann, 1991). Steindamm and Susanne (2005) and

Baumann et al. (2006) report f0 peak raising for two of their speakers. In Standard Dutch, Hanssen et al. (2008) found scaling effects to be largely restricted to the second half of the accentual f0 gesture and the postfocal f0 level. The falling movement after the accentual peak was steeper and longer in both corrective and narrow information focus than in wide information focus. As a consequence, f0 reached a lower postfocal level in the narrow focus conditions, in line with the expectation that narrow focus leads to compression of the postfocal pitch range. van Heuven (1994) found small increases in excursion size of the rise and fall of the accentual f0 gesture when either the onset or the coda was the corrective focus constituent. The durations of the rising movements on the accented word tended to be longer when the onset was focused, but shorter when the coda was focused. Correspondingly, the falling movement was longer when the coda was focused, but this difference did not reach statistical significance. To summarize, expansion of the pitch range on the narrowly focused word and extra pitch compression after the narrowly focus constituents is a widespread phenomenon, but pitch range expansion may affect different parts of the accentual f0 gesture differently.

1.3. Alignment effects

In their study of American English, Xu and Xu (2005) examined focus effects on the alignment of the beginning of the rise and the accentual peak relative to the beginning of the focused word. The accentual peak was earlier in narrow focus than in wide focus, but 188 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 only in sentence-final accented syllables that contained a long vowel. No effect on the beginning of the rise was found. Hence, narrow focus did not retract the whole accentual f0 gesture. Hanssen et al. (2008) examined the timing of the accentual f0 gesture relative to the beginning of the vowel of the accented syllable in Standard Dutch. They found the distance between the f0 peak and the beginning of the vowel to be larger under wide information focus than under narrow information and corrective focus; concomitantly, the distance of the elbow at the end of the fall after the f0 peak and the following stressed syllable was shorter in wide information focus than in the two narrow focus conditions. The first difference can in part be reduced to shorter onset duration in neutral focus; the second amounts to the use of a steeper fall in the narrow focus conditions. Earlier, van Heuven (1994) examined the alignment of the accentual peak (the end of the rise and the beginning of the fall) relative to the beginning of the vowel of the accented syllable and relative to the whole accented syllable as a function of variation in the size of the intra-syllabic focus constituent. While the accentual peak was expected to shift along the time axis with the position of the focused segment (the onset C, the V, or the coda C), a shift in the opposite direction was found. When the onset C was focused, the peak shifted towards the end of the syllable; when the coda C was focused, it shifted towards the beginning of the syllable. The differences, however, were small and not all of them reached statistical significance. Summarizing, no major focus-related alignment differences have been found for English and Dutch, apart from the earlier pronunciation of the post-peak valley as a result of a steeper pronunciation of the fall in narrow focus conditions. In the design of the production experiment described in Section 2, we intended to collect data that would allow us to investigate the possible phonetic exponents of focus, discussed above, in particular the variables that have turned out to be relevant in studies like Xu and Xu (2005) and van Heuven (1994). These include segmental durations, the scaling and duration of the f0 contour, and the alignment of the f0 contour with the segmental string.

2. Material and methods

2.1. Subjects

The recordings of the six indigenous dialects and the local standard variety were made in the localities concerned (Fig. 1). A total of 125 speakers participated in the experiments, 48 of whom were male. They were aged between 16 and 49. All regional speakers and at least one of their parents spoke the indigenous variety fluently and were raised in the location concerned. All speakers from Zuid-Beveland, Grou, and Winschoten were bilingual with Standard Dutch and their indigenous variety. The speakers from Weener were bilingual with German Low Saxon and High German. Except for the speakers of Frisian, our speakers were less familiar with the written form of their local variety than that of the standard language, which had a negative influence on the fluency in the reading tasks in the case of some speakers. The recordings by 10 participants were excluded, as these were disfluent or appeared to the

Weener Winschoten Grou North Sea GERMANY Amsterdam THE NETHERLANDS Rotterdam

Zuid-Beveland

Fig. 1. Recording locations in the Netherlands and North-.

Table 1 Number of speakers from Zuid-Beveland (ZB), Rotterdam (RO), Amsterdam (AM), Grou (GR), Winschoten (WI) and Weener, with WL for Weener Low German and WH for Weener High German.a

ZB RO AM GR WI WL/WHb Total

Total 18 20 24 23 20 20 125 Selected 18 19 19 23 18 18 115 Male 10 12 12 3 4 7 48 Female 8 7 7 20 14 11 67

a Note that in the remainder of this paper the place labels will likewise be used for the varieties spoken in these places. b Weener Low Saxon and Weener High German were recorded by the same speakers. J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 189 experimenter not to speak naturally. Table 1 gives an overview of the remaining speakers recorded per variety. The participants were naïve as to the purpose of the experiment and were paid for their participation.

2.2. Material

In order to elicit nuclear falling contours, we designed a reading task with short declarative answers to questions. The script contained three proper names as target words which we kept constant in all language versions: Malberen, Melberen, and Molberen. The target words had the stress pattern sww and they were always followed by the infinitive verb belonen, whose stress pattern is wsw. The set of target sentences consisted of one sentence with wide information focus, henceforth referred to as ‘neutral focus’ (NF), and three sentences with corrective focus (CF) in which the focus constituent was the entire target word, the accented syllable of the target word, or the onset consonant of the accented syllable, which was always /m/. Table 2 lists these four types of mini- dialogues with the word Malberen. The target words Melberen and Molberen appeared in comparable mini-dialogues. The Standard Dutch version of the script was used for Zuid-Beveland, Rotterdam and Amsterdam. Speakers of Zuid-Beveland preferred to translate the Dutch sentences into their variety as they went along and therefore were presented with the Standard Dutch versions of the test materials. For Grou, Winschoten, and Weener, we used translations of the Standard Dutch sentences into West Frisian, , German Low Saxon, and High German, respectively. For all varieties, the rhythmic, lexical and segmental context was kept comparable to the Standard Dutch materials as much as possible. Appendix A gives the mini-dialogues in all five language versions.

2.3. Recording procedure

The mini-dialogues were presented in a booklet, one mini-dialog per page, in pseudo-randomized order, which was reversed for half of the subjects per variety. To prevent order effects, we added 93 mini-dialogues as fillers. To reduce effects of the experimenter's presence on the speakers' dialect level, our speakers were recorded in pairs, with one speaker producing the context sentence and the other the carrier sentence. The participants switched roles at the end of the task after they had repeated any mispronounced sentences. The German Low Saxon and High German test sentences were recorded from the same speakers on two visits which were at least four weeks apart. Recordings were made in a quiet room either in the homes of our speakers or in a public building. We used a portable digital recorder (Tascam HD P2 and Zoom H4) with a 48 kHz sampling rate, 16 bit resolution and stereo format. The participants wore head-mounted Shure WH30XLR or Sennheiser MKE 2 wired condenser microphones.

2.4. Data selection and analysis

All recorded target sentences were converted to monaural files and stored on computer disk as separate wave files. Utterances were excluded from further analysis if they showed deviant pitch patterns due to accent position, choice of pitch accent or choice of final boundary tone. In particular, we excluded utterances with a downstepped nuclear accent. While in languages like German or Dutch downstep occurs more frequently in neutral than in narrow focus (e.g. Féry & Kügler, 2008), our interest was in the realization of the non-downstepped nuclear fall. As it happens, only a small number of the utterances in our experiment had downstep. We also excluded utterances with hesitation pauses on and around the target word. Some speakers consistently avoided the intended pronunciation of the targets words Malberen, Melberen, and Molberen. Whereas these fictitious place names were expected to be pronounced with the dactylic stress pattern sww, we also found three deviant stress patterns, ssw, ss, and sw, as in [ˈmalbeːʁən], [ˈmalbɛʁn], and [ˈmalbɐn], respectively. By using these patterns the speakers avoided sequences of three unstressed syllables induced by the prosodic context, as in 'Malberen be'lonen, and ended up

Table 2 Types of test sentences varying by focus condition and focus domain (Standard Dutch version).

Focus type Focus domain Dialog

Neutral Sentence A Wat gaat er gebeuren? What is going to happen? B Ze willen bakker Malberen belonen. They are going to reward baker Malberen.

Corrective Word A Wilde de agent meester Verdonck belonen? Did the policeman want to reward teacher Verdonck? B Meester Verdonck? Nee, hij wilde meester [Malberen] belonen! Teacher Verdonck? No, he wanted to reward teacher Malberen!

Corrective Syllable A Zullen die mensen dokter Lomberen belonen? Are those people going to reward doctor Lomberen? B Dokter Lomberen? Nee, ze zullen dokter [Mal]beren belonen! Doctor Verdonck? No, they wanted to reward doctor Malberen!

Corrective Onset consonant A Mag ik Anne Nalberen belonen? May I reward Anne Nalberen? B Anne Nalberen? Nee, je mag Anne [M]alberen belonen! Anne Nalberen? No, you may reward Anne Malberen! 190 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 either with two unstressed syllables ('Malbeːren be'lonen or 'Malbern be'lonen), or one ('Malbeːrn be'lonen). Most instances of deviant stress patterns were found in utterances from WI and in a smaller number of utterances from ZB, WL, and WH. ZB speakers tended to replace sww by sw, whereas speakers of the eastern varieties WL and WH tended to replace it by ˈss. Winschoten (WI) speakers tended to replace sww by either sw or ss. To prevent possible effects of these patterns on our analysis of the timing and scaling of pitch accents we restrict the statistical analyses to utterances containing the intended dactylic forms of the target words. As only three WI speakers consistently produced dactylic forms we exclude all WI data from statistical analysis. Table 3 gives numbers of speakers and utterances left for analysis after excluding all utterances with deviant pitch contours or deviant stress patterns

2.5. Data analysis

Acoustic and auditory analysis of the data was done with the help of the speech processing software package Praat (Boersma & Weenink, 2008). We inserted the labels listed in Table 4. In general, segment boundaries were determined on the basis of visual inspection of the waveform and the broadband spectrogram (Turk, Nakai, & Sugahara, 2006), aided by auditory information. We placed all labels at negative-to-positive zero- crossings of the sound wave. L1 and H were determined semi-automatically using a Praat function to locate the f0 minimum or maximum in a selected region. Semi-automatic determination of the elbow after the nuclear peak (L2) was found to yield less inter- rater agreement for our data set. Therefore, we determined L2 visually by looking for the location of the highest rate of f0 change near the bottom line of the nuclear contour. For a comparison of manual with automatic detection methods see del Giudice, Shosted,

Davidson, Salihie, and Arvaniti (2007). Whenever the f0 track showed two elbows in the low-pitched section after the peak, we chose the first. Pitch tracking errors such as octave jumps were corrected by hand. Using the labels in Table 4, we calculated the acoustic variables given in Table 5. Since corrective focus on the onset consonant was expected to lengthen this consonant and thus to affect the relative position of the beginning of the vowel, we measured the alignment of L1, H, and L2 with reference to the beginning of the accented syllable rather than to the beginning of the vowel. In order to account for possible cross-linguistic or inter-speaker variation in speech rate and scaling, we normalized the acoustic variables listed in Table 5. It was not possible to avoid some variation in the metrical and segmental structure of the prenuclear parts of the test sentences in the different language versions. Therefore, we used the duration and the pitch range of the nuclear part of the utterance for normalization, which was defined as the interval between the beginning of the nuclear accented syllable and the end of the intonational phrase. As the accentual peak was always identical to the f0 maximum of the nuclear utterance, H(f) will be used to determine the upper limit of the nuclear pitch range. Table 6 lists the variables used for normalizing and the normalized variables. To check the reliability of measurements, one sentence per focus condition was quasi-randomly selected from one male and one female speaker per variety and labeled independently by the first author and a phonetically trained research assistant (4×2×6¼48 sentences). Table 7 gives the mean difference between the two measurements for each of the measurement labels listed in Table 4. For most variables inter-rater agreement was high. The largest absolute difference were found for the end of the nuclear fall (the “elbow”, L2(t)).

Table 3 Number of valid speakers and utterances.

ZB RO AM GR WL WH Total

Valid speakers 17 19 19 23 12 16 106 (Male/female) (9/8) (12/7) (12/7) (3/20) (3/9) (5/11) (44/62) Valid utterances 147 210 196 264 97 146 1060 (Male/female) (76/71) (134/76) (122/74) (36/228) (21/76) (37/109) (426/634)

Percent of all utt. 68.1 92.1 81.7 95.7 44.9 67.6

Table 4 Acoustic measurement labels.

1. Segmental boundaries O1(t) Beginning of the onset of the nuclear syllable V1(t) Beginning of the rhyme of the nuclear syllable O2(t) Beginning of the onset of the first postnuclear syllable (¼end of the rhyme of the nuclear syllable) WB(t) Final boundary of the accented word E(t) End of utterance

2. Pitch targets L1(t) Time of L1(f)

L1(f) Minimum f0 at or around the beginning of the accented syllable (¼begin of rise) H(t) Time of H(f)

H(f) Maximum f0 of the accented syllable (¼nuclear peak) L2(t) Time of L2(f) L2(f) Elbow after the nuclear peak

MinL(f) Minimum f0 between O1(t) and E(t) J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 191

Table 5 Acoustic variables.

1. Durational variables Onset Duration The duration of the onset of the nuclear syllable (ms) V1(t)−O1(t) Syllable Duration The duration of the nuclear syllable (ms) O2(t)−O1(t) Word Duration The duration of the accented word (ms) WB(t)−O1(t) 2. Accentual variables 2.1 Rise

Rise Size The pitch difference between the accentual peak and the preceding f0 minimum (Hz) H(f)−L1(f)

Rise Time The time difference between the accentual peak and the preceding f0 minimum (Hz) H(t)−L1(t) Rise Slope Rise Size relative to Rise Time (Hz/s) (H(f)−L1(f))⁎1000/(H(t)-L1(t)) 2.2 Fall Fall Size The pitch difference between the accentual peak and the elbow (Hz) H(f)−L2(f) Fall Time The time difference between the elbow and the accentual peak (ms) L2(t)−H(t) Fall Slope Fall Size relative to Fall Time (Hz/s) (H(f)−L2(f))⁎1000/(L2(t)−H(t)) 3. Alignment variables L1 Delay The distance of the beginning of the rise from the beginning of the onset of the accented syllable (ms) L1(t)−O1(t) H Delay The distance of the accentual peak from the beginning of the onset of the accented syllable (ms) H(t)−O1(t) L2 Delay The distance of the elbow from the beginning of the onset of the accented syllable (ms) L2(t)−O1(t)

Table 6 Basic and normalized acoustic variables.

1. Basic variables Nuclear Duration The distance of the end of the utterance from the beginning of the nuclear accented syllable E(t)−O1(t)

Nuclear Pitch Range The pitch difference between the nuclear accentual peak and the lowest f0 between the beginning H(f)−MinL(f) of the nuclear accented syllable and the end of the utterance

2. Normalized durational variables Onset Durationn Onset Duration relative to Nuclear Duration (V1(t)−O1(t))/(E(t)−O1(t)) Syllable Durationn Syllable Duration relative to Nuclear Duration (O2(t)−O1(t))/(E(t)−O1(t)) Word Durationn Word Duration relative to Nuclear Duration (WB(t)−O1(t))/(E(t)−O1(t))

3. Normalized pitch variables 3.1 Rise Rise Sizen Rise Size relative to Nuclear Pitch Range (H(f)−L1(f))/(H(f)−MinL(f)) Rise Timen Rise Time relative to Nuclear Duration (H(t)−L1(t))/(E(t)−O1(t)) Rise Slopen Rise Slopen relative to Rise Timen (H(f)−L1(f))/(H(f)−MinL(f))/(H(t)−L1(t))/(E(t)−O1(t)) 3.2 Fall Fall Sizen Fall Size relative to Nuclear Pitch Range (H(f)−L2(f))/(H(f)−MinL(f)) Fall Timen Fall Size Time relative to Nuclear Duration (L2(t)−H(t))/(E(t)−O1(t)) Fall Slopen Fall Sizen relative to Fall Timen (H(f)−L2(f))/(H(f)−MinL(f))/(L2(t)−H(t))/(E(t)−O1(t))

4. Normalized alignment variables L1 Delayn L1 Delay relative to Nuclear Duration (L1(t)−O1(t))/(E(t)−O1(t)) H Delayn H Delay relative to Nuclear Duration (H(t)−O1(t))/(E(t)−O1(t)) L2 Delayn L2 Delay relative to Nuclear Duration (L2(t)−O1(t))/(E(t)−O1(t))

Table 7 Comparison of two independent measurements (N¼2×48 measurements per variable).

Durational Mean absolute Pitch Mean absolute variables differences variables differences

O1(t) 3.1 ms V1(t) 1.9 ms O2(t) 2.7 ms WB(t) 9.5 ms E(t) 9.8 ms MinL(f) 0.23 st L1(t) 3.9 ms L1(f) 0.13 st H(t) 1.8 ms H(f) 0.04 st L2(t) 12.3 ms L2(f) 0.30 st

2.6. Statistical analysis

We fitted a mixed-effect model using the lmer function of the package lme4. Two fixed factors were entered into the model: (i) FOCUS, with the four levels Neutral focus (NF), Corrective focus on the accented word (CF-W), Corrective focus on the accented syllable (CF-S), and Corrective focus on the onset of the accented syllable (CF-O); and (ii) DIALECT, with six levels ZB, RO, AM, GR, 192 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209

WL, and WH, and with WL and WH referring to the same group of speakers. As random factors we entered SPEAKER and SENTENCE into the model. We used Restricted Maximum Likelihood (REML) which is preferred to full Maximum Likelihood (ML) since REML takes account of the number of (fixed effects) parameters estimated, losing 1 degree of freedom for each. Especially for small sample sizes REML is preferred. In order to find the p values of the main effects and the interaction, we used the anova function of the R package lmertest. This function calculates Satterthwaite's approximation to degrees of freedom. Multiple comparisons were calculated by using the functions glht and summary from the R packages multcomp. Tukey's correction was applied in order to control overall significance. The R script code is presented in Appendix B.

3. Results

We present the results of our analyses in four steps. In Section 3.1, we report effects of the basic variables that were used for the normalization of the remaining variables. Section 3.2 reports effects of focus and dialect on segmental durations. Section 3.3 reports focus and dialectal effects on the scaling and timing of pitch targets. Finally, Section 3.4 reports focus and dialectal effects on the synchronization of the accentual gesture with the segmental string. All statistical effects will be reported at a significance level of 0.05.

3.1. Basic variables

For both Nuclear Duration and Nuclear Pitch Range, Table 8 shows significant main effects of FOCUS and DIALECT, but no interaction between them. The bar charts in Fig. 2 suggest an overall tendency to increase Nuclear Duration and Nuclear Pitch Range when the size of the focus constituent (clause, word, syllable, onset) becomes smaller. Table 9 gives pairwise comparisons of focus conditions for the two basic variables. As we are interested in possible effects of focus type (NF vs. CF) as well as of the size of the focus constituent in corrective focus (CF-W vs. CF-S vs. CF-O), we include comparisons between NF and pooled data of CF-W, CF-S, and CF-O (focus type) and between adjacent levels of corrective focus, that is CF-W vs. CF-S and level CF-S vs. CF-O (focus domain). Table 9 shows a significant effect of focus type on both variables. Additionally, narrowing down the focus domain from the accented word to the accented syllable (CF-W vs. CF-S) was found to

Table 8 Effects of FOCUS and DIALECT on Nuclear Duration and Nuclear Pitch Range.

Nuclear duration FOCUS F (3,30)¼6.34 p<0.01 DIALECT F (5,92)¼5.24 p<0.001 FOCUS×DIALECT F (15,68)¼0.67 n.s.

Nuclear pitch range FOCUS F (3,29)¼22.09 p<0.001 DIALECT F (5,96)¼2.81 p<0.05 FOCUS×DIALECT F (15,75)¼1.37 n.s.

Fig. 2. Nuclear Duration and Nuclear Pitch range for each variety, broken down by FOCUS. J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 193

Table 9 a Pairwise comparisons of levels of FOCUS.

Levels NF vs. CF CF-W vs. CF-S CF-S vs. CF-O ⁎⁎ Nuclear duration ⁎⁎⁎ ⁎ Nuclear pitch range ⁎ ⁎⁎ ⁎⁎⁎ a The symbols , , and indicate levels of significance at p¼0.05, 0.01, and 0.001, respectively. increase Nuclear Duration and Nuclear Pitch Range, but there was no effect of the confinement of the focus domain to the onset of the accented syllable (CF-S vs. CF-O). The bar charts in Fig. 2 also suggest variation of Nuclear Duration and Nuclear Pitch Range across varieties. The highest values of Nuclear Duration were found for corrective focus on the syllable and onset in GR. Nuclear Duration was found to decrease when moving to the western and eastern varieties suggesting that on average GR speakers were somewhat slower than the other speakers. Nuclear Pitch Range tends to be larger in the more ‘central’ varieties RO, AM, and GR than in the more ‘peripheral’ varieties RO, WL, and WH.2 Table 10 reports pairwise comparisons between all varieties. Distances between levels of the factor DIALECT roughly correspond to geographical distances, except for High and Low German spoken in Weener. As the location of the varieties may matter in the area of investigation, it seems advisable to capture all pairs of varieties which are geographically adjacent to each other. For this reason, we compare West Frisian in Grou (GR) with both Weener Low German and Weener High German (WL and WH). The results presented in Table 10 show that differences are found between ZB and AM or GR. Weener speakers have longer mean values for Nuclear Duration when reading the Low German sentences (WL) than when reading the High German sentences (WH).

3.2. Segmental duration

n n n For Onset Duration , Syllable Duration , and Word Duration Table 11 shows significant main effects of FOCUS and DIALECT but no interaction between FOCUS and DIALECT. The bar charts in Fig. 3 show an overall tendency for Word Duration, Syllable Durationn, and Onset Durationn to increase when the size of the focus constituent (clause, word, syllable, onset) becomes smaller. Table 12 gives pairwise comparisons of focus conditions for the three durational variables. There is a significant effect of focus type on all three variables but no significant lengthening effect of the size of the focus constituent in corrective focus. The bar charts in Fig. 3 also indicate variation across varieties. Syllable Durationn and Onset Durationn tend to be longer in the ‘central’ varieties RO, AM, and GR than in the ‘peripheral’ varieties ZB, WL, and WH, forming an inverted U-shaped pattern. These differences persist even after differences in overall Nuclear Duration have been removed by normalization. On the other hand, Word Durationn is found to be shorter in the ‘central’ varieties after removing differences of overall Nuclear Duration. Pairwise comparisons between all varieties reported in Table 13 reveal that most of the variation can be reduced to differences between WL and WH speakers on the one hand and speakers of the ‘central’ varieties RO and AM on the other.

3.3. Scaling and timing of the accentual f0 contour

3.3.1. General observations Before turning to the details of the scaling and timing of pitch targets, we report general observations on the effects of focus and dialect on the realization of the nuclear falls based on visual inspection of time-normalized averaged f0 contours. For visual identification of general patterns of variation we averaged individual f0 curves of each utterance per focus condition and variety, using the Praat script ProsodyPro by Yi Xu (2005–2011). Mean f0 curves were generated in four steps. First, we generated vocal pulse marks for each utterance based on 100 equally spaced measuring points within the interval defined as Nuclear Duration in Section 2.5. Second, we checked each utterance for missing or misplaced pulses by inspection of the waveform. When the waveform did not allow us to decide on the correct placement, equidistant pulses were inserted resulting in linear interpolations.

Third, we ran the script producing separate averaged f0 curve for male and female speakers for each variety and focus condition. Fourth, we converted the mean f0 values in Hz to semitones choosing 100 Hz as a reference. Fig. 4 displays time-normalized averaged f0 contours in semitones for male (lower curves) and female speakers (upper curves) for each variety. Red lines indicate neutral focus (NF) and blue lines corrective focus (CF) (blue lines). For CF, the data from all three corrective focus conditions were pooled. Visual inspection reveals a difference between NF and CF on the accented word, but no systematic differences between the averaged f0 curves for male and female speakers, except for the overall pitch level. In view of this finding, which is in line with the lack

2 We use the terms ‘central’ and ‘peripheral’ in a relative sense in order to avoid any implication that there are discrete divisions between dialects. The geographical continuum is best seen as a bundle of clines. For instance, whether Grou is characterized as ‘central’ or ‘peripheral’ may depend on the phonetic feature involved in the comparison. In this connection we note that the data for Winschoten, which we excluded on account of the deviant prosodic pattern in the target words, fall in with this notion of geographical-phonetic clines for variables that are independent of the word prosody difference, like the Nuclear Pitch Range, where it sides with GR, and Fall Duration, where it sides with WL. 194 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209

Table 10 Pairwise comparisons of levels of DIALECT.

Levels ZB vs. RO RO vs. AM AM vs. GR GR vs. WL/WH WL vs. WH

Nuclear Duration ⁎ Nuclear Pitch Range

ZB vs. AM RO vs. GR AM vs. WL/WH Nuclear Duration ⁎ Nuclear Pitch Range ⁎

ZB vs. GR RO vs. WL/WH Nuclear Duration ⁎⁎ Nuclear Pitch Range ⁎⁎

ZB vs. WL/WH Nuclear Duration Nuclear Pitch Range

Table 11 n n n Effects of FOCUS and DIALECT on Word Duration , Syllable Duration , and Onset Duration .

Word Durationn FOCUS F (3,31)¼5.85 p<0.01 DIALECT F (5,92)¼4.38 p<0.01 FOCUS×DIALECT F (15,64)¼0.48 n.s.

Syllable Durationn FOCUS F (3,32)¼8.47 p<0.001 DIALECT F (5,100)¼3.66 p<0.01 FOCUS×DIALECT F (15,69)¼0.34 n.s.

Onset Durationn FOCUS F (3,29)¼28.86 p<0.001 DIALECT F (5,101)¼4.10 p<0.01 FOCUS×DIALECT F (15,79)¼0.36 n.s.

n n n Fig. 3. Mean Word Duration , Syllable Duration , and Onset Duration for each variety, broken down by FOCUS.

Table 12 Pairwise comparisons of levels of FOCUS.

Levels NF vs. CF CF-W vs. CF-S CF-S vs. CF-O

Word Durationn ⁎⁎ Syllable Durationn ⁎⁎ Onset Durationn ⁎⁎ J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 195

Table 13 Pairwise comparisons of levels of DIALECT.

Levels ZB vs. RO RO vs. AM AM vs. GR GR vs. WL/WH WL vs. WH

Onset Durationn Syllable Durationn n Word Duration ⁎/⁎ ZB vs. AM RO vs. GR AM vs. WL/WH

n Onset Duration ⁎ ⁎/⁎⁎⁎ n Syllable Duration ⁎ /⁎⁎ n Word Duration ⁎⁎⁎/⁎⁎⁎

ZB vs. GR RO vs. WL/WH

n Onset Duration /⁎ Syllable Durationn n Word Duration ⁎⁎/⁎⁎ ZB vs. WL/WH

Onset Durationn Syllable Durationn n Word Duration /⁎

22 22 22 ZB NF RO NF AM NF 20 CF 20 CF 20 CF 18 18 18 16 16 16 14 14 14 12 12 12 10 10 10 st 8 8 8 6 6 6 4 4 4 2 2 2 0 0 0 -2 -2 -2 -4 -4 -4 0.00.20.40.60.8 1 .0 1. 2 0.0 0.2 0.4 0 .6 0 .8 1. 01.20.0 0.2 0 .4 0 .6 0. 81.01.2

22 22 22 GR NF WL NF WH NF 20 CF 20 CF 20 CF 18 18 18 16 16 16 14 14 14 12 12 12 10 10 10 st 8 8 8 6 6 6 4 4 4 2 2 2 0 0 0 -2 -2 -2 -4 -4 -4 0.0 0.2 0.4 0.6 0 .8 1. 01.20.00.20.40.6 0. 81.01.20.0 0.2 0 .4 0. 6 0.8 1.0 1.2 sec.

Fig. 4. Mean time-normalized f0 contours in semitones of sentences produced by male speakers (lower curves) and female speakers (upper curves) in each variety, comparing neutral focus (NF) with pooled data from all corrective focus conditions (CF). The ordinate is mean f0 in semitones relative to 100 Hz. The abscissa is time in seconds, with the beginning of the word set to 0.

of evidence for gender-related differences in the realization of focus in previous research, we ignore the gender of speakers in the remainder of this paper. For CF, we observe an increase of the excursion size of the rising movement in all varieties. Strikingly, the four varieties spoken in the Netherlands, ZB, RO, AM, and GR, have higher f0 peaks (H) in CF and, except for the male speakers of GR, lower beginnings of 196 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209

22 22 22 ZB NF RO NF AM NF 20 CF-W 20 CF-W 20 CF-W 18 CF-S 18 CF-S 18 CF-S CF-O CF-O CF-O 16 16 16 14 14 14 12 12 12 10 10 10 st 8 8 8 6 6 6 4 4 4 2 2 2 0 0 0 -2 -2 -2 -4 -4 -4 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0 .2 0 .4 0 .6 0 .8 1 .0 1 .2 0.00.20.40.60.81.01.2

22 22 22 GR NF WL NF WH NF 20 CF-W 20 CF-W 20 CF-W 18 CF-S 18 CF-S 18 CF-S CF-O CF-O CF-O 16 16 16 14 14 14 12 12 12 10 10 10 st 8 8 8 6 6 6 4 4 4 2 2 2 0 0 0 -2 -2 -2 -4 -4 -4 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 sec.

Fig. 5. Mean time-normalized f0 contours in semitones of sentences produced by male and female speakers, comparing neutral focus (NF) with corrective focus on the nuclear word (CF- W), corrective focus on the nuclear syllable (CF-S), and corrective focus on the onset of the nuclear syllable (CF-O). the rise (L1), even if in some places differences are small.3 In the two German varieties WL and WH, by contrast, the rise starts unusually high in neutral focus, and in corrective focus the range of the rising movement is primarily increased by lowering L1. We also observe that in all varieties the rise is steeper in CF than in NF, which suggests a disproportional increase of the excursion size of the rise when compared to the duration of the rise. This difference in rise slope is most obvious in WL and WH.

In all varieties, the f0 differences between NF and CF are confined to the target words. After the accented word, f0 is approximately the same. This means that the data do not support the expectation that corrective focus compresses the postnuclear pitch range more than does neutral focus (see Section 1.2). The fall is steepest in WL and WH, where the difference between NF and CF is restricted to the rising movement at the beginning of the accented word. Fig. 5 displays the three corrective focus conditions CF-W, CF-S, and CF-O separately, using the neutral condition NF as a baseline. The largest differences among corrective focus conditions show up in the scaling of the f0 peak in the ‘central’ varieties RO, AM, and GR. In particular, the f0 peak appears to be somewhat higher in these varieties when the nuclear syllable or onset is focused (CF-S and CF-O) than when the nuclear word is focused (CF-W). Overall, however, the differences between the three CF conditions are smaller than the difference between NF on the one hand and the averaged CF conditions on the other hand. As shown in Figs. 4 and 5, the shapes of the contours for the different focus conditions differ, often subtly, in a number of ways. The rises may be different, as may be the falls, while each of these in turn may differ in more than one way. Both rises and falls may differ in excursion size, in time, and slope. To capture these differences we now will have a closer look at the size, time, and slope of the accentual rises and falls.

3.3.2. Rising movement For comparing the accentual rises we used normalized Rise Sizen, Rise Timen, and Rise Slopen as dependent variables, as defined in Table 6. Table 14 shows significant main effects of FOCUS and DIALECT for the three variables and significant interactions n n between FOCUS and DIALECT for Rise Size and Rise Slope . The bar charts in Fig. 6 suggest that Rise Sizen increased over the first three focus conditions in all varieties, Rise Timen in all varieties except GR, and Rise Slopen in all varieties except RO and GR. Further increases of Rise Sizen Rise Timen and Rise Slopen

3 Note that the averaged f0 curves in Figs. 3–5 start with the onset of the accented syllable rather than with the initial valley (L1). In some cases, the initial valley occurs before the beginning of the onset. Therefore, inspection of the distances between the curves at 0 s may underestimate the difference between the f0 height in NF and CF at the beginning of the rise. J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 197

Table 14 n n n Effects of FOCUS and DIALECT on Rise Size , Rise Time , and Rise Slope .

Rise Sizen FOCUS F (3,22)¼37.69 p<0.001 DIALECT F (5,92)¼6.93 p<0.001

FOCUS×DIALECT F (15,59)¼3.20 p<0.001 Rise Timen FOCUS F (3,25)¼8.72 p<0.001 DIALECT F (5,95)¼6.30 p<0.001

FOCUS×DIALECT F (15,62)¼0.70 n.s. Rise Slopen FOCUS F (3,39)¼10.71 p<0.001 DIALECT F (5,109)¼4.30 p<0.01 FOCUS×DIALECT F (15,102)¼3.02 p<0.01

n n n Fig. 6. Mean normalized Rise Size , Rise Time , and Rise Slope for each variety, broken down by FOCUS.

Table 15 Pairwise comparisons of levels of FOCUS.

Levels NF vs. CF CF-W vs. CF-S CF-S vs. CF-O

Rise Size ⁎⁎⁎ Rise Time ⁎⁎⁎ Rise Slope ⁎⁎⁎ in the CF condition for the onset consonant can be observed in the ‘peripheral’ varieties only. Overall, however, only the difference between NF and CF reached significance. Pairwise comparisons of the four focus conditions in Table 15 show for all three variables a significant effect of focus type (NF vs. CF) but not of the size of the focused constituent in corrective focus. The increase of slope in corrective focus suggests that rise size increases disproportionally when compared to the duration of the rise. The bar charts in Fig. 6 show cross-linguistic variation for all three variables. The variation found for Rise Timen is strikingly similar to the variation found for Syllable Durationn and Onset Durationn in Fig. 3, again forming an inverted U-shaped pattern. Rise Sizen in the NF condition is substantially smaller in WL and WH than in the other varieties, as has already been observed in Figs. 4 and 5. As the smaller rise size in WL and WH is not fully compensated for by a shorter rise time, WL and WH speakers have likewise the lowest values for Rise Slopen in the NF condition. Pairwise comparisons in Table 16 confirm the overall impression that most of the variation can be reduced to the deviant rising patterns produced by the WL and WH speakers. To obtain more information on the interactions effects attested for Rise Sizen and Rise Slopen we carried out pairwise comparisons between levels of FOCUS per variety. Table 17 indicates that focus type (NF vs. CF) affects the size of the rising movement in the ‘peripheral’ varieties WL, WH, RO and partly ZB, whereas narrowing down the size of the focus constituent in corrective focus has no systematic effects (CF-W vs. CF-S and CF-S vs. CF-O). Similar effects on the slope of the rising movement were found for WL, WH, and ZB. 198 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209

Table 16 Pairwise comparisons of levels of DIALECT.

Levels ZB vs. RO RO vs. AM AM vs. GR GR vs. WL/WH WL vs. WH

Rise Size ⁎⁎⁎/⁎⁎⁎

Rise Time ⁎⁎/⁎

Rise Slope ⁎⁎/⁎ ZB vs. AM RO vs. GR AM vs. WL/WH

Rise Size ⁎⁎⁎/⁎⁎ Rise Time ⁎⁎ ⁎⁎⁎/⁎⁎⁎ Rise Slope

ZB vs. GR RO vs. WL/WH

Rise Size ⁎⁎⁎/⁎⁎⁎

Rise Time ⁎⁎/⁎⁎

Rise Slope ⁎/ ZB vs. WL/WH

Rise Size ⁎⁎/⁎ Rise Time

Rise Slope ⁎⁎/⁎

Table 17 Main effects and pairwise comparisons of levels of FOCUS, broken down by DIALECT.

NF vs. CF CF-W vs. CF-S CF-S vs. CF-O

Rise Sizen ZB ⁎⁎ RO ⁎⁎ AM ⁎⁎ GR ⁎ WL ⁎⁎⁎ WH ⁎⁎⁎

Rise Slopen ZB ⁎⁎⁎ RO AM GR WL ⁎⁎⁎ WH ⁎⁎⁎

Table 18 shows significant differences between varieties in each focus conditions, except for corrective focus in the onset (CF-O), where no differences reached statistical significance. All significant differences involve either WL or WH and differences of Rise Slopen are restricted to the neutral focus condition. n n Overall, the results suggest that the main source of interaction between FOCUS and DIALECT attested for Rise Size and Rise Slope is the deviant realization of neutral focus by Weener speakers, and in particular the unusually high start of the rising movement in neutral focus (L1) observed in Fig. 4. Because for Standard German, prenuclear accents have been reported to occur more often in neutral focus and narrow non-contrastive focus than in narrow contrastive focus (Baumann et al., 2007), the question arises whether the high values for L1 in Weener⁎ might be⁎ an effect of the presence of prenuclear accents. In particular, prenuclear accents ending with a high pitch target, such as H and L H, are likely to have a raising effect on L1.To examine the sources of L1 raising in Weener, we checked all utterances for the presence of prenuclear accents on the last or second to last prenuclear foot. We identified four accent types: H*L, H*, L*H, and L*. As prenuclear contours with L* were rarely attested and hardly distinguishable from accentless prenuclear contours, this accent type will be ignored. Fig. 7 shows relative frequencies of (rightmost) prenuclear accents in neutral and corrective focus, respectively. For corrective focus, data for the three corrective focus conditions were pooled. To examine whether WL and WH speakers used more prenuclear accents with a final H than the other speakers we compared the frequencies of H* and L*H in pooled data of ZB, RO, AM, and GR with those of WL and WH, respectively, using two-tailed Binominal tests. In both the neutral and corrective focus condition, prenuclear accents with final H were found to be more often used in WL and WH than in the other varieties (for each test, p<0.001). We conclude that the more frequent use of H* and L*H may be one source of the unusual high scaling of L1 in WL and WH.

3.3.3. Falling movement For comparing the accentual falls we used Fall Sizen, Fall Timen, and Fall Slopen as dependent variables, as defined in Table 5. n Table 19 shows significant main effects of FOCUS and DIALECT for all variables included, except for the effect of FOCUS on Fall Time , but no significant interaction between FOCUS and DIALECT. J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 199

Table 18 Pairwise comparisons of levels of DIALECT, broken down by FOCUS.

a. Neutral focus

Levels ZB vs. RO RO vs. AM AM vs. GR GR vs. WL/WH WL vs. WH

n Rise Size ⁎⁎/⁎⁎ n Rise Slope ⁎⁎/⁎⁎ ZB vs. AM RO vs. GR AM vs. WL/WH

n Rise Size ⁎/⁎ Rise Slopen

ZB vs. GR RO vs. WL/WH

n Rise Size ⁎⁎/⁎⁎ n Rise Slope ⁎/⁎ ZB vs. WL/WH

n Rise Size ⁎/ Rise Slopen

b. Corrective focus on the accented word (CF-W)

Levels ZB vs. RO RO vs. AM AM vs. GR GR vs. WL/WH WL vs. WH

n Rise Size ⁎⁎/⁎ Rise Slopen

ZB vs. AM RO vs. GR AM vs. WL/WH

n Rise Size ⁎/ Rise Slopen

ZB vs. GR RO vs. WL/WH

n Rise Size ⁎/ Rise Slopen

ZB vs. WL/WH

Rise Sizen

c. Corrective focus on the accented syllable (CF-S)

Levels ZB vs. RO RO vs. AM AM vs. GR GR vs. WL/WH WL vs. WH

n Rise Size ⁎/⁎ Rise Slopen

ZB vs. AM RO vs. GR AM vs. WL/WH

n Rise Size /⁎ Rise Slopen

ZB vs. GR RO vs. WL/WH

n Rise Size ⁎/⁎ Rise Slopen

ZB vs. WL/WH

Rise Sizen Rise Slopen

no accent H*L H* L*H no accent H*L H* L*H 100 100 80 80 60 60 % % 40 40 20 20 0 0 ZB RO AM GR WL WH ZB RO AM GR WL WH

Fig. 7. Relative frequencies of prenuclear accent type in neutral (left panel) and corrective focus condition (right panel). 200 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209

Table 19 n n n Effects of FOCUS and DIALECT on Fall Size , Fall Time , and Fall Slope .

Fall Sizen FOCUS F (3,36)¼5.82 p<0.01 DIALECT F (5,104)¼7.39 p<0.001 FOCUS×DIALECT F (15,120)¼0.52 n.s.

Fall Timen FOCUS F (3,42)¼1.30 n.s. DIALECT F (5,103)¼13.03 p<0.001 FOCUS×DIALECT F (15,126)¼0.72 n.s.

Fall Slopen FOCUS F (3,30)¼10.57 p<0.001 DIALECT F (5,102)¼7.85 p<0.001 FOCUS×DIALECT F (15,94)¼1.59 n.s.

n n n Fig. 8. Mean normalized Fall Size , Fall Time , and Fall Slope for each variety, broken down by FOCUS.

The bar charts in Fig. 8 display mean values of Fall Sizen, Fall Timen, and Fall Slopen as a function of the four focus conditions. Fall Sizen tends to increase in all varieties when corrective focus in place of neutral focus is used but differences in Weener are small. Mean slope values indicate that reducing the size of the focus constituent tends to increase the excursion of the fall in most varieties disproportionally when compared to the duration of the fall. Pairwise comparisons of the focus conditions in Table 20 reveal an effect of focus type (NF vs. CF) for Fall Sizen and Fall Slopen but not for Fall Timen. An additional effect of Fall Sizen is found for narrowing down the focus constituent from the syllable to the onset of the accented syllable (CF-S vs. CF-O). The bar charts in Fig. 8 indicate dialectal variation that in the case of Fall Timen forms an inverted U-shaped pattern comparable to that found for the rising movement, suggesting that the ‘central’ varieties have both longer rises and falls relative to Nuclear Duration than the ‘peripheral’ varieties. The U-shaped pattern attested for Fall Slopen indicates that the longer falls in the ‘central’ varieties cannot be fully explained by a larger size of the falling movement. In the ‘central’ varieties falls are larger and longer than in the ‘peripheral’ varieties. Pairwise comparisons in Table 21 show that the main differences in Fall Sizen hold between the ‘central’ varieties on the one hand (RO, AM, GR) and the ‘peripheral’ varieties ZB and WL/WH on the other. Differences in Fall Time are found between all neighboring varieties but not between the two varieties spoken in Weener (WL vs. WH). Differences in Fall Slope are found between all neighboring varieties except between ZB and RO in the southwest. Additionally, Weener speakers used larger falls in WH than in WL.

3.4. Alignment of f0 targets

For comparing the synchronization of the accentual pitch gesture with the segmental string we used L1 Delayn, H Delayn, and L2 Delayn as dependent variables, which in Table 5 were defined as the distances of L1, H, and L2 from the beginning of the accented n syllable relative to Nuclear Duration. Table 22 shows significant main effects of FOCUS and DIALECT on H Delay , a significant main n effect of DIALECT on L2 Delay and no significant interaction effect. J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 201

Table 20 Pairwise comparisons of levels of FOCUS.

Levels NF vs. CF CF-W vs. CF-S CF-S vs. CF-O

Fall Sizen ⁎⁎ ⁎ Fall Slopen ⁎⁎

Table 21 Pairwise comparisons of levels of DIALECT.

Levels ZB vs. RO RO vs. AM AM vs. GR GR vs. WL/WH WL vs. WH

n Fall Size ⁎⁎ ⁎⁎/ ⁎ n Fall Time ⁎⁎⁎⁎⁎⁎⁎/⁎⁎ n Fall Slope ⁎⁎⁎/⁎ ZB vs. AM RO vs. GR AM vs. WL/WH

Fall Sizen n Fall Time ⁎⁎⁎ ⁎⁎⁎/⁎⁎⁎ n Fall Slope ⁎⁎⁎ ⁎⁎⁎/⁎⁎⁎

ZB vs. GR RO vs. WL/WH

n Fall Size ⁎⁎⁎ n Fall Time ⁎⁎⁎/⁎ n Fall Slope /⁎ ZB vs. WL/WH

n Fall Size /⁎ Fall Timen Fall Slopen

Table 22 Effects of FOCUS and DIALECT on L1 Delay, H Delay, and L2 Delay.

L1 Delayn FOCUS F (3,25)¼1.64 n.s. DIALECT F (5,94)¼1.75 n.s. FOCUS×DIALECT F (15,58)¼1.54 n.s.

H Delayn FOCUS F (3,30)¼58.78 p<0.01 DIALECT F (5,96)¼30.07 p<0.001 FOCUS×DIALECT F (15,65)¼2.32 n.s.

L2 Delayn FOCUS F (3,36)¼1.01 n.s. DIALECT F (5,104)¼20.45 p<0.001 FOCUS×DIALECT F (15,117)¼0.37 n.s.

The bar charts in Fig. 9 show mean values for L1 Delayn, H Delayn, and L2 Delayn as a function of the four focus conditions. In neutral focus, the accentual rise appears to start later relative to the beginning of the accented syllable (L1 Delayn) in most varieties, but this possible effect did not reach statistical significance. Pairwise comparisons between levels of focus in Table 23 show a significant effect of focus type (NF vs. CF) for H Delayn. Stepwise reduction of the size of the focused constituent in corrective focus has no effect. Fig. 9 suggests substantial variation across varieties, which for all three dependent variables forms an inverted U-shaped pattern. In general, speakers of the more ‘central’ varieties tend to produce L1, H, and L2 later than the speakers of the more ‘peripheral’ varieties, which means that in the ‘central’ varieties the whole accentual gesture is delayed relative to the beginning of the accented syllable. However, the observed dialectal variation of L1 Delayn is too small to reach statistical significance. Pairwise comparisons between varieties in Table 24 show that as in the case of Fall Sizen (Section 3.3) most differences for H Delayn and L2 Delayn hold between the ‘central’ and the ‘peripheral’ varieties (RO, AM, GR vs. ZB and WL/WH).

4. Summary and discussion

Our interest in the investigation of Dutch, Frisian, Low Saxon and High German varieties spoken in the Netherlands and Northwest Germany concerned the kind of phonetic adjustments speakers make when pronouncing comparable sentences under different focus conditions. In order to observe these adjustments independently of differences in speech rate and pitch range that are likely 202 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209

n n n Fig. 9. Mean normalized L1 Delay , H Delay , and L2 Delay for each variety, broken down by FOCUS.

Table 23 Pairwise comparisons of levels of FOCUS.

Levels NF vs. CF CF-W vs. CF-S CF-S vs. CF-O

H Delayn ⁎⁎⁎

Table 24 Pairwise comparisons of levels of DIALECT.

Levels ZB vs. RO RO vs. AM AM vs. GR GR vs. WL/WH WL vs. WH

n H Delay ⁎ ⁎⁎⁎/⁎⁎⁎ n L2 Delay ⁎⁎ ⁎⁎⁎ ⁎⁎⁎ ⁎⁎⁎/⁎⁎⁎ ZB vs. AM RO vs. GR AM vs. WL/WH

n H Delay ⁎⁎⁎ ⁎⁎⁎/⁎⁎⁎ n L2 Delay ⁎⁎⁎ ⁎⁎⁎/⁎⁎⁎

ZB vs. GR RO vs. WL/WH

n H Delay ⁎⁎⁎/⁎⁎⁎ n L2 Delay ⁎⁎ ⁎⁎⁎/⁎⁎⁎ ZB vs. WL/WH

H Delayn L2 Delayn

to occur between speakers, we calculated the normalized values for segmental variables, pitch variables, and alignment variables before analyzing them. In order to avoid variation in the metrical and segmental structure of the prenuclear parts of the test sentences in the different language versions, we used the nuclear part of the utterance for normalization, which was defined as the interval between the beginning of the nuclear accented syllable and the end of the intonational phrase. As the accentual peak was always identical to the f0 maximum of the nuclear utterance, H(f) was used as the upper limit of the ‘nuclear pitch range’; the lowest f0 was the lower limit. Likewise, the duration of the same stretch of speech, the ‘nuclear duration’, was used as the basis for all durational normalizations. Nuclear pitch range and nuclear duration were in fact found to vary overall with focus condition and dialect. There were longer nuclear durations for smaller focus constituents, while Grou stood out as having the longest nuclear duration. Similarly, wider pitch ranges were found for shorter focus constituents, while Rotterdam, Amsterdam and Grou showed wider pitch ranges than ZB in the south-west and WL and WH in the north-east. This geographically defined U-shaped pattern in the dialectal variation was a recurring theme in our data even after normalization. That is, it not only revealed itself globally over the nuclear stretch of speech, but also appeared locally in the realization of various features of the rising-falling pitch accent. J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 203

4.1. Focus effects

In general, the largest and most systematic phonetic effects arose from variation of the semantic type of focus (NF vs. CF). Variation of the size of the focus domain in corrective focus, that is CF on the accented word (CF-W), on the accented syllable (CF-S), or on the onset of the accented syllable (CF-O), had minor and less systematic effects.

(i) Segmental duration. CF was found to increase Onset Durationn, Syllable Durationn, and Word Durationn, as shown in Fig. 3 and Table 11. Hanssen et al. (2008) found for their mixed group of speakers of Standard Dutch that corrective focus on the accented word increased the duration of the onset consonant when compared to neutral focus. No overall differences among CF conditions (CF-W vs. CF-S, CF-S vs. CF-O, see Table 12) were found, which is in line with the results reported by van Heuven (1994), who failed to find any significant effects when comparing corrective focus on the onset consonant, the vowel, the coda consonant, and the accented syllable. There is thus no evidence that lengthening effects of focus are domain-specific, as was expected by van Heuven (1994). Like van Heuven, we failed to find any disproportionate lengthening of the segments in the corrective focus constituent. That is, segmental durations may help to distinguish neutral from corrective focus, but not a wider focus constituent from a narrower one.

(ii) f0. Generally, the differences between the three CF conditions are smaller than the difference between NF on the one hand and the averaged CF conditions on the other hand, which we found in all varieties. This difference between NF and CF is confined

to the target words, f0, after the accented word being approximately the same. This means that the data do not support an expectation that corrective focus compresses the postnuclear pitch range any further relative to an otherwise identical neutral focus (see Section 1.2).

As for the rising part of the pitch accent, Rise Sizen increased over the first three focus conditions in all varieties, Rise Timen in all varieties except GR, and Rise Slopen in all varieties except RO and GR (Fig. 6). A further increase in the CF condition for the onset consonant occurred in the ‘peripheral’ varieties (ZB and WH). Overall, however, only the difference between NF and CF reached significance. The significant interaction between dialect and focus for Rise Sizen and Rise Slopen is due to the treatment of the rise in WL and WH, which German varieties distinguish themselves from the other dialects by increasing both the size and slope of the rise, and thus speeding it up disproportionately, in particular when moving from NF to CF. This is achieved by the favorable starting point in NF, which has a fairly high beginning of the rise in the NF condition, observable in Fig. 4 (see also Section 4.2). Marginally steeper and wider rises were also found in contrastive focus by Kügler and Gollrad (2011) for speakers of Standard German. n n Significant effects of FOCUS for the falling part were observed for Fall Size and Fall Slope , which broadly increased with the narrowing of the focus constituent relative to NF. Strikingly, WL/WH deviate from the varieties spoken in the Netherlands by shortening the duration of the fall, thereby creating a more substantial increase in Fall Slopen, a conclusion that is supported by the pairwise comparisons between WL/WH and the other dialects for Fall Slopen and Fall Timen (Table 21). A later alignment of the rise-fall in narrower focus constituents was evident for the peak only (Table 22 and Fig. 9), whereby the effects for GR and AM were very small or unobservable.

4.2. Dialect effects

A comprehensive inspection of the regional variation reveals that for many phonetic variables there is an inverted U-shaped pattern, with the more ‘central’ varieties RO, AM, and GR showing larger values than the more ‘peripheral’ varieties in the southwest (ZB) and northeast (WL, WH). This inverted U-shaped pattern was most pronounced for the durational variables. The RO and AM speakers produced longer accented syllables and longer accented syllable onsets (Fig. 3) than the speakers of the more ‘peripheral’ varieties, a conclusion that is broadly supported by the pairwise comparisons between these and other varieties. As shown in Table 13, RO differs from WL/WH in Word and Onset Durationn, while AM differs from WL/WH for all three variables and from ZB in n Syllable and Onset Duration . An inverted U-shaped pattern was also found for the duration of the rising and falling f0 movements (Figs. 6 and 8). Concomitantly, a U-shape pattern was attested for the slope of the fall, meaning that the shorter fall durations of the peripheral varieties co-occurred with steeper slopes (Fig. 8). A similar concomitant U-shape for the slope of the rise is obscured by the fact that WL/WH have rather high beginnings of the rise, reducing its size and slope. Moreover, the alignment of the delay of the rising-falling movement was greatest in the ‘central’ varieties than in the ‘peripheral’ varieties, as shown in particular by the H-delay variable. Because of the normalizations of these phonetic measures for the pitch range and duration of the stretch from the beginning of the accented syllable to the utterance end, this ‘central’ bulge in the data is independent of the less pronounced U-shaped pattern that is observable in the duration of pitch range of the nuclear stretch (Fig. 2, Table 8). Overall, therefore, the ‘central’ varieties have wider f0 excursions of the rise-falls, with peaks in particular reaching higher values, and take more time to execute them, while also allowing them to be aligned later. Putting it differently, the accentual gestures of the ‘peripheral’ varieties ZB, WL and WH are more compact than the accentual gestures of the ‘central’ varieties, as shown by shorter segmental durations in the ‘peripheral’ varieties, shorter rise and fall movements, and smaller excursions sizes. An explanation of the inverted U-shape may be found in the greater prestige enjoyed by speakers of the ‘central’ varieties in the Dutch heartland, whose speech is closest to Standard Dutch. The widely attested greater prestige of the heartland, also known as the Randstad, leads one to expect that its pronunciation patterns are innovative, and so the expansion of the nuclear rise-fall most probably is an innovation. The fact that WH does not show this expansion should not be surprising. First, as discussed below, we 204 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 have no indication that the Weener speakers adopted prosodic features from regional or national standard German speech. Second, any innovations in the Dutch heartland are unlikely to coincide with innovations in standard varieties of German, since Dutch and German have been autonomous for centuries (Chambers & Trudgill, 1998).4 The effect of geographical proximity is one of the most interesting findings in this investigation. Neighboring varieties, even those belonging to different language groupings, like AM () and GR (Frisian), tend to behave more like each other than geographically distant varieties, like ZB (Low Franconian) and GR, or even ZB and AM, both of which belong to the Low Franconian dialect group. At the same time, it is also clear that some of the variation is attributable to the different language groups that our varieties belong to. In fact, the largest differences were found between varieties spoken in the Netherlands and in Germany. While both of these encompass varieties belonging to different languages (Dutch and Frisian for the Netherlands and Low Saxon and High German for Germany), within either group a single standard contact language is recognized, Standard Dutch and Standard German, respectively. In this context, we also refer to the strikingly higher beginning point of the rise in WL/WH, which must in part be due to more frequent occurrence of H-ending prenuclear pitch accents in the two German varieties (Fig. 7). Two other dialect effects are singled out for comment. First, Weener speakers produced longer nuclear durations when speaking Low German (WL) than High German (WH) (Table 10). Rather than attributing this difference to phonetic differences between the regional standard and the Weener dialect, we suggest that speakers were less familiar with reading Low German than High German, something which may have reduced their rate of speech when reading the Low German test sentences. Overall, differences between WL and WH are small, and the question arises to what extent prosodic features of the kind we have discussed are as readily manipulated by speakers as are more salient differences between standard and regional varieties, like and lexis, or even segmental features. As noted by a reviewer, this conclusion tallies with Atterer and Ladd (2004), who found that Southern and Northern speakers of German transferred their different rise-fall alignment patterns to English, rather than adopting English alignments. Second, the fact that RO and AM tend to have longer word, syllable and onset durations than ‘peripheral’ dialects, particularly WL/ WH, must not be interpreted to mean that their speech tempo is slower. The finding only applies to the accented words and syllables, not to the ‘nuclear durations’. At best therefore it may be concluded that that RO and AM speakers differentiate segmental durations between focused and non-focused positions to a larger extent than other speakers.5 Finally, the avoidance of sww patterns for the target words was found in the peripheral varieties of ZB and WL/WH. This is a U- shaped pattern again, which is arguably related to the shorter segmental durations, shorter rises and shorter falls in the peripheral dialects, such that all four phenomena conspire to create a ‘compact’ accentual foot.

4.3. Strategies of signaling corrective focus

The effects summarized in Section 4.1 call for an attempt to interpret the single measures as symptoms of a more general articulatory strategy that is adopted in the realization of corrective focus. The overall dependent variable in our experiment can be characterized as information weight or communicative urgency, which was expected to lead to greater articulatory emphasis in the realization of the accented syllable in the phonological domain in which the increased information weight is located, following a long tradition of phonetic research on . By the side of spectral effects, this should lead us to expect greater durations of the accented syllable or word and greater precision or distinctiveness in the execution of the relevant pitch contour, i.e. hyperarticulation (Lindblom, 1990). An increase in the distinctiveness of a rising-falling pitch contour on the accented syllable can be achieved by expanding the rising part and/or the falling part, while counteracting any side effects of these actions. An idealized, but unrealistic result would be a lower beginning of the rise, a lower end of the fall and a higher peak, with increased slopes, modulo any segmental duration increases. Instead of this mechanical ideal, compromises can be found in peak delay and L1 retraction, measures that afford more time for the rise, and peak retraction and L2 delay, measures that afford more time for the fall. Since the attenuating measures for the rise slope conflict with those for the fall slope where the location of the peak is concerned, we may expect both peak retraction and peak delay in different groups of speakers, depending on which part of the rise-fall contour they are keener to hyperarticulate. Moreover, enhancement strategies may exploit side effects (Stevens & Keyser, 1989; Kingston & Diehl, 1994; Keyser & Stevens, 2006). Our results consistently conform to this characterization for the durations of the accented word, the accented syllable and the onset consonant of the accented syllable in all varieties, showing that the speakers take more time for the execution of the pitch gesture in the crucial locations. They also agree with this scenario in the consistent expansion of the rise excursion and the expansion of the fall excursion in the varieties spoken in the Netherlands. The f0 alignment data in Fig. 9 conform to this pattern in showing delayed peaks across focus conditions in the ‘peripheral’ varieties including RO. In AM, the longer durations of the rise and fall (Figs. 6 and 8) may have counteracted a delay of the peak. The prediction that the end of the fall will be delayed was not confirmed. The data for L2 Delayn in Fig. 9 appear to be rather variable, with GR and WL/WH in particular showing constant timings across focus conditions. A predicted retraction of the beginning of the rise was not attested either. Again, the variation across focus conditions and dialects was very large, as shown in particular in the unexpected behavior of ZB, where the beginning of the rise is

4 In North-Western Germany, High German was first acquired as a second language by speakers of Low German some 500 years ago, and we expect the intonation of High German in that area to be affected by the intonation of Low German. However, Atterer and Ladd (2004) provide no pitch range data. 5 There are indications that speakers from the Randstad speak faster than speakers from other areas in the Netherlands (Verhoeven, de Pauw, & Kloots, 2004; Quené, 2008). J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 205 delayed as focus becomes narrower (Fig. 9). We also note that no variety retracts the peak in corrective focus. While our account could have accommodated peak retractions, this negative finding is not in conflict with the hyperarticulation strategy outlined above.6 Summarizing, our investigation has shown that speakers from different locations on the arc from the south-west to the north-east of the Netherlands and north-west of Germany adopt different strategies in signaling increasing degrees of significance of the focus in the phonetics of the pitch accent that realizes that focus. Speakers of Hollandic Dutch and West Frisian expanded the pitch span on the accented word, whereas speakers of Low and High German rescaled single targets of the accentual f0 gesture, and speakers of Zeelandic Dutch mixed both strategies. There was overwhelming evidence of phonetic gradience reflecting the geographical gradient patterning so as to suggest that the western Hollandic varieties are innovative in expanding the pitch range of the focus-marking pitch accent. At the same time, we found more substantial differences between the varieties spoken in the Netherlands on the one hand and the Weener varieties, which are spoken in Germany, on the other. The role of language contact therefore is certainly one aspect that needs more attention in future research. A final comment concerns the practice of recruiting participants for production experiments on prosody from a single university. Given that there is regional variation between groups of high school students in the Netherlands, as we demonstrated in this article, it is not implausible that groups of university students, who come from a wider area, will similarly show regional variation in what they and others may consider ‘Standard Dutch’. A more precise geographical definition of the participant group in such cases may therefore be advisable.

Acknowledgments

This study was carried out as part of the project Intonation in Varieties of Dutch, funded by the Netherlands Organisation of Scientific Research (NWO), Grant no. 360-7-180. We thank Marron C. Fort and Garrelt van Borssum for translating the test sentences into Dutch and German Low Saxon, respectively. We also thank Rachel Fournier and Joop Kerkhoff for valuable practical and technical support, Wilbert Heeringa and Martijn Wieling for statistical advice, and our student assistants Marjel van Dijk, Lian van Hoof, Jan Michalsky, and Renske Teeuw for their help with data collection and annotation. We are grateful to the anonymous reviewers, whose comments have contributed substantially to the quality of the analysis and presentation of our data.

Appendix A. Speech materials

SD Standard Dutch (used for ZB, RO, and AM) WF West Frisian (GR) DLS Dutch Low Saxon (WI) GLS German Low Saxon (WL) HG High German (WH) E English translation

Neutral focus

1. SD A Wat gaat er gebeuren? B Ze willen bakker Malberen belonen. WF A Wat sil der barre? B Se wolle bakker Malberen beleanje. DLS A Wat gaait ter gebeuren? B Zai willen bakker Malberen belonen. GLS A Wat is d‘r denn? B Se willen Backer Malberen belohnen. HG A Was ist da los? B Sie wollen Bäcker Malberen belohnen. E A What is going to happen? B They are going to reward baker Malberen.

2. SD A Waarom wordt de zaal versierd? B Ze gaan minister Melberen benoemen. WF A Wêrom wurdt de seal fersierd? B Se sille minister Melberen beneame. DLS A Woarom wort de zoal opsierd? B Zai goan minister Melberen benuimen. GLS A Waarum sünd ji na hen fahren? B Wi wullen Unkel Melberen besöken.

6 A small peak retraction (−11 ms) was reported for a group of Standard Dutch speakers by Hanssen et al. (2008). 206 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209

HG A Warum seid ihr nach Berlin gefahren? B Wir wollten Onkel Melberen besuchen. E A Why is the hall being decorated?/Why did you drive to Berlin? B They are going to nominate minister Melberen./ They wanted to visit minister Melberen.

3. SD A Had de roman een ‘happy end’? B Ja. Hij mocht 't huis van tante Molberen bewonen. WF A Hie de roman in ‘happy end’? B Ja. Hy mocht 't hûs fan tante Molberen bewenje. DLS A Haar de roman ain ‘happy end’? B Joa. Hai mog 't hoes van tante Molberen bewonen. GLS A Harr denn de Roman en Happy End? B Ja. He dürs dat Huus van Oma Molberen bewohnen. HG A Hatte der Roman ein Happy End? B Ja. Er durfte das Haus von Tante Molberen bewohnen. E A Did the novel have a ‘happy end’? B Yes. He was allowed to live in aunt Molberen’s house.

Corrective focus – focus domain¼word

1. SD A Wilde de agent meester Verdonck belonen?

B Meester Verdonck? Nee, hij wilde meester [Malberen]F belonen! WF A Woe de plysje master Verdonck beleanje?

B Master Verdonck? Nee, hy woe master [Malberen]F beleanje! DLS A Wol de dainder meester Verdonck belonen?

B Meester Verdonck? Nee, hai wol meester [Malberen]F belonen! GLS A Wull de Schandarm Mester Verdonck belohnen?

B Mester Verdonck? Nee, he wull Mester [Malberen]F belohnen! HG A Wollte der Polizist Lehrer Verdonck belohnen?

B Lehrer Verdonck? Nein, er wollte Lehrer [Malberen]F belohnen! E A Did the policeman want to reward teacher Verdonck?

B Teacher Verdonck? No, he wanted to reward teacher [Malberen]F!

2. SD A Zullen ze bakker van Hout benoemen?

B Bakker van Hout? Nee, ze zullen bakker [Melberen]F benoemen! WF A Soenen se bakker Van Hout beneame?

B Bakker van Hout? Nee, se sille bakker [Melberen]F beneame! DLS A Zellen ze bakker van Holt benuimen?

B Bakker van Holt? Nee, ze zellen bakker [Melberen]F benuimen! GLS A Willen se Backer van Holst besöken?

B Backer van Holst? Nee, se willen Backer [Melberen]F besöken! HG A Wollen sie Bäcker von Holst besuchen?

BBäcker von Holst? Nein, sie wollen Bäcker [Melberen]F besuchen! E A Are they going to nominate/visit the baker van Hout?

B Baker van/von Hout/Holst? No, they are going to nominate/visit baker [Melberen]F.

3. SD A Zal tante Eva 't huis met Peter van der Vaart bewonen?

B Peter van der Vaart? Nee, ze zal 't huis met Peter [Molberen]F bewonen! WF A Sil tante Eva 't hûs mei Peter van der Vaart bewenje?

B Peter van der Vaart? Nee, se sil 't hûs mei Peter [Molberen]F bewenje! DLS A Zel tante Eva 't hoes mit Peter van der Vaart bewonen?

B Peter van der Vaart? Nee, ze zel 't hoes mit Peter [Molberen]F bewonen! GLS A Sall Tant Eva dat Huus mit Peter van der Vaart bewohnen?

B Peter van der Vaart? Nee, se sall dat Huus mit Peter [Molberen]F bewohnen. HG A Soll Tante Eva das Haus mit Peter van der Vaart bewohnen?

B Peter van der Vaart? Nein, sie soll das Haus mit Peter [Molberen]F bewohnen! E A Is aunt Eva going to live with Peter van der Vaart in the house?

B Peter van der Vaart? No, she is going to live with Peter [Molberen]F in the house! J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 207

Corrective focus – focus domain¼syllable

1. SD A Zullen die mensen dokter Lomberen belonen?

B Dokter Lomberen? Nee, ze zullen dokter [Mal]Fberen belonen! WF A Soenen dy minsken dokter Lomberen beleanje?

B Dokter Lomberen? Nee, se sille dokter [Mal]Fberen beleanje! DLS A Zellen de minsken dokter Lomberen belonen?

B Dokter Lomberen? Nee, zai zellen dokter [Mal]Fberen belonen! GLS A Sölen de Minsken Dokter Lomberen belohnen?

B Dokter Lomberen? Nee, se sölen Dokter [Mal]Fberen belohnen! HG A Sollen die Menschen Doktor Lomberen belohnen?

B Doktor Lomberen? Nein, sie sollen Doktor [Mal]Fberen belohnen! E A Are those people going to reward doctor Lomberen?

B Doctor Verdonck? No, they wanted to reward doctor [Mal]Fberen!

2. SD A Wilde Jan tante Lumberen benoemen?

B Tante Lumberen? Nee, hij wilde tante [Mel]Fberen benoemen! WF A Woe Jan tante Lumberen beneame?

B Tante Lumberen? Nee, hy woe tante [Mel]Fberen beneame! DLS A Wol Jan tante Lumberen benuimen?

B Tante Lumberen? Nee, hai wol tante [Mel]Fberen benuimen! GLS A Wull Jan Oma Lümberen besöken?

B Oma Lümberen? Nee, he wull Oma [Mel]Fberen besöken! HG A Wollte Jan Oma Lümberen besuchen?

B Oma Lümberen? Nein, er wollte Oma [Mel]Fberen besuchen! E A Did Jan want to nominate aunt Lumberen? B Aunt Lumberen/Grandma Lümberen?

No, he wanted to nominate/visit aunt/grandma [Mel]Fberen!

3. SD A Wil Rob Meijer 't huis met Paul de Lamberen bewonen?

B Met Paul de Lamberen? Nee, hij wil 't met Paul de [Mol]Fberen bewonen! WF A Wol Rob Meijer 't hûs mei Paul de Lamberen bewenje?

B Mei Paul de Lamberen? Nee, hy wol 't mei Paul de [Mol]Fberen bewenje! DLS A Wil Rob Meijer 't hoes mit Paul de Lamberen bewonen?

B Mit Paul de Lamberen? Nee, hai wil 't mit Paul de [Mol]Fberen bewonen! GLS A Will Rob Meier dat Huus mit Paul de Lamberen bewohnen?

B Mit Paul de Lamberen? Nee, he will dat mit Paul de [Mol]Fberen bewohnen! HG A Will Rob Meier das Haus mit Paul de Lamberen bewohnen?

B Mit Paul de Lamberen? Nein, er will es mit Paul de [Mol]Fberen bewohnen! E A Does Rob Meijer want to live with Paul de Lamberen in the house?

B With Paul de Lamberen? No, he want's to live in it with Paul de [Mol]Fberen!

Corrective focus – focus domain¼phoneme

1. SD A Mag ik Anne Nalberen belonen?

B Anne Nalberen? Nee, je mag Anne [M]Falberen belonen! WF A Mei ik Anne Nalberen beleanje?

B Anne Nalberen? Nee, do meist Anne [M]Falberen beleanje! DLS A Mag ik Anne Nalberen belonen.

B Anne Nalberen? Nee, doe magst Anne [M]Falberen belonen! GLS A Dür ik Anne Nalberen belohnen?

B Anne Nalberen? Nee, du dürst Anne [M]Falberen belohnen! HG A Darf ich Anne Nalberen belohnen?

B Anne Nalberen? Nein, du darfst Anne [M]Falberen belohnen! E A May I reward Anne Nalberen?

B Anne Nalberen? No, you may reward Anne [M]Falberen!

2. SD A Willen ze pater Nelberen benoemen?

B Pater Nelberen? Nee, ze willen pater [M]Felberen benoemen! WF A Wolle se pater Nelberen beneame?

B Pater Nelberen? Nee, se wolle pater [M]Felberen beneame! DLS A Willen ze poater Nelberen benuimen?

B Poater Nelberen? Nee, ze willen poater [M]Felberen benuimen! 208 J. Peters et al. / Journal of Phonetics 46 (2014) 185–209

GLS A Willen se Vader Nelberen besöken?

B Vader Nelberen? Nee, se willen Vater [M]Felberen besöken! HG A Wollen sie Vater Nelberen besuchen?

B Vater Nelberen? Nein, sie wollen Vater [M]Felberen besuchen! E A Do they want to nominate/visit father Nelberen?

B Father Nelberen? No, they want to nominate/visit father [M]Felberen!

3. SD A Mag Bram Wiering 't huis met Jan de Nolberen bewonen?

B Met Jan de Nolberen? Nee, hij mag 't huis met Jan de [M]Folberen bewonen! WF A Mei Bram Wiering 't hûs mei Jan de Nolberen bewenje?

B Mei Jan de Nolberen? Nee, hy mei 't hûs mei Jan de [M]Folberen bewenje! DLS A Mag Bram Wiering 't hoes mit Jan de Nolberen bewonen?

B Nee, hai mag 't hoes mit Jan de [M]Folberen bewonen! GLS A Dürt Bram Wiering dat Huus mit Jan de Nolberen bewohnen?

B Mit Jan de Nolberen? Nee, he dürt dat mit Jan de [M]Folberen bewohnen! HG A Darf Bram Wiering das Haus mit Jan de Nolberen bewohnen?

B Mit Jan de Nolberen? Nein, er darf es mit Jan de [M]Folberen bewohnen! E A May Bram Wiering live with Jan de Nolberen in the house?

B With Jan de Nolberen? No, she may live with Jan de [M]Folberen in the house!

Appendix B. R script code used for mixed-effect model analysis

require(lme4) require(lmerTest) require(multcomp)

table<- read.delim("TableNEW.csv", dec¼",") colnames(table)[1]<- "Dialect" colnames(table)[2]<- "Focus" colnames(table)[3]<- "Sentence" colnames(table)[4]<- "Speaker"

colnames(table)[6]<- "V1.Nuclear.Duration " colnames(table)[7]<- "V2.Nuclear.Pitch.Range" colnames(table)[8]<- "V3.Onset.Dur.re.ND" colnames(table)[9]<- "V4.Syll.Dur.re.ND" colnames(table)[10]<- "V5.Word.Dur.re.ND" colnames(table)[11]<- "V6.Rise.Size.re.NPR" colnames(table)[12]<- "V7.Fall.Size.re.NPR " colnames(table)[13]<- "V8.Rise.Time.re.ND" colnames(table)[14]<- "V9.Fall.Time.re.ND" colnames(table)[15]<- "V10.Rise.Slope.re.PR.ND" colnames(table)[16]<- "V11.Fall.Slope.re.PR.ND" colnames(table)[17]<- "V12.L1.Delay.re.ND" colnames(table)[18]<- "V13.H.Delay.re.ND" colnames(table)[19]<- "V14.L2.Delay.re.ND"

table$Dialect<- as.factor (table$Dialect) table$Focus<- as.factor (table$Focus) table$Sentence<- as.factor (table$Sentence) table$Speaker<- as.factor (table$Speaker)

for (i in (6:19)) { cat("\n","Processing ",names(table)[i],"\n\n") variable<- as.numeric(table[i]) ´ model.lmer¼lmer(variable~Dialect⁎Focus+(1|Speaker)+(1|Sentence),data¼table,REML¼TRUE) print(anova(model.lmer))

cat("\n","Multiple comparisons for Dialect\n") model.lmer¼lmer(variable~Dialect+Focus+(1|Speaker)+(1|Sentence),data¼table,REML¼TRUE) print(summary(glht(model.lmer,linfct¼mcp(Dialect¼"Tukey"),alternative¼"two.sided")))

cat("\n","Multiple comparisons for Focus\n") model.lmer¼lmer(variable~Dialect+Focus+(1|Speaker)+(1|Sentence),data¼table,REML¼TRUE) J. Peters et al. / Journal of Phonetics 46 (2014) 185–209 209

print(summary(glht(model.lmer,linfct¼mcp(Focus¼"Tukey"),alternative¼"two.sided")))

cat("\n","Multiple comparisons for the interaction Dialect*Focus\n") table$DialectFocus¼interaction(table$Dialect,table$Focus) model.lmer¼lmer(variable~DialectFocus+(1|Speaker)+(1|Sentence),data¼table,REML¼TRUE) print(summary(glht(model.lmer,linfct¼mcp(DialectFocus¼"Tukey"),alternative¼"two.sided"))) }

References

Atterer, M., & Ladd, D. R. (2004). On the phonetics and of ‘segmental anchoring’ of F0: evidence from German. Journal of Phonetics, 32, 177–197. Baart, J. L. G. (1987). Focus, syntax and accent placement (Ph.D. dissertation). Leiden: The Netherlands. Bartels, C., & Kingston J. (1994). Salient pitch cues in the perception of contrastive focus. In P. Bosch, & R. van der Sandt (Eds.), Focus and natural language processing (pp. 1–10). IBM Working Papers 6. Batliner, A. (1989). Fokus, Modus und die große Zahl. Zur intonatorischen Indizierung des Fokus im Deutschen. In H. Altmann, A Batliner, & W. Oppenrieder (Eds.), Zur Intonation von Modus und Fokus im Deutschen (pp. 21–70). Tübingen: Niemeyer. Baumann, S., Grice, M., & Steindamm, S. (2006). Prosodic marking of focus domains – categorical or gradient? Speech Prosody 2006, , Germany, May 2–5, 2006. Baumann, S., Becker, J., Grice, M., & Mücke, D. (2007). Tonal and articulatory marking of focus in German. In Proceedings of the 16th ICPhS (pp. 1029–1032), Saarbrücken . Boersma, P., & Weenink, D. (2008). Praat: Doing phonetics by computer (Version 5.0.25) [computer program]. 〈http://www.praat.org/〉 Retrieved 31.05.08. Bolinger, Dwight (1989). Intonation and its uses: Melody in and discourse. Stanford, CA: Stanford University Press. Chafe, W. (1976). Givenness, contrastiveness, definiteness, subjects, topics, and points of view. In C. Li (Ed.), Subject and topic (pp. 27–55). New York: Academic Press. Chambers, J. K., & Trudgill, Peter (1998). (2nd ed.). Cambridge: Cambridge University Press; Cambridge. Chen, S., Wang, B., & Xu, Y. (2009). Closely related languages, different ways of realizing focus. Interspeech,6–10 September, Brighton, UK, pp. 1007–1010. Chen, Y. (2006). Durational adjustment under corrective focus in Standard Chinese. Journal of Phonetics, 34, 176–201. Chen, Y., & Gussenhoven, C. (2008). Emphasis and tonal implementation in Mandarin Chinese. Journal of Phonetics, 36, 724–746. Cooper, W. E., Eady, S. J., & Mueller, P. R. (1985). Acoustical aspects of contrastiveness in question–answer contexts. Journal of the Acoustical Society of America, 77, 2142–2156. Dalton, M., & Ní Chasaide, A (2005). Tonal alignment in Irish dialects. Language and Speech, 48, 441–464. del Giudice, A., Shosted, R. K., Davidson, K., Salihie, M., & Arvaniti, A (2007). Comparing methods for locating pitch “elbows”.InProceedings of ICPhS XVI (pp. 117–1120). de Pijper, Jan Roelof (1983). Modelling British English intonation. : Foris. Eady, S. J., & Cooper, W. E. (1986). Speech intonation and focus location in matched statements and questions. Journal of the Acoustical Society of America, 80, 402–415. Eady, S. J., Cooper, W. E., Klouda, G. V., Mueller, P. R., & Lotts, D. W. (1986). Acoustical characteristics of sentential focus: Narrow vs. broad and single vs. dual focus environments. Language and Speech, 29, 233–251. Féry, C., & Kügler, F. (2008). Pitch accent scaling on given, new and focused constituents in German. Journal of Phonetics, 36, 680–703. Flege, James E. (2007). Language contact in bilingualism: Phonetic system interactions. In Jennifer Cole, & José I. Hualde (Eds.), Laboratory phonology, Vol. 9 (pp. 353–381). Berlin: Mouton de Gruyter. Gilles, P. (2005). Regionale Prosodie in Deutschen. Variabilität in der Intonation von Abschluss und Weiterweisung. Berlin & New York: de Gruyter. Grabe, E. (2004). Intonational variation in urban dialects of English spoken in the British Isles. In P. Gilles, & J. Peters (Eds.), Regional variation in intonation (pp. 9–31). Tübingen: Niemeyer. Grabe, E., Post, B., Nolan, F., & Farrar, K. (2000). Pitch accent realisation in four varieties of British English. Journal of Phonetics, 28, 161–185. Gussenhoven, C. (2004). Tone in Germanic: Comparing Limburgian with Swedish. In Hiroya Gunnar Fant, Jianfen Fujisaki, Cao, & Yi Xu (Eds.), From traditional phonology to modern speech processing (pp. 129–136). Beijing: Foreign Language Teaching and Research Press. Gussenhoven, C. (2005). Types of focus in English. In: Matthew Chungmin Lee, Gordon, & Daniel Büring (Eds.), Topic and focus: Cross-linguistic perspectives on meaning and intonation (pp. 83–100). Heidelberg, New York, London: Springer. Hanssen, J., Peters, J., & Gussenhoven, C. (2008). Prosodic effects of focus in Dutch declaratives. In Proceedings of the international conference on speech prosody,6–9 May 2008, Campinas, Brazil. ’t Hart, Johan (1998). Intonation in Dutch. In: Daniel Hirst, & Albert Di Cristo (Eds.), Intonation systems: A survey of twenty languages (pp. 96–111). Cambridge: Cambridge University Press. van Heuven, V. (1994). What is the smallest prosodic domain? In P. Keating (Ed.), Phonological structure and phonetic form. Papers in laboratory phonology III (pp. 76–98). Cambridge: Cambridge University Press. Katz, Jonah, & Selkirk, Elisabeth (2011). Contrastive focus vs. discourse-new: Evidence from phonetic prominence in English. Language, 87, 771–816. Keyser, S. J., & Stevens, K. N. (2006). Enhancement and overlap in the speech chain. Language, 82,33–63. Kingston, J., & Diehl, Randy L. (1994). Phonetic knowledge. Language, 70, 419–454. Kiss, K. E. (1998). Identificational focus versus information focus. Language, 74, 245–273. Kügler, F. (2008).The role of duration as a phonetic correlate of focus. In Proceedings of the international conference on speech prosody,6–9 May 2008, Campinas, Brazil. Kügler, F., & Gollrad, A. (2011). Production and perception of contrast in German, Proceedings of 17th international congress of phonetic science 2011 (pp. 1154–1157), Hong Kong. Ladd, D. R. (1980). The structure of intonational meaning. Evidence from English. Bloomington & London: Indiana University Press. Ladd, D. R., Schepman, A., White, L., Quarmby, L. M., & Stackhouse, R. (2009). Structural and dialectal effects on pitch peak alignment in two varieties of British English. Journal of Phonetics, 37, 145–161. Lindblom, B (1990). Explaining phonetic variation: a sketch of the H & H theory. In W. J. Hardcastle, & A Marchal (Eds.), Speech production and speech modelling (pp. 403–439). Dordrecht: Kluwer. Quené, H. (2008). Andante of allegro? Verschillen in spreektempo tussen Vlamingen en Nederlanders. Onze Taal, 77, 179–181. Selkirk, E. (2008). Contrastive focus, givenness and the unmarked status of discourse-new. Acta Linguistica Hungarica, 55, 331–346. Rooth, M. (1985). Association with focus. Amherst, MA: University of Massachusetts dissertation. Rooth, M. (1992). A theory of focus interpretation. Natural Language Semantics, 1,75–116. Sityaev, D., & House, J. (2003). Phonetic and phonological correlates of broad, narrow and contrastive focus in English. In Proceedings of the 15th ICPhS, Barcelona (pp. 1819–1822). Steindamm, & Susanne (2005). Fokusprojektion und Akzenttypen im Deutschen – Produktion und Perzeption (MA thesis). University of Cologne. Stevens, K. N., & Keyser, S. J. (1989). Primary features and their enhancement in consonants. Language, 65,81–106. Turk, A., Nakai, S., & Sugahara, M. (2006). Acoustic segment durations in prosodic research: A practical guide. In S. Sudhoff, D. Lenertová, R. Meyer, S. Pappert, P. Augurzky, I. Mleinek, N. Richter, & J. Schliesser (Eds.), Methods in empirical prosody research (pp. 1–28). Berlin, New York: De Gruyter. Uhmann, S. (1991). On the tonal disambiguation of focus structures. Journal of Semantics, 8, 219–238. van Leyden, K. (2004). Prosodic characteristics of Orkney and Shetland dialects: An experimental approach (Diss.). Leiden University. Verhoeven, J., de Pauw, G., & Kloots, H. (2004). Speech rate in a situation: a comparison between Dutch in and the Netherlands. Language and Speech, 47, 299–310.

Xu, Y. (1999). Effects of tone and focus on the formation and alignment of f0 contours. Journal of Phonetics, 27,55–105. Xu, Y., & Xu, C. X. (2005). Phonetic realization of focus in English declarative intonation. Journal of Phonetics, 33, 159–197.