
Sonorant transparency and the complexity of voicing in Polish

Patrycja Strycharczuk To appear in Journal of

Abstract Final devoicing and regressive assimilation have been reported to apply to obstruents in word-final obstruent+sonorant clusters in Polish. This phenomenon, interpreted as a case of sonorant transparency in generative phonological analyses of Polish voicing, has sparked a number of attempts to reconcile the transparency generalisation with phonological characteristics of other laryngeal processes in Polish. This paper formulates some predictions concerning the surface realisation of underlying voicing values that follow from the sonorant transparency hypothesis, and reports on a production experiment designed to test these predictions. Results show, contrary to the descriptive and theoretical literature, that word-final sonorants typically block final devoicing and voice assimilation. The minority of cases where voicing and devoicing appear to apply as predicted by transparency are analysed using mixed-effects modelling, with a view to determining what factors influence their occurrence. Based on the results, it is argued that apparent transparency cases are best explained as resulting from an interaction of phonological, phonetic and lexical factors, including segmental duration, prosodic boundary, and word size, which are known to affect the probability of vocal fold vibration, and that the systematic phonetic variation found in the data does not support the hypothesis that sonorants are transparent to laryngeal processes.

1 Introduction

Polish has been reported to display a rare form of final devoicing and voice assimilation, where an obstruent in a word-final obstruent+sonorant cluster undergoes final devoicing, or voice assimilation if an obstruent follows in the next word. A number of descriptive sources assert that only voiceless obstruents are found in Polish codas (Benni, 1959; Wierzchowska, 1980; Ostaszewska & Tambor, 2000). In consequence, obstruents in word-final obstruent+sonorant clusters undergo final devoicing even though they are not in absolute final position. Sonorants positioned between a voiceless obstruent and a phrase boundary are reported to also undergo devoicing. Relevant examples are in (1).

(1) Devoicing in word-final obstruent+sonorant sequences. Examples from Benni (1959) [IPA transcription added]
    [vjɑtr̥] wiatr ‘wind’
    [t͡sɛtr̥] cedr ‘cedar tree’
    [bɑɕɲ̊] baśń ‘fairy tale’
    [ˈbɔ.jɑɕɲ̊] bojaźń ‘fear’

Some sources mention that devoicing is not always realised. Benni (1959) notes that scientific terms ending in -izm and -yzm are often pronounced ‘more carefully’ with a voiced obstruent. Karaś & Madejowa (1977) report that the realisation of word-final obstruent+sonorant clusters varies with register. The most colloquial forms involve deletion of the word-final sonorants, especially in past tense forms such as [zgɑt] zgadł ‘guessed’, or [zjɑt] zjadł ‘ate’. More careful variants have devoicing of the entire obstruent+sonorant cluster ([zgɑtw̥], [zjɑtw̥]). However, according to Karaś & Madejowa (1977), forms with a voiced obstruent in word-final obstruent+sonorant clusters ([zgɑdw], [zjɑdw]) are on the rise, which the authors attribute to the influence of orthography on pronunciation. Sawicka (Dukiewicz & Sawicka, 1995) lists devoicing of word-final obstruent+sonorant clusters as optional. According to Sawicka, sonorants following voiceless obstruents in word-final clusters always undergo devoicing, as in [rɨtm̥] rytm ‘rhythm’. However, when the obstruent is underlyingly voiced, it can surface as either voiced or voiceless, as in [kɑdr]/[kɑtr̥] kadr ‘personnel, Gen.’.

In addition, word-final sonorants have been reported not to block regressive voice assimilation between flanking obstruents, as illustrated in (2) with data from Sawicka (1995). Sawicka’s description does not mention optionality with respect to voice assimilation across a word-final sonorant.

(2) Voice assimilation across a sonorant. Data from Dukiewicz & Sawicka (1995)
    [ˌmɨʑl#bɔ.ˈgɑ.tɑ] myśl bogata ‘rich thought’
    [kɑtr̥#ˈfji.lmu] kadr filmu ‘film frame’

Finally, Rubach & Booij (1990) and Rubach (1996, 2008) report that, unlike in the cases presented in (2), a word-initial sonorant blocks voice assimilation between flanking obstruents, as shown in (3). The description is said to be based on Rubach’s own observations coupled with recordings of University of Warsaw students, but no specifics concerning the recordings are discussed.

(3) Voice assimilation blocked by an intervening word-initial sonorant. Data from Rubach (2008) [IPA transcription added]
    [brɑk#ˈrdzɨ] brak rdzy ‘lack of rust’
    [ˌvi.dɔk#ˈmʒɑ.fki] widok mżawki ‘the sight of a drizzle’

Generalising from cases such as those exemplified in (1)-(3), a number of phonologists, including Bethin (1984), Gussmann (1992), and Rubach (1996, 2008), have proposed that word-final sonorants in Polish are laryngeally transparent, i.e. invisible to all categorical laryngeal processes. The generalisation has had a considerable impact in phonological literature, and it has been cited as evidence for a number of theories. Gussmann (1992) uses sonorant transparency to support a theory that final devoicing is conditioned by syllable structure, even though it is mostly observed at the Prosodic Word boundary. Scheer (2004) argues for a theory of syllable-conditioned segment licensing, which demotes syllable-structure violating sonorants to obstruents. Finally, Rubach (1996, 2008) proposes that syllabification interacts with licensing-by-cue in conditioning voicing in Polish. On the basis of the asymmetry between cases like (2) and cases like (3), Rubach (1996) argues that word-initial, but not word-final, sonority-violating sonorants can license voicing contrast in the preceding obstruent. Rubach (2008) formalises this proposal within Optimality Theory (Prince & Smolensky, 2004 [1993]) by expanding pre-sonorant faithfulness (pre-sonorant contrast licensing, cf. Steriade (1999)) with pre-vocalic faithfulness, and subjecting them to ranking.

However, the generalisations concerning the occurrence of sonorant transparency in word-final obstruent+sonorant clusters are not confirmed by what appears to be the only phonetic study of the subject to date, Castellví-Vives (2003). Based on phonetic data from 4 Polish speakers, Castellví-Vives (2003) reports that obstruent devoicing is by no means the default way of realising voicing in word-final obstruent+sonorant clusters. Castellví found that the underlying voicing was most commonly realised on the surface in obstruents followed by word-final sonorants. The second most frequent variant was final sonorant deletion, followed by sonorant devoicing. The cases of sonorant transparency with a devoiced obstruent followed by a voiced sonorant were very few.
Castellví also reports that sonorant transparency occurred more frequently with word-final nasals than with other sonorant subclasses, and that transparency was rare in words ending in a glide. An asymmetry with respect to the presence or absence of surface voicing was also observed, as transparency involving a glide was more frequent when a voiced obstruent followed than when the following obstruent was voiceless. Castellví’s findings suggest that the realisation of voicing of word-final obstruent+sonorant clusters is more complex than previously reported, and they invite further research into the subject.

Further motivation for a closer scrutiny of sonorant transparency in Polish comes from the controversy surrounding voicing in Russian, where sonorant transparency has also been reported, although the reports for Russian concern transparency to voice assimilation only, not final devoicing, and the assimilation is said to be limited to obstruent#sonorant+obstruent sequences. The report goes back to Jakobson (1978), and has featured in phonological analyses of voicing in Russian, including Petrova & Szentgyörgyi (2004) and Rubach (2008). However, some linguists question whether there is indeed voicing in word-final obstruents followed by a sonorant+obstruent cluster (see Shapiro (1993) for an extensive review of relevant sources on Russian). None of the published phonetic studies provides evidence that sonorant transparency is a categorical phonological process in Russian. Robblee & Burton (1997) in their analysis of obstruent+sonorant+obstruent clusters report no effect of assimilation. Kulikov (2010) has found some cases of voice assimilation between obstruents across a sonorant in fast Russian speech, but argues that the assimilation is highly variable, phonetic and gradient. The gradient view of sonorant transparency in Russian is also taken by Padgett (2002, 2012). Padgett (2012) presents acoustic data from a native speaker of Russian, which show a varying degree of gradient closure voicing in word-final obstruents followed by a sonorant+obstruent cluster. In an attempt to explain the discrepancy between the data and some linguists’ reports of sonorant transparency in Russian, Padgett (2012) hypothesises, following Robblee & Burton (1997), that voice assimilation across an intervening sonorant may be perceived (by language users, including linguists who report sonorant transparency to occur) even where there is no acoustic evidence for categorical assimilation. The discrepancy between perceptual and acoustic evidence in Russian is also of consequence to other languages, indicating that reports of assimilation based on individuals’ experience and perception of language may not be representative of speakers’ actual production.

This study provides an analysis of the phonetic realisation of underlying voicing in Polish in word-final obstruent+sonorant clusters in different morphosyntactic environments: at the end of a sentence (OS###), at the end of a phrase (OS##), word-finally before another obstruent (O1S#O2), as well as in word-final obstruents followed by word-initial sonorant+obstruent clusters in the next word (O1#SO2). The strength of the prosodic boundary associated with the syntactic boundary will be controlled by means of a number of discrete and continuous criteria (including presence or absence of pause, presence or absence of boundary tones, and duration of the preceding vowel). Participants recruited for the study were from Warsaw (experiment 1) and central Poland (experiment 2).
Dialects spoken in these areas are reported to devoice word-final obstruents (as opposed to South-Western Polish dialects where final devoicing does not always apply).

The study had two aims and the experimental design was tailored accordingly. The first aim was to test the predictions concerning the realisation of underlying voicing values in obstruents followed by sonorants. Three specific predictions that follow from the sonorant transparency hypothesis were tested.

1) The underlying voicing contrast in obstruents in word-final clusters (OS### and OS##) is neutralised in the phonetic output. Neither underlyingly voiced nor underlyingly voiceless obstruents will surface as phonetically voiced, and the two groups will be phonetically indistinguishable.

2) The initial obstruent in a three-member O1S#O2 cluster assimilates in voicing to the rightmost obstruent in the cluster. Thus, an underlyingly voiceless O1 followed by a voiced O2 across a sonorant will surface with more voicing than an underlyingly voiced O1 followed by a voiceless O2 across a sonorant.

3) A word-final obstruent followed by a sonorant+obstruent cluster in the next word assimilates in voicing to the rightmost obstruent in the cluster. Thus, a word-final underlyingly voiceless O1 followed by a voiced O2 across a sonorant will surface with more voicing than a word-final underlyingly voiced O1 followed by a voiceless O2 across a sonorant.

The first two predictions have been stated as generalisations for Polish. The third one follows from the hypothesis that sonorants are transparent to voice assimilation (and has been proposed by some to occur in Russian, as discussed above), although the case of O1#SO2 has been argued by Rubach & Booij (1990) not to involve sonorant transparency. Rubach & Booij (1990) report that word-final obstruents undergo final devoicing before a sonorant in the next word, also when this sonorant is followed by another obstruent.

The second aim of the study was to analyse how various phonetic factors influence the phonetic realisation of underlying voicing in the environments where sonorant transparency has been reported. As previously mentioned, devoicing of a pre-sonorant obstruent in a final cluster is reported to be associated with the devoicing of the final sonorant by all consulted descriptive sources. The link is confirmed by Castellví-Vives (2003), who reports that the cases in his data coded as involving sonorant transparency frequently involved sonorant devoicing. Castellví also reports the effect of manner of articulation (e.g. occurrence of sonorant transparency is more frequent with word-final nasals). All these effects involved the treatment of sonorant transparency as a categorical response variable (transparency classified as present or absent), and the generalisations are formed on the basis of qualitative observation coupled with ANOVAs. Two potential research questions emerge from these findings. First, can the influence of phonetic factors, such as sonorant devoicing, or manner of articulation, also be seen on continuous response variables associated with surface realisation of voicing (duration of glottal pulsing, ratio of glottal pulsing duration to obstruent duration)? Second, is the surface realisation of voicing conditioned by any other factors? How do these individual factors contribute to the realisation of voicing, and how do they interact in a model of surface voicing of obstruents preceding sonorants? The current study addresses these two research questions by using mixed-effects modelling to explore the influence of multiple factors on the realisation of vocal fold vibration during obstruents in obstruent+sonorant clusters.
The effects examined in the model included the following:

1. Type of potential transparency (whether it would involve surface voicing or surface devoicing);
2. Duration of voicing during the sonorant;
3. Sonorant’s manner of articulation (whether the sonorant was a nasal, liquid or a glide);
4. O1’s place of articulation (whether O1 was labial, coronal or dorsal);
5. O1’s manner of articulation (whether O1 was a stop or a fricative);
6. O1’s duration;
7. Size of the word;
8. Prosody-related factors including:
   • Presence or absence of a following pause;
   • Presence or absence of a boundary tone;
   • Duration of the vowel in the final syllable;
   • Syntactic characteristics of the structure in which the OS cluster was embedded (whether it was a noun phrase, a verb phrase, an adverb phrase, or a clause).

The effects of underlying voicing and of the sonorant’s manner of articulation were included based on the finding by Castellví-Vives (2003) that the degree to which transparency occurs is greater with word-final nasals, and quite rare with word-final glides, and that sonorant transparency involving glides is more common when it consists in voicing than in devoicing. The presence of voicing during the sonorant was studied following Castellví’s findings, and previous descriptive reports that sonorant transparency frequently involves sonorant devoicing.

The inclusion of the remaining factors was motivated by literature reports of their interactions with phonetic and phonological voicing. Duration of glottal pulsing during closure has been found to vary with place of articulation in stops, being greater in labials than in coronals and velars (Keating, 1984). The degree of vocal fold vibration is also sensitive to manner of articulation of the potential voicing target. There is a typological tendency for voiced fricatives to be much less frequent than voiced stops (Ladefoged & Maddieson, 1996). Ohala (1983) argues that this asymmetry is due to inherent aerodynamic differences involved in the production of stops and fricatives. Specifically, fricative voicing is particularly difficult to sustain, as noted by Ohala (1983) and Ohala & Solé (2010), because the production of frication noise requires high intraoral pressure. This conflicts with aerodynamic constraints on voicing, as vocal fold vibration is initiated and maintained in the presence of a transglottal pressure drop that involves lower pressure above than below the glottis (Baer, 1975; Stevens, 1998). Aerodynamic factors are also of relevance with respect to the interaction of obstruent duration and surface voicing, particularly in the case of stops. Supraglottal pressure rises naturally during obstruent articulation (especially occlusion), so voicing is expected to cease at some point during obstruent articulation (Westbury & Keating, 1986).

The effect of word size was included following the reports by Wedel (2002) that short (monosyllabic) words resist voicing alternations in Turkish. This pattern is attributed by Wedel (2002) to neighbourhood density. Short words typically have dense neighbourhoods, i.e. they differ by only one feature/segment from other existing words. Resistance to alternation in dense neighbourhoods is conditioned by lexical access, as argued by Wedel (2002) (though cf. Becker & Nevins (2009) for counter-arguments).

Prosodic effects on the degree of coarticulation have been previously reported by Cho (2004) and Kuzla et al. (2007). Cho (2004) reports that the extent of vowel-to-vowel coarticulation in American English is influenced by prosody, with relatively less coarticulation in stronger prosodic positions. A similar prosodic influence was found by Kuzla et al. (2007) for progressive devoicing of German fricatives, which was more advanced when the intervening prosodic boundary was relatively weaker.

The rest of the paper is organised as follows. Section 2 presents the results from a pilot experiment, designed to test whether the underlying voicing contrast is neutralised in pre-sonorant stops at the end of an utterance. Section 3 introduces results from an experiment designed to test the predictions of the sonorant transparency hypothesis with respect to the realisation of underlying voice specifications in OS##, O1S#O2, and O1#SO2 sequences, where O1 and O2 conflict in their underlying voicing. In the second part of this section, glottal pulsing duration and ratio of glottal pulsing duration to obstruent duration (henceforth, voicing ratio) are analysed in a series of mixed-effects models applied to two subsets of data: 1) cases where the sonorant transparency hypothesis predicts surface devoicing of an underlyingly voiced obstruent, and 2) cases where the sonorant transparency hypothesis predicts surface voicing of an underlyingly voiceless obstruent. Section 4 goes on to argue that the attested cases of realisation consistent with the sonorant transparency hypothesis are marginal, and that they involve effects which are not predicted if sonorant transparency is considered to be a uniform phonological phenomenon. Section 5 concludes.

2 Sonorant transparency and final devoicing. Pilot study

The pilot study was set up as a preliminary investigation into obstruent voicing in word-final pre-sonorant position. The aim of the experiment was to establish whether obstruents in word-final OS### clusters have distinct voicing targets. Six female native speakers of Polish, aged 20-24, read two repetitions of the test stimuli. All the participants were originally from Warsaw or the surrounding area, and they were all living in Warsaw at the time of the experiment. They were naive as to the purpose of the experiment, and were not paid for their participation.

2.1 Materials and method

The test items were 10 words including word-final stop+sonorant sequences at the end of an utterance (OS###): /tr/, /dr/, /pr/, /br/, /tw/, /dw/, /kw/, /gw/. The test items were embedded in meaningful Polish sentences, as illustrated in (4).

(4) Sample stimulus sentence
    Prognoza przepowiada silny wiatr.
    ‘The forecast predicts strong wind.’

The test items were paired to correspond in the size of the word, the final sonorant, the preceding stop’s place of articulation and, where possible, the height of the preceding vowel. The full list of items is in the appendix.

The data were collected as a part of a larger study on the realisation of word-final obstruents in various segmental contexts. Some other data collected in the study were also analysed post hoc in addition to the data from OS### sequences. This was done in order to assess the potential effect of using written stimuli in the experimental design. The voicing distinction in Polish is reflected in the orthography, and since the stimuli were presented to the speakers in writing, voicing contrast might have been triggered by the spelling. An objection of this kind has been raised by numerous phoneticians in their criticism of incomplete neutralisation findings, including Jassem & Richter (1989) for Polish, who argue that incomplete final neutralisation in Polish, as reported by Slowiaczek & Dinnsen (1985), may be partially due to the use of written stimuli in phonetic experiments. The potential effect of orthography is analysed here based on word-final stops followed by a sonorant. These items, henceforth referred to as control items, involved 14 tokens of word-final coronal stops (/t/, /d/) followed by a sonorant (/r/, /l/, /m/, /n/, /w/, /j/, /o/) in the following word. These items were embedded in meaningful Polish sentences, as illustrated in (5).

(5) Kryzys gospodarczy powoduje nawrót lęku w społeczeństwie.
    ‘The economic crisis triggers a return of anxiety in the society.’

The sentences were presented to the participants on cards, one at a time. The stimuli were randomised by re-shuffling for each participant. The recordings were made in a sound-treated room, using a Behringer-B1 condenser microphone. The speakers were positioned 30 cm away from the microphone and instructed to read the sentences at a comfortable rate. They were encouraged to correct themselves if they made a mistake. The recordings were sampled at 44.1 kHz. Segmentation and acoustic analysis were carried out in Praat (Boersma & Weenink, 2010) on a 5 ms Gaussian window (spectrogram bandwidth 260 Hz). Boundaries were inserted manually based on visual analysis of the spectrograms. Altogether 288 utterances were recorded (2 repetitions of 10 test stimuli and 14 control stimuli pronounced by 6 subjects). 8 utterances were excluded due to deletions, mispronunciations, or segmentation difficulties, leaving 280 utterances for analysis.

The following acoustic measurements were taken. All of the measurements related to stops and had been previously recorded in studies on the voicing contrast in Polish, including Keating (1980), Slowiaczek & Dinnsen (1985), Jassem & Richter (1989), as well as in studies on laryngeal neutralisation in other languages, including Fourakis & Iverson (1984), Port & O’Dell (1985), Charles-Luce (1985), and Barry (1988).

1. Duration of glottal pulsing during closure. The presence of vocal fold vibration has been shown to be a primary acoustic correlate of the voicing contrast in true voice languages, including Polish (Keating, 1980). Voiced segments in Polish are typically realised with longer glottal pulsing, both in absolute terms and relative to closure duration, than their voiceless counterparts. Duration of glottal pulsing was measured manually based on the presence of the voicing bar on the spectrogram and periodicity in the waveform. Absence of voicing was coded as 0.

2. Stop closure duration. The voicing contrast has been shown to influence the duration of stop closure in some languages, where word-medial and word-final phonologically voiced stops have a shorter closure phase than phonologically voiceless stops (Chen, 1970; Kluender et al., 1988). Although this effect does not seem to have been observed for Polish, closure duration was recorded to help contextualise the measurements of the duration of glottal pulsing in terms of duration of occlusion. Closure was measured manually, based on the presence of low acoustic energy between the preceding vowel and the following stop release.

3. Vowel duration. Lengthening of the preceding vowel has been observed before phonologically voiced stops for a number of languages (Peterson & Lehiste, 1960; Chen, 1970; Kluender et al., 1988). Slowiaczek & Dinnsen (1985) report having found this effect for Polish, but their finding was not replicated by Jassem & Richter (1989). Keating (1980) reports that duration of the preceding vowel does not correlate with the voicing contrast. Vowel duration was measured manually. The beginning of the vowel was placed at the beginning of the formant structure for vowels preceded by obstruents. For vowels preceded by sonorants, the initial boundary was placed at the onset of the formant steady state. The boundary between the vowel and the following stop was placed at the onset of low acoustic energy at higher frequencies.

4. Duration of the burst. Phonologically voiced stops have been found to have a weaker and shorter burst than phonologically voiceless stops (Fischer-Jørgensen, 1954; Slis & Cohen, 1969), although the effect in Polish is said to be fairly weak (Keating, 1980). Burst was identified based on the presence of high frequency noise following the closure phase of the stop. Absence of burst was coded as 0.

5. Duration of voicing during the sonorant. This measurement was taken to investigate whether there is a correlation between obstruent and sonorant devoicing in OS## clusters, as reported by Castellví-Vives (2003). Sonorant voicing was identified based on the presence of periodicity in the waveform and the presence of a voicing bar at the bottom of a spectrogram. Absence of voicing was coded as 0.
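Measurements 1 and 2 are combined later in the analysis into a voicing ratio, i.e. glottal pulsing duration divided by closure duration. The Python sketch below illustrates that computation and a comparison of group medians. It is only an illustration: the function name `voicing_ratio` and all token values are hypothetical and are not data from the study.

```python
from statistics import median


def voicing_ratio(pulsing_ms, closure_ms):
    """Return glottal pulsing duration divided by closure duration."""
    if closure_ms <= 0:
        raise ValueError("closure duration must be positive")
    return pulsing_ms / closure_ms


# Hypothetical tokens as (pulsing_ms, closure_ms); invented for illustration.
# Absence of voicing is coded as 0, as in the measurement protocol.
voiced = [(60.0, 60.0), (48.0, 64.0), (70.0, 70.0)]
voiceless = [(0.0, 80.0), (9.0, 90.0), (0.0, 75.0)]

voiced_ratios = [voicing_ratio(p, c) for p, c in voiced]
voiceless_ratios = [voicing_ratio(p, c) for p, c in voiceless]

# Difference between the group medians of the voicing ratio.
median_difference = median(voiced_ratios) - median(voiceless_ratios)
print(median_difference)
```

A ratio near 1 corresponds to pulsing throughout the closure, and a ratio of 0 to a closure with no measurable pulsing.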

2.2 Results

An initial exploration of the data already shows that stops can surface as voiced when followed by a word-final sonorant. Figure 1 shows spectrograms of the pair kadr and wiatr as produced by speaker W3. There is a clear voicing bar extending from the vowel, through the closure and release phase of the stop in kadr, into the final sonorant. In comparison, the voicing tail from the vowel in wiatr is very short, leaving most of the stop voiceless. Voiced realisations, such as the one shown in Figure 1, were found to be very common for the studied speakers.

Figure 1: Realisation of utterance-final /dr/ and /tr/ by speaker W3

The recoverability of underlying voicing in OS### sequences was analysed with a generalised linear mixed-effects model, using the lme4 package (Bates & Maechler, 2009) in R (R Development Core Team, 2005), version 2.13.1. Whether the underlying voicing value could be recovered from the phonetic signal was the response variable in the model, and the effect of speaker was treated as random. The model achieved the best fit of the data with two fixed effects: duration of glottal pulsing, and duration of closure. A summary of the model is in Table 1. A voiceless underlying specification was more likely in obstruents with shorter duration of glottal pulsing (B=-0.11 (SE=0.02), z=-5.44, p<0.001), and longer duration of closure (B=0.04 (SE=0.02), z=2.73, p=0.006). A model which also involved the effect of preceding vowel duration and duration of burst did not achieve a significantly better fit of the data (log-likelihood test: χ2=2.98, p=0.22), and neither of the additional effects was significant (vowel duration: B=-0.03 (SE=0.02), z=-1.53, p=0.12; burst duration: B=0.01 (SE=0.01), z=1.13, p=0.26).

Table 1: Summary of the fixed part of a generalised linear mixed-effects model predicting the likelihood of the obstruent being underlyingly voiceless. Experiment 1.

Term                         β       SE      z       p
(Intercept)                 -0.07    1.34   -0.05    0.960
Glottal pulsing duration    -0.11    0.02   -5.44   <0.001
Closure duration             0.04    0.02    2.73    0.006

The results of mixed-effects modelling show that the underlying voicing value is not neutralised on the surface, as the presence of underlying voicing is associated with an increase in the duration of vocal fold vibration, and a decrease in closure duration. The surface contrast between underlyingly voiced and underlyingly voiceless obstruents in the current data was found to be quite robust. Boxplots in Figure 2 illustrate the difference in the realisation of underlying voicing. There is a large difference between the two groups in terms of glottal pulsing duration, with a median difference of 56.41 ms. The difference is even more robust in terms of voicing ratio, i.e. glottal pulsing duration divided by closure duration, with a median difference of 1.

Figure 2: Boxplots of glottal pulsing duration (left) and voicing ratio (right) as a function of the underlying voicing value in OS### sequences in the pilot study.

In order to establish whether the underlying voicing contrast is neutralised in O#S sequences, a generalised linear mixed-effects model was fitted to the control data, i.e. tokens of word-final coronal stops followed by sonorants in the next word. The effect of speakers was analysed as random. Four fixed effects were considered in the modelling: duration of glottal pulsing, duration of closure, duration of the preceding vowel, and the duration of burst. No subset of these four predictors yielded a model with any significant fixed effects. In a model based on glottal pulsing and closure duration alone, analogous to the model described above for the OS### sequences, neither the effect of glottal pulsing duration, nor the effect of closure duration was significant (glottal pulsing duration: B=5.43 (SE=18.89), z=0.29, p=0.77; closure duration: B=1.54 (SE=11.83), z=0.13, p=0.9). The fit of the model did not improve significantly upon adding further fixed effects of vowel and burst duration (log-likelihood test: χ2=4.81, p=0.09), and neither of the added effects was significant (vowel duration: B=22.75 (SE=16.12), z=1.41, p=0.16; burst duration: B=2.07 (SE=15.77), z=0.13, p=0.9).

This result shows that the underlying voicing value in the obstruents in O#S sequences is not associated with any of the effects which were significant in the case of OS### sequences. Thus, the underlying voicing contrast appears to be neutralised along these dimensions. Neutralisation in terms of glottal pulsing duration and voicing ratio is illustrated in the boxplots in Figure 3. In addition, the underlying voicing value was not found to be significantly correlated with other potential acoustic exponents of voicing, such as vowel duration, or duration of the burst. Thus, as far as the acoustic predictors analysed in the current study are concerned, the underlying voicing contrast was neutralised by the participants in word-final stops followed by a sonorant in the next word.

Figure 3: Boxplots of glottal pulsing duration (left) and voicing ratio (right) as a function of the underlying voicing value in O#S sequences in the pilot study.
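As a reading aid for Table 1, the fixed-effect estimates can be converted from the log-odds scale into a predicted probability that a stop is underlyingly voiceless. The Python sketch below does this under two assumptions that are not stated in the table: that durations enter the model in milliseconds, and that the random speaker effect is set to zero. The function name `prob_voiceless` is hypothetical.

```python
import math

# Fixed-effect estimates copied from Table 1 (log-odds of "underlyingly voiceless").
INTERCEPT = -0.07
B_PULSING = -0.11  # per unit of glottal pulsing duration (assumed to be ms)
B_CLOSURE = 0.04   # per unit of closure duration (assumed to be ms)


def prob_voiceless(pulsing_ms, closure_ms):
    """Predicted probability of an underlyingly voiceless stop.

    Uses the fixed effects only; the random speaker effect is ignored,
    which is an assumption made for this illustration.
    """
    log_odds = INTERCEPT + B_PULSING * pulsing_ms + B_CLOSURE * closure_ms
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical tokens: no pulsing with a long closure, and full pulsing
# throughout a shorter closure.
print(round(prob_voiceless(0.0, 80.0), 2))   # about 0.96
print(round(prob_voiceless(60.0, 60.0), 2))  # about 0.01
```

The negative coefficient for glottal pulsing lowers the predicted probability of a voiceless specification as pulsing gets longer, while the positive coefficient for closure duration raises it as closure gets longer.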

The results of modelling the underlying voicing value in obstruent in O#S sequences suggest that using written stimuli in the experimental set-up does not necessarily involve lack of neutralisation in the coda, and so the contrast found for obstruents in OS### sequences need not be an artefact of the experimental design. The evidence is indirect, as the prosodic boundaries were not strictly controlled for. The test tokens were at the Utterance boundary, while the control items were Prosodic Word-final. However, if anything, more devoicing (and hence less contrast) is expected with stronger prosodic boundaries, due to the open position of the glottis at the end of the utterance in anticipation of breathing1. This type of prosodic effect on devoicing is also reported by Tucker & Warner (2010), who found that for final nasals in Romanian devoicing occurs more frequently and to a greater extent at the end of an utterance than at the end of a word. In light of these findings, if final devoicing occurs in O#S, it is equally or even more likely to occur in OS###, as far as prosody is concerned, and the effect of orthography is not expected to vary between different types of items for the same speakers in the same experiment. Thus, the fact the devoicing

¹ I owe this observation to one of the reviewers.

is typically not observed in OS### tokens suggests that devoicing is blocked by the final sonorant.

2.2.1 Devoicing cases

From the data presented so far it follows that obstruent devoicing is not the default, or even the prevailing, realisation of the obstruent in an OS### cluster. However, some devoicing cases were found in the data. To provide the reader with a rough frequency estimate of the potential transparency cases, all the obstruent tokens in the data were classified as either voiced or voiceless on the surface. Classification was done by means of k-means clustering based on two vectors: glottal pulsing duration, and closure duration². The data points were assigned to two groups such that the sum of squares from points to the assigned cluster centres was minimised using the Hartigan-Wong algorithm (Hartigan & Wong, 1979), as illustrated in Figure 4.
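The classification step can be illustrated with a minimal sketch. It is not the analysis reported here: it uses Lloyd's algorithm rather than the Hartigan-Wong algorithm, the starting centres are chosen deterministically (the tokens with the shortest and longest glottal pulsing), and the eight tokens are invented values for illustration only. The names `kmeans_two_clusters` and `tokens` are hypothetical.

```python
# Minimal two-cluster k-means (Lloyd's algorithm, NOT Hartigan-Wong).
# Each token is (glottal pulsing duration in ms, closure duration in ms).
# The token values below are invented for illustration.

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_two_clusters(points, max_iter=100):
    # Assumption: start from the tokens with minimal and maximal pulsing.
    centres = [min(points, key=lambda p: p[0]), max(points, key=lambda p: p[0])]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each token goes to its nearest centre.
        new_labels = [min((0, 1), key=lambda c: sq_dist(p, centres[c]))
                      for p in points]
        # Update step: each centre becomes the mean of its members.
        new_centres = []
        for c in (0, 1):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centres.append(tuple(sum(v) / len(members)
                                         for v in zip(*members)))
            else:
                new_centres.append(centres[c])
        if new_labels == labels and new_centres == centres:
            break  # converged: neither assignments nor centres changed
        labels, centres = new_labels, new_centres
    return labels, centres

tokens = [(70, 72), (65, 66), (80, 85), (60, 64),    # long pulsing
          (0, 100), (5, 110), (10, 95), (0, 120)]    # little or no pulsing
labels, centres = kmeans_two_clusters(tokens)
print(labels)
print(centres)
```

With these invented tokens the cluster centred on long glottal pulsing would be interpreted as surface-voiced and the other as surface-voiceless. Because the two variables are left in raw milliseconds, the result depends on their scale, which is one reason such a classification is only a rough estimate.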

Figure 4: Results of k-means clustering of pre-sonorant stops in the OS### environment in the pilot study. Voicing duration (ms) is plotted against C1 duration (ms) for clusters 1 and 2.

Out of the 60 underlyingly voiced stops, 49 were assigned to cluster 1 (centred around full glottal pulsing during closure), with the remaining 11 tokens assigned to cluster 2 (centred around 0 ms of glottal pulsing). Eight of the 11 stops assigned to cluster 2 were produced by the same speaker, W2, and they were all produced with creaky voice extending over the utterance-final rhyme, as illustrated in the left panel of Figure 5.

² These two variables were selected based on their high significance in mixed-effects modelling.

Figure 5: Creaky voice realised by speaker W2 utterance-finally. The left panel represents the speaker’s realisation of utterance-final kadr ‘personnel, Gen.Pl.’ (segments k ɑ d r). The right panel represents the speaker’s realisation of utterance-final cenę ‘price, Acc.Sg.’ (segments ts ɛ n ɛ).

As the right panel in Figure 5 shows, W2 was also found to produce creaky voice at the end of utterances involving sequences other than stop+sonorant. This phonetic strategy can be interpreted as boundary marking by this speaker, not limited to utterance-final stop+sonorant sequences, and laryngeal neutralisation appears to be little more than a side-effect of utterance-final creaky voice. This hypothesis is further corroborated by the observation that whenever W2 did not produce creaky voice over the whole phrase-final stop+sonorant sequence, she would voice the pre-sonorant stop, as illustrated by the spectrogram in Figure 6. Based on the occasional stop voicing in the pre-sonorant position by speaker W2, it can be argued that her pre-sonorant stops have distinct voicing targets. However, these targets are frequently neutralised on the surface due to the tensing of vocal folds when creaky voice is produced for boundary marking.

2.3 Summary

The method used detected a very salient voicing contrast in stops followed by a sonorant at the end of a word. Only one out of six speakers was found to neutralise the contrast in most cases, which might be just a side effect of the speaker’s tendency to produce creaky voice at the end of the utterance. What is more, even the production of this particular speaker involved voiced stops in the pre-sonorant position, which points to the conclusion that underlyingly

Figure 6: Voicing of a pre-sonorant obstruent in utterance-final kobr ‘cobra snake, Gen.Pl.’ (segments k ɔ b r) by speaker W2

voiced and underlyingly voiceless stops followed by a word-final sonorant differ in their surface voicing targets. Experiment 1 also provided some validation of the elicitation method used. The speakers were found to neutralise the voicing distinction in stops followed by a sonorant in the next word, which indicates that the non-neutralisation in stops in word-final stop+sonorant clusters was unlikely to be due to the use of written stimuli.

3 Experiment on sonorant transparency to final devoicing and voice assimilation

The pilot study data on OS### sequences do not confirm the literature reports which state that pre-sonorant stops typically undergo final devoicing in this position. The data also show that stops in word-final stop+sonorant clusters do have distinct voicing targets, which is expected to counteract voice assimilation. In order to test whether voice assimilation can occur across a sonorant, another production experiment was conducted. Eight native speakers of Polish participated in the experiment: 6 females aged 39-53, and 2 males aged 28 and 32. All the speakers came from central Poland. The purpose of the experiment was not explained to the speakers until after the recording. The participants were not paid.

3.1 Materials and method

Three types of test items were used in the study: 1) word-final obstruent+sonorant sequences (OS##); 2) word-final obstruent+sonorant sequences followed by an obstruent in the next word (O1S#O2); and 3) word-final stops followed by word-initial sonorant+obstruent sequences (O1#SO2). Sample test items are in (6); for the full list of items see the Appendix.

(6) Sample test items
    Condition 1 (OS##): żubr ‘bison, Nom.Sg.’
    Condition 2 (O1S#O2): żubr siedział ‘the bison was sitting’
    Condition 3 (O1#SO2): kwiat rdestu ‘water pepper flower’

An equal number of words with voiced and voiceless stops was used across all test items. The same set of tokens was used in Condition 1 (in the devoicing context) and in Condition 2 (assimilation context). The test items in both of these conditions were paired to correspond in word size, and the place and manner of the obstruent+sonorant sequence, across the two voicing categories (e.g. żubr siedział - Cypr wiosną). Place and manner of the obstruent and sonorant were systematically varied, in order to test for a potential effect on the degree of voicing or devoicing in the obstruent. The size of the word containing the potential devoicing/assimilation undergoer was systematically varied for the same reason. Lexical restrictions did not allow for matching of the preceding vowel, or controlling for this factor in any other way. In the assimilation contexts all potential assimilation undergoers differed in the underlying voicing specifications from the triggers, i.e. O1 and O2 always conflicted in their underlying voicing values. Unlike in the first experiment, a standard carrier sentence (7) was used in order to eliminate confounds from syntactic structure and sentence length, and to provide a better comparison of durations.

(7) The carrier sentence
    Powiedz jeszcze raz. ‘Say one more time.’

The recordings were made in a quiet room on a Marantz PMD 670 Solid State Recorder, using a head-wearable microphone (AKG C420). The stimuli were presented to the speakers in a semi-random order (excluding immediate repetitions) on a computer screen, one stimulus at a time. The experiment was self-timed: the speakers were told they could complete the experiment at their own pace, and they were encouraged to correct themselves if they made an error. The speakers read two repetitions of each test item. Altogether 832 utterances were recorded (2 repetitions of 52 stimuli pronounced by 8 speakers). Many speakers had trouble reading the items fluently, especially the very complex clusters in the assimilation context. A rather large number of utterances (98) had to be discarded, due to disfluencies, reading errors, and pauses within the test items. This left 724 utterances for phonetic and statistical analysis.

The phonetic analysis followed the same procedure as established for the pilot study. Spectrograms were labelled manually in Praat, and acoustic measurements were made based on the inserted boundaries. The following measurements were made for pre-sonorant stops, following the pilot study.

1. Duration of glottal pulsing into stop closure;
2. Stop closure duration;
3. Duration of the preceding vowel;
4. Duration of the burst.

For fricatives the following measurements were made.

1. Duration of glottal pulsing during the fricative. Analogously to the case of stops, increased glottal pulsing is expected to mark the surface voicing of a fricative.
2. Duration of the frication noise. Fricative duration has been shown to be an exponent of voicing in a number of languages, including Dutch (Slis & Cohen, 1969) and English (Crystal & House, 1988). Voiced fricatives tend to surface as shorter than voiceless fricatives. Stevens et al. (1992) have also found a duration effect on the perception of voice in fricatives, where shorter frication noise brings about the perception of voicing in listeners (Forrez, 1966; Stevens et al., 1992).
3. Duration of the preceding vowel.

For both stops and fricatives, voicing ratio was calculated, based on the duration of glottal pulsing into closure, or into frication. In addition, the following measurements were made for the sonorants following the obstruents in clusters (both post-stop and post-fricative).

1. Duration of voicing during the sonorant.
2. f0 at 10 ms into the sonorant.
3. f1 at 10 ms into the sonorant.

The f0 and f1 measurements were recorded, following the reports by House & Fairbanks (1953) and Kingston & Diehl (1994), inter alia, that f0 and f1 are relatively lower following voiced consonants. The effect is most prominent following the obstruent, and fades over time. The vowel duration measurements were complicated by the presence of a preceding onglide in the context of a palatalised consonant (e.g. in /vjɑtr/ ‘wind’). The presence of a glide made it difficult to precisely determine the onset of the vowel, and it was not possible to include the glide in the duration measurements, since the palatalisation had not been controlled for. In the light of these problems, vowel duration measurements were discarded. In addition to the continuous phonetic measurements related to voicing, a number of prosodic factors were transcribed for the data from experiment 2. The most likely prosodic realisations of the test items in Condition 1 involved

a following phrase boundary under the definition of a prosodic phrase as a sequence of one or more pitch accents and a boundary tone (Pierrehumbert, 1980). Previous work on Polish shows that phonetic cues to prosodic boundaries include pre-boundary lengthening of the final syllable’s nucleus, a pitch movement corresponding to a boundary tone, and a following pause (Demenko, 2000; Francuzik et al., 2002). Relatively stronger phrase boundaries are signalled by the co-occurrence of two or more cues, especially involving the presence of a following pause. In contrast, absence of either pause or a boundary tone signals a lower-level prosodic boundary corresponding to that of a Prosodic Word. The test items in Conditions 2 and 3 were most likely to be realised with the O1SO2 cluster straddling a Prosodic Word boundary (O1S#O2, or O1#SO2). However, since the exact prosodic realisation of a string of speech cannot be predicted based on syntax alone (Shattuck-Hufnagel & Turk, 1996), phonetic cues to prosodic boundaries were annotated for the data, including the presence or absence of a following pause and a boundary tone. Pause was defined as a period of low acoustic energy of at least 10 ms. Boundary tone was defined as pitch movement following the pitch accent diagnosable by a rise (or fall) in f0 through the post-tonic syllable towards an apparent H (or L) target aligned at the right edge of that word. In 54 cases it was impossible to determine the presence or absence of a boundary tone due to final rhyme devoicing and absence of f0. Given previous findings on pre-boundary lengthening of the vocalic nucleus, vowel duration was also considered in the analysis of the test items’ prosodic realisation. In addition, since prosody may also be influenced by syntactic factors, basic syntactic characteristics were transcribed for all test items (whether the item was a noun phrase, a verb phrase, an adjectival phrase, or a clause).
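The boundary coding described above can be summarised as a small decision rule. The sketch below is only an illustration of that rule as stated in the text (a pause of at least 10 ms or a boundary tone signals a phrase boundary; absence of both signals a Prosodic Word boundary). The function name `classify_boundary`, the argument names, and the label "uncoded" for tokens with no pause and an undeterminable boundary tone are assumptions, not part of the original annotation scheme.

```python
from typing import Optional

MIN_PAUSE_MS = 10.0  # pause threshold stated in the text

def classify_boundary(pause_ms: float, boundary_tone: Optional[bool]) -> str:
    """Classify the boundary after a test item from two annotated cues.

    pause_ms: duration of the following low-energy interval in ms.
    boundary_tone: True/False, or None when f0 was absent and the
    tone could not be determined (hypothetical encoding).
    """
    has_pause = pause_ms >= MIN_PAUSE_MS
    if has_pause or boundary_tone is True:
        return "phrase"      # at least one phrase-boundary cue present
    if boundary_tone is None:
        return "uncoded"     # no pause, tone undeterminable
    return "word"            # neither cue: Prosodic Word boundary

print(classify_boundary(120.0, True))
print(classify_boundary(0.0, False))
print(classify_boundary(0.0, None))
```

The rule does not grade boundary strength. A finer coding could count the co-occurring cues, since stronger phrase boundaries are described as involving two or more of them.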

3.2 Results

3.2.1 Deletions

The acoustic analysis of the data reveals a substantial number of cases where a word-final sonorant in an obstruent+sonorant cluster is elided, leaving no acoustic or auditory trace³. As previously noted by Castellví-Vives (2003), it is not uncommon for a sonorant to be elided from a word-final obstruent+sonorant cluster, but in the absence of a sonorant it is incorrect to talk about ‘transparency’. 76 cases of deletion were counted in the current data altogether, accounting for 10.73% of all utterances. No deletion cases occurred in Condition 3 (i.e. when the sonorant was word-initial), so the 76 attested deletion cases make up 12.31% of all Condition 1 and Condition 2 utterances. A generalised linear mixed-effects model was fitted to the data from Conditions 1 and 2 pooled together. The dependent variable was whether or not the sonorant was deleted. The effects of speaker and item were treated as random. The predictors considered in the model included condition, the size of the word (monosyllabic, disyllabic, or trisyllabic), as well

³ Whether or not a residual sonorant gesture is present in such cases is a question for future articulatory research.

Table 2: Summary of the fixed part of a generalised linear mixed-effects model predicting whether or not the word-final sonorant in an OS cluster would undergo deletion. The intercept corresponds to a word-final glide in a monosyllabic word in Condition 1.

Term          Level         β       SE     z       p
(Intercept)                 -3.65   0.84   -4.32   <0.001
Condition     2             2.18    0.44   5.00    <0.001
Word size     disyllabic    1.62    0.51   3.16    0.002
Word size     trisyllabic   2.02    0.92   2.20    0.028
Manner        nasal         -2.44   0.85   -2.87   0.004
Manner        rhotic        -2.83   0.58   -4.85   <0.001

as the manner of articulation of the following sonorant (glide, nasal, or rhotic). The model’s summary is in Table 2. The occurrence of deletions was greater in Condition 2 than in Condition 1 (B=2.18 (SE=0.44), z=5.00, p<0.001), meaning that word-final sonorants in an obstruent+sonorant cluster were more likely to delete when an obstruent followed. The likelihood of deletion also increased with the size of the word. Deletion was less likely to occur in monosyllabic than in disyllabic (B=1.62 (SE=0.51), z=3.16, p=0.002), or trisyllabic (B=2.02 (SE=0.92), z=2.20, p=0.028) words. In addition, glides were more likely to delete than nasals (B=-2.44 (SE=0.85), z=-2.87, p=0.004), or rhotics (B=-2.83 (SE=0.58), z=-4.85, p<0.001). It is not clear whether this last finding is due to manner of articulation, or morphosyntactic factors, since all test items with word-final glides were verbs, where the glide was a past tense marker. In contrast, the test items ending with nasals or rhotics were all nouns.
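Since the coefficients in Table 2 are on the log-odds scale, it may help to see what they imply as probabilities. The sketch below is an illustration only: it uses the fixed effects as printed in Table 2, ignores the random effects of speaker and item, and assumes the usual logistic link. The function name `deletion_probability` and its argument names are hypothetical.

```python
import math

# Fixed-effect estimates (log-odds) copied from Table 2.
INTERCEPT = -3.65                      # glide, monosyllabic word, Condition 1
CONDITION = {1: 0.0, 2: 2.18}
WORD_SIZE = {"monosyllabic": 0.0, "disyllabic": 1.62, "trisyllabic": 2.02}
MANNER = {"glide": 0.0, "nasal": -2.44, "rhotic": -2.83}

def deletion_probability(condition=1, word_size="monosyllabic", manner="glide"):
    """Predicted probability of sonorant deletion, fixed effects only."""
    log_odds = (INTERCEPT + CONDITION[condition]
                + WORD_SIZE[word_size] + MANNER[manner])
    return 1.0 / (1.0 + math.exp(-log_odds))  # inverse logit

p_baseline = deletion_probability()                 # roughly 0.025
p_condition2 = deletion_probability(condition=2)    # roughly 0.187
odds_ratio_condition2 = math.exp(CONDITION[2])      # odds multiplier for Condition 2

print(round(p_baseline, 3), round(p_condition2, 3), round(odds_ratio_condition2, 2))
```

On this reading, a word-final glide in a monosyllabic word is predicted to delete in roughly 2.5% of Condition 1 tokens and in roughly 19% of Condition 2 tokens, for an average speaker and item.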

3.2.2 Prosodic realisation

Condition 1 had been predicted to typically trigger a phrase boundary, characterised by the presence of a boundary tone, presence of a following pause, or both. A pause following the test item was found in 134 cases. In addition, a boundary tone was identified in 172 Condition 1 items that did not have a following pause. 16 of the Condition 1 test items were characterised by the absence of either boundary tone or a following pause. These 16 cases involved a following word boundary under the definition provided above, whereas all the remaining pronunciations involved a phrase boundary of varying strength. The phrase-final items were associated with lengthening of the vowel in the final syllable. A t-test comparison of the vowel length in Condition 1 depending on the presence or absence of a phrase boundary showed a difference in means of 38.36 ms, which was significant at t=-11.49, p<0.001. Conditions 2 and 3 had been predicted to typically trigger a word boundary, defined as the absence of a following pause or a boundary tone. Test items

Table 3: Summary of the fixed part of a generalised linear mixed-effects model predicting the likelihood of the obstruent being underlyingly voiceless. Condition 1.

Term                       β       SE      z       p
(Intercept)                1.36    0.80    1.69    0.090
Glottal pulsing duration   -0.09   0.01    -8.26   <0.001
f0                         0.01    0.003   2.68    0.007

where a pause intervened between potential assimilation trigger and undergoer had been previously discarded as disfluencies. Out of the remaining 388 items, 305 showed no boundary tone, which was consistent with the presence of a word boundary. A boundary tone was found in 53 cases from Conditions 2 and 3, while the remaining 30 cases were not coded for the presence or absence of a final pitch movement, due to devoicing. The presence of a boundary tone (and an associated phrase boundary under the current definition) was again correlated with vowel lengthening. The difference in means between vowels depending on the presence or absence of a boundary tone equalled 11.63 ms and was significant at t=-2.37, p=0.02.

3.2.3 Predicting the value of underlying voicing

Predictions concerning the recoverability of underlying voicing were tested in a series of generalised mixed-effects models, where the dependent variable was the underlying voicing of the pre-sonorant obstruent (voiced vs. voiceless), with the effect of speaker treated as random. Three separate models were fitted for the three experimental conditions. The fixed predictors considered in the modelling included: O1 duration (i.e. duration of closure or frication), the duration of glottal pulsing during closure or frication, f0 at 10 ms after the offset of the obstruent, and f1 at 10 ms after the offset of the obstruent. Condition 1 data were used to test whether the underlying voicing value of a pre-sonorant obstruent followed by a phrase boundary can be recovered from the acoustic signal. Underlying specification for voicing was found to be associated with increased glottal pulsing (B=-0.091 (SE=0.01), z=-8.26, p<0.001) and f0 lowering following the obstruent (B=0.01 (SE=0.003), z=2.68, p=0.007). The fit of the model did not improve significantly upon adding further predictors such as obstruent duration (log-likelihood test: χ²=2.91, p=0.09), or f1 value following the obstruent (χ²=0.08, p=0.78). Neither O1 duration, nor the f1 measure reached the significance level of 0.05 when added to the model, and neither of these predictors was retained in the final model (summarised in Table 3). This result replicates the result of the pilot study: the underlying voicing contrast in word-final obstruents followed by a sonorant and a phrase boundary is mostly recoverable. The difference is also rather robust, as illustrated in the left panel of Figure 7. Although there are outliers, which will be discussed in Section 3.3, the general tendency is for underlyingly voiced stops to surface with

decidedly longer glottal pulsing.

Figure 7: Duration of glottal pulsing and voicing ratio as a function of underlying voicing in OS sequences in Condition 1.
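The p-values reported for the Condition 1 model comparisons can be reproduced from the test statistics with a short calculation. The sketch below is a check under stated assumptions, not the original analysis: it assumes each log-likelihood test compares nested models differing by one parameter (1 degree of freedom) and that the z statistics are tested two-sided against a standard normal distribution. The function names are hypothetical.

```python
import math

def chi2_p_value_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 df.

    For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))

def wald_p_value(z: float) -> float:
    """Two-sided p-value for a Wald z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

p_duration = chi2_p_value_1df(2.91)   # adding obstruent duration
p_f1 = chi2_p_value_1df(0.08)         # adding f1
p_f0 = wald_p_value(2.68)             # f0 term in Table 3

print(round(p_duration, 2), round(p_f1, 2), round(p_f0, 3))
```

Under these assumptions the statistics correspond to p-values of about 0.09, 0.78 and 0.007 respectively, in line with the values given in the text and in Table 3.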

Condition 2 involved tokens of word-final obstruent+sonorant sequences followed by an obstruent in the next word. The tokens were constructed in such a way that the two obstruents flanking a sonorant would conflict in their underlying voice specifications. From the point of view of sonorant transparency, it would be expected that the first obstruent in the cluster will assimilate in voicing to the second obstruent, reversing its underlying voice specification.

(8) Predictions for the outcome of Condition 2 from the perspective of sonorant transparency

    Sequence    Prediction
    /br#ɕ/      [prɕ]
    /pr#v/      [brv]

A generalised linear mixed-effects model was fitted to the data from Condition 2. A model including two fixed effects, glottal pulsing duration and f0 at 10 ms after the offset, achieved a better fit to the data than a model based on glottal pulsing alone (log-likelihood test: χ²=31.94, p<0.001). According to the model based on these two effects, glottal pulsing was a highly significant predictor (B=-0.05 (SE=0.007), z=-6.31, p<0.001), but f0 was not significant (B=-0.0009 (SE=0.003), z=-0.31, p=0.76). Adding further fixed effects (O1 duration and f1) did not significantly improve the fit of the model, and neither of the added effects was significant. The final model is summarised in Table 4. Increased glottal pulsing was associated with underlying voicing, as indicated

Table 4: Summary of the fixed part of a generalised linear mixed-effects model predicting the likelihood of the obstruent being underlyingly voiceless. The model was run on the data from Condition 2.

Term                       β         SE      z       p
(Intercept)                3.57      0.67    5.33    <0.001
Glottal pulsing duration   -0.05     0.007   -6.31   <0.001
f0                         -0.0009   0.003   -0.31   0.76

by the negative value of the β coefficient for glottal pulsing duration. Boxplots in Figure 8 illustrate surface voicing associated with underlyingly voiced obstruents in terms of glottal pulsing duration and voicing ratio. Similarly to Condition 1, decidedly more glottal pulsing (both in absolute terms and relative to obstruent duration) is found in underlyingly voiced obstruents than in the underlyingly voiceless obstruents, contrary to the predictions stated in (8).

Figure 8: Duration of glottal pulsing and voicing ratio as a function of underlying voicing in pre-sonorant obstruents followed by another obstruent in the next word (Condition 2)

The test items in Condition 3 contained word final obstruents followed by sonorant+obstruent sequences in the next word. The rightmost obstruent in the three-member cluster conflicted in its underlying voicing with the leftmost obstruent in the cluster in all the cases. A generalised mixed-effects model was fitted to the data in Condition 3 modelling the recoverability of the underlying voicing contrast. No subset of the four predictors (O1 duration, duration of glottal

pulsing during closure or frication, f0 at 10 ms after the offset of the obstruent, and f1 at 10 ms after the offset of the obstruent) yielded any significant fixed effects. Unlike in Conditions 1 and 2, an equal amount of vocal fold vibration was found for underlyingly voiced and underlyingly voiceless obstruents, as illustrated in Figure 9.

Figure 9: Duration of glottal pulsing and voicing ratio as a function of underlying voicing in obstruents followed by an SO cluster in the next word (Condition 3)

This observation confirms the report by Rubach (1996, 2008) that the underlying voicing contrast in obstruents in the word-final position tends to be neutralised on the surface. The neutralisation effect is also important for validating the method. The speakers were found to neutralise the underlying voicing contrast despite the contrast being represented in writing, suggesting again that the contrast preservation in Conditions 1 and 2 is unlikely to be due to the use of written stimuli.

3.3 Modelling the realisation of obstruent+sonorant clusters

The statistical results presented in the previous section do not confirm the generalisation that word-final sonorants behave transparently by allowing obstruents in word-final obstruent+sonorant clusters to undergo final devoicing. Neither do they confirm that the leftmost obstruent in a three-member obstruent+sonorant+obstruent cluster tends to assimilate in voicing to the rightmost obstruent, regardless of whether there is a word boundary following the first obstruent (Condition 2), or the sonorant (Condition 3). However, the variation found in the data signals that

no simple generalisation can accurately capture the data. This section estimates the number of potential transparency cases, and explores some trends found in the realisation of surface voicing in pre-sonorant obstruents in the current dataset using a series of mixed-effects regression models.

3.3.1 Potential transparency cases

Following the procedure previously used for the pilot study data, all the obstruent tokens from experiment 2 were classified as either voiced or voiceless on the surface. Classification was done by means of k-means clustering based on two vectors: voicing duration and O1 duration⁴. The classification results are illustrated in Figure 10.

Figure 10: Classification results of k-means clustering based on glottal pulsing duration and obstruent duration. Glottal pulsing duration (ms) is plotted against C1 duration (ms) for clusters 1 and 2.

As illustrated in the scatterplot in Figure 10, 261 tokens were classified as a part of cluster 1, interpreted as voiced, with the median voicing ratio of 1, and the minimum voicing ratio of 0.54. 190 tokens were classified as a part of cluster 2, interpreted as voiceless, with the median voicing ratio of 0.21, and the maximum voicing ratio of 0.59. The scatterplot in Figure 10 makes it clear that there are numerous intermediate cases in the distribution of both voicing and obstruent duration. Consequently, imposing a two-way distinction on the data is necessarily arbitrary, and the numerical data obtained on the basis of the classification should not be treated as conclusive statistics about how often transparency occurs. The classification is intended solely as a tentative way of estimating in what percentage of cases the sonorant transparency hypothesis can

⁴ It would have been possible to include more voice-related measurements, such as f0 and f1 following the obstruent offset. The use of f0 turned out to be problematic due to missing values, as a number of sonorants were realised without any glottal pulsing. The use of f1 was found to confuse the classification, presumably because it followed a bimodal distribution conditioned by speaker sex.

be said to make correct predictions about the realisation of a token as voiced or voiceless. Due to the way the stimuli were constructed, four types of sonorant-transparency-like realisations could potentially be found in the data, as listed in Table 5. Table 5 also summarises the counts of potential transparency cases for the four environments.

Table 5: Counts of voiced and voiceless surface realisations (according to k-means clustering) of obstruents in four environments where sonorant transparency could occur. An asterisk marks the number of realisations consistent with the sonorant transparency hypothesis. Cases where the sonorant had been deleted are not included in the count.

                                                                         Realisation
    Environment                                                          Voiced   Voiceless
1   Underlyingly voiced obstruent + sonorant ##                          141      24*
    e.g. /ʒubr/ → [ʒupr̥]
2   Underlyingly voiced obstruent + sonorant # voiceless obstruent       81       24*
    e.g. /ʒubr#ɕɛdʑɑw/ → [ʒupr̥.ˈɕɛ.dʑɑw]
3   Underlyingly voiceless obstruent + sonorant # voiced obstruent       30*      85
    e.g. /tsɨpr#vjɔsnɔ̃/ → [tsɨbr.ˈvjɔ.snɔ̃]
4   Underlyingly voiceless obstruent # sonorant + voiced obstruent       9*       57
    e.g. /brɑk#mgwɨ/ → [brɑg.ˈmgwɨ]
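The counts in Table 5 can be turned into rough proportions with a few lines of arithmetic. The sketch below only restates the table: the counts are copied from Table 5, while the variable names and the layout of the `TABLE5` list are assumptions made for illustration. As the text cautions, these proportions rest on an arbitrary two-way classification and are not conclusive statistics.

```python
# (environment, voiced count, voiceless count, realisation predicted
#  by the sonorant transparency hypothesis), copied from Table 5.
TABLE5 = [
    (1, 141, 24, "voiceless"),
    (2, 81, 24, "voiceless"),
    (3, 30, 85, "voiced"),
    (4, 9, 57, "voiced"),
]

voiced_total = sum(voiced for _, voiced, _, _ in TABLE5)
voiceless_total = sum(voiceless for _, _, voiceless, _ in TABLE5)
total = voiced_total + voiceless_total

# Tokens whose surface realisation matches the transparency prediction.
consistent = sum(voiceless if predicted == "voiceless" else voiced
                 for _, voiced, voiceless, predicted in TABLE5)
share_consistent = consistent / total

for env, voiced, voiceless, predicted in TABLE5:
    hits = voiceless if predicted == "voiceless" else voiced
    print(f"environment {env}: {hits}/{voiced + voiceless} "
          f"= {hits / (voiced + voiceless):.1%}")
print(f"overall: {consistent}/{total} = {share_consistent:.1%}")
```

The column totals (261 voiced, 190 voiceless) match the cluster sizes reported for Figure 10, and fewer than a fifth of the classified tokens are consistent with transparency.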

The summary of the counts supports the basic generalisations made based on modelling the recoverability of underlying voicing in pre-sonorant obstruents. Obstruents in word-final OS clusters typically retained their underlying voicing specification on the surface, whether or not an obstruent followed in the next word. Word-final obstruents were typically realised as voiceless when a cluster of a sonorant and a voiced obstruent followed in the next word. At the same time, however, pronunciations consistent with the sonorant transparency hypothesis were not unattested; a small number of such pronunciations was recorded in all four potential environments.

3.3.2 Modelling voicing duration and ratio

The four potential environments for sonorant transparency listed in Table 5 involve two situations where an underlyingly voiced obstruent is realised with limited glottal pulsing, and two situations where an underlyingly voiceless obstruent exhibits surface voicing. The conditioning of these two situations was analysed in a series of mixed-effects models with random intercepts for speaker and item. Duration of glottal pulsing and voicing ratio were used as dependent variables. Analysing both glottal pulsing duration and voicing ratio

Table 6: Summary of the fixed part of a linear mixed-effects model predicting the voicing ratio in underlyingly voiced obstruents from Conditions 1 and 2. The intercept corresponds to a pre-sonorant fricative in a monosyllabic word in the absence of a pause following the sonorant.

Term                           Level         β       SE       t       p
(Intercept)                                  0.49    0.07     6.87    <0.001
Duration of sonorant voicing                 0.001   0.0003   3.45    <0.001
Word size                      disyllabic    -0.03   0.03     -0.82   0.43
Word size                      trisyllabic   -0.13   0.06     -2.14   0.05
Manner (obstruent)             stop          0.26    0.05     5.69    <0.001
Following pause                present       0.12    0.04     3.38    0.001

was motivated by previous literature reports that these two response variables can be shaped differently by some effects, such as speech rate (Solé, 2007). p-values were calculated based on Markov Chain Monte Carlo confidence intervals, using the pvals function within the languageR package (Baayen, 2011). The first set of models was fitted to the data in environments 1 and 2 (cf. Table 5), where the potential transparency target was underlyingly voiced. The purpose of the models was to analyse under which conditions devoicing might occur. A model with voicing ratio as a dependent variable achieved the best fit with four fixed effects: duration of sonorant voicing, the number of syllables in the word, whether the obstruent was a stop or a fricative, and whether or not a pause followed. Further effects that were analysed, but did not significantly improve the fit of the model, were sex of the speaker, sonorant’s manner of articulation, obstruent’s place of articulation, condition, presence or absence of a boundary tone, duration of the vocalic nucleus in the final syllable, and syntactic characteristics of the structure in which the OS cluster was embedded. The final model is summarised in Table 6. The ratio of voicing to obstruent duration increased significantly with the duration of sonorant voicing (B=0.001 (SE=0.0003), t=3.45, p<0.001). The voicing ratio was also greater in monosyllabic than in trisyllabic words (B=-0.13 (SE=0.06), t=-2.14, p=0.05), but there was no significant effect at the level of disyllabic words (B=-0.03 (SE=0.03), t=-0.82, p=0.43). The voicing ratio was significantly higher for stops than for fricatives (B=0.27 (SE=0.05), t=5.69, p<0.001). The ratio was significantly greater if the sonorant was followed by a pause (B=0.12 (SE=0.04), t=3.38, p=0.001). Some of the fixed predictors in the model of voicing ratio showed similar effects in a model of glottal pulsing duration.
The duration of glottal pulsing during closure or frication increased significantly with the duration of glottal pulsing during the following sonorant (B=0.09 (SE=0.04), t=2.48, p=0.022). Glottal pulsing was also longer when the sonorant was followed by a pause (B=13.36 (SE=4.48), t=2.98, p=0.004). In addition, there was a significant effect of the presence of a boundary tone, which involved an increase in the duration of glottal

Table 7: Summary of the fixed part of a linear mixed-effects model predicting the duration of glottal pulsing in underlyingly voiced obstruents from Conditions 1 and 2. The intercept corresponds to a pre-sonorant fricative in a monosyllabic word in the absence of a pause following the sonorant.

Term                           Level     β       SE     t      p
(Intercept)                              46.32   7.50   6.18   0.001
Duration of sonorant voicing             0.09    0.04   2.48   0.022
Following pause                present   13.36   4.48   2.98   0.004
Boundary tone                  present   8.85    4.38   2.02   0.040

pulsing (B=8.85 (SE=4.38), t=2.02, p=0.040). Adding word size and obstruent’s manner of articulation as predictors did not significantly improve the fit of the model. Other predictors which did not improve the model’s fit included sex of the speaker, sonorant’s manner of articulation, obstruent’s place of articulation, condition, duration of the vocalic nucleus in the final syllable, and syntactic characteristics of the structure in which the OS cluster was embedded. The final model is summarised in Table 7, and the effects are plotted in Figure 12. Results from the two models show that an underlyingly voiced obstruent followed by a word-final sonorant was likely to surface with limited glottal pulsing and with limited voicing ratio when the following sonorant does not have an extended voiced portion. In addition, surface devoicing is more likely next to relatively weaker prosodic boundaries indicated by the absence of a following pause, or a boundary tone. Fricatives followed by word-final sonorants surfaced with limited voicing compared to stops in terms of ratio, but the effect of manner on the duration of glottal pulsing was not significant. Similarly, more devoicing was found in longer (trisyllabic) words, as far as voicing ratio is concerned, but there was no significant effect of word size on the duration of glottal pulsing. Two mixed-effects models were also fitted to the data in environments 3 and 4 (cf. Table 5), where an underlyingly voiceless obstruent was followed by a sonorant and a voiced obstruent. The purpose of the analysis was to determine when there was increased surface voicing associated with the first obstruent in the cluster. The model which used voicing ratio as a response variable achieved the best fit with only one fixed effect, that of obstruent duration. The model’s summary is in Table 8. Increased voicing ratio was found in obstruents of shorter duration (B=-0.004 (SE=0.0006), t=-5.61, p=0.001). 
A graphical representation of this effect is in Figure 13. According to the log-likelihood test, the model did not improve upon adding further fixed effects, including speaker’s sex, word size, the manner of articulation of the leftmost obstruent, the manner of articulation of the sonorant, duration of sonorant voicing, the presence of a boundary tone, duration of the vocalic nucleus, or the syntax of the structure. It is also noteworthy that the fit of the model did not improve when the effect of condition was added (log-likelihood test: χ2=1.04, p=0.31). Obstruent duration was not found to have a significant effect on the duration of glottal pulsing in underlyingly voiceless obstruents from Conditions 2 or 3. None of the other predictors considered in the modelling had a significant effect on the duration of glottal pulsing either.

Table 8: Summary of the fixed part of a linear mixed-effects model predicting the voicing ratio in underlyingly voiceless obstruents from Conditions 2 and 3.

Term                Level  β       SE      t      p
(Intercept)                0.75    0.08    9.92   0.001
Obstruent duration         -0.004  0.0006  -5.61  0.001
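As a rough illustration of how the fixed-effect estimates in Tables 7 and 8 translate into predicted values, the following Python sketch evaluates the two linear predictors for a single token. The coefficients are copied from the tables. The function and argument names are hypothetical, and the sketch assumes that all durations are measured in milliseconds, as the axis labels of the effects plots suggest. Random effects are ignored, so the values are population-level predictions only, not a reconstruction of the original model fitting.

```python
# Population-level predictions from the fixed effects in Tables 7 and 8.
# Coefficients are copied from the tables; random effects are ignored.
# Assumption: all durations are in milliseconds.

TABLE_7 = {
    "intercept": 46.32,            # reference levels: no pause, no boundary tone
    "sonorant_voicing_ms": 0.09,   # slope per ms of sonorant voicing
    "pause_present": 13.36,
    "boundary_tone_present": 8.85,
}

TABLE_8 = {
    "intercept": 0.75,
    "obstruent_duration_ms": -0.004,  # slope per ms of obstruent duration
}


def predict_pulsing_ms(sonorant_voicing_ms, pause_present, boundary_tone_present):
    """Predicted glottal pulsing duration (ms) in an underlyingly voiced obstruent."""
    return (
        TABLE_7["intercept"]
        + TABLE_7["sonorant_voicing_ms"] * sonorant_voicing_ms
        + TABLE_7["pause_present"] * int(pause_present)
        + TABLE_7["boundary_tone_present"] * int(boundary_tone_present)
    )


def predict_voicing_ratio(obstruent_duration_ms):
    """Predicted voicing ratio in an underlyingly voiceless obstruent."""
    return TABLE_8["intercept"] + TABLE_8["obstruent_duration_ms"] * obstruent_duration_ms


if __name__ == "__main__":
    # Same sonorant voicing, stronger versus weaker prosodic boundary.
    print(round(predict_pulsing_ms(100, True, True), 2))
    print(round(predict_pulsing_ms(100, False, False), 2))
    # Shorter versus longer voiceless obstruent.
    print(round(predict_voicing_ratio(60), 2))
    print(round(predict_voicing_ratio(120), 2))
```

With 100 ms of sonorant voicing, the sketch predicts roughly 77.5 ms of glottal pulsing before a pause with a boundary tone, against roughly 55.3 ms when both are absent.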

4 Discussion

4.1 The main trends

The view of sonorant transparency that emerges from the current data does not support the reports found in the phonological literature. For the majority of the data, the sonorant transparency hypothesis makes incorrect predictions with respect to the surface realisation of the underlying voicing values. If word-final sonorants were transparent to final devoicing, the surface neutralisation of the underlying voicing contrast would be expected in the preceding obstruents. However, underlyingly voiced obstruents followed by a sonorant and a phrase boundary were found to be realised with significantly more glottal pulsing and f0 lowering, compared to the surface realisation of underlyingly voiceless obstruents in the same segmental and prosodic right-hand context.

The current data also do not support the generalisation that there is regressive voice assimilation between two obstruents separated by a sonorant and a word boundary. This generalisation would predict more surface voicing (reflected in e.g. increased glottal pulsing) in the realisation of O1+S#O2 clusters where O1 is underlyingly voiceless and O2 is underlyingly voiced, than when O1 is underlyingly voiced and O2 is underlyingly voiceless. However, data from the present experiment point to the contrary, which indicates that the leftmost obstruent in the cluster tends to retain its underlying voice specification in the output.

Finally, the current data confirm previous literature reports that the underlying voicing contrast is neutralised on the surface in word-final obstruents when a sonorant+obstruent cluster follows in the next word, as the underlying voicing values of these obstruents could not be reliably predicted from the phonetic signal.

The conclusion that follows from these findings is that sonorant transparency cannot be upheld as a core property in the phonology of Polish. The majority of the data support the opposite generalisation, i.e. that word-final sonorants are typically not transparent to final devoicing or voice assimilation. However, there are exceptions to this generalisation, as indicated by the phonetic variation illustrated in Figures 7 and 8, and by the results of k-means clustering. We therefore need to ask whether these exceptions are best modelled by an optional rule that somehow renders word-final sonorants phonologically transparent, even if only in a minority of cases, or whether the concept of phonological transparency is not useful in understanding the factors that control the distribution of these exceptions.

4.2 The status of transparency

What would count as evidence for positing an optional rule (with a low frequency of application) rendering word-final sonorants phonologically transparent in Polish? Such a hypothesis would presumably be strengthened if one could find evidence that, in putative instances of transparency, the presence or absence of voicing on the surface is categorical, and so best analysed by means of an operation over features in the phonological component of the grammar. In turn, one might argue for categorical voicing or devoicing if glottal pulsing or voicing ratio exhibited a bimodal distribution. However, the bimodality test is easily confounded by experimental design, as pointed out by Scobbie (2005, 13), who notes that it is “very easy to find bimodal or multimodal distributions of values for phonetic parameters, where each mode is associated with some conditioning factor”. The present study is vulnerable to this problem, as the design involved a variety of factors influencing the phonetic realisation of voicing. For instance, if one pools together the duration of glottal pulsing from underlyingly voiced pre-sonorant fricatives and stops, bimodality emerges as a result of inherent duration differences between the two classes of sounds. The problem increases with the inclusion of other factors which have been shown to influence the duration of glottal pulsing, including place of articulation, sonorant’s manner of articulation and word size. At the same time, analysing the distribution for each factor separately is not feasible, as the data are too scarce to provide conclusive results. This problem could only be solved with a very large-scale study with multiple tokens for each strictly controlled condition, and, given that lexical limitations make such control difficult to sustain for multiple test items, a very large speaker population would be required.
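The confound described above can be illustrated numerically. The Python sketch below models the duration of glottal pulsing in two obstruent classes as two normal distributions and counts the modes of each class alone and of the pooled data. All means and standard deviations are hypothetical values chosen purely for illustration; they are not estimates from the present study, and the function names are likewise invented for this example.

```python
import math

# Hypothetical parameters (ms), chosen only for illustration.
STOP_MEAN, FRICATIVE_MEAN, COMMON_SD = 40.0, 80.0, 8.0


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def pooled_pdf(x):
    """Density of pulsing duration when both classes are pooled in equal shares."""
    return 0.5 * normal_pdf(x, STOP_MEAN, COMMON_SD) + 0.5 * normal_pdf(
        x, FRICATIVE_MEAN, COMMON_SD
    )


def count_modes(pdf, start=0, stop=120):
    """Count strict local maxima of pdf on an integer grid from start to stop."""
    values = [pdf(x) for x in range(start, stop + 1)]
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i] > values[i - 1] and values[i] > values[i + 1]
    )


if __name__ == "__main__":
    # Each class on its own is unimodal ...
    print(count_modes(lambda x: normal_pdf(x, STOP_MEAN, COMMON_SD)))
    # ... but the pooled distribution has one mode per class.
    print(count_modes(pooled_pdf))
```

Under these assumed values, bimodality in the pooled data reflects nothing more than the difference between the two class means, which is the situation Scobbie warns about.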
However, even if categorical effects were found in a larger purpose-designed experiment, ostensibly supporting the postulation of a phonological operation over features, it would not necessarily follow that this operation should be understood as one rendering word-final sonorants transparent. On the contrary, the phonetic findings reported in this paper show that simply labelling sonorants as transparent in the relevant tokens provides little insight into the factors at work. The results of mixed-effects modelling reveal a considerable array of significant effects on the duration of glottal pulsing and voicing ratio. All of those individual effects are expected, and are consistent with findings on influences on voicing in other languages. Limited phonetic voicing in underlyingly voiced obstruents followed by word-final sonorants was found when the sonorant also underwent complete or partial devoicing, when the obstruent was a fricative, when the size of the word increased, and preceding relatively weaker prosodic boundaries signalled by the absence of a following pause or a boundary tone.

The effect of sonorant devoicing can be attributed to glottal coarticulation extending over the entire word-final cluster, in anticipation of a voiceless obstruent, or a pause. The observed pre-sonorant fricative devoicing confirms previous literature findings of aerodynamic difficulties associated with fricative voicing (Ohala, 1983; Ohala & Solé, 2010), although the aerodynamic difficulty can also be seen as inherent to all obstruents, but more readily observable in fricatives, due to their increased duration in comparison to stops. The latter generalisation appears consistent with the finding that obstruent manner had a significant effect on the voicing ratio, but not on the absolute duration of glottal pulsing. The effect of word size is reminiscent of Wedel’s (2002) observation for Turkish, where monosyllabic words resist voicing alternations, and suggests that more complex factors of lexical access may also enter into the conditioning of phonetic and phonological voicing.⁵ The effect of increased devoicing when a voiceless obstruent followed compared to a following phrase boundary is the only effect that suggests a coarticulatory influence. However, there was no parallel coarticulatory effect that would involve relatively more voicing of underlyingly voiceless obstruents from Conditions 2 (O1S#O2) and 3 (O1#SO2) across weaker prosodic boundaries. Surface voicing of underlyingly voiceless obstruents in Conditions 2 and 3 was sensitive to only one influence, that of obstruent duration. Importantly, relatively shorter obstruents showed a significantly increased voicing ratio, but no increase in vocal fold vibration. This would suggest that the increase in ratio is little more than a byproduct of obstruent shortening, which itself may be conditioned by other factors, such as prosodic boundary effects.
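The reasoning that a higher voicing ratio can arise purely from obstruent shortening can be made concrete with a small calculation. The Python sketch below assumes, for illustration, that voicing ratio is the duration of glottal pulsing divided by the duration of the obstruent. The function name and the durations are hypothetical and are not taken from the experimental data.

```python
def voicing_ratio(pulsing_ms, obstruent_ms):
    """Proportion of the obstruent that is produced with glottal pulsing.

    Assumes voicing ratio = pulsing duration / obstruent duration.
    """
    if obstruent_ms <= 0:
        raise ValueError("obstruent duration must be positive")
    if not 0 <= pulsing_ms <= obstruent_ms:
        raise ValueError("pulsing must lie between 0 and the obstruent duration")
    return pulsing_ms / obstruent_ms


if __name__ == "__main__":
    pulsing_ms = 20.0  # hypothetical, identical in both tokens
    for obstruent_ms in (100.0, 60.0):  # hypothetical longer and shorter obstruent
        print(obstruent_ms, round(voicing_ratio(pulsing_ms, obstruent_ms), 3))
```

In this example the amount of vocal fold vibration is held constant at 20 ms, yet the ratio rises from 0.2 in the 100 ms obstruent to about 0.33 in the 60 ms obstruent.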
The presence of a relatively weaker prosodic boundary, as found for the majority of test item realisations in Conditions 2 and 3, may involve decreased duration, as relatively weaker prosodic boundaries have been found to have a limiting effect on final lengthening (Kuzla et al., 2007). The production of a voiceless obstruent flanked by a vowel and a sonorant consonant is likely to show some laryngeal overlap, as it involves transitions between presence and absence of vocal fold vibration. The same amount of glottal pulsing ‘spilling over’ to a voiceless stop from a neighbouring sonorant will translate into a relatively higher voicing ratio if the obstruent itself is relatively shorter.

The term ‘transparency’ suggests that the sonorant is somehow not involved in the laryngeal assimilation. However, the effect of sonorant voicing on the phonetic devoicing of a preceding obstruent suggests the contrary, i.e. that sonorants do participate in laryngeal coarticulation. At the same time, the tendency for the whole cluster to devoice is not absolute, as there were cases in the dataset of phonetically devoiced obstruents followed by phonetically voiced sonorants. This observation challenges the argument that sonorant devoicing is a phonetic manifestation of their phonological transparency, as proposed by Gussmann (1992) (cf. discussion in Section 1). Instead, it appears that cases of pre-sonorant devoicing really involve two types of situations conditioned by conspiring, but separate processes. The first process is laryngeal coarticulation which affects the entire cluster. The other process involves surface obstruent devoicing (for instance due to a rise in supraglottal air pressure during stricture) even though the vocal fold vibration then resumes during the following sonorant.

There is no direct link between any of the attested effects and the visibility of sonorants to laryngeal processes. For instance, while the high intraoral pressure associated with frication might impede the production of glottal pulsing, it does not in any obvious way determine why the following sonorant should allow assimilation. Similarly, there is nothing inherent about sonorant transparency that would predict any of the previously discussed significant effects on the realisation of glottal pulsing. Thus, a formal model positing that word-final sonorants following fricatives are optionally transparent departs from a well motivated phonetic relationship towards an arbitrary interaction without gaining any additional explanatory or predictive power.

All in all, it appears that the realisation of glottal pulsing in Polish obstruent+sonorant clusters results from a complex interaction of the following factors: the underlying voicing value of the obstruent, the phonological rule of word-final obstruent devoicing, gestural coordination of the state of the glottis in time, aerodynamic pressures on the maintenance of voicing during obstruent articulation, and perhaps even lexical influence evidenced by increased contrast in shorter words. All of these involve interactions of phonology, phonetics and the lexicon in the realisation of glottal pulsing in an obstruent, and are most accurately analysed as just that. By simply pooling together all the apparent cases of putative laryngeal neutralisation in word-final obstruent+sonorant clusters, and applying the label ‘transparency’ to them, one gains no insight into the phenomena: no predictions follow. Instead the relevant factors are counterintuitively obscured.

To sum up, even for the subset of the data that appear consistent with the sonorant transparency hypothesis, the theory does not predict the kind of variation we find. While positing a sonorant transparency rule allows one to generalise over a number of different cases, including voicing and devoicing assimilation, as well as final devoicing, it is a case of ill-conceived parsimony, as the generalisation obscures relevant aspects of when the purported transparency occurs. In comparison, a hybrid model of multi-level phonological, articulatory, aerodynamic and lexical influences on phonetic variation associated with laryngeal processes is better suited to deal with the wide array of laryngeal effects associated with pre-sonorant obstruents, and it does so without positing an additional phonological phenomenon of transparency, for which there is no direct evidence.

⁵ A reviewer suggests an alternative explanation for the word size effects observed in this study, which has to do with differences in phonetic duration conditioned by the length of the word. For instance, segments tend to be phonetically longer in shorter words. Vowel lengthening could potentially facilitate voiced percepts of the following obstruent, counteracting perception-driven devoicing. On the other hand, lengthening of the obstruent itself is in some ways likely to trigger devoicing, as vocal fold vibration is aerodynamically counteracted through prolonged stricture.

4.3 Some phonological consequences

The finding that pre-sonorant obstruents tend to retain their underlying voice specification has far-reaching theoretical consequences, as it undermines several analyses of voicing in Polish. First, it undermines syllable-based analyses of final devoicing in Polish, as proposed by Bethin (1984) and Gussmann (1992). The analyses by the two authors exclude voiced obstruents from surfacing in codas. This prediction is falsified by the consistent production of voiced obstruents in the pre-sonorant position by speakers in Experiments 1 and 2. One might ask whether word-final sonorants in obstruent+sonorant clusters might be syllabic, in which case the preceding obstruent would be syllabified into an onset. However, positing an extra syllable in words with final OS clusters goes against native intuitions about the number of syllables in such words. Such intuitions are further corroborated by the behaviour of word stress. Polish has a productive pattern of penultimate stress, as exemplified by the alternations in (9).

(9) Penultimate stress in Polish
    ["rO.vEr]       ‘bicycle’
    [rO."vE.r1]     ‘bicycle, Nom. pl.’
    [rO.vE."rA.mi]  ‘bicycle, Inst. pl.’

Since stress shifts to the penultimate syllable in morphophonological alternations, it creates a test for syllabicity. Word-final sonority-violating sonorants do not cause a stress shift, and thus they cannot be analysed as syllabic.

(10) Penultimate stress in Polish
    ["u.lEgw]      ‘he gave in’   *[u."lE.gw]
    [mE."xA.ñizm]  ‘mechanism’    *[mE.xA."ñi.zm]
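The stress test in (9) and (10) can be stated procedurally. The following Python sketch assumes a syllabified transcription in which syllables are separated by full stops and stress is marked by a double quote before the stressed syllable, as in the examples above. The function name is hypothetical. The sketch places stress on the penultimate syllable and shows that parsing the final sonorant as a separate syllable would predict the unattested starred forms.

```python
def assign_penultimate_stress(syllabified):
    """Mark stress (") on the penultimate syllable of a dot-separated form.

    Monosyllables receive stress on their only syllable.
    """
    syllables = syllabified.split(".")
    target = max(len(syllables) - 2, 0)
    syllables[target] = '"' + syllables[target]
    return ".".join(syllables)


if __name__ == "__main__":
    # Regular alternation from (9): stress moves as syllables are added.
    for form in ("rO.vEr", "rO.vE.r1", "rO.vE.rA.mi"):
        print(assign_penultimate_stress(form))

    # Forms from (10): non-syllabic versus syllabic parse of the final sonorant.
    for non_syllabic, syllabic in (("u.lEgw", "u.lE.gw"), ("mE.xA.ñizm", "mE.xA.ñi.zm")):
        print(assign_penultimate_stress(non_syllabic), "(attested)")
        print(assign_penultimate_stress(syllabic), "(unattested)")
```

Only the parse in which the final sonorant does not form its own syllable yields the attested stress placement, which is the basis for the argument that follows.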

If word-final sonorants cannot form a syllable nucleus, the preceding obstruents must be codas, not onsets. And since those obstruents can be realised as voiced, syllable-conditioned rules make wrong empirical predictions for Polish voicing.

A question arises as to how such a discrepancy could come about between the data on transparency found in the descriptive and phonological literature and the data produced by the participants in the two experiments presented in this paper. The earliest reference on sonorant transparency that I have been able to trace is Benni (1959). Benni’s generalisation has since been confirmed by a number of authors, but all the reports appear to have been based on introspective data and/or auditory transcriptions. Neither of these methods is well suited to deal with variable data, and voicing in word-final stop+sonorant clusters does involve a considerable degree of inter- and intra-speaker variation. Failure to perceive this kind of variation might result in generalisations which involve possible pronunciations, but which are only representative of a subset of the data. Apart from variation, diverging reports concerning sonorant transparency might potentially be due to dialectal differences. However, as the participants in the study were speakers of standard Polish with no discernible regional features, their results certainly go against grammatical descriptions which focus on the standard variety.

Another issue that arises in relation to the current findings is the general validity of the theoretical notion of sonorant transparency. I have argued that sonorant transparency is not empirically supported for Polish. However, as Polish had previously been cited as the key case in support of the hypothesis that sonorants may be transparent to voicing, the present results undermine the status of transparency in a broader cross-linguistic perspective. The other potential transparency case is Russian; however, early reports of Russian sonorant transparency to voice assimilation were disputed once experimental data had been obtained (Robblee & Burton, 1997; Padgett, 2002, 2012; Kulikov, 2010). In the light of the combined experimental results from the current study and the cited studies on Russian, the question arises whether voicing processes ever operate across an intervening sonorant. A positive answer to this question still awaits a convincingly documented case.

5 Conclusion

The argument put forward in this paper is that sonorant transparency is not a part of the Polish grammar. The majority trend in the phonetic realisation of voicing in word-final obstruent+sonorant clusters is to preserve the underlying voicing value of the obstruent. Although some exceptions to this tendency can be found, they can be understood as reflecting phonetic influences which oppose vocal fold vibration in some cases, and coarticulatory mechanisms which may yield surface voicing and devoicing patterns. These two phenomena may also be modulated by factors such as prosodic boundary effects. Treating the exceptional voicing and devoicing cases as forming a coherent phenomenon of sonorant transparency is empirically inadequate, as the generalisation struggles to make any predictions with respect to how much voicing is produced by Polish speakers and under what circumstances. Re-analysing transparency in terms of multiple influences on vocal fold vibration is also of consequence to formal approaches to Polish coda voicing, as it challenges the analyses which treat Polish final devoicing as a syllable-level phenomenon.

References

Baayen, R. H. (2011). languageR: Data sets and functions with “Analyzing Linguistic Data: A practical introduction to statistics”. R package version 1.2.

Baer, T. (1975). Investigation of phonation using excised larynxes. Ph.D. thesis MIT.

Barry, S. (1988). Temporal aspects of the devoicing of word-final obstruents in Russian. In J. N. Holmes, & W. A. Ainsworth (Eds.), Speech’88. (Proceedings of the Federation of Acoustical Societies of Europe, August 1988) (pp. 81–88). Edinburgh: Institute of Acoustics.

Bates, D., & Maechler, M. (2009). lme4: Linear mixed-effects models using S4 classes. R package version 0.999375-32.

Becker, M., & Nevins, A. (2009). Initial-syllable faithfulness as the best model of word-size effects in alternations. Handout from presentation at NELS 40, MIT.

Benni, T. (1959). Fonetyka opisowa języka polskiego. Wrocław: Zakład Narodowy im. Ossolińskich.

Bethin, C. Y. (1984). Voicing assimilation in Polish. International Journal of Slavic Linguistics and Poetics, 29, 17–32.

Boersma, P., & Weenink, D. (2010). Praat: doing phonetics by computer [Computer programme]. Version 5.1.12, retrieved 15 October 2009 from http://www.praat.org/.

Castellví-Vives, J. (2003). Neutralisation and transparency effects in consonant clusters in Polish. In Proceedings of the 15th ICPhS Barcelona.

Charles-Luce, J. (1985). Word-final devoicing in German: Effects of phonetic and sentential contexts. Journal of Phonetics, 13, 309–324.

Chen, M. (1970). Vowel length variation as a function of the voicing of the consonant environment. Phonetica, 22, 129–159.

Cho, T. (2004). Prosodically conditioned strengthening and vowel-to-vowel coarticulation in English. Journal of Phonetics, 32, 141–176.

Crystal, T. H., & House, A. S. (1988). A note on the durations of fricatives in American English. Journal of the Acoustical Society of America, 84, 1932–1935.

Demenko, G. (2000). Automatic analysis of phrase in Polish. Speech and Language Technology, 4, 13–22.

Dukiewicz, L., & Sawicka, I. (1995). Fonetyka i fonologia. Kraków: Wydawnictwo Instytutu Języka Polskiego PAN.

Fischer-Jørgensen, E. (1954). Acoustic analysis of stop consonants. Miscellanea Phonetica, 2, 42–59.

Forrez, G. (1966). Relevante parameters van de stemhebbende fricatief /z/. Inst. Perceptie Onderzoek Verslag (Eindhoven).

Fourakis, M., & Iverson, G. (1984). On the ‘incomplete neutralization’ of German final obstruents. Phonetica, 41, 140–149.

Francuzik, K., Karpiński, M., & Kleśta, J. (2002). A preliminary study of the intonational phrase, nuclear melody and pauses in Polish semi-spontaneous narration. In Proceedings of Speech Prosody 2002.

Gussmann, E. (1992). Resyllabification and delinking: the case of Polish voicing. Linguistic Inquiry, 23, 29–56.

Hartigan, J. A., & Wong, M. A. (1979). A k-means clustering algorithm. Applied Statistics, 28, 100–108.

House, A. S., & Fairbanks, G. (1953). The influence of consonant environment upon the secondary acoustical characteristics of vowels. Journal of the Acoustical Society of America, 25, 105–113.

Jakobson, R. (1978). Mutual assimilation of Russian voiced and voiceless consonants. Studia Linguistica, 32, 107–110.

Jassem, W., & Richter, L. (1989). Neutralisation of voicing in Polish obstruents. Journal of Phonetics, 17, 317–325.

Karaś, M., & Madejowa, M. (1977). Słownik wymowy polskiej (The Dictionary of Polish pronunciation). Warszawa: PWN.

Keating, P. A. (1980). A Phonetic Study of a Voicing Contrast in Polish. Ph.D. thesis Brown University.

Keating, P. A. (1984). Physiological effects on stop consonant voicing. UCLA Working Papers in Phonetics, 59, 29–34.

Kingston, J., & Diehl, R. L. (1994). Phonetic knowledge. Language, 70, 419–454.

Kluender, K. R., Diehl, R. L., & Wright, B. A. (1988). Vowel-length differences before voiced and voiceless consonants: An auditory explanation. Journal of Phonetics, 16, 153–169.

Kulikov, V. (2010). Phonetics and phonology of voice assimilation and sonorant transparency in normal and fast speech in Russian. Ms, University of Iowa.

Kuzla, C., Cho, T., & Ernestus, M. (2007). Prosodic strengthening of German fricatives in duration and assimilatory devoicing. Journal of Phonetics, 35, 301–320.

Ladefoged, P., & Maddieson, I. (1996). The Sounds of the World’s Languages. Cambridge, MA: Blackwell.

Ohala, J. J. (1983). The origin of sound patterns in vocal tract constraints. In P. MacNeilage (Ed.), The Production of Speech. New York: Springer.

Ohala, J. J., & Solé, M.-J. (2010). Turbulence and phonology. In S. Fuchs, M. Toda, & M. Zygis (Eds.), Turbulent sounds. An interdisciplinary guide (pp. 37–97). Berlin: Mouton de Gruyter.

Ostaszewska, D., & Tambor, J. (2000). Fonetyka i fonologia współczesnego języka polskiego. Warszawa: PWN.

Padgett, J. (2002). Russian voicing assimilation, final devoicing, and the problem of [v] (or, the mouse that squeaked). Ms., University of California, Santa Cruz.

Padgett, J. (2012). The role of prosody in Russian voicing. In T. Borowsky, S. Kawahara, T. Shinya, & M. Sugahara (Eds.), Prosody Matters: Essays in Honor of Elisabeth Selkirk. Equinox.

Peterson, G. E., & Lehiste, I. (1960). Duration of syllable nuclei in English. Journal of the Acoustical Society of America, 32, 693–703.

Petrova, O., & Szentgyörgyi, S. (2004). /v/ and voice assimilation in Hungarian and Russian. Folia linguistica, 38, 87–116.

Pierrehumbert, J. B. (1980). The phonology and phonetics of English intonation. Ph.D. thesis MIT.

Port, R. F., & O’Dell, M. L. (1985). Neutralization of syllable-final voicing in German. Journal of Phonetics, 13, 455–471.

Prince, A., & Smolensky, P. (2004 [1993]). Constraint interaction in generative grammar. Malden, MA, and Oxford, UK: Blackwell.

R Development Core Team (2005). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.

Robblee, K. E., & Burton, M. W. (1997). Sonorant voicing transparency in Russian. In W. Browne, & D. Zec (Eds.), Proceedings of Formal Approaches to Slavic Linguistics (pp. 407–434). Ann Arbor: Michigan Slavic Publications.

Rubach, J. (1996). Nonsyllabic analysis of voice assimilation in Polish. Linguistic Inquiry, 27, 69–110.

Rubach, J. (2008). Prevocalic faithfulness. Phonology, 25, 433–468.

Rubach, J., & Booij, G. (1990). Edge of constituent effects in Polish. Natural Language and Linguistic Theory, 8, 427–463.

Scheer, T. (2004). Lateral Theory of Phonology: What is CVCV, and why should it be?. Berlin, New York: Mouton de Gruyter.

Scobbie, J. M. (2005). The phonetics-phonology overlap. QMUC Speech Science Research Centre Working Papers, 1.

Shapiro, M. (1993). Russian non-distinctive voicing: a stocktaking. Russian Linguistics, 17, 1–14.

Shattuck-Hufnagel, S., & Turk, A. E. (1996). A prosody tutorial for investigators of auditory sentence processing. Journal of psycholinguistic research, 25, 193–247.

Slis, I. H., & Cohen, A. (1969). On the complex regulating the voiced-voiceless distinction I. Language and Speech, 12, 80–102.

Slowiaczek, L., & Dinnsen, D. A. (1985). On the neutralizing status of Polish word-final devoicing. Journal of Phonetics, 13, 325–341.

Solé, M.-J. (2007). Controlled and mechanical properties in speech: a review of the literature. In M.-J. Solé, P. S. Beddor, & M. Ohala (Eds.), Experimental approaches to phonology (pp. 302–321). New York: Oxford University Press.

Steriade, D. (1999). Phonetics in phonology: the case of laryngeal neutralization. In M. Gordon (Ed.), Papers in Phonology 3 (UCLA Working Papers in Linguistics 2) (pp. 25–145). Los Angeles: Department of Linguistics, University of California.

Stevens, K. (1998). Acoustic Phonetics. Cambridge, MA: MIT Press.

Stevens, K. N., Blumstein, S. E., Glicksman, L., Burton, M., & Kurowski, K. (1992). Acoustic and perceptual characteristics of voicing in fricatives and fricative clusters. Journal of the Acoustical Society of America, 91, 2979–3000.

Tucker, B. V., & Warner, N. (2010). What it means to be phonetic or phonological: the case of Romanian devoiced nasals. Phonology, 27, 289–324.

Wedel, A. (2002). Phonological alternation, lexical neighborhood density and markedness in processing. Handout from presentation at LabPhon8, Yale University.

Westbury, J. R., & Keating, P. A. (1986). On the naturalness of stop consonant voicing. Journal of Linguistics, 22, 145–166.

Wierzchowska, B. (1980). Fonetyka i fonologia języka polskiego. Wrocław: Zakład Narodowy im. Ossolińskich.

Appendix

Test items used in experiment 1
/tr/  wiatr   ‘wind’
/pr/  Cypr    ‘Cyprus’
/pr/  Dniepr  ‘The Dnieper River’
/tw/  zmiótł  ‘wiped out, 3p., sg.’
/kw/  uciekł  ‘escaped, 3p., sg.’
/dr/  kadr    ‘personnel, Gen. pl.’
/br/  żubr    ‘bison’
/br/  kobr    ‘cobra snakes, Gen.’
/dw/  schudł  ‘lost weight, 3p., sg.’
/gw/  uległ   ‘gave in, 3p., sg.’

Control items used in experiment 1
/d#m/  pośród miasta  ‘amidst the city’
/d#n/  poród naturalny  ‘natural delivery’
/d#j/  przeszkód jeździeckich  ‘show jumping obstacles, Gen.’
/d#r/  rozwód rodziców  ‘parents’ divorce’
/d#l/  powód lęków  ‘reason for anxieties’
/d#w/  zawód łowcy  ‘hunter’s profession’
/d#o/  swobód obywatelskich  ‘civic rights, Gen.’
/t#m/  przewrót majowy  ‘The May Coup d’État’
/t#n/  przerzut narkotyków  ‘illegal drug transfer’
/t#j/  zarzut jest  ‘an objection is’
/t#r/  walut Rosji  ‘currencies of Russia, Gen. pl.’
/t#l/  nawrót lęku  ‘return of anxiety, Gen.’
/t#w/  debiut łódzkiego (dokumentalisty)  ‘debut of a Łódź-based documentary-maker’
/t#o/  statut określa  ‘charter determines’

Test items used in experiment 2. Condition 1
/br/  żubr  ‘bison’
/br/  kobr  ‘cobra snakes, Gen.’
/dr/  kadr  ‘personnel, Gen.’
/dm/  wydm  ‘dunes, Gen.’
/zm/  pryzm  ‘heap, Gen.’
/dw/  schudł  ‘lost weight, 3p. Masc.’
/dr/  katedr  ‘cathedrals, Gen.’
/gw/  uległ  ‘succumbed, 3p. Masc.’
/dw/  napadł  ‘attacked, 3p. Masc.’
/gw/  pomógł  ‘helped, 3p. Masc.’
/zm/  mechanizm  ‘mechanism’
/pr/  Cypr  ‘Cyprus’
/pr/  Dniepr  ‘The Dnieper River’
/tr/  wiatr  ‘wind’
/tm/  rytm  ‘rhythm’
/sm/  pasm  ‘streaks, Gen.’
/tw/  zmiótł  ‘wiped (out), 3p. Masc.’
/tr/  teatr  ‘theatre’
/kw/  uciekł  ‘escaped, 3p. Masc.’
/tw/  przygniótł  ‘crushed, 3p. Masc.’
/kw/  przywlókł  ‘dragged in, 3p. Masc.’
/sm/  czasopism  ‘magazines, Gen.’

Test items used in experiment 2. Condition 2
/br#C/  żubr siedział  ‘the bison was sitting’
/br#k/  kobr królewskich  ‘king cobras, Gen.’
/dr#f/  kadr filmowy  ‘film frame’
/dm#p/  wydm piaskowych  ‘sand dunes, Gen.’
/zm#C/  pryzm śniegu  ‘heaps of snow, Gen.’
/dw#k/  schudł kilogram  ‘lost a kilo, 3p. Masc.’
/dr#C/  katedr świata  ‘cathedrals of the world, Gen.’
/gw#p/  uległ presji  ‘succumbed to the pressure, 3p. Masc.’
/dw#k/  napadł kobietę  ‘attacked a woman, 3p. Masc.’
/gw#C/  pomógł siostrze  ‘helped a sister, 3p. Masc.’
/zm#f/  mechanizm finansowy  ‘financial mechanism’
/pr#v/  Cypr wiosną  ‘Cyprus in the spring’
/pr#v/  Dniepr wylał  ‘The Dnieper overflowed its banks’
/tr#v/  wiatr wiał  ‘wind was blowing’
/tm#v/  rytm walca  ‘waltz rhythm’
/sm#v/  pasm wieczornych  ‘evening programmes, Gen.’
/tw#d/  zmiótł dach  ‘blew the roof off, 3p. Masc.’
/tr#v/  Teatr Wybrzeże  ‘The Coast Theatre’
/kw#v/  uciekł władzom  ‘escaped from the authorities, 3p. Masc.’
/tw#g/  przygniótł górnika  ‘crushed a miner, 3p. Masc.’
/kw#b/  przywlókł bakterie  ‘dragged in bacteria, 3p. Masc.’
/sm#z/  czasopism zagranicznych  ‘foreign magazines, Gen.’

Test items used in experiment 2. Condition 3
/Z#mC/  żołnierz mściwy  ‘vengeful soldier’
/z#wk/  obraz łkającego dziecka  ‘the sight of a weeping child’
/v#rt/  termometrów rtęciowych  ‘mercury thermometers, Gen.’
/g#mS/  plag mszyc  ‘plagues of aphids, Gen.’
/k#mg/  brak mgły  ‘lack of fog’
/k#wg/  stek łgarstw  ‘a bunch of lies’
/C#lZ/  komuś lżej  ‘easier for somebody’
/t#rd/  kwiat rdestu  ‘water pepper flower’

[Figure 11: four panels plotting voicing ratio against duration of sonorant voicing (ms), word size (disyllabic, monosyllabic, trisyllabic), obstruent manner (fricative, stop), and following pause (absent, present).]

Figure 11: Effects plot for the linear mixed-effects model predicting the voicing ratio in underlyingly voiced obstruents from Conditions 1 and 2.

[Figure 12: three panels plotting glottal pulsing duration (ms) against duration of sonorant voicing (ms), following pause (absent, present), and boundary tone (absent, present).]

Figure 12: Effects plot for the linear mixed-effects model predicting the duration of glottal pulsing in underlyingly voiced obstruents from Conditions 1 and 2.

[Figure 13: one panel plotting voicing ratio against obstruent duration.]

Figure 13: Effects plot for the linear mixed-effects model predicting the voicing ratio in underlyingly voiceless obstruents from Conditions 2 and 3.
