Running head: Paradigmatic and syntagmatic effects in Estonian spontaneous speech

Paradigmatic and syntagmatic effects in Estonian spontaneous speech

Kaidi Lõo

University of Tartu, Estonia

Fabian Tomaschek

University of Tübingen, Germany

Pärtel Lippus

University of Tartu, Estonia

Benjamin V. Tucker

University of Alberta, Canada

University of Tübingen, Germany

Word count: 4909

Version: February, 2021

Corresponding author:

Kaidi Lõo

Institute of Estonian and General Linguistics, University of Tartu

Jakobi 2-405, 50090 Tartu, Estonia

e-mail: [email protected]

Abstract

Recent evidence indicates that the members of a word’s morphological family and inflectional paradigm are activated when we produce words. These paradigmatic effects have previously been studied in careful laboratory contexts using words in isolation, and this research has not investigated how linguistic context affects spontaneous speech production. The current corpus analysis investigates paradigmatic and syntagmatic effects in Estonian spontaneous speech. Following related work on English, we focus on morphemic and non-morphemic word-final /-s/ in content words. We report that linguistic context, as measured by conditional probability, has the strongest effect on acoustic durations, while inflectional properties (internal structure and inflectional paradigm size) also affect word and segment durations. These results indicate that morphology is part of a complex system that interacts with other aspects of the production system.

Keywords: morphological complexity, inflection, paradigm size, conditional probability, acoustic durations.

1 Introduction

Most investigations of paradigmatic effects on speech production have used experimental methods (e.g., Lõo et al., 2018b, and references therein). However, very few researchers have used actual spontaneous speech production data or considered the effect of context on these productions. Further, many of these studies have investigated languages that are, relatively speaking, morphologically impoverished (see Strycharczuk, 2019, for a recent overview). What happens when we investigate a morphologically complex language like Estonian? By way of example, English, an Indo-European language, arguably has two noun cases (Quirk et al., 1985), German (also Indo-European) has four (Drosdowski and Eisenberg, 1998), and Serbian (also Indo-European) has five (Hamm, 1981). Estonian is a Finno-Ugric language known for its complex morphology. Each noun and adjective in Estonian has 14 cases in both singular and plural (Erelt, 2003). Therefore, the probability that one encounters all of the inflected forms of a word is relatively high in English, whereas in Estonian it is relatively low. This distributional difference may imply differences in how Estonian paradigms are represented and processed during speech production. The present study investigates the phonetic effects of morphological complexity in Estonian and how they are realized in spontaneous speech.

By investigating phonetic effects at the word and segment level, we obtain a broader picture of the interaction between a word’s morphological structure, its paradigmatic family and the context in which it is located. In the next subsection, we describe some of the previous literature investigating morpho-phonetic interactions. Then we describe two phonetic analyses performed using a corpus of Estonian spontaneous speech. We conclude this paper with a discussion of the implications of our findings for theories of speech production.

1.1 Morphological effects

Two types of morphological information affect phonetic realizations of complex words: internal structure (Cohen, 2014, 2015; Kuperman et al., 2007; Plag et al., 2017) and paradigmatic relations (Bell et al., 2020; Cohen, 2014, 2015; Lõo et al., 2018b). Regarding a word’s internal structure, simplex and complex words are articulated differently even if they consist of phonologically identical segments. Recent research on phonetic properties of the speech signal (Plag et al., 2017; Seyfarth et al., 2018; Tomaschek et al., 2019; Zimmermann, 2016) has reported systematic duration differences between morphemic and non-morphemic word-final /-s/, as well as differences between different inflectional functions of morphemic /-s/.

A relevant body of literature not often discussed in the morphology literature is research into the strengthening of prosodic domains. This research has shown that articulatory processes differ depending on the strength and level of prosodic boundaries. Gestures located at a prosodic boundary are lengthened and show an increased amount of articulatory contact (Cho, 2004; Keating et al., 2003; Keating, 2006; Krakow, 1999).

Regarding effects of word-internal structure, gestural coordination has been shown to differ depending on the position of the gestures in the word. Articulatory gestures show less overlap word-initially than across words (Tiede et al., 2007) and word-medially (Gafos et al., 2010). These effects extend to word-internal boundaries. For example, Cho (2001) observed that the overlap and variability in gestural coordination during the articulation of consonant clusters was larger when the consonants were located at morpheme boundaries than when they were within a morpheme. Lee-Kim et al. (2013) found that the ‘darkness’ of English /-l/ depends on its morphological status. Accordingly, Plag et al. (2017) discuss the differences between morphemic and non-morphemic /-s/ in English as a potential effect of prosodic boundaries.

There is also evidence that informativity modulates word-internal phonetic and phonological detail (e.g., Pluymaekers et al., 2010; Torreira and Ernestus, 2012; Wedel et al., 2019). For example, word-final /-s/ in Spanish becomes voiced when it is intervocalic. Torreira and Ernestus (2012) showed that voicing depends on predictability. They found that /-s/ suffixes in predictable morphosyntactic contexts were more likely to lose voicing than other word-final /-s/. Similar findings have been reported for /-s/ segments in longer affixes. For example, Smith et al. (2012) showed that transparent affixes (e.g., mis in misbehave) are pronounced longer than pseudo-affixes (e.g., mis in mistake) (see also Baker et al., 2007). Kemps et al. (2005) showed that stems are longer when they are not followed by an affix (i.e., keep alone vs. keep in keeper).

In summary, there is evidence that word-internal and word-external prosodic structure systematically modulates fine phonetic detail in articulation. Next, we turn our attention to effects of paradigmatic relations.

Paradigmatic relations have been assessed with various measures. For example, Hay (2003) used the frequency of inflected forms relative to the uninflected form (e.g., swiftly, swifter, swiftest vs. swift). Hay found that the consonant at the boundary between the stem and the affix was more likely to be deleted when the inflected forms were more frequent than the uninflected form. A similar finding was reported by Schuppler et al. (2012) for word-final /-t/ suffixes in Dutch verbs in relation to the frequency of the inflected and uninflected stems. These effects have been interpreted as indicating that inflected verbs are composed from smaller units during the cognitive preparation stage. However, they can also be taken to indicate that the frequency of whole word forms in relation to other word forms within a paradigm co-determines the articulation process. This assumption is supported by studies that assess paradigmatic structure by means of a word form’s frequency relative to the cumulative frequency of the entire paradigm.

Kuperman et al. (2007) investigated Dutch compound interfixes. They gauged paradigmatic relations by using the interfixes’ probability given the constituents of the compound in which they were located to predict their acoustic duration. Controlling for the uncertainty following the interfix, they found that interfixes were longer when they were more probable, and proposed the ‘Paradigmatic Enhancement Hypothesis’.

Using a compound elicitation task in English, Bell et al. (2020) reported that consonants located at the internal boundary were longer when the family size of the second compound was smaller. They interpret smaller family size as equivalent to smaller paradigmatic uncertainty. Thus, smaller uncertainty correlated with longer durations.

These enhancement findings are the opposite of what many other researchers would predict as a result of the effects of probability (e.g., Aylett and Turk, 2004; Bell et al., 2003). Nevertheless, the ‘Paradigmatic Enhancement Hypothesis’ has been replicated in several instances. While Kuperman et al. investigated paradigmatic relations of compounds, Cohen (2014) shifted the perspective to inflectional relations among verbs. Cohen (2014) found that the final /-s/ in third-person singular English verbs (e.g., looks) was longer when the singular form was more frequent than the plural form (e.g., look). In another study, Cohen (2015) looked at vowel suffixes in Russian and found that higher paradigmatic probability led to both reduced and enhanced aspects of their articulation. Tucker et al. (2019) reported a similar finding for the modulation of the duration of stem vowels in regular and irregular English verbs. A similar result was reported for articulatory patterns of stem vowels in English (Tomaschek et al., 2021a). Tomaschek et al. (2019) further suggested that durational differences between different morphemic /-s/ indicate functional certainty, with segment durations lengthened under higher functional certainty but shortened under functional uncertainty.

All these effects have been shown for languages with relatively simple morphology, where it is likely that all members of a paradigm have a lexical representation. However, there is evidence that effects of paradigmatic relations also emerge in morphologically complex languages. Lõo et al. (2018b) investigated effects of inflectional paradigm size on production latencies and acoustic durations in a word naming experiment in Estonian. They treat paradigm size (calculated from a large language corpus) as a measure reflecting actual language use. For example, a large-paradigm word like jalg ‘foot’ often appears in the language (as measured in a corpus) in most of its possible inflected forms, such as jalad ‘feet’, jalgadel ‘on the feet’, jalus ‘in your way’, jalas ‘in the foot’ and jalaga ‘with the foot’. However, small-paradigm words like tainas ‘dough’ are often restricted in language use to a few of the possible inflected forms, such as tainas and taina, likely due in part to the semantic characteristics of the word. In a word naming task, Lõo et al. (2018b) found that inflected nouns from larger paradigms (jalg) were articulated with shorter latencies and acoustic durations than nouns from smaller paradigms (tainas) (see also Lõo et al., 2018a, for similar findings in comprehension).

In summary, internal structure as well as paradigmatic structure have been shown to correlate with phonetic realizations of complex words. Their interaction with contextual predictability is less well understood.

1.2 Context and frequency effects

It has been well established that frequent words are articulated with shorter durations (Whalen, 1991; Wright, 1979). While models of speech production assume that whole-word properties of morphologically complex words should not co-determine production characteristics (Levelt et al., 1999; Roelofs, 1997), numerous studies challenge this assumption. For example, Caselli et al. (2016) studied frequency effects in English conversational speech. They found that acoustic durations of inflected forms decreased with increasing whole-word frequency. This effect was present even when stem frequency was controlled for. Plag et al. (2020) found that whole-word frequency reduced acoustic durations of complex words in English, even when stem frequency was equal. For Estonian, Lõo et al. (2018b) found that more frequent inflected forms were articulated with shorter duration in a word naming task. These effects concern the word’s frequency, which from a probabilistic perspective can be conceptualized as its a priori probability of occurring in a sentence. In addition, it is well established that contextual predictability in a sentence further modulates a word’s phonetic realization (Aylett and Turk, 2004; Bell et al., 2009, 2003; Bybee, 2001; Gregory et al., 1999; Jurafsky et al., 2002). One of the most common ways to quantify this phenomenon is the conditional probability of a target word given the previous or the next word (Bell et al., 2009, 2003). For example, Bell et al. (2009) show that content words that are more probable given the next word, as well as function words that are more probable given both the previous and the next word, are produced with shorter acoustic duration than less probable words.

Clearly, changes in the characteristics of the whole word will also correlate with changes in the characteristics of subparts of the word, such as stems and affixes. For example, Cohen (2014) reported that the third person singular /-s/ in English is pronounced with a shorter duration when it is contextually more probable. Tang and Bennett (2018) reported that the contextual predictability of certain prefixes in Kaqchikel, a morphologically complex language, affects their duration. However, Torreira and Ernestus (2012) reported no effects of conditional probability in their study of Spanish spontaneous speech.

1.3 The current study

In the current study, we investigate the interaction between contextual predictability and paradigmatic structure and its effects on the phonetic characteristics of whole words and affixes in Estonian, a highly morphologically complex language. Following the sound segment investigated in preceding studies (Plag et al., 2017; Seyfarth et al., 2018), we study word-final /-s/, which occurs in morphemic and non-morphemic positions in Estonian.

The average uncertainty about inflectional functions is smaller in languages with smaller inflectional paradigms than in languages with larger inflectional paradigms. This is why we predict that morphological structure will play a smaller role in Estonian than it does in English for example.

Following previous experimental work on paradigm size (Lõo et al., 2018a,b), we predict that inflectional paradigm size will be an important predictor of acoustic durations also in spontaneous speech. Nevertheless, since we take context into account, the effect might be smaller than in previous studies of isolated words (Bell et al., 2020; Cohen, 2014). Before introducing the results of the analyses, we describe the method used and our modeling strategy.

2 Methods

2.1 Word material

The material for the present study was extracted from the Phonetic Corpus of Estonian Spontaneous Speech (Lippus, 2020). The corpus consists of 90 face-to-face spontaneous conversations between two speakers who were acquainted with each other. It contains approximately 80 hours of speech, or 0.5 million word tokens. Each conversation is approximately 30 minutes long. A total of 143 different speakers were recorded (some of them more than once): 78 female and 65 male speakers (mean age = 41.20 years, range 20 to 85 years). The conversations were recorded in a sound booth or in a quiet room. Words, segments and their boundaries were manually annotated in Praat (Boersma and Weenink, 2021). The corpus is tagged with the following information: 1) the pronunciation in SAMPA transcription, 2) the orthographic spelling of the canonical word form, 3) part-of-speech and morphological information for the form, and 4) syllable information.

For the analysis, we extracted all content words (nouns, verbs and adjectives) that end in /-s/ from the corpus. We excluded proper names and compounds. To control for the phonological environment, we only included two syllable words where the word final segment was a short voiceless or voiced /-s/ preceded by a vowel.

2.2 Modelling strategy

Model fitting was conducted using generalized additive mixed models (GAMM; Wood, 2006) as implemented in the R package mgcv. The residuals of the final models were checked and were normally distributed. Pairwise model comparisons were conducted using the fREML and AIC scores provided by the compareML function in the R package itsadug to select the best predictor among collinear predictors in the analysis.

To inspect collinearity in our models, we calculated the Spearman rank correlation between our numeric predictors and control variables. Table 1 shows that inflectional paradigm size, lemma frequency and whole-word frequency were highly correlated with each other (correlation coefficients above 0.6). High collinearity may result in false positives and false negatives (Tomaschek et al., 2018). Accordingly, to avoid collinearity issues, these predictors were tested in separate models. Final models were obtained by a forward model fitting approach while keeping the random effects fixed (see Baayen et al., 2017). Random effects comprised random intercepts for speakers, words and the manner of the first segment of the following word.
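As an illustration of the collinearity check, the following is a minimal Python sketch of a Spearman rank correlation, computed as the Pearson correlation of average ranks. The function names are hypothetical and it is not the code used in the study, which was carried out in R.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Applied to every pair of numeric variables, a function of this kind would yield a matrix like the one in Table 1. Pairs whose coefficient exceeds the chosen cut-off would then be tested in separate models.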

Words whose duration was more than 2.5 standard deviations above or below the mean word duration were removed from the dataset (121 data points). The final dataset consisted of 5793 word tokens. The following subsections summarize the dependent and independent variables considered in the modeling.
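The trimming criterion can be sketched as follows. This is a minimal Python illustration with a hypothetical function name. It assumes the sample standard deviation and a cut-off applied to the durations as given; the text does not say whether raw or log-transformed durations were used.

```python
from statistics import mean, stdev


def trim_outliers(durations, n_sd=2.5):
    """Keep durations within n_sd sample standard deviations of the mean."""
    centre = mean(durations)
    spread = stdev(durations)  # assumption: sample (n - 1) standard deviation
    lower = centre - n_sd * spread
    upper = centre + n_sd * spread
    return [d for d in durations if lower <= d <= upper]


# Invented toy data: twenty typical word durations and one extreme value (seconds).
toy_durations = [0.40] * 20 + [2.0]
kept = trim_outliers(toy_durations)
```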

Since we apply an exploratory approach to our data analysis, and to be more conservative about our interpretation of the effects, we set a p-value of 0.001 as the significance threshold for our study.

Table 1: Spearman correlation coefficients of the continuous variables. Pairs with values over 0.6 are marked with an asterisk.

             segm dur  word dur  freq    lemma freq  par size  cond prob  nr segments  age    speech rate
segm dur       1.00
word dur       0.58     1.00
freq          -0.10    -0.40     1.00
lemma freq    -0.12    -0.38     0.84*   1.00
par size      -0.12    -0.30     0.62*   0.89*       1.00
cond prob     -0.28    -0.37     0.26    0.25        0.21      1.00
nr segments   -0.05     0.29    -0.16   -0.15       -0.07     -0.08       1.00
age            0.13     0.17    -0.04    0.01        0.06     -0.02      -0.01         1.00
speech rate   -0.37    -0.52     0.11    0.09        0.04      0.11      -0.07        -0.18   1.00

2.3 Dependent variables

The dependent variables were the whole-word duration (Study 1) and the word-final /-s/ duration (Study 2) in seconds.

2.4 Predictors

We were interested in three types of predictors: inflectional function, frequency and paradigmatic predictors, and conditional probability.

Inflectional function: Initially, the word material was grouped into four categories according to the morphological function of the last segment: a) monomorphemic stems without a derivational or inflectional affix (N = 323, e.g., lammas ‘sheep’), b) derivational stems without an inflectional ending (N = 737, e.g., ilu-s ‘beautiful’), c) inflected nouns and adjectives in the inessive case (N = 2347, e.g., auto-s ‘in the car’), and d) inflected verbs in the third person singular past tense (N = 2266, e.g., uju-s ‘he/she swam’). To distinguish between category (a) and category (b), the online version of the Estonian word families dictionary (Vare, 2012) was consulted.

Pilot analyses showed no significant difference between the /-s/ segment durations of derivational and monomorphemic words (t = 0.33, p = 0.74). As a result, we collapsed the four categories into two based on their inflectional properties: a) inflected forms (N = 4653; words in which the final /-s/ is an inflectional ending) and b) uninflected forms (N = 1020; monomorphemic and derived words without an inflectional affix). Our predictor of interest was thus the binary category Inflected/Uninflected.

Frequency and inflectional paradigm size: Following the previous work on Estonian processing (Lõo et al., 2018a,b), we focused on the following lexical predictors: (1) Whole-word frequency captures the total number of occurrences of a particular form in the spontaneous speech corpus. Whole-word frequency varied between 1 and 400 (mean 99.03) for the words in our data set. (2) Lemma frequency is the cumulative frequency of a complete inflectional paradigm. Lemma frequency varied between 1 and 2242 (mean 405.1) for the words in our data set. (3) Inflectional paradigm size is the number of observed forms for a certain lemma. For example, for the lemma lammas ‘sheep’, the paradigm size was four: lammas ‘nom. sg.’, lambad ‘nom. pl.’, lammastele ‘pl. all.’, lambaid ‘pl. part.’. Paradigm size varied between 1 and 40 (mean 14.78) for the words in our data set.
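The three lexical predictors can be derived from a list of (word form, lemma) tokens, as in the following minimal Python sketch. The function name and the token counts in the toy data are invented for illustration; only the four forms of lammas come from the text. The sketch also assumes that distinct forms are identified by their orthographic shape, which may differ from how homonymous forms were treated in the study.

```python
from collections import Counter, defaultdict


def lexical_predictors(tokens):
    """Compute whole-word frequency, lemma frequency and paradigm size.

    tokens: iterable of (word_form, lemma) pairs, one per corpus token.
    """
    form_freq = Counter()
    lemma_freq = Counter()
    forms_per_lemma = defaultdict(set)
    for form, lemma in tokens:
        form_freq[(form, lemma)] += 1      # occurrences of this particular form
        lemma_freq[lemma] += 1             # cumulative frequency of the paradigm
        forms_per_lemma[lemma].add(form)   # distinct observed forms of the lemma
    paradigm_size = {lemma: len(forms) for lemma, forms in forms_per_lemma.items()}
    return form_freq, lemma_freq, paradigm_size


# Toy tokens (counts invented): four distinct forms of the lemma lammas.
toy_tokens = [
    ("lammas", "lammas"),
    ("lambad", "lammas"),
    ("lammastele", "lammas"),
    ("lambaid", "lammas"),
    ("lammas", "lammas"),
]
form_freq, lemma_freq, paradigm_size = lexical_predictors(toy_tokens)
```

With these toy tokens, the paradigm size of lammas is four, as in the example above, while its lemma frequency is the total number of tokens in the list.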

Conditional probability: Following the work by Bell et al. (2003) and Bell et al. (2009), we included conditional probability given the previous word and given the next word as predictors. (1) The conditional probability of a particular word $w_i$ given the previous word $w_{i-1}$ was estimated by counting the instances of the two words occurring together, $C(w_{i-1} w_i)$, in the Estonian spontaneous speech corpus and dividing this number by the total number of times the previous word occurred in the corpus, $C(w_{i-1})$:

$$P(w_i \mid w_{i-1}) = \frac{C(w_{i-1} w_i)}{C(w_{i-1})}$$

(2) The conditional probability of a particular word $w_i$ given the next word $w_{i+1}$ was estimated by counting the instances where the current and the next word occur together, $C(w_i w_{i+1})$, and dividing this number by the total number of times the next word occurred in the corpus, $C(w_{i+1})$:

$$P(w_i \mid w_{i+1}) = \frac{C(w_i w_{i+1})}{C(w_{i+1})}$$

The conditional probability given the previous word varied between 0.0000224 and 1 for our words (mean 0.153) and the conditional probability given the next word varied between 0.0000352 and 1 (mean 0.074). Like Bell et al. (2009), we found that only conditional probability given the next word was a significant predictor for content words.
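The two count-based estimates can be sketched in Python as follows. The function names and the toy utterances are invented for illustration. The sketch also assumes that word pairs are not counted across utterance boundaries, which the text does not specify.

```python
from collections import Counter


def conditional_probabilities(utterances):
    """Return estimators of P(w_i | w_{i-1}) and P(w_i | w_{i+1}) from counts.

    utterances: list of utterances, each a list of word tokens.
    """
    unigrams = Counter()
    bigrams = Counter()
    for words in utterances:
        unigrams.update(words)
        # Assumption: adjacent pairs are counted within an utterance only.
        bigrams.update(zip(words, words[1:]))

    def given_previous(word, previous):
        # C(w_{i-1} w_i) / C(w_{i-1})
        count = unigrams[previous]
        return bigrams[(previous, word)] / count if count else 0.0

    def given_next(word, following):
        # C(w_i w_{i+1}) / C(w_{i+1})
        count = unigrams[following]
        return bigrams[(word, following)] / count if count else 0.0

    return given_previous, given_next


# Invented toy corpus of three short utterances.
toy_utterances = [
    ["ta", "ujus", "kiiresti"],
    ["ta", "ujus", "eile"],
    ["ta", "tuli", "eile"],
]
p_given_previous, p_given_next = conditional_probabilities(toy_utterances)
```

In this toy corpus, ujus follows ta in two of the three occurrences of ta, and precedes eile in one of the two occurrences of eile. These counts are the numerators and denominators of the two formulas above.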

2.5 Control variables

Based on the previous research and the specifics of Estonian, we also included several control variables.

Number of segments: Words with more segments are articulated with longer duration. For segment duration, the finding has been the opposite: in longer words, each individual segment is shorter (Menzerath, 1954). In the current analysis, the number of segments was calculated by counting the number of segments in the canonical form.

Local speech rate: Previous research has shown that words spoken at a faster speech rate are produced with shorter durations (e.g., De Jong and Wempe, 2009, and citations therein). We calculated the local speech rate by dividing the number of syllables in an utterance by the duration of the utterance.
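As a minimal Python sketch of this measure (hypothetical function name, and assuming the utterance duration is given in seconds), the local speech rate is a simple ratio expressed in syllables per second:

```python
def local_speech_rate(n_syllables, utterance_duration_s):
    """Syllables per second for one utterance."""
    if utterance_duration_s <= 0:
        raise ValueError("utterance duration must be positive")
    return n_syllables / utterance_duration_s


# Invented example: a 12-syllable utterance lasting 2.4 seconds.
toy_rate = local_speech_rate(12, 2.4)
```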

Voicing of the /-s/: Voicing has been shown to correlate with duration (e.g., Klatt, 1976). While /-s/ is generally an unvoiced consonant in Estonian, it can become allophonically voiced in certain voiced contexts.

Quantity: Estonian is characterized by a unique three-way quantity distinction in the first syllable, which affects the duration of individual segments and syllables. Syllables ending in a short vowel are short, and syllables ending in a long vowel or a consonant are long. The contrastive length of vowels, consonants and thus syllables can mark differences in both the lexical meaning and the grammatical function of words (Asu and Teras, 2009). The following duration ratios have been proposed for two-syllable words: 2:3 for Q1, 3:2 for Q2 and 2:1 for Q3 (Lehiste, 1960). The first quantity is proposed to be the shortest and the third quantity the longest. In our dataset, 952 words were in the first quantity (kodus /kotus/ ‘home inessive’), 1902 in the second quantity (katus /kAttus/ ‘roof nominative’) and 2819 in the third quantity (kattes /kAt:tes/ ‘cover inessive’).

Manner: Klatt (1976) reported that the duration of segments is affected by the manner of articulation of the following segment. We added this information by coding the manner of articulation of the first segment of the following word as: none, , , nasal, , trill, vowel.

Part-of-speech: Previous research suggests that there might be durational differences within content words depending on whether the word is a noun or a verb (e.g., Gahl et al., 2012; Seyfarth, 2014). Our dataset consisted of 2906 nouns, 462 adjectives and 2305 verbs.

Disfluency and pauses following: Previous research suggests that segments and words before and after pauses are lengthened in spontaneous speech (Byrd et al., 2006; Klatt, 1976). In Estonian, word-final lengthening in speech has been shown, for example, by Mihkla et al. (2006) and Krull (1997); disfluencies have not been investigated. In the current study, pauses and disfluencies preceding the word did not significantly affect the duration and thus were not included. In our dataset, 540 words were followed by a pause, and 896 words were followed by a disfluency (e.g., uncompleted words or words spoken with laughter).

Age and gender of the speaker: Previous research indicates that males and younger speakers produce words with shorter duration than females and older speakers (Bell et al., 2009; Raymond et al., 2006; Simpson, 2009). Aare and Lippus (2017) studied gender and age effects in Estonian conversations and found that articulation rate was faster for male speakers, but found no age effect. The age and gender distribution in the current analysis is described in Section 2.1. In the next section, we report the results of our two analyses of word and segment duration.

3 Results

3.1 Study 1: Word Duration

In the first study, we investigated the role of inflectional function, inflectional paradigm size, lemma frequency, whole-word frequency and conditional probability in the acoustic duration of the whole word. We tested the contribution of each individual predictor against a model including the control variables and random effects, using the compareML function in the itsadug package. Lemma frequency and inflectional paradigm size were highly correlated with whole-word frequency (Table 1), and as a result they were not included in the same model. Model comparison showed that the model including whole-word frequency had lower fREML and AIC scores than the model with lemma frequency (fREML 3.77 lower, AIC 3.29 lower) or with inflectional paradigm size (fREML 0.37 lower, AIC 10.77 lower). We therefore selected whole-word frequency as the main predictor in our final model. As can also be seen from Table 2, including inflectional function improved the model fit only slightly, whereas including whole-word frequency and conditional probability improved it considerably more.

The summary of the final model for word duration is given in Table 3, and the effects of the nonlinear continuous variables are illustrated in Figure 1. The model contained random intercepts for speakers and words as well as for the manner of the first segment in the following word.

Number of segments, speech rate, age, and quantity emerged as significant control variables in the model; the effects were in the expected direction. As can be seen in Table 3, words with more segments were produced with longer duration than words with fewer segments. Words in the third quantity were produced longer than words in the first quantity. Words followed by a pause or a disfluency were produced with longer durations than words not followed by a pause or a disfluency. Finally, the lower panels of Figure 1 show that word durations increase with speaker age and decrease as speech rate increases.

Table 2: fREML scores representing model fits for the baseline model and for models including each individual predictor in the word duration analysis.

Model                        fREML-score
Baseline                     -1274.1
+Inflectional Function       -1277.4
+Whole Word Frequency        -1314
+Conditional Probability     -1336.7

Table 3: Summary of the GAMM model fitted to the log-transformed word duration. The significance threshold is set to 0.001; significant effects are marked with an asterisk.

A. parametric coefficients        Estimate   Std. Error   t-value     p-value
(Intercept)                       -1.4843    0.0398       -37.3303    < 0.0001*
Number of segments                 0.0788    0.0070        11.3081    < 0.0001*
Quantity 2                        -0.0382    0.0133        -2.8681    0.0041
Quantity 3                         0.0481    0.0132         3.6444    0.0003*
Voicing: Voiced                   -0.0265    0.0092        -2.8948    0.0038
Disfluency following               0.1326    0.0080        16.5030    < 0.0001*
Pause following                    0.1389    0.0343         4.0467    0.0001*
Gender: female                     0.0309    0.0127         2.4400    0.0147
/-s/ = Inflected                  -0.0363    0.0096        -3.7899    0.0002*

B. smooth terms                   edf        Ref.df       F-value     p-value
s(Age)                              2.0696     2.1616      16.3248    < 0.0001*
s(Speech Rate)                      1.5971     2.0017     501.3616    < 0.0001*
s(log Conditional Probability)      6.0919     7.1829      21.6940    < 0.0001*
s(log Frequency)                    2.4169     2.6750      33.5737    < 0.0001*
s(Next Manner)                      4.4345     5.0000      22.2063    < 0.0001*
s(Word)                           190.8398   777.0000       1.1651    < 0.0001*
s(Speaker)                        106.9333   139.0000       5.8820    < 0.0001*

As for the variables of interest, inflected forms were significantly shorter than uninflected forms (/-s/ = Inflected in Table 3, effect size of 8 ms). The upper left panel of Figure 1 illustrates the nonlinear effect of conditional probability. The overall trend is that the more probable the word, the shorter its duration. For words with very low conditional probability, however, the effect was the opposite: higher probability resulted in longer word duration. Finally, for high-probability words, the effect of conditional probability flattens.

The upper right panel in Figure 1 indicates that as whole-word frequency increases, word duration decreases.

[Figure 1. Panels: log conditional probability (upper left), whole word frequency (upper right), age in years (lower left), speech rate (lower right). Y-axis in all panels: word duration (s), fitted values excluding random effects.]

Figure 1: Estimated effects of conditional probability given the next word, whole word frequency, age and speech rate on word duration (Study 1). The gray shading represents 95% confidence intervals (±1.96 SE) of the regression line for individual predictors on word duration. The y-scale is back-transformed to seconds.
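Because the word duration model is fitted on a log scale, a coefficient corresponds to a multiplicative change in duration, and its size in milliseconds depends on where it is evaluated. The following minimal Python sketch back-transforms the /-s/ = Inflected coefficient from Table 3. It assumes a natural logarithm and an evaluation at the intercept (reference levels, smooths at their centered value). The source does not state how the reported effect size was derived, so this is an illustration, not the authors' procedure. The function names are hypothetical.

```python
import math

# Values taken from Table 3 (log-transformed word duration).
INTERCEPT = -1.4843
INFLECTED = -0.0363


def effect_in_ms(intercept, coefficient):
    """Duration change in ms when the coefficient is added at the intercept.

    Assumption: natural-log scale and durations measured in seconds.
    """
    baseline = math.exp(intercept)
    shifted = math.exp(intercept + coefficient)
    return (shifted - baseline) * 1000.0


def confidence_band(fit, se, z=1.96):
    """Back-transform fit +/- z * SE from the log scale to seconds."""
    return math.exp(fit - z * se), math.exp(fit + z * se)


inflected_effect = effect_in_ms(INTERCEPT, INFLECTED)
```

Under these assumptions the difference is a shortening of roughly 8 ms, in line with the effect size reported for inflected forms. Evaluated at a longer baseline duration, the same coefficient would correspond to a larger difference in milliseconds. The confidence_band function illustrates the kind of back-transformation described in the caption of Figure 1.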

3.2 Study 2: Segment Duration

In the second analysis, we investigated the role of inflectional function, inflectional paradigm size, lemma frequency, whole-word frequency and conditional probability on the segment duration. We investigated the contribution of each individual predictor with a model including control variables and random effects using the compareML-function by itsadug package. As can be seen from Table4, including inflectional function in the model did not improve the model, but including the interaction between the inflectional function and conditional probability did. The inclusion of inflectional paradigm size alone and in interaction with Paradigmatic and syntagmatic effects in Estonian spontaneous speech inflectional function improved the model fit slightly.

As in the previous study, we also tested whole-word frequency and lemma frequency as potential predictors. Although they also emerged as significant predictors, they were not included in the final model due to the high correlation between predictors. A model with inflectional paradigm size had a lower fREML and AIC score than a model including whole-word frequency (fREML 3.78 lower and AIC 5.78 lower) or a model including lemma frequency (fREML 1.93 lower and AIC 2.60 lower). The final model contained random intercepts for speakers and words as well as for the manner of the first segment in the following word.

Table 4: fREML-scores representing model fits between the baseline model and models including each individual predictor in the segment duration analysis.

Model                                                fREML-score
Baseline                                             1079.3
+Inflectional Function                               1079.9
+Inflectional Paradigm Size                          1076.3
+Conditional Probability                             1062.7
+Inflectional Paradigm Size:Inflectional Function    1079.1
+Conditional Probability:Inflectional Function       1064
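To make the comparison in Table 4 easier to read, the sketch below computes how far each model's fREML score lies below the baseline (positive values indicate a better fit than the baseline). The scores are copied from Table 4; the variable names are hypothetical. This is descriptive arithmetic only and does not reproduce the significance test performed by compareML.

```python
# fREML scores copied from Table 4 (lower = better fit).
freml = {
    "Baseline": 1079.3,
    "+Inflectional Function": 1079.9,
    "+Inflectional Paradigm Size": 1076.3,
    "+Conditional Probability": 1062.7,
    "+Inflectional Paradigm Size:Inflectional Function": 1079.1,
    "+Conditional Probability:Inflectional Function": 1064.0,
}

baseline = freml["Baseline"]

# Reduction in fREML relative to the baseline, rounded to the table's precision.
improvement = {
    name: round(baseline - score, 1)
    for name, score in freml.items()
    if name != "Baseline"
}

# The model with the largest reduction relative to the baseline.
best = max(improvement, key=improvement.get)

for name, delta in sorted(improvement.items(), key=lambda item: -item[1]):
    print(f"{name}: {delta:+.1f}")
```

The two models involving conditional probability show by far the largest reductions, whereas adding inflectional function alone leaves the score slightly above the baseline.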

All control effects, i.e., number of segments, speech rate, quantity, voicing as well as a following disfluency or pause, yielded significant effects in the expected direction (Table 5). Words with more segments were produced with a shorter final segment than words with fewer segments. Participants with a faster speech rate produced shorter final segments than participants with a slower speech rate (the right panel of Figure 2). Words in the second and in the third quantity were produced with a shorter final segment than words in the first quantity. Voiced /-s/ was shorter than voiceless /-s/, and finally, segments followed by a disfluency or a pause were articulated with longer durations than segments not followed by a disfluency or a pause.

Inflectional function significantly interacted with inflectional paradigm size and conditional probability, with the smooths yielding significant effects whenever /-s/ reflected an inflectional function. Figure 2 illustrates the significant smooths from Table 5. The duration of /-s/ was shorter when inflected words had a higher conditional probability and when they had a larger paradigm size.
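Because the model in Table 5 was fitted to log-transformed durations, each parametric coefficient can be read as a multiplicative change in segment duration. The sketch below back-transforms a few coefficients and their approximate 95% intervals (estimate ± 1.96 × SE). It assumes that the natural logarithm was used for the transformation; the estimates and standard errors are copied from Table 5, and the helper names are hypothetical.

```python
import math

Z_95 = 1.96  # multiplier for an approximate 95% interval


def ratio(estimate):
    """Multiplicative change in duration implied by a log-scale coefficient."""
    return math.exp(estimate)


def ratio_ci95(estimate, std_error):
    """Back-transformed approximate 95% interval for the multiplicative change."""
    half_width = Z_95 * std_error
    return math.exp(estimate - half_width), math.exp(estimate + half_width)


# (estimate, standard error) pairs copied from Table 5, part A.
coefficients = {
    "Voicing: Voiced": (-0.2786, 0.0142),
    "Disfluency following": (0.2909, 0.0123),
    "/-s/ = Inflected": (-0.0536, 0.0187),
}

summary = {
    name: (ratio(est), ratio_ci95(est, se))
    for name, (est, se) in coefficients.items()
}

for name, (r, (low, high)) in summary.items():
    print(f"{name}: x{r:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Under these assumptions, a ratio below 1 corresponds to a shorter /-s/ and a ratio above 1 to a longer one; the coefficient for inflected /-s/, for instance, corresponds to a reduction of roughly 5% relative to uninflected /-s/.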

Table 5: Summary of the GAMM fitted to the log-transformed word-final /-s/ segment duration. The significance threshold is set to 0.001; asterisks mark p-values below this threshold.

A. Parametric coefficients
Term                     Estimate    Std. Error    t-value     p-value
(Intercept)              -2.3170     0.0501        -46.2323    < 0.0001*
Number of Segments       -0.0325     0.0083        -3.9306     0.0001*
Quantity 2               -0.0551     0.0159        -3.4524     0.0006*
Quantity 3               -0.0839     0.0157        -5.3520     < 0.0001*
Voicing: Voiced          -0.2786     0.0142        -19.5747    < 0.0001*
Disfluency following     0.2909      0.0123        23.6514     < 0.0001*
Pause following          0.3670      0.0583        6.2943      < 0.0001*
/-s/ = Inflected         -0.0536     0.0187        -2.8620     0.0042

B. Smooth terms
Term                                                edf        Ref.df      F-value     p-value
s(log Conditional Probability):/-s/ = Uninflected   1.0000     1.0000      1.5644      0.2111
s(log Conditional Probability):/-s/ = Inflected     1.4095     1.7080      25.7926     < 0.0001*
s(Inflectional Paradigm Size):/-s/ = Uninflected    1.5169     1.8314      0.9981      0.3548
s(Inflectional Paradigm Size):/-s/ = Inflected      1.0000     1.0001      13.1275     0.0003*
s(Speech Rate)                                      2.4861     3.1889      171.2860    < 0.0001*
s(Next Manner)                                      4.5241     5.0000      26.7760     < 0.0001*
s(Word)                                             32.6305    775.0000    0.0631      0.0016
s(Speaker)                                          98.2108    141.0000    3.1566      < 0.0001*

[Figure 2 panels: log conditional probability for inflected /-s/, log paradigm size for inflected /-s/, speech rate; y-axis: segment duration (s), fitted values excluding random effects.]

Figure 2: Estimated effects of inflectional paradigm size, conditional probability and speech rate on /-s/ duration (Study 2). The grey shadows represent 95% confidence bands (1.96 × SE) of the regression line for individual predictors on segment duration. The y-scale is back-transformed to seconds.

4 Discussion & Conclusion

Study 1 revealed that inflected forms were produced with shorter word durations than uninflected forms. This is in line with recent work by Plag and colleagues (Plag et al., 2020), who reported differences in relative segment duration between English plural and genitive plural nouns. It is important to note, however, that the effects are small in both studies. We also showed that higher whole-word frequency was associated with shorter word durations, extending previous work on Estonian inflection by Lõo et al. (2018b), who reported whole-word frequency effects for isolated words read aloud in a word naming task. Similar to English (e.g., Caselli et al., 2016), our findings show that whole-word frequency effects for complex words are also present in Estonian spontaneous speech.

We also replicate the well-documented effect of conditional probability for Estonian. As previously found for many other languages (e.g., English: Bell et al. 2009, Dutch: Tily and Kuperman 2012, Kaqchikel: Tang and Bennett 2018), words in a more probable context are produced with a shorter duration than words in a less probable context. In contrast to Plag and colleagues (Plag et al., 2017), Study 2 does not support a main effect of inflectional function. The effect did not reach significance in this study because we used a more conservative significance threshold. The distinction between inflected and uninflected /-s/ emerges in the interactions with conditional probability and inflectional paradigm size, both of which yield shorter /-s/ durations only for inflected word forms. These results indicate that paradigmatic effects emerge for Estonian as they do for morphologically less complex languages such as Dutch or English; however, the nature of this effect may be different.
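The role of the significance threshold can be made concrete with the p-values of the parametric terms in Table 5. The sketch below checks each term against the threshold of 0.001 used in this study and, purely for contrast, against a conventional threshold of 0.05, which is an assumption introduced here and not part of the analysis. P-values reported as "< 0.0001" are represented by their upper bound.

```python
ALPHA_STUDY = 0.001        # threshold stated in the caption of Table 5
ALPHA_CONVENTIONAL = 0.05  # assumption, shown for contrast only

# p-values copied from Table 5, part A ("< 0.0001" entered as 0.0001).
p_values = {
    "Number of Segments": 0.0001,
    "Quantity 2": 0.0006,
    "Quantity 3": 0.0001,
    "Voicing: Voiced": 0.0001,
    "Disfluency following": 0.0001,
    "Pause following": 0.0001,
    "/-s/ = Inflected": 0.0042,
}


def significant_terms(p_by_term, alpha):
    """Return the terms whose p-value lies below the given threshold."""
    return [term for term, p in p_by_term.items() if p < alpha]


under_study_alpha = significant_terms(p_values, ALPHA_STUDY)
under_conventional_alpha = significant_terms(p_values, ALPHA_CONVENTIONAL)

# Terms that would count as significant only under the looser threshold.
threshold_dependent = [
    term for term in under_conventional_alpha if term not in under_study_alpha
]
print(threshold_dependent)
```

Only the main effect of inflectional function changes status between the two thresholds, which is the pattern described above.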

Thus, our results are in principle in line with other recent evidence on paradigmatic effects in the phonetic realization of complex words, even though the interpretation of the effects differs slightly. Bell et al. (2020), for example, report an increase in the acoustic duration of linking consonants in compounds in relation to smaller family size. They argue that smaller family size is equivalent to higher paradigmatic probability, which has repeatedly been shown to be associated with phonetic enhancement (Cohen, 2014, 2015; Kuperman et al., 2007; Tomaschek et al., 2021b).

In contrast, Lõo et al. (2018b) propose a different interpretation of the inflectional paradigm size effect in Estonian. Estonian is a language where the size of the paradigm can vary greatly, in comparison to languages with relatively small paradigms such as English. This has implications for how many forms are actually in active use. While in English most of the cells of a paradigm are likely in active use, this may not be the case in Estonian. The present results suggest that the Estonian production system is tuned to the distributional characteristics of the language. When an inflected word is produced, the inflectional paradigm size effect indicates that all actively used forms are activated during production. As a consequence, the production of the inflected word is facilitated, leading to durational shortening of each individual form.

Even though the effect of inflectional paradigm size is not as large as in studies with isolated Estonian words (Lõo et al., 2018a,b), the current findings indicate that paradigmatic effects persist in spontaneous speech even when the syntagmatic context of the word is accounted for. Critically, the paradigm size effect and the conditional probability effect do not interact in our analysis, which suggests that each plays an independent role in speech production.

The question arises as to why inflected and uninflected /-s/ durations are affected differently by the syntagmatic context. One potential explanation is that the two differ in how informative they are in conveying a semantic contrast. The inflected /-s/ contributes information to the signal in addition to the information provided by the base, namely the inflectional function of the word. When this inflectional function is highly predictable from the syntagmatic context, the /-s/ is shortened. When the inflectional function is less predictable from the syntagmatic context, the /-s/ is lengthened to provide the necessary information. By contrast, the uninflected /-s/ is part of the base and thus can be less easily predicted from the context, which is why it is not affected by conditional probability. This explanation dovetails with the findings of Torreira and Ernestus (2012), who reported stronger effects of morpho-syntactic predictability on suffixes than on other types of word-final /-s/.

An important implication is that the difference in how inflected and uninflected segments behave should be accommodated in models of speech production (e.g., Dell et al., 2007; Levelt et al., 1999), such that morphological complexity can influence the articulation process. This has usually not been the case because most studies have been conducted on languages with relatively simple morphology (see also Tucker, 2019, for a review of the role of morphology in models of speech production). Recent computational approaches such as those applied in Baayen et al. (2019), Chuang et al. (2020) and Hickok (2014) may provide an additional perspective on the underlying mechanisms that drive these effects. The specific details of such a model, however, are outside the scope of the current paper.

In summary, syntagmatic properties had the strongest effect on both segment and word duration in the production of Estonian spontaneous speech. In addition, morphological and paradigmatic properties like inflectional function and inflectional paradigm size co-determined the phonetic characteristics of words and phones under investigation. Our research supports the idea that morphology is a part of a vast complex system and it is important to study how it interacts with other aspects of the language to better understand the production system.

Acknowledgements

This research was funded by an Estonian Research Council Mobilitas Pluss postdoctoral researcher grant (MOBJD408), a collaborative grant from the Deutsche Forschungsgemeinschaft (German Research Foundation: Spoken Morphology, Projects BA 3080/3-1 and BA 3080/3-2) and the National Program for the Estonian Language Technology (project EKTB3).

References

Aare, K. and Lippus, P. (2017). Some gender patterns in Estonian dyadic conversations. In Nordic prosody. Proceedings of the XIIth conference, Trondheim, pages 29–38.

Asu, E. L. and Teras, P. (2009). Estonian. Journal of the International Phonetic Association, 39(3):367–372.

Aylett, M. and Turk, A. (2004). The smooth signal redundancy hypothesis: a functional explanation for relationships between redundancy, prosodic prominence, and duration in spontaneous speech. Language and Speech, 47:31–56.

Baayen, R. H., Chuang, Y.-Y., Shafaei-Bajestan, E., and Blevins, J. P. (2019). The discriminative lexicon: A unified computational model for the lexicon and lexical processing in comprehension and production grounded not in (de)composition but in linear discriminative learning. Complexity, 2019.

Baayen, R. H., Vasishth, S., Bates, D., and Kliegl, R. (2017). The cave of shadows: Addressing the human factor with generalized additive mixed models. Journal of Memory and Language, 56:206–234.

Baker, R., Smith, R., and Hawkins, S. (2007). Phonetic differences between mis- and dis- in English prefixed and pseudo-prefixed words. Proceedings of ICPhS XVI, pages 553–556.

Bell, A., Brenier, J. M., Gregory, M., Girand, C., and Jurafsky, D. (2009). Predictability effects on durations of content and function words in conversational English. Journal of Memory and Language, 60(1):92–111.

Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, C., Gregory, M., and Gildea, D. (2003). Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. Journal of the Acoustical Society of America, 113:1001–1024.

Bell, M. J., Hedia, S. B., and Plag, I. (2020). How morphological structure affects phonetic realisation in English compound nouns. Morphology, pages 1–34.

Boersma, P. and Weenink, D. (2021). Praat: doing phonetics by computer. http://www.praat.org.

Bybee, J. L. (2001). Phonology and language use. Cambridge University Press, Cambridge.

Byrd, D., Krivokapić, J., and Lee, S. (2006). How far, how long: On the temporal scope of prosodic boundary effects. The Journal of the Acoustical Society of America, 120(3):1589–1599.

Caselli, N. K., Caselli, M. K., and Cohen-Goldberg, A. M. (2016). Inflected words in production: Evidence for a morphologically rich lexicon. The Quarterly Journal of Experimental Psychology, 69(3):432–454.

Cho, T. (2001). Effects of Morpheme Boundaries on Intergestural Timing: Evidence from Korean. Phonetica, 58:129–162.

Cho, T. (2004). Prosodically conditioned strengthening and vowel-to-vowel coarticulation in English. Journal of Phonetics, 32(2):141–176.

Chuang, Y.-Y., Lõo, K., Blevins, J. P., and Baayen, R. H. (2020). Estonian case inflection made simple: a case study in Word and Paradigm Morphology with Linear Discriminative Learning. In Kortvelyessy, L. and Stekauer, P., editors, Complex Words: Advances in Morphology, pages 114–119. Cambridge University Press.

Cohen, C. (2014). Probabilistic reduction and probabilistic enhancement: contextual and paradigmatic effects on morpheme pronunciation. Morphology, 24(4):291–323.

Cohen, C. (2015). Context and paradigms: Two patterns of probabilistic pronunciation variation in Russian agreement suffixes. The Mental Lexicon, 10(3):313–338.

De Jong, N. H. and Wempe, T. (2009). Praat script to detect syllable nuclei and measure speech rate automatically. Behavior Research Methods, 41(2):385–390.

Dell, G. S., Martin, N., and Schwartz, M. F. (2007). A case-series test of the interactive two-step model of lexical access: Predicting word repetition from picture naming. Journal of Memory and Language, 56(4):490–520.

Drosdowski, G. and Eisenberg, P. (1998). Duden, Grammatik der deutschen Gegenwartssprache. Der Duden in 12 Bänden: Das Standardwerk zur deutschen Sprache. Dudenverlag.

Erelt, M. (2003). . Estonian Academy Publishers, Tallinn.

Gafos, A., Hoole, P., Roon, K., and Zeroual, C. (2010). Variation in overlap and phonological grammar in Moroccan Arabic clusters. In Laboratory Phonology, volume 10, pages 657–698.

Gahl, S., Yao, Y., and Johnson, K. (2012). Why reduce? Phonological neighborhood density and phonetic reduction in spontaneous speech. Journal of Memory and Language, 66(4):789–806.

Gregory, M., Raymond, W., Bell, A., Fosler-Lussier, E., and Jurafsky, D. (1999). The effects of collocational strength and contextual predictability in lexical production. CLS, 35:151–166.

Hamm, J. (1981). Grammatik der serbokroatischen Sprache. Slavische Studienbücher. 5. Harrassowitz.

Hay, J. B. (2003). Causes and Consequences of Word Structure. Routledge, New York and London.

Hickok, G. (2014). The architecture of speech production and the role of the phoneme in speech processing. Language, Cognition and Neuroscience, 29(1):2–20.

Jurafsky, D., Bell, A., and Girand, C. (2002). The role of the lemma in form variation. In Gussenhoven, C. and Warner, N., editors, Papers in Laboratory Phonology VII, pages 1–34. Mouton de Gruyter, Berlin/New York.

Keating, P., Cho, T., Fougeron, C., and Hsu, C.-S. (2003). Domain-initial strengthening in four languages. In Papers in laboratory phonology VI: Phonetic interpretations, pages 145–163. Cambridge, UK: Cambridge University Press.

Keating, P. A. (2006). Phonetic encoding of prosodic structure. In Harrington, J. and Tabain, M., editors, Speech production: Models, phonetic processes, and techniques, pages 167–186. Psychology Press, New York and Hove.

Kemps, R., Wurm, L. H., Ernestus, M., Schreuder, R., and Baayen, R. (2005). Prosodic cues for morphological complexity in Dutch and English. Language and Cognitive Processes, 20:43–73.

Klatt, D. H. (1976). Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. The Journal of the Acoustical Society of America, 59(5):1208–1221.

Krakow, R. A. (1999). Physiological organization of syllables: a review. Journal of Phonetics, 27(1):23–54.

Krull, D. (1997). Prepausal lengthening in Estonian: Evidence from conversational speech. In Estonian prosody: Papers from a symposium, pages 136–148. Institute of Estonian Language and Authors Tallinn.

Kuperman, V., Pluymaekers, M., Ernestus, M., and Baayen, H. (2007). Morphological predictability and acoustic duration of interfixes in Dutch compounds. The Journal of the Acoustical Society of America, 121(4):2261–2271.

Lee-Kim, S.-I., Davidson, L., and Hwang, S. (2013). Morphological effects on the darkness of English intervocalic /l/. Laboratory Phonology, 4(2):475–511.

Lehiste, I. (1960). An acoustic–phonetic study of internal open juncture. Phonetica, 5(Suppl. 1):5–54.

Levelt, W. J., Roelofs, A., and Meyer, A. S. (1999). A theory of lexical access in speech production. The Behavioral and Brain Sciences, 22(1).

Lippus, P. (2020). Phonetic corpus of Estonian spontaneous speech v.1.0.6 [online]. Available at: https://doi.org/10.15155/1-00-0000-0000-0000-0012BL.

Lõo, K., Järvikivi, J., and Baayen, R. H. (2018a). Whole-word frequency and inflectional paradigm size facilitate Estonian case-inflected noun processing. Cognition, 175:20–25.

Lõo, K., Järvikivi, J., Tomaschek, F., Tucker, B. V., and Baayen, R. H. (2018b). Production of Estonian case-inflected nouns shows whole-word frequency and paradigmatic effects. Morphology, 28:71–97.

Menzerath, P. (1954). Die Architektonik des deutschen Wortschatzes. Phonetische Studien, Heft 3.

Mihkla, M. et al. (2006). Pausid kõnes. Keel ja Kirjandus, 49(04):286–295.

Plag, I., Homann, J., and Kunter, G. (2017). Homophony and morphology: The acoustics of word-final s in English. Journal of Linguistics, 53(1):181–216.

Plag, I., Lohmann, A., Hedia, S. B., and Zimmermann, J. (2020). An <s> is an <s'>, or is it? Plural and genitive-plural are not homophonous. Complex Words: Advances in Morphology, pages 260–285.

Pluymaekers, M., Ernestus, M., Baayen, R. H., and Booij, G. (2010). Morphological effects on fine phonetic detail: The case of Dutch -igheid. Laboratory Phonology, 10:511–532.

Quirk, R., Greenbaum, S., Leech, G., and Svartvik, J. (1985). A comprehensive grammar of the English language. Longman, London.

Raymond, W. D., Dautricourt, R., and Hume, E. (2006). Word-internal /t, d/ deletion in spontaneous speech: Modeling the effects of extra-linguistic, lexical, and phonological factors. Language Variation and Change, 18(1):55–97.

Roelofs, A. (1997). The WEAVER model of word-form encoding in speech production. Cognition, 64(3):249–284.

Schuppler, B., van Dommelen, W. A., Koreman, J., and Ernestus, M. (2012). How linguistic and probabilistic properties of a word affect the realization of its final /t/: Studies at the phonemic and sub-phonemic level. Journal of Phonetics, 40(4):595–607.

Seyfarth, S. (2014). Word informativity influences acoustic duration: Effects of contextual predictability on lexical representation. Cognition, 133(1):140–155.

Seyfarth, S., Garellek, M., Gillingham, G., Ackerman, F., and Malouf, R. (2018). Acoustic differences in morphologically-distinct homophones. Language, Cognition and Neuroscience, 33(1):32–49.

Simpson, A. P. (2009). Phonetic differences between male and female speech. Language and Linguistics Compass, 3(2):621–640.

Smith, R., Baker, R., and Hawkins, S. (2012). Phonetic detail that distinguishes prefixed from pseudo-prefixed words. Journal of Phonetics, 40(5):689–705.

Strycharczuk, P. (2019). Phonetic detail and phonetic gradience in morphological processes. In Oxford Research Encyclopedia of Linguistics.

Tang, K. and Bennett, R. (2018). Contextual predictability influences word and morpheme duration in a morphologically complex language (Kaqchikel Mayan). The Journal of the Acoustical Society of America, 144(2):997–1017.

Tiede, M., Shattuck-Hufnagel, S., Johnson, B., Ghosh, S., Matthies, M., Perkell, J., Comm, S., Group, M., and Laboratories, H. (2007). Gestural phasing in /kt/ sequences contrasting within and cross word contexts.

Tily, H. and Kuperman, V. (2012). Rational phonological lengthening in spoken Dutch. The Journal of the Acoustical Society of America, 132(6):3935–3940.

Tomaschek, F., Hendrix, P., and Baayen, R. H. (2018). Strategies for addressing collinearity in multivariate linguistic data. Journal of Phonetics, 71:249–267.

Tomaschek, F., Plag, I., Ernestus, M., and Baayen, R. H. (2019). Phonetic effects of morphology and context: Modeling the duration of word-final S in English with naive discriminative learning. Journal of Linguistics, pages 1–39.

Tomaschek, F., Tucker, B. V., Ramscar, M., and Baayen, R. H. (2021a). Paradigmatic enhancement of stem vowels in regular English inflected verb forms. Morphology, pages 1–29.

Tomaschek, F., Tucker, B. V., Ramscar, M., and Baayen, R. H. (2021b). Paradigmatic enhancement of stem vowels in regular English inflected verb forms. Morphology.

Torreira, F. and Ernestus, M. (2012). Weakening of intervocalic /s/ in the Nijmegen Corpus of Casual Spanish. Phonetica, 69(3):124–148.

Tucker, B. V. (2019). Psycholinguistic approaches to morphology: Production. In Oxford Research Encyclopedia of Linguistics.

Tucker, B. V., Sims, M., and Baayen, R. H. (2019). Opposing forces on acoustic duration. https://psyarxiv.com/jc97w.

Vare, S. (2012). Eesti keele sõnapered: tänapäeva eesti keele sõnavara struktuurianalüüs (Estonian word families: a structural analysis of modern Estonian vocabulary). Eesti Keele Sihtasutus.

Wedel, A., Ussishkin, A., and King, A. (2019). Crosslinguistic evidence for a strong statistical universal: Phonological neutralization targets word-ends over beginnings. Language.

Whalen, D. H. (1991). Infrequent words are longer in duration than frequent words. The Journal of the Acoustical Society of America, 90(4):2311–2311.

Wood, S. N. (2006). Generalized Additive Models. Chapman & Hall/CRC, New York.

Wright, C. E. (1979). Duration differences between rare and common words and their implications for the interpretation of word frequency effects. Memory & Cognition, 7(6):411–419.

Zimmermann, J. (2016). Morphological status and acoustic realization: Findings from New Zealand English. In Proceedings of the 16th Australasian International Conference on Speech Science and Technology, pages 6–9.