“That Sounds Unlikely”: Syntactic Probabilities Affect Pronunciation

“That Sounds Unlikely”: Syntactic Probabilities Affect Pronunciation Susanne Gahl ([email protected]) Beckman Institute for Advanced Science and Technology, 405 N. Mathews Ave Urbana, IL 61801 USA Susan M. Garnsey ([email protected]) Department of Psychology, 603 E. Daniel St Urbana, IL 61801 USA Cynthia Fisher ([email protected]) Department of Psychology, 603 E. Daniel St Urbana, IL 61801 USA Laura Matzen ([email protected]) Department of Psychology, 603 E. Daniel St Urbana, IL 61801 USA Abstract The only other studies, to our knowledge, to check for effects of verb bias on pronunciation (Blodgett, 2004; The probability of encountering a particular syntactic Kjelgaard & Speer, 1999) did not find such effects. This configuration, given a particular verb (i.e. “verb bias” or discrepancy could be due to the sentence types examined in “subcategorization preference”) affects language those studies, viz. sentences with initial transitive or comprehension (Trueswell, Tanenhaus, & Kello, 1993). A recent study (Gahl & Garnsey 2004) of sentences with intransitive subordinate clauses, such as When Roger leaves temporary direct object / sentential complement ambiguities [the house is empty | the house, it’s empty]. Such sentences – shows that such probabilities also affect language production, sometimes referred to as “Closure sentences” – are typically specifically pronunciation. In this paper, we extend that pronounced with a prosodic boundary at the end of the finding to a new sentence type – sentences with initial subordinate clause. This boundary tends to be more salient subordinate clauses with temporary closure ambiguities, such than boundaries in DO/SC sentences. Warren (1985), for as If the tenants beg [, the landlord will let them stay | the example, reports that the duration of verbs at clause landlord, he will let them stay]. We show systematic boundaries is lengthened, on average, by a factor of 1.08 in differences in the pronunciation of sentences with high vs. DO/SC sentences, but by a factor of 1.51 in Closure low probability structures, given their verbs. We briefly discuss the implications of this finding for research on sentences. Perhaps the salient boundaries in Closure sentence comprehension and pronunciation variation. sentences eclipse microvariation of the sort described in Gahl and Garnsey (2004). On the other hand, the absence of Keywords: Probability; pronunciation; prosody; verb bias; probability-based variation in Kjelgaard and Speer’s and subcategorization; closure ambiguities; early closure; late Blodgett’s materials may be due to the fact that these studies closure; variation; probabilistic parsing; language production. used recordings made by a trained speaker intentionally producing particular boundary types in the ToBI standard Introduction (Beckman & Hirschberg, 1994; Beckman, Hirschberg, & The probability of encountering a particular syntactic Shattuck-Hufnagel, 2005; Silverman et al., 1992). Such configuration, given a particular verb is often referred to as deliberate, tightly controlled pronunciation may not show “verb bias” or “subcategorization preference” and has been the same range of variation found in naïve productions. shown to affect language comprehension (Trueswell et al., Perhaps less controlled pronunciation of Closure sentences 1993). Verb biases are normally thought of as does reflect structural probabilities. generalizations over usage. Do such probabilities affect the Many more verbs participate in transitivity alternations language production system during the production of than in the DO/SC alternation (cf. Levin, 1993). Therefore, individual utterances? A recent study shows that they do: closure sentences allow us to study the workings of a larger Examining sentences with temporary direct object / part of the lexicon than do DO/SC sentences, as well as to sentential complement (DO/SC) ambiguities, such as The examine pronunciation variation in strongly marked divorce lawyer argued the issue [was irrelevant | with her prosodic boundaries. colleague], Gahl and Garnsey (2004) found systematic How might structural probabilities affect the differences in the pronunciation of sentences with high vs. pronunciation of Closure sentences? Gahl and Garnsey low probability structures, given their verbs. (2004) predicted that phonetic characteristics of boundaries 1334 would tend to be exaggerated for low-probability 18.8, p < 001.). Each of the verbs appeared in two boundaries, and minimized for high-probability boundaries. sentences, one with Early Closure and one with Late This prediction was confirmed. Thus, Gahl and Garnsey Closure, e.g. When the python escaped, the zoo had to be found greater degrees of pre-pausal lengthening near low- closed to the public and When the python escaped the zoo, it probability clause boundaries than near high-probability had to be closed to the public. The complete set of sentences clause boundaries.1 These predictions were motivated by appears in Table 1. All participants read all sentences. earlier findings that high lexical frequency and high lexical To prevent confounds due to presentation order, two transitional probability – the probability of a word given a presentation lists were constructed, each with two blocks of neighboring word - promote phonetic reduction (Bell et al., twenty sentences. On List 1, half the verbs of each bias type 2003; Gregory, Raymond, Bell, Fosler-Lussier, & Jurafsky, appeared in their bias-conforming syntactic context in block 1999; Jurafsky, Bell, Gregory, & Raymond, 2001). 1, and in their bias-violating context in block 2, while for Clause boundaries in Closure sentences occur either the other half of the verbs, the opposite order was used. On immediately following the verb (Early Closure, i.e. List 2, the relative order of bias-conforming and bias- intransitive: When Roger leaves, # the house is empty), or violating environments was reversed. Within blocks, the following the direct object (Late Closure, i.e. transitive: order of sentences was randomized. The same random order When Roger leaves the house, # it is empty). In both cases, of sentences was used in each block, thus maximizing the speakers tend to insert pauses at the boundary and lengthen distance between sentences containing the same verb. The the words immediately before the boundary (the verb in participants were randomly assigned to two groups, one Early Closure sentences, the direct object in Late Closure receiving List 1, the other List 2. sentences), compared to its baseline duration (Kjelgaard & The subject noun phrases were different for each verb, but Speer, 1999; Schafer, Speer, Warren, & White, 2000; were the same for the two sentences each verb appeared in. Warren, 1985; Warren, Grabe, & Nolan, 1995). If structural The nouns used in the subject noun phrases for the two sets probabilities have similar effects in these sentences as in of verbs did not differ significantly in frequency, length in DO/SC sentences, we should expect words near low- letters, phonemes, or syllables (all t(18) < 1.6, p > .15), or in probability boundaries to be lengthened more than those plausibility as subjects for the verbs they appeared with, as near high-probability boundaries. estimated by the method described in (Keller, Lapata, & In this paper, we offer evidence showing that this is the Ourioupina, 2002). case. Our results support the notion that verb biases affect The ambiguous noun phrases (e.g. The python escaped the language production, as well as comprehension. Our results zoo…) did not differ in frequency, length in letters, have implications for research on sentence processing and phonemes, or syllables, or in estimated plausibility as probabilistic pronunciation variation. objects of the verbs they appeared with (all t(18) <1.5, p > .15). Method The forty experimental sentences were pseudorandomly interleaved with 177 filler sentences of various syntactic Participants structures, which represented stimuli for two other experiments. Twenty undergraduate students (ten male, ten female) at the University of Illinois participated in the experiment for Procedure payment. All were native speakers of English without reported hearing problems. Sentences were recorded in a sound booth as 16-bit digital sound files at a sampling rate of 44.1 kHz and resampled to Materials and design 22.1 kHz. Participants were told to read each sentence The materials consisted of Early/Late Closure sentence silently first, until they felt confident that they understood it pairs, such as When the python escaped the zoo (it) had to and could say it without difficulty. They were also told that be closed to the public, some of which were based on the there was no limit on the amount of time they could take to sentences in Kjelgaard and Speer (1999). The main read and record the sentences, and that the recordings would consideration in selecting the verbs for the subordinate be used as stimuli in a comprehension experiment. If clause was strong verb bias, either towards transitive (Late participants felt that they had not said a sentence in a natural Closure) or intransitive (Early Closure). Estimates of verb manner, they were asked to repeat it. When speakers bias were based on corpus counts (Gahl, Jurafsky, & misspoke or hesitated or used a noticeably exaggerated Roland, 2004). Ten Transitive Bias verbs and ten pronunciation after recovering from a garden-path (“oh, I Intransitive Bias verbs were selected. get it,… When the python escaped THE ZOO, it…”), the The two

“That Sounds Unlikely”: Syntactic Probabilities Affect Pronunciation

Common and Distinct Neural Substrates for Pragmatic, Semantic, and Syntactic Processing of Spoken Sentences: an Fmri Study

Modeling Subcategorization Through Co-Occurrence Outline

Acquiring Verb Subcategorization from Spanish Corpora

Features from Aspects Via the Minimalist Program to Combinatory Categorial Grammar

Glossary for Syntax in Three Dimensions (2015) © Carola Trips

Subcategorization Semantics and the Naturalness of Verb-Frame Pairings

Verb Sense and Verb Subcategorization Probabilities Douglas Roland Daniel Jurafsky University of Colorado University of Colorado Department of Linguistics Dept

Valency Dictionary of Czech Verbs: Complex Tectogrammatical Annotation

Dependency Grammar

MITA Working Papers in Psycholinguistics, Volume 2. INSTITUTION Keio Univ., Tokyo (Japan)

Verb Subcategorization Frequencies: American English Corpus Data, Methodological Studies, and Cross-Corpus Comparisons

Automatic Extraction of Subcategorization from Corpora