
Class 8: Phonological typology

Adam Albright ([email protected])

LSA 2017, University of Kentucky

Announcements

▶ For those taking this class for credit
▶ Please upload assignments (option 1 or option 2) as a PDF to Canvas by tonight
▶ Today
▶ Questions?
▶ Phonological typology

From specific languages to typology

▶ So far: main focus has been on providing rankings that yield the set of outputs attested in a specific language
▶ However, arguments for constraint formulation and ranking have been partly language-internal, and partly cross-linguistic
▶ Language-internal: Korean allows laryngeal contrasts on consonants before a vowel, but not before another consonant
▶ Cross-linguistic: if a language allows laryngeal contrasts before a consonant, it allows them before a vowel

*[voice]/ ¬[+son] ≫ *[voice]/ ¬[−son]

or: Ident([±voi])/[+son] ≫ *Ident([±voi])/ [+son]

▶ Or in some cases, almost entirely cross-linguistic
▶ Low-ranked markedness constraints
▶ E.g., Limbu: Ident([±voi]) ≫ *[−sonorant, +voice]

Using typological data to inform constraint formulation

Implicational asymmetries give us insight into…
▶ Which constraints to include
▶ Conjecture (not verified): if a language allows initial #ŋV, it also allows initial #mV and #nV
▶ *#ŋ constraint without corresponding *#m, *#n: predicts two types of languages, depending on ranking w.r.t. Ident(place)
▶ Fixed rankings
▶ Verified by Steriade (1999): if a language allows laryngeal contrasts before a consonant, it allows them before a vowel
▶ *[voice]/ ¬[+son] ≫ *[voice]/ ¬[−son]

Universal CON?

▶ This reasoning is most straightforward if we can guarantee that no language would ever contain a constraint that would ‘subvert’ the predicted asymmetry
▶ Hypothesis: set of constraints (and, perhaps, a priori rankings) is fixed and universal (Prince and Smolensky, 2002)
▶ Or, subject to limitations that guarantee asymmetries (Hayes, 1999; Hayes and Steriade, 2004; Smith, 2003)
▶ Assumed by RCD (must be able to identify all L’s from the start)

Factorial typology

▶ Space of possible grammars = set of possible rankings
▶ Deriving the set of predicted languages
▶ Virtually guaranteed to be fewer languages than rankings (why?)
▶ Enormous space, but much smaller than possible sets of ordered rules

Evaluating typological predictions of a proposed constraint

▶ Can only be assessed through interaction
▶ In practice, often assessed for just a limited set of constraints (‘mini-typology’)
▶ Typological predictions are independent of inputs (Richness of the Base)
▶ Assessing fit to attested typology

Predicted/Attested | Yes | No
Yes | Correctly analyzable | Accidental gap
No | Exception | Correctly excluded

▶ Eliminating exceptions: descriptive adequacy
▶ Minimizing “accidental” gaps → restrictive theory

The typology of stress systems

▶ In principle, all of the constraints that we’ve used up until this point could be submitted to factorial typology and evaluated
▶ Interactions → enormous set of possible languages
▶ Stress assignment: somewhat ‘insulated’ from other parts of the grammar
▶ Easier to document independently of other features of the language (modulo )
▶ Easier to assess mini-typology with some confidence

Stress

▶ An abstract (“hidden”) property
▶ Liberman (1975); Liberman and Prince (1977): linguistic manifestation of rhythmic structure
▶ Prosodic prominence = ‘strength’
▶ Behavioral diagnostics (tapping, text alignment)
▶ English: eligibility for phrasal prominences (‘nuclear intonation tones’, marked with pitch accents)
▶ Diagnosis through pitch accent: calling contour, surprise redundancy contour
▶ Compare: collàborátion, clàssificátion
▶ Conditions phonological processes
▶ Contrast: e.g., vowel reduction in stressless syllables
▶ Other reductions: e.g., flapping in English

Stress

▶ Acoustic correlates: mostly indirect in English (pitch accent)
▶ Inherent: duration, possibly voice quality, following C duration
▶ Accent: intensity/amplitude, pitch
▶ Probably also mostly indirect cues in other languages, though remarkably few studies dissociating stress from pitch accent
▶ NB: when the most straightforward diagnostics (e.g., stress-based meter) are unavailable or irrelevant for a given language, the position of stress can be notoriously difficult for non-native listeners to identify!
▶ Misidentification of duration, pitch, etc. associated with position in word or phrase (French, Welsh)
▶ An interesting problem: difficult also for learners

Typological properties of stress: some universal properties (Hayes, 1995, chap. 3)

▶ Culminativity: every word or phrase has a single strongest (most prominent) syllable
▶ Hierarchical organization
▶ Primary, secondary, tertiary stress: Constantinople 23010 vs. sensationality 32010
▶ Rhythmic organization
▶ Alternating stressed/stressless syllables
▶ If there are multiple stresses in a given domain, they are generally spaced at regular intervals: 102020 not *122000
▶ Regular stresses every two (or sometimes three) syllables
▶ No assimilation
▶ Unlike voicing, place, etc., no tendency for adjacent syllables to agree in stress
▶ In fact, assimilation would destroy rhythmic organization
▶ Often taken as an argument for a distinct representation (not a feature)

Parameters of stress systems

▶ Is the position of stress determined phonologically? (lexical (free) vs. fixed stress)
▶ What determines position?
▶ Edges of the word: stress left, right, penultimate, peninitial, antepenultimate…
▶ Weight: stress ‘heavier’ syllables (long vowel, CVN, CVC, etc.) (Quantity sensitivity)
▶ Stress just the syllable(s) with relevant property (free stress) or regularly alternating syllables (bounded stress)
▶ If alternating: binary or ternary?
▶ Morphological sensitivity

The representation of stress

▶ Featural (but: no assimilation)
▶ Grid (Prince, 1983; Selkirk, 1984)

         ×
×        ×
σ  σ  σ  σ  σ
a bra ca da bra

▶ Feet: binary vs. ternary, head position

(σ̀ σ) σ (σ́ σ)
(a bra) ca (da bra)

Where does stress fall?

Quantity insensitive systems (Gordon, 2002)
▶ Final: Atayal, Moghol, Mazatec
▶ Penultimate: Mohawk, Albanian, Jaqaru
▶ Antepenultimate: Macedonian
▶ Initial: Arabela, Chitimacha, Nenets
▶ Peninitial: Lakhota, Koryak
▶ Postpeninitial: Hocąk (a.k.a. Winnebago)
▶ Rarer: ‘dual’ systems, at/near L and R (one primary, one secondary)

(Not discussed here: quantity sensitive systems, where position of stress depends on vowel length or syllable type)

Capturing stress placement with constraints

Gordon (2002): Align(Level n,Edge)

Level 2: × Level 1: × × Syllables: σ σ σ σ σ

▶ Levels: {1,2}, Edges: {L,R}
▶ Every grid mark on Level n must be aligned with the grid mark on the named edge of Level n-1
▶ Align(Level 1,L): there must be a stress on the leftmost syllable
▶ Example above: satisfies Align(Level 1,L), but violates Align(Level 1,R)
▶ Align(Level 2,L): the leftmost stress must be primary (cf. Hayes, 1995 ‘End Rule Left’)
▶ Example above: violates Align(Level 2,L), but satisfies Align(Level 2,R)
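
The glosses above can be turned into a small checker. This is only a sketch under stated assumptions: violations of Align(Level 1,Edge) are counted as the number of syllables between the named edge and the nearest stress, and violations of Align(Level 2,Edge) as the number of stresses between the primary stress and the named edge. That counting scheme is a simplification chosen to agree with the slide's glosses, and Gordon's own evaluation may count differently. The example word, with stresses on syllables 1 and 4 of 5 and primary stress on syllable 4, is also an assumption, since the exact grid alignment is not recoverable from the slide.

```python
def align_level1(stresses, n_syllables, edge):
    """Syllables between the named edge and the nearest stressed syllable.

    `stresses` holds 1-indexed positions of level 1 grid marks.
    Assumed counting scheme (see lead-in), not necessarily Gordon's.
    """
    if edge == "L":
        return min(stresses) - 1
    return n_syllables - max(stresses)


def align_level2(stresses, primary, edge):
    """Stresses lying between the primary stress and the named edge."""
    if edge == "L":
        return sum(1 for s in stresses if s < primary)
    return sum(1 for s in stresses if s > primary)


# Hypothetical example word: 5 syllables, stresses on 1 and 4, primary on 4.
stresses, primary, n = [1, 4], 4, 5
profile = {
    "Align(Level 1,L)": align_level1(stresses, n, "L"),
    "Align(Level 1,R)": align_level1(stresses, n, "R"),
    "Align(Level 2,L)": align_level2(stresses, primary, "L"),
    "Align(Level 2,R)": align_level2(stresses, primary, "R"),
}
for name, count in profile.items():
    print(f"{name}: {count} violation(s)")
```

Under these assumptions the profile matches the slide: Align(Level 1,L) and Align(Level 2,R) are satisfied, while Align(Level 1,R) and Align(Level 2,L) are each violated once.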

Evaluating Align(Level n,Edge): Gordon (2002, p. 499)

Excerpt from Gordon (2002, p. 499):

[…] are several violations of a constraint committed by a form, the number of violations appears in parentheses. Adopting the ALIGN (x2,X,1,PrWd) constraints as opposed to constraints which count absolute distance of the primary stress from an edge (cf. McCarthy and Prince 1993) has the empirical advantage of creating a more constrained factorial typology of stress systems (fn. 12) as well as the formal advantage (in a grid-based theory of representation) of being more principled, assuming that all grid marks above level 0 must dominate a lower level grid mark (Prince’s (1983) Continuous Column Constraint).

(5) Evaluation of the ALIGN constraints

Before proceeding with analyses employing the ALIGN constraints, it should be noted that, although the ALIGN constraints discussed in this paper will make reference to the word as the stress domain, it is assumed that other members of the ALIGN constraint family sensitive to different stress domains, such as the root and different phrasal levels, also exist. These ALIGN constraints play an important role in characterizing morphologically sensitive stress (see Alderete 1999 for morphological stress […]

Footnote 12: The ALIGN (x2,{R/L},1,PrWd) constraints adopted here generate only 79 distinct stress systems as opposed to 93 generated by their hypothetical counterparts which count absolute distance of the main stress from an edge. The extra patterns, none of which are attested, fall under the class of fixed stress systems displaying two stresses per domain (see section 2.2 for the factorial typology of fixed stress).

Rhythmic stress and windows

▶ *Clash
▶ No sequences of two stressed syllables: *σ́σ́
▶ *Lapse
▶ No sequences of two stressless syllables: *σσ
▶ *Extended Lapse
▶ No sequences of three stressless syllables: *σσσ
▶ Position: *Lapse(R), *Lapse(L), *ExtLapse(R), possibly also *ExtLapse(L)

Rhythmic stress and windows

▶ The idea behind windows: stress wants to be at one edge of the word, but is prohibited from being more than one/two syllables from the end
▶ Antepenultimate: *ExtLapse(R) ≫ Align(Level 1,L) ≫ Align(Level 1,R)
▶ *Lapse(R), *Lapse(L): penultimate, peninitial stress
▶ *ExtLapse(R): antepenultimate (and *ExtLapse(L) if postpeninitial exists)
▶ Gradient violations: must be better to stay ‘at outer edge of window’ than to go all the way to opposite edge

/σσσσσ/ | *ExtLapse(R) | Align(Level 1,L) | Align(Level 1,R)
a. σ́σσσσ | *! W | | ****
b. σσ́σσσ | *! W | * | ***
c. σσσ́σσ | | ** | **
d. σσσσ́σ | | ***! | *
e. σσσσσ́ | | ***!* |
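
The tableau can be reproduced mechanically. The sketch below assumes single-stress candidates only, with *ExtLapse(R) violated when three or more stressless syllables follow the stress and the Align constraints counting distance from the relevant edge. Strict domination is implemented by comparing violation tuples in ranking order. Function names are illustrative.

```python
def violations(n, p):
    """Violation profile for an n-syllable word stressed on syllable p (1-indexed)."""
    return {
        "*ExtLapse(R)": int(n - p >= 3),   # three or more stressless syllables at the right edge
        "Align(Level 1,L)": p - 1,         # syllables between stress and left edge
        "Align(Level 1,R)": n - p,         # syllables between stress and right edge
    }


def winner(n, ranking):
    """Optimal stress position: lexicographic comparison models strict domination."""
    return min(range(1, n + 1),
               key=lambda p: tuple(violations(n, p)[c] for c in ranking))


RANKING = ["*ExtLapse(R)", "Align(Level 1,L)", "Align(Level 1,R)"]

for n in range(2, 8):
    p = winner(n, RANKING)
    print(n, "syllables: stress on syllable", p)
```

For the five-syllable input this selects candidate (c), and for longer words stress stays on the antepenult, the outer edge of the three-syllable window.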

Culminativity

▶ Exactly one primary stress
▶ Grids: assign violation for multiple grid marks at highest grid level
▶ Since never violated, perhaps not a rankable constraint? (requires fancier Gen: intrinsic limitation on grid representations that can be generated)

An example: Sibutu Sama (Malayo-Polynesian)

a. bɪssála ‘talk’
b. bɪ̀ssaláhan ‘persuading’
c. bɪ̀ssalahánna ‘he is persuading’
d. bɪ̀ssalahankámi ‘we are persuading’

▶ Initial and penultimate stress (dual stress), except in three-syllable words
▶ Initial and penultimate: *Lapse(R) ≫ Align(Level 1,Edges) ≫ Align(Level 1,L), Align(Level 1,R)
▶ Primary stress is the rightmost stress: Align(Level 2,R)
▶ Avoiding *bɪ̀ssála: *Clash
▶ No sequences of two stressed syllables

NonFinal

▶ Stress (a level 1 grid mark) does not fall on the final syllable
▶ Violated if final syllable is stressed
▶ Not needed for anything so far, but Gordon includes as a way of deriving penultimate stress
▶ Only becomes important in systems that require regularly alternating stress (“bounded stress”)

The typology so far

▶ Twelve constraints
▶ Align(Level 1, L), Align(Level 1, R), Align(Level 1, Edges)
▶ Align(Level 2, L), Align(Level 2, R)
▶ *Clash
▶ *Lapse, *Lapse(L), *Lapse(R)
▶ *ExtLapse, *ExtLapse(R)
▶ NonFinality
▶ 12! = 479,001,600 possible rankings
▶ Gordon (2002): calculated possible combinations of stress placement for words of 1 through 8 syllables
▶ For words of each length, candidates with all possible stress positions (respecting culminativity) were considered
▶ Yields 10,823,318,000,000 logically possible languages!
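
Both figures can be checked with a few lines of arithmetic. The number of rankings is simply 12!. For the number of logically possible languages, the sketch below uses a counting scheme that is an assumption reconstructed from the total, not taken from Gordon's text: each syllable is stressless, secondary, or primary, and each word has exactly one primary stress, giving n · 2^(n−1) patterns for an n-syllable word. A language picks one pattern for each length from 1 to 8.

```python
from math import factorial, prod

# Number of total orders of twelve constraints.
rankings = factorial(12)


def patterns(n):
    """Stress patterns for n syllables: choose the primary (n ways),
    then each other syllable is stressless or secondary (2 ways each).
    Assumed counting scheme, see lead-in."""
    return n * 2 ** (n - 1)


# One pattern per word length, for lengths 1 through 8.
languages = prod(patterns(n) for n in range(1, 9))

print(f"{rankings:,} rankings")
print(f"{languages:,} logically possible languages")
print(f"rounded to the nearest million: {round(languages, -6):,}")
```

Under that assumption the product agrees with the slide's figure once rounded to the nearest million.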

The typology so far

▶ Result: only 152 combinations actually emerge as optimal under some ranking
▶ 79 different stress placements, 73 others that switch which stress is primary and which is secondary
▶ Single stress systems: just 6 predicted (see Table IV, p. 512)
▶ 5 attested in quantity-insensitive systems
▶ One unattested, but found in a quantity-sensitive language (Hopi) among words with all light syllables: peninitial stress, but non-finality forces initial stress in words of 2 syllables
▶ Dual stress systems: 34 predicted possibilities (17 placements, primary at left or right)
▶ Of these, only about 6 are attested
▶ Another 6 have their “opposite side” counterparts attested (same stress placement, but differs in which side is primary)
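
The collapse of many rankings onto few languages can be illustrated on a much smaller scale. The sketch below is a toy mini-typology, not a reproduction of Gordon's calculation: it uses only five constraints, restricted to single-stress candidates and words of 1 to 6 syllables, with simplified definitions that are assumptions made for this example.

```python
from itertools import permutations

# Simplified single-stress definitions (assumptions): n = word length,
# p = 1-indexed position of the only stress.
CONSTRAINTS = {
    "Align(L)":     lambda n, p: p - 1,
    "Align(R)":     lambda n, p: n - p,
    "*Lapse(R)":    lambda n, p: int(n - p >= 2),
    "*ExtLapse(R)": lambda n, p: int(n - p >= 3),
    "NonFinal":     lambda n, p: int(p == n),
}


def language(ranking, lengths=range(1, 7)):
    """Stress position chosen for each word length under one ranking."""
    return tuple(
        min(range(1, n + 1),
            key=lambda p: tuple(CONSTRAINTS[c](n, p) for c in ranking))
        for n in lengths
    )


rankings = list(permutations(CONSTRAINTS))
languages = {language(r) for r in rankings}

print(len(rankings), "rankings")
print(len(languages), "distinct languages")
for lang in sorted(languages):
    print(lang)
```

Many rankings differ only in the order of constraints that never get a chance to decide anything, so they pick the same winners for every input. Initial, final, penultimate, and antepenultimate stress all appear among the outputs of this toy typology.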

The typology so far

▶ Some systems are predicted to be impossible, and are unattested
▶ Antepenultimate + peninitial: would require simultaneously highest-ranked Align(Level 1,L) and Align(Level 1,R)
▶ Can’t be derived, and do not occur
▶ Gordon argues that the others are generally ‘close’ to attested systems
▶ Only 14 dual stress systems attested in total, so accidental gaps are very likely (no explanation for general rarity)
▶ Many gaps involve independently rare properties (clashes, peninitial stress, etc.)
▶ See Gordon §2.2.3 regarding another possible principle ruling out some unattested patterns: Uniformity of Primary Stress Placement

Summary so far

▶ A pretty good match for fixed stress systems!
▶ Gordon discusses some (small) advantages of Align(Edges) rather than independent Align(L), Align(R) for dual systems
▶ For fixed stress, feet are unlikely to help make the predictions even better, since we are not dealing with alternating stresses

Assessing fit

▶ Undergeneration: fatal, if true (empirical adequacy)
▶ But apparent exceptions merit careful scrutiny
▶ Overgeneration
▶ Accidental gaps? (low expected probability, or historical ‘accident’)
▶ Additional pressures, such as learnability

The midpoint pathology (Kager, 2012; Stanton, 2016)

▶ For short words, possible to satisfy both *(Extended)Lapse(L) and *(Extended)Lapse(R), by keeping stress towards the middle of the word

/σσσσσ/ | *ExtLapse(L) | *ExtLapse(R)
a. σσσ́σσ | |
b. σσσσσ́ | *! W |
c. σ́σσσσ | | *! W

▶ For longer words, can’t satisfy both, so satisfy the higher-ranked one and keep stress at the relevant edge

/σσσσσσσ/ | *ExtLapse(L) | *ExtLapse(R)
a. σσσσ́σσσ | *! W | *
b. σ́σσσσσσ | | *
c. σσσσσσσ́ | *! W | L

A ‘midpoint-stress’ language

*ExtendedLapse(L) ≫ *ExtendedLapse(R) ≫ Align(L) ≫ Align(R)

2 syl σ́σ
3 syl σ́σσ
4 syl σσ́σσ
5 syl σσσ́σσ
6 syl σ́σσσσσ
7 syl σ́σσσσσσ
8 syl σ́σσσσσσσ

▶ *ExtLapse(L/R) ≫ Align(L/R): stress can move inside word to avoid extended lapse
▶ *ExtLapse(L) ≫ *ExtLapse(R): when the word is too long to satisfy both, it moves to the left side of the word
▶ Align(L) ≫ Align(R): when it’s on the left side of the word, it falls on the very first syllable

Stanton’s observation

/σσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σ | | | | *
b. σσ́ | | | *! W | L

/σσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσ | | | | **
b. σσ́σ | | | *! W | * L
c. σσσ́ | | | *!* W | L

/σσσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσσ | | *! W | L | *** W
b. σσ́σσ | | | * | **
c. σσσ́σ | | | **! W | * L
d. σσσσ́ | *! W | | *** W | L

/σσσσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσσσ | | *! W | L | **** W
b. σσ́σσσ | | *! W | * L | *** W
c. σσσ́σσ | | | ** | **
d. σσσσ́σ | *! W | | *** W | * L
e. σσσσσ́ | *! W | | **** W | L

/σσσσσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσσσσ | | * | | *****
b. σσ́σσσσ | | * | * W | **** L
c. σσσ́σσσ | | * | ** W | *** L
d. σσσσ́σσ | *! W | L | *** W | ** L
e. σσσσσ́σ | *! W | L | **** W | * L
f. σσσσσσ́ | *! W | L | ***** W | L

/σσσσσσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσσσσσ | | * | | ******
b. σσ́σσσσσ | | * | * W | ***** L
c. σσσ́σσσσ | | * | ** W | **** L
d. σσσσ́σσσ | *! W | * | *** W | *** L
e. σσσσσ́σσ | *! W | L | **** W | ** L
f. σσσσσσ́σ | *! W | L | ***** W | * L
g. σσσσσσσ́ | *! W | L | ****** W | L

▶ Clear evidence for Align(L) ≫ Align(R) in 2,3,4-syllable words
▶ Evidence for *ExtLapse(R) ≫ Align(L) from 5-syllable words
▶ Evidence for *ExtLapse(L) ≫ *ExtLapse(R) only from 6-syllable words and longer

On the relative scarcity of long words

Excerpt from Stanton (2016, p. 765):

[…] 24, they would have to be exposed to words that are six syllables or longer. A survey of text corpora from 102 languages reveals that this situation is, on average, unrealistic: long words are infrequent (on the distribution of word lengths, see also Hatzigeorgiu et al. 2001, Sigurd et al. 2004, Piantadosi et al. 2011, Kalimeri et al. 2015). The results of this word-length study are presented in Figure 1: each thin gray line represents the frequency distribution of an individual language, while the thicker black line represents the median values. More details about how the survey was conducted, as well as more information on the surveyed languages (including frequencies by language, genetic classification information, and sources of the data), are given in the appendices.

Figure 1. Results of the survey of text corpora from 102 languages (see the appendices for more details).

The important point to take away from Fig. 1 is that, assuming the median values represent approximately what the average learner would be exposed to, words of five or more syllables make up only 4% of the learner’s input, and words of six or more syllables make up only 1%. What this means, then, is that for a learner attempting to learn a midpoint system like 23 or 24, evidence as to the relative ranking of the anti-lapse constraints comes from a small minority of forms present in the input. Since there is reason to believe that long words are even less frequent in child-directed speech (see e.g. Vihman et al. 1994:656 for properties of child-directed speech in English, French, and Swedish, where one-to-two-syllable words predominate), patterns where crucial rankings are available only in these longer words might therefore be difficult for a child to acquire.

The rest of this subsection focuses on the following question: if a learner samples long words at the rate they are attested crosslinguistically, does it have a difficult time learning midpoint systems? To address this question, we focus on the learner’s behavior as the number of long words that it encounters is steadily decreased. To model this decrease in the number of long words, I selected five word-length distributions from the word-length study, detailed in Table 5. Here, Portuguese represents the ‘average’ language, since its distribution is closest to the median. Inuktitut represents the upper bound, since it has more long words than any other language in the study; Haitian represents the lower bound, since it has very few. English and Ganda represent intermediate points along the continuum. Each of the word-length distributions in Table 5 represents a learner that encounters words of different lengths at different rates. To probe the effects of the word-length distribution on learning different stress systems, I taught each learner five different systems […]

▶ Rough estimate of relative proportion of words of different lengths in texts of 102 languages
▶ With a few notable exceptions, words of six syllables or longer make up a very small proportion of the linguistic input
▶ A further caveat not reflected here: long words tend to be morphologically complex (may show other patterns)


Learning from short words

/σσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σ | | | | *
b. σσ́ | | | * W | L

/σσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσ | | | | **
b. σσ́σ | | | * W | * L
c. σσσ́ | | | ** W | L

/σσσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσσ | | * W | L | *** W
b. σσ́σσ | | | * | **
c. σσσ́σ | | | ** W | * L
d. σσσσ́ | * W | | *** W | L

/σσσσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσσσ | | * W | L | **** W
b. σσ́σσσ | | * W | * L | *** W
c. σσσ́σσ | | | ** | **
d. σσσσ́σ | * W | | *** W | * L
e. σσσσσ́ | * W | | **** W | L

▶ Applying RCD: *ExtLapse(L), *ExtLapse(R) ≫ Align(L) ≫ Align(R)
▶ Close, but leaves open ranking of *ExtLapse(L), *ExtLapse(R)
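
The RCD result can be reproduced with a short sketch. It assumes the winners are the midpoint-system forms for words of 2 to 5 syllables, that every other single-stress candidate serves as a loser, and that the constraints have the simplified single-stress definitions used in these tableaux. The function and variable names are illustrative.

```python
NAMES = ["*ExtLapse(L)", "*ExtLapse(R)", "Align(L)", "Align(R)"]


def violations(n, p):
    """Profile for an n-syllable word with its only stress on syllable p."""
    return {
        "*ExtLapse(L)": int(p - 1 >= 3),
        "*ExtLapse(R)": int(n - p >= 3),
        "Align(L)": p - 1,
        "Align(R)": n - p,
    }


def rcd(pairs, constraints):
    """Recursive Constraint Demotion over (winner, loser) violation profiles."""
    strata, remaining, pairs = [], list(constraints), list(pairs)
    while remaining:
        # Installable constraints prefer no loser (no L) in any unexplained pair.
        stratum = [c for c in remaining if all(w[c] <= l[c] for w, l in pairs)]
        if not stratum:
            raise ValueError("no consistent ranking")
        strata.append(stratum)
        remaining = [c for c in remaining if c not in stratum]
        # A pair is explained once an installed constraint prefers the winner (W).
        pairs = [(w, l) for w, l in pairs if not any(w[c] < l[c] for c in stratum)]
    return strata


# Midpoint-system winners for short words (stressed syllable per word length).
WINNERS = {2: 1, 3: 1, 4: 2, 5: 3}
pairs = [(violations(n, w), violations(n, p))
         for n, w in WINNERS.items()
         for p in range(1, n + 1) if p != w]

for stratum in rcd(pairs, NAMES):
    print(", ".join(stratum))
```

The two anti-lapse constraints land in the same top stratum because neither ever prefers a loser in these data, which is exactly the indeterminacy noted on the slide.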

Two possible refinements

(☹ = preferred by generating grammar, losing in acquired grammar)

*ExtLapse(L) ≫ *ExtLapse(R)

/σσσσσσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσσσσσ | | * | | ******
b. σσ́σσσσσ | | * | * W | ***** L
c. σσσ́σσσσ | | * | ** W | **** L
d. σσσσ́σσσ | *! W | * | *** W | *** L
e. σσσσσ́σσ | *! W | L | **** W | ** L
f. σσσσσσ́σ | *! W | L | ***** W | * L
g. σσσσσσσ́ | *! W | L | ****** W | L

/σσσσσσσσ/ | *ExtLapse(L) | *ExtLapse(R) | Align(L) | Align(R)
a. σ́σσσσσσσ | | * | | *******
b. σσ́σσσσσσ | | * | * W | ****** L
c. σσσ́σσσσσ | | * | ** W | ***** L
d. σσσσ́σσσσ | *! W | * | *** W | **** L
e. σσσσσ́σσσ | *! W | * | **** W | *** L
f. σσσσσσ́σσ | *! W | L | ***** W | ** L
g. σσσσσσσ́σ | *! W | L | ****** W | * L
h. σσσσσσσσ́ | *! W | L | ******* W | L

*ExtLapse(R) ≫ *ExtLapse(L)

/σσσσσσσ/ | *ExtLapse(R) | *ExtLapse(L) | Align(L) | Align(R)
☹ a. σ́σσσσσσ | *! W | | L | ****** W
b. σσ́σσσσσ | *! W | | * L | ***** W
c. σσσ́σσσσ | *! W | | ** L | **** W
d. σσσσ́σσσ | *! W | * | *** L | *** W
e. σσσσσ́σσ | | * | **** | **
f. σσσσσσ́σ | | * | *****! W | * L
g. σσσσσσσ́ | | * | *****!* W | L

/σσσσσσσσ/ | *ExtLapse(R) | *ExtLapse(L) | Align(L) | Align(R)
☹ a. σ́σσσσσσσ | *! W | | | ******* W
b. σσ́σσσσσσ | *! W | | * L | ****** W
c. σσσ́σσσσσ | *! W | | ** L | ***** W
d. σσσσ́σσσσ | *! W | * | *** L | **** W
e. σσσσσ́σσσ | *! W | * | **** L | *** W
f. σσσσσσ́σσ | | * | ***** | **
g. σσσσσσσ́σ | | * | ******! W | * L
h. σσσσσσσσ́ | | * | ******!* W | L

Length | Midpoint system | Antepenultimate stress
2 syl | σ́σ | σ́σ
3 syl | σ́σσ | σ́σσ
4 syl | σσ́σσ | σσ́σσ
5 syl | σσσ́σσ | σσσ́σσ
6 syl | σ́σσσσσ | σσσσ́σσ
7 syl | σ́σσσσσσ | σσσσσ́σσ
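
The comparison table can be regenerated by evaluating both total rankings over words of increasing length. As before, this is a sketch that assumes single-stress candidates and simplified constraint definitions. The names `MIDPOINT` and `ANTEPENULT` are labels introduced here for the two refinements.

```python
def violations(n, p):
    """Profile for an n-syllable word with its only stress on syllable p."""
    return {
        "*ExtLapse(L)": int(p - 1 >= 3),
        "*ExtLapse(R)": int(n - p >= 3),
        "Align(L)": p - 1,
        "Align(R)": n - p,
    }


def predict(ranking, n):
    """Stress position selected under strict domination."""
    return min(range(1, n + 1),
               key=lambda p: tuple(violations(n, p)[c] for c in ranking))


MIDPOINT = ["*ExtLapse(L)", "*ExtLapse(R)", "Align(L)", "Align(R)"]
ANTEPENULT = ["*ExtLapse(R)", "*ExtLapse(L)", "Align(L)", "Align(R)"]


def render(n, p):
    """Display an n-syllable word with an acute accent on syllable p."""
    return "".join("σ\u0301" if i == p else "σ" for i in range(1, n + 1))


for n in range(2, 9):
    print(n, render(n, predict(MIDPOINT, n)), render(n, predict(ANTEPENULT, n)))

# Shortest word length at which the two grammars disagree.
first_difference = next(n for n in range(2, 9)
                        if predict(MIDPOINT, n) != predict(ANTEPENULT, n))
print("first difference at", first_difference, "syllables")
```

The two grammars agree on every word of up to five syllables, so only words of six or more syllables can tell a learner which one generated the data.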

Ambiguity in short words

▶ The consequence: based on data from words shorter than 6 syllables, learners exposed to a midpoint system might infer that they are learning antepenultimate stress instead
▶ Hoped-for claim: the midpoint system is ‘unstable’, in that learners may not reliably recover it, and go for antepenultimate stress instead
▶ But a problem: since midpoint and antepenultimate stress are ambiguous in short words, learners exposed to antepenultimate stress might just as well assume that they are learning the midpoint system!
▶ Where we actually are now: predict variability or changes in both directions
▶ Where does the antepenultimate bias come from?

The learning algorithm matters

▶ RCD does not explain the antepenultimate stress bias, because in short words, both *ExtLapse(L) and *ExtLapse(R) are ‘W-only’ constraints, so remain highly ranked
▶ Stanton’s conjecture: human learners actually use a ranking algorithm that doesn’t just demote L’s, but also promotes W’s (Boersma, 1997; Magri, 2012)
▶ Why this will help:
▶ Short words give a lot of evidence for Align(L) ≫ Align(R)
▶ If this evidence is used to demote Align(R) and promote Align(L), then Align(L) will end up above other markedness constraints
▶ Similarly, 4- and 5-syllable words provide evidence for *ExtLapse(R) ≫ Align(L), causing *ExtLapse(R) to be promoted as well
▶ Consequence: *ExtLapse(L) is ‘left in the dust’ (not promoted until you get 6+ syllable words, at which point it might be too late)
▶ Background: the Gradual Learning Algorithm, slides 43–46 from Class 7
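
The difference between demotion-only and promotion-plus-demotion updates can be shown on a single error. This is a deliberately stripped-down sketch and not the Gradual Learning Algorithm itself: it leaves out evaluation noise, starts every constraint at an arbitrary ranking value of 100, and uses a hypothetical plasticity of 1. The learner is assumed to have produced σσ́ for a two-syllable word whose observed form is σ́σ.

```python
def update(values, observed, error, plasticity=1.0, promote=True):
    """One error-driven update of numeric ranking values.

    Constraints preferring the learner's error (L) are demoted. With
    promote=True, constraints preferring the observed form (W) are also
    promoted. Noise-free simplification; see lead-in for assumptions.
    """
    new = dict(values)
    for c in values:
        if observed[c] > error[c]:                # L: favors the error
            new[c] -= plasticity
        elif observed[c] < error[c] and promote:  # W: favors the observed form
            new[c] += plasticity
    return new


start = {"*ExtLapse(L)": 100.0, "*ExtLapse(R)": 100.0,
         "Align(L)": 100.0, "Align(R)": 100.0}

# Violation profiles for the two-syllable word.
observed = {"*ExtLapse(L)": 0, "*ExtLapse(R)": 0, "Align(L)": 0, "Align(R)": 1}  # σ́σ
error = {"*ExtLapse(L)": 0, "*ExtLapse(R)": 0, "Align(L)": 1, "Align(R)": 0}     # σσ́

demotion_only = update(start, observed, error, promote=False)
symmetric = update(start, observed, error, promote=True)

print("demotion only:", demotion_only)
print("promotion and demotion:", symmetric)
```

With demotion only, Align(L) stays level with the anti-lapse constraints. With promotion it immediately rises above *ExtLapse(L), which receives no evidence from short words and therefore never moves.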

Learning from short words: promotion and demotion

/σσ/ | *ExtLapse(L) | *ExtLapse(R) | ←Align(L) | Align(R)→
a. σ́σ | | | | *
b. σσ́ | | | * W | L

/σσσ/ | *ExtLapse(L) | *ExtLapse(R) | ←Align(L) | Align(R)→
a. σ́σσ | | | | **
b. σσ́σ | | | * W | * L
c. σσσ́ | | | ** W | L

/σσσσ/ | *ExtLapse(L) | *ExtLapse(R) | ←Align(L) | Align(R)→
a. σ́σσσ | | * W | L | *** W
b. σσ́σσ | | | * | **
c. σσσ́σ | | | ** W | * L
d. σσσσ́ | * W | | *** W | L

/σσσσσ/ | *ExtLapse(L) | *ExtLapse(R) | ←Align(L) | Align(R)→
a. σ́σσσσ | | * W | L | **** W
b. σσ́σσσ | | * W | * L | *** W
c. σσσ́σσ | | | ** | **
d. σσσσ́σ | * W | | *** W | * L
e. σσσσσ́ | * W | | **** W | L

▶ The shortest and most frequent words give lots of unambiguous evidence to demote Align(R), and now we also promote Align(L)

Learning from short words: promotion and demotion

/σσ/ | Align(L)← | *ExtLapse(L) | *ExtLapse(R) | →Align(R)
a. σ́σ | | | | *
b. σσ́ | * W | | | L

/σσσ/ | Align(L)← | *ExtLapse(L) | *ExtLapse(R) | →Align(R)
a. σ́σσ | | | | **
b. σσ́σ | * W | | | * L
c. σσσ́ | ** W | | | L

/σσσσ/ | Align(L)← | *ExtLapse(L) | *ExtLapse(R) | →Align(R)
a. σ́σσσ | L | | * W | *** W
b. σσ́σσ | * | | | **
c. σσσ́σ | ** W | | | * L
d. σσσσ́ | *** W | * W | | L

/σσσσσ/ | Align(L)← | *ExtLapse(L) | *ExtLapse(R) | →Align(R)
a. σ́σσσσ | L | | * W | **** W
b. σσ́σσσ | * L | | * W | *** W
c. σσσ́σσ | ** | | | **
d. σσσσ́σ | *** W | * W | | * L
e. σσσσσ́ | **** W | * W | | L

▶ The shortest and most frequent words give lots of unambiguous evidence to demote Align(R), and now we also promote Align(L)

Learning from short words: promotion and demotion

/σσ/ | ←Align(L)→ | *ExtLapse(L) | ←*ExtLapse(R) | ←Align(R)→
a. σ́σ | | | | *
b. σσ́ | * W | | | L

/σσσ/ | ←Align(L)→ | *ExtLapse(L) | ←*ExtLapse(R) | ←Align(R)→
a. σ́σσ | | | | **
b. σσ́σ | * W | | | * L
c. σσσ́ | ** W | | | L

/σσσσ/ | ←Align(L)→ | *ExtLapse(L) | ←*ExtLapse(R) | ←Align(R)→
a. σ́σσσ | L | | * W | *** W
b. σσ́σσ | * | | | **
c. σσσ́σ | ** W | | | * L
d. σσσσ́ | *** W | * W | | L

/σσσσσ/ | ←Align(L)→ | *ExtLapse(L) | ←*ExtLapse(R) | ←Align(R)→
a. σ́σσσσ | L | | * W | **** W
b. σσ́σσσ | * L | | * W | *** W
c. σσσ́σσ | ** | | | **
d. σσσσ́σ | *** W | * W | | * L
e. σσσσσ́ | **** W | * W | | L

▶ 4- and 5-syllable words provide evidence that *ExtLapse(R) or Align(R) must outrank Align(L)
▶ But we know Align(L) ≫ Align(R), reinforced by lots more data
▶ So eventually just *ExtLapse(R) is promoted

Learning from short words: promotion and demotion

/σσ/ | *ExtLapse(R)← | ←Align(L)→ | *ExtLapse(L) | ←Align(R)→
a. σ́σ | | | | *
b. σσ́ | | * W | | L

/σσσ/ | *ExtLapse(R)← | ←Align(L)→ | *ExtLapse(L) | ←Align(R)→
a. σ́σσ | | | | **
b. σσ́σ | | * W | | * L
c. σσσ́ | | ** W | | L

/σσσσ/ | *ExtLapse(R)← | ←Align(L)→ | *ExtLapse(L) | ←Align(R)→
a. σ́σσσ | * W | L | | *** W
b. σσ́σσ | | * | | **
c. σσσ́σ | | ** W | | * L
d. σσσσ́ | | *** W | * W | L

/σσσσσ/ | *ExtLapse(R)← | ←Align(L)→ | *ExtLapse(L) | ←Align(R)→
a. σ́σσσσ | * W | L | | **** W
b. σσ́σσσ | * W | * L | | *** W
c. σσσ́σσ | | ** | | **
d. σσσσ́σ | | *** W | * W | * L
e. σσσσσ́ | | **** W | * W | L

▶ 4- and 5-syllable words provide evidence that *ExtLapse(R) or Align(R) must outrank Align(L)
▶ But we know Align(L) ≫ Align(R), reinforced by lots more data
▶ So eventually just *ExtLapse(R) is promoted

The resulting grammar

*ExtLapse(R) ≫ Align(L) ≫ *ExtLapse(L) ≫ Align(R)

▶ This ranking works for words of 2–5 syllables (see previous slide)
▶ But it predicts antepenultimate stress for longer words

/σσσσσσ/ | *ExtLapse(R) | Align(L) | *ExtLapse(L) | Align(R)
☹ a. σ́σσσσσ | *! W | L | L | ***** W
b. σσ́σσσσ | *! W | * L | L | **** W
c. σσσ́σσσ | *! W | ** L | L | *** W
d. σσσσ́σσ | | *** | * | **
e. σσσσσ́σ | | ****! W | * | * L
f. σσσσσσ́ | | ****!* W | * | L

/σσσσσσσ/ | *ExtLapse(R) | Align(L) | *ExtLapse(L) | Align(R)
☹ a. σ́σσσσσσ | *! W | L | L | ****** W
b. σσ́σσσσσ | *! W | * L | L | ***** W
c. σσσ́σσσσ | *! W | ** L | L | **** W
d. σσσσ́σσσ | *! W | *** L | * | *** W
e. σσσσσ́σσ | | **** | * | **
f. σσσσσσ́σ | | *****! W | * | * L
g. σσσσσσσ́ | | *****!* W | * | L

▶ Result: regardless of whether the learner was trained on midpoint or antepenultimate stress, it learns an antepenultimate grammar
▶ …at least, until long words are encountered, if it’s not too late

Stepping back: the approach, more generally

▶ Some unattested systems may be possible to capture grammatically, but are difficult to learn
▶ Goal: theory of grammatical learning that predicts that learners, when exposed to typical input from a ‘difficult’ pattern, systematically misacquire it as a different, more commonly attested pattern
▶ Potential to explain not only unattested systems, but also rare systems (which we can’t exclude as impossible grammars, anyway)
▶ Converging evidence: acquisition data, learning in the lab?

References

Boersma, P. (1997). How we learn variation, optionality, and probability. Proceedings of the Institute of Phonetic Sciences of the University of Amsterdam 21, 43–58. http://fon.hum.uva.nl/paul/.

Gordon, M. (2002). A factorial typology of quantity-insensitive stress. Natural Language & Linguistic Theory 20, 491–552.

Hayes, B. (1995). Metrical Stress Theory: Principles and Case Studies. Chicago: University of Chicago Press.

Hayes, B. (1999). Phonological restructuring in Yidiɲ and its theoretical consequences. In B. Hermans and M. van Oostendorp (Eds.), The Derivational Residue in Phonological Optimality Theory, pp. 175–205. Amsterdam: John Benjamins.

Hayes, B. and D. Steriade (2004). The phonetic basis of phonological markedness. In B. Hayes, R. Kirchner, and D. Steriade (Eds.), Phonetically based phonology, pp. 1–33. Cambridge: Cambridge University Press.

Kager, R. (2012). Stress in windows: Language typology and factorial typology. Lingua 122, 1454–1493.

Liberman, M. (1975). The Intonational System of English. Ph.D. thesis, MIT.

Liberman, M. and A. Prince (1977). On stress and linguistic rhythm. Linguistic Inquiry 8, 249–336.

Magri, G. (2012). Convergence of error-driven ranking algorithms. Phonology 29(2), 213–269.

Prince, A. (1983). Relating to the grid. Linguistic Inquiry 14, 19–100.

Prince, A. and P. Smolensky (1993/2002). Optimality Theory: Constraint Interaction in Generative Grammar. Technical report, Rutgers RuCCS-TR-2/University of Colorado, Boulder CU-CS-696-93. ROA 537, 8/2002 version.

Selkirk, E. (1984). Phonology and syntax. Cambridge: MIT Press.

Smith, J. (2003). Towards a compositional treatment of positional constraints: The case of positional augmentation. In A. Carpenter, A. Coetzee, and P. de Lacy (Eds.), UMass Occasional Papers in Linguistics (UMOP) 26, pp. 337–370. Amherst, MA: GLSA.

Stanton, J. (2016). Learnability shapes typology: the case of the midpoint pathology. Language 92, 753–791.

Steriade, D. (1999). Phonetics in phonology: The case of laryngeal neutralization. In M. K. Gordon (Ed.), UCLA Working Papers in Linguistics, Number 2: Papers in Phonology 3, pp. 25–146. http://www.linguistics.ucla.edu/people/steriade/papers/phoneticsinphonology.pdf.