Malamud 9/13/2006

Intro to Linguistics

Lecture 5.

Recall:

• phonetics: the physical manifestation of language in sound waves; how these sounds are articulated and perceived
• phonology: the mental representation of sounds as part of a symbolic cognitive system; how abstract sound categories are manipulated in the processing of language

So phonetics deals with the physiological and acoustic parts of the path between speaker and listener, while phonology resides in the brain.

Phonemes

The phonological elements of a language are the basic, distinctive sounds, also called phonemes. In English, these are the following (for one standard dialect).

• consonants: p, t, k, b, d, g, č, ǰ, f, θ, s, š, h, v, ð, z, ž, m, n, ŋ, l, r, w, y
• vowels: i, u, I, U, e, o, ε, ə/ʌ, ɔ, æ, a, ay, aw, oy

"distinctive" = can be used to make contrasts between different words. For the stops, using minimal pairs (words that differ in exactly one sound):

pill till kill bill dill gill

For the vowels (for each individual pair of vowels we could come up with a minimal pair):

bead booed bid bade bowed bed bud bad bod bide Boyd
heed who'd hid hood head hawed had hide how'd
jean June jin Jane Jen John join
lead lewd lid laid load lead lad lod loud Lloyd

And for the nasals: rum run rung
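The minimal-pair test used above can be sketched in a few lines of Python. This is only an illustration: the function name and the list-of-symbols transcriptions are assumptions made here, not part of the lecture. Words are given as lists of sounds so that a two-letter symbol such as "ay" counts as one sound.

```python
def is_minimal_pair(word_a, word_b):
    """Return True if two transcriptions differ in exactly one sound.

    Each word is a list of sound symbols (hypothetical encoding),
    so the comparison is sound by sound, not letter by letter.
    """
    if len(word_a) != len(word_b):
        return False
    differences = sum(1 for a, b in zip(word_a, word_b) if a != b)
    return differences == 1


# Illustrative transcriptions (assumed for this sketch).
pill = ["p", "I", "l"]
till = ["t", "I", "l"]
rum = ["r", "ʌ", "m"]
rung = ["r", "ʌ", "ŋ"]

print(is_minimal_pair(pill, till))  # True: only the first sound differs
print(is_minimal_pair(rum, rung))   # True: only the final nasal differs
print(is_minimal_pair(pill, rung))  # False: all three sounds differ
```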

In English, the velar nasal [ŋ] can't occur at the beginning of a word -- cf. map, nap, *ngap.

What are the restrictions on the way these elements are organized into words?

A basic way in which languages differ is in their inventory of sounds, or phonemes. For example:

• German has the voiceless velar fricative [x], as in Bach "creek".
  o English has voiceless fricatives such as [s] and velars such as [k], but it doesn't have a single sound that has both of these properties.
• German also has the high front rounded vowel [ü], as in kühn "bold".
  o Again, English has high front [i] and rounded [u], but these properties are not combined in one vowel.
• English [θ] sets it apart from many languages, including German and French.
  o They have several voiceless fricatives, but not the interdental.

Learning a new (or a first) language includes learning the "list" or inventory of sounds.

Syllables

Phonological structure - the way these elements are organized - includes the notion of syllable and its subparts. This structure is crucially involved in describing the possible words.

• the onset, or consonant(s) at the beginning of the syllable
  o English normally permits up to two consonants
  o but in addition, [s] can be added to the beginning of many syllables as well, making up to three consonants
  o all sounds can occur in this position except for [ŋ]
• the nucleus, or vowel that is the core of the syllable
  o sometimes a consonant can serve as the nucleus, as in the second syllable of kitten or table
• the coda, or consonant(s) at the end of the syllable
  o English normally permits up to two consonants at the end (belt, jump, arc)
  o but in addition, certain sounds such as [s, t, θ] can be piled up (belts, sixths)

Here's a general schema of how syllables are constructed.

SYLLABLE
    ONSET: consonant(s)
    RHYME
        NUCLEUS: vowel
        CODA: consonant(s)

Rhyme = nucleus + coda, e.g. in blend rhyme = [ε] + [nd].
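As a rough illustration of this schema, the sketch below splits a single syllable into onset, nucleus, coda and rhyme. The function name, the vowel set and the list-of-symbols input are assumptions made for this example. It handles only syllables with exactly one vowel, so syllabic consonants are out of scope here.

```python
# Vowel symbols assumed for this sketch (one string per sound).
VOWELS = {"i", "u", "I", "U", "e", "o", "ε", "ə", "ʌ", "ɔ", "æ", "a",
          "ay", "aw", "oy"}


def split_syllable(sounds):
    """Split one syllable (a list of sounds) into its parts.

    Simplifying assumption: the syllable contains exactly one vowel,
    which serves as the nucleus.
    """
    vowel_positions = [i for i, s in enumerate(sounds) if s in VOWELS]
    if len(vowel_positions) != 1:
        raise ValueError("expected exactly one vowel nucleus")
    n = vowel_positions[0]
    return {
        "onset": sounds[:n],        # consonants before the vowel
        "nucleus": sounds[n],       # the vowel itself
        "coda": sounds[n + 1:],     # consonants after the vowel
        "rhyme": sounds[n:],        # nucleus + coda
    }


blend = split_syllable(["b", "l", "ε", "n", "d"])
print(blend["onset"])   # ['b', 'l']
print(blend["rhyme"])   # ['ε', 'n', 'd']
```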

Sonority

Human speech involves repetitive cycles of opening and closing the vocal tract = syllables.

Relatively closed position = onset, then relatively open nucleus, then closing for coda or the next syllable's onset. The degree of vocal tract openness correlates with loudness.

Speech sounds differ on a scale of sonority: vowels = most sonorous end, obstruents (stops, affricates, fricatives) = least sonorous end. In between are the liquids [l] and [r], and nasal consonants like [m] and [n].

The least sonorous sounds are restricted to the margins of the syllable -- the onset in the simplest case -- and the most sonorous sounds occur in the center of the syllable -- most often a vowel.

E.g., "soon" and "blend" (the height of each column shows the sonority of the sound):

    *
    *
    *  *
 *  *  *
[s][u][n]

       *
    *  *
    *  *  *
 *  *  *  *  *
[b][l][ε][n][d]

"pretending"- each syllable corresponds to a peak in sonority. * * * * * * * * * * * * * * * * * * * * * * [p][r][ə][t][ε][n][d][ I][ ŋ]

"film"= one syllable BUT "fiml","pummel" = two syllables "fizm","chasm"=two * * * * * * * * * * * * * * * * * * * * * * * * * * * * [f][ I][l][m] [f][ I] [m] [l] [f] [I] [z][m] [p][ ][m][l] [k][æ][z][m]

• This breakdown does not need to be memorized for each word: syllabification is a general property of the language.
• In these last two words, the consonant serves as the sonority peak - it is syllabic (a nucleus). English permits nasals and liquids to be syllabic, at least in unstressed syllables:
  prism, bottom, sump'm (for "something"), cap'm (for "captain")
  hidden, button, kitten, risen
  bottle, little, towel
  swimmer, higher, butter

• For [r], the consonant can function as a vowel even in a stressed syllable: bird, fur, word
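The idea that each syllable corresponds to a sonority peak can be sketched as a small peak counter. The numeric levels (obstruent 1, nasal 2, liquid 3, vowel 4) are an assumption read off the scale described above, and the function name is made up for this example. Glides are left out, and two adjacent sounds of equal sonority are never counted as separate peaks, so this is a simplification rather than a full syllabifier.

```python
# Sonority levels assumed for this sketch, following the scale in the text.
OBSTRUENTS = ["p", "t", "k", "b", "d", "g", "č", "ǰ", "f", "θ", "s", "š",
              "h", "v", "ð", "z", "ž"]
NASALS = ["m", "n", "ŋ"]
LIQUIDS = ["l", "r"]
VOWELS = ["i", "u", "I", "U", "e", "o", "ε", "ə", "ʌ", "ɔ", "æ", "a",
          "ay", "aw", "oy"]

SONORITY = {}
for level, group in enumerate([OBSTRUENTS, NASALS, LIQUIDS, VOWELS], start=1):
    for sound in group:
        SONORITY[sound] = level


def count_sonority_peaks(sounds):
    """Count sounds that are more sonorous than both of their neighbors.

    Word edges are treated as sonority 0, so a final [m] or [l] after a
    less sonorous sound counts as a peak (a syllabic consonant).
    """
    levels = [0] + [SONORITY[s] for s in sounds] + [0]
    return sum(
        1
        for i in range(1, len(levels) - 1)
        if levels[i - 1] < levels[i] > levels[i + 1]
    )


print(count_sonority_peaks(["f", "I", "l", "m"]))  # 1 ("film")
print(count_sonority_peaks(["f", "I", "m", "l"]))  # 2 ("fiml")
print(count_sonority_peaks(["f", "I", "z", "m"]))  # 2 ("fizm")
print(count_sonority_peaks(
    ["p", "r", "ə", "t", "ε", "n", "d", "I", "ŋ"]))  # 3 ("pretending")
```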

In some dialects, such as Standard British, Boston, and Coastal Southern US, any [r] in the rhyme of a syllable (whether nucleus or coda) loses its r-ness and becomes a schwa-like vowel. These are called "r-less" dialects.

• Another general property of English: restrictions on what consonants can be an onset cluster - sonority has to increase by two steps.

  o actual words with obstruent + liquid (two steps): brick, true, free, crab; play, blue, flea, glib
  o possible words with obstruent + liquid: blick, clee
  o impossible words with obstruent + nasal (just one step): *bnick, *fnee, *gmue, *dmay
  o historical loss of initial obstruent in cluster (letter now silent): knee, knight, gnat, gnaw

• This too is part of our general knowledge of the language: we can distinguish blick and *bnick as "possible" and "impossible" even if we've never heard either word before.

But what about words like snow (obstruent + nasal onset cluster)?

• Take (almost) any English onset, and tack an [s] on the front of it, ignoring sonority.

snow (cf. no), stop (cf. top), spray (cf. pray)

• This is a special property of [s], shared by no other obstruent in English: as a loud fricative noise, it doesn't depend on the normal syllable structure. In German (and Yiddish), for example, it's the (alveo)palatal fricative that has this property, as in Schmutz "dirt."
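The onset restrictions discussed in this section (a rise of two sonority steps inside a cluster, plus the special status of [s]) can be sketched as a checker. The numeric levels and the function names are assumptions made for this example. The sketch leaves out glides and over-generates: it accepts some clusters that English does not actually use, which is why the text says "(almost) any" onset.

```python
# Sonority levels assumed for this sketch: obstruent 1, nasal 2, liquid 3.
OBSTRUENTS = {"p", "t", "k", "b", "d", "g", "č", "ǰ", "f", "θ", "s", "š",
              "h", "v", "ð", "z", "ž"}
NASALS = {"m", "n", "ŋ"}
LIQUIDS = {"l", "r"}


def sonority(sound):
    """Return the assumed sonority level of a consonant."""
    if sound in OBSTRUENTS:
        return 1
    if sound in NASALS:
        return 2
    if sound in LIQUIDS:
        return 3
    raise ValueError(f"sound not covered by this sketch: {sound}")


def is_possible_onset(cluster):
    """Check an onset (list of consonants) against the rules in the text."""
    cluster = list(cluster)
    if not cluster:
        return True  # a syllable may have no onset at all
    # [s] can be tacked onto the front of an onset, ignoring sonority.
    if len(cluster) > 1 and cluster[0] == "s":
        cluster = cluster[1:]
    if len(cluster) == 1:
        return cluster[0] != "ŋ"  # [ŋ] cannot begin an English syllable
    if len(cluster) == 2:
        # Sonority has to rise by two steps (obstruent + liquid).
        return sonority(cluster[1]) - sonority(cluster[0]) >= 2
    return False


print(is_possible_onset(["b", "l"]))       # True  (blick)
print(is_possible_onset(["b", "n"]))       # False (*bnick)
print(is_possible_onset(["s", "n"]))       # True  (snow)
print(is_possible_onset(["s", "p", "r"]))  # True  (spray)
```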

Once again, syllable structure is a way in which languages differ.

Hawaiian: no coda consonants, maximum of one consonant in the onset. So: borrowed words get a lot of extra vowels, to create new syllables of the proper type.

ink > ʻīnika
Norman > Nolemana

Polish: allows more consonants at the beginning or end of a word than English does. This is why some Polish names are hard for English speakers to say, such as Gdansk or Zbigniew Brzezinski.

bzdura "nonsense"
babsk "witch"
grzbiet [gzhbyet] "back"
marnotrawstw [-fstf] "of wastes"

A language learner comes to understand what structures are possible in that language by observing the attested patterns.

Allophones

There are often differences in the way a phoneme is pronounced in a specific context. The variant pronunciations are called allophones ("other sounds").

When it's important to make this difference:

• we'll use [square brackets] to indicate sounds from a phonetic point of view, i.e. focusing on their physical properties and the details of actual pronunciation;
• and we'll use /slashes/ to indicate sounds from a phonological point of view, i.e. as part of an abstract representation independent of potential differences in the way the sound is pronounced in specific contexts.
• I.e., [ ] = allophone, / / = phoneme.

A classic example of sound alternation in English relates to the [s] found at the beginning of a syllable before a voiceless stop.

spin is basically pin with [s] added, but the /p/ in each case is pronounced differently.

• pin contains an aspirated version of /p/, with a puff of air after the stop is released; this is written [pʰ]
• spin contains a plain /p/, without a puff of air after the stop; this is written just [p]

The same is true for pairs like pit~spit, pot~spot, pair~spare, etc.

A simple statement of this alternation is as follows:

the phoneme /p/ becomes:
    allophone [p]     immediately following [s]
    allophone [pʰ]    at the beginning of the word

But the same generalization holds not just for /p/ but for the other voiceless stops, /t/ and /k/. Compare these word pairs:

• top~stop, take~stake, tie~sty, etc.
• kin~skin, cope~scope, can~scan, etc.

So more accurately, there's a single general statement that covers all these cases, stated in terms of natural classes.

voiceless stops are:
    unaspirated    immediately following [s]
    aspirated      at the beginning of the word

How do we know that aspirated and unaspirated voiceless stops are not different phonemes? The same way Lois Lane could have known that Clark Kent was the same guy as Superman: they never appear in the same place at the same time. This is called complementary distribution.
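One way to picture complementary distribution is to collect the contexts in which each sound appears and check that the two sets never overlap. The sketch below does this for the preceding sound only, on a tiny hand-made word list. The function names, the "#" word-boundary marker and the transcriptions are all assumptions made for this example, and a real analysis would need far more data.

```python
def left_contexts(words, phone):
    """Collect what immediately precedes `phone` ("#" = start of word)."""
    contexts = set()
    for word in words:
        for i, sound in enumerate(word):
            if sound == phone:
                contexts.add(word[i - 1] if i > 0 else "#")
    return contexts


def in_complementary_distribution(words, phone_a, phone_b):
    """True if the two phones never share a preceding context."""
    return left_contexts(words, phone_a).isdisjoint(
        left_contexts(words, phone_b))


# Narrow (phonetic) transcriptions, assumed for illustration.
words = [
    ["pʰ", "I", "n"], ["s", "p", "I", "n"],   # pin, spin
    ["pʰ", "I", "t"], ["s", "p", "I", "t"],   # pit, spit
    ["pʰ", "a", "t"], ["s", "p", "a", "t"],   # pot, spot
    ["b", "I", "l"],                          # bill
]

print(in_complementary_distribution(words, "pʰ", "p"))  # True
print(in_complementary_distribution(words, "pʰ", "b"))  # False
```

Here [pʰ] and [p] never share a context, as expected for allophones of one phoneme, while [pʰ] and [b] both occur word-initially, as expected for sounds that can contrast.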

Expanding to words of more than one syllable: voiceless stops are aspirated when they occur syllable-initially and are followed by a stressed vowel (compare rápid with ra[pʰ]ídity), and word-initially regardless of stress ([pʰ]o[tʰ]áto). At the beginning of a word, a preceding /s/ prevents the stop from being syllable- or word-initial.

Differences in stress cause alternations: an underlying sound such as /t/ is pronounced as unaspirated [t] or as aspirated [tʰ], and likewise for /p/ and /k/.

rápid [p]        rapídity [pʰ]
authéntic [t]    authentícity [tʰ]
récord [k]       recórd [kʰ]
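The aspiration rule can be sketched as a function over syllabified words. The input format (a list of syllables, each a pair of a stress flag and a list of sounds), the function name and the syllable divisions are assumptions made for this example. As a simplification, "followed by a stressed vowel" is treated here as "begins a stressed syllable".

```python
VOICELESS_STOPS = {"p", "t", "k"}


def aspirate(syllables):
    """Apply the aspiration rule to a syllabified word.

    `syllables` is a list of (stressed, sounds) pairs. A voiceless stop
    is aspirated when it is the first sound of a stressed syllable, or
    the first sound of the word regardless of stress.
    """
    output = []
    for index, (stressed, sounds) in enumerate(syllables):
        new_sounds = list(sounds)
        if new_sounds and new_sounds[0] in VOICELESS_STOPS:
            if stressed or index == 0:
                new_sounds[0] = new_sounds[0] + "ʰ"
        output.append(new_sounds)
    return output


pin = [(True, ["p", "I", "n"])]
spin = [(True, ["s", "p", "I", "n"])]
rapid = [(True, ["r", "æ"]), (False, ["p", "I", "d"])]
potato = [(False, ["p", "ə"]), (True, ["t", "e"]), (False, ["t", "o"])]

print(aspirate(pin))     # [['pʰ', 'I', 'n']]
print(aspirate(spin))    # [['s', 'p', 'I', 'n']]
print(aspirate(rapid))   # [['r', 'æ'], ['p', 'I', 'd']]
print(aspirate(potato))  # [['pʰ', 'ə'], ['tʰ', 'e'], ['t', 'o']]
```

In spin the /p/ is not syllable-initial because of the preceding /s/, so the rule leaves it alone without any special case.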

This process is completely unconscious for most speakers, and often quite hard to unlearn.

English speakers who learn a language like French or Spanish, in which all voiceless stops are unaspirated, typically impose aspiration according to their native rule; but that's wrong for these languages, and sounds foreign.

Similarly, a French or Spanish speaker learning English will typically fail to produce aspiration in the right places; this is part of what it means to have a foreign accent.

Aspiration in English is a small example of what phonological knowledge consists of:

• it's learned unconsciously by children imitating (quite accurately!) the details of the language around them
• it's systematic, applying to all words with voiceless stops, not just some random selection
• it's defined in terms of a natural class (here "voiceless stops") rather than some arbitrary set of three consonants

The study of phonology is largely the investigation of alternations like this -- what changes occur, what sounds undergo them, and in what contexts.

Flapping

A prominent feature of American English affects /t/ and /d/, and is called flapping.

A flap is a quick motion with the tongue against the alveolar ridge.

All these English words have flaps where "t" or "d" is written in the spelling (in the relevant dialects).

butter caddy pretty buddy little water

The proper phonetic symbol for a flap is [ɾ] - it's an "r" missing the top left serif.

For most speakers, in the right context a phonological /t/ will end up sounding phonetically just like a phonological /d/, since both become a flap [ɾ] (voiced), as these homophones show:

latter     ladder
matter     madder
mettle     meddle
betting    bedding
outty (belly button)    Audi (car)

And the answer in this exchange is therefore ambiguous:

-- Do you want the ladder or the chair?
-- Give me the [læɾr].

Of course, /t/ and /d/ don't always end up as flaps, as these minimal pairs illustrate.

hit        hid
tin        din
tear       dare
melting    melding
attain     A Dane

What context causes flapping to occur? There are two conditions:

1. The /t/ or /d/ has to be between vowels (this includes a syllabic [r] or [l])
   o so not in hit, melting
2. The following vowel has to be unstressed.
   o so not in tin, attain

If you compare the list of homophones (with flapping) vs. minimal pairs (without flapping), you'll see that only the homophones satisfy both these conditions.
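The two conditions on flapping can be sketched as follows. The `Segment` type, the helper names `C` and `V`, and the transcriptions are assumptions made for this example. Each segment records whether it is a consonant (`stressed` is None) or a vowel-like nucleus (`stressed` is True or False), and a syllabic [r] or [l] is entered as an unstressed nucleus.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    symbol: str
    # None = consonant; True/False = stressed/unstressed vowel
    # (or syllabic [r]/[l], which counts as a vowel here).
    stressed: Optional[bool] = None


def C(symbol):
    """Build a consonant segment."""
    return Segment(symbol)


def V(symbol, stressed):
    """Build a vowel (or syllabic consonant) segment."""
    return Segment(symbol, stressed)


def flap(segments):
    """Replace /t/ and /d/ with [ɾ] between vowels, before an unstressed vowel."""
    output = [seg.symbol for seg in segments]
    for i in range(1, len(segments) - 1):
        seg, prev, nxt = segments[i], segments[i - 1], segments[i + 1]
        is_t_or_d = seg.stressed is None and seg.symbol in {"t", "d"}
        between_vowels = prev.stressed is not None and nxt.stressed is not None
        if is_t_or_d and between_vowels and nxt.stressed is False:
            output[i] = "ɾ"
    return output


latter = [C("l"), V("æ", True), C("t"), V("r", False)]
ladder = [C("l"), V("æ", True), C("d"), V("r", False)]
attain = [V("ə", False), C("t"), V("e", True), C("n")]
melting = [C("m"), V("ε", True), C("l"), C("t"), V("I", False), C("ŋ")]

print(flap(latter))   # ['l', 'æ', 'ɾ', 'r']
print(flap(ladder))   # ['l', 'æ', 'ɾ', 'r']
print(flap(attain))   # ['ə', 't', 'e', 'n']
print(flap(melting))  # ['m', 'ε', 'l', 't', 'I', 'ŋ']
```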

The same basic word (or word root, or morpheme) can change its pronunciation. This includes cases where a following vowel is added:

sit [t]     sitting, sitter [ɾ]
spot [t]    spotty [ɾ]
mad [d]     madder, maddest [ɾ]
bird [d]    birdy [ɾ]

As well as moving the stress (primary ´ or secondary `):

atómic [tʰ]         átom [ɾ]
còmputátion [tʰ]    compúter [ɾ]
prágmatìsm [tʰ]     pragmátic [ɾ]
addíctive [d]       áddict [ɾ]
edítion [d]         édit [ɾ]

Borrowed and new words are subject to these patterns too:

tofu [tʰ] (Japanese [t])
tortilla [tʰ], [tʰ] (Spanish [t], [t])
coyote [kʰ], [ɾ] (Spanish [k], [t])
condor [kʰ] (Spanish/Quechua [k])
panache [pʰ] (French [p])

So aspiration and flapping are not learned in individual words (such as flapped latter), but are part of what we know about the language.

• Raising of the diphthong /ay/: when followed by a voiceless consonant, it is "raised" so that the first part is more like the first vowel of mother than that of father.

regular diphthong at the end of a word, or before a voiced consonant
    Ø, b, d, v, z, ǰ, ð, m, n, l, r
    tie, jibe, hide, live, rise, oblige, tithe, time, line, tile, tire
raised diphthong before a voiceless consonant
    p, t, k, f, s
    hype, white, bike, life, ice

• This distinction between [ay] and raised [əy] is maintained even when the voicing distinction is eliminated by flapping. Thus if a speaker has raising in write, that pronunciation is maintained in writer, while rider will have [ay] just like ride.

So: while flapping eliminates the distinction in consonants, the words still do not rhyme.

Most speakers think the difference lies in the consonants. In the pronunciation, it’s in the vowels.

• The reason why speakers "hear" the difference in the consonants is that
  o on an abstract level in their minds, the words are represented as /rayter/ and /rayder/, with the difference localized in the consonant.
  o the raising of the /ay/ in the former, and the flapping of the consonants in both, are subsequent unconscious processes.

How does raising keep working properly to distinguish these two words when the conditioning factor on raising -- voicing on the following consonant -- has been obliterated?

Is the /ay/ in writer raised on analogy to write, where the conditioning factor is still intact? No.

Speakers apply phonological rules to these representations in some order, so that the output of one rule can be the input to another:

• writer

/rayter/ --> raising --> /rəyter/ --> flapping --> /rəyɾer/

• rider

/rayder/ --> raising (does not apply) --> /rayder/ --> flapping --> /rayɾer/
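The effect of rule ordering can be sketched by writing each rule as a function and composing them. The function names and the reduced symbol sets are assumptions made for this example. Flapping is simplified to "between vowels", leaving out the stress condition, which does not matter for these two words.

```python
VOICELESS = {"p", "t", "k", "f", "s"}   # the voiceless consonants listed above
VOWELS = {"ay", "əy", "e"}              # only the vowels needed here


def raise_ay(sounds):
    """Raising: /ay/ becomes [əy] before a voiceless consonant."""
    out = list(sounds)
    for i in range(len(out) - 1):
        if out[i] == "ay" and out[i + 1] in VOICELESS:
            out[i] = "əy"
    return out


def flap(sounds):
    """Flapping (simplified): /t/ or /d/ becomes [ɾ] between vowels."""
    out = list(sounds)
    for i in range(1, len(out) - 1):
        if out[i] in {"t", "d"} and out[i - 1] in VOWELS and out[i + 1] in VOWELS:
            out[i] = "ɾ"
    return out


writer = ["r", "ay", "t", "e", "r"]
rider = ["r", "ay", "d", "e", "r"]

# Raising first, then flapping: the vowels still differ.
print("".join(flap(raise_ay(writer))))  # rəyɾer
print("".join(flap(raise_ay(rider))))   # rayɾer

# Flapping first destroys the context for raising: the words merge.
print("".join(raise_ay(flap(writer))))  # rayɾer
```

With the opposite order, writer and rider would come out identical, which is not what speakers with raising produce. That is the sense in which the output of raising has to be the input to flapping.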

What does phonology do for us?

Phonology of human language is an ingenious solution to a serious problem.

Apparent design features of human spoken language

1. Large vocabulary: 10,000-100,000 items
2. Open vocabulary: new items are added easily
3. Variation in space and time: different languages and "local accents"
4. Messages are typically structured sequences of vocabulary items

Compare the "referential" part of the vocal signaling system of other primates:

1. Small vocabulary: ~10 items
2. Closed vocabulary: new "names" or similar items are not added
3. System is fixed across space and time: widely separated populations use the same signals
4. Messages are usually single items, perhaps with repetition

Some general characteristics of other primate vocalizations that are shared by human speech:

1. Vocalizations communicate individual identity
2. Vocalizations communicate attitude and emotional state

Some potential advantages of the human innovations:

1. Easy naming of new people, groups, places, etc.
2. Signs for arbitrarily large inventory of abstract concepts
3. Language learning is a large investment in social identity

How can it work?

Children learn an average of more than 10 items per day, day in and day out, just from hearing the words used in meaningful contexts. Young children can learn a word (and retain it for at least a year) from hearing just one casual use.

Focus on how they learn word sounds: from very few examples, and in spite of individual, social and geographical, attitudinal and emotional variation. Any one performance simultaneously expresses the sound of the word, the identity of the speaker, the speaker's attitude and emotional state, the influence of the performance of adjacent words, and the structure of the message containing the word.

Perceptual error rate for spoken word identification is less than one percent, where words are chosen at random and spoken by arbitrary and previously-unknown speakers. In more normal and natural contexts, performance is much better.

Let's call this the pronunciation learning problem. If every word were an arbitrary pattern of sound, this problem would probably be impossible to solve.

So what makes it work?

The Phonological Principle

In human spoken languages, the sound of a word is not defined directly (in terms of mouth gestures or acoustic wave patterns). Instead, it is mediated by encoding in terms of a phonological system:

1. A word's pronunciation is defined as a structured combination of a small set of elements
   1. The available phonological elements and structures are the same for all words (though each word uses only some of them)
2. The phonological system is defined in terms of patterns of mouth gestures and noises
   1. This "grounding" of the system is called phonetic interpretation
   2. Phonetic interpretation is the same for all words

How does the phonological principle help solve the pronunciation learning problem? Basically, by splitting it into two problems, each one easier to solve.

1. Phonological representations are digital, i.e. made up of discrete elements in discrete structural relations.
   1. Copying can be exact: members of a speech community can share identical phonological representations
   2. Within the performance of a given word on a particular occasion, the (small) amount of information relevant to the identity of the word is clearly defined.
2. Phonetic interpretation is general, i.e. independent of word identity
   1. Every performance of every word by every member of the speech community helps teach phonetic interpretation, because it applies to the phonological system as a whole, rather than to any particular word.
   2. Speakers of different dialects will have somewhat different phonetic interpretation of the phonological units, but because they are all dealing with essentially the same units, once you learn the basics of a different dialect's phonetic interpretation, you can learn new words from speakers of that dialect, and interpret them phonetically in your own.
