
Pattern Substitution in Wuxi Tone Sandhi and Its Implication for Phonological Learning


John Benjamins Publishing Company

This is a contribution from International Journal of Chinese Linguistics 3:1 (2016), 1–44. © 2016 John Benjamins Publishing Company. doi 10.1075/ijchl.3.1.01yan. issn 2213–8706 / e-issn 2213–8714.

Hanbo Yan and Jie Zhang
University of Kansas

Tone sandhi in Wuxi Chinese involves “pattern substitution,” whereby the base tone on the first syllable is first substituted by another tone, then spread to the sandhi domain. We conducted a wug test to investigate native Wuxi speakers’ tacit knowledge of tone sandhi and found that the substitution aspect of the sandhi is not fully productive, but the extension aspect is, and sandhi productivity is influenced by the phonetic similarity between base and sandhi tones. These results are discussed in the context of how phonological opacity, phonetic naturalness, and lexical frequency influence phonological learning, and a grammatical learning model that can predict Wuxi speakers’ experimental behavior is proposed.

Keywords: tone sandhi, productivity, Wuxi, opacity, Maximum Entropy grammar

1. Introduction

1.1 Tone sandhi in Chinese dialects

The phonetic pitch on a syllable distinguishes lexical meaning in tone languages like Chinese. Like other phonological features, tones may participate in phonological alternation triggered by the tonal or prosodic/morphosyntactic context in which they appear. This type of alternation in Chinese dialects is typically referred to as tone sandhi (Chen, 2000; Zhang, 2014a, b).

Typologically, tone sandhi patterns in Chinese dialects fall under two different varieties: last-syllable dominant (right-dominant) and first-syllable dominant (left-dominant) (Yue-Hashimoto, 1987; Zhang, 2007). In right-dominant sandhi, the final syllable in the sandhi domain keeps the citation tone, while the preceding syllables undergo sandhi. Most Min, Southern Wu, and Mandarin dialects show this type of tone sandhi, such as Taiwanese (Cheng, 1968), Wenzhou (Zheng-Zhang, 1964, 1980), and Mandarin. Both Yue-Hashimoto (1987) and Zhang (2007) argued that right-dominant sandhi tends to involve local, paradigmatic tone change, as shown in the examples in (1). The Mandarin tone sandhi in (1a) shows that a nonfinal dipping tone 213 (Tone 3) alternates to a rising tone 35 (Tone 2) before another dipping tone; the Taiwanese tone sandhi in (1b) shows that a tone undergoes a regular change whenever it appears in non-phrase-final positions regardless of the tone in final position, and four of the five tones are involved in a circular chain shift.

In left-dominant sandhi, the tone on the first syllable in the sandhi domain is maintained while the following syllables undergo sandhi. It is generally found in Northern Wu dialects such as Shanghai (Zee & Maddieson, 1979) and Changzhou (Wang, 1988). The Shanghai tone sandhi in (2) involves spreading the tone of the first syllable to a disyllabic sandhi domain. For example, when tone 24 is combined with any other tone, the tones of the disyllable become 22 + 44, a result of spreading the initial base tone 24. According to Yue-Hashimoto (1987) and Zhang (2007), the rightward spreading pattern, also known as “pattern extension” (Chan & Ren 1989), is the typologically most common pattern in left-dominant tone sandhi.

(1) Right-dominant sandhi:
    a. Mandarin third tone sandhi: 213 → 35 / ___213
    b. Taiwanese tone sandhi in non-phrase-final positions:
       51 → 55 → 33 ← 24
         ↖      ↙
            21

(2) Left-dominant sandhi: pattern extension in Shanghai:
    24 + X → 22 + 44
    (“X” refers to any tone in the Shanghai tonal inventory.)

These complex tone sandhi patterns of Chinese dialects have presented a considerable challenge to theoretical phonology, for a number of reasons. First, the sandhi patterns in Chinese dialects can be extremely complex, and any tone in the inventory may alternate, as we will see in the Wuxi example later on.
Second, the articulatory and perceptual motivation of tone sandhi may have been lost during diachronic change and cannot be found in the current synchronic systems. For example, the shang → yang ping / __ shang sandhi, realized in Mandarin as 213 → 35 / __ 213, has cognates in many Mandarin dialects, indicating a common historical origin for the sandhi. But the phonetic realizations of the shang and yang ping tones in these dialects might be quite different (Court, 1985); for instance, in Tianjin, the sandhi is 13 → 45 / __ 13 (Yang, Guo, & Shi, 1999), and in Jinan, it is 55 → 42 / __ 55 (Qian & Zhu, 1998). This indicates that the synchronic pattern in Mandarin does not necessarily reflect any articulatory or perceptual motivations that may have existed historically.1 Third, some tone sandhi patterns are phonologically opaque (Kiparsky, 1973),2 such as the Taiwanese pattern in (1b). This poses problems for surface-oriented phonological theories. Moreton (2004), for example, showed that a circular chain shift, as found in Taiwanese tone sandhi, is incomputable by standard Optimality Theory that only assumes IO-faithfulness and markedness constraints. For these reasons, many researchers found it difficult to account for complex tone sandhi patterns in synchronic phonology (e.g., Chen, 2000; Lin, 2008; Zhang, 1999; Wang, 2002; Yip, 1999, 2004), particularly using constraint-based Optimality Theory (Prince & Smolensky, 1993, 2004).

However, we can address this issue from another perspective, i.e., whether native speakers’ tacit knowledge of the tone sandhi patterns is accurately reflected in the synchronic sandhi patterns; in other words, whether the observed sandhi patterns are truly productive, as evidenced by nonce probe tests. If the sandhi patterns are truly productive, then the theoretical issues mentioned above indeed need to be addressed head-on. If not, however, then it is likely that the sandhi patterns are due more to lexical listing than to input-output derivations.

Recent productivity studies have shown that the Chinese sandhi patterns are not entirely productive in novel words. Most of them have been conducted in right-dominant sandhi systems, such as Taiwanese (Hsieh, 1970; Wang, 1993; Zhang, Lai, & Sailor, 2009, 2011), Mandarin (Zhang & Lai, 2010), and Tianjin (Zhang & Liu, 2011). The only dialect with left-dominant sandhi that has been tested is Shanghai (Zhang & Meng, 2012). These studies investigated the different factors that could potentially influence the productivity of tone sandhi, such as phonological opacity, phonetic naturalness, and lexical tone frequency. In the next section, a brief review of the relevant productivity studies is provided.
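
The rule notations in (1) and (2) can also be read procedurally. The following is a minimal Python sketch under stated assumptions: the function names and the dictionary layout are ours, tones are written as strings of Chao digits, and only the mappings given in (1) and (2) are encoded. It is meant as a reading aid for the notation, not as an analysis of any of the three dialects.

```python
# Illustrative encoding of the rule notations in (1) and (2).
# All names below are hypothetical and chosen only for this sketch.

# (1b) Taiwanese: every non-phrase-final tone changes, whatever follows it.
TAIWANESE_NONFINAL = {"51": "55", "55": "33", "33": "21", "21": "51", "24": "33"}


def mandarin_third_tone(tones):
    """(1a): 213 -> 35 / ___ 213, applied to a disyllable.

    Other tonal combinations are passed through unchanged in this sketch.
    """
    first, second = tones
    if first == "213" and second == "213":
        return ("35", second)
    return (first, second)


def taiwanese_sandhi(tones):
    """(1b): right-dominant; only the final syllable keeps its citation tone."""
    *nonfinal, final = tones
    return tuple(TAIWANESE_NONFINAL[t] for t in nonfinal) + (final,)


def shanghai_extension(tones):
    """(2): left-dominant pattern extension, 24 + X -> 22 + 44."""
    first, _second = tones  # the tone of the second syllable plays no role
    if first == "24":
        return ("22", "44")
    raise NotImplementedError("only the 24 + X pattern from (2) is encoded")
```

The contrast between the two types shows up in which syllable is ignored: `taiwanese_sandhi` never inspects the final tone when changing the nonfinal ones, while `shanghai_extension` never inspects the second tone at all.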

1.2 Experimental studies addressing the productivity of tone sandhi

Hsieh (1970) investigated native speakers’ phonological knowledge of Taiwanese tone sandhi using a wug test. The results showed that speakers had no difficulty pronouncing actual noun compounds, but they had difficulty with the circular chain shift in novel compounds. During the wug test, if they could identify the monosyllabic morphemes in the novel compounds, they applied the expected sandhi; if not, they repeated the syllables without sandhi.

1. We do not mean to imply that the historical origin of tone sandhi is necessarily articulatorily or perceptually based. We are simply stating that even if a sandhi did have such a motivation at an earlier stage, it may not appear so in the current synchronic systems.

2. A phonological rule P, A → B / C__D, is opaque if the surface structures are any of the following: (a) instances of A in the C__D environment, or (b) instances of B derived by P in environments other than C__D (Kiparsky, 1973).

Wang (1993) used a similar method to investigate Taiwanese tone sandhi, but also included a longitudinal component. He discovered an overall higher sandhi productivity and also observed that the subjects produced more sandhi patterns in the later period of the experiment, indicating a practice and learning effect. Wang also pointed out that there was large variation in sandhi productivity across different base tones. He assumed that both the citation tone and the sandhi tone existed in the speakers’ lexicon. The words and phrases are connected by phonemes, lexical entries, and semantics in substructures, which form an analogical chain. According to him, language is not rule-governed, but a connection of analogical chains, and speakers use this knowledge in production.

Zhang et al. (2009, 2011) investigated the productivity of the sandhi pattern in reduplication in Taiwanese. They discovered that when the syllables did not exist, speakers produced significantly less sandhi. Moreover, both the duration of the sandhi tones and the lexical frequency of the base tones influenced the productivity among the opaque mappings. For example, the two falling tones 51 and 21 have considerably shorter durations than 55, 33, and 24 according to acoustic studies (Lin, 1988; Peng, 1997). Given that nonfinal syllables have intrinsically shorter duration than final syllables due to the lack of final lengthening, the low productivity of 51 → 55 may have been caused by the duration-increasing nature of the sandhi; the lexical frequencies of the tones involved cannot explain this productivity pattern, as 51, 55, and the tonal melody 55–51 all occur relatively frequently. On the other hand, the low productivity of 33 → 21 could not have resulted from duration, as the sandhi is duration-reducing, but may have been caused by the low lexical frequency of the base tone 33 and the reduplicative melody 21–33. Finally, the lack of rising tone on nonfinal syllables is generally productive across real and novel words. They suggested that opacity, phonetic basis, and lexical frequency all have an effect on the productivity of tone sandhi.

Zhang & Lai (2010) investigated how native Mandarin speakers applied two types of tone sandhi to novel words: one is the third-tone sandhi T3 (213) → T2 (35) / _ T3 (213), and the other is the half-third sandhi T3 (213) → 21 / _ T (T ≠ 213). They found that speakers applied the half-third sandhi more accurately than the third-tone sandhi in novel words. They argued that the results were due to the fact that the half-third sandhi is a contour reduction process directly related to the shortened duration in nonfinal positions and thus has a clearer phonetic motivation than the third-tone sandhi, which involves a pitch change not easily explainable by phonetics and is also perceptually neutralizing. Moreover, lexical frequency is also related to the sandhi productivity. The half-third tone (21) in T3 + T2 has the lowest type frequency and also the lowest accuracy in novel words among all half-third sandhi environments. In other words, phonetic properties and lexical frequency both influence Mandarin tone sandhi.

Zhang & Liu (2012) examined the productivity of the tone sandhis in Tianjin Chinese, also using a wug test. Similar to the results of Zhang and Lai (2010), their results showed that the phonetic property and lexical frequency of the tone sandhi both have an influence on sandhi productivity: without strong phonetic bases, some sandhis that are fully productive in the lexicon have lower productivity in novel words than in real words, e.g., L + L → LH + L; for sandhis that share a similar phonetic motivation, e.g., LH + H → L + H and LH + HL → L + HL, the more frequent one (the latter) is more productive.

Zhang & Meng (2012) investigated the productivity of left-dominant sandhis of Shanghai in both real and novel words. For example, when tone 51 is combined with any tone in the Shanghai tone inventory, 51 + X, the surface tones become 55 + 31, a spreading pattern of tone 51. They found that overall, left-dominant sandhi is relatively productive, although the sandhi in certain tonal combinations is not productive, such as 12 + 51 → 11 + 13.3 They pointed out that for this tonal combination, the mismatch between phonological stress (left) and surface phonetic prominence (right) and the contour dissimilarity between the base tones and sandhi tones may have contributed to its unproductivity.
Overall, earlier research on tone sandhi productivity has shown that our understanding of tone sandhi can considerably benefit from experimental studies that directly shed light on the speakers’ tacit knowledge of the sandhi patterns, as it can systematically differ from the lexical manifestation of the sandhi due to phonetic (e.g., the duration of the sandhi position, the phonetic similarity between base and sandhi tones) and phonological (e.g., opacity) properties as well as usage frequencies of the sandhis. The results of these productivity studies will then provide a firmer foundation from which formal analyses of tone sandhi can proceed.

1.3 The tone sandhi pattern in Wuxi

Previous productivity results of tone sandhi showed that in right-dominant sandhi systems, the sandhis with stronger phonetic motivations result in high productivity, e.g., in Mandarin and Tianjin, and sandhis that involve opacity, e.g., in Taiwanese, are not very productive. On the other hand, in the left-dominant sandhi system of Shanghai, pattern extension is relatively productive in novel words. An interesting left-dominant sandhi pattern that has not been previously investigated is the Wuxi tone sandhi pattern, which combines rightward spreading with paradigmatic substitution of tones, as the sandhi replaces the tone of the first syllable with another tone before spreading, a pattern that Chan & Ren (1989) termed “pattern substitution.” We introduce the Wuxi sandhi pattern in this section and discuss the importance of the pattern to our understanding of sandhi productivity and native speakers’ tacit linguistic knowledge of the sandhi pattern.

3. In line with the tradition of Chinese dialectology, an underlined tone number indicates that the tone occurs on checked syllables, which are syllables closed with a glottal stop.

Wuxi is a Northern Wu dialect of Chinese. According to traditional practice, there are four categories of tones in Wuxi: Ping, Shang, Qu, and Ru. Two registers based on the voicing of the initial consonant separate each category into Yin (voiceless) and Yang (voiced), resulting in eight tones. Traditionally, Yin-register tones are referred to by odd numbers (T1, T3, T5 and T7), and Yang-register tones by even numbers (T2, T4, T6 and T8), as in (3). T1 to T6 occur on open or sonorant-closed syllables. T7 and T8 occur on checked syllables, which are syllables closed by a glottal stop. According to the acoustic study by Xu (2007), T1, T3, and T5 are different in monosyllabic citation forms, while T2, T4, and T6 have merged into the same contour tone in monosyllabic citation forms. Zhang, Van de Velde, & Kager (2011) confirmed the merger tendency of these three tones, but also suggested that T2 and T6 have begun to separate among young speakers. For T3, the two transcriptions 223 and 323 reflect a gender difference according to Xu (2007): male speakers’ production is 223, while female speakers’ production is 323. But even in female speakers’ production, the initial fall is very small. We will use 323 in the rest of our paper for consistency’s sake.
(3) Wuxi tones:
    Tone 1   53         Tone 2   113
    Tone 3   223/323    Tone 4   13
    Tone 5   34         Tone 6   113
    Tone 7   5          Tone 8   13

Chan & Ren (1989) provided the first instrumental study of tones and tone sandhi in Wuxi Chinese, which has both pattern extension and pattern substitution. Pattern extension applies to reduplicated verbs, reduplicated nouns in baby talk, verbs with resultative or directional complement, and expressions of ‘number + classifier’. Pattern substitution applies to regular compounds, phrases, and reduplicated nouns, as seen in (4). For example, when T1 (53) is combined with one of the six tones from T1 to T6, the disyllable undergoes the sandhi 53 + X → 43 + 34, such as 新鲜 (sin53 siɪ53) → sin43-siɪ34. When T1 is combined with one of the checked tones T7 and T8, the disyllable undergoes the sandhi 53 + X → 43 + 34.4

(4) Wuxi disyllabic tone sandhi substitution patterns (rows: tone of σ1; columns: tone of σ2):

σ1         σ2 = T1 [53], T2 [113], T3 [323],   σ2 = T7 [5],              Examples
           T4 [13], T5 [34], T6 [113]          T8 [13]
T1 [53]    43+34                               43+34                     新鲜 (sin53 siɪ53) fresh
T2 [113]   35+31                               35+31                     年轻 (ȵɪ113 ʨin53) young
T3 [323]   33+44                               33+4                      胆小 (tɛ323 siɔ323) coward
T4 [13]    33+55 (+T1/T2/T3/T4/T5/T6)          33+5                      老师 (lɔ13 sɿ53), 33+55: teacher
           43+34 (+T3/T4/T5/T6)                                          冷水 (læ̃13 sʮ323), 43+34: cold water
T5 [34]    55+31                               55+31                     奋斗 (fən34 tɛi34) to fight for
T6 [113]   33+55                               33+5                      号码 (ɦɔ113 m̩13) number
T7 [5]     3+55                                5+5 3+5 (T7 and T8 rows)  发明 (faʔ5 min113) to invent
T8 [13]    3+55                                                          物理 (vəʔ13 li13) physics

4. Xu (2007) reported that for the T1+X sandhi, the male speaker in the study pronounced the sandhi tones as 33+34 rather than 43+34 as in the two female speakers’ production. But the pitch tracks from the male speaker still showed a slight falling tone on the first syllable (p. 46). Due to the presence of this fall as well as the fact that the pattern was only attested in one speaker, we use 43+34 in the rest of our paper for this sandhi pattern.

Our study focuses on the pattern substitution of disyllabic combinations between T1 (53), T3 (323) and T5 (34), as shown in (5). We focus our study on these tones for two reasons. First, this avoids the neutralization or near-neutralization issue of citation tones in the Yang tones and provides a better test for the speakers’ knowledge of tone sandhi when the citation tone is known. Second, as we can see in (5), these three Yin tones are involved in a circular chain shift in pattern substitution: before spreading to the disyllable, the falling tone 53 (A) on the first syllable needs to be substituted by a dipping contour (B); the dipping tone 323 (B) needs to be substituted by a rising contour (C), and the rising tone 34 (C) needs to be substituted by a falling contour (A). Therefore, these Yin tones provide us with a unique opportunity to investigate the productivity of a sandhi pattern that is a combination of both transparent spreading, which earlier research has shown to be relatively productive, and opaque substitution, which earlier research has shown to be relatively unproductive. This will provide a further test for whether the two types of tone sandhi should be represented differently in phonological grammar.

(5) Tonal combinations under investigation:

σ1              σ2 = T1 [53], T3 [323], T5 [34]   Examples
T1 [53] (A)     43+34 (B)                         新鲜 (sin53 siɪ53) fresh
T3 [323] (B)    33+44 (C)                         胆小 (tɛ323 siɔ323) coward
T5 [34] (C)     55+31 (A)                         奋斗 (fən34 tɛi34) to fight for

    A → B
     ↖ ↙
      C

We use the wug test to address two questions about Wuxi tone sandhi. First, is pattern substitution productive? In other words, does pattern substitution apply to both real and novel words? We hypothesized that the opaque mapping between the citation tones and the spread tones is not entirely productive, comparable to the finding in Taiwanese; but the spreading itself is relatively productive, comparable to the finding in Shanghai. Second, are there any productivity differences among different tonal combinations? We hypothesized that the phonetic similarity between the citation tone of the first syllable and the sandhi tone as well as the type and token frequencies of the three tones would both have an effect on the productivity.

When we consider the phonetic similarity between the citation tone shape of the first syllable and the sandhi tone shape on the whole disyllabic domain, the citation tone of T3 is phonetically more similar to the sandhi shape of T3+X than is the case for the other two sandhis. This is because this mapping is between a dipping tone with a small initial fall and a rising tone of comparable pitch height, while the other two mappings map a falling tone to a dipping tone and a rising tone to a falling tone. This phonetic similarity between T3 and T3+X may lead to a higher rate of substitution.5 Disyllables with T3 on the first syllable, therefore, may be more likely to undergo the substitution sandhi in novel words.

On the other hand, according to the Monosyllabic Morpheme List of Wuxi by Cao (2003) and Wang (2008), there are 886 monosyllabic morphemes for T1, 449 for T3, and 579 for T5. This indicates that T1 has the highest type frequency. A calculation of the character frequency of these morphemes in Jun Da (2004)’s written Mandarin character frequency corpus was also made,6 and T1, T3, and T5 have raw token frequencies of 42,952,704, 25,775,177, and 34,360,503, respectively. In other words, T1 has the highest type and token frequencies among the three tones and T3 has the lowest. If frequency has an effect on the productivity of sandhi patterns, disyllabic words with T1 on the first syllable would be expected to undergo more substitution sandhi than those with T3 on the first syllable. Hence, the question is: which factor has more influence on the productivity of pattern substitution in Wuxi, phonetic similarity or frequency?

In what follows, Section 2 discusses the methodology and results of a wug test experiment that investigates the factors that influence substitution productivity in Wuxi tone sandhi. Section 3 models the speakers’ learning using Maximum Entropy grammar so that the observed productivity in the wug test can be predicted. Section 4 offers a general discussion, and Section 5 is the conclusion.
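
The pattern substitution in (5) and the circular chain shift it gives rise to can be stated compactly in code. The sketch below is illustrative only: the dictionary and function names are our own assumptions, and the data are limited to the three Yin tones and the contour labels (A = falling, B = dipping, C = rising) given in (5).

```python
# Sketch of the pattern substitution in (5); all names are hypothetical.

# Disyllabic sandhi melody selected by the tone of the first syllable.
SUBSTITUTION = {"T1": ("43", "34"), "T3": ("33", "44"), "T5": ("55", "31")}

# Contour label of each citation tone and of the melody it is replaced by,
# following (5): A = falling, B = dipping, C = rising.
CITATION_SHAPE = {"T1": "A", "T3": "B", "T5": "C"}
SANDHI_SHAPE = {"T1": "B", "T3": "C", "T5": "A"}


def wuxi_substitution(first_tone, second_tone):
    """Expected sandhi tones for a disyllable of the tones tested here.

    The second syllable's own tone does not affect the outcome (left-dominant).
    """
    if second_tone not in SUBSTITUTION:
        raise ValueError("only T1, T3 and T5 are encoded in this sketch")
    return SUBSTITUTION[first_tone]


def shape_shift():
    """Mapping from citation contour to sandhi contour, e.g. A -> B."""
    return {CITATION_SHAPE[t]: SANDHI_SHAPE[t] for t in SUBSTITUTION}


def is_circular(mapping):
    """True if following the mapping visits every label once and returns."""
    start = next(iter(mapping))
    seen, current = [], start
    while current not in seen:
        seen.append(current)
        current = mapping[current]
    return current == start and len(seen) == len(mapping)
```

Calling `is_circular(shape_shift())` checks the A → B → C → A property that makes the substitution opaque, while `wuxi_substitution` makes explicit that the outcome depends only on the first syllable.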

2. The productivity test

2.1 Subjects

Twenty native speakers (8 female) of Wuxi Chinese participated in this study. They were all born in Wuxi, ranging from 21 to 35 years old, with an average age of 27. None of them lived outside of Wuxi for any substantial period of time before 18 years of age, and they were all living in Wuxi at the time of the experiment.

5. An anonymous reviewer questioned the dimensions along which phonetic similarity should be defined: tone register, contour, or both. Our position is that any phonetic dimensions that could influence speakers’ tone perception could be relevant for phonetic similarity. These dimensions include the average height, direction, slope, and end points of the pitch (see Gandour and Harshman 1978 and Gandour 1983, for example). We are agnostic about the feature primitives for tone and do not believe that similarity should be defined based purely on potential primitives such as register and contour.

6. The reasons that we used Jun Da (2004)’s corpus here and below (in §2.2) are that (a) there is no spoken or written corpus in Wuxi, and (b) Wuxi and Mandarin share the same writing system and the majority of the lexical words. The frequency of written Mandarin, therefore, could be used as an estimate for the frequency of Wuxi. The corpus has 12,041 characters from Classical and Modern Chinese and a total of 258,852,642 character tokens.

2.2 Materials

The experiment was a production task with stimuli designed to investigate the productivity of the sandhis by using nonce words. Real words were also tested to serve as the baseline. The experiment was implemented in Paradigm (Tagliaferri, 2011).

Four sets of disyllabic words were used in the experiment. The first set is real words (Real). For these words, we controlled the frequency of the disyllabic words based on Jun Da (2004)’s general fiction bigram7 frequency list, which includes 973,338 bigrams. The average disyllable frequency of the stimuli is 254.56.

7. A bigram refers to every sequence of two adjacent characters in a string of tokens, which may be a nonsense combination of characters in the corpus.

The other sets are all wug words. The second set (Pseudo1) is nonsense words formed by combining two real morphemes. Moreover, the first morpheme can occur in initial position in real disyllabic words. This potentially allows the speaker to access the substituted tone used in pattern substitution. For example, we used 煎弯 ([tsiɪ53 uɛ53], to fry + curve), 煎展 ([tsiɪ53 tsʊ323], to fry + exhibition), and 煎伞 ([tsiɪ53 sɛ34], to fry + umbrella) as one set of T1(53)+X stimuli, as 煎 ([tsiɪ53], to fry) is a morpheme that can occur in initial position in real words in Wuxi.

The third set (Pseudo2) is similar to the second set, but the first morpheme never appears as the first morpheme of real disyllabic words. The speakers, therefore, have no opportunity to observe the substituted tone. For instance, 齿军 ([tsʰɿ323 tɕyən53], the second character of “tooth” + army), 齿掌 ([tsʰɿ323 tsæ̃323], the second character of “tooth” + palm), and 齿顿 ([tsʰɿ323 tən34], the second character of “tooth” + pause) are one set of T3(323)+X stimuli, as the morpheme 齿 ([tsʰɿ323]) never appears in word-initial position in Wuxi. The frequencies of the first morphemes of these two sets could not be controlled, as morphemes that never appear in initial position have overall lower frequencies than morphemes that do occur in initial position.

The fourth set is entirely novel words (Novel), in which the first syllable is an accidental gap and the second syllable is an actually occurring morpheme. In the accidental gap syllables, both the segments and the tone of the syllable are legal, but their combination does not exist in Wuxi. For instance, [tsia] is a legal syllable and can take T3 (323) ‘elder sister’ and T5 (34) ‘to borrow’, yet it is never combined with T1 (53). Hence [tsia53] is a possible accidental gap. [tsia53 tɕin53] (tsia53 gold), [tsia53 tʰin323] (tsia53 yacht), and [tsia53 pʰiɔ34] (tsia53 ticket) are one set of stimuli in the Novel group. The accidental gaps were selected by the first author, a native speaker of Wuxi Chinese.

Only T1, T3, and T5 were tested here, so there were nine tonal combinations (3*3=9). Four words were used for each tonal combination, including two verb + noun and two modifier + noun combinations, which resulted in 36 stimuli (9*4=36) in each set. There were 144 stimuli in total (36*4=144). The entire stimuli list is given in the Appendix.
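
The stimulus counts above follow from a fully crossed design, which can be made explicit with a short enumeration. This is a sketch under stated assumptions: the field names and the `build_design` helper are hypothetical, and the actual items are those in the Appendix, not generated by code.

```python
from itertools import product

# Factors of the design as described in Section 2.2.
WORD_TYPES = ["Real", "Pseudo1", "Pseudo2", "Novel"]
TONES = ["T1", "T3", "T5"]
# Two verb + noun and two modifier + noun items per tonal combination.
ITEM_STRUCTURES = ["verb+noun", "verb+noun", "modifier+noun", "modifier+noun"]


def build_design():
    """Enumerate one record per stimulus slot (hypothetical field names)."""
    design = []
    for word_type, first_tone, second_tone in product(WORD_TYPES, TONES, TONES):
        for item_index, structure in enumerate(ITEM_STRUCTURES, start=1):
            design.append(
                {
                    "word_type": word_type,
                    "first_tone": first_tone,
                    "second_tone": second_tone,
                    "structure": structure,
                    "item": item_index,
                }
            )
    return design


design = build_design()
combinations = {(d["first_tone"], d["second_tone"]) for d in design}
per_set = sum(1 for d in design if d["word_type"] == "Real")
```

Here `len(combinations)`, `per_set`, and `len(design)` correspond to the 9 tonal combinations, the 36 stimuli per set, and the 144 stimuli in total.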

2.3 Procedure

During the experiment, the participants wore a pair of headphones, and a computer screen and a microphone were placed in front of them. For the Real, Pseudo1 and Pseudo2 groups, they listened to two monosyllabic words pronounced in their base tones separated by an 800 ms interval. At the same time, they saw the characters on the computer screen. They were then asked to pronounce them as a disyllabic word in Wuxi. For the Novel group, subjects first listened to cue sentences that provided meanings for the novel syllable; they were then asked to put the novel and real syllables together and produce the disyllable as if it were a real word. The novel syllables were represented by a box “▢” on the computer screen. For example, the subjects would simultaneously hear and see “假设上网买东西叫做 tsia53; 如果黄金还没有 tsia53, 那么也可以讲还没_” (“If to shop online is called tsia53; if gold has not been tsia53-ed, then we can say that we have not _”), with the novel syllable “tsia53” represented as “▢” on the computer screen. Then he or she was asked to produce “tsia53 金” as if it were a real word. Each novel syllable was pronounced in its base tone twice in the cue sentence. Each novel syllable was combined with three real monosyllabic morphemes in one block, which were in T1, T3, and T5 respectively, and the 12 novel syllable blocks appeared in random order for every speaker. The speakers’ responses were recorded by a Marantz PMD-671 solid state recorder connected to an Electrovoice 767 microphone. The experiment was conducted in a quiet room in Wuxi.

2.4 Analysis/coding

All stimuli were analyzed in VoiceSauce (Shue, Keating & Vicenik 2009), an application implemented in Matlab that provides automated voice measurements of audio recordings. It measures the F0 at every millisecond through the STRAIGHT (Speech Transformation and Representation by Adaptive Interpolation of weiGHTed spectrogram) method (Kawahara, Masuda-Katsuse, & de Cheveigné, 1999). The data were then processed by an R script to get the average F0 for every 10% of the duration of the targeted syllables, with the first and last 12 ms of each targeted syllable trimmed off. To get more reliable data, the F0 measurements were also checked manually in Praat (Boersma & Weenink, 2014). All F0 measurements were then converted to semitones according to the formula in (6a), and the semitone values were further Z-score transformed using (6b). In (6b), each $ST_i$ is a pitch value data point among all stimuli produced by that speaker. These transformations allow a better reflection of pitch perception and normalize gender variations (Rietveld & Chen, 2006; Rose, 1987; Zhu, 2004). Twenty-six tokens from the Novel group were excluded due to incorrect segmental production of the syllables.

(6) a. $ST = 39.87 \times \log_{10}(\mathrm{Hz}/50)$

    b. $z_i = \dfrac{ST_i - \overline{ST}}{s_{ST}}$, where $\overline{ST}$ and $s_{ST}$ are the mean and standard deviation of all $ST_i$ values produced by that speaker.

A three-way Repeated Measures ANOVA was conducted on the F0 results. Since we expected the major difference in productivity to appear on the second syllable due to the left-dominant nature of the sandhi, we compared the F0 of the second syllable, with Word-type (4 levels), Second-syllable-tone (3 levels), and Data-point-in-syllable (11 levels) as independent variables. Huynh-Feldt adjusted values were used to correct for sphericity violations.

The stimuli were also transcribed on a 5-level tonal scale according to both the normalized F0 measurements and the first author’s perceptual judgment. Except for the 26 mispronounced stimuli, all sandhi tones produced by the speakers were then categorized into six categories: Correct Substitution (expected tone sandhi), Wrong Substitution (the use of an incorrect substitution pattern), Extension (spreading of the base tone on the 1st syllable), Unchanged (rendition of the base tones on both syllables), Partially Unchanged (rendition of the base tones on one of the syllables), and Others (other tone changes).

To examine the productivity patterns, the speakers’ production of a test word was coded as “1” if it represented correct substitution and “0” if it did not. We then used Logit Mixed-Effects models with Speaker and Item as random effects and Word-type (Real, Pseudo1, Pseudo2, Novel) and Tonal-combination (T1+X, T3+X, T5+X) as fixed effects. These models incorporate both factors with repeated levels (fixed-effects) and factors with levels randomly sampled from a much larger population (random-effects) and correspondingly allow the data to be characterized optimally (Jaeger, 2008).
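
The F0 processing steps described in this section (trimming, averaging over 10% intervals, and the transformations in (6)) can be sketched as follows. This is an illustration in Python rather than the R script used in the study, and it rests on several assumptions that the text does not settle: one F0 sample per millisecond as input, exactly ten equal-width intervals, and the sample standard deviation (n − 1) in the Z-score. All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def hz_to_semitone(f0_hz, reference_hz=50.0):
    """(6a): ST = 39.87 * log10(Hz / 50)."""
    return 39.87 * math.log10(f0_hz / reference_hz)


def z_normalize(semitones):
    """(6b): z-score each value against one speaker's mean and SD.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    mu = mean(semitones)
    sd = stdev(semitones)
    return [(st - mu) / sd for st in semitones]


def interval_means(f0_track_hz, trim_ms=12, n_bins=10):
    """Average F0 within each 10% of a syllable.

    Assumption: f0_track_hz holds one F0 value per millisecond, so trimming
    12 ms at each edge means dropping 12 samples at each edge.
    """
    trimmed = f0_track_hz[trim_ms:len(f0_track_hz) - trim_ms]
    if len(trimmed) < n_bins:
        raise ValueError("syllable too short for the requested number of bins")
    edges = [round(i * len(trimmed) / n_bins) for i in range(n_bins + 1)]
    return [mean(trimmed[edges[i]:edges[i + 1]]) for i in range(n_bins)]
```

In use, `interval_means` would be applied to each target syllable, `hz_to_semitone` to each resulting value, and `z_normalize` to the pooled semitone values of a single speaker, so that speakers with different pitch ranges become comparable.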

2.5 Results

2.5.1 Tone sandhi patterns

To examine the tone sandhis of T1, T3, and T5 in Real, Pseudo1, Pseudo2, and Novel word types, the average pitch tracks of different disyllabic tonal combinations are compared for the four word types. Figure 1 shows the average pitch tracks for T1 (53) + T1, T3, T5; T3 (323) + T1, T3, T5; and T5 (34) + T1, T3, T5 for the four types of words.

[Figure 1 panels: (a) T1 (53) + X → 43–34; (b) T3 (323) + X → 33–44; (c) T5 (34) + X → 55–31. Each panel shows pitch tracks for Real, Pseudo1, Pseudo2, and Novel words.]

Figure 1. Pitch tracks for disyllabic words, organized by tone on the first syllable. “X” refers to one of the three tones in this study.

Statistical results for the second-syllable comparisons are listed in Table 1. The main effect of Word-type is significant for T3+X and T5+X but not for T1+X. The main effect of Second-syllable-tone is significant for all three tonal combinations T1+X, T3+X, T5+X, indicating that the second-syllable pitch curves for the three tonal combinations have different intercepts. The effect of Data-point-in-syllable is also significant for all three tonal combinations, indicating that the tones on σ2 are all contour tones. There is a significant interaction between Word-type and Second-syllable-tone for all of T1+X, T3+X, and T5+X, suggesting that the main effect for Second-syllable-tone is regulated by Word-type. The main effect for Data-point-in-syllable is also regulated by Word-type for T1+X and T5+X, indicating that they have significantly different pitch slopes on σ2 among the four word types. T3+X does not show such significant difference, indicating that σ2 in T3+X does not have significant slope differences among the four word types. The interaction between Second-syllable-tone and Data-point-in-syllable is only significant for T1+X, indicating that σ2 in T1+X has significantly different pitch slopes depending on the tone on the second syllable. The three-way interactions are only significant in T5+X. This shows that the pitch slope difference on σ2 of T5+X among different tones on the second syllable is regulated by different word types.

Table 1. ANOVA results for the F0 comparisons of the second syllables in combination with different first syllables.

| | Word-Type | Tone | Data Point | Word-Type × Tone | Word-Type × Point | Tone × Point | Word-Type × Tone × Point |
|---|---|---|---|---|---|---|---|
| T1+X | F(1.761, 33.461) = .403, p = .646 | F(1.939, 36.836) = 25.490, p < .001 | F(1.622, 30.811) = 13.067, p < .001 | F(5.254, 99.827) = 3.909, p = .002 | F(3.561, 67.659) = 29.798, p < .001 | F(2.998, 56.959) = 3.638, p = .018 | F(7.756, 147.361) = 1.421, p = .194 |
| T3+X | F(2.336, 44.393) = 6.460, p = .002 | F(1.870, 35.527) = 20.303, p < .001 | F(1.356, 25.766) = 6.895, p = .009 | F(5.466, 103.861) = 4.006, p = .002 | F(6.750, 128.245) = 1.725, p = .112 | F(2.859, 54.328) = 1.736, p = .173 | F(13.336, 253.378) = 1.067, p = .388 |
| T5+X | F(2.167, 41.172) = 29.648, p < .001 | F(1.830, 34.771) = 30.546, p < .001 | F(1.665, 31.637) = 77.716, p < .001 | F(5.787, 109.958) = 5.918, p < .001 | F(4.426, 84.089) = 16.041, p < .001 | F(3.338, 63.414) = 1.902, p = .132 | F(8.911, 169.304) = 2.888, p = .003 |

As we can see, for all tonal combinations, the sandhi patterns in Real words are consistent with the patterns in Xu’s (2007) acoustic study: T1+X, T3+X, and T5+X were realized as 43+34, 33+44, and 55+31, respectively. In Pseudo1 and Pseudo2 words, the pitch tracks showed substitution tendencies, but the F0 of the second syllable was different among the three base tones. For the Novel group, for T1+X, instead of spreading the substituted tone, the speakers seemed to have simply spread the citation tone of T1 to the sandhi domain; for T3+X, the second syllables’ F0 was lower than that of Real words, but the disyllable did have the correct sandhi shape; T5+X showed a falling tone on the second syllable, similar to the correct tone sandhi, but the disyllable also showed an extension pattern (44+55).

It is interesting to note that in nonce words (Pseudo1, Pseudo2, Novel), except for T3+X in Pseudo1 and Novel, all other patterns showed a pitch height hierarchy on the second syllable, with T1 being the highest, T5 in the middle, and T3 the lowest. This corresponds to the pitch height of the base tones of the three tones. It seemed that there were extensions from the first syllable to the whole disyllabic word in Novel words, but the second syllable also kept part of its base tone in the three types of nonce words.

2.5.2 Tone sandhi categories

As mentioned in §2.4, to further investigate the types of sandhi patterns that the stimuli underwent, we categorized the acoustic output into the following six categories: Correct Substitution, Wrong Substitution, Extension, Unchanged, Partially Unchanged, and Others. These results are shown in Figure 2.

In T1+X, Correct Substitution occurs the most frequently (97%) in Real words. The Correct Substitution rate decreases to 60% in Pseudo1, 52% in Pseudo2, and 12% in Novel words. Extension, Unchanged and Wrong Substitution account for most of T1+X responses in the three types of nonce words. Cases of Extension increase from 13% in Pseudo1 and 20% in Pseudo2 to 56% in Novel words. Cases of Wrong Substitution also increase from 11–12% in Pseudo words to 15% in Novel words. Unchanged maintains a rate of 13–14% across the three types of nonce words.

In T3+X, Correct Substitution occurs most often in Real words (94%), but also has a relatively high rate in Pseudo words (74%). Cases of Correct Substitution decrease to 64% for Novel words. Due to the higher rate of Correct Substitution, Extension only accounts for 5% in Pseudo words and 18% in Novel words. Wrong Substitution accounts for about 5% across the three types of nonce words, and Unchanged is about 3%.

In T5+X, Correct Substitution occurs in 98% of the responses in Real words, 62% in Pseudo1, 53% in Pseudo2, and 12% in Novel words. It decreases in the same direction as Correct Substitution for T1+X. Cases of Extension also increase from 0% in Real words to 20% in Pseudo1, 26% in Pseudo2, and 46% in Novel words. Wrong Substitution is about 4% across the three types of nonce words, and Unchanged increases from 12% in Pseudo words to 18% in Novel words.

To further investigate the productivity of the three tonal combinations, the Correct Substitution rates of the three tonal combinations with the four types of words were compared.
These data were analyzed using a Logit Mixed-Effects Model, with Word-type and the tone of the first syllable (Syll1Tone) as fixed effects and subject and item as random effects. The model with interaction is significantly better than the other simpler models without interaction based on the results of log-likelihood tests.

[Figure 2 panels: T1+X, T3+X, and T5+X. Each panel shows Real, Pseudo1, Pseudo2, and Novel words broken down into the categories Correct Substitution, Wrong Substitution, Extension, Unchanged, Partially Unchanged, and Others.]

Figure 2. Tone sandhi categories for the four sets of stimuli. “X” refers to one of the three tones in this study.

We predicted that type and token frequencies and phonetic similarity could both have an effect on the correct sandhi productivity. We therefore ran the model twice, using Syll1ToneT1 (the most frequent tone) and Syll1ToneT3 (the most phonetically similar base and sandhi tone pair) as the baseline. Both models treated Real words as the baseline for Word-type. The parameter estimates for the models with the interaction between the two fixed effects when Syll1ToneT1 is the baseline are listed in Table 2.

Table 2. Fixed effect estimates (top) and variance estimates (bottom) for multi-level model results of correct sandhi (N=2880, log-likelihood: −1261).

| Fixed effect | Coefficient | SE of estimate | z | p (> \|z\|) |
|---|---|---|---|---|
| Intercept | 3.9816 | 0.5037 | 7.905 | 2.69e-15 *** |
| Pseudo1 | −3.4499 | 0.5504 | −6.268 | 3.66e-10 *** |
| Pseudo2 | −3.8451 | 0.5482 | −7.014 | 2.31e-12 *** |
| Novel | −6.4416 | 0.5783 | −11.139 | < 2e-16 *** |
| Syll1ToneT3 | −0.6577 | 0.6167 | −1.066 | 0.2862 |
| Syll1ToneT5 | 0.3586 | 0.7294 | 0.492 | 0.6230 |
| Pseudo1×Syll1ToneT3 | 1.5047 | 0.7296 | 2.062 | 0.0392 * |
| Pseudo2×Syll1ToneT3 | 1.7410 | 0.7264 | 2.397 | 0.0165 * |
| Novel×Syll1ToneT3 | 3.7865 | 0.7454 | 5.080 | 3.77e-07 *** |
| Pseudo1×Syll1ToneT5 | −0.2646 | 0.8228 | −0.322 | 0.7478 |
| Pseudo2×Syll1ToneT5 | −0.3545 | 0.8205 | −0.432 | 0.6657 |
| Novel×Syll1ToneT5 | −0.3569 | 0.8591 | −0.415 | 0.6778 |

| Random effect | s² |
|---|---|
| Item | 0.78451 |
| Subject | 0.68301 |

Table 2 shows that when T1 is the baseline, there is an effect of Word-type. Pseudo1, Pseudo2, and Novel words all have significantly lower rates of correct sandhi than Real words do, and the correct sandhi rate from high to low is Real, Pseudo1, Pseudo2, Novel, which is in accordance with our prediction. But the coefficient difference between Pseudo1 and Pseudo2 (−3.4499 vs. −3.8451) is small, while the coefficient value for Novel is considerably lower (−6.4416).

Syll1Tone (T1+X, T3+X, T5+X) does not show a significant effect on productivity in this model, as the comparisons are between T3+X and T1+X and between T5+X and T1+X in Real words. However, the interactions between Word-type and Syll1ToneT3 are all significant. This indicates that the differences between T1+X and T3+X in Pseudo1, Pseudo2, and Novel words are significantly different from the difference between T1+X and T3+X in Real words. Because the interaction coefficients are positive, the log-odds of correct sandhi for T3+X exceed those for T1+X by 0.847 ((3.9816 − 0.6577 − 3.4499 + 1.5047) − (3.9816 − 3.4499)) in Pseudo1, by 1.0833 ((3.9816 − 0.6577 − 3.8451 + 1.7410) − (3.9816 − 3.8451)) in Pseudo2, and by 3.1288 ((3.9816 − 0.6577 − 6.4416 + 3.7865) − (3.9816 − 6.4416)) in Novel words. This suggests that the speakers produced significantly more correct sandhi for T3+X than T1+X in the three types of nonce words, and the difference between T3+X and T1+X is the greatest in Novel words, followed by Pseudo2 and Pseudo1 words. Finally, there are no significant interactions between Word-type and Syll1ToneT5, suggesting that the differences between T1+X and T5+X in Pseudo1, Pseudo2, and Novel words are not significantly different from the difference between T1+X and T5+X in Real words.

The parameter estimates for the model with the interaction between the two fixed effects when Real words and Syll1ToneT3 are the baseline are listed in Table 3.

Table 3. Fixed effect estimates (top) and variance estimates (bottom) for multi-level model results of correct sandhi (N=2880, log-likelihood: −1261).

| Fixed effect | Coefficient | SE of estimate | z | p (> \|z\|) |
|---|---|---|---|---|
| Intercept | 3.3240 | 0.4182 | 7.948 | 1.90e-15 *** |
| Pseudo1 | −1.9452 | 0.4798 | −4.055 | 5.02e-05 *** |
| Pseudo2 | −2.1042 | 0.4777 | −4.405 | 1.06e-05 *** |
| Novel | −2.6552 | 0.4724 | −5.621 | 1.90e-08 *** |
| Syll1ToneT1 | 0.6576 | 0.6167 | 1.066 | 0.28632 |
| Syll1ToneT5 | 1.0162 | 0.6734 | 1.509 | 0.13127 |
| Pseudo1×Syll1ToneT1 | −1.5046 | 0.7296 | −2.062 | 0.03920 * |
| Pseudo2×Syll1ToneT1 | −1.7409 | 0.7264 | −2.396 | 0.01656 * |
| Novel×Syll1ToneT1 | −3.7864 | 0.7454 | −5.080 | 3.77e-07 *** |
| Pseudo1×Syll1ToneT5 | −1.7692 | 0.7778 | −2.275 | 0.02292 * |
| Pseudo2×Syll1ToneT5 | −2.0954 | 0.7757 | −2.701 | 0.00691 ** |
| Novel×Syll1ToneT5 | −4.1434 | 0.7939 | −5.219 | 1.80e-07 *** |

| Random effect | s² |
|---|---|
| Item | 0.78451 |
| Subject | 0.68301 |
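Tables 2 and 3 report the same model under two baselines, so every Word-type × Syll1Tone cell should receive the same predicted log-odds from either table. The sketch below checks this and converts the log-odds to probabilities. The dictionary layout and function names are illustrative choices, not part of the original analysis. The probabilities use the fixed effects only (random effects set to zero), so they are not expected to match the observed percentages in §2.5.2 exactly.

```python
import math

WORD_TYPES = ["Real", "Pseudo1", "Pseudo2", "Novel"]
TONES = ["T1", "T3", "T5"]

# Coefficients copied from Table 2 (baseline: Real words, T1).
TABLE2 = {
    "intercept": 3.9816,
    "word": {"Pseudo1": -3.4499, "Pseudo2": -3.8451, "Novel": -6.4416},
    "tone": {"T3": -0.6577, "T5": 0.3586},
    "inter": {("Pseudo1", "T3"): 1.5047, ("Pseudo2", "T3"): 1.7410,
              ("Novel", "T3"): 3.7865, ("Pseudo1", "T5"): -0.2646,
              ("Pseudo2", "T5"): -0.3545, ("Novel", "T5"): -0.3569},
}

# Coefficients copied from Table 3 (baseline: Real words, T3).
TABLE3 = {
    "intercept": 3.3240,
    "word": {"Pseudo1": -1.9452, "Pseudo2": -2.1042, "Novel": -2.6552},
    "tone": {"T1": 0.6576, "T5": 1.0162},
    "inter": {("Pseudo1", "T1"): -1.5046, ("Pseudo2", "T1"): -1.7409,
              ("Novel", "T1"): -3.7864, ("Pseudo1", "T5"): -1.7692,
              ("Pseudo2", "T5"): -2.0954, ("Novel", "T5"): -4.1434},
}

def log_odds(table, word, tone):
    """Fixed-effects log-odds of correct sandhi; baseline levels add 0."""
    return (table["intercept"]
            + table["word"].get(word, 0.0)
            + table["tone"].get(tone, 0.0)
            + table["inter"].get((word, tone), 0.0))

def inv_logit(x):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Largest disagreement between the two parameterizations (rounding only).
max_gap = max(abs(log_odds(TABLE2, w, t) - log_odds(TABLE3, w, t))
              for w in WORD_TYPES for t in TONES)

for w in WORD_TYPES:
    cells = ", ".join(f"{t}: {inv_logit(log_odds(TABLE2, w, t)):.2f}"
                      for t in TONES)
    print(f"{w:8s} {cells}")
print(f"max difference between tables: {max_gap:.4f}")
```

The same `log_odds` function also reproduces the contrasts quoted in the text, for example `log_odds(TABLE2, "Pseudo1", "T3") - log_odds(TABLE2, "Pseudo1", "T1")` for the 0.847 difference.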

Table 3 shows that when T3 is the baseline, there is also a significant effect of Word-type. The three types of nonce words all have significantly lower rates of correct sandhi than Real words, and the correct sandhi rate from high to low is Real, Pseudo1, Pseudo2, Novel, just as in the last model. The coefficient difference between Pseudo1 and Pseudo2 (−1.9452 vs. −2.1042) is still small; the difference between Novel and Real for T3 (Coefficient: −2.6552) is not as great as that for T1 (Coefficient: −6.4416). This indicates that Novel words for T3+X have more productive tone sandhi than Novel words for T1+X.

Syll1Tone (T1+X, T3+X, T5+X) does not show a significant effect in this model, as the comparisons were between T1+X and T3+X and between T5+X and T3+X in Real words. The interactions between different word types and Syll1ToneT1 replicate the results from the last model except that the coefficients are now all negative due to the switch of the baseline to T3. The interactions between different word types and Syll1ToneT5 are also significant. The difference in log-odds between T5+X and T3+X is −0.753 ((3.324 + 1.0162 − 1.9452 − 1.7692) − (3.324 − 1.9452)) for Pseudo1, −1.0792 ((3.324 + 1.0162 − 2.1042 − 2.0954) − (3.324 − 2.1042)) for Pseudo2, and −3.1272 ((3.324 + 1.0162 − 2.6552 − 4.1434) − (3.324 − 2.6552)) for Novel words. This suggests that the speakers produced significantly fewer cases of correct sandhi for T5+X than T3+X in the three types of nonce words, and the difference between T5+X and T3+X is similar to that between T1+X and T3+X. These comparisons show that phonetic similarity may have an effect on applying correct substitution, as the base and sandhi tones for T3+X are the most similar phonetically, and the sandhi is also the most productive in T3+X.

According to log-likelihood tests, for the categories of Extension and Unchanged, the full Logit Mixed-Effects model with Word-type, Syll1Tone, and their interaction is not significantly different from the model without the interaction. The simpler model is therefore adopted, and its parameter estimates for the Extension and Unchanged categories are given in Table 4 and Table 5.

Table 4. Fixed effect estimates (top) and variance estimates (bottom) for multi-level model results of Extension sandhi (N=2880, log-likelihood: −984).

| Fixed effect | Coefficient | SE of estimate | z | p (> \|z\|) |
|---|---|---|---|---|
| Intercept | −5.70758 | 0.71525 | −7.980 | 1.47e-15 *** |
| Pseudo1 | 3.82487 | 0.71939 | 5.317 | 1.06e-07 *** |
| Pseudo2 | 4.10766 | 0.71773 | 5.723 | 1.05e-08 *** |
| Novel | 5.48864 | 0.71399 | 7.687 | 1.50e-14 *** |
| Syll1ToneT3 | −1.49607 | 0.27693 | −5.402 | 6.57e-08 *** |
| Syll1ToneT5 | 0.08905 | 0.25079 | 0.355 | 0.723 |

| Random effect | s² |
|---|---|
| Item | 0.90772 |
| Subject | 0.59730 |


Table 5. Fixed effect estimates (top) and variance estimates (bottom) for multi-level model results of Unchanged sandhi (N=2880, log-likelihood: −608.8).

| Fixed effect | Coefficient | SE of estimate | z | p (> \|z\|) |
|---|---|---|---|---|
| Intercept | −5.58574 | 0.59242 | −9.429 | < 2e-16 *** |
| Pseudo1 | 2.96021 | 0.55304 | 5.353 | 8.67e-08 *** |
| Pseudo2 | 3.21965 | 0.55043 | 5.849 | 4.94e-09 *** |
| Novel | 3.24536 | 0.55027 | 5.898 | 3.68e-09 *** |
| Syll1ToneT3 | −1.87209 | 0.31790 | −5.889 | 3.89e-09 *** |
| Syll1ToneT5 | 0.06022 | 0.23985 | 0.251 | 0.802 |

| Random effect | s² |
|---|---|
| Item | 0.74462 |
| Subject | 1.12452 |

For the category of Extension, Pseudo1, Pseudo2, and Novel groups are all significantly different from Real words, but in the opposite direction from Correct Substitution. Positive coefficients indicate that the spreading pattern is significantly more common in Pseudo1, Pseudo2, and Novel words than Real words for T1, and the rate of spreading the base tone of σ1 from high to low is Novel, Pseudo2, Pseudo1, Real. The difference between the comparison of Pseudo1/Real and Pseudo2/Real is again small (Coefficients: 3.82487 versus 4.10766). The extension pattern for Syll1ToneT3 (T3+X) is significantly different from T1+X in Real words. Since the coefficient is negative (−1.49607), there are significantly fewer cases of extension in T3+X than in T1+X. In other words, Word-type and Tonal-combination both have an effect on the spreading pattern.

For the category of Unchanged, Pseudo1, Pseudo2, and Novel words are significantly different from Real words, and the rate of base tones remaining unchanged from high to low is Novel, Pseudo2, Pseudo1, Real words. The differences among the three wug word types are small despite the fact that they are all significantly different from Real words. The base tone of σ1 also has an effect, with T3+X having significantly fewer Unchanged cases than T1+X in Real words (Coefficient: −1.87209).

2.6 Discussion

Our results show that pattern substitution is not fully productive in Wuxi. Speakers failed to apply substitution productively in Novel words; instead, they often applied the spreading pattern or kept the base tones unchanged. It appears that the speakers had difficulty spreading the correctly substituted tone to the sandhi domain when they encountered Novel words. Our pitch results showed that there is a pitch height hierarchy of the second syllables in Novel words, with T1 being the highest, T5 in the middle, and T3 the lowest, which also indicates that the speakers were aware of the underlying tones of the second syllables. There may be two reasons for this pattern. First, the substitution pattern among T1, T3, and T5 in Wuxi is a phonologically opaque circular chain shift. The lack of productivity of this opaque pattern agrees with the Taiwanese results from previous studies (Hsieh, 1970; Wang, 1993; Zhang et al., 2009, 2011). Second, it is simply more difficult for the speakers to spread a substituted tone than to spread only the citation tone of the first syllable.

Our results also show that pattern substitution is the most productive in T3+X. Considering that T1 has the highest type and token frequencies among the three tones (see §1.3), this suggests that type and token frequencies either do not influence productivity or that their effect has been overshadowed by other effects. Phonetic similarity, on the other hand, facilitates correct substitution, as the similar shapes between the citation tone and the sandhi tone allow the speakers to find the substituted tone more easily.

Another interesting finding is that pattern substitution is more productive in Pseudo1 and Pseudo2 words than Novel words, but there is little difference between Pseudo1 and Pseudo2. Pseudo1 and Pseudo2 have overall correct substitution rates of 66% and 59% respectively, while Novel words’ overall correct substitution rate is only 29%. This means that speakers may not rely on specific morphemes to apply the sandhi, and it may be the phonological content of morphemes that actually matters.
We checked the frequencies of all homophones of the first morphemes in Pseudo1 and Pseudo2 in Cao (2003) and Wang (2008)’s Monosyllabic Morpheme List of Wuxi as well as Jun Da (2004)’s combined character frequency corpus in Mandarin. The average frequencies of the homophones of the first morphemes in Pseudo2 words turned out to be higher than those in Pseudo1 words, as shown in Table 6.

When the citation tone and substituted tone of the first morpheme are both known, but the disyllabic word is not familiar, as in Pseudo1, speakers can sometimes access the substituted tone, but not to as reliable an extent as in Real words. When only the base tone of the first morpheme is known, as in Pseudo2, where the first morpheme never occurs in initial position, speakers were still able to access the substituted tone sometimes and produce the correct sandhi, likely because there are homophones of the morpheme that can appear in initial position. This suggests that speakers relied on the phonological content of the morpheme rather than the morpheme itself in coming up with the sandhi tone. If the substituted tone of the first morpheme is not available, as in Novel words, they can only rely on the citation tone of the first morpheme to apply tone sandhi. This results in tonal spreading or unchanged base tones.


Table 6. Type and token frequencies of homophones for all of the first morphemes in Pseudo1 and Pseudo2, based on Cao (2003) and Wang (2008)’s Monosyllabic Morpheme Listing of Wuxi as well as Jun Da (2004)’s combined character frequency corpus in Mandarin.

| Word type | Tone | Average number of homophonic morphemes | Average number of homophonic morphemes that can appear in initial position | Average frequency of all homophonic morphemes | Average frequency of homophonic morphemes that can appear in initial position |
|---|---|---|---|---|---|
| Pseudo1 | T1 | 6 | 5.75 | 103,883 | 102,821.75 |
| Pseudo1 | T3 | 2 | 2 | 43,933.5 | 43,933.5 |
| Pseudo1 | T5 | 3.75 | 2.5 | 68,306.75 | 63,079.5 |
| Pseudo2 | T1 | 7.25 | 5.5 | 344,505.5 | 340,923.25 |
| Pseudo2 | T3 | 5.25 | 3.5 | 246,819.5 | 241,045.25 |
| Pseudo2 | T5 | 5.75 | 4.25 | 214,060.25 | 209,528 |

3. A grammatical model

There are two generalizations that a grammatical model of Wuxi speakers’ tone sandhi knowledge needs to capture. First, pattern substitution is productive in Real words but lacks full productivity in nonce words (Pseudo1, Pseudo2, and Novel groups). Second, phonetic similarity between base and sandhi tones facilitates the production of the substitution pattern in nonce words. We propose a Maximum Entropy model to this effect, and this section spells out the details of this model.

3.1 The Maximum Entropy (MaxEnt) Grammar

As a variant of Optimality Theory, the Maximum Entropy (MaxEnt) grammar is a rigorous model with enough flexibility for our purpose. In MaxEnt, each constraint is associated with a weight, and for each input, the probability of a particular candidate surfacing as the output is determined by how well this candidate satisfies the constraint weight hierarchy when compared with all other candidates. Learning in a MaxEnt grammar consists of determining the constraint weights that maximize the log probability of the learning data, and for each constraint, the learner can impose a Gaussian prior, with a mean of μ and a variance of σ², over its weight to prevent overfitting the data. The μ represents the default weight for the constraint, and σ² determines the severity of the penalty when the weight of the constraint deviates from μ: the smaller the σ², the greater the penalty. For more details on MaxEnt grammars and learning biases as Gaussian priors, see Jäger (2007), Zhang et al. (2011), and White (2013).
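For reference, the description above corresponds to the standard MaxEnt formulation sketched below. The notation is an assumption introduced here for illustration and does not appear in the original text: $C_k(x, y)$ is the number of violations of constraint $k$ incurred by candidate $y$ for input $x$, $w_k$ is that constraint's weight, and $H$ is the resulting weighted violation sum (often called harmony). The restriction to non-negative weights is the usual convention rather than something stated in this section.

```latex
% Harmony and candidate probability
H(x, y) = \sum_{k} w_k \, C_k(x, y), \qquad
P(y \mid x) = \frac{\exp\{-H(x, y)\}}{\sum_{y' \in \mathrm{Cand}(x)} \exp\{-H(x, y')\}}

% Learning objective: log probability of the data minus the Gaussian prior penalty
\hat{w} = \arg\max_{w \ge 0} \left[ \sum_{i} \log P(y_i \mid x_i)
          - \sum_{k} \frac{(w_k - \mu_k)^2}{2\sigma_k^2} \right]
```

The second term makes the role of σ² explicit: for a fixed deviation of $w_k$ from $\mu_k$, a smaller $\sigma_k^2$ produces a larger penalty.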

3.2 The dual listing/generation model

The analysis is also built on the dual listing/generation model of Zuraw (2000). It assumes that existing forms are listed in the speakers’ lexicon. These forms can be protected by a set of highly-weighted faithfulness constraints; but other constraints in the grammar can help predict speakers’ behavior when nonce words are encountered. Transparent patterns in the lexicon can be derived by both highly-weighted markedness constraints and lexical listing. In this case, these patterns will generally be productive in novel words. Opaque patterns, however, can only be achieved through lexical listing. This means that the pattern can emerge for Real words under the protection of lexical listing, but cannot be derived productively in Novel words due to the lack of listed lexical entries.

3.3 Constraints

3.3.1 UseListed constraints

According to the wug test, the speakers were able to apply substitution in Real words; in Pseudo1 and Pseudo2, the speakers referred to the listed sandhi form of the syllable to some degree but not as successfully as in Real words, so both substitution and extension applied in Pseudo words; in Novel words, in which the substituted tone for the first syllable is not available, the speakers often only relied on the citation tone of the first syllable to apply tone sandhi, causing tonal spreading or unchanged base tones.

Considering the three levels of productivity, there need to be three levels of listedness. The first level of listed constraints is operative for Real words with existing sandhi forms and accounts for the correct substitution for real words. The second level is for the syllable-level allomorphic relation between the base tone and the sandhi tone and captures the higher sandhi productivity in Pseudo words than Novel words. The third level is for the allomorphic relation between base tone and sandhi tone independent of segmental contexts and accounts for the degree of substitution productivity in Novel words.

For example, for an existing syllable with a base tone of 53 like [sin53], when it forms a real disyllabic word with another existing syllable [siɪ53], the disyllable /sin53-siɪ53/ has a listed lexical entry /sin43-siɪ34/, and the syllable itself also has a listed allomorph /sin323/. In Pseudo words, speakers use the listed allomorph /sin323/ to apply tonal substitution for the disyllabic word. Each tone in the tonal inventory also has a listed tonal allomorph, so /53/ has a listed tonal allomorph /323/ that can be used in tone substitution.

The three types of UseListed constraints are defined in (6). Constraints in (6a) require that real words use the listed lexical entry; constraints in (6b) force an existing syllable to use its listed allomorph; and constraints in (6c) require a proper tonal allomorph to be used for existing tones regardless of what the syllable is.

(6) UseListed constraints:
a. UseListed(σ53-X): If the base tone of σ1 is 53, use the lexical entry /σ43-σ34/.
   UseListed(σ323-X): If the base tone of σ1 is 323, use the lexical entry /σ33-σ44/.
   UseListed(σ34-X): If the base tone of σ1 is 34, use the lexical entry /σ55-σ31/.
b. UseListed(σ53): If the base tone of σ1 is 53, use the listed allomorph /σ323/.
   UseListed(σ323): If the base tone of σ1 is 323, use the listed allomorph /σ34/.
   UseListed(σ34): If the base tone of σ1 is 34, use the listed allomorph /σ53/.
c. UseListed(53): Use the listed tonal allomorph /323/ for /53-X/.
   UseListed(323): Use the listed tonal allomorph /34/ for /323-X/.
   UseListed(34): Use the listed tonal allomorph /53/ for /34-X/.

The tableaux in (7) show how the UseListed constraints are evaluated across different types of words.

(7) The evaluation of UseListed constraints

| Real: /σ53-X/ | UseListed(σ53-X) (Listed: /σ43-σ34/) | UseListed(σ53) (Listed: /σ323-X/) | UseListed(53) (Listed: /323-X/) |
|---|---|---|---|
| → σ43-σ34 | | | |
| σ55-σ31 | * | * | * |
| σ44-σ55 | * | * | * |
| σ53-σ53 | * | * | * |


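| Pseudo: /σ53-X/ | UseListed(σ53-X) (no listed entry) | UseListed(σ53) (Listed: /σ323-X/) | UseListed(53) (Listed: /323-X/) |
|---|---|---|---|
| → σ43-σ34 | | | |
| σ55-σ31 | | * | * |
| σ44-σ55 | | * | * |
| σ53-σ53 | | * | * |

| Novel: /σ53-X/ | UseListed(σ53-X) (no listed entry) | UseListed(σ53) (no listed entry) | UseListed(53) (Listed: /323-X/) |
|---|---|---|---|
| → σ43-σ34 | | | |
| σ55-σ31 | | | * |
| σ44-σ55 | | | * |
| σ53-σ53 | | | * |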

3.3.2 Constraints for tonal extension

To capture the tonal extension that occurs in Pseudo and Novel words, we use the interaction between markedness and faithfulness constraints. These constraints will apply to all three types of words. Given that a lexical entry has both the base tone and the sandhi tone listed, the tonal faithfulness constraint is defined such that it can be satisfied as long as the tonal specification of either the base tone or the sandhi tone is preserved in the output, as shown in (8). This constraint differs from UseListed in that the latter requires the use of the sandhi tone in the output, while this constraint does not.

(8) Max(Tone) (abbr. Max-T): Maximize the tonal specification of either the listed base tone or the listed substituted tone in the tonal specification of the output.

According to Zhang (2007), tone extension is motivated by constraints that favor the reduction of tonal contours, as the extension leads to flattened contours on each syllable. We consider a markedness constraint that disfavors a pronounced contour tone in nonfinal position, as defined in (9).

(9) *Pronounced Contour-Nonfinal (abbr. *Pro Con-NF): Pronounced contour tones cannot occur on nonfinal syllables.

Zhang (2007) argued that tone extension also violates a specific type of faithfulness, which he termed Faith-Alignment. The constraint that bans rightward spreading, Faith-Align-Right, is defined in (10).


(10) Faith-Align-Right (abbr. F-Align-R): If the right edge of T is aligned with the right edge of σi in the input, then the right edge of T’s correspondent in the output cannot be aligned with the right edge of syllables later than σi in the output.

The tableau in (11) illustrates the candidates for /σ53-X/ with respect to the three constraints: *Pro Con-NF, Max-T, and F-Align-R. The top two candidates involve the spreading of the base tone and the listed sandhi tone. Therefore, they satisfy Max-T and *Pro Con-NF, but violate F-Align-R. The third candidate involves the spreading of a wrong sandhi tone and thus violates Max-T; but it does not have a pronounced contour in nonfinal position, nor does it violate F-Align-R, as the spread tone has no correspondent in the input. The fully faithful candidate violates *Pro Con-NF, but does not violate the faithfulness constraints.

(11) /σ53-X/ → ?

| /σ53-X/ (Listed: /323-X/) | Max-T | *Pro Con-NF | F-Align-R |
|---|---|---|---|
| σ43-σ34 | | | * |
| σ55-σ31 | | | * |
| σ44-σ55 | * | | |
| σ53-X | | * | |
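To make the interaction of these constraints concrete, the sketch below computes MaxEnt output probabilities for a Novel input /σ53-X/, combining the UseListed(53) violations from (7) with the violations in (11). The weights are hypothetical values chosen purely for illustration. They are not the weights learned by the model in this paper, and the function and variable names are likewise illustrative.

```python
import math

CONSTRAINTS = ["UseListed(53)", "Max-T", "*ProCon-NF", "F-Align-R"]

# Violation counts for Novel /σ53-X/, read off tableaux (7) and (11).
# Column order follows CONSTRAINTS.
VIOLATIONS = {
    "σ43-σ34": [0, 0, 0, 1],  # spreads the listed sandhi tone
    "σ55-σ31": [1, 0, 0, 1],  # spreads the base tone
    "σ44-σ55": [1, 1, 0, 0],  # spreads a wrong sandhi tone
    "σ53-X":   [1, 0, 1, 0],  # fully faithful
}

def maxent_probs(violations, weights):
    """P(candidate) is proportional to exp(-sum of weight * violations)."""
    harmony = {cand: sum(w * v for w, v in zip(weights, viols))
               for cand, viols in violations.items()}
    z = sum(math.exp(-h) for h in harmony.values())
    return {cand: math.exp(-h) / z for cand, h in harmony.items()}

# Hypothetical weights (illustration only), in CONSTRAINTS order.
HYPOTHETICAL_WEIGHTS = [0.5, 3.0, 1.0, 0.5]
probs = maxent_probs(VIOLATIONS, HYPOTHETICAL_WEIGHTS)

# With UseListed(53) at weight 0, nothing separates the two spreading candidates.
probs_no_listing = maxent_probs(VIOLATIONS, [0.0, 3.0, 1.0, 0.5])

for cand, p in probs.items():
    print(f"{cand}: {p:.3f}")
```

Under these assumed weights, the two spreading candidates differ only in UseListed(53), so the weight of that constraint alone determines how strongly correct substitution is preferred over extension of the base tone.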

3.4 Learning biases

The wug test shows that the tone sandhi pattern in Wuxi does not reflect the speakers’ tacit knowledge of the pattern. This indicates that phonological learn- ing is biased. We argue that there is a substantive bias that encodes the learners’ a priori preference for learning phonological patterns according to perception, articulation, and other phonetic knowledge (Hayes, Kirchner, & Steriade, 2004; Steriade, 1999, 2008; White 2013). As mentioned in §4.1, MaxEnt with Gaussian priors provides the mathematical basis to account for the speakers’ learning biases. We first need a learning bias in favor of the UseListed constraints for T3 due to the phonetic similarity between the base and sandhi tones. According to Steriade’s P-map hypothesis (Steriade 2008, also see White 2013), constraints that regulate the correspondence between forms are intrinsically ranked according to the perceptual similarity between the forms in correspondence: the more dissimi- lar the forms are to each other, the more highly ranked the constraint penalizing their correspondence. In the context of UseListed constraints, given that they enforce the correspondence between a base tone and a listed allomorph, we assume that the more similar the base and sandhi tones are, the higher their UseListed

constraint is weighted. Following White (2013), we encode this as different default weights for different UseListed constraints. We assign the UseListed constraints of T3+X a μ value of 0.3, and all the other constraints a μ value of zero to encode this effect. Second, we need a learning bias that demotes the UseListed constraints for tonal allomorphy. As the value of σ² determines how easy it is for the weight of each constraint to deviate from its default μ, to capture the productivity differences among different word types, the three types of UseListed constraints have different σ² values (Zhang et al., 2011). The σ² value of UseListed constraints for Real words is set to 1; the σ² value of UseListed constraints for syllable-level allomorphs is set to 0.05; and the σ² value of UseListed constraints for tonal allomorphs is set to 0.00001. The σ² of other constraints is set to 1 as well. The two learning biases reflect the intuition that first, phonetic properties motivate speakers’ learning, and second, if UseListed is the learner’s strategy to cope with exceptional patterns that cannot be captured by regular means, such as Markedness » Faithfulness rankings, then the learner is unwilling to promote those UseListed constraints that cover a large amount of data so as to avoid treating these data as exceptions. The μ and σ² values for all constraints are listed in (12). The intuition behind this bias coefficient is that if UseListed is the learner’s strategy to cope with exceptional patterns that cannot be captured by regular means, such as the Markedness » Faithfulness ranking, then the learner is first of all cautious about positing exceptions, expressed in the model by assigning

UseListed constraints greater penalties than other constraints (BListed < 0 for all UseListed constraints) if they deviate from the default ranking of 0; secondly, the learner is unwilling to treat massive amounts of data as exceptions, expressed in the model as greater penalties for UseListed constraints that cover a greater number of morphemes, i.e., make generalizations.

(12) μ and σ² values for all constraints

Constraints           μ      σ²       Constraints        μ      σ²
UseListed(σ53-X)      0      1        UseListed(σ53)     0      0.05
UseListed(σ323-X)     0.3    1        UseListed(σ323)    0.3    0.05
UseListed(σ34-X)      0      1        UseListed(σ34)     0      0.05
Max(Tone)             0      1        UseListed(53)      0      0.00001
*Contour-Nonfinal     0      1        UseListed(323)     0.3    0.00001
Faith-Align-Right     0      1        UseListed(34)      0      0.00001
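To make the role of σ² concrete, the following minimal sketch computes the cost that a Gaussian prior attaches to moving a weight away from its default μ. It assumes the standard MaxEnt prior term (w − μ)² / (2σ²); that the tool used here applies exactly this form is an assumption, and the function and variable names are hypothetical. The σ² values are those in (12).

```python
# Illustrative sketch only: assumes the usual Gaussian prior term
# (w - mu)^2 / (2 * sigma^2); names are hypothetical, not from the source.

def gaussian_prior_penalty(weight, mu, sigma_sq):
    """Cost added to the objective when a weight departs from its default mu."""
    return (weight - mu) ** 2 / (2.0 * sigma_sq)


# sigma^2 values from (12), keyed by the level of listing they apply to.
SIGMA_SQ = {
    "disyllable": 1.0,       # UseListed constraints for Real words
    "syllable": 0.05,        # syllable-level allomorphs
    "tone": 0.00001,         # tonal allomorphs
}


def penalties_for_shift(shift, mu=0.0):
    """Penalty for the same weight shift at each level of listing."""
    return {
        level: gaussian_prior_penalty(mu + shift, mu, sigma_sq)
        for level, sigma_sq in SIGMA_SQ.items()
    }


if __name__ == "__main__":
    for level, penalty in penalties_for_shift(0.5).items():
        print(f"{level:>10}: {penalty:g}")
```

Under this assumed form, the same shift of 0.5 is cheap for disyllable-level constraints and very costly for tone-level constraints, which is the sense in which a small σ² keeps a weight close to μ.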


3.5 Modeling the speakers

We used the MaxEnt Grammar Tool⁸ to implement the learning model for Wuxi tone sandhi. The learner is fed tonal combinations that are representative of the lexicon of Wuxi and outputs a grammar that can predict the speakers’ behavior. The data fed to the learner are real disyllabic words with T1, T3, or T5 on the first syllable. To estimate the appropriate token frequencies of T1+X, T3+X, and T5+X words, we first made a monosyllabic morpheme list for Wuxi by combining the Wuxi monosyllabic morphemes found in Cao (2003) and Wang (2008); we then looked up the frequencies of these morphemes in Jun Da’s (2004) combined character frequency corpus of written Mandarin as an estimate of their usage frequencies in Wuxi. This calculation indicated that the numbers of T1+X, T3+X, and T5+X words a Wuxi learner encounters are in a rough ratio of 5:3:4. Assuming that a learner hears 7,000,000 words per year,⁹ a 27-year-old speaker (the average age of the subjects in our experiment) has encountered an estimated 189,000,000 words. Based on these, the inputs and outputs that we fed to the learner are as given in (13). For each input, we also included three additional candidates: one that involves spreading the base tone on the first syllable to the disyllable, one that involves spreading a wrongly substituted tone, and one that is fully faithful.

(13) Learning data fed to the MaxEnt Grammar Tool

Real words
Input        Output      Frequency
/σ53-X/      σ43-σ34     78,750,000
/σ323-X/     σ44-σ55     47,250,000
/σ34-X/      σ55-σ31     63,000,000
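The frequencies in (13) follow from the figures given in the text: 7,000,000 words per year over 27 years, split in the ratio 5:3:4. The short sketch below reproduces that arithmetic; the function and constant names are hypothetical and chosen only for illustration.

```python
# Reproduces the arithmetic behind (13); names are illustrative only.

WORDS_PER_YEAR = 7_000_000          # midpoint of the 3,000,000-11,000,000 range
AGE_YEARS = 27                      # average age of the experimental subjects
RATIO = {"T1+X": 5, "T3+X": 3, "T5+X": 4}


def token_frequencies(words_per_year, years, ratio):
    """Split the total number of words heard according to the given ratio."""
    total = words_per_year * years
    parts = sum(ratio.values())
    return {combo: total * share // parts for combo, share in ratio.items()}


if __name__ == "__main__":
    print("total:", WORDS_PER_YEAR * AGE_YEARS)
    for combo, freq in token_frequencies(WORDS_PER_YEAR, AGE_YEARS, RATIO).items():
        print(f"{combo}: {freq:,}")
```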

The constraint weights after learning are listed in (14).

(14) Constraint weights after optimization

Constraints              μ      σ²         Weights
UseListed(σ53-X)         0      1          14.2544
UseListed(σ34-X)         0      1          14.0553
UseListed(σ323-X)        0.3    1          13.2986
Max(Tone)                0      1          2.6216
*Pro Contour-Nonfinal    0      1          2.6216
UseListed(σ323)          0.3    0.05       0.9499
UseListed(σ53)           0      0.05       0.7127
UseListed(σ34)           0      0.05       0.7028
UseListed(323)           0.3    0.00001    0.3001
UseListed(53)            0      0.00001    1.4254E-4
UseListed(34)            0      0.00001    1.4055E-4
Faith-Align-Right        0      1          0

8. A software package developed by Colin Wilson and Ben George, made available to the public by Bruce Hayes at http://www.linguistics.ucla.edu/people/hayes/MaxentGrammarTool/.

9. According to Hart & Risley (1995), a three-year-old hears around 3,000,000 to 11,000,000 words per year. We took the median of this estimate as the annual basis for our learner.

The weights of these constraints allow the effects of both tonal combination and word type in the speakers’ experimental behavior to be captured. For the effect of tonal combination, on one hand, since the different values of μ assign preferred weights to UseListed constraints for T3-X, for which the base tone and surface tone share similar phonetic properties, these UseListed constraints for T3-X have higher weights. On the other hand, the weights for UseListed constraints of listed sandhi tones also reflect the different token frequencies of the three tonal combinations. T1 has the highest frequency, and the weight of the UseListed constraint for disyllabic words in T1+X is the highest.

For the word-type effect, in Real words, the highly weighted UseListed constraints on disyllables allow the speakers to use the lexical entries and apply correct substitution. A Real word /σ53-X/ has a listed sandhi tone /σ43-σ34/, a listed allomorph /σ323/, and a listed tonal allomorph /323/. If the listed sandhi tone is used, none of the UseListed constraints is violated; but candidates of tonal extension and wrong substitution as well as the faithful candidate violate all three UseListed constraints. For a Pseudo word /σ53-X/, given that it lacks a lexical listing for the disyllable, UseListed (σ53-X) is vacuously satisfied by all candidates. This allows all candidates that do not involve correct substitution to incur fewer violations than in Real words and hence a greater chance to surface. But note that tonal extension will occur less often than substitution as the extension candidate is harmonically bounded by the substitution candidate. Finally, for a Novel /σ53-X/, there is no disyllabic lexical listing or syllable-based allomorph. Therefore, UseListed (σ53-X) and UseListed (σ53) are both vacuously satisfied, and only UseListed (53) is relevant. This further reduces the violation marks for candidates that do not involve correct substitution and hence increases their likelihood of surfacing.

The learned grammar was tested for its prediction on Real, Pseudo, and Novel words, again with correct substitution, extension, wrong substitution, and unchanged as candidates. In MaxEnt grammar, each candidate is associated with a harmonic score determined by the constraint weights and the numbers of times


it violates the constraints, and the probability of each candidate appearing as the output for an input is the ratio between its harmonic score and the sum of the harmonic scores of all of the input’s candidates. According to this calculation, the rates of these sandhi categories in T1+X, T3+X, and T5+X predicted by the grammar are listed in Figure 3, with the corresponding experimental results next to them. We combined Partially Unchanged and Unchanged in the experimental results into one category and left out Others, which seldom occurred in the experiments. We also combined the Pseudo1 and Pseudo2 results into the Pseudo category.
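The calculation described above can be sketched for /σ53-X/ as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses the standard MaxEnt formulation, in which a candidate's score is exp(−Σ weight × violations) and probabilities are scores normalized over the candidate set. The weights are copied from (14); the violation profiles are reconstructed from the prose around (11) and the word-type discussion, and all names in the code are hypothetical.

```python
import math

# Weights copied from (14) for the constraints relevant to /σ53-X/.
WEIGHTS = {
    "UseListed(σ53-X)": 14.2544,
    "UseListed(σ53)": 0.7127,
    "UseListed(53)": 1.4254e-4,
    "Max(Tone)": 2.6216,
    "*ProCon-NF": 2.6216,
    "F-Align-R": 0.0,
}

# ASSUMPTION: violation profiles reconstructed from the prose, one violation each.
BASE_VIOLATIONS = {
    "correct substitution": ["F-Align-R"],
    "extension": ["F-Align-R"],
    "wrong substitution": ["Max(Tone)"],
    "unchanged": ["*ProCon-NF"],
}

# UseListed constraints that are not vacuously satisfied for each word type;
# every candidate except correct substitution is assumed to violate them.
ACTIVE_USELISTED = {
    "Real": ["UseListed(σ53-X)", "UseListed(σ53)", "UseListed(53)"],
    "Pseudo": ["UseListed(σ53)", "UseListed(53)"],
    "Novel": ["UseListed(53)"],
}


def maxent_probabilities(word_type):
    """Return candidate probabilities for /σ53-X/ under the assumed profiles."""
    scores = {}
    for candidate, violated in BASE_VIOLATIONS.items():
        constraints = list(violated)
        if candidate != "correct substitution":
            constraints += ACTIVE_USELISTED[word_type]
        penalty = sum(WEIGHTS[c] for c in constraints)
        scores[candidate] = math.exp(-penalty)
    total = sum(scores.values())
    return {candidate: s / total for candidate, s in scores.items()}


if __name__ == "__main__":
    for word_type in ("Real", "Pseudo", "Novel"):
        probs = maxent_probabilities(word_type)
        print(word_type, {c: round(p, 3) for c, p in probs.items()})
```

Under these assumptions, correct substitution is near-categorical for Real words, loses ground to extension in Pseudo words, and is almost tied with extension in Novel words, since only the very low-weighted UseListed(53) separates the two candidates there.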

[Figure 3, panels (a) T1+X, (b) T3+X, and (c) T5+X: grammar predictions shown next to experimental results for Real, Pseudo, and Novel words; response categories are Correct Substitution, Extension, Wrong Substitution, and Unchanged.]

Figure 3. Comparison of grammar predictions and experimental results for the sandhi in the three tonal combinations, including both.

The grammar predicts that within each tonal combination, for T1+X in (a), T3+X in (b), and T5+X in (c), pattern substitution in Real words is productive. It is still relatively productive in Pseudo words, but extension is predicted as well. More Extension cases appear in Novel words. Across the three tonal combinations,

there is no difference in Real words, but in Pseudo and Novel words, T3+X (b) has a higher rate of Correct Substitution than T1+X (a) and T5+X (c). These patterns match up with the experimental results relatively well.

We note that there are also areas of the grammar that can be improved. The grammar did not capture the low rate of Correct Substitution and the relatively high rate of Extension in Novel words for T1+X and T5+X. This is because the Extension candidate is harmonically bounded by the substitution candidate as we have seen in (17). Therefore it is not possible for the grammar to predict more extension cases than substitution ones. It could be that the definition of Max(Tone), which promotes the spreading of both the base and substituted tones, leads to the high rate of predicted substitution. However, if Max(Tone) only maximizes the tonal specification of the base tone in the output, the grammar would devalue substitution too much for it to surface in Novel words. The grammar also predicted relatively few Unchanged cases in Novel words, as it naturally disprefers the lack of sandhi due to the large number of constraints that favor tonal substitution and extension.
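The harmonic bounding argument can be checked mechanically. The sketch below is a generic check, with hypothetical names, applied to violation profiles for a Novel /σ53-X/ that are assumed from the prose (both candidates violate Faith-Align-Right, and only Extension violates UseListed(53)). When one candidate's violations are a superset of another's and all weights are non-negative, no weighting can give the bounded candidate the higher MaxEnt probability.

```python
# Generic harmonic-bounding check; profiles below are assumptions from the prose.

def harmonically_bounds(winner, loser):
    """True if `winner` is never worse than `loser` and strictly better somewhere.

    Each argument maps constraint names to violation counts; a missing
    constraint counts as zero violations.
    """
    constraints = set(winner) | set(loser)
    never_worse = all(winner.get(c, 0) <= loser.get(c, 0) for c in constraints)
    somewhere_better = any(winner.get(c, 0) < loser.get(c, 0) for c in constraints)
    return never_worse and somewhere_better


# Assumed violation profiles for a Novel /σ53-X/ input.
SUBSTITUTION = {"Faith-Align-Right": 1}
EXTENSION = {"Faith-Align-Right": 1, "UseListed(53)": 1}

if __name__ == "__main__":
    print("substitution bounds extension:", harmonically_bounds(SUBSTITUTION, EXTENSION))
    print("extension bounds substitution:", harmonically_bounds(EXTENSION, SUBSTITUTION))
```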

4. General discussion

Based on the results of the production experiment on disyllabic tone sandhi in both real and nonce words in Wuxi, we conclude that pattern substitution is not fully productive. The speakers could substitute the base tone of the first syllable and spread the substituted tone to the whole disyllable if they could identify the syllable in real morphemes (Pseudo1 and Pseudo2), but they did not do so at a high rate. If they could not identify the syllable (Novel), they tended to spread the base tone directly or keep the base tones of both syllables. The tonal combination of T3+X has the most productive sandhi even though T3 has the lowest type and token frequencies, suggesting that phonetic similarity has a greater effect than frequency on pattern substitution in Wuxi.

An alternative account for the prevalence of the spreading pattern in our results is that it is due to dialectal contact with Shanghai, which has a spreading pattern. There are two arguments against this account. First, if contact were the reason, we would expect the spreading pattern to occur in real words, but our results showed that the spreading pattern occurred primarily in nonce words and in very few real words. This is consistent with our participants’ reported dialect background: none reported fluency in Shanghai. Second, Chan & Ren (1989) proposed an elegant historical account for pattern substitution: it originated from an earlier stage of right-dominant sandhi, in which the tone on the initial syllable alternated locally; the dominant edge then shifted to the left, and the sandhied tone on the


initial syllable spread rightward as left-dominant sandhi is wont to do, causing a substitution type tonal alternation between the base tone on the initial syllable and the spread tone on the disyllable. Therefore, Wuxi’s tone sandhi pattern represents an intermediate stage between right-dominant and left-dominant sandhi, indicating that it is historically more conservative than Shanghai. The major difference between Wuxi and Shanghai tone sandhi is that the former involves pattern substitution and the latter involves spreading. If the speakers’ knowledge of spreading can arise independently of contact, as in our analysis, then not only have we proposed an analysis for Wuxi, but also a potential pathway from the more conservative Wuxi pattern to the more innovative Shanghai pattern. If, on the other hand, the speakers’ knowledge of spreading is due to contact, then the appearance of the Shanghai pattern remains unaccounted for.

We also modeled the speakers’ learning behavior and compared the modeling results to the experimental results. Using UseListed constraints on different levels of representation and learning biases that penalize the weight increase of more general UseListed constraints, our model is able to capture the gradation of the productivity of pattern substitution from Real to Pseudo to Novel words. The higher productivity of the T3+X sandhi is captured by a higher default weight of UseListed constraints that regulate T3 in the model, and we argued that the higher default weight is motivated by the phonetic similarity between base T3 and its substituted sandhi tone.

An anonymous reviewer questioned the nature of the UseListed constraints and their explanatory value in the grammar. 
Our UseListed constraints are motivated by both the formal nature of the sandhi pattern and the experimental results: the circular opacity of the substitution pattern determines that the traditional Markedness and Faithfulness interactions cannot derive the pattern (Moreton, 2004), and the lack of full productivity of the pattern in the wug test also indicates that the grammar cannot simply comprise Markedness and Faithfulness rankings. The partial productivity of the pattern, on the other hand, indicates that the mapping relation has been partially learned, and the gradation of the productivity from Real, to Pseudo, to Novel words motivates the different levels of listedness. In terms of the formal statement of UseListed, we take the position that it states both the tonal allomorph and the phonological context, either positional or tonal, in which the allomorph is required. Learners of a particular language can then plug in the specifics of their language to this universal template.

This study complements our knowledge on the productivity of tone sandhi patterns by providing results for pattern substitution. It also echoes the findings of previous research. Tones in a circular chain shift cause difficulty for the speakers of Wuxi, just like for the speakers of Taiwanese. Phonetic properties influence tone sandhi productivity in Wuxi, similar to the findings in Mandarin (Zhang &


Lai, 2010) and Tianjin (Zhang & Liu, 2011). For example, in Mandarin, the sandhi with a clearer phonetic motivation is more productive. The effect of lexical frequency on sandhi application is overshadowed by the effect of phonetic similarity in Wuxi. This differs from Taiwanese (Zhang et al., 2009, 2011), in which low lexical frequency of the base tone 33 and the reduplicative melody 21–33 cause a low productivity of 33 → 21. These findings indicate that opacity, phonetic properties, and lexical frequency have a complex relation in their influence on the productivity of tone sandhi patterns.

An interesting question raised by an anonymous reviewer is whether the sandhi productivity results support a distinction between morphophonological (e.g., positional type sandhi) and phonological (e.g., spreading type sandhi) processes (cf. lexical vs. postlexical rules à la Kiparsky, 1982 and Mohanan, 1982), and if so, whether the wug test is an appropriate tool for the study of the former, which may obtain unreliable results. There are two points we would like to offer. First, we must obtain the productivity results in some way, and there is no principled reason to believe that the wug test produces unreliable results. Wug tests on processes that have morphophonological properties (e.g., Bybee & Pardo, 1981; Albright, 2002; Pierrehumbert, 2006; Zuraw, 2007; Hayes, Wilson, & George, 2009; Becker, Ketrez, & Nevins, 2011) have uncovered consistent findings regarding the role of lexical frequency and phonetic properties in how phonology extends to new items, and these findings have provided important insight into how these processes are internalized by speakers and how they should be formally analyzed. Second, we are indeed interested in how the factors that have been used to delineate morphophonological vs. phonological processes influence productivity, such as the existence of exceptions and the phonetic nature of the process, but we are unwilling to subscribe to a principled distinction between the two. In tone sandhi, for example, although the spreading type sandhi in Shanghai Wu is most likely rooted in phonetic tonal coarticulation, it also is clearly phonologized, has morphosyntactic conditioning, and has ample exceptions. Therefore, we believe that it is more fruitful to make fewer assumptions about the categorical delineation among different types of phonological processes in our investigation of tone sandhi productivity.

5. Conclusion

The tone sandhi patterns in Wuxi Chinese are a combination of opacity and transparency. We conducted a wug test to study the speakers’ knowledge of Wuxi tone sandhi and proposed a theoretical learning model that produced a grammar that predicted a number of crucial properties observed in the speakers’ wug test results. Our experimental results showed that the speakers were able to apply the opaque


pattern substitution in Real words, but the pattern was not productively applied in Pseudo and Novel words. The transparent pattern of tonal extension, however, was productive, and the substitution of two phonetically similar tones was also more productive. Our theoretical model was able to capture these basic generalizations, but fell short in predicting certain details of the experimental results. We therefore consider this as a first step towards our understanding of how speakers integrate simultaneously opaque and transparent tone sandhi patterns.

References

Albright, A. (2002). Islands of reliability for regular morphology: Evidence from Italian. Language, 78, 684–709. doi: 10.1353/lan.2003.0002
Becker, M., Ketrez, N., & Nevins, A. (2011). The surfeit of the stimulus: Analytic biases filter lexical statistics in Turkish laryngeal alternations. Language, 87, 84–125. doi: 10.1353/lan.2011.0016
Boersma, P., & Weenink, D. (2003). Praat: A system for doing phonetics by computer. [http://www.praat.org/].
Bybee, J., & Pardo, E. (1981). On lexical and morphological conditioning of alternations: A nonce-probe experiment with Spanish verbs. Linguistics, 19, 937–968. doi: 10.1515/ling.1981.19.9-10.937
Cao, X. Y. (2003). Wuxi fangyan yanjiu (Research on the Wuxi dialect). MA thesis, Suzhou University, Suzhou, China.
Chan, M. K. M., & Ren, H. M. (1989). Wuxi tone sandhi: From last to first syllable dominance. Acta Linguistica Hafniensia, 21, 35–64. doi: 10.1080/03740463.1988.10416058
Chen, M. Y. (2000). Tone sandhi: Patterns across Chinese dialects. Cambridge, UK: Cambridge University Press. doi: 10.1017/CBO9780511486364
Cheng, R. (1968). Tone sandhi in Taiwanese. Linguistics, 41, 19–42.
Court, C. (1985). Observations on some cases of tone sandhi. In G. Thurgood, J. A. Matisoff, & D. Bradley (Eds.), Linguistics of the Sino-Tibetan area: The state of the art (pp. 125–137). Canberra: Australian National University.
Da, J. (2004). Chinese text computing. [http://lingua.mtsu.edu/chinese-computing].
Gandour, J. (1983). Tone perception in Far Eastern languages. Journal of Phonetics, 11, 149–175.
Gandour, J., & Harshman, R. A. (1978). Crosslanguage differences in tone perception: A multidimensional scaling investigation. Language and Speech, 21, 1–33.
Hart, B., & Risley, T. (1995). Meaningful differences in everyday experiences of young American children. Baltimore, MD: Brookes.
Hayes, B., Kirchner, R., & Steriade, D. (Eds.). (2004). Phonetically based phonology. Cambridge, UK: Cambridge University Press. doi: 10.1017/CBO9780511486401
Hayes, B., Wilson, C., & George, B. (2009). Maxent grammar tool. Java program. [http://www.linguistics.ucla.edu/people/hayes/MaxentGrammarTool/].
Hayes, B., Zuraw, K., Siptár, P., & Londe, Z. (2009). Natural and unnatural constraints in Hungarian vowel harmony. Language, 85, 822–863. doi: 10.1353/lan.0.0169
Hsieh, H.-I. (1970). The psychological reality of tone sandhi rules in Taiwanese. Chicago Linguistic Society, 6, 489–503.


Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59, 434–446. doi: 10.1016/j.jml.2007.11.007
Jäger, G. (2007). Maximum entropy models and stochastic Optimality Theory. In A. Zaenen, J. Simpson, T. H. King, J. Grimshaw, J. Maling, & C. Manning (Eds.), Architectures, rules and preferences: Variation on themes by Joan W. Bresnan (pp. 467–479). Stanford: CSLI Publications.
Kawahara, H., Masuda-Katsuse, I., & de Cheveigné, A. (1999). Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds. Speech Communication, 27, 187–207. doi: 10.1016/S0167-6393(98)00085-5
Kiparsky, P. (1973). Abstractness, opacity and global rules. In O. Fujimura (Ed.), Three dimensions of linguistic theory (pp. 57–86). Tokyo: Tokyo Institute for Advanced Studies of Language.
Kiparsky, P. (1982). Lexical morphology and phonology. In I. S. Yang (Ed.), Linguistics in the morning calm (pp. 3–91). Seoul: Hanshin.
Lin, H.-B. (1988). Contextual stability of Taiwanese tones. Ph.D. dissertation, University of Connecticut, Storrs.
Lin, H.-S. (2008). Variable directional applications in Tianjin tone sandhi. Journal of East Asian Linguistics, 17, 181–226. doi: 10.1007/s10831-008-9024-x
Mohanan, K. P. (1982). Lexical phonology. Ph.D. dissertation, MIT. Distributed by Bloomington: Indiana University Linguistics Club.
Moreton, E. (2004). Non-computable functions in Optimality Theory. In J. J. McCarthy (Ed.), Optimality Theory in phonology (pp. 141–164). Malden: Blackwell Publishing. doi: 10.1002/9780470756171.ch6
Peng, S.-H. (1997). Production and perception of Taiwanese tones in different tonal and prosodic contexts. Journal of Phonetics, 25, 371–400. doi: 10.1006/jpho.1997.0047
Pierrehumbert, J. (2006). The statistical basis of an unnatural alternation. In L. Goldstein, D. H. Whalen, & C. Best (Eds.), Laboratory phonology 8, varieties of phonological competence (pp. 81–107). Berlin: Mouton de Gruyter.
Prince, A., & Smolensky, P. (1993/2004). Optimality theory: Constraint interactions in generative grammar. Ms., New Brunswick: Rutgers University and Boulder: University of Colorado. Published 2004, Cambridge, MA: MIT Press.
Qian, Z.-Y., & Zhu, G.-Q. (1998). Jinanhua yindang (A record of the Jinan dialect). Shanghai: Shanghai Educational Press.
Rietveld, T., & Chen, A. J. (2006). How to obtain and process perceptual judgements of intonational meaning. In S. Sudhoff, D. Lenertová, R. Meyer, S. Pappert, P. Augurzky, I. Mleinek, N. Richter, & J. Schließer (Eds.), Methods in empirical prosody research (pp. 283–319). Berlin: Walter de Gruyter.
Rose, P. (1987). Considerations in the normalisation of the fundamental frequency of linguistic tone. Speech Communication, 6, 343–352.
Shue, Y.-L., Keating, P., & Vicenik, C. (2009). Voicesauce: A program for voice analysis. Journal of the Acoustical Society of America, 126, 2221. [http://www.ee.ucla.edu/~spapl/voicesauce]. doi: 10.1121/1.3248865
Steriade, D. (1999). Alternatives to syllable-based accounts of consonantal phonotactics. In O. Fujimura, B. Joseph, & B. Palek (Eds.), Proceedings of the 1998 Linguistics and Phonetics Conference (pp. 205–245). Prague: The Karolinum Press.


Steriade, D. (2008). The phonology of perceptibility effects: The P-map and its consequences for constraint organization. In S. Inkelas & K. Hanson (Eds.), The nature of the word: Studies in honor of Paul Kiparsky (pp. 151–180). Cambridge, MA: MIT Press.
Tagliaferri, B. (2011). Paradigm. Perception Research Systems Inc. [http://www.paradigmexperiments.com/].
Wang, H. (1993). Taiyu biandiao de xinli texing (On the psychological status of Taiwanese tone sandhi). Tsinghua Xuebao (Tsinghua Journal of Chinese Studies), 23, 175–192.
Wang, J. L. (2002). Youxuanlun he Tianjinhua de liandu biandiao ji qingsheng (Optimality Theory and the tone sandhi and tone neutralization in Tianjin). Zhongguo Yuwen (Studies of the Chinese Language), 2002, 363–371.
Wang, P. (1988). Changzhou fangyan de liandu biandiao (Tone sandhi in the Changzhou dialect). Fangyan (Dialects), 1988, 177–194.
Wang, Y. Z. (2008). Wuxi fangyan yuyin yanjiu (Studies on the phonological system of Wuxi dialect). MA thesis, Shanghai University.
White, J. (2013). Bias in phonological learning: Evidence from saltation. Ph.D. dissertation, University of California, Los Angeles.
Xu, J. Y. (2007). Wuxi fangyan shengdiao shiyan yanjiu (The experimental research on the tone of Wuxi dialect). MA thesis, Nanjing Normal University.
Yang, Z. X., Guo, H. T., & Shi, X. D. (1999). Tianjinhua yindang (A record of the Tianjin dialect). Shanghai: Shanghai Educational Press.
Yip, M. (1999). Feet, tonal reduction and speech rate at the word and phrase level in Chinese. In R. Kager & W. Zonneveld (Eds.), Phrasal phonology (pp. 171–194). Nijmegen: Nijmegen University Press.
Yip, M. (2004). Phonological markedness and allomorph selection in Zahao. Language and Linguistics, 5, 969–1001.
Yue-Hashimoto, A. (1987). Tone sandhi across Chinese dialects. In Chinese Language Society of Hong Kong (Ed.), Wang Li memorial volumes, English volume (pp. 445–474). Hong Kong: Joint Publishing Company.
Zee, E., & Maddieson, I. (1979). Tone and tone sandhi in Shanghai: Phonetic evidence and phonological analysis. UCLA Working Papers in Phonetics, 45, 93–129.
Zhang, J. (1999). Duration in the tonal phonology of Pingyao Chinese. In M. K. Gordon (Ed.), UCLA Working Papers in Linguistics, Papers in Phonology 3 (pp. 147–206). Los Angeles: UCLA.
Zhang, J. (2007). A directional asymmetry in Chinese tone sandhi systems. Journal of East Asian Linguistics, 16, 259–302. doi: 10.1007/s10831-007-9016-2
Zhang, J. (2014a). Tones, tonal phonology, and tone sandhi. In C.-T. J. Huang, Y.-H. A. Li, & A. Simpson (Eds.), The handbook of Chinese linguistics (pp. 443–464). Oxford, UK: Wiley-Blackwell. doi: 10.1002/9781118584552.ch17
Zhang, J. (2014b). Tone sandhi. In M. Aronoff (Ed.), Oxford bibliographies in linguistics. New York, NY: Oxford University Press. [http://www.oxfordbibliographies.com/view/document/obo-9780199772810/obo-9780199772810-0160.xml].
Zhang, J., Lai, Y. W., & Sailor, C. (2009). Opacity, phonetics, and frequency in Taiwanese tone sandhi. Chicago Linguistic Society, 1(43), 273–286.
Zhang, J., & Lai, Y. W. (2010). Testing the role of phonetic knowledge in Mandarin tone sandhi. Phonology, 27, 153–201. doi: 10.1017/S0952675710000060
Zhang, J., Lai, Y. W., & Sailor, C. (2011). Modeling Taiwanese speakers’ knowledge of tone sandhi in reduplication. Lingua, 121, 181–206. doi: 10.1016/j.lingua.2010.06.010


Zhang, J., & Liu, J. (2011). Tone sandhi and tonal coarticulation in Tianjin Chinese. Phonetica, 68(3), 161–191.
Zhang, J., & Liu, J. (2012). Patterns of tone sandhi productivity in Tianjin Chinese. In S. Boyce (Ed.), Proceedings of Meetings on Acoustics, 11, 060003: 160th Meeting of the Acoustical Society of America. doi: 10.1121/1.3573498
Zhang, J., & Meng, Y. L. (2012). Structure-dependent tone sandhi in real and nonce words in Shanghai Wu. In W. T. Gu (Ed.), Proceedings of the 3rd International Symposium on Tonal Aspects of Languages. Nanjing, China.
Zhang, J. W., Van de Velde, H., & Kager, R. (2011). Tone variation in the Wuxi dialect. ICPhS, XVII, 2288–2291. Hong Kong.
Zheng-Zhang, S. F. (1964). Wenzhou fangyan de liandu biandiao (Tone sandhi in the Wenzhou dialect). Zhongguo Yuwen (Studies of the Chinese Language), 1964, 106–152.
Zheng-Zhang, S. F. (1980). Wenzhou fangyan erweici de yuyin bianhua - 1 (Phonological changes in the diminutive suffix in the Wenzhou dialect - 1). Fangyan (Dialects), 1980, 245–262.
Zhu, X. N. (2004). Jipin guiyihua – ruhe chuli shengdiao de suiji chayi? (F0 normalization – How to deal with between-speaker tonal variations?) Yuyan Kexue (Linguistic Sciences), 3, 3–19.
Zuraw, K. (2000). Patterned exceptions in phonology. Ph.D. dissertation, University of California, Los Angeles.
Zuraw, K. (2007). The role of phonetic knowledge in phonological patterning: Corpus and survey evidence from Tagalog infixation. Language, 83, 277–316. doi: 10.1353/lan.2007.0105

Appendix

Real words

Base tones   Chinese   Transcription    Gloss                      Compounds freq.
T1+T1        开窗      kʰɛ tsʰɒ̃         to open the window         140
43+34        翻身      fɛ sən           to turn over the body      270
             西瓜      si ku            watermelon                 192
             花椒      hu tsiɔ          Chinese pepper             22
T1+T3        浇水      tɕiɔ sʮ          to water                   108
             招手      tsɔ ɕiəɯ         to wave hands              229
             山顶      sɛ tin           mountain top               200
             东海      toŋ xɛ           east sea                   181
T1+T5        收费      ɕiəɯ fi          to charge a fee            126
             通信      tʰoŋ sin         to communicate             211
             青菜      tsʰin tsʰɛ       green leaf vegetable       96
             抽屉      tɕʰiəɯ tʰi       drawer                     420
T3+T1        写书      sia sʮ           to write a book            75
44+55        打呼      tæ̃ xu            to snore                   36
             饼干      pin kʊ           cookie                     102
             宝刀      pɔ tɔ            precious knife             48
T3+T3        炒股      tsʰɔ ku          to invest in stocks        56
             改口      kɛ kʰɛi          to correct oneself         77
             喜酒      ɕi tsɛi          wedding feast              81
             警犬      tɕin tɕʰyʊ       police dog                 20
T3+T5        讲课      kɒ̃ kʰəɯ          to give a lecture          98
             喘气      tsʰʊ tɕʰi        to take a breath           154
             苦笑      kʰu siɔ          bitter smile               460
             彩票      tsʰɛ pʰiɔ        lottery                    12
T5+T1        化妆      xu tsɒ̃           to put on make-up          302
55+31        唱歌      tsʰæ̃ kəɯ         to sing a song             488
             汽车      tɕʰi tsʰeɯ       car                        2475
             线衫      siɪ sɛ           sweater                    35
T5+T3        散伙      sɛ xəɯ           to separate                35
             泡澡      pʰɔ tsɔ          to take a bath             12
             报纸      pɔ tsʮ           newspaper                  1146
             记者      tɕi tsa          reporter                   869
T5+T5        放假      fɒ̃ tɕia          to have a vacation         124
             进货      tsin xəɯ         to stock with goods        47
             志向      tsʮ ɕiæ̃          aspiration                 48
             抗战      kʰɒ̃ tsʊ          wars against aggression    169

Pseudo words

             Pseudo1                     Pseudo2
Base tones   Chinese   Transcription     Chinese   Transcription
T1+T1        煎弯      tsiɪ uɛ           疚军      tɕiəɯ tɕyən
43+34        消涛      siɔ tʰɔ           叨松      tɔ soŋ
             秋街      tsʰɛi ka          蒿冰      xɔ pin
             灯蛙      tən ua            筝坡      tsən pʰəɯ
T1+T3        煎展      tsiɪ tsʊ          疚掌      tɕiəɯ tsæ̃
             消体      siɔ tʰiɪ          叨土      tɔ tʰəɯ
             秋彩      tsʰɛi tsʰɛ        蒿主      xɔ tsʮ
             灯狗      tən kɛi           筝底      tsən ti
T1+T5        煎伞      tsiɪ sɛ           疚顿      tɕiəɯ tən
             消戏      siɔ ɕi            叨贩      tɔ fɛ
             秋炮      tsʰɛi pʰɔ         蒿蒜      xɔ sʊ
             灯素      tən səɯ           筝剑      tsən tsiɪ
T3+T1        垮弯      kʰua uɛ           齿军      tsʰɿ tɕyən
44+55        卷涛      tɕyʊ tʰɔ          帚松      tɕiəɯ soŋ
             毯街      tʰɛ ka            柬冰      tɕiɪ pin
             巧蛙      tɕʰiɔ ua          袄坡      ɔ pʰəɯ
T3+T3        垮展      kʰua tsʊ          齿掌      tsʰɿ tsæ̃
             卷体      tɕyʊ tʰiɪ         帚土      tɕiəɯ tʰəɯ
             毯彩      tʰɛ tsʰɛ          柬主      tɕiɪ tsʮ
             巧狗      tɕʰiɔ kɛi         袄底      ɔ ti
T3+T5        垮伞      kʰua sɛ           齿顿      tsʰɿ tən
             卷戏      tɕyʊ ɕi           帚贩      tɕiəɯ fɛ
             毯炮      tʰɛ pʰɔ           柬蒜      tɕiɪ sʊ
             巧素      tɕʰiɔ səɯ         袄剑      ɔ tsiɪ
T5+T1        替弯      tʰi uɛ            锢军      ku tɕyən
55+31        照涛      tsɔ tʰɔ           涕松      tʰi soŋ
             帅街      sua ka            沛冰      pʰɛ pin
             脆蛙      tsʰɛ ua           尬坡      ka pʰəɯ
T5+T3        替展      tʰi tsʊ           锢掌      ku tsæ̃
             照体      tsɔ tʰiɪ          涕土      tʰi tʰəɯ
             帅彩      sua tsʰɛ          沛主      pʰɛ tsʮ
             脆狗      tsʰɛ kɛi          尬底      ka ti
T5+T5        替伞      tʰi sɛ            锢顿      ku tən
             照戏      tsɔ ɕi            涕贩      tʰi fɛ
             帅炮      sua pʰɔ           沛蒜      pʰɛ sʊ
             脆素      tsʰɛ səɯ          尬剑      ka tsiɪ


Novel words

Base tones  Cue sentences

T1(53)+X    要是上网买东西叫做 tsia…
            “If to shop online is called tsia53…”
T1+T1       如果黄金还没有囗,那么也还可以讲还没(囗金)。
            “If the gold has not been tsia53-ed, then we can say that we have not (囗 gold).”
T1+T3       如果游艇还没有囗,那么也还可以讲还没(囗艇)。
            “If the yacht has not been tsia53-ed, then we can say that we have not (囗 yacht).”
T1+T5       如果门票还没有囗,那么也还可以讲还没(囗票)。
            “If the ticket has not been tsia53-ed, then we can say that we have not (囗 ticket).”

T1(53)+X    要是用飞船运输叫做 kuən…
            “If to transport via a spaceship is called kuən53…”
T1+T1       如果猪还没有囗,那么也还可以讲还没(囗猪)。
            “If pigs have not been kuən53-ed, then we can say that we have not (囗 pigs).”
T1+T3       如果鼓还没有囗,那么也还可以讲还没(囗鼓)。
            “If drums have not been kuən53-ed, then we can say that we have not (囗 drums).”
T1+T5       如果菜还没有囗,那么也还可以讲还没(囗菜)。
            “If dishes have not been kuən53-ed, then we can say that we have not (囗 dishes).”

T1(53)+X    要是有一种形状叫做 tsʰia…
            “If there is a shape called tsʰia53…”
T1+T1       如果客厅是这个形状囗,那么也还可以讲这个是一个(囗厅)。
            “If a living room has this shape tsʰia53, then we can call it a (囗 room).”
T1+T3       如果手表是这个形状囗,那么也还可以讲这个是一个(囗表)。
            “If a watch has this shape tsʰia53, then we can call it a (囗 watch).”
T1+T5       如果书架是这个形状囗,那么也还可以讲这个是一个(囗架)。
            “If a bookshelf has this shape tsʰia53, then we can call it a (囗 shelf).”

T1(53)+X    要是有一种气味叫做 pʰɒ̃53…
            “If there is a smell called pʰɒ̃53…”
T1+T1       如果一朵花是这个味道囗,那么也还可以讲这个是一朵(囗花)。
            “If a flower has this smell pʰɒ̃53, then we can call it a (囗 flower).”
T1+T3       如果一棵草是这个味道囗,那么也还可以讲这个是一棵(囗草)。
            “If a piece of grass has this smell pʰɒ̃53, then we can call it a (囗 grass).”
T1+T5       如果一种酱是这个味道囗,那么也还可以讲这个是一种(囗酱)。
            “If a type of jam has this smell pʰɒ̃53, then we can call it a (囗 jam).”

T3(323)+X   要是有一种卖东西的方式叫做 pʰɛ323…
            “If a form of sales is called pʰɛ323…”
T3+T1       如果黄金还没有囗,那么也可以讲还没(囗金)。
            “If the gold has not been pʰɛ323-ed, then we can say that we have not (囗 gold).”
T3+T3       如果游艇还没有囗,那么也还可以讲还没(囗艇)。
            “If the yacht has not been pʰɛ323-ed, then we can say that we have not (囗 yacht).”
T3+T5       如果门票还没有囗,那么也还可以讲还没(囗票)。
            “If the ticket has not been pʰɛ323-ed, then we can say that we have not (囗 ticket).”

T3(323)+X   要是有一种走私的方式叫做 kʰuɛ323…
            “If a form of smuggling is called kʰuɛ323…”
T3+T1       如果猪还没有囗,那么也还可以讲还没(囗猪)。
            “If pigs have not been kʰuɛ323-ed, then we can say that we have not (囗 pigs).”
T3+T3       如果鼓还没有囗,那么也还可以讲还没(囗鼓)。
            “If drums have not been kʰuɛ323-ed, then we can say that we have not (囗 drums).”
T3+T5       如果菜还没有囗,那么也还可以讲还没(囗菜)。
            “If dishes have not been kʰuɛ323-ed, then we can say that we have not (囗 dishes).”

T3(323)+X   要是有一种人造材料叫做 ʨyn323…
            “If a man-made material is called ʨyn323…”
T3+T1       如果客厅是这种材料囗,那么也还可以讲这个是一个(囗厅)。
            “If a living room is made of this material ʨyn323, then we can call it a (囗 room).”
T3+T3       如果手表是这种材料囗,那么也还可以讲这个是一个(囗表)。
            “If a watch is made of this material ʨyn323, then we can call it a (囗 watch).”
T3+T5       如果书架是这种材料囗,那么也还可以讲这个是一个(囗架)。
            “If a bookshelf is made of this material ʨyn323, then we can call it a (囗 shelf).”

T3(323)+X   要是有一个国家叫做 tsʰei323…
            “If a country is called tsʰei323…”
T3+T1       如果一种花产在这个国家囗,那么也还可以讲这个是一种(囗花)。
            “If a type of flower comes from this country tsʰei323, then we can call it a (囗 flower).”
T3+T3       如果一种草产在这个国家囗,那么也还可以讲这个是一种(囗草)。
            “If a type of grass comes from this country tsʰei323, then we can call it a (囗 grass).”
T3+T5       如果一种酱产在这个国家囗,那么也还可以讲这个是一种(囗酱)。
            “If a type of jam comes from this country tsʰei323, then we can call it a (囗 jam).”

T5(34)+X    要是有一种送东西的方式叫做 la34…
            “If there is a form of gift-giving called la34…”
T5+T1       如果黄金还没有囗,那么也可以讲还没(囗金)。
            “If the gold has not been la34-ed, then we can say that we have not (囗 gold).”
T5+T3       如果游艇还没有囗,那么也还可以讲还没(囗艇)。
            “If the yacht has not been la34-ed, then we can say that we have not (囗 yacht).”
T5+T5       如果门票还没有囗,那么也还可以讲还没(囗票)。
            “If the ticket has not been la34-ed, then we can say that we have not (囗 ticket).”

T5(34)+X    要是有一种研究方式叫做 tsɿ34…
            “If there is a research method called tsɿ34…”
T5+T1       如果猪还没有囗,那么也还可以讲还没(囗猪)。
            “If pigs have not been tsɿ34-ed, then we can say that we have not (囗 pigs).”
T5+T3       如果鼓还没有囗,那么也还可以讲还没(囗鼓)。
            “If drums have not been tsɿ34-ed, then we can say that we have not (囗 drums).”
T5+T5       如果菜还没有囗,那么也还可以讲还没(囗菜)。
            “If dishes have not been tsɿ34-ed, then we can say that we have not (囗 dishes).”

T5(34)+X    要是有一种颜色叫做 piɔ34…
            “If there is a color called piɔ34…”
T5+T1       如果客厅是这种颜色囗,那么也还可以讲这个是一个(囗厅)。
            “If a living room has this color piɔ34, then we can call it a (囗 room).”
T5+T3       如果手表是这种颜色囗,那么也还可以讲这个是一个(囗表)。
            “If a watch has this color piɔ34, then we can call it a (囗 watch).”
T5+T5       如果书架是这种颜色囗,那么也还可以讲这个是一个(囗架)。
            “If a bookshelf has this color piɔ34, then we can call it a (囗 shelf).”

T5(34)+X    要是有一种果实叫做 ɕyʊ34…
            “If there is a fruit called ɕyʊ34…”
T5+T1       如果有一种花的果实是囗,那么也还可以讲这个是一种(囗花)。
            “If a type of flower has this fruit ɕyʊ34, then we can call it a (囗 flower).”
T5+T3       如果有一种草的果实是囗,那么也还可以讲这个是一种(囗草)。
            “If a type of grass has this fruit ɕyʊ34, then we can call it a (囗 grass).”
T5+T5       如果有一种酱出自这种果实囗,那么也还可以讲这个是一种(囗酱)。
            “If a type of jam is made from this fruit ɕyʊ34, then we can call it a (囗 jam).”

Authors’ addresses

Hanbo Yan
Department of Linguistics
University of Kansas
Lawrence, KS USA 66045-3129
[email protected]

Jie Zhang
Department of Linguistics
1541 Lilac Lane
Blake Hall, 420E
Lawrence, KS USA 66045-3129
[email protected]
