“Voicing”-tone interaction in Xiangxiang Chinese

Ting Zeng City University of Hong Kong

This paper investigates the phonetic characteristics of the voiced stops in Xiangxiang Chinese (XXC) and questions some general assumptions about “voicing”-tone interaction. The data show that the VOT values of the voiced stops are distributed along a continuum from long-lead to short-lag values, with the short-lag values overlapping with those of the voiceless unaspirated stops. A closer examination of the acoustic characteristics of the voiced stops reveals four patterns of variation for each speaker: a simple voiceless stop; a voiced stop; a voiced implosive; and a voiced stop in which voicing during the oral closure dies out before the release instead of continuing into the following vowel. There are seven tones in XXC, and voiced stops co-occur only with low tones. These results raise questions about the usual assumptions concerning the mapping between phonological representations of features and their physical correlates, as well as the relationship between variation and historical sound change.

1. Introduction

Xiangxiang Chinese (hereafter XXC) is a sub-dialect of the Xiang Dialects of Chinese. According to the impressionistic descriptions in Zeng (2001) and Zeng (2005), XXC has a three-way distinction of stops: voiceless unaspirated, voiceless aspirated and voiced (shown in Table 1); voiced stops co-occur only with low tones.

Table 1: The three series of stops in XXC

Voiceless unaspirated   Voiceless aspirated   Voiced
p                       ph                    b
t                       th                    d
k                       kh                    g

This paper aims at (i) investigating the phonetic characteristics of the voiced stops; (ii) conducting an acoustic experiment on tones; (iii) raising questions about the general assumptions of “voicing”-tone interaction based on the data in XXC.

Toronto Working Papers in Linguistics 28: 395–405. Copyright © 2008 Ting Zeng


2. Methodology

The wordlist for the investigation of the stops includes all monosyllabic words in XXC that begin with any of the nine stops; the wordlist for the study of tones is given in Table 2. Four native speakers, two male and two female, provided the speech data for the stops, and two of them, one male and one female, provided the speech data for the tones. All four were around 50 years old and were born and grew up in Xiangxiang. The recordings were made in a quiet room with a Sony PCM-R700 digital audio recorder and a Shure SM-58 microphone. The speakers were asked to read each test word four times in a natural manner, with an interval of 2 to 3 seconds.

Table 2: Test words for the study of tones

Test tones    Test words
              Group 1     Group 2     Group 3
Yinping       i55 衣      u55 乌      y55 淤
Yangping      di23 题     du23 图     dy23 局
              thi23 踢    thu23 凸    thy23 出
Ciyangping    i34 一      u34 吴      y34 余
Shangsheng    i21 已      u21 五      y21 与
Yinqu         i45 意      u45 物      y45 玉
Ciyinqu       thi25 替    thu25 兔    thy25 处
Yangqu        i22 异      u22 务      y22 寓

Note: Ping, Shang, Qu and Ru are the four tones established in Middle Chinese (M.C., 200 AD - 900 AD). Tonal splits then took place, resulting in different registers, known in traditional terminology as Yin, Yang, Ciyin and Ciyang.

To get the F0 values of each tone, 11 points were sampled at every 10% of the overall duration of the F0 contour for each monosyllabic word. The F0 values were then converted to pitch values by using the formula shown in (1):

(1) $T_i = 5 \times \dfrac{\lg x_i - \lg x_{\min}}{\lg x_{\max} - \lg x_{\min}}$
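The sampling step and the conversion in (1) can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually used for the paper: the function names are hypothetical, linear interpolation between F0 frames is an assumption, and x_min and x_max are assumed to be the lowest and highest F0 values in the speaker's tonal range, which the text does not state explicitly.

```python
import math


def sample_contour(times, f0, n_points=11):
    """Interpolate an F0 track at n_points equally spaced points.

    With n_points=11 the points fall at 0%, 10%, ..., 100% of the contour
    duration. `times` must be ascending and hold at least two frames.
    Linear interpolation is an assumption of this sketch.
    """
    start, end = times[0], times[-1]
    samples = []
    for k in range(n_points):
        t = start + (end - start) * k / (n_points - 1)
        t = min(max(t, start), end)  # guard against floating-point overshoot
        for j in range(len(times) - 1):
            if times[j] <= t <= times[j + 1]:
                span = times[j + 1] - times[j]
                weight = (t - times[j]) / span if span else 0.0
                samples.append(f0[j] + weight * (f0[j + 1] - f0[j]))
                break
    return samples


def t_values(f0_points, f0_min, f0_max):
    """Apply formula (1): map F0 values (Hz) onto a 0-5 logarithmic scale.

    f0_min and f0_max stand for x_min and x_max; treating them as the
    speaker's overall F0 minimum and maximum is an assumption.
    """
    lo, hi = math.log10(f0_min), math.log10(f0_max)
    return [5 * (math.log10(x) - lo) / (hi - lo) for x in f0_points]
```

By construction, the lowest F0 in the range maps to 0 and the highest to 5, so contours from speakers with different pitch ranges become comparable on one scale.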

3. Results

3.1. Stops

The acoustic data for the stops in XXC show several patterns. Table 3 gives the mean VOT values and standard deviations of each stop for each speaker. The voiceless aspirated stops are well distinguished from the other two categories by their long-lag VOT values. The voiced category (the b, d and g rows of Table 3) shows considerable intra-speaker variation along the VOT dimension for each speaker: some tokens have long-lead values, while others have short-lag values that overlap with the voiceless unaspirated category.


Table 3: Mean VOT values and standard deviations (ms) of each stop for each speaker

Speaker 1
  stop    p      t      k      ph     th     kh
  mean    9      11     20     78     85     97
  SD      1      2      5      17     24     19
  stop    b      b      d      d      g      g
  mean    -83    10     -82    7      -86    22
  SD      36     5      35     4      38     5

Speaker 2
  stop    p      t      k      ph     th     kh
  mean    8      6      12     56     60     80
  SD      2      1      2      12     11     24
  stop    b      b      d      d      g      g
  mean    -50    6      -41    6      -34    26
  SD      36     5      32     2      25     16

Speaker 3
  stop    p      t      k      ph     th     kh
  mean    6      6      17     71     63     78
  SD      1      2      4      23     20     24
  stop    b      b      d      d      g      g
  mean    -97    7      -77    8      -65    16
  SD      46     4      34     3      38     5

Speaker 4
  stop    p      t      k      ph     th     kh
  mean    9      6      18     88     73     99
  SD      2      1      6      15     14     17
  stop    b      b      d      d      g      g
  mean    -60    9      -57    7      -71    14
  SD      32     1      42     3      37     8
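The overlap between the short-lag variants of the voiced stops and the voiceless unaspirated stops can be checked directly against the means in Table 3. The sketch below copies those means and reports, for each speaker and stop pair, how far the short-lag mean of the voiced stop lies from the mean of its voiceless unaspirated counterpart. It assumes that the second (positive) value in each voiced pair of Table 3 is the mean of the short-lag tokens; the variable and function names are hypothetical.

```python
# Mean VOT in ms, copied from Table 3.
# Each entry: (voiceless unaspirated mean, short-lag mean of the voiced stop).
# Reading the second value of each voiced pair as the short-lag mean
# is an assumption of this sketch.
TABLE3_SHORT_LAG = {
    "speaker 1": {"p/b": (9, 10), "t/d": (11, 7), "k/g": (20, 22)},
    "speaker 2": {"p/b": (8, 6), "t/d": (6, 6), "k/g": (12, 26)},
    "speaker 3": {"p/b": (6, 7), "t/d": (6, 8), "k/g": (17, 16)},
    "speaker 4": {"p/b": (9, 9), "t/d": (6, 7), "k/g": (18, 14)},
}


def short_lag_differences(table):
    """Return voiced short-lag mean minus voiceless unaspirated mean (ms)."""
    return {
        speaker: {pair: voiced - voiceless
                  for pair, (voiceless, voiced) in pairs.items()}
        for speaker, pairs in table.items()
    }


differences = short_lag_differences(TABLE3_SHORT_LAG)
largest_gap = max(abs(d) for pairs in differences.values()
                  for d in pairs.values())
print(differences)
print("largest absolute difference (ms):", largest_gap)  # 14 (speaker 2, k/g)
```

All but one of the twelve differences are within 4 ms, which is small relative to the gap of several tens of milliseconds separating both categories from the voiceless aspirated stops.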

The figures below further show that for each speaker the VOT values of the voiced stops are distributed along a continuum from long-lead to short-lag values, with the short-lag values overlapping with those of the voiceless unaspirated stops. (In each figure, the horizontal axis is the VOT dimension and the vertical axis is the percentage of occurrence.)

[Figure 1 panels: VOT distributions of b vs. p, d vs. t and g vs. k for each of the four speakers, labelled chen (Speaker 1), shen (Speaker 2), ding (Speaker 3) and huang (Speaker 4); horizontal axis: time (ms).]

Figure 1: VOT measurements of the voiced and voiceless unaspirated stops

A closer examination of the acoustic characteristics of the voiced stops revealed four patterns of variation for each speaker:

• simple voiceless stop
• voiced stop
• voiced implosive
• stop where voicing during the oral closure dies out before the voiceless release of the stop, instead of continuing into the following vowel.

Figures 2-6 show the waveforms and spectrograms of the four types of variation of the prevocalic voiced stops. Figure 2 shows the first variant, a simple voiceless stop, with a short-lag VOT value of 5 ms. The stops in Figures 3 and 4 have VOT values of –108 ms and –137 ms respectively. The difference is that in Figure 3 the amplitude of vibration stays roughly constant while the oral closure is maintained, whereas in Figure 4 it decreases toward the following vowel. Both figures show the second type of variant, the voiced stop. The decrease in amplitude in Figure 4 can be explained by the fact that the trans-glottal pressure is not sufficient to maintain high-amplitude vocal cord vibration for a long time. The account of the variant in Figure 6, in which voicing during the oral closure dies out before the release instead of continuing into the following vowel, is also aerodynamic: the closure is too long for positive trans-glottal pressure to be maintained, and vocal cord vibration stops once the trans-glottal pressure becomes zero or even negative before the release.
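The aerodynamic reasoning in this paragraph can be stated compactly. The following is a sketch under textbook assumptions, not a formula given in this paper; the symbols P_sub, P_oral and the threshold P_theta are introduced here purely for illustration.

```latex
% Trans-glottal pressure: subglottal pressure minus oral (supraglottal) pressure
\Delta P(t) = P_{\mathrm{sub}}(t) - P_{\mathrm{oral}}(t)

% Assumed condition for vocal cord vibration to continue during the closure,
% with a hypothetical threshold P_theta
\Delta P(t) > P_{\theta}, \qquad P_{\theta} \ge 0
```

While the oral closure is held, air passing through the glottis accumulates behind the closure, so the oral pressure rises and the trans-glottal pressure shrinks. On this reading, the declining amplitude in Figure 4 corresponds to a shrinking pressure difference, and the cessation of voicing in Figure 6 to a pressure difference that reaches the threshold before the release.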


Figure 2: VOT = 5 ms

Figure 3: VOT = -108 ms

Figure 4: VOT = -137 ms

Figure 5: VOT = -101 ms

Figure 6

In Figure 5 the stop has a VOT value of –101 ms, and the amplitude of vibration increases while the oral closure is maintained. Increasing amplitude of vibration during the oral closure is typical of voiced implosives. It has been reported by Lindau (1984) for the voiced implosives attested in a number of languages spoken in the southeast of Nigeria (see Figure 7), by Cun (2004) for the voiced implosives in Wuyang Dialect, a member of the Yue Dialects of Chinese (see Figure 8), and by Cun (2005) for the voiced implosives in Wenchang Dialect, where the contrast between voiced stops and implosives is present (see Figure 9, in which the waveform on the left, with increasing amplitude of vibration during the oral closure of the stop, is transcribed as [ɓi], and the waveform on the right, with a different configuration, i.e. decreasing amplitude of vibration, is transcribed as [bi]). The waveforms of [ɓi] and [bi] in XXC are shown in Figure 10 for comparison. Ladefoged & Maddieson (1996) attribute this increasing amplitude of vibration to the fact that the lowering of the larynx is more than sufficient to counteract the pressure buildup in the oral cavity.
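The four variants described above differ in how the amplitude of voicing behaves during the oral closure, so the distinctions can be sketched as a simple decision rule. This is only an illustration of the descriptive criteria, not the method used in this study, where the variants were identified from waveforms and spectrograms. The function name, the input (a list of relative voicing amplitudes sampled across the closure) and both thresholds are hypothetical.

```python
def classify_closure(amplitudes, voicing_floor=0.05, rise_ratio=1.2):
    """Label a stop from the voicing amplitude sampled across its closure.

    amplitudes    : relative amplitudes from closure onset to release
    voicing_floor : hypothetical level below which a sample counts as unvoiced
    rise_ratio    : hypothetical growth factor that counts as "increasing"
    """
    voiced = [a >= voicing_floor for a in amplitudes]
    if not any(voiced):
        # No vocal cord vibration during the closure (cf. Figure 2).
        return "voiceless stop"
    if not voiced[-1]:
        # Voicing started but ceased before the release (cf. Figure 6).
        return "voiced stop, voicing dies out before release"
    first_voiced = next(a for a in amplitudes if a >= voicing_floor)
    if amplitudes[-1] >= rise_ratio * first_voiced:
        # Amplitude grows through the closure (cf. Figure 5).
        return "voiced implosive"
    # Steady or decreasing amplitude (cf. Figures 3 and 4).
    return "voiced stop"


print(classify_closure([0.0, 0.0, 0.0]))
print(classify_closure([0.3, 0.25, 0.2]))
print(classify_closure([0.1, 0.2, 0.3]))
print(classify_closure([0.2, 0.2, 0.01]))
```

The rule treats steady and decreasing amplitude alike, following the text's grouping of Figures 3 and 4 as the same variant.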


Figure 7: The waveforms of intervocalic [b] and [ɓ] (Lindau 1984)

Figure 8: The waveform of the voiced implosive in Wuyang Dialect (Cun 2004)

[ɓi] [bi]

Figure 9: The waveforms of [ɓi] and [bi] in Wenchang Dialect (Cun 2005)


[Figure 10 panels: waveforms of [ɓi] (left) and [bi] (right); horizontal axis: Time (s).]

Figure 10: The waveforms of [ɓi] and [bi] in XXC

3.2. Tones

The results of the acoustic study of citation tones in XXC are given in Figures 11 and 12, which show the pitch curves on Chao’s 5-point scale, and in Table 4, which gives the pitch values of the tones and the stops with which each tone co-occurs.

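[Figure 11 plot: pitch curves of Yinping, Yangping, Ciyangping, Shangsheng, Yinqu, Ciyinqu and Yangqu; vertical axis: 5-point scale (0 to 5); horizontal axis: sample points 1 to 11.]

Figure 11: Pitch curves in Chao’s 5-point scale for Speaker 1

[Figure 12 plot: pitch curves of Yinping, Yangping, Ciyangping, Shangsheng, Yinqu, Ciyinqu and Yangqu; vertical axis: 5-point scale (0 to 5); horizontal axis: sample points 1 to 11.]

Figure 12: Pitch curves in Chao’s 5-point scale for Speaker 3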

The above figures show that the pitch curves of the two speakers are highly consistent with each other: Yinping is 44; Yangping is basically 23, since the major portion of its pitch curve falls within the range of 2 and 3 on the 5-point scale; Ciyangping is 34 and Shangsheng is 31; Yinqu is 45; Ciyinqu is lower than Yinqu but shares a similar end point with it, so it is described as 25; Yangqu is 33.
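Reading tone numerals off a pitch curve on the 0 to 5 scale can be sketched as follows. The mapping used here (values in 0-1 map to degree 1, values above 1 and up to 2 map to degree 2, and so on) is one common convention and is an assumption, as are the function names. The labels in this paper were assigned by inspecting where the major portion of each curve falls, not by this endpoint rule.

```python
import math


def chao_digit(t):
    """Map a value on the 0-5 scale to a Chao degree from 1 to 5.

    Assumed convention: [0, 1] -> 1, (1, 2] -> 2, ..., (4, 5] -> 5.
    """
    return min(5, max(1, math.ceil(t)))


def chao_label(t_contour):
    """Label a contour by the degrees of its first and last sample points."""
    return f"{chao_digit(t_contour[0])}{chao_digit(t_contour[-1])}"


# Hypothetical contour that starts in the second band and ends in the third.
print(chao_label([1.5, 2.0, 2.6]))  # 23
```

An endpoint rule like this is sensitive to onset and offset perturbations of the curve, which is one reason a label based on the major portion of the curve, as in the description of Yangping above, may be preferable.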


Table 4: Pitch values of the citation tones and consonants with which they co-occur

M.C. tone     Name          Onset stop types                              Pitch value
I (Ping)      Yinping       voiceless unaspirated, voiceless aspirated    44
              Yangping      voiceless aspirated, voiced                   23
              Ciyangping    voiceless unaspirated                         34
II (Shang)    Shangsheng    voiceless unaspirated, voiceless aspirated    31
III (Qu)      Yinqu         voiceless unaspirated                         45
              Ciyinqu       voiceless aspirated                           25
              Yangqu        voiced                                        33
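The co-occurrence sets discussed in this section can be read off Table 4 mechanically by inverting it from "tone to onset types" into "onset type to tones". The sketch below encodes the rows of Table 4 as they appear there; the variable and function names are hypothetical.

```python
# Rows of Table 4: (tone name, onset stop types, pitch value).
TABLE4 = [
    ("Yinping", ("voiceless unaspirated", "voiceless aspirated"), "44"),
    ("Yangping", ("voiceless aspirated", "voiced"), "23"),
    ("Ciyangping", ("voiceless unaspirated",), "34"),
    ("Shangsheng", ("voiceless unaspirated", "voiceless aspirated"), "31"),
    ("Yinqu", ("voiceless unaspirated",), "45"),
    ("Ciyinqu", ("voiceless aspirated",), "25"),
    ("Yangqu", ("voiced",), "33"),
]


def tones_by_onset(rows):
    """Group pitch values by the onset stop type they co-occur with."""
    result = {}
    for _name, onsets, pitch in rows:
        for onset in onsets:
            result.setdefault(onset, []).append(pitch)
    return result


for onset, tones in tones_by_onset(TABLE4).items():
    print(f"{onset}: /{', '.join(tones)}/")
```

The voiced onsets end up with exactly two tones, 23 and 33, neither of which occurs after voiceless unaspirated stops.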

From Table 4 we can see that the stop types differ in the tones with which they co-occur: tones following voiceless unaspirated stops are /44, 34, 31, 45/; those following voiceless aspirated stops are /44, 23, 31, 25/; those following voiced stops are /23, 33/. Taking into account the historical development of voicing and tones in XXC reveals the following patterns (the discussion here does not cover the voiceless aspirated stops, which pattern differently from the other two types of stops in consonant-tone interaction):

• Among syllables corresponding to the historical tone type Ping, those with voiced stop onset co-occur with low tone 23; those with voiceless unaspirated stop onset co-occur with high tones 44 or 34.
• Among the syllables corresponding to the historical tone type Shang, those with voiceless unaspirated stop onset co-occur with a falling tone 31, while those with voiced stop onset merge with Yangqu syllables and co-occur with low tone 33.
• Among the Qu syllables, those with voiced stop onset co-occur with low tone 33, while those with voiceless unaspirated stop onset co-occur with high tone 45.
• The historical voiced stop onsets in Ru syllables all become voiceless. Syllables belonging to this tonal category have been distributed among other tonal categories, and those with voiceless unaspirated stop onset pattern with Ciyangping or Yinqu and co-occur with high tone 34 and 45.

The above patterns show that in XXC the voicing of prevocalic stops does interact with tones in the historical development, i.e. voiced stops co-occur with low tones and voiceless unaspirated stops co-occur with high tones.

4. Discussion and conclusion

4.1. Discussion

The considerable intra-speaker variation of voiced stops observed in XXC reveals the lack of invariant physical correlates of phonological features. The quest for such correlates is inspired by the assumption that a single, abstract phonological representation is made up of sets of feature specifications, each of which would have invariant correlates in the physical domain. For the feature [voice], [+voice] stops should then have negative VOT values and [-voice] stops positive VOT values in the acoustic domain. Instead of having invariant lead VOT values, however, the XXC voiced stops have their VOT values distributed along a continuum from long-lead to short-lag values, with the short-lag values overlapping with those of the voiceless unaspirated stops. It is widely accepted that voiced stops and voiced implosives have distinct effects on the pitch of the following vowel: voiced stops lower pitch and implosives raise pitch (Greenberg 1970, Ohala 1976). However, the data from this study complicate this picture in that the voiced stops in XXC are phonetically realized as, among others, voiced stops or implosives, both co-occurring with low tones. The co-occurrence of voiced stops with low tones and voiceless stops with high tones in XXC shows that the voicing distinction of the prevocalic stops is present and that, at the same time, the pitch difference of the following vowel due to this voicing distinction is maintained. It is often assumed that historical sound change results from listeners' reinterpretation of a previously intrinsic cue after the loss of the main cue. On this assumption, the historical development of tones is explained by the listeners’ reinterpretation of the previously secondary cue (the pitch difference of vowels following the voiceless series versus the voiced series) after the loss of the primary cue, i.e. the voicing distinction of the prevocalic stops (Hombert 1978). XXC therefore mirrors an intermediate stage in the development of tones where both the primary and secondary cues are present.
In XXC the voiced category exhibits considerable variation while the tones are quite systematic across speakers, which further shows that voicing of the stops is redundant while the pitch difference of the following vowel is primary and phonological. The underlying trigger for this asymmetry might be perceptual: the pitch difference of the vowel is auditorily more prominent than the voicing difference of the prevocalic stop, and listeners will therefore identify the former as primary and the latter as secondary or redundant, or will identify only the former although both differences are present. The considerable intra-speaker variation of the voiced category may also mirror an intermediate stage in its historical progression toward devoicing. The phonetic manifestation of the voiced stops in XXC also has implications for theories of sound change. It raises questions about the usual assumptions and practice of phonological reconstruction. If there is such a wide range of variants in a contemporary language, what about the parent languages? Do variant sounds in daughter languages really develop from one single phonetic shape in the parent language? These questions will be the focus of future study.

4.2. Conclusion

This paper investigates the phonetic characteristics of the voiced stops, conducts an acoustic experiment on tones and raises questions about the general assumptions of “voicing”-tone interaction based on the data in XXC. The data show that (1) for the voiced stops the VOT values are distributed along a continuum from long-lead to short-lag values, with the short-lag values overlapping with those of the voiceless unaspirated stops. A closer examination of the acoustic characteristics of the voiced stops revealed four patterns of variation for each speaker: (i) simple voiceless stop; (ii) voiced stop; (iii) voiced implosive; (iv) voiced stop where voicing during the oral closure dies out before the release of the stop, instead of continuing into the following vowel. (2) There are seven tones in XXC; voiced stops co-occur with low tones and voiceless unaspirated stops co-occur with high tones. These data raise questions about the usual assumptions about (i) the mapping between phonological representations of features and their physical correlates; (ii) the interactions of stop voicing and tones; (iii) the relationship between variation and historical sound change.

References

Cun, Xi. 2004. The Phonetic Characteristics of Implosives in the Wuyang Dialect. MPhil thesis, Hong Kong University of Science and Technology, Hong Kong.
Cun, Xi. 2005. “The phonetic characteristics of implosives in two Chinese dialects.” Presented at PaPI (Phonetics and Phonology in Iberia) 2005, Barcelona, Spain.
Greenberg, Joseph. 1970. “Some generalizations concerning glottalic consonants, especially implosives.” International Journal of American Linguistics 36, 123–145.
Hombert, Jean-Marie. 1978. “Consonant types, vowel quality, and tone.” In Tone: A Linguistic Survey, Victoria A. Fromkin (ed.), pp. 77–112. New York: Academic Press.
Ladefoged, Peter and Ian Maddieson. 1996. The Sounds of the World’s Languages. Oxford: Blackwell.
Lindau, Mona. 1984. “Phonetic differences in glottalic consonants.” Journal of Phonetics 12, 147–155.
Ohala, John. 1976. “A model of speech aerodynamics.” Report of the Phonology Laboratory (Berkeley) 1, 93–107.
Zeng, Shaoda. 2001. “Xiangxiang fangyan (Xiangxiang dialect).” In Hunansheng fangyanzhi (Dialects in Hunan province), pp. 211–275. Hunan renmin chubanshe.
Zeng, Ting. 2005. The Interaction of Obstruent Voicing with Tone Registers in the Xiangxiang Dialect of Chinese: An Optimality Theoretical Analysis. MA thesis, Hunan University, Hunan.
