Shanghai Slack Voice: Acoustic and EPGG Data
Jiayin Gao, Pierre Hallé, Kiyoshi Honda, Shinji Maeda, Martine Toda
To cite this version: Jiayin Gao, Pierre Hallé, Kiyoshi Honda, Shinji Maeda, Martine Toda. Shanghai Slack Voice: Acoustic and EPGG Data. 17th International Congress on Phonetic Sciences, 2011, Hong Kong SAR China. pp. 719-722. halshs-00677248
HAL Id: halshs-00677248
https://halshs.archives-ouvertes.fr/halshs-00677248
Submitted on 7 Mar 2012

ICPhS XVII Regular Session Hong Kong, 17-21 August 2011

SHANGHAI SLACK VOICE: ACOUSTIC AND EPGG DATA

J-Y. Gao (a), P. Hallé (a, b), K. Honda (a), S. Maeda (c) & M. Toda (a)
(a) LPP-Paris 3, France; (b) LPNCog-Paris 5, France; (c) ENST Paris, France

ABSTRACT

From a representational viewpoint, the “voiced” series of obstruents in Shanghai dialect can be specified in terms of complementary, context-conditioned tonal and segmental features: either low tone or glottal pulsing. Yet, some studies have proposed that, when the “voiced” obstruents can only be signaled by low tone (stress-initially), they retain something of segmental voicedness. This somewhat mysterious “something” has often been identified with a moderate degree of breathiness after stop release, or “slack voice.” In this study we revisit this issue and find that Shanghai obstruents, as produced today by young Shanghai people, indeed retain some characteristics of plain voiced obstruents, but breathiness does not appear to be the sole one. We propose that articulatory timing relationships are the main determinant of the mysterious voiced quality of Shanghai obstruents.

Keywords: Shanghai slack voice, low vs. high tone, acoustic/physiological breathiness, EPGG

1. INTRODUCTION

Four tones are assumed in early Middle Chinese, traditionally labeled 平, 上, 去, 入 (ping, shang, qu, ru). Middle Chinese distinguished two series of obstruents, described as 浊音 (zhuoyin ‘muddy voice’) and 清音 (qingyin ‘clear voice’) in the Chinese linguistic tradition, probably voiced and voiceless, respectively. Segmental tonogenesis led to the general Middle Chinese “tonal split” into 阴 vs. 阳 tones (yin vs. yang: ‘high’ vs. ‘low’ register tones), from the clear vs. muddy series. The clear-muddy distinction thus became redundant with pitch register, motivating the disappearance in most late Middle Chinese dialects of the ‘muddy’ obstruents, replaced with voiceless aspirated or non-aspirated ones of the remaining ‘clear’ series.

The retention of the ‘muddy’ obstruents (as well as their associated low, yang tones) is traditionally taken as a defining feature of the Wu dialects. Indeed, all the dialects believed to belong to the Wu family have a phonologically voiced series in addition to two voiceless series. Yet, some southern dialects classified as Wu on that criterion clearly possess features typical of the Min dialects.

The phonetic characterization of ‘muddy’ obstruents in Wu dialects, as well as cross-dialectal differences in this respect, are controversial. Early impressionistic descriptions simply assumed plain voicing, that is, glottal pulsing throughout the production of these obstruents. In the late 1920s, a more detailed account was proposed by Liu Fu and Chao Yuanren [4, 8] for Shanghai dialect among others: the closure of the ‘muddy’ stops is voiceless but their release is voiced and breathy. This proposal, known by Chinese linguists as 清音浊流 (qingyin zhuoliu: ‘clear sound then muddy aspiration’), motivates transcriptions such as [tʱ]. However, this description may hold only for stress-initial stops in the dialects examined so far, less clearly perhaps for southern than northern dialects (e.g., Wenzhou vs. Shanghai dialects [3]). In non-initial, unstressed syllables, muddy stops tend to be fully voiced in Shanghai dialect among others ([2, 3]; also see [10]). More precisely, muddy stops in this context either are plain voiced stops but their syllable loses its yang tone (due to pervasive tone sandhi in Wu dialects) or are voiceless but yang tone is retained. A possible phonological account of “muddiness” is thus that muddy stops contrast with the others by either a segmental [+voice] or a suprasegmental [-high tone] feature, depending on stress context. By this account, muddy stops need not differ segmentally from voiceless unaspirated stops in stress-initial position. Yet, even then, muddy stops are felt to retain aspects of phonetic voicedness. Breathy phonation has been proposed as one such aspect in northern Wu dialects such as Shanghai (cf. [3] for a review); whispery phonation has been proposed for Zhenhai dialect [11]. The specific phonation of these stops is also called “slack voice,” suggesting a loose tension/adduction of the vocal folds. Studies conducted in the late 1980s found both acoustic and physiological cues to breathiness in the release portion of muddy stops [3, 9, 10]: H1 relative salience [1, 6], oral airflow, and glottal opening as measured by fiberoptic transillumination [11]. The degree of breathiness revealed by these studies is, however, far from that found in, for example, Hindi. Recently, Chinese linguists proposed that moderate breathiness was a feature attached to the entire syllable or its rime rather than to the onset consonant, but this view is still debated [3, 5, 13]. Our study addresses that issue by using zero and nasal onsets.

In this paper, we focus on the Shanghai dialect, as spoken by young, educated native speakers. The literature on Wu “slack voice” has exclusively focused, as far as we know, on oral stop syllable onsets. In this paper, we also examine fricatives (/f, v, s, z/). Because nasal and zero onsets may bear either yang or yin tone, we also examine yin-yang pairs with nasal and zero onsets: if slack voice characterized the entire syllable, it should be found in nasal or zero onset yang but not yin tone syllables. (Zero onset yang tone syllables may be transcribed with an /ɦ/ whose motivation is morphophonemic rather than phonetic, as in 雨 /ɦy/ [y] ‘rain’.) Our report on the production of slack voice syllables covers acoustic measurements such as H1–H2, and glottal opening estimations obtained with EPGG, a novel technique similar to photoglottography but with an external lighting source [7].

2. EXPERIMENT

The main goal of the experiment was to determine the time course of glottal opening before and after the onset consonants under scrutiny, along with possibly related acoustic measurements such as H1–H2. Several syllable pairs differing in tone (yin vs. yang) and onset (nasal onsets excepted) were compared. We added voiceless aspirated stops for the sake of comparison with previous studies.

2.1. Method

2.1.1. Participant

The first author, a young woman aged 23 years, a native speaker of Shanghai dialect raised in a Shanghainese-speaking family environment, was recorded on the speech materials.

2.1.2. Speech materials

Sixteen syllables, sharing the rime /ɛ/, in sentence-initial position within the frame sentence X gə ə zi /ŋo nintə ə (‘X’, this character, I know) were produced five times in succession to ensure ease of production. The onsets were /p, t, f, s, m, n, Ø, pʰ, tʰ/ (yin tone) and /b, d, v, z, m, n, ɦ/ (yang tone).

Table 1: The 16 syllables investigated.
Yin tone:   pɛ 背   tɛ 呆   fɛ 翻   sɛ 三   mɛ 蛮   nɛ 拿   ɛ 爱   pʰɛ 配   tʰɛ 摊
Yang tone:  bɛ 办   dɛ 蛋   vɛ 烦   zɛ 才   mɛ 慢   nɛ 难   ʱɛ 害

2.1.3. Procedure and apparatus

The session was conducted in a soundproof booth, using a Dash 8 multi-channel data acquisition device recording three channels: audio, EPGG, and oral airflow, all sampled at 20 kHz with 16-bit resolution. The EPGG and airflow channels were low-pass filtered at 500 and 80 Hz, respectively. The signals were transferred to a computer as wave files, segmented into utterances, and processed.

2.1.4. Physiological analyses

For any given syllable, five EPGG signals (only three for /pɛ, tɛ, pʰɛ, tʰɛ/) were lined up on the onset of /ɛ/, then averaged together; 600 ms before the line-up point and 200 ms after sentence offset were included in those averages so that the physiological activity before and after the target syllable could be tracked. For a large number of syllables, the first repetition exhibited extra-wide glottal opening, presumably due to the speaker taking a breath to produce the five repetitions, and thus was not retained in the averaging process.

We estimated the location (relative to vowel onset) and amplitude (arbitrary unit) of the glottal opening maximum ahead of the target syllable for each utterance individually. Averages of these measurements are presented in the following.

2.1.5. Acoustic analyses

For each target syllable, the following acoustic parameters were measured: VOT (stop onsets), onset duration (onsets other than stops and zero), vowel duration, H1 and H2 amplitudes, hence the H1–H2 difference, and harmonic-to-noise ratio (HNR). H1–H2 and HNR were computed on a 30 ms window at the onset, middle, and offset of the vowel /ɛ/. For /pʰɛ/ and /tʰɛ/, only VOT was measured. H1–H2 values are usually taken as a cue to a loose setting of the vocal folds and/or breathiness; lower HNR values indicate noisier speech.

2.2. Results

2.2.1. H1–H2 and HNR

Table 2 shows the differences in H1–H2 between