A comparison of speech rhythm in British and Hong Kong English

Jane Setter School of Linguistics and Applied Studies, University of Reading, UK Email: [email protected]

ABSTRACT

This paper presents the results of a study of rhythm and timing in English spoken by Hong Kong speakers, for whom English is a second language. A corpus of Hong Kong English (HKE) speech data was collected on computer, and measurements of syllable duration were made. These measurements were then compared statistically with syllable measurements taken from an existing corpus of British English (BrE) speakers. Taking syllable duration as the dependent variable, this paper looks at the effects of stress level in the two types of English. It is found that the HKE speakers examined show less difference in the relative syllable duration of tonic, stressed, unstressed and weakened syllables than the BrE speakers. This is discussed with regard to possible intelligibility problems which may arise due to divergent English speech rhythm when a native speaker of English listens to a Hong Kong English speaker.

1. INTRODUCTION

English speech rhythm is often described as ‘stress-timed’; in basic terms, this means that each stressed syllable is said to be equidistant in time from the next stressed syllable. This is in comparison to ‘syllable-timed’ languages (e.g., Spanish, Cantonese), in which the start of each syllable is said to be equidistant in time from the start of the next.

Instrumental studies have, in fact, shown that very little difference can be found between typically ‘stress-timed’ and typically ‘syllable-timed’ languages (Roach [10]; Dauer [3]). However, English spoken with a ‘syllable-timed’ rhythm can be difficult for speakers of ‘stress-timed’ accents of English to understand (Anderson-Hsieh and Venkatagiri [1]). This makes the acquisition of English speech rhythm by non-native speakers important in some contexts.

Cantonese, the first language (L1) of HKE speakers, is described as “an example of a language with syllable-timed rhythm” (Bauer and Benedict [2]: 316), in which, unlike English, each syllable “receives about equal emphasis in emotionally neutral utterances” (Bauer and Benedict [2]: 316). It is possible, therefore, that transfer effects from Cantonese to English, the second language (L2), are what is observed in the speech rhythm of HKE, as well as in segmental features of the L2.

In this paper, duration is selected for investigation. In combination with pitch, loudness and vowel quality, syllable duration is an important factor in determining [...] syllable in English, and must therefore contribute to its perceived rhythmical properties. An additional attraction to the study of syllable duration is that it is thought to be a highly learnable/teachable feature of word and rhythmic stress; see for example Gilbert [4] and Halliday [5].

2. DATA

2.1 THE HKE DATA

In order to ensure a large amount of single-speaker continuous speech, data was taken from 20 Hong Kong Polytechnic University students giving presentations on various topics. 10 were female and 10 were male. Speakers were given alphanumerical labels, such as m01 for male speaker 1 and f01 for female speaker 1. One minute of continuous speech from each subject was analysed in this study.

Subjects were tape-recorded using a personal stereo cassette recorder (Sony Walkman™ model WM-R707) with a lapel microphone, which was clipped on to either a lapel or the collar of their clothing. The speech collected was analysed by converting the recordings to a machine-readable sound signal and measuring the duration of syllables using specialist computer software on a PC platform. Speech from the cassette recordings is sampled at a rate of 16000 samples per second (16kHz, 16-bit mono PCM), and then labelled on computer. The computer software used to analyse and label data in this study is Speech Filing System (SFS), developed for research purposes at the Department of Phonetics and Linguistics, University College London (the SFS software is available free for research use at http://www.phon.ucl.ac.uk/resource/sfs/). With this software, speech data may be labelled in a number of ways. For the purposes of this study, a broad phonetic segmental transcription was used. The software then allows a file to be generated which contains information on the duration of each of the sound segments in samples. This number is converted into milliseconds (ms) by dividing it by 16 (thus 16000 samples = 1000ms). Calculations of syllable duration are made from that information; this is then analysed and compared with the SCRIBE data.

2.2 THE BRE DATA

The British English data used for this study was drawn from the SCRIBE corpus (see Spencer [12]). SCRIBE is a corpus of British English speakers from four main areas of the United Kingdom: the South-East (with Received Pronunciation or a Southern Standard British English accent), Glasgow, Leeds and Birmingham. The aim was to record and annotate the speech of 30 speakers from each set performing a number of different spoken tasks, which include reading several different sets of sentences, reading a passage, and undertaking a map task to elicit free speech.

In selecting appropriate material for comparison, it was necessary to decide which speech task performed by the BrE speakers is most closely comparable to the HKE data. In this instance, it was decided to use the read passage for comparison. The passage itself takes little more than two minutes to read aloud. My reason for choosing the passage for comparison, and not the free speech task, is that the HKE speakers, in giving presentations, are performing a task which is in more ways similar to passage reading than to free speech.

Five speakers were taken from the SCRIBE material, one female and four male speakers. All are from the South-East set. The choice of speakers is restricted by the availability of comparable transcription passages, as only one female and five male speakers from this region were transcribed with a broad phonetic transcription in the SAMPA machine readable phonetic alphabet, details of which can be found on the following web site: http://www.phon.ucl.ac.uk/home/sampa/home.htm.

Approximately one minute from each of the four male speakers was used. The passage is divided into four paragraphs of just over 30 seconds each. The data used was from two of the male speakers reading the first two paragraphs and the other two reading the last two paragraphs. In the case of the female speaker, as there is only one female for whom a broad phonetic transcription is available, the entire passage was used (approximately 2 minutes).

Speech from the SCRIBE corpus was sampled at a rate of 20000 samples per second (20kHz) and labelled using suitable speech analysis software. This renders the label files into a slightly different format to that of my own data using SFS, and so the SCRIBE data was manipulated on computer to make it comparable. In addition, the segmental durations derived from sampling at 20kHz were divided by 20 in order to give a duration in milliseconds (20000 samples = 1000ms).

2.3 SYLLABIFICATION

In order to calculate the duration of the syllables in the data, it is first necessary to syllabify the data. This was achieved using the ‘Maximal Onsets’ approach adopted in Jones ([7]; see p.xiii for a description) for syllabifying the entries in the fifteenth edition of the English Pronouncing Dictionary. In its most basic form, Maximal Onsets means that, “where possible, syllables should be divided in such a way that as many consonants as possible are assigned to the beginning of the syllable to the right” (Jones [7]: xiii).

The rules for syllabification were based on what is permissible in the citation form of a monosyllabic word in English. In the case of vowels, long vowels and diphthongs in English were permitted to be syllable final, but short vowels were not; this is because no monosyllabic English word occurs in RP or Southern Standard British English which ends with one of the short vowels /ɪ/, /e/, /æ/, /ʌ/, /ɒ/ or /ʊ/. There are, however, exceptions among short vowels in the case of unstressed syllables. Schwa is always weak and can therefore occur in syllable-final position; unstressed /ɪ/ and /ʊ/ also occur in weakened syllables in English and were therefore afforded the same structural status when weakened. The non-phonemic vowel symbols /i/ and /u/ were used either as the counterparts to /ɪ/ and /ʊ/ respectively when either was followed by a vowel (e.g., react /riækt/; influential /ˌɪn.fluˈen.ʃəl/), or appeared word finally in unstressed positions (e.g., happy /hæp.i/).

Concerning consonants, it is permissible to have up to three consonants initially and four consonants finally in restricted combinations in British English monosyllables. All consonants making up the consonantal inventory of British English, with the exception of /ŋ/, may occur in initial positions. In final positions, the approximant consonants /r/, /w/ and /j/ and fricative /h/ are not permitted. However, according to Maximal Onsets, in connected speech, consonants belonging to the end of words may be syllabified as initials when the speech is broken down into syllables. For example, cats and dogs is likely to become /kæt.sn.dɒɡz/, and forced in two will be divided as /fɔː.stn.tuː/ in connected speech if the Maximal Onsets rule is applied.

It was found in the process of syllabifying the HKE data that, in some cases, it was difficult to apply Maximal Onsets insofar as many syllables which would usually be weakened in BrE connected speech were pronounced with a full vowel. For example, collapse of any part is produced by speaker m03 as /kɒlæpsɒvenipɑːt/, rather than /kəlæpsəvenipɑːt/. If adhering strictly to Maximal Onsets in this case, it would be necessary to divide this section as /kɒl.æp.sɒv.en.i.pɑːt/; however, it was felt that for HKE speakers, a short vowel in syllable final position is entirely possible, as long as the syllable is unstressed. This interpretation leads us to the following division of syllables: /kɒ.læp.sɒ.ven.i.pɑːt/, which is comparable to the likely BrE version, /kə.læp.sə.ven.i.pɑːt/. Intuitively, I was happier with this interpretation than with the version which applied Maximal Onsets indiscriminately.

Another matter arising was that HKE has many phonetically nasalised vowels. Walmsley ([13]: 262) notes that, among Cantonese speakers of English, there is “a strong tendency to lower the velum especially when the closure for a nasal was not completed”. In syllabifying nasalised vowels where there was an underlying final nasal consonant, the syllable was treated as containing a final nasal consonant. However, if a vowel followed, the nasal consonant was not syllabified as the onset of the following syllable in these cases.

A further syllabification issue for HKE concerns final dark and syllabic /l/. As is noted in Hung [6], dark and syllabic /l/ are frequently realised as vowels. Where a dark /l/ was very clearly realised as a vowel, it was decided not to transcribe it as /l/.

Finally, there is a large amount of glottal stopping in the HKE data, which often has the effect of preventing the linking which is associated with connected English speech. Where the glottal stop is clearly not a realisation of another consonant and appears in prevocalic position (e.g., speaker m09’s the accident is realised with a glottal stop at the beginning of accident), it is not included as part of the syllable.

The BrE data is much more straightforward to syllabify, and in no cases were Maximal Onsets violated in order to cope with a speaker’s idiolect.

3. ANALYSIS

For the purpose of analysis, syllables were divided into four categories: weakened, unstressed, stressed and tonic. It was assumed that tonic syllables would be the longest in duration, followed by stressed, unstressed and finally weakened syllables. This was indeed found to be the case for both language varieties. There were 4404 syllables in the HKE data, and 1847 syllables in the BrE data.

Comparison of the corpus of Hong Kong English (HKE) speech with the BrE data revealed differences in the patterns of rhythm and timing in the two varieties. A certain lessening of the distinctions between weak/unstressed syllables and stressed/tonic syllables was found in the HKE data. Although a statistically significant distinction between all four syllable types was maintained by the HKE speakers, it was not as large as that of the speakers in the British data. In addition, although there was a statistically significant difference between the duration of weak, unstressed and stressed syllables across the two varieties, there was no significant difference between the duration of tonic syllables in HKE and BrE.

The following line plot shows the difference between the two varieties. The HKE data is represented by the upper solid line, and the BrE data by the lower dashed line. On the X (horizontal) axis, 1 = weakened syllables, 2 = unstressed syllables, 3 = stressed syllables and 4 = tonic syllables. On the Y (vertical) axis, average duration in ms is given.

[Figure 1: line plot. X axis: STRESSLV (1 to 4); Y axis: Mean DURATION in ms (scale 100 to 400).]

Figure 1. Line plot of syllable duration according to stress level, HKE and BrE

Figure 1 clearly shows that, while differences in length across the four stress levels weakened, unstressed, stressed and tonic are maintained by this group of HKE speakers, these differences are not maintained to the same degree as by the BrE speakers studied; the ratio is different.

The above line plot is rather telling about the situation in HKE rhythmic stress: weak and unstressed syllables are not as short as those in the BrE speech data, but the tonic syllables are very similar in length. Thus, although a statistically significant difference across stress levels is maintained throughout by the HKE speakers, as is the case for BrE, the degree to which these syllables differ is in sharp contrast to that of BrE. For the pattern to reflect the BrE speakers, the lines would have to have been parallel, not convergent. The lines, although similar in form, are certainly not parallel, and the only point at which there is no statistically significant difference between the two varieties is tonic syllables (4 on the X axis). At each of the other three points, the amount of difference becomes progressively less, but is still significantly different to the BrE data.

A feature revealed by the descriptive statistics which may be of significant importance to the perceived rhythm of HKE was the much greater proportion of unstressed but not weakened syllables in the HKE data, as demonstrated in Figure 2 below. Although there were similar percentages of stressed and tonic syllables in the HKE and BrE data, there were far more unstressed than weakened syllables in the HKE data: 43.64% of HKE syllables were unstressed and 19.3% weakened, compared with 26.96% unstressed and 34.81% weakened in the BrE data. This could show a similar situation to Low and Grabe’s “lack of ‘deprominencing’” ([9]:49) in Singapore English. What this means is that more syllables appeared with a full vowel rather than a schwa or syllabic consonant; they are, in effect, less weak, and so lack ‘deprominencing’. Dauer’s observation ([3]: 57), that syllable-based languages do not have the same patterns of vowel reduction, supports the view that HKE is likely to sound syllable- rather than stress-timed, as a preference for unstressed rather than weakened syllables in the speech of these HKE speakers is revealed.

Stress level    weak     unstressed    stressed    tonic
BrE%            34.81    26.96         22.09       16.13
HKE%            19.3     43.64         21.8        15.28

Figure 2. Proportion of syllables for HKE and BrE according to stress level

4. DISCUSSION

Although returned to China in 1997, Hong Kong is a long-established financial centre and still has a large number of native English speakers of longer established varieties, such as American, Australian and British speakers. These varieties conform to the ‘stress-timed’ model. English is considered very much to be a ‘value-added’ language by Hong Kong Chinese (Li [8]), associated with high earnings and social status. When using it, Hong Kong Chinese communicate largely with people from the aforementioned groups. As the difference in duration between tonic, stressed, unstressed and weak syllables is of paramount importance to American, Australian and British speakers of English in the correct parsing of messages (see, for example, Anderson-Hsieh and Venkatagiri [1]), the lack of distinctness found in the HKE data may lead to misunderstanding and incomprehension in certain conversational situations in Hong Kong. At least as long as speakers from these and similar groups continue to have influence in the financial and business arenas in Hong Kong, it seems sensible therefore to address the issue of rhythm and timing in the pronunciation of HKE speakers.

REFERENCES

[1] J. Anderson-Hsieh and H. Venkatagiri. “Syllable duration and pausing in the speech of Chinese ESL speakers”, TESOL Quarterly, 28 (4), pp. 807-812, 1994.

[2] R. S. Bauer and P. K. Benedict (eds.). Modern Cantonese Phonology. Berlin; New York: Mouton De Gruyter, 1997.

[3] R. M. Dauer. “Stress timing and syllable timing reanalyzed”, Journal of Phonetics, 11, pp. 51-62, 1983.

[4] J. Gilbert. Clear Speech: Pronunciation and listening comprehension in [...]. Cambridge: Cambridge University Press, 1984.

[5] M. A. K. Halliday. Spoken and Written Language (2nd Edn). Oxford: Oxford University Press, 1989.

[6] T. Hung. “Towards a phonology of Hong Kong English”, World Englishes, 19 (3), pp. 337-356, 2000.

[7] D. Jones. English Pronouncing Dictionary (15th Edn.). Edited by P. J. Roach and J. W. Hartman, with J. Setter. Cambridge: Cambridge University Press, 1997.

[8] D. S. C. Li. “The functions and status of English in Hong Kong: a post 1997 update”, English World-Wide, 20 (1), pp. 67-110, 1999.

[9] E. L. Low and E. Grabe. “Prosodic patterns in Singapore English”, Proceedings of the XIIIth International Congress of Phonetic Sciences, Stockholm 13-19 August 1995, 3, pp. 636-639, 1995.

[10] P. Roach. “On the distinction between ‘stress-timed’ and ‘syllable-timed’ languages”, In D. Crystal (Ed.), Linguistic Controversies: essays in linguistic theory and practice in honour of F R Palmer. London: Edward Arnold, pp. 73-79, 1982.

[11] J. Setter. Rhythm and Timing in Hong Kong English. University of Reading: Unpublished PhD Thesis, 2000.

[12] C. Spencer. Pre-SCRIBE Final Report. University College London, 1990.

[13] J. B. Walmsley. “Cantonese English: an essay in diagnostic linguistics”, In G. Nickel (Ed.), Proceedings of the 3rd AILA Congress Copenhagen, 1. Heidelberg: Groos, pp. 261-277, 1997.
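
As an illustration of the duration conversion described in Sections 2.1 and 2.2, the following minimal Python sketch converts segment durations from samples to milliseconds and sums them into a syllable duration. The function names and the example segment values are hypothetical and are not taken from the SFS or SCRIBE label files; only the two sampling rates (16kHz for the HKE data, 20kHz for the BrE data) come from the text.

```python
# Sampling rates stated in the paper; everything else here is illustrative.
HKE_RATE_HZ = 16000  # HKE recordings, labelled with SFS
BRE_RATE_HZ = 20000  # SCRIBE (BrE) recordings


def samples_to_ms(n_samples, sample_rate_hz):
    """Convert a duration in samples to milliseconds.

    At 16 kHz this is the same as dividing by 16; at 20 kHz, by 20.
    """
    return n_samples * 1000.0 / sample_rate_hz


def syllable_duration_ms(segment_samples, sample_rate_hz):
    """Sum the durations of a syllable's segments, in milliseconds."""
    return sum(samples_to_ms(n, sample_rate_hz) for n in segment_samples)


# Hypothetical syllable made of three segments, labelled at each rate.
hke_syllable = [1600, 2400, 800]   # samples at 16 kHz
bre_syllable = [2000, 3000, 1000]  # samples at 20 kHz

print(syllable_duration_ms(hke_syllable, HKE_RATE_HZ))
print(syllable_duration_ms(bre_syllable, BRE_RATE_HZ))
```

Converting both corpora to milliseconds before any comparison is what makes durations from the two sampling rates directly comparable.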
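
The syllabification procedure of Section 2.3 can also be sketched in code. The Python below is a simplified illustration under stated assumptions, not the procedure actually used in the study: the vowel set, the short-vowel set and the list of legal onsets are deliberately tiny, hypothetical inventories covering only the example phrase collapse of any part, and the `protected` argument (the positions of short vowels that may not stand syllable-finally) is an invented device for contrasting the strict and the relaxed treatment of unstressed short vowels.

```python
# Hypothetical, minimal inventories: just enough for the example phrase.
VOWELS = {"ə", "ɒ", "æ", "e", "i", "ɑː"}
SHORT_VOWELS = {"ɒ", "æ", "e"}
LEGAL_ONSETS = {(), ("k",), ("l",), ("n",), ("p",), ("s",), ("t",), ("v",)}


def syllabify(phones, protected):
    """Split a list of phones into syllables using Maximal Onsets.

    `protected` holds the indices of short vowels that must not be
    syllable-final, so at least one following consonant stays in the coda.
    """
    vowel_positions = [i for i, p in enumerate(phones) if p in VOWELS]
    starts = [0]
    for left, right in zip(vowel_positions, vowel_positions[1:]):
        cluster = phones[left + 1:right]
        needs_coda = (phones[left] in SHORT_VOWELS
                      and left in protected
                      and len(cluster) > 0)
        split = 1 if needs_coda else 0
        # Give the right-hand syllable the longest legal onset available.
        # The empty onset is legal, so this loop always terminates.
        while tuple(cluster[split:]) not in LEGAL_ONSETS:
            split += 1
        starts.append(left + 1 + split)
    ends = starts[1:] + [len(phones)]
    return ".".join("".join(phones[s:e]) for s, e in zip(starts, ends))


# Speaker m03's production of "collapse of any part" (Section 2.3).
m03 = ["k", "ɒ", "l", "æ", "p", "s", "ɒ", "v", "e", "n", "i", "p", "ɑː", "t"]

# Strict reading: no short vowel may be syllable-final.
print(syllabify(m03, protected={1, 3, 6, 8}))
# Relaxed reading: unstressed short vowels may be syllable-final.
print(syllabify(m03, protected={3, 8}))
```

Under these assumptions, the strict call yields the division that applies Maximal Onsets indiscriminately, and the relaxed call yields the division preferred in Section 2.3.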