Methods and Tool for Constructing Phonetically-Balanced Materials for Speech Perception Testing: a Development of Thai Sentence-Length Materials
Total Page:16
File Type:pdf, Size:1020Kb
PACLIC 29 Methods and Tool for Constructing Phonetically-Balanced Materials for Speech Perception Testing: A Development of Thai Sentence-Length Materials Adirek Munthuli Charturong Tantibundhit Department of Electrical and Computer Department of Electrical and Computer Engineering, Faculty of Engineering, Engineering, Faculty of Engineering and Thammasat University, Thailand Center of Excellence in Intelligent [email protected] Informatics, Speech and Language Technology, and Service Innovation (CILS) Thammasat University, Thailand [email protected] Chutamanee Onsuwan Krit Kosawat Department of Linguistics, Faculty of Liberal National Electronics and Computer Arts and Center of Excellence in Intelligent Technology Center (NECTEC), National Informatics, Speech and Language Science and Technology Development Technology, and Service Innovation (CILS) Agency (NSTDA), Thailand Thammasat University, Thailand [email protected] [email protected] phonetically balanced (PB). To show Abstract how this is accomplished, two sentence sets are constructed and evaluated by Phonemic content is one of many native speakers. The procedure and tool important criteria in a development of have characteristics that make them any kind of speech testing materials. In potentially useful in other applications this paper, we explain a procedure and and can be applied to other languages. tool we created in the process of constructing phonetically-balanced (PB) Keywords: Thai, sentence-length sentence-length materials for Thai, as an material construction, phonetically assessment for speech reception balanced speech materials thresholds. Our procedure includes establishing criteria, preselecting sentences, creating pool of replacement 1 Introduction words, determining phonemic distribution, and constructing sentences. It is well-established that an assessment technique Importantly, a tool is created to for evaluating an individual’s hearing sensitivity determine whether set of words or based on pure-tone audiometry alone does not truly sentences are phonetically balanced. reflect the individual's speech understanding Once the phoneme distribution and the (Bilger et al., 1984; Egan, 1948). Importantly, set of words with transcription are measuring of speech intelligibility could be specified, the tool efficiently computes obtained by counting number of correct responses phoneme occurrences among words or from speech testing materials, e.g., phonetically- sentences (within a set) and can be used balanced (PB) monosyllabic words, polysyllabic to manipulate words to achieve goal in words, and sentences (Egan, 1948). 293 29th Pacific Asia Conference on Language, Information and Computation: Posters, pages 293 - 301 Shanghai, China, October 30 - November 1, 2015 Copyright 2015 by Adirek Munthuli, Charturong Tantibundhit, Chutamanee Onsuwan and Krit Kosawat PACLIC 29 For the Thai language, there are a few especially those for assessing hearing-impaired existing speech materials for intelligibility test, individuals. Our goal is to construct Thai some of these were developed using monosyllabic phonetically balanced sentence-length materials for word lists, e.g., RAMA.SD1, RAMA.SD2 assessing speech reception thresholds. In this (Komalarajun, 1979), and TU PB’14 (Munthuli et paper, we describe the methods and tool for al., 2014) while some using phrase or sentence constructing a subset of these meaningful materials, e.g., Ramathibodi Synthetic Sentence sentences. Identification (RAMA.SSI), which contains Thai . artificial sentences (with no real meaning) SPIN (1977) HINT (1994) (Wissawapaisal, 2002), and “PB and PD Sentence length 6-8 syllables 6-9 syllables sentences”, which are long stretches of phrases and Number of lists 8 25 Number of sentences derived from Thai continuous speech sentences per 50 10 corpus for an evaluation of automatic speech list recognition system (Wutiwiwatchai et al., 2002). Speech Speech intelligibility However, a majority of speech testing methods intelligibility (count only (count every using sentence materials requires that the sentences word of the are representative of the real communication Measurement ‘keyword’ at the sentence) and last system, which includes many factors such as monosyllabic sentence speech meaning, context, rhythm, etc. (Egan, 1948). It is reception noun of the threshold sentence) quite clear that the existing Thai sentence materials (sSRT) would not satisfactorily meet this requirement. Balanced within Phonemically class of balanced of 43 Sentence speech materials have been Phonemic created in many languages, e.g., Dutch (Plomp and content phonemes from phoneme Dewey’s written sounds among Mimpen, 1979), Mandarin Chinese (Fu et al., corpus lists 2011), German (Kollmeier and Wesselkamp, Low predictability Sentence 1997). For English (American), the most widely difficulty: 1- used are Speech Perception in Noise test (SPIN) Others (LP) and high grade reading predictability (Kalikow et al., 1977) and Hearing in Noise Test (HP) sentences level (HINT) (Nilsson et al., 1994). SPIN is a test for measuring speech intelligibility at fixed S/N ratio, Table 1: Important characteristics of SPIN and but it was found to have variability in terms of HINT tests. sentence difficulty (Kalikow et al., 1977). Therefore, HINT was designed and developed as a 2 Establishing Criteria Hearing in Noise Test, composed of lists of sentences, which are shown to have no significant The first requirement in constructing PB sentence difference in terms of difficulty. Among those, materials is phonetic/phonemic balance. Other different strategies were used (but no specific tool common criteria include word familiarity, had been mentioned) to construct the phonetically naturalness, sentence length, homogeneity, test- balanced materials. It should be noted that those retest reliability, and inter-list difficulty (Bilger et materials were created to obtain similar phoneme al., 1984). Our approach is to incorporate most of distributions among sentence sets (see Phonemic the above criteria. However, in this paper, our content in Table 1) rather than to reflect the true focus is on the initial phase, which is designing phonemic distributions of the language. Our lists of natural sentences with phonetic balance, approach tries to achieve both ends by using a equal length, and familiar words. The next phase, semi-automatic tool. A fully automated tool of this testing and evaluating, not discussed here, will be type would be ideal, but would require other to ensure homogeneity, test-retest reliability, and crucial components such as a language model. A inter-list difficulty. list of important characteristics of SPIN and HINT Our PB sentence lists are based on are given in Table 1. phoneme distribution of Thai speech LOTUS- Due to the lack of Thai ‘natural’ sentence CELL2.0 (LT-CS) corpus (Section 3.2). To materials for speech perception testing, and minimize effect of subject’s different language 294 PACLIC 29 background, we opt for familiar words. This is familiarity of word. Therefore, for our lists, we carried out by selecting words and sentences, have to make certain that the selected words which match desired phonemic content, from (candidates) are familiar words in the language. children’s textbooks and stories, (Thai Children We do so, by selecting words from children’s Stories, 1990; Ministry of Education, 1986; textbooks and stories. In addition, to estimate Sripaiwan, 1994; Sangworasin, 2003). In terms of frequency of word occurrences, we utilize the sentence length, we follow SPIN (Kalikow et al., largest available Thai written corpus InterBEST 1977), HINT (Nilsson et al. 1994), and RAMA.SSI (Kosawat et al., 2009). (Wissawapaisal, 2002) and limit each sentence to Most importantly, our PB word candidates six to eight syllables with no words greater than are considered to be as phonetic balanced as two syllables long. The PB sentences will compose possible, i.e., less than 10% difference from of 10 lists, each with 10 sentences. targeted phoneme distribution. In addition, to address a question of whether different levels of predictability affect 3.1 Preselecting Sentences and Pool of sentence intelligibility (Kalikow et al., 1977), in all Replacement Words five lists will be created to fit the ‘low’ The first step to create PB sentences is based on predictability status and another five the ‘high’ preselection of sentences. All sentences from a predictability. However, degrees of predictability collection of 89 children’s stories (Thai Children are beyond the scope of our developed tool, and Stories, 1990) are analyzed and only simple are determined by semantics and overall sentence sentences, (i.e., subject-verb-(adverb), subject- contexts. (see Sections 4 and 5). verb-complement/object), are kept. These result in 313 sentences in total. Then, each sentence is 3 Procedure and Concept Design for Tool transcribed and its phonemic distribution of For SPIN (Kalikow et al., 1977) and HINT initials, finals, vowels, and tones are tallied. (Nilsson et al., 1994), pre-selection of sentences Attempts are made to group a set of 10 sentences were carried out prior to phonemic distribution in to a list (10 lists in all) such that the phoneme analysis and matching. HINT sentences were distributions are as close to the ones shown in selected from Bamford-Kowal-Bench (BKB) Tables 2-5 as possible. corpus. SPIN sentences were constructed by However, from a limited number of simple generating sets of ‘low’ predictability and ‘high’