
PHONOLOGICAL NEIGHBORHOODS AND PHONETIC SIMILARITY

IN JAPANESE WORD RECOGNITION

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree of Doctor of Philosophy in the Graduate

School of The Ohio State University

By

Kiyoko Yoneyama, M.A.

The Ohio State University 2002

Dissertation Committee:

Professor Keith Johnson, Adviser

Professor Mary E. Beckman

Professor Mark A. Pitt

Approved by:
Adviser, Linguistics Graduate Program


© Copyright by Kiyoko Yoneyama, March 2002

ABSTRACT

This dissertation explores two aspects of spoken-word recognition in Japanese: the representations of words stored in the lexicon and lexical competition among words. The nature of lexical representations and lexical competition was explored by testing three different neighborhood density calculations in auditory naming, word identification in noise, and semantic categorization experiments. Neighborhood density is a measure of the number of similar words surrounding a word in the lexicon ("neighbors"). However, definitions of neighbors vary depending on the definition of similarity used. This dissertation tests three neighborhood definitions, each of which corresponds to a hypothesis about lexical access with a different word representation in Japanese. The first calculation assumes that listeners rely on a phonemic word representation, as proposed in abstract models. Here, neighborhoods are calculated in terms of the number of phonemes in common, as in the Greenberg and Jenkins calculation (Greenberg & Jenkins, 1964) widely used in the English word recognition literature. The second neighborhood calculation includes prosodic information as an additional dimension, in order to reflect the finding that prosodic information plays a vital role in Japanese word recognition (Cutler & Otake, 1999). This calculation proposes that Japanese listeners use word-level prosody for lexical access, but that the segmental word representation and word-level prosody are stored separately in the lexicon. In other words, the word representation in this calculation is the same categorical abstract representation used in the previous calculation, and the pitch accent patterns additionally constrain the neighbors. A similarity judgment experiment on pitch accent patterns was carried out, and its results were implemented in the calculation.


The third neighborhood calculation is designed to test exemplar-based models. In this calculation, neighborhood density was measured by comparing the similarity of cochleagrams of 66,000 audio files (one file for each noun in the NTT psycholinguistic database; Amano & Kondo, 1999, 2000). The word representation is therefore an auditory representation in which all segmental and prosodic information is available. In this calculation, as in the GNM (Generalized Neighborhood Model; Bailey & Hahn, 2000), the words in the lexicon are treated as exemplars and are mapped onto a psychological similarity space. Data for the analyses were collected from Japanese neighborhood experiments using the same 700 test words throughout and a lexicon that consisted of only the nouns from the NTT psycholinguistic database. The results of the three experiments in this dissertation shed light on two aspects of lexical access. First, a lexical competition effect is confirmed in Japanese; moreover, there are two types of lexical competition in auditory word recognition: form-based competition (neighborhood density) and phoneme-based competition (cohort reduction). Second, both abstract (symbolic) representations and episodic (auditory) representations need to be stored in the lexicon. Implications of these results for current word recognition models are also discussed.


In memory of my grandmother, Toyo Yoneyama, and to my parents, Susumu and Rumiko Yoneyama, who have been supportive of me


ACKNOWLEDGMENTS

This dissertation wouldn’t exist without help from many people. First and foremost, I would like to thank my adviser, Keith Johnson, who is a true mentor and who has believed in me very patiently.

I would also like to thank Mary Beckman, who was also an excellent role model as a researcher. She always allowed me to formulate my own ideas about issues in spoken word recognition, but was always ready and eager to provide insights into how the data might be interpreted from different perspectives.

I also thank Mark Pitt, who was extremely encouraging and full of enthusiasm for my work. I value most the way in which he tackles problems as an expert in the field of spoken word recognition. I learned much from his thorough investigations and data analysis techniques.

I am also grateful for support from the National Institutes of Health through the grant "Cross-linguistic studies of spoken language processing" (R01DC04421, PI: Keith Johnson), which supported my dissertation research in Japan. I am also very thankful to JJ Nakayama for allowing me to use his copy of the NTT Psycholinguistic Databases, which was essential for my dissertation research.

Takashi Otake and Anne Cutler also had a great impact on my interest in spoken word recognition and encouraged me to pursue my PhD study here at The Ohio State University. They provided me with a great opportunity to work as a research assistant on a couple of important studies in Japanese word recognition. Without such experience, I would not have become a PhD student at The Ohio State University. Takashi Otake is my master's adviser, who encouraged me to pursue my PhD. He has always been supportive for more than 10 years, and all my dissertation experiments were conducted at his lab. Without his support, this dissertation wouldn't exist. Anne Cutler is also an excellent role model as a researcher. I decided to pursue my graduate work in the States because I wanted to be a researcher like her. Her energetic and enthusiastic attitude towards research inspired me a lot.

I also owe a great deal of gratitude to the Labbies, teachers, staff, and friends at OSU. They always offered me help whenever I needed it: Beth Hume, Jan Edwards, JJ Nakayama, Osamu Fujimura, Matt Makashay, Satoko Katagiri, Janice Fon, Pauline Welby, Steve Winters, Laurie Maynell, Allison Blodgett, Tsan Huang, Georgios Tserdanelis, Misun Seo, Peggy Wong, Grant McGuire, Craig Hilts, Robbin Dautricourt, Amanda Miller-Ockhuizen, Jennifer Venditti, Stefanie Jannedy, Mariapaola D'Imperio, Liz Strand, Rebecca Herman, Jennifer Vannest, Julie McGory, Nick Cipollone, Sanae Eda, Kooichi Sawasaki, Jim Harmon, and Matt Hyclak.

I thank my family members, who have always been thinking of me from Japan: my parents, Susumu and Rumiko Yoneyama; my younger brother and sister-in-law, Kiyoshi and Yumi Yoneyama; and my youngest brother, Kazunari Yoneyama.

My dearest grandmother, who passed away at the age of 96 in December 2001, was supportive of me for my entire life. She did not go to school, but she was always interested in education, and she kept encouraging me to complete this endeavor. One of the two wishes she asked of me was to complete my degree before she went to see her husband in the afterlife. She couldn't see me finish, but I still believe that she is really happy to see me finished, from there.

This is your dissertation, grandma.

VITA

June 23, 1969 ...... Born, Chigasaki, Kanagawa, Japan

1993 ...... B.A. English Language, Dokkyo University, Soka, Japan.

1995 ...... M.A. English Linguistics, Dokkyo University, Soka, Japan.

1995-present ...... Graduate Teaching and Research Associate, Department of Linguistics, The Ohio State University

2000 ...... M.A. Linguistics, The Ohio State University, Columbus, OH.

PUBLICATIONS

Peer-reviewed Journal Article

Otake, T., Yoneyama, K., Cutler, A., & van der Lugt, A. (1996). The representation of Japanese moraic nasals. Journal of the Acoustical Society of America, 100 (6), 3831-3842.

Book Chapters

1. Otake, T., Hatano, G., & Yoneyama, K. (1996). Speech segmentation by Japanese. In Otake, T. and Cutler, A. (eds.), Phonological Structure and Language Processing: Cross-linguistic Studies, Mouton de Gruyter: Berlin, 183-201.

2. Yoneyama, K. (1996). Spoken language recognition and segmentation: Evidence from data of monolingual and bilingual speakers. In the Circle of Phonology in Japan (ed.), Study on Phonology, 179-182 (Written in Japanese).


Conference Proceedings

1. Otake, T. & Yoneyama, K. (2000). Shinnai jisho no on'in tani to sono ninshiki (Recognition of phonological units in the mental lexicon). Phonological Studies, 3, 21-28 (Written in Japanese).

2. Otake, T. & Yoneyama, K. (1999). Shinnai jisho ni okeru onsetsu to mora no ninshiki (Recognition of syllables and moras in the mental lexicon). Proceedings of the 113th Annual Meeting of the Phonetic Society of Japan, 77-82 (Written in Japanese).

3. Yoneyama, K. & Pitt, M.A. (1999). Prelexical representation in Japanese: Evidence from the structural induction paradigm. Proceedings of the 14th International Congress of Phonetic Sciences, vol. 2, 893-896.

4. Otake, T. & Yoneyama, K. (1999). Listeners' representations of within-word structure: Japanese preschool children. Proceedings of the 14th International Congress of Phonetic Sciences, vol. 3, 2193-2196.

5. Yoneyama, K. & Johnson, K. (1999). An instance-based model of categorical perception in Japanese by native and non-native listeners: A case of segmental duration. Phonological Studies, 2, 11-18.

6. Otake, T. & Yoneyama, K. (1998). Phonological units in speech segmentation and phonological awareness. Proceedings of the International Conference on Spoken Language Processing 98, Vol. 5, 2179-2182.

7. Otake, T., Yoneyama, K., & Maki, H. (1998). Non-native listeners' representations of within-word structure. Proceedings of the 16th International Congress on Acoustics and the 135th Meeting of the Acoustical Society of America, Vol. 2, 2067-2068.

8. Yoneyama, K. & Johnson, K. (1998). An instance-based model of Japanese speech recognition by native and non-native listeners. Proceedings of the 16th International Congress on Acoustics and the 135th Meeting of the Acoustical Society of America, Vol. 3, 2977-2978.

9. Otake, T. & Yoneyama, K. (1996). Can a mora occur word-initially in Japanese? Proceedings of the 1996 International Conference on Spoken Language Processing, Philadelphia, vol. 4, 2454-2457.

10. Yoneyama, K. (1996). Segmentation strategies for spoken language recognition: Evidence from semi-bilingual Japanese speakers of English. Proceedings of the 1996 International Conference on Spoken Language Processing, Philadelphia, vol. 1, 454-457.


11. Otake, T. & Yoneyama, K. (1995). A moraic status and syllable structure in speech perception. Proceedings of the XIIIth International Congress of Phonetic Sciences, vol. 2, 686-689.

12. Otake, T. & Yoneyama, K. (1994). A moraic nasal and a syllable structure in Japanese. Proceedings of the 1994 International Conference on Spoken Language Processing in Yokohama, vol. 3, 1427-1430.

Technical Reports

1. Yoneyama, K. (1997). A cross-linguistic study of diphthongs in spoken word processing in Japanese and English. OSU Working Papers in Linguistics, 50, 163-175.

2. Yoneyama, K. (1995). Segmentation procedure by semi-bilingual speakers of Japanese and English. Dokkyo Working Papers in Linguistics, vol. 11, 67-107.

3. Otake, T. & Yoneyama, K. (1995). Recognition of a moraic nasal in different speech rates. Dokkyo Studies in Data Processing and Computer Science, 13, 23-32 (written in Japanese).

4. Otake, T. & Yoneyama, K. (1994). A geminate consonant and a syllable structure in Japanese. Dokkyo Studies in Data Processing and Computer Science, 12, 55-64 (written in Japanese).

FIELDS OF STUDY

Major Field: Linguistics
Specialization: Psycholinguistics, Phonetics

TABLE OF CONTENTS

ABSTRACT...... ii

DEDICATION...... iv

ACKNOWLEDGMENTS...... v

VITA...... vii

LIST OF TABLES...... xiii

LIST OF FIGURES...... xviii

LIST OF EQUATIONS...... xx

CHAPTERS

1. INTRODUCTION...... 1

1.1. Introduction ...... 1
1.2. Information Used for Lexical Access ...... 7
1.3. Mental Lexicon and Phonological Neighbors ...... 14
1.4. Testing neighborhood effects in Japanese: Overview ...... 21
1.5. Organization of the Dissertation ...... 24

2. STIMULI AND THEIR NEIGHBORHOODS...... 26

2.1. Introduction ...... 26
2.2. Japanese Mental Lexicon ...... 26
2.3. Defining Neighbors in Japanese ...... 29
2.3.1. The Segments Calculation ...... 30
2.3.2. The Segments + Pitch Calculation ...... 31
2.3.2.1. Similarity Judgments on Japanese Pitch-Accent Patterns ...... 34
2.3.2.1.1. Purpose ...... 34
2.3.2.1.2. Method ...... 35
2.3.2.1.2.1. Stimuli ...... 35
2.3.2.1.2.2. Participants ...... 37
2.3.2.1.2.3. Procedure ...... 37
2.3.2.1.3. Results ...... 38
2.3.2.1.3.1. Word Prosodic Similarity Based on Greenberg-Jenkins' rules ...... 41
2.3.2.1.4. Calculating the Segments + Pitch Calculation ...... 43
2.3.3. The Auditory Calculation ...... 46
2.3.4. Comparison of Three Neighborhood Calculations ...... 54
2.4. Target Words ...... 57
2.5. Participants ...... 68
2.6. Summary ...... 69

3. EXPERIMENT 1: AUDITORY NAMING...... 70

3.1. Introduction ...... 70
3.2. Methods ...... 71
3.3. Results ...... 72
3.4. Discussion ...... 87

4. EXPERIMENT 2: AUDITORY NAMING IN NOISE...... 90

4.1. Introduction ...... 90
4.2. Methods ...... 91
4.3. Results ...... 92
4.3.1. Naming Time Data ...... 93
4.3.2. Word Identification Data ...... 99
4.4. Discussion ...... 107

5. EXPERIMENT 3: SEMANTIC CATEGORIZATION EXPERIMENT...... 112

5.1. Introduction ...... 112
5.2. Methods ...... 114
5.2.1. Stimuli ...... 114
5.2.2. Participants ...... 116
5.2.3. Procedure ...... 117
5.3. Results ...... 118
5.3.1. An Evaluation of the Semantic Categorization Task in Japanese ...... 118
5.3.2. Semantic Categorization Data ...... 128
5.4. Discussion ...... 142

6. GENERAL DISCUSSION AND CONCLUSION...... 145

6.1. Introduction ...... 145
6.2. Summary of Results ...... 146


6.2.1. Other Effects ...... 146
6.2.2. Neighborhood Density Effects ...... 151
6.2.2.1. Processing Time Data ...... 151
6.2.2.2. Word Identification Data ...... 153
6.3. Proposal: A Model of Spoken-Word Recognition and Word Production ...... 154
6.3.1. Plaut & Kello (1999) ...... 156
6.3.2. A Model of Spoken-Word Recognition and Word Production ...... 159
6.3.3. The Current Findings in Terms of The Proposed Model ...... 164
6.3.3.1. Experiment 1: Auditory Naming ...... 165
6.3.3.2. Experiment 2: Auditory Naming in Noise ...... 175
6.3.3.3. Experiment 3: Semantic Categorization ...... 180
6.3.4. Previous Findings in Terms of The Proposed Model ...... 184
6.3.4.1. Auditory Naming Experiments with Word Targets in English (Luce & Pisoni, 1998; Vitevitch & Luce, 1999) ...... 185
6.3.4.2. Lexical Decision Experiment in Japanese (Amano & Kondo, 1999) ...... 187
6.3.4.3. Implication for Current Recognition Models ...... 190
6.4. Conclusions ...... 194

BIBLIOGRAPHY...... 198

APPENDICES...... 209

Appendix A: Alphabetic symbols used in the lexicon ...... 209
Appendix B: The 300 stimulus pairs used in a similarity judgment experiment ...... 214
Appendix C: 700 Target Words ...... 215
Appendix D: Statistics in Experiment 1 ...... 238
Appendix E: Statistics in Experiment 2 ...... 246
Appendix F: Similarities of Sounds in Noise: MDS Analyses ...... 262
Appendix G: Semantic Categories ...... 273
Appendix H: Reasons to Discard Eight Target Words From the Final Analysis in Experiment 3 ...... 304
Appendix I: Statistics in Experiment 3 ...... 306


LIST OF TABLES

Table 2.1: Contents of the NTT Database Series (Amano & Kondo, 1999; 2000) ...... 27

Table 2.2: The target, neighbors selected by the Segments calculation, and its operations ...... 31

Table 2.3: Twenty tonal patterns tested in Experiment 1 (0 = low pitch; 1 = unaccented high pitch; * = accented high pitch) ...... 36

Table 2.4: A target word and its four potential neighbors selected at the first stage with information calculating neighborhood density ...... 45

Table 2.5: Descriptive statistics of neighborhood density computed by three different neighborhood calculations...... 56

Table 2.6. Pearson correlation matrix of the three neighborhood calculations ...... 57

Table 2.7: Three representations of anago, 'conger eel', found in the Word Frequency Database (Volume 7, The NTT Database Series) ...... 59

Table 2.8: A summary of the uniqueness-point tabulations ...... 62

Table 2.9: The number of words beginning with ka, the total number of words in the lexicon, and the proportion of words beginning with ka in the lexicon ...... 63

Table 2.10: Pearson correlation matrix of the three neighborhood calculations and other factors ...... 67

Table 3.1: Basic model of the naming time data for fast namers, Experiment 1 ...... 74

Table 3.2: Models of the naming time data for fast namers. Experiment 1 ...... 76

Table 3.3: The number of responses before and after the offset of the 700 target words for fast namers...... 78

Table 3.4: Basic model of the naming time data for slow namers. Experiment 1 ...... 79

Table 3.5: Models of the naming time data for slow namers. Experiment 1 ...... 80


Table 3.6: The number of responses before and after the offset of the 700 target words for slow namers ...... 82

Table 3.7: A summary of the reliable effects from the regression models for the naming time data (fast namers and slow namers), Experiment 1. Effects in bold show the calculation that yielded the highest increase in R2 ...... 83

Table 4.1: Basic model of the naming time data for fast namers. Experiment 2 ...... 94

Table 4.2: Models of the naming time data for fast namers. Experiment 2 ...... 95

Table 4.3: Basic model of the naming time data for slow namers. Experiment 2 ...... 96

Table 4.4: Models of the naming time data for slow namers. Experiment 2 ...... 97

Table 4.5: A summary of the regression models for the naming time data (both fast namers and slow namers). Experiment 2 ...... 98

Table 4.6: Basic model of the word identification data for fast namers. Experiment 2 ...... 101

Table 4.7: Models of the word identification data for fast namers. Experiment 2 ...... 102

Table 4.8: Basic model of the word identification data for slow namers. Experiment 2 ...... 103

Table 4.9: Models of the word identification data for slow namers. Experiment 2 ...... 104

Table 4.10: A summary of the regression models for the word identification data (both fast namers and slow namers). Experiment 2 ...... 105

Table 5.1: A summary of correct responses for "yes” filler-word responses in terms of semantic categories ...... 121

Table 5.2: A summary of the accuracy data for the 700 target words ...... 125

Table 5.3: The discarded target words for the final analysis in the semantic categorization experiment ...... 127

Table 5.4: Basic model of the semantic categorization data. Experiment 3 ...... 129

Table 5.5: Models of the semantic categorization data. Experiment 3 ...... 130

Table 5.6: A summary of the regression model with two types of neighborhood density (facilitative and inhibitory). Experiment 3 ...... 132

Table 5.7: Categorization data of fast responders, Experiment 3 ...... 134

Table 5.8: Models of the semantic categorization data for slow responders. Experiment 3 ...... 135

Table 5.9: A summary of the regression models with two types of neighborhood density (facilitative and inhibitory) for fast responders and slow responders. Experiment 3 ...... 137

Table 6.1: Summaries of effects in three experiments. Processing time data (top) and accuracy data (bottom). (F = facilitative effect, I = inhibitory effect, H = higher accuracy, L = lower accuracy, N/A = not applicable) ...... 147

Table 6.2: Summary of the neighborhood density effects on processing times in three experiments. Effects in bold show the calculation that yielded the highest increase in R2 ...... 152

Table 6.3: Summary of the neighborhood density effect in the word identification data of Experiment 2 ...... 154

Table 6.4: Characteristics of three neighborhood density calculations ...... 167

Table 6.5: Relationships between the acoustic input and neighborhood density calculation ...... 168

Table D.1: Basic model for naming data (fast namers). Experiment 1 ...... 238

Table D.2: Basic model + Neighborhood density (Segments) for naming data (fast namers). Experiment 1...... 239

Table D.3: Basic model + Neighborhood density (Segments + Pitch) for naming data (fast namers). Experiment 1...... 240

Table D.4: Basic model + Neighborhood density (Auditory) for naming data (fast namers), Experiment 1...... 241

Table D.5: Basic model for naming data (slow namers). Experiment 1 ...... 242

Table D.6: Basic model + Neighborhood density (Segments) for naming data (slow namers). Experiment 1...... 243

Table D.7: Basic model + Neighborhood density (Segments + Pitch) for naming data (slow namers). Experiment 1 ...... 244

Table D.8: Basic model + Neighborhood density (Auditory) for naming data (slow namers), Experiment 1...... 245

Table E.1: Basic model for naming data (fast namers), Experiment 2 ...... 246

Table E.2: Basic model + Neighborhood density (Segments) for naming data (fast namers). Experiment 2 ...... 247


Table E.3: Basic model + Neighborhood density (Segments + Pitch) for naming data (fast namers), Experiment 2 ...... 248

Table E.4: Basic model + Neighborhood density (Auditory) for naming data (fast namers). Experiment 2 ...... 249

Table E.5: Basic model for naming data (slow namers). Experiment 2 ...... 250

Table E.6: Basic model + Neighborhood density (Segments) for naming data (slow namers). Experiment 2 ...... 251

Table E.7: Basic model + Neighborhood density (Segments + Pitch) for naming data (slow namers). Experiment 2 ...... 252

Table E.8: Basic model + Neighborhood density (Auditory) for naming data (slow namers). Experiment 2 ...... 253

Table E.9: Basic model for word identification data (fast namers), Experiment 2 ...... 254

Table E.10: Basic model + Neighborhood density (Segments) for word identification data (fast namers), Experiment 2 ...... 255

Table E.11: Basic model + Neighborhood density (Segments + Pitch) for word identification data (fast namers). Experiment 2 ...... 256

Table E.12: Basic model + Neighborhood density (Auditory) for word identification data (fast namers). Experiment 2...... 257

Table E.13: Basic model for word identification data (slow namers), Experiment 2 ...... 258

Table E.14: Basic model + Neighborhood density (Segments) for word identification data (slow namers). Experiment 2 ...... 259

Table E.15: Basic model + Neighborhood density (Segments + Pitch) for word identification data (slow namers), Experiment 2 ...... 260

Table E.16: Basic model + Neighborhood density (Auditory) for word identification (slow namers). Experiment 2 ...... 261

Table F.1: The mean number of errors of consonants and vowels in each of the two analyses ...... 267

Table F.2: The mean number of errors in terms of word positions ...... 267

Table F.3: Proportions of responses for vowels ...... 268

Table F.4: A similarity matrix for vowels ...... 268


Table F.5: Proportions of responses for consonants ...... 270

Table F.6: A similarity matrix for consonants ...... 271

Table I.1: Basic model for semantic categorization data, Experiment 3 ...... 305

Table I.2: Basic model + Neighborhood density (Segments) for semantic categorization data, Experiment 3 ...... 306

Table I.3: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data, Experiment 3 ...... 307

Table I.4: Basic model + Neighborhood density (Auditory) for semantic categorization data, Experiment 3 ...... 308

Table I.5: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data, Experiment 3 ...... 309

Table I.6: Basic model for semantic categorization data (fast responders), Experiment 3 ...... 310

Table I.7: Basic model + Neighborhood density (Segments) for semantic categorization data (fast responders), Experiment 3 ...... 311

Table I.8: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data (fast responders), Experiment 3 ...... 312

Table I.9: Basic model + Neighborhood density (Auditory) for semantic categorization data (fast responders), Experiment 3 ...... 313

Table I.10: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data (fast responders), Experiment 3 ...... 314

Table I.11: Basic model for semantic categorization data (slow responders), Experiment 3 ...... 315

Table I.12: Basic model + Neighborhood density (Segments) for semantic categorization data (slow responders), Experiment 3 ...... 316

Table I.13: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data (slow responders), Experiment 3 ...... 317

Table I.14: Basic model + Neighborhood density (Auditory) for semantic categorization data (slow responders), Experiment 3 ...... 318

Table I.15: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data (slow responders), Experiment 3 ...... 319


LIST OF FIGURES

Figure 2.1: Fundamental frequency (F0) contours of ana, 'hole' (top) and a'na, 'announcer' (bottom) ...... 33

Figure 2.2: The mean similarity rating in the order AB as a function of the mean similarity rating in the order BA. The numbered points (1 to 4) in the figure represent 011* vs. 0111, 01* vs. 011, 0* vs. 01, and 0111* vs. 01111, respectively ...... 40

Figure 2.3: The number of operations (substitutions, deletions or insertions) as a function of the similarity ratings (left), the number of operations as a function of the median similarity rating with a logarithmic function (right) ...... 42

Figure 2.4: Frequency counts as a function of operations for pitch-pattern responses in the auditory naming in noise experiment (Chapter 4) ...... 44

Figure 2.5: Examples of LAFS and X-MOD representations of “Cat.” ...... 47

Figure 2.6: Quantized vectors of the exemplars (kodomo and domori) ...... 48

Figure 2.7: A neighbor-nonneighbor distinction in a similarity space in the Auditory calculation ...... 50

Figure 2.8: The target word, kodomo , ‘child’ and its seven most similar neighbors in the Auditory calculation ...... 52

Figure 2.9: Frequency counts of target words as a function of neighborhood density. Neighborhood density by the Segments calculation (top), neighborhood density by the Segments + Pitch calculation (middle), and neighborhood density by the Auditory calculation (bottom) ...... 55

Figure 2.10: Distribution of word frequency of the target words ...... 60

Figure 2.11: Frequency counts of the target words as a function of frequency of the first mora. Words beginning with a fricative (top left), words beginning with a nasal (top right), and words beginning with a stop (bottom left) ...... 64

Figure 2.12: Distribution of the durations of the target words ...... 65


Figure 5.1: The categorization times as a function of the number of the segments from the word-initial point for a given word to be unique from the words in the lexicon (UP) ...... 141

Figure 6.1: A model of speech comprehension and production by Plaut & Kello (1999)...... 158

Figure 6.2: A model of spoken-word recognition and word production ...... 161

Figure 6.3: Participants’ performance in Experiment 1 (Auditory naming) in which they started naming the words after they had heard only part of the word by exploiting only segmental information ...... 171

Figure 6.4: Participants' performance in Experiment 1 (Auditory naming) in which the participants started naming the words after they had partially heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information ...... 172

Figure 6.5: Participants' performance in Experiment 1 (Auditory naming) in which the participants started naming the words after they had completely heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information ...... 174

Figure 6.6: Participants' performance in Experiment 2 (Auditory naming in noise) in which the participants started naming the words after they had completely heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information ...... 176

Figure 6.7: Participants’ performance in Experiment 3 (Semantic categorization) 181

Figure F.1: MDS for vowels. Dimensions 1 and 2 represent vowel height (F1) and backness (F2), respectively ...... 269

Figure F.2: MDS for consonants. SS = [f], C = [ts], CC = [c], KY = [kʲ], Z = [ʒ], Y = [j]. Dimensions 1 and 2 represent [±voice] and [±sonorant], respectively ...... 272


LIST OF EQUATIONS

Equations 2.1 & 2.2. Equations used in the Auditory calculation for neighborhood density ...... 49


CHAPTER 1

INTRODUCTION

1.1. Introduction

This dissertation investigates two aspects of lexical access in Japanese word recognition.

The first aspect to be investigated is the kind of word representation used for lexical access. Two types of word recognition models have been proposed to account for word representation. Some models propose that words are represented in the lexicon in the form of abstract phonological structures (Grossberg, Boardman, & Cohen, 1997; McClelland & Elman, 1986; Norris, 1994; Norris, McQueen, & Cutler, 2000). In these models, the acoustic speech stream is coded as a normalized, language-specific phonological representation (which may consist of features, phonemes, syllables, or a combination of these). This prelexical phonological representation is then matched against lexical representations.

Other models assume that word forms are stored in the brain in the form of detailed acoustic traces (Goldinger, 1992, 1996; Klatt, 1979, 1981; Johnson, 1997a, 1997b; Pisoni, 1997). Word recognition involves a "direct" comparison between memorized acoustic patterns and the pattern elicited by the current acoustic signal. Each word is associated with many acoustic tokens, and word recognition consists of finding the nearest match in a vast collection of word forms. Johnson (1997a, 1997b) proposed an exemplar-based model of speech perception in which words are recognized based on an auditory representation. Experimental evidence supporting this view has shown that, in word recognition tasks, participants are very sensitive to nonlinguistic surface form such as the speaker's voice (Goldinger, 1996; Schacter & Church, 1992; also see Pisoni, 1997).
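To make the "nearest match against stored traces" idea concrete, here is a minimal sketch of exemplar-style recognition under toy assumptions; it is an illustration only, not the model of Johnson (1997a, 1997b) or the calculation used later in this dissertation. The feature vectors, the exponential similarity function, and the two-word lexicon are all hypothetical.

```python
import numpy as np

# Toy exemplar lexicon: each word is linked to several stored "acoustic" traces,
# here stood in for by short feature vectors (purely hypothetical values).
EXEMPLARS = {
    "ana":   [np.array([0.9, 0.1, 0.2]), np.array([0.8, 0.2, 0.3])],
    "anago": [np.array([0.7, 0.6, 0.1]), np.array([0.6, 0.7, 0.2])],
}

def similarity(x, y, sensitivity=2.0):
    # Exponential function of distance, a common similarity choice in exemplar models.
    return np.exp(-sensitivity * np.linalg.norm(x - y))

def recognize(token):
    # Sum the similarity of the incoming token to every stored exemplar of each word;
    # the word with the largest summed activation is the recognized word.
    activation = {
        word: sum(similarity(token, ex) for ex in traces)
        for word, traces in EXEMPLARS.items()
    }
    return max(activation, key=activation.get), activation

word, scores = recognize(np.array([0.85, 0.15, 0.25]))
print(word, scores)
```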

However, recent studies have shown that listeners might have both abstract and episodic representations. Luce and Lyons (1998) showed that English listeners exploit both abstract and episodic representations of words in identification and memory tasks as well as in a lexical decision task. This view is also supported by Pallier, Colome, and Sebastian-Galles (2001), who claim that Spanish-Catalan bilinguals exploit both abstract and episodic representations. This dissertation aims to investigate which word representation needs to be assumed in the lexicon and is involved in lexical access in Japanese.

The second aspect investigated in this dissertation is lexical competition in Japanese auditory word recognition. Adults actively use a lexicon that was built in infancy based on surface regularities in the input. It is assumed that lexical entries are organized into a network, and that these entries compete with each other during lexical access ("word competition"). There is now a large body of evidence supporting word competition models (Gow & Gordon, 1995; McQueen, Norris, & Cutler, 1994; Norris, McQueen, & Cutler, 1995; Shillcock, 1990; Tabossi, Burani, & Scott, 1995; Vitevitch & Luce, 1999; Vroomen & de Gelder, 1995a, 1997; Wallace, Stewart, & Malone, 1995b; Wallace, Stewart, Shaffer, & Mellor, 1995; Zwitserlood, 1989; Zwitserlood & Schriefers, 1995). Most of these studies have been conducted in English, so this hypothesis also needs to be tested in other languages such as Japanese.

These two questions are tested simultaneously by exploring neighborhood density effects in Japanese auditory word recognition. Neighborhood density is a measure of the lexical competition effect; it follows that testing neighborhood density effects in Japanese also tests lexical competition. In order to define neighbors, explicit assumptions need to be made about the word representation. For example, English neighborhood density is based on word similarity, but the word forms compared are assumed to be raw sequences of segments. In English, stress pairs with no marked difference in vowel quality (such as differ and defer) are much rarer than in Dutch, so to a large extent stress is encoded in the choice of vowel symbols in the Hoosier Mental Lexicon (HML), an online database of 20,000 English words (Pisoni, Nusbaum, Luce, & Slowiaczek, 1985). Also, in both English and Dutch, pitch shapes (accent types, etc.) are associated with stress and are specified by pragmatic functions. Therefore, the pitch shape per se is not a lexical property in the same way as the accent pattern is in Japanese. In Japanese, pitch accent patterns play an important role in word recognition, and it is unclear whether the word representation assumed in English neighborhood studies is also appropriate for Japanese. Therefore, in order to answer these questions, neighborhood density experiments are conducted in this dissertation.

Neighborhood density can be defined broadly as the number of words that are similar in sound to a specific word, together with the way in which these similar-sounding words affect recognition of the target word (Pisoni et al., 1995; Luce, 1986a; Goldinger, Luce, & Pisoni, 1989; Luce, Pisoni, & Goldinger, 1990; Luce & Pisoni, 1998; Vitevitch & Luce, 1998; Vitevitch & Luce, 1999; Luce, Goldinger, Auer, & Vitevitch, 2000; Luce & Large, 2001; Amano & Kondo, 1999). Neighborhood density shows inhibition: words that are similar to many other words (dense neighborhoods) are recognized more slowly than words in sparse neighborhoods (Luce & Pisoni, 1998; Vitevitch & Luce, 1999; Amano & Kondo, 1999). However, neighborhood density sometimes shows facilitation: nonwords that are similar to words in dense neighborhoods are recognized more quickly than ones in sparse neighborhoods (Vitevitch & Luce, 1998, 1999). The neighborhood facilitative effect is observed not only among adults but also among infants and children (Charles-Luce & Luce, 1995; Metsala, 1997; Pitrat, Logan, Cockell, & Gutteridge, 1995; Garlock, Walley, & Metsala, 2001).

Neighborhood density and probabilistic phonotactics show a strong positive correlation in the language between the number of overlapping words and segmental frequency. Typically, as the number of overlapping words increases, the frequencies of the segments that make up those overlapping words also increase. Based on this fact, a neighborhood facilitative effect is interpreted as an effect of probabilistic phonotactics.
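As an illustration of what "probabilistic phonotactics" can mean in practice, the following is a minimal sketch, under toy assumptions, of position-specific segment probabilities estimated from a small hypothetical lexicon and summed over a word's segments. It is not the phonotactic measure used in this dissertation, and single characters stand in for segments.

```python
from collections import defaultdict

# Toy lexicon of hypothetically transcribed words (characters stand in for segments).
LEXICON = ["kodomo", "kokoro", "tomodachi", "domori", "kora"]

def positional_segment_probs(lexicon):
    # counts[(position, segment)] / number of words long enough to have that position:
    # one simple form of positional phonotactic probability.
    counts = defaultdict(int)
    totals = defaultdict(int)
    for word in lexicon:
        for pos, seg in enumerate(word):
            counts[(pos, seg)] += 1
            totals[pos] += 1
    return {key: counts[key] / totals[key[0]] for key in counts}

def phonotactic_score(word, probs):
    # Sum the positional probabilities of a word's segments (unseen segments score 0).
    return sum(probs.get((pos, seg), 0.0) for pos, seg in enumerate(word))

probs = positional_segment_probs(LEXICON)
print(phonotactic_score("kodomo", probs), phonotactic_score("pika", probs))
```

A word built from frequent, well-placed segments receives a higher score than a word built from rare ones, which is why dense neighborhoods and high phonotactic probability tend to go together.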

Of course, infants do not have a lexicon yet, and it is known that symbolic representations are acquired as a necessity for production (Jusczyk, 1993; Plaut & Kello, 1999). Jusczyk (1993) has proposed that infants acquire auditory exemplars in the lexicon. Perhaps the two different types of lexical representations (symbolic vs. cochleagram) represent different potential "levels" for the child as well. Perhaps infants' lexical representations are based on auditory exemplars, which accounts for their sensitivity to sound frequency, whereas the lexical representations of children and adults, who have already established a production path from semantic space to articulation, are based on symbolic representations.

However, several questions remain unsolved in this research. The first question is whether the neighborhood density effect universally affects processing times for lexical access. The effect of neighbors has been confirmed in English, but not in many other languages. Amano and Kondo (1999) tested neighborhood density effects in Japanese and found that the effects were significant in the accuracy of a word identification in noise experiment (an off-line task) but not in the processing times of a lexical decision experiment. The second question is exactly how the similarity of neighbors should be calculated. Should we calculate neighborhoods simply in terms of words that differ by one phoneme, as in the Greenberg and Jenkins (1964) calculation? Does subphonemic auditory/acoustic similarity determine a word's neighbors? Also, what role does prosody play? The third question is closely related to the second: how are words stored in the mental lexicon? Are words in the mental lexicon highly detailed episodic representations, as claimed by Goldinger (1989, 1996, 1998) and Johnson (1997a, 1997b), or does the lexicon consist of highly abstract phonological representations, as linguists generally assume? Or can we assume that words have both representations, as claimed by Luce and Pisoni (1998) and Pallier and his colleagues (2001)? Exploring the definition of neighbors in Japanese word recognition should lead to a better understanding of the issues of word representation and word competition.
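For concreteness, the sketch below illustrates the one-phoneme neighbor definition mentioned above (Greenberg & Jenkins, 1964): a word's neighbors are the lexicon entries formed by a single phoneme substitution, deletion, or addition. The toy lexicon and one-character phoneme symbols are assumptions for illustration; this is not the calculation implemented in Chapter 2.

```python
def is_neighbor(word, candidate):
    """True if candidate differs from word by exactly one phoneme
    substitution, deletion, or addition (Greenberg & Jenkins-style)."""
    if word == candidate:
        return False
    a, b = list(word), list(candidate)
    if len(a) == len(b):
        # Substitution: exactly one mismatching position.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) == 1:
        # Deletion/addition: removing one symbol from the longer form yields the shorter one.
        longer, shorter = (a, b) if len(a) > len(b) else (b, a)
        return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))
    return False

def neighborhood_density(word, lexicon):
    # Number of lexicon entries within one phoneme edit of the target word.
    return sum(is_neighbor(word, entry) for entry in lexicon)

# Toy, hypothetical lexicon; entries are strings of one-character phoneme symbols.
LEXICON = ["kame", "kama", "kome", "ame", "kamae", "sake"]
print(neighborhood_density("kame", LEXICON))  # kama, kome, ame, kamae -> 4
```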

This dissertation reports the results of auditory naming, word identification in noise, and semantic categorization experiments designed to explicitly compare phoneme-based neighborhood definitions with more fine-grained, acoustically based neighborhood definitions, and to compare neighborhoods defined with and without prosodic structure. Acoustic similarity is calculated from the audio stimuli available in the NTT database (Amano & Kondo, 1999, 2000) using a new methodology developed in this dissertation. Using various ways to calculate neighborhood density makes it possible to determine which level of lexical representation (an acoustic-auditory representation or an abstract phonemic representation) is used to calculate phonological similarity within the lexicon. The results show that different lexical neighborhoods are operative at both lexical and sublexical levels of word recognition.

The results demonstrate that phonological similarity within the lexicon seems to be calculated based on the acoustic-auditory representation rather than on an abstract phonemic representation in all of the experiments. The implications of the results for current word recognition theories will also be discussed.

1.2. Information Used for Lexical Access

One of the important tasks in understanding speech is finding out the relationship between speech and linguistic structure. Essentially, language users must learn a relationship among three spaces: the acoustic input, articulation, and semantics. These mappings are not at all direct; a phonological (cognitive) representation emerges in order to mediate these three spaces. For production and comprehension, language users have to learn how to map the acoustic properties of speech (such as fundamental frequency, intensity, and duration) onto the (more abstract) linguistic structure stored in the mental lexicon acquired in infancy through comprehension, and how to convert the linguistic structure to the acoustic speech signal through articulation. I believe that "phonetics" is the study of the connection between the phonological (cognitive) structure of the language and the physiological properties of speech.

The terms pitch, loudness, length, and timbre are often used as auditory correlates of fundamental frequency, intensity, duration, and spectral characteristics, respectively. Such impressions are evidently determined not only by the physical characteristics of the speech signal but also by language users' knowledge. These impressions somehow straddle the boundary between the physical world and language users' abstract (cognitive) representations of that world. This means that every language user has to figure out how to interpret the physical aspects of speech in a way that fits into their language system in order to understand speech (also see Cutler, 1997).

In order to understand lexical access, we need to understand how the mental lexicon is developed in infancy, what kinds of word representation need to be assumed, how adult listeners map acoustic information onto the words stored in the mental lexicon, and how adult speakers map from words stored in the mental lexicon to articulatory plans.

Once language users acquire the lexicon, information about phonological and morphological patterns may be available for lexical access. The question then is which kinds of information are exploited for lexical access by adults, who have established lexicons and who can rely on the shapes of words as a whole (i.e., who are not "pre-lexical" anymore). Cutler (1997) claims that adult listeners use many kinds of phonological information acquired in a pre-lexical period during infancy, suggesting that phonological information is in effect prior to morphological information. This view is also confirmed by Smith and Pitt (1999), who found that the formation of syllabic structure is guided by phonology prior to morphology. Segmentation studies have shown that different kinds of phonological information are used for lexical access.

Even after acquiring the lexicon, studies suggest that listeners use multiple phonological cues for lexical access. The rhythmic structure of a language is a strong cue for adults. Across many languages it appears that listeners exploit metrical structure to locate word boundaries in speech, although these boundaries can be determined in a highly language-specific way. Several studies using syllable monitoring experiments suggest that French listeners segment the incoming speech signal into syllabic units (Cutler, Mehler, Norris, & Segui, 1986; Cutler, Mehler, Norris, & Segui, 1992; Mehler, Dommergues, Frauenfelder, & Segui, 1981; Otake, Hatano, Cutler, & Mehler, 1993). French listeners segment speech into syllables even when the input is in a language other than French, such as English (Cutler et al., 1986) or Japanese (Otake et al., 1993). These syllable-based segmentation results are also confirmed by studies on bilingual listeners, which suggest that French-dominant bilinguals clearly exhibited syllabic segmentation when they listened to French, whereas English-dominant bilinguals did not (Cutler et al., 1992). The syllabic effect in French was also replicated in a study with a phoneme-induction paradigm, a variant of the phoneme-monitoring task (Pallier, Sebastian-Galles, Felguera, Christopher, & Mehler, 1993). The syllabic effect has been found with speakers of Spanish and Catalan in syllable-monitoring experiments, though not in all cases (Bradley, Sanchez-Casas, & Garcia-Albea, 1993; Sebastian-Galles, Dupoux, Segui, & Mehler, 1992).

Many studies have shown that language users in English and Dutch employ metrical stress information (alternations of strong and weak syllables that are based on vowel quality (full vs. reduced vowels); Fear, Cutler, & Butterfield, 1995). Cutler and Norris (1988) found that stress alternation (strong and weak syllables) was keyed to word segmentation in English word-spotting studies. They used a word-spotting task in which listeners were asked to press a button as soon as they heard a real word embedded at the beginning of a pseudoword. The results showed that the detection times for mint in mintayf and mintef were significantly different, whereas the detection times for thin in thintayf and thintef were not significantly different. They proposed that the CVCC target (mint) in mintayf was divided across two segmentation units into min_t, so that listeners had a problem detecting the target, whereas in the case of mintef, the CVCC target was not split because the "e" was a reduced vowel. However, listeners did not have any problem detecting CVC targets (thin) because the segmentation of the target in thintayf and thintef should not be any different. Based on these results, Cutler and Norris proposed the Metrical Segmentation Strategy (MSS), which is based on the strong/weak syllable alternation for speech segmentation in English. The MSS effect was also found in word-spotting studies by McQueen and his colleagues (1994) and Norris and his colleagues (1995). Cutler and her colleagues further found that English-dominant bilinguals use a stress-based segmentation strategy (Cutler et al., 1992). This MSS effect has also been demonstrated in both spontaneous and experimentally elicited misperceptions in English (Cutler & Butterfield, 1992), and in word blending experiments (Cutler & Young, 1994). Further, English listeners' sensitivity to predominant stress patterns (strong/weak) is also supported by computational analyses of the English lexicon and corpus (Cutler & Carter, 1987). Smith and Pitt (submitted) further replicated the MSS in word spotting experiments. The MSS in Dutch is also found in cross-modal identity priming experiments (Vroomen & de Gelder, 1995) and in a laboratory-induced misperception experiment and a word-spotting experiment (Vroomen, van Zon, & de Gelder, 1996).

Japanese listeners have been shown to segment speech into morae in studies on Japanese auditory word recognition using syllable monitoring (Otake et al., 1993; Otake, Hatano, & Yoneyama, 1996a), phoneme monitoring (Cutler & Otake, 1994; Otake, Yoneyama, Cutler, & van der Lugt, 1996b), word blending (Kubozono, 1995), phoneme induction (Yoneyama & Pitt, 1999), and word spotting (McQueen, Otake, & Cutler, 2001).

Adult studies have shown that adults also employ the statistics of the language input for word segmentation. Saffran, Newport, and Aslin (1996) and Saffran, Newport, Aslin, and Barrueco (1997) exposed adult English-speaking listeners (directly or indirectly) to an artificial language in which the only cues available for segmenting and learning the words of the language were the transitional probabilities between syllables. Pitt and McQueen (1998) claim that compensation for coarticulation, which is used as a strong piece of evidence in support of interactive models of speech perception (like TRACE), can be an effect of the local transitional probability of segments. McQueen (1998) and van der Lugt (2001) both demonstrated through word-spotting experiments that Dutch listeners use phonotactic cues to help solve the segmentation problem. Phonotactics is used by listeners to process phonologically illegal sequences in English (Pitt, 1998) and in Japanese (Dupoux, Kakehi, Hirose, Pallier, & Mehler, 1999; Dupoux, Pallier, Kakehi, & Mehler, 2001).

As we have seen here, adults are sensitive to different levels of statistical structure in the input language. However, other types of phonological information are also used by adult listeners. Syllables are used as segmentation units in French, Spanish, and Catalan, as discussed above. All syllable monitoring experiments with English listeners so far have failed to show this effect (Cutler et al., 1986; Bradley et al., 1993), regardless of whether targets have been presented visually or auditorily. Even with foreign-language input in which syllable boundaries are clear, experiments have not shown this effect (Cutler et al., 1986; Otake et al., 1993). However, when using methodologies other than a syllable monitoring task, the syllabic effect has been found in English. Bruck, Treiman, and Caravolas (1995) found that listeners were able to decide whether two nonsense words share sounds more quickly when the nonsense words shared a syllable (e.g., [kipæst] and [kipbeld]) than when they did not (e.g., [flingil] and [flikboz]), suggesting that syllabified representations of the nonwords may be used in a comparison task, even in English. Finney, Protopapas, and Eimas (1996) showed that English listeners can use syllabic information to cue them to the location of phoneme targets in a phoneme induction paradigm (a variant of the phoneme detection task) that showed a syllabic effect in French (Pallier et al., 1993), although the syllabic effect was not observed in English words with strong first syllables. Pitt, Smith, and Klein (1998) further conducted more controlled induction experiments with a baseline condition in which no induction is manipulated. Unlike in Finney et al. (1996), a syllabic effect appeared even in words with strong first syllables as well as in nonsense words. Further, Smith and Pitt (submitted) showed that the information used to determine syllable boundaries (vowel length, lexical stress, and phone class) was effective in determining word boundaries. The syllable effect is also reported in Dutch using syllable-monitoring experiments (Zwitserlood, Schriefers, Lahiri, & van Donselaar, 1993). This effect was clearly shown in the unambiguous as well as in the ambisyllabic cases.

Listeners also rely on tonal information. Listeners exploit accent in Finnish (Vroomen, Tuomainen, & de Gelder, 1998) and in Spanish (Sebastian-Galles et al., 1992; Sebastian-Galles, 1996). Although Cutler (1986) claimed that lexical stress does not contribute to lexical access in English, a recent study by Smith and Pitt (submitted) reported that American-English listeners exploit lexical stress for segmentation. Lexical pitch accents are involved in auditory word recognition in Japanese (Sekiguchi & Nakajima, 1999; Cutler & Otake, 1999; Otake & Cutler, 1999). Lexical tones play an important role in Cantonese and Mandarin Chinese (Cutler & Chen, 1997; Ye & Connine, 1999).

Language-specific phonological information such as vowel harmony is effective in Finnish (Suomi, McQueen, & Cutler, 1997; Vroomen & de Gelder, 1998). Further, adult listeners use many different cues that are related to physical characteristics of the speech signal across languages: silence (Norris et al., 1997); allophonic cues such as aspiration of word-initial stops in English (Lehiste, 1960; Nakatani & Dukes, 1977); the duration of segments or syllables (Beckman & Edwards, 1990; Gow & Gordon, 1995; Klatt, 1974, 1975; Lehiste, 1972; Oller, 1973; Quené, 1992, 1993; Saffran, Newport, & Aslin, 1996; Smith & Pitt, submitted); and fundamental frequency movement (Vroomen et al., 1998; Hasegawa & Hata, 1992).

In summary, adults use multiple cues for lexical access. Listeners are sensitive to all acoustic information relevant to the language's phonology (Cutler, 1997). Based on this general observation, Norris, McQueen, Cutler, and Butterfield (1997) proposed the Possible Word Constraint (PWC), according to which listeners use all possible phonological information in order to find the appropriate word boundaries for lexical access. As a result, they claimed that listeners do not segment fapple into f and apple because f is not a possible word in English. The PWC has been confirmed in other languages such as Japanese (McQueen et al., 2001). Therefore, McQueen, Cutler, Butterfield, and Kearns (2001) claimed that the PWC is a universal constraint, based on the findings explained above.

1.3. M e n t a l L e x ic o n a n d P honological N e ig h b o r s

This section explores the role of the mental lexicon in lexical access. In spoken-language recognition, adult listeners actively use the lexicon for lexical access. Lexical entries are organized into a network, and compete with each other during access (“word competition”). There is now a large body of evidence supporting the claim that words compete for lexical access by adults (Gow & Gordon, 1995; McQueen et al., 1994; Norris et al., 1995; Shillcock, 1990; Tabossi, Burani, & Scott, 1995; Vitevitch & Luce, 1999; Vroomen & de Gelder, 1995, 1997; Wallace, Stewart, & Malone, 1995; Wallace, Stewart, Shaffer, & Mellor, 1995; Zwitserlood, 1989; Zwitserlood & Schriefers, 1995). At the same time, high-probability phonotactics (the frequency of sounds or sound sequences within words) at the prelexical level facilitates processing, as explained above. These effects are predicted by activation-competition models such as MERGE (Norris et al., 2000), TRACE (McClelland & Elman, 1986), PARSYN (Luce et al., 2000), Shortlist (Norris, 1994), and ARTPHONE (Grossberg et al., 1997). Thus, activation-competition is central to current word recognition models.


Although many word recognition models do not provide information about how words are organized in the lexicon, the Neighborhood Activation Model (NAM; Luce, 1986a; Luce, Pisoni, & Goldinger, 1990; Luce & Pisoni, 1998) as well as PARSYN (a connectionist model based on NAM; Luce et al., 2000) clearly addresses the structure of the lexicon: the memory stored for the phonological forms of words is organized in terms of sound similarity. The number of similar words (“neighbors”) shows an inhibitory effect: words with many neighbors are recognized more slowly than words with few neighbors. Structural relations among words are measured by their neighborhood size (“neighborhood density”). For a given word in the lexicon, the word’s neighborhood size is the number of words in the lexicon that contain sounds similar to that word. A widely used calculation of neighborhood size is based on an algorithm proposed by Greenberg and Jenkins (1964). A neighborhood density effect has been reported in many studies (Pisoni et al., 1995; Luce, 1986a; Goldinger et al., 1989; Luce, Pisoni, & Goldinger, 1990; Luce & Pisoni, 1998; Vitevitch & Luce, 1998, 1999; Luce et al., 2000).

“Phonological similarity” in English is calculated based on form similarity. Four different neighborhood calculations have been used. The first calculation is based on experimentally derived phoneme confusability (Luce, 1986a; Luce & Pisoni, 1998). This calculation is based on R. D. Luce’s general biased choice rule (R. D. Luce, 1959).

The second calculation determines neighbors in terms of the number of shared phonemes (Luce, 1986a; Luce & Pisoni, 1998). In this calculation, neighbors are words that differ from one another by a single phoneme addition, deletion, or substitution in any position (Greenberg & Jenkins, 1964). The number of such neighbors is the neighborhood density of the target item. Most studies of neighborhood effects have used this simple notion of neighborhood density (e.g., Charles-Luce & Luce, 1995; Metsala, 1997). Unlike the first calculation, this definition has a sharp cutoff, simply ignoring all words outside the single-phoneme edit distance. Furthermore, this definition does not take similarity between phonemes into consideration. For example, replacing /b/ with /p/ (a change involving only voicing) yields a neighbor just the same as does replacing /b/ with /s/ (a change involving place and manner of articulation in addition to voicing). Also, phoneme insertion could change word prosody (such as dog vs. doggy), but this aspect is not considered. Although it is widely recognized as a very rough approximation, the definition has been surprisingly successful in neighborhood studies in English.
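To make the one-phoneme edit criterion concrete, the following is a minimal sketch in Python (not code from any of the studies cited above); the function name and the toy phoneme sequences are purely illustrative.

    def is_gj_neighbor(word_a, word_b):
        """True if word_b differs from word_a by exactly one phoneme
        substitution, deletion, or addition in any position
        (the Greenberg & Jenkins, 1964, criterion)."""
        a, b = list(word_a), list(word_b)
        if a == b:
            return False                      # identical forms are not counted as neighbors
        if len(a) == len(b):                  # candidate single substitution
            return sum(x != y for x, y in zip(a, b)) == 1
        if abs(len(a) - len(b)) != 1:         # lengths differing by more than one cannot match
            return False
        longer, shorter = (a, b) if len(a) > len(b) else (b, a)
        # single deletion/addition: removing one symbol must yield the shorter form
        return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))

    # illustrative English CVC forms given as phoneme lists
    print(is_gj_neighbor(["b", "ae", "t"], ["p", "ae", "t"]))   # True  (one substitution)
    print(is_gj_neighbor(["b", "ae", "t"], ["ae", "t"]))        # True  (one deletion)
    print(is_gj_neighbor(["b", "ae", "t"], ["s", "ae", "d"]))   # False (two substitutions)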

The third calculation is based on the percentage of phoneme matching between words. Frisch, Large, and Pisoni (2000) expanded the neighborhood definition based on one-phoneme edit distance for CVC words to longer words by basing their calculation of similarity on the fraction of shared phonemes in a word. For example, a proportional change of 1/3 would be equivalent to a single phoneme change when applied to CVC words. This means that neighbors of CVC words should share 66% of the phonemes within the word. This phoneme-matching percentage (66%) is then also used for multisyllabic words.

The final calculation is based on distinctive feature lattice distance (Frisch, 1996). Bailey and Hahn (2001) proposed a “General Neighborhood Model” (GNM) based on this measure of word similarity. The GNM is an adaptation of the Generalized Context Model (GCM; Nosofsky, 1986) of classification based on similarity of exemplars. In the GCM, words in the lexicon are considered as exemplars and are mapped onto a psychological space. In this model, all but the target word are considered as neighbors that vary along a continuous space of similarity. Rather than making a sharp neighbor-nonneighbor distinction, the model treats all the words in the lexicon as neighbors to some degree. The model calculates the psychological distances between individual items by a standard edit distance metric, with the relative cost of substituting one phoneme for another assessed by the natural class lattice distance metric (Frisch, 1996). GNM neighborhood similarity is similar in spirit to the neighborhood confusability term in Luce (1986a) but differs in using an exponential transformation of psychological distances instead of confusion probabilities.

These neighborhood calculations can also be weighted by lexical frequency. For example, Luce and Pisoni (1998) used a neighborhood density calculation based on R. D. Luce’s (1959) choice rule that weights similarity by the frequencies of target words and neighbors. Vitevitch and Luce (1998, 1999) modified the neighborhood calculation used in Luce and Pisoni (1998) so that the overall frequency-weighted neighborhood probability (the first definition) is simplified to the sum of the frequency-weighted neighbor word probabilities. In this calculation, neighbors are first selected by Greenberg-Jenkins’ phoneme substitution, deletion, and insertion rules. Then, the frequencies of the neighbors are summed in order to obtain the frequency-weighted neighborhood density. The above two frequency-weighted calculations are very similar. A crucial difference between the two, however, is the assumed distribution of similarity. The former considers all the words but the target word in the calculation of the similarity space, whereas the latter considers only the words within one phoneme edit distance from the target word as neighbors, resulting in a discrete similarity space. A neighborhood calculation based on distinctive feature lattice distance (Frisch, 1996) can also be weighted by word frequency. For example, Bailey and Hahn’s (2001) General Neighborhood Model is a neighborhood calculation that is sensitive to the frequency of the neighbors.
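As a rough illustration of the frequency weighting just described, the sketch below (Python; hypothetical code, not taken from Vitevitch and Luce or Luce and Pisoni) sums the frequencies of neighbors that have already been selected by a one-phoneme-edit criterion. The log scaling of the counts is an assumption for illustration; studies differ in whether raw or log frequencies are used.

    from math import log10

    def frequency_weighted_density(neighbors, frequencies):
        """Sum log-scaled frequencies of a target word's neighbors.
        `neighbors` is a list of forms already selected as one-edit
        neighbors; `frequencies` maps each form to its raw count."""
        return sum(log10(frequencies[w] + 1) for w in neighbors)   # +1 avoids log10(0)

    # toy example: hypothetical neighbors of an English CVC target
    counts = {"pat": 120, "bad": 15, "at": 40, "bait": 8}
    print(frequency_weighted_density(["pat", "bad", "at", "bait"], counts))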

This survey of neighborhood calculations reveals two tendencies. First, the English neighborhood definitions are all based on phonemes as the basic units. Luce and Pisoni (1998) predict word similarity from sound confusion matrices for CVC words. The widely used neighborhood calculation based on Greenberg-Jenkins’ rules counts the phonemes that words have in common. Moreover, the psychological distances between individual items are calculated by a standard one-phoneme edit distance metric, with the relative cost of substituting one phone for another assessed by the natural class lattice distance metric (Frisch, 1996). This is not surprising, however, since most adult word recognition models assume that the phoneme is the basic unit of processing.

Second, the calculations in some studies may not consider word-level prosody. One main reason for this is that most neighborhood studies used CVC tokens as stimulus words, so they did not need to consider the effect of lexical stress on word similarity. For example, these neighborhood calculations predict that FORbear and forBEAR are neighbors in English. In keeping with this, Cutler (1986) claimed that lexical prosody observed in words such as FORbear and forBEAR does not constrain lexical access. However, neighborhood studies using the Hoosier Mental Lexicon (HML) do consider word-level prosody, because the transcriptions in the HML distinguish strong vs. weak vowels.

With these English neighborhood calculations as a reference, the definition of neighborhoods in Japanese is explored. Three neighborhood calculations are used to test neighborhood density in Japanese. Each of the neighborhood calculations coincides with a hypothesis about lexical access with a different word representation. The first calculation posits a situation in which Japanese listeners rely on the phoneme string representation, as proposed in models such as the NAM (Luce & Pisoni, 1998) and PARSYN (Luce et al., 2000). Here, neighborhoods are calculated in terms of the number of phonemes in common, as in the Greenberg-Jenkins calculations (Greenberg & Jenkins, 1964) widely used in the literature on English.

The second neighborhood calculation included word accent information as another dimension in the neighborhood calculation in order to reflect the finding that prosodic information has a vital role in Japanese word recognition (Cutler & Otake, 1999). This calculation proposes that Japanese listeners use word-level prosody for lexical access. However, the word representation and word-level prosody are calculated separately. In other words, the word representation in this calculation is the categorical abstract representation used in the phoneme-based neighborhood calculation, and the pitch accent patterns additionally constrain the neighbors. Take a pair of words like ka’ki,


‘oyster,’ and kaki, ‘persimmon,’ as an example¹. The words are considered ‘neighbors’ in the neighborhood calculation based on Greenberg-Jenkins’ rules. However, Japanese listeners seem to be sensitive to the similarity of accent patterns, so they may not consider ka’ki, ‘oyster,’ and kaki, ‘persimmon,’ to be neighbors, because the difference in accent patterns might play a crucial role in their recognition (HL pitch pattern for ‘oyster’ and LH pitch pattern for ‘persimmon’). A similarity judgment experiment on pitch accent patterns was carried out and the results were implemented in the calculation of prosodic similarity.

In the third calculation, neighborhood density was measured by comparing the similarity of cochleagrams of the audio files. In this case, the word representation is an auditory representation in which all segmental and prosodic information is available. In this calculation, as in the GNM (Bailey & Hahn, 2001), the words in the lexicon are considered as exemplars and are mapped onto a psychological space. In this model, all but the target word are considered as neighbors that vary along a continuous space of similarity. This calculation is thus like an auditory version of the GNM.

The details of these calculations will be explained in Chapter 2.

1 An apostrophe specifies the place of lexical accent. Romanization conventions used in this dissertation are mainly based on the one created by the Society for the Romanization of the Japanese Alphabet (“99 version”), except that moraic nasals and geminate consonants are represented in Japanese as ‘N’ and ‘Q,’ respectively. See the website (http://www.roomazi.org/99siki.html) for further details.


1.4. Testing Neighborhood Effects in Japanese: Overview

The main goal of this dissertation is to better understand lexical access processes. In a series of experiments, the same set of 700 nouns is used as target words in different experimental tasks. This gives an opportunity to directly compare the results obtained from the different experiments.

In English, the neighborhood density effect has been shown in experiments using many different methodologies, such as auditory naming (e.g., Luce & Pisoni, 1998; Vitevitch & Luce, 1998, 1999), word identification in noise (e.g., Luce & Pisoni, 1998), lexical decision (Luce & Pisoni, 1998), the same-different matching task (Vitevitch & Luce, 1999), and semantic categorization (Vitevitch & Luce, 1999). However, in Japanese, Amano and Kondo (1999) found an inhibitory neighborhood density effect for words in a word identification in noise experiment but not in a lexical decision experiment.

These methodologies can be roughly categorized into two groups: sublexically-biased and lexically-biased tasks. Lexically-biased tasks require access to the lexicon whereas sublexically-biased tasks do not. The auditory naming task and the same-different matching task are sublexically-biased tasks that show some effects of shallow phonetic detail. On the other hand, word identification in noise, lexical decision, and semantic categorization tasks are considered to be lexically-biased tasks in the sense that they all require accessing the lexicon. However, according to Vitevitch and Luce (1999), semantic categorization requires accessing the lexicon but is not biased towards either the lexical level or the sublexical level, because the task decision is made at the semantic level.

The three neighborhood experiments conducted in this dissertation use different methodologies. The advantage of testing neighborhood effects in experiments with different methodologies is that it should allow us to investigate the neighborhood density effect from several different perspectives. For example, a syllable effect was not observed in sequence-monitoring experiments in English (Cutler et al., 1986; Bradley et al., 1993), whereas phoneme-induction experiments clearly showed a syllable effect (Finney et al., 1996; Pitt et al., 1998). Therefore, the use of a particular methodology can sometimes reveal hidden effects that might not be observed using a different methodology. This could be the case for neighborhood density in Japanese.

The three methodologies chosen for our experiments are auditory naming, word identification in noise, and semantic categorization. Experiment 1 uses the auditory naming task, which was chosen as a sublexically-biased task. Experiment 1 will be reported in Chapter 3. Experiment 2 is an auditory naming experiment with a word identification task in a noise condition (see Pisoni, 1996). Participants performed an auditory naming task as the primary task. Once they finished naming a stimulus word, they wrote down what they said in hiragana characters. Experiment 2 aimed to collect both performance time and identification data in order to compare these data with the data of Experiment 1, where the same task is performed in a condition without noise, and with the identification data of previous studies (Luce & Pisoni, 1998; Amano & Kondo, 1999). These results will be reported in Chapter 4. Chapter 5 reports the results of Experiment 3,

which uses a semantic categorization task that requires lexical access (see Forster & Shen, 1996). Vitevitch and Luce (1999) explained that an advantage of using this task is that it allows observation of the neighborhood density effect at the lexical level in a more natural way: it looks at lexical-level activity without using nonwords, as a lexical decision task does, and without biasing either the prelexical or the lexical level.

We can look at the results of previous studies in English to predict what the results of the current experiments should be if Japanese shows the same neighborhood effects as those observed in English. When words are presented in auditory naming, word identification, and semantic categorization tasks, inhibitory effects of neighborhood density are observed: high-density words are responded to more slowly than low-density words (Luce & Pisoni, 1998; Vitevitch & Luce, 1998, 1999). Therefore, we expect to see inhibitory neighborhood effects in the three experiments in this dissertation.

These three experiments all use the same 700 targets, allowing us to directly compare the results across experiments using the different methodologies. The details of the selection and characteristics of the target words are given in §2.4.

Neighborhood density calculations assume the existence of a mental lexicon, because what we are trying to show is how many words similar to a given word exist in the lexicon. The mental lexicon assumed here is based on a standard dialect of Japanese, the Tokyo dialect. Because of this restriction, participants are all native speakers of the Tokyo dialect (see §2.5). All nouns used in this study are found in the electronic version of a standard Japanese dictionary (Sanseido Shinmeikai Japanese dictionary; Kenbou, Kindaichi, Kindaichi, & Shibata, 1981). More information about the lexicon used in this study will be provided in §2.2.

In this dissertation, three different neighborhood calculations are tested in order to

decide how to define neighbors. The first calculation was based on Greenberg-Jenkins’

(1964) phoneme substitution, deletion and insertion rules. The second calculation

included prosodic information as another dimension in the neighborhood calculation in

order to reflect the finding that prosodic information has a vital role in Japanese word

recognition. The third calculation was based on the auditory properties of the words in the

lexicon. The first two calculations are based on the abstract representation of the words

whereas the last calculation is based on the auditory representation. Therefore, finding

the best definition of neighbors in Japanese also contributes to revealing the

representation of words in the lexicon.

1.5. Organization of the Dissertation

In this dissertation, three definitions of neighbors will be tested in three neighborhood experiments. The dissertation is organized as follows: Chapter 2 provides basic information about the neighborhood density experiments conducted in this dissertation; the common features of the experiments will be explained there. Chapter 3 reports the results of the auditory naming experiment. In Chapter 4, the neighborhood density effect will be tested in two tasks: auditory naming in noise and word identification in noise. Chapter 5 further tests the neighborhood density effect in a semantic

categorization experiment. The results are drawn together and discussed as a whole in the

General Discussion and Conclusion in Chapter 6.


CHAPTER 2

STIMULI AND THEIR NEIGHBORHOODS

2.1 Introduction

This dissertation tests neighborhood density effects in Japanese in order to better

understand the representation of word forms stored in the lexicon and the processes

mapping between this phonological representation and the acoustic-auditory input. Three

experiments were conducted. In this chapter, the words used as stimuli in all three experiments are described, and the different ways of calculating their neighborhood densities are explained.

2.2 Japanese Mental Lexicon

Japanese neighborhood experiments require a Japanese lexicon in order to calculate neighborhood densities for words. The English-language lexicon used in many previous experiments is the Hoosier Mental Lexicon (Pisoni et al., 1985). The Japanese lexicon used in this dissertation is based on the NTT Database Series (Amano & Kondo, 1999, 2000).

The NTT Database Series has seven volumes, each of which focuses on a different aspect of the Japanese lexicon, as shown in Table 2.1. Entries in all volumes are cross-referenced with common ID numbers.

VOLUME CONTENT

1 Word Familiarity

2 Word Orthography

3 Word Accent

4 Parts of Speech

5 Characters

6 Character-Word

7 Frequency

Table 2.1: Contents of the NTT Database Series (Amano & Kondo, 1999, 2000)

The lexicon used in this study consists of a subset of the words in the NTT Database Series (Volume 1: Word Familiarity), which is keyed to the 3rd edition of the Sanseido Shinmeikai Dictionary (Kenbou et al., 1981). For a smaller subset of words, there is also a recorded utterance of each word, and a set of at least three familiarity ratings for (1) auditory presentation of the word, (2) visual presentation of the word (in each of its written forms, if there is more than one typical way to write the word), and (3) simultaneous audio-visual presentation. At least 32 subjects per word rated the familiarity of each word on a 7-point scale from 1 (not familiar) to 7 (most familiar). Some words have more than one pronunciation in standard Japanese (e.g., ‘almond’ has two forms: a’amoNdo, with initial accent, and aamo’Ndo, with penultimate accent). Lexical entries were recorded by a single adult female speaker of the Tokyo dialect in multiple sessions. The utterances were first digitized onto a PC and stored as individual audio files at 16-bit resolution and a 16000 Hz sampling rate in the .wav format (Windows PCM; see Amano & Kondo, 1999, for further details). The lexicon used in this study consisted of the smaller subset of 63,531 nouns that have associated recorded utterances in the NTT Database Series.

Two representations are assumed for each word in the Japanese lexicon: an abstract representation and an acoustic/auditory representation, each of which is based on one of two types of word recognition models (abstract models vs. exemplar-based models). The abstract representation is described in alphabetic symbols. The full description of alphabet usage in the representation is shown in Appendix A. Two features of the phonemic representation are worth mentioning. First, moraic nasals and geminate consonants are transcribed as N and Q, respectively. This decision was made based on studies showing that Japanese listeners are sensitive to moraic structure in auditory word recognition (e.g., Otake et al., 1993; Cutler & Otake, 1994; Otake et al., 1996b). Second, the vowel length contrast in Japanese is expressed by the number of segments (single or double), as in ka’do ‘corner’ vs. ka’ado ‘card,’ as explained in Vance (1987).

The auditory representation is based on the audio files that came with the NTT

Database Series (Amano & Kondo, 1999, 2000). The auditory representation of the

words in the lexicon is modeled as a sequence of auditory spectra calculated from the

audio files.

Pitch accent patterns are also stored in the lexicon as separate information. The

details will be discussed in §2.3.2.

2.3 Defining Neighbors in Japanese

As discussed in Chapter 1, finding the best neighborhood density calculation in Japanese could provide useful information about how words are represented. Of particular interest is whether neighborhoods should be calculated in terms of the number of phonemes in common, as in Greenberg and Jenkins (1964), a method which has been widely used in the English neighborhood literature, or whether neighborhoods should be calculated in terms of acoustic/auditory similarity.

In the following sections, each of the three neighborhood calculations tested in this dissertation is explained in more detail. In order to show the different outcomes of the three neighborhood calculations, the same target word (kodomo, ‘child’), one of the words used in the experiments, is used as an example.


2.3.1 The Segments Calculation

The first neighborhood density calculation is an algorithm based on Greenberg and Jenkins (1964). This calculation will be referred to as “the Segments calculation.” Neighborhoods are computed by comparing a given segmental transcription (the stimulus) to all other transcriptions in the Japanese lexicon discussed in §2.2. A neighbor is defined as any transcription that could be converted to the transcription of the stimulus word by the substitution, deletion, or addition of one phoneme in any position. Table 2.2 shows an example neighborhood calculation for the target word, kodomo, ‘child.’ This word has four neighbors. Three neighbors are obtained by deleting the onset of the third syllable of the target word. The last neighbor is obtained by substituting /r/ for the onset of the second syllable of the target word. The first two neighbors have “the same word entity” because they have the same kanji characters in the Sanseido Shinmeikai Dictionary (Kenbou et al., 1981). This is not the same as aamoNdo, ‘almond,’ where the accent difference does not change usage, as explained in §2.2.


Target word: kodomo ‘child’

Neighbors                    Operation
ko’doo¹ ‘old discipline’     Deletion
kodoo ‘old road’             Deletion
kodoo ‘heart beat’           Deletion
koromo ‘batter’              Substitution

Table 2.2: The target word, the neighbors selected by the Segments calculation, and the operations that relate them.
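A minimal sketch of the Segments calculation is given below (Python; illustrative only, not the scripts used with the NTT-based lexicon). It treats each romanized transcription as a string of one-symbol phonemes, which is a simplification, and counts the lexicon entries exactly one edit away from the target, reproducing the count in Table 2.2 for a tiny lexicon fragment.

    def edit_distance(a, b):
        """Standard Levenshtein distance over transcription strings."""
        m, n = len(a), len(b)
        d = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            d[i][0] = i
        for j in range(n + 1):
            d[0][j] = j
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,        # deletion
                              d[i][j - 1] + 1,        # addition
                              d[i - 1][j - 1] + cost) # substitution
        return d[m][n]

    def segments_density(target, lexicon):
        """Number of lexicon entries exactly one phoneme edit from the target."""
        return sum(1 for w in lexicon if edit_distance(target, w) == 1)

    # fragment of a lexicon: the three kodoo entries are distinct words
    # (different kanji/accent) that share one segmental transcription
    fragment = ["kodoo", "kodoo", "kodoo", "koromo", "kotori", "kimono"]
    print(segments_density("kodomo", fragment))   # 4, as in Table 2.2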

2.3.2 The Segments + Pitch Calculation

The second calculation, “the Segments + Pitch calculation,” is designed to investigate the case in which abstract models such as Shortlist (Norris, 1994), MERGE (Norris et al., 2000), and TRACE (McClelland & Elman, 1986) take advantage of suprasegmental information for word selection. Several studies have claimed that pitch accent information plays a vital role in Japanese word recognition (Sekiguchi & Nakajima, 1999; Cutler & Otake, 1999). In Japanese, placement of the accent within each word is lexically specified. For instance, the words ana, ‘hole,’ and a’na, ‘announcer,’ have the same sequence of sounds, but the first is an unaccented word and the second has an initial accent. These words are realized with the pitch patterns shown in

1 Of course, this placement of the lexical accent is not considered in this neighborhood calculation.

Figure 2.1. The fact that Japanese listeners can recognize them as different words shows that the pitch accent contributes to lexical information in Japanese. Perception of accent location in Japanese is influenced by both F0 peak location and post-peak F0 fall rate (steep or shallow relative to the syllable edge) (Sugito, 1972; Hasegawa & Hata, 1992).

Figure 2.1: Fundamental frequency (F0) contours and spectrograms of ana, ‘hole’ (top) and a’na, ‘announcer’ (bottom).

The second neighborhood calculation is based on both segmental similarity and pitch pattern similarity. An important question is how Japanese listeners perceive the similarity of pitch accent patterns. Traditionally, Japanese pitch accent patterns are represented with H (high pitch) and L (low pitch) targets on each mora. For example, the pitch accent contour of the word ana, ‘hole,’ shown in Figure 2.1 is represented as the sequence LH. If we assume that these Hs and Ls are “tonemes,” it should be possible to calculate the similarity of pitch accent patterns using Greenberg-Jenkins’ substitution, deletion, or insertion rules. In order to test this hypothesis, an experiment examining similarity judgments about Japanese pitch patterns was conducted.

2.3.2.1 Similarity Judgments on Japanese Pitch Accent Patterns

2.3.2.1.1 Purpose

The purpose of this experiment was to investigate how Japanese listeners perceive similarities of pitch accent patterns in Japanese when pairs of pitch accent patterns are presented auditorily. For segmental similarity, a single-phoneme edit distance is allowed for words to be segmental neighbors (e.g., gaN, ‘cancer,’ and kaN, ‘can,’ are neighbors, but gaN and kani, ‘crab,’ are not). Similarly, if it is assumed that pitch patterns consist of a sequence of pitch units, H (high pitch) and L (low pitch), the question to ask would be whether a one-pitch-unit difference could also be used to determine pitch neighbors (e.g., LHH and LHL). If this is the case, a clear categorical boundary would be expected between pairs of pitch patterns that are within one pitch edit difference of each other. Thus, LHH and LHL would be pitch accent neighbors whereas HLL and LHL would not.

2.3.2.1.2 Method

2.3.2.1.2.1 Stimuli

Twenty pitch patterns were used. They are the patterns attested in simple 1- to 5-syllable/mora CV words in Japanese. Table 2.3 shows the 20 pitch patterns and the pseudowords that were used. In this experiment, pitch patterns are coded by three different levels (0 = low pitch; 1 = unaccented high pitch; * = accented high pitch), so that hana’ ‘flower’ and hana ‘nose’ are coded as 0* and 01, respectively.

The patterns were produced on nonword stimuli consisting of a string of /ma/ syllables. That is, the simple CV syllable ma is the only syllable used in all the nonsense words. Since the lexical pitch accent in some pitch patterns, such as 0* and 0111*, only appears when a grammatical particle (such as ‘-ga’ or ‘-wa’) is attached word-finally, the 20 pitch patterns realized on the nonsense words were recorded with a grammatical particle ‘-Qte,’ which was subsequently deleted (i.e., ma⁰ma*-Qte → ma⁰ma*). The nonsense words were recorded onto a DAT tape by the author, a native speaker of the Tokyo dialect, at a sampling rate of 48000 Hz. The data were down-sampled to 22050 Hz with 16-bit accuracy when they were transferred onto a computer. Twenty separate sound files were created, and the grammatical particle ‘-Qte’ was deleted.

The waveforms of the 20 sound files were scaled so that the peak root-mean-square (RMS) amplitude values were equated for all files at an amplitude of approximately 75 dB sound pressure level (SPL).
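A minimal sketch of this kind of amplitude equalization is shown below (Python with numpy; not the software actually used). It equates the overall RMS of each waveform rather than a windowed peak RMS, and the numeric target is arbitrary, since mapping it to an absolute level such as 75 dB SPL depends on the calibration of the playback system.

    import numpy as np

    def equate_rms(waveforms, target_rms=0.05):
        """Scale each waveform (float samples in the -1..1 range assumed)
        so that its overall RMS amplitude equals target_rms."""
        scaled = []
        for x in waveforms:
            rms = np.sqrt(np.mean(x ** 2))
            scaled.append(x * (target_rms / rms) if rms > 0 else x)
        return scaled

    # toy stimuli: two 220 Hz tones with different starting amplitudes
    t = np.linspace(0, 1, 22050, endpoint=False)
    quiet = 0.01 * np.sin(2 * np.pi * 220 * t)
    loud = 0.30 * np.sin(2 * np.pi * 220 * t)
    for y in equate_rms([quiet, loud]):
        print(round(float(np.sqrt(np.mean(y ** 2))), 4))   # both print ~0.05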

1-syllable/     2-syllable/     3-syllable/     4-syllable/     5-syllable/
1-mora word     2-mora word     3-mora word     4-mora word     5-mora word
ma              mama            mamama          mamamama        mamamamama

*               01              011             0111            01111
0               *0              01*             011*            0111*
                0*              0*0             01*0            011*0
                                *00             0*00            01*00
                                                *000            0*000
                                                                *0000

Table 2.3: Twenty tonal patterns tested in Experiment 1 (0 = low pitch; 1 = unaccented high pitch; * = accented high pitch).


2.3.2.1.2.2 Participants

Participants were 18 native speakers of Japanese who were born and raised in the Tokyo area (Tokyo, Saitama, Chiba, and Kanagawa). They were thus native speakers of the Tokyo dialect, the standard dialect of Japanese. The participants were undergraduate students at Dokkyo University (Saitama, Japan). None had stayed in English-speaking countries except for short travel visits. They each received a small amount of money for their participation. None of the participants had any hearing impairment.

2.3.2.1.2.3 Procedure

Participants were given answer sheets and a pencil. They were played a prerecorded list of experimental instructions in which they were told that they would be hearing a series of nonsense-word pairs and that their task was to judge the similarity of the tonal patterns in each pair on a 7-point scale. Participants were instructed to make their judgments based on the overall impression of the nonsense words in each pair. They were told that it might be helpful to think about how likely it would be that the nonsense words in the pair could be identical: a judgment of “very likely identical” would rate a “1” on the 7-point scale.

All participants were tested as a group in a language laboratory at Dokkyo

University. Of the 200 possible pairings of the 20 tokens, a subset of 150 was selected

for our study. Both orders (AB and BA) of each of the pair-wise comparisons were


presented in random order to participants, for a total of 300 pairs for similarity judgment.

The 300 pairs are shown in Appendix B.

All stimuli were presented from a laptop computer that was connected to the

central audio system in a language laboratory. Participants heard the stimuli binaurally

via headphones. Each trial started with a short tone followed by a 500 millisecond (ms)

pause. The participants then heard a pair of stimuli with a 500 ms inter-stimulus interval

(ISI). Within 4 seconds after the second stimulus item was presented, they were

instructed to record their judgment by circling a number (from 1 through 7) printed on

the answer sheets. Before the test session, a practice session of 5 pairs was provided to

participants in order to familiarize them with the procedure. The experiment lasted about 50 minutes.

2.3.2.1.3 Results

The mean similarity ratings across participants were first calculated for all 300 trials. One hundred and fifty pairs were presented in two different orders (AB and BA). Figure 2.2 shows the mean rating in the order AB as a function of the mean rating in the order BA. Note that pairs with lower ratings are perceived as more similar. The figure shows that most of the data points are very near the diagonal line. This indicates that participants rated the same stimuli consistently across different presentations in the experiment. A regression analysis showed that the ratings in the order AB are highly correlated with the ratings in the order BA (R² = 0.842). Furthermore, a paired t-test showed that the differences between the AB and BA orders are not significant (t = 1.619, df = 149, p > .1). Thus, participants performed their task consistently.

Figure 2.2 also shows that there are two clusters: the four pairs with the highest degree of similarity (ratings near 1) are separated from the other pairs in their own cluster. It turned out that these are pairs in which the stimuli are of the same length, contrasting an unaccented pattern with a final-accented pattern, i.e., 011* vs. 0111, 01* vs. 011, 0* vs. 01, and 0111* vs. 01111. This indicates that the H target for an accented syllable at the end of a word (e.g., hana’, ‘flower’) and the relatively high pitch at the end of an unaccented word before an excised ‘-Qte’ (e.g., hana, ‘nose’) are very similar. Moreover, these four pairs are separated from the rest of the pairs, which demonstrates that their members are more similar to each other than the members of the other pairs are. Also, the ratings reflect that Japanese listeners were able to discern a difference between pairs like hana, ‘nose,’ and hana’, ‘flower,’ because most of these pairs did not get a rating of ‘1,’ which would indicate that they are identical. Vance (1995) and Warner (1997) both reported that Japanese listeners are able to distinguish an accented high pitch and an unaccented high pitch when each appears at the end of a word. In this sense, our similarity data do not conflict with Vance (1995) and Warner (1997). These pairs are very similar yet still distinguishable by Japanese listeners.


Figure 2.2: The mean similarity rating in the order AB as a function of the mean similarity rating in the order BA. The numbered points (1 to 4) in the figure represent 011* vs. 0111, 01* vs. 011, 0* vs. 01, and 0111* vs. 01111, respectively.

In sum, the similarity judgment data showed that there was no stimulus order effect in judging the prosodic similarity of the pairs. Based on this finding, the means of the two ratings for the 150 stimulus pairs were calculated and were used for further analyses of the similarity of word prosodic patterns. Furthermore, the difference between a final accented high pitch (*) and a final unaccented high pitch (1) will be ignored in the following analyses. The data support a lexical representation of accent patterns as pitch contours that can be roughly represented by just two levels: low and relatively high (represented as “0” and “1,” respectively).

2.3.2.1.3.1 W o r d P r o s o d ic S im il a r it y B a s e d o n G r e e n b e r g -J e n k in s ’ r u l e s

This section investigates whether Japanese listeners’ pitch pattern similarity judgments can be predicted by a single-toneme edit distance criterion. Each of the 150 pairs was coded with the number of substitutions, deletions, and insertions needed to change one pitch pattern into the other. A regression analysis was performed to see how well the number of operations predicts participants’ similarity ratings. Figure 2.3 shows a graph of the number of operations (substitutions, deletions, or insertions) as a function of the similarity ratings and a graph of the number of operations as a function of the median similarity rating. Note that participants used a rating range of 1 to 7, and similarity is higher when the rating is lower.


Figure 2.3: The number of operations (substitutions, deletions, or insertions) as a function of the similarity ratings (left), and the number of operations as a function of the median similarity rating with a logarithmic function (right).

The results showed that there is no advantage for pairs at a one-toneme edit distance. There is tremendous overlap between 1 operation and 2 operations. Therefore, a “single-element” rule as in the Greenberg-Jenkins algorithm would not work. The right-hand graph of Figure 2.3 shows the number of operations as a function of the mean similarity ratings. The relation between the number of operations and the mean similarity rating follows a logarithmic function. Therefore, a one-toneme edit distance may not be the best way to describe Japanese listeners’ performance on similarity judgments.
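The number of operations used in this analysis can be computed with an ordinary edit-distance routine over the coded pitch strings. The sketch below (Python; illustrative, with the example patterns from the text) counts the minimum number of substitutions, deletions, and insertions; fitting the logarithmic relation to the mean ratings would be a separate regression step.

    def toneme_operations(p, q):
        """Minimum number of substitutions, deletions, or insertions
        needed to turn pitch pattern p into pattern q (patterns coded
        as strings, e.g. 'LHL' or '011*')."""
        prev = list(range(len(q) + 1))
        for i, a in enumerate(p, 1):
            curr = [i]
            for j, b in enumerate(q, 1):
                curr.append(min(prev[j] + 1,              # deletion
                                curr[j - 1] + 1,          # insertion
                                prev[j - 1] + (a != b)))  # substitution
            prev = curr
        return prev[-1]

    # the examples discussed in §2.3.2.1.1:
    print(toneme_operations("LHH", "LHL"))   # 1: would count as pitch neighbors
    print(toneme_operations("HLL", "LHL"))   # 2: would not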


2.3.2.1.4 Calculating the Segments + Pitch Calculation

The Segments + Pitch calculation is a modified version of the Segments calculation explained in §2.3.1 and reflects the finding that Japanese listeners are sensitive to pitch information in word recognition. The calculation has two stages. The first stage selects potential neighbors based on segmental information. The second stage further selects neighbors, based on pitch accent patterns, from the candidates determined in the first stage.

At the second stage, the word accent patterns are considered. In §2.3.2.1, it was found that similarity ratings for pairs of accent patterns and the number of operations are related logarithmically. In order to incorporate this similarity rating information into a neighborhood calculation, a cutoff point was introduced. The cutoff point was based on the error responses in the auditory naming in noise experiment reported in Chapter 4. In this experiment, the stimuli were presented in noise and the participants were expected to repeat and write down the word they heard. Here, all the error responses in this experiment were analyzed to see how Japanese listeners misperceived the accent patterns of the stimuli. In this analysis, the presented accent pattern and the reproduced accent pattern were compared, as shown in Figure 2.4.

Figure 2.4 shows frequency counts of pitch-pattern responses as a function of the number of operations in the auditory naming in noise experiment reported in Chapter 4. Here, all the responses (N = 18900) were analyzed in terms of pitch patterns only; misidentified segment responses are not considered here. As the figure shows, the highest count of all is the bar for zero operations. This means that more than 86% of the responses correctly reproduced the pitch accent pattern of the stimulus. Whether or not we introduce a categorical distinction between pitch neighbors and non-neighbors, words that have exactly the same pitch pattern are considered pitch neighbors.

Figure 2.4: Distance between the accent pattern produced and the target accent pattern in the auditory naming in noise experiment, as measured by the number of operations needed to change the perceived pattern into the response pattern (Chapter 4).

Now the processes at the second stage of the neighborhood calculation are explained with kodomo, ‘child,’ as the target word. Table 2.4 shows the target word and its four potential neighbors selected at the first stage, with their pitch accent patterns and neighborhood information. The target word has an accent pattern of 011. The four words selected as potential neighbors at the first stage of the calculation have either 011 or 100 pitch patterns. Because ko’doo, ‘old discipline,’ does not have the same pitch accent pattern as the target word, it is no longer a neighbor in this calculation. Therefore, kodomo, ‘child,’ has three neighbors.

Target word: kodomo ‘child’ (pitch pattern 011)

Potential neighbors at the first stage    Pitch pattern    Neighbor?
ko’doo ‘old discipline’                   100              NO
kodoo ‘old road’                          011              YES
kodoo ‘heart beat’                        011              YES
koromo ‘batter’                           011              YES

Neighborhood density: 3

Table 2.4: A target word and its four potential neighbors selected at the first stage, with the information used in calculating neighborhood density.
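A compact sketch of the second-stage selection is given below (Python; the candidate list is the Table 2.4 example, and the function is mine, not the dissertation's implementation). The pitch criterion is simplified here to an exact pattern match, which, given Figure 2.4, covers the large majority of accurately reproduced patterns.

    def pitch_filter(target_pitch, stage1_candidates):
        """Second stage of the Segments + Pitch calculation: keep only
        first-stage (segmental) candidates whose pitch accent pattern
        matches the target's pattern (exact match as a simplification)."""
        return [(form, pitch, gloss)
                for form, pitch, gloss in stage1_candidates
                if pitch == target_pitch]

    # first-stage candidates for kodomo (pitch pattern 011), from Table 2.4
    candidates = [
        ("ko'doo", "100", "old discipline"),
        ("kodoo",  "011", "old road"),
        ("kodoo",  "011", "heart beat"),
        ("koromo", "011", "batter"),
    ]
    print(len(pitch_filter("011", candidates)))   # 3 neighbors, as in Table 2.4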


2.3.3 The Auditory Calculation

The third definition of neighbors in Japanese is based on auditory similarity among words in the lexicon (“the Auditory calculation”). In order to calculate the auditory similarity of words, I adopted the equations from an exemplar-based model (X-MOD; Johnson, 1997a, 1997b). X-MOD is an exemplar-based model of word recognition, which is an extension of Klatt’s LAFS model (Klatt, 1980). This instance-based model of phonological learning and word recognition is based on three assumptions. First, speech is recognized by reference to stored instances (exemplars). Second, these exemplars have no internal structure; rather, they are unanalyzed auditory representations. Third, exemplars are word-sized chunks that result from primitive auditory scene analysis (Bregman, 1990), where isolated word productions form the basis for word recognition in running speech. In this model, each 23 ms frame of speech is processed to yield a critical-band spectrum (95 points) at a 43 Hz frame rate, and is vector quantized. After vector quantization, each frame is expressed as an arbitrary code number. In the matching process, exemplars are activated based on similarity to the input, where similarity is an exponential function of the Euclidean distance between exemplars.

The critical difference between X-MOD and LAFS is the stored auditory

representation for each word. Figure 2.5 shows examples of LAFS and X-MOD

representations. In contrast to the prototypes stored for LAFS, the stored representations

in X-MOD are distinct exemplars. In LAFS, lexical representations are based on spectral

decoding networks. Unlike LAFS, X-MOD stores multiple auditory exemplars for each

word rather than prototypes. Also, a lexical representation is a sequence of quantized

vectors. For example, the LAFS representation of “Cat” is a sequence of auditory spectra,

which represents the word, as shown in Figure 2.5. Each number stands for a spectrum

code. One sequence of auditory spectra must represent all instances of the word. This is a “brittle” representation, because it is not robust to sources of variation.

X-MOD assumes that the representation of “Cat” (and other words) is a set of

sequences of auditory spectra, where each code sequence is considered as a distinct

exemplar, as shown in Figure 2.5. The advantage of this model is that it keeps variation

directly in the lexical representation, treating it as information rather than noise. In this

sense, this model is similar to a hidden Markov Model (HMM) representation, but it does

not need to assume that state dependencies are purely local.

LAFS representation of “Cat”:

73 71 18 11 16 90 1 88

X-MOD representation of “Cat”:

“Cat” exemplar 1 73 71 18 11 16

“Cat” exemplar 2 73 71 42 18 15

“Cat” exemplar 3 73 42 11 17 89

Figure 2.5: Examples of LAFS and X-MOD representations of “Cat.”


To calculate the similarity of neighbors based on the auditory representation, a basic assumption, as in the General Neighborhood Model (GNM; Bailey & Hahn, 2001), is that all the words stored in the lexicon are neighbors to some degree. Therefore, the perceived word in an incoming speech signal is compared with all the words in the lexicon based on the algorithms used in X-MOD.

If the psychological distance between instances i and j is d_ij, the perceived similarity of a target word i to a set of instances stored in memory is calculated by Equations 2.1 and 2.2. Similarity in this calculation is first computed by comparing two auditory spectra that are represented as sequences of numbers (see Figure 2.6). The auditory property m (the index into the sequence of auditory spectra) of exemplar j is written x_jm and is represented as a number. The Euclidean distance between exemplar j and item i is written d_ij, and c is a sensitivity constant.

kodomo exemplar (i): 73 71 18 11 16 90 1 88

domori exemplar (j): 15 16 20 90 2 88 45 67 62

Figure 2.6: Quantized vectors of the exemplars (kodomo and domori).


(Equation 2.1)    $d_{ij} = \left[ \sum_m (x_{im} - x_{jm})^2 \right]^{1/2}$

(Equation 2.2)    $\mathrm{Sim}_i = \sum_j \exp(-c\, d_{ij})$

Equations 2.1 & 2.2: Equations used in the Auditory calculation of neighborhood density.

In Equation 2.1, the auditory distance between exemplars is calculated by comparing sequences of quantized vectors as shown in Figure 2.6. The best alignment of two exemplars is found in order to accommodate differences in vector length. The auditory properties of the instances are compared and the squared differences are summed in order to compute d_ij. In this analysis, a threshold is introduced in order to decide whether two exemplars (words) are neighbors or not. Figure 2.7 shows the neighbor-nonneighbor distinction in a similarity space in the Auditory calculation.



Figure 2.7: A neighbor-nonneighbor distinction in a similarity space in the Auditory calculation.

Recall that Bailey and Hahn (2001) assume that all words in the lexicon are neighbors to some degree, and they proposed a continuous similarity space within the lexicon. In this neighborhood density calculation, degrees of sound difference are taken into account. On the other hand, Luce and Pisoni (1998) used a discrete similarity space: the neighbor-nonneighbor distinction is based on a one-phoneme edit distance, so graded sound similarity is not included in the neighborhood density calculation. The Auditory calculation in this dissertation is a combination of these two algorithms. Like Luce and Pisoni’s (1998) algorithm, our algorithm has a threshold as a categorical cutoff point between neighbors and nonneighbors. That is, only psychological distances (d_ij) that pass the threshold are plugged into Equation 2.2 to compute similarity among words. However, as in the GNM (Bailey & Hahn, 2001), sound similarity is implemented in the neighborhood calculation in a psychological space. That is, the similarity value is treated as the neighborhood density in the Auditory calculation. The number of neighbors for kodomo is 165, and its neighborhood density is 70.920.
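A minimal numerical sketch of Equations 2.1 and 2.2 is given below (Python with numpy). The code-sequence values, the sensitivity constant c, the threshold value, and the truncation used in place of the best-alignment step are all illustrative assumptions; here the threshold is applied as a maximum distance for counting an exemplar as a neighbor.

    import numpy as np

    def auditory_neighborhood(target_codes, exemplar_codes, c=0.1, threshold=40.0):
        """Sketch of Equations 2.1 and 2.2: Euclidean distance d_ij over
        quantized code sequences, and summed exponential similarity over
        the exemplars that pass the neighborhood threshold."""
        target = np.asarray(target_codes, dtype=float)
        n_neighbors, sim = 0, 0.0
        for codes in exemplar_codes:
            ex = np.asarray(codes, dtype=float)
            m = min(len(target), len(ex))                        # crude stand-in for alignment
            d_ij = np.sqrt(np.sum((target[:m] - ex[:m]) ** 2))   # Equation 2.1
            if d_ij <= threshold:                                # neighborhood cutoff
                sim += np.exp(-c * d_ij)                         # Equation 2.2
                n_neighbors += 1
        return n_neighbors, sim

    # code sequences loosely modeled on Figure 2.6 (values are made up)
    kodomo = [73, 71, 18, 11, 16, 90, 1, 88]
    others = [[15, 16, 20, 90, 2, 88, 45, 67, 62],
              [73, 70, 20, 11, 18, 88, 3, 85]]
    print(auditory_neighborhood(kodomo, others))   # (1, ~0.60) with these toy values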

In this calculation, not only is a different metric of similarity used (auditory similarity of the vectors of spectra versus phoneme edit distance), but a different way of measuring “similarity” is used as well. That is, in the Segments calculation and the Segments + Pitch calculation, the number of neighbors was counted, whereas in the Auditory calculation the degrees of similarity were summed. However, a linear correlation analysis between the number of neighbors based on the Auditory calculation and the summed similarity of neighbors (the Auditory calculation) revealed that these two measures are highly correlated (R² = 0.983083). This suggests that changing the representation (symbolic vs. auditory) is more important than changing the measure of neighborhood density (number of neighbors vs. degrees of similarity).

Lastly, an important aspect of the Auditory calculation should be pointed out. Since this calculation is based on an auditory representation that contains phonetic detail, one might expect a neighborhood calculation based on the auditory representation to be less abstract than neighborhood calculations based on the symbolic representation. However, contrary to this expectation, neighbors in the Auditory calculation are selected less strictly than those in the other two calculations.


[Spectrogram panels; the legible labels include anago, ‘conger eel,’ kowane, ‘voice quality,’ and karoo, ‘fatigue.’]

Figure 2.8: The target word, kodomo, ‘child’ and its seven most similar neighbors in the Auditory calculation.


Figure 2.8 shows spectrograms of the target word, kodomo, ‘child,’ and its seven most similar neighbors. First of all, even the most similar neighbors of kodomo in the Auditory calculation cannot be neighbors in the other two neighborhood calculations, both of which are based on one-phoneme edit distance. However, the spectrograms of these neighbors show some common elements. First, all neighbors have the same pitch accent pattern². Secondly, they have similar durational relationships and other relational properties (e.g., spectral edges). Thirdly, although the segments that make up the words are different, the neighbors of kodomo give a similar overall impression.

Although these neighbors are not within one-phoneme edit distance, the substituted sounds in the neighbors are phonetically similar to the ones in the target. For example, in a comparison between kodomo and enogu, /m/ and /g/ are both realized as nasals, once we recall that /g/ is realized as [ŋ] in the Tokyo dialect and is in the process of changing to [g] (Hibiya, 1995). The position of /d/ is filled with /n/ in enogu, both of which are alveolar sounds. Also, /o/ is changed to /u/, /e/, and /a/, but never to /i/, among the seven most similar neighbors. Therefore, the selection of the neighbors is based on whole-word confusability rather than segmental confusability. Whole-word confusability is based on auditory impressions. In short, the Auditory neighborhood calculation is in a sense a broader neighborhood calculation than the Segments calculation and the Segments + Pitch calculation.

2 kunoo, ‘suffering,’ has an HLL pitch accent pattern in the NTT databases. However, this word was accidentally recorded with an LHH pitch accent pattern. Therefore, in the Auditory neighborhood calculation, it was chosen as one of the most similar neighbors to the target word.

2.3.4 Comparison of the Three Neighborhood Calculations

Three different neighborhood calculations have been explained in §2.3.1, §2.3.2, and §2.3.3. Three different values of neighborhood density are obtained for the same target word, kodomo, ‘child.’ The numbers of neighbors in the Segments calculation and the Segments + Pitch calculation are 4 and 3, respectively. The summed similarity among words in the lexicon in the Auditory calculation is 70.920.

The distributions of neighborhood density for the 700 target words used in these

experiments are shown in Figure 2.9. Descriptive statistics of neighborhood density

computed by the three calculations are shown in Table 2.5.

Figure 2.9: Frequency counts of target words as a function of neighborhood density: neighborhood density by the Segments calculation (top), by the Segments + Pitch calculation (middle), and by the Auditory calculation (bottom).

                   Segments     Segments + Pitch     Auditory
N of cases         700          700                  700
Minimum            0            0                    0
Maximum            59           32                   609.67
Median             6            3                    9.12076
Mean               8.64         4.92                 42.19
Standard Dev       9            6                    76
Skewness (G1)      2.030257     1.984840             3.187733

Table 2.5: Descriptive statistics of neighborhood density computed by three different neighborhood calculations.

Linear correlation analyses were conducted in order to understand the relationships among the three neighborhood calculations and other factors. Table 2.6 shows the Pearson correlation matrix of the three neighborhood calculations. The results showed that the Auditory calculation is different from the Segments calculation and the Segments + Pitch calculation, which are highly positively correlated with each other (r = 0.851). The Auditory calculation is negatively correlated with both the Segments calculation (r = -0.167) and the Segments + Pitch calculation (r = -0.156). The Auditory neighborhood density consists of the number of neighbors weighted by similarity, whereas the other two consist of just the number of very close neighbors, not a measure of similarity at all.

                    Segments        Segments + Pitch    Auditory
Segments            1.00000000
Segments + Pitch    0.85135157      1.00000000
Auditory            -0.16761086     -0.15618879         1.00000000

Table 2.6: Pearson correlation matrix of the three neighborhood calculations.
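A matrix like Table 2.6 can be computed as in the following minimal sketch, which correlates parallel arrays of density values with numpy; the variable names and toy values are assumptions standing in for the 700-item data.

```python
# Sketch of computing a Pearson correlation matrix over three
# neighborhood-density measures; the toy arrays below stand in for
# the real values of the 700 target words.
import numpy as np

segments   = np.array([4, 12, 0, 59, 6])
segs_pitch = np.array([3, 8, 0, 32, 3])
auditory   = np.array([70.9, 12.3, 5.0, 9.1, 42.2])

corr = np.corrcoef([segments, segs_pitch, auditory])
print(np.round(corr, 3))   # 3 x 3 matrix of Pearson r values
```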

2.4 Target Words

The three neighborhood experiments in this dissertation used the same set of 700 target words from the Japanese lexicon. The target words are trimoraic-trisyllabic words (CVCVCV words) with a rated auditory familiarity of 5 or higher on a 7-point scale, with 7 being highly familiar, in the NTT Database Series (Volume 1; Amano & Kondo, 1999, 2000). They begin with a voiceless stop ([t, k]), a nasal ([n, m]), or a fricative ([s, ʃ, z, ʒ]).
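A minimal sketch of this selection step is given below, assuming the lexicon has been loaded into a pandas DataFrame with hypothetical column names (romaji, familiarity, moras, syllables); the NTT databases are not distributed in this form, so this only illustrates the filtering logic.

```python
# Sketch of the target-word selection criteria: familiarity >= 5,
# three moras, three syllables, and an initial sound from the
# permitted classes. Column names and rows are hypothetical.
import pandas as pd

lexicon = pd.DataFrame({
    "romaji":      ["kodomo", "karada", "aoi", "ringo"],
    "familiarity": [6.4, 6.1, 6.5, 6.5],
    "moras":       [3, 3, 3, 3],
    "syllables":   [3, 3, 3, 2],
})

# rough romaji stand-in for the initial sound classes [t k n m s ʃ z ʒ]
valid_onsets = list("tknmsz")
targets = lexicon[
    (lexicon["familiarity"] >= 5)
    & (lexicon["moras"] == 3)
    & (lexicon["syllables"] == 3)
    & lexicon["romaji"].str[0].isin(valid_onsets)
]
print(targets["romaji"].tolist())   # ['kodomo', 'karada']
```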

Several lexical statistics were computed and later used to account for listeners'

performance in the experiments. These included word frequency, uniqueness point, first

mora frequency, and duration. This section describes how these lexical characteristics

were defined and shows the distributions of the target words according to each of these

factors.

Although the study of word frequency effects in spoken word recognition would ideally be based on a count of spoken words, no such count is available for Japanese.

Therefore, as in studies of English lexical processing, frequency counts based on written

text are used.

Volume 7 of the NTT database series is devoted to Frequency (word frequency

and character frequency), and word frequency for each word in the lexicon was

calculated based on the Word Frequency Database. Although it is called the "Word Frequency Database," it is actually a morpheme frequency database. The

database consists of frequency information on morphemes that were found in the articles

from Asahi Shinbun, a Japanese newspaper, over a 14-year period (1985 to 1998).

Entries in the database have original morpheme ID numbers as well as common ID

numbers. These ID numbers allow us to refer to the words in the Sanseido Shinmeikai Dictionary (Kenbou et al., 1981), and to calculate the word frequency for each entry of our lexicon from morpheme frequencies in the database.

One thing to keep in mind for word frequency calculation is that the same word

could be represented more than once in the text corpus. Japanese has three different

writing systems (hiragana, katakana, and kanji), so multiple morphemes might be listed for the same word in the database. Table 2.7 shows information about the word anago, 'conger eel,' as an example. Database ID, Common ID, Representations, and

Frequency information are shown in the table.

Database ID    Common ID    Representation                               Frequency
042363         010690       [Japanese orthographic form; illegible]      5
074084         010690       [Japanese orthographic form; illegible]      105
162659         010690       [Japanese orthographic form; illegible]      16

Table 2.7: Three representations of anago, 'conger eel,' found in the Word Frequency Database (Volume 7, The NTT Database Series).

As shown here, anago has three different representations, each of which has its

specific database ID number in the database (042363, 074084, 062659, respectively).

However, they all have the same common ID (010690). Therefore, word frequency for

anago should be the total of the frequency counts for morphemes with the same common

ID number. The total citation frequency of anago is 126. In this dissertation, word

frequency is defined as the logarithm (base 10) of the citation frequency count for each

word in the database. In order to include words listed in the lexicon with zero token

frequency, a constant 2 was added to the citation frequency count before taking

logarithms in order to avoid taking the log of zero. This computation was conducted with

all words in the lexicon. Therefore, because the citation frequency for anago was 126, its word frequency is log10(126+2), or 2.11.
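The anago example can be reproduced with the following sketch of the frequency computation: sum the citation counts over orthographic variants that share a common ID, add the constant 2 to avoid log(0), and take the base-10 logarithm.

```python
# Sketch of the word-frequency measure using the anago counts from
# Table 2.7 (three variants sharing one common ID).
import math

citation_counts = {"042363": 5, "074084": 105, "162659": 16}
total = sum(citation_counts.values())        # 126
word_frequency = math.log10(total + 2)       # log10(128)
print(round(word_frequency, 2))              # 2.11
```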

[Figure 2.10: Distribution of word frequency (log10) of the target words; the y-axes show counts and proportion per bar.]

Figure 2.10 shows frequency counts of the 700 target words as a function of word

frequency. As can be seen in the figure, word frequency of the target words varies across a

wide range, although the targets are mostly familiar words.

The uniqueness point is the point at which, moving from left to right through the word, the word is distinguished from all other words in the lexicon. This concept is the

core of the Cohort Theory (Marslen-Wilson, 1984; Marslen-Wilson & Tyler, 1980;

Marslen-Wilson & Welsh, 1978). Uniqueness points were identified for each target word

and tabulated according to whether the word was unique before the last segment, was

unique at the last segment, or whether the word did not have a uniqueness point (coded as

after). Uniqueness points were further tabulated according to the number of segments

from the beginning of the word. If a word had no uniqueness point, it was coded as 7

since all 700 target words had 6 segments. Table 2.8 shows a summary of the uniqueness

point tabulations. Note that the uniqueness point ignores pitch patterns.

Nearly 55% of the target words do not have a uniqueness point. The number of words that have a uniqueness point before the last segment of the word is 214, or 30.57% of the targets. The rest of the target words (102 words; 14.57%) were made unique by the last segment of the word. The uniqueness point could occur as early as the third segment of the target words.

UP group                 Before          At              After
# of words (%)           214 (30.57%)    102 (14.57%)    384 (54.86%)

UP from word-initial     3      4      5      6      7
# of words               5      8      201    102    384

Table 2.8: A summary of the uniqueness point tabulations.
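A minimal sketch of a uniqueness-point computation is given below. Romanized strings stand in for phoneme transcriptions and the toy lexicon is an assumption; the logic is simply the first prefix length at which no other entry shares the prefix, with words that never become unique coded one position past their length (7 for these six-segment targets).

```python
# Sketch of a uniqueness-point computation over segment strings.

def uniqueness_point(target, lexicon):
    """First prefix length at which no other lexicon entry shares the
    prefix; len(target) + 1 if the word never becomes unique."""
    others = [w for w in lexicon if w != target]
    for i in range(1, len(target) + 1):
        prefix = target[:i]
        if not any(w[:i] == prefix for w in others):
            return i
    return len(target) + 1   # no uniqueness point ("after")

toy_lexicon = ["kodomo", "kodama", "kokoro"]
print(uniqueness_point("kodomo", toy_lexicon))                     # 4
print(uniqueness_point("kodomo", toy_lexicon + ["kodomobeya"]))    # 7
```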

The first mora frequency was calculated by the number of words beginning with a

target mora divided by the total number of words in the lexicon. For example, as in

Table 2.9, 3444 words in the lexicon begin with the mora, ka. The total number of words

in the entire lexicon is 63531. Therefore, the proportion of words beginning with ka in

the lexicon is 0.05421. This frequency is recorded for all of the target words like

kabocha 'squash' and karada 'body,' that start with ka.

This measurement may be used in three ways. First, this measure shows how

practiced the speaker is at parsing this form down to the phoneme segment in order to

distinguish it from other words that begin the same way. Another interpretation of the

measurement is that it shows the proportion of the initial cohorts. In the Cohort Theory

(Marslen-Wilson, 1984; Marslen-Wilson & Tyler, 1980; Marslen-Wilson & Welsh,

1978), the initial cohorts are activated once listeners hear a few phonemes. Therefore, if

the proportion is higher, the number of words activated as cohorts should be higher. The

final interpretation is that this proportion reflects the transitional probability of the sounds that make up the initial mora of the target words.

# of words beginning with ka    # of words in the lexicon    Proportion beginning with ka
3444                            63531                        0.05421

Table 2.9: The number of words beginning with ka, the total number of words in the lexicon, and the proportion of words beginning with ka in the lexicon.
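The proportion in Table 2.9 can be computed as follows. The helper that tabulates proportions for every initial mora assumes the lexicon is available as a list of mora-segmented transcriptions, which is an assumption about data format rather than a description of the NTT files.

```python
# Sketch of the first-mora frequency measure: the proportion of
# lexicon entries whose initial mora matches a given mora.
from collections import Counter

def first_mora_proportions(lexicon_moras):
    """lexicon_moras: list of mora tuples, e.g. ("ka", "bo", "cha")."""
    counts = Counter(word[0] for word in lexicon_moras)
    total = len(lexicon_moras)
    return {mora: n / total for mora, n in counts.items()}

# Toy illustration of the helper.
toy = [("ka", "bo", "cha"), ("ka", "ra", "da"), ("ko", "do", "mo"), ("ne", "ko")]
print(first_mora_proportions(toy)["ka"])   # 0.5

# The reported value for ka in Table 2.9: 3444 of 63531 lexicon entries.
print(round(3444 / 63531, 5))              # 0.05421
```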

Figure 2.11 shows the distribution of the first mora frequencies for target words

classified in terms of initial sounds. The first mora frequency varies from 0.001133 to

0.054210 (Mean = 0.023305; SD = 0.015113) among all 700 target words. The number

of stimulus words with initial fricatives is 229 and the first mora frequency varies from

0.001684 to 0.042436 (Mean = 0.023616; SD = 0.012307). The number of stimulus

words with initial nasals is 189 and the first mora frequency varies from 0.002424 to

0.014151 (Mean = 0.011326; SD = 0.003018). The number of stimulus words with

initial stops is 282 and the first mora frequency varies from 0.001133 to 0.054210 (Mean

= 0.031081; SD = 0.016790).

[Figure 2.11: Frequency counts of the target words as a function of frequency of the first mora. Words beginning with a fricative (top left), words beginning with a nasal (top right), and words beginning with a stop (bottom left). Axes: 1st mora frequency (x) by counts (y).]

Figure 2.12 shows frequency count and proportion per bar as a function of

durations of the target words. The durations of the target words vary between 431 ms and 780 ms (Mean = 592 ms; SD = 60.4). Note that all of the target words in Figure 2.12 have the same CVCVCV structure.

[Figure 2.12: Distribution of the durations (ms) of the target words.]

Table 2.10 shows Pearson's correlation matrix of the three neighborhood

calculations and other factors. The interest here is whether a particular factor is highly

correlated with any of the neighborhood calculations. A noticeable point is that the

Auditory calculation is NEGATIVELY correlated with all other factors whereas the

Segments calculation and the Segments + Pitch calculations are POSITIVELY correlated.

The Auditory calculation is highly (negatively) correlated with Duration: if the duration of the target is longer, the neighborhood density is lower (r = -0.531865). This is the strongest correlation observed between any of the neighborhood density calculations and the other factors.

[Table 2.10: Pearson correlation matrix of the three neighborhood calculations (Segments, Segments + Pitch, Auditory) and the other factors (1st mora frequency, word frequency, duration, and uniqueness point). The full matrix is not legible in this reproduction; values cited in the text include r = 0.851 (Segments with Segments + Pitch), r = -0.168 (Segments with Auditory), r = -0.156 (Segments + Pitch with Auditory), and r = -0.532 (Auditory with Duration).]

The prerecorded auditory sound files of the target words in the NTT databases were

presented to the participants. The 700 target words are shown in Appendix C. The

waveforms of 700 audio files were scaled so that the peak RMS amplitude values were

equated for all files, at an amplitude of approximately 75 dB SPL.
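A sketch of one way to equate peak RMS amplitude across files is shown below: compute RMS over short windows, take each file's maximum, and rescale so that maximum matches a common reference. The window length, reference value, and synthetic waveform are assumptions; the approximately 75 dB SPL level itself is set at playback, not in the file.

```python
# Sketch of equating peak RMS amplitude across waveforms.
import numpy as np

def peak_rms(x, win=512):
    """Maximum RMS over non-overlapping windows of length win
    (x must be at least win samples long)."""
    n = len(x) // win * win
    frames = x[:n].reshape(-1, win)
    return np.sqrt((frames ** 2).mean(axis=1)).max()

def scale_to_peak_rms(x, target_rms):
    """Rescale so the waveform's peak RMS equals target_rms."""
    return x * (target_rms / peak_rms(x))

rng = np.random.default_rng(0)
wave = rng.normal(scale=0.1, size=16000)        # stand-in for one stimulus
scaled = scale_to_peak_rms(wave, target_rms=0.05)
print(round(float(peak_rms(scaled)), 3))        # 0.05
```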

2.5 Participants

Participants in the three experiments reported in this dissertation were all Japanese native speakers who were born and raised in the Tokyo area (Tokyo, Saitama, Chiba, and

Kanagawa). They spoke the Tokyo dialect as their native dialect. None had stayed in

English-speaking countries except for short travel visits. The participants were mainly

recruited from undergraduate students at Dokkyo University (Saitama, Japan). The age

of the participants ranged from 19 to 31 years old. They each received a small amount of

money for their participation. The participants did not have any hearing impairment.

The selection of the participants was mainly based on the fact that dialectal differences affect speech processing in Japanese. Cutler and Otake (1999) reported that Japanese speakers from Kagoshima and Tochigi, where the spoken dialects

do not present an accent contrast, perceive pitch accent patterns differently from the

native speakers of the Tokyo dialect. Because a Tokyo-native speaker produced the

utterances in the NTT database series (Amano & Kondo, 1999, 2000), it was necessary to

have participants also be Tokyo-native speakers.

2.6 Summary

In this chapter, common features of the experiments conducted in this dissertation

were discussed. In §2.2, the lexicon used in this dissertation was described. It is a noun

lexicon that was developed from the electronic version of Sanseido Shinmeikai

Dictionary, a part of the NTT Database Series. Section 2.3 described three

neighborhood calculations (the Segments calculation, the Segments + Pitch calculation

and the Auditory calculation). The properties of the target words were discussed in §2.4.

Finally, §2.5 provided information about the participants.

CHAPTER 3

EXPERIMENT 1: AUDITORY NAMING

3.1. Introduction

This chapter discusses the results of the auditory naming experiment. In this task,

participants listened to words over headphones and repeated them as quickly as possible.

Previous neighborhood studies on English (Luce & Pisoni, 1998; Vitevitch & Luce,

1998) found an inhibitory neighborhood density effect in auditory naming. If

neighborhood density effects are language-universal, we would expect Japanese listeners

to perform in a similar way. In other words, we would expect neighborhood density to

negatively affect naming time and accuracy: words from a dense neighborhood would be

named less quickly and accurately than words from a sparse neighborhood. This is not

what was found in this experiment. Rather, the reverse, a facilitative neighborhood effect,

was found. Quite interestingly, different neighborhood density calculations best predicted listeners' behavior for fast and slow namers.

3.2. Methods

As explained in the previous chapter, seven hundred nouns were selected as target

words from the NTT databases (Amano & Kondo, 1999, 2000). These target words met

the following criteria: all the words (1) are 3-mora words, (2) are trisyllabic words, (3) have a rated auditory familiarity of 5 or higher on a 7-point scale in the database (7 = highly familiar), (4) have audio files in the database, and (5) have as their initial sounds a voiceless stop ([t, k]), a nasal ([n, m]), or a fricative ([s, ʃ, z, ʒ]).

Participants were 27 native speakers of Tokyo Japanese who were born and raised in the Tokyo area (Tokyo, Kanagawa, Chiba, and Saitama), as explained in Chapter 2.

The 27 participants were run individually in a quiet room. Participants completed one-

hour test sessions on each of two successive days. In each session, a list of 350 target

words was presented. Each list was divided into five blocks, each of which contained 70

words. The order of the blocks and the words within each block were randomized. The

order of the two lists was counterbalanced among the participants.

In each test session, participants heard the 350 stimulus words binaurally over

headphones at approximately 75 dB SPL, as measured using a sound level meter. The

participants were instructed that they would hear words over the headphones and that

their task would be to repeat each word as quickly as possible. They were told that the

microphone would register the time that they began speaking, and that the time between

when the word was played and when the microphone detected their response would be

recorded.

During the experiment, a visual prompt appeared at the beginning of each trial.

One second later, the word was presented over the headphones. Naming times were

measured from the beginning of the auditory stimulus. If a participant did not respond to

the word within 4 seconds from the beginning of the auditory stimulus, “no response”

was logged.

The author tested all participants and monitored all responses over headphones during the experiment in order to correct, by hand in the datasheet, responses that were mistakenly triggered by hesitations or that contained the wrong word.

Before the test session, participants were given a practice block of 20 stimuli to

familiarize them with the procedure for the auditory naming task. Each session lasted

about 50 minutes.

3.3. Results

For the naming-time analyses, abnormally fast and slow responses falling above

or below 2.5 standard deviations (subjects and items) were eliminated. Naming times

were analyzed in a multiple general linear regression model (Cohen & Cohen, 1983) for

three different neighborhood density calculations.

Before the analyses were conducted, the mean naming time for each participant

was calculated in order to understand the participants’ performance. The data showed

that the range for naming times was remarkably large. Even after the abnormally fast and

slow responses were removed (about 1% of the total responses), the mean naming time

for each participant varied from 283 ms to 729 ms (MEAN = 534 ms, SD = 113). The

difference between the fastest namer and the slowest namer was an extremely large 446

ms. However, the participants all performed their task very diligently. Moreover, the

mean naming time for the slowest namer was still faster than the mean naming time for

real-word targets by the American-English participants in the Vitevitch and Luce (1998)

study (over 800 ms). Because of the wide range in participants’ mean naming time, the

27 participants were classified into two groups: the fast namers and the slow namers.

First, median naming times for all participants were calculated. Then, participants were

split by the median naming time (14 participants for fast namers and 13 participants for

slow namers). The analyses of the naming data for fast namers and slow namers were

conducted separately.
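The trimming and grouping steps can be sketched as follows. The outlier criterion here is applied per participant for simplicity, whereas the analyses above trim by subjects and items; the data structures and values are illustrative only, and with such short toy lists the 2.5-SD criterion may remove nothing.

```python
# Sketch of outlier trimming and a median split into fast and slow namers.
import numpy as np

def trim_outliers(times, n_sd=2.5):
    """Drop values more than n_sd standard deviations from the mean."""
    times = np.asarray(times, dtype=float)
    m, s = times.mean(), times.std()
    return times[np.abs(times - m) <= n_sd * s]

# participant -> naming times in ms (toy values)
naming = {"P1": [300, 310, 295, 1400], "P2": [620, 640, 615, 150],
          "P3": [500, 505, 515, 498],  "P4": [700, 720, 690, 705]}

means = {p: trim_outliers(t).mean() for p, t in naming.items()}
median = np.median(list(means.values()))
fast = [p for p, m in means.items() if m <= median]
slow = [p for p, m in means.items() if m > median]
print(fast, slow)
```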

The percentage of variance accounted for by each neighborhood definition was

calculated by subtracting the R2 of the basic model from the R2 of the basic model +

neighborhood density. The calculation that yielded the highest statistically significant R2 was chosen as the best neighborhood calculation for the data.
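The incremental-R2 logic can be illustrated with ordinary least squares, as in the sketch below: fit the basic model, refit with a neighborhood density predictor added, and take the difference in R2. The predictors and coefficients are synthetic stand-ins, not the dissertation's factors.

```python
# Sketch of the incremental-R2 comparison between a basic model and
# the same model plus a neighborhood density predictor.
import numpy as np

def r_squared(X, y):
    """R2 of an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

rng = np.random.default_rng(1)
n = 700
duration  = rng.normal(600, 60, n)        # synthetic predictors
word_freq = rng.normal(2, 0.5, n)
density   = rng.normal(8, 4, n)
naming_rt = 0.5 * duration - 20 * word_freq - 2 * density + rng.normal(0, 30, n)

basic = np.column_stack([duration, word_freq])
full  = np.column_stack([duration, word_freq, density])
delta_r2 = r_squared(full, naming_rt) - r_squared(basic, naming_rt)
print(round(delta_r2, 4))   # variance uniquely attributable to density
```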

The basic model was composed of 6 factors (Participants, Initial sound class, Uniqueness point, 1st mora frequency, Word frequency, and Duration). For fast namers,

all factors were used to construct the basic model, although Word frequency did not reach

significance. The overall model was significant (F(19, 9713) = 335.969419, p < 0.00001;

R2 = 0.396574). The basic model for fast namers is shown in Table 3.1 below. Although

a checkmark ("✓") for Participants and Initial Sound Class means that these two factors

are statistically significant, they were treated as dummy variables in the regression

analyses1.

Participants            ✓
Initial sound class     ✓
UP                      Facilitation*
Duration                Inhibition***
1st Mora Frequency      Facilitation***
Word Frequency          Facilitation (p = 0.1281)

(*p < 0.05, ***p < 0.001)

Table 3.1: Basic model of the naming time data for fast namers, Experiment 1.

1 A checkmark ("✓") is used to show statistical significance for dummy variables (such as Participants and Initial Sound Class). This convention is used in the rest of this dissertation without further explanation.

At the next step, three additional models were constructed by adding one of the

three neighborhood density calculations to the basic model. Table 3.2 shows the results

of the basic model and the three additional models for the fast namers. The columns in the

table contain the R2 accounted for by the model, R2 accounted for by the neighborhood

density calculation, and the direction of the neighborhood effect. The basic model

explained 39.66% of the data. The basic model with the neighborhood calculation based

on segments explained 39.71% of the data. Therefore, 0.0517 % of the variance was

actually accounted for by the neighborhood density effect independently of the other

factors. The same calculation was applied to the other two neighborhood calculations.

Models2                 R2 accounted for    R2 accounted for by the    Direction of the
                        by the model        neighborhood effect        neighborhood effect
Basic                   0.396574            NA                         NA
Basic + Segs            0.397091            0.000517**3                Facilitation
Basic + Segs&Pitch      0.396761            0.000187*                  Facilitation
Basic + Auditory        0.396574            0                          NA

(*p < 0.05, **p < 0.01)

Table 3.2: Models of the naming time data for fast namers, Experiment 1.

A comparison of the R2 accounted for by the three neighborhood calculations in

Table 3.2 shows two findings. Firstly, the Segments calculation and the Segments +

Pitch calculation both showed neighborhood facilitative effects: words from a dense

neighborhood were named more quickly than words from a sparse neighborhood.

Although all three neighborhood calculations showed neighborhood facilitative effects,

only the effects from the Segments calculation and the Segments + Pitch calculation

reached significance. The Segments calculation yielded the highest statistically

significant R2, and as such it counts as the best neighborhood calculation for describing

2 Segs = the Segments calculation; Segs+Pitch = the Segments + Pitch calculation; and Auditory = the Auditory calculation. I will use these conventions in the rest of this dissertation. 3 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effects were significant (F1(1, 13) = 27.463, p < 0.001; F2(1, 698) = 22.499, p < 0.001).

the data of the fast namers. Comparing the R2 of the three calculations, R2 tended to

decrease as the representation of the calculations changed from the categorical phonemic

representation to the Auditory representation.

The second finding is that the neighborhood density effect is facilitative. Such a

facilitative effect for neighborhood density has been observed for nonword targets in an

auditory naming task in Vitevitch and Luce (1998, 1999). Vitevitch and Luce (1999) also

claimed that neighborhood inhibitory effects that were strongly observed among words in

English could be modified by focusing participants’ processing on a sublexical level.

They conducted a same-different matching task in which nonwords and words were

presented together. The reasoning behind this was that if the presentation of words and

nonwords were mixed, participants would focus their processing on the sublexical level

that is common to both words and nonwords. The results showed that the previously

observed inhibition effect of neighborhood density for these words considerably

attenuated, resulting in no significant effect of neighborhood density. Their results

showed that neighborhood density inhibition effects could be completely changed to

neighborhood density facilitation if the effect of probabilistic phonotactics was stronger

than a lexical competition effect (neighborhood inhibitory effect).

A comparison of the average naming time for words in Vitevitch and Luce (1998) to the naming times in this experiment reveals an extremely large difference (over 800 ms for Vitevitch & Luce, 1998, and 283 ms for fast namers in Experiment 1). This

average naming-time difference clearly shows that Japanese listeners performed the task

much faster than English listeners. However, at issue is whether Japanese listeners

started naming the words before their offset. Table 3.3 shows the number of responses

before and after the offset of the 700 target words for fast namers. The number of

naming-time responses that started before the target offset is greater than the number of

naming responses that started after the target offset for all fast namers. A WILCOXON

signed ranks test showed that this tendency is significant (p < 0.001). Therefore, fast

namers tended to start naming the targets before the offset.

Fast Namers    Before    After
F1             424       276
F2             522       178
F3             396       304
F4             694         6
F5             675        25
F6             494       206
F7             700         0
F8             583       117
F9             650        50
F10            678        22
F11            546       154
F12            671        29
F13            640        60
F14            677        23

Table 3.3: The number of responses before and after the offset of the 700 target words for fast namers.
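The sign-rank comparison reported above can be reproduced from the counts in Table 3.3, for example with scipy's paired Wilcoxon test; treating each namer's before/after counts as a matched pair is the only assumption here.

```python
# Sketch of the Wilcoxon signed-rank test on the before/after counts
# for the 14 fast namers (values taken from Table 3.3).
from scipy.stats import wilcoxon

before = [424, 522, 396, 694, 675, 494, 700, 583, 650, 678, 546, 671, 640, 677]
after  = [276, 178, 304,   6,  25, 206,   0, 117,  50,  22, 154,  29,  60,  23]
stat, p = wilcoxon(before, after)
print(stat, p)   # a small p indicates before-offset counts reliably exceed after-offset counts
```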

A basic model was also constructed for the slow namers using all the factors that

accounted for a significant portion of the naming time variance. The overall fit of the

model for the slow namers was significant (F( 17, 8950) = 237.40333, p < 0.00001; R2 =

0.310789). The basic model for slow namers is shown in Table 3.4.

Participants            ✓
Initial sound class     ✓
UP
Duration                Inhibition***
1st Mora Frequency      Facilitation**
Word Frequency          Facilitation***

(**p < 0.01, ***p < 0.001)

Table 3.4: Basic model of the naming time data for slow namers, Experiment 1.

Table 3.5 shows the results of the models of the naming time data for the slow

namers. As in the analyses for fast namers, as shown in Table 3.2, neighborhood density

facilitated auditory naming. The effects were significant for the Auditory calculation and

the Segments + Pitch calculation, but marginally significant for the Segments calculation

(p = 0.0564). This reflects a major difference between fast namers and slow namers.

namely, that the direction of increasing R2 magnitude was different. As shown in Table

3.5, the Auditory calculation yielded the highest statistically significant R2 (R2 =

0.000541). This indicates that the neighborhood density based on the auditory

representation predicted listeners’ performance better than the other methods for

calculating neighborhood density. It is interesting that the more detailed representations

(Auditory and Segments + Pitch) were better at predicting naming times for these slow

namers, while the opposite tendency was observed for the fast namers.

Models                  R2 accounted for    R2 accounted for by the    Direction of the
                        by the model        neighborhood effect        neighborhood effect
Basic                   0.310789            NA                         NA
Basic + Seg             0.311069            0.00028 (p = 0.0564)       Facilitation
Basic + Seg&Pitch       0.311154            0.000365*                  Facilitation
Basic + Auditory        0.311330            0.000541**4                Facilitation

(*p < 0.05, **p < 0.01)

Table 3.5: Models of the naming time data for slow namers, Experiment 1.

4 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effects were significant (F1(1, 12) = 15.798, p < 0.005; F2(1, 697) = 36.220, p < 0.0001).

A comparison of the R2 accounted for by three neighborhood calculations in Table

3.5 shows two findings. First, the Segments + Pitch calculation and the Auditory

calculation both showed neighborhood facilitative effects: words from a dense

neighborhood were named more quickly than words from a sparse neighborhood.

Although all three neighborhood calculations showed neighborhood facilitative effects,

only the effects from the Segments + Pitch calculation and the Auditory calculation

reached significance. Because the Auditory calculation yielded the highest statistically

significant R2, it represents the best neighborhood calculation for describing the data of

the slow namers. Comparing the R2 of the three calculations, there was a tendency for R2

to increase as the representation of the calculations changed from the categorical

phonemic representation to the Auditory representation.

Second, the neighborhood density effect was facilitative. Although such a

facilitative effect for neighborhood density has been observed for nonword targets in an auditory naming task in Vitevitch and Luce (1998, 1999), the data from the slow namers confirmed, with real words, the same finding observed for the fast namers.

Table 3.6 shows the number of responses before and after the offset of the 700

target words for slow namers. S1, S3, and S7 named the majority of target words after they had heard them completely, and most of the participants showed the same tendency. Some participants, such as S5 and S11, named more targets before the offset than after the offset. For 9 out of 13 slow namers, the number of naming responses that started after the target offset was greater than the number of naming responses that started before

the target offset. A WILCOXON signed ranks test showed that this tendency was

marginal (p = 0.086860). As a group, the slow namers tended to hear the entire word

before performing the task, but this was not true for all target words.

Slow Namers    Before    After
S1              79       621
S2             409       291
S3              82       618
S4             261       439
S5             403       297
S6             200       500
S7              75       625
S8             452       248
S9             237       463
S10            343       357
S11            468       232
S12            161       539
S13            347       353

Table 3.6: The number of responses before and after the offset of the 700 target words for slow namers.

Table 3.7 contains a summary of the reliable effects from the regression models

for fast namers and slow namers. The basic model with the Segments calculation and the basic model with the Auditory calculation were chosen as the best models for the fast namers and the slow namers, respectively.

                                   Fast Namers       Slow Namers
Basic model
  Participants                     ✓                 ✓
  Initial sound class              ✓                 ✓
  UP
  Duration                         Inhibition        Inhibition
  1st Mora Frequency               Facilitation      Facilitation
  Word Frequency                                     Facilitation
Neighborhood density
  Segments                         Facilitation
  Segments + Pitch                 Facilitation      Facilitation
  Auditory                                           Facilitation

Table 3.7: A summary of the reliable effects from the regression models for the naming time data (fast namers and slow namers), Experiment 1. The calculation that yielded the highest increase in R2 was the Segments calculation for the fast namers and the Auditory calculation for the slow namers.

When the patterns for fast namers and slow namers are compared in terms of

effective factors, it becomes apparent that four of the factors used in the basic models are

common to both fast namers and slow namers. The participants factor had an effect,

even after grouping into fast and slow namers. The initial sound classes of the target

words (Initial sound class) also affected naming time. In an auditory naming experiment,

initial sound class should affect naming times, because different initial sounds may have

different amplitudes. Although Luce and Pisoni (1998) included initial sound difference

as a factor for their analyses, they did not report the details of any effects of initial sound

difference. The analysis of the current experiment showed that words beginning with a

stop were named significantly more quickly than words beginning with a nasal or a

fricative. Moreover, words beginning with a fricative were named significantly more

quickly than words beginning with a nasal (F(2, 9730) = 173.022, p < 0.00001 for the fast namers; F(2, 8965) = 204.769, p < 0.00001 for the slow namers). Tukey HSD

Multiple Comparisons showed that the differences among the sound classes were significant for both groups (all at the significance level of p < 0.0001).

Duration of the words also affected naming times. Shorter words were named

more quickly than longer words. As shown in §2.4, the durations of the target words

varied from 431 ms to 780 ms, and thus, it is not surprising that this nearly 350 ms

difference among the word utterances affected the naming times for both groups of

participants.

First Mora Frequency affected naming times as well. Note that the first mora

frequency is the proportion of words beginning with that mora. The data demonstrated

that words with a higher first mora frequency were named more quickly than words with

a lower first mora frequency for both fast and slow namers.

Two other factors (Neighborhood density and Word frequency) demonstrated an interesting contrast between fast namers and slow namers. A word frequency effect was observed in this auditory naming experiment, although it was only found for slow namers.

For these listeners, word frequency facilitated auditory naming. Frequent words were

named more quickly than infrequent words. This seems to indicate that slow namers but

not fast namers accessed the lexicon while they performed the task.

Recall that Japanese participants performed the task much more quickly than

English participants in the equivalent experiments (Luce & Pisoni, 1998; Vitevitch &

Luce, 1999). Even the slow namers in this experiment performed the task much more

quickly than the English participants. Interestingly, the slow namers showed a word

frequency effect that was not observed in the naming data from the English participants.

This may indicate that Japanese and English are different in terms of when word

frequency has an effect.

The data showed that Japanese participants processed CVCVCV targets in

approximately the same amount of time as English monosyllables. Although the overall

duration of these targets was nearly identical, the amount of phonological information

available at given points within the words was different. This means that Japanese participants

process more phonological information than English participants do in roughly the same

amount of time. These results are consistent with general claims that later processing

times reflect effects from the lexicon, such as word frequency, but also suggest that the

structure of the stimuli affects processing in the two languages.

As discussed above, a neighborhood density effect was observed among both fast

namers and slow namers. The results showed two interesting findings. First, the

neighborhood density effect was facilitative in this auditory naming experiment in

Japanese. This pattern was observed whether participants started naming words before

(fast namers) or after (slow namers) the word offset. Second, there was an interesting

contrast between fast namers and slow namers, which suggested neighborhood density

might be calculated based on different kinds of information at different stages of

processing. In this experiment, fast namers seemed to rely on the Segments calculation as

well as the Segments + Pitch calculation. Slow namers seemed to rely on the Auditory

calculation as well as the Segments + Pitch calculation. This could indicate that the

information available to listeners might be different, depending on when they performed

their task relative to the timecourse of the word recognition. The change in effective

neighborhood suggests that richer information in the word representation becomes

available during the timecourse of word recognition. The calculation based on the

auditory representation of words was more highly correlated with slow namers’ reaction

times because not only segmental information but also pitch-accent information was

available in the word representation.

In sum, there are four main findings from the analyses of naming times. First,

many factors affected both fast namers and slow namers. The neighborhood density

effect, of main interest, was one of the factors that predicted a significant proportion of

the naming time variance. Second, the neighborhood density effect showed facilitation

for both fast and slow namers. Third, the neighborhood density effect consistently

seemed to be based on different kinds of information (from different levels of

representation) for the fast versus the slow namers. Finally, a lexically-based word

frequency effect was observed only for slow namers.

Because of the extremely high accuracy of the participants (99.86% of the

responses by the fast namers and 99.85% of the responses by the slow namers), it was not

possible to look at effects of accuracy rates.

3.4. Discussion

Previous auditory naming experiments in English have reported that neighborhood

density negatively affects the naming time and accuracy of words: words from dense

neighborhoods are named less quickly and accurately than words from sparse

neighborhoods (Luce & Pisoni, 1998; Vitevitch & Luce, 1998). The interpretation of this

lexical competition effect is based on the assumption that English listeners have to

retrieve pronunciation information from the lexicon. Because the targets in Experiment 1

were words, it was expected that a Japanese auditory naming experiment would replicate

the effects observed in English auditory naming experiments. However, this pattern was

not observed.

A summary of the effects from Experiment 1, as shown in Table 3.7, indicated

that neighborhood density affected auditory naming reaction times. However, there is a

crucial difference between these Japanese auditory naming results and previous results

from English listeners: namely, the direction of the effect. Neighborhood density

facilitated naming times in Japanese, while in English, neighborhood density has been found to facilitate naming only for nonwords (Vitevitch & Luce, 1998). This may indicate that

Japanese listeners started naming before retrieving full pronunciation forms of the target

words.

The naming time results also showed that the same calculation was not necessarily

the best (i.e., most predictive) neighborhood calculation for fast namers and slow namers.

As shown in Table 3.7, fast namers seemed to be influenced by neighborhoods of words

that were similar in terms of segmental make-up, while slow namers were influenced by

neighborhoods of words that were similar in auditory detail. Previous neighborhood

studies have assumed that neighborhood density is computed from only one calculation,

in which the represented form must be the 'right' one for all participants and for the duration

of word recognition. However, our data show that Japanese listeners rely at least to some

extent on all the neighborhood calculations proposed in this dissertation. Furthermore,

the relevance of the neighborhood density calculations is highly related to the timecourse

of lexical access processes. These data suggest that during auditory word recognition the

set of activated lexical items changes as a function of time.

The results of this experiment also provided evidence that prosodic word

information is involved in lexical activation. The Auditory definition of neighbors, which

inherently includes pitch accent patterns, and the Segments + Pitch calculation, which

employs abstract pitch accent patterns, were both found to define neighborhood densities

that correlated better with listeners’ performance (for slow namers) than did the

neighborhood density defined on segmental similarity alone. The results confirmed the

view that listeners use all possible information that might be useful for lexical access (e.g., Cutler, 1997; Soto-Faraco, Sebastian-Galles, & Cutler, 2001).

To further explore these findings, an auditory naming experiment was conducted

with the same 700 word stimuli embedded in noise. It was expected that the naming time

data would show the same neighborhood facilitative effect found in this experiment. As

for the predicted accuracy of the data, it was expected that listeners would make more

mistakes, and thus, potentially show a strong neighborhood inhibitory effect. Luce and

Pisoni (1998) conducted a study of word identification in noise in which they found a

strong neighborhood density inhibitory effect. Similarly, an inhibitory effect for

neighborhood density was also confirmed in Japanese (Amano & Kondo, 1999).

Therefore, two types of neighborhood density effects (facilitation and inhibition) should

be observed in Experiment 2.

CHAPTER 4

EXPERIMENT 2: AUDITORY NAMING IN NOISE

4.1. Introduction

The purpose of this experiment was to extend the findings in Experiment 1: a

facilitative effect for neighborhood density in the naming time data that depended on

different kinds of lexical information for fast and slow namers.

In this experiment, the same set of target words was embedded in noise, to allow a

direct comparison between the data in two auditory naming conditions: normal (Experiment 1) and in noise (the current experiment). Previous word identification in noise

experiments in Japanese and English (Luce & Pisoni, 1998, for English; Amano &

Kondo. 1999, for Japanese) showed an inhibitory effect for neighborhood density: words

from dense neighborhoods were identified less accurately than words from sparse

neighborhoods. The accuracy data in this experiment should also show a neighborhood

density inhibitory effect.

As for the naming time data, participants’ performance should essentially be

similar to Experiment 1 because this is also an auditory naming experiment. In other

words, slow namers should show a facilitative neighborhood density effect. In addition,

this experiment might replicate the difference between fast namers and slow namers

found in Experiment 1: fast namers tended to rely on the Segments calculation, whereas

slow namers tended to rely on the Auditory calculation.

4.2. Methods

The same 700 words that were used in Experiment 1 were the target words in this

experiment. The stimuli were created by adding noise to the audio files used in the

Experiment 1 such that the signal-to-noise (SN) ratio at the point of peak RMS dB was 0

dB SPL1. The signal-to-noise ratio is the ratio of the amplitude of the stimuli against a

constant level of Gaussian noise. The level of the noise was estimated from the peak

RMS amplitude of each stimulus file. The noise extended 500 ms before and after each

stimulus word. The stimuli were presented to the participants at 75 dB SPL, as measured

using a sound level meter.

1 The signal-to-noise (SN) ratio at the point of peak RMS dB was selected based on the results of a pilot study conducted in order to find the SN ratio that yielded about 75% response accuracy. One hundred words were given to one participant at the SN ratio of -2.5 and to another participant at the SN ratio of 0. When the SN ratio was 0, accuracy was about 70%. Thus, this SN ratio was used in this noise experiment.
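A sketch of the noise-mixing step, under the same windowed-RMS assumption as the earlier scaling sketch: Gaussian noise is scaled so that its RMS equals the stimulus's peak RMS (0 dB SNR at that point) and is extended 500 ms before and after the word. The sampling rate and synthetic waveform are assumptions.

```python
# Sketch of mixing a word with Gaussian noise at 0 dB SNR relative to
# the word's peak RMS, with 500 ms of noise padding on each side.
import numpy as np

def add_noise_0db(signal, sr=16000, pad_ms=500, win=512, seed=0):
    rng = np.random.default_rng(seed)
    n = len(signal) // win * win
    frames = signal[:n].reshape(-1, win)
    peak_rms = np.sqrt((frames ** 2).mean(axis=1)).max()
    pad = int(sr * pad_ms / 1000)
    noise = rng.normal(size=len(signal) + 2 * pad)
    noise *= peak_rms / np.sqrt((noise ** 2).mean())   # noise RMS = peak signal RMS
    mixed = noise.copy()
    mixed[pad:pad + len(signal)] += signal             # word embedded in the noise
    return mixed

wave = np.random.default_rng(1).normal(scale=0.05, size=16000)  # stand-in stimulus
print(add_noise_0db(wave).shape)   # (32000,): the word plus 500 ms of noise on each side
```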

Participants were 27 native speakers of Tokyo Japanese who were born and raised

in the Tokyo area (Tokyo, Kanagawa, Chiba, and Saitama), as explained in Chapter 2.

No subjects had participated in Experiment 1.

The procedure was exactly the same as in Experiment 1 except for the addition of

a second task. Participants were asked to write down what they said in hiragana

characters after they repeated each word. Some participants used katakana characters

and/or kanji characters in their responses. As long as a response was unambiguously

interpretable, it was included in the analysis. It was emphasized to them that they should

first repeat the words as quickly as possible, and then write down their responses. All the

naming responses were recorded onto a digital audio tape (DAT), and the author analyzed

the accuracy of spoken responses.

4.3. Results

First, abnormally fast and slow responses falling above or below 2.5 standard

deviations (subjects and items) were eliminated from the naming-time analysis. Before

the analyses were conducted, the mean naming time for each participant was calculated in

order to understand the participants’ performance. The data showed that the range for

naming times was somewhat large. Even after the abnormally fast and slow responses

were removed (about 20%), the mean naming time for each participant varied from 1010

ms to 1439 ms (MEAN = 1243 ms, SD = 97.97)2. Because this naming time range is

quite large, the 27 participants were classified into two groups: the fast namers and the

slow namers. First, the median naming times for all participants were calculated. Then,

participants were split by the median naming time (13 participants for fast namers and 14

participants for slow namers). The analyses of the naming data for fast namers and slow

namers were conducted separately.

The written responses were checked against the named responses, as recorded

onto the DAT tapes, in order to make sure that the responses were named with the correct

accent patterns. The author first coded correct responses and missed responses as " I” and

"0, ” respectively using a computer script. Then, the responses with a different pitch

accent pattern were treated as missed responses and corrected to “0.”

The analyses of the naming data and the word identification data were conducted

separately.

4.3.1. Naming Time Data

Naming times were analyzed in a multiple general regression model (Cohen &

Cohen, 1983) for three different neighborhood density calculations. The percentage of

variance accounted for by each neighborhood definition was calculated by subtracting the

2 Naming times were measured from the onset of the sound files. The range of naming times measured from the onset of the embedded target words is between 510 ms and 939 ms.

R2 of the basic model from the R2 of the basic model + each neighborhood density. The

results of the regression analyses conducted in this section are shown in Appendix E.

The basic model was composed of 6 factors (Participants, Initial sound category, Uniqueness point, 1st mora frequency, Word frequency, and Duration). All factors except

Uniqueness point were included in the model, although Initial sound class did not reach significance.

This basic model accounted for 15.1% of the naming time variance (F(18, 5844) = 57.763697, p < 0.00001; R2 = 0.151044). The basic model is shown in Table 4.1 below.

Participants            ✓***
Initial sound class     (p = 0.1203)
UP
Duration                Inhibition***
1st Mora Frequency      Facilitation***
Word Frequency          Facilitation*

(*p < 0.05, **p < 0.001, ***p < 0.0001)

Table 4.1: Basic model of the naming time data for fast namers, Experiment 2.

Three additional models were constructed using the factors from the basic model

and each of the three different neighborhood calculations for fast namers. Table 4.2

shows the results of the basic model and three additional models for the fast namers. The

columns in the table contain the R2 accounted for by the model, R2 accounted for by the

neighborhood density calculation, and the direction of the neighborhood effect. However,

none of the neighborhood density calculations significantly explained more of the data

than the basic model.

Models                  R2 accounted for    R2 accounted for by the    Direction of the
                        by the model        neighborhood effect        neighborhood effect
Basic                   0.151044            NA                         NA
Basic + Seg             0.151354            0.00031                    Facilitative
Basic + Seg&Pitch       0.151044            0                          NA
Basic + Auditory        0.151044            0                          NA

Table 4.2: Models of the naming time data for fast namers, Experiment 2.

Similarly, the basic model for slow namers was potentially composed of the same

6 factors above. However, the basic model of the slow namers was built from only two factors

(Participants and Duration), which accounted for a relatively low (7.7%) but significant,

amount of variance (F(13, 4340) = 28.017, p < 0.00001; R2 = 0.077425). The basic

model is shown in Table 4.3.

Participants            ✓
Initial sound class
UP
Duration                Inhibition***
1st Mora Frequency
Word Frequency

(***p < 0.001)

Table 4.3: Basic model of the naming time data for slow namers, Experiment 2.

Next, three additional models were constructed using the factors of the basic

model and each of the three different neighborhood calculations for slow namers. Table

4.4 shows these results. The Auditory neighborhood density calculation yielded the

highest and only significant increase in R2 (R2 = 0.000991). Therefore, the Auditory

method is the best neighborhood calculation for describing the data of the slow namers.

Models                  R2 accounted for    R2 accounted for by the    Direction of the
                        by the model        neighborhood effect        neighborhood effect
Basic                   0.077425            NA                         NA
Basic + Seg             0.077425            0                          NA
Basic + Seg&Pitch       0.077425            0                          NA
Basic + Auditory        0.078416            0.000991*3                 Facilitation

(*p < 0.05)

Table 4.4: Models of the naming time data for slow namers, Experiment 2.

Table 4.5 shows a summary of analyses from the regression models for the fast

and slow namers. The Basic + Auditory model was chosen as the best model for slow

namers.

3 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effects were significant in the subjects analysis only (F1(1, 13) = 6.154, p < 0.05; F2(1, 698) = 0.044, p > 0.1).

                                   Fast Namers        Slow Namers
Basic model
  Participants                     ✓                  ✓
  Initial sound class              (✓)
  UP
  Duration                         Inhibition         Inhibition
  1st Mora Frequency               Facilitation
  Word Frequency                   Facilitation
Neighborhood density
  Segments                         (Facilitation)
  Segments + Pitch
  Auditory                                            Facilitation

Table 4.5: A summary of the regression models for the naming time data (both fast namers and slow namers), Experiment 2.

A comparison of fast and slow namers in terms of effective factors shows that two

factors used in the basic models had effects for both fast namers and slow namers:

Participants and Duration. Both groups showed variability among the participants, even

after grouping was conducted. The durations of the words also affected the reaction

times. Shorter words were named more quickly than longer words. As shown in §2.4,

the durations of the target words varied from 431 ms to 780 ms. This nearly 350 ms

difference among the word utterances certainly affected naming times for both groups of

participants.

A facilitative neighborhood density effect was observed for both fast namers and

slow namers. Words with many neighbors were named more quickly than words with

few neighbors. It is worth mentioning that even though fast namers in Experiment 2 were

slower than slow namers in Experiment 1, the Auditory calculation was the best

calculation for slow namers. In the auditory representations, segmental information and

pitch accent information were available. The Segments calculation was the best

calculation for fast namers, although it was not statistically significant. Taken together,

the results of the naming time data in this experiment replicated the findings regarding

neighborhood density in Experiment 1.

Finally, Word frequency and first mora frequency were both facilitative effects

observed only for fast namers.

4.3.2. Word Identification Data

This section contains the analysis of the word identification data. The written

responses were typed into a spreadsheet by the author, then checked against the named

responses as recorded onto DAT tapes. This was done to make sure that the responses

were named with the correct accent patterns. If the written responses were not exactly the

same as named responses, they were hand-corrected. Correct responses and incorrect

responses were coded as "1" and "0," respectively.

The overall average proportions of correct responses by fast namers and by slow

namers were 0.72 (SD: 0.45) and 0.70 (SD: 0.46), respectively. The mean difference

between the two groups of participants was significant (F(1, 18898) = 6.36, p < 0.05).

Therefore, no speed-accuracy trade-off was observed.

As with the reaction time data, general regression models were also built for the

accuracy data. Table 4.6 shows the basic model for fast namers. The basic model was

again constructed using the 6 basic factors ( Participants, Initial sound class. Uniqueness

points, Ist mora frequency. Word frequency, and Duration). The basic model for fast

namers included all factors except for Uniqueness point. The basic model explained just

3.4 % of the data, which, though low. is reliably greater than nothing (F( 18, 9781) =

19.209873, p < 0.0001: R2 = 0.034145).

100

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Participants yj***

Initial sound class yj'.(C**

UP

Duration Lower accuracy**

1st Mora Frequency Higher accuracy***

Word Frequency Higher accuracy***

(**p <0.01, ***p< 0 .0 0 1)

Table 4.6: Basic model of the word identification data for fast namers, Experiment 2.

Table 4.7 shows three additional models that were constructed by adding one of

the three neighborhood density calculations to the basic model to explain the fast namers'

word identification data. The basic model with the Segments calculation and the basic

model with the Segments + Pitch calculation were models that increased R2 by adding

neighborhood density as a factor. Unlike the analyses of the naming times, the

neighborhood density effects observed in the word identification data show inhibition:

words with many neighbors were recognized less accurately than words with few

neighbors.

Models                  R2 accounted for    R2 accounted for by the    Direction of the
                        by the model        neighborhood effect        neighborhood effect
Basic                   0.034145            NA                         NA
Basic + Seg             0.042429            0.008284***                Lower accuracy
Basic + Seg&Pitch       0.043953            0.009808***4               Lower accuracy
Basic + Auditory        0.034145            0                          NA

(***p < 0.001)

Table 4.7: Models of the word identification data for fast namers, Experiment 2.

Word identification data from the slow namers were similarly analyzed. Table 4.8

shows the basic model of the word identification data for slow namers. Although the

basic model was again constructed using 6 factors (Participants, Initial sound class, Uniqueness point, 1st mora frequency, Word frequency, and Duration), the basic model for slow namers included all factors except Uniqueness point. The model accounted for 3.2% of the variance (F(17, 9082) = 17.991411, p < 0.0001; R2 = 0.032580).

4 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effects were significant (F1(1, 13) = 22.148, p < 0.001; F2(1, 627) = 8.629, p < 0.005).

Participants                √***
Initial sound class         √***
UP
Duration                    Lower accuracy***
1st Mora Frequency          Higher accuracy***
Word Frequency              Higher accuracy***

(***p < 0.001)

Table 4.8: Basic model of the word identification data for slow namers, Experiment 2.

Table 4.9 shows the results of the basic model and three other models of the word

identification data for the slow namers. The Segments + Pitch calculation yielded the highest increase in R2 (0.012748), which was significant (p < 0.001). Thus, the Segments + Pitch calculation was the best neighborhood calculation for describing the accuracy data of the

slow namers.

Models              R2 accounted for   R2 accounted for by        Direction of the
                    by the model       the neighborhood effect    neighborhood effect

Basic               0.032580           NA                         NA
Basic + Seg         0.042105           0.009525***                Lower accuracy
Basic + Seg&Pitch   0.045328           0.012748***5               Lower accuracy
Basic + Auditory    0.032580           0                          NA

(***p < 0.001)

Table 4.9: Models of the word identification data for slow namers, Experiment 2.

Table 4.10 contains a summary of analyses of the regression models for the fast

namers and the slow namers. The basic model with the Segments + Pitch calculation was

chosen for both fast namers and slow namers.

5 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effects were significant (F1(1, 12) = 67.34, p < 0.0001; F2(1, 627) = 9.847, p < 0.005).

                            Fast Namers         Slow Namers

Participants                √                   √
Initial sound class         √                   √
UP
Duration                    Lower accuracy      Lower accuracy
1st Mora Frequency          Higher accuracy     Higher accuracy
Word Frequency              Higher accuracy     Higher accuracy
Segments                    Lower accuracy      Lower accuracy
Segments + Pitch            Lower accuracy      Lower accuracy
Auditory

Table 4.10: A summary of the regression models for the word identification data (both fast namers and slow namers), Experiment 2.

The regression analyses from the word identification data for fast namers and slow

namers demonstrated that fast namers and slow namers performed very similarly in terms

of accuracy of target word recognition. Five factors contributed to word accuracy:

Participants, Initial Sound Class, 1st Mora Frequency, Word Frequency and

Neighborhood density. As with naming time, the Participants factor still reliably differed

from each other even after being split into fast and slow namers.

The Initial sound class of the target words affected the accuracy of word

recognition in noise. For fast namers, stops (66%, SD = 47) were named less accurately

than fricatives (71%, SD = 45) and nasals (74%, SD = 44; F(2,9797) = 25.85, p <

0.0001). Tukey HSD multiple comparisons showed significant differences between all

pairs of groups at p < 0.05. Slow namers also showed the same tendency (stops: 70%, SD

= 46; nasals: 76%, SD = 43; fricatives: 71%, SD = 45; F(2,9097) = 13.67, p < 0.0001).

Tukey HSD multiple comparisons showed that the differences between stops and nasals

and between fricatives and nasals were significant at p < 0.05, although the difference

between fricatives and stops was not significant. In terms of a sonority scale, stops are

less sonorant than fricatives and nasals, and fricatives are less sonorant than nasals.

Therefore, the performance of fast and slow namers on word identification in noise might

be related to the sonority scale. Further detailed analyses of sound confusion are shown

in Appendix F.
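A minimal sketch of this kind of comparison is given below, assuming the same hypothetical trial-level data frame as in the earlier sketches; it runs a Tukey HSD test over the three initial sound classes on the 0/1 accuracy scores.

    import pandas as pd
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    trials = pd.read_csv("experiment2_responses.csv")   # hypothetical file name
    fast = trials[trials["namer_group"] == "fast"]

    # Pairwise Tukey HSD comparisons of word-identification accuracy across the
    # three initial sound classes (stops, fricatives, nasals).
    result = pairwise_tukeyhsd(endog=fast["correct"],
                               groups=fast["initial_sound_class"],
                               alpha=0.05)
    print(result.summary())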

First Mora Frequency was also significant for both groups’ word accuracy scores.

The data demonstrated that words with a higher first mora frequency were recognized

more accurately than words with a lower first mora frequency for both fast and slow

namers.

A Word Frequency effect was observed in this auditory naming experiment.

Frequent words were recognized more accurately than infrequent words.

Neighborhood density showed an inhibitory relationship with word recognition

accuracy: words from dense neighborhoods were named less accurately than words from

sparse neighborhoods. The word recognition accuracy portion of this experiment is

similar to the word identification in noise experiments of Luce and Pisoni (1998) and

Amano and Kondo (1999). Further, the naming time data replicated the results of

Experiment 1.

4.4. Discussion

Previous word identification in noise experiments have showed that the

neighborhood density effect is an inhibitory effect (Luce & Pisoni, 1998; Amano &

Kondo, 1999), and these data are consistent with that. The word identification data from

the current study showed that words with many neighbors were recognized less accurately

than words with few neighbors. Thus, the word identification data in the auditory naming

experiments here replicated the same inhibitory neighborhood density effect using

Japanese.

There are, however, two main differences between the two previous word

identification in noise experiments and this current experiment. First, Amano and Kondo

(1999) did not consider word prosody. In their study, participants typed hiragana

characters as responses from the keyboard into a computer so that there was no way to

understand whether listeners misperceived pitch-accent patterns or not. In contrast, the

responses of the participants in this experiment were recorded on DATs so that word

confusability in noise could be analyzed in terms of segments, as well as with respect to

word-level prosody. Second, this experiment was able to demonstrate which kinds of

phonological information other than a neighborhood density effect were used to recognize

words in a noisy environment.

It was also the case that Amano and Kondo (1999) used a mora-based

neighborhood calculation where only segmental information was considered, and the

effect of the Segments calculation in the current experiment confirmed this finding as

well. Moreover, the data here suggested that pitch accent information was also exploited

to understand confusable words in noise.

Listeners exploited any kind of information that could help them understand words in noise. Word Frequency and first mora frequency both contributed to higher word identification accuracy.

Initial sound class was also significant. Although a distinction among stops,

fricatives, and nasals was one factor that contributed to the naming time data and the

accuracy data in Experiment 1, Initial sound class was used differently in Experiment 2. Recall that in Experiment 1, words beginning with a stop were named significantly more

quickly than words beginning with a nasal or a fricative. Moreover, words beginning

with a fricative were named significantly more quickly than words beginning with a

nasal. Contrary to this, the current experiment showed that stops were named less

accurately than fricatives and nasals. Fricatives were named less accurately than nasals.

A close look at these patterns reflects different characteristics of sounds. The pattern in

the naming times here reflects the length of the segments. In a brief analysis, one target

word from each initial sound used in this experiment was selected, resulting in eight

words in total. These words have a HLL pitch accent pattern with 'a' as the first vowel.

They are kakudo, tahata, masuku, namida, sarada, shamozi, zyasuto, and zatsumu.

Durations of the initial sounds were measured and the mean duration for each sound class

(stops, nasals and fricatives) was calculated. The analysis showed that the duration

means of stops, nasals and fricatives were 17 ms, 44 ms, and 276 ms, respectively. The

pattern in the accuracy data, however, reflected sonority of sounds. In general, target

words were less easily recognized because of the noisy environment. The data suggest

that participants are sensitive to properties of sounds that are scaled in terms of sonority.

Different interpretations of Initial sound class information in the naming time data and

the accuracy data support the view that these two types of data reflect different aspects of

lexical access processes.

In the naming time data, the neighborhood density effect was observed only for

slow namers. The Auditory calculation was the best calculation for slow namers. These

data might indicate that the best calculation for neighborhood density changed over the

timecourse of word recognition processing.

It is interesting that Duration affected the naming times in both experiments,

whereas Initial sound class and 1st mora frequency were not consistently significant

factors when the stimuli were presented in noise. This might indicate that participants

who heard the stimuli without noise may have been better able to exploit the properties of

the initial sounds and moras. This would indicate that it is not only the length of the

target stimuli, but also the length of the first segment, that affected naming time. The

facilitation of 1st mora frequency could be interpreted in two ways, both of which are consistent with the data in Experiment 1. First, this may be an effect from production:

participants were able to produce the words beginning with a highly probable word-initial

mora as also observed in Experiment 1. The other possible interpretation is that this is an

effect of high transitional probabilities of sounds (word-initial CV transition). Or, these

two effects are both influential, but not separable, since they both show facilitation.

In sum, in this auditory naming experiment in a noisy condition, the neighborhood

density effect was a facilitative effect in the naming data and an inhibitory effect in the

word identification data. An interesting difference between the naming time data and

word identification data is that the relative increase in variance accounted for by adding neighborhood density was greater for the word identification data than for the reaction time data. This could

indicate that the naming times were attenuated because of the noisy environment.

This facilitative neighborhood density effect on naming times occurred no matter

whether participants started naming words before the offsets (fast namers, Experiment 1) or after (slow namers, Experiments 1 and 2). However, one thing to keep in mind is that under noise-free conditions (Experiment 1), fast namers did not show a speed-accuracy

trade-off. They were 99.86 % accurate even when they produced all 700 words in the

experiment. Thus, there is a crucial difference between English and Japanese naming

experiments with words. If the neighborhood inhibitory effect is the result of lexical

competition, then Japanese naming data may show that lexical competition did not affect

processing times. But if this is the case, then the naming task was not the right one for

looking at lexical competition effects even though previous English naming experiments

showed that this task shows an inhibitory neighborhood density effect with words.

Therefore, if other tasks could induce lexical competition, an inhibitory neighborhood

density effect might affect the processing times. This possibility will be investigated in

the final experiment.

CHAPTER 5

EXPERIMENT 3:

SEMANTIC CATEGORIZATION EXPERIMENT

5.1. Introduction

This chapter investigates neighborhood density effects in a task that requires

selection of words stored in the lexicon. The lexical decision task has been used to study

the time course of auditory word recognition for English words presented in the clear

(Goldinger, 1996). Lexical decision has been shown to be sensitive to lexical frequency,

and, with proper controls, it is also sensitive to neighborhood density and neighborhood

frequency (Luce & Pisoni, 1998). However, it requires discrimination between word and

nonword patterns so it is often criticized as having unnatural task characteristics. Further,

Amano and Kondo (1999) failed to show an inhibitory neighborhood density effect on

processing times.

In this last experiment, an alternative to the lexical decision task was used: the

semantic categorization task. Vitevitch and Luce (1999) used this task to investigate the

sublexical and lexical levels of representation in the on-line processing of spoken words.

The semantic categorization task is a relatively new experimental procedure (Forster &

Shen, 1996). In this task, participants are given a semantic category (for example,

animals) and hear a word over headphones; they must decide as quickly and accurately as

possible whether the word belongs to the given semantic category. Vitevitch and Luce

(1999) used this methodology because retrieval of semantic information for responses

definitely requires lexical access, yet such a process would not unnaturally bias

processing towards either the sublexical level or the lexical level as might be the case in

lexical decision. The auditory naming task and the lexical decision task have been

commonly used in the English neighborhood literature. The auditory naming task may

bias processing toward the sublexical level, however, because a response may be made

without accessing lexical representations. On the other hand, the lexical decision task

seems to bias processing towards the lexical level, since participants have to make a

decision about the word’s lexicality.

The semantic categorization task requires not only accessing the lexicon (lexeme)

but also passing the information to a higher semantic level (lemma). Therefore, words

must be competing while participants perform the task. The task required use of actual

words only, which is exactly the constraint for our neighborhood density calculation

based on auditory similarity. Since all target words and filler words must be actual

words, the acoustic-based neighborhood density calculation from the NTT database can

be kept for this experiment. The final experiment was conducted using the semantic

categorization task to test for a neighborhood effect in Japanese spoken word recognition.

5.2. Methods

5.2.1. Stimuli

The target words were the same 700 words that were used in Experiments 1 and 2.

Seven hundred additional words were selected as fillers. There were three crucial

constraints for filler selection in this experiment. First, all 700 additional words had to be

nouns that had sound files in the NTT database. Second, the semantic categories and the

words that belong to them had to be relatively common in Japanese. Third, each semantic

category needed to be represented by an equal number of words. In order to fulfill these

constraints, twenty-four different semantic categories were selected from various

reference resources. Four categories were used twice.

There were three concerns in the selection of semantic categories and their

associated words. First, previous semantic categorization experiments have not yet tested

the effect of using multiple semantic categories within a single experiment. Thus, such

an experimental structure may not work and the collected data may not be coherent at all.

To address this concern, the validity of the semantic categorization experiment was

examined before the final analysis was carried out. The results are reported in §5.3.1.

The second concern was whether participants also would think that the words

selected for each semantic category really belonged to that semantic category. Battig and Montague (1969) conducted a survey on category norms for verbal items in 56 English

categories. In their survey, each semantic category was given to nearly 450 college

students, and their task was to come up with as many words as possible that belonged to

the semantic category within 30 seconds. The results of this study tell us common words

for specific semantic categories in English. Ogawa (1972) conducted a similar survey on

category norms for verbal items in 52 Japanese categories1. Given the constraints on

filler selection described above, it was not possible to use a complete subset from one or

both of these studies. Thus, it was necessary to create some additional (as yet untested)

categories. A pilot study was conducted to test the categories and 700 newly chosen filler

words.

In the pilot study, 28 semantic categories were first explained to the participants,

four native speakers of Tokyo Japanese. Their task was to choose the most appropriate

one of the 28 categories for each of the 700 filler words. These words and semantic

categories met the constraints mentioned earlier. All four participants categorized them

as expected. However, in post-experiment interviews, they all admitted that some words

were easier to categorize than others. Also, some categories were more intuitive than

others. The 700 filler words and the descriptions of their semantic categories are shown

in Appendix G.

1 In Ogawa (1972), participants performed the task within one minute, not 30 seconds.

The semantic categorization experiment was designed such that the 700 words comprising the lexicon used in the previous experiments were the target ("no") words, and the newly chosen 700 words were the "yes" filler words. The newly chosen words had to belong to categories

unambiguously, while the original 700 members of the lexicon used for Experiments 1

and 2 had to be distributed such that they did not belong to the category to which they

were assigned.

As with the original 700 words, the additional 700 words were also scaled so that

the peak root-mean-square (RMS) amplitude values were equated for all new files, at an

amplitude of approximately 75 dB SPL, as measured using a sound level meter.
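The following sketch shows one plausible way to equate peak RMS amplitude across sound files; the file names and the common digital target level are assumptions, and the actual presentation level of approximately 75 dB SPL was set at playback and verified with a sound level meter.

    import numpy as np
    import soundfile as sf

    def peak_rms(x, sr, win_ms=50):
        """Peak of the short-time RMS amplitude over fixed-length analysis windows."""
        win = max(1, int(sr * win_ms / 1000))
        n = len(x) // win
        frames = x[: n * win].reshape(n, win)
        return np.sqrt((frames ** 2).mean(axis=1)).max()

    TARGET_PEAK_RMS = 0.1   # hypothetical common digital level

    for path in ["word0001.wav", "word0002.wav"]:   # placeholder file names
        x, sr = sf.read(path)
        if x.ndim > 1:                  # mix to mono if the file happens to be stereo
            x = x.mean(axis=1)
        x = x * (TARGET_PEAK_RMS / peak_rms(x, sr))
        sf.write(path.replace(".wav", "_scaled.wav"), x, sr)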

In this experiment, the newly selected 700 filler words all belong to some category

whereas the 700 original target words do not always belong to any category. In each

block, twenty-five target words and twenty-five filler words were presented. Only

original target words were analyzed in the experiment.

5.2.2. Participants

Participants were 30 native speakers of Tokyo Japanese who were born and raised

in the Tokyo area (Tokyo, Kanagawa, Chiba, and Saitama) as explained in Chapter 2.

They were all right-handed. They each received a small amount of money for their

participation. No participant had taken part in either of Experiments 1 or 2.

5.2.3. Procedure

Participants required one-hour test sessions on each of two successive days.

Participants were tested individually. Each participant was seated in a quiet room. In

each session, one of the two lists, each of which contained 700 words, was presented.

Each list had 14 blocks, and each block contained 50 words. The order of the blocks and

the words within each block were randomized for each participant. The order of the two

lists was counterbalanced among the participants. Presentation of stimuli and response

collection were controlled by a computer.

At the beginning of each test session, participants were given a written list of 14

semantic categories and their descriptions. The written descriptions allowed the

participants to ask questions about the definitions of categories. Following this, the

participants were told that they would listen to 50 words each in 14 blocks in the

experiment. At the beginning of each block, they would hear the same description of a

semantic category as in the written list. Their task was to decide whether each word in

the block belonged to the specified semantic category for each block and to press a button

(‘yes’ or ‘no’) as quickly and accurately as possible. The left-hand button was labeled no

and the right-hand button on the response box was labeled yes. The stimulus words were

presented binaurally over headphones at approximately 75 dB SPL.

A visual prompt appeared at the beginning of each trial. One second later, one of

the spoken stimuli was presented over the headphones at 75 dB SPL to participants.

Reaction times were measured from the beginning of the auditory stimulus to the button

press response. If a participant did not respond to the word within 4 seconds from the

beginning of the auditory stimulus, the computer automatically recorded “no response”

and presented the next trial.
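To make the trial timing concrete, here is a schematic sketch of one trial; the presentation and response-box helpers (show_prompt, play_sound, poll_button) are hypothetical placeholders, not the software actually used.

    import time

    TIMEOUT = 4.0  # seconds allowed from stimulus onset

    def run_trial(stimulus, show_prompt, play_sound, poll_button):
        """One semantic-categorization trial (helper functions are hypothetical)."""
        show_prompt()                     # visual prompt at the start of the trial
        time.sleep(1.0)                   # the spoken stimulus begins one second later
        onset = time.monotonic()
        play_sound(stimulus)              # binaural presentation at ~75 dB SPL
        while time.monotonic() - onset < TIMEOUT:
            button = poll_button()        # returns "yes", "no", or None
            if button is not None:
                rt = time.monotonic() - onset   # RT measured from stimulus onset
                return button, rt
        return "no response", None        # timeout is recorded automatically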

Before the test session, participants were given a practice block of 10 stimuli, which were excluded from the final data analysis, to familiarize them with the procedure. Each session

lasted for an hour.

5.3. Results

5.3.1. An Evaluation of the Semantic Categorization Task in Japanese

Before analyzing the “no” responses for the original 700 words, "yes” responses

to the newly chosen filler words were analyzed in order to make sure that participants

performed their task appropriately. In this experiment, half of the experimental stimuli

were “no” target-word responses that were included in the final analysis, and the other

half were “yes” filler-word responses that were supposed to belong to semantic categories

in the experiment. The strategy here is that “yes” filler words were distracters to the

participants. The participants did not know that the focus of the experiment was on the

“no” target-word responses, rather than on the “yes” filler-word responses. Our

expectation was that the proportion of correct responses (answering “yes”) for filler-

words responses should be relatively high. However, if the proportion of correct

responses to those words were low, the experiment itself would be a failure.

The proportion of correct responses for "yes" filler-word responses was 0.95. The

data also showed some variability in the participants’ performance on the “yes” filler-

word responses. The highest and the lowest proportions of the correct responses were

0.985 and 0.917.

The different semantic categories used in the experiment had to be evaluated. Twenty-four

different semantic categories were used for 28 blocks in total, and each block consisted of

25 words selected for each semantic category and 25 target words not belonging to the

category. There are a couple of questions regarding the use of multiple semantic

categories in a single experiment. The first question is whether the different semantic

categories were equally difficult in the experiment. Assuming that words were easy to

categorize in a designated semantic category, participants should not have made mistakes.

Table 5.1 shows a summary of correct responses for “yes" filler-response words in

terms of semantic categories2. The first column shows the name of the semantic

categories (Category). The next column shows the mean proportion of correct responses

for each semantic category (Mean). The next five columns indicate the number of correct

responses. For example, the column labeled 30 indicates the number of words that were

correctly categorized by all 30 participants. Likewise, the column labeled 29 indicates the number of words that 29 participants correctly classified, and so forth. The column labeled <27 indicates the number of words that fewer than 27 participants responded to correctly.

For example, in the block "Animals", 10 words were categorized correctly by 30

participants, and so on. The final column ("NW") shows the number of words correctly classified by at least 27 (93%) of the participants. In the block Animals, all 25 words were correctly classified.

2 Four semantic categories were used twice. These categories were listed as separate semantic categories, such as Careers (I) and Careers (II).

Category                     Mean     30   29   28   27   <27   NW

Animals                      0.977    10   13    2    0    0    25
Ingredients                  0.976    12    8    5    0    0    25
Colors                       0.973    13    6    4    2    0    23
Desserts                     0.97     12    7    4    1    1    23
Fruits                       0.976    18    3    1    1    2    22
Diseases                     0.972    14    6    2    2    1    22
Sports                       0.97     12    8    2    2    1    22
Main dishes                  0.968    16    4    2    1    2    22
Vegetables & beans (I)       0.957    11    5    6    0    3    22
Insects                      0.948    12    8    2    1    2    22
Birds                        0.962    14    5    2    1    3    21
Body parts (I)               0.96     10    8    3    3    1    21
Grammatical terms            0.948     7    6    8    2    2    21
School items                 0.945    10    8    3    1    3    21
Flowers                      0.95      9    8    3    4    1    20
Flavors                      0.948     9    8    3    1    4    20
Careers (II)                 0.94      9    7    4    2    3    20
Body parts (II)              0.96      8   11    0    1    5    19
Parts of the buildings       0.948     7    6    6    4    2    19
Things found in house        0.946     8    5    6    4    2    19
Metals                       0.934    12    6    1    1    5    19
Instruments                  0.956    11    5    2    5    2    18
Vehicles                     0.956    11    5    2    5    2    18
Careers (I)                  0.929     7    7    4    3    4    18
Fish (I)                     0.928     7    7    3    3    5    17
Vegetables & beans (II)      0.925    10    6    1    2    6    17
Subjects of study            0.898     7    7    3    0    8    17
Fish (II)                    0.898    11    4    2    1    7    17

Table 5.1: A summary of correct responses for "yes" filler-word responses in terms of semantic categories.
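The per-category summary in Table 5.1 can be recomputed from a trial-level response table along the following lines; the file name, column names, and the exact NW threshold are assumptions made for illustration.

    import pandas as pd

    # Hypothetical long-format table of "yes" filler-word responses:
    # one row per participant x word, with a 0/1 "correct" column and the category.
    resp = pd.read_csv("experiment3_filler_responses.csv")

    THRESHOLD = 27   # criterion stated in the text ("at least 27 of the 30 participants")

    # Number of participants who classified each filler word correctly.
    per_word = resp.groupby(["category", "word"])["correct"].sum()

    summary = pd.DataFrame({
        "Mean": resp.groupby("category")["correct"].mean(),             # Table 5.1, "Mean"
        "NW": (per_word >= THRESHOLD).groupby(level="category").sum(),  # Table 5.1, "NW"
    })

    print(summary.sort_values("NW", ascending=False))
    print(summary["Mean"].corr(summary["NW"]) ** 2)   # squared correlation of the two measures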

There are two pieces of evidence to suggest that not all blocks of semantic categories were equally easy. The first piece of evidence is that the means of the correct

responses across words for 28 different blocks had a relatively wide range. The highest mean and the lowest mean were 0.977 (Animals) and 0.898 (Fish (II) and Subjects of

study). The second piece of evidence is that the number of words under “ NW” in Table

5.1 varied. As mentioned above, 25 words were selected for each semantic category. The

column "NW" in Table 5.1 shows a relatively wide range of variability. The highest and

lowest word numbers were 25 words (for Animals and Ingredients) and 17 (for Fish (I),

Fish (II), Vegetables & beans (II) and Subjects of study). The above two factors (number

of words classified correctly by at least 93% of participants and mean correct) were

highly correlated (R2 = 0.809, p < 0.0001).

The analysis of “yes" filler-word responses had two goals. The first goal was to

understand how well participants performed their task in the experiment. The second

goal was to investigate more about semantic categories and their representative words in

Japanese. This analysis yielded three new findings. First, participants generally showed

good performance in the semantic categorization experiment. Note that the selection of

semantic categories and their “yes” filler-word responses was not ideal at all, because

some of the semantic categories were not based on objective reference to previous

literature on category norms in Japanese, but on intuitions of this author as a Japanese

native speaker. Additionally, participants had to deal with multiple semantic categories

in each session, so the task was more difficult than in experiments using a single semantic

category (a more typical setup). Even so, the average percent correct for “yes” filler-word

responses by subjects was 95%. A high percentage value indicates that participants

performed the task very well even under circumstances in which they had to deal with 28

different semantic categories. We expect that an analysis on “no” target-word responses

should be able to provide interpretable results in the semantic categorization experiment.

Secondly, although general accuracy patterns clearly indicate that the participants

performed the task properly, the degree of task difficulty among 28 semantic categories

seemed to be different. The mean values of the words responded to correctly within each

category (Mean) and the number of words correctly classified by at least 93% of the listeners (NW) shown in Table 5.1 both indicate that the participants found their task to be relatively easier in some categories, such as Animals and Ingredients, than in other categories, such as Fish (I) and Subjects of study. Also, these two factors were highly

correlated.

Thirdly, although the degree of task difficulty among semantic categories within

the experiment varied, the selection of semantic categories by itself seemed to be

reasonable. The lowest number of words correctly classified by at least 93% of the

participants was 17 out of 25 words for the categories of Fish (I), Fish (II), Vegetables &

beans (II) and Subjects of study. In other words, at least 17 of the words were considered

by general undergraduate students to be good instances of these semantic categories. The

experimental structure required 25 words for each semantic category. However, the

results clearly suggest that not all the "yes" filler words were representative words for each category. This does not mean that some of the selected semantic categories were

unacceptable. Rather, some semantic categories had less intuitive or less representative

words than the other categories.

The results of the analysis on ‘'yes” filler-response words is useful for selecting

semantic categories and their representative words in the future semantic categorization

experiments in Japanese.

Turning now to the “no” target-word responses, the results show that the overall

correct percentage of “no” target-word responses was high. The highest and the lowest

proportions of correct responses were 0.985 and 0.917, respectively. A high average

percentage (96.54%) indicates that participants were generally able to categorize “no"

target-word responses in the experiment as expected.

Next, accuracy for each target word was investigated. Table 5.2 shows a

summary of accuracy for the 700 target words. The number of words (Number of Words) and the proportion of words (Proportion of Words) that were correctly classified by a given number of participants (Number of Participants) are also shown in the table. For example, 261 out of 700 target words were correctly classified by all 30 participants, a proportion of 0.3729.

The data indicate that the overall percentage of “no" responses to “no” target-

word responses was high. 502 of the 700 target words (71.58%) were correctly classified

as “no” by at least 29 out of the 30 participants of this study. 597 out of the 700 target

words (85.15%) were correctly classified by at least 28 of the 30 participants. Such a

high percentage value indicates that participants generally performed their task as

expected.

Number of Participants    Number of Words    Proportion of Words

30                        261                0.3729
29                        241                0.3429
28                        95                 0.1357
27                        37                 0.0528
26                        22                 0.0314
25                        14                 0.0200
24                        8                  0.0114
23                        2                  0.0029
22                        7                  0.0100
21                        1                  0.0014
20                        2                  0.0029
19                        3                  0.0043
18                        1                  0.0014
17                        2                  0.0029
16                        1                  0.0014
15                        1                  0.0014
14                        0                  0
13                        2                  0.0029
12                        0                  0
11                        0                  0
10                        1                  0.0014
0-9                       0                  0
Total                     700                1

Table 5.2: A summary of the accuracy data for the 700 target words.
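A distribution of this kind can be derived from the trial-level "no"-response data roughly as follows; the file and column names are hypothetical.

    import pandas as pd

    # Hypothetical long-format table of "no" target-word responses (700 words x 30
    # participants), with a 0/1 "correct" column.
    resp = pd.read_csv("experiment3_target_responses.csv")

    # For each target word, count how many of the 30 participants responded correctly.
    n_correct = resp.groupby("word")["correct"].sum()

    # Table 5.2: number (and proportion) of words classified correctly by exactly k participants.
    table = n_correct.value_counts().sort_index(ascending=False).to_frame("Number of Words")
    table["Proportion of Words"] = table["Number of Words"] / len(n_correct)
    print(table)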

However, the data shown in Table 5.2 also indicate that the percentage of "no"

responses was unusually low for a few tokens. In the worst case, only 10 out of 30

participants responded “no" to “no” target-word responses. The participants seemed to

have difficulty responding to some target words.

As explained in §5.3.1, “no” target-word responses could be classified as “yes" if

they were distributed to inappropriate semantic categories. The main question was

whether the unusually low proportions of correct responses to certain words were induced by inappropriate semantic category assignments for those words. In order to confirm this possibility, 104

target words that were correctly classified as “no” by at most 26 participants were

investigated to see whether they could possibly be classified as “yes.” Of 104 target

words, 8 target words seemed to have high error rates, as shown in Table 5.3.

# of Participants    Words                    Semantic Categories

22                   sikori 'stiffness'       Body parts
22                   tanima 'valley'          Body parts
22                   tegami 'letter'          School items
22                   katiku 'livestock'       Animals
18                   sonote 'that method'     Body parts
16                   kabure 'rash'            Diseases
15                   'catfish'                Insects
10                   musubi 'last word'       Grammatical terms

Table 5.3: The discarded target words for the final analysis in the semantic categorization experiment.

The target word assignments to these semantic categories were mistakes: they

should have been assigned more carefully to other semantic categories. Therefore, these

8 target words were dropped from the full analysis. A more detailed explanation is given

in Appendix H.

In summary, this section discussed the validity of the semantic categorization

experiment. The analyses of “yes” filler-word responses and “no” target-word responses

both provided evidence that the participants performed their task as expected. The "no" target-response words were of primary interest, and the results of the

analysis may be able to provide interpretable data. The analysis of correct responses to "yes" filler-response words also provided useful information on the semantic categories

and their representative words. The basic information on the semantic categories and

their representative words may be useful for future semantic categorization experiments.

5.3.2. Semantic Categorization Data

Based on the analyses in §5.3.1, 8 target words were eliminated from the

categorization time analysis. Then, abnormally fast and slow responses falling more than 2.5 standard deviations above or below the subject and stimulus means were deleted.

Semantic categorization times for correct word responses were analyzed in a multiple

regression model (Cohen & Cohen, 1983) for the basic model and four other models.
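A minimal sketch of the trimming step described above is given below, assuming a hypothetical long-format table of correct-response reaction times; note that the text leaves the exact conjunction of the two criteria open, so the sketch removes a trial if it is extreme on either the subject or the item criterion.

    import pandas as pd

    # Hypothetical long-format table of correct-response categorization times.
    rt = pd.read_csv("experiment3_rts.csv")

    def zscore_within(df, group_col, value_col):
        """z-score of value_col computed separately within each level of group_col."""
        g = df.groupby(group_col)[value_col]
        return (df[value_col] - g.transform("mean")) / g.transform("std")

    # 2.5 SD trimming relative to the subject mean and the stimulus (item) mean.
    rt["z_subj"] = zscore_within(rt, "participant", "rt")
    rt["z_item"] = zscore_within(rt, "word", "rt")
    trimmed = rt[(rt["z_subj"].abs() <= 2.5) & (rt["z_item"].abs() <= 2.5)]
    trimmed.to_csv("experiment3_rts_trimmed.csv", index=False)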

Before the analyses were conducted, the mean categorization time for each

participant was calculated. The data showed that the range of categorization times was not remarkably large. After the abnormally fast and slow responses were removed, the mean semantic categorization time for each participant varied from 710 ms to 890 ms (mean = 777 ms, SD = 50.53). Since the variance across participants was smaller than in the previous two experiments, the participants were not split into fast and slow groups.

The percentage of variance accounted for by each neighborhood definition was

calculated by subtracting the R2 of the basic model from the R2 of the basic model +

neighborhood density. The calculation yielding the highest increase in R2 that is

statistically significant was chosen as the best neighborhood calculation for the data.
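The model-comparison procedure can be sketched as follows, assuming hypothetical column names for the predictors and using an F test for the added neighborhood term; this is an illustration of the logic, not the original analysis script.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical file of trimmed, correct-response categorization times.
    data = pd.read_csv("experiment3_rts_trimmed.csv")

    BASIC = ("rt ~ C(participant) + C(initial_sound_class) + C(semantic_category)"
             " + uniqueness_point + first_mora_frequency + word_frequency + duration")
    basic = smf.ols(BASIC, data=data).fit()

    # Add each neighborhood density measure in turn; the increase in R2 over the
    # basic model is the quantity reported in the tables, and compare_f_test gives
    # the F test for the added term against the basic (nested) model.
    for name in ["nd_segments", "nd_segments_pitch", "nd_auditory"]:
        full = smf.ols(BASIC + " + " + name, data=data).fit()
        f_value, p_value, _ = full.compare_f_test(basic)
        print(name, full.rsquared - basic.rsquared, f_value, p_value)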

The basic model was constructed using 7 factors: 6 factors that were used in the

previous two experiments (Participants, Initial sound class, Uniqueness point, 1st

mora frequency, Word frequency, and Duration) and an additional factor: Semantic

category. Semantic category was introduced based on the fact that listeners categorized

words in different semantic categories. All seven factors were used to construct the basic

model that explained 22.3 % variance (F(62, 19601) = 90.725063, p < 0.00001; R2 =

0.222983).

Participants                √
Initial sound class         √***
Semantic Category           √***
UP                          Inhibition***
Duration                    Inhibition***
1st Mora Frequency          Inhibition***
Word Frequency              Facilitation***

(***p < 0.001)

Table 5.4: Basic model of the semantic categorization data, Experiment 3.

Three additional models were constructed using the factors of the basic model and

one of the three different neighborhood calculations. Table 5.5 shows the results of the

four models.

Models               R2 accounted for   R2 accounted for by        Direction of the
                     by the model       the neighborhood effect    neighborhood effect

Basic                0.222983           NA                         NA
Basic + Segs         0.223911           0.000928***                Facilitation
Basic + Segs&Pitch   0.224481           0.001498***                Facilitation
Basic + Auditory     0.223566           0.000583***                Inhibition

(***p < 0.001)

Table 5.5: Models of the semantic categorization data, Experiment 3.

The values of R2 accounted for by the neighborhood effect demonstrate a contrast

between the calculations based on the segmental representation (the Segments calculation

and the Segments + Pitch calculation) and the calculation based on the Auditory

calculation. Interestingly, the calculations based on the segmental representation show

facilitation whereas the calculation based on the auditory representation shows inhibition.

Two types of neighborhood density effect are observed in the same experiment.

Therefore, both types of neighborhood density effect might contribute to the

categorization times. Also, this could support the claim by Luce and Large (2001) that the effect of probabilistic phonotactics (a facilitative effect) and the neighborhood density effect (an inhibitory effect) are separable and can coexist.

In order to confirm this hypothesis, another model was constructed. This time, the

two calculations that yielded the highest R2 increases (the Auditory calculation and the Segments + Pitch calculation) were included as neighborhood density factors. If both effects were real, they

should be also significant in a model where they are both included as separate factors.

The model accounted for 22.5% of the variance (F(64, 19599) = 88.919995, p < 0.0001; R2 = 0.225026), both types of neighborhood density were significant, and the directions of the effects did not change: facilitation from the Segments + Pitch calculation (F = 36.9340, p < 0.0001) and inhibition from the Auditory calculation (F = 13.7924, p < 0.0005). A combination of both calculations yielded the highest increase in R2 (0.002043) relative to the basic model, such that this might be the best model possible.

Unlike the previous two experiments, two neighborhood calculations can coexist in the

model.

Table 5.6 is a summary of the regression model for the semantic categorization

data. The model with a combination of the Segments + Pitch calculation and the

Auditory calculation is shown here.

Participants                √***
Initial sound class         √***
Semantic Category           √***
UP                          Inhibition***
Duration                    Inhibition***
1st Mora Frequency          Inhibition***
Word Frequency              Facilitation***
Segments + Pitch            Facilitation***
Auditory                    Inhibition***

(***p < 0.001)

Table 5.6: A summary of the regression model with two types of neighborhood density (facilitative and inhibitory), Experiment 3.

There are at least two interpretations for the fact that two neighborhood density

effects are observed in the data. The first interpretation is that both effects are valid for

all participants. The other interpretation is that one neighborhood density effect is

operative for some listeners and the other density measurement correlates with RT for

other listeners. In other words, if we grouped the participants as fast responders and slow

responders as in previous experiments, neighborhood density facilitation might be

observed among fast responders only, whereas neighborhood density inhibition might be

observed among slow responders only. In order to test this hypothesis, participants were

grouped by the median of participants’ mean categorization times. Fifteen participants

each were grouped as “fast responders” and as “slow responders,” and the data of fast

responders and slow responders were then reanalyzed separately.
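The median split can be sketched as follows, again with hypothetical file and column names.

    import pandas as pd

    data = pd.read_csv("experiment3_rts_trimmed.csv")   # hypothetical file name

    # Median split on each participant's mean categorization time: fifteen
    # "fast responders" and fifteen "slow responders".
    subj_means = data.groupby("participant")["rt"].mean()
    fast_ids = subj_means[subj_means <= subj_means.median()].index
    data["responder_group"] = data["participant"].isin(fast_ids).map(
        {True: "fast", False: "slow"})

    fast = data[data["responder_group"] == "fast"]
    slow = data[data["responder_group"] == "slow"]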

The basic model for fast responders consisted of 7 factors as in the previous basic

model. All seven factors were effective. The model accounted for 11.18% of the

variance (F(47, 9922) = 26.586417, p < 0.0001; R2 = 0.111852).

Next, three additional models were constructed by adding to the basic model each

of the three different neighborhood calculations for fast responders. Table 5.7 shows the

results of the four models for fast responders. As in the overall analysis above, the

Segments + Pitch calculation and the Auditory calculation are both significant3, and have

opposite effects on semantic categorization reaction times.

3 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood) for the Segments + Pitch calculation and the Auditory calculation. The analyses did not show any significant effects for either calculation.

Models              R2 accounted for   R2 accounted for by        Direction of the
                    by the model       the neighborhood effect    neighborhood effect

Basic               0.111852           NA                         NA
Basic + Seg         0.113391           0.001539***                Facilitation
Basic + Seg&Pitch   0.114240           0.002388***                Facilitation
Basic + Auditory    0.112253           0.000401*                  Inhibition

(*p < 0.05, ***p < 0.001)

Table 5.7: Categorization data of fast responders, Experiment 3.

Another model was built using the 7 factors used in the basic model and 2

neighborhood density calculations (the Segments + Pitch calculation and the Auditory

calculation). All factors contributed to the model significantly (F(49, 9922) = 26.203629, p < 0.0001; R2 = 0.114600). The Segments + Pitch calculation (F = 26.2982, p < 0.0001) and the Auditory calculation (F = 4.0409, p = 0.0444) still showed facilitation and inhibition, respectively.

Similarly, the model for the slow responders was also built. All seven factors

were used to construct the basic model, which accounted for 17.37% of the RT variance (F(47, 9696) = 43.153447, p < 0.00001; R2 = 0.173734).

Next, three additional models were constructed by adding to the basic model each

of the three different neighborhood calculations for slow responders. Table 5.8 shows the

results of the four models for slow responders. As in the overall analysis above, the

Segments + Pitch calculation and the Auditory calculation are both significant4.

Models              R2 accounted for   R2 accounted for by        Direction of the
                    by the model       the neighborhood effect    neighborhood effect

Basic               0.173734           NA                         NA
Basic + Seg         0.174385           0.000651**                 Facilitation
Basic + Seg&Pitch   0.174850           0.001116***                Facilitation
Basic + Auditory    0.174668           0.000934**                 Inhibition

(**p < 0.01, ***p < 0.001)

Table 5.8: Models of the semantic categorization data for slow responders, Experiment 3.

Another model was built using the 7 factors of the basic model and 2 neighborhood calculations (the Segments + Pitch calculation and the Auditory calculation). All factors contributed to the model significantly (F(49, 9644) = 41.964509, p < 0.00001; R2 = 0.175745). The Segments + Pitch calculation and the Auditory

4 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood) for the Segments + Pitch calculation and the Auditory calculation. The analyses did not show any significant effects for either calculation.

calculation still showed facilitation (F = 12.5943, p = 0.0004) and inhibition (F = 10.4670,

p = 0.0012), respectively.

Table 5.9 shows a summary of the regression models for fast responders and slow

responders. The model including both the Segments + Pitch calculation and the Auditory calculation was chosen for both fast responders and slow responders.

                            Fast Responders     Slow Responders

Participants                √                   √
Initial sound class         √                   √
Semantic category           √                   √
UP                          Inhibition          Inhibition
Duration                    Inhibition          Inhibition
1st Mora Frequency          Inhibition          Inhibition
Word Frequency              Facilitation        Facilitation
Segments + Pitch            Facilitation        Facilitation
Auditory                    Inhibition          Inhibition

Table 5.9: A summary of the regression models with two types of neighborhood density (facilitative and inhibitory) for fast responders and slow responders, Experiment 3.

Looking at the fast responders’ data first, as Table 5.7 showed, the Segments +

Pitch calculation yielded the highest R2, followed by the Segments calculation. The

Auditory calculation was also significant, although the model with the Segments + Pitch

calculation and the Auditory calculation showed that its effect was a marginal effect. In

this sense, the non-auditory calculations were better than the Auditory calculation. On

the other hand, as shown in Table 5.8, the slow responders' data indicate that the Auditory calculation yielded a higher R2 increase for slow responders than for fast responders. The

effect from the Segments calculation that was significant in the fast responders’ data

disappeared in the slow responders’ data. This means that neighborhood density

facilitation was stronger than neighborhood density inhibition among the fast responders

but the opposite direction of the effects was observed among the slow responders. In

other words, the neighborhood inhibitory effect was dominant among slow responders.

This suggests that lexical competition occurs with the auditory representation, and the

magnitude of this inhibitory neighborhood density effect is stronger for slow responders

than for fast responders. As Table 5.9 shows, two types of neighborhood density

(facilitative and inhibitory) coexist even after two groups were separated. This may

indicate that both effects were present for all the participants.

The above regression analyses of the semantic categorization data for fast

responders and slow responders demonstrated that they performed very similarly in terms

of accuracy of target word recognition. Seven factors contributed to the semantic

categorization times for fast responders and slow responders ( Participants , Initial Sound

Class, Semantic category. Uniqueness point, I'1 Mora Frequency, Duration, and Word

Frequency).

Initial Sound Classes of the target words also affected semantic categorization

times. For the fast responders, stops were categorized more quickly than fricatives and

nasals. Nasals were categorized more quickly than fricatives (F(2, 9967) = 213.752, p < 0.0001). Slow responders also showed the same tendency (F(2, 9691) = 155.103, p <

0.0001). Tukey HSD multiple comparisons showed that all comparisons of sound classes

were significantly different for both fast responders and slow responders at p < 0.0001.

This pattern was also observed in the previous two auditory naming experiments.

Therefore, the participants in this semantic categorization experiment were also sensitive

to the duration of the word-initial sound.

Duration was also effective in this experiment: longer words were categorized

more slowly than shorter words.

The 1st mora frequency was also effective in the categorization times. The data demonstrated that words with a higher 1st mora frequency were categorized less quickly than words with a lower 1st mora frequency. A sound or sequence of sounds (in this

experiment, the mora) could induce either inhibition or facilitation. For example, if

listeners take advantage of effects of probabilistic phonotactics, the neighborhood density

effect is facilitation: words beginning with high first mora frequency are categorized more

quickly than those with low first mora frequency. On the other hand, if they do not take

advantage of probabilistic phonotactics, the neighborhood density effect shows an

opposite direction, because of word competition effects: words with high first mora

frequency would be categorized less quickly and accurately than those with low first mora

frequency. This significance of first mora frequency thus appeared as a consequence of word competition.

Since first mora frequency in this experiment caused inhibition, word competition may

have occurred while participants performed the task.

A facilitative Word frequency effect was observed in the semantic categorization experiment, as in the auditory naming experiments: frequent words were categorized more quickly than infrequent words.

Uniqueness point was a significant effect in this experiment. This effect was

reported in English (e.g., Marslen-Wilson, 1984; Luce, Pisoni, & Manous, 1984; Tyler & Wessels, 1983). However, the effect of the uniqueness point has not been reported in

experiments in Japanese (Amano and Kondo, 2001). The results of this experiment

certainly showed the effect of the uniqueness point. Figure 5.1 shows a graph of the

categorization times as a function of the number of the segments from the word-initial

point for a target word to become unique from the words in the lexicon.

The solid line and the dashed line plot the results of fast responders and slow

responders, respectively. Figure 5.1 shows that words with an earlier uniqueness point

were categorized more quickly than words with a later uniqueness point (F(4, 9965) =

7.33027715, p < 0.0001) for fast responders and (F(4, 9689) = 7.71918148, p < 0.0001)

for slow responders. Tukey HSD multiple comparisons showed that words that become

unique at the third segment were categorized more quickly than words that become

unique at the 6th or at the 7th segment among fast responders. Also words that become

unique at the 5th segment were categorized more quickly than words that become unique at the 7th segment. For slow responders, words that become unique at the 3rd segment

were always categorized more quickly than words that become unique at later than this

segment. Also words that become unique at the 5th segment were categorized more

quickly than words that become unique at the 6th segment or at the 7th segment.

Uniqueness point affected the semantic categorization times in Japanese.

[Figure 5.1 appears here: mean categorization times in ms (y-axis, roughly 650-850 ms) plotted against uniqueness point (UP, x-axis), with a solid line for fast responders and a dashed line for slow responders.]

Figure 5.1: The categorization times as a function of the number of the segments from the word-initial point for a given word to be unique from the words in the lexicon (UP).

Computational analyses in Yoneyama (2000) showed that the uniqueness point

seems to not be so effective because UP does not occur before the offset of many words

in the Japanese lexicon. This pattern was also confirmed with the analysis of the 700

target words. Over 55% of the target words do not have UP before the offset. However,

the reaction time data in this experiment showed that some words are distinguished

earlier than others in terms of the uniqueness point.
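For illustration, a uniqueness point of this kind can be computed with a simple prefix search over the lexicon; the sketch below treats each character of a romanized toy lexicon as a segment, whereas the dissertation's calculation was based on phonemic transcriptions of the NTT database lexicon.

    def uniqueness_point(word, lexicon):
        """Number of segments from word onset at which the word becomes unique.

        Returns None when the word never becomes unique before its offset, which
        is reported above to happen for over half of the 700 target words.
        """
        others = [w for w in lexicon if w != word]
        for i in range(1, len(word) + 1):
            prefix = word[:i]
            if not any(w.startswith(prefix) for w in others):
                return i
        return None

    # Toy romanized lexicon for illustration only.
    toy_lexicon = ["kakudo", "kakuu", "nami", "namida", "sarada"]
    for w in toy_lexicon:
        print(w, uniqueness_point(w, toy_lexicon))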

5.4. Discussion

The main purpose of the categorization experiment was to test whether an

inhibitory neighborhood density effect would be observed among the categorization time

data. Vitevitch and Luce (1999) explained that if listeners perform a semantic

categorization task, the neighborhood density effect would show inhibition: words in a

dense neighborhood should be categorized less quickly than words from a sparse

neighborhood. This is based on the fact that neighborhood density is a measure of lexical

competition.

However, the response patterns of the participants were more complicated than

expected. The semantic categorization data showed that more than one neighborhood

density calculation was statistically significant. Further, two types of neighborhood

density (facilitative and inhibitory) coexist in the model.

Recall from the auditory naming experiments, that a neighborhood density effect

can be realized either as a facilitative effect or as an inhibitory effect. The current

experiment shows that the neighborhood density effect gradually changes from

facilitation to inhibition in the timecourse of word recognition processing. If only the

calculations that yielded the highest R2 for these semantic categorization data are

considered, a comparison between fast responders and slow responders may also

demonstrate a transition from facilitation to inhibition. For fast responders, the strongest

neighborhood density effect is calculated by the Segments + Pitch calculation followed by

the Segments calculation. The Auditory calculation that shows inhibition is the least

explanatory calculation of the three. If we assume that R2 reflects the magnitude of the effects

among the three calculations, fast responders performed their task at an earlier stage of

the word recognition process because the Segments + Pitch calculation that yielded the

highest R2 showed a facilitation effect. However, a neighborhood density effect from the lexicon begins to affect their performance. Slow responders, on the other hand, naturally

performed their task at a later stage. The results suggest that they seemed to rely more on

the Auditory calculation yielding the second highest R2 and showing inhibition. This may

indicate that the highest explanatory R2 has gradually moved from the Segments + Pitch

calculation to the Auditory calculation during the timecourse of processing. Also, the

effect from the Segments calculation has gradually disappeared. The semantic

categorization data suggest a gradual transition of the neighborhood density effect from

facilitation to inhibition.
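To make this kind of comparison concrete, the computation implied here can be sketched as follows. This is not the dissertation's analysis code; the predictors and response times below are random placeholders, and the function name r_squared is hypothetical, but the sketch shows how the increase in R² contributed by each neighborhood density calculation over a baseline regression could be compared.

    # Illustrative sketch: comparing the increase in R-squared obtained by adding
    # each neighborhood density measure to a baseline regression of
    # categorization times on other factors (ordinary least squares via numpy).
    import numpy as np

    def r_squared(X, y):
        X1 = np.column_stack([np.ones(len(y)), X])     # add intercept
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        resid = y - X1 @ beta
        return 1 - resid.var() / y.var()

    rng = np.random.default_rng(0)
    n = 700
    baseline = rng.normal(size=(n, 3))                 # e.g., duration, frequency, UP (placeholders)
    densities = {"Segments": rng.normal(size=n),
                 "Segments + Pitch": rng.normal(size=n),
                 "Auditory": rng.normal(size=n)}
    rt = rng.normal(size=n)                            # categorization times (placeholder data)

    r2_base = r_squared(baseline, rt)
    for name, d in densities.items():
        gain = r_squared(np.column_stack([baseline, d]), rt) - r2_base
        print(f"{name}: increase in R-squared = {gain:.6f}")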

In summary, the results showed that neighborhood density effects occurred in the

semantic categorization experiment. There are three main findings. First, as observed in

the previous experiments, neighborhood density effects change during the timecourse of the

word recognition processes. The neighborhood effect from the Segments + Pitch

calculation was stronger than the neighborhood inhibitory effect from the Auditory

calculation for fast responders whereas the latter was stronger than the former for slow

responders.

Second, categorization times were negatively correlated with neighborhood

density in the Segments calculation and the Segments + Pitch calculation: words from a

dense neighborhood were categorized MORE quickly than words from a sparse

neighborhood. However, categorization times were positively correlated with

neighborhood density in the Auditory calculation: words from a dense neighborhood were

categorized LESS quickly than words from a sparse neighborhood. Two types of

neighborhood density effects from the Segments + Pitch calculation and the Auditory

calculation were both effective for fast responders and slow responders.

Finally, two types of neighborhood density effects (facilitative and inhibitory)

coexist. A neighborhood facilitative effect is generally interpreted as an effect of

probabilistic phonotactics (Vitevitch & Luce, 1999). A neighborhood inhibitory effect is

generally interpreted as word competition (Luce & Pisoni, 1998; Vitevitch & Luce,

1999). As Luce and Large (2000) have claimed, both effects of probabilistic phonotactics

and neighborhood competition can be observed simultaneously. The results of this

experiment confirmed this claim in Japanese. Of course, since the neighborhood density

effect is a measure of lexical competition in current word recognition models, these

results provide a piece of evidence that lexical competition is also attested in Japanese.

CHAPTER 6

GENERAL DISCUSSION AND CONCLUSION

6.1. Introduction

This dissertation investigated two aspects of spoken word recognition: lexical

representation and lexical competition. Three experiments were conducted using the

same 700 Japanese target words, in an attempt to directly test the neighborhood density

effects in experiments with different tasks.

§6.2 provides a summary of the experimental results. Section 6.3 is a proposal for

a model of spoken-word recognition and word production which can account for these

results. Finally, conclusions are provided in §6.4.

6.2. Summary of Results

This section summarizes the results of the three experiments reported in this

dissertation. The discussion of factors other than neighborhood density is presented

in §6.2.1, followed by the findings relating to neighborhood density in §6.2.2.

6.2.1. Other Effects

Many factors other than neighborhood density were included in the analyses of the

experiments in this dissertation on the assumption that they affect processing time as well

as accuracy. Several of these factors were significant and revealed some interesting

tendencies. This section discusses the influences of factors other than neighborhood

density in spoken word recognition.

Table 6.1 shows the effects included in the analyses of the experiments. All of the

factors shown here had a significant effect on processing times in at least one of the

experiments. This supports the claim made in the segmentation studies that listeners

exploit any phonological information to assist them in lexical access.

[Table 6.1 appears here as a two-panel table (processing time data for Experiments 1–3; word identification data for Experiment 2). For the fast and slow groups, it marks the effects of Initial sound class, 1st Mora frequency, Word frequency, Semantic category, Word class, and Duration; the individual cell entries are not legible in this reproduction.]

Table 6.1: Summaries of effects other than neighborhood density in three experiments. Processing time data (top) and accuracy data (bottom). (F = Facilitative effect, I = Inhibitory effect, H = Higher accuracy, L = Lower accuracy, NA = not applicable.)

Another interesting point is that the factors that are used for lexical access do not

necessarily affect accuracy data in the same way. For example, Duration is always

negatively related to processing time, whereas it does not affect accuracy. Word

frequency shows that even if a factor affects processing time, it does not necessarily mean

that it affects accuracy. In Experiment 2 (auditory naming in noise), Word frequency

affected accuracy whereas it did not affect processing time. On the other hand, in a noise-

free environment, Word frequency contributed to processing time, but not to accuracy.

Also, even if a factor contributes to both processing time and accuracy, it does not

necessarily do so in the same way. For example, the Initial sound class factor revealed an

effect of durational difference among sound classes (stops, nasals, and fricatives) in the

processing data of Experiment 1 (auditory naming), whereas it showed an effect of

sonority difference among sounds in the word identification data of Experiment 2

(auditory naming in a noisy condition). Although this is evidence that listeners are

sensitive to the Initial sound class of words as one of the types of phonological

information exploited for lexical access, the data also showed that listeners exploit

different characteristics of the Initial sound class.

One last thing to mention is the Uniqueness point. According to Marslen-

Wilson’s (1987; Marslen-Wilson & Tyler, 1980; Marslen-Wilson & Welsh, 1978) cohort

theory, the initial acoustic-phonetic information of a word presented to the listener

activates a “cohort” of lexical candidates that share word-initial acoustic-phonetic

information. Lexical candidates that are incompatible with ensuing top-down and

bottom-up information are successively eliminated from the cohort until only one word

remains, at which time that word is recognized. Because a word may be recognized when

the initial acoustic-phonetic information of the word is compatible with no other words in

the cohort, word recognition within the framework of cohort theory is said to be

“optimally efficient” in the sense that a listener may recognize a word prior to hearing the

entire word. Thus, a crucial concept in the cohort theory of word recognition is that of

the uniqueness point or optimal discrimination point. The uniqueness point is that point,

measured from the beginning of the word, at which that word becomes distinct from all

other words in the lexicon. For isolated words, the uniqueness point defines the earliest

point, theoretically, at which a word can be recognized, although a word may be

recognized prior to its uniqueness point given sufficiently constraining contextual

information.
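As a concrete illustration of this definition, the following minimal sketch (in Python, over a toy lexicon of romanized forms; the function name uniqueness_point and the example words are illustrative assumptions, not the experimental materials) computes the segment position at which a word diverges from all other entries in a lexicon.

    # Illustrative sketch: the 1-based segment position at which a word becomes
    # unique in a toy lexicon of phoneme strings, or None if it never diverges
    # before (or at) its offset.
    def uniqueness_point(word, lexicon):
        others = [w for w in lexicon if w != word]
        for i in range(1, len(word) + 1):
            if not any(w.startswith(word[:i]) for w in others):
                return i      # unique after i segments
        return None           # still ambiguous at the word's offset

    toy_lexicon = ["kodomo", "kodoku", "kokoro", "koko"]
    print(uniqueness_point("kodomo", toy_lexicon))   # -> 5
    print(uniqueness_point("koko", toy_lexicon))     # -> None (a prefix of kokoro)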

The role of the uniqueness point in the recognition of stimuli presented in

isolation has been demonstrated for nonwords in an auditory lexical-decision task

(Marslen-Wilson, 1984) and for specially selected words in a gating task (Luce, Pisoni, &

Manous, 1984; Tyler & Wessels, 1983). Thus, the concept of a uniqueness point seems

to be empirically justified.

To determine the extent to which words in isolation may be recognized prior to

their offsets, Luce (1986b) calculated uniqueness points, or optimal discrimination points,

for all words in a 20,000-word computerized lexicon with more than two phonemes. The

results of this analysis revealed that the frequency-weighted probability of a word’s

diverging from all other words in the lexicon prior to the last phoneme was only .39.

This finding suggests that an optimally efficient strategy of word recognition may be

severely limited.
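A frequency-weighted version of this kind of analysis can be sketched as follows. The particular weighting by token frequency shown here is an assumption about the general idea, not a claim about Luce's (1986b) exact procedure, and the toy lexicon and counts are invented for illustration.

    # Illustrative sketch: share of token frequency carried by words whose
    # uniqueness point falls before their final segment.
    def uniqueness_point(word, lexicon):
        # (repeated from the earlier sketch so this block is self-contained)
        others = [w for w in lexicon if w != word]
        for i in range(1, len(word) + 1):
            if not any(w.startswith(word[:i]) for w in others):
                return i
        return None

    def early_up_probability(freqs):
        words = list(freqs)
        early = sum(f for w, f in freqs.items()
                    if (up := uniqueness_point(w, words)) is not None and up < len(w))
        return early / sum(freqs.values())

    print(early_up_probability({"kodomo": 50, "kodoku": 10, "kokoro": 30, "koko": 10}))  # -> 0.9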

Furthermore, Yoneyama (2000) conducted similar uniqueness-point analyses in

Japanese and found that the probability that a word would be distinguished from all other

words in the lexicon prior to the last phoneme was .49 even if its phonological

representation included the pitch accent patterns that help to distinguish words in

Japanese. Therefore, Yoneyama concluded that the concept of the uniqueness point

would also be of limited effectiveness in Japanese. A similar conclusion was drawn in

Amano and Kondo (2000).

However, the results of Experiment 3 (Semantic categorization experiment) show

that the Uniqueness point contributes to processing time: words that become unique

earlier were categorized more quickly than words that become unique later. In Cohort

theory, words are classified into three groups: words that become unique before the last

segment (before), words that become unique at the last segment (at) and words that

become unique after the last segment (after). The results of Experiment 3 revealed that

there was no categorical difference among these categories. Rather, target words (all of

which were six segments long) that become unique at the third segment were categorized

more quickly and accurately than target words that become unique at a later segment.

Hence, the data of the Uniqueness point studies show that the uniqueness point does have

an influence on word recognition. Yoneyama’s (2000) finding is right in the sense that

the uniqueness point can only provide a limited word recognition strategy because less

than half of the lexical items have a uniqueness point. However, among these items,

some have an earlier uniqueness point than others - a difference which processes of

spoken word recognition exploit.

The uniqueness point is also considered as a measure of lexical competition. The

significance of this effect leads to the conclusion that all words are aligned at their left

edges. This is something we may need to consider when we develop spoken-word

recognition models.

6.2.2. Neighborhood Density Effects

6.2.2.1. Processing Time Data

Table 6.2 shows a summary of the neighborhood density effect in the processing

times of three experiments. The calculations were ordered in this way in order to fulfill

the requirement that the data of the fast group precede the data of the slow group in all

experiments. In this way, the order of the calculations may reflect the timecourse of word

representations, since each calculation reflects the word representation at different points

of processing.

Exp     Group (mean RT)   Segments                       Segments + Pitch               Auditory

Exp 1   Fast (449 ms)     Facilitation                   Facilitation
        Slow (624 ms)     (Facilitative)                 Facilitation                   Facilitation

Exp 2   Fast (670 ms¹)    (Facilitative)
        Slow (810 ms)                                                                   Facilitation

Exp 3   Fast (738 ms)     Facilitation (R² = 0.001539)   Facilitation (R² = 0.002388)   Inhibition (R² = 0.000401)
        Slow (815 ms)     Facilitation (R² = 0.000651)   Facilitation (R² = 0.001106)   Inhibition (R² = 0.000934)

Table 6.2: Summary of the neighborhood density effects on processing times in three experiments. Effects in bold show the calculation that yielded the highest increase in R².

The table reveals a few general tendencies. In auditory naming experiments

(Experiments 1 and 2), all of the participants (in both the fast and slow groups)

consistently showed neighborhood density facilitation. Fast namers primarily used the

Segments calculation, while slow namers primarily used the Auditory calculation. The

results of the semantic categorization experiment (Experiment 3) showed that multiple

neighborhood definitions had an effect on processing times for both fast responders and

slow responders. Two types of neighborhood density effects were observed (facilitative

and inhibitory). Models with two types of neighborhood density calculations (the

Segments + Pitch calculation and the Auditory calculation) both had an effect for fast

responders and slow responders. However, the effect from the Auditory calculation was

stronger for slow responders than for fast responders, while the effect from the Segments

+ Pitch calculation was stronger for fast responders than for slow responders.

6.2.2.2. Word Identification Data

Because participants did not make many mistakes in accurately naming or

categorizing words in Experiments 1 and 3, only the accuracy data (word identification

data) from Experiment 2 are discussed here. Table 6.3 shows these accuracy data. Both

fast namers and slow namers identified words from dense neighborhoods less accurately

than words from sparse neighborhoods. This pattern was also observed in previous word

identification in noise experiments in English and Japanese (Luce & Pisoni, 1998; Amano

& Kondo, 1999). Therefore, neighborhood density affects accuracy.

¹ Mean reaction times for fast and slow namers in Experiment 2 are calculated from the onset of the target words embedded in noise, not from the onset of the sound files.

Groups   Segments         Segments + Pitch   Auditory

Fast     Lower accuracy   Lower accuracy

Slow     Lower accuracy   Lower accuracy

Table 6.3: Summary of the neighborhood density effect in the word identification data of Experiment 2.

6.3. Proposal: A Model of Spoken-Word Recognition and Word Production

The purpose of this dissertation was to investigate two aspects of lexical access in

Japanese auditory word recognition. As shown in §6.2, the factors used in the three

experiments, including neighborhood density, showed curious patterns of facilitation and

inhibition that need to be explained. The neighborhood density effects that were

significant in these experiments were based on different representations: the Segments calculation and

the Segments + Pitch calculation were based on symbolic representations whereas the

Auditory calculation was based on an auditory representation. Furthermore, the results of

Experiment 3 showed that the Auditory calculation, which showed a neighborhood

inhibitory effect, was based on an auditory representation that does not have an internal

structure. Since neighborhood density calculations that were based on different

representations (symbolic and auditory) both had significant effects on patterns in word

recognition, it is necessary to assume that both representations should be stored in the

lexicon and are involved in auditory word recognition. However, most current

recognition models only assume one of the two representations. In order to explain the

patterns observed in this dissertation, a model is proposed that is inspired by Plaut and

Kello (1999) and Jusczyk (1993) who proposed their models based on infants’ behaviors

(or simulations of their behaviors) for speech comprehension and production. Plaut and

Kello (1999) and Jusczyk (1993) both base their models on the same assumption:

symbolic representations develop out of a need for articulation.

Generally, adult word recognition models do not consider the processes that are

necessary for speech production. However, Experiments 1 and 2 in this dissertation were

auditory naming experiments in which both word comprehension and word production

had to be used in order to perform the task. An explanation of the results of these

experiments therefore requires a model that explains auditory naming experiments

(production task) and semantic categorization experiment (semantic decision task) at the

same time.

Before moving on to this proposal, it will be helpful to review the basic features

of Plaut and Kello’s (1999) model.

6.3.1. Plaut & Kello (1999)

Plaut & Kello (1999) have proposed that phonology emerges from the interplay of

speech comprehension and production. In this view, phonology is an intermediate stage

that connects acoustics, articulation, and semantics. The model is based on connectionist parallel

distributed processing (PDP) principles, in which different types of information are

represented as patterns of activity over separate groups of similar, neuron-like processing

units. This section describes the basic features of their model.

Plaut & Kello’s model is shown in Figure 6.1. In their model, phonological

representations play a central role in mediating acoustic, articulatory and semantic

representations. An important aspect of this model is that phonological representations

are not predefined, but are learned by the system under the pressure of understanding and

producing speech. Representations of segments (phonemes) and other structures (onset,

rime, syllable) are not built-in; rather, the relevant similarity between phonological

representations at multiple levels emerges gradually over the course of development.

The system lacks any explicit structure corresponding to words. Instead, the

lexical status of certain acoustic and articulatory sequences is reflected only in the nature

of the functional interactions between these inputs and other representations in the

system. Phonological representation is also symbolic in the system in order to store

temporal acoustic or articulatory information.

As shown in Figure 6.1, it is relatively straightforward to establish a relationship

between semantics and acoustic input, as well as between semantics and articulation,

because symbolic representation mediates the two spaces. However, the hardest part to

the model is how children learn the mapping between acoustic input and articulation.

From the perspective of control theory, Plaut & Kello have proposed that the mapping

from articulation to acoustics is what they call the forward mapping, whereas the reverse

is the inverse mapping. This forward model must be invertible in the sense that errors

observed in the acoustic input for a given articulation can be translated back into errors in

articulation. Plaut & Kello’s model implemented back-propagation within a

connectionist network.

In Perkell, Matthies, Svirsky, and Jordan (1995), a learned forward model plays a

critical role in providing the necessary feedback for learning speech production. Similarly,

in Plaut & Kello's model, the forward model is used to convert acoustic and phonological

feedback (i.e., whether an utterance sounded right) into articulatory feedback, which is

then used to improve the mapping from phonology to articulation (see Plaut & Kello,

1999, for details about their forward model).
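The way a learned forward model can turn acoustic feedback into articulatory feedback can be sketched numerically as follows. The tiny linear forward model and the gradient step are illustrative assumptions about the general mechanism, not a reproduction of Plaut and Kello's network.

    # Illustrative sketch: a learned forward model maps articulation to predicted
    # acoustics; the acoustic error is propagated back through it to obtain an
    # articulatory error signal, which is then used to correct the articulation.
    import numpy as np

    rng = np.random.default_rng(1)
    F = rng.normal(size=(4, 3))             # toy linear forward model: acoustics = F @ articulation

    articulation = rng.normal(size=3)       # current (imperfect) articulatory plan
    target_acoustics = rng.normal(size=4)   # how the utterance should have sounded

    predicted_acoustics = F @ articulation
    acoustic_error = predicted_acoustics - target_acoustics

    # Gradient of the squared acoustic error with respect to articulation:
    # the forward model converts the acoustic error into an articulatory one.
    articulatory_error = F.T @ acoustic_error

    articulation -= 0.1 * articulatory_error   # one corrective step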

[Figure 6.1 shows boxes for Symbolic Representations, Articulation, and Acoustic input, with an inverse mapping from acoustics to articulation and a forward model from articulation to acoustics.]

Figure 6.1: A model of speech comprehension and production by Plaut & Kello (1999).

6.3.2. A Model of Spoken-Word Recognition and Word Production

Our proposed model is a modified version of Plaut and Kello’s model based on

the assumption that the comprehension-production system children acquire in their first

years must underlie the system used by adults. For example, in word recognition, adult

listeners use a segmentation procedure that has been acquired in infancy (e.g., Cutler,

1997). Therefore, the adult comprehension-production mechanism should be based on

the same mechanisms that children have acquired through language comprehension and

production.

A modified version of Plaut and Kello’s model is shown in Figure 6.2. This

model assumes two representations of words: symbolic representations and auditory

representations (Auditory patterns). Auditory patterns do not have any internal structure

and are considered as more complete forms. On the other hand, symbolic representations

have an internal structure represented by linguistic units such as phonemes and are

considered assembled forms.

The necessity for two representations comes from the experimental results

showing that neighborhood density calculations that were based on two representations

(auditory and symbolic) both had significant effects on spoken word recognition. The

Auditory calculation that was based on auditory representations was the only calculation

that showed a neighborhood inhibitory effect. Moreover, auditory representations need to

be stored in the lexicon, since the neighborhood inhibitory effect is a measure of lexical

competition and only the Auditory calculation exhibited this effect on processing time.

The existence of auditory representations in the lexicon reflects other findings

with infants (Jusczyk, 1993) and adults (Goldinger, 1989, 1996, 1998; Johnson, 1997a, b;

Luce & Lyons, 1998; Pallier, Colome, & Sebastian-Galles, 2001). Jusczyk (1993) has

proposed in his model that exemplars are stored in the lexicon and are connected to

semantic representations. Jusczyk (1993) also mentioned the necessity of symbolic

representation for speech production. Plaut and Kello (1999) and Jusczyk (1993) have

both implied that symbolic representation is mainly established in order to connect the

acoustic input (comprehension) with articulation (production). An acoustic-articulation

mapping introduced the necessity of symbolic representation because acoustic input and

articulation do not have a one-to-one correspondence. Furthermore, adult word

recognition studies have shown that listeners seem to store episodic auditory exemplars in

the lexicon (e.g., Goldinger, 1998; Johnson, 1997). In this dissertation, all the words have

only one exemplar. However, the model allows for the possibility of storing

multiple exemplars for each word.

Our model includes two routes for adult language users to perform the auditory

naming task for real word stimuli. The first route is the one proposed in Plaut and Kello

(1999) for imitation. The system first derives acoustic and phonological representations

for an adult utterance during comprehension. It then uses the resulting phonological

representation as an input for generating a sequence of articulatory gestures. The other

route feeds the acoustic input to the Auditory patterns in order to activate exemplars

stored in this space. Then, a more complete sequence of gestures is executed for

articulation via symbolic representations.

[Figure 6.2 shows Semantics connected to two lexical representations, Symbolic Representations and Auditory patterns, which in turn connect to Articulation and Acoustic input.]

Figure 6.2: A model of spoken-word recognition and word production.

The motivation for proposing dual routes for articulation is also based on the

finding that different phonological forms (auditory patterns and symbolic representations)

affect picture naming times and word reading times by children. Barry, Hirsh, Johnston,

and Williams (2001) found that the estimated AoA (“Age of Acquisition”) of a word is a

better predictor of reaction times than lexical frequency in a speeded picture naming task

and a word reading task. They also found that words with early AoA had a smaller

repetition priming effect than words with later AoA when the same subjects were asked

to name pictures that they had either seen (read) before or not seen (read) before. They

interpret this interaction in terms of the phonological completeness hypothesis of AoA by

Brown and Watson (1987) in which early-acquired words have “a more complete

phonological representation” (p. 214), whereas later-acquired words, which are presumed

to be stored in a segmented fashion, require their stored phonology to be assembled for

production (which entails longer processing time).

In English auditory naming experiments, words are named more quickly than

nonwords. This finding is explained in this model by adopting Brown and Watson’s

phonological completeness hypothesis. Words may be early-acquired words and

nonwords may be possible candidates for later-acquired words. If Brown and Watson's

phonological completeness hypothesis is correct, words have “a more complete

phonological representation” than nonwords. An auditory representation of words is

considered “a more complete representation” here. Presumably, nonwords require using

symbolic representations to store temporal acoustic information, which implies that

symbolic representations for nonwords need to be assembled for production. Therefore

the naming time difference between words and nonwords reflects which phonological

representation (or which route in the model) is used for production.

This explanation is based on the assumption that participants use a whole-word

representation in order to perform the auditory naming task. Therefore, participants

retrieve a more completely stored phonological form for word stimuli. They are also

assumed to assemble a full representation for the articulation.

This model makes three assumptions. First, words have two representations

(auditory and symbolic). There are, therefore, two routes for performing the auditory

naming task. The second assumption is that temporal acoustic information may also be

stored as a symbolic representation. Symbolic representations are used to perform the

auditory naming task with nonwords. Lastly, the model can assume two different kids of

processing using symbolic representation. In the first case, participants have a

completely-assembled form before they pronounce the target. For example, if Japanese

participants say a CVCVCV target word, they plan all the gestures for articulation in

advance. This pattern is observed when participants take a route from auditory patterns

to symbolic representations. The completely-assembled form is also possible for

nonwords. The second case is that participants do not have a completely-assembled form

before they pronounce the target. In other words, participants name the target as soon as

acoustic information becomes available. In this case, symbolic representation is used to

hold the temporal acoustic information that is sent to articulation space. This routine

repeats itself until the end of the target word is reached. Both types of processing may be

possible for Japanese words. Japanese target words in our experiments were all

CVCVCV words. Therefore, Japanese participants could have pronounced the target

words by repeating three syllables/moras as soon as acoustic information becomes

available, rather than retrieving an entire phonological representation from the lexicon. In

this case, the naming times reflect how quickly participants start producing a

syllable/mora, and not a word.

This model also allows multiple routes for word recognition. The acoustic input

is transmitted to both a Symbolic representation and an Auditory representation, both of

which are also connected to Semantic information. Recent studies have shown that two

different representations (auditory and symbolic) need to be stored in the lexicon (Luce &

Lyons, 1998; Pallier et al., 2001).

6.3.3. The Current Findings in Terms of the Proposed Model

This section discusses how the model proposed in §6.3.2 can explain the data of

three experiments in this dissertation.

This explanation begins with the assumption that facilitative and inhibitory effects

are not mutually exclusive. That is, a facilitative effect on processing times is introduced

by matching the acoustic input (activation) onto the representation(s) stored in the

lexicon. On the other hand, an inhibitory effect on processing times happens when

language users attempt to select a word. In other words, if the experimental task does not

require selecting a word, an inhibition effect would never be observed while a facilitative

effect may surface.

6.3.3.1. Experiment 1: Auditory Naming

The purpose of the auditory naming task is to repeat the sequence of sounds as

quickly as possible. In order to perform this task, participants have to map the acoustic

information to a representation and to articulate the representation. Luce and Pisoni

(1998) have reported that the naming data showed a neighborhood inhibitory effect while

English participants performed the task with real word stimuli, suggesting that word

competition happened. Luce and Pisoni have reasoned that English participants need to

select a word in order to retrieve a phonological form for pronunciation.

The question here is whether word selection is a requirement in order to

perform the task. The auditory naming task has been used for nonword targets. Of

course, nonwords are not stored in the lexicon, which means that participants need to use

a symbolic representation that is used to hold the acoustic input in order to produce the

nonword stimuli. Therefore, word selection itself is not a requirement to perform the

task. This indicates that there is a possibility that participants can perform the task for

real words without word selection. In other words, words might be pronounced like

nonwords without retrieving stored phonological representation.

The results of Experiment 1 showed that fast namers started naming words before

their offsets. Even slow namers showed a tendency to start naming words after

the offsets. Furthermore, their naming times were faster than the mean naming time for

American-English participants (Luce & Pisoni, 1998; Vitevitch & Luce, 1999). The

neighborhood density effects for fast namers and slow namers were all facilitative. Since

a neighborhood inhibitory effect is a measure of word competition effect, the results

indicate that words were not competing while these participants performed the task.

Neighborhood density calculations compare the similarity of words. However, the

participants in this study did not hear the entire words before they repeated them. This

means that, even though the number of neighbors was calculated in the Segments and the

Segments + Pitch calculations, these measures were not showing neighborhood density.

What did they show, then?

The different neighborhood density calculations show three types of information

regarding lexical access: (1) the kinds of phonological information that are used in lexical

access (segments and pitch accents), (2) how similarity is calculated (unit-by-unit or

whole-word) and (3) approximately when participants had access to such information.

Based on this information, these results show that Japanese participants did not retrieve a

more complete phonological form from the lexicon in order to perform the task. Rather,

the naming times obtained in this task reflect how long the Japanese participants took

in order to produce the first syllable/mora.

Table 6.4 reviews the characteristics of the three neighborhood density

calculations. Three calculations were made based on two factors: phonological

information and domain of similarity. Therefore, the effects of different neighborhood

density calculations may have shown which types of phonological information are used

for lexical access (segments and pitch accents) and how word similarity is calculated

(unit-by-unit or whole-word).

                                        Neighborhood Density Calculations
                                        Segments    Segments + Pitch    Auditory

Phonological information  Segments      yes         yes                 yes
                          Pitch accent  no          yes                 yes

Domain of similarity                    unit        unit                whole-word

Table 6.4: Characteristics of three neighborhood density calculations.
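The difference between the two unit-based calculations can be sketched as follows. The one-segment substitution/addition/deletion rule, the toy lexicon, and the invented pitch accent labels are illustrative assumptions rather than the dissertation's exact definitions; the whole-word Auditory calculation, which rests on graded auditory similarity rather than symbolic units, is not sketched here.

    # Illustrative sketch: a "Segments" neighbor differs from the target by one
    # segment (substitution, addition, or deletion); a "Segments + Pitch"
    # neighbor must additionally carry the same pitch accent pattern.
    def one_segment_apart(a, b):
        if a == b:
            return False
        if len(a) == len(b):                          # substitution
            return sum(x != y for x, y in zip(a, b)) == 1
        if abs(len(a) - len(b)) == 1:                 # addition / deletion
            shorter, longer = sorted((a, b), key=len)
            return any(longer[:i] + longer[i + 1:] == shorter
                       for i in range(len(longer)))
        return False

    def neighbors(target, lexicon, use_pitch=False):
        segs, accent = target
        return [w for w, acc in lexicon
                if one_segment_apart(segs, w) and (not use_pitch or acc == accent)]

    # Toy lexicon of (romanized segments, hypothetical pitch accent pattern).
    lex = [("kodomo", "LHH"), ("kotomo", "LHH"), ("kodomi", "HLL"), ("odomo", "LHH")]
    print(neighbors(("kodomo", "LHH"), lex))                  # Segments neighbors
    print(neighbors(("kodomo", "LHH"), lex, use_pitch=True))  # Segments + Pitch neighbors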

The last kind of information is explained by the relationship between

neighborhood calculations and the timecourse of lexical access. The results of

Experiment 1 showed that the Segments calculation and the Segments + Pitch calculation

played a role in the responses of fast namers while the Segments + Pitch and the Auditory

calculations had an effect on the responses of slow namers. Moreover, fast namers

started naming before the offset of words. This means that the Segments calculation was

used first and the Auditory calculation was used last in the timecourse of lexical access

processes.

Participants in Japanese word recognition experiments have been reported to be

sensitive to pitch information in word recognition (Cutler & Otake, 1999), which suggests

that segmental information and word-prosodic information are used in word recognition.

When might these participants not be sensitive to word-prosodic pitch information?

Table 6.5 shows relations between three different calculations and the stages of mapping

the acoustic input to word representations.

Acoustic input     Neighborhood Calculation

/ko⁰/              Segments

/ko⁰do¹/           Segments + Pitch

/ko⁰do¹mo¹/        Auditory

Table 6.5: Relationships between the acoustic input and neighborhood density calculation.

The time within the first mora of the target words is the only period when pitch accent

information is not so clear². At this point, segmental information is the only reliable

information participants can use to produce the word. This means that participants may

start naming words after they process the first mora of the target words. However, by the

time they hear the second mora, pitch accent information is clearly available to

participants. When they hear the first two moras, they know exactly whether the word

begins with HL or LH pitch accent patterns for CVCVCV target words. Therefore, the

Segments + Pitch calculation becomes the best predictor for naming times. Once

participants hear the entire word, the Auditory calculation becomes the best predictor

because now participants are able to retrieve a more complete sequence of articulatory

gestures.
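One way to picture this progressive narrowing is sketched below. The toy lexicon, the mora segmentation, and the L/H pitch labels are invented for illustration, and the mechanism is an assumption about the general idea rather than the model's implementation.

    # Illustrative sketch: as successive moras arrive, the candidate set is
    # constrained first by segments alone, then by segments plus pitch accent,
    # and finally by the whole word.
    lexicon = [
        (("ko", "do", "mo"), "LHH"),   # kodomo
        (("ko", "do", "ku"), "LHH"),   # kodoku
        (("ko", "ko", "ro"), "LHL"),   # kokoro
        (("ka", "ra", "da"), "LHH"),   # karada
    ]

    def candidates(heard_moras, heard_pitch=None):
        n = len(heard_moras)
        out = []
        for moras, pitch in lexicon:
            if moras[:n] != tuple(heard_moras):
                continue                               # segmental mismatch
            if heard_pitch is not None and pitch[:n] != heard_pitch[:n]:
                continue                               # pitch accent mismatch
            out.append("".join(moras))
        return out

    print(candidates(["ko"]))                                  # first mora: segments only
    print(candidates(["ko", "do"], heard_pitch="LH"))          # two moras: segments + pitch
    print(candidates(["ko", "do", "mo"], heard_pitch="LHH"))   # whole word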

In sum, Japanese participants start naming as soon as they process the first mora

of the target words. Different neighborhood density calculations show the timecourse of

the auditory word recognition system. Next, the means of realizing different stages of

lexical access in this model is discussed.

Fast namers were able to start naming the target words just based on the

segmental information in the first mora. They also used pitch accent information when it

² Cutler & Otake (1999) reported that Japanese listeners can perceive whether the target words begin with a high pitch or a low pitch even if they only hear the first mora in their gating task. However, evidence from two different sources suggests that Japanese listeners do not fully exploit the benefit of pitch accent patterns at this stage. First, Cutler & Otake’s study also showed that listeners’ performance accuracy is much higher when they hear fragments including two morae. Second, the results of similarity judgments on pitch accent patterns in Chapter 2 suggest that Japanese listeners were not able to determine pitch accent patterns for monomoraic targets accurately.

becomes available. Figure 6.3 shows participants’ performance in Experiment 1

(Auditory naming) in which they started naming the words after they had heard only part

of the word by exploiting segmental information only.

participants started naming words right after they processed the first mora, they had to

continuously process the acoustic information in order to articulate the three-mora target

words.

When the participants hear the first mora, all the words beginning with /ko/ would

be activated. At this moment, pitch accent cannot be considered, so the acoustic

information is matched with many words in the lexicon. Then, /ko/ is converted into a

sequence of gestures to produce the first mora. The naming times were recorded at the

beginning of the first mora. In order to complete the task, participants needed to repeat

the same routine until the end of the target word. In other words, participants needed to

assemble a sequence of gestures in terms of moras.

[Figure 6.3 shows the acoustic input activating, in Symbolic Representations, the many words that share the first mora /ko/ (e.g., /kodoku/, /kokoro/, /kokudo/, /kodomo/, /koiN/), which then feed Articulation.]

Figure 6.3: Participants’ performance in Experiment 1 (Auditory naming) in which they started naming the words after they had heard only part of the word by exploiting only segmental information.

Next, consider the case where the Segments + Pitch calculation had a significant effect. Figure

6.4 shows participants’ performance in Experiment 1 (Auditory naming) in which the

participants started naming words after they had partially heard the word by exploiting

segmental and word-level prosodic (pitch accent patterns) information. Since

participants were able to exploit both kinds of information (unlike in Figure 6.3), the

words that have the same first two moras with a LH pitch accent pattern were mapped

with the acoustic input (probably 2 moras long). As the participants started naming the

words before their ending, articulatory gestures must have been assembled as the acoustic

input became available. Since the participants know a sequence of gestures for a word

like /ko⁰do¹mo¹/, they should have been able to produce the target word more naturally.

[Figure 6.4 shows the acoustic input activating, in Symbolic Representations, the smaller set of words sharing the first two moras /ko⁰do¹/ (e.g., /kodomo/, /kodoku/), which then feed Articulation.]

Figure 6.4: Participants’ performance in Experiment 1 (Auditory naming) in which the participants started naming the words after they had partially heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information.

Figure 6.5 represents the scenario for the Auditory calculation; it shows

participants’ performance in Experiment 1 (Auditory naming) in which the participants

started naming the words after they had completely heard the word by exploiting

segmental and word-level prosodic (pitch accent patterns) information. The effectiveness

of the Auditory calculation indicates that participants heard the entire words. Recall that

the acoustic information is transmitted to Auditory patterns and Symbolic representations

simultaneously.

If the acoustic information was perceived as three moras followed by a pause, only

one representation was activated at Auditory patterns. Then, a complete sequence of

gestures for /ko⁰do¹mo¹/ was executed via Symbolic representations. A neighborhood

facilitative effect was observed in the Auditory calculation, since the acoustic information

activated auditory representations. Since no word selection is conducted in Auditory

patterns, the neighborhood effect was facilitative. Here, /ko⁰do¹mo¹/ is pronounced in a

more complete way.

[Figure 6.5 shows the complete acoustic input /ko⁰do¹mo¹/ activating a single Auditory pattern, which is passed through Symbolic Representations to Articulation.]

Figure 6.5: Participants’ performance in Experiment 1 (Auditory naming) in which the participants started naming the words after they had completely heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information.

As we have seen, the proposed model was able to explain the data obtained in

Experiment 1.

6.3.3.2. Experiment 2: Auditory Naming in Noise

Experiment 2 was an auditory naming experiment in noise with a secondary task.

This experiment collected naming times as well as identification accuracy. Consider the

patterns for slow namers in this model. Figure 6.6 shows participants’ performance in

Experiment 2 (Auditory naming in noise) in which the participants started naming the

words after they completely heard the word and exploited segmental and word-level

prosodic (pitch accent patterns) information.

This experiment provided curious patterns for naming times and word

identification accuracy. As in Experiment 1, the naming times were taken to be the

processing times between the onset of the target words embedded in noise and the onset

of the named targets. Word identification accuracy was analyzed on the basis of the

named targets. Different neighborhood density calculations were chosen for naming

times and word identification accuracy: Auditory calculation for naming times and the

Segments + Pitch calculation for word identification accuracy. The directions of effects

were opposite: a facilitative effect for naming times and an inhibitory effect for word

identification accuracy. How can the model account for these patterns?

[Figure 6.6 shows the noisy acoustic input activating several whole-word Auditory patterns that are similar to /ko⁰do¹mo¹/ overall but do not necessarily share its initial mora (e.g., /tororo/, /anago/, /kowane/, /enogu/), which are then passed to Symbolic Representations and Articulation.]

Figure 6.6: Participants’ performance in Experiment 2 (Auditory naming in noise) in which the participants started naming the words after they had completely heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information.

In a noise-free condition as in Experiment 1, participants can match the acoustic

information onto symbolic representation directly. More than 98% of 700 targets were

produced correctly in an auditory naming task under noise-free conditions. Because of

noise, however, slow namers in Experiment 2 performed the task after they completely

heard the word.

The effect of neighborhood density in the Auditory calculation suggests that

auditory exemplars were mapped onto the acoustic input of the target word. In the noise-

free condition, only one word was completely matched with the acoustic input in

Auditory patterns so that it is just sent to Symbolic representation in order to retrieve a

sequence of gestures for articulation. However, in the noisy condition, there were

multiple words that consisted of various sounds which were mapped onto the acoustic

input - even if the auditory patterns helped shape the words. The noise created some

mismatches between the actual acoustic input and the mapped input. In order to produce

the words, the words that were mapped to the acoustic input needed to be sent to

Symbolic representations. In Symbolic representations, many words were activated based

on the transmitted information from Auditory patterns. As shown in §6.3.3.1, many

words were activated at Symbolic representations for fast namers. A crucial difference

between those cases and this one is that, here, the activated words in Symbolic

representations do not share the exact same mora. As shown in Figure 6.6, the most similar

neighbors to /kodomo/ do not share the initial mora. In order to complete the

experimental task, participants needed to make a decision about which word they had

heard. Recall that an inhibitory effect is assumed to be the result of word selection,

which means that the neighborhood density effects at Symbolic representations (the

Segments calculation and the Segments + Pitch calculation) will induce inhibition effects.

Furthermore, participants tended to listen to the entire words before they named them,

exploiting both segmental information and pitch accent information. Because of

this, the Segments + Pitch calculation accounted for the data more satisfactorily than the

Segments calculation.

For the naming times, however, no word selection process was involved. The

activated auditory exemplars in Auditory patterns were just transmitted to Symbolic

representations. Therefore, the Auditory calculation of neighborhood density showed

facilitation, not inhibition.

Two auditory naming experiments (Experiments 1 and 2) provided the following

findings. First, it is a reasonable assumption that the selection of a word causes a

neighborhood inhibitory effect. Neighborhood density effects are facilitative only if word

selection is not involved. This observation was valid for Auditory patterns and Symbolic

representations . which are used for acoustic mapping. Since word selection was not

involved here, a neighborhood density effect should not be observed in our data. The

acoustic information was just converted to the symbolic representation in order to execute

a sequence of articulatory gestures. Therefore, there is a discrepancy between English

and Japanese auditory naming experiments with word targets. In terms of this

framework, the English participants in Luce and Pisoni (1998) had to make a word

selection in order to perform the task. Experiment 2 showed that Japanese participants

also had to make a decision about the word form. More details on the discrepancies

between Japanese and English will be presented in §6.3.4.

Secondly, it is possible that word targets are pronounced in the same way as

nonwords via Symbolic representations. In §6.3.3.1, dual routes for articulation explain

the difference between words and nonwords. However, Japanese participants in this

experiment showed that listening to the entire word target was not a requirement to

perform the task. At earlier stages, participants were able to imitate the acoustic input.

Thus, words are named in two different procedures (via Auditory patterns for a more

complete representation, or via Symbolic representations that require an assembling of

articulatory gestures), whereas nonwords have only one route for articulation via

Symbolic representations.

Third, different neighborhood density calculations show three types of

information regarding lexical access: (1) the kinds of phonological information that are

used for lexical access (segments and pitch accents), (2) how word similarity is calculated

(unit-by-unit or whole-word) and (3) approximately when the participants executed their

articulation. In the timecourse of lexical access, word-level prosodic information for

word activation only becomes available after segmental information becomes available.

Therefore, the patterns observed here are supported by a recent study by Cutler and Otake

(2002), which claims that words are activated based on segmental information before

word-level prosody constrains word segmentation.

However, one thing to consider is the possibility that there may be mixtures in

some data sets - with some participants sometimes following one naming path while the

rest of the participants follow the other.

6.3.3.3. Experiment 3: Semantic Categorization

The semantic categorization experiment requires lexical access because

participants must retrieve the meaning of words in order to decide whether or not the

words the participants hear belong to an assigned semantic category. The results showed

that two types of word competition emerged in this task: a neighborhood density effect

and the effects of initial cohort size and uniqueness point. In this framework, the

acoustic input is connected to both Auditory patterns (whole-word auditory patterns) and

Symbolic representations (unit-by-unit symbolic representation). Neighborhood density

effect is based on similarity of words whereas the effects of initial cohort size and

uniqueness point are based on similarity of units, such as phonemes. Furthermore, both

facilitative and inhibitory neighborhood density effects were observed in this experiment.

How could this have happened?

[Figure 6.7 shows the acoustic input feeding both Symbolic Representations and Auditory patterns, which in turn connect to Semantics.]

Figure 6.7: Participants’ performance in Experiment 3 (Semantic categorization).

Consider fast responders first. Figure 6.7 shows participants’ performance in

Experiment 3 (Semantic categorization). As in auditory naming experiments, it may be

assumed that the different neighborhood density calculations show three types of

information with respect to lexical access: (1) the kinds of phonological information that

are used for lexical access (segments and pitch accents), (2) how word similarity is

calculated (unit-by-unit or whole-word) and (3) approximately when the participants

executed their articulation. The Segments calculation and the Segments + Pitch

calculation show processes before participants hear the entire word, and the Auditory

calculation shows processes after they have heard the entire word. For example, if

segmental information is available, participants may process the acoustic input in terms of

similarity unit-by-unit until word-level prosodic information becomes available. In this

model, three different neighborhood density calculations offer windows into three

different stages of the word recognition process.

The two neighborhood calculations based on Symbolic representations (the

Segments calculation and the Segments + Pitch calculation) showed facilitative effects.

This means that participants analyzed the acoustic input based on segments as well as

segments + pitch. Recall that first mora frequency is interpreted as initial cohort size, so

it makes sense that the initial cohorts were activated based on the first mora. The initial

cohort size showed an inhibitory effect, which should have had an influence on the

Segments calculation. The other calculation based on Symbolic representations was the

Segments + Pitch calculation. In this case, once pitch accent information is available,

segmental information as well as pitch accent information is used in the subsequent

reduction of the cohort. Therefore, both segment-based neighborhood calculations would

have an effect. The effects related to the Cohort theory demonstrate that the acoustic

input was analyzed in terms of segments using Symbolic representations.

The Auditory calculation also had an effect in Experiment 3. This calculation is a

measure of whole-word similarity based on Auditory patterns. This means that

participants were sensitive not only to unit-by-unit similarity of words but also to whole-

word similarity. As shown in Figure 6.7, the acoustic input is transmitted simultaneously

to Auditory patterns and Symbolic representations. However, whole-word similarity

does not come into play until the target words are fully recognized. Until that point, the

cohort effects from Symbolic representations were observed earlier than the

neighborhood density effect. As mentioned in Chapter 2, nearly 55% of the target words

do not have a uniqueness point before the last segment. Although most of the targets

were still not recognized, some target words took advantage of the cohort effects and

were recognized at this point.

If the targets had not been recognized by this point, the information about activated

words in Symbolic representations (with and without pitch accent patterns) was

transmitted to Auditory patterns, where neighbors were activated by the acoustic input.

Recall that the acoustic information was transmitted to Auditory patterns while cohort

reduction occurred. Word selection was therefore based not only on segment-based

cohort reduction, but also on the overall auditory impression of the words. Therefore, the

words that were activated in both representations should have a higher activation level than is

possible for word recognition. Now word decision is not at Symbolic representations but

at Auditory patterns, yielding a neighborhood inhibitory effect in Auditory patterns.

Since Symbolic representations are not used for lexical selection, the two other

neighborhood density effects were still facilitative. Therefore, three different

neighborhood densities produced significant effects in this experiment; the Segments and

the Segments + Pitch calculations produced facilitative effects while the Auditory

calculation produced an inhibitory effect.

Two types of word competition emerged in this experiment. However, they did

not operate simultaneously in the timecourse of the categorization task. A word

competition effect from cohort reduction emerged through the analysis of acoustic

information in terms of unit similarity at Symbolic representations. Neighborhood

density effect emerged by analyzing the same acoustic information in terms of whole-

word similarity at Auditory patterns. The participants selected words based on Auditory

patterns with the information from Symbolic representations.

For slow responders, the basic mechanism is the same as for fast responders except

that the Auditory calculation had a stronger effect for slow responders.

In sum, the results of three experiments in this dissertation were satisfactorily

explained by the model proposed in §6.3.2. In §6.3.4, findings from previous

experiments will be reconsidered in light of the proposed model.

6.3.4. Previous Findings in Terms of the Proposed Model

This section will look at previous findings in neighborhood experiments with

respect to the proposed model. It will begin by first looking at the results of English

auditory naming experiments with word targets (Luce & Pisoni, 1998; Vitevitch & Luce,

1999). §6.3.4.2 will further investigate the results of a Japanese lexical decision

experiment that did not show a neighborhood inhibitory effect (Amano & Kondo, 1999).

Finally, implications from the current word recognition theories will be reinterpreted in

terms of this model in §6.3.4.3.

6.3.4.1. Auditory Naming Experiments with Word Targets in English (Luce & Pisoni, 1998; Vitevitch & Luce, 1999)

There are some important discrepancies between English and Japanese naming

data. Japanese naming data showed that word selection was conducted only under noisy

conditions because the participants were in a situation where they had to choose one of a

few candidate words in order to complete the task. In fact, in a noise-free condition

(Experiment I), Japanese participants did not show an inhibitory effect when they did not

need to choose the word.

On the other hand, neighborhood experiments with an auditory naming task have

shown that neighborhood density is a facilitative effect with nonwords and is an

inhibitory effect with words (Luce & Pisoni, 1998; Vitevitch & Luce, 1999). The results of Luce and Pisoni (1998) may be accounted for by recognizing that the English participants in the

study had to choose single word forms in order to complete the task even in a noise-free

condition. In the proposed model, monosyllabic auditory patterns quickly become

activated and an auditory exemplar for the target word is selected. Then, the selected

word is converted into a sequence of gestures via Symbolic representations. What, then, might cause such a situation?


In a naming task, the participants’ task was to repeat monosyllabic words (CVC in

Luce & Pisoni, 1998 and CVCC in Vitevitch & Luce, 1999) as quickly and as accurately

as possible. In order to perform the task, they have to know exactly which syllable they

are going to name in the word they hear. However, because of the structure of the

English lexicon and the usage of English words, choosing a syllable is practically

equivalent to selecting a word. Cutler & Carter (1987) reported that 64% of the content

words in the London-Lund corpus are monosyllabic words. It is also known that

the frequency of monosyllabic words is very high in English. Under these conditions, for

English participants, choosing the syllable they minimally need to know in order to

perform the task is equivalent to selecting that word in the lexicon. English participants

were in a situation where they actually decided on the word for the task. This happened

because the syllable they needed to know to perform the task was always a high frequency

word in English.

For Japanese participants, however, the first syllable/mora is a part of the target

word. The Japanese participants in this study learned in a practice session that they

would hear CVCVCV words. What they had to know was what the first syllable/mora of

the target words was. Monosyllabic/monomoraic CV words are rare in the Japanese lexicon, so an individual syllable would be unlikely to trigger a search through the lexicon

for corresponding words. The Japanese participants did not need to select the word in

order to perform this task.


The discrepancy between English and Japanese naming time data may be caused

by a relationship between the target words, the units for production (syllables) and the

lexical structure including frequencies of words.

6.3.4.2. Lexical Decision Experiment in Japanese (Amano & Kondo, 1999)

This section discusses another discrepancy between English and Japanese data in

the lexical decision experiments. In this task, participants were asked to decide whether

the stimuli were words or nonwords. The task, therefore, requires discrimination between

words and nonwords. Luce and Pisoni (1998) and Vitevitch and Luce (1999) both

reported that the decision times showed a neighborhood inhibitory effect. In Japanese,

Amano and Kondo (1999) conducted a similar lexical decision experiment, but the results

did not show a neighborhood inhibitory effect. Amano and Kondo (1999) accounted for

the discrepancy between lexical decision data and word identification data by claiming

that the neighborhood requires some amount of time to be activated enough to compete

with a target word (p. 1666). Since Amano & Kondo (1999) found a neighborhood

inhibitory effect in a word identification experiment, there might be a reason why it did

not show up in the lexical decision experiment.3 It has been assumed here that the

lexical decision task requires lexical selection: one word is selected from the activated

words in the lexicon in order to perform the task. However, Amano and Kondo’s

3 The neighborhood density calculation used in Amano & Kondo (1999) was based on moras, not on segments.

explanation may imply that Japanese participants managed to perform the task without

selecting the word from activated words in the lexicon. The rest of this section explores

this possibility.

The lexical decision task requires both words and nonwords as stimuli.

According to the results of English neighborhood density experiments, the task

necessitates the activation of lexical items in memory to categorize the stimulus

successfully, even when the stimulus is a nonword. In other words, to make a lexical

decision on both words and (phonotactically legal) nonwords, participants have to make a

decision on the representation where an analysis of words and nonwords is possible.

In this model, two types of representations are assumed to exist in the lexicon:

Auditory patterns and Symbolic representations. There are therefore three lexical

decision strategies participants could take, depending on which representation they use

for their decision.

In the first two strategies, participants select a word from activated words in the

lexicon. In the first strategy, Japanese participants could perform the task by selecting a

word based on Auditory patterns, because the lexical decision task does not require

analyzing the internal structure of the word. Neighborhood density is based on whole-

word similarity, so if word selection takes place in Auditory patterns, neighborhood

density would show an inhibitory effect. This neighborhood inhibitory effect was the

prediction in Amano & Kondo's experiment. This means that auditory patterns are fully

activated and word selection should be made in this space. From the results of


Experiments 1 and 2 in this dissertation, whole-word activation should occur later in the

timecourse of word recognition. However, in this experiment, both words and nonwords

are used as targets. Since Auditory patterns are not able to deal with nonwords, it

seems unlikely that participants will take this strategy.

The second strategy is that participants base their decisions on symbolic

representations. Recall that symbolic representations of words are stored in the lexicon.

At the same time, symbolic representations are needed to hold temporary acoustic

information. Since nonwords are not stored in the lexicon, a symbolic representation

needs to be selected. Symbolic representations can be used for both words and nonwords,

so participants may focus on symbolic representations. English lexical decision

experiments have consistently shown that there is an effect of neighborhood density for

words while there is an effect of probabilistic phonotactics for nonwords. This difference

depends on whether participants performed word selection or not. English participants

may have taken this strategy in the proposed model.

The last strategy is that participants perform their task based on the probabilistic

phonotactics in the Symbolic representation. Amano & Kondo used CVCVCVCV words

with an LHHH pitch accent pattern. In other words, participants performed the task based

on whether a sequence of 4 moras with a proper pitch accent pattern occurs word-initially

or not. If this was the case, the participants did not need to consider other words. When

the acoustic input does not match any symbolic representation, the target could be

identified as a ‘nonword.’


The response patterns in the lexical decision experiment in Amano & Kondo’s

(1999) study did not show a neighborhood inhibitory effect, suggesting that the data

support the third strategy: participants made a decision based on probabilistic

phonotactics.

6.3.4.3. Implications for Current Recognition Models

So far, the model has accounted for the results of the three experiments in this

dissertation. It has also been used to account for discrepancies between English and

Japanese results in auditory naming and lexical decision experiments.

This next section will consider the implications of this work for current word

recognition theories. In particular, it will address the relationships among word

competition, representations and levels of processing.

Word recognition models usually assume both sublexical and lexical levels of

processing. These effects in spoken word recognition have been demonstrated in a

number of studies that investigated the processing of words and nonwords that varied in probabilistic phonotactics (defined as the positional frequencies of segments and

biphones) and lexical competition (neighborhood density). Sublexical effects are

supported by findings in Pitt and Samuel (1995) and Vitevitch and Luce (1998, 1999),

who showed that the frequency of the sound components of spoken stimuli facilitates

processing. Sublexical frequency effects (also known as probabilistic phonotactics) play a part in the recognition process, as do the well-documented effects of lexical competition


(Gow & Gordon, 1995; McQueen et al., 1994; Norris et al., 1995; Shillcock, 1990;

Tabossi et al., 1995; Vitevitch & Luce, 1999; Vroomen & de Gelder, 1995; 1997;

Wallace et al., 1995a; Wallace et al., 1995b; Zwitserlood, 1989; Zwitserlood &

Schriefers, 1995). At the same time, competition among lexical representations inhibits

processing (Neighborhood density effect: Luce et al., 2000; Luce & Pisoni, 1998;

Vitevitch & Luce, 1999; Amano & Kondo, 1999).
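A minimal sketch of how positional segment and biphone frequencies, the definition of probabilistic phonotactics given above, might be estimated from a word list follows; the type-based counting scheme is an illustrative assumption rather than the exact procedure of the studies cited.

```python
from collections import Counter

def positional_counts(words):
    """Count how often each segment and each biphone occurs in each
    position across a list of segmentally transcribed words."""
    seg_counts, biphone_counts = Counter(), Counter()
    for w in words:
        for i, s in enumerate(w):
            seg_counts[(i, s)] += 1
        for i in range(len(w) - 1):
            biphone_counts[(i, w[i:i + 2])] += 1
    return seg_counts, biphone_counts

def phonotactic_score(word, seg_counts, biphone_counts):
    """Sum of positional segment and biphone counts for a word; higher
    scores mean the word is built from more frequent pieces."""
    score = sum(seg_counts[(i, s)] for i, s in enumerate(word))
    score += sum(biphone_counts[(i, word[i:i + 2])]
                 for i in range(len(word) - 1))
    return score

# Toy lexicon drawn from the Appendix C word list (illustration only).
lexicon = ["kabuka", "kabuki", "katati", "katana", "kemuri", "kimoti"]
segs, bis = positional_counts(lexicon)
print(phonotactic_score("kabuki", segs, bis))  # 31: frequent pieces
print(phonotactic_score("kemuri", segs, bis))  # 21: rarer pieces
```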

Furthermore, Vitevitch and Luce (1999) have shown that different experimental

tasks could change the focus of participants in word processing. These results are

consistent with the hypothesis that neighborhood density has an inhibitory effect when

participants mainly focus on lexical processing, while probabilistic phonotactics are

facilitative when participants mainly focus on sublexical processing.

Since overlapping words are built from segments that are frequent in the language, there is a strong positive correlation between neighborhood density and probabilistic phonotactics. Typically, as the number of overlapping words increases, the frequencies of the segments that make up the overlapping words also increase. Based on this fact, a neighborhood facilitative effect can be interpreted as an effect of probabilistic phonotactics on sublexical processing.

Although neighborhood density and probabilistic phonotactics are highly

correlated, they are still separable. Luce and Large (2001) found that the results from a

speeded same-different task revealed simultaneous facilitative effects from phonotactics


and inhibitory effects from lexical competition (neighborhood density) for real word

stimuli.

The above results have suggested that word recognition models need to explain

two types of word competition (segment-based and whole-word based) and two levels of

processing (sublexical and lexical). In fact, two types of word competition were

simultaneously observed in Experiment 3, the semantic categorization experiment.

The claim here is that lexical processing and sublexical processing are based on

different representations in the proposed model. Lexical processing is based on Auditory

patterns and sublexical processing is based on Symbolic representations. This model

does not assume any internal linguistic structure for Auditory patterns. However, it also

has Symbolic representations that could be represented by linguistic units, such as

phonemes, moras and syllables. Therefore, lexical processing occurs when participants

treat words as a whole unit, whereas sublexical processing happens when participants

treat words as an assembly of units (such as phonemes, moras and syllables). In other

words, in lexical processing, participants perceive words as complete forms whereas in

sublexical processing, they perceive words as assembled forms.

This model also accommodates the two types of word competition, which are based on different

representations. Neighborhood density is based on Auditory patterns whereas cohort

reduction and other segment-based word competition are based on Symbolic

representations.


According to the results of a series of experiments in Vitevitch and Luce (1999),

participants’ focus is highly related to experimental tasks. For example, a speeded same-

different task focuses participants more on sublexical processing, whereas a semantic

categorization task focuses them more on lexical processing.

In this model, participants focus more on one representation than the other, and

this focus is determined by the experimental task. Also, words are processed in both

representations whereas nonwords are only processed in symbolic representations. A

symbolic representation is the only representation in which participants can hold the

acoustic input for nonwords.

Also, if the experimental tasks require analyzing the internal structure of words,

participants are likely to focus on a symbolic representation. Therefore, many

experimental tasks used in studies on sublexical processing may inherently lead

participants to focus on a symbolic representation (phoneme-monitoring task, syllable-

monitoring task, ABX discrimination task, word-completion task). Also, the auditory

naming task needs to rely on a symbolic representation because this representation

mediates articulation in this model. The lexical decision task is processed at a symbolic

representation as a special case. The lexical decision itself does not require analyzing the

internal structures of words. However, participants also hear nonwords, which makes them focus on a symbolic representation. In order to process both types of stimuli

efficiently, participants probably tend to focus on a symbolic representation. If this is the


case, the semantic categorization task is the only task that is not biased towards a

symbolic representation.

The proposed model uses two types of word representations (Auditory patterns

and Symbolic representations) in order to explain the results obtained in this dissertation.

The two types of word competition observed in a single experiment seemed to be

especially problematic for current models. The proposed model explains the two types of

word competition by positing two different representations, each of which reflects lexical

and sublexical processing. This model was a first attempt to assume auditory patterns

and symbolic representations simultaneously. More detailed investigation needs to be

conducted in the future.

6.4. Conclusions

This dissertation attempted to investigate two aspects of lexical access:

representations used for lexical access and word competition effects in Japanese. How

well did it succeed?

In word competition, the neighborhood density effect is a measure of lexical

competition: words from dense neighborhoods are processed less quickly and less

accurately than words from sparse neighborhoods. Therefore, three experiments were

conducted to investigate this effect in this dissertation. The results showed that

neighborhood density effects influenced both processing times and accuracy.


The processing times of Experiment 3 (the Semantic categorization experiment)

showed a neighborhood inhibitory effect for both fast and slow responders. This effect

was observed in a similar semantic categorization experiment in English (Vitevitch &

Luce, 1999). The accuracy data of Experiment 2 (Word identification in noise

experiment) showed that words from dense neighborhoods were recognized less

accurately than words from sparse neighborhoods. This pattern was also observed in

word identification in noise experiments in Japanese and English (Luce & Pisoni, 1998;

Amano & Kondo, 1999). Based on the assumption that the neighborhood inhibitory effect reflects word competition in any language, the above results provide evidence that

word competition also occurs in Japanese.

The results also showed another type of word competition effect in Japanese. The

second type of competition is related to the cohort theory. First, the effect of uniqueness

point was observed: words with an earlier uniqueness point were categorized more

quickly than words with a later uniqueness point. Also, an effect of initial cohort size

was observed. Recall that the first mora frequency can be interpreted in different ways; in one interpretation, it indexes the size of the initial cohort, provided the effect is inhibitory. This

factor measures how frequently the initial mora appears word-initially. Therefore, if this

factor is high, there are many words beginning with this initial mora, yielding more

lexical competition. In this case, words with a large initial cohort are processed less

quickly than words with a small initial cohort. The first mora frequency showed an

inhibitory effect in Experiment 3, which supports the hypothesized effect of initial cohort


size. These two results, which support the cohort theory, suggest that the left-to-right

word competition effect happens in word recognition processes in Japanese.
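The first mora frequency measure can be made concrete with a short sketch: count how many words in a list begin with each mora and treat that count as the size of a word's initial cohort. The mora segmentation below is deliberately simplified (plain CV romanizations only, ignoring long vowels, geminates, and the moraic nasal) and is meant only as an illustration.

```python
from collections import Counter

VOWELS = set("aiueo")

def first_mora(word: str) -> str:
    """Crude first-mora extractor for plain CV-style romanizations: an
    initial vowel is a mora on its own; otherwise take everything up to
    and including the first vowel."""
    if word[0] in VOWELS:
        return word[0]
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[:i + 1]
    return word

def initial_cohort_sizes(words):
    """Map each word to the number of words sharing its first mora."""
    counts = Counter(first_mora(w) for w in words)
    return {w: counts[first_mora(w)] for w in words}

# Toy lexicon drawn from the Appendix C word list (illustration only).
lexicon = ["kabuka", "kabuki", "katati", "kemuri", "kimoti", "kokoro"]
print(initial_cohort_sizes(lexicon))
# {'kabuka': 3, 'kabuki': 3, 'katati': 3, 'kemuri': 1, 'kimoti': 1, 'kokoro': 1}
```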

Together, these data provided evidence that two types of lexical competition are at

work in Japanese.

This dissertation also investigated the kind of word representation used for lexical

access. Recent studies have shown that listeners use both abstract and episodic representations in lexical access (e.g., Luce & Lyons, 1998; Soto-Faraco et al.,

2001; Pallier et al., 2001). The results of the experiments in this dissertation also support

this view. As was just mentioned, there were two types of lexical competition effects observed in Experiment 3 (the Semantic categorization experiment): (1) a whole-word form competition effect (the neighborhood effect) and (2) a phoneme-based competition effect

(cohort reduction). A neighborhood density calculation also showed that inhibition is

related to a measure of whole-word similarity computed by a comparison of

cochleagrams. Therefore, we need to assume that auditory patterns (auditory

representations) are stored in the lexicon.
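The idea of whole-word auditory similarity can be pictured with a schematic sketch: two spectro-temporal matrices (stand-ins for cochleagrams) are linearly time-normalized to a common number of frames and compared with a Euclidean distance. This is only an illustration of the general idea; the actual cochleagram comparison used for the Auditory calculation may differ in its auditory front end, alignment, and weighting.

```python
import numpy as np

def time_normalize(rep: np.ndarray, n_frames: int = 50) -> np.ndarray:
    """Linearly resample a (frames x channels) auditory representation
    to a fixed number of frames so whole words can be compared."""
    frames, channels = rep.shape
    old_t = np.linspace(0.0, 1.0, frames)
    new_t = np.linspace(0.0, 1.0, n_frames)
    return np.stack([np.interp(new_t, old_t, rep[:, c])
                     for c in range(channels)], axis=1)

def auditory_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Euclidean distance between two time-normalized whole-word
    representations; smaller values mean more similar words."""
    return float(np.linalg.norm(time_normalize(a) - time_normalize(b)))

# Toy stand-ins for cochleagrams: random (frames x channels) matrices.
rng = np.random.default_rng(0)
word_a = rng.random((62, 24))  # e.g. a longer token
word_b = rng.random((55, 24))  # a slightly shorter token
print(auditory_distance(word_a, word_b))
```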

The necessity of an abstract representation comes from the fact that slow namers in the two auditory naming experiments tended to repeat the target words only after hearing them in their entirety, suggesting that they knew exactly which word they were going to say. In this case,

participants have to retrieve an articulatory representation of the words. Since these two

representations do not have a direct, one-to-one correspondence, a symbolic


representation that mediates between the two representations is needed (see Jusczyk, 1993; Plaut & Kello, 1999 for discussion).

In conclusion, the results of the experiments shed light on the two

aspects of lexical access investigated in this dissertation. First, a lexical competition effect

is confirmed in Japanese. There are also two types of lexical competition in auditory

word recognition: form-based competition (neighborhood density) and phoneme-based

competition (cohort reduction). Finally, both abstract (symbolic) representations and

episodic (auditory) representations need to be stored in the lexicon.


REFERENCES

Amano, S., & Kondo, T. (1999). Neighborhood effects on spoken word recognition in Japanese. EUROSPEECH 99, 4, 1663-1666.

Amano, S., & Kondo, T. (2000). Neighborhood and cohort in lexical processing of Japanese spoken words. Paper presented at Spoken Word Access Processes, Jonkerbosch Conference Center, Nijmegen, The Netherlands.

Amano, S., & Kondo, T. (1999, 2000). The properties of the Japanese lexicon. Tokyo: Sanseido Co. Ltd.

Bailey, T. M., & Hahn, U. (2001). Determinants of wordlikeness: Phonotactics or lexical neighborhoods? Journal of Memory and Language, 44, 568-591.

Barry, C., Hirsh, K. W., Johnston, R. A., & Williams, C. L. (2001). Age of acquisition, word frequency, and the locus of repetition priming of picture naming. Journal of Memory and Language, 44, 350-375.

Battig, W. F., & Montague, W. E. (1969). Category norms for verbal items in 56 categories: A replication and extension of the Connecticut category norms. Journal of Experimental Psychology, 49, 229-240.

Beckman, M. E., & Edwards, J. (1990). Lengthenings and shortenings and the nature of prosodic constituency. In J. Kingston & M. E. Beckman (Eds.), Papers in laboratory phonology I: Between the grammar and physics of speech (pp. 152-178). Cambridge, UK: Cambridge University Press.

Bradley, D. C., Sanchez-Casas, R. M., & Garcia-Albea, J. E. (1993). The status of the syllable in the perception of English and Spanish. Language and Cognitive Processes, 8, 197-233.

Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sound. Cambridge, MA: MIT Press.

Brown, G. D. A., & Watson, F. L. (1987). First in, first out: Word learning age and spoken word frequency as predictors of word familiarity and word naming latency. Memory and Cognition, 15, 208-216.

Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61, 93-125.

Bruck, B., Treiman, R., & Caravolas, M. (1995). Role of the syllable in the processing of spoken English: Evidence from a nonword competition task. Journal of Experimental Psychology: Human Perception and Performance, 21, 469-479.

Charles-Luce, J., & Luce, P. A. (1995). An examination of similarity neighbourhoods in young children's receptive vocabularies. Journal of Child Language, 22, 727-735.

Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates, Publishers.

Cutler, A. (1986). Forbear is a homophone: Lexical prosody does not constrain lexical access. Language and Speech, 29, 201-220.

Cutler, A. (1997). The comparative perspective on spoken-language processing. Speech Communication, 31, 3-15.

Cutler, A., & Carter, D. M. (1987). The predominance of strong initial syllables in the English vocabulary. Computer Speech and Language, 2, 133-142.

Cutler, A., & Chen, H. C. (1997). Lexical tone in Cantonese spoken-word processing. Perception & Psychophysics, 59, 165-179.

Cutler, A., & Norris, D. G. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception & Performance, 14, 113-121.

Cutler, A., & Butterfield, S. (1992). Rhythmic cues to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31, 218-236.

Cutler, A., Mehler, J., Norris, D. G., & Segui, J. (1986). The syllable's differing role in the segmentation of French and English. Journal of Memory and Language, 25, 385-400.

Cutler, A., Mehler, J., Norris, D. G., & Segui, J. (1992). The monolingual nature of speech segmentation by bilinguals. Cognitive Psychology, 24, 381-410.

Cutler, A., & Otake, T. (1994). Mora or phoneme? Further evidence for language-specific listening. Journal of Memory and Language, 33, 824-844.

Cutler, A., & Otake, T. (1999). Pitch accent in spoken-word recognition in Japanese. Journal of the Acoustical Society of America, 105, 1877-1888.

Cutler, A., & Otake, T. (2002). Rhythmic categories in spoken-word recognition. Journal of Memory and Language, 46, 296-322.

Cutler, A., & Young, D. (1994). Rhythmic structure of word blends in English. Proceedings of the International Conference on Spoken Language Processing '94, Yokohama, Japan, 3, 1407-1410.

Dupoux, E., Kakehi, K., Hirose, Y., Pallier, C., & Mehler, J. (1999). Epenthetic vowels in Japanese: A perceptual illusion? Journal of Experimental Psychology: Human Perception and Performance, 25, 1568-1578.

Dupoux, E., Pallier, C., Kakehi, K., & Mehler, J. (2001). New evidence for prelexical phonological processing in word recognition. Language and Cognitive Processes, 16, 491-505.

Fear, B. D., Cutler, A., & Butterfield, S. (1995). The strong/weak syllable distinction in English. Journal of the Acoustical Society of America, 97, 1893-1904.

Finney, S. A., Protopapas, A., & Eimas, P. D. (1996). Attentional allocation to syllables in American English. Journal of Memory and Language, 35, 893-909.

Forster, K. I., & Shen, D. (1996). No enemies in the neighborhood: Absence of inhibitory neighborhood effects in lexical decision and semantic categorization. Journal of Experimental Psychology: Learning, Memory & Cognition, 22, 696-713.

Frisch, S. A. (1996). Similarity and frequency in phonology. Unpublished doctoral dissertation, Northwestern University, Evanston, IL.

Frisch, S. A., Large, N. R., & Pisoni, D. B. (2000). Perception of wordlikeness: Effects of segment probability and length on the processing of nonwords. Journal of Memory and Language, 42, 481-496.

Garlock, V. M., Walley, A. C., & Metsala, J. L. (2001). Age-of-acquisition, word frequency, and neighborhood density effects on spoken word recognition by children and adults. Journal of Memory and Language, 45, 468-492.

Goldinger, S. D. (1992). Words and voices: Implicit and explicit memory for spoken words. Unpublished doctoral dissertation, Indiana University, Bloomington, IN.

Goldinger, S. D. (1996). Words and voices: Episodic traces in spoken word identification and recognition memory. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 1166-1183.

Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review, 105, 251-279.

Goldinger, S. D., Luce, P. A., & Pisoni, D. B. (1989). Priming lexical neighbors of spoken words: Effects of competition and inhibition. Journal of Memory and Language, 28, 501-518.

Gow, D., & Gordon, P. (1995). Lexical and pre-lexical influences in word segmentation: Evidence from priming. Journal of Experimental Psychology: Human Perception and Performance, 21, 344-459.

Greenberg, J. H., & Jenkins, J. J. (1964). Studies in the psychological correlates of the sound system of American English. Word, 20, 157-177.

Grossberg, S., Boardman, I., & Cohen, M. (1997). Neural dynamics of variable-rate speech categorization. Journal of Experimental Psychology: Human Perception and Performance, 23, 483-503.

Hasegawa, Y., & Hata, K. (1992). Fundamental frequency as an acoustic cue to accent perception. Language and Speech, 35, 87-98.

Hibiya, J. (1995). The velar nasal in Tokyo Japanese: A case of diffusion from above. Language Variation and Change, 7, 139-152.

Johnson, K. (1997a). The auditory/perceptual basis for speech segmentation. In K. Ainsworth-Darnell & M. D'Imperio (Eds.), Ohio State University Working Papers in Linguistics: Papers from the Linguistic Laboratory, volume 50. Ohio State University.

Johnson, K. (1997b). Speech perception without speaker normalization. In K. Johnson & J. W. Mullennix (Eds.), Talker variability in speech processing (pp. 145-166). San Diego: Academic Press.

Jusczyk, P. W. (1993). From general to language-specific capacities: The WRAPSA model of how speech perception develops. Journal of Phonetics, 21, 3-28.

Kenbou, G., Kindaichi, H., Kindaichi, K., & Shibata, T. (1981). Sanseido Shinmeikai Dictionary. Tokyo: Sanseido Co. Ltd.

Klatt, D. H. (1974). The duration of [s] in English words. Journal of Speech and Hearing Research, 17, 51-63.

Klatt, D. H. (1975). Vowel lengthening is syntactically determined in a connected discourse. Journal of Phonetics, 3, 129-140.

Klatt, D. H. (1976). Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America, 59, 1208-1221.

Klatt, D. H. (1979). Speech perception: A model of acoustic-phonetic analysis and lexical access. Journal of Phonetics, 7, 279-312.

Klatt, D. H. (1981). Lexical representations for speech production and perception. In T. Myers, J. Laver, & J. Anderson (Eds.), The cognitive representation of speech (pp. 11-31).

Kubozono, H. (1995). Perceptual evidence for the mora in Japanese. In B. Connell & A. Arvaniti (Eds.), Phonology and phonetic evidence: Papers in laboratory phonology IV (pp. 141-156). Cambridge: Cambridge University Press.

Lehiste, I. (1960). An acoustic-phonetic study of internal open juncture. Basel (Schweiz); New York: S. Karger.

Lehiste, I. (1972). The timing of utterances and linguistic boundaries. Journal of the Acoustical Society of America, 51, 2018-2024.

Luce, P. A. (1986a). Neighborhoods of words in the mental lexicon (Research on Speech Perception Technical Report 6). Bloomington, IN: Indiana University.

Luce, P. A. (1986b). A computational analysis of uniqueness points in auditory word recognition. Perception & Psychophysics, 39, 155-158.

Luce, P. A., Pisoni, D. B., & Goldinger, S. D. (1988). Similarity neighborhoods of spoken words (Research on Speech Perception Progress Report 14). Bloomington, IN: Indiana University.

Luce, P. A., Goldinger, S. D., Auer, E. T., & Vitevitch, M. S. (2000). Phonetic priming, neighborhood activation and PARSYN. Perception and Psychophysics, 62, 615-625.

Luce, P. A., & Large, N. R. (2001). Phonotactics, density, and entropy in spoken word recognition. Language and Cognitive Processes, 16, 565-581.

Luce, P. A., & Lyons, E. A. (1998). Specificity of memory representations for spoken words. Memory and Cognition, 26, 708-715.

Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19, 1-36.

Luce, P. A., Pisoni, D. B., & Goldinger, S. D. (1990). Similarity neighborhoods of spoken words. In G. T. M. Altmann (Ed.), Cognitive models of speech processing: Psycholinguistic and computational perspectives (pp. 122-147). Cambridge, MA: MIT Press.

Luce, R. D. (1959). Individual choice behavior. New York: Wiley.

van der Lugt, A. H. (2001). The use of sequential probabilities in the segmentation of speech. Perception and Psychophysics, 63, 811-823.

Marslen-Wilson, W. D. (1987). Functional parallelism in spoken word-recognition. Cognition, 25, 71-102.

Marslen-Wilson, W. D., & Welsh, A. (1978). Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology, 10, 29-63.

Marslen-Wilson, W. D., & Tyler, L. K. (1980). Temporal structure of spoken language understanding. Cognition, 8, 1-71.

McClelland, J., & Elman, J. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1-86.

McQueen, J. M. (1998). Segmentation of continuous speech using phonotactics. Journal of Memory and Language, 39, 21-46.

McQueen, J. M., & Cutler, A. (1998). Spotting (different types of) words in (different types of) context. Proceedings of the 5th International Conference on Spoken Language Processing, vol. 6 (pp. 2791-2794), Sydney, Australia.

McQueen, J. M., Norris, D. G., & Cutler, A. (1994). Competition in spoken word recognition: Spotting words in other words. Journal of Experimental Psychology: Learning, Memory and Cognition, 20, 621-638.

McQueen, J. M., Otake, T., & Cutler, A. (2001). Rhythmic cues and possible-word constraints in Japanese speech segmentation. Journal of Memory and Language, 45, 103-132.

Mehler, J., Dommergues, J.-Y., Frauenfelder, U., & Segui, J. (1981). The syllable's role in speech segmentation. Journal of Verbal Learning and Verbal Behavior, 20, 298-305.

Metsala, J. L. (1997). An examination of word frequency and neighborhood density in the development of spoken-word recognition. Memory & Cognition, 25, 47-56.

Nakatani, L. H., & Dukes, K. D. (1977). Locus of segmental cues for word juncture. Journal of the Acoustical Society of America, 62, 714-719.

Norris, D. (1994). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52, 189-234.

Norris, D., McQueen, J. M., & Cutler, A. (1995). Competition and segmentation in spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1209-1228.

Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23, 299-325.

Norris, D., McQueen, J. M., Cutler, A., & Butterfield, S. (1997). The possible-word constraint in the segmentation of continuous speech. Cognitive Psychology, 34, 191-243.

Norris, D., McQueen, J. M., Cutler, A., Butterfield, S., & Kearns, R. (2001). Language-universal constraints on speech segmentation. Language and Cognitive Processes, 16, 637-660.

Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39-57.


Oller, D. K. (1973). The effect of position in utterance on speech segment duration in English. Journal of the Acoustical Society of America, 54, 1235-1247.

Ogawa, T. (1972). 52 kategorii ni zokusuru go no syutsugen hindo hyoo (Category norms for verbal items in 52 categories in Japanese). Kansai Gakuin University, Jinbun Ronkyuu, 22, 1-68.

Otake, T., Hatano, G., Cutler, A., & Mehler, J. (1993). Mora or syllable? Speech segmentation in Japanese. Journal of Memory and Language, 32, 258-278.

Otake, T., Hatano, G., & Yoneyama, K. (1996a). Japanese speech segmentation by Japanese listeners. In T. Otake & A. Cutler (Eds.), Phonological structure and language processing: Cross-linguistic studies (pp. 183-201). Berlin, Germany: Mouton de Gruyter.

Otake, T., Yoneyama, K., Cutler, A., & van der Lugt, A. (1996b). The representation of Japanese moraic nasals. Journal of the Acoustical Society of America, 100, 3831-3842.

Otake, T., & Cutler, A. (1999). Perception of suprasegmental structure in a non-native dialect. Journal of Phonetics, 27, 229-253.

Pallier, C., Sebastian-Galles, N., Felguera, T., Christophe, A., & Mehler, J. (1993). Attentional allocation within the syllabic structure of spoken words. Journal of Memory and Language, 32, 373-389.

Pallier, C., Colome, A., & Sebastian-Galles, N. (2001). The influence of native-language phonology on lexical access: Exemplar-based versus abstract lexical entries. Psychological Science, 12, 445-449.

Perkell, J. S., Matthies, M. L., Svirsky, M. A., & Jordan, M. I. (1995). Goal-based speech motor control: A theoretical framework and some preliminary data. Journal of Phonetics, 23, 23-35.

Pirat, A., Logan, J., Cockell, J., & Gutteridge, M. E. (1995). The role of phonological neighborhoods in the identification of spoken words by preschool children. Poster presented at the Annual Meeting of the Canadian Society for Brain, Behaviour & Cognitive Science, Halifax, Nova Scotia.

Pisoni, D. B. (1996). Word identification in noise. Language and Cognitive Processes, 11, 681-688.

Pisoni, D. B., Nusbaum, H. C., Luce, P. A., & Slowiaczek, L. M. (1985). Speech perception, word recognition and the structure of the lexicon. Speech Communication, 4, 75-95.

Pisoni, D. B. (1997). Some thoughts on "normalization" in speech perception. In K. Johnson & J. W. Mullennix (Eds.), Talker variability in speech processing (pp. 9-32). San Diego: Academic Press.

Pitt, M. A. (1998). Phonological processes and the perception of phonotactically illegal consonant clusters. Perception and Psychophysics, 60, 941-951.

Pitt, M. A., & McQueen, J. M. (1998). Is compensation for coarticulation mediated by the lexicon? Journal of Memory and Language, 39, 347-370.

Pitt, M. A., & Samuel, A. G. (1995). Lexical and sublexical feedback in auditory word recognition. Cognitive Psychology, 29, 149-188.

Pitt, M. A., Smith, K. L., & Klein, J. M. (1998). Syllabic effects in word processing: Evidence from the structural induction paradigm. Journal of Experimental Psychology: Human Perception and Performance, 24, 1596-1611.

Plaut, D. C., & Kello, C. T. (1999). The emergence of phonology from the interplay of speech comprehension and production: A distributed connectionist approach. In B. MacWhinney (Ed.), The emergence of language (pp. 381-415). Lawrence Erlbaum Associates, Publishers.

Quene, H. (1992). Durational cues for word segmentation in Dutch. Journal of Phonetics, 20, 331-350.

Radeau, M., & Morais, J. (1990). The uniqueness point effect in the shadowing of spoken words. Speech Communication, 9, 155-164.

Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.

Saffran, J. R., Newport, E. L., Aslin, R. N., Tunick, R. A., & Barrueco, S. (1997). Incidental language learning: Listening (and learning) out of the corner of your ear. Psychological Science, 8, 101-195.

Schacter, D. L., & Church, B. (1992). Auditory priming: Implicit and explicit memory for words and voices. Journal of Experimental Psychology: Learning, Memory and Cognition, 18, 915-930.

Sebastian-Galles, N. (1996). The role of accent in speech perception. In T. Otake & A. Cutler (Eds.), Phonological structure and language processing: Cross-linguistic studies (pp. 172-181). Berlin, Germany: Mouton de Gruyter.

Sebastian-Galles, N., Dupoux, E., Segui, J., & Mehler, J. (1992). Contrasting syllabic effects in Catalan and Spanish. Journal of Memory and Language, 31, 18-32.

Sekiguchi, T., & Nakajima, Y. (1999). The use of lexical prosody for lexical access of the Japanese language. Journal of Psycholinguistic Research, 28, 439-454.

Shillcock, R. C. (1990). Lexical hypotheses in continuous speech. In G. T. M. Altmann (Ed.), Cognitive models of speech processing: Psycholinguistic and computational perspectives (pp. 24-49). Cambridge, MA: MIT Press.

Smith, K. L., & Pitt, M. A. (1999). Phonological and morphological influences in the syllabification of spoken words. Journal of Memory and Language, 41, 199-222.

Smith, K. L., & Pitt, M. A. Are cues to syllable boundaries also cues to word boundaries? Manuscript submitted for publication.

Soto-Faraco, S., Sebastian-Galles, N., & Cutler, A. (2001). Segmental and suprasegmental mismatch in lexical access. Journal of Memory and Language, 45, 412-432.

Sugito, M. (1972). Ososagari-koo: Dootai-sokutei ni yoru nihongo akusento no kenkyuu (Delayed pitch fall: An acoustic study). Shoin Joshi Daigaku Ronshuu, 10. (Reprinted in M. Tokugawa (Ed.), Akusento (Accent) (pp. 201-229). Tokyo: Yuuseidoo, 1980.)

Suomi, K., McQueen, J. M., & Cutler, A. (1997). Vowel harmony and speech segmentation in Finnish. Journal of Memory and Language, 36, 422-444.

Tabossi, P., Burani, C., & Scott, D. (1995). Word identification in fluent speech. Journal of Memory and Language, 34, 440-467.

Tyler, L. K., & Wessels, J. (1983). Quantifying contextual contributions to word-recognition processes. Perception and Psychophysics, 34, 409-420.

Vance, T. J. (1987). An introduction to Japanese phonology. Albany, NY: State University of New York Press.

Vance, T. J. (1995). Final accent vs. no accent: Utterance-final neutralization in Tokyo Japanese. Journal of Phonetics, 23, 487-499.

Vitevitch, M. S., & Luce, P. A. (1998). When words compete: Levels of processing in perception of spoken words. Psychological Science, 9, 325-329.

Vitevitch, M. S., & Luce, P. A. (1999). Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language, 40, 374-408.

Vroomen, J., & de Gelder, B. (1995). Metrical segmentation and lexical inhibition in spoken word recognition. Journal of Experimental Psychology: Human Perception and Performance, 21, 98-108.

Vroomen, J., & de Gelder, B. (1997). Activation of embedded words in spoken word recognition. Journal of Experimental Psychology: Human Perception and Performance, 23, 710-720.

Vroomen, J., van Zon, & de Gelder, B. (1996). Cues to speech segmentation: Evidence from juncture misperceptions and word spotting. Memory and Cognition, 24, 744-755.

Vroomen, J., Tuomainen, J., & de Gelder, B. (1998). The roles of word stress and vowel harmony in speech segmentation. Journal of Memory and Language, 38, 133-149.

Warner, N. (1997). Japanese final-accented and unaccented phrases. Journal of Phonetics, 25, 43-60.

Wallace, W. P., Stewart, M. T., & Malone, C. P. (1995a). Recognition memory errors produced by implicit activation of word candidates during the processing of spoken words. Journal of Memory and Language, 34, 417-439.

Wallace, W. P., Stewart, M. T., Sherman, H. L., & Malone, C. P. (1995b). False positives in recognition memory produced by cohort activation. Cognition, 55, 85-113.

Ye, Y., & Connine, C. M. (1999). Processing spoken Chinese: The role of tone information. Language and Cognitive Processes, 14, 609-630.

Yoneyama, K. (2000). The structural aspects of the Japanese lexicon. Paper presented at The Speakers Series, Department of Linguistics, Ohio State University.

Yoneyama, K., & Pitt, M. A. (1999). Prelexical representation in Japanese: Evidence from the structural induction paradigm. Proceedings of the 14th International Congress of Phonetic Sciences, vol. 2, 893-896.

Zwitserlood, P. (1989). The locus of the effects of the sentential-semantic context in spoken word processing. Cognition, 32, 589-596.

Zwitserlood, P., & Schriefers, H. (1995). Effects of sensory information and processing time in spoken-word recognition. Language and Cognitive Processes, 10, 121-136.

Zwitserlood, P., Schriefers, H., Lahiri, A., & van Donselaar, W. (1993). The role of the syllable in the perception of spoken Dutch. Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 260-271.


APPENDIX A: Alphabetic symbols used in the lexicon

In this dissertation, two types of word representations that are used for lexical access are tested: an abstract representation and an auditory representation. The alphabetic symbols shown below are used to describe the abstract representation. This representation is a "phonetically detailed" representation; therefore, it is NOT the phonological representation of words that linguists generally assume. This representation contains a short and long vowel distinction. In the Tokyo dialect of Japanese, /ou/ and /ei/ that appear within a morpheme are realized as [oo] and [ee], respectively, and the representation reflects this phonetic realization. Similarly, the representation used here contains more detailed phonetic information. For example, keiki 'business conditions' and keeki 'cake' in the examples below both contain /ki/, which is realized as a palatalized /k/ (written as a capital K) and which differs from /k/ in other vowel environments, such as /ke/ and /ko/.

Examples:
[ee]: keiki 'business conditions; market' (phonological /keiki/) and keeki 'cake' (/keeki/) are both written with [ee] in the abstract representation (keeKi).
[oo]: koudo 'altitude, height' (/koudo/) and koodo 'cord' (/koodo/) are both written with [oo] (koodo).
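A minimal sketch of the two realization rules just described (/ei/ to [ee] and /ou/ to [oo] within a morpheme, and /ki/ written with a capital K for the palatalized stop) is given below; it covers only these rules, not the full symbol inventory.

```python
def abstract_representation(phonological: str) -> str:
    """Apply the realization rules described above to a romanized
    phonological form: /ei/ -> [ee], /ou/ -> [oo], and /ki/ written
    with a capital K for the palatalized stop."""
    rep = phonological.replace("ei", "ee").replace("ou", "oo")
    return rep.replace("ki", "Ki")

print(abstract_representation("keiki"))  # keeKi ('business conditions')
print(abstract_representation("keeki"))  # keeKi ('cake')
print(abstract_representation("koudo"))  # koodo ('altitude, height')
print(abstract_representation("koodo"))  # koodo ('cord')
```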

The following tables show the alphabetic symbols used for the symbolic representation in lexical access.


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. •a -u -o

a-

sa su se

cu

na nu ne no n-

ma mi mu me mo ni-

rura ro

wa w-

210

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. &m •a W l -i ?m •u am -e tsm -o

K'n ii ga 2 * Gi <* 7 gu If ke zi ko ea- tffr s -? za C V Zi ■r X zu 1f -tf ze :J zo z- t£'n ti da % Zi o zu V ■T de a K do d- It /< ba XS t' Bi bu s< /< be it bo b- liff It /-v pa IS fc: Pi -S' pu pe it po p-

211

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. fc5»l .a L'JIJ .j 5?i] -u fim - 3 ^ V fr ^ V *>tp *3 X £ cfc Ca Cu Cp Co ty- * v ^ X ^ x =f- 3

—vfir lev \Znp l~x 1-

t vfr 1>*P 1>X Ha He Ho hy- t v t l t X t 3 b$> b x b n Ma Mu Me Mo my- 5 V 5 x 5 x 5 3

■J Vfr y * y n> y x y * Ra Ru Re Ro rv- 'J V ' j i y x 'J 3

fcM -a .j 55i| -u X.9I e fc5»l -o ^ v f i iTx Ga Gu G p Go gya- ^ V * x * x * 3 v v f r C v £*> L x Cn Za 7 m Zp Zo zya- v v v x v x V 3 ^ V ff *>*V *>'x *>'cfc Za Zu 7p Zo dya- ^ V * x * x ^ 3 t v f f tf v V# tf x

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. m \ •a I'5*1 -i o5*l -u *.5*1 -e is 5*1 -o

07fr o i' 5 X. o ft wi we wo W O-r Ox 0*-

< & Ct' <* X <*ft Wa Wi We Wo kw- 0 7 0 -r 0 *

£5*1 -a l'5*l -i o5*l -u *.5*1 -e £5*1 -o 0 7 f r Oft Ol.' O x Oft ca ci cu ce co ts- " J 7 O-f O x "Jt

X r ' f t r * t* I' f x t t s za zi zu ze zo zw- X 7 X-r X x X i r

0 7 f r -S'ft -S'I' -S' x -S'ft' fa fi fu fe fo f- 0 7 0-< O x o *

0*7fT O ' ft o ' I' 0* x 0* ft va vi vu ve VO V- 0 7 OV O x 0 >

Moraic Nasal Geminate Consonant Long vowel f\j -o -- double N Q > v -- vowels

213

APPENDIX B: The 300 stimulus pairs used in a similarity judgment experiment

The first-stimulus accent patterns are shown as a function of the second-stimulus accent patterns. Shaded cells in the table indicate that these pairs were not presented to participants. A star (*) indicates an accented high pitch, 1 indicates an unaccented high pitch, and 0 indicates a low pitch.

[The table itself is not legible in this reproduction. It crosses the first-stimulus accent patterns with the second-stimulus accent patterns for the 300 stimulus pairs.]

Appendix C: 700 target words

The 700 target words that were used in Experiments 2, 3, and 4 are shown in the table below. The three different neighborhood densities, word frequency, uniqueness point (UP), duration, and first mora frequency are also included in the same table.
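Because the scanned table is heavily garbled, the structure of each row is summarized below as a small record sketch; the field names are descriptive labels for the columns listed above, not identifiers used in the dissertation, and the field order simply follows the listing in the preceding paragraph.

```python
from dataclasses import dataclass

@dataclass
class TargetWord:
    """One row of the Appendix C table; names and types are descriptive
    assumptions, not labels used in the dissertation."""
    word: str                    # romanized target word
    gloss: str                   # English gloss
    nd_segments: float           # neighborhood density, Segments calculation
    nd_segments_pitch: float     # neighborhood density, Segments + Pitch
    nd_auditory: float           # neighborhood density, Auditory calculation
    word_frequency: float        # word frequency
    uniqueness_point: int        # uniqueness point (UP)
    duration: float              # stimulus duration (unit as in the table)
    first_mora_frequency: float  # frequency of the word-initial mora
```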

Target Words Neighborhood

■r. m m5 W ord Gloss S z*Sfi £L 1* 1* M ora Auditory Word Frequency Duration + + Pilch frequency Segments kabotya t Mate pumpkin 0 0 0.750 2.9031 5 606 0.0542 kabuka M stock prices 6 5 14.846 4.1691 7 550 0.0542 kabuki ( Japanese kabuki 10 / 6.165 3.6791 7 595 0.0542 play) kabure rush 11 10 115.494 2.1206 6 511 0.0542 katiku mm livestock 10 5 0.510 3.1735 7 575 0.0542 katime winning point 2 t 13.102 2.4378 6 529 0.0542 katura A O b wig 7 5 7.937 2.8082 7 587 0.0542 type, printing katuzi 7 3 0.S69 3.3071 7 611 0.0542 S5* type kadode nta departure I 1 154.389 2.4871 7 539 0.0542 kagami 31 mirror 24 16 35.619 3.3867 7 594 0.0542 kagiri PgiJ limit 13 4 215.489 4.3405 6 497 0.0542 bird-in-the-cage' kagome A'CTtf) 4 4 123.741 2.0934 7 571 0.0542 game kakaku ffilS price 51 31 6.239 4.8105 7 552 0.0542 kakari % charge 24 3 64.216 3.3918 7 552 0.0542 kakasi scarecrow 17 10 0.551 2.1703 5 625 0.0542 kakera fcitb a broke piece 5 3 28.112 2.6776 5 534 0.0542 kakine fence 7 I 0.696 3.1389 7 582 0.0542 kakomi H Si- enclosure 4 4 22.260 1.9590 5 540 0.0542 kakuti various places 22 10 4.188 4.3767 7 548 0.0542 kakudo nm angle 16 14 30.035 3.3813 7 535 0.0542 kakugo resolution 16 8 42.974 3.7976 7 539 0.0542 continued

215

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood >.

Word Gloss a. s §■ Word Frequency D uration Auditory + + Pitch Segments Segments 3 maintenance, kakuho 9 7 54.300 4.4668 7 544 0.0542 secure kakuri mm isolation 37 18 94.177 3.0577 7 489 0.0542 kakusa tom gap, difference 9 6 21.541 4.0602 7 539 0.0542 everyone, each kakuzi 35 16 0.600 2.7723 7 612 0.0542 one oven, cooking kamado 2 0 97.944 2.5198 7 545 0.0542 n stove kamasu barracuda 5 3 18.438 1.7559 7 612 0.0542 kamera i i * 7 camera 3 0 104.096 3.9054 7 527 0.0542 kamocu goods 3 1 20.115 3.6224 5 572 0.0542 kamoku n i course, subject 24 10 26.403 3.5569 7 536 0.0542 kamome a seagull 6 5 30.437 3.0306 5 576 0.0542 kanagu metal Fixtures •> 1 4.580 2.7672 7 630 0.0542 kaniku mm flesh 13 5 7.189 2.1703 5 597 0.0542 kanozyo she. sweetheart 12 5 129.781 4.1527 5 502 0.0542 karada body 7 3 74.720 4.8666 7 490 0.0542 karami sharp taste 17 15 198.753 3.3381 7 534 0.0542 karasi mustard 18 14 5.270 2.2648 7 629 0.0542 karasu crow 12 5 34.579 3.0860 7 560 0.0542 karate (marshal karate 5 4 13.295 2.6955 7 563 0.0542 art) kareha teH dead (dry) leaf 0 0 110.212 2.4249 5 563 0.0542 kareki t e * i * dead tree 17 to 31.375 2.1703 7 560 0.0542 karesi m c boy friend 4 0 104.514 1.9638 5 529 0.0542 karite borrower 12 9 10.642 3.1578 6 577 0.0542 karyoku 4c* heat 31 14 89.793 2.9025 7 508 0.0542 karusa &£ lightness 10 1 22.804 2.5809 7 559 0.0542 karuta cards 4 1 167.636 2.5490 6 506 0.0542 karute J l/T chart 3 1 144.483 3.0484 7 485 0.0542 kasane S f e pile, layer 5 4 2.699 3.2925 7 604 0.0542 kasegi mz income 4 0 0.427 3.3107 7 554 0.0542 kaseki \KE fossil 24 13 8.871 3.3703 7 535 0.0542 kasitu i f i * mistake 17 11 3.142 3.2514 7 760 0.0542 kasira n head 7 7 33.385 4.3792 7 574 0.0542 kasoku acceleration 35 18 2.738 3.8015 7 618 0.0542 continued 216

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood >> Cl sSI W ord Gloss 7 s £ er • I M MI ora M D uration + + Pitch Auditory frequency Segments Segments ? ill 3 kasumi 3 misc 13 12 7.389 2.7832 7 629 0.0542 katati K form, shape 21 17 1.981 4.7856 7 608 0.0542 katana 71 sword 5 3 24.954 2.7738 7 517 0.0542 katari Sk) narration 35 25 54.101 3.3249 7 462 0.0542 katate ft* one hand to 7 6.167 3.1136 7 588 0.0542 kawaki VlZ dryness 9 8 6.925 1.9777 7 588 0.0542 kawari f substitution 21 17 207.951 3.7667 7 522 0.0542 kawase &£ powder 7 5 9.114 3.6978 7 600 0.0542 kayaku :km powder 42 26 8.104 2.9143 7 621 0.0542 kazari ornament 15 10 117.313 3.0584 7 562 0.0542 kazicu mm fruit 13 S 27.087 3.0406 7 581 0.0542 kazino casino 0 0 41.125 2.7251 5 565 0.0542 kazoku nm family 29 14 50.468 4.6811 7 539 0.0542 kecuzyo Ma lack 9 5 2.246 3.0860 7 541 0.0170 kedama pill 6 4 47.279 1.6628 6 589 0.0170 a crab of the kegani family i T 74.778 0.6990 5 581 0.0170 Atelecyclidae kegare 55*1 pollution, stain 1 I 180.385 3.2082 6 531 0.0170 kegawa fur *» I 8.543 2.9791 5 614 0.0170 kemono *K beast 9 6 146.514 2.8075 7 554 0.0170 kemuri « smoke 6 5 133.222 3.5904 7 544 0.0170 kemusi hairy caterpillar 4 4 39.342 1.9138 7 598 0.0170 kenami fine fur 2 1 173.561 1.9395 5 529 0.0170 kenuki m z hair tweezers 11 9 12.892 1.4472 7 605 0.0170 kesiki scene, view 13 4 4.633 2.9974 7 496 0.0170 difference. kezime 9 9 79.844 3.5093 7 566 0.0170 distinction kibori *» y wood carving 10 8 20.139 2.5391 5 563 0.0241 kituke fitting 13 11 0.000 2.3054 7 607 0.0241 kitune 2a fox 3 3 0.000 2.9390 7 645 0.0241 disposition. kidate 4 3 2.384 1.2553 5 624 0.0241 m&T nature undulations, ups kihuku 32 19 0.000 2.7016 7 654 0.0241 gf* and downs continued

217

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood

W ord Gloss * 1“ 1“ M ora Word Frequency D uration frequency Auditory Segments Segments + Pitch 3 constraint, kigane 6 5 99.667 2.6812 5 572 0.0241 hesitation kigeki mm comedy 11 4 3.047 3.2177 5 568 0.0241 kihaku mm rare, thin (air) 27 24 9.209 2.9983 7 621 0.0241 go home, return kikoku 48 25 6.080 4.4291 7 605 0.0241 m home kikori m woodcutter 10 8 58.687 1.4624 7 530 0.0241 kimatu m* the end of a term 13 to 1.159 2.8344 5 657 0.0241 kimari rule 6 3 121.839 3.0885 7 538 0.0241 conclusive kimete I 1 3.549 3.4196 6 610 0.0241 evidence kimoti feeling 8 6 0.268 4.4569 5 650 0.0241 kimono, clothes 19 11 12.927 3.2799 5 593 0.0241 kimuti Korean pickles 5 2 18.257 2.5092 5 585 0.0241 kinako soybean flour 2 2 15.699 1.9956 6 557 0.0241 kinoko m mushroom 19 8 46.914 3.5032 6 510 0.0241 kiretu *3! crack 15 10 0.683 3.5592 5 643 0.0241 kireme gap. break 3 3 15.583 2.8745 6 595 0.0241 kirimi « j y * slice 6 5 58.099 2.3766 7 570 0.0241 kiryoku energy 28 19 0.510 3.0286 7 627 0.0241 kiroku ten record 35 17 7.521 4.5681 7 562 0.0241 kisetu mw season 31 5 0.980 3.9120 7 612 0.0241 kiseki miracle 41 25 3.149 3.0671 7 604 0.0241 kisibe £22 shore, bank I I 5.510 2.5145 7 582 0.0241 kisoku mi rule, regulation 35 7 1.227 3.6867 7 600 0.0241 return home, get kitaku 32 30 4.200 3.7139 7 598 0.0241 home kiteki whistle 11 9 2.471 2.3711 5 569 0.0241 kitoku fern critical 31 25 1.055 2.5276 7 586 0.0241 kiwame extremity 1 I 19.465 3.6049 7 611 0.0241 looking thinner kiyase I 1 11.068 0.3010 6 586 0.0241 mm* when dressed kizasi & L symptom 7 3 20.507 3.6917 5 580 0.0241 kizetu faint 16 9 0.540 2.0253 7 660 0.0241 kizitu MB time limit 28 13 11.199 3.3222 7 578 0.0241 kizoku mm noble | 32 13 86.609 3.0538 7 522 0.0241 continued 218

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood

W ord Gloss m l k< M oral k< Word Frequency Duration + + Pitch Auditory frequency Segments Segments 5 kizyutu description 16 4 2.003 3.6690 7 668 0.0241 kobaka 'MS looking down on 4 3 4.926 1.4771 5 552 0.0443 kobetu mm individual 8 j 8.669 4.0399 7 614 0.0443 kobito d ' A dwarf I 0 7.268 2.4298 5 603 0.0443 kobune /]'» boat 6 5 9.157 2.4265 7 585 0.0443 kobura cobra 9 2 185.084 1.8062 6 513 0.0443 kobusi m fist 13 5 0.594 2.9595 7 674 0.0443 kodom o =f-m child 4 j 70.920 4.7365 7 509 0.0443 kogetya m i m umber I 0 10.134 1.9731 7 569 0.0443 kogoto scolding, rebuke 9 3 21.967 2.2765 7 566 0.0443 kokage the shade of a tree 8 7 32.629 2.4281 7 566 0.0443 kokyaku customer, client 19 12 0.514 3.8519 4 680 0.0443 Japanese painted kokesi 5 0 4.735 2.1206 7 609 0.0443 C l t L wooden doll kokoti feeling, sensation 12 ■> 1.473 2.7875 7 576 0.0443 kokoro heart 18 0 40.878 4.5364 7 489 0.0443 kokuti notification 47 16 3.227 3.2620 7 513 0.0443 kokudo BB ± country, territory 25 21 91.230 3.7717 7 486 0.0443 kokugi m t a national game 33 22 39.906 2.2355 6 566 0.0443 kokugo mm national language 27 8 38.485 3.4961 7 569 0.0443 kokuso £ 3 ? accusation 20 14 7.567 3.3636 7 526 0.0443 kom aku eardrum 18 12 4.006 2.2810 6 523 0.0443 happened to komimi 1 I 4.249 1.4624 5 608 0.0443 overhear komoti has children 18 5 4.454 2.2856 5 618 0.0443 kom ono 'M il gadget 5 2 7.427 2.7292 5 622 0.0443 komori w y nursing 24 10 81.582 2.5877 7 520 0.0443 komugi /J'* wheat 5 I 83.713 3.2524 5 532 0.0443 konoha leaf 5 4 61.516 2.4487 7 513 0.0443 konoyo this world 2 2 105.140 3.3440 5 501 0.0443 koramu column I I 463.095 3.1864 5 490 0.0443 korera 3 cholera 2 0 196.107 4.4654 5 463 0.0443

koritu v l s l isolation 14 7 13.085 3.7201 5 599 0.0443 korom o i t clothes, dress 3 3 23.327 3.1159 7 585 0.0443 korosi SL murder 32 8 2.439 3.1014 7 683 0.0443 continued 219

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood 2 SB s m0* £ — W ord Gloss E x 5 ac a. m 1“ 1“ M ora Word Frequency D uration frequency £ a + Auditory koruku 3 J l/£ cork 2 2 100.286 1.9542 5 527 0.0443 kosame /I'M drizzle, light rain 3 2 32.366 2.7084 5 575 0.0443 koseki F lf register 29 14 10.092 3.2847 7 571 0.0443 kosuto 37. h cost 5 5 8.931 4.1152 5 505 0.0443 Japanese fireplace kotatu 11 6 7.176 2.7612 5 614 0.0443 km with a coverlet kotoba -sm language 4 I 10.523 4.7652 7 561 0.0443 kotori small bird IS S 48.335 2.8267 5 551 0.0443 kotos i this year 29 7 2.038 4.9870 5 616 0.0443 kowasa fear, fearful ness •> 1 98.961 3.0026 5 525 0.0443 koyaku =¥-& child actor 27 18 0.000 2.4771 7 661 0.0443 koyomi m calendar 9 5 309.921 2.8704 5 556 0.0443 koyubi /Mi little finger 3 3 52.744 2.6201 5 584 0.0443 kozara /j'nn. small plate 5 ■) 10.488 1.9777 7 545 0.0443 kozeni /MS small change 0 0 57.494 2.6064 5 586 0.0443 koziki £ £ beggar 11 5 S.064 2.4232 6 594 0.0443 koziwa /J'SS little wrinkles 0 0 142.333 1.0792 5 532 0.0443 kubetu distinction 9 j 57.119 3.6340 ■* 546 0.0163 kubiwa collar, necklace II 32.771 2.0719 5 568 0.0163 kudari Ty decline 7 6 276.128 3.5533 7 491 0.0163 kudoki persuade 3 I 14.473 1.8195 6 584 0.0163 kudosa lengthy, tedious 2 1 45.325 1.0414 5 558 0.0163 kugatu September 2 2 4.192 4.3808 5 592 0.0163 kugiri K«y end. pause, stop 2 I 74.368 3.2669 5 553 0.0163 division, section, kukaku 21 15 2.175 2.8681 5 603 0.0163 E li block kumade m # rake 2 2 227.120 1.8129 7 532 0.0163 kumori 5tj cloudy weather 6 4 139.789 2.8182 7 459 0.0163 kurabu club 5 4 322.708 3.9893 7 498 0.0163 kurage < blf jellyfish 4 2 227.442 2.1987 6 564 0.0163 kurasa <7 37. darkness 9 I 25.504 2.5694 6 568 0.0163 kurasi L living 5 4 30.672 4.0465 7 525 0.0163 kurasu ^ 7 7 7 class 10 S 105.656 4.0001 7 510 0.0163 kurosa black 3 2 40.294 1.2788 6 541 0.0163 kurosu ^ 7 3 7 . cross 6 5 145.723 2.4362 7 541 0.0163 continued 220


3 §s ■= w W ord Gloss s — BC ft, & T T M ora W ord Frequency D uration A uditory frequency Segments 3 + 3 kurozi be in the black 3 8 49.951 4.1974 7 510 0.0163 kuruma m car. vehicle 3 2 123.198 4.7690 7 476 0.0163 kurumi < walnut 9 7 175.738 2.3927 6 516 0.0163 kusaki plants 9 i 1.695 2.6503 7 548 0.0163 kusyami < L*><* sneeze 1 38.352 2.2455 5 491 0.0163 kusami bad smell 8 2 51.494 2.0934 6 491 0.0163 kusari m chain 10 9 97.088 2.9191 7 489 0.0163 kusasa na? bad smell 5 I 4.142 2.4116 7 539 0.0163 kusuri m medicine 10 8 22.307 3.9936 7 542 0.0163 kuzyaku peacock 6 4 10.421 2.1790 7 632 0.0163 kuzira i s whale 0 0 119.898 3.4069 7 538 0.0163 kuzure mti crumble, collapse 4 T 76.475 2.5441 5 527 0.0163 mabuta Sk eyelid 1 0 29.545 2.6170 5 502 0.0141 matubi Mm the end S 4 2.852 2.7126 6 587 0.0141 matuge £-3 If eyelashes 6 3 19.946 1.8388 6 528 0.0141 maturi «y festival 9 2 17.519 3.6821 7 587 0.0141 madamu madam 1 1 441.265 2.0043 5521 0.0141 madori ray layout of a house 6 5 90.519 2.5224 7 560 0.0141 mahuyu midwinter 0 0 48.074 2.3385 7 617 0.0141 magiwa the last moment 2 0 2.501 3.0457 5 619 0.0141 maguma ■7 magma 4 0 91.912 2.8915 5 520 0.0141 magure MCti lucky hit 2 0 76.741 2.0086 7 542 0.0141 maguro -7?a tuna 4 1 33.775 3.2041 6 581 0.0141 makoto m truth 2 2 1.813 4.3027 5 581 0.0141 makura pillow S 2 21.296 2.7308 7 533 0.0141 mamizu X tR fresh water 2 1 2.240 2.4487 5 648 0.0141 mamono evil spirit, devil S 7 191.018 2.1335 5 553 0.0141 mamori ^y defense 6 3 74.609 3.6824 7 564 0.0141 mamusi adder, viper 4 4 7.537 2.1614 5 667 0.0141 manabi learning 4 i 70.338 3.9208 5 570 0.0141 m anatu XX midsummer 5 I 7.070 2.6937 7 639 0.0141 manako m eye 5 0 18.343 3.0120 6 573 0.0141 marine 7 'J* marinate 2 0 316.325 1.6021 5 518 0.0141 m aryoku SKA charm 6 2 70.458 2.3284 3 558 0.0141 continued 221


W ord Gloss * 1“ 1“ M ora Word Frequency D uration frequency Auditory Segments + Pitch Segments 5 maruku mark 4 3 102.941 3.5319 7 547 0.0141 marumi roundness 6 5 71.496 2.4232 7 577 0.0141 maruta AX log 3 1 1.595 3.1717 7 619 0.0141 maruhi secret 2 2 60.720 2.1732 7 599 0.0141 masatu mm friction 7 6 3.069 3.7549 5 657 0.0141 masita MT right under, below 6 5 0.961 2.7324 5 634 0.0141 masuku 7 X ^ 7 mask 3 3 7.421 3.0955 7 519 0.0141 masuto ■77. h mast 5 5 19.002 2.4330 7 554 0.0141 mawari wy around 9 7 142.617 4.2835 7 578 0.0141 sumo wrestlers mawasi 6 j 3.725 3.0430 7 629 0.0141 @L loincloth mahiru Mm midday 2 t 35.533 1.8751 5 604 0.0141 mayaku mm drug, opiate 9 7 22.072 3.7174 6 600 0.0141 mayoke flUBEit amulet, charm 4 3 0.796 0.3010 6 638 0.0141 mavoko mm side 3 2 3.771 2.1139 7 612 0.0141 mayuge eyebrow 3 2 464.494 1.7243 5 530 0.0141 mazyucu mm magic 5 4 27.446 2.3502 4 545 0.0141 mazusa poor, not good 3 I 34.654 2.5224 5 569 0.0141 metuki eyes, look 14 3 j 5.195 1.9823 6 517 0.0092 medama eyeball 5 3 56.549 3.5937 7 585 0.0092 megami xn goddess 3 1 201.910 2.6395 5 532 0.0092 megane 1691 glasses 2 0 51.030 3.8140 7 549 0.0092 mehana g* take shape 2 I 3.567 1.9956 7 603 0.0092 memori SSU memory 4 4 265.940 2.2967 7 584 0.0092 mesaki immediate to 9 11.920 3.6654 7 613 0.0092 mesibe HE pistil 0 0 16.450 1.6812 6 538 0.0092 metaru metal 4 2 1.303 1.9638 5 623 0.0092 metoro * hn Metro 1 1 35.889 1.7924 7 513 0.0092 meyani i n eye mucus 0 0 6.277 1.3802 5 633 0.0092 meyasu g£ standard, aim 0 0 12.177 3.4510 5 601 0.0092 the comer of the meziri 3 0 151.257 1.0414 6 530 0.0092 eye miburi ##y gesture 5 t 251.062 1.6232 6 523 0.0131 mituba trefoil 3 I 24.157 2.4133 7 559 0.0131 mitudo density 4 2 5.666 3.1926 7 584 0.0131 continued 222


W ord Gloss eu I11 M oraI11 Word Frequency Duration Auditory frequency + + Pitch Segments Segments 3 mitugo H'D-?- triplet 7 2 0.000 2.1523 6 671 0.0131 mituyu mm smuggling 0 0 0.000 3.3979 7 689 0.0131 disorder. midare 2 2 172.869 3.4717 7 541 0.0131 a*i confusion midasi MiiJL heading, headline 7 6 11.079 3.4280 7 600 0.0131 midori m green 15 I 412.643 4.0304 7 475 0.0131 migaki S t polish 5 3 4.110 3.0149 7 616 0.0131 migara am one’s person I L 26.424 3.4473 6 598 0.0131 migite right hand 0 0 0.709 3.3901 5 631 0.0131 the body of a m igoro 6 4 36.479 0.9031 7 591 0.0131 garment mihari jt»y watch 4 I 233.171 2.8215 6 569 0.0131 mikaku taste 24 24 3.603 2.7860 6 639 0.0131 mikata wn friend, one’s side 8 0 0.711 4.6095 7 574 0.0131 mikiri jusiy give up 5 3 11.793 2.7582 7 620 0.0131 mikomi expectation 8 7 23.466 4.1405 6 567 0.0131 III □ mikuro micro 8 4 17.080 2.4346 7 575 0.0131 2 H mimizu Ill III earthworm 3 16.945 2.3502 7 604 0.0131 ! ! identity. m im oto 13 12 7.699 3.5054 7 624 0.0131 &7Z background minami m south 8 5 103.716 3.9672 7 587 0.0131 minari dress, appearance 10 2 246.832 2.2095 6 476 0.0131 m inato harbor, port 2 2 7.058 3.6022 5 630 0.0131 minori crop, harvest 11 6 195.686 4.0280 7 552 0.0131 charm. m iryoku 16 5 0.754 4.0538 5 618 0.0131 mti fascination miruku milk 5 4 111.744 2.9689 7 546 0.0131 m cape 15 11 3.582 2.9763 6 646 0.0131 miseba Hit* highlight, climax 2 0 52.421 2.5717 7 573 0.0131 misesu Mrs. I 0 0.000 1.8751 6 60S 0.0131 m itome 8tf> one’s seal, signet 7 5 19.279 2.8555 5 589 0.0131 present, gift, miyage 3 2 129.258 3.0920 7 530 0.0131 ±m souvenir m iyako & capital 1 0 24.710 4.2747 7 567 0.0131 miyori relative 9 8 34.689 | 2.4698 5 589 0.0131 mizugi ** swimming suit 10 6 44.331 2.9154 7 556 0.0131 continued 223


s W ord Gloss i ft. l “ l “ M ora Word Frequency Auditory D uration frequency + + Pitch £ Segments 3 mizuke moisture 10 8 5.527 3.0099 7 602 0.0131 mizuni boiling in water 8 7 18.178 2.1206 6 623 0.0131 moderu model 3 2 257.736 3.9446 7 470 0.0107 modori return 7 4 109.309 2.8149 7 574 0.0107 mohuku Rflfi mourning dress 12 7 1.726 2.2648 3 617 0.0107 mogura * 7 ? mole 6 5 7.071 2.4900 6 606 0.0107 mohaya already, no more 0 0 51.414 0.3010 5 560 0.0107 mokuba wooden horse 5 2 17.829 2.5465 6 577 0.0107 mokuhi m m silence 15 7 0.242 2.6590 7 661 0.0107 mokuzi a table of contents 28 14 1.130 2.2553 5 635 0.0107 momizi maple 1 I 23.748 3.0678 7 553 0.0107 a waffle stuffed monaka 5 1 36.384 3.4536 4 542 0.0107 ft4 > with bean jam m oram morals 0 0 226.904 3.2079 6 517 0.0107 groping (for. *7 mosaku m m 5 4 0.235 3.8093 / 685 0.0107 about) motome request 5 3 16.448 3.1931 6 589 0.0107 moyasi f c - P L bean sprout 4 4 8.394 2.4362 7 641 0.0107 moyori neighboring 5 4 311.086 2.8555 5 564 0.0107 mugitya R* barley tea 0 0 0.000 2.0531 5 636 0.0075 mukade centipede 2 2 1.928 1.9956 5 633 0.0075 mukasi # old times 5 3 3.155 3.9600 7 615 0.0075 munage hair on the chest 5 j 104.342 1.3617 6 562 0.0075 musiba AM decayed tooth 0 0 45.382 2.8445 7 564 0.0075 musubi knot, conclusion 2 I 68.606 2.9600 7 567 0.0075 m usuko son 0 0 0.800 4.2044 5 584 0.0075 musume 5ft daughter 3 3 25.659 4.2775 7 574 0.0075 muzicu m m innocent 2 2 25.001 3.0249 5 564 0.0075 muziko m m tSL accident-free 5 0 7.406 2.1818 6 523 0.0075 in summer, during natuba 4 2 3.954 3.0103 7 619 0.0142 H the summertime strong summer natubi MB 5 2 10.187 1.6990 7 576 0.0142 sunshine nadare avalanche 3 L 152.037 3.1021 6 575 0.0142 nahuda name tag 2 0 28.756 2.5740 5 570 0.0142 nagame m tt) view, landscape 8 6 42.356 2.7202 6 610 0.0142 continued 224


W ord Gloss & 1*' M1*' ora Word Frequency D uration frequency + + Pitch Auditory Segments Segments 3 nagare stream, current 8 5 139.092 4.3597 7 536 0.0142 nagasa &£ length 4 0 5.646 3.9547 6 610 0.0142 nagasi 35 L sink 18 10 10.790 3.2302 7 603 0.0142 nageki grief, sorrow 7 6 8.908 2.9263 7 607 0.0142 nagisa f t beach, shore I 0 0.711 2.7868 5 625 0.0142 nakama m circle, company 5 2 6.390 4.1748 7 593 0.0142 nakami contents 9 3 12.377 3.9392 7 557 0.0142 namako sea cucumber 10 7 1.339 2.1271 7 630 0.0142 namami mortal 11 1 102.247 2.6542 7 524 0.0142 namazu f t cattish 6 5 1.163 2.3502 7 683 0.0142 a kind of nameko 3 3 11.314 1.S451 7 595 0.0142 tz to z mushroom namida ;£ tears 4 0 1.976 3.8587 7 635 0.0142 namiki ft* a row o f trees 8 2 9.127 2.9795 6 643 0.0142 nanatu • t o seven 4 1 13.475 3.0086 7 569 0.0142 the seventh day of nanoka 2 1 13.954 0.3010 7 616 0.0142 -ta the month announcing nanori 9 9 397.261 3.3201 7 537 0.0142 «*y oneself narabi mu row. line 7 8.849 3.5369 7 610 0.0142 nasake t i t t sympathy, pity I 0 0.322 3.1626 7 560 0.0142 nasubi t £ t U eggplant 2 0 44.664 2.0253 5 543 0.0142 2 nayami 1m &■ sufferings, worry 4 58.041 3.8258 6 614 0.0142 nazasi nominate 9 6 2.081 3.2574 4 668 0.0142 nazimi familiarity 3 2 6.563 3.3570 7 637 0.0142 nazuke * # i t naming 4 3 1.566 1.5441 7 644 0.0142 nebari «y stickiness 4 4 117.227 3.2292 7 564 0.0071 nebiki f i ? i t discount 7 6 7.288 3.3243 7 656 0.0071 necuki fall asleep 13 8 2.303 1.4624 7 614 0.0071 nedoko a s bed 3 2 1.522 2.4232 7 644 0.0071 negoto a # sleeper's talk 3 2 5.923 1.8129 5 608 0.0071 neguse a$ bed head 0 0 0.469 1.0792 5 669 0.0071 nekoze stoop 0 0 0.000 1.6721 5 625 0.0071 nemoto JB5c root, bottom 4 3 10.145 3.6464 5 648 0.0071 nemuke Kft sleepiness 5 5 16.396 2.1818 7 565 0.0071 continued


W ord Gloss a. 1“ 1“ M ora Frequency Auditory W ord D uration frequency + + Pitch Segments Segments 3 nemuri Ig iJ sleep 2 1 163.067 2.8854 7 501 0.0071 nemusa a g $ sleepiness I I 4.955 0.9031 5 586 0.0071 (price) reduction, nesage 3 3 1.392 3.8118 6 646 0.0071 markdown newaza lying-down trick I 1 2.409 2.2529 5 660 0.0071 neziri « y twist, screw 6 5 100.983 1.9777 7 559 0.0071 nezumi a rat. mouse 4 4 5.040 3.3126 7 640 0.0071 dried small nibosi 7 6 8.155 2.0170 4 645 0.0097 sardines nibusa nz dullness j 3 62.689 2.2989 5 523 0.0097 nitizi HB# time, date 4 0 13.271 3.2055 6 528 0.0097 nituke *ttit hard-boiled food 13 13 2.524 2.1584 6 601 0.0097 nigatu -a February I I 2.005 4.2669 5 659 0.0097 bitterness, bitter nigami 9 7 23.280 1.6335 7 609 0.0097 taste nigate weak point 3 3 1.481 3.3623 7 661 0.0097 nigeba , shelter 0 0 24.146 2.4133 5 593 0.0097 nigiri « y grasp, grip 12 11 50.060 3.1959 7 589 0.0097 nigori ay muddiness 11 9 125.011 2.2455 7 535 0.0097 nikibi pimple 3 0 92.940 2.0170 5 525 0.0097 nikomi stew 7 5 9.315 2.2095 5 616 0.0097 nikusa it* hate, hatred 6 2 1.579 2.2304 6 604 0.0097 nimame HksL boiled beans 4 3 110.127 1.8513 5 583 0.0097 nimotu .luggage 1 0 0.271 3.5342 7 605 0.0097 nimono stewed dish 11 8 201.593 2.5877 5 544 0.0097 nisyoku - f t two colors 19 3 0.000 1.8573 7 648 0.0097 niziru =t;+ soup, broth 1 1 47.213 2.6998 6 599 0.0097 nobara If## wild rose 4 2 128.452 1.6812 5 523 0.0068 wild nogiku 4 2 57.213 1.6435 5 547 0.0068 if* chrysanthemum nohara if® field I 1 82.081 2.5611 4 528 0.0068 nokori »y rest, remainder 9 7 25.994 4.1583 7 538 0.0068 nomiya tavern, bar 3 0 39.391 2.5899 5 523 0.0068 noriba ay** station 2 2 115.417 2.3674 5 524 0.0068 norosa slowness 3 3 25.660 0.9542 5 581 0.0068 noruma norm 0 0 164.951 2.8241 7 549 0.0068 continued 226


W ord Gloss m 1“ 1“ M ora Word Frequency D uration + + Pitch Auditory frequency Segments Segments 3 noyama i f Lit hills and fields 2 1 91.599 2.5763 5 553 0.0068 nozomi wish, desire 2 2 108.824 3.4874 6 554 0.0068 nozyuku » s camping out L 0 36.843 2.5224 4 564 0.0068 nukege t t ( t € fallen hairs I 1 7.662 2.4609 6 613 0.0024 numeri slime, sliminess 0 0 420.085 1.8751 5 583 0.0024 nunozi cloth 0 0 20.212 2.5211 5 618 0.0024 nusumi theft, stealing 3 3 2.094 3.2143 7 656 0.0024 pakuri \£ < y shoplifting 9 3 34.229 0.3010 7 545 0.0030 “i panama Panama j ~i 2.248 3.3471 4 585 0.0030 paneru panel 1 0 609.670 3.4490 7 567 0.0030 parupu pulp 1 I 192.196 3.1007 5 499 0.0030 paseri parsley 1 0 152.053 2.5172 4 502 0.0030 pasuta pasta 0 0 0.000 2.2405 6 585 0.0030 pazyama pajamas 2 1 82.572 2.7076 3 528 0.0030 pazuru puzzle 0 0 131.534 2.4624 3 532 0.0030 pedaru pedal 2 1 271.349 2.5832 5 510 0.0012 pirahu £ pilaf 3 3 229.883 2.2355 5 510 0.0012 popura T t W poplar, aspen 1 I 45.442 2.3139 6 503 0.0011 porisu police 0 0 182.363 1.7853 5 508 0.0011 poruno 7t'^U>r pornography 0 0 471.249 2.9952 5 471 0.0011 posuto /f'X h mailbox 3 3 21.192 4.0394 6 536 0.00 It poteto /tf-T h potato 1 1 7.216 2.1335 5 492 0.0011 potohu /t* h 7 pot-au-feu 1 0 2.657 0.9031 7 530 0.0021 puragu plug 5 1 152.901 2.0864 7 513 0.0021 puramu plum 8 5 381.905 1.5315 6 475 0.0021 purasu plus 11 6 106.678 3.8809 7 523 0.0021 puraza y=j*f plaza 1 0 94.775 2.6911 7 533 0.0021 puresu press 5 j S3.166 2.9680 7 538 0.0271 sabaki judgment 8 6 0.231 2.9079 7 608 0.0271 sabaku $31 desert 7 5 1.179 3.6845 7 689 0.0271 sabetu mm discrimination 7 2 3.532 4.0775 4 597 0.0271 syaberu shovel 3 3 47.200 3.4S78 5 528 0.0271 satuki azalea 3 1 0.000 4.3416 7 734 0.0271 sadame law. rule, decision 1 I 9.702 2.5763 7 621 0.0271 continued 227


Word Gloss • 1M Mora 1M Word Frequency Duration + + Pitch Auditory frequency Segments Segments 2 sadoru ■*K;u saddle 0 0 8.043 3.6732 5 616 0.0271 safari 'J safari 6 1 14.564 1.8692 7 586 0.0271 difference, sagaku 18 6 0.697 1.9638 7 672 0.0053 mm balance syageki firing, shooting 11 8 0.000 3.2453 4 715 0.0271 saguri m sounding, probe 8 6 2.096 3.0952 7 617 0.0271 sakaba igl* bar, tavern 6 5 1.159 2.5705 7 637 0.0271 sakana it fish 7 4 0.237 2.6618 7 637 0.0271 grasp (a knife) sakate with the point 9 8 0.234 3.8851 7 631 0.0271 downward sakaya mm liquor store 4 3 1.202 2.9274 7 607 0.0271 sakotu «# collarbone 3 2 0.000 2.7135 5 690 0.0271 sakoku mm national isolation 10 3 0.000 1.9868 5 690 0.0271 sakusya author, writer 25 10 0.000 2.8254 6 617 0.0271 sakuya last night 15 0 0.651 3.6851 7 678 0.0053 syakuya f t * rented house 8 3 0.000 2.2810 7 678 0.0271 elimination. sakuzyo 9 5 0.000 3.0535 7 635 0.0053 mr deletion syamozi ft*? ladle I 0 17.299 3.5705 5 614 0.0271 samuke chill 1 I 0.000 2.0492 5 672 0.0271 sam usa coldness 0 0 0.000 2.7574 5 666 0.0271 sanagi chrysalis 4 2 1.653 3.3341 5 652 0.0271 sarada salad 3 I 10.238 2.1732 7 565 0.0271 sarami 5 salami 8 0 74.362 3.0686 6 541 0.0053 shooting (a syasatu 8 0.250 1.3010 5 722 0.0053 its person) dead to sasetu turning left 16 11 0.000 3.4216 5 673 0.0271 syasetu editorial 20 12 0.000 2.1430 5 722 0.0271 sasiba H i post crown 3 2 2.572 3.4099 7 574 0.0271 sasimi mUr sliced raw tlsh 5 4 0.266 1.5051 7 636 0.0271 sasizu mm instructions 2 1 0.000 1.8513 6 678 0.0271 sasori s scorpion 5 4 1.541 2.5391 5 647 0.0053 syataku company house 13 13 0.000 1.8865 5 759 0.0271 my philosophy 6 4 2.690 3.1297 5 598 0.0271 sawagi disturbance 4 o 10.084 2.2014 6 646 0.0271 continued 228


Word Gloss a. I'1 Mora I'1 Word Frequency Duration + + Pitch Auditory frequency Segments Segments 3 sawari touch 16 12 25.999 3.8001 7 610 0.0271 sayoku £K left wing 5 2 1.655 2.5988 5 641 0.0281 sebire «ti dorsal fin 2 I 18.801 3.6229 7 585 0.0281 sebiro niz business suit ") t 0.620 1.8261 7 658 0.0281 sebone irt backbone, spine 0 0 0.595 3.2711 4 636 0.0281 setubi facilities 5 4 3.057 2.6675 7 587 0.0281 segare * son 1 1 0.602 4.1021 5 668 0.0281 sekiyu petroleum 6 -> 0.388 2.1430 7 704 0.0281 semasa narrowness 2 0 5.113 4.3531 5 601 0.0281 senaka ** back L 0 2.245 2.4200 7 599 0.0281 senobi mwis standing on tiptoe 0 0 39.997 3.5240 5 559 0.0281 serihu words in play ■) 1 0.708 2.4871 7 654 0.02S I serori -fen 'J celery 0 0 56.683 3.4829 5 552 0.0281 seruhu self-service 2 1 15.S29 2.4624 7 624 0.0281 sesuzi mm spine, back 7 t 0.241 1.6628 7 688 0.0281 setake mx height 1 0 0.000 2.8202 5 594 0.0424 sibahu lawn 2 0 0.499 2.5478 5 669 0.0424 sibutu private property 22 9 0.000 2.9795 7 738 0.0424 sityaku um fitting 32 22 1.659 2.7135 4 684 0.0424 sitiya KM pawnshop 3 1 14.003 2.0086 6 487 0.0424 situdo ;SJ£ humidity 3 1 0.294 2.3927 7 595 0.0424 sihuku ordinary clothes 28 IS 0.308 2.9777 7 684 0.0424 sihuto *>7 h shift 3 3 6.472 2.8357 7 614 0.0424 sigatu April 14 6 0.000 3.0060 7 726 0.0424 sigeki M stimulus 22 11 0.000 4.4611 7 726 0.0424 sigemi bush 4 2 9.246 3.9383 7 641 0.0424 sigoto tt* work, business 6 4 0.000 2.3655 7 767 0.0424 sigusa tta gesture 6 3 3.733 4.7631 7 660 0.0424 sihatu the first train 17 15 0.000 2.9533 5 780 0.0424 sikake ftant mechanism 12 10 3.144 2.8621 7 651 0.0424 sikiti site, ground 10 8 0.262 3.3141 6 655 0.0424 square piece of sikisi 11 6 3.6882 7 671 0.0424 fancy paper 0.000 sikiso pigment 4 1 0.000 2.8129 7 647 0.0424 continued


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood >» 2 | Word Gloss £ 5 A <£ O* Word Frequency D uration Auditory + + Pilch Segments Segments 5 tour old countries sikoku EH in the same island 53 3 2.677 3.6227 7 564 0.0424 in Japan training. sikomi 16 11 0.292 2.6096 7 665 0.0424 education sikori ucy stiffness 9 7 30.911 3.0273 6 580 0.0424 plan, device, sikumi 8 7 9.971 4.2190 5 600 0.0424 mechanism result, outcome, simatu 12 0 2.195 3.0990 7 634 0.0424 management humidity. simeri S 4 2.398 1.8129 7 673 0.0424 ay dampness a kind of simezi 5 4 0.000 2.4099 7 724 0.0424 Ltoi: mushroom sinem a *> *■ 7 cinema 1 1 5.692 2.3655 7 639 0.0424 shop of old sinise I 1 0.000 3.2310 5 744 0.0424 standing sinobi ms 7 6 4.020 3.0233 7 657 0.0424 sinobu /C»*jp recall, remember 3 2 0.737 3.4029 7 716 0.0424 sirabe tune 4 3 16.555 4.5790 7 629 0.0424 sobriety. sirahu 8 4 11.563 1.3979 7 600 0.0424 mm soberness siraga sa gray hair 3 3 6.084 2.7520 7 635 0.0424 si rase report, notice 5 3 0.000 3.2335 6 648 0.0424 sirasu a* young sardine 8 5 0.600 2.3324 7 707 0.0424 siryoku a* sight, vision 41 17 11.546 3.0748 7 626 0.0424 siromi && white meat 14 9 1.803 2.2625 7 659 0.0424 sirosa &£ whiteness I 1 0.580 2.2148 6 677 0.0424 siruku v ;u ? silk 20 5 7.769 2.3222 7 659 0.0424 sirusi eh sign 2 2 1.844 3.2589 7 678 0.0424 sisatu mm inspection 26 16 0.000 3.8566 7 710 0.0424 sisvamo v v f t smelt 0 0 1.273 1.8921 5 697 0.0424 sisetu l&fifc institution 34 2 0.824 4.6831 7 678 0.0424 siseki S&B* historic site 50 30 1.135 2.9232 7 697 0.0424 sisitu nature 30 20 0.000 3.1562 7 776 0.0424 sisvoku sample 59 29 1.350 2.7364 7 691 0.0424 expenses. sisyutu 26 18 4.0925 5 714 0.0424 expenditure 0.000 continued 230


3m 5 Word Gloss i m 1M Mora 1M Frequency Word Duration frequency + Pitch+ Auditory ■ > . Segments 3 undergarment. sitagi 13 12 0.292 3.1738 6 640 0.0424 T* underwear sitaku X* preparations 40 30 0.000 2.9263 7 691 0.0424 preliminary sitami I4 13 3.922 2.7774 7 602 0.0424 TH inspection groundwork. sitazi 15 11 0.477 2.8299 7 628 0.0424 T tfe foundation sitetu private railway I0 5 0.000 3.3115 5 681 0.0424 siteki point out. indicate IS 11 1.595 4.7990 7 610 0.0424 siwake classification 6 2 1.239 2.7007 7 644 0.0424 siw asu December 4 3 0.790 2.6355 5 702 0.0424 sizimi « corbicula 4 3 4.459 2.2430 6 698 0.0424 punishment. syobatu 6 -> 4.521 3.4118 5 629 0.0183 penalty soburi m m v look, behavior 3 0 44.689 2.7110 4 561 0.0171 sodati growth 15 S 0.372 2.5955 5 681 0.0171 sodate W r foster 7 6 0.000 2.5740 7 650 0.0171 sohubo grandparents 0 0 12.675 2.9609 5 527 0.0171 sogeki m s shoot 2 1 2.36S 2.6998 5 657 0.0171 sokoku m s mother land 24 13 6.528 3.5838 6 590 0.0171 syokora V 3 3 7 chocolate 2 0 0.743 0.8451 5 599 0.0183 syokuba W&J* post, work place 5 3 0.000 4.0526 7 608 0.0183 sokudo £ j £ speed, velocity 13 12 3.794 3.6694 7 562 0.0171 svokugo after a meal 9 5 0.000 0.3010 6 699 0.0183 syokum u duty, work 12 6 3.222 3.6702 7 599 0.0183 instantaneous sokusi SH5E 39 13 0.519 3.0913 7 722 0.0171 death occupational syokusyu 22 6 0.000 3.2783 7 699 0.0183 category syokuhi food expense 18 9 1.340 2.9571 7 648 0.0183 sokuza sn & ready, prompt 7 4 3.172 3.1602 6 589 0.0171 syokuzi ** meal, diet 33 IS 1.053 4.0417 7 645 0.0183 syom otu f t * book 8 5 11.822 3.0133 5 580 0.0183 sonata 'J1-Z sonata 0 8.188 2.8048 7 586 0.0171 syoniti tO B the first day 2 1 0.262 3.5960 7 707 0.0183 sonoba £< Z )i* there, on the spot 4 1 0.634 3.3526 7 647 0.0171 continued


Word Gloss b l“ l“ Mora Frequency Word Duration frequency Auditory Segments Segments + Pitch 5 sonogo afterwards 7 5 11.274 4.3205 7 616 0.0171 sonota other 3 0 1.454 3.8552 6 590 0.0171 sonote games, tricks I 0 0.000 2.6335 6 713 0.0171 sonohi that day 2 2 0.759 3.4310 7 667 0.0171 syoseki *$i book, publication 27 16 0.000 3.3151 7 724 0.0183 sosicu sx qualities 10 7 0.000 2.7259 5 694 0.0171 sosiki mm organization 19 4 0.000 4.7211 7 635 0.0183 syosiki f t x t form 15 7 0.000 2.4843 7 660 0.0171 sosina mm small gift 0 0 7.693 1.3979 5 641 0.0171 syotoku mm income 20 10 0.469 4.2473 7 638 0.0183 syozoku mm affiliation 19 3 1.195 4.0248 5 676 0.0183 subako mu nest box. hive 5 2 0.000 2.3032 7 60S 0.0171 suberi J ty sliding, slide 2 I 19.842 2.9703 7 634 0.0171 suburi mmy batting swing 7 4 30.659 2.2253 5 627 0.0171 sweet-and-sour subuta 3 2 6.045 1.3617 5 562 0.0171 pork syutuba til.% run tor. stand for 2 1 4.473 3.8987 6 590 0.0142 syutudo t t i ± excavation 8 1 0.000 3.5012 7 646 0.0142 sudati H k tL h starting in life 5 5 0.277 2.4082 7 684 0.0171 sliced and sudako 3 1 0.234 0.9542 7 648 0.0171 vinegared octopus sugaru bees (old name) 2 0 13.186 3.4570 5 593 0.0171 sugata $ figure, shape I 0 3.560 4.6539 7 570 0.0171 wonder. sugosa 1 12.979 0.3010 5 621 0.0171 ££ amazement, terror I suhada mm bare skin 1 0 9.564 1.8633 7 552 0.0171 opening, space, sukima 10 5 12.883 2.6170 7 563 0.0171 gap syukusya s# lodging, hotel 15 6 0.000 3.6386 7 618 0.0142 reduced drawing, syukuzu 4 1 0.000 2.5490 6 717 0.0142 mm miniature copy sumibi mx. charcoal fire 2 1 1.619 2.1703 6 622 0.0171 sumika living, dwelling 5 4 0.262 2.8041 7 646 0.0171 sumire s violet 2 2 0.537 2.5611 7 669 0.0171 syumoku a s item, event 20 10 0.327 3.7520 7 702 0.0142 sumomo m plum 1 I 3.591 4.1626 5 699 0.0171 continued 232


sunaba sandbox 0 0 1.819 2.2480 6 654 0.0171 sandy soil, the sunati 6 5 0.000 2.3483 7 702 0.0171 sands syuniku * 1 * 1 vermilion inkpad 8 5 0.627 1.3222 7 693 0.0142 superu spelling 0 0 67.398 0.3010 6 470 0.0171 syuraba fighting scene I 1 6.926 2.4265 7 632 0.0142 suramu slum 4 3 20.957 2.8998 5 618 0.0171 suriru thrill 2 1 164.463 2.5237 6 499 0.0171 syuryoku £ * main force 23 8 2.033 3.8037 5 699 0.0142 surume dried cuttlefish T I 7.647 1.8865 5 650 0.0171 syusyoku £ £ staple food 30 16 7.647 3.0515 7 714 0.0142 susumi progress 4 4 1.454 3.1041 7 620 0.0171 sutego m x * deserted child 0 0 1.454 1.9956 6 602 0.0171 syutoku m % acquire, obtain 26 15 1.044 4.0520 6 668 0.0142 suyaki m m z unglazed pottery 2 1 1.515 2.1004 4 636 0.0171 leading actor, syuyaku 16 13 0.409 3.7860 7 714 0.0142 s s leading actress syuzyutu operation 8 6 2.528 4.1816 5 622 0.0142 suzuki f - r t bass 14 10 0.000 3.1216 6 747 0.0171 suzume * sparrow 5 5 0.871 3.0990 7 658 0.0142 tabako m cigarette, tobacco 5 4 11.801 3.9637 7 608 0.0245 tabizi a m journey, travel 4 0 1.294 2.1903 7 660 0.0245 tatiba a * position 4 0 7.513 4.6904 7 556 0.0245 seeing from the tatimi 7 4 18.464 3.2014 7 595 0.0245 £ * > J 1 callerv tagaku large sum 14 6 20.226 2.7752 5 579 0.0245 tahatu occur frequently 2 1 8.505 3.4239 5 593 0.0245 tahata BB4Q farm, fields 0 0 64.270 3.0792 5 483 0.0245 takane K i n high price 10 3 12.375 3.5330 7 513 0.0245 takara $ treasure 16 12 35.093 3.4541 7 549 0.0245 takari fcj& 'y blackmail 17 12 52.810 2.2227 6 537 0.0245 takasa height 10 2 1.439 4.1230 7 568 0.0245 takibi fire 2 I 8.897 1.5911 6 601 0.0245 another country, takoku t e l l 10 3 25.585 3.6340 7 516 0.0245 foreign country takuti housing lot 12 8 0.000 3.5808 5 604 0.0245 continued 233


Word Gloss a. 1“ Mora1“ Word + + Pitch Duration frequency Frequency Segments Segments Auditory 2 tamago m egg 2 0 156.420 3.7346 7 514 0.0245 tames i UL experiment 4 3 19.040 2.8274 7 592 0.0245 tanim a sra valley 1 0 70.932 2.8954 5 531 0.0245 tanomi «* request 4 2 142.513 3.3870 7 555 0.0245 tanuki m raccoon dog 7 6 52.285 2.8949 7 505 0.0245 tarako cod’s roe 7 6 12.433 2.1004 7 585 0.0245 tareme 2 2 6.902 1.1461 7 634 0.0245 tasatu murder 11 9 1.120 2.0969 5 648 0.0245 help. aid. tasuke 4 4 7.510 3.5085 7 543 0.0245 Wit assistance tataki on# concrete floor 11 6 4.766 3.2393 7 618 0.0245 tatami * tatami. mat S 5 2.745 3.1550 7 615 0.0245 tawara m straw . bale 8 8 71.593 2.8299 7 569 0.0245 tebiki ¥51# guidance 10 4 188.202 2.9128 6 515 0.0173 tebura empty-handed 3 0 78.267 1.6532 6 515 0.0173 teburi ¥*y gesture, sign 12 4 343.777 1.4472 7 431 0.0173 tetuke deposit 8 6 1.424 1.2304 7 595 0.0173 all night, tetuya flts throughout the 0 0 28.420 3.2533 5 558 0.0173 night trifling with, tedam a 4 3 31.227 2.1399 7 571 0.0173 making sport of tedasi ¥tii L interference 7 1 16.502 1.7559 5 595 0.0173 tedori net profit 8 7 86.031 3.0603 7 544 0.0173 tegaki handwriting 11 10 27.591 3.0099 6 598 0.0173 tegami letter 5 4 147.414 4.1512 5 514 0.0173 , exploit, tegara 6 5 121.835 2.4298 7 553 0.0173 achievement tegata ¥» note, bill 5 5 4.315 3.2923 7 608 0.0173 severance of tegire 5 5 77.457 0.6021 7 508 0.0173 connections teguti ¥□ trick 6 2 6.926 3.4028 5 590 0.0173 tekiti f t f e enemy's land 12 10 3.663 2.3054 7 535 0.0173 tekubi wrist I 0 84.534 2.7789 6 491 0.0173 temaki hand-rolling 10 8 8.580 1.3617 6 610 0.0173 temoti on hand 7 4 3.671 3.0962 7 633 0.0173 continued


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood 3 >> § — 2 ? W ord Gloss E £ S 3 sc a. 2E » Word Frequency Duration Auditory Segments 3 + 3 temoto in hand, at hand 7 4 8.610 3.3698 7 615 0.0173 tenisu T " — X tennis t 2 133.253 3.6129 7 512 0.0173 tenuki skimp, scamp 18 16 20.899 2.8096 7 601 0.0173 tenisu terrace 0 0 87.472 2.5575 7 569 0.0173 terebi f b f television 0 0 195.777 4.7944 7 464 0.0173 tereya r a t i M shy person 0 0 59.329 1.9294 5 545 0.0173 tesaki finger, tool 15 14 11.593 2.9410 5 572 0.0173 tesita follower 5 3 2.409 1.6232 7 561 0.0173 tesuri handrail 13 8 106.531 2.8382 7 579 0.0173 tesuto f X h test 4 3 17.566 3.8788 7 541 0.0173 tewake dividing the work 8 8 19.553 2.6628 5 5S8 0.0173 tezina ^ 0 0 magic 0 0 131.515 2.4314 5 502 0.0173 tobira m door *> 1 56.564 3.3473 7 502 0.0210 todana w m cupboard, cabinet .> I 159.508 2.1847 4 539 0.0210 todoke n i t report, notice 5 3 24.947 3.6798 7 543 0.0210 tokage h * y lizard 8 6 33.648 2.6222 7 529 0.0210 tokoro Pfr place, spot 9 6 9.229 5.1335 7 528 0.0210 tokoya &B barber shop 6 I 12.075 2.4330 7 558 0.0210 tokugi specialty 10 8 55.990 2.52S9 7 490 0.0210 tomari overnight trip 5 5 258.269 2.7427 7 525 0.0210 tom ato h T h tomato 0 0 29.430 3.3056 7 531 0.0210 tonari m next door 7 5 119.396 4.2578 7 556 0.0210 municipal. toritu S 1 34.565 3.2095 7 540 0.0210 & ±L metropolitan torobi t b ' X slow fire 2 0 322.499 2.9538 5 543 0.0210 tororo t h h grated yam 5 4 88.243 1.5563 7 516 0.0210 tosaka crest, cockscomb I 1 0.730 1.9191 7 621 0.0210 miscellaneous zatumu 0 0 42.877 1.8921 5 552 0.0048 duties syaguti t e n faucet 2 I 0.322 2.6493 4 643 0.0020 syakusya II# the weak 17 12 0.235 3.3747 6 614 0.0020 zasetu m frustration 13 9 4.765 3.3023 5 648 0.0048 zaseki mm seat 10 4 1.847 3.5312 5 654 0.0048 room, drawing zasiki 5 4 1.669 2.5224 7 586 0.0048 mm room continued


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood 2 5 W ord Gloss m U m I- I- M ora + + Pitch W ord Frequency D uration ■% Segments Auditory 5 frequency zyasuto VVX h just 5 5 24.763 1.9542 7 520 0.0020 zebura zebra 1 0 55.155 1.3424 3 514 0.0062 zibaku suicidal explosion 21 19 8.564 2.4942 5 600 0.0151 paying the zibara SI« expenses out of 7 6 7.527 2.3617 7 616 0.0151 mv own pocket zibeta i ground 0 0 18.555 2.0792 5 541 0.0151 zitugi n t * exercise 15 6 46.582 2.8395 6 526 0.0151 zitumu mm practical affairs i 2 11.517 3.7945 6 567 0.0151 zituwa mn true story I 1 2.685 2.4518 5 586 0.0151 zigoku mu Hell 19 10 0.834 3.2874 7 644 0.0151 spontaneous. zihatu 12 12 0.000 2.1818 5 711 0.0151 S#§ voluntary zihada item skin, surface 5 1 88.950 2.1106 5 533 0.0151 zihaku sa confession 18 16 0.541 3.3604 5 683 0.0151 self-consciousnes zikaku 38 32 4.351 3.6236 7 612 0.0151 S£ s. awakening zikiso SIS direct appeal 5 1 5.257 2.6821 7 542 0.0151 zikoku mm time 36 8 S.712 3.4221 7 546 0.0151 zimaku r m caption, title 26 19 0.971 3.0449 5 637 0.0151 zimetu s s c self-destruction 5 5 0.000 2.7126 5 695 0.0151 zim isa m&z plainness, quiet I 0 7.604 0.3010 7 597 0.0151 zim oto ifeTC local S 8 0.000 4.5959 7 671 0.0151 zim usyo office 1 0 6.475 4.4637 6 554 0.0151 zinusi i f e i landlord 7 5 10.091 3.4555 5 632 0.0151 ziritu self-support 17 14 6.385 3.8345 7 616 0.0151 doing it for ziriki 9 5 2.476 3.3448 7 659 0.0151 S3) oneself zisatu S 3 suicide 11 to 0.000 3.9156 5 685 0.0151 zisyaku magnet 24 19 0.242 2.8116 5 677 0.0151 zisaku one's own work 31 2 2.446 3.2117 7 559 0.0151 zisoku mm speed per hour 33 11 0.000 3.4519 7 560 0.0151 zisyoku im resignation 36 23 14.504 3.7776 7 705 0.0151 zisyuku S 3 ? self-control 14 9 0.000 3.7230 5 667 0.0151 zitaku § ^ | one's own house 25 23 2.321 4.7193 5 623 0.0151 continued


Word Gloss Sm 1“ Mora1“ Word Frequency Duration Auditory frequency Segments + Pitch Segments 3 one’s own zihitu 20 17 1.271 2.7634 5 671 0.0151 s* handwriting zizake local sake *> L 0.255 2.4116 6 623 0.0151 Zizoku continuance 27 10 1.434 3.4725 5 650 0.0151 zokyoku prelude 8 4 14.866 2.2765 5 560 0.0086 dices showing the zorom e 0 0 71.211 0.4771 5 516 0.0038 same number removing the zyosecu 15 10 7.377 2.6010 7 653 0.0086 snow zyosicu P&iS dehumidifying 12 11 0.000 1.8388 5 719 0.0086 zubozi guessing right 2 2 21.088 1.1139 5 597 0.0017 zyucugo mm predicate 6 5 4.251 1.4314 7 615 0.0090 zyukuti knowing well 6 5 4.706 2.7007 6 549 0.0090 zyukugo &ts phrase, idiom 7 3 8.482 2.3139 6 635 0.0090 zyum oku m* tree 7 2 79.882 3.2641 5 546 0.0090 zurusa cunning, slyness 4 4 29.165 1.7709 5 541 0.0017 zusiki diagram, graph 6 3 2.555 3.1212 5 630 0.0017 zyuwaki receiver 0 0 2.505 2.8519 5 590 0.0090


APPENDIX D: Statistics in Experiment 1

The candidate predictors in these regression models are word class, word frequency, 1st mora frequency, duration, uniqueness point, initial sound, participants, and, in the augmented models, neighborhood density.

Table D.1: Basic model for naming data (fast namers), Experiment 1. F(19, 9713) = 335.969419, p < 0.000001, R2 = 0.396574.

Table D.2: Basic model + Neighborhood density (Segments) for naming data (fast namers), Experiment 1. F(19, 9713) = 336.696782, p < 0.000001, R2 = 0.397091.

Table D.3: Basic model + Neighborhood density (Segments + Pitch) for naming data (fast namers), Experiment 1. F(19, 9713) = 336.232442, p < 0.000001, R2 = 0.396761.

Table D.4: Basic model + Neighborhood density (Auditory) for naming data (fast namers), Experiment 1. F(19, 9713) = 335.969419, p < 0.000001, R2 = 0.396574.

Table D.5: Basic model for naming data (slow namers), Experiment 1.

Table D.6: Basic model + Neighborhood density (Segments) for naming data (slow namers), Experiment 1. F(18, 8949) = 224.482738, p < 0.000001, R2 = 0.311069.

Table D.7: Basic model + Neighborhood density (Segments + Pitch) for naming data (slow namers), Experiment 1. F(18, 8949) = 224.571565, p < 0.000001, R2 = 0.311154.

Table D.8: Basic model + Neighborhood density (Auditory) for naming data (slow namers), Experiment 1. F(18, 8949) = 224.756138, p < 0.000001, R2 = 0.311330.

APPENDIX E: Statistics in Experiment 2

Table E.1: Basic model for naming data (fast namers), Experiment 2. F(18, 5844) = 57.763697, p < 0.000001, R2 = 0.151044.

Table E.2: Basic model + Neighborhood density (Segments) for naming data (fast namers), Experiment 2. F(19, 5843) = 54.846589, p < 0.000001, R2 = 0.151354.

Table E.3: Basic model + Neighborhood density (Segments + Pitch) for naming data (fast namers), Experiment 2. F(18, 5844) = 57.763697, p < 0.000001, R2 = 0.151044.

Table E.4: Basic model + Neighborhood density (Auditory) for naming data (fast namers), Experiment 2. F(18, 5844) = 57.763697, p < 0.000001, R2 = 0.151044.

Table E.5: Basic model for naming data (slow namers), Experiment 2. F(13, 4340) = 28.017431, p < 0.000001, R2 = 0.077425.

Table E.6: Basic model + Neighborhood density (Segments) for naming data (slow namers), Experiment 2. F(13, 4340) = 28.017431, p < 0.000001, R2 = 0.077425.

Table E.7: Basic model + Neighborhood density (Segments + Pitch) for naming data (slow namers), Experiment 2. F(13, 4340) = 28.017431, p < 0.000001, R2 = 0.077425.

Table E.8: Basic model + Neighborhood density (Auditory) for naming data (slow namers), Experiment 2. F(14, 4339) = 26.371357, p < 0.000001, R2 = 0.078416.

Table E.9: Basic model for word identification data (fast namers), Experiment 2. F(18, 9781) = 19.209873, p < 0.000001, R2 = 0.034145.

Table E.10: Basic model + Neighborhood density (Segments) for word identification data (fast namers), Experiment 2. F(19, 9780) = 22.807238, p < 0.000001, R2 = 0.042429.

Table E.11: Basic model + Neighborhood density (Segments + Pitch) for word identification data (fast namers), Experiment 2. F(18, 9781) = 24.981747, p < 0.000001, R2 = 0.043953.

Table E.12: Basic model + Neighborhood density (Auditory) for word identification data (fast namers), Experiment 2. F(18, 9781) = 19.209873, p < 0.000001, R2 = 0.034145.

Table E.13: Basic model for word identification data (slow namers), Experiment 2. F(17, 9082) = 17.991411, p < 0.000001, R2 = 0.032580.

Table E.14: Basic model + Neighborhood density (Segments) for word identification data (slow namers), Experiment 2. F(18, 9081) = 22.175697, p < 0.000001, R2 = 0.042105.

Table E.15: Basic model + Neighborhood density (Segments + Pitch) for word identification data (slow namers), Experiment 2. F(18, 9081) = 23.953870, p < 0.000001, R2 = 0.045328.

Table E.16: Basic model + Neighborhood density (Auditory) for word identification data (slow namers), Experiment 2. F(17, 9082) = 17.991411, p < 0.000001, R2 = 0.032980.

APPENDIX F: Similarities of Sounds in Noise: MDS Analyses

Introduction

“Phonological similarity” in English is typically calculated on the basis of form similarity. As explained in Chapter 1, four different neighborhood calculations have been used. Among them, only one considers the phonetic similarity of sounds, and it is based on experimentally derived phoneme confusability (Luce, 1986; Luce & Pisoni, 1998). That calculation rests on R. D. Luce’s general biased choice rule (R. D. Luce, 1959): confusion matrices were collected for the sounds of CVC words in order to estimate the similarities of sounds in English. The basic assumption is that the more similar two sounds are, the more confusable they should be. This sound confusability is implemented in the neighborhood calculations of Luce (1986) and Luce & Pisoni (1998). This appendix investigates similarities of sounds using the actual experimental data in Japanese. In a word identification in noise experiment, words are presented to participants in a noisy environment and the participants are asked to identify them. Confusable sounds should induce more misidentifications than less confusable sounds, and the misperceived sounds should be similar to the sounds actually intended. The error patterns in word identification in noise should therefore reveal the similarities of sounds in Japanese. This possibility is explored here by analyzing the error patterns in the word identification in noise experiment (Experiment 2).
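As a concrete illustration of how a confusion matrix can yield similarity estimates, the following Python sketch applies the similarity-choice estimate eta(i, j) = sqrt[p(j|i) p(i|j) / (p(i|i) p(j|j))], one common way of recovering similarity parameters under a Luce-style biased choice rule. The segment labels and confusion counts are hypothetical, and the sketch is not the calculation used in Luce (1986) or elsewhere in this dissertation; it only illustrates the general logic.

```python
import numpy as np

# Hypothetical confusion counts: rows = intended segments, columns = perceived
# segments. The labels and numbers are invented for illustration only.
labels = ["p", "t", "k"]
counts = np.array([
    [50, 12,  8],
    [10, 55,  6],
    [ 7,  9, 60],
], dtype=float)

# Conditional identification probabilities p(perceived | intended).
p = counts / counts.sum(axis=1, keepdims=True)

# Similarity-choice estimate of pairwise similarity:
#   eta(i, j) = sqrt( p(j|i) * p(i|j) / (p(i|i) * p(j|j)) )
# eta is symmetric by construction and equals 1 on the diagonal.
eta = np.sqrt((p * p.T) / np.outer(np.diag(p), np.diag(p)))

for i, a in enumerate(labels):
    for j, b in enumerate(labels):
        if j > i:
            print(f"similarity({a}, {b}) = {eta[i, j]:.3f}")
```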

Data

The segments in the word identification data from Experiment 2 were analyzed. All responses were transcribed by the author in the same romanization used for the original neighborhood calculations in Experiment 2 and saved in a master file. The written responses were checked against the actual responses recorded onto DAT tapes to make sure that participants had written down their responses accurately. If a participant wrote down something different from what he or she said, the transcription was “corrected” to match the oral response; fewer than 1% of the written responses required such correction. In this analysis, errors involving accent patterns are ignored, because the focus is on consonant and vowel confusions. In terms of accent-pattern accuracy, 86% of the data was correctly identified (see also §2.3.2.1.4).


Error analysis

First, repeated-measures ANOVAs were conducted. The error patterns were analyzed in terms of sound category (consonants vs. vowels) in a by-subjects analysis (F1) and a by-items analysis (F2). All target words had a CVCVCV structure, so that consonants occupied the first, third, and fifth positions and vowels the second, fourth, and sixth positions. Thus, the maximum number of errors possible for either segment type was 3 x 700 = 2100 in the subjects analysis and 3 x 27 = 81 in the items analysis. Table F.1 shows the mean number of errors for consonants and vowels in each of the two analyses. Both the subjects and the items analyses showed that errors occurred significantly more often for consonants than for vowels (F1(1, 26) = 870.003, p < 0.001; F2(1, 699) = 409.488, p < 0.001). Next, the data were analyzed in terms of position within the word. Table F.2 shows the mean number of errors by position. For consonants, the positional effect was significant in both the subjects and the items analyses (F1(2, 52) = 461.79, p < 0.0001; F2(2, 1398) = 71.62, p < 0.0001). Paired comparisons between consonant positions showed that the differences between P1 and P3 and between P1 and P5 were significant, but the difference between P3 and P5 was not (P1 vs. P3: F1(1, 26) = 578.58, p < 0.0001; F2(1, 699) = 105.85, p < 0.0001; P1 vs. P5: F1(1, 26) = 635.52, p < 0.0001; F2(1, 699) = 91.77, p < 0.0001; P3 vs. P5: F1(1, 26) = 1.669, p > 0.1; F2(1, 699) = 0.23, p > 0.1). For vowels, the positional effect was also significant in both analyses (F1(2, 52) = 95.90, p < 0.0001; F2(2, 1398) = 15.20, p < 0.0001). Paired comparisons between vowel positions showed that the differences among positions were all significant, although the P4 vs. P6 difference was only marginal in the items analysis (P2 vs. P4: F1(1, 26) = 161.80, p < 0.0001; F2(1, 699) = 35.78, p < 0.0001; P2 vs. P6: F1(1, 26) = 73.30, p < 0.0001; F2(1, 699) = 9.86, p < 0.005; P4 vs. P6: F1(1, 26) = 28.03, p < 0.0001; F2(1, 699) = 3.80, p = 0.0515). There are two main findings about the error patterns across positions. First, consonant positions induced more errors than vowel positions, which clearly indicates that consonants are generally more distorted than vowels in noise. Second, positional effects were also observed within the consonant and within the vowel positions. For the consonant positions, P1 induced more errors than the other two positions (P3 and P5), which induced errors about equally often. For the vowel positions, P2 induced more errors than P4 and P6, and P6 induced more errors than P4. Since P6 is the word-final position, in which sounds are often deleted cross-linguistically, it is easy to believe that sounds in this position were often misidentified: the data showed that sounds at the ends of utterances are "more distorted" (they are softer, particularly if F0 is falling and/or low). At the beginnings of words, by contrast, the larger error rate arises from information structure rather than acoustics: there is no preceding context to help predict the upcoming segment, so transitional probabilities cannot help. The results suggest that Japanese listeners made a distinction between consonants and vowels, natural classes that are specified by the phonological features [±sonorant] and [±consonantal].
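The by-subjects (F1) and by-items (F2) comparisons described above can be approximated in Python with a repeated-measures ANOVA; the sketch below shows only the by-subjects version. It is a minimal sketch under stated assumptions: the error counts are randomly generated rather than taken from the experiment, the column names are invented, and statsmodels' AnovaRM stands in for whatever statistics package was actually used.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
n_subjects = 27  # number of listeners in Experiment 2

# Hypothetical by-subjects data: one error count per subject and segment type,
# summed over the 700 target words (consonants assumed to be more error-prone).
rows = []
for s in range(n_subjects):
    rows.append({"subject": s, "segment_type": "consonant",
                 "errors": int(rng.poisson(300))})
    rows.append({"subject": s, "segment_type": "vowel",
                 "errors": int(rng.poisson(150))})
f1_data = pd.DataFrame(rows)

# F1: by-subjects repeated-measures ANOVA with segment type as the within factor.
f1 = AnovaRM(f1_data, depvar="errors", subject="subject",
             within=["segment_type"]).fit()
print(f1)

# The F2 (by-items) analysis has the same form, with "word" in place of
# "subject" and errors summed over the 27 listeners rather than the 700 words.
```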

Similarities of sounds in the MDS analyses

This section discusses the similarities of sounds in noise. First, similarities of sounds are computed from the error patterns of the participants' responses. In order to explore the underlying structure, multidimensional scaling is applied to the similarities. The confusion matrices from the error patterns in Experiment 2 provide information about which pairs of sounds are more confusable than others. The idea here is that if two sounds are similar, they are also confusable in noise. If the similarities of sounds are submitted to multidimensional scaling (MDS), the output should show the auditory similarities among sounds. If Japanese listeners classify the sounds based on auditory features, these features should emerge as meaningful dimensions of the MDS solution. MDS is a statistical technique that is useful for uncovering meaningful organization in complex sets of data; it was proposed to help understand people's judgments of the similarity of members of a set of objects. The matrix of similarities was subjected to multidimensional scaling to create a multidimensional map. The map that results from an MDS treatment of a set of data is essentially a representation of the psychological relationships among sounds: the more similar two sounds are, the smaller the psychological distance between them in the MDS map should be. That is, if the "auditory space" is "warped" by linguistic experience (as much cross-linguistic perception work suggests), then this is "auditory similarity"; I do not mean only a general, universal "auditory space" here. This MDS analysis does not take position into account, so it is closer to "auditory similarity" than the raw counts in Table F.1. Finally, the resulting dimensions must be interpreted in order to determine the number of dimensions that gives the best final MDS solution for the data. In this study, similarities of sounds were first analyzed based on the error patterns. Correlation was used as the measure of similarity: correlations between the intended sounds (the sounds participants actually heard) and the perceived sounds (the sounds they thought they heard) were investigated. The more similar two sounds are, the higher the correlation should be. The correlations used in this analysis form a matrix of Pearson product-moment correlation coefficients. Pearson correlations vary between -1 and +1. A value of 0 indicates that neither of two variables can be predicted from the other by a linear equation; a correlation of 1 or -1 indicates that one variable can be predicted perfectly by a linear function of the other. The calculated correlations were submitted to multidimensional scaling. The output was a multidimensional "map" providing an auditory similarity space for the sounds of Japanese.
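The pipeline just described (confusion proportions, then Pearson correlations, then MDS) can be sketched briefly; this is not the original analysis code, and the toy confusion matrix, the choice to correlate the response profiles of the intended sounds, and the scikit-learn settings are illustrative assumptions.

# A minimal sketch, assuming a toy confusion matrix and scikit-learn's metric
# MDS; not the dissertation's original analysis script.
import numpy as np
from sklearn.manifold import MDS

sounds = ["a", "e", "i", "o", "u"]
# Rows = intended sounds, columns = perceived sounds (proportions per row).
confusion = np.array([
    [0.70, 0.10, 0.05, 0.10, 0.05],
    [0.08, 0.72, 0.10, 0.05, 0.05],
    [0.05, 0.12, 0.68, 0.05, 0.10],
    [0.10, 0.05, 0.05, 0.72, 0.08],
    [0.05, 0.05, 0.12, 0.08, 0.70],
])

# Pearson correlations between the response profiles of the intended sounds:
# sounds that draw similar response patterns receive higher correlations.
similarity = np.corrcoef(confusion)

# MDS expects dissimilarities; rescale correlations (-1..+1) to distances.
dissimilarity = 1.0 - similarity
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)

for sound, (dim1, dim2) in zip(sounds, coords):
    print(f"{sound}: dim1 = {dim1:+.3f}, dim2 = {dim2:+.3f}")

Plotting the two coordinate columns gives the kind of similarity map shown in Figures F.1 and F.2, with nearby points corresponding to more confusable sounds.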

Vowel Analysis

First, the vowels and the consonants of the responses were tabulated separately. Some responses containing a moraic nasal or a palatalized /n/ were excluded from the analyses, since no corresponding input sounds existed; that is, the matrices were designed to make similarity symmetrical. The vowel data and the consonant data were converted into proportions, since the frequencies of different segments in the stimulus words were not balanced (Table F.3). The data were then submitted to Pearson correlation analyses to compute similarities among the vowels. Table F.4 shows the similarity matrix of vowels that is the output of the correlation analyses. This similarity matrix was then submitted to MDS in order to explore the underlying structure. Figure F.1 shows the two-dimensional scaling solution for the similarity of vowels, which was the best fit for the vowel data. The R2 value for this solution is 0.99957 and the stress is 0.00654. Dimension 1, graphed along the horizontal axis, represents vowel height. Dimension 2, graphed along the vertical axis, is interpreted as representing the degree of backness.

Consonant analysis

The same procedure for creating a multidimensional scaling solution was followed here to analyze the consonant similarity (Tables F.5 and F.6). Figure F.2 shows the two-dimensional scaling solution for the similarity of the consonants identified in the stimuli in Experiment 2. The R2 value for this solution is 0.891 and the stress is 0.1456, so the solution accounts for a good amount of the variation in the mapping. The best three-dimensional solution did not yield a substantially better R2 value (0.902), and its dimensions were difficult to interpret in terms of phonological features. Therefore, I selected the two-dimensional scaling solution as the optimal one in this case. Dimension 1 represents the feature [±voice]. Dimension 2 seems to reflect another feature, [±sonorant]. Among the consonants, the nasals (/m/, /n/, /g/), the glides (/j/, /w/), and the liquid (/r/) are generally considered sonorants from an articulatory point of view. However, the data suggest that not all of these sounds are treated as sonorants auditorily. The phonetic realizations of /w/ and /r/ are not treated as sonorants in the auditory-based psychological space: [w] is located near [kʲ], which is apart from the nasals and [j], and [r] is very close to [d] and [b], suggesting that it is auditorily very similar to them. Recall that in Otake et al. (1996b), Dutch listeners were not able to respond to Japanese [r] with the visual target 'r'; however, when the visual target was changed from 'r' to 'd', they were able to detect [r] in a phoneme monitoring experiment. This could be a piece of evidence that the phonetic realization of /r/ is very similar to a flap. It is natural to treat /g/ as a sonorant once we remember that /g/ = [ŋ] in Tokyo Japanese: /g/ in onset position is realized as [ŋ] after a vowel or a moraic nasal in Tokyo Japanese, and a shift from [ŋ] to [g] is apparently now in progress (Hibiya, 1995). Since all the tokens of /g/ in this analysis occur in this phonological environment, 'g' naturally appears as [ŋ]. However, in the MDS solution, [ŋ] is apart from the cluster of nasals; it lies about mid-way between the cluster of nasals (/n/ and /m/) and the cluster of voiced stops (/b/ and /d/). In the future, /g/ is expected to move towards the voiced stops, so that the voiced consonants would be classified in terms of [±sonorant] more clearly.
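The choice between the two- and three-dimensional solutions described above can be illustrated by refitting the scaling at several dimensionalities and comparing the fit; the sketch below uses toy dissimilarities and scikit-learn's raw stress value, so it only illustrates the dimensionality check rather than reproducing the original procedure or software.

# A rough sketch, assuming toy dissimilarities and scikit-learn's metric MDS;
# the dissertation's own solutions were evaluated with R2 and stress values.
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(1)
n = 14  # assumption: roughly the number of consonant categories
points = rng.normal(size=(n, 2))  # toy configuration with a 2-D structure
diss = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

for k in (1, 2, 3):
    mds = MDS(n_components=k, dissimilarity="precomputed", random_state=0)
    mds.fit(diss)
    # stress_ is scikit-learn's raw stress (the sum of squared differences
    # between the input dissimilarities and the fitted distances); lower is better.
    print(f"{k}-D solution: stress = {mds.stress_:.4f}")

If adding a dimension barely improves the fit and the extra dimension resists interpretation, the lower-dimensional solution is preferred, which is the reasoning applied to the consonant data here.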

Discussion and Conclusion

The current analyses of the error data in the word identification in noise experiment (Experiment 2) demonstrated that Japanese listeners perceived sounds in terms of natural classes in a noisy condition. Submitting the confusion matrices of vowels and consonants to MDS enabled us to see how Japanese listeners perceived the sounds and arranged them within an auditory similarity map. The MDS solutions for both consonants and vowels provided interpretable dimensions, which turned out to correspond to phonological features. The outputs of the MDS analyses represent auditory similarity spaces for consonants and vowels. The five Japanese vowels were classified in terms of Height and Backness. Height is related to an acoustic parameter, the frequency of the first formant (F1). It is also related to sonority, and the larger distance between /i, u/ and /e, o/ than between /e, o/ and /a/ supports this. Backness is related to an acoustic property of the second formant (F2). Similarly, the phonological features that emerged from the MDS for consonants were auditory features: Voice and Sonorant. Voicing and sonority are not independent: both are based on the activity of the vocal folds. The dimensions used to describe the consonants and vowels are thus basic features of sounds, all of which are related to the sonority scale of sounds in the language. In conclusion, the results of the error patterns in the word identification in noise data revealed that the similarities of sounds are based on the sonority scale of sounds in Japanese. Further, the data suggest that properties of sounds that differ from language to language were reflected in the multidimensional scaling solutions. For example, /r/ is a flap in Japanese, so it is located near the voiced stops in the auditory space shown in the MDS solution (Figure F.2). The transition from [ŋ] to [g] for /g/ that is now in progress is also nicely captured by the error patterns of the word identification data. The methodology used in this study is useful for languages for which it is difficult to collect confusion matrices in the way that has been done for, for example, English.

                    Consonants         Vowels
Subjects analysis   263 (SD: 46.57)    96 (SD: 20.99)
Items analysis      10 (SD: 10)        4 (SD: 5)

Table F.1: The mean number of errors for consonants and vowels in each of the two analyses.

                    Consonants           Vowels
                    P1    P3    P5       P2    P4    P6
Subjects analysis   128   69    66       44    22    30
Items analysis      5     3     3        2     1     1

Table F.2: The mean number of errors in terms of word positions.

                              Perceived sounds
                  a          e          i          o          u
Intended   a   0.433048   0.170940   0.102564   0.225071   0.068376
sounds     e   0.046674   0.698950   0.047841   0.138856   0.067678
           i   0.052571   0.110857   0.146286   0.049143   0.641143
           o   0.028803   0.107111   0.027003   0.809181   0.027903
           u   0.034612   0.044902   0.753976   0.065482   0.101029

Table F.3: Proportions of responses for vowels.

       a        e        i        o
e   -0.181
i   -0.453   -0.245
o   -0.096   -0.095   -0.205
u   -0.440   -0.430   -0.046   -0.220

Table F.4: A similarity matrix for vowels.

Figure F.1: MDS for vowels (two-dimensional configuration; axes Dimension-1 and Dimension-2). Dimensions 1 and 2 represent vowel height (F1) and backness (F2), respectively.

Table F.5: Proportions of responses for consonants (intended sounds by perceived sounds).

Table F.6: A similarity matrix for consonants.

Figure F.2: MDS for consonants (two-dimensional configuration; axes Dimension-1 and Dimension-2). SS = [ʃ], C = [ts], CC = [tʃ], KY = [kʲ], Z = [ʒ], Y = [j]. Dimensions 1 and 2 represent [±voice] and [±sonorant], respectively.

APPENDIX G: Semantic categories

Twenty-eight semantic categories, definitions and English translations of the semantic categories, the 700 additional words that belong to the categories, and descriptive statistics are listed below.

Careers I

Word Mean SE SD daitoryou US president I 0 0 bijinesumaN fc'v*X -7> businessman 1 0 0 kaikeisi accountant 1 0 0 yakuzaisi mmm pharmacist 1 0 0 giiN a a congressman I 0 0 saibaNkaN ®WE judge I 0 0 kaNgohu nurse 1 0 0 beNgosi lawyer 0.9666 0.0333 0.1825 haisya dentist 0.9666 0.0333 0.1825 gaka ms painter 0.9666 0.0333 0.1825 syuhu housewife 0.9666 0.0333 0.1825 syoubousi >m± Fireman 0.9666 0.0333 0.1825 kyouzyu professor 0.9666 0.0333 0.1825 seerusumaN -tr— salesman 0.9666 0.0333 0.1825 isya E # doctor 0.9333 0.0463 0.2537 gizyutusya technician 0.9333 0.0463 0.2537 sensei teacher 0.9333 0.0463 0.2537 hobo m nursery staff 0.9333 0.0463 0.2537 kaseihu sias§ housemaid 0.9 0.0557 0.3051 keikaN H I1 policeman 0.9 0.0557 0.3051 gakusei student 0.9 0.0557 0.3051 daiku * x carpenter 0.8333 0.0692 0.3790 souri prime minister 0.8 0.0742 0.4068 geizyutuka SffrS artist 0.8 0.0742 0.4068 nouka mm farmer 0.6 0.0909 0.4982

Careers II

Word Mean SE SD sinarioraitaa scenario writer 1 0 0 tareNto 5 L /> h television I 0 0 personality dezainaa T 'tf'f X — designer 1 0 0 zyoyuu ■km. actress L 0 0 keNsatukaN m m t prosecutor I 0 0 kaNtoku KS manager 1 0 0 sutairisuto X * - f ‘J X h stylist 1 0 0 zimuiN office worker 1 0 0 komedjiaN =1 > "T-f T > comedian 1 0 0 seizika politician 0.9666 0.0333 0.1825 sutyuwaadesu X f a 7 - f X stewardess 0.9666 0.0333 0.1825 sutaNtomaN stuntman 0.9666 0.0333 0.1825 pairoQto j i j a y b pilot 0.9666 0.0333 0.1825 kameramaN * * cameraman 0.9666 0.0333 0.1825 Satyou t t f i president 0.9666 0.0333 0.1825 kyouiN f t * faculty 0.9666 0.0333 0.1825 daNyuu lif t actor 0.9333 0.0463 0.2537 yakusya &# performer 0.9333 0.0463 0.2537 anauNsaa 7 7 ^ >1*— anchorman 0.9333 0.0463 0.2537 rakugoka comic storyteller 0.9333 0.0463 0.2537 kasyu singer 0.9 0.0557 0.3051 kooti 3 — 7 coach 0.9 0.0557 0.3051 repootaa 1 1 reporter 0.8333 0.0692 0.3790 asisutaNto 7 * > X * > h assistant 0.7333 0.0821 0.4497 koQku i v 9 cook 0.6333 0.0894 0.4901

Colors

Word Mean SE SD murasaki £ purple 1 0 0 haiiro 0cfe gray 1 0 0 giNiro ISfe silver 1 0 0 yamabukiiro umfcfe bright yellow I 0 0 kiNiro gold I 0 0 buruu ?)\,- blue 1 0 0 piNku pink I 0 0 buraQku black 1 0 0 ao W blue 1 0 0 sirubaa silver I 0 0 kuro m black 1 0 0 kiiro yellow 1 0 0 aka # red I 0 0 burauN brown 0.9666 0.0333 0.1825 siro a white 0.9666 0.0333 0.1825 sorairo sky blue 0.9666 0.0333 0.1825 ieroo < X P - yellow 0.9666 0.0333 0.1825 tyairo blown 0.9666 0.0333 0.1825 daidaiiro tangerine 0.9666 0.0333 0.1825 guNzyoo ** cobalt blue 0.9333 0.0463 0.2537 reQdo \y "J K red 0.9333 0.0463 0.2537 goorudo a -;u K gold 0.9333 0.0463 0.2537 howaito K white 0.9333 0.0463 0.2537 tutiiro ± f t soil color 0.9 0.0557 0.3051 kusairo grass green 0.9 0.0557 0.3051

Main dishes

The category for this block is "main dishes." Your task is to determine whether the words you are going to hear in this block belong to this category or not. If you think that they belong to this category, press 'YES'; otherwise press 'NO' as soon as possible.

Word Mean SE SD soba * l f soba ‘NO’odle 1 0 0 kareeraisu a L/—7-rx curry rice I 0 0 teNpura TA/'S'b tempura 1 0 0 tyaahaN fried rice I 0 0 toNkatu pork cutlet I 0 0 koroQke ■3Uy*T croquette I 0 0 haNbaagu m ; / a —7 hamburg steak I 0 0 saNdoiQti -y o h V 'v * sandwich 1 0 0 soumeN ■f-5 tok soomeN ‘NO’odle I 0 0 harumaki spring roll 1 0 0 yudouhu j i l l S tofu in the hot water L 0 0 suteeki stake I 0 0 karaage fried chicken I 0 0 gyouza fit* dumpling 1 0 0 kamamesi £16 rice, meat, and 1 0 0 vegetables cooked together in a small pot. gurataN gratin I 0 0 supageQtji X /< 7 ‘V -T-f spaghetti 0.9666 0.0333 0.1825 udoN 5 £Aj udoN ‘NO’odle 0.9666 0.0333 0.1825 raameN 7 - $ > raameN ‘NO’odle 0.9666 0.0333 0.1825 sityuu stew 0.9666 0.0333 0.1825 syuumai y a $ 7 - < steamed dumpling 0.9333 0.0463 0.2537 sukiyaki sukiyaki 0.9333 0.0463 0.2537 piza tflf pizza 0.9 0.0557 0.3051 sekihaN festive red rice 0.8666 0.0631 0.3457 meNti $ mincemeat 0.7 0.0850 0.4660

Dessert

Word Mean SE SD
… pancake 0.9666 0.0333 0.1825
pauNdokeeki pound cake 0.9666 0.0333 0.1825
mitumame boiled beans, agar-agar cubes and other delicacies with treacle poured on 0.9666 0.0333 0.1825
ohagi rice dumpling covered with bean jam 0.9666 0.0333 0.1825
kibidaNgo millet dumpling 0.9666 0.0333 0.1825
pafe parfait 0.9666 0.0333 0.1825
maNzyuu bean-jam bun 0.9333 0.0463 0.2537
seNbei rice cracker 0.9333 0.0463 0.2537
tokoroteN a kind of jelly 0.9333 0.0463 0.2537
nikumaN steamed pork bun 0.9333 0.0463 0.2537
kuzumoti pudding-like arrowroot-starch cake 0.9 0.0557 0.3051
siruko sweet red-bean broth 0.8666 0.0631 0.3457

Animals

Word Mean SE SD uma 1 horse I 0 0 tiNpaNzi chimpanzee 1 0 0 simauma IRA zebra I 0 0 buta m Pig 0 0 rakuda b < f£ camel 1 0 0 kiriN t y ^ giraffe I 0 0 usi * cow 1 0 0 usagi 31 rabbit 1 0 0 raioN 7 -'fT > lion 1 0 0 sika m deer I 0 0 koara 3 7 7 koala 0.9666 0.0333 0.1825 saru It monkey 0.9666 0.0333 0.1825 gorira 3"'J 7 gorilla 0.9666 0.0333 0.1825 neko m cat 0.9666 0.0333 0.1825 zou ft elephant 0.9666 0.0333 0.1825 kaNgaruu ■h kangaroo 0.9666 0.0333 0.1825 tora fft tiger 0.9666 0.0333 0.1825 kaba AW* hippo 0.9666 0.0333 0.1825 risu •J* scroll 0.9666 0.0333 0.1825 anaguma im. badger 0.9666 0.0333 0.1825 yagi goat 0.9666 0.0333 0.1825 inu ■k dog 0.9666 0.0333 0.1825 kuma m bear 0.9666 0.0333 0.1825 oukami f t wolf 0.9333 0.0463 0.2537 yamaneko Uift wild cat 0.9333 0.0463 0.2537

Grammatical terms

Subjects of study

Word Mean SE SD suugaku mathematics I 0 0 bizyutu Hffi art 1 0 0 rika g?4 natural science I 0 0 buturi mm physics 1 0 0 saNsuu f t t t arithmetic 1 0 0 taiiku <** physical education 1 0 0 oNgaku nm music I 0 0 doutoku mm moral 0.9666 0.0333 0.1825 seibutu ±m biology 0.9666 0.0333 0.1825 eigo £IS English 0.9666 0.0333 0.1825 seiyousi ®*5& European history 0.9666 0.0333 0.1825 zukou 1 1 manual-art class 0.9666 0.0333 0.1825 tin mm geography 0.9666 0.0333 0.1825 rekisi E5& history 0.9666 0.0333 0.1825 gizyutu Stffi craft 0.9333 0.0463 0.2537 kagaku tb^ chemistry 0.9333 0.0463 0.2537 syakai social science 0.9333 0.0463 0.2537 kaNbuN a x Chinese classic 0.8666 0.0631 0.3457 koteN £24 Japanese classic 0.8666 0.0631 0.3457 hokeN ftfll health care 0.8666 0.0631 0.3457 syodou mm Japanese calligraphy 0.8333 0.0692 0.3790 sekibuN «» integral calculus 0.8 0.0742 0.4068 bibuN differential calculus 0.7 0.0850 0.4660 toukei tttt statistics 0.5333 0.0926 0.5074 syuukyou mm religionw 0.4333 0.0920 0.5040

Spices

Word Mean SE SD
… margarine 0.9666 0.0333 0.1825
syouyu soy sauce …
… vanilla 0.9333 0.0463 0.2537
sinamoN cinnamon 0.9333 0.0463 0.2537
sake Japanese rice wine 0.9 0.0557 0.3051
kacuobusi bonito flake 0.8333 0.0692 0.3790
kokoa cocoa 0.8333 0.0692 0.3790
natumegu nutmeg 0.8 0.0742 0.4068
dasi broth 0.8 0.0742 0.4068

Objects found in the houses

Word Mean SE SD deNsireNzi microwave 1 0 0 suihaNki f tC S rice cooker 1 0 0 koNpyuutaa 3 1 /tf a —* — computer I 0 0 doraiyaa dryer I 0 0 faQkusu fax I 0 0 sutereo X t Is* stereo I 0 0 toosutaa 1----Z 5 — toaster 1 0 0 teeburu T — Jjl* table 1 0 0 bideo t 'x * video 0.9666 0.0333 0.1825 tukue *1 study desk 0.9666 0.0333 ! 0.1825 tokei firtt clock 0.9666 0.0333 0.1825 reizouko ;*«* fridge 0.9666 0.0333 0.1825 sofaa 7 7 7 - sofa 0.9666 0.0333 0.1825 deNwa «t£ telephone 0.9333 0.0463 0.2537 seNpuuki a n a fan 0.9333 0.0463 0.2537 hoNdana book shelf 0.9333 0.0463 0.2537 taNsu mm drawer 0.9333 0.0463 0.2537 beQto h bed 0.9333 0.0463 0.2537 mikisaa mixer 0.9333 0.0463 0.2537 raNpu lamp 0.9 0.0557 0.3051 isu chair 0.9 0.0557 0.3051 puriNtaa printer 0.9 0.0557 0.3051 airoN 7-f □ > iron 0.9 0.0557 0.3051 zyuutaN mm carpet 0.8333 0.0692 0.3790 zyuusaa y a - t - juice mixer 0.8 0.0742 0.4068

Sports

‘YES’ ‘NO' <0/15* > £ Hi 3fc § /£ ( + # < If L T < f£ £ L'o ^fa'U -li T?-f0 -ttuTMi^MiLTX The category for this block is “sports.” Your task is to determine whether the words you are going to hear in this block belong to this category or not. If you think that they belong to this category, press 'YES,’ otherwise press ‘NO’ as soon as possible. The category for this block is “sports." Please prepare for the block. Word Mean SE SD haNdobooru / \ > Ktf—Jl> handball I 0 0 bokusiNgu boxing 1 0 0 sofutobooru 7 7 F tf—)\, softball 1 0 0 sumou mm sumo 1 0 0 wrestling resuriNgu LsZ'J wrestling I 0 0 hougaNnage ffiA ttlf shot put 1 0 0 taisou <*i§ gymnastics I 0 0 taQkyuu Ping-Pong I 0 0 booriNgu bowling 1 0 0 saQkaa -fyvh — soccer 1 0 0 bareebooru /<\s—ft—JL’ volleyball I 0 0 ragubii y ' f t f - rugby I 0 0 goruhu golf 0.9666 0.0333 0.1825 rikuzyou m± track and field 0.9666 0.0333 0.1825 batomiNtoN badminton 0.9666 0.0333 0.1825 zyuudou mm judo 0.9666 0.0333 0.1825 sukeeto 7>*r— h skating 0.9666 0.0333 0.1825 suiei swimming 0.9666 0.0333 0.1825 basukeQtobooru M X 'r'y hTt?— basketball 0.9666 0.0333 0.1825 marasoN ■7 7 7 7 marathon 0.9666 0.0333 0.1825 yakyuu »** baseball 0.9333 0.0463 0.2537 sukii X ^ r- skiing 0.9333 0.0463 0.2537 bootakatobi pole vault 0.9 0.0557 0.3051 takatobi high jump 0.9 0.0557 0.3051 habatobi long jump 0.8666 0.0631 0.3457

Flowers

Word Mean SE SD gaabera Transvaal daisy, 1 0 0 gerbera suiitopii Z 'f - h t z - sweet pea L 0 0 asagao morning-glory 1 0 0 kiNmokusei fragrant olive I 0 0 bara mm rose I 0 0 himawari sun flower I 0 0 rabeNdaa y ' O y — lavender I 0 0 tyuuriQpu tulip I 0 0 botaN peony I 0 0 kaaneesyoN a > carnation 0.9666 0.0333 0.1825 huriizia 7 ' J - v 7 freesia 0.9666 0.0333 0.1825 tubaki 4* camellia 0.9666 0.0333 0.1825 suiseN * « l daffodil 0.9666 0.0333 0.1825 yuri lily 0.9666 0.0333 0.1825 sakura & cherry blossom 0.9666 0.0333 0.1825 maagareQto ~7—ii Lsy h Margaret 0.9666 0.0333 0.1825 yuugao moonflower 0.9666 0.0333 0.1825 riNdou U > yellowwort 0.9333 0.0463 0.2537 tutuzi o-d c azalea 0.9333 0.0463 0.2537 kiku m chrysanthemum 0.9333 0.0463 0.2537 guraziorasu gladiolus 0.9 0.0557 0.3051 ayame mm iris 0.9 0.0557 0.3051 paNzii / O v — pansy 0.9 0.0557 0.3051 rairaQku 7 - f 7 7 ^ lilac 0.9 0.0557 0.3051 zeraniumu — 't’A geranium 0.6333 0.0894 0.4901

Fruits

Word Mean SE SD painaQpuru /Ui-vZfii, pineapple I 0 0 momo m peach I 0 0 1 n; piiti St peach I 0 0 nasi % Japanese pear I 0 0 masukaQto V V muscat I 0 0 remoN lemon l 0 0 maNgoo ■7>3*— mango I 0 0 budou mm grape I 0 0 suika ffiJH watermelon I 0 0 biwa loquat I 0 0 kiui kiwi l 0 0 zakuro Eta pomegranate I 0 0 banana banana I 0 0 mikaN mts tangerine l 0 0 papaiya papaya I 0 0 meroN jt u y melon I 0 0 younasi pear I 0 0 riNgo apple l 0 0 itigo m strawberry 0.9666 0.0333 0.I825 apurikoQto 7 ^ 'J U 'y h apricot 0.9666 0.0333 0.1825 anzu 9 apricot 0.9666 0.0333 0.1825 itiziku fig 0.9333 0.0463 0.2537 reezuN u—x> raisin 0.9 0.0557 0.3051 kaki m Japanese persimmon 0.8666 0.0631 0.3457 kuri m maroon 0.8 0.0742 0.4068

Parts of the buildings

Word Mean SE SD heya SUM 1 room 1 0 0 okuzyoo M± | rooftop I 0 0 kaidaN RSfS stair 1 0 0 yane M« roof I 0 0 siNsitu mm bedroom I 0 0 erebeetaa X Ls*—? — elevator 1 0 0 esukareetaa x x A \s—$ — escalator I 0 0 hooru ;u hall 0.9666 0.0333 0.1825 ima era living room 0.9666 0.0333 0.1825 geNkaN £811 entrance 0.9666 0.0333 0.1825 seNmeNzyo jfcffiRff lavatory 0.9666 0.0333 0.1825 teNzyoo ceiling 0.9666 0.0333 0.1825 doa K7 door 0.9666 0.0333 0.1825 robii □ tr­ lobby 0.9333 0.0463 0.2537 mado ig window' 0.9333 0.0463 0.2537 kabe H wall 0.9333 0.0463 0.2537 garasu JiyX glass 0.9333 0.0463 0.2537 yuka m floor 0.9333 0.0463 0.2537 hasira f t pillar 0.9333 0.0463 0.2537 daidoko SHr kitchen 0.9 0.0557 0.3051 rouka B T corridor 0.9 0.0557 0.3051 osiire » L A ti closet 0.9 0.0557 0.3051 syosai ** study room 0.9 0.0557 0.3051 tika ifeT basement 0.8666 0.0631 0.3457 svoumei lighting 0.8333 0.0692 0.3790

School items

Word Mean SE SD boNdo bond l 0 0 buNdoki protractor l 0 0 hude m brush l 0 0 kaQtaa f t v $ — cutter I 0 | 0 iroeNpitu fe fa v* colored pencil I 0 0 teepu ■fePT-? Scotch tape I 0 0 kesigomu eraser l 0 0 pareQto /■vL/"J h palette I 0 0 eNpitu ft* pencil l 0 0 koNpasu 3 I s / i X compasses I 0 0 gayousi drawing paper 0.9666 0.0333 0.I825 kureyoN crayon 0.9666 0.0333 0.I825 hudebako *« pencil case 0.9666 0.0333 0.I825 maziQku 7 - > ; ^ marker 0.9666 0.0333 0.1825 z> is K-fe./u 0.9666 0.0333 0.I825 sitaziki T#£ desk pad 0.9666 0.0333 0.I825 enogu paints 0.9666 0.0333 0.1825 ‘nooto J - h notebook 0.9666 0.0333 0.1825 hasami scissors 0.9333 0.0463 0.2537 kyookasyo textbook 0.9333 0.0463 0.2537 maakaa ■7 —-h — highlighter 0.9333 0.0463 0.2537 kabaN fr\th bag 0.9 0.0557 0.3051 uwabaki ± « t slippers 0.8666 0.0631 0.3457 suzuri inkstone 0.7333 0.0821 0.4497 bousi ** hat, cap 0.6 0.0909 0.4982


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (Living creatures in the water, and seafood) I V t . * X a & £ young yellowtail 0.9666 0.0333 0.1825 madai mm red sea bream, porgy 0.9666 0.0333 0.1825 karei 8 flatfish 0.9666 0.0333 0.1825 sazae ■*M fx top shell 0.9666 0.0333 0.1825 saba t t mackerel 0.9666 0.0324 0.2130 katuo IS bonito 0.9333 0.0463 0.2537 kawahasi * 9 / \ * filefish, leatherfish 0.9333 0.0463 0.2537 tako tf octopus 0.9333 0.0463 0.2537 masu t t trout 0.9 0.0557 0.3051 sawara tt sawara. Spanish 0.9 0.0557 0.3051 mackerel kaki fill oyster 0.9 0.0557 0.3051 kisu sillago 0.8333 0.0692 0.3790 mebaru * /^)l* gopher, rockfish 0.8 0.0742 0.4068 koNbu EM tangle 0.8 0.0742 0.4068 kazu’NO’ko &

Living creatures in the water, and seafood II

Word Mean SE SD tatiuo *7lfli scabbard fish I 0 0 saNma mackerel pike I 0 0 hotate scallop I 0 0 hoQke Akta mackerel 1 0 0 wakasagi pond smelt 1 0 0 buri iff yellowtail I 0 0 hamaguri £ clam I 0 0 hugu jol® globefish 1 0 0 uni Sfli sea urchin I 0 0 akagai ark shell I 0 0 nizimasu * I» rainbow trout I 0 0 awabi m ear shell 0.9666 0.0333 0.1825 ikura 45=3 salmon roe 0.9666 0.0333 0.1825 sake M salmon 0.9666 0.0333 0.1825 iwasi m sardine 0.9666 0.0333 0.1825 aNkou Tlszi'1? angler 0.9333 0.0463 0.2537 ayu ifi ayu, Japanese river trout 0.9333 0.0463 0.2537 ika mm cuttlefish, squid 0.9 0.0557 0.3051 kaNpati a kind of fishes 0.8333 0.0692 0.3790 hamo i i conger 0.8333 0.0692 0.3790 ainame greenling 0.7666 0.078 0.4301 isidai parrot fish 0.7333 0.0821 0.4497 okoze * 3 -tf stingfish 0.7 0.0850 0.4660 koti II flathead 0.5 0.0928 0.5085 isaki grunt 0.4666 0.0926 0.5074

Vegetables and beans I

The category for this block is "vegetables." All kinds of vegetables (root vegetables, leaf vegetables, beans, and summer, autumn, or winter vegetables) belong to this category. If you think that the words you are going to hear in this block belong to this category, press 'YES'; otherwise press 'NO' as soon as possible.

Word Mean SE SD niNziN A # carrot I 0 0 daizu XS soy bean I 0 0 guriiNpiisu ? y —> tf —x green pea 1 0 0 asuparagasu 7 asparagus 1 0 0 houreNsou 145*14/ spinach 1 0 0 satoimo taro 1 0 0 piimaN pimento I 0 0 tiNgeNsai qing-geng-cai I 0 0 aona greens 1 0 0 syuNgiku *X garland 1 0 0 chrysanthemum takenoko * bamboo shoot I 0 0 kvuuri ftill cucumber 0.9666 0.0333 0.1825 satumaimo sweet potato 0.9666 0.0333 0.1825 tamanegi £ S onion 0.9666 0.0333 0.1825 nasu egg plant 0.9666 0.0333 0.1825 soramame $ a broad bean 0.9666 0.0333 0.1825 gobou r i4 5 burdock 0.9333 0.0463 0.2537 niNniku \ZAj\Z< garlic 0.9333 0.0463 0.2537 myouga KUO Japanese ginger 0.9333 0.0463 0.2537 siitake L i'fc it shiitake Mashroom 0.9333 0.0463 0.2537 aomame WS green pea 0.9333 0.0463 0.2537 aoziso w c -e a kind of beafsteak 0.9333 0.0463 0.2537 plant sisitou LLfS green pepper 0.8666 0.0631 0.3457 kabu turnip 0.8666 0.0631 0.3457 okura * 7 5 okura 0.7666 0.078 0.4301

Vegetables and beans II

The category for this block is "vegetables." All kinds of vegetables (root vegetables, leaf vegetables, beans, and summer, autumn, or winter vegetables) belong to this category. If you think that the words you are going to hear in this block belong to this category, press 'YES'; otherwise press 'NO' as soon as possible.

Word Mean SE SD karihurawaa ■h 'J 7 7 9 - cauliflower I 0 0 morokosi com 1 0 0 hakusai Chinese cabbage I 0 0 negi £ green onion 1 0 0 nagaimo yam I 0 0 komatuna / | ' « * I 0 0 zyagaimo potato I 0 0 matutake matsutake mushroom 1 0 0 retasu lettuce I 0 0 kyabetu ***'? cabbage I 0 0 eNdou s&a pea 0.9666 0.0333 0.1825 iNgeN "f f\j kidney bean 0.9666 0.0333 0.1825 reNkoN SIS lotus root 0.9666 0.0333 0.1825 daikoN * « daikon radish 0.9666 0.0333 0.1825 edamame eta green soybean 0.9666 0.0333 0.1825 enoki jLO ittz It e’NO’ki mushroom 0.9666 0.0333 0.1825 buroQkorii 'J — broccoli 0.9333 0.0463 0.2537 nazuna T X - t shepherds purse 0.9 0.0557 0.3051 syouga ginger 0.9 0.0557 0.3051 azuki /h a small red bean 0.8666 0.0631 0.3457 kuresoN watercress 0.8 0.0742 0.4068 udo o a udo 0.8 0.0742 0.4068 huki S Japanese butterbur 0.7666 0.078 0.4301 takana mm a kind of Chinese 0.7 0.0850 0.4660 cabbage asatuki chives 0.6666 0.0875 0.4794


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. . f t (Birds) 9 — li r,Hj -cfo ctL^'«3HlL'‘CL'fcf£<^llSA£ffi3fc£>f£lt^< # LT < *£Sl'o *It P * u - Ii rftj -C'to **.- p macaw 1 0 0 oumu aa parrot I 0 0 tubame m swallow I 0 0 toki ft Japanese crested ibis I 0 0 hakutvou swan 0.9666 0.0333 0.1825 wasi ft eagle 0.9666 0.0333 0.1825 kamo f t wild duck, gull 0.9666 0.0333 0.1825 hototogisu cuckoo 0.9666 0.0333 0.1825 taka f t hawk 0.9666 0.0333 0.1825 kitutuki ® * ft woodpecker 0.9333 0.0463 0.2537 kaQkoo * -V30 cuckoo 0.9333 0.0463 0.2537 tyabo f t f t bantam 0.9 0.0557 0.3051 turn a crane 0.8666 0.0631 0.3457 meziro S 3 white-eye 0.8666 0.0631 0.3457 mozu 5 5 f t shrike 0.7333 0.0821 0.4497

Body Parts I

The category for this block is "body parts." Here, this category includes words that represent both external parts and internal organs. Your task is to determine whether the words you are going to hear in this block belong to this category or not. If you think that they belong to this category, press 'YES'; otherwise press 'NO' as soon as possible.

Word Mean SE SD kao 31 face I 0 0 me S eye I 0 0 ha m teeth 1 0 0 xiNzou


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. —$ (Body Parts) II 'j—ii rfto-«j x-to zzx i4. <*0)fl-tfjSi5#£*-ri&lS, <*£> wse^KS^-eoxtefr^<7>—gpj ±:i'5 i o s. ‘NO' 0^^>^aa*'Sf-lt^<»LT&/)' bully I 0 0 siri R blind cheeks I 0 0 asikubi tarsus 1 0 0 hone # bone I 0 0 roQkotu Itttt rib I 0 0 hutomomo *jfi thigh I 0 0 mi mi % ear 0.9666 0.0333 0.1825 nodo

Chemical elements

Word Mean SE SD magunesiumu magnesium I 0 0 titaN titanium I 0 0 aruminiumu aluminum I 0 0 karushiumu calcium I 0 0 taNso stm carbon I 0 0 aeN SfB zinc 1 0 0 ritiumu •jm* A lithium I 0 0 huQso z>vm fluorine 1 0 0 suiso 7k* hydrogen I 0 0 tiQso mm nitrogen I 0 0 eNso «* chlorine 1 0 0 • uraN uranium 0 0 heriumu a.ij ^7 A helium 0.9666 0.0333 0.1825 natoriumu ■ r sodium 0.9666 0.0333 0.1825 kadomiumu ■h K5 0 A cadmium 0.9666 0.0333 0.1825 iou ttlt sulfur 0.9666 0.0333 0.1825 kariumu potassium 0.9666 0.0333 0.1825 bariumu barium 0.9666 0.0333 0.1825 saNso mm oxygen 0.9333 0.0463 0.2537 keiso w * silicon 0.9 0.0557 0.3051 suigiN 7k IS mercury 0.8666 0.0631 0.3457 niQkeru — "J * T 7U nickel 0.8333 0.0692 0.3790 riN phosphorus 0.8 0.0742 0.4068 suzu 7.X tin 0.6333 0.0894 0.4901 kobaruto h cobalt 0.6 0.0909 0.4982


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. H& (Insects) ccd^p-y^(D*f-3 >j—1 1 r g * j x t= 'j —ic * tz>tm7Lxw[,'xi'tztz€ire&£ift&tt ‘yes' irixtnttiti ‘NO’ tii3|E'5f£lt^-< A L T < tz£ l'a f g £ j X to *tlVt*mmLX beetle I 0 0 kanabuN cupreous polished l 0 0 chafer, drone beetle kamakiri mantis l 0 0 kuwagata V 1 15 5 stag beetle l 0 0 ameNbo 7 * >7fC water strider I 0 0 agehatsyou ? y / \ f a 7 swallowtail L 0 0 gokiburi cockroach I 0 0 tyoutyou it* butterfly l 0 0 moNsirotyou cabbage butterfly l 0 0 toNbo m * dragonfly l 0 0 suzumebati wasp, hornet, yellow L 0 0 jacket ari t* ant l 0 0 hati s bee 0.9666 0.0333 0.1825 semi it cicada, locust 0.9666 0.0333 0.1825 siroari v P 7 'J white ant, termite 0.9666 0.0333 0.1825 abu it horsefly, gadfly 0.9666 0.0333 0.I825 baQta /<"jV grasshopper 0.9666 0.0333 0.1825 ka a mosquito 0.9666 0.0333 0.1825 ga iS moth 0.9666 0.0333 0.1825 hae fly 0.9666 0.0333 0.1825 imomusi ¥ & green caterpillar 0.9333 0.0463 0.2537 kumo 00$ spider 0.9333 0.0463 0.2537 micubati S it (honey) bee 0.9 0.0557 0.3051 geNgorou Japanese diving 0.8 0.0742 0.4068 beetle sirami louse j 0.4 0.0909 0.4982

Diseases

Ingredients

The category for this block is “ingredients.” Lets consider hamburgers. In order to make them, we need several ingredients including meats. Here all the ingredients like meats that can be used for cooking belong to this category. Your task is to determine whether the words you are going to hear in this block belong to this category or not.. If you think that they belong to this category, press *YES,’ otherwise press 'NO’ as soon as possible. The category for this block is “chemical elements.” Please prepare for the block. Word Mean SE SD hiziki a kind of brown algae I 0 0 butaniku mm pork 1 0 0 gaNmodoki a fried bean curd cake with 1 0 0 vegetables and other ingredients in it kikuraee t< b lf Jews ear 1 0 0 gyuuniku *m beef | 1 0 0 kaNpyou fr/ulfiiO dried gourd shavings 1 0 0 sirataki noodles made from devils 1 0 0 tongue starch vuzu 4>-f citron 1 0 0 kamaboko boiled fish-paste 1 0 0 sooseesi V—-b—v sausage I 0 0 hamu /\A ham 1 0 0 aburaage fried soy bean I 0 0 tikuwa % bacon 0.9666 0.0333 0.1825 haNpeN steamed fish-paste 0.9333 0.0463 0.2537 atuage S » ( f thick deep-fried bean curd 0.9333 0.0463 0.2537 continued


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued yuba iftH dried bean curds 0.9333 0.0463 0.2537 koNnyaku Z. < paste made from the arum 0.9333 0.0463 0.2537 root hu a kind of gluten bread 0.9333 0.0463 0.2537


Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (Instruments) Z%mxh&*% IZli "YES’ accordion I 0 0 horuN horn I 0 0 gitaa *r$ — guitar I 0 0 kurarineQto 0 7 U T' "J V clarinet 1 0 0 moQkiN ** xylophone 1 0 0 haapu / \ —7 harp 1 0 0 orugaN organ 0.9666 0.0333 0.1825 siNbaru cymbals 0.9666 0.0333 0.1825 huruuto 7;u— h flute 0.9666 0.0333 0.1825 baioriN /U t'J > violin 0.9666 0.0333 0.1825 koNtorabasu contrabass 0.9666 0.0333 0.1825 marakasu maracas 0.9333 0.0463 0.2537 tyero f - x P cello 0.9333 0.0463 0.2537 biora viola 0.9 0.0557 0.3051 saQkusu saxophone 0.9 0.0557 0.3051 piQkoro £: -y u □ piccolo 0.9 0.0557 0.3051 ooboe t fx oboe 0.9 0.0557 0.3051 ukurere ^ 7 U U ukulele 0.9 0.0557 0.3051 teQkiN iron xylophone 0.8666 0.0631 0.3457 doramu drum 0.8333 0.0692 0.3790

Vehicles

Word Mean SE SD zidousya automobile 1 0 0 basu M*X bus 1 0 0 K 1 sukuutaa 1 scooter I 0 0 deNsya n m streetcar 1 0 0 siNkaNseN m m New Trunk Line I 0 0 toreeraa h U — V — trailer 1 0 0 torakutaa tractor I 0 0 hune US ship 1 0 0 reQsya su m train I 0 0 baiku lU O motor-bicycle 1 0 0 mo’NO’ree = £ / [y— ; u monorail 1 0 0 ru takusii * * • > - taxi I 0 0 zyeQtoki v r ; h tt jet plane 0.9666 0.0333 0.1825 toraQku truck 0.9666 0.0333 0.1825 basya f t * carriage 0.9666 0.0333 0.1825 heri helicopter 0.9666 0.0333 0.1825 saNriNsya = » * tricycle 0.9666 0.0333 0.1825 booto * - h boat 0.9666 0.0333 0.1825 ootobai motorcycle 0.9666 0.0333 0.1825 ziipu V--J jeep 0.9333 0.0463 0.2537

kikaNsya « h * locomotive 0.9333 0.0463 0.2537 ziteNsya @ * s m bicycle 0.9 0.0557 0.3051 Cikatecu ifeTt* subway 0.9 0.0557 0.3051 kaato t s - h cart 0.8333 0.0692 0.3790 horobasya a s * caravan 0.6333 0.0894 0.4901

APPENDIX H: Reasons to Discard Eight Target Words from the Final Analysis

Eight target words were discarded from the final analysis in the semantic categorization experiment. The main reason was that these were the only words among the 700 that could plausibly be associated with the semantic categories. The following paragraphs describe in more detail why they were discarded. The word musubi is a closing word, used to end an event (such as the final word of a speech or a ceremony). It was a mistake to assign this word to the category of "grammatical terms." Sikori has at least two meanings, 'stiffness' and 'an unpleasant feeling'. When sikori appeared in the category of "body parts," the former meaning seems to have been activated, so that the participants responded "yes" to this word. Kabure also has two meanings, 'rash' and 'be influenced.' The word appeared in the category of "diseases," so nearly half of the participants must have associated it with the former meaning, 'rash.' Katiku is a collective noun for animals. This word was assigned to the category of "Animals". All "yes" filler-response words in that block are names of animals, and it cannot be denied that katiku is highly related to animals; therefore it had to be dropped from the final analysis. Namazu 'catfish' is a freshwater fish that lives in marshes or rivers. The description of the semantic category "Insects" implies that the category includes creatures living in fresh water, although I did not intend to include fish in this category. Since the description is confusing, only about half of the participants got it right. It also turned out that this is the only "no" target word associated with creatures; in this sense, namazu was more related to the semantic category than the other "no" target words. Tanima means 'valley' in English, but the word is also often used in the phrase mune-no tanima, literally "a valley between the breasts," for which "cleavage" is an appropriate English translation. The semantic category therefore seems to have given the participants a nonverbal context: tanima evoked the fuller phrase (mune-no) tanima, so that it suddenly became a part of the body, 'cleavage'. This situation should have been avoided. The word sonote literally means 'that method.' It is a compound of two morphemes: the first morpheme, sono, a demonstrative pronoun, modifies the second morpheme, te, which means 'method' in this compound. However, te originally means 'hand', a part of the body, and is normally written with the same kanji character. Therefore, although the target word means 'that method,' participants might have been able to access the meaning of 'hand' in the experiment.

The final word, tegami, was assigned to the category of school items. I had in mind the things that elementary-school students use in class. In Japan, teachers often distribute monthly letters to parents to let them know how the children are doing, what the upcoming events are, and so on. These are called either gaQkyuu tsuusiN or simply tegami. Tegami is thus highly related to the things observed in class, so some participants might have recalled this classroom situation.


APPENDIX I: Statistics in Experiment 3

Tables I.1 through I.15 report the multiple regression models for the semantic categorization data in Experiment 3. Each table lists, for every predictor, the coefficient, standard error, standardized coefficient, partial correlation, tolerance, and the associated F test. The overall fit of each model is summarized below.

Table I.1: Basic model for semantic categorization data, Experiment 3. F(62, 19601) = 90.725063, p < 0.000001, R2 = 0.222983.
Table I.2: Basic model + Neighborhood density (Segments) for semantic categorization data, Experiment 3. F(63, 19600) = 89.759406, p < 0.000001, R2 = 0.223911.
Table I.3: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data, Experiment 3. F(63, 19600) = 90.053720, p < 0.000001, R2 = 0.224481.
Table I.4: Basic model + Neighborhood density (Auditory) for semantic categorization data, Experiment 3. F(63, 19600) = 89.758093, p < 0.000001, R2 = 0.223566.
Table I.5: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data, Experiment 3. F(64, 19599) = 88.919995, p < 0.000001, R2 = 0.225026.
Table I.6: Basic model for semantic categorization data (fast responders), Experiment 3. F(47, 9922) = 26.586417, p < 0.000001, R2 = 0.111852.
Table I.7: Basic model + Neighborhood density (Segments) for semantic categorization data (fast responders), Experiment 3. F(48, 9921) = 26.433942, p < 0.000001, R2 = 0.113391.
Table I.8: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data (fast responders), Experiment 3. F(48, 9921) = 26.657183, p < 0.000001, R2 = 0.114240.
Table I.9: Basic model + Neighborhood density (Auditory) for semantic categorization data (fast responders), Experiment 3. F(48, 9921) = 26.135016, p < 0.000001, R2 = 0.112253.
Table I.10: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data (fast responders), Experiment 3. F(49, 9920) = 26.203629, p < 0.000001, R2 = 0.114600.
Table I.11: Basic model for semantic categorization data (slow responders), Experiment 3. F(47, 9646) = 43.153447, p < 0.000001, R2 = 0.173734.
Table I.12: Basic model + Neighborhood density (Segments) for semantic categorization data (slow responders), Experiment 3. F(48, 9645) = 42.441594, p < 0.000001, R2 = 0.174385.
Table I.13: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data (slow responders), Experiment 3. F(48, 9645) = 42.578914, p < 0.000001, R2 = 0.174850.
Table I.14: Basic model + Neighborhood density (Auditory) for semantic categorization data (slow responders), Experiment 3. F(48, 9645) = 42.525267, p < 0.000001, R2 = 0.174668.
Table I.15: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data (slow responders), Experiment 3. F(49, 9644) = 41.964509, p < 0.000001, R2 = 0.175745.
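The models summarized above differ only in whether a neighborhood-density predictor is added to the basic regression. As a hedged illustration of that kind of nested-model comparison, the sketch below fits a basic model and an augmented model to hypothetical data and tests the R2 change with an incremental F test; the variable names, the data, and the statsmodels workflow are assumptions made for the example, not the dissertation's scripts.

# A minimal sketch, assuming hypothetical data and hypothetical predictor
# names; not the dissertation's original regression code.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(2)
n = 500
df = pd.DataFrame({
    "rt": rng.normal(800, 100, n),             # categorization response time (ms)
    "word_freq": rng.normal(3.0, 1.0, n),      # hypothetical log word frequency
    "duration": rng.normal(450, 50, n),        # hypothetical stimulus duration (ms)
    "neighborhood": rng.normal(10.0, 3.0, n),  # hypothetical neighborhood density
})

# "Basic" model versus the same model with neighborhood density added.
basic = ols("rt ~ word_freq + duration", data=df).fit()
with_nd = ols("rt ~ word_freq + duration + neighborhood", data=df).fit()

print("R2, basic model:", round(basic.rsquared, 6))
print("R2, basic model + neighborhood density:", round(with_nd.rsquared, 6))
# Incremental F test for the added predictor (nested-model comparison).
print(sm.stats.anova_lm(basic, with_nd))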