Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

PHONOLOGICAL NEIGHBORHOODS AND PHONETIC SIMILARITY
IN JAPANESE WORD RECOGNITION
DISSERTATION
Presented in Partial Fulfillment of the Requirements for
the Degree of Doctor of Philosophy in the Graduate
School of The Ohio State University
By
Kiyoko Yoneyama, M.A.
The Ohio State University 2002
Dissertation Committee:

Professor Keith Johnson, Adviser
Professor Mary E. Beckman
Professor Mark A. Pitt

Approved by

Adviser
Linguistics Graduate Program
UMI Number: 3039544

Copyright 2002 by Yoneyama, Kiyoko
All rights reserved.

UMI Microform 3039544
Copyright 2002 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company
300 North Zeeb Road
P.O. Box 1346
Ann Arbor, MI 48106-1346

© Copyright by Kiyoko Yoneyama
March 2002
ABSTRACT
This dissertation explores two aspects of spoken-word recognition in Japanese: the representations of words stored in the lexicon, and lexical competition among words. Both were explored by testing three different neighborhood density calculations in naming, word identification in noise, and semantic categorization experiments. Neighborhood density is a measure of the number of similar words ("neighbors") surrounding a word in the lexicon; definitions of a neighbor vary, however, depending on the definition of similarity used. This dissertation tests three neighborhood definitions, each of which corresponds to a hypothesis about lexical access with a different kind of word representation in Japanese. The first calculation represents the situation in which listeners rely on phonemic word representations, as proposed in abstract models. Here, neighborhoods are calculated in terms of phonemes, as in the Greenberg-Jenkins calculation (Greenberg & Jenkins, 1964) widely used in the English word recognition literature. The second neighborhood calculation added prosodic information as another dimension, reflecting the finding that prosodic information plays a vital role in Japanese word recognition (Cutler & Otake, 1999). This calculation proposes that Japanese listeners use word-level prosody for lexical access, but that the segmental representation and word-level prosody are stored separately in the lexicon. In other words, the word representation in this calculation is the same categorical abstract representation used in the first calculation, with pitch accent patterns additionally constraining the set of neighbors. A similarity judgment experiment on pitch accent patterns was carried out, and its results were implemented in the calculation.
The third neighborhood calculation is designed to test exemplar-based models. In this calculation, neighborhood density was measured by comparing the similarity of cochleagrams of 66,000 audio files (one file for each noun in the NTT psycholinguistic database; Amano & Kondo, 1999, 2000). The word representation is thus an auditory representation in which all segmental and prosodic information is available. In this calculation, as in the GNM (General Neighborhood Model; Bailey & Hahn, 2000), the words in the lexicon are treated as exemplars mapped into a psychological similarity space. Data for the analyses were collected in Japanese neighborhood experiments using the same 700 test words as in the previous experiments and a lexicon consisting of only the nouns in the NTT psycholinguistic database. The results of the three experiments shed light on two aspects of lexical access. First, a lexical competition effect is confirmed in Japanese, and there are two types of lexical competition in auditory word recognition: form-based competition (neighborhood density) and phoneme-based competition (cohort reduction). Second, both abstract (symbolic) representations and episodic (auditory) representations need to be stored in the lexicon. Implications of these results for current word recognition models are also discussed.
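As a concrete illustration of the first (Segments) neighbor definition described above: under the Greenberg & Jenkins (1964) rule, a word counts as a neighbor of the target if it can be produced by substituting, deleting, or inserting exactly one phoneme. The following is a minimal sketch, not code from the dissertation; the phoneme spellings and the toy lexicon are hypothetical, whereas the actual calculation ran over the nouns of the NTT database.

```python
# Illustrative sketch of a Greenberg & Jenkins (1964) style neighbor count:
# a word is a neighbor of the target if it is exactly one phoneme edit away
# (one substitution, deletion, or insertion).

def edit_distance(a, b):
    """Levenshtein distance over phoneme sequences (tuples of symbols)."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def neighborhood_density(target, lexicon):
    """Number of lexicon entries exactly one phoneme edit away from target."""
    return sum(1 for w in lexicon if w != target and edit_distance(target, w) == 1)

# Hypothetical toy lexicon; each word is a tuple of phoneme symbols.
lexicon = [("k", "a", "t"), ("k", "a", "s", "a"), ("k", "a", "s"),
           ("a", "s", "a"), ("k", "u", "s", "a")]
print(neighborhood_density(("k", "a", "s", "a"), lexicon))  # prints 3
```

The Segments + Pitch calculation described above additionally constrains these candidates by pitch-accent similarity, while the Auditory calculation replaces symbolic edit distance altogether with distances between cochleagrams.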
In memory of my grandmother, Toyo Yoneyama,
and to my parents, Susumu and Rumiko Yoneyama,
who have been supportive of me
ACKNOWLEDGMENTS
This dissertation wouldn’t exist without help from many people. First and foremost, I would like to thank my adviser, Keith Johnson, who is a true mentor and who has believed in me very patiently.
I would also like to thank Mary Beckman, who has been an excellent role model as a researcher. She always allowed me to formulate my own ideas about issues in spoken word recognition, yet was always ready and eager to provide insights into how the data might be interpreted from different perspectives.
I also thank Mark Pitt, who was extremely encouraging and full of enthusiasm for my work. I value most the way he attacks problems as an expert in the field of spoken word recognition. I learned much from his thorough investigations and data-analysis techniques.
I am also grateful for support from the National Institutes of Health through a grant entitled "Cross-linguistic studies of spoken language processing" (R01DC04421, PI: Keith Johnson), which supported my dissertation research in Japan. I am also very thankful to JJ Nakayama for allowing me to use his copy of the NTT Psycholinguistic Databases, which was essential for my dissertation research.
Takashi Otake and Anne Cutler also had a great impact on my interest in spoken word recognition and encouraged me to pursue my PhD here at The Ohio State University. They gave me the opportunity to work as a research assistant on several important studies of Japanese word recognition; without that experience, I would not have become a PhD student at Ohio State. Takashi Otake, my master's adviser, encouraged me to pursue the PhD and has been supportive for more than ten years. All of my dissertation experiments were conducted at his lab; without his support, this dissertation would not exist. Anne Cutler is also an excellent role model as a researcher. I decided to pursue my graduate work in the States because I wanted to be a researcher like her; her energetic and enthusiastic attitude toward research has inspired me greatly.
I also owe a great deal of gratitude to the Labbies, teachers, staff, and friends at OSU, who offered me help whenever I needed it: Beth Hume, Jan Edwards, JJ Nakayama, Osamu Fujimura, Matt Makashay, Satoko Katagiri, Janice Fon, Pauline Welby, Steve Winters, Laurie Maynell, Allison Blodgett, Tsan Huang, Georgios Tserdanelis, Misun Seo, Peggy Wong, Grant McGuire, Craig Hilts, Robbin Dautricourt, Amanda Miller-Ockhuizen, Jennifer Venditti, Stefanie Jannedy, Mariapaola D'Imperio, Liz Strand, Rebecca Herman, Jennifer Vannest, Julie McGory, Nick Cipollone, Sanae Eda, Kooichi Sawasaki, Jim Harmon, and Matt Hyclak.
I thank my family members, who have always been thinking of me from Japan: my parents, Susumu and Rumiko Yoneyama; my younger brother and sister-in-law, Kiyoshi and Yumi Yoneyama; and my youngest brother, Kazunari Yoneyama.
My dearest grandmother, who passed away at the age of 96 in December 2001, was supportive of me my entire life. She did not go to school, but she was always interested in education, and she kept encouraging me to complete this endeavor. One of the two wishes she asked of me was that I complete my degree before she went to see her husband in the afterlife. She could not see me finish, but I still believe that she is truly happy, from there, to see that I have.
This is your dissertation, grandma.
VITA
June 23, 1969 ...... Born, Chigasaki, Kanagawa, Japan
1993 ...... B.A. English Language, Dokkyo University, Soka, Japan
1995 ...... M.A. English Linguistics, Dokkyo University, Soka, Japan
1995-present ...... Graduate Teaching and Research Associate, Department of Linguistics, The Ohio State University
2000 ...... M.A. Linguistics, The Ohio State University, Columbus, OH
PUBLICATIONS
Peer-reviewed Journal Article

Otake, T., Yoneyama, K., Cutler, A., & van der Lugt, A. (1996). The representation of Japanese moraic nasals. Journal of the Acoustical Society of America, 100 (6), 3831-3842.
Book Chapters
1. Otake, T., Hatano, G., & Yoneyama, K. (1996). Speech segmentation by Japanese. In Otake, T. and Cutler, A. (eds.), Phonological Structure and Language Processing: Cross-linguistic Studies. Berlin: Mouton de Gruyter, 183-201.
2. Yoneyama, K. (1996). Spoken language recognition and segmentation: Evidence from data of monolingual and bilingual speakers. In The Circle of Phonology in Japan (ed.), Study on Phonology, 179-182 (written in Japanese).
Conference Proceedings
1. Otake, T. & Yoneyama, K. (2000). Shinnai jisho no on'in tani to sono ninshiki (Recognition of phonological units in the mental lexicon). Phonological Studies, 3, 21-28 (written in Japanese).
2. Otake, T. & Yoneyama, K. (1999). Shinnai jisho ni okeru onsetsu to mora no ninshiki (Recognition of syllables and moras in the mental lexicon). Proceedings of the 113th Annual Meeting of the Phonetic Society of Japan, 77-82 (written in Japanese).
3. Yoneyama, K. & Pitt, M.A. (1999). Prelexical representation in Japanese: Evidence from the structural induction paradigm. Proceedings of the 14th International Congress of Phonetic Sciences, vol. 2, 893-896.
4. Otake, T. & Yoneyama, K. (1999). Listeners' representations of within-word structure: Japanese preschool children. Proceedings of the 14th International Congress of Phonetic Sciences, vol. 3, 2193-2196.
5. Yoneyama, K. & Johnson, K. (1999). An instance-based model of categorical perception in Japanese by native and non-native listeners: A case of segmental duration. Phonological Studies, 2, 11-18.
6. Otake, T. & Yoneyama, K. (1998). Phonological units in speech segmentation and phonological awareness. Proceedings of the International Conference on Spoken Language Processing 98, Vol. 5, 2179-2182.
7. Otake, T., Yoneyama, K., & Maki, H. (1998). Non-native listeners' representations of within-word structure. Proceedings of the 16th International Congress on Acoustics and the 135th Meeting of the Acoustical Society of America, Vol. 2, 2067-2068.
8. Yoneyama, K. & Johnson, K. (1998). An instance-based model of Japanese speech recognition by native and non-native listeners. Proceedings of the 16th International Congress on Acoustics and the 135th Meeting of the Acoustical Society of America, Vol. 3, 2977-2978.
9. Otake, T. & Yoneyama, K. (1996). Can a mora occur word-initially in Japanese? Proceedings of the 1996 International Conference on Spoken Language Processing, Philadelphia, vol. 4, 2454-2457.
10. Yoneyama, K. (1996). Segmentation strategies for spoken language recognition: Evidence from semi-bilingual Japanese speakers of English. Proceedings of the 1996 International Conference on Spoken Language Processing, Philadelphia, vol. 1, 454-457.
11. Otake, T. & Yoneyama, K. (1995). A moraic status and syllable structure in speech perception. Proceedings of the XIIIth International Congress of Phonetic Sciences, vol. 2, 686-689.
12. Otake, T. & Yoneyama, K. (1994). A moraic nasal and a syllable structure in Japanese. Proceedings of the 1994 International Conference on Spoken Language Processing, Yokohama, vol. 3, 1427-1430.
Technical Reports
1. Yoneyama, K. (1997). A cross-linguistic study of diphthongs in spoken word processing in Japanese and English. OSU Working Papers in Linguistics, 50, 163-175.
2. Yoneyama, K. (1995). Segmentation procedure by semi-bilingual speakers of Japanese and English. Dokkyo Working Papers in Linguistics, vol. 11, 67-107.
3. Otake, T. & Yoneyama, K. (1995). Recognition of a moraic nasal in different speech rates. Dokkyo Studies in Data Processing and Computer Science, 13, 23-32 (written in Japanese).
4. Otake, T. & Yoneyama, K. (1994). A geminate consonant and a syllable structure in Japanese. Dokkyo Studies in Data Processing and Computer Science, 12, 55-64 (written in Japanese).
FIELDS OF STUDY
Major Field: Linguistics
Specialization: Psycholinguistics, Phonetics
TABLE OF CONTENTS
ABSTRACT...... ii
DEDICATION...... iv
ACKNOWLEDGMENTS...... v
VITA...... vii
LIST OF TABLES...... xiii
LIST OF FIGURES...... xviii
LIST OF EQUATIONS...... xx
CHAPTERS
1. INTRODUCTION...... 1
1.1. Introduction ...... 1
1.2. Information Used for Lexical Access ...... 7
1.3. Mental Lexicon and Phonological Neighbors ...... 14
1.4. Testing Neighborhood Effects in Japanese: Overview ...... 21
1.5. Organization of the Dissertation ...... 24
2. STIMULI AND THEIR NEIGHBORHOODS...... 26
2.1. Introduction ...... 26
2.2. Japanese Mental Lexicon ...... 26
2.3. Defining Neighbors in Japanese ...... 29
2.3.1. The Segments Calculation ...... 30
2.3.2. The Segments + Pitch Calculation ...... 31
2.3.2.1. Similarity Judgments on Japanese Pitch-Accent Patterns ...... 34
2.3.2.1.1. Purpose ...... 34
2.3.2.1.2. Method ...... 35
2.3.2.1.2.1. Stimuli ...... 35
2.3.2.1.2.2. Participants ...... 37
2.3.2.1.2.3. Procedure ...... 37
2.3.2.1.3. Results ...... 38
2.3.2.1.3.1. Word Prosodic Similarity Based on Greenberg-Jenkins' Rules ...... 41
2.3.2.1.4. Calculating the Segments + Pitch Calculation ...... 43
2.3.3. The Auditory Calculation ...... 46
2.3.4. Comparison of Three Neighborhood Calculations ...... 54
2.4. Target Words ...... 57
2.5. Participants ...... 68
2.6. Summary ...... 69
3. EXPERIMENT 1: AUDITORY NAMING...... 70
3.1. Introduction ...... 70
3.2. Methods ...... 71
3.3. Results ...... 72
3.4. Discussion ...... 87
4. EXPERIMENT 2: AUDITORY NAMING IN NOISE...... 90
4.1. Introduction ...... 90
4.2. Methods ...... 91
4.3. Results ...... 92
4.3.1. Naming Time Data ...... 93
4.3.2. Word Identification Data ...... 99
4.4. Discussion ...... 107
5. EXPERIMENT 3: SEMANTIC CATEGORIZATION EXPERIMENT...... 112
5.1. Introduction ...... 112
5.2. Methods ...... 114
5.2.1. Stimuli ...... 114
5.2.2. Participants ...... 116
5.2.3. Procedure ...... 117
5.3. Results ...... 118
5.3.1. An Evaluation of the Semantic Categorization Task in Japanese ...... 118
5.3.2. Semantic Categorization Data ...... 128
5.4. Discussion ...... 142
6. GENERAL DISCUSSION AND CONCLUSION...... 145

6.1. Introduction ...... 145
6.2. Summary of Results ...... 146
6.2.1. Other Effects ...... 146
6.2.2. Neighborhood Density Effects ...... 151
6.2.2.1. Processing Time Data ...... 151
6.2.2.2. Word Identification Data ...... 153
6.3. Proposal: A Model of Spoken-Word Recognition and Word Production ...... 154
6.3.1. Plaut & Kello (1999) ...... 156
6.3.2. A Model of Spoken-Word Recognition and Word Production ...... 159
6.3.3. The Current Findings in Terms of the Proposed Model ...... 164
6.3.3.1. Experiment 1: Auditory Naming ...... 165
6.3.3.2. Experiment 2: Auditory Naming in Noise ...... 175
6.3.3.3. Experiment 3: Semantic Categorization ...... 180
6.3.4. Previous Findings in Terms of the Proposed Model ...... 184
6.3.4.1. Auditory Naming Experiments with Word Targets in English (Luce & Pisoni, 1998; Vitevitch & Luce, 1999) ...... 185
6.3.4.2. Lexical Decision Experiment in Japanese (Amano & Kondo, 1999) ...... 187
6.3.4.3. Implications for Current Recognition Models ...... 190
6.4. Conclusions ...... 194
BIBLIOGRAPHY...... 198
APPENDICES...... 209
Appendix A: Alphabetic symbols used in the lexicon ...... 209
Appendix B: The 300 stimulus pairs used in a similarity judgment experiment ...... 214
Appendix C: 700 Target Words ...... 215
Appendix D: Statistics in Experiment 1 ...... 238
Appendix E: Statistics in Experiment 2 ...... 246
Appendix F: Similarities of Sounds in Noise: MDS Analyses ...... 262
Appendix G: Semantic Categories ...... 273
Appendix H: Reasons to Discard Eight Target Words From the Final Analysis in Experiment 3 ...... 304
Appendix I: Statistics in Experiment 3 ...... 306
LIST OF TABLES
Table 2.1: Contents of the NTT Database Series (Amano & Kondo, 1999; 2000) ...... 27
Table 2.2: The target, neighbors selected by the Segments calculation, and its operations ...... 31
Table 2.3: Twenty tonal patterns tested in Experiment 1 (0 = low pitch; 1 = unaccented high pitch; * = accented high pitch) ...... 36
Table 2.4: A target word and its four potential neighbors selected at the first stage with information calculating neighborhood density ...... 45
Table 2.5: Descriptive statistics of neighborhood density computed by three different neighborhood calculations...... 56
Table 2.6: Pearson correlation matrix of the three neighborhood calculations ...... 57
Table 2.7: Three representations of anago, 'conger eel', found in the Word Frequency Database (Volume 7, The NTT Database Series) ...... 59
Table 2.8: A summary of the uniqueness-point tabulations ...... 62
Table 2.9: The number of words beginning with ka, the total number of words in the lexicon, and the proportion of words beginning with ka in the lexicon ...... 63
Table 2.10: Pearson correlation matrix of the three neighborhood calculations and other factors ...... 67
Table 3.1: Basic model of the naming time data for fast namers, Experiment 1 ...... 74
Table 3.2: Models of the naming time data for fast namers, Experiment 1 ...... 76
Table 3.3: The number of responses before and after the offset of the 700 target words for fast namers ...... 78
Table 3.4: Basic model of the naming time data for slow namers, Experiment 1 ...... 79
Table 3.5: Models of the naming time data for slow namers, Experiment 1 ...... 80
Table 3.6: The number of responses before and after the offset of the 700 target words for slow namers ...... 82
Table 3.7: A summary of the reliable effects from the regression models for the naming time data (fast namers and slow namers), Experiment 1. Effects in bold show the calculation that yielded the highest increase in R² ...... 83
Table 4.1: Basic model of the naming time data for fast namers, Experiment 2 ...... 94
Table 4.2: Models of the naming time data for fast namers, Experiment 2 ...... 95
Table 4.3: Basic model of the naming time data for slow namers, Experiment 2 ...... 96
Table 4.4: Models of the naming time data for slow namers, Experiment 2 ...... 97
Table 4.5: A summary of the regression models for the naming time data (both fast namers and slow namers), Experiment 2 ...... 98
Table 4.6: Basic model of the word identification data for fast namers, Experiment 2 ...... 101
Table 4.7: Models of the word identification data for fast namers, Experiment 2 ...... 102
Table 4.8: Basic model of the word identification data for slow namers, Experiment 2 ...... 103
Table 4.9: Models of the word identification data for slow namers, Experiment 2 ...... 104
Table 4.10: A summary of the regression models for the word identification data (both fast namers and slow namers), Experiment 2 ...... 105
Table 5.1: A summary of correct responses for "yes” filler-word responses in terms of semantic categories ...... 121
Table 5.2: A summary of the accuracy data for the 700 target words ...... 125
Table 5.3: The discarded target words for the final analysis in the semantic categorization experiment ...... 127
Table 5.4: Basic model of the semantic categorization data, Experiment 3 ...... 129
Table 5.5: Models of the semantic categorization data, Experiment 3 ...... 130
Table 5.6: A summary of the regression model with two types of neighborhood density (facilitative and inhibitory), Experiment 3 ...... 132
Table 5.7: Categorization data of fast responders, Experiment 3 ...... 134
Table 5.8: Models of the semantic categorization data for slow responders, Experiment 3 ...... 135
Table 5.9: A summary of the regression models with two types of neighborhood density (facilitative and inhibitory) for fast responders and slow responders, Experiment 3 ...... 137
Table 6.1: Summaries of effects in the three experiments: processing time data (top) and accuracy data (bottom). (F = facilitative effect, I = inhibitory effect, H = higher accuracy, L = lower accuracy, N/A = not applicable) ...... 147
Table 6.2: Summary of the neighborhood density effects on processing times in the three experiments. Effects in bold show the calculation that yielded the highest increase in R² ...... 152
Table 6.3: Summary of the neighborhood density effect in the word identification data of Experiment 2 ...... 154
Table 6.4: Characteristics of three neighborhood density calculations ...... 167
Table 6.5: Relationships between the acoustic input and neighborhood density calculation ...... 168
Table D.1: Basic model for naming data (fast namers), Experiment 1 ...... 238
Table D.2: Basic model + Neighborhood density (Segments) for naming data (fast namers), Experiment 1 ...... 239
Table D.3: Basic model + Neighborhood density (Segments + Pitch) for naming data (fast namers), Experiment 1 ...... 240
Table D.4: Basic model + Neighborhood density (Auditory) for naming data (fast namers), Experiment 1 ...... 241
Table D.5: Basic model for naming data (slow namers), Experiment 1 ...... 242
Table D.6: Basic model + Neighborhood density (Segments) for naming data (slow namers), Experiment 1 ...... 243
Table D.7: Basic model + Neighborhood density (Segments + Pitch) for naming data (slow namers), Experiment 1 ...... 244
Table D.8: Basic model + Neighborhood density (Auditory) for naming data (slow namers), Experiment 1 ...... 245
Table E.1: Basic model for naming data (fast namers), Experiment 2 ...... 246
Table E.2: Basic model + Neighborhood density (Segments) for naming data (fast namers), Experiment 2 ...... 247
Table E.3: Basic model + Neighborhood density (Segments + Pitch) for naming data (fast namers), Experiment 2 ...... 248
Table E.4: Basic model + Neighborhood density (Auditory) for naming data (fast namers), Experiment 2 ...... 249
Table E.5: Basic model for naming data (slow namers), Experiment 2 ...... 250
Table E.6: Basic model + Neighborhood density (Segments) for naming data (slow namers), Experiment 2 ...... 251
Table E.7: Basic model + Neighborhood density (Segments + Pitch) for naming data (slow namers), Experiment 2 ...... 252
Table E.8: Basic model + Neighborhood density (Auditory) for naming data (slow namers), Experiment 2 ...... 253
Table E.9: Basic model for word identification data (fast namers), Experiment 2 ...... 254
Table E.10: Basic model + Neighborhood density (Segments) for word identification data (fast namers), Experiment 2 ...... 255
Table E.11: Basic model + Neighborhood density (Segments + Pitch) for word identification data (fast namers), Experiment 2 ...... 256
Table E.12: Basic model + Neighborhood density (Auditory) for word identification data (fast namers), Experiment 2 ...... 257
Table E.13: Basic model for word identification data (slow namers), Experiment 2 ...... 258
Table E.14: Basic model + Neighborhood density (Segments) for word identification data (slow namers), Experiment 2 ...... 259
Table E.15: Basic model + Neighborhood density (Segments + Pitch) for word identification data (slow namers), Experiment 2 ...... 260
Table E.16: Basic model + Neighborhood density (Auditory) for word identification data (slow namers), Experiment 2 ...... 261
Table F.1: The mean number of errors on consonants and vowels in each of the two analyses ...... 267
Table F.2: The mean number of errors in terms of word positions ...... 267
Table F.3: Proportions of responses for vowels ...... 268
Table F.4: A similarity matrix for vowels ...... 268
Table F.5: Proportions of responses for consonants ...... 270
Table F.6: A similarity matrix for consonants ...... 271
Table I.1: Basic model for semantic categorization data, Experiment 3 ...... 305
Table I.2: Basic model + Neighborhood density (Segments) for semantic categorization data, Experiment 3 ...... 306
Table I.3: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data, Experiment 3 ...... 307
Table I.4: Basic model + Neighborhood density (Auditory) for semantic categorization data, Experiment 3 ...... 308
Table I.5: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data, Experiment 3 ...... 309
Table I.6: Basic model for semantic categorization data (fast responders), Experiment 3 ...... 310
Table I.7: Basic model + Neighborhood density (Segments) for semantic categorization data (fast responders), Experiment 3 ...... 311
Table I.8: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data (fast responders), Experiment 3 ...... 312
Table I.9: Basic model + Neighborhood density (Auditory) for semantic categorization data (fast responders), Experiment 3 ...... 313
Table I.10: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data (fast responders), Experiment 3 ...... 314
Table I.11: Basic model for semantic categorization data (slow responders), Experiment 3 ...... 315
Table I.12: Basic model + Neighborhood density (Segments) for semantic categorization data (slow responders), Experiment 3 ...... 316
Table I.13: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data (slow responders), Experiment 3 ...... 317
Table I.14: Basic model + Neighborhood density (Auditory) for semantic categorization data (slow responders), Experiment 3 ...... 318
Table I.15: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data (slow responders), Experiment 3 ...... 319
LIST OF FIGURES
Figure 2.1: Fundamental frequency (F0) contours of ana, 'hole' (top) and a'na, 'announcer' (bottom) ...... 33
Figure 2.2: The mean similarity rating in the order AB as a function of the mean similarity rating in the order BA. The numbered points (1 to 4) in the figure represent 011* vs. 0111, 01* vs. 011, 0* vs. 01, and 0111* vs. 01111, respectively ...... 40
Figure 2.3: The number of operations (substitutions, deletions or insertions) as a function of the similarity ratings (left), the number of operations as a function of the median similarity rating with a logarithmic function (right) ...... 42
Figure 2.4: Frequency counts as a function of operations for pitch-pattern responses in the auditory naming in noise experiment (Chapter 4) ...... 44
Figure 2.5: Examples of LAFS and X-MOD representations of “Cat.” ...... 47
Figure 2.6: Quantized vectors of the exemplars (kodomo and domori) ...... 48
Figure 2.7: A neighbor-nonneighbor distinction in a similarity space in the Auditory calculation ...... 50
Figure 2.8: The target word, kodomo, 'child', and its seven most similar neighbors in the Auditory calculation ...... 52
Figure 2.9: Frequency counts of target words as a function of neighborhood density. Neighborhood density by the Segments calculation (top), neighborhood density by the Segments + Pitch calculation (middle), and neighborhood density by the Auditory calculation (bottom) ...... 55
Figure 2.10: Distribution of word frequency of the target words ...... 60
Figure 2.11: Frequency counts of the target words as a function of frequency of the first mora. Words beginning with a fricative (top left), words beginning with a nasal (top right), and words beginning with a stop (bottom left) ...... 64
Figure 2.12: Distribution of the durations of the target words ...... 65
Figure 5.1: The categorization times as a function of the number of segments from the word-initial point at which a given word becomes unique among the words in the lexicon (UP) ...... 141
Figure 6.1: A model of speech comprehension and production by Plaut & Kello (1999)...... 158
Figure 6.2: A model of spoken-word recognition and word production ...... 161
Figure 6.3: Participants’ performance in Experiment 1 (Auditory naming) in which they started naming the words after they had heard only part of the word by exploiting only segmental information ...... 171
Figure 6.4: Participants' performance in Experiment 1 (Auditory naming) in which the participants started naming the words after they had partially heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information ...... 172
Figure 6.5: Participants' performance in Experiment 1 (Auditory naming) in which the participants started naming the words after they had completely heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information ...... 174
Figure 6.6: Participants' performance in Experiment 2 (Auditory naming in noise) in which the participants started naming the words after they had completely heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information ...... 176
Figure 6.7: Participants’ performance in Experiment 3 (Semantic categorization) ...... 181
Figure F.1: MDS for vowels. Dimensions 1 and 2 represent vowel height (F1) and backness (F2), respectively ...... 269
Figure F.2: MDS for consonants. SS = [f], C = [ts], CC = [c], KY = [kʲ], Z = [ʒ], Y = [j]. Dimensions 1 and 2 represent [±voice] and [±sonorant], respectively ...... 272
LIST OF EQUATIONS
Equations 2.1 & 2.2. Equations used in the Auditory calculation for neighborhood density ...... 49
CHAPTER 1
INTRODUCTION
1.1. Introduction
This dissertation investigates two aspects of lexical access in Japanese word
recognition.
The first aspect to be investigated is the kind of word representation used for
lexical access. Two types of word recognition models have been proposed to account for
word representation. Some models propose that words are represented in the lexicon in
the form of abstract phonological structures (Grossberg, Boardman, & Cohen, 1997;
McClelland & Elman, 1986; Norris, 1994; Norris, McQueen & Cutler, 2000). In these
models, the acoustic speech stream is coded as a normalized, language-specific
phonological representation (which may consist of features, phonemes, syllables, or a
combination of those). This prelexical phonological representation is used for matching
with lexical representations.
Other models assume that word forms are stored in the brain in the form of
detailed acoustic traces (Goldinger, 1992, 1996; Klatt, 1979, 1981; Johnson, 1997a,
1997b; Pisoni, 1997). Word recognition involves a “direct” comparison between
memorized acoustic patterns and the pattern elicited by the current acoustic signal. Each
word is associated with many acoustic tokens, and word recognition consists of finding the
nearest match in a vast collection of word forms. Johnson (1997a, b) proposed an
exemplar-based model of speech perception in which the words are recognized based on
auditory representation. Experimental evidence supporting this view has shown that in
word recognition tasks, participants are very sensitive to nonlinguistic surface form such
as the speaker’s voice (Goldinger, 1996; Schacter & Church, 1992; also see Pisoni,
1997).
However, recent studies have shown that listeners might have both abstract and
episodic representations. Luce and Lyons (1998) showed that English listeners exploit
both abstract and episodic representations of words from identification and memory tasks
as well as from a lexical decision task. This view is also supported by Pallier, Colome,
and Sebastian-Galles (2001), who claim that Spanish-Catalan bilinguals exploit both
abstract and episodic representations. This dissertation aims to investigate which word
representation needs to be assumed in the lexicon and is involved in lexical access in
Japanese.
The second aspect investigated in this dissertation is lexical competition in
Japanese auditory word recognition. Adults actively use a lexicon that was built in
infancy based on surface regularities in the input. It is assumed that lexical entries are
organized into a network, and these entries compete with each other during lexical access
("word competition”). There is now a large body of evidence supporting word
competition models (Gow & Gordon, 1995; McQueen, Norris, & Cutler, 1994; Norris,
McQueen, & Cutler, 1995; Shillcock, 1990; Tabossi, Burani, & Scott, 1995; Vitevitch &
Luce, 1999; Vroomen & de Gelder, 1995a, 1997; Wallace, Stewart, & Malone, 1995b;
Wallace, Stewart, Shaffer, & Mellor, 1995; Zwitserlood, 1989; Zwitserlood & Schriefers,
1995). Most of these studies have been conducted in English, so this hypothesis also needs
to be tested in other languages, such as Japanese.
These two questions are simultaneously tested by exploring neighborhood density
effects in Japanese auditory word recognition. Neighborhood density is a measure of the
lexical competition effect. It follows that testing for neighborhood density effects in Japanese
also tests lexical competition. In order to define neighbors, explicit assumptions need to be
made about the word representation. For example, English neighborhood density is based
on word similarity, but the word forms compared are assumed to be raw sequences of
segments. In English, stress pairs with no marked difference in vowel quality (such as
differ and defer) are much rarer than in Dutch, so to a large extent, stress is encoded
in the choice of vowel symbols in the Hoosier Mental Lexicon (HML), an online database of
20,000 English words (Pisoni, Nusbaum, Luce, & Slowiaczek, 1985). Also, in both
English and Dutch, pitch shapes (accent types, etc.) are associated with stress and are
specified by pragmatic functions. Therefore, the pitch shape per se is not a lexical
property in the same way as the accent pattern is in Japanese. In the case of Japanese,
pitch accent patterns play an important role for word recognition; it is unclear whether the
word representation assumed in English neighborhood studies is also true in Japanese.
Therefore, in order to answer these questions, neighborhood density experiments are
conducted in this dissertation.
Neighborhood density can be defined broadly as the number of words that are
similar in sound to a specific word, and the way in which these similar-sounding words
affect recognition of the target word (Pisoni et al., 1995; Luce, 1986a; Goldinger, Luce, &
Pisoni, 1989; Luce, Pisoni, & Goldinger, 1990; Luce & Pisoni, 1998; Vitevitch & Luce,
1998; Vitevitch & Luce, 1999; Luce, Goldinger, Auer, & Vitevitch, 2000; Luce & Large,
2001; Amano & Kondo, 1999). Neighborhood density shows inhibition: words that are
similar to many other words (dense neighborhoods) are recognized more slowly than
words in sparse neighborhoods (Luce & Pisoni, 1998; Vitevitch & Luce, 1999; Amano &
Kondo, 1999). However, neighborhood density sometimes shows facilitation: nonwords
that are similar to words in dense neighborhoods are recognized more quickly than the
ones in sparse neighborhoods (Vitevitch & Luce, 1998, 1999). The neighborhood
facilitative effect is observed not only among adults but also among infants and children
(Charles-Luce & Luce, 1995; Metsala, 1997; Pitrat, Logan, Cockell, & Gutteridge, 1995;
Garlock, Walley, & Metsala, 2001).
Neighborhood density and probabilistic phonotactics have a strong positive
correlation in the language between the number of overlapping words and segmental
frequency. Typically, as the number of overlapping words increases, the frequencies of
the segments that make up the overlapping words also increase. Based on this fact, a
neighborhood facilitative effect is interpreted as an effect of probabilistic phonotactics.
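One common operationalization of probabilistic phonotactics is position-specific segment probability: how often each phoneme occurs at each position across the words of the lexicon. The following is a minimal sketch of that idea only, unweighted by word frequency; the representation (phoneme tuples) and the function name are my own, not taken from the studies cited above:

```python
from collections import Counter

def positional_segment_probs(lexicon):
    """P(segment | position) over a lexicon of phoneme tuples."""
    counts = Counter((i, seg) for word in lexicon
                     for i, seg in enumerate(word))
    totals = Counter(i for word in lexicon
                     for i in range(len(word)))
    return {(i, seg): c / totals[i] for (i, seg), c in counts.items()}
```

On a toy lexicon of overlapping words like /kæt/, /kæp/, and /bæt/, the positional probabilities of /k/ and /æ/ are high, illustrating the correlation the text describes: dense neighborhoods drive up the frequencies of their shared segments.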
Of course, infants do not have a lexicon yet, and it is known that symbolic
representation is acquired as a necessity for production (Jusczyk, 1993; Plaut & Kello,
1999). Jusczyk (1993) has proposed that infants acquire auditory exemplars in the
lexicon. Perhaps the two different types of lexical representations (symbolic vs.
cochleagram) represent different potential “levels” for the child as well. Maybe infants’
lexical representations are based on auditory exemplars for their sensitivity to sound
frequency whereas lexical representations of children and adults, who have already
established a production path from semantic space to articulation, are based on symbolic
representations.
However, several questions are still unresolved in this research. The first
question is whether this neighborhood density effect universally affects the processing
times for lexical access. The effect of neighbors has been confirmed in English, but not
in many other languages. Amano and Kondo (1999) tested neighborhood density effects
in Japanese and found that the effects were significant in the accuracy of a word
identification in noise experiment (an off-line task) but not in the processing times of a lexical
decision experiment. The second question is exactly how similarity of neighbors should
be calculated. Should we calculate neighborhoods simply in terms of words that differ by
one phoneme, as in the Greenberg and Jenkins (1964) calculation? Does subphonemic
auditory/acoustic similarity determine a word’s neighbors? Also, what role does prosody
play? The third question is highly related to the second question: how are words stored in
the mental lexicon? Are words in the mental lexicon highly detailed episodic
representations as claimed in Goldinger (1989, 1996, 1998) and Johnson (1997a, b), or
does the lexicon consist of highly abstract phonological representations, as linguists
generally assume? Or can we assume that words have both representations as claimed in
Luce and Pisoni (1998) and Pallier and his colleagues (2001)? Exploring the definition of
neighbors in Japanese word recognition should lead to better understanding of the issues
in word representation and word competition.
This dissertation reports the results of auditory naming, word identification in
noise, and semantic categorization experiments designed to explicitly compare phoneme-
based neighborhood definitions and more fine-grained acoustically-based neighborhood
definitions, and to compare neighborhoods defined with and without prosodic structure.
Acoustic similarity is calculated using the audio stimuli available in the NTT database
(Amano & Kondo, 1999, 2000) using a new methodology developed in this dissertation.
Using various ways to calculate neighborhood density makes it possible to determine
which level of lexical representation (an acoustic-auditory representation or an abstract
phonemic representation) is used to calculate phonological similarity within the lexicon.
The results show that different lexical neighborhoods are operative both at lexical and
sublexical levels of word recognition.
The results demonstrate that phonological similarity within the lexicon seems to
be calculated based on the acoustic-auditory representation rather than on an abstract
phonemic representation in all of the experiments. The implications of the results for
current word recognition theories are also discussed.
1.2. Information Used for Lexical Access
One of the important tasks in understanding speech is finding out the relationship
between speech and linguistic structure. Essentially, language users must learn the
relationships among three spaces: the acoustic input, articulation, and semantics. These
mappings are not direct at all. In order to mediate these three spaces, a phonological
(cognitive) representation emerges. For production and comprehension, language users
have to learn how to map the acoustic properties of speech (such as fundamental
frequency, intensity, duration) onto the (more abstract) linguistic structure stored in the
mental lexicon acquired in infancy through comprehension, and how to convert the
linguistic structure to the acoustic speech signal through articulation. I believe that
“phonetics” is the study of the connection between the phonological (cognitive) structure
of the language and the physiological properties of speech.
The terms pitch, loudness, length, and timbre are often used as auditory correlates
of fundamental frequency, intensity, duration and spectral characteristics, respectively.
Such impressions are evidently determined not only by the physical characteristics of the
speech signal but also by language users’ knowledge. These impressions somehow
straddle the boundary between the physical world and language users’ abstract (cognitive)
representations of that world. This means that every language user has to figure out how
to interpret the physical aspects of speech in a way that fits into their language system in
order to understand speech (also see Cutler, 1997).
In order to understand lexical access, we need to understand how the mental
lexicon is developed in infancy, what kinds of word representation need to be assumed,
how adult listeners map acoustic information onto the words stored in the mental lexicon,
and how adult speakers map from words stored in the mental lexicon to articulatory
plans.
Once language users acquire the lexicon, information about phonological and
morphological patterns may be available for lexical access. The question then is which
kinds of information are exploited by adults, who have established lexicons and who can
rely on the shapes of words as a whole (i.e., who are not “pre-lexical” anymore), for
lexical access. Cutler (1997) claims that adult listeners use many kinds of phonological
information acquired in a pre-lexical period during infancy, suggesting that phonological
information is in effect prior to morphological information. This view is also confirmed
by Smith and Pitt (1999) who found that the formation of syllabic structure is guided by
phonology prior to morphology. Segmentation studies have shown that different kinds of
phonological information are used for lexical access.
Even after acquiring the lexicon, studies suggest that listeners use multiple
phonological cues for lexical access. Rhythmic structure of languages is a strong cue for
adults. Across many languages it appears that listeners exploit metrical structure to locate
word boundaries in speech, although these boundaries can be determined in a highly
language-specific way. Several studies using syllable monitoring experiments suggest
that French listeners segment the incoming speech signal into syllabic units (Cutler,
Mehler, Norris, & Segui, 1986; Cutler, Mehler, Norris & Segui, 1992; Mehler,
Dommergues, Frauenfelder & Segui, 1981; Otake, Hatano, Cutler & Mehler, 1993).
French listeners segment speech into syllables even when the input is in a language other
than French, such as English (Cutler et al., 1986) or Japanese (Otake et al., 1993). These
syllable-based segmentation results are also confirmed by studies on bilingual listeners,
suggesting that French-dominant bilinguals clearly exhibited syllabic segmentation when
they listened to French, whereas English-dominant bilinguals did not (Cutler et al.,
1992). The syllabic effect in French was also replicated in a study with a phoneme-
induction paradigm, a variant of the phoneme-monitoring task (Pallier, Sebastian-Galles,
Felguera, Christophe, & Mehler, 1993). The syllabic effect has been found with
speakers of Spanish and Catalan in syllable-monitoring experiments, though not in all
cases (Bradley, Sanchez-Casas & Garcia-Albea, 1993; Sebastian-Galles, Dupoux, Segui
& Mehler, 1992).
Many studies have shown that language users in English and Dutch employ
metrical stress information (alternations of strong and weak syllables that are based on
vowel quality (full vs. reduced vowels); Fear, Cutler & Butterfield, 1995). Cutler and
Norris (1988) found that stress alternation (strong and weak syllables) was keyed to word
segmentation in English word-spotting studies. They used a word-spotting task in which
listeners were asked to press a button as soon as they heard a real word embedded at the
beginning of a pseudoword. The results showed that the detecting times for mint in
mintayf and mintef were significantly different, whereas the detecting times for thin in
thintayf and thintef were not significantly different. They proposed that the CVCC target
(mint) from mintayf was divided across two segmentation units into min_t, so that
listeners had a problem detecting the target, whereas in the case of mintef, the CVCC
target was not split because the “e” was a reduced vowel. However, listeners did not have any
problem detecting CVC targets (thin) because the segmentation of the target in thin_tayf
and thin_tef should not be any different. Based on these results, Cutler and Norris
proposed the Metrical Segmentation Strategy (MSS), which is based on the strong/weak
syllable alternation for speech segmentation in English. The MSS effect was also found
in word-spotting studies by McQueen and his colleagues (1994) and Norris and his
colleagues (1995). Cutler and her colleagues further found that English-dominant
bilinguals use a stress-based segmentation strategy (Cutler et al., 1992). This MSS effect
has also been demonstrated in both spontaneous and experimentally elicited
misperceptions in English (Cutler & Butterfield, 1992), and in word blending
experiments (Cutler & Young, 1994). Further, English listeners’ sensitivity to
predominant stress patterns (strong/weak) is also supported by computational analyses on
the English lexicon and corpus (Cutler & Carter, 1987). Smith and Pitt (submitted)
further replicated the MSS in word spotting experiments. The MSS in Dutch is also
found in cross-modal identity priming experiments (Vroomen & de Gelder, 1995) and in
a laboratory-induced misperception experiment and a word-spotting experiment
(Vroomen, van Zon & de Gelder, 1996).
Japanese listeners have been shown to segment speech into morae in studies on
Japanese auditory word recognition using syllable monitoring (Otake et al., 1993; Otake,
Hatano, & Yoneyama, 1996a), phoneme monitoring (Cutler & Otake, 1994; Otake,
Yoneyama, Cutler, & van der Lugt, 1996b), word blending (Kubozono, 1995), phoneme
induction (Yoneyama & Pitt, 1999), and word spotting (McQueen, Otake, & Cutler,
2001).
Adult studies have shown that adults also employ the statistics of the language
input for word segmentation. Saffran, Newport, and Aslin (1996) and Saffran, Newport,
Aslin, and Barrueco (1997) exposed adult English-speaking listeners (directly or
indirectly) to an artificial language in which the only cues available for word segmentation
were transitional probabilities between syllables; the listeners were nevertheless able to
learn the words of this language. Pitt
and McQueen (1998) claim that compensation for coarticulation, which is used as a
strong piece of evidence in support of interactive models of speech perception (like
TRACE), can be an effect of local transitional probability of segments. McQueen (1998)
and van der Lugt (2001) both demonstrated that Dutch listeners use phonotactic cues to
help solve the segmentation problem through word-spotting experiments. Phonotactics is
used by listeners to process phonologically illegal sequences in English (Pitt, 1998) and in
Japanese (Dupoux, Kakehi, Hirose, Pallier & Mehler, 1999; Dupoux, Pallier, Kakehi &
Mehler, 2001).
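The transitional-probability cue exploited in the Saffran et al. studies is simple to state: P(B | A) = frequency(AB) / frequency(A) for adjacent syllables, with low-probability transitions tending to fall at word boundaries. A minimal sketch of that computation (the function name and toy representation are mine, not from the studies cited):

```python
from collections import Counter

def transitional_probabilities(syllables):
    """P(next | current) for every adjacent syllable pair in a stream."""
    pairs = Counter(zip(syllables, syllables[1:]))
    firsts = Counter(syllables[:-1])
    return {(a, b): count / firsts[a] for (a, b), count in pairs.items()}
```

On a stream in which "bi" always follows "ba" but "bu" follows "bi" only some of the time, the within-word transition ("ba"→"bi") receives probability 1.0 while the weaker transition marks a likely boundary.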
As we have seen here, adults are sensitive to different levels of statistical
structures of the input language. However, other types of phonological information are
also used among adult listeners. Syllables are used as segmentation units in French,
Spanish, and Catalan, as discussed above. All syllable monitoring experiments with English
listeners so far have failed to show this effect (Cutler et al., 1986; Bradley et al., 1993)
regardless of whether targets have been presented visually or auditorily. Even with
foreign-language input in which syllable boundaries are clear, experiments have not
shown this effect (Cutler et al., 1986; Otake et al., 1993). However, when using
methodologies other than a syllable monitoring task, the syllabic effect has been found in
English. Bruck, Treiman and Caravolas (1995) found that the listeners were able to
decide whether two nonsense words share sounds more quickly when the nonsense words
shared a syllable (e.g., [kipæst] and [kipbeld]) than when they did not (e.g., [flingil] and
[flikboz]), suggesting that syllabified representations of the nonwords may be used in a
comparison task, even in English. Finney, Protopapas, and Eimas (1996) showed that
English listeners can use syllabic information to cue them to the location of phoneme
targets in a phoneme induction paradigm (a variant of the phoneme detection task) that
showed a syllabic effect in French (Pallier et al., 1993), although the syllabic effect was
not observed in English words with strong first syllables. Pitt, Smith, and Klein (1998)
further conducted more controlled induction experiments with a baseline condition in
which no induction is manipulated. Unlike in Finney et al. (1996), a syllabic effect
appeared even in words with strong first syllables as well as in nonsense words. Further,
Smith and Pitt (submitted) showed that the information used to determine syllable
boundaries (vowel length, lexical stress and phone class) was effective in determining
word boundaries. The syllable effect is also reported in Dutch using syllable-monitoring
experiments (Zwitserlood, Schriefers, Lahiri, & van Donselaar, 1993). This effect was
clearly shown in the unambiguous as well as in the ambisyllabic cases.
Listeners also rely on tonal information. Listeners exploit accent in Finnish
(Vroomen, Tuomainen & de Gelder, 1998) and in Spanish (Sebastian-Galles et al., 1992;
Sebastian-Galles, 1996). Although Cutler (1986) claimed that lexical stress does not
contribute to lexical access in English, a recent study by Smith and Pitt (submitted)
reported that American-English listeners exploit lexical stress for segmentation. Lexical
pitch accents are involved in auditory word recognition in Japanese (Sekiguchi &
Nakajima, 1999; Cutler & Otake, 1999; Otake & Cutler, 1999). Lexical tones play an
important role in Cantonese and Mandarin Chinese (Cutler & Chen, 1997; Ye & Connine,
1999).
Language-specific phonological information such as vowel harmony is effective
in Finnish (Suomi, McQueen & Cutler, 1997; Vroomen & de Gelder, 1998). Further,
adult listeners use many different cues that are related to physical characteristics of the
speech signal across languages: silence (Norris et al., 1997); allophonic cues such as
aspiration of word-initial stops in English (Lehiste, 1960; Nakatani & Dukes, 1977); the
duration of segments or syllables (Beckman & Edwards, 1990; Gow & Gordon, 1995;
Klatt, 1974, 1975; Lehiste, 1972; Oller, 1973; Quene, 1992, 1993; Saffran, Newport &
Aslin, 1996; Smith & Pitt, submitted); and fundamental frequency movement (Vroomen
et al., 1998; Hasegawa & Hata, 1992).
In summary, adults use multiple cues for lexical access. Listeners are sensitive to
all acoustic information relevant to the language’s phonology (Cutler, 1997). Based on
this general observation, Norris, McQueen, Cutler, and Butterfield (1997) proposed the
Possible Word Constraint (PWC), in which listeners use all possible phonological
information in order to find the appropriate word boundaries for lexical access. As a
result, they claimed that listeners do not segment fapple into f and apple because f is not a
possible word in English. The PWC is confirmed in other languages, such as Japanese
(McQueen et al., 2001). Therefore, McQueen, Cutler, Butterfield, and Kearns (2001)
claimed that the PWC is a universal constraint, based on the findings explained above.
1.3. Mental Lexicon and Phonological Neighbors
This section explores the role of the mental lexicon in lexical access. In spoken-
language recognition, adult listeners actively use the lexicon for lexical access. Lexical
entries are organized into a network, and compete with each other during access (“word
competition”). There is now a large body of evidence supporting the claim that words
compete for lexical access in adults (Gow & Gordon, 1995; McQueen et al., 1994; Norris
et al., 1995; Shillcock, 1990; Tabossi, Burani, & Scott, 1995; Vitevitch & Luce, 1999;
Vroomen & de Gelder, 1995, 1997; Wallace, Stewart, & Malone, 1995; Wallace, Stewart,
Shaffer, & Mellor, 1995; Zwitserlood, 1989; Zwitserlood & Schriefers, 1995). At the
same time, high probabilistic statistics (frequency of sounds or sound sequences within
words) at the prelexical level facilitates processing as explained above. These effects are
predicted by any type of activation-competition model, such as MERGE (Norris et al.,
2000), TRACE (McClelland & Elman, 1986), PARSYN (Luce et al., 2000), Shortlist
(Norris, 1994), and ARTPHONE (Grossberg et al., 1997). Thus, activation-competition
is central to current word recognition models.
Although many word recognition models do not provide information about how
the words are organized in the lexicon, the Neighborhood Activation Model (NAM; Luce,
1986a; Luce, Pisoni, & Goldinger, 1990; Luce & Pisoni, 1998) as well as PARSYN (a
connectionist model based on NAM; Luce et al., 2000) clearly addresses the structure of
the lexicon: The memory stored for the phonological forms of words is organized in
terms of sound similarity. The number of similar words (“neighbors”) shows an
inhibitory effect: Words with many neighbors are recognized more slowly than words
with few neighbors. Structural relations among words are measured by their
neighborhood size (“Neighborhood density”). For a given word in the lexicon, the
word’s neighborhood size is the number of words in the lexicon that contain sounds
similar to that word. A widely-used calculation of neighborhood size is based on an
algorithm proposed by Greenberg and Jenkins (1964). A neighborhood density effect has
been reported in many studies (Pisoni et al., 1995; Luce, 1986a; Goldinger et al., 1989;
Luce, Pisoni, & Goldinger, 1990; Luce & Pisoni, 1998; Vitevitch & Luce, 1998;
Vitevitch & Luce, 1999; Luce et al., 2000).
“Phonological similarity” in English is calculated based on form similarity. Four
different neighborhood calculations have been used. The first calculation is based on
experimentally derived phoneme confusability (Luce, 1986a; Luce & Pisoni, 1998). This
rule is based on R. D. Luce’s general biased choice rule (R. D. Luce, 1959).
The second calculation is that neighbors are determined in terms of the shared number
of phonemes (Luce, 1986a; Luce & Pisoni, 1998). In this calculation, neighbors are
words that differ from one another by a single phoneme addition, deletion, or substitution
in any position (Greenberg & Jenkins, 1964). The number of such neighbors is the
neighborhood density of the target item. Most studies of neighborhood effects have used
this simple notion of neighborhood density (e.g., Charles-Luce & Luce, 1995; Metsala,
1997). Unlike the first calculation, this definition has a sharp cutoff, simply ignoring all
words outside the single phoneme edit distance. Furthermore, this definition does not take
similarity between phonemes into consideration. For example, replacing /b/ with /p/ (a
change involving only voicing) yields a neighbor just the same as does replacing /b/ with
/s/ (a change involving place and manner of articulation in addition to voicing). Also, phoneme
insertion could change word prosody (such as dog vs. doggy), but this aspect is not
considered. Although it is widely recognized as a very rough approximation, the
definition has been surprisingly successful in neighborhood studies in English.
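As a concrete illustration, the Greenberg-Jenkins one-phoneme rule described above can be implemented in a few lines. This is a minimal sketch of my own, not code from the neighborhood literature; words are represented as tuples of phoneme symbols, and the function names are illustrative:

```python
def is_neighbor(word, candidate):
    """True if candidate differs from word by exactly one phoneme
    substitution, deletion, or addition (Greenberg & Jenkins, 1964)."""
    if word == candidate:
        return False
    n, m = len(word), len(candidate)
    if n == m:
        # substitution: exactly one position mismatches
        return sum(a != b for a, b in zip(word, candidate)) == 1
    if abs(n - m) == 1:
        # addition/deletion: removing one phoneme from the longer
        # form must yield the shorter form
        longer, shorter = (word, candidate) if n > m else (candidate, word)
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False

def neighborhood_density(word, lexicon):
    """Count the word's neighbors in a lexicon of phoneme tuples."""
    return sum(is_neighbor(word, w) for w in lexicon)
```

In a toy lexicon, /kæt/ counts /bæt/, /kæp/, /kæts/, and /æt/ as neighbors but not /dɔg/, and the count itself is the neighborhood density, with the sharp cutoff the text notes: all words more than one edit away contribute nothing.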
The third calculation is based on the percentage of phoneme matching between
words. Frisch, Large and Pisoni (2000) expanded the neighborhood definition based on
one-phoneme edit distance for CVC words to longer words by basing their calculation of
similarity on the fraction of shared phonemes in a word. For example, a proportional
change of 1/3 would be equivalent to a single phoneme change when applied to CVC
words. This means that the neighbors for CVC words should share 66% of the phonemes
within words. This phoneme-matching percentage (66%) is then also used for
multisyllabic words.
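A rough sketch of this proportional criterion follows, under the assumption that the proportion of shared phonemes is computed as one minus the edit distance divided by the length of the longer word; the function names and that exact formula are my own simplification of Frisch, Large, and Pisoni's (2000) measure:

```python
def edit_distance(a, b):
    """Standard Levenshtein distance over phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def is_proportional_neighbor(a, b, threshold=2 / 3):
    """Neighbors share at least `threshold` of their phonemes, so a
    one-phoneme change in a CVC word (2/3 shared) is the boundary case."""
    shared = 1 - edit_distance(a, b) / max(len(a), len(b))
    return a != b and shared >= threshold
```

With the 66% threshold, a six-phoneme word tolerates two phoneme changes where a CVC word tolerates only one, which is exactly how the proportional definition extends the one-phoneme rule to longer words.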
The final calculation is based on distinctive feature lattice distance (Frisch, 1996).
Bailey and Hahn (2001) proposed a “General Neighborhood Model (GNM)” based on
this measure of word similarity. GNM is an adaptation of the Generalized Context Model
(GCM; Nosofsky, 1986) of classification based on similarity of exemplars. In the GCM,
words in the lexicon are considered exemplars, and they are mapped onto a psychological
space. In this model, all words but the target word are considered neighbors that vary
along a continuous space of similarity. Unlike a sharp neighbor-nonneighbor distinction,
all the words in the lexicon are neighbors to some degree. The model calculates the
psychological distances between individual items by a standard edit distance metric with
assessment of the relative cost of substituting one phoneme for another based on the
natural class lattice distance metric (Frisch, 1996). GNM neighborhood similarity is
similar in spirit to the neighborhood confusability term in Luce (1986a), but differs in
using an exponential transformation of psychological distances instead of using
confusion probabilities.
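The exponential transformation at the heart of the GCM/GNM family can be written in a line; the sensitivity parameter c is a free parameter of the model, and the value used here is purely illustrative:

```python
import math

def gnm_similarity(distance, c=1.0):
    """Exemplar similarity as an exponential decay of psychological
    distance (cf. Nosofsky, 1986): identical items (distance 0) have
    similarity 1, and similarity falls off smoothly with distance
    rather than cutting off at a fixed edit distance."""
    return math.exp(-c * distance)
```

This is what makes every word in the lexicon a neighbor to some degree: a distant word contributes a small but nonzero similarity instead of being excluded outright.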
These neighborhood calculations can also be weighted by lexical frequency. For
example, Luce and Pisoni (1998) used a neighborhood density calculation that is based on
R.D. Luce’s (1959) choice rule that weights similarity by the frequencies of target words
and neighbors. Vitevitch and Luce (1998, 1999) modified the neighborhood calculation
used in Luce and Pisoni (1998) so that the overall frequency-weighted neighborhood
probability (the first definition) is simplified as the sum of the frequency-weighted
neighbor word probability. In this calculation, neighbors are first calculated by
Greenberg-Jenkins’ phoneme substitution, deletion and insertion rules. Then, the
frequencies of the neighbors are summed in order to obtain the frequency-weighted
neighborhood density. The above two frequency-weighted calculations are very similar.
A crucial difference between the two, however, is the assumed distribution of similarity. The
former considers all the words but the target word in the calculation of the similarity
space whereas the latter considers the words within one phoneme edit distance from the
target word as neighbors, resulting in a discrete similarity space. A neighborhood
calculation based on distinctive feature lattice distance (Frisch, 1996) can also be
weighted by word frequency. For example, Bailey and Hahn’s (2001) General
Neighborhood Model is a neighborhood calculation that is sensitive to the frequency of
the neighbors.
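The Vitevitch and Luce-style frequency weighting can be sketched as follows. For brevity the neighbor test below covers only one-phoneme substitutions rather than the full Greenberg-Jenkins rules, and the use of log frequencies follows common practice in this literature; both simplifications, and the function names, are mine:

```python
import math

def one_phoneme_apart(a, b):
    """Minimal neighbor test (substitution only, for brevity)."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def frequency_weighted_density(target, lexicon_freq):
    """Sum the log frequencies of the target's neighbors
    (cf. Vitevitch & Luce, 1998, 1999). lexicon_freq maps
    phoneme tuples to raw corpus counts."""
    return sum(math.log10(freq)
               for word, freq in lexicon_freq.items()
               if one_phoneme_apart(target, word))
```

The contrast with the Luce and Pisoni (1998) measure is visible in the structure of the code: only words passing the discrete neighbor test contribute at all, whereas the earlier measure lets every word in the similarity space contribute, weighted by its confusability.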
The survey of neighborhood calculations revealed two tendencies. First, the
English neighborhood definitions are all based on phonemes as the basic units. Luce and
Pisoni (1998) predict word similarity by calculating sound confusion matrices in CVC
words. A widely used neighborhood calculation based on Greenberg-Jenkins' rules is
based on the number of common phonemes between words. Moreover, the psychological
distances between individual items are calculated by a standard one-phoneme edit distance
metric, with the relative cost of substituting one phone for another assessed by the
natural class lattice distance metric (Frisch, 1996). This is not surprising,
since most adult word recognition models assume that the phoneme is the basic unit of
processing.
Second, the calculations in some studies may not consider word-level prosody.
One main reason for this is that most neighborhood studies used CVC tokens as stimulus
words, so they did not need to consider the effect of lexical stress on word similarity.
For example, these neighborhood calculations predict that FORbear and forBARE are
neighbors in English. In keeping with this, Cutler (1986) claimed that the lexical prosody
observed in words such as FORbear and forBARE does not constrain lexical access.
However, neighborhood studies using the HML do consider word-level prosody, because the
transcriptions in the HML distinguish strong vs. weak vowels.
Referring to the English neighborhood calculations, the definition of neighborhood in
Japanese is explored. Three neighborhood calculations are used to test neighborhood
density in Japanese. Each neighborhood calculation coincides with a hypothesis
about lexical access with a different word representation. The first calculation posits a
situation in which Japanese listeners rely on the phoneme string representation, as
proposed in models such as NAM (Luce & Pisoni, 1998) and PARSYN (Luce et
al., 2000). Here, neighborhoods are calculated in terms of the number of phonemes in
common, as in the Greenberg-Jenkins calculation (Greenberg & Jenkins, 1964)
widely used in the literature on English.
The second neighborhood calculation includes word accent information as
another dimension, in order to reflect the finding that
prosodic information has a vital role in Japanese word recognition (Cutler & Otake,
1999). This calculation proposes that Japanese listeners use word-level prosody for
lexical access. However, the word representation and word-level prosody are
calculated separately. In other words, the word representation in this calculation is the same
categorical abstract representation used in the phoneme-based neighborhood calculation, and the
pitch accent patterns additionally constrain the neighbors. Take a pair of words like ka'ki
'oyster' and kaki 'persimmon' as an example¹. The words are considered neighbors
in the neighborhood calculation based on Greenberg-Jenkins' rules. However, Japanese
listeners seem to be sensitive to the similarity of accent patterns, so they may not consider
ka'ki 'oyster' and kaki 'persimmon' to be neighbors, because the difference in accent
patterns might play a crucial role in their recognition (HL pitch pattern for 'oyster'
and LH pitch pattern for 'persimmon'). A similarity judgment experiment on pitch
accent patterns was carried out and the results were implemented in the calculation of
prosodic similarity.
In the third calculation, neighborhood density was measured by comparing the
similarity of cochleagrams of the audio files. In this case, the word representation is an
auditory representation in which all segmental and prosodic information is available. In
this calculation, as in the GNM (Bailey & Hahn, 2001), the words in the lexicon are
considered exemplars and are mapped onto a psychological mental space. In this
model, all words but the target are considered neighbors that vary along a continuous
space of similarity. This calculation is thus like an auditory version of the GNM.
The details of these calculations will be explained in Chapter 2.
¹ An apostrophe specifies the place of lexical accent. Romanization conventions used in this dissertation are mainly based on the one created by the Society for the Romanization of the Japanese Alphabet ("99 version"), except that moraic nasals and geminate consonants are represented as 'N' and 'Q,' respectively. See http://www.roomazi.org/99siki.html for further details.
1.4. Testing Neighborhood Effects in Japanese: Overview
The main goal of this dissertation is to better understand lexical access processes.
In a series of experiments, the same set of 700 nouns is used as target words across
different experimental tasks. This gives the opportunity to directly compare the results
obtained from the different experiments.
In English, the neighborhood density effect has been shown in experiments using
many different methodologies, such as auditory naming (e.g., Luce & Pisoni, 1998;
Vitevitch & Luce, 1998, 1999), word identification in noise (e.g., Luce & Pisoni, 1998),
lexical decision (Luce & Pisoni, 1998), same-different matching (Vitevitch & Luce,
1999), and semantic categorization (Vitevitch & Luce, 1999). However, in Japanese,
Amano and Kondo (1999) found an (inhibitory) neighborhood density effect in
a word identification in noise experiment but not in a lexical decision experiment.
These methodologies can be roughly categorized into two groups: sublexically-biased
and lexically-biased tasks. Lexically-biased tasks require access to the lexicon whereas
sublexically-biased tasks do not. The auditory naming task and the same-different matching
task are sublexically-biased tasks that show some effects of shallow phonetic detail.
On the other hand, word identification in noise, lexical decision, and semantic
categorization are considered lexically-biased tasks in the sense that they all
require accessing the lexicon. However, according to Vitevitch and Luce (1999),
semantic categorization requires accessing the lexicon but is not biased towards either the
lexical level or the sublexical level, because the task decision is made at the semantic
level.
The three neighborhood experiments conducted in this dissertation use different
methodologies. The advantage of testing neighborhood effects in experiments with
different methodologies is that it allows us to investigate the
neighborhood density effect from several different perspectives. For example, a syllable
effect was not observed in sequence-monitoring experiments in English (Cutler et al.,
1986; Bradley et al., 1993), whereas phoneme-induction experiments clearly showed a
syllable effect (Finney et al., 1996; Pitt et al., 1998). The use of a particular
methodology can therefore sometimes reveal hidden effects that might not be observed with a
different methodology. This could be the case for neighborhood density in Japanese.
The three methodologies chosen for our experiments are auditory naming, word
identification in noise, and semantic categorization. Experiment 1 uses the auditory
naming task, which was chosen as a sublexically-biased task. Experiment 1 will be
reported in Chapter 3. Experiment 2 is an auditory naming experiment with a word
identification task in a noise condition (see Pisoni, 1996). Participants performed an
auditory naming task as the primary task. Once they finished naming a stimulus word, they
wrote down what they said in hiragana characters. Experiment 2 aimed to collect both
performance time and identification data, in order to compare these data with the data of
Experiment 1, where the same task is performed without noise, and with the
identification data of previous studies (Luce & Pisoni, 1998; Amano & Kondo, 1999).
These results will be reported in Chapter 4. Chapter 5 reports the results of Experiment 3,
which uses the semantic categorization task that requires lexical access (see Forster &
Shen, 1996). Vitevitch and Luce (1999) explained that an advantage of this task is
that it allows observation of the neighborhood density effect at the lexical level in a
more natural way: it looks at lexical-level activity without using nonwords, as a
lexical decision task does, and without biasing either the prelexical or the lexical level.
We can look at the results of previous studies in English to predict what the
results of the current experiments should be if Japanese shows the same neighborhood
effects as English. When words are presented in auditory naming, word
identification, and semantic categorization tasks, inhibitory effects of neighborhood
density are observed: high-density words are responded to more slowly than low-density
words (Luce & Pisoni, 1998; Vitevitch & Luce, 1998, 1999). Therefore, we expect to see
inhibitory neighborhood effects in the three experiments in this dissertation.
These three experiments all use the same 700 targets, allowing us to directly
compare the results across the experiments using the different methodologies. The details
about the selection and characteristics of the target words are shown in §2.4.
Neighborhood density calculations assume the existence of the mental lexicon,
because what we are trying to show is how many words similar to a given word exist in
the lexicon. The mental lexicon assumed here is based on a standard dialect of Japanese,
the Tokyo dialect. Because of this restriction, participants are all native speakers of the
Tokyo dialect (see §2.5). All nouns used in this study are found in the electronic version of a
standard Japanese dictionary (Sanseido Shinmeikai Japanese dictionary; Kenbou, Kindaichi,
Kindaichi, & Shibata, 1981). More information about the lexicon used in this study will
be provided in §2.2.
In this dissertation, three different neighborhood calculations are tested in order to
decide how to define neighbors. The first calculation was based on the Greenberg-Jenkins
(1964) phoneme substitution, deletion, and insertion rules. The second calculation
included prosodic information as another dimension in the neighborhood calculation, in
order to reflect the finding that prosodic information has a vital role in Japanese word
recognition. The third calculation was based on the auditory properties of the words in the
lexicon. The first two calculations are based on an abstract representation of the words,
whereas the last is based on an auditory representation. Therefore, finding
the best definition of neighbors in Japanese also contributes to revealing the
representation of words in the lexicon.
1.5. Organization of the Dissertation
In this dissertation, three definitions of neighbors will be tested in three
neighborhood experiments. The dissertation is organized as follows: Chapter 2 provides
basic information about the neighborhood density experiments conducted in this
dissertation; the common features of the experiments will be explained. Chapter 3
reports the results of the auditory naming experiment. In Chapter 4, the neighborhood
density effect will be tested in two tasks: auditory naming in noise and word
identification in noise. Chapter 5 further tests the neighborhood density effect in a semantic
categorization experiment. The results are drawn together and discussed as a whole in the
General Discussion and Conclusion in Chapter 6.
CHAPTER 2
STIMULI AND THEIR NEIGHBORHOODS
2.1 Introduction
This dissertation tests neighborhood density effects in Japanese in order to better
understand the representation of word forms stored in the lexicon and the processes
mapping between this phonological representation and the acoustic-auditory input. Three
experiments were conducted. In this chapter, words used as stimuli in all three
experiments are described, and an explanation is provided for different ways to calculate
their neighborhood densities.
2.2 Japanese Mental Lexicon
Japanese neighborhood experiments require a Japanese lexicon in order to
calculate the neighborhood density for words. The English lexicon used in
many previous experiments is the Hoosier Mental Lexicon (Pisoni et al., 1985). The
Japanese lexicon used in this dissertation is based on the NTT Database Series (Amano &
Kondo, 1999, 2000).
The NTT Database Series has seven volumes, each of which focuses on a different
aspect of the Japanese lexicon, as shown in Table 2.1. Entries in all volumes are cross-
referenced with common ID numbers.
VOLUME  CONTENT
1       Word Familiarity
2       Word Orthography
3       Word Accent
4       Parts of Speech
5       Characters
6       Character-Word
7       Frequency

Table 2.1: Contents of the NTT Database Series (Amano & Kondo, 1999, 2000)
The lexicon used in this study consists of the subset of words in the NTT Database
Series (Volume 1: Word Familiarity) that also appear in the 3rd edition of the Sanseido Shinmeikai
Dictionary (Kenbou et al., 1981). For a smaller subset of words, there is also a recorded
utterance of each word, and a set of at least three familiarity ratings for (1) auditory
presentation of the word, (2) visual presentation of the word (in each of its written forms,
if there is more than one typical way to write the word), and (3) simultaneous audio-visual
presentation. At least 32 subjects per word rated the familiarity of each word on a
7-point scale from 1 (not familiar) to 7 (most familiar). Some words have more
than one pronunciation in standard Japanese (e.g., 'almond' has two forms: a'amoNdo,
with initial accent, and aamo'Ndo, with penultimate accent). Lexical entries were
recorded by a single adult female speaker of the Tokyo dialect in multiple sessions.
The utterances were first digitized onto a PC, and were stored as individual audio files at
16-bit resolution and a 16000 Hz sampling rate in .wav format (Windows PCM; see
Amano & Kondo, 1999 for further details). The lexicon used in this study consisted of
the smaller subset of 63,531 nouns in the NTT Database Series that have associated
recorded utterances.
Two representations are assumed for each word in the Japanese lexicon: an
abstract representation and an acoustic/auditory representation, corresponding to the
two types of word recognition models (abstract models vs. exemplar-based models). The
abstract representation is described in alphabetic symbols; the full description of
alphabet usage in the representation is given in Appendix A. Two features of the
phonemic representation are worth mentioning. First, moraic nasals and geminate
consonants are transcribed as N and Q, respectively. This decision was based on
studies showing that Japanese listeners are sensitive to moraic structure in auditory word
recognition (e.g., Otake et al., 1993; Cutler & Otake, 1994; Otake et al., 1996b). Second,
the vowel length contrast in Japanese is expressed by the number of segments (single or
double), as in ka'do 'corner' vs. ka'ado 'card' (Vance, 1987).
The auditory representation is based on the audio files that came with the NTT
Database Series (Amano & Kondo, 1999, 2000). The auditory representation of the
words in the lexicon is modeled as a sequence of auditory spectra calculated from the
audio files.
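One way to picture this representation: each audio file becomes a matrix of short-time spectra, one spectrum per analysis frame. The sketch below uses a plain Hanning-windowed FFT as a crude stand-in for an auditory (cochleagram-style) analysis; the frame sizes, function name, and synthetic signal are illustrative assumptions, not the analysis actually used in the dissertation:

```python
import numpy as np

def auditory_spectra(signal, frame_len=400, hop=160):
    """Model a word as a sequence of short-time magnitude spectra.
    frame_len=400 and hop=160 correspond to 25 ms windows with a
    10 ms hop at a 16 kHz sampling rate."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = [signal[i * hop : i * hop + frame_len] * window
              for i in range(n_frames)]
    return np.abs(np.fft.rfft(frames, axis=1))

# A 0.5 s synthetic signal at 16 kHz stands in for one lexicon entry.
t = np.arange(8000) / 16000.0
word = np.sin(2 * np.pi * 120 * t)  # F0-like component
spectra = auditory_spectra(word)
print(spectra.shape)  # (48, 201): 48 frames, 201 frequency bins
```

Distance between two words can then be computed frame-by-frame over such matrices, which is what makes segmental and prosodic detail simultaneously available to the similarity calculation.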
Pitch accent patterns are also stored in the lexicon as separate information. The
details will be discussed in §2.3.2.
2.3 Defining Neighbors in Japanese
As discussed in Chapter 1, finding the best neighborhood density calculation for
Japanese could provide useful information about how words are represented. Of
particular interest is whether neighborhoods should be calculated in terms of the number of
phonemes in common, as in Greenberg and Jenkins (1964), a method which has been
widely used in the English neighborhood literature, or in terms of acoustic/auditory
similarity.
In the following sections, each of the three neighborhood calculations tested in
this dissertation is explained in more detail. In order to show their different outcomes,
the same target word (kodomo, 'child'), one of the words used in the experiments, is
used as an example.
2.3.1 The Segments Calculation
The first neighborhood density calculation is an algorithm based on Greenberg
and Jenkins (1964); it will be referred to as "the Segments calculation."
Neighborhoods are computed by comparing a given segmental transcription (the
stimulus) to all other transcriptions in the Japanese lexicon discussed in §2.2. A neighbor
is defined as any transcription that could be converted to the transcription of the stimulus
word by the substitution, deletion, or addition of one phoneme in any position. Table 2.2
shows an example neighborhood calculation for the target word kodomo 'child.' This
word has four neighbors. Three neighbors are obtained by deleting the onset of the third
syllable of the target word. The last neighbor is obtained by substituting r for the onset
of the second syllable of the target word. The first two neighbors have "the same word
entity" because they have the same kanji characters in the Sanseido Shinmeikai Dictionary
(Kenbou et al., 1981). This is not the same situation as aamoNdo 'almond,' where the accent
difference does not change usage, as explained in §2.2.
Target word: kodomo 'child'

Neighbor   Gloss              Operation
ko'doo¹    'old discipline'   Deletion
kodoo      'old road'         Deletion
kodoo      'heart beat'       Deletion
koromo     'batter'           Substitution

Table 2.2: The target word, the neighbors selected by the Segments calculation, and the operations that derive them.
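The operations in Table 2.2 can be mimicked by generating every one-phoneme variant of the target transcription and intersecting the result with the lexicon. The toy lexicon and phoneme inventory below are invented for illustration, and each transcription appears only once, so the three homophonous kodoo entries collapse into a single string:

```python
def one_edit_forms(word, alphabet):
    """All transcriptions reachable from `word` by one substitution,
    deletion, or addition of a phoneme, following Greenberg and
    Jenkins (1964)."""
    forms = set()
    for i in range(len(word)):
        forms.add(word[:i] + word[i+1:])              # deletion
        for p in alphabet:
            forms.add(word[:i] + p + word[i+1:])      # substitution
    for i in range(len(word) + 1):
        for p in alphabet:
            forms.add(word[:i] + p + word[i:])        # addition
    forms.discard(word)                               # exclude the target itself
    return forms

# Toy lexicon and phoneme inventory (illustrative only).
lexicon = {"kodomo", "kodoo", "koromo", "kaki", "sakana"}
alphabet = "aeioukstnhmyrwgzdbp"
neighbors = one_edit_forms("kodomo", alphabet) & lexicon
print(sorted(neighbors))  # ['kodoo', 'koromo']
```

Here kodoo arises by deleting the onset m of the third syllable and koromo by substituting r for the onset of the second syllable, matching the operations in the table.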
2.3.2 The Segments + Pitch Calculation
The second calculation, "the Segments + Pitch calculation," is designed to
investigate the case in which abstract models such as Shortlist (Norris, 1994), MERGE
(Norris et al., 2000), and TRACE (McClelland & Elman, 1986) take advantage of
suprasegmental information for word selection. Several studies have claimed that
pitch accent information plays a vital role in Japanese word recognition (Sekiyama &
Nakajima, 1998; Cutler & Otake, 1999). In Japanese, the placement of the accent within
each word is lexically specified. For instance, the words ana 'hole' and a'na
'announcer' have the same sequence of sounds, but the first is unaccented and the
second has an initial accent. These words are realized with the pitch patterns shown in
¹ Of course, this placement of the lexical accent is not considered in this neighborhood calculation.
Figure 2.1. The fact that Japanese listeners can recognize them as different words shows
that pitch accent contributes to lexical information in Japanese. Perception of accent
location in Japanese is influenced by both F0 peak location and the post-peak F0 fall rate
(steep or shallow) relative to the syllable edge (Sugito, 1972; Hasegawa & Hata, 1992).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. rj 100.2) [leffcup/down move mid:play between marks right Ti»e: 0.4176Ssec D: 0.34317 I: 0.10259 R: 0.44576
- j j 100.2) {lefcup/down move mid:play between marks right Tine: 0.35515sec 0: 0.00000 L: 0.36168 R: 0.36168
Figure 2.1: Fundamental frequency (FO) contours and spectrograms of ana, "hole" (top) and a Jna "announcer" (bottom). 33
The second neighborhood calculation is based on both segmental similarity and
pitch pattern similarity. An important question is how Japanese listeners perceive the
similarity of pitch accent patterns. Traditionally, Japanese pitch accent patterns are
represented with H (high pitch) and L (low pitch) targets on each mora. For example, the
pitch accent contour of the word ana 'hole' shown in Figure 2.1 is represented as the
sequence LH. If we assume these Hs and Ls are "tonemes," it should be possible to
calculate the similarity of pitch accent patterns using Greenberg-Jenkins' substitution,
deletion, and insertion rules. In order to test this hypothesis, an experiment examining
similarity judgments of Japanese pitch patterns was conducted.
2.3.2.1 Similarity Judgments on Japanese Pitch Accent Patterns
2.3.2.1.1 Purpose
The purpose of this experiment was to investigate how Japanese listeners perceive
the similarity of pitch accent patterns in Japanese when pairs of pitch accent patterns are
presented auditorily. For segmental similarity, a single phoneme edit distance is allowed
for words to be segmental neighbors (e.g., gaN 'cancer' and kaN 'can' are neighbors,
but gaN and kani 'crab' are not). Similarly, if we assume that the pitch patterns consist
of a sequence of pitch units, H (high pitch) and L (low pitch), the question to ask
would be whether a one-pitch-unit difference could also be used to determine pitch
neighbors (e.g., LHH and LHL). If this is the case, a clear categorical boundary would
be expected between pairs of pitch patterns that are within one pitch edit of
each other. Thus, LHH and LHL would be pitch accent neighbors whereas HLL and
LHL would not.
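Under the toneme assumption, pattern distance is just string edit distance. A standard dynamic-programming sketch of that criterion (an illustration, not the dissertation's implementation) makes the two cases above concrete:

```python
def toneme_edit_distance(a, b):
    """Minimum number of substitutions, deletions, or insertions
    needed to turn tonal string a into b (standard Levenshtein)."""
    prev = list(range(len(b) + 1))
    for i, ta in enumerate(a, 1):
        cur = [i]
        for j, tb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (ta != tb))) # substitution
        prev = cur
    return prev[-1]

print(toneme_edit_distance("LHH", "LHL"))  # 1: would count as pitch neighbors
print(toneme_edit_distance("HLL", "LHL"))  # 2: would not
```

The experiment below tests whether listeners' similarity judgments actually respect such a one-edit boundary.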
2.3.2.1.2 Method
2.3.2.1.2.1 Stimuli
Twenty pitch patterns were used. They are the patterns that are attested in simple
1- to 5-CV-syllable/mora words in Japanese. Table 2.3 shows the 20 pitch patterns and
the pseudowords used in this experiment. Pitch patterns are
coded at three levels (0 = low pitch; 1 = unaccented high pitch; * = accented
high pitch), so that hana' 'flower' and hana 'nose' are coded as 0* and 01,
respectively.
The patterns were produced on nonword stimuli consisting of a string of /ma/
syllables; that is, the simple CV syllable ma is the only syllable used in the nonsense
words. Since the lexical pitch accent in some pitch patterns, such as 0* and 0111*, only
appears when a grammatical particle (such as -ga or -wa) is attached word-finally, the 20
pitch patterns realized on the nonsense words were recorded with the grammatical
particle -Qte, which was subsequently deleted (i.e., ma°ma*-Qte → ma°ma*).
The nonsense words were recorded onto a DAT tape by the author, a native
speaker of the Tokyo dialect, at a sampling rate of 48000 Hz. The recordings were down-
sampled to 22050 Hz with 16-bit accuracy when they were transferred
onto a computer. Twenty separate sound files were created, and the grammatical particle
-Qte was deleted.
The waveforms of the 20 sound files were scaled so that the peak root-mean-square
(RMS) amplitude values were equated across all files at approximately 75 dB sound
pressure level (SPL).
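Peak-RMS equalization of this kind can be sketched as follows. The target value, window size, and function names are illustrative assumptions (an absolute dB SPL level depends on playback calibration, so a digital-domain target stands in for it here), and the windows are non-overlapping for simplicity:

```python
import numpy as np

def equate_peak_rms(signals, target_rms=0.1, win=1024):
    """Scale each waveform so that its peak windowed RMS amplitude
    equals target_rms. Scaling is linear, so the loudest window of
    every file ends up at the same RMS level."""
    scaled = []
    for x in signals:
        frames = [x[i:i + win] for i in range(0, len(x) - win + 1, win)]
        peak = max(np.sqrt(np.mean(f ** 2)) for f in frames)
        scaled.append(x * (target_rms / peak))
    return scaled

# Two 1 s sine tones at 22050 Hz with very different amplitudes.
t = np.arange(22050) / 22050.0
tones = [0.5 * np.sin(2 * np.pi * 220 * t),
         0.05 * np.sin(2 * np.pi * 330 * t)]
loud, quiet = equate_peak_rms(tones)
# After scaling, both files share the same peak windowed RMS.
```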
1-syllable  2-syllable  3-syllable  4-syllable  5-syllable
1-mora      2-mora      3-mora      4-mora      5-mora
ma          mama        mamama      mamamama    mamamamama

*           01          011         0111        01111
0           *0          01*         011*        0111*
            0*          0*0         01*0        011*0
                        *00         0*00        01*00
                                    *000        0*000
                                                *0000

Table 2.3: The twenty tonal patterns tested (0 = low pitch; 1 = unaccented high pitch; * = accented high pitch).
2.3.2.1.2.2 Participants
Participants were 18 native speakers of Japanese who were born and raised in the
Tokyo area (Tokyo, Saitama, Chiba, and Kanagawa); they were thus native speakers of
the Tokyo dialect, the standard dialect of Japanese. The participants were undergraduate
students at Dokkyo University (Saitama, Japan). None had stayed in an English-speaking
country except for short travel visits. Each received a small amount of money for
participation. None of the participants had any hearing impairment.
2.3.2.1.2.3 Procedure
Participants were given answer sheets and a pencil. They were played a pre-recorded
list of experimental instructions in which they were told that they would hear
a series of nonsense word pairs, and that their task was to judge the similarity of
the tonal patterns in each pair on a 7-point scale. Participants were instructed to make
their judgments based on their overall impression of the nonsense words in each pair.
They were told that it might be helpful to think about how likely it was that the nonsense
words in the pair could be identical: a judgment of "very likely identical" would rate a
"1" on the 7-point scale.
All participants were tested as a group in a language laboratory at Dokkyo
University. Of the 200 possible pairings of the 20 tokens, a subset of 150 was selected
for our study. Both orders (AB and BA) of each pairwise comparison were
presented in random order, for a total of 300 pairs for similarity judgment.
The 300 pairs are shown in Appendix B.
All stimuli were presented from a laptop computer connected to the
central audio system of the language laboratory. Participants heard the stimuli binaurally
via headphones. Each trial started with a short tone followed by a 500 millisecond (ms)
pause. The participants then heard a pair of stimuli with a 500 ms inter-stimulus interval
(ISI). Within 4 seconds after the second stimulus item was presented, they
recorded their judgment by circling a number (from 1 through 7) printed on
the answer sheet. Before the test session, a practice session of 5 pairs was provided
to familiarize participants with the procedure. The session lasted
about 50 minutes.
2.3.2.1.3 Results
The mean similarity ratings across participants were first calculated for all 300
trials. The 150 pairs were presented in two different orders (AB and BA).
Figure 2.2 shows the mean rating in the order AB as a function of the mean rating in the
order BA. Note that pairs with lower ratings are perceived as more similar. The figure
shows that most of the datapoints lie very near the diagonal, which indicates that
participants rated the same stimuli consistently across presentations. A regression
analysis showed that the ratings in the order AB are highly correlated with the ratings in
the order BA (R² = 0.842). Furthermore, a paired t-test showed that the difference
between the AB and BA orders is not significant (t = 1.619, df = 149, p > .1). Thus,
participants performed the task consistently.
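The two consistency checks (squared correlation and paired t statistic) can be reproduced with a few lines of NumPy. The rating values below are hypothetical, standing in for the 150 mean ratings in the actual data:

```python
import numpy as np

# Hypothetical mean ratings for a handful of pairs in AB and BA order.
ab = np.array([1.2, 3.4, 5.1, 6.2, 2.8, 4.4])
ba = np.array([1.3, 3.1, 5.3, 6.0, 2.9, 4.6])

# R^2 for the AB-vs-BA regression is the squared Pearson correlation.
r_squared = np.corrcoef(ab, ba)[0, 1] ** 2

# Paired t statistic: mean difference divided by its standard error.
d = ab - ba
t = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
print(round(r_squared, 3), round(t, 3))
```

A high R² together with a near-zero t is exactly the pattern reported in the text: the two presentation orders track each other closely and do not differ systematically.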
Figure 2.2 also shows two clusters: the four pairs with the highest degree of
similarity (ratings near 1) are separated from the other pairs in their own cluster. These
turned out to be pairs in which the stimuli are of the same length, contrasting an
unaccented pattern with a final-accented one, i.e., 011* vs. 0111, 01* vs. 011, 0* vs. 01,
and 0111* vs. 01111. This indicates that the H target for an accented syllable at the end
of a word (e.g., hana' 'flower') and the relatively high pitch at the end of an unaccented
word before an excised -Qte (e.g., hana 'nose') are very similar. Moreover, these four
pairs are separated from the rest of the pairs, which demonstrates that they are more
similar to each other than the other pairs are. Also, the ratings show that Japanese
listeners were able to discern a difference between pairs like hana 'nose' and hana'
'flower,' because most of these pairs did not receive a rating of '1,' which would indicate
that they are identical. Vance (1995) and Warner (1997) both reported that Japanese
listeners are able to distinguish an accented high pitch from an unaccented high pitch
when each appears at the end of a word. In this sense, our similarity data do not conflict
with Vance (1995) and Warner (1997): these pairs are very similar yet still
distinguishable by Japanese listeners.
Figure 2.2: The mean similarity rating in the order AB as a function of the mean similarity rating in the order BA. The numbered points (1 to 4) represent 011* vs. 0111, 01* vs. 011, 0* vs. 01, and 0111* vs. 01111, respectively.
In sum, the similarity judgment data showed no stimulus order effect in judging
the prosodic similarity of the pairs. Based on this finding, the means of the two ratings
for the 150 stimulus pairs were calculated and used for further analyses of the similarity
of word prosodic patterns. Furthermore, the difference between a final accented high
pitch (*) and a final unaccented high pitch (1) will be ignored in the following analyses.
The data support a lexical representation of accent patterns as pitch contours that can be
roughly represented by just two levels: low and relatively high (represented as '0' and
'1,' respectively).
2.3.2.1.3.1 Word Prosodic Similarity Based on Greenberg-Jenkins' Rules
This section investigates whether Japanese listeners' pitch pattern similarity
judgments can be predicted by a single toneme edit distance criterion.
All 150 pairs were coded with the number of substitutions, deletions,
and insertions needed to change one pitch pattern into the other. A regression analysis
was performed to see how well the number of operations predicts participants' similarity
ratings. Figure 2.3 shows the number of operations (substitutions, deletions, or
insertions) as a function of the similarity ratings, and the number of operations
as a function of the median similarity rating. Note that participants used the full
similarity range (1 to 7) and that similarity is higher when the rating is lower.
Figure 2.3: The number of operations (substitutions, deletions, or insertions) as a function of the similarity ratings (left); the number of operations as a function of the median similarity rating, with a logarithmic function (right).
The results showed that there is no advantage for pairs at one toneme edit
distance: there is tremendous overlap between 1 operation and 2 operations. Therefore,
a 'single-element' rule as in the Greenberg-Jenkins algorithm would not work. The right-
hand graph of Figure 2.3 shows the number of operations as a function of the median
similarity rating; the relation between the number of operations and the similarity
rating is logarithmic. Therefore, one toneme edit distance may not be the
best way to describe Japanese listeners' performance on similarity judgments.
2.3.2.1.4 The Segments + Pitch Calculation
The Segments + Pitch calculation is a modified version of the Segments
neighborhood calculation explained in §2.3.1, and reflects the fact that Japanese listeners
are sensitive to pitch information in word recognition. The calculation has two stages.
The first stage selects potential neighbors based on segmental information. The second
stage further selects neighbors based on pitch-accent patterns from the candidates
determined in the first stage.
At the second stage, the word accent patterns are considered. In §2.3.2.1, it was found that the similarity ratings for pairs of accent patterns and the number of operations are related logarithmically. In order to incorporate this similarity information into a neighborhood calculation, a cutoff point was introduced. The cutoff point was based on the error responses in the auditory naming in noise experiment reported in Chapter 4. In that experiment, the stimuli were presented in noise and the participants were asked to repeat and write down the word they heard. All the error responses in that experiment were analyzed to see how Japanese listeners misperceived the accent patterns of the stimuli. In this analysis, the accent pattern presented and the accent pattern of the response are compared, as shown in Figure 2.4.
Figure 2.4 shows frequency counts as a function of the number of operations separating stimulus and response pitch patterns in the auditory naming in noise experiment reported in Chapter 4. Here, all the responses (N = 18900) were analyzed in terms of pitch patterns only; misidentified segments are not considered. As can be seen, the highest count of all is the bar for zero operations: more than 86% of the responses correctly reproduced the pitch accent pattern of the stimulus. Whether or not we introduce a categorical distinction between pitch neighbors, words that have the exact same pitch pattern are clearly to be considered pitch neighbors.
Figure 2.4: Distance between the accent pattern of the stimulus and the accent pattern of the response in the auditory naming in noise experiment (Chapter 4), measured as the number of operations needed to change one pattern into the other.
Now the processes at the second stage of the neighborhood calculation are illustrated with kodomo, 'child,' as a target word. Table 2.4 shows the target word and its four potential neighbors selected at the first stage, with similarity ratings and neighborhood information. The target word has an accent pattern of 011. The four words selected as potential neighbors at the first stage of the calculation have either 011 or 100 pitch patterns. Because ko'doo, 'old discipline,' does not have the same pitch accent pattern as the target word, it is no longer a neighbor in this calculation. Therefore, kodomo, 'child,' has three neighbors.
Target word           Potential neighbors (first stage)   Pitch pattern   Neighbor?
kodomo 'child'        ko'doo 'old discipline'             100             NO
(pitch pattern 011)   kodoo 'old road'                    011             YES
                      kodoo 'heart beat'                  011             YES
                      koromo 'batter'                     011             YES
Neighborhood Density: 3
Table 2.4: A target word and its four potential neighbors selected at the first stage, with the information used for calculating neighborhood density.
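The two-stage procedure can be sketched as follows. This is an illustrative Python sketch, not the author's implementation; the tiny lexicon and the one-phoneme-edit test for stage 1 are assumptions based on the description of the Segments calculation in §2.3.1. Accented ko'doo is represented segmentally as "kodoo" with a distinct pitch pattern.

```python
def one_phoneme_apart(a: str, b: str) -> bool:
    """True if a and b differ by exactly one substitution, deletion, or insertion."""
    if a == b:
        return False
    if len(a) == len(b):                              # substitution
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) == 1:                     # deletion / insertion
        short, long_ = sorted((a, b), key=len)
        return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))
    return False

def segments_plus_pitch_neighbors(target_word, target_pitch, lexicon):
    """Stage 1: segmental filter; stage 2: keep only matching pitch patterns.
    lexicon is an iterable of (word, pitch_pattern) pairs."""
    stage1 = [(w, p) for w, p in lexicon if one_phoneme_apart(target_word, w)]
    return [(w, p) for w, p in stage1 if p == target_pitch]

toy_lexicon = [("kodoo", "100"),   # ko'doo 'old discipline'
               ("kodoo", "011"),   # kodoo 'old road'
               ("kodoo", "011"),   # kodoo 'heart beat'
               ("koromo", "011")]  # koromo 'batter'
print(len(segments_plus_pitch_neighbors("kodomo", "011", toy_lexicon)))  # 3
```

As in Table 2.4, all four candidates pass the segmental stage, but ko'doo is filtered out at the pitch stage, leaving a neighborhood density of 3.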
2.3.3 The Auditory Calculation
The third definition of neighbors in Japanese is based on auditory similarity among words in the lexicon ("the Auditory calculation"). In order to calculate the auditory similarity of words, I adopted the equations of an exemplar-based model, X-MOD (Johnson, 1997a, b). X-MOD is an exemplar-based model of word recognition that is an extension of Klatt's LAFS model (Klatt, 1980). This instance-based model of phonological learning and word recognition rests on three assumptions. First, speech is recognized by reference to stored instances (exemplars). Second, these exemplars have no internal structure; rather, they are unanalyzed auditory representations. Third, exemplars are word-sized chunks that result from primitive auditory scene analysis (Bregman, 1990), where isolated word productions form the basis for word recognition in running speech. In this model, each 23 ms frame of speech is processed to yield a critical band spectrum (95 points) at a 43 Hz frame rate, and is vector quantized. After vector quantization, each frame is expressed as an arbitrary number. In the matching process, exemplars are activated according to their similarity to the input, where similarity is an exponential function of the Euclidean distance between exemplars.
The critical difference between X-MOD and LAFS lies in the stored auditory representation for each word. Figure 2.5 shows examples of LAFS and X-MOD representations. In LAFS, lexical representations are based on spectral decoding networks, and each lexical representation is a single sequence of quantized vectors. In contrast to the prototypes stored in LAFS, X-MOD stores multiple distinct auditory exemplars for each word. For example, the LAFS representation of "Cat" is one sequence of auditory spectra representing the word, as shown in Figure 2.5. Each number stands for a spectrum code, and this one sequence of auditory spectra must represent all instances of the word. This is a "brittle" representation, because it is not robust over sources of variation.

X-MOD, by contrast, assumes that the representation of "Cat" (and of other words) is a set of sequences of auditory spectra, where each code sequence is treated as a distinct exemplar, as shown in Figure 2.5. The advantage of this model is that it keeps variation directly in the lexical representation, treating it as information rather than noise. In this sense, the model is similar to a hidden Markov model (HMM) representation, but it does not need to assume that state dependencies are purely local.
LAFS representation of "Cat":

73 71 18 11 16 90 1 88

X-MOD representation of "Cat":

"Cat" exemplar 1: 73 71 18 11 16
"Cat" exemplar 2: 73 71 42 18 15
"Cat" exemplar 3: 73 42 11 17 89

Figure 2.5: Examples of LAFS and X-MOD representations of "Cat."
To calculate the similarity of neighbors based on the auditory representation, a basic assumption, as in the General Neighborhood Model (GNM; Bailey & Hahn, 2001), is that all the words stored in the lexicon are neighbors to some degree. Therefore, the perceived word in an incoming speech signal is compared with all the words in the lexicon, using the algorithms of X-MOD.

If the psychological distance between instances i and j is d_ij, the perceived similarity of a target word i to a set of instances stored in memory is calculated by Equations 2.1 and 2.2. Similarity in this calculation is first computed by comparing two auditory spectra that are represented as sequences of numbers (see Figure 2.6). Auditory property m of the sequenced auditory spectra of exemplar j is written x_jm and is represented as a number. The Euclidean distance between exemplar j and item i is written d_ij, and c is a sensitivity constant.
kodomo exemplar (i): 73 71 18 11 16 90 1 88
domori exemplar (j): 15 16 20 90 2 88 45 67 62

Figure 2.6: Quantized vectors of the exemplars (kodomo and domori).
(Equation 2.1)  \( d_{ij} = \left[ \sum_{m} (x_{im} - x_{jm})^{2} \right]^{1/2} \)

(Equation 2.2)  \( \mathrm{Sim}_{i} = \sum_{j} \exp(-c\, d_{ij}) \)

Equations 2.1 and 2.2: Equations used in the Auditory calculation of neighborhood density.
In Equation 2.1, the auditory distance between exemplars is calculated by comparing sequences of quantized vectors, as shown in Figure 2.6. The best alignment of the two exemplars is found in order to accommodate differences in vector length. The auditory properties of the two instances are compared, and the squared differences are summed in order to compute d_ij. In this analysis, a threshold is introduced in order to decide whether two exemplars (words) are neighbors or not. Figure 2.7 shows the neighbor-nonneighbor distinction in the similarity space of the Auditory calculation.
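Equations 2.1 and 2.2, together with the neighborhood threshold, can be sketched as follows. This is an illustrative Python sketch, not the author's implementation: it assumes the exemplars are already aligned to equal length (the alignment step is omitted), and the sensitivity constant c and the threshold value are made-up parameters.

```python
import math

def distance(xi, xj):
    """Equation 2.1: Euclidean distance between two quantized-vector sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def neighborhood_density(target, exemplars, c=0.1, threshold=50.0):
    """Equation 2.2 with a categorical cutoff: sum exp(-c * d_ij) over the
    exemplars whose distance to the target falls within the threshold."""
    total = 0.0
    for ex in exemplars:
        d = distance(target, ex)
        if d <= threshold:          # neighbor / non-neighbor cutoff
            total += math.exp(-c * d)
    return total
```

An exemplar identical to the target contributes exp(0) = 1, and contributions fall off exponentially with distance, so the summed value plays the role of neighborhood density rather than a simple neighbor count.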
Figure 2.7: A neighbor-nonneighbor distinction in a similarity space in the Auditory calculation.
Recall that Bailey and Hahn (2001) assume that all words in the lexicon are neighbors to some degree, and so they proposed a continuous similarity space within the lexicon; in their neighborhood density calculation, degrees of sound similarity are taken into account. Luce and Pisoni (1998), on the other hand, used a discrete similarity space: the neighbor-nonneighbor distinction is based on one phoneme edit distance, so degree of sound similarity is not included in the neighborhood density calculation. The Auditory calculation in this dissertation is a combination of these two algorithms. Like Luce and Pisoni's (1998) algorithm, our algorithm has a threshold serving as a categorical cutoff point between neighbors and nonneighbors. That is, only the psychological distances (d_ij) that fall within the threshold are entered into Equation 2.2 to compute similarity among words. However, as in the GNM (Bailey & Hahn, 2001), sound similarity is implemented in the neighborhood calculation in a psychological space. That is, the summed similarity value is taken as the neighborhood density in the Auditory calculation. The number of neighbors for kodomo is 165, and the neighborhood density is 70.920.
In this calculation, not only is a different metric of similarity used (auditory similarity of vectors of spectra versus phoneme edit distance), but a different way of measuring neighborhood density is used as well. That is, in the Segments calculation and the Segments + Pitch calculation, the number of neighbors was counted, whereas in the Auditory calculation, the degrees of similarity were summed. However, a linear correlation analysis between the number of neighbors in the Auditory calculation and the summed similarity of neighbors (the Auditory calculation) revealed that these two measures are highly correlated (r = 0.983). This suggests that changing the representation (symbolic vs. auditory) is more important than changing the measure of neighborhood density (number of neighbors vs. degrees of similarity).
Lastly, an important aspect of the Auditory calculation should be pointed out. Since this calculation is based on an auditory representation that contains phonetic detail, one might assume that a neighborhood calculation based on the auditory representation is less abstract than neighborhood calculations based on symbolic representations. Contrary to this expectation, however, neighbors in the Auditory calculation are selected less strictly than those in the other two calculations.
[Spectrograms not reproduced. Neighbors labeled in the figure include anago, 'conger eel'; kowane, 'voice quality'; and karoo, 'fatigue'; the remaining labels are illegible in this reproduction.]
Figure 2.8: The target word, kodomo, ‘child’ and its seven most similar neighbors in the Auditory calculation.
Figure 2.8 shows spectrograms of the target word, kodomo, 'child,' and its seven most similar neighbors. First of all, even the most similar neighbors of kodomo in the Auditory calculation cannot be neighbors in the other two neighborhood calculations, both of which are based on one phoneme edit distance. However, looking at the spectrograms of these neighbors, we can see some common elements. First, all neighbors have the same pitch accent pattern.² Second, they have similar durational relationships and other relational properties (e.g., spectral edges). Third, although the segments that make up the words are different, the neighbors give an impression similar to kodomo. Although these neighbors are not within one phoneme edit distance, the substituted sounds in the neighbors are phonetically similar to the ones in the target. For example, in a comparison between kodomo and enogu, /m/ and /g/ are both realized as nasals, once one recalls that /g/ is realized as [ŋ] in the Tokyo dialect (and is in the process of changing to [g]; Hibiya, 1995). The position of /d/ is filled by /n/ in enogu, both of which are alveolar sounds. Also, /o/ is changed to /u/, /e/, and /a/, but never to /i/, among the 7 most similar neighbors. Therefore, the selection of neighbors is based on whole-word confusability rather than segmental confusability, and whole-word confusability is based on auditory impressions. In short, the Auditory neighborhood calculation is in a sense a broader neighborhood calculation than the Segments calculation and the Segments + Pitch calculation.
² kunoo, 'suffering,' has a HLL pitch accent pattern in the NTT databases. However, this word was accidentally recorded with a LHH pitch accent pattern. Therefore, in the Auditory neighborhood calculation, it was chosen as one of the most similar neighbors to the target word.
2.3.4 Comparison of the Three Neighborhood Calculations
Three different neighborhood calculations have been described in §2.3.1, §2.3.2, and §2.3.3, and three different values of neighborhood density are obtained for the same target word, kodomo, 'child.' The numbers of neighbors in the Segments calculation and the Segments + Pitch calculation are 4 and 3, respectively. The summed similarity among words in the lexicon in the Auditory calculation is 70.920.
The distributions of neighborhood density for the 700 target words used in these
experiments are shown in Figure 2.9. Descriptive statistics of neighborhood density
computed by the three calculations are shown in Table 2.5.
Figure 2.9: Frequency counts of target words as a function of neighborhood density: neighborhood density by the Segments calculation (top), by the Segments + Pitch calculation (middle), and by the Auditory calculation (bottom).
                  Segments    Segments + Pitch    Auditory
N of cases        700         700                 700
Minimum           0           0                   0
Maximum           59          32                  609.67
Median            6           3                   9.12076
Mean              8.64        4.92                42.19
Standard Dev.     9           6                   76
Skewness (G1)     2.030257    1.984840            3.187733

Table 2.5: Descriptive statistics of neighborhood density computed by the three different neighborhood calculations.
Linear correlation analyses were conducted in order to examine the relationships among the three neighborhood calculations and other factors. Table 2.6 shows the Pearson correlation matrix of the three neighborhood calculations. The results show that the Auditory calculation is different from the Segments calculation and the Segments + Pitch calculation, which are highly positively correlated with each other (r = 0.851). The Auditory calculation is negatively correlated with both the Segments calculation (r = -0.168) and the Segments + Pitch calculation (r = -0.156). The Auditory neighborhood density consists of the number of neighbors weighted by similarity, whereas the other two consist of just the number of very close neighbors, which is not a measure of similarity at all.
                   Segments       Segments + Pitch   Auditory
Segments            1.00000000
Segments + Pitch    0.85135157     1.00000000
Auditory           -0.16761086    -0.15618879         1.00000000

Table 2.6: Pearson correlation matrix of the three neighborhood calculations.
2.4 Target Words
The three neighborhood experiments in this dissertation used the same set of 700 target words from the Japanese lexicon. The target words are trimoraic, trisyllabic words (CVCVCV words) with a rated auditory familiarity of 5 or higher on a 7-point scale (7 being highly familiar) in the NTT Database Series (Volume 1; Amano & Kondo, 1999, 2000). They begin with a voiceless stop ([t, k]), a nasal ([n, m]), or a fricative ([s, ʃ, z, ʒ]).
Several lexical statistics were computed and later used to account for listeners'
performance in the experiments. These included word frequency, uniqueness point, first
mora frequency, and duration. This section describes how these lexical characteristics
were defined and shows the distributions of the target words according to each of these
factors.
Although the study of word frequency effects in spoken word recognition would best be based on a count of spoken words, no such count is available for Japanese. Therefore, as in studies of English lexical processing, frequency counts based on written text are used.
Volume 7 of the NTT Database Series is devoted to frequency (word frequency and character frequency), and the word frequency for each word in the lexicon was calculated from its Word Frequency Database. Although it is called the "Word Frequency Database," it is actually a morpheme frequency database. The database consists of frequency information for morphemes found in articles from Asahi Shinbun, a Japanese newspaper, over a 14-year period (1985 to 1998). Entries in the database have their own morpheme ID numbers as well as common ID numbers. These ID numbers allow us to refer to the words in the Sanseido Shinmeikai Dictionary (Kenbou et al., 1981), and to calculate the word frequency for each entry of our lexicon from the morpheme frequencies in the database.
One thing to keep in mind in the word frequency calculation is that the same word can be represented more than once in the text corpus. Japanese has three different writing systems (hiragana, katakana, and kanji), so multiple morphemes may be listed for the same word in the database. Table 2.7 shows the information for the word anago, 'conger eel,' as an example: database ID, common ID, written representation, and frequency.
Database ID   Common ID   Representation      Frequency
042363        010690      あなご (hiragana)      5
074084        010690      アナゴ (katakana)      105
162659        010690      穴子 (kanji)           16

Table 2.7: Three representations of anago, 'conger eel,' found in the Word Frequency Database (Volume 7, NTT Database Series).
As shown here, anago has three different representations, each of which has its own ID number in the database (042363, 074084, and 162659, respectively). However, they all have the same common ID (010690). Therefore, the word frequency for anago should be the total of the frequency counts for all morphemes with the same common ID number; the total citation frequency of anago is 126. In this dissertation, word frequency is defined as the logarithm (base 10) of the citation frequency count for each word in the database. In order to include words listed in the lexicon with zero token frequency, a constant of 2 was added to each citation frequency count before taking the logarithm, to avoid taking the log of zero. This computation was carried out for all words in the lexicon. Thus, because the citation frequency for anago was 126, its word frequency is log10(126 + 2), or 2.11.
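As a concrete sketch of this computation (illustrative Python, not the author's code; the counts are those given for anago in Table 2.7):

```python
import math

def word_frequency(citation_counts):
    """Log (base 10) word frequency: sum citation counts over all written
    variants sharing a common ID, add the constant 2, then take log10."""
    return math.log10(sum(citation_counts) + 2)

# anago: 5 + 105 + 16 = 126 citations across three writing systems
print(round(word_frequency([5, 105, 16]), 2))  # 2.11
```

The added constant also gives unattested words a well-defined (small, positive) frequency value of log10(2) ≈ 0.30.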
Figure 2.10: Distribution of word frequency of the target words.
Figure 2.10 shows frequency counts of the 700 target words as a function of word frequency. As can be seen in the figure, the word frequency of the target words varies across a wide range, although the targets are mostly familiar words.
The uniqueness point of a word is the point at which, moving from left to right through the word, the word is distinguished from all other words in the lexicon. This concept is the core of the Cohort Theory (Marslen-Wilson, 1984; Marslen-Wilson & Tyler, 1980; Marslen-Wilson & Welsh, 1978). Uniqueness points were identified for each target word and tabulated according to whether the word became unique before the last segment, became unique at the last segment, or did not have a uniqueness point (coded as after). Uniqueness points were further tabulated according to the number of segments from the beginning of the word. If a word had no uniqueness point, it was coded as 7, since all 700 target words have 6 segments. Table 2.8 shows a summary of the uniqueness point tabulations. Note that the uniqueness point ignores pitch patterns.
Over 55% of the target words do not have a uniqueness point. The number of words that have a uniqueness point before the last segment of the word is 214, or 30.57% of the targets. The remaining target words with a uniqueness point (102 words; 14.57%) were made unique by the last segment of the word. The uniqueness point could occur as early as the third segment of the target words.
UP (segments from word onset)   3    4    5     6     7 (no UP)
Number of words                 5    8    201   102   384
UP group                        Before: 214 (30.57%)   At: 102 (14.57%)   After: 384 (54.86%)

Table 2.8: A summary of the uniqueness point tabulations.
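The uniqueness-point identification described above can be sketched as follows (illustrative Python, not the author's code; the three-word toy lexicon is an assumption):

```python
def uniqueness_point(word, lexicon):
    """Return the prefix length (in segments) at which `word` diverges from
    every other word in `lexicon`; words that never become unique are coded
    as len(word) + 1 (i.e., 7 for the six-segment target words)."""
    for n in range(1, len(word) + 1):
        prefix = word[:n]
        if not any(w != word and w.startswith(prefix) for w in lexicon):
            return n
    return len(word) + 1

toy_lexicon = ["kodomo", "kodoo", "koromo"]
print(uniqueness_point("kodomo", toy_lexicon))  # 5: unique at "kodom"
print(uniqueness_point("koromo", toy_lexicon))  # 3: unique at "kor"
```

Note that, like the tabulation in Table 2.8, this sketch operates on segments only and ignores pitch patterns.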
The first mora frequency was calculated as the number of words beginning with the target mora divided by the total number of words in the lexicon. For example, as shown in Table 2.9, 3444 words in the lexicon begin with the mora ka, and the total number of words in the lexicon is 63531. Therefore, the proportion of words beginning with ka in the lexicon is 0.05421. This frequency is recorded for all of the target words, like kabocha, 'squash,' and karada, 'body,' that start with ka.
This measure may be interpreted in three ways. First, it shows how practiced the speaker is at parsing this form down to the phoneme segment in order to distinguish it from other words that begin the same way. Second, it shows the size of the initial cohort: in the Cohort Theory (Marslen-Wilson, 1984; Marslen-Wilson & Tyler, 1980; Marslen-Wilson & Welsh, 1978), the initial cohort is activated once listeners have heard the first few phonemes, so the higher the proportion, the more words should be activated as cohort members. Finally, the proportion can be viewed as a transitional probability of the sounds that make up the initial mora of the target words.
# of words beginning with ka   # of words in the lexicon   Proportion beginning with ka
3444                           63531                       0.05421

Table 2.9: The number of words beginning with ka, the total number of words in the lexicon, and the proportion of words in the lexicon beginning with ka.
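This proportion can be sketched as follows (illustrative Python; the toy word list is an assumption, and the printed value uses the counts from Table 2.9):

```python
def first_mora_frequency(mora, lexicon_words):
    """Proportion of lexicon words beginning with the given mora,
    treating the mora as a string prefix."""
    n_begin = sum(1 for w in lexicon_words if w.startswith(mora))
    return n_begin / len(lexicon_words)

# With the counts of Table 2.9: 3444 of 63531 words begin with ka
print(round(3444 / 63531, 5))  # 0.05421
```

The same value is then assigned to every target word sharing that initial mora, as described above for kabocha and karada.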
Figure 2.11 shows the distribution of the first mora frequencies for target words
classified in terms of initial sounds. The first mora frequency varies from 0.001133 to
0.054210 (Mean = 0.023305; SD = 0.015113) among all 700 target words. The number
of stimulus words with initial fricatives is 229 and the first mora frequency varies from
0.001684 to 0.042436 (Mean = 0.023616; SD = 0.012307). The number of stimulus
words with initial nasals is 189 and the first mora frequency varies from 0.002424 to
0.014151 (Mean = 0.011326; SD = 0.003018). The number of stimulus words with
initial stops is 282 and the first mora frequency varies from 0.001133 to 0.054210 (Mean
= 0.031081; SD = 0.016790).
Figure 2.11: Frequency counts of the target words as a function of frequency of the first mora. Words beginning with a fricative (top left), words beginning with a nasal (top right), and words beginning with a stop (bottom left).
Figure 2.12 shows frequency counts and proportion per bar as a function of the durations of the target words. The durations of the target words vary between 431 ms and 780 ms (Mean = 592 ms; SD = 60.4 ms). Note that all the target words have the same CVCVCV structure.
Figure 2.12: Distribution of the durations of the target words.
Table 2.10 shows the Pearson correlation matrix of the three neighborhood calculations and the other factors. The interest here is whether any particular factor is highly correlated with any of the neighborhood calculations. A noticeable point is that the Auditory calculation is NEGATIVELY correlated with all the other factors, whereas the Segments calculation and the Segments + Pitch calculation are POSITIVELY correlated with them. The Auditory calculation is most strongly (negatively) correlated with duration: if the duration of the target is longer, the neighborhood density is lower (r = -0.532). This is the highest correlation observed between a neighborhood density calculation and the other factors.
Table 2.10: Pearson correlation matrix of the three neighborhood calculations and the other factors (word frequency, uniqueness point, first mora frequency, and duration). [Most cells of this table are illegible in this reproduction. Recoverable values include Segments–Segments + Pitch = 0.851, Auditory–Segments = -0.168, Auditory–Segments + Pitch = -0.156, and Auditory–Duration = -0.532.]
The prerecorded audio files of the target words from the NTT databases were presented to the participants. The 700 target words are listed in Appendix C. The waveforms of the 700 audio files were scaled so that the peak RMS amplitude was equated across all files, at an amplitude of approximately 75 dB SPL.
2.5 Participants
The participants in the three experiments reported in this dissertation were all native speakers of Japanese who were born and raised in the Tokyo area (Tokyo, Saitama, Chiba, and Kanagawa). They spoke the Tokyo dialect as their native dialect. None had stayed in English-speaking countries except for short travel visits. The participants were mainly recruited from undergraduate students at Dokkyo University (Saitama, Japan). Their ages ranged from 19 to 31 years. They each received a small amount of money for their participation. None of the participants had any hearing impairment.
The selection of the participants was mainly motivated by the fact that dialectal differences affect speech processing in Japanese. Cutler and Otake (1999) reported that Japanese speakers from Kagoshima and Tochigi, where the spoken dialects do not present an accent contrast, perceive pitch accent patterns differently from native speakers of the Tokyo dialect. Because a Tokyo-native speaker produced the utterances in the NTT Database Series (Amano & Kondo, 1999, 2000), it was necessary for the participants also to be Tokyo-native speakers.
2.6 Summary
In this chapter, features common to the experiments conducted in this dissertation were discussed. In §2.2, the lexicon used in this dissertation was described: a noun lexicon developed from the electronic version of the Sanseido Shinmeikai Dictionary, a part of the NTT Database Series. Section 2.3 described the three neighborhood calculations (the Segments calculation, the Segments + Pitch calculation, and the Auditory calculation). The properties of the target words were discussed in §2.4. Finally, §2.5 provided information about the participants.
CHAPTER 3

EXPERIMENT 1: AUDITORY NAMING
3.1. Introduction
This chapter discusses the results of the auditory naming experiment. In this task, participants listened to words over headphones and repeated them as quickly as possible. Previous neighborhood studies of English (Luce & Pisoni, 1998; Vitevitch & Luce, 1998) found an inhibitory neighborhood density effect in auditory naming. If neighborhood density effects are language-universal, we would expect Japanese listeners to perform in a similar way. In other words, we would expect neighborhood density to negatively affect naming time and accuracy: words from a dense neighborhood would be named less quickly and less accurately than words from a sparse neighborhood. This is not what was found in this experiment. Rather, the reverse, facilitative neighborhood effect was found. Quite interestingly, the neighborhood densities best predicted listeners' behavior.
3.2. Methods
As explained in the previous chapter, seven hundred nouns were selected as target words from the NTT databases (Amano & Kondo, 1999, 2000). These target words met the following criteria: all the words (1) are 3-mora words, (2) are trisyllabic, (3) have a rated auditory familiarity of 5 or higher on a 7-point scale in the database (7 = highly familiar), (4) have audio files in the database, and (5) begin with a voiceless stop ([t, k]), a nasal ([n, m]), or a fricative ([s, ʃ, z, ʒ]).
The participants were 27 native speakers of Tokyo Japanese who were born and raised in the Tokyo area (Tokyo, Kanagawa, Chiba, and Saitama), as explained in Chapter 2. The 27 participants were run individually in a quiet room, and completed a one-hour test session on each of two successive days. In each session, a list of 350 target words was presented. Each list was divided into five blocks of 70 words each. The order of the blocks and of the words within each block was randomized, and the order of the two lists was counterbalanced across participants.
In each test session, participants heard the 350 stimulus words binaurally over headphones at approximately 75 dB SPL, as measured using a sound level meter. The
participants were instructed that they would hear words over the headphones and that
their task would be to repeat each word as quickly as possible. They were told that the
microphone would register the time that they began speaking, and that the time between
when the word was played and when the microphone detected their response would be
recorded.
During the experiment, a visual prompt appeared at the beginning of each trial.
One second later, the word was presented over the headphones. Naming times were
measured from the beginning of the auditory stimulus. If a participant did not respond to
the word within 4 seconds from the beginning of the auditory stimulus, “no response”
was logged.
The author ran all participants and monitored all responses over headphones during the experiment, in order to correct by hand in the datasheet any responses in which the microphone was mistakenly triggered by a hesitation or in which a wrong word was produced.
Before the test session, participants were given a practice block of 20 stimuli to
familiarize them with the procedure for the auditory naming task. Each session lasted
about 50 minutes.
3.3. Results
For the naming-time analyses, abnormally fast and slow responses falling above
or below 2.5 standard deviations (subjects and items) were eliminated. Naming times
were analyzed in a multiple general linear regression model (Cohen & Cohen, 1983) for
three different neighborhood density calculations.
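The trimming step described above can be sketched in a few lines. The following Python is an illustrative reconstruction (not the original analysis code) with hypothetical reaction times; the dissertation applied the 2.5 SD cutoff within subjects and within items, which amounts to running this once per grouping.

```python
import numpy as np

def trim_outliers(times, n_sd=2.5):
    """Drop responses more than n_sd standard deviations from the mean.

    A simplified sketch of the trimming procedure; in the actual analysis
    the cutoff was applied per subject and per item.
    """
    times = np.asarray(times, dtype=float)
    mean, sd = times.mean(), times.std()
    keep = np.abs(times - mean) <= n_sd * sd
    return times[keep]

# Hypothetical naming times (ms) with one abnormally slow response.
rts = [520, 540, 510, 535, 2500, 525, 530, 515, 545, 528]
trimmed = trim_outliers(rts)
```

The extreme 2500 ms response falls well outside 2.5 standard deviations of this sample and is removed; the remaining nine responses are kept.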
Before the analyses were conducted, the mean naming time for each participant
was calculated in order to understand the participants’ performance. The data showed
that the range for naming times was remarkably large. Even after the abnormally fast and
slow responses were removed (about 1% of the total responses), the mean naming time
for each participant varied from 283 ms to 729 ms (MEAN = 534 ms, SD = 113). The difference between the fastest namer and the slowest namer was an extremely large 446 ms. However, the participants all performed their task very diligently. Moreover, the
mean naming time for the slowest namer was still faster than the mean naming time for
real-word targets by the American-English participants in the Vitevitch and Luce (1998)
study (over 800 ms). Because of the wide range in participants’ mean naming time, the
27 participants were classified into two groups: the fast namers and the slow namers.
First, median naming times for all participants were calculated. Then, participants were
split by the median naming time (14 participants for fast namers and 13 participants for
slow namers). The analyses of the naming data for fast namers and slow namers were
conducted separately.
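The median split described above is straightforward to express in code. The sketch below uses hypothetical participant IDs and mean naming times; how the dissertation assigned a participant sitting exactly at the median is not stated, so this version places such a participant in the fast group.

```python
import statistics

def median_split(mean_rts):
    """Split participants into fast and slow namers at the median mean RT.

    mean_rts: dict mapping participant ID -> mean naming time (ms).
    Participants at or below the median count as fast (an assumption;
    the original tie-breaking rule is not reported).
    """
    cut = statistics.median(mean_rts.values())
    fast = {p for p, rt in mean_rts.items() if rt <= cut}
    slow = set(mean_rts) - fast
    return fast, slow

# Hypothetical mean naming times for five participants.
means = {"P1": 300, "P2": 450, "P3": 534, "P4": 610, "P5": 729}
fast, slow = median_split(means)
```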
The percentage of variance accounted for by each neighborhood definition was
calculated by subtracting the R2 of the basic model from the R2 of the basic model +
neighborhood density. The calculation that yielded the highest R2 that was statistically
significant was chosen as the best neighborhood calculation for the data.
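The hierarchical-regression logic described here (the R2 of the basic model subtracted from the R2 of the basic model plus neighborhood density) can be sketched as follows. This is an illustrative ordinary-least-squares reconstruction with synthetic data, not the original analysis; the predictor names are placeholders.

```python
import numpy as np

def r_squared(X, y):
    """R2 of an ordinary least-squares fit of y on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

def delta_r2(X_basic, density, y):
    """Variance uniquely accounted for by neighborhood density:
    R2(basic model + density) minus R2(basic model)."""
    X_full = np.column_stack([X_basic, density])
    return r_squared(X_full, y) - r_squared(X_basic, y)

# Synthetic demonstration: two 'basic' predictors plus a density
# predictor that genuinely contributes a little extra variance.
rng = np.random.default_rng(0)
X_basic = rng.normal(size=(500, 2))
density = rng.normal(size=500)
y = X_basic @ np.array([1.0, 2.0]) + 0.3 * density + rng.normal(size=500)
dr2 = delta_r2(X_basic, density, y)
```

Because the models are nested, the difference is never negative; it is the increment in explained variance attributable to the neighborhood term.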
The basic model was composed of 6 factors (Participants, Initial sound class, Uniqueness point, 1st mora frequency, Word frequency, and Duration). For fast namers, all factors were used to construct the basic model, although Word frequency did not reach
significance. The overall model was significant (F(19, 9713) = 335.969419, p < 0.00001;
R2 = 0.396574). The basic model for fast namers is shown in Table 3.1 below. Although a checkmark ("✓") for Participants and Initial sound class means that these two factors are statistically significant, they were treated as dummy variables in the regression analyses.¹

Participants            ✓
Initial sound class     ✓
UP                      Facilitation*
Duration                Inhibition***
1st Mora Frequency      Facilitation***
Word Frequency          Facilitation (p = 0.1281)
(*p < 0.05, ***p < 0.001)

Table 3.1: Basic model of the naming time data for fast namers, Experiment 1.

¹ A checkmark ("✓") is used to show statistical significance for dummy variables (such as Participants and Initial sound class). This convention is used in the rest of this dissertation without further explanation.
At the next step, three additional models were constructed by adding one of the three neighborhood density calculations to the basic model. Table 3.2 shows the results of the basic model and the three additional models for the fast namers. The columns in the
table contain the R2 accounted for by the model, R2 accounted for by the neighborhood
density calculation, and the direction of the neighborhood effect. The basic model
explained 39.66% of the data. The basic model with the neighborhood calculation based
on segments explained 39.71% of the data. Therefore, 0.0517 % of the variance was
actually accounted for by the neighborhood density effect independently of the other
factors. The same calculation was applied to the other two neighborhood calculations.
Models²              R2 accounted for   R2 accounted for by the   Direction of the
                     by the model       neighborhood effect       neighborhood effect
Basic                0.396574           NA                        NA
Basic + Segs         0.397091           0.000517**³               Facilitation
Basic + Segs&Pitch   0.396761           0.000187*                 Facilitation
Basic + Auditory     0.396574           0                         NA
(*p < 0.05, **p < 0.01)

Table 3.2: Models of the naming time data for fast namers, Experiment 1.
A comparison of the R2 accounted for by the three neighborhood calculations in
Table 3.2 shows two findings. Firstly, the Segments calculation and the Segments +
Pitch calculation both showed neighborhood facilitative effects: words from a dense
neighborhood were named more quickly than words from a sparse neighborhood.
Although all three neighborhood calculations showed neighborhood facilitative effects,
only the effects from the Segments calculation and the Segments + Pitch calculation
reached significance. The Segments calculation yielded the highest statistically
significant R2, and as such it counts as the best neighborhood calculation for describing the data of the fast namers. Comparing the R2 of the three calculations, R2 tended to decrease as the representation of the calculations changed from the categorical phonemic representation to the Auditory representation.

² Segs = the Segments calculation; Segs+Pitch = the Segments + Pitch calculation; and Auditory = the Auditory calculation. I will use these conventions in the rest of this dissertation.
³ ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effects were significant (F1(1, 13) = 27.463, p < 0.001; F2(1, 698) = 22.499, p < 0.001).
The second finding is that the neighborhood density effect is facilitative. Such a facilitative effect for neighborhood density has been observed for nonword targets in an
auditory naming task in Vitevitch and Luce (1998, 1999). Vitevitch and Luce (1999) also
claimed that neighborhood inhibitory effects that were strongly observed among words in
English could be modified by focusing participants’ processing on a sublexical level.
They conducted a same-different matching task in which nonwords and words were
presented together. The reasoning behind this was that if the presentation of words and
nonwords were mixed, participants would focus their processing on the sublexical level
that is common to both words and nonwords. The results showed that the previously
observed inhibition effect of neighborhood density for these words was considerably attenuated, resulting in no significant effect of neighborhood density. Their results
showed that neighborhood density inhibition effects could be completely changed to
neighborhood density facilitation if the effect of probabilistic phonotactics was stronger
than a lexical competition effect (neighborhood inhibitory effect).
A comparison of the average naming time for words in Vitevitch and Luce (1998) to the naming times in this experiment reveals an extremely large difference (over 800 ms for Vitevitch & Luce, 1998, versus 283 ms for fast namers in Experiment 1). This average naming-time difference clearly shows that Japanese listeners performed the task much faster than English listeners. However, at issue is whether Japanese listeners
started naming the words before their offset. Table 3.3 shows the number of responses
before and after the offset of the 700 target words for fast namers. The number of
naming-time responses that started before the target offset is greater than the number of
naming responses that started after the target offset for all fast namers. A Wilcoxon signed ranks test showed that this tendency is significant (p < 0.001). Therefore, fast
namers tended to start naming the targets before the offset.
Fast Namers   Before   After
F1            424      276
F2            522      178
F3            396      304
F4            694        6
F5            675       25
F6            494      206
F7            700        0
F8            583      117
F9            650       50
F10           678       22
F11           546      154
F12           671       29
F13           640       60
F14           677       23
Table 3.3: The number of responses before and after the offset of the 700 target words for fast namers.
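The Wilcoxon comparison reported for these counts can be reproduced directly from the table. The following is an illustrative from-scratch exact version (no ties or zero differences are handled; a standard statistics package's signed-ranks test would serve equally well):

```python
from itertools import product

def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-ranks test for paired samples.

    Simplified sketch: assumes no ties among |differences| and no zero
    differences, which holds for the counts in Table 3.3.
    """
    diffs = [a - b for a, b in zip(x, y)]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    total = sum(ranks)
    w = min(w_pos, total - w_pos)
    # Exact null distribution: every sign assignment equally likely.
    n = len(diffs)
    count = 0
    for signs in product([0, 1], repeat=n):
        wp = sum(r for s, r in zip(signs, range(1, n + 1)) if s)
        if min(wp, total - wp) <= w:
            count += 1
    return count / 2 ** n

# Before/after counts for the 14 fast namers (Table 3.3).
before = [424, 522, 396, 694, 675, 494, 700, 583, 650, 678, 546, 671, 640, 677]
after = [276, 178, 304, 6, 25, 206, 0, 117, 50, 22, 154, 29, 60, 23]
p = wilcoxon_signed_rank(before, after)
```

Because every fast namer produced more before-offset than after-offset responses, the statistic takes its extreme value and the exact two-sided p is well below 0.001, consistent with the result reported above.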
A basic model was also constructed for the slow namers using all the factors that
accounted for a significant portion of the naming time variance. The overall fit of the
model for the slow namers was significant (F(17, 8950) = 237.40333, p < 0.00001; R2 =
0.310789). The basic model for slow namers is shown in Table 3.4.
Participants          ✓
Initial sound class   ✓
UP
Duration              Inhibition***
1st Mora Frequency    Facilitation**
Word Frequency        Facilitation***
(**p < 0.01, ***p < 0.001)

Table 3.4: Basic model of the naming time data for slow namers, Experiment 1.
Table 3.5 shows the results of the models of the naming time data for the slow
namers. As in the analyses for fast namers, as shown in Table 3.2, neighborhood density
facilitated auditory naming. The effects were significant for the Auditory calculation and
the Segments + Pitch calculation, but marginally significant for the Segments calculation
(p = 0.0564). This reflects a major difference between fast namers and slow namers, namely, that the direction of increasing R2 magnitude was different. As shown in Table
3.5, the Auditory calculation yielded the highest statistically significant R2 (R2 =
0.000541). This indicates that the neighborhood density based on the auditory
representation predicted listeners’ performance better than the other methods for
calculating neighborhood density. It is interesting that the more detailed representations
(Auditory and Segments + Pitch) were better at predicting naming times for these slow
namers, while the opposite tendency was observed for the fast namers.
Models               R2 accounted for   R2 accounted for by the   Direction of the
                     by the model       neighborhood effect       neighborhood effect
Basic                0.310789           NA                        NA
Basic + Segs         0.311069           0.00028 (p = 0.0564)      Facilitation
Basic + Segs&Pitch   0.311154           0.000365*                 Facilitation
Basic + Auditory     0.311330           0.000541**⁴               Facilitation
(*p < 0.05, **p < 0.01)

Table 3.5: Models of the naming time data for slow namers, Experiment 1.

⁴ ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effects were significant (F1(1, 12) = 15.798, p < 0.005; F2(1, 697) = 36.220, p < 0.0001).
A comparison of the R2 accounted for by the three neighborhood calculations in Table
3.5 shows two findings. First, the Segments + Pitch calculation and the Auditory
calculation both showed neighborhood facilitative effects: words from a dense
neighborhood were named more quickly than words from a sparse neighborhood.
Although all three neighborhood calculations showed neighborhood facilitative effects,
only the effects from the Segments + Pitch calculation and the Auditory calculation
reached significance. Because the Auditory calculation yielded the highest statistically
significant R2, it represents the best neighborhood calculation for describing the data of
the slow namers. Comparing the R2 of the three calculations, there was a tendency for R2
to increase as the representation of the calculations changed from the categorical
phonemic representation to the Auditory representation.
Second, the neighborhood density effect was facilitative. Such a facilitative effect for neighborhood density had previously been observed for nonword targets in an auditory naming task in Vitevitch and Luce (1998, 1999); the data from the slow namers confirmed, with real words, the same finding observed for the fast namers.
Table 3.6 shows the number of responses before and after the offset of the 700
target words for slow namers. S1, S3, and S7 named the majority of the target words only after they had heard them completely, and most of the participants showed the same tendency. Some participants, such as S5 and S11, named more targets before the offset than after the offset. For 9 out of 13 slow namers, the number of naming responses that started after the target offset was greater than the number of naming responses that started before
the target offset. A Wilcoxon signed ranks test showed that this tendency was marginal (p = 0.086860). As a group, the slow namers tended to hear the entire word
before performing the task, but this was not true for all target words.
Slow Namers   Before   After
S1             79      621
S2            409      291
S3             82      618
S4            261      439
S5            403      297
S6            200      500
S7             75      625
S8            452      248
S9            237      463
S10           343      357
S11           468      232
S12           161      539
S13           347      353
Table 3.6: The number of responses before and after the offset of the 700 target words for slow namers.
Table 3.7 contains a summary of the reliable effects from the regression models
for fast namers and slow namers. The basic model with the Segments calculation and the basic model with the Auditory calculation were chosen as the best calculations for the fast namers and the slow namers, respectively.
                      Fast Namers     Slow Namers
Basic model
  Participants        ✓               ✓
  Initial sound class ✓               ✓
  UP
  Duration            Inhibition      Inhibition
  1st Mora Frequency  Facilitation    Facilitation
  Word Frequency                      Facilitation
Neighborhood density
  Segments            Facilitation†
  Segments + Pitch    Facilitation    Facilitation
  Auditory                            Facilitation†

Table 3.7: A summary of the reliable effects from the regression models for the naming time data (fast namers and slow namers), Experiment 1. Effects marked with † show the calculation that yielded the highest increase in R2.
When the patterns for fast namers and slow namers are compared in terms of
effective factors, it becomes apparent that four of the factors used in the basic models are
common to both fast namers and slow namers. The participants factor had an effect,
even after grouping into fast and slow namers. The initial sound classes of the target
words (Initial sound class) also affected naming time. In an auditory naming experiment,
initial sound class should affect naming times, because different initial sounds may have
different amplitudes. Although Luce and Pisoni (1998) included initial sound difference
as a factor for their analyses, they did not report the details of any effects of initial sound
difference. The analysis of the current experiment showed that words beginning with a
stop were named significantly more quickly than words beginning with a nasal or a
fricative. Moreover, words beginning with a fricative were named significantly more
quickly than words beginning with a nasal (F(2, 9730) = 173.022, p < 0.00001 for the fast namers; F(2, 8965) = 204.769, p < 0.00001 for the slow namers). Tukey HSD
Multiple Comparisons showed that differences among the sound classes were
significantly different for both groups (all at the significance level of p < 0.0001).
Duration of the words also affected naming times. Shorter words were named
more quickly than longer words. As shown in §2.4, the durations of the target words
varied from 431 ms to 780 ms, and thus, it is not surprising that this nearly 350 ms difference among the word utterances affected the naming times for both groups of participants.
First Mora Frequency affected naming times as well. Note that the first mora
frequency is the proportion of words beginning with that mora. The data demonstrated
that words with a higher first mora frequency were named more quickly than words with
a lower first mora frequency for both fast and slow namers.
Two other factors (Neighborhood density and Word frequency) demonstrated an interesting contrast between fast namers and slow namers. A word frequency effect was observed in this auditory naming experiment, although it was only found for slow namers.
For these listeners, word frequency facilitated auditory naming. Frequent words were
named more quickly than infrequent words. This seems to indicate that slow namers but
not fast namers accessed the lexicon while they performed the task.
Recall that Japanese participants performed the task much more quickly than
English participants in the equivalent experiments (Luce & Pisoni, 1998; Vitevitch &
Luce, 1999). Even the slow namers in this experiment performed the task much more
quickly than the English participants. Interestingly, the slow namers showed a word
frequency effect that was not observed in the naming data from the English participants.
This may indicate that Japanese and English are different in terms of when word
frequency has an effect.
The data showed that Japanese participants processed CVCVCV targets in
approximately the same amount of time as English monosyllables. Although the overall
duration of these targets was nearly identical, the amount of phonological information
available at points within the words was different. This means that Japanese participants
process more phonological information than English participants do in roughly the same
amount of time. These results are consistent with general claims that later processing
times reflect effects from the lexicon, such as word frequency, but also suggest that the
structure of the stimuli affects processing in the two languages.
As discussed above, a neighborhood density effect was observed among both fast
namers and slow namers. The results showed two interesting findings. First, the
neighborhood density effect was facilitative in this auditory naming experiment in
Japanese. This pattern was observed whether participants started naming words before
(fast namers) or after (slow namers) the word offset. Second, there was an interesting
contrast between fast namers and slow namers, which suggested neighborhood density
might be calculated based on different kinds of information at different stages of
processing. In this experiment, fast namers seemed to rely on the Segments calculation as
well as the Segments + Pitch calculation. Slow namers seemed to rely on the Auditory
calculation as well as the Segments + Pitch calculation. This could indicate that the information available to listeners might be different, depending on when they performed their task relative to the timecourse of word recognition. The change in effective
neighborhood suggests that richer information in the word representation becomes
available during the timecourse of word recognition. The calculation based on the
auditory representation of words was more highly correlated with slow namers’ reaction
times because not only segmental information but also pitch-accent information was
available in the word representation.
In sum, there are four main findings from the analyses of naming times. First,
many factors affected both fast namers and slow namers. The neighborhood density
effect, of main interest, was one of the factors that predicted a significant proportion of
the naming time variance. Second, the neighborhood density effect showed facilitation
for both fast and slow namers. Third, the neighborhood density effect consistently
seemed to be based on different kinds of information (from different levels of
representation) for the fast versus the slow namers. Finally, a lexically-based word
frequency effect was observed only for slow namers.
Because of the extremely high accuracy of the participants (99.86% of the
responses by the fast namers and 99.85% of the responses by the slow namers), it was not
possible to look at effects of accuracy rates.
3.4. Discussion
Previous auditory naming experiments in English have reported that neighborhood
density negatively affects the naming time and accuracy of words: words from dense
neighborhoods are named less quickly and accurately than words from sparse
neighborhoods (Luce & Pisoni, 1998; Vitevitch & Luce, 1998). The interpretation of this lexical competition effect is based on the assumption that English listeners have to
retrieve pronunciation information from the lexicon. Because the targets in Experiment 1
were words, it was expected that a Japanese auditory naming experiment would replicate
the effects observed in English auditory naming experiments. However, this pattern was
not observed.
A summary of the effects from Experiment 1, as shown in Table 3.7, indicated
that neighborhood density affected auditory naming reaction times. However, there is a
crucial difference between these Japanese auditory naming results and previous results
from English listeners: namely, the direction of the effect. Neighborhood density
facilitated naming time in Japanese, while in English, neighborhood density has been
found to facilitate only nonwords (Vitevitch & Luce, 1998). This may indicate that
Japanese listeners started naming before retrieving full pronunciation forms of the target
words.
The naming time results also showed that the same calculation was not necessarily
the best (i.e., most predictive) neighborhood calculation for fast namers and slow namers.
As shown in Table 3.7, fast namers seemed to be influenced by neighborhoods of words
that were similar in terms of segmental make-up, while slow namers were influenced by
neighborhoods of words that were similar in auditory detail. Previous neighborhood
studies have assumed that neighborhood density is computed from a single calculation, whose represented form must be the 'right' one for all participants and for the duration
of word recognition. However, our data show that Japanese listeners rely at least to some
extent on all the neighborhood calculations proposed in this dissertation. Furthermore,
the relevance of the neighborhood density calculations is highly related to the timecourse
of lexical access processes. These data suggest that during auditory word recognition the
set of activated lexical items changes as a function of time.
The results of this experiment also provided evidence that prosodic word
information is involved in lexical activation. The Auditory definition of neighbors, which
inherently includes pitch accent patterns, and the Segments + Pitch calculation, which
employs abstract pitch accent patterns, were both found to define neighborhood densities
that correlated better with listeners’ performance (for slow namers) than did the
neighborhood density defined on segmental similarity alone. The results confirmed the
view that listeners use all possible information that might be useful for lexical access (e.g., Cutler, 1997; Soto-Faraco, Sebastián-Gallés, & Cutler, 2001).
To further explore these findings, an auditory naming experiment was conducted
with the same 700 word stimuli embedded in noise. It was expected that the naming time
data would show the same neighborhood facilitative effect found in this experiment. As
for the predicted accuracy of the data, it was expected that listeners would make more
mistakes, and thus, potentially show a strong neighborhood inhibitory effect. Luce and
Pisoni (1998) conducted a study of word identification in noise in which they found a
strong neighborhood density inhibitory effect. Similarly, an inhibitory effect for
neighborhood density was also confirmed in Japanese (Amano & Kondo, 1999).
Therefore, two types of neighborhood density effects (facilitation and inhibition) should
be observed in Experiment 2.
CHAPTER 4
EXPERIMENT 2: AUDITORY NAMING IN NOISE
4.1. Introduction
The purpose of this experiment was to extend the findings of Experiment 1: a
facilitative effect for neighborhood density in the naming time data that depended on
different kinds of lexical information for fast and slow namers.
In this experiment, the same set of target words was embedded in noise, to allow a
direct comparison between the data in two auditory naming conditions — normal
(Experiment 1) and in noise (the current experiment). Previous word identification in noise
experiments in Japanese and English (Luce & Pisoni, 1998, for English; Amano &
Kondo, 1999, for Japanese) showed an inhibitory effect for neighborhood density: words
from dense neighborhoods were identified less accurately than words from sparse
neighborhoods. The accuracy data in this experiment should also show a neighborhood
density inhibitory effect.
As for the naming time data, participants’ performance should essentially be
similar to Experiment 1, because this is also an auditory naming experiment. In other
words, slow namers should show a facilitative neighborhood density effect. In addition,
this experiment might replicate the difference between fast namers and slow namers
found in Experiment 1: fast namers tended to rely on the Segments calculation, whereas
slow namers tended to rely on the Auditory calculation.
4.2. Methods
The same 700 words that were used in Experiment 1 were the target words in this experiment. The stimuli were created by adding noise to the audio files used in Experiment 1 such that the signal-to-noise (SN) ratio at the point of peak RMS was 0 dB.¹ The signal-to-noise ratio is the ratio of the amplitude of the stimuli against a
constant level of Gaussian noise. The level of the noise was estimated from the peak
RMS amplitude of each stimulus file. The noise extended 500 ms before and after each
stimulus word. The stimuli were presented to the participants at 75 dB SPL, as measured
using a sound level meter.
¹ The signal-to-noise (SN) ratio at the point of peak RMS was selected based on the results of a pilot study conducted in order to find the SN ratio that yielded about 75% response accuracy. One hundred words were given to one participant at an SN ratio of -2.5 and to another participant at an SN ratio of 0. When the SN ratio was 0, accuracy was about 70%. Thus, this SN ratio was used in this noise experiment.
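The stimulus construction described above (noise level set from each word's peak RMS, with 500 ms of noise before and after the word) can be sketched as follows. This is a reconstruction under stated assumptions, not the original scripts: the frame size used to estimate peak RMS is not reported, so a 20 ms frame is assumed, and the demo signal is a hypothetical 440 Hz tone.

```python
import numpy as np

def embed_in_noise(signal, sr, snr_db=0.0, pad_ms=500, frame_ms=20, seed=0):
    """Embed a word in Gaussian noise at a given SNR relative to peak RMS.

    The noise level is set from the word's peak short-time RMS, so that
    at the word's loudest frame the SNR equals snr_db; the noise extends
    pad_ms before and after the word. Frame length is an assumption.
    """
    signal = np.asarray(signal, dtype=float)
    frame = int(sr * frame_ms / 1000)
    frames = [signal[i:i + frame]
              for i in range(0, len(signal) - frame + 1, frame)]
    peak_rms = max(np.sqrt(np.mean(f ** 2)) for f in frames)
    noise_rms = peak_rms / (10 ** (snr_db / 20))  # 0 dB -> equal RMS
    pad = int(sr * pad_ms / 1000)
    rng = np.random.default_rng(seed)
    out = rng.normal(0.0, noise_rms, size=len(signal) + 2 * pad)
    out[pad:pad + len(signal)] += signal
    return out

# Hypothetical demo: a 1-second 440 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
word = 0.5 * np.sin(2 * np.pi * 440 * t)
stimulus = embed_in_noise(word, sr)
```

At 0 dB SNR, the noise RMS equals the word's peak frame RMS, so the word is just at the level of the masker at its loudest point.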
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Participants were 27 native speakers of Tokyo Japanese who were bom and raised
in the Tokyo area (Tokyo, Kanagawa, Chiba, and Saitama), as explained in Chapter 2.
No subjects had participated in Experiment 1.
The procedure was exactly the same as in Experiment 1 except for the addition of
a second task. Participants were asked to write down what they said in hiragana
characters after they repeated each word. Some participants used katakana characters
and/or kanji characters in their responses. As long as a response was unambiguously
interpretable, it was included in the analysis. It was emphasized to them that they should
first repeat the words as quickly as possible, and then write down their responses. All the
naming responses were recorded onto a digital audio tape (DAT), and the author analyzed
the accuracy of spoken responses.
4.3. Results
First, abnormally fast and slow responses falling above or below 2.5 standard
deviations (subjects and items) were eliminated from the naming-time analysis. Before
the analyses were conducted, the mean naming time for each participant was calculated in
order to understand the participants’ performance. The data showed that the range for
naming times was somewhat large. Even after the abnormally fast and slow responses
were removed (about 20%), the mean naming time for each participant varied from 1010
ms to 1439 ms (MEAN = 1243 ms, SD = 97.97).² Because this naming time range is
quite large, the 27 participants were classified into two groups: the fast namers and the
slow namers. First, the median naming times for all participants were calculated. Then,
participants were split by the median naming time (13 participants for fast namers and 14
participants for slow namers). The analyses of the naming data for fast namers and slow
namers were conducted separately.
The written responses were checked against the named responses, as recorded
onto the DAT tapes, in order to make sure that the responses were named with the correct
accent patterns. The author first coded correct responses and missed responses as " I” and
"0, ” respectively using a computer script. Then, the responses with a different pitch
accent pattern were treated as missed responses and corrected to “0.”
The analyses of the naming data and the word identification data were conducted
separately.
4.3.1. Naming Time Data
Naming times were analyzed in a multiple general regression model (Cohen &
Cohen, 1983) for three different neighborhood density calculations. The percentage of
variance accounted for by each neighborhood definition was calculated by subtracting the
R2 of the basic model from the R2 of the basic model plus each neighborhood density. The
results of the regression analyses conducted in this section are shown in Appendix E.

2 Naming times were measured from the onset of the sound files. Naming times measured from the onset of the embedded target words ranged from 510 ms to 939 ms.
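The variance-partitioning step can be sketched as follows: fit the basic model, fit the basic model plus a density predictor, and take the difference in R2. This is a minimal illustration on simulated data with made-up factor values, not the analysis script used here (which included all six basic factors):

```python
import numpy as np

# Simulated stand-ins for a subset of the regression factors.
rng = np.random.default_rng(0)
n = 200
duration = rng.normal(500, 80, n)      # word duration (ms)
word_freq = rng.normal(3.0, 1.0, n)    # log word frequency
density = rng.normal(10, 3, n)         # a neighborhood density count

naming_time = (900 + 0.8 * duration - 20 * word_freq - 2 * density
               + rng.normal(0, 40, n))

def r_squared(X, y):
    """R2 of an ordinary least-squares fit of y on X (with an intercept)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_basic = r_squared(np.column_stack([duration, word_freq]), naming_time)
r2_full = r_squared(np.column_stack([duration, word_freq, density]), naming_time)
delta_r2 = r2_full - r2_basic   # variance attributed to neighborhood density
print(round(r2_basic, 4), round(r2_full, 4), round(delta_r2, 4))
```

Because least squares can only improve (or leave unchanged) the fit when a predictor is added, the increment is non-negative; its significance is what the F-tests reported below evaluate.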
The basic model was composed of 6 factors (Participants, Initial sound class,
Uniqueness point, 1st mora frequency, Word frequency, and Duration). All factors except
Uniqueness point and Initial sound class were significant; Initial sound class did not
reach significance (p = 0.1203). This basic model accounted for 15.1% of the naming
time variance (F(18, 5844) = 57.763697, p < 0.00001; R2 = 0.151044). The basic model
is shown in Table 4.1 below.
Participants           √***
Initial sound class    (p = 0.1203)
UP
Duration               Inhibition***
1st Mora Frequency     Facilitation***
Word Frequency         Facilitation*
(*p < 0.05, **p < 0.001, ***p < 0.0001)
Table 4.1: Basic model of the naming time data for fast namers, Experiment 2.
Three additional models were constructed using the factors from the basic model
and each of the three different neighborhood calculations for fast namers. Table 4.2
shows the results of the basic model and three additional models for the fast namers. The
columns in the table contain the R2 accounted for by the model, R2 accounted for by the
neighborhood density calculation, and the direction of the neighborhood effect. However,
none of the neighborhood density calculations significantly explained more of the data
than the basic model.
Models                R2 accounted for    R2 accounted for by        Direction of the
                      by the model        the neighborhood effect    neighborhood effect
Basic                 0.151044            NA                         NA
Basic + Seg           0.151354            0.00031                    Facilitative
Basic + Seg&Pitch     0.151044            0                          NA
Basic + Auditory      0.151044            0                          NA
Table 4.2: Models of the naming time data for fast namers, Experiment 2.
Similarly, the basic model for slow namers was potentially composed of the same
6 factors above. The basic model of the slow namers, however, was built from only two
factors (Participants and Duration), which accounted for a relatively low (7.7%) but
significant amount of variance (F(13, 4340) = 28.017, p < 0.00001; R2 = 0.077425). The
basic model is shown in Table 4.3.
Participants           √***
Initial sound class
UP
Duration               Inhibition***
1st Mora Frequency
Word Frequency
(***p < 0.001)
Table 4.3: Basic model of the naming time data for slow namers, Experiment 2.
Next, three additional models were constructed using the factors of the basic
model and each of the three different neighborhood calculations for slow namers. Table
4.4 shows these results. The Auditory neighborhood density calculation yielded the
highest and only significant increase in R2 (R2 = 0.000991). Therefore, the Auditory
method is the best neighborhood calculation for describing the data of the slow namers.
96
Models                R2 accounted for    R2 accounted for by        Direction of the
                      by the model        the neighborhood effect    neighborhood effect
Basic                 0.077425            NA                         NA
Basic + Seg           0.077425            0                          NA
Basic + Seg&Pitch     0.077425            0                          NA
Basic + Auditory      0.078416            0.000991*3                 Facilitation
(*p < 0.05)
Table 4.4: Models of the naming time data for slow namers. Experiment 2.
Table 4.5 shows a summary of analyses from the regression models for the fast
and slow namers. The Basic + Auditory model was chosen as the best model for slow
namers.
3 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effect was significant in the subjects analysis only (F1(1, 13) = 6.154, p < 0.05; F2(1, 698) = 0.044, p > 0.1).
                      Fast Namers          Slow Namers
Basic factors
  Participants        √                    √
  Initial sound class (√)
  UP
  Duration            Inhibition           Inhibition
  1st Mora Frequency  Facilitation
  Word Frequency      Facilitation
Neighborhood density
  Segments            (Facilitation)
  Segments + Pitch
  Auditory                                 Facilitation

Table 4.5: A summary of the regression models for the naming time data (both fast namers and slow namers), Experiment 2.
A comparison of fast and slow namers in terms of effective factors shows that two
factors used in the basic models had effects for both fast namers and slow namers:
Participants and Duration. Both groups showed variability among the participants, even
after grouping was conducted. The durations of the words also affected the reaction
times. Shorter words were named more quickly than longer words. As shown in §2.4,
the durations of the target words varied from 431 ms to 780 ms. This nearly 350 ms
difference among the word utterances certainly affected naming times for both groups of
participants.
A facilitative neighborhood density effect was observed for both fast namers and
slow namers. Words with many neighbors were named more quickly than words with
few neighbors. It is worth mentioning that even though fast namers in Experiment 2 were
slower than slow namers in Experiment 1, the Auditory calculation was the best
calculation for slow namers. In the auditory representations, segmental information and
pitch accent information were available. The Segments calculation was the best
calculation for fast namers, although its effect was not statistically significant. Taken
together, the results of the naming time data in this experiment replicated the findings
regarding neighborhood density in Experiment 1.
Finally, Word frequency and first mora frequency showed facilitative effects that
were observed only for fast namers.
4.3.2. Word Identification Data
This section contains the analysis of the word identification data. The written
responses were typed into a spreadsheet by the author, then checked against the named
responses as recorded onto DAT tapes. This was done to make sure that the responses
were named with the correct accent patterns. If the written responses were not exactly the
same as named responses, they were hand-corrected. Correct responses and incorrect
responses were coded as "1" and "0," respectively.
The overall average proportions of correct responses by fast namers and by slow
namers were 0.72 (SD: 0.45) and 0.70 (SD: 0.46), respectively. The mean difference
between the two groups of participants was significant (F(1, 18898) = 6.36, p < 0.05).
Therefore, no speed-accuracy trade-off was observed.
As with the reaction time data, general regression models were also built for the
accuracy data. Table 4.6 shows the basic model for fast namers. The basic model was
again constructed using the 6 basic factors (Participants, Initial sound class, Uniqueness
point, 1st mora frequency, Word frequency, and Duration). The basic model for fast
namers included all factors except for Uniqueness point. The basic model explained just
3.4% of the data, which, though low, is reliably greater than zero (F(18, 9781) =
19.209873, p < 0.0001; R2 = 0.034145).
Participants           √***
Initial sound class    √**
UP
Duration               Lower accuracy**
1st Mora Frequency     Higher accuracy***
Word Frequency         Higher accuracy***
(**p < 0.01, ***p < 0.001)
Table 4.6: Basic model of the word identification data for fast namers, Experiment 2.
Table 4.7 shows three additional models that were constructed by adding one of
the three neighborhood density calculations to the basic model to explain the fast namers'
word identification data. The basic model with the Segments calculation and the basic
model with the Segments + Pitch calculation both increased R2 by adding
neighborhood density as a factor. Unlike the analyses of the naming times, the
neighborhood density effects observed in the word identification data show inhibition:
words with many neighbors were recognized less accurately than words with few
neighbors.
Models                R2 accounted for    R2 accounted for by        Direction of the
                      by the model        the neighborhood effect    neighborhood effect
Basic                 0.034145            NA                         NA
Basic + Seg           0.042429            0.008284***                Lower accuracy
Basic + Seg&Pitch     0.043953            0.009808***4               Lower accuracy
Basic + Auditory      0.034145            0                          NA
(***p < 0.001)
Table 4.7: Models of the word identification data for fast namers, Experiment 2.
Word identification data from the slow namers were similarly analyzed. Table 4.8
shows the basic model of the word identification data for slow namers. Although the
basic model was again constructed using 6 factors (Participants, Initial sound class,
Uniqueness point, 1st mora frequency, Word frequency, and Duration), the basic model
for slow namers included all factors except Uniqueness point. The model accounted for
3.2% of the data (F(17, 9082) = 17.991411, p < 0.0001; R2 = 0.032580).
4 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effects were significant (F1(1, 13) = 22.148, p < 0.001; F2(1, 627) = 8.629, p < 0.005).
Participants           √***
Initial sound class    √***
UP
Duration               Lower accuracy***
1st Mora Frequency     Higher accuracy***
Word Frequency         Higher accuracy***
(***p < 0.001)
Table 4.8: Basic model of the word identification data for slow namers, Experiment 2.
Table 4.9 shows the results of the basic model and three other models of the word
identification data for the slow namers. The Segments + Pitch calculation yielded the
highest increase in R2 (R2 = 0.012748), which was significant (p < 0.001). Thus, the Segments + Pitch
calculation was the best neighborhood calculation for describing the accuracy data of the
slow namers.
Models                R2 accounted for    R2 accounted for by        Direction of the
                      by the model        the neighborhood effect    neighborhood effect
Basic                 0.032580            NA                         NA
Basic + Seg           0.042105            0.009525***                Lower accuracy
Basic + Seg&Pitch     0.045328            0.012748***5               Lower accuracy
Basic + Auditory      0.032580            0                          NA
(***p < 0.001)
Table 4.9: Models of the word identification data for slow namers, Experiment 2.
Table 4.10 contains a summary of analyses of the regression models for the fast
namers and the slow namers. The basic model with the Segments + Pitch calculation was
chosen for both fast namers and slow namers.
5 ANOVAs were performed using a median split (high density neighborhood, low density neighborhood). The results showed that the effects were significant (F1(1, 12) = 67.34, p < 0.0001; F2(1, 627) = 9.847, p < 0.005).
                      Fast Namers          Slow Namers
Basic factors
  Participants        √                    √
  Initial sound class √                    √
  UP
  Duration            Lower accuracy       Lower accuracy
  1st Mora Frequency  Higher accuracy      Higher accuracy
  Word Frequency      Higher accuracy      Higher accuracy
Neighborhood density
  Segments            Lower accuracy       Lower accuracy
  Segments + Pitch    Lower accuracy       Lower accuracy
  Auditory

Table 4.10: A summary of the regression models for the word identification data (both fast namers and slow namers), Experiment 2.
The regression analyses from the word identification data for fast namers and slow
namers demonstrated that fast namers and slow namers performed very similarly in terms
of accuracy of target word recognition. Five factors contributed to word accuracy:
Participants, Initial sound class, 1st Mora Frequency, Word Frequency, and
Neighborhood density. As with the naming times, participants still reliably differed
from one another even after being split into fast and slow namers.
The Initial sound class of the target words affected the accuracy of word
recognition in noise. For fast namers, stops (66%, SD = 47) were named less accurately
than fricatives (71%, SD = 45) and nasals (74%, SD = 44; F(2,9797) = 25.85, p <
0.0001). Tukey HSD multiple comparisons showed significant differences between all
pairs of groups at p < 0.05. Slow namers also showed the same tendency (stops: 70%, SD
= 46; nasals: 76%, SD = 43; fricatives: 71%, SD = 45; F(2, 9097) = 13.67, p < 0.0001).
Tukey HSD multiple comparisons showed that the differences between stops and nasals
and between fricatives and nasals were significant at p < 0.05, although the difference
between fricatives and stops was not significant. In terms of a sonority scale, stops are
less sonorant than fricatives and nasals, and fricatives are less sonorant than nasals.
Therefore, the performance of fast and slow namers on word identification in noise might
be related to the sonority scale. Further detailed analyses of sound confusion are shown
in Appendix F.
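The group comparisons above are one-way ANOVAs over per-trial accuracy codes. A minimal sketch of the F computation on small, hypothetical 0/1 accuracy vectors (not the experimental data; the real analysis also used Tukey HSD follow-ups):

```python
import numpy as np

def one_way_anova_F(groups):
    """F ratio: between-group mean square over within-group mean square."""
    all_vals = np.concatenate(groups)
    grand = all_vals.mean()
    ss_between = sum(len(g) * (np.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(((np.asarray(g) - np.mean(g)) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical 0/1 accuracy codes for stop-, fricative-, and nasal-initial trials
stops      = [1, 0, 1, 0, 0, 1, 0, 1]
fricatives = [1, 1, 0, 1, 1, 0, 1, 1]
nasals     = [1, 1, 1, 0, 1, 1, 1, 1]
F = one_way_anova_F([stops, fricatives, nasals])
print(round(F, 3))
```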
First Mora Frequency was also significant for both groups’ word accuracy scores.
The data demonstrated that words with a higher first mora frequency were recognized
more accurately than words with a lower first mora frequency for both fast and slow
namers.
A Word Frequency effect was observed in this auditory naming experiment.
Frequent words were recognized more accurately than infrequent words.
Neighborhood density showed an inhibitory relationship with word recognition
accuracy: words from dense neighborhoods were named less accurately than words from
sparse neighborhoods. The word recognition accuracy portion of this experiment is
similar to the word identification in noise experiments of Luce and Pisoni (1998) and
Amano and Kondo (1999). Further, the naming time data replicated the results of
Experiment 1.
4.4. Discussion
Previous word identification in noise experiments have shown that the
neighborhood density effect is an inhibitory effect (Luce & Pisoni, 1998; Amano &
Kondo, 1999), and the present data are consistent with that finding. The word identification data from
the current study showed that words with many neighbors were recognized less accurately
than words with few neighbors. Thus, the word identification data in the auditory naming
experiments here replicated the same inhibitory neighborhood density effect using
Japanese.
There are, however, two main differences between the two previous word
identification in noise experiments and this current experiment. First, Amano and Kondo
(1999) did not consider word prosody. In their study, participants typed hiragana
characters as responses on a computer keyboard, so there was no way to
determine whether listeners misperceived pitch-accent patterns or not. In contrast, the
responses of the participants in this experiment were recorded on DATs so that word
confusability in noise could be analyzed in terms of segments, as well as with respect to
word-level prosody. Second, this experiment was able to demonstrate which kinds of
phonological information other than a neighborhood density effect were used to recognize
words in a noisy environment.
It was also the case that Amano and Kondo (1999) used a mora-based
neighborhood calculation where only segmental information was considered, and the
effect of the Segments calculation in the current experiment confirmed this finding as
well. Moreover, the data here suggested that pitch accent information was also exploited
to understand confusable words in noise.
Listeners exploited any kinds of information that could help them understand
words in noise. Word Frequency and first mora frequency both contributed to higher
word identification accuracy.
Initial sound class was also significant. Although the distinction among stops,
fricatives, and nasals was one factor that contributed to both the naming time data and the
accuracy data in Experiment 1, Initial sound class behaved differently in Experiment 2.
Recall that in Experiment 1, words beginning with a stop were named significantly more
quickly than words beginning with a nasal or a fricative. Moreover, words beginning
with a fricative were named significantly more quickly than words beginning with a
nasal. Contrary to this, the current experiment showed that stops were named less
accurately than fricatives and nasals. Fricatives were named less accurately than nasals.
A close look at these patterns reveals different characteristics of the sounds. The pattern in
the naming times here reflects the length of the segments. In a brief analysis, one target
word from each initial sound used in this experiment was selected, resulting in eight
words in total. These words have an HLL pitch accent pattern with 'a' as the first vowel.
They are kakudo, tahata, masuku, namida, sarada, shamozi, zyasuto, and zatsumu.
Durations of the initial sounds were measured and the mean duration for each sound class
(stops, nasals and fricatives) was calculated. The analysis showed that the duration
means of stops, nasals and fricatives were 17 ms, 44 ms, and 276 ms, respectively. The
pattern in the accuracy data, however, reflected sonority of sounds. In general, target
words were less easily recognized because of the noisy environment. The data suggest
that participants are sensitive to properties of sounds that are scaled in terms of sonority.
Different interpretations of Initial sound class information in the naming time data and
the accuracy data support the view that these two types of data reflect different aspects of
lexical access processes.
In the naming time data, the neighborhood density effect was observed only for
slow namers. The Auditory calculation was the best calculation for slow namers. These
data might indicate that the best calculation for neighborhood density changed over the
timecourse of word recognition processing.
It is interesting that Duration affected the naming times in both experiments,
whereas Initial sound class and 1st mora frequency were not consistently significant
factors when the stimuli were presented in noise. This might indicate that participants
who heard the stimuli without noise may have been better able to exploit the properties of
the initial sounds and moras. This would indicate that it is not only the length of the
target stimuli, but also the length of the first segment, that affected naming time. The
facilitation of 1st mora frequency could be interpreted in two ways, both of which are
consistent with the data in Experiment 1. First, this may be an effect from production:
participants were better able to produce words beginning with a highly probable word-initial
mora, as was also observed in Experiment 1. The other possible interpretation is that this is an
effect of high transitional probabilities of sounds (word-initial CV transition). Or, these
two effects are both influential, but not separable, since they both show facilitation.
In sum, in this auditory naming experiment in a noisy condition, the neighborhood
density effect was a facilitative effect in the naming data and an inhibitory effect in the
word identification data. An interesting difference between the naming time data and
word identification data is that the relative increase of variance accounted for by adding
lexical neighborhood was greater for the word identification data than was the relative
size of the neighborhood effects in accounting for the reaction time data. This could
indicate that the naming times were attenuated because of the noisy environment.
This facilitative neighborhood density effect on naming times occurred whether
participants started naming words before the word offsets (fast namers, Experiment 1)
or after them (slow namers, Experiments 1 and 2). However, one thing to keep in mind is that
under noise-free conditions (Experiment 1), fast namers did not show a speed-accuracy
trade-off. They were 99.86% accurate even when they produced all 700 words in the
experiment. Thus, there is a crucial difference between English and Japanese naming
experiments with words. If the neighborhood inhibitory effect is the result of lexical
competition, then Japanese naming data may show that lexical competition did not affect
processing times. But if this is the case, then the naming task was not the right one for
looking at lexical competition effects even though previous English naming experiments
showed that this task shows an inhibitory neighborhood density effect with words.
Therefore, if other tasks could induce lexical competition, an inhibitory neighborhood
density effect might affect the processing times. This possibility will be investigated in
the final experiment.
CHAPTER 5
EXPERIMENT 3:
SEMANTIC CATEGORIZATION EXPERIMENT
5.1. Introduction
This chapter investigates neighborhood density effects in a task that requires
selection of words stored in the lexicon. The lexical decision task has been used to study
the time course of auditory word recognition for English words presented in the clear
(Goldinger, 1996). Lexical decision has been shown to be sensitive to lexical frequency,
and, with proper controls, it is also sensitive to neighborhood density and neighborhood
frequency (Luce & Pisoni, 1998). However, it requires discrimination between word and
nonword patterns so it is often criticized as having unnatural task characteristics. Further,
Amano and Kondo (1999) failed to show an inhibitory neighborhood density effect on
processing times.
In this last experiment, an alternative to the lexical decision task was used: the
semantic categorization task. Vitevitch and Luce (1999) used this task to investigate the
sublexical and lexical levels of representation in the on-line processing of spoken words.
The semantic categorization task is a relatively new experimental procedure (Forster &
Shen, 1996). In this task, participants are given a semantic category (for example,
animals) and hear a word over headphones; they must decide as quickly and accurately as
possible whether the word belongs to the given semantic category. Vitevitch and Luce
(1999) used this methodology because retrieval of semantic information for responses
definitely requires lexical access, yet such a process would not unnaturally bias
processing towards either the sublexical level or the lexical level as might be the case in
lexical decision. The auditory naming task and the lexical decision task have been
commonly used in the English neighborhood literature. The auditory naming task may
bias processing toward the sublexical level, however, because a response may be made
without accessing lexical representations. On the other hand, the lexical decision task
seems to bias processing towards the lexical level, since participants have to make a
decision about the word’s lexicality.
The semantic categorization task requires not only accessing the lexicon (lexeme)
but also passing the information to a higher semantic level (lemma). Therefore, words
must be competing while participants perform the task. The task required use of actual
words only, which is exactly the constraint for our neighborhood density calculation
based on auditory similarity. Since all target words and filler words must be actual
words, the acoustic-based neighborhood density calculation from the NTT database can
be kept for this experiment. The final experiment was conducted using the semantic
categorization task to test for a neighborhood effect in Japanese spoken word recognition.
5.2. Methods
5.2.1. Stimuli
The target words were the same 700 words that were used in Experiments 1 and 2.
Seven hundred additional words were selected as fillers. There were three crucial
constraints for filler selection in this experiment. First, all 700 additional words had to be
nouns that had sound files in the NTT database. Second, the semantic categories and the
words that belong to them had to be relatively common in Japanese. Third, each semantic
category needed to be represented by an equal number of words. In order to fulfill these
constraints, twenty-four different semantic categories were selected from various
reference resources. Four categories were used twice.
There were three concerns in the selection of semantic categories and their
associated words. First, previous semantic categorization experiments have not yet tested
the effect of using multiple semantic categories within a single experiment. Thus, such
an experimental structure may not work and the collected data may not be coherent at all.
To address this concern, the validity of the semantic categorization experiment was
examined before the final analysis was carried out. The results are reported in §5.3.1.
The second concern was whether participants also would think that the words
selected for each semantic category really belonged to that semantic category. Battig and
Montague (1969) conducted a survey on category norms for verbal items in 56 English
categories. In their survey, each semantic category was given to nearly 450 college
students, and their task was to come up with as many words as possible that belonged to
the semantic category within 30 seconds. The results of this study tell us common words
for specific semantic categories in English. Ogawa (1972) conducted a similar survey on
category norms for verbal items in 52 Japanese categories.1 Given the constraints on
filler selection described above, it was not possible to use a complete subset from one or
both of these studies. Thus, it was necessary to create some additional (as yet untested)
categories. A pilot study was conducted to test the categories and 700 newly chosen filler
words.
In the pilot study, 28 semantic categories were first explained to the participants,
four native speakers of Tokyo Japanese. Their task was to choose the most appropriate
one of the 28 categories for each of the 700 filler words. These words and semantic
categories met the constraints mentioned earlier. All four participants categorized them
as expected. However, in post-experiment interviews, they all admitted that some words
were easier to categorize than others. Also, some categories were more intuitive than
others. The 700 filler words and the descriptions of their semantic categories are shown
in Appendix G.
1 In Ogawa (1972), participants performed the task within one minute, not 30 seconds.
The semantic categorization experiment was designed such that the 700 words
comprising the lexicon used in previous experiments were the target words, and the newly
chosen 700 words were fillers. The newly chosen words had to belong to categories
unambiguously, while the original 700 members of the lexicon used for Experiments 1
and 2 had to be distributed such that they did not belong to the category to which they
were assigned.
As with the original 700 words, the additional 700 words were also scaled so that
the peak root-mean-square (RMS) amplitude values were equated for all new files, at an
amplitude of approximately 75 dB SPL, as measured using a sound level meter.
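The peak-RMS equalization can be sketched as follows. The windowed-RMS computation, window size, and target level are illustrative assumptions (the passage specifies only that peak RMS values were equated across files, with playback at roughly 75 dB SPL):

```python
import numpy as np

def peak_rms(x, win=1024):
    """Largest RMS value over successive non-overlapping windows."""
    n = len(x) // win
    frames = x[: n * win].reshape(n, win)
    return np.sqrt((frames ** 2).mean(axis=1)).max()

def scale_to_peak_rms(x, target):
    """Scale the waveform so its peak windowed RMS equals `target`."""
    return x * (target / peak_rms(x))

# Synthetic stand-in for one stimulus file (1 s at 48 kHz).
rng = np.random.default_rng(1)
wav = rng.normal(0, 0.2, 48_000)
scaled = scale_to_peak_rms(wav, 0.1)
print(round(peak_rms(scaled), 6))
```

Because scaling is linear, applying the same routine to every file equates their peak RMS values exactly.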
In this experiment, the newly selected 700 filler words all belong to some category,
whereas the 700 original target words do not belong to the categories to which their
blocks were assigned. In each
block, twenty-five target words and twenty-five filler words were presented. Only
original target words were analyzed in the experiment.
5.2.2. Participants
Participants were 30 native speakers of Tokyo Japanese who were born and raised
in the Tokyo area (Tokyo, Kanagawa, Chiba, and Saitama), as explained in Chapter 2.
They were all right-handed. They each received a small amount of money for their
participation. No participant had taken part in either Experiment 1 or 2.
5.2.3. Procedure
Participants completed a one-hour test session on each of two successive days.
Participants were tested individually. Each participant was seated in a quiet room. In
each session, one of the two lists, each of which contained 700 words, was presented.
Each list had 14 blocks, and each block contained 50 words. The order of the blocks and
the words within each block were randomized for each participant. The order of the two
lists was counterbalanced among the participants. Presentation of stimuli and response
collection were controlled by a computer.
At the beginning of each test session, participants were given a written list of 14
semantic categories and their descriptions. The written descriptions allowed the
participants to ask questions about the definitions of categories. Following this, the
participants were told that they would listen to 50 words each in 14 blocks in the
experiment. At the beginning of each block, they would hear the same description of a
semantic category as in the written list. Their task was to decide whether each word in
the block belonged to the specified semantic category for each block and to press a button
(‘yes’ or ‘no’) as quickly and accurately as possible. The left-hand button was labeled no
and the right-hand button on the response box was labeled yes. The stimulus words were
presented binaurally over headphones at approximately 75 dB SPL.
A visual prompt appeared at the beginning of each trial. One second later, one of
the spoken stimuli was presented over the headphones at 75 dB SPL to participants.
Reaction times were measured from the beginning of the auditory stimulus to the button
press response. If a participant did not respond to the word within 4 seconds from the
beginning of the auditory stimulus, the computer automatically recorded “no response”
and presented the next trial.
Before the test session, participants were given a practice block of 10 stimuli,
excluded from the final data analysis, to familiarize them with the procedure. Each session
lasted for an hour.
5.3. Results
5.3.1. An Evaluation of the Semantic Categorization Task in Japanese
Before analyzing the “no” responses for the original 700 words, "yes” responses
to the newly chosen filler words were analyzed in order to make sure that participants
performed their task appropriately. In this experiment, half of the experimental stimuli
were “no” target-word responses that were included in the final analysis, and the other
half were “yes” filler-word responses that were supposed to belong to semantic categories
in the experiment. The strategy here is that “yes” filler words were distracters to the
participants. The participants did not know that the focus of the experiment was on the
“no” target-word responses, rather than on the “yes” filler-word responses. Our
expectation was that the proportion of correct responses (answering "yes") for filler-word
responses should be relatively high. If, however, the proportion of correct
responses to those words were low, the experiment itself would be a failure.
The proportion of correct responses for “yes” filler-word responses was 0.95. The
data also showed some variability in the participants’ performance on the “yes” filler-
word responses. The highest and the lowest proportions of the correct responses were
0.985 and 0.917.
The different semantic categories used in the experiment also had to be evaluated.
Twenty-four different semantic categories were used for 28 blocks in total, and each block
consisted of 25 words selected for its semantic category and 25 target words not belonging
to that category.
category. There are a couple of questions regarding the use of multiple semantic
categories in a single experiment. The first question is whether the different semantic
categories were equally difficult in the experiment. Assuming that words were easy to
categorize in a designated semantic category, participants should not have made mistakes.
Table 5.1 shows a summary of correct responses for “yes” filler-word responses in
terms of semantic categories.² The first column shows the names of the semantic
categories (Category). The next column shows the mean proportion of correct responses
for each semantic category (Mean). The next five columns indicate the number of correct
responses. For example, the column labeled 30 indicates the number of words that were
correctly categorized by all 30 participants. Likewise, the column labeled 29 indicates
the number of words that 29 participants correctly classified, and so forth. The column
labeled <27 indicates the number of words that fewer than 27 participants responded to
correctly. For example, in the block Animals, 10 words were categorized correctly by all
30 participants, and so on. The final column (NW) shows the number of words correctly
classified by at least 28 (93%) of the participants. In the block Animals, all 25 words
were correctly classified.
² Four semantic categories were used twice. These categories were listed as separate semantic categories, such as Careers (I) and Careers (II).
Category                  Mean   30  29  28  27  <27  NW
Animals                   0.977  10  13   2   0   0   25
Ingredients               0.976  12   8   5   0   0   25
Colors                    0.973  13   6   4   2   0   23
Desserts                  0.970  12   7   4   1   1   23
Fruits                    0.976  18   3   1   1   2   22
Diseases                  0.972  14   6   2   2   1   22
Sports                    0.970  12   8   2   2   1   22
Main dishes               0.968  16   4   2   1   2   22
Vegetables & beans (I)    0.957  11   5   6   0   3   22
Insects                   0.948  12   8   2   1   2   22
Birds                     0.962  14   5   2   1   3   21
Body parts (I)            0.960  10   8   3   3   1   21
Grammatical terms         0.948   7   6   8   2   2   21
School items              0.945  10   8   3   1   3   21
Flowers                   0.950   9   8   3   4   1   20
Flavors                   0.948   9   8   3   1   4   20
Careers (II)              0.940   9   7   4   2   3   20
Body parts (II)           0.960   8  11   0   1   5   19
Parts of the buildings    0.948   7   6   6   4   2   19
Things found in house     0.946   8   5   6   4   2   19
Metals                    0.934  12   6   1   1   5   19
Instruments               0.956  11   5   2   5   2   18
Vehicles                  0.956  11   5   2   5   2   18
Careers (I)               0.929   7   7   4   3   4   18
Fish (I)                  0.928   7   7   3   3   5   17
Vegetables & beans (II)   0.925  10   6   1   2   6   17
Subjects of study         0.898   7   7   3   0   8   17
Fish (II)                 0.898  11   4   2   1   7   17
Table 5.1: A summary of correct responses for “yes” filler-word responses in terms of semantic categories.
There are two pieces of evidence suggesting that not all blocks of semantic
categories were equally easy. The first piece of evidence is that the means of the correct
responses across words for the 28 different blocks had a relatively wide range: the highest
and the lowest means were 0.977 (Animals) and 0.898 (Fish (II) and Subjects of
study). The second piece of evidence is that the number of words under NW in Table
5.1 varied. As mentioned above, 25 words were selected for each semantic category, and
the column NW in Table 5.1 shows a relatively wide range of variability: the highest and
lowest numbers were 25 words (for Animals and Ingredients) and 17 (for Fish (I),
Fish (II), Vegetables & beans (II) and Subjects of study). The above two factors (number
of words classified correctly by at least 93% of participants and mean correct) were
highly correlated (R2 = 0.809, p < 0.0001).
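The reported correlation between per-category mean accuracy (Mean) and the number of well-classified words (NW) can be checked directly from the values in Table 5.1. The sketch below recomputes an ordinary Pearson correlation with the standard library; the statistic reported in the text may have been computed somewhat differently:

```python
# Pearson correlation between the Mean and NW columns of Table 5.1,
# computed with the stdlib only. The (Mean, NW) pairs are copied from
# the table's 28 category blocks.
from math import sqrt

blocks = [
    (0.977, 25), (0.976, 25), (0.973, 23), (0.970, 23), (0.976, 22),
    (0.972, 22), (0.970, 22), (0.968, 22), (0.957, 22), (0.948, 22),
    (0.962, 21), (0.960, 21), (0.948, 21), (0.945, 21), (0.950, 20),
    (0.948, 20), (0.940, 20), (0.960, 19), (0.948, 19), (0.946, 19),
    (0.934, 19), (0.956, 18), (0.956, 18), (0.929, 18), (0.928, 17),
    (0.925, 17), (0.898, 17), (0.898, 17),
]

def pearson_r(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

r = pearson_r(blocks)  # strong positive correlation between Mean and NW
```

Squaring r gives the proportion of variance the two difficulty measures share.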
The analysis of “yes” filler-word responses had two goals. The first goal was to
understand how well participants performed their task in the experiment. The second
goal was to investigate more about semantic categories and their representative words in
Japanese. This analysis yielded three new findings. First, participants generally showed
good performance in the semantic categorization experiment. Note that the selection of
semantic categories and their “yes” filler words was not ideal, because some of the
semantic categories were based not on objective reference to previous literature on
category norms in Japanese, but on the intuitions of this author as a native speaker of
Japanese. Additionally, participants had to deal with multiple semantic categories
in each session, so the task was more difficult than in experiments using a single semantic
category (a more typical setup). Even so, the average percent correct for “yes” filler-word
responses by subjects was 95%. This high percentage indicates that participants
performed the task very well, even under circumstances in which they had to deal with 28
different semantic categories. We therefore expect that an analysis of “no” target-word
responses should provide interpretable results in the semantic categorization experiment.
Secondly, although general accuracy patterns clearly indicate that the participants
performed the task properly, the degree of task difficulty among 28 semantic categories
seemed to differ. The mean proportions of words responded to correctly within each
category (Mean) and the number of words correctly classified by at least 93% of the
listeners (NW), both shown in Table 5.1, indicate that the participants found their task
relatively easier in some categories, such as Animals and Ingredients, than in other
categories, such as Fish (I) and Subjects of study. Also, these two factors were highly
correlated.
Thirdly, although the degree of task difficulty among semantic categories within
the experiment varied, the selection of semantic categories by itself seemed to be
reasonable. The lowest number of words correctly classified by at least 93% of the
participants was 17 out of 25, for the categories Fish (I), Fish (II), Vegetables &
beans (II) and Subjects of study. In other words, at least 17 of the words were considered
by typical undergraduate students to be good instances of these semantic categories. The
experimental structure required 25 words for each semantic category. However, the
results clearly suggest that not all of the “yes” filler words were representative words for
each category. This does not mean that some of the selected semantic categories were
unacceptable. Rather, some semantic categories had less intuitive or less representative
words than the other categories.
The results of the analysis of “yes” filler-word responses are useful for selecting
semantic categories and their representative words in future semantic categorization
experiments in Japanese.
Turning now to the “no” target-word responses, the results show that the overall
correct percentage of “no” target-word responses was high. The highest and the lowest
proportions of correct responses were 0.985 and 0.917, respectively. The high average
percentage (96.54%) indicates that participants were generally able to categorize “no”
target-word responses in the experiment as expected.
Next, accuracy for each target word was investigated. Table 5.2 shows a
summary of accuracy for the 700 target words. The number of words (Number o f Words)
and the proportion of words (Proportion o f Words) that were correctly classified by
number of participants (Number of Participants) are also shown in the table. For example,
261 out of the 700 target words were correctly classified by all 30 participants, a
proportion of 0.3729.
The data indicate that the overall percentage of “no” responses to “no” target
words was high: 502 of the 700 target words (71.58%) were correctly classified
as “no” by at least 29 of the 30 participants in this study, and 597 of the 700 target
words (85.15%) were correctly classified by at least 28 of the 30 participants. Such a
high percentage indicates that participants generally performed their task as
expected.
Number of Participants Number of Words Proportion of Words
30      261    0.3729
29      241    0.3429
28       95    0.1357
27       37    0.0528
26       22    0.0314
25       14    0.0200
24        8    0.0114
23        2    0.0029
22        7    0.0100
21        1    0.0014
20        2    0.0029
19        3    0.0043
18        1    0.0014
17        2    0.0029
16        1    0.0014
15        1    0.0014
14        0    0
13        2    0.0029
12        0    0
11        0    0
10        1    0.0014
0-9       0    0
Total   700    1
Table 5.2: A summary of the accuracy data for the 700 target words.
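The proportions and cumulative counts quoted in the text follow directly from the word counts in Table 5.2; a quick arithmetic check over the top three rows:

```python
# Checking the arithmetic of Table 5.2: each proportion is the word count
# divided by the 700 target words, and summing the top rows gives the
# cumulative "at least N participants" figures quoted in the text.
TOTAL_WORDS = 700
counts = {30: 261, 29: 241, 28: 95}   # top three rows of Table 5.2

at_least_29 = counts[30] + counts[29]      # words correct for >= 29 listeners
at_least_28 = at_least_29 + counts[28]     # words correct for >= 28 listeners
prop_all_30 = counts[30] / TOTAL_WORDS     # proportion for the 30-listener row
```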
However, the data shown in Table 5.2 also indicate that the percentage of “no”
responses was unusually low for a few tokens. In the worst case, only 10 out of 30
participants responded “no” to a “no” target word. The participants seemed to
have difficulty responding to some target words.
As explained in §5.3.1, “no” target words could be classified as “yes” if
they were assigned to inappropriate semantic categories. The main question was
whether the unusually low correct responses to certain words were induced by such
inappropriate semantic categories for those words. In order to examine this possibility, 104
target words that were correctly classified as “no” by at most 26 participants were
investigated to see whether they could plausibly be classified as “yes.” Of the 104 target
words, 8 had notably high error rates, as shown in Table 5.3.
# of Participants   Words                 Semantic Categories
22                  sikori ‘stiffness’    Body parts
22                  tanima ‘valley’       Body parts
22                  tegami ‘letter’       School items
22                  katiku ‘livestock’    Animals
18                  sonote ‘that method’  Body parts
16                  kabure ‘rash’         Diseases
15                  namazu ‘catfish’      Insects
10                  musubi ‘last word’    Grammatical terms
Table 5.3: The discarded target words for the final analysis in the semantic categorization experiment.
The target word assignments to these semantic categories were mistakes: they
should have been assigned more carefully to other semantic categories. Therefore, these
8 target words were dropped from the full analysis. A more detailed explanation is given
in Appendix H.
In summary, this section discussed the validity of the semantic categorization
experiment. The analyses of “yes” filler-word responses and “no” target-word responses
both provided evidence that the participants performed their task as expected. The “no”
target-word responses were of primary interest, and the results of the analysis may
provide interpretable data. The analysis of correct responses to “yes” filler words
also provided useful information on the semantic categories
and their representative words. The basic information on the semantic categories and
their representative words may be useful for future semantic categorization experiments.
5.3.2. Semantic Categorization Data
Based on the analyses in §5.3.1, 8 target words were eliminated from the
categorization time analysis. Then, abnormally fast and slow responses, falling above or
below 2.5 standard deviations of the subject and stimulus means, were deleted.
Semantic categorization times for correct word responses were analyzed in a multiple
regression model (Cohen & Cohen, 1983) for the basic model and four other models.
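One plausible reading of the 2.5 SD trimming rule is sketched below with invented data; the original analysis may have combined the subject and stimulus criteria differently:

```python
# A sketch of 2.5 SD outlier trimming: a response is retained only if it
# lies within 2.5 standard deviations of both its subject's mean RT and
# its stimulus item's mean RT. Data and names are invented.
from statistics import mean, stdev
from collections import defaultdict

def trim(trials, cutoff=2.5):
    """trials: list of (subject, item, rt_ms); returns the retained trials."""
    by_subj, by_item = defaultdict(list), defaultdict(list)
    for s, i, rt in trials:
        by_subj[s].append(rt)
        by_item[i].append(rt)

    def ok(values, rt):
        if len(values) < 2:
            return True  # cannot estimate an SD from a single observation
        m, sd = mean(values), stdev(values)
        return abs(rt - m) <= cutoff * sd

    return [(s, i, rt) for s, i, rt in trials
            if ok(by_subj[s], rt) and ok(by_item[i], rt)]

# One subject, nine items: the 5000 ms response exceeds 2.5 subject SDs.
trials = [("s1", "w%d" % k, rt) for k, rt in
          enumerate([700, 705, 710, 695, 700, 705, 710, 695, 5000])]
kept = trim(trials)
```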
Before the analyses were conducted, the mean categorization time for each
participant was calculated. The data showed that the range of categorization times was not
remarkably large. After the abnormally fast and slow responses were removed, the mean
semantic categorization time per participant varied from 710 ms to 890 ms (MEAN
= 777 ms, SD = 50.53). Since the variance across participants was smaller than in the
previous two experiments, the participants were not split into fast and slow groups.
The percentage of variance accounted for by each neighborhood definition was
calculated by subtracting the R2 of the basic model from the R2 of the basic model +
neighborhood density. The calculation yielding the highest increase in R2 that is
statistically significant was chosen as the best neighborhood calculation for the data.
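This model-comparison step reduces to comparing R² increments over the basic model. The sketch below uses the R² values reported for Experiment 3 in Table 5.5:

```python
# Choosing the best neighborhood calculation: subtract the basic model's R2
# from each augmented model's R2 and take the largest (significant) increase.
# R2 values are those reported in Table 5.5 for Experiment 3.
r2_basic = 0.222983
r2_with_density = {
    "Segs": 0.223911,
    "Segs&Pitch": 0.224481,
    "Auditory": 0.223566,
}

increments = {name: r2 - r2_basic for name, r2 in r2_with_density.items()}
best = max(increments, key=increments.get)  # largest R2 increase
```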
The basic model was constructed using 7 factors: the 6 factors that were used in the
previous two experiments (Participants, Initial sound class, Uniqueness point, 1st
mora frequency, Word frequency, and Duration) and an additional factor, Semantic
category. Semantic category was introduced because listeners categorized
words into different semantic categories. All seven factors were used to construct the basic
model, which explained 22.3% of the variance (F(62, 19601) = 90.725063, p < 0.00001; R2 =
0.222983).
Participants            ✓***
Initial sound class     ✓***
Semantic Category       ✓***
UP                      Inhibition***
Duration                Inhibition***
1st Mora Frequency      Inhibition***
Word Frequency          Facilitation***
(***p < 0.001)
Table 5.4: Basic model of the semantic categorization data, Experiment 3.
Three additional models were constructed using the factors of the basic model and
one of the three different neighborhood calculations. Table 5.5 shows the results of the
four models.
Models               R2 accounted for   R2 accounted for by the   Direction of the
                     by the model       neighborhood effect       neighborhood effect
Basic                0.222983           NA                        NA
Basic + Segs         0.223911           0.000928***               Facilitation
Basic + Segs&Pitch   0.224481           0.001498***               Facilitation
Basic + Auditory     0.223566           0.000583***               Inhibition
(***p < 0.001)
Table 5.5: Models of the semantic categorization data, Experiment 3.
The values of R2 accounted for by the neighborhood effect demonstrate a contrast
between the calculations based on the segmental representation (the Segments calculation
and the Segments + Pitch calculation) and the calculation based on the Auditory
calculation. Interestingly, the calculations based on the segmental representation show
facilitation whereas the calculation based on the auditory representation shows inhibition.
Two types of neighborhood density effect are observed in the same experiment.
Therefore, both types of neighborhood density effect might contribute to the
categorization times. Also, this could support the claim by Luce and Large (2001) that
the effect of probabilistic phonotactics (a facilitative effect) and the neighborhood density
effect (an inhibitory effect) are separable and coexist.
In order to confirm this hypothesis, another model was constructed. This time, two
of the calculations (the Auditory calculation and the Segments + Pitch calculation) were
included together as neighborhood density factors. If both effects were real, they
should also be significant in a model where both are included as separate factors.
The model accounted for 22.5% of the variance (F(64, 19599) = 88.919995, p < 0.0001;
R2 = 0.225026), both types of neighborhood density were significant, and the directions of
the effects did not change: facilitation from the Segments + Pitch calculation (F =
36.9340, p < 0.0001) and inhibition from the Auditory calculation (F = 13.7924, p <
0.0005). The combination of both calculations yielded the highest increase in R2 (R2
increase = 0.002043) relative to the basic model, suggesting that this might be the best model.
Unlike in the previous two experiments, two neighborhood calculations can coexist in the
model.
Table 5.6 is a summary of the regression model for the semantic categorization
data. The model with a combination of the Segments + Pitch calculation and the
Auditory calculation is shown here.
Basic model factors:
    Participants            ✓***
    Initial sound class     ✓***
    Semantic Category       ✓***
    UP                      Inhibition***
    Duration                Inhibition***
    1st Mora Frequency      Inhibition***
    Word Frequency          Facilitation***
Neighborhood density calculations:
    Segments + Pitch        Facilitation***
    Auditory                Inhibition***
(***p < 0.001)
Table 5.6: A summary of the regression model with two types of neighborhood density (facilitative and inhibitory), Experiment 3.
There are at least two interpretations for the fact that two neighborhood density
effects are observed in the data. The first interpretation is that both effects are valid for
all participants. The other interpretation is that one neighborhood density effect is
operative for some listeners and the other density measurement correlates with RT for
other listeners. In other words, if we grouped the participants as fast responders and slow
responders as in previous experiments, neighborhood density facilitation might be
observed among fast responders only, whereas neighborhood density inhibition might be
observed among slow responders only. In order to test this hypothesis, participants were
grouped by the median of participants’ mean categorization times. Fifteen participants
each were grouped as “fast responders” and as “slow responders,” and the data of fast
responders and slow responders were then reanalyzed separately.
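The median split described here amounts to ranking participants by mean categorization time and halving the list; a minimal sketch with invented RT means:

```python
# Median split of participants into fast and slow responders, based on each
# participant's mean categorization time. The RT values are invented.

def median_split(mean_rt_by_subject):
    """mean_rt_by_subject: dict subject -> mean RT; returns (fast, slow)."""
    ranked = sorted(mean_rt_by_subject, key=mean_rt_by_subject.get)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]

means = {"s1": 712, "s2": 890, "s3": 745, "s4": 805, "s5": 770, "s6": 850}
fast, slow = median_split(means)
```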
The basic model for fast responders consisted of 7 factors as in the previous basic
model. All seven factors were effective. The model accounted for 11.18% of the
variance (F(47, 9922) = 26.586417, p < 0.0001; R2 = 0.111852).
Next, three additional models were constructed by adding to the basic model each
of the three different neighborhood calculations for fast responders. Table 5.7 shows the
results of the four models for fast responders. As in the overall analysis above, the
Segments + Pitch calculation and the Auditory calculation are both significant³, and have
opposite effects on semantic categorization reaction times.
³ ANOVAs were performed using a median split (high density neighborhood, low density neighborhood) for the Segments + Pitch calculation and the Auditory calculation. The analyses did not show significant effects for either calculation.
Models               R2 accounted for   R2 accounted for by the   Direction of the
                     by the model       neighborhood effect       neighborhood effect
Basic                0.111852           NA                        NA
Basic + Seg          0.113391           0.001539***               Facilitation
Basic + Seg&Pitch    0.114240           0.002388***               Facilitation
Basic + Auditory     0.112253           0.000401*                 Inhibition
(*p < 0.05, ***p < 0.001)
Table 5.7: Categorization data of fast responders, Experiment 3.
Another model was built using the 7 factors of the basic model and the 2
neighborhood density calculations (the Segments + Pitch calculation and the Auditory
calculation). All factors contributed to the model significantly (F(49, 9922) = 26.203629,
p < 0.0001; R2 = 0.114600). The Segments + Pitch calculation (F = 26.2982, p < 0.0001)
and the Auditory calculation (F = 4.0409, p = 0.0444) still showed facilitation and
inhibition, respectively.
Similarly, the model for the slow responders was also built. All seven factors
were used to construct the basic model, which accounted for 17.37% of the RT variance
(F(47, 9696) = 43.153447, p < 0.00001; R2 = 0.173734).
Next, three additional models were constructed by adding to the basic model each
of the three different neighborhood calculations for slow responders. Table 5.8 shows the
results of the four models for slow responders. As in the overall analysis above, the
Segments + Pitch calculation and the Auditory calculation are both significant⁴.
Models               R2 accounted for   R2 accounted for by the   Direction of the
                     by the model       neighborhood effect       neighborhood effect
Basic                0.173734           NA                        NA
Basic + Seg          0.174385           0.000651**                Facilitation
Basic + Seg&Pitch    0.174850           0.001116***               Facilitation
Basic + Auditory     0.174668           0.000934**                Inhibition
(**p < 0.01, ***p < 0.001)
Table 5.8: Models of the semantic categorization data for slow responders, Experiment 3
Another model was built using the 7 factors of the basic model and the 2
neighborhood calculations (the Segments + Pitch calculation and the Auditory
calculation). All factors contributed to the model significantly (F(49, 9644) = 41.964509,
p < 0.00001; R2 = 0.175745). The Segments + Pitch calculation and the Auditory
⁴ ANOVAs were performed using a median split (high density neighborhood, low density neighborhood) for the Segments + Pitch calculation and the Auditory calculation. The analyses did not show significant effects for either calculation.
calculation still showed facilitation (F = 12.5943, p = 0.0004) and inhibition (F = 10.4670,
p = 0.0012), respectively.
Table 5.9 shows a summary of the regression models for fast responders and slow
responders. The models with the Segments + Pitch calculation and the Auditory
calculation are shown for fast responders and slow responders.
Factor                      Fast Responders   Slow Responders
Basic model factors:
    Participants            ✓                 ✓
    Initial sound class     ✓                 ✓
    Semantic category       ✓                 ✓
    UP                      Inhibition        Inhibition
    Duration                Inhibition        Inhibition
    1st Mora Frequency      Inhibition        Inhibition
    Word Frequency          Facilitation      Facilitation
Neighborhood density calculations:
    Segments + Pitch        Facilitation      Facilitation
    Auditory                Inhibition        Inhibition
Table 5.9: A summary of the regression models with two types of neighborhood density (facilitative and inhibitory) for fast responders and slow responders, Experiment 3.
Looking at the fast responders’ data first, as Table 5.7 showed, the Segments +
Pitch calculation yielded the highest R2, followed by the Segments calculation. The
Auditory calculation was also significant, although in the model combining the Segments +
Pitch calculation and the Auditory calculation its effect was marginal. In
this sense, the non-auditory calculations were better than the Auditory calculation. On
the other hand, as shown in Table 5.8, the slow responders’ data indicate that the Auditory
calculation yielded a higher R2 for slow responders than for fast responders. The
effect from the Segments calculation that was significant in the fast responders’ data
disappeared in the slow responders’ data. This means that neighborhood density
facilitation was stronger than neighborhood density inhibition among the fast responders
but the opposite direction of the effects was observed among the slow responders. In
other words, the neighborhood inhibitory effect was dominant among slow responders.
This suggests that lexical competition occurs with the auditory representation, and the
magnitude of this inhibitory neighborhood density effect is stronger for slow responders
than for fast responders. As Table 5.9 shows, two types of neighborhood density
(facilitative and inhibitory) coexist even after two groups were separated. This may
indicate that they had effects for all the participants.
The above regression analyses of the semantic categorization data for fast
responders and slow responders demonstrated that they performed very similarly in terms
of accuracy of target word recognition. Seven factors contributed to the semantic
categorization times for both fast responders and slow responders (Participants, Initial
Sound Class, Semantic Category, Uniqueness Point, 1st Mora Frequency, Duration, and Word
Frequency).
Initial Sound Classes of the target words also affected semantic categorization
times. For the fast responders, stops were categorized more quickly than fricatives and
nasals, and nasals were categorized more quickly than fricatives (F(2, 9967) = 213.752, p <
0.0001). Slow responders showed the same tendency (F(2, 9691) = 155.103, p <
0.0001). Tukey HSD multiple comparisons showed that all comparisons of sound classes
were significantly different for both fast responders and slow responders at p < 0.0001.
This pattern was also observed in the previous two auditory naming experiments.
Therefore, the participants in this semantic categorization experiment were also sensitive
to the duration of the word-initial sound.
Duration was also effective in this experiment: longer words were categorized
more slowly than shorter words.
The 1st mora frequency was also effective in the categorization times. The data
demonstrated that words with a higher 1st mora frequency were categorized less quickly
than words with a lower 1st mora frequency. A sound or sequence of sounds (in this
experiment, the mora) can induce either inhibition or facilitation. For example, if
listeners take advantage of probabilistic phonotactics, the effect is facilitative: words
beginning with a high-frequency first mora are categorized more quickly than those with
a low-frequency first mora. On the other hand, if they do not take advantage of
probabilistic phonotactics, the effect runs in the opposite direction because of word
competition: words with a high-frequency first mora are categorized less quickly and less
accurately than those with a low-frequency first mora. Since first mora frequency in this
experiment caused inhibition, word competition may have occurred while participants
performed the task.
A facilitative Word frequency effect was observed in the semantic categorization
experiment as observed in the auditory naming experiments: Frequent words are
recognized more accurately than infrequent words in a noisy condition.
Uniqueness point had a significant effect in this experiment. This effect has been
reported in English (e.g., Marslen-Wilson, 1984; Luce, Pisoni, & Manous, 1984; Tyler &
Wessels, 1983). However, the effect of the uniqueness point had not been reported in
experiments in Japanese (Amano and Kondo, 2001). The results of this experiment
certainly showed the effect of the uniqueness point. Figure 5.1 shows a graph of the
categorization times as a function of the number of the segments from the word-initial
point for a target word to become unique from the words in the lexicon.
The solid line and the dashed line plot the results of fast responders and slow
responders, respectively. Figure 5.1 shows that words with an earlier uniqueness point
were categorized more quickly than words with a later uniqueness point for fast
responders (F(4, 9965) = 7.33027715, p < 0.0001) and for slow responders (F(4, 9689) =
7.71918148, p < 0.0001). Tukey HSD multiple comparisons showed that, among fast
responders, words that become unique at the 3rd segment were categorized more quickly
than words that become unique at the 6th or 7th segment, and words that become unique
at the 5th segment were categorized more quickly than words that become unique at the
7th segment. For slow responders, words that become unique at the 3rd segment were
always categorized more quickly than words that become unique later than this segment,
and words that become unique at the 5th segment were categorized more quickly than
words that become unique at the 6th or 7th segment.
Uniqueness point affected the semantic categorization times in Japanese.
[Figure omitted: categorization time in ms (650–850) as a function of uniqueness point (UP = 3–7), with fast responders plotted as a solid line and slow responders as a dashed line.]
Figure 5.1: Categorization times as a function of the number of segments from word onset at which a given word becomes unique from the other words in the lexicon (UP).
Computational analyses in Yoneyama (2000) showed that the uniqueness point
may not be very effective in Japanese, because the UP does not occur before the offset of
many words in the Japanese lexicon. This pattern was also confirmed in the analysis of
the 700 target words: over 55% of the target words have no UP before their offset.
However, the reaction time data in this experiment showed that some words are distinguished
earlier than others in terms of the uniqueness point.
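The uniqueness point computation can be illustrated with a toy romanized lexicon (a few words from Table 5.3 plus invented neighbors; the actual calculation used the full Japanese lexicon):

```python
# Uniqueness point (UP): the 1-based segment position at which a word's
# initial substring no longer matches the onset of any OTHER lexicon word.
# The toy lexicon below is illustrative only.

def uniqueness_point(word, lexicon):
    """Return the segment index at which `word` becomes unique, or None
    if it is still ambiguous at its offset."""
    others = [w for w in lexicon if w != word]
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        if not any(w.startswith(prefix) for w in others):
            return i
    return None  # no UP before the word's offset

lexicon = ["namazu", "namida", "nasu", "kabure", "kabutomusi", "kabu"]
up = uniqueness_point("namazu", lexicon)  # diverges from namida at segment 4
```

Words such as kabu, whose onset is shared by kabure and kabutomusi through its final segment, illustrate the many Japanese words whose UP does not occur before the word's offset.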
5.4. Discussion
The main purpose of the categorization experiment was to test whether an
inhibitory neighborhood density effect would be observed among the categorization time
data. Vitevitch and Luce (1999) explained that if listeners perform a semantic
categorization task, the neighborhood density effect would show inhibition: words in a
dense neighborhood should be categorized less quickly than words from a sparse
neighborhood. This is based on the fact that neighborhood density is a measure of lexical
competition.
However, the response patterns of the participants were more complicated than
expected. The semantic categorization data showed that more than one neighborhood
density calculation was statistically significant. Further, two types of neighborhood
density (facilitative and inhibitory) coexist in the model.
Recall from the auditory naming experiments that a neighborhood density effect
can be realized either as a facilitative effect or as an inhibitory effect. The current
experiment shows that the neighborhood density effect gradually changes from
facilitation to inhibition in the timecourse of word recognition processing. If only the
calculations that yielded the highest R2 for these semantic categorization data are
considered, a comparison between fast responders and slow responders may also
demonstrate a transition from facilitation to inhibition. For fast responders, the strongest
neighborhood density effect is calculated by the Segments + Pitch calculation followed by
the Segments calculation. The Auditory calculation that shows inhibition is the least
explanatory calculation of the three. If we assume that R2 reflects the magnitude of the
effects among the three calculations, fast responders performed their task at an earlier
stage of the word recognition process, because the Segments + Pitch calculation, which
yielded the highest R2, showed a facilitative effect. However, the inhibitory neighborhood
density effect from the lexicon was already beginning to affect their performance. Slow
responders, on the other hand, naturally performed their task at a later stage. The results
suggest that they relied more on the Auditory calculation, which yielded the second
highest R2 and showed inhibition. This may indicate that the highest explanatory R2
gradually moved from the Segments + Pitch calculation to the Auditory calculation
during the timecourse of processing. Also, the
effect from the Segments calculation has gradually disappeared. The semantic
categorization data suggest a gradual transition of the neighborhood density effect from
facilitation to inhibition.
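The R² comparisons above amount to asking which neighborhood density predictor best fits the response times, and whether its regression slope is positive (inhibition: denser, slower) or negative (facilitation: denser, faster). The sketch below illustrates that logic with single-predictor least squares; the function name and toy numbers are illustrative assumptions, not the dissertation's actual analysis.

```python
def slope_and_r2(x, y):
    """Least-squares slope and R-squared for a single predictor.

    A negative slope (denser neighborhood -> faster response) corresponds
    to facilitation; a positive slope corresponds to inhibition.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sxx, (sxy * sxy) / (sxx * syy)

# Toy illustration: density under one calculation vs. categorization time (ms).
density = [2, 4, 6, 8, 10]
rt_facilitative = [820, 800, 780, 760, 740]   # denser -> faster (facilitation)
rt_inhibitory = [740, 760, 780, 800, 820]     # denser -> slower (inhibition)

b1, r2_1 = slope_and_r2(density, rt_facilitative)
b2, r2_2 = slope_and_r2(density, rt_inhibitory)
assert b1 < 0 and b2 > 0          # sign encodes facilitation vs. inhibition
assert abs(r2_1 - 1.0) < 1e-9     # perfectly linear toy data
```

Comparing the R² values of several such one-predictor models is one simple way to ask which calculation is the most explanatory for a given responder group.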
In summary, the results showed that neighborhood density effects occurred in the
semantic categorization experiment. There are three main findings. First, as observed in
the previous experiments, neighborhood density effects change during the timecourse of
word recognition. The facilitative neighborhood effect from the Segments + Pitch
calculation was stronger than the inhibitory neighborhood effect from the Auditory
calculation for fast responders whereas the latter was stronger than the former for slow
responders.
Second, categorization times were negatively correlated with neighborhood
density in the Segments calculation and the Segments + Pitch calculation: words from a
dense neighborhood were categorized MORE quickly than words from a sparse
neighborhood. However, categorization times were positively correlated with
neighborhood density in the Auditory calculation: words from a dense neighborhood were
categorized LESS quickly than words from a sparse neighborhood. The two types of
neighborhood density effects, from the Segments + Pitch calculation and the Auditory
calculation, were effective for both fast responders and slow responders.
Finally, two types of neighborhood density effects (facilitative and inhibitory)
coexist. A facilitative neighborhood effect is generally interpreted as an effect of
probabilistic phonotactics (Vitevitch & Luce, 1999), whereas an inhibitory neighborhood
effect is generally interpreted as word competition (Luce & Pisoni, 1998; Vitevitch &
Luce, 1999). As Luce and Large (2000) have claimed, the effects of probabilistic
phonotactics and neighborhood competition can be observed simultaneously. The results
of this experiment confirm this claim for Japanese. Moreover, since the neighborhood
density effect is a measure of lexical competition in current word recognition models,
these results provide a piece of evidence that lexical competition is also attested in Japanese.
CHAPTER 6
GENERAL DISCUSSION AND CONCLUSION
6.1. Introduction
This dissertation investigated two aspects of spoken word recognition: lexical
representation and lexical competition. Three experiments were conducted using the
same 700 Japanese target words, in an attempt to test neighborhood density effects
directly in experiments with different tasks.
§6.2 provides a summary of the experimental results. §6.3 proposes a model of
spoken-word recognition and word production that can account for these results.
Finally, conclusions are provided in §6.4.
6.2. Summary of Results
This section summarizes the results of the three experiments reported in this
dissertation. Factors other than neighborhood density are discussed in §6.2.1, followed
by the findings relating to neighborhood density in §6.2.2.
6.2.1. Other Effects
Many factors other than neighborhood density were included in the analyses of
the experiments in this dissertation, on the assumption that they affect processing time as
well as accuracy. Several of these factors were significant and revealed some interesting
tendencies. This section discusses the influence of factors other than neighborhood
density on spoken word recognition.
Table 6.1 shows the effects included in the analyses of the experiments. All of the
factors shown here had a significant effect on processing times in at least one of the
experiments. This supports the claim, made in the segmentation studies, that listeners
exploit any phonological information available to assist them in lexical access.
[Table 6.1 appears here; the cell entries could not be reliably recovered. Its columns include Word frequency, Initial sound class, Semantic category, Duration, 1st Mora, and Word class, for the fast and slow groups of Experiments 1-3 (processing time data) and Experiment 2 (word identification data).]
Table 6.1: Summaries of effects other than neighborhood density in three experiments. Processing time data (top) and accuracy data (bottom). (F = facilitative effect, I = inhibitory effect, H = higher accuracy, L = lower accuracy, NA = not applicable.)
Another interesting point is that the factors that are used for lexical access do not
necessarily affect accuracy data in the same way. For example, Duration is always
negatively related to processing time, whereas it does not affect accuracy. Word
frequency likewise shows that a factor that affects processing time does not necessarily
affect accuracy. In Experiment 2 (auditory naming in noise), Word frequency
affected accuracy whereas it did not affect processing time. On the other hand, in a noise-
free environment, Word frequency contributed to processing time, but not to accuracy.
Also, even if a factor contributes to both processing time and accuracy, it does not
necessarily do so in the same way. For example, the Initial sound class factor revealed an
effect of durational differences among sound classes (stops, nasals, and fricatives) in the
processing time data of Experiment 1 (auditory naming), whereas it showed an effect of
sonority differences among sounds in the word identification data of Experiment 2
(auditory naming in noise). Although this is evidence that listeners are
sensitive to the Initial sound class of words as one of the types of phonological
information exploited for lexical access, the data also show that listeners exploit
different characteristics of the Initial sound class.
One last factor to mention is the Uniqueness point. According to Marslen-
Wilson’s cohort theory (Marslen-Wilson, 1987; Marslen-Wilson & Tyler, 1980; Marslen-
Wilson & Welsh, 1978), the initial acoustic-phonetic information of a word presented to
the listener activates a “cohort” of lexical candidates that share word-initial acoustic-
phonetic information. Lexical candidates that are incompatible with ensuing top-down and
bottom-up information are successively eliminated from the cohort until only one word
remains, at which time that word is recognized. Because a word may be recognized when
the initial acoustic-phonetic information of the word is compatible with no other words in
the cohort, word recognition within the framework of cohort theory is said to be
“optimally efficient” in the sense that a listener may recognize a word prior to hearing the
entire word. Thus, a crucial concept in the cohort theory of word recognition is that of
the uniqueness point, or optimal discrimination point. The uniqueness point is the point,
measured from the beginning of the word, at which that word becomes distinct from all
other words in the lexicon. For isolated words, the uniqueness point defines the earliest
point, theoretically, at which a word can be recognized, although a word may be
recognized prior to its uniqueness point given sufficiently constraining contextual
information.
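Operationally, the uniqueness point can be computed by scanning a word’s segments left to right and stopping at the first prefix that no other lexical item shares. The sketch below is a toy illustration of that definition only; the miniature romanized lexicon (one character standing in for one phoneme) is an assumption for demonstration, not the dissertation’s lexicon.

```python
def uniqueness_point(word, lexicon):
    """Position (1-based, in phonemes) at which `word` diverges from every
    other word in `lexicon`; None if it is still not unique at its offset
    (i.e., it becomes unique only "after" the last segment)."""
    others = [w for w in lexicon if w != word]
    for i in range(1, len(word) + 1):
        if not any(w[:i] == word[:i] for w in others):
            return i
    return None

# Toy romanized lexicon (one character = one phoneme, for simplicity).
lexicon = ["katana", "katati", "kokoro", "kodomo", "hasi", "hasira"]
assert uniqueness_point("kokoro", lexicon) == 3   # unique before the last segment
assert uniqueness_point("katana", lexicon) == 5
assert uniqueness_point("hasi", lexicon) is None  # "hasira" still matches at offset
```

Classifying each word by whether the returned position falls before, at, or after its final segment reproduces the three-way grouping used in cohort theory.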
The role of the uniqueness point in the recognition of stimuli presented in
isolation has been demonstrated for nonwords in an auditory lexical-decision task
(Marslen-Wilson, 1984) and for specially selected words in a gating task (Luce, Pisoni, &
Manous, 1984; Tyler & Wessels, 1983). Thus, the concept of a uniqueness point seems
to be empirically justified.
To determine the extent to which words in isolation may be recognized prior to
their offsets, Luce (1986b) calculated uniqueness points, or optimal discrimination points,
for all words with more than two phonemes in a 20,000-word computerized lexicon. The
results of this analysis revealed that the frequency-weighted probability of a word’s
diverging from all other words in the lexicon prior to the last phoneme was only .39.
This finding suggests that an optimally efficient strategy of word recognition may be
severely limited.
Furthermore, Yoneyama (2000) conducted similar uniqueness-point analyses in
Japanese and found that the probability that a word would be distinguished from all other
words in the lexicon prior to the last phoneme was .49 even if its phonological
representation included the pitch accent patterns that help to distinguish words in
Japanese. Therefore, Yoneyama concluded that the concept of the uniqueness point
would also be of limited effectiveness in Japanese. A similar conclusion was drawn by
Amano and Kondo (2000).
However, the results of Experiment 3 (Semantic categorization experiment) show
that the Uniqueness point contributes to processing time: words that become unique
earlier were categorized more quickly than words that become unique later. In cohort
theory, words are classified into three groups: words that become unique before the last
segment, words that become unique at the last segment, and words that become unique
after the last segment. The results of Experiment 3 revealed no categorical difference
among these groups. Rather, target words (all of which were six segments long) that
become unique at the third segment were categorized more quickly and accurately than
target words that become unique at a later segment. Hence, the Uniqueness point data
show that the uniqueness point does have an influence on word recognition.
Yoneyama’s (2000) conclusion is right in the sense that
the uniqueness point can only provide a limited word recognition strategy, because less
than half of the lexical items have a uniqueness point. However, among these items,
some have an earlier uniqueness point than others - a difference which processes of
spoken word recognition exploit.
The uniqueness point can also be considered a measure of lexical competition. The
significance of this effect leads us to conclude that all words are aligned at their left
edges. This is something we may need to consider when developing spoken-word
recognition models.
6.2.2. Neighborhood Density Effects
6.2.2.1. Processing Time Data
Table 6.2 shows a summary of the neighborhood density effects on the processing
times of the three experiments. The calculations are ordered so that the data of the fast
group precede the data of the slow group in all experiments. In this way, the order of the
calculations may reflect the timecourse of word representations, since each calculation
reflects the word representation at a different point of processing.
Calculations: Segments | Segments + Pitch | Auditory
Exp 1: Fast (449 ms) and Slow (624 ms) showed facilitation; the assignment of effects to columns is not reliably recoverable.
Exp 2: Fast (670 ms¹) and Slow (810 ms) showed facilitation; the assignment of effects to columns is not reliably recoverable.
Exp 3 Fast (738 ms): Facilitation (R² = 0.001539) | Facilitation (R² = 0.002388) | Inhibition (R² = 0.000401)
Exp 3 Slow (815 ms): Facilitation (R² = 0.000651) | Facilitation (R² = 0.001106) | Inhibition (R² = 0.000934)
Table 6.2: Summary of the neighborhood density effects on processing times in three experiments. Effects in bold show the calculation that yielded the highest increase in R².
The table reveals a few general tendencies. In the auditory naming experiments
(Experiments 1 and 2), all of the participants (in both the fast and slow groups)
consistently showed neighborhood density facilitation. Fast namers primarily used the
Segments calculation, while slow namers primarily used the Auditory calculation. The
results of the semantic categorization experiment (Experiment 3) showed that multiple
neighborhood definitions had an effect on processing times for both fast responders and
slow responders. Two types of neighborhood density effects were observed (facilitative
and inhibitory). Models with two types of neighborhood density calculations (the
Segments + Pitch calculation and the Auditory calculation) both had an effect for fast
responders and slow responders. However, the effect from the Auditory calculation was
stronger for slow responders than for fast responders, while the effect from the Segments
+ Pitch calculation was stronger for fast responders than for slow responders.
6.2.2.2. Word Identification Data
Because participants did not make many mistakes in naming or categorizing
words in Experiments 1 and 3, only the accuracy data (word identification data) from
Experiment 2 are discussed here. Table 6.3 shows these accuracy data. Both fast namers
and slow namers identified words from dense neighborhoods less accurately than words
from sparse neighborhoods. This pattern was also observed in previous word
identification in noise experiments in English and Japanese (Luce & Pisoni, 1998; Amano
& Kondo, 1999). Therefore, neighborhood density also affects accuracy.
¹ Mean reaction times for fast and slow namers in Experiment 2 are calculated from the onset of the target words embedded in noise, not from the onset of the sound files.
[Table 6.3 appears here. For both the fast and slow groups, two of the three calculations (Segments, Segments + Pitch, Auditory) show lower accuracy for words from dense neighborhoods; the assignment of effects to columns is not reliably recoverable.]
Table 6.3: Summary of the neighborhood density effect in the word identification data of Experiment 2.
6.3. Proposal: A Model of Spoken-Word Recognition and Word Production
The purpose of this dissertation was to investigate two aspects of lexical access in
Japanese auditory word recognition. As shown in §6.2, the factors used in the three
experiments, including neighborhood density, showed curious patterns of facilitation and
inhibition that need to be explained. The neighborhood density effects that were
significant in these experiments were based on different representations: the Segments
calculation and the Segments + Pitch calculation were based on symbolic representations,
whereas the Auditory calculation was based on an auditory representation. Furthermore,
the results of Experiment 3 showed that the Auditory calculation, which showed a
neighborhood inhibitory effect, was based on an auditory representation that does not
have an internal structure. Since neighborhood density calculations that were based on
different representations (symbolic and auditory) both had significant effects on patterns in word
recognition, it is necessary to assume that both representations should be stored in the
lexicon and are involved in auditory word recognition. However, most current
recognition models only assume one of the two representations. In order to explain the
patterns observed in this dissertation, a model is proposed that is inspired by Plaut and
Kello (1999) and Jusczyk (1993), who proposed models of speech comprehension and
production based on infants’ behaviors (or simulations of their behaviors). Plaut and
Kello (1999) and Jusczyk (1993) both base their models on the same assumption:
symbolic representations develop out of a need for articulation.
Generally, adult word recognition models do not consider the processes that are
necessary for speech production. However, Experiments 1 and 2 in this dissertation were
auditory naming experiments, in which both word comprehension and word production
had to be used in order to perform the task. An explanation of the results of these
experiments therefore requires a model that accounts for the auditory naming experiments
(a production task) and the semantic categorization experiment (a semantic decision task)
at the same time.
Before moving on to this proposal, it will be helpful to review the basic features
of Plaut and Kello’s (1999) model.
6.3.1. Plaut & Kello (1999)
Plaut & Kello (1999) have proposed that phonology emerges from the interplay of
speech comprehension and production. In this view, phonology is an intermediate stage
that connects acoustics, articulation, and semantics. The model is based on connectionist
parallel distributed processing (PDP) principles, in which different types of information are
represented as patterns of activity over separate groups of similar, neuron-like processing
units. This section describes the basic features of their model.
Plaut & Kello’s model is shown in Figure 6.1. In their model, phonological
representations play a central role in mediating acoustic, articulatory and semantic
representations. An important aspect of this model is that phonological representations
are not predefined, but are learned by the system under the pressure of understanding and
producing speech. Representations of segments (phonemes) and other structures (onset,
rime, syllable) are not built-in; rather, the relevant similarity between phonological
representations at multiple levels emerges gradually over the course of development.
The system lacks any explicit structure corresponding to words. Instead, the
lexical status of certain acoustic and articulatory sequences is reflected only in the nature
of the functional interactions between these inputs and other representations in the
system. Phonological representation in the system is also symbolic, in order to store
temporal acoustic or articulatory information.
As shown in Figure 6.1, it is relatively straightforward to establish a relationship
between semantics and acoustic input, as well as between semantics and articulation,
because symbolic representation mediates the two spaces. However, the hardest part of
the model is how children learn the mapping between acoustic input and articulation.
From the perspective of control theory, Plaut & Kello have proposed that the mapping
from articulation to acoustics is what they call the forward mapping, whereas the reverse
is the inverse mapping. This forward model must be invertible in the sense that errors
observed in the acoustic input for a given articulation can be translated back into errors in
articulation. Plaut & Kello’s model implemented back-propagation within a
connectionist network.
In Perkell, Matthies, Svirsky, and Jordan (1995), a learned forward model plays a
critical role in providing the necessary feedback for learning speech production. Similarly,
in Plaut & Kello’s model, the forward model is used to convert acoustic and phonological
feedback (i.e., whether an utterance sounded right) into articulatory feedback, which is
then used to improve the mapping from phonology to articulation (see Plaut & Kello,
1999, for details about their forward model).
[Figure 6.1 appears here: Symbolic Representations linked to Articulation and Acoustic input, with an inverse mapping and a forward model between articulation and acoustics.]
Figure 6.1: A model of speech comprehension and production by Plaut & Kello (1999).
6.3.2. A Model of Spoken-Word Recognition and Word Production
Our proposed model is a modified version of Plaut and Kello’s model based on
the assumption that the comprehension-production system children acquire in their first
years must underlie the system used by adults. For example, in word recognition, adult
listeners use a segmentation procedure that has been acquired in infancy (e.g., Cutler,
1997). Therefore, the adult comprehension-production mechanism should be based on
the same mechanisms that children have acquired through language comprehension and
production.
A modified version of Plaut and Kello’s model is shown in Figure 6.2. This
model assumes two representations of words: symbolic representations and auditory
representations (Auditory patterns). Auditory patterns do not have any internal structure
and are considered more complete forms. Symbolic representations, on the other hand,
have an internal structure represented by linguistic units such as phonemes and are
considered assembled forms.
The necessity for two representations comes from the experimental results
showing that neighborhood density calculations that were based on two representations
(auditory and symbolic) both had significant effects on spoken word recognition. The
Auditory calculation that was based on auditory representations was the only calculation
that showed a neighborhood inhibitory effect. Moreover, auditory representations need to
be stored in the lexicon, since a neighborhood inhibitory effect is a measure of lexical
competition and only the Auditory calculation exhibited this effect on processing time.
The existence of auditory representations in the lexicon reflects other findings
with infants (Jusczyk, 1993) and adults (Goldinger, 1989, 1996, 1998; Johnson, 1997ab;
Luce & Lyons, 1998; Pallier, Colome, & Sebastian-Galles, 2001). Jusczyk (1993) has
proposed in his model that exemplars are stored in the lexicon and are connected to
semantic representations. Jusczyk (1993) also mentioned the necessity of symbolic
representation for speech production. Plaut and Kello (1999) and Jusczyk (1993) have
both implied that symbolic representation is mainly established in order to connect the
acoustic input (comprehension) with articulation (production). An acoustic-articulation
mapping introduced the necessity of symbolic representation because acoustic input and
articulation do not have a one-to-one correspondence. Furthermore, adult word
recognition studies have shown that listeners seem to store episodic auditory exemplars in
the lexicon (e.g., Goldinger, 1998; Johnson, 1997). In this dissertation, all the words have
only one exemplar. However, the model allows for the possibility of storing
multiple exemplars for each word.
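Under these assumptions, a lexical entry pairs one assembled symbolic form with any number of stored whole-word auditory exemplars. The data structure below is purely an illustrative sketch of that idea; the class and field names are invented for the example, and plain numeric vectors stand in for unanalyzed auditory patterns.

```python
from dataclasses import dataclass, field

@dataclass
class LexicalEntry:
    """One word: an assembled symbolic form plus whole-word auditory exemplars."""
    symbolic: str                                  # internally structured form, e.g. a phoneme string
    exemplars: list = field(default_factory=list)  # unanalyzed auditory patterns

    def add_exemplar(self, pattern):
        """Store another episodic trace of hearing this word."""
        self.exemplars.append(pattern)

lexicon = {"kokoro": LexicalEntry(symbolic="kokoro")}
lexicon["kokoro"].add_exemplar([0.12, 0.55, 0.31])  # stand-in feature vector
lexicon["kokoro"].add_exemplar([0.10, 0.57, 0.29])  # a second episodic trace
assert len(lexicon["kokoro"].exemplars) == 2
```

With only one exemplar per word, as in the experiments here, the `exemplars` list simply has length one; nothing in the structure prevents episodic accumulation.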
Our model includes two routes for adult language users to perform the auditory
naming task for real word stimuli. The first route is the one proposed in Plaut and Kello
(1999) for imitation. The system first derives acoustic and phonological representations
for an adult utterance during comprehension. It then uses the resulting phonological
representation as an input for generating a sequence of articulatory gestures. The other
route feeds the acoustic input to the Auditory patterns in order to activate exemplars
stored in this space. Then, a more complete sequence of gestures is executed for
articulation via symbolic representations.
[Figure 6.2 appears here: Semantics connected to both Symbolic Representations and Auditory patterns, which mediate between Articulation and Acoustic input.]
Figure 6.2: A model of spoken-word recognition and word production.
The motivation for proposing dual routes for articulation is also based on the
finding that different phonological forms (auditory patterns and symbolic representations)
affect picture naming times and word reading times by children. Barry, Hirsh, Johnston,
and Williams (2001) found that the estimated AoA (“Age of Acquisition”) of a word is a
better predictor of reaction times than lexical frequency in a speeded picture naming task
and a word reading task. They also found that words with early AoA had a smaller
repetition priming effect than words with later AoA when the same subjects were asked
to name pictures that they had either seen (read) before or not seen (read) before. They
interpret this interaction in terms of the phonological completeness hypothesis of AoA by
Brown and Watson (1987) in which early-acquired words have “a more complete
phonological representation” (p. 214), whereas later-acquired words, which are presumed
to be stored in a segmented fashion, require their stored phonology to be assembled for
production (which entails longer processing time).
In English auditory naming experiments, words are named more quickly than
nonwords. This finding is explained in this model by adopting Brown and Watson’s
phonological completeness hypothesis. Words may behave like early-acquired words and
nonwords like later-acquired words. If Brown and Watson’s
phonological completeness hypothesis is correct, words have “a more complete
phonological representation” than nonwords. An auditory representation of words is
considered “a more complete representation” here. Presumably, nonwords require using
symbolic representations to store temporal acoustic information, which implies that
symbolic representations for nonwords need to be assembled for production. Therefore
the naming time difference between words and nonwords reflects which phonological
representation (or which route in the model) is used for production.
This explanation is based on the assumption that participants use a whole-word
representation in order to perform the auditory naming task. Therefore, participants
retrieve a more completely stored phonological form for word stimuli. They are also
assumed to assemble a full representation for the articulation.
This model makes three assumptions. First, words have two representations
(auditory and symbolic). There are, therefore, two routes for performing the auditory
naming task. The second assumption is that temporal acoustic information may also be
stored as a symbolic representation. Symbolic representations are used to perform the
auditory naming task with nonwords. Lastly, the model allows two different kinds of
processing using symbolic representation. In the first case, participants have a
completely-assembled form before they pronounce the target. For example, if Japanese
participants say a CVCVCV target word, they plan all the gestures for articulation in
advance. This pattern is observed when participants take a route from auditory patterns
to symbolic representations. The completely-assembled form is also possible for
nonwords. The second case is that participants do not have a completely-assembled form
before they pronounce the target. In other words, participants name the target as soon as
acoustic information becomes available. In this case, symbolic representation is used to
hold the temporal acoustic information that is sent to articulation space. This routine
repeats itself until the end of the target word is reached. Both types of processing may be
possible for Japanese words. Japanese target words in our experiments were all
CVCVCV words. Therefore, Japanese participants could have pronounced the target
words by repeating three syllables/moras as soon as the acoustic information became
available, rather than retrieving an entire phonological representation from the lexicon. In
this case, the naming times reflect how quickly participants start producing a
syllable/mora, and not a word.
This model also allows multiple routes for word recognition. The acoustic input
is transmitted to both a Symbolic representation and an Auditory representation, both of
which are also connected to Semantic information. Recent studies have shown that two
different representations (auditory and symbolic) need to be stored in the lexicon (Luce &
Lyons, 1998; Pallier et al., 2001).
6.3.3. The Current Findings in Terms of the Proposed Model
This section discusses how the model proposed in §6.3.2 can explain the data of
the three experiments in this dissertation.
This explanation begins with the assumption that facilitative and inhibitory effects
are not mutually exclusive. That is, a facilitative effect on processing times is introduced
by matching the acoustic input (activation) onto the representation(s) stored in the
lexicon. On the other hand, an inhibitory effect on processing times happens when
language users attempt to select a word. In other words, if the experimental task does not
require selecting a word, an inhibitory effect would never be observed, while a facilitative
effect may surface.
6.3.3.1. Experiment 1: Auditory Naming
The purpose of the auditory naming task is to repeat the sequence of sounds as
quickly as possible. In order to perform this task, participants have to map the acoustic
information to a representation and to articulate the representation. Luce and Pisoni
(1998) have reported that the naming data showed a neighborhood inhibitory effect when
English participants performed the task with real word stimuli, suggesting that word
competition occurred. Luce and Pisoni reasoned that English participants need to
select a word in order to retrieve a phonological form for pronunciation.
The question here is whether word selection is a requirement for performing the
task. The auditory naming task has also been used with nonword targets. Of
course, nonwords are not stored in the lexicon, which means that participants need to use
a symbolic representation to hold the acoustic input in order to produce the
nonword stimuli. Therefore, word selection itself is not a requirement of the
task. This means that participants may be able to perform the task for
real words without word selection. In other words, words might be pronounced like
nonwords, without retrieving a stored phonological representation.
The results of Experiment 1 showed that fast namers started naming words before
their offsets. Even slow namers showed a tendency that they started naming words after
165
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. the offsets. Furthermore, their naming times were faster than the mean naming time for
American-English participants (Luce & Pisoni, 1998; Vitevitch & Luce, 1999). The
neighborhood density effects for fast namers and slow namers were all facilitative. Since
a neighborhood inhibitory effect is a measure of word competition, the results
indicate that words were not competing while these participants performed the task.
Neighborhood density calculations compare the similarity of words. However, the
participants in this study did not hear the entire words before they repeated them. This
means that, even though the number of neighbors was calculated in the Segments and the
Segments + Pitch calculations, these measures were not showing neighborhood density.
What did they show, then?
The different neighborhood density calculations show three types of information
regarding lexical access: (1) the kinds of phonological information that are used in lexical
access (segments and pitch accents), (2) how similarity is calculated (unit-by-unit or
whole-word) and (3) approximately when participants had access to such information.
Based on this information, these results show that Japanese participants did not retrieve a
more complete phonological form from the lexicon in order to perform the task. Rather,
the naming times obtained in this task reflect how long the Japanese participants took in
order to produce the first syllable/mora.
Table 6.4 reviews the characteristics of the three neighborhood density
calculations. Three calculations were made based on two factors: phonological
information and domain of similarity. Therefore, the effects of different neighborhood
density calculations may have shown which types of phonological information are used
for lexical access (segments and pitch accents) and how word similarity is calculated
(unit-by-unit or whole-word).
                                  Neighborhood Density Calculations
                                  Segments    Segments + Pitch    Auditory
Phonological      Segments        yes         yes                 yes
information       Pitch accent    no          yes                 yes
Domain of Similarity              unit        unit                whole-word

Table 6.4: Characteristics of three neighborhood density calculations.
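To make the three calculations in Table 6.4 concrete, here is a minimal Python sketch that counts neighbors under each definition. It is an illustration only: the one-segment-edit neighbor rule, the pitch-pattern match for Segments + Pitch, the whole-word proxy (same length, at most one mismatching segment), and the toy lexicon with its pitch labels are all assumptions made for this sketch, not the dissertation's actual procedures.

```python
# Illustrative sketch of the three neighborhood density calculations.
# The neighbor definitions and the toy lexicon below are assumptions.

def one_edit_apart(a, b):
    """True if segment strings a and b differ by exactly one
    substitution, insertion, or deletion (a unit-by-unit rule)."""
    if a == b:
        return False
    la, lb = len(a), len(b)
    if abs(la - lb) > 1:
        return False
    if la == lb:                        # one substitution
        return sum(x != y for x, y in zip(a, b)) == 1
    short, long_ = (a, b) if la < lb else (b, a)
    for i in range(len(long_)):         # one insertion/deletion
        if short == long_[:i] + long_[i + 1:]:
            return True
    return False

def density(target, lexicon, use_pitch=False, whole_word=False):
    """Count neighbors of target under one of the three calculations."""
    seg, pitch = target
    count = 0
    for seg2, pitch2 in lexicon:
        if (seg2, pitch2) == target:
            continue
        if whole_word:
            # 'Auditory': whole-word similarity; here a crude proxy --
            # same length and at most one mismatching segment overall.
            similar = (len(seg2) == len(seg)
                       and sum(x != y for x, y in zip(seg, seg2)) <= 1)
        else:
            similar = one_edit_apart(seg, seg2)
            if use_pitch:               # 'Segments + Pitch'
                similar = similar and pitch == pitch2
        count += similar
    return count

# Toy lexicon: (segment string, pitch pattern); pitch labels are invented.
lexicon = [("kodomo", "LHH"), ("kodoma", "LHH"), ("kodomi", "LHL"),
           ("kodom", "LHH"), ("kokoro", "LHL")]
target = ("kodomo", "LHH")
print(density(target, lexicon))                     # Segments: 3
print(density(target, lexicon, use_pitch=True))     # Segments + Pitch: 2
print(density(target, lexicon, whole_word=True))    # 'Auditory' proxy: 2
```

Under these toy definitions the same target word gets three different neighbor counts, which is the sense in which the three calculations can dissociate in the experiments.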
The last kind of information is explained by the relationship between
neighborhood calculations and the timecourse of lexical access. The results of
Experiment 1 showed that the Segments calculation and the Segments + Pitch calculation
played a role in the responses of fast namers while the Segments + Pitch and the Auditory
calculations had an effect on the responses of slow namers. Moreover, fast namers
started naming before the offset of words. This means that the Segments calculation was
used first and the Auditory calculation was used last in the timecourse of lexical access
processes.
Participants in Japanese word recognition experiments have been reported to be
sensitive to pitch information in word recognition (Cutler & Otake, 1999), which suggests
that segmental information and word-prosodic information are used in word recognition.
When might these participants not be sensitive to word-prosodic pitch information?
Table 6.5 shows relations between three different calculations and the stages of mapping
the acoustic input to word representations.
Acoustic input    Neighborhood Calculations
ko°               Segments
ko°do'            Segments + Pitch
ko°do'mo1         Auditory

Table 6.5: Relationships between the acoustic input and neighborhood density calculations.
The first mora of the target words is the only interval during which pitch accent
information is not yet clear2. At this point, segmental information is the only reliable
information participants can use to produce the word. This means that participants may
start naming words after they process the first mora of the target words. However, by the
time they hear the second mora, pitch accent information is clearly available to
participants. When they hear the first two moras, they know exactly whether the word
begins with HL or LH pitch accent patterns for CVCVCV target words. Therefore, the
Segments + Pitch calculation becomes the best predictor for naming times. Once
participants hear the entire word, the Auditory calculation becomes the best predictor
because now participants are able to retrieve a more complete sequence of articulatory
gestures.
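The timecourse just described (first mora: Segments; first two moras: Segments + Pitch; whole word: Auditory) can be summarized in a small sketch. This is my schematic of the claim above, not a model implementation; the function name and the three-mora default are illustrative.

```python
# Schematic of which calculation best predicts naming times as a
# function of how much of a CVCVCV target word has been heard.

def best_predictor(moras_heard, word_length=3):
    if moras_heard <= 0:
        return None
    if moras_heard == 1:
        return "Segments"            # pitch accent not yet clear
    if moras_heard < word_length:
        return "Segments + Pitch"    # HL vs LH now distinguishable
    return "Auditory"                # whole word available

print([best_predictor(n) for n in (1, 2, 3)])
# ['Segments', 'Segments + Pitch', 'Auditory']
```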
In sum, Japanese participants start naming as soon as they process the first mora
of the target words. Different neighborhood density calculations show the timecourse of
the auditory word recognition system. Next, the means of realizing different stages of
lexical accesses in this model is discussed.
Fast namers were able to start naming the target words just based on the
segmental information in the first mora. They also used pitch accent information when it
2 Cutler & Otake (1999) reported that Japanese listeners can perceive whether the target words begin with a high pitch or a low pitch even if they only hear the first mora in their gating task. However, evidence from two different sources suggests that Japanese listeners do not fully exploit the benefit of pitch accent patterns at this stage. First, Cutler & Otake's study also showed that listeners' performance accuracy is much higher when they hear fragments including two morae. Second, the results of similarity judgments on pitch accent patterns in Chapter 2 suggest that Japanese listeners were not able to determine pitch accent patterns for monomoraic targets accurately.
becomes available. Figure 6.3 shows participants' performance in Experiment 1
(Auditory naming), in which they started naming the words after they had heard only part
of the word, exploiting segmental information only. Based on the fact that
participants started naming words right after they processed the first mora, they had to
continuously process the acoustic information in order to articulate the three-mora target
words.
When the participants hear the first mora, all the words beginning with /ko/ would
be activated. At this moment, pitch accent cannot be considered, so the acoustic
information is matched with many words in the lexicon. Then, /ko/ is converted into a
sequence of gestures to produce the first mora. The naming times were recorded at the
beginning of the first mora. In order to complete the task, participants needed to repeat
the same routine until the end of the target word. In other words, participants needed to
assemble a sequence of gestures mora by mora.
[Figure 6.3 here: Acoustic input mapped onto Symbolic Representations (activated /ko/-initial candidates, including /ko°do'mo1/), which drive Articulation.]
Figure 6.3: Participants’ performance in Experiment 1 (Auditory naming) in which they started naming the words after they had heard only part of the word by exploiting only segmental information.
Consider next the case where the Segments + Pitch calculation had a significant effect.
Figure 6.4 shows participants' performance in Experiment 1 (Auditory naming) in which the
participants started naming words after they had partially heard the word by exploiting
segmental and word-level prosodic (pitch accent patterns) information. Since
participants were able to exploit both kinds of information (unlike in Figure 6.3), the
words that have the same first two moras with an LH pitch accent pattern were mapped
with the acoustic input (probably 2 moras long). As the participants started naming the
words before their ending, articulatory gestures must have been assembled as the acoustic
input became available. Since the participants know a sequence of gestures for a word
like /ko°do'mo1/, they should have been able to produce the target word more naturally.
[Figure 6.4 here: Acoustic input mapped onto Symbolic Representations (candidates sharing the first two moras, e.g. /ko°do'mo1/, /ko°do'ku1/), which drive Articulation.]
Figure 6.4: Participants’ performance in Experiment 1 (Auditory naming) in which the participants started naming the words after they had partially heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information.
Figure 6.5 represents the scenario for the Auditory calculation; it shows
participants’ performance in Experiment 1 (Auditory naming) in which the participants
started naming the words after they had completely heard the word by exploiting
segmental and word-level prosodic (pitch accent patterns) information. The effectiveness
of the Auditory calculation indicates that participants heard the entire words. Recall that
the acoustic information is transmitted to Auditory patterns and Symbolic representations
simultaneously.
If the acoustic information was perceived as three moras followed by a pause, only
one representation was activated in Auditory patterns. Then, a complete sequence of
gestures for /ko°do'mo1/ was executed via Symbolic representations. A neighborhood
facilitative effect was observed in the Auditory calculation, since the acoustic information
activated auditory representations. Since no word selection is conducted in Auditory
patterns, the neighborhood effect was facilitative. Here, /ko°do'mo1/ is pronounced in a
more complete way.
[Figure 6.5 here: Acoustic input mapped onto Auditory patterns (a single matched exemplar, /ko°do'mo1/) and Symbolic Representations, which drive Articulation.]
Figure 6.5: Participants’ performance in Experiment 1 (Auditory naming) in which the participants started naming the words after they had completely heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information.
As we have seen, the proposed model was able to explain the data obtained in
Experiment 1.
6.3.3.2. Experiment 2: Auditory Naming in Noise
Experiment 2 was an auditory naming experiment in noise with a secondary task.
This experiment collected naming times as well as identification accuracy. Consider the
patterns for slow namers in this model. Figure 6.6 shows participants’ performance in
Experiment 2 (Auditory naming in noise) in which the participants started naming the
words after they completely heard the word and exploited segmental and word-level
prosodic (pitch accent patterns) information.
This experiment provided curious patterns for naming times and word
identification accuracy. As in Experiment 1, the naming times were taken to be the
processing times between the onset of the target words embedded in noise and the onset
of the named targets. Word identification accuracy was analyzed on the basis of the
named targets. Different neighborhood density calculations were chosen for naming
times and word identification accuracy: Auditory calculation for naming times and the
Segments + Pitch calculation for word identification accuracy. The directions of effects
were opposite: a facilitative effect for naming times and an inhibitory effect for word
identification accuracy. How can the model account for these patterns?
[Figure 6.6 here: Acoustic input mapped onto Auditory patterns and Symbolic Representations (activated candidates such as /ko°do'mo1/, /to°ro'ro1/, /a°na'go1/, /e°no'gu1/), which drive Articulation.]

Figure 6.6: Participants' performance in Experiment 2 (Auditory naming in noise) in which the participants started naming the words after they had completely heard the word by exploiting segmental and word-level prosodic (pitch accent patterns) information.
In a noise-free condition as in Experiment 1, participants can match the acoustic
information onto symbolic representations directly. More than 98% of 700 targets were
produced correctly in an auditory naming task under noise-free conditions. Because of
noise, however, slow namers in Experiment 2 performed the task after they completely
heard the word.
The effect of neighborhood density in the Auditory calculation suggests that
auditory exemplars were mapped onto the acoustic input of the target word. In the noise-
free condition, only one word was completely matched with the acoustic input in
Auditory patterns, so it was simply sent to Symbolic representations in order to retrieve a
sequence of gestures for articulation. However, in the noisy condition, there were
multiple words that consisted of various sounds which were mapped onto the acoustic
input - even if the auditory patterns helped shape the words. The noise created some
mismatches between the actual acoustic input and the mapped input. In order to produce
the words, the words that were mapped to the acoustic input needed to be sent to
Symbolic representations. In Symbolic representations, many words were activated based
on the transmitted information from Auditory patterns. As shown in §6.3.3.1, many
words were activated at Symbolic representations for fast namers. A crucial difference
between those cases and this one is that, here, the activated words in Symbolic
representations do not share the exact same mora. As shown in Figure 6.6, the most similar
neighbors to /kodomo/ do not share the initial mora. In order to complete the
experimental task, participants needed to make a decision about which word they had
heard. Recall that an inhibitory effect is assumed to be the result of word selection,
which means that the neighborhood density effects at Symbolic representations (the
Segments calculation and the Segments + Pitch calculation) will induce inhibitory effects.
Furthermore, participants tended to listen to the entire word before naming it, exploiting
both segmental information and pitch accent information. Because of
this, the Segments + Pitch calculation accounted for the data more satisfactorily than the
Segments calculation.
For the naming times, however, no word selection process was involved. The
activated auditory exemplars in Auditory patterns were simply transmitted to Symbolic
representations. Therefore, the Auditory calculation of neighborhood density showed
facilitation, not inhibition.
Two auditory naming experiments (Experiments 1 and 2) provided the following
findings. First, it is a reasonable assumption that the selection of a word causes a
neighborhood inhibitory effect. Neighborhood density effects are facilitative only if word
selection is not involved. This observation was valid for Auditory patterns and Symbolic
representations, which are used for acoustic mapping. Since word selection was not
involved here, a neighborhood inhibitory effect should not be observed in our data. The
acoustic information was just converted to the symbolic representation in order to execute
a sequence of articulatory gestures. Therefore, there is a discrepancy between English
and Japanese auditory naming experiments with word targets. In terms of this
framework, the English participants in Luce and Pisoni (1998) had to make a word
selection in order to perform the task. Experiment 2 showed that Japanese participants
also had to make a decision about the word form. More details on the discrepancies
between Japanese and English will be presented in §6.3.4.
Secondly, it is possible that word targets are pronounced in the same way as
nonwords via Symbolic representations. In §6.3.3.1, dual routes for articulation explain
the difference between words and nonwords. However, Japanese participants in this
experiment showed that listening to the entire word target was not a requirement to
perform the task. At earlier stages, participants were able to imitate the acoustic input.
Thus, words are named in two different procedures (via Auditory patterns for a more
complete representation, or via Symbolic representations that require an assembling of
articulatory gestures), whereas nonwords have only one route for articulation via
Symbolic representations.
Third, different neighborhood density calculations show three types of
information regarding lexical access: (1) the kinds of phonological information that are
used for lexical access (segments and pitch accents), (2) how word similarity is calculated
(unit-by-unit or whole-word) and (3) approximately when the participants executed their
articulation. In the timecourse of lexical access, word-level prosodic information for
word activation becomes available only after segmental information becomes available.
Therefore, the patterns observed here are supported by a recent study by Cutler and Otake
(2002), which claims that words are activated based on segmental information before
word-level prosody constrains word segmentation.
However, one thing to consider is the possibility that there may be mixtures in
some data sets - with some participants sometimes following one naming path while the
rest of the participants follow the other.
6.3.3.3. Experiment 3: Semantic Categorization
The semantic categorization experiment requires lexical access because
participants must retrieve the meaning of words in order to decide whether or not the
words the participants hear belong to an assigned semantic category. The results showed
that two types of word competition emerged in this task: a neighborhood density effect
and the effects of initial cohort size and uniqueness point. In this framework, the
acoustic input is connected to both Auditory patterns (whole-word auditory patterns) and
Symbolic representations (unit-by-unit symbolic representations). The neighborhood density
effect is based on similarity of words, whereas the effects of initial cohort size and
uniqueness point are based on similarity of units, such as phonemes. Furthermore, both
facilitative and inhibitory neighborhood density effects were observed in this experiment.
How could this have happened?
[Figure 6.7 here: Acoustic input mapped onto Symbolic Representations and Auditory patterns, which connect to Semantics.]
Figure 6.7: Participants’ performance in Experiment 3 (Semantic categorization).
Consider fast responders first. Figure 6.7 shows participants’ performance in
Experiment 3 (Semantic categorization). As in auditory naming experiments, it may be
assumed that the different neighborhood density calculations show three types of
information with respect to lexical access: (1) the kinds of phonological information that
are used for lexical access (segments and pitch accents), (2) how word similarity is
calculated (unit-by-unit or whole-word) and (3) approximately when the participants
executed their articulation. The Segments calculation and the Segments + Pitch
calculation show processes before participants hear the entire word, and the Auditory
calculation shows processes after they have heard the entire word. For example, if
segmental information is available, participants may process the acoustic input in terms of
unit-by-unit similarity until word-level prosodic information becomes available. In this
model, three different neighborhood density calculations offer windows into three
different stages of the word recognition process.
The two neighborhood calculations based on Symbolic representations (the
Segments calculation and the Segments + Pitch calculation) showed facilitative effects.
This means that participants analyzed the acoustic input based on segments as well as
segments + pitch. Recall that first mora frequency is interpreted as initial cohort size, so
it makes sense that the initial cohorts were activated based on the first mora. The initial
cohort size showed an inhibitory effect, which should have had an influence on the
Segments calculation. The other calculation based on Symbolic representations was the
Segments + Pitch calculation. In this case, once pitch accent information is available,
segmental information as well as pitch accent information is used in the subsequent
reduction of the cohort. Therefore, both segment-based neighborhood calculations would
have an effect. The effects related to the Cohort theory demonstrate that the acoustic
input was analyzed in terms of segments using Symbolic representations.
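The cohort constructs invoked above (initial cohort size as first-mora frequency, and the uniqueness point) can be sketched as follows. The mora-tuple lexicon and the exact definitions are illustrative assumptions for this sketch, not the dissertation's actual counts.

```python
# Sketch of two Cohort-theory measures over a toy mora-based lexicon.
# Words are tuples of moras; the lexicon contents are invented.

def initial_cohort(target_moras, lexicon):
    """Words whose first mora matches the target's first mora
    (the target itself is counted as a cohort member)."""
    return [w for w in lexicon if w[0] == target_moras[0]]

def uniqueness_point(target_moras, lexicon):
    """1-based index of the first mora at which the target's prefix
    matches no other word, or None if it never becomes unique."""
    for i in range(1, len(target_moras) + 1):
        prefix = target_moras[:i]
        others = [w for w in lexicon
                  if w != target_moras and w[:i] == prefix]
        if not others:
            return i
    return None

lexicon = [("ko", "do", "mo"), ("ko", "do", "ku"), ("ko", "ko", "ro"),
           ("to", "mo", "da", "chi")]
target = ("ko", "do", "mo")
print(len(initial_cohort(target, lexicon)))   # 3 words begin with /ko/
print(uniqueness_point(target, lexicon))      # unique only at mora 3
```

On this toy lexicon, /kodomo/ is not distinguished from /kodoku/ until its final mora, which parallels the observation in the text that many targets lack an early uniqueness point.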
The Auditory calculation also had an effect in Experiment 3. This calculation is a
measure of whole-word similarity based on Auditory patterns. This means that
participants were sensitive not only to unit-by-unit similarity of words but also to whole-
word similarity. As shown in Figure 6.7, the acoustic input is transmitted simultaneously
to Auditory patterns and Symbolic representations. However, whole-word similarity
does not come into play until the target words are fully recognized. Accordingly, the
cohort effects from Symbolic representations were observed earlier than the
neighborhood density effect. As mentioned in Chapter 2, nearly 55% of the target words
do not have a uniqueness point before the last segment. Although most of the targets
were still not recognized, some target words took advantage of the cohort effects and
were recognized at this point.
If the targets had not been recognized by this point, the information about activated
words in Symbolic representations (with and without pitch accent patterns) was
transmitted to Auditory patterns, where neighbors were activated by the acoustic input.
Recall that the acoustic information was transmitted to Auditory patterns while cohort
reduction occurred. Word selection was therefore based not only on segment-based
cohort reduction, but also on the overall auditory impression of the words. Therefore, the
words that were activated in both representations should have reached a higher activation
level than would otherwise have been possible, enabling word recognition. Now word
decision is made not at Symbolic representations but
at Auditory patterns, yielding a neighborhood inhibitory effect in Auditory patterns.
Since Symbolic representations are not used for lexical selection, the two other
neighborhood density effects were still facilitative. Therefore, the three different
neighborhood density calculations produced significant effects in this experiment; the
Segments and the Segments + Pitch calculations produced facilitative effects while the
Auditory calculation produced an inhibitory effect.
Two types of word competition emerged in this experiment. However, they did
not operate simultaneously in the timecourse of the categorization task. A word
competition effect from cohort reduction emerged through the analysis of acoustic
information in terms of unit similarity at Symbolic representations. The neighborhood
density effect emerged from analyzing the same acoustic information in terms of whole-
word similarity at Auditory patterns. The participants selected words based on Auditory
patterns with the information from Symbolic representations.
For slow responders, the basic mechanism is the same as for fast responders except
that the Auditory calculation had a stronger effect for slow responders.
In sum, the results of three experiments in this dissertation were satisfactorily
explained by the model proposed in §6.3.2. In §6.3.4, findings from previous
experiments will be reconsidered in light of the proposed model.
6.3.4. Previous Findings in Terms of the Proposed Model
This section will look at previous findings in neighborhood experiments with
respect to the proposed model. It will begin by first looking at the results of English
auditory naming experiments with word targets (Luce & Pisoni, 1998; Vitevitch & Luce,
1999). Section §6.3.4.2 will further investigate the results of a Japanese lexical decision
experiment that did not show a neighborhood inhibitory effect (Amano & Kondo, 1999).
Finally, implications from the current word recognition theories will be reinterpreted in
terms of this model in §6.3.4.3.
6.3.4.1. Auditory Naming Experiments with Word Targets in English (Luce & Pisoni, 1998; Vitevitch & Luce, 1999)
There are some important discrepancies between English and Japanese naming
data. Japanese naming data showed that word selection was conducted only under noisy
conditions because the participants were in a situation where they had to choose one of a
few candidate words in order to complete the task. In fact, in a noise-free condition
(Experiment 1), Japanese participants did not show an inhibitory effect when they did not
need to choose the word.
On the other hand, neighborhood experiments with an auditory naming task have
shown that neighborhood density is a facilitative effect with nonwords and is an
inhibitory effect with words (Luce & Pisoni, 1998; Vitevitch & Luce, 1999). The results of
Luce and Pisoni (1998) may be accounted for by recognizing that the English participants in the
study had to choose single word forms in order to complete the task even in a noise-free
condition. In the proposed model, monosyllabic auditory patterns quickly become
activated and an auditory exemplar for the target word is selected. Then, the selected
word is converted into a sequence of gestures via Symbolic representations. What, then,
might cause such a situation?
In a naming task, the participants’ task was to repeat monosyllabic words (CVC in
Luce & Pisoni, 1998 and CVCC in Vitevitch & Luce, 1999) as quickly and as accurately
as possible. In order to perform the task, they have to know exactly which syllable they
are going to name in the word they hear. However, because of the structure of the
English lexicon and the usage of English words, choosing a syllable is practically
equivalent to selecting a word. Cutler & Carter (1987) reported that 64% of the content
words in the London-Lund corpus are monosyllabic words. It is also known that the
frequency of monosyllabic words is very high in English. Under these conditions, for
English participants, choosing the syllable they minimally need to know in order to
perform the task is equivalent to selecting that word in the lexicon. English participants
were in the situation where they actually decided the word for the task. This happened
because the syllable they needed to know to perform the task was always a high frequency
word in English.
For Japanese participants, however, the first syllable/mora is a part of the target
word. The Japanese participants in this study learned in a practice session that they
would hear CVCVCV words. What they had to know is what the first syllable/mora of
the target words was. Monosyllabic/monomoraic CV words are rare in the Japanese
lexicon, so individual syllables would be unlikely to trigger a search through the lexicon
for corresponding words. The Japanese participants did not need to select the word in
order to perform this task.
The discrepancy between English and Japanese naming time data may be caused
by a relationship between the target words, the units for production (syllables) and the
lexical structure including frequencies of words.
6.3.4.2. Lexical Decision Experiment in Japanese (Amano & Kondo, 1999)
This section discusses another discrepancy between English and Japanese data in
the lexical decision experiments. In this task, participants were asked to decide whether
the stimuli were words or nonwords. The task, therefore, requires discrimination between
words and nonwords. Luce and Pisoni (1998) and Vitevitch and Luce (1999) both
reported that the decision times showed a neighborhood inhibitory effect. In Japanese,
Amano and Kondo (1999) conducted a similar lexical decision experiment, but the results
did not show a neighborhood inhibitory effect. Amano and Kondo (1999) accounted for
the discrepancy between lexical decision data and word identification data by claiming
that the neighborhood requires some amount of time to be activated enough to compete
with a target word (p. 1666). Since Amano & Kondo (1999) found a neighborhood
inhibitory effect in a word identification experiment, there might be a reason why it did
not show up in the lexical decision experiment3. It has been assumed here that the
lexical decision task requires lexical selection: one word is selected from the activated
words in the lexicon in order to perform the task. However, Amano and Kondo’s
3 The neighborhood density calculation used in Amano & Kondo (1999) was based on moras, not on segments.
explanation may imply that Japanese participants managed to perform the task without
selecting the word from activated words in the lexicon. The rest of this section explores
this possibility.
The lexical decision task requires using words and nonwords for stimuli.
According to the results of English neighborhood density experiments, the task
necessitates the activation of lexical items in memory to categorize the stimulus
successfully, even when the stimulus is a nonword. In other words, to make a lexical
decision on both words and (phonotactically legal) nonwords, participants have to make a
decision on the representation where an analysis of words and nonwords is possible.
In this model, two types of representations are assumed to exist in the lexicon:
Auditory patterns and Symbolic representations. There are therefore three lexical
decision strategies participants could take, depending on which representation they use
for their decision.
In the first two strategies, participants select a word from activated words in the
lexicon. In the first strategy, Japanese participants could perform the task by selecting a
word based on Auditory patterns, because the lexical decision task does not require
analyzing the internal structure of the word. Neighborhood density is based on whole-
word similarity, so if word selection occurs in Auditory patterns, neighborhood
density would show an inhibitory effect. This neighborhood inhibitory effect was the
prediction in Amano & Kondo's experiment. This means that auditory patterns are fully
activated and word selection should be made in this space. From the results of
Experiments 1 and 2 in this dissertation, whole-word activation should occur later in the
timecourse of word recognition. However, in this experiment, both words and nonwords
are used as targets. Since Auditory patterns are not able to deal with nonwords, it
seems unlikely that participants would take this strategy.
The second strategy is that participants base their decisions on symbolic
representations. Recall that symbolic representations of words are stored in the lexicon.
At the same time, symbolic representations are needed to hold temporary acoustic
information. Since nonwords are not stored in the lexicon, a symbolic representation
needs to be selected. Symbolic representations can be used for both words and nonwords,
so participants may focus on symbolic representations. English lexical decision
experiments have consistently shown that there is an effect of neighborhood density for
words while there is an effect of probabilistic phonotactics for nonwords. This difference
depends on whether participants performed word selection or not. English participants
may have taken this strategy within the proposed model.
The last strategy is that participants perform their task based on the probabilistic
phonotactics in the Symbolic representation. Amano & Kondo used CVCVCVCV words
with an LHHH pitch accent pattern. In other words, participants performed the task based
on whether a sequence of 4 moras with a proper pitch accent pattern occurs word-initially
or not. If this was the case, the participants did not need to consider other words. When
the acoustic input does not match any symbolic representation, the target could be
identified as a ‘nonword.’
The response patterns in the lexical decision experiment in Amano & Kondo's
(1999) study did not show a neighborhood inhibitory effect, suggesting that the data
support the third strategy: participants made a decision based on probabilistic
phonotactics.
6.3.4.3. IMPLICATIONS FOR CURRENT RECOGNITION MODELS
So far, the model has accounted for the results of the three experiments in this
dissertation. It has also been used to account for discrepancies between English and
Japanese predictions in auditory naming and lexical decision experiments.
This next section will consider the implications of this work for current word
recognition theories. In particular, it will address the relationships among word
competition, representations and levels of processing.
Word recognition models usually assume both sublexical and lexical levels of
processing. Effects at both levels have been demonstrated in a
number of studies that investigated the processing of words and nonwords varying in
probabilistic phonotactics (defined as the positional frequencies of segments and
biphones) and lexical competition (neighborhood density). Sublexical effects are
supported by findings in Pitt and Samuel (1995) and Vitevitch and Luce (1998, 1999),
who showed that the frequency of the sound components of spoken stimuli facilitates
processing. Sublexical frequency effects (also known as probabilistic phonotactics) play
a part in the recognition process, as do the well-documented effects of lexical competition
(Gow & Gordon, 1995; McQueen et al., 1994; Norris et al., 1995; Shillcock, 1990;
Tabossi et al., 1995; Vitevitch & Luce, 1999; Vroomen & de Gelder, 1995; 1997;
Wallace et al., 1995a; Wallace et al., 1995b; Zwitserlood, 1989; Zwitserlood &
Schriefers, 1995). At the same time, competition among lexical representations inhibits
processing (Neighborhood density effect: Luce et al., 2000; Luce & Pisoni, 1998;
Vitevitch & Luce, 1999; Amano & Kondo, 1999).
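To make the sublexical measure concrete, the positional frequencies just defined can be computed directly from a word list. The following is a minimal sketch over an invented toy lexicon; the function names and the simple averaging scheme are assumptions for illustration, not the computation used in the studies cited.

```python
# Sketch of "probabilistic phonotactics" as positional frequencies of
# segments and biphones, computed over a toy, hypothetical lexicon.
from collections import defaultdict

def positional_frequencies(lexicon):
    """Count each segment and biphone at each position across the lexicon."""
    seg_counts = defaultdict(int)   # (position, segment) -> count
    bi_counts = defaultdict(int)    # (position, biphone) -> count
    seg_totals = defaultdict(int)   # position -> total segments observed
    bi_totals = defaultdict(int)    # position -> total biphones observed
    for word in lexicon:
        for i, seg in enumerate(word):
            seg_counts[(i, seg)] += 1
            seg_totals[i] += 1
        for i in range(len(word) - 1):
            bi_counts[(i, word[i:i+2])] += 1
            bi_totals[i] += 1
    return seg_counts, seg_totals, bi_counts, bi_totals

def phonotactic_probability(word, lexicon):
    """Average positional probability of the word's segments and biphones."""
    seg_c, seg_t, bi_c, bi_t = positional_frequencies(lexicon)
    probs = [seg_c[(i, s)] / seg_t[i] for i, s in enumerate(word)]
    probs += [bi_c[(i, word[i:i+2])] / bi_t[i] for i in range(len(word) - 1)]
    return sum(probs) / len(probs)
```

Here each character stands in for one segment; a mora-based analysis would simply tokenize words into moras before counting.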
Furthermore, Vitevitch and Luce (1999) have shown that different experimental
tasks could change the focus of participants in word processing. These results are
consistent with the hypothesis that neighborhood density has an inhibitory effect when
participants mainly focus on lexical processing, while probabilistic phonotactics are
facilitative when participants mainly focus on sublexical processing.
Because overlapping words are built from the language's frequent segments, there is
a strong positive correlation between neighborhood density and probabilistic
phonotactics. Typically, as the number of overlapping words increases, the frequencies of
the segments that make up the overlapping words also increase. Given this correlation, a
neighborhood facilitative effect can be interpreted as an effect of probabilistic
phonotactics on sublexical processing.
Although neighborhood density and probabilistic phonotactics are highly
correlated, they are still separable. Luce and Large (2001) found that the results from a
speeded same-different task revealed simultaneous facilitative effects from phonotactics
and inhibitory effects from competition (neighborhood density) for real-word
stimuli.
The above results have suggested that word recognition models need to explain
two types of word competition (segment-based and whole-word based) and two levels of
processing (sublexical and lexical). In fact, two types of word competition were
simultaneously observed in Experiment 3, the semantic categorization experiment.
The claim here is that lexical processing and sublexical processing are based on
different representations in the proposed model. Lexical processing is based on Auditory
patterns and sublexical processing is based on Symbolic representations. This model
does not assume any internal linguistic structure for Auditory patterns. In contrast, its
Symbolic representations can be characterized in terms of linguistic units, such as
phonemes, moras and syllables. Therefore, lexical processing occurs when participants
treat words as a whole unit, whereas sublexical processing happens when participants
treat words as an assembled unit (such as phonemes, moras and syllables). In other
words, in lexical processing, participants perceive words as complete forms whereas in
sublexical processing, they perceive words as assembled forms.
This model also accommodates the two types of word competition, grounding them in different
representations. Neighborhood density is based on Auditory patterns whereas cohort
reduction and other segment-based word competition are based on Symbolic
representations.
According to the results of a series of experiments in Vitevitch and Luce (1999),
participants’ focus is highly related to experimental tasks. For example, a speeded same-
different task focuses participants more on sublexical processing, whereas a semantic
categorization task focuses them more on lexical processing.
In this model, participants focus more on one representation than the other, and
this focus is determined by the experimental task. Also, words are processed in both
representations whereas nonwords are only processed in symbolic representations. A
symbolic representation is the only representation in which participants can hold the
acoustic input for nonwords.
Also, if the experimental tasks require analyzing the internal structure of words,
participants are likely to focus on a symbolic representation. Therefore, many
experimental tasks used in studies on sublexical processing may inherently lead
participants to focus on a symbolic representation (phoneme-monitoring task, syllable-
monitoring task, ABX discrimination task, word-completion task). Also, the auditory
naming task needs to rely on a symbolic representation because this representation
mediates articulation in this model. The lexical decision task is processed at a symbolic
representation as a special case. The lexical decision itself does not require analyzing the
internal structures of words. However, participants must also process nonwords, which
leads them to focus on a symbolic representation. In order to process both types of stimuli
efficiently, participants probably tend to focus on a symbolic representation. If this is the
case, the semantic categorization task is the only task that is not biased towards a
symbolic representation.
The proposed model uses two types of word representations (Auditory patterns
and Symbolic representations) in order to explain the results obtained in this dissertation.
The two types of word competition observed within a single experiment seemed
especially problematic for current models. The proposed model explains the two types of
word competition by positing two different representations, which underlie lexical and
sublexical processing respectively. This model is a first attempt to posit auditory patterns
and symbolic representations simultaneously; more detailed investigation needs to be
conducted in the future.
6.4. CONCLUSIONS
This dissertation attempted to investigate two aspects of lexical access:
representations used for lexical access and word competition effects in Japanese. How
well did it succeed?
In word competition, the neighborhood density effect is a measure of lexical
competition: words from dense neighborhoods are processed less quickly and less
accurately than words from sparse neighborhoods. Therefore, three experiments were
conducted in this dissertation to investigate this effect. The results showed that
neighborhood density influenced both processing times and accuracy.
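The neighborhood measure referred to throughout can be illustrated with the conventional one-segment rule, under which a neighbor differs from the target by a single substitution, addition, or deletion. The sketch below applies that rule to an invented word list; an actual analysis would operate over a dictionary's phonemic or moraic transcriptions, not these made-up strings.

```python
# Sketch of the one-segment neighbor rule and neighborhood density,
# over a hypothetical toy lexicon (one character = one segment).

def is_neighbor(a, b):
    """True if b differs from a by one substitution, insertion, or deletion."""
    if a == b:
        return False
    la, lb = len(a), len(b)
    if abs(la - lb) > 1:
        return False
    if la == lb:
        # Same length: exactly one substitution?
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: deleting one segment of the longer
    # string must yield the shorter string.
    short, long_ = (a, b) if la < lb else (b, a)
    return any(long_[:i] + long_[i+1:] == short for i in range(len(long_)))

def neighborhood_density(word, lexicon):
    """Number of lexicon words that are one-segment neighbors of word."""
    return sum(is_neighbor(word, w) for w in lexicon)
```

Words from dense neighborhoods are those for which this count is high; the inhibitory prediction is that such words are recognized more slowly and less accurately.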
The processing times of Experiment 3 (the Semantic categorization experiment)
showed a neighborhood inhibitory effect for both fast and slow responders. This effect
was observed in a similar semantic categorization experiment in English (Vitevitch &
Luce, 1999). The accuracy data of Experiment 2 (Word identification in noise
experiment) showed that words from dense neighborhoods were recognized less
accurately than words from sparse neighborhoods. This pattern was also observed in
word identification in noise experiments in Japanese and English (Luce & Pisoni, 1998;
Amano & Kondo, 1999). Based on the assumption that the neighborhood inhibitory effect
reflects word competition in any language, the above results provide evidence that
word competition also occurs in Japanese.
The results also showed another type of word competition effect in Japanese. The
second type of competition is related to the cohort theory. First, the effect of uniqueness
point was observed: words with an earlier uniqueness point were categorized more
quickly than words with a later uniqueness point. Also, an effect of initial cohort size
was observed. Recall that first-mora frequency can be interpreted in different ways;
under one interpretation, if its effect is inhibitory, it indexes the size of the initial cohort.
This factor measures how frequently the initial mora appears word-initially. Therefore, if
this factor is high, many words begin with this initial mora, yielding more lexical
competition. In this case, words with a large initial cohort are processed less quickly
than words with a small initial cohort. First-mora frequency showed an inhibitory
effect in Experiment 3, which supports the hypothesized effect of initial cohort
size. These two results, which support the cohort theory, suggest that the left-to-right
word competition effect happens in word recognition processes in Japanese.
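Both cohort-style measures just discussed, the uniqueness point and initial cohort size, can be illustrated over a toy lexicon. In this hypothetical sketch, strings stand in for mora sequences and the first character stands in for the first mora; the real measures were computed over a lexical database, not this invented list.

```python
# Toy illustration of two cohort-style measures: uniqueness point
# and initial cohort size, over a made-up mini-lexicon.

def uniqueness_point(word, lexicon):
    """1-based index of the segment at which word diverges from all
    other lexicon entries; len(word) if it never fully diverges."""
    others = [w for w in lexicon if w != word]
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        if not any(w.startswith(prefix) for w in others):
            return i
    return len(word)

def initial_cohort_size(word, lexicon):
    """How many lexicon words share the word's first segment; a large
    initial cohort predicts more left-to-right competition."""
    return sum(w[0] == word[0] for w in lexicon)
```

Under the cohort account, words with an earlier uniqueness point and a smaller initial cohort should be categorized more quickly, which is the pattern reported for Experiment 3.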
Together, these data provided evidence that two types of lexical competition are at
work in Japanese.
This dissertation also investigated the kind of word representation used for lexical
access. Recent studies have shown that listeners use both abstract and episodic
representations in lexical access (e.g., Luce & Lyons, 1998; Soto-Faraco et al.,
2001; Pallier et al., 2001). The results of the experiments in this dissertation also support
this view. As was just mentioned, there were two types of lexical competition effects
observed in Experiment 3 (the Semantic categorization experiment): (1) a whole-word
form competition effect (the neighborhood effect) and (2) a phoneme-based competition
effect (cohort reduction). A neighborhood density calculation also showed that inhibition is
related to a measure of whole-word similarity as computed by a comparison of
cochleagrams. Therefore, we need to assume that auditory patterns (auditory
representations) are stored in the lexicon.
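The whole-word comparison of cochleagrams mentioned above can be approximated, in spirit, by aligning two time-by-frequency matrices with dynamic time warping and accumulating frame-to-frame distances. This generic sketch uses made-up two-channel matrices; it is not the actual distance metric used in the density calculation reported here.

```python
# Generic whole-pattern distance between two "cochleagrams"
# (lists of spectral frames), via dynamic time warping (DTW).
import math

def frame_dist(f, g):
    """Euclidean distance between two spectral frames."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(f, g)))

def dtw_distance(A, B):
    """Accumulated frame distance along the best warping path."""
    n, m = len(A), len(B)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = frame_dist(A[i - 1], B[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

The point of such a measure is that it compares words as unanalyzed auditory wholes, with no reference to phonemes or moras, which is exactly the property attributed to Auditory patterns.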
The need for an abstract representation comes from the fact that slow namers in the
two auditory naming experiments tended to repeat the target words only after hearing them
entirely, suggesting that they knew exactly which word they were going to say. In this case,
participants have to retrieve an articulatory representation of the words. Since these two
representations do not have a direct, one-to-one correspondence, a symbolic
representation that mediates between the two representations is needed (see Jusczyk,
1993; Plaut & Kello, 1999 for discussion).
In conclusion, the results of the experiments in this dissertation shed light on two
aspects of lexical access investigated in this dissertation. First, a lexical competition effect
was confirmed in Japanese. Second, there are two types of lexical competition in auditory
word recognition: form-based competition (neighborhood density) and phoneme-based
competition (cohort reduction). Finally, both abstract (symbolic) representations and
episodic (auditory) representations need to be stored in the lexicon.
REFERENCES
Amano, S., & Kondo, T. (1999). Neighborhood effects on spoken word recognition in Japanese. EUROSPEECH 99, 4, 1663-1666.
Amano, S., & Kondo, T. (2000). Neighborhood and cohort in lexical processing of Japanese spoken words. Paper presented at the workshop on Spoken Word Access Processes, Jonkerbosch Conference Center, Nijmegen, The Netherlands.
Amano, S., & Kondo, T. (1999, 2000). The properties of the Japanese lexicon. Tokyo: Sanseido Co. Ltd.
Bailey, T. M., & Hahn, U. (2001). Determinants of wordlikeness: Phonotactics or lexical neighborhoods? Journal of Memory and Language, 44, 568-591.
Barry, C., Hirsh, K. W., Johnston, R. A., & Williams, C. L. (2001). Age of acquisition, word frequency, and the locus of repetition priming of picture naming. Journal of Memory and Language, 44, 350-375.
Battig, W. F., & Montague, W. E. (1969). Category norms for verbal items in 56 categories: A replication and extension of the Connecticut category norms. Journal of Experimental Psychology, 49, 229-240.
Beckman, M. E., & Edwards, J. (1990). Lengthenings and shortenings and the nature of prosodic constituency. In J. Kingston & M. E. Beckman (Eds.), Papers in laboratory phonology I: Between the grammar and physics of speech (pp. 152-178). Cambridge, UK: Cambridge University Press.
Bradley, D. C., Sanchez-Casas, R. M., & Garcia-Albea, J. E. (1993). The status of the syllable in the perception of English and Spanish. Language and Cognitive Processes, 8, 197-233.
Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sound. Cambridge, MA: MIT Press.
Brown, G. D. A., & Watson, F. L. (1987). First in, first out: Word learning age and spoken word frequency as predictors of word familiarity and word naming latency. Memory and Cognition, 15, 208-216.
Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61, 93-125.
Bruck, B., Treiman, R., & Caravolas, M. (1995). Role of the syllable in the processing of spoken English: Evidence from a nonword competition task. Journal of Experimental Psychology: Human Perception and Performance, 21, 469-479.
Charles-Luce, J., & Luce, P. A. (1995). An examination of similarity neighbourhoods in young children's receptive vocabularies. Journal of Child Language, 22, 727-735.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.
Cutler, A. (1986). Forbear is a homophone: Lexical prosody does not constrain lexical access. Language and Speech, 29, 201-220.
Cutler, A. (1997). The comparative perspective on spoken-language processing. Speech Communication, 31, 3-15.
Cutler, A., & Carter, D. M. (1987). The predominance of strong initial syllables in the English vocabulary. Computer Speech and Language, 2, 133-142.
Cutler, A., & Chen, H. C. (1997). Lexical tone in Cantonese spoken-word processing. Perception & Psychophysics, 59, 165-179.
Cutler, A., & Norris, D. G. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception & Performance, 14, 113-121.
Cutler, A., & Butterfield, S. (1992). Rhythmic cues to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31, 218-236.
Cutler, A., Mehler, J., Norris, D. G., & Segui, J. (1986). The syllable's differing role in the segmentation of French and English. Journal of Memory and Language, 25, 385-400.
Cutler, A., Mehler, J., Norris, D. G. & Segui, J. (1992). The monolingual nature of speech segmentation by bilinguals. Cognitive Psychology, 24, 381-410.
Cutler, A., & Otake, T. (1994). Mora or phoneme? Further evidence for language-specific listening. Journal of Memory and Language, 33, 824-844.
Cutler, A., & Otake, T. (1999). Pitch accent in spoken-word recognition in Japanese. Journal of the Acoustical Society of America, 105, 1877-1888.
Cutler, A., & Otake, T. (2002). Rhythmic categories in spoken-word recognition. Journal of Memory and Language, 46, 296-322.
Cutler, A., & Young, D. (1994). Rhythmic structure of word blends in English. Proceedings of International Conference on Spoken Language Processing '94, Yokohama, Japan, 3, 1407-1410.
Dupoux, E., Kakehi, K., Hirose, Y., Pallier, C., & Mehler, J. (1999). Epenthetic vowels in Japanese: A perceptual illusion? Journal of Experimental Psychology: Human Perception and Performance, 25, 1568-1578.
Dupoux, E., Pallier, C., Kakehi, K., & Mehler, J. (2001). New evidence for prelexical phonological processing in word recognition. Language and Cognitive Processes, 16, 491-505.
Fear, B. D., Cutler, A., & Butterfield, S. (1995). The strong/weak syllable distinction in English. Journal of the Acoustical Society of America, 97, 1893-1904.
Finney, S. A., Protopapas, A., & Eimas, P. D. (1996). Attentional allocation to syllables in American English. Journal of Memory and Language, 35, 893-909.
Forster, K. I., & Shen, D. (1996). No enemies in the neighborhood: Absence of inhibitory neighborhood effects in lexical decision and semantic categorization. Journal of Experimental Psychology: Learning, Memory, & Cognition, 22, 696-713.
Frisch, S. A. (1996). Similarity and frequency in phonology. Unpublished doctoral dissertation, Northwestern University, Evanston, IL.
Frisch, S. A., Large, N. R., & Pisoni, D. B. (2000). Perception of wordlikeness: Effects of segment probability and length on the processing of nonwords. Journal of Memory and Language, 42, 481-496.
Garlock, V. M., Walley, A. C., & Metsala, J. L. (2001). Age-of-acquisition, word frequency, and neighborhood density effects on spoken word recognition by children and adults. Journal of Memory and Language, 45, 468-492.
Goldinger, S. D. (1992). Words and voices: Implicit and explicit memory for spoken words. Unpublished doctoral dissertation, Indiana University, Bloomington, IN.
Goldinger, S. D. (1996). Words and voices: Episodic traces in spoken word identification and recognition memory. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 1166-1183.
Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review, 105, 251-279.
Goldinger, S. D., Luce, P. A., & Pisoni, D. B. (1989). Priming lexical neighbors of spoken words: Effects of competition and inhibition. Journal of Memory and Language, 28, 501-518.
Gow, D., & Gordon, P. (1995). Lexical and pre-lexical influences in word segmentation: Evidence from priming. Journal of Experimental Psychology: Human Perception and Performance, 21, 344-359.
Greenberg, J. H., & Jenkins, J. J. (1964). Studies in the psychological correlates of the sound system of American English. Word, 20, 157-177.
Grossberg, S., Boardman, I., & Cohen, M. (1997). Neural dynamics of variable-rate speech categorization. Journal of Experimental Psychology: Human Perception and Performance, 23, 483-503.
Hasegawa, Y., & Hata, K. (1992). Fundamental-frequency as an acoustic cue to accent perception. Language and Speech, 35, 87-98.
Hibiya, J. (1995). The velar nasal in Tokyo Japanese: A case of diffusion from above. Language Variation and Change, 7, 139-152.
Johnson, K. (1997a). The auditory/perceptual basis for speech segmentation. In K. Ainsworth-Darnell & M. D'Imperio (Eds.), Ohio State University Working Papers in Linguistics: Papers from the Linguistic Laboratory, volume 50. Ohio State University.
Johnson, K. (1997b). Speech perception without speaker normalization. In K. Johnson & J. W. Mullennix (Eds.), Talker variability in speech processing (pp. 145-166). San Diego: Academic Press.
Jusczyk, P. W. (1993). From general to language-specific capacities: The WRAPSA model of how speech perception develops. Journal of Phonetics, 21, 3-28.
Kenbou, G., Kindaichi, H., Kindaichi, K., & Shibata, T. (1981). Sanseido Shinmeikai Dictionary. Tokyo: Sanseido Co. Ltd.
Klatt, D. H. (1974). The duration of [s] in English words. Journal of Speech and Hearing Research, 17, 51-63.
Klatt, D. H. (1975). Vowel lengthening is syntactically determined in a connected discourse. Journal of Phonetics, 3, 129-140.
Klatt, D. H. (1976). Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America, 59, 1208-1221.
Klatt, D. H. (1979). Speech perception: A model of acoustic-phonetic analysis and lexical access. Journal of Phonetics, 7, 279-312.
Klatt, D. H. (1981). Lexical representations for speech production and perception. In T. Myers, J. Laver, & J. Anderson (Eds.), The cognitive representation of speech (pp. 11-31).
Kubozono, H. (1995). Perceptual evidence for the mora in Japanese. In B. Connell and A. Arvaniti (Eds.), Phonology and Phonetic Evidence: Papers in Laboratory Phonology IV (pp. 141-156). Cambridge: Cambridge University Press.
Lehiste, I. (1960). An acoustic-phonetic study of internal open juncture. Basel; New York: S. Karger.
Lehiste, I. (1972). The timing of utterances and linguistic boundaries. Journal of the Acoustical Society of America, 51, 2018-2024.
Luce, P.A. (1986a). Neighborhoods of words in the mental lexicon. (Research on speech perception technical report 6). Bloomington, IN: Indiana University.
Luce, P. A. (1986b). A computational analysis of uniqueness points in auditory word recognition. Perception & Psychophysics, 39, 155-158.
Luce, P. A., Pisoni, D. B., & Goldinger, S. D. (1988). Similarity neighborhoods of spoken words. (Research on speech perception progress report 14.) Bloomington, IN: Indiana University.
Luce, P. A., Goldinger, S. D., Auer, E. T., & Vitevitch, M. S. (2000). Phonetic priming, neighborhood activation and PARSYN. Perception and Psychophysics, 62, 615-625.
Luce, P. A., & Large, N. R. (2001). Phonotactics, density, and entropy in spoken word recognition. Language and Cognitive Processes, 16, 565-581.
Luce, P. A., & Lyons, E. A. (1998). Specificity of memory representations for spoken words. Memory and Cognition, 26, 708-715.
Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19, 1-36.
Luce, P. A., Pisoni, D. B., & Goldinger, S. D. (1990). Similarity neighborhoods of spoken words. In G. T. M. Altmann (Ed.), Cognitive models of speech processing: Psycholinguistic and computational perspectives (pp. 122-147). Cambridge, MA: MIT Press.
Luce, R. D. (1959). Individual choice behavior. New York: Wiley.
van der Lugt, A.H. (2001). The use of sequential probabilities in the segmentation of speech. Perception and Psychophysics, 63, 811-823.
Marslen-Wilson, W. D. (1987). Functional parallelism in spoken word-recognition. Cognition, 25, 71-102.
Marslen-Wilson, W. D., & Welsh, A. (1978). Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology, 10, 29-63.
Marslen-Wilson, W. D., & Tyler, L. K. (1980). The temporal structure of spoken language understanding. Cognition, 8, 1-71.
McClelland, J. & Elman, J. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1-86.
McQueen, J.M. (1998). Segmentation of continuous speech using phonotactics. Journal of Memory and Language, 39, 21-46.
McQueen, J. M. & Cutler, A. (1998). Spotting (different types of) words in (different types of) context. Proceedings of the 5th International Conference on Spoken Language Processing, vol. 6, (pp. 2791-2794), Sydney, Australia.
McQueen, J. M., Norris, D. G., & Cutler, A. (1994). Competition in spoken word recognition: Spotting words in other words. Journal of Experimental Psychology: Learning, Memory and Cognition, 20, 621-638.
McQueen, J. M., Otake, T., & Cutler, A. (2001). Rhythmic cues and possible-word constraints in Japanese speech segmentation. Journal of Memory and Language, 45, 103-132.
Mehler, J., Dommergues, J.-Y., Frauenfelder, U., & Segui, J. (1981). The syllable's role in speech segmentation. Journal of Verbal Learning and Verbal Behavior, 20, 298-305.
Metsala, J. L. (1997). An examination of word frequency and neighborhood density in the development of spoken-word recognition. Memory & Cognition, 25, 47-56.
Nakatani, L. H., & Dukes, K. D. (1977). Locus of segmental cues for word juncture. Journal of the Acoustical Society of America, 62, 714-719.
Norris, D. (1994). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52, 189-234.
Norris, D., McQueen, J. M., & Cutler, A. (1995). Competition and segmentation in spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1209-1228.
Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23, 299-325.
Norris, D., McQueen, J. M., Cutler, A., & Butterfield, S. (1997). The possible-word constraint in the segmentation of continuous speech. Cognitive Psychology, 34, 191-243.
Norris, D., McQueen, J. M., Cutler, A., Butterfield, S., & Kearns, R. (2001). Language-universal constraints on speech segmentation. Language and Cognitive Processes, 16, 637-660.
Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39-57.
Oller, D. K. (1973). The effect of position in utterance on speech segment duration in English. Journal of the Acoustical Society of America, 54, 1235-1247.
Ogawa, T. (1972). 52 kategorii ni zokusuru go no syutsugen hindo hyoo (Category norms for verbal items in 52 categories in Japanese). Kwansei Gakuin University, Jinbun Ronkyuu, 22, 1-68.
Otake, T., Hatano, G., Cutler, A., & Mehler, J. (1993). Mora or syllable? Speech segmentation in Japanese. Journal of Memory and Language, 32, 258-278.
Otake, T., Hatano, G., & Yoneyama, K. (1996a). Japanese speech segmentation by Japanese listeners. In T. Otake & A. Cutler (Eds.), Phonological structure and language processing: Cross-linguistic studies (pp. 183-201). Berlin, Germany: Mouton de Gruyter.
Otake, T., Yoneyama, K., Cutler, A., & van der Lugt, A. (1996b). The representation of Japanese moraic nasals. Journal of the Acoustical Society of America, 100, 3831-3842.
Otake, T. & Cutler, A. (1999). Perception of suprasegmental structure in a non-native dialect. Journal of Phonetics, 27, 229-253.
Pallier, C., Sebastian-Galles, N., Felguera, T., Christophe, A., & Mehler, J. (1993). Attentional allocation within the syllabic structure of spoken words. Journal of Memory and Language, 32, 373-389.
Pallier, C., Colome, A., & Sebastian-Galles, N. (2001). The influence of native- language phonology on lexical access: Exemplar-based versus abstract lexical entries. Psychological Science, 12, 445-449.
Perkell, J. S., Matthies, M. L., Svirsky, M. A., & Jordan, M. I. (1995). Goal-based speech motor control: a theoretical framework and some preliminary data. Journal of Phonetics, 23, 23-35.
Pirat, A., Logan, J., Cockell, J., & Gutteridge, M. E. (1995). The role of phonological neighborhoods in the identification of spoken words by preschool children. Poster presented at the Annual Meeting of the Canadian Society for Brain, Behaviour & Cognitive Science, Halifax, Nova Scotia.
Pisoni, D. B. (1996). Word identification in noise. Language and Cognitive Processes, 11, 681-688.
Pisoni, D. B., Nusbaum, H. C., Luce, P. A., & Slowiaczek, L. M. (1985). Speech perception, word recognition and the structure of the lexicon. Speech Communication, 4, 75-95.
Pisoni, D. B. (1997). Some thoughts on “normalization” in speech perception. In K. Johnson & J. W. Mullennix (Eds.), Talker variability in speech processing (pp. 9-32). San Diego: Academic Press.
Pitt, M. A. (1998). Phonological processes and the perception of phonotactically illegal consonant clusters. Perception and Psychophysics, 60, 941-951.
Pitt, M. A., & McQueen, J. M. (1998). Is compensation for coarticulation mediated by the lexicon? Journal of Memory and Language, 39, 347-370.
Pitt, M. A., & Samuel, A. G. (1995). Lexical and sublexical feedback in auditory word recognition. Cognitive Psychology, 29, 149-188.
Pitt, M. A., Smith, K. L., & Klein, J. M. (1998). Syllabic effects in word processing: Evidence from the structural induction paradigm. Journal of Experimental Psychology: Human Perception and Performance, 24, 1596-1611.
Plaut, D. C., & Kello, C. T. (1999). The emergence of phonology from the interplay of speech comprehension and production: A distributed connectionist approach. In B. MacWhinney (Ed.), The emergence of language (pp. 381-415). Lawrence Erlbaum Associates.
Quené, H. (1992). Durational cues for word segmentation in Dutch. Journal of Phonetics, 20, 331-350.
Radeau, M., & Morais, J. (1990). The uniqueness point effect in the shadowing of spoken words. Speech Communication, 9, 155-164.
Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.
Saffran, J. R., Newport, E. L., Aslin, R. N., Tunick, R. A., & Barrueco, S. (1997). Incidental language learning: Listening (and learning) out of the corner of your ear. Psychological Science, 8, 101-105.
Schacter, D. L., & Church, B. (1992). Auditory priming: Implicit and explicit memory for words and voices. Journal of Experimental Psychology: Learning, Memory and Cognition, 18, 915-930.
Sebastian-Galles, N. (1996). The role of accent in speech perception. In T. Otake & A. Cutler (Eds.), Phonological structure and language processing: Cross-linguistic studies (pp. 172-181). Berlin, Germany: Mouton de Gruyter.
Sebastian-Galles, N., Dupoux, E., Segui, J., & Mehler, J. (1992). Contrasting syllabic effects in Catalan and Spanish. Journal of Memory and Language, 31, 18-32.
Sekiguchi, T., & Nakajima, Y. (1999). The use of lexical prosody for lexical access of the Japanese language. Journal of Psycholinguistic Research, 28, 439-454.
Shillcock, R. C. (1990). Lexical hypotheses in continuous speech. In G. T. M. Altmann (Ed.), Cognitive models of speech processing: Psycholinguistic and computational perspectives (pp. 24-49). Cambridge, MA: MIT Press.
Smith, K. L., & Pitt, M. A. (1999). Phonological and morphological influences in the syllabification of spoken words. Journal of Memory and Language, 41, 199-222.
Smith, K. L., & Pitt, M. A. Are cues to syllable boundaries also cues to word boundaries? Manuscript submitted for publication.
Soto-Faraco, S., Sebastian-Galles, N., & Cutler, A. (2001). Segmental and suprasegmental mismatch in lexical access. Journal of Memory and Language, 45, 412-432.
Sugito, M. (1972). Ososagari-koo: Dootai-sokutei ni yoru nihongo akusento no kenkyuu (Delayed pitch fall: An acoustic study). Shoin Joshi Daigaku Ronshuu, 10. (Reprinted in M. Tokugawa (Ed.), Akusento (Accent) (pp. 201-229). Tokyo: Yuuseidoo, 1980.)
Suomi, K., McQueen, J. M., & Cutler, A. (1997). Vowel harmony and speech segmentation in Finnish. Journal of Memory and Language, 36, 422-444.
Tabossi, P., Burani, C., & Scott, D. (1995). Word identification in fluent speech. Journal of Memory and Language, 34, 440-467.
Tyler, L. K., & Wessels, J. (1983). Quantifying contextual contributions to word-recognition processes. Perception and Psychophysics, 34, 409-420.
Vance, T. J. (1987). An introduction to Japanese phonology. Albany, NY: State University of New York Press.
Vance, T. J. (1995). Final accent vs. no accent: Utterance-final neutralization in Tokyo Japanese. Journal of Phonetics, 23, 487-499.
Vitevitch, M. S., & Luce, P. A. (1998). When words compete: Levels of processing in perception of spoken words. Psychological Science, 9, 325-329.
Vitevitch, M. S., & Luce, P. A. (1999). Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language, 40, 374-408.
Vroomen, J., & de Gelder, B. (1995). Metrical segmentation and lexical inhibition in spoken word recognition. Journal of Experimental Psychology: Human Perception and Performance, 21, 98-108.
Vroomen, J., & de Gelder, B. (1997). Activation of embedded words in spoken word recognition. Journal of Experimental Psychology: Human Perception and Performance, 23, 710-720.
Vroomen, J., van Zon, M., & de Gelder, B. (1996). Cues to speech segmentation: Evidence from juncture misperceptions and word spotting. Memory and Cognition, 24, 744-755.
Vroomen, J., Tuomainen, J., & de Gelder, B. (1998). The roles of word stress and vowel harmony in speech segmentation. Journal of Memory and Language, 38, 133-149.
Warner, N. (1997). Japanese final-accented and unaccented phrases. Journal of Phonetics, 25, 43-60.
Wallace, W. P., Stewart, M. T., & Malone, C. P. (1995a). Recognition memory errors produced by implicit activation of word candidates during the processing of spoken words. Journal of Memory and Language, 34, 417-439.
Wallace, W. P., Stewart, M. T., Sherman, H. L., & Malone, C. P. (1995b). False positives in recognition memory produced by cohort activation. Cognition, 55, 85-113.
Ye, Y., & Connine, C. M. (1999). Processing spoken Chinese: The role of tone information. Language and Cognitive Processes, 14, 609-630.
Yoneyama, K. (2000). The structural aspects of the Japanese lexicon. Paper presented at The Speakers Series, Department of Linguistics, Ohio State University.
Yoneyama, K., & Pitt, M. A. (1999). Prelexical representation in Japanese: Evidence from the structural induction paradigm. Proceedings of the 14th International Congress of Phonetic Sciences, Vol. 2, 893-896.
Zwitserlood, P. (1989). The locus of the effects of the sentential-semantic context in spoken word processing. Cognition, 32, 589-596.
Zwitserlood, P., & Schriefers, H. (1995). Effects of sensory information and processing time in spoken-word recognition. Language and Cognitive Processes, 10, 121-136.
Zwitserlood, P., Schriefers, H., Lahiri, A., & van Donselaar, W. (1993). The role of syllables in the perception of spoken Dutch. Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 260-271.
APPENDIX A: Alphabetic symbols used in the lexicon
In this dissertation, two types of word representations used for lexical access are tested: an abstract representation and an auditory representation. The alphabetic symbols shown below are used to describe the abstract representation. This representation is a "phonetically detailed" representation; it is therefore NOT the phonological representation of words that linguists generally assume. The representation maintains the short/long vowel distinction: in the Tokyo dialect of Japanese, /ou/ and /ei/ within a morpheme are realized as [oo] and [ee], respectively, and the representation records these phonetic forms. Similarly, the representation contains other detailed phonetic information. For example, keiki 'business conditions' and keeki 'cake' below both contain /ki/, which is realized with a palatalized /k/ (written as a capital K), different from the /k/ in other vowel environments such as /ke/ and /ko/ in the examples.
Examples:

[ee]
  keiki 'business conditions; market'   Phonological form: /keiki/   Abstract rep.: keeKi
  keeki 'cake'                          Phonological form: /keeki/   Abstract rep.: keeKi

[oo]
  koudo 'altitude, height'              Phonological form: /koudo/   Abstract rep.: koodo
  koodo 'cord'                          Phonological form: /koodo/   Abstract rep.: koodo

(The pitch diacritics marked on the abstract representations are not legible in this reproduction.)
The following tables show the alphabetic symbols used in the symbolic representation for lexical access.
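The conventions described above can be illustrated with a minimal sketch. This is an assumed illustration, not the dissertation's actual procedure: it applies only the two rules stated in the text (morpheme-internal /ei/ and /ou/ surface as [ee] and [oo]; /k/ before /i/ is palatalized and written as a capital K), and it omits pitch-accent marking, the moraic nasal (N), and geminates (Q).

```python
def to_abstract(phonological: str) -> str:
    """Sketch of the mapping from a phonological form to the
    'phonetically detailed' abstract representation (simplified)."""
    # Morpheme-internal /ei/ -> [ee] and /ou/ -> [oo]
    s = phonological.replace("ei", "ee").replace("ou", "oo")
    # /k/ before /i/ is palatalized; written with a capital K
    s = s.replace("ki", "Ki")
    return s

# /keiki/ 'business conditions' and /keeki/ 'cake' converge segmentally;
# in the full representation only their pitch patterns distinguish them.
print(to_abstract("keiki"))  # -> keeKi
print(to_abstract("keeki"))  # -> keeKi
print(to_abstract("koudo"))  # -> koodo
```

Note how the sketch makes the point of the [ee] example concrete: once /ei/ is rewritten as [ee], the two words become segmentally identical in the abstract representation.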
[Tables: the alphabetic symbol for each Japanese mora, arranged by consonant row (k, s, t/c, n, h, m, r, w; voiced g, z, d, b; semi-voiced p) and vowel column (-a, -i, -u, -e, -o), with palatalized consonants written as capital letters and the corresponding kana shown alongside each symbol. Most cells are illegible in this reproduction.]
Moraic nasal: N. Geminate consonant: Q. Long vowels: written with doubled vowels.

APPENDIX B: The 300 stimulus pairs used in a similarity judgment experiment

The first-stimulus accent patterns are shown as a function of the second-stimulus accent patterns. Shaded cells in the table indicate that these pairs were not presented to participants. A star (*) indicates an accented high pitch, 1 indicates an unaccented high pitch, and 0 indicates a low pitch.

[Table: first-stimulus accent patterns (01111, 0111*, 011*0, 01*00, 0*000, *0000) crossed with the second-stimulus accent patterns; the individual cells are illegible in this reproduction.]

Appendix C: 700 target words

The 700 target words that were used in Experiments 2, 3 and 4 are shown in the table below. Three different neighborhood density measures, word frequency, uniqueness point (UP), duration, and 1st-mora frequency are also included in the same table.

[Table, continued beyond this excerpt: one row per target word, giving the romanized word, its written Japanese form, an English gloss, neighborhood density counted over segments and over segments plus pitch, word frequency, uniqueness point, duration, auditory frequency, and 1st-mora frequency. Entries run alphabetically from kabotya 'pumpkin' through sikiso 'pigment' in this excerpt; most cells are too garbled in this reproduction to transcribe reliably.]
continued Target Words Neighborhood >» 2 | Word Gloss £ 5 A <£ O* Word Frequency D uration Auditory + + Pilch Segments Segments 5 tour old countries sikoku EH in the same island 53 3 2.677 3.6227 7 564 0.0424 in Japan training. sikomi 16 11 0.292 2.6096 7 665 0.0424 education sikori ucy stiffness 9 7 30.911 3.0273 6 580 0.0424 plan, device, sikumi 8 7 9.971 4.2190 5 600 0.0424 mechanism result, outcome, simatu 12 0 2.195 3.0990 7 634 0.0424 management humidity. simeri S 4 2.398 1.8129 7 673 0.0424 ay dampness a kind of simezi 5 4 0.000 2.4099 7 724 0.0424 Ltoi: mushroom sinem a *> *■ 7 cinema 1 1 5.692 2.3655 7 639 0.0424 shop of old sinise I 1 0.000 3.2310 5 744 0.0424 standing sinobi ms 7 6 4.020 3.0233 7 657 0.0424 sinobu /C»*jp recall, remember 3 2 0.737 3.4029 7 716 0.0424 sirabe tune 4 3 16.555 4.5790 7 629 0.0424 sobriety. sirahu 8 4 11.563 1.3979 7 600 0.0424 mm soberness siraga sa gray hair 3 3 6.084 2.7520 7 635 0.0424 si rase report, notice 5 3 0.000 3.2335 6 648 0.0424 sirasu a* young sardine 8 5 0.600 2.3324 7 707 0.0424 siryoku a* sight, vision 41 17 11.546 3.0748 7 626 0.0424 siromi && white meat 14 9 1.803 2.2625 7 659 0.0424 sirosa &£ whiteness I 1 0.580 2.2148 6 677 0.0424 siruku v ;u ? silk 20 5 7.769 2.3222 7 659 0.0424 sirusi eh sign 2 2 1.844 3.2589 7 678 0.0424 sisatu mm inspection 26 16 0.000 3.8566 7 710 0.0424 sisvamo v v f t smelt 0 0 1.273 1.8921 5 697 0.0424 sisetu l&fifc institution 34 2 0.824 4.6831 7 678 0.0424 siseki S&B* historic site 50 30 1.135 2.9232 7 697 0.0424 sisitu nature 30 20 0.000 3.1562 7 776 0.0424 sisvoku sample 59 29 1.350 2.7364 7 691 0.0424 expenses. sisyutu 26 18 4.0925 5 714 0.0424 expenditure 0.000 continued 230 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood 3m 5 Word Gloss i m 1M Mora 1M Frequency Word Duration frequency + Pitch+ Auditory ■ > . Segments 3 undergarment. 
sitagi 13 12 0.292 3.1738 6 640 0.0424 T* underwear sitaku X* preparations 40 30 0.000 2.9263 7 691 0.0424 preliminary sitami I4 13 3.922 2.7774 7 602 0.0424 TH inspection groundwork. sitazi 15 11 0.477 2.8299 7 628 0.0424 T tfe foundation sitetu private railway I0 5 0.000 3.3115 5 681 0.0424 siteki point out. indicate IS 11 1.595 4.7990 7 610 0.0424 siwake classification 6 2 1.239 2.7007 7 644 0.0424 siw asu December 4 3 0.790 2.6355 5 702 0.0424 sizimi « corbicula 4 3 4.459 2.2430 6 698 0.0424 punishment. syobatu 6 -> 4.521 3.4118 5 629 0.0183 penalty soburi m m v look, behavior 3 0 44.689 2.7110 4 561 0.0171 sodati growth 15 S 0.372 2.5955 5 681 0.0171 sodate W r foster 7 6 0.000 2.5740 7 650 0.0171 sohubo grandparents 0 0 12.675 2.9609 5 527 0.0171 sogeki m s shoot 2 1 2.36S 2.6998 5 657 0.0171 sokoku m s mother land 24 13 6.528 3.5838 6 590 0.0171 syokora V 3 3 7 chocolate 2 0 0.743 0.8451 5 599 0.0183 syokuba W&J* post, work place 5 3 0.000 4.0526 7 608 0.0183 sokudo £ j £ speed, velocity 13 12 3.794 3.6694 7 562 0.0171 svokugo after a meal 9 5 0.000 0.3010 6 699 0.0183 syokum u duty, work 12 6 3.222 3.6702 7 599 0.0183 instantaneous sokusi SH5E 39 13 0.519 3.0913 7 722 0.0171 death occupational syokusyu 22 6 0.000 3.2783 7 699 0.0183 category syokuhi food expense 18 9 1.340 2.9571 7 648 0.0183 sokuza sn & ready, prompt 7 4 3.172 3.1602 6 589 0.0171 syokuzi ** meal, diet 33 IS 1.053 4.0417 7 645 0.0183 syom otu f t * book 8 5 11.822 3.0133 5 580 0.0183 sonata 'J1-Z sonata 0 8.188 2.8048 7 586 0.0171 syoniti tO B the first day 2 1 0.262 3.5960 7 707 0.0183 sonoba £< Z )i* there, on the spot 4 1 0.634 3.3526 7 647 0.0171 continued 231 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
continued Target Words Neighborhood Word Gloss b l“ l“ Mora Frequency Word Duration frequency Auditory Segments Segments + Pitch 5 sonogo afterwards 7 5 11.274 4.3205 7 616 0.0171 sonota other 3 0 1.454 3.8552 6 590 0.0171 sonote games, tricks I 0 0.000 2.6335 6 713 0.0171 sonohi that day 2 2 0.759 3.4310 7 667 0.0171 syoseki *$i book, publication 27 16 0.000 3.3151 7 724 0.0183 sosicu sx qualities 10 7 0.000 2.7259 5 694 0.0171 sosiki mm organization 19 4 0.000 4.7211 7 635 0.0183 syosiki f t x t form 15 7 0.000 2.4843 7 660 0.0171 sosina mm small gift 0 0 7.693 1.3979 5 641 0.0171 syotoku mm income 20 10 0.469 4.2473 7 638 0.0183 syozoku mm affiliation 19 3 1.195 4.0248 5 676 0.0183 subako mu nest box. hive 5 2 0.000 2.3032 7 60S 0.0171 suberi J ty sliding, slide 2 I 19.842 2.9703 7 634 0.0171 suburi mmy batting swing 7 4 30.659 2.2253 5 627 0.0171 sweet-and-sour subuta 3 2 6.045 1.3617 5 562 0.0171 pork syutuba til.% run tor. stand for 2 1 4.473 3.8987 6 590 0.0142 syutudo t t i ± excavation 8 1 0.000 3.5012 7 646 0.0142 sudati H k tL h starting in life 5 5 0.277 2.4082 7 684 0.0171 sliced and sudako 3 1 0.234 0.9542 7 648 0.0171 vinegared octopus sugaru bees (old name) 2 0 13.186 3.4570 5 593 0.0171 sugata $ figure, shape I 0 3.560 4.6539 7 570 0.0171 wonder. sugosa 1 12.979 0.3010 5 621 0.0171 ££ amazement, terror I suhada mm bare skin 1 0 9.564 1.8633 7 552 0.0171 opening, space, sukima 10 5 12.883 2.6170 7 563 0.0171 gap syukusya s# lodging, hotel 15 6 0.000 3.6386 7 618 0.0142 reduced drawing, syukuzu 4 1 0.000 2.5490 6 717 0.0142 mm miniature copy sumibi mx. charcoal fire 2 1 1.619 2.1703 6 622 0.0171 sumika living, dwelling 5 4 0.262 2.8041 7 646 0.0171 sumire s violet 2 2 0.537 2.5611 7 669 0.0171 syumoku a s item, event 20 10 0.327 3.7520 7 702 0.0142 sumomo m plum 1 I 3.591 4.1626 5 699 0.0171 continued 232 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
continued Target Words Neighborhood Word Gloss & 1“ Mora1“ Duration Word Frequency frequency + Pitch+ Auditory Segments Segments = sunaba sandbox 0 0 1.819 2.2480 6 654 0.0171 sandy soil, the sunati 6 5 0.000 2.3483 7 702 0.0171 sands syuniku * 1 * 1 vermilion inkpad 8 5 0.627 1.3222 7 693 0.0142 superu spelling 0 0 67.398 0.3010 6 470 0.0171 syuraba fighting scene I 1 6.926 2.4265 7 632 0.0142 suramu slum 4 3 20.957 2.8998 5 618 0.0171 suriru thrill 2 1 164.463 2.5237 6 499 0.0171 syuryoku £ * main force 23 8 2.033 3.8037 5 699 0.0142 surume dried cuttlefish T I 7.647 1.8865 5 650 0.0171 syusyoku £ £ staple food 30 16 7.647 3.0515 7 714 0.0142 susumi progress 4 4 1.454 3.1041 7 620 0.0171 sutego m x * deserted child 0 0 1.454 1.9956 6 602 0.0171 syutoku m % acquire, obtain 26 15 1.044 4.0520 6 668 0.0142 suyaki m m z unglazed pottery 2 1 1.515 2.1004 4 636 0.0171 leading actor, syuyaku 16 13 0.409 3.7860 7 714 0.0142 s s leading actress syuzyutu operation 8 6 2.528 4.1816 5 622 0.0142 suzuki f - r t bass 14 10 0.000 3.1216 6 747 0.0171 suzume * sparrow 5 5 0.871 3.0990 7 658 0.0142 tabako m cigarette, tobacco 5 4 11.801 3.9637 7 608 0.0245 tabizi a m journey, travel 4 0 1.294 2.1903 7 660 0.0245 tatiba a * position 4 0 7.513 4.6904 7 556 0.0245 seeing from the tatimi 7 4 18.464 3.2014 7 595 0.0245 £ * > J 1 callerv tagaku large sum 14 6 20.226 2.7752 5 579 0.0245 tahatu occur frequently 2 1 8.505 3.4239 5 593 0.0245 tahata BB4Q farm, fields 0 0 64.270 3.0792 5 483 0.0245 takane K i n high price 10 3 12.375 3.5330 7 513 0.0245 takara $ treasure 16 12 35.093 3.4541 7 549 0.0245 takari fcj& 'y blackmail 17 12 52.810 2.2227 6 537 0.0245 takasa height 10 2 1.439 4.1230 7 568 0.0245 takibi fire 2 I 8.897 1.5911 6 601 0.0245 another country, takoku t e l l 10 3 25.585 3.6340 7 516 0.0245 foreign country takuti housing lot 12 8 0.000 3.5808 5 604 0.0245 continued 233 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
continued Target Words Neighborhood Word Gloss a. 1“ Mora1“ Word + + Pitch Duration frequency Frequency Segments Segments Auditory 2 tamago m egg 2 0 156.420 3.7346 7 514 0.0245 tames i UL experiment 4 3 19.040 2.8274 7 592 0.0245 tanim a sra valley 1 0 70.932 2.8954 5 531 0.0245 tanomi «* request 4 2 142.513 3.3870 7 555 0.0245 tanuki m raccoon dog 7 6 52.285 2.8949 7 505 0.0245 tarako cod’s roe 7 6 12.433 2.1004 7 585 0.0245 tareme 2 2 6.902 1.1461 7 634 0.0245 tasatu murder 11 9 1.120 2.0969 5 648 0.0245 help. aid. tasuke 4 4 7.510 3.5085 7 543 0.0245 Wit assistance tataki on# concrete floor 11 6 4.766 3.2393 7 618 0.0245 tatami * tatami. mat S 5 2.745 3.1550 7 615 0.0245 tawara m straw bag. bale 8 8 71.593 2.8299 7 569 0.0245 tebiki ¥51# guidance 10 4 188.202 2.9128 6 515 0.0173 tebura empty-handed 3 0 78.267 1.6532 6 515 0.0173 teburi ¥*y gesture, sign 12 4 343.777 1.4472 7 431 0.0173 tetuke deposit 8 6 1.424 1.2304 7 595 0.0173 all night, tetuya flts throughout the 0 0 28.420 3.2533 5 558 0.0173 night trifling with, tedam a 4 3 31.227 2.1399 7 571 0.0173 making sport of tedasi ¥tii L interference 7 1 16.502 1.7559 5 595 0.0173 tedori net profit 8 7 86.031 3.0603 7 544 0.0173 tegaki handwriting 11 10 27.591 3.0099 6 598 0.0173 tegami letter 5 4 147.414 4.1512 5 514 0.0173 merit, exploit, tegara 6 5 121.835 2.4298 7 553 0.0173 achievement tegata ¥» note, bill 5 5 4.315 3.2923 7 608 0.0173 severance of tegire 5 5 77.457 0.6021 7 508 0.0173 connections teguti ¥□ trick 6 2 6.926 3.4028 5 590 0.0173 tekiti f t f e enemy's land 12 10 3.663 2.3054 7 535 0.0173 tekubi wrist I 0 84.534 2.7789 6 491 0.0173 temaki hand-rolling 10 8 8.580 1.3617 6 610 0.0173 temoti on hand 7 4 3.671 3.0962 7 633 0.0173 continued 234 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. continued Target Words Neighborhood 3 >> § — 2 ? W ord Gloss E £ S 3 sc a. 
2E » Word Frequency Duration Auditory Segments 3 + 3 temoto in hand, at hand 7 4 8.610 3.3698 7 615 0.0173 tenisu T " — X tennis t 2 133.253 3.6129 7 512 0.0173 tenuki skimp, scamp 18 16 20.899 2.8096 7 601 0.0173 tenisu terrace 0 0 87.472 2.5575 7 569 0.0173 terebi f b f television 0 0 195.777 4.7944 7 464 0.0173 tereya r a t i M shy person 0 0 59.329 1.9294 5 545 0.0173 tesaki finger, tool 15 14 11.593 2.9410 5 572 0.0173 tesita follower 5 3 2.409 1.6232 7 561 0.0173 tesuri handrail 13 8 106.531 2.8382 7 579 0.0173 tesuto f X h test 4 3 17.566 3.8788 7 541 0.0173 tewake dividing the work 8 8 19.553 2.6628 5 5S8 0.0173 tezina ^ 0 0 magic 0 0 131.515 2.4314 5 502 0.0173 tobira m door *> 1 56.564 3.3473 7 502 0.0210 todana w m cupboard, cabinet .> I 159.508 2.1847 4 539 0.0210 todoke n i t report, notice 5 3 24.947 3.6798 7 543 0.0210 tokage h * y lizard 8 6 33.648 2.6222 7 529 0.0210 tokoro Pfr place, spot 9 6 9.229 5.1335 7 528 0.0210 tokoya &B barber shop 6 I 12.075 2.4330 7 558 0.0210 tokugi specialty 10 8 55.990 2.52S9 7 490 0.0210 tomari overnight trip 5 5 258.269 2.7427 7 525 0.0210 tom ato h T h tomato 0 0 29.430 3.3056 7 531 0.0210 tonari m next door 7 5 119.396 4.2578 7 556 0.0210 municipal. toritu S 1 34.565 3.2095 7 540 0.0210 & ±L metropolitan torobi t b ' X slow fire 2 0 322.499 2.9538 5 543 0.0210 tororo t h h grated yam 5 4 88.243 1.5563 7 516 0.0210 tosaka crest, cockscomb I 1 0.730 1.9191 7 621 0.0210 miscellaneous zatumu 0 0 42.877 1.8921 5 552 0.0048 duties syaguti t e n faucet 2 I 0.322 2.6493 4 643 0.0020 syakusya II# the weak 17 12 0.235 3.3747 6 614 0.0020 zasetu m frustration 13 9 4.765 3.3023 5 648 0.0048 zaseki mm seat 10 4 1.847 3.5312 5 654 0.0048 room, drawing zasiki 5 4 1.669 2.5224 7 586 0.0048 mm room continued 235 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
continued Target Words Neighborhood 2 5 W ord Gloss m U m I- I- M ora + + Pitch W ord Frequency D uration ■% Segments Auditory 5 frequency zyasuto VVX h just 5 5 24.763 1.9542 7 520 0.0020 zebura zebra 1 0 55.155 1.3424 3 514 0.0062 zibaku suicidal explosion 21 19 8.564 2.4942 5 600 0.0151 paying the zibara SI« expenses out of 7 6 7.527 2.3617 7 616 0.0151 mv own pocket zibeta i ground 0 0 18.555 2.0792 5 541 0.0151 zitugi n t * exercise 15 6 46.582 2.8395 6 526 0.0151 zitumu mm practical affairs i 2 11.517 3.7945 6 567 0.0151 zituwa mn true story I 1 2.685 2.4518 5 586 0.0151 zigoku mu Hell 19 10 0.834 3.2874 7 644 0.0151 spontaneous. zihatu 12 12 0.000 2.1818 5 711 0.0151 S#§ voluntary zihada item skin, surface 5 1 88.950 2.1106 5 533 0.0151 zihaku sa confession 18 16 0.541 3.3604 5 683 0.0151 self-consciousnes zikaku 38 32 4.351 3.6236 7 612 0.0151 S£ s. awakening zikiso SIS direct appeal 5 1 5.257 2.6821 7 542 0.0151 zikoku mm time 36 8 S.712 3.4221 7 546 0.0151 zimaku r m caption, title 26 19 0.971 3.0449 5 637 0.0151 zimetu s s c self-destruction 5 5 0.000 2.7126 5 695 0.0151 zim isa m&z plainness, quiet I 0 7.604 0.3010 7 597 0.0151 zim oto ifeTC local S 8 0.000 4.5959 7 671 0.0151 zim usyo office 1 0 6.475 4.4637 6 554 0.0151 zinusi i f e i landlord 7 5 10.091 3.4555 5 632 0.0151 ziritu self-support 17 14 6.385 3.8345 7 616 0.0151 doing it for ziriki 9 5 2.476 3.3448 7 659 0.0151 S3) oneself zisatu S 3 suicide 11 to 0.000 3.9156 5 685 0.0151 zisyaku magnet 24 19 0.242 2.8116 5 677 0.0151 zisaku one's own work 31 2 2.446 3.2117 7 559 0.0151 zisoku mm speed per hour 33 11 0.000 3.4519 7 560 0.0151 zisyoku im resignation 36 23 14.504 3.7776 7 705 0.0151 zisyuku S 3 ? self-control 14 9 0.000 3.7230 5 667 0.0151 zitaku § ^ | one's own house 25 23 2.321 4.7193 5 623 0.0151 continued 236 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 
[Appendix C table, concluded (p. 237): final entries, zihitu 'one's own handwriting' through zyuwaki 'receiver'.]

APPENDIX D: Statistics in Experiment 1

Table D.1: Basic model for naming data (fast namers), Experiment 1. F(19, 9713) = 335.969419, p < 0.000001, R2 = 0.396574.
Table D.2: Basic model + Neighborhood density (Segments) for naming data (fast namers), Experiment 1. F(19, 9713) = 336.696782, p < 0.000001, R2 = 0.397091.

Table D.3: Basic model + Neighborhood density (Segments + Pitch) for naming data (fast namers), Experiment 1. F(19, 9713) = 336.232442, p < 0.000001, R2 = 0.396761.
Table D.4: Basic model + Neighborhood density (Auditory) for naming data (fast namers), Experiment 1. F(19, 9713) = 335.969419, p < 0.000001, R2 = 0.396574.

Table D.5: Basic model for naming data (slow namers), Experiment 1.

Table D.6: Basic model + Neighborhood density (Segments) for naming data (slow namers), Experiment 1. F(18, 8949) = 224.482738, p < 0.000001, R2 = 0.311069.
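Each pair of tables here compares a basic model against the same model plus one neighborhood-density predictor. That increment can be checked from the reported R² values alone with the standard nested-model F test. The sketch below is illustrative, not from the dissertation: the helper name is hypothetical, the inputs are the R² values reported for Tables D.1 (basic) and D.2 (+ Segments) for fast namers in Experiment 1, and it assumes one added predictor and 9713 residual degrees of freedom in the full model.

```python
# Nested-model comparison from reported fits: does adding the neighborhood
# predictor significantly improve on the basic model?
# r2_change_f is a hypothetical helper, not a function from the dissertation.

def r2_change_f(r2_reduced, r2_full, df_added, df_resid_full):
    """F statistic for the increment in R-squared between nested OLS models."""
    return ((r2_full - r2_reduced) / df_added) / ((1.0 - r2_full) / df_resid_full)

# Basic model (Table D.1) vs. basic + neighborhood density (Table D.2),
# fast namers, Experiment 1; 1 added predictor, 9713 residual df assumed.
f_change = r2_change_f(0.396574, 0.397091, 1, 9713)
print(round(f_change, 2))
```

With these inputs the increment works out to an F of roughly 8.3 on (1, 9713) degrees of freedom, i.e. a small but reliable improvement, consistent with the pattern the tables report.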
Table D.7: Basic model + Neighborhood density (Segments + Pitch) for naming data (slow namers), Experiment 1. F(18, 8949) = 224.571565, p < 0.000001, R2 = 0.311154.

Table D.8: Basic model + Neighborhood density (Auditory) for naming data (slow namers), Experiment 1. F(18, 8949) = 224.756138, p < 0.000001, R2 = 0.311330.

APPENDIX E: Statistics in Experiment 2

Table E.1: Basic model for naming data (fast namers), Experiment 2.
F(18, 5844) = 57.763697, p < 0.000001, R2 = 0.151044.

Table E.2: Basic model + Neighborhood density (Segments) for naming data (fast namers), Experiment 2. F(19, 5843) = 54.846589, p < 0.000001, R2 = 0.151354.
Table E.3: Basic model + Neighborhood density (Segments + Pitch) for naming data (fast namers), Experiment 2. F(18, 5844) = 57.763697, p < 0.000001, R2 = 0.151044.

Table E.4: Basic model + Neighborhood density (Auditory) for naming data (fast namers), Experiment 2. F(18, 5844) = 57.763697, p < 0.000001, R2 = 0.151044.

Table E.5: Basic model for naming data (slow namers), Experiment 2.
F(13, 4340) = 28.017431, p < 0.000001, R2 = 0.077425.

Table E.6: Basic model + Neighborhood density (Segments) for naming data (slow namers), Experiment 2. F(13, 4340) = 28.017431, p < 0.000001, R2 = 0.077425.

Table E.7: Basic model + Neighborhood density (Segments + Pitch) for naming data (slow namers), Experiment 2. F(13, 4340) = 28.017431, p < 0.000001, R2 = 0.077425.
Table E.8: Basic model + Neighborhood density (Auditory) for naming data (slow namers), Experiment 2. F(14, 4339) = 26.371357, p < 0.000001, R2 = 0.078416.

Table E.9: Basic model for word identification data (fast namers), Experiment 2. F(18, 9781) = 19.209873, p < 0.000001, R2 = 0.034145.

Table E.10: Basic model + Neighborhood density (Segments) for word identification data (fast namers), Experiment 2. F(19, 9780) = 22.807238, p < 0.000001, R2 = 0.042429.

Table E.11: Basic model + Neighborhood density (Segments + Pitch) for word identification data (fast namers), Experiment 2. F(18, 9781) = 24.981747, p < 0.000001, R2 = 0.043953.

Table E.12: Basic model + Neighborhood density (Auditory) for word identification data (fast namers), Experiment 2. F(18, 9781) = 19.209873, p < 0.000001, R2 = 0.034145.

Table E.13: Basic model for word identification data (slow namers), Experiment 2. F(17, 9082) = 17.991411, p < 0.000001, R2 = 0.032580.

Table E.14: Basic model + Neighborhood density (Segments) for word identification data (slow namers), Experiment 2. F(18, 9081) = 22.175697, p < 0.000001, R2 = 0.042105.

Table E.15: Basic model + Neighborhood density (Segments + Pitch) for word identification data (slow namers), Experiment 2. F(18, 9081) = 23.953870, p < 0.000001, R2 = 0.045328.
Table E.15: Basic model + Neighborhood density (Segments+Pitch) for word identification data (slow namers), Experiment 2. F(18, 9081) = 23.953870, p < 0.000001, R2 = 0.045328.

Table E.16: Basic model + Neighborhood density (Auditory) for word identification data (slow namers), Experiment 2. F(17, 9082) = 17.991411, p < 0.000001, R2 = 0.032980.

APPENDIX F: Similarities of Sounds in Noise: MDS Analyses

Introduction

"Phonological similarity" in English is calculated on the basis of form similarity. As explained in Chapter 1, four different neighborhood calculations have been used. Among them, only one calculation considers the phonetic similarity of sounds; it is based on experimentally derived phoneme confusability (Luce, 1986; Luce & Pisoni, 1998). That calculation rests on R. D. Luce's general biased choice rule (R. D. Luce, 1959).
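The way a confusion matrix yields pairwise similarities under the biased choice framework can be sketched as follows. The three-sound confusion matrix below is invented for illustration (it is not data from this dissertation), and the expression s(i, j) = sqrt(p(j|i) p(i|j) / (p(i|i) p(j|j))) is the form commonly used to recover similarities from identification proportions in this framework.

```python
import math

# Invented 3x3 confusion matrix (not from the dissertation):
# conf[i][j] = proportion of trials on which stimulus i drew response j.
conf = [
    [0.80, 0.15, 0.05],
    [0.10, 0.85, 0.05],
    [0.05, 0.10, 0.85],
]

def luce_similarity(conf):
    """Pairwise similarity under the biased choice rule:
    s(i, j) = sqrt(p(j|i) * p(i|j) / (p(i|i) * p(j|j))).
    A sound is maximally similar to itself (1.0); pairs that are
    rarely confused get similarities near 0."""
    n = len(conf)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sim[i][j] = math.sqrt(
                    conf[i][j] * conf[j][i] / (conf[i][i] * conf[j][j])
                )
    return sim

sim = luce_similarity(conf)
```

Because the rule multiplies the two off-diagonal proportions, the resulting matrix is symmetric even when the raw confusions are not.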
Sound confusion matrices were calculated for CVC words in order to characterize the similarities of sounds in English. A basic assumption here is that if two sounds are similar, their confusability should be higher. This sound confusability is incorporated into the neighborhood density calculation in Luce (1986) and Luce & Pisoni (1998). This appendix investigates similarities of sounds in Japanese on the basis of actual experimental data. In a word-identification-in-noise experiment, words are presented to participants in a noisy environment, and the participants are asked to identify them. Confusable sounds should induce more mistakes than less confusable sounds, and if two sounds are similar, the misperceived sound should be very similar to the intended sound. The error patterns in word identification in noise should therefore reveal the similarities of sounds in Japanese. This possibility is explored by analyzing the error patterns in the word-identification-in-noise experiment (Experiment 2).

Data

The segments in the word identification data in Experiment 2 were analyzed. All the responses were transcribed by the author in the same romanization used in the original neighborhood calculations in Experiment 2 and saved in a master file. The written responses were checked against the actual responses recorded onto DAT tapes to make sure that participants had written down their responses accurately. If a participant wrote down something different from what he or she said, the transcription was "corrected" to match the oral response. In this analysis, mispronunciations of accent patterns are ignored, because the focus is on consonant and vowel confusions. Fewer than 1% of the written responses were corrected. In terms of accuracy of accent patterns, 86% of the data were correctly identified (see also §2.3.2.1.4).
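Given a master file of aligned target and response transcriptions, the per-position error tabulation used in the analyses below can be sketched as in this fragment. The two word pairs are invented, not items from Experiment 2, and for simplicity each segment is a single letter here; the dissertation's romanization (with sequences like "sy" or "Q") would require segment-level rather than character-level alignment.

```python
# Hypothetical aligned transcriptions for CVCVCV targets (positions P1..P6;
# odd positions are consonants, even positions vowels). Not actual stimuli.
targets   = ["takeru", "mikado"]   # intended words (romanized)
responses = ["sakeru", "mikato"]   # words the listener reported

errors = [0] * 6                   # error count per position P1..P6
for target, response in zip(targets, responses):
    for pos, (t, r) in enumerate(zip(target, response)):
        if t != r:
            errors[pos] += 1       # any mismatch at a position counts as one error

consonant_errors = errors[0] + errors[2] + errors[4]   # P1, P3, P5
vowel_errors     = errors[1] + errors[3] + errors[5]   # P2, P4, P6
```

Summing the tallies over positions of each type gives the consonant-versus-vowel counts; keeping them separate gives the by-position counts.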
Error analysis

First, repeated-measures ANOVAs were conducted. The error patterns were analyzed in terms of sound categories (consonants or vowels) in a by-subjects analysis (F1) and a by-items analysis (F2). All target words had a CVCVCV structure, so consonants were in the first, third, and fifth positions, and vowels were in the second, fourth, and sixth positions. Thus, the maximum number of errors possible for either segment type was 3 x 700 = 2100 in the subjects analysis and 3 x 27 = 81 in the items analysis. Table F.1 shows the mean number of errors for consonants and vowels in each of the two analyses. Both the subjects and the items analyses showed that errors occurred significantly more often for consonants than for vowels (F1(1, 26) = 870.003, p < 0.001; F2(1, 699) = 409.488, p < 0.001). Next, the data were analyzed in terms of position within the word. Table F.2 shows the mean number of errors by position. For consonants, the positional effect was significant in both the subjects and the items analyses (F1(2, 52) = 461.79, p < 0.0001; F2(2, 1398) = 71.62, p < 0.0001). Paired comparisons between consonant positions showed that the differences between P1 and P3 and between P1 and P5 were significant, but not between P3 and P5 (P1 vs. P3: F1(1, 26) = 578.58, p < 0.0001; F2(1, 699) = 105.85, p < 0.0001; P1 vs. P5: F1(1, 26) = 635.52, p < 0.0001; F2(1, 699) = 91.77, p < 0.0001; P3 vs. P5: F1(1, 26) = 1.669, p > 0.1; F2(1, 699) = 0.23, p > 0.1). For vowels, the positional effect was also significant in both analyses (F1(2, 52) = 95.90, p < 0.0001; F2(2, 1398) = 15.20, p < 0.0001). Paired comparisons between vowel positions showed that the differences among positions were all significant (P2 vs. P4: F1(1, 26) = 161.80, p < 0.0001; F2(1, 699) = 35.78, p < 0.0001; P2 vs.
P6: F1(1, 26) = 73.30, p < 0.0001; F2(1, 699) = 9.86, p < 0.005; P4 vs. P6: F1(1, 26) = 28.03, p < 0.0001; F2(1, 699) = 3.80, p = 0.0515). There are two main findings about the error patterns observed across positions. First, consonant positions induced more errors than vowel positions. This clearly indicates that consonants are more distorted than vowels in noise. Second, positional effects were also observed within the consonant and vowel positions. For consonant positions, P1 induced more errors than the other two positions (P3 and P5), which induced errors almost equally. For vowel positions, P2 induced more errors than P4 and P6, and P6 induced more errors than P4. Since P6 is the word-final position, in which sounds are often deleted cross-linguistically, it is easy to believe that sounds in this position were often misidentified. The data showed that sounds at the end of utterances are "more distorted" (they are softer, particularly if F0 is falling and/or low). At the beginnings of words, by contrast, the larger error rate arises from information structure rather than acoustics: there is no preceding context to help predict the upcoming segment, so transitional probabilities cannot help there. The results suggest that Japanese listeners made a distinction between consonants and vowels, natural classes that are specified by the phonological features [±sonorant] and [±consonantal].

Similarities of sounds in the MDS analyses

This section discusses similarities of sounds in noise. First, similarities of sounds are computed from the error patterns of the participants' responses. Then, in order to explore the underlying structure, multidimensional scaling is applied to the similarities.
The confusion matrices derived from the error patterns in Experiment 2 provide information about which pairs of sounds are more confusable than others. The idea here is that if two sounds are similar, they are also confusable in noise. If the similarities of sounds are submitted to multidimensional scaling (MDS), the output should show the auditory similarities among sounds, and if Japanese listeners classify the sounds on the basis of auditory features, those features should emerge as meaningful dimensions of the MDS solution. MDS is a statistical technique that is useful for uncovering meaningful organization in complex sets of data. It was originally proposed to help understand people's judgments of the similarity of the members of a set of objects: a matrix of similarity judgments is subjected to multidimensional scaling to create a multidimensional map. The map that results from an MDS treatment of a set of data is essentially a representation of the psychological relationships among the sounds: the more similar two sounds are, the smaller the psychological distance between them in the MDS map should be. That is, if the "auditory space" is "warped" by linguistic experience (as much cross-linguistic perception work suggests), then this is "auditory similarity"; I do not mean just a general, universal "auditory space" here. This MDS analysis does not take position into account, so it is closer to "auditory similarity" than the raw counts in Table F.1. Finally, the resulting dimensions must be interpreted in order to determine the number of dimensions that gives the best final MDS solution for the data. In this study, similarities of sounds were first analyzed based on the error patterns. Correlations served as the measure of similarity: correlations were computed between the intended sounds that participants actually heard and the perceived sounds that they reported hearing. The more similar two sounds are, the higher the correlation should be.
The correlations used in this analysis form a matrix of Pearson product-moment correlation coefficients. Pearson correlations vary between -1 and +1. A value of 0 indicates that neither of two variables can be predicted from the other by a linear equation; a correlation of 1 or -1 indicates that one variable can be predicted perfectly by a linear function of the other. The calculated correlations were submitted to multidimensional scaling, and the output was a multidimensional "map" providing an auditory similarity space for the sounds of Japanese.

Vowel analysis

First, the vowels and consonants of the responses were tabulated separately. Responses including a moraic nasal or a palatalized /n/ were excluded from the analyses, since no corresponding input sounds existed; that is, the matrices were designed to make similarity symmetrical. The vowel data and the consonant data were converted into proportions, since the frequencies of different segments in the stimulus words were not balanced (Table F.3). The data were then submitted to Pearson correlation analyses to compute similarities among the vowels. Table F.4 shows the similarity matrix of vowels output by the correlation analyses. This similarity matrix was then submitted to MDS in order to explore the underlying structure. Figure F.1 shows the two-dimensional scaling solution for the similarity of vowels, which provided the best fit for the vowel data. The R2 value for this solution is 0.99957 and the stress is 0.00654. Dimension 1, graphed along the horizontal axis, represents vowel height. Dimension 2, graphed along the vertical axis, is interpreted as representing the degree of backness.

Consonant analysis

The same procedure for creating a multidimensional scaling solution was followed to analyze consonant similarity (Tables F.5 and F.6).
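The proportions-to-correlations-to-map pipeline just described can be sketched as follows. All numbers below are invented (they are not the values in Table F.3), and classical (Torgerson) MDS, together with one common similarity-to-distance conversion, stands in here for the scaling procedure actually used in the dissertation.

```python
import numpy as np

# Invented response-proportion matrix: rows are intended vowels,
# columns are perceived vowels (each row sums to 1).
props = np.array([
    [0.80, 0.10, 0.05, 0.03, 0.02],   # /a/
    [0.08, 0.75, 0.10, 0.04, 0.03],   # /e/
    [0.04, 0.12, 0.74, 0.02, 0.08],   # /i/
    [0.06, 0.05, 0.02, 0.82, 0.05],   # /o/
    [0.03, 0.04, 0.10, 0.06, 0.77],   # /u/
])

# Pearson correlations between the response profiles of each pair of
# intended sounds: similar sounds draw similar response distributions.
sim = np.corrcoef(props)

# Convert similarities to distances, then apply classical MDS:
# double-center the squared distances and keep the top two eigenvectors.
dist = np.sqrt(np.maximum(2.0 * (1.0 - sim), 0.0))
n = dist.shape[0]
J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
B = -0.5 * J @ (dist ** 2) @ J               # double-centered Gram matrix
eigvals, eigvecs = np.linalg.eigh(B)
order = np.argsort(eigvals)[::-1]            # largest eigenvalues first
top = order[:2]
coords = eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))
# Each row of `coords` places one vowel in a two-dimensional similarity map.
```

Interpreting the two recovered axes against known phonetic dimensions (here, height and backness) is then done by inspecting which sounds fall at each end of each axis, as in Figures F.1 and F.2.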
Figure F.2 shows the two-dimensional scaling solution for the similarity of the consonants identified in the stimuli in Experiment 2. The R2 value for this solution is 0.891 and the stress is 0.1456, which accounts for a good amount of the variation in the mapping. The best three-dimensional solution did not yield a substantially better R2 value (0.902), and its dimensions were difficult to interpret in terms of phonological features. Therefore, I selected the two-dimensional scaling solution as the optimal one in this case. Dimension 1 represents the feature [±voice]. Dimension 2 seems to reflect another feature, [±sonorant]. Among the consonants, nasals (/m/, /n/, /g/), glides (/j/, /w/), and liquids (/r/) are generally considered sonorants from an articulatory point of view. However, the data suggest that not all of these sounds are auditorily treated as sonorants. The phonetic realizations of /w/ and /r/ are not treated as sonorants in this auditory-based psychological space: [w] is located near [kʲ], apart from the nasals and [j], and [r] is very close to [d] and [b], suggesting that it is auditorily very similar to them. Recall that in Otake et al. (1996b), Dutch listeners were not able to respond to Japanese [r] with the visual target 'r'; however, when the visual target was changed from 'r' to 'd', they were able to detect [r] in a phoneme monitoring experiment. This could be a piece of evidence that the phonetic realization of /r/ is very similar to a flap. It is natural to treat /g/ as a sonorant once we remember that /g/ = [ŋ] in Tokyo Japanese: /g/ in onset position is realized as [ŋ] after a vowel or a moraic nasal in Tokyo Japanese, and there is a shift from [ŋ] to [g] apparently now in progress (Hibiya, 1995). Since all the counts for /g/ in this analysis occur in this phonological environment, 'g' naturally appears as [ŋ].
However, in the MDS solution, [ŋ] stands apart from the cluster of nasals, lying about midway between the nasal cluster (/n/ and /m/) and the cluster of voiced stops (/b/ and /d/). In the future, /g/ is expected to move toward the voiced stops, so that the voiced consonants would be classified more clearly in terms of [±sonorant].

Discussion and Conclusion

The current analyses of the error data in the word-identification-in-noise experiment (Experiment 2) demonstrated that Japanese listeners perceived sounds in terms of natural classes in a noisy condition. Confusion matrices of vowels and consonants submitted to MDS enabled us to see how Japanese listeners perceived sounds and arranged them within an auditory similarity map. The MDS solutions for both consonants and vowels provided interpretable dimensions, all of which turned out to be phonological features. The outputs of the MDS represent auditory similarity spaces for consonants and vowels. The five Japanese vowels were classified in terms of height and backness. Height is related to an acoustic parameter, the frequency of the first formant (F1); it is also related to sonority, and the bigger distance between /i, u/ and /e, o/ than between /e, o/ and /a/ supports this. Backness is an acoustic property related to the second formant (F2). Similarly, the phonological features that emerged from the MDS for consonants were auditory features: voice and sonorant. Voicing and sonority are not independent: both are based on the activity of the vocal folds. The dimensions used to describe the consonants and vowels are thus basic features of sounds, all of which are related to the sonority scale of sounds in the language. In conclusion, the error patterns in the word-identification-in-noise data revealed that similarities of sounds are based on the sonority scale of sounds in Japanese.
Further, the data suggest that properties of sounds that differ from language to language were mapped onto the multidimensional scaling solutions. For example, /r/ is a flap in Japanese, so it is located near the voiced stops in the auditory space shown by MDS (Figure F.2). The transition from [ŋ] to [g] for /g/ now in progress is also nicely captured by the error patterns of the word identification data. The methodology used here should be useful for languages for which it is difficult to collect confusion matrices directly, as has been done for English.

                    Consonants         Vowels
Subjects analysis   263 (SD: 46.57)    96 (SD: 20.99)
Items analysis      10 (SD: 10)        4 (SD: 5)

Table F.1: The mean number of errors for consonants and vowels in each of the two analyses.

                    Consonants          Vowels
                    P1    P3    P5      P2    P4    P6
Subjects analysis   128   69    66      44    22    30
Items analysis      5     3     3       2     1     1

Table F.2: The mean number of errors in terms of word positions.

                         Perceived sounds
              a         e         i         o         u
    a     0.433048  0.170940  0.102564  0.225071  0.068376
    e     0.046674  0.698950  0.047841  0.138856  0.067678
    i     0.052571  0.110857  0.146286  0.049143  0.641143
    o     0.028803  0.107111  0.027003  0.809181  0.027903
    u     0.034612  0.044902  0.753976  0.065482  0.101029

Table F.3: Proportions of responses for vowels (rows: intended sounds; columns: perceived sounds).

         a        e        i        o
    e  -0.181
    i  -0.453   -0.245
    o  -0.096   -0.095   -0.205
    u  -0.44    -0.43    -0.046   -0.22

Table F.4: A similarity matrix for vowels.

Figure F.1: MDS for vowels. Dimensions 1 and 2 represent vowel height (F1) and backness (F2), respectively.
Table F.5: Proportions of responses for consonants (rows: intended sounds; columns: perceived sounds).

Table F.6: A similarity matrix for consonants.
Figure F.2: MDS for consonants. SS = [ʃ], C = [ts], CC = [tʃ], KY = [kʲ], Z = [ʒ], Y = [j]. Dimensions 1 and 2 represent [±voice] and [±sonorant], respectively.

APPENDIX G: Semantic categories

The twenty-eight semantic categories, definitions and English translations of the semantic categories, the 700 words that belong to the categories, and descriptive statistics are listed below.
Careers I

Word           English gloss     Mean    SE      SD
daitoryou      president         1       0       0
bijinesumaN    businessman       1       0       0
kaikeisi       accountant        1       0       0
yakuzaisi      pharmacist        1       0       0
giiN           congressman       1       0       0
saibaNkaN      judge             1       0       0
kaNgohu        nurse             1       0       0
beNgosi        lawyer            0.9666  0.0333  0.1825
haisya         dentist           0.9666  0.0333  0.1825
gaka           painter           0.9666  0.0333  0.1825
syuhu          housewife         0.9666  0.0333  0.1825
syoubousi      fireman           0.9666  0.0333  0.1825
kyouzyu        professor         0.9666  0.0333  0.1825
seerusumaN     salesman          0.9666  0.0333  0.1825
isya           doctor            0.9333  0.0463  0.2537
gizyutusya     technician        0.9333  0.0463  0.2537
sensei         teacher           0.9333  0.0463  0.2537
hobo           nursery staff     0.9333  0.0463  0.2537
kaseihu        housemaid         0.9     0.0557  0.3051
keikaN         policeman         0.9     0.0557  0.3051
gakusei        student           0.9     0.0557  0.3051
daiku          carpenter         0.8333  0.0692  0.3790
souri          prime minister    0.8     0.0742  0.4068
geizyutuka     artist            0.8     0.0742  0.4068
nouka          farmer            0.6     0.0909  0.4982

Careers II

Word           English gloss            Mean    SE      SD
sinarioraitaa  scenario writer          1       0       0
tareNto        television personality   1       0       0
dezainaa       designer                 1       0       0
zyoyuu         actress                  1       0       0
keNsatukaN     prosecutor               1       0       0
kaNtoku        manager                  1       0       0
sutairisuto    stylist                  1       0       0
zimuiN         office worker            1       0       0
komedjiaN      comedian                 1       0       0
seizika        politician               0.9666  0.0333  0.1825
sutyuwaadesu   stewardess               0.9666  0.0333  0.1825
sutaNtomaN     stuntman                 0.9666  0.0333  0.1825
pairoQto       pilot                    0.9666  0.0333  0.1825
kameramaN      cameraman                0.9666  0.0333  0.1825
syatyou        president                0.9666  0.0333  0.1825
kyouiN         faculty                  0.9666  0.0333  0.1825
daNyuu         actor                    0.9333  0.0463  0.2537
yakusya        performer                0.9333  0.0463  0.2537
anauNsaa       anchorman                0.9333  0.0463  0.2537
rakugoka       comic storyteller        0.9333  0.0463  0.2537
kasyu          singer                   0.9     0.0557  0.3051
kooti          coach                    0.9     0.0557  0.3051
repootaa       reporter                 0.8333  0.0692  0.3790
asisutaNto     assistant                0.7333  0.0821  0.4497
koQku          cook                     0.6333  0.0894  0.4901

Colors

Word           English gloss     Mean    SE      SD
murasaki       purple            1       0       0
haiiro         gray              1       0       0
giNiro         silver            1       0       0
yamabukiiro    bright yellow     1       0       0
kiNiro         gold              1       0       0
buruu          blue              1       0       0
piNku          pink              1       0       0
buraQku        black             1       0       0
ao             blue              1       0       0
sirubaa        silver            1       0       0
kuro           black             1       0       0
kiiro          yellow            1       0       0
aka            red               1       0       0
burauN         brown             0.9666  0.0333  0.1825
siro           white             0.9666  0.0333  0.1825
sorairo        sky blue          0.9666  0.0333  0.1825
ieroo          yellow            0.9666  0.0333  0.1825
tyairo         brown             0.9666  0.0333  0.1825
daidaiiro      tangerine         0.9666  0.0333  0.1825
guNzyoo        cobalt blue       0.9333  0.0463  0.2537
reQdo          red               0.9333  0.0463  0.2537
goorudo        gold              0.9333  0.0463  0.2537
howaito        white             0.9333  0.0463  0.2537
tutiiro        soil color        0.9     0.0557  0.3051
kusairo        grass green       0.9     0.0557  0.3051
Main dishes

Word           English gloss              Mean    SE      SD
soba           soba noodles               1       0       0
kareeraisu     curry rice                 1       0       0
teNpura        tempura                    1       0       0
tyaahaN        fried rice                 1       0       0
toNkatu        pork cutlet                1       0       0
koroQke        croquette                  1       0       0
haNbaagu       hamburg steak              1       0       0
saNdoiQti      sandwich                   1       0       0
soumeN         somen noodles              1       0       0
harumaki       spring roll                1       0       0
yudouhu        boiled tofu                1       0       0
suteeki        steak                      1       0       0
karaage        fried chicken              1       0       0
gyouza         dumpling                   1       0       0
kamamesi       rice, meat, and vegetables cooked together in a small pot   1   0   0
gurataN        gratin                     1       0       0
supageQti      spaghetti                  0.9666  0.0333  0.1825
udoN           udon noodles               0.9666  0.0333  0.1825
raameN         ramen noodles              0.9666  0.0333  0.1825
sityuu         stew                       0.9666  0.0333  0.1825
syuumai        steamed dumpling           0.9333  0.0463  0.2537
sukiyaki       sukiyaki                   0.9333  0.0463  0.2537
piza           pizza                      0.9     0.0557  0.3051
sekihaN        festive red rice           0.8666  0.0631  0.3457
meNti          mincemeat                  0.7     0.0850  0.4660

Dessert

Animals

Word           English gloss     Mean    SE      SD
uma            horse             1       0       0
tiNpaNzi       chimpanzee        1       0       0
simauma        zebra             1       0       0
buta           pig               1       0       0
rakuda         camel             1       0       0
kiriN          giraffe           1       0       0
usi            cow               1       0       0
usagi          rabbit            1       0       0
raioN          lion              1       0       0
sika           deer              1       0       0
koara          koala             0.9666  0.0333  0.1825
saru           monkey            0.9666  0.0333  0.1825
gorira         gorilla           0.9666  0.0333  0.1825
neko           cat               0.9666  0.0333  0.1825
zou            elephant          0.9666  0.0333  0.1825
kaNgaruu       kangaroo          0.9666  0.0333  0.1825
tora           tiger             0.9666  0.0333  0.1825
kaba           hippo             0.9666  0.0333  0.1825
risu           squirrel          0.9666  0.0333  0.1825
anaguma        badger            0.9666  0.0333  0.1825
yagi           goat              0.9666  0.0333  0.1825
inu            dog               0.9666  0.0333  0.1825
kuma           bear              0.9666  0.0333  0.1825
oukami         wolf              0.9333  0.0463  0.2537
yamaneko       wild cat          0.9333  0.0463  0.2537

Grammatical terms

Subjects of study

Word           English gloss           Mean    SE      SD
suugaku        mathematics             1       0       0
bizyutu        art                     1       0       0
rika           natural science         1       0       0
buturi         physics                 1       0       0
saNsuu         arithmetic              1       0       0
taiiku         physical education      1       0       0
oNgaku         music                   1       0       0
doutoku        moral education         0.9666  0.0333  0.1825
seibutu        biology                 0.9666  0.0333  0.1825
eigo           English                 0.9666  0.0333  0.1825
seiyousi       European history        0.9666  0.0333  0.1825
zukou          manual-art class        0.9666  0.0333  0.1825
tiri           geography               0.9666  0.0333  0.1825
rekisi         history                 0.9666  0.0333  0.1825
gizyutu        craft                   0.9333  0.0463  0.2537
kagaku         chemistry               0.9333  0.0463  0.2537
syakai         social science          0.9333  0.0463  0.2537
kaNbuN         Chinese classics        0.8666  0.0631  0.3457
koteN          Japanese classics       0.8666  0.0631  0.3457
hokeN          health care             0.8666  0.0631  0.3457
syodou         Japanese calligraphy    0.8333  0.0692  0.3790
sekibuN        integral calculus       0.8     0.0742  0.4068
bibuN          differential calculus   0.7     0.0850  0.4660
toukei         statistics              0.5333  0.0926  0.5074
syuukyou       religion                0.4333  0.0920  0.5040

Spices
badger 0.9666 0.0333 0.1825 yagi goat 0.9666 0.0333 0.1825 inu ■k dog 0.9666 0.0333 0.1825 kuma m bear 0.9666 0.0333 0.1825 oukami f t wolf 0.9333 0.0463 0.2537 yamaneko Uift wild cat 0.9333 0.0463 0.2537 279 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (Grammatical terms) 'j-ii rx^ffllSj t*T. #. SIS^XcD^W lr. B*f5 280 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (Subjects of study) c Word Mean SE SD suugaku mathematics I 0 0 bizyutu Hffi art 1 0 0 rika g?4 natural science I 0 0 buturi mm physics 1 0 0 saNsuu f t t t arithmetic 1 0 0 taiiku <** physical education 1 0 0 oNgaku nm music I 0 0 doutoku mm moral 0.9666 0.0333 0.1825 seibutu ±m biology 0.9666 0.0333 0.1825 eigo £IS English 0.9666 0.0333 0.1825 seiyousi ®*5& European history 0.9666 0.0333 0.1825 zukou 1 1 manual-art class 0.9666 0.0333 0.1825 tin mm geography 0.9666 0.0333 0.1825 rekisi E5& history 0.9666 0.0333 0.1825 gizyutu Stffi craft 0.9333 0.0463 0.2537 kagaku tb^ chemistry 0.9333 0.0463 0.2537 syakai social science 0.9333 0.0463 0.2537 kaNbuN a x Chinese classic 0.8666 0.0631 0.3457 koteN £24 Japanese classic 0.8666 0.0631 0.3457 hokeN ftfll health care 0.8666 0.0631 0.3457 syodou mm Japanese calligraphy 0.8333 0.0692 0.3790 sekibuN «» integral calculus 0.8 0.0742 0.4068 bibuN differential calculus 0.7 0.0850 0.4660 toukei tttt statistics 0.5333 0.0926 0.5074 syuukyou mm religionw 0.4333 0.0920 0.5040 281 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. hOftttltKfcfrtl& ISlfc** (Spices) Z(DZfay'7 Xfttllt* ZZXlt. ttg-P xIf- h 282 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Id 'C l (Objects found in the houses) zcDyaytafiTdv-i* r*mv3tt>ti6mi v t. mmniksis,. **& t\ - « * ZtlfrbWl'Xl'tztzK -lzMTifrt'5fr*m7LTl'tzt££2+. fcL. 
Hlwjtr#fc*B^C Word Mean SE SD deNsireNzi microwave 1 0 0 suihaNki f tC S rice cooker 1 0 0 koNpyuutaa 3 1 /tf a —* — computer I 0 0 doraiyaa dryer I 0 0 faQkusu fax I 0 0 sutereo X t Is* stereo I 0 0 toosutaa 1----Z 5 — toaster 1 0 0 teeburu T — Jjl* table 1 0 0 bideo t 'x * video 0.9666 0.0333 0.1825 tukue *1 study desk 0.9666 0.0333 ! 0.1825 tokei firtt clock 0.9666 0.0333 0.1825 reizouko ;*«* fridge 0.9666 0.0333 0.1825 sofaa 7 7 7 - sofa 0.9666 0.0333 0.1825 deNwa «t£ telephone 0.9333 0.0463 0.2537 seNpuuki a n a fan 0.9333 0.0463 0.2537 hoNdana book shelf 0.9333 0.0463 0.2537 taNsu mm drawer 0.9333 0.0463 0.2537 beQto h bed 0.9333 0.0463 0.2537 mikisaa mixer 0.9333 0.0463 0.2537 raNpu lamp 0.9 0.0557 0.3051 isu chair 0.9 0.0557 0.3051 puriNtaa printer 0.9 0.0557 0.3051 airoN 7-f □ > iron 0.9 0.0557 0.3051 zyuutaN mm carpet 0.8333 0.0692 0.3790 zyuusaa y a - t - juice mixer 0.8 0.0742 0.4068 283 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. —V (Sports) z < n ~ j u u — ii r x ^ —"j\ x t a ztifrz>wi'Xi'tzt£<$.f§tfz(Dt)mzi ‘YES’ ‘NO' <0/15* > £ Hi 3fc § /£ ( + # < If L T < f£ £ L'o ^fa'U -li T?-f0 -ttuTMi^MiLTX The category for this block is “sports.” Your task is to determine whether the words you are going to hear in this block belong to this category or not. If you think that they belong to this category, press 'YES,’ otherwise press ‘NO’ as soon as possible. The category for this block is “sports." Please prepare for the block. 
Word Mean SE SD haNdobooru / \ > Ktf—Jl> handball I 0 0 bokusiNgu boxing 1 0 0 sofutobooru 7 7 F tf—)\, softball 1 0 0 sumou mm sumo 1 0 0 wrestling resuriNgu LsZ'J wrestling I 0 0 hougaNnage ffiA ttlf shot put 1 0 0 taisou <*i§ gymnastics I 0 0 taQkyuu Ping-Pong I 0 0 booriNgu bowling 1 0 0 saQkaa -fyvh — soccer 1 0 0 bareebooru /<\s—ft—JL’ volleyball I 0 0 ragubii y ' f t f - rugby I 0 0 goruhu golf 0.9666 0.0333 0.1825 rikuzyou m± track and field 0.9666 0.0333 0.1825 batomiNtoN badminton 0.9666 0.0333 0.1825 zyuudou mm judo 0.9666 0.0333 0.1825 sukeeto 7>*r— h skating 0.9666 0.0333 0.1825 suiei swimming 0.9666 0.0333 0.1825 basukeQtobooru M X 'r'y hTt?— basketball 0.9666 0.0333 0.1825 marasoN ■7 7 7 7 marathon 0.9666 0.0333 0.1825 yakyuu »** baseball 0.9333 0.0463 0.2537 sukii X ^ r- skiing 0.9333 0.0463 0.2537 bootakatobi pole vault 0.9 0.0557 0.3051 takatobi high jump 0.9 0.0557 0.3051 habatobi long jump 0.8666 0.0631 0.3457 284 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 1£ (Flowers) r^Ej X'-f0 ztlfr ZNlL'XL'tztz < T=1 'J -lz &+&1)'t’5fr£%z.XL'tzt£££ta t l . Bcx.r#fcJ|tHA%a)«ll9T*ft-6ii^tt •YES’ ©*$:/£. 
Word         Gloss                      Mean    SE      SD
gaabera      Transvaal daisy, gerbera   1       0       0
suiitopii    sweet pea                  1       0       0
asagao       morning glory              1       0       0
kiNmokusei   fragrant olive             1       0       0
bara         rose                       1       0       0
himawari     sunflower                  1       0       0
rabeNdaa     lavender                   1       0       0
tyuuriQpu    tulip                      1       0       0
botaN        peony                      1       0       0
kaaneesyoN   carnation                  0.9666  0.0333  0.1825
huriizia     freesia                    0.9666  0.0333  0.1825
tubaki       camellia                   0.9666  0.0333  0.1825
suiseN       daffodil                   0.9666  0.0333  0.1825
yuri         lily                       0.9666  0.0333  0.1825
sakura       cherry blossom             0.9666  0.0333  0.1825
maagareQto   marguerite                 0.9666  0.0333  0.1825
yuugao       moonflower                 0.9666  0.0333  0.1825
riNdou       gentian                    0.9333  0.0463  0.2537
tutuzi       azalea                     0.9333  0.0463  0.2537
kiku         chrysanthemum              0.9333  0.0463  0.2537
guraziorasu  gladiolus                  0.9     0.0557  0.3051
ayame        iris                       0.9     0.0557  0.3051
paNzii       pansy                      0.9     0.0557  0.3051
rairaQku     lilac                      0.9     0.0557  0.3051
zeraniumu    geranium                   0.6333  0.0894  0.4901

(Fruits)

Word         Gloss               Mean    SE      SD
painaQpuru   pineapple           1       0       0
momo         peach               1       0       0
piiti        peach               1       0       0
nasi         Japanese pear       1       0       0
masukaQto    muscat              1       0       0
remoN        lemon               1       0       0
maNgoo       mango               1       0       0
budou        grape               1       0       0
suika        watermelon          1       0       0
biwa         loquat              1       0       0
kiui         kiwi                1       0       0
zakuro       pomegranate         1       0       0
banana       banana              1       0       0
mikaN        tangerine           1       0       0
papaiya      papaya              1       0       0
meroN        melon               1       0       0
younasi      pear                1       0       0
riNgo        apple               1       0       0
itigo        strawberry          0.9666  0.0333  0.1825
apurikoQto   apricot             0.9666  0.0333  0.1825
anzu         apricot             0.9666  0.0333  0.1825
itiziku      fig                 0.9333  0.0463  0.2537
reezuN       raisin              0.9     0.0557  0.3051
kaki         Japanese persimmon  0.8666  0.0631  0.3457
kuri         chestnut            0.8     0.0742  0.4068

(Parts of the buildings)
Word         Gloss        Mean    SE      SD
heya         room         1       0       0
okuzyoo      rooftop      1       0       0
kaidaN       stair        1       0       0
yane         roof         1       0       0
siNsitu      bedroom      1       0       0
erebeetaa    elevator     1       0       0
esukareetaa  escalator    1       0       0
hooru        hall         0.9666  0.0333  0.1825
ima          living room  0.9666  0.0333  0.1825
geNkaN       entrance     0.9666  0.0333  0.1825
seNmeNzyo    lavatory     0.9666  0.0333  0.1825
teNzyoo      ceiling      0.9666  0.0333  0.1825
doa          door         0.9666  0.0333  0.1825
robii        lobby        0.9333  0.0463  0.2537
mado         window       0.9333  0.0463  0.2537
kabe         wall         0.9333  0.0463  0.2537
garasu       glass        0.9333  0.0463  0.2537
yuka         floor        0.9333  0.0463  0.2537
hasira       pillar       0.9333  0.0463  0.2537
daidoko      kitchen      0.9     0.0557  0.3051
rouka        corridor     0.9     0.0557  0.3051
osiire       closet       0.9     0.0557  0.3051
syosai       study room   0.9     0.0557  0.3051
tika         basement     0.8666  0.0631  0.3457
syoumei      lighting     0.8333  0.0692  0.3790

(School items)

Word       Gloss           Mean    SE      SD
boNdo      bond            1       0       0
buNdoki    protractor      1       0       0
hude       brush           1       0       0
kaQtaa     cutter          1       0       0
iroeNpitu  colored pencil  1       0       0
teepu      Scotch tape     1       0       0
kesigomu   eraser          1       0       0
pareQto    palette         1       0       0
eNpitu     pencil          1       0       0
koNpasu    compasses       1       0       0
gayousi    drawing paper   0.9666  0.0333  0.1825
kureyoN    crayon          0.9666  0.0333  0.1825
hudebako   pencil case     0.9666  0.0333  0.1825
maziQku    marker          0.9666  0.0333  0.1825
raNdoseru  satchel         0.9666  0.0333  0.1825
sitaziki   desk pad        0.9666  0.0333  0.1825
enogu      paints          0.9666  0.0333  0.1825
nooto      notebook        0.9666  0.0333  0.1825
hasami     scissors        0.9333  0.0463  0.2537
kyookasyo  textbook        0.9333  0.0463  0.2537
maakaa     highlighter     0.9333  0.0463  0.2537
kabaN      bag             0.9     0.0557  0.3051
uwabaki    slippers        0.8666  0.0631  0.3457
suzuri     inkstone        0.7333  0.0821  0.4497
bousi      hat, cap        0.6     0.0909  0.4982

(Living creatures in the water, and seafood) I

(Living creatures in the water, and seafood) II

Word      Gloss                           Mean    SE      SD
tatiuo    scabbard fish                   1       0       0
saNma     mackerel pike                   1       0       0
hotate    scallop                         1       0       0
hoQke     Atka mackerel                   1       0       0
wakasagi  pond smelt                      1       0       0
buri      yellowtail                      1       0       0
hamaguri  clam                            1       0       0
hugu      globefish                       1       0       0
uni       sea urchin                      1       0       0
akagai    ark shell                       1       0       0
nizimasu  rainbow trout                   1       0       0
awabi     ear shell                       0.9666  0.0333  0.1825
ikura     salmon roe                      0.9666  0.0333  0.1825
sake      salmon                          0.9666  0.0333  0.1825
iwasi     sardine                         0.9666  0.0333  0.1825
aNkou     angler                          0.9333  0.0463  0.2537
ayu       ayu, Japanese river trout       0.9333  0.0463  0.2537
ika       cuttlefish, squid               0.9     0.0557  0.3051
kaNpati   a kind of fish                  0.8333  0.0692  0.3790
hamo      conger                          0.8333  0.0692  0.3790
ainame    greenling                       0.7666  0.078   0.4301
isidai    parrot fish                     0.7333  0.0821  0.4497
okoze     stingfish                       0.7     0.0850  0.4660
koti      flathead                        0.5     0.0928  0.5085
isaki     grunt                           0.4666  0.0926  0.5074

(Vegetables and beans) I
The category for this block is "vegetables." All kinds of vegetables (root vegetables, leaf vegetables, beans, and summer, autumn, or winter vegetables) belong to this category. If you think that the words you are going to hear in this block belong to this category, press 'YES'; otherwise press 'NO' as soon as possible. The category for this block is "vegetables." Please prepare for the block.

Word         Gloss                       Mean    SE      SD
niNziN       carrot                      1       0       0
daizu        soy bean                    1       0       0
guriiNpiisu  green pea                   1       0       0
asuparagasu  asparagus                   1       0       0
houreNsou    spinach                     1       0       0
satoimo      taro                        1       0       0
piimaN       pimento                     1       0       0
tiNgeNsai    qing-geng-cai               1       0       0
aona         greens                      1       0       0
syuNgiku     garland chrysanthemum       1       0       0
takenoko     bamboo shoot                1       0       0
kyuuri       cucumber                    0.9666  0.0333  0.1825
satumaimo    sweet potato                0.9666  0.0333  0.1825
tamanegi     onion                       0.9666  0.0333  0.1825
nasu         eggplant                    0.9666  0.0333  0.1825
soramame     broad bean                  0.9666  0.0333  0.1825
gobou        burdock                     0.9333  0.0463  0.2537
niNniku      garlic                      0.9333  0.0463  0.2537
myouga       Japanese ginger             0.9333  0.0463  0.2537
siitake      shiitake mushroom           0.9333  0.0463  0.2537
aomame       green pea                   0.9333  0.0463  0.2537
aoziso       a kind of beefsteak plant   0.9333  0.0463  0.2537
sisitou      green pepper                0.8666  0.0631  0.3457
kabu         turnip                      0.8666  0.0631  0.3457
okura        okra                        0.7666  0.078   0.4301

(Vegetables and beans) II
The category for this block is "vegetables." All kinds of vegetables (root vegetables, leaf vegetables, beans, and summer, autumn, or winter vegetables) belong to this category. If you think that the words you are going to hear in this block belong to this category, press 'YES'; otherwise press 'NO' as soon as possible. The category for this block is "vegetables." Please prepare for the block.

Word         Gloss                      Mean    SE      SD
karihurawaa  cauliflower                1       0       0
morokosi     corn                       1       0       0
hakusai      Chinese cabbage            1       0       0
negi         green onion                1       0       0
nagaimo      yam                        1       0       0
komatuna     komatsuna greens           1       0       0
zyagaimo     potato                     1       0       0
matutake     matsutake mushroom         1       0       0
retasu       lettuce                    1       0       0
kyabetu      cabbage                    1       0       0
eNdou        pea                        0.9666  0.0333  0.1825
iNgeN        kidney bean                0.9666  0.0333  0.1825
reNkoN       lotus root                 0.9666  0.0333  0.1825
daikoN       daikon radish              0.9666  0.0333  0.1825
edamame      green soybean              0.9666  0.0333  0.1825
enoki        enoki mushroom             0.9666  0.0333  0.1825
buroQkorii   broccoli                   0.9333  0.0463  0.2537
nazuna       shepherd's purse           0.9     0.0557  0.3051
syouga       ginger                     0.9     0.0557  0.3051
azuki        small red bean             0.8666  0.0631  0.3457
kuresoN      watercress                 0.8     0.0742  0.4068
udo          udo                        0.8     0.0742  0.4068
huki         Japanese butterbur         0.7666  0.078   0.4301
takana       a kind of Chinese cabbage  0.7     0.0850  0.4660
asatuki      chives                     0.6666  0.0875  0.4794

(Birds)

(Body Parts) I

Word  Gloss  Mean  SE  SD
kao   face   1     0   0
me    eye    1     0   0
ha    teeth  1     0   0

(Body Parts) II

(Chemical elements)
Word         Gloss       Mean    SE      SD
magunesiumu  magnesium   1       0       0
titaN        titanium    1       0       0
aruminiumu   aluminum    1       0       0
karushiumu   calcium     1       0       0
taNso        carbon      1       0       0
aeN          zinc        1       0       0
ritiumu      lithium     1       0       0
huQso        fluorine    1       0       0
suiso        hydrogen    1       0       0
tiQso        nitrogen    1       0       0
eNso         chlorine    1       0       0
uraN         uranium     1       0       0
heriumu      helium      0.9666  0.0333  0.1825
natoriumu    sodium      0.9666  0.0333  0.1825
kadomiumu    cadmium     0.9666  0.0333  0.1825
iou          sulfur      0.9666  0.0333  0.1825
kariumu      potassium   0.9666  0.0333  0.1825
bariumu      barium      0.9666  0.0333  0.1825
saNso        oxygen      0.9333  0.0463  0.2537
keiso        silicon     0.9     0.0557  0.3051
suigiN       mercury     0.8666  0.0631  0.3457
niQkeru      nickel      0.8333  0.0692  0.3790
riN          phosphorus  0.8     0.0742  0.4068
suzu         tin         0.6333  0.0894  0.4901
kobaruto     cobalt      0.6     0.0909  0.4982

(Insects)

(Diseases)

(Ingredients)
The category for this block is "ingredients." Let's consider hamburgers. In order to make them, we need several ingredients, including meats. Here, all the ingredients, like meats, that can be used for cooking belong to this category.
Your task is to determine whether the words you are going to hear in this block belong to this category or not. If you think that they belong to this category, press 'YES'; otherwise press 'NO' as soon as possible. The category for this block is "ingredients." Please prepare for the block.

Word       Gloss                                                                Mean    SE      SD
hiziki     a kind of brown algae                                                1       0       0
butaniku   pork                                                                 1       0       0
gaNmodoki  a fried bean-curd cake with vegetables and other ingredients in it   1       0       0
kikurage   Jew's ear                                                            1       0       0
gyuuniku   beef                                                                 1       0       0
kaNpyou    dried gourd shavings                                                 1       0       0
sirataki   noodles made from devil's-tongue starch                              1       0       0
yuzu       citron                                                               1       0       0
kamaboko   boiled fish paste                                                    1       0       0
sooseesi   sausage                                                              1       0       0
hamu       ham                                                                  1       0       0
aburaage   fried soy-bean curd                                                  1       0       0
tikuwa     a tube-shaped fish-paste cake                                        —       —       —
yuba       dried bean curds                                                     0.9333  0.0463  0.2537
koNnyaku   paste made from the arum root                                        0.9333  0.0463  0.2537
hu         a kind of gluten bread                                               0.9333  0.0463  0.2537

(Instruments)

(Vehicles)
Word       Gloss                      Mean    SE      SD
zidousya   automobile                 1       0       0
basu       bus                        1       0       0
sukuutaa   scooter                    1       0       0
deNsya     streetcar                  1       0       0
siNkaNseN  Shinkansen (bullet train)  1       0       0
toreeraa   trailer                    1       0       0
torakutaa  tractor                    1       0       0
hune       ship                       1       0       0
reQsya     train                      1       0       0
baiku      motorbike                  1       0       0
monoreeru  monorail                   1       0       0
takusii    taxi                       1       0       0
zyeQtoki   jet plane                  0.9666  0.0333  0.1825
toraQku    truck                      0.9666  0.0333  0.1825
basya      carriage                   0.9666  0.0333  0.1825
heri       helicopter                 0.9666  0.0333  0.1825
saNriNsya  tricycle                   0.9666  0.0333  0.1825
booto      boat                       0.9666  0.0333  0.1825
ootobai    motorcycle                 0.9666  0.0333  0.1825
ziipu      jeep                       0.9333  0.0463  0.2537
kikaNsya   locomotive                 0.9333  0.0463  0.2537
ziteNsya   bicycle                    0.9     0.0557  0.3051
tikatetu   subway                     0.9     0.0557  0.3051
kaato      cart                       0.8333  0.0692  0.3790
horobasya  caravan                    0.6333  0.0894  0.4901

APPENDIX H: Reasons to Discard Eight Target Words from the Final Analysis

Eight target words were discarded from the final analysis in the semantic categorization experiment. The main reason was that these were the only words, of the 700 used, that could plausibly be associated with the semantic categories. The following paragraphs describe in more detail why each was discarded.

The word musubi denotes the closing words used to end an event (such as the final words of a speech or a ceremony). It was a mistake to assign this word to the category of "grammatical terms."

Sikori has at least two meanings: 'stiffness' and 'an unpleasant feeling.' When sikori appeared in the category of "body parts," the former meaning seems to have been activated, so the participants responded "yes" to this word.
Kabure also has two meanings: 'rash' and 'to be influenced.' The word appeared in the category of "diseases," so nearly half of the participants must have associated it with the former meaning, 'rash.'

Katiku is a collective noun for domestic animals. This word was assigned to the category of "animals." All the "yes" filler-response words are names of animals, and it cannot be denied that katiku is highly related to animals. Therefore, it had to be dropped from the final analysis.

Namazu 'catfish' is a freshwater fish that lives in marshes and rivers. The description of the semantic category "insects" implies that the category includes creatures living in fresh water, although I did not intend to include fish in it. Since the description is confusing, only about half of the participants must have responded as intended. It also turned out that this is the only "no" target-filler word associated with living creatures; in this sense, namazu was more closely related to the semantic category than the other "no" target-response words.

Tanima means 'valley' in English, but the word is also often used in the phrase mune-no tanima, literally 'a valley between the breasts' in Japanese; 'cleavage' is an appropriate English translation. The semantic category therefore seemed to give the participants a nonverbal context: tanima implied the full phrase (mune-no) tanima, so it suddenly became a part of the body, 'cleavage.' This situation should have been avoided.

The word sonote literally means 'that method.' It is a compound of two morphemes: the first, the demonstrative pronoun sono, modifies the second, te, which means 'method' in this compound. However, te originally means 'hand,' a part of the body, and is normally written with the same kanji character. Therefore, although the target word means 'that method,' participants might have been able to access the meaning 'hand' in the experiment.
The final word, tegami, was assigned to the category of things for school. I had in mind the things elementary school students use in class. In Japan, teachers often distribute monthly letters to parents to let them know how the children are doing, what the upcoming events are, and so on. These are called either gaQkyuu tsuusiN or simply tegami. Of course, tegami is highly related to things observed in class, so some participants might have recalled this classroom situation.

APPENDIX I: Statistics in Experiment 3

Table 1.1: Basic model for semantic categorization data, Experiment 3.
Predictors entered (In): Initial Sound, 1st Mora, Point, Duration, Frequency, Uniqueness, Class, Semantic category, Word, Participants, plus the constant; none removed (Out).
Model fit: F(62, 19601) = 90.725063, p < 0.000001, R2 = 0.222983.

Table 1.2: Basic model + Neighborhood density (Segments) for semantic categorization data, Experiment 3.
Model fit: F(63, 19600) = 89.759406, p < 0.000001, R2 = 0.223911.
Table 1.3: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data, Experiment 3.
Model fit: F(63, 19600) = 90.053720, p < 0.000001, R2 = 0.224481.

Table 1.4: Basic model + Neighborhood density (Auditory) for semantic categorization data, Experiment 3.
Model fit: F(63, 19600) = 89.758093, p < 0.000001, R2 = 0.223566.
Table 1.5: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data, Experiment 3.
Model fit: F(64, 19599) = 88.919995, p < 0.000001, R2 = 0.225026.

Table 1.6: Basic model for semantic categorization data (fast responders), Experiment 3.
Model fit: F(47, 9922) = 26.586417, p < 0.000001, R2 = 0.111852.
Table 1.7: Basic model + Neighborhood density (Segments) for semantic categorization data (fast responders), Experiment 3.
Model fit: F(48, 9921) = 26.433942, p < 0.000001, R2 = 0.113391.

Table 1.8: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data (fast responders), Experiment 3.
Model fit: F(48, 9921) = 26.657183, p < 0.000001, R2 = 0.114240.
Table 1.9: Basic model + Neighborhood density (Auditory) for semantic categorization data (fast responders), Experiment 3.
Model fit: F(48, 9921) = 26.135016, p < 0.000001, R2 = 0.112253.

Table 1.10: Basic model + Neighborhood density (Segments + Pitch & Auditory) for semantic categorization data (fast responders), Experiment 3.
Model fit: F(49, 9920) = 26.203629, p < 0.000001, R2 = 0.114600.
Further reproduction prohibited without permission. >p. 0.0000 8.1539 0.0000 54.1423 0.0000 17.8135 0.0000 1 .297E2 0.0000 81.4516 36.4371 0.0000 82.5709 0.0000 F 1 2 14 27 df 0.49202 0.51619 0.51330 Tol. 0.041137 0.90170 Std Coef Std 1.261092 99.263260 0.106589 0.62255 1 5.322567 0.289389 0.025412 0.136057 0.60010 1 -8.549815 1.416396 -0.058882 0.90023 1 901.990714 Part. Corr. Part. Coef f icient f Coef Error Std Table 1.11: Table Basic1.11: model for semantic categorization data (slow responders), Experiment 3. Initial Sound Initial 1st Mora 1st Ef feet Ef Point Duration Frequency Frequency none Class Semant ic Semant category Uniqueness Word 3 5 6 1 Constant 4 8 2 Participants 7 In F (47, 9646) 0.173734 - 43.153447, = (47, R2 p < 0.000001, F Out U> t-n Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. .p. 0.0000 0.0000 0.0059 8.3237 0.0000 7.5974 1.364E2 0.0000 81.5681 52.5201 21.2849 0.0000 88.3307 0.0000 28.1542 0.0000 1 1 2 1 1 1 14 df F 0. 51618 0. 0.49119 0.51329 27 0.60883 0.58605 0.85876 0.77028 Tol. 0.045556 0.87793 responders), Kxperimcnt 3. 5.894355 1.277615 0.300229 0.025706 0.141154 -0 . 396604 . -0 0.143888 -0.029057 -7 .692175 -7 .449697 1 -0.052975 943.052288 100.34135 0.111441 Coefficient Std ErrorCoef Std Part. Corr. Part. (Segments) Ef fect Ef Initial Sound Initial Participants lEt Mora lEt Frequency Point Class Semantic category Uniqueness Frequency none Neighborhood Word Tabic 1.12: Basic model + Neighborhood density (Segments) forsemantic categorization data (slow 1 Constant 3 5 2 6 Duration 4 7 9 8 In F (48, F 9645) 42.441594, (48, = P < 0.000001,R2=0.174385 Out Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. . p, . 
Table 1.13: Basic model + Neighborhood density (Segments + Pitch) for semantic categorization data (slow responders), Experiment 3.
Model fit: F(48, 9645) = 42.578914, p < 0.000001, R2 = 0.174850.

Table 1.14: Basic model + Neighborhood density (Auditory) for semantic categorization data (slow responders), Experiment 3.
Model fit: F(48, 9645) = 42.525267, p < 0.000001, R2 = 0.174668.
0.350395 0.029014 0.164739 0.45930 1 6.133041 1.271655 0.061327 ------0.844980 1 -7.656066 955.752394 99.930984 0.112942 -- Part. Corr. Part. - 41.964509, p < 0.000001, R2 = 0.]75745= 41.964509, p R2 < 0.000001, (Segs+Pitch) (Auditory) 9644) 9644) = 1st Mora 1st Initial Sound Initial Point Participants Frequency EC feet EC icient f Coef Semantic Frequency none category Neighborhood Neighborhood Class Uniqueness Constant Word 1 3 5 6 Duration 4 8 9 7 9 2 In F<49, F<49, Out: Tabic Basic1.15: model + Neighborhooddensity (Segments + Pitch & Auditory) forsemantic categorization 3 u> Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.