Libri

Phonetica 1999;56:105–107

Winifred Strange
Speech Perception and Linguistic Experience: Issues in Cross-Language Research
York Press, Timonium, 1995
492 pp.; $59
ISBN 0–912752–36–X

How is speech perception shaped by experience with our native language and by exposure to subsequent languages? This is a central research question in language perception, which emphasizes the importance of crosslinguistic studies. Speech Perception and Linguistic Experience: Issues in Cross-Language Research contains the contributions to a Workshop in Cross-Language Perception held at the University of South Florida, Tampa, Fla., USA, in May 1992.

This text may be said to represent the first compilation strictly focused on theoretical and methodological issues of cross-language perception research. For this reason, and in view of the contributors’ expertise, this book should be welcomed by speech scientists, phoneticians, linguists, psychologists, second language teachers, speech pathologists, and students of these disciplines.

The book contains 16 chapters organized in three main sections: (a) investigation of how speech perception develops in the course of learning the first language; (b) assessment of how patterns of speech perception may change when a second language is learnt, and (c) exploration of how speech perceptual patterns may be modified in the laboratory or clinic. These sections are preceded by an introduction and followed by a final chapter devoted to future directions in cross-language speech perception research.

Part I, ‘Introduction’, includes a single chapter written by the editor. It is an excellent historical review of cross-language studies in speech perception providing a clear conceptual framework of the topic. Strange presents a selective history of research that highlights the main theoretical themes and methodological paradigms. It begins with a brief description of the basic phenomena that are the starting points for the investigation: the constancy problem and the question of units of analysis in speech perception. Next, the author presents the principal theories, methods, findings and limitations in early cross-language research, focused on categorical perception as the dominant paradigm in the study of adult and infant perception in the 1960s and 1970s. Finally, Strange reviews the most important findings and conclusions in recent cross-language research (1980s and early 1990s), which yielded a large amount of new information from a wider range of languages and phonetic contrasts.

Part II, ‘Linguistic Experience and the Development of Speech Perception’, contains five chapters devoted to the role native language experience plays in shaping the way speech is perceived. In chapter 2, Linda Polka, Peter W. Jusczyk, and Susan Rvachew provide a review of strengths and limitations of the most important techniques used in cross-language studies with infants and children. With regard to infants, they describe and test the High Amplitude Sucking technique and its variations, as well as the Conditioned Head-Turn, the Habituation of Visual Fixation, and the Head Turn Preference procedures. On the other hand, the authors suggest special precautions when assessing the perceptual abilities of preschool children (aged 3 years and older), mainly to ensure comprehension and attention to the task, and motivation to perform and complete the task.

Chapter 3, by Peter W. Jusczyk, Elizabeth A. Hohn and Denise R. Mandel, is devoted to picking up regularities in the sound structure of the native language. These researchers from SUNY at Buffalo put forward theoretical considerations as to how and when infants begin to pick up information about the organization of L1 sound properties. They review a set of studies focused largely on features that relate to phonetic and phonotactic properties of the native language, and to the way infants’ sensitivity to these properties develops. The fourth chapter by Patricia K. Kuhl and Paul Iverson concerns the ‘perceptual magnet effect’ [Kuhl, 1991]: a perceptual distortion around a phonetic prototype. The Native Language Magnet Model maintains that exposure to a particular language results in a change of the acoustic space underlying speech perception. The magnet effect is a very interesting concept, but later investigations have revealed it to be problematic, at least in adult perception [Lively and Pisoni, 1997]. In chapter 5, Janet F. Werker discusses age-related changes in infant cross-language speech perception and outlines questions that remain unanswered.

Finally in chapter 6, Catherine T. Best addresses the topic from an ecological theoretical perspective, in contrast to the theoretical positions held by other authors in the book (Flege, Jusczyk et al., Kuhl and Iverson, Werker). Best defends a Direct Realism approach, in which articulatory gestures are assumed to be the perceptual primitives for speech perception. Listeners directly recover these gestures from the speech signal without recourse to innate knowledge of the vocal tract (as the Motor Theory states), in the same way as other auditory objects or events are perceived. Interestingly, this approach makes a coherent set of predictions about how listeners perceive non-native phones, and how they discriminate non-native contrasts against the phonological categories of their native language. This issue directly connects with the content of the following part.

Part III, ‘Speech Perception in Second Language Learning’, addresses the question as to how the perception of speech sounds is influenced by the learning of a second language. It opens with a methodological review of the principal variables in cross-language speech perception research with adults by Patrice S. Beddor and Terry L. Gottfried (chapter 7). In the next chapter James E. Flege reviews the principal findings and problems in L2 speech learning, and presents his Speech Learning Model. This model sets out to account for age-related limits on the ability to produce L2 vowels and consonants in a native-like fashion. Flege assumes that ‘the phonetic systems used in the production and perception of vowels and consonants remain adaptive over the life span, and that phonetic systems reorganize in response to sounds encountered in an L2 through the addition of new phonetic categories, or through the modification of old ones’ (p. 233). A consequence of Flege’s model is that foreign accents are caused, at least in part, by the inaccurate perception of sounds in an L2. The phonology of the native language filters out features of L2 sounds that are important phonetically but not phonologically. This fact would cause a true ‘perceptual foreign accent’ that hinders phonetic production of the second language.

In chapter 9 Flege’s collaborator Ocke-Schwen Bohn centers on the aspects of the native language that do not influence the perception of L2 sounds. In chapter 10 Reiko A. Yamada examines the relation between the age of L2 acquisition and the perception of American English /r/ and /l/ by native speakers of Japanese. In the final contribution of this part (chapter 11), Henning Wode discusses the implications of the results of speech perception research for linguistics, and vice versa.

Part IV, ‘Modifying Speech Perception in the Laboratory and Clinic’, examines, in four chapters, how speech perception may be modified for applied goals, especially in second language learning and articulation disorders. As previous sections, it begins with a methodological chapter that serves, in part, as an introduction to the topic. Thus, John S. Logan and John S. Pruitt (chapter 12) deal with methodological issues in training listeners to perceive non-native phonemes. They mainly review aspects related to training goals, stimulus presentation in discrimination and identification tasks, feedback provided to the subject, and duration of training. In the next chapter, Bernard Rochet discusses the perceptual basis of foreign accent along the lines of Flege, and presents results from two experiments on auditory training for the teaching of French sounds in adults.

In chapter 14 Susan Rvachew and Donald G. Jamieson propose that adult L2 learners have several things in common with young children that misarticulate sounds in their native language, because, in most cases, difficulties in producing sounds are correlated with difficulties in identifying sounds. Therefore, perceptual training can also improve phonological production in the case of articulation-disordered children. The [...]

[...] area is a ‘growth industry’ with multiple needs for further work on the learning of languages by speakers of many different L1s learning many different L2s, for more coherent developmental research, for a detailed theory of the relation between L1 and L2, and for a theory of individual differences in speech perception.

In sum, this volume is an extremely valuable reference book for researchers and professionals in speech science interested in a cross-linguistic perspective. It is a coherent set of chapters that represent up-to-date summaries on the main issues in cross-language speech perception research. The authors have different theoretical orientations; most of them are heads of research teams and well-known experts in the field. The editor has achieved a coherent coordination of contents in a multi-authored volume. There remains a certain, unavoidable redundancy between some of the chapters, and notwithstanding the excellent general introduction to the volume by Winifred Strange, individual overviews would have been welcomed at the beginning of each part.

References

Kuhl, P.K.: Human adults and human infants show a ‘perceptual magnet effect’ for the prototypes of speech categories, monkeys do not.

© 1999 S. Karger AG, Basel. 0031–8388/99/0562–0105. Article accessible online at: www.karger.com