3. Nonlinear Cochlear Signal Processing and Masking in Speech Perception


J. B. Allen

Part A: Production, Perception, and Modeling of Speech

There are many classes of masking, but two major classes are easily defined: neural masking and dynamic masking. Neural masking characterizes the internal noise associated with the neural representation of the auditory signal, a form of loudness noise. Dynamic masking is strictly cochlear, and is associated with cochlear outer-hair-cell processing. This form is responsible for dynamic nonlinear cochlear gain changes associated with sensorineural hearing loss, the upward spread of masking, two-tone suppression and forward masking. The impact of these various forms of masking is critical to our understanding of speech and music processing. In this review, the details of what we know about nonlinear cochlear and basilar membrane signal processing are reviewed, and the implications of neural masking are modeled, with a comprehensive historical review of the masking literature. This review is appropriate for a series of graduate lectures on nonlinear cochlear speech and music processing, from an auditory point of view.

3.1 Basics
    3.1.1 Function of the Inner Ear
    3.1.2 History of Cochlear Modeling
3.2 The Nonlinear Cochlea
    3.2.1 Cochlear Modeling
    3.2.2 Outer-Hair-Cell Transduction
    3.2.3 Micromechanics
3.3 Neural Masking
    3.3.1 Basic Definitions
    3.3.2 Empirical Models
    3.3.3 Models of the JND
    3.3.4 A Direct Estimate of the Loudness JND
    3.3.5 Determination of the Loudness SNR
    3.3.6 Weber–Fraction Formula
3.4 Discussion and Summary
    3.4.1 Model Validation
    3.4.2 The Noise Model
References

3.1 Basics

Auditory masking is critical to our understanding of speech and music processing. There are many classes of masking, but two major classes are easily defined. These two types of masking and their relation to nonlinear (NL) speech processing and coding are the focus of this chapter.

The first class of masking, denoted neural masking, is due to internal neural noise, characterized in terms of the intensity just noticeable difference, denoted ΔI(I, f, T) (abbreviated JND_I) and defined as the just discriminable change in intensity. The JND_I is a function of intensity I, frequency f and stimulus type T (e.g., noise, tones, speech, music, etc.). As an internal noise, the JND_I may be modeled in terms of a loudness (i.e., perceptual intensity) noise density along the length of the cochlea (0 ≤ X ≤ L), described in terms of a partial loudness JND (ΔL(X, T), a.k.a. JND_L). The cochlea or inner ear is the organ that converts signals from acoustical to neural signals. The loudness JND is a function of the partial loudness L(X), defined as the loudness contribution coming from each cochlear critical band, or more generally, along some tonotopic central auditory representation. The critical band is a measure of cochlear bandwidth at a given cochlear place X. The loudness JND plays a major role in speech and music coding since coding quantization noise may be masked by this internal quantization (i.e., loudness noise).

The second masking class, denoted here as dynamic masking, comes from the NL mechanical action of cochlear outer-hair-cell (OHC) signal processing. It can have two forms, simultaneous and nonsimultaneous, the latter also known as forward masking, or post-masking. Dynamic masking (i.e., nonlinear OHC signal processing) is well known (i.e., there is a historical literature on this topic) to be intimately related to questions of cochlear frequency selectivity, sensitivity, dynamic range compression and loudness recruitment (the loss of loudness dynamic range). Dynamic masking includes the upward spread of masking (USM) effect, or in neural processing parlance, two-tone suppression (2TS). It may be underappreciated that NL OHC processing (i.e., dynamic masking) is largely responsible for forward masking (FM, or post-stimulus masking), which shows large effects over long time scales. For example, OHC effects (FM/USM/2TS) can be as large as 50 dB, with an FM latency (return to baseline) of up to 200 ms. Forward masking (FM) and NL OHC signal onset enhancement are important to the detection and identification of perceptual features of a speech signal. Some research has concluded that forward masking is not related to OHC processing [3.1, 2], so the topic remains controversial. Understanding and modeling NL OHC processing is key to many speech processing applications. As a result, a vibrant research effort driven by the National Institutes of Health on OHC biophysics has ensued.

This OHC research effort is paying off at the highest level. Three key examples are notable. First is the development of wide dynamic-range multiband compression (WDRC) hearing aids. In the last 10–15 years WDRC signal processing (first proposed in 1937 by researchers at Bell Labs [3.3]) has revolutionized the hearing-aid industry. With the introduction of compression signal processing, hearing aids now address the recruitment problem, thereby providing speech audibility over a much larger dynamic range, at least in quiet. The problem of the impaired ear given speech in noise is poorly understood today, but it is likely related to the effects of NL OHC processing. This powerful circuit (WDRC) is not the only reason hearing aids of today are better. Improved electronics and transducers have made significant strides as well. In the last few years the digital barrier has finally been broken, with digital signal processing hearing aids now becoming common.

A second example is the development of otoacoustic emissions (OAE) as a hearing diagnostic tool. Pioneered by David Kemp and Duck Kim, and then developed by many others, this tool allows for cochlear evaluation of neonates. The identification of cochlear hearing loss in the first month has dramatically improved the lives of these children (and their parents). While it is tragic to be born deaf, it is much more tragic for the deafness to go unrecognized until the child is three years old, when they fail to learn to talk. If you cannot hear you do not learn to talk. With proper and early cochlear implant intervention, these kids can lead nearly normal-hearing lives and even talk on the phone. However, they cannot understand speech in noise. It is at least possible that this loss is due to the lack of NL OHC processing.

A third example of the application of NL OHC processing to speech processing is still an underdeveloped application area. The key open problem here is: How does the auditory system, including the NL cochlea, followed by the auditory cortex, process human speech? There are many aspects of this problem including speech coding, speech recognition in noise, hearing aids, and language learning and reading disorders in children. If we can solve the robust phone decoding problem, we will fundamentally change the effectiveness of human-machine interactions. For example, the ultimate hearing aid is the hearing aid with built-in robust speech feature detection and phone recognition. While we have no idea when this will come to be, and it is undoubtedly many years off, when it happens there will be a technology revolution that will change human communications.

In this chapter several topics will be reviewed. First is the history of cochlear models including extensions that have taken place in recent years. These models include both macromechanics and micromechanics of the tectorial membrane and hair cells. This leads to comparisons of the basilar membrane, hair cell, and neural frequency tuning. Hearing loss, loudness recruitment, as well as other key topics of modern hearing health care, are discussed. The roles of NL mechanics and dynamic range are reviewed to help the reader understand the importance of modern wideband dynamic range compression hearing aids as well as the overall impact of NL OHC processing.

Readers desiring further knowledge about cochlear anatomy and function, or a basic description of hearing, may consult Pickles [3.4], Dallos [3.5], or Yost [3.6].

3.1.1 Function of the Inner Ear

The goal of cochlear modeling is to refine our understanding of how auditory signals are processed. The two main roles of the cochlea are to separate the input acoustic signal into overlapping frequency bands, and to compress the large acoustic intensity range into the much smaller mechanical and electrical dynamic range of the inner hair cell. This is a basic question of information processing by the ear. The eye plays a similar role as a peripheral organ. It breaks the light image into rod- and cone-sized pixels, as it compresses the dynamic range of the visual signal. Based on the intensity JND, the corresponding visual dynamic range is about nine to

[Figure, caption truncated in the source. Panel a) cross-section of the cochlear duct, with labels including the scala vestibuli, ductus cochlearis, scala tympani, vestibular (Reissner) membrane, tectorial membrane, basilar membrane, limbus spiralis, spiral ligament, spiral ganglion and spiral organ (Corti). Panel b) the organ of Corti, with labels including the inner and outer hair cells, reticular lamina, tectorial membrane, basilar membrane, pillar cells (rods of Corti), inner tunnel (Corti), space of Nuel, cells of Hensen and Claudius, and nerve fibers.]
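The compressive role described above, mapping a large acoustic intensity range into a much smaller dynamic range, can be illustrated with a minimal sketch of a static compressive input/output map on a decibel scale. This is an illustration only, not a cochlear model or a hearing-aid algorithm taken from this chapter. The function name `compress_level_db`, the knee level of 40 dB and the compression ratio of 3 are all assumptions chosen for the example.

```python
def compress_level_db(level_in_db, knee_db=40.0, ratio=3.0):
    """Static compressive input/output map on a dB scale.

    Below the knee the map is linear (1 dB out per 1 dB in).
    Above the knee each extra dB of input adds only 1/ratio dB of output.
    knee_db and ratio are illustrative assumptions, not measured values.
    """
    linear_part = min(level_in_db, knee_db)      # portion passed unchanged
    excess = max(level_in_db - knee_db, 0.0)     # portion above the knee
    return linear_part + excess / ratio


# Hypothetical input levels spanning a 100 dB range.
input_levels_db = [0.0, 40.0, 70.0, 100.0]
output_levels_db = [compress_level_db(x) for x in input_levels_db]

input_range_db = max(input_levels_db) - min(input_levels_db)
output_range_db = max(output_levels_db) - min(output_levels_db)

print(output_levels_db)                  # [0.0, 40.0, 50.0, 60.0]
print(input_range_db, output_range_db)   # 100.0 60.0
```

With these assumed values, a 100 dB range of input levels maps onto a 60 dB range of output levels. A fixed knee and ratio keep the arithmetic easy to follow. The level-dependent gain of the real cochlea, and of multiband compression hearing aids, is also frequency dependent and time varying, which this sketch does not attempt to capture.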
