P. Rao, "Audio Signal Processing," Chapter in Speech, Audio, Image and Biomedical Signal Processing using Neural Networks, (Eds.) Bhanu Prasad and S. R. Mahadeva Prasanna, Springer-Verlag, 2007.

Audio Signal Processing

Preeti Rao
Department of Electrical Engineering, Indian Institute of Technology Bombay, India
[email protected]

1 Introduction

Our sense of hearing provides us with rich information about our environment with respect to the locations and characteristics of sound-producing objects. For example, we can effortlessly assimilate the sounds of birds twittering outside the window and traffic moving in the distance while following the lyrics of a song over the radio sung with multi-instrument accompaniment. The human auditory system is able to process the complex sound mixture reaching our ears and form high-level abstractions of the environment by the analysis and grouping of measured sensory inputs. The process of achieving the segregation and identification of sources from the received composite acoustic signal is known as auditory scene analysis. It is easy to imagine that the machine realization of this functionality (sound source separation and classification) would be very useful in applications such as speech recognition in noise, automatic music transcription, and multimedia data search and retrieval. In all cases the audio signal must be processed based on signal models, which may be drawn from sound production as well as from sound perception and cognition. While production models are an integral part of speech processing systems, general audio processing is still limited to rather basic signal models due to the diverse and wide-ranging nature of audio signals.