Psycho-Acoustics
Richard Heusdens
April 28, 2020
Signal and Information Processing Lab, Dept. of Mediamatics

Outline
• Introduction
• Anatomy and physiology of the auditory system
• Physical versus subjective scales
• Spatial perception
• Masking (spectral and temporal)
• Perceptual audio coding
• ISO MPEG perceptual model

Psychoacoustics = The scientific study of the perception of sound.

Digital Signal Processing
Psycho-acoustics!
[Figure: audio coding chain: Microphone + AD → Audio Encoder → Storage or Transmission (bit stream) → Audio Decoder → DA + Loudspeaker]

What you see is not what you hear!
[Figure; horizontal axis: Time →]

Anatomy and Physiology of the Human Auditory System
Literature: James O. Pickles, “An Introduction to the Physiology of Hearing”, Academic Press, London, 1982

The Peripheral Hearing System

Outer Ear
Acoustic energy is conducted to the eardrum
• Conversion of acoustic energy into mechanical energy
• Resonance in the range of 2 – 5 kHz (± 10 dB)
• Acoustic filtering is in part dependent on source direction

Middle Ear
Impedance adjustment for effective energy transmission:
• Eardrum: large volume displacement, little pressure
• Oval window: little volume displacement, large pressure
• Band-pass filter (200 Hz - 8 kHz)

Inner Ear
Cochlea:
• Mechanical energy (oval window) is converted into a neural signal (auditory nerve)
• Performs a time-frequency analysis

Cochlea
1. Cochlear duct
2. Scala vestibuli
3. Scala tympani
4. Spiral ganglion
5. Auditory nerve fibres
• The red arrow is from the oval window
• The blue arrow points to the round window
• The cochlea is about 2 mm in diameter

Basilar Membrane
Frequency-to-place transformation: [Figure]

Cochlea
1. Cochlear duct
2. Scala vestibuli
3. Scala tympani
4. Reissner's membrane
5. Basilar membrane
6. Tectorial membrane
7. Stria vascularis
8. Nerve fibres
9. Bony spiral lamina
The organ of Corti is on top of the basilar membrane, under the tectorial membrane.

Organ of Corti
• Tectorial membrane (TM)
• Deiters’ cells (DC)
• Basilar membrane (BM)
• Reticular lamina (RL)
• Inner hair cells (IHC)
• Pillar cells (PC)
• Outer hair cells (OHC)
• Stereocilia (St)

Auditory nerve
• Auditory nerve fiber responses are spiky
• Auditory nerve responses can follow the phase at low frequencies
• Auditory nerve responses need to recover at high frequencies
• Beyond about 2 kHz the phase-locking response is lost

Auditory Transduction

Dependence of Subjective Parameters on Physical Parameters
Literature: Brian C.J. Moore, “An Introduction to the Psychology of Hearing”, Fourth edition, Academic Press, London, 1997

Physical vs. Subjective Scales
Physical                          Subjective
Fundamental frequency (Hz)    ↔   Pitch
Sound pressure (Pa, dB SPL)   ↔   Loudness

Fundamental Frequency and Pitch
• Linear frequency increase over time
• Exponential frequency increase over time
• Mapping of frequency to pitch is approximately a logarithmic transformation
• Empirical finding from several experiments measuring frequency just noticeable differences (JNDs):
  $\log_{10} \Delta f = 0.0264\,\sqrt{f} - 0.52$   (f and $\Delta f$ in Hz)
  f = 1 kHz → $\Delta f$ ≈ 2 Hz
  f = 6 kHz → $\Delta f$ ≈ 33 Hz
[Figure: frequency axis marked f, f + 1 JND, f + 2 JND, f + 3 JND]

Sound Pressure and Loudness
• Linear sound pressure increase over time
• Expansive sound pressure increase over time
→ Mapping of sound pressure to loudness resembles a compressive transformation
Stevens' law (sones): $S = k\,I^{0.3}$
→ Thus each 10 dB increase in intensity leads to a doubling in loudness (= doubling in sones), because $10^{0.3} \approx 2$

Source: Sone - Wikipedia, the free encyclopedia, http://en.wikipedia.org/wiki/Sone
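The two formulas on the slides above (the frequency-JND fit and Stevens' law) can be checked numerically. The following is a minimal Python sketch, not part of the original slides. The function names are hypothetical, the logarithm is assumed to be base 10, frequencies are assumed to be in Hz, and the fit is assumed to use the square root of f, since that is the form that reproduces the slide's two examples.

```python
import math


def frequency_jnd_hz(f_hz: float) -> float:
    """Frequency JND from the assumed fit log10(df) = 0.0264 * sqrt(f) - 0.52."""
    return 10.0 ** (0.0264 * math.sqrt(f_hz) - 0.52)


def loudness_ratio(delta_db: float, exponent: float = 0.3) -> float:
    """Loudness ratio S2/S1 under Stevens' law S = k * I**exponent.

    A level change of delta_db corresponds to an intensity ratio of
    10**(delta_db / 10); the constant k cancels in the ratio.
    """
    intensity_ratio = 10.0 ** (delta_db / 10.0)
    return intensity_ratio ** exponent


if __name__ == "__main__":
    # The slide quotes roughly 2 Hz at 1 kHz and 33 Hz at 6 kHz.
    for f in (1000.0, 6000.0):
        print(f"f = {f:.0f} Hz -> JND = {frequency_jnd_hz(f):.1f} Hz")
    # Stevens' law: +10 dB should give close to a factor of 2 in sones.
    print(f"+10 dB -> loudness ratio = {loudness_ratio(10.0):.3f}")
```

Under these assumptions the fit gives about 2 Hz at 1 kHz and about 33 Hz at 6 kHz, and a 10 dB step gives a loudness ratio very close to 2, in line with the slides.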
Source of sound                           Sound pressure (Pa)     Sound pressure level (dB re 20 µPa)   Loudness (sone)
threshold of pain                         100                     134                                   ~ 676
hearing damage during short-term effect   20                      approx. 120                           ~ 256
jet, 100 m away                           6 ... 200               110 ... 140                           ~ 128 ... 1024
jack hammer, 1 m away / nightclub         2                       approx. 100                           ~ 64
hearing damage during long-term effect    6×10^-1                 approx. 90                            ~ 32
major road, 10 m away                     2×10^-1 ... 6×10^-1     80 ... 90                             ~ 16 ... 32
passenger car, 10 m away                  2×10^-2 ... 2×10^-1     60 ... 80                             ~ 4 ... 16
TV set at home level, 1 m away            2×10^-2                 ca. 60                                ~ 4
normal talking, 1 m away                  2×10^-3 ... 2×10^-2     40 ... 60                             ~ 1 ... 4
very calm room                            2×10^-4 ... 6×10^-4     20 ... 30                             ~ 0.15 ... 0.4
leaves' noise, calm breathing             6×10^-5                 10                                    ~ 0.02
auditory threshold at 1 kHz               2×10^-5                 0                                     0

sone   1    2    4    8    16   32   64   128  256  512  1024
phon   40   50   60   70   80   90   100  110  120  130  140

Formulae
According to Stevens' definition,[1] a loudness of 1 sone is equivalent to the loudness of a signal at 40 phons, the loudness level of a 1 kHz tone at 40 dB SPL.
But phons scale with level in dB, not with loudness, so the sone and phon scales are not proportional. Rather, the loudness in sones is, at least very nearly, a power law function of the signal intensity, with an exponent of 0.3.[2][3] With this exponent, each 10 phon increase (or 10 dB at 1 kHz) produces almost exactly a doubling of the loudness in sones.[4]

At frequencies other than 1 kHz, the loudness level in phons is calibrated according to the frequency response of human hearing, via a set of equal-loudness contours, and then the loudness level in phons is mapped to loudness in sones via the same power law.

Equal Loudness Contours
An alternative scale for loudness is the phon:
• The loudness in phon of a specific tone is defined as the level in dB SPL of a 1 kHz reference tone that sounds equally loud as the specific tone.
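The phon-to-sone mapping described above (1 sone at 40 phon, doubling for every additional 10 phon) can be written as a one-line conversion. The sketch below is illustrative only. The function names are hypothetical, and the doubling rule is treated as exact, which the text states only approximately and only for loudness levels of about 40 phon and above.

```python
import math


def phon_to_sone(phon: float) -> float:
    """Loudness in sones for a loudness level in phons.

    Assumes 1 sone at 40 phon and an exact doubling per 10 phon.
    Below roughly 40 phon this rule is not expected to hold.
    """
    return 2.0 ** ((phon - 40.0) / 10.0)


def sone_to_phon(sone: float) -> float:
    """Inverse mapping: loudness level in phons for a loudness in sones (> 0)."""
    return 40.0 + 10.0 * math.log2(sone)


if __name__ == "__main__":
    # Reproduce the sone/phon row pair: 40 phon -> 1 sone, ..., 140 phon -> 1024 sone.
    for phon in range(40, 150, 10):
        print(f"{phon:3d} phon -> {phon_to_sone(phon):6.0f} sone")
    # The 134 dB "threshold of pain" entry at 1 kHz comes out near 676 sone.
    print(f"134 phon -> {phon_to_sone(134.0):.0f} sone")
```

The low-level rows of the table (for example 10 dB listed as about 0.02 sone) fall below what this rule would give, which is why the sketch restricts its assumption to 40 phon and above.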
(MAF = minimum audible field, the absolute threshold of hearing)

Sound Pressure and Loudness

Weber’s Law
• Weber’s Law states that the just noticeable difference in level is a constant percentage of level: $\Delta I / I = \text{constant}$
• For pure tones an increase of about 0.5 - 1 dB is just noticeable

Spatial perception

Binaural Hearing
We have a remarkable ability to compare acoustic signals across the two ears. On headphones:
• Identical signals lead to a narrow sound image centered in our head (correlation is 1)
• An interaural level difference of 1 dB leads to a just noticeable shift in position towards the louder ear
• An interaural time difference of 15 µs leads to a just noticeable shift in position towards the leading ear
• When two signals are not identical (correlation < 1), the sound image increases in width
• Changes in correlation are most audible around values of 1.
• A reduction from 1 to 0.98 is audible; a reduction from 0.1 to 0 is not.
Correlation: $C_{LR}(\tau) = \dfrac{\int L(t)\,R(t+\tau)\,dt}{\sqrt{\int L(t)^2\,dt \int R(t+\tau)^2\,dt}}$

Spatial Hearing
Free-field listening: binaural cues help to localize the sound source
• Low frequencies: sound bends around the head (path length difference): interaural time differences
• High frequencies: sound is obstructed by the head: interaural level differences
• Interaural time differences are not perceived above 2 kHz.
• Room reflections lead to a reduction of coherence

Spectral and Temporal Masking
Literature:
Brian C.J. Moore, “An Introduction to the Psychology of Hearing”, Fourth edition, Academic Press, London, 1997
T. Painter and A. Spanias, “Perceptual Coding of Digital Audio”, Proceedings of the IEEE, Vol. 88, No. 4, pp. 451-513, April 2000

Masking
The phenomenon where a sound (maskee/test signal) that is perfectly audible in isolation is not audible due to the presence of a masking sound (masker)
Relevance for audio coding:
• Quantisation noise (maskee) that is introduced by the coding algorithm is masked by the signal which is coded (masker)
• By shaping the spectro-temporal shape of the distortion a very efficient coding of a signal