Research Collection
Doctoral Thesis
Music perception with hearing aids
Author(s): Kirchberger, Martin J.
Publication Date: 2015
Permanent Link: https://doi.org/10.3929/ethz-a-010608245
Rights / License: In Copyright - Non-Commercial Use Permitted
DISS. ETH NO. 23167
MUSIC PERCEPTION WITH HEARING AIDS
A dissertation submitted to attain the degree of
DOCTOR OF SCIENCES of ETH ZURICH
(Dr. sc. ETH Zurich)
presented by
MARTIN JULIAN KIRCHBERGER
Dipl. Ing. Graz University of Technology
born on 12.02.1984
citizen of Germany
accepted on the recommendation of
Prof. Dr. Marcy Zenobi-Wong
Prof. Dr. Frank Russo
Prof. Dr. Ewen MacDonald
2015
Acknowledgements
A thesis is like a song: Its development requires good ideas, experimenting, listening, and fine-tuning. Good ideas, however, can only resonate with the right instruments and ensemble. I am grateful for the support that I received and want to use this opportunity to thank Marcy, Frank, Markus, Schelle, and Stef for being the keys to the realization of this project.
Thank you Stef for funding my thesis. I applied during a hiring freeze and so I tried hard to hit the right pitch. Your genuine interest in music research laid the foundation for this work and has always remained an inspiration for me.
Thank you Schelle for helping me out when I could not attend my presentation due to a scheduling conflict with the Hear the World video shoot. I also appreciate that you always found time for my questions, although your responsibilities had steadily been growing. Experiencing your broad knowledge and incredibly sharp judgment helped me increase the tempo in my thesis.
Thank you Markus for your support in work-related and non-work-related matters. You never hesitated to discuss technical issues outside office hours and kept me grounded in turbulent times. Your ability to find structure in complex issues and your creativity in solving them is simply amazing. You are one of the main reasons why I enjoyed working at Sonova so much, and let me assure you that your entire team shares this opinion in unison.
Thank you Frank for your incredible commitment and your passion for this topic. Your point of view added new dimensions to this work. I admire how you managed to explain complex issues in plain English, and from you I learned to put myself in the position of a reader while writing. Our weekly discussions helped to keep the rhythm, and our project visits, both in Switzerland and Canada, further added to the progression.
Thank you Marcy for orchestrating the whole project. This thesis would not have happened if it weren’t for you. Thanks for integrating me into your team and for your hospitality on numerous occasions. Your thoughtful supervision struck a chord.
For my parents and my sweetheart
« Music expresses that which cannot be put into words and that which cannot remain silent. »
Victor Hugo
Abstract
Hearing loss affects the quality of life in various situations such as in conversations or when listening to music. The research on hearing loss and hearing aids has focused primarily on improving speech intelligibility. To date, there is a lack of understanding of how to process music in hearing aids to optimize music perception in hearing-impaired listeners.
As a first step, a novel research tool was developed to make music perception measurable. The Adaptive Music Perception (AMP) test allows for quantifying music perception by establishing physical discrimination thresholds. Test participants make perceptual judgments about the musical dimensions of meter, harmony, melody, and timbre while the underlying low-level dimensions (e.g., brightness or attack of timbre) are adaptively manipulated until discrimination thresholds can be determined.
To investigate the effect of hearing loss on music perception, 19 normal-hearing and 21 hearing-impaired listeners completed the AMP test. Significant differences were found in 7 of the 10 low-level dimensions, implying that hearing loss affects the perception of meter, harmony, melody, and timbre.
In a follow-up study, the effect of hearing aids on music perception was investigated by comparing the performance of 25 hearing-impaired listeners on the AMP test with and without hearing aids. The discrimination thresholds for the low-level dimensions ‘level’ from the meter test and ‘attack’ from the timbre test were marginally worse with hearing aids. The impairment of timbre perception was found to correlate significantly with the strength of dynamic compression in hearing aids.
Based on these results, potential for optimization was identified in the dynamic compression of hearing aids. The optimization was guided by an investigation of the dynamic range in recorded music. To account for the diversity in music, the analysis was conducted in a genre-specific manner on 1000 songs from the genres chamber music, choir, opera, orchestra, piano, jazz, pop, rap, rock, and schlager. The dynamic-range analysis revealed that the modern genres pop, rap, rock, and schlager had smaller dynamic ranges than jazz and the classical genres chamber music, choir, opera, orchestra, and piano. Contrary to conventional opinion, the dynamic range of all music genres was generally smaller than the dynamic range of speech.
The dynamic compression in hearing aids was optimized by reducing the compression strength relative to the dynamic compression standard NAL-NL2. A linear and a semi-compressive setting were chosen as test conditions to optimize the processing and perception of modern and classical music, respectively. Thirty-one hearing-impaired participants evaluated the sound quality of music segments of three classical (choir, orchestra, and opera) and two modern genres (pop and schlager) under the NAL-NL2 reference condition and the linear and semi-compressive test conditions. The results showed that the perceived sound quality was highest in the linear condition and lowest in the NAL-NL2 reference condition for all genres tested.
The second potential for improving music perception with hearing aids was identified in frequency lowering. Frequency lowering is a signal-processing strategy that maps high frequencies, which are inaudible to the impaired ear, to lower, audible frequencies. State-of-the-art implementations have been developed to increase speech intelligibility, and their transfer functions violate the harmonic structure of music. A novel approach, harmonic frequency lowering, makes it possible to lower musical content while preserving the harmonic ratios and pitch chromas in music. Nineteen participants evaluated music detail and the sound quality of music excerpts processed with harmonic frequency lowering and the benchmark conditions nonlinear frequency compression, low-pass filtering, and no processing. Harmonic frequency lowering yielded significant improvements in the perception of music detail without detrimental effects on sound quality.
In brief, this thesis investigated how hearing loss and hearing aids affect music perception to inform the optimization of hearing-aid technology for music. Two strategies regarding dynamic compression and frequency lowering were developed and proved successful in improving music perception in hearing-impaired listeners.
Zusammenfassung
Hearing impairment is detrimental to many areas of life: participating in conversation becomes difficult, and the enjoyment of listening to music suffers. For decades, hearing research has focused on improving speech intelligibility. Music has so far played a subordinate role, and there is an enormous need to catch up in optimizing hearing aids for the signal processing of music.
As a first step towards making music perception measurable, the AMP test (Adaptive Music Perception test) was developed. It makes it possible to quantify differences in perceptual dimensions relevant to music. While the participants work on listening tasks concerning meter, harmony, melody, and timbre, the underlying physical dimensions (e.g., level, frequency, or duration for meter) are adaptively varied depending on the correctness of the responses until a perceptual threshold can be determined.
The effects of hearing impairment on music perception were investigated with the AMP test in 19 normal-hearing and 21 hearing-impaired participants. In the hearing-impaired group, significantly higher thresholds were measured in 7 of the 10 physical dimensions, demonstrating that hearing loss impairs the perception of meter, harmony, melody, and timbre.
A second study investigated the effect of hearing aids on music perception. Twenty-five hearing-impaired participants performed the AMP test once with and once without hearing aids. In the dimensions of level and brightness, which concern the perception of meter and timbre, higher thresholds were measured in the aided condition. The strength of the dynamic compression in the hearing aids was identified as a significant factor in the impairment of timbre perception.
Based on these results, a potential for optimizing music signal processing was pursued in the area of dynamic compression. To this end, an analysis of the dynamics of music was carried out on 1000 commercially available songs. To capture the diversity of music in a controlled manner, 100 songs each were chosen from the genres choir, chamber music, orchestra, opera, piano, jazz, pop, rap, rock, and schlager. The dynamic-range analysis showed that the modern genres pop, rap, rock, and schlager are less dynamic than jazz and the classical genres choir, chamber music, orchestra, opera, and piano. Contrary to common opinion, it could also be shown that all genres tested exhibit a smaller dynamic range than speech.
From this finding it was concluded that the strength of dynamic compression in hearing aids should be set lower for music than the scientific standard NAL-NL2 prescribes for speech. A linear setting was proposed as an optimization option for modern genres, and a semi-compressive option was recommended for classical genres. Both settings were compared against the NAL-NL2 reference in a listening experiment with 31 hearing-impaired participants and music examples from three classical (choir, opera, and orchestra) and two modern (pop and schlager) genres. The perceived sound quality was significantly highest in the linear setting for all genres, and the semi-compressive setting was also perceived as significantly better than the NAL-NL2 reference.
The second potential for optimizing music signal processing with hearing aids was pursued in the area of spectral compression (frequency lowering). Frequency lowering makes it possible to map high frequencies, which are inaudible to the impaired ear, onto lower, audible frequencies. The existing implementations were originally developed for speech signals and in doing so violate structures that are important for music perception. A novel approach, harmonic frequency lowering, makes it possible to preserve the harmonic structures and the pitch chroma in the signal. In a listening experiment, 19 participants compared the richness of detail and the sound quality of music examples that were either unprocessed or processed with harmonic frequency lowering, nonlinear frequency compression, or a low-pass filter. With harmonic frequency lowering, the participants were able to perceive the most detail in the music without sacrificing sound quality.
In summary, this work investigated the effects of hearing impairment, with and without hearing-aid provision, on music perception in order to derive guidelines for hearing-aid optimization. Two optimization approaches were developed in the areas of dynamic compression and spectral compression, and their positive effect was successfully demonstrated in studies with hearing-impaired participants.
Contents
Acknowledgements ...... iii
Abstract ...... ix
Zusammenfassung ...... xi
List of Figures ...... xvii
List of Tables ...... xx
List of Equations ...... xxii
List of Abbreviations ...... xxiii
Chapter 1 Introduction ...... 25
1.1 Music Perception ...... 25
1.2 Hearing and Hearing Loss ...... 28
1.3 Hearing Aids ...... 29
1.3.1 Dynamic compression ...... 29
1.3.2 Frequency lowering ...... 31
1.3.3 Research question and rationale ...... 32
1.4 Study Overview ...... 33
References ...... 34
Chapter 2 Development of the Adaptive Music Perception Test ...... 41
Abstract ...... 42
2.1 Introduction ...... 43
2.1.1 Motivation ...... 43
2.1.2 Concept ...... 44
2.1.3 Objective ...... 44
2.1.4 Hypothesis ...... 45
2.2 Materials and Methods ...... 46
2.2.1 General test design ...... 46
2.2.2 Adaptation ...... 46
2.2.3 Sound stimuli ...... 47
2.2.4 Test procedure ...... 47
2.2.5 Meter subtest: level, pitch, and duration ...... 48
2.2.6 Harmony subtest: dissonance and intonation ...... 49
2.2.7 Melody subtest: melody-to-chord ratio ...... 50
2.2.8 Timbre subtest: brightness, attack, and spectral irregularity ...... 51
2.2.9 Music experience ...... 52
2.2.10 Participants ...... 52
2.3 Results ...... 53
2.3.1 Meter subtest ...... 55
2.3.2 Harmony subtest ...... 57
2.3.3 Melody subtest ...... 58
2.3.4 Timbre subtest ...... 59
2.4 Discussion ...... 59
2.4.1 D1: Level, meter subtest ...... 60
2.4.2 D2: Pitch, meter subtest ...... 60
2.4.3 D3: Duration, meter subtest ...... 61
2.4.4 D4: Dissonance, harmony subtest ...... 61
2.4.5 D5: Intonation, harmony subtest ...... 62
2.4.6 D6 & D7: Melody-to-chord ratio I & II (MCR I & II), melody subtest ...... 62
2.4.7 D8: Brightness, timbre subtest ...... 62
2.4.8 D9: Attack, timbre subtest ...... 62
2.4.9 D10: Spectral irregularity, timbre subtest ...... 63
Acknowledgments ...... 65
References ...... 66
Chapter 3 Music Perception in Aided and Unaided Hearing-Impaired Listeners ...... 71
Abstract ...... 71
3.1 Introduction ...... 71
3.2 Materials and Methods ...... 72
3.2.1 Participants ...... 72
3.2.2 AMP test ...... 73
3.2.3 Procedure ...... 74
3.3 Results ...... 75
3.4 Discussion ...... 76
3.5 Conclusion ...... 76
Acknowledgments ...... 77
References ...... 78
Chapter 4 Dynamic Range across Music Genres and the Perception of Dynamic Compression in Hearing-Impaired Listeners ...... 81
Abstract ...... 82
4.1 Introduction ...... 82
4.1.1 Compression in music production ...... 82
4.1.2 Compression in hearing aids ...... 83
4.1.3 Dynamic range of music ...... 85
4.2 Experiment I: Dynamic Range of Music across Genres ...... 85
4.2.1 Stimuli ...... 85
4.2.2 Analysis ...... 87
4.2.3 Results ...... 88
4.2.4 Conclusion ...... 91
4.3 Experiment II: Effect of Compression Ratio on Music Perception ...... 91
4.3.1 Rationale ...... 91
4.3.2 Participants ...... 92
4.3.3 Test conditions ...... 94
4.3.4 Test stimuli ...... 96
4.3.5 Listening test ...... 99
4.3.6 Results ...... 100
4.3.7 Discussion ...... 102
4.4 General Discussion ...... 104
Acknowledgments ...... 106
References ...... 107
Appendix ...... 112
Chapter 5 Harmonic Frequency Lowering: Effects on the Perception of Music Detail and Sound Quality ...... 135
Abstract ...... 136
5.1 Introduction ...... 136
5.1.1 Benefit of frequency lowering on speech intelligibility ...... 138
5.1.2 Effect of frequency lowering on sound quality of speech and music ...... 138
5.2 Description of harmonic frequency-lowering algorithm ...... 141
5.3 Materials and Method ...... 142
5.3.1 Participants ...... 142
5.3.2 Stimuli ...... 144
5.3.3 Procedures ...... 145
5.4 Results ...... 148
5.4.1 Pretest ...... 148
5.5 Main Test ...... 149
5.6 Discussion ...... 151
5.7 Conclusion ...... 152
Acknowledgments ...... 152
References ...... 154
Chapter 6 General Conclusion ...... 161
Reference ...... 166
Chapter 7 Outlook ...... 167
7.1 Adaptive Music Perception Test ...... 167
7.2 Dynamic-Range Analysis and Dynamic Compression ...... 167
7.3 Harmonic Frequency Lowering ...... 168
References ...... 170
List of Figures
Figure 1:1 Simplified schematic of the processing stages of the auditory system. ...... 28
Figure 1:2 Example of a signal being separated into multiple components by a filter bank. ...... 29
Figure 1:3 Schematic input-output function (solid line, left abscissa) and gain curves (dashed line, right abscissa) of a frequency band in a hearing aid. Input levels above the compression threshold (CT) are compressed; input levels below the compression threshold are expanded. ...... 30
Figure 1:4 Examples of input-output functions for frequency transposition and frequency compression. ...... 32
Figure 2:1 The mid-level and low-level dimensions of the Adaptive Music Perception test. D1–D10 indicates dimensions 1–10; MCR, melody-to-chord ratio; Sp. Irreg., spectral irregularity. ...... 45
Figure 2:2 Spectrogram of the AMP test audio stimuli. The reference tone (0 sec – 0.4 sec) in the calibration, meter subtest, and timbre subtest is followed by the reference cadence of the harmony subtest (0.8 sec – 3.8 sec) and by one example of the melody subtest (4.2 sec – 10.2 sec). ...... 47
Figure 2:3 The subdominant (IV), dominant (V), and tonic (I) dyads of the harmony subtest. ...... 49
Figure 2:4 A schematic plot of the harmonic spectrum of the root note and the fifth in just intonation or equal temperament. Deviations are exaggerated for better visibility. ...... 50
Figure 2:5 Two examples of the melody subtest. The upper bars depict two melodies from melody set 1 that have the same contour (up-down-up). The lower bars depict two melodies from melody set 2 that have a different contour (down-up-down; up-down-up). ...... 51
Figure 2:6 Box plot (median, quartiles, whiskers: 1.5 × IQR, and individual data points) showing the level D1 (dB), pitch D2 (Hz), and duration D3 (sec) thresholds of the meter subtest by group and session. (1 NH_t: normal-hearing / test session, 2 NH_r: normal-hearing / retest session, 3 HI_t: hearing-impaired / test session, 4 HI_r: hearing-impaired / retest session). ...... 56
Figure 2:7 Box plot (median, quartiles, whiskers: 1.5 × IQR, and individual data points) showing the dissonance D4 (Hz) and intonation D5 (Hz) thresholds of the harmony subtest by group and session. (1 NH_t: normal-hearing / test session, 2 NH_r: normal-hearing / retest session, 3 HI_t: hearing-impaired / test session, 4 HI_r: hearing-impaired / retest session). ...... 57
Figure 2:8 Box plot (median, quartiles, whiskers: 1.5 × IQR, and individual data points) showing the melody-to-chord ratio I D6 (dB) and melody-to-chord ratio II D7 (dB) thresholds of the melody subtest by group and session. (1 NH_t: normal-hearing / test session, 2 NH_r: normal-hearing / retest session, 3 HI_t: hearing-impaired / test session, 4 HI_r: hearing-impaired / retest session). ...... 58
Figure 2:9 Box plot (median, quartiles, whiskers: 1.5 × IQR, and individual data points) showing the brightness D8 (dB), attack D9 (sec), and spectral irregularity D10 (dB) thresholds of the timbre subtest by group and session. (1 NH_t: normal-hearing / test session, 2 NH_r: normal-hearing / retest session, 3 HI_t: hearing-impaired / test session, 4 HI_r: hearing-impaired / retest session). ...... 59
Figure 4:1 Dynamic ranges of speech and 10 different music genres. The lines represent percentiles in dB (SPL) across frequency in kHz (99th: upper dashed line, 90th: upper solid line, 65th: thick line, 30th: lower solid line, and 10th: lower dashed line). ...... 89
Figure 4:2 Dynamic-range comparison between different music genres and across frequency. The dynamic ranges are calculated as the difference between the 99th and 30th percentiles according to the IEC 60118-15 standard. ...... 90
Figure 4:3 Schematic gain curves within a compression band for the linear, semi-compressive, and compressive conditions. The compression ratio of the semi-compressive condition (CR = ΔL_in/ΔL_out) is half the compression ratio of the compressive condition (CR = 2 · ΔL_in/ΔL_out). ...... 95
Figure 4:4 Example loudness curves of the linear, semi-compressive, and compressive versions (here for participant 29 and stimulus 3). ...... 96
Figure 4:5 Dynamic range of the test sample in Experiment II (left column) and the audio corpus in Experiment I (right column) for the genres choir, opera, orchestra, pop, and schlager. The lines represent percentiles in dB (SPL) across frequency in kHz (99th: upper dashed line, 90th: upper solid line, 65th: thick line, 30th: lower solid line, and 10th: lower dashed line). ...... 97
Figure 4:6 Example screen of the main test. ...... 99
Figure 4:7 Ratings and standard errors for the data on dynamics, quality, and loudness in the linear, semi-compressive, and compressive conditions. ...... 102
Figure 5:1 Examples of the input-output mapping functions for the different frequency-lowering schemes frequency transposition (FT), linear frequency compression (LFC), and nonlinear frequency compression (NFC). (FT: cut-off frequency 5 kHz, peak frequency 6 kHz, frequency shift 3 kHz; LFC: compression ratio 3:2; NFC: cut-off frequency 2 kHz, compression ratio CR_nfc: 1.7). ...... 137
Figure 5:2 Spectrum of a dyad of two complex tones with fundamental frequencies at 800 Hz (black) and 2000 Hz (blue). The cut-off frequency for the nonlinear frequency compression (NFC) condition was set to 2 kHz, and the frequency shift for the frequency transposition (FT) condition was determined at 3 kHz, half the peak frequency at 6 kHz. The cut-off frequency for harmonic frequency lowering (HFL) was set at 4 kHz. Components that are new relative to the original version after frequency lowering are marked in green. ...... 140
Figure 5:3 Schematic of the different input-output functions that were evaluated in the main test. ...... 145
Figure 5:4 Long-term average spectra of the four music segments in the original version (OV) / LP condition, the HFLo condition, and the spectra of the lowered signal for a gain weight of 0 dB. ...... 146
Figure 5:5 Example screen of the main test. ...... 147
Figure 5:6 Mean ratings and standard errors for detail and quality in the 5 conditions under test (Orig: original, NFC: nonlinear frequency compression, HFLi: harmonic frequency lowering with individual cut-off frequency, HFLo: harmonic frequency lowering with default cut-off frequency fc = 4 kHz, LP: low pass). ...... 150
Figure 7:1 Example input-output function (solid line) for harmonic frequency lowering with double octave transposition. ...... 168
List of Tables
Table 2:1 The music questionnaire (columns 1 and 2) and the respective points allocation (column 3) to determine an objective music experience (ME) measure for each participant. ...... 53
Table 2:2 Characteristics of the test participants: sex, audiometric data, degree of hearing loss (HL) as defined by ASHA, music experience (ME), and age. ...... 54
Table 2:3 Results of the repeated-measures ANCOVA analysis for each dimension of the tests. The condition hearing loss was the dependent variable, music experience (ME) the covariate. ...... 55
Table 2:4 Forced-entry multiple linear regression for each dimension. Predictors were hearing loss (HL), age, and music experience (ME). ...... 56
Table 3:1 Characteristics of the test participants: sex, audiometric data, hearing loss (HL), music experience (ME), age, and hearing aid (CR: compression ratio). ...... 73
Table 3:2 Mean discrimination thresholds in the unaided and aided conditions and the ANOVA statistics for the dimensions of the meter, harmony, and timbre subtests of the AMP test. ...... 75
Table 4:1 Peer-reviewed research exploring the effect of different compression settings on music perception in hearing-impaired listeners (CR: compression ratio, AT: attack time, RT: release time, CT: compression threshold, WDRC: wide dynamic range compression (multiband dynamic compression), ADRO: adaptive dynamic range optimization, HA: hearing aid). ...... 86
Table 4:2 Additional criteria for each genre that the songs and stimuli (45-s segments) had to meet beyond the iTunes genre classification. ...... 87
Table 4:3 Characteristics of the test participants: audiometric data (dB HL), age (yrs), hearing-aid experience / HAE (yrs), music experience / ME (range from -3: low to 4: high), and coupling (dome and receiver). Note: NT indicates a hearing threshold that was not tested. ...... 92
Table 4:4 Details about the 20 music segments from Experiment II. ...... 98
Table 4:5 Audio corpus used for the dynamic-range analysis, sorted by genre and composer/performer. ...... 112
Table 5:1 Characteristics of the test participants and the individual parameters for the nonlinear frequency compression algorithm: audiometric data (dB HL), age (yrs), hearing-aid experience / HAE (yrs), music experience / ME (scale min: -3, scale max: 4), cut-off frequency / Fc, and compression ratio / CR_nfc for the NFC (nonlinear frequency compression) condition. NT indicates a hearing threshold that was not tested. ...... 143
Table 5:2 Gain weights in dB at the point of detection and preference for HFLi and HFLo. ...... 149
Table 5:3 Statistics of pair-wise ANOVAs for music detail between all 5 conditions. The Bonferroni-corrected significance level is p = 0.005. ...... 150
List of Equations
Equation 3:1 – Calculation of the compression ratio CR ...... 75
Equation 5:1 – Input-output function for linear frequency compression ...... 136
Equation 5:2 – Input-output function for nonlinear frequency compression; constant hereinafter referred to as CR_nfc (compression ratio for nonlinear frequency compression) ...... 136
List of Abbreviations
AMP test  Adaptive Music Perception test
ANCOVA  analysis of covariance
ANOVA  analysis of variance
CAMEQ  Cambridge method for loudness equalization
CR  compression ratio for dynamic compression
CR_nfc  compression ratio for nonlinear frequency compression (NFC)
D  dimension
dB  decibel
DLM  dynamic loudness model
DRC  dynamic range compression
DSL  desired sensation level
FC  frequency compression
FT  frequency transposition
HFL  harmonic frequency lowering
HI  hearing-impaired
HL  hearing loss
LFC  linear frequency compression
LP  low pass
MCR  melody-to-chord ratio
ME  music experience
NAL-NL2  National Acoustic Laboratories, non-linear prescription formula 2
NFC  nonlinear frequency compression
NH  normal-hearing
SL  sensation level
SPL  sound pressure level
Chapter 1 Introduction
Music is an evolutionarily old and universal aspect of life. We enjoy music not only as individuals but also in groups, as music allows us to experience the same emotions together. According to Bersch-Burauel (2004), the average time that adults listen to music in everyday situations accumulates to almost 3 hours per day. Besides music listening, one in five individuals rehearses an instrument or performs on a regular basis (Swiss Federal Statistical Office, 2008). In addition, music is used in therapy to improve the psychological and physiological health of individuals (Koelsch 2009). Music-supported therapy has yielded improvements in the recovery of motor functions in stroke and Parkinson's patients (Altenmüller et al. 2009; Thaut et al. 1996) and in the treatment of autism (Whipple 2004), dementia (Koger, Chapin, & Brotons 1999), and Alzheimer's disease (Smeijsters & Puhr 1997). Music has been used to support second-language learning and memory more generally (Good, Russo, & Sullivan 2014; Wallace 1994). Furthermore, it has been shown that music training can lead to improved nonmusical skills, such as cognitive (Hassler, Birbaumer, & Feil 1985; Gardiner 1996; Hetland 2000; Schellenberg 2004), language, and literacy skills (Barwick et al. 1989; Standley & Hughes 1997; Chan, Ho, & Cheung 1998; Anvari et al. 2002; Ho, Cheung, & Chan 2003; Overy 2003).
1.1 Music Perception

Perception is the process of identifying and interpreting sensory impressions (Schacter, Gilbert, & Wegner 2011). Our ability to perceive is not fixed but evolves continuously as we gain experience (Emmenegger & Schwind 2008). These experiences build up a knowledge and understanding of structures that serve as a reference: we hear pitches in terms of their places in the tonal framework, and we hear tones occurring in relation to a metrical framework of beats (Dowling 2010).
Grouping allows us to interpret our auditory environment more efficiently (Deutsch 1999). We group auditory events based on simple Gestalt-like principles such as proximity. Elements that are close together in time and space are more likely to be grouped than those found further apart. When tones are presented in rapid sequence, for example, and are drawn from two different pitch ranges, the listener perceives two melodic lines in parallel: one derived from the lower tones and the other from the higher ones (Bregman 1994). Another grouping principle is similarity. When one instrument plays a melody and a different instrument plays chords simultaneously, we still detect the melody based on the similarity of the timbre, even when the tones produced by the different instruments overlap considerably in pitch (Koelsch & Schröger 2008). Further principles of auditory grouping are auditory continuity and spatial location (Darwin 1997; Deutsch 1999).
In contrast to sinusoidal tones, naturally sustained tones such as those produced by musical instruments are made up of components whose frequencies are integer or near-integer multiples of the fundamental. The auditory system exploits this feature to combine a set of harmonically related components into a single sound image, which gives rise to strongly fused pitch impressions (Deutsch 1999). Pitch can be divided into a two-dimensional percept of pitch chroma and pitch height (Drobisch 1852; Stumpf 1890; Révész 1914; Bachem 1937; Bachem 1950; Shepard 1982). Chroma defines the position of a tone within the octave, whereas pitch height defines the octave. This conceptualization is reflected in Western music notation: "G5", for example, denotes the chroma "G" in octave 5. Within an octave, Western music distinguishes between 12 chromas, which are separated by a semitone. The frequency ratio of the semitone interval depends on the tuning system. In historical tuning systems such as Pythagorean tuning or just intonation, the frequency ratios of adjacent tones vary depending on their position within the scale (Hunt 1992). In equal temperament, the interval between adjacent tones is labeled a semitone, and its frequency ratio is equal to the twelfth root of 2 (2^(1/12) ≈ 1.0595).
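The arithmetic of equal temperament can be made concrete with a short numerical sketch (illustrative Python, not part of the thesis; the function name is ours). It shows that seven equal-tempered semitones above A4 give a fifth that is close to, but not identical with, the just-intonation fifth of 3:2:

```python
import math

# Equal temperament divides the octave (frequency ratio 2:1) into
# 12 equal steps, so each semitone has the ratio 2**(1/12).
SEMITONE = 2 ** (1 / 12)  # ~1.0595

def equal_tempered(f0: float, semitones: int) -> float:
    """Frequency lying `semitones` equal-tempered steps above f0."""
    return f0 * SEMITONE ** semitones

a4 = 440.0                          # concert pitch A4
fifth_et = equal_tempered(a4, 7)    # E5, seven semitones above A4
fifth_just = a4 * 3 / 2             # just-intonation fifth (ratio 3:2)

print(round(SEMITONE, 4))           # 1.0595
print(round(fifth_et, 2))           # 659.26
print(round(fifth_just, 2))         # 660.0
```

The small discrepancy between 659.26 Hz and 660 Hz is the price equal temperament pays for making all twelve keys equally usable.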
Apart from atonal music, which lacks a tonal center, music normally does not use all 12 chromas but a restricted number of them to form a modal scale. Typical scales in Western music are major or minor scales using seven chromas, such as in A major: A, B, C#, D, E, F#, G#. The first tone of the scale, the tonic, is the tonal center of the scale and lays the foundation for a tonal hierarchy that is established within the scale (Krumhansl 1990). The tonic, dominant (V), and mediant (III) positions of the major scale, for example, are most stable, while the leading tone (VII) is the most unstable position. Musicians and also non-musicians have an implicit knowledge of tonality (Bigand, Parncutt, & Lerdahl 1996; Janata et al. 2003; Toiviainen & Krumhansl 2003). Listeners therefore form expectations of which pitches are coming next and respond emotionally if these expectancies are violated (Sloboda 1998; Bharucha 2002; Grewe et al. 2007; Altenmüller & Bernatzky 2015).
The perception of consonance and dissonance is closely related to the perception of tonality, especially in the case of aesthetic dissonance (Plomp 1976; Dowling 2010). Aesthetic dissonance of an auditory event is determined by the musical context in which the auditory event is embedded. A single tone can be aesthetically dissonant if it is unstable and creates tension. Contrary to aesthetic dissonance, tonal or acoustic dissonance does not depend on the context but only on the harmonic structure of the tone cluster. The acoustic dissonance of complex tones is the sum of the dissonances produced by the interaction of their harmonics (Kameoka & Kuriyagawa 1969a,b). Combinations of complex tones are less dissonant when the ratios of the fundamental frequencies are simpler. Complex tones separated by a fifth (3:2) or a major third (5:4), for example, are thus more consonant than complex tones separated by a minor second (16:15). The evaluation of dissonance has evolved over the last centuries (Ebeling 2009). Intervals previously considered dissonant gradually replaced consonant chords (Carter 1945).
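The summation rule described by Kameoka and Kuriyagawa can be illustrated with a simplified roughness model. The sketch below is not their exact algorithm: it uses a Plomp–Levelt-style roughness curve with Sethares' parameterization and invented six-harmonic tones with 1/k amplitudes:

```python
import math

def pair_dissonance(f1, a1, f2, a2):
    # Roughness of two partials (Plomp-Levelt-style curve, Sethares' constants).
    f_lo = min(f1, f2)
    s = 0.24 / (0.021 * f_lo + 19)          # critical-band scaling
    x = s * abs(f2 - f1)
    return min(a1, a2) * (math.exp(-3.5 * x) - math.exp(-5.75 * x))

def tone_dissonance(f0_a, f0_b, n_harmonics=6):
    # Sum the roughness over all pairs of harmonics of two complex tones.
    partials = [(k * f0_a, 1 / k) for k in range(1, n_harmonics + 1)] + \
               [(k * f0_b, 1 / k) for k in range(1, n_harmonics + 1)]
    total = 0.0
    for i in range(len(partials)):
        for j in range(i + 1, len(partials)):
            total += pair_dissonance(*partials[i], *partials[j])
    return total

# C4 against a perfect fifth (3:2) vs. a minor second (16:15):
fifth = tone_dissonance(261.6, 261.6 * 3 / 2)
minor_second = tone_dissonance(261.6, 261.6 * 16 / 15)
print(fifth < minor_second)  # True: the simpler ratio is less dissonant
```

With the 3:2 ratio, many harmonics of the two tones coincide exactly and contribute no roughness, which is why the simpler ratio scores lower.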
Although the perception of a stimulus cannot be fully deduced from its physical properties alone but would also require consideration of historical or individual experience, there are universal correlations between the physics and the perception of sound. To a high degree, the perception of loudness correlates with the amplitude of an oscillation, and pitch correlates with the frequency of an oscillation (Fricke & Louven 2008). The mapping of physical dimensions to perceptual dimensions, however, is more difficult when it comes to timbre. Timbre is the perceptual attribute by which we distinguish instruments even if they play the same note with the same loudness (ANSI 1973; Donnadieu 2007). Timbre is multidimensional, and brightness and attack are two dimensions that have consistently been identified as important (Grey 1977; Wessel 1979; Krumhansl 1989; McAdams & Cunibile 1992; Krimphoff, McAdams, & Winsberg 1994; McAdams et al. 1995). A physical correlate of brightness is the spectral centroid of a sound, and a physical correlate of attack is onset rapidity, the time a sound needs to develop its full amplitude (Krimphoff, McAdams, & Winsberg 1994). Inharmonic tones have weaker pitch salience (Terhardt, Stoll, & Seewann 1982) and provide a less stable perception of the tonal hierarchy (Russo et al. 2007).
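The spectral centroid mentioned as a physical correlate of brightness is straightforward to compute: it is the amplitude-weighted mean frequency of the magnitude spectrum. A minimal sketch (the test signals and sample rate are invented for illustration):

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Amplitude-weighted mean frequency of the magnitude spectrum (Hz)."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    return np.sum(freqs * spectrum) / np.sum(spectrum)

# A tone whose energy sits higher in the spectrum has a higher centroid
# and is typically heard as "brighter".
sr = 16000
t = np.arange(sr) / sr
dull = np.sin(2 * np.pi * 200 * t)
bright = dull + 0.8 * np.sin(2 * np.pi * 3000 * t)
print(spectral_centroid(dull, sr) < spectral_centroid(bright, sr))  # True
```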
Rhythm refers to the temporal organization of events as the music unfolds in time (Thaut 2005). The perceptual correlate of time is duration. We perceive beats as pulses marking off units of equal duration (Dowling & Harwood 1986). We perceptually group beats into patterns of two, three, or four (Fraisse 1982). Meter imposes an accent structure on these beats by regularly alternating between strong and weak beats (Clarke 1999). Accents can be induced by changes in the beats, such as in intensity or pitch.
In brief, the physical and cognitive-perceptual nature of music comprises simultaneous and sequential components (Thaut 2005). This makes music perception unique compared with the perception of other art forms such as visual art or language.
1.2 Hearing and Hearing Loss

Sound passes through various processing stages of the auditory system until it evokes a perception in the brain (Figure 1:1).
Figure 1:1 Simplified schematic of the processing stages of the auditory system: outer ear (pinna, ear canal), middle ear (eardrum, ossicles), cochlea, auditory nerve, and central system.
The pinna collects impinging sound waves and channels them via the ear canal to the eardrum. The eardrum vibrates, and the vibrations are transmitted through a chain of three ossicles to the cochlea. The geometry of the eardrum and the ossicles yields an amplification of the pressure of the vibrations by a factor of 22 (Geisler 1998). This amplification allows for an efficient transformation of the original air-borne sound waves into liquid-borne sound waves in the cochlear fluids. The waves in the cochlea cause deflections of the stereocilia of the hair cells, which trigger electrical impulses that are then relayed via the auditory nerve and nuclei in the brain stem to the primary and secondary auditory cortices (Seifritz et al. 2001). Two types of hair cells can be distinguished: the inner hair cells and the outer hair cells (Purves et al. 2001). The inner hair cells function as sensory receptors, while the outer hair cells actively influence the mechanics of the cochlea (Yates 1995). This so-called active mechanism (Oxenham & Wojtczak 2010) tunes the cochlea to achieve higher frequency selectivity and works as a dynamic compressor so that soft sounds are audible and loud sounds are still comfortable.
Hearing loss is the impairment of the transmission of sound at any processing stage of the auditory system. An impairment of the outer or middle ear results in conductive hearing loss (Johnson & Lalwani 2000). Sound is hindered or not efficiently transformed on its way to the inner ear, which leads to an attenuation of sound levels and higher auditory thresholds. Common causes are cerumen in the ear canal, damage to the eardrum, or stiffening of the ossicles (Northern 1984; Zadeh & Selesnick 2001).
An impairment of the cochlea involves damage to structures inside the cochlea such as the inner and outer hair cells (Liberman 1989). Damage to the inner hair cells leads to reduced sensitivity and thus higher auditory thresholds for hearing-impaired individuals (Liberman, Dodds, & Learson 1986). Damage to the outer hair cells affects frequency selectivity, and the range of levels between threshold and discomfort becomes smaller. This abnormal growth of loudness with increasing sound level is also known as loudness recruitment (Fowler 1937; Davis & Silverman 1970). Common causes of cochlear hearing loss are exposure to intense sound, use of ototoxic chemicals, or infection. Hearing loss as a side effect of ageing, presbyacusis, is also of cochlear nature.
Retrocochlear hearing loss refers to impairments at stages beyond the cochlea. Damage to the auditory nerve, for example, can be caused by the growth of a tumor (Weaver & Northern 1976).
1.3 Hearing Aids

Hearing aids are medical products for the treatment of hearing loss. They are computers that process the incoming signal according to the needs of the user. State-of-the-art hearing aids incorporate various signal-processing units that handle and transform a signal on its way from the microphone to the speaker. Two of these signal-processing units, the dynamic compressor and the frequency-lowering unit, are especially important for music perception.
1.3.1 Dynamic compression

Hearing aids apply gain to the input signal so that soft signals become audible and loud signals stay below the user's discomfort level. The level-dependent gain results in a dynamic compression of the signal. Because hearing loss varies across frequencies, the gain needs to be frequency-dependent. To achieve this, hearing aids commonly partition the signal into multiple frequency bands (Figure 1:2). State-of-the-art hearing aids contain filter banks with up to 20 bands. A gain is applied to each band individually.

Figure 1:2 Example of a signal being separated into multiple components by a filter bank.
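The band-splitting idea behind Figure 1:2 can be sketched in a few lines. The split below is an illustrative FFT-based decomposition with made-up band edges; real hearing aids use dedicated low-latency time-domain filter banks:

```python
import numpy as np

def filter_bank(signal, sample_rate, band_edges):
    """Split a signal into adjacent bands by zeroing FFT bins outside each band.
    Purely illustrative: shows that the bands sum back to the original signal."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    bands = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        masked = np.where((freqs >= lo) & (freqs < hi), spectrum, 0)
        bands.append(np.fft.irfft(masked, n=len(signal)))
    return bands

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 300 * t) + np.sin(2 * np.pi * 2500 * t)
low, high = filter_bank(x, sr, [0, 1000, 8000])   # two illustrative bands
print(np.allclose(low + high, x, atol=1e-6))      # True: bands reconstruct input
```

In a hearing aid, each band would receive its own level-dependent gain before the bands are recombined.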
Figure 1:3 shows a schematic gain curve and the corresponding input-output function of a frequency band. For input levels above the compression threshold (CT), the gain decreases with increasing signal level. The strength of the compression is specified as the compression ratio (CR). A compression ratio of 3:1, for example, increases the output level by 1 dB for an increase in input level of 3 dB. Below the compression threshold, the gain is reduced to prevent internal noise from becoming audible. In state-of-the-art hearing aids, the gain curve can further include a linear or less compressive segment in the transition from expansion to compression. There are established prescription standards for speech that define the compression parameters as a function of hearing loss (Scollie et al. 2005; Moore, Glasberg, & Stone 2010; Keidser et al. 2011). There are no compression standards for music.
Figure 1:3 Schematic input-output function (solid line, left ordinate) and gain curve (dashed line, right ordinate) of a frequency band in a hearing aid. Input levels above the compression threshold (CT) are compressed; input levels below the compression threshold are expanded.
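The static gain rule of a single band can be written out as a small function. The CT, CR, gain, and expansion values below are illustrative assumptions, not prescription targets:

```python
def band_gain_db(input_db, ct_db=45.0, cr=3.0, gain_at_ct_db=20.0, er=0.5):
    """Static gain of one band (illustrative parameters).
    Above CT: compression with ratio cr, i.e. output rises 1 dB per cr dB input.
    Below CT: expansion (er < 1), so gain shrinks for very soft inputs and
    internal noise stays inaudible."""
    if input_db >= ct_db:
        # each dB of input above CT adds only 1/cr dB of output
        return gain_at_ct_db - (input_db - ct_db) * (1 - 1 / cr)
    # expansion: each dB below CT removes (1/er - 1) dB of gain
    return gain_at_ct_db - (ct_db - input_db) * (1 / er - 1)

def output_level_db(input_db):
    return input_db + band_gain_db(input_db)

# With CR = 3:1, a 30 dB rise of the input above CT yields a 10 dB rise
# of the output.
print(round(output_level_db(75) - output_level_db(45), 6))  # 10.0
```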
The input-output function and gain curve displayed in Figure 1:3 are target values for static signals and/or when the system is in a steady state. Music and speech, however, are dynamic signals that can have abrupt changes in level. The effective compression of a signal depends on how fast the system reacts to changes in input level. Hearing aids usually react very fast to increases in level so that sudden peaks do not result in excessively high output levels (attack). In comparison, the time to increase the gain after decreases in signal level is longer (release). There is, however, no standard that defines attack or release precisely, and hearing-aid manufacturers use their proprietary methods. One potential method is to define attack or release as the time required for the gain to reach a certain percentage (e.g., 90%) of its target value.
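The asymmetric attack/release behavior can be sketched as a one-pole smoothing of the per-sample target gain. The time constants below are invented for illustration and do not reflect any manufacturer's values:

```python
import math

def smooth_gain(target_gains_db, sample_rate, attack_ms=5.0, release_ms=50.0):
    """One-pole smoothing of a per-sample gain target (illustrative constants).
    Falling gain (input level rose) follows the fast attack constant;
    rising gain (input level fell) follows the slower release constant."""
    a_att = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    a_rel = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    g = target_gains_db[0]
    smoothed = []
    for target in target_gains_db:
        coeff = a_att if target < g else a_rel
        g = coeff * g + (1 - coeff) * target
        smoothed.append(g)
    return smoothed

# A loud passage asks for 20 dB less gain, then the signal gets soft again.
sr = 16000
targets = [20.0] * 100 + [0.0] * 400 + [20.0] * 2000
smoothed = smooth_gain(targets, sr)

# Samples needed to cover 90% of each gain change, mirroring the
# "percentage of target" definition mentioned above.
attack_samples = next(i for i, g in enumerate(smoothed[100:]) if g <= 2.0)
release_samples = next(i for i, g in enumerate(smoothed[500:]) if g >= 18.0)
print(attack_samples < release_samples)  # True: attack is much faster
```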
1.3.2 Frequency lowering

The usability of conventional amplification for the treatment of hearing loss has various limits, especially when the patient suffers from profound high-frequency hearing loss. The power of the speaker can be insufficient to provide the hearing-aid wearer with the output level needed for the audibility of the high-frequency components. Speakers further produce distortions at high output levels. Moreover, high gains can lead to feedback problems in the hearing aid. Furthermore, dead regions are more likely to occur at higher frequencies (Pepler et al. 2014). Dead regions are regions of non-functioning inner hair cells and/or neurons along the basilar membrane which distort or inhibit the perception of the affected frequencies (Moore & Glasberg 1997).
Frequency lowering is a well-established alternative to conventional amplification for individuals with severe hearing impairment. In order to restore audibility, high-frequency components are reproduced in a lower frequency range.
There are different frequency-lowering strategies in hearing aids, such as frequency transposition and frequency compression (Kuk et al. 2009). In frequency transposition, the high-frequency portion of the spectrum is shifted down by a constant frequency difference and mixed with the original signal. In frequency compression, the bandwidth of the high-frequency components is compressed to fit into a narrower bandwidth, and there is no spectral superimposition of the original and the compressed signal. Figure 1:4 displays example input-output functions for frequency transposition and frequency compression.
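Assuming simple piecewise-linear mappings, the two strategies can be sketched as functions from input frequency to output frequency. The cutoffs, shift, and compression ratio below are invented for illustration:

```python
def transpose(f_in, cutoff=4000.0, shift=2000.0):
    """Linear frequency transposition (illustrative parameters): components
    at or above the cutoff are shifted down by a fixed amount and mixed with
    the original signal, so lowered and original regions can overlap."""
    return f_in - shift if f_in >= cutoff else f_in

def compress(f_in, cutoff=2000.0, ratio=2.0):
    """Nonlinear frequency compression (illustrative parameters): above the
    cutoff, input bandwidth is squeezed by `ratio`; below it, frequencies
    pass unchanged, and the two regions never overlap."""
    return f_in if f_in < cutoff else cutoff + (f_in - cutoff) / ratio

print(transpose(6000.0))  # 4000.0 - lands on top of existing content
print(compress(6000.0))   # 4000.0 - continuous mapping, no overlap
```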
Frequency-lowering strategies have primarily been developed to increase speech intelligibility and have become an integral feature of hearing-aid signal processing. This feature, however, has not been revised or optimized for music.
Figure 1:4 Examples of input-output functions (output frequency fout versus input frequency fin) for frequency transposition and frequency compression.
1.3.3 Research question and rationale

Hearing loss is a global problem. According to the World Health Organization (2015), 360 million people worldwide suffer from disabling hearing loss.
Research into the effects of hearing loss and hearing aids has focused primarily on speech intelligibility and has yielded innovative approaches to improve speech perception with hearing aids (e.g., Humes 1991; Levitt 2001; Barreto & Ortiz 2008; Tawfik et al. 2009; Cornelis, Moonen, & Wouters 2012). By contrast, research into the effects on music perception is scarce. Previous research indicates that hearing loss affects various aspects of music perception, such as pitch stability, the perception of dissonance, the ability to recognize instruments, and the enjoyment of music (Feldmann & Kumpf 1988; Leek & Summers 2001; Tufts et al. 2005; Neukomm 2006; Leek et al. 2008). State-of-the-art hearing aids do not fully compensate for the effects of hearing impairment on music perception, and hearing-aid users express a need to optimize their devices for music (Madsen & Moore 2014).
This thesis aims at developing strategies to improve the signal processing of hearing aids for music. To guide the development, further research into the effects of hearing loss and hearing aids on music perception was conducted. The thesis addresses the following three research questions:
(1) How does music perception differ between normal-hearing and hearing-impaired listeners?
(2) How does music perception differ in hearing-impaired listeners with and without hearing aids?
(3) How can hearing aids be optimized for music?
1.4 Study Overview
Four studies with a total of 115 participants were conducted to investigate the research questions presented above. The first two experiments investigated how hearing loss and hearing aids affect music perception. Based on the findings, a strategy regarding the dynamic compression in hearing aids was developed and successfully validated in the third experiment. In the fourth experiment, a novel strategy for frequency lowering was tested. Each of Chapters 2-5 covers one experiment and was written in the style of a journal manuscript.
Chapter 2 / Experiment 1
This chapter aimed at finding differences in music perception between normal-hearing and hearing-impaired listeners. For this purpose, a music test battery was developed that can measure the effect of hearing loss on important dimensions of music. An experiment with 19 normal-hearing and 21 hearing-impaired participants was conducted to investigate the effect of hearing impairment as the independent variable.
Chapter 3 / Experiment 2
This chapter aimed at investigating the effect of hearing aids on music perception. For that purpose, 25 hearing-impaired participants completed the previously developed music test twice, once with and once without hearing aids.
Chapter 4 / Experiment 3
This chapter aimed at optimizing dynamic compression in hearing aids for music. The dynamic range of music across ten different genres was investigated, and two optimized compression settings were developed. Thirty-one hearing-impaired participants compared the optimized compression settings to a compression standard (NAL-NL2) as a reference.
Chapter 5 / Experiment 4
In this chapter, a frequency-lowering method for music was developed. The benefit of the novel frequency-lowering approach for music was assessed in a study with 19 hearing-impaired listeners who had not experienced frequency lowering prior to the experiment.
References
Altenmüller, E., Marco-Pallares, J., Münte, T. F., & Schneider, S. (2009). Neural reorganization underlies improvement in stroke-induced motor dysfunction by music-supported therapy. Annals of the New York Academy of Sciences, 1169(1), 395-405.
Altenmüller, E., & Bernatzky, G. (2015). Musik als Auslöser starker Emotionen [Music as activator of strong emotions]. In Musik und Medizin (pp. 221-236). Springer Vienna.
ANSI (1973). Psychoacoustical Terminology. American National Standards Institute, New York.
Anvari, S. H., Trainor, L. J., Woodside, J., & Levy, B. A. (2002). Relations among musical skills, phonological processing, and early reading ability in preschool children. Journal of Experimental Child Psychology, 83(2), 111-130.
Bachem, A. (1937). Various types of absolute pitch. Journal of the Acoustical Society of America, 9, 146-151.
Bachem, A. (1950). Tone height and tone chroma as two different pitch qualities. Acta Psychologica, 7, 80-88.
Barreto, S., & Ortiz, K. (2008). Intelligibility measurements in speech disorders: A critical review of literature. Pró Fono Revista de Atualizacão, 20 , 201–206.
Barwick, J., Valentine, E., West, R., & Wilding, J. (1989). Relations between reading and musical abilities. British Journal of Educational Psychology, 59(2), 253-257.
Bersch-Burauel, A. (2004). Entwicklung von Musikpräferenzen im Erwachsenenalter: eine explorative Untersuchung [Development of music preferences in adulthood: an exploratory study]. PhD thesis, University of Paderborn, Germany.
Bharucha, J. (2002). Neural nets, temporal composites, and tonality. Foundations of Cognitive Psychology: Core Readings, 455.
Bigand, E., Parncutt, R., & Lerdahl, F. (1999). Perceiving musical tension in long chord sequences. Psychological Research, 62(4), 237-254.
Bregman, A. S. (1994). Auditory scene analysis: The perceptual organization of sound . MIT press.
Carter, E. H. (1945). The changing attitudes toward consonance and dissonance in various historical periods. Graduate Theses, Butler University.
Chan, A. S., Ho, Y. C., & Cheung, M. C. (1998). Music training improves verbal memory. Nature , 396 (6707), 128-128.
Clarke, E. F. (1999). Rhythm and timing in music. The Psychology of Music , 2, 473-500.
Cornelis, B., Moonen, M., & Wouters, J. (2012). Speech intelligibility improvements with hearing aids using bilateral and binaural adaptive multichannel Wiener filtering based noise reduction. Journal of the Acoustical Society of America, 131, 4743-4755.
Darwin, C. J. (1997). Auditory grouping. Trends in Cognitive Sciences, 1(9), 327-333.
Davis, H., & Silverman, S. R. (1970). Hearing and Deafness . Holt, Rinehart & Winston of Canada Ltd.
Deutsch, D. (1999). Grouping mechanisms in music. The Psychology of Music .
Donnadieu, S. (2007). Mental representation of the timbre of complex sounds. In Analysis, synthesis, and perception of musical sounds (pp. 272-319). Springer: New York.
Dowling, W. J. (2010). Music Perception. Oxford Handbook of Auditory Science. OUP Oxford.
Dowling, W. J., & Harwood, D. L. (1986). Music cognition . Academic Press.
Drobisch, M. W. (1852). Über musikalische Tonbestimmung und Temperatur . Weidmann.
Ebeling, M. (2009). Konsonanz und Dissonanz [Consonance and dissonance]. Musikpsychologie, Rowohlts.
Emmenegger, C., & Schwind, E. (2008). Musik, Wahrnehmung, Sprache [Music, perception, speech] . CH: Chronos.
Feldmann, H., & Kumpf, W. (1988). Musikhören bei Schwerhörigkeit mit und ohne Hörgerät [Listening to music with and without hearing aids]. Laryngologie, Rhinologie, Otologie und ihre Grenzgebiete, 67(10), 489-497.
Fowler, E. P. (1937). The diagnosis of diseases of the neural mechanism of hearing by the aid of sounds well above threshold (presidential address). The Laryngoscope , 47 (5), 289-300.
Fraisse, P. (1982). Rhythm and tempo. The Psychology of Music , 1, 149-180.
Fricke, J. P., & Louven, C. (2008). Psychoakustische Grundlagen des Musikhörens [Music listening: fundamentals in psychoacoustic]. In Stoffer, T. H., Oerter, R. (Eds.): Allgemeine Musikpsychologie [General music psychology], DE: Hogrefe.
Gardiner, M. F., Fox, A., Knowles, F., & Jeffrey, D. (1996). Learning improved by arts training. Nature .
Geisler, C. D. (1998). From sound to synapse: physiology of the mammalian ear . Oxford University Press.
Good, A. J., Russo, F. A., & Sullivan, J. (2014). The efficacy of singing in foreign-language learning. Psychology of Music , 0305735614528833.
Grey, J. M. (1977). Multidimensional perceptual scaling of musical timbres. The Journal of the Acoustical Society of America , 61 (5), 1270-1277.
Grewe, O., Nagel, F., Kopiez, R., & Altenmüller, E. (2007). Listening to music as a re-creative process: Physiological, psychological, and psychoacoustical correlates of chills and strong emotions. Music Perception , 24 (3), 297-314.
Hassler, M., Birbaumer, N., & Feil, A. (1985). Musical talent and visual-spatial abilities: A longitudinal study. Psychology of Music , 13 (2), 99-113.
Hetland, L. (2000). Learning to make music enhances spatial reasoning. Journal of Aesthetic Education, 179-238.
Ho, Y. C., Cheung, M. C., & Chan, A. S. (2003). Music training improves verbal but not visual memory: cross-sectional and longitudinal explorations in children. Neuropsychology , 17 (3), 439.
Humes, L. E. (1991). Understanding the speech-understanding problems of the hearing impaired. Journal of the American Academy of Audiology , 2(2) , 59–69.
Hunt, F. V. (1978). Origins in acoustics. ASA publication , 186 .
Janata, P., Birk, J. L., Tillmann, B., & Bharucha, J. J. (2003). Online detection of tonal pop-out in modulating contexts. Music Perception , 20 (3), 283-305.
Johnson, J., & Lalwani, A. K. (2000). Sensorineural and conductive hearing loss associated with lateral semicircular canal malformation. The Laryngoscope, 110(10), 1673-1679.
Kameoka, A., & Kuriyagawa, M. (1969a). Consonance theory part I: Consonance of dyads. The Journal of the Acoustical Society of America, 45(6), 1451-1459.
Kameoka, A., & Kuriyagawa, M. (1969b). Consonance theory part II: Consonance of complex tones and its calculation method. The Journal of the Acoustical Society of America, 45(6), 1460-1469.
Keidser, G., Dillon, H. R., Flax, M., Ching, T., & Brewer, S. (2011). The NAL-NL2 prescription procedure. Audiology Research, 1(1), 24.
Koelsch, S. (2009). A neuroscientific perspective on music therapy. Annals of the New York Academy of Sciences , 1169 (1), 374-384.
Koelsch, S., & Schröger, E. (2008). Neurowissenschaftliche Grundlagen der Musikwahrnehmung [Neuroscientific fundamentals of music perception]. In Bruhn, H., Kopiez, R., & Lehmann, A. C. (Eds.), Musikpsychologie. Das neue Handbuch (pp. 393-412). Reinbek: Rowohlt.
Koger, S. M., Chapin, K., & Brotons, M. (1999). Is music therapy an effective intervention for dementia? A meta-analytic review of literature. Journal of Music Therapy, 36(1), 2-15.
Krimphoff, J., McAdams, S., & Winsberg, S. (1994). Caractérisation du timbre des sons complexes. II. Analyses acoustiques et quantification psychophysique [Characterization of the timbre of complex tones. II. Acoustic analyses and psychophysical quantification]. Le Journal de Physique IV, 4(C5), C5-625.
Krumhansl, C. L. (1989). Why is musical timbre so hard to understand. Structure and perception of electroacoustic sound and music , 9, 43-53.
Krumhansl, C. L. (1990). Cognitive foundations of musical pitch (Vol. 17). New York: Oxford University Press.
Kuk, F., Keenan, D., Korhonen, P., & Lau, C. C. (2009). Efficacy of linear frequency transposition on consonant identification in quiet and in noise. Journal of the American Academy of Audiology , 20 (8), 465-479.
Leek, M. R., Molis, M. R., Kubli, L. R., Tufts, J. B. (2008). Enjoyment of music by elderly hearing- impaired listeners. Journal of the American Academy of Audiology, 19, 519-26
Leek, M. R., Summers, V. (2001). Pitch strength and pitch dominance of iterated rippled noises in hearing-impaired listeners. Journal of the Acoustical Society of America, 109(6), 2944–2954.
Levitt, H. (2001). Noise reduction in hearing aids: a review. Journal of Rehabilitation Research and Development , 38 , 111–121.
Liberman, M. C. (1989). Quantitative assessment of inner ear pathology following ototoxic drugs or acoustic trauma. Toxicologic Pathology , 18 (1 Pt 2), 138-148.
Liberman, M. C., Dodds, L. W., & Learson, D. A. (1986). Structure-function correlation in noise-damaged ears: A light and electron-microscopic study. In Basic and applied aspects of noise-induced hearing loss (pp. 163-177). Springer US.
Madsen, S. M., Moore, B. C. J. (2014). Music and Hearing Aids. Trends in Hearing, 18:2331216514558271.
McAdams, S., & Cunibile, J. C. (1992). Perception of timbral analogies. Philosophical Transactions of the Royal Society B: Biological Sciences, 336(1278), 383-389.
McAdams, S., Winsberg, S., Donnadieu, S., De Soete, G., & Krimphoff, J. (1995). Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological Research, 58(3), 177-192.
Moore, B. C. J. (2007). Cochlear hearing loss: physiological, psychological and technical issues. John Wiley & Sons.
Moore, B. C. J., & Glasberg, B. R. (1997). A model of loudness perception applied to cochlear hearing loss. Auditory Neuroscience , 3(3), 289-311.
Moore, B. C. J., Glasberg, B. R., & Stone, M. A. (2010). Development of a new method for deriving initial fittings for hearing aids with multi-channel compression: CAMEQ2-HF. International Journal of Audiology , 49 (3), 216-227.
Neukomm, E. (2006). Musikhören bei Schwerhörigkeit mit und ohne Hörgerät [Music listening with hearing loss, aided and unaided]. Retrieved July 15th 2014 from: http://www.pro-audito.ch/fileadmin/daten/Neukommpas_musikstudie.pdf
Northern, J. L. (Ed.). (1984). Hearing Disorders . Allyn & Bacon.
Overy, K. (2003). Dyslexia and music. Annals of the New York Academy of Sciences , 999 (1), 497-505.
Oxenham, A. J., & Wojtczak, M. (2010). Frequency selectivity and masking. Oxford Handbook of Auditory Science: Hearing , 3, 2.
Plomp, R. (1976). Aspects of tone sensation: a psychophysical study . Academic Press.
Purves, D., Augustine, G. J., Fitzpatrick, D., Katz, L. C., LaMantia, A. S., McNamara, J. O., & Williams, S. M. (2001). Two Kinds of Hair Cells in the Cochlea. Sunderland (MA): Sinauer Associates.
Révesz, G. (1914). Zur Grundlegung der Tonpsychologie [Fundamentals of tone psychology]. Leipzig, Germany: Veit & Company.
Russo, F. A., Cuddy, L. L., Galembo, A., & Thompson, W. F. (2007). Sensitivity to tonality across the pitch range. Perception, 36(5), 781-790.
Schacter, D., Gilbert, D., & Wegner, D. (2011). Sensation and perception. In Psychology. Worth Publishers.
Schellenberg, E. G. (2004). Music lessons enhance IQ. Psychological Science , 15 (8), 511-514.
Scollie, S., Seewald, R., Cornelisse, L., Moodie, S., Bagatto, M., Laurnagaray, D., Beaulac, S., & Pumford, J. (2005). The Desired Sensation Level multistage input/output algorithm. Trends in Amplification, 9(4), 159-197.
Seifritz, E., Di Salle, F., Bilecen, D., Radü, E. W., Scheffler, K. (2001). Auditory system: functional magnetic resonance imaging. Neuroimaging Clinics of North America, 11(2), 275-296.
Shepard, R. N. (1982). Structural representations of musical pitch. The Psychology of Music, 343-390.
Sloboda, J. (1998). Does music mean anything?. Musicae Scientiae , 2(1), 19-31.
Smeijsters, H. J. M. F., & Puhr, A. (1997). Musiktherapie bei Alzheimerpatienten: Eine Meta-Analyse von Forschungsergebnissen [Music therapy with Alzheimer patients: a meta-analysis of research results] . GE: Melos.
Standley, J. M., & Hughes, J. E. (1997). Evaluation of an early intervention music curriculum for enhancing prereading/writing skills. Music Therapy Perspectives, 15(2), 79-86.
Stumpf, C. (1890). Tonpsychologie [Psychology of sound]. Leipzig, Germany: Hirzel.
Swiss Federal Statistical Office, (2008). Kulturverhalten in der Schweiz. Eine vertiefende Analyse – Erhebung 2008 [Cultural behavior in Switzerland. A deep analysis – inquiry 2008]. Bundesamt für Statistik, Neuchâtel
Tawfik, S., El Danasoury, I., Moussa, A., & Naguib, M. F. N. F. (2009). Enhancement of speech intelligibility in digital hearing aids using directional microphone/noise reduction algorithm. The Journal of International Advanced Otology, 6(1), 74-82.
Terhardt, E., Stoll, G., & Seewann, M. (1982). Algorithm for extraction of pitch and pitch salience from complex tonal signals. The Journal of the Acoustical Society of America , 71 (3), 679-688.
Thaut, M. H. (2005). Rhythm, music, and the brain: Scientific foundations and clinical applications (Vol. 7). Routledge.
Thaut, M. H., McIntosh, G. C., Rice, R. R., Miller, R. A., Rathbun, J., & Brault, J. M. (1996). Rhythmic auditory stimulation in gait training for Parkinson's disease patients. Movement Disorders , 11 (2), 193-200.
Toiviainen, P., & Krumhansl, C. L. (2003). Measuring and modeling real-time responses to music: The dynamics of tonality induction. Perception-London , 32 (6), 741-766.
Tufts, J. B., Molis, M. R., & Leek, M. R. (2005) Perception of dissonance by people with normal hear- ing and sensorineural hearing loss. Journal of the Acoustical Society of America 118(2), 955–967.
Wallace, W. T. (1994). Memory for music: effect of melody on recall of text. Journal of Experimental Psychology: Learning, Memory, and Cognition , 20 (6), 1471.
Weaver, M., Northern, J. L. (1976). The acoustic nerve tumor. Hearing Disorders, Boston: Little, Brown.
Wessel, D. L. (1979). Timbre space as a musical control structure. Computer Music Journal , 45-52.
Whipple, J. (2004). Music in intervention for children and adolescents with autism: A meta-analysis. Journal of Music Therapy , 41 (2), 90-106.
WHO (2015). Fact sheet N°300. Retrieved from http://www.who.int/mediacentre/factsheets /fs300/en
Yates, G. K. (1995). Cochlear structure and function. Hearing, Academic Press San Diego, 41-74.
Zadeh, M. H., & Selesnick, S. H. (2001). Evaluation of hearing impairment. Comprehensive therapy , 27 (4), 302-310.
Chapter 2 Development of the Adaptive Music Perception Test
The content in this chapter was published in the journal Ear & Hearing (March-April edition 2015).
Impact Factor: 2.84
Authors: Martin Kirchberger and Frank Russo
Kirchberger, M. J., & Russo, F. A. (2015). Development of the Adaptive Music Perception Test. Ear and Hearing, 36(2), 217-228.
Abstract
OBJECTIVES. Despite vast amounts of research examining the influence of hearing loss on speech perception, comparatively little is known about its influence on music perception. No standardized test exists to quantify music perception of hearing-impaired (HI) persons in a clinically practical manner. This study presents the Adaptive Music Perception (AMP) test as a tool to assess important aspects of music perception with hearing loss.
DESIGN. A computer-driven test was developed to determine the discrimination thresholds of 10 low-level physical dimensions (e.g., duration, level) in the context of perceptual judgments about musical dimensions: meter, harmony, melody, and timbre. In the meter test, the listener is asked to judge whether a tone sequence is duple or triple in meter. The harmony test requires that the listener make judgments about the stability of chord sequences. In the melody test, the listener must judge whether a comparison melody is the same as a standard melody when presented in transposition and in the context of a chordal accompaniment that serves as a mask. The timbre test requires that the listener determine which of two comparison tones differs in timbre from a standard tone (ABX design). Twenty-one HI participants and 19 normal-hearing (NH) participants were recruited to carry out the music tests. Participants were tested twice on separate occasions to evaluate test–retest reliability.
RESULTS. The HI group had significantly higher discrimination thresholds than the NH group in 7 of the 10 low-level physical dimensions: frequency discrimination in the meter test, dissonance and intonation perception in the harmony test, melody-to-chord ratio for both melody types in the melody test, and the perception of brightness and spectral irregularity in the timbre test. Small but significant improvement between test and retest was observed in three dimensions: frequency discrimination (meter test), dissonance (harmony test), and attack length (timbre test). All other dimensions did not show a session effect. Test–retest reliability was poor (<0.6) for spectral irregularity (timbre test); acceptable (>0.6) for pitch and duration (meter test), dissonance and intonation (harmony test), and melody-to-chord ratio I and II (melody test); and excellent (>0.8) for level (meter test) and attack (timbre test).
CONCLUSION. The AMP test revealed differences in a wide range of music-perceptual abilities between NH and HI listeners. The recognition of meter was more difficult for HI listeners when the listening task was based on frequency discrimination. The HI group was less sensitive to changes in harmony and had more difficulty distinguishing melodies in a background of music. In addition, the thresholds to discriminate timbre were significantly higher for the HI group in the brightness and spectral irregularity dimensions.
The AMP test can be used as a research tool to further investigate music perception with hearing aids and to compare the benefit of different music-processing strategies for HI listeners. Future testing will involve larger samples and the inclusion of aided conditions, allowing for the establishment of norms so that the test may become appropriate for use in clinical practice.
2.1 Introduction
2.1.1 Motivation
Hearing impairment affects the perception of sound, leading to reductions in speech intelligibility and in the enjoyment of music. Extensive research into the effects of hearing loss on speech intelligibility (e.g., Humes 1991; Barreto & Ortiz 2008) has led to various attempts to improve speech perception through hearing aids (Levitt 2001; Tawfik et al. 2009; Cornelis, Moonen, & Wouters 2012). By comparison, research into the effects of hearing loss on music perception is in its infancy. Hearing-impaired listeners – unaided and aided – have specified various ways in which the enjoyment of music is affected: instruments are hard to distinguish, sounds are distorted, and melodies are difficult to recognize (Neukomm 2006; Sarchett 2006; Leek et al. 2008).
There are no standardized tests available that have been designed to adaptively assess music perception in individuals with hearing loss. A large number of music perception tests are specifically designed to assess musicality or the skills of healthy, formally trained individuals (Don, Schellenberg, & Rourke 1999; Woodruff 1983). Another group of music tests focuses on individuals with cochlear implants (Gfeller et al. 2002; Gfeller et al. 2005; Medel Medical Electronics 2006; Cooper, Tobey, & Loizou 2008; Looi et al. 2008; Spitzer, Mancuso, & Cheng 2008; Kang et al. 2009). The first music perception test battery to be developed specifically for hearing-aid users was the Music Perception Test (Uys & van Dijk 2011). This battery includes eleven subtests that assess perception in four dimensions: rhythm (4 subtests), timbre (2 subtests), pitch (2 subtests), and melody (3 subtests). As the subtests are not adaptive, their sensitivity to differences between normal-hearing and hearing-impaired listeners appears to be limited.
43 Chapter 2: Development of the AMP test
2.1.2 Concept
Music can be broken down into a core set of components such as meter, harmony, melody, and timbre. Meter is the division of a rhythmic pattern according to equal periods or measures (Limb 2006). Strong and weak beats in the pattern are grouped into larger units (Oh 2008), typically categorized as duple or triple according to whether the measure is organized in groups of 2 or 3 beats. The perception of harmony corresponds to the vertical organization of pitch (Piston 1987). This organization concerns simultaneous pitches as in a chord or in counterpoint. Another important aspect of harmony concerns the sequencing of chords, as in a harmonic progression.
A melody is a linear succession of tones, which is perceived as a single entity (Limb 2006). The succession of tones can be specified with regard to melodic contour and interval size (Davies & Jennings 1976). The former refers to the direction of the pitch change (Schubert & Stevens 2006) and the latter refers to its magnitude.
Timbre is often defined as the attribute of auditory perception whereby a listener can judge two sounds as dissimilar using any criterion other than pitch, loudness, and duration (ANSI 1960). This definition is somewhat unsatisfactory in that it defines timbre through what it is not rather than what it is (Sethares 2005). An alternative approach to defining timbre uses multidimensional scaling techniques to extract the underlying physical dimensions that contribute to its perception (Grey 1977; McAdams et al. 1995; Krumhansl 1989; Krimphoff, McAdams, & Winsberg 1994). Brightness and attack are two dimensions that have been consistently identified as important using multidimensional scaling techniques (Grey 1977; Wessel 1979; Krumhansl 1989; McAdams & Cunibile 1992; Krimphoff, McAdams, & Winsberg 1994; McAdams et al. 1995). The conclusions regarding a third dimension, however, differ. Within the framework of our study, we have defined a third dimension, in accordance with the conclusions of Krimphoff, McAdams, and Winsberg (1994), as spectral irregularity. Spectral irregularity is determined by the deviation from linearity in the amplitude envelope of the harmonics. A clarinet would have high spectral irregularity because it lacks energy in the even harmonics. A trumpet tone, by contrast, would have low spectral irregularity, as it possesses odd and even harmonics with a linear spectral roll-off.
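The notion of deviation from a locally linear harmonic envelope can be made concrete with a short sketch. The following Python function is one hedged formalization in the spirit of Krimphoff, McAdams, and Winsberg (1994), measuring how far each harmonic's amplitude departs from the mean of its three-point neighborhood; the thesis may compute or scale the measure differently, and the example spectra are illustrative, not measured instrument data:

```python
def spectral_irregularity(amps):
    """Sum of absolute deviations of each harmonic amplitude from the
    mean of its three-point neighborhood (a sketch of one common
    irregularity measure; not the thesis's exact computation)."""
    if len(amps) < 3:
        return 0.0
    return sum(abs(amps[k] - (amps[k - 1] + amps[k] + amps[k + 1]) / 3)
               for k in range(1, len(amps) - 1))

# Illustrative spectra: a clarinet-like tone with weak even harmonics
# scores high, while a trumpet-like tone with a smooth linear roll-off
# scores near zero.
clarinet_like = [1.0, 0.1, 0.8, 0.1, 0.6, 0.1, 0.4]
trumpet_like = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
```

A perfectly linear roll-off yields zero irregularity, because every interior harmonic equals the average of its neighbors, which matches the trumpet description above.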
2.1.3 Objective
We aimed to develop a computerized test that is adaptive and thus capable of assessing music perception in normal-hearing as well as hearing-impaired populations. The Adaptive Music Perception (AMP) test has been designed to investigate the perception of important music dimensions such as
meter, timbre, harmony, and melody. These mid-level dimensions are constructed from 10 low-level dimensions D1–D10 (Figure 2:1), such as attack, level, and pitch, which may ultimately be modified in a fairly direct manner through changes to hearing-aid technology. While the assessment is not entirely culturally neutral, there is no reason to expect that performance will vary across individuals who primarily consume music in the Western harmonic idiom, regardless of genre preferences. Each test of the battery asks the listener to make a response concerning a mid-level dimension while adaptively manipulating the underlying low-level dimensions, thereby allowing for the determination of the respective low-level discrimination thresholds. Thus, thresholds are determined within a musically relevant framework. We contend that the use of these tests corresponds more directly to functional hearing in music than a conventional assessment of discrimination thresholds.
2.1.4 Hypothesis
Survey studies of individuals with hearing loss have revealed a wide range of problems encountered following the onset of hearing loss, including difficulties with melody recognition and the perception of level changes (Feldman & Kumpf 1988; Leek et al. 2008). We assume that these subjective impressions of problems can be
Figure 2:1 The mid-level and low-level dimensions of the Adaptive Music Perception test. D1–D10 indicate dimensions 1–10; MCR, melody-to-chord ratio; Sp. Irreg., spectral irregularity.
objectively tracked with adaptive listening tests that are presented within a musical framework. We predict that individuals with hearing loss will have higher discrimination thresholds than individuals with normal hearing.
2.2 Materials and Methods
The AMP test comprises four subtests: meter, harmony, melody, and timbre discrimination.
2.2.1 General test design
Each subtest was designed to determine the discrimination thresholds of two or more low-level physical dimensions. In each trial, one of these low-level dimensions was manipulated so as to instantiate the mid-level dimension. For example, the downbeat in the meter subtest could be conveyed by a difference in level, duration, or pitch. The adaptive process of each low-level dimension was interleaved to obscure the structure of the test and to discourage analytic modes of listening.
We pseudo-randomized the position of the correct answer with the constraint that correct answers were equally distributed between the two answer buttons.
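This balanced pseudo-randomization can be sketched in a few lines of Python. The helper below is hypothetical (the thesis does not describe its implementation) and simply shuffles a list that contains each button position equally often, assuming an even trial count:

```python
import random


def balanced_answer_positions(n_trials):
    """Pseudo-random sequence of correct-answer positions (0 = first
    button, 1 = second button) with equal counts per button.
    A hypothetical sketch of the constraint described in the text;
    assumes n_trials is even."""
    positions = [0, 1] * (n_trials // 2)  # exactly balanced
    random.shuffle(positions)             # pseudo-random order
    return positions
```

Shuffling a pre-balanced list guarantees the equal-distribution constraint exactly, whereas drawing each position independently at random would only balance the buttons on average.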
2.2.2 Adaptation
An experimental procedure is adaptive if the selection of the stimuli is determined by the results of the preceding trials (Treutwein 1995). Our motivation for making the AMP test adaptive was to measure performance in a manner that was fast and accurate for all degrees of hearing loss and music ability.
The AMP test applies the transformed two-alternative-forced-choice method (2-AFC) using a 2-down 1-up rule (Levitt 1971), which is commonly used in tests of auditory thresholds (e.g., Jesteadt, Wier, & Green 1977; Wier, Jesteadt, & Green 1977; Moore, Glasberg, & Shailer 1984; Bochner, Snell, & MacKenzie 1988).
The presentation and adaptation of the low-level dimensions in each subtest were interleaved. The adaptation process was stopped once a minimum of 6 reversal points was reached on each dimension. The standard step sizes were halved after the first and second reversals in an effort to accelerate the determination of thresholds without compromising accuracy. Discrimination thresholds were calculated by averaging the values of the last 4 reversals.
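The staircase rule just described (2-down 1-up, step halved after the first and second reversals, stop after 6 reversals, threshold taken as the mean of the last 4) can be sketched as follows. This is a minimal single-track illustration, not the thesis's actual software; the function and parameter names, and the deterministic observer used in the usage example, are our own:

```python
def staircase_2down1up(respond, start, step, n_reversals=6):
    """Run one 2-down 1-up adaptive track (Levitt 1971).

    respond(level) -> True if the simulated listener answers correctly
    at the given stimulus level.  The level decreases after two
    consecutive correct responses and increases after each incorrect
    one; the step size is halved after the first and second reversals;
    the track stops after n_reversals reversals and the threshold is
    the mean of the last four reversal levels (~70.7% correct point).
    """
    level = start
    correct_streak = 0
    last_direction = None  # 'down' or 'up'
    reversals = []
    while len(reversals) < n_reversals:
        if respond(level):
            correct_streak += 1
            if correct_streak == 2:          # 2-down: make the task harder
                correct_streak = 0
                if last_direction == 'up':   # direction change = reversal
                    reversals.append(level)
                    if len(reversals) <= 2:
                        step /= 2            # halve step after reversals 1, 2
                level -= step
                last_direction = 'down'
        else:                                # 1-up: make the task easier
            correct_streak = 0
            if last_direction == 'down':
                reversals.append(level)
                if len(reversals) <= 2:
                    step /= 2
            level += step
            last_direction = 'up'
    return sum(reversals[-4:]) / 4           # threshold estimate
```

With a deterministic observer that answers correctly whenever the level is at least 10 (`respond=lambda level: level >= 10`), a track started at level 20 with step 4 oscillates around 10 and converges to a threshold estimate of 9.5. In the actual AMP test, tracks for the different low-level dimensions of a subtest would run interleaved, with each trial drawn from a different track to discourage analytic listening.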
2.2.3 Sound stimuli
Stimuli used across all four subtests were digitally synthesized. A spectrogram of the stimuli is displayed in Figure 2:2. The reference tone that was used for calibration and throughout the tests was a complex tone with the fundamental frequency