The Pennsylvania State University

The Graduate School

Ecology

VOCAL NOISE COMPENSATION IN NONHUMAN MAMMALS: MODIFICATION TYPES AND USAGE PATTERNS

A Dissertation in

Ecology

by

Cara F. Hotchkin

© 2012 Cara F. Hotchkin

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

December 2012

The dissertation of Cara F. Hotchkin was reviewed and approved* by the following:

Susan E. Parks
Associate Professor of Acoustics and Ecology
Senior Research Associate, Penn State Applied Research Laboratory
Dissertation Adviser
Chair of Committee

Victoria A. Braithwaite
Professor of Fisheries and Biology

Thomas B. Gabrielson
Professor of Acoustics
Senior Scientist, Penn State Applied Research Laboratory

Tracy L. Langkilde
Associate Professor of Biology

Jennifer L. Miksis-Olds
Assistant Professor of Acoustics
Senior Research Associate, Penn State Applied Research Laboratory

David M. Eissenstat
Professor of Woody Plant Physiology
Chair, Intercollege Graduate Degree Program in Ecology

*Signatures are on file in the Graduate School

ABSTRACT

Vocal noise compensation and vocal flexibility in response to increased noise are important evolutionary adaptations that allow signalers to communicate successfully in highly variable acoustic environments. While many species of mammals use multiple types of noise-induced modifications, the details of vocal responses to noise are unclear. Open questions include whether specific noise parameters elicit certain modification types, whether animals can adjust different vocal characteristics independently, and what effects behavioral contexts have on vocal noise compensation. The goals of this dissertation were to evaluate these questions, and to characterize the effects of noise amplitude and bandwidth (including degree of spectral overlap) on the acoustic structure of vocalizations produced by two acoustically dependent non-human mammal species. Methods included passive acoustic recordings of vocalizations, investigation of the subjects’ acoustic habitats, and controlled playback experiments.

Amplitude and spectral characteristics of acoustic habitats were examined for two (one captive and one wild) beluga whale (Delphinapterus leucas) populations. In the wild habitat (Cook Inlet, Alaska), noise was measured during beluga encounters at two sites. Noise sources and levels varied between the two recording locations, with higher overall levels and a higher vessel/beluga encounter rate at Beluga River, and more noise from weather events at Kenai River. In the captive habitat, at Mystic Aquarium, noise levels were substantially higher than in Cook Inlet, and noise varied spatially within the exhibit and over both short and long temporal scales. Comparison of noise levels with published beluga audiograms suggests that, due to the differences in noise levels, belugas at Mystic Aquarium would be likely to hear noise at and above 2 kHz, whereas belugas in Cook Inlet are more likely to hear only noise above 4 kHz.

Noise-induced vocal modifications were studied in both the captive and wild beluga populations during increased noise from exhibit maintenance or vessel passages, respectively. In both populations, noise-induced changes to vocalization structure were observed, primarily in the spectral content of calls. In captive belugas, minimum call frequencies were significantly related to narrowband noise levels, while in the wild population peak frequency was more likely to be related to noise levels. There were no consistent relationships between the duration of calls and noise levels in either population. Behavioral or environmental contexts also had significant relationships with vocalization structure, indicating a possible interaction between the signalers’ communicative motivation and vocal noise compensation strategies, as observed in humans. Modifications observed from the captive whales appeared unlikely to increase communication success during increased exhibit noise, but those produced by wild whales may serve a noise compensation function.

Experimental tests of the onset order of vocal modification types and subjects’ attention to specific noise parameters were performed with a captive group of cotton-top tamarins (Saguinus oedipus), an acoustically dependent social New World primate. Observed vocal modifications included the Lombard effect, frequency shifts, and temporal changes, but differed between the two examined call types. Changes to the spectral content and amplitude of vocalizations appeared to occur non-simultaneously for both call types. Data in this chapter represent the first observations of short-term changes to the spectral content of non-human primate vocalizations, and the first empirical demonstration of shifts in spectral tilt in non-human mammal calls. Similarity in responses of humans, cotton-top tamarins, and other non-human mammals indicates relative consistency of selection for signalers able to compensate for short-term increases in noise level.

The work presented in this dissertation expands current knowledge of the effects of noise on acoustic communication by nonhuman animals, and advances both theoretical understanding of the evolution of vocal noise compensation and practical considerations of the use of vocal modifications for monitoring and conservation of vulnerable species. It provides substantial evidence for the importance of communicative motivation in vocal noise compensation, and demonstrates previously unknown vocal flexibility in a well-studied non-human primate species. This dissertation advances the study of acoustic communication by non-human mammals, with relevance to the evolution of communication and applied conservation biology.

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

ACKNOWLEDGEMENTS

Chapter 1 Introduction

    Acoustic communication by animals
    Relevance of this study
    Summary of chapters and appendices
    References

Chapter 2 The Lombard Effect and other noise-induced vocal modifications: insights from mammalian communication systems

    Abstract
    Introduction
    Historical context and terminology
    Effects of noise on human speech
    The Lombard effect in humans
    Other NIVMs
    Lombard effect – NIVM linkage (“Lombard speech”)
    Effects of noise on non-human mammal vocalizations
    Vocal noise compensation in non-human mammals
    The Lombard effect in mammals
    Other NIVMs in non-human mammals
    Lombard effect – NIVM linkage in non-human mammals
    Future research directions
    Future directions for human research
    Future directions for non-human mammal research
    Conclusions
    APPENDIX: Glossary of terms
    Acknowledgements
    References

Chapter 3 Characterization of acoustic habitat in captive and wild beluga environments

    Abstract
    Introduction
    Acoustic Habitats
    Methods
    Data Collection and processing
    Noise analyses
    Results
    Wild environment: Cook Inlet, Alaska
    Captive environment: Mystic Aquarium
    Discussion
    Cook Inlet
    Mystic Aquarium
    Conclusions and Future Work
    References

Chapter 4 Effects of noise on the vocalizations of captive beluga whales

    Abstract
    Introduction
    Vocal Noise Compensation
    Acoustic communication by beluga whales
    Mystic Aquarium
    Methods
    Data Collection
    Vocalization analyses
    Noise Analyses
    Statistical Analyses
    Results
    Noise
    Spectral modifications
    Temporal modifications
    Discussion
    Vocal noise compensation
    Context dependency and other confounding factors
    Conclusions
    References

Chapter 5 Effects of noise on the vocalizations of wild beluga whales in Cook Inlet, Alaska

    Abstract
    Introduction
    Cook Inlet belugas
    Repertoire classifications
    Vocal modifications
    Methods
    Data collection and processing
    Vocalization analyses
    Noise Analyses
    Statistical analyses
    Results
    Noise levels
    Call characteristics
    Site Differences
    Spectral Modifications
    Temporal Modifications
    Discussion
    References

Chapter 6 Vocal noise compensation in a non-human primate: effects of noise bandwidth and level on cotton-top tamarin (Saguinus oedipus) vocalizations

    Abstract
    Introduction
    Research questions and hypotheses
    Methods
    Animal Care
    Playback stimuli
    Data collection
    Data Processing
    Data analyses
    Statistical analyses
    Results
    Spontaneous vs. antiphonal calls
    The Lombard Effect
    Temporal Modifications
    Spectral Modifications
    Onset order of modifications
    Discussion
    Combination long calls (CLCs)
    Chirps
    Confounding factors
    Implications for vocal noise compensation
    Conclusions
    References

Chapter 7 Summary and Conclusions

    Summary of chapters
    General discussion
    Noise induced vocal modifications
    Future Research
    Vocal noise compensation
    Effects of behavioral context
    Vocal repertoires
    Conclusions
    References

Appendix A Noise and beluga vocalizations from upper Cook Inlet, Alaska during August 2007

    Abstract
    Introduction
    Methods
    Data Collection
    Data Analysis
    Results
    Noise
    Beluga encounters and vocal behavior
    Spectral overlap between noise and beluga vocalizations
    Discussion
    Acknowledgements
    References

Appendix B Matlab code for Acoustic Habitat analyses

Appendix C Vocal repertoire and acoustic behavior of beluga whales at Mystic Aquarium in November 2010

    Abstract
    Introduction
    Study system
    Methods
    Data Collection
    Data Analyses
    Statistical analyses
    Results
    Vocal Repertoire
    Diel Patterns
    Discussion
    Vocal Repertoire
    Diel Patterns
    References

LIST OF FIGURES

Figure 2-1. Depiction of several possible noise-induced vocal modifications in the special case of a vocalization that is partially overlapped by low-frequency noise. Other modifications, including frequency decreases, changes to call timing, and redundancy, among others, are not depicted here. (a) Conceptual diagram of a vocalization (line) in ambient noise (shaded rectangle). Noise and vocalization amplitudes are indicated by the darkness of the rectangle and line, respectively. (b) Amplitude increase during increased noise; the original “Lombard effect”. (c) Frequency shift of vocalization. (d) Increase in duration of vocalization. (e) Simultaneous change to amplitude, spectral, and temporal parameters.

Figure 3-1. Map of Cook Inlet, Alaska, with EAR deployment locations marked with black stars and major rivers indicated by black lines. The two EAR deployments were within the Beluga River and Kenai River outflow zones. Relevant locations are marked with letters: A) Beluga River, B) Little Susitna River, C) Chickaloon Bay, D) Knik Arm, E) Turnagain Arm.

Figure 3-2. Diagram of the Arctic Coast exhibit at Mystic Aquarium with pools labeled. Gates between pools are indicated by hashes; there are two gates between the Main and Hold pools, one between Main and Med, and one between Med and Hold. The underwater viewing area is located underneath the semicircular canopy on the left side of the map. Image courtesy of Mike Osborn, Mystic Aquarium.

Figure 3-3. Example spectrogram (512 point FFT, Hanning window, 25% overlap) from the KR recording unit illustrating the variety of noise sources present in Cook Inlet. The beluga encounter in this example lasted approximately one hour, and consisted of whistles, buzzes, and echolocation clicks.

Figure 3-4. Unweighted power spectral density comparisons of the DSG noise floor (black), with the lowest (blue) and highest (red) noise recorded at Mystic Aquarium. During the quietest recordings, the noise floor interferes with measurements at frequencies over 6 kHz.

Figure 3-5. Unweighted 1/3 octave band noise levels for all beluga encounters at BR and KR, including all noise sources. Black lines represent the 5th and 95th percentiles of noise levels; blue lines represent 25th and 75th percentiles; 50th percentile indicated by red line.

Figure 3-6. Noise (unweighted) variability measured in the three Arctic Coast pools during November 2010. Black lines represent the 5th and 95th percentiles of noise levels; blue lines represent 25th and 75th percentiles; 50th percentile indicated by a red line. The smallest, shallowest pool (Med) has elevated mid-frequency noise levels between 200 Hz and 3 kHz, probably due to filtration and pool resonances. Hold and Main pools have relatively similar noise spectra, with peaks at around 120 Hz probably caused by electrical noise. Above 10 kHz, all pools have noise levels below the noise floor of the recording unit.

Figure 3-7. Sample spectrograms of noise from the Med (1704 11/8/2010) and Hold (1703 11/6/2010) pools. Note the increased low frequency noise in the Med pool, and the 5 kHz band in both pools.

Figure 3-8. 1/3 octave band noise levels measured during experimental manipulations of the exhibit chillers. There was no difference in noise attributable to chiller operations.

Figure 3-9. 1/3 octave band noise levels measured during routine exhibit maintenance dives during 2010 and 2011. During 2010, noise levels were elevated by up to 8 dB across the frequency spectrum, while in 2011 there was no significant increase in noise during dives due to an increase in background noise levels.

Figure 3-10. Day (0600 – 1759; black) and night (1800 – 0559; blue) noise in the Med and Hold pools. Noise dropped at night in both pools, with the most obvious decrease between 1 and 5 kHz. The peak in the Hold pool is likely caused by 120 Hz electrical noise.

Figure 3-11. Median 1/3 octave band levels for noise recorded in the Med pool during 2010 (black line) and 2011 (red and blue lines). Noise increased by 2 – 8 dB through the spectrum, but the majority of the energy remained within the 300 Hz – 1 kHz range.

Figure 3-12. Noise level percentiles at both Cook Inlet sites compared with the beluga hearing threshold (dashed black line; after Richardson et al. (1995)). Median noise levels (red line) are likely audible above 4 kHz at both sites. The range of audible noise at both locations overlaps with the frequency range of beluga vocalizations (100 Hz to >20 kHz), indicating a potential masking hazard for signalers.

Figure 3-13. Median 1/3 octave band noise levels (unweighted) and beluga hearing thresholds (after Richardson et al. (1995)) compared with yearly (left) and spatial (2010; right) variability in the Arctic Coast exhibit. Noise above 2 kHz is likely audible to the whales throughout the exhibit.

Figure 4-1. Use of tonal whistles (red/orange bars) and pulsed/hybrid calls (blue) during day (0600 – 1759) and night (1800 – 0559) by the beluga whales at Mystic Aquarium during November 2010. Call rate and usage of pulsed calls dropped dramatically between day and night; this appeared related to the trainer activities and not sunrise/sunset times. See Appendix C for more details on repertoire description and vocal behavior.

Figure 4-2. Diagram of the Arctic Coast exhibit at Mystic Aquarium with pools labeled. Gates between pools are indicated by hashes; there are two gates between the Main and Hold pools, one between Main and Med, and one between Med and Hold. The underwater viewing area is located underneath the semicircular canopy on the left side of the map. Image courtesy of Mike Osborn, Mystic Aquarium.

Figure 4-3. Photographs of the DSG unit a) in air, with weights and tether rope attached and b) deployed in the Med pool.

Figure 4-4. Spectrograms of the two call types selected for analysis. Both call types are highly stereotyped; flat whistles have a harmonic structure and flat frequency contour. CT1 vocalizations are pulsed and have a distinctive shape and clear sidebands. Similar call types have been observed in wild populations, including the group from which the two Mystic females were originally collected (Chmelnitsky and Ferguson 2012).

Figure 4-5. Illustration of the broad- and narrow-band noise selections used to evaluate short-term vocal modifications in the Mystic Aquarium data. The vocalization is outlined by the blue box, and noise parameters are shown in red. Broadband noise was measured across the entire frequency range (vertical red box; 20 Hz – 25 kHz), while narrowband noise was measured from the intersection of the broadband noise and 1/3 octave band containing either the peak (shown; shaded red rectangle) or minimum (not shown) frequency of the vocalization.

Figure 4-6. Spectrogram of the cleaning dive from November 15, 2010 illustrating the intermittent nature of the noise.

Figure 4-7. Relationship between peak call frequency (natural log-transformed) and narrowband noise levels for CT1 vocalizations. Date is not included as a factor in this model.

Figure 4-8. Relationship between duration of CT1 vocalizations and noise level in the 1/3 octave band containing the minimum frequency of the call. Date is not included as a factor in this model.

Figure 4-9. Spectrograms of whistles produced by the Mystic belugas (left), Mystic trainer whistle (center), and wild belugas in the Churchill, Manitoba population (right). The peak frequencies of the Mystic whistles are nearly identical, but the overall structure of all three sounds is also similar. It is unclear whether the Mystic belugas are mimicking trainer whistles or whether the similarities between the vocalizations and trainer whistles are coincidental. Churchill River vocalization figure modified from Chmelnitsky and Ferguson (2012) call type W1a.

Figure 5-1. Map of Cook Inlet, Alaska, with recorder locations marked with black stars and major rivers indicated by black lines. The two EAR deployments were within the Beluga River and Kenai River outflow zones. Relevant locations are marked with letters: A) Beluga River, B) Little Susitna River, C) Chickaloon Bay, D) Knik Arm, E) Turnagain Arm.

Figure 5-2. Spectrograms of call types shared between the Cook Inlet beluga population and the Churchill River population. Names and images adapted from Chmelnitsky and Ferguson (2012).

Figure 5-3. 24 hour spectrogram (512 point FFT, Hanning window, 25% overlap) from the Beluga River recording unit illustrating the prevalence of anthropogenic noise sources at the site. Five vessel passages were recorded on this day; belugas were detected during and after the last vessel passage.

Figure 5-4. Examples of vocalizations from low (A) and medium (B, C) frequency call categories. A) and B) are examples of frequency modulated (FM) calls, while C) is a flat contour (CW) vocalization.

Figure 5-5. Illustration of the broad- and narrow-band noise selections. The vocalization is outlined by the blue box, and noise parameters are shown in red. Broadband noise was measured across the entire frequency range (vertical red box; 20 Hz – 12.5 kHz), while narrowband noise was measured from the intersection of the broadband noise and 1/3 octave band containing either the peak (shown; shaded red rectangle) or minimum (not shown) frequency of the vocalization.

Figure 5-6. High-frequency calls produced at the BR recording site during a vessel passage. One mid-frequency call is visible at approximately 23 s. This type of vocal behavior, with consistent high-frequency vocalizations, was observed only once; more typical vocalizations ranged between 1 and 8 kHz, with minimal harmonic structure, even during increased noise. Broadband noise level measured during this recording was 114.9 dB re 1 µPa.

Figure 5-7. Natural log-transform of peak call frequencies for low-frequency CW calls in the Kenai River data against noise level in the 1/3 octave band containing the peak call frequency. Noise clips from the same encounter tended to have very similar 1/3 octave band levels. Encounter ID was not significant in this analysis.

Figure 5-8. Plot of all minimum vocalization frequencies against noise level in the 1/3 octave band containing minimum vocalization frequency. Beluga River data are marked in black, Kenai River data in red. Vocalizations from the Kenai River dataset had generally lower frequencies and varied with narrowband noise level; Beluga River vocalizations did not vary with increasing narrowband noise. The six data points outlined in the blue ellipse all occurred during a single encounter (BR34).

Figure 5-9. Duration of low-frequency CW calls from both sites. Boxes indicate interquartile range; horizontal lines represent means and whiskers represent range. Outliers are indicated with asterisks. Duration of vocalizations was significantly related to Encounter ID for six of eight comparisons (Table 5-4).

Figure 6-1. A captive cotton-top tamarin (Homer) from the Penn State colony.

Figure 6-2. Spectrograms of white noise playback stimuli recorded during trials. Treatments A – C have a bandwidth of 5 kHz and are presented at 2, 12, and 22 dB above ambient noise levels. Treatments D – F have a 10 kHz bandwidth and are presented at similar broadband rms amplitudes to treatments A – C. (1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution)

Figure 6-3. Spectrograms of the three elicitation stimuli played during trials. All three calls are from the same unfamiliar adult female, and were played in random order during each trial. (1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution)

Figure 6-4. Diagram of experimental setup. Letters indicate equipment placement; M = microphone; C = video camera; NS = speaker presenting noise stimulus; ES = speaker presenting elicitation stimulus.

Figure 6-5. A tamarin in the test cage after a test session. Note the transport box to the right of the test cage and microphone and webcam at the bottom of the image.

Figure 6-6. Spectrogram of CLC produced during treatment A (a) before and (b) after noise reduction processing in Adobe Audition. (1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution) Note that the highest formant of the call has been removed by the noise reduction processing. All calls that were selected in the noise-reduced files were verified against the original recordings to ensure accurate measures of call characteristics.

Figure 6-7. Spectrograms (1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution) of CLC (a) and chirp (b) vocalizations with measured frequency characteristics indicated. All measurements of CLCs were made on the call as a whole and the individual syllables within the call (not shown). This CLC consists of one chirp and four whistle syllables. Measurements of chirps were made from the fundamental frequency. Note that peak frequency measurements for all syllables, fundamental frequencies, and whole calls were taken automatically from the spectrum view in Raven 1.4 Pro (not shown).

Figure 6-8. Average CLC source levels vs. noise level. Colors represent individuals (Mulva – red, Bart – blue, Jerry – black), and symbols represent treatment types (Narrowband loud, medium, quiet: A, B, C; Broadband loud, medium, quiet: D, E, F). Note that call source level never decreases between control (42 dB re 20 µPa noise level) and treatment trials, and that inter- and intra-individual variability is high.

Figure 6-9. Duration of whole CLC vocalizations in each trial type (Narrowband loud, medium, quiet: A, B, C; Broadband loud, medium, quiet: D, E, F) averaged over all subjects. Light grey bars indicate control averages, dark grey represent treatment trials. Error bars indicate standard deviation.

Figure 6-10. Duration of chirp vocalizations in each trial type (Narrowband loud, medium, quiet: A, B, C; Broadband loud, medium, quiet: D, E, F) averaged over all subjects. Light grey bars indicate control averages, dark grey represent treatment trials. Error bars indicate standard deviation.

Figure 6-11. Minimum frequency of whole CLCs averaged over all subjects in each trial type (Narrowband loud, medium, quiet: A, B, C; Broadband loud, medium, quiet: D, E, F). Light grey bars indicate control trials, dark grey represent treatment trials. Error bars indicate standard deviations.

Figure 6-12. Peak frequency of whole CLCs averaged over all subjects as a function of noise level. Error bars indicate standard deviations.

Figure 6-13. Representative CLCs produced by Mulva during a) control and b) treatment A trials demonstrating changes to spectral tilt. All whistles from a) have strong fundamental frequencies and maximum energy in the 2nd harmonic, while in b) the first whistle has a very faint fundamental frequency at approximately 2 kHz, and peak frequencies for all whistles occur in the 4th harmonics. Reduced energy in the fundamental frequency is also apparent in the second and third whistles. Spectrogram parameters: 1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution.

Figure 6-14. Changes to spectral tilt of CLC whistles during different noise level/bandwidth combinations (treatments A – F), represented as the ratio of energy in the first eight formants (harmonics) to energy in the first formant (fundamental frequency) averaged within and between subjects. Solid lines represent averages from control trials, dashed lines represent treatment trials.

Figure 6-15. Changes to spectral tilt of non-CLC chirps during noise level/bandwidth combinations (treatments A – F), represented as the ratio of the amount of energy in the first three formants (harmonics) to energy in the first formant (fundamental frequency) and averaged within and between subjects. Solid lines represent averages from control trials, while dashed lines represent treatment trials.

Figure A-1. Upper Cook Inlet, Alaska, with August 2007 recording sites marked with black stars. Beluga encounters are marked with whale tail symbols.

Figure A-2. Average broadband rms noise levels from all sites according to tidal stage. Error bars indicate standard deviation. Tidal stage does not appear to drastically influence noise levels at the Port of Anchorage (PoA), and only minimally affects noise levels at Point Mackenzie and at the mid-Knik Arm site. Noise levels at Point Woronzof changed dramatically when tide changed from “high” to “outgoing”. Days spent at each recording site: 3 (PoA), 2 (Pt. Mac), 1 (Pt. Woronzof), 1 (mid-Knik).

Figure A-3. Typical 1/3 octave band noise levels for high tide recordings at all sites. Noise below 1 kHz at the two developed sites is due to increased anthropogenic activity in these areas. Increased noise at the developed sites extends up to approximately 8 kHz.

Figure A-4. Beluga vocalizations and ship noise recorded at the Port of Anchorage on 14 August 2007. 6-8 animals were observed travelling from north to south during loading of the container ship “Midnight Sun”. Note the high level of noise at low frequencies in this spectrogram.

Figure A-5. Example spectrograms of recorded call types. a) Contoured pulsed series similar to type W4d in Chmelnitsky and Ferguson (2012). b) Contour described in Sjare & Smith (1986); noisy component is not. c) Harmonic “whistle”; Belikov & Bel’kovich (2007) WT8. d) Upswept call; Belikov & Bel’kovich (2007) WT11. Spectrogram parameters: 256 point Hanning window, 50% overlap; 256 point DFT.

Figure A-6. Range of noise levels recorded in Cook Inlet plotted with the estimated beluga hearing threshold (modified from Richardson et al. 1995) and the range of beluga vocalization frequencies recorded in this dataset. The blue line indicates the maximum noise levels recorded (Port of Anchorage site), and the red line indicates minimum noise from mid-Knik Arm.

Figure C-1. Diagram of the Arctic Coast exhibit at Mystic Aquarium with pools labeled. Gates between pools are indicated by hashes; there are two gates between the Main and Hold pools, one between Main and Med, and one between Med and Hold. The underwater viewing area is located underneath the semicircular canopy on the left side of the map. Image courtesy of Mike Osborn, Mystic Aquarium.

Figure C-2. Examples of tonal vocalizations recorded from the beluga whales at Mystic Aquarium. Most tonal calls were flat contours of varying durations, with some rising contours and few highly frequency-modulated whistles that are seen in wild beluga populations. Noisy components at the ends of calls (panels 1, 3) were ignored for these analyses.

Figure C-3. Examples of pulsed and noisy vocalizations recorded from the beluga whales at Mystic Aquarium. These call types were highly variable and often included some tonal components (see: noisy buzzes panel). “Scream” vocalizations were rare (N=23, 0.005%) and grouped with “other” in the analyses.

Figure C-4. NBWT and CT1 exemplars. These call types were highly stereotyped and made up 15.3% of the total calls analyzed.

Figure C-5. Spectrograms of vocalizations produced during training sessions. These calls are produced in air, but are audible (and were recorded) underwater. The “cheer” call type is produced spontaneously by Kela after successful completion of a trained behavior.

Figure C-6. Mean number of calls per 5 minute subsample from each hour. Error bars indicate standard deviations. Calling rate increased around 0700 and decreased starting around 1400.

Figure C-7. Boxplot of calls per subsample for day (0600 – 1759) and night (1800 – 0559). Significantly more calls were detected during day than night hours.

Figure C-8. Mean-adjusted hourly calling rates in the Arctic Coast exhibit. Error bars indicate standard deviation. Calling rates increased sharply around 0700 and declined around 1300.

Figure C-9. Call types recorded during day (0600 – 1759) and night (1800 – 0559) hours in the Arctic Coast exhibit. Pulsed and noisy calls (blue shades) made up a significantly higher portion of the total calls during daytime hours than during the night. Total number of calls was also substantially higher in day (N=4,275) than night (N=524).

LIST OF TABLES

Table 2-1. Predictors of the Lombard effect in human speech and nonhuman mammals’ vocalizations with selected references for each factor. Empty boxes indicate that no studies have investigated a given factor in the indicated group. References marked with * indicate that the modification was observed during echolocation (self- communication)...... 14

Table 2-2. Presence of the Lombard effect and other NIVMs in mammals. The simultaneous modifications column refers to observations of two or more vocal modifications, generally the Lombard effect and temporal changes. Checkmarks indicate experimental evidence for a particular modification type. Blank spaces indicate no data is available. Species marked with † indicate that the existing evidence comes from social groups rather than individual signalers, and * indicate that the modifications were observed only during echolocation (self- communication)...... 20

Table 2-3. Noise-induced vocal modifications in humans and non-human mammals, including selected references. References marked with * indicate that the modification was observed during echolocation (self-communication)...... 30

Table 3-1. Recording schedules for the 2010 and 2011 data collection sessions at Mystic Aquarium. Note that not all dates are consecutive, due to availability of pools for recording. On days when two pools are listed, the DSG was moved between pools during the recording period. A check in the overnight column means that the DSG was recording overnight beginning on the date with the checkmark and continuing through the following day...... 55

Table 3-2. Chiller on/off schedule for 2011 recording sessions. AM indicates chillers were scheduled to go off at 0900 and on at 1200; PM indicates a scheduled off time of 1200 and an ‘on’ time of 1500. Actual times vary due to the automatic sensors used to automatically adjust chiller schedules to actual water temperatures. When two ‘on’ times are noted, it indicates that each chilling unit turned on separately rather than both coming online simultaneously...... 56

Table 3-3. 1/3 – Descriptive statistics for raw and M-weighted broadband noise levels at the Beluga and Kenai River datasets from Cook Inlet. BR generally had higher noise levels than KR, though minimum levels were similar. The maximum noise level at BR occurred during a single 30 s file when a small boat or aircraft was recorded. M-weighting decreased noise levels between 10 and 150 Hz, which belugas are unlikely to hear, and thus gives a more realistic measure of the belugas’ acoustic environment...... 58

Table 3-4. Variability in raw and M-weighted broadband noise levels in the BR and KR datasets. Noise levels were generally higher at BR. Rain was only recorded at KR...... 59

Table 3-5. 1/3 – Descriptive statistics for raw and M-weighted broadband noise levels in the Arctic Coast exhibit during 2010 and 2011. Average differences greater than 6 dBM are apparent between the day and night measurements in both the Med and Hold pools. Maintenance dives increased the average noise in the exhibit by 7 - 12 dB over average Med pool daytime noise. Unweighted noise levels are very high due to very low frequency (<100 Hz) noise in the entire exhibit...... 61

Table 4-1. Recording schedules for the 2010 and 2011 data collection sessions at Mystic Aquarium. Note that not all dates are consecutive, due to availability of pools for recording. On days when two pools are listed, the DSG was moved between pools during the recording period. A check in the “Dive” column indicates an exhibit maintenance dive occurred on that date; dives on Mondays were performed in the afternoon, and Tuesday dives occurred in the morning. The DSG was in the med pool during all dives...... 84

Table 5-1. Percentages of calls from each category at the two Cook Inlet recording locations. Kenai river vocalizations were concentrated in the low-frequency (0 – 4 kHz) band, while Beluga River vocalizations were more evenly distributed...... 114

Table 5-2. Regression statistics for data pooled across sites and call categories. Site/season was significant for all comparisons with minimum and peak call frequencies. There were no significant relationships with duration...... 116

Table 5-3. Regression relationships between the peak frequency of vocalizations and the noise level in the 1/3 octave band containing the call’s peak frequency. EID is a categorical factor relating to the individual encounter during which each call was recorded. All six call categories have significant relationships between peak call frequency and narrowband noise level; three of the call categories have significant relationships with encounter ID...... 117

Table 5-4. Regression relationships between the duration of flat-contour (CW) call types at the BR and KR sites and the three noise variables. In no case does the duration vary with noise level; the only significant relationships are between duration and encounter ID, which was significant in six of eight categories (bold text) and nearly significant for a seventh ...... 120

Table 6-1. Noise levels measured during all trial sessions, averaged over all subjects (N=5). Noise levels over the recording bandwidth (0 – 24 kHz) were similar but unequal between the high (A/D), medium (B/E), and low (C/F) noise amplitude treatments due to difficulties calibrating the playback equipment...... 147

Table 6-2. Average call amplitudes for both call types during control (‘Base VL’) and treatment (‘Trt VL’) periods. Chirps had higher baseline amplitudes, but maximum call amplitudes were similar for both vocalization types...... 148

Table 6-3. Presence/absence of vocal modifications by call and treatment types. Letters indicate the presence of a modification in that subject’s calls during each treatment type. Italic initials indicate a decrease in a given parameter for that subject. Mu = Mulva, J = Jerry, B = Bart, Mh = Milhouse, S = Susan. * indicates that Bart produced no chirps during these treatment types, and is not included in the analysis...... 155

Table A-1. Dates and locations of recordings in Cook Inlet during August 2007. The recording vessel sometimes visited multiple locations in a day, often moving to several locations within a given site in order to capture the full range of variability during the recording period...... 182

Table C-1. Recording schedules for 2010 data collection sessions at Mystic Aquarium. On days when two pools are listed, the DSG was moved between pools during the recording period. A check in the overnight column means that the DSG was recording overnight beginning on the date with the checkmark and continuing through the following day...... 204

Table C-2. Numbers and percent of the different call types found in the 5 minute data subsamples. “Other” calls were non-stereotyped vocalizations which did not fit in any defined categories (ex: scream, Figure C-4)...... 206

Table C-3. Call types divided by time of occurrence (Day: 0600 – 1759; Night 1800 – 0559). There was a dramatic change in the vocal repertoire between day and night, with fewer pulsed calls produced during night. “Other” calls were non-stereotyped vocalizations which did not fit in any defined categories (ex: scream, Figure C-4)...... 212

ACKNOWLEDGEMENTS

This dissertation has been supported by more people than I can possibly thank properly, and to each and every person who lent me an open ear, an extra set of hands, or gave advice, support or encouragement, thank you. If your name is not mentioned here, forgive me.

First and foremost, my heartfelt thanks to my advisor, Susan Parks, who has been the most enthusiastic, patient, and engaging mentor, friend, and guide that I could have wished for throughout this process. Her advice and guidance in both the field and the lab have been invaluable, and her endless support and patience have helped me through the more difficult parts of graduate school.

I have had extensive personal support from friends, family, and colleagues throughout this process. My parents, Barb and Mike Hotchkin, have always supported me in whatever I decided to do – thank you. Your love and support have meant the world to me, and I would not have been able to achieve nearly as much as I have if you hadn’t believed in me. My fellow graduate students, Sam Denes, Laura Madden, Chad Smith, Jenny Tennessen, Sarah Johnson, and Helen Marie Graves, have been wonderful sources of moral support, technical assistance, advice, and fun. Christen Clemson, Caitlin DeGrose, Mike Thompson and other friends have provided much appreciated breaks from the world of bioacoustics.

Each chapter of this dissertation has been supported by many professional contacts and talented researchers, but there are a few people who deserve special recognition. My committee members, Dr. Jen Miksis-Olds, Dr. Thomas Gabrielson, Dr. Tracy Langkilde, and Dr. Victoria Braithwaite, have all helped shape this dissertation, providing guidance on experimental design, analysis, and the process of science. Kathy Shoemaker has provided priceless assistance with administrative needs for five years. Kyla England generously donated two years of her time to sort through months of acoustic recordings in search of beluga vocalizations.
Helpful discussions and advice have come from multiple people in the field of beluga whale acoustics, and for their willingness to answer questions and help a beginning graduate student I thank Dawn Grebner, Christine Erbe, Peter Scheifele, and Valeria Vergara.

Working closely with the whales at Mystic Aquarium gave me a new appreciation for the intelligence and curiosity of marine mammals, and I owe thanks to many people who assisted with this project. My aunt and uncle, Nancy and John Morrisson, opened their home to me for weeks at a time and graciously put up with my coming and going at very odd hours. Gayle Sirpenski, Mike Osborn, Kristine Magao, and Tracy Romano spent hours going over research proposals and equipment design. The training staff was very generous with their time in helping me with deployments and making room for me to record training sessions with the whales, and David Mann provided extraordinary technical support during my first recordings at Mystic.

My work in Cook Inlet was helped at first by Barbara Mahoney, who found equipment and a boat for me to use for two summers of field work, and has continued to express enthusiastic support for my research as I have drifted away from my original project. Marc Lammers, Shannon Atkinson, Rachael Blevins, Manuel Castellote, and Bob Small, the members of Team Cook Inlet Beluga Acoustics, provided recordings of whales from Cook Inlet to compare with my data from Mystic Aquarium, and helpful input on analyses and impacts.

Dr. Daniel J. Weiss and the Penn State Comparative Communication Laboratory stepped in to help me add an experimental component to this dissertation, and I am profoundly grateful for the use of their animals and facilities, and for Dan’s helpful advice on data collection and analyses. I could not have completed these experiments without extensive support from Helen Marie Graves, Darryl Koif, and an army of undergraduate research assistants too numerous to name here.

Financial support for this work was provided by a University Graduate Fellowship from The Pennsylvania State University, Exploratory and Foundational Funding from Penn State Applied Research Laboratory, the National Defense Industrial Association Undersea Warfare Division, the Penn State IGDP in Ecology, and government support under and awarded by DoD, Air Force Office of Scientific Research, National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a.
All experiments performed were approved by the Penn State IACUC and Office for Research Protections under proposals submitted by Dr. Susan E. Parks and Dr. Daniel J. Weiss.

Chapter 1

Introduction

Acoustic communication by animals

Acoustic communication, or the use of sound to transmit information, is found throughout the animal kingdom (Shannon and Weaver, 1963; Bradbury and Vehrencamp, 1998). In environments where visual contact is limited, successful acoustic communication is particularly crucial – missed or misinterpreted signals may lead to fitness costs to signalers, including reductions in mating opportunities, lower foraging success, and separation from dependent offspring (Bradbury and Vehrencamp, 1998). To avoid these costs, signaling animals have evolved a variety of adaptations to ensure successful communication despite dynamic and demanding communication environments.

Acoustic signaling can be negatively impacted by noise (excess sound in the communication channel), which can cause receivers to make interpretation errors (Shannon and Weaver, 1963; Wiley, 2006). Noise sources include environmental, biogenic, and anthropogenic sounds which can cause vocalization masking, reducing a signaler’s effective communication range (‘active space’) (Brown, 1989; Richardson et al., 1995; Lohr et al., 2003; Jensen et al., 2012). The prevalence of noise across ecosystems and its potentially severe fitness consequences have made it a significant evolutionary driver in many acoustic communication systems, leading to the evolution of a wide variety of communication signals and noise compensation mechanisms in many species (Brumm and Slabbekoorn, 2005).

Noise compensation strategies can include changes to overt behaviors associated with communication, including shifts in call rate and repetition, as well as calling time and location (Brumm and Slabbekoorn, 2005; Fuller et al., 2007).
While these behavioral changes are likely to increase the probability of successful communication, they are also energetically expensive, and may put the signaler at a disadvantage if intended receivers are not listening when the signal is produced, or if the signaler cannot relocate to ensure being heard (Richardson et al., 1995; Fuller et al., 2007). An alternative strategy is for signalers to change the acoustic structure of vocalizations without changing their overall behavioral patterns, potentially increasing the chances of signal detection at a lower cost (Brumm and Slabbekoorn, 2005; Parks et al., 2007). Acoustic modifications can occur over a variety of time scales; long term (years to decades) changes have been observed in species with limited learning periods (Patricelli and Blickley, 2006) and those with long lifespans (Parks et al., 2007), implying either extreme behavioral plasticity or a genetic or learning component for some species. Short-term noise fluctuations are more common, however, and mechanisms to deal with millisecond- to hours-long shifts in noise are commonly observed across taxa (Brumm and Slabbekoorn, 2005; Patricelli and Blickley, 2006; Brumm and Zollinger, 2011).

Many animal species (including humans) are known to change acoustic characteristics of vocalizations produced during increased noise (Lane and Tranel, 1971; Brumm and Slabbekoorn, 2005; Brumm and Zollinger, 2011). Observed vocal modifications include increasing the vocalization amplitude in proportion to the noise level (the Lombard effect), shifting some or all of a vocalization out of the noise frequency band, and increasing the duration of the vocalizations (Lombard, 1911; Brumm and Slabbekoorn, 2005). While each of these modifications can individually increase the chances of successful communication, simultaneous use of multiple noise compensation mechanisms may further improve signalers’ chances of detection at minimal energetic cost (Gillooly and Ophir, 2010; Noren et al., 2011; Jensen et al., 2012).
Unfortunately, current knowledge of vocal noise compensation in non-human species is limited to a basic understanding of the types of modifications available to certain species, and the effects of single or multiple modifications on detectability or intelligibility of vocalizations remain poorly known (Nemeth and Brumm, 2009). The best-known acoustic communication system is human speech, for which both the predictors and ultimate effects of vocal modifications are well studied (Lu and Cooke, 2009; Garnier et al., 2010). This is not true for most non-human species, and particularly for non-human mammals, which have been relatively poorly studied (Brumm and Zollinger, 2011). While several non-human mammal species have shown evidence for both the Lombard effect and other vocal modifications (Lesage et al., 1999; Brumm et al., 2004; Scheifele et al., 2005; Egnor and Hauser, 2006), questions remain about both the aspects of noise the signalers are responding to and the degree to which animals can control their vocal noise compensation responses. In particular, do signalers in different acoustic environments use the same types of vocal modifications in response to noise? How does noise amplitude or frequency bandwidth affect whether a signaler will change the spectral or temporal parameters of a vocalization? Does noise amplitude induce changes to call frequency or timing? The answers to these questions can provide insight into the effectiveness of acoustic communication in noise and allow for mitigation of the impacts of noise on vulnerable species. They will also contribute to knowledge of the energetic costs of acoustic communication during noise.

Another aspect of vocal noise compensation that remains unexplored in non-human animals is whether vocal modifications are always employed simultaneously, as seen in humans (Lane and Tranel, 1971; Junqua, 1996), or whether non-human signalers can selectively employ the most effective modification type for a given noise environment. Although simultaneous use of multiple modifications has been reported for cetaceans, primates, and birds (Lesage et al., 1999; Brumm et al., 2004; Egnor et al., 2006; Patricelli and Blickley, 2006), there are no published reports on whether the onset of these modifications was sequential or simultaneous. Sequential onset of modifications may be a mechanism to reduce the overall cost of modified behaviors by allowing an incremental increase in vocal energy expenditure during noise, but this has yet to be examined, and evidence from human speech suggests that some modification types may be physiologically or anatomically linked (Lu and Cooke, 2009; Garnier et al., 2010). In addition, the cumulative impact of multiple modifications on the detectability and intelligibility of signals has not yet been addressed (Brumm et al., 2004).

The effects of acoustic environments on signal design and noise-induced vocal modifications are interesting, and can be compared between closely related species (or members of the same species) living in very different acoustic environments, and between very different species exposed to similar noise conditions.
Such experiments can provide insight into the evolution of acoustic communication and vocal noise compensation.
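
The masking argument earlier in this section, in which noise reduces a signaler's active space, can be made concrete with a small numerical sketch. The Python example below is illustrative only and is not drawn from the analyses in this dissertation. It assumes simple spherical spreading loss (20 log10 of range, referenced to 1 m), no absorption, and a fixed detection threshold; the function name `active_space_radius_m` and all numerical values are hypothetical.

```python
import math


def active_space_radius_m(source_level_db: float,
                          noise_level_db: float,
                          detection_threshold_db: float = 0.0) -> float:
    """Estimate the maximum range (m) at which a call is still detectable.

    Assumptions (hypothetical, for illustration only):
      * transmission loss is spherical spreading, TL = 20*log10(r), r in m
      * no absorption, reflection, or directionality
      * the call is detected when received level minus noise level is at
        least `detection_threshold_db`
    A result below 1 m means the call is masked even at the reference distance.
    """
    # Signal excess available to be "spent" on transmission loss.
    excess_db = source_level_db - noise_level_db - detection_threshold_db
    # Invert TL = 20*log10(r) to obtain the range where excess is used up.
    return 10.0 ** (excess_db / 20.0)


if __name__ == "__main__":
    # Hypothetical source and noise levels (dB re the same reference).
    quiet = active_space_radius_m(source_level_db=150.0, noise_level_db=90.0)
    noisy = active_space_radius_m(source_level_db=150.0, noise_level_db=110.0)
    print(f"quiet: {quiet:.0f} m, noisy: {noisy:.0f} m, "
          f"ratio: {noisy / quiet:.2f}")
```

Under these assumptions, every 20 dB rise in noise reduces the estimated detection range by a factor of ten, which is why even moderate noise increases can sharply reduce communication range. Real habitats (shallow water, forest canopy) will depart from spherical spreading, so the sketch should be read as a lower bound on complexity rather than a propagation model.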

Relevance of this study

The effects of noise on animal communication are relevant to studies of the evolution of acoustic communication and to conservation of species vulnerable to impacts of increasing anthropogenic noise. This dissertation seeks to evaluate the effects of noise level and bandwidth on the acoustic characteristics of vocalizations produced by non-human mammals in both natural and captive situations.

The two species used in the following studies represent extremes in the field of vocal noise compensation studies. Beluga whales (Delphinapterus leucas), with their extensive and highly flexible vocal repertoires, have been studied only in relatively uncontrolled situations in the wild, despite the availability of controlled, captive social groups in aquaria around the world (Lesage et al., 1999; Scheifele et al., 2005). Conversely, the vocal behavior of captive cotton-top tamarins (Saguinus oedipus), a non-human primate with apparently limited vocal flexibility, has been closely examined in relation to noise (Egnor and Hauser, 2006; Egnor et al., 2006; Egnor et al., 2007). Insights from the following chapters may be applied to the design of future experiments into vocal noise compensation with either of these species, and species-specific results may be useful for conservation efforts of beluga whales and cotton-top tamarins. Results from this dissertation provide evidence that vocal noise compensation responses in non-human mammals may be more complex than those observed for human subjects. At the very least, generalizations across species and distinct populations should be made with extreme caution and with consideration of the fact that some very closely related species react to identical stimuli in different ways (Francis et al., 2011).

Summary of chapters and appendices

The second chapter of this dissertation reviews the current knowledge of the Lombard effect and other vocal noise compensation mechanisms in human speech and non-human mammals’ vocalizations. The goals of this chapter are to summarize the literature relevant to noise-induced vocal modifications, reconcile terminology from human and non-human studies, and provide recommendations for future research in both fields. Chapters three, four, and five address the use of noise-induced vocal modifications by beluga whales, with reference to the animals’ acoustic habitats and potential differences in vocal behavior and vocal modifications from captive and wild belugas. Chapter three describes the acoustic habitat in captive and wild beluga habitats at Mystic Aquarium and Cook Inlet, Alaska, respectively. Effects of noise on vocalizations produced by the three belugas at Mystic Aquarium during passive acoustic recordings and environmental manipulations are reported in chapter four. Chapter five details noise-induced vocal modifications from a wild population of beluga whales exposed to noise from ship passages at two sites in Cook Inlet.

Hypothesized interactions of noise bandwidth and level and proposed linkages between the Lombard effect and other vocal noise compensation mechanisms are experimentally tested in chapter six. Playback experiments of noise of different bandwidth/level combinations were conducted with cotton-top tamarins, and results are reported in chapter six. A general discussion of the effects of noise on beluga and tamarin vocalizations, the importance of these results, conservation implications and suggestions for future research are presented in chapter seven.

The appendices to the dissertation are intended to document additional detail from the data collection and analyses presented in chapters three through six, as well as additional projects carried out during the author’s graduate career. Appendix 1 is a report of acoustic recordings and analysis of noise in upper Cook Inlet, Alaska during August 2007. Appendix 2 contains Matlab scripts used in the analysis of acoustic habitat in Cook Inlet and at Mystic Aquarium. Appendix 3 covers additional analyses performed on the data from Mystic Aquarium, including documentation of the animals’ vocal repertoire and diel behavioral trends.

References

Bradbury, J. W., and Vehrencamp, S. L. (1998). Principles of Animal Communication (Sinauer Associates, Inc, Sunderland, MA).
Brown, C. H. (1989). "The active space of blue monkey and grey-cheeked mangabey vocalizations," Animal Behaviour 37, 1023-1034.
Brumm, H., and Slabbekoorn, H. (2005). "Acoustic Communication in Noise," in Advances in the Study of Behavior, edited by P. J. B. Slater, C. T. Snowdon, T. J. Roper, H. J. Brockmann, and M. Naguib (Academic Press), pp. 151-209.
Brumm, H., Voss, K., Kollmer, I., and Todt, D. (2004). "Acoustic communication in noise: regulation of call characteristics in a New World monkey," J Exp Biol 207, 443-448.
Brumm, H., and Zollinger, S. A. (2011). "The evolution of the Lombard effect: 100 years of psychoacoustic research," Behaviour 148, 1173-1198.
Egnor, S. E. R., and Hauser, M. D. (2006). "Noise-induced vocal modulation in cotton-top tamarins (Saguinus oedipus)," American Journal of Primatology 68, 1183-1190.
Egnor, S. E. R., Iguina, C. G., and Hauser, M. D. (2006). "Perturbation of auditory feedback causes systematic perturbation in vocal structure in adult cotton-top tamarins," J Exp Biol 209, 3652-3663.
Egnor, S. E. R., Wickelgren, J. G., and Hauser, M. D. (2007). "Tracking silence: adjusting vocal production to avoid acoustic interference," Journal of Comparative Physiology A 193, 477-483.
Francis, C. D., Ortega, C. P., and Cruz, A. (2011). "Different behavioural responses to anthropogenic noise by two closely related passerine birds," Biology Letters.
Fuller, R. A., Warren, P. H., and Gaston, K. J. (2007). "Daytime noise predicts nocturnal singing in urban robins," Biology Letters 3, 368-370.
Garnier, M., Henrich, N., and Dubois, D. (2010). "Influence of sound immersion and communicative interaction on the Lombard effect," Journal of Speech, Language, and Hearing Research 53, 588-608.
Gillooly, J. F., and Ophir, A. G. (2010). "The energetic basis of acoustic communication," Proceedings of the Royal Society B: Biological Sciences 277, 1325-1331.
Jensen, F. H., Beedholm, K., Wahlberg, M., Bejder, L., and Madsen, P. T. (2012). "Estimated communication range and energetic cost of bottlenose dolphin whistles in a tropical habitat," The Journal of the Acoustical Society of America 131, 582-592.
Junqua, J. (1996). "The influence of acoustics on speech production: A noise-induced stress phenomenon known as the Lombard reflex," Speech Communication 20, 13-22.
Lane, H., and Tranel, B. (1971). "The Lombard sign and the role of hearing in speech," Journal of Speech, Language, and Hearing Research 14, 677-709.
Lesage, V., Barrette, C., Kingsley, M. C. S., and Sjare, B. (1999). "The effect of vessel noise on the vocal behavior of belugas in the St. Lawrence River estuary, Canada," Marine Mammal Science 15, 65-84.
Lohr, B., Wright, T. F., and Dooling, R. J. (2003). "Detection and discrimination of natural calls in masking noise by birds: estimating the active space of a signal," Animal Behaviour 65, 763-777.
Lombard, E. (1911). "Le signe de l'élévation de la voix," Annales des Maladies de l'Oreille 37.
Lu, Y., and Cooke, M. (2009). "The contribution of changes in F0 and spectral tilt to increased intelligibility of speech produced in noise," Speech Communication 51, 1253-1262.
Nemeth, E., and Brumm, H. (2009). "Blackbirds sing higher-pitched songs in cities: adaptation to habitat acoustics or side-effect of urbanization?," Animal Behaviour 78, 637-641.
Noren, D. P., Holt, M. M., and Williams, T. M. (2011). "Assessing long-term impacts of vocal compensation to ambient noise by measuring the metabolic cost of sound production in bottlenose dolphins," The Journal of the Acoustical Society of America 129, 2397.
Parks, S. E., Clark, C. W., and Tyack, P. L. (2007). "Short- and long-term changes in right whale calling behavior: The potential effects of noise on acoustic communication," The Journal of the Acoustical Society of America 122, 3725-3731.
Patricelli, G. L., and Blickley, J. L. (2006). "Avian communication in urban noise: causes and consequences of vocal adjustment," The Auk 123, 639-649.
Richardson, W. J., Greene, C. R. J., Malme, C. I., and Thompson, D. H. (1995). Marine mammals and noise (Academic Press, San Diego, CA).
Scheifele, P. M., Andrew, S., Cooper, R. A., Darre, M., Musiek, F. E., and Max, L. (2005). "Indication of a Lombard vocal response in the St. Lawrence River beluga," The Journal of the Acoustical Society of America 117, 1486-1492.
Shannon, C. E., and Weaver, W. (1963). The Mathematical Theory of Communication (University of Illinois Press, Chicago, IL).
Wiley, R. H. (2006). "Signal Detection and Animal Communication," in Advances in the Study of Behavior, edited by H. J. Brockmann, P. J. B. Slater, C. T. Snowdon, T. J. Roper, M. Naguib, and K. E. Wynne-Edwards (Academic Press), pp. 217-247.

Chapter 2

The Lombard Effect and other noise-induced vocal modifications: insights from mammalian communication systems

Abstract

Humans and non-human mammals exhibit fundamentally similar vocal responses to increased noise, including increases in vocalization amplitude (the Lombard effect) and changes to spectral and temporal properties of vocalizations. Different research focuses have resulted in significant discrepancies in study methodologies and hypotheses between fields, leading to specialized knowledge gaps and techniques from each. This review compares and contrasts noise-induced vocal modifications observed from human and non-human mammals with reference to experimental designs and the history of each field. Topics include the effects of communication motivation and subject-specific characteristics on the acoustic parameters of vocalizations, examination of evidence for a proposed biomechanical linkage between the Lombard effect and other spectral and temporal modifications, and effects of noise on self-communication signals (echolocation). Standardized terminology, cross-taxa tests of hypotheses, and open areas for future research in each field are recommended. Findings indicate that more research is needed to evaluate linkages between vocal modifications, context dependencies, and the finer details of the Lombard effect during natural communication. Studies of non-human mammals can benefit from applying tightly controlled experimental designs developed in human research, while studies of human speech in noise should be expanded to include natural communicative contexts. The effects of experimental design and behavioral contexts on vocalizations should not be underestimated as they may impact the magnitude and type of observed responses.

Introduction

Communication, defined as the transfer of information between a sender and a receiver, occurs in some form in nearly all animal groups (Bradbury & Vehrencamp, 1998; Shannon & Weaver, 1963). Particularly in environments where visual contact is limited, a high proportion of animals’ communicative interactions include acoustic signals which can be negatively impacted by excess energy (‘noise’) in the communication channel (Bradbury & Vehrencamp, 1998; Tyack, 2000). In acoustic communication, noise takes the form of unwanted sound, and has the potential to mask acoustic signals, disrupting the transfer of information and leading to errors in signal detection or interpretation by the receiver (Bradbury & Vehrencamp, 1998; Wiley, 2006). Such errors can have severe fitness consequences for both sender and receiver, including unnecessary aggressive interactions, missed mating opportunities, a lower quality selection of available mates, and disruptions to foraging and other critical behaviors (Bradbury & Vehrencamp, 1998). Noise can therefore serve as a strong selective force on acoustic communication and associated behavior in animals, leading animals to evolve specialized communication signals and adaptations to adjust for noise interference (Brumm & Slabbekoorn, 2005; Brumm & Zollinger, 2011).

To increase the probability of successful communication during periods of increased noise, many species have evolved a suite of noise-induced vocal modifications, most of which ultimately attempt to maintain the signal-to-noise ratio (SNR) of vocalizations during increased noise. These modifications include increasing the call amplitude proportional to the increase in noise level, shifting a vocalization’s frequency components to a band with less noise, or calling at a time when the noise level is low (Brumm & Slabbekoorn, 2005). Modifications can occur at the population level, which has been observed in both songbirds and cetaceans (Foote, Osborne & Hoelzel, 2004; Parks, Clark & Tyack, 2007; Patricelli & Blickley, 2006), where changes have occurred over periods of years to decades, indicating behavioral plasticity, or possible genetic or learning influences.
Alternatively, short-term modifications may occur when individual animals change vocal behaviors on the order of milliseconds to hours (Brumm & Slabbekoorn, 2005; Lane & Tranel, 1971; Patricelli & Blickley, 2006). These changes can include, among others, increases in vocal amplitude, shifts of a call’s spectral components out of the noise band, and changes in call duration (Figure 2-1). As an alternative to maintaining the SNR of a vocalization, animals may also make short-term changes to other communication behaviors, such as call rate and redundancy; these modifications do not increase a particular vocalization’s SNR, but may still increase the chances of successful communication (Brumm & Slabbekoorn, 2005).
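
The idea of maintaining SNR through amplitude compensation can be illustrated with a short calculation. The Python sketch below is not part of the original review; it assumes that call and noise levels are expressed in decibels relative to the same reference, so that SNR is their difference, and it introduces a hypothetical `slope` parameter (dB of call-level increase per dB of noise increase) to contrast full and partial compensation. All numerical values are illustrative.

```python
def snr_db(call_level_db: float, noise_level_db: float) -> float:
    """SNR in dB, assuming both levels share the same dB reference."""
    return call_level_db - noise_level_db


def compensated_call_level_db(baseline_call_db: float,
                              baseline_noise_db: float,
                              new_noise_db: float,
                              slope: float = 1.0) -> float:
    """Call level after a Lombard-style amplitude adjustment.

    `slope` is a hypothetical parameter: the dB increase in call level per
    dB increase in noise. slope = 1.0 fully maintains the baseline SNR;
    0 < slope < 1 represents partial compensation. Decreases in noise are
    assumed to leave the call level unchanged.
    """
    noise_increase_db = max(0.0, new_noise_db - baseline_noise_db)
    return baseline_call_db + slope * noise_increase_db


if __name__ == "__main__":
    # Illustrative values only: a 120 dB call in 90 dB noise, then noise
    # rises by 10 dB.
    base_call, base_noise, new_noise = 120.0, 90.0, 100.0
    print("baseline SNR:", snr_db(base_call, base_noise))
    for slope in (0.0, 0.5, 1.0):
        call = compensated_call_level_db(base_call, base_noise,
                                         new_noise, slope)
        print(f"slope {slope}: call {call} dB, "
              f"SNR {snr_db(call, new_noise)} dB")
```

In this sketch, only a slope of 1.0 restores the baseline SNR; any smaller slope leaves a residual SNR deficit that would have to be offset by other modifications, such as frequency shifts or changes in duration and redundancy.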

Figure 2-1. Depiction of several possible noise-induced vocal modifications in the special case of a vocalization that is partially overlapped by low-frequency noise. Other modifications, including frequency decreases, changes to call timing, and redundancy, among others, are not depicted here. (a) Conceptual diagram of a vocalization (line) in ambient noise (shaded rectangle). Noise and vocalization amplitudes are indicated by the darkness of the rectangle and line, respectively. (b) Amplitude increase during increased noise; the original “Lombard effect”. (c) Frequency shift of vocalization. (d) Increase in duration of vocalization. (e) Simultaneous change to amplitude, spectral, and temporal parameters.

While similar short-term vocal adjustments have been documented in species from frogs to humans (Brumm & Slabbekoorn, 2005), a mismatch exists in the terminology and the objectives of the fields of human and non-human vocal noise adjustment research. Historically, human research has focused on medical and technological applications, while studies of non-human animals have focused on the evolution of acoustic communication and conservation of species vulnerable to communication disruptions. Due in part to these different focuses, there has been a lack of communication between researchers studying noise-induced vocal modifications in humans and those working with non-human animals (Brumm & Zollinger, 2011). This communication gap has led to species-specific hypotheses about the use of noise-induced vocal modifications and the effectiveness of such changes in compensating for increased noise. These hypotheses could benefit from cross-taxa comparisons and sharing of experimental tools and techniques across fields. Recent reviews of the effects of noise on vocalizations from non-mammalian taxa (Brumm & Slabbekoorn, 2005; Patricelli & Blickley, 2006) and receiver adaptations (Bee & Micheyl, 2008; Brumm & Slabbekoorn, 2005; Haykin, 2005) are available. The scope of this review is therefore limited to adaptations of signaling mammals, with the goals of comparing and contrasting the literature relevant to the effects of noise on human speech and non-human mammalian vocalizations and recommending potentially rewarding paths for research in both fields. To adequately address the differences in the fields, results from human and non-human mammals will be addressed separately, and combined for comparison near the end of the chapter. Henceforth, the terms ‘animals’, ‘mammals’, and ‘mammalian’ will exclusively refer to non-human mammals. A full glossary of terms used in this review can be found in Appendix 1.

Historical context and terminology

In 1911, Etienne Lombard published a study of the effects of noise on the speech of human adults, finding that speech amplitude increased when patients with normal hearing were temporarily deafened with loud noise and returned to baseline levels when the noise was removed (Lombard, 1911). Lombard attributed the observed increase in speech amplitude to the patient’s inability to hear himself (‘sidetone regulation’), but more recent experimental results have called this assumption into question (Lane & Tranel, 1971). In the years since Lombard’s initial discovery, the relationship between increased noise level and vocalization amplitude has come to be known mainly as the “Lombard effect” (but is also referred to as: the Lombard sign, function, reflex, phenomenon, response, or reaction) (Brumm & Zollinger, 2011; Egan, 1967). Lombard’s results have been used as a diagnostic test for unilateral deafness, false or exaggerated hearing loss (“pseudohypacusis”), and other hearing pathologies (Brumm & Zollinger, 2011; Lane & Tranel, 1971; Nober & Simmons, 1981), and to develop speech recognition technologies (Junqua, 1993; Junqua, 1996; Junqua, Fincke & Field, 1999). Such practical applications have generated high levels of interest from medical professionals and software engineers, leading to detailed understanding of specific characteristics of the Lombard effect in humans. This has prompted an expansion of the definition of “Lombard effect” in the human literature to include simultaneous (and possibly anatomically linked) increases in vocalization amplitude and word and syllable durations, and changes to a variety of frequency characteristics (Garnier, Henrich & Dubois, 2010; Junqua, 1996; Lane & Tranel, 1971; Patel & Schell, 2008). 
In some studies, the term “Lombard speech” has been used to encompass all of these changes, while in others they have all been grouped under “Lombard effect” (Brumm & Zollinger, 2011; Garnier et al., 2010; Junqua, 1996; Lane & Tranel, 1971; Patel & Schell, 2008).

In contrast to the human literature, studies of non-human animals’ vocal responses to noise restrict “Lombard effect” to its original definition of changes in amplitude only (Brumm & Slabbekoorn, 2005; Brumm et al., 2004; Potash, 1972; Scheifele et al., 2005) and address changes to spectral and temporal characteristics of vocalizations separately. Few studies of non-human species have explicitly linked changes in vocalization amplitude to the other characteristics of Lombard speech, even though many species have been shown to employ a wide variety of noise-induced vocal modifications (Brumm & Slabbekoorn, 2005). Within non-human mammals, studies of vocal adjustments have yet to identify whether changes to amplitude, spectral content, and duration occur simultaneously, as in Lombard speech in humans, because most existing studies have not examined the Lombard effect in conjunction with other changes (Brumm et al., 2004; Brumm & Zollinger, 2011). The few studies which have explored this relationship in non-human taxa have shown differing results (Egnor & Hauser, 2006; Egnor, Iguina & Hauser, 2006a; Egnor, Wickelgren & Hauser, 2007; Parks et al., 2011; Tressler & Smotherman, 2009; Tressler et al., 2011). While free-tailed bats (Tadarida brasiliensis (Gervais, 1852)) produced simultaneous changes to vocalization amplitude, duration, and bandwidth during noise playback experiments (Tressler & Smotherman, 2009; Tressler et al., 2011), at least one species of non-human primate has failed to show simultaneous changes to vocalization amplitude, spectral content, and temporal properties (Egnor & Hauser, 2006), and a study of cetaceans found no correlation between call amplitude and call frequency (Parks et al., 2011). Such contradictory results make the application of the human-specific term “Lombard speech” and the human-literature definition of “Lombard effect” to non-human species problematic. 
In this review, “Lombard effect” will be used to mean an increase in vocalization amplitude level (VL) as a result of an increase in the background noise level (NL). The Lombard effect and other spectro-temporal modifications of vocalizations will collectively be referred to as noise-induced vocal modifications (NIVMs).

Effects of noise on human speech

In the century since Lombard first published his discovery, new questions about the mechanism of the effect and its impacts on communication have been raised (Brumm & Zollinger, 2011; Lane & Tranel, 1971; Lombard, 1911). As Lane and Tranel (1971) noted, the results of Lombard’s work and the studies that followed have implications for the relationships between auditory feedback and vocal output, and the intelligibility of speech in noise (Junqua, 1996; Lane & Tranel, 1971). These aspects of the problem are the most active areas of research today (Garnier et al., 2010; Patel & Schell, 2008). In particular, whether the Lombard effect and other NIVMs are caused by a subconscious effect of subjects’ attention to their own vocal feedback (‘sidetone regulation’) or the need for successful communication with others has been highly controversial (Brumm & Zollinger, 2011; Lane & Tranel, 1971). The “premium on intelligibility” suggested by Lane and Tranel (1971) is the first mention of the need for human speakers to be motivated to communicate with another individual. They noted that in many studies subjects performing a reading task exhibited lower-magnitude Lombard effects than subjects interacting with an experimenter, and interpreted this discrepancy as an indication of context-dependency during vocal adjustments. Studies of the Lombard effect in human speech have also elaborated on the effects of speaker characteristics such as age, gender, and language, and the impacts of speaking task, noise type, and other experimental design characteristics (Egan, 1967; Garnier et al., 2010; Junqua, 1996; Lane & Tranel, 1971).

The Lombard effect in humans

Lane and Tranel (1971) present a thorough review of the first six decades of research into the Lombard effect in human speech, but there have been few reviews of the substantial progress in this field in the last 40 years (see Junqua, 1996). Advances in this period include insights into the effects of noise type, mode of noise presentation, and task type (reading or interactive) on vocal responses, as well as the discovery of linkages between vocal amplitude and the spectral and temporal components of speech (Garnier et al., 2010; Junqua, 1996; Patel & Schell, 2008) (Table 2-1). The most obvious external factor affecting the presence and magnitude of the Lombard effect is noise level (NL) (Lane & Tranel, 1971; Lombard, 1911). The Lombard effect is usually a linear relationship between NL and VL, with some studies detecting an upper limit to increases in VL, possibly due to anatomical constraints (Amazi & Garber, 1982; Charlip & Burk, 1969; Egan, 1967; Egan, 1972; Garnier et al., 2006; Garnier et al., 2010; Hanley & Steer, 1949; Hormann et al., 1984; Junqua, 1993; Junqua et al., 1999; Lane & Tranel, 1971; Lombard, 1911; Patel & Schell, 2008; Scheifele, 2003; Schulman, 1989; Summers et al., 1988; Tartter, Gomes & Litwin, 1993; Ternström, Bohman & Södersten, 2006; Webster & Klumpp, 1962). The magnitude of this effect varies greatly, depending on the design of the experiment, but increases in VL generally range from 0.5 to 2 dB per 10 dB increase in NL (Lane & Tranel, 1971). While the precise reasons for the consistency of this response are unstudied, it is possible that the Lombard effect is the result of a tradeoff between anatomical constraints on the amplitude of signals and the maintenance of successful communication.

Multiple studies have demonstrated that the spectral characteristics of noise are an important driver of the Lombard effect (Egan, 1967; Egan, 1972; Garnier et al., 2010; Junqua et al., 1999; Letowski, Frank & Caravella, 1993; Lu & Cooke, 2008; Pittman & Wiley, 2001). The most commonly examined noise types are white noise, which contains equal energy across a broad frequency bandwidth, and “cocktail party” (multi-talker) noise, which contains spectral content similar to multiple human speakers and is theoretically more likely to mask speech than white noise of similar amplitude (Garnier et al., 2010). In one of the first studies to test the effects of spectral content of noise on the Lombard effect, Egan (1967, 1972) used playbacks of band-limited white noise at low, mid, and high frequencies, finding that mid-frequency noise, which contained more spectral overlap with speech, had the greatest impact on the degree of the observed Lombard effect. More recent studies have tested the effects of cocktail party noise against white noise, finding that cocktail party noise produces greater increases in VL at a comparable NL (Cooke & Lu, 2010; Garnier et al., 2006; Garnier et al., 2010; Junqua et al., 1999; Letowski et al., 1993; Lu & Cooke, 2008; Pittman & Wiley, 2001).

Garnier et al. (2010) found an effect of noise presentation method on the Lombard effect. 
In a test of cocktail party noise and broadband white noise played through standard supra-aural headphones and through loudspeakers at a distance of 1.5 m from the subjects, they found a magnified effect of headphones, with study subjects producing VLs 4.7 dB higher than when noise was played through loudspeakers (Garnier et al., 2010). These results indicate that studies using supra-aural headphones may substantially overestimate the Lombard effect, because headphones limit subjects’ self-feedback relative to a more natural listening situation.
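
Because the Lombard effect is usually described as a linear relationship between NL and VL, its magnitude can be summarized as the slope of a least-squares line, expressed in dB of VL per 10 dB of NL. The following is a minimal sketch under that linearity assumption. The function name and the data are hypothetical, not measurements from the studies cited above, and the sketch ignores the upper limit on VL reported by some authors.

```python
def lombard_slope(noise_levels_db, vocal_levels_db):
    """Ordinary least-squares fit of VL on NL.

    Returns (slope, intercept), where slope is in dB of VL per dB of NL.
    """
    n = len(noise_levels_db)
    if n < 2 or n != len(vocal_levels_db):
        raise ValueError("need two or more paired NL/VL measurements")

    mean_nl = sum(noise_levels_db) / n
    mean_vl = sum(vocal_levels_db) / n

    # Sum of squared NL deviations and sum of NL-VL cross products.
    sxx = sum((x - mean_nl) ** 2 for x in noise_levels_db)
    sxy = sum((x - mean_nl) * (y - mean_vl)
              for x, y in zip(noise_levels_db, vocal_levels_db))
    if sxx == 0:
        raise ValueError("noise levels must vary to estimate a slope")

    slope = sxy / sxx
    intercept = mean_vl - slope * mean_nl
    return slope, intercept


# Hypothetical paired measurements (dB), for illustration only.
noise_levels = [50.0, 60.0, 70.0, 80.0]
vocal_levels = [60.0, 61.5, 63.0, 64.5]

slope, intercept = lombard_slope(noise_levels, vocal_levels)
slope_per_10_db = 10.0 * slope
print("VL increase per 10 dB NL (dB):", round(slope_per_10_db, 2))
```

With these made-up data the fitted slope corresponds to 1.5 dB of VL per 10 dB of NL, which falls inside the 0.5 to 2 dB range reported by Lane and Tranel (1971). Real data would also call for checking residuals before accepting a single linear slope.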


Table 2-1. Predictors of the Lombard effect in human speech and nonhuman mammals’ vocalizations with selected references for each factor. Empty boxes indicate that no studies have investigated a given factor in the indicated group. References marked with * indicate that the modification was observed during echolocation (self-communication). Non-human Factor Human References References mammal Brumm et al. 2004 Lombard, 1911 (primate) Lane and Tranel 1971 Scheifele et al. 2005 Level  (review)  (cetacean)

Schmidt and Joermann Noise Garnier et al. 2010 1986 (bats)* Sinnott et al. 1975 Egan 1967, 1972 Spectral (primates)   content/Type Tressler and Smotherman Garnier et al. 2010 2009 (bats)* Hörmann et al. 1984 Presentation  Tufts and Frank 2003 Garnier et al. 2010 Miksis-Olds and Tyack Experimental Lane and Tranel 1971 2009 (sirenian) design (review) Task /   Tressler and Smotherman Behavior Amazi and Garber 1982 2009 (bats)* Garnier et al. 2010 Egan 1967, 1972 Lane and Tranel 1971 Gender  (review) Castellanos et al. 1996 Signaler Age  Amazi and Garber 1982

Language  Junqua 1996 (review)

The type of task that subjects are asked to perform during noise is another important aspect of study design. The standard study assay has historically been a reading task, where subjects were asked to read a variety of words or sentences in a randomized order (Castellanos, Benedí & Casacuberta, 1996; Egan, 1967; Egan, 1972; Garnier et al., 2010; Lane & Tranel, 1971; Pittman & Wiley, 2001; Scheifele, 2003). However, as Lane and Tranel (1971) noted, this task does not require the subjects to successfully communicate, and may therefore reduce the magnitude of the Lombard effect. Experimental designs have recently shifted toward asking subjects to describe objects (Amazi & Garber, 1982) or to interact with another speaker or speech recognition software (Garnier et al., 2010; Junqua et al., 1999; Patel & Schell, 2008; Pittman & Wiley, 2001). Supporting Lane and Tranel’s (1971) hypothesis on the importance of interaction, subjects who perform an interactive task exhibit greater Lombard effects than those asked merely to read (Amazi & Garber, 1982; Garnier et al., 2010).

Test subject characteristics including sex, age, and language also affect the presentation of the Lombard effect. Several studies have noted an effect of sex on vocal changes, with female subjects generally exhibiting greater magnitude changes in VL than male subjects (Egan, 1967; Egan, 1972; Junqua, 1993; Junqua, 1996; Letowski et al., 1993). Junqua (1993) found that changes to utterance frequencies and energy in specific frequency bands were dependent on the speaker’s sex, though Castellanos et al. (1996) found no distinct differences between sexes for these properties. Differences in Lombard effect magnitude between children and adults during an interactive task indicate that the effects of noise on speech change as children learn to monitor the effectiveness of their communication. 
Adult subjects asked to label or describe a picture during periods of quiet and increased noise showed a distinct increase in the magnitude of the Lombard effect for picture description over labeling, while in children asked to perform the same tasks no difference in the magnitude of the Lombard effect was found (Amazi & Garber, 1982). While the Lombard effect appears to be a subconscious response of human subjects to increases in NL, there is evidence that subjects can consciously mitigate the effect given instructions (Brown & Brandt, 1972) or a combination of instructions and visual feedback (Pick, Sigel & Fox, 1989). Subjects in these experiments were not able to completely eliminate the Lombard effect in any situation; however, some conscious control of VL in noise was observed, indicating that there is some interaction between reflexive vocal adjustments and conscious mediation of changes to vocalization structure.

Other NIVMs

Changes to the spectral and temporal properties of vocalizations have also been observed in human speech, and are usually studied in conjunction with the Lombard effect. These modifications are generally considered to be biomechanically linked to increases in VL via deformation of the vocal apparatus associated with increased vocal effort (Fitch, 1989; Gramming et al., 1988; Hanley & Steer, 1949; Jessen, Köster & Gfroerer, 2005; Liénard & Di Benedetto, 1999; Ternström et al., 2006). This linkage hypothesis has been generally accepted in the literature since 1989, when Fitch (1989) published a vehement rebuttal of the only study to examine non-Lombard effect NIVMs while ignoring the simultaneous observation of the Lombard effect (Summers et al., 1988). As with the Lombard effect, each of these modifications can be affected by noise type and speaker characteristics, and is generally correlated with the magnitude of the Lombard effect (Fitch, 1989). Most non-Lombard effect NIVMs observed in human speech involve changes to the relative energy within speech formants, a phenomenon known as spectral tilt (Fitch, 1989; Lu & Cooke, 2009a). Higher spectral tilt indicates greater energy at low frequencies relative to higher formants (Tartter et al., 1993). With the exception of Summers et al. (1988), no study has examined this phenomenon independently of the Lombard effect (Castellanos et al., 1996; Garnier et al., 2006; Garnier et al., 2010; Junqua, 1993; Junqua, 1996; Junqua et al., 1999; Letowski et al., 1993; Liénard & Di Benedetto, 1999; Lu & Cooke, 2009a; Sundberg & Nordenberg, 2006; Tartter et al., 1993), and all of these studies found a shift of energy from low to high frequencies, decreasing spectral tilt. 
An exception to this rule is a single speaker who lowered her fundamental frequency during increased noise and simultaneously increased her spectral tilt (Tartter et al., 1993), indicating that a minority of speakers may exhibit anomalous speech patterns during increased noise. Changes to fundamental frequency of speech have also been documented; in particular, when exposed to both low- and high-frequency noise, subjects increased the fundamental frequency of speech. While this may allow subjects to partially compensate for low-frequency masking noise, it also indicates that humans may not be capable of avoiding masking noise at high frequencies via vocal modifications (Lu & Cooke, 2009b).

Minor differences in the spectral characteristics of Lombard speech occur between languages (Junqua, 1996), but the overall effect is very similar across French, Spanish, Japanese, and American English. Japanese subjects showed the clearest shifts in formant frequencies, while Spanish subjects had the most variable changes to the second formant frequency (Junqua, 1996).

Temporal aspects of speech have also been studied in conjunction with the Lombard effect (Bond, Moore & Gable, 1989; Castellanos et al., 1996; Charlip & Burk, 1969; Cooke & Lu, 2010; Dreher & O'Neill, 1957; Garnier et al., 2006; Garnier et al., 2010; Hanley & Steer, 1949; Junqua, 1993; Junqua, 1996; Junqua et al., 1999; Lu & Cooke, 2008; Patel & Schell, 2008; Pittman & Wiley, 2001; Schulman, 1989; Tartter et al., 1993). The level of analysis has varied across studies, from coarse examination of sentence duration (Dreher & O'Neill, 1957; Lu & Cooke, 2008) to very detailed examination of specific phonemes (Junqua, 1993; Schulman, 1989). Fine-scale analyses have documented increases in the duration of vowels produced in noise (Castellanos et al., 1996; Garnier et al., 2006; Junqua, 1996; Junqua et al., 1999; Traunmüller & Eriksson, 2000), sufficient to drive increases in the duration of words (Junqua, 1993). In some cases, these increases in duration were observed concurrently with decreases in duration of consonants, indicating differential effects of noise on different portions of words and vocalizations (Junqua, 1993; Patel & Schell, 2008). Increases in whole word duration have also been documented (Bond et al., 1989; Charlip & Burk, 1969; Dreher & O'Neill, 1957; Junqua, 1993; Patel & Schell, 2008; Pittman & Wiley, 2001; Summers et al., 1988; Tartter et al., 1993), though many of these studies did not examine durational changes at the phoneme level. Patel and Schell (2008) provide some evidence that changes in word duration may also be influenced by the information content of the particular word being expressed; they found that the duration of informationally important words increased significantly more than that of less-useful word types in high noise levels. 
Such increases in word duration may drive the longer sentence durations observed by Dreher and O’Neill (1957) and Lu and Cooke (2008). As proposed by Shannon and Weaver (1963), changes in speaking rate or vocabulary (use of more easily detectable words) may also serve to increase the chances of successful communication during increased noise (Brumm & Slabbekoorn, 2005; Shannon & Weaver, 1963). However, there has been minimal investigation of the effects of noise on speaking rate (Charlip & Burk, 1969; Hanley & Steer, 1949; Webster & Klumpp, 1962), and none at all on vocabulary change. The few existing studies of changes to rate of speech in noise have found a trend for decreasing vocal rate with increasing noise (Charlip & Burk, 1969; Hanley & Steer, 1949; Hormann et al., 1984), potentially interacting with changes to duration of words or syllables to increase a receiver’s time to interpret the signal heard in noise.

Lombard effect – NIVM linkage (“Lombard speech”)

Lombard speech is defined by simultaneous changes to the amplitude, spectral content (including fundamental frequency, formants, and spectral tilt), and temporal characteristics of utterances (Garnier et al., 2010; Patel & Schell, 2008). Because the linkage between these parameters appears to be biomechanical (Gramming et al., 1988; Jessen et al., 2005), one would expect that non-human animals with similar respiratory and vocal anatomy should also exhibit linked changes to amplitude and spectro-temporal properties of vocalizations. However, even non-human primate species which are closely related to humans have very different vocal anatomy (Fitch, 2000), and further tests of this hypothesis will be needed to examine whether the proposed linkage is unique to humans.

Effects of noise on non-human mammal vocalizations

Non-human species are exposed to many types of noise generated by natural sources including wind, rain, and other animals, as well as anthropogenic sounds from urbanization, mineral exploration and extraction, and transportation, among other sources. In the interests of conservation, many studies of noise-induced vocal modifications in non-human animals have focused on species that are dependent on acoustic communication for critical life processes such as foraging and reproduction. These species, which include songbirds and marine mammals, have evolved in environments with a wide range of natural noise variability, and often have extensive vocal repertoires and some degree of flexibility in vocalization type (Nowacek et al., 2007; Patricelli & Blickley, 2006; Tyack, 2008). A majority of the research on NIVMs in non-human species has focused on birds and frogs, highlighting important differences along taxonomic lines; these studies are reviewed in Brumm and Slabbekoorn (2005), Patricelli and Blickley (2006), Warren et al. (2006), and Brumm and Zollinger (2011). By comparison, non-human mammals have been relatively poorly studied (Brumm & Slabbekoorn, 2005; Brumm and Zollinger, 2011). The scope of this review is therefore restricted to mammalian species, whose relatively recent evolutionary divergences make direct comparison of observed vocal modifications in these related species interesting and appropriate.

Vocal noise compensation in non-human mammals

Of the few non-human mammals for which the effects of noise on acoustic communication have been studied, primates, cetaceans, and bats are the best-known and all have shown evidence for the Lombard effect (Table 2-2). Explicit studies of the Lombard effect in other mammals are rare (Brumm et al., 2004; Brumm & Zollinger, 2011; Nonaka et al., 1997), but studies of other NIVMs are more common and can provide other information about species’ vocal flexibility.

The Lombard effect in mammals

Evidence for the Lombard effect has been found in all non-human mammalian species for which it has been explicitly studied (Table 2-2). The magnitude of the Lombard effect varies within and between species, which is not surprising given the extreme variability in both study designs (controlled laboratory playbacks to observational field studies) and noise types used (white noise, vessel passages, and tonal stimuli) (Holt et al., 2009; Miksis-Olds & Tyack, 2009; Parks et al., 2011; Scheifele et al., 2005; Sinnott, Stebbins & Moody, 1975). Neural tests for the Lombard response can indicate whether a given species has the potential to exhibit a Lombard effect in the absence of behavioral data. During these experiments, electrical stimuli applied to a brain region cause the test animal to produce a vocalization, from which the source level and other acoustic characteristics can be measured. This was done with domestic cats (Felis catus, Linnaeus 1758) in quiet and during playbacks of pure-tone noise of varying amplitudes (Nonaka et al., 1997). The cats’ VL increased significantly during auditory stimulation, indicating that neural mechanisms necessary for the Lombard effect exist in the brainstem of cats, a region that has been shown to be highly conserved between mammalian species (Nonaka et al., 1997). A non-human primate species which has not been behaviorally tested (squirrel monkey; Saimiri sciureus, Linnaeus 1758) has also shown the neural responses necessary for mediation of the Lombard response (Hage, Jürgens & Ehret, 2006). .

Table 2-2. Presence of the Lombard effect and other NIVMs in mammals. The simultaneous modifications column refers to observations of two or more vocal modifications, generally the Lombard effect and temporal changes. Checkmarks indicate experimental evidence for a particular modification type. Blank spaces indicate no data is available. Species marked with † indicate that the existing evidence comes from social groups rather than individual signalers, and * indicate that the modifications were observed only during echolocation (self-communication).

NIVMs Simultaneous Order Species Lombard References Spectral Temporal Modifications? Effect Common marmoset (Callithrix jacchus)    Brumm et al. 2004

Cotton top tamarins (Saguinus oedipus)  No   Egnor and Hauser 2006; Egnor et al. 2007 Crab eating macaque  Sinnott et al. 1975 (Macaca fascicularis ) Primates Human (Homo sapiens)     Lane and Tranel 1971; Garnier et al. 2010 Southern pig-tailed macaque  Sinnott et al. 1975 (Macaca nemestrina) Squirrel Monkey (Saimiri sciureus)  Hage et al. 2006 Beluga whale†     Scheifele et al. 2005; Lesage et al. 1999 (Delphinapterus leucas) Bottlenose dolphin (Tursiops truncatus) No  Buckstaff 2004 Humpback whale Cetacea  Doyle et al. 2008 (Megaptera novaeangliae) Killer whale (Orcinus orca) †  Holt et al. 2009; Holt et al. 2011   Right whale (Eubalaena glacialis)    Parks et al. 2011; Parks et al. 2007 Free tailed bat* (Tadarida Tressler and Smotherman, 2009; Schmidt     brasiliensis) and Joermann 1986 Chiroptera Mouse tailed bat* (Rhinopoma    Schmidt and Joermann 1986 microphyllum) California Ground Squirrel (Spermophilus Rodentia  No Rabin et al. 2003 beecheyi) Carnivora Domestic Cat (Felis catus)  Nonaka et al. 1997

When behavioral data for a species are present, neurophysiological studies may provide insight into the ultimate anatomical and physiological causes of noise-induced vocal modifications (Tressler & Smotherman, 2009; Tressler et al., 2011). In particular, neuroreceptors in the signaler’s brain which may play a role in sound production are of interest. In one bat species, increases in striatal dopamine in the basal ganglia induced modifications to the structure of bats’ echolocation pulses and severely reduced the observed Lombard effect and associated spectral and temporal modifications (Tressler et al., 2011). Understanding the physiological mechanisms of vocal flexibility may allow for future predictions of the types and magnitudes of noise-induced vocal modifications in related species. Behavioral tests for the Lombard effect are more common than neurological studies, and require subjects to vocalize voluntarily so that VL and other acoustic characteristics can be measured. One such experiment was performed with two macaques of different species (Macaca fascicularis (Linnaeus, 1758) and M. nemestrina (Raffles 1821)) that were trained to vocalize at a consistent rate and exposed to noises of different levels and spectral content (Sinnott et al., 1975). When exposed to noise that overlapped the spectral content of their vocalizations, the macaques increased the amplitude of their calls by about 2 dB per 10 dB increase in NL. In contrast, playbacks of noise at several octaves above the subjects’ vocalization frequencies produced no change in VL, despite being well within the animals’ hearing ranges (Sinnott et al., 1975). These results are consistent with the idea that the spectral content of noise can impact the presentation of the Lombard effect, which has been empirically demonstrated in humans (Egan, 1967; Egan, 1972; Garnier et al., 2010). 
An interesting design aspect of this study is that training the animals to produce a single vocalization at a consistent rate is comparable to human subjects being asked to repeat a single word during noise exposure. As noted by Lane and Tranel (1971), while reading and repetition tasks performed during noise do induce a Lombard effect, the magnitude of the response is much weaker than when the signaler is motivated to be understood by a receiver. Vocalizations from animals attempting to contact a receiver may therefore show a more biologically relevant (and likely greater) response to noise. Spontaneous and interactive vocal tasks during which the signaler is motivated to communicate may provide a more accurate picture of the evolutionary function of the Lombard effect. Experiments that assessed changes to spontaneously produced vocalizations from two non-human primate species have shown that cotton-top tamarins (Saguinus oedipus, Linnaeus 1758) (Egnor & Hauser, 2006; Egnor et al., 2006) and common marmosets (Callithrix jacchus, Linnaeus 1758) (Brumm et al., 2004) both exhibit the Lombard effect during playbacks of band-limited white noise. Both tamarins and marmosets showed increases in VL with increases in NL (tamarins: ~2.5 to 6 dB VL per 10 dB NL, marmosets: ~3 to 7.5 dB VL per 10 dB NL). When compared to the results from the trained macaques studied by Sinnott et al. (1975), there is some evidence that the animals that vocalized spontaneously had greater-magnitude Lombard responses.

Differences in field and laboratory conditions can also affect the presentation of NIVMs, including the Lombard effect (Tressler & Smotherman, 2009). To understand the biological relevance of vocal adjustments, it is necessary to study signalers in their natural environments where they are likely to be motivated to communicate. While there are few such studies of humans, who have typically been studied in highly controlled and easily manipulated acoustic chambers, the difficulty of keeping many non-human species alive in the laboratory has made field study the standard experimental design for many mammals. The most definitive of these field studies examined the vocal behavior of individual North Atlantic right whales (Eubalaena glacialis, (Müller, 1776)) using acoustic recording tags to collect data in the animals’ natural habitat, finding strong evidence for a Lombard effect in this species (Parks et al., 2011).

A major obstacle to studying the Lombard effect in free-ranging animals (and particularly marine mammals) is the difficulty of identifying and localizing the signaler. As a proxy, some studies have examined vocal responses at the group level, averaging amplitudes from all vocalizations produced by a focal group of animals. 
Studies using this method have found evidence suggesting that two other marine mammal species may also exhibit a Lombard response. In studies of both beluga whales (Delphinapterus leucas (Pallas, 1776)) (Scheifele et al., 2005) and killer whales (Orcinus orca (Linnaeus 1758)) (Holt et al., 2009; Holt, Noren & Emmons, 2011), increased noise levels correlated with changes to average vocalization levels from social groupings of individuals. While an increase in average VL from a group does not conclusively demonstrate the Lombard effect, it does provide some evidence that individuals within the group have the ability to increase VL during increased noise, and may therefore exhibit the Lombard effect if studied individually. There is also some evidence for the Lombard effect in a fourth marine mammal, the West Indian manatee (Trichechus manatus (Linnaeus 1758)) (Miksis-Olds & Tyack, 2009). In this species, one of two vocalization types (recorded from individuals and small groups) showed context-dependent amplitude increases during increased high-frequency noise. These results indicate that manatees may adjust the amplitude of their vocalizations to compensate for increased noise, but further studies are needed to confirm the presence of a Lombard effect in this species.

Recent technological developments have improved studies of acoustic behavior of free-ranging signalers, allowing researchers to surmount some significant obstacles, including the lack of control over confounding factors. High-capacity acoustic recording tags have recently enabled researchers to perform more rigorously controlled studies of the effects of sound on signalers in natural communication situations (Tyack, Gordon, & Thompson, 2003; Parks et al., 2011). When combined with the detailed analyses developed for studies of humans and captive non-human signalers, these technologies may provide compelling evidence for the occurrence of the Lombard effect in the wild.
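The Lombard responses summarized in this section are reported as slopes (dB of VL per 10 dB of NL). As a minimal sketch of how such a slope could be estimated from paired measurements, the following fits an ordinary least-squares line of VL on NL. The function name and the data values are hypothetical and chosen only for illustration; they are not taken from any of the cited studies, and those studies may have used different statistical methods.

```python
from statistics import mean


def lombard_slope(noise_levels_db, vocal_levels_db):
    """Least-squares slope of vocalization level (VL) on noise level (NL).

    Returns dB of VL per 1 dB of NL. Hypothetical helper for illustration.
    """
    if len(noise_levels_db) != len(vocal_levels_db) or len(noise_levels_db) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    nl_mean = mean(noise_levels_db)
    vl_mean = mean(vocal_levels_db)
    # Sum of squared deviations of NL, and cross-deviations of NL and VL
    sxx = sum((nl - nl_mean) ** 2 for nl in noise_levels_db)
    if sxx == 0:
        raise ValueError("noise levels must vary to estimate a slope")
    sxy = sum(
        (nl - nl_mean) * (vl - vl_mean)
        for nl, vl in zip(noise_levels_db, vocal_levels_db)
    )
    return sxy / sxx


# Made-up example values (dB), not measurements from any study
noise = [50.0, 60.0, 70.0, 80.0]
calls = [70.0, 74.0, 78.0, 82.0]

slope_per_10_db = 10.0 * lombard_slope(noise, calls)
print(f"{slope_per_10_db:.1f} dB VL per 10 dB NL")
```

With these made-up values the estimated slope is 4 dB of VL per 10 dB of NL, which falls within the range quoted above for tamarins and marmosets.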

Other NIVMs in non-human mammals

Non-Lombard vocal adjustments have also been observed in mammalian taxa (Table 2-2). While such changes to spectral and temporal parameters of vocalizations have been linked to the Lombard effect in human speech, the difficulty of studying animals in the field has precluded many studies of NIVMs in non-human mammals from determining whether the target species demonstrated a Lombard effect in conjunction with other changes to the acoustic characteristics of vocalizations.

Spectral alterations include shifts in vocalization bandwidth and in the lowest (minimum), highest (maximum), and peak (greatest energy) frequencies, and have been reported among most mammalian taxa known to exhibit the Lombard effect (Table 2-2) (Brumm & Slabbekoorn, 2005; Brumm & Zollinger, 2011). In addition, a rodent species, the California ground squirrel (Spermophilus beecheyi (Richardson 1829)) (Rabin et al., 2003), which has not been tested for the Lombard effect, has shown evidence for flexibility in the spectral parameters of vocalizations. Changes to vocalization frequencies may reduce masking by removing the frequency of the vocalization from the noise band (Brumm & Slabbekoorn, 2005). In most cases, this involves shifting the entire vocalization to a higher frequency to avoid low-frequency noise, as demonstrated by North Atlantic right whales, beluga whales, and free-tailed bats (Lesage et al., 1999; Parks et al., 2007; Tressler & Smotherman, 2009; Tressler et al., 2011). There is also some evidence from California ground squirrels that species may be able to shift energy within vocalizations to lower frequencies when exposed to high frequency noise (Rabin et al., 2003).

One commonly measured index of vocal change in human speech is spectral tilt (Garnier et al., 2010; Jessen et al., 2005; Rivers & Rastatter, 1985), which has been documented in only two non-human species. Recordings of alarm calls from California ground squirrels (Rabin et al., 2003) provided evidence of energy being shifted to higher harmonics of the call during high noise. A very similar finding in ‘scream’ calls produced by female North Atlantic right whales indicates that spectral overlap of high noise and the vocalization’s fundamental frequency causes shifts of energy to higher harmonics of the call (Parks, 2003).

Differences in the types of observed vocal modifications between closely related species may indicate that NIVMs are adaptive at the species level (Francis, Ortega & Cruz, 2011). Closely related species of mammals can exhibit very different vocal responses to noise: specifically, some species of marine mammals have not demonstrated spectral changes to their vocalizations. In contrast to the cetacean studies cited above, two other cetacean species (bottlenose dolphins (Tursiops truncatus (Montagu, 1821)) (Buckstaff, 2004) and humpback whales (Megaptera novaeangliae (Borowski, 1781)) (Doyle et al., 2008)) showed no changes to vocalization frequencies as a result of increased noise. Among primates, several studies of cotton-top tamarins failed to find any consistent changes to vocalization frequencies (Egnor & Hauser, 2006; Egnor et al., 2006; Egnor et al., 2007) during playback experiments. 
Whether the lack of spectral changes in these species is due to the functional significance of the specific vocalizations examined, effects of behavioral context, or a lack of spectral modifications in general is not yet known, and further research should be directed to determine why closely related species may modify vocalizations differently in response to identical acoustic stimuli.

Modifications to temporal characteristics of vocalizations include alterations to call or syllable duration and repetition rate, and may increase the chances of a listener detecting the vocalization by allowing increased processing time with a longer signal (Bee & Micheyl, 2008). Two species of non-human primates have exhibited increases in call duration with increasing noise. In “combination long calls” produced by cotton-top tamarins and “twitter” vocalizations of common marmosets, the increase in duration was caused by changes to the duration of syllables within a call, rather than by an increase in inter-syllable intervals (Brumm et al., 2004; Egnor & Hauser, 2006), as would be expected if the animals are trying to allow listeners increased detection time.

Changes in the duration of cetacean vocalizations have also been observed, but these responses are often less clear and consistent than those observed in human speech (Lesage et al., 1999; Miksis-Olds & Tyack, 2009; Parks et al., 2007). While high levels of variability in temporal changes may be an artifact of low sample sizes, alternative hypotheses are that changes to vocalization duration may not increase the chances of signal detection by receivers, or that temporal changes affect the behavioral function of the vocalization. The first alternative is unlikely, however, because increases in call duration increase the time-bandwidth product of the sound, allowing greater temporal integration time for receivers, which is likely to increase detectability of the call (Bee & Micheyl, 2008). Further studies will be needed to fully evaluate the impact of noise on temporal changes to cetacean vocalizations.

Failure to change the duration of vocalizations during increased noise has also been observed in some species. Bottlenose dolphins did not increase the duration of their signature whistles during noise (Buckstaff, 2004), though they did increase call rate, which may lead to an increased probability of detection by listeners. Similarly, North Atlantic right whales (Parks et al., 2011) showed no consistent change in the duration of their vocalizations, despite having demonstrated the linear relationship between VL and NL that is characteristic of the Lombard effect.

Another type of modification can impact the duration of signals over slightly longer (seconds to minutes) time scales. Repetition of part or all of a signal (“serial redundancy”) can increase the duration of a sound without modifying the overall acoustic structure (Brumm & Slabbekoorn, 2005). This type of shift was observed in humpback whales exposed to sonar sounds during the breeding season (Miller et al., 2000). 
Males sang longer themes within each song during the noise exposure, ultimately increasing song duration. Tyack (2008) notes that this change resulted from increased repetition of phrases within each theme, indicating that serial redundancy may act as a second mechanism for increasing the duration of a signal and increasing the chances of receiver perception.

NIVMs during self-communication

A special case in acoustic communication for some species of non-human mammals is echolocation, a form of communication in which the sender and receiver are the same individual. During echolocation, an individual emits a stereotyped acoustic signal and listens to returning echoes to detect objects (including food and obstacles) in its environment. As with inter-animal communication signals, the evolution of echolocation sounds is likely to have been affected by noise (Brumm & Slabbekoorn, 2005; Brumm & Zollinger, 2011). Given the estimated dates for evolution of the Lombard effect (probably ≥ 100 million years ago; Brumm & Zollinger, 2011) and mammalian echolocation (Bats: 85 – 65 mya; Jones & Teeling, 2006; Odontocetes: 34 – 24 mya; Fahlke et al., 2011), it is also probable that these signals are modified as a result of increased ambient noise. While the specialized nature of these signals limits their structural flexibility, the benefits of being able to compensate for noise during essential foraging and navigation behaviors are clear. Additionally, feedback for an echolocating animal is close to instantaneous, which may impact the benefits to the signaler for optimizing signal propagation in noise.

As predicted, noise-induced vocal modifications of VL, spectral, and temporal parameters of echolocation signals have been demonstrated by species dependent on echolocation for essential life functions (Au et al., 1985; Schmidt & Joermann, 1986; Tressler & Smotherman, 2009; Tressler et al., 2011). Individuals of five bat species (Myotis oxygnathus (Monticelli 1885), Eptesicus fuscus (Beauvois, 1796), Tadarida brasiliensis, Rhinolophus ferrumequinum (Schreber, 1774), and Rhinopoma microphyllum (Brünnich, 1782)) exposed to broadband white or thermal noise all showed increases in amplitude of echolocation signals, and several species also changed pulse bandwidths (including shifts to both minimum and maximum frequencies) and temporal components (Schmidt & Joermann, 1986; Tressler & Smotherman, 2009; Tressler et al., 2011). 
In contrast, bats exposed to narrowband noise from conspecifics’ echolocation sounds tend to exhibit a ‘jamming avoidance response’ (JAR), wherein they shift pulse frequencies (Bates, Stamper & Simmons, 2008; Gillam et al., 2009; Hiryu et al., 2010; Tressler & Smotherman, 2009) or temporal characteristics (Obrist, 1995) without modifying vocalization amplitude. These responses are not mutually exclusive; as documented in a study of free-tailed bats (T. brasiliensis), context-dependent responses to broad- or narrowband noise indicate the potential for a general vocal adjustment in response to broadband noise, and a specialized, targeted JAR to sounds that may cause direct interference with echolocation signals (Tressler & Smotherman, 2009).

A single beluga whale has also shown noise-dependent flexibility in echolocation signals. When moved from a relatively quiet bay to a noisier environment, the whale decreased its click duration while increasing bandwidth, peak frequency, and amplitude (Au et al., 1985). Due to the design of this study, the authors were unable to separate the changes to click amplitude and frequency, and suggested that the whale may have been incapable of producing high amplitude echolocation clicks at lower frequencies. An alternative hypothesis is that the whale was exhibiting simultaneous changes to pulse amplitude and bandwidth as a function of the Lombard effect and other noise-induced modifications. Future research should investigate the potential for vocal modifications during echolocation, which may be distinct from the Lombard effect in toothed whales and dolphins.

While signal flexibility during echolocation is restricted by the function of the sounds, there is strong evidence for the Lombard effect and other NIVMs during this form of self-communication. Increases in pulse amplitude (the Lombard effect) and bandwidth were consistently observed, as were changes to pulse duration (though the direction of this change differed between bats and a single beluga whale) (Au et al., 1985; Tressler & Smotherman, 2009; Tressler et al., 2011). Further observations of vocal modifications during self-communication may provide insight into how signalers use their own vocal feedback to improve signal propagation and allow for evaluation of the effects of feedback delay and behavioral context on potential noise compensation responses.

Lombard effect – NIVM Linkage in non-human mammals

While simultaneous use of multiple NIVMs has been observed in a wide variety of mammalian species, including primates, cetaceans, and bats, only a few studies have explicitly examined the Lombard effect in conjunction with other vocal modifications (Brumm et al., 2004; Egnor & Hauser, 2006; Egnor et al., 2006; Egnor et al., 2007; Parks et al., 2011; Tressler & Smotherman, 2009; Tressler et al., 2011). In studies that did find simultaneous usage of multiple modifications, observations included spectral changes, such as increased bandwidth (Tressler & Smotherman, 2009; Tressler et al., 2011), and temporal changes such as increased call durations (Brumm et al., 2004; Egnor & Hauser, 2006; Tressler & Smotherman, 2009; Tressler et al., 2011). In studies of echolocating animals that increased the amplitude, bandwidth, and peak frequency of their pulses, the authors came to different conclusions about the results. Tressler and Smotherman (2009) interpret their findings of simultaneous changes to free-tailed bats’ vocalization bandwidth and duration as a “by-product” of the simultaneously observed Lombard effect, and Tressler et al. (2011) showed possible evidence for a psychological linkage in the control of all three parameters during manipulations of dopamine levels in the animals’ brains. Au et al. (1985) attributed findings of simultaneous increase in a beluga whale’s echolocation pulse frequency, bandwidth, and amplitude to an anatomical or physiological limitation of the whale’s ability to produce low frequency clicks at high amplitudes. The intriguing results from these studies point to a need to assess current perspectives of the linkage between different types of vocal modifications in non-human mammals, and to propose future studies to quantify how modifications occur and are controlled.

Equally relevant are studies that found no linkage between the Lombard effect and either spectral or temporal modifications. Studies of cotton-top tamarins and North Atlantic right whales failed to find consistent changes to minimum or maximum frequencies of calls produced simultaneously with the Lombard effect during increased noise (Egnor & Hauser, 2006; Egnor, Iguina & Hauser, 2006b; Parks et al., 2011). While other components of these species’ vocalizations, including spectral tilt and serial redundancy, remain to be investigated, these findings currently provide no support for the hypothesis that an increase in vocal amplitude is likely to cause simultaneous changes in the spectral and temporal properties of vocalizations (Brumm & Slabbekoorn, 2005; Patel & Schell, 2008; Tressler & Smotherman, 2009). A possible explanation for these results is that the species mentioned may have secondarily lost the linkage due to the importance of information encoded in the spectro-temporal content of their calls. Further research into the salience of acoustic parameters and functionality of cotton-top tamarin and North Atlantic right whale vocalizations will be needed to fully investigate these hypotheses.

Future research directions

The effects of noise on acoustic communication are remarkably similar between humans and other mammalian species (Table 2-3), but different experimental approaches from the two fields have left open avenues for research in both fields. Tightly controlled experiments typical of human research should be used to investigate subtle details of the effects of noise type and behavioral context on non-humans, while observational studies of acoustic communication in natural environments should be used to evaluate the biological significance of noise-induced vocal modifications in human speech.

Future directions for human research

Studies of the Lombard effect in human speech have typically involved volunteer subjects in controlled environments (soundproof room, headphones, etc.), where they participate in either a controlled reading task or attempt to communicate with an experimenter (Garnier et al., 2010; Junqua, 1996; Lu & Cooke, 2008; Tartter et al., 1993). The subject’s environment is controlled to such a degree that subtle details of the effects of noise can be examined, allowing for detailed knowledge of which aspects of the noise, experimental design, and individual subject are likely to influence the Lombard effect. The effects of noise type and gender are well known, but there is room for an experiment testing the interplay of these factors. For example, cocktail party noise composed of spectral content targeted to male or female subjects could be used to investigate the influence of spectral overlap on Lombard speech in relation to a speaker’s baseline vocal frequencies. Motivated speaking experiments with interactive tasks are increasingly common in human research, but the non-natural acoustic and social environment may affect the behavior of speaking subjects. Experiments into the effect of noise on subjects in more natural communication situations should be conducted, as has been done for non-human animals of several taxa (Brumm, 2004; Parks et al., 2011). Of particular interest to researchers of human behavior may be studies of marine mammals, where the vocal behavior of individual animals has been compared across periods with different noise levels using acoustic recording tags (Parks et al., 2011). Similar studies of human speech produced in natural situations can be achieved by placing acoustic recording devices onto volunteers and examining the characteristics of speech produced outside the laboratory in “natural” noise and speaking contexts. Directed research into the potential for changes in vocalization rate and redundancy by human subjects is also called for. 
While use of standard reading tasks may be able to account for some changes to vocalization rate, they are not sufficient for analyzing vocabulary changes that would affect the utterance redundancy, including word repetition. Interactive tasks and studies of natural speech behavior should be used to examine redundancy and repetition of informationally important words or phrases.

Table 2-3. Noise-induced vocal modifications in humans and non-human mammals, including selected references. References marked with * indicate that the modification was observed during echolocation (self-communication).

Modification | Humans | Human reference | Non-human mammals | Mammal taxa and references
Amplitude | Yes | Lombard et al. 1911 | Yes | Primate (Brumm et al. 2004); Cetacean (Parks et al. 2011); Bat (Schmidt and Joermann 1986*); Carnivora (Nonaka et al. 1997)
Vocalization type | - | - | Yes | Cetacean (Doyle et al. 2008)
Temporal: duration | Yes | Dreher and O'Neill 1957 | Yes | Primate (Brumm et al. 2004); Cetacean (Lesage et al. 1999); Bat (Simmons et al. 1978*)
Temporal: vocalization rate | Yes | Hanley and Steer 1949 | Yes | Cetacean (Buckstaff 2004)
Temporal: serial redundancy | - | - | Yes | Cetacean (Doyle et al. 2008)
Frequency: minimum | Yes | Lu and Cooke 2009b | Yes | Cetacean (Parks et al. 2007); Bat (Tressler and Smotherman 2009*)
Frequency: maximum | - | - | Yes | Cetacean (Lesage et al. 1999)
Frequency: peak | - | - | Yes | Cetacean (Lesage et al. 1999); Bat (Tressler and Smotherman 2009*)
Spectral tilt | Yes | Fant 1959 | Yes | Cetacean (Parks 2003); Rodent (Rabin et al. 2003)

Future directions for non-human mammal research

In contrast to human research, studies of the effects of noise on mammalian vocalizations have typically focused on the evolutionary origins of vocal modifications and their potential effects on reproduction, mate selection, and conservation. Specific details of the predictors of the Lombard effect and its impact on other NIVMs in non-human mammals are unknown due to the difficulties of controlling environmental and behavioral parameters in the field and studying some species in laboratory settings. Several areas of study, including the effects of NIVMs on the intelligibility of calls, predictors of the Lombard effect (including communication motivation), and the possibility of simultaneous “Lombard speech”-like use of multiple NIVMs in non-human mammals’ vocalizations, remain open to researchers working with non-human mammals. New technologies, such as high-capacity acoustic recording tags, will be useful in the development of rigorous studies of noise-induced vocal modifications from signalers in their natural environments.

While human research has made much of Lane and Tranel’s “premium on intelligibility” (Garnier et al., 2010; Junqua, 1996), studies of non-human animals have assumed that their subjects are motivated to communicate. While this assumption is probably accurate for animals studied in their natural habitats, laboratory animals asked to repeat a sound on command (Sinnott et al., 1975) are essentially performing a reading task, which may not accurately evaluate the responses of motivated animals. Encouraging subjects to communicate naturally by presenting them with conspecific stimuli or waiting for spontaneous vocalizations may lead to more biologically relevant results. Weight should also be given to the idea that communication in some behavioral contexts will be more critical than in others. For instance, communication between spatially separated mother and offspring may be critical to the offspring’s survival, whereas communication between solitary adults may be less crucial and therefore less motivating to the signaler. Changes to signaler motivation may impact the effort with which animals vocalize, leading to confounding results for researchers who do not take context into account (Miksis-Olds & Tyack, 2009).

The detectability and intelligibility of calls in noise is also an open area for research in non-human animals, particularly with respect to the proposed noise-compensation properties of vocal modifications. There has been very little work investigating the impacts that call modifications may have upon propagation of the call (in noise and in quiet) and receiver perception of the modified sound. 
In some marine mammal and bird species, studies of the “active space” in which a call is likely to be heard by a conspecific receiver have indicated that masking noise reduces this area (Brown, 1989; Lohr, Wright & Dooling, 2003; Miller, 2006; Patricelli, Dantzker & Bradbury, 2008), and for several species, it appears that the Lombard effect is likely to mitigate this reduction in range (Lohr et al., 2003; Nemeth & Brumm, 2010). One study of the effects of vocal modifications on song propagation found minimal influence of changes to song frequency and temporal parameters on propagation distance (Nemeth & Brumm, 2010). It should be considered, however, that changes to the spectro-temporal characteristics of vocalizations may increase the intelligibility of the signal during increased noise without apparent impact on propagation distance (Pittman & Wiley, 2001). Additionally, Patel and Schell’s (2008) finding that informational content influences word duration and other changes to human speech provides tantalizing evidence that subtle between-call variability in the Lombard effect and spectro-temporal modifications may indicate the behavioral functionality of non-human animal vocalizations.

Predictors of the Lombard effect are much better understood for humans than for non-human mammals. While some studies of right whales of known age and sex classes have found no effect of these characteristics on acoustic properties of vocalizations, the sample sizes in the studies were small (Parks et al., 2007; Parks et al., 2011) and the idea deserves more attention. Playback studies with animals of known age and sex classes may help determine which characteristics of the subject and the noise are relevant in non-human vocal modifications. Playbacks should include conspecific ‘babble’ or ‘multi-talker’ noise (comparable to cocktail party noise in human studies or chorus noise used in playbacks to anuran species (Gerhardt & Klump, 1988; Love & Bee, 2010; Vélez & Bee, 2011)), as this is an especially relevant source of interference for many social species, and has been shown to have a greater effect on human speakers than white noise (Garnier et al., 2010).

Despite widespread evidence for the Lombard effect during echolocation in bats (Schmidt & Joermann, 1986; Tressler & Smotherman, 2009; Tressler et al., 2011), there has been no examination of the effects of noise on social vocalizations in these species. Given the evidence for the Lombard effect during self-communication in these species, and in social communication in other mammals, the logical prediction is that bat social sounds should also increase in amplitude during increased noise. If future studies do not show this effect, new hypotheses about the origins and evolution of self- versus social communicative strategies may need to be developed.

Self-communication sounds (i.e. echolocation) may also be used to investigate the role of vocal feedback in sound production in non-human mammals. 
While in human studies subjects’ vocal feedback has been experimentally manipulated (Bauer et al., 2006; Elman, 1981), these studies may be difficult to replicate in non-humans, especially for marine mammals (Egnor et al., 2006). Manipulating delay in pulse returns or frequency of echoes may allow researchers to evaluate the effect of ‘sidetone regulation’ (Lane & Tranel, 1971) on non-human mammalian vocal production.

Researchers studying non-human species should also devote more time to investigating the possible connection between the Lombard effect and other NIVMs. Lombard speech is a well-defined phenomenon in the human literature (Garnier et al., 2010; Patel & Schell, 2008), which has received minimal attention in the non-human field. Most studies of NIVMs in mammals have focused on a single aspect of vocal change, and many have avoided studying vocal amplitude because of the difficulty of accurately determining the source level of vocalizations in the field. Without data on the source levels of vocalizations, it is impossible to determine whether a connection exists between vocal effort and other NIVMs in non-human species. In the few non-human studies that have looked for a potential linkage, different taxa have shown conflicting results. Non-human primates, which have the most similar vocal anatomy to humans, have shown no evidence of a link between VL and spectral properties of calls (Egnor & Hauser, 2006). Increased specificity in the analysis of non-human animals’ vocalizations, including investigations of spectral tilt and other formant characteristics, will be an important factor in clarifying the relationship between the Lombard effect and other modifications. At present, there is not enough evidence to suggest that the Lombard effect is necessarily linked to other NIVMs in all mammalian species.

Conclusions

(1) The influence of noise on acoustic communication is an active area of study within multiple fields, including speech recognition technology, speech pathologies, animal conservation, and the evolution of acoustic communication (Brumm & Slabbekoorn, 2005; Brumm & Zollinger, 2011; Junqua, 1993; Junqua et al., 1999; Slabbekoorn & Ripmeester, 2008).

(2) Standardized terminology is necessary to facilitate communication between fields; in particular, “Lombard effect” should be used only to designate changes in vocalization amplitude (VL) with increasing levels of noise (NL). Application of the term “Lombard speech” should be restricted to cases when the Lombard effect, spectral, and temporal NIVMs are all observed simultaneously.

(3) In human NIVM research, there is a clear gap in research on non-laboratory speaking situations where subjects may face realistic pressure to successfully communicate. Methods from non-human studies can be adapted to examine human vocal behavior in natural conditions and provide insight into the broader effects of noise on human communication and behavior. Modifications documented in birds and non-human mammals, such as changes to serial redundancy (Brumm & Slater, 2006; Lengagne et al., 1999; Tyack, 2008) and vocalization type (Francis et al., 2011), deserve more attention, and may be more readily apparent when subjects are observed outside of the laboratory.

(4) Gaps exist in knowledge about which aspects of noise non-human signalers are responding to, and whether noise type and presentation or communication motivation affect the types of NIVMs used. Suggestions for future research include playback experiments using conspecific “babble” noise, studies of how the intelligibility of vocalizations in noise is affected by the Lombard effect and other NIVMs, and the possibility of a biomechanical or psychophysical linkage between the Lombard effect and other NIVMs.

(5) The effects of noise-induced vocal modifications on the effectiveness of acoustic communication are poorly understood. The Lombard effect does appear to increase detectability of signals during increased noise, but one study of songbirds found little effect of spectral and temporal modifications on active space (Dreher & O’Neill, 1957; Nemeth & Brumm, 2010). The usefulness of all types of NIVMs in compensating for masking by increased noise should be studied in more detail.

APPENDIX: Glossary of terms

Active space: Range over which a vocalization is audible. Includes propagation and attenuation of vocalizations as well as estimates of conspecific hearing thresholds.
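As a rough illustration of how masking noise shrinks active space and how a Lombard response could partly offset the loss, the sketch below computes a detection range under strongly simplifying assumptions that are not taken from the cited studies: spherical spreading only (transmission loss of 20 log10 of range in meters), no absorption, a source level referenced to 1 m, and detection whenever the received level exceeds the noise level by a fixed threshold. The function name and all numeric values are hypothetical.

```python
def active_space_radius_m(source_level_db, noise_level_db, detection_threshold_db=0.0):
    """Range (m) at which received level falls to noise level plus threshold.

    Assumes spherical spreading only: RL = SL - 20*log10(r), with r in meters
    and SL referenced to 1 m. Absorption and hearing thresholds are ignored.
    """
    excess_db = source_level_db - noise_level_db - detection_threshold_db
    return 10.0 ** (excess_db / 20.0)


# Made-up example levels (dB), for illustration only
quiet_range = active_space_radius_m(160.0, 100.0)    # baseline noise
noisy_range = active_space_radius_m(160.0, 110.0)    # noise raised by 10 dB
lombard_range = active_space_radius_m(165.0, 110.0)  # caller raises VL by 5 dB

print(round(quiet_range), round(noisy_range), round(lombard_range))
```

Under these assumptions a 10 dB rise in noise cuts the range by a factor of about 3.2, and a 5 dB increase in source level recovers part, but not all, of the lost range.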

Call type: A species-specific, stereotyped vocalization. Many non-human species produce a limited number of vocalizations which can be categorized into discrete call types. In human speech, a single word may be considered a discrete “call type”.

Cetacean: Whales, dolphins, and porpoises.

Cocktail party noise: Noise composed of multiple human speakers or similar spectral content. Often used to attempt to mask human speech; has relevance to signaler adaptations and receiver perspectives, as the spectro-temporal content may increase masking effects on a human speaker.

Echolocation: Emission of sound by a signaler to determine range to an object in the environment via reflection of sound waves from the object. Mainly observed in bat and toothed whale species, which use short duration signals to generate echoes from objects in the surrounding environment, including prey animals and other obstacles.

Formant: A peak in the spectral envelope of a vocalization, affected by the fundamental frequency of vocal cord vibration and filtering through the resonances of the vocal tract. Present in both human speech and non-human vocalizations; however, very little work has been done with formants from non-human species, and structure in these calls is referred to as “harmonics” though this is not strictly correct.

Harmonics: Sound energy at integral multiples of the fundamental frequency of vocal cord vibration.

Intelligibility: Accuracy with which a listener can identify a word or phrase. Changes to acoustic characteristics of vocalizations can affect the intelligibility, which may also affect the active space of the sound.

Lombard effect: An increase in vocalization amplitude (VL) correlated with increasing noise amplitude (NL). May be non-linear at very low or very high NLs (see Lane and Tranel, 1971 for amount of variability in response). The Lombard effect as defined here does not include temporal or spectral modifications. See also: Lombard speech.

Lombard speech: Simultaneous application of multiple noise-induced vocal modifications, including the Lombard effect AND changes to spectral and temporal properties of vocalizations. Currently documented only for humans, with some evidence in free-tailed bats (Tressler & Smotherman, 2009; Tressler et al., 2011).

NIVM: Noise-induced vocal modifications. Modifications of spectral content, duration and other acoustic characteristics of vocalizations or vocal behavior during noise, which may be employed independently but can be observed simultaneously. Does not include changes in amplitude (the Lombard effect).

NL: Noise amplitude

Phoneme: One of a set of speech sounds in any given language, which is used by subjects to differentiate between words. All consonants and vowels in English are considered different phonemes.

Serial redundancy: Repetition of signals or portions of signals. In human speech, this may also include use of redundant words to convey the same message.

Sidetone regulation: A subject’s use of his or her own vocal feedback to monitor and regulate the amplitude of speech.

Signal to noise ratio (SNR): Ratio of the acoustic energy in a signal to the background noise (unwanted sound).
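In practice, the SNR defined above is usually reported in decibels. The following is a minimal Python sketch of that calculation, assuming root-mean-square (RMS) sound pressures as inputs; the function names and the example values are illustrative assumptions and are not taken from this dissertation.

```python
import math


def snr_db(signal_rms, noise_rms):
    """Return the signal-to-noise ratio in dB from two RMS sound pressures.

    Both inputs must be in the same pressure units (hypothetical inputs).
    """
    if signal_rms <= 0 or noise_rms <= 0:
        raise ValueError("RMS pressures must be positive")
    # Acoustic energy is proportional to pressure squared, so the energy
    # ratio in dB is 10*log10((p_s / p_n)**2) = 20*log10(p_s / p_n).
    return 20.0 * math.log10(signal_rms / noise_rms)


def snr_from_levels(signal_level_db, noise_level_db):
    """Return the SNR in dB when both levels are already in dB
    relative to the same reference pressure."""
    return signal_level_db - noise_level_db


if __name__ == "__main__":
    # A signal with ten times the RMS pressure of the noise.
    print(snr_db(10.0, 1.0))
    # A vocalization level (VL) of 110 dB in noise (NL) of 100 dB.
    print(snr_from_levels(110.0, 100.0))
```

When VL and NL are expressed in dB relative to the same reference pressure and measured over the same bandwidth, the SNR is simply their difference, which is why the second helper is a subtraction.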

Spectral modification: Changes to the frequency components of a vocalization, including the fundamental frequency, higher harmonics or formants, minimum, maximum, and peak frequencies. Also includes spectral tilt modifications.


Spectral tilt: Measure of the relative energy in a vocalization’s formants or harmonics. Shifts of energy from low to high frequencies are referred to as decreases in spectral tilt.

Syllable: A single component of a word (human) or vocalization (non-human) made up of a sequence of several sounds.

Temporal modification: Change to the temporal properties of a vocalization, including duration and pulse repetition rate among other characteristics.

VL: Vocalization amplitude

White noise: Noise with equal energy over all frequencies in a given bandwidth.


Acknowledgements

The authors would like to thank Thomas Gabrielson, Tracy Langkilde, Victoria Braithwaite, Jennifer Miksis-Olds, Laura Madden, and two anonymous reviewers for helpful suggestions and discussions. This project was funded in part by ONR Grant # N00014-08-1-0967. Support for C. Hotchkin was provided through government support under and awarded by DoD, Air Force Office of Scientific Research, National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a.

References

Amazi, D. K., and Garber, S. R. (1982). "The Lombard Sign as a Function of Age and Task," Journal of Speech and Hearing Research 25, 581-585.

Au, W. W. L., Carder, D. A., Penner, R. H., and Scronce, B. L. (1985). "Demonstration of adaptation in beluga whale echolocation signals," The Journal of the Acoustical Society of America 77, 726-730.

Bates, M. E., Stamper, S. A., and Simmons, J. A. (2008). "Jamming avoidance response of big brown bats in target detection," Journal of Experimental Biology 211, 106-113.

Bauer, J. J., Mittal, J., Larson, C. R., and Hain, T. C. (2006). "Vocal responses to unanticipated perturbations in voice loudness feedback: an automatic mechanism for stabilizing voice amplitude," The Journal of the Acoustical Society of America 119, 2363-2371.

Bee, M. A., and Micheyl, C. (2008). "The "Cocktail party problem": What is it? How can it be solved? And why should animal behaviorists study it?," Journal of Comparative Psychology 122, 235-251.

Bond, Z. S., Moore, T. J., and Gable, B. (1989). "Acoustic-phonetic characteristics of speech produced in noise and while wearing an oxygen mask," The Journal of the Acoustical Society of America 85, 907-912.

Borowski, G. H. (1781). Gemeinnützzige Naturgeschichte des Tierreichs. G. L. Lange, Berlin and Stralsund.

Bradbury, J. W., and Vehrencamp, S. L. (1998). Principles of Animal Communication. Sinauer Associates, Inc., Sunderland, MA.

Brown, C. H. (1989). "The active space of blue monkey and grey-cheeked mangabey vocalizations," Animal Behaviour 37, 1023-1034.

Brown, W. S. J., and Brandt, J. F. (1972). "The effect of masking on vocal intensity during vocal and whispered speech," Journal of Auditory Research 12, 157-161.

Brumm, H. (2004). "Causes and consequences of song amplitude adjustment in a territorial bird: a case study in nightingales," Anais da Academia Brasileira de Ciências 76, 289-295.

Brumm, H., and Slabbekoorn, H. (2005). "Acoustic Communication in Noise," in Advances in the Study of Behavior, edited by P. J. B. Slater, C. T. Snowdon, T. J. Roper, H. J. Brockmann, and M. Naguib (Academic Press), pp. 151-209.

Brumm, H., and Slater, P. J. B. (2006). "Ambient noise, motor fatigue, and serial redundancy in chaffinch song," Behavioral Ecology and Sociobiology 60, 475-481.

Brumm, H., Voss, K., Kollmer, I., and Todt, D. (2004). "Acoustic communication in noise: regulation of call characteristics in a New World monkey," Journal of Experimental Biology 207, 443-448.

Brumm, H., and Zollinger, S. A. (2011). "The evolution of the Lombard effect: 100 years of psychoacoustic research," Behaviour 148, 1173-1198.

Buckstaff, K. C. (2004). "Effects of watercraft noise on the acoustic behavior of bottlenose dolphins, Tursiops truncatus, in Sarasota Bay, Florida," Marine Mammal Science 20, 709-725.

Castellanos, A., Benedí, J.-M., and Casacuberta, F. (1996). "An analysis of general acoustic-phonetic features for Spanish speech produced with the Lombard effect," Speech Communication 20, 23-35.

Charlip, W. S., and Burk, K. W. (1969). "Effects of noise on selected speech parameters," Journal of Communication Disorders 2, 212-219.

Cooke, M., and Lu, Y. (2010). "Spectral and temporal changes to speech produced in the presence of energetic and informational maskers," The Journal of the Acoustical Society of America 128, 2059-2069.

Doyle, L. R., McCowan, B., Hanser, S. F., Chyba, C., Bucci, T., and Blue, J. E. (2008). "Applicability of information theory to the quantification of responses to anthropogenic noise by southeast Alaskan humpback whales," Entropy 10, 33-46.

Dreher, J. J., and O'Neill, J. J. (1957). "Effects of ambient noise on speaker intelligibility for words and phrases," The Journal of the Acoustical Society of America 29, 1320-1323.

Egan, J. J. (1967). "Psychoacoustics of the Lombard Voice Reflex," Ph.D. dissertation, Case Western University.

Egan, J. J. (1972). "Psychoacoustics of the Lombard voice response," Journal of Auditory Research 12, 318-324.

Egnor, S. E. R., and Hauser, M. D. (2006). "Noise-induced vocal modulation in cotton-top tamarins (Saguinus oedipus)," American Journal of Primatology 68, 1183-1190.

Egnor, S. E. R., Iguina, C. G., and Hauser, M. D. (2006). "Perturbation of auditory feedback causes systematic perturbation in vocal structure in adult cotton-top tamarins," Journal of Experimental Biology 209, 3652-3663.

Egnor, S. E. R., Wickelgren, J. G., and Hauser, M. D. (2007). "Tracking silence: adjusting vocal production to avoid acoustic interference," Journal of Comparative Physiology A 193, 477-483.

Elman, J. L. (1981). "Effects of frequency-shifted feedback on the pitch of vocal productions," The Journal of the Acoustical Society of America 70, 45-50.

Fahlke, J. M., Gingerich, P. D., Welsh, R. C., and Wood, A. R. (2011). "Cranial asymmetry in Eocene archaeocete whales and the evolution of directional hearing in water," Proceedings of the National Academy of Sciences 108, 14545-14548.

Fant, G. (1959). Acoustic analysis and synthesis of speech with applications to Swedish, vol. 15, pp. 1-106. Ericsson Tech Report.

Fitch, H. (1989). "Comments on "Effects of noise on speech production: Acoustic and perceptual analyses" [J. Acoust. Soc. Am. 84, 917-928 (1988)]," The Journal of the Acoustical Society of America 86, 2017-2019.

Fitch, W. T. (2000). "The evolution of speech: a comparative review," Trends in Cognitive Sciences 4, 258-267.

Foote, A. D., Osborne, R. W., and Hoelzel, A. R. (2004). "Whale-call response to masking boat noise," Nature 428, 910.

Francis, C. D., Ortega, C. P., and Cruz, A. (2011). "Different behavioural responses to anthropogenic noise by two closely related passerine birds," Biology Letters.

Garnier, M., Bailly, L., Dohen, M., Welby, P., and Loevenbruck, H. (2006). "An acoustic and articulatory study of Lombard Speech: Global effects on the utterance," in Proceedings of INTERSPEECH 2006 (September 2006), 2246.

Garnier, M., Henrich, N., and Dubois, D. (2010). "Influence of sound immersion and communicative interaction on the Lombard effect," Journal of Speech, Language, and Hearing Research 53, 588-608.

Gerhardt, H. C., and Klump, G. M. (1988). "Masking of acoustic signals by the chorus background noise in the green tree frog: A limitation on mate choice," Animal Behaviour 36, 1247-1249.

Gervais, P. (1852). Zoologie et paléontologie françaises. A. Bertrand: Paris, 271p.

Gillam, E. H., McCracken, G. F., Westbrook, J. K., Lee, Y., Jensen, M. L., and Balsley, B. B. (2009). "Bats aloft: variability in echolocation call structure at high altitudes," Behavioral Ecology and Sociobiology 64, 69-79.

Gramming, P., Sundberg, J., Ternström, S., Leanderson, R., and Perkins, W. H. (1988). "Relationship between changes in voice pitch and loudness," Journal of Voice 2, 118-126.

Hage, S. R., Jürgens, U., and Ehret, G. (2006). "Audio–vocal interaction in the pontine brainstem during self-initiated vocalization in the squirrel monkey," European Journal of Neuroscience 23, 3297-3308.

Hanley, T. D., and Steer, M. D. (1949). "Effect of Level of Distracting Noise upon Speaking Rate, Duration and Intensity," Journal of Speech and Hearing Disorders 14, 363-368.

Haykin, S. (2005). "The Cocktail Party Problem," Neural Computation 17, 1875-1902.

Hiryu, S., Bates, M. E., Simmons, J. A., and Riquimaroux, H. (2010). "FM echolocating bats shift frequencies to avoid broadcast echo ambiguity in clutter," Proceedings of the National Academy of Sciences 107, 7048-7053.

Holt, M. M., Noren, D. P., and Emmons, C. K. (2011). "Effects of noise levels and call types on the source levels of killer whale calls," The Journal of the Acoustical Society of America 130, 3100-3106.

Holt, M. M., Noren, D. P., Veirs, V., Emmons, C. K., and Veirs, S. (2009). "Speaking up: Killer whales (Orcinus orca) increase their call amplitude in response to vessel noise," The Journal of the Acoustical Society of America 125, EL27-EL32.

Hormann, H., Lazarus-Mainka, G., Schubeius, M., and Lazarus, H. (1984). "The effect of noise and the wearing of ear protectors on verbal communication," Noise Control Engineering Journal 23, 69-77.

Jessen, M., Köster, O., and Gfroerer, S. (2005). "Influence of vocal effort on average and variability of fundamental frequency," Speech, Language and the Law 12, 174-212.

Jones, G., and Teeling, E. C. (2006). "The evolution of echolocation in bats," Trends in Ecology & Evolution 21, 149-156.

Junqua, J. (1993). "The Lombard reflex and its role on human listeners and automatic speech recognizers," The Journal of the Acoustical Society of America 93, 510-524.

Junqua, J. (1996). "The influence of acoustics on speech production: A noise-induced stress phenomenon known as the Lombard reflex," Speech Communication 20, 13-22.

Junqua, J., Fincke, S., and Field, K. (1999). "The Lombard effect: a reflex to better communicate with others in noise," Proceedings of ICASSP-99, Phoenix, AZ, 2083-2086.

Lane, H., and Tranel, B. (1971). "The Lombard sign and the role of hearing in speech," Journal of Speech, Language, and Hearing Research 14, 677-709.

Lengagne, T., Aubin, T., Lauga, J., and Jouventin, P. (1999). "How do king penguins (Aptenodytes patagonicus) apply the mathematical theory of information to communicate in windy conditions?," Proceedings of the Royal Society London B 266, 1623-1628.

Lesage, V., Barrette, C., Kingsley, M. C. S., and Sjare, B. (1999). "The effect of vessel noise on the vocal behavior of belugas in the St. Lawrence River estuary, Canada," Marine Mammal Science 15, 65-84.

Letowski, T., Frank, T., and Caravella, J. (1993). "Acoustical properties of speech produced in noise presented through supra-aural earphones," Ear & Hearing 14, 332-338.

Liénard, J., and Di Benedetto, M. (1999). "Effects of vocal effort on spectral properties of vowels," The Journal of the Acoustical Society of America 106, 411-422.

Linnaeus, C. (1758). Systema naturae per regna tria naturae: secundum classes, ordines, genera, species, cum characteribus, differentiis, synonymis, locis. Impensis Direct. Laurentii Salvii, Stockholm, 824p.

Lohr, B., Wright, T. F., and Dooling, R. J. (2003). "Detection and discrimination of natural calls in masking noise by birds: estimating the active space of a signal," Animal Behaviour 65, 763-777.

Lombard, E. (1911). "Le signe de l'elevation de la voix," Annales Des Maladies de l'Oreille 37.

Lu, Y., and Cooke, M. (2008). "Speech production modifications produced by competing talkers, babble, and stationary noise," The Journal of the Acoustical Society of America 124, 3261-3275.

Lu, Y., and Cooke, M. (2009a). "The contribution of changes in F0 and spectral tilt to increased intelligibility of speech produced in noise," Speech Communication 51, 1253-1262.

Lu, Y., and Cooke, M. (2009b). "Speech production modifications produced in the presence of low-pass and high-pass filtered noise," The Journal of the Acoustical Society of America 126, 1495-1499.

Miksis-Olds, J. L., and Tyack, P. L. (2009). "Manatee (Trichechus manatus) vocalization usage in relation to environmental noise levels," The Journal of the Acoustical Society of America 125, 1806-1815.

Miller, P. J. O. (2006). "Diversity in sound pressure levels and estimated active space of resident killer whale vocalizations," Journal of Comparative Physiology A 192, 449-459.

Miller, P. J. O., Biassoni, N., Samuels, A., and Tyack, P. L. (2000). "Whale songs lengthen in response to sonar," Nature 405, 903.

Montagu, G. (1821). "Description of a species of Delphinus, which appears to be new," Memoirs of the Wernerian Natural History Society 3, 75-82.

Müller. (1776). Zoologica Danicae Prodromus seu Animalium Daniae et Norvegiae indigenarum characters, nomine, et synonyma imprimis popularium. Havniae. XXXII, 274 p.

Nemeth, E., and Brumm, H. (2010). "Birds and Anthropogenic Noise: Are Urban Songs Adaptive?," The American Naturalist 176, 465-475.

Nober, E. H., and Simmons, J. Q. (1981). "Comparison of auditory stimulus processing in normal and autistic adolescents," Journal of Autism and Developmental Disorders 11, 175-189.

Nonaka, S., Takahashi, R., Enomoto, K., Katada, A., and Unno, T. (1997). "Lombard reflex during PAG-induced vocalization in decerebrate cats," Neuroscience Research 29, 283-289.

Nowacek, D. P., Thorne, L. H., Johnston, D. W., and Tyack, P. L. (2007). "Responses of cetaceans to anthropogenic noise," Mammal Review 37, 81-115.

Obrist, M. K. (1995). "Flexible bat echolocation: the influence of individual, habitat and conspecifics on sonar signal design," Behavioral Ecology and Sociobiology 36, 207-219.

Pallas, P. S. (1776). Reise durch verschiedene Provinzen des russischen Reiches. St. Petersburg, vol. 3, Ascidian, 709 p.

Parks, S., Searby, A., Célérier, A., Johnson, M., Nowacek, D., and Tyack, P. (2011). "Sound production behavior of individual North Atlantic right whales: implications for passive acoustic monitoring," Endangered Species Research 15, 63-76.

Parks, S. E. (2003). Acoustic communication in the North Atlantic right whale (Eubalaena glacialis), Ph.D. dissertation, Massachusetts Institute of Technology/Woods Hole Oceanographic Institution Joint Program.

Parks, S. E., Clark, C. W., and Tyack, P. L. (2007). "Short- and long-term changes in right whale calling behavior: The potential effects of noise on acoustic communication," The Journal of the Acoustical Society of America 122, 3725-3731.

Parks, S. E., Johnson, M., Nowacek, D. P., and Tyack, P. L. (2011). "Individual right whales call louder in increased environmental noise," Biology Letters 7, 33-35.

Patel, R., and Schell, K. W. (2008). "The influence of linguistic content on the Lombard effect," Journal of Speech, Language, and Hearing Research 51, 209-220.

Patricelli, G. L., and Blickley, J. L. (2006). "Avian communication in urban noise: causes and consequences of vocal adjustment," The Auk 123, 639-649.

Patricelli, G. L., Dantzker, M. S., and Bradbury, J. W. (2008). "Acoustic directionality of red-winged blackbird (Agelaius phoeniceus) song relates to amplitude and singing behaviours," Animal Behaviour 76, 1389-1401.

Pick, H. L. J., Sigel, G. M., and Fox, P. W. (1989). "Inhibiting the Lombard effect," The Journal of the Acoustical Society of America 85, 894-900.

Pittman, A. L., and Wiley, T. L. (2001). "Recognition of speech produced in noise," Journal of Speech, Language, and Hearing Research 44, 487-496.

Potash, L. M. (1972). "Noise-induced changes in calls of the Japanese quail," Psychonomic Science 26, 252-253.

Rabin, L. A., McCowan, B., Hooper, S. L., and Owings, D. H. (2003). "Anthropogenic noise and its effect on animal communication: an interface between comparative psychology and conservation biology," International Journal of Comparative Psychology 16, 172-192.

Rivers, C., and Rastatter, M. P. (1985). "The effects of multitalker and masker noise on fundamental frequency variability during spontaneous speech for children and adults," The Journal of Auditory Research 25, 37-45.

Scheifele, P. M. (2003). Investigation into the response of the auditory and acoustic communications systems in the Beluga whale (Delphinapterus leucas) of the St. Lawrence River Estuary to noise, using vocal classification, Ph.D. dissertation, University of Connecticut, p. 123.

Scheifele, P. M., Andrew, S., Cooper, R. A., Darre, M., Musiek, F. E., and Max, L. (2005). "Indication of a Lombard vocal response in the St. Lawrence River beluga," The Journal of the Acoustical Society of America 117, 1486-1492.

Schmidt, U., and Joermann, G. (1986). "The influence of acoustical interferences on echolocation in bats," Mammalia 50.

Schulman, R. (1989). "Articulatory dynamics of loud and normal speech," The Journal of the Acoustical Society of America 85, 295-312.

Shannon, C. E., and Weaver, W. (1963). The Mathematical Theory of Communication (University of Illinois Press, Chicago, IL).

Sinnott, J. M., Stebbins, W. C., and Moody, D. B. (1975). "Regulation of voice amplitude by the monkey," The Journal of the Acoustical Society of America 58, 412-414.

Slabbekoorn, H., and Ripmeester, E. A. P. (2008). "Birdsong and anthropogenic noise: implications and applications for conservation," Molecular Ecology 17, 72-83.

Summers, W. V., Pisoni, D. B., Bernacki, R. H., Pedlow, R. I., and Stokes, M. A. (1988). "Effects of noise on speech production: Acoustic and perceptual analyses," The Journal of the Acoustical Society of America 84, 917-928.

Sundberg, J., and Nordenberg, M. (2006). "Effects of vocal loudness variation on spectrum balance as reflected by the alpha measure of long-term-average spectra of speech," The Journal of the Acoustical Society of America 120, 453-457.

Tartter, V. C., Gomes, H., and Litwin, E. (1993). "Some acoustic effects of listening to noise on speech production," The Journal of the Acoustical Society of America 94, 2437-2440.

Ternström, S., Bohman, M., and Södersten, M. (2006). "Loud speech over noise: some spectral attributes, with gender differences," The Journal of the Acoustical Society of America 119, 1648-1665.

Traunmüller, H., and Eriksson, A. (2000). "Acoustic effects of variation in vocal effort by men, women, and children," The Journal of the Acoustical Society of America 107, 3438-3451.

Tressler, J., Schwartz, C., Wellman, P., Hughes, S., and Smotherman, M. (2011). "Regulation of bat echolocation pulse acoustics by striatal dopamine," The Journal of Experimental Biology 214, 3238-3247.

Tressler, J., and Smotherman, M. S. (2009). "Context-dependent effects of noise on echolocation pulse characteristics in free-tailed bats," Journal of Comparative Physiology A 195, 923-934.

Tyack, P. L. (2000). "Functional aspects of cetacean communication," in Cetacean societies: field studies of dolphins and whales, edited by J. Mann, R. C. Connor, P. L. Tyack, and H. Whitehead (University of Chicago Press, Chicago, IL).

Tyack, P. L. (2008). "Implications for marine mammals of large-scale changes in the marine acoustic environment," Journal of Mammalogy 89, 549-558.

Tyack, P., Gordon, J., and Thompson, D. (2003). "Controlled Exposure Experiments to Determine the Effects of Noise on Marine Mammals," Marine Technology Society Journal 37, 41-53.

Vélez, A., and Bee, M. A. (2011). "Dip listening and the cocktail party problem in grey treefrogs: signal recognition in temporally fluctuating noise," Animal Behaviour 82, 1319-1327.

Webster, J. C., and Klumpp, R. G. (1962). "Effects of ambient noise and nearby talkers on a face-to-face communication task," The Journal of the Acoustical Society of America 34, 936-941.

Wiley, R. H. (2006). "Signal detection and animal communication," in Advances in the Study of Behavior, vol. 36, edited by H. J. Brockmann, P. J. B. Slater, C. T. Snowdon, T. J. Roper, M. Naguib, and K. E. Wynne-Edwards (Academic Press), pp. 217-247.

Chapter 3

Characterization of acoustic habitat in captive and wild beluga environments

Abstract

Determining whether a signaler’s acoustic habitat is likely to affect the types of vocal modifications used during increased noise requires knowledge of the source parameters of common and novel noises. This chapter examines the acoustic environment in one wild (Cook Inlet, Alaska; June – August 2010 and January – April 2011) and one captive (Mystic Aquarium, Mystic, CT; November 6 – 16, 2010 and October 1 – 11, 2011) beluga habitat, with comparison of noise types, amplitudes, and spectral content. The two Cook Inlet sites differed in both human activity levels and beluga presence. When vocalizations from beluga whales were detected, noise levels at the two sites differed, with Beluga River having higher noise levels and a greater number of anthropogenic noise events (vessel passages). Noise at the Kenai River site was dominated by weather events, with fewer anthropogenic events than at the Beluga River site. At Mystic Aquarium, noise was relatively high in amplitude and varied both spatially and temporally, with different spectra between pools and across both long and short time scales. Comparison of noise levels from both wild and captive habitats with an estimated beluga audiogram indicated that belugas at Mystic Aquarium are likely to perceive noise above 2.5 kHz, and whales in Cook Inlet are likely to hear sounds at 4 kHz and above. These noise thresholds overlap with both the frequency range of beluga vocalizations and the beginning of belugas’ peak hearing sensitivity. Broadband noise levels were highest in the captive environment, rivaling the 95th percentiles for noise at both Cook Inlet sites, with and without M-weighting. Spectral shapes also differed between the two environments, with high levels of low-frequency noise (150 – 300 Hz) in the captive habitat, and a peak at around 1 kHz at the Cook Inlet sites. The loudest sounds recorded in each habitat were approximately equal in level, but of different durations. Acoustic environments in beluga habitats vary considerably in time and space, but the continuously high noise levels in captive habitats may present a greater barrier to communication than the intermittent loud sounds observed at the Cook Inlet sites.

Introduction

Acoustic communication systems have evolved in the presence of ambient noise, which can act as a significant driver of the structure of acoustic signals (Wiley and Richards, 1978; Bradbury and Vehrencamp, 1998; Brumm and Slabbekoorn, 2005). On short time scales (seconds to hours), noise events can affect the success of communication between individuals, particularly if noise occurs during critical communication times (Halfwerk and Slabbekoorn, 2009). The ability to compensate for increased noise on short time scales can therefore benefit signalers (Brumm and Slabbekoorn, 2005), but there are open questions about the evolution of vocal noise compensation. Some species, particularly mammals, appear to have adapted to compensate for noise by changing the acoustic properties of their vocalizations (Brumm and Zollinger, 2011). It remains unclear, however, whether evolutionary pressure on these signalers has selected for specific compensation strategies tailored to commonly encountered noises, or for general vocal flexibility that allows signalers to adjust call characteristics according to variability in noise parameters (Brumm and Slabbekoorn, 2005). Environmental conditions may have selected for the extreme vocal flexibility that some species exhibit; understanding the effects of noise on acoustically dependent species therefore requires an understanding of the signalers’ vocal flexibility and its relationship to their acoustic environment. One particular open question is whether signalers of the same species placed in very different acoustic habitats use the same types of vocal modifications: are they flexible enough to adapt to the particular noise in their habitat, or are they using a fixed response paradigm? In addition, understanding the selection pressures that appear to have shaped vocal flexibility in signalers may provide clues as to whether these species are capable of compensating for increases in novel anthropogenic sounds.
In the marine environment, many species depend on acoustic signals to facilitate foraging, navigation, and social interactions (Tyack, 2000). Marine mammals, in particular, use a wide range of sound types for these behavioral functions, and are known to modify their vocalizations to compensate for increased noise (Lesage et al., 1999; Scheifele et al., 2005; Holt et al., 2009; Holt et al., 2011; Parks et al., 2011). One species, the beluga whale (Delphinapterus leucas), is well known for the wide variety of sounds individuals can produce (Schevill and Lawrence, 1949; Sjare and Smith, 1986; Angiel, 1997; Chmelnitsky, 2010), and for their ability to mimic and innovate sounds (Janik et al., 1997; Tyack, 2008). This species has also demonstrated a wide range of vocal flexibility, including noise-induced changes to vocalization amplitude, duration, spectral content, and overall vocalization types (Lesage et al., 1999; Scheifele et al., 2005), making belugas a particularly good species in which to study the effects of noise on the evolution of vocal flexibility. Beluga whales are medium-sized (4 – 6 m) toothed whales found in relatively fluid social groups (usually 2 – 10, but up to several hundred animals) in the Arctic Ocean and sub-Arctic estuaries (O'Corry-Crowe, 2009). Many beluga populations are vulnerable to threats associated with habitat destruction, including increased human usage of their previously ice-covered habitats (Jefferson et al., 2008). Sounds associated with such activities may introduce novel noises into the belugas’ acoustic habitats; mineral exploration and extraction, naval activities, and commercial shipping traffic are all predicted to increase in the Arctic Ocean as global temperatures rise (Tynan and DeMaster, 1997). Both the loss of insulating ice cover and an increase in anthropogenic activities have the potential to affect the acoustic environment in beluga habitats, potentially increasing stress for all animals (Wright et al., 2007a; b), and increasing the need for signalers of many species to expend energy compensating for increased masking noise (Richardson et al., 1995; Noren et al., 2011). Beluga whales are one of the few cetacean species to be successfully kept in captivity, where animals live in small social groups (2 – 12 animals) at aquaria and zoos around the world. Rigid-walled enclosures and exhibit life support and maintenance systems create novel acoustic environments that may affect acoustic communication by the inhabitants, and also allow evaluation of the ability of beluga whales to compensate for unfamiliar masking noises.

Acoustic Habitats

The Arctic and sub-Arctic environments which most beluga populations inhabit are inherently noisy, with sources including weather events, animals (including acoustic signals from other marine mammals), and ice (Richardson et al., 1995). Anthropogenic contributions in some beluga habitats include noise associated with military activity, commercial shipping, and mineral extraction activities (Richardson et al., 1995; Lesage et al., 1999; Erbe and Farmer, 2000; Simard et al., 2008; McQuinn et al., 2011), such as sonar, propeller cavitation, and airgun array sounds. Wild belugas may be able to reduce their exposure to masking noise by relocating to a quieter portion of their habitat, though the relative value of certain habitat areas may preclude this response (Bejder et al., 2006; Beale, 2007).

In captive habitats, however, belugas are unable to remove themselves from noisy habitats; noise sources in these environments include few natural sounds and are relatively continuous over the long term. Exhibit maintenance and life support systems, aquarium events and visitors, other animals, and nearby infrastructure sounds are all potential noise sources in captive habitats. Despite the known vocal flexibility of this species, there are no data on how different noise sources affect vocal responses, or on whether responses to common and novel noise sources are equivalent in terms of energetic costs or effectiveness. Evaluating vocal responses to increasing noise and the effects of novel and evolutionarily familiar noise sources requires characterizing the noise present in natural and novel acoustic habitats. This chapter describes the acoustic environment in wild (Cook Inlet, Alaska) and captive (Mystic Aquarium, Mystic, CT) beluga habitats with the intent of comparing and contrasting noise conditions between these environments. These characterizations will be used in future work on noise-induced modifications to the acoustic structure of beluga whale vocalizations.

Wild – Cook Inlet, Alaska

The only wild beluga populations whose natural habitat currently overlaps with significant anthropogenic development are located in the Canadian St. Lawrence River and Cook Inlet, Alaska (Jefferson et al., 2008). Close proximity to humans means that the behavior and natural history of these two populations are the best known (Scheifele et al., 2005; Goetz et al., 2007), but that they may also be the most vulnerable to effects of increasing anthropogenic activity. While the acoustic habitat of the St. Lawrence River population has been extensively studied (Lesage et al., 1999; Scheifele et al., 2005; McQuinn et al., 2011; Gervaise et al., 2012), the acoustic environment in Cook Inlet is poorly understood, especially with respect to important beluga habitat areas (Blackwell and Greene, 2002; Hotchkin et al., 2010). Cook Inlet is a sub-Arctic estuary in southern Alaska that branches into two arms at the northeastern end (Figure 3-1). The resident beluga population declined by over 50 percent (from 653 animals to a low of 278) during the early 1990s, and has yet to recover (Hobbs et al., 2000). The population is designated as endangered, and currently numbers between 200 and 400 individuals (Hobbs et al., 2008; Hobbs et al., 2011). Human development around the inlet is limited but increasing. Anchorage (Alaska’s largest city, population ~300,000) is located in the fork between Knik and Turnagain arms, and smaller settlements line both shores. Within the inlet, mineral exploration and extraction, construction, commercial shipping, fishing, and recreational boating activities are common, noisy activities (Blackwell and Greene, 2002). Natural noise in Cook Inlet includes sounds from rain, extreme tidal currents (regularly > 7 knots), and seasonal ice presence (Blackwell and Greene, 2002; Hotchkin et al., 2010).
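As a quick arithmetic check on the population decline quoted above (from 653 animals to a low of 278), the relative loss can be worked out directly; this is only a restatement of the two counts given in the text:

```latex
\[
\frac{653 - 278}{653} = \frac{375}{653} \approx 0.574
\]
```

That is, a decline of roughly 57 percent, consistent with the "over 50 percent" figure.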

Figure 3-1. Map of Cook Inlet, Alaska, with EAR deployment locations marked with black stars and major rivers indicated by black lines. The two EAR deployments were within the Beluga River and Kenai River outflow zones. Relevant locations are marked with letters: A) Beluga River, B) Little Susitna River, C) Chickaloon Bay, D) Knik Arm, E) Turnagain Arm.

Beluga usage of Cook Inlet habitat areas varies seasonally (Moore et al., 2000; Goetz et al., 2007; Ezer et al., 2008). Summer population centers are generally found in the mid- and upper inlet, between the Beluga and Little Susitna Rivers, in Chickaloon Bay, and in upper Knik Arm (Speckman and Piatt, 2000; Goetz et al., 2007; Hobbs et al., 2011). These areas are also subject to increased noise exposure due to an increase in anthropogenic activity. Commercial shipping, fishing, military, and construction activities are common in the areas closer to Anchorage, while mineral exploration and extraction activities are more common in the mid-inlet southwest of Beluga River (Blackwell and Greene, 2002).

Surveys of habitat use by Cook Inlet belugas reported reduced sightings in lower Cook Inlet during and after the population crash, indicating a range contraction and possible habitat abandonment (Moore et al., 2000; Speckman and Piatt, 2000; Rugh et al., 2010). However, recent evidence suggests that this habitat area is still occupied during the winter when the upper inlet is typically ice covered (Hobbs et al., 2005). Noise sources in this area include recreational boating (mainly summer months), transiting ships, and oil and gas support vessels, and the same natural sources found in the upper inlet. To evaluate habitat usage and characterize the acoustic environments experienced by Cook Inlet belugas, autonomous bottom-mounted recording units were placed at several locations within the inlet during 2010 and 2011; this chapter will examine acoustic environments when beluga vocalizations were detected at the Beluga and Kenai River sites. A second study of the short-term acoustic environment near Anchorage during two weeks in August 2007 can be found in Appendix 1.

Captive – Mystic Aquarium, Mystic, Connecticut

Although much effort has been dedicated to studying sound exposure and effects of noise on physiology and behavior of wild marine mammals, relatively little attention has been paid to the effects of noise on captive cetaceans (Scheifele, P., pers. comm.; but see Romano et al. (2004) and Thomas et al. (1990)). Beluga whales, kept in small groups at aquaria around the world, may experience a wide range of noise sources and levels, to which their responses are unknown. Studies of the noise environments in a few captive beluga habitats have found widely varying noise levels from sources integral to the animals’ enclosures, including life support (filtration and refrigeration) systems and exhibit maintenance, as well as external sources from aquarium visitors and functions (Scheifele, P., pers. comm.). The three beluga whales at Mystic Aquarium in Mystic, Connecticut, live in an outdoor exhibit composed of three pools (“main”, “med”, and “hold”; Figure 3-2) with a large underwater viewing area surrounded by potential noise sources. The “Arctic Coast” exhibit is located within 120 m of Interstate 95; other noise sources include exhibit life support and maintenance (pumps, filtration, refrigeration, and scrubbing/vacuum machinery), other exhibits with loud engines, loudspeakers, and underwater viewing areas within the Arctic Coast exhibit.


Figure 3-2. Diagram of the Arctic Coast exhibit at Mystic Aquarium with pools labeled. Gates between pools are indicated by hashes; there are two gates between the Main and Hold pools, one between Main and Med, and one between Med and Hold. The underwater viewing area is located underneath the semicircular canopy on the left side of the map. Image courtesy of Mike Osborn, Mystic Aquarium.

The acoustic environment in the Mystic Aquarium beluga exhibit has been informally studied (Pond et al., 2007), but there are no data on the spatial and temporal variability of noise in the exhibit. The different sizes and depths of the three pools, including circulation patterns, may cause spatial variability, which could affect whales that habitually spend disproportionate amounts of time in certain areas of the habitat. Temporal variability in exhibit noise as a result of human activities (hours of operation, traffic on nearby roads, etc.) is also expected.

Methods

Data Collection and processing

Wild environment: Cook Inlet, Alaska

Data from Cook Inlet were collected by a collaborative group of researchers based at the University of Hawaii (M. Lammers), the Alaska Department of Fish and Game (R. Small), the National Marine Mammal Laboratory (M. Castellote) and the University of Alaska Southeast (S. Atkinson, R. Blevins). Ten Ecological Acoustic Recording Units (EARs) (Lammers et al., 2008) were deployed in Cook Inlet beginning in June 2009, with the goal of documenting beluga presence and noise exposure at sites throughout the inlet. The data analyzed for this chapter were recorded between June and September 2010 at the mouth of the Beluga River (BR) and from December 2010 to May 2011 at the Kenai River (KR). Complete details of EAR deployments and hardware configurations can be found in Lammers et al. (2008; In review). Briefly, the EAR units were deployed on bottom-mounted moorings at 5 – 10 m depth, and set for a 10% duty cycle (30 s recorded every 5 min) with a sampling rate of 25 kHz, which is appropriate for capturing most non-echolocation beluga vocalizations (Sjare and Smith, 1986; Chmelnitsky, 2010). The recording unit had a flat frequency response (± 3 dB) between 10 Hz and 50 kHz. The manufacturer-calibrated sensitivity of the recording system was -193.5 dB re 1µPa, with a total added gain of 47.5 dB. The detection radius of the units was estimated using a synthesized 10 – 12 kHz broadcast at 140 dB re 1µPa at two sites, giving a conservative estimate of 1.5 – 2.5 km detection range for beluga vocalizations in quiet conditions (Lammers et al., In review). Acoustic data were manually browsed by collaborators at the University of Hawaii using long term spectral averages to identify beluga vocalizations (“encounters”) in the recordings (Lammers et al., In review). Encounter times and durations were forwarded to C. Hotchkin. 
Dates with at least one detected beluga vocalization were fully browsed by C. Hotchkin to confirm presence and quality of vocalizations and to investigate the acoustic environment at the recorder during beluga encounters.
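
The recording parameters above imply a few derived quantities that are easy to check. The following is a minimal sketch, not the collaborators' processing code. It assumes that the stated sensitivity is expressed in dB re 1 V/µPa (the usual hydrophone convention, which the text does not state explicitly) and that the added gain combines with it directly in dB; the function names are illustrative only.

```python
import math

def duty_cycle(record_s: float, period_s: float) -> float:
    """Fraction of time the recorder is on."""
    return record_s / period_s

def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2.0

def received_level_db(v_rms: float, sensitivity_db: float, gain_db: float) -> float:
    """Received level (dB re 1 uPa) from an rms voltage.

    Assumes sensitivity_db is in dB re 1 V/uPa (an assumption; the text
    gives only the number) and gain_db is the total added gain in dB.
    """
    return 20.0 * math.log10(v_rms) - sensitivity_db - gain_db

print(duty_cycle(30, 5 * 60))                  # 0.1
print(nyquist_hz(25_000))                      # 12500.0
print(received_level_db(1.0, -193.5, 47.5))    # 146.0
```

Under these assumptions, a 25 kHz sampling rate limits the analysis band to 12.5 kHz, which matches the upper edge of the broadband measurements reported for Cook Inlet.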

Noise sources in the recordings were identified by visually and aurally browsing all beluga encounters (Figure 3-3). Vessel traffic was identified using characteristic engine and propeller sounds occurring simultaneously with relatively short duration, high amplitude noise events. Tidal stage data from NOAA buoys at the North Foreland (BR site) and Chinulna Point (KR site) stations (http://tidesandcurrents.noaa.gov/station_retrieve.shtml?type=Historic+Tide+Data) were used to identify periods when flow noise was likely to have contributed significant energy to the total noise budget.

Figure 3-3. Example spectrogram (512 point FFT, Hanning window, 25% overlap) from the KR recording unit illustrating the variety of noise sources present in Cook Inlet. The beluga encounter in this example lasted approximately one hour, and consisted of whistles, buzzes, and echolocation clicks.

Captive environment: Arctic Coast Exhibit, Mystic Aquarium, Mystic CT

Data collection at the Arctic Coast Exhibit at Mystic Aquarium in Mystic, Connecticut occurred in two phases, between Oct. 25 and Nov. 16, 2010 and Oct. 1 – 11, 2011. Data on aquarium attendance was requested from the Mystic Aquarium offices at the completion of each year’s data collection.

The Main (0 – 5 m depth), Med (2.13 m), and Hold (4.0 m) pools in the beluga exhibit are connected by clear Plexiglas gates. Filtration and other life support systems are housed in an indoor area adjacent to the exhibit. Visitor access to the exhibit consists of above- and below-water viewing areas, and includes a regularly used public address system with loudspeakers mounted above the exhibit. The underwater viewing area includes large windows through which visitors may watch and interact with the belugas. Exhibit maintenance is performed on a regular weekly schedule, with “scrubber” dives on Monday and Wednesday afternoons, and “vacuum” dives on Tuesday and Thursday mornings. During dives, three SCUBA divers enter the water, and use either a power scrubber or vacuum equipment to remove algae and other debris. Dives typically last for approximately one hour with intermittent increases in noise level as the machinery is used.

Acoustic recordings

Acoustic recordings were collected using a DSG (www.loggerheadinstruments.com) recording unit with attached HTI 96-MIN hydrophone (High Tech, Inc.) and customizable filters. The frequency response of the recording system was flat (± 3 dB) between 10 Hz and 25 kHz, sufficient to record the expected noise sources in the exhibit as well as most tonal beluga vocalizations. The unit recorded continuously at a sampling rate of either 8 or 50 kHz, with the low-pass anti-aliasing filter set at 23.4 kHz and the high-pass filter set to ‘off’. Manufacturer calibration of the DSG unit indicated a recording sensitivity of -164 dB, with an added 3 dB gain built into the hardware filters. Differences in sampling rate were the result of hardware malfunctions, and days with 8 kHz sampling rate (26 Oct. – 3 Nov. 2010) were excluded from the following analyses. Electronic self-noise of the recording unit was determined by performing in-air recordings with the hydrophone removed and measuring power spectral density of the recorded signals with custom Matlab scripts. At low frequencies, self-noise was minimal, but it began to increase at approximately 10 kHz (Figure 3-4).


Figure 3-4. Unweighted power spectral density comparisons of the DSG noise floor (black), with the lowest (blue) and highest (red) noise recorded at Mystic Aquarium. During the quietest recordings, the noise floor interferes with measurements at frequencies over 6 kHz.

The DSG was deployed by hand using a tether rope attached to the exhibit wall and 6.8 kg of weight. The hydrophone was positioned 1.1 m from the pool bottom; due to the different pool depths, the sensor was 1, 2.8, and 4 m from the water surface in the Med, Hold, and Main pools respectively. During all deployments, the whales were separated from the recorder by closing the gates to the deployment pool. The gates are made of clear Plexiglas®, and permitted visual access to the recorder, but prevented the whales from physically manipulating the instrument. In 2010, deployments were designed to analyze the spatial and temporal noise variability in the exhibit. On recording days, the DSG was therefore deployed alternately in the Med and Hold pools for 8 – 30 h at a time. Only a single hour of data was obtained from the Main pool due to the need for the whales to be on display during hours of aquarium operations. Recordings were conducted during both day and nighttime hours. In 2011, deployments were designed to capture noise variability due to experimental manipulation of the exhibit’s refrigeration equipment. The same DSG unit, with the same sampling protocol and filter settings as in 2010, was deployed in the Med pool during daylight hours. Restriction of sampling to the Med pool allowed for minimal disruptions to the animals and training staff. Recording schedules for both years are shown in Table 3-1.


Table 3-1. Recording schedules for the 2010 and 2011 data collection sessions at Mystic Aquarium. Note that not all dates are consecutive, due to availability of pools for recording. On days when two pools are listed, the DSG was moved between pools during the recording period. A check in the overnight column means that the DSG was recording overnight beginning on the date with the checkmark and continuing through the following day.

Year  Month  Date  Day  Pool       Overnight
2010  NOV    6     Sa   HOLD       ✓
2010  NOV    7     Su   HOLD       ✓
2010  NOV    8     M    MED        ✓
2010  NOV    9     Tu   MED/HOLD   ✓
2010  NOV    10    W    HOLD       ✓
2010  NOV    13    Sa   MED        ✓
2010  NOV    14    Su   HOLD       ✓
2010  NOV    15    M    MAIN/MED   ✓
2010  NOV    16    Tu   MED/HOLD   ✓
2011  OCT    1     Sa   MED        ✓
2011  OCT    2     Su   MED        ✓
2011  OCT    3     M    MED        ✓
2011  OCT    4     Tu   MED        ✓
2011  OCT    8     Sa   MED        ✓
2011  OCT    9     Su   MED        ✓
2011  OCT    11    Tu   MED        ✓

Noise manipulations

Recordings from 2011 were made during early October to capture noise from exhibit life-support systems used to refrigerate the exhibit water (“chillers”). The chillers operate on an automatic on/off schedule according to the ambient air temperature and water temperatures in the exhibit. When water temperatures rise above 12.8°C, one or both chillers activate automatically. During colder months, chilling equipment is unnecessary and the systems are shut down on or before November 1 of each year, and remain off until early summer of the following year. The on/off schedule for the chillers was manually overridden during the 2011 recording period in order to record ambient noise in the exhibit with and without chillers. Table 3-2 gives the on/off schedule for the chillers during the recording days.


Table 3-2. Chiller on/off schedule for 2011 recording sessions. AM indicates chillers were scheduled to go off at 0900 and on at 1200; PM indicates a scheduled off time of 1200 and an ‘on’ time of 1500. Actual times vary because automatic sensors adjust the chiller schedules to measured water temperatures. When two ‘on’ times are noted, each chilling unit turned on separately rather than both coming online simultaneously.

Year  Month  Date  Day  Pool  Schedule  Actual Off  Actual On
2011  OCT    1     Sa   MED   AM        0900        1407
2011  OCT    2     Su   MED   PM        1158        1501/1514
2011  OCT    3     M    MED   AM        0858        1204
2011  OCT    4     Tu   MED   PM        1208        1524
2011  OCT    8     Sa   MED   PM        1339        1530/1548
2011  OCT    9     Su   MED   AM        0810        1230/1325
2011  OCT    11    Tu   MED   AM        0900        1159/1426

Noise analyses

The same measurements of noise level and spectral qualities were used for all data from both habitat types. Due to the large size of the data sets, pseudo-random sub-sampling was used to measure the acoustic environment. Broadband and 1/3 octave band noise levels from the first second of each sound file (Cook Inlet: one 30 s file every 5 minutes; Mystic Aquarium: one 60 s file every minute) were calculated using custom Matlab scripts to filter and analyze the data; code for these scripts can be found in Appendix 2 of this dissertation. Noise clips were not screened to eliminate beluga vocalizations, as sounds from conspecifics and other species are part of the natural soundscape in both wild and captive beluga habitats. While it is likely that some “noise” clips did contain beluga vocalizations, this is unlikely to bias the noise measurements from either habitat due to the low overall vocalization rates. Noise variability was calculated from the sub-sampled data. Noise level percentiles (5th, 25th, 50th, 75th, and 95th) were used to describe temporal and spatial variability in both data sets. Temporal variability in the Mystic Aquarium beluga habitat was examined on hourly, daily, and yearly time scales. Noise from immediately before and after cleaning dive events was used to evaluate the impact of exhibit maintenance on the overall noise budget; similarly, mean and median noise measurements from day and night recordings in the Med and Hold pools were compared to evaluate potential diel cycles in habitat noise. Yearly comparisons were made by comparing 2010 daytime Med pool measurements with 2011 noise measurements, as the 2011 recordings were all made in the Med pool during daylight hours.
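
The percentile summary described above can be sketched as follows. This is an illustrative stand-in for the Matlab scripts in Appendix 2, not a transcription of them: it uses Python's `statistics.quantiles` with the inclusive method, which may interpolate differently from the original code, and the sample levels are made-up values used only to demonstrate the call.

```python
import statistics

def level_percentiles(levels_db, wanted=(5, 25, 50, 75, 95)):
    """Return {percentile: level} for a list of per-clip noise levels in dB."""
    # quantiles(n=100) returns the 99 cut points for the 1st..99th percentiles
    cuts = statistics.quantiles(levels_db, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in wanted}

# Hypothetical one-second broadband levels (dB re 1 uPa), for illustration only
clips = [94.0, 95.0, 96.0, 97.0, 98.0, 99.0, 100.0, 101.0, 102.0, 103.0, 104.0]
summary = level_percentiles(clips)
print(summary[50])   # 99.0
```

Percentiles are taken directly on the dB values here; because percentiles depend only on rank order, the result is the same as converting to linear pressure, taking percentiles, and converting back, apart from interpolation between neighbouring samples.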

M-weighting

Noise measurements were weighted to reduce the impact of very low frequency (<100 Hz) noise on broadband measurements and account for the assumed perceptual abilities of beluga whales following the methods of Southall et al. (2007) and McQuinn et al. (2011). The equation for the “M-weighting” function designed to model the hearing capability of mid-frequency cetaceans is:

$$M(f) = 20 \log_{10}\left[\frac{R(f)}{\max\{|R(f)|\}}\right]$$

where

$$R(f) = \frac{f_{high}^{2}\, f^{2}}{\left(f^{2} + f_{high}^{2}\right)\left(f^{2} + f_{low}^{2}\right)}$$

and the corner frequencies $f_{low}$ and $f_{high}$ are set at 150 Hz and 160 kHz, respectively (Southall et al., 2007; McQuinn et al., 2011). M was calculated using the appropriate frequency range for each data set and applied by adding the M function to the calculated noise levels (in dB). To calculate broadband M-weighted noise levels, the M function was applied to average power spectral densities from all noise conditions. The spectra were then integrated to give the weighted broadband noise levels. Customized Matlab code for M-weighting can be found in Appendix 2.
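
As a rough illustration of the weighting step, the sketch below evaluates the mid-frequency cetacean M-weighting of Southall et al. (2007), taken here as M(f) = 20 log10[R(f)/max R(f)] with R(f) = f_high² f² / ((f² + f_high²)(f² + f_low²)), and applies it to a set of band levels. It is not the Appendix 2 Matlab code. As a simplification it applies the weighting at band centre frequencies and sums the bands on a power basis, whereas the text describes weighting power spectral densities before integrating; the function and variable names are assumptions made for this example.

```python
import math

F_LOW_HZ = 150.0        # lower corner frequency, mid-frequency cetaceans
F_HIGH_HZ = 160_000.0   # upper corner frequency

def _r(f_hz: float) -> float:
    """Unnormalised response R(f)."""
    f2 = f_hz ** 2
    return (F_HIGH_HZ ** 2 * f2) / ((f2 + F_HIGH_HZ ** 2) * (f2 + F_LOW_HZ ** 2))

# R(f) peaks at the geometric mean of the two corner frequencies
_R_MAX = _r(math.sqrt(F_LOW_HZ * F_HIGH_HZ))

def m_weight_db(f_hz: float) -> float:
    """M-weighting in dB: 0 dB at the peak, negative elsewhere."""
    return 20.0 * math.log10(_r(f_hz) / _R_MAX)

def weighted_broadband_db(band_levels_db: dict) -> float:
    """Add M(f) to each band level (dB), then sum the bands on a power basis."""
    total = sum(10.0 ** ((level + m_weight_db(f)) / 10.0)
                for f, level in band_levels_db.items())
    return 10.0 * math.log10(total)

print(round(m_weight_db(150.0), 1))   # -6.0
print(round(m_weight_db(10.0), 1))    # -47.1
```

The weighting is about 6 dB down at the 150 Hz corner and falls steeply below it, which is why M-weighted levels in this chapter are much lower than unweighted levels wherever noise below 100 Hz dominates.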

Results

Wild environment: Cook Inlet, Alaska

Data from the Beluga (BR) and Kenai River (KR) sites were collected from June – August 2010 and December 2010 – April 2011, respectively. Noise measurements were calculated from a total of 4,270 seconds from the BR dataset and from 460 seconds of the KR dataset. Noise from natural (rain, tidal currents) and anthropogenic (vessel passages) sources was recorded in both datasets concurrent with beluga encounters. In the BR recordings, 39% (N = 30/76) of encounters included noise from vessel passages, while in the KR data, 26% (N = 5/19) of encounters included vessel noise. The KR data also included a substantial number of rain events (N = 11/19) during beluga encounters, which were not present in the BR dataset.

Broadband noise levels

Broadband (10 Hz – 12.5 kHz) noise levels during beluga encounters varied from 93 – 137 dB re 1µPa (80 to 124 dBM re 1µPa; Table 3-3). Noise levels in the BR dataset were highest during vessel passages, and the maximum noise level occurred during the passage of a small boat or float plane on June 21, 2010.

Table 3-3. Descriptive statistics for raw and M-weighted broadband noise levels in the Beluga River (BR) and Kenai River (KR) datasets from Cook Inlet. BR generally had higher noise levels than KR, though minimum levels were similar. The maximum noise level at BR occurred during a single 30 s file when a small boat or aircraft was recorded. M-weighting decreased noise levels between 10 and 150 Hz, which belugas are unlikely to hear, and thus gives a more realistic measure of the belugas’ acoustic environment.

Site  Condition   Mean            Median          Min             Max
                  dB     dBM      dB     dBM      dB     dBM      dB     dBM
BR    Overall     105.4  91.6     98.8   84.1     93.4   80.1     137.1  123.9
BR    No vessel   101.6  87.0     97.9   83.2     93.5   80.3     121.7  108.3
BR    Vessel      108.6  95.1     99.3   84.7     93.4   80.2     137.1  123.9
BR    Tidal       101.8  87.1     99.2   84.4     93.4   80.1     115.2  102.1
KR    Overall     99.2   84.8     95.7   82.0     93.3   80.2     114.7  100.7
KR    No vessel   96.8   82.9     94.5   81.1     93.3   80.2     103.7  89.5
KR    Vessel      98.7   84.2     96.8   82.8     93.4   80.2     106.0  91.1
KR    Rain        102.8  87.9     95.4   82.2     94.0   80.9     114.7  100.7

Noise levels from the KR data were lower than those measured from BR, ranging from 93 – 115 dB re 1µPa (80 to 101 dBM re 1 µPa; Table 3-3). The highest noise level recorded at KR occurred when no vessels were present on 8 January 2011, and was likely caused by rain or another weather event. Calculated percentiles for the broadband noise levels measured are shown in Table 3-4.


Table 3-4. Variability in raw and M-weighted broadband noise levels in the BR and KR datasets. Noise levels were generally higher at BR. Rain was only recorded at KR.

Site  Condition   5th            25th           50th           75th           95th
                  dB    dBM      dB    dBM      dB    dBM      dB     dBM     dB     dBM
BR    Overall     94.8  81.1     96.6  82.3     98.8  84.1     101.8  87.1    108.8  94.4
BR    No vessel   94.7  81.0     96.2  82.1     97.9  83.2     100.1  85.3    104.8  90.4
BR    Vessel      94.9  81.1     96.8  82.6     99.3  84.7     103.2  88.8    111.5  97.5
BR    Tidal       94.9  81.1     97.1  82.6     99.2  84.4     101.7  86.9    106.5  91.2
KR    Overall     93.7  80.5     94.5  81.1     95.7  82.0     98.5   84.2    104.0  88.8
KR    No vessel   93.7  80.5     93.9  80.7     94.5  81.1     97.4   83.6    101.2  86.6
KR    Vessel      93.7  80.3     95.5  81.5     96.8  82.8     99.5   84.9    103.4  87.9
KR    Rain        94.3  81.1     94.8  81.5     95.4  82.2     100.5  86.3    108.4  92.0

Spectral content

Noise spectra in both datasets contained the greatest energy between 400 Hz and 2 kHz (Figure 3-5). BR had elevated noise levels (7 – 8 dB higher than at KR) around 1 kHz, due to increased contributions from vessels. Noise at KR was dominated by higher frequency sounds that occurred during rain and other weather events. Noise in both datasets began to increase above 4 kHz, possibly indicating an increase in electronic noise from the EAR units in conjunction with greater high-frequency noise levels, particularly in the KR recordings.


Figure 3-5. Unweighted 1/3 octave band noise levels for all beluga encounters at BR and KR, including all noise sources. Black lines represent the 5th and 95th percentiles of noise levels; blue lines represent 25th and 75th percentiles; 50th percentile indicated by red line.

Captive environment: Mystic Aquarium

Data were collected at Mystic Aquarium on 14 days in 2010 and 8 days in 2011, resulting in a total of 169.7 hours of usable data: 117.8 h in the med pool (2010: 61.1 h, 2011: 56.8 h), 51.0 h in the holding pool, and 0.8 h in the main pool. Long-term temporal noise variability was investigated using comparisons between 2010 and 2011 recordings made in the Med pool. Short-term temporal variability was examined during the six analyzed cleaning dive events (2010: 4, 2011: 2), and eight chiller manipulation tests. Noise varied both spatially and temporally, though broadband noise measurements were affected by high levels of low-frequency (<100 Hz) noise throughout the exhibit. M-weighted noise levels and 1/3 octave band measurements indicate differences between pools and years, but unweighted broadband (20 Hz – 23 kHz) measurements (Table 3-5) appear identical across all conditions.


Table 3-5. Descriptive statistics for raw and M-weighted broadband noise levels in the Arctic Coast exhibit during 2010 and 2011. Average differences of roughly 6 dBM are apparent between the day and night measurements in both the Med and Hold pools. Maintenance dives increased the average noise in the exhibit by 7 – 12 dB over average Med pool daytime noise. Unweighted noise levels are very high due to very low frequency (<100 Hz) noise in the entire exhibit.

Year        Condition      Mean            Median          Min             Max
                           dB     dBM      dB     dBM      dB     dBM      dB     dBM
2010        Med (day)      118.8  94.1     116.4  90.1     109.9  81.9     129.1  114.3
2010        Med (night)    112.7  87.8     112.0  86.6     109.4  81.9     121.6  105.6
2010        Hold (day)     116.5  93.5     115.2  88.3     112.2  85.2     133.0  119.9
2010        Hold (night)   114.8  87.9     114.7  87.3     112.5  85.3     117.7  98.7
2010        Main           121.1  87.5     120.7  87.0     118.2  85.2     123.6  92.6
2010        Dives          118.2  101.2    116.4  95.6     111.8  84.3     130.0  116.9
2011 (Med)  Chillers Off   117.5  98.6     116.0  96.4     108.0  81.2     131.4  118.0
2011 (Med)  Chillers On    116.4  97.0     115.1  95.8     107.8  82.3     128.2  113.3
2011 (Med)  Dives          119.8  104.7    115.7  96.3     108.8  82.1     134.5  121.4

Spatial Variability

The three pools in the Arctic Coast exhibit had very different noise profiles during the 2010 recording period. Average daytime broadband noise levels were highest in the med pool (118.8 dB; 94.1 dBM re 1 µPa; N = 2,080 s), with similar levels in the hold (116.5 dB; 93.5 dBM re 1 µPa; N = 2,534 s) and main (121.1 dB; 87.5 dBM re 1 µPa; N = 55 s) pools. The unweighted broadband noise levels in the Arctic Coast exhibit are driven by low frequency noise that the M-weighting function is designed to compensate for. Noise levels were also more variable in the med and hold pools (Table 3-5). The relatively low variability in the main pool is likely due to the limited recording time in this area, and to the large size and depth of the main pool, which may have reduced reverberation noise. Noise spectra differed between pools (Figure 3-6). In the main and hold pools, 1/3 octave band noise levels peaked at 150 Hz then declined over the recording bandwidth. The Hold pool spectrum showed an additional peak around 5 kHz, evident in the spectrogram representation (Figure 3-7), probably due to noise from filtration systems or water circulation in the exhibit. Over the frequency bandwidth, levels in the main pool were generally slightly lower than those in the holding pool. The Med pool showed a reversed spectrum, with low noise between 100 and 300 Hz and high noise between 800 Hz and 3 kHz. This difference is probably an effect of the shape, size, and depth of the med pool, which should cumulatively limit the propagation of very low frequency noise.

Figure 3-6. Noise (unweighted) variability measured in the three Arctic Coast pools during November 2010. Black lines represent the 5th and 95th percentiles of noise levels; blue lines represent 25th and 75th percentiles; 50th percentile indicated by a red line. The smallest, shallowest pool (Med) has elevated mid-frequency noise levels between 200 Hz and 3 kHz, probably due to filtration and pool resonances. Hold and Main pools have relatively similar noise spectra, with peaks at around 120 Hz probably caused by electrical noise. Above 10 kHz, all pools have noise levels below the noise floor of the recording unit.

Figure 3-7. Sample spectrograms of noise from the med (1704 11/8/2010) and hold (1703 11/6/2010) pools. Note the increased low frequency noise in the Med pool, and the 5 kHz band in both pools.

Temporal Variability

Temporal variability in the Arctic Coast Exhibit occurred on short (seconds – hours) and long (daily, yearly) time scales (Table 3-5). Short-term noise sources included changes in noise from exhibit maintenance and experimental manipulations of the chillers, while the sources of the observed yearly variations are unclear. There was no change in either broadband or 1/3 octave band (Figure 3-8) noise levels associated with chiller operations. Broadband noise levels during chiller operations averaged 97.0 dBM re 1 µPa (median: 95.8 dBM re 1 µPa; unweighted average: 116.4 dB re 1 µPa); when the chillers were turned off, broadband noise averaged 98.6 dBM re 1 µPa (median: 96.4 dBM re 1 µPa; unweighted average: 117.5 dB re 1 µPa; Table 3-5).

Figure 3-8. 1/3 Octave band noise levels measured during experimental manipulations of the exhibit chillers. There was no difference in noise attributable to chiller operations.

Short-term noise variation was detected during cleaning dives. Six dive events were recorded; in all cases the recording unit was deployed in the med pool. Mean broadband noise levels during dives (2010: 101.2 dBM re 1 µPa, 118.2 dB unweighted; 2011: 104.7 dBM re 1 µPa, 119.8 dB unweighted) did not differ between years, but noise levels during both years’ dives were substantially higher than the M-weighted daytime noise levels in the med pool (2010: 94.1 dBM re 1 µPa, 118.8 dB unweighted; 2011: 98.6 dBM re 1 µPa, 117.5 dB unweighted). In 2010, 1/3 octave band noise levels were increased by up to 5 dB during dives, but in 2011 no such effect was found due to the increase in overall ambient noise levels.

Figure 3-9. 1/3 Octave band noise levels measured during routine exhibit maintenance dives during 2010 and 2011. During 2010, noise levels are elevated by up to 8 dB across the frequency spectrum, while in 2011 there was no significant increase in noise during dives due to an increase in background noise levels.

Daily variation in noise levels was also observed; in both the Med and Hold pools, the daytime noise spectrum was higher than noise recorded at night (Figure 3-10). Average broadband noise levels differed between day and nighttime hours, with a 5 – 7 dBM drop in broadband noise during the nighttime hours (Med day: 94.1 dBM re 1µPa, Med night: 87.8 dBM re 1µPa; Hold day: 93.5 dBM re 1µPa, Hold night: 87.9 dBM re 1µPa; Table 3-5).
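
The size of the nighttime drop can be checked directly from the 2010 mean M-weighted levels in Table 3-5. The short sketch below does that arithmetic and also converts each drop to a ratio of mean-square pressure, to give a sense of scale. The dictionary layout and function names are illustrative assumptions, and subtracting mean levels is only a summary comparison, not a statistical test.

```python
# Mean M-weighted broadband levels (dBM re 1 uPa), 2010 rows of Table 3-5
MEAN_DBM = {
    "Med": {"day": 94.1, "night": 87.8},
    "Hold": {"day": 93.5, "night": 87.9},
}

def night_drop_db(pool: str) -> float:
    """Day-minus-night difference in mean level, rounded to 0.1 dB."""
    levels = MEAN_DBM[pool]
    return round(levels["day"] - levels["night"], 1)

def power_ratio(drop_db: float) -> float:
    """Factor by which mean-square pressure falls for a given dB drop."""
    return 10.0 ** (drop_db / 10.0)

for pool in MEAN_DBM:
    drop = night_drop_db(pool)
    print(pool, drop, round(power_ratio(drop), 1))
```

With these table values the drops are 6.3 dB (Med) and 5.6 dB (Hold), which corresponds to roughly a fourfold reduction in mean-square pressure at night.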


Figure 3-10. Day (0600 – 1759; black) and night (1800 – 0559; blue) noise in the Med and Hold pools. Noise dropped at night in both pools, with the most obvious decrease between 1 and 5 kHz. The peak in the holding pool is likely caused by 120 Hz electrical noise.

Long-term (yearly) noise variability was measured by comparing noise from the 2010 med pool daytime recordings with the 2011 dataset. Broadband noise levels in the med pool increased by an average of 4.5 dBM re 1µPa between the 2010 daylight recordings and the 2011 recordings, an increase that was also observed in the 1/3 octave band measurements (Figure 3-11). Noise between 300 Hz and 2 kHz was 2 to 8 dB higher in 2011 than in 2010.


Figure 3-11. Median 1/3 octave band levels for noise recorded in the Med pool during 2010 (black line) and 2011 (red and blue lines). Noise increased by 2 – 8 dB throughout the spectrum, but the majority of the energy remained within the 300 Hz – 1 kHz range.

Discussion

Understanding the implications of the acoustic habitat for signalers is crucial to evaluating the potential effects of increased noise on signalers, and for predicting the ability of signalers to respond to environmental changes and increased noise over the long term. This study has documented that the acoustic environments that beluga whales experience vary widely within and between wild and captive habitats. Noise spectra and levels differ within and between sites in Cook Inlet, Alaska, and within the beluga habitat at Mystic Aquarium due to differences in the background noise sources, and may be relevant to predictions of what wild belugas may experience as a result of global climate change.

Cook Inlet

Within Cook Inlet, noise from the two datasets differed in source types, overall levels, and spectra. At both locations, tidal currents, vessel passages, and weather events contributed to the noise budget and overlapped with beluga presence at the recorders. Overall, the Beluga River data had slightly higher noise levels and a higher incidence of beluga/vessel encounters, probably due to the recording season (summer). Broadband (unweighted) noise levels at the BR recorder ranged from 93 to 137 dB re 1 µPa. While the exact positions of the vocalizing belugas cannot be determined from this data set, very high signal-to-noise ratio vocalizations indicate that the belugas can and do closely approach the recording unit’s position, and may therefore experience the noise levels measured at the recorder. Broadband noise levels in the Kenai River recordings were slightly lower than those from the BR recordings, ranging from 93 to 115 dB re 1 µPa (unweighted). There were fewer beluga encounters overall, more weather noise events, and a lower rate of beluga/vessel encounters. Due to the recording season (winter), and the recorder’s proximity to a relatively large town (Kenai, AK; population ~7,000 (Census Bureau, 2012)), these noise levels are likely the low end of what would occur in this area year-round. Higher noise levels from recreational and commercial vessel traffic, and from construction activities are expected in the summer months (B. Mahoney, pers. comm.). No specific ice noise was detected while belugas were present at this recorder, but ice remains a likely source of winter noise at this and other Cook Inlet locations (Moore et al., 2000). While beluga use of the KR area during the summer appears to be relatively low, additional analyses of beluga noise exposure at KR in the summer should be conducted. 1/3 octave band levels also differed between BR and KR, with more high frequency (> 3 kHz) noise in the KR data and higher low frequency noise levels from BR. The BR data had higher median noise levels across the spectrum and higher minimum noise levels at all frequencies. 
Comparison of the calculated variability in 1/3 octave bands with published beluga audiograms (Richardson et al., 1995) indicated that the median noise levels at 4 kHz and above are likely to be audible to belugas at both sites. In high noise situations (95th percentile), belugas at both sites are likely to detect noise down to 2 kHz (Figure 3-12).


Figure 3-12. Noise level percentiles at both Cook Inlet sites compared with the beluga hearing threshold (Dashed black line; after Richardson et al. (1995) ). Median noise levels (red line) are likely audible above 4 kHz at both sites. The range of audible noise at both locations overlaps with the frequency range of beluga vocalizations (100 Hz to >20 kHz), indicating a potential masking hazard for signalers.

Mystic Aquarium

Noise at Mystic Aquarium varied spatially and temporally within the beluga habitat, over both long and short time scales. The minimum unweighted broadband noise level recorded in the exhibit was 108 dB re 1 µPa (2011; chillers on); maximum levels (unweighted) ranged to 134.5 dB re 1 µPa (2011; dive). M-weighted noise levels were much lower because of the contribution of noise below 100 Hz to the overall noise budget; these levels ranged from 81 to 121 dBM re 1 µPa. Spatial variation in exhibit noise was more apparent from noise spectra than from unweighted broadband noise levels. The hold and main pools had very similar overall spectra, with high levels around 100 Hz and decreasing noise with increasing frequency. Comparing overall levels between the two pools is difficult due to the limited recording time (~1 hour) from the main pool and high low frequency noise throughout the exhibit; however, average broadband noise in the Main pool during the single hour of recording was higher than the average daytime noise in either the holding or med pools.
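The influence of low-frequency energy on broadband levels can be sketched numerically. The example below assumes that band levels are combined by summing acoustic power rather than decibels, and it uses a crude cutoff at 100 Hz as a stand-in for frequency weighting; it does not implement the M-weighting function itself. The function name and band values are hypothetical.

```python
import math

def broadband_level(band_levels_db):
    """Combine band levels (dB re 1 µPa) by summing power, not decibels."""
    total_power = sum(10 ** (level / 10) for level in band_levels_db)
    return 10 * math.log10(total_power)

# Hypothetical band levels (dB re 1 µPa) keyed by centre frequency (Hz)
bands = {50: 120.0, 100: 118.0, 1000: 95.0, 4000: 90.0}

# Broadband level using every band
all_bands = broadband_level(bands.values())
# Broadband level after discarding bands at or below 100 Hz (crude cutoff, not M-weighting)
above_100 = broadband_level(level for freq, level in bands.items() if freq > 100)

print(round(all_bands, 1), round(above_100, 1))
```

Because the two low-frequency bands dominate the power sum in this assumed spectrum, removing them lowers the broadband level by roughly 26 dB, which is the kind of gap the text describes between unweighted and weighted levels.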

The spectral content of noise from the med pool was substantially different from the other two exhibit areas, with highest noise levels between 300 Hz and 2 kHz. The observed differences may be due to the pool’s small size and shallow depth limiting the propagation of low frequency sounds, the filtration drain located near the recording position, circulation patterns caused by closing the gates to the pool, or a combination of these factors. In all pools, the peaks between 3 and 5 kHz are likely due to filtration systems. Temporal variation in noise occurred on both long and short time scales. During 2010, noise levels varied on the scale of minutes to hours during maintenance dives in the exhibit, increasing both broadband rms and 1/3 octave band noise levels. This change was not apparent during the 2011 recordings. One possible explanation for this is that the yearly variation in noise in the med pool was large enough to mask any additional noise from the power scrubbers and vacuuming equipment. The change in noise levels in the med pool between 2010 and 2011 may have been due to changes in the filtration equipment in the Arctic Coast exhibit or to a change in an external noise source such as nearby traffic, but more data are needed to fully evaluate this change. Chiller manipulations in the beluga exhibit did not affect the noise level significantly. The most likely explanation for the lack of impact is that the chillers are not physically coupled to the exhibit via pipes or pool walls. Water exiting the chillers feeds back into the main exhibit filtration system before re-entering the exhibit; noise from life support systems is therefore more likely to originate in the water circulation and mechanical noise caused by the filters. Comparison of the noise levels recorded at Mystic Aquarium in both years with the published beluga audiogram indicates that in the med pool, noise below 3 kHz is unlikely to be audible to the whales.
In the hold and main pools, the peak in the noise spectrum at 2 kHz overlaps with the audiogram, indicating that noise above 2 kHz may be heard by the whales (Figure 3-13).


Figure 3-13. Median 1/3 octave band noise levels (unweighted) and beluga hearing thresholds (after Richardson et al. (1995)) compared with yearly (left) and spatial (2010; right) variability in the Arctic Coast exhibit. Noise above 2 kHz is likely audible to the whales throughout the exhibit.

Conclusions and Future Work

The two habitat types examined in this study had very different noise sources, spectral content, and levels. As expected, the wild habitat sites had relatively low noise levels, with intermittent high-amplitude periods during vessel passages and weather events. The filtration and exhibit maintenance sounds in the captive habitat created minimum noise levels that rivaled the 95th percentile of all noise (unweighted) in the Cook Inlet habitat. In addition, these high noise levels were continuous over the recording period, with intermittent increases from exhibit maintenance. Comparing the data from both sites with published hearing threshold data for beluga whales indicates that in both wild and captive environments, noise below 4 kHz is likely to be inaudible to the whales. In the wild, whales may perceive noise between 2.5 and 4 kHz during the very loudest encounters recorded; in the captive environment, whales are likely to perceive noise down to 2 kHz in all pools. Examination of the acoustic environments in Cook Inlet and Mystic Aquarium has shown that belugas are exposed to a wide range of noise types and levels. In the captive environment,

noise was relatively high amplitude and continuous, and the distance between signalers was limited. The effects of such long term exposure on vocal behavior are unstudied, but several types of outcomes are possible. Firstly, signalers with a maximum range to a receiver (as in the Mystic Aquarium habitat) may not need to modify their vocalizations to communicate effectively over the limited range between signaler and receiver. Alternatively, signalers could adapt their vocal responses to the new “baseline” noise level, and respond similarly to signalers in lower background noise levels. A third possibility is that signalers may habituate to long-term noise levels and produce calls differently than animals in quieter habitats. A fourth and less likely outcome of consistently high noise is that signalers may be pushed to the physiological or anatomical limits of vocal flexibility, limiting further changes to call structure. The effects of increased noise on the vocal behaviors of the Mystic Aquarium belugas are examined in chapter four of this dissertation. In the wild habitat, noise was intermittent (on the scale of minutes to hours) and relatively low except during rain (KR) and vessel passages. While the acoustic environment is affected by anthropogenic noise, the environment in Cook Inlet is conducive to studying the effects of environmental and anthropogenic noise sources on vocal behavior of belugas. The temporal and spectral characteristics of noise in Cook Inlet indicate that signalers in this habitat would benefit from vocal flexibility and the ability to make short-term changes to the acoustic structure of vocalizations that allow signalers to avoid masking.
The differences between the acoustic habitats in the Arctic Coast Exhibit at Mystic Aquarium and in Cook Inlet, Alaska are interesting in light of the possible effects of global climate change on anthropogenic activities and the eventual increase in noise levels in natural beluga habitats (Tynan and DeMaster, 1997). If the overall noise environment in the Arctic shifts as predicted, anthropogenic activities may become a dominant and relatively consistent source of potential masking noise. Understanding the reactions of belugas to such noise conditions may allow for more effective models of the effects of potential masking noise on wild populations; the data presented in this dissertation are the first step in evaluating the reactions of belugas to these different acoustic environments.

References

Angiel, N. M. (1997). "The vocal repertoire of the beluga whale in Bristol Bay, Alaska," M.S. Thesis, University of Washington.
Beale, C. M. (2007). "The behavioral ecology of disturbance reactions," International Journal of Comparative Psychology 20, 111-120.
Bejder, L., Samuels, A., Whitehead, H., and Gales, N. (2006). "Interpreting short-term behavioural responses to disturbance within a longitudinal perspective," Animal Behaviour 72, 1149-1158.
Blackwell, S. B., and Greene, C. R. J. (2002). "Acoustic measurements in Cook Inlet, Alaska, during 2001," Prepared for National Marine Fisheries Service. Greeneridge Report 271-1. Greeneridge Sciences, Inc., Aptos, CA.
Bradbury, J. W., and Vehrencamp, S. L. (1998). Principles of Animal Communication (Sinauer Associates, Inc, Sunderland, MA).
Brumm, H., and Slabbekoorn, H. (2005). "Acoustic Communication in Noise," in Advances in the Study of Behavior, edited by P. J. B. Slater, C. T. Snowdon, T. J. Roper, H. J. Brockmann, and M. Naguib (Academic Press), pp. 151-209.
Brumm, H., and Zollinger, S. A. (2011). "The evolution of the Lombard effect: 100 years of psychoacoustic research," Behaviour 148, 1173-1198.
Census Bureau, U. S. (2012). "U.S. Census Bureau: State and County QuickFacts, Kenai, AK."
Chmelnitsky, E. (2010). "Beluga whale, Delphinapterus leucas, vocalizations and their relation to behaviour in the Churchill River, Manitoba, Canada," M.S. Thesis, University of Manitoba.
Erbe, C., and Farmer, D. M. (2000). "Zones of impact around icebreakers affecting beluga whales in the Beaufort Sea," The Journal of the Acoustical Society of America 108, 1332-1340.
Ezer, T., Hobbs, R., and Oey, L. (2008). "On the movement of beluga whales in Cook Inlet, Alaska," Oceanography 21, 186-195.
Gervaise, C., Simard, Y., Roy, N., Kinda, B., and Menard, N. (2012). "Shipping noise in whale habitat: Characteristics, sources, budget, and impact on belugas in Saguenay-St. Lawrence Marine Park hub," The Journal of the Acoustical Society of America 132, 76-89.
Goetz, K. T., Rugh, D. J., Read, A. J., and Hobbs, R. C. (2007). "Habitat use in a marine ecosystem: beluga whales Delphinapterus leucas in Cook Inlet, Alaska," Marine Ecology Progress Series 330, 247-256.
Halfwerk, W., and Slabbekoorn, H. (2009). "A behavioural mechanism explaining noise-dependent frequency use in urban birdsong," Animal Behaviour 78, 1301-1307.
Hobbs, R., Shelden, K. E. W., Rugh, D. J., and Norman, S. A. (2008). "2008 Status review and extinction risk assessment of Cook Inlet belugas (Delphinapterus leucas)," in AFSC Processed Report. 2008-02 (Alaska Fisheries Science Center, National Marine Fisheries Service).
Hobbs, R. C., Laidre, K. L., Vos, D. J., Mahoney, B. A., and Eagleton, M. (2005). "Movements and area use of belugas, Delphinapterus leucas, in a subarctic Alaskan estuary," Arctic 58, 331-340.
Hobbs, R. C., Rugh, D. J., and DeMaster, D. P. (2000). "Abundance of belugas, Delphinapterus leucas, in Cook Inlet, Alaska, 1994-2000," Marine Fisheries Review 62.
Hobbs, R. C., Sims, C. L., and Shelden, K. E. W. (2011). "Estimated abundance of belugas in Cook Inlet, Alaska, from aerial surveys conducted in June 2011," in Unpublished Report (NMFS, NMML).
Holt, M. M., Noren, D. P., and Emmons, C. K. (2011). "Effects of noise levels and call types on the source levels of killer whale calls," The Journal of the Acoustical Society of America 130, 3100-3106.
Holt, M. M., Noren, D. P., Veirs, V., Emmons, C. K., and Veirs, S. (2009). "Speaking up: Killer whales (Orcinus orca) increase their call amplitude in response to vessel noise," The Journal of the Acoustical Society of America 125, EL27-EL32.
Hotchkin, C. F., Parks, S. E., and Mahoney, B. A. (2010). "Anthropogenic noise sources and sound production of beluga whales (Delphinapterus leucas) in Cook Inlet, Alaska," in 159th Meeting of the Acoustical Society of America (Baltimore, MD), p. 1727. [Abstract]
Janik, V., Slater, P. B., Peter J.B. Slater, J. S. R. C. T. S., and Manfred, M. (1997). "Vocal Learning in Mammals," in Advances in the Study of Behavior (Academic Press), pp. 59-99.
Jefferson, T. A., Karczmarski, L., Laidre, K., O’Corry-Crowe, G., Reeves, R. R., Rojas-Bracho, L., Secchi, E. R., Slooten, E., Smith, B. D., Wang, J. Y., and Zhou, K. (2008). "IUCN Red List of Threatened Species. Version 2009.2.: Delphinapterus leucas."
Lammers, M. O., Brainard, R. E., Au, W. W., Mooney, T. A., and Wong, K. B. (2008). "An ecological acoustic recorder (EAR) for long-term monitoring of biological and anthropogenic sounds on coral reefs and other marine habitats," The Journal of the Acoustical Society of America 123, 1720-1728.
Lammers, M. O., Castellote, M., Small, R., Atkinson, S., Jenniges, J., Rosinski, A., Oswald, J. N., Garner, C., and Au, W. W. (In review). "Passive acoustic monitoring of Cook Inlet beluga whales (Delphinapterus leucas)," Journal of the Acoustical Society of America.
Lesage, V., Barrette, C., Kingsley, M. C. S., and Sjare, B. (1999). "The effect of vessel noise on the vocal behavior of belugas in the St. Lawrence River estuary, Canada," Marine Mammal Science 15, 65-84.
McQuinn, I. H., Lesage, V., Carrier, D., Larrivee, G., Samson, Y., Chartrand, S., Michaud, R., and Theriault, J. (2011). "A threatened beluga (Delphinapterus leucas) population in the traffic lane: Vessel-generated noise characteristics of the Saguenay-St. Lawrence Marine Park, Canada," The Journal of the Acoustical Society of America 130, 3661-3673.
Moore, S. E., Shelden, K. E. W., Litzky, L. K., Mahoney, B. A., and Rugh, D. J. (2000). "Beluga, Delphinapterus leucas, habitat associations in Cook Inlet, Alaska," Marine Fisheries Review 62, 60-80.
Noren, D. P., Dunkin, R. C., Williams, T. M., and Holt, M. M. (2011). "Energetic cost of behaviors performed in response to vessel disturbance: one link in the population consequences of acoustic disturbance model," in The Effects of Noise on Aquatic Life, edited by A. Hawkins, and A. N. Popper.
O'Corry-Crowe, G. M. (2009). "Beluga Whale: Delphinapterus leucas," in Encyclopedia of Marine Mammals (Second Edition), edited by F. P. William, W. Bernd, and J. G. M. Thewissen (Academic Press, London), pp. 108-112.
Parks, S. E., Johnson, M., Nowacek, D. P., and Tyack, P. L. (2011). "Individual right whales call louder in increased environmental noise," Biology Letters 7, 33-35.
Pond, R., Starke, K., and Tremblay, S. (2007). "Soundscape assessment of beluga whale holding pool, Mystic Aquarium, Mystic, CT," (http://www.bioacoustics.uconn.edu/mystic-soundscape.html; Accessed 07/01/2012).
Richardson, W. J., Greene, C. R. J., Malme, C. I., and Thompson, D. H. (1995). Marine mammals and noise (Academic Press, San Diego, CA).
Romano, T. A., Keogh, M. J., Kelly, C., Feng, P., Berk, L., Schlundt, C. E., Carder, D. A., and Finneran, J. J. (2004). "Anthropogenic sound and marine mammal health: measures of the nervous and immune systems before and after intense sound exposure," Canadian Journal of Fisheries and Aquatic Sciences 61, 1124-1134.
Rugh, D. J., Shelden, K. E. W., and Hobbs, R. C. (2010). "Range contraction in a beluga whale population," Endangered Species Research 12, 69-75.
Scheifele, P. M., Andrew, S., Cooper, R. A., Darre, M., Musiek, F. E., and Max, L. (2005). "Indication of a Lombard vocal response in the St. Lawrence River beluga," The Journal of the Acoustical Society of America 117, 1486-1492.
Schevill, W. E., and Lawrence, B. (1949). "Underwater Listening to the White Porpoise (Delphinapterus leucas)," Science 109, 143-144.
Simard, Y., Roy, N., and Gervaise, C. (2008). "Passive acoustic detection and localization of whales: Effects of shipping noise in Saguenay-St. Lawrence Marine Park," The Journal of the Acoustical Society of America 123, 4109-4117.
Sjare, B. L., and Smith, T. G. (1986). "The vocal repertoire of white whales, Delphinapterus leucas, summering in Cunningham Inlet, Northwest Territories," Canadian Journal of Zoology 64, 407-415.
Southall, B. L., Bowles, A. E., Ellison, W. T., Finneran, J. J., Gentry, R. L., Greene, C. R., Jr., Kastak, D., Ketten, D. R., Miller, J. H., Nachtigall, P. E., Richardson, W. J., Thomas, J. A., and Tyack, P. L. (2007). "Appendix A: Acoustic Measures and Terminology," Aquatic Mammals 33, 498-501.
Speckman, S. G., and Piatt, J. F. (2000). "Historic and current use of Lower Cook Inlet, Alaska, by belugas, Delphinapterus leucas," Marine Fisheries Review 62.
Thomas, J. A., Kastelein, R. A., and Awbrey, F. T. (1990). "Behavior and blood catecholamines of captive belugas during playbacks of noise from an oil drilling platform," Zoo Biology 9, 393-402.
Tyack, P. L. (2000). "Functional aspects of cetacean communication," in Cetacean societies: field studies of dolphins and whales, edited by J. Mann, R. C. Connor, P. L. Tyack, and H. Whitehead (University of Chicago Press, Chicago, IL).
Tyack, P. L. (2008). "Convergence of Calls as Animals Form Social Bonds, Active Compensation for Noisy Communication Channels, and the Evolution of Vocal Learning in Mammals," Journal of Comparative Psychology 122, 319-331.
Tynan, C. T., and DeMaster, D. P. (1997). "Observations and Predictions of Arctic Climatic Change: Potential Effects on Marine Mammals," Arctic 50, 308-322.
Wiley, R. H., and Richards, D. G. (1978). "Physical constraints on acoustic communication in the atmosphere: Implications for the evolution of animal vocalizations," Behavioral Ecology and Sociobiology 3, 69-94.
Wright, A. J., Soto, N. A., Baldwin, A. L., Bateson, M., Beale, C. M., Clark, C., Deak, T., Edwards, E. F., Fernandez, A., Godinho, A., Hatch, L. T., Kakuschke, A., Lusseau, D., Martineau, D., Romero, L. M., Weilgart, L. S., Wintle, B. A., Notarbartolo-di-Sciara, G., and Martin, V. (2007a). "Anthropogenic noise as a stressor in animals: a multidisciplinary perspective," International Journal of Comparative Psychology 20, 250-273.
Wright, A. J., Soto, N. A., Baldwin, A. L., Bateson, M., Beale, C. M., Clark, C., Deak, T., Edwards, E. F., Fernandez, A., Godinho, A., Hatch, L. T., Kakuschke, A., Lusseau, D., Martineau, D., Romero, L. M., Weilgart, L. S., Wintle, B. A., Notarbartolo-di-Sciara, G., and Martin, V. (2007b). "Do marine mammals experience stress related to anthropogenic noise?," International Journal of Comparative Psychology 20, 274-316.

Chapter 4

Effects of noise on the vocalizations of captive beluga whales

Abstract

Short-term changes to the acoustic structure of vocalizations (“vocal flexibility”) can help signalers compensate for the masking effects of noise and increase chances of successful communication. Vocally flexible beluga whales (Delphinapterus leucas) have been shown to change the source level and spectral content of their vocalizations during increased noise. This study used passive acoustic recordings of a stable social group of belugas at Mystic Aquarium in Mystic, Connecticut, to examine changes in the acoustic structure of vocalizations during short-term increases in exhibit noise. During increased low-frequency noise from exhibit maintenance, the minimum and peak frequencies of two stereotyped beluga vocalizations, a pulsed call (CT1) and a flat-contoured whistle, were related to both broad- and narrow-band noise levels. For the pulsed call type, minimum call frequencies increased during increased narrowband noise, and peak call frequencies decreased. For flat whistles, the minimum frequency increased with higher broadband noise levels. There was some evidence of an effect of noise on the pulse-repetition rate of the CT1 pulsed call type, but no consistent effect of noise on duration of either CT1 or flat whistles. Call structure also varied between dates of cleaning dives, implying a possible influence of behavioral or environmental factors (including which animal produced each signal) which may interact with noise-induced modifications to affect call structure. These findings indicate that the beluga whales at Mystic exhibit a variety of noise-induced vocal modifications, but that call structure is also likely to be related to behavioral context and the communicative motivation and identity of the signaling animal. The observed noise-induced vocal modifications indicate that beluga whales are capable of changing specific parameters of stereotyped vocalizations during noise, which may allow signalers to increase their chances of successful communication.

Introduction

Vocal Noise Compensation

Animal communication has evolved in the presence of ambient noise from weather events, animals, and other natural sources (Bradbury and Vehrencamp, 1998; Tyack, 2000). A signaler’s acoustic environment, which is comprised of noise from other animals, weather, and anthropogenic sounds in conjunction with the physical structure of the habitat, can impact the effective range (‘active space’) of signals and overall success of acoustic communication (Wiley and Richards, 1978; Wiley, 2006). During periods of increased noise, signalers that are unable to maintain communication with conspecifics may suffer long-term fitness consequences, including loss of contact with offspring (Schroeder et al., 2012), increased aggression (Wilczynski and Brenowitz, 1988), loss of mating opportunities (Lengagne, 2008; Barber et al., 2009), and increased energetic costs (Fuller et al., 2007; Noren et al., 2011). Signalers capable of successfully communicating through masking noise have a clear selective advantage over those unable to do so (Brumm and Slabbekoorn, 2005; Patricelli and Blickley, 2006), which may have contributed to the evolution of short-term vocal flexibility in many species of birds and mammals (Brumm and Slabbekoorn, 2005; Brumm and Zollinger, 2011). The diversity of acoustic habitats and of vocally flexible species leads to the question of how vocal modifications are influenced by the types of noise a flexible signaler experiences over both long and short time scales. This chapter investigates vocal modifications by beluga whales in an artificial environment dominated by noise from equipment maintenance for future comparison with a wild population of belugas which experiences a very different acoustic environment.

Noise-induced vocal modifications

Although not all species are equally flexible, animals from many vertebrate taxa (e.g. mammals, birds, and amphibians) have demonstrated some type of noise-induced vocal modification (Brumm and Zollinger, 2011). In general, signal duration and timing of call production appear to be more commonly modified than spectral content (Janik et al., 1997), and these changes make up the majority of observed vocal modifications for many species of

amphibians (Lengagne, 2008; Love and Bee, 2010; Brumm and Zollinger, 2011). In the more flexible bird and mammal species, the most widespread modification is an increase in vocalization source level as a function of increasing noise amplitude (the Lombard effect) (Lombard, 1911; Lane and Tranel, 1971; Scheifele et al., 2005; Brumm and Zollinger, 2011). This modification is less well-studied than temporal and spectral changes in non-human vocalizations due to technical challenges, but has been found in every bird and mammal for which it has been tested (Brumm, 2004). Given the high signal-to-noise ratios at which signalers vocalize during low-noise conditions (Lane and Tranel, 1971; Brumm and Slabbekoorn, 2005), the typical magnitude of the Lombard effect (approximately a 2 dB increase in call amplitude per 10 dB increase in noise) is often useful in maintaining signal-to-noise ratio of calls during increased noise levels. Spectral and temporal modifications are also common among bird and mammal species. In many cases when it is impossible to determine the amplitude of a call (often for free-ranging animals whose identity and orientation are unknown) spectral and temporal modifications are detected (Brumm and Slabbekoorn, 2005). Spectral modifications can include changes to many frequency components, including (but not limited to) minimum (lowest), maximum (highest), and peak (most energy) frequency parameters (Brumm and Slabbekoorn, 2005; Patricelli and Blickley, 2006). Shifting all or part of a signal to a less-noisy frequency band may help compensate for increased noise by increasing the call’s signal-to-noise ratio against the higher-noise band, and therefore increase the signaler’s chances of being detected.
The degree of spectral overlap between the signal and masking noise is important for human and avian signalers (Egan, 1967; 1972), and may be relevant to non-human mammals which have also demonstrated a capability for frequency shifts (Halfwerk and Slabbekoorn, 2009; Hu and Cardoso, 2010; Francis et al., 2011). Many species, particularly marine mammals, produce vocalizations that are composed of a series of pulses, referred to as “pulsed calls”. A spectro-temporal parameter that affects the vocalizations of species that produce pulsed calls is the pulse repetition rate (PRR) (Watkins, 1968; Au and Hastings, 2008; Janik, 2009). The rate of pulse repetition (pulses per second) affects the frequency content of calls, producing distinct high-amplitude “sidebands”. Modification of PRR changes the frequency of these bands, potentially shifting the frequency of important call parameters. In some species, changes to PRR have been documented during increased noise, indicating that modifications to PRR may serve a noise compensation function (Lesage et al., 1999).
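The relationship between pulse repetition rate and sideband frequency can be sketched under an idealized assumption: a perfectly periodic pulse train concentrates its spectral energy at integer multiples of the repetition rate, so the bands are spaced PRR hertz apart. Real beluga pulsed calls are not perfectly periodic, and the function name and PRR values below are hypothetical.

```python
def sideband_frequencies(prr_hz, max_freq_hz):
    """List idealized band frequencies (Hz) at integer multiples of the PRR."""
    bands = []
    n = 1
    while n * prr_hz <= max_freq_hz:
        bands.append(n * prr_hz)
        n += 1
    return bands

# Hypothetical call produced at two different pulse repetition rates
print(sideband_frequencies(200, 1000))  # [200, 400, 600, 800, 1000]
print(sideband_frequencies(250, 1000))  # [250, 500, 750, 1000]
```

In this idealization, raising the PRR from 200 to 250 pulses per second moves every band upward and widens the spacing between bands, which is how a change in PRR could shift energy relative to a fixed band of noise.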

Changes to the temporal parameters of calls may also increase the chances of signal detection (Brumm and Slabbekoorn, 2005; Patricelli and Blickley, 2006). Increases in call duration during increased noise increase the time-bandwidth product, allowing greater temporal integration time for the receiver (Bee and Micheyl, 2008), and it is therefore not surprising that this modification has been observed in most vertebrate taxa, including non-human mammals (Brumm and Zollinger, 2011).

Acoustic communication by beluga whales

Beluga whales are medium-sized (4 – 6 m) toothed whales found in relatively fluid social groups (usually 2 – 10, but up to several hundred animals) in the Arctic Ocean and sub-Arctic estuaries (O'Corry-Crowe, 2009). Habitat loss is one of several threats to this species; as global temperatures rise, their habitats will be subject to less ice cover (Jefferson et al., 2008), and more direct anthropogenic impacts as mineral exploration and extraction industries, naval activities, and commercial shipping traffic begin to exploit newly navigable routes and explore previously ice-covered areas (Tynan and DeMaster, 1997). Both the loss of insulating ice cover and an increase in anthropogenic activities have the potential to affect the acoustic environment in beluga habitats, creating masking hazards for animals listening for predators, prey, and conspecifics, and for signalers who must exert additional energy to compensate for changing acoustic environments (Richardson et al., 1995; Noren et al., 2011). Belugas are known as the ‘canaries of the sea’ due to their extensive and flexible vocal repertoires (Beddard, 1900; Schevill and Lawrence, 1949; Sjare and Smith, 1986b; Angiel, 1997; Chmelnitsky and Ferguson, 2012). Since their vocalizations were first described by Schevill and Lawrence (1949), vocal repertoires have been described for wild populations from the White Sea (Belikov and Bel'kovitch, 2006; 2007; 2008), the St. Lawrence River (Sjare and Smith, 1986a; b), Churchill River, Manitoba (Chmelnitsky and Ferguson, 2012), Svalbard, Norway (Karlsen et al., 2002), and Bristol Bay, Alaska (Angiel, 1997). Efforts are underway to fully characterize the repertoire of the Cook Inlet, Alaska, beluga population (Hotchkin et al., 2010; Blevins et al., 2012). 
While it is difficult to compare call types between studies due to the high degree of gradation between signals (Recchia, 1994; Angiel, 1997; Chmelnitsky and Ferguson, 2012), every described repertoire contains call types that are shared between populations, including tonal whistles, pulsed tones, and ‘hybrid’ sounds (Karlsen et al., 2002; Chmelnitsky and Ferguson,

2012). These same general categories have also been described for captive beluga populations from the Vancouver, Point Defiance, Shedd, and Mystic Aquariums (Recchia, 1994; Vergara et al., 2010; Hotchkin and Parks, 2011). In both wild and captive environments, belugas are exposed to high levels of natural and/or anthropogenic sounds (Chapter 3, this dissertation), which can potentially mask vocalizations between individuals and reduce communication range (Johnson et al., 1989; Richardson et al., 1995; Erbe et al., 1999). Given the high vocal flexibility and graded repertoires described for most populations, it is unsurprising that this species is also known to respond to increased noise by changing their vocal behavior and the structure of their vocalizations. Belugas from the St. Lawrence River population appear to compensate for increased noise by increasing the amplitude of their sounds in direct proportion to the increase in noise levels (Scheifele et al., 2005), and by changing their call types and increasing the frequencies of their vocalizations (Lesage et al., 1999). However, the correlative nature of these studies and the unknown structure of the belugas’ social groupings make it difficult to determine the noise parameters and other factors to which the whales are responding. Controlling for behavioral state and social structure in noise compensation studies of odontocetes necessitates the use of captive subjects. Beluga whales are one of the few cetacean species to be successfully kept in captivity, where individuals live in small social groups (2 – 12 animals) at aquaria and zoos around the world. Noise is caused by exhibit life support and maintenance systems (Chapter 3, this dissertation), and includes sporadically high levels which may induce vocal modifications in the belugas if they are subject to masking in their relatively small exhibit.
This study examined the effects of increased noise on the acoustic structure of vocalizations in a captive group of three beluga whales at Mystic Aquarium in Mystic, Connecticut.
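The effect of amplitude compensation on signal-to-noise ratio can be sketched with simple decibel arithmetic. The example assumes a linear Lombard response, in which call level rises by a fixed fraction (the slope) of each decibel of added noise; a slope of 0.2 corresponds to the typical magnitude cited earlier in this chapter (2 dB per 10 dB of noise). The function name and the starting signal-to-noise ratio are hypothetical.

```python
def snr_after_noise_increase(initial_snr_db, noise_increase_db, lombard_slope):
    """Signal-to-noise ratio (dB) after a noise increase, given a linear Lombard slope."""
    # Assumption: call level rises by lombard_slope dB per 1 dB of added noise
    call_increase_db = lombard_slope * noise_increase_db
    return initial_snr_db + call_increase_db - noise_increase_db

# Hypothetical call at 20 dB SNR facing a 10 dB rise in noise
print(snr_after_noise_increase(20.0, 10.0, 0.0))  # no compensation: 10.0
print(snr_after_noise_increase(20.0, 10.0, 0.2))  # typical slope: 12.0
print(snr_after_noise_increase(20.0, 10.0, 1.0))  # full compensation: 20.0
```

Under these assumptions, a typical Lombard response recovers only part of the lost signal-to-noise ratio, which is why a high starting ratio matters for the modification to be effective.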

Mystic Aquarium

The beluga population at Mystic Aquarium consists of three whales: two adult females, and one adolescent male. The females, Kela and Naku, were both 27 – 28 years old at the time of recordings, and have lived at Mystic Aquarium since they were collected as juveniles from the wild population near Churchill, Manitoba in 1985 (Sirpenski, G., pers. comm.). The male, Juno, was born at Marineland Canada in 2002, transferred to Sea World Orlando, and sent to Mystic

Aquarium on breeding loan in January 2010. The females are known to have “normal” beluga hearing thresholds (Sirpenski, G., pers. comm.), and all three whales receive regular health checks from trainers and veterinary staff at the aquarium. The Mystic belugas’ vocal repertoire is composed of a wide range of call types, including tonal whistles, pulsed calls, hybrid call types, and echolocation clicks. Usage of call types varies temporally (see Appendix 3 for more detail), with a higher rate of pulsed calls (and more calls overall) recorded during daytime hours (Figure 4-1). While the behavioral functions of the belugas’ sounds are not known, the animals appear to use acoustic signals to communicate with each other inside the exhibit in addition to producing spontaneous vocalizations during interactions with trainers (in air) and when interacting with aquarium visitors at the underwater viewing area (underwater).

Figure 4-1. Use of tonal whistles (red/orange bars) and pulsed/hybrid calls (blue) during day (0600 – 1759) and night (1800 – 0559) by the beluga whales at Mystic Aquarium during November 2010. Call rate and usage of pulsed calls dropped dramatically between day and night; this appeared related to the trainer activities and not sunrise/sunset times. See Appendix 3 for more details on repertoire description and vocal behavior.

Arctic Coast Exhibit

The beluga whale habitat at Mystic Aquarium in Mystic, Connecticut, is an outdoor exhibit composed of three pools (“main”, “med”, and “hold”; Figure 4-2) with a large underwater viewing area. The Main (0 – 5 m depth), Med (2.13 m), and Hold (4.0 m) pools in the beluga exhibit are connected by closeable gates made of metal and Plexiglas. Filtration and other life support systems are housed in an indoor area adjacent to the exhibit, based on cement blocks. Visitor access to the exhibit consists of above- and below-water viewing areas, and includes a regularly used public address system with loudspeakers mounted above the exhibit. The underwater viewing area includes large Plexiglas windows through which visitors may watch and interact with the belugas.

Figure 4-2. Diagram of the Arctic Coast exhibit at Mystic Aquarium with pools labeled. Gates between pools are indicated by hashes; there are two gates between the Main and Hold pools, one between Main and Med, and one between Med and Hold. The underwater viewing area is located underneath the semicircular canopy on the left side of the map. Image courtesy of Mike Osborn, Mystic Aquarium.

The acoustic environment in the Arctic Coast exhibit is dominated by noise from filtration systems and exhibit maintenance (Chapter 3, this dissertation), and varies temporally and spatially within the exhibit. Noise is lowest during nighttime hours, when both human and beluga activities are reduced (Appendix 3), and highest during exhibit maintenance dives, when power scrubbers and vacuums are used to remove debris from the exhibit (Chapter 3, this dissertation). This study used passive acoustic recordings of noise and beluga vocalizations during maintenance dives to evaluate whether the beluga whales housed at Mystic Aquarium modify the acoustic structure of their calls to compensate for increased noise.

Methods

Data Collection

Data collection at the Arctic Coast Exhibit at Mystic Aquarium in Mystic, Connecticut occurred in two phases: Oct. 25 – Nov. 16, 2010, and Oct. 1 – 11, 2011. Increased noise from exhibit maintenance was recorded during both years. During 2011, exhibit equipment was manipulated to reduce noise levels throughout the exhibit.

Acoustic recordings

Acoustic data were collected using a DSG (www.loggerheadinstruments.com) recording unit with attached HTI 96-MIN hydrophone and customizable filters. The frequency response of the recording system was flat (± 3 dB) between 10 Hz and 25 kHz, sufficient to record the expected noise sources in the exhibit as well as most tonal beluga vocalizations. The unit recorded continuously at a sampling rate of either 8 or 50 kHz, with low-pass anti-aliasing filter set at 23.4 kHz and high-pass filter set to ‘off’. Manufacturer calibration of the DSG unit indicated a recording sensitivity of -164 dB, with an added 3 dB gain built into the hardware filters. Differences in sampling rate were the result of hardware malfunctions, and days with 8 kHz sampling rate (26 Oct. – 3 Nov. 2010) were excluded from the analyses.
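The calibration values above determine how a recorded waveform maps to a received sound pressure level. The following is a minimal Python sketch of that conversion, not the processing used in this study. It assumes that the -164 dB sensitivity is referenced to 1 V/µPa (the reference unit is not stated above) and that samples have already been scaled from raw counts to volts; the function name `rms_level_db` and its defaults are illustrative only.

```python
import math


def rms_level_db(samples_volts, sensitivity_db=-164.0, gain_db=3.0):
    """Return the RMS level of a clip in dB re 1 uPa.

    samples_volts  : waveform samples in volts (assumed already scaled)
    sensitivity_db : hydrophone sensitivity, ASSUMED to be dB re 1 V/uPa
    gain_db        : fixed gain added by the recorder hardware, in dB
    """
    if not samples_volts:
        raise ValueError("need at least one sample")
    mean_square = sum(v * v for v in samples_volts) / len(samples_volts)
    v_rms = math.sqrt(mean_square)
    # Voltage level (dB re 1 V) minus the end-to-end system sensitivity.
    return 20.0 * math.log10(v_rms) - (sensitivity_db + gain_db)


# Hypothetical clip: a 1 mV RMS sine wave, ten full cycles.
clip = [math.sqrt(2) * 1e-3 * math.sin(2 * math.pi * k / 100) for k in range(1000)]
level = rms_level_db(clip)
```

Under these assumptions a 1 mV RMS signal corresponds to about 101 dB re 1 µPa, because the combined sensitivity and gain contribute a fixed offset of 161 dB.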

Figure 4-3. Photographs of the DSG unit a) in air, with weights and tether rope attached and b) deployed in the med pool.

The DSG unit was deployed by hand using a tether rope attached to the exhibit wall and 6.8 kg of weight (Figure 4-3). The hydrophone was positioned 1.1 m from the pool bottom; due to the different pool depths, the sensor was 1, 2.8, and 4 m from the water surface in the Med, Hold, and Main pools respectively. During all deployments, the DSG was located in a closed pool. The whales had visual access through the gate, but could not enter the recording pool. In 2010, deployments were designed to analyze the spatial and temporal noise variability in the exhibit. The DSG was therefore deployed alternately in the Med and Hold pools with one hour of recording in the Main pool, and recordings were conducted during both day and nighttime hours. In 2011, deployments were designed to capture noise variability due to experimental manipulation of the exhibit’s refrigeration equipment. The same DSG unit with identical sampling protocol and filter settings was therefore deployed in the Med pool during daylight hours. Restriction of sampling to the med pool allowed for minimal disruptions to the animals and training staff. Recording schedules for both years are shown in Table 4-1.

Table 4-1. Recording schedules for the 2010 and 2011 data collection sessions at Mystic Aquarium. Note that not all dates are consecutive, due to availability of pools for recording. On days when two pools are listed, the DSG was moved between pools during the recording period. A check in the “Dive” column indicates an exhibit maintenance dive occurred on that date; dives on Mondays were performed in the afternoon, and Tuesday dives occurred in the morning. The DSG was in the med pool during all dives.

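Year  Month  Date  Day  Pool       Dive
2010  NOV    6     Sa   HOLD
2010  NOV    7     Su   HOLD
2010  NOV    8     M    MED        ✓
2010  NOV    9     Tu   MED/HOLD   ✓
2010  NOV    10    W    HOLD
2010  NOV    13    Sa   MED
2010  NOV    14    Su   HOLD
2010  NOV    15    M    MAIN/MED   ✓
2010  NOV    16    Tu   MED/HOLD
2011  OCT    1     Sa   MED
2011  OCT    2     Su   MED
2011  OCT    3     M    MED        ✓
2011  OCT    4     Tu   MED
2011  OCT    8     Sa   MED
2011  OCT    9     Su   MED
2011  OCT    11    Tu   MED        ✓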

Behavioral observations

Scan sampling of behavior of all animals in the exhibit was conducted once per minute for the first ten minutes of every hour of daytime acoustic recordings during both years. No nighttime behavioral sampling was conducted. Behavioral sampling was intended to document interactions of whales with trainers and aquarium visitors, stereotyped behavioral patterns, and unusual events that could influence the acoustic behavior of the whales in the exhibit. Because there is no position outside of the exhibit with a full view of the pools, some observations include a ‘miss’ if an animal was out of view for any given scan. These data were used to identify times of training sessions and unusual events (trainers on the deck, etc.) and remove these data from the acoustic analyses.

Vocalization analyses

Acoustic data were browsed using Raven Pro 1.4 to select and analyze vocalizations produced during and immediately prior to exhibit maintenance dives, which increased noise levels in the exhibit. Waveform and spectrogram (1024 point, Hamming window, 75% overlap, frequency resolution 11.7 Hz) views were used simultaneously to determine the duration of calls. Duration was first marked on the spectrogram and then adjusted based on zoomed views of the beginning and end of the selected call. Spectral parameters were measured using the spectrogram and spectrum slice views with identical resolutions. Vocalizations were hand selected by drawing boxes around the calls in each view; minimum, maximum, and peak frequencies were then measured automatically in Raven. Vocalization amplitude was not measured due to uncertainty about the identity and position of the signaling whale. Two vocalization types (Figure 4-4) were selected for analysis based on their prevalence in the data and high proportion in the overall call repertoire (Appendix 3). Flat whistles had a flat frequency contour, harmonic structure, and duration of between 0.25 and 1.5 seconds. CT1 vocalizations were pulsed, with a highly stereotyped shape and distinct sidebands (Chmelnitsky and Ferguson, 2012), and duration between 0.36 and 2.0 seconds.

Figure 4-4. Spectrograms of the two call types selected for analysis. Both call types are highly stereotyped; flat whistles have a harmonic structure and flat frequency contour. CT1 vocalizations are pulsed and have a distinctive shape and clear sidebands. Similar call types have been observed in wild populations, including the group from which the two Mystic females were originally collected (Chmelnitsky and Ferguson 2012).

The two vocalization types were measured slightly differently due to their different spectro-temporal characteristics. The tonal flat whistles were measured using minimum and peak frequencies and duration. CT1 vocalizations were examined using minimum and peak frequencies, duration, and pulse repetition rate. Average pulse repetition rates were calculated using the difference in frequency between the harmonic sidebands (Watkins, 1968). Frequency of each harmonic sideband was measured manually at the temporal midpoint of each CT1 vocalization; the differences between sidebands were averaged to estimate the pulse repetition rate.
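The sideband calculation described above can be illustrated with a short Python sketch. The function name and the sideband frequencies below are hypothetical values chosen for illustration; they are not measurements from this study.

```python
def pulse_repetition_rate(sideband_freqs_hz):
    """Estimate pulse repetition rate (pulses/s) as the mean spacing
    between adjacent harmonic sidebands measured at one point in time."""
    freqs = sorted(sideband_freqs_hz)
    if len(freqs) < 2:
        raise ValueError("need at least two sideband frequencies")
    # Frequency difference between each pair of neighbouring sidebands.
    spacings = [hi - lo for lo, hi in zip(freqs, freqs[1:])]
    return sum(spacings) / len(spacings)


# Hypothetical sideband frequencies (Hz) read at the call midpoint.
example_rate = pulse_repetition_rate([2900.0, 4350.0, 5810.0, 7260.0])
```

For these illustrative values the spacings are 1450, 1460, and 1450 Hz, giving an estimate of about 1453 pulses per second.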

Noise Analyses

A 0.3 second clip of noise was taken from immediately before each selected vocalization to analyze the acoustic environment at the time the vocalization was produced. Noise clips included all sound present in the recording, including previous vocalizations and filtration noise. Noise clips were imported into Matlab® (The MathWorks, Inc.) and analyzed with custom scripts to determine noise levels. Broadband levels were calculated by importing the clip and determining the root-mean-squared level over the entire bandwidth (20 Hz – 25 kHz). Narrowband noise levels were determined by filtering the clip with custom 1/3-octave band filters (using a slight modification of the code presented in Appendix 2) and selecting the bands containing the call characteristic of interest (minimum or peak frequency; Figure 4-5).
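Selecting the 1/3-octave band that contains a given call frequency can be sketched as follows. This is an illustrative Python example rather than the Matlab code referenced above (Appendix 2 is not reproduced here). It assumes base-2 band definitions with a 1 kHz reference center frequency, and the function name `third_octave_band` is hypothetical.

```python
import math


def third_octave_band(freq_hz, ref_hz=1000.0):
    """Return (lower, center, upper) in Hz for the 1/3-octave band
    containing freq_hz.

    ASSUMPTION: base-2 bands, with centers at ref_hz * 2**(n/3) and
    edges at center * 2**(-1/6) and center * 2**(+1/6).
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Index of the nearest band center on a log2 scale.
    n = math.floor(3.0 * math.log2(freq_hz / ref_hz) + 0.5)
    center = ref_hz * 2.0 ** (n / 3.0)
    lower = center * 2.0 ** (-1.0 / 6.0)
    upper = center * 2.0 ** (1.0 / 6.0)
    return lower, center, upper


# Band containing a hypothetical call minimum frequency of 2.19 kHz.
band = third_octave_band(2190.0)
```

With these definitions a 2.19 kHz frequency falls in the band centered at 2 kHz (roughly 1.78 to 2.24 kHz). The narrowband level would then be the RMS level of the noise clip after band-pass filtering between the lower and upper edges.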

Figure 4-5. Illustration of the broad- and narrow-band noise selections used to evaluate short-term vocal modifications in the Mystic Aquarium data. The vocalization is outlined by the blue box, and noise parameters are shown in red. Broadband noise was measured across the entire frequency range (vertical red box; 20 Hz – 25 kHz), while narrowband noise was measured from the intersection of the broadband noise and the 1/3 octave band containing either the peak (shown; shaded red rectangle) or minimum (not shown) frequency of the vocalization.

Statistical Analyses

Statistical analysis was conducted with Minitab version 16. General regression models with associated residual plots (i.e., normal plot of residuals, residuals vs. fits) were used to check the data for consistency with the assumptions of linear regression (Gotelli and Ellison, 2004). For minimum and peak call frequencies, data violated the assumption of homoscedasticity; for these variables, data were natural log transformed and re-examined to ensure that the transformed data complied with the assumptions. Regression models were created for broadband and narrowband noise levels against the spectral and temporal call parameters for both flat whistles and CT1 vocalizations. Comparisons 1 through 3 were used for both call types; comparison 4 was used only for the pulsed CT1 vocalizations.

1. Natural log of minimum call frequency against noise levels from a) broadband and b) 1/3 octave band containing call minimum frequency.

2. Natural log of peak call frequency against noise levels from a) broadband and b) 1/3 octave band containing call peak frequency.

3. Call duration against noise levels from a) broadband, b) 1/3 octave band containing call minimum frequency, and c) 1/3 octave band containing call peak frequency.

4. Average pulse repetition rate against noise levels from a) broadband, b) 1/3 octave band containing call minimum frequency, and c) 1/3 octave band containing call peak frequency.

A categorical factor (“date”) was included in all models to account for the potential for differences in behavioral context between the separate dive events.
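The structure of these models, a log-transformed call parameter regressed on noise level with date as a categorical factor, can be illustrated with a small Python sketch. It is not the Minitab procedure used here. It relies on the standard result that centering both variables on their per-date means yields the same noise slope as an ordinary least-squares fit that includes date as a factor. The function name and all data values are hypothetical.

```python
import math
from collections import defaultdict


def within_date_slope(noise_db, call_freq_hz, dates):
    """Slope of ln(call frequency) on noise level, controlling for date.

    Each variable is centered on its per-date mean before fitting, which
    removes date-specific intercepts (equivalent to a date factor in OLS).
    """
    log_freq = [math.log(f) for f in call_freq_hz]
    groups = defaultdict(list)
    for i, d in enumerate(dates):
        groups[d].append(i)
    x_c = [0.0] * len(dates)
    y_c = [0.0] * len(dates)
    for idx in groups.values():
        mean_x = sum(noise_db[i] for i in idx) / len(idx)
        mean_y = sum(log_freq[i] for i in idx) / len(idx)
        for i in idx:
            x_c[i] = noise_db[i] - mean_x
            y_c[i] = log_freq[i] - mean_y
    sxx = sum(x * x for x in x_c)
    if sxx == 0:
        raise ValueError("noise level does not vary within any date")
    sxy = sum(x * y for x, y in zip(x_c, y_c))
    return sxy / sxx


# Hypothetical data: two dive dates with different baselines and a
# common slope of 0.02 ln(Hz) per dB.
noise = [110.0, 115.0, 120.0, 112.0, 118.0, 124.0]
dates = ["A", "A", "A", "B", "B", "B"]
base = {"A": 7.0, "B": 7.5}
freqs = [math.exp(base[d] + 0.02 * n) for n, d in zip(noise, dates)]
slope = within_date_slope(noise, freqs, dates)
```

Because the response is log transformed, a slope b implies that each 1 dB increase in noise multiplies the call parameter by exp(b). The sketch returns only the slope; the F statistics and adjusted R² values reported in the Results require the full model output.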

Results

Five cleaning dive events were recorded during the data collection periods (2010: N=3; 2011: N=2). A total of 4,025 vocalizations of all call types were recorded during cleaning dives, with a mean ± standard deviation of 805.0 ± 668.3 calls per dive (range: 95 – 1,538). From the dive-event recordings, a total of 95 flat whistles (0 – 59 per dive) and 179 high signal-to-noise ratio CT1 vocalizations (7 – 83 per dive) were selected for analysis.

Noise

Dive noise was intermittent, with sharp changes in amplitude (Figure 4-6). Broadband noise levels ranged from 108 to 127 dB re 1 µPa (median: 116 dB), and there were no differences in the broadband noise levels that preceded flat whistles (range: 109 – 127; median: 117 dB re 1 µPa) and CT1 vocalizations (range: 108 – 125; median: 116 dB re 1 µPa).

Figure 4-6. Spectrogram of the cleaning dive from November 15, 2010 illustrating the intermittent nature of the noise.

Narrowband noise is defined as noise levels within the 1/3 octave bands containing minimum (MinNL) and peak (PeakNL) call frequencies. Narrowband noise levels from clips preceding calls did appear to differ between the CT1 (MinNL: 69 – 102 dB re 1 µPa, median: 86 dB re 1 µPa; PeakNL: 67 – 102 dB re 1 µPa, median: 95 dB re 1 µPa) and flat whistle vocalizations (MinNL: 71 – 102 dB re 1 µPa, median: 72 dB re 1 µPa; PeakNL: 71 – 99 dB re 1 µPa, median: 78 dB re 1 µPa), with CT1 calls occurring following noise with higher narrowband noise levels.

Spectral modifications

The spectral content (natural log transformed) of both flat whistles and CT1 vocalizations varied with noise levels and date, indicating possible interactions between noise characteristics and behavioral state of the signalers with respect to vocal noise compensation strategies.

Flat whistles

For flat whistles, minimum call frequency averaged 2.19 ± 0.81 kHz (range: 0.98 – 6.6 kHz) and was significantly related to both broadband noise levels and date (BBNL: F1,4,90 = 6.56, p = 0.012; Date: F1,4,90 = 5.51, p = 0.001; Adjusted R² = 15.5%). There was no relationship between minimum call frequency and noise level in the 1/3 octave band containing the call minimum frequency (MinNL: F1,4,90 = 0.021, p = 0.89; Date: F1,4,90 = 4.12, p = 0.009; Adjusted R² = 9.3%). Peak frequency of flat whistles (mean ± SD: 3.22 ± 1.45 kHz; range: 0.1 – 8.1 kHz) did not vary with either broadband (BBNL: F1,4,90 = 0.27, p = 0.6; Date: F1,4,90 = 0.49, p = 0.69; Adjusted R² = -2.63%) or narrowband noise levels (PeakNL: F1,4,90 = 1.00, p = 0.32; Date: F1,4,90 = 0.52, p = 0.67; Adjusted R² = -1.81%).

CT1 vocalizations

Minimum frequency of CT1 vocalizations averaged 2.08 ± 1.6 kHz (range: 0.31 – 7.5 kHz) and varied significantly with broadband noise and date (BBNL: F1,4,173 = 7.66, p = 0.006; Date: F1,4,173 = 18.4, p < 0.001; Adjusted R² = 29.9%). There was also a significant relationship between minimum frequency, narrowband noise levels, and date (MinNL: F1,4,173 = 22.7, p < 0.001; Date: F1,4,173 = 18.81, p < 0.001; Adjusted R² = 35.3%). Peak frequency of CT1 vocalizations ranged from 0.54 to 10.9 kHz (mean ± SD: 4.23 ± 2.59 kHz), and was not significantly related to broadband noise level, but was related to date (BBNL: F1,4,173 = 1.24, p = 0.27; Date: F1,4,173 = 18.6, p < 0.001; Adjusted R² = 28.2%). Narrowband noise and date were both significantly related to peak frequency of CT1 vocalizations (PeakNL: F1,4,173 = 106.1, p < 0.001; Date: F1,4,173 = 20.44, p < 0.001; Adjusted R² = 55.19%) (Figure 4-7).

Figure 4-7. Relationship between peak call frequency (natural log-transformed) and narrowband noise levels for CT1 vocalizations. Date is not included as a factor in this model.

Temporal modifications

Flat whistles

The duration of flat whistles (0.55 ± 0.29 s; range: 0.25 – 1.54 s) was significantly related to date in all conditions, but broadband noise level was the only noise measure significantly related to duration (BBNL: F1,4,90 = 6.3, p = 0.014; Date: F1,4,90 = 11.5, p < 0.001; Adjusted R² = 28.3%). Narrowband minimum noise levels were marginally significant (MinNL: F1,4,90 = 3.72, p = 0.056; Date: F1,4,90 = 5.69, p = 0.001; Adjusted R² = 35.3%). There was no relationship between flat whistle duration and narrowband noise levels in the 1/3 octave band containing the peak call frequency (PeakNL: F1,4,90 = 0.01, p = 0.92; Date: F1,4,90 = 7.75, p < 0.001; Adjusted R² = 23.3%). Despite the apparently strong relationship between whistle duration and broadband noise level, when the date term was dropped from the regression models, the relationship was no longer significant and there was no apparent effect of noise level on duration.

CT1 vocalizations

Duration of CT1 vocalizations (0.91 ± 0.36 s; range: 0.37 – 2.02 s) was significantly related to date in all noise conditions, but the only noise level that was significantly related to duration was the narrowband minimum noise measurement (MinNL: F1,4,173 = 4.18, p = 0.042; Date: F1,4,173 = 43.8, p < 0.001; Adjusted R² = 50.5%) (Figure 4-8). Broadband (BBNL: F1,4,173 = 0.32, p = 0.57; Date: F1,4,173 = 9.83, p < 0.001; Adjusted R² = 16.6%) and narrowband peak noise levels (PeakNL: F1,4,173 = 0, p = 0.98; Date: F1,4,173 = 9.35, p < 0.001; Adjusted R² = 16.4%) had no significant relationship to CT1 duration.

Figure 4-8. Relationship between duration of CT1 vocalizations and noise level in the 1/3 octave band containing the minimum frequency of the call. Date is not included as a factor in this model.

The average pulse repetition rate (1,456.2 ± 101.5 pulses/sec; range: 1,180.0 – 2,089.7) was also related to date in all conditions. The relationship between pulse repetition rate and minimum narrowband noise was marginally significant (MinNL: F1,4,173 = 3.71, p = 0.056; Date: F1,4,173 = 10.98, p < 0.001; Adjusted R² = 18.2%), and there was no significant relationship with broadband (BBNL: F1,4,173 = 0.32, p = 0.57; Date: F1,4,173 = 9.83, p < 0.001; Adjusted R² = 16.6%) or peak narrowband noise levels (PeakNL: F1,4,173 = 0, p = 0.98; Date: F1,4,173 = 9.35, p < 0.001; Adjusted R² = 16.4%).

Discussion

The acoustic structures of vocalizations produced by the beluga whales at Mystic Aquarium were significantly related to both noise parameters and to the date the calls were recorded. The correlation of behavioral changes with dive date indicates that the signalers’ behavioral states or environmental context may interact with the acoustic environment to influence vocal modifications. Vocal modifications also differed between call types, possibly indicative of either different behavioral functions for these calls or use of different vocalization types in response to increased noise.

Flat whistles

The only noise-induced change to the acoustic structure of flat whistles was an increase in minimum call frequency during increased broadband noise. Both the minimum frequency and the duration of flat whistles were significantly related to dive date, indicating an effect of behavioral state or context dependency on this call type. The lack of change in the peak frequency of flat whistles is interesting, and may be related to observations of other species of cetaceans mimicking important acoustic stimuli in their environment (Hooper et al., 2006). Marine mammal trainers often use whistles as an acoustic bridge stimulus during training sessions. In the current data, the peak frequencies of many of the recorded beluga whistles are very similar to the Mystic trainers’ 6 kHz whistle (Figure 4-9). If the whales perceive the trainers’ whistles as a meaningful sound and are mimicking it, they may have a strong incentive to maintain the peak frequency even in the presence of high levels of masking noise. An alternative but not mutually exclusive explanation may be more likely. A recent study of the vocal repertoire of the population from which the two Mystic females were collected showed that the wild population also produces a stereotyped flat-contour whistle (W1a) (Chmelnitsky and Ferguson, 2012). It is possible that the Mystic belugas are modifying an existing call type to match socially important stimuli (Tyack, 2008), but the similarity between the whale and trainer whistles may also be coincidental. A future experiment shifting the frequency of the trainer whistles may help separate these possibilities.

Figure 4-9. Spectrograms of whistles produced by the Mystic belugas (left), Mystic trainer whistle (center), and wild belugas in the Churchill, Manitoba population (right). The peak frequencies of the Mystic beluga and trainer whistles are nearly identical, and the overall structure of all three sounds is also similar. It is unclear whether the Mystic belugas are mimicking trainer whistles or whether the similarities between the vocalizations and trainer whistles are coincidental. Churchill River vocalization figure modified from Chmelnitsky and Ferguson (2012) call type W1a.

CT1 vocalizations

The acoustic structure of the pulsed CT1 call type was related to broad- and narrowband noise levels and to dive date, again suggesting a possible interaction between behavioral or environmental influences and noise compensation strategies. Modifications to the minimum and peak frequencies of the CT1 vocalization type were not consistent across noise conditions. The minimum frequency of this call type was significantly related to both broad- and narrowband noise levels. During high broadband noise levels, minimum call frequency decreased, while during high noise in the 1/3 octave band containing the minimum call frequency, call minimum frequencies increased. One possible explanation for this observation is that the 1/3 octave band containing the minimum call frequency was generally low in frequency and therefore likely to contain high levels of noise from the maintenance equipment, which may mask call components with similar spectral characteristics. Broadband noise levels included sound from the entire spectrum, which may be less likely to mask signals between 2 and 7 kHz. The peak frequency of CT1 vocalizations, which generally occurred at the beginning of the call, decreased in relation to increased levels of narrowband noise in the 1/3 octave band containing the peak call frequency. In this noise scenario, with high-amplitude noise covering a wide frequency range, shifting the peak frequency of a CT1 call down would seem counterproductive and unlikely to help the signaler compensate for noise. It is possible that the observed shift in peak frequency is a byproduct of changes to other call characteristics, such as pulse rate or maximum frequency, or that it is more affected by behavioral contexts than by increases in noise level.

Vocal noise compensation

The spectro-temporal modifications observed in this study are unlikely to allow the Mystic beluga whales to compensate for increases in exhibit noise caused by maintenance dives. Given the spectral signature of the noise and the amount of spectral overlap with the flat whistles and CT1 call types, increases in minimum and decreases in peak call frequencies are likely to increase the degree of spectral overlap between the signal and the noise, and actually decrease call detectability. There are several possible explanations for why the Mystic beluga whales appear not to have changed their vocalizations in ways that might have increased communication success during the recorded increases in noise.

Other vocal changes, such as the Lombard effect, and more overt changes to vocal behavior rather than vocalization structure may be more effective at allowing signalers to compensate for noise in this situation. While I was unable to measure the source levels of the vocalizations recorded at Mystic, the widespread nature of this particular adaptation and its presence in a wild beluga population (Scheifele et al., 2005) indicate that the Mystic belugas likely also increased the amplitude of their vocalizations during the maintenance dives. The probable presence of the Lombard effect implies another possible explanation for the seemingly maladaptive modifications observed here. Changes to the spectral and temporal parameters of other mammalian species’ vocalizations have been linked to increases in call amplitude via a psychophysical or biomechanical connection (Lane and Tranel, 1971; Tressler and Smotherman, 2009). If the Mystic belugas were increasing call amplitude during the increases in exhibit noise, the spectro-temporal modifications found here could be a by-product of the Lombard effect, rather than independent modifications. Future studies should take advantage of the availability of captive cetaceans to examine potential linkages between the Lombard effect and other noise-induced vocal modifications, such as those observed in this study.

Other potential changes to vocal behavior which could increase communication success are changes to call types or call timing as a result of increased noise. The observed differences in narrowband noise levels between CT1 vocalizations and flat whistles indicate that this might be occurring in the Mystic exhibit. While I did not test whether the belugas were likely to change their vocalization types during increased noise, changes to call types during increased noise have previously been documented in this species (Lesage et al., 1999) and in several songbirds (Patricelli and Blickley, 2006; Wood and Yezerinac, 2006). Alternatively, the different narrowband noise levels could have been correlated with date and/or a different behavioral context for the whales.

Several species of non-human primates, frogs, and birds appear to use quiet “gaps” in the temporal structure of noise to produce vocalizations that are less likely to be masked (Brumm and Slabbekoorn, 2005; Patricelli and Blickley, 2006; Egnor et al., 2007; Roy et al., 2011), and it is possible that the beluga whales in this study are using a similar strategy. A brief analysis of this possibility suggested that the Mystic belugas are unlikely to preferentially produce calls during quiet gaps: there was no relationship between the broadband noise levels sampled at 1 second intervals and whether or not a call was recorded.
However, given the previously indicated relationships between call structures and narrowband noise, it is still possible that the Mystic belugas are responding to noise level in a particular frequency band and producing calls during low narrowband noise times. A fifth explanation for the apparent lack of vocal noise compensation behaviors is that the whales may be able to communicate effectively within the exhibit without the use of vocal modifications or changes to other aspects of their vocal behavior. There is a limited maximum range between signaler and receivers within the exhibit. In some areas, including the holding pool, visual contact between the whales is possible. During exhibit maintenance, the range between individuals is further reduced when two or all of the whales are confined to the holding pool. Confining the whales to a space where visual contact is possible and the maximum distance between individuals is less than five meters may mean that normal call characteristics are adequate even during the high levels of increased noise.

Context dependency and other confounding factors

In every case where vocalizations were related to noise levels, call structure was also related to the recording date, implying an effect of environmental or behavioral contexts unrelated to the dive noise. The idea that vocal noise compensation can be affected by the motivational and behavioral contexts of the signalers was first noted by Lane and Tranel (1971), who observed that human speakers engaged in an interactive speaking task exhibited a greater magnitude Lombard effect than speakers asked to read a script. This phenomenon also appears to extend to non-humans; in certain behavioral contexts, such as when dependent offspring are present, West Indian manatees (Trichechus manatus) are more likely to modify the structure of their vocalizations (Miksis-Olds and Tyack, 2009). Other behavioral and emotional states may also affect the acoustic structure of calls (Briefer, 2012). In the case of the three whales at Mystic Aquarium, interactions between the young adult male and two mature females have the potential to strongly influence behavioral states, as the females appear to actively avoid the male when possible (pers. obs.). If the female confined to the pool with the male during any given dive was agitated, this could have affected the vocalizations recorded; due to the inability to identify signalers, it would be impossible to distinguish one agitated whale from the pooled vocalizations.

Other factors could also have affected the observed changes to call structure presented here. In particular, due to the sampling design and use of only a single hydrophone, I was unable to identify which animal produced the vocalizations analyzed. Many mammalian species have individual-specific information encoded in their vocalizations (Snowdon et al., 1983; Sousa-Lima et al., 2002; Janik et al., 2006; Sproul et al., 2006).
Among cetaceans, bottlenose dolphins (Tursiops truncatus) emit “signature whistles” which are produced mainly by a single individual and occasionally mimicked by other members of the social group (Janik et al., 2006). While there is currently no evidence that beluga whales produce signature whistles, it is reasonable to assume that some properties of beluga vocalizations contain individually-specific information that may be used by conspecifics to identify signalers (Snowdon et al., 1983; Sousa-Lima et al., 2002; Janik et al., 2006; Sproul et al., 2006).

It is also possible that the noise level received by the whales was different from that at the recorder, which may have confounded my measurements of vocal modifications. The DSG was physically separated from the whales during all recording sessions, and noise varied spatially within the exhibit (Chapter 3, this dissertation). If the whales received higher noise levels than recorded, I may have underestimated the influence of noise on vocal modifications, and vice versa.

Conclusions

The beluga whales at Mystic Aquarium did change the acoustic structure of their vocalizations in relation to increases in exhibit noise, but they did so in an apparently context-dependent fashion that was unlikely to increase communication success, if the whales were indeed noise-limited. Factors that may have affected the presence and types of observed vocal modifications include the limited ranges between signaler and receiver, availability of other vocal noise compensation strategies, and the inability to identify and analyze separately vocalizations produced by different individuals. While this study does not support the hypothesis that the captive beluga whales at Mystic Aquarium use such modifications to vocally compensate for increases in environmental noise, this may be because acoustic communication between the Mystic whales is not noise-limited. This study does confirm that beluga whales are capable of changing the structure of their calls, including highly-stereotyped vocalizations, during increased noise, possibly increasing their chances of successful communication and avoiding the potential fitness consequences of masked communication signals.

References

Angiel, N. M. (1997). "The vocal repertoire of the beluga whale in Bristol Bay, Alaska," M.S. Thesis, University of Washington.

Au, W. W. L., and Hastings, M. C. (2008). Principles of Marine Bioacoustics (Springer Science, New York, NY).

Barber, J. R., Crooks, K. R., and Fristrup, K. M. (2009). "The costs of chronic noise exposure for terrestrial organisms," Trends in Ecology & Evolution 25, 180-189.

Beddard, F. E. (1900). A Book of Whales (G.P. Putnam's Sons, New York, NY).

Bee, M. A., and Micheyl, C. (2008). "The "Cocktail party problem": What is it? How can it be solved? And why should animal behaviorists study it?," Journal of Comparative Psychology 122, 235-251.

Belikov, R. A., and Bel'kovitch, V. M. (2006). "High-pitched tonal signals of beluga whales (Delphinapterus leucas) in a summer assemblage off Solovetskii Island in the White Sea," Acoustical Physics 52, 125-131.

Belikov, R. A., and Bel'kovitch, V. M. (2007). "Whistles of beluga whales in the reproductive gathering off Solovetskii Island in the White Sea," Acoustical Physics 53, 528-534.

Belikov, R. A., and Bel'kovitch, V. M. (2008). "Communicative pulsed signals of beluga whales in the reproductive gathering off Solovetskii Island in the White Sea," Acoustical Physics 54, 115-123.

Blevins, R., Atkinson, S., Lammers, M. O., and Small, R. (2012). "Calling Behavior of Cook Inlet Beluga Whales," in Alaska Marine Science Symposium (Anchorage, AK).

Bradbury, J. W., and Vehrencamp, S. L. (1998). Principles of Animal Communication (Sinauer Associates, Inc, Sunderland, MA).

Briefer, E. F. (2012). "Vocal expression of emotions in mammals: mechanisms of production and evidence," Journal of Zoology, Published online 8 May 2012.

Brumm, H., and Slabbekoorn, H. (2005). "Acoustic Communication in Noise," in Advances in the Study of Behavior, edited by P. J. B. Slater, C. T. Snowdon, T. J. Roper, H. J. Brockmann, and M. Naguib (Academic Press), pp. 151-209.

Brumm, H., and Zollinger, S. A. (2011). "The evolution of the Lombard effect: 100 years of psychoacoustic research," Behaviour 148, 1173-1198.

Chmelnitsky, E. G., and Ferguson, S. H. (2012). "Beluga whale, Delphinapterus leucas, vocalizations from the Churchill River, Manitoba, Canada," The Journal of the Acoustical Society of America 131, 4821-4835.

Egan, J. J. (1967). "Psychoacoustics of the Lombard Voice Reflex," Ph.D. dissertation, Case Western University.

Egan, J. J. (1972). "Psychoacoustics of the Lombard voice response," Journal of Auditory Research 12, 318-324.

Egnor, S. E. R., Wickelgren, J. G., and Hauser, M. D. (2007). "Tracking silence: adjusting vocal production to avoid acoustic interference," Journal of Comparative Physiology A 193, 477-483.

Erbe, C., King, A. R., Yedlin, M., and Farmer, D. M. (1999). "Computer models for masked hearing experiments with beluga whales (Delphinapterus leucas)," The Journal of the Acoustical Society of America 105, 2967-2978.

Francis, C. D., Ortega, C. P., and Cruz, A. (2011). "Different behavioural responses to anthropogenic noise by two closely related passerine birds," Biology Letters 7, 850-852.

Fuller, R. A., Warren, P. H., and Gaston, K. J. (2007). "Daytime noise predicts nocturnal singing in urban robins," Biology Letters 3, 368-370.

Gotelli, N. J., and Ellison, A. M. (2004). A Primer of Ecological Statistics (Sinauer Associates, Inc., Sunderland, MA).

Halfwerk, W., and Slabbekoorn, H. (2009). "A behavioural mechanism explaining noise-dependent frequency use in urban birdsong," Animal Behaviour 78, 1301-1307.

Hooper, S., Reiss, D., Carter, M., and McCowan, B. (2006). "Importance of contextual saliency on vocal imitation by bottlenose dolphins," International Journal of Comparative Psychology 19.

Hotchkin, C. F., and Parks, S. E. (2011). "Noise exposure and acoustic behavior of beluga whales (Delphinapterus leucas) in an outdoor exhibit," in 161st Meeting of the Acoustical Society of America (Seattle, WA), p. 2396.

Hotchkin, C. F., Parks, S. E., and Mahoney, B. A. (2010). "Anthropogenic noise sources and sound production of beluga whales (Delphinapterus leucas) in Cook Inlet, Alaska," [Abstract], in 159th Meeting of the Acoustical Society of America (Baltimore, MD), p. 1727.

Hu, Y., and Cardoso, G. C. (2010). "Which birds adjust the frequency of vocalizations in urban noise?," Animal Behaviour 79, 863-867.

Janik, V., and Slater, P. B. (1997). "Vocal Learning in Mammals," Advances in the Study of Behavior 26, 59-99.

Janik, V. M. (2009). "Acoustic communication in delphinids," Advances in the Study of Behavior Volume 40, 123-157.

Janik, V. M., Sayigh, L. S., and Wells, R. S. (2006). "Signature whistle shape conveys identity information to bottlenose dolphins," Proceedings of the National Academy of Sciences 103, 8293-8297.

Jefferson, T. A., Karczmarski, L., Laidre, K., O’Corry-Crowe, G., Reeves, R. R., Rojas-Bracho, L., Secchi, E. R., Slooten, E., Smith, B. D., Wang, J. Y., and Zhou, K. (2008). "IUCN Red List of Threatened Species. Version 2009.2.: Delphinapterus leucas."

Johnson, C. S., McManus, M. W., and Skaar, D. (1989). "Masked tonal hearing thresholds in the beluga whale," The Journal of the Acoustical Society of America 85, 2651-2654.

Karlsen, J. K., Bisther, A. B., Lydersen, C. L., Haug, T. H., and Kovacs, K. K. (2002). "Summer vocalisations of adult male white whales (Delphinapterus leucas) in Svalbard, Norway," Polar Biology 25, 808-817.

Lane, H., and Tranel, B. (1971). "The Lombard sign and the role of hearing in speech," Journal of Speech, Language, and Hearing Research 14, 677-709.

Lengagne, T. (2008). "Traffic noise affects communication behaviour in a breeding anuran, Hyla arborea," Biological Conservation 141, 2023-2031.

Lesage, V., Barrette, C., Kingsley, M. C. S., and Sjare, B. (1999). "The effect of vessel noise on the vocal behavior of belugas in the St. Lawrence River estuary, Canada," Marine Mammal Science 15, 65-84.

Lombard, E. (1911). "Le signe de l'elevation de la voix," Annales des Maladies de l'Oreille 37.

Love, E. K., and Bee, M. A. (2010). "An experimental test of noise-dependent voice amplitude regulation in Cope's grey treefrog, Hyla chrysoscelis," Animal Behaviour 80, 509-515.

Miksis-Olds, J. L., and Tyack, P. L. (2009). 
"Manatee (Trichechus manatus) vocalization usage in relation to environmental noise levels," The Journal of the Acoustical Society of America 125, 1806-1815. Noren, D. P., Dunkin, R. C., Williams, T. M., and Holt, M. M. (2011). "Energetic cost of behaviors performed in response to vessel disturbance: one link in the population consequences of acoustic disturbance model," in The Effects of Noise on Aquatic Life, edited by A. Hawkins, and A. N. Popper. O'Corry-Crowe, G. M. (2009). "Beluga Whale: Delphinapterus leucas," in Encyclopedia of Marine Mammals (Second Edition), edited by F. P. William, W. Bernd, and J. G. M. Thewissen (Academic Press, London), pp. 108-112. Patricelli, G. L., and Blickley, J. L. (2006). "Avian communication in urban noise: causes and consequences of vocal adjustment," The Auk 123, 639-649. Recchia, C. A. (1994). "Social Behaviour of Captive Belugas, Delphinapterus Leucas," Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA. Richardson, W. J., Greene, C. R. J., Malme, C. I., and Thompson, D. H. (1995). Marine mammals and noise (Academic Press, San Diego, CA). Roy, S., Miller, C. T., Gottsch, D., and Wang, X. (2011). "Vocal control by the common marmoset in the presence of interfering noise," The Journal of Experimental Biology 214, 3619-3629. Scheifele, P. M., Andrew, S., Cooper, R. A., Darre, M., Musiek, F. E., and Max, L. (2005). "Indication of a Lombard vocal response in the St. Lawrence River beluga," The Journal of the Acoustical Society of America 117, 1486-1492. Schevill, W. E., and Lawrence, B. (1949). "Underwater Listening to the White Porpoise (Delphinapterus leucas)," Science 109, 143-144.

101 Schroeder, J., Nakagawa, S., Cleasby, I. R., and Burke, T. (2012). "Passerine Birds Breeding under Chronic Noise Experience Reduced Fitness," PLoS ONE 7, e39200. Sjare, B. L., and Smith, T. G. (1986a). "The relationship between behavioral activity and underwater vocalizations of the white whale, Delphinapterus leucas," Canadian Journal of Zoology 64, 2824 - 2831. Sjare, B. L., and Smith, T. G. (1986b). "The vocal repertoire of white whales, Delphinapterus leucas, summering in Cunningham Inlet, Northwest Territories," Canadian Journal of Zoology 64, 407 - 415. Snowdon, C. T., Cleveland, J., and French, J. A. (1983). "Responses to context- and individual- specific cues in cotton-top tamarin long calls," Animal Behaviour 31, 92-101. Sousa-Lima, R. S., Paglia, A. P., and Da Fonseca, G. A. B. (2002). "Signature information and individual recognition in the isolation calls of Amazonian manatees, Trichechus inunguis (Mammalia: Sirenia)," Animal Behaviour 63, 301-310. Sproul, C., Palleroni, A., and Hauser, M. D. (2006). "Cottontop tamarin, Saguinus oedipus, alarm calls contain sufficient information for recognition of individual identity," Animal Behaviour 72, 1379-1385. Tyack, P. L. (2000). "Functional aspects of cetacean communication," in Cetacean societies: field studies of dolphins and whales, edited by J. Mann, R. C. Connor, P. L. Tyack, and H. Whitehead (University of Chicago Press, Chicago, IL). Tyack, P. L. (2008). "Convergence of Calls as Animals Form Social Bonds, Active Compensation for Noisy Communication Channels, and the Evolution of Vocal Learning in Mammals," Journal of Comparative Psychology 122, 319-331. Tynan, C. T., and DeMaster, D. P. (1997). "Observations and Predictions of Arctic Climatic Change: Potential Effects on Marine Mammals," Arctic 50, 308 - 322. Vergara, V. L., Michaud, R., and Barrett-Lennard, L. G. (2010). "What can Captive Whales tell us About their Wild Counterparts? 
Identification, Usage, and Ontogeny of Contact Calls in Belugas (Delphinapterus leucas)," International Journal of Comparative Psychology 23, 278 - 309. Watkins, W. A. (1968). "The harmonic interval: fact or artifact in spectral analysis of pulse trains”. Technical report: Reference no: 68-13. Woods Hole Oceanographic Institution, Woods Hole, MA. Wilczynski, W., and Brenowitz, E. A. (1988). "Acoustic cues mediate inter-male spacing in a neotropical frog," Animal Behaviour 36, 1054-1063. Wiley, R. H. (2006). "Signal Detection and Animal Communication," in Advances in the Study of Behavior, edited by H. Jane Brockmann, P. J. B. Slater, C. T. Snowdon, T. J. Roper, M. Naguib, and E. W. Katherine (Academic Press), pp. 217-247. Wiley, R. H., and Richards, D. G. (1978). "Physical constraints on acoustic communication in the atmosphere: Implications for the evolution of animal vocalizations," Behavioral Ecology and Sociobiology 3, 69-94. Wood, W. E., and Yezerinac, S. M. (2006). "Song sparrow (Melospiza melodia) song varies with urban noise, " The Auk 123, 650-659.

Chapter 5

Effects of noise on the vocalizations of wild beluga whales in Cook Inlet, Alaska

Abstract

Beluga whales use sound to facilitate social interactions, foraging, and other behaviors. During increased noise, belugas from both wild and captive populations modify the acoustic structure of their vocalizations, possibly in an attempt to compensate for acoustic masking. A hypothesized cause of the non-recovery of the Cook Inlet, Alaska beluga population is that increased anthropogenic noise causes behavioral and physiological changes which have led to increased stress and lower reproductive fitness in the population. This study used data from long-term passive acoustic recorders at two locations in Cook Inlet to investigate whether Cook Inlet belugas exhibit vocal modifications during increased noise. Simultaneous recordings of beluga vocalizations and anthropogenic noise from 21 vessel passages were examined to determine whether belugas at the Kenai and Beluga River mouths exhibited potential vocal noise compensation responses. Broadband and 1/3 octave band noise levels were compared with minimum and peak vocalization frequencies and call duration. Vocal behavior differed between the two datasets, with fewer mid- and high-frequency calls produced during the winter at Kenai River than in summer at Beluga River. When calls were divided into six categories based on frequency content and contour shape, all six categories showed a significant relationship between peak vocalization frequency and the noise level in the 1/3 octave band containing the peak call frequency. Three categories also had significant relationships between peak frequency and the social and/or environmental context of the vocalizations during each noise event (“encounter”). Frequency modulated calls were more likely to vary in duration than were non-modulated calls. Minimum call frequency was not significantly related to either broadband or narrowband noise levels at Beluga River, but was related to narrowband noise levels at the Kenai River site. These findings indicate that Cook Inlet belugas modify their vocalizations in response to both noise level and behavioral context, and that further study is needed to identify the details of vocal noise compensation in this population.

Introduction

The effects of noise on marine mammals are highly controversial, and investigations have often concentrated on the potentially lethal effects of some noise types on certain deep-diving cetaceans (Weilgart, 2007; Parsons et al., 2008). Recent shifts in focus have led to some investigations of the sub-lethal effects of noise exposure that most cetaceans are likely to experience (Richardson et al., 1995; Lesage et al., 1999; Weilgart, 2007; Wright et al., 2007b; Noren et al., 2011; Gervaise et al., 2012), such as overt behavioral changes to migration routes and habitat areas (Richardson et al., 1990), changes to overall activity budgets (Aguilar Soto et al., 2006; Lusseau et al., 2009), and modifications to vocal behavior and acoustic communication (Lesage et al., 1999; Scheifele et al., 2005; Holt et al., 2011; Parks et al., 2011; Chapter 4, this dissertation). While infrequent occurrences of these behavioral changes are unlikely to directly impact individuals’ fitness, chronic noise exposures and responses may cumulatively lead to negative impacts on fitness of exposed individuals (Barber et al., 2009; Noren et al., 2011; Schroeder et al., 2012). Chronic noise exposure is a threat to many populations of cetaceans (Richardson et al., 1995; Weilgart, 2007; Wright et al., 2007b; Wintle, 2009; Noren et al., 2011), and can directly affect the physiology of exposed animals (Wright et al., 2007a). Documented links between noise and physiological stress in both captive and wild cetaceans indicate that marine mammals are not an exception to this phenomenon (Romano et al., 2004; Wright et al., 2007a; b; Rolland et al., 2012). Experimental noise exposures in beluga whales demonstrated that short-term noise exposure can increase levels of stress hormones in blood from non-habituated animals (Romano et al., 2004), but habituation may not always mitigate the physiological stress associated with noise exposure.
Chronically exposed and habituated North Atlantic right whales showed a decrease in fecal glucocorticoid hormone levels during a period with a reduction in low-frequency noise (Rolland et al., 2012).

Indirect effects of chronic noise exposure may also contribute to physiological stress and behavioral modifications. Acoustic masking limits the communication range available to signalers, potentially limiting contact with group members and conspecifics while simultaneously increasing exposure to predators (Brumm and Slabbekoorn, 2005; Barber et al., 2009). Marine mammals that cannot communicate over the majority of their normal habitat during noise exposure (Clark et al., 2009) may be additionally stressed by 1) a loss of contact with social companions, and 2) vocal modifications required to communicate during noise (Noren et al., 2011). Even for vocally plastic species capable of modifying vocalization structure to compensate for noise, chronic exposure and high rates of masking events may contribute to stress reactions and eventually result in physiological and energetic costs for signalers (Wright et al., 2007b; Noren et al., 2011). In addition to the potential physiological and energetic costs of vocal modifications, such changes may also indicate that signalers are responding to an environmental change and are potentially vulnerable to associated threats and stressors (Castellote and Fossa, 2006). Understanding the vocal behavior and noise responses of cetaceans should therefore be a goal in future passive acoustic monitoring and conservation plans.

Cook Inlet belugas

For one population of beluga whales (Delphinapterus leucas) in Cook Inlet, Alaska, chronic noise exposure and associated masking effects have been hypothesized as a contributing cause of the population’s failure to recover from a crash during the 1980s and 1990s (NMFS, 2008). Although over-hunting appears to have been the predominant cause of the crash (Moore and DeMaster, 2000), a long-term trend of stable or decreasing population size despite the cessation of hunting led the federal government to list Cook Inlet belugas as endangered in 2008 (Hobbs et al., 2000; NMFS, 2008; Hobbs et al., 2011). Other potential causes, including illness, non-acoustic habitat degradation, and competition for food with commercial and recreational fishers, have been proposed (NMFS, 2008). There is evidence that increased noise levels induce behavioral changes and vocal modifications in other wild belugas (Lesage et al., 1999; Scheifele et al., 2005); this phenomenon may therefore be an important factor for the Cook Inlet whales, and should be examined more closely.

Cook Inlet is a sub-Arctic estuary in southern Alaska that branches into two arms at the northeastern end (Figure 5-1). The beluga population currently numbers between 200 and 400 individuals (Hobbs et al., 2008; Hobbs et al., 2011), whose seasonal habitat includes coastal waters around Anchorage (Alaska’s largest city, population ~300,000; located in the fork between Knik and Turnagain arms) and, in winter, the lower inlet. Within the inlet, mineral exploration and extraction, construction, commercial shipping, fishing, and recreational boating are common, noisy activities (Blackwell and Greene, 2002). Natural noise in Cook Inlet includes sounds from rain and other weather events, extreme tidal currents (regularly > 7 knots), and seasonal ice presence (Blackwell and Greene, 2002; Hotchkin et al., 2010).

Figure 5-1. Map of Cook Inlet, Alaska, with recorder locations marked with black stars and major rivers indicated by black lines. The two EAR deployments were within the Beluga River and Kenai River outflow zones. Relevant locations are marked with letters: A) Beluga River, B) Little Susitna River, C) Chickaloon Bay, D) Knik Arm, E) Turnagain Arm.

Beluga usage of Cook Inlet habitat areas varies seasonally (Moore et al., 2000; Goetz et al., 2007; Ezer et al., 2008). Summer population centers are generally found in the mid- and upper inlet, between the Beluga and Little Susitna Rivers, in Chickaloon Bay, and in upper Knik Arm (Speckman and Piatt, 2000; Goetz et al., 2007; Hobbs et al., 2011). These areas are subject to increased noise from commercial shipping, fishing, military, and construction activities (common in the areas closer to Anchorage), and mineral exploration and extraction activities in the mid-inlet southwest of Beluga River (Blackwell and Greene, 2002). Surveys of habitat usage by Cook Inlet belugas reported reduced sightings in lower Cook Inlet during and after the population crash, indicating a range contraction and possible habitat abandonment (Moore et al., 2000; Speckman and Piatt, 2000; Rugh et al., 2010). However, recent evidence suggests that this habitat area is still occupied during the winter when the upper inlet is typically ice covered (Hobbs et al., 2005). Noise sources in this area include recreational boating (mainly summer months), transiting ships, and oil and gas support vessels, and the same natural sources found in the upper inlet.

To assess beluga distribution and noise exposure in Cook Inlet, autonomous bottom-mounted Ecological Acoustic Recorders (EARs) (Lammers et al., 2008; Lammers et al., In review) were placed in several locations in Cook Inlet by a collaborative group of researchers based at the University of Hawaii (M. Lammers), the Alaska Department of Fish and Game (R. Small), the National Marine Mammal Laboratory (M. Castellote) and the University of Alaska Southeast (S. Atkinson, R. Blevins). This study used data from the mouths of the Beluga and Kenai rivers to assess whether Cook Inlet belugas change the structure of their acoustic signals during increases in anthropogenic noise.

Repertoire classifications

The wide range of vocalizations produced by beluga whales has earned them the name “canaries of the sea” (Beddard, 1900). Vocal repertoire descriptions are available for many populations (Sjare and Smith, 1986; Angiel, 1997; Karlsen et al., 2002; Belikov and Bel'kovitch, 2006; 2007; 2008; Chmelnitsky, 2010; Panova et al., 2012) and captive groups (Recchia, 1994; Kelley, 2010; Vergara et al., 2010), and illustrate that despite the extreme geographic and genetic isolation of some populations (O'Corry-Crowe, 2009), many call types are shared in all groups. An objectively classified repertoire for the Cook Inlet beluga population is not available at the time of this writing (Blevins, R., pers. comm.), but this population does exhibit some whistles, pulsed calls, and hybrid pulsed/tonal call types that are found in other groups (unpublished data, C. Hotchkin; see Appendix 1). Call types that appear to be common across populations include (but are not limited to) stereotyped flat-contoured whistles, frequency modulated whistles with shallow or deep waves, and a low-frequency contour with a convex portion and flat tail (Figure 5-2) (Sjare and Smith, 1986; Angiel, 1997; Karlsen et al., 2002; Belikov and Bel'kovitch, 2007; Chmelnitsky and Ferguson, 2012). Spectrograms of additional vocalization types recorded in Cook Inlet during a previous project can be found in Appendix 1.


Figure 5-2. Spectrograms of call types shared between the Cook Inlet beluga population and the Churchill River population. Names and images adapted from Chmelnitsky and Ferguson (2012).

Vocal modifications

Vocal noise compensation mechanisms include changes to the acoustic structure of vocalizations. Changes to call amplitude, temporal characteristics, and spectral parameters during increased noise have been observed from a wide variety of species (Brumm and Slabbekoorn, 2005; Brumm and Zollinger, 2011), including wild beluga whales from the St. Lawrence River population (Lesage et al., 1999; Scheifele et al., 2005) and captive belugas at Mystic Aquarium (Chapter 4, this dissertation). The apparent effect of most vocal adjustments is to maintain or increase the signal-to-noise ratio of all or part of a vocalization, which may allow signalers to compensate for the detrimental effects of noise on the propagation of vocalizations (Lane and Tranel, 1971; Brumm and Slabbekoorn, 2005). The most common vocal modification, known as the Lombard effect, is an increase in vocalization amplitude proportional to increases in noise level (Lombard, 1911; Lane and Tranel, 1971; Brumm and Zollinger, 2011). While all mammals and birds that have been explicitly tested have shown evidence for this response (Hotchkin and Parks, In review), it is difficult to study in the wild when the distance from the signaler to the recording instrument can be variable and impossible to determine in some cases.

Other vocal modifications include changes to the spectro-temporal parameters of acoustic signals (Patricelli and Blickley, 2006). The relationships between noise and spectral and temporal call characteristics are less consistent between individuals and species (Hu and Cardoso, 2010; Francis et al., 2011a) and can be dependent on behavioral context (Miksis-Olds and Tyack, 2009; Tressler and Smotherman, 2009; Holt et al., 2011), complicating interpretation of observed changes. However, increases in call frequency when vocalizations are overlapped by increased low-frequency noise are commonly observed in both birds and marine mammals (Lesage et al., 1999; Halfwerk and Slabbekoorn, 2009; Hu and Cardoso, 2010; Francis et al., 2011b), and changes to call or call-subunit duration have also been observed. Long-term passive acoustic recordings with no concurrent visual observations during this study precluded examinations of call amplitude, but this methodology is well-suited to examining potential noise-induced changes to call frequencies and durations. Given the vocal flexibility of beluga whales and the demonstrated effects of noise on other belugas’ vocalizations, I predicted that the Cook Inlet whales would modify both the spectral and temporal contents of their vocalizations in order to reduce masking during increased noise.

Methods

Data collection and processing

Data were collected by a collaborative group of researchers based at the University of Hawaii (M. Lammers), the Alaska Department of Fish and Game (R. Small), the National Marine Mammal Laboratory (M. Castellote) and the University of Alaska Southeast (S. Atkinson, R. Blevins). Ten Ecological Acoustic Recorders (EARs) (Lammers et al., 2008) were deployed in Cook Inlet beginning in June 2009, with the goal of documenting the presence of beluga whales and other marine mammals and evaluating natural and anthropogenic noise at sites throughout the inlet. The data analyzed for this chapter were recorded between June and September 2010 at the mouth of the Beluga River (BR) and from December 2010 to May 2011 at the mouth of the Kenai River (KR). Complete details of EAR deployments and hardware configurations can be found in Lammers et al. (2008; In review). The EAR units were deployed on bottom-mounted moorings at 5 – 10 m depth, and set for a 10% duty cycle (30 s recorded every 5 min) with a sampling rate of 25 kHz. The recording unit had a flat frequency response (± 3 dB) between 10 Hz and 50 kHz, which is appropriate for capturing the majority of non-echolocation beluga vocalizations. The manufacturer-calibrated sensitivity of the recording system was -193.5 dB re 1 V/µPa, with a total added gain of 47.5 dB. The detection radius of the units was estimated using a synthesized 10 – 12 kHz broadcast at 140 dB re 1µPa at two sites, giving a conservative estimate of 1.5 – 2.5 km detection range for beluga vocalizations in quiet conditions (Lammers et al., In review).

Acoustic data were manually browsed by collaborators at the University of Hawaii using long-term spectral averages to identify beluga vocalizations (“encounters”) in the recordings (Lammers et al., In review). Beluga vocalizations were differentiated from other marine mammal sounds (killer whale (Orcinus orca) and harbor seal (Phoca vitulina)) based on aural impressions and similarity to known beluga call types. Durations and times of beluga encounters were forwarded to C. Hotchkin. Dates with at least one detected beluga vocalization were fully browsed by C. Hotchkin to confirm presence and quality of vocalizations and to investigate the acoustic environment at the recorder during beluga encounters. Anthropogenic noise events in the recordings were identified by visually and aurally browsing all days on which beluga vocalizations were detected. Vessel traffic was identified using characteristic engine and propeller sounds occurring simultaneously with relatively short-duration, high-amplitude noise events (Figure 5-3).
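The recorder settings quoted above (duty cycle, sampling rate, sensitivity, and gain) determine both the usable bandwidth and the conversion from recorded voltage to sound pressure level. The following is a minimal sketch of that arithmetic, not the calibration procedure actually used for the EARs. It assumes the stated sensitivity is a hydrophone sensitivity in dB re 1 V/µPa and that the added gain simply sums with it; the function names and the 0.01 V input are hypothetical and chosen only for illustration.

```python
import math

# Values quoted in the text for the EAR deployments.
RECORD_S = 30.0          # seconds recorded per cycle
CYCLE_S = 5 * 60.0       # seconds per duty-cycle period
SAMPLE_RATE_HZ = 25_000  # sampling rate
SENSITIVITY_DB = -193.5  # ASSUMED to be hydrophone sensitivity, dB re 1 V/uPa
GAIN_DB = 47.5           # total added gain, dB


def duty_cycle(record_s: float = RECORD_S, cycle_s: float = CYCLE_S) -> float:
    """Fraction of time the recorder is on."""
    return record_s / cycle_s


def nyquist_hz(sample_rate_hz: float = SAMPLE_RATE_HZ) -> float:
    """Highest frequency that can be represented at this sampling rate."""
    return sample_rate_hz / 2.0


def received_level_db(v_rms: float,
                      sensitivity_db: float = SENSITIVITY_DB,
                      gain_db: float = GAIN_DB) -> float:
    """RMS sound pressure level (dB re 1 uPa) from an RMS voltage in volts."""
    system_sensitivity_db = sensitivity_db + gain_db  # dB re 1 V/uPa
    return 20.0 * math.log10(v_rms) - system_sensitivity_db


if __name__ == "__main__":
    print(f"duty cycle: {duty_cycle():.0%}")
    print(f"Nyquist frequency: {nyquist_hz() / 1000:.1f} kHz")
    # 0.01 V RMS is a hypothetical input, used only to show the conversion.
    print(f"level for 0.01 V RMS: {received_level_db(0.01):.1f} dB re 1 uPa")
```

The Nyquist frequency of 12.5 kHz matches the upper limit of the analysis band used throughout this chapter.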

Figure 5-3. 24 hour spectrogram (512 point FFT, Hanning window, 25% overlap) from the Beluga River recording unit illustrating the prevalence of anthropogenic noise sources at the site. Five vessel passages were recorded on this day; belugas were detected during and after the last vessel passage.

Vocalization analyses

Beluga encounters with good-quality vocalizations were assigned an analysis priority (1 = high, 3 = low) based on quality of beluga vocalizations (signal-to-noise ratios, number of calls, etc.) and noise sources present during the encounter. Vocalizations that occurred at the same time as a vessel passage were given highest analysis priority, due to the opportunity to investigate a range of noise levels with relative certainty of the group composition and behavioral state remaining constant throughout the encounter. Encounters without a vessel passage or other noise source were not used to investigate noise-induced vocal modifications.

Encounters selected for analysis of vocal modifications were browsed for beluga vocalizations in Raven 1.4 (Cornell Lab of Ornithology). Waveform, spectrogram (1024 point, Hamming window, 75% overlap, frequency resolution 11.7 Hz, after Vergara et al., 2010), and spectrum views were used to visually browse each file and select up to five calls from each 30 s sound file. Due to the variability in call types and qualities, two criteria were used in selecting calls: 1) calls for which the entire frequency contour was not clearly visible were excluded from analyses, and 2) calls that were temporally overlapped by other vocalizations were included only if the analyst was sure that the overlapping calls were not produced by the same individual. In the case of overlapping vocalizations, only the call with the highest signal-to-noise ratio (SNR) was selected for analysis. Broadband pulsed and noisy vocalizations were excluded from analyses due to low sample sizes at both sites.

Vocalizations selected for analysis were measured using the spectrogram and spectrum slice views in Raven 1.4. Call duration and minimum and maximum frequencies were measured from the spectrogram view; peak frequency was determined from the spectrum slice view.
If multiple call harmonics were visible, all harmonics were included in the selections; peak frequency was taken from the entire call and minimum frequency from the fundamental frequency. Such calls were noted as harmonic in the annotation column.

Due to the difficulty of objectively categorizing beluga whales’ call types (Recchia, 1994), the lack of a published repertoire for Cook Inlet belugas, and the complexity of determining the behavioral function of call types, analyses were first performed on all selected vocalizations. Vocalizations were then divided into broad frequency (low: 0 – 4 kHz, mid: 4 – 8 kHz, and high: 8 – 12.5 kHz) and contour (flat (CW) and frequency modulated (FM)) categories based on the peak frequency measured from each call and a subjective impression of the contour shape (Figure 5-4). These categories were used to determine whether whales were more likely to change calls that were significantly overlapped by noise, without addressing the behavioral contexts of stereotyped call contours.
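The six call categories described above follow from two labels per call: a frequency band assigned from the measured peak frequency, and a contour label assigned by the analyst. A minimal sketch of that bookkeeping is given below. The function names are hypothetical, the contour label is taken as an input because it was a subjective judgment, and the treatment of calls falling exactly on a band boundary (4 or 8 kHz) is an assumption, since the text does not specify it.

```python
def frequency_category(peak_khz: float) -> str:
    """Assign the broad frequency category from peak frequency in kHz."""
    if not 0.0 <= peak_khz <= 12.5:
        raise ValueError("peak frequency outside the 0 - 12.5 kHz analysis band")
    # ASSUMPTION: a call exactly on a boundary goes to the higher band.
    if peak_khz < 4.0:
        return "low"
    if peak_khz < 8.0:
        return "mid"
    return "high"


def call_category(peak_khz: float, contour: str) -> str:
    """Combine frequency band and analyst-assigned contour (FM or CW)."""
    if contour not in ("FM", "CW"):
        raise ValueError("contour must be 'FM' or 'CW'")
    return f"{frequency_category(peak_khz)}-{contour}"


if __name__ == "__main__":
    # Hypothetical calls given as (peak frequency in kHz, contour label).
    for peak, contour in [(1.9, "FM"), (6.5, "CW"), (10.0, "FM")]:
        print(peak, contour, "->", call_category(peak, contour))
```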

Figure 5-4. Examples of vocalizations from low (A) and medium (B, C) frequency call categories. A) and B) are examples of frequency modulated (FM) calls, while C) is a flat contour (CW) vocalization.

Vocalizations recorded from each site were compared to examine whether there were differences in the types or characteristics of calls recorded at the Beluga and Kenai river sites. Due to differences in vocalizations recorded at the BR and KR sites, the subsequent analyses of noise levels and noise-induced vocal modifications were conducted separately for each site. Analysis of vocal modifications was limited to low and medium frequency calls from BR and to low frequency calls from KR due to low sample sizes of high frequency calls at both sites and mid frequency calls at KR.

Noise Analyses

A 0.3 second clip of noise was taken from immediately before each selected vocalization to analyze the acoustic environment at the time the vocalization was produced. Noise clips included all sound present in the recording, including previous vocalizations, ship noise, and flow noise. Noise clips were imported into Matlab® (The MathWorks, Inc.) and analyzed with custom scripts to determine noise levels. Broadband levels were calculated as the root-mean-squared level of the clip over the entire bandwidth (20 Hz – 12.5 kHz). Narrowband noise levels were determined by filtering the clip with custom 1/3-octave band filters (using a slight modification of the code presented in Appendix 2) and selecting the bands containing the call characteristic of interest (minimum or peak frequency; Figure 5-5).
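The two noise measures described above can be sketched as follows. This is an illustration only and is not the Matlab code from Appendix 2: instead of time-domain 1/3-octave filters, it sums discrete Fourier transform power over the bins that fall inside the band, a swapped-in technique chosen so the example needs only the Python standard library. It assumes the clip has already been calibrated to pressure in µPa and that bands are base-2 bands referenced to 1 kHz; the function names are hypothetical.

```python
import cmath
import math


def rms_level_db(pressure_upa):
    """Broadband RMS level in dB re 1 uPa for samples given in uPa."""
    mean_square = sum(p * p for p in pressure_upa) / len(pressure_upa)
    return 10.0 * math.log10(mean_square)


def third_octave_band(freq_hz):
    """Return (lower, centre, upper) in Hz for the band containing freq_hz.

    ASSUMPTION: base-2 one-third-octave bands referenced to 1 kHz.
    """
    n = round(3.0 * math.log2(freq_hz / 1000.0))
    centre = 1000.0 * 2.0 ** (n / 3.0)
    return centre * 2.0 ** (-1.0 / 6.0), centre, centre * 2.0 ** (1.0 / 6.0)


def band_level_db(pressure_upa, sample_rate_hz, freq_hz):
    """Level (dB re 1 uPa) in the 1/3-octave band containing freq_hz.

    Sums one-sided DFT power over the bins inside the band rather than
    applying a time-domain band-pass filter.
    """
    n_samples = len(pressure_upa)
    lower, _, upper = third_octave_band(freq_hz)
    bin_hz = sample_rate_hz / n_samples
    k_min = max(1, math.ceil(lower / bin_hz))
    # Keep bins strictly below both the upper band edge and Nyquist.
    k_max = min((n_samples - 1) // 2, math.ceil(upper / bin_hz) - 1)
    mean_square = 0.0
    for k in range(k_min, k_max + 1):
        w = -2j * math.pi * k / n_samples
        x_k = sum(p * cmath.exp(w * n) for n, p in enumerate(pressure_upa))
        # Factor 2 folds the negative-frequency half into the positive half.
        mean_square += 2.0 * abs(x_k) ** 2 / n_samples ** 2
    return 10.0 * math.log10(mean_square)
```

For a call with a peak frequency of 1.9 kHz, `third_octave_band(1900.0)` selects the band centred on 2 kHz (roughly 1.78 – 2.24 kHz), and `band_level_db` would then be applied to the 0.3 s clip preceding that call. A direct DFT is slow for long clips; an FFT library would be the practical choice.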

Figure 5-5. Illustration of the broad- and narrow-band noise selections. The vocalization is outlined by the blue box, and noise parameters are shown in red. Broadband noise was measured across the entire frequency range (vertical red box; 20 Hz – 12.5 kHz), while narrowband noise was measured from the intersection of the broadband noise and the 1/3 octave band containing either the peak (shown; shaded red rectangle) or minimum (not shown) frequency of the vocalization.

Statistical analyses

Statistical analysis was conducted with Minitab version 16. General regression models with associated residual plots (i.e., normal plot of residuals, residuals vs. fits) were used to check the data for consistency with the assumptions of linear regression (Gotelli and Ellison, 2004). In several cases, data violated the assumption of homoscedasticity; for these cases, data were natural log transformed and re-examined to ensure that the transformed data complied with the assumptions. Regression models were created for broad and narrowband noise levels against the spectral and temporal call parameters for each of the vocalization classes. Comparisons included:

1. Minimum call frequency against noise levels from a) broadband and b) 1/3 octave band containing call minimum frequency
2. Peak call frequency against noise levels from a) broadband and b) 1/3 octave band containing call peak frequency
3. Call duration against noise levels from a) broadband, b) 1/3 octave band containing call minimum frequency, and c) 1/3 octave band containing call peak frequency

The first set of regression models compared call characteristics from data pooled over call categories and sites with noise parameters. High statistical significance of “site/season” as a factor in these models prompted separate analysis of data from BR and KR. Encounter ID was included as a categorical predictor in all subsequent models.
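The structure of these models, a call parameter regressed on noise level with encounter as a categorical predictor and an optional natural log transform, can be sketched as below. This is not the Minitab analysis itself. It assumes an additive model with no interaction terms, in which case centering both variables within each encounter gives the same least-squares slope as fitting one intercept per encounter. It returns only the slope and performs no significance test. The function name and input layout are hypothetical.

```python
import math
from collections import defaultdict


def within_encounter_slope(rows, log_transform=True):
    """Least-squares slope of a call parameter on noise level.

    rows: iterable of (encounter_id, noise_db, call_value) tuples.
    Encounter enters as a categorical predictor by centering noise level
    and response within each encounter before pooling (ASSUMES an
    additive model with a common slope across encounters).
    """
    groups = defaultdict(list)
    for encounter_id, noise_db, call_value in rows:
        # Natural log transform, as used when homoscedasticity was violated.
        y = math.log(call_value) if log_transform else call_value
        groups[encounter_id].append((noise_db, y))

    sxy = sxx = 0.0
    for pairs in groups.values():
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        for x, y in pairs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0.0:
        raise ValueError("noise level does not vary within any encounter")
    return sxy / sxx
```

With `log_transform=True`, the slope is the change in the natural log of the call parameter per dB of noise, so a slope of 0.01 corresponds to roughly a 1% increase per dB.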

Results

A total of 606 vocalizations from 21 encounters were selected for analysis; calls were analyzed from 16 BR encounters (N = 384 calls) that occurred between 12 June and 8 August 2010, and 5 KR encounters (N = 221 calls) from 15 January to 18 March 2011. In some cases, there were two or more analyzed encounters on a single day; in others, analyzed encounters were separated by up to six weeks. Site-specific differences in the production of call types (FM/CW; low, medium, high) were detected. At the BR site, low-frequency vocalizations made up 64.4% of the total calls; at the KR site, 98.2% of calls were low-frequency (Table 5-1). Rates of FM and CW call types at the two sites were similar, with 63.6 and 36.4% at BR and 59.7 and 40.3% at KR, respectively. High-frequency vocalizations were rare at both sites, but more common in the Beluga River data (Figure 5-6).


Table 5-1. Percentages of calls from each category at the two Cook Inlet recording locations. Kenai River vocalizations were concentrated in the low-frequency (0 – 4 kHz) band, while Beluga River vocalizations were more evenly distributed.

Frequency band        Call type                   Beluga River   Kenai River

Low (0 – 4 kHz)       Total                       64.4%          98.2%
                      Frequency modulated (FM)    38.4%          58.4%
                      Flat contour (CW)           26.0%          39.8%

Med (4 – 8 kHz)       Total                       23.6%          0.9%
                      Frequency modulated (FM)    16.4%          0.9%
                      Flat contour (CW)           7.3%           0.0%

High (8 – 12.5 kHz)   Total                       11.9%          0.9%
                      Frequency modulated (FM)    8.8%           0.5%
                      Flat contour (CW)           3.1%           0.5%
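The overall FM and CW rates at each site can be recovered by summing the per-band entries of Table 5-1. The short check below does this arithmetic; the dictionary layout and function name are illustrative only. The sums are 63.6% FM and 36.4% CW at Beluga River, and 59.8% FM and 40.3% CW at Kenai River, where small departures from 100% reflect rounding in the tabulated values.

```python
# Per-band percentages from Table 5-1 as (Beluga River, Kenai River).
TABLE_5_1 = {
    "low": {"FM": (38.4, 58.4), "CW": (26.0, 39.8)},
    "med": {"FM": (16.4, 0.9), "CW": (7.3, 0.0)},
    "high": {"FM": (8.8, 0.5), "CW": (3.1, 0.5)},
}


def contour_totals(table):
    """Sum FM and CW percentages across frequency bands for each site."""
    totals = {"FM": [0.0, 0.0], "CW": [0.0, 0.0]}
    for band in table.values():
        for contour, (beluga_river, kenai_river) in band.items():
            totals[contour][0] += beluga_river
            totals[contour][1] += kenai_river
    # Round to one decimal place, matching the precision of the table.
    return {c: (round(v[0], 1), round(v[1], 1)) for c, v in totals.items()}


print(contour_totals(TABLE_5_1))
```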

Figure 5-6. High-frequency calls produced at the BR recording site during a vessel passage. One mid-frequency call is visible at approximately 23 s. This type of vocal behavior, with consistent high-frequency vocalizations, was observed only once; more typical vocalizations ranged between 1 and 8 kHz, with minimal harmonic structure, even during increased noise. Broadband noise level measured during this recording was 114.9 dB re 1 µPa.

Noise levels

Broadband noise levels immediately preceding vocalizations ranged from 94 to 116 dB re 1 µPa (median: 103 dB re 1 µPa) in the Beluga River data. For the same vocalizations, 1/3 octave band noise levels ranged from 69 to 93 dB re 1 µPa (median: 76 dB re 1 µPa). In the Kenai River data, broadband noise levels were slightly lower, ranging from 93 to 104 dB re 1 µPa (median: 96 dB re 1 µPa), with narrowband noise ranging between 63 and 87 dB re 1 µPa (median: 72 dB re 1 µPa).

Call characteristics

Over all noise levels, low-frequency FM calls from the Beluga River dataset had an average minimum frequency (± standard deviation; range; median) of 1.6 kHz (± 1.1; 0.37 – 3.8; 1.3 kHz). At the Kenai River, low FM calls averaged 1.0 kHz (± 0.53; 0.33 – 3.5; 0.81 kHz) minimum frequency. The average minimum frequency of the mid-FM calls at the BR site was 5.5 kHz (± 1.5; 1.9 – 7.8; 6.0 kHz). Peak frequencies of the BR FM calls averaged 1.9 kHz (± 1.1; 0.49 – 3.9; 1.7 kHz) and 6.5 kHz (± 1.3; 4.0 – 7.9; 6.8 kHz) for low and mid-frequency calls, respectively. For the KR data, low-frequency FM calls had average peak frequencies of 1.2 kHz (± 0.54; 0.47 – 3.6; 0.99 kHz). Average duration of low frequency BR and KR calls was 0.78 sec (± 0.46; 0.14 – 2.0; 0.64 sec) and 0.82 sec (± 0.49; 0.16 – 3.4; 0.74 sec), respectively. Mid-frequency call duration from the BR site was 0.80 sec (± 0.46; 0.11 – 2.28; 0.75 sec).

Low frequency flat-contoured (CW) calls at the BR site had an average minimum frequency of 1.6 kHz (± 0.86; 0.41 – 3.8; 1.5 kHz); at the KR site, this call category averaged 1.3 kHz (± 0.71; 0.38 – 3.3; 1.4 kHz). Mid-frequency BR calls averaged 5.4 kHz (± 1.5; 1.6 – 7.6; 6.1 kHz). The average peak frequency of CW calls from the BR was 1.9 kHz (± 1.1; 0.49 – 3.9; 1.7 kHz) for low and 6.5 kHz (± 1.3; 4.0 – 8.0; 6.8 kHz) for mid-frequency calls. At the KR site, peak frequency of low CW calls was 1.2 kHz (± 0.54; 0.47 – 3.6; 1.0 kHz).

Duration of CW calls averaged 0.69 sec (± 0.38; 0.14 – 1.9; 0.62 sec) and 0.67 sec (± 0.36; 0.18 – 2.0; 0.59 sec) for low frequency calls at the BR and KR sites, respectively. Mid-frequency calls from the BR site had an average duration of 0.91 sec (± 0.35; 0.27 – 1.57; 0.89 sec).

Site Differences

Analysis of the data pooled across sites and call categories indicated that while there were significant relationships of minimum and peak frequencies with broad and narrowband noise levels, there were also highly significant relationships with the site/season factor (Table 5-2). There were no significant relationships between call duration and site or noise levels. Because of the strong site/season effect on call frequencies, data were analyzed separately for the two sites.

Table 5-2. Regression statistics for data pooled across sites and call categories. Site/season was significant for all comparisons with minimum and peak call frequencies. There were no significant relationships with duration.

Model     Term           Statistic   df (NL, Site, error)   Min Freq   Peak Freq   Duration

Min NL    Min NL         F           1,1,603                12.40      NA          0.06
                         p                                  < 0.001    NA          0.800
          Site/season    F                                  65.60      NA          0.29
                         p                                  < 0.001    NA          0.590
          Adjusted R²                                       17.62%     NA          -0.28%

Peak NL   Peak NL        F           1,1,603                NA         2.71        0.13
                         p                                  NA         0.100       0.720
          Site/season    F                                  NA         111.99      0.35
                         p                                  NA         < 0.001     0.550
          Adjusted R²                                       NA         20.90%      -0.27%

BB NL     BB NL          F           1,1,603                35.70      27.35       0.23
                         p                                  < 0.001    < 0.001     0.630
          Site/season    F                                  25.90      48.04       0.01
                         p                                  < 0.001    < 0.001     0.930
          Adjusted R²                                       20.62%     23.99%      -0.26%
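The negative adjusted R² values reported for duration in Table 5-2 are not an error: the adjustment penalizes each predictor, so a model that explains almost none of the variance falls below zero. The sketch below uses the standard adjustment formula; the sample size of 606 and the two predictors are inferred from the reported degrees of freedom (1, 1, 603) and are an assumption about how the models were specified.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    denom = n_obs - n_predictors - 1
    if denom <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / denom


# With n = 606 and p = 2 (noise level and site/season), a model with a raw
# R-squared of zero has a slightly negative adjusted value.
print(f"{100 * adjusted_r2(0.0, 606, 2):.2f}%")
```

Under these assumptions, adjusted values near -0.3% correspond to raw R² values very close to zero, consistent with the absence of any relationship between duration and the predictors.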

Spectral Modifications

Vocal behavior also appeared to vary as a result of noise level and by individual encounters. Vocal responses were not consistent across call type categories (CW/FM, low, medium, high frequencies), but in five of six categories, noise level in the 1/3 octave band containing the peak frequency of the call was significantly related to the peak call frequency (Table 5-3, Figure 5-7).

Table 5-3. Regression relationships between the peak frequency of vocalizations and the noise level in the 1/3 octave band containing the call’s peak frequency. EID is a categorical factor relating to the individual encounter during which each call was recorded. Five of the six call categories have significant relationships between peak call frequency and narrowband noise level (BR FM medium was marginal, p = 0.051); three of the call categories have significant relationships with encounter ID.

Site   Call category   df (NL, EID, error)   Peak NL F   Peak NL p   EID F   EID p   Adjusted R²

BR     CW low          1,14,84               11.92       0.001       2.01    0.027   13.33%
BR     CW medium       1,15,131              11.76       0.001       2.47    0.003   12.61%
BR     FM low          1,10,16               14.47       0.002       3.19    0.019   46.27%
BR     FM medium       1,15,46               4.03        0.051       1.26    0.267   9.53%
KR     CW low          1,4,82                23.41       < 0.001     1.84    0.130   25.20%
KR     FM low          1,4,123               5.97        0.016       1.79    0.136   6.41%


Figure 5-7. Natural log-transform of peak call frequencies for low-frequency CW calls in the Kenai River data against noise level in the 1/3 octave band containing the peak call frequency. Noise clips from the same encounter tended to have very similar 1/3 octave band levels. Encounter ID was not significant in this analysis.

Kenai River site

Significant relationships between the minimum frequency of vocalizations and narrowband noise levels were observed only at the Kenai River site (Figure 5-8). The minimum frequency of low-frequency CW calls was related to the noise level in the 1/3 octave band containing the minimum frequency of the call and by encounter (Min NL: F1,4,82 = 35.98, p<0.0001; EID: F1,4,82 = 3.33, p=0.014; adjusted R² = 31.9%). The minimum frequency of low FM calls varied only with the noise level in the associated 1/3 octave band (Min NL: F1,4,123 = 10.59, p=0.001; EID: F1,4,123 = 1.53, p=0.197; adjusted R² = 9.48%). Peak frequencies of both CW and FM call types at the KR site were related to narrowband noise levels (Table 5-3).

Figure 5-8. Plot of all minimum vocalization frequencies against noise level in the 1/3 octave band containing minimum vocalization frequency. Beluga River data are marked in black, Kenai River data in red. Vocalizations from the Kenai River dataset had generally lower frequencies and varied with narrowband noise level; Beluga River vocalizations did not vary with increasing narrowband noise. The six data points outlined in the blue ellipse all occurred during a single encounter (BR34).

Beluga River site

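The peak frequencies of three of the four BR call classes were significantly related to the noise level in the 1/3 octave band containing the peak call frequency and to individual encounters; for medium-frequency FM calls, the relationship with noise level was marginal (p = 0.051) and there was no relationship with encounter (Table 5-3).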

For low-frequency CW calls at the Beluga River site, both minimum (F1,14,84 = 5.52, p=0.021; EID: F1,14,84 = 1.33, p=0.210; adjusted R² = 8.59%) and peak (F1,14,84 = 5.63, p=0.020; EID: F1,14,84 = 1.41, p=0.167; adjusted R² = 7.25%) frequencies were related to the broadband noise level, with no differences detected between encounters.

Low frequency FM calls at the BR site did not show any significant relationships between broad- or narrowband noise levels and minimum call frequency (Min NL: F1,15,131 = 0.01, p=0.942; EID: F1,15,131 = 1.42, p=0.147; adjusted R² = 3.92%; BB NL: F1,15,131 = 0.04, p=0.832; EID: F1,15,131 = 1.41, p=0.152; adjusted R² = 3.95%).

Minimum frequencies of medium-frequency CW vocalizations at the BR site were related only to specific encounters, and not to narrow- or broadband noise levels (Min NL: F1,10,16 = 0.09, p=0.773; EID: F1,10,16 = 2.64, p=0.040; adjusted R² = 36.42%; BB NL: F1,10,16 = 0.11, p=0.742; EID: F1,10,16 = 2.62, p=0.042; adjusted R² = 36.52%). There were no significant relationships between minimum frequencies of mid-frequency FM calls and noise levels or individual encounters.

Temporal Modifications

Relationships between broad and narrow-band noise levels and the duration of vocalizations were inconsistent at both sites and for all call categories. For CW calls, duration of vocalizations was significantly related to the encounter ID in six of eight classes analyzed and nearly significant for a seventh (Table 5-4, Figure 5-9). However, this was not the case for FM calls, for which duration varied with broadband noise level and not encounter ID. BR medium FM calls had no significant relationships between duration and narrowband noise levels or encounter IDs (Min NL: F1,15,46 = 0.67, p=0.417; EID: F1,15,46 = 1.94, p=0.114; adjusted R² = 5.62%; Peak NL: F1,15,46 = 0.14, p=0.710; EID: F1,15,46 = 1.15, p=0.346; adjusted R² = 2.38%), but the effects of broadband noise level trended toward significance in this case (BB NL: F1,15,46 = 3.10, p=0.085; EID: F1,15,46 = 1.11, p=0.378; adjusted R² = 8.27%). For low frequency FM calls from the BR site, duration was significantly related to broadband noise level (BB NL: F1,15,131 = 4.57, p=0.034; EID: F1,15,131 = 1.38, p=0.165; adjusted R² = 60.46%), and not to either minimum or peak narrowband noise levels (Min NL: F1,15,131 = 0.89, p=0.347; EID: F1,15,131 = 1.34, p=0.188; adjusted R² = 3.85%; Peak NL: F1,15,131 = 2.69, p=0.104; EID: F1,15,131 = 1.49, p=0.118; adjusted R² = 5.14%). Frequency modulated calls from the KR site had no significant relationships between call duration and any predictive variable (Min NL: F1,4,123 = 0.43, p=0.512; EID: F1,4,123 = 1.37, p=0.250; adjusted R² = 0.40%; Peak NL: F1,4,123 = 0.06, p=0.802; EID: F1,4,123 = 1.18, p=0.321; adjusted R² = 0.10%; BB NL: F1,4,123 = 0.07, p=0.788; EID: F1,4,123 = 1.26, p=0.290; adjusted R² = 0.11%).

Table 5-4. Regression relationships between the duration of flat-contour (CW) call types at the BR and KR sites and the three noise variables. In no case does the duration vary with noise level; the only significant relationships are between duration and encounter ID, which was significant in six of eight categories (bold text) and nearly significant for a seventh.

Model     Term           Statistic   BR CW Low   BR CW Medium   KR CW Low

df (NL, EID, error)                  1,14,84     1,15,131       1,4,82

Min NL    Min NL         F           0.79        0.41           0.04
                         p           0.378       0.531          0.833
          EID            F           2.14        1.94           3.61
                         p           0.017       0.114          0.009
          Adjusted R²                13.13%      33.65%         12.75%

Peak NL   Peak NL        F           2.97        10.39          0.16
                         p           0.089       0.005          0.690
          EID            F           2.33        3.71           2.66
                         p           0.009       0.010          0.038
          Adjusted R²                15.30%      58.74%         12.87%

BB NL     BB NL          F           0.03        0.35           1.06
                         p           0.855       0.564          0.306
          EID            F           2.03        2.45           4.74
                         p           0.025       0.053          0.002
          Adjusted R²                12.35%      33.39%         13.82%


Figure 5-9. Duration of low-frequency CW calls from both sites. Boxes indicate interquartile range; horizontal lines represent means and whiskers represent range. Outliers are indicated with asterisks. Duration of vocalizations was significantly related to Encounter ID for six of eight comparisons (Table 5-4).

Discussion

The vocal behavior of marine mammals is complex and depends on a wide range of factors, including behavioral states, group composition, and other variables. One potentially important factor which may affect acoustic signal structure is the amount of potential masking noise in the acoustic environment at the time a vocalization is produced. The data from this study indicate that noise levels are significantly related to the acoustic structure of vocalizations produced by Cook Inlet beluga whales at two sites in their habitat.

Overall vocal behavior

Vocal differences were detected between data sets, with a much lower incidence of mid- and high-frequency vocalizations in the Kenai River data. There are several potential explanations for this finding. First, it is possible that the beluga whales using the two sites represent two different subgroups, with different vocal behaviors. However, data from long-term satellite tags indicate that individuals do move between the BR and KR areas within the inlet (Hobbs et al., 2005), so this explanation is not entirely satisfactory. Alternatively, the same beluga population could exhibit either geographic or seasonal variations in behaviors at the two sites. Summer distributions of Cook Inlet whales are concentrated at river mouths in the upper inlet, and coincide with seasonal runs of anadromous fishes, a major food source for this population (Huntington, 2000). Behavioral use of the Beluga River site during the summer months may therefore be dominated by foraging, while winter use may be concentrated on socializing or other behaviors. A second seasonal factor that may influence vocal behavior is the presence of new calves in social groups during the summer. The calves’ social interactions and contacts with mothers and other affiliates may cause variations in vocal behavior by all animals (Miksis-Olds and Tyack, 2009). While this analysis cannot discriminate between different geographic and seasonal variations, the observed difference in vocal behavior between sites points to a knowledge gap in the understanding of habitat use by this population.

Another factor that could have affected the types of vocalizations recorded at the two sites is sound propagation and the positions of the signalers in relation to the recorders. It is possible that calls propagated differently at the Kenai River site than at the Beluga River site due to some difference in sound speed profiles (related to seasonality of recordings), bathymetry, sediment, vegetation or other environmental conditions. I believe site-specific differences in propagation are unlikely because the two recording locations have similar bathymetry and sediment types, with minimal vegetation.
While it is possible that some seasonal difference affected propagation of vocalizations at the two sites, the analyzed vocalizations from both sites were selected for high signal to noise ratios, minimizing the potential effects of propagation differences.

Acoustic environment differences may also contribute to differences in vocal behavior between sites. The third chapter of this dissertation examined the acoustic habitat at the Beluga and Kenai River sites during beluga encounters. Noise levels at the Beluga River site were consistently higher than those at Kenai River over the entire frequency spectrum. It is possible that the observed differences in vocal behavior are a response to the overall noise differences between the two recording locations. Use of more mid- and high-frequency vocalizations at Beluga River may be a response to the higher rate of vessel encounters (and associated low-frequency noise events) at this site. Signalers may take advantage of the lower noise at the Kenai River to communicate with low-frequency sounds that may travel farther (Wiley and Richards, 1978; Larom et al., 1997) or be less energetically expensive (Noren et al., 2011). More studies of the overall noise exposure of Cook Inlet belugas throughout the inlet and the full behavioral context of their vocalizations will be needed to evaluate this hypothesis.

Call structure

Acoustic structure of vocalizations was related to noise levels and encounter IDs at both sites, a sign of potential context-dependent vocal modifications. For five of the six call type categories, peak frequency of beluga vocalizations increased significantly with increasing noise in the 1/3 octave band containing the call’s peak frequency (Table 5-3). Three of these call types also had significant relationships with encounter ID. The two call categories in which peak frequency was not associated with encounter ID were both from the Kenai River site. Minimum frequency of vocalizations was also related to narrowband noise levels, though only at the Kenai River site. The lack of a relationship at the Beluga River location is likely to be an effect of the high encounter-specific variation at this site, which may in turn be related to the seasonal or site-specific conditions mentioned above. Call duration was not significantly related to any narrowband noise parameter, and was related to broadband noise for only one call category: low-frequency FM calls from the Beluga River site. In all other call categories, call duration was related specifically to the encounter ID.

The relationship of call parameters to encounter ID complicates interpretation of these results. Social and environmental variables such as group size, composition and behavioral states are known to impact vocalizations of other species, and are likely to influence signalers’ responses to noise (Miksis-Olds and Tyack, 2009; Tressler and Smotherman, 2009). Group composition and behavioral state are particularly important to consider, given that animals with dependent offspring and individuals participating in cooperative foraging groups may have higher motivation to communicate than do groups of traveling adults (Miksis-Olds and Tyack, 2009; Tressler and Smotherman, 2009; Chmelnitsky, 2010).

Confounding factors

There are several factors from both the data collection and analysis phases that could have influenced the current findings. In particular, using one stationary long-term recording unit at each site is not the most effective way to evaluate noise-induced vocal modifications. Because no visual monitoring was conducted during vessel passages, it is impossible to know the exact relationship between the noise source (the vessel) and the vocalizing whales. Noise levels at the recorder may not be representative of the noise experienced by the belugas, which may in turn affect observations of vocal modifications in relation to noise levels measured at the recorder. Overestimates of vocal modifications would be expected if the signalers experienced higher noise levels than the recording unit, while the opposite would be true if the whales experienced lower noise levels. The duration of vocalizations from Cook Inlet belugas was not associated with increases in noise level, and was only related to encounter ID for flat (CW) contour types. While FM calls from the BR site were related to noise levels, the effects of noise on call duration could have been confounded by the coarse method of dividing calls into categories. Category types were based on overall contour shape and frequency content of the vocalization, with no reference to the duration of the calls. Very short “chirps” were grouped together with whistles which were sometimes over 1 s long; pooling the long and short duration contours together is likely to have masked any effects of noise level on call duration. When an objective repertoire for the Cook Inlet beluga population is available (Blevins, R., pers. comm.), the effects of noise on call contours of different durations should be evaluated.

Conclusions

This study provides evidence that Cook Inlet beluga whales respond to increased anthropogenic noise in their habitats. Changes to the peak and minimum frequencies of vocalizations during increased noise may allow signalers to communicate more effectively during noise, or may be indicative of a heightened emotional state (Briefer, 2012) caused by the increase in noise or physical presence of a vessel in the habitat. Regardless of the immediate cause, changes to the acoustic structure of vocalizations in noise may come at an energetic cost to the signaler; such costs may accumulate to a substantial amount of energy for signalers that repeatedly modify call structures due to chronic noise exposures.

The lack of visual observations during recordings precluded analyses of the Lombard effect in this population. Increase of vocalization source levels with increasing noise amplitude is a response that has been documented in wild belugas and several other cetaceans, humans, and other species (Lombard, 1911; Lane and Tranel, 1971; Brumm and Zollinger, 2011). It is likely that Cook Inlet belugas do increase the source level of vocalizations during increased noise, given the previous evidence of a capacity for a Lombard-like response (Scheifele et al., 2005) and changes to the spectral content of vocalizations documented in this study, but further research should be done to confirm whether call amplitude is a flexible parameter for this population.

Chronic noise exposure and the need for repeated modifications to vocalization structure are likely to impose energetic costs and may increase physiological stress in the Cook Inlet beluga population. Further investigations of the interactions between group composition, behavioral states, and increases in anthropogenic noise may provide insight into the context-dependent vocal noise compensation strategies in this population and allow for eventual mitigation of the effects of chronic noise exposure on this vulnerable population.

References

Aguilar Soto, N., Johnson, M., Madsen, P. T., Tyack, P. L., Bocconcelli, A., and Fabrizio Borsani, J. (2006). "Does intense ship noise disrupt foraging in deep-diving Cuvier's beaked whales (Ziphius cavirostris)?," Marine Mammal Science 22, 690-699.
Angiel, N. M. (1997). "The vocal repertoire of the beluga whale in Bristol Bay, Alaska," M.S. Thesis, University of Washington.
Barber, J. R., Crooks, K. R., and Fristrup, K. M. (2009). "The costs of chronic noise exposure for terrestrial organisms," Trends in Ecology & Evolution 25, 180-189.
Beddard, F. E. (1900). A Book of Whales (G.P. Putnam's Sons, New York, NY), 245.
Belikov, R. A., and Bel'kovitch, V. M. (2006). "High-pitched tonal signals of beluga whales (Delphinapterus leucas) in a summer assemblage off Solovetskii Island in the White Sea," Acoustical Physics 52, 125-131.
Belikov, R. A., and Bel'kovitch, V. M. (2007). "Whistles of beluga whales in the reproductive gathering off Solovetskii Island in the White Sea," Acoustical Physics 53, 528-534.
Belikov, R. A., and Bel'kovitch, V. M. (2008). "Communicative pulsed signals of beluga whales in the reproductive gathering off Solovetskii Island in the White Sea," Acoustical Physics 54, 115-123.

Blackwell, S. B., and Greene, C. R. J. (2002). "Acoustic measurements in Cook Inlet, Alaska, during 2001," Prepared for National Marine Fisheries Service. Greeneridge Report 271-1. Greeneridge Sciences, Inc., Aptos, CA.
Briefer, E. F. (2012). "Vocal expression of emotions in mammals: mechanisms of production and evidence," Journal of Zoology, Published online 8 May 2012.
Brumm, H., and Slabbekoorn, H. (2005). "Acoustic Communication in Noise," in Advances in the Study of Behavior, edited by P. J. B. Slater, C. T. Snowdon, T. J. Roper, H. J. Brockmann, and M. Naguib (Academic Press), pp. 151-209.
Brumm, H., and Zollinger, S. A. (2011). "The evolution of the Lombard effect: 100 years of psychoacoustic research," Behaviour 148, 1173-1198.
Castellote, M., and Fossa, F. (2006). "Measuring Acoustic Activity as a Method to Evaluate Welfare in Captive Beluga Whales (Delphinapterus leucas)," Aquatic Mammals 32, 325-333.
Chmelnitsky, E. (2010). "Beluga whale, Delphinapterus leucas, vocalizations and their relation to behaviour in the Churchill River, Manitoba, Canada," M.S. Thesis, University of Manitoba.
Chmelnitsky, E. G., and Ferguson, S. H. (2012). "Beluga whale, Delphinapterus leucas, vocalizations from the Churchill River, Manitoba, Canada," The Journal of the Acoustical Society of America 131, 4821-4835.
Clark, C. W., Ellison, W. T., Southall, B. L., Hatch, L. T., Van Parijs, S. M., Frankel, A., and Ponirakis, D. (2009). "Acoustic masking in marine ecosystems: intuitions, analysis, and implication," Marine Ecology Progress Series 395, 201-222.
Ezer, T., Hobbs, R., and Oey, L. (2008). "On the movement of beluga whales in Cook Inlet, Alaska," Oceanography 21, 186-195.
Francis, C. D., Ortega, C. P., and Cruz, A. (2011a). "Different behavioural responses to anthropogenic noise by two closely related passerine birds," Biology Letters 7, 850-852.
Francis, C. D., Ortega, C. P., and Cruz, A. (2011b). "Noise Pollution Filters Bird Communities Based on Vocal Frequency," PLoS ONE 6, e27052.
Gervaise, C., Simard, Y., Roy, N., Kinda, B., and Menard, N. (2012). "Shipping noise in whale habitat: Characteristics, sources, budget, and impact on belugas in Saguenay-St. Lawrence Marine Park hub," The Journal of the Acoustical Society of America 132, 76-89.
Goetz, K. T., Rugh, D. J., Read, A. J., and Hobbs, R. C. (2007). "Habitat use in a marine ecosystem: beluga whales Delphinapterus leucas in Cook Inlet, Alaska," Marine Ecology Progress Series 330, 247-256.
Gotelli, N. J., and Ellison, A. M. (2004). A Primer of Ecological Statistics (Sinauer Associates, Inc., Sunderland, MA), 239-287.
Halfwerk, W., and Slabbekoorn, H. (2009). "A behavioural mechanism explaining noise-dependent frequency use in urban birdsong," Animal Behaviour 78, 1301-1307.
Hobbs, R., Shelden, K. E. W., Rugh, D. J., and Norman, S. A. (2008). "2008 Status review and extinction risk assessment of Cook Inlet belugas (Delphinapterus leucas)," in AFSC Processed Report 2008-02 (Alaska Fisheries Science Center, National Marine Fisheries Service).
Hobbs, R. C., Laidre, K. L., Vos, D. J., Mahoney, B. A., and Eagleton, M. (2005). "Movements and area use of belugas, Delphinapterus leucas, in a subarctic Alaskan estuary," Arctic 58, 331-340.
Hobbs, R. C., Rugh, D. J., and DeMaster, D. P. (2000). "Abundance of belugas, Delphinapterus leucas, in Cook Inlet, Alaska, 1994-2000," Marine Fisheries Review 62, 37-45.

Hobbs, R. C., Sims, C. L., and Shelden, K. E. W. (2011). "Estimated abundance of belugas in Cook Inlet, Alaska, from aerial surveys conducted in June 2011," in Unpublished Report (NMFS, NMML).
Holt, M. M., Noren, D. P., and Emmons, C. K. (2011). "Effects of noise levels and call types on the source levels of killer whale calls," The Journal of the Acoustical Society of America 130, 3100-3106.
Hotchkin, C. F., and Parks, S. E. (In press). "The Lombard effect and other vocal noise compensation strategies: insights from mammalian communication systems."
Hotchkin, C. F., Parks, S. E., and Mahoney, B. A. (2010). "Anthropogenic noise sources and sound production of beluga whales (Delphinapterus leucas) in Cook Inlet, Alaska," in 159th Meeting of the Acoustical Society of America (Baltimore, MD), p. 1727. [Abstract]
Hu, Y., and Cardoso, G. C. (2010). "Which birds adjust the frequency of vocalizations in urban noise?," Animal Behaviour 79, 863-867.
Huntington, H. P. (2000). "Traditional knowledge of the ecology of belugas, Delphinapterus leucas, in Cook Inlet, Alaska," Marine Fisheries Review 62, 134-140.
Karlsen, J. K., Bisther, A. B., Lydersen, C. L., Haug, T. H., and Kovacs, K. K. (2002). "Summer vocalisations of adult male white whales (Delphinapterus leucas) in Svalbard, Norway," Polar Biology 25, 808-817.
Kelley, M. M. (2010). "Acoustic analysis of the ultrasonic underwater repertoire of beluga whales (Delphinapterus leucas) at the John G. Shedd Aquarium," M.S. Thesis, Western Illinois University.
Lammers, M. O., Brainard, R. E., Au, W. W., Mooney, T. A., and Wong, K. B. (2008). "An ecological acoustic recorder (EAR) for long-term monitoring of biological and anthropogenic sounds on coral reefs and other marine habitats," The Journal of the Acoustical Society of America 123, 1720-1728.
Lammers, M. O., Castellote, M., Small, R., Atkinson, S., Jenniges, J., Rosinski, A., Oswald, J. N., Garner, C., and Au, W. W. (In review). "Passive acoustic monitoring of Cook Inlet beluga whales (Delphinapterus leucas)," Journal of the Acoustical Society of America.
Lane, H., and Tranel, B. (1971). "The Lombard sign and the role of hearing in speech," Journal of Speech, Language, and Hearing Research 14, 677-709.
Larom, D., Garstang, M., Payne, K., Raspet, R., and Lindeque, M. (1997). "The influence of surface atmospheric conditions on the range and area reached by animal vocalizations," Journal of Experimental Biology 200, 421-431.
Lesage, V., Barrette, C., Kingsley, M. C. S., and Sjare, B. (1999). "The effect of vessel noise on the vocal behavior of belugas in the St. Lawrence River estuary, Canada," Marine Mammal Science 15, 65-84.
Lombard, E. (1911). "Le signe de l'elevation de la voix," Annales des Maladies de l'Oreille 37.
Lusseau, D., Bain, D. E., Williams, R., and Smith, J. C. (2009). "Vessel traffic disrupts the foraging behavior of southern resident killer whales Orcinus orca," Endangered Species Research 6, 211-221.
Miksis-Olds, J. L., and Tyack, P. L. (2009). "Manatee (Trichechus manatus) vocalization usage in relation to environmental noise levels," The Journal of the Acoustical Society of America 125, 1806-1815.
Moore, S. E., and DeMaster, D. P. (2000). "Cook Inlet belugas, Delphinapterus leucas: Status and overview," Marine Fisheries Review 62, 1-5.
Moore, S. E., Shelden, K. E. W., Litzky, L. K., Mahoney, B. A., and Rugh, D. J. (2000). "Beluga, Delphinapterus leucas, habitat associations in Cook Inlet, Alaska," Marine Fisheries Review 62, 60-80.

NMFS (2008). "Conservation plan for the Cook Inlet beluga whale (Delphinapterus leucas)," (National Marine Fisheries Service, Juneau, AK).
Noren, D. P., Dunkin, R. C., Williams, T. M., and Holt, M. M. (2011). "Energetic cost of behaviors performed in response to vessel disturbance: one link in the population consequences of acoustic disturbance model," in The Effects of Noise on Aquatic Life, edited by A. Hawkins, and A. N. Popper.
Nowacek, D. P., Johnson, M. P., and Tyack, P. L. (2004). "North Atlantic right whales (Eubalaena glacialis) ignore ships but respond to alerting stimuli," Proc. R. Soc. Lond. B 271, 227-231.
O'Corry-Crowe, G. M. (2009). "Beluga Whale: Delphinapterus leucas," in Encyclopedia of Marine Mammals (Second Edition), edited by F. P. William, W. Bernd, and J. G. M. Thewissen (Academic Press, London), pp. 108-112.
Panova, E., Belikov, R., Agafonov, A., and Bel’kovich, V. (2012). "The relationship between the behavioral activity and the underwater vocalization of the beluga whale (Delphinapterus leucas)," Oceanology 52, 79-87.
Parks, S. E., Johnson, M., Nowacek, D. P., and Tyack, P. L. (2011). "Individual right whales call louder in increased environmental noise," Biology Letters 7, 33-35.
Parsons, E. C. M., Dolman, S. J., Wright, A. J., Rose, N. A., and Burns, W. C. G. (2008). "Navy sonar and cetaceans: Just how much does the gun need to smoke before we act?," Marine Pollution Bulletin 56, 1248-1257.
Patricelli, G. L., and Blickley, J. L. (2006). "Avian communication in urban noise: causes and consequences of vocal adjustment," The Auk 123, 639-649.
Recchia, C. A. (1994). "Social Behaviour of Captive Belugas, Delphinapterus Leucas," Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA.
Richardson, W. J., Greene, C. R. J., Malme, C. I., and Thompson, D. H. (1995). Marine mammals and noise (Academic Press, San Diego, CA).
Richardson, W. J., Würsig, B., and Greene Jr, C. R. (1990). "Reactions of bowhead whales, Balaena mysticetus, to drilling and dredging noise in the Canadian Beaufort Sea," Marine Environmental Research 29, 135-160.
Rolland, R. M., Parks, S. E., Hunt, K. E., Castellote, M., Corkeron, P. J., Nowacek, D. P., Wasser, S. K., and Kraus, S. D. (2012). "Evidence that ship noise increases stress in right whales," Proceedings of the Royal Society B: Biological Sciences, Published online 8 February 2012.
Romano, T. A., Keogh, M. J., Kelly, C., Feng, P., Berk, L., Schlundt, C. E., Carder, D. A., and Finneran, J. J. (2004). "Anthropogenic sound and marine mammal health: measures of the nervous and immune systems before and after intense sound exposure," Canadian Journal of Fisheries and Aquatic Sciences 61, 1124-1134.
Rugh, D. J., Shelden, K. E. W., and Hobbs, R. C. (2010). "Range contraction in a beluga whale population," Endangered Species Research 12, 69-75.
Scheifele, P. M., Andrew, S., Cooper, R. A., Darre, M., Musiek, F. E., and Max, L. (2005). "Indication of a Lombard vocal response in the St. Lawrence River beluga," The Journal of the Acoustical Society of America 117, 1486-1492.
Schroeder, J., Nakagawa, S., Cleasby, I. R., and Burke, T. (2012). "Passerine Birds Breeding under Chronic Noise Experience Reduced Fitness," PLoS ONE 7, e39200.
Sjare, B. L., and Smith, T. G. (1986). "The vocal repertoire of white whales, Delphinapterus leucas, summering in Cunningham Inlet, Northwest Territories," Canadian Journal of Zoology 64, 407-415.
Speckman, S. G., and Piatt, J. F. (2000). "Historic and current use of Lower Cook Inlet, Alaska, by belugas, Delphinapterus leucas," Marine Fisheries Review 62, 22-26.

Tressler, J., and Smotherman, M. S. (2009). "Context-dependent effects of noise on echolocation pulse characteristics in free-tailed bats," Journal of Comparative Physiology A 195, 923-934.
Vergara, V. L., Michaud, R., and Barrett-Lennard, L. G. (2010). "What can Captive Whales tell us About their Wild Counterparts? Identification, Usage, and Ontogeny of Contact Calls in Belugas (Delphinapterus leucas)," International Journal of Comparative Psychology 23, 278-309.
Weilgart, L. S. (2007). "The impacts of anthropogenic ocean noise on cetaceans and implications for management," Canadian Journal of Zoology 85, 1091-1116.
Wiley, R. H., and Richards, D. G. (1978). "Physical constraints on acoustic communication in the atmosphere: Implications for the evolution of animal vocalizations," Behavioral Ecology and Sociobiology 3, 69-94.
Wintle, B. A. (2009). "Adaptive Management, Population Modeling and Uncertainty Analysis for Assessing the Impacts of Noise on Cetacean Populations," International Journal of Comparative Psychology 20, 237-249.
Wright, A. J., Soto, N. A., Baldwin, A. L., Bateson, M., Beale, C. M., Clark, C., Deak, T., Edwards, E. F., Fernandez, A., Godinho, A., Hatch, L. T., Kakuschke, A., Lusseau, D., Martineau, D., Romero, L. M., Weilgart, L. S., Wintle, B. A., Notarbartolo-di-Sciara, G., and Martin, V. (2007a). "Anthropogenic noise as a stressor in animals: a multidisciplinary perspective," International Journal of Comparative Psychology 20, 250-273.
Wright, A. J., Soto, N. A., Baldwin, A. L., Bateson, M., Beale, C. M., Clark, C., Deak, T., Edwards, E. F., Fernandez, A., Godinho, A., Hatch, L. T., Kakuschke, A., Lusseau, D., Martineau, D., Romero, L. M., Weilgart, L. S., Wintle, B. A., Notarbartolo-di-Sciara, G., and Martin, V. (2007b). "Do marine mammals experience stress related to anthropogenic noise?," International Journal of Comparative Psychology 20, 274-316.

Chapter 6

Vocal noise compensation in a non-human primate: effects of noise bandwidth and level on cotton-top tamarin (Saguinus oedipus) vocalizations

Abstract

Vocal noise compensation mechanisms are common among vertebrate species, but there are few data on which aspects of noise elicit vocal responses. Detailed studies of the effects of specific noise characteristics and the potential interactions between compensation strategies are available only for humans, although some non-human species exhibiting similar vocal modifications also appear to respond to similar noise parameters. Among non-human primates, however, the effects of spectral overlap between noise and vocalizations are poorly known, representing a critical gap in understanding how noise impacts acoustic communication. This study used multiple amplitudes of broad- and narrow-band white noise to determine whether signaling cotton-top tamarins (Saguinus oedipus) differentially modify the acoustic characteristics of their vocalizations in response to the amount of spectral overlap and/or the amplitude of noise. Modifications to long distance contact calls termed “combination long calls” (CLCs) were associated with both noise level and amount of spectral overlap. Observed changes included increased call amplitude (the Lombard effect), changes to fundamental frequency, and spectral tilt. Changes to chirp frequencies and duration were predicted only by noise amplitude, as there was no spectral overlap. Notably, a previously documented increase in CLC duration was not observed. These results indicate that noise-induced vocal modifications in cotton-top tamarin calls are influenced by noise level and degree of spectral overlap, and that modification magnitude and types are context-dependent, changing with vocalization type.

Introduction

Noise is known to exert a significant influence on the evolution of acoustic communication, imposing fitness costs on animals that lose contact with offspring, miss mating opportunities, or do not hear alarm calls from group members (Bradbury and Vehrencamp, 1998). Short-term variability in ambient noise (ranging from milliseconds to days) due to weather events, animal choruses, or other environmental factors can overlap the range of signalers’ evolved communication frequencies and potentially disrupt communication (Wiley, 2006). Just as long-term trends in noise spectra can influence signal structure (Ey and Fischer, 2009), short-term changes to the acoustic environment may select for vocal flexibility in signaling animals, allowing signalers to compensate for increased noise by changing either their calling behavior or the acoustic characteristics of their signals (Brumm and Slabbekoorn, 2005).

Vocal noise compensation mechanisms that result in a change to the acoustic structure of a signal are termed noise-induced vocal modifications (NIVMs; Hotchkin and Parks, In press). These can include changes along three dimensions: signal amplitude, frequency characteristics, and temporal parameters (Patricelli and Blickley, 2006; Hotchkin and Parks, In press). Some species have limited vocal control and may exhibit flexibility along only one axis, while others have demonstrated vocal flexibility in all three dimensions (Lesage et al., 1999; Love and Bee, 2010).

Different types of vocal modifications may be more effective at compensating for noise with specific spectral or temporal characteristics. For instance, in the case of relatively continuous low-frequency noise, either increasing call amplitude or shifting call frequency out of the noise may increase signal detection (Brumm and Slabbekoorn, 2005). Even if there is no spectral overlap between the noise and the signal, low-frequency noise has the potential to mask signals in higher frequency bands (Zwicker and Fastl, 1999), and can potentially impact vocalization structure.
When there is no spectral overlap with noise, however, frequency shifts are not likely to increase communication success. Rather, increasing call amplitude is likely to be the most effective modification. In the case of intermittent noise, changes to call duration or repetition may increase the likelihood of a portion of the vocalization occurring during quiet periods, alerting any receivers to the signaler’s attempt to communicate and increasing the chances of successful communication. In systems with highly dynamic noise environments, the ability to flexibly respond to different noise types with truly adaptive vocal modifications may increase successful communication, providing a fitness advantage to flexible signalers.

As predicted, many vertebrate species have demonstrated the vocal control and flexibility necessary to alter the acoustic structure of a signal (Brumm and Slabbekoorn, 2005; Patricelli and Blickley, 2006; Nowacek et al., 2007). The most common noise-induced modification observed in bird and mammal vocalizations is a linear relationship between noise level and signal amplitude known as the Lombard effect (Lombard, 1911; Scheifele et al., 2005; Brumm and Zollinger, 2011). Increasing vocalization amplitude during noise may allow signalers to monitor their own vocal production and/or communicate more effectively with a receiver (Lombard, 1911; Lane and Tranel, 1971). Typical Lombard effect slopes (1-2 dB increase in signal amplitude per 10 dB increase in noise level; Lane and Tranel, 1971; Schmidt and Joermann, 1986; Scheifele et al., 2005) are not enough to completely compensate for the loss of signal-to-noise ratio (SNR) in every situation. In general, however, animals often vocalize at higher SNRs than necessary in low ambient noise (Lane and Tranel, 1971; Brumm et al., 2004; Brumm and Slabbekoorn, 2005; Egnor and Hauser, 2006; Brumm and Zollinger, 2011). In combination, this signal excess and the Lombard effect can help mitigate the loss of effective communication range during most noise events (Lane and Tranel, 1971; Brumm and Slabbekoorn, 2005).

Spectral modifications, or shifts in the start, end, minimum (lowest), maximum (highest), and peak (greatest energy) frequencies of a signal, can reduce masking by shifting all or part of a call out of masking noise, reducing the degree of spectral overlap of the signal (Brumm and Slabbekoorn, 2005). In bird and mammal vocalizations, shifts to higher vocalization frequencies are often observed when animals are exposed to low-frequency sounds (Lesage et al., 1999; Halfwerk and Slabbekoorn, 2009; Tressler and Smotherman, 2009; Hu and Cardoso, 2010; Cardoso and Atwell, 2011). Some species also appear to be capable of shifting to lower frequency calls during high-frequency noise (Rabin et al., 2003; Halfwerk and Slabbekoorn, 2009).
Notably lacking from the research on spectral shifts are nonhuman primates, which have never been observed to consistently shift their vocalization frequencies in noise (Egnor and Hauser, 2006; Egnor et al., 2006), and have historically been assumed to lack the vocal control to change the acoustic structure of their calls (Fitch and Hauser, 1995; Snowdon et al., 2009).

One spectral modification that has been studied almost exclusively in human speech is spectral tilt, a measure of the relative energy in low and high frequency portions of a signal (Fitch, 1989; Parks, 2003; Rabin et al., 2003). During increased noise, energy shifts from low frequency portions of the signal to higher frequency formants (referred to as a decrease in spectral tilt), potentially removing some of the signal from masking noise (Sundberg and Nordenberg, 2006; Lu and Cooke, 2009). The apparent cause of this change is a biomechanical or psychophysical linkage between vocal effort, which is directly related to signal amplitude, and deformation of the vocal apparatus causing a change in the spectral balance of vocalizations (Fitch, 1989). If observed independently of the Lombard effect, changes to spectral tilt may also be adaptive for communication in noise, as they can shift a portion of the signal out of the noise band.

Temporal parameters of signals (e.g. duration, number of syllables, and repetition) can also affect signal detection and successful communication. Changes to temporal characteristics may allow the receiver more time to detect and process the signal, or capture the receiver’s attention before the information-rich portion of the sound is produced (Ord and Stamps, 2008). Increases in the duration of whole calls and syllables have been found for many species, including those that have not demonstrated other types of vocal flexibility (Hanley and Steer, 1949; Miller et al., 2000; Brumm et al., 2004; Patricelli and Blickley, 2006). In particular, two species of non-human primates (cotton-top tamarins (Saguinus oedipus) and common marmosets (Callithrix jacchus)) are known to increase the duration of syllables within their long-distance contact calls while simultaneously demonstrating the Lombard effect (Brumm et al., 2004; Egnor and Hauser, 2006). Notably, other types of temporal modification are relatively rare (Brumm and Zollinger, 2011).

Noise structure, including amplitude, spectral content, and temporal properties, can significantly affect which modifications will increase communication success (Brumm and Slabbekoorn, 2005). However, the aspects of noise to which signalers respond are not fully understood. Both human and non-human signalers respond to noise amplitude and degree of spectral overlap (Lesage et al., 1999; Tressler and Smotherman, 2009; Garnier et al., 2010) and some non-human primates also respond to the temporal structure of noise (Egnor et al., 2007; Roy et al., 2011), but there are few data on how these parameters interact to influence the types of vocal modifications observed.
Many existing studies have analyzed the effects of single vocal modifications (Brumm, 2004), but signalers that respond to multiple noise parameters may actually exhibit concurrent changes to several aspects of vocalizations (Hotchkin and Parks, In review). Interactions between different types of vocal modifications may affect the propagation patterns or detectability of the signals, and the cumulative effects may not be linearly predictable from the individual modification types. It is therefore important to understand how signalers perceive noise, and what effects the use of multiple modifications has on the signalers’ overall communication success.

Among non-human primates, vocal noise compensation has been best studied in cotton-top tamarins (hereafter tamarins), a small arboreal species of New World monkey that is native to Colombia (Savage et al., 1996). Adults of this species form long-term pair bonds and live in groups of a breeding pair and sub-adult offspring (Goldizen, 1990), communicating within and between groups with an extensive vocal repertoire (Cleveland and Snowdon, 1982). Captive tamarins appear to exhibit the majority of call types found in the wild population (McConnell and Snowdon, 1986). In particular, when individuals are removed from a captive colony, the focal individual or other animals in the colony often will spontaneously produce a multi-syllable, stereotyped vocalization known as the ‘combination long call’ (CLC), which consists of “chirp” and “whistle” syllables, and is presumed to function as a long-distance contact call (Ghazanfar et al., 2001; Ghazanfar et al., 2002; Miller and Hauser, 2004). This call type is also produced in response to other individuals producing CLCs (‘antiphonal calling’); perceptual characteristics of the call type are well known, and informationally important components have been identified (Weiss et al., 2001; Weiss and Hauser, 2002; Jordan et al., 2004).

Figure 6-1. A captive cotton-top tamarin (Homer) from the Penn State colony.

Previous studies of NIVMs in cotton-top tamarins and common marmosets have shown that non-human primates are capable of modifying the temporal and amplitude characteristics of their vocalizations during increased noise (Brumm et al., 2004; Egnor and Hauser, 2006; Egnor et al., 2006). While there is no evidence that tamarins are able to modify the fundamental frequency of their CLCs in a consistent manner, a related species (black tufted-ear marmosets; Callithrix kuhlii) can modify the spectral content of calls in response to changes in social context (Rukstalis et al., 2003), indicating that tamarins may also have some control of the spectral content of their vocalizations. Both tamarins and marmosets have been shown to attend to the temporal properties of noise and use this information to vocalize during quiet gaps in noise (Egnor et al., 2006; Egnor et al., 2007; Versace et al., 2008; Roy et al., 2011). However, there have been no studies investigating whether tamarins or marmosets, like humans, modify their calls or calling behavior in response to the spectral content of a masking stimulus.

Additionally, all previous studies of tamarin vocal behavior in noise have used a maximum of two levels of noise (loud and quiet) to investigate changes (Egnor and Hauser, 2006; Egnor et al., 2006; Egnor et al., 2007). Such a design is sufficient to evaluate whether the subjects are responding to the noise, but not to evaluate whether signalers employ modifications independently, or whether all vocal changes occur simultaneously, in a linked all-or-nothing paradigm. In order to investigate whether different types of vocal modifications can occur independently of one another, an intermediate level of noise is required. This study uses playbacks of six different noise amplitude and bandwidth combinations to explore (1) how the parameters of noise affect the vocal response, and (2) whether tamarins employ vocal modifications non-simultaneously.

Research questions and hypotheses

This study experimentally tested whether vocal noise compensation mechanisms used by cotton-top tamarins are affected by varying the amplitude and spectral content (bandwidth) of masking noise. Previous research has established that tamarins concurrently modify the amplitude (the Lombard effect) and duration of their CLCs in response to high amplitude white noise, though no consistent changes to CLC fundamental frequency were reported (Egnor and Hauser, 2006). Consequently, I predicted that all noise conditions would elicit changes to CLC amplitude and duration, but that there would be no change in the fundamental frequency of CLCs in any noise condition. Additionally, due to the demonstrated perceptual importance of the second CLC harmonic (Weiss et al., 2001; Weiss and Hauser, 2002), I predicted that narrowband (5 kHz bandwidth) noise, targeted to overlap second harmonic frequencies, would cause greater-magnitude changes to vocalization amplitude than equal amplitude broadband noise (10 kHz bandwidth).

Despite the robust harmonic structure of CLCs and evidence for the perceptual importance of specific harmonics, shifts in the peak frequency and spectral tilt of these calls have, to the best of my knowledge, never been investigated. Given evidence of changes to spectral tilt from both close and distant evolutionary relatives (Parks, 2003; Rabin et al., 2003; Lu and Cooke, 2009), it is reasonable to predict that shifts in the spectral tilt of tamarin vocalizations are likely to occur during increased noise. If spectral tilt is linked to the Lombard effect as hypothesized for humans, changes should be highly correlated with increases in vocal amplitude and greatest during high-amplitude narrowband noise. In addition, changes to spectral tilt during narrowband noise are more likely to shift energy to an unmasked frequency band than equivalent changes in response to the broadband stimulus of equal amplitude.

Concurrent usage of vocal modifications has previously been demonstrated by tamarins, but there has been no examination of the order in which modifications are employed. While some birds appear to be able to adjust acoustic parameters independently (Cardoso and Atwell, 2011), this appears less common in mammals (Fitch, 1989; Tressler and Smotherman, 2009). I therefore predicted that modifications to the amplitude, duration, and spectral tilt of CLCs during increased noise would be simultaneous and have proportionally similar magnitudes of change at each noise level.

Methods

Animal Care

The animals used for this study were housed on the main campus of The Pennsylvania State University, University Park, PA in a single colony room. The colony consisted of 13 adult cotton-top tamarins (7 males, 6 females) housed in mated pairs, with the exception of a single male housed alone. Animals were fed the standard tamarin diet of monkey chow, seeds, nuts, and fruits with ad libitum access to water. The design of the colony room allowed unlimited acoustic contact among all colony members and limited visual contact with animals in neighboring home cages. Seven of the 13 animals participated in this study (males: Bart, Homer, Jerry, Milhouse; females: Elaine, Mulva, Susan); the other six individuals were not reliably catchable during the data collection period and were therefore excluded. Two of the individuals tested (Homer and Elaine) did not vocalize during trials and were therefore excluded from analysis, yielding an effective sample size of five animals. No two of these five formed a mated pair, and all were housed separately from one another.

Playback stimuli

Noise stimuli

White noise playback stimuli were generated using the noise generation function in Adobe Audition version 5 (Adobe, 2011) (sampling rate: 44.1 kHz), and band-pass filtered to generate stimuli of the appropriate bandwidth and level combinations. Two different bandwidths (“broad” and “narrow”) and three levels were used, for a total of six treatments (Figure 6-2). Noise bandwidths and frequency ranges were selected to target the masking noise at the perceptually important second harmonic of combination long calls (around 4 kHz; see Weiss and Hauser, 2002). Broadband stimuli contained energy between 100 Hz and 10 kHz, in an effort to mask the first four to five harmonics of the CLCs. Narrowband stimuli were generated by band-pass filtering the broadband stimuli between 1.5 kHz and 6.5 kHz, retaining masking noise around the first three CLC harmonics. Noise levels were set to not exceed 70 dB re 20 µPa rms over the recording bandwidth.

Figure 6-2. Spectrograms of white noise playback stimuli recorded during trials. Treatments A – C have a bandwidth of 5 kHz and are presented at 2, 12, and 22 dB above ambient noise levels. Treatments D – F have a 10 kHz bandwidth and are presented at similar broadband rms amplitudes to treatments A – C. (1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution)
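As a rough illustration of what presenting the two bandwidths at matched rms level implies, the following Python sketch lays out the six-treatment grid and computes the difference in mean spectral density between the narrowband and broadband stimuli. It assumes ideally flat noise within each band and exactly equal broadband rms levels (the caption says only "similar"); the function and constant names are hypothetical and are not taken from the study's own processing.

```python
import math

# Band edges from the text (Hz); levels above ambient from the Figure 6-2 caption (dB).
BROAD_BAND = (100.0, 10_000.0)
NARROW_BAND = (1_500.0, 6_500.0)
LEVELS_ABOVE_AMBIENT_DB = (2, 12, 22)


def bandwidth_hz(band):
    """Width of a (low, high) frequency band in Hz."""
    low, high = band
    return high - low


def treatment_grid():
    """Map treatment labels A-F to (bandwidth name, dB above ambient).

    A-C are narrowband and D-F broadband, following the figure caption.
    """
    combos = [(name, level)
              for name in ("narrow", "broad")
              for level in LEVELS_ABOVE_AMBIENT_DB]
    return dict(zip("ABCDEF", combos))


def narrow_minus_broad_density_db():
    """Spectral-density excess (dB) of narrowband over broadband noise.

    Assumes flat noise in each band and identical overall rms level, so the
    same power is concentrated into a narrower band.
    """
    return 10.0 * math.log10(bandwidth_hz(BROAD_BAND) / bandwidth_hz(NARROW_BAND))


if __name__ == "__main__":
    for label, (name, level) in treatment_grid().items():
        print(f"Treatment {label}: {name}band, {level} dB above ambient")
    print(f"Density excess of narrowband noise: {narrow_minus_broad_density_db():.2f} dB")
```

Under these assumptions the narrowband stimulus carries roughly 3 dB more energy per hertz inside its passband than the broadband stimulus of equal rms level, which is worth keeping in mind when comparing responses across bandwidths.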

CLC Elicitation

Cotton-top tamarins often produce CLCs when they are removed from their colony, isolated from other individuals, or in response to another individual’s CLC (Ghazanfar et al., 2001; Ghazanfar et al., 2002; Jordan et al., 2004). This experiment used playbacks of CLCs from an unfamiliar adult female tamarin to elicit CLCs from focal animals during control and treatment trials. Three exemplars from the same female (Figure 6-3) were played in random order during test periods; the exemplars were the same for the entire experiment.

Figure 6-3. Spectrograms of the three elicitation stimuli played during trials. All three calls are from the same unfamiliar adult female, and were played in random order during each trial. (1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution)

Data collection

Data were collected between November 2011 and March 2012. To minimize potential for increased stress in animals exposed to noise (Wright et al., 2007a), only one session was conducted with each animal per day, and no animal experienced more than two sessions per week. Each session consisted of one control and one treatment trial presented in random order with a 15 – 60 minute rest period in the home cage between trials. Randomization was performed within and between tamarins using the pseudo-random ‘rand’ function in Matlab® (MathWorks, 2007) to determine the testing order for individual tamarins within a session and the order of treatments for each tamarin between sessions.

Randomization was done at these levels to prevent individuals from detecting a pattern in capture of conspecifics and noise levels between sessions. While the randomized order of testing individuals did not always work as anticipated, due to complications with catching animals, the order of animals being tested varied between days, and is thus unlikely to have affected the outcome of the experiment. The order of treatment (noise playback) and control (silent playback) trials within a session was also randomized to prevent the animals from predicting what the noise environment would be during their next capture. During treatment trials, the noise playback was started before the animal entered the testing area, so that they did not perceive a change in the testing room’s acoustic environment during the trial.
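The three levels of randomization described above (animal testing order within a day, treatment order across sessions for each animal, and control/treatment order within a session) can be sketched as follows. This is a minimal Python illustration using the standard `random` module, not the Matlab code used in the study; the function name `plan_sessions` and the assumption that every animal is tested on every testing day are hypothetical simplifications.

```python
import random

TREATMENTS = ["A", "B", "C", "D", "E", "F"]


def plan_sessions(animals, n_days, seed=None):
    """Return a list of testing days; each day is a list of (animal, [trial, trial]).

    Hypothetical simplification: every animal is tested once per testing day
    and receives each treatment at most once.
    """
    if n_days > len(TREATMENTS):
        raise ValueError("more testing days than treatments")
    rng = random.Random(seed)

    # Treatment order for each animal across sessions.
    treatment_order = {a: rng.sample(TREATMENTS, k=len(TREATMENTS)) for a in animals}

    plan = []
    for day in range(n_days):
        # Order in which animals are tested on this day.
        testing_order = rng.sample(animals, k=len(animals))
        sessions = []
        for animal in testing_order:
            trials = ["control", "treatment " + treatment_order[animal][day]]
            rng.shuffle(trials)  # control-first or treatment-first
            sessions.append((animal, trials))
        plan.append(sessions)
    return plan


if __name__ == "__main__":
    animals = ["Bart", "Jerry", "Milhouse", "Mulva", "Susan"]
    for day, sessions in enumerate(plan_sessions(animals, 6, seed=2012), start=1):
        print(f"Day {day}: {sessions}")
```

In practice the realized order would still need to be adjusted by hand when an animal could not be caught, as noted in the text, and testing days would be spaced to respect the limit of two sessions per animal per week.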

Materials

The testing area (Figure 6-4) consisted of a staging area, experimenter station and soundproof testing room (Acoustic Systems, customized chamber) lined with acoustic foam to reduce reverberation. To reduce noise, air vents were covered with acoustic foam during sessions and uncovered at the end of each day.

Figure 6-4. Diagram of experimental setup. Letters indicate equipment placement; M = microphone; C = video camera; NS = speaker presenting noise stimulus; ES = speaker presenting elicitation stimulus.

Stimuli were projected into the chamber from the experimenter station, where a two-channel power amplifier (Samson Servo 300; flat (±0.5 dB) 20 Hz – 20 kHz) was connected to two speakers (Tannoy Reveal 6p; flat (±3 dB) 63 Hz – 30 kHz) inside the testing room. Noise stimuli were played back through a Marantz PMD620 recorder connected to channel one of the amplifier; channel two was connected to an iMac computer used to play back call elicitation stimuli. The speakers were located in the corner of the testing room and the stimulus played through each (noise or elicitation) was held constant over the course of the experiment. Other equipment in the testing room included a video camera (Microsoft Livecam Studio model 1425 webcam) that streamed live video to a laptop computer at the experimenter station, where it was used to monitor the focal animal’s behavior and position during trials. Video was also recorded for future analysis of the focal animal’s head positions and behavior.

To record the tamarins’ vocal responses to broadcast stimuli, a calibrated Earthworks M30 omnidirectional microphone was connected to an Edirol R40 Pro data acquisition system (DAQ) sampling at 48 kHz, with phantom power set to ‘on’ and maximum gain. The microphone was calibrated using a G.R.A.S. type 42 AB ½-inch microphone calibrator (114 dB re 20 µPa) applied to the microphone while varying the gain on the recorder. The Edirol DAQ was calibrated for different gain levels using a signal generator (HP 33120A) and digital oscilloscope (Tektronix DPO 2404) with the DAQ phantom power set to ‘on’. These calibrations were used to calculate the recording sensitivity of the microphone for use in measuring absolute sound levels in the test chamber.

The animal’s position during the trial was within a 0.6 x 0.3 x 0.6 m test cage with the bottom and front sides made of steel mesh and the others of Plexiglas.
This served to keep the focal animal on the front of the cage and facing the microphone for the majority of the experiment. Noise within the test cage was recorded before the beginning of data collection; levels were consistent with noise recorded at the microphone position during data collection.
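The calibration step described above can be illustrated with a short Python sketch that converts a recording of the 114 dB re 20 µPa calibrator tone into a scale factor from digital units to pascals, and then uses that factor to express any other clip as a sound pressure level. The names `pa_per_unit` and `spl_db` are hypothetical, the tone is synthetic, and the sketch assumes the same recorder gain for the calibration and the measurement; it does not reproduce the separate gain calibration with the signal generator and oscilloscope.

```python
import math

P_REF_PA = 20e-6              # in-air reference pressure (20 µPa)
CALIBRATOR_LEVEL_DB = 114.0   # calibrator level stated in the text


def rms(samples):
    """Root-mean-square of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def calibrator_pressure_pa(level_db=CALIBRATOR_LEVEL_DB):
    """RMS pressure (Pa) corresponding to a level in dB re 20 µPa."""
    return P_REF_PA * 10 ** (level_db / 20.0)


def pa_per_unit(calibration_samples, level_db=CALIBRATOR_LEVEL_DB):
    """Scale factor (Pa per digital unit) from a recording of the calibrator tone."""
    return calibrator_pressure_pa(level_db) / rms(calibration_samples)


def spl_db(samples, scale_pa_per_unit):
    """Sound pressure level (dB re 20 µPa) of a recorded clip."""
    return 20.0 * math.log10(rms(samples) * scale_pa_per_unit / P_REF_PA)


if __name__ == "__main__":
    # Synthetic 1 kHz calibration tone sampled at 48 kHz (assumed, for illustration).
    tone = [0.5 * math.sin(2 * math.pi * k / 48) for k in range(4800)]
    scale = pa_per_unit(tone)
    quieter = [0.5 * x for x in tone]  # half the amplitude, about 6 dB lower
    print(f"Calibrator pressure: {calibrator_pressure_pa():.2f} Pa rms")
    print(f"Level of half-amplitude clip: {spl_db(quieter, scale):.1f} dB re 20 µPa")
```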

Sessions

To begin each session, a focal animal was lured from its home cage into a transport box by a research assistant using a raisin or mini marshmallow. The animal was then moved to the testing area (> 100 m from the colony room). While the focal animal was being moved from the colony room to the testing area, the noise playback for the randomly assigned trial type (control or treatment) was started. In all control trials, the noise condition was a silent playback, while noise level and bandwidth varied between treatment trials (treatments A – F, Figure 6-2). The transport box was removed from the cart outside the staging area and brought into the soundproof testing room, where the focal animal was allowed to enter the test cage. If necessary, a marshmallow was used to lure the focal animal into the test cage (Figure 6-5). Recording began as the door to the soundproof room was closed.

Figure 6-5. A tamarin in the test cage after a test session. Note the transport box to the right of the test cage and microphone and webcam at the bottom of the image.

Each trial began with a two-minute acclimation period to allow the animal to adjust to the test situation and to account for any possible lag between noise exposure and the onset of vocal modifications. During the acclimation period, no call elicitation stimuli were played to the focal animal. Vocalizations produced by the focal animal during this period were recorded for use as exemplars of spontaneous vocalizations and controls for any potential differences between spontaneous and elicited calls (Miller et al., 2009).

The test period began immediately after the two-minute acclimation period was complete. If the focal animal had been spontaneously producing CLC vocalizations during the acclimation period, which was common for several individuals, the experimenter waited 60 seconds to see if the animal would continue vocalizing spontaneously. If no CLCs were produced within that time, then call elicitation stimuli were played in random order at approximately 30 second intervals to elicit response calling from the focal animal. If the focal animal did not produce spontaneous CLCs during the acclimation period, elicitation stimuli playbacks began within 15 s of the end of the acclimation period, and were repeated in random order when the focal animal was silent for ≥ 30 seconds for the duration of the test period. The test period continued until the animal had produced 5 high SNR CLCs, or ≥ 10 chirps, or until the total noise exposure (including acclimation period) reached 12 minutes, at which point the trial ended. If the animal did not produce enough calls, the data were excluded from analysis and the trial was re-run on another day. At the end of a trial, the animal was removed from the test cage (using a marshmallow if necessary) and returned to its home cage for a 15 – 60 minute rest period between trials. The test cage was sanitized with Quatricide® PV after each trial.

During trials, the experimenter was positioned at the experimenter station, listening to the audio recordings, noting all vocalizations produced by the focal animal, and playing elicitation stimuli at appropriate intervals. The experimenter and a research assistant also monitored the streaming video for any indications of agitated behavior by the focal animal. If the research assistant judged that the focal animal was exhibiting an unusual amount of agitated behavior (alarm chirps, twirling, leaping, digging at the sides of the cage, etc.), the trial and session were aborted and re-run on another day. The second trial of a session was conducted according to the same protocol with the opposite trial type (if the first trial of a session was control, the second trial was a treatment, and vice versa). After the end of the second trial, the focal animal was returned to its home cage and not tested again on that day.
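The trial-ending criteria described above can be written as a small decision rule. The Python sketch below is only illustrative: the thresholds come from the text, but the function names are hypothetical, and treating "enough calls" as meeting either call-count threshold is an assumption, since the text does not define the exclusion criterion more precisely.

```python
MIN_HIGH_SNR_CLCS = 5        # from the text
MIN_CHIRPS = 10              # from the text
MAX_EXPOSURE_S = 12 * 60     # total noise exposure, including acclimation


def trial_should_end(n_high_snr_clcs, n_chirps, exposure_s):
    """True once any of the three stopping criteria has been met."""
    return (n_high_snr_clcs >= MIN_HIGH_SNR_CLCS
            or n_chirps >= MIN_CHIRPS
            or exposure_s >= MAX_EXPOSURE_S)


def trial_is_usable(n_high_snr_clcs, n_chirps):
    """Assumed reading of 'enough calls': either call-count threshold was reached."""
    return n_high_snr_clcs >= MIN_HIGH_SNR_CLCS or n_chirps >= MIN_CHIRPS


if __name__ == "__main__":
    # A trial that timed out with too few calls would be re-run on another day.
    print(trial_should_end(2, 4, 720), trial_is_usable(2, 4))
```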

Data Processing

Acoustic data were processed for analysis using high-pass filtering and noise-reduction processes in Adobe Audition. All audio recordings were high-pass filtered (100 Hz cutoff frequency, 8192 point Blackman window, Adobe Audition 5.5) to reduce low-frequency noise from ventilation in the test room and from sources outside the building that could be heard in the room (bus passages, etc.). Noise reduction processing was used only to give the experimenter an unobstructed view of all vocalizations produced during the trials (Figure 6-6), and not for analysis of vocalization characteristics. Further analyses were done using files that had been high-pass filtered only, with no noise reduction.

Figure 6-6. Spectrogram of CLC produced during treatment A (a) before and (b) after noise reduction processing in Adobe Audition. (1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution) Note that the highest formant of the call has been removed by the noise reduction processing. All calls that were selected in the noise-reduced files were verified against the original recordings to ensure accurate measures of call characteristics.

Data analyses

Source level of vocalizations

The source level (SL) of a vocalization can be calculated as

$$SL = RL + TL$$

where RL is the level at the receiver, and TL is the transmission loss of the sound through the environment between source and receiver. In this experiment the microphone (receiver) was located within 0.7 m of the sound source, and TL was therefore not included in my calculations. In order to determine the source level of each vocalization, I subtracted the mean-squared amplitude of noise in the test chamber from the mean-squared amplitude of the vocalization, took the square root of this value, and transformed the data into decibels. Noise amplitude was calculated from a 5 s clip with no vocalizations and no cage noise taken from the same trial as the vocalization of interest. Vocalization source level was transformed into decibels using the formula

$$SL = 20 \log_{10}\left(\frac{\sqrt{VA - NA}}{p_{ref}}\right)$$

where VA and NA represent mean-squared vocalization and mean-squared noise amplitudes, respectively, and $p_{ref}$ is the in-air reference pressure of 20 µPa. The average call source level for each trial was generated by averaging the calculated mean-squared source levels for all calls in a trial and then transforming into dB using the formula above.
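The noise-corrected source level calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the analysis code used in the study: the function names are hypothetical, pressures are assumed to be already calibrated in pascals, and transmission loss is ignored as in the text.

```python
import math

P_REF_PA = 20e-6  # in-air reference pressure (20 µPa)


def mean_square(samples):
    """Mean-squared amplitude of a calibrated pressure clip (Pa^2)."""
    return sum(x * x for x in samples) / len(samples)


def source_level_db(vocal_ms, noise_ms):
    """Noise-corrected level in dB re 20 µPa from mean-squared amplitudes.

    vocal_ms: mean-squared amplitude of the clip containing the call (VA).
    noise_ms: mean-squared amplitude of a call-free noise clip (NA).
    """
    signal_ms = vocal_ms - noise_ms
    if signal_ms <= 0:
        raise ValueError("call is not above the noise floor")
    return 20.0 * math.log10(math.sqrt(signal_ms) / P_REF_PA)


def trial_mean_source_level_db(call_ms_values, noise_ms):
    """Average the noise-corrected mean-squared values, then convert to dB."""
    corrected = [v - noise_ms for v in call_ms_values]
    mean_ms = sum(corrected) / len(corrected)
    return 10.0 * math.log10(mean_ms / P_REF_PA ** 2)


if __name__ == "__main__":
    noise_ms = 0.002 ** 2                    # 2 mPa rms noise (illustrative value)
    call_ms = 0.02 ** 2 + noise_ms           # 20 mPa rms call plus uncorrelated noise
    print(f"{source_level_db(call_ms, noise_ms):.1f} dB re 20 µPa")
```

Averaging in the mean-squared (power) domain before converting to decibels, as the text describes, avoids the downward bias that comes from averaging decibel values directly.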

Spectral and temporal modifications

Vocalizations were logged from high-pass filtered recordings using Raven 1.4 Pro (Bioacoustics Research Program, 2011). Recordings were visually and aurally browsed for vocalizations and all chirps and CLCs were manually selected. Measured parameters were duration and start, minimum, maximum, and peak frequencies. Duration was determined from the waveform view of the entire call, zoomed in to determine when the sound began and ended. Spectrogram (1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution) and spectrum slice views were used simultaneously to determine the minimum and maximum frequencies. Peak frequencies were determined automatically from the spectrum view, while start frequency was determined manually using the cursor.

Measurements of acoustic characteristics were conducted differently for the two call types. For CLCs, spectral and temporal measurements were taken from the call as a whole (Figure 6-7a) and from each individual syllable. Measured parameters included minimum (lowest), maximum (highest visible), and peak frequencies, duration, and the number of whistle and chirp syllables. The peak frequency of the lowest harmonic (“fundamental”) was also measured for all syllables. Start frequency was not measured for CLC vocalizations. Duration of whole CLC vocalizations included inter-syllable pauses, which were excluded when the individual syllable durations were measured. For chirps, minimum, maximum, start, and peak frequencies were measured from the fundamental frequency of the call (Figure 6-7b), rather than including all formants. This allowed for detection of possible changes to the fundamental frequency and evaluation of the actual start frequencies of the calls.


Figure 6-7. Spectrograms (1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution) of CLC (a) and chirp (b) vocalizations with measured frequency characteristics indicated. All measurements of CLCs were made on the call as a whole and the individual syllables within the call (not shown). This CLC consists of one chirp and four whistle syllables. Measurements of chirps were made from the fundamental frequency. Note that peak frequency measurements for all syllables, fundamental frequencies, and whole calls were taken automatically from the spectrum view in Raven 1.4 Pro (not shown).

Spectral tilt

Clips of all vocalizations measured in Raven were exported for spectral tilt and amplitude analyses. For CLCs, clips were made of the entire call and each syllable individually. Clips were saved as 16-bit WAV files that were imported into Matlab ® (MathWorks, 2007) for analysis. Spectral tilt was measured by taking the ratio of the energy in the high-frequency formants of the vocalization to the energy in the call’s fundamental frequency. Clips of call syllables were imported into Matlab, which was used to generate an averaged power spectral density for the signal and subtract the noise spectrum. The built-in Matlab function “findpeaks” was then used to find the peaks in the call’s power spectrum (the harmonics of the whistle) and compare the values for all peaks to the value for the fundamental frequency of the vocalization. These ratios were then compared between treatments and individuals.
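The peak-ratio step can be illustrated with a short Python sketch. It is not the Matlab code used in the study: a simple strict local-maximum search stands in for Matlab's "findpeaks", the lowest-frequency peak is assumed to be the fundamental, and the function names and the `min_power` threshold are hypothetical.

```python
def find_peaks(psd):
    """Indices of strict local maxima among the interior bins of a spectrum."""
    return [i for i in range(1, len(psd) - 1)
            if psd[i] > psd[i - 1] and psd[i] > psd[i + 1]]


def harmonic_ratios(call_psd, noise_psd, min_power=0.0):
    """Ratio of each harmonic peak's power to the fundamental peak's power.

    call_psd and noise_psd are equal-length power spectral densities on the
    same frequency grid. The noise spectrum is subtracted first (clipped at
    zero), and the lowest-frequency peak is assumed to be the fundamental.
    """
    cleaned = [max(c - n, 0.0) for c, n in zip(call_psd, noise_psd)]
    peaks = [i for i in find_peaks(cleaned) if cleaned[i] > min_power]
    if not peaks:
        raise ValueError("no spectral peaks found above min_power")
    fundamental = cleaned[peaks[0]]
    return [cleaned[i] / fundamental for i in peaks]
```

The first ratio is always 1.0 (the fundamental against itself). Ratios above 1.0 for higher harmonics indicate that energy is concentrated above the fundamental, which is the pattern the spectral tilt analysis is looking for.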

Statistical analyses

To minimize pseudoreplication (Hurlbert, 1984), call parameters were averaged for all vocalizations produced by an individual during a trial, giving a total of 12 data points (6 treatment (A-F) and 6 control (A-F)) per individual. While overall noise exposure was intended to be matched over the two bandwidth levels, this did not work as planned due to calibration errors that caused the overall amplitude for the broadband stimuli (treatments D-F) to be approximately 5 dB lower than the “matched” narrowband (treatments A-C) stimuli. Noise level was therefore treated as a continuous variable in all analyses. The duration of spontaneous and elicited CLCs was analyzed to determine whether this factor affected our findings of no change in CLC duration. Based on observation of the focal animals’ behavior in response to elicitation stimulus playbacks, an “elicited” CLC was defined as a CLC produced by the focal animal within 20 s after an elicitation playback. Spontaneous and elicited CLCs produced during the control trials for all treatment types were compared to evaluate potential differences in call structure between spontaneous and elicited calls (Ruiz-Miranda et al., 2002). This was analyzed in Microsoft Excel with a two-factor ANOVA using individual and elicited vs. spontaneous as fixed effects. Statistical analyses were performed in SAS version 9.2. Analysis of covariance (ANCOVA) models were generated to determine the effects and interactions of noise level and bandwidth on source level, peak frequency, minimum frequency, duration, and spectral tilt of both call types. Focal animal identity was treated as a random effect in all models. Onset order of vocal modifications was analyzed as the presence or absence of a given modification during each noise stimulus for each subject. Due to the high variability within and between subjects, statistical analyses were not performed on these data.
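The per-trial averaging and the 20 s definition of an elicited call can be sketched as follows. This is an illustration only: the data layout (tuples of individual, trial, and value), the function names, and the choice to treat the 20 s boundary as inclusive are assumptions rather than details taken from the study.

```python
from collections import defaultdict

ELICITED_WINDOW_S = 20.0  # window after an elicitation playback, in seconds


def is_elicited(call_time, playback_times, window=ELICITED_WINDOW_S):
    """True if the call begins within `window` seconds after any playback."""
    return any(0.0 <= call_time - t <= window for t in playback_times)


def average_by_trial(calls):
    """Average one call parameter per (individual, trial) pair.

    `calls` is an iterable of (individual, trial, value) tuples. Collapsing
    all calls in a trial to a single mean gives one data point per
    individual per trial, which limits pseudoreplication.
    """
    groups = defaultdict(list)
    for individual, trial, value in calls:
        groups[(individual, trial)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}
```

With six treatment and six control trials per subject, `average_by_trial` would return twelve entries per individual, matching the twelve data points described above.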

Results

Of the five individuals tested in this experiment, three (Bart, Jerry, and Mulva) reliably produced combination long calls (CLCs) spontaneously or in response to elicitation stimuli, and two (Milhouse and Susan) produced chirp vocalizations. Only one subject (Bart) produced both chirps and CLCs, and he did not chirp in every trial. Vocalization type appeared to affect the types of vocal modifications observed, with different suites of modifications detected in CLCs and chirps. Inter- and intra-individual variability was high in all trials. Ambient noise during control sessions was approximately 42 dB re 20 µPa, and the highest exposure level was approximately 66 dB re 20 µPa during treatment A (narrowband, high amplitude) (Table 6-1).

Table 6-1. Noise levels [dB re 20 µPa rms] measured during all trial sessions, averaged over all subjects (N=5). Noise levels over the recording bandwidth (0 – 24 kHz) were similar but unequal between the high (A/D), medium (B/E), and low (C/F) noise amplitude treatments due to difficulties calibrating the playback equipment.

             Narrowband (5 kHz)        Broadband (10 kHz)
             A       B       C         D       E       F
Control      42.5    42.4    44.4      42.4    42.4    42.4
Treatment    64.3    54.1    46.7      59.2    51.3    44.1

Spontaneous vs. antiphonal calls

No differences were found between spontaneous (N=31) and antiphonal (N=43) CLCs with respect to whole-call duration (F=1.00 p=0.42) and peak frequency (F=1.24, p=0.38).

The Lombard Effect

The Lombard effect was observed in both call types. For combination long calls (CLCs), a repeated-measures ANCOVA model revealed a main effect of noise level (F(1,30)= 33.64, p < 0.0001) and a marginally significant interaction of noise level and bandwidth (F(1,30)= 3.13, p=0.09), indicating that the subjects increased vocal amplitude slightly more during broadband than narrow-band noise, despite broadband noise stimuli having slightly lower energy than narrowband stimuli (Figure 6-8). There was no main effect of noise bandwidth (F(1,30) = 1.86, p = 0.18). An increase in the source level of individual syllables (whistles) was also observed, with a main effect of noise level (F(1,30)= 11.21, p < 0.01), no effect of bandwidth, and no interaction.


Figure 6-8. Average CLC source levels vs. noise level. Colors represent individuals (Mulva – red, Bart – blue, Jerry – black), and symbols represent treatment types (Narrowband loud, medium, quiet: A, B, C; Broadband loud, medium, quiet: D, E, F). Note that call source level never decreases between control (42 dB re 20 µPa noise level) and treatment trials, and that inter- and intra-individual variability is high.

Amplitude of chirp vocalizations was also significantly higher during increased noise (F(1,26) = 5.89, p = 0.023), but there was no observed effect of bandwidth (F(1,26) = 0.00, p = 0.996) and no interaction (F(1,26) = 0.03, p = 0.85). The overall magnitude of the effect was smaller for chirps than for CLCs, due to the higher baseline amplitude for chirp vocalizations (Table 6-2).

Table 6-2. Average call amplitudes for both call types during control (‘Base VL’) and treatment (‘Trt VL’) periods. Chirps had higher baseline amplitudes, but maximum call amplitudes were similar for both vocalization types.

                    A       B       C       D       E       F
CLCs     Base VL    56.4    60.5    61.2    60.8    61.7    59.9
         Trt VL     68.1    65.4    64.3    72.1    68.5    65.1
         Δ VL       11.8    4.9     3.1     11.3    6.7     5.2
Chirps   Base VL    64.0    67.1    65.6    66.1    67.8    64.9
         Trt VL     68.1    66.5    67.7    68.6    69.8    65.5
         Δ VL       4.1     -0.5    2.1     2.5     2.0     0.6

Temporal Modifications

No changes to the duration of CLCs were detected in either whole CLC vocalizations (F(1,30); NL F=2.42 p=0.13, BW F=0.06 p=0.80, NL*BW F=0.02 p=0.88; Figure 6-9) or whistle syllables (F(1,30); NL F=0.41 p=0.52, BW F=0.41 p=0.43, NL*BW F=0.42 p=0.52).

Figure 6-9. Duration of whole CLC vocalizations in each trial type (Narrowband loud, medium, quiet: A, B, C; Broadband loud, medium, quiet: D, E, F) averaged over all subjects. Light grey bars indicate control averages, dark grey represent treatment trials. Error bars indicate standard deviation.

In contrast to CLCs, duration of chirps increased significantly with increased noise amplitude (F(1,26) = 9.41 p < 0.01) (Figure 6-10). There was no interaction or main effect of bandwidth on this call type (F(1,26) =0.05 p=0.83 and F(1,26) =0.12 p=0.73, respectively).


Figure 6-10. Duration of chirp vocalizations in each trial type (Narrowband loud, medium, quiet: A, B, C; Broadband loud, medium, quiet: D, E, F) averaged over all subjects. Light grey bars indicate control averages, dark grey represent treatment trials. Error bars indicate standard deviation.

Spectral Modifications

CLCs

Noise level and bandwidth both affected the spectral properties of CLCs. Significant increases were found in the peak frequency of the lowest (fundamental) harmonic of whole CLCs (F(1,30) = 4.99 p=0.03). While the peak frequency of the fundamental harmonic increased, the minimum frequency of the fundamental harmonic decreased (Figure 6-11), effectively increasing the bandwidth of the fundamental frequency. While there was no significant effect of noise level (NL F(1,30) = 2.18 p = 0.15) on minimum frequency, a significant interaction of noise level and bandwidth (NL*BW F(1,30) = 8.69 p < 0.01) and a main effect of bandwidth (BW F(1,30) = 8.03 p<0.01) were detected. Greater decreases in minimum frequency were found in response to broadband noise than to narrowband noise.


Figure 6-11. Minimum frequency of whole CLCs averaged over all subjects in each trial type (Narrowband loud, medium, quiet: A, B, C; Broadband loud, medium, quiet: D, E, F). Light grey bars indicate control trials, dark grey represent treatment trials. Error bars indicate standard deviations.

There was also a significant effect of noise level on peak frequency of whole CLCs (F(1,30) = 4.80 p=0.04) and of whistle syllables (F(1,30) = 21.07 p<0.01), with no effects of noise bandwidth and no interactions (Figure 6-12).

Figure 6-12. Peak frequency of whole CLCs averaged over all subjects as a function of noise level. Error bars indicate standard deviations.

Changes to peak frequency of whistles cannot be entirely accounted for by changes to the fundamental frequency of CLCs. Instead, changes in peak frequency appear to be the result of a combination of shifts in fundamental frequency and changes to spectral tilt (Figure 6-13). Statistical analysis (ANCOVA models) of the ratio of power spectral density in F3 to F1 (third harmonic to fundamental frequency) indicated a significant shift as an effect of noise level (NL F(1,30) = 19.98 p < 0.01) with no effect of bandwidth (BW F(1,30) = 2.37 p = 0.13) and no interaction (NL*BW F(1,30) = 2.88 p = 0.10).

Figure 6-13. Representative CLCs produced by Mulva during a) control and b) treatment A trials demonstrating changes to spectral tilt. All whistles from a) have strong fundamental frequencies and maximum energy in the 2nd harmonic, while in b) the first whistle has a very faint fundamental frequency at approximately 2 kHz, and peak frequencies for all whistles occur in the 4th harmonics. Reduced energy in the fundamental frequency is also apparent in the second and third whistles. Spectrogram parameters: 1024 point Hamming window, 75% overlap, 11.7 Hz frequency resolution.

Spectral tilt changes were apparent for all animals and were strongest in the two loud treatments (A and D; Figure 6-14). In control sessions, the second harmonic generally contained the greatest amount of energy, with a ratio of 2 – 3 times the energy in the fundamental frequency. During treatment sessions, the second harmonic was still often the peak frequency, but more energy was also found up to the fifth and sixth harmonics. In trials with low noise levels (treatments C and F), there was virtually no change in spectral tilt between control and treatment sessions.


Figure 6-14. Changes to spectral tilt of CLC whistles during different noise level/bandwidth combinations (treatments A – F), represented as the ratio of energy in the first eight formants (harmonics) to energy in the first formant (fundamental frequency) averaged within and between subjects. Solid lines represent averages from control trials, dashed lines represent treatment trials.

Chirps

Spectral modification in chirp vocalizations included changes to several aspects of the fundamental frequency of the vocalization: significant increases in peak (NL F(1,26) = 6.17 p = 0.02; BW F(1,26) = 0.80 p = 0.38; NL*BW F(1,26) = 0.61 p = 0.44) and maximum (NL F(1,26) = 5.83 p = 0.02; BW F(1,26) = 0.68 p = 0.42; NL*BW F(1,26) = 0.55 p = 0.47) frequencies were detected, along with a marginally significant increase in start frequency (NL F(1,26) = 3.9 p = 0.06; BW F(1,26) = 0.78 p = 0.38; NL*BW F(1,26) = 0.64 p = 0.43) and no change in minimum frequency (NL F(1,26) = 2.95 p = 0.1; BW F(1,26) = 0.00 p = 0.99; NL*BW F(1,26) = 0.02 p = 0.88). Spectral tilt of chirp vocalizations did not change with noise level (NL F(1,26) = 2.76 p = 0.11) or bandwidth (BW F(1,26) = 0.42 p = 0.52; NL*BW F(1,26) = 0.57 p = 0.46; see Figure 6-15).


Figure 6-15. Changes to spectral tilt of non-CLC chirps during noise level/bandwidth combinations (treatments A – F), represented as the ratio of the amount of energy in the first three formants (harmonics) to energy in the first formant (fundamental frequency) and averaged within and between subjects. Solid lines represent averages from control trials, while dashed lines represent treatment trials.

Onset order of modifications

Onset order of vocal modifications was difficult to analyze given the high inter- and intra-individual variability. A presence/absence analysis of the average for each subject in each treatment (Table 6-3) revealed that not all subjects changed their vocalizations in the same way, indicating that modifications do not always occur simultaneously.
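A presence/absence tabulation of this kind can be sketched in Python. The criterion used to score a modification as present is not stated in the text, so the comparison of each subject's treatment average against the matching control average, the `min_change` threshold, and the function names below are all assumptions made for illustration.

```python
def classify_modification(control_mean, treatment_mean, min_change=0.0):
    """Score one parameter for one subject in one treatment.

    Returns "increase" or "decrease" when the treatment average differs from
    the control average by more than `min_change` (a hypothetical threshold),
    and "absent" otherwise.
    """
    diff = treatment_mean - control_mean
    if diff > min_change:
        return "increase"
    if diff < -min_change:
        return "decrease"
    return "absent"


def presence_table(averages, min_change=0.0):
    """Map {(subject, treatment): (control_mean, treatment_mean)} to scores."""
    return {key: classify_modification(c, t, min_change)
            for key, (c, t) in averages.items()}
```

Scoring each parameter separately for each subject makes it possible to see whether two modifications, such as source level and spectral tilt, appear together in the same treatment or independently.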


Table 6-3. Presence/absence of vocal modifications by call and treatment types. Letters indicate the presence of a modification in that subject’s calls during each treatment type. Italic initials indicate a decrease in a given parameter for that subject. Mu = Mulva, J = Jerry, B = Bart, Mh = Milhouse, S = Susan. * indicates that Bart produced no chirps during these treatment types, and is not included in the analysis.

                      A*       B        C        D        E        F*
CLCs    SL            Mu,J,B   Mu       Mu,J     Mu,J,B   Mu,J,B   Mu,J,B
        Duration      Mu,J,B   Mu,J,B   J        Mu,J     B        Mu,J,B
        Fundamental   J,B      Mu,J     Mu,B     Mu,J,B   Mu,B     Mu,B
        Tilt          Mu,J,B   Mu,J,B   Mu,J,B   Mu,J,B   Mu,J,B   Mu,B,J
Chirps  SL            Mh,S     B,Mh,S   B,Mh,S   B,Mh,S   B,Mh,S   Mh,S
        Duration      S        B,S      -        B,S      Mh,B     Mh,S
        Peak          Mh,S     Mh,B     B,S      Mh,B,S   Mh,B,S   Mh,S
        Max           Mh,S     Mh,B     S        Mh,B,S   Mh,B     Mh
        Start         S        Mh,B,S   Mh,B     Mh,B,S   B,S      Mh,S
        Min           Mh,S     Mh,B,S   B,S      Mh,S     Mh,B,S   Mh,S
        Tilt          -*       Mh       Mh,B     Mh,S     Mh,S     S*

Discussion

Previous studies have indicated that two species of non-human primates (cotton-top tamarins and common marmosets) increase the amplitude and duration of their long-distance contact vocalizations during increased noise (Brumm et al., 2004; Egnor and Hauser, 2006; Egnor et al., 2006; Egnor et al., 2007). The current study expanded on these findings by targeting masking noise of different amplitude and bandwidth combinations at perceptually important features of the tamarins’ combination long calls (CLCs), and opportunistically recording changes to chirp vocalizations. While there was some frequency overlap between the chirps and the broadband noise stimuli (maximum frequency: 10 kHz), the narrowband noise (maximum: 6.5 kHz) did not overlap with chirps at all. We also investigated spectral tilt, a phenomenon that has previously been observed in humans, but, to the best of our knowledge, never in a nonhuman primate species. Vocal plasticity in the acoustic structure of noise-induced vocal modifications was observed from all subjects and both call types and included the first evidence of consistent changes to the spectral content of non-human primate vocalizations during increased noise. Both call types were strongly influenced by noise amplitude. Spectral overlap, which occurred only during CLC production, also exerted some influence. Modifications also differed by vocalization type: in CLCs, the Lombard effect and changes to several spectral components were evident, while a previously documented change in duration (Egnor and Hauser, 2006) was absent. In non-CLC chirps, amplitude, chirp durations, and fundamental frequencies all increased significantly with noise amplitude.

Combination long calls (CLCs)

The three subjects producing CLCs modified their vocalizations in response to noise amplitude and noise bandwidth. Subjects changed their vocalizations more dramatically during broadband noise than during narrowband noise of higher amplitudes, apparently reacting to the degree of spectral overlap between the noise and the entire spectral range of their CLCs rather than masking of the second harmonic alone as was predicted. A greater degree of spectral overlap drove increases in the peak fundamental frequency, decreases in the minimum frequency of the whole call, and a slightly greater magnitude Lombard effect despite lower overall noise levels. During narrowband noise, which had a lower degree of spectral overlap, there were no changes to the minimum frequency of CLCs, and only moderate increases in the peak fundamental frequency. Noise amplitude appeared to influence some vocal modifications independently of spectral overlap. Source levels and spectral tilt of whole CLCs and whistle syllables all increased as a result of increases in noise amplitude, with no main effects of noise bandwidth. Shifts in spectral tilt of whistles appeared to affect increases in the peak frequency of whole CLC vocalizations. In humans, studies of spectral tilt during noise have postulated that changes are a byproduct of the Lombard effect and deformation of the vocal apparatus associated with increased air pressure in the lungs and vocal cords (Liénard and Di Benedetto, 1999b; Fitch, 2000; Jessen et al., 2005). However, it is also possible that the Lombard effect and changes to spectral tilt are linked psychophysically due to selection for improved communication as an effect of employing both modifications simultaneously. Future work should attempt to more fully describe the linkage between these two modifications, particularly for non-human species. 
Onset order of noise-induced vocal modifications to CLCs was not simultaneous within or between subjects: modifications did not occur in any recognizable pattern, and there appeared to be no sustained linkage between the presence of any two modifications. While the magnitude of changes may be correlated, if there is an anatomical linkage between different types of vocal modifications, there should be a recognizable pattern in their occurrence, which was not observed during this experiment. Results from the current study conflict with previous reports of vocal noise compensation in cotton-top tamarins. Egnor and Hauser (2006) described an increase in CLC duration during increased noise, but this experiment found no changes to the duration of either whole CLCs or of whistle syllables. One possible explanation for this discrepancy is that the small sample size and high within- and between-subject variability masked any changes to call duration during increased noise that may have been evident with a larger or more consistent group of subjects. Alternatively, the previous study analyzed calls that were spontaneously produced by animals in isolation (Egnor and Hauser, 2006), whereas this study analyzed elicited as well as spontaneous vocalizations. The Penn State tamarins do not always spontaneously produce CLCs, necessitating the use of a call elicitation stimulus. Differences in spontaneous and elicited contact calls have been demonstrated in other non-human primate species (Ruiz-Miranda et al., 2002), and averaging all calls within a trial may have masked subtle differences in elicited and spontaneous CLCs. In this experiment, approximately half of the analyzed CLCs from each noise type were elicited and half spontaneous. A comparison of spontaneous and elicited calls during trials indicated that it is unlikely that this factor was responsible for the patterns of vocal modification seen here. The origins of the animals involved in each study may also have impacted the results. 
Other non-human primate species have been shown to modify their call characteristics when integrated into new social contexts, presumably leading to group-specific call structures (Snowdon and Elowson, 1999; Rukstalis et al., 2003). Animals from colonies with different group-specific vocalization structures may be differently restricted in the ways in which they can modify their calls while maintaining group-specific features (Tyack, 2008a). The previous experiment also reported a lack of consistent change to the average fundamental frequency of CLCs during increased noise. This study measured the minimum and peak components of the fundamental frequency, rather than the average, and found that while the minimum frequencies tended to decrease, peak fundamental frequency increased. These opposing changes could potentially account for the inconsistent changes observed by Egnor and Hauser (2006).

Chirps

Multiple characteristics of chirp vocalizations changed in response to noise amplitude, but there was no change in call structure in response to noise bandwidth. This difference is probably due to the minimal degree of spectral overlap between chirps and noise stimuli. Noise amplitude strongly influenced chirp amplitude, duration, and maximum and peak fundamental frequencies; there was also some evidence of changes to the start and minimum frequencies, but these changes were not significant. Interestingly, while the Lombard effect was observed there were no changes to spectral tilt in this call type, indicating that while vocal effort and spectral tilt may share a biomechanical linkage, there are likely to be other factors influencing this phenomenon in the tamarins. As with CLCs, the onset order of vocal modifications in chirps was non-simultaneous. Subjects modified different aspects of calls during the six playback treatments in an inconsistent manner, sometimes changing the same parameter in opposite directions with no apparent pattern (e.g. Bart – peak frequency). In addition, the Lombard effect and spectral tilt were not observed simultaneously in this call type, suggesting that the tamarins are capable of modifying at least some call parameters individually. Both the types of noise-induced vocal modifications and the baseline parameters of chirps differed from those seen in CLCs. Such differences in the acoustic structure of vocalizations may depend on a variety of factors, particularly the signaler’s motivation for calling and the behavioral function of the sound (Lane and Tranel, 1971; Tressler and Smotherman, 2009; Hotchkin and Parks, In press). There are currently few descriptions of the behavioral contexts in which chirps are produced (McConnell and Snowdon, 1986; Bauers and Snowdon, 1990). 
Cleveland and Snowdon (1982) described the behavioral function of this call type for wild tamarins as either “post-food” (type C chirp), or “general alarm” (type D chirp), but the described spectral and temporal ranges of type C and D chirps overlap substantially. The difficulty of interpreting the behavioral function of this call type increases due to the lack of studies of the behavioral function of chirps in captive contexts. While the call types still presumably serve a communicative function, it is possible that the tamarins used in this experiment produced chirps to communicate with a conspecific they assumed was at very close range, or as a way to self-soothe. Either of these potential functions could explain why the magnitude of the Lombard effect in chirps was lower than that for CLCs; future studies should concentrate on whether the behavioral functions of call types affect the intensity of noise-induced vocal modifications.

Confounding factors

An experimental design factor that could have affected the outcome of the experiment was the noise bandwidth. “Broad” and “narrow” are relative terms, and the 5 kHz bandwidth of the narrowband stimuli, which is the narrowest bandwidth shown to evoke a response from free-tailed bats (Tressler and Smotherman, 2009), was likely too broad to adequately mask only the targeted harmonic. The broadband stimuli, which were designed to mask the entire CLC vocalization, may have actually been more effective at masking the second harmonic than narrowband noise of slightly higher amplitudes. One factor that was not included in my analyses, but which may have affected the results, was the communicative motivation and emotional state of the subjects during trials (Lane and Tranel, 1971; Briefer, 2012). All trials in which an animal exhibited obvious agitation were aborted, but subjects likely varied in their degree of hunger, desire for social interaction, and other emotional states during data collection, possibly influencing their vocal behavior and increasing within- and between-subject variability. For chirp vocalizations, a second potential confounding factor was the directionality of the sound. Many mammalian vocalizations are directional (Egnor and Hauser, 2006; Tressler and Smotherman, 2009; Holt et al., 2010), with a beam pattern that radiates from the sound source (the animal’s mouth). During trials in which animals produced chirps, the subjects (Milhouse and Susan) had a tendency to swivel their heads and vocalize while facing away from the microphone. Due to technical problems with the video recordings, it was not possible to exclude chirp vocalizations produced in this way, which may have affected the measurements of chirp amplitude and potentially underestimated the magnitude of the Lombard effect in chirps. 
Finally, the sample size for this experiment was too small to account for possible differences in vocal behavior of different age and sex classes (Egan, 1972; Ternström et al., 2006a) and the hearing capabilities of the Penn State tamarins are unknown.

Implications for vocal noise compensation

Context dependency of vocal noise compensation has previously been demonstrated in humans, bats, and manatees (Egan, 1972; Miksis-Olds and Tyack, 2009; Tressler and Smotherman, 2009; Garnier et al., 2010); this study adds evidence for context-dependent compensation in a non-human primate species. This experiment demonstrated that cotton-top tamarins also use a flexible modification paradigm to change the acoustic structure of CLCs and chirps in response to specific noise parameters. The tested parameters are noise characteristics that are likely to vary within and between noise sources in both captive and wild habitats – weather events, biological sources, and anthropogenic noise inputs can all vary in bandwidth and intensity. Signalers capable of adapting their vocalizations to effectively communicate in all types of noise may therefore have a selective advantage over animals using an all-or-nothing response paradigm. Onset order of noise-induced vocal modifications has not been explicitly examined for any species, and the assumption of simultaneous onset for mammals has been based solely on evidence from human speech (Tressler and Smotherman, 2009; Brumm and Zollinger, 2011; Cardoso and Atwell, 2011). Although the small sample size and high variability within and between subjects in this experiment precluded statistical analysis of this finding, our results provide some evidence that cotton-top tamarins are able to employ modifications non-simultaneously. However, human subjects have also shown variability in vocal changes, particularly between subjects (Lane and Tranel, 1971; Tartter et al., 1993a), and results from the current experiment should be interpreted cautiously. If vocal modifications do not always occur simultaneously, signalers must select which modifications to employ in a given noise environment. While it is possible that the tamarins consciously select which changes they will make to their calls during noise, it seems more likely that a subconscious mechanism based on noise characteristics like spectral overlap and amplitude is the proximate mechanism for vocal noise compensation in tamarins.

Conclusions

The results of this study demonstrate that cotton-top tamarins are capable of modifying spectral, temporal, and amplitude characteristics of vocalizations during short periods of increased noise, casting doubt on the assumption of limited vocal control in non-human primate species (Fitch and Hauser, 1995; Egnor and Hauser, 2004; Snowdon et al., 2009). To my knowledge, it is the first study to demonstrate changes to the spectral content of non-human primate vocalizations during increased noise, indicating that short-term vocal flexibility in primates is not unique to humans. This study is also the first to experimentally evoke a change in spectral tilt in a non-human species, and to note that changes to spectral tilt can potentially increase detectability of a signal during increased noise independently of the Lombard effect. The results of this study also indicate that cotton-top tamarins may have the ability to employ vocal noise compensation mechanisms non-simultaneously, giving signalers the flexibility to respond effectively to many different types of masking noise. The subjects in this study demonstrated a previously unanticipated degree of vocal plasticity in two call types in a context-dependent fashion. The Lombard effect was found in both call types, with similar maximum amplitudes for both call types. Future studies should investigate the behavioral function of chirp vocalizations to evaluate the significance of higher baseline amplitudes for this call type, and to understand the effects of behavioral functions of calls on vocal noise compensation. Context-dependent vocal modifications have also been documented in other mammalian species (Miksis-Olds and Tyack, 2009; Tressler and Smotherman, 2009; Garnier et al., 2010; Holt et al., 2011), indicating that noise compensation responses are more flexible than previously assumed; behavioral states and external social and environmental conditions should be accounted for when examining vocal compensation strategies. In humans, changes to spectral tilt are apparently biomechanically linked to the Lombard effect through deformation of the vocal apparatus associated with increased air pressure (Fitch, 1989; Jessen et al., 2005). In cotton-top tamarins, however, the evidence for such a linkage was unclear. While both the Lombard effect and changes to spectral tilt were clearly evident in CLCs, not all subjects demonstrated both changes simultaneously, and at least one subject increased vocal amplitude and energy at low frequencies during the same trial. 
In chirps, there was no evidence of a change in spectral tilt despite a significant increase in chirp amplitude, implying that these two parameters can be adjusted independently. Further studies should determine whether the behavioral function of chirps affects the presence of changes to spectral tilt, and investigate the lack of change in CLC duration found in this study. Starting points for these studies could include noise stimuli targeted at chirp frequencies to evaluate the effects of spectral overlap on this call type, and investigation of the potential differences between spontaneous and elicited CLC vocalizations, which may affect changes to CLC duration.

Vocal noise compensation has important implications for the evolution of acoustic communication. The results of this study indicate that non-human primates have more vocal flexibility than previously assumed, including the ability to rapidly modify the spectral and temporal characteristics of calls in response to a dynamic acoustic environment and respond flexibly to noises with different spectro-temporal parameters. Comparison of these results with current knowledge of human vocal noise compensation suggests that the simultaneous onset of vocal modifications cannot yet be generalized to all other mammals (Tressler and Smotherman, 2009; Brumm and Zollinger, 2011), and that the communicative motivation and noise parameters may play a greater role than previously suspected.

References

Adobe (2011). "Adobe Audition version 5."
Bauers, K., and Snowdon, C. T. (1990). "Discrimination of chirp vocalizations in the cotton-top tamarin," American Journal of Primatology 21, 53-60.
Bioacoustics Research Program (2011). "Raven Pro: Interactive Sound Analysis Software (Version 1.4)," (The Cornell Lab of Ornithology, Ithaca, NY). Available from http://www.birds.cornell.edu/raven.
Bradbury, J. W., and Vehrencamp, S. L. (1998). Principles of Animal Communication (Sinauer Associates, Inc, Sunderland, MA).
Briefer, E. F. (2012). "Vocal expression of emotions in mammals: mechanisms of production and evidence," Journal of Zoology.
Brumm, H. (2004). "The impact of environmental noise on song amplitude in a territorial bird," Journal of Animal Ecology 73, 434-440.
Brumm, H., and Slabbekoorn, H. (2005). "Acoustic Communication in Noise," in Advances in the Study of Behavior, edited by P. J. B. Slater, C. T. Snowdon, T. J. Roper, H. J. Brockmann, and M. Naguib (Academic Press), pp. 151-209.
Brumm, H., Voss, K., Kollmer, I., and Todt, D. (2004). "Acoustic communication in noise: regulation of call characteristics in a New World monkey," J Exp Biol 207, 443-448.
Brumm, H., and Zollinger, S. A. (2011). "The evolution of the Lombard effect: 100 years of psychoacoustic research," Behaviour 148, 1173-1198.
Cardoso, G. C., and Atwell, J. W. (2011). "On the relation between loudness and the increased song frequency of urban birds," Animal Behaviour 82, 831-836.
Cleveland, J., and Snowdon, C. T. (1982). "The Complex Vocal Repertoire of the Adult Cotton-top Tamarin (Saguinus oedipus oedipus)," Zeitschrift für Tierpsychologie 58, 231-270.
Egan, J. J. (1972). "Psychoacoustics of the Lombard voice response," J. Aud. Res. 12, 318-324.
Egnor, S. E. R., and Hauser, M. D. (2004). "A paradox in the evolution of primate vocal learning," Trends in Neurosciences 27, 649-654.
Egnor, S. E. R., and Hauser, M. D. (2006). "Noise-induced vocal modulation in cotton-top tamarins (Saguinus oedipus)," American Journal of Primatology 68, 1183-1190.
Egnor, S. E. R., Iguina, C. G., and Hauser, M. D. (2006). "Perturbation of auditory feedback causes systematic perturbation in vocal structure in adult cotton-top tamarins," J Exp Biol 209, 3652-3663.
Egnor, S. E. R., Wickelgren, J. G., and Hauser, M. D. (2007). "Tracking silence: adjusting vocal production to avoid acoustic interference," J Comp Physiol A 193, 477-483.
Ey, E., and Fischer, J. (2009). "The "acoustic adaptation hypothesis" - a review of the evidence from birds, anurans, and mammals," Bioacoustics 19, 21-48.

Fitch, H. (1989). "Comments on "Effects of noise on speech production: Acoustic and perceptual analyses" [J. Acoust. Soc. Am. 84, 917-928 (1988)]," The Journal of the Acoustical Society of America 86, 2017-2019.
Fitch, W. T. (2000). "The evolution of speech: a comparative review," Trends in Cognitive Sciences 4, 258-267.
Fitch, W. T., and Hauser, M. D. (1995). "Vocal production in nonhuman primates: Acoustics, physiology, and functional constraints on “honest” advertisement," American Journal of Primatology 37, 191-219.
Garnier, M., Henrich, N., and Dubois, D. (2010). "Influence of sound immersion and communicative interaction on the Lombard effect," Journal of Speech, Language, and Hearing Research 53, 588-608.
Ghazanfar, A., Flombaum, J., Miller, C., and Hauser, M. (2001). "The units of perception in the antiphonal calling behavior of cotton-top tamarins (Saguinus oedipus): playback experiments with long calls," Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology 187, 27-35.
Ghazanfar, A. A., Smith-Rohrberg, D., Pollen, A. A., and Hauser, M. D. (2002). "Temporal cues in the antiphonal long-calling behaviour of cottontop tamarins," Animal Behaviour 64, 427-438.
Goldizen, A. (1990). "A comparative perspective on the evolution of tamarin and marmoset social systems," International Journal of Primatology 11, 63-83.
Halfwerk, W., and Slabbekoorn, H. (2009). "A behavioural mechanism explaining noise-dependent frequency use in urban birdsong," Animal Behaviour 78, 1301-1307.
Hanley, T. D., and Steer, M. D. (1949). "Effect of Level of Distracting Noise upon Speaking Rate, Duration and Intensity," Journal of Speech and Hearing Disorders 14, 363-368.
Holt, M. M., Noren, D. P., and Emmons, C. K. (2011). "Effects of noise levels and call types on the source levels of killer whale calls," The Journal of the Acoustical Society of America 130, 3100-3106.
Holt, M. M., Southall, B. L., Insley, S. J., and Schusterman, R. J. (2010). "Call directionality and its behavioural significance in male northern elephant seals, Mirounga angustirostris," Animal Behaviour 80, 351-361.
Hotchkin, C. F., and Parks, S. E. (In press). "The Lombard effect and other vocal noise compensation strategies: insights from mammalian communication systems," Biological Reviews.
Hu, Y., and Cardoso, G. C. (2010). "Which birds adjust the frequency of vocalizations in urban noise?," Animal Behaviour 79, 863-867.
Hurlbert, S. H. (1984). "Pseudoreplication and the Design of Ecological Field Experiments," Ecological Monographs 54, 187-211.
Jessen, M., Köster, O., and Gfroerer, S. (2005). "Influence of vocal effort on average and variability of fundamental frequency," Speech, Language and the Law 12, 174-212.
Jordan, K., Weiss, D., Hauser, M., and McMurray, B. (2004). "Antiphonal Responses to Loud Contact Calls Produced by Saguinus oedipus," International Journal of Primatology 25, 465-475.
Lane, H., and Tranel, B. (1971). "The Lombard sign and the role of hearing in speech," Journal of Speech, Language, and Hearing Research 14, 677-709.
Lesage, V., Barrette, C., Kingsley, M. C. S., and Sjare, B. (1999). "The effect of vessel noise on the vocal behavior of belugas in the St. Lawrence River estuary, Canada," Marine Mammal Science 15, 65-84.
Liénard, J., and Di Benedetto, M. (1999). "Effects of vocal effort on spectral properties of vowels," The Journal of the Acoustical Society of America 106, 411-422.

Lombard, E. (1911). "Le signe de l'elevation de la voix," Annales des Maladies de l'Oreille 37.
Love, E. K., and Bee, M. A. (2010). "An experimental test of noise-dependent voice amplitude regulation in Cope's grey treefrog, Hyla chrysoscelis," Animal Behaviour 80, 509-515.
Lu, Y., and Cooke, M. (2009). "The contribution of changes in F0 and spectral tilt to increased intelligibility of speech produced in noise," Speech Communication 51, 1253-1262.
MathWorks, T. (2007). "Matlab," (Natick, MA).
McConnell, P. B., and Snowdon, C. T. (1986). "Vocal Interactions between Unfamiliar Groups of Captive Cotton-Top Tamarins," Behaviour 97, 273-296.
Miksis-Olds, J. L., and Tyack, P. L. (2009). "Manatee (Trichechus manatus) vocalization usage in relation to environmental noise levels," The Journal of the Acoustical Society of America 125, 1806-1815.
Miller, C., Beck, K., Meade, B., and Wang, X. (2009). "Antiphonal call timing in marmosets is behaviorally significant: interactive playback experiments," Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology 195, 783-789.
Miller, C. T., and Hauser, M. D. (2004). "Multiple acoustic features underlie vocal signal recognition in tamarins: antiphonal calling experiments," Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology 190, 7-19.
Miller, P. J. O., Biassoni, N., Samuels, A., and Tyack, P. L. (2000). "Whale songs lengthen in response to sonar," Nature 405, 903-903.
Nowacek, D. P., Thorne, L. H., Johnston, D. W., and Tyack, P. L. (2007). "Responses of cetaceans to anthropogenic noise," Mammal Rev. 37, 81-115.
Ord, T. J., and Stamps, J. A. (2008). "Alert signals enhance animal communication in noisy environments," Proceedings of the National Academy of Sciences 105, 18830-18835.
Parks, S. E. (2003). "Acoustic communication in the North Atlantic right whale (Eubalaena glacialis)," (MIT-WHOI).
Patricelli, G. L., and Blickley, J. L. (2006). "Avian communication in urban noise: causes and consequences of vocal adjustment," The Auk 123, 639-649.
Rabin, L. A., McCowan, B., Hooper, S. L., and Owings, D. H. (2003). "Anthropogenic noise and its effect on animal communication: an interface between comparative psychology and conservation biology," International Journal of Comparative Psychology 16, 172-192.
Roy, S., Miller, C. T., Gottsch, D., and Wang, X. (2011). "Vocal control by the common marmoset in the presence of interfering noise," The Journal of Experimental Biology 214, 3619-3629.
Ruiz-Miranda, C. R., Archer, C. A., and Kleiman, D. G. (2002). "Acoustic Differences between Spontaneous and Induced Long Calls of Golden Lion Tamarins, Leontopithecus rosalia," Folia Primatologica 73, 124-131.
Rukstalis, M., Fite, J. E., and French, J. A. (2003). "Social Change Affects Vocal Structure in a Callitrichid Primate (Callithrix kuhlii)," Ethology 109, 327-340.
Savage, A., Giraldo, L. H., Soto, L. H., and Snowdon, C. T. (1996). "Demography, group composition, and dispersal in wild cotton-top tamarin (Saguinus oedipus) groups," American Journal of Primatology 38, 85-100.
Scheifele, P. M., Andrew, S., Cooper, R. A., Darre, M., Musiek, F. E., and Max, L. (2005). "Indication of a Lombard vocal response in the St. Lawrence River beluga," J. Acoust. Soc. Am. 117, 1486-1492.
Schmidt, U., and Joermann, G. (1986). "The influence of acoustical interferences on echolocation in bats," Mammalia 50.
Snowdon, C. T., and Elowson, A. M. (1999). "Pygmy Marmosets Modify Call Structure When Paired," Ethology 105, 893-908.

Snowdon, C. T., Marc, N., Klaus, Z., Nicola, S. C., and Vincent, M. J. (2009). "Plasticity of Communication in Nonhuman Primates," in Advances in the Study of Behavior (Academic Press), pp. 239-276.
Sundberg, J., and Nordenberg, M. (2006). "Effects of vocal loudness variation on spectrum balance as reflected by the alpha measure of long-term-average spectra of speech," The Journal of the Acoustical Society of America 120, 453-457.
Tartter, V. C., Gomes, H., and Litwin, E. (1993). "Some acoustic effects of listening to noise on speech production," The Journal of the Acoustical Society of America 94, 2437-2440.
Ternström, S., Bohman, M., and Södersten, M. (2006). "Loud speech over noise: some spectral attributes, with gender differences," The Journal of the Acoustical Society of America 119, 1648-1665.
Tressler, J., and Smotherman, M. S. (2009). "Context-dependent effects of noise on echolocation pulse characteristics in free-tailed bats," Journal of Comparative Physiology A 195, 923-934.
Tyack, P. L. (2008). "Convergence of calls as animals form social bonds, active compensation for noisy communication channels, and the evolution of vocal learning in mammals," Journal of Comparative Psychology 122, 319-331.
Versace, E., Endress, A. D., and Hauser, M. D. (2008). "Pattern recognition mediates flexible timing of vocalizations in nonhuman primates: experiments with cottontop tamarins," Animal Behaviour 76, 1885-1892.
Weiss, D. J., Garibaldi, B. T., and Hauser, M. D. (2001). "The production and perception of long calls by cotton-top tamarins (Saguinus oedipus): Acoustic analyses and playback experiments," J Comp Psych 115, 258-271.
Weiss, D. J., and Hauser, M. D. (2002). "Perception of harmonics in the combination long call of cottontop tamarins, Saguinus oedipus," Animal Behaviour 64, 415-426.
Wiley, R. H. (2006). "Signal Detection and Animal Communication," in Advances in the Study of Behavior, edited by H. J. B. Peter, J. B. Slater, C. T. Snowdon, T. J. Roper, M. Naguib, and E. W. Katherine (Academic Press), pp. 217-247.
Wright, A. J., Soto, N. A., Baldwin, A. L., Bateson, M., Beale, C. M., Clark, C., Deak, T., Edwards, E. F., Fernandez, A., Godinho, A., Hatch, L. T., Kakuschke, A., Lusseau, D., Martineau, D., Romero, L. M., Weilgart, L. S., Wintle, B. A., Notarbartolo-di-Sciara, G., and Martin, V. (2007). "Anthropogenic noise as a stressor in animals: a multidisciplinary perspective," International Journal of Comparative Psychology 20, 250-273.
Zwicker, E., and Fastl, H. (1999). Psychoacoustics: Facts and models (Springer, New York, NY).

Chapter 7

Summary and Conclusions

The effects of noise on acoustic communication have important implications for studies of the evolution of communication and language, behavioral ecology, psychology, and applied conservation biology. The goal of this dissertation was to critically evaluate current knowledge of noise-induced vocal modifications in mammals and to integrate it with experimental tests of the effects of specific noise parameters on vocalization structure. Together, these give a more complete understanding of the effects of noise exposure on the acoustic characteristics of mammalian vocalizations and address the question of whether signalers in widely varying acoustic environments use the same types of vocal modifications. The two species used in these experiments, beluga whales (Delphinapterus leucas) and cotton-top tamarins (Saguinus oedipus), are acoustically-dependent social mammals with large vocal repertoires. Both species are also known to modify the structure of their vocalizations during increased noise (Lesage et al., 1999; Scheifele et al., 2005; Egnor and Hauser, 2006; Egnor et al., 2006), which makes them well-suited for detailed studies of the effects of specific noise parameters on vocal noise compensation strategies. This chapter includes a short summary of the major findings of Chapters 2 through 6 of this dissertation, a general discussion of the relevance of the work as a whole, and suggestions for future research.

Summary of chapters

Current knowledge of the Lombard effect and other noise-induced vocal modifications in humans and non-human mammals was reviewed in Chapter 2, with the ultimate goals of standardizing the terminology associated with noise-induced vocal modifications, evaluating the current state of knowledge for both fields, and illuminating paths for future research in both areas. I concluded that both human and non-human vocal modification fields could benefit from better communication across taxa, and that important areas for research include the potential for a biomechanical or psychophysical linkage between vocal modification types and the effects of a signaler’s motivation to communicate on signal structure. Specific terms to refer to simultaneous and non-simultaneous vocal modifications were recommended, and specific experiments that could fill existing knowledge gaps were suggested.

In Chapter 3, I evaluated the differences in acoustic habitats experienced by beluga whales in a single captive habitat (Mystic Aquarium) and at two sites in a wild habitat (Cook Inlet, Alaska), including a weighting function intended to account for the perceptual capabilities of beluga whales (Southall et al., 2007). The average noise levels in the captive environment were substantially higher than those at either of the two wild locations, and were consistently high over the long term. Noise in the captive environment varied temporally and spatially due to the shape and size of the three exhibit pools and to the filtration design and water circulation patterns. These differences could affect communication by the belugas as they move between pools and cope with temporal changes. In the wild habitat, noise levels and sources differed between sites: Beluga River had a higher percentage of beluga/vessel encounters than did Kenai River, while the Kenai River dataset had lower average noise levels and a greater frequency of weather-noise events. Differences between the two datasets could indicate either geographic or seasonal differences in anthropogenic usage, as the recordings were collected during the summer at Beluga River and in the winter at Kenai River. Overall, the acoustic habitats of belugas in Cook Inlet and Mystic Aquarium were markedly different, and present different types of communication challenges for this vocally flexible species.

Chapter 4 detailed the effects of noise on the acoustic structure of two call types (flat whistles and a stereotyped pulsed vocalization) produced by the belugas at Mystic Aquarium during periods of increased noise from exhibit maintenance.
Both broad- and narrow-band noise levels were used to investigate the perceptual importance of noise frequency and spectral overlap in relation to vocalization parameters. Vocal modifications were different for the two call types, but minimum call frequency appeared to be strongly influenced by narrowband noise levels for both call types. Acoustic structure of calls was also strongly influenced by the “date” factor in most cases, indicating a high level of context dependency that is likely to affect vocal noise compensation strategies. These findings show that captive beluga whales are capable of modifying their vocalizations during high levels of an evolutionarily novel noise; whether these changes are adaptive and used for vocal noise compensation by wild or captive whales remains to be tested.
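
As a rough illustration of the distinction between broadband and narrowband noise levels drawn in the Chapter 4 summary, the following Python sketch sums power spectral density over a frequency band and converts the result to decibels. The function name `band_level_db`, the toy spectrum, and the band edges are hypothetical and chosen only for illustration; they are not the analysis settings used in Chapter 4.

```python
import math

def band_level_db(freqs_hz, psd, f_lo, f_hi, ref=1.0):
    """Level (dB re `ref`) of the power between f_lo and f_hi, inclusive.

    freqs_hz : evenly spaced bin centre frequencies (Hz)
    psd      : power spectral density in each bin (power units per Hz)
    """
    bin_width = freqs_hz[1] - freqs_hz[0]
    power = sum(p * bin_width for f, p in zip(freqs_hz, psd) if f_lo <= f <= f_hi)
    if power <= 0:
        return float("-inf")
    return 10.0 * math.log10(power / ref)

# Toy spectrum (hypothetical): flat background with extra energy at 2-4 kHz.
freqs = [100.0 * i for i in range(1, 201)]           # 100 Hz to 20 kHz
psd = [1e-3 * (10.0 if 2000 <= f <= 4000 else 1.0) for f in freqs]

broadband = band_level_db(freqs, psd, 100, 20000)    # whole recorded band
narrowband = band_level_db(freqs, psd, 2000, 4000)   # band overlapping a call
quiet_band = band_level_db(freqs, psd, 10000, 12000) # equally wide, no extra energy
```

In this toy spectrum the 2-4 kHz band sits 10 dB above an equally wide band elsewhere, a difference that a single broadband figure would partly hide. That is one reason a narrowband level matched to the call's frequency range may be the more relevant predictor of spectral overlap.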

In Chapter 5, I examined short-term changes to the acoustic structure of vocalizations produced by beluga whales during periods of increased anthropogenic noise at two sites in Cook Inlet, Alaska. The lack of an objectively defined repertoire for this population made it difficult to analyze specific call types, and so vocalizations were grouped into broad categories based on their contour shape and peak call frequency. Due to differences in call type usage between the Beluga and Kenai Rivers, calls from the two datasets were analyzed separately. Significant changes to vocalization structure were detected at both sites; in particular, peak frequency of vocalizations increased significantly with narrowband noise in every category, and minimum frequency was significantly related to narrowband noise levels at the Kenai River site. At the Beluga River location, call structure was also related to the “encounter ID” variable, indicating an effect of behavioral or social context on call structure which may interact with noise-induced modifications. Though it is not possible to investigate these factors using the current datasets, future studies of this endangered population should consider the effects of behavioral states and the potential for different reactions from animals with different communicative motivations.

The final experimental chapter of this dissertation (Chapter 6) used a series of noise playbacks to examine the effects of noise amplitude and spectral overlap on vocalizations produced by cotton-top tamarins. In this chapter, I tested short-term changes in the structure of long-distance contact calls (CLCs) and chirp vocalizations (unknown behavioral function) in response to noise stimuli of different amplitudes and spectral content. Vocal modifications differed between call types, but both noise amplitude and frequency content appeared to be perceptually salient.
Fundamental frequencies of both call types were modified during noise stimuli, and changes to spectral tilt were observed in CLCs. These results represent the first demonstration of a short-term noise-induced change in the spectral content of non-human primate vocalizations. The observed change in spectral tilt is also the first experimental demonstration of this phenomenon in a non-human mammal. In addition to the significant finding of shifts in spectral characteristics, the tamarins also demonstrated increases in both CLC and chirp amplitudes (the Lombard effect) and changes to chirp duration. Puzzlingly, there was no evidence of changes to spectral tilt of chirps, and no change in CLC duration. Future experiments which could help to explain these results and address some factors of experimental design were discussed in detail at the end of Chapter 6.

This study also offered the opportunity to investigate the possibility of non-simultaneous onset of vocal modifications during increased noise. Results indicated that it is possible for spectral changes to occur independently of amplitude modifications, as changes to the spectral content of chirps were not correlated with a Lombard effect, and some CLCs contained changes to either amplitude or spectral tilt but not both. The onset order and use of different types of vocal modifications is an exciting area for future research, and may provide insight into the evolution of acoustic communication and the effectiveness of certain types of vocal noise compensation strategies.
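
One simple way to tabulate simultaneous versus non-simultaneous modifications is sketched below, under stated assumptions: each call recorded in noise is compared with quiet-condition baselines and labeled according to which parameters changed. Spectral tilt is assumed here to be expressed as a level difference in dB between upper harmonics and the fundamental. The function name, thresholds, and sample values are hypothetical and are not data or criteria from Chapter 6.

```python
from collections import Counter

def classify_call(amp_db, tilt_db, base_amp_db, base_tilt_db,
                  amp_thresh=3.0, tilt_thresh=3.0):
    """Label a call by which parameters differ from the quiet baseline.

    The 3 dB thresholds are illustrative assumptions, not published criteria.
    """
    amp_changed = (amp_db - base_amp_db) >= amp_thresh        # louder than baseline
    tilt_changed = abs(tilt_db - base_tilt_db) >= tilt_thresh  # tilt shifted either way
    if amp_changed and tilt_changed:
        return "both"
    if amp_changed:
        return "amplitude only"
    if tilt_changed:
        return "tilt only"
    return "neither"

# Hypothetical (amplitude dB, spectral tilt dB) pairs for calls produced in noise.
calls = [(68.0, -4.0), (70.0, -10.0), (61.0, -3.5), (60.5, -10.5)]
baseline_amp, baseline_tilt = 60.0, -10.0

counts = Counter(classify_call(a, t, baseline_amp, baseline_tilt) for a, t in calls)
```

Under a strictly simultaneous ("Lombard speech") pattern, nearly all modified calls would fall in the "both" category; appreciable counts in "amplitude only" or "tilt only" would be consistent with non-simultaneous use.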

General discussion

The work presented in this dissertation expands the current knowledge of the effects of noise on acoustic communication by nonhuman animals, and advances both theoretical understanding of the effects of acoustic habitats on the use of noise-induced vocal modifications and practical considerations of the effects of social and behavioral contexts on vocal adjustment strategies. The work as a whole is broadly significant in its review of noise-induced vocal modifications in the human and non-human literature, demonstration of the importance of communicative motivation during vocal noise compensation by non-human mammals, discovery of changes to both stereotyped vocalizations and overall repertoires from beluga whales, and its examination of previously undocumented plasticity in the vocalizations of non-human primates. Throughout this dissertation, I have tried to emphasize both the positive (potential for improved signal detection, maintenance of social contacts, etc.) and negative (energetic costs, increased stress) effects of modifications to the acoustic structure of vocalizations during increased noise. The cumulative impacts of noise-induced vocal modifications are difficult to examine, and are likely to depend on the signaler’s physiological condition and the behavioral and environmental contexts during which noise exposure occurs. The results of this dissertation should therefore be interpreted cautiously, as signalers from these and future studies may incur different costs and accrue different benefits based on their individual situations.

Noise-induced vocal modifications

Modifications to all aspects of vocalizations were observed during the studies presented in the preceding chapters. Multiple modification types, including the Lombard effect, spectral shifts, and temporal changes, were elicited during controlled experiments with cotton-top tamarins during production of two types of calls. Members of two separate populations of beluga whales modified the spectral characteristics of both highly stereotyped vocalizations (Chapter 4) and of calls grouped according to broad frequency range and contour shape categories (Chapter 5). Noise-induced changes to the temporal characteristics of calls were less common, observed only in “chirp” calls from the tamarins. Overall, this dissertation has demonstrated that signalers in different acoustic habitats use similar types of vocal modifications during increased noise, even though the two study species are very different mammals and non-human primate vocalizations were previously assumed to be non-plastic.

One of the major goals of this project was to assess the effects of specific noise parameters, such as amplitude and spectral content, on vocalization structure and the types of vocal modifications observed. Interactions between such noise characteristics influence characteristics of human speech (Egan, 1967; 1972) and free-tailed bat echolocation calls (Tressler and Smotherman, 2009) in ways that appear to reduce the negative effects of noise on the effective communication range of vocalizations. My results indicate that this is also true for vocalizations of cotton-top tamarins. Both of the noise characteristics tested in the controlled playback experiments (noise amplitude and the degree of spectral overlap between noise and vocalizations) were perceptually salient for the signalers, who adjusted the spectral content of two call types in ways that appeared to compensate for masking noise. The importance of temporal structure and other noise characteristics to the structure of vocalizations is not yet fully understood, and would be an interesting area for future study.

Onset order and usage patterns of vocal modifications have been less thoroughly studied than the types of modifications available to species and the perceptual salience of noise characteristics.
In humans, onset of all vocal modifications occurs simultaneously, a phenomenon referred to as “Lombard speech” (Garnier et al., 2010; Hotchkin and Parks, In press), and several studies have proposed that non-human species experience a similar simultaneous usage paradigm for vocal modifications (Tressler and Smotherman, 2009; Brumm and Zollinger, 2011). However, results from this work indicate a potential difference in vocal modification usage between human and non-human signalers. Changes to the spectral content and amplitude of tamarin contact vocalizations did not always occur simultaneously: some high-amplitude calls showed “normal” spectral tilt characteristics, while relatively low-amplitude vocalizations sometimes had very low energy in the fundamental frequency. Additionally, changes to the fundamental frequency of chirp calls had no correlation with call amplitudes (Chapter 6, this dissertation). While there is likely to be some physiological connection between amplitude and other signal parameters, the onset and usage of different vocal modification types used by species with high vocal flexibility may be more dependent on the relevant noise parameters than on the vocal effort and amplitude of acoustic signals.

One factor that may contribute to non-simultaneous onset of vocal modifications is the effect of behavioral and environmental context on the acoustic structure of vocalizations. In every study presented here, the acoustic structures of vocalizations were also related to either behavioral context or vocalization type, revealing context-dependencies and/or effects of signaler motivation which may impact vocalization structures. These findings are consistent with the effects of noise on vocal modifications in human speech, in which subjects asked to perform an interactive task exhibit greater-magnitude changes to vocal structure (Lane and Tranel, 1971; Hotchkin and Parks, In press), and with a study of manatees, in which changes to vocalizations depended on the presence or absence of dependent offspring and other behavioral factors (Miksis-Olds and Tyack, 2009). Changes to vocalization structure due to communicative motivation and the emotional states of signalers (Briefer, 2012) may interact with noise-induced modifications. Controlling for these factors may help researchers evaluate the perceptual salience of certain aspects of noise (e.g., spectral content, amplitude, duration) and the degree of masking to which signalers are exposed.

This dissertation also proposes future study of a previously unexamined vocal modification in non-human mammals. Spectral tilt involves shifts in frequency content within a stereotyped vocalization, and has typically been classified as a “by-product” of changes to the vocal folds during increased vocal amplitude (the Lombard effect) (Fitch, 1989; Tressler and Smotherman, 2009; Brumm and Zollinger, 2011).
Given the results of the tamarin experiment in Chapter 6, in which not all high-amplitude vocalizations included shifts in spectral tilt (and vice versa), I argue that the presumption of a biomechanical linkage between these changes for all mammals is premature. An alternative to the “by-product” hypothesis is the possibility that changes to spectral tilt may independently increase signal detectability. In the case of low-frequency noise that overlaps part or all of a vocalization, shifting call energy to higher frequency bands could release some or all of the informationally important call parameters from the masking noise, allowing signalers to ensure successful communication without modifying the overall structure of the call. If this modification did increase communication effectiveness, simultaneous observations of changes to spectral tilt and the Lombard effect may be an effect of co-evolved adaptations rather than an anatomical side-effect of increased vocal effort.

Vocal flexibility in non-human primates and its relationship to the evolution of acoustic communication and human language have been hotly debated (Egnor and Hauser, 2004; Snowdon et al., 2009; Owren et al., 2010). In previous studies many researchers assumed that these species were incapable of modifying the acoustic structures of their vocalizations at all (Snowdon et al., 2009). More recently, studies have shown that at least two species of non-human primates exhibit the Lombard effect and changes to the temporal components of calls (Brumm et al., 2004; Egnor and Hauser, 2006; Egnor et al., 2006), but there has been no previous indication of short-term noise-induced spectral flexibility in non-human primates. The findings of short-term modification to fundamental frequencies and spectral tilt of tamarin vocalizations thus suggest greater evolutionary continuity in vocal production between non-human primates and humans than previously supposed. The context-dependent differences in modification types deserve further investigation in future studies.
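
The masking-release argument made in this section can be illustrated with a small numerical sketch. It assumes a call made up of a few harmonics, a low-frequency noise band, and a crude all-or-nothing rule that any harmonic inside the noise band is masked. The function name and all frequencies and energy values are hypothetical; real masking depends on receiver hearing and signal-to-noise ratios that this sketch ignores.

```python
def unmasked_fraction(harmonics, noise_lo_hz, noise_hi_hz):
    """Fraction of total call energy lying outside the noise band.

    harmonics : list of (frequency_hz, relative_energy) pairs
    A harmonic counts as masked when it falls inside the noise band;
    this all-or-nothing rule is a simplifying assumption.
    """
    total = sum(e for _, e in harmonics)
    clear = sum(e for f, e in harmonics if not (noise_lo_hz <= f <= noise_hi_hz))
    return clear / total

# Hypothetical call: fundamental at 1.5 kHz plus two harmonics.
baseline_call = [(1500, 0.70), (3000, 0.20), (4500, 0.10)]
# Same total energy, redistributed toward the upper harmonics (changed tilt).
tilted_call = [(1500, 0.30), (3000, 0.40), (4500, 0.30)]

noise_band = (500, 2000)  # low-frequency noise overlapping the fundamental
before = unmasked_fraction(baseline_call, *noise_band)
after = unmasked_fraction(tilted_call, *noise_band)
```

In this toy case the unmasked fraction rises from about 0.3 to about 0.7 with no change in total call energy, which is the sense in which a change in spectral tilt could aid detection independently of the Lombard effect.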

Future Research

The work presented in this dissertation leads logically to many paths for future research, including investigations into both human and non-human vocal modifications, vocal repertoires and behavioral functions of sounds produced by both wild and captive beluga whales, and further studies of vocal flexibility and spectral shifts in non-human primate vocalizations. Specific recommendations for experiments on each of these topics were given at the end of the relevant chapters, and I will therefore cover only the major points and suggestions here.

Vocal noise compensation

To the best of my knowledge, the hypothesized beneficial effects of noise-induced vocal modifications on the success of acoustic communication by non-human mammals have not yet been explicitly examined (but see: Nemeth and Brumm, 2010). Do shifts in vocalization frequencies or temporal properties actually increase signal detection over unmodified sounds? What are the differences between improvements in signal propagation, influenced by higher amplitude signals, and improvements in intelligibility, which depend on receiver perception? Changes to signalers’ effective communication ranges during increased noise have been shown for many species (Brown, 1989; Lohr et al., 2003; Miller, 2006; Nemeth and Brumm, 2010), but the effects of vocal changes on the perception of acoustic signals by receivers remain known only for human speech (Draegert, 1951; Lu and Cooke, 2009).

A related question deals with the relationship between the Lombard effect and other noise-induced vocal modifications. Several researchers have proposed the idea that noise-induced modifications to vocalization frequencies and temporal parameters are physiological byproducts of the Lombard effect (Patel and Schell, 2008; Tressler and Smotherman, 2009; Nemeth and Brumm, 2010), and are unlikely to improve propagation of vocalizations (Nemeth and Brumm, 2010). However, given the apparent independence of amplitude and spectral modifications exhibited by the tamarins in chapter six, and the detailed discussion of this phenomenon in chapter one of this dissertation, the byproduct hypothesis seems overly simplified. Changes to the frequency content of vocalizations, including both spectral tilt and fundamental frequencies, may increase call detectability by shifting at least part of a signal out of masking noise. While there is still likely to be some type of biomechanical linkage between amplitude and spectral modifications, especially at extreme amplitudes, independent modification of signal frequencies may be adaptive in moving whole or partial signals out of the noise band, and should be examined as a potential adaptive modification rather than labeled a byproduct without thorough investigation.

The effects of acoustic habitats on the structures of vocalizations are also an intriguing area for future research. While long-term noise spectra are related to the acoustic structure of vocalizations in many species (Brumm and Slabbekoorn, 2005), there has been little investigation of the relationships between long-term noise and the vocal flexibility of signalers in relatively noisy and quiet habitats.
This study showed that two species of social mammals that live in high-noise habitats use the same types of vocal modifications; increases in call frequency, but minimal changes to call durations (chapters 4-6, this dissertation). Future studies should investigate the possibility that vocal flexibility of signalers may be directly related to long-term noise levels or the frequency of high-noise events in their natural habitats.

Effects of behavioral context

The confounding effects of behavioral contexts, emotional states, and signaler motivation were clearly apparent in the studies of wild and captive beluga whales, and in the playback experiments with cotton-top tamarins. These results are consistent with other studies, which indicate that signalers vocalize differently during different behavioral states (Cleveland and Snowdon, 1982; Sjare and Smith, 1986a; Thompson et al., 1986; Beale, 2007; Benoit-Bird and Au, 2009; Snowdon et al., 2009), in different social contexts, and when motivated by different tasks (Lane and Tranel, 1971; Garnier et al., 2010; Hotchkin and Parks, In press). However, few studies have explicitly addressed the relationship between noise-induced and context-induced modifications and the acoustic structure of vocalizations (Miksis-Olds and Tyack, 2009; Tressler and Smotherman, 2009).

A profitable path for future research would be to investigate the detailed vocal behavior of a model non-human species in several behavioral states during both quiet and increased noise. An examination of changes to stereotyped vocalizations that are associated with different behaviors or signaler motivations could also provide insight into the effects of behavioral context and signaler motivation. For instance, in the cotton-top tamarin experiments, signalers generally produced one of two call types: a long-distance contact call (CLC) or a functionally ambiguous chirp. The CLCs, which were the focus of the masking experiment, changed dramatically during noise, in response to both bandwidth and level; such changes are likely to be related to the call type’s importance in maintaining social contact with group members. If masking noise targeted at the spectral content of chirp vocalizations elicits the same responses seen in CLCs, it would imply either a contact function for the call, or a noise-compensation response system that is more consistent across calls than the current data indicate. Studying the effects of noise on calls with both known and unknown functions will add to our knowledge of the commonality of noise-induced vocal modifications between call types and allow further insights into the signaler motivations associated with stereotyped vocalizations.

A third suggested experiment investigating the effects of behavioral context would use human subjects to examine the potential causes and implications of changes to the use of call types from vocal repertoires during increased noise. While several non-human species have exhibited changes to call types during masking noise, there have been few investigations of shifts in speaker vocabularies during increased masking noise (Hanley and Steer, 1949; Webster and Klumpp, 1962; Charlip and Burk, 1969; Patel and Schell, 2008). By asking subjects to communicate a set idea to a naïve listener, researchers could evaluate trends in types of words used and potentially open new avenues for investigating behavioral functions of non-human vocalizations.

Vocal repertoires

The vocal repertoires of non-human animals are relatively poorly known, particularly in relation to the behavioral functions of sounds and the importance of specific features of stereotyped calls. A better understanding of the use of sounds by animals could facilitate more detailed analysis of passive-acoustic monitoring studies, allowing researchers to understand the importance of certain habitat areas to foraging or socializing animals. Vocal repertoires have been published for the two study species used in this research, but the wide within- and between-population variability makes it difficult to draw conclusions about behavioral relevance of call types in these recordings. In addition, changes to a signaler’s environment (wild vs. captive, in particular) may affect the behavioral context and function of call types.

Beluga whales are clearly vocally flexible, and capable of producing a wide range of sound types (Sjare and Smith, 1986b; Angiel, 1997; Karlsen et al., 2002; Chmelnitsky and Ferguson, 2012). Widely separated populations apparently produce at least some of the same stereotyped calls (pers. obs.), and yet whales removed from the wild as juveniles exhibit vocal repertoires that diverge from their natal population’s. This dissertation involved subjective analysis of vocalizations from three captive whales at Mystic Aquarium (Appendix 3) and a coarse repertoire categorization for a wild population in Cook Inlet (Chapter 5 and Appendix 1). The most striking difference in my observations was a lack of frequency-modulated contours (particularly “wavy” whistles) and other tonal sounds, and the high prevalence of pulsed and noisy calls at Mystic when compared with recordings of whales in Cook Inlet and the published repertoire of the Churchill, Manitoba population. Investigation of these differences could provide insights into the effects of captivity and/or chronically noisy acoustic habitats on vocal production by this species, as well as the behavioral functions of sounds produced by these whales. Call types that are shared between several wild and captive populations are especially interesting: do the belugas use the same calls in similar behavioral contexts in all environments? Understanding the behavioral usage of calls within and between populations can expand our knowledge of the evolution of acoustic communication, and potentially the influences of acoustic habitats on sound production.

Some of the same questions apply to cotton-top tamarins’ calls. The vocal repertoire of this species has been catalogued in relation to behavioral contexts in a wild population (Cleveland and Snowdon, 1982), but not in any of the groups of captive tamarins used for psychological research. Further investigations of the differences in captive and wild tamarins’ use of stereotyped signals may provide insight into individual differences, communicative motivation, and the potential behavioral functions of stereotyped vocalizations.

Conclusions

This dissertation has clarified some existing questions about noise-induced vocal modifications in non-human mammals (perceptual salience of noise parameters, vocal flexibility of non-human primates, relationship of acoustic habitat to vocal flexibility), raised new issues in areas including the influence of communicative motivation and interactions with noise effects, and summarized the current understanding of noise-induced vocal modifications in mammalian species. My hope is that this work contributes both to studies of the evolution of acoustic communication and to the conservation of species vulnerable to increases in anthropogenic noise.

References

Angiel, N. M. (1997). "The vocal repertoire of the beluga whale in Bristol Bay, Alaska," (M.S. Thesis, University of Washington).
Beale, C. M. (2007). "The behavioral ecology of disturbance reactions," International Journal of Comparative Psychology 20, 111-120.
Benoit-Bird, K. J., and Au, W. W. L. (2009). "Phonation behavior of cooperatively foraging spinner dolphins," The Journal of the Acoustical Society of America 125, 539-546.
Briefer, E. F. (2012). "Vocal expression of emotions in mammals: mechanisms of production and evidence," Journal of Zoology, Published online 8 May 2012.
Brown, C. H. (1989). "The active space of blue monkey and grey-cheeked mangabey vocalizations," Animal Behaviour 37, 1023-1034.
Brumm, H., Voss, K., Kollmer, I., and Todt, D. (2004). "Acoustic communication in noise: regulation of call characteristics in a New World monkey," Journal of Experimental Biology 207, 443-448.
Brumm, H., and Zollinger, S. A. (2011). "The evolution of the Lombard effect: 100 years of psychoacoustic research," Behaviour 148, 1173-1198.
Charlip, W. S., and Burk, K. W. (1969). "Effects of noise on selected speech parameters," Journal of Communication Disorders 2, 212-219.
Chmelnitsky, E. G., and Ferguson, S. H. (2012). "Beluga whale, Delphinapterus leucas, vocalizations from the Churchill River, Manitoba, Canada," The Journal of the Acoustical Society of America 131, 4821-4835.
Cleveland, J., and Snowdon, C. T. (1982). "The Complex Vocal Repertoire of the Adult Cotton-top Tamarin (Saguinus oedipus oedipus)," Zeitschrift für Tierpsychologie 58, 231-270.
Draegert, G. L. (1951). "Relationships between voice variables and speech intelligibility in high level noise," Speech Monographs 18, 272-278.
Egan, J. J. (1967). "Psychoacoustics of the Lombard Voice Reflex," (Case Western University).
Egan, J. J. (1972). "Psychoacoustics of the Lombard voice response," Journal of Auditory Research 12, 318-324.
Egnor, S. E. R., and Hauser, M. D. (2004). "A paradox in the evolution of primate vocal learning," Trends in Neurosciences 27, 649-654.
Egnor, S. E. R., and Hauser, M. D. (2006). "Noise-induced vocal modulation in cotton-top tamarins (Saguinus oedipus)," American Journal of Primatology 68, 1183-1190.
Egnor, S. E. R., Iguina, C. G., and Hauser, M. D. (2006). "Perturbation of auditory feedback causes systematic perturbation in vocal structure in adult cotton-top tamarins," Journal of Experimental Biology 209, 3652-3663.
Fitch, H. (1989). "Comments on 'Effects of noise on speech production: Acoustic and perceptual analyses' [J. Acoust. Soc. Am. 84, 917-928 (1988)]," The Journal of the Acoustical Society of America 86, 2017-2019.
Garnier, M., Henrich, N., and Dubois, D. (2010). "Influence of sound immersion and communicative interaction on the Lombard effect," Journal of Speech, Language, and Hearing Research 53, 588-608.
Hanley, T. D., and Steer, M. D. (1949). "Effect of Level of Distracting Noise upon Speaking Rate, Duration and Intensity," Journal of Speech and Hearing Disorders 14, 363-368.
Hotchkin, C. F., and Parks, S. E. (In press). "The Lombard effect and other vocal noise compensation strategies: insights from mammalian communication systems."
Karlsen, J., Bisther, A., Lydersen, C., Haug, T., and Kovacs, K. (2002). "Summer vocalisations of adult male white whales (Delphinapterus leucas) in Svalbard, Norway," Polar Biology 25, 808-817.
Lane, H., and Tranel, B. (1971). "The Lombard sign and the role of hearing in speech," Journal of Speech, Language, and Hearing Research 14, 677-709.
Lesage, V., Barrette, C., Kingsley, M. C. S., and Sjare, B. (1999). "The effect of vessel noise on the vocal behavior of belugas in the St. Lawrence River estuary, Canada," Marine Mammal Science 15, 65-84.
Lohr, B., Wright, T. F., and Dooling, R. J. (2003). "Detection and discrimination of natural calls in masking noise by birds: estimating the active space of a signal," Animal Behaviour 65, 763-777.
Lu, Y., and Cooke, M. (2009). "The contribution of changes in F0 and spectral tilt to increased intelligibility of speech produced in noise," Speech Communication 51, 1253-1262.
Miksis-Olds, J. L., and Tyack, P. L. (2009). "Manatee (Trichechus manatus) vocalization usage in relation to environmental noise levels," The Journal of the Acoustical Society of America 125, 1806-1815.
Miller, P. J. O. (2006). "Diversity in sound pressure levels and estimated active space of resident killer whale vocalizations," Journal of Comparative Physiology A 192, 449-459.
Nemeth, E., and Brumm, H. (2010). "Birds and anthropogenic noise: are urban songs adaptive?," American Naturalist 176, 465-475.
Owren, M. J., Amoss, R. T., and Rendall, D. (2010). "Two organizing principles of vocal production: implications for nonhuman and human primates," American Journal of Primatology 71, 1-15.
Patel, R., and Schell, K. W. (2008). "The influence of linguistic content on the Lombard effect," Journal of Speech, Language, and Hearing Research 51, 209-220.
Scheifele, P. M., Andrew, S., Cooper, R. A., Darre, M., Musiek, F. E., and Max, L. (2005). "Indication of a Lombard vocal response in the St. Lawrence River beluga," The Journal of the Acoustical Society of America 117, 1486-1492.
Sjare, B. L., and Smith, T. G. (1986a). "The relationship between behavioral activity and underwater vocalizations of the white whale, Delphinapterus leucas," Canadian Journal of Zoology 64, 2824-2831.
Sjare, B. L., and Smith, T. G. (1986b). "The vocal repertoire of white whales, Delphinapterus leucas, summering in Cunningham Inlet, Northwest Territories," Canadian Journal of Zoology 64, 407-415.
Snowdon, C. T., Marc, N., Klaus, Z., Nicola, S. C., and Vincent, M. J. (2009). "Plasticity of Communication in Nonhuman Primates," in Advances in the Study of Behavior (Academic Press), pp. 239-276.
Southall, B. L., Bowles, A. E., Ellison, W. T., Finneran, J. J., Gentry, R. L., Greene, C. R., Jr., Kastak, D., Ketten, D. R., Miller, J. H., Nachtigall, P. E., Richardson, W. J., Thomas, J. A., and Tyack, P. L. (2007). "Appendix A: Acoustic Measures and Terminology," Aquatic Mammals 33, 498-501.
Thompson, P. O., Cummings, W. C., and Ha, S. J. (1986). "Sounds, source levels, and associated behavior of humpback whales, Southeast Alaska," The Journal of the Acoustical Society of America 80, 735-740.
Tressler, J., and Smotherman, M. S. (2009). "Context-dependent effects of noise on echolocation pulse characteristics in free-tailed bats," Journal of Comparative Physiology A 195, 923-934.
Webster, J. C., and Klumpp, R. G. (1962). "Effects of ambient noise and nearby talkers on a face-to-face communication task," The Journal of the Acoustical Society of America 34, 936-941.

Appendix A Noise and beluga vocalizations from upper Cook Inlet, Alaska during August 2007

During August 2007, C. Hotchkin volunteered with the National Oceanic and Atmospheric Administration in Anchorage, Alaska, in order to collect data on the acoustic environment in Cook Inlet before construction work began at the Port of Anchorage. Opportunistic encounters with beluga whales allowed for recordings of vocalizations and analysis of the whales’ vocal repertoire during two days in the summer of 2007. Analyses were completed between September 2007 and April 2010. This work was presented at the 2009 meeting of the Society for Marine Mammalogy in Quebec City, Quebec, Canada and at the 159th meeting of the Acoustical Society of America in April 2010.

Abstract

Beluga whales (Delphinapterus leucas) in Cook Inlet, Alaska are geographically and genetically isolated from other Alaskan beluga populations, and were listed as endangered in 2008. One potential threat to the recovery of the population is anthropogenic noise, which may disrupt communication and normal behaviors throughout the population's limited range. In order to evaluate this potential problem, knowledge of anthropogenic noise sources and levels, and in-depth understanding of the animals' acoustic behavior, are necessary. This project used a single boat-based hydrophone system to evaluate noise levels at several locations in Cook Inlet on 6 days between 2 and 14 August 2007. Belugas were encountered on two days during this period, at the Port of Anchorage and near the mouth of the Little Susitna River, and recorded vocalizations were analyzed to develop a preliminary catalog of the whales' vocal repertoire. Beluga vocalizations were measured and categorized into whistles, high-frequency whistles, and pulsed/noisy sounds. Most recorded vocalizations were similar to call types found in other beluga populations. Vocalization frequencies ranged from 0.381 kHz to 24 kHz (the limit of our recordings), with most energy at frequencies above 2 kHz. Recorded noise sources included ships at and around the Port of Anchorage, commercial and military airplane overflights, and tidal flow. Broadband and 1/3-octave band levels were evaluated for all anthropogenic and natural noise sources. Vessel noise levels were highest below 0.5 kHz, but frequencies ranged to greater than 8 kHz at the Port of Anchorage. Based on the overlap in frequency between beluga vocalizations and noise, anthropogenic sound can potentially interfere with beluga communication close to transiting and docked vessels in Cook Inlet.

Introduction

The Cook Inlet beluga whale (Delphinapterus leucas) population is genetically and geographically isolated, currently consisting of fewer than 500 individuals (Hobbs et al., 2008; O'Corry-Crowe, 2009). After a rapid and drastic decline in the early 1990s, the population was designated as depleted by the National Marine Fisheries Service (NMFS) in May 2000, and was given endangered status in October 2008 after the population failed to recover to pre-crash levels (Moore and DeMaster, 2000). Since 1998, population estimates have ranged between 278 and 435 animals (Hobbs et al., 2008; Hobbs et al., 2011).

After the depleted designation in 2000, NMFS developed recovery and conservation plans for the population. Data gaps on the population’s life history, diet, and movement patterns were identified, as were potential threats to recovery (NMFS, 2008). One identified threat is the increasingly noisy acoustic environment in Cook Inlet. Anthropogenic noise from shipping, oil drilling, and construction activities has the potential to interfere with the whales’ behavior via disturbance reactions, and to interfere with communication through call masking (Richardson et al., 1995). An NMFS report by Blackwell and Greene (2002) analyzed underwater and in-air sound levels from ships, oil drilling platforms, airplanes, and natural sources in Cook Inlet during August 2001. They found that while broadband underwater noise levels may approach 180 dB re 1 µPa, there was minimal potential for the noise to interfere with beluga communication (Blackwell and Greene, 2002). Several projects are currently collecting data on seasonal noise trends (B. Mahoney, pers. comm.), but at the time these data were collected there were no published studies on anthropogenic noise in Cook Inlet.

The acoustic behavior of Cook Inlet beluga whales is also relatively unknown. Vocal repertoires exist for other beluga stocks (St. Lawrence River (Sjare and Smith, 1986a; b); White Sea (Belikov and Bel'kovitch, 2006; 2007; 2008); Bristol Bay (Angiel, 1997)), giving preliminary information on the types of calls to be expected from the Cook Inlet whales. Call frequencies range from 0.26 kHz to greater than 20 kHz; echolocation frequencies have been measured at 40 to 60 kHz and 100 to 120 kHz (Au, 1993). Source levels for beluga vocalizations have not been explicitly measured; however, based on other odontocete source levels, an assumption of 160 dB re 1 µPa is not unreasonable (Erbe, 2000). At present, there are no published data on the types and frequencies of calls used by Cook Inlet belugas. This information would help to determine the overlap between anthropogenic noise sources and beluga vocalizations, and thus to infer whether masking of acoustic communication is likely to negatively affect the Cook Inlet beluga population.

The main objective of this project was to supplement the previous data on Cook Inlet noise levels gathered by Blackwell and Greene in 2001 by gathering data on the noise levels at the Port of Anchorage and other sites in Cook Inlet. These data may also be used as a baseline for noise levels at the Port of Anchorage prior to the beginning of construction for port expansion, the Knik Arm Bridge, or ferry docks at Ship Creek. A secondary objective was to determine the potential for anthropogenic noise to interfere with beluga communication in Cook Inlet. Recordings of beluga vocalizations were obtained on two days and were analyzed for spectral content and compared to the analyzed noise levels and spectra. Spectral overlap between noise and beluga vocalizations was quantified to determine the potential for masking of signals produced by whales close to transiting and docked vessels.

Methods

Data Collection

Acoustic recordings were made at four locations in upper Cook Inlet on six days between 2 and 14 August 2007. Locations were selected for the amount of anthropogenic activity at each site; recording times depended on the tidal cycle because the recording vessel could not be launched at low tide. Recorded sources included vessels in and around the Port of Anchorage, military and commercial jet flyovers, and ambient sounds (including tidal noise in all locations). On two days (9 and 14 August), belugas were encountered and vocalizations were recorded.

Recordings were made using a calibrated ITC 6050C hydrophone connected to a custom-built adjustable-gain post-amplifier and a Sound Devices 702T digital audio recorder. The hydrophone contained a low-noise preamplifier and a 30 m cable, which was faired to eliminate strumming. The hydrophone had a flat (± 3 dB) frequency response from 0 to 20 kHz and an overall sensitivity of -162 dB re 1 V/µPa. Signals were recorded at a sampling rate of 48 kHz with 16-bit quantization onto a formatted compact flash card, and downloaded to a laptop computer at the end of each recording day.
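
The conversion from recorded voltage to received sound pressure level can be sketched as follows. This is a minimal illustration and not the calibration procedure actually used: it assumes the stated sensitivity is read as -162 dB re 1 V/µPa, that samples have already been converted from digital counts to volts, and that the post-amplifier gain (the hypothetical `GAIN_DB`) is known for each recording. The function names and the test tone are likewise illustrative.

```python
import math

SENSITIVITY_DB = -162.0  # dB re 1 V/µPa, value quoted for the hydrophone system
GAIN_DB = 0.0            # hypothetical post-amplifier gain setting, in dB

def rms(samples):
    """Root-mean-square of a sequence of voltage samples (volts)."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def spl_db_re_1upa(v_rms, sensitivity_db=SENSITIVITY_DB, gain_db=GAIN_DB):
    """Broadband rms level in dB re 1 µPa from an rms voltage in volts.

    SPL = 20*log10(V_rms) - sensitivity - gain
    """
    return 20.0 * math.log10(v_rms) - sensitivity_db - gain_db

# Hypothetical 10-second sample: a 1 kHz tone of 1 mV peak amplitude,
# sampled at 48 kHz as in the recordings described above.
fs = 48000
tone = [0.001 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(10 * fs)]
level = spl_db_re_1upa(rms(tone))
print(f"{level:.1f} dB re 1 µPa")
```

Any gain applied by the post-amplifier is subtracted because it raises the recorded voltage without changing the pressure at the hydrophone.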

Table A-1. Dates and locations of recordings in Cook Inlet during August 2007. The recording vessel sometimes visited multiple locations in a day, often moving to several locations within a given site in order to capture the full range of variability during the recording period.

Date        Site Name                                      Latitude (N)   Longitude (W)
8/2/2007    Port of Anchorage (PoA)                        61.237850      149.884517
8/2/2007    Port of Anchorage (PoA)                        61.244917      149.885533
8/2/2007    Port of Anchorage (PoA)                        61.234717      149.895400
8/2/2007    Port of Anchorage (PoA)                        61.245050      149.884850
8/2/2007    Port of Anchorage (PoA)                        61.228800      149.903617
8/3/2007    Point Mackenzie (PtMac)                        61.266733      149.918617
8/3/2007    Point Mackenzie (PtMac)                        61.270367      149.917550
8/3/2007    Point Mackenzie (PtMac)                        61.263833      149.880867
8/3/2007    Point Mackenzie (PtMac)                        61.220417      149.922083
8/6/2007    Point Woronzof (PtWor)                         61.179317      150.062767
8/6/2007    Point Woronzof (PtWor)                         61.205217      150.026117
8/6/2007    Point Woronzof (PtWor)                         61.206317      150.000083
8/7/2007    Port of Anchorage (PoA)                        61.246767      149.884333
8/7/2007    Mid Knik Arm (MidKnik)                         61.319050      149.821083
8/9/2007    Susitna Delta Encounter (vocalizations only)   61.235250      150.263867
8/9/2007    Susitna Delta Encounter (vocalizations only)   61.184100      150.390133
8/9/2007    Susitna Delta Encounter (vocalizations only)   61.184683      150.403917
8/9/2007    Susitna Delta Encounter (vocalizations only)   61.189867      150.397667
8/14/2007   Port of Anchorage                              61.245400      149.885083
8/14/2007   Point Mackenzie                                61.263717      149.921700

Recordings were made from a rigid-hulled inflatable boat owned and operated by NMFS in Anchorage, and launched out of the small boat launch at Ship Creek. At the beginning of each recording session, the vessel was anchored securely to prevent drifting into the path of larger vessels (when at the Port) or being carried away from recording locations by tidal currents. Anchoring also maximized recording time at any given location, by eliminating the time required to move back into a position we had drifted away from. The vessel engine was turned off at all times during recording.

Figure A-1. Upper Cook Inlet, Alaska, with August 2007 recording sites marked with black stars. Beluga encounters are marked with whale tail symbols.

The hydrophone was attached to a 4-lb lead weight and lowered into the water to begin a recording session. When possible, deployment depth was 10 m. If the water was too shallow, the hydrophone was deployed to a minimum depth of 6 m. Occasionally, hydrophone depth was adjusted between recordings in the same location to account for tidal changes in water depth. Recording locations were obtained at the beginning of each session using the GPS unit on the recording vessel, and the vessel’s depth finder was used to estimate water depth. Distances to nearby sound sources were visually estimated by two observers at the beginning of every recording. Weather, sea-state, and tidal data were collected throughout every recording session; sea state was never above 2 during recording sessions.

Beluga whale vocalizations were recorded opportunistically on two days (9 and 14 August). One encounter occurred at the Port of Anchorage, as a small group (6 to 8 individuals) of whales was observed transiting through the area, close to shore. A second encounter, from which the majority of analyzed vocalizations were taken, occurred in the middle inlet, near the Susitna River Delta (Figure A-1).

Data Analysis

Noise analyses

Noise samples were first examined as waveforms and spectrograms in Raven Pro, and played to the analyst via headphones. If there was no appreciable variability in the waveform amplitude, two 10-second samples were taken from each file. If the file had large amplitude variations, at least one 10-second sample was taken from each of the high- and low-amplitude sections. Broadband rms and 1/3-octave band levels were computed for each sample, using a 1-second transform length for exponential averaging. Recordings made at similar locations under different tidal conditions were sorted by tidal condition before being analyzed, to evaluate the impact of tidal flow on overall rms sound levels. “Incoming tide” was defined as any time more than 30 minutes before predicted high tide in Anchorage. “High tide” ranged from 30 minutes before to 30 minutes after the predicted moment of high tide, and “outgoing tide” was defined as any time more than 30 minutes after predicted high tide.
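
The tidal-stage definitions above amount to a simple rule on the time offset from predicted high tide, which can be sketched as follows. The function name and the example times are hypothetical, and the sketch assumes each recording is compared against the single predicted high tide nearest to it.

```python
from datetime import datetime, timedelta

HIGH_TIDE_WINDOW = timedelta(minutes=30)

def tidal_stage(recording_time, predicted_high_tide):
    """Classify a recording time relative to the nearest predicted high tide."""
    offset = recording_time - predicted_high_tide
    if offset < -HIGH_TIDE_WINDOW:
        return "incoming"   # more than 30 minutes before high tide
    if offset > HIGH_TIDE_WINDOW:
        return "outgoing"   # more than 30 minutes after high tide
    return "high"           # within 30 minutes either side of high tide

# Hypothetical high-tide prediction, for illustration only.
high_tide = datetime(2007, 8, 6, 12, 0)
for hour, minute in [(10, 45), (11, 40), (13, 15)]:
    t = datetime(2007, 8, 6, hour, minute)
    print(t.time(), tidal_stage(t, high_tide))
```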

Vocalization analysis

Recordings from the two days with beluga encounters (9 and 14 August) were browsed manually in Raven Pro 1.4 to detect non-overlapped calls with high signal-to-noise ratio (SNR). High-quality calls were analyzed in Raven to determine minimum, maximum, start, end, and peak frequencies, as well as duration, number of syllables, and number of inflection and modulation points. These parameters were used to run a principal components analysis (PCA) to objectively determine call types. Analyzed vocalizations were also subjectively matched to published vocal repertoires from other beluga populations. When no matches were found among published repertoires, vocalizations were assigned to a subjective category created by the author based on contour shape, spectral content, and whether the call was pulsed or tonal.
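
As an illustration of how a PCA on call measurements works, the sketch below standardizes a small table of call parameters and extracts the first principal component by power iteration on the correlation matrix. It is not the analysis software used for this appendix: the measurements are invented, only the first component is computed, and the sketch assumes every parameter varies across calls.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
           for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(z):
    """Correlation matrix computed from already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(row[i] * row[j] for row in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(corr, iterations=1000):
    """Leading eigenvector (loadings) and its share of the total variance."""
    p = len(corr)
    v = [1.0 + 0.01 * i for i in range(p)]  # slightly uneven starting vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * corr[i][j] * v[j]
                     for i in range(p) for j in range(p))
    total_variance = sum(corr[i][i] for i in range(p))
    return v, eigenvalue / total_variance

# Invented measurements: duration (s), minimum frequency (kHz),
# maximum frequency (kHz), number of inflection points.
calls = [
    [0.45, 2.1, 4.8, 1],
    [0.80, 3.0, 7.9, 2],
    [1.20, 2.6, 10.5, 2],
    [0.30, 5.2, 6.1, 0],
    [1.60, 3.9, 14.2, 3],
]
z = standardize(calls)
loadings, share = first_component(correlation_matrix(z))
scores = [sum(a * b for a, b in zip(row, loadings)) for row in z]
print([round(x, 2) for x in loadings], round(share, 2))
```

Calls with similar scores on the leading components would then be candidates for the same call type, to be checked against the subjective categories.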

Results

Noise

A total of 13.7 hours of data were collected on six days between 2 and 14 August 2007. Three of these days included visits to the Port of Anchorage (PoA), two to the docks at Point Mackenzie (PtMac), and one each to a site at Point Woronzof (PtWor), near the Anchorage International Airport, and a site near the middle of Knik Arm (MidKnik), near Elmendorf Air Force Base. Multiple sub-sites were sampled during each site visit (Figure A-1). The PoA and PtMac sites are hereafter referred to as “developed” because of the increased anthropogenic activity in these areas compared to either the PtWor or MidKnik (“remote”) sites. Recordings occurred during all tidal stages at the PoA and PtMac sites, but only during high and outgoing tides at PtWor and MidKnik. Comparisons between all sites therefore focus only on data recorded during high tide, which also reduces the impact of flow noise over the suspended hydrophone.

Tidal (flow-noise) contributions

Tidal noise contributed to overall broadband SPLs at the two relatively remote sites (Point Woronzof and mid-Knik Arm), but appeared to have less of an impact near the Port of Anchorage and Point Mackenzie (Figure A-2). The two remote locations were only sampled on one day each, which could affect the presented levels. No vessels were seen at either remote location on the recording dates. When only high tide levels were analyzed, the two developed sites (PoA and PtMac) had significantly higher noise levels than did either of the remote sites.

Figure A-2. Average broadband rms noise levels from all sites according to tidal stage. Error bars indicate standard deviation. Tidal stage does not appear to drastically influence noise levels at the Port of Anchorage (PoA), and only minimally affects noise levels at Point Mackenzie and at the mid-Knik Arm site. Noise levels at Point Woronzof changed dramatically when tide changed from “high” to “outgoing”. Days spent at each recording site: 3 (PoA), 2 (Pt. Mac), 1 (Pt. Woronzof), 1 (mid-Knik).

Developed sites

The two developed sites sampled during these recordings were close to the boat launch and were therefore sampled most frequently. Recordings were made at the Port of Anchorage (PoA) on three days and at the Point Mackenzie (PtMac) docks on two days. Activities observed at the PoA site during recording sessions included loading and unloading of various container ships (the most frequent of which was the Midnight Sun), use of tugs to move gravel barges near the port, and bucket dredging. No recordings were made of tugs docking vessels or of cargo ships leaving the dock, both of which Blackwell and Greene (2002) recorded; the levels reported here are therefore substantially lower than those observed in 2001, and lower than would be observed when vessels are moving into position or leaving the dock. No vessels were observed at the PtMac site on either day, but activity at the PoA, just across Knik Arm from PtMac, was visible.

Broadband noise levels at the PoA ranged from 84 to 123 dB re 1 µPa, averaging 110.9 dB (standard deviation: 4.5 dB). The minimum and maximum noise levels were both recorded on 2 August, indicating a wide daily range of noise levels at the port. Tidal stages during minimum and maximum noise levels were outgoing and high, respectively. Spectral content of the noise at this site was concentrated below 1 kHz, but ranged to over 8 kHz (Figure A-3).

Noise levels at the Point Mackenzie sites ranged from 91 to 120 dB re 1 µPa (mean ± SD: 106 ± 6 dB). No vessels were observed at PtMac sites during recording, but vessel noise was clearly audible on recordings. Noise was highest during outgoing tide at this site, likely due to flow noise of the current over the hydrophone. Spectral content of noise was concentrated below 1 kHz, similar to the noise at the Port of Anchorage.

Figure A-3. Typical 1/3 octave band noise levels for high tide recordings at all sites. Noise below 1 kHz at the two developed sites is due to increased anthropogenic activity in these areas. Increased noise at the developed sites extends up to approximately 8 kHz.

Remote locations

Two relatively isolated locations were sampled on one day each. An entire day (August 6) was spent at Point Woronzof near Anchorage International Airport, under the approach pattern for landing jets. On August 7, recordings were made in an area in the middle of Knik Arm, north of Anchorage and closer to Forts Elmendorf and Richardson (US Air Force and Army, respectively).

Recordings at Point Woronzof were made during high and outgoing tidal periods. Broadband levels recorded during the outgoing tide were substantially higher than those recorded during high tide (mean ± SD; outgoing: 110 ± 13 dB; high: 88 ± 3 dB). No ships or waterborne vessels were observed at this site; the only noise sources in close proximity to the recording platform were the tide and commercial airliners from Ted Stevens International Airport. Approaches by airplanes were generally limited to under a minute of increased noise exposure; on 6 August 2007, 19 planes were observed approaching or leaving the airport in 3.5 hours. This amount of noise spread over such a time would not be expected to increase the average SPL by 17 dB; a significant fraction of the increase can therefore be attributed to the tidal shift. Spectral content of noise at Point Woronzof was relatively even, with a flat distribution of noise across the entire frequency band (Figure A-3). There were no visually or audibly detectable vessel sounds in the recorded data.

One hour and twenty minutes of recordings were made at the mid-Knik Arm position north of Anchorage, during high and outgoing tides. Unlike recordings made at Point Woronzof, broadband noise levels recorded during the outgoing tide were not appreciably higher than those made during high tide (mean ± SD; outgoing: 92 ± 2 dB; high: 92 ± 8 dB). No vessels were observed in proximity to the recording platform, though several military jets were observed circling above Knik Arm during the outgoing tidal period. Spectral content of noise recorded in mid-Knik Arm was similar to that recorded at the other remote site, with a relatively flat level across the entire frequency bandwidth, particularly at lower frequencies where the developed sites had elevated noise levels (Figure A-3).
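The aircraft argument made for Point Woronzof can be checked with a rough energy-average calculation. The sketch below (in Python) asks how far above the background an intermittent source would have to be, for its entire "on" time, to raise the time-averaged level by a given amount. It assumes each of the 19 flyovers contributed one full minute of noise (an upper bound taken from "under a minute" in the text) and that levels average on an energy basis; the function name is illustrative and is not part of the original analysis.

```python
import math

def required_excess_db(rise_db, duty_cycle):
    """dB above background that an intermittent source must hold while 'on'
    to raise the energy-averaged level by rise_db."""
    # Averaged energy relative to background: (1 - d) * 1 + d * 10^(x/10)
    target = 10 ** (rise_db / 10)
    on_energy = (target - (1 - duty_cycle)) / duty_cycle
    return 10 * math.log10(on_energy)

# 19 aircraft in 3.5 hours, each assumed to add at most one minute of noise
duty = (19 * 1.0) / (3.5 * 60)
excess = required_excess_db(17.0, duty)
print(f"duty cycle = {duty:.3f}, required excess = {excess:.1f} dB")
```

Under these assumptions every flyover would have to exceed the background by well over 20 dB for a full minute to account for a 17 dB rise, which is consistent with attributing much of the increase to tidal flow.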

Beluga encounters and vocal behavior

Beluga whales were encountered on two days (9 and 14 August) during data collection. On August 9, the experimenter and recording platform were cooperating with an effort to film and record Cook Inlet beluga whales near the Susitna River Delta in the mid-inlet region. On August 14, 6-8 beluga whales were observed approaching the Port of Anchorage from the North, passing between the recording vessel and shore and between the recording vessel and the Midnight Sun docked at the port. Due to the high level of noise at the port during the August 14 encounter (Figure A-4), all vocalization analyses used data collected on August 9.

Figure A-4. Beluga vocalizations and ship noise recorded at the Port of Anchorage on 14 August 2007. 6-8 animals were observed travelling from north to south during loading of the container ship “Midnight Sun”. Note the high level of noise at low frequencies in this spectrogram.

A total of 80 minutes of data were recorded during beluga encounters, yielding 2,574 vocalizations, excluding echolocation clicks. Of these, 1,025 were of sufficient quality to be analyzed for spectral and temporal parameters and classified into call types. Pulsed calls composed 19.7% of recorded vocalizations, while the remaining 80.3% were contoured whistles. Some call types had both tonal and pulsed components; these were classified as contoured calls (see Figure A-5b for an example).

Vocalization frequencies ranged from 0.38 to 24 kHz, with 97.6% of calls using frequencies below 10 kHz and 59.7% using frequencies below 5 kHz. The entire call was below 10 kHz in 83.4% of cases and below 5 kHz in 24.2% of cases. Duration of calls ranged from 0.035 to 6.5 s (mean ± SD: 0.66 ± 0.6 s) for all call types pooled together. Attempts to objectively classify vocalization types using a principal components analysis failed to discriminate call types. When possible, vocalization types were subjectively matched to published repertoires, or assigned to call types created by the author. Example spectrograms of four call types are given in Figure A-5.

Figure A-5. Example spectrograms of recorded call types. a) Contoured pulsed series similar to type W4d in Chmelnitsky and Ferguson (2012). b) Contour described in Sjare and Smith (1986); the noisy component is not. c) Harmonic “whistle”; Belikov and Bel’kovitch (2007) WT8. d) Upswept call; Belikov and Bel’kovitch (2007) WT11. Spectrogram parameters: 256-point Hanning window, 50% overlap; 256-point DFT.
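The spectrogram parameters in the caption determine the time and frequency resolution once a sampling rate is fixed. The Python sketch below works this out for a 48 kHz sampling rate, which is an assumption inferred from the 24 kHz upper limit of the recordings and is not stated in the text; the function name is illustrative.

```python
def spectrogram_resolution(fs_hz, n_window, overlap_fraction, n_dft):
    """Window length, hop size and DFT bin width for a spectrogram."""
    hop_samples = int(n_window * (1 - overlap_fraction))
    return {
        "window_s": n_window / fs_hz,   # duration of one analysis window
        "hop_s": hop_samples / fs_hz,   # time step between successive columns
        "bin_hz": fs_hz / n_dft,        # spacing of frequency bins
    }

# 256-point window, 50% overlap, 256-point DFT; fs = 48 kHz is assumed
res = spectrogram_resolution(48_000, 256, 0.5, 256)
print(res)
```

With these values each frequency bin is a little under 200 Hz wide and successive spectrogram columns are a few milliseconds apart.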

Spectral overlap between noise and beluga vocalizations

The hearing threshold for beluga whales has been determined using behavioral audiograms with captive animals (Awbrey et al., 1988; Au, 1993; Richardson et al., 1995; Mooney et al., 2008). Their hearing is poor below 1 kHz, with an estimated threshold of 140 dB at 100 Hz, falling to approximately 40 dB at 20 kHz. Peak hearing sensitivity is found around 40 kHz, at the frequencies used for echolocation. Starting at around 300 Hz, the maximum observed noise levels at the Port fall just above the animals’ estimated hearing threshold, indicating that the animals are likely to perceive noise from vessels at the Port of Anchorage, and possibly from nearby transiting vessels in other areas. In particularly loud situations, as noted by Blackwell and Greene (2002), communication masking may be an issue for these whales.

Figure A-6. Range of noise levels recorded in Cook Inlet plotted with the estimated beluga hearing threshold (modified from Richardson et al., 1995) and the range of beluga vocalization frequencies recorded in this dataset. The blue line indicates the maximum noise levels recorded (Port of Anchorage site), and the red line indicates minimum noise from mid-Knik Arm.

The spectral range for beluga vocalizations was between 0.38 kHz and 24 kHz (the limit of our recordings). The vast majority of the analyzed calls used frequencies below 10 kHz, with a quarter of all calls falling completely below 5 kHz. These frequencies overlap with both the range of noise levels recorded and the whales’ hearing range (Figure A-6), indicating that vocalization masking may be occurring when whales are near the Port of Anchorage or when vessels transit close to whales in other areas of Cook Inlet.

Discussion

Noise levels in upper Cook Inlet during August 2007 varied with the amount of anthropogenic activity and tidal stage at each site. Levels were highest at the two developed sites, the Port of Anchorage and Point Mackenzie, which have high levels of vessel activity, particularly during high tides. The two relatively remote sites sampled had lower noise levels, especially below 1 kHz, where most shipping noise occurs. Tidal flow noise appeared to have minimal impacts on noise levels at the Port of Anchorage and Point Mackenzie recording locations, but dramatically changed noise levels near Point Woronzof. In recordings made at the Port of Anchorage, shipping noise was detectable up to 8 kHz, though the highest levels were found around 0.5 kHz. Noise levels at the developed sites were 14 – 18 dB higher on average than noise levels at remote sites. Within-site variability was high, particularly at the Port of Anchorage, where the highest and lowest noise levels were both recorded on the same day, with a 26 dB difference dependent on the anthropogenic activity and tidal levels. Flow noise at the remote sites also caused high levels of variation, with noise increasing during outgoing tides.

Recorded beluga vocalizations represented many different call types, some of which have been previously described from other beluga populations, and some of which appear novel and possibly unique to the Cook Inlet beluga stock. A principal components analysis failed to discriminate unique call types, but many vocalizations were sufficiently stereotyped to allow a human analyst to subjectively classify call types, leading to a partial description of the vocal repertoire for this population. The majority of beluga vocalizations recorded used frequencies that overlapped the increased anthropogenic noise frequencies found at the Port of Anchorage and Point Mackenzie sites, with calls as low as 0.391 kHz, directly in the frequency band that most shipping noise also occupies. Other vocalizations ranged to the limit of our recording abilities, 24 kHz.

The spectral overlap between beluga vocalizations and anthropogenic noise in Cook Inlet indicates that there is potential for communication masking when whales are close to the Port of Anchorage, and potentially when whales are close to vessels transiting through other areas of the inlet. Shipping noise was highest at low frequencies, where beluga whales have very high hearing thresholds, and the whales are unlikely to hear most of the levels recorded during August 2007. However, as Blackwell and Greene (2002) noted, the highest levels of shipping noise occurred during active docking of a container ship by a tugboat, which was not observed during data collection for this report.

Acknowledgements

This project was supported by the National Marine Fisheries Service Protected Resources Division in Anchorage, AK. Thanks are due to Barbara Mahoney, Lt. Jonathan Taylor, and Matt Eagleton for assistance with data collection, and to Drs. Charles Greene, Susanna Blackwell, Susan Parks, Thomas Gabrielson, and Dawn Grebner for assistance with analysis.

References

Angiel, N. M. (1997). "The vocal repertoire of the beluga whale in Bristol Bay, Alaska," (University of Washington).

Au, W. W. L. (1993). The Sonar of Dolphins (Springer-Verlag, New York, NY).

Awbrey, F. T., Thomas, J. A., and Kastelein, R. A. (1988). "Low-frequency underwater hearing sensitivity in belugas, Delphinapterus leucas," The Journal of the Acoustical Society of America 84, 2273-2275.

Belikov, R. A., and Bel'kovitch, V. M. (2006). "High-pitched tonal signals of beluga whales (Delphinapterus leucas) in a summer assemblage off Solovetskii Island in the White Sea," Acoustical Physics 52, 125-131.

Belikov, R. A., and Bel'kovitch, V. M. (2007). "Whistles of beluga whales in the reproductive gathering off Solovetskii Island in the White Sea," Acoustical Physics 53, 528-534.

Belikov, R. A., and Bel'kovitch, V. M. (2008). "Communicative pulsed signals of beluga whales in the reproductive gathering off Solovetskii Island in the White Sea," Acoustical Physics 54, 115-123.

Blackwell, S. B., and Greene, C. R. J. (2002). "Acoustic measurements in Cook Inlet, Alaska, during 2001," Prepared for National Marine Fisheries Service. Greeneridge Report 271-1. Greeneridge Sciences, Inc., Aptos, CA.

Erbe, C. (2000). "Detection of whale calls in noise: Performance comparison between a beluga whale, human listeners, and a neural network," The Journal of the Acoustical Society of America 108, 297-303.

Hobbs, R., Shelden, K. E. W., Rugh, D. J., and Norman, S. A. (2008). "2008 Status review and extinction risk assessment of Cook Inlet belugas (Delphinapterus leucas)," in AFSC Processed Report 2008-02 (Alaska Fisheries Science Center, National Marine Fisheries Service).

Hobbs, R. C., Sims, C. L., and Shelden, K. E. W. (2011). "Estimated abundance of belugas in Cook Inlet, Alaska, from aerial surveys conducted in June 2011," in Unpublished Report (NMFS, NMML).

Mooney, T. A., Nachtigall, P. E., Castellote, M., Taylor, K. A., Pacini, A. F., and Esteban, J.-A. (2008). "Hearing pathways and directional sensitivity of the beluga whale, Delphinapterus leucas," Journal of Experimental Marine Biology and Ecology 362, 108-116.

Moore, S. E., and DeMaster, D. P. (2000). "Cook Inlet belugas, Delphinapterus leucas: Status and overview," Marine Fisheries Review 62, 1-5.

NMFS (2008). "Conservation plan for the Cook Inlet beluga whale (Delphinapterus leucas)," (National Marine Fisheries Service, Juneau, AK).

O'Corry-Crowe, G. M. (2009). "Beluga Whale: Delphinapterus leucas," in Encyclopedia of Marine Mammals (Second Edition), edited by W. F. Perrin, B. Würsig, and J. G. M. Thewissen (Academic Press, London), pp. 108-112.

Richardson, W. J., Greene, C. R. J., Malme, C. I., and Thompson, D. H. (1995). Marine Mammals and Noise (Academic Press, San Diego, CA).

Sjare, B. L., and Smith, T. G. (1986a). "The relationship between behavioral activity and underwater vocalizations of the white whale, Delphinapterus leucas," Canadian Journal of Zoology 64, 2824-2831.

Sjare, B. L., and Smith, T. G. (1986b). "The vocal repertoire of white whales, Delphinapterus leucas, summering in Cunningham Inlet, Northwest Territories," Canadian Journal of Zoology 64, 407-415.

Appendix B

Matlab code for Acoustic Habitat analyses

MATLAB CODE APPENDIX

The Matlab code presented here was used to calculate the broadband and 1/3 octave band noise levels for Chapter 3 of this dissertation. It is presented as a guide to future students. The first three scripts (“tob_percentile”, “filfortest_percentile”, and “hw10b”) were used to generate 1/3 octave band noise levels for the first second of every selected sound file. The fourth and fifth scripts (“avgPSD” and “BBmweight”) were used to apply the M-weighting function to the calculated power spectral densities and integrate over the spectrum to generate M-weighted broadband noise levels. The function “hw10b” calls a function (“third_octave_filter”) written by Dr. Thomas Gabrielson at Penn State, which is not presented here.

TOB_PERCENTILE

[filename, pathname]=uigetfile('*.wav', 'Pick .wav files', 'MultiSelect', 'on'); % Select .wav files to analyze
filename=sort(filename);
%
zz=zeros(23,length(filename)); % create empty matrix for 1/3 octave band values (23 bands)
RMS=zeros(1,length(filename)); % create empty vector for RMS values
%
for ii=1:length(filename)
    [~,fs]=wavread([pathname,filename{1,ii}], 1); % read one sample to obtain the sample rate
    [aa,fs]=wavread([pathname,filename{1,ii}], fs); % read in 1st second of each .wav file
    %
    [RMS_dB,fc,xn,xndB,avgSPL]=filfortest_percentile(aa,fs); % calculate values
    zz(:,ii)=avgSPL.';
    RMS(:,ii)=RMS_dB;
    clear aa
end
%
yy=prctile(zz, [5, 25, 50, 75,95],2); % Calculate percentiles
RMSc=RMS.';
cc=zz.'; % band levels, one row per file (assumed meaning of cc, which is otherwise undefined)
allvals=[RMSc,cc];

FILFORTEST_PERCENTILE

function [RMS_dB,fc,xn,xndB,avgSPL]=filfortest_percentile(aa,fs)
%
%
S=(10^(-164/20))/(10^-6); % Conversion from hydrophone sensitivity [dB re 1 V/uPa] to [V/Pa] - value given is for DSG unit used at Mystic
gain=10^(3/20); % Returns pressure factor for a given dB gain as set by postamp. [Unitless] (P/Pref is only a ratio!)
CF=S*gain; % Correction factor for sensitivity and gain (divide .wav by dBgain pressure; see quiz 2 part a4) [V/Pa]
x=(aa)/CF; % given in Pa
RMS_dB=20*log10(sqrt(mean(x.^2))/(1*10^-6)); % broadband RMS level [dB re 1 uPa] (assumed definition of the first output)
%
%
dt=1/fs; N=length(x); T=N*dt; df=1/T;
times=(0:(N-1))*dt;
freqs=(0:(N-1))*df;
%
Tc=1.0; % Exponential averaging time constant [s] (1 s corresponds to "slow" time weighting)
%
xn_low=zeros(13,67); % create empty matrix for values (67 downsampled points per band)
%
bn=20:43; % Band numbers for the 1/3 octave bands of interest
%
for ii=1:13
    fc=bn(ii); % bn(1):bn(13) - band numbers of filters for low frequencies
    nb=fc;
    xn2=hw10b(x,fs,Tc,nb); % 1/3 octave filter function
    xn2=xn2.';
    xn_low(ii,:)=xn2; % compile values for each 1/3 octave band
end
avg_low=mean(xn_low,2); % Take average of intensity measure in each 1/3 octave band
avgSPL_low=(10*log10(avg_low./(1*10^-6).^2)); % convert from [Pa^2] to [dB re 1uPa]
clear ii
%
fm=10.^(bn./10);
fc1=fm(1:13).'; % Center frequencies of filters
%
xn_high=zeros(10,67); % create empty matrix for high frequency values
%
bn2=bn(14:23); % band numbers for high frequency 1/3 octave band filters
for ii=1:10
    fc=bn2(ii); % bn(14):bn(23); band numbers for high frequency bands
    nb=fc;
    xn3=hw10b(x,fs,Tc,nb); % 1/3 octave filter function
    xn3=xn3.';
    xn_high(ii,:)=xn3;
end
avg_high=mean(xn_high,2);
avgSPL_high=(10*log10(avg_high./(1*10^-6).^2));
clear ii
%
fm=10.^(bn./10);
fc2=fm(14:23).';
%
xn=[xn_low;xn_high];
xndB=(10*log10(xn./(1*10^-6).^2));
avgSPL=[avgSPL_low; avgSPL_high];
fc=[fc1; fc2];
%
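The first lines of filfortest_percentile convert recorded voltage to pressure using the hydrophone sensitivity (-164 dB re 1 V/µPa) and the 3 dB post-amplifier gain given in the listing. The sketch below restates that conversion in Python rather than Matlab, as a sanity check on the units; the function names are illustrative and not part of the original scripts.

```python
import math

def correction_factor_v_per_pa(sensitivity_db_re_1v_per_upa=-164.0, gain_db=3.0):
    """Combined sensitivity and gain in volts per pascal."""
    sens_v_per_pa = 10 ** (sensitivity_db_re_1v_per_upa / 20) / 1e-6  # V/uPa -> V/Pa
    gain = 10 ** (gain_db / 20)                                        # unitless ratio
    return sens_v_per_pa * gain

def level_db_re_1upa(rms_volts):
    """Sound pressure level corresponding to an RMS voltage."""
    pressure_pa = rms_volts / correction_factor_v_per_pa()
    return 20 * math.log10(pressure_pa / 1e-6)

level_1v = level_db_re_1upa(1.0)
print(f"1 V RMS corresponds to {level_1v:.1f} dB re 1 uPa")
```

A 1 V RMS signal maps to 164 - 3 = 161 dB re 1 µPa, which is the expected result of undoing the sensitivity and the gain.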

HW10B

function [xn,times2]=hw10b(x,fs,Tc,nb)
% x is original time series, fs is sample rate, Tc is exponential-average
% time constant, nb is ISO band number for filter.
%
dt=1/fs; N=length(x);
times=(0:(N-1))*dt;
%
% Step 1: 1/3-octave filter
fm=10^(nb/10); % calculate center frequency for the 1/3 octave band
% fl=fm*10^-.05;
% fu=fm*10^.05;
%
[bb,aa]=third_octave_filter(fm,fs); % calculate filter coefficients using Dr. Gabrielson's code
Fxn=filter(bb,aa,x); % apply filter
%
% Step 2 - Square filter output
xx=Fxn.^2;
%
% Step 3 - Exponential Averaging
%
alpha=dt/Tc;
%
B=[0 alpha]; A=[1 alpha-1];
%
yef=filter(B,A,xx);
%
% Step 4 - downsample filtered file
%
xn=yef(1:750:end);
times2=times(1:750:end);
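Two steps of hw10b can be illustrated without the band-pass filter itself: the mapping from ISO band number to center frequency and band edges, and the one-pole exponential average defined by B=[0 alpha], A=[1 alpha-1]. The sketch below is in Python rather than Matlab and uses illustrative function names; it does not reproduce third_octave_filter, which is not presented in this appendix.

```python
def third_octave_band(nb):
    """Lower edge, center and upper edge [Hz] for ISO band number nb."""
    fm = 10 ** (nb / 10)
    return fm * 10 ** -0.05, fm, fm * 10 ** 0.05

def exponential_average(samples, dt, tc):
    """y[n] = (1 - alpha) * y[n-1] + alpha * x[n-1], with alpha = dt / tc.

    This is the difference equation implied by filter([0 alpha], [1 alpha-1], x).
    """
    alpha = dt / tc
    out, prev_y, prev_x = [], 0.0, 0.0
    for x in samples:
        y = (1 - alpha) * prev_y + alpha * prev_x
        out.append(y)
        prev_y, prev_x = y, x
    return out

low, center, high = third_octave_band(30)   # band 30 is centered on 1 kHz
smoothed = exponential_average([1.0, 1.0, 1.0], dt=0.5, tc=1.0)
print(low, center, high, smoothed)
```

Band numbers 20 through 42, as used in filfortest_percentile, therefore span center frequencies from 100 Hz to roughly 15.8 kHz.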

AVGPSD

function [aGxx,freqs2,dB,df,Nrecs]=avgPSD(xn, fs)
% Returns the PSD for an averaged series of records. Inputs are a time
% series and the sampling rate.
%
dt=1/fs; T=length(xn)*dt; N=length(xn); df=1/T;
%
times=(0:(N-1))*dt;
freqs=(0:(N-1))*df;
%
% RMS averaging
Nrecs=1; % # of records
Nfft=length(xn); % # of points/record
%
Trec=Nfft*dt;
dfrec=1/(Trec);
times2=(0:Nfft-1)*dt;
freqs2=(0:Nfft/2)*dfrec;
%
W=1; % apply window function here if desired
Aw=1; % correction factor for windowing function
%
n1=1;
n2=n1+ Nfft-1;
record1=xn(n1:n2).*W;
%
%Calculate linear spectrum
Xm1=fft(record1)*dt;
%
% Calculate Sxx
Sxx1=(1/Trec).*(abs(Xm1).^2);
%
% Calculate Gxx
Gxx1=[(1/Trec).*(abs(Xm1(1)).^2); 2*Sxx1(2:Nfft/2);(1/Trec).*abs(Xm1((Nfft/2)+1)).^2].*Aw; %apply CF here
for ii=2:Nrecs
    n1=n1+Nfft;
    n2=n2+Nfft;
    record1=xn(n1:n2).*W;
    %Calculate linear spectrum
    Xm1=fft(record1)*dt;
    %
    % Calculate Sxx
    Sxx1=(1/Trec).*(abs(Xm1).^2);
    %
    % Calculate Gxx
    Gxx1a=[(1/Trec).*(abs(Xm1(1)).^2); 2*Sxx1(2:Nfft/2);(1/Trec).*abs(Xm1((Nfft/2)+1)).^2].*Aw;
    Gxx1=[Gxx1, Gxx1a];
end
% aGxx=sum(Gxx1, 2)/Nrecs;
aGxx=mean(Gxx1,2);
BBpower=sum(aGxx*dfrec); % integrate the PSD over frequency (renamed so the built-in sum is not shadowed)
sumdB=10*log10(BBpower/((10^-6)^2));
% figure
% loglog(freqs2, aGxx)
% xlabel('Frequency [Hz]')
% ylabel('PSD [Pa^2/Hz]')
%
dB=10*log10(aGxx/((10^-6)^2)); % PSD in dB re 1 uPa^2/Hz (returned as an output)
% figure
% semilogx(freqs2,dB)

BBMWEIGHT

function [BBMstats,MprctilesdB]=BBmweight(freqs2,allPSDs,Nrecs)
% Function to generate M-weighted broadband noise levels from power
% spectral densities generated from Mystic and Cook Inlet data.
%
% Inputs are the frequency vector and matrix of PSDs generated by the
% function avgPSD, and the number of averages used in that function.
df=1; % frequency resolution [Hz] for 1-second records
%
fc=freqs2;
flow=150; % [Hz] - from Southall et al. Mid frequency cetacean hearing lower limit
fhigh=160000; % [Hz] - Southall et al. upper limit
R=((fhigh^2).*fc.^2)./((flow^2+fc.^2).*(fhigh^2+fc.^2));
M=20*log10(R./max(abs(R)));
M=M.';
%
dB=10*log10(allPSDs/((10^-6)^2));
%
mweightPSDs=zeros(size(allPSDs,1), size(allPSDs,2));
for ii=1:size(allPSDs,2)
    mweightPSDs(:,ii)=dB(:,ii)+M;
end
%
intM=((10^-6)^2)*(10.^(mweightPSDs/10));
BB_M=sum(intM*df); % integrate over frequency (df replaces the undefined dfrec)
BB_MdB=10*log10(BB_M/((10^-6)^2));
%
avgBBM=mean(BB_M);
avgBBMdB=10*log10(avgBBM/((10^-6)^2));
%
medBBM=median(BB_M);
medBBMdB=10*log10(medBBM/((10^-6)^2));
%
minBBM=min(BB_M);
minBBMdB=10*log10(minBBM/((10^-6)^2));
%
maxBBM=max(BB_M);
maxBBMdB=10*log10(maxBBM/((10^-6)^2));

BBMstats=[avgBBMdB,medBBMdB,minBBMdB,maxBBMdB];
% BBstats=[avgBBdB,medBBdB,minBBdB,maxBBdB]; % unweighted statistics; these inputs are not computed in this function
%
Mprctiles=prctile(BB_M,[5,25,50,75,95]);
MprctilesdB=10*log10(Mprctiles/((10^-6)^2));
%
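The weighting curve built in BBmweight can be examined on its own. The sketch below, in Python rather than Matlab, evaluates the same ratio R(f) with the 150 Hz and 160 kHz limits from the listing. One difference is an assumption of this sketch: it normalizes by the analytic maximum of R, which occurs at f = sqrt(flow × fhigh), instead of by the maximum over a sampled frequency vector as the Matlab code does. The function name is illustrative.

```python
import math

def m_weighting_db(f_hz, f_low=150.0, f_high=160_000.0):
    """M-weighting [dB] at frequency f_hz, normalized to 0 dB at its peak."""
    def ratio(f):
        return (f_high**2 * f**2) / ((f_low**2 + f**2) * (f_high**2 + f**2))
    f_peak = math.sqrt(f_low * f_high)   # frequency at which the ratio is largest
    return 20 * math.log10(ratio(f_hz) / ratio(f_peak))

for f in (150.0, 1_000.0, 5_000.0, 20_000.0):
    print(f"{f:8.0f} Hz: {m_weighting_db(f):6.2f} dB")
```

The curve is close to flat across most of the band of interest and is about 6 dB down at the 150 Hz lower limit, so the weighting mainly discounts low-frequency noise.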

Appendix C

Vocal repertoire and acoustic behavior of beluga whales at Mystic Aquarium in November 2010

Abstract

Beluga whales are known as a vocal species of cetacean, with a large variety of call types observed from many different populations. In captivity, belugas may behave differently than wild whales in fluid and complex social situations. Captive belugas, housed in small groups in various aquaria, may exhibit different vocal patterns than wild animals from the same populations. This study examined the vocal behavior and repertoire of three captive beluga whales at Mystic Aquarium in Mystic, Connecticut. Analysis of recordings from day and nighttime hours indicates that the Mystic belugas have a more limited vocal repertoire than wild beluga populations, and that the captive whales show a strong diel pattern in vocal activity. The Mystic whales appear to have adjusted their behavior and vocalizations to living in a small, highly controlled environment with regularly timed trainer interactions and activities.

Introduction

Beluga whales are known as the ‘canaries of the sea’ because of their extensive and flexible vocal repertoires (Beddard, 1900; Schevill and Lawrence, 1949; Sjare and Smith, 1986b; Angiel, 1997; Chmelnitsky and Ferguson, 2012). Since their vocalizations were first described by Schevill and Lawrence (1949), repertoires have been detailed for wild populations from the White Sea (Belikov and Bel'kovitch, 2006; 2007; 2008), the St. Lawrence (Sjare and Smith, 1986a; b) and Churchill Rivers (Chmelnitsky and Ferguson, 2012), Svalbard, Norway (Karlsen et al., 2002), and Bristol Bay, Alaska (Angiel, 1997). Efforts are underway to fully characterize the repertoire of the Cook Inlet, Alaska, beluga population (Hotchkin et al., 2010; Blevins et al., 2012). While it is difficult to compare call types between studies due to the high degree of gradation between signals (Recchia, 1994; Angiel, 1997; Chmelnitsky and Ferguson, 2012), every described repertoire contains call types that are shared between populations, including tonal whistles, pulsed tones, and ‘hybrid’ sounds (Karlsen et al., 2002; Chmelnitsky and Ferguson, 2012).

Despite the many descriptions of vocal repertoires from wild beluga populations, there have been relatively few studies of the vocalizations produced by captive belugas (Recchia, 1994; Castellote and Fossa, 2006; Vergara and Barrett-Lennard, 2008; Kelley, 2010; Vergara et al., 2010). Contact calls have been described for the whales at the Vancouver Aquarium, and similar call types described from wild whales of the St. Lawrence River population (Vergara et al., 2010), but no other studies have linked objective call-type classifications and behavioral functions of calls in captive belugas.

Vocal behavior in captive marine mammals may provide clues to the animals’ welfare and their reactions to social or environmental changes (Castellote and Fossa, 2006; Vergara et al., 2010). In addition, studying vocalizations from captive animals may allow evaluation of the effects of captivity on the “normal” behaviors of these species. The relatively stable, regular structure of these animals’ days (training sessions, feeding times, etc.) and the restricted range of captive animals appear to cause the loss of some “normal” behaviors and the emergence of stereotyped behaviors not seen in the wild (Mason and Latham, 2004; Swaisgood and Shepherdson, 2005). Whether vocal behavior and acoustic communication are also impacted by captivity has yet to be studied in marine mammals.
This project sought to begin evaluating the effects of captivity on the vocal repertoire and behavioral functions of vocalizations produced by captive beluga whales at Mystic Aquarium in Mystic, Connecticut through 1) subjective classification of the animals’ vocal repertoire, 2) examination of possible diel patterns in vocal behavior, and 3) comparison of vocalizations from Mystic with those recorded from wild animals at the Churchill River, Manitoba (Chmelnitsky and Ferguson, 2012). These steps may allow for a basic recognition of differences in vocal behavior of wild and captive belugas, and future evaluation of the behavioral functions of sounds produced by these animals.

Study system

The beluga habitat at Mystic Aquarium is called the “Arctic Coast Exhibit”, and is an outdoor habitat composed of three pools (“main”, “med”, and “hold”; Figure C-1) with a large underwater viewing area. Whales are allowed to swim freely in all three pools, with occasional closure of one or both small pools and short-term confinement in the med or holding pools as required for exhibit maintenance and veterinary procedures (Sirpenski, G., pers. comm.).

Figure C-1. Diagram of the Arctic Coast exhibit at Mystic Aquarium with pools labeled. Gates between pools are indicated by hashes; there are two gates between the Main and Hold pools, one between Main and Med, and one between Med and Hold. The underwater viewing area is located underneath the semicircular canopy on the left side of the map. Image courtesy of Mike Osborn, Mystic Aquarium.

Two adult female belugas and one male on the verge of sexual maturity occupied the Arctic Coast Exhibit during data collection. The two females, Kela and Naku, were born in the wild population found near Churchill, Manitoba, were collected as juveniles, and have lived at Mystic since 1985 (Sirpenski, G., pers. comm.). They were both 27 – 28 years old at the time of recordings. The male, Juno, was born at Marineland Canada in 2002, transferred to Sea World Orlando, and sent to Mystic Aquarium on breeding loan in January 2010. All three whales were found to have normal beluga hearing curves (Sirpenski, G., pers. comm.), and have regular health checks from trainers and veterinary staff.

Methods

Data Collection

Data were collected following the methods outlined in chapter 4 of this dissertation between 3 and 16 November 2010. Vocalizations were recorded during daylight and darkness (sunrise and sunset times taken from: http://www.esrl.noaa.gov/gmd/grad/solcalc/). Behavioral observations were conducted via scan sampling (1 scan/minute) during the first 10 minutes of every hour of recording in order to document general behavior patterns and responses to unusual events (trainers in the exhibit, etc.).

Table C-1. Recording schedules for 2010 data collection sessions at Mystic Aquarium. On days when two pools are listed, the DSG was moved between pools during the recording period. A check in the overnight column means that the DSG was recording overnight beginning on the date with the checkmark and continuing through the following day.

Year Month Date Day Pool Overnight 6 Sa HOLD  7 Su HOLD  8 M MED  9 Tu MED/HOLD  2010 NOV 10 W HOLD  13 Sa MED  14 Su HOLD  15 M MAIN/MED  16 Tu MED/HOLD 

Data Analyses

Beluga vocalizations were analyzed for call rates and types during each hour of recording. For each hour of data, training sessions were excluded and a subsample of five randomly selected minutes was analyzed. A single analyst (C. Hotchkin) visually and aurally evaluated all calls detected in the subsample, and assigned each to a subjectively designated call type. Tonal vocalizations were categorized as ‘whistles’ if their duration was greater than 0.25 seconds, and as ‘chirps’ if their duration was less than 0.25 seconds. Call type names were assigned based on the aural impression of the call. Call rate was analyzed by multiplying the number of calls counted in each five-minute subsample by 12 to determine an hourly calling rate. Adjusted means for each hour were then calculated for comparison with sunrise and sunset times, and for easier visualization of the data.
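The call-rate arithmetic described above is simple enough to state as code. The Python sketch below scales a five-minute count to an hourly rate (× 12) and then expresses each hour's rate relative to the overall mean. The second step is only one possible reading of "adjusted means"; that interpretation, the function names, and the example counts are assumptions made for illustration.

```python
def hourly_rate(calls_in_subsample, subsample_minutes=5):
    """Scale a subsample count to calls per hour (5 min -> multiply by 12)."""
    return calls_in_subsample * (60 / subsample_minutes)

def mean_adjusted(rates):
    """Each rate minus the mean of all rates (assumed meaning of 'mean-adjusted')."""
    mean_rate = sum(rates) / len(rates)
    return [r - mean_rate for r in rates]

# Hypothetical five-minute counts for three hours of recording
counts = [10, 50, 30]
rates = [hourly_rate(c) for c in counts]
adjusted = mean_adjusted(rates)
print(rates, adjusted)
```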

Statistical analyses

A Kruskal-Wallis test was performed on the raw counts from the subsampled data to determine whether the calling rates for the Mystic whales differed during “day” and “night”.
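For readers unfamiliar with the test, the sketch below computes the Kruskal-Wallis H statistic (with the standard tie correction) in plain Python, and the p-value for the two-group case, where H is compared with a chi-square distribution with one degree of freedom. It is a minimal illustration on made-up counts, not a reproduction of the analysis reported here; the software used for the original test is not stated, and the function names are illustrative.

```python
import math

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # Assign each distinct value the average of the ranks it occupies
    ranks, tie_sizes, i = {}, [], 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_sizes.append(j - i)
        i = j
    rank_term = sum(sum(ranks[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    correction = 1 - sum(t**3 - t for t in tie_sizes) / (n**3 - n)
    return h / correction   # undefined if every observation is identical

def p_value_two_groups(h):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(h / 2))

# Hypothetical calls-per-subsample counts, for illustration only
night_counts = [1, 2, 3]
day_counts = [4, 5, 6]
h_stat = kruskal_wallis_h(night_counts, day_counts)
p = p_value_two_groups(h_stat)
print(f"H = {h_stat:.3f}, p = {p:.3f}")
```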

Results

A total of 4,799 vocalizations were detected in the five-minute subsamples. Calls were detected during both day and nighttime hours, and consisted of tonal whistles, pulsed and noisy calls, hybrid pulsed/tonal sounds, and echolocation clicks (not analyzed or counted). Twelve stereotyped call types, each of which composed > 2.5% of all calls, were identified (Table C-2). Calls which were non-stereotyped and/or composed less than 2.5% of vocalizations were grouped into an “other” category.


Table C-2. Numbers and percentages of the different call types found in the five-minute data subsamples. “Other” calls were non-stereotyped vocalizations which did not fit in any defined categories (e.g., scream; Figure C-3).

Call type               Total (N)   Total %
Flat whistle            861         17.94
Rising chirp/whistle    751         15.65
Chirp/chirp series      583         12.15
NBWT                    381         7.94
CT1                     356         7.42
Buzz                    282         5.88
Creak                   243         5.06
Flat chirp              284         5.92
Pulse series            153         3.19
Upsweep                 146         3.04
Noisy buzz              139         2.90
Noisy chirps            130         2.71
Other                   490         10.21

Vocal Repertoire

The vocal repertoire of the beluga whales at Mystic Aquarium is highly variable, with many stereotyped call types detected. Tonal calls (Figure C-2) made up a slight majority of the total vocalizations recorded (51.7%); flat and rising contoured whistles composed 17.9% and 15.7% of the total vocalizations, respectively. Very few whistles had any complex frequency modulation (N=5, 0.1% of all calls).

[Figure C-2: five spectrogram panels (Rising whistles, Chirp, Rising/flat whistle, Flat whistle, Flat chirps); x-axis: Time [s]; y-axis: Frequency [kHz], 0 – 25.]

Figure C-2. Examples of tonal vocalizations recorded from the beluga whales at Mystic Aquarium. Most tonal calls were flat contours of varying durations, with some rising contours and few highly frequency-modulated whistles that are seen in wild beluga populations. Noisy components at the ends of calls (panels 1,3) were ignored for these analyses.

Pulsed and noisy calls made up a substantial fraction of the total vocalizations (N=1,684; 35.1%), and were generally more variable than the tonal whistles. “Honks”, “whines”, “buzzes”, and “creaks” (Figure C-3) were common, but the two most prevalent and highly stereotyped pulsed calls were “NBWT” (N=381, 7.9%) and “CT1” (N=356, 7.4%) (Figure C-4).

[Figure C-3: seven spectrogram panels (Honk, Whines, Noisy chirp, Buzz series, Creaks, Scream, Noisy buzzes); x-axis: Time [s]; y-axis: Frequency [kHz], 0 – 25.]

Figure C-3. Examples of pulsed and noisy vocalizations recorded from the beluga whales at Mystic Aquarium. These call types were highly variable and often included some tonal components (see: noisy buzzes panel). “Scream” vocalizations were rare (N=23, 0.5% of all calls) and grouped with “other” in the analyses.


Figure C-4. NBWT and CT1 exemplars. These call types were highly stereotyped and made up 15.3% of the total calls analyzed.

In addition to spontaneously produced vocalizations, the whales at Mystic have also been trained to produce a selection of sounds in air (Figure C-5). These sounds were not recorded outside of training sessions, but represent vocalizations available to the whales.

Figure C-5. Spectrograms of vocalizations produced during training sessions. These calls are produced in air, but are audible (and were recorded) underwater. The “cheer” call type is produced spontaneously by Kela after successful completion of a trained behavior.

Diel Patterns

The belugas at Mystic exhibited diel patterns in vocal behavior. Average hourly call rate calculated from the subsampled data was 617.0 ± 204.0 calls per hour during daylight, and 175.0 ± 45.6 calls per hour at night; a Kruskal-Wallis test of the subsampled data (Figure C-6) showed that this difference was highly significant (χ²(1,113) = 50.11, p < 0.001; Figure C-7).

Figure C-6. Mean number of calls per 5 minute subsample from each hour. Error bars indicate standard deviations. Calling rate increased around 0700 and decreased starting around 1400.

Figure C-7. Boxplot of calls per subsample for day (0600 – 1759) and night (1800 - 0559). Significantly more calls were detected during day than night hours.

Call detections for each five-minute subsample were used to calculate mean-adjusted hourly calling rates (Figure C-8). Variability in call rates was much greater during daylight hours than during the night. Call rates and variability increased abruptly between 7:00 and 8:00 am, and declined more gradually, starting at approximately 1:00 pm and falling below the mean between 3:00 and 4:00 pm.

Figure C-8. Mean-adjusted hourly calling rates in the Arctic Coast exhibit. Error bars indicate standard deviation. Calling rates increased sharply around 0700 and declined around 1300.

Call types also changed dramatically between day and night. Pulsed calls were more prevalent during the day than during the night (38.2% vs. 9.7% of all calls, respectively; Figure C-9). In particular, several pulsed call types, including NBWT, CT1, and Buzz, which each comprised >5% of calls during daylight hours, were produced at very low rates (or not produced at all) during the night (Table C-3).


Figure C-9. Call types recorded during day (0600 – 1759) and night (1800 – 0559) hours in the Arctic Coast exhibit. Pulsed and noisy calls (blue shades) made up a significantly higher portion of the total calls during daytime hours than during the night. Total number of calls was also substantially higher in day (N=4,275) than night (N=524).

Table C-3. Call types divided by time of occurrence (Day: 0600 – 1759; Night: 1800 – 0559). There was a dramatic change in the vocal repertoire between day and night, with fewer pulsed calls produced during night. “Other” calls were non-stereotyped vocalizations which did not fit in any defined categories (e.g., scream; Figure C-3).

Call type               Day (N)   Day %   Night (N)   Night %
Flat whistle                753   17.61         108     20.61
Rising chirp/whistle        616   14.41         135     25.76
Chirp/chirp series          468   10.95         115     21.95
NBWT                        381    8.91           0      0.00
CT1                         355    8.30           1      0.19
Buzz                        281    6.57           1      0.19
Creak                       228    5.33          15      2.86
Flat chirp                  212    4.96          72     13.74
Pulse series                144    3.37           9      1.72
Upsweep                     142    3.32           4      0.76
Noisy buzz                  138    3.23           1      0.19
Noisy chirps                106    2.48          24      4.58
Other                       451   10.55          39      7.44
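As a worked check on Table C-3, the following sketch recomputes the column totals and the share of pulsed and noisy calls from the tabulated counts. The membership of the pulsed/noisy group is an assumption: it is not listed explicitly in the table, but this grouping reproduces the 38.2% (day) and 9.7% (night) figures quoted in the text. The variable and function names are illustrative.

```python
# Counts transcribed from Table C-3 as (day, night).
TABLE_C3 = {
    "Flat whistle": (753, 108),
    "Rising chirp/whistle": (616, 135),
    "Chirp/chirp series": (468, 115),
    "NBWT": (381, 0),
    "CT1": (355, 1),
    "Buzz": (281, 1),
    "Creak": (228, 15),
    "Flat chirp": (212, 72),
    "Pulse series": (144, 9),
    "Upsweep": (142, 4),
    "Noisy buzz": (138, 1),
    "Noisy chirps": (106, 24),
    "Other": (451, 39),
}

# Assumed grouping of pulsed and noisy call types (not stated in the table).
PULSED_OR_NOISY = {
    "NBWT", "CT1", "Buzz", "Creak", "Pulse series", "Noisy buzz", "Noisy chirps",
}


def totals(table):
    """Return (day_total, night_total) summed over all call types."""
    day_total = sum(day for day, _ in table.values())
    night_total = sum(night for _, night in table.values())
    return day_total, night_total


def percent_share(table, call_types):
    """Return the (day %, night %) share of the given call types."""
    day_total, night_total = totals(table)
    day = sum(table[name][0] for name in call_types)
    night = sum(table[name][1] for name in call_types)
    return 100 * day / day_total, 100 * night / night_total


day_total, night_total = totals(TABLE_C3)
day_share, night_share = percent_share(TABLE_C3, PULSED_OR_NOISY)
print(f"Totals: day N={day_total}, night N={night_total}")
print(f"Pulsed/noisy share: day {day_share:.1f}%, night {night_share:.1f}%")
```

The recomputed totals match the sample sizes given in the Figure C-9 caption (N=4,275 and N=524).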

Discussion

Vocal Repertoire

A subjective classification of the calls produced by the beluga whales at Mystic Aquarium yielded a vocal repertoire consisting of twelve call categories and an “other” grouping. The vocal repertoire of the Mystic beluga whales is consistent with published repertoires from wild populations (Sjare and Smith, 1986b; Angiel, 1997; Kelley, 2010; Chmelnitsky and Ferguson, 2012), and included tonal whistles, pulsed sounds, and combination pulsed/tonal calls. Many of the stereotyped calls produced by the Mystic whales appear to match vocalizations described for the Churchill, Manitoba beluga population (Chmelnitsky and Ferguson, 2012); this is probably because the two females at Mystic were born into this group and collected for captivity as subadults (G. Sirpenski, pers. comm.). In particular, the call type categorized as CT1 for the Mystic whales appears similar to Chmelnitsky and Ferguson’s (2012) call type C6, which they described as a combination call type. In the recordings from Mystic Aquarium, however, the call type was pulsed with sidebands present in the upper (6 – 12 kHz) frequency bands and very little evidence of any tonal component. CT1 was used frequently by the Mystic whales, composing 7.4% of the analyzed calls (compared with 2.1% for call type C6 as reported by Chmelnitsky and Ferguson (2012)).

When the captive repertoire is compared with the published repertoire for the Churchill, Manitoba population, from which the two captive females were collected, there are distinct differences. In particular, the Mystic whales used a much higher proportion of pulsed and noisy calls than did the wild population (44% vs. 25%, respectively). They also used fewer frequency-modulated “wavy” whistles (captive: <2.5%; wild: 6.3%). The causes of these repertoire differences are unknown, but one potential explanation is that the captive whales, living in a less dynamic environment with regular schedules and feedings, may need a less diverse repertoire.
Additionally, the belugas at Mystic Aquarium live in a relatively small enclosure with a small, stable population. If these frequency-modulated whistles serve in long-distance communication, individual identification, or group cohesive functions, the restricted mobility and social structure of the captive population may have rendered them unnecessary. While specific whistle contours are often associated with behavioral states in wild populations, there is currently no understanding of the general functional differences between whistles and pulsed calls. Evaluating these differences should be the focus of a future study. A second future project should investigate the behavioral functions of the pulsed calls used by the Mystic belugas, and compare them with behavioral functions for similar call types in wild beluga populations (e.g., Sjare and Smith, 1986a; Chmelnitsky, 2010).

Diel Patterns

The Mystic beluga whales exhibited highly significant diel patterns in their vocal behavior. Vocalization rates increased around 0700 hours and began to decline around 1300 hours, dropping below the adjusted mean at around 1500 hours. The increase in vocalization rate corresponds to the time of sunrise during the recording period and to the arrival of trainers and other aquarium staff. The decrease, however, correlates only with a decrease in trainer activity, as sunset was around 1630 during data collection. The whales’ vocal behavior correlates better with the trainers’ activity than with the sunrise and sunset times, indicating that the whales’ activity patterns are likely driven by their interactions with trainers and the presence of humans in and around the Arctic Coast exhibit.

Vocalization types were also correlated with day and nighttime hours. Pulsed calls made up a greater proportion of the repertoire during the day than at night, which may provide clues to the behavioral functions of these calls. The observed reduction in rate of pulsed calls during “resting” hours is consistent with the hypothesis that pulsed vocalizations function in inter-individual communication or are indicative of elevated arousal and activity levels. Behavioral observations during night-time hours would be useful in confirming whether the whales are actually exhibiting reduced activity during these times or whether they are changing only their vocal behavior.

The vocal repertoire and behavior of the beluga whales at Mystic Aquarium are apparently different from those of the most closely related wild population and are closely tuned to the daily rhythm of human activity at the Aquarium. Future studies of the vocal repertoire and vocal behavior of these whales may provide insight into the communicative function of specific call types and the effects of captivity on marine mammals.

References

Angiel, N. M. (1997). "The vocal repertoire of the beluga whale in Bristol Bay, Alaska," (M.S. Thesis, University of Washington).
Beddard, F. E. (1900). A Book of Whales (G.P. Putnam's Sons, New York, NY).
Belikov, R. A., and Bel'kovitch, V. M. (2006). "High-pitched tonal signals of beluga whales (Delphinapterus leucas) in a summer assemblage off Solovetskii Island in the White Sea," Acoustical Physics 52, 125-131.
Belikov, R. A., and Bel'kovitch, V. M. (2007). "Whistles of beluga whales in the reproductive gathering off Solovetskii Island in the White Sea," Acoustical Physics 53, 528-534.
Belikov, R. A., and Bel'kovitch, V. M. (2008). "Communicative pulsed signals of beluga whales in the reproductive gathering off Solovetskii Island in the White Sea," Acoustical Physics 54, 115-123.
Blevins, R., Atkinson, S., Lammers, M. O., and Small, R. (2012). "Calling Behavior of Cook Inlet Beluga Whales," in Alaska Marine Science Symposium (Anchorage, AK).
Castellote, M., and Fossa, F. (2006). "Measuring Acoustic Activity as a Method to Evaluate Welfare in Captive Beluga Whales (Delphinapterus leucas)," Aquatic Mammals 32, 325-333.
Chmelnitsky, E. G., and Ferguson, S. H. (2012). "Beluga whale, Delphinapterus leucas, vocalizations from the Churchill River, Manitoba, Canada," The Journal of the Acoustical Society of America 131, 4821-4835.
Hotchkin, C. F., Parks, S. E., and Mahoney, B. A. (2010). "Anthropogenic noise sources and sound production of beluga whales (Delphinapterus leucas) in Cook Inlet, Alaska," in 159th Meeting of the Acoustical Society of America (Baltimore, MD), p. 1727.
Karlsen, J. K., Bisther, A. B., Lydersen, C. L., Haug, T. H., and Kovacs, K. K. (2002). "Summer vocalisations of adult male white whales (Delphinapterus leucas) in Svalbard, Norway," Polar Biology 25, 808-817.
Kelley, M. M. (2010). "Acoustic analysis of the ultrasonic underwater repertoire of beluga whales (Delphinapterus leucas) at the John G. Shedd Aquarium," (M.S. Thesis, Western Illinois University).
Mason, G. J., and Latham, N. R. (2004). "Can't stop, won't stop: is stereotypy a reliable animal welfare indicator?," Animal Welfare 13, 57-69.
Recchia, C. A. (1994). "Social Behaviour of Captive Belugas, Delphinapterus Leucas," (Massachusetts Institute of Technology, Ph.D. Thesis, Cambridge, MA).
Schevill, W. E., and Lawrence, B. (1949). "Underwater Listening to the White Porpoise (Delphinapterus leucas)," Science 109, 143-144.
Sjare, B. L., and Smith, T. G. (1986a). "The relationship between behavioral activity and underwater vocalizations of the white whale, Delphinapterus leucas," Canadian Journal of Zoology 64, 2824-2831.
Sjare, B. L., and Smith, T. G. (1986b). "The vocal repertoire of white whales, Delphinapterus leucas, summering in Cunningham Inlet, Northwest Territories," Canadian Journal of Zoology 64, 407-415.
Swaisgood, R. R., and Shepherdson, D. J. (2005). "Scientific approaches to enrichment and stereotypies in zoo animals: what's been done and where should we go next?," Zoo Biology 24, 499-518.
Vergara, V. L., and Barrett-Lennard, L. G. (2008). "Vocal development in a beluga calf (Delphinapterus leucas)," Aquatic Mammals 34, 123-143.
Vergara, V. L., Michaud, R., and Barrett-Lennard, L. G. (2010). "What can Captive Whales tell us About their Wild Counterparts? Identification, Usage, and Ontogeny of Contact Calls in Belugas (Delphinapterus leucas)," International Journal of Comparative Psychology 23, 278-309.

VITA

Cara F. Hotchkin

Education
The Pennsylvania State University, University Park, PA 08/01/07 – 12/22/12
Program: Intercollege Graduate Degree Program in Ecology
Minor: Graduate program in Acoustics
Dissertation: Vocal noise compensation mechanisms in non-human mammals: a multi-species examination of modification types and usage patterns

University of Rhode Island, Kingston, RI 09/04/03 – 05/25/07
B.S. Marine Biology, summa cum laude
B.S. Coastal and Marine Policy and Management, summa cum laude

Peer-reviewed publications
Hotchkin, C.F. and Parks, S.E. In press. The Lombard effect and other vocal noise compensation mechanisms: insights from mammalian communication systems. Biological Reviews.
Parks, S.E., Hotchkin, C.F., Cortopassi, K.A., and Clark, C.W. 2012. Characteristics of gunshot sound displays by North Atlantic right whales in the Bay of Fundy. The Journal of the Acoustical Society of America 131(4).

Previous Experience
Acoustics researcher, NOAA NEFSC – AMAPPS Cruise 06/02/11 – 06/22/11
NOAA Ship Henry B. Bigelow, Northeast Fisheries Science Center
Supervisors: Joy Stanistreet and Dr. Sofie van Parijs, NOAA NEFSC

Summer Student Fellow, Woods Hole Oceanographic Institution 05/28/06 – 08/15/06
PI: Dr. Peter Tyack, WHOI, Woods Hole, MA.

Additional training and awards
2010 SEABASS 2010 Workshop (Bioacoustics summer school) at Penn State
2009 – 2012 NDSEG Fellowship
2009 National Defense Industrial Association Undersea Systems Fellowship
2008 NSF Graduate Research Fellowship Program Honorable Mention
2008 Sound Analysis Workshop, Bioacoustics Research Program, Cornell University
2007 – 2008 University Graduate Fellowship (The Pennsylvania State University)
2007 URI President’s Award for Student Excellence in Marine Affairs
2007 URI President’s Award for Student Excellence in Marine Biology
2007 URI Harold Riemenschneider Award for Excellence in Biology
2003 – 2007 URI Dean’s List
2003 – 2007 URI Centennial Scholarship