Lecture 22: Affect Recognition from the Voice

CSCI 534 (Affective Computing) – Lecture by Jonathan Gratch

Affective computing in the news

Takeaway from today
▪ Recognizing emotion from voice is hard
– “Artifacts” can undermine recognition accuracy
– Like face, context can be crucial
▪ Many tools confound perceived emotion with felt emotion
– “A few seconds of speech are enough to determine the emotional state of the caller”
▪ But voice has a stronger association (than face) with physiology

Review
▪ The challenge of variance
– Within-person: The same person can show considerable variability
– Across people: The same expression manifests in very different ways across people
– Across contexts: Lighting, motion, social context

Faces communicate far more than affect
▪ Age
▪ Race
▪ Gender
▪ Nationality

What about voice?
▪ Age
▪ Race
▪ Gender
▪ Nationality
▪ Language
▪ Dialect
– African-American vernacular
▪ Accent
– Texan v. Georgian
▪ Intelligence?

Voices communicate far more than affect
▪ If a statement is difficult to process, it is less likely to be judged true and compelling
▪ Even if the difficulty arises from factors irrelevant to the content of speech
– Because of the accent of the speaker (Lev-Ari & Keysar, 2010)
– Ease with which the name of the source can be pronounced (Newman et al., 2014)
▪ Study: gave participants science presentations
– Conference talks; radio interviews from NPR Science Friday
▪ Manipulated audio quality
– Good vs. low audio quality (like what you might notice on Zoom or Skype)
Newman, E.J., & Schwarz, N. (2018). Good sound, good research: How audio quality influences perceptions of the researcher and research. Science Communication, 40(2), 246–257.

Review: Cultural influences on judgment
Ellsworth & Peng 1997

Example: Cultural influences on judgment
▪ Participants received instructions from a racially ambiguous character speaking with an American or Chinese accent (identical appearance and gestures)
– 2 (Accent) x 2 (Native- vs. Chinese-American) study
▪ Give “Fish Task” (measure of collectivist tendencies)
▪ Chinese accent: bi-culturals more Chinese, mono-culturals more American
Dehghani, M., Khooshabeh, P., Huang, L., Nazarian, A. & Gratch J. (2012). Using Accent to Induce Cultural Frame-Switching. In the Proceedings of the 34th Annual Conference of the Cognitive Science Society

Example: Customer service
C. M. Lee and S. S. Narayanan, “Toward detecting emotions in spoken dialogs,” IEEE Transactions on Speech and Audio Processing, 2005

Example: Depression detection

Voice has advantages
▪ Not impacted by lighting
▪ Harder to regulate / mask?
▪ Less influence of head orientation

Problems recognizing “in the wild”
▪ Ambient noise
▪ Wind, breath, movement
▪ Reverberation

Solutions: audio-visual source separation
Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation

Just as with the face, need to control for this individual variability when recognizing affect or emotion

Recognizing affect in speech (Scherer 2005)
▪ Is person (or group) depressed?
▪ Is this an angry sentence?
▪ Is this a nice person? (personality prediction)
▪ e.g., does person like this product?

Adapted from Dan Jurafsky

Expression of Emotion
▪ Categorical labels
– Anger, happiness, sadness, neutral
▪ Dimensional or attribute based labels
– Valence (negative vs positive)
– Arousal (calm vs active)
▪ More accurate emotion descriptors (intensity)
✦ Sample 1: [fru; ()] [ang; ()] [neu; ()]
✦ Sample 2: [fru; ()] [oth; (exasperated)] [neu; ()]
✦ Sample 3: [ang; ()] [ang; ()] [ang; ()]
Adapted from Carlos Busso

Anatomy of speech production
▪ To consider how emotion shapes speech, useful to consider how speech is produced
▪ Fundamental frequency (f0) is the number of glottal cycles that occur per second
▪ This frequency, over some interval, is perceived as pitch
Slides adapted from Danny Bone (SAIL); video from SAIL lab

Speech and Physiology
▪ Speech production engages a wide range of physiological systems
▪ These systems are impacted by emotion
– Sympathetic activation increases respiration rate, muscle tension, saliva production
▪ Many of these systems are under involuntary control
▪ Thus, aspects of speech could serve as an “honest signal” of physiological processes associated with emotion

[Diagram: Air in → Larynx vibrates → Resonance → Sound out. Callouts: “Determines the individual sounds of speech”; “Determines overall quality of speech”]

How is this impacted by emotion?
Shape changes
▪ Arousal
▪ Short-term stress
▪ Congestion
▪ Inflammation
Why would emotion impact congestion / inflammation?

Hormones
▪ Hormones create structural changes
– Testosterone changes vocal tract through development
▪ Holds across species
– Vocal pitch correlated with testosterone level in Giant Pandas
– What did we call this type of signal? Honest signal

Review
▪ Parasympathetic (conserves energy; undertakes ‘housekeeping’)
▪ Sympathetic (mobilizes & expends energy; prepares for fight or flight)
[Diagram labels: Boredom, Engagement (arousal), Regulation, Challenge, Threat. Researchers named: Stephen Porges, Julian Thayer, Jim Blascovich, Joe Tomaka]

Vocal markers of arousal?
▪ Clinical interviews
– Taped clinical interactions between physicians and patients in follow-up consultations about cancer diagnosis
– Measured skin conductance (indicator of sympathetic activation)
– Vocal jitter (aspect of voice quality perceived as hoarseness) increased during and immediately after skin conductance increases
– Voice unsteadiness (slope and standard deviation of fundamental frequency) associated with changes in skin conductance
Postma-Nilsenová, Holt, Heyn, Groeneveld, Finset, A case study of vocal features associated with galvanic skin response to stressors in a clinical interaction, Patient Educ. Couns. 99 (2016)

[Figures: cardiac output (CO)]

[Diagram: impedance cardiography setup]
▪ Send a low voltage electrical current through outer bands
▪ Measure impedance to this current through inner bands (ICG)
▪ Measure heart electrical activity (ECG)
▪ As blood volume increases, impedance decreases
Slide courtesy of Jessica Cornick

Vocal markers of physiological threat?
▪ Neubauer et al., 2017
– Examined if there were vocal indicators of challenge/threat in a “bomb disposal” task
– Couldn’t find marker of threat response
– But cardiac output was found to significantly predict f0 and peakSlope in several trials
– Conclusion: vocal and physiological features are indeed strongly related, and one modality could be used to estimate the other in certain contexts
▪ But this area underexplored
Neubauer, et al. The relationship between task-induced stress, vocal changes, and physiological state during a dyadic team task. ICMI 2017

Vocal markers of physiology?
▪ Such results promising but area underexplored
– Meta-analysis suggests the results are highly variable and could depend on subtle aspects of social context
▪ More typical to take a “computer science” approach
– Ignore theory
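
Illustration: the acoustic features named in the slides above (f0 as glottal cycles per second, vocal jitter, and the slope and standard deviation of f0) can be made concrete with a small Python sketch. This is not the pipeline used in any of the cited studies. It assumes a toy synthetic "voiced" frame, a plain autocorrelation pitch estimate, and the common "local jitter" definition (mean absolute difference between consecutive glottal periods divided by the mean period). All function names and parameter values are hypothetical.

```python
import math
import statistics


def synth_voiced(f0_hz, sample_rate=16000, n_samples=800):
    """Toy 'voiced' frame (assumption): a sine at f0 plus a weaker 2nd harmonic."""
    return [
        math.sin(2 * math.pi * f0_hz * t / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0_hz * t / sample_rate)
        for t in range(n_samples)
    ]


def estimate_f0(frame, sample_rate=16000, fmin=75.0, fmax=400.0):
    """Estimate f0 in Hz as sample_rate / (lag of the autocorrelation peak).

    Only lags corresponding to fmin..fmax are searched.
    """
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def local_jitter(periods_s):
    """Cycle-to-cycle period variation relative to the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods_s, periods_s[1:])]
    return statistics.mean(diffs) / statistics.mean(periods_s)


def f0_slope_and_std(times_s, f0_hz):
    """Least-squares slope (Hz per second) and standard deviation (Hz) of an f0 contour."""
    t_mean = statistics.mean(times_s)
    f_mean = statistics.mean(f0_hz)
    num = sum((t - t_mean) * (f - f_mean) for t, f in zip(times_s, f0_hz))
    den = sum((t - t_mean) ** 2 for t in times_s)
    return num / den, statistics.stdev(f0_hz)


# Usage on made-up data: one synthetic frame and two hand-written period lists.
f0 = estimate_f0(synth_voiced(200.0))
steady_periods = [0.0050] * 5
shaky_periods = [0.0050, 0.0052, 0.0049, 0.0053, 0.0048]
slope, spread = f0_slope_and_std([0.0, 0.1, 0.2, 0.3], [200.0, 205.0, 210.0, 215.0])
```

Real recordings need voicing detection and more robust pitch tracking than this; the sketch only shows what each feature measures.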
