Detecting Emotional response to music using near-infrared spectroscopy of the prefrontal cortex

by

Saba Moghimi

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Institute of Biomaterials and Biomedical Engineering
University of Toronto

© Copyright 2013 by Saba Moghimi

Abstract

Detecting Emotional response to music using near-infrared spectroscopy of the

prefrontal cortex

Saba Moghimi

Doctor of Philosophy

Graduate Department of Institute of Biomaterials and Biomedical Engineering

University of Toronto 2013

Many individuals with severe motor disabilities may not be able to use conventional means of emotion expression (e.g. vocalization, facial expression) to make their emotions known to others. Lack of a means for expressing emotions may adversely affect the quality of life of these individuals and their families. The main objective of this thesis was to implement a non-invasive means of identifying emotional arousal (neutral vs. intense) and valence (positive vs. negative) by directly using brain activity. In this light, near infrared spectroscopy (NIRS), which optically measures oxygenated and deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively), was used to monitor prefrontal cortex hemodynamics in 10 individuals as they listened to music excerpts. Participants provided subjective ratings of arousal and valence. Prefrontal cortex [HbO2] and [Hb] were characterized with respect to valence and arousal, and significant emotion-related prefrontal cortex hemodynamic modulations were identified. These modulations were not significantly related to the characteristics of the music excerpts used for inducing emotions. These early investigations provided evidence for the use of prefrontal cortex NIRS in identifying emotions. Next, using features extracted from [HbO2] and [Hb] in the prefrontal cortex, an average accuracy of 71% was achieved in identifying arousal and valence. Novel hemodynamic features extracted using dynamic modeling and template-matching were introduced for identifying arousal and valence. Ultimately, the

ability of autonomic nervous system (ANS) signals, including heart rate, electrodermal activity and skin temperature, to improve the identification results achieved when using

PFC [HbO2] and [Hb] exclusively, was investigated. For the majority of the participants, prefrontal cortex NIRS-based identification achieved higher classification accuracies than combined ANS and NIRS features. The results indicated that NIRS recordings of the prefrontal cortex during presentation of music with emotional content can be automatically decoded in terms of both valence and arousal, encouraging future investigation of

NIRS-based emotion detection in individuals with severe disabilities.

Dedication

To Hope and Trinity for inspiring me to pursue this work.

Acknowledgements

I would like to thank my supervisor Dr. Tom Chau for his kind help and all his support throughout my work. I will be forever indebted to him for giving me the chance to be part of his dynamic research team. His mentorship has helped me develop skills that I will carry for the rest of my life. My special thanks to my co-supervisor Dr. Anne-Marie

Guerguerian for sharing her knowledge and supporting me throughout the challenges I faced. Her unwavering care and concern for the patients has always been a source of inspiration to me. I would like to thank my committee members Dr. Maureen Dennis and Dr. Milos Popovic for sharing their insight, and guiding me with their suggestions.

I would like to express my gratitude to Dr. Azadeh Kushki and Dr. Sarah Power for their kind help throughout my research. I am also grateful to Ka Lun Tam and Pierre Duez for their technical support. I would like to express my gratitude to Dr. Negar

Memarian and Dr. Stefanie Blain-Moraes for helping me in developing my research skills.

I would like to thank the participants who took the time to help me with this study, without whom this work would not have been possible. I acknowledge the financial support of the Natural Sciences and Engineering Research Council (NSERC) CREATE CARE program, and the Holland Bloorview Kids Rehabilitation Hospital graduate scholarship. I would like to thank the donors of the K.M. Peterborough Hunter graduate studentship for their financial support.

Finally, I would like to express my gratitude to my family whose love and support has always embraced me although they are miles and miles away. I would like to thank my father for all his contributions. His interest in my work and our discussions truly motivated me in my research. I thank my mother and my aunt Ferreshteh who reminded me to be strong and determined throughout my work. Special thanks to my sister who helped me in so many ways from encouraging me in my work to sharing her technical insight. Finally, my special thanks to Amin Abdossalami for reminding me to never give

up.

Contents

1 Introduction 1

1.1 Preamble ...... 1

1.2 Motivation ...... 1

1.3 Current clinical evidence for EEG-based BCIs, a literature appraisal . . . 3

1.3.1 BCI Development Using Electroencephalography ...... 5

1.3.2 Applications User Interface ...... 7

1.3.3 Controlling brain computer interfaces ...... 7

1.3.4 Evaluation Criteria ...... 12

1.3.5 Future Directions in BCI research ...... 13

1.3.6 Towards affective brain computer interfaces ...... 16

1.4 Neural correlates of emotion ...... 18

1.4.1 The role of prefrontal cortex in default, salient and executive control networks ...... 20

1.5 Near-infrared spectroscopy of the brain ...... 21

1.6 Emotion induction via music ...... 22

1.7 Objectives ...... 25

1.8 Roadmap ...... 25

2 Experimental Protocol 29

2.1 Preamble ...... 29

2.2 Introduction ...... 29

2.3 Participants ...... 29

2.4 Stimuli ...... 30

2.5 Signal acquisition ...... 31

2.6 Pre-processing ...... 31

2.7 Study design ...... 32

3 Characterizing PFC Hemodynamic changes due to valence and arousal 35

3.1 Preamble ...... 35

3.2 Abstract ...... 36

3.3 Introduction ...... 36

3.4 Methods ...... 38

3.4.1 Procedures ...... 38

3.4.2 Wavelet-based peak detection ...... 40

3.4.3 Statistical analysis ...... 42

3.5 Results ...... 42

3.6 Discussion ...... 45

4 The Effect of Music Characteristics 47

4.1 Preamble ...... 47

4.2 Introduction ...... 47

4.3 Methods ...... 49

4.3.1 Music characteristic extraction ...... 49

4.3.2 Music database ...... 50

4.3.3 Statistical analysis ...... 50

4.4 Results ...... 51

4.5 Discussion ...... 52

4.5.1 Subject specific patterns ...... 53

4.5.2 Temporal dynamics ...... 54

4.6 Conclusion ...... 54

5 Automatic Detection of Emotional Response to Music 55

5.1 Preamble ...... 55

5.2 Abstract ...... 56

5.3 Introduction ...... 57

5.4 Methods ...... 59

5.4.1 Stimuli ...... 59

5.4.2 Preprocessing ...... 61

5.4.3 Feature extraction ...... 61

5.4.4 Classification procedures ...... 62

5.5 Results ...... 63

5.6 Discussion ...... 66

5.6.1 Classification Accuracy ...... 66

5.6.2 Diversity in the music database ...... 69

5.6.3 Challenges ...... 69

6 Combining autonomic and central nervous system activity 71

6.1 Preamble ...... 71

6.2 Introduction ...... 72

6.3 Methods ...... 73

6.3.1 Procedures ...... 73

6.3.2 NIRS data ...... 74

6.3.3 ANS data ...... 75

6.3.4 Analysis ...... 75

6.3.5 Feature extraction ...... 76

6.3.6 Classification ...... 80

6.3.7 Mixture of experts ...... 80

6.4 Results ...... 83

6.4.1 Dynamic model-based features ...... 84

6.4.2 Classification results ...... 84

6.5 Discussion ...... 85

6.6 Conclusion ...... 88

7 Concluding remarks 89

7.1 Summary of contributions ...... 89

7.1.1 A literature appraisal of the existing evidence for the use of BCI for individuals with disabilities [143] ...... 89

7.1.2 PFC [Hb] and [HbO2] patterns characterization using wavelet analysis with respect to emotional arousal and valence [142] ...... 90

7.1.3 Identified emotional arousal and valence in response to dynamic emotion induction using PFC NIRS [144] ...... 90

7.1.4 Introduced features based on dynamic modeling for emotion identification ...... 91

7.1.5 Multi-modal emotion identification using a mixture of classifier experts ...... 91

7.2 Recommendation for future studies ...... 92

7.2.1 Assessing PFC hemodynamics for emotion identification in the pediatric population and individuals with severe disabilities ...... 92

7.2.2 Potential clinical implications ...... 93

7.2.3 Dynamic emotional rating paradigms ...... 94

7.2.4 Emotional sensitivity measures ...... 94

7.2.5 Individual specific analysis ...... 94

7.2.6 Inclusion of larger sample sizes ...... 95

Appendix A: Open Challenges Regarding Control Mechanisms 96

Appendix B: Music Database 100

Appendix C: Music characteristic extraction using MIRTOOLBOX 103

Appendix D: Region specific analysis of [HbO2] and [Hb] with respect to music characteristics 104

Appendix E: Contributions from Systemic Blood Flow 105

Appendix F: Cognitive Processing Activity in the Prefrontal Cortex 107

Appendix G: Research Ethics 108

Bibliography 114


List of Tables

1.1 Summary of BCI studies on individuals with disabilities (1999-2005) ...... 8
1.1 Summary of BCI studies on individuals with disabilities (2006-2009) ...... 9
1.2 BCI Control Mechanisms ...... 10
1.3 A summary of existing theories of emotions. See [50] for more details. ...... 23
4.1 P-values for the main effect of arousal and valence rating in modeling mode, dissonance and maximum sound pressure level. ...... 51
4.2 P-values for the main effect of music characteristics (i.e. dissonance, mode, and maximum sound pressure level) in modeling the peaks of [HbO2] and [Hb] averaged across the nine recording sites. ...... 52
5.1 Summary of features used in the analysis ...... 62
5.2 Classification accuracy in % for each participant when classifying HA vs. BN. Feature-types corresponding to the best average accuracy are also presented for each participant (M = stimulus period mean; ∆M = stimulus period mean - preceding noise period mean; LSR = lateral slope ratio; ∆LM = lateral mean difference; S = slope; CV = coefficient of variation) ...... 65
5.3 Classification accuracy in % for each participant when classifying PV vs. NV. Feature-types corresponding to the best average accuracy are also presented for each participant (M = stimulus period mean; ∆M = stimulus period mean - preceding noise period mean; LSR = lateral slope ratio; ∆LM = lateral mean difference; S = slope; CV = coefficient of variation) ...... 66
6.1 Features resulting from arx dynamic modeling (very low frequency band (VLF) = 0-0.025 Hz, low frequency band (LF) = 0-0.075 Hz and high frequency band (HF) = 0.075-0.1 Hz) ...... 79
6.2 Features used for training classifier experts ...... 83
6.3 Classification accuracy in % determined using ANS features for solving the HA vs. BN and PV vs. NV classification problems ...... 85
6.4 Classification accuracy in % determined using the mixture of experts for solving the HA vs. BN and PV vs. NV classification problems ...... 86
6.5 Classification accuracy in % for each participant when classifying HA vs. BN, using dynamic-based features (i.e. AR, arx (arx (a) input: EDA and arx (b) input: [HbO2]/[Hb])) and template-based features ...... 86
6.6 Classification accuracy in % for each participant when classifying PV vs. NV, using dynamic-based features (i.e. AR, arx (arx (a) input: EDA and arx (b) input: [HbO2]/[Hb])) and template-based features ...... 87
1 The list of music pieces included in the common music database ...... 101
2 The list of self-selected music pieces ...... 102
3 The significance of the main effect of a. Mode, b. Dissonance, and c. Maximum sound pressure level for each recording site shown in Figure 2.1 (α = 0.05) ...... 104

List of Figures

1.1 General BCI Components ...... 3
1.2 Various structures within the survival network involved in the emotional response, and the resulting outputs [123] ...... 19
1.3 General overview of NIRS recording system ...... 22
1.4 Thesis roadmap ...... 27
2.1 The layout of light sources (circles) and detectors (X's). The vertical line denotes anatomical midline. The annotated shaded areas correspond to recording locations. ...... 32
2.2 Trial sequence ...... 33
2.3 The Self Assessment Manikin Rating System is shown. The top and the bottom row depict valence (positive to negative) and arousal (intense to neutral) ratings, respectively. The participant could select one of the nine levels of arousal/valence by marking the corresponding circles shown. For example, in the sample rating provided, a very intense positive emotion is represented. ...... 34
3.1 The layout of light sources (circles) and detectors (X's). The vertical line denotes anatomical midline. The annotated shaded areas correspond to recording locations. ...... 39
3.2 Trial sequence ...... 40
3.3 Mexican hat wavelet ...... 41
3.4 Box-plot of valence and arousal ratings for each participant ...... 43
3.5 Slopes of regression lines between participant arousal ratings and (a) the maximum wavelet coefficient (MWC), and (b) the corresponding scale. Only slopes significantly different from zero are shown (p < 0.005). ...... 44
3.6 Slopes of regression lines between participant valence ratings and (a) the maximum wavelet coefficient (MWC), and (b) the corresponding scale. Only slopes significantly different from zero are shown (p < 0.005). ...... 44
3.7 Plotted in black are the (a) [HbO2] (top panel) and (b) [Hb] (bottom panel) recordings across nine interrogation sites for a music sample inducing intense negative emotions from one of the participants during 45 seconds of aural stimulus. In grey are the corresponding waveforms of wavelet coefficients at the scale where the maximum wavelet coefficient occurs. These waveforms have been scaled by their standard deviation to facilitate visual comparison. ...... 45
4.1 In grey: the normalized sound pressure level of self-selected song A for participant 3. In black: normalized [HbO2] averaged across the nine recording locations shown for each of the four repetitions of song A. The [HbO2] varied in different repetitions of the same song. ...... 52
5.1 Plots (a) and (c) exemplify normalized HbO2 concentration signals at different recording locations while plots (b) and (d) are the corresponding normalized Hb concentration signals. The dark lines represent normalized signals corresponding to highly valenced, high arousal stimuli while the lighter grey line depicts normalized concentrations during Brown noise presentation to the same participant. The same Brown noise sample is illustrated for both positively and negatively valenced examples. ...... 64
5.2 Location of features resulting in the best overall accuracy. Each rectangle is located over a recording site. The size of the rectangle is proportional to the number of features selected from the corresponding location. The vertical line denotes the anatomical midline (HA = high arousal; BN = Brown noise; PV = positive valence; NV = negative valence). ...... 67
5.3 Adjusted classification accuracy results (averaged across participants) versus the number of trials included for classification against brown noise trials, after sorting all trials based on ratings of arousal in descending order (e.g. accuracies reported for the top 12 are the result of classifying the 12 highest rated arousal trials against all trials with brown noise; the confidence intervals are shown as error bars for each number of trials included). ...... 68
5.4 Adjusted classification accuracy results (averaged across participants) versus the number of trials included for classification, after sorting all trials based on ratings of positive and negative valence in descending order (e.g. accuracies reported for the top 12 are the result of classifying the 12 most positively rated trials against the 12 most negatively rated trials; the confidence intervals are shown as error bars for each number of trials included). ...... 68
6.1 Trial sequence ...... 74
6.2 The layout of light sources (circles) and detectors (X's). The vertical line denotes anatomical midline. The annotated shaded areas correspond to recording locations. ...... 75
6.3 A. Custom-made template, B. Sample normalized [HbO2] recorded in a trial with chills. ...... 77
6.4 Feature segmentation. ...... 81
6.5 A simplified diagram depicting fusion of classifier decisions. ...... 82
6.6 Sample trial with chills (participant 2): EDA recording and estimation, using the average [HbO2] concentrations as the input to the arx model. The fit achieved by the model for the depicted estimation is 52.9%. ...... 84
6.7 Sample scaled frequency response estimated for (A) chilling and (B) neutral trials for participant 4. The magnitude of the frequency response was normalized by dividing the results by the total power of the signal over the entire frequency range. ...... 85
1 Ethics approval notice ...... 109
2 Participant consent form ...... 113

List of abbreviations

ANS: autonomic nervous system
AR: autoregressive
arx: autoregressive model with exogenous input
BCI: brain computer interface
BN: brown noise
BVP: blood volume pulse
CNS: central nervous system
HA: high arousal
EDA: electrodermal activity
EEG: electroencephalography
[HbO2]: oxygenated hemoglobin concentration
[Hb]: deoxygenated hemoglobin concentration
MIRTOOLBOX: music information retrieval toolbox
MRI: magnetic resonance imaging
MWC: maximum wavelet coefficient
NIRS: near infrared spectroscopy
NV: negative valence
PFC: prefrontal cortex
PET: positron emission tomography
PV: positive valence


Chapter 1

Introduction

1.1 Preamble

Sections of this chapter are drawn from the following published review paper: Moghimi S, Kushki A, Guerguerian AM, Chau T, A Review of EEG-Based Brain-Computer Interfaces as Access Pathways for Individuals with Severe Disabilities. To appear in Assistive Technology: The Official Journal of RESNA, 2012.

1.2 Motivation

Many individuals with severe motor disabilities may not be able to use conventional means of communication such as speech or facial gestures to express their intentions. Lack of communication may adversely impact the quality of life of these individuals as well as that of their families. In particular, manifestations of emotions such as facial expressions and body language are an imperative part of human interactions. Emotional communication enables caretakers to address the needs of infants [225]. Severe motor impairments may result in an absence of physical displays of emotion, and leave caretakers with no means of interpreting emotional reactions. Realizing alternative pathways through which individuals with severe motor impairments may express their affective

response may ultimately improve their quality of life and quality of care while reducing care-giver stress [70].

Alternative access pathways can be used to translate functional intent into electrical signals for environmental or computer control [217]. Examples of these alternative pathways include mechanical switches and vision-based systems that generate binary control signals from limb movements [236], eye gaze [7, 37], mouth opening [137] or tongue protrusion [128]. These solutions, however, are not appropriate for individuals who are cognitively capable but have little or no voluntary and repeatable muscle control. The etiology ranges from acute conditions such as brain-stem stroke, infectious basilar arteritis, acute inflammatory demyelinating polyneuropathy [162] or brainstem tumor

[86] to chronic causes including amyotrophic lateral sclerosis, severe spastic quadriplegic cerebral palsy, severe nemaline myopathy and multiple sclerosis [97]. For example, individuals affected by neuro-degenerative conditions such as amyotrophic lateral sclerosis (ALS) or multiple sclerosis (MS) may experience locked-in syndrome (LIS) in the late stages of the disease. Individuals with LIS have little or no voluntary muscle control while retaining cognitive awareness. These individuals are aware of their surroundings; however, they may not be able to communicate their intent via speech or facial expression. Children with severe congenital disabilities due to severe motor impairments may also experience communication difficulties. To enable communication without relying on motor capacity, physiologically-based communication systems have been investigated.

In particular, communication alternatives have been developed by directly using brain activity. Technologies known as brain computer interfaces (BCI) can generate a control command enabling users to operate communication interfaces [237, 115]. This thesis explores affective BCI systems [158] capable of detecting emotional response. These systems constitute an emerging field of BCI research.

[Figure 1.1 depicts the general BCI components: a brain sensing module feeds a decoder, which issues control commands to the user and control interfaces driving three application classes: communication (restoring communication, e.g. alternative and augmentative communication (AAC), spellers, and computer-mediated communication such as Internet access), environment control (interacting with and influencing the surrounding environment, e.g. TV, bed position, lights) and device control (controlling mechanical devices to restore mobility or dexterity, e.g. neuroprostheses, wheelchairs).]

Figure 1.1: General BCI Components

1.3 Current clinical evidence for EEG-based BCIs, a

literature appraisal

While many different BCI system paradigms have been proposed (e.g. [133, 52]), most fundamentally, a BCI system is comprised of an activity sensing module, a brain activity decoder and an output module, as depicted in Figure 1.1. The activity sensor measures brain activity while the decoder detects specific evoked or spontaneous brain activity patterns and translates them into control commands. The output module takes these control commands to drive applications such as an on-screen scanning keyboard for communication.

One of the key aspects of BCI development is choosing a brain sensing module suitable for long-term bedside monitoring. With their high spatial resolution, electrode implants ([98, 103, 104]) have facilitated accurate cursor control in humans ([77]), and high throughput ([194]) and multi-joint prosthesis control ([227]) in primates, but they require invasive surgery ([69]). Sacrificing some spatial resolution for non-invasiveness, economy and portability, electroencephalography (EEG), which monitors electrical potentials from the scalp surface, has dominated BCI research and is the most widely used modality in BCI applications to date, owing to its non-invasiveness, low cost, portability and convenient set-up requirements.

Another modality explored for BCI development is magnetic resonance imaging (MRI)

[234]. MRI is capable of detecting hemodynamic changes by monitoring the blood oxygenation level dependent (BOLD) signal with high spatial resolution and can detect signals from deeper brain areas. In fact, Weiskopf et al. were able to differentiate brain activity corresponding to motor imagery, visual imagery and spatial navigation using MRI [233].

Despite these findings, and the spatial resolution available using MRI, the current MRI technologies are bulky, expensive, and require radio frequency and magnetic shielding which impedes their use as a portable bedside monitoring system.

Another emerging cerebral hemodynamic monitoring technology for BCI development is near infrared spectroscopy (NIRS). NIRS optically monitors the levels of oxygenated and deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively) in the cerebral cortex. Near-infrared light shone through the adult skull is detected 2.5-3 cm away from the source [228, 90]. The detected light intensity can be used to identify [HbO2] and [Hb] in the underlying tissue due to the differences in absorption characteristics of these two chromophores. NIRS systems are relatively inexpensive, portable, and suitable for long-term bedside monitoring. Recent studies have illustrated the ability of NIRS to detect task-related changes in brain activity. These findings have indicated that active music imagery (mental singing) can be differentiated from the rest state and from mental math with accuracies significantly above chance [177, 45, 65]. In addition to user convenience, NIRS is particularly immune to electrogenic artifacts due to eye movement and muscle contractions, which are frequently encountered in the prefrontal area. Therefore, in this thesis, NIRS was selected for detecting hemodynamic changes in the prefrontal cortex associated with emotional responses. However, due to the extent of BCI systems developed using EEG, a literature appraisal of clinically investigated

EEG-based BCI systems was conducted to set the stage for understanding the potential of NIRS.

1.3.1 BCI Development Using Electroencephalography

To find out the extent of existing evidence for EEG-based BCI use by individuals with disabilities and to identify research gaps, a literature review was conducted. The specific focus of this search was BCI systems for communication and environmental control.

Studies related to other BCI applications such as brain-controlled prostheses were excluded from the review. PubMed, ISI Web of Science, and OVID (MEDLINE, CINAHL, EBM Reviews, EMBASE, and Ovid Healthstar) databases were searched using keyword combinations containing brain computer interface and one of disability, disabilities, disabled or ALS. Only English-language journal articles that directly evaluated EEG-based BCI technology with participants with physical disabilities were included. This search was further narrowed to journal articles published between January 1999 and December 2010.

The search identified 380 articles. The reference and citation lists of the retrieved articles were further examined. The articles were screened, based on title and abstract, to only include studies involving individuals with disabilities. This screening exercise yielded

119 articles. Further screening for EEG studies focused on restoring communication in the target population further reduced the sample to 39 articles, as listed in Table 1.1.

In the following sections, we appraise these articles with respect to the participants’ characteristics, and to the articles’ control mechanisms, clinical findings, and evaluation criteria.

The level of evidence for clinical interventions is typically rated according to study design criteria [192]. None of the studies examined were controlled experiments per se. Fourteen (14) articles compared BCI use between able-bodied participants and individuals with disabilities. However, if we regard BCI as a clinical intervention, these studies do not follow conventional experimental designs as there were no control and experimental groups. Thus, according to the conventional rating criteria [192], the entire collection of selected articles would be rated as level V, i.e., case series without controls. By and large, the majority of studies involved a small number of participants; only 8 of the 39 studies had more than 6 participants. The selected studies do not propose an intervention for a certain population, but rather report efforts of restoring communication in the few individuals participating in the respective investigations. Therefore, one may argue that the focus of these BCI assistive technology studies has been on individual-centered solutions [199]. In this light, introducing clinical rating guidelines revolving around person-centered constructs such as person-environment interaction [88] may be more appropriate for these studies.

The selected studies included 6 single-subject reports, 25 studies with fewer than six participants, and eight studies with 6-35 participants with disabilities. A total of 14 studies involved both able-bodied and individuals with disabilities.

Of the 39 studies involving participants with disabilities, 29 studies reported BCI evaluation results for individuals with locked-in syndrome (LIS) [211]. In LIS, consciousness is preserved but severe motor and communication impairments are present (quadriplegia and anarthria). Depending on the level of residual motor control, LIS is classified into three categories [8]: (1) incomplete LIS, where remnant voluntary motion is preserved in addition to those retained in classical LIS, (2) classical LIS, which refers to cases with total immobility except for blinking and vertical eye movement, and (3) total LIS, where no voluntary motor control is preserved. Among the studies involving participants with

LIS, only two studies recruited participants with total LIS. The majority of these studies (64%) considered participants with LIS resulting from amyotrophic lateral sclerosis

(ALS), an adult-onset progressive neuro-degenerative disease that affects both upper and lower motor neurons [147]. In its late stages, ALS can lead to the locked-in state.

Other conditions reported in the reviewed articles included different levels of spinal cord injury (SCI) (8 studies), cerebral palsy (4 studies), cerebral paresis (1 study), muscular dystrophy (4 studies), stroke (3 studies), chronic Guillain-Barré syndrome (2 studies), multiple sclerosis (1 study), spinal muscular atrophy (2 studies), post-polio (1 study), and primary lateral sclerosis (1 study). All of the selected studies considered adult participants.

1.3.2 Applications User Interface

The selected articles used EEG-based BCI systems for two main applications, namely, augmentative and alternative communication (34 studies) and environmental control (5 studies). Augmentative and alternative communication tools enable or facilitate communication with other individuals and include spellers and Internet navigation tools.

Environmental control tools enable the user to modify environmental conditions and include body-position control, control of electronic appliances, and navigation in real and virtual environments.

One of the spelling applications used is a binary tree arrangement of the alphabet to efficiently locate a desired letter [166]. At each level of the tree, the user is presented with two segments of the alphabet and can eventually choose the desired letter by traversing the alphabet tree. Another common spelling interface is the scanning keyboard, where different columns and rows of an array of letters are sequentially intensified [230]. A third widely used interface involves on-screen object navigation. The user-guided cursor points to the desired options or letters for communication purposes.
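To make the binary-tree selection idea concrete, the sketch below walks an alphabet tree using a sequence of binary choices, halving the candidate set at each level. It is an illustrative minimal sketch only; the function and symbol set are hypothetical and not taken from the reviewed systems.

```python
# Illustrative binary-tree letter selection (hypothetical, not from a specific BCI).
ALPHABET = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ_")

def select_letter(binary_choices):
    """Narrow the alphabet by halving it once per binary decision (0 = left half, 1 = right half)."""
    candidates = ALPHABET
    for choice in binary_choices:
        mid = len(candidates) // 2
        candidates = candidates[:mid] if choice == 0 else candidates[mid:]
        if len(candidates) == 1:
            break
    return candidates[0] if len(candidates) == 1 else None  # None: more choices are needed

# Example: five binary decisions suffice for a 27-symbol alphabet (2^5 = 32 > 27).
print(select_letter([1, 0, 0, 1, 0]))
```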

Table 1.1 summarizes EEG-based BCI studies involving individuals with disabilities in the past decade.

1.3.3 Controlling brain computer interfaces

[Table 1.1 (two parts, 1999-2005 and 2006-2009): Summary of BCI studies on individuals with disabilities. Columns: Authors; Participants (condition); Control mechanism; Application. Note: ALS: amyotrophic lateral sclerosis; CP: cerebral palsy; DMD: Duchenne muscular dystrophy; MD: muscular dystrophy; MS: multiple sclerosis; SCI: spinal cord injury; LIS: locked-in syndrome; SMA: spinal muscular atrophy; SCP: slow cortical potentials; SMR: sensorimotor rhythms; Lv: level.]

EEG-based BCIs rely on modulation of brain activity for application control. The mechanism used to modulate brain activity may rely on reactions evoked by externally presented stimuli or generated spontaneously by a trained user. The most commonly used control mechanisms are shown in Table 1.2.

Table 1.2: BCI Control Mechanisms.

Spontaneous: Slow Cortical Potentials (SCP), Sensorimotor Rhythms (SMR)
Evoked: P300, Steady State Visually Evoked Potentials
Mental task: Language tasks, Mental Arithmetic

The remainder of this section discusses each of these mechanisms in detail. The review of the selected articles also identified a number of open challenges surrounding BCI control mechanisms, which are listed in Appendix A.

Slow cortical potentials

The most frequently deployed control mechanism among the selected studies is the slow cortical potential (SCP), a spontaneously generated signal. SCPs are slowly varying trends that are time locked to specific external or internal events [12]. The duration of these potentials generally ranges from 300 milliseconds to several seconds [13].

Voluntary control using behavioral manipulations can cause positive or negative SCP shifts. Negative deviations of the SCPs are known to be associated with arousal as well as response preparation [220, 56]. Positive deflections in SCPs are related to response inhibition and relaxation [42]. Voluntary control of SCP can be achieved by providing visual or auditory biofeedback to participants [43, 12].

The Thought Translation Device (TTD) is an example of an SCP-based BCI [11], employing voluntarily generated SCPs to control a computer. The TTD requires a training phase during which the user receives visual or audio-visual feedback reflecting the presence of positive or negative deflections [115]. In particular, the reviewed articles reported that successful use of the SCP-based BCI (achieving accuracies higher than 70%) required several training sessions. Among the reviewed articles, SCP-based BCIs were used to operate spelling interfaces, navigate the Internet, and control environmental devices.

Sensorimotor rhythms

Like SCPs, sensorimotor rhythms (SMRs) are spontaneously occurring EEG activities in the somatosensory cortex in the absence of movement [157]. These rhythms are attenuated by movement or somatosensory stimulation. Control of SMRs can be achieved through biofeedback-based training as the user performs motor imagery [119,

170]. While SMR-based BCIs were successfully used by individuals with disabilities

[117, 170, 239], voluntary modulation of SMRs for BCI control required many training sessions.

SMR-based BCIs have been used for cursor control, where bilateral motor imagery of hands, legs, and tongue was used to control the direction of cursor movement. SMRs have also been used for selecting various targets. For example, McFarland et al. [135] used a linear combination of SMRs to enable selection once the cursor reached the target choice. Such cursor control can also be used for spelling.

P300 evoked potentials

In contrast to the spontaneous control mechanisms, the P300 is an evoked response.

The P300 wave has a latency of approximately 300 ms and is a positive-going component of the event-related potential that results from exposure to an occasional stimulus [181]. This response is generated by a network comprising the prefrontal cortex, anterior insula, cingulate gyrus, temporoparietal cortex, medial temporal cortex, and the hippocampal formation

[183] and can be maximally recorded from the midline centroparietal regions [174].

An example of a P300-based BCI is the P300 speller [47] that intensifies columns and rows of an alphabet matrix presented visually to the user. A P300 response is elicited when the user is presented with intensification of the row and column containing the desired letter. Thus, the presence of the P300 can be used to detect the user’s choice.
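As a rough illustration of the selection logic just described (a sketch under simplifying assumptions, not the implementation used in the cited studies), each row and column intensification can be scored by the evidence for a P300 in the post-stimulus EEG, and the selected letter is taken at the intersection of the best-scoring row and column. The matrix layout and scores below are made up for demonstration.

```python
import numpy as np

# Hypothetical 6x6 speller matrix; real systems average scores over many stimulation sequences.
MATRIX = np.array([list("ABCDEF"), list("GHIJKL"), list("MNOPQR"),
                   list("STUVWX"), list("YZ1234"), list("56789_")])

def spell_letter(row_scores, col_scores):
    """row_scores/col_scores: P300-classifier evidence for each of the six row/column flashes."""
    r = int(np.argmax(row_scores))   # row whose flash most resembles a P300 response
    c = int(np.argmax(col_scores))   # column whose flash most resembles a P300 response
    return MATRIX[r, c]

# Example with made-up classifier scores: row 2 and column 3 carry the strongest P300 evidence.
print(spell_letter([0.1, 0.2, 0.9, 0.3, 0.1, 0.2], [0.2, 0.1, 0.3, 0.8, 0.2, 0.1]))  # -> 'P'
```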

The P300 speller was shown to achieve higher than 70% accuracy in 5 of 6 participants with ALS in [159]. Moreover, Sellers et al. reported that with visual and auditory P300-inducing stimuli, 2 of 3 participants with ALS achieved a selection accuracy comparable to that of able-bodied individuals using a similar system [202].

Steady state visually evoked responses

When presented with repetitive visual stimuli, EEG recordings from the parieto-occipital sites demonstrate peaks at frequencies matching the stimulation frequency and its harmonics

[73]. This response is known as the steady state visually evoked potential (SSVEP). The physiological mechanism underlying generation of SSVEPs remains largely unknown, although the amplitude of the SSVEPs is reportedly related to increases in synaptic activity [165]. It is suggested that SSVEP peaks intensify with selective attention to the stimulus

[145, 3].

In Wang et al. (2006), 11 volunteers with SCI attempted to operate an environmental control using an SSVEP-based BCI system. Of the 11 participants, 10 were able to reach an information transfer rate of 21 bits/minute using this system [230]. In this study, an array of buttons, each flickering at a different frequency was presented to the user. The user chose the desired option by attending to the appropriate button.
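A minimal sketch of the frequency-tagging idea behind such SSVEP systems is given below. It is illustrative only: it assumes a single occipital channel, a synthetic signal, and made-up flicker frequencies rather than those used by Wang et al.; the attended target is taken to be the button whose flicker frequency shows the largest spectral power in the EEG.

```python
import numpy as np

def detect_ssvep_target(eeg, fs, flicker_freqs):
    """Return the index of the flicker frequency carrying the largest spectral power in eeg."""
    spectrum = np.abs(np.fft.rfft(eeg)) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    powers = [spectrum[np.argmin(np.abs(freqs - f))] for f in flicker_freqs]
    return int(np.argmax(powers))

# Synthetic example: a 10 Hz SSVEP buried in noise is correctly attributed to the 10 Hz button.
fs = 256
t = np.arange(0, 4, 1.0 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)
print(detect_ssvep_target(eeg, fs, flicker_freqs=[8.0, 10.0, 12.0, 15.0]))  # -> 1
```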

Mental task

Several other mental tasks such as language and arithmetic have also been shown to induce distinctive EEG patterns in able-bodied individuals [140, 185]. Despite the cognitive load imposed by these BCIs, they may have merits as BCI control mechanisms for the target population. To the best of our knowledge, BCIs based on language and arithmetic mental tasks have not been tested by the target population.

1.3.4 Evaluation Criteria

The performance of BCI systems has generally been measured by speed and accuracy, which are both important for communication. Since the reviewed studies focused on different applications (e.g., spelling, cursor control), various measures of speed and accuracy were used to report system performance, as listed in Table 1. Examples of accuracy measures include classification accuracy and the r2 value, which reflects the level of correlation between user intent and the signal features [237]. The number of characters typed per minute has also served as a measure of speed. Information transfer rate, also known as bit rate, has also been commonly used as a combined measure of accuracy and speed [237]. This measure reflects the amount of "correct" information transferred per unit time.
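For reference, a common way to combine speed and accuracy is the Wolpaw formulation of information transfer rate, sketched below. This is a simplified illustration (equiprobable targets, fixed accuracy); the reviewed studies vary in exactly how the measure is computed, and the example numbers are made up.

```python
import math

def wolpaw_itr(n_targets, accuracy, selections_per_min):
    """Bits per minute for N equiprobable targets selected with a given accuracy."""
    if accuracy <= 1.0 / n_targets:
        return 0.0  # at or below chance level, treat as no information transferred
    bits = math.log2(n_targets) + accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1 - accuracy) * math.log2((1 - accuracy) / (n_targets - 1))
    return bits * selections_per_min

# Example: a 6-target menu at 85% accuracy and 10 selections/min gives about 16 bits/min.
print(round(wolpaw_itr(6, 0.85, 10), 1))
```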

1.3.5 Future Directions in BCI research

A closer look at the reviewed studies provides a means of identifying emerging challenges in BCI development and means to overcome these issues. In this light, the current section summarizes future directions identified in BCI research.

Involve pediatric populations

The reviewed articles largely focused on individuals with adult-onset disabilities. It is unclear whether or not the findings of these studies translate to individuals with congenital disabilities, who often have never experienced any means of communication. For example, to the best of our knowledge, BCIs relying on motor imagery have never been tested with individuals who have never experienced voluntary control of their movements. Prolonged deprivation from communication in childhood can lead to learned helplessness and impede the development of contingency awareness [216]. Despite this compelling clinical reason for investigating BCI use in the early stages of life, none of the reviewed studies have investigated the effectiveness of EEG-based BCIs in the pediatric population.

Consider personal contextual factors in determining BCI speed requirements

Communication speed (e.g. words typed per minute) has traditionally been an important factor in assessing BCI performance. The emphasis on maximizing speed may stem from studies with able-bodied individuals or those with traumatic disabilities who may expect BCI systems to replicate the high throughput of pathways such as speech. Nonetheless, joint BCI studies involving both able-bodied individuals and those with severe disabilities have pinpointed delays in reaction time [6] and slower item selection rates [38] in the disabled participants. Thus, the speed expectations of patients are likely very different from those of their able-bodied counterparts. Indeed, proficient users of single-switch scanning systems typically only achieve 8-24 words per minute [61]. Further, children with developmental disabilities and communication difficulties are known to exhibit only a handful of intentional communication acts per minute (e.g., words, gestures and vocalizations) [18]. Therefore, we recommend that as an indicator of BCI performance, speed ought to be contextualized in terms of the individual's time scale for communication, taking into account the time required to process received information and the time needed to muster the resources to respond. The level of cognitive awareness of the BCI user has a significant effect on the choice of control mechanism and may affect the speed of operating the BCI. In particular, spontaneous control mechanisms are appropriate for users who can voluntarily modulate EEG patterns. However, due to the lack of alternative means of communication, the cognitive awareness of the participant cannot always be assessed using standard assessment tools that rely on motor responses [87]. Therefore, a comprehensive evaluation of BCI performance ought to include an appropriate cognitive assessment.

Train and evaluate in ecologically salient environments

BCI evaluation would not be complete without considering the environmental context it operates within. Because a BCI system is often the only means of communication for an individual with severe disabilities, BCI solutions must allow long-term use in home environments. Despite this, only a handful of articles have evaluated BCI performance in home environments [76] or a simulated home-like environment [27]. A notable example is the evaluation of the BCI2000 system modified for use in home environments [226]. In evaluating BCI accuracy, contextual factors may also include communication partners.

In this regard, it is important to view the BCI as a tool for facilitating meaningful communication and not necessarily as a tool for producing exact selections. For example, when using a BCI system to control a scanning keyboard, meaningful communication can occur in spite of spelling errors. This suggests that to obtain an environmentally relevant evaluation of a BCI, a measure of the conversational partner's receptive communication may be important. While BCI training and evaluation may be performed in the user's home environment, trained personnel must often be present to ensure proper set-up and operation of the equipment. This can limit BCI users to the geographical vicinity of research facilities. To overcome these geographical restrictions, both researchers and patients may benefit from tele-monitoring systems that enable remote supervision of training [148].

Introduce user-aware BCIs

None of the reviewed studies have incorporated user state (fatigue, attention, emotional status) while operating BCI systems when determining performance. For example, it is not clear whether the performance degradation observed during long periods of BCI use has resulted from exacerbated fatigue or from the failure of the detection algorithms used. Detecting changes in user status such as the level of fatigue and attention may improve BCI performance assessments. In addition, awareness of user status may allow the BCI to more intimately accommodate the user's moment-by-moment needs. For example, once user fatigue is detected, the system can suggest a rest period. Specific EEG patterns are shown to reflect different states such as fatigue and attention. Extended periods of performing tasks such as mental arithmetic or driving result in an increase in frontal theta rhythms [224]. Hamadicharef et al. (2009) were able to differentiate attention (reading/arithmetic) versus non-attention (rest) states with accuracies up to 89.4%

[67]. Petrantonakis and Hadjileontiadis (2010) showed that the six basic emotions (happiness, surprise, anger, fear, disgust, and sadness) could be differentiated with 83.33% accuracy using EEG activity [168]. Based on these findings, in future studies, EEG signals monitored by BCI systems may also be used to estimate user state, leading to a more user-accommodating implementation. EEG signals may also reflect the dynamics of the interaction between the user and BCI systems. For example, error-related potentials, which are manifested after an error occurs [24, 80], may be used as a post-hoc correction mechanism. Once an error-related potential appears, an auto-correction strategy may be invoked or user verification may be solicited. Using EEG patterns associated with the user-system interaction, such as error-related potentials, may lead to more usable BCIs [25].

Develop more effective training protocols

None of the reviewed studies have focused on the development of engaging training paradigms. Training is an imperative part of realizing SMR and SCP BCI systems.

Improving the training interface may directly affect training success. Studies involving able-bodied participants have previously explored alternative training paradigms; the interested reader is referred to Neuper and Pfurtscheller (2010) [155]. For example, immersive training protocols (using virtual environments) have been suggested for realizing an informative yet engaging training environment [127]. Using more engaging training paradigms, such as those involving learning reinforcements, may increase user motivation, improve training effectiveness and reduce requisite training times. Such training regimens would be particularly useful for motivating the pediatric user with disabilities.

1.3.6 Towards affective brain computer interfaces

Despite the merits offered by existing BCI systems, many nonverbal children and youth are usually not candidates for existing BCI technologies due to developmental delays, limited expressive communication and unknown levels of receptive communication. Indeed, the aforementioned challenges preclude the training of specific mental activities. However, these individuals are still candidates for affective BCIs (A-BCIs), which enable the automatic recognition of affective states using brain activity [158]. A-BCIs may provide a means of detecting spontaneous and natural reactions to emotion-evoking stimuli.

A-BCI development is a step towards addressing the existing gaps in BCI research introduced in Section 1.3.5. Emotions are an intuitive and natural means of responding to stimuli.

Therefore, A-BCIs may provide an opportunity to realize communication pathways for the pediatric population. A-BCIs can bring awareness of user state to existing BCI systems. Emotional awareness may help create more user-accommodating systems and develop more effective training paradigms. Unlike existing active BCI systems, which generate voluntary and direct commands for communication (e.g. the P300 speller), A-BCIs may offer passive but intuitive control. Passive BCI systems detect implicit information regarding the user state (e.g. emotions) and intentions, and enable situational interpretations [242]. Ultimately, an affective BCI may enable the decoding of emotional state in the absence of overt emotional expression.

Computer-based detection of emotional responses may enhance implicit communication about the user in human-computer interaction systems [31]. Affective computing has long been touted for its potential for more realistic and user-accommodating interactions

[171]. An emotionally-aware system stands to benefit non-verbal individuals with severe disabilities by estimating their emotional state in the absence of more explicit means of interaction (e.g. speech and gestures). In turn, knowledge of the patient's affective state may help to mitigate care-giver stress and facilitate treatment decisions in a timely fashion [70].

1.4 Neural correlates of emotion

Emotional response has been shown to engage different pathways in the central and autonomic nervous systems. Autonomic nervous system (ANS) activity sensors, such as those that detect cardiovascular, respiratory, and electrodermal activity, can unveil emotional responses [108, 129]. For a review of studies using ANS activity sensors for identifying emotions, the reader is referred to [108].

Based on theories suggesting a close relationship between emotional response and survival, key neural structures in the brain have been identified in different animal studies. Figure 1.2 summarizes the many neural structures involved in orchestrating an emotional response within what is known as the survival network [123]. As shown in Figure 1.2, emotional response can engage many substrates in the mammalian brain. The human brain is no exception to this rule. Neuro-imaging techniques such as positron emission tomography (PET) [221] and magnetic resonance imaging (MRI) [21] have provided an opportunity for in vivo characterization of emotional perception in the human brain

[15, 16, 44, 206, 209].

Various brain circuits including parts of the limbic system and amygdala are found to be responsible for the perception of emotional stimuli ([164, 208, 126]). Among these areas, the frontal cortex plays an important role in regulating emotional response to sensory input [34, 33, 187, 141]. Previous studies have confirmed the role of the frontal area in emotional response. For example, the severity of the depressive symptomatology in patients following stroke lesions was reported to be significantly correlated with proximity of the lesion to the frontal pole [186]. Moreover, left and right frontal activations were also found in response to watching video clips inducing positive and negative emotional responses, respectively [235]. Activations in the orbito-frontal and ventral prefrontal cortex in response to highly pleasurable self-selected music excerpts have also been reported

[15]. Tanida et al. showed that inducing mental stress could lead to bilateral increases or decreases of oxygenated hemoglobin ([HbO2]) and deoxygenated hemoglobin ([Hb]),

[Figure 1.2: Various structures within the survival network involved in the emotional response, and the resulting outputs [123]. The diagram links sensory thalamus and cortex, amygdala, orbitofrontal cortex, hippocampus, dorsal and ventral striatum, hypothalamus and brainstem nuclei to behavioural and autonomic outputs such as choice behaviour, memory of emotional events, approach or avoidance behaviour, corticosteroid release ("stress response") and cardiovascular and electrodermal responses.]

respectively [219]. Matsuo et al. have reported PFC [HbO2] increases in a group of individuals with post-traumatic stress disorder as well as a healthy control group in response to trauma-related videos [134].

1.4.1 The role of prefrontal cortex in default, salient and executive control networks

One of the remarkable features of the brain is its ability to attend to salient events in the environment. The ability of the brain to regulate various processes and divert attention to the more salient ones has been attributed to intrinsic and distinct functional networks

[214]. These networks are composed of strongly coupled sets of information processing nodes distributed in the brain. Functional connectivity studies have confirmed the existence of at least three canonical networks: (i) the central executive network, (ii) the default network, and (iii) the salience network [214]. The salience and central executive networks exhibit increased activity during cognitively demanding tasks [63]. The default network, on the other hand, shows higher levels of activity during the resting state [63]. By regulating activation and deactivation of these networks, the brain can realize various ongoing processes during the resting state and respond to salient events when required. These salient events could involve cognition, homeostasis or emotions [201]. Therefore, emotional response may result in activity changes within these intrinsic brain networks. The salience, central executive and default networks are shown to encompass different areas within the prefrontal cortex. The dorsolateral prefrontal cortex is shown to be part of the central executive network [214, 201]. The ventromedial prefrontal cortex serves as one of the nodes in the default network [214, 63]. Finally, the salience network encompasses the ventrolateral prefrontal cortex [214, 201]. In addition, various areas in the prefrontal cortex are shown to act as information hubs by integrating diverse information sources within different brain networks. Buckner et al. [20] identified prominent hubs in the medial/lateral prefrontal cortex in a functional magnetic resonance imaging study.

Based on the existing evidence, recordings from the prefrontal cortex may tap into three major networks in the brain (salience, central executive and default networks). In addition, recordings from the medial/lateral prefrontal cortex may enable monitoring of the activity of intrinsic cortical hubs [20].

Unlike deeper brain areas such as the amygdala and limbic system, prefrontal cortex (PFC) hemodynamics can conveniently be monitored using non-invasive and portable brain monitoring modalities such as NIRS. The accessibility of the PFC to brain sensing modules, and particularly NIRS, provides a great opportunity for realizing a bedside emotion identification system. Therefore, in this thesis, PFC hemodynamics were used for identifying emotional response.

1.5 Near-infrared spectroscopy of the brain

Among various brain monitoring modalities, hemodynamic measurements are not prone to electrogenic artifacts such as bio-potentials associated with eye movement or frontalis/temporalis muscle contraction. These artifacts primarily occur in the forehead area and may reduce the signal-to-noise ratio when recording EEG from the prefrontal and frontal regions. Therefore, hemodynamic measurement in cortical areas involved in emotion processing is a meaningful pursuit in developing affective brain computer interfaces (A-BCIs).

Various brain sensing modalities have been developed for cerebral hemodynamic monitoring, such as magnetic resonance imaging (MRI) and positron emission tomography (PET). However, neither of these technologies is currently suitable for long-term bedside monitoring for emotion identification purposes. Current MRI technologies are bulky, expensive, and require radio frequency and magnetic shielding, which impedes their use as a portable bedside monitoring system. PET systems require administration of radioactive tracers, and are therefore not suitable for long-term and repeated monitoring.

Near-infrared spectroscopy (NIRS), which is also a hemodynamic-based brain sensing modality, offers many advantages such as low cost and portability, making it suitable for long-term bed-side use. NIRS optically monitors the level of oxygenated and deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively) in the cerebral cortex. Near-infrared light penetrates the adult skull and can be superficially detected 2.5-3 cm away from the source [228, 90] (Figure 1.3). The detected light intensity can be used to identify [HbO2] and [Hb] in the underlying tissue due to the differences in absorption characteristics of these two chromophores. Deeper brain areas in the emotional network, such as the amygdala, cannot be monitored using NIRS. However, PFC activity can be conveniently monitored using superficial light emitters and detectors.

Figure 1.3: General overview of a NIRS recording system (light emission and detection over the scalp, with sample [HbO2] and [Hb] recordings).
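To make the two-wavelength conversion concrete, the sketch below shows how changes in detected intensity at 690 nm and 830 nm can be mapped to changes in [HbO2] and [Hb] with the modified Beer-Lambert law. It is an illustrative example only: the extinction coefficients, differential pathlength factors and source-detector distance are placeholder assumptions, not the calibration used in this thesis.

```python
import numpy as np

# Placeholder extinction coefficients (1/(mM*cm)): rows are wavelengths, columns are
# [epsilon_HbO2, epsilon_Hb]. Values are for illustration only.
ext = np.array([[0.974, 4.930],    # 690 nm
                [2.321, 1.791]])   # 830 nm
source_detector_distance = 3.0     # cm (assumed)
dpf = np.array([6.0, 6.0])         # assumed differential pathlength factor per wavelength

def mbll(i_690, i_830, i0_690, i0_830):
    """Return (delta_HbO2, delta_Hb) in mM from detected intensities I and baselines I0."""
    # Change in optical density at each wavelength
    delta_od = np.array([-np.log10(i_690 / i0_690),
                         -np.log10(i_830 / i0_830)])
    # MBLL: delta_OD = ext @ [dHbO2, dHb] * distance * DPF  ->  solve the 2x2 system
    lhs = ext * (source_detector_distance * dpf)[:, None]
    d_hbo2, d_hb = np.linalg.solve(lhs, delta_od)
    return d_hbo2, d_hb

# Example: a small intensity drop at both wavelengths relative to baseline
print(mbll(i_690=0.98, i_830=0.95, i0_690=1.0, i0_830=1.0))
```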

1.6 Emotion induction via music

In a recent review of physiological markers of emotion [108], Kreibig illustrated the diversity of emotion induction paradigms and their usage frequencies. In the reviewed sample, film clips were most frequently used, but other emotion induction methods such as imagery, personalized recall and musical excerpts were also reported [108]. Among these techniques, music, which is often presented over longer durations, has the capacity to induce a response that changes over time. Emotions experienced during the initial presentation of a piece of music may be different from those surfacing as the music unfolds.

However, music-based emotion induction is subject to debate among researchers, and the use of music as an emotion induction method is less prevalent than other stimuli [108].

Table 1.3: A summary of existing theories of emotions. See [50] for more details.

1954  Arnold & Gasson: Felt tendency towards an object accompanied by specific bodily changes
1986  Lutz & White: A means of negotiating social relations
1991  Lazarus: Organized psychophysiological reactions with respect to ongoing relationships with the environment
1991  Ekman: Characteristics common among emotions: "rapid onset, short duration, unbidden occurrence, automatic appraisal, and coherence among responses"
2008  Juslin: Emotions are typically described as relatively brief, though intense, affective reactions to potentially important events or changes in the external or internal environment that involve several subcomponents: cognitive appraisal (e.g., you appraise the situation as dangerous), subjective feeling (e.g., you feel afraid), physiological arousal (e.g., your heart starts to beat faster), expression (e.g., you scream), action tendency (e.g., you run away), regulation (e.g., you try to calm yourself)

Those opposing the use of music for inducing emotions have argued that music fails to present immediacy resembling real-life situations (e.g. a life-threatening event).

The lack of consensus among researchers regarding the use of music for emotion induction may stem from disagreement about the very definition of emotions. Theories of emotion have emphasized different attributes of an emotional response in defining emotions. Emotions have been defined with respect to bodily changes, social relations, homeostasis within the surrounding environment and automatic appraisal. Table 1.3 summarizes a number of different theories of emotion, and highlights the diversity among them [50].

To overcome the lack of consensus about the role of music in inducing emotions, Juslin et al. explored the ability of music to induce emotions based on various mechanisms leading to emotional response [92]. Despite competing arguments for and against musical emotion induction, music has been used as an emotional auditory stimulus in many studies [106, 109, 81, 213, 60]. In addition, music has been used in many studies involving emotional processing in the brain [15, 16].

There is considerable diversity in the choice of music excerpts used in studies of emotion. These studies can be categorized into two main streams: (i) studies using music in unaltered form (with no computer adjustments), and (ii) studies using music with modifications to specific music characteristics, such as dissonance or chords, to influence emotional experience. For example, by modifying the degree of dissonance (which affects the pleasantness of stimuli), Blood et al. studied neural emotional processing with music in a positron emission tomography study [16]. Steinbeis et al. [215] produced harmonic sequences that ended on an irregular chord function, and were able to identify electrodermal activity modulations when the musical expectancy was violated. Other studies used unaltered music belonging either to a collection of pre-selected music excerpts [167] or to a number of music pieces self-selected by the individuals [15].

With respect to the neutral auditory stimulus, various strategies have been proposed. For example, in some studies a neutral auditory stimulus composed of environmental sounds (e.g. ocean waves or songbirds) was presented [4]. Random static noise has also been applied as a neutral stimulus [51, 212]. Other studies have used computer adjustments to neutralize the emotional content of music [16].

Using music for emotion induction offers some specific advantages, particularly in studies of emotion involving the pediatric population. The emotional content of music is shown to be discernible by children as young as 6 years of age [32]. In addition, emotions in music are known to be perceived across cultures [55]. Therefore, as a dynamic, cross-cultural emotion induction method, music has many merits. To achieve the goals of the current thesis, a music-based emotion induction paradigm was implemented.

1.7 Objectives

The objective of this thesis was to implement and test a means of identifying emotional arousal and valence in response to music using Near Infrared Spectroscopy (NIRS) of the prefrontal cortex (PFC). To achieve this goal, multiple investigations were necessary to resolve technical and physiological challenges in the context of neural correlates of emotion. In this light, the specific objectives of this thesis were:

A. To identify correlates of emotion by characterizing the signals recorded via PFC NIRS with respect to emotional arousal and valence.

B. To investigate whether the detected activity patterns in objective A were due to emotional response or mere music perception.

C. To identify features from the NIRS signals which are correlated to emotional response and investigate the ability to differentiate emotional arousal and valence based on these features.

D. To compare detection accuracies achieved using PFC NIRS signals to those at- tained with autonomic nervous system signals such as heart rate, skin temperature, and electrodermal activity, which have previously been used for emotion identification in the literature.

E. To design and test a multi-modal emotion (arousal and valence) identification system using ANS and PFC NIRS monitors.

1.8 Roadmap

The roadmap of this thesis is organized according to the objectives listed above. Chapters 5-6 are arranged as journal articles, each focused on one or multiple objectives listed in section 1.7. The thesis structure is summarized in Figure 1.4. Chapter 2 provides details regarding the methods and data collection procedures used. Chapters 3 to 6 may duplicate information regarding the procedures summarized in chapter 2. Likewise, the introduction (or background) section of some chapters may also replicate information presented in chapter 1. Where duplication occurs, we highlight in the chapter preamble the sections that the reader can skip. Following the five main chapters, the thesis concludes with a summary of contributions and recommendations for future studies.

In Chapter 2, the study protocol is described in detail. These descriptions explain the experimental paradigm, data collection procedures, and measurements used. In addition, the relevant data preprocessing algorithms are introduced in detail.

In Chapter 3, the PFC NIRS signals, namely [HbO2] and [Hb], are characterized using wavelet peak detection. The wavelet peak detection algorithm allows characterization in the time and frequency domains. These wavelet characteristics are examined with respect to subjective ratings of arousal and valence. This chapter is in line with Objective A.

In Chapter 4, the main effect of three music characteristics (mode, dissonance and maximum sound pressure level), which are known to be effective in inducing emotions, on PFC hemodynamics is investigated. The PFC is likely to be involved in a brain network specialized for perceiving music, and therefore the activities observed may be due to music perception and not the emotional content of the music. This chapter focuses on Objective B, and investigates whether PFC hemodynamics are directly affected by the identified music characteristics.

In Chapter 5, a group of time-domain features are extracted from the PFC [HbO2] and [Hb] measurements. These features are then used for training two separate classifiers for arousal and valence differentiation. In this validation study, a PFC NIRS-based arousal and valence identification system is tested, and therefore Objective C is addressed.

Autonomic nervous system (ANS) activity has long been used for identifying emotions. Therefore, in the pursuit of a physiologically-based emotion identification system, it is important to compare the detection rates achieved using the PFC with those realized using ANS activity.

Figure 1.4: Thesis roadmap (Chapter 2: Study protocol; Chapter 3: Characterizing PFC hemodynamic changes with respect to valence and arousal; Chapter 4: Investigating the effect of music characteristics; Chapter 5: Automatic detection of emotional response using PFC hemodynamic features; Chapter 6: Identifying emotional valence and arousal by combining autonomic and central nervous system activity; Chapter 7: Concluding remarks).

In Chapter 6, ANS activity, collected in the form of heart rate, electrodermal activity and skin temperature features, is used for solving the same classification problems as those formulated in Chapter 5. In addition, a dynamic model based feature extraction is implemented to improve classification results by including frequency domain features. Ultimately, a mixture of classifier experts, each trained using ANS or PFC NIRS features, is used for solving the classification problem (i.e. high arousal versus low arousal and positive versus negative valence). Overall, this chapter investigates the ability of a multi-modal emotion identification system to improve upon accuracies achievable with classifiers considering exclusively PFC NIRS features. In this manner, Chapter 6 addresses Objectives C, D and E.

Chapter 2

Experimental Protocol

2.1 Preamble

This chapter summarizes the experimental details including the procedures, methods, data acquisition and data preprocessing.

2.2 Introduction

To test the hypotheses put forth in this thesis, a database of physiological responses and corresponding ratings of emotional valence (positive versus negative) and arousal (intense versus neutral) was created using a collection of music excerpts. In this chapter, you will find details regarding the data collection procedures. In addition, you will be introduced to the preprocessing techniques applied to the light intensities collected via the NIRS device to obtain the [HbO2] and [Hb] signals used in the rest of this thesis.

2.3 Participants

Ten able-bodied volunteers (five females, five males, age: 25 ± 2.7 years) were recruited for this study. The participants reported normal hearing, and normal or corrected-to-normal vision. The recruitment criteria excluded individuals with reported cardiovascular diseases, metabolic disorders, history of brain injury, respiratory conditions, drug- and alcohol-related conditions, and psychiatric conditions. Participants were instructed to refrain from caffeine and alcohol consumption 5 hours prior to the study. Volunteers had an average of 5.5 years of past music training. The duration of musical training is reported due to previous research documenting the influence of music training on the physiological responses to music [207]. Ethics approval was obtained from the Bloorview Research Institute research ethics board (see Appendix E) and all participants provided informed written consent.

2.4 Stimuli

The stimuli were composed of 78 music excerpts. All music segments were 45 s in duration. The excerpts included lyrical and non-lyrical pieces. The lyrics were in different languages (English, French, Italian and Spanish) to reduce potential effects of brain activation due to mental singing. Within the excerpts presented to each participant, 72 standard music pieces were chosen by two researchers from different genres of music (classical, rock, jazz and pop). Specifically, candidate pieces were assessed in terms of their valence characteristics as suggested by the tone, rhythm and lyrics (where applicable). Note that the researcher assessments were used solely to ensure an approximately uniform representation of music between valences (positive versus negative). The actual data analysis described in section 2.5 relied solely on participant ratings of valence and arousal. For the participant-selected pieces, participants chose a priori three pieces of music that personally induced intense positive emotions (joy or excitement) and three that induced intense negative emotions (sadness). The control acoustic stimulus was Brown noise (BN). User feedback in our pilot studies indicated that this type of noise was subjectively more pleasant than white noise at the same sound pressure level [229]. For more information regarding other alternatives for the neutral auditory stimulus, the reader is referred to section 1.6. A list of the standard database presented to all participants is provided in Appendix B.

2.5 Signal acquisition

An Imagent Functional Brain Imaging System from ISS Inc. (Champaign, IL) was used for NIRS measurements. A custom-made rubber polymer (3M 9900 series) headgear held three light detectors and ten light sources in place over the forehead, as depicted in Figure 2.1. At each X location in Figure 2.1, two light sources, one at 830 nm and the other at 690 nm, were co-located. This layout had been previously used for prefrontal cortex monitoring in Power et al. (2010) and provided readings at the nine shaded locations in Figure 2.1 [179]. With data from two wavelengths, this configuration yielded 18 different channels of light intensity readings. The midpoint of the headgear was aligned to the anatomical midline (as estimated by the position of the participant's nose), while the lower edge of the headgear sat just above the eyebrows. Light sources were modulated at 110 MHz and the detector amplifiers were modulated at 110.005 MHz, which led to a cross-correlation frequency of 5 kHz. The data were sampled at 31.25 Hz. During a complete cycle of all ten sources, each source illuminated the surface for 1.6 ms, during which eight acquisitions were made. A fast Fourier transform was applied to the average of the eight waveforms to obtain an estimate of the ac and dc intensities as well as the phase delay [179]. The dc light intensities were used to determine the HbO2 and Hb concentrations (i.e. [HbO2] and [Hb]).
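The following sketch illustrates how dc, ac and phase estimates can be read off the FFT of an averaged cross-correlation waveform, as described above. The sampling rate, sample count and synthetic waveform are assumptions for illustration and do not reproduce the Imagent's internal acquisition settings.

```python
import numpy as np

# Assumed parameters of the digitized cross-correlation waveform (illustrative only)
fs = 50_000.0          # assumed sampling rate (Hz)
f_cc = 5_000.0         # cross-correlation frequency (Hz)
n = 80                 # assumed samples per averaged waveform (f_cc falls exactly on a bin)

t = np.arange(n) / fs
# Synthetic averaged waveform: dc offset + 5 kHz component + small noise
waveform = 2.0 + 0.5 * np.cos(2 * np.pi * f_cc * t + 0.3) + 0.01 * np.random.randn(n)

spectrum = np.fft.rfft(waveform)
freqs = np.fft.rfftfreq(n, d=1 / fs)
k = np.argmin(np.abs(freqs - f_cc))    # FFT bin closest to the 5 kHz carrier

dc = spectrum[0].real / n              # mean value of the waveform
ac = 2 * np.abs(spectrum[k]) / n       # amplitude of the 5 kHz component
phase = np.angle(spectrum[k])          # phase delay of the 5 kHz component (radians)
print(dc, ac, phase)
```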

2.6 Pre-processing

Physiological artifacts such as respiration, heart rate and the Mayer wave were filtered out using a type II, third-order Chebyshev low-pass filter with a cut-off frequency of 0.1 Hz (normalized stop-band edge frequency of 0.032 and stop-band ripple of 50 dB down from the peak pass-band value) [178]. The 830 nm and 690 nm light intensities at each of the nine recording sites were used to calculate HbO2 and Hb concentrations via the modified Beer-Lambert law [30, 41], which resulted in 18 channels of concentration data. To reduce the effects of initial device calibration, the concentration time series were normalized within each experimental block against the mean in the same block.

Figure 2.1: The layout of light sources (circles) and detectors (X's). The vertical line denotes anatomical midline. The annotated shaded areas correspond to recording locations.
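A minimal sketch of these two pre-processing steps is given below, assuming SciPy's cheby2 filter design and a mean-subtraction reading of the block-wise normalization; it is illustrative only and does not reproduce the exact implementation used in this thesis.

```python
import numpy as np
from scipy.signal import cheby2, filtfilt

# Filter design follows the stated parameters: 3rd-order type II Chebyshev, 50 dB
# stop-band attenuation, normalized stop-band edge of 0.032 (relative to Nyquist).
# Zero-phase filtering and mean subtraction are assumptions of this sketch.
fs = 31.25                                        # NIRS sampling rate (Hz)
b, a = cheby2(N=3, rs=50, Wn=0.032, btype='low')  # low-pass design

def preprocess_block(concentration):
    """Filter one channel of concentration data and normalize it within the block."""
    filtered = filtfilt(b, a, concentration)      # zero-phase low-pass filtering
    return filtered - np.mean(filtered)           # subtract the block mean

# Example: one channel of simulated [HbO2] data from a 45 s trial
raw = np.cumsum(np.random.randn(int(45 * fs)))    # random walk as a stand-in signal
clean = preprocess_block(raw)
```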

2.7 Study design

Each participant completed four sessions conducted on separate days. In each session, the participant completed three blocks with optional breaks between blocks. Each block consisted of 12 consecutive trials: four trials with positively valenced songs (one of which was a participant-selected song), four trials with negatively valenced songs (one of which was a participant-selected song) and four BN trials. Within a block, the music and BN trials were pseudo-randomized, such that two BN trials never occurred consecutively while positively and negatively valenced songs appeared in no apparent order. The same pseudo-random sequence of trials was employed for all participants. Figure 2.2 depicts a trial sequence. In each trial, the participant listened to 10 s of BN, followed by a 45 s auditory stimulus (music or BN), and finally 5 s of BN. The sound level was faded in and out at the beginning and end of the trial, respectively, to reduce the risk of eliciting a startle. At the end of each trial, the participant rated the intensity and valence of their emotional experience using a nine-level self-assessment Manikin [146], shown in Figure 2.3.

Figure 2.2: Trial sequence

The beginning and end of each trial were marked by an audible tone. The participants were instructed to close their eyes when they heard the initial tone, and to open their eyes upon hearing the second tone.
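For illustration, the sketch below generates one 12-trial block satisfying the constraint that no two BN trials occur consecutively. The trial labels are hypothetical, and the actual fixed sequence used in the study is not reproduced here.

```python
import random

def make_block(rng=random):
    """Return a shuffled 12-trial block with no two consecutive Brown-noise (BN) trials."""
    trials = ['POS'] * 3 + ['POS_SELF'] + ['NEG'] * 3 + ['NEG_SELF'] + ['BN'] * 4
    while True:
        rng.shuffle(trials)
        # Accept the shuffle only if no BN trial is immediately followed by another BN trial
        if all(not (a == 'BN' and b == 'BN') for a, b in zip(trials, trials[1:])):
            return trials

print(make_block())
```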

Figure 2.3: The Self Assessment Manikin Rating System is shown. The top and the bottom row depict valence (positive to negative) and arousal (intense to neutral) ratings, respectively. The participant could select one of the nine levels of arousal/valence by marking the corresponding circles shown. For example, in the sample rating provided, a very intense positive emotion is represented.

Chapter 3

Characterizing PFC Hemodynamic Changes due to Valence and Arousal

3.1 Preamble

This chapter investigates the overall hemodynamic patterns accompanying emotional response. Identifying patterns associated with emotions in prefrontal cortex using near infrared spectroscopy is an important step towards emotion identification. In this study, NIRS recordings were used to characterize the PFC hemodynamic response to emotional arousal and valence. In particular, a wavelet-based peak detection technique was used to characterize chromophore concentration patterns.

This chapter is entirely reproduced from the following journal article: Moghimi S, Kushki A, Guerguerian AM, Chau T, Characterizing emotional response to music in the prefrontal cortex using near infrared spectroscopy. Neuroscience Letters. 2012; Elsevier.

Readers can skip section 3.4.1 as it reiterates the procedures described in chapter (2).


3.2 Abstract

Known to be involved in emotional processing, the human prefrontal cortex (PFC) can be non-invasively monitored using near-infrared spectroscopy (NIRS). As such, PFC NIRS can serve as a means for studying emotional processing by the PFC. Identifying patterns associated with emotions in the PFC using NIRS may provide a means of bedside emotion identification for nonverbal children and youth with severe physical disabilities. In this study, NIRS was used to characterize the PFC hemodynamic response to emotional arousal and valence in a music-based emotion induction paradigm in 9 individuals without disabilities or known health conditions. In particular, a novel technique based on wavelet-based peak detection was used to characterize chromophore concentration patterns. The maximum wavelet coefficients extracted from oxygenated hemoglobin concentration waveforms from all nine recording locations on the PFC were significantly associated with emotional valence and arousal. Specifically, high arousal and negative emotions were associated with larger maximum wavelet coefficients.

3.3 Introduction

Selected groups of nonverbal individuals with severe disabilities and little or no voluntary muscle control have benefited from communication alternatives based on brain activity known as brain computer interfaces (BCI) [237, 112]. However, due to developmental delays, limited expressive communication and unknown levels of receptive communication, many nonverbal children and youth are usually not candidates for existing BCI technologies. Indeed, the aforementioned challenges preclude the training of specific mental activities. However, these individuals are still candidates for affective BCIs (A-BCIs) which enable the automatic recognition of affective states using brain activity [158]. A-BCIs may provide a means of detecting spontaneous and natural reactions to specific stimuli. To this purpose, affective responses evoked by visual stimuli have been previously decoded in both facial thermographic [156] and cerebral hemodynamic pathways [218].

Emotional responses engage many different areas of the brain including parts of the limbic system, and prefrontal cortex (PFC) [164, 208, 126, 34]. Neuro-imaging techniques such as positron emission tomography (PET) [221] and magnetic resonance imaging (MRI) [21] have provided an opportunity for characterizing emotional perception in the brain [39, 209, 44, 206, 16, 15]. However, the bulky set-ups required by PET and MRI systems, and potential patient discomfort [150] preclude their use in studies of emotional responses in real-life settings, and particularly in developing A-BCIs.

Among the different modalities available for monitoring brain activity, near infrared spectroscopy (NIRS) is noninvasive, and particularly well-suited for monitoring activity in the PFC, which is among the regions involved in emotional processing [34], in life-like settings. NIRS monitors hemodynamic activity in the brain by measuring changes in oxygenated and deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively) in regional cerebral blood flow [90, 228]. NIRS is not prone to electrogenic artifacts (e.g. electrooculogram) present in the forehead area. NIRS provides lower spatial resolution compared to PET and MRI neuroimaging systems, but it is non-invasive, relatively inexpensive, and portable. As such, NIRS may be particularly amenable to A-BCI development involving children with severe disabilities.

Exposure to emotionally-laden stimuli is known to produce measurable changes in chromophore concentrations (i.e., [HbO2] and [Hb]) [74, 241, 218]. Examining hemodynamic changes in the prefrontal cortex using NIRS, Hoshi et al. showed that exposure to pleasant and extremely unpleasant pictures led to increases and decreases in [Hb], respectively [83]. Similarly, a recent study showed that highly positive and negative emotions associated with music could be differentiated with more than 70% accuracy [144] based on prefrontal cortex NIRS measurements.

The current study used PFC NIRS to investigate characteristics of the hemodynamic response, specifically [HbO2] and [Hb], to emotionally-laden music. Music has repeatedly been used for emotion induction in various studies [91, 134]. Characterizing emotional response to music in able-bodied adults is a step towards future investigations of emotion in non-verbal individuals with severe disabilities.

In the current study, the relationship between chromophore concentrations, [HbO2] and [Hb], and subjective ratings of emotional arousal and valence, was investigated using wavelet analysis. The wavelet transform is a tool for signal analysis in time and frequency. Broadly speaking, the wavelet transform evaluates the similarity of the time series to a given pattern, known as the mother wavelet. In particular, when applied to a time series, the wavelet transform produces a set of coefficients across a set of time points and scale values, where the wavelet coefficient at time t and scale a represents the similarity of the data at t to the mother wavelet scaled by a factor of a. The wavelet transform maps the signal onto a set of bases (wavelet family) consisting of scaled and translated versions of a mother wavelet function.

In the present study, wavelet analysis provided a means of extracting oxygenation patterns relevant to emotional valence and arousal, and was used to investigate the shape of the peak [HbO2] and [Hb] (i.e. differentiate abrupt peaks from gradual peaks). In particular, the maximum wavelet coefficient revealed the scale at which the signal most closely resembles the prototypical hemodynamic response (e.g., increase followed by decrease in oxygenation).

3.4 Methods

3.4.1 Procedures

Ten adults without disabilities or known health conditions (9 right-handed) were recruited for this study. Only the 9 right-handed participants (5 female, age: 25 ± 2.7 years) were included in the analysis to mitigate any response variations due to differences in hemispheric dominance. The Bloorview Research Institute research ethics board approved the study, and informed written consent was provided by all participants. Participants donned a custom polyethylene headgear, which covered their foreheads and accommodated the placement of multiple emitters and detectors. An Imagent Functional Brain Imaging System from ISS Inc. (Champaign, IL) was used for NIRS measurements across nine different regions on the forehead (Figure 3.1). Each source housed two diodes that emitted light at 830 nm and 690 nm. The light was detected by three detectors. The data were sampled at 31.25 Hz.

Figure 3.1: The layout of light sources (circles) and detectors (X's). The vertical line denotes anatomical midline. The annotated shaded areas correspond to recording locations.

During each trial, participants listened to either a music excerpt or a noise recording that represented a neutral auditory stimulus (Figure 3.2). The music excerpts comprised a database of 78 music pieces selected by the researchers together with 6 self-selected excerpts for each participant. The study was divided into 4 separate sessions encompassing 36 trials each (12 noise trials, 24 musical excerpts). After each trial, participants were prompted to rate their emotions in terms of arousal and valence using a nine-level self-assessment manikin [146]. The valence ratings were mapped from 1 (most positive) to 9 (most negative), and arousal ratings ranged from 1 (least intense) to 9 (most intense).

Figure 3.2: Trial sequence

3.4.2 Wavelet-based peak detection

In this phase of the study, the relationship between chromophore concentrations, [HbO2] and [Hb], and subjective ratings of emotional arousal and valence, was investigated using wavelet analysis. The wavelet transform is a tool for signal analysis in time and frequency.

Broadly speaking, the wavelet transform evaluates the similarity of the time series to a given pattern, known as the mother wavelet. In particular, when applied to a time series, the wavelet transform produces a set of coefficients (shown in (3.1)) across a set of time translations and scale values, where the wavelet coefficient at time displacement u ∈ ℝ and scale s ∈ ℝ+ represents the similarity of the data at u to the mother wavelet (ψ(t)) scaled by a factor of s.

\psi_{u,s}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-u}{s}\right) \qquad (3.1)

The continuous wavelet coefficient corresponding to scale s and translation u can be determined using (3.2).

Wf(u,s) = \int_{-\infty}^{+\infty} f(t)\,\frac{1}{\sqrt{s}}\,\psi^{*}\!\left(\frac{t-u}{s}\right)dt \qquad (3.2)

In this manner, the original signal f(t) is projected onto a two-dimensional space of u and s. Therefore, Wf(u,s) allows the study of signal characteristics at time u and scale s. The scale s changes inversely with frequency. Therefore, abrupt peaks (i.e. accompanied by faster changes in the vicinity of the peak) correspond to larger coefficients at lower scales, whereas gradual peaks (i.e. accompanied by slower changes in the proximity of the peak) correspond to higher scales. Therefore, wavelet coefficients identify peaks as well as the rate of changes near these peaks. The interested reader is referred to [131] for more details regarding wavelet analysis.

Figure 3.3: Mexican hat wavelet

Wavelet analysis is often used for detecting patterns of interest in data, such as stereotyped neuroelectric waveforms (e.g. event-related potentials) [35, 206], localized spikes in biological data [152], and unknown transients in the signal [54]. Visual inspection of [HbO2] and [Hb] in high-arousal and positively/negatively rated trials led to the selection of the Mexican hat function as the mother wavelet (Figure 3.3).

The wavelet transform was computed for scales s ∈ {70, 71, ..., 400}, which map to pseudo-frequencies in the range of 0.019-0.111 Hz [182] (the required range for capturing the chromophore concentration changes, given that the filtered concentrations have useful frequency content below 0.1 Hz). The transform was applied to translations u ∈ {1, ..., N}, where N was the number of samples included for analysis [152]. The first 5 s of the [HbO2] and [Hb] series during the music intervals were discarded in order to ignore any residual activity from the period preceding music onset. Therefore, a total of 40 s of data corresponding to N = 1250 samples were included when determining the wavelet coefficients. Given the wavelet coefficients for each concentration time series, two features were extracted for subsequent analysis: the maximum wavelet coefficient over all time (i.e. across all translations) and scales, and the scale at which the maximum occurred.
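The following numpy-only sketch illustrates the feature extraction just described: a continuous wavelet transform with a Mexican hat mother wavelet over scales 70-400, followed by extraction of the maximum coefficient and the scale at which it occurs. The finite wavelet support, the coarse scale grid and the synthetic input signal are assumptions for illustration, not the exact implementation used in this chapter.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet."""
    return (2.0 / (np.sqrt(3.0) * np.pi ** 0.25)) * (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def max_wavelet_feature(signal, scales):
    """Return (maximum wavelet coefficient, scale at which it occurs); edge effects ignored."""
    best_coeff, best_scale = -np.inf, None
    for s in scales:
        t = np.arange(-5 * s, 5 * s + 1)                   # finite support of +/- 5 scales
        kernel = mexican_hat(t / s) / np.sqrt(s)           # scaled, normalized wavelet
        coeffs = np.convolve(signal, kernel, mode='same')  # coefficients at each translation
        if coeffs.max() > best_coeff:
            best_coeff, best_scale = coeffs.max(), s
    return best_coeff, best_scale

fs = 31.25                                                 # NIRS sampling rate (Hz)
n = int(40 * fs)                                           # 40 s of data (N = 1250 samples)
hbo2 = np.sin(2 * np.pi * 0.05 * np.arange(n) / fs)        # synthetic stand-in for [HbO2]
print(max_wavelet_feature(hbo2, scales=range(70, 401, 5))) # coarse scale grid for brevity
```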

3.4.3 Statistical analysis

To test whether or not the wavelet features (maximum wavelet coefficients for [HbO2] and [Hb] and the corresponding scales) were related to the subjective ratings of arousal and valence, we used a mixed-effects repeated-measures linear regression analysis. Separate regressions were conducted for valence (most positive (1) to most negative (9)) and arousal (neutral (1) to most intense (9)) ratings, and for the [HbO2] and [Hb] chromophores. We report the regression coefficient (slope) and the associated p-value as an indicator of the correlation. The analysis was repeated for each of the nine recording sites, again considering arousal and valence ratings separately. To account for multiple comparisons (9 recording sites), we set a Bonferroni-adjusted significance level of α = 0.05/9 ≈ 0.005 [79].
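As an illustration of this analysis, the sketch below fits a mixed-effects model of the maximum wavelet coefficient on the arousal rating, with participant as the grouping (random-intercept) factor, using statsmodels, and applies the Bonferroni-adjusted threshold. The column names and toy data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy data: 'mwc' stands in for the maximum wavelet coefficient of one channel,
# 'arousal' is the subjective rating, and 'participant' is the repeated-measures group.
rng = np.random.default_rng(1)
n_participants, n_trials = 5, 20
df = pd.DataFrame({
    'participant': np.repeat(np.arange(n_participants), n_trials),
    'arousal': rng.integers(1, 10, size=n_participants * n_trials),
})
df['mwc'] = 0.2 * df['arousal'] + rng.normal(0, 0.5, size=len(df))

# Mixed-effects linear regression with a random intercept per participant
result = smf.mixedlm('mwc ~ arousal', data=df, groups=df['participant']).fit()
slope = result.params['arousal']
p_value = result.pvalues['arousal']

# Bonferroni adjustment over the nine recording sites
alpha = 0.05 / 9
print(f"slope = {slope:.3f}, p = {p_value:.4g}, significant: {p_value < alpha}")
```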

3.5 Results

Box-plots shown in Figure 3.4 a and b depict the distribution of participant valence and arousal ratings, respectively. Across participants, the median valence rating was neutral (i.e. 5), but the median arousal rating was situated more toward the lower end of the scale (i.e. 3).

Figure 3.7 illustrates [HbO2] and [Hb] recordings (in black) from the nine PFC interrogation sites for a representative trial rated at the highest arousal and most negative valence. Along with each recording is shown the corresponding temporal waveform of wavelet coefficients (in grey) at the scale containing the maximum wavelet coefficient.

Figures 3.5 and 3.6 report the slopes of the regression lines between the maximum wavelet coefficient and the subjective arousal and valence ratings, respectively.

Figure 3.4: Box-plots of valence (a) and arousal (b) ratings for each participant.

The results in Figures 3.5 and 3.6 indicated that the maximum wavelet coefficient and the corresponding scale exhibit significant regional associations with emotional ratings. In particular, the maximum wavelet coefficients extracted from [HbO2] were significantly related to ratings of arousal in all nine recording regions, while coefficients from [Hb] were correlated at inferior left (L2, L3 and L4) and right (R2, R3 and R4) locations. In both cases, the regression slopes were positive, indicating that high arousal resulted in larger [HbO2] and [Hb] peaks in the respective regions. The scale of the maximum wavelet coefficient provided a measure of the sharpness of the concentration peaks. Specifically, smaller scales (higher signal frequency) correspond to more abrupt concentration changes whereas larger scales (lower frequencies) are indicative of more gradual changes. As such, our results suggest that higher ratings of arousal were associated with more gradual peaks in [HbO2] in regions R2, R3, and L3. Collectively, these findings reveal that more intense emotions were accompanied by larger and less abrupt changes in the concentration of oxy- and deoxy-hemoglobin.

Negative emotions were correlated with larger values of the maximum wavelet coefficient across all nine recording sites in [HbO2], and at inferolateral left (L3) and right (R3) locations in [Hb]. More negative ratings also corresponded to more gradual peaks in [Hb] at L2, and in [HbO2] at R3. Therefore, negative emotions tended to elicit larger and less sudden regional chromophore concentration peaks. More negative ratings, on the other hand, resulted in sharp concentration peaks in a more midline, superior location (L1).

Figure 3.5: Slopes of regression lines between participant arousal ratings and (a) the maximum wavelet coefficient (MWC), and (b) the corresponding scale. Only slopes significantly different from zero are shown (p < 0.005).

(a) MWC, [HbO2]: R1 0.9531, R2 1.4831, R3 1.7885, R4 1.3225, O 0.9643, L1 1.0918, L2 1.2179, L3 1.5341, L4 1.4581 (all p < .0001)
(a) MWC, [Hb]: R2 0.2748 (p = 0.0007), R3 0.5811 (p < .0001), R4 0.4388 (p < .0001), L2 0.3289 (p = 0.0001), L3 0.4989 (p < .0001), L4 0.5136 (p < .0001)
(b) Scale of MWC, [HbO2]: R2 5.5083 (p < .0001), R3 4.3377 (p = 0.0002), L3 4.7002 (p = 0.0002); [Hb]: no significant slopes

Figure 3.6: Slopes of regression lines between participant valence ratings and (a) the maximum wavelet coefficient (MWC), and (b) the corresponding scale. Only slopes significantly different from zero are shown (p < 0.005).

(a) MWC, [HbO2]: R1 0.8048 (p = 0.0043), R2 1.9382, R3 2.1436, R4 1.6458, O 1.1121, L1 1.2404, L2 1.6016, L3 2.0886, L4 1.6445 (all remaining p < .0001)
(a) MWC, [Hb]: R3 0.5016 (p < .0001), L3 0.5143 (p < .0001)
(b) Scale of MWC, [HbO2]: R3 4.3357 (p = 0.0047); [Hb]: L1 -4.0693 (p = 0.0032), L2 4.8146 (p = 0.0019)

Figure 3.7: Plotted in black are the (a) [HbO2] (top panel) and (b) [Hb] (bottom panel) recordings across nine interrogation sites for a music sample inducing intense negative emotions from one of the participants during 45 seconds of aural stimulus. In grey are the corresponding waveforms of wavelet coefficients at the scale where the maximum wavelet coefficient occurs. These waveforms have been scaled by their standard deviation to facilitate visual comparison.

3.6 Discussion

In this study, arousal ratings were found to be associated with changes in chromophore concentrations. Intense emotional experience has been reported to result in heightened hemodynamic changes. Tanida et al. showed that mental stress induction could result in a bilateral increase or decrease of [HbO2] and [Hb], respectively [219]. Matsuo et al. have reported PFC [HbO2] increases in a group of individuals with post-traumatic stress disorder as well as a healthy control group in response to trauma-related videos [134].

Previous findings have reported lateral activation in the PFC due to positive or negative emotional stimuli [235]. For example, Altenmüller et al. [4] reported an increase in left temporal activation due to exposure to positive auditory stimuli, and a bilateral increase in response to negative auditory stimuli, using electroencephalography (EEG). However, in the current study, significant regression slopes were observed bilaterally. Therefore, no evidence of lateral activation patterns was obtained with respect to ratings of valence.

The significance of the regression slopes resulting from models involving maximum wavelet coefficients of [HbO2] indicated that the Mexican hat mother wavelet was a suitable template for identifying patterns relevant to emotional arousal and valence in [HbO2] across all nine recording sites. Unlike static emotion induction paradigms (e.g. pictures), where short exposure times can result in emotional experience [74, 218, 241], dynamic emotion induction paradigms (e.g. music) can involve emotional unfolding at any time during the course of exposure to the stimuli. For example, the emotions experienced during the introduction to a musical piece may be different from those experienced during the main body. This scenario resembles real-life emotional experience, where emotions can be manifested at any point in time. The results of the current study encourage future studies of the temporal dynamics of emotion [106] using wavelet analysis for the localization of emotional responses in time.

Chapter 4

The Effect of Music Characteristics

4.1 Preamble

There is compelling evidence of a network in the brain specialized for perceiving music. For example, previous studies of focal lesions in the brain have indicated selective loss of the ability to perceive specific music characteristics [243, 231]. This network may include the prefrontal cortex (PFC) [16, 99]. Therefore, the PFC may play a dual role of perceiving music characteristics and formulating emotional responses. To identify which of these two mechanisms (music perception vs. emotional response) was involved in the activation patterns observed in the PFC, the effect of music characteristics, namely mode, dissonance and sound pressure level, on the PFC [HbO2] and [Hb] was investigated in this chapter.

4.2 Introduction

Every musical piece can be characterized by specific structural and performance features. Performance characteristics, such as energy, timbre and pitch, involve the manner in which the performer executes a musical piece. These features are quite variable due to differences in performer skills and state. Structural features, on the other hand, involve acoustic and foundational characteristics of music and are more consistent across performers. These structural features, which include dissonance and mode, are shown to play an important part in conveying the emotional content of a musical piece [91]. Therefore, these characteristics may have played a part in inducing certain emotional experiences, and these emotional responses may have resulted in the prefrontal hemodynamic changes detected in chapter (3). However, this reasoning may be challenged by an alternative viewpoint regarding the perception of musical characteristics in the brain.

Previous research has identified particular brain networks specialized for perceiving musical characteristics. Primarily, lesions in the temporal lobe and auditory cortex were shown to affect perception of pitch and tonal melodies. Zatorre showed that lesions in the right temporal lobe could adversely affect the ability to discriminate tonal melodies [243]. In a study involving a control group and thirty-six patients with focal excisions, Warrier et al. identified the right anterior auditory cortical areas as being responsible for pitch judgements [231]. Such findings provide compelling evidence for the existence of a brain network specialized for music perception. This network may include the PFC, and the prefrontal area may also be involved in perceiving music. Khalfa et al., who used major and minor modes for emotion induction in an MRI study, reported left orbito- and mid-dorsolateral frontal activations in response to the minor mode [99]. Using auditory stimuli designed to vary only in harmonic dissonance and unpleasantness, Blood et al. found that the subjective ratings of dissonance correlated negatively with orbitofrontal and ventromedial prefrontal cortex activation [16].

Due to the potential role of the PFC in perceiving music, it was necessary to investigate whether the activity patterns identified were purely due to the perception of music characteristics or a result of the emotional content of the music. In this phase, the influence of music characteristics on the observed activity patterns detected in the PFC [HbO2] and [Hb] was investigated. Musical characteristics, such as dissonance and sound pressure level, were compared to hemodynamic changes. In addition, emotional ratings of arousal and valence were compared to average musical characteristics extracted from the corresponding trials.

4.3 Methods

4.3.1 Music characteristic extraction

The music characteristics investigated in this chapter included mode (major or minor tonality), sound pressure level (volume), and dissonance (a characteristic of harmony). Dissonance has been noted as a mechanism by which modern music is capable of inducing emotions. Children as young as 4 months were shown to react differently when exposed to consonant versus dissonant music pieces [244]. The ability of dissonance to induce emotions has been attributed to an innate response to danger, because many alarming sounds in nature, such as the cries of birds, are dissonant auditory cues. Therefore, dissonance, resulting from modifications to harmonic structures, can convey the salience of the auditory stimulus and result in an emotional response. Previous studies of emotion have used dissonant and consonant music excerpts for inducing unpleasant and pleasant emotions [16]. Similarly, intensity or volume has been shown to play a role in inducing emotions. Studies of music for marketing purposes and psychological assessments have confirmed that music volume can play a role in emotion induction [232, 19]. Finally, music mode is shown to affect emotion induction; the major mode is commonly associated with positive valence while the minor mode conveys negative emotional content. In previous studies of emotion involving brain activity, music mode has been used to convey positive and negative emotions [99]. Unlike dissonance, which involves the harmonic structure of music, mode is related to the melodic characteristics of music. Interested readers are referred to [91] for more information regarding the emotional content of music.

For each music excerpt used in this study, mode and dissonance features were determined using the music information retrieval toolbox (MIRtoolbox), developed at the University of Jyväskylä, Finland, which is available in MATLAB (Mathworks) [124]. MIRtoolbox allows time-domain extraction of music characteristics by breaking the music piece into time epochs. These epochs were chosen to be 1.5 ms. Average characteristics were extracted over the entire course of the trial. For more information regarding music characteristic extraction, see Appendix C.
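As a rough illustration of one such characteristic, the sketch below computes a frame-based level curve and its maximum from an audio waveform. This is not the MIRtoolbox implementation used in this thesis; the frame length, reference level and synthetic audio are assumptions for illustration.

```python
import numpy as np

fs = 44_100                       # audio sampling rate (Hz), assumed
frame_len = int(0.050 * fs)       # 50 ms analysis frames, assumed

def max_level_db(audio):
    """Return the maximum frame RMS level in dB relative to full scale (dBFS)."""
    n_frames = len(audio) // frame_len
    frames = audio[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))          # RMS level per frame
    return 20 * np.log10(np.max(rms) + 1e-12)            # epsilon avoids log(0)

# Example: 45 s of synthetic audio whose loudness ramps up over time
t = np.arange(45 * fs) / fs
audio = np.sin(2 * np.pi * 440 * t) * np.linspace(0.1, 0.8, t.size)
print(max_level_db(audio))
```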

4.3.2 Music database

As described in chapter 2, the music collection used during the data acquisition phase was composed of two subsets: 72 music pieces identically played for all participants, and six self-selected songs specific to each participant. The self-selected music excerpts were played once per session. Therefore, each participant was exposed to four repetitions of the same song. During these repetitions, the music characteristics remained the same. Therefore, comparing [HbO2] and [Hb] recorded during separate repetitions may provide an opportunity to detect music characteristic-dependent activity patterns in the PFC hemodynamics.

The remaining 72 music pieces and the respective arousal and valence ratings were used to detect whether emotional ratings had been influenced by music characteristics. In addition, the hemodynamic response to the common music excerpts was compared to the music characteristics to identify whether music characteristics had a significant effect on the hemodynamic patterns observed.

4.3.3 Statistical analysis

To test whether the music characteristics were related to subjective ratings of arousal and valence, a mixed-effects repeated-measures linear regression analysis was fit to the music characteristics extracted from the common music excerpts used (i.e. 72 music pieces). Separate mixed-effects regression analyses were conducted for valence and arousal ratings. The p-values associated with the regression slopes were recorded as an indicator of the significance of the detected relationship (p < 0.05).

Table 4.1: P-values for the main effect of arousal and valence rating in modeling mode, dissonance and maximum sound pressure level.

Dependent variable              Arousal    Valence
Mode                            0.6128     0.0056
Dissonance                      0.0082     <0.0001
Maximum sound pressure level    0.0280     0.0006

To determine the extent to which music volume, dissonance and mode affected [HbO2] and [Hb] averaged across the nine recording regions, a mixed-effects model was fit to the peak values of average [HbO2] and [Hb] with the main effect of each music characteristic separately (i.e. mode, maximum sound pressure level and average dissonance). For a region-specific analysis of [HbO2] and [Hb] with respect to mode, maximum sound pressure level and dissonance, please see Appendix D.

4.4 Results

Table 4.1 summarizes the significance of the slopes of the regression lines for the mixed-effects models relating mode, maximum sound pressure level and dissonance to the main effects of arousal and valence ratings in two separate models. The ratings of valence were found to be significantly (p < 0.05) related to mode, maximum sound pressure level and dissonance in each trial. The ratings of arousal were significantly related to dissonance and maximum sound pressure level.

Music characteristics did not significantly influence the peaks of [HbO2] and [Hb] averaged across the nine recording sites (all p > 0.05). Table 4.2 summarizes the p-values corresponding to the main effects of the music characteristics, namely mode, maximum sound pressure level and dissonance, in modeling the peaks of average [HbO2] and [Hb] across the nine recording locations.

Table 4.2: P-values for the main effect of music characteristics (i.e. dissonance, mode, and maximum sound pressure level) in modeling the peaks of [HbO2] and [Hb] averaged across the nine recording sites.

Dependent variable    Mode     Dissonance    Maximum sound pressure level
[HbO2]                0.205    0.098         0.059
[Hb]                  0.769    0.052         0.250

Figure 4.1: In grey: the normalized sound pressure level of self-selected song A for participant 3. In black: normalized [HbO2] averaged across the nine recording locations, shown for each of the four repetitions of song A. The [HbO2] varied across different repetitions of the same song.

Figure 4.1 depicts the normalized [HbO2] averaged across all nine interrogation sites during 4 repetitions of a self-selected song. Visual inspection suggests that the average [HbO2] collected during separate repetitions of the same song showed temporal differences. For example, in Figure 4.1, the peak [HbO2] appeared at different time points.

4.5 Discussion

The slopes of the regression lines fit to the music characteristics with the main effect of arousal and valence ratings reached significance (p < 0.05), with the exception of music mode modeled using the arousal rating. The arousal and valence ratings represent emotional experience; therefore, these results confirmed the significant effect of music characteristics in inducing emotions (Table 4.1). The music mode was found to significantly influence valence (p < 0.05) while the effect of mode on arousal did not reach significance. This finding echoes those of previous studies involving emotional ratings and music characteristics. Husain et al. showed that modifying the mode of a piece by Mozart can affect mood without influencing arousal [85]. Previous studies have acknowledged the effect of mode on the perceived valence among listeners [75].

As shown in Table 4.2, mode, dissonance and maximum sound pressure level did not significantly influence the peak PFC hemodynamics across participants (the slopes of the regression lines did not reach significance). The results shown in Table 4.1, on the other hand, confirmed that music characteristics significantly influenced subjective ratings (with the exception of mode, which did not significantly influence arousal ratings). In addition, as shown in Figure 4.1, repeated exposure to a music excerpt with identical music characteristics resulted in different PFC hemodynamic patterns. Based on these findings, mode, dissonance and maximum sound pressure level are unlikely to have directly influenced PFC hemodynamics, but they significantly (p < 0.05) influenced subjective ratings of arousal and valence and, therefore, emotional experience.

4.5.1 Subject specific patterns

As described in previous chapters, emotions can be individual specific since different participants may manifest different levels of emotional sensitivity. The variability in emotional ratings in this study (see Figure 3.4) highlights the subject-specific nature of emotional experience. In addition, in the same participant, emotional response may vary between sessions due to mood differences [191]. These differences in emotional experience may be responsible for the amount of variability between repeated exposure to the same music excerpts as observed in Figure 4.1. 54

4.5.2 Temporal dynamics

Music characteristics are dynamic phenomena, and so are PFC hemodynamics and emotions. Therefore, instantaneous comparisons between these three elements seem necessary for understanding how they interact. However, accessing instantaneous emotional ratings by interrupting the user may result in distractions impeding natural emotional response. Therefore, in the current thesis, emotional ratings were collected at the end of each music excerpt. Future studies involving music characteristics and the brain should consider implementing experimental paradigms to realize dynamic emotional ratings.

4.6 Conclusion

In this chapter, the effect of music characteristics, namely mode, dissonance, and maximum sound pressure level, on subjective ratings of arousal/valence and the maximum [HbO2] and [Hb] in the PFC was investigated. The PFC [HbO2] and [Hb] averaged across the nine recording locations were not significantly influenced by the music characteristics under investigation. However, the results indicated that dissonance and maximum sound pressure level significantly influenced subjective ratings of arousal and valence. In addition, the ratings of valence were found to be significantly influenced by music mode.

Overall, these findings supported the conjecture that music characteristics can affect emotional experience. The evidence fails to support the hypothesis that the observed PFC hemodynamic patterns were due to music perception. Therefore, the patterns in the PFC were more likely to have resulted from the underlying emotions than from the perception of music.

Chapter 5

Automatic Detection of Emotional Response to Music

5.1 Preamble

In this chapter, the feasibility of automatic detection of emotional response to aural stimuli using near-infrared spectroscopy of the prefrontal cortex is examined. Here, you will find details of the machine learning algorithms used for training participant-specific classifiers which were used to differentiate various levels of arousal and valence.

This chapter is entirely reproduced from the following journal article: Moghimi S, Kushki A, Guerguerian AM, Chau T, Automatic detection of a prefrontal cortical response to emotionally rated music using multi-channel near-infrared spectroscopy. Journal of Neural Engineering. 2012; 026022-9.

Readers can skip sections 5.4.1 and 5.4.2 since they reiterate the procedures described in chapter (2).


5.2 Abstract

Emotional responses can be induced by external sensory stimuli. For severely disabled nonverbal individuals who have no means of communication, the decoding of emotion may offer insight into an individual's state of mind and his/her response to events taking place in the surrounding environment. Near-infrared spectroscopy (NIRS) provides an opportunity for bed-side monitoring of emotions via measurement of hemodynamic activity in the prefrontal cortex, a brain region known to be involved in emotion processing. In this paper, the prefrontal cortex activity of ten able-bodied participants was monitored using NIRS as they listened to 78 music excerpts with different emotional content and a control acoustic stimulus consisting of Brown noise. The participants rated their emotional state after listening to each excerpt along the dimensions of valence (positive versus negative) and arousal (intense versus neutral). These ratings were used to label the NIRS trial data. Using a linear discriminant analysis-based classifier and a two-dimensional time-domain feature set, trials with positive and negative emotions were discriminated with an average accuracy of 71.94% ± 8.19%. Trials with audible Brown noise, representing a neutral response, were differentiated from high arousal trials with an average accuracy of 71.93% ± 9.09% using a two-dimensional feature set. In nine out of the ten participants, response to the neutral Brown noise was differentiated from high arousal trials with accuracies exceeding chance level, and positive versus negative emotional differentiation accuracies exceeded the chance level in seven out of the ten participants. These results illustrate that NIRS recordings of the prefrontal cortex during presentation of music with emotional content can be automatically decoded in terms of both valence and arousal, encouraging future investigation of NIRS-based emotion detection in individuals with severe disabilities.
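For readers unfamiliar with the classification pipeline summarized above, the sketch below shows a participant-specific linear discriminant analysis classifier evaluated with leave-one-out cross-validation on a two-dimensional feature set. The features, labels and validation scheme here are illustrative assumptions, not the exact procedure reported in this chapter.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
n_trials = 48
# Two hypothetical time-domain features per trial (e.g. mean and slope of [HbO2]);
# synthetic data standing in for one participant's labeled trials
features = rng.normal(size=(n_trials, 2))
labels = (features[:, 0] + 0.5 * rng.normal(size=n_trials) > 0).astype(int)  # pos vs. neg

# Leave-one-out cross-validated accuracy of an LDA classifier
accuracy = cross_val_score(LinearDiscriminantAnalysis(), features, labels,
                           cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {accuracy:.2f}")
```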

5.3 Introduction

Emotions have been characterized as patterns of experience, perception, action and communication that can be animated in response to physical and social encounters [96]. Some theories suggest that emotions can be manifested as a result of human interactions with the surrounding environment [53, 125, 23], which result in physiological changes [160] such as the modulation of central and peripheral nervous system activity [15, 74, 9, 28, 210, 111]. These changes may facilitate the identification of emotional state in non-verbal individuals with severe disabilities who may have no other means of expression. Of particular appeal is the detection of affective responses through brain activity monitoring, as there is no requirement for voluntary motor control. Indeed, computer-based detection of emotional responses may enhance implicit communication about the user in human-computer interaction systems [31]. Affective computing has long been touted for its potential for more realistic and user-accommodating interactions [171]. An emotionally aware system stands to benefit non-verbal individuals with severe disabilities by estimating their emotional state in the absence of more direct means of interaction (e.g. speech and gestures). In turn, knowledge of the patient's affective state may help to mitigate care-giver stress and facilitate treatment decisions in a timely fashion [70]. Various brain circuits, including parts of the limbic system and amygdala, are responsible for the perception of emotional stimuli [164, 208, 126]. In addition, the frontal region of the human brain is involved in regulating emotional response to sensory input [187, 33, 34]. For example, the severity of depressive symptomatology in patients following stroke lesions was reported to be significantly correlated with the proximity of the lesion to the frontal pole [186]. Moreover, left and right frontal activations were also found in response to watching video clips inducing positive and negative emotional responses, respectively [235]. Activations in the orbito-frontal and ventral prefrontal cortex in response to highly pleasurable self-selected music excerpts have also been reported [15].

Among various brain measurement modalities such as electroencephalography [157], positron emission tomography [221], magnetoencephalography [68] and magnetic resonance imaging (Bushong 1988), near infrared spectroscopy (NIRS) is particularly well suited to long-term bedside monitoring of prefrontal cortex activity. NIRS involves the optical measurement of changes in oxygenated (HbO2) and deoxygenated hemoglobin (Hb) concentrations in regional cerebral blood flow [90, 228]. Being an optical modality, NIRS measurements are not susceptible to electrogenic artifacts such as electrooculograms and electromyograms.

NIRS has been used previously to detect emotional responses in the prefrontal cortex.

Recent findings with emotionally laden visual stimuli have confirmed the presence of prefrontal cortex activations detectable by NIRS [74, 241, 83]. Likewise, in the context of automatic emotion detection, Tai and Chau (2009) were able to differentiate between prefrontal responses to affective pictures and baseline activity on a single-trial basis with an average of 75% accuracy [218]. However, the perception of visual stimuli may require gaze fixation and the control of the eye muscles responsible for keeping the eyes open. Therefore, individuals with severe disabilities who possess little or no voluntary muscle control, possibly concomitant with vision impairment, may not be able to observe visual stimuli. However, evidence suggests that aural stimuli, the perception of which requires no voluntary muscle control, can also elicit a pre-frontal response [15, 16, 17].

Previous findings indicate that when used as a BCI control task, active music imagery (mental singing) can be differentiated from the rest state and mental math with accuracies significantly above chance [177, 45, 65]. However, NIRS-based automatic detection of passive prefrontal responses to affective aural stimuli remains unexplored to date. In this study, we examined the feasibility of automatically detecting emotional responses to aural stimuli by near-infrared spectroscopic interrogation of the prefrontal cortex. Music in particular is recognized for its ability to induce an emotional response in a wide array of individuals [138]. The emotional content of music is known to be perceived across cultures [55] and distinguished by children as young as 6 years of age [32]. In fact, music has been frequently used as an emotional auditory stimulus [106, 109, 81, 213, 60]. In this paper, music excerpts were thus used for inducing affective brain activity.

5.4 Methods

Ten able-bodied volunteers (five females, five males, age: 25 ± 2.7 years) were recruited for this study. The participants reported normal hearing, and normal or corrected-to-normal vision. The recruitment criteria excluded individuals with reported cardiovascular diseases, metabolic disorders, history of brain injury, respiratory conditions, drug- and alcohol-related conditions, and psychiatric conditions. Participants were instructed to refrain from caffeine and alcohol consumption 5 h prior to the study. Volunteers had an average of 5.5 years of past music training. Ethics approval was obtained from the Bloorview Research Institute research ethics board (see Appendix E) and all participants provided informed written consent.

5.4.1 Stimuli

The stimuli were composed of 78 researcher-selected and 6 participant-selected musical pieces. All music segments were 45 s in duration. The excerpts included lyrical and nonlyrical pieces. The lyrics were in different languages (English, French, Italian and

Spanish) to reduce potential effects of brain activation due to mental singing. The 78 standard music pieces were chosen by two researchers from different genres of music

(classical, rock, jazz and pop). Specifically, candidate pieces were assessed in terms of their valence characteristics as suggested by the tone, rhythm and lyrics (where applicable). Note that the researcher assessments were used solely to ensure an approximately uniform representation of music between valences (positive versus negative). The actual data analysis described in section 2.5 relied solely on participant ratings of valence and arousal. For the participant-selected pieces, participants chose a priori three pieces of music that personally induced intense positive emotions (joy or excitement) and three that induced intense negative emotions (sadness). The control acoustic stimulus was

Brown noise (BN). User feedback in our pilot studies indicated that this type of noise was subjectively more pleasant than white noise at the same sound pressure level (Voss and Clarke 1978).

Each participant attended four sessions, which occurred on separate days, no more than four weeks apart. In each session, participants completed three blocks with optional breaks between blocks. Each block consisted of 12 consecutive trials: four trials with positively valenced songs (one of which was a participant-selected song), four trials with negatively valenced songs (one of which was a participant-selected song) and four BN trials. Within a block, the music and BN trials were pseudo-randomized, such that two BN trials never occurred consecutively while positively and negatively valenced songs appeared in no apparent order. The same pseudo-random sequence of trials was employed for all participants. Figure 2 depicts a trial sequence. In each trial, the participant listened to

10 s of BN, followed by a 45 s auditory stimulus (music or BN), and finally 5 s of BN. The sound level was faded in and out at the beginning and end of the trial, respectively, to reduce the risk of eliciting a startle. At the end of each trial, the participant rated the intensity and valence of their emotional experience using a nine-level self-assessment

Manikin (Morris 1995). The beginning and end of each trial were marked by an audible tone. The participants were instructed to close their eyes when they heard the initial tone, and to open their eyes upon hearing the second tone.

In this phase, hemodynamic activity was represented by features extracted from

[HbO2] and [Hb] concentrations. A subset of these features was selected and used for training a linear discriminant analysis based classifier. The classifier was then tested using a second subset set aside for testing. The training and testing feature subsets were both labeled based on arousal and valence ratings provided by the participants.

Classifiers were trained separately for differentiating arousal and valence levels in each participant.

5.4.2 Preprocessing

Low-frequency artifacts such as respiration, heart rate and the Mayer wave were filtered using a type II third order Chebyshev low pass filter with a cut-off frequency of 0.1 Hz (normalized stop-band edge frequency of 0.032 and stop-band ripple of 50 dB down from the peak pass-band value) [178]. The 830 nm and 690 nm light intensities at each of the nine recording sites were used to calculate HbO2 and Hb concentrations via the modified Beer-Lambert law [30, 41], which resulted in 18 channels of concentration data.

To reduce the effects of initial device calibration, the concentration time series were normalized within each experimental block against the mean in the same block.
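For illustration, a minimal sketch of this preprocessing chain is given below, assuming the concentration data are held as a channels-by-samples NumPy array sampled at the 31.25 Hz rate reported in Section 6.3.2. The function names, array shapes and the edge-frequency convention passed to the filter-design routine are illustrative assumptions rather than the thesis implementation.

```python
import numpy as np
from scipy.signal import cheby2, filtfilt

FS = 31.25  # NIRS sampling rate in Hz (per Section 6.3.2)

def lowpass_hemodynamics(conc, fs=FS, cutoff=0.1, order=3, ripple_db=50):
    """Attenuate respiration, cardiac and Mayer-wave components with a type II
    Chebyshev low-pass filter (order and stop-band ripple follow the values
    quoted in the text; the exact edge-frequency convention is an assumption)."""
    b, a = cheby2(order, ripple_db, cutoff, btype="low", fs=fs)
    return filtfilt(b, a, conc, axis=-1)  # zero-phase filtering

def normalize_block(conc_block):
    """Remove the block mean from each of the 18 concentration channels to
    reduce the effect of the initial device calibration."""
    return conc_block - conc_block.mean(axis=-1, keepdims=True)

# Illustrative usage on simulated data: 18 channels x 3 min of recording.
raw = np.random.randn(18, int(FS * 180))
clean = normalize_block(lowpass_hemodynamics(raw))
```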

5.4.3 Feature extraction

Two genres of features were considered: laterality features and single-channel features.

All features were extracted from [HbO2] and [Hb] concentrations. Table 5.1 summarizes the features used. Single-channel features were calculated at each of the 9 interrogation locations and consisted of the mean, slope and coefficient of variation of the concentration signals during the 45 s aural stimulus period, as well as the change in the average concentration from the preceding baseline period to the task period. The slope was determined by fitting a line using linear regression to all data points in the 45 s trial window and calculating the corresponding slope. The entire 45 s window was used for determining the slope because the concentration changes could occur at any point during the presentation of the aural stimulus. The coefficient of variation was determined by finding the ratio of the variance to the mean over the course of the trial. The amplitude-based features reflected the level of chromophore concentration, which captured regional brain activity.

The slope of the concentration waveform represented response latency (i.e. faster vs. slower changes). Such features have previously characterized task-based activation in the prefrontal cortex [178, 180, 218, 151]. In total there were 4 features/location × 9 locations × 2 chromophore concentrations = 72 single-channel features.

Table 5.1: Summary of features used in the analysis

Feature type                     Features
Laterality features              Lateral slope ratio (LSR) = right concentration slope / left concentration slope
                                 Lateral absolute mean difference (∆LM) = |left concentration mean − right concentration mean|
Single channel-based features    Stimuli period mean (M)
                                 Stimuli period slope (S)
                                 Coefficient of variation (CV)
                                 Mean difference between signal and noise (∆M) = stimuli period mean − preceding noise period mean

The two laterality features quantified differences in activity between the left and the right sides, and thus were calculated for each of the four pairs of interrogation locations symmetrical about the midline (i.e., 1L-1R, 2L-2R, 3L-3R and 4L-4R in Fig 2.1). Laterality features included the ratio of the concentration signal slopes, and the difference in the average signal values, between corresponding left and right channels. The inclusion of these features was motivated by physiological findings that confirm lateralized activations in response to emotional stimuli [235, 33, 4]. In total, there were 2 features/channel pair × 4 channel pairs × 2 chromophore concentrations = 16 laterality features.
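A compact sketch of how the single-channel and laterality features in Table 5.1 could be computed for one location (or one symmetric channel pair) is shown below. The coefficient of variation follows the variance-to-mean definition quoted above; the function names and dictionary layout are illustrative and not taken from the thesis software.

```python
import numpy as np

def single_channel_features(stim, baseline):
    """Single-channel features for one chromophore at one location: mean (M),
    slope (S), coefficient of variation (CV) over the 45 s stimulus window,
    and the stimulus-minus-baseline mean difference (dM)."""
    t = np.arange(stim.size)
    slope = np.polyfit(t, stim, 1)[0]      # S: linear-regression slope
    cv = np.var(stim) / np.mean(stim)      # CV as defined in the text (variance / mean)
    return {"M": stim.mean(),
            "S": slope,
            "CV": cv,
            "dM": stim.mean() - baseline.mean()}

def laterality_features(left_stim, right_stim):
    """Laterality features for one symmetric channel pair: lateral slope ratio
    (LSR) and absolute lateral mean difference (dLM)."""
    t = np.arange(left_stim.size)
    s_left = np.polyfit(t, left_stim, 1)[0]
    s_right = np.polyfit(t, right_stim, 1)[0]
    return {"LSR": s_right / s_left,
            "dLM": abs(left_stim.mean() - right_stim.mean())}
```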

5.4.4 Classification procedures

For each trial, 60 seconds of data were extracted, including the 45 second stimulus period and the preceding (10 s) and subsequent (5 s) Brown noise periods. The trials with Brown noise (BN) were set aside, and the rest of the data were partitioned according to arousal and valence ratings. For the analysis of arousal, the 48 highest rated trials (out of 96 trials with music) over all four sessions were selected. For the valence component, the

24 highest positively-rated and 24 highest negatively-rated trials across all four sessions

(out of 96 trials with music) were selected. The high arousal (HA), positive valence (PV), negative valence (NV), and Brown noise (BN) trials were labeled accordingly. Note that arousal and valence labeling were performed independently [156, 190].

A classifier based on linear discriminant analysis [40] was used to solve two different two-class classification problems (HA vs. BN and PV vs. NV). Comparing the two valence categories (i.e. PV and NV) individually with Brown noise was not feasible due to the difference in sample sizes (nPV = nNV = 24, nBN = 48). The classification accuracy was estimated using the average of 50 independent iterations of 10-fold cross-validation.

Due to the differences in prefrontal activation across participants, feature selection was performed to select the subset of the feature set that best separated the two classes for each participant. To measure separability, we used the Fisher score [40], defined as the ratio of the difference between the means of the features extracted from each class under investigation to the sum of the variances of the features from each class, computed on the training data. The Fisher score for each feature was calculated and the top two features with the highest scores were selected for classification. Classification accuracy is reported as the correct classification rate.
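The per-participant classification procedure can be sketched as follows, assuming the features are collected in a trials-by-features matrix. The Fisher score is implemented here as the absolute mean difference divided by the summed class variances, following the wording above; the scikit-learn classes and the practice of re-running feature selection inside each training fold are assumptions of this sketch rather than a description of the original code.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold

def fisher_scores(X, y):
    """Per-feature Fisher score: |mean difference between classes| divided by
    the sum of the class variances (as worded in the text)."""
    a, b = X[y == 0], X[y == 1]
    return np.abs(a.mean(0) - b.mean(0)) / (a.var(0) + b.var(0) + 1e-12)

def classify(X, y, n_features=2, n_repeats=50, seed=0):
    """Average accuracy over repeated 10-fold cross-validation, selecting the
    top-scoring features on each training fold only."""
    rng = np.random.RandomState(seed)
    accs = []
    for _ in range(n_repeats):
        cv = StratifiedKFold(n_splits=10, shuffle=True,
                             random_state=rng.randint(1 << 30))
        for tr, te in cv.split(X, y):
            top = np.argsort(fisher_scores(X[tr], y[tr]))[-n_features:]
            clf = LinearDiscriminantAnalysis().fit(X[tr][:, top], y[tr])
            accs.append(clf.score(X[te][:, top], y[te]))
    return np.mean(accs), np.std(accs)
```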

5.5 Results

Fig. 5.1 depicts normalized sample concentration recordings from all recording locations for participant 3. Fig. 5.1(a) and 5.1(b) are recordings during a music excerpt rated as highly arousing and strongly positive, whereas Fig. 5.1(c) and 5.1(d) are normalized sample recordings from one of the most arousing but most negatively rated trials. Recordings during a sample Brown noise trial are provided for comparison in both cases. Some immediate patterns are evident. For both HbO2 plots, we notice a general increase in concentration (hyper-oxygenation), illustrated in Fig. 5.1(a) and Fig. 5.1(c). The hyper-oxygenation happens at different points in time during exposure to the various auditory stimuli. In both the positively and negatively rated trials depicted in Fig. 5.1, a decrease in Hb concentration following hyper-oxygenation is observed, which is consistent with previous findings of functional NIRS studies [161, 136]. The valenced responses are visibly distinct from the sample Brown noise response (light grey traces).

Figure 5.1: (a) HbO2 and (b) Hb concentrations for the positively valenced stimulus; (c) HbO2 and (d) Hb concentrations for the negatively valenced stimulus. Plots (a) and (c) exemplify normalized HbO2 concentration signals at different recording locations while plots (b) and (d) are the corresponding normalized Hb concentration signals. The dark lines represent normalized signals corresponding to highly valenced, high arousal stimuli while the lighter grey line depicts normalized concentrations during Brown noise presentation to the same participant. The same Brown noise sample is illustrated for both positively and negatively valenced examples.

The average classification accuracies for the valence (PV versus NV) and arousal (HA versus BN) classification problems are reported in Tables 5.3 and 5.2, respectively, for each participant. The best accuracy averaged over all participants was obtained with

2-dimensional feature sets for both HA versus BN (71.93%), and PV versus NV (71.94%) classification problems. Tables 5.2 and 5.3 also summarize the different features selected by the feature selection algorithm for each classification problem and each participant. As seen, the optimal feature set was different for each participant.

The spatial distribution of features leading to the best accuracies is marked in Fig 5.2 for the HA versus BN and PV versus NV classification problems. In these figures, the size of a rectangular area is directly proportional to the frequency at which the feature in question was selected at a specific recording site across all participants. The vertical line represents the anatomical midline. The values are based on the feature set dimensionality resulting in the highest average classification accuracy.

Table 5.2: Classification accuracy in % for each participant when classifying HA vs. BN. Feature types corresponding to the best average accuracy are also presented for each participant (M = stimulus period mean; ∆M = stimulus period mean − preceding noise period mean; LSR = lateral slope ratio; ∆LM = lateral mean difference; S = slope; CV = coefficient of variation).

Participant   Gender   HA vs. BN % (2 features)   Features chosen
1             M        90.21 ± 1.72               ∆M
2             F        76.91 ± 1.04               ∆M
3             F        78.67 ± 3.31               ∆M
4             F        67.57 ± 2.01               M, S
5             F        69.04 ± 1.91               ∆M, CV
6             M        58.12 ± 2.55               S
7             M        61.71 ± 2.43               S, ∆M
8             F        71.16 ± 1.08               S
9             M        70.17 ± 3.93               ∆M
10            M        75.72 ± 1.28               ∆M
Average                71.93 ± 9.09

Fig 5.3 illustrates how the adjusted classification accuracy (i.e. the average of classification sensitivity and specificity), averaged across all participants, changes as trials with lower arousal ratings are compared to brown noise. Similarly, Fig. 5.4 depicts how the classification results change when different ranges of positively- and negatively-rated trials are compared. Comparisons ranged from the highest negative trials (top 12) versus the highest positive trials (top 12) to all positively-rated trials classified against all negatively-rated trials.

Table 5.3: Classification accuracy in % for each participant when classifying PV vs. NV. Feature types corresponding to the best average accuracy are also presented for each participant (M = stimulus period mean; ∆M = stimulus period mean − preceding noise period mean; LSR = lateral slope ratio; ∆LM = lateral mean difference; S = slope; CV = coefficient of variation).

Participant   Gender   PV vs. NV % (2 features)   Features chosen
1             M        75.20 ± 4.22               ∆M, M
2             F        77.73 ± 2.09               LSR, S
3             F        63.28 ± 4.30               LSR, M
4             F        67.76 ± 2.83               LSR, ∆M
5             F        77.57 ± 4.10               ∆M
6             M        63.04 ± 3.67               ∆M, M
7             M        62.00 ± 3.46               S, CV
8             F        86.91 ± 2.87               ∆M
9             M        76.99 ± 5.11               ∆M, M
10            M        68.96 ± 6.55               S, M
Average                71.94 ± 8.19

5.6 Discussion

5.6.1 Classification Accuracy

The objective of this phase was to detect the brain response to emotionally-laden music by monitoring the prefrontal hemodynamics manifested as changes in the HbO2 and Hb concentrations. Visual inspection of the concentration waveforms in Fig. 5.1 supports the choice of discriminatory features (e.g., mean and slope). Emotional arousal in response to music was classified against the Brown noise response with an average accuracy of 71.93%, while emotional valence (i.e. positive or negative) was differentiated with 71.94% accuracy. These findings indicate that the emotional content of music induces differential patterns of activity in the prefrontal cortex, detectable algorithmically by NIRS.

As reported in Tables 5.2 and 5.3, classification accuracies varied across participants, corroborating previous findings of individual differences in emotional reactivity [189, 22].

As can be seen in Tables 5.2 and 5.3, accuracies above chance level were achieved for 9 out of 10 participants in the HA versus BN classification problem (α = 0.05), while in the PV versus NV scenario, accuracies for 7 out of 10 participants exceeded chance (α = 0.05)¹.

Figure 5.2: Location of features resulting in the best overall accuracy: (a) HbO2, HA versus BN; (b) Hb, HA versus BN; (c) HbO2, PV versus NV; (d) Hb, PV versus NV. Each rectangle is located over a recording site. The size of the rectangle is proportional to the number of features selected from the corresponding location. The vertical line denotes the anatomical midline (HA = high arousal; BN = Brown noise; PV = positive valence; NV = negative valence).

One of the concerns when investigating emotional experience using PFC activity is

the possibility of activation due to the emotion induction task requirements as opposed to the emotions induced [74]. However, Fig 5.3 illustrates how the classification accuracy

degrades as trials with increasingly lower arousal ratings are compared against brown

noise. Therefore, the difference in the task requirements (e.g. attentional demands),

when presenting music compared to brown noise presentation, is unlikely to be responsible

for classification accuracy. Similarly, in Fig 5.4, the classification accuracies degrade as trials with increasingly lower positive and negative valence ratings are classified against each other. This decrease in the classification accuracies is expected due to potential similarities between trials rated at the lower positive and lower negative ends of valence (approaching a neutral state).

¹Note that for a two-class problem, the 95% confidence intervals (α = 0.05) for 48 and 24 trials per class are 50 ± 9.80 and 50 ± 13.59, respectively [149].

Figure 5.3: Adjusted classification accuracy (the average of classification sensitivity and specificity), averaged across participants, versus the number of trials included for classification against brown noise trials, after sorting all trials based on ratings of arousal in descending order (e.g. accuracies reported for the top 12 are the result of classifying the 12 highest rated arousal trials against all trials with brown noise). The confidence intervals are shown as error bars for each number of trials included.

Figure 5.4: Adjusted classification accuracy (the average of classification sensitivity and specificity), averaged across participants, versus the number of trials included for classification, after sorting all trials based on ratings of positive and negative valence in descending order (e.g. accuracies reported for the top 12 are the result of classifying the 12 most positively rated trials against the 12 most negatively rated trials). The confidence intervals are shown as error bars for each number of trials included.

According to Fig 5.2, which depicts the recording sites corresponding to features selected across all participants, the spatial distribution of the features resulting in the best overall accuracy was bilateral. This finding is consistent with the bilateral physiological substrates that are responsible for the perception of valence in the prefrontal cortex [33]. Nonetheless, in three out of ten participants, unilateral activation was most discriminatory, as laterality features were among those selected for solving the valence classification problem (see Table 5.3).

5.6.2 Diversity in the music database

Previous studies have reported regional brain activity modulation due to specific characteristics of music such as rhythm, timbre, and major/minor chords [118, 193, 163]. In these studies, the investigators varied selected music characteristics while carefully controlling for others. Other studies, focusing on emotion induction, have used diverse music databases (e.g., self-selected music pieces) to ensure successful elicitation of emotional reactions [15, 200]. In the current study, the second approach was used.

The variability of arousal and valence ratings for a given piece of music across participants (i.e., the same music excerpt rated differently among participants) suggests that the observed brain activity was indeed attributable to emotional experiences. Moreover, the variability in ratings among participants implies that the classification algorithm was not likely biased towards specific musical characteristics.

5.6.3 Challenges

Due to the limited number of samples, only two dimensions of emotion (valence and arousal) were considered. Although these measures are informative, they fail to capture more specific emotional labels. For example, fear and sadness can both be rated as negatively valenced and high in arousal. In order to differentiate more specifically among emotional labels, other dimensions of emotion such as occurrence (eruptive vs. gradually arising) and dominance (complete control vs. no control over the situation) need to be considered [240].

Special care was devoted to standardizing headgear placement across all four sessions, which in turn should have minimized instrumentation inconsistencies. However, differences in the shape of the skull may have led to variabilities in the brain regions monitored in different participants. Therefore, the present results preclude conclusions about the specific brain regions that were activated.

The human response to emotional stimuli may be affected by emotional sensitivity.

In fact, the authors of [169] have shown that individuals with high trait emotional intelligence respond faster and show more sensitivity in an emotion induction paradigm. Including a measure of emotional sensitivity in addition to the self-reported ratings might have helped to explain the inter-subject variability in classification accuracies.

Previous studies of emotion have indicated gender differences as an important factor in emotional response [132, 241]. However, the limited number of participants did not allow further investigation of gender-related differences in the emotional response. Future studies with larger sample sizes need to be devised to investigate the effects of gender on the emotion-induced prefrontal hemodynamic response.

Chapter 7 — see below — no.

Combining autonomic and central nervous system activity

6.1 Preamble

In this chapter, autonomic nervous system activity signals, namely electrodermal activity (EDA), blood volume pulse (BVP), and skin temperature, are used to solve the classification problems introduced in chapter 5 (i.e. high arousal vs. brown noise and most positive vs. most negative). In addition, new features using dynamic modeling and template matching are introduced for emotion identification.

The goal here is to compare the results achieved using ANS features with those obtained using exclusively PFC hemodynamic features (see chapter 5), and to combine classifiers trained using features derived from ANS and PFC hemodynamics to improve upon the accuracies obtained in chapter 5. Readers can skip sections 6.3.1, 6.3.2 and 6.3.6, since they reiterate the procedures described in chapter 2 and the classification steps introduced in section 5.4.4.


6.2 Introduction

Emotional response may engage various pathways in the central and autonomic nervous systems. In fact, some theories surrounding the neural basis of emotion have argued for the existence of an intricate interaction pattern between the central and autonomic nervous systems (ANS) during emotional response [222]. Autonomic nervous system activity has long been used in the field of physiologically-based emotion identification. Physiological emotion detection may provide a means of affective communication for adults and youth with severe disabilities who may not be able to use conventional means of emotional expression such as speech or facial gestures due to severe motor impairments. In particular, identifying affective state may help to mitigate care-giver stress and facilitate treatment decisions in a timely fashion [70].

Cardiovascular, respiratory, and particularly electrodermal activity (EDA) sensors can detect ANS activity modulations during emotional response [108, 129]. Many studies have used multiple indicators of ANS activity for identifying emotions [244, 66, 100]. For example, using EDA, temperature, BVP and ECG monitors, Kim et al. achieved 78% accuracy in differentiating anger, sadness and stress [102].

Emerging neural indicators of emotion are based on activity of the central nervous system (CNS), particularly brain areas which are found to be involved in emotional processing. Highly pleasurable music excerpts were shown to result in activation patterns in the amygdala, as well as the frontal and ventral prefrontal cortex [15], using magnetic resonance imaging (MRI). Another hemodynamic monitoring technology applied for emotion identification is near-infrared spectroscopy (NIRS), which measures oxygenated and deoxygenated hemoglobin concentrations ([HbO2] and [Hb], respectively) in cerebral blood flow [90, 228]. NIRS, which is a portable and relatively inexpensive optical imaging technology, is not suitable for monitoring deeper brain regions such as the amygdala, but is capable of monitoring the PFC, which is part of the emotional perception circuitry in the brain. In fact, NIRS studies of the PFC have identified correlates of emotion in regional hemodynamic activity [218, 144]. Hoshi et al. showed that emotional response to pleasant and unpleasant pictures resulted in regional increase and decrease of PFC

[HbO2], respectively [83]. Based on the existing physiological evidence, both autonomic and central nervous system activity may show modulation during emotional activity. Therefore, realizing a multi-modal emotion identification system which uses both CNS and ANS activity is a meaningful pursuit. Recent studies have explored concomitant use of signals from both ANS and CNS pathways for detecting emotions. For example, Kuncheva et al. showed that an ensemble of classifiers, each trained using electrocardiogram, electroencephalogram, EDA and pulse signals [120], could achieve accuracies up to 73% in differentiating positive from negative emotional states. In this light, the current chapter focuses on using features from ANS activity for differentiating the most intensely rated music excerpts from neutral brown noise, and the most positively rated music excerpts from the most negatively rated excerpts.

Furthermore, a mixture of experts was used for combining classifier decisions. These classifiers were trained using ANS-based features or NIRS-based features separately.

6.3 Methods

6.3.1 Procedures

10 able-bodied individuals (5 female, age: 25 ± 2.7 years) with no reported cardiovascular diseases, metabolic disorders, history of brain injury, respiratory conditions, drug- and alcohol-related conditions or psychiatric conditions were recruited for this study. Ethics approval was obtained from the ethics board at Holland Bloorview Kids Rehabilitation Hospital.

The experiments were conducted over four separate sessions, and encompassed a total of 144 trials, 48 of which included brown noise. Pilot studies indicated that this type of noise was subjectively more pleasant than white noise at the same sound pressure level. In each session, participants completed three blocks. Each block consisted of 12 consecutive trials:

Figure 6.1: Trial sequence

four trials with positively valenced songs (including one participant-selected song), four trials with negatively valenced songs (including one participant-selected song), and four

Brown noise trials. The music excerpts were randomly selected from a database composed of six music pieces self-selected by the participant and a common music database selected by researchers. The common music database included music pieces from different genres of music (classical, rock, jazz, and pop), with and without lyrics. The trials within each block were pseudo-randomized, such that Brown noise trials never occurred consecutively, and, positively and negatively valenced songs appeared in no apparent order. The same pseudo-random sequence of trials was used for all participants. Figure 6.1 illustrates a typical trial sequence.

6.3.2 NIRS data

A multi-channel NIRS monitoring system (Imagent Functional Brain Imaging System from ISS Inc., Champaign, IL) was used to record the hemodynamic response across nine different regions of the PFC. In this system, five optode pairs and three detectors were placed on the forehead as shown in Figure 6.2. In each optode pair, one source emitted light at 830 nm and the other at 690 nm. The signals were recorded at a 31.25 Hz sampling rate.

Figure 6.2: The layout of light sources (circles) and detectors (X's). The vertical line denotes the anatomical midline. The annotated shaded areas correspond to recording locations.

6.3.3 ANS data

Blood volume pulse (BVP), electrodermal activity (EDA) and temperature were recorded using a ProComp Infiniti multimodality encoder (Thought Technology, Montreal, QC,

Canada) at a 256 Hz sampling rate. EDA was recorded using two Ag-AgCl surface electrodes with 10 mm diameter which were attached to the index and middle finger phalanges of the non-dominant hand. The skin temperature was recorded using a thermal sensor secured on the fifth finger. The blood volume pulse was obtained using a photoplethysmograph sensor attached to the thumb. The recorded blood volume pulse was used to determine heart rate by finding the interbeat interval.

6.3.4 Analysis

The BVP signals were band-pass filtered (0.2-0.33 Hz) using a Daubechies-based continuous wavelet transform [2] to facilitate peak detection. The inverse of the peak-to-peak distances in time was used as an indicator of heart rate, and the peak values were used to determine pulse volume amplitude (PVA) [101].
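A minimal sketch of the heart-rate and pulse-volume-amplitude extraction is given below, assuming the BVP trace has already been band-pass filtered. The peak-picking routine and the 0.3 s minimum inter-peak distance are illustrative stand-ins for the wavelet-based detection used in the thesis.

```python
import numpy as np
from scipy.signal import find_peaks

FS_ANS = 256.0  # ProComp Infiniti sampling rate in Hz

def heart_rate_and_pva(bvp_filtered, fs=FS_ANS):
    """Locate pulse peaks in a filtered BVP trace and return the beat-to-beat
    heart rate (from the inter-beat intervals) and the pulse volume amplitude
    (peak values). The 0.3 s minimum peak distance is an assumed refractory
    constraint, not a value from the thesis."""
    peaks, props = find_peaks(bvp_filtered, distance=int(0.3 * fs), height=0)
    ibi = np.diff(peaks) / fs            # inter-beat intervals in seconds
    heart_rate = 60.0 / ibi              # beats per minute
    pva = props["peak_heights"]          # pulse volume amplitude per beat
    return heart_rate, pva

def hf_lf_energy_ratio(heart_rate, threshold_bpm=70.0):
    """Ratio of 'high' (above 70 bpm) to 'low' (below 70 bpm) heart-rate energy,
    approximated here as sums of squared values in each range."""
    hf = np.sum(heart_rate[heart_rate > threshold_bpm] ** 2)
    lf = np.sum(heart_rate[heart_rate <= threshold_bpm] ** 2)
    return hf / (lf + 1e-12)
```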

6.3.5 Feature extraction

PFC features

PFC hemodynamic-based features were extracted from [HbO2] and [Hb] at each of the 9 recording locations. These features included the mean, slope (determined using linear regression) and coefficient of variation (ratio of variance to the mean), all estimated over the music presentation period, and the change in the mean between the preceding noise and music presentation period. In addition to single-channel features, the ratio of the concentration signal slopes and the difference in the average signals were determined between left and right channels (i.e., 1L-1R, 2L-2R, 3L-3R and 4L-4R in Fig 2). Laterality features were introduced into the feature set, based on previous reports of lateralized response to emotional stimuli [235, 33, 4]. For more information regarding these features, the reader is referred to section 5.4.3.

Based on the findings presented in chapter 3, new features were derived by introducing a custom-made template. First, by repeated visual inspection of [HbO2] and [Hb] patterns in the highest arousal-rated trials, a template was designed. Figures 6.3.A and 6.3.B depict the designed template and a sample recording from participant 3 during which chills were reported, respectively. As shown in Figure 6.3, the template was designed to capture the sudden increase and the ensuing plateau in the concentration waveform. This custom-made template was treated akin to a mother wavelet, and the maximum coefficients across translation and scale were determined. The maximum coefficients determined using the template (maximum across scale and time) were used as features for classification. Hence, the template was empirically determined.

Figure 6.3: A. Custom-made template; B. Sample normalized [HbO2] recorded in a trial with chills.
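The template-matching feature can be sketched as a brute-force search over scale and translation, as below. The scale factors and the normalized-correlation score are illustrative choices standing in for the wavelet-style coefficient computation described above.

```python
import numpy as np

def template_match_feature(signal, template, scales=(0.5, 1.0, 1.5, 2.0)):
    """Slide scaled (stretched/compressed) copies of the custom template across
    the signal and return the maximum matching coefficient over all scales and
    translations. Scale factors and the normalized-correlation score are
    illustrative, not values taken from the thesis."""
    best = -np.inf
    for s in scales:
        n = max(2, int(round(template.size * s)))
        # resample the template to the scaled length
        scaled = np.interp(np.linspace(0, 1, n),
                           np.linspace(0, 1, template.size), template)
        scaled = (scaled - scaled.mean()) / (scaled.std() + 1e-12)
        for start in range(signal.size - n + 1):
            win = signal[start:start + n]
            win = (win - win.mean()) / (win.std() + 1e-12)
            # correlation-based stand-in for the wavelet-style coefficient
            best = max(best, float(np.dot(win, scaled)) / n)
    return best
```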

ANS features

The features representing autonomic nervous system signals included the mean, the range and the difference in the mean values during the aural stimulus period and the preceding noise period (see the trial sequence in Figure 2.2) for temperature recordings and EDA.

The number and magnitude of electrodermal responses (EDR) were also added to the feature set. EDRs were detected by differentiating the EDA recordings, convolving the resulting waveform with a Bartlett window, and finding the two consecutive zero-crossings (positive to negative and negative to positive) [102]. The maximum value between these two zero-crossings was recorded as the magnitude of the EDR [102]. The average heart rate and PVA signals within the 45 s period of exposure to music were also included as features to represent cardiovascular response. In addition, the ratio of high frequency (heart rates above 70 bpm) to low frequency (heart rates below 70 bpm) energy content was also included in the feature set. These ANS features were selected based on previous findings in studies of emotion involving ANS activity [108, 184].
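A sketch of the EDR detection step described above is given below; the Bartlett window length and the choice of taking the maximum of the smoothed derivative between the paired zero-crossings are assumptions made for illustration.

```python
import numpy as np

def detect_edrs(eda, win_len=51):
    """Detect electrodermal responses (EDRs): differentiate the EDA trace,
    smooth the derivative with a Bartlett window, then pair consecutive
    negative-to-positive and positive-to-negative zero crossings. Returns the
    EDR count and the maximum smoothed-derivative value inside each pair; the
    window length and the waveform from which the maximum is taken are
    assumptions of this sketch."""
    d = np.diff(eda)
    w = np.bartlett(win_len)
    smooth = np.convolve(d, w / w.sum(), mode="same")
    sign = np.sign(smooth)
    rising = np.where((sign[:-1] <= 0) & (sign[1:] > 0))[0]    # onset crossings
    falling = np.where((sign[:-1] > 0) & (sign[1:] <= 0))[0]   # closing crossings
    magnitudes = []
    for on in rising:
        after = falling[falling > on]
        if after.size:
            magnitudes.append(smooth[on:after[0] + 1].max())
    return len(magnitudes), magnitudes
```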

Dynamic model-based features

System identification has previously been used for modeling interactions among various physiological signals [197, 175]. For example, Saul et al. used system identification to understand the relationship between respiratory signals and heart rate in their characterization of autonomic regulation of heart rate [198]. The differences observed between models fit to neutral and chilling trials, which will be reported in section 6.4.1, suggested that dynamic model-based features may be useful in differentiating emotions.

To capture the relationship between EDA and [HbO2], an autoregressive model with exogenous input (arx) was applied [130]. This arx model described a system based on immediate past input (x) and output (y) values as shown in (6.1).

y(t) = b_1 x(t−1) + ... + b_{n_b} x(t−n_b) − a_1 y(t−1) − ... − a_{n_a} y(t−n_a) + ε    (6.1)

In (6.1), n_b and n_a are the model orders, and a_i and b_i are the model coefficients. In the arx models estimated, [HbO2]/[Hb] was used as the input (x) to the system and EDA was set as the output (y), and vice versa. The model order was selected according to the Akaike Information Criterion (AIC) [130].
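A least-squares sketch of fitting the arx model of (6.1) with fixed orders is shown below; the thesis relied on a system-identification toolbox with AIC-based order selection [130], so the fitting routine and its interface here are illustrative only.

```python
import numpy as np

def fit_arx(y, x, na, nb):
    """Least-squares fit of the ARX model of Eq. (6.1):
    y(t) = b1 x(t-1) + ... + b_nb x(t-nb) - a1 y(t-1) - ... - a_na y(t-na) + e.
    Returns the coefficient vectors (a, b); a sketch with fixed orders."""
    start = max(na, nb)
    rows, targets = [], []
    for t in range(start, len(y)):
        past_y = [-y[t - i] for i in range(1, na + 1)]   # note the minus sign in (6.1)
        past_x = [x[t - i] for i in range(1, nb + 1)]
        rows.append(past_x + past_y)
        targets.append(y[t])
    theta, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets), rcond=None)
    b, a = theta[:nb], theta[nb:]
    return a, b
```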

The EDA signals (originally collected at 256 Hz) were down-sampled by a scale of 7 (resulting in a sampling rate of 4.57 Hz), and the [HbO2] signals (originally collected at 31.25 Hz) were re-sampled at 4.57 Hz to match the EDA sampling rate. To match the bandwidth of the EDA signals to that of the NIRS recordings, both types of signals were low-pass filtered using the same type II third order Chebyshev filter (i.e. 0.1 Hz cut-off). The concentration time series and EDA signals within each trial were normalized to have zero mean and were scaled down by the maximum absolute signal value during the trial. This normalization resulted in signal magnitudes ranging from −1 to 1.

An autoregressive (AR) model was used to represent EDA, [HbO2], and [Hb] dynamics. Unlike arx, an AR model describes the signal with respect to the immediate past of the same signal, as shown in (6.2).

y(t) = a_1 y(t−1) + ... + a_{n_a} y(t−n_a) + ε    (6.2)

To illustrate the potential merits of dynamic-based feature extraction in emotion identification, arx models ([HbO2] averaged across the nine recording regions was used as the input) were fit to trials with the highest arousal ratings (i.e. chills) and to those with brown noise rated as neutral (i.e. neutral). These models were compared (chills vs. neutral) in the frequency domain to explore the usefulness of dynamic-based feature extraction in differentiating arousal. Chills were selected for comparison as they are well-defined emotional events. The model orders identified using AIC were recorded for chills and neutral trials, and the AIC order selected for the majority of trials was considered as the generalized model order (GMO).

Table 6.1: Features resulting from arx dynamic modeling (very low frequency band (VLF) = 0-0.025 Hz, low frequency band (LF) = 0-0.075 Hz, high frequency band (HF) = 0.075-0.1 Hz)

Feature types                  Features
Model-based features           Model coefficients
Frequency response features    Energy_total, Energy_VLF, Energy_LF, Energy_HF,
                               Energy_HF/Energy_LF, Energy_HF/Energy_VLF

To identify features, each trial (n = 144) was first modeled using arx under two conditions: (a) with [HbO2]/[Hb] as the input and (b) with EDA as the input (the GMO for chills was used for modeling in both (a) and (b)). The model coefficients (i.e. a_i, b_i) were included as direct model-based features. Other features were based on the frequency response of the estimated dynamic model. The energy of the frequency response within three frequency bands, namely the very low frequency band (VLF = 0-0.025 Hz), the low frequency band (LF = 0-0.075 Hz) and the high frequency band (HF = 0.075-0.1 Hz), was used for extracting frequency response-based features. Ninety percent of the spectral peaks of the models' transfer functions occurred within these frequency bands. Table 6.1 summarizes these features. Using a similar procedure, the GMO was used for identifying AR model coefficients, which were also included as features.
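The frequency-response features of Table 6.1 can be sketched from a fitted arx model as band energies of the transfer function B(f)/A(f), as below; summing the squared magnitude over frequency bins, and the number of evaluation points, are assumptions of this sketch.

```python
import numpy as np
from scipy.signal import freqz

FS_MODEL = 4.57  # common sampling rate after resampling EDA and [HbO2]/[Hb]

def band_energies(a, b, fs=FS_MODEL):
    """Energy of |B(f)/A(f)|^2 in the VLF (0-0.025 Hz), LF (0-0.075 Hz) and
    HF (0.075-0.1 Hz) bands listed in Table 6.1, plus the total energy up to
    0.1 Hz, for an ARX fit of Eq. (6.1)."""
    a_poly = np.concatenate(([1.0], a))   # A(q) = 1 + a1 q^-1 + ...
    b_poly = np.concatenate(([0.0], b))   # B(q) = b1 q^-1 + ...
    freqs, h = freqz(b_poly, a_poly, worN=2048, fs=fs)
    power = np.abs(h) ** 2

    def band(lo, hi):
        return power[(freqs >= lo) & (freqs < hi)].sum()

    return {"E_total": power[freqs <= 0.1].sum(),
            "E_VLF": band(0.0, 0.025),
            "E_LF": band(0.0, 0.075),
            "E_HF": band(0.075, 0.1)}
```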

6.3.6 Classification

In order to compare the classification results obtained with these newly proposed features against those attained using only NIRS signals, procedures identical to those of chapter 5 were used for labeling the data and for classification. The trials with Brown noise (BN) were separated, and the rest of the data were partitioned according to arousal and valence ratings. For the analysis of arousal, the 48 highest rated trials over all four sessions were selected. For the valence component, the 24 highest positively-rated and 24 highest negatively-rated trials across all four sessions were selected. The high arousal (HA), positive valence (PV), negative valence (NV), and Brown noise (BN) trials were labeled accordingly. Arousal and valence labeling were performed independently.

A classifier based on linear discriminant analysis [40] was used to solve two different two-class problems (HA vs. BN and PV vs. NV). The classification accuracy was estimated using the average of 50 independent iterations of 10-fold cross-validation. Feature selection was performed to identify the feature subset that best separated the two classes for each participant. To measure separability, the Fisher score [40] was used, which is defined as the ratio of the difference between the means of the features extracted from each class under investigation to the sum of the variances of the features from each class. The Fisher score for each feature was calculated and the two features with the highest scores were selected for classification. Classification accuracy was recorded as the correct classification rate.

6.3.7 Mixture of experts

Six linear discriminant analysis-based classifiers [40] were separately trained, each using exclusively features from one of the feature types shown in Table 6.2, namely time domain PFC features, ANS features, template-based features, and system dynamic-based features. These classifier experts were used to decide the labels of trials set aside for testing. The features were randomly segregated into training A and testing sets (shown in Figure 6.4) using a 10-fold cross-validation algorithm. Testing data were set aside for the final testing of the classifier ensemble.

Figure 6.4: Feature segmentation. All features are partitioned into training A and testing sets over 10 folds; training A is further partitioned into training B and validation sets over 10 folds.

To determine the class label (i.e. w_j, j = 1, 2) for the sample x in the testing set, the classifier decisions were combined using the classifier confidence and the support for x belonging to each class, using a classifier combination algorithm introduced in [121] (see Figure 6.5). Classifier support was determined using the discriminant function for class w_j (i.e. g_j(x)) and transformed through a logistic link function g'_j(x), which resulted in a value ranging from 0 to 1. Larger support values indicated a higher likelihood for class w_j. For a two-class problem, the discriminant and logistic link functions are determined as shown in (6.3) and (6.4), respectively.

p(w_j|x) = exp(g_j(x)) / (exp(g_1(x)) + exp(g_2(x))),  j = 1, 2    (6.3)

g'_1(x) = 1 / (1 + exp(−g_1(x))),  g'_2(x) = 1 − g'_1(x)    (6.4)

The classifier support values were arranged in the form of a decision profile (DP(x)).

The DP(x) was composed of elements d_{c,j}(x), which represented the support that classifier D_c (c = 1, 2, ..., 6) had for class w_j, given the vector x from the testing set.

Classifier competence (G_c), which represented the classifier's ability to identify class labels, was determined based on the training A data. The training A data was partitioned into validation and training B sets in 20 iterations of a 10-fold cross-validation. The correct classification rates in each iteration and fold were averaged to estimate the classifier competence, G_c. This step resulted in 6 values, one for each feature set, namely, time domain PFC features, ANS features, template-based features, arx features (input: EDA), arx features (input: [HbO2]/[Hb]), and AR features. To determine G_c (c = 1, 2, ..., 6), in each iteration and fold, 1 feature was selected using Fisher scores applied to the primary training set (see Section 6.3.6).

Figure 6.5: A simplified diagram depicting the fusion of classifier decisions. Each of the six feature sets (time-domain NIRS, ANS, template-based, arx with EDA input, arx with [Hb]/[HbO2] input, and AR) passes through feature selection and its own classifier (classifiers 1-6), and the classifier decisions are fused to produce the output class.

The parameter λ was estimated by finding the real root greater than −1 of the

polynomial shown in (6.5).

1 + λ = ∏_{i=1}^{6} (1 + λ G_i),  λ ≠ 0    (6.5)

For a given x, the DP vector corresponding to x for each class was sorted from the highest support to the lowest (i.e. [d_{1,j}(x), d_{2,j}(x), ..., d_{6,j}(x)] → [d_{c_1,j}(x), d_{c_2,j}(x), ..., d_{c_6,j}(x)]).

Table 6.2: Features used for training the classifier experts

Feature types               Features
Time domain PFC features    Multi-channel time domain features
                            Laterality features
ANS features                EDA features
                            Skin temperature features
                            BVP features
Wavelet-based features      Maximum wavelet coefficients across time and scale
Dynamic-based features      arx features (input: EDA, output: [HbO2]/[Hb])
                            arx features (input: [HbO2]/[Hb], output: EDA)
                            AR features (EDA, [HbO2], and [Hb])

The classifier competence values were sorted accordingly (i.e. G_{c_1}, G_{c_2}, ..., G_{c_6}). The measure Q(k) was calculated recursively as shown in (6.6).

Q(k) = G_{c_k} + Q(k−1) + λ G_{c_k} Q(k−1),  Q(1) = G_{c_1},  k = 2, ..., 6    (6.6)

The final degree of support for class w_m was determined using a Sugeno integral, shown in (6.7), and the class with the highest µ_m(x) value was noted as the ensemble decision for sample x from the testing set [121].

µ_m(x) = max_{k=1,...,6} { min { d_{c_k,m}(x), Q(k) } },  m = 1, 2    (6.7)
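A compact sketch of the fusion rule of (6.3)-(6.7) is given below for one test sample: each expert's signed discriminant value is mapped to a support through the logistic link, the supports for each class are sorted, λ is obtained as the real root of (6.5) greater than −1, and the Sugeno integral of (6.7) selects the winning class. The array layout and the root-selection details are illustrative assumptions.

```python
import numpy as np

def support_from_discriminant(g1):
    """Logistic link of Eq. (6.4): supports for classes 1 and 2 from the signed
    LDA discriminant value g1(x)."""
    s1 = 1.0 / (1.0 + np.exp(-g1))
    return np.array([s1, 1.0 - s1])

def fuzzy_lambda(G):
    """Solve 1 + lambda = prod(1 + lambda*G_i) (Eq. 6.5) for the real root
    greater than -1, excluding 0."""
    poly = np.array([1.0])                       # product of (G_i*lambda + 1), highest power first
    for g in G:
        poly = np.polymul(poly, np.array([g, 1.0]))
    poly[-2] -= 1.0                              # subtract lambda
    poly[-1] -= 1.0                              # subtract 1
    roots = np.roots(poly)
    real = roots[np.isclose(roots.imag, 0)].real
    candidates = real[(real > -1.0) & (~np.isclose(real, 0.0))]
    return candidates[0] if candidates.size else 0.0

def fuse(decision_profile, G):
    """Fuse the six experts for one test sample via Eqs. (6.6)-(6.7).
    decision_profile: (6, 2) array of d_{c,j}(x); G: six competence values.
    Returns the winning class index."""
    G = np.asarray(G, dtype=float)
    lam = fuzzy_lambda(G)
    mu = []
    for m in range(decision_profile.shape[1]):
        order = np.argsort(decision_profile[:, m])[::-1]   # highest support first
        d_sorted = decision_profile[order, m]
        g_sorted = G[order]
        Q = g_sorted[0]                                    # Q(1) = G_{c_1}
        best = min(d_sorted[0], Q)
        for k in range(1, len(order)):
            Q = g_sorted[k] + Q + lam * g_sorted[k] * Q    # Eq. (6.6)
            best = max(best, min(d_sorted[k], Q))          # Eq. (6.7)
        mu.append(best)
    return int(np.argmax(mu))
```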

6.4 Results

ANS data from 61 trials out of the 1440 trials across all participants were lost due to technical issues, and were therefore not included in the analysis. [HbO2] and [Hb] signals corresponding to trials for which ANS signals were lost were also excluded from the analysis.

Figure 6.6: Sample trial with chills (participant 2): EDA recording and estimation, using the average [HbO2] concentrations as the input to the arx model. The fit achieved by the model for the depicted estimation is 52.9%.

6.4.1 Dynamic model-based features

The estimated arx model achieved a fit value exceeding 50% in 70% of trials in the chills category and 77% of trials in the neutral cases. These results exemplify the ability of the arx model to capture the interaction dynamics between EDA and [Hb]/[HbO2]. Figure

6.6 depicts a sample [HbO2] trace and the corresponding EDA recording and estimation for a trial with chills for which the fit value was 52.9%. Figure 6.7 shows the normalized frequency responses (magnitude and phase) for models with chills and neutral trials for participant 4. The trials with chills manifested two distinct peaks. The two peaks, shown in Figure 6.7.A, were observed in all other participants. However, the location of these peaks varied across participants. Neutral-rated trials manifested a low-pass filter property with a single peak, similar to that shown in Figure 6.7.B.

6.4.2 Classification results

Table 6.3 summarizes the ANS-based classification results for HA vs. BN and PV vs.

NV. Clearly, the ANS-based results varied across participants.

The mixture of experts classification rate is presented in Table 6.4. Tables 6.5 and 6.6 summarize the dynamic model-based results for the two classification problems, namely PV vs. NV and HA vs. BN.

Figure 6.7: Sample scaled frequency response estimated for (A) chilling and (B) neutral trials for participant 4. The magnitude of the frequency response was normalized by dividing the results by the total power of the signal over the entire frequency range.

Table 6.3: Classification accuracy in % determined using ANS features for solving the HA vs. BN and PV vs. NV classification problems

Participant   HA vs. BN %    PV vs. NV %
1             59.8 ± 1.5     64.5 ± 1.7
2             53.0 ± 1.7     59.8 ± 2.5
3             77.3 ± 1.1     59.8 ± 2.8
4             51.5 ± 1.4     62.7 ± 2.5
5             55.6 ± 1.5     55.5 ± 2.0
6             58.7 ± 1.6     55.9 ± 2.0
7             55.4 ± 1.2     46.4 ± 2.2
8             58.1 ± 1.5     54.2 ± 2.3
9             54.3 ± 1.3     53.2 ± 2.3
10            69.8 ± 1.4     60.1 ± 2.4

6.5 Discussion

Many physiologically-based emotion identification efforts have included electromyogram sensors to capture muscle activity due to emotions (e.g. muscle contractions resulting from facial expression) [244, 66, 100]. For example, Picard et al. achieved an accuracy of 81% in differentiating eight emotional states (neutral, anger, joy, grief, hate, romantic love, reverence, platonic love) using features from facial electromyography, BVP, EDA and respiration [172].

Table 6.4: Classification accuracy in % determined using the mixture of experts for solving the HA vs. BN and PV vs. NV classification problems

Participant   HA vs. BN %    PV vs. NV %
1             83.8 ± 0.8     85.1 ± 0.9
2             75.9 ± 1.0     58.4 ± 2.6
3             91.9 ± 0.5     57.5 ± 2.2
4             58.6 ± 1.7     49.2 ± 2.3
5             60.9 ± 1.4     64.5 ± 1.7
6             59.7 ± 1.5     55.7 ± 1.7
7             51.8 ± 1.4     42.8 ± 2.3
8             71.5 ± 1.2     71.4 ± 1.8
9             60.7 ± 1.6     58.8 ± 2.2
10            69.55 ± 1.2    58.9 ± 2.4

Table 6.5: Classification accuracy in % for each participant when classifying HA vs. BN, using dynamic-based features (AR; arx (a), input: EDA; arx (b), input: [HbO2]/[Hb]) and template-based features

Participant   AR %          arx (a) %     arx (b) %     Template-based %
1             56.0 ± 1.6    58.1 ± 1.6    44.4 ± 1.2    56.5 ± 1.6
2             47.7 ± 1.0    65.6 ± 1.4    48.0 ± 1.6    64.5 ± 1.6
3             81.7 ± 1.1    65.1 ± 1.1    61.1 ± 1.4    85.6 ± 0.8
4             63.9 ± 1.1    45.2 ± 1.2    49.1 ± 1.3    54.2 ± 1.1
5             56.2 ± 2.1    56.5 ± 1.6    57.1 ± 1.3    59.0 ± 1.4
6             63.8 ± 1.1    55.8 ± 1.4    59.0 ± 1.0    42.0 ± 1.1
7             46.0 ± 1.7    47.7 ± 1.6    51.6 ± 2.0    48.2 ± 1.4
8             43.1 ± 1.3    54.7 ± 1.9    45.1 ± 1.3    61.6 ± 1.5
9             66.9 ± 1.5    62.4 ± 1.5    44.7 ± 1.4    61.4 ± 1.1
10            59.8 ± 1.2    55.2 ± 1.7    54.9 ± 1.7    63.8 ± 1.4

Other studies have exclusively focused on ANS activity sensors. In a study involving the electrocardiogram (ECG), Agrafioti et al. [1] achieved accuracies up to 89% in differentiating valence, and reported between-subject variability in classification results. Using EDA, temperature, BVP and ECG monitors, Kim et al. achieved 78% accuracy in differentiating anger, sadness and stress [102], and also indicated differences in correct classification rates among participants. These findings confirm differences in correct identification rates across individuals, as was also observed in the current investigation.

Autonomic nervous system activity patterns may vary across individuals. For example, the EDA response magnitude due to sympathetic arousal may be suppressed in some individuals [223]. This phenomenon may explain the variability in the ANS-based emotion identification results in Table 6.3.

Table 6.6: Classification accuracy in % for each participant when classifying PV vs. NV, using dynamic-based features (AR; arx (a), input: EDA; arx (b), input: [HbO2]/[Hb]) and template-based features

Participant   AR %          arx (a) %     arx (b) %     Template-based %
1             46.3 ± 2.4    52.3 ± 1.6    59.9 ± 1.7    59.0 ± 2.5
2             46.4 ± 1.9    49.4 ± 1.8    54.1 ± 1.8    59.3 ± 1.8
3             51.9 ± 1.9    47.3 ± 2.4    69.4 ± 1.6    56.2 ± 2.1
4             44.1 ± 2.2    54.0 ± 2.2    39.4 ± 1.4    46.7 ± 2.4
5             64.3 ± 2.6    52.1 ± 1.8    53.2 ± 2.2    57.8 ± 2.1
6             47.3 ± 2.2    55.2 ± 2.1    51.7 ± 2.3    45.6 ± 1.5
7             46.5 ± 2.4    46.4 ± 2.5    47.0 ± 1.6    37.9 ± 1.6
8             53.3 ± 1.9    59.7 ± 2.1    55.4 ± 2.1    62.0 ± 2.3
9             48.0 ± 2.2    49.7 ± 2.6    40.4 ± 2.4    61.1 ± 2.5
10            43.2 ± 1.9    59.2 ± 1.6    57.8 ± 2.3    47.8 ± 1.9

Various feature types, such as ANS-based, dynamic model-based or template-based features, may not be equally useful for identifying emotions. For example, for a particular participant, dynamic model features may result in low accuracies in HA vs. BN differentiation while ANS features lead to higher identification accuracies (see the results for participant 10), but for another individual the opposite may be true (see the results for participant 3). The multi-modal mixture of experts used in this study automatically accounted for this variability by estimating the classifier competence, which ultimately assessed the usefulness of the feature set. Combining classifier decisions maintained or improved the HA vs. BN classification results in only three participants (i.e. participants 3, 6 and 8) when compared to results obtained exclusively using NIRS features (see Table 5.2). The

PV vs. NV correct classification results were generally lower (with the exception of participant 1) when comparing Tables 5.3 and 6.4. Previous studies have indicated differences between arousal and valence detection accuracies. For example, using skin conductivity, blood volume pressure, respiration and an electromyogram, Healey et al. found that valence differentiation was less accurate than arousal differentiation [72].

Compared to the results obtained in chapter 5 (i.e. Tables 5.2 and 5.3), the accuracies obtained using PFC hemodynamic-based features were generally higher than those of the combination of classifiers based on PFC hemodynamic and ANS features. The average results from the classifier combination shown in Table 6.4 were 66% for HA vs. BN and 60% for PV vs. NV, which is lower than the results achieved using NIRS features exclusively (see chapter 5). However, including autonomic nervous system activity features may help improve emotion identification in a subset of individuals. Future studies involving larger sample sizes, which are more likely to represent various ANS response phenotypes, may help identify whether the multi-modal approach exceeds the performance achieved using PFC NIRS features alone.

6.6 Conclusion

In this chapter, a multi-modal ensemble of classifiers was used to differentiate the highest arousal-rated trials from brown noise (HA vs. BN), and the most positively rated trials from the most negatively rated trials (PV vs. NV). Each classifier in the ensemble was trained exclusively using features from ANS signals or PFC hemodynamics. Novel dynamic-based features were introduced and demonstrated potential in arousal differentiation.

The classification results varied across participants. In particular, the classifier ensemble was capable of maintaining or improving upon the results achieved using only PFC hemodynamics in 3 participants for the HA vs. BN problem. However, the valence differentiation rates were lower than those achieved with PFC hemodynamics alone.

Chapter 7

Concluding remarks

7.1 Summary of contributions

This thesis made several contributions to the field of rehabilitation engineering, specifically in the area of affective brain computer interfaces. In summary, the results of this thesis illustrated the feasibility of emotion identification using prefrontal cortex (PFC) near infrared spectroscopy (NIRS) in response to a dynamic emotion induction method

(i.e. music). The specific contributions are listed in this chapter.

7.1.1 A literature appraisal of the existing evidence for the use

of BCI for individuals with disabilities [143]

The existing evidence for the use of brain computer interfaces (BCIs) involving individuals with disabilities was critically appraised. This literature review resulted in the identification of current challenges surrounding BCI use for individuals with severe disabilities. In addition, important recommendations for future studies were made, including consideration of user state and involvement of the pediatric population. These recommendations may benefit future BCI research efforts in realizing more user-accommodating systems suitable for the target population (i.e. individuals with severe disabilities).


7.1.2 PFC [Hb] and [HbO2] patterns characterization using wavelet analysis with respect to emotional arousal and valence

[142]

Regional PFC [Hb] and [HbO2] activity was characterized using wavelet peak detection. This algorithm allowed identification of hemodynamic characteristics with respect to the arousal and valence dimensions of emotions. In addition to hemodynamic response magnitude, the wavelet peak detection method allowed investigation of the speed of hemodynamic response (i.e. using the scale at maximum wavelet coefficient). Intense negative emotional ratings were found to be generally related to heightened changes in [HbO2] . These findings warranted further investigation of PFC NIRS for emotion identification, particularly when using dynamic emotion induction methods such as music.

7.1.3 Identified emotional arousal and valence in response to

dynamic emotion induction using PFC NIRS [144]

Using time domain and laterality features extracted from the PFC NIRS, the highest

arousal-rated trials were differentiated from trials with brown noise (HA vs. BN) with an average accuracy of 71%. Similarly, in differentiating the most positively rated trials from the most negatively rated trials (PV vs. NV), an average accuracy of 71% was achieved. The 10-fold cross-validation used for classifier training and testing simulated single-trial identification of arousal and valence and provided further evidence for the use of PFC NIRS as a means of emotion identification.

7.1.4 Introduced features based on dynamic modeling for emo-

tion identification

Using dynamic modeling, additional features were introduced for solving the HA vs.

BN and PV vs. NV classification problems. Dynamic modeling was used for capturing

PFC NIRS and EDA signal dynamics. In addition, the interaction dynamics between

[Hb]/[HbO2] and EDA were captured using an arx model. Unlike previous emotion identification efforts which exclusively used autonomic or central nervous system signals for identifying emotions, the arx model captured the interaction between PFC hemodynamics and EDA. Despite variability across participants, using features extracted using arx models, accuracies up to 81% were achieved in differentiating arousal.

7.1.5 Multi-modal emotion identification using a mixture of

classifier experts

A multi-modal mixture of experts, with each expert trained exclusively using ANS or PFC hemodynamic features, was implemented for emotion identification. The classifier combination automatically accounted for the variability across participants by estimating classifier competence and taking classifier confidence into account. The mixture of experts was capable of improving HA vs. BN identification in three participants.

7.2 Recommendations for future studies

7.2.1 Assessing PFC hemodynamics for emotion identification in the pediatric population and individuals with severe disabilities

The results of the current thesis have established grounds for future emotion identification efforts involving individuals with severe disabilities. In particular, the pediatric population with severe motor disabilities, who may not be able to use other BCI systems due to developmental delays, limited expressive communication and unknown levels of receptive communication, may benefit from NIRS-based emotion identification systems.

Emotional response may be an intuitive and more natural means of communication for children with severe disabilities.

Future studies involving typically developing children and children with severe disabilities should consider frontal cortex development in the target age group [36]. The current results were achieved based on data from adults for whom the prefrontal cortex is fully developed. The next step would be to test the proposed system with typically developing children to investigate the feasibility of arousal and valence identification in different prefrontal cortex developmental stages. This step will inform studies involving children with disabilities for whom the prefrontal cortex is unlikely to be affected by the clinical condition. Ultimately, the system may be tested with children with conditions which may affect the prefrontal cortex.

Specific changes to the current study design may be necessary for testing the system with the pediatric population. For example, the music excerpts used may need to be adjusted for children (e.g. by using more simplified musical structures). Previous studies involving music-induced emotions in children may be useful in identifying music excerpts suited for inducing emotions in children (e.g. [81, 82]). Including self-selected excerpts may not be feasible due to the difficulty of identifying children's personal preferences with regards to music. Simplified emotion rating paradigms should be considered to facilitate emotional ratings by children (e.g. using facial gestures as rating items [81]).

7.2.2 Potential clinical implications

The system proposed in this thesis may serve as a passive brain computer interface for detecting emotional state in nonverbal individuals with severe motor disabilities.

Knowledge of the emotional state may facilitate clinical decisions. For example, by assessing the emotional response to various interventions, the care-givers and clinicians may devise improved care strategies.

Physiologically-based emotion identification has been effective in situational interpretations within clinical settings. In a study involving ten children with disabilities, autonomic nervous system activity was monitored during interaction with therapeutic clowns compared to television exposure in the complex continuing care unit at Holland Bloorview

Kids Rehabilitation Hospital [105]. The results indicated a significant difference between therapeutic clown intervention and exposure to television [105]. Similarly, the results of the current thesis may lead to augmented awareness regarding the patient state by providing a means for ongoing bed-side monitoring of emotional state using prefrontal cortex activity.

Magnetic resonance imaging technology offers improved spatial resolution compared to NIRS and allows monitoring of deeper brain regions not accessible by NIRS. However, the use of magnetic resonance imaging technology may trigger anxiety and discomfort to the extent that sedation may be required [150]. Unlike magnetic resonance imaging, NIRS is suitable for long-term bedside monitoring, particularly in children and infants.

Therefore, future studies of emotion using NIRS of the prefrontal cortex may shed light on emotional understanding in children of various age groups.

7.2.3 Dynamic emotional rating paradigms

Emotions may appear as transient phenomena during the emotion induction period.

For example, emotions during initial presentation of a musical piece may be different from those manifested as the music unfolds. Therefore, the next step for studies involving dynamic emotion induction (e.g. music or videos) and the brain is to consider implementing experimental paradigms that support dynamic emotional rating. Dynamic emotional ratings will enable the study of the temporal dynamics of emotion using PFC hemodynamics. Ultimately, investigating the temporal dynamics of emotions with respect to PFC hemodynamics will facilitate emotion decoding in real-life settings where emotions can be manifested at any point in time.

7.2.4 Emotional sensitivity measures

Due to differences in emotional sensitivity among individuals, future studies should consider emotional intelligence assessments prior to each recording session. Petrides et al.

[169] have shown that individuals with high trait emotional intelligence respond faster and show more sensitivity in an emotion induction paradigm. These individual differences may explain the variability in the emotion identification success rate across participants.

In this way, including a measure of emotional sensitivity in addition to the self-reported ratings may be useful for future investigations involving physiologically-based emotion identification.

7.2.5 Individual specific analysis

The subject-specific feature selection algorithm has been used for demonstrating the feasibility of the proposed affective BCI. Previous emotion identification efforts have used similar approaches due to individual differences in the physiological response to emotions

[172]. However, for this system to be used by individuals with severe motor disabilities, the most informative features need to be identified. Due to the large variability in the physiological response to emotions, including large participant cohorts is necessary to capture various response phenotypes before global features can be identified and used in studies involving individuals with severe disabilities. Given the limited sample size, introducing global features was not feasible in the current study. Future studies with larger sample sizes may help identify features that can robustly identify emotions across individuals. Another approach may be to implement adaptive feature selection, where the feature set can be optimized based on individual response phenotypes.

7.2.6 Inclusion of larger sample sizes

The results in this thesis were reported in a sample of 10 able-bodied adults. Due to the extent of variability observed in identification accuracy across individuals, the sample size may be a limitation in generalizing the results to larger populations. To account for the individual differences, individual feature selection and classification algorithms were used. In addition, a mixed model was used for statistical analysis to account for the limited sample size. However, future investigations should consider larger sample sizes to account for different physiological phenotypes that may exist among individuals.
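For illustration, a random-intercept linear mixed model of a hemodynamic feature against arousal, with participant as the grouping factor, could be specified as follows; the column names and toy data are hypothetical and the thesis analysis may have differed in detail.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy data: 4 participants, 20 trials each, alternating low/high arousal trials
rng = np.random.default_rng(0)
participants = np.repeat(["p01", "p02", "p03", "p04"], 20)
arousal = np.tile([0, 1], 40)
hbo2_peak = (0.1 * arousal
             + rng.normal(0, 0.05, size=80)
             + np.repeat(rng.normal(0, 0.05, size=4), 20))   # participant-specific offsets
df = pd.DataFrame({"hbo2_peak": hbo2_peak, "arousal": arousal, "participant": participants})

# Random intercept per participant; 'arousal' is the fixed effect of interest
model = smf.mixedlm("hbo2_peak ~ arousal", df, groups=df["participant"])
print(model.fit().summary())
```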

Appendix A: Open Challenges Regarding Control Mechanisms

Studies involving individuals with disabilities have demonstrated various EEG control mechanisms. Each control mechanism has challenges and merits with respect to habituation, required training period, response rate, fatigue, and cognitive awareness.

Exploring subject-specific control, performance predictors, alternative control mechanisms, and self-paced BCI designs can help ameliorate current BCI technologies.

• Habituation and Response Rate

P300-based BCI may be affected by habituation. In particular, there are reports of P300 peak magnitude and latency changes with repeated exposure to stimuli

[122]. Alternatively, SMR and SCP-based BCIs are not reported to be affected

by habituation. Based on the bit rates achieved in the reviewed articles, SSVEP-

based systems provide the fastest information transfer rate among the four control

mechanisms.

• Training and Fatigue

BCI systems based on evoked responses (P300 and SSVEP) require very little train-

ing for the participants as these responses are naturally occurring. In contrast, it

generally takes several training sessions for a user to learn to modulate spontaneous

EEG patterns. Despite the benefits of SSVEP-based systems with respect to training and transfer rate, the low-frequency flickering stimuli used by these systems are

fatiguing on the eyes and may induce photo-epileptic seizures in the photo-sensitive

population [49].

• Subject-specific EEG-based Control

Studies with able-bodied individuals have indicated that the ability to generate

various EEG patterns is user dependent. For example, of the 81 participants evaluating a P300-based speller, nearly 3% did not produce any correct characters [64].

Similarly, only 19% of 99 participants using an SMR-based BCI achieved accuracies above 80% [64]. These results suggest that some users may not be able to generate

EEG patterns to control a particular type of BCI. This issue, however, has not

been investigated in individuals with disabilities.

• Lack of Predictive Indicators of Performance

Due to the large amount of financial and time-related resources often required to

conduct studies involving the target population, it would be beneficial to develop predictive indicators of success with given control mechanisms. One such predictive

measure is initial performance with a control mechanism. Using an SCP-based BCI,

Neumann et al. found that initial performance was related to performance in later

attempts in five patients with ALS [154]. In a later study, Kübler et al. [116] found that initial performance was moderately correlated with the performance in the

advanced training sessions. Both studies were conducted with SCP-based BCIs.

• Limited Scope of Mental Tasks

Studies of BCIs based on spontaneous responses have focused exclusively on SMR

and SCP for use in individuals with disabilities. Several other mental tasks such

as language and arithmetic have also been shown to induce distinctive EEG patterns in able-bodied individuals (Millán et al., 2002; Roberts and Penny, 2000). Despite

the cognitive load imposed by these BCIs, they may have merits as BCI control

mechanisms for the target population. To the best of our knowledge, BCIs based

on language and arithmetic mental tasks have not been tested by the target population.

ulation.

• System-paced Versus Self-paced

The majority of the reviewed BCIs require the user to generate EEG patterns

when cued by the system. This limits the user’s ability to control initiation and

duration of the mental task, a restriction that may hinder system practicality as

an independently controlled communication device. One way to overcome this limitation is to develop self-paced (asynchronous) BCIs with a no control state

[139]. This can be accomplished through machine learning techniques that allow

detection of specific EEG patterns at any point in time [139, 140]. Leeb et al. (2007)

developed such a system for controlling a wheelchair in the virtual environment and reported successful operation by an individual with SCI [127].

• Performance Evaluation

The reviewed articles mainly focused on traditional measures of BCI performance,

namely, speed and accuracy. These measures, however, must be appropriately modified when used to evaluate system performance with individuals with disabilities [115]. Specifically, performance evaluation should consider the context in which

the system operates. According to the International Classification of Functioning, Disability and Health (ICF) (World Health Organization, 2001), this context includes personal factors such as the nature of the disability, as well as environmental factors (physical, social, and attitudinal issues). Personal factors relating to the nature of the disability are important in evaluating BCI suitability. In particular,

severity of the disability may affect BCI performance. For example, while BCIs

have been successfully used by individuals with incomplete locked-in syndrome,

[112] reported that basic communication could not be restored in any of the participants with complete locked-in syndrome. Further study of different locked-in syndrome sub-types can help identify the population which can most benefit from

BCI use. Another important personal factor is the possible improvement or decline

in function. Specifically, the extent of available communication function is a critical

personal factor. In this light, BCI speed is only a limited measure of performance gains over other communication means available to the user. For example, while a

BCI may be much slower than speech or muscle activated switches, it may provide

a functional means of communication in the absence of extant muscle control.

• Neuroethics and responsible dissemination to media

With the ubiquity of BCI research, neuroethical concerns are materializing [188],

particularly around the breach of user privacy [46]. Further, many potential BCI

users face communication difficulties due to severe disabilities (e.g. conditions resulting in LIS). Consequently, there are many challenges in reliably obtaining and

interpreting the user's informed consent for participation in BCI research. Interested readers are referred to Haselager et al. (2009) [71]. Researchers should

exercise special care when communicating with caregivers and potential BCI users.

Because the reality of BCI research is often not well-portrayed by the media, users

and care-givers may formulate expectations beyond what is feasible. To manage expectations, researchers must avoid ”over-hyping the significance of their findings”

[57, 71]. In a recent study, Nijboer, Clausen, Allison and Haselager (2011) [158] published results of a survey in which more than 80% of 144 BCI researchers acknowledged the importance of active participation of researchers in separating factual and

fictional statements published in media. In addition, ”85.8 % of the participants recommended ethical guidelines specific to BCI research and use within five years”.

Until such guidelines exist, researchers can prevent user and care-giver frustration

and disappointment by realistically presenting the expected outcomes, as well as

risks and complications surrounding BCI technology.

Appendix B: Music Database

Music excerpts used for emotion induction were selected from a variety of different genres of music. Previous studies of emotion using music were consulted in creating the music database. In addition, motion picture soundtracks were included due to their ability to induce emotions. Table 1 lists the music excerpts included in the common database.

Each participant selected 6 music excerpts prior to data collection. These songs, listed in Table 2, were chosen by each participant for inducing intense positive or negative emotions. Some participants selected identical music excerpts independently of each other.

In addition, although participants had no prior knowledge of the common music database, a number of the pieces in the common database also appeared among the self-selected songs.

Table 1: The list of music pieces included in the common music database

Title | Composer/Artist
Caribbean blue | Enya
Aranjuez | Andrea Bocelli
Sirens | Police, "Natural born killers" motion picture
First youth | "Cinema paradiso" motion picture
Bachehaye alp | Mohsen Alizadeh, "Dans les Alp" motion picture
Can't take my eyes off of you | The Everly Brothers
Sur le fil | Yann Tiersen, "Le Fabuleux Destin d'Amélie Poulain" motion picture
Can can | Jacques Offenbach
Goodbye Lenin | Yann Tiersen, "Goodbye Lenin" motion picture
Kinderszenen | Robert Schumann
Agnus dei | Samuel Barber
Just the way you are | Bruno Mars
Nocturne No. 20 in C sharp minor | Frédéric Chopin
Halo | Beyoncé
Adagio, G minor | Tomaso Albinoni
La noyée | Yann Tiersen, "Le Fabuleux Destin d'Amélie Poulain" motion picture
The man who sold the world | Nirvana
Cello Suite No. 1, Prelude | Johann Sebastian Bach
Concerto No. 3 in F major, Op. 8, Allegro | Antonio Vivaldi
All that I am living for | Evanescence
Bella Ciao |
The mission | Ennio Morricone, "The mission" sound track
Les millionnaire du dimanche |
One day | Matisyahu
Hasta Siempre Comandante | Buena Vista Social Club
The Lion Sleeps Tonight | The Tokens, "The Lion King" motion picture
La vieille barque | Mireille Mathieu
Fireworks | Katy Perry
Nothing else matters | Metallica
Alp | Mohsen Alizadeh, "Dans les Alp" motion picture
Waka Waka (This Time for Africa) | Shakira
Les rois du monde | Philippe d'Avilla, Damien Sargue and Grégori Baquet, "Roméo et Juliette" musical
Lullaby | Javier Navarrete, "Pan's Labyrinth" motion picture
Unforgiven III | Metallica
The Winner Takes It All | Abba
Con Te Partiro | Andrea Bocelli
C'est peut-être des anges | Gérard Lenorman
Malena | Ennio Morricone, "Malena" motion picture
If I had a hammer | Peter, Paul and Mary
Je t'aime | Lara Fabian
Haven't met you yet | Michael Bublé
Yesterday | The Beatles
To the beat of my heart | Hilary Duff
Hit the road Jack! | Ray Charles
Histoire D'un Amour | Dalida
When our wings are cut can we still fly? | Gustavo Santaolalla, "21 grams" motion picture
Habanera | Georges Bizet, "Carmen" opera
One day I'll fly away | Nicole Kidman, "Moulin Rouge" motion picture
Concerto No. 1 in E major, Op. 8, Allegro | Antonio Vivaldi
Sari gelin | Composer unknown (Armenian, Azerbaijani, Persian, and Turkish folk song)
Empire State of Mind | Jay Z, Alicia Keys
Wonderful life | Black
Scarborough fair | Simon and Garfunkel
Por una cabeza | Carlos Gardel, featured in "Scent of a Woman" motion picture
Je suis malade | Alice Dona and Serge Lama
Cloud song | Riverdance
Send me an angel | The Scorpions
Cinderella | Steven Curtis Chapman
Don't dwell | Tracy Chapman
I will survive | Gloria Gaynor
I wanna hold your hand | The Beatles
Moon river | Audrey Hepburn, "Breakfast at Tiffany's" motion picture
You are loved | Josh Groban
Vérone | "Notre Dame de Paris" musical
Don't cry for me Argentina | Sinead O'Connor cover, "Don't Cry for me Argentina" motion picture
The voice | Celtic women
Slow me down | Emmy Rossum
Zombie | The Cranberries
Après un rêve (Op. 7, No. 1) | Gabriel Fauré
Dani California | Red Hot Chili Peppers
Caruso | Lucio Dalla
Waltz, Swan lake ballet | Pyotr Ilyich Tchaikovsky

Table 2: The list of self-selected music pieces

Title | Composer/Artist
Iris | The Goo Goo Dolls
Tears in heaven | Eric Clapton
Requiem "Dies Irae" | Wolfgang Amadeus Mozart
Untitled | Sigur Rós
Ain't no mountain high enough | Marvin Gaye and Tammi Terrell
Theme from Schindler's List | John Williams, "Schindler's List" motion picture
Julien | Placebo
How to save a life | The Fray
Nocturne No. 20 in C sharp minor | Frédéric Chopin
Virtual insanity | Jamiroquai
Little Town | Chorus of Beauty and the Beast, Paige O'Hara and Richard White, "Beauty and the Beast" motion picture
A world of our own | The Seekers
Veronica Sawyer smokes | Crash Love
Tall trees | Matt Mays and El Torpedo
Cello Suite No. 1, Prelude | Johann Sebastian Bach
Hallelujah | Jeff Buckley
News bar | Charlie Clouser
Un petit peu d'air | Felipecha
Grand valse brillante | Frédéric Chopin
That's how you know | Amy Adams, "Enchanted" motion picture
He wasn't man enough for me | Toni Braxton
Man! I feel like a woman | Shania Twain
Mad world | Gary Jules
Be our guest | Chorus of Beauty and the Beast, "Beauty and the Beast" motion picture
Through and through and through | Joel Plaskett
So close | Jon McLaughlin, "Enchanted" motion picture
Human nature | Michael Jackson
Way over yonder in the minor key | Billy Bragg and Wilco
Everyday will be like a holiday | William Bell (RZA)
Give me Jesus | Fernando Ortega
Don't forget about me | Chris Kirby
I like it | Julio Iglesias
Love you | Free Design
Au parc | Chiara Mastroianni
Candle in the wind | Elton John
Red sun | Neil Young
Blower's daughter | Damien Rice
Bookends | Simon and Garfunkel
All I do is win | DJ Khaled
Hit the road Jack! | Ray Charles

Appendix C: Music characteristic extraction using MIRTOOLBOX

Music characteristics used in this thesis were sound pressure level, mode, and dissonance; the extraction of each characteristic is described in detail in this appendix. The interested reader is referred to [89] for more details regarding music characteristic extraction.

• Sound pressure

The sound pressure level (shown in Equation 1) is a logarithmic measure of the sound pressure ($\rho_{rms}$), expressed in decibels (dB) above a standard reference level ($\rho_{ref} = 2 \times 10^{-5}$ Pa):

$$L(\mathrm{dB}) = 20\log_{10}\!\left(\frac{\rho_{rms}}{\rho_{ref}}\right) \qquad (1)$$

The sound pressure level waveform indicates the volume changes throughout the

music excerpt, and is extracted from the music waveform.

• Dissonance

The MIRTOOLBOX [124] estimates dissonance using a method proposed by Plomp

and Levelt (1965) [176], which determines the sensory dissonance by identifying

pairs of sinusoids that appear close in frequency. The frequency ratio of each pair of sinusoids was therefore used for estimating dissonance.

The total dissonance was determined by computing the peaks of the spectrum and

finding the average of dissonance between all possible pairs of peaks [205].

• Mode

The MIRTOOLBOX [124] determines the mode (i.e. major/minor) using the key strength value. The key strength value represents the probability of each possible

candidate key. This probability value is determined using a cross-correlation of

the chromagram with key profiles representing each possible tonality [62, 110]. A hedged code sketch of how these three characteristics can be estimated is given below.
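The following sketch approximates the three characteristics with simplified methods: frame RMS for the sound pressure level of Equation (1), a Plomp-Levelt style pairwise dissonance over spectral peaks (Sethares parameterization), and Krumhansl-Kessler key-profile correlation for mode. MIRTOOLBOX's frame-based implementations differ in detail, so this is only an illustration.

```python
import numpy as np

P_REF = 2e-5  # standard reference sound pressure (Pa)

def sound_pressure_level(frame):
    """Equation (1): 20*log10(rms / reference), for one analysis frame."""
    rms = np.sqrt(np.mean(frame ** 2))
    return 20 * np.log10(max(rms, 1e-12) / P_REF)

def plomp_levelt_dissonance(freqs, amps):
    """Rough sensory dissonance: sum over pairs of spectral peaks, using the
    Plomp-Levelt idea that dissonance peaks for small frequency separations."""
    d = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            f1, f2 = sorted((freqs[i], freqs[j]))
            s = 0.24 / (0.021 * f1 + 19)              # critical-bandwidth scaling
            x = s * (f2 - f1)
            d += amps[i] * amps[j] * (np.exp(-3.5 * x) - np.exp(-5.75 * x))
    return d

# Krumhansl-Kessler major/minor key profiles (C-based), correlated with a chromagram
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17])

def mode_from_chromagram(chroma):
    """Return 'major' or 'minor' from the best-correlating rotated key profile."""
    best = {}
    for name, profile in (("major", MAJOR), ("minor", MINOR)):
        best[name] = max(np.corrcoef(chroma, np.roll(profile, k))[0, 1] for k in range(12))
    return max(best, key=best.get)

# Toy usage
fs = 44100
t = np.arange(0, 1.0, 1 / fs)
tone = 0.1 * np.sin(2 * np.pi * 440 * t)                      # an A4 sine tone
print(sound_pressure_level(tone))
print(plomp_levelt_dissonance([440.0, 466.0], [1.0, 1.0]))    # close pair -> dissonant
print(mode_from_chromagram(np.abs(np.random.default_rng(3).standard_normal(12))))
```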

Table 3: The significance of the main effect of a. Mode, b. Dissonance, and c. Maximum sound pressure level for each recording site shown in Figure 2.1. (α = 0.05)

Characteristic | Chromophore | R1 | R2 | R3 | R4 | O | L1 | L2 | L3 | L4
a. Mode | [HbO2] | X | X | X | X | X | X | X | X | X
a. Mode | [Hb] | X | X | X | X | X | X | X | X | X
b. Dissonance | [HbO2] | X | X | X | X | X | X | X | p=0.034 | X
b. Dissonance | [Hb] | X | X | X | X | X | X | X | X | X
c. Max sound pressure level | [HbO2] | X | X | p=0.015 | X | X | X | X | X | X
c. Max sound pressure level | [Hb] | X | X | X | X | X | X | X | X | X

Appendix D: Region-specific analysis of [HbO2] and [Hb] with respect to music characteristics

In chapter 6, to identify the effect of music characteristics on PFC [HbO2] and [Hb], these signals were averaged across the nine recording sites (see Figure 2.1). However, hemodynamic changes may vary across the PFC, and identifying the effect of music characteristics in each recording location is a meaningful pursuit.

Although considering regional activity patterns was appealing, the average [HbO2] and [Hb] signals were used due to the limited number of samples available for this analysis (72 samples per participant); including each recording site would lead to 18 comparisons per music characteristic. The average [HbO2] and [Hb] signals reliably captured the general pattern of hemodynamic changes. Nevertheless, a separate analysis involving the maximum [HbO2] and [Hb] recorded at each site was conducted, and the significance of the effect of each music characteristic is reported in Table 3. Cells marked with 'X' in Table 3 indicate that the effect of the corresponding music characteristic did not reach significance. The effect of mode did not reach significance (α = 0.05) at any of the recording locations. The effects of dissonance and maximum sound pressure level were significant for maximum [HbO2] in locations L3 and R3, respectively (see Figure 2.1); these locations correspond to inferolateral PFC regions. However, after applying a Bonferroni adjustment for the 9 comparisons per chromophore (resulting in α = 0.005), the effect of music characteristics did not reach significance at any location.

Appendix E: Contributions from Systemic Blood Flow

In NIRS hemodynamic monitoring, the near-infrared light needs to travel through the scalp, skull and cerebrospinal fluid before it reaches the brain. This light passage through the scalp may introduce systemic blood flow components into the detected signals [84].

Assessing contributions from the systemic blood flow (e.g. skin blood flow) in the recorded signals is an important pursuit for researchers in the field of functional NIRS. Specific practices in the NIRS recordings were shown to reduce the effect of unwanted systemic blood flow. For example, increased distance between the light source and detector was shown to reduce systemic blood flow contributions [58, 59]. The source-detector distance selected for this study (i.e. r=3cm) was within the recommended range for detecting cerebral hemodynamic changes. Joint studies using magnetic resonance imaging and

NIRS have confirmed reliable hemodynamic detection at 3.3 cm [195]. The wavelengths used for detection (i.e. 690 nm and 830 nm) were also shown to reduce the contribution of the systemic blood flow to the detected signals [196]. In addition to the strategies used in the current thesis, an important practice for future research in this field would be inclusion of systemic blood flow monitors such as laser Doppler flowmetry. Using these sensors, the systemic blood flow contributions can be directly measured and compared to the NIRS recordings. For example, Hoshi et al. [83] used laser Doppler flowmetry sensors and demonstrated that there were no task-related changes in the systemic blood

flow, while the NIRS recordings showed significant modulations. Another study by Villringer et al. [228], involving 60-second exposure to a visual stimulus, also demonstrated that significant NIRS signal changes were not accompanied by task-related modulations in systemic blood flow detected using laser Doppler flowmetry. The lack of additional laser

Doppler flowmetry sensors in the current thesis is a limitation that needs to be addressed in future studies. Although empirical observations such as region-specific changes with respect to emotional rating suggest a more dominant contribution from the cerebral blood

flow compared to skin blood flow, future studies of emotion using NIRS should consider including additional skin blood flow sensors (e.g. laser Doppler flowmetry sensors or NIRS sensors placed less than 0.5 cm apart) to assess systemic contributions to the results.

Appendix F: Cognitive Processing Activity in the Prefrontal Cortex

The prefrontal cortex is engaged during various emotional and cognitive processes.

The reader is referred to Section 1.4.1 for more details regarding the role of the PFC from a network perspective. Due to the wide range of processes during which the prefrontal cortex is recruited, other activities (e.g. cognitive processing) may have modulated brain activity in this study. Therefore, there is the possibility of misrepresenting unrelated cognitive functions, such as thinking of something else, as emotional response. To reduce the probability of detecting cognitive processes unrelated to emotion, the analysis was conducted with respect to subjective ratings and multiple trials were used (144 trials per individual). Unless the unrelated cognitive activities (e.g. distractions) are consistently repeated across trials, increasing the number of trials can be used as a strategy to mask the unrelated cognitive functions. In addition, assuming that subjective ratings are a correct representation of one's emotions, unrelated cognitive tasks occurring during the trial may be reflected in the ratings (i.e. trials during which distractions occurred may be rated lower). In addition to cognitive processes unrelated to emotions, there may be those accompanying emotions. Distinguishing between cognitive and emotional response may be challenging. For example, one of the mechanisms through which music can induce emotion is by evoking episodic memories, which involves memory retrieval

[92]. Hence, emotional response can be accompanied by cognitive appraisal. The current study design cannot distinguish cognitive appraisal that accompanies or results in emotional response from the emotional response itself. However, even if this cognitive appraisal was detected during the study, the findings would not be undermined, because detecting the cognitive appraisal that accompanies emotions would still lead to identifying emotions.

Appendix G: Research Ethics

Figure 1: Ethics approval notice

Assessing auditory stimuli presentation modalities in the affective modulation-based human computer interface

November XX, 2010

Dear Participant,

My name is Saba Moghimi. I am a PhD. student at the University of Toronto. My supervisor, Professor Tom Chau, and I work in a research team at Bloorview Kids Rehab. We are investigating a technology that can potentially be used as a communication device for people who cannot move or speak. Before agreeing to take part in this study, I would like to tell you how you will be involved.

What is the study about?

Access technologies help people who cannot move or speak to communicate with other people. Switches and eye trackers are examples of these technologies. Unfortunately, people who cannot make movements cannot use these switches. To help these people, researchers are investigating communication devices that are controlled by brain activity.

In this study, we will try to use brain activity and some other body signals to detect emotional reactions in response to auditory stimuli (music). This study will not help you directly. This study will help us design devices that help people with disabilities who cannot speak or express what they like.

How will I be involved in this study?

To volunteer for this study you must be able to communicate in English. You must also be at least 18 years old and have normal or corrected-to-normal vision and hearing. Please do not volunteer for this study if you know you have any of the following conditions: 1) degenerative disease; 2) cardiovascular disease; 3) metabolic disorders; 4) trauma-induced brain injury; 5) respiratory conditions; 6) drug and alcohol-related conditions; and 7) psychiatric conditions. We will ask you to come in for four sessions over a 3-5 month period. Each session will be about two hours. You will be asked not to drink any caffeinated beverages or alcohol an hour before the recording sessions. We will send you a reminder before each session.

We will put some sensors on your forehead. You can see the sensors in Figure 1. These sensors can record your brain activity. Do not worry; we will not be able to read your thoughts. We will also put sensors on your finger to measure your skin temperature, the amount of sweat in your skin, and your pulse. You can see the sensors in Figure 1.B. We will also ask you to wear a belt around your chest to record how you breathe. These sensors will not hurt you. You can let the researcher know if you are uncomfortable and we will remove the sensors. We can stop the recording or let you take a break if you are tired.

[Figure: sensor placement, panels A and B]

Before the experiment, we will ask you to name a number of music pieces you like. You will hear the music you told us about and some other music pieces and sounds from the environment. We will ask you to rate how you felt listening to the music after it plays.

Will anyone know what I say?

Your brain signals and physiological signals will be recorded in a private room. Only you and the researcher will be present. You can feel free to ask the researcher any questions about the experiment. All your concerns will be kept confidential. We will not be able to read your mind or your thoughts with these signals. All the information that we collect from you will be confidential. All the forms that may have your information and the data collected from you will be saved on a secure server or in a locked cabinet. We will not use your name when publishing the results of this study. We will keep your name and the data collected from you for seven years, and will destroy all the information at the end of this time. We will not release any information that might identify you without asking for your consent.

Do I have to do this?

If you decide not to take part in this study, that is okay. If you decide to take part, but change your mind at any time, that is also okay. You may drop out of the study at any time. Doing this will not affect your status at Bloorview Kids Rehab or at the University of Toronto.

What are the risks and benefits?

You may get tired during the experiment. We have planned breaks during the session, but you can ask for additional breaks during the experiment if you wish. You may also get bored or feel sleepy. Please let us know when you are tired. We will let you take a break. You will not directly benefit from this study. However, we think that this study will benefit people who have no means of communication. After the study, we will send you a thank you letter, and you will also receive a small token of appreciation for your participation.

What if I have questions?

Please ask me to explain anything you don’t understand before signing the consent form. My phone number is 416-425-6220 x3603. If you leave a message, I will return your call within 48 hours. I can also be reached by email at [email protected].

Thank you for thinking about helping us with this project.

Yours sincerely,

Saba Moghimi

Ph.D. Candidate
Bloorview Kids Rehab
Phone: 416-425-6220 x3270
E-mail: [email protected]

Supervisor: Professor Tom Chau
Bloorview Kids Rehab
E-mail: [email protected]

CONSENT FORM
Holland Bloorview Kids Rehabilitation Hospital

Re: Detecting mental selection on the basis of prefrontal cortical and autonomic nervous system activity

Please complete this form below and return it to the investigator.

The investigator explained this study to me. I read the information letter dated ______ and understand what this study is about. I understand that I may drop out of the study at any time. I agree to participate in this study.

______

Participant’s Name (please print) Signature Date

______

Researcher's Name Signature Date

Figure 2: Participant consent form

Bibliography

[1] F. Agrafioti, D. Hatzinakos, and A.K. Anderson. ECG pattern analysis for emotion

detection. Affective Computing, IEEE Transactions on, 3(1):102–115, 2012.

[2] C. Ahlstrom, A. Johansson, F. Uhlin, T. Länne, and P. Ask. Noninvasive in-

vestigation of blood pressure changes using the pulse wave transit time: a novel

approach in the monitoring of hemodialysis patients. Journal of Artificial Organs, 8(3):192–197, 2005.

[3] B.Z. Allison, D.J. McFarland, G. Schalk, S.D. Zheng, M.M. Jackson, and J.R. Wol- paw. Towards an independent brain-computer interface using steady state visual

evoked potentials. Clinical neurophysiology, 119(2):399–408, 2008.

[4] E. Altenmüller, K. Schürmann, V.K. Lim, and D. Parlitz. Hits to the left, flops

to the right: different emotions during listening to music are reflected in cortical

lateralisation patterns. Neuropsychologia, 40(13):2242–2256, 2002.

[5] F. Babiloni, F. Cincotti, M. Marciani, S. Salinari, L. Astolfi, F. Aloise,

F. De Vico Fallani, and D. Mattia. On the use of brain–computer interfaces outside

scientific laboratories: Toward an application in domotic environments. Interna- tional review of neurobiology, 86:133–146, 2009.

[6] O. Bai, P. Lin, S. Vorbach, M.K. Floeter, N. Hattori, and M. Hallett. Sensorimotor

beta rhythm-based brain–computer interface. Journal of neural engineering, 5:24–

35, 2008.


[7] R. Bates and HO Istance. Why are eye mice unpopular? A detailed comparison of head and eye controlled assistive technology pointing devices. Universal Access in

the Information Society, 2(3):280–290, 2003.

[8] G. Bauer, F. Gerstenbrand, and E. Rumpl. Varieties of the locked-in syndrome.

Journal of Neurology, 221(2):77–91, 1979.

[9] T. Baumgartner, M. Esslen, and L. Jäncke. From emotion perception to emotion experience: Emotions evoked by pictures and classical music. International Journal

of Psychophysiology, 60(1):34–43, 2006.

[10] J.D. Bayliss, S.A. Inverso, and A. Tentler. Changing the P300 brain computer

interface. CyberPsychology & Behavior, 7(6):694–704, 2004.

[11] N. Birbaumer. Slow cortical potentials: Plasticity, operant control, and behavioral effects. The Neuroscientist, 5(2):74, 1999.

[12] N. Birbaumer, T. Elbert, AG Canavan, and B. Rockstroh. Slow potentials of the

cerebral cortex and behavior. Physiological Reviews, 70(1):1, 1990.

[13] N. Birbaumer, N. Ghanayim, T. Hinterberger, I. Iversen, B. Kotchoubey, A. K¨ubler,

J. Perelmouter, E. Taub, and H. Flor. A spelling device for the paralysed. Nature, 398(6725):297–298, 1999.

[14] N. Birbaumer, A. K¨ubler,N. Ghanayim, T. Hinterberger, J. Perelmouter, J. Kaiser,

I. Iversen, B. Kotchoubey, N. Neumann, and H. Flor. The thought translation de-

vice (TTD) for completely paralyzedpatients. IEEE Transactions on Rehabilitation

Engineering, 8(2):190–193, 2000.

[15] A.J. Blood and R.J. Zatorre. Intensely pleasurable responses to music correlate

with activity in brain regions implicated in reward and emotion. Proceedings of the

National Academy of Sciences of the United States of America, 98(20):11818, 2001. 116

[16] A.J. Blood, R.J. Zatorre, P. Bermudez, and A.C. Evans. Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions.

Nature neuroscience, 2:382–387, 1999.

[17] M. Boso, P. POLITI, F. BARALE, and E. EMANUELE. Neurophysiology and

neurobiology of the musical experience. Functional neurology, 21(4):187–191, 2006.

[18] N.C. Brady, J. Marquis, K. Fleming, and L. McLean. Prelinguistic predictors of language growth in children with developmental disabilities. Journal of Speech,

Language and Hearing Research, 47(3):663, 2004.

[19] G.C. Bruner. Music, mood, and marketing. The Journal of Marketing, pages

94–104, 1990.

[20] R.L. Buckner, J. Sepulcre, T. Talukdar, F.M. Krienen, H. Liu, T. Hedden, J.R. Andrews-Hanna, R.A. Sperling, and K.A. Johnson. Cortical hubs revealed by

intrinsic functional connectivity: mapping, assessment of stability, and relation to

alzheimer’s disease. The Journal of Neuroscience, 29(6):1860–1873, 2009.

[21] S.C. Bushong. Magnetic resonance imaging. St. Louis, MO (USA); CV Mosby Co.,

1988.

[22] A.H. Buss and R. Plomin. A temperament theory of personality development. Wiley-

Interscience, 1975.

[23] J.J. Campos, R.G. Campos, and K.C. Barrett. Emergent themes in the study

of emotional development and emotion regulation. Developmental Psychology,

25(3):394, 1989.

[24] C.S. Carter, T.S. Braver, D.M. Barch, M.M. Botvinick, D. Noll, and J.D. Cohen.

Anterior cingulate cortex, error detection, and the online monitoring of perfor-

mance. Science, 280(5364):747–749, 1998. 117

[25] R. Chavarriaga and J. del R Millan. Learning from eeg error-related potentials in noninvasive brain-computer interfaces. Neural Systems and Rehabilitation Engi-

neering, IEEE Transactions on, 18(4):381–388, 2010.

[26] Christa Neuper, G.R. Müller-Putz, R. Scherer, and G. Pfurtscheller. Motor

imagery and EEG-based control of spelling devices and neuroprostheses. Event- related dynamics of brain oscillations, page 393, 2006.

[27] F. Cincotti, D. Mattia, F. Aloise, S. Bufalari, G. Schalk, G. Oriolo, A. Cherubini,

M.G. Marciani, and F. Babiloni. Non-invasive brain–computer interface system:

Towards its application as assistive technology. Brain research bulletin, 75(6):796– 803, 2008.

[28] C. Collet, E. Vernet-Maury, G. Delhomme, and A. Dittmar. Autonomic nervous

system response patterns specificity to basic emotions. Journal of the autonomic

nervous system, 62(1-2):45–57, 1997.

[29] J. Conradi, B. Blankertz, M. Tangermann, V. Kunzmann, and G. Curio. Brain-

computer interfacing in tetraplegic patients with high spinal cord injury. Int J

Bioelectromagnetism Volume, 11(2):65–68, 2009.

[30] M. Cope. The application of near infrared spectroscopy to non invasive monitoring of cerebral oxygenation in the newborn infant. Department of Medical Physics and

Bioengineering, University College London, pages 214–9, 1991.

[31] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, and

J.G. Taylor. Emotion recognition in human-computer interaction. Signal Processing

Magazine, IEEE, 18(1):32–80, 2001.

[32] S. Dalla Bella, I. Peretz, L. Rousseau, and N. Gosselin. A developmental study of

the affective value of tempo and mode in music. Cognition, 80(3):B1–B10, 2001. 118

[33] R.J. Davidson. Emotion and affective style: Hemispheric substrates. Psychological Science, 3(1):39, 1992.

[34] RJ Davidson. What does the prefrontal cortex do in affect. Perspectives on frontal

EEG asymmetry research. Biological Psychology, 67:219–233, 2004.

[35] T. Demiralp et al. Event-related oscillations are real brain responses wavelet analy-

sis and new strategies. International Journal of Psychophysiology, 39(2-3):91–127,

2001.

[36] M. Dennis. Prefrontal cortex: Typical and atypical development. The frontal lobes:

Development, function and pathology, pages 128–162, 2006.

[37] P.A. Di Mattia, F.X. Curran, and J. Gips. An eye control teaching device for

students without language expressive capacity: EagleEyes. Edwin Mellen Pr, 2001.

[38] E. Donchin, KM Spencer, and R. Wijesinghe. The mental prosthesis: assessing the speed of a P300-basedbrain-computer interface. IEEE transactions on rehabilitation

engineering, 8(2):174–179, 2000.

[39] W.C. Drevets, J.L. Price, J.R. Simpson, R.D. Todd, T. Reich, M. Vannier, and

M.E. Raichle. Subgenual prefrontal cortex abnormalities in mood disorders. Nature, 386(6627):824–827, 1997.

[40] R.O. Duda, P.E. Hart, and D.G. Stork. Pattern classification, volume 2. Citeseer,

2001.

[41] A. Duncan, J.H. Meek, M. Clemence, CE Elwell, L. Tyszczuk, M. Cope, and D. Delpy. Optical pathlength measurements on adult head, calf and forearm and

the head of the newborn infant using phase resolved optical spectroscopy. Physics

in Medicine and Biology, 40:295, 1995. 119

[42] T. Elbert, N. Birbaumer, W. Lutzenberger, and B. Rockstroh. Biofeedback of slow cortical potentials: self-regulation of central-autonomic patterns. Biofeedback and

self-regulation, pages 321–342, 1979.

[43] T. Elbert, B. Rockstroh, W. Lutzenberger, and N. Birbaumer. Biofeedback of

slow cortical potentials. I. Electroencephalography and Clinical Neurophysiology, 48(3):293–301, 1980.

[44] A. Etkin, T.D. Wager, et al. Functional neuroimaging of anxiety: a meta-analysis of

emotional processing in ptsd, social anxiety disorder, and specific phobia. American

Journal of Psychiatry, 164(10):1476–1488, 2007.

[45] T.H. Falk, M. Guirgis, S. Power, and T. Chau. Taking nirs-bcis outside the lab:

Towards achieving robustness against environment noise. Neural Systems and Re-

habilitation Engineering, IEEE Transactions on, 19(2):136–146, 2011.

[46] M.J. Farah. Neuroethics: the practical and the philosophical. Neuroethics Publi- cations, page 8, 2005.

[47] L.A. Farwell and E. Donchin. Talking off the top of your head: toward a men-

tal prosthesis utilizing event-related brain potentials. Electroencephalography and

clinical Neurophysiology, 70(6):510–523, 1988.

[48] E.A. Felton, J.A. Wilson, J.C. Williams, and P.C. Garell. Electrocorticographically

controlled brain–computer interfaces using motor and sensory imagery in patients

with temporary subdural electrode implants. Journal of Neurosurgery: Pediatrics,

106(3), 2007.

[49] R.S. Fisher, G. Harding, G. Erba, G.L. Barkley, and A. Wilkins. Photic-and

pattern-induced seizures: a review for the Epilepsy Foundation of America Working

Group. Epilepsia, 46(9):1426–1441, 2005. 120

[50] S.T. Fiske, D.T. Gilbert, and G. Lindzey. Handbook of social psychology. 1, 2010.

[51] E.O. Flores-Gutiérrez, J.L. Díaz, F.A. Barrios, R. Favila-Humara, M.A. Guevara,
Y. del Río-Portilla, and M. Corsi-Cabrera. Metabolic and electric brain patterns

during pleasant and unpleasant emotions induced by music masterpieces. Interna-

tional Journal of Psychophysiology, 65(1):69–84, 2007.

[52] GM Friehs, VA Zerris, CL Ojakangas, MR Fellows, and JP Donoghue. Brain-

machine and brain-computer interfaces. Stroke, 35(11-Supplment 1):2702–2705,

2004.

[53] N.H. Frijda and B. Mesquita. The social roles and functions of emotions. Emotion and culture, pages 51–87, 1994.

[54] M. Frisch and H. Messer. The use of the wavelet transform in the detection of an

unknown transient signal. Information Theory, IEEE Transactions on, 38(2):892–

897, 1992.

[55] T. Fritz, S. Jentschke, N. Gosselin, D. Sammler, I. Peretz, R. Turner, A.D.

Friederici, and S. Koelsch. Universal recognition of three basic emotions in music.

Current Biology, 19(7):573–576, 2009.

[56] A.W.K. Gaillard. Slow brain potentials preceding task performance. Biological

Psychology, 21(4):282–283, 1985.

[57] J.M. Garreu and S.J. Bird. Ethical issues in communicating science. Science and

engineering ethics, 6(4):435–442, 2000.

[58] TJ Germon, PD Evans, NJ Barnett, P Wall, AR Manara, and RJ Nelson. Cerebral

near infrared spectroscopy: emitter-detector separation must be increased. British

journal of anaesthesia, 82(6):831–837, 1999. 121

[59] TJ Germon, PD Evans, AR Manara, NJ Barnett, P Wall, and RJ Nelson. Sensitiv- ity of near infrared spectroscopy to cerebral and extra-cerebral oxygenation changes

is determined by emitter-detector separation. Journal of clinical monitoring and

computing, 14(5):353–360, 1998.

[60] A. Gerrards-Hesse, K. Spies, and F.W. Hesse. Experimental inductions of emotional states and their effectiveness: A review. British Journal of Psychology, 85(1):55–78,

1994.

[61] S. Glennen and D.C. DeCoste. The handbook of augmentative and alternative

communication. 1997.

[62] E. Gómez. Tonal description of polyphonic audio for music content processing.

INFORMS Journal on Computing, 18(3):294–304, 2006.

[63] M.D. Greicius, B. Krasnow, A.L. Reiss, and V. Menon. Functional connectivity in

the resting brain: a network analysis of the default mode hypothesis. Proceedings of the National Academy of Sciences, 100(1):253–258, 2003.

[64] C. Guger, G. Edlinger, W. Harkam, I. Niedermayer, and G. Pfurtscheller. How

many people are able to operate an EEG-based brain-computer interface (BCI)?

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(2):145, 2003.

[65] M. Guirgis, T. Falk, S. Power, S. Blain, and T. Chau. Harnessing physiological

responses to improve nirs-based brain-computer interface performance. In Proc.

ISSNIP Biosignals and Biorobotics Conference 2010, pages 59–62, 2010.

[66] A. Haag, S. Goronzy, P. Schaich, and J. Williams. Emotion recognition using

bio-sensors: First steps towards an automatic system. Affective Dialogue Systems,

pages 36–48, 2004. 122

[67] B. Hamadicharef, H. Zhang, C. Guan, C. Wang, K.S. Phua, K.P. Tee, and K.K. Ang. Learning eeg-based spectral-spatial patterns for attention level measurement.

pages 1465–1468, 2009.

[68] M. Hämäläinen, R. Hari, R.J. Ilmoniemi, J. Knuutila, and O.V. Lounasmaa. Magnetoencephalography: theory, instrumentation, and applications to noninvasive studies of the working human brain. Reviews of Modern Physics, 65(2):413, 1993.

[69] HM Hamer, HH Morris, EJ Mascha, MT Karafa, WE Bingaman, MD Bej,

RC Burgess, DS Dinner, NR Foldvary, JF Hahn, et al. Complications of inva-

sive video-EEG monitoring with subdural grid electrodes. Neurology, 58(1):97, 2002.

[70] M.B. Happ. Interpretation of nonvocal behavior and the meaning of voicelessness

in critical care. Social Science & Medicine, 50(9):1247–1255, 2000.

[71] P. Haselager, R. Vlek, J. Hill, and F. Nijboer. A note on ethical aspects of bci. Neural Networks, 22(9):1352–1357, 2009.

[72] J. Healey and R. Picard. Digital processing of affective signals. 6:3749–3752, 1998.

[73] C.S. Herrmann. Human EEG responses to 1–100 Hz flicker: resonance phenomena in visual cortex and their potential correlation to cognitive phenomena. Experi-

mental Brain Research, 137(3):346–353, 2001.

[74] MJ Herrmann, A.C. Ehlis, and AJ Fallgatter. Prefrontal activation through task

requirements of emotional induction measured with NIRS. Biological psychology,

64(3):255–263, 2003.

[75] K. Hevner. The affective character of the major and minor modes in music. The

American Journal of Psychology, pages 103–118, 1935. 123

[76] T. Hinterberger, A. Kübler, J. Kaiser, N. Neumann, and N. Birbaumer. A brain computer interface (BCI) for the locked in: comparison of different EEG classifica-

tions for the thought translation device. Clinical Neurophysiology, 114(3):416–425,

2003.

[77] LR Hochberg, MD Serruya, GM Friehs, JA Mukand, M Saleh, AH Caplan, A Bran-

ner, D Chen, RD Penn, and JP Donoghue. Neuronal ensemble control of prosthetic

devices by a human with tetraplegia. Nature, 442(7099):164–171, 2006.

[78] U. Hoffmann, J.M. Vesin, T. Ebrahimi, and K. Diserens. An efficient P300-based

brain-computer interface for disabled subjects. Journal of Neuroscience methods, 167(1):115–125, 2008.

[79] S. Holm. A simple sequentially rejective multiple test procedure. Scandinavian journal of statistics, pages 65–70, 1979.

[80] C.B. Holroyd and M.G.H. Coles. The neural basis of human error processing:

reinforcement learning, dopamine, and the error-related negativity. Psychological

review, 109(4):679, 2002.

[81] T. Hopyan, S. Laughlin, and M. Dennis. Emotions and their cognitive control in

children with cerebellar tumors. Journal of the International Neuropsychological

Society, 1(-1):1–12, 2006.

[82] TALAR HOPYAN, SUZANNE LAUGHLIN, and MAUREEN DENNIS. Emotions

and their cognitive control in children with cerebellar tumors. Journal of the In- ternational Neuropsychological Society, 16(6):1027, 2010.

[83] Y. Hoshi, J. Huang, S. Kohri, Y. Iguchi, M. Naya, T. Okamoto, and S. Ono. Recog- nition of human emotions from cerebral blood flow changes in the frontal region:

A study with event-related near-infrared spectroscopy. Journal of Neuroimaging,

21(2):e94–e101, 2011. 124

[84] Yoko Hoshi. Functional near-infrared spectroscopy: Potential and limitations in neuroimaging studies. International Review of Neurobiology, 66:237–266, 2005.

[85] G. Husain, W.F. Thompson, and E.G. Schellenberg. Effects of musical tempo and

mode on arousal, mood, and spatial abilities. Music Perception, 20(2):151–171,

2002.

[86] S Inci and T Ozgen. Locked-in syndrome due to metastatic pontomedullary tumor-

case report. Neurologia Medico-Chirurgica, 43(10):497–500, 2003.

[87] IH Iversen, N. Ghanayim, A. Kübler, N. Neumann, N. Birbaumer, and J. Kaiser. A

brain computer interface tool to assess cognitive functions in completely paralyzed

patients with amyotrophic lateral sclerosis. Clinical neurophysiology, 119(10):2214–

2223, 2008.

[88] R.I. Jahiel and M.J. Scherer. Initial steps towards a theory and praxis of person- environment interaction in disability. Disability & Rehabilitation, 32(17):1467–

1474, 2010.

[89] J.H. Jensen. Feature extraction for music information retrieval. 2010.

[90] F.F. Jobsis. Noninvasive, infrared monitoring of cerebral and myocardial oxygen sufficiency and circulatory parameters. Science, 198(4323):1264, 1977.

[91] P.N. Juslin. From mimesis to catharsis: expression, perception, and induction of emotion in music. Musical communication, pages 85–115, 2005.

[92] P.N. Juslin and D. Västfjäll. Emotional responses to music: The need to consider

underlying mechanisms. Behavioral and Brain Sciences, 31(5):559–575, 2008.

[93] J. Kaiser, A. K¨ubler,T. Hinterberger, N. Neumann, and N. Birbaumer. A non-

invasive communication device for the paralyzed. Minimally Invasive Neurosurgery,

45(1):19–23, 2002. 125

[94] A.A. Karim, T. Hinterberger, J. Richter, J. Mellinger, N. Neumann, H. Flor, A. Kübler, and N. Birbaumer. Neural Internet: web surfing with brain potentials

for the completely paralyzed. Neurorehabilitation and Neural Repair, 20(4):508,

2006.

[95] L. Kauhanen, P. Jylänki, J. Lehtonen, P. Rantanen, H. Alaranta, and M. Sams. EEG-based brain-computer interface for tetraplegics. Computational Intelligence

and Neuroscience, 2007:1, 2007.

[96] D. Keltner and J.J. Gross. Functional accounts of emotions. Cognition and Emo-

tion, 13(5):467–480, 1999.

[97] IK Keme-Ebi and AA Asindi. Locked-in syndrome in a nigerian male with mul-

tiple sclerosis: a case report and literature review. Pan African Medical Journal,

1(4):10pp, 2008.

[98] P.R. Kennedy, R.A.E. Bakay, M.M. Moore, K. Adams, and J. Goldwaithe. Direct control of a computer from the human central nervous system. IEEE Transactions

on Rehabilitation Engineering, 8(2):198–202, 2000.

[99] S. Khalfa, D. Schon, J.L. Anton, and C. Liégeois-Chauvel. Brain regions involved in

the recognition of happiness and sadness in music. Neuroreport, 16(18):1981–1984, 2005.

[100] J. Kim and E. André. Emotion recognition based on physiological changes in

music listening. Pattern Analysis and Machine Intelligence, IEEE Transactions

on, 30(12):2067–2083, 2008.

[101] J.M. Kim, K. Arakawa, K.T. Benson, and D.K. Fox. Pulse oximetry and circu-

latory kinetics associated with pulse volume amplitude measured by photoelectric

plethysmography. Anesthesia & Analgesia, 65(12):1333–1339, 1986. 126

[102] K.H. Kim, SW Bang, and SR Kim. Emotion recognition system using short-term monitoring of physiological signals. Medical and biological engineering and comput-

ing, 42(3):419–427, 2004.

[103] S.P. Kim, J.D. Simeral, L.R. Hochberg, J.P. Donoghue, and M.J. Black. Neural

control of cursor velocity in humans with tetraplegia. Journal of neural engineering, 5:455–476, 2008.

[104] S.P. Kim, JD Simeral, LR Hochberg, JP Donoghue, GM Friehs, and MJ Black.

Multi-state decoding of point-and-click control signals from motor cortical activity

in a human with tetraplegia. In Neural Engineering, 2007. CNE’07. 3rd Interna-

tional IEEE/EMBS Conference on, pages 486–489, 2007.

[105] S. Kingsnorth, S. Blain, and P. McKeever. Physiological and emotional responses of disabled children to therapeutic clowns: A pilot study. Evidence-Based Comple-

mentary and Alternative Medicine, 2011, 2011.

[106] S. Koelsch. Investigating emotion with music. Annals of the New York Academy

of Sciences, 1060(1):412–418, 2005.

[107] G. Krausz, R. Scherer, G. Korisek, and G. Pfurtscheller. Critical Decision-Speed and Information Transfer in the Graz Brain–Computer Interface. Applied psy-

chophysiology and biofeedback, 28(3):233–240, 2003.

[108] S.D. Kreibig. Autonomic nervous system activity in emotion: A review. Biological

psychology, 84(3):394–421, 2010.

[109] G. Kreutz, U. Ott, D. Teichmann, P. Osawa, and D. Vaitl. Using music to induce emotions: Influences of musical preference and absorption. Psychology of music,

36(1):101, 2008.

[110] C.L. Krumhansl. Cognitive foundations of musical pitch. (17), 1990. 127

[111] C.L. Krumhansl. An exploratory study of musical emotions and psychophysiology. Canadian Journal of Experimental Psychology/Revue canadienne de psychologie

exp´erimentale, 51(4):336, 1997.

[112] A. Kübler and N. Birbaumer. Brain computer interfaces and communication in

paralysis: Extinction of goal directed thinking in completely paralysed patients?

Clinical neurophysiology, 119(11):2658–2666, 2008.

[113] A. Kübler, A. Furdea, S. Halder, E.M. Hammer, F. Nijboer, and B. Kotchoubey.

A Brain–Computer Interface Controlled Auditory Event-Related Potential (P300)

Spelling System for Locked-In Patients. Annals of the New York Academy of Sci-

ences, 1157(Disorders of Consciousness):90–100, 2009.

[114] A. Kübler, B. Kotchoubey, T. Hinterberger, N. Ghanayim, J. Perelmouter, M. Schauer, C. Fritsch, E. Taub, and N. Birbaumer. The thought translation

device: a neurophysiological approach to communication in total motor paralysis.

Experimental Brain Research, 124(2):223–232, 1999.

[115] A. Kübler, N. Neumann, J. Kaiser, B. Kotchoubey, T. Hinterberger, and NP Bir-

baumer. Brain-computer communication: self-regulation of slow cortical poten- tials for verbal communication. Archives of physical medicine and rehabilitation,

82(11):1533, 2001.

[116] A. Kübler, N. Neumann, B. Wilhelm, T. Hinterberger, and N. Birbaumer.

Predictability of brain-computer communication. Journal of Psychophysiology,

18(2):121–129, 2004.

[117] A. Kübler, F. Nijboer, J. Mellinger, TM Vaughan, H. Pawelzik, G. Schalk, DJ Mc-

Farland, N. Birbaumer, and JR Wolpaw. Patients with ALS can use sensorimotor

rhythms to operate a brain-computer interface. Neurology, 64(10):1775, 2005. 128

[118] H. Kuck, M. Grossbach, M. Bangert, and E. Altenmüller. Brain processing of meter and rhythm in music. Annals of the New York Academy of Sciences, 999(1):244–

253, 2003.

[119] W.N. Kuhlman. EEG feedback training: enhancement of somatosensory cortical activity. Electroencephalography and clinical neurophysiology, 45(2):290–294, 1978.

[120] L. Kuncheva, T. Christy, I. Pierce, and S. Mansoor. Multi-modal biometric emotion

recognition using classifier ensembles. Modern Approaches in Applied Intelligence, pages 317–326, 2011.

[121] L.I Kuncheva. Combining Pattern Classifiers: Methods and Algorithms. 2004.

[122] W.J. Lammers and P. Badia. Habituation of P300 to target stimuli. Physiology &

behavior, 45(3):595–601, 1989.

[123] P.J. Lang and M.M. Bradley. Emotion and the motivational brain. Biological

Psychology, 84(3):437–450, 2010.

[124] O. Lartillot, P. Toiviainen, and T. Eerola. A matlab toolbox for music information

retrieval. Data analysis, machine learning and applications, pages 261–268, 2008.

[125] R.S. Lazarus. Emotion and adaptation. 1991.

[126] J.E. LeDoux. Emotion circuits in the brain. The Science of Mental Health: Fear

and anxiety, page 259, 2001.

[127] R. Leeb, D. Friedman, G.R. Muller-Putz, R. Scherer, M. Slater, and G. Pfurtscheller. Self-Paced(Asynchronous) BCI Control of a Wheelchair in Virtual

Environments: A Case Study with a Tetraplegic. Computational Intelligence and

Neuroscience, 2007:79642, 2007. 129

[128] B. Leung and T. Chau. A multiple camera tongue switch for a child with severe spastic quadriplegic cerebral palsy. Disability & Rehabilitation: Assistive Technology, 5(1):58–68, 2010.

[129] R.W. Levenson. Autonomic nervous system differences among emotions. Psychological Science, 3(1):23–27, 1992.

[130] L. Ljung. System Identification. 1999.

[131] S.G. Mallat. A Wavelet Tour of Signal Processing. San Diego, CA: Academic Press, 1999.

[132] K. Marumo, R. Takizawa, Y. Kawakubo, T. Onitsuka, and K. Kasai. Gender difference in right lateral prefrontal hemodynamic response while viewing fearful faces: A multi-channel near-infrared spectroscopy study. Neuroscience Research, 63(2):89–94, 2009.

[133] S.G. Mason and G.E. Birch. A general framework for brain-computer interface design. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(1):70–85, 2003.

[134] K. Matsuo, T. Kato, K. Taneichi, A. Matsumoto, T. Ohtani, T. Hamamoto, H. Yamasue, Y. Sakano, T. Sasaki, M. Sadamatsu, et al. Activation of the prefrontal cortex to trauma-related stimuli measured by near-infrared spectroscopy in post-traumatic stress disorder due to terrorism. Psychophysiology, 40(4):492–500, 2003.

[135] D.J. McFarland, D.J. Krusienski, W.A. Sarnacki, and J.R. Wolpaw. Emulation of computer mouse control with a noninvasive brain-computer interface. Journal of Neural Engineering, 5(2):101, 2008.

[136] J.H. Meek, C.E. Elwell, M.J. Khan, J. Romaya, J.S. Wyatt, D.T. Delpy, and S. Zeki. Regional changes in cerebral haemodynamics as a result of a visual stimulus measured by near infrared spectroscopy. Proceedings of the Royal Society of London. Series B: Biological Sciences, 261(1362):351, 1995.

[137] N. Memarian, A.N. Venetsanopoulos, and T. Chau. Infrared thermography as an access pathway for individuals with severe motor impairments. Journal of NeuroEngineering and Rehabilitation, 6(1):11, 2009.

[138] L.B. Meyer. Emotion and Meaning in Music. University of Chicago Press, 1956.

[139] J.R. Millán. Adaptive brain interfaces. Communications of the ACM, 46(3):74–80, 2003.

[140] J.R. Millán, J. Mouriño, M. Franzé, F. Cincotti, M. Varsta, J. Heikkonen, and F. Babiloni. A local neural classifier for the recognition of EEG patterns associated to mental tasks. IEEE Transactions on Neural Networks, 13(3), 2002.

[141] M.T. Mitterschiffthaler, C.H.Y. Fu, J.A. Dalton, C.M. Andrew, and S.C.R. Williams. A functional MRI study of happy and sad affective states induced by classical music. Human Brain Mapping, 28(11):1150–1162, 2007.

[142] S. Moghimi, A. Kushki, A.M. Guerguerian, and T. Chau. Characterizing emotional response to music in the prefrontal cortex using near infrared spectroscopy. Neuroscience Letters, 2012.

[143] S. Moghimi, A. Kushki, A.M. Guerguerian, and T. Chau. A review of EEG-based brain-computer interfaces as access pathways for individuals with severe disabilities. Assistive Technology: The Official Journal of RESNA, to appear (2012).

[144] S. Moghimi, A. Kushki, S. Power, A.M. Guerguerian, and T. Chau. Automatic detection of a prefrontal cortical response to emotionally rated music using multi-channel near-infrared spectroscopy. Journal of Neural Engineering, 9(2):026022, 2012.

[145] S.T. Morgan, J.C. Hansen, and S.A. Hillyard. Selective attention to stimulus location modulates the steady-state visual evoked potential. Proceedings of the National Academy of Sciences of the United States of America, 93(10):4770, 1996.

[146] J.D. Morris. SAM: the Self-Assessment Manikin. An efficient cross-cultural measurement of emotional response. Journal of Advertising Research, 35(6), 1995.

[147] D.W. Mulder, L.T. Kurland, K.P. Offord, and C.M. Beard. Familial adult motor neuron disease: amyotrophic lateral sclerosis. Neurology, 36(4):511, 1986.

[148] G.R. Müller, C. Neuper, and G. Pfurtscheller. Implementation of a telemonitoring system for the control of an EEG-based brain-computer interface. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(1):54–59, 2003.

[149] G.R. Müller-Putz, R. Scherer, C. Brunner, R. Leeb, and G. Pfurtscheller. Better than random? A closer look on BCI results. International Journal of Bioelectromagnetism, 10(1):52–55, 2008.

[150] K.J. Murphy and J.A. Brunberg. Adult claustrophobia, anxiety and sedation in MRI. Magnetic Resonance Imaging, 15(1):51–54, 1997.

[151] M. Naito, Y. Michioka, K. Ozawa, Y. Ito, M. Kiguchi, and T. Kanazawa. A communication means for totally locked-in ALS patients based on changes in cerebral blood volume measured with near-infrared light. IEICE Transactions on Information and Systems, 90(7):1028–1037, 2007.

[152] Z. Nenadic and J.W. Burdick. Spike detection using the continuous wavelet transform. IEEE Transactions on Biomedical Engineering, 52(1):74–87, 2005.

[153] N. Neumann and N. Birbaumer. Predictors of successful self control during brain-computer communication. Journal of Neurology, Neurosurgery & Psychiatry, 74(8):1117, 2003.

[154] N. Neumann and A. Kübler. Training locked-in patients: A challenge for the use of brain-computer interfaces. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(2):169–172, 2003.

[155] C. Neuper, G.R. Müller, A. Kübler, N. Birbaumer, and G. Pfurtscheller. Clinical application of an EEG-based brain-computer interface: a case study in a patient with severe motor impairment. Clinical Neurophysiology, 114(3):399–409, 2003.

[156] B.R. Nhan and T. Chau. Classifying affective states using thermal infrared imaging of the human face. IEEE Transactions on Biomedical Engineering, 57(4):979–987, 2010.

[157] E. Niedermeyer and F.H.L. Da Silva. Electroencephalography: basic principles, clinical applications, and related fields. Lippincott Williams & Wilkins, 2005.

[158] F. Nijboer, S.P. Carmien, E. Leon, F.O. Morin, R.A. Koene, and U. Hoffmann. Affective brain-computer interfaces: Psychophysiological markers of emotion in healthy persons and in persons with amyotrophic lateral sclerosis. In Affective Computing and Intelligent Interaction and Workshops, 2009. ACII 2009. 3rd International Conference on, pages 1–11. IEEE, 2009.

[159] F. Nijboer, E.W. Sellers, J. Mellinger, M.A. Jordan, T. Matuz, A. Furdea, S. Halder, U. Mochty, D.J. Krusienski, T.M. Vaughan, et al. A P300-based brain-computer interface for people with amyotrophic lateral sclerosis. Clinical Neurophysiology, 119(8):1909–1916, 2008.

[160] K. Oatley, D. Keltner, and J.M. Jenkins. Understanding Emotions. Wiley-Blackwell, 2006.

[161] H. Obrig, C. Hirth, J.G. Junge-Hulsing, C. Doge, T. Wolf, U. Dirnagl, and A. Villringer. Cerebral oxygenation changes in response to motor stimulation. Journal of Applied Physiology, 81(3):1174, 1996.

[162] F. Ortiz-Corredor, J.J. Silvestre-Avendano, and A. Izquierdo-Bello. Locked-in state mimicking cerebral death in a child with Guillain-Barré syndrome. Revista de Neurología, 44(10):636–638, 2007.

[163] K.J. Pallesen, E. Brattico, C. Bailey, A. Korvenoja, J. Koivisto, A. Gjedde, and S. Carlson. Emotion processing of major, minor, and dissonant chords. Annals of the New York Academy of Sciences, 1060(1):450–453, 2005.

[164] J. Panksepp and G. Bernatzky. Emotional sounds and the brain: the neuro-affective foundations of musical appreciation. Behavioural Processes, 60(2):133–155, 2002.

[165] M.A. Pastor, J. Artieda, J. Arbizu, M. Valencia, and J.C. Masdeu. Human cerebral activation during steady-state visual-evoked responses. Journal of Neuroscience, 23(37):11621, 2003.

[166] J. Perelmouter and N. Birbaumer. A binary spelling interface with random errors. IEEE Transactions on Rehabilitation Engineering, 8(2):227–232, 2000.

[167] I. Peretz, L. Gagnon, and B. Bouchard. Music and emotion: perceptual determinants, immediacy, and isolation after brain damage. Cognition, 68(2):111–141, 1998.

[168] P.C. Petrantonakis and L.J. Hadjileontiadis. Emotion recognition from EEG using higher order crossings. IEEE Transactions on Information Technology in Biomedicine, 14(2):186–197, 2010.

[169] K.V. Petrides and A. Furnham. Trait emotional intelligence: Behavioural validation in two studies of emotion recognition and reactivity to mood induction. European Journal of Personality, 17(1):39–57, 2003.

[170] G. Pfurtscheller, C. Neuper, C. Guger, W. Harkam, H. Ramoser, A. Schlogl, B. Obermaier, M. Pregenzer, et al. Current trends in Graz brain-computer interface (BCI) research. IEEE Transactions on Rehabilitation Engineering, 8(2):216–219, 2000.

[171] R.W. Picard. Affective Computing. The MIT Press, 2000.

[172] R.W. Picard, E. Vyzas, and J. Healey. Toward machine emotional intelligence: Analysis of affective physiological state. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10):1175–1191, 2001.

[173] F. Piccione, F. Giorgi, P. Tonin, K. Priftis, S. Giove, S. Silvoni, G. Palmas, and F. Beverina. P300-based brain computer interface: reliability and performance in healthy and paralysed participants. Clinical Neurophysiology, 117(3):531–537, 2006.

[174] T.W. Picton. The P300 wave of the human event-related potential. Journal of Clinical Neurophysiology, 9(4):456, 1992.

[175] G.D. Pinna and R. Maestri. Reliability of transfer function estimates in cardiovascular variability analysis. Medical and Biological Engineering and Computing, 39(3):338–347, 2001.

[176] R. Plomp and W.J.M. Levelt. Tonal consonance and critical bandwidth. The Journal of the Acoustical Society of America, 38(4):548–560, 1965.

[177] S. Power, T. Falk, and T. Chau. Classification of prefrontal activity due to mental arithmetic and music imagery using hidden Markov models and frequency domain near-infrared spectroscopy. Journal of Neural Engineering, 7(2):026002 (9pp), 2010.

[178] S. Power, A. Kushki, and T. Chau. Toward a 3-state system-paced NIRS-BCI: automatic discrimination of mental arithmetic, music imagery from the no-control state. Under review at Journal of Neural Engineering, 2011.

[179] S.D. Power, T.H. Falk, and T. Chau. Classification of prefrontal activity due to mental arithmetic and music imagery using hidden Markov models and frequency domain near-infrared spectroscopy. Journal of Neural Engineering, 7:026002, 2010.

[180] S.D. Power, A. Kushki, and T. Chau. Towards a system-paced near-infrared spectroscopy brain–computer interface: differentiating prefrontal activity due to mental arithmetic and mental singing from the no-control state. Journal of Neural Engineering, 8:066004, 2011.

[181] W.S. Pritchard. Psychophysiology of P300. Psychological Bulletin, 89(3):506–540, 1981.

[182] V. Rajagopalan and A. Ray. Symbolic time series analysis via wavelet-based partitioning. Signal Processing, 86(11):3309–3320, 2006.

[183] C. Ranganath and G. Rainer. Neural mechanisms for detecting and remembering novel events. Nature Reviews Neuroscience, 4(3):193–202, 2003.

[184] P. Rani, C. Liu, N. Sarkar, and E. Vanman. An empirical study of machine learning techniques for affect recognition in human–robot interaction. Pattern Analysis & Applications, 9(1):58–69, 2006.

[185] S.J. Roberts and W.D. Penny. Real-time brain-computer interfacing: A preliminary study using Bayesian learning. Medical and Biological Engineering and Computing, 38(1):56–61, 2000.

[186] R.G. Robinson, K.L. Kubos, L.Y.N.B. Starr, K. Rao, and T.R. Price. Mood disorders in stroke patients: importance of location of lesion. Brain, 107(1):81, 1984.

[187] E.T. Rolls. On The Brain and Emotion. Behavioral and Brain Sciences, 23(02):219–228, 2000.

[188] A. Roskies. Neuroethics for the new millenium. Neuron, 35(1):21, 2002.

[189] M.K. Rothbart and D. Derryberry. Development of individual differences in temperament. Advances in Developmental Psychology, 1:37–86, 1981.

[190] J.A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, 39(6):1161, 1980.

[191] C.L. Rusting. Personality, mood, and cognitive processing of emotional information: three conceptual frameworks. Psychological Bulletin, 124(2):165, 1998.

[192] D.L. Sackett. Rules of evidence and clinical recommendations on the use of antithrombotic agents. Chest, 95(2 Supplement):2S, 1989.

[193] S. Samson. Neuropsychological studies of musical timbre. Annals of the New York Academy of Sciences, 999(1):144–151, 2003.

[194] G. Santhanam, S.I. Ryu, B.M. Yu, A. Afshar, and K.V. Shenoy. A high-performance brain-computer interface. Nature, 442(7099):195–198, 2006.

[195] I. Sase, H. Eda, A. Seiyama, H.C. Tanabe, A. Takatsuki, and T. Yanagida. Multi-channel optical mapping: Investigation of depth information. In Proc. SPIE, volume 4250, pages 29–36, 2001.

[196] H. Sato, M. Kiguchi, F. Kawaguchi, A. Maki, et al. Practicality of wavelength selection to improve signal-to-noise ratio in near-infrared spectroscopy. NeuroImage, 21(4):1554–1562, 2004.

[197] J.P. Saul, R.D. Berger, P. Albrecht, S.P. Stein, M.H. Chen, and R.J. Cohen. Transfer function analysis of the circulation: unique insights into cardiovascular regulation. American Journal of Physiology-Heart and Circulatory Physiology, 261(4):H1231–H1245, 1991.

[198] J.P. Saul, R.D. Berger, M.H. Chen, and R.J. Cohen. Transfer function analysis of autonomic regulation. II. Respiratory sinus arrhythmia. American Journal of Physiology-Heart and Circulatory Physiology, 256(1):H153–H161, 1989.

[199] M.J. Scherer. The change in emphasis from people to person: introduction to the special issue on Assistive Technology. Disability & Rehabilitation, 24(1-3):1–4, 2002.

[200] L.A. Schmidt and L.J. Trainor. Frontal brain electrical activity (EEG) distinguishes valence and intensity of musical emotions. Cognition & Emotion, 15(4):487–500, 2001.

[201] W.W. Seeley, V. Menon, A.F. Schatzberg, J. Keller, G.H. Glover, H. Kenna, A.L. Reiss, and M.D. Greicius. Dissociable intrinsic connectivity networks for salience processing and executive control. The Journal of Neuroscience, 27(9):2349–2356, 2007.

[202] E. Sellers, G. Schalk, and E. Donchin. The P300 as a typing tool: tests of brain-computer interface with an ALS patient. Psychophysiology, 40:77, 2003.

[203] E.W. Sellers and E. Donchin. A P300-based brain-computer interface: initial tests by ALS patients. Clinical Neurophysiology, 117(3):538–548, 2006.

[204] E.W. Sellers, A. Kübler, and E. Donchin. Brain–computer interface research at the University of South Florida cognitive psychophysiology laboratory: the P300 speller. Biomed. Eng, 51(4):647–656, 2004.

[205] W.A. Sethares. Tuning, Timbre, Spectrum, Scale. 2004.

[206] Y.I. Sheline. 3D MRI studies of neuroanatomic changes in unipolar major depression: the role of stress and medical comorbidity. Biological Psychiatry, 48(8):791–800, 2000.

[207] D.V. Sherman and D. Ely. Biochemical and galvanic skin responses to music stimuli by college students in biology and music. Perceptual and Motor Skills, 74(3c):1079–1090, 1992.

[208] A. Siegel and H. Edinger. Neural control of aggression and rage behavior. Handbook of the Hypothalamus, 3(Part B), 1981.

[209] J.R. Simpson, W.C. Drevets, A.Z. Snyder, D.A. Gusnard, and M.E. Raichle. Emotion-induced changes in human medial prefrontal cortex: II. During anticipatory anxiety. Proceedings of the National Academy of Sciences, 98(2):688–693, 2001.

[210] R. Sinha, W.R. Lovallo, and O.A. Parsons. Cardiovascular differentiation of emotions. Psychosomatic Medicine, 54(4):422, 1992.

[211] E. Smith and M. Delargy. Locked-in syndrome. British Medical Journal, 330(7488):406, 2005.

[212] E.M. Sokhadze. Effects of music on the recovery of autonomic and electrocortical activity after stress induced by aversive visual stimuli. Applied Psychophysiology and Biofeedback, 32(1):31–50, 2007.

[213] M.P. Spackman, M. Fujiki, B. Brinton, D. Nelson, and J. Allen. The ability of children with language impairment to recognize emotion conveyed by facial expression and music. Communication Disorders Quarterly, 26(3):131, 2005.

[214] D. Sridharan, D.J. Levitin, and V. Menon. A critical role for the right fronto-insular cortex in switching between central-executive and default-mode networks. Proceedings of the National Academy of Sciences, 105(34):12569–12574, 2008.

[215] N. Steinbeis, S. Koelsch, and J.A. Sloboda. Emotional processing of harmonic expectancy violations. Annals of the New York Academy of Sciences, 1060(1):457–461, 2005.

[216] M.W. Sullivan et al. Contingency, means-end skills, and the use of technology in infant intervention. Infants & Young Children, 5(4):58, 1993.

[217] K. Tai, S. Blain, and T. Chau. A review of emerging access technologies for individuals with severe motor impairments. Assistive Technology: The Official Journal of RESNA, 20(4):204, 2008.

[218] K. Tai and T. Chau. Single-trial classification of NIRS signals during emotional induction tasks: towards a corporeal machine interface. Journal of NeuroEngineering and Rehabilitation, 6(1):39, 2009.

[219] M. Tanida, M. Katsuyama, and K. Sakatani. Relation between mental stress-induced prefrontal cortex activity and skin conditions: A near-infrared spectroscopy study. Brain Research, 1184:210–216, 2007.

[220] J.J. Tecce. Contingent negative variation (CNV) and psychological processes in man. Psychological Bulletin, 77(2):73–108, 1972.

[221] M.M. Ter-Pogossian, M.E. Raichle, and B.E. Sobel. Positron-emission tomography. Scientific American, 243(4), 1980.

[222] J.F. Thayer and R.D. Lane. A model of neurovisceral integration in emotion regulation and dysregulation. Journal of Affective Disorders, 61(3):201–216, 2000.

[223] M. Toyokura. Waveform and habituation of sympathetic skin response. Electroencephalography and Clinical Neurophysiology/Electromyography and Motor Control, 109(2):178–183, 1998.

[224] L. Trejo, K. Knuth, R. Prado, R. Rosipal, K. Kubitz, R. Kochavi, B. Matthews, and Y. Zhang. EEG-based estimation of mental fatigue: Convergent evidence for a three-state model. Foundations of Augmented Cognition, pages 201–211, 2007.

[225] E.Z. Tronick. Emotions and emotional communication in infants. American Psychologist, 44(2):112, 1989.

[226] T.M. Vaughan, D.J. McFarland, G. Schalk, W.A. Sarnacki, D.J. Krusienski, E.W. Sellers, and J.R. Wolpaw. The Wadsworth BCI research and development program: at home with BCI. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2):229–233, 2006.

[227] M. Velliste, S. Perel, M.C. Spalding, A.S. Whitford, and A.B. Schwartz. Cortical control of a prosthetic arm for self-feeding. Nature, 453(7198):1098–1101, 2008.

[228] A. Villringer, J. Planck, C. Hock, L. Schleinkofer, and U. Dirnagl. Near infrared spectroscopy (NIRS): a new tool to study hemodynamic changes during activation of brain function in human adults. Neuroscience Letters, 154(1-2):101–104, 1993.

[229] R.F. Voss and J. Clarke. "1/f noise" in music: Music from 1/f noise. Journal of the Acoustical Society of America, 63(1):258, 1978.

[230] Y. Wang, R. Wang, X. Gao, B. Hong, and S. Gao. A practical VEP-based brain-computer interface. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2):234–240, 2006.

[231] C.M. Warrier and R.J. Zatorre. Right temporal cortex is critical for utilization of melodic contextual cues in a pitch constancy task. Brain, 127(7):1616–1625, 2004.

[232] L. Wedin. A multidimensional study of perceptual-emotional qualities in music. Scandinavian Journal of Psychology, 13(1):241–257, 1972.

[233] N. Weiskopf, K. Mathiak, S.W. Bock, F. Scharnowski, R. Veit, W. Grodd, R. Goebel, and N. Birbaumer. Principles of a brain-computer interface (BCI) based on real-time functional magnetic resonance imaging (fMRI). IEEE Transactions on Biomedical Engineering, 51(6):966–970, 2004.

[234] N. Weiskopf, F. Scharnowski, R. Veit, R. Goebel, N. Birbaumer, and K. Mathiak. Self-regulation of local brain activity using real-time functional magnetic resonance imaging (fMRI). Journal of Physiology-Paris, 98(4-6):357–373, 2004.

[235] R.E. Wheeler, R.J. Davidson, and A.J. Tomarken. Frontal brain asymmetry and emotional reactivity: A biological substrate of affective style. Psychophysiology, 30(1):82–89, 1993.

[236] A. Wilson. Augmentative Communication in Practice: An Introduction (2nd ed.). University of Edinburgh, Edinburgh, Scotland, 1998.

[237] J.R. Wolpaw, N. Birbaumer, D.J. McFarland, G. Pfurtscheller, and T.M. Vaughan. Brain-computer interfaces for communication and control. Clinical Neurophysiology, 113(6):767–791, 2002.

[238] J.R. Wolpaw and D.J. McFarland. Control of a two-dimensional movement signal by a noninvasive brain-computer interface in humans. Proceedings of the National Academy of Sciences of the United States of America, 101(51):17849, 2004.

[239] J.R. Wolpaw, D.J. McFarland, T.M. Vaughan, and G. Schalk. The Wadsworth Center brain-computer interface (BCI) research and development program. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(2):204–207, 2003.

[240] W.M. Wundt and C.H. Judd. Outlines of Psychology. W. Engelmann, 1907.

[241] H. Yang, Z. Zhou, Y. Liu, Z. Ruan, H. Gong, Q. Luo, and Z. Lu. Gender difference in hemodynamic responses of prefrontal area to emotional stress by near-infrared spectroscopy. Behavioural Brain Research, 178(1):172–176, 2007.

[242] T.O. Zander and C. Kothe. Towards passive brain–computer interfaces: applying brain–computer interface technology to human–machine systems in general. Journal of Neural Engineering, 8(2):025005, 2011.

[243] R.J. Zatorre. Discrimination and recognition of tonal melodies after unilateral cerebral excisions. Neuropsychologia, 23(1):31–41, 1985.

[244] M.R. Zentner and J. Kagan. Infants' perception of consonance and dissonance in music. Infant Behavior and Development, 21(3):483–492, 1998.