
Automatic Detection from Face, Physiology, Task Performance and Fusion during Affective Interference

M. Sazzad Hussain a,b,1, Rafael A. Calvo b, Fang Chen a

a National ICT Australia (NICTA), Australian Technology Park, Eveleigh, NSW 1430, Australia
b School of Electrical and Information Engineering, University of Sydney, NSW 2006, Australia

Abstract.

Cognitive load is experienced during critical tasks; at the same time, emotional states may be induced either by the task itself or by extraneous experiences. Emotions irrelevant to the task representation may interfere with the processing of the relevant task and can influence task performance and behavior, making the accurate detection of cognitive load from nonverbal information challenging. This paper investigates automatic cognitive load detection from facial features, physiology and task performance under affective interference. Data was collected from participants (n=20) solving mental arithmetic tasks with emotional stimuli in the background, and a combined classifier was used for detecting cognitive load levels. Results indicate that the face modality was more accurate for cognitive load detection under affective interference, whereas physiology and task performance were more accurate without affective interference. Multimodal fusion improved detection accuracies, but was less accurate under affective interference; more specifically, accuracy decreased with increasing intensity of emotional arousal.

Keywords: Cognitive load measurement, affect, machine learning, multimodality, data fusion, HCI.

Acknowledgement

M. Sazzad Hussain was supported by Australia Award and National ICT Australia (NICTA). NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program.

1 Corresponding author. Tel: 61435009691. Email address: [email protected]

1. Introduction

Understanding the various aspects of human psychology, along with the theories of computer science, is a burgeoning concern in the field of human-computer interaction (HCI) (Card, Moran, & Newell, 1983). New sensor technologies, such as low-cost physiological sensors, webcams, GPS, Internet tracking, etc., can be used for gathering large volumes of multimodal data. Machine learning techniques are applied to analyze these data, find patterns and make predictions related to user experience. The information obtained is then employed for designing human-centered systems that can adapt to users' psychological states, generate responses and provide assistance accordingly. Researchers in the area (Duric et al., 2002; Hollender, Hofmann, Deneke, & Schmitz, 2010; Pantic & Rothkrantz, 2003; Schultz, Peter, Blech, Voskamp, & Urban, 2007; Sharma, Pavlovic, & Huang, 1998) are investigating behavioral, perceptual, attentional, cognitive and affective aspects of human psychology as well as their interdependencies, with the goal of improving usability, user experience, performance and learning. Psychological phenomena such as cognition and affect, in particular, are becoming increasingly important topics of research, as they are needed for making systems more aware of the user's inner states of mind (Calvo & D'Mello, 2010; F. Chen et al., 2012).

Scenarios such as air traffic control rooms, traffic management centers and customer call centers involve complicated tasks for operators to manage and solve. For instance, air traffic control has been recognized as a complex job in which excessive workload can overwhelm controllers and compromise the safety of air travel (Sperandio, 1978). Cognitive load (CL) influences task performance; therefore, the ability to measure it in real time can support users affected by cognitive overload through personalized adaptive systems (F. Chen et al., 2012; de Tjerk, Henryk, & Neerincx, 2010). Such systems can help users not only achieve high productivity but also avoid task-related stress and frustration that could lead to errors and hazardous situations.

Complicated tasks can induce high cognitive load and may also introduce other psychological factors such as emotions. Conversely, personal feelings during critical task activities may influence the experience of CL and introduce overload. The interference of such factors makes CL detection more challenging (S. Chen & Epps, 2012). This paper investigates the automatic detection of CL levels from multimodal information under the influence of affective components (e.g. arousal).

1.1. Cognitive Load Detection: Theory and Techniques

The term 'cognitive load' is used to refer to the load caused by the executive control of working memory (F. Paas, Tuovinen, Tabbers, & Van Gerven, 2003). The limited capacity and duration of working memory available during task activities results in the experience of CL (F. Chen et al., 2012; S. Chen & Epps, 2012; Huang & Tettegah, 2010; F. Paas et al., 2003). The finite working memory can become either underloaded or overloaded depending on the amount of information processed simultaneously (Cowan, 2001). CL is experienced all the time, but its impact is more severe under critical conditions; for instance, higher CL may be experienced if the task is complex. Research in Cognitive Load Theory (CLT) has focused on identifying instructional designs that create unnecessary working memory load, such as extraneous load, in order to provide more effective alternatives (Ayres & Gog, 2009; Kirschner, Ayres, & Chandler, 2011). Sustaining an optimal CL is important for minimizing errors during critical tasks (e.g. driving, air traffic control) as well as for maintaining performance (e.g. customer service, call center operation).

Analytical and empirical methods can be used for assessing CL in general (F. Paas et al., 2003). CL is a demand that users experience, while mental effort is what users actually exert in response (Jong, 2010). Self-rating techniques used to report the experienced mental effort have been shown to be quite reliable based on numeric indicators (Gopher & Braune, 1984; Odonnell & Eggemeier, 1986). For realistic situations, however, this approach may be impractical, because questionnaires can interrupt the flow of tasks and even add more tasks for the potentially overloaded user. Techniques employing task performance include the concurrent measurement of primary and secondary task performance (Odonnell & Eggemeier, 1986). Hockey (2003) proposed a model of the relationship between performance and workload: higher performance can still be achieved within the 'effort' region, but high CL results in a greater chance of errors, with performance beginning to decline. Indicators of performance include reaction time, accuracy and error rate.

Besides these techniques, some studies have explored acoustic and prosodic patterns in speech (e.g. pitch variation) as cues reflecting CL (Le, Ambikairajah, Epps, Sethu, & Choi, 2011; Le, Epps, Choi, & Ambikairajah, 2010; Yin, Chen, Ruiz, & Ambikairajah, 2008); the rate of pitch peaks and pauses has proven to be a good indicator of CL. Behavioral measures, such as eye-gaze tracking (Gütl et al., 2005), mouse (Ark, Dryer, & Lu, 1999) and keyboard inputs, digital pens, etc., have also been explored to find patterns indicative of CL. Changes in cognitive functioning are also reflected in physiological variables, based on the assumption that any change in cognitive functioning is mirrored in human physiology. Physiological measures such as heart rate variability, skin response, brain signals and pupil dilation have been explored and reported in the past (Beatty & Lucero-Wagoner, 2000; Berka et al., 2007; Kramer, 1991; F. G. W. C. Paas & Van Merriënboer, 1994; Shi, Ruiz, Taib, Choi, & Chen, 2007; Stanners, Coulter, Sweet, & Murphy, 1979). These studies have demonstrated that physiological data can correlate (more or less, depending on the type of signal) with the level of stimulation experienced and can also represent various levels of mental effort.

1.2. Interdependencies between Cognition and Affect

Behavioral and physiological expressions of feelings, along with factors such as moods, attitudes, affective styles, and temperament, have been described as affective phenomena (Calvo & D'Mello, 2010). In the context of this study, affect (i.e. emotion) refers to the feeling of positive or negative pleasantness (i.e. valence) and its intensity (i.e. arousal). Cognition and affect are regarded as two interrelated aspects of human functioning that cannot be seen as completely separate phenomena (Huang & Tettegah, 2010; Immordino-Yang & Damasio, 2007). Immordino-Yang and Damasio (2007) provided a theoretical framework showing the interdependencies between cognition and emotions, and there is also a close relationship between the operation of working memory and affective components (Kalyuga, 2011). Studies have further suggested that the interdependencies between cognition, affective states and other aspects of human functioning (e.g. behavior, perception) are essential considerations for building adaptive intelligent HCI systems (Duric et al., 2002).

A number of techniques have previously been investigated for affect detection (Calvo & D'Mello, 2010), while research in the area of CL measurement and detection has progressed separately (F. Paas et al., 2003). Both communities have used self-reports as well as behavioral, physiological and multimodal measures for finding evidence of affect and CL separately. Despite these distinct research efforts and progress in automatic affect and CL detection, some studies have emphasized the importance of both, depending on the application, especially in HCI. Schultz et al. (2007) described the importance of CL and emotion detection for usability studies and proposed a framework for investigating ongoing emotions and high CL. The framework, called RealEYES, is intended to record and analyze multimodal information such as video and audio streams, gaze, mouse, keyboard, event data, and physiological data, and also integrates machine learning techniques over the resulting dataset. More recently, Huang and Tettegah (2010) emphasized the importance of CL and emotions in designing serious games and proposed a conceptual framework. Their study investigated both the presence and absence of empathy and their impact on CL, with the ultimate goal of enhancing learning by focusing on how players can reduce CL based on its interaction with empathy.

1.3. Cognitive Load and Affective Interference

Although positive affect related to the task activity has been reported to be useful, for instance during learning (Kalyuga, 2011), artifacts (e.g. arousal) that are irrelevant to the working memory representations can interfere and conflict with the processing of the relevant primary task (D'Esposito, Postle, Jonides, & Smith, 1999; Levens & Phelps, 2008). The presence of such artifacts alongside working memory demands has been seen as a fundamental problem in CL detection, especially when using nonverbal information such as behavioral and physiological modalities (S. Chen & Epps, 2012). For example, as part of human nature, affective components may arise during tasks due to the nature of the task or to personal feelings. Complicated tasks will drive high CL and also introduce affects (e.g. frustration, confusion). The Yerkes-Dodson law (Yerkes & Dodson, 1908) provides an empirical relationship between arousal and performance, where performance increases with arousal up to a certain point and then decreases. Moreover, affects related to personal feelings (e.g. anger, anxiety) during a task will also impact the experience of CL. It has been reported that affective factors are likely to have an effect on task performance (Blair et al., 2007; Erthal et al., 2005), and even to influence modalities such as physiological responses (Bradley, Miccoli, Escrig, & Lang, 2008; S. Hussain, Chen, Calvo, & Chen, 2011; Picard, Vyzas, & Healey, 2001). This introduces the necessity of investigating nonverbal modalities and features that are more sensitive to CL than to affect (and vice versa), evaluating appropriate machine learning techniques, and identifying instructional designs that can not only reduce unnecessary working memory load but also eliminate affective factors.

If affective and CL aspects of information are processed in parallel, creating simultaneous demands on working memory resources from both the affect and the task, then information-processing biases could occur as a function of attending to the affective or the CL aspects of the information. Attending to information with associated affect could interfere with the task because of the limited working memory available for processing information. This phenomenon is referred to as 'affective interference' (Siegle, Ingram, & Matt, 2002). The impact of affective components during tasks and how they influence performance has been investigated in some studies (Blair et al., 2007; Erthal et al., 2005). However, research evaluating the interference of affective components on other modalities (e.g. physiological) during CL tasks is scarce. Some exceptions include studies (Bradley et al., 2008; S. Chen & Epps, 2012; Stanners et al., 1979) that have used pupil and eye features as indicators of cognition with affect.

1.4. Cognitive Load Detection under Affective Interference

Hussain et al. (2011) investigated the automatic classification of two levels of CL from task performance and physiological responses, and their fusion, during affective interference. This paper expands that study to further explore CL detection from multimodal features. Multimodal data was collected in a systematic experiment while participants performed mental arithmetic tasks under affective interference stimulated by photos. The difficulty of the task was varied so as to impose a changing load on the limited working memory. The stimuli (i.e. photos) were simultaneously presented to induce varying intensities (i.e. arousal) of positive and negative affect. In the context of this paper, the affective states induced were irrelevant to the task activity, in order to investigate the affective interference. Features from the face (color and movement), physiological channels (heart activity, skin response, breathing pattern, and eye activity), and task performance (reaction time and accuracy) are used with machine learning methods for automatically detecting CL levels during affective interference. Feature selection (correlation based) and classification (combined classifier) techniques are applied for evaluating automatic CL detection from the multimodal information. The following questions are investigated and could be useful for implementing more intelligent HCI systems:

a. What is the accuracy of CL detection from face, multichannel physiology, task performance and their fusion? As a way to study the relationship between CL detection and affect we add interference, so what is the accuracy under affective interference?

b. What is the accuracy for detecting two (low, high), three (low, medium, high), and five (very low, low, medium, high, very high) levels of CL respectively?

c. Features are automatically selected to ensure that they are the most useful for the CL classification algorithms. How do these features change when we add the affective interference? Are there any features that are common between the two conditions?

d. What is the accuracy of CL detection during discrete intensity levels of emotional arousal (e.g. high arousal)?

Sections 2 and 3 explain the experiment for collecting the data and the computational framework respectively. The results are presented in section 4, followed by discussion and conclusions in section 5.

2. Study and Data Collection

The purpose of this study was to collect physiological signals and facial videos in response to CL tasks under the interference of emotional stimuli. The study was approved by the ethics committee of the University of New South Wales prior to the data collection. The data was recorded while participants performed arithmetic tasks under affective interference stimulated by photos from the International Affective Picture System (IAPS) (Lang, Bradley, & Cuthbert, 2005). IAPS contains a collection of color photographs with normative ratings of emotion (valence, arousal, dominance), providing a set of stimuli for studying emotion and attention. Using the Self-Assessment Manikin (SAM) (Lang et al., 2005), subjects rated the pictures on the dimensions of valence, arousal, and dominance; these ratings were used to establish the normative ratings in the collection. Figure 1 shows a sample of the experimental setup.

2.1 Participants

The study was advertised to approximately 300 students and staff at the National ICT Australia (NICTA) and the first 20 respondents (11 males and 9 females, whose age ranged from 22 to 48) were selected to take part. Participants were healthy and were instructed not to take any drugs and to avoid caffeine prior to the experiment. They signed an informed consent prior to the experiment.

Figure 1. Sample experimental setup and sensors

2.2 Sensors and Recording

A Logitech Webcam Pro 9000 placed on top of the screen, facing the participant, was used to record their faces. Videos were recorded with the CaptureFlux software in color (24-bit RGB with 3 channels, 8 bits/channel) at 15 frames per second (fps) with a resolution of 640×480 pixels. Electrocardiogram (ECG), skin conductivity (SC), and respiration (RESP) sensors were placed on participants, and a BIOPAC MP150 system with AcqKnowledge software was used to acquire the physiological signals at a sampling rate of 1000 Hz for all channels. Two electrodes were placed on the wrists for collecting ECG. SC was recorded from the index and middle fingers of the left hand, and a respiration band was strapped around the chest. Eye activity was recorded using a faceLAB (Seeing Machines) remote eye tracker with a sampling rate of 60 Hz. Participants were free to move their heads but were instructed to keep their sight within the screen display area.

2.3 Protocol

2 CaptureFlux: http://paul.glagla.free.fr/captureflux_en.htm
3 BIOPAC: http://www.biopac.com/data-acquisition-analysis-system-mp150-system-windows
4 FaceLab: http://www.seeingmachines.com

Each trial took approximately 15 minutes of preparation (consent forms, sensor setup and explanation of the experiment protocol) prior to the experiment. Physiological signals, eye activity, facial video, and task performance were recorded during the entire duration of the trials. A short training was presented prior to the experiment, and participants were given a chance to practice in a demo session. The study lasted approximately 35 minutes, during which participants completed mental arithmetic tasks with emotionally stimulating images in the background. As part of the activity, participants were asked to sum four numbers that were sequentially displayed on screen and then to select the correct answer from 10 choices. Each task was presented for 12 seconds before the answer choices were prompted. Participants were instructed to be accurate and to respond as fast as possible. A self-report was collected after each task on a 9-point scale, from 1 (easy) to 9 (difficult). This was followed by a 5-second pause showing a blank screen with a fixation cross just before the next task. Figure 2 presents the arithmetic task presentation protocol for each activity.

Figure 2. Arithmetic task presentation protocol. The gray background was replaced with IAPS images to induce the intensities of arousal during task activity.

The number of digits of the addends and the number of carries produced during addition determined the level of task difficulty. Levels 1, 2 and 3 each used one-digit addends, producing no carry, one carry and two carries during the addition process respectively; levels 4 and 5 used two-digit addition, with one carry produced only at the lower digit in level 4 and carries generated at both digits in level 5. Each participant completed a total of 7 sessions with a short break in between. In every session they were presented with the 5 difficulty levels twice, and the sequence of the tasks was randomized (Klingner, Tversky, & Hanrahan, 2010). The first session used a plain grey background (task without affective interference), while the other 6 sessions used different images from 6 categories (positive and negative valence with three degrees of arousal) in the background. The images were selected from IAPS according to the normative ratings of valence and arousal to induce emotions, and they were presented based on the affective circumplex model (Russell, 1980). Table 1 shows examples of the images used in each affect category for elicitation. Each participant completed a total of 70 tasks (5 difficulty levels × 2 × 7 sessions).

Table 1: Examples of images from IAPS used as stimuli for affect elicitation.

Affect Category (Arousal/Valence)   Image Examples
High/Positive                       Erotic couple, fireworks, adventure
Medium/Positive                     Cute animals, baby, children
Low/Positive                        Animals, landscape, flower
High/Negative                       Snake, spider, surgery, death
Medium/Negative                     Guns, vomit, roach, crying people
Low/Negative                        Rocks, dustpan, garbage
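The difficulty manipulation described above (digit count and number of carries) can be illustrated with a small stimulus-generation sketch. This is not the authors' task-generation code: the function name, the use of rejection sampling, and the reading of 'carries' as the total number of column carries over the running sum are all illustrative assumptions; the two-digit levels (4 and 5) would additionally need to constrain the column in which each carry occurs.

```matlab
function addends = generateTask(numDigits, targetCarries)
% GENERATETASK  Draw four addends with the given digit count such that the
% sequential addition produces the requested number of column carries.
% Illustrative sketch only (rejection sampling; may loop if the target is
% unreachable for the given digit count).
    lo = 10^(numDigits - 1);                 % 1 for one-digit, 10 for two-digit
    hi = 10^numDigits - 1;                   % 9 or 99
    while true
        addends = randi([lo hi], 1, 4);      % four addends shown sequentially
        if countCarries(addends) == targetCarries
            return;
        end
    end
end

function c = countCarries(addends)
% Count the column carries produced while summing the addends one at a time.
    c = 0;
    total = addends(1);
    for k = 2:numel(addends)
        a = total; b = addends(k); carry = 0;
        while a > 0 || b > 0
            s = mod(a, 10) + mod(b, 10) + carry;
            carry = s >= 10;
            c = c + carry;
            a = floor(a / 10); b = floor(b / 10);
        end
        total = total + addends(k);
    end
end
```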

3. Computational Framework

A stimulus and task presentation program was implemented in MATLAB, which was used during the experiment. The computational framework (also in MATLAB) for feature extraction, feature selection, and classification was implemented with the support of in-house and third party codes and toolboxes (Figure 3). The Raw Data Inspector & Computational Interface allows synchronization of the physiological channels, video, and reports for data visualization and annotation. A brief description of the three main modules as part of the framework is provided in the following subsections.

Figure 3. MATLAB based computational framework for feature extraction, feature selection and classification.

3.1 Feature Extraction

Feature vectors were calculated and extracted from the time window corresponding to the duration of the task presentation (12 seconds). Initially, each feature vector was labeled with the corresponding level of task difficulty (1 to 5). Two further sets of feature vectors were then replicated from this one to obtain two and three CL levels respectively. Based on the number of digits of the addends presented as part of the task difficulty, levels 1, 2, 3 (one-digit addends) and levels 4, 5 (two-digit addends) were grouped and relabeled as low and high respectively to obtain two CL levels. To obtain three CL levels, the two extremes (levels 1, 2 and levels 4, 5) were grouped and relabeled as low and high respectively, and level 3 was relabeled as medium. For five CL levels, levels 1, 2, 3, 4, and 5 were labeled as very low, low, medium, high, and very high respectively. A total of 298 features were extracted from the face video, physiological channels and task performance.

(i) Face Features. Two types of image-based features (geometric and chromatic) were extracted from the face video (Tian, Kanade, & Cohn, 2002) using MATLAB and the Open Computer Vision library (OpenCV). An extended boosting cascade classifier (Haar classifier) (Viola & Jones, 2001) was used for tracking the face and extracting the related geometric features. The geometric and chromatic features (Table 2) capture the position and color of the face in each frame, which vary with head movement and illumination changes. Dynamic features were then calculated from these image-based features: statistical features such as mean, median, standard deviation, minima and maxima were computed over the window, and actions of the user's head, such as tilting, nodding and leaning forward or backward, were detected from the difference between the values of the first and last frames of the time window. A total of 115 unique features were extracted from the video (59 geometric and 56 chromatic).

Table 2. Description of facial channels and features.

Name        Description
color       Chromatic channel
move        Geometric channel
R, G, B     Red, Green, Blue
WB          White balance
X, Y        X-, Y-coordinate of detected face
W, A        Width, area of detected face
Movement    Distance between the positions of the face in frames
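As a rough illustration of how the window-level statistical and movement features are derived from the per-frame measurements in Table 2, consider the sketch below. The variable frames, its column layout, and the last-minus-first movement convention are assumptions for illustration, not the authors' implementation.

```matlab
% frames: [nFrames x nChannels] matrix for one 12-second task window,
% e.g. columns = R, G, B, WB, X, Y, W, A (layout assumed for illustration).
stats   = [mean(frames); median(frames); std(frames); min(frames); max(frames)];
featVec = stats(:)';                       % flatten into a single feature vector

% Head-movement style features: change over the window, approximated here by
% the difference between the last and first frames (sign convention illustrative).
delta   = frames(end, :) - frames(1, :);
featVec = [featVec, delta];
```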

(ii) Physiological Features. Pupil and eye movement features (e.g. pupil size, fixation, saccade) (Bradley et al., 2008) were extracted from the data collected with the eye tracker, and blink features (number of blinks, blink duration) were extracted from the video recorded with the webcam. The ECG, SC, and RESP signals were downsampled to 200 Hz to reduce computation. The Augsburg Biosignal Toolbox (AuBT) was used for extracting statistical features from the ECG, SC and RESP channels. Features common to all channels, such as mean, median, standard deviation, frequency spectrum range, ratio, minimum and maximum, as well as features related to the signal type (e.g. heart rate variability, respiration pulse, frequency), were calculated and extracted. Table 3 presents a description of the physiological features. A total of 181 unique features were extracted from the five physiological channels (84 ECG, 21 SC, 67 RESP, and 9 eye).

Table 3. Description of physiological channels and features.

Name            Description
pupil           Pupil dilation
blk             Blinks
ecg             Electrocardiography
sc              Skin conductivity
rsp             Respiration
1Diff           Approximation of first derivative
2Diff           Approximation of second derivative
pulse           Pulse signal
amlp            Amplitude signal
P, Q, R, S, T   p-, q-, r-, s-, t-peaks (ecg)
hrv             Heart rate variability
Fix             Fixation
Sacs            Saccades

5 OpenCV: opencv.willowgarage.com/wiki/
6 AuBT: informatik.uni-augsburg.de/de/lehrstuehle/hcm/projects/tools/aubt/

(iii) Performance Features. Two performance features were extracted from the experiment logs: reaction time and accuracy score. Reaction time (reacTime) is the time taken to select an answer while the multiple-choice answers were displayed on the screen. Accuracy score (accScore) indicates whether the answer was right or wrong.
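The relabeling of the five task-difficulty levels into the 2L, 3L and 5L label sets described at the start of this section can be expressed as a simple lookup; the sketch below assumes difficulty is a vector holding one level (1-5) per task instance.

```matlab
% Map each task's difficulty level (1..5) onto the three label schemes.
map2L = {'low', 'low', 'low', 'high', 'high'};            % levels 1-3 vs. 4-5
map3L = {'low', 'low', 'medium', 'high', 'high'};         % extremes grouped
map5L = {'veryLow', 'low', 'medium', 'high', 'veryHigh'}; % one label per level

labels2L = map2L(difficulty);   % cell arrays of class labels, one per instance
labels3L = map3L(difficulty);
labels5L = map5L(difficulty);
```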

3.2 Fusion and Normalization

The effect of multichannel or multimodal fusion can be evaluated by integrating features from the channels or modalities. This was done simply by merging (concatenating) the features, an approach known as feature-level fusion. All features from the chromatic and geometric channels were considered the face modality, all features from the physiological channels the physio modality, and the performance features the perf modality. The fusion model contained the features from all three modalities. To address individual variations, z-score standardization (equation 1) was applied to all features over all subjects:

$$z_i = \frac{x_i - \mu}{\sigma}, \qquad i = 1, \dots, n \qquad (1)$$

where $\mu$ is the mean and $\sigma$ the standard deviation over the $n$ instances.
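A minimal sketch of the feature-level fusion and the standardization of equation (1) is given below; Xface, Xphysio and Xperf are assumed placeholder names for the per-modality feature matrices ([instances × features], rows aligned across modalities).

```matlab
% Feature-level fusion: simply concatenate the modality feature matrices.
Xfusion = [Xface, Xphysio, Xperf];

% z-score standardization (equation 1), applied per feature over all instances.
mu    = mean(Xfusion, 1);
sigma = std(Xfusion, 0, 1);
Z     = (Xfusion - repmat(mu, size(Xfusion, 1), 1)) ./ ...
        repmat(sigma, size(Xfusion, 1), 1);
% Recent MATLAB releases can use zscore(Xfusion) or implicit expansion instead.
```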

3.3 Feature Selection

Feature selection techniques are used for choosing the best subset of features, i.e. those most strongly correlated with the class information. The process removes redundant and noisy features and reduces the dimensionality of the feature space. Feature selection was implemented in MATLAB using the DMML wrapper for Weka (M. Hall et al., 2009). Correlation-based feature selection (CFS) was used for choosing the best subset of features, the same technique applied in our previous paper (S. Hussain et al., 2011). Feature selection was performed separately for each individual modality and for the fusion. The CFS technique evaluates the merit of a subset of features by considering the predictive ability of each individual feature along with the level of redundancy between them (M. A. Hall, 1999). Equation (2) defines the merit of a feature subset S consisting of k features:

$$\mathrm{Merit}_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}} \qquad (2)$$

where $\overline{r_{cf}}$ is the average feature-class correlation and $\overline{r_{ff}}$ is the average feature-feature correlation. The subset with the highest merit, as measured by equation (2), found during the search is used to reduce the dimensionality of both the original training data and the testing data. The CFS criterion is defined by equation (3), where $r_{cf_i}$ and $r_{f_i f_j}$ are the feature-class and feature-feature correlation variables:

$$\mathrm{CFS} = \max_{S_k} \frac{r_{cf_1} + r_{cf_2} + \dots + r_{cf_k}}{\sqrt{k + 2\,(r_{f_1 f_2} + \dots + r_{f_i f_j} + \dots + r_{f_k f_1})}} \qquad (3)$$

7 DMML: featureselection.asu.edu/software.php
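As an illustration of the criterion in equation (2), the sketch below computes the merit of one candidate feature subset. It is not the Weka/DMML implementation: Pearson correlation (the Statistics Toolbox corr function) stands in for the feature-class and feature-feature correlation measure, and the heuristic search over subsets is omitted.

```matlab
function m = cfsMerit(X, y, subset)
% CFSMERIT  Merit of a candidate feature subset as in equation (2).
% X: [instances x features] matrix, y: numeric class vector,
% subset: indices of the k candidate features. Illustrative only.
    Xs  = X(:, subset);
    k   = numel(subset);
    rcf = mean(abs(corr(Xs, y)));              % mean feature-class correlation
    C   = abs(corr(Xs));                       % feature-feature correlation matrix
    if k > 1
        rff = (sum(C(:)) - k) / (k * (k - 1)); % mean off-diagonal correlation
    else
        rff = 0;
    end
    m = (k * rcf) / sqrt(k + k * (k - 1) * rff);   % equation (2)
end
```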

3.4 Classification

Combining classifiers aims to provide a more accurate and efficient classification decision. An individual classifier may do well on a certain modality, but it is challenging to generalize one classifier across multiple channels or modalities. Instead of just one classifier, a set of classifiers (base classifiers) can be considered along with the best subset of features. Combining classifiers is suitable in CL studies where features come from multiple modalities and individuals in varying environmental setups (M. S. Hussain, Monkaresi, & Calvo, 2012). The advantage is that the accuracy of the combined classifier will on average be better than that of the base classifiers (Kuncheva, 2004; Utthara, Suranjana, Sukhendu, & Pinaki, 2010). The Vote classifier (Kuncheva, 2004) is used for classification in this study, considering three types of base classifiers: lazy, function, and tree. First, the three base classifiers are evaluated: k-nearest neighbor (KNN), support vector machine (SVM), and decision trees (J48). SVM, KNN, and J48 are particularly popular because of their performance and compatibility with a variety of applications (Nguyen, Li, Bass, & Sethi, 2005). These supervised learning algorithms are simple to implement and span a variety of machine learning paradigms (function, lazy, trees), making them appropriate base classifiers to address the diversity of features and variety of subjects. The final classification results are achieved with the Vote classifier. This meta-classifier uses the average probability rule to combine the probability distributions of the base classifiers, in this case SVM, KNN, and J48. The Vote classifier determines the class probability distribution by computing the mean probability distribution of the N base classifiers as follows (Seewald, 2003):

$$\mathrm{pred} = \frac{1}{N}\sum_{i=1}^{N} P_i \qquad (4)$$

where $P_i$ is the probability distribution given by classifier $i$. For majority voting, the prediction over the $j$ classes is obtained by using $P'_i$ instead of $P_i$ in equation (4), where $P'_i$ is the vector of $P'_{ij}$ for all $j$ and

$$P'_{ij} = \begin{cases} 1, & \text{if } j = \arg\max_j P_{ij} \text{ for the given } i \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$
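Equations (4) and (5) amount to the following computation for a single test instance; the matrix probs and its layout are illustrative assumptions, not the MATLABArsenal/Weka interface.

```matlab
% Sketch of the average-probability rule (eq. 4) and the hard-vote mapping
% (eq. 5). probs is assumed to be an [N x J] matrix in which row i holds
% classifier i's probability distribution over the J classes for one instance.
predDist = mean(probs, 1);                % equation (4): mean class distribution
[~, predClass] = max(predDist);           % final prediction = argmax over classes

% Majority-vote variant: replace each P_i by the 0/1 indicator vector of eq. (5)
hard = zeros(size(probs));
[~, winners] = max(probs, [], 2);         % each base classifier's own argmax
hard(sub2ind(size(probs), (1:size(probs, 1))', winners)) = 1;
votedDist = mean(hard, 1);                % then combine as in equation (4)
```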

The classifiers were implemented using MATLABArsenal, a wrapper for the classifiers in Weka (M. Hall et al., 2009). CVParameterSelection, a meta-classifier in Weka, was used for estimating the parameter values of the classifiers; it performs the parameter estimation using cross-validation. In this study, K values of 1 and 3 were used for KNN classification. For SVM, an exponent of 1.0 (linear kernel) and a complexity factor of 1.0 with epsilon 1.0E-12 were used. The C4.5 decision tree (J48) was used with the confidence factor set to 0.25 and with subtree raising considered when pruning. The average probability rule was used for the Vote classifier.

8 MATLABArsenal: cs.siu.edu/~qcheng/featureselection/index.html

3.5 Training, Testing and Evaluation

For each analysis, the normalized datasets from all subjects were first merged into one larger dataset, generating the data for a combined model. All datasets were then shuffled (randomized). Training and testing were performed with 10-fold cross-validation (5-fold cross-validation was also evaluated and gave very similar, slightly lower, results). In 10-fold (k-fold) cross-validation, the dataset is randomly partitioned into 10 subsamples; a single subsample is retained as the validation data for testing the model, and the remaining 9 (k-1) subsamples are used as training data. The cross-validation process is then repeated 10 times (the folds), with each of the 10 subsamples used exactly once as the validation data, and the 10 fold results are averaged to produce a single estimate. The accuracy score is used for reporting the overall classification performance. The ZeroR classifier was used for calculating the baseline classification accuracy; any classification accuracy above this baseline indicates above-chance performance for the classifier.
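The evaluation loop described above can be sketched as follows; trainAndTest is a hypothetical placeholder for the Vote-classifier wrapper (not a built-in or MATLABArsenal function), and the fold assignment shown is just one simple way to realize the partitioning.

```matlab
% Sketch of 10-fold cross-validation over the merged, shuffled dataset.
% X: [instances x features], labels: cell array of class names per instance.
K   = 10;
n   = size(X, 1);
idx = randperm(n);                        % shuffle the instances
foldId = mod(0:n-1, K) + 1;               % assign each (shuffled) instance to a fold
acc = zeros(K, 1);
for f = 1:K
    testIdx  = idx(foldId == f);
    trainIdx = idx(foldId ~= f);
    % trainAndTest: hypothetical wrapper that trains the Vote classifier and
    % returns predicted labels for the test instances.
    pred   = trainAndTest(X(trainIdx, :), labels(trainIdx), X(testIdx, :));
    acc(f) = mean(strcmp(pred, labels(testIdx)));   % fold accuracy
end
meanAccuracy = mean(acc);                 % averaged over the 10 folds
```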

4. Results and Discussions

This section presents classification results for detecting two levels (2L), three levels (3L) and five levels (5L) of CL under the interference of arousal. Discrete arousal intensities, namely high arousal (highAr), medium arousal (medAr), low arousal (lowAr), and no arousal (noAr), are investigated. Initially, the self-reports were validated against the presented task difficulty. The average ratings across the 20 subjects for the five difficulty levels are: M=1.8 (SD=0.6) at level 1, M=2.3 (SD=0.8) at level 2, M=3.0 (SD=0.8) at level 3, M=4.1 (SD=1.0) at level 4, and M=6.2 (SD=1.5) at level 5. A Friedman test revealed a significant effect of task difficulty on self-reports, χ²(4) = 77.6, p < 0.01. This result validates our design of CL levels using task difficulty. Section 4.1 discusses and evaluates the features selected using the CFS technique for classification. This is followed, in section 4.2, by the classification results of the individual modalities and their fusion for detecting CL levels with and without affective interference. CL classification results are also presented for the various arousal intensity levels. Because eye calibration failed for 5 subjects, results are presented for the remaining 15 subjects.
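For reference, the Friedman test reported above can be reproduced with the Statistics Toolbox friedman function, assuming the self-reports are arranged as a subjects-by-difficulty matrix (the variable name ratings is an assumption for illustration).

```matlab
% ratings: [20 x 5] matrix, one row per subject and one column per task
% difficulty level, each entry the subject's mean self-reported rating.
[p, tbl, stats] = friedman(ratings, 1, 'off');   % 1 observation per cell, no plot
chi2 = tbl{2, 5};   % chi-square statistic (table layout per the Statistics Toolbox)
% A significant p (< 0.01 here) indicates an effect of difficulty on the ratings.
```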

4.1 Feature Evaluation

The fusion model has features from all the individual modalities; it is therefore important to evaluate which features are most useful for CL detection. Figure 4 shows the names of the features selected by the CFS technique (described in section 3.3) for the various CL level types (2L, 3L, 5L), with and without affective interference. The circle represents features that were common to the three CL level types both with and without affective interference, including pupil and performance related features. The oval shapes represent features that were common among the three CL level types. The features in the dotted rectangles are those common between two CL level types, and the features inside the solid rectangles are those unique to a particular level type. Features from all modalities contributed to CL detection in the presence of affect, i.e. under affective interference. However, face features made no contribution to CL detection in the absence of affect (no affective interference). Furthermore, features from all physiological channels and from perf contributed to the two- and three-level CL schemes, while only features from SC, pupil and perf were present for five levels of CL.

Figure 4. Features selected (CFS based technique) from different CL level types, with and without affective interferences

4.2 Classification

The classification task (i.e. detecting CL levels or classes) produced different baseline accuracies for the different CL level types (2L, 3L, 5L), based on the class distributions (i.e. labels) in the training sample. The baseline classification accuracy (ZeroR, which always predicts the majority class and therefore equals the largest class proportion) was 60%, 40% and 20% for 2L, 3L, and 5L of CL respectively; for example, three of the five equally frequent difficulty levels map to the low class in the 2L scheme, giving a 60% baseline. The error bars in Figures 5, 6, and 8 represent the baseline accuracies. The classification results are presented for the individual modalities (face, physio, perf) and their fusion (fusion) using the Vote classifier described in section 3.4. The features presented in Figure 4 were used for classification to evaluate CL detection with and without the interference of affect.

Figure 5 shows the classification accuracy of the individual modalities and their fusion for detecting 2L, 3L, and 5L of CL under affective interference. The individual modalities and their fusion all achieved accuracies above the baseline. Fusion showed an improvement over the individual modalities for 2L (77%) and 3L (59%) but slightly lower accuracy than face for 5L (36%). For 2L and 3L, perf exhibited the highest accuracy among the individual modalities, 76% and 58% respectively. However, for 5L, face showed the highest accuracy (38%).

Figure 5. Classification accuracy and baseline accuracy (error bar) of individual modalities and their fusion for detecting various levels of CL under affective interferences.

Figure 6 shows the classification accuracy of the individual modalities and their fusion for detecting 2L, 3L, and 5L of CL without any affective interference. Fusion improved accuracy over all individual modalities for the three CL level types (90%, 67%, and 37% respectively). The absence of affect thus yielded 13% and 8% higher fusion accuracy for 2L and 3L respectively. The accuracy of physio was the highest among the individual modalities for 2L (83%) and 3L (63%) but slightly lower than perf for 5L (31%). The accuracy of face was the lowest, and even lower than the baseline, in this scenario.

Figure 6. Classification accuracy and baseline accuracy (error bar) of individual modalities and their fusion for detecting various levels of CL without any affective interference

The analysis from Figures 5 and 6 demonstrates that physio, perf, and fusion exhibit higher CL detection accuracies for 2L and 3L when there is no affective interference. However, the accuracy of face falls drastically in the absence of affective components. The geometric and chromatic features used in this analysis have been reported to be well suited for affect detection (Monkaresi, Calvo, & Hussain, 2012; Tian et al., 2002), which may explain why face performed better under affective interference. Our previous study (S. Hussain et al., 2011) also demonstrated that physiology and task performance have similar accuracy for detecting two CL levels. Physiology proved to be a good modality for detecting CL when there was no affective interference. The analysis can be taken further by looking at classification accuracies during the discrete arousal intensities (Figures 7 and 8). Figure 7 illustrates the accuracy of detecting 2L, 3L, and 5L of CL from the individual modalities during discrete intensities of arousal (highAr, medAr, lowAr, noAr). Physio and perf showed a similar trend in accuracy from lowAr to highAr and exhibited their highest accuracy for noAr. Face showed more variation, with accuracy increasing from highAr to lowAr; however, its accuracy for noAr was lower than the baseline.

Figure 7. Classification accuracy of detecting 2L, 3L, and 5L of CL from face, physio and perf during discrete degrees of arousal (highAr, medAr, lowAr, noAr).

Figure 8 presents the accuracy of detecting 2L, 3L, and 5L of CL from fusion during discrete intensities of arousal (highAr, medAr, lowAr, noAr). The lower bound of the error bar represents the baseline accuracy. The accuracy increased almost linearly from highAr to noAr for 2L. This trend was also observed for 3L and 5L, but only from highAr to lowAr: the accuracy for noAr was slightly higher than for highAr but lower than for medAr and lowAr for 3L, and was the lowest for 5L. Comparing the results in Figures 7 and 8, it is clear that fusion exhibited higher accuracy than the individual modalities for the various intensity levels of arousal (Table 4). Despite its poor accuracy on its own, face contributed to higher accuracies in fusion with the other modalities for detecting CL under affective influence. Physio and perf were the most suitable modalities for detecting CL levels when there was no affective influence.

Figure 8. Classification accuracy and baseline accuracy (error bar) of detecting 2L, 3L, and 5L of CL from fusion during discrete degrees of arousal (highAr, medAr, lowAr, noAr).

Table 4. Percentage improvement by fusion over the best individual modality for the various intensities of arousal.

CL / Ar     highAr        medAr        lowAr        noAr
2 Levels    perf (-3%)    perf (0%)    face (6%)    physio (7%)
3 Levels    face (6%)     face (6%)    face (8%)    physio (4%)
5 Levels    face (3%)     face (9%)    face (3%)    physio (3%)

5. Conclusion

Self-reports are accurate measures of CL, but they are less practical in naturalistic HCI. Performance is also a good indicator of CL, but it is application dependent. Detecting CL from behavioral and physiological modalities is extremely challenging and still controversial; results may therefore vary from one study to another depending on the nature of the experiments, the modalities used for collecting data, the number of classes, and the computational techniques applied. In this study, we have considered a variety of modalities and their fusion to investigate automatic detection of different CL level types. The study has identified the strengths and weaknesses of the modalities in various scenarios, with and without the interference of affective components, and shown how multimodal fusion can enhance performance.

The study showed that CL detection becomes more challenging and less accurate under affective interference. Furthermore, automatic CL detection was less accurate during higher intensities of arousal. Features from face video, physiology, task performance and their fusion performed above the baseline during affective interference, with up to 77%, 59% and 38% accuracy for detecting two, three, and five levels of CL respectively. Higher detection accuracies were achieved when there was no affective interference during the task (except for face, which performed below baseline for all CL level types), with up to 90% and 67% accuracy for 2L and 3L respectively (achieved by fusion). The face features, which had not previously been explored in CL research, proved to be very useful for automatic CL detection under affective interference. Pupil features and reaction time were common and robust in CL classification both with and without affective interference and for the three types of CL levels. The face modality showed variation across the different intensity levels of arousal, with CL detection accuracy increasing from high to low arousal. Physiology and task performance showed less variation from high to low arousal and performed lower than face; however, they both outperformed face when there was no affective interference. Fusion showed improvement over the individual modalities during the various intensity levels of arousal; however, increasing arousal intensity decreased CL detection accuracy.

Although the analysis in this study is based on a systematic experimental setup (task and stimulus presentation), experiments still have to be conducted for realistic scenarios; for now, the work is limited to a controlled setup as an effort to evaluate and investigate CL detection under affective interference. Real-world applications, which intend to achieve optimal human and machine performance concurrently in order to avoid errors and hazards during critical tasks, need to understand the modalities, the computational techniques, and the human psychological factors that may influence the accuracy of automatic CL detection. This paper, with the support of some important modalities and popular machine learning methods, identifies sensors, modalities, and features that may be suitable for building systems for automatic CL detection.

References

Ark, W. S., Dryer, D. C., & Lu, D. J. (1999, Aug. 22-26). The emotion mouse. Paper presented at the 8th International Conference on Human-Computer Interaction: Ergonomics and User Interfaces, Munich, Germany. Ayres, P., & Gog, T. (2009). Editorial: State of the art research into Cognitive Load Theory. Computers in Human Behavior, 25(2), 253-257. Beatty, J., & Lucero-Wagoner, B. (2000). The pupillary system. Handbook of psychophysiology (2nd ed.), Cambridge University Press, xiii, 1039, 142-162. Berka, C., Levendowski, D. J., Lumicao, M. N., Yau, A., Davis, G., Zivkovic, V. T., . . . Craven, P. L. (2007). EEG correlates of task engagement and mental workload in vigilance, learning, and memory tasks. Aviation, Space, and Environmental Medicine, 78(5), B231-B244. Blair, K. S., Smith, B. W., Mitchell, D. G. V., Morton, J., Vythilingam, M., Pessoa, L., . . . others. (2007). Modulation of emotion by cognition and cognition by emotion. Neuroimage, 35(1), 430-440. Bradley, M. M., Miccoli, L., Escrig, M. A., & Lang, P. J. (2008). The pupil as a measure of emotional arousal and autonomic activation. Psychophysiology, 45(4), 602-607. Calvo, R. A., & D'Mello, S. (2010). Affect detection: An interdisciplinary review of models, methods, and their applications. IEEE Transactions on Affective Computing, 1(1), 18-37. Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates. Chen, F., Ruiz, N., Choi, E., Epps, J., Khawaja, M. A., Taib, R., . . . Wang, Y. (2012). Multimodal behaviour and interaction as indicators of cognitive load. ACM Transactions on Interactive Intelligent Systems, 2(4). Chen, S., & Epps, J. (2012). Automatic classification of eye activity for cognitive load measurement with emotion interference. Computer Methods and Programs in Biomedicine, pii: S0169- 2607(12)00283-0. doi: 10.1016/j.cmpb.2012.10.021. Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24(1), 87-114. D’Esposito, M., Postle, B. R., Jonides, J., & Smith, E. E. (1999). The neural substrate and temporal dynamics of interference effects in working memory as revealed by event-related functional MRI. Proceedings of the National Academy of Sciences, 96(13), 7514-7519. de Tjerk, E. G., Henryk, F. R. A., & Neerincx, M. A. (2010). Adaptive automation based on an object- oriented task model: Implementation and evaluation in a realistic C2 environment. Journal of Cognitive Engineering and Decision Making, 4(2), 152-182. Duric, Z., Gray, W. D., Heishman, R., Li, F., Rosenfeld, A., Schoelles, M. J., . . . Wechsler, H. (2002). Integrating perceptual and cognitive modeling for adaptive and intelligent human-computer interaction. Proceedings of the IEEE, 90(7), 1272-1289. Erthal, F. S., de Oliveira, L., Mocaiber, I., Pereira, M. G., Machado-Pinheiro, W., Volchan, E., & Pessoa, L. (2005). Load-dependent modulation of affective picture processing. Cognitive, Affective, & Behavioral Neuroscience, 5(4), 388-395. Gopher, D., & Braune, R. (1984). On the psychophysics of workload: Why bother with subjective measures? Human Factors: The Journal of the Human Factors and Ergonomics Society, 26(5), 519- 532. Gütl, C., Pivec, M., Trummer, C., García-Barrios, V. M., Mödritscher, F., Pripfl, J., & Umgeher, M. (2005). Adele (adaptive e-learning with eye-tracking): Theoretical background, system architecture and application scenarios. European Journal of Open, Distance and E-Learning, 2. 
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., & Witten, I. H. (2009). The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter, 11(1), 10-18. Hall, M. A. (1999). Correlation-based feature selection for machine learning. The University of Waikato. Hockey, G. R. J. (2003). Operator Functional State as a Framework for the Assessment of Performance Degradation Operator Functional State: The Assessment and Prediction of Human Performance Degradation in Complex Tasks (pp. 8-23). Amsterdam: IOS. Hollender, N., Hofmann, C., Deneke, M., & Schmitz, B. (2010). Integrating cognitive load theory and concepts of human-computer interaction. Computers in Human Behavior, 26(6), 1278-1288. Huang, W.-H. D., & Tettegah, S. (2010). Cognitive load and empathy in serious games: A conceptual framework. Gaming and Cognition: Theories and Practice from the Learning Sciences, 137-151. Hussain, M. S., Monkaresi, H., & Calvo, R. A. (2012, Dec. 5-7). Combining classifiers in multimodal affect detection. Paper presented at the The Australian Data Mining Conference (AusDM 2012), Sydney, Australia Hussain, S., Chen, S., Calvo, R. A., & Chen, F. (2011, Nov. 17). Classification of cognitive load from task performance & multichannel physiology during affective changes. Paper presented at the MMCogEmS: Inferring Cognitive and Emotional States from Multimodal Measures, ICMI 2011 Workshop, Alicante, Spain. Immordino-Yang, M. H., & Damasio, A. (2007). We feel, therefore we learn: The relevance of affective and social neuroscience to education. Mind, Brain, and Education, 1(1), 3-10. Jong, T. (2010). Cognitive load theory, educational research, and : some food for thought. Instructional Science, 38(2), 105-134. Kalyuga, S. (2011). Cognitive load in adaptive multimedia learning. In R. A. Calvo & S. K. D'Mello (Eds.), New Perspectives on Affect and Learning Technologies (Vol. 3, pp. 203-215): Springer New York. Kirschner, P. A., Ayres, P., & Chandler, P. (2011). Contemporary cognitive load theory research: The good, the bad and the ugly. Computers in Human Behavior, 27(1), 99-105. Klingner, J., Tversky, B., & Hanrahan, P. (2010). Effects of visual and verbal presentation on cognitive load in vigilance, memory, and arithmetic tasks. Psychophysiology, 48(3), 323-332. Kramer, A. F. (1991). Physiological metrics of mental workload: A review of recent progress (pp. 279- 328). San Diego, CA, USA: Navy Personnel Research and Development Center. Kuncheva, L. I. (2004). Combining pattern classifiers: Methods and algorithms: Wiley-Interscience. Lang, P. J., Bradley, M. M., & Cuthbert, B. N. (2005). International affective picture system (IAPS): Technical manual and affective ratings Technical Report A-6: University of Florida, Gainesville, FL. Le, P. N., Ambikairajah, E., Epps, J., Sethu, V., & Choi, E. H. C. (2011). Investigation of spectral centroid features for cognitive load classification. Speech Communication, 53(4), 540-551. Le, P. N., Epps, J., Choi, E. H. C., & Ambikairajah, E. (2010, Aug. 23-26). A study of voice source and vocal tract filter based features in cognitive load classification. Paper presented at the International Conference on Pattern Recognition (ICPR’10), Istanbul, Turkey. Levens, S. M., & Phelps, E. A. (2008). Emotion processing effects on interference resolution in working memory. Emotion, 8(2), 267-280. Monkaresi, H., Calvo, R. A., & Hussain, M. S. (2012, May 21-25). Automatic natural expression recognition using head movement and skin color features. 
Paper presented at the International Conference on Advanced Visual Interfaces, AVI 2012, Capri Island (Naples), Italy. Nguyen, T., Li, M., Bass, I., & Sethi, I. K. (2005, Dec. 12-14 ). Investigation of combining SVM and Decision Tree for emotion classification. Paper presented at the Seventh IEEE International Symposium on Multimedia, Irvine, California, USA. Odonnell, R., & Eggemeier, F. (1986). Workload assessment methodology. Handbook of Perception and Human Performance, 2, 42-41. Paas, F., Tuovinen, J. E., Tabbers, H., & Van Gerven, P. W. M. (2003). Cognitive load measurement as a means to advance cognitive load theory. Educational Psychologist, 38(1), 63-71. Paas, F. G. W. C., & Van Merriënboer, J. J. G. (1994). Variability of worked examples and transfer of geometrical problem-solving skills: A cognitive-load approach. Journal of , 86(1), 122-133. Pantic, M., & Rothkrantz, L. J. M. (2003). Toward an affect-sensitive multimodal human-computer interaction. Proceedings of the IEEE, 91(9), 1370-1390. Picard, R. W., Vyzas, E., & Healey, J. (2001). Toward machine emotional intelligence: Analysis of affective physiological state. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 1175-1191. Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161–1178. Schultz, R., Peter, C., Blech, M., Voskamp, J., & Urban, B. (2007). Towards detecting cognitive load and emotions in usability studies using the RealEYES framework. Usability and Internationalization. HCI and Culture, 4559, 412-421. Seewald, A. K. (2003, Aug. 09-15). Towards a theoretical framework for ensemble classification. Paper presented at the 18th International Joint Conference on Artificial Intelligence, Acapulco, Mexico. Sharma, R., Pavlovic, V. I., & Huang, T. S. (1998). Toward multimodal human-computer interface. Proceedings of the IEEE, 86(5), 853-869. Shi, Y., Ruiz, N., Taib, R., Choi, E., & Chen, F. (2007, 28 April - 3 May). Galvanic skin response (GSR) as an index of cognitive load. Paper presented at the CHI'07 Extended Abstracts on Human Factors in Computing Systems, San Jose, California, USA. Siegle, G. J., Ingram, R. E., & Matt, G. E. (2002). Affective interference: An explanation for negative attention biases in dysphoria? Cognitive Therapy and Research, 26(1), 73-87. Sperandio, J. C. (1978). The regulation of working methods as a function of work-load among air traffic controllers. Ergonomics, 21(3), 195-202. Stanners, R. F., Coulter, M., Sweet, A. W., & Murphy, P. (1979). The pupillary response as an indicator of arousal and cognition. Motivation and Emotion, 3(4), 319-340. Tian, Y., Kanade, T., & Cohn, J. F. (2002, May 19-20). Evaluation of Gabor-wavelet-based facial action unit recognition in image sequences of increasing complexity. Paper presented at the 5th IEEE International Conference on Automatic Face and Gesture Recognition, Washington D.C., USA. Utthara, M., Suranjana, S., Sukhendu, D., & Pinaki, C. (2010). A survey of decision fusion and feature fusion strategies for pattern classification. IETE Technical Review, 27(4), 293-307. Viola, P., & Jones, M. (2001, Dec. 8-14). Rapid object detection using a boosted cascade of simple features. Paper presented at the Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2001), Kauai, HI, USA. Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habit- formation. Journal of Comparative Neurology and Psychology, 18(5), 459-482. 
Yin, B., Chen, F., Ruiz, N., & Ambikairajah, E. (2008, 31 March - 04 April). Speech-based cognitive load monitoring system. Paper presented at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008) Las Vegas, Nevada, USA.