Investigating Executive Functioning in Everyday Life using an Ecologically Oriented Virtual Reality Task

by

Diana Jovanovski

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Department of Psychology University of Toronto

© Copyright by Diana Jovanovski, 2010

Investigating Executive Functioning in Everyday Life using an Ecologically Oriented Virtual Reality Task

Diana Jovanovski

Doctor of Philosophy

Department of Psychology University of Toronto

2010

Abstract

Commonly employed executive function measures may be of limited use due to their modest ecological validity. A novel task was developed - the Multitasking in the City Test (MCT) - in an attempt to improve ecological validity. The MCT involves task demands that resemble the demands of everyday activities. In study one, healthy participants were recruited in order to explore ‘normal’ performance on the MCT and its relationship with other cognitive measures.

The MCT showed poor associations with executive tests and significant correlations with non-executive tests. This suggested that the MCT may evaluate executive functioning in a different way from other executive measures, such that it does not simply measure component executive processes but the integration of these components into meaningful behaviour. Patients with stroke and traumatic brain injury were recruited for study two to further explore the MCT’s ecological validity and performance characteristics. Only the MCT and a semantic fluency task demonstrated good ecological validity via significant relationships with a behavioural rating scale. Patients and normals made qualitatively similar types of errors, although patients made these errors more frequently. Patients demonstrated better planning ability but completed fewer tasks than normals on the MCT; this discrepancy was attributed to impaired initiation. In study three, the MCT and verbal fluency tasks were administered to brain-injured individuals both pre- and post-executive function rehabilitation to evaluate their utility as treatment outcome measures and to assess ecological validity via a different behavioural rating scale from the one used in study two. Strategies trained during treatment generalized to MCT but not verbal fluency performance. Both MCT and semantic fluency performance were found to have good ecological validity. Overall, the findings from this research project suggest the MCT and semantic fluency tasks have good ecological validity. They further suggest that several common executive function measures lack adequate ecological validity and may not be predictive of real-world behaviour. Moreover, these results support the concept of an executive function ‘system’ that can be fractionated into a variety of executive processes, and that impairments in one process (e.g., initiation) can exist alongside intact functioning in other processes (e.g., planning).


Acknowledgments

I am incredibly grateful for the invaluable support and encouragement of my supervisor, Konstantine Zakzanis. Without his mentorship and intellectual guidance, this work would not have been possible. Equally important, I consider him a close personal friend and hope to continue collaborating with him on research projects and clinical endeavours in the years to come. I also wish to thank my other committee members, David Nussbaum and Suzanne Erb, for their guidance and support over the past several years. They are both bright and caring academics who truly fostered a rich, intellectually stimulating experience for me throughout this challenging yet rewarding journey.

I am immensely appreciative of the help of several individuals in the development of this thesis. Specifically, I offer my heartfelt gratitude to Lesley Ruttan, Amery Treble, Dmytro Rewilak, Douglas Bors, Darren Kadis, and Douglas Garrett. For their friendship, understanding, and encouragement, I am also deeply appreciative of my dear friends Shirin Montazer and Tharsni Kankesan. I have relied on them as a constant source of advice and wisdom, both personally and professionally, and value their opinions dearly.

I wish to thank Zachariah Campbell as well as his caring family for their encouragement in my pursuit of a doctoral degree. Zac’s unwavering support and confidence in me has provided an unparalleled source of strength and motivation throughout this journey. Most importantly, he has been a best friend and partner and I am extraordinarily fortunate to have him in my life.

Finally, I am eternally indebted to my family, and especially my parents Tanas and Lubenka, for their love and unconditional support. My mother’s strength and selfless devotion to her children are a true inspiration. My father, with all of his remarkable accomplishments, has been and always will be a most admirable role model for me in all realms of life. His dedication and tireless commitment to his family, ideals, and passions are characteristics I intend to emulate throughout my life.


Table of Contents

Abstract……………………………………………………………………………………………ii

Acknowledgments.…………………………………………………………………………….....iv

Table of Contents……………………………………………………………………………...….vi

List of Tables……………………………………………………………………………………viii

List of Figures………………………………………………………………………………...…...x

List of Appendices………………………………………………………………………………..xi

Chapter 1: Introduction……………………………………………………………………………1

Chapter 2: Development and Performance Characteristics of an Ecologically Oriented Executive Function Measure in Healthy Adults…………………………………..…28

Abstract…………………………………………………………………………………..28

Introduction………………………………………………………………………………29

Method…………………………………………………………………………………...34

Results……………………………………………………………………………………44

Discussion………………………………………………………………………………..52

Chapter 3: Ecologically Valid Assessment of Executive Functions in Patients with Acquired Brain Injury……………………..………………………………..…..59

Abstract…………………………………………………………………………………..59

Introduction……………………………………………………………………………....60

Method…………………………………………………………………………………..62

Results…………………………………………………………………………………...73

Discussion……………………………………………………………………………….81


Chapter 4: Evaluating the use of the MCT and Verbal Fluency Tasks as Outcome Measures for Goal Management Training………….………………………………..…………92

Abstract…………………………………………………………………………………..92

Introduction……………………………………………………………………...... 93

Method…………………………………………………………………………………...98

Results…………………………………………………………………………………..103

Discussion………………………………………………………………………………110

Chapter 5: General Discussion………………………………………………………..………...124


List of Tables

Chapter 2

Table 2.1: Demographic characteristics of sample……………………………...... 170

Table 2.2: Neuropsychological test performance of sample …………………………...171

Table 2.3: Qualitative errors made by normal participants on the MCT……………….172

Table 2.4: Correlations between MCT measures……………………………………….173

Table 2.5: Correlations between the MCT and cognitive tests…………………………174

Chapter 3

Table 3.1: Description of brain lesion site(s) of each patient…………………………..175

Table 3.2: Demographic characteristics of sample……………………………………..176

Table 3.3: Neuropsychological test performance and FrSBe scores of sample……………………………………………………………………….177

Table 3.4: Qualitative comparison of errors made by the patient and normal sample on the MCT…………………………………………………………178

Table 3.5: Correlations between MCT measures……………………………………….179

Table 3.6: Correlations between the MCT and other executive function tests…………180

Table 3.7: Correlations between the MCT and non-executive tests……………………181

Table 3.8: Correlations between executive tests and FrSBe – family ratings………….182

Chapter 4

Table 4.1: Neuropsychological test performance and questionnaire results of sample pre- and post-intervention…………………………………………...183

Table 4.2: Qualitative comparison of errors made by the sample on the MCT pre- and post-intervention…………………………………………………..184

Table 4.3: Effect sizes of neuropsychological test and questionnaire scores from pre- to post-intervention……………………………………………….185

Table 4.4: Demographic characteristics of normal sample……………………………..186


Table 4.5: MCT performance on two administrations of the task in a normal sample……………………………………………………………….187

Table 4.6: Effect sizes of MCT scores in patient and normal samples…………………188


List of Figures

Chapter 2

Figure 2.1: Bird’s eye view map of the virtual environment for the Multitasking in the City Test…………………………………………………………….189

Figure 2.2: Screen shots of the Multitasking in the City Test………………………….190

Chapter 4

Figure 4.1: Ratings on the Dysexecutive Questionnaire at pre- and post-intervention……………………………………………………………191

Figure 4.2: Scatter plots of the relationship between the Dysexecutive Questionnaire-Other Executive Function scale (DEX-I EF) and executive function measures…………………………………………..194


List of Appendices

Appendix A: Abbreviations List………………………………………………………………..199

Appendix B: Multitasking in the City Test (MCT) Instructions………………………………..201

Appendix C: Computer Skills Questionnaire…………………………………………………..203

Appendix D: Dysexecutive Questionnaire (DEX) Executive Function (EF) Scale……………204


Chapter 1: Introduction

As the most complex faculties of the human brain, the executive functions are fundamental to independent, purposive, and necessary behaviours and are often ascribed to the prefrontal cortex. One of the most extensive descriptions of the executive functions was articulated by the Russian neuropsychologist Alexander Luria (1966), who observed that patients with damage to the frontal lobes often presented with intact speech and motor and sensory abilities, yet complex psychological abilities were severely impaired. He noted that these patients often had difficulty carrying out complex, purposive, and goal-directed behaviours and were often unable to accurately evaluate the success or failure of their behaviours. They were particularly impaired in terms of their ability to use the information at hand to modulate future behaviour. Luria described these patients as unconcerned with their failures, indecisive, hesitant, indifferent, and lacking awareness of their own behaviours.

Neuropsychological Conceptualizations of Executive Functioning

Various conceptualizations have been developed since the time of Luria’s observations in an attempt to articulate the neuropsychology of executive functions. One of the oldest and most well-known conceptualizations is Baddeley and Hitch’s (1974) working memory model, which involves slave systems (i.e., the phonological loop and visuo-spatial sketch pad) that are responsible for maintaining stimuli in a number of different forms for their eventual manipulation by the ‘central executive component’. This model has been expanded by Baddeley and co-workers over the years (e.g., see Baddeley, 1998). A similar model that has received considerable attention is Norman and Shallice’s (1986) conceptualization of the control of action. In their model, they describe two control-to-action processes. The first, contention scheduling, is activated by routine situations and involves automatic behaviours. The second is the Supervisory Attentional System (SAS1), which has been likened to Baddeley and Hitch’s central executive component. The SAS plays an important role when the execution of routine and learned behaviours will not suffice, such as in novel or unfamiliar situations in which one cannot rely on established routines (Shallice, 1994). The SAS is thought to be responsible for performing a variety of processes involved in the development of plans and willed actions. A more recent version of the theory advances the model by providing empirical evidence of the fractionation of the SAS into different subprocesses (Shallice & Burgess, 1996). In this framework, a fundamental process for coping with non-routine situations is the development and application of an appropriate strategy, or “temporary new schema”. Strategy generation is described to occur either spontaneously or through a problem solving process. This is followed by careful monitoring of the new schema (or strategy). There are also “special purpose” processes that aid in appropriate strategy generation. One such process is the generation and realization of intentions, “so that one can prepare a strategy and plan action for a later time” (p. 1406). This process was thought to be the essential factor responsible for the impairments evinced in three patients with frontal head injuries, whose difficulties involved carrying out various unrelated subtasks in the absence of a signal from the experimenter as to when to do so, along with problems obeying a set of previously trained simple rules (Shallice & Burgess, 1991). Both of these difficulties were interpreted as being specifically related to problems in the setting up and realization of intentions, consistent with the tenets of the supervisory system framework.

1 See Appendix A for a list of abbreviations found throughout this dissertation.

Fuster (2002) proposed a perception-action cycle in which lateral prefrontal cortex controls at least four cognitive operations: selective attention, working memory, preparatory set, and response monitoring. In this model, prefrontal networks constitute what is called executive memory. Prefrontal functions are placed within the broad framework of the cortical substrate of long-term memory, which consists of various networks involving prefrontal and posterior cortical components. It is proposed that “the most general function of the lateral prefrontal cortex is the temporal organization of behaviour, speech, and reasoning” (p. 99). This assumption is based on neuropsychological data from humans with extensive lesions in this region who demonstrate difficulties in executing plans of behaviour and complex mental operations (Fuster, 1997; Luria, 1966). These deficits are attributed to “impairments in the representation and execution of goal-directed sequences of actions” (p. 99; Fuster, 2002). A critical role of the lateral prefrontal cortex, according to Fuster, is the capacity for temporal integration. In his view, “Temporal integration is essential to temporal organization. Temporal order derives directly from the mediation of cross-temporal contingencies—that is, from the integration of temporally separate units of perception, action and cognition into a sequence toward a goal” (p. 99). All goal-directed tasks are presumed to involve temporal integration and thus performance on such tasks is dependent on the functional integrity of lateral prefrontal cortex. Fuster suggests that the available evidence points to the crucial role of the prefrontal cortex in the selective and orderly activation of cortical networks for temporal integration and organization, likening its role to the “supervisory attentional” process proposed by Norman and Shallice (as described above; see also Shallice, 1988).

Grafman (2002) provides another understanding of executive functions in his characterization of the Structured Event Complex (SEC). This model also centers around the human prefrontal cortex and accounts for the social and emotional functions associated with this region of the brain in addition to the cognitive executive functions. Grafman defines SECs as “a set of events, structured in a particular sequence that as a complex composes a particular kind of activity that is usually goal oriented” (p. 298, Grafman, 2002). An example of an SEC is the individual events involved in the goal of purchasing items from a shopping list. SECs have boundaries that signal their onset and offset and are determined by temporal cues, cognitive cues, or environmental/perceptual cues. They have a particular goal that must be achieved prior to offset of the SEC. The model is based on a key principle involving the ability of prefrontal cortex neurons to sustain their firing and code the temporal and sequential properties of ongoing events in the environment or in the mind (see Fuster, 2002) over progressively longer periods of time. This has allowed human beings to more effectively code, store, and retrieve abstract and complex behaviours through evolutionary processes. According to Grafman, SECs comprise a set of differentiated representational forms (essentially distributed memory units) that are stored in different regions of the human prefrontal cortex. They are activated in parallel to reproduce all the SEC elements of a typical episode. These unique memories represent thematic knowledge, abstractions, morals, concepts, social rules, features of specific events, and grammars for the variety of SECs embodied in actions, stories and narratives, scripts, and schemas. The framework specifies that online processing of an SEC allows for the prediction of subsequent events. Thus, damage to the human prefrontal cortex would result in restricted retrieval of the SEC and lead to day-to-day behavioural impairments, distractibility, and difficulty detecting social and behavioural mistakes in activity sequences.

Stuss and Levine (2002) have defined executive functions as “high-level cognitive functions that are involved in the control and direction of lower-level functions.” Stuss and colleagues have focused their theory of executive functioning on the role of attentional control processes, given the conspicuous role that the attention system plays in various prominent theories of frontal functions (e.g., Mesulam, 1985; Norman & Shallice, 1986). Recently, Stuss and Alexander (2007) have emphasized that ‘executive’ functions are simply one of the discrete categories of functions within the frontal lobes, and that it is the cognitive (as opposed to emotional) aspects of tasks that comprise the executive functions. They have identified three separate frontal “attentional” control processes within the executive function category: energization, defined as the process of initiation and sustaining of any response; task setting, or the ability to set a stimulus-response relationship; and monitoring, the process of checking the task over time for ‘quality control’ and the adjustment of behaviour. They were able to isolate the three processes to delineate a specific brain-behaviour relationship and presented data to demonstrate the distinctiveness of each process. Their theory is supported by data from patients with focal lesions as a method of demonstrating that a region is essential for an attentional process, rather than simply demonstrating that the region is activated (e.g., by way of functional magnetic resonance imaging) during the process. This empirical evidence led them to argue against the idea of an undifferentiated central executive or supervisory system. Instead, they suggest that executive function impairments are better explained as deficits in these functionally and anatomically independent but interrelated “attentional” control processes that are associated with focal circumscribed lesions in discrete regional frontal areas. They note that while the three attentional processes do play a ‘supervisory’ role, such that they control lower-order processes and are domain-general (i.e., they interact with other modular cognitive modalities), their functioning “…can be explained by the flexible assembly of these processes in response to context, complexity and intention over real time into different networks within the frontal regions, and between frontal and posterior regions” (p. 911).

All of the theories described above concur that there is no unitary executive function and propose that these functions can be fractionated into distinct processes. The nature of these executive subprocesses, however, has been variously operationalized within the different conceptualizations presented here, and still other characterizations of these functions exist. The models presented here also have in common the notion that the different executive subsystems are not independent processes; rather, they operate in concert and are integrated with one another (and with posterior cognitive processes) within the intact brain. Finally, these models similarly describe executive functions as playing a supervisory role, in which control is exerted over lower-order cognitive functions across different modalities.

Neuroanatomy of Executive Functions

Executive functions are described as consistently involving prefrontal cortex, and specifically lateral prefrontal cortex, but also implicate neuroanatomical networks between frontal and both posterior and subcortical regions. Indeed, the evidence from the research literature supports this systems view, in which executive functions result from the interplay of diverse cortical and subcortical neural systems (Cummings, 1993; Gazzaniga, Ivry, & Mangun, 1998; Tekin & Cummings, 2002). Animal studies, neuroimaging investigations, and neuropsychological lesion research all provide support for this systems perspective. In the animal literature, tracing methods have identified numerous closed and open segregated loops connecting prefrontal and subcortical structures (Alexander, Crutcher, & DeLong, 1990; Joel & Weiner, 1994), demonstrating the complex interconnectivity of these regions. Similarly, functional neuroimaging research of normal controls performing executive function tasks consistently yields activation in the anterior cingulate cortex, the posterior neocortex, the striatum, and the cerebellum, in addition to the prefrontal cortex (D’Esposito et al., 1995; Dove, Pollmann, Schubert, Wiggins, & von Cramon, 2000; Konishi et al., 1998; Taylor, Kornblum, Lauber, Minoshima, & Koeppe, 1997). Finally, lesion studies have demonstrated frequent executive impairments in patients with damage to brain regions outside of the prefrontal cortex (Elias & Treland, 1999; Godefroy et al., 1992; Kramer, Reed, Mungas, Weiner, & Chui, 2002). For example, Kramer and colleagues found that non-demented patients with subcortical lacunar infarcts exhibited subtle cognitive impairments when compared to control subjects, particularly evident as declines in executive functioning. In sum, there is overwhelming evidence indicating that the neural systems thought to underlie executive functioning include reciprocal projections between the prefrontal cortex and posterior brain regions as well as subcortical structures including the cerebellum, thalamus, and basal ganglia.

Taken together, while the frontal lobes do play a vital role in behavioural regulation, it is inaccurate to equate the anatomical term “frontal” with the functional term “executive”. Indeed, Baddeley (1998) argued against the tendency to use these terms synonymously on the following grounds:

• Executive processes need not be unitary;

• The frontal lobes represent a large and multifaceted area of the brain, which is unlikely to be unitary in function;

• Executive processes are likely to involve links between different parts of the brain and hence are unlikely to be exclusively associated with a frontal location;

• Patients may conceivably have executive deficits without clear evidence of frontal damage; and

• Patients with frontal lesions will not always show executive deficits.

The Dysexecutive Syndrome

Baddeley (1986) proposed the term Dysexecutive Syndrome (DES) to describe patients with executive function deficits and to demarcate such deficits from their anatomical location. The DES has been described as involving a constellation of deficits in executive functions including planning, problem solving, multitasking, and behavioural control (Burgess, Veitch, de Lacy Costello, & Shallice, 2000; Mateer, 1999; Shallice & Burgess, 1991). According to Baddeley, these impairments can be explained as a failure at the level of the SAS. Typically, individuals who are labeled as suffering from the DES present with difficulty initiating and stopping activities, problems with mental and behavioural shifts, increased distractibility, and impairments in learning novel tasks even though their basic cognitive functions are intact (Shallice & Burgess, 1991).

The concept of a DES has been a source of debate in the literature. This debate may stem from the terminology, specifically the use of the word “syndrome”, as it implies that patients with executive function deficits present in a similar manner with the same symptomatology. Stuss and Alexander (2007) firmly oppose this notion and believe the concept of an undifferentiated ‘dysexecutive’ disorder is faulty. As illustrated in the first section of this review, the theories of Stuss and several other prominent conceptualizations in the executive function literature provide evidence of the fractionation of executive functions. That is, numerous sources of evidence support the notion of an executive system comprised of a number of processes or modules that may be selectively impaired in any patient, thus resulting in unique types of behavioural and cognitive sequelae (e.g., see Damasio, 1996; Shallice & Burgess, 1991; Stuss et al., 1995; Stuss & Alexander, 2007). This is consistent with the fact that several neural systems are involved in executive function processes and numerous sources of damage to the brain can lead to a range of executive impairments. Indeed, impaired executive functioning is a common feature of a multitude of clinical patient groups. Among those evidencing executive dysfunction are individuals who have incurred damage to the brain resulting from stroke (Pohjasvaara, Leskelä, & Vataja, 2002), traumatic brain injury (TBI; Mattson & Levin, 1990; Stuss & Gow, 1992), tumour (Goldstein, Obrzut, John, Ledakis, & Armstrong, 2004), multiple sclerosis (MS; Arnett et al., 1994), meningitis (Schmidt, Heimann, & Djukic, 2006), (Bullock & Voss, 2006), and various psychiatric disorders such as depression and bipolar disorder (e.g., see Elderkin-Thompson, Mintz, Haroon, Lavretsky, & Kumar, 2007 and Mur, Portella, Martinez-Aran, Pifarre, & Vieta, 2007, respectively). Given that such deficits may arise from a multitude of etiologies, it is not entirely unexpected that executive dysfunction is a common neuropsychological signature among individuals with brain damage (Levine et al., 2000). Furthermore, the fact that brain damage can result in various and distinct types of executive deficits discredits the notion of a unitary “dysexecutive syndrome”.

Traditional Tests of Executive Function

Many test measures of executive function have been utilized for several decades to characterize impairment. For example, the Wisconsin Card Sorting Test (WCST; Berg, 1948; Grant & Berg, 1948) is one of the most commonly employed clinical measures used to articulate executive dysfunction. To this end, the task was developed to measure cognitive flexibility (Berg, 1948). It involves matching a target with other stimuli based on a given dimension, namely colour, form, or number, until feedback indicates that the given dimension is no longer correct and an alternative dimension should be chosen, therefore requiring a shift in attentional focus or cognitive set. A “category” is completed after the subject has made 10 consecutive correct responses, prompting a switch to the next sorting category (e.g., from colour to form). Typically, the test is discontinued after a subject has successfully completed six categories or, failing that, after all 128 cards have been sorted. More recently, an abbreviated 64-card version of the WCST was developed, with research indicating that the WCST-64 is a valid, more efficient alternative to the standard version in patients with neurologic and psychiatric disorders (Love, Greve, Sherwin, & Mathias, 2003; Robinson, Kester, Saykin, Kaplan, & Gur, 1991; Sherer, Nick, Millis, & Novack, 2003).

The WCST first earned its reputation as a measure of frontal dysfunction when Milner’s (1963, 1964) studies documented defective performances by patients with prefrontal excisions to the dorsolateral frontal regions when compared to patients with excisions to the orbitofrontal or posterior cortex. It was assumed that because these patients had difficulty with the task, dorsolateral prefrontal lesions result in an inability “to shift from one sorting principle to another, apparently due to perseverative interference from previous modes of response” (p. 99, Milner, 1963). Several subsequent investigations of various clinical populations using functional neuroimaging instrumentation have supported these early findings by demonstrating that the WCST is sensitive to frontal lobe function (Arnett et al., 1994; Rezai et al., 1993; Stuss et al., 2000; Weinberger et al., 1988). However, additional evidence has accumulated since the time of Milner’s studies that suggests that other brain regions are also involved in performing the task (e.g., Anderson, Damasio, Jones, & Tranel, 1991; Barcelo, 2001). For instance, patients with frontal lobe pathology are not always impaired on this test while some patients with damage beyond the frontal cortex perform poorly on the WCST (Anderson et al., 1991). Moreover, it has been shown that patients with frontal lobe pathology do not always differ from control subjects on the WCST (Stuss et al., 1983). Impairments on the WCST have also been documented in a variety of conditions associated with executive dysfunction, including but not limited to TBI (Anderson, Bigler, & Blatter, 1995), stroke (Su, Wuang, Chang, Guo, & Kwan, 2006), MS (Arnett et al., 1994), Parkinson’s disease (Green et al., 2002), schizophrenia (Beatty, Jocic, Monson, & Katzung, 1994; Heinrichs & Zakzanis, 1998), obsessive-compulsive disorder (Lacerda et al., 2003), and chronic cocaine and polydrug use (Jovanovski, Erb, & Zakzanis, 2005; Rosselli & Ardila, 1996).
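The administration rules just described (an unannounced dimension switch after 10 consecutive correct sorts; discontinuation after six completed categories or 128 cards) can be sketched in code. This is an illustrative simplification for exposition only, not the clinical scoring procedure; representing each response simply as the dimension the subject sorted by is an assumption of the sketch.

```python
# Illustrative sketch of WCST administration logic (not the clinical scoring rules).
DIMENSIONS = ("colour", "form", "number")
RUN_LENGTH = 10      # consecutive correct responses that complete a category
MAX_CATEGORIES = 6   # discontinue after six completed categories...
DECK_SIZE = 128      # ...or after all 128 cards have been sorted


def run_wcst(responses):
    """Score a sequence of responses, each given as the dimension sorted by.

    Returns (categories_completed, cards_used, errors). The dimension
    switch is unannounced: once a category completes, further sorts by
    the old dimension count as errors (perseverative-style responding).
    """
    current = 0          # index into DIMENSIONS; colour is the first criterion
    run = 0              # current streak of consecutive correct sorts
    categories = errors = cards = 0
    for choice in responses:
        cards += 1
        if choice == DIMENSIONS[current]:
            run += 1
            if run == RUN_LENGTH:            # category completed
                categories += 1
                run = 0
                current = (current + 1) % len(DIMENSIONS)
                if categories == MAX_CATEGORIES:
                    break
        else:
            errors += 1
            run = 0                          # streak broken by an error
        if cards == DECK_SIZE:
            break
    return categories, cards, errors
```

A subject who tracks every shift finishes in 60 cards; a subject who keeps sorting by colour after the first shift exhausts the deck with one category completed.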

Several other executive function test measures have also been associated with variable findings in terms of neuroimaging evidence and heterogeneity of performance across different patient groups. For instance, Tower tests are among the most commonly administered executive tasks in research and clinical settings. These tests consist of several pegs that hold a number of items (e.g., beads or rings) in a particular array. An examiner presents the subject with a stimulus card of a desired pattern, and it is the responsibility of the subject to achieve that particular design as efficiently as possible (i.e., in the least number of moves). In the case of the Tower of London (TOL; Shallice, 1982), successive trials require an increasing number of minimum moves to achieve the goal state. Tower tests have generally been regarded as measures of planning ability (e.g., see Lezak, 2004), although the extent to which actual careful planning is involved in performing these tasks may be quite limited (Miyake et al., 2000). Researchers have shown that frontal lobe damage is a good predictor of TOL performance (e.g., Owen, Downes, Sahakian, Polkey, & Robbins, 1990; Owen, Sahakian, Semple, Polkey, & Robbins, 1995). TOL performance in frontal lobe dementia and frontal lesion patients has been shown to be similar and both types of patients have performed significantly worse than matched controls (Carlin et al., 2000). Several recent studies, however, have failed to discriminate between patients with strictly frontal lesions and controls using this test (Andres & van der Linden, 2001; Cockburn, 1995). Various patient populations have been found to be impaired on the TOL, including patients with Huntington’s disease (Watkins et al., 2000; Zakzanis, Leach, & Kaplan, 1999).
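Because TOL difficulty is graded by the minimum number of moves to the goal state, the planning demand can be made concrete with a breadth-first search over legal moves. This is a sketch under assumptions: the peg capacities (3, 2, 1) follow the classic three-ball version of the task, and the state encoding is my own, not part of any published scoring procedure.

```python
from collections import deque

# Peg capacities in the classic three-ball Tower of London (assumed here).
CAPACITY = (3, 2, 1)


def min_moves(start, goal):
    """Minimum number of single-ball moves from `start` to `goal`.

    A state is a tuple of three tuples, one per peg, listing balls from
    bottom to top, e.g. (("r", "g", "b"), (), ()). Only the top ball of
    a peg may move, and only onto a peg with spare capacity -- the
    constraint that makes the task a planning problem.
    """
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        state, depth = queue.popleft()
        if state == goal:
            return depth
        for src in range(3):
            if not state[src]:
                continue                     # nothing to move from this peg
            for dst in range(3):
                if dst == src or len(state[dst]) >= CAPACITY[dst]:
                    continue                 # illegal destination
                pegs = [list(p) for p in state]
                pegs[dst].append(pegs[src].pop())
                nxt = tuple(tuple(p) for p in pegs)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return None  # unreachable for valid three-ball configurations
```

Successive TOL trials would correspond to goal states whose `min_moves` value from the standard start position increases.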

The Trail Making Test (TMT) is a further example of a test measure commonly employed to articulate executive function. It was adapted by Reitan (1955) from the Army

Individual Test Battery and added to the Halstead Battery. There are two parts to the test: Part A

(TMT A) requires the subject to connect with their pencil, in order, 25 encircled numbers that are arranged in a random pattern on a page; Part B (TMT B) requires the subject to connect 25 encircled numbers and letters in alternating order (i.e., 1-A-2-B-3-C…). Although some authors identify the TMT as primarily a measure of attention (e.g., Strauss, Sherman, & Spreen, 2006),

Part B is often classified as an executive function measure in neuropsychological investigations due to its mental flexibility component, specifically attentional switching (e.g., Herman,

2004). Poor performance on TMT B has been associated with frontal lobe dysfunction (Ricker et al., 1996). More recently, however, TMT performance was shown to be associated with frontal and temporal lobe activation (Zakzanis, Mraz, & Graham, 2005) and meta-analytical findings indicated no significant differences between patients with frontal damage and those with damage to posterior brain regions on TMT B (Demakis, 2004). Deficits on TMT B have been found in a variety of clinical groups, such as patients with lead exposure (Stewart et al., 1999), mild

(Zakzanis, Leach, & Kaplan, 1999) and moderate to severe (Millis et al., 2001) TBI, and major depression (Elderkin-Thompson et al., 2003).


Another commonly employed category of executive function test is verbal fluency measures such as the Controlled Oral Word Association Test (COWAT; Benton & Hamsher,

1989; Spreen & Strauss, 1998) or the Verbal Fluency Test from the Delis-Kaplan Executive

Function System (D-KEFS; Delis, Kaplan, & Kramer, 2001). These measures assess the spontaneous production of words under restricted search conditions (Strauss, Sherman, &

Spreen, 2006). In particular, examinees must rapidly generate words according to letter (e.g., F,

A, S) or category (e.g., Animals), or alternate between two different semantic categories. Verbal fluency tasks have also been associated with frontal lobe functioning in addition to a number of other brain regions, including parietal and temporal structures (Henry & Crawford, 2004; Henry,

Crawford, & Phillips, 2004; Stuss et al., 1998) and can be useful in the detection of impairments in various conditions including aphasia (Spreen & Benton, 1977) and dementia (Heun,

Papassotiropoulos, & Jennssen, 1998).

The Evolution of the Field of Neuropsychology

Neuropsychological tests have traditionally been used to detect and localize neuropathology. Historically, the role of the neuropsychologist was to make inferences about the etiology of a given patient’s brain injury on the basis of their profile of cognitive strengths and weaknesses. With the advent and accessibility of neuroimaging techniques that allow for more objective methods of localizing brain injury, clinical neuropsychology necessarily had to evolve from this initial purpose. Fine-grained behavioural deficits are often discernible before a lesion is visible on a structural scan. Consequently, while neuropsychological tests are still employed to aid in the diagnosis of some conditions (e.g., early stage Alzheimer’s disease, Derrer et al.,

2002), the referral questions typically asked of modern day neuropsychologists have largely shifted away from localizing brain damage, to the description of functional impairment resulting

from such damage. Clinical neuropsychologists are now most often asked to comment on a patient’s ability to return to work or school or to perform instrumental activities of daily living such as managing finances and medication, shopping, driving or other transportation needs. In addition, referrals are no longer made exclusively by physicians. Employers, lawyers, and insurance companies now often request the services of the clinical neuropsychologist so as to characterize the breadth and severity of cognitive impairment for disability benefits and rehabilitative purposes.

The incongruity of this shift in focus in neuropsychology is that the tests utilized to answer these new types of questions have not evolved along with the field. That is, the same tests that were originally developed for experimental research studies addressing questions related to localization of brain function or injury are the exact measures currently being used to address clinical questions concerning real world functioning. This is despite the fact that there is little empirical evidence that these tests adequately answer questions about everyday functioning.

Clinical neuropsychologists are often called upon to answer questions with potentially life-altering consequences for their patients. In the field of forensic neuropsychology, for example, Chaytor and Schmitter-Edgecombe (2003) discuss how the statements made by a neuropsychologist about a client’s functioning after brain injury often weigh heavily in issues of compensation. As such, it is of chief importance to all parties involved that an accurate determination of the functional sequelae of the brain injury is made. If the tools that are utilized to do so lack sufficient evidence to support the inferences made on the basis of test results, it can be said that today clinical neuropsychology stands on weak scientific grounds. Clearly, continued use of any such testing that is inadequate for the prediction of daily functioning not only threatens the credibility of the field of neuropsychology, it also does a great disservice to those

who depend on neuropsychological findings for a variety of purposes from compensation to employment suitability to treatment recommendations. As Chaytor and Schmitter-Edgecombe

(2003) point out, “if we are to continue accepting referral questions that target ecological problems, then we need to ensure that our methods are appropriate” (p. 182).

Ecological Validity

Referral questions related to the evaluation of everyday functioning address the concept of ecological validity, which refers to the “functional and predictive relationship between the patient’s performance on a set of neuropsychological tests and the patient’s behaviour in a variety of real world settings” (Sbordone, 1996, p. 16).2 As the recommendations made by

neuropsychologists can often have broad and substantial consequences for individuals, the

ecological validity of neuropsychological tests should be established. Two approaches are

commonly utilized to address ecological validity: veridicality and verisimilitude (Franzen &

Wilhelm, 1996). Veridicality is the degree to which performance on existing tests is related to

scores on other measures that predict everyday functioning. The evaluation of veridicality is

based on determining, by way of statistical analysis, the relationship between performance on

neuropsychological test measures and measures of everyday functioning, such as employment

status, clinician or informant ratings, or questionnaires. The concept of veridicality addresses the

notion that traditional tests may indeed be predictive of everyday abilities despite not having

been designed as “ecologically valid” measures of those abilities.

2 Some authors use this definition of ecological validity to describe the more general concept of external validity while referring to ecological validity as specifically involving the similarity between task demands and the demands of everyday life (i.e., Franzen & Wilhelm’s [1996] ‘verisimilitude’ approach, as described on the following page).


Verisimilitude is the similarity between the demands of a task and the demands imposed in the everyday environment. This approach has been taken in the development of a new generation of tests that have been devised with ecological validity in mind. Examples of these types of tests include the Behavioural Assessment of the Dysexecutive Syndrome (BADS;

Wilson, Alderman, Burgess, Emslie, & Evans, 1996), the Rivermead Behavioural Memory Test

(RBMT; Wilson, Cockburn, & Baddeley, 1985), and the Test of Everyday Attention (Robertson,

Ward, Ridgeway, & Nimmo-Smith, 1996). These tests have better face validity than traditional tests, meaning they

“look like” they are measuring what they are attempting to measure. The goal with such tests is to attempt to simulate real life cognitive tasks and to identify those individuals who experience problems with such tasks. They do not necessarily have discriminative validity, however, and thus may not be useful in identifying brain damage. The distinction between a diagnostic focus versus a focus on functional ability is thus evident in the development of tests using the veridicality versus verisimilitude approach, respectively.

A recent article by Chaytor and Schmitter-Edgecombe (2003) reviewed the ecological validity of neuropsychological tests by evaluating the efficacy of these two approaches. They identified six studies that explored the issue of ecological validity of executive functioning tests.

The studies differed in terms of the specific tests utilized although both traditional (veridicality) and verisimilitude tests were employed. Their overall findings indicated that executive tests were not significantly correlated with self-report measures but all studies reviewed revealed significant associations between executive tests (traditional and verisimilitude) and everyday abilities as measured by clinician ratings and informant (e.g., relatives) questionnaires. They further noted that performance on verisimilitude tests was more consistently related to the

outcome measures than traditional tests, suggesting that verisimilitude measures (i.e., BADS and

Everyday Problem Solving Inventory) have better ecological validity.

Although the research literature does provide evidence that traditional tests have

reasonable ecological validity, it remains uncertain what abilities they are predicting. Burgess

and others (2006) discuss how performance on these measures is of little benefit for assessment

purposes as their predictive validity (or “generalizability”) remains unclear. That is, they note

that we really do not know what situations in everyday life require the abilities measured by tests

such as the WCST. Indeed, it may be a leap in deductive reasoning to suggest that a card sorting

impairment could be extrapolated to difficulties in abilities required in everyday life. It is

difficult to even imagine a real life scenario remotely similar enough to the WCST that one could

predict the types of everyday behavioural difficulties that would be associated with impaired

WCST performance.

Even more poignant are instances in which patients perform normally on traditional

executive tests yet have clear executive impairments in their daily lives (Eslinger & Damasio,

1985). Notably, Shallice and Burgess (1991) described three patients with frontal lobe damage

who performed within normal limits on 12 tests of executive function, including the TMT and a

modified version of the WCST, yet presented with significant executive disabilities in daily life.

This finding became the impetus for the development of the BADS, a test battery composed of

tasks that attempt to resemble situations one may encounter in everyday life. The BADS manual

notes that patients with executive impairments are often difficult to assess because it is common

to find that the individual component processes of executive functioning (i.e., the building

blocks) are intact. They explain that the impairments experienced by these patients involve the

‘ability to initiate their use, monitor their performance and use this information to adjust their

behaviour' (Burgess & Alderman, 1990, p. 183). 3 Most tests used presently, however, focus on the building blocks or component processes. Indeed, Shallice and Burgess (1991) point out that in most neuropsychological tests “the patient typically has a single explicit problem to tackle at any one time, the trials tend to be very short…task initiation is strongly prompted by the examiner and what constitutes successful trial completion is clearly characterized” (pp. 727-

728). Traditional executive tests rarely, if ever, require the patient to organize or plan his or her behaviour over extended periods of time or to set priorities when required to perform multiple tasks, despite the fact that these are precisely the types of abilities that are required to successfully

perform many everyday tasks.

The BADS has been shown to be more sensitive to executive impairment when compared

with other traditional executive measures (Bennett, Ong, & Ponsford, 2005). There is also

evidence suggesting, however, that it is limited in its ability to predict real world functioning in

patients with brain injury (Norris & Tate, 2000; Wood & Liossi, 2006) and that it is insensitive

to executive function impairments in relatively high functioning individuals (Sohlberg & Mateer,

2001) and in a group of patients who were reported to have planning deficits by care staff

(McGeorge et al., 2001). Finally, only two of the six subtests achieved modest correlations with

an ecologically valid test that is set in a real life location (Alderman, Burgess, Knight, &

Henman, 2003). The questionable ecological validity of the BADS may be due in part to the fact

that there is still a substantial lack of similarity between its subtests and real life situations (i.e.,

lack of verisimilitude). That is, there are major differences between the demands of the testing

environment and those of daily life (Acker, 1990). For instance, the BADS subtests involve

3 The BADS manual describes the patients in their validation study as experiencing the DES but the description seems to apply equally well to any individual experiencing executive function difficulties of any type or pattern.

tabletop administration of primarily paper and pencil tasks, thereby strongly resembling other more traditional neuropsychological tests in this regard. Indeed, some of these subtests are, generally speaking, more similar to the traditional tasks described above than to activities one would perform in everyday life. The Rule Shift Cards subtest, for example, involves the participant responding ‘Yes’ or ‘No’ as a series of playing cards is turned over one at a time according to a specified rule (e.g., whether the current card is the same colour as the previously presented card). Clearly, this is not a task one would engage in as part of one’s daily activities.

Moreover, the subtests that appear to resemble tasks one may perform in the real world involve situations that occur only rarely, if at all, for most people, such as having to search an open field for a pair of lost keys or taking a trip to the zoo.

Manchester, Priestley, and Jackson (2004) suggest that office-based tests typically have low-to-moderate ecological validity. They argue that the identification of executive impairments in daily life may best be achieved through more naturalistic assessment measures in combination with information provided by significant others. An example of this naturalistic approach is the

Multiple Errands Test (MET; Shallice & Burgess, 1991). The task is carried out at a real life shopping center where participants are required to complete three task sets consisting of eight instructions. One requires buying various items of merchandise, another involves being at a certain place by a specified time, and the third task is more complex and involves obtaining a particular type of information that is available within the precinct, such as the exchange rate for a specific foreign currency. The list of tasks is presented to participants as a general list of requirements.

In addition, the instructions include ‘rules’ that must be followed, such as only being allowed to enter shops within a certain area and not to enter a shop unless it is to buy something.

Participants are required to structure and plan how they are going to execute the various tasks

effectively on their own. The test is scored by two observers who monitor the participants’ behaviour as they carry out the activities. Shallice and Burgess found that the three patients with known executive impairments in their daily lives had severe difficulties with this task, despite normal performances on other cognitive and traditional executive tests.

A later study utilized a simplified version of the MET (MET-SV), designed to be more suited to general clinical use, to determine its ecological validity (Alderman, Burgess, Knight, &

Henman, 2003). This study found that the MET-SV discriminated well between patients with frontal lobe pathology and controls, even after accounting for I.Q. and even though many of the patients passed traditional executive tasks. Alderman and colleagues (2003) demonstrated that there were two clear patterns of failure in the neurological group, in that performance was characterized either by failure to achieve tasks or by rule breaking behaviours. Patients who tended to break task rules were those whose caretakers reported more severe executive memory disturbances, particularly involving memory control (e.g., problems with temporal sequencing and perseveration). On the other hand, patients whose performance involved task failures had more negative affective symptoms, such as apathy and lack of emotion, interpreted by the researchers as an initiation problem. They also found that performance on the MET-SV was strongly associated with observed dysexecutive symptoms in everyday life as evaluated by both self- and informant-ratings on the Dysexecutive Questionnaire (DEX; Burgess, Alderman,

Wilson, Evans, & Emslie, 1996). Finally, they noted that patients made more errors than controls and that these errors were qualitatively different from those made by controls, noting in particular that they made significantly more social rule breaks.


Although the MET and its simplified version (MET-SV) appear to have good ecological validity, there are a number of limitations with this task. Perhaps most problematic is that it is impractical. The test is quite time consuming and costly to administer. Moreover, it may not be feasible to travel to locations outside of the testing environment (often within a hospital setting), particularly with patients who have limited mobility. A second problem is the potential difficulty of establishing inter-rater reliability. Third, real world situations are unpredictable and

experimenters have little to no control over events that may arise. Finally, the concept of

devising explicit rules as part of the task may be helpful when attempting to distinguish between

various types of deficits, but it is not necessarily ecologically valid. In real life, we are usually

not told that we cannot enter a shop other than to buy something or that we must not travel

outside of a certain spatial boundary. A crucial point is that implementing task rules likely

serves to provide more structure to the task, which could mask or reduce the effect of executive

deficits in some patients in the same way traditional tests do. Moreover, most real life situations

do not typically involve other individuals (i.e., the assessors in this case) shadowing one’s every

move and making notes about one’s behaviour, which in and of itself could influence test

performance.

Virtual Reality

Virtual reality (VR) technology holds the potential to address many of the limitations of

previous executive function measures, from the paper and pencil “tabletop” tasks to the MET

and similar real life tasks. VR can be described as an advanced form of human-computer

interface that allows the user to “interact” and become “immersed” in a computer-generated

environment in a naturalistic fashion (Rizzo & Buckwalter, 1997). Virtual environments (VEs)

have numerous features that make them attractive for assessment and rehabilitation purposes. In

contrast to traditional executive test measures, VEs actively engage participants by allowing them to be involved in a task while at the same time being less focused on the fact that they are being tested (Rizzo & Buckwalter, 1997). Motivation during task performance may be increased using VR technology through the development of arguably more interesting tasks (e.g.,

Campbell et al., 2009) when compared to the paper and pencil variety. Unlike testing in the real world, strict experimental control over stimulus delivery and measurement can be maintained while objectively measuring behaviour in a VE (Schultheis & Rizzo, 2001). Similar to real world testing, tasks designed to be performed in the VE can closely simulate real life demands.

VEs also offer safety from potential dangers of the real environment and are ideal for patients with physical disabilities whose mobility issues may prevent them from performing certain tasks in the real world (e.g., Weiss, Bialik, & Kizony, 2003). Other advantages of using VEs are that they are cost-effective, often simple to operate, reduce data-processing time, and allow for the detailed analysis of task performance (Elkind, Rubin, Rosenthal, Skoff, & Prather, 2001). A variety of virtual environments have already been developed to enhance functional assessment and rehabilitation, including virtual cities (Campbell et al., 2009; Zakzanis, Quintin, Graham, &

Mraz, 2009), school classrooms (Rizzo, Buckwalter, & van der Zaag, 2002), university departments (McGeorge et al., 2001), and supermarkets (Klinger, Chemin, Lebreton, & Marié,

2004; Lo Priore, Castelnuovo, Liccione, & Liccione, 2003).

A number of researchers have begun to develop VR tasks for the assessment of executive functioning. Various VR measures of executive functioning have been modelled after more traditional tests such as the WCST (Elkind et al., 2001; Pugnetti et al., 1995; Pugnetti et al.,

1998a). For instance, one of the first VR tests of executive function required users to reach the exit of a virtual building through the use of environmental clues (e.g., colour or shape) that aided

in the correct selection of doors leading from room to room (Pugnetti et al., 1995; Pugnetti et al.,

1998a). Similar to the WCST, the correct choice criteria were changed after a fixed number of successful trials, and the user was then required to shift cognitive set and devise a new choice strategy in order to pass into the next room. In a comparison between neurologically impaired patients and non-impaired controls on both the VR task and the WCST, controls performed more successfully on both tests. Moreover, the performance of the patients on the VR task mirrored anecdotal descriptions of everyday behaviour provided by family members. Weak correlations between the VR task and the WCST suggested that they were measuring different functions.

Participants made errors earlier in the test sequence on the VR task when compared to the traditional test. The authors suggested that this finding was due to “the more complex (and complete) cognitive demands of the VE setting at the beginning of the test when perceptuomotor, visuospatial (orientation), memory, and conceptual aspects of the task need to be fully integrated into an efficient routine” (Pugnetti et al., 1998b, p. 160). Further support for the ecological validity of this VR task is provided by a case study describing the task’s successful detection of deficits that limited the patient’s daily activities and that remained undetected by traditional executive function tests (Mendozzi, Motta, Barbieri, Alpini, & Pugnetti, 1998).

More recently, researchers are using VR systems for detailed response measurement and analysis to examine specific behaviours characteristic of patients with executive dysfunction.

Klinger, Chemin, Lebreton, and Marié (2006) examined planning deficits in patients with

Parkinson’s disease as compared to age-matched controls in a virtual supermarket. The researchers described the patients’ paths through the supermarket as characterized by numerous stops, turning around, and hesitancies as compared to the paths of controls. Zhang and colleagues (2001) utilized a virtual kitchen in order to assess selected cognitive functions of TBI

patients as compared to normal volunteers. The TBI patients consistently demonstrated poorer performance in the ability to process information, identify logical sequencing, and complete the overall assessment. The results of such studies suggest that the use of VEs is valuable in enhancing our ability to describe the functional behaviours of individuals with executive dysfunction.

Research Study Overview

The main objective of the current research is to address two important questions posed in

Chaytor and Schmitter-Edgecombe’s (2003) review of the ecological validity of neuropsychological tests, namely “Which tests have the most ecological validity?” and “How can ecological validity be improved?” These questions will be explored in regard to executive function testing. An additional question to be addressed by this research is “Which executive function measures are most appropriate for the evaluation of executive function rehabilitation?”

As a starting point for meeting these goals, a novel VR executive function task called the

Multitasking in the City Test (MCT) was developed. This task is loosely modelled on the MET but differs from it in important ways. The main differences are that the MCT is a less structured task such that it lacks explicit rules that may constrain behaviour; the tasks to be performed more closely resemble everyday behaviours familiar to most people; and planning ability is evaluated and compared to actual task performance (unlike most versions of the MET, in which no opportunity is given to plan tasks ahead of time). These distinctions will be further elaborated upon in Chapter 2.

A second objective of this research is to explore MCT test performance characteristics in an effort to further our understanding of the various subprocesses of the executive function

system and how they are integrated during goal-directed behaviour. These analyses will primarily focus on an evaluation of the qualitative and quantitative patterns of performance on the MCT. Performance comparisons across healthy and patient samples, and similarities and differences in the functions evaluated on the MCT and other standardized executive and non-executive measures will also be made.

In summary, the three studies that comprise this research are summarized below in terms of the general research question(s) being addressed and the methodological approach utilized in each:

Study 1. How can ecological validity be improved? This first study involved the development of the MCT along with the characterization of performance on this novel task. The main objective in developing the MCT was to attempt to improve the ecological validity of executive function measures using a verisimilitude approach. The MCT, along with a battery of neuropsychological tests including commonly employed executive function measures, was administered to a sample of normal (non-clinical) participants. The following questions were also addressed in this study.

What do qualitative and quantitative patterns of performance on the MCT reveal about the nature of the subprocesses of the executive function system? What are the similarities/differences in the executive processes being measured by the MCT and existing executive function measures? In order to address these research questions, a detailed analysis of “normal” performance on the MCT was undertaken. Additionally, correlational analyses between the

MCT and existing measures of cognitive function (including executive function tests) were employed to explore commonalities and differences in the cognitive processes tapped by these measures.


Study 2. Which tests have the most ecological validity? Having characterized normal performance on the MCT in study 1, it was of interest to explore the performance of patients likely to exhibit executive impairments in their daily lives. As such, patients with stroke or TBI were recruited and administered a neuropsychological test battery that included the MCT. The primary objective of this study was to evaluate the ecological validity of the MCT and existing executive function measures by examining the relationships between performance on these tests and ratings of real world functioning. Thus, this study addressed the question of which executive function measures are most predictive of everyday executive functioning. Further questions posed in this study were: How do patients with reported executive deficits perform on an unstructured, verisimilitude measure of executive function (i.e., the MCT)? Does their performance differ qualitatively or quantitatively from healthy participants? To address these questions, MCT performance in this sample of patients was characterized to further explore executive function systems in those with reported executive deficits. Although not a primary focus, MCT performance in the patient sample was compared with the normative sample from study 1 in an attempt to determine whether specific qualitative and/or quantitative patterns of performance might be markers of executive impairment.

Study 3. Which executive function tasks are most appropriate as outcome measures for the evaluation of an executive function treatment program? In this final report, the aim was to explore ecological validity from a rehabilitation perspective and to evaluate the generalizability of skills acquired in treatment. In particular, this study sought to determine whether participation in an executive function intervention leads to specific improvements from baseline to post- intervention on the MCT and other cognitive test measures. The analysis specifically focused on whether post-intervention skills would generalize better to skills assessed on a test of

verisimilitude (MCT) or on a test of veridicality (semantic fluency). It was also of interest to determine whether performance on the MCT would relate to behavioural ratings of executive functioning, thus providing further evidence of the ecological validity of the test. These results were compared with the relationship between semantic fluency performance and behavioural ratings of executive function to evaluate which measure (MCT or semantic fluency) had better ecological validity.

Chapter 2 Development and Performance Characteristics of an Ecologically Oriented Executive Function Measure in Healthy Adults

Abstract

A novel virtual reality executive function task (Multitasking in the City Test; MCT) was developed with the aim of investigating planning and multitasking with ecological validity in mind. Thirty undergraduate students (21 females, 9 males) were recruited for participation in the study and completed a neuropsychological test battery that included the MCT along with well-established, standardized executive function measures and measures of other cognitive functions

(e.g., memory, visuospatial skills). As expected, the sample performed within normal limits on the battery of standardized tests administered. Participants also completed the MCT successfully, although specific types of errors were common even in this neurologically intact group of individuals. Spearman correlation coefficients were computed between the various test measures. Only the plan score derived from the MCT was found to be significantly associated with one of the standardized executive function tests administered (Modified Six Elements test), suggesting that both variables may be measuring a similar construct. Statistically significant correlations were also found between MCT scores and two non-executive test measures (Trail

Making Test Part A and Judgment of Line Orientation), suggesting that other ‘basic’ cognitive functions such as speed of information processing and visuospatial skills are components of

MCT performance. Preliminary evidence from this study suggested that the MCT may be an ecologically valid method of objectively evaluating how component executive processes are integrated into meaningful behaviour. It was suggested that the task represents a departure from traditional tests that tend to focus simply on the ‘building blocks’ of executive function.


Introduction

The ecologically valid assessment of executive functions should involve the evaluation of common everyday behaviours such as planning and multitasking. Lezak (2004) defines planning as “the identification and organization of the steps and elements…needed to carry out an intention or achieve a goal” (p. 614). Multitasking has been defined as the ability to “not only plan and organize based on temporal and conditional associations between actions, but also to maintain this conditional and temporal information in working memory, along with other information such as the immediate environmental stimuli, goals, and subgoals” (Marcotte, Scott, Kamat, & Heaton, 2010, p. 23).

The MET is a multitasking executive function test after which several VR tasks have been modeled (Law, Logie, & Pearson, 2006; McGeorge et al., 2001; Miotto & Morris, 1998; Titov & Knight, 2005; Rand, Basha-Abu Rukan, Weiss, & Katz, 2009). For instance, McGeorge and colleagues (2001) compared real world and virtual world behaviour in their sample of TBI patients and matched normal controls, who were asked to complete vocationally oriented, errand-planning tasks consisting of multiple sub-goal elements. The researchers replicated the general layout of the psychology department of a university in their VE, and participants were asked to imagine that they were staff members in the psychology department. They were given a list of errands to complete, which included several simple tasks such as reading a magazine for 10 minutes in a particular room. There were also more complex tasks such as buying a new notebook from a shop that was only open for a specified 10-minute period and then going to a particular room and copying out the names and telephone numbers of three hotels in the city from the Yellow Pages before making a telephone call. The task included a rule that participants were told to obey at all times: one of the stairwells in the department was for travelling up only, and the other was for travelling down only. After reading through the list of errands, participants were instructed to take as long as they needed to construct a plan for the task using a provided map of the department. All participants completed the real-life environment task prior to, and on a separate occasion from, the VE task. Although the TBI group performed in the normal range on a standard executive function battery (BADS), significant differences were found between the two groups on both the real and virtual tasks in that patients completed fewer errands and produced significantly worse plans than did controls. Furthermore, performance in the real and virtual worlds was highly correlated, suggesting that the VE task had ecological validity.

Most recently, a group of researchers developed the Virtual MET (VMET), a task that very closely matches the task instructions of a hospital version of the MET (Knight, Alderman, & Burgess, 2002) within a virtual shopping environment (VMall; Rand, Basha-Abu Rukan, Weiss, & Katz, 2009). 4 The MET-Hospital Version, as in the original MET, involves three tasks that abide by certain rules and is performed in a hospital shopping centre (i.e., buy six items, ascertain four items of information, and meet the tester at a specified time at a predetermined location). The VMET consists of identical instructions and the same number of tasks involving items to be bought and information to be obtained as the MET-Hospital Version, with only minor changes made to the information requested and the specific products to purchase in accordance with what was feasible and could be found in the VMall. For the third task, rather than requiring participants to meet the tester at a certain time, this instruction was changed to checking the content of the shopping cart at a certain time. The order of administration of the VMET and MET-Hospital Version was counterbalanced in an effort to eliminate bias. In their sample of healthy “younger” participants (Mean age ± SD = 26.3 ± 2.7 years), they found that the correlations between MET performance and VMET performance were not significant. The authors suggested that the task may have been too simple for this group, noting that ceiling effects resulting in low variability in scores may have been responsible for the lack of significant correlations in this sample. In post-stroke and healthy older participants, however, they found significant moderate to high correlations between the MET-Hospital Version and the VMET, supporting the ecological validity of the VMET as an assessment tool of executive functions.

4 It is noteworthy that the VMET was not known to be under development at the time the MCT was developed, as the report by Rand and colleagues (2009) had not yet been published.

There appears to be a general consensus in the literature that existing VR tasks modelled after real life tasks have good ecological validity. In the current study, a novel VR executive function task was developed with the aim of investigating planning and multitasking with ecological validity in mind. The task, named the Multitasking in the City Test (MCT), is an errand-running task that takes place in a virtual city. Certain characteristics of the MCT distinguish it from existing VR and real life executive function tasks designed to be ecologically valid. For instance, all of the tasks in the MCT were developed to be similar to the types of tasks most people engage in on a regular basis (e.g., shopping, attending appointments, etc.), quite unlike some of the scenarios that have been devised previously (e.g., working as a psychology department staff member, obtaining information about exchange rates for a particular foreign currency, etc.). Thus, the goal of devising a task with verisimilitude was of the highest priority in developing the MCT.

A further difference between the MCT and other tests is that tasks must be performed with few explicit rule constraints, in contrast to tests such as the MET in which participants had to abide by certain rules such as not travelling beyond a certain spatial boundary and not entering a shop other than to buy something. It is acknowledged that the implementation of rules within a task helps to reduce ambiguity and simplify task demands, in addition to uncovering “rule breaking” behaviours that may predict monitoring difficulties in everyday life (e.g., see Alderman, Burgess, Knight, & Henman, 2003). It was of interest in the development of the MCT, however, to explore whether and how often participants engage in behaviour that is clearly not goal-directed when they are not explicitly told ahead of time not to do so. It was speculated that the inclusion of explicit rules would serve to mask or reduce monitoring and other executive deficits in some individuals by providing more structure within the test situation in the same way traditional tests do. Thus, the only “rule” specified in the MCT is that participants must return to their (virtual) home within 15 minutes; this was done primarily for practical purposes, so as to avoid excessive testing time.

The MCT is also unlike many other executive function tasks (with the exception of the task developed by McGeorge et al., 2001) in that participants are required to construct a plan of how they will go about completing the task after the instructions are read to them and prior to beginning the task. The quality of the plan produced will help to evaluate planning and organizational abilities and to assess whether plan quality is associated with task performance.

In the current study, the MCT was administered to a group of young healthy adult participants in order to characterize “normal” performance on this task. In considering the psychometric properties of the MCT, a specific goal in devising the task was to attempt to avoid ceiling effects. Given that normative data were obtained from a group of presumably high-functioning undergraduate students at a large university, avoiding near-perfect test scores was expected to be a formidable task. Nonetheless, it was anticipated that if the task was sufficiently difficult while maintaining the realistic nature of the individual tasks, it could elicit certain types of errors even in healthy persons, as it is not entirely unexpected for ‘normal’ individuals to experience some minor difficulties with planning and multitasking. For example, while the tasks in the MCT appear relatively simple with little variability in the level of complexity of the individual tasks, the order in which the tasks must be performed may present a challenge for some of the normal participants. In particular, the city was intentionally constructed such that the stores in which participants must make purchases are located closer to the participant’s home than the bank where they must acquire the funds necessary to complete these purchases. Thus, participants may decide to make a purchase prior to going to the bank to withdraw money despite being told that they have less than a dollar in their pocket (i.e., less than the amount required to make any of the purchases on the list of errands). Participants are also expected to attend a doctor’s appointment on time and to accomplish a ‘budgeting’ task by determining which of a variety of types of pens they are able to afford without going over budget. These test elements introduce complexity within the test situation by making successful performance contingent on judicious, commonsensical decision-making skills rather than a more simplistic approach, such as accomplishing tasks in the most spatially efficient order.

In the present study, it was of interest to address the following research questions:

1. How can ecological validity be improved? As noted, the MCT was developed in an attempt to improve the ecological validity of executive function assessment. The verisimilitude approach was utilized during test development to meet this objective.

2. Generally speaking, how can test performance characteristics on the MCT inform our understanding of executive functioning? More specifically, this study sought to address the following question: What do qualitative and quantitative patterns of performance on the MCT reveal about the nature and operation of the subprocesses of the executive function system? In addition to the MCT, a neuropsychological test battery consisting of paper-and-pencil standardized measures was administered to the normal sample. Correlational analyses were conducted to explore associations between the MCT and these measures. Thus, further questions posed were: What are the common underlying processes being tapped by the MCT and other neuropsychological test measures? Are there differences in the executive processes being measured by the MCT and existing executive function measures? It was hypothesized that the MCT would be associated to some extent with other non-executive test measures given that most everyday tasks involve the parallel recruitment of multiple cognitive processes (Marcotte, Scott, Kamat, & Heaton, 2010), such as memory and visuospatial skills. Given the substantial differences in task requirements on the MCT when compared to other standardized executive function measures in this battery, it was presumed that the MCT would recruit different functional processes from these other measures. Thus, it was further hypothesized that MCT scores would not be consistently associated with other executive measures and might in fact be only weakly associated with such measures.

Method

Participants

Thirty undergraduate students (21 females, 9 males; Mean Age ± S.D. = 19.4 ± 1.54 years) were recruited from the Introductory Psychology course at the University of Toronto, Scarborough Campus. Given the multicultural composition of the city of Toronto, the sample comprised many different ethnicities, with participants born in Canada (n=9), China/Hong Kong (n=6), India/Pakistan/Sri Lanka/Bangladesh (n=6), the Middle East (n=3), Russia/Europe (n=4), and the Caribbean (n=2). Participants had spoken English for approximately 13 years on average (Mean ± S.D. = 13.03 ± 6.74 years; range 1 to 23 years). No participant reported a history of neurological or psychiatric disorder. Demographic characteristics of the sample are summarized in Table 2.1. Students received 1.5% bonus marks in an Introductory Psychology course for their participation.

Measures

After providing informed written consent and completing a demographic questionnaire, all participants completed a neuropsychological test battery that consisted of the MCT and standardized executive function measures (TMT B, TOL, BADS Modified Six Elements test) along with further measures sampling other cognitive realms (TMT A, Wechsler Test of Adult Reading, Judgment of Line Orientation, Star Cancellation, and RBMT-Extended Version Story Immediate and Delayed). The test battery was administered in a counterbalanced, predetermined order in which 15 participants completed the MCT as their first test and 15 participants completed the MCT as their last test, thus allowing for the exploration of potential order effects between the MCT and the other executive function tests administered. Other than the MCT, the order of administration of all other tests in the battery, including the executive function measures, was invariant (i.e., not counterbalanced) across participants. The time required for testing was approximately 1.5 hours per participant.

A brief description of the standardized tests included in the battery is provided below; the reader is referred to Lezak, Howieson, and Loring (2004), Strauss, Sherman, and Spreen (2006), or the respective test manuals for more detailed descriptions of the test measures utilized. The specific task instructions for the MCT are provided in Appendix B, while Figures 2.1 and 2.2 show the map of the MCT that was provided to participants and screen shots of the task, respectively.

Executive Function Tests

Trail Making Test (TMT; Reitan, 1955).

The TMT, originally part of the Army Individual Test Battery (1944) and adapted by Reitan (1955) for the Halstead Battery, is a measure of attention, speed, and mental flexibility (Strauss, Sherman, & Spreen, 2006). The test is in the public domain and can be reproduced without permission (Lezak, 2004). There are two parts to the test. In Part A (TMT A), the participant is required to draw lines to connect 25 encircled numbers in order (e.g., 1, 2, 3, etc.) on a page. In Part B (TMT B), the participant must connect 25 encircled numbers and letters in alternating order (e.g., 1, A, 2, B, etc.). The participant is instructed to connect the circles “as fast as you can” without lifting the pencil from the paper. The TMT B was included as a standardized executive function measure of mental flexibility in this battery, while the TMT A was used to evaluate attention and processing speed. TMT scores are the amount of time (in seconds) to completion and the number of sequencing errors made. Given that the majority of participants did not make any errors on either TMT A or TMT B, only the time-based scores were utilized in the analyses.

Tower of London. Drexel University: 2nd Edition (Culbertson & Zillmer, 2001).

The TOL was included as a measure of planning ability given its prominent use in the executive function literature and in clinical settings, although the extent to which planning is involved in performing the task may be limited (Miyake et al., 2000). This version of the TOL uses two boards, one for the examiner to place three coloured wooden beads (red, green, or blue) in the goal position, and the other containing the three coloured wooden beads that the participants are required to rearrange from a standard “start” position to match the examiner’s model in as few moves as possible. There are two rules to be followed when arranging the beads: participants are not to place more beads on a peg than it will hold, and they can only move one bead at a time. Ten problems are administered in order of increasing difficulty and 2 minutes are allowed for each trial. A total of eight different “index” scores can be obtained involving number of moves, successful completions, and timing aspects. The “Total Correct” score (defined as the number of problems solved in the minimum move count) was utilized in all relevant analyses in the current study. The “Total Move” score (the total number of moves the participant makes across all ten trials), “Total Time Violations” (the total number of times the participant fails to complete a test problem within the first minute), and “Total Rule Violations” (the total number of times the participant violates either of the two rules specified above) were included in all analyses but, due to space restrictions, these scores are not included in the tables and are only reported in the Results section when significant findings were obtained.

Modified Six Elements (MSE) Test. Behavioural Assessment of the Dysexecutive

Syndrome (BADS; Wilson, Alderman, Burgess, Emslie, & Evans, 1996).

The MSE is a subtest of the BADS that assesses the participant’s ability to plan and

organize their time effectively while multitasking. The subtest was chosen because it has been

shown to be one of the most sensitive BADS subtests in assessing executive function deficits

(Gouveia, Brucki, Malheiros, & Bueno, 2007; Rochat, Ammann, Mayer, Annoni, & Van der

Linden, 2009) and was considered one of the BADS subtests most similar to the MCT owing to

its planning and multitasking elements. On the MSE, the participant is instructed to do three

tasks (describing or dictating an event, arithmetic problems, and a picture naming exercise), each

38 of which is divided into two parts (A and B). The participant is instructed to attempt to do at least some of all six subtasks within a 10-minute period. There is one rule that participants must follow: They are not allowed to do the two parts of the same task consecutively. A profile score

(out of 4) is obtained and is based on the number of tasks attempted, the number of tasks where rule breaks were made, and the time spent on any one task.

Other (Non-Executive) Cognitive Function Tests

Wechsler Test of Adult Reading (WTAR; The Psychological Corporation, 2001).

The WTAR was developed as a measure of premorbid functioning in adults. The test requires the reading out loud of irregularly spelled words (e.g., cough), thereby minimizing the measurement of the person’s current ability to apply standard pronunciation rules and maximizing the ability to measure one’s previous learning of the word. This method relies on the fact that reading recognition is relatively resistant to the cognitive declines seen in normal aging and brain injury, as well as on the fairly strong correlation between intellectual functioning and reading ability in normal people. Thus, word reading tends to provide a fairly reasonable estimate of premorbid ability. In the current study, the WTAR was used as a brief measure of current intellectual functioning (or intelligence quotient; IQ) given that the sample consisted of non-clinical healthy participants. WTAR raw scores were converted into standard scores.

Judgment of Line Orientation (JLO; Benton, Hannay, & Varney, 1975).

The JLO is a commonly employed measure of visual-spatial ability and requires the participant to estimate angular relationships between line segments by visually matching angled line pairs to 11 numbered radii that form a semicircle. The test consists of 30 items that contain different pairs of angled lines to be matched to the display cards. There is a five-item practice set that is administered prior to the test items, and participants are required to obtain a minimum of 2 correct responses on the practice items in order to proceed with the test items. If a score of less than 2 is obtained on the practice items, the test items are not administered and the participant receives a score of 0 (thus requiring the clinician to determine whether this represents a true visual-spatial impairment or some other impairment affecting the ability to perform the test). The score is the number of items (out of 30) on which the participant successfully judged the angles of both lines correctly.

Star Cancellation (Wilson, Cockburn, & Halligan, 1987).

The Star Cancellation test was used to screen for spatial neglect. It is an untimed test designed to increase the sensitivity of cancellation tasks to inattention by increasing their difficulty. There are 56 small stars, the target stimuli, within a jumble of words, letters, and larger stars. The examiner must first demonstrate the task by cancelling (drawing a diagonal line through) two of the small stars. The score is the total number of the remaining small stars cancelled, with a maximum possible score of 54.

Story – Immediate and Delayed; Rivermead Behavioural Memory Test – Extended Version (RBMT-E; Wilson et al., 1999).

The Story – Immediate and Delayed subtests from the RBMT-E were administered as a measure of episodic memory. There are two versions of the battery (Versions 1 and 2); the first version was used in the present study. In the Story subtest, the examiner reads the participant a short prose passage consisting of five sentences and asks them to recall as much as they can remember immediately afterwards (Immediate Recall), and then asks the participant to recall the story again after a 20-minute delay (Delayed Recall). The passage resembles items from a newspaper and comprises 21 ‘idea’ units. A total possible raw score of 21 points can be obtained, 1 point for each idea correctly recalled and ½ point for each partially recalled idea (as per the guidelines specified in the test manual). Raw scores can be converted to profile scores (ranging from 0 to 4) but were left as raw scores in the current analysis to better account for the variability in scores obtained across the sample.

The Multitasking in the City Test (MCT).

With the objective of developing a VR test that can be administered on standard computers commonly available in health care facilities, the VR test was designed to be run on a standard personal computer (PC) and without the use of a head-mounted display. The virtual environment was displayed on a 17-inch colour monitor and was viewed from a first person perspective (i.e., without a graphical representation of the user in the environment). Participants navigated the environment through the use of a standard joystick and made action decisions by pressing the number keys on the keyboard. For observation and coding purposes, each participant’s trial was recorded by way of a program that creates video files of computer screen activity (Screen Recorder Gold Version 2.1).

The virtual environment for the MCT is a city consisting of a post office, drug store, stationery store, coffee shop, grocery store, optometrist’s office, doctor’s office, restaurant and pub, bank, dry cleaners, pet store, and the participant’s home. The number of venues within the city was considered sufficiently complex to be representative of a typical shopping centre but not so complex as to make the task unnecessarily confusing from a spatial perspective. All buildings can be entered freely, although interaction within them is possible only for those buildings that must be entered as part of the task requirements. For instance, when a participant walks up to the cashier in the drug store or the automated teller machine in the bank, an “event screen” appears providing options for what the participant would like to do. In these cases, options include purchasing an item, withdrawing money, or doing ‘nothing’. At the bottom of the computer monitor, participants can view a list of the errands they are to perform as well as those errands they have already successfully completed, the amount of money in their wallet, the items in their backpack, and a clock displaying the time (i.e., in hours, minutes, and seconds).

Participants initially underwent a 3-minute training/learning session in which they were instructed to familiarize themselves with the joystick and were allowed to navigate through the city. The decision to allot 3 minutes for the training/learning session was based on pilot data obtained from a non-clinical sample of individuals with limited to no prior experience using a joystick and limited computer knowledge/experience. In general, these individuals required 1 to 2 minutes (and in no cases more than 3 minutes) to feel comfortable navigating using the joystick and to become familiar with the city. A map of the city was placed beside them so they could refer to it at all times (see Figure 2.1 for the map of the city and Appendix B for task instructions). During the learning session, participants were only able to navigate around the city but not interact with it (e.g., they were not able to purchase any items). Immediately after the learning session, the MCT was administered. The instructions were read to participants and, if necessary, repeated until the assessor was satisfied the participant knew what they were expected to do. Next, participants were prompted to ask any questions they had about completing the task. They were then invited to take as long as they needed to construct a plan for the task using the map and were provided with a pen and paper to record their plan. Participants were provided with unlimited time for planning, as it was anticipated that the time required to plan would vary across participants and it was of interest to explore how planning time would be associated with the various task performance variables (i.e., completion time, number of tasks completed, errors made, etc.).

Upon completion of the plan, the MCT was presented to participants on the computer monitor. They were again provided with the joystick and recording of the computer screen activity commenced. The test required participants to complete as many errands as possible within 15 minutes. The inclusion of a total of eight tasks to be completed was considered a realistic number of tasks that a person would typically set out to accomplish on a given day in everyday life. Pilot data indicated that normal individuals required a maximum of 5-6 minutes to perform all eight tasks (many required less time) and thus it was considered reasonable to assume that patient samples could perform most if not all tasks within 15 minutes.

The MCT software was programmed to record the time from the beginning of the test to completion, in addition to recording each option selected from the event screens. Thus, from these data the following scores could be obtained: Completion Time (in seconds); number of Tasks Completed (maximal score of 8); and Task Repetitions, defined as performing the same task more than once and on two separate occasions (i.e., the participant must have left the building after performing the task and then re-entered and performed the same task, usually after completing other tasks in the interim). Each participant’s video file of task performance was used to determine qualitative errors made by the sample. These included Insufficient Funds errors, defined as the number of times participants tried to make a purchase for which they had insufficient money in their wallet. Other qualitative errors were categorized as Inefficiencies, which included entering a ‘task’ building but not performing a task that was still incomplete; performing the post office task in two separate visits instead of one; entering a building in which there was no task to perform; and notable wandering behaviour (defined as walking around the city for more than 30 seconds without performing a task and not necessarily involving visits to any buildings). Number of incomplete tasks and insufficient funds errors were categorized as Task Failures. Other task failures involved purchasing a more expensive item over the more economical option (in the case of the pens that must be purchased) and not meeting the scheduled time deadlines (i.e., not attending the doctor’s appointment on time and/or not returning home on time). Thus, all qualitative errors were categorized as Task Failures, Task Repetitions, or Inefficiencies.
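As a purely illustrative sketch, the tallying of these three error categories could be expressed as follows. The event labels, log format, and function are hypothetical: the actual coding was performed manually from the screen recordings, not with software of this kind.

```python
# Hypothetical sketch only: event labels and log format are invented for
# illustration; MCT error coding was done by hand from screen recordings.
from collections import Counter

TASK_FAILURES = {"incomplete_task", "insufficient_funds",
                 "expensive_pen", "missed_appointment", "late_home"}
INEFFICIENCIES = {"no_task_visit", "split_post_office",
                  "non_task_building", "wandering"}

def categorize_errors(events):
    """Tally hypothetical event labels into the three MCT error categories."""
    counts = Counter(events)
    return {
        "Task Failures": sum(n for e, n in counts.items() if e in TASK_FAILURES),
        "Task Repetitions": counts["task_repetition"],
        "Inefficiencies": sum(n for e, n in counts.items() if e in INEFFICIENCIES),
    }

print(categorize_errors(
    ["insufficient_funds", "wandering", "task_repetition", "insufficient_funds"]))
# {'Task Failures': 2, 'Task Repetitions': 1, 'Inefficiencies': 1}
```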

Data Analyses

Descriptive analyses were conducted for all MCT measures, including qualitative errors made by the sample. All statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS) for Windows, version 17.0. Due to the non-normal distributions of many of the relevant variables, along with the relatively small sample size, data analysis was necessarily limited to descriptive statistics and analysis of bivariate relationships using non-parametric statistical tests. Given that the MCT is a novel test and the present study involves exploratory research, no corrections were made for multiple correlations and comparisons, such that all analyses with p values at .05 or lower were considered significant (in line with previous research of a similar nature; for example, see McGeorge et al., 2001; Rand, Basha-Abu Rukan, Weiss, & Katz, 2009). Spearman correlation coefficients were computed to examine the relationships among the various scores of the MCT and between the MCT scores and both demographic and cognitive test variables. To analyse any order or gender effects, the Mann-Whitney U procedure was used to compare the MCT-first and MCT-last groups and to compare males and females on the various tests administered.
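For illustration, the same non-parametric procedures can be sketched in Python with SciPy. The analyses in this study were run in SPSS 17.0, and the score vectors below are invented example values, not study data.

```python
# Illustrative sketch of the non-parametric analyses described above.
# The thesis used SPSS 17.0; all values here are hypothetical examples.
from scipy.stats import spearmanr, mannwhitneyu

plan_score = [6, 4, 5, 3, 6, 2, 5, 4, 6, 3]    # hypothetical MCT plan scores
mse_profile = [4, 3, 4, 2, 4, 2, 3, 3, 4, 2]   # hypothetical MSE profile scores

# Spearman rank-order correlation (robust to non-normal distributions)
rho, p = spearmanr(plan_score, mse_profile)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")  # significant if p <= .05

# Order effects: compare MCT-first vs MCT-last groups on a test score
mct_first = [5, 6, 4, 6, 5]
mct_last = [4, 5, 6, 3, 5]
u, p = mannwhitneyu(mct_first, mct_last, alternative="two-sided")
print(f"Mann-Whitney U = {u}, p = {p:.3f}")
```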


Results

Test Performance

Test performance scores are summarized in Table 2.2. As expected, group performance on both the executive and non-executive tests administered was generally within normal limits, with a few exceptions. Variability was noted on the WTAR scores, which was expected given the heterogeneity of the sample in regard to English language knowledge. That is, English is a second language for a number of our participants and thus reading ability in English was expected to be somewhat lower for these individuals. Nonetheless, the majority of participants (27/30) scored within the normal range (i.e., within one standard deviation) on the WTAR, and the entire sample demonstrated good comprehension of task instructions on the MCT and all other cognitive measures administered. For all other test scores, despite a fairly wide range in performance, the overwhelming majority performed within the normal range across the various measures. The neuropsychological profile of each participant who obtained a low performance score on a particular test was analyzed to explore whether there was a pattern of poor performance. None of these participants demonstrated consistently poor scores. That is, if a low score was obtained by a participant on one measure, their scores on all other measures were within the normal range of performance. Thus, a poor score obtained by a participant on a particular test was interpreted as normal variability in performance that often occurs even in non-clinical populations (e.g., Schretlen, Testa, Winicki, Pearlson, & Gordon, 2008), possibly due to wavering attention and/or effort.


MCT Plan Analysis

Participants varied fairly widely in the amount of time they spent constructing their plans for the task (32-523 seconds), with an average planning time of just over 3 minutes. Analysis of the plans produced by the sample revealed substantial variability in the level of detail included along with the planned sequence of actions and budgeting information. A sequence score out of six was assigned to each participant’s plan of how to complete the task. Plan scores were based on the logic of the sequence, whereby higher scores were reflective of plans that were logical from both a spatial perspective (i.e., completing errands sequentially according to shortest distance between previous location and current location visited) in addition to judicious decision making (e.g., plans to go to the bank before any stores). In cases where the most spatially practical sequence was discordant with the more judicious sequence, points were always allocated for more judicious plans even if they were spatially inefficient. The allocation of points in this way was based on the rationale that the test was not meant to evaluate the arguably more ‘basic’ ability to navigate one’s environment in the most spatially efficient manner, but to utilize higher level, complex executive skills in making sound decisions about how to accomplish daily life tasks effectively. For instance, it is more logical to ensure that one obtains money from the bank prior to purchasing cough syrup even if the drug store is passed on the way to the bank. Thus, a plan score of 1/6 would be awarded if a participant’s plan indicated the bank would be the last place visited in the sequence. 
A score of 6/6 would be awarded if the plan indicated the bank would be the first place visited (or the second place if the doctor’s office was first in the sequence), provided that all other places on the list were visited and that the plan indicated that the next place visited would be the post office, drug store, or stationery store (as visiting any of these locations next would result in a spatially efficient plan).

Eleven of the 30 participants (37%) obtained sequence scores of 3 or less (out of 6), indicating that a considerable portion of the sample made non-ideal decisions about how to approach the task prior to task initiation. All of these poor plan scores were due to not visiting the bank before making purchases, although two of the lower scores were also due to producing incomplete plans (i.e., no indication of visiting one or more places on the list). Purely spatial sequence errors did occur, albeit infrequently. For instance, one participant’s plan was almost entirely spatially inefficient, involving a haphazard, “back and forth” pattern (i.e., Bank → Stationery Store → Post Office → Drug Store → Doctor’s Office → Grocery Store); although this plan was fairly judicious, there did not appear to be a logical reason why nearby stores were not entered sequentially. In this case, the drug store could have been entered after the stationery store given their proximity. Another possibility would have been to visit the doctor’s office immediately after the stationery store, as this would have been indicative of a judicious strategy (i.e., suggestive of an appreciation for the importance of making it to the doctor’s appointment on time) and the point would have been awarded. Overall, however, most participants’ plans were noted to be efficient from a strictly spatial perspective.
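The sequence-scoring rubric described above can be sketched as a small function. Note that only the endpoints of the rubric are fully specified in the text (1/6 for a bank-last or incomplete plan, 6/6 for the ideal orderings); the intermediate scoring below is an assumed interpolation for illustration, not the thesis’s actual scoring rules.

```python
# Sketch of the MCT plan sequence score (out of 6). Only the 1/6 and 6/6
# cases are fully specified in the text; intermediate scores below are an
# assumed interpolation, not the thesis's actual rubric.

PURCHASE_STOPS = {"drug store", "stationery store", "grocery store"}
EFFICIENT_AFTER_BANK = {"post office", "drug store", "stationery store"}

def plan_sequence_score(plan):
    """plan: ordered list of the six locations the participant plans to visit."""
    if "bank" not in plan:
        return 1  # incomplete plan (assumption)
    i = plan.index("bank")
    if i == len(plan) - 1:
        return 1  # bank planned last: judiciously worst case
    bank_early = i == 0 or (i == 1 and plan[0] == "doctor's office")
    if bank_early and len(plan) == 6 and plan[i + 1] in EFFICIENT_AFTER_BANK:
        return 6  # judicious and spatially efficient
    # Assumed: lose points for each purchase planned before the bank.
    early_purchases = sum(p in PURCHASE_STOPS for p in plan[:i])
    return max(2, 5 - early_purchases)

ideal = ["bank", "post office", "drug store", "stationery store",
         "doctor's office", "grocery store"]
print(plan_sequence_score(ideal))  # 6
```

A plan that deferred the bank to the end of the sequence would score 1 under this sketch, matching the rubric’s stated worst case.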

The analysis of budgeting plans was particularly illuminating in that such plans were practically non-existent. Only 5 participants included any written description of an attempt to budget by calculating how much they could spend on the six pens they were instructed to buy without going over budget. Considering that the majority of participants (83%) did not appear to plan a budget (as per their written plans), this finding was interpreted as reflecting the fairly minimal budgeting requirements of this task. It was assumed that most participants were able to perform such simple arithmetic “in their heads”, making it unnecessary to write these calculations down on paper. Indeed, all participants who completed the task of buying the six pens purchased the least expensive option ($2) except for one participant, who purchased the $4 pens that were still within budget. None of the participants purchased the $6 pens that would have put them over budget.

MCT Performance: Qualitative and Quantitative Analysis

In regard to performance on the MCT, the number of tasks completed was high, as would be expected in a normal sample, with most participants (80%) completing all eight tasks within the 15 minutes provided. There was a considerable range in completion time scores (163s-767s), with an average task completion time of about 5 minutes. Other MCT measures also revealed a range of scores across the normal sample. As previously noted, qualitative errors on the MCT were categorized as task failures, task repetitions, and inefficiencies; the frequency with which these errors occurred is shown in Table 2.3. Only 4 out of 30 participants performed the MCT with no errors or inefficiencies. The most frequent error was attempting to make a purchase before going to the bank. This error was made by 13 participants (43% of the sample) and was made a total of 14 times across the sample. Other frequent errors involved inefficiencies such as performing the post office task in two separate visits instead of one (error made by 8 participants or 27% of the sample) and entering a building in which there was a task that still had to be completed but not performing that task (7 participants or 23% of the sample made this error). Task repetitions also occurred, although not as frequently as the other error types (4 participants repeated a task). Taken together with the variability in plan scores across the sample, the frequency of errors and inefficiencies in task performance suggests the MCT is sufficiently challenging that ceiling effects can be avoided in a normal adult population.


Concordance between MCT Plans and Actual Performance

It was of interest to explore whether participants actually followed the plans they devised during performance on the MCT. This was not a straightforward analysis given that many participants planned to make purchases prior to going to the bank and then had to abandon these plans during task completion upon receiving feedback in the form of an “insufficient funds” error message. Thus, the current analysis focused on those individuals who obtained high plan scores (scores ≥ 4 out of 6; N = 17), as all of these plans indicated that money would be obtained from the bank prior to purchasing items. It was expected that these individuals would follow the plans they devised. Interestingly, only 3 of these 17 individuals actually followed their plans exactly; another 8 ‘almost’ followed their plans (defined as visiting 3 or more places in the correct sequence but not all places), while 6 individuals completed the task quite differently from the plans they devised. The next step involved exploring how many of those who did not perform the task exactly as planned (i.e., the ‘almost’ and ‘different’ plans; N = 14) did at least go to the bank as their first task. This analysis revealed that all but one (13/14) of these individuals did in fact go to the bank first but did not follow the planned sequence exactly for the remaining places visited.

Inter-rater Reliability

Errors were scored separately by two independent raters: the current author and senior investigator (D.J.), a graduate-level psychology student who administered the MCT but did not supervise while the task was performed (only remaining present in the room to address any technical questions/issues from participants), and another graduate-level psychology student (Z.C.). Both raters used the video files of the computer screen activity from each participant to observe and record behaviours during performance of the task, including the sequence in which tasks were completed and the nature of the errors made. Inter-rater reliability was evaluated by calculating intraclass correlation coefficients (ICCs) for each category of error (number of tasks completed, insufficient funds errors, task repetitions, time deadlines met, and inefficiency errors). It is worth noting that inter-rater reliability was expected to be very high overall given that video files can be viewed repeatedly to confirm the accuracy of response scoring. Both raters did in fact view each video file a minimum of two times to ensure scoring accuracy.

Indeed, in all cases except for inefficiency errors, inter-rater reliability was perfect (1.00). The inter-rater reliability for inefficiency errors was .98 due to one discrepancy between raters in classifying wandering behaviour for one participant (i.e., one rater categorized the behaviour as reflective of “wandering” whereas the other did not). Essentially, the issue was the amount of time spent wandering, given that a period of no less than 30 seconds was required for behaviour to be considered wandering. The second rater considered the wandering behaviour to have commenced several seconds after the first rater did, thus not meeting the 30-second criterion. In this case the thesis supervisor, a board-licensed clinical neuropsychologist (K.Z.), reviewed the video file to resolve this discrepancy and rated the behaviour as indeed reflective of wandering. Thus, this was considered an instance of wandering behaviour in all statistical analyses in which the ‘total errors’ variable was included, as described below. It should be mentioned that the ‘wandering’ error is the only measure that involves a level of subjectivity (i.e., in regard to exact start and end times) and therefore the noted score discrepancy was not completely unexpected. All other scores can be easily and objectively measured from the video files, with little to no likelihood of disagreement between raters’ scores.
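For readers wishing to reproduce this kind of agreement analysis, a two-rater ICC can be computed directly from the analysis-of-variance mean squares. The thesis does not state which ICC form was used, so the sketch below assumes ICC(2,1) (two-way random effects, absolute agreement, in Shrout and Fleiss’s notation), and the rating counts are invented for illustration.

```python
# Illustrative ICC(2,1) (two-way random effects, absolute agreement) for
# two raters. The rating data are invented; the thesis does not report
# raw scores or the specific ICC form used.
import numpy as np

def icc_2_1(ratings):
    """ratings: (n_subjects, k_raters) array-like. Returns ICC(2,1)."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ms_rows = k * np.sum((x.mean(axis=1) - grand) ** 2) / (n - 1)   # subjects
    ms_cols = n * np.sum((x.mean(axis=0) - grand) ** 2) / (k - 1)   # raters
    ss_total = np.sum((x - grand) ** 2)
    ms_err = (ss_total - (n - 1) * ms_rows - (k - 1) * ms_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Two raters' inefficiency-error counts for five hypothetical participants;
# the raters disagree on one participant, as in the wandering example.
counts = [[2, 2], [0, 0], [1, 1], [3, 3], [1, 2]]
print(round(icc_2_1(counts), 2))  # 0.92
```

With perfect agreement on every participant the function returns 1.00, matching the perfect reliabilities reported for the other error categories.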


Correlational Analyses

Demographic variables.

All neuropsychological test measures, including the various MCT variables, were correlated with WTAR scores, years of education completed, and gender. As the sample was fairly homogeneous with regard to age (range = 18-25), age was not considered a variable of interest with regard to its associations with the MCT or the other test variables. Education was not found to be significantly associated with any of the test variables. Intellectual ability, as estimated by the WTAR, was positively correlated with the RBMT Story – Immediate Recall (r = .38, p = .04) and the MCT plan score measure (r = .42, p = .02). Females tended to perform more slowly than males on the TMT A, with the difference attaining significance (Males: Mean ± S.D. = 25.2 ± 10.4; Females: Mean ± S.D. = 33.6 ± 9.7; U = 48.50, p = .04). There were no significant gender differences on any of the other test measures, including any of the MCT variables.

MCT variables.

Spearman correlations were computed to examine the relationships amongst the various MCT measures (see Table 2.4). The plan score measure was negatively associated with the total errors variable, such that the higher the planning score, the lower the number of errors made on the task (r = -.45, p = .01). Not unexpectedly, longer completion times were positively associated with a higher number of errors (r = .46, p = .01).
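As a concrete illustration of the analysis above, a Spearman correlation is simply the Pearson correlation computed on ranks. The participant data below are invented, chosen only to mirror the direction of the reported plan-score/error relationship.

```python
# Sketch: Spearman rank correlation as used for the MCT inter-measure
# analysis. The data below are invented for illustration, not thesis data.

def rank(values):
    """Average ranks (1-based), handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of tied positions, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Hypothetical plan scores and total-error counts for five participants:
plan_scores = [6, 5, 3, 2, 4]
total_errors = [0, 1, 4, 5, 2]
print(round(spearman_rho(plan_scores, total_errors), 2))  # -1.0
```

In practice this is equivalent to calling `scipy.stats.spearmanr`, which also supplies the p-values reported in the text.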

MCT variables and cognitive (executive and non-executive) tests.

This analysis explored the associations between the MCT and the other neuropsychological test measures utilized in this study (see Table 2.5). Only one of the executive function measures (MSE) was significantly related to the MCT. That is, better performance on the MSE was associated with higher MCT planning scores (r = .40, p = .03). A separate analysis exploring the relationships between the standardized executive function measures (excluding the MCT) revealed that only one correlation was significant (i.e., TOL was negatively associated with BADS MSE, r = -.39, p = .03).

Correlations computed between the MCT and the non-executive test measures revealed positive associations between MCT completion time and TMT A (r = .38, p = .04), and between tasks completed and JLO (r = .45, p = .01). MCT plan time and total errors were not significantly related to any of the test variables (executive or non-executive).

Executive and non-executive tests.

Finally, correlations between the executive function tests (other than the MCT) and the other cognitive tests in the battery were computed. For this analysis, each executive function test was broken down into the various measures that could be derived from it (i.e., TOL: Total Move, Total Correct, Total Time Violations, and Total Rule Violations; MSE: Profile Score, Subtasks Attempted, Rule Breaks, Time Violations; TMT B: Total Time and Total Errors). For the TOL, only the Total Move score was significantly related to the JLO (r = -.38, p = .04). None of the MSE measures correlated significantly with any of the other cognitive measures. The TMT B Total Time score was negatively associated with the JLO (r = -.37, p = .04) and the RBMT Delayed (r = -.40, p = .03).

Order Effects Analysis

The order effects analysis (Mann-Whitney U test) indicated that there were no significant differences in test performance between the MCT-first and MCT-last groups.
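The order-effects check can be reproduced with a Mann-Whitney U statistic, sketched below in plain Python. The group scores are invented stand-ins for the MCT-first and MCT-last groups, since the raw group data are not reproduced here; significance would then be assessed against a critical-value table or a normal approximation.

```python
# Minimal sketch of the Mann-Whitney U statistic used for the order-effects
# analysis. Group scores are invented; the thesis reports only that no
# significant MCT-first vs. MCT-last differences were found.

def mann_whitney_u(x, y):
    """Smaller of the two U statistics (ties counted as half)."""
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0
              for a in x for b in y)
    return min(u_x, len(x) * len(y) - u_x)

mct_first = [25, 31, 28, 33, 27]  # hypothetical completion scores
mct_last = [26, 30, 29, 34, 32]

print(mann_whitney_u(mct_first, mct_last))  # 9.0
```

By standard two-tailed tables for n1 = n2 = 5, significance at the .05 level requires U ≤ 2, so the U = 9 obtained here would be non-significant, consistent with the null order-effects result reported above.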


Discussion

A novel VR test of planning behaviour, the MCT, was administered to a normal sample with the objectives of exploring executive functioning using an ‘ecologically valid’ measure and of determining associations between the MCT and standardized neuropsychological tests. This study set out to address whether the ecological validity of executive function measures could be improved and to further our understanding of the operation of various executive function subprocesses (by way of an analysis of MCT test performance characteristics). In order to be successful on the MCT, participants must plan efficiently, monitor their actions, multitask and prioritize competing subtasks, and utilize feedback to guide decision making. Errands consisted of everyday tasks such as making purchases, meeting time deadlines, and staying within a budget when purchasing items.

MCT Performance

As would be expected of a normal sample, performance on the MCT was generally good, with most participants successfully completing all tasks within the time provided. Nonetheless, considerable errors and inefficient patterns of performance were noted within the sample. Ceiling effects were avoided to some extent. That is, while the number of tasks completed was high, other qualitative aspects of performance indicated that very few individuals in the sample obtained a ‘perfect’ performance. As noted above, most participants made errors during the task despite having successfully completed all tasks. The most common error involved attempting to make a purchase before obtaining money from the bank, made by almost half of the sample (13/30). Other frequent errors involved inefficient patterns of performance such as entering a building but not performing the required task within it, entering the same building on two separate occasions to perform two tasks that could have been performed in a single visit, or entering unnecessary buildings and other forms of ‘wandering’ behaviour. Some inefficiency errors were spatial in nature, such as taking haphazard or inefficient routes around the city to complete task requirements.

Participants were evaluated on their devised plans, which indicated the route they would take to complete errands around the city prior to task initiation. Curiously, plan scores were lower than expected for the group despite the fact that participants were given the opportunity to familiarize themselves with the city by ‘walking’ around it during a practice session and even though task instructions were thoroughly reviewed prior to devising their plans. Potential reasons for the low plan scores may be that normal participants underestimated the importance of planning and/or overestimated their own abilities and thus did not put sufficient care and effort into this aspect of task performance. The most common error made was to plan to visit the drug store and/or stationery store prior to visiting the bank. This was also the most common error made during actual task performance, as participants generally attempted to follow the plans they devised upon task initiation. Some were then forced to mentally revise their planned sequence of actions during the task upon receiving feedback that they had ‘insufficient funds’ to perform the tasks in the originally intended order. Interestingly, when this happened, many participants had to abandon their original plans as they proceeded to complete all remaining tasks in a very different order from the one on paper. At times, this involved completing tasks rather haphazardly from both a spatial and ‘logical’ perspective. These findings highlight that, for efficient and successful goal-directed behaviour, it is important to devise a good plan prior to task initiation.


The plan analysis revealed that the majority of participants did not record any information related to budgeting on paper (i.e., calculating how much money they would have left over to purchase pens). This was interpreted as largely due to the minimal amount of mathematical computation required. The lack of evidence of budgeting did not seem to affect performance in this regard, as none of the participants purchased the most expensive ($6) pens. A few participants did run out of money, however, because they repeated purchases (e.g., buying groceries more than once). It is worth noting that some repeat purchases were related to technical problems: participants who appeared to accidentally back into the cashier after making a purchase would be automatically prompted to choose whether they would like to make the same purchase again. Although there was an option to ‘do nothing’, in those infrequent instances during which the participant decided to purchase the item again, this was not categorized as a repeat purchase as it was considered a technical error. Actual task repetitions, however, did occur. That is, there were instances in which participants left and then returned to the same building and purchased the same item again even though information on the computer monitor indicated that this item was already purchased and in their backpack. Task repetitions such as these can potentially be used to assess the ability to monitor behaviour. In the current normal sample, such repetitions occurred too infrequently to be indicative of an actual monitoring deficit, but such data would be useful when analyzing performance patterns in patients with suspected or documented executive function impairments (see Chapters 3 and 4).

The relationships amongst the various MCT measures were explored in an effort to investigate any interesting patterns of performance in the sample. The most noteworthy finding from this analysis was that higher plan scores were associated with a lower number of total errors on the task. This makes sense intuitively and again highlights the importance of careful planning and foresight in successful, goal-directed behaviour. Although these findings are merely correlational and thus cannot be interpreted as causal, they suggest that even neurologically-intact individuals may benefit from good planning given its association with fewer errors during task performance.

The MCT and its Relationships with Other Cognitive Measures

An interesting finding to emerge from the analysis was the relative lack of association between the MCT and the other executive function tests. In fact, the only measure that was associated with the MCT was the MSE, which was the only test in the battery other than the MCT that was designed with ‘ecological validity’ in mind. The MSE was associated with the MCT plan score. Both measures appear to have parallel requirements involving planning, organization, and the ability to prioritize. In the case of the MCT, the plan score reflects the participant’s ability to multitask and prioritize competing subtasks according to information provided in the instructions. The MSE also involves multitasking and prioritizing in that attempting all tasks is more important than trying to complete any one task. The ability to set priorities in the face of competing subtasks is a common element in several novel executive function tests designed with ecological validity in mind (e.g., some BADS subtests, MET, etc.) but appears to be lacking in many of the traditional executive measures (e.g., WCST, TOL, TMT B, etc.). The objective evaluation of the ability to prioritize is important in regard to ecological validity, as many real life situations require consideration of the order in which a list of tasks must be accomplished within a particular time frame. Moreover, prioritization is a major component of multitasking, a common impairment amongst individuals with executive dysfunction (e.g., Shallice & Burgess, 1991).


Given the multifarious nature of the MCT, and the fact that the other executive function measures utilized here are generally regarded as more unitary measures of executive function processes (with the exception of the MSE), it was anticipated a priori that the executive tests in the battery would not be significantly correlated with the MCT. This hypothesis was confirmed.

A possible explanation for these findings is that traditional executive tests measure the component processes or “building blocks” of executive functions. In contrast, the multitasking and planning elements of the MCT suggest that it requires the ability to initiate these component processes, monitor performance, and use this information to adjust behaviour (Burgess & Alderman, 1990).

Other studies in the neuropsychological literature provide converging evidence suggesting that VR tasks may be measuring different executive processes when compared to their paper-and-pencil test counterparts. The research literature demonstrates poor correlations between performance on tests such as the WCST or the BADS and VR tasks designed with ecological validity in mind (McGeorge et al., 2001; Pugnetti et al., 1998a). Similar to the present investigation, these studies suggest that VR verisimilitude tests of executive functioning are tapping into different types of executive mechanisms than existing paper-and-pencil measures.

One common factor among VR tests of executive functioning that provides some evidence of better ecological validity when compared to more traditional tests is the increased cognitive demand required to perform such tasks. As mentioned previously, Pugnetti and colleagues (1998a) suggested that their VR task (as opposed to the WCST) required that “perceptuomotor, visuospatial (orientation), memory, and conceptual aspects of the task be fully integrated into an efficient routine” (p. 160) for successful performance. In the present investigation, MCT performance was related to performance on tests of visuospatial ability and processing speed. Significant associations between the MCT tasks completed score and JLO performance suggest visuospatial processes are involved in MCT performance. Perhaps participants who perform more successfully on the JLO have better visuospatial skills, enabling less effortful navigation in the VE and thereby freeing up cognitive resources for other task demands. The significant association between TMT A and MCT completion time indicates that information processing speed is a relevant component of MCT performance. The fact that MCT scores were related to these non-executive measures speaks to the complex cognitive demands of this test. These results also support the notion that the MCT is an ecologically valid measure that appears to simulate the complex demands of real world activities, which involve the recruitment of various lower level cognitive processes.

The MCT and Associations with Demographic Variables

Demographic variables associated with MCT performance included education and IQ.

The current findings indicated that higher levels of education were associated with longer completion times and meeting fewer deadlines. This result was rather unexpected. Considering both the narrow range and relatively high education levels in this sample of undergraduate students, however, it is most probably a spurious finding rather than a meaningful association. Alternatively, the longer completion times in those who achieved higher levels of education may be indicative of increased reflectiveness, which may in turn have interfered with meeting the required deadlines.

IQ scores, on the other hand, were found to be more variable across the sample and showed a positive association with only the plan score measure from the MCT. This would suggest that intellectual ability is generally not associated with actual performance on the MCT, a finding that is consistent with previous research indicating weak relationships between intelligence and performance on various executive function measures (e.g., Eslinger & Damasio, 1985; Shallice & Burgess, 1991). The positive relationship between IQ and the plan score may reflect the importance of verbal skills for the ability to devise good plans, as the measure of IQ utilized here was merely a brief test of reading ability (WTAR). It would be interesting to explore associations between the MCT and a more extensive intelligence battery such as the Wechsler Adult Intelligence Scale-III (WAIS-III) or even an abbreviated version of the Wechsler battery (i.e., the Wechsler Abbreviated Scale of Intelligence) to corroborate findings of poor associations between IQ and executive functioning.

Conclusion

In sum, the purpose of the present study was to evaluate the multiple executive functions that must be integrated to achieve goal-directed, purposive behaviour in everyday life. In order to do so, a VR test of planning and multitasking ability (MCT) was developed with ecological validity in mind. The current study characterized normal performance on the MCT and described its relationship (or lack thereof) with standardized cognitive tests, including other executive function measures. The preliminary evidence suggests that the MCT may provide an ecologically valid method of objectively evaluating the integration of component executive functions into meaningful behaviour and represents a departure from traditional tests that tend to focus simply on the ‘building blocks’ of executive function.

Chapter 3 Ecologically Valid Assessment of Executive Dysfunction in Patients with Acquired Brain Injury

Abstract

The current investigation sought to further establish the psychometric properties of the MCT and its potential clinical utility by characterizing performance on this task in a group of patients with acquired brain injury. The primary objective of this research was to evaluate the ecological validity of the MCT. Ecological validity was addressed via the exploration of associations between performance on this test and a subjective measure of everyday executive functioning (Frontal Systems Behavior Scale, FrSBe). The sample comprised 13 individuals (11 males) who suffered a stroke or traumatic brain injury. A neuropsychological test battery consisting of the MCT and common executive and non-executive measures was administered. The only executive function tests that were significantly related to the FrSBe were the MCT and a semantic fluency test. All other executive test measures (Wisconsin Card Sorting Test, Controlled Oral Word Association Test, Trail Making Test Part B, and Modified Six Elements Test) were not related to the FrSBe. The patient group produced better plans but completed fewer tasks on the MCT than the previous normal sample. Patient and normal groups tended to make similar types of errors, although some of these errors occurred more frequently in the patient sample. This study demonstrated the ecological validity of the MCT and suggested that patients can be differentiated from healthy individuals by quantitative (i.e., number of errors) rather than qualitative (i.e., type of errors) aspects of performance. Further interpretation of MCT performance and comparison with existing executive function tests are discussed.


Introduction

Stroke and TBI are among the most common forms of acquired brain injury (ABI), often leading to cognitive disability and behavioural difficulties. Stroke is the third leading cause of death in the United States and is a leading cause of long-term disability; approximately 795,000 strokes occur in the U.S. each year, and the American Heart Association estimated that the cost of stroke would amount to almost $73.7 billion in direct and indirect costs in 2010 (American Heart Association, 2010). The statistical data and economic effects of TBI tell a similar story. Each year in the United States, there are approximately 1.7 million new cases of TBI (National Center for Injury Prevention and Control, 2010). In 2000, TBI was associated with an estimated cost of $60 billion in direct medical costs and indirect costs such as lost productivity in the United States (Finkelstein, Corso, & Miller, 2006).

Arguably, one of the most disabling impairments in patients with ABI is executive dysfunction. Executive function deficits are common in stroke (Pohjasvaara, Leskelä, & Vataja, 2002; Zinn, Bosworth, Hoenig, & Swartzwelder, 2007) and TBI patients (Mattson & Levin, 1990; Stuss & Gow, 1992), and real life everyday disorganization is often the chief cognitive complaint in such patients (Lazar, Festa, Geller, Romano, & Marshall, 2007; Shallice & Burgess, 1991). Although it is well-established that executive deficits can result from ABI, the impact of these deficits on daily activities is not well understood. This may in part be due to the limited ecological validity of traditional executive function measures (Marcotte, Scott, Kamat, & Heaton, 2010).

Given the focus on ecological validity in the development of the MCT, it was of interest in the current investigation to further establish the psychometric properties of the test and its potential clinical utility by characterizing MCT performance in a group of patients with stroke or TBI. The primary objective of this research was to evaluate the ecological validity of the MCT. Ecological validity was addressed via the exploration of associations between performance on this test and a subjective measure of everyday executive functioning. Although it is recognized that there is no absolute method for the quantification of real life executive functioning, as any method of assessment will involve a certain degree of error, an informant-based questionnaire appeared to be the best choice for the current purposes for a number of reasons. First, reliance on a self-report measure from a brain damaged individual poses problems with measurement error, as impaired memory or lack of insight into deficits is often characteristic of ABI (Sazbon & Groswasser, 1991). Second, clinician ratings may be biased, as clinicians typically see only a limited sample of the individual’s behaviour in a clinical setting, which may not be reflective of actual behaviour in everyday environments. Finally, informant-based questionnaires are typically associated with fewer systematic biases and have been utilized in many previous studies of ecological validity in the executive function literature (e.g., Wilson, Alderman, Burgess, Emslie, & Evans, 1996; Burgess, Alderman, Evans, Emslie, & Wilson, 1998; Knight, Alderman, & Burgess, 2002). Thus, a well-established behavioural rating scale called the Frontal Systems Behavior Scale (FrSBe; Grace & Malloy, 2001) was used as the primary measure of real life executive functioning. The current research also explored the question of whether other common executive function measures are related to informant reports of everyday executive functioning skills using the FrSBe, to determine whether these tests are more or less predictive of real world functional outcome. It was hypothesized that the MCT would be more predictive of everyday executive functioning given that it was designed with ecological validity in mind.


A secondary goal of the present study was to characterize MCT performance in a sample of patients with ABI to further explore the executive function system in those with reported executive deficits. Although not a primary focus, MCT performance in the patient sample was compared with the normative sample from study 1 in an attempt to determine whether specific qualitative and/or quantitative patterns of performance (e.g., number or type of errors, completion time, etc.) might be markers of executive impairment.

Thus, the current study addressed the following research questions:

1. Which executive function measures are most ecologically valid? This analysis focused on a comparison between the MCT and common standardized measures of executive function.

2. How do patients with ABI and reported executive deficits perform on the MCT? Does their performance differ qualitatively or quantitatively from normal participants?

Method

Participants

Patients with ABI were recruited from the outpatient Neuropsychology program at Toronto Rehabilitation Institute's Rumsey Hospital. Patients were selected using the following initial criteria: a diagnosis of stroke (as evidenced by positive neuroimaging findings) or TBI of moderate or severe degree (as defined by Glasgow Coma Scale [GCS] score and duration of post-traumatic amnesia, along with positive neuroimaging findings) at a minimum interval of 12 weeks prior to testing, to avoid bias from transient impairment of cognitive function in the acute stage following brain injury onset. In addition, patients were excluded if there was evidence of any of the following conditions: severe language or sensory deficits, hemiplegia, vestibular problems, or a prior history of major psychiatric disorder, alcoholism, or drug abuse that could potentially moderate cognitive functioning. Patients who did not have the capacity to provide informed consent or who were unable to read, understand, and speak English were also excluded from the study.

Throughout the recruitment period, the clinical charts of all patients referred by other therapists (e.g., occupational therapists, physiotherapists, speech-language pathologists) to the Neuropsychology program as part of their routine care were reviewed by the staff clinical neuropsychologist. Patients who met the above criteria were contacted by telephone for recruitment.5 Details of the study were explained at the time of scheduling the routine neuropsychological evaluation that all patients must complete as part of the intake procedure in Rumsey's Neuropsychology program. Of the 29 eligible patients, 13 agreed to participate. Of these 13 patients, 11 had sustained a stroke and two were diagnosed with TBI. Six had right-sided lesions, six had left-sided injury, and one had bilateral brain damage. Table 3.1 provides a more detailed description of each patient's neuropathology and illustrates that the sample was diverse with regard to brain injury localization. Severity of injury was determined from acute health records. Modified Rankin Scale (mRS; van Swieten, Koudstaal, Visser, Schouten, & van Gijn, 1988) scores for the patients with stroke were estimated on the basis of records obtained at the time of admittance to the Neuropsychology program. The mRS is a commonly used scale for measuring the degree of disability or dependence in the daily activities of patients with stroke, and it has become the most widely used clinical outcome measure in stroke clinical trials. Of the 11 patients with stroke, two were assigned mRS scores of 3 to 4, suggesting moderately severe disability requiring assistance with ambulation and/or other daily activities; eight were

5 The telephone numbers of patients were available in their clinical charts.

assigned scores of 2, indicating that they needed assistance with some daily life activities; and one was assigned a score of 1, indicating no apparent disability. The two patients with TBI were classified as having sustained a moderate to severe brain injury (defined as a GCS score at injury of less than 12, documented loss of consciousness of greater than 30 minutes, and/or documented post-traumatic amnesia of greater than 24 hours). Of the 13 patients in the sample, only two had returned to work at the time of testing; nine had not returned to work and were on disability leave, one was retired at the time of injury onset, and one attempted a return to work but was unable to perform tasks other than 'easy' duties and subsequently ceased working.

Each patient was asked to name a significant other who might respond by telephone or in person to a questionnaire (FrSBe – Family Rating). All significant others agreed to participate.

The current study was conducted in accordance with human ethics standards and received ethics approval from the joint Toronto Rehabilitation Institute/University of Toronto Scientific and Ethics Review Committee. All patients and their significant others provided informed consent, written or verbal (the latter in the case of most significant others).

Measures

Sociodemographic and clinical information was collected for each patient, including gender, age, highest level of education completed, diagnosis, and imaging findings. The WTAR was used to estimate premorbid IQ. All patients completed the Test of Memory Malingering (TOMM; Tombaugh, 1996) to assess effort and symptom credibility. The entire sample produced credible scores on the TOMM, indicating no evidence of symptom exaggeration or malingering (defined as a score of ≥45 out of 50 on Trial 2, as per the test manual criterion). In addition, all patients completed a neuropsychological test battery involving measures of visuoperception and visuoconstructional abilities, memory, attention, and executive function as part of a larger battery of tests routinely administered at the Rumsey hospital. The battery consisted of the MCT along with the following tests: COWAT, Semantic Fluency (Animals), WCST, MSE, TMT A and B, WAIS-III subtests (Digit Symbol, Block Design, and Digit Span), JLO, Rey-Osterreith Complex Figure Test, California Verbal Learning Test-II, and Wechsler Memory Scale-III Logical Memory subtest. The Beck Depression Inventory-II (BDI-II) and Beck Anxiety Inventory (BAI) were also administered to assess psychological symptomatology. A brief questionnaire was completed in which patients rated their computer skills and experience. The patients and a significant other or close relative completed parallel forms of the FrSBe (Self and Family Rating Forms). The test battery was administered in a counterbalanced, predetermined order in which 7 patients completed the MCT as their first test and 6 patients completed the MCT as their last test, allowing for the exploration of potential order effects between the MCT and the other executive function tests administered.

Each patient was evaluated for a total of 4-6 hours over the course of 1-4 sessions.

Descriptions of the MCT, MSE, TMT, WTAR, and JLO have already been provided in Chapter 2. A brief description of the other standardized tests and questionnaires included in the battery is provided below; the reader is referred to the respective test manuals, along with Lezak, Howieson, and Loring (2004) and Strauss, Sherman, and Spreen (2006), for more detailed descriptions of the test measures utilized.


Executive Function Tests

Controlled Oral Word Association Test (COWAT; Benton & Hamsher, 1989; Spreen & Strauss, 1998).

The purpose of the COWAT and other fluency tasks (including the Semantic Fluency task described below) is to assess the spontaneous production of words under restricted search conditions. Fluency tests are used to measure executive functioning. The COWAT involves three one-minute word naming trials in which the examiner asks the examinee to say as many words as they can think of that start with a given letter of the alphabet. Examinees are instructed to avoid proper names, numbers, and the same word with a different suffix. In the version utilized in this battery, the letters employed were F-A-S. The score is the total number of admissible words across the three letters.

Semantic Fluency (Animals).

The examinee is asked to produce as many animal names as possible within a one-minute interval. Proper names, wrong words, and repetitions are not admissible. The score is the total number of admissible words for the category.

Wisconsin Card Sorting Test (WCST; Berg, 1948; Grant & Berg, 1948; Heaton, Chelune, Talley, Kay, & Curtis, 1993).

The WCST is a widely used executive function measure of the ability to form abstract concepts, to shift and maintain set, and to utilize feedback. It consists of four stimulus cards that are placed in front of the examinee. The first card shows one red triangle, the second two green stars, the third three yellow crosses, and the fourth four blue circles. The examinee is given two decks of 64 response cards with designs similar to those on the stimulus cards, varying in colour, form, and number. The examinee is instructed to match the cards in the decks to one of the four stimulus (or "key") cards and is told after each match whether he or she is right or wrong. A "category" is completed after the examinee has made 10 consecutive correct responses, prompting a switch to the next sorting category (i.e., colour is the first category, form the second, number the third, and then back to colour, form, and number for a total of six categories). No warning is given to the examinee that the sorting rule has changed. Typically, the test is discontinued after the examinee has successfully completed six categories or, failing that, after all 128 cards have been sorted. A number of scores can be derived from WCST performance, but only one was used in the current analysis to reduce Type I error: Percent Perseverative Errors (the number of perseverative errors divided by the total number of trials administered, multiplied by 100) was selected as the measure of executive functioning from the WCST, as this variable is considered more purely executive than other variables that can be derived from this test (Lezak, 1995) and has been used in previous ecological validity research (e.g., Burgess, Alderman, Evans, Emslie, & Wilson, 1998; Poole, Ober, Shenaut, & Vinogradov, 1999; Ready, Stierman, & Paulsen, 2001).
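As a minimal illustrative sketch of the score described above (the function and variable names are my own, not part of the WCST materials), Percent Perseverative Errors is simply the proportion of perseverative errors among all trials administered, expressed as a percentage:

```python
def percent_perseverative_errors(perseverative_errors: int,
                                 trials_administered: int) -> float:
    """Perseverative errors as a percentage of all trials administered,
    per the definition above (errors / trials x 100)."""
    if trials_administered <= 0:
        raise ValueError("trials_administered must be positive")
    return 100.0 * perseverative_errors / trials_administered

# e.g., 16 perseverative errors over a full 128-card administration
score = percent_perseverative_errors(16, 128)  # 12.5
```

Because the denominator is trials administered rather than a fixed total, the score is comparable across examinees who were discontinued at different points.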

Other (Non-Executive) Cognitive Function Tests

Digit Symbol-Coding, Block Design, & Digit Span - Wechsler Adult Intelligence Scale - III (WAIS-III; The Psychological Corporation, 1997).

The WAIS-III is a battery of tests designed to provide a measure of general intellectual function in older adolescents and adults. The individual subtests of the WAIS-III are often used to evaluate specific cognitive abilities. The following is a brief description of the three subtests administered in the current study. The Digit Symbol-Coding subtest is a visuographic test of psychomotor performance that requires examinees to copy symbols paired with numbers during a 120-second trial. Block Design is a visuoconstruction task in which examinees are asked to replicate models or pictures of two-colour designs with blocks. Digit Span is a test of attention, specifically short-term storage capacity, comprised of two different tests: Digits Forward requires examinees to repeat number sequences read by the examiner in the same order as presented, and Digits Backward requires the examinee to repeat the number sequence in reverse order. For each of the WAIS-III subtests utilized in this battery, the raw scores (i.e., the total number of correct responses or correct trials) were used for statistical analyses.

Rey-Osterreith Complex Figure Test (ROCF; Meyers & Meyers, 1995).

The ROCF was designed to assess both perceptual organization and visual memory. Various administration procedures exist for the test. In the current study, examinees were asked to copy a complex figure onto a sheet of paper, followed by recall trials 3 minutes later (Immediate Recall) and 30 minutes later (Delayed Recall) in which they were asked to reproduce the figure from memory. Immediately after the Delayed Recall trial, a Recognition subtest was administered in which 24 figures are randomly placed across four pages and the examinee is asked to circle the figures that were part of the original complex figure. An 18-item, 36-point scoring system is used for the copy and recall trials. Two scores from this test were entered into the analysis: Delayed Recall and Recognition (i.e., the number of items correctly and incorrectly identified on the recognition task). Certain scores, such as those derived from the immediate recall trials of the ROCF and the other two memory tests in the battery (described below), were omitted from the statistical analyses in order to reduce Type I error.

California Verbal Learning Test-II (CVLT-II; Delis, Kramer, Kaplan, & Ober, 2000).

The CVLT-II measures verbal learning and memory using a multiple-trial list-learning task. The test evaluates both recall and recognition of two word lists during immediate and delayed memory trials. There are five learning trials in which the examinee is asked to recall words from List A immediately after presentation of the list. List A is composed of 16 words, four from each of four semantic categories (furniture, vegetables, animals, and ways of travelling). A second, interference list (List B) is then presented for one trial, followed by short-delay free-recall and short-delay cued-recall trials of List A. After a 20-minute period, long-delay free-recall, long-delay cued-recall, and yes/no recognition trials of List A are administered. The cued-recall trials allow for the evaluation of the use of semantic associations as a strategy for learning words. A number of scores can be derived from the CVLT-II. The following were used in the current analyses: Trials 1-5 (the sum of the words recalled across learning trials 1 through 5 from List A, commonly considered a measure of verbal learning), Long-Delay Free Recall (the sum of List A words correctly recalled upon free recall), and Recognition Hits (the sum of List A words correctly recognized).

Logical Memory - Wechsler Memory Scale-III (WMS-III; Wechsler, 1997).

The Wechsler Memory Scale is a widely used memory battery that has undergone several revisions. The Logical Memory (LM) subtest from the WMS-III battery was designed to test immediate (LM I) and delayed (LM II) recall of two short verbal stories read aloud by the examiner. During LM I, the first story is read to the examinee only once and the second story is read twice. After each reading, the examinee is asked to retell the story from memory, in order, using as close to the same words as possible. During LM II, the examinee is again asked to retell the two stories from memory after a 30-minute delay and is prompted with a clue only if they are unable to recall any details from a story. None of the patients in the present sample required prompting for either story. Only the LM II raw score was used in the current analyses; it represents the total number of units recalled across the two delayed recall trials.

Beck Inventories.

The BDI-II (Beck, Steer, & Brown, 1996) is a 21-item survey of mood symptoms. The questionnaire asks the examinee to rate each item on a four-point scale in regard to depressive symptoms experienced over the past two weeks. Scores are classified as ranging from minimal to severe. The BAI (Beck, 1993) is an analogue to the BDI-II used for the measurement of symptoms of anxiety, and it likewise yields a score ranging from minimal to severe.

Frontal Systems Behavior Scale (FrSBe; Grace & Malloy, 2001).

The FrSBe is a 46-item, paper-and-pencil behavioural rating scale that was designed to evaluate three personality syndromes hypothesized to be associated with frontal lobe systems dysfunction: apathy, disinhibition, and executive dysfunction. Higher scores reflect greater behavioural disturbance. The FrSBe can also be used to quantify behavioural changes over time with the inclusion of both baseline (retrospective) and current assessments of behaviour. All items on the FrSBe are evaluated on a five-point scale (1 = Almost Never; 2 = Seldom; 3 = Sometimes; 4 = Frequently; 5 = Almost Always). The FrSBe includes a Total Score, along with scores on three subscales: Apathy (14 items), Disinhibition (15 items), and Executive Dysfunction (ED; 17 items). Each patient completed the Self-Rating Test Booklet and a significant other/close relative completed the identical Family Rating Test Booklet. The correlational analyses included only Family Ratings, for the reasons outlined above regarding accuracy of evaluation and insight (see Introduction), with the exception of the calculation of the insight index described below, which required both Self- and Family-Rating scores. In addition, only "current" behavioural ratings were considered for both the Total score and the Executive Dysfunction subscale, as this study was concerned with present functioning. The Apathy and Disinhibition subscales of the FrSBe were not entered into the analyses, to reduce the number of correlations (and thus Type I error), as it was hypothesized that these scales would not relate to the MCT or the other executive function tests in this battery in any meaningful way.

An index of self-awareness or "insight" was computed by measuring the discrepancy between FrSBe ratings obtained from patients and their significant others (Self minus Family Ratings) for both the Total scores and the ED subscale scores.6 Positive scores indicate overestimation of deficits by the patients while negative scores indicate underestimation of deficits. Patients with discrepancy scores < -10 were classified as "under-estimators" and those with discrepancy scores > +10 were considered "over-estimators". As 100% agreement between patients and significant others is highly unlikely, accurate estimators were those who disagreed by no more than 10 points in either direction (±10), as per Hoerold and colleagues (2008).

6 The computation of insight scores in this manner may be limited to some extent by bias on the part of some informants.
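As an illustrative sketch of the classification rule just described (the function name and example scores are hypothetical, not taken from the study data), the discrepancy-based insight index can be expressed as:

```python
def classify_insight(self_rating: float, family_rating: float,
                     tolerance: float = 10.0) -> str:
    """Classify insight from the FrSBe discrepancy (Self minus Family).
    Positive discrepancies indicate overestimation of deficits and
    negative discrepancies underestimation; disagreement of no more
    than 10 points in either direction counts as accurate estimation,
    per the rule described above."""
    discrepancy = self_rating - family_rating
    if discrepancy > tolerance:
        return "over-estimator"
    if discrepancy < -tolerance:
        return "under-estimator"
    return "accurate estimator"

# Hypothetical example: a patient rating themselves 15 points worse
# than their family informant would be an over-estimator of deficits.
label = classify_insight(55, 40)  # "over-estimator"
```

Note that a discrepancy of exactly ±10 falls within the accurate-estimator band, consistent with the "no more than 10 points" criterion.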


Computer Skills Questionnaire.

A brief evaluation of each patient's knowledge of and background with computer use was devised in order to assess whether computer experience had any impact on MCT performance (please refer to Appendix C for a copy of the questionnaire).

Data Analyses

Descriptive analyses (qualitative and quantitative) were conducted for all MCT measures.

All statistical analyses were performed using SPSS version 17.0. As with the previous study, data analyses were limited primarily to descriptive statistics and analysis of bivariate relationships using non-parametric statistics, owing to the small sample size. With regard to the inferential statistical analyses, corrections to significance levels were again not made (i.e., p values of .05 or less were considered significant) due to the exploratory nature of this research, as indicated in Chapter 2. The following MCT scores were entered into the analyses: Plan Time, Plan Score, Completion Time, Number of Tasks Completed, and Total Errors (Insufficient Funds + Time Deadlines Not Met + Task Repetitions + Inefficiencies). Spearman correlation coefficients were computed in order to determine the associations among the various MCT scores and between MCT scores and demographic variables, computer skills ratings, executive function tests, non-executive tests, FrSBe Total and ED scores, and the insight index scores.
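For readers unfamiliar with the statistic, Spearman's rank correlation is Pearson's correlation applied to rank-transformed data (with tied observations sharing their average rank). The analyses here were run in SPSS; the following self-contained sketch merely illustrates the computation in principle:

```python
def average_ranks(xs):
    """1-based ranks of xs, with ties assigned the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Because only ranks enter the computation, the coefficient is robust to outliers and to monotone but non-linear relationships, which is why it suits small clinical samples such as this one.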

Given that the current sample was not matched in age, education, or gender to the normal sample from Chapter 2, non-parametric statistics would be inappropriate for the exploration of any group differences. Loglinear regression and effect size analysis were also not possible due to the obvious confounding variable of age, considering that executive function is known to decline with age (Levine et al., 1998) and that there is a stark contrast in age range between the two groups (i.e., normal sample: 18-25 years; patient sample: 40-75 years). Thus, if any significant differences between groups were found using statistical analyses, age could not be ruled out as responsible for them. The groups were therefore compared only qualitatively. To analyse any order effects, a Mann-Whitney U test was conducted to compare the neuropsychological test results of the MCT-first and MCT-last groups.

Results

Demographic characteristics are presented in Table 3.2. Briefly, the sample was comprised of 13 patients (11 males) with an average age of approximately 60 years (M = 58.38, S.D. = 10.77, range = 40-75). The average premorbid IQ (WTAR score) of the sample fell in the High Average range, with scores ranging from Average to Superior, and average education was 14 years, indicating that the group was comprised of individuals who were high functioning prior to injury. Depression and anxiety scores (as measured by the BDI-II and BAI, respectively) for the entire sample were in the minimal or mild range, suggesting that these psychological variables had little clinical relevance and were not likely to play a role in performance on the neuropsychological and behavioural measures in the battery.

Test Performance

Test performance scores are summarized in Table 3.3. Of the 17 neuropsychological test variables entered into the analyses (including the WTAR), two patients obtained borderline or impaired scores on 10 of these measures, another six patients achieved borderline or impaired scores on between 1 and 5 measures, while the remaining five patients performed in the low average range or better on all tests in the battery. Of the five executive function test variables (not including the MCT), four patients achieved impaired scores on the WCST; three were impaired on the COWAT, two of whom were also impaired on the Animals test; three patients were impaired on the TMT-B; and only one achieved a poor score on the MSE.

Table 3.3 also shows FrSBe Self-Rating and Family-Rating scores (Total and ED subscale). In regard to Self-Ratings, 7 patients obtained elevated scores, 2 were borderline, and 4 were normal for the Total Score. The Self-Rating ED subscale scores showed a similar pattern: 7 elevated, 1 borderline, and 5 normal. The distribution of Family-Rating scores was remarkably similar: for both Family Total and ED subscale scores, 7 were elevated, 1 borderline, and 5 normal. At the level of individual patients, however, Self- and Family-Ratings were not always consistent. For Total scores, 8 were the same (e.g., both self and family scores were elevated for the same individual) and 5 were different (i.e., normal self ratings were matched with borderline or elevated family ratings for the same patient, or vice versa). For ED subscale scores, 7 were the same and 6 were different. The mean FrSBe Total and ED scores, however, were quite similar when comparing Self- and Family-Ratings.

MCT Plan Analysis and Comparisons with Normative Sample

The patient sample as a whole required more time to plan the task requirements than the normal sample (see Tables 2.2 and 3.3). Their average planning time was almost 5 minutes, in contrast to around 3 minutes for the normal group. The amount of time spent planning the task varied substantially across the sample (range: 135-490 seconds). The extra time spent during this planning phase appears to have been advantageous, as the patient sample obtained higher plan scores than the normal group (Patients: M = 4.62, S.D. = 1.73; Normals: M = 3.80, S.D. = 1.95). The poorest plan produced in the patient group (plan score = 1.5/6) was incomplete, involving visits to only 4 of the 6 locations in an incorrect sequence, with the bank being the last place visited. The next two poorest plans (scores < 3/6) were penalized for placing the bank visit last in the specified sequence or omitting the bank visit altogether. Overall, eight patients indicated on their plans that they would visit the bank prior to making any purchases. As with the normal sample, there was very little evidence of budgeting in the plans devised, with only one patient including any information indicative of determining the finances to allocate for the purchase of pens. As noted in Chapter 2, this was deemed more reflective of the limited budgeting requirements of the task than of simple neglect of this aspect of the task instructions. Indeed, all patients purchased pens that were within budget.

MCT Performance: Qualitative and Quantitative Analysis and Comparison to Normal Sample

In terms of MCT performance, there were some qualitative and quantitative differences between the patient and normal samples in regard to completion time and specific types of task failures and inefficiencies (see Tables 3.3 and 3.4). As with planning time, completion times for the two groups were dramatically different, with the normal sample completing the task twice as quickly as the patient group (Normals: M = 319.53, S.D. = 124.21, range = 163-767; Patients: M = 636.00, S.D. = 233.16, range = 243-924; all scores measured in seconds). In terms of task failures, there appeared to be a pattern in the type of errors each group made. The normal group made more insufficient funds errors than the patients, as they were more likely to attempt a purchase before they had withdrawn money from the bank. Given that the sample size of the normal group was more than double that of the patient group, however, the difference may not be statistically meaningful even if such a comparison were appropriate here (i.e., 43% of the normals compared to 31% of the patients made this error). The normal group was also more likely to perform the post office task in two separate visits rather than one (27% versus 15%), but again the difference may not be large enough to be meaningful given the unequal sample sizes. The remaining distinctions in performance were noteworthy because the patient sample actually made more errors than the normals despite being much fewer in number. The most striking difference was the total number of task failures: patients made 29 such errors versus 24 in the normal group, a difference that may not seem remarkable if not for the considerable sample size discrepancy between the two groups.

When analyzing these error patterns qualitatively, the variable most contributory to this difference was the number of incomplete tasks: normals were far more likely to complete all task requirements (as noted in Chapter 2), with only 20% of that sample (6/30) failing to complete all 8 tasks, whereas 62% of the patient group did not complete all tasks. Moreover, a total of 19 tasks were left incomplete across the patient group as a whole, while only 8 tasks were not completed across the normal sample. There was also more variability in the number of tasks completed within the patient group, ranging from 3-8 in contrast to 5-8 in the normal sample.

The other performance patterns are more difficult to interpret given the discrepancy in sample sizes but are worthy of mention. Patients were more likely not to meet scheduled time deadlines, as this task failure occurred a total of six times across the patient group and only two times in the normal group. In terms of differences by percentage of sample, patients were more likely to make inefficiency errors involving entering a building in which there was no task to perform (31% versus 10%) along with notable wandering behaviour (38% versus 13%). Surprisingly, the normal and patient groups did not differ in repetition errors, and few task repetitions were made in general.


Concordance between MCT Plans and Actual Performance

In order to investigate whether the patient sample followed the plans they devised during performance on the MCT, an analysis similar to the one conducted with the normal sample was undertaken. Again, the analysis focused on those individuals who obtained high plan scores (≥4 out of 6), as these plans were less likely to require an impromptu change during task performance as a result of 'insufficient funds'. Of the 9 patients with high plan scores, 3 followed their plans exactly, 5 'almost' followed their plans (defined as visiting 3 or more places in the correct sequence without obtaining perfect sequence concordance), and one patient performed the task very differently from the plan devised. Within this group of 9 patients with high plan scores, 7 went to the bank prior to making purchases. The one patient whose plan was very different from performance and one patient with perfect plan-performance concordance attempted to make a purchase prior to obtaining money. These results indicate that most, but not all, patients with high plan scores obtained money prior to attempting to make purchases during actual test performance.

Inter-rater Reliability

Errors were scored separately, in the same manner as in study 1, by the same two independent raters (DJ and ZC). In the current study, the ICCs computed for each category of error (number of tasks completed, insufficient funds errors, task repetitions, time deadlines met, and inefficiency errors) were all perfect (1.00), indicating excellent inter-rater reliability.


Correlational Analyses

MCT, demographic variables, and computer skills.

No significant correlations were found between any of the MCT scores and age, premorbid IQ (WTAR), or computer skills. Education (number of years) was significantly and negatively associated with two MCT scores: completion time (r = -.63, p = .02) and total errors (r = -.63, p = .02). Gender effects were not analyzed as there were only two females in the sample.

MCT variables.

The analysis of associations among the various MCT measures revealed few relationships (see Table 3.5). The only statistically significant correlation was between completion time and total errors (r = .69, p < .01), suggesting (not unexpectedly) that the more errors made, the longer it takes to complete the task. The relationship between completion time and total errors was also significant, and in the expected direction, in the normal sample in study 1. The other significant association from Chapter 2, between plan score and total errors, was not replicated in the current study.

MCT and executive function tests.

The correlation matrix for the executive functioning tests is presented in Table 3.6. As can be seen in the table, three of the five executive function tests were correlated with at least one MCT variable. Neither fluency task (COWAT or Animals) was significantly associated with any of the MCT scores. TMT-B was positively related to MCT completion time (r = .64, p = .02); in the previous normative study, TMT-B was not associated with any MCT scores. The WCST was also positively associated with MCT completion time (r = .84, p < .01) along with total errors (r = .60, p = .03). The MSE was positively related to the MCT plan score (r = .80, p < .01), an association that was also found in the normative sample, although the relationship was stronger in the current patient sample (r = .80) than in the normal sample (r = .40).

MCT and non-executive tests.

The correlational analysis between the MCT scores and the non-executive test measures produced additional interesting results (see Table 3.7). MCT number of tasks completed did not correlate with any of the cognitive measures. Particularly noteworthy in this analysis were the relationships found with the MCT plan score variable. That is, plan score was positively associated with tests of different cognitive functions, involving measures of memory (ROCF Delayed Recall: r = .66, p = .02; CVLT-II Learning: r = .55, p = .05, and Recognition: r = .57, p = .04), visuoconstruction (WAIS-III Block Design: r = .76, p < .01), and visuospatial ability (JLO: r = .57, p = .04). MCT total errors was negatively related to the visuospatial task (JLO: r = -.56, p = .05). MCT completion time was positively related to TMT-A (r = .59, p = .04) and negatively related to JLO (r = -.59, p = .03). These associations were also in the expected direction. Finally, plan time was negatively related to a visuospatial memory task (ROCF Recognition: r = -.59, p = .03), indicating that longer planning times were associated with poorer performance on this recognition memory task. One possible explanation for this latter finding is that both measures involve visuospatial functions, and thus longer planning times on the MCT may reflect difficulty planning a route around the city using the provided map.


MCT and FrSBe.

Spearman correlations were computed between the FrSBe Family Ratings (Total Score and ED subscale) and the MCT and other executive function tests (see Table 3.8). The only executive function test scores related to the FrSBe measures were MCT plan score (FrSBe Total: r = -.66, p = .02; FrSBe ED: r = -.59, p = .04) and Animals (FrSBe Total: r = -.60, p = .03; FrSBe ED: r = -.68, p = .01). Interestingly, none of the other executive tests (WCST, COWAT, TMT-B, MSE) were related to the FrSBe.
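The Spearman analysis above can be sketched in code. This is an illustrative computation only: the scores below are invented, not the study's data, and the rank/correlation routine is a generic stdlib implementation rather than the software actually used for the thesis.

```python
# Spearman's rho = Pearson correlation computed on rank-transformed data.
# Illustrative sketch with invented MCT plan scores and FrSBe Total T-scores.

def rank(values):
    """Average ranks (1-based), handling ties."""
    indexed = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(indexed):
        j = i
        while j + 1 < len(indexed) and values[indexed[j + 1]] == values[indexed[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[indexed[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented data: higher FrSBe T-scores (more reported dysfunction) paired
# with lower MCT plan scores, mimicking the negative association reported.
plan  = [6, 5, 6, 4, 3, 5, 2, 6, 4, 3, 5, 2, 6]
frsbe = [52, 60, 55, 68, 75, 58, 80, 50, 65, 72, 61, 78, 54]
print(round(spearman_rho(plan, frsbe), 2))  # -0.98 for this invented data
```

The invented data produce a much stronger correlation than the thesis reports (r = -.66); the point is only the mechanics of the rank transformation.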

MCT and insight.

Insight scores derived from the FrSBe Total score indicated there were 4 over-estimators, 2 under-estimators, and 7 accurate estimators. FrSBe ED insight scores revealed 2 over-estimators, 2 under-estimators, and 9 accurate estimators. Overall, this reflects that the majority of patients had good insight into their deficits, as 11 of 13 patients either had accurate insight or actually overestimated their deficits. In the current analysis, neither insight index (Total or ED) was associated with any of the MCT scores or any of the other executive function test measures in the battery (p > .05 in all cases).

Order Effects Analysis

The order effects analysis (Mann-Whitney U test) indicated that there were no significant differences in test performance between the MCT-first and MCT-last groups across all measures administered ( p > .05 in all cases).
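A minimal Mann-Whitney U computation, of the kind used for this order-effects check, can be sketched as follows. The group scores are invented for illustration, and this stdlib routine stands in for whatever statistical software was actually used.

```python
# Mann-Whitney U for two independent samples, e.g. a test score for the
# MCT-first vs. MCT-last subgroups (all data invented for illustration).

def mann_whitney_u(a, b):
    """U = number of (a, b) pairs in which the `a` observation is smaller;
    ties count 0.5. Returns the smaller of the two complementary U values,
    which is the one compared against critical-value tables."""
    u = 0.0
    for x in a:
        for y in b:
            if x < y:
                u += 1.0
            elif x == y:
                u += 0.5
    return min(u, len(a) * len(b) - u)

mct_first = [12, 9, 14, 10, 11, 8, 13]   # invented scores, group 1
mct_last  = [10, 13, 9, 12, 11, 14]      # invented scores, group 2
print(mann_whitney_u(mct_first, mct_last))  # 18.0
```

With samples this small, the resulting U is compared against exact critical values; a U near half of n1*n2 (here 42/2 = 21) indicates heavily overlapping groups, consistent with the null result reported above.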


Discussion

A principal objective of this investigation was to evaluate the ecological validity of the MCT and other standardized executive function tests. The current study further sought to explore MCT performance patterns amongst a group of patients with ABI with the goal of evaluating executive (dys)function in terms of both numerical scores and qualitative descriptions of performance. This study compared the performance of the ABI group with the normal sample from Chapter 2 in an attempt to explore any differences (quantitative and/or qualitative) suggestive of a neuropsychological signature for individuals with executive dysfunction.

MCT Performance

Overall, performance on the MCT in the ABI group was more variable than in the normal sample in regard to the number of tasks completed. Other authors have also noted greater variability amongst patients on cognitive test performance when compared to healthy individuals (Knight, Alderman, & Burgess, 2002; Zakzanis, 2001). This variability in test performance is most likely explained by the varying degree of brain injury severity in the sample along with differences in both lesion location and the nature of the behavioural/cognitive impairments present.

The duration of time spent devising plans on the MCT was substantially longer for the patient sample than for the normal sample. This finding contrasts with Knight and colleagues' (2002) study using the MET-Hospital Version, in which controls spent more time planning before beginning the task than patients did (although this difference was not statistically significant in their study). Interestingly, the current patient sample produced better plans overall on the MCT than the normal participants. Plans were deemed better primarily due to superior sequencing of subtasks. This finding was unexpected and contrasts with previous findings using a university-based version of the MET, in which patients produced poorer plans than controls (McGeorge et al., 2001). Taken together, the present findings indicate that patients took extra time planning the task but also devised better plans when compared to the normal sample. Moreover, these better plans may have helped reduce the likelihood of making a specific type of error (insufficient funds errors), as patients were less likely than the normal participants to attempt purchases before withdrawing money.

Although patients had better plans and made fewer insufficient funds errors than the normal group, they also tended to complete fewer tasks. Previous studies of the MET have also consistently reported that patient samples complete fewer tasks than controls (Alderman, Burgess, Knight, & Henman, 2003; McGeorge et al., 2001; Shallice & Burgess, 1991).

Alderman and co-workers (2003) found that a pattern of errors involving incomplete tasks was particularly evident within a subgroup of their patient sample, whom they labelled Task Failers. They provisionally proposed that the root cause of this pattern was a failure of 'initiation', since at the most basic level the failures most often meant that the tasks were simply never begun. Similarly, the data suggest that patients in the current study were generally good at devising plans yet did not follow through with those plans completely. Indeed, the analysis of plan-performance concordance revealed that few patients followed their plans exactly, although most followed their plans to some extent. This discrepancy between plans and performance is perhaps reflective of a "dissociation between knowing and doing" (Teuber, 1964). That is, patients appeared to have good comprehension of the task requirements ("knowing") as evidenced by their plans, yet actually did something else during the task (i.e., they did not completely follow the plans they devised). Closer inspection of task performance revealed that this most often involved not initiating some of the tasks indicated within the plans; in other words, not "doing" tasks. Thus, impaired initiation may be responsible for the "knowing and doing" dissociation found when comparing plans with actual performance in the present sample.

Indeed, several patients (or their significant others) reported reduced attempts to accomplish certain everyday tasks despite normal motivation levels. Moreover, previous studies have demonstrated that difficulties with initiation are among the most frequently reported symptoms of executive dysfunction (e.g., Burgess & Shallice, 1996).

Consistent with the findings of Dawson and others (2009), the current results indicate that there is no specific category of error on the MCT that is pathological, as patients did not make errors that were qualitatively different from those made by healthy participants. For instance, it was anticipated that normals would not exhibit 'wandering behaviour' or enter buildings in which there was no task to perform. A few individuals from the normal sample, however, did indeed exhibit these inefficient patterns of behaviour on the MCT. Nonetheless, these inefficiencies did occur more frequently in the patient sample. Thus, similar to the results of Dawson and colleagues, it was the number of errors rather than the category of error that best differentiated those with neuropathology from healthy controls. It must be pointed out that the errors noted to be unique to patients in previous studies involved social rule breaks, such as approaching shoppers as if they were shop-keepers or swearing loudly (see Rand, Rukan, Weiss, & Katz, 2009; Alderman, Burgess, Knight, & Henman, 2003; Shallice & Burgess, 1991, for other social rule breaks made by patients). Such errors, of course, could not occur in the MCT given that it does not allow for the possibility of social interaction, as there are no 'people' to encounter in the VE. The focus of the present study, however, was to explore patterns of executive dysfunction rather than social impropriety. Thus, in terms of executive functioning, it can be tentatively concluded from the current results that some executive errors exist on a continuum whereby such errors are made by neurologically intact individuals but patients with executive dysfunction make a greater number of them.

Ecological Validity

The ecological validity of the MCT was demonstrated by the statistically significant correlations between the plan score and the FrSBe (Total and ED scale). The Animals test was also significantly related to the FrSBe. None of the other executive function measures, however, were associated with this functional outcome measure. Although the Animals test was related to the FrSBe, it was also one of the measures not related to the MCT, suggesting they are measuring different functions. Moreover, it is unclear which abilities in real life could be predicted from performance on the Animals test. Indeed, Burgess and colleagues (1998), in discussing the (lack of) ecological validity of traditional executive function tests, argue that "the circumstance of a person performing the WCST [or Animals] test under strict examination conditions might be so different from most situations in the real world that there is little correspondence between cognitive resources tapped in the examination condition, and those tapped in real world ones" (p. 547, Burgess, Alderman, Evans, Emslie, & Wilson, 1998). In keeping with this, it is difficult to make predictions about precisely which everyday tasks may be compromised on the basis of an identified impairment in the ability to name animals. In contrast, there appears to be a more direct correspondence between the cognitive processes measured on the MCT and those used in real life, which allows for better prediction and evaluation of specific everyday executive abilities. For instance, MCT plan scores likely reflect the ability to efficiently sequence, prioritize, and organize the steps and elements needed to carry out an intention or achieve a goal (Lezak, Howieson, & Loring, 2004) and translate readily to various activities one may perform in everyday life.


In keeping with the above, the current findings suggest the MCT is superior to the other standardized executive measures used in this battery for the prediction of real world executive functioning. That is, all but one of the popular executive function measures utilized in this battery were unrelated to a measure of everyday executive function ratings. It is particularly noteworthy that even the one 'ecologically valid' measure used in this battery, the MSE subtest from the BADS test battery, was not related to everyday executive functioning here nor in a previous study of the MET (Knight, Alderman, & Burgess, 2002). In light of these results, it is strongly suggested that the clinical utility of immensely popular executive function measures (i.e., WCST, TMT-B, COWAT) be questioned given the lack of evidence of ecological validity. In particular, these findings caution against the use of such measures to make predictions with potentially life-altering consequences, such as estimations of an individual's vocational aptitude or ability to manage finances. At the very least, these results indicate that these measures should not be used as the sole basis for making such predictions about real world behaviours. These findings further suggest that tests of verisimilitude and informant ratings of executive skills may contribute valuable and unique information about everyday functioning not captured by standardized executive function tests.

The MCT and its Relationship with Standardized Executive Function Measures

In contrast to the findings from the normal sample, the MCT was correlated with several of the other executive function tests in the patient sample. Only the two verbal fluency measures were not associated with any of the MCT scores. Knight and others (2002) also found that verbal fluency was not associated with the MET. In their study, they used a modified version of the WCST (i.e., the Modified Card Sorting Test) and found that the percentage perseverative errors measure was the only test variable that correlated with the MET. This same variable was positively correlated with MCT completion time in the current study. A possible interpretation of this result is that individuals who make more perseverative errors are slower to grasp the concept of problem-solving tasks and thus require more time to complete them. TMT-B was also positively associated with MCT completion time, most likely due to a common information processing speed component on both tests.

In the normal sample, the MCT plan score was moderately correlated with the MSE (r = .40) and this relationship was again found in the patient sample, although this time the two measures were more highly correlated (r = .80). As noted in Chapter 2, both of these tasks involve a number of similar planning and multitasking components that are likely responsible for the significant association. The sensitivity of the MSE to planning impairments, however, has been questioned. For instance, McGeorge and colleagues (2001) showed that their patient sample did not perform significantly differently from the normal sample used in the standardization of the BADS when given the entire battery, even though all of their patients had planning deficits as reported by care staff. They suggested that this may be because the battery is not sensitive to planning deficits in a subset of patients. Indeed, there was little variability in MSE performance in the current sample, with only one patient performing in the impaired range and all others obtaining perfect or near-perfect scores. Even on the MCT plan score measure, on which patients were deemed to have performed quite well, scores were more variable, with 5 of the 13 patients receiving scores of 4 or less out of a possible 6 points. The variability in plan scores on the MCT across the normal and patient samples suggests it may be a sensitive measure of planning ability in high-functioning individuals with or without neurological injury. On the other hand, the current data indicate that the MSE test lacks sensitivity to the executive deficits of high-functioning individuals, as has been reported elsewhere using the entire BADS battery (Sohlberg & Mateer, 2001).

At first glance, the finding of significant correlations between the MCT and most of the executive function measures used in this study appears inconsistent with the theory presented in study 1 that existing standardized executive tests and the MCT measure different aspects of executive functioning. That is, it was postulated that these other tests measure the building blocks of executive functioning while the MCT evaluates higher-order functions such as the initiation and integration of these building blocks. It is proposed, however, that the present results are still consistent with this theory. That is, the three executive measures that were found to be related to the MCT were associated with only one, or at most two (in the case of the WCST), MCT variables. Moreover, certain MCT variables (i.e., plan time and tasks completed) were not associated with any of the executive function measures. These results indicate that the significant associations between these tests and specific MCT variables reflect that they tap the same component executive processes. The lack of relationship between these executive tests and the other MCT scores, however, appears compatible with the interpretation from study 1 in that overall performance on the MCT requires higher-order control mechanisms and the ability to both initiate and integrate these so-called building blocks of executive function.

The MCT and its Relationship with Other Non-Executive Measures

Similar to the findings from the normal sample, the MCT measures were related to a variety of non-executive tests from the battery. These findings provide further evidence that various cognitive processes must be integrated for successful MCT performance. There were some differences in the associations found here when compared to those from Chapter 2. In the patient sample, total errors was negatively related to visuospatial ability (JLO), suggesting that poor visuospatial skills were associated with making more errors on the MCT. In the normal sample, the total errors variable was not related to JLO or any of the other neuropsychological test measures administered. Instead, JLO performance was positively related to the number of tasks completed on the MCT in the normative group. In the patient sample, the correlation between JLO and tasks completed was still moderate (r = .38, p = .20) despite not attaining statistical significance.7 Collectively, however, the findings from the normative and patient studies suggest a role for visuospatial processes on the MCT.
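The footnoted point about sample size can be checked with the standard t approximation for testing a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom: r = .38 with n = 13 gives t of about 1.36 on 11 df, which does correspond to a two-tailed p of roughly .20, whereas the same r with a sample of around 30 would approach the .05 threshold. A minimal stdlib sketch:

```python
import math

# t statistic for testing H0: rho = 0, given sample correlation r and size n.
def t_for_r(r, n):
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

print(round(t_for_r(0.38, 13), 2))  # 1.36 on 11 df: two-tailed p ~ .20
print(round(t_for_r(0.38, 30), 2))  # 2.17 on 28 df, near the .05 cutoff (~2.05)
```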

MCT plan score was related to a number of cognitive measures in the patient sample but was not associated with any of the non-executive tests in the normal sample. In the current study, plan score was found to be positively related to a number of memory measures (ROCF Delayed Recall and CVLT-II Learning and Recognition) along with visuoconstruction and visuospatial ability (WAIS-III Block Design and JLO). These relationships indicate that patients may need to incorporate various cognitive processes in the development of their plans, while normals do not utilize these 'lower order' abilities to the same extent, perhaps instead relying more on higher-order executive processes during plan construction. The incorporation of basic cognitive processes, however, appears to have been a successful strategy given that the patients tended to produce better plans. These conclusions are stated cautiously, however, as the battery of tests administered to the patient sample was quite different from that of the normal sample, which makes direct comparison of test performance across groups difficult.

7 The magnitude of the correlation suggests a considerable relationship between these variables that likely would have attained statistical significance with a larger sample size.


The MCT and Associations with Demographic Variables and Computer Skills

Age was not a factor in neuropsychological test performance within the patient group, despite research suggesting executive functioning declines with age (Levine et al., 1998). Higher education levels were associated with shorter MCT completion times and fewer errors. These findings of better MCT performance associated with more years of schooling may reflect a protective effect of education in those with brain injury (Satz, 1993).

Computer skills and experience were also not found to be related to MCT performance. A likely explanation for this finding is that the VE was designed to be very simple to navigate, even for those with limited backgrounds in the use of computers. Indeed, the 3 minutes allotted for the practice trial provided all the training necessary, such that even the least computer-savvy patient in the sample could perform the task with minimal, if any, technical glitches.

Limitations

Limitations of the current study included the small sample size and the consequently limited inferential statistics that could be computed. The intention was to include a sample of 30 patients; however, the limited availability of patients meeting inclusion criteria and expressing interest in participating during the recruitment period meant the investigation had to be completed before the proposed sample size was obtained.

The types of statistical tests that could be conducted were limited by the small sample size and the consequent failure to meet assumptions associated with certain analyses (i.e., variables were not normally distributed and/or lacked homogeneity of variance). For instance, if not for the limited sample size, regression analysis would have allowed for the prediction of the amount of variance in everyday executive functioning that could be accounted for by performance on the MCT and other executive tests. Conducting regression analyses would have strengthened the findings of ecological validity in the current report. Statistical analyses were further limited by the fact that the study did not have a matched control group. Instead, only qualitative comparisons with the normal sample could be made, and inferential statistics had to be avoided because it would be impossible to determine whether any differences were actually due to the experimental variables or to demographic variables such as age, gender, IQ, or education. It is recognized that a matched control group would have allowed for statistical comparison between patient and normal test performance. Unfortunately, a lack of access and ethics approval prevented the attainment of a sufficiently matched group (i.e., in terms of gender, age, and education) to this end. The main objective of this study, however, was not to compare test performance of normal and patient groups, but rather to explore performance patterns of patients and to evaluate the ecological validity of the MCT and other standardized executive measures.

Conclusion

In conclusion, the findings reported here suggest that assessment measures developed using VR technology and designed with ecological validity in mind are valuable tools for neuropsychologists attempting to predict real world functioning. The MCT may offer a superior method of evaluating the degree and nature of real life executive function impairments when compared to traditional executive function measures. The MCT appears to be more sensitive to the executive impairments of high-functioning individuals with ABI than an existing standardized measure of executive functioning designed to be ecologically valid (the MSE). The present study suggests there may be a dissociation between "knowing and doing" in some patients, as good plans did not necessarily translate into successful task performance. This dissociation may be related to initiation problems. Given the small sample size, the present results must be considered preliminary, and future research using larger samples is necessary to further evaluate the reliability of these findings. Nonetheless, the performance patterns seen here could provide valuable information to aid diagnostic decision making and inform rehabilitation goals.

Chapter 4

Evaluating the Use of the MCT and Verbal Fluency Tasks as Outcome Measures for Goal Management Training Effects

Abstract

The evaluation of rehabilitative interventions targeting executive dysfunction must involve the use of appropriate outcome measures. The objective of the present study was to assess whether the MCT and verbal fluency tasks are appropriate outcome measures for evaluating the effectiveness and generalization of an executive function treatment program. A further objective was to explore the ecological validity of both types of measures and to determine which is most predictive of real world behaviour. Four patients with executive dysfunction underwent a 7-week Goal Management Training (GMT) program. Patients' pre- and post-intervention performance was compared on a battery of standardized neuropsychological tests including verbal fluency measures, a behavioural rating scale of everyday executive deficits, and the MCT. This study found that strategies learned in GMT appear to generalize to activities not specifically trained during the intervention, as demonstrated by improvements in MCT performance. Trained strategies did not generalize to verbal fluency tasks. In regard to treatment efficacy, the largest effects were found on the MCT variables, all of which showed medium or large effect sizes from pre- to post-intervention, while only negligible effects were found on the fluency tasks. These results suggest the MCT is a better outcome measure than verbal fluency tasks for evaluating GMT effects. Further evidence of the ecological validity of the MCT, and to a lesser extent semantic fluency tasks, was found in the present study. Study limitations and clinical implications are discussed.


Introduction

Despite the prevalence of executive dysfunction across multiple neurological and psychiatric conditions, there have been few validated rehabilitative interventions targeting these impairments. One intervention holding promise for patients with executive dysfunction is Robertson's Goal Management Training (GMT; Robertson, 1996). GMT is based on Duncan's (1986) theory of goal neglect (or failure to execute intentions), in which disorganized behaviour is attributed to impaired construction and use of "goal lists", considered to direct behaviour by controlling actions that promote or oppose task completion. GMT attempts to ameliorate goal neglect through verbally mediated, metacognitive strategies that systematically target planning abilities by teaching individuals to structure their intentions. Through presentations, discussions, exercises, and homework assignments, GMT trains participants to use strategies such as stopping and orienting to relevant information, partitioning goals into more easily managed subgoals, encoding and retaining goals, and monitoring performance.

Investigations into the efficacy of GMT have been promising in both normal older adults and in patients with ABI. To investigate the efficacy of GMT for older adults with cognitive complaints, van Hooren and colleagues (2007) randomly assigned 69 normal, community-dwelling adults aged 55 years or older to a 6-week GMT program or to a waitlist control group. After the intervention, participants from the GMT group reported significantly fewer anxiety symptoms, were significantly less annoyed by their cognitive failures, and reported improved ability to manage their executive failures as compared to control participants. Though this study reported positive results for subjective outcome measures, the intervention showed no effect on the Stroop Colour Word Test (Houx, Jolles, & Vreeling, 1993; Stroop, 1935). In a similar study with healthy adults, Levine and colleagues (2007) randomly assigned 49 elderly individuals to a 4-week GMT intervention or to a waitlist control group who received GMT after the first GMT group had completed the intervention. Participants in both groups demonstrated improved performance on tasks purported to assess executive functioning, as well as improved self-ratings of executive deficits. Both of these improvements coincided with when GMT was received by each group, and gains were maintained at follow-up six months after completing the intervention. One significant limitation of these results, however, is that the GMT intervention was only one of three interventions provided to participants as part of a larger study (Winocur et al., 2007). As such, outcome assessments were conducted only after participants completed all three interventions. Therefore, it is not possible to draw conclusions about the specificity of GMT effects on post-intervention results, as improvements could have been affected by any of the three interventions received (Levine et al., 2007).

Investigations into the effects of GMT with brain-damaged individuals include two case studies and one randomized controlled trial. Levine and colleagues (2000) reported on 30 TBI patients who were randomly assigned to receive a brief trial (a 1-hour session) of GMT or motor skills training. Upon completion of the intervention, the GMT group, but not the motor skills group, showed significant improvement on paper-and-pencil tasks designed to mimic everyday executive tasks that are problematic for patients with goal neglect. In the same report, Levine and colleagues described a case study of a post-encephalitic patient seeking to improve her meal-preparation abilities. For this patient, GMT was expanded over two sessions and supplemented with exercises requiring the implementation of GMT stages while following recipes, some of which were performed as homework assignments. Both naturalistic observation and self-report measures demonstrated improved meal preparation performance following the intervention.


More recently, the effectiveness of the full GMT protocol was investigated in a case report of a patient with cerebellar damage (Schweizer et al., 2008). After suffering a right-sided cerebellar arteriovenous malformation rupture, the patient reported difficulties with organization and slowed information processing and exhibited compromised executive functioning on a battery of neuropsychological tests. Despite only modest improvements in the patient's test performance immediately post-GMT and at long-term follow-up, the patient's spouse endorsed a complete resolution of dysexecutive behaviours on the Significant Other version of the DEX (DEX-I; Burgess, Alderman, Wilson, Evans, & Emslie, 1996) during an evaluation 4 months after completion of the intervention. Moreover, the patient was able to return to full-time work in an intellectually challenging occupation. Taken together, the results of investigations into the effects of GMT in brain-injured individuals appear very promising.

In addition to evaluating the efficacy of cognitive interventions, another central concern in neuropsychological rehabilitation is demonstrating generalization. To determine the true effectiveness and clinical value of an intervention, it is imperative to demonstrate that abilities or strategies learned during treatment will be utilized in the real world such that they lead to sustained improvement in everyday functioning. Two of the studies described above included outcome measures designed to be ecologically valid in an attempt to demonstrate the potential for generalization of GMT to everyday functioning. In Levine and colleagues' (2000) study of GMT with TBI participants, everyday paper-and-pencil tests were designed for the study and used as outcome measures for assessing participants' ability to hold goals in mind, analyze subgoals, and monitor their performance. The three tasks included were proofreading (i.e., underlining, circling, and crossing out words that met certain criteria in a paragraph), grouping (i.e., listing individuals by age and sex), and room layout (i.e., answering 5 questions about a seating chart). Two sets of each of these three tasks were constructed (for use during pre-training and post-training) with identical instructions but different stimuli to minimize practice effects. Similar tasks were included in Levine and others' (2007) study of GMT with elderly individuals. These tasks were named simulated real-life tasks (SRLTs). According to the authors, the tasks mimicked everyday activities that present problems for individuals with brain dysfunction due to demands on working memory, attention, and strategic processes. Each of three SRLTs (two carpool tasks and one swimming lessons task) had a main goal requiring successful arrangement of different dimensions or subgoals. Each task had a parallel structure involving sorting people or objects into groups based on various constraints. Both studies (Levine et al., 2000, 2007) provided evidence that the effects of GMT are generalizable by way of improvements on the everyday tasks devised. Previous studies have also noted generalizable effects of GMT on different everyday tasks via self-report evidence from participants (Levine et al., 2007; McPherson, Kayes, & Weatherall, 2009; Schweizer et al., 2008).

Given the evidence of the ecological validity of the MCT demonstrated in study 2, the main objective of the present study was to explore whether it is an appropriate measure for evaluating the effectiveness and generalization of GMT in patients experiencing everyday executive dysfunction. Moreover, this study also sought to compare the MCT with other methods of evaluating the effects of GMT. Thus, patients' pre- and post-GMT performance on three different types of measures was compared. The first type of measure consisted of a brief battery of standardized neuropsychological tests including both executive (i.e., verbal fluency) and non-executive measures. The second type of measure was a behavioural rating scale assessing behavioural manifestations of executive impairment in everyday functioning (i.e., the DEX). Data from this scale were included to provide subjective information about how the effects of GMT may generalize to everyday behaviours. The third measure utilized was the MCT. Given the focus of GMT on training everyday executive functions, it was hypothesized that the most substantial gains in test performance would be observed on the MCT when compared to the other neuropsychological tests in the battery. It was also of interest to explore whether performance on the standardized executive function tests would show more substantial gains post-intervention than the non-executive tests. These findings would address the clinical utility of traditional executive tests in evaluating the effects of executive function rehabilitation. Finally, the ecological validity of the two types of executive measures in the battery was evaluated. Given that the MCT and a verbal (semantic) fluency measure were already found to be predictive of everyday behaviour in study 2, a different informant questionnaire (i.e., the DEX) was used in the current study in an attempt to obtain confirmatory evidence of ecological validity.

In short, the present study was undertaken to address the following questions:

1. Which executive function task (MCT or verbal fluency) is most appropriate as an outcome measure for the evaluation of the effectiveness and generalization of GMT trained strategies?

2. Do these two executive measures have ecological validity when a different behavioural rating scale (DEX) from the one used in study 2 is administered? Which of these measures is most predictive of real world behaviour?


Method

Participants and Design

Patients were recruited for the study after being referred for GMT from either the Brain Health Clinic at Baycrest Centre or the Neuropsychology Consultation Service at Sunnybrook Health Sciences Centre. Both institutions are teaching hospitals fully affiliated with the University of Toronto. For inclusion in the GMT program and this research study, patients were required to have subjective complaints of executive impairment in their everyday activities as a result of acquired damage to the brain or psychiatric disorder. This inclusion criterion was based on the suggestion by Levine and colleagues (2000) that individuals selected for real-life disorganized behaviour, as opposed to simply those with brain damage, would be most likely to benefit from GMT. Exclusion criteria consisted of uncorrected hearing loss, uncorrected visual impairment, current or past history of substance abuse, or poor English language comprehension and communication. The study involved a pre-post design in which patients underwent baseline testing one to two weeks before commencement of the GMT program and underwent post-testing within one week of completion of the program. All testing was conducted by an undergraduate or graduate psychology student under the supervision of a clinical neuropsychologist. This study was approved by both the Baycrest and the University of Toronto at Scarborough ethics review boards and all patients provided informed consent.

Goal Management Training

Patients met as a group with three trainers (a registered clinical psychologist, a post-doctoral psychology graduate student, and an undergraduate psychology student) for 2 hours per week for 7 consecutive weeks. The GMT protocol involves an interactive format including presentations, discussions, exercises, and homework assignments. The main goal of training was to assist patients in developing awareness of attentional control lapses in everyday life and strategies to overcome them. In the first few sessions, patients and trainers discuss the idea of "absentminded slips" and the internal and external contextual factors that may contribute to their occurrence. The main training focus is the acquisition of a simple self-command (e.g., "stop") to interrupt ongoing automatic behaviours that occur during these attentional lapses in order to resume executive control. Once the opportunity for more controlled processing has been established, patients are encouraged to remain present-minded, refocus on the goal of the task at hand, divide the task into manageable steps, and continuously check their progress toward the goal. The following is a brief description of the topics covered in each session: Session 1 - Definition of absentminded slips and discussion of contexts that make them more likely and methods of preventing them; Session 2 - Discussion of what it means to be on automatic pilot and methods to stop this from happening, and introduction of personalized STOP phrases; Session 3 - Introduction to the concept of short-term memory as a mental blackboard and demonstration of a present-mindedness breathing technique; Session 4 - Training in defining the main goal of a task, discussion of methods for remembering goals, and descriptions of different memory systems; Session 5 - Description of how to partition main goals into sub-goals and application of STOP phrases to goal execution; Session 6 - Discussion of problem-solving techniques and description of techniques for monitoring performance and goal execution; and Session 7 - Discussion and review. Throughout the intervention, discussion of patients' real-life executive problems was encouraged and application of the GMT strategies to these difficulties was reviewed. For instance, one patient required assistance with disorganization and poor time management resulting in reduced productivity in accomplishing job-related tasks. Another patient felt overwhelmed with managing finances and asked for help in this area to avoid making careless errors.

Outcome Measures

Neuropsychological Test Battery

A brief battery of standardized neuropsychological tests, the MCT, and the DEX were administered pre- and post-intervention in order to provide a means of objectively evaluating any cognitive changes following GMT. In addition to executive function tests, non-executive cognitive measures of verbal memory, visual memory, and confrontation naming were also included. A brief measure of IQ was administered at baseline only. At post-intervention testing, alternate versions of tests or the split-half method (i.e., odd items at pre-testing, even items at post-testing) were employed for all tests, except the MCT, in order to minimize practice effects.

In regard to the MCT, there is no alternate form and the test is not amenable to administration using the split-half method. Thus, the same form of the MCT was administered at baseline and post-intervention. The battery included the CVLT-II (described in Chapter 3; Standard Form administered pre-intervention; Alternate Form at post-intervention) along with the following tests.

Brief Visuospatial Memory Test-Revised (BVMT-R; Benedict, 1997).

The BVMT-R is a test of visuospatial learning and memory involving three 10-second presentations of a matrix consisting of six simple figures. After each presentation, examinees are required to draw the designs accurately and in their correct location on the page. Twenty-five minutes after the third immediate recall trial there is a delayed recall trial followed by a forced-choice recognition component. Form A was administered pre-GMT and Form B was administered post-GMT. The scores reported will be Total Recall and Delayed Recall, measures of short-term and long-term visual memory, respectively.

Boston Naming Test (BNT; Kaplan, Goodglass, & Weintraub, 1983).

The BNT is a confrontation naming test in which examinees are asked to name items depicted in line drawings. The split-half method of administration was used for this test. The Total Correct score (out of 30) will be reported in the present study.

Delis-Kaplan Executive Function System (D-KEFS) Verbal Fluency Test (Spreen & Benton, 1977; Delis, Kaplan, & Kramer, 2001).

The D-KEFS Verbal Fluency Test measures fluency in the verbal domain and is considered an executive functioning task. Examinees must rapidly generate words according to letter and category, in addition to alternating between two different semantic categories, during 60-second trials. The Letter Fluency component administered during pre-intervention is exactly the same as the COWAT test in that examinees generate words beginning with F, A, and S. The Category Fluency test administered at pre-intervention involved animal naming (like the Animals test used in Chapter 3) and a second trial during which examinees must think of as many Boys' Names as possible. The Category Switching condition requires examinees to alternate between producing names of fruits and types of furniture. At post-intervention testing, different versions of these tests were utilized to minimize practice effects and involved the following: Letter: B, H, and R; Category: Clothing and Girls' Names; Switching: Vegetables and Musical Instruments. The scores reported will be the total correct responses (words) for each of Letter Fluency, Category Fluency, and Category Switching Fluency.


National Adult Reading Test (NART; Nelson, 1991).

The NART is a brief measure of IQ requiring examinees to pronounce a list of words increasing in difficulty. The estimated IQ will be reported.

Dysexecutive Questionnaire (DEX; Wilson, Alderman, Burgess, Emslie, & Evans, 1996).

The DEX is a behavioural rating scale designed to assess difficulties commonly associated with executive dysfunction in emotion or personality, motivation, behaviour, and cognition. The DEX has two forms, one to be completed by the patient (DEX-S) and one to be completed by someone who interacts with the patient on a regular basis (i.e., an informant), such as a spouse (DEX-I). The items in the DEX-S and DEX-I are direct parallels. Additionally, certain items were selected from the DEX that were felt to be most reflective of the types of functions evaluated on executive function measures, and the total scores from only these items were calculated to derive an "Executive Function" (EF) scale. Items included on the EF scale evaluated abstract thinking, planning, temporal sequencing, lack of insight, perseveration, inhibition, knowledge-response dissociation, distractibility, and decision making (see Appendix D for specific items included on the EF scale). An index of "insight" was also computed by totaling the difference between patient (self) and informant ratings (informant scores were subtracted from patient scores; S-I) on each of the 20 questions to create a composite S-I discrepancy score similar to the insight index calculated using the FrSBe in Chapter 3. Positive S-I scores reflect overestimation of deficits by patients while negative scores are reflective of underestimation of dysexecutive problems.
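The S-I discrepancy computation described above can be sketched as follows. The item scores used here are illustrative only (the specific EF-scale items appear in Appendix D):

```python
def insight_index(self_ratings, informant_ratings):
    """Composite S-I discrepancy across the 20 DEX items.

    Informant scores are subtracted from self scores item-by-item and
    summed. Positive totals suggest the patient over-reports deficits
    relative to the informant; negative totals suggest under-reporting
    (i.e., reduced insight).
    """
    assert len(self_ratings) == len(informant_ratings) == 20
    return sum(s - i for s, i in zip(self_ratings, informant_ratings))

# Hypothetical ratings: the patient endorses mild difficulty on every
# item while the informant rates each item one point more severe.
self_ratings = [1] * 20
informant_ratings = [2] * 20
print(insight_index(self_ratings, informant_ratings))  # -20 (underestimation)
```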


Data Analysis

Descriptive analyses (qualitative and quantitative) were conducted for all MCT measures. Qualitative analysis of MCT performance involved comparisons with the previous patient (Chapter 3) and normal (Chapter 2) samples in addition to within-subjects comparisons (i.e., between pre- and post-intervention). All statistics were performed using SPSS version 17.0. As with the previous studies, statistical analyses were limited to non-parametric statistics due to the small sample size. Corrections to significance levels were again not made (i.e., p values of .05 or less were considered significant) due to the exploratory nature of this research. Only the plan time, completion time, and tasks completed scores from the MCT were entered into the analyses as the other variables (plan score and total errors) lacked sufficient variability pre- and/or post-intervention to perform statistical analyses. To evaluate whether GMT had a significant effect on test scores and questionnaire results, the non-parametric Wilcoxon Signed Ranks Test for paired samples was computed. In addition, the magnitude of the difference between pre- and post-intervention scores was computed via effect sizes (Cohen's raw d in addition to corrected d). Finally, scatter plots of the post-intervention relationship between DEX-I EF scores and both the MCT and verbal fluency tasks were assembled to explore whether informant ratings of everyday executive functioning were related to these executive function test measures.
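The paired analysis just described can be sketched as follows. The scores are hypothetical (not the patients' data), and the "corrected d" here is an assumption, implemented as a Hedges-style small-sample correction factor, since the exact correction used in the analyses is not specified at this point:

```python
import numpy as np
from scipy.stats import wilcoxon

def paired_effect_sizes(pre, post):
    """Raw Cohen's d (difference of means over the pooled SD) and a
    small-sample corrected d (Hedges-style correction, an assumption)."""
    pre, post = np.asarray(pre, float), np.asarray(post, float)
    n1, n2 = len(pre), len(post)
    pooled_sd = np.sqrt(((n1 - 1) * pre.var(ddof=1) +
                         (n2 - 1) * post.var(ddof=1)) / (n1 + n2 - 2))
    d = (post.mean() - pre.mean()) / pooled_sd
    return d, d * (1 - 3 / (4 * (n1 + n2) - 9))

# Hypothetical MCT completion times (minutes) for four patients:
pre_times = [10.5, 11.2, 12.0, 10.1]
post_times = [5.5, 5.9, 5.2, 5.8]
stat, p = wilcoxon(pre_times, post_times)  # non-parametric paired test
d_raw, d_corr = paired_effect_sizes(pre_times, post_times)
print(round(d_raw, 2), round(p, 3))
```

With only four pairs, the signed-ranks test has little power (its smallest attainable two-sided p value is .125 under the exact distribution), which is one reason the effect sizes carry most of the interpretive weight in this study.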

Results

At baseline the sample consisted of five adults, with one (35-year-old male) patient withdrawing from the study after baseline testing and before commencement of treatment because he was "too busy with other commitments". Thus, a total of four patients (three females) were included in the study. The mean age of the sample was approximately 50 years (Mean ± S.D. = 48.5 ± 9.3 years, range: 38-60), with above-average education levels (Mean ± S.D. = 19.5 ± 1.9 years; range: 18-22) and mean IQ (as estimated by the NART) in the average range (Mean ± S.D. = 100.4 ± 2.8; range: 97.39-104.01). The sample was composed of patients with various neurological or psychiatric conditions, including stroke, meningitis and septicemia, MS, and bipolar II disorder comorbid with post-traumatic stress disorder. According to their self-report and clinical records, all patients had been experiencing executive dysfunction problems such as difficulty multitasking, distractibility, and disorganization for at least two years at the time of baseline testing.

Test Performance

Test performance scores are summarized in Table 4.1. One patient performed within normal limits on all neuropsychological tests administered. The second patient performed in the borderline range on a visual memory task (BVMT-R Delayed Recall) pre-intervention but within normal limits across all measures post-intervention. The third patient obtained scores in the impaired range on all three fluency tasks (D-KEFS Letter, Category, and Switching) both pre- and post-intervention. This patient also performed in the borderline or impaired range on measures of learning and memory and confrontation naming both pre-intervention (i.e., BVMT-R Total Recall and Delayed Recall and BNT) and post-intervention (i.e., CVLT-II Learning and BNT). Finally, the fourth patient's test scores were in the impaired range on a verbal memory measure (CVLT-II Recognition) prior to treatment and were borderline or impaired on visual and verbal memory tests and confrontation naming (i.e., BVMT-R Total Recall and Delayed Recall, CVLT-II Long Delay and Recognition, and BNT) after treatment. For these latter two patients, all other scores were within normal limits.


Table 4.1 also shows DEX-S and DEX-I scores. All scores, both pre- and post-intervention, were below the mean scores of self and informant ratings in the validation sample from the BADS manual (Wilson, Alderman, Burgess, Emslie, & Evans, 1996), indicating that the current sample's executive dysfunction symptoms were less severe than those of the sample of 78 mixed neurological patients (mainly patients with closed head injury) included in validation of the DEX. Indeed, in the current study all patients and their informants rated their executive impairments at the 50th percentile or below when compared to the ratings provided in the manual, suggesting relatively mild executive problems both pre- and post-GMT. Mean DEX-S and DEX-I scores increased from pre- to post-GMT, suggesting that patients' dysexecutive symptoms worsened after treatment, contrary to the expectation that executive functioning would improve post-intervention. Upon closer inspection, however, the increase in DEX-I scores was minimal and, as illustrated in Figure 4.1(a), only two patients endorsed more dysexecutive symptoms post-intervention while the other two patients endorsed fewer such symptoms. Moreover, mean DEX-I EF scores decreased post-intervention (see Table 4.1), albeit a similar pattern was found here wherein the first two patients' scores increased while the latter two scores decreased (see Figure 4.1(b)). The increase in DEX-S scores post-intervention may reflect increased insight. Indeed, two of the patients obtained negative scores on the insight index at baseline, reflecting underestimation of deficits by these individuals (the other two scores were positive and suggestive of intact insight; see Figure 4.1(c)). All patients obtained scores indicative of good insight post-GMT, endorsing more severe dysexecutive symptoms than their informants. Moreover, in all but one patient, insight scores improved post-intervention (as indicated by more positive scores when compared to baseline scores).


MCT Plan Analysis: Pre- and Post-Treatment

An unexpected finding observed when comparing pre- and post-intervention plans was the increase in planning time after GMT. In the current sample, pre-intervention planning times were in line with those observed in the normal sample from Chapter 2 (approximately 3 minutes) whereas average planning times post-intervention were almost a whole minute longer (i.e., approximately 4 minutes). Indeed, in all four patients, planning time increased. Qualitatively, the comparison of plans produced pre- and post-intervention initially appeared to be a fruitless task. Other than the stark contrast in one patient's plan score (baseline = 1/6; post-GMT = 6/6), the other three patients obtained perfect scores both pre- and post-intervention. This finding was indicative of ceiling effects and thus little room or need for improvement via GMT training. Closer inspection of the plans constructed, however, revealed some interesting findings. With the exception of one patient who produced essentially the same plan pre- and post-intervention, the plans produced were more detailed post-intervention such that patients constructed sentences detailing what they were going to do (e.g., go to bank – withdraw money) as opposed to simply listing where they were going to go (e.g., bank → post office → doctor, etc.). Another notable difference was the improved spatial efficiency of plans after treatment. Again, with the one exception noted above (i.e., identical pre- and post-intervention plans), post-intervention plans appeared better thought-out from a spatial perspective and involved less back-and-forth travelling when compared to pre-intervention plans. For instance, the pre-intervention plan of one patient involved travelling from the bank backwards to the stationery store, drug store, and grocery store, and then forward again to the doctor's office and post office (both in the vicinity of the bank). Post-intervention, this patient was more strategic in navigating through the city and planned to complete errands at the bank, post office, and doctor's office prior to completing the

grocery store, drug store, and stationery store errands on the way back home. As detailed in Chapter 2, it is important to note here that patients can obtain a perfect plan score with a variety of location sequences, some of which are not the most spatially efficient. This explains why patients' scores were (for the most part) "perfect" from a quantitative perspective both pre- and post-intervention yet a qualitative improvement after GMT was still observed. As expected, plans lacked any evidence of budgeting for the purchase of pens with the exception of one of the pre-intervention plans.

A more detailed analysis of the plan that improved (and substantially so) from pre- to post-intervention is warranted here. It is noteworthy that the patient who devised the plan obtained a neuropsychological profile that was within normal limits (with the exception of one borderline score) across all standardized tests administered. The main reason for the poor plan score pre-intervention was that the bank was indicated as the last location visited in the plan sequence. The plan appears to have been constructed with only spatial efficiency in mind, as the planned route indicates that the first location visited would be the one closest to home (stationery store) followed by the location that was closest to the stationery store (drug store) and so forth around the city. In addition to planning to visit the bank last, the doctor's office was also visited according to the point in the sequence when it would be most practical from a spatial perspective, with no apparent regard for appointment time. These mistakes were not repeated post-intervention yet the plan was still spatially efficient in addition to being judicious. A final note about this plan comparison is the increase in planning time, which was observed to be the largest increase from pre- to post-intervention across the sample (i.e., the post-intervention plan took 66 seconds longer to complete).


MCT Performance: Pre- and Post-Treatment

In contrast with plan times, completion times decreased from over 10 minutes pre-intervention to less than 6 minutes post-intervention (see Table 4.1). Pre-intervention completion times were in line with those of the patient sample from Chapter 3 while post-intervention completion times were in the same range as the normal sample from Chapter 2.

Qualitative comparison of errors is presented in Table 4.2. Most noteworthy here are the ceiling effects obtained post-intervention, where only a single error was made across the entire sample.

This is contrasted with the comparatively high frequency of errors made pre-intervention. The most common errors were those that were categorized as 'task failures', including incomplete tasks (the most frequent error made by the patient sample in Chapter 3) and insufficient funds errors (the most frequent error made by the normal sample in Chapter 2). Only two patients made task repetition errors (one each) and three of the four patients each made one inefficiency error.

None of the patients in the present study entered any building in which there was no task to perform, an error that was made by a few individuals in both the previous patient sample and the normal sample. The only error made post-intervention was that one task was not completed by one patient. In total, 14 errors were made pre-intervention compared to one error made post- intervention.

Concordance between MCT Plans and Performance: Pre- and Post-Intervention

As would be expected given the high plan scores obtained, plan-performance concordance was generally very good. The patient who obtained the poor pre-intervention plan score performed the task in a very different sequence from the one planned, undoubtedly because of the error message that appeared upon attempting to make a purchase prior to going to the bank. Obviously, this necessitated revising the plan on an impromptu basis and is responsible for the poor plan-performance concordance. Pre-intervention, concordance between plan and performance was 'almost' the same in two patients (defined as visiting 3 or more but not all places in the planned sequence) and was perfect in one patient. Post-intervention, all four patients obtained perfect plan-performance concordance.

Inter-rater Reliability

Errors were scored separately by the same two independent raters as in studies 1 and 2 (D.J. and Z.C.) and in the same manner. The ICCs were again computed for each category of error (number of tasks completed, insufficient funds errors, task repetitions, time deadlines met, and inefficiency errors) for both pre- and post-intervention MCT performance. Similar to study 2, all ICCs were perfect (1.00) and indicative of excellent inter-rater reliability.
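As a sketch of how such an agreement coefficient can be computed for two raters, the following implements ICC(2,1) from Shrout and Fleiss (1979); the thesis does not specify which ICC variant was used, so this choice and the error counts shown are assumptions. When the raters agree exactly, the coefficient is 1.00, as reported here:

```python
import numpy as np

def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-rater ICC(2,1).

    `ratings` is an (n subjects x k raters) array; the coefficient is
    built from the ANOVA mean squares for rows (subjects), columns
    (raters), and residual error.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)
    col_means = x.mean(axis=0)
    ss_total = ((x - grand) ** 2).sum()
    ss_rows = k * ((row_means - grand) ** 2).sum()   # between-subjects
    ss_cols = n * ((col_means - grand) ** 2).sum()   # between-raters
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)

# Two raters scoring, say, insufficient-funds errors for four patients
# identically (counts are illustrative):
identical = [[2, 2], [1, 1], [0, 0], [3, 3]]
print(icc_2_1(identical))  # 1.0 -- perfect inter-rater agreement
```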

Treatment Effects

The non-parametric Wilcoxon Signed Ranks Test revealed that there were no significant differences between pre- and post-intervention scores for any of the neuropsychological test measures or DEX scores. Interestingly, the only variables to even approach significance were from the MCT (i.e., plan time and completion time; Z = -1.83, p = .07 for both variables). The effect size analysis revealed that the differences between pre- and post-intervention scores were of variable magnitudes across the different test measures. The interpretation of the magnitude of effect sizes obtained followed Cohen's (1988) conventional criteria for a small, medium, or large effect to provide a frame of reference, although it is recognized that these heuristic criteria are also dependent on context (for example, see Zakzanis, 2001). As shown in Table 4.3, medium to large differences in scores were observed on the MCT while differences on the other neuropsychological test measures were mostly small or negligible. The one exception was a medium to large effect size obtained on the BVMT-R Delayed Recall score. The DEX scores were more variable, with a medium to large effect found for the DEX-S score and a small to medium effect for the DEX-S EF score (both in the direction opposite to expectation, suggestive of a worsening of symptoms) while negligible effects were obtained for the DEX-I and DEX-I EF ratings. The insight index increased post-GMT, indicating better insight by patients, although the magnitude of the effect size was considered small.

Ecological Validity: Relationship between DEX-I EF Ratings and Executive Function Tests

In order to explore the ecological validity of the executive function measures included in the battery (i.e., MCT and D-KEFS Verbal Fluency), scatter plots of their relationship with DEX-I EF scores were assembled (see Figure 4.2). These graphs show a positive linear relationship between MCT completion time and DEX-I EF score (a perfect rank-order correlation using Spearman's rho). D-KEFS Category and Switching Fluency scores were negatively related to DEX-I EF ratings (i.e., in the expected direction), with the scatter plots and trend lines demonstrating a good linear relationship (Spearman's r = -.80 for both). The scatter plots for MCT plan time and D-KEFS Letter Fluency demonstrate a poor linear relationship with DEX-I EF, as patients' scores are relatively distant from the trend lines.

Discussion

This study sought to determine whether the MCT and verbal fluency tasks were appropriate outcome measures for the evaluation of a treatment program (GMT) in a small sample of patients with everyday executive function deficits. A second objective of the study was to explore the ecological validity of the MCT and verbal fluency task performance. The findings with regard to the generalizability of training-induced benefits suggest that they may extend to performance on the MCT but not necessarily to verbal fluency performance. This study also provided evidence of the ecological validity of the MCT and semantic fluency tasks.

Neuropsychological Test Measures and Treatment Outcome

The generalizability and effectiveness of GMT was evaluated using neuropsychological test measures and subjective ratings of executive impairments (the former measures are discussed here while the latter are discussed in a later section). Inclusion of the verbal fluency tests in the outcome measures allowed for the evaluation of abilities not directly targeted for rehabilitation, but rather, in this instance, executive language functions. All three such measures of verbal fluency (i.e., letter, category, and switching), however, showed negligible to small effects when comparing pre- to post-intervention performance. This contrasts with a previous study that found improvements of medium to large effect size on phonetic fluency after a GMT intervention (Winocur et al., 2007). A potential interpretation of these results is that strategies trained during GMT did not generalize to those required on fluency tasks in this sample. Thus, GMT showed poor generalization and effectiveness when evaluated using standardized measures of verbal fluency.

Similar findings were observed on the other neuropsychological test measures utilized in this battery. The effect size analysis revealed negligible effects of GMT across measures of verbal and visual memory and confrontation naming, with the exception of medium to large effects on a delayed visual memory task. A possible explanation for the relatively large effects on this measure includes differences in the level of difficulty across alternate forms of the BVMT-R (i.e., the form used post-GMT may have been less challenging than the one used pre-GMT).


Another possibility is the greater potential for large improvements post-GMT due to relatively poor scores at baseline. Indeed, two of the four patients obtained borderline or impaired scores during baseline testing, thus increasing the likelihood for more substantial improvements during post-intervention testing. Finally, there is the possibility that the improvements on delayed visual memory were in fact due to GMT. These gains may be attributable to training of encoding and retrieval strategies during Stage 4 of the program. Overall, the findings of mainly negligible effects across neuropsychological test measures suggest that standardized executive function tests are not superior to non-executive tests for indexing treatment efficacy or demonstrating generalization of strategies learned in GMT. These results are consistent with previous studies that demonstrated no effect or only modest improvements in standardized executive function measures after GMT (Schweizer et al., 2008; van Hooren et al., 2007). In sum, the empirical evidence suggests that standardized executive measures have inadequate clinical utility in regard to the assessment of GMT effects. The most likely explanation for these findings is that these tests do not measure the constructs specifically targeted by GMT.

A different account emerges, however, when MCT performance is analyzed. The findings from this measure suggest that the training effects associated with GMT were generalizable, at least within the constraints of behaviour assessed in the laboratory, as the MCT was not included in the training protocol. Strategies trained in GMT do indeed appear more applicable to those required on the MCT than on fluency tasks. For instance, it is conceivable that GMT instruction on the partitioning of a goal into subgoals and the prioritization of tasks would be more relevant to performance on the MCT than on a verbal fluency test. Indeed, there was a notable improvement in MCT performance, both qualitatively and quantitatively, from baseline to post-intervention. Although non-parametric statistical analyses did not reveal any significant differences from pre-GMT to post-GMT on any of the neuropsychological test measures administered, this was likely due to low power. Accordingly, the findings from the effect size analysis, which is not moderated by the small sample size, were taken to be a more accurate estimate of the effectiveness of treatment. This analysis revealed that overall the largest effects were found on the MCT variables, all of which showed medium to large effect sizes (when compared to negligible effects on the fluency tasks).

The potential influence of practice effects on the MCT must be addressed. It was not possible to use different forms of this test for the pre- and post-GMT evaluations, as only one version of the test exists at present. Thus, there is the obvious possibility that post-intervention performance was more reflective of the benefits of practice than the application of goal management strategies learned during rehabilitation. Some evidence, however, refutes the notion that practice effects alone were responsible for improvements in performance. One such piece of evidence is the substantial increase in plan time that was noted post-GMT across all four patients. If the results obtained were simply a practice effect, one would expect plan time to decrease rather than increase due to the benefits of familiarity with the task. The largest increase in plan time and plan score was found in the one patient who obtained a very poor plan score pre-GMT, suggesting that the extra time taken to carefully plan the route, rather than just the effect of practice, contributed to the improved performance. Finally, plans were noted to be more detailed and accurate post-GMT. One might expect that patients would require less detailed plans to accomplish the required tasks if practice effects were solely responsible for the change in their performance. Instead, the increased detail in plans appears more reflective of the effects of GMT, and perhaps more specifically training on the partitioning of a goal into clearly defined subgoals.


The slower plan times post-intervention are consistent with the findings from Levine and colleagues' (2000) study, in which they found that the portion of their sample that received GMT (when compared to the group receiving motor skills training) performed more slowly on post-GMT tasks. They suggested that GMT increased participants' care and attention to the tasks, in turn reducing errors. This was also found here, where patients reduced errors to virtually zero on the MCT (i.e., one error was made across the entire sample) during post-intervention testing. It should be noted, however, that while reductions in specific types of errors were found, in some cases these errors occurred so infrequently pre-intervention that they were already at ceiling, thus limiting the extent of any gains attributable to GMT. Nevertheless, improvements in performance were evident. There was a slight reduction in task repetitions, possibly as a result of improved checking and monitoring skills taught in GMT. It was expected that training in planning and maintaining goals and subgoals would result in completing more tasks within the time limit, fewer missed time deadlines, and faster completion times. Improvements in all of these aspects of performance were indeed found, although only modestly so, in part due to better than expected pre-GMT performance. A main focus of GMT is the acquisition of a simple self-command ("stop") to interrupt ongoing automatic behaviours that occur during attentional lapses in order to resume executive control (Schweizer et al., 2008). Such training may have helped prevent patients from making "automatic pilot errors", defined as absentminded slips due to inappropriate habitual responding (Levine et al., 2000). In particular, training of this 'stop' command along with instruction on the prioritization of subgoals may have helped patients avoid making insufficient funds errors (i.e., attempting to make purchases without money), as this was the error type with the most notable decrease post-intervention.


The increased planning times post-GMT coincided with decreased completion times. While faster completion times may be reflective of a practice effect, it could also be argued that the extra care and attention devoted to devising more detailed and accurate plans aided patients in completing the task more efficiently. Indeed, only one patient had perfect plan-performance concordance prior to the intervention while perfect plan-performance concordance was obtained for all patients post-GMT. This perfect concordance between plans and performance post-GMT was undoubtedly due to the fact that all patients produced 'perfect' plans. Although patients produced good plans pre-intervention, their post-intervention plans improved such that they were more detailed, logical, and spatially efficient. Moreover, actual performance on the MCT also improved post-GMT. In line with van Hooren and colleagues' (2007) GMT study, the current results demonstrate that, post-intervention, patients were better able to perform activities according to a plan and better able to structure activities in daily life (or in this case, activities resembling those one would perform in daily life).

Some preliminary data exploring practice effects on the MCT have been collected in a sample of 4 normal participants, who performed the task on two occasions 4 weeks apart (see Tables 4.4 - 4.6). In contrast to the current patient sample, the plan time of the normal sample decreased during the second administration of the task, with an obtained effect size in the medium range. Additionally, completion times and total errors decreased as they did in the patient sample, with a very large effect size obtained for the completion time variable but only small to medium effects on the total errors variable in the normal sample (the number of tasks completed variable was perfect across all 4 participants on both trials). These results are consistent with the presence of practice effects on the MCT. The plan score, however, decreased during the second testing period, although the effect size obtained on this measure was small to negligible. Plan score would be expected to improve due to practice effects alone, yet it did not in the normal sample. Moreover, from a spatial perspective the efficiency of the route taken improved in only 1 participant; 2 of these participants' paths were spatially inefficient on both trials and the other participant's route was spatially efficient on both occasions. These findings contrast with those from the patient sample, who appeared to put more effort into their plans (as evidenced by an increase in plan time and plan detail) and took more spatially efficient routes during the second, post-intervention administration of the task. Thus, practice effects may not be responsible for these particular improvements in the patient sample given that the normal sample did not demonstrate these same improvements in performance. From these preliminary data, it can be deduced that while certain MCT variables, such as completion time and total errors, may be subject to practice effects, other MCT scores (i.e., plan time, plan score) likely improved in the patient sample for reasons other than practice, such as benefits gained during GMT.

Further Evidence of Ecological Validity

The DEX-I EF score was utilized to explore relationships with the executive function variables to evaluate ecological validity. This score was shown to have the best linear relationship with MCT completion time. Findings in support of the ecological validity of the completion time measure were not unexpected, given that this score is likely reflective of various aspects of task performance including duration engaged in completing tasks, efficiency and accuracy of the sequence of errands performed, and presence or absence of errors made such as task repetitions and inefficiencies. These aspects of MCT performance indeed appear very relevant to everyday behaviours and seem particularly reflective of an individual’s productivity and efficiency in real life.


The DEX-I EF scores were also linearly related, albeit to a lesser extent, to the two semantic fluency measures (D-KEFS Category Fluency and Category Switching Fluency) administered. It is interesting to note that the MCT and a semantic fluency task (Animal naming) were the only tests associated with subjective ratings of real world executive functioning in the patient study reported in Chapter 3. Thus, these results provide further evidence of the ecological validity of the MCT and semantic fluency tasks. As argued in Chapter 3, the seemingly more direct relationship between task requirements and daily activities suggests that performance on the MCT may be better suited for the prediction of real life executive functions than semantic fluency measures. It is also worth noting that several MCT variables (i.e., plan score and total errors) did not have sufficient variability across the small sample to allow for the computation of correlations with the DEX. Accordingly, future studies with larger sample sizes, a matched control group, and perhaps more severely impaired patients are required to evaluate whether these MCT scores are also predictive of real world behaviour.

MCT Performance: Comparisons with Studies One and Two

To diverge slightly from the current topic, MCT performance in the current sample was compared with performance in the two previous studies, revealing some interesting patterns. The most frequent errors in this sample were the same as those from the previous two chapters: incomplete tasks (similar to the patient sample) and insufficient funds (similar to the normative sample). In the present group certain types of errors, such as task repetitions and inefficiencies, occurred very infrequently while others, such as entering a building in which there was no task to perform, were non-existent. Interestingly, this latter error occurred in both the previous patient and normal groups. The most likely explanation for the lack of such errors is the small sample size and the fact that the sample was composed of high functioning individuals.


Another interesting finding from this comparison was that the plans produced by the current sample were similar to those produced by the patient sample in Chapter 3 and that plans produced by patients overall (i.e., across both studies) were more accurate than those produced by the normal group in Chapter 2. It is also of interest that plan times pre-GMT were nearly identical to the plan times of the normal group, yet the current patient sample produced better plans. A possible explanation for these results is that individuals with executive difficulties and reasonably intact insight into their impairments are better able than neurologically intact individuals to appreciate the importance of foresight and of planning intended activities ahead of time.

DEX and Treatment Outcome

Overall, patients and their informants reported fairly mild executive function impairments on the DEX. Contrary to what would be expected, dysexecutive symptoms as rated by patients worsened immediately post-intervention. Taken together with the findings of improved scores on the insight index, this provides some evidence that GMT may improve awareness of executive difficulties. This makes sense intuitively, as the executive deficits experienced by the sample were undoubtedly more salient to them immediately after undergoing the 7-week GMT program in which these deficits were the primary focus. An increased awareness and appreciation of their dysexecutive problems may thus have been an immediate effect of GMT, explaining the increase in reported symptoms and the reporting of more severe symptoms than those reported by their informants.

The increase in symptom severity reported by patients post-intervention may also have been a product of the timing of the evaluation, as patients likely had not had enough time to implement GMT strategies fully in their daily lives, and thus to notice improvements in executive function, by the time of these evaluations (see below for further discussion).


Subjective behavioural ratings from informants showed an overall decrease in symptoms on the derived DEX EF scale. As noted previously, informant ratings are generally regarded as more accurate than patient ratings and are therefore preferred for estimating actual executive function impairment on the DEX. The items from the DEX that were used to derive the EF scale were specifically chosen because they were thought to reflect the particular functions targeted by GMT. The fact that informant ratings on this scale decreased post-intervention suggests that specific executive functions targeted by GMT improved even in this sample of relatively mildly impaired patients. This conclusion must be made cautiously and tentatively, however, as two of the informant ratings on the EF scale actually increased slightly (suggestive of worsening of symptoms) and the overall effect size from baseline to post-GMT was considered small.

A possible explanation for the increase in DEX-I total score may be that expectations for immediate improvements post-intervention were high and unrealistic and that the post-intervention evaluations were more reflective of the discrepancy between these expectations and reality than of actual observed changes from baseline. Another possibility is the timing of the completion of the DEX questionnaire. The administration of the DEX immediately after GMT was perhaps not ideal, as informants may not have had sufficient opportunity to fully appreciate any changes in executive functioning. Alternatively, patients may not yet have fully implemented the strategies learned in training. In the case study report by Schweizer and others (2008), the informant did not rate the patient until long-term follow-up four months post-GMT. Indeed, in this earlier study ratings provided by the patient on another scale (Cognitive Failures Questionnaire; Broadbent, Cooper, Fitzgerald, & Parkes, 1982) indicated the presence of more difficulties immediately post-intervention when compared to baseline, similar to the findings of the current study. Levine and colleagues (2007) found non-significant improvements on the DEX immediately post-GMT but significantly lower DEX scores at long-term follow-up. They suggested that the real life effects measured on the DEX needed time to consolidate and that the post-intervention assessment may have been too early to detect effects. In the present study, long-term follow-up evaluations using the DEX may have allowed for consolidation of real life effects in the patient sample. Unfortunately, it was not possible to administer these long-term evaluations due to ethics restrictions not allowing for such imposition into patients' (or, more specifically, their informants') lives months after completing the GMT program.

Long-Term Benefits of GMT

The evidence suggests that the effects of GMT extend beyond the immediate post-intervention period (Levine et al., 2000, 2007; McPherson, Kayes, & Weatherall, 2009; Schweizer et al., 2008; van Hooren et al., 2007). Moreover, even in high functioning samples like the present one, it is often observed that subtle changes, such as those seen here on neuropsychological test performance and behavioural ratings, can result in a substantial change in daily functioning when compared to pre-intervention abilities. Test performance in particular has been noted to underestimate the degree of therapeutic benefit derived from GMT (Schweizer et al., 2008). Evidence of the long-term beneficial effects of GMT, in terms of the positive impact it had on patients' daily lives, was provided by reports from the study sample approximately two months post-GMT. At that time, patients were asked to evaluate the training program and their use of the learned strategies in their daily lives with free-form comments.

Three of the four patients provided their reviews, all of which were positive and suggested that the techniques taught during GMT were being utilized regularly in their everyday activities. For instance, one patient described GMT as an "excellent program" and stated that, "The 'stop' technique is fantastic. Whenever I feel flustered because I don't know what I'm looking for, for example, I 'stop' and get oriented or more focused". Another patient stated, "The GMT program was very helpful. Particularly the emphasis on paying attention to avoid memory slips. That seems so simple when stated, however it is more difficult to put into practice. Taking a moment or two to focus on what I'm doing has improved my effectiveness". These evaluations are indicative of the potential for long-term benefits and the generalizability of GMT techniques to everyday life.

Limitations

In addition to the aforementioned limitations of this study (e.g., small sample size, lack of an alternate form for the MCT and potential practice effects, lack of long-term follow-up DEX evaluations), a design limitation potentially undermining the study findings is the lack of a matched control group not receiving treatment or receiving non-executive function treatment. As in study 2, a lack of access and ethics approval for the attainment of a sufficiently matched control group (i.e., with respect to gender, age, and education) is recognized as a significant limitation. In the current study, the ethical issues related to either delaying treatment (in a waitlist control group protocol) or providing an alternate treatment for a matched group were also of primary concern given that patients were included in the GMT program by clinical referral and not specifically for the purposes of the current research study. Without a control group to provide a comparison base for the evaluation of post-GMT effects, it is acknowledged that positive results could be attributed to a variety of factors other than the intervention's effect. In addition to practice effects, patients may show a placebo effect in which they expect the intervention to help, resulting in improved post-intervention performance. Alternatively, patients may be motivated to show improved performance in order to please the experimenter or play the "good subject role" (Weber & Cook, 1972). Furthermore, improved performance at post-intervention assessment could reflect non-specific effects of the GMT intervention, such as an increase in social interaction or therapeutic gain from meeting individuals with similar cognitive difficulties. These potential confounds involving non-specific effects of GMT can be partially refuted by the fact that patients evaluated themselves as having worsened executive impairments after treatment. Nonetheless, future studies of the efficacy and generalization of GMT would be strengthened by implementing randomized controlled trials in order to control for these confounding variables.

A further limitation in regard to any comparisons made between the current study and the previous patient study (study 2) was the potential for gender effects. Study 2 included primarily male patients while the current study was composed mainly of females (i.e., all but one). Thus, any differences in performance on the MCT or other neuropsychological tests administered may be attributable to gender effects. It is notable, however, that normative data from the various standardized executive function measures administered in these studies have not shown a gender bias in performance (e.g., TMT: Hester, Kinsella, Ong, & McGregor, 2005; Tombaugh, 2004; Verbal Fluency: Heaton, Miller, Taylor, & Grant, 2004; Tombaugh, Kozak, & Rees, 1999; WCST: Heaton, 1981; Heaton, Chelune, Talley, Kay, & Curtiss, 1993). As the executive abilities measured on these tests are presumably also measured on the MCT, albeit perhaps in a different way, it is speculated that gender would not play a major role on the MCT. At most, the neuropsychological literature has documented gender effects favouring males in regard to visuospatial processing (e.g., Basso & Lowery, 2004) and better performance in females in regard to verbal skills, including verbal memory (e.g., Delis, Kramer, Kaplan, & Ober, 2000). In the MCT, however, the simplicity of the design of the city with respect to spatial navigation and the limited verbal memory skills required (i.e., task instructions were always on the monitor and on the instruction sheet) likely limited the extent to which gender differences would bias task performance. Finally, study 1 explored gender effects in the normative sample and found no differences between males and females on the MCT or any other cognitive test administered, with the exception of somewhat slower TMT A performance by females.

Conclusion

In conclusion, the present findings indicate that GMT is an effective therapeutic program for the amelioration of executive deficits in patients with long-standing complaints of dysexecutive symptoms. The strategies learned in GMT appear to be generalizable to activities not specifically trained during intervention, as demonstrated by improvements in MCT performance. Trained strategies do not appear to generalize to a traditional executive function measure assessing verbal fluency. Further evidence of the ecological validity of the MCT, and to a lesser extent semantic fluency tasks, was suggested by the current results. The potential for long-term beneficial effects of GMT was evinced by subjective reports from patients of continued everyday use of techniques and improvements in daily activities two months post-intervention. Several study limitations were noted and further studies using randomized controlled trials, alternate forms for all neuropsychological tests, and larger sample sizes are warranted.

Chapter 5

General Discussion

The field of clinical neuropsychology has been evolving over the past several decades. A major aspect of this evolution is the shift in focus from the localization of brain lesions to an increased interest in taking neuropsychology out of the laboratory and into the real world. The most commonly used executive function measures may be of limited use in this regard due to their questionable ecological validity (Burgess et al., 2006).

Over the last 20 years, several researchers have developed executive measures that attempt to assess the multitasking and planning demands of real life by simulating real world activities (McGeorge et al., 2001; Shallice & Burgess, 1991; Wilson et al., 1996). Some evidence suggests that tests using this verisimilitude approach are superior to traditional test measures in predicting everyday performance (Chaytor & Schmitter-Edgecombe, 2003). These novel tasks, however, involve scenarios or incorporate subtasks that do not always closely mimic the real life daily activities of most individuals (e.g., applying for a position in the psychology department, obtaining information about the weather in another country, visiting a zoo).

The lack of similarity between such tasks and everyday activities may negatively affect ecological validity. Moreover, these tasks customarily impose rule constraints in their list of instructions (e.g., forbidding travel outside of a certain spatial boundary). These rules may inadvertently create more structure for examinees, despite this being the very thing that should be avoided when constructing executive function measures (Lezak, Howieson, & Loring, 2004). The inclusion of specific rules to follow while completing a task may serve to mask certain executive deficits in individuals who may be adept at following rules but struggle when left to their own devices.


The present research program endeavoured to place clinical neuropsychology on firmer scientific ground in terms of ecological validity and verisimilitude testing with the development of a novel measure, the MCT. The MCT attempts to index executive functioning with ecological validity in mind by using a different paradigm from previous measures of verisimilitude. Of primary importance, the task utilized in this program of research was designed to closely mimic real life activities that are presumably more relevant and familiar to most individuals than other common verisimilitude measures. Equally important, observed behaviour on the task was unconstrained by rules in an attempt to uncover executive deficits potentially masked or reduced by the structure imposed by such rules. Another integral aspect considered during task development was the evaluation of the concordance (or discordance) between planning and actual performance on the task. This type of analysis has infrequently been done in previous research despite its potential to provide valuable information about the fractionation of the executive system (e.g., planning, initiation, etc.) and implications for identifying specific processes to target in rehabilitation.

A fundamental goal in this research program was to explore the ecological validity of the MCT and existing standardized executive function measures. Thus, a comparison of the ecological validity of verisimilitude and veridicality tasks was undertaken. A second goal of this research was the analysis of the various subprocesses of the executive function system and how they were integrated during goal-directed behaviour. These analyses focused on an evaluation of the qualitative and quantitative patterns of performance on the MCT, performance comparisons across normal and patient samples, and similarities and differences in the functions evaluated on the MCT and other standardized executive and non-executive measures.


The objectives, hypotheses, and main findings from the analyses are summarized in the next section. They are organized according to the two major topics covered in this program of research: ecological validity and test performance characteristics. Following each summary, the theoretical and clinical implications of the findings from this research program will be discussed. To conclude, the limitations of executive function test measures and directions for future research will be considered.

Ecological Validity

The ecological validity of the MCT and other commonly employed, standardized measures of executive function was primarily addressed in studies 2 and 3. That is, a primary objective of both studies was to evaluate the ecological validity of these measures by examining associations between performance on these tests and behavioural rating scales of everyday executive functioning. It was hypothesized that the MCT would be most predictive of real world executive skills given that it was designed with ecological validity in mind. The findings from both studies revealed that this was indeed the case. In both studies, one MCT variable (i.e., plan score in study 2; completion time in study 3) was found to have the highest, statistically significant correlations with the informant-rated behavioural scale score when compared to all other executive function tests administered. A measure of semantic fluency, however, was also found to be highly correlated with the outcome measure used in both studies, albeit to a lesser extent than the MCT. Given that measures of verbal fluency would be considered tests of veridicality, as they do not directly simulate real life tasks, the finding that this measure was also predictive of everyday abilities suggests that at least some traditional test measures may have good ecological validity. For instance, it may be that semantic fluency measures index a more central or critical process than anticipated. The evidence for the ecological validity of both the MCT and semantic fluency was strengthened by the findings of significant associations with not one but two different functional outcome measures (FrSBe and DEX). That both scales were related to both of these test measures supports the validity of these findings.

In evaluating the ecological validity of a task, it may not suffice to simply demonstrate significant associations with functional outcome measures. A further aspect of a test's ecological validity to address is how performance on the measure could predict the types of activities one could successfully perform or may struggle with in daily life. It was argued that it would be difficult to extrapolate precisely which abilities in real life could be predicted from performance on a semantic fluency task. More specifically, it is not clear how the ability to rapidly generate animal names could be used to infer success or failure in daily activities such as shopping tasks or attending appointments on time. It may not be the generation of animal names per se, but the underlying process of generating and scanning verbal alternatives that is common to everyday behaviours. Thus there may be an indirect relationship between semantic fluency (e.g., animal naming) and the prediction of performance on everyday activities that involve, as one component of the task, the ability to generate items or words that are semantically related. One example may be the composition of an email message or report in which different words that convey the same meaning must be used to avoid repetition within the text.

In contrast, there may be a more direct relationship between MCT performance and everyday abilities. For instance, the plan score (associated with functional outcome in study 2) presumably measures organization and sequencing functions including the ability to prioritize tasks. In study 3, completion time was associated with functional outcome in the expected direction. The completion time measure reflects not only duration engaged in completing required tasks, but also additional time spent due to errors made such as task repetitions, wandering, or other such behaviours. Lengthy completion times on the MCT may thus reflect a variety of executive impairments demonstrated in real life involving difficulties engaging in and maintaining goal-directed behaviour. On the basis of this seemingly more direct relationship between task requirements and daily activities, performance on the MCT may be better suited for the prediction of real life executive functioning than semantic fluency measures.

Further evidence that the MCT may be a more ecologically valid measure than semantic fluency measures was demonstrated in the rehabilitation study (study 3). Both measures were utilized to evaluate changes in performance from baseline to post-GMT. An effect size analysis revealed that all MCT variables showed medium to large effect sizes in the expected direction while only negligible effects were demonstrated on verbal fluency (including semantic fluency) tasks. Indeed, MCT performance improved notably in the form of better plans produced, fewer errors made, and shorter completion times. These findings further strengthened the argument for the better ecological validity of the MCT, as the strategies trained during GMT are those identified to be regularly employed in daily life. These strategies would likely be more relevant to MCT performance than fluency tasks. For instance, as outlined in Chapter 4, strategies trained included instruction on the partitioning of a goal into subgoals and the ability to prioritize tasks. Such strategies are most probably utilized on the MCT but it is difficult to articulate how they may be pertinent to verbal fluency task performance. Thus the successful application of GMT strategies, as documented by improvements in performance on the MCT and not on the semantic fluency task, demonstrated the MCT's superior ecological validity.

Finally, indirect evidence for the ecological validity of the MCT was also demonstrated in study 1 (with supporting evidence found in study 2). Here, significant correlations found between MCT performance and tests of visuospatial ability and processing speed demonstrated that the task involves the recruitment of a variety of lower order cognitive processes. A slightly different and more extensive test battery was used in study 2, and here the MCT was also related to visuospatial functions and processing speed along with visual and verbal memory and visuoconstruction ability. The complex demands of real world activities undoubtedly involve the parallel engagement of both executive processes and more 'basic' cognitive functions. The findings of statistically significant relationships with measures of these basic functions therefore suggested that the MCT may simulate the complex demands of real world activities involving the simultaneous recruitment of executive and non-executive processes.

One of the most noteworthy findings from this research was that all standardized executive function test measures administered in study 2, with the exception of semantic fluency, were not significantly correlated with the functional outcome measure utilized. In particular, even the one 'ecologically valid' measure employed (MSE) was not related to everyday executive functioning here or in a previous study (Knight, Alderman, & Burgess, 2002). On the basis of the present findings, it was strongly suggested that the clinical utility of immensely popular executive function measures (i.e., WCST, TMT-B, and COWAT) be questioned given their demonstrated lack of ecological validity in these investigations. It was further cautioned that predictions with potentially life-altering consequences, such as estimations of an individual's vocational aptitude or ability to manage finances, not be made solely on the basis of performance on these tests.

In sum, the above findings provide evidence of the ecological validity of the MCT. Study results also demonstrate that semantic fluency tasks have ecological validity. The data from study 3, however, suggest that semantic fluency measures are less useful than the MCT for the evaluation of the generalizability of intervention effects. Finally, evidence of the lack of ecological validity of other commonly employed, standardized executive function measures was presented.

Measurement of executive functioning in the real world.

The findings from these analyses indicate that, while there appears to be a place for certain traditional veridicality tests in the prediction of real world abilities, tests of verisimilitude may be more consistently related to functional outcome and thus may provide a superior method of assessing everyday functioning. These results are corroborated by the Chaytor and Schmitter-Edgecombe (2003) review, in which both veridicality and verisimilitude tests were found to be related to informant questionnaires but the verisimilitude tests were more consistently associated with functional outcome measures. In the present case, however, the findings go one step further in suggesting that many traditional test measures are not actually related in any significant way to everyday behaviour.

At the very least, the present findings indicate that the concomitant use of both traditional and ecologically oriented test instruments may provide the best assessment of real world performance. The same opinion was offered by Marcotte and colleagues (2004), who found that global neuropsychological performance using traditional tests was a significant predictor of passing or failing an on-road driving test in a study of driving abilities of HIV-positive participants. They also demonstrated that performance on an interactive PC-based driving simulation (an ecologically oriented instrument) explained additional variance beyond traditional test measures in predicting on-road performance. They concluded that simulator testing may provide information about real world behaviours not captured by traditional neuropsychological measures. Accordingly, they proposed that a multimodal assessment of driving ability may provide the most informative laboratory-based approach.


The same may apply to the evaluation of other real world tasks, such as errand running/multitasking behaviour as observed in the current context. Indeed, the lack of association between semantic fluency and MCT performance suggests they are measuring different functions (as discussed above), yet both were related to functional outcome. Moreover, the qualitative data that can be obtained from verisimilitude tests such as the MCT make them appealing for clinical purposes. For instance, patients’ performance on the MCT could be examined in detail to describe not only the number but also the specific types of errors they may be likely to display in daily functioning, such as poor planning skills or tardiness in attending appointments due to being sidetracked by other errands. This is in stark contrast to traditional executive tests that typically provide only numerical scores. The detailed analysis of behaviours relevant to everyday functioning made available by verisimilitude tests provides a definite advantage over traditional tests in terms of clinical utility. It is thus recommended that a multimodal approach involving the use of both veridicality and verisimilitude instruments be adopted for the evaluation of a variety of everyday tasks, as each type of instrument may add unique information in predicting real world performance. In light of the evidence of poor ecological validity of several common executive function measures, a further recommendation is that the specific tests utilized be chosen selectively and on the basis of empirical evidence of their predictive validity.

Test Performance Characteristics and the Executive Function System

In addition to ecological validity, the other major emphasis of this research was on MCT test performance characteristics and what they may reveal about the subprocesses of the executive function system. These analyses were primarily focused on qualitative and quantitative patterns of performance across the three samples, and on similarities and differences between the MCT and other standardized executive function measures by way of correlational analyses.

In regard to normative performance on the MCT, it was hypothesized that the test was sufficiently complex that ceiling effects would be avoided in the presumably high-functioning group of undergraduate psychology students recruited in study 1. This appeared to be the case. That is, while the number of tasks completed was high, other qualitative aspects of performance indicated that very few individuals in the sample obtained ‘perfect’ or even ‘near perfect’ scores. Various types of errors were made by most participants, involving inaccurate sequencing of events, task repetitions, and inefficiencies including wandering behaviour and entering unnecessary buildings. Moreover, plan scores were lower than expected for the group.

Normative performance on the MCT was contrasted with that of patients with ABI in study 2. Several key differences in performance patterns were noted. The ABI group on average spent considerably more time planning and completing the task than the normal sample. Quite unexpectedly, however, the patient group produced better plans overall on the MCT than the normal participants. Plans were deemed better primarily due to good sequencing of events. These results contrast with the finding that, in line with our expectations and previous research (Alderman, Burgess, Knight, & Henman, 2003; McGeorge et al., 2001; Shallice & Burgess, 1991), patients completed fewer tasks than normal participants. In comparing the groups in studies 1 and 2, patients also showed much greater variability in the number of tasks completed, consistent with the findings of other investigations documenting greater heterogeneity within patient groups when compared to healthy participants (Knight, Alderman, & Burgess, 2002; Zakzanis, 2001).


Given the unequal sample sizes between the groups in studies 1 and 2, differences in the frequency with which specific types of errors occurred across the two samples were difficult to evaluate. That is, more errors would be expected among the normal participants simply by virtue of the fact that this sample was more than twice as large as the patient sample. Thus, it was particularly noteworthy that the total number of tasks not completed was substantially larger in the patient sample than in the normal sample. Although the other differences were uninterpretable due to the sample size discrepancy, it was interesting that the patient and normal groups made exactly the same types of errors. That is, there were no errors made by some patients that were not also made by some normal participants. Thus, it was the number of errors rather than the category of error that best differentiated those with neuropathology from healthy participants. As noted, it was only the number of incomplete tasks that clearly distinguished the two groups. In sum, these findings indicate that patients can best be differentiated from healthy individuals by quantitative rather than qualitative aspects of performance on the MCT.

In considering the MCT performance data from the patient sample in study 2, it is worth reiterating the discrepancy between the production of good plans and comparatively worse performance. Analysis of this discrepancy revealed that tasks that were planned were frequently not actually completed or even attempted. The interpretation provided for these results was that impairments in initiation were likely responsible for the dissociation between good planning of a sequence of actions and poor actual execution of those actions, consistent with Teuber’s (1964) notion of the “dissociation between knowing and doing”.

The comparison of study 3 findings with the first two studies revealed that, while the most frequent errors from the earlier studies (insufficient funds and incomplete tasks, respectively) were also the most frequent in study 3, other types of errors were very infrequent or non-existent. For instance, none of the patients in study 3 entered a building in which there was no task to perform, although this error occurred in both previous studies. The low frequency of certain types of errors in comparison to the other two studies was attributed to both the small sample size and the high-functioning individuals who comprised the group. It is also possible that the sample developed compensatory mechanisms to overcome their fairly chronic executive deficits. On the other hand, the findings presented in study 2 suggest that more acutely or severely brain-injured individuals would exhibit a wider variety of errors.

The comparison of performance patterns across all three studies revealed an interesting finding: Plans produced by both patient groups (studies 2 and 3) were of higher quality overall than those produced by the normal group (study 1). The most plausible explanation offered for this finding was that individuals with executive difficulties and reasonably intact insight into their impairments are better able to appreciate the importance of foresight and planning intended activities ahead of time. This may be particularly true for the patients in study 3, as they had been suffering from executive difficulties for several years and would likely have developed various strategies, including making plans, to compensate for their impairments. Perhaps healthy individuals take for granted the utility of making plans prior to engaging in intended activities.

Studies 1 and 2 addressed the relationship between the MCT and other common executive measures. The findings from these analyses were mixed, as study 1 found little association between the MCT and other executive function tests while study 2 found that most of the executive tests employed were significantly correlated with the MCT. In study 1 it was proposed that the lack of relationship may be because traditional executive tests measure the component processes or “building blocks” of executive functions, while the MCT requires the ability to initiate these building blocks or component processes, monitor performance, and use this information to adjust behaviour (Burgess & Alderman, 1990). Although seemingly contradictory, the findings obtained in study 2, in which most executive measures were related to the MCT, were postulated to be consistent with this theory. That is, the three executive measures that were found to be related to the MCT were associated with only one or at most two (in the case of the WCST) MCT variables. Moreover, certain MCT variables (i.e., plan time and tasks completed) were not associated with any of the executive function measures. These results indicate that the significant associations between these tests and specific MCT variables reflect that they are tapping the same component executive processes. The lack of relationship between these standardized tests and other MCT scores, however, was viewed as compatible with the interpretation from study 1 in that overall performance on the MCT requires higher order ‘control’ mechanisms and the ability to integrate these so-called building blocks of executive function.

The fractionation of the executive function system.

The introductory section of this thesis articulated various well-known accounts of executive functions/prefrontal processes (Fuster, 1997, 2002; Grafman, 2002; Norman & Shallice, 1986; Shallice, 1994; Shallice & Burgess, 1996; Stuss & Alexander, 2007; Stuss & Levine, 2002). Some general themes were noted across these different frameworks, and the current research findings appear to be consistent with such themes. That is, the present results support the idea that there is no unitary executive function. They further suggest that the ‘supervisory system’ can be fractionated into a variety of processes that are carried out by different subsystems but operate together in a globally integrated manner. Performance on the MCT appears to involve several different ‘executive’ type processes, including planning, sequencing, multitasking/prioritizing, initiating, monitoring, decision making, and likely others. The fact that standardized executive measures were related to some but not other MCT variables provides evidence of the fractionation of the executive system into various processes, some of which are tapped by both the MCT and these other measures, and others tapped only by the MCT.

Moreover, the finding that some MCT scores were associated with other basic cognitive functions such as visuospatial and visuoconstruction ability, memory, and processing speed is suggestive of the executive system’s role in the integration of a variety of cognitive processes.

These results in particular seem to support Fuster’s (2002) conceptualization of the role of the prefrontal cortex/supervisory system in temporal integration and temporal order. More specifically, the current research findings appear to be consistent with his description of the supervisory system’s capacity for the mediation of cross-temporal contingencies. This process involves “the integration of temporally separate units of perception, action and cognition into a sequence toward a goal” (p. 99). Indeed, the associations with various executive and non-executive cognitive functions on the MCT suggest that the function of the supervisory system extends beyond strictly executive-type processes to integrate information from other basic cognitive systems during goal-directed behaviour.

The present findings suggest that deficits in the integration of different executive processes may occur within an impaired supervisory system. Thus, the findings from study 2 suggest that this breakdown may occur when intact planning ability is compromised by initiation problems during task completion. Lezak and colleagues (2004) described four components of executive functioning, of which planning was one and purposive action (which includes initiation) was another. It was emphasized that, while each component is distinct with regard to behavioural activities, “all are necessary for appropriate, socially responsible, and effectively self-serving adult conduct” (p. 611). It is presumed then that an intact executive system integrates its various components, including planning and purposive action, to accomplish successful, goal-directed behaviour. If initiation difficulties were indeed the root cause of impaired performance in the patient sample (study 2), this provides confirmation of the differentiation of executive processes and the existence of a difficulty integrating the intact executive component (planning) with the impaired one (initiation) for effective behaviour. Tasks like the MCT that are unstructured, complex, and performed over longer periods of time may be ideal measures for indexing such impairments in the integration of the various subprocesses of the executive system.

The current research also demonstrated the importance of good planning skills. Given the difficulties patients with executive impairments tend to have dealing with novelty, the present results indicated that these difficulties can be overcome to some extent by carefully considering how to approach tasks prior to initiation. The patient samples in this research were generally high functioning, had good insight into their deficits, and produced well-sequenced, efficient plans on the whole. The findings from a select few patients, and even several participants from the normal sample, however, illustrated the problems that may be encountered when planning is unsatisfactory. This was typically the case for those who did not properly sequence events in their plan, such that they would attempt to make purchases prior to withdrawing money from the bank. In these cases, task performance involved having to improvise the order in which tasks were completed since plans could not be successfully implemented. Sometimes this resulted in a well-sequenced, spatially efficient approach and other times it did not, instead involving a disorganized, spatially impractical route. Assessment of the ability to appreciate when to abandon a plan in the face of unanticipated events, or to improvise, obviously has clinical relevance. The research findings revealed that this was another aspect of MCT performance with clinical implications, as it may allow for the prediction of an individual’s ability to adapt successfully to novel and/or unexpected situations likely to arise in the real world.

General Limitations and Future Directions

Clinic-based tests such as the MCT and the other standardized tests discussed throughout this research commonly focus on ability – a skill or talent the individual possesses, as opposed to function – implementing that skill in an environmental context (Goldstein, 1996). A general limitation of this research, then, is that the measures utilized only assess what an individual is capable of doing for the prediction of real world behaviour, but do not evaluate what the person actually does. The MCT and other verisimilitude tasks may offer a slight advantage over traditional tests given the limited supervision during performance and the ‘real world’ context, which may make them more similar to tasks performed in everyday life. Nonetheless, a test based on verisimilitude is not necessarily ecologically valid (Chaytor & Schmitter-Edgecombe, 2003). Although the studies presented here provide evidence of ecological validity by way of associations with informant ratings of everyday functioning, the MCT has yet to be validated with respect to its true, real world counterparts (Rabin, Burton, & Barr, 2007).

Another problem is that perfect verisimilitude is not possible, given the difficulties inherent in completely replicating the environment in which the behaviours evaluated would take place (Goldstein, 1996). Indeed, Chapter 3 described some of the major differences between the VE in which the MCT takes place and the real world. Primary among these differences was the lack of ability to interact socially, given there are no virtual people within the VE. It was noted, however, that the task was mainly focused on examining executive functioning and thus assessing social and emotional behaviour was not of primary interest. Admittedly, however, this would have added to the realism of the task and could provide valuable information about how executive functions interact with social/emotional processes to guide behaviour.

Finally, the challenges inherent in the development of ecologically oriented measures must be addressed. Marcotte and colleagues (2010) note the difficulty in developing measures reflective of daily functioning that are “sufficiently challenging to provide a distribution of functioning across ‘normal’ individuals” (p. 24) such that ceiling and floor effects are avoided. The BADS, for instance, has been criticized for being insufficiently challenging for high functioning individuals with everyday executive impairments (Sohlberg & Mateer, 2001). Marcotte and others (2010) also highlight the danger of making a task so difficult that it becomes “test like” or similar to a video game and thus lacks realism. In the case of the MCT, the results from the studies presented here indicate that a reasonable balance between real world similarity and task difficulty has likely been achieved. That is, ‘normal’ individuals generally performed quite well on the task although most made some errors; thus both ceiling and floor effects were avoided. Unequal sample sizes proved to be problematic when evaluating error patterns in normal and patient groups, although the preliminary evidence indicated that patients generally made more errors. This suggests that the MCT is sensitive to the effects of brain injury. A further positive aspect of the psychometric properties of the MCT is that it appears to have very good inter-rater reliability.

Despite the seemingly good psychometric properties of the MCT, some modifications may be appropriate. Marcotte and colleagues (2010) acknowledged the necessity of modifying their own functional measures to achieve a balance between task difficulty and real world relevance.

Future variations of the MCT could attempt to improve realism by incorporating distractions and/or interactions with virtual people within the VE to explore social behaviour, including social rule breaks, given the frequency with which these types of errors occurred in previous studies (e.g., Alderman, Burgess, Knight, & Henman, 2003). The internal validity of the task could be preserved by ensuring all participants are approached by the same ‘person’ at the same time in the same manner, thus reducing confounds in task performance. In light of the simplicity of the budgeting requirements of the task, future modifications could increase the difficulty of this aspect by involving more than one unspecified expense (e.g., the amount of money available to spend on pens and shampoo) or by increasing the number of items to buy and the amount of money to spend. Future investigations using the MCT as an outcome measure in rehabilitative contexts, or in other instances in which repeat administrations are necessary, should involve the development and utilization of different versions of the task (involving a different list of task requirements and shopping list items) to reduce or eliminate potential practice effects.

Conclusion

The present research described the ecological validity of verisimilitude and traditional executive function measures and the characterization of various subcomponents of the executive function system. The unique contribution of the thesis was the development and empirical study of a novel task (MCT) that was less structured and more closely resembled actual everyday errands than existing real life and VR verisimilitude tasks. This research demonstrated that tests of verisimilitude may be better predictors of real world behaviours than many of the most commonly employed traditional executive function tests. A further contribution was made by providing confirmatory evidence for the fractionation and integration of various subprocesses of executive functioning, as indicated by existing theory and literature. Within the MCT, these subprocesses likely include at least planning, sequencing, multitasking, prioritizing, initiating, monitoring, and decision making. The current findings also provide support for the notion that executive processes and lower-order cognitive functions must be temporally organized and integrated to achieve successful goal-directed behaviour.

The findings from this research project suggest there is a place for both tests of verisimilitude and traditional executive function measures in neuropsychological evaluation, with each type of measure providing unique information about the prediction of real world abilities. In light of the current results, greater interest in developing and utilizing verisimilitude tasks appears warranted and long overdue. Measures of verisimilitude are well positioned to inform specific areas of focus for treatment given their real world similarities and the ability to identify which process within the executive system is impaired (e.g., planning versus initiation).

Verisimilitude tasks may further be used not only for assessment purposes but also as rehabilitative tools, especially if multiple versions of varying degrees of difficulty are developed. This area of research remains in its incipient stages, but such an approach appears to be an economical, effective treatment option, as it allows for more repetitive practice and reduces the therapist-patient time required, given the limited supervision needed while performing such tasks (Fong et al., 2010).

Verisimilitude instruments can potentially play a valuable role in both executive function assessment and intervention and consequently may help to place clinical neuropsychology on firmer scientific ground.

References

Acker, M. B. (1990). A review of the ecological validity of neuropsychological tests. In

D. E. Tupper & K. D. Cicerone (Eds.), The neuropsychology of everyday life: Assessment

and basic competencies (pp. 19-56). Boston: Kluwer Academic Publishers.

Alderman, N., Burgess, P. W., Knight, C., & Henman, C. (2003). Ecological validity of a

simplified version of the multiple errands shopping test. Journal of the International

Neuropsychological Society, 9, 31-44.

Alexander, G. E., Crutcher, M. D., & DeLong, M. R. (1990). Basal ganglia-thalamocortical

circuits: parallel substrates for motor, oculomotor, "prefrontal" and "limbic"

functions. Progress in Brain Research, 85 , 119-146.

American Heart Association. (2010). Heart Disease and Stroke Statistics—2010 Update.

Dallas, TX: American Heart Association.

Anderson, C. V., Bigler, E. D., & Blatter, D. D. (1995). Frontal lobe lesions, diffuse

damage, and neuropsychological functioning in traumatic brain-injured patients. Journal

of Clinical and Experimental Neuropsychology, 17, 900-908.

Anderson, S. W., Damasio, H., Jones, R. D., & Tranel, D. (1991). Wisconsin Card Sorting

Test performance as a measure of frontal lobe damage. Journal of Clinical and

Experimental Neuropsychology, 13, 909-922.

Andres, P., & Van der Linden, M. (2001). Supervisory attentional system in patients with focal

142 143

frontal lesions. Journal of Clinical & Experimental Neuropsychology, 23,

225-239.

Army Individual Test Battery: Manual of directions and scoring. (1944). Washington,

D.C.: War Department, Adjutant General’s Office.

Arnett, P. A., Rao, S. M., Bernardin, L., Grafman, J., Yetkin, F. Z., & Lobeck, L.

(1994). Relationship between frontal lobe lesions and Wisconsin Card Sorting Test

performance in patients with multiple sclerosis. Neurology, 44, 420-425.

Baddeley, A. D. (1986). Working memory. Oxford: Oxford University Press.

Baddeley, A. (1998). The central executive: A concept and some misconceptions.

Journal of the International Neuropsychological Society, 4, 523-526.

Baddeley, A. D., & Hitch, G. J. L. (1974). Working Memory. In G. A. Bower (Ed.), The

psychology of learning and motivation: Advances in research and theory (Vol. 8, pp. 47-

89). New York: Academic Press.

Barceló, F. (2001). Does the Wisconsin Card Sorting Test measure prefrontal function?

The Spanish Journal of Psychology, 4, 79-100.

Basso, M. R., & Lowery, N. (2004). Global-local visual biases correspond with visual-spatial

orientation. Journal of Clinical & Experimental Neuropsychology, 26, 24-30.

Beatty, W. B., Jocic, Z., Monson, N., & Katzung, V. M. (1994). Problem solving by

144

schizophrenic and schizoaffective patients on the Wisconsin and California Card Sorting

Tests. Neuropsychology, 8, 49-54.

Beck, A. T., & Steer, R. A. (1993). Beck Anxiety Inventory Manual. San Antonio, TX: The

Psychological Corporation.

Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Beck Depression Inventory-II. Manual.

San Antonio, TX: The Psychological Corporation.

Benedict, R. H. B. (1997). Brief Visuospatial Memory Test–Revised: Professional manual.

Odessa, FL: Psychological Assessment Resources, Inc.

Bennett, P. C., Ong, B., & Ponsford, J. (2005). Assessment of executive dysfunction

following traumatic brain injury: Comparison of the BADS with other clinical

neuropsychological measures. Journal of the International Neuropsychological Society,

11, 606-613.

Benton, A. L., & Hamsher, K. deS. (1989). Multilingual Aphasia Examination. Iowa City:

AJA Associates.

Benton, A. L., Hannay, H. J., & Varney, N. R. (1975). Visual perception of line direction in

patients with unilateral brain disease. Neurology, 25, 907-910.

Berg, E. A. (1948). A simple objective treatment for measuring flexibility in thinking.

Journal of General Psychology, 39, 15-22.

145

Broadbent, D. E., Cooper, P. F., Fitzgerald, P. & Parkes, K. R. (1982). The Cognitive Failures

Questionnaire (CFQ) and its correlates. British Journal of Clinical Psychology, 21 , 1-16.

Bullock, R., & Voss, S. (2006). Executive dysfunction in dementia. In K. Rockwood &

S. Gauthier (Eds.), Trial designs and outcomes in dementia therapeutic research (pp.

259-265). Philadelphia, PA: Taylor & Francis.

Burgess, P. W., & Alderman, N. (1990). Rehabilitation of dyscontrol syndromes

following frontal lobe damage: A cognitive neuropsychological approach. In R. L. Wood

& I. Fussey (Eds.), Cognitive rehabilitation in perspective (pp. 183-203). Basingstoke:

Taylor & Francis.

Burgess, P. W., Alderman, N., Evans, J., Emslie, H., & Wilson, B. A. (1998). The

ecological validity of tests of executive function. Journal of the International

Neuropsychological Society, 4, 547-558.

Burgess, P. W., Alderman, N., Forbes, C., Costello, A., Coates, L. M-A., Dawson, D. R.,

et al. (2006). The case for the development and use of “ecologically valid” measures of

executive function in experimental and clinical neuropsychology. Journal of the

International Neuropsychological Society, 12, 194-209.

Burgess, P. W., Alderman, N., Wilson, B. A., Evans, J. J., & Emslie, H. (1996). The

Dysexecutive Questionnaire. In B. A. Wilson, N. Alderman, P. W. Burgess, H. Emslie, &

J. J. Evans, Behavioural assessment of the dysexecutive syndrome. Bury St. Edmunds:

Thames Valley Test Company.

146

Burgess, P. W., & Shallice, T. (1996). Response suppression, initiation and strategy use

following frontal lobe lesions. Neuropsychologia, 34, 263-273.

Burgess, P. W., Veitch, E., de Lacy Costello, A., & Shallice, T. (2000). The cognitive and

neuroanatomical correlates of multitasking. Neuropsychologia, 38, 848-863.

Campbell, Z., Zakzanis, K. K., Jovanovski, D., Joordens, S., Mraz, R., & Graham, S. J.

(2009). Utilizing virtual reality to improve the ecological validity of clinical

neuropsychology: An fMRI case study elucidating the neural basis of planning by

comparing the Tower of London with a three-dimensional navigation task. Applied

Neuropsychology, 16, 295-306.

Carlin, D., Bonerba, J., Phipps, M., Alexander, G., Shapiro, M., & Grafman, J. (2000).

Planning impairments in frontal lobe dementia and frontal lobe lesion patients.

Neuropsychologia, 38, 655-665.

Chaytor, N., & Schmitter-Edgecombe, M. (2003). The ecological validity of

neuropsychological tests: A review of the literature on everyday cognitive skills.

Neuropsychology Review, 13, 181-197.

Cockburn, J. (1995). Performance on the Tower of London after severe head injury.

Journal of the International Neuropsychological Society, 6, 537-544.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2 nd ed.).

Hillsdale, NJ: Lawrence Erlbaum Associates.

147

Culbertson, W. C., & Zillmer, E. A. (2001). Tower of London: Drexel University (TOL DX ).

North Tonawanda, NY: Multi-Health Systems.

Cummings, J. L. (1993). Frontal-subcortical circuits and human behavior. Archives of

Neurology, 50, 873-879.

Damasio, A. R. (1996). The somatic marker hypothesis and the possible functions of the

prefrontal cortex. Philosophical Transactions of the Royal Society of London. Series B,

Biological Sciences, 351, 1413-1420.

Dawson, D. R., Anderson, N. D., Burgess, P., Cooper, E., Krpan, K. M., & Stuss, D. T.

(2009). Further development of the Multiple Errands Test: Standardized scoring,

reliability, and ecological validity for the Baycrest version. Archives of Physical

Medicine and Rehabilitation, 90 (Suppl. 1), 41-51.

Delis, D., Kaplan, E., & Kramer, J. (2001). Delis-Kaplan Executive Function Scale. San

Antonio, TX: The Psychological Corporation.

Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (2000). California Verbal Learning Test-

Second Edition (CVLT-II). San Antonio, TX: The Psychological Corporation.

D’Esposito, M., Detre, J. A., Alsop, D. C., Shin, R. K., Atlas, S., & Grossman, M.

(1995). The neural basis of the central executive system of working memory. Nature,

378, 279-281.

Demakis, G. J. (2004). Frontal lobe damage and tests of executive processing: A meta-

analysis of the Category Test, Stroop Test, and Trail-Making Test. Journal of Clinical

and Experimental Neuropsychology, 26, 441-450.

Derrer, D. S., Howieson, D. B., Mueller, E. A., Camicioli, R. M., Sexton, G., & Kaye, J. A.

(2002). Memory testing in dementia: How much is enough? Journal of Geriatric

Psychiatry and Neurology, 13, 1-6.

Dove, A., Pollmann, S., Schubert, T., Wiggins, C. J., & von Cramon, D. Y. (2000).

Prefrontal cortex activation in task switching: An event-related fMRI study. Cognitive

Brain Research, 9, 103-109.

Duncan, J. (1986). Disorganization of behavior after frontal lobe damage. Cognitive

Neuropsychology, 3, 271-290.

Elderkin-Thompson, V., Kumar, A., Bilker, W. B., Dunkin, J. J., Mintz, J., Moberg, P. J.,

et al. (2003). Neuropsychological deficits among patients with late-onset minor and

major depression. Archives of Clinical Neuropsychology, 18, 529-549.

Elderkin-Thompson, V., Mintz, J., & Haroon, E. (2007). Executive dysfunction and

memory in older patients with major and minor depression. Archives of Clinical

Neuropsychology, 22, 261-270.

Elias, J. W., & Treland, J. E. (1999). Executive function in Parkinson’s disease and

subcortical disorders. Seminars in Clinical Neuropsychiatry, 4, 34-40.

Elkind, J. S., Rubin, E., Rosenthal, S., Skoff, B., & Prather, P. (2001). A simulated reality

scenario compared with the computerized Wisconsin Card Sorting Test: An

analysis of preliminary results. CyberPsychology & Behavior, 4, 489-496.

Eslinger, P. J., & Damasio, A. R. (1985). Severe disturbance of higher cognition after

bilateral frontal lobe ablation: Patient EVR. Neurology, 35, 1731-1741.

Finkelstein, E., Corso, P., & Miller, T. (2006). The Incidence and Economic Burden of

Injuries in the United States. New York: Oxford University Press.

Fong, K. N., Chow, K. K., Chan, B. C., Lam, K. C., Lee, J. C., Li, T. H., et al. (2010).

Usability of a virtual reality environment simulating an automated teller machine for

assessing and training persons with acquired brain injury. Journal of Neuroengineering

and Rehabilitation, 7, 19-27.

Franzen, M. D., & Wilhelm, K. L. (1996). Conceptual foundations of ecological validity in

neuropsychology. In R. J. Sbordone & C. J. Long (Eds.), Ecological validity of

neuropsychological testing (pp. 91-112). Delray Beach, FL: GR Press/St. Lucie Press.

Fuster, J. M. (1997). The prefrontal cortex: Anatomy, physiology and neuropsychology

of the frontal lobe (3rd ed.). Philadelphia: Lippincott-Raven.

Fuster, J. M. (2002). Physiology of executive functions: The perception-action cycle. In

D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 96-108). New

York: Oxford University Press.

Gazzaniga, M. S., Ivry, R. B., & Mangun, G. R. (1998). Cognitive neuroscience: The

biology of the mind. New York: W.W. Norton.

Godefroy, O., Rousseaux, M., Leys, D., Destee, A., Scheltens, P., & Pruvo, J. P. (1992).

Frontal lobe dysfunction in unilateral lenticulostriate infarcts. Prominent role of cortical

lesions. Archives of Neurology, 49, 1285-1289.

Goldstein, G. (1996). Functional considerations in neuropsychology. In R. J. Sbordone &

C. J. Long (Eds.), Ecological validity of neuropsychological testing (pp. 75-89). Delray

Beach, FL: GR Press/St. Lucie Press.

Goldstein, B., Obrzut, J. E., John, C., Ledakis, G. & Armstrong, C. L. (2004). The impact

of frontal and non-frontal brain tumor lesions on Wisconsin Card Sorting Test

performance. Brain & Cognition, 54, 110-116.

Gouveia, P. A., Brucki, S. M., Malheiros, S. M., & Bueno, O. F. (2007). Disorders in

planning and strategy application in frontal lobe lesion patients. Brain and Cognition, 63,

240-246.

Grace, J., & Malloy, P. F. (2001). Frontal Systems Behavior Scale Professional Manual.

Lutz, FL: Psychological Assessment Resources, Inc.

Grafman, J. (2002). The structured event complex and the human prefrontal cortex. In

D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 292-310). New

York: Oxford University Press.

Grant, D. A., & Berg, E. A. (1948). A behavioural analysis of the degree of reinforcement

and ease of shifting to new responses in a Weigl-type card sorting problem. Journal of

Experimental Psychology, 38, 404-411.

Green, J., McDonald, W. M., Vitek, J. L., Evatt, M., Freeman, A., Haber, M., et al. (2002).

Cognitive impairments in advanced PD without dementia. Neurology, 59, 1320-1324.

Heaton, R. K. (1981). A manual for the Wisconsin Card Sorting Test. Odessa, FL: Psychological

Assessment Resources.

Heaton, R. K., Chelune, G. J., Talley, J. L., Kay, G. G., & Curtis, G. (1993). Wisconsin

Card Sorting Test (WCST) manual, revised and expanded. Odessa, FL: Psychological

Assessment Resources.

Heaton, R. K., Miller, S. W., Taylor, M. J., & Grant, I. (2004). Revised comprehensive norms for

an expanded Halstead-Reitan battery: Demographically adjusted neuropsychological

norms for African American and Caucasian adults. Lutz, FL: Psychological Assessment

Resources.

Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis . New York:

Academic Press.

Heinrichs, R. W., & Zakzanis, K. K. (1998). Neurocognitive deficit in schizophrenia: A

quantitative review of the evidence. Neuropsychology, 12, 426-445.

Henry, J. D., & Crawford, J. R. (2004). A meta-analytic review of verbal fluency

performance following focal cortical lesions. Neuropsychology, 18, 284-295.

Henry, J. D., Crawford, J. R., & Phillips, L. H. (2004). Verbal fluency performance in

dementia of the Alzheimer’s type: A meta-analysis. Neuropsychologia, 42, 1212-1222.

Herman, M. (2004). Neurocognitive functioning and quality of life among dually

diagnosed and non-substance abusing schizophrenia inpatients. International Journal of

Mental Health Nursing, 13, 282-291.

Hester, R. L., Kinsella, G. J., Ong, B., & McGregor, J. (2005). Demographic influences on

baseline and derived scores from the Trail Making Test in healthy older Australian adults.

The Clinical Neuropsychologist, 19, 45-54.

Heun, R., Papassotiropoulos, A., & Jessen, F. (1998). The validity of psychometric

instruments for detection of dementia in the elderly general population. International

Journal of Geriatric Psychiatry, 13, 368-380.

Hoerold, D., Dockree, P. M., O’Keeffe, F. M., Bates, H., Pertl, M., & Robertson, I. H.

(2008). Neuropsychology of self-awareness in young adults. Experimental Brain

Research, 186, 509-515.

Houx, P. J., Jolles, J., & Vreeling, F. W. (1993). Stroop interference: Aging effects

assessed with the Stroop Color-Word Test. Experimental Aging Research, 19, 209-224.

Joel, D., & Weiner, I. (1994). The organization of the basal ganglia-thalamocortical

circuits: Open interconnected rather than closed segregated. Neuroscience, 63, 363-379.

Jovanovski, D., Erb, S., & Zakzanis, K. K. (2005). Neurocognitive deficits in cocaine

users: A quantitative review of the evidence. Journal of Clinical and Experimental

Neuropsychology, 27, 189-204.

Kaplan, E. F., Goodglass, H., & Weintraub, S. (1983). The Boston Naming Test. Philadelphia:

Lea & Febiger.

Klinger, E., Chemin, I., Lebreton, S., & Marié, R. M. (2004). A virtual supermarket to

assess cognitive planning. CyberPsychology and Behavior, 7, 292-293.

Klinger, E., Chemin, I., Lebreton, S., & Marié, R. M. (2006). Virtual action planning in

Parkinson’s disease: A control study. CyberPsychology & Behavior, 9, 342-347.

Knight, C., Alderman, N., & Burgess, P. W. (2002). Development of a simplified

version of the multiple errands test for use in hospital settings. Neuropsychological

Rehabilitation, 12, 231-255.

Konishi, S., Nakajima, K., Uchida, I., Kameyama, M., Nakahara, K., Sekihara, K., et al.

(1998). Transient activation of inferior prefrontal cortex during cognitive set shifting.

Nature Neuroscience, 1, 80-84.

Kramer, J. H., Reed, B. R., Mungas, D., Weiner, M. W., & Chui, H. C. (2002). Executive

dysfunction in subcortical ischaemic vascular disease. Journal of Neurology,

Neurosurgery, and Psychiatry, 72, 217-220.

Lacerda, A. L. T., Dalgalarrondo, P., Caetano, D., Haas, G. L., Camargo, E. E., &

Keshavan, M. S. (2003). Neuropsychological performance and regional cerebral blood

flow in obsessive-compulsive disorder. Progress in Neuro-Psychopharmacology &

Biological Psychiatry, 27, 657-665.

Law, A. S., Logie, R. H., & Pearson, D. G. (2006). The impact of secondary tasks on

multitasking in a virtual environment. Acta Psychologica, 122, 27-44.

Lazar, R. M., Festa, J. R., Geller, A. E., Romano, G. M., & Marshall, R. S. (2007).

Multitasking disorder from right temporoparietal stroke. Cognitive and Behavioral

Neurology, 20, 157-162.

Levine, B., Robertson, I. H., Clare, L., Carter, G., Hong, J., Wilson, B. A., et al. (2000).

Rehabilitation of executive functioning: An experimental-clinical validation of Goal

Management Training. Journal of the International Neuropsychological Society, 6, 299-

312.

Levine, B., Stuss, D. T., Milberg, W. P., Alexander, M. P., Schwartz, M., & MacDonald, R.

(1998). The effects of focal and diffuse brain damage on strategy application: Evidence

from focal lesions, traumatic brain injury and normal aging. Journal of the International

Neuropsychological Society, 4, 247-264.

Levine, B., Stuss, D. T., Winocur, G., Binns, M. A., Fahy, L., Mandic, M., et al. (2007).

Cognitive rehabilitation in the elderly: Effects on strategic behaviour in relation to goal

management. Journal of the International Neuropsychological Society, 13, 143-152.

Lezak, M. D. (1995). Neuropsychological assessment (3rd ed.). New York: Oxford

University Press.

Lezak, M. D., Howieson, D. B., & Loring, D. W. (Eds.). (2004). Neuropsychological assessment

(4th ed.). New York: Oxford University Press.

Lo Priore, C., Castelnuovo, G., Liccione, D., & Liccione, D. (2003). Experiences with V-

STORE: Considerations on presence in virtual environments for effective

neuropsychological rehabilitation of executive functions. CyberPsychology and

Behavior, 6, 281-287.

Love, J. M., Greve, K. W., Sherwin, E., & Mathias, C. (2003). Comparability of the standard

WCST and WCST-64 in traumatic brain injury. Applied Neuropsychology, 10, 246-251.

Luria, A. R. (1966). Higher cortical functions in man. New York: Basic Books.

Manchester, D., Priestley, N., & Jackson, H. (2004). The assessment of executive

functions: Coming out of the office. Brain Injury, 18, 1067-1081.

Marcotte, T. D., Scott, J. C., Kamat, R., & Heaton, R. K. (2010). Neuropsychology and the

prediction of everyday functioning. In T. D. Marcotte & I. Grant (Eds.), Neuropsychology

of Everyday Functioning. New York: The Guilford Press.

Marcotte, T. D., Wolfson, T., Rosenthal, T. J., Heaton, R. K., Gonzalez, R., Ellis, R. J., et

al. (2004). A multimodal assessment of driving performance in HIV infection.

Neurology, 63, 1417-1422.

Mateer, C. A. (1999). Executive function disorders: Rehabilitation challenges and

strategies. Seminars in Clinical Neuropsychiatry, 4, 50-59.

Mattson, A. J., & Levin, H. S. (1990). Frontal lobe dysfunction following closed head

injury. Journal of Nervous and Mental Disease, 178, 282-291.

McGeorge, P., Phillips, L. H., Crawford, J. R., Garden, S. E., Della Sala, S., Milne, A. B.,

et al. (2001). Using virtual environments in the assessment of executive dysfunction.

Presence-Teleoperators and Virtual Environments, 10, 375-383.

McPherson, K. M., Kayes, N., & Weatherall, M. (2009). A pilot study of self-regulation

informed goal setting in people with traumatic brain injury. Clinical Rehabilitation, 23,

296-309.

Mendozzi, L., Motta, A., Barbieri, E., Alpini, D., & Pugnetti, L. (1998). The application

of virtual reality to document coping deficits after a stroke: Report of a case.

CyberPsychology & Behavior, 1, 79–91.

Mesulam, M. M. (Ed.). (1985). Principles of behavioral neurology. Philadelphia, PA:

F. A. Davis.

Meyers, J. E., & Meyers, K. R. (1995). Rey Complex Figure Test and Recognition Trial.

Odessa, FL: Psychological Assessment Resources.

Milner, B. (1963). Effects of different brain lesions on card sorting. Archives of

Neurology, 9, 90-100.

Milner, B. (1964). Some effects of frontal lobectomy in man. In J. M. Warren & K. Akert

(Eds.), The frontal granular cortex and behavior . New York: McGraw Hill.

Millis, S. R., Rosenthal, M., Novack, T. A., Sherer, M., Nick, T. G., Kreutzer, J. S., et al.

(2001). Long-term neuropsychological outcome after traumatic brain injury. Journal of

Head Trauma Rehabilitation, 16, 343-355.

Miotto, E. C., & Morris, R. G. (1998). Virtual planning in patients with frontal lobe

lesions. Cortex, 34, 639-657.

Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D.

(2000). The unity and diversity of executive functions and their contributions to complex

“Frontal Lobe” tasks: A latent variable analysis. Cognitive Psychology, 41, 49-100.

Mur, M., Portella, M. J., Martinez-Aran, A., Pifarre, J. & Vieta, E. (2007). Persistent

neuropsychological deficit in euthymic bipolar patients: Executive function as a core

deficit. Journal of Clinical Psychiatry, 68, 1078-1086.

National Center for Injury Prevention and Control. (2010). Traumatic brain injury in the

United States: Emergency department visits, hospitalizations and deaths 2002-2006.

Atlanta, GA: Centers for Disease Control and Prevention.

Nelson, H. E. (1991). National Adult Reading Test (2nd ed.). London: NFER-Nelson Publishing

Co., Ltd.

Norman, D. A., & Shallice, T. (1986). Attention to action: Willed and automatic control

of behaviour. In R. J. Davidson, G. E. Schwartz, & D. Shapiro (Eds.), Consciousness and

self-regulation: Advances in research and theory (1st ed.). New York: Plenum Press.

Norris, G., & Tate, R. L. (2000). The Behavioural Assessment of the Dysexecutive

Syndrome (BADS): Ecological, concurrent and construct validity. Neuropsychological

Rehabilitation, 10, 33-45.

Owen, A. M., Downes, J. J., Sahakian, B. J., Polkey, C. E., & Robbins, T. W. (1990).

Planning and spatial working memory following frontal lobe lesions in man.

Neuropsychologia, 28, 1021-1034.

Owen, A. M., Sahakian, B. J., Semple, J., Polkey, C. E., & Robbins, T. W. (1995).

Dopamine-dependent fronto-striatal planning deficits in early Parkinson’s disease.

Neuropsychology, 9, 126-140.

Pohjasvaara, T., Leskelä, M., & Vataja, R. (2002). Post-stroke depression, executive dysfunction

and functional outcome. European Journal of Neurology, 9, 269-275.

Poole, J. H., Ober, B. A., Shenaut, G. K., & Vinogradov, S. (1999). Independent frontal-system

deficits in schizophrenia: Cognitive, clinical, and adaptive implications. Psychiatry

Research, 85, 161-176.

Pugnetti, L., Mendozzi, L., Attree, E. A., Barbieri, E., Brooks, B. M., Cazzullo, C. L.,

et al. (1998a). Probing memory and executive functions with virtual reality: Past and

present studies. CyberPsychology and Behavior, 1, 151–161.

Pugnetti, L., Mendozzi, L., Brooks, B. M., Attree, E. A., Barbieri, E., Alpini, D., et al.

(1998b). Active versus passive exploration of virtual environments modulates spatial

memory in MS patients: A yoked control study. The Italian Journal of Neurological

Sciences, 19 (Suppl. 6), 424–430.

Pugnetti, L., Mendozzi, L., Motta, A., Cattaneo, A., Barbieri, E., & Brancotti, S. (1995).

Evaluation and retraining of adults’ cognitive impairments: Which role for virtual

reality technology? Computers in Biology and Medicine, 25, 213-227.

Rabin, L. A., Burton, L. A., & Barr, W. B. (2007). Utilization rates of ecologically oriented

instruments among clinical neuropsychologists. The Clinical Neuropsychologist, 21, 727-743.

Rand, D., Basha-Abu Rukan, S., Weiss, P. L., & Katz, N. (2009). Validation of the virtual

MET as an assessment tool for executive functions. Neuropsychological Rehabilitation,

19, 583-602.

Ready, R. E., Stierman, L., & Paulsen, J. S. (2001). Ecological validity of

neuropsychological and personality measures of executive functions. The Clinical

Neuropsychologist, 15, 314-323.

Reitan, R. M. (1955). The relation of the Trail Making Test to organic brain damage. Journal of

Consulting Psychology, 19, 393-394.

Rezai, K., Andreasen, N. C., Alliger, R., Cohen, G., Swayze II, V., & O’Leary, D. S. (1993). The

neuropsychology of the prefrontal cortex. Archives of Neurology, 50, 636-642.

Ricker, J. H., Axelrod, B. N., & Houtler, B. D. (1996). Clinical validation of the oral Trail

Making Test. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 9, 50-53.

Rizzo, A. A., & Buckwalter, J. G. (1997). Virtual reality and cognitive assessment and

rehabilitation: The state of the art. In G. Riva (Ed.), Psycho-neuro-physiological

assessment and rehabilitation in virtual environments: Cognitive, clinical and

human factors in advanced human computer interactions (pp. 123-146). Amsterdam: IOS

Press.

Rizzo, A. A., Buckwalter, J. G., & van der Zaag, C. (2002). Virtual environment

applications in clinical neuropsychology. In K. Stanney (Ed.), The handbook of virtual

environments (pp. 1027-1064). New York: Lawrence Erlbaum Associates, Inc.

Robertson, I. H. (1996). Goal management training: A clinical manual. Cambridge, UK:

PsyConsult.

Robertson, I. H., Ward, T., Ridgeway, V., & Nimmo-Smith, I. (1996). The structure of

normal human attention: The test of everyday attention. Journal of the International

Neuropsychological Society, 2, 525-534.

Robinson, L. J., Kester, D. B., Saykin, A. J., Kaplan, E. F., & Gur, R. C. (1991). Comparison

of two short forms of the Wisconsin Card Sorting Test. Archives of Clinical

Neuropsychology, 6, 27-33.

Rochat, L., Ammann, J., Mayer, E., Annoni, J. M., & Van der Linden, M. (2009).

Executive disorders and perceived socio-emotional changes after traumatic brain injury.

Journal of Neuropsychology, 3, 213-227.

Rosselli, M., & Ardila, A. (1996). Cognitive effects of cocaine and polydrug abuse. Journal of

Clinical and Experimental Neuropsychology, 18, 122-135.

Satz, P. (1993). Brain reserve capacity on symptom onset after brain injury: A

formulation and review of evidence for threshold theory. Neuropsychology, 7, 273-295.

Sazbon, L., & Groswasser, Z. (1991). Time-related sequelae of TBI in patients with prolonged

post-comatose unawareness (PC-U) state. Brain Injury, 5, 3-8.

Sbordone, R. J. (1996). Ecological validity: Some critical issues for the

neuropsychologist. In R. J. Sbordone & C. J. Long (Eds.), Ecological validity of

neuropsychological testing (pp. 15-41). Delray Beach, FL: GR Press/St. Lucie Press.

Schmidt, H., Heimann, B., & Djukic, M. (2006). Neuropsychological sequelae of

bacterial and viral meningitis. Brain, 129, 333-345.

Schretlen, D. J., Testa, S. M., Winicki, J. M., Pearlson, G. D., & Gordon, B. (2008). Frequency

and bases of abnormal performance by healthy adults on neuropsychological testing.

Journal of the International Neuropsychological Society, 14, 436-445.

Schultheis, M. T., & Rizzo, A. A. (2001). The application of virtual reality in

rehabilitation. Rehabilitation Psychology, 46, 296-311.

Schweizer, T. A., Levine, B., Rewilak, D., O’Connor, C., Turner, G., Alexander, M. P.,

et al. (2008). Rehabilitation of executive functioning after focal damage to the

cerebellum. Neurorehabilitation and Neural Repair, 22, 72-77.

Screen Recorder Studio Corp. (1999). Screen Recorder Gold (Version 2.1). [Computer software].

Retrieved from www.capture-screen.com.

Shallice, T. (1982). Specific impairments of planning. Philosophical Transactions of the

Royal Society of London. Series B, Biological Sciences, 298, 199-209.

Shallice, T. (1988). From neuropsychology to mental structure. New York: Cambridge

University Press.

Shallice, T. (1994). Multiple levels of control processes. In C. Umiltà & M. Moscovitch

(Eds.), Attention and performance, XV: Conscious and nonconscious information

processing (pp. 395-420). Cambridge, MA: A Bradford Book/The MIT Press.

Shallice, T., & Burgess, P. (1991). Deficits in strategy application following frontal lobe

damage in man. Brain, 114, 727-741.

Shallice, T., & Burgess, P. (1996). The domain of supervisory processes and temporal

organization of behaviour. Philosophical Transactions of the Royal Society of London.

Series B, Biological Sciences, 351, 1405-1412.

Sherer, M., Nick, T. G., Millis, S. R., & Novack, T. A. (2003). Use of the WCST and the WCST-

64 in the assessment of traumatic brain injury. Journal of Clinical and Experimental

Neuropsychology, 25, 512-520.

Sohlberg, M. M. & Mateer, C. A. (2001). Cognitive rehabilitation: An integrative

neuropsychological approach. New York: The Guilford Press.

Spreen, O., & Benton, A. L. (1977). Neurosensory Center Comprehensive Examination

for Aphasia. Victoria, BC: Neuropsychology Laboratory, University of Victoria.

Spreen, O., & Strauss, E. (1998). A compendium of neuropsychological tests (2nd ed.).

New York: Oxford University Press.

Stewart, W. F., Schwartz, B. S., Simon, D., Bolla, K. I., Todd, A. C., & Links, J. (1999).

Neurobehavioral function and tibial and chelatable lead levels in 543 former organolead

workers. Neurology, 52, 1610-1617.

Strauss, E., Sherman, E. M. S., & Spreen, O. (Eds.). (2006). A compendium of

neuropsychological tests: Administration, norms, and commentary (3rd ed.). New York:

Oxford University Press.

Stroop, J. R. (1935). Studies of interference in serial-verbal reaction. Journal of

Experimental Psychology, 18, 643-662.

Stuss, D. T., & Alexander, M. P. (2007). Is there a dysexecutive syndrome? Philosophical

Transactions of the Royal Society of London. Series B, Biological Sciences, 362, 901-

915.

Stuss, D. T., Alexander, M. P., Hamer, L., Palumbo, C., Dempster, R., Binns, M., et al.

(1998). The effects of focal anterior brain lesions on verbal fluency. Journal of the

International Neuropsychological Society, 4, 265-278.

Stuss, D. T., Benson, D. F., Kaplan, E. F., Weir, W. S., Naeser, M. A., Lieberman, I., et al.

(1983). The involvement of orbitofrontal cerebrum in cognitive tasks.

Neuropsychologia, 21, 235-248.

Stuss, D. T., & Gow, C. A. (1992). “Frontal dysfunction” after traumatic brain injury.

Neuropsychiatry, Neuropsychology, & Behavioral Neurology, 5, 272-282.

Stuss, D. T., & Levine, B. (2002). Adult clinical neuropsychology: Lessons from studies

of the frontal lobes. Annual Review of Psychology, 53, 401-433.

Stuss, D. T., Levine, B., Alexander, M. P., Hong, J., Palumbo, C., Hamer, L., et al. (2000).

Wisconsin Card Sorting Test performance in patients with focal frontal and posterior

brain damage: Effects of lesion location and test structure on separable cognitive

processes. Neuropsychologia, 38, 388-402.

Stuss, D. T., Shallice, T., Alexander, M. P., & Picton, T. W. (1995). A multidisciplinary

approach to anterior attentional functions. Annals of the New York Academy of Sciences,

769, 191-211.

Su, C. Y., Wuang, Y. P., Chang, J. K., Guo, N. W., & Kwan, A. L. (2006). Wisconsin Card

Sorting Test performance after putaminal hemorrhagic stroke. The Kaohsiung Journal of

Medical Sciences, 22, 75-84.

Taylor, S. F., Kornblum, S., Lauber, E. J., Minoshima, S., & Koeppe, R. A. (1997).

Isolation of specific interference processing in the Stroop task: PET activation studies.

NeuroImage, 6, 81–92.

Tekin, S., & Cummings, J. L. (2002). Frontal-subcortical neuronal circuits and clinical

neuropsychiatry: An update. Journal of Psychosomatic Research, 53, 647-654.

Teuber, H. L. (1964). The riddle of frontal lobe function in man. In J. M. Warren & K.

Akert (Eds.), The frontal granular cortex and behavior. New York: McGraw-Hill.

The Psychological Corporation. (1997). WAIS-III and WMS-III Technical Manual. San

Antonio, TX: Harcourt Assessment.

The Psychological Corporation. (2001). Wechsler Test of Adult Reading. San Antonio,

TX: Harcourt Assessment.

Titov, N., & Knight, R. G. (2005). A computer-based procedure for assessing functional

cognitive skills in patients with neurological injuries: The virtual street. Brain

Injury, 19, 315-322.

Tombaugh, T. N. (1996). Test of Memory Malingering (TOMM). North Tonawanda, NY:

Multi-Health Systems.

Tombaugh, T. N. (2004). Trail Making Test A and B: Normative data stratified by age and

education. Archives of Clinical Neuropsychology, 19, 203-214.

Tombaugh, T. N., Kozak, J., & Rees, L. (1999). Normative data stratified by age and education

for two measures of verbal fluency: FAS and animal naming. Archives of Clinical

Neuropsychology, 14, 167-177.

van Hooren, S. A. H., Valentijn, S. A. M., Bosma, H., Ponds, R. W. H. M., van Boxtel, M. P. J.,

Levine, B., et al. (2007). Effect of a structured course involving goal management

training in older adults: A randomised controlled trial. Patient Education and Counseling,

65, 205-213.

van Swieten, J. C., Koudstaal, P. J., Visser, M. C., Schouten, H. J., & van Gijn, J. (1988).

Interobserver agreement for the assessment of handicap in stroke patients. Stroke, 19,

604–607.

Watkins, L. H., Rogers, R. D., Lawrence, A. D., Sahakian, B. J., Rosser, A. E., & Robbins,

T. W. (2000). Impaired planning but intact decision making in early Huntington’s

disease: Implications for specific fronto-striatal pathology. Neuropsychologia, 38, 1112-

1125.

Weber, S. J., & Cook, T. D. (1972). Subject effects in laboratory research: An examination of

subject roles, demand characteristics, and valid inferences. Psychological Bulletin, 77,

273-295.

Weinberger, D. R., Berman, K. F., & Zec, R. F. (1988). Physiological dysfunction of

dorsolateral prefrontal cortex in schizophrenia. I: Regional cerebral blood flow (rCBF)

evidence. Archives of General Psychiatry, 45, 609-615.

Weiss, P. L., Bialik, P., & Kizony, R. (2003). Virtual reality provides leisure time opportunities

for young adults with physical and intellectual disabilities. CyberPsychology & Behavior,

6, 335-342.

Wilson, B. A., Alderman, N., Burgess, P. W., Emslie, H., & Evans, J. J. (1996). Behavioural

Assessment of the Dysexecutive Syndrome. Bury St. Edmunds: Thames Valley Test

Company.

Wilson, B. A., Clare, L., Cockburn, J. M., Baddeley, A. D., Tate, R., & Watson, P. (1999).

The Rivermead Behavioural Memory Test – Extended Version. Bury St. Edmunds:

Thames Valley Test Company.

Wilson, B. A., Cockburn, J., & Baddeley, A. D. (1985). The Rivermead Behavioural

Memory Test Manual. Bury St. Edmunds: Thames Valley Test Company.

Wilson, B. A., Cockburn, J., & Halligan, P. (1987). Behavioural Inattention Test. Bury St.

Edmunds: Thames Valley Test Company.

Winocur, G., Craik, F. I. M., Levine, B., Robertson, I. H., Binns, M. A., Alexander, M.,

et al. (2007). Cognitive rehabilitation in the elderly: Overview and future directions.

Journal of the International Neuropsychological Society, 13, 166-171.

Wood, R. L., & Liossi, C. (2006). The ecological validity of executive tests in a severely

brain injured sample. Archives of Clinical Neuropsychology, 21, 429-437.

Zakzanis, K. K. (2001). Statistics to tell the truth, the whole truth, and nothing but the

truth: Formulae, illustrative numerical examples, and heuristic interpretation of effect size

analyses for neuropsychological researchers. Archives of Clinical Neuropsychology, 16,

653-667.

Zakzanis, K. K., Leach, L., & Kaplan, E. (1999). Neuropsychological Differential

Diagnosis. The Netherlands: Swets & Zeitlinger.

Zakzanis, K. K., Mraz, R., & Graham, S. J. (2005). An fMRI study of the Trail Making

Test. Neuropsychologia, 43, 1878-1886.

Zakzanis, K. K., Quintin, G., Graham, S. J., & Mraz, R. (2009). Age and dementia related

differences in spatial navigation within an immersive virtual environment. Medical

Science Monitor, 15, 140-150.

Zhang, L., Abreu, B. C., Masel, B., Scheibel, R. S., Christiansen, C. H., Huddleston, N., et al.

(2001). Virtual reality in the assessment of selected cognitive function after brain injury.

American Journal of Physical Medicine & Rehabilitation, 80, 597-604.

Zinn, S., Bosworth, H. B., Hoenig, H. M., & Swartzwelder, H. S. (2007). Executive

function deficits in acute stroke. Archives of Physical Medicine and Rehabilitation, 88,

173-180.

Table 2.1. Demographic characteristics of sample (n = 30)

                     Mean     SD      Range
Age (years)          19.40    1.54    18-25
Education (years)    12.70    1.15    12-17
IQ (WTAR SS)         99.93    13.09   75-127

Abbreviations: WTAR, Wechsler Test of Adult Reading; SS, Standard Score

Table 2.2. Neuropsychological test performance of sample (n = 30)

                                       Mean     SD       Range
MCT
  Plan Time (seconds)                  191.27   103.70   32-523
  Plan Score (/6)                      3.80     1.95     1-6
  Budget (/1)                          0.15     0.35     0-1
  Completion Time (seconds)            319.53   124.21   163-767
  Tasks Completed (/8)                 7.73     0.64     5-8
  Insufficient Funds                   0.50     0.57     0-2
  Time Deadlines Met (/2)              1.93     0.25     1-2
  Task Repetitions                     0.13     0.35     0-1
  Total Inefficiencies                 0.83     0.95     0-3
  Total Errors                         1.53     1.17     0-4
Executive Function Tests
  TMT B (seconds)                      62.73    14.15    35-95
  BADS MSE (profile score)             3.57     0.86     1-4
  TOL Total Correct                    2.93     2.53     0-8
Non-Executive Cognitive Tests
  RBMT Immediate Recall (raw score)    6.62     2.23     2.5-11.5
  RBMT Delayed Recall (raw score)      6.17     2.55     2-11
  Star Cancellation                    53.93    0.25     53-54
  TMT A (seconds)                      31.07    10.47    15-62
  JLO                                  24.27    4.08     15-30

Abbreviations: MCT, Multitasking in the City Test; TMT B, Trail Making Test Part B; BADS MSE, Behavioural Assessment of the Dysexecutive Syndrome Modified Six Elements Test; TOL, Tower of London; RBMT, Rivermead Behavioural Memory Test; TMT A, Trail Making Test Part A; JLO, Judgment of Line Orientation Test

Table 2.3. Qualitative errors made by normal participants on the MCT (n = 30)

Error                                                      Number of participants    Total
                                                           who made error            frequency
                                                           (% of sample)
Task Failures
  Incomplete Task                                          6 (20)                    8
  Tried to make purchase before bank
    (Insufficient Funds error)                             13 (43.33)                14
  Purchased an expensive item ($4 pens) over
    the more economical option ($2 pens)                   1 (3.33)                  1
  Did not meet scheduled time deadline                     2 (6.67)                  2
                                                                            Total:   25
Task Repetitions
  Attended doctor's appointment two times                  1 (3.33)                  1
  Purchased the same item more than once                   2 (6.67)                  2
  Withdrew money ($20) from bank more than once            1 (3.33)                  1
                                                                            Total:   4
Inefficiencies
  Entered a 'task' building but did not perform a task     7 (23.33)                 8
  Performed two tasks in two separate visits to
    post office instead of one visit                       8 (26.67)                 8
  Entered a building in which there was no task
    to perform                                             3 (10)                    5
  Notable wandering behaviour (not necessarily
    involving visits to buildings)                         4 (13.33)                 4
                                                                            Total:   25

Abbreviations: MCT, Multitasking in the City Test

Table 2.4. Correlations between MCT measures

MCT Measures       Plan Score   Completion Time   Tasks Completed   Total Errors
Plan Time          -.27         .34               .13               .29
Plan Score                      .01               .01               -.45*
Completion Time                                   -.06              .46*
Tasks Completed                                                     -.17

Abbreviation: MCT, Multitasking in the City Test
*p < .05

Table 2.5. Correlations between the MCT and cognitive tests

                                            MCT Scores
Cognitive Function Tests   Plan Time   Plan Score   Completion Time   Tasks Completed   Total Errors
TMT B                      .30         -.05         .35               -.21              .11
BADS MSE                   .18         .40*         -.13              -.14              -.28
TOL                        .09         -.20         .10               -.31              .14
RBMT Immediate Recall      -.14        .36          -.07              .17               -.07
RBMT Delayed Recall        -.08        .24          -.14              .24               -.05
TMT A                      .04         -.02         .38*              -.18              .16
JLO                        .11         .02          -.13              .45*              -.01

Abbreviations: MCT, Multitasking in the City Test; TMT B, Trail Making Test Part B; BADS MSE, Behavioural Assessment of the Dysexecutive Syndrome Modified Six Elements Test; TOL, Tower of London; RBMT, Rivermead Behavioural Memory Test; TMT A, Trail Making Test Part A; JLO, Judgment of Line Orientation Test
*p < .05

Table 3.1. Description of brain lesion site(s) of each patient

Patient Brain Lesion/Diagnosis

1 Left cavernous malformation in basal ganglia/internal capsule due to hemorrhage

2 Left superior frontal white matter infarct and complete left carotid artery occlusion

3 Left small acute infarct in posterior and superior thalamus in MCA territory; near occlusion of left carotid artery

4 Left internal carotid artery (complete occlusion) and stroke in left temporal cortex

5 Left occipital-temporal CVA

6 Left anterior thalamic infarct with mild periventricular areas of subcortical white matter abnormalities

7 Right MCA territory stroke and basal ganglia infarction

8 Right lateral medullary infarct with right parietal and frontal infarct and right vertebral artery occlusion

9 Right hemisphere, 2 areas of decreased attenuation: (1) subtle, small, located in corona radiata; (2) in close proximity to body of right lateral ventricle

10 Right MCA occlusion of M2 segment within Sylvian fissure

11 Right subarachnoid hemorrhage and aneurysm clipping with small area of mostly decreased densities involving frontal lobe region

12 Huge right hematoma from skull base into the mediastinum in the retropharyngeal space; follow-up CT scan showed resolution of hematoma

13 Bilateral extensive cortical contusion involving frontal and temporal lobes; bilateral small subdural hemorrhage along with scattered subarachnoid hemorrhage within bilateral parietal, subcortical, and temporal sulci; left resolving contusion or possible infarction of parietal subcortical region

Abbreviations: MCA, middle cerebral artery; CVA, cerebrovascular accident; CT, computed tomography

176

Table 3.2. Demographic characteristics of sample (n = 13)

                       Mean     SD      Range
Age (years)            58.38    10.77   40-75
Education (years)      14.54     2.73   10-19
Premorbid IQ (WTAR)   112.42    11.08   89-126
BDI-II                  9.69     5.71   1-19
BAI                     5.62     4.79   0-15

Abbreviations: WTAR, Wechsler Test of Adult Reading; BDI-II, Beck Depression Inventory-II; BAI, Beck Anxiety Inventory

177

Table 3.3. Neuropsychological test performance and FrSBe scores of sample (n = 13)

                                       Mean     SD      Range
MCT
  Plan Time (seconds)                 286.31   124.61   135-490
  Plan Score (/6)                       4.62     1.73   1.5-6.0
  Budget (/1)                           0.08     0.28   0-1
  Completion Time (seconds)           636.00   233.16   243-924
  Tasks Completed (/8)                  6.54     1.76   3-8
  Insufficient Funds                    0.31     0.48   0-1
  Time Deadlines Met (/2)               1.54     0.88   0-2
  Task Repetitions                      0.31     0.85   0-3
  Total Inefficiencies                  1.23     1.42   0-4
  Total Errors                          2.31     1.80   0-5
Executive Function Tests
  TMT B (seconds)                     112.08    63.25   43-254
  BADS MSE                              3.62     0.87   1-4
  COWAT (# of words)                   33.69    11.32   18-59
  Animal Fluency (# of words)          17.08     5.31   6-24
  WCST (% perseverative errors)        24.69    19.78   8-73
Non-Executive Cognitive Tests
  ROCF Delayed Recall                  13.81     6.71   2.5-24.0
  ROCF Recognition                     19.92     2.06   16-23
  CVLT-II Learning (Trial 1-5 words)   39.23     7.78   30-55
  CVLT-II Long-Delay Free Recall        6.85     2.15   3-11
  WMS-III Logical Memory II            20.62     5.80   12-33
  WAIS-III Digit Span                  17.15     3.98   9-23
  WAIS-III Digit Symbol                52.15    11.76   37-77
  WAIS-III Block Design                33.08    15.20   11-63
  TMT A (seconds)                      45.92    18.66   22-88
  JLO                                  22.69     7.38   6-30
FrSBe Scores
  Self-Rating Total (Current)         106.54    16.09   82-128
  Self-Rating ED Scale (Current)       40.54     7.45   29-50
  Family Rating Total (Current)        99.62    25.53   61-147
  Family Rating ED Scale (Current)     41.46    10.77   24-61

Abbreviations: MCT, Multitasking in the City Test; TMT B, Trail Making Test Part B; BADS MSE, Behavioural Assessment of the Dysexecutive Syndrome Modified Six Elements Test; COWAT, Controlled Oral Word Association Test; WCST, Wisconsin Card Sorting Test; ROCF, Rey-Osterrieth Complex Figure; CVLT-II, California Verbal Learning Test-II; WMS-III, Wechsler Memory Scale-III; WAIS-III, Wechsler Adult Intelligence Scale-III; TMT A, Trail Making Test Part A; JLO, Judgment of Line Orientation; FrSBe, Frontal Systems Behavior Scale; ED, Executive Dysfunction

178

Table 3.4. Qualitative comparison of errors made by the patient and normal sample on the MCT
Each entry lists, for the patient (n = 13) and normal (n = 30) samples, the number of participants who made the error (% of sample), followed by the total error frequency.

Task Failures
  Incomplete Task: patients 8 (61.54%), 19; normals 6 (20%), 8
  Tried to make purchase before bank (Insufficient Funds error): patients 4 (30.77%), 4; normals 13 (43.33%), 14
  Purchased an expensive item ($4 pens) over the more economical option ($2 pens): patients 0, 0; normals 1 (3.33%), 1
  Did not meet scheduled time deadline: patients 3 (23.08%), 6; normals 2 (6.67%), 2
  Total: patients 29; normals 25

Task Repetitions
  Attended doctor's appointment two times: patients 1 (7.69%), 1; normals 1 (3.33%), 1
  Purchased the same item more than once: patients 1 (7.69%), 1; normals 2 (6.67%), 2
  Withdrew money ($20) from bank more than once: patients 2 (15.38%), 2; normals 1 (3.33%), 1
  Total: patients 4a; normals 4

Inefficiencies
  Entered a 'task' building but did not perform a task: patients 3 (23.08%), 3; normals 7 (23.33%), 8
  Performed two tasks in two separate visits to post office instead of one visit: patients 2 (15.38%), 2; normals 8 (26.67%), 8
  Entered a building in which there was no task to perform: patients 4 (30.77%), 6; normals 3 (10%), 5
  Notable wandering behaviour (not necessarily involving visits to buildings): patients 5 (38.46%), 5; normals 4 (13.33%), 4
  Total: patients 16; normals 25

Abbreviations: MCT, Multitasking in the City Test
a One participant from the patient sample was responsible for making 3 of the 4 task repetitions, one from each of the 3 categories.

179

Table 3.5. Correlations between MCT measures

MCT Measures       Plan Score   Completion Time   Tasks Completed   Total Errors
Plan Time              .51            .21              -.07              .42
Plan Score                           -.36               .19             -.13
Completion Time                                        -.42              .69**
Tasks Completed                                                         -.28

Abbreviation: MCT, Multitasking in the City Test
**p < .01

180

Table 3.6. Correlations between the MCT and other executive function tests

Executive Function Tests   Plan Time   Plan Score   Completion Time   Tasks Completed   Total Errors
TMT B                        -.09         -.47           .64*             -.29             .39
BADS MSE                      .52          .80**        -.43               .36            -.27
COWAT (FAS)                   .44          .39          -.22              -.01            -.02
Animals                       .53          .49           .04              -.10             .36
WCST (ppe)                    .05         -.53           .84**            -.41             .60*

Abbreviations: MCT, Multitasking in the City Test; TMT B, Trail Making Test Part B; BADS MSE, Behavioural Assessment of the Dysexecutive Syndrome Modified Six Elements Test; COWAT, Controlled Oral Word Association Test; WCST, Wisconsin Card Sorting Test; ppe, percent perseverative errors
*p < .05; **p < .01

181

Table 3.7. Correlations between the MCT and non-executive tests

Cognitive Function Tests        Plan Time   Plan Score   Completion Time   Tasks Completed   Total Errors
ROCF Delayed Recall                .29          .66*         -.41              -.12            -.26
ROCF Recognition                   .59*         .49          -.18               .19             .16
CVLT-II Learning (Trials 1-5)      .35          .55*          .20              -.32            -.04
CVLT-II Delayed Free Recall        .44          .52          -.06              -.20            -.05
CVLT-II Recognition                .14          .57*         -.30               .31            -.01
WMS-III Logical Memory II          .19          .29          -.03              -.22            -.23
WAIS-III Digit Span                .25          .42          -.47               .18            -.49
WAIS-III Digit Symbol             -.32         -.25          -.51               .09            -.29
WAIS-III Block Design              .17          .76**        -.52               .33            -.52
TMT A                             -.22         -.50           .59*             -.14             .21
JLO                               -.01          .57*         -.59*              .38            -.56*

Abbreviations: MCT, Multitasking in the City Test; ROCF, Rey-Osterrieth Complex Figure; CVLT-II, California Verbal Learning Test-II; WMS-III, Wechsler Memory Scale-III; WAIS-III, Wechsler Adult Intelligence Scale-III; TMT A, Trail Making Test Part A; JLO, Judgment of Line Orientation
*p < .05; **p < .01

182

Table 3.8. Correlations between executive tests and FrSBe - family ratings

Executive Function Test   FrSBe Total   FrSBe ED Scale
MCT
  Plan Time                  -.34           -.33
  Plan Score                 -.66*          -.59*
  Completion Time            -.08           -.08
  Tasks Completed             .32            .32
  Total Errors               -.14           -.14
TMT B                         .26            .32
BADS MSE                     -.26           -.23
COWAT (FAS)                  -.29           -.39
Animals                      -.60*          -.68*
WCST (ppe)                    .15            .04

Abbreviations: FrSBe, Frontal Systems Behavior Scale; ED, Executive Dysfunction; MCT, Multitasking in the City Test; TMT B, Trail Making Test Part B; BADS MSE, Behavioural Assessment of the Dysexecutive Syndrome Modified Six Elements Test; COWAT, Controlled Oral Word Association Test; WCST, Wisconsin Card Sorting Test; ppe, percent perseverative errors *p < .05

183

Table 4.1. Neuropsychological test performance and questionnaire results of sample pre- and post-intervention (n = 4)

                                           Pre-Treatment                Post-Treatment
                                           Mean     SD      Range       Mean     SD      Range
MCT
  Plan Time (seconds)                     183.50   72.95   120-287     233.25   75.33   175-340
  Plan Score (/6)                           4.75    2.50   1-6           6       0      N/A
  Budget (/1)                               0.25    0.50   0-1           0       0      0
  Completion Time (seconds)               624.75  199.08   494-920     338.75  127.18   211-455
  Tasks Completed (/8)                      7.25    0.96   6-8           7.75    0.50   7-8
  Insufficient Funds                        1.00    1.15   0-2           0       0      0
  Time Deadlines Met (/2)                   1.75    0.50   1-2           2       0      N/A
  Task Repetitions                          0.50    0.58   0-1           0       0      0
  Total Inefficiencies                      1.00    0.82   0-2           0       0      0
  Total Errors                              2.75    1.50   1-4           0       0      0
Executive Function Tests
  D-KEFS Letter Fluency (# of words)       33.25   21.42   5-57         34.50   17.18   13-55
  D-KEFS Category Fluency (# of words)     31.00   12.19   14-42        29.25   11.24   16-40
  D-KEFS Category Switching (# of words)   11.50    6.46   2-16         12.50    5.80   4-17
Non-Executive Cognitive Tests
  BVMT-R Total Recall                      21.00    4.97   15-27        20.00    7.07   11-27
  BVMT-R Delayed Recall                     6.50    2.89   3-10          9.00    3.16   5-12
  CVLT-II Learning (Trial 1-5 words)       54.25   14.20   36-66        52.25   21.88   24-71
  CVLT-II Long-Delay Free Recall           11.25    4.03   7-16         10.75    5.50   6-16
  BNT (# of words)                         26.00    5.23   19-30        26.50    4.04   23-30
Questionnaire Scores
  DEX-S                                    19.50    4.65   13-24        23.00    1.15   22-24
  DEX-S EF                                 11.50    1.91   9-13         12.25    1.26   11-14
  DEX-I                                    14.50    6.25   8-23         15.50    5.51   9-21
  DEX-I EF                                  9.25    4.35   5-15          8.75    3.59   4-12
  Insight                                   5.00    8.76   -3 to 16      7.50    5.74   3-15

Abbreviations: MCT, Multitasking in the City Test; D-KEFS, Delis-Kaplan Executive Function System; BVMT-R, Brief Visuospatial Memory Test-Revised; CVLT-II, California Verbal Learning Test-II; BNT, Boston Naming Test; DEX-S, Dysexecutive Questionnaire-Self; EF, Executive Function scale; DEX-I, Dysexecutive Questionnaire-Other

184

Table 4.2. Qualitative comparison of errors made by the sample on the MCT pre- and post-intervention (n = 4)
Each entry lists the number of participants who made the error, followed by the total error frequency.

Task Failures
  Incomplete Task: pre-treatment 2, 3; post-treatment 1, 1
  Tried to make purchase before bank (Insufficient Funds error): pre-treatment 2, 4; post-treatment 0, 0
  Did not meet scheduled time deadline: pre-treatment 1, 1; post-treatment 0, 0
  Total: pre-treatment 8; post-treatment 1

Task Repetitions
  Attended doctor's appointment two times: pre-treatment 1, 1; post-treatment 0, 0
  Purchased the same item more than once: pre-treatment 1, 1; post-treatment 0, 0
  Total: pre-treatment 2; post-treatment 0

Inefficiencies
  Entered a 'task' building but did not perform a task: pre-treatment 1, 2; post-treatment 0, 0
  Performed two tasks in two separate visits to post office instead of one visit: pre-treatment 1, 1; post-treatment 0, 0
  Notable wandering behaviour (not necessarily involving visits to buildings): pre-treatment 1, 1; post-treatment 0, 0
  Total: pre-treatment 4; post-treatment 0

Abbreviations: MCT, Multitasking in the City Test

185

Table 4.3. Effect sizes of neuropsychological test and questionnaire scores from pre- to post-intervention (n = 4)

                                           Raw d    Interpretation   Corrected d   Interpretation
MCT
  Plan Time (seconds)                      -0.67    medium           -0.49         medium
  Completion Time (seconds)                 1.71    large             1.25         large
  Tasks Completed (/8)                      0.65    medium            0.48         medium
  Total Errors                              1.89    large             1.37         large
Executive Function Tests
  D-KEFS Letter Fluency (# of words)        0.06    nil               0.05         nil
  D-KEFS Category Fluency (# of words)     -0.15    nil              -0.11         nil
  D-KEFS Category Switching (# of words)    0.16    nil               0.12         nil
Non-Executive Cognitive Tests
  BVMT-R Total Recall                      -0.16    nil              -0.12         nil
  BVMT-R Delayed Recall                     0.83    large             0.60         medium
  CVLT-II Learning (Trial 1-5 words)       -0.11    nil              -0.08         nil
  CVLT-II Long-Delay Free Recall           -0.10    nil              -0.08         nil
  BNT (# of words)                          0.11    nil               0.08         nil
Questionnaire Scores
  DEX-S Total                              -1.03    large            -0.75         medium
  DEX-I Total                              -0.17    nil              -0.12         nil
  DEX-S EF                                 -0.46    medium           -0.34         small
  DEX-I EF                                  0.13    nil               0.09         nil
  Insight                                   0.34    small             0.25         small

Abbreviations: MCT, Multitasking in the City Test; D-KEFS, Delis-Kaplan Executive Function System; BVMT-R, Brief Visuospatial Memory Test-Revised; CVLT-II, California Verbal Learning Test-II; BNT, Boston Naming Test; DEX-S, Dysexecutive Questionnaire-Self; EF, Executive Function scale; DEX-I, Dysexecutive Questionnaire-Other
Notes: As outlined in Cohen (1988), conventional criteria for a "small", "medium", and "large" effect size were used to provide a frame of reference for obtained effect sizes. The raw d computation is equivalent to the formula for Hedges' g. The corrected d statistic is also reported as it does not overestimate the population effect size, especially in the case of a small sample size (Hedges & Olkin, 1985).
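The effect size computation described in the notes above can be sketched as follows. The exact formulas are inferred from the reported values rather than stated in the text, so treat this as an illustration: raw d is taken as the pre/post mean difference over the SD pooled across the two time points, and the corrected d applies the small-sample correction factor 1 - 3/(4(n - 1) - 1), which reproduces the reported raw-to-corrected ratios for n = 4. (For measures where a higher score is better, such as Tasks Completed, the reported signs appear to be oriented so that improvement is positive.)

```python
import math

def raw_d(mean_pre, sd_pre, mean_post, sd_post):
    """Pre/post effect size: mean change divided by the SD pooled
    across the two time points (the Hedges' g form noted in Table 4.3).
    Inferred from the reported values; not the author's stated code."""
    pooled_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    return (mean_pre - mean_post) / pooled_sd

def corrected_d(d, n):
    """Small-sample bias correction (Hedges & Olkin, 1985), here with
    n - 1 degrees of freedom for the paired pre/post design."""
    return d * (1 - 3 / (4 * (n - 1) - 1))

# MCT Completion Time, pre vs. post (means and SDs from Table 4.1, n = 4)
d = raw_d(624.75, 199.08, 338.75, 127.18)
print(round(d, 2))                   # 1.71, matching the raw d in Table 4.3
print(round(corrected_d(d, 4), 2))   # 1.25, matching the corrected d
```

The same two functions reproduce the other MCT rows of Table 4.3 (e.g., Plan Time gives -0.67 and -0.49), which is why the n - 1 correction factor is assumed here rather than the two-independent-groups version.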

186

Table 4.4. Demographic characteristics of normal sample (n = 4)

                    Mean    SD     Range
Age (years)         31.00   1.63   29-33
Education (years)   20.50   5.75   12-24
Gender              2 males / 2 females

187

Table 4.5. MCT performance on two administrations of the task in a normal sample (n = 4)

                             First Administration           Second Administration
                             Mean     SD      Range         Mean     SD     Range
Plan Time (seconds)         165.25   157.81   40-371        86.00   84.66   0-166
Plan Score (/6)               3.75     2.63   0-6            3.00    3.46   0-6
Budget (/1)                   0.25     0.50   0-1            0       0      0
Completion Time (seconds)   321.50    59.58   249-392      217.75    8.77   208-229
Tasks Completed (/8)          8        0      N/A            8       0      N/A
Insufficient Funds            0.50     0.58   0-1            0       0      0
Time Deadlines Met (/2)       2        0      N/A            2       0      N/A
Task Repetitions              0.50     0.58   0-1            0.25    0.50   0-1
Total Inefficiencies          0.25     0.50   0-1            0.25    0.50   0-1
Total Errors                  1        1.15   0-2            0.50    1.00   0-2

Abbreviation: MCT, Multitasking in the City Test
Notes: The interval between the first and second administration of the MCT was 4 weeks.

188

Table 4.6. Effect sizes of MCT scores in patient and normal samples

                            Patient (n = 4)             Normal (n = 4)
                            Raw d   Interpretation      Raw d   Interpretation
Plan Time (seconds)         -0.67   medium               0.63   medium
Plan Score                   0      nil                 -0.24   small
Completion Time (seconds)    1.71   large                2.44   large
Tasks Completed (/8)         0.65   medium               0      nil
Total Errors                 1.89   large                0.46   medium

Abbreviation: MCT, Multitasking in the City Test
Notes: The interval between the first and second administration of the MCT was 8-10 weeks in the patient sample and 4 weeks in the normal sample. Wilcoxon Signed Ranks Test analyses indicated that there were no statistically significant differences in MCT scores when comparing first and second administrations in the normal sample; only the plan time and completion time scores approached significance (both p = .07).

Figure 2.1. Bird's eye view map of the virtual environment for the Multitasking in the City Test

[Map of the virtual city showing the following locations: Bank, Restaurant & Pub, Dry Cleaners, Stationery Store, Coffee Store, Home, Pet Shop, Drug Store, Post Office, Doctor's Office, Grocery Store, and Optometrist. An X marks a position on the map.]

189

190

Figure 2.2. Screen shots of the Multitasking in the City Test

191

Figure 4.1. Ratings on the Dysexecutive Questionnaire at pre- and post-intervention

a. [Bar graph: DEX-I Total Score (0-25) for Patients 1-4, pre- and post-intervention]

192

b. [Bar graph: DEX-I EF Score (0-16) for Patients 1-4, pre- and post-intervention]

193

c. [Bar graph: Insight Index Score (-5 to 20) for Patients 1-4, pre- and post-intervention]

Abbreviations: DEX-I, Dysexecutive Questionnaire-Other; EF, Executive Function scale Notes: Bar graphs a and b depict DEX-I and DEX-I EF ratings, respectively, for each patient pre- and post-intervention and indicate that two ratings increased (indicative of worsened dysexecutive symptoms) while the other two ratings decreased (indicative of improvements in dysexecutive problems). Bar graph c represents changes in the index of insight from pre- to post-intervention by patient. Positive scores represent overestimation of deficits while negative scores reflect underestimation of deficits. As illustrated in graph c, all patients endorsed more deficits than their informants (suggesting good insight) at post-intervention and insight scores improved in 3 of the 4 patients.
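The insight index described in the notes is consistent with a simple self-minus-informant difference score: subtracting the DEX-I (informant) total from the DEX-S (self) total reproduces the sample means in Table 4.1. A minimal sketch, assuming that definition (the exact scoring rule is not stated in this section):

```python
def insight_index(dex_self_total, dex_informant_total):
    """Insight index as a self-minus-informant difference (an assumed
    definition that matches the Table 4.1 means). Positive values mean
    the patient endorses more deficits than the informant
    (overestimation); negative values mean underestimation."""
    return dex_self_total - dex_informant_total

# Sample means from Table 4.1 (n = 4)
print(insight_index(19.50, 14.50))  # pre-intervention insight: 5.0
print(insight_index(23.00, 15.50))  # post-intervention insight: 7.5
```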

194

Figure 4.2. Scatter plots of the relationship between the Dysexecutive Questionnaire-Other Executive Function scale and executive function measures

a. [Scatter plot: DEX-I EF score (0-14) vs. MCT Plan Time, 0-400 s]

195

b. [Scatter plot: DEX-I EF score (0-14) vs. MCT Completion Time, 0-500 s]

196

c. [Scatter plot: DEX-I EF score (0-14) vs. D-KEFS Letter Fluency, 0-60]

197

d. [Scatter plot: DEX-I EF score (0-14) vs. D-KEFS Category Fluency, 0-45]

198

e. [Scatter plot: DEX-I EF score (0-14) vs. D-KEFS Switch Fluency, 0-18]

Abbreviations: DEX-I EF, Dysexecutive Questionnaire-Other Executive Function scale; MCT, Multitasking in the City Test; D-KEFS, Delis-Kaplan Executive Function System
Notes: All graphs represent patient scores obtained post-intervention. These graphs show a positive linear relationship between MCT completion time and DEX-I EF score (perfect rank order correlation using Spearman's rho; graph b). D-KEFS Category and Switching Fluency scores were negatively related to DEX-I EF ratings (i.e., in the expected direction), with the scatter plots and trend lines demonstrating a good linear relationship (Spearman's r = -.80 for both; graphs d and e). The scatter plots for MCT plan time and D-KEFS Letter Fluency demonstrate a poor linear relationship with DEX-I EF, as patients' scores are relatively distant from the trend lines (graphs a and c).

Appendices

Appendix A

Abbreviations List

ABI = Acquired Brain Injury

BADS = Behavioural Assessment of the Dysexecutive Syndrome

BAI = Beck Anxiety Inventory

BDI-II = Beck Depression Inventory-II

BNT = Boston Naming Test

BVMT-R = Brief Visuospatial Memory Test-Revised

COWAT = Controlled Oral Word Association Test

CVLT-II = California Verbal Learning Test-II

DES = Dysexecutive Syndrome

DEX(-S); (-I) = Dysexecutive Questionnaire(-Self); (-Other)

D-KEFS = Delis-Kaplan Executive Function System

ED = Executive Dysfunction

EF = Executive Function

FrSBe = Frontal Systems Behavior Scale

GCS = Glasgow Coma Scale

GMT = Goal Management Training

ICC = Intraclass correlation coefficient

IQ = intelligence quotient

JLO = Judgment of Line Orientation

LM = Logical Memory

MCT = Multitasking in the City Test

MET(-SV) = Multiple Errands Test(-Simplified Version)

199

200

mRS = Modified Rankin Scale

MSE = Modified Six Elements

NART = National Adult Reading Test

PC = personal computer

RBMT(-E) = Rivermead Behavioural Memory Test(-Extended Version)

ROCF = Rey-Osterrieth Complex Figure

SAS = Supervisory Attentional System

SEC = Structured Event Complex

SPSS = Statistical Package for the Social Sciences

SRLT = Simulated Real-Life Task

TBI = Traumatic Brain Injury

TMT = Trail Making Test

TOL = Tower of London

TOMM = Test of Memory Malingering

VE = Virtual Environment

VR = Virtual Reality

WAIS-III = Wechsler Adult Intelligence Scale-III

WCST = Wisconsin Card Sorting Test

WMS-III = Wechsler Memory Scale-III

WTAR = Wechsler Test of Adult Reading

201

Appendix B

Multitasking in the City Test (MCT) Instructions

Learning Session Instructions for the MCT

I want you to imagine that you have recently moved to a new city and that you will spend the next 3 minutes becoming familiar with your new city of residence as well as getting used to using the joystick to navigate through the city. The joystick will allow you to move forward by pushing it forward, to the left by pushing it to the left, etc. During this learning session, you should pay close attention to where various stores and other buildings are located in the city. As you want to focus on getting to know places within the city, you will not be able to do anything other than walk around. For example, you will not be able to make any purchases or withdraw money from the bank.

MCT Instructions

It is now 3:30 p.m. and you are at home. You are planning to spend the next 15 minutes or so running errands within your city of residence. The following errands are on your list of things to do:

1) You must go to various stores to purchase the following items:

• 6 blue pens, which cost various prices

• Cough syrup, which will cost $4.00

• Groceries, which will cost a total of $12.50

2) You must go to the doctor’s office as you have an appointment for 3:45 p.m.

3) You must pick up your mail from the post office.

202

4) At the post office, you have to drop off a letter for which you must buy postage, which will cost 75 cents.

5) You must go to the bank to withdraw money as you only have 67 cents in your wallet. You must try not to go over your budget of $20.00 for all the items you need to buy today.

6) You should be back at home no later than 3:50 p.m.

You should attempt to complete as many of the errands as possible within a period of 15 minutes. It is up to you to decide the best order in which to complete these errands. There will be a clock on the screen at all times to assist you with this. In addition, the list of errands will also be on the screen at all times. To perform a task, simply walk up close to a desired location, such as the cashier, using your joystick; another screen will immediately appear which will prompt you to choose what you would like to do from a list of options. Each option corresponds to a particular number, so you will have to enter the appropriate number using the number keys on the keyboard. When a task is completed, it will appear under the "Tasks Completed" section of the screen, and all of the items you acquire will appear in the "Items in Backpack" section of the screen. You will have a map of the city beside you at all times to help you plan and navigate your route.

203

Appendix C

Computer Skills Questionnaire

Name:______

Date: ______

1. How often do you use desktop or laptop computers?

□ Never used a computer

□ A few times a month or less

□ Once a week

□ Every day or two

□ Several times a day

2. Have you ever used a joystick (i.e., on a computer or video game)?

□ Yes

□ No

□ Don’t know

3. How would you rate your current computer skills?

Very poor Poor Fair Good Very Good

□ □ □ □ □

204

Appendix D

Dysexecutive Questionnaire (DEX) Executive Function (EF) scale

The following items from the Dysexecutive Questionnaire (DEX) were included in the derived

Executive Function (EF) scale (items below are from the DEX-I questionnaire with parallel items used to derive the DEX-S EF scale):

1. Has problems understanding what other people mean unless they keep things simple and straightforward.

4. Has difficulty thinking or planning ahead for the future.

6. Gets events mixed up with each other, and gets confused about the correct order of events.

7. Has difficulty realizing the extent of his/her problems and is unrealistic about the future.

14. Finds it hard to stop repeating, saying, or doing things once he/she has started.

16. Finds it difficult to stop doing something even if he/she knows he/she shouldn’t.

17. Will say one thing, but will do something different.

18. Finds it difficult to keep his/her mind on something, and is easily distracted.

19. Has trouble making decisions or deciding what he/she wants to do.