Measurement of Intelligence in Children and Adolescents with Autism Spectrum Disorder: Factors Affecting Performance

A doctoral dissertation submitted to the

Graduate School

of the University of Cincinnati

in partial fulfillment of the

requirements for the degree of

Doctor of Philosophy

in the Department of Psychology

of the College of Arts and Sciences

by

Katherine T. Baum, M.A.

September 1, 2011

Committee: Paula Shear, Ph.D. (Chair), Somer Bishop, Ph.D., Steven Howe, Ph.D., Sarah Whitton, Ph.D.

Abstract

The assessment of children with autism spectrum disorders (ASD) requires the measurement

of intelligence, because the diagnostic criteria include a judgment about whether social and

communication deficits are greater than would be expected given the general developmental level of

the child. In addition, results of cognitive testing, including IQ scores and the potential discrepancy

between verbal and nonverbal intellectual abilities, impact educational placement, treatment

strategies, research design, and theories of neurodevelopment and cognition in ASD. Despite its

widespread importance, there are fundamental methodological aspects of intellectual assessment in

ASD, such as the intelligence measure selected, that may affect testing results.

The Wechsler Intelligence Scale for Children (WISC) and the Stanford-Binet (SB) are two of

the most commonly used measures to evaluate intelligence in ASD. Full-scale and composite scores

on the WISC and SB have been shown to be highly correlated with each other in several pediatric

populations, including children with mild intellectual disability (ID) or attention-deficit hyperactivity

disorder (ADHD), as well as in typically-developing children. Despite these high correlations,

significant discrepancies between the scores, as many as 20 IQ points, have been reported for an individual. Comparisons between scores on the WISC and SB have not been made in children with

ASD. The present study focused on how the WISC and SB compare on IQ scores, as well as characterization of intelligence and verbal-nonverbal discrepancy. The study also addressed the question of whether diagnostic symptoms, adaptive functioning, or neuropsychological deficits associated with the clinical presentation of ASD differentially affect performance on the WISC and the SB.

Forty children with ASD between the ages of 10 and 16, who were recruited through an autism clinic, completed a test battery (WISC-IV, SB-5, Beery VMI, CELF-Screener, and NEPSY –

Theory of Mind) and parents completed several measures assessing behavior, ASD symptomatology

and adaptive functioning.

Full-scale (FSIQ), verbal, nonverbal, and working memory scores were highly correlated

between the two tests. FSIQ and verbal IQ scores differed significantly between the two tests,

although, on average, by less than 4 IQ points. The majority (72%) obtained higher FSIQ scores on the

SB-5, with 14% obtaining scores on the two tests that were greater than one standard deviation from

one another. Verbal and nonverbal differences between scores were similar with 16% and 18%,

respectively, scoring more than one standard deviation apart. Classification of verbal-nonverbal

discrepancies was consistent for 62-67% of the sample, depending on criteria used. The expectation that domains of cognitive functioning that are disproportionately affected in this population (e.g., language abilities, visual-motor skills, and theory of mind) would be related to performance on intelligence measures was not supported. However, age was associated with FSIQ and nonverbal IQ difference scores on the two tests, with older participants scoring higher on the SB-5. All IQ scores from both the WISC-IV and SB-5 were moderately correlated with adaptive functioning, with the exception of the WISC’s nonverbal score (WISC-PRI).

Overall, the convergent validity of the WISC-IV and SB-5 is good in children and adolescents with ASD. Although the average difference between tests on FSIQ and verbal scores was relatively small, approximately 15% of individuals obtained significantly different scores and classifications on the two tests. Older children and those with greater theory of mind skills tended to score higher on SB-5 full-scale and nonverbal indices relative to the WISC-IV. Further, verbal-nonverbal discrepancy classifications were only moderately consistent. Replication of these findings and comparisons to other diagnostic groups will provide further support for the convergent validity of the two measures, and a more comprehensive assessment of neuropsychological functioning may determine whether these factors differentially impact performance on these two IQ tests.

ACKNOWLEDGEMENTS

I would first like to acknowledge my dissertation committee members, Drs. Paula Shear,

Steven Howe, Somer Bishop, and Sarah Whitton, whose feedback was instrumental in the preparation of this document. My chair and mentor, Dr. Paula Shear, deserves special acknowledgement for her guidance on this document and throughout my training. Her encouragement and support instigated the beginning and completion of this ambitious project. This research would not have been possible without the incredible and direction of Dr. Somer

Bishop who revealed the need for this type of research. Somer, your passion for the world of autism is infectious and I am grateful to have been under your mentorship. A special thanks to Dr. Steve

Howe for his time and expertise on this project, as well as throughout my graduate training. Oh how

I will miss your honest and witty critiques. Additionally, thank you to the ‘Bishop Lab’ of Cincinnati

Children’s Hospital for their support, especially Dr. Amie Duncan and Ms. Leslie Markowitz who spent hours collecting parent data. Thanks to Suzan Sucro for her assistance. This work was supported by Dr. Catherine Lord (NICHD#: RC1MH089721), Dr. Mike Richardson of the

University of Cincinnati’s Department of Psychology, and the Department of Psychology’s Seeman-

Frakes funds. I am particularly grateful to all the families who shared their time, energy, and lives with me through participation in this study. Parents, you are not alone in seeing the beautiful light that your children bring to this world through their strength, minds, and unique characters.

The completion of this dissertation project marks the culmination of many years of work toward my doctorate in psychology at UC. Thank you to those who made my life outside of UC worthwhile: to my grad school friends for walking side-by-side with me through these last few years, to my family, especially my parents, whose love and pride have proven to be an inspiration, and most importantly, to my husband and best friend. Nick, your unyielding encouragement, support, and has kept me sane and made this all possible. You are my rock.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES

CHAPTER

I. INTRODUCTION
    Diagnostic Assessment
    Theory of Intelligence in ASD
    IQ and Other Domains
    Service Eligibility and Planning
    Clinical Use of IQ Scores in ASD
    Use of IQ in ASD Research
    Adaptive Functioning in ASD
    Selection of Instruments to Estimate Intelligence in ASD

II. METHOD
    Participants
    Procedures
    Measures
    Statistical Analyses

III. RESULTS
    Convergent Validity Between WISC-IV and SB-5
    Full Scale IQ
    Verbal Intelligence Scores
    Nonverbal Intelligence Scores
    Working Memory Scores
    Classification of Intellectual Functioning Based on Index Scores Across Measures
    Verbal-Nonverbal IQ Discrepancy Scores on the WISC-IV and SB-5
    Variables Associated With Score Differences Between WISC-IV and SB-5
    Test Order Effects on Group Differences
    IQ and Adaptive Functioning

IV. DISCUSSION
    Classification of Intelligence
    Factors Accounting For IQ Score Differences
    Verbal-Nonverbal Discrepancies
    Test Order Effects
    IQ Scores and Adaptive Functioning
    Sample Characteristics
    Limitations and Future Studies

REFERENCES

LIST OF TABLES

1. DSM-IV-TR Criteria for Autistic Disorder (AD), Asperger’s Disorder (AspD), and PDD-NOS
2. Child Demographic and Diagnostic Data (n = 40)
3. Index Scores by IQ Measure
4. Multi-trait, Multi-method Evaluation of the Wechsler Intelligence Scale for Children, 4th Edition (WISC-IV) and Stanford Binet, 5th Edition (SB-5)
5. Child IQ Data
6. Classification of Cognitive Ability by FSIQ Scores on WISC-IV and SB-5
7. Classification of Intellectual Disability (ID) based on WISC-IV and SB-5 FSIQ scores with WISC-IV as the Standard
8. Classification of Intellectual Disability (ID) based on WISC-IV and SB-5 FSIQ scores with SB-5 as the Standard
9. Classification of Verbal-Nonverbal IQ Score Discrepancies based on Various Criteria for Discrepancy, n = 37
10. Correlations Between IQ Difference Scores and Other Cognitive, Behavioral, and Diagnostic Variables
11. IQ and Factor Scores on WISC-IV and SB-5 by Order of Test Administration
12. Behavior Problem Scores on the WISC-IV and SB-5 by Order of Test Administration
13. Comparisons of Correlations Between Parallel IQ Scores on the WISC-IV and SB-5 based on Order of Test Administration
14. Correlation between the WISC-IV and SB-5 IQ Index Scores and Cognitive and Adaptive Functioning
15. Comparisons of Correlations Between Parallel IQ Scores on the WISC-IV and SB and Vineland Domain scores

LIST OF FIGURES

1. Correlation between Full-Scale IQ (FSIQ) scores on the WISC-IV and SB-5 (n = 36)
2. Correlation between Verbal Comprehension Index (VCI) of the WISC-IV and the Verbal IQ (VIQ) of the SB-5 (n = 37)
3. Correlation between Perceptual Reasoning Index (PRI) of the WISC-IV and the Nonverbal IQ (NVIQ) of the SB-5 (n = 40)
4. Correlation between Working Memory Index (WMI) of the WISC-IV and the Working Memory factor score (WM) of the SB-5
5. Correlation between differences on FSIQ between measures and age
6. Correlation between differences on FSIQ between measures and WISC-IV verbal-nonverbal discrepancy
7. Correlation between nonverbal score differences between measures and age
8. Correlation between nonverbal score differences between measures and NEPSY Theory of Mind z scores
9. Correlation between FSIQ scores on WISC-IV and SB-5 by test order group; n = 37

CHAPTER ONE

Introduction

Autism spectrum disorders (ASDs), including autistic disorder, Asperger’s disorder, and pervasive developmental disorder, not otherwise specified (PDD-NOS), comprise the fastest growing group of developmental disabilities with growth of about 10 - 17% each year (Autism

Society of America, 2006). This group of neurodevelopmental disorders affects approximately 1.5 million people in the United States (Autism Society of America, 2007), and current estimates provided by the Center for Disease Control (2009) indicate that approximately 1 in 110 children are affected. Generally, these disorders are characterized by impairments in social interaction and communication, as well as the presence of repetitive behaviors (American Psychiatric Association,

2000). Measurement of intelligence in children with ASD is integral to both clinical and research work. Diagnostic decision-making, treatment planning, research design, neurodevelopmental theory, and public policy decisions all rely on intelligence test scores. Despite its widespread importance, there are still many unanswered methodological questions related to assessment of intelligence in

ASD. The present study will focus on how two different intelligence measures compare in characterizing the intelligence of children and adolescents with ASD and how they relate to other standard measures of functioning.

Diagnostic Assessment

The assessment of ASD often includes, as a core component, measurement of a child’s overall intellectual functioning. One major purpose of intelligence tests is to determine whether children meet the ASD diagnostic criterion that requires observed social or communication impairments to be beyond what is expected given the child’s cognitive/developmental level

(American Psychiatric Association, 2000). Results of intelligence measures, therefore, inform

diagnostic decision-making (Goldstein et al., 2008).

A second diagnostic purpose of IQ scores is to differentiate between diagnoses on the

autism spectrum. Current diagnostic criteria for ASD from the Diagnostic and Statistical Manual, 4th edition, Text Revision (DSM-IV TR; American Psychiatric Association, 2000) are outlined in Table

1. As indicated in Table 1, ‘absence of delays in […] cognitive development’ differentiates Asperger’s disorder from autistic disorder, making the measurement of intelligence crucial. A third purpose relates to assessing the presence or absence of disorders that may be comorbid with an ASD diagnosis. For example, IQ scores determine classification of intellectual disability (ID; or what is currently referred to as mental retardation in the DSM-IV-TR) and learning disability (American

Psychiatric Association, 2000). It is estimated that approximately 55-70% of individuals with ASD have cognitive impairment in the range of ID (Charman et al., 2011; Fombonne, 2005). One factor that determines this diagnosis is impairment in intelligence, which is defined by a full-scale IQ score of approximately 70 or below. The likelihood that a child who scored 70 or below on one IQ measure would also score in the ID range (≤ 70) on a different measure has yet to be explored.
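
One way to make this open question concrete is to count how often two measures place the same child on the same side of the cutoff. The sketch below is illustrative only: the score pairs are invented for demonstration and are not data from this study, the function names are hypothetical, and the cutoff of 70 simply follows the definition given above.

```python
# Illustrative sketch only: the scores below are invented, not study data.
ID_CUTOFF = 70  # full-scale IQ of approximately 70 or below


def in_id_range(fsiq, cutoff=ID_CUTOFF):
    """Return True when a full-scale IQ falls at or below the cutoff."""
    return fsiq <= cutoff


def classification_agreement(pairs):
    """Count pairs in which both measures give the same ID classification.

    `pairs` is a list of (score_on_measure_a, score_on_measure_b) tuples.
    Returns (number_of_agreeing_pairs, total_pairs).
    """
    agree = sum(in_id_range(a) == in_id_range(b) for a, b in pairs)
    return agree, len(pairs)


def conditional_agreement(pairs):
    """Among children in the ID range on measure A, return the proportion
    who are also in the ID range on measure B (None if there are none)."""
    flagged = [(a, b) for a, b in pairs if in_id_range(a)]
    if not flagged:
        return None
    return sum(in_id_range(b) for _, b in flagged) / len(flagged)


# Hypothetical (measure A, measure B) full-scale scores for six children.
example_pairs = [(68, 72), (70, 66), (85, 90), (65, 64), (71, 69), (102, 108)]

agree, total = classification_agreement(example_pairs)
print(f"Same classification on both measures: {agree} of {total}")
print(f"In ID range on B given ID range on A: {conditional_agreement(example_pairs):.2f}")
```

Note that two children in this invented example change classification with score differences of only a few points, which is the kind of outcome that a high overall correlation between two tests would not reveal.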

In ASD, assessment of intelligence is important along with a consideration of the pattern of

index scores (e.g., potential discrepancies between verbal and nonverbal intellectual abilities). One

learning disability classification, nonverbal learning disability (NLD), is based in part on the

significant discrepancies between intelligence index scores. Scores on intelligence measures yield a

cognitive profile that delineates areas of strength and weakness for an individual child. In NLD,

which is currently classified as Learning Disorder, Not Otherwise Specified (DSM-IV TR; American

Psychiatric Association, 2000), children have strengths in verbal abilities relative to nonverbal

abilities. This profile is common in children with ASD (Mayes & Calhoun, 2003a; Munson et al.,

2008); thus, there is continued research on the use of cognitive profiles to distinguish NLD from ASD, specifically milder forms such as Asperger’s disorder (Semrud-Clikeman, Walkowiak, Wilkinson, & Christopher, 2010). Semrud-Clikeman and colleagues (2010) revealed that the NLD group showed greater impairments on tasks of visual-spatial, visual-motor, and fluid reasoning abilities compared to children with Asperger’s disorder and attention-deficit hyperactivity disorder (ADHD). The cognitive profile of the NLD group, characterized by significant verbal-performance IQ split on the Wechsler Intelligence Scale for Children, 3rd edition (WISC-III; Wechsler, 1991), was related to difficulty in social functioning, which was not true for the other groups in the study.

Table 1

DSM-IV-TR Criteria for Autistic Disorder (AD), Asperger’s Disorder (AspD) and PDD-NOS

Diagnostic Symptoms                                                       AD   AspD   PDD-NOS
i) Qualitative impairment in social interaction                           X    X      *
ii) Restricted repetitive and stereotyped patterns of behavior,           X    X      *
    interests and activities
iii) Qualitative impairments in communication; delay in development of    X           *
     language, social communication, or symbolic or imaginative play
iv) Absence of delays in language development, cognitive development,          X
    age-appropriate self-help skills, or curiosity about the environment
v) Criteria not met for another specific pervasive developmental or       X    X
   psychiatric disorder (although rule out of AspD is not required for
   autism)

*severe and pervasive impairment in these areas with either late age at onset, atypical symptomatology, or subthreshold symptomatology

Theory of Intelligence in ASD

Intelligence scores are used and interpreted in a variety of capacities for individuals with

ASD. First, results of IQ assessment have an impact on theories of neurodevelopment and cognition in ASD. One example is that specific cognitive profiles may represent etiologically significant subtypes of autism (Joseph, Tager-Flusberg, & Lord, 2002; Munson, et al., 2008). Groups

of both young and school-age children with ASD have been classified based on cognitive profiles

that incorporate level of cognitive functioning and/or discrepancy between verbal and nonverbal

abilities. Joseph and colleagues (2002) reported that the profile characterized by greater nonverbal

abilities than verbal abilities was associated with increased social impairment, possibly reflecting a

neurobehavioral marker for an etiologically distinct subtype of autism. Subsequent research revealed

that children with higher nonverbal abilities had a greater head circumference than those with

greater verbal abilities or no significant cognitive differences (Deutsch & Joseph, 2003) and another

study identified specific chromosome chains related to IQ-discrepancy scores (Chapman et al.,

2011). Similar to Joseph’s work, Munson and colleagues (2008) reported that latent IQ group

classification, which was assigned based on cognitive discrepancies and IQ level, was significantly

associated with social functioning and adaptive behavior. These conclusions that draw upon

cognitive profiles have major implications for our understanding of ASD and thus should be based

on assessments that result in similar patterns of cognitive skills.

A number of theoretical frameworks postulate that core neuropsychological impairments

account for the defining behavioral features of autism (Joseph, 1999). One theory is based on

Anderson’s (1992) model of intelligence and focuses on how impaired processing abilities account

for the pattern of deficits seen in ASD. The executive functioning (EF) hypothesis, which was first

explicitly evaluated by Rumsey (1985), posits that the deficits observed in ASD, including in theory

of mind, are manifestations of primary deficits in executive control over behavior (Joseph, 1999).

One goal of the present study is to consider processing speed and EF deficits in their relation to

how intelligence is measured in this population.

IQ and Other Domains

It is not well understood how performance on intelligence measures in this population

relates to other domains of cognitive functioning. However, there is some support for the impaired processing and EF deficit models in ASD. Research by Scheuffgen and colleagues (2000) built upon

Anderson’s model of intelligence that seeks to explain discrepant cognitive abilities in various developmental disorders, including autism (Anderson, 1992). This model outlines two pathways to knowledge acquisition: 1) through central processes that are highly related to a basic processing mechanism or speed of processing, and 2) through modular processes, such as phonological encoding and theory of mind (Anderson, 1992, 2008). One study compared three subgroups of children: children with moderate ID, children with ASD, and typically-developing children with high

IQ. They performed an inspection time task, which is thought to be the single best indicator of speed of processing. Children with autism showed information processing capacities that were comparable to typically-developing children despite below average IQs, which indicated that a specific cognitive or modular processing deficit is present in ASD and responsible for deficits in knowledge independent of basic processing speed (Scheuffgen, Happe, Anderson, & Frith, 2000).

More recent research failed to replicate these findings and instead revealed that children with ASD had similar inspection times to age- and IQ-matched controls (Wallace, Anderson, & Happe, 2009).

However, this study did reveal that slower basic processing was associated with low IQ in the matched group but not the ASD group. Processing speed appears to be important in understanding intelligence in ASD and it may be necessary that our intelligence assessments capture this cognitive ability. However, given that it is not related to overall cognitive abilities in ASD as it is in typically-

developing children, the incorporation of this ability into the full-scale or overall intelligence score should be considered. The WISC-IV (Wechsler, 2003) has a processing speed index, while other

measures, such as the Stanford Binet, fifth edition (SB-5; Roid, 2003) do not. Evaluating differences

between these measures, which quantify different cognitive abilities, may shed light on models of

intelligence in this population.

The EF hypothesis is supported by research that has identified significant impairments in

executive functions such as planning, mental flexibility, and inhibition in children with autism (Hill,

2004). Other executive abilities, such as working memory, shifting of mental set, and theory of mind

have also been found to be impaired in individuals with ASD (for a review, see Joseph, 1999). In

their model of intelligence, Anderson and colleagues (Anderson, 2008) propose that the primary

cognitive deficit that accounts for impaired performance on IQ tests in individuals with ASD relates

to impaired executive abilities, specifically impaired theory of mind. Research has identified

associations between performance on specific EF tasks and IQ scores and evaluated how this

relationship relates to behavioral aspects of the disorder (Lopez, Lincoln, Ozonoff, & Lai, 2005).

Although the research by Lopez and colleagues (2005) demonstrated a strong association between

performance on EF tasks and IQ scores, it has yet to be determined if EF deficits differentially

impact performance on various IQ tests. Aspects of theory of mind, such as perspective-taking and

understanding others’ mental states or beliefs, may also impact an individual’s performance

differentially across varying IQ tasks such as the WISC-IV and SB-5.

Scores on the WISC-IV and SB-5 may also differ because of the demands on other cognitive

domains that are typically impaired in children with ASD, such as language (Baron-Cohen, 2005;

Joseph & Tager-Flusberg, 2004; Liss, Fein, et al., 2001). Understanding how neuropsychological

variables impact performance on common IQ tests would guide our interpretation of IQ results in

children with specific deficits. For example, an IQ score of 65 in an individual with significant

language impairments may be reflective of the inability to provide verbal responses rather than a true

intellectual impairment. Pragmatic language, or the use of language in social communication, is

inherent to the testing process; thus, deficits in this area, which tend to be universal in ASD (Tager-

Flusberg, Paul, & Lord, 2005), may impair a child’s ability to understand what is required during test

administration. In addition to evaluating whether neuropsychological factors impact performance on

IQ tests, scores on neuropsychological measures may also be used to evaluate the discriminant

validity between different IQ tests, since they are designed to assess constructs other than IQ.

Individuals with ASD commonly perform poorly on tests that have high visual-motor

demands, which may differentially impact performance on the WISC-IV versus the SB-5 (Mayes &

Calhoun, 2004; Siegel, Minshew, & Goldstein, 1996). In a study of visual-motor ability, adolescents

with ASD had poorer handwriting quality than typically-developing adolescents. Additionally, it was

concluded that nonverbal abilities, as measured by the Wechsler scales, predicted performance on this visual-motor task in ASD but not in the typically-developing group (Fuentes, Mostofsky, &

Bastian, 2010). Another study revealed that visual-motor impairments were moderated by IQ, where high-functioning children with ASD showed visual-motor scores that were significantly lower than expected given IQ, while the visual-motor abilities of low-functioning children were consistent with their IQ level (Mayes & Calhoun, 2003b); however, these results were based on a sample where some received the WISC-IV and some received the SB-5. The relationship between visual-motor skills and performance on these two measures has not yet been compared directly.

Approximately one-third of children with ASD have mood or aggressive disorders (e.g.,

Ming, Brimacombe, Chaaban, Zimmerman-Bier, & Wagner, 2008). This emotional instability may

impact performance, as one IQ test may potentially prove more frustrating than the other. Severity of autism, based on social and communication impairments and the presence of repetitive behaviors, may also affect a child’s ability to complete tasks. To illustrate, imagine a 12-year-old boy who is

preoccupied with the alignment and placement of items. At home, he lines up his toys, has a tantrum when the movies or books on his shelf are not perfectly aligned, and has to have the food on his

plate in specific locations. During a testing situation, this child may obtain a very low score on a

block design task due to his lengthy completion time and not because of poor visual-spatial

processing. The potential differential impact of symptoms and behavior across IQ tests must,

therefore, be examined.

Service Eligibility and Planning

In addition to its diagnostic importance, IQ testing in those with autism is also critical to

their eligibility for services through government, community, educational, and forensic settings. For

example, based on the recent revision to the Individuals with Disabilities Education Act (IDEA), any child with a diagnosis of autism is eligible for services within the public school system

("Individuals with Disabilities Education Act. Building the legacy: IDEA 2004," 2004). However, research reveals that when determining educational placement and services, there is greater emphasis placed on a child’s cognitive abilities than on other areas of functioning, such as degree of social deficit (White, Scahill, Klin, Koenig, & Volkmar, 2007). Thus, children on the autism spectrum whose IQ scores are at or below 70, indicating ID, may be afforded additional services within the school setting. In contrast to special education programs, which are federally mandated, funding and regulation of gifted education programs vary from state to state. In some states, like Ohio, children are required to score two standard deviations above the mean (e.g., IQ > 130) on either group- or individually-administered tests of intelligence or achievement (Ohio Department of Education,

2001), placing a strong emphasis on the results of cognitive evaluations.

Benefits through government agencies also rely on IQ scores. For example, to qualify for

Social Security Disability Insurance, an individual with a medical diagnosis of autism must

demonstrate impairments in areas such as motor development, communication, social or personal

functioning, or maintaining concentration and persistence to complete tasks. Cognitive impairment

is another key area, where an IQ score less than or equal to 70 meets one of the eligibility criteria

(Autism and Social Security Disability, 2011). Significant concerns have been raised regarding the use

of a cutoff score to determine services, due in part to changes in normative samples over time and

the reliability of the IQ score (Kanaya & Ceci, 2007). As a result of these concerns, the American

Association of Mental Retardation (now called the American Association of Intellectual and

Developmental Disabilities) recommends that providers use an interval of IQ scores, 70-75

(American Association of Mental Retardation, 2002), to classify ID. However, the use of this cutoff range has not yet been implemented consistently in practice (Kanaya & Ceci, 2007).
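
To illustrate why the choice between a single cutoff and an interval matters, the sketch below classifies the same scores both ways. It is a simplified illustration under stated assumptions: the function names and the three category labels are hypothetical, the example scores are invented, and treating 70-75 as a separate band is one possible reading of the recommended interval rather than an official scoring rule.

```python
# Illustrative sketch only: labels, names, and scores are assumptions.

def classify_with_cutoff(fsiq, cutoff=70):
    """Classify a full-scale IQ using a single fixed cutoff."""
    return "ID range" if fsiq <= cutoff else "above ID range"


def classify_with_interval(fsiq, lower=70, upper=75):
    """Classify a full-scale IQ using the 70-75 interval.

    Scores above `lower` but at or below `upper` are flagged separately,
    reflecting the idea that they may still be consistent with ID once
    measurement error is taken into account (an assumed interpretation).
    """
    if fsiq <= lower:
        return "ID range"
    if fsiq <= upper:
        return "within 70-75 interval"
    return "above ID range"


# Hypothetical full-scale scores for four children.
for score in (66, 70, 73, 80):
    print(score, "|", classify_with_cutoff(score), "|", classify_with_interval(score))
```

In this invented example, a score of 73 falls outside the ID range under the single cutoff but inside the interval, so the two conventions would lead to different eligibility decisions for the same child.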

Clinical Use of IQ Scores in ASD

Clinically, IQ scores are used to guide treatment decisions and to speculate about a child’s

prognosis. Overall intelligence has been identified as a significant predictor of outcome in the ASD population (e.g., Gillberg & Steffenburg, 1987), as well as in groups of typically-developing children and those with a variety of developmental disorders. In ASD, an IQ above 70 is associated with a greater likelihood of positive functional outcomes in adulthood, including employment, higher education, and greater likelihood of independent living (Howlin, Goode, Hutton, & Rutter, 2004).

Individuals with IQs in this range are less restricted with regard to their vocational options and tend to have a greater number of friendships (Howlin, et al., 2004). Setting realistic goals is essential for children and their families and requires an understanding of the child’s developmental level, which is based on IQ scores.

Results of IQ assessment, such as the discrepancy between verbal and nonverbal abilities, also guide treatment recommendations, including educational and vocational placement decisions. A

clinician may provide different recommendations for a child with significantly greater nonverbal

than verbal abilities than for one with equivalent verbal and nonverbal cognitive skills. For example,

the clinician may recommend vocational options that capitalize on an individual’s cognitive strengths or that focus on visual learning activities and on part-whole relationships (Sattler, 2001, p. 856).

Intelligence tests are used for other clinical purposes, such as to evaluate the efficacy of

various therapies and interventions (e.g., Dawson et al., 2010; Howlin, Magiati, & Charman, 2009;

Thompson, Thompson, & Reid, 2010). IQ scores have been used as evidence that specific

behavioral interventions improve a child’s cognitive functioning (e.g., Dawson, et al., 2010),

although concerns have been raised about the use of these scores (Magiati & Howlin, 2001). The

present healthcare climate promotes evidence-based interventions across all age and diagnostic

groups, demanding methodologically-sound research tools that reliably assess functioning.

Therefore, any potential differences across IQ measures in terms of the full-scale, verbal, or

nonverbal IQs that they generate or the relative discrepancies between verbal and nonverbal abilities

that they identify have important clinical and research implications in terms of diagnosis and

treatment planning.

Use of IQ in ASD Research

In autism research, IQ scores serve many different functions. They are vital in identifying

high-functioning individuals that are the focus of many cognitive studies and serve as a matching

variable for the selection of comparison groups (Mottron, 2004). They are also often included in

statistical models to predict outcomes (Szatmari, Bryson, Boyle, Streiner, & Duku, 2003). However,

there is little consistency with regard to the test selected. Large-scale studies involving children

across a wide range of ages and developmental levels are unable to use the same test for all children due to age restrictions and floor effects. Further, these studies often involve multiple sites that may not use the same intelligence measure. To draw meaningful conclusions from the data, convergent validity between measures must be established in the ASD population or a single measure must be identified as the standard.

Some research studies rely on scores from multiple measures of intelligence, with individuals in the sample receiving different IQ tests based on convenience, their age, or their cognitive level

(e.g., Chapman, et al., 2011; Charman, et al., 2011; Howlin, 2003; Klin et al., 2007; Mayes &

Calhoun, 2003a; Szatmari, et al., 2003). For example, in one study of 116 children with autism, 45 children ages 3 to 6, and eight children who were unable to obtain a basal on the WISC-IV, were administered the SB-5 (Roid, 2003), while the remaining 63 children, ages six and older, were administered the WISC-IV (Mayes & Calhoun, 2003b). It was concluded that younger and lower functioning children had relative nonverbal strengths and children with high IQ scores had a weakness in graphomotor skills (Mayes & Calhoun, 2003b). In another study, Klin and colleagues

(2007) evaluated children with ASD ages 7 to 18 years across two sites and concluded that verbal

IQ, and to a lesser extent, performance IQ, predicted the communication aspect of adaptive functioning. However, one site involved in the study consistently used the Wechsler scales while the other site evaluated IQ with both the Wechsler scales and the Differential Abilities Scale (Elliott,

1990). Volkmar, Szatmari, and Sparrow (1993) also utilized multiple measures of intelligence, including the WISC and SB, to make determinations about how IQ impacts sex differences in ASD.

In these studies, the scores of these measures are treated the same and often compared in the analyses; however, it is not understood how various measures compare across composite scores.

Beyond test selection, the score used within IQ results (e.g., FSIQ, verbal IQ, nonverbal IQ) varies across studies. Some studies match participants using nonverbal IQ (e.g., Howlin, 2003), while others select their sample based on verbal IQ (e.g., Klin, et al., 2007) or both verbal and FSIQ (e.g., Goldstein, Minshew, Allen, & Seaton, 2002). Given that children with ASD often demonstrate


large discrepancies between verbal and nonverbal IQ scores (Joseph, et al., 2002), a single composite

score that combines verbal and nonverbal scores, like FSIQ, may not be representative of their best abilities. Nonverbal IQ scores tend to be the most stable over time in children with ASD (Begovac,

Begovac, Majic, & Vidovic, 2009; Howlin, et al., 2004), lending support to their use in research; however, in the general population they have the lowest correlations with corresponding index scores on

different tests, relative to full-scale and verbal IQ scores (Roid, 2003). Nonverbal intelligence scores have been identified as being the most significant predictors of long-term outcome in children with

ASD (Howlin, et al., 2004; Stevens et al., 2000). Howlin and colleagues (2004), in one of the most rigorous follow-up studies of children with ASD into adulthood, concluded that children with nonverbal IQ scores of 70 or greater had significantly better social, employment, and living outcomes than individuals with IQ scores below 70. Given the prognostic utility of these scores, it is important to determine their consistency across measures. It is important to identify a single construct (i.e., full-scale, verbal, or nonverbal IQ) that holds constant across IQ measures so that we can draw accurate conclusions about differences between groups or individuals and make informed decisions about which scores to use in research.

Adaptive Functioning in ASD

In addition to IQ, adaptive skills are a common focus of outcome studies, as these abilities determine how well the individual functions in everyday life in terms of functional communication skills, getting along with people, self-help and life skills, and independence. In typically-developing populations, overall IQ strongly predicts adaptive behavior, but this is not always the case in populations of children with ASD. High-functioning individuals with ASD show significantly greater adaptive functioning deficits than would be predicted by their intellectual abilities, and have significantly poorer adaptive skills than IQ-matched controls (Bölte & Poustka, 2002; Kenworthy,


Case, Harms, Martin, & Wallace, 2010). However, this is not true for individuals with ASD with ID, as IQ is often predictive of adaptive outcomes in this group (Liss, Harel, et al., 2001). This differential predictive utility of IQ may be due to floor effects of the IQ and adaptive behavior measures, where low-functioning individuals obtain the lowest possible score. Thus, different IQ measures, which vary in their lowest possible score, may differentially impact this result.

Adaptive functioning scores are also important in this population because they are a required component of the ID diagnosis, which is a common comorbidity in ASD (Fombonne, 2005).

Impairments in both intelligence and adaptive skills are necessary for classification of ID, but it is unclear whether the results of various IQ measures differentially characterize individuals.

Additionally, assessing the way in which two commonly used IQ measures may differ in their relation to adaptive skills will be a valuable means of assessing discriminant validity of the measures.

Evaluating the relationship between IQ and adaptive behavior may also aid in understanding which aspects of cognition are most important in predicting outcomes related to everyday living. Knowing whether scores on one test are more predictive of adaptive functioning than those on another may guide clinicians’ decisions about test selection and prognosis.

Selection of Instruments to Estimate Intelligence in ASD

Several different intelligence measures are used in clinical and research evaluations of children with ASD. The measure chosen is often dependent on the child’s level of functioning, age, language abilities, or preference of the clinician. In verbal individuals with a level of functioning estimated to be above the moderately impaired range of ID, the Wechsler Intelligence Scales are the most commonly administered, followed by the Stanford-Binet (Roid, 2003; Thorndike, Hagen, &

Sattler, 1986), and it is these two measures that will be the focus of the present study. The Wechsler

Intelligence Scale for Children, 4th edition, (WISC-IV; Wechsler, 2003), is a measure of general


intellectual functioning that generates an FSIQ and four index scores: the Verbal Comprehension

(VCI), Perceptual Reasoning (PRI), Working Memory (WMI), and Processing Speed (PSI) indices.

The WISC-IV was standardized on a group of 2,200 children ages 6 years 0 months to 16 years 11 months, who were representative of the population of children in the United States with regard to race/ethnicity, education, and geographic region (Wechsler, 2003). Although a small proportion

(about 5.7%) of children from the special diagnostic group studies were also incorporated into the normative sample, the majority were typically-developing, which raises the question as to whether the test performs similarly in subgroups of children with developmental disabilities (Wechsler, 2003).

Special group studies of children with autistic disorder (n = 19) and Asperger’s disorder (n = 27)

revealed some consistencies in performance on the test (e.g., Block Design and Arithmetic subtests

comparable to controls despite lower scores on other subtests; low PSI scores); however, it still

remains unclear whether the results from the WISC-IV can be interpreted in the same way they are

for typically-developing children. It is not known whether a low IQ score for a child with ASD can

be interpreted in exactly the same way as a low score for a child without ASD.

The Stanford-Binet (SB) is another measure of overall cognitive functioning that can be

administered to individuals 2 to 85+ years of age. The SB results in an FSIQ as well as composites

of verbal (VIQ) and nonverbal (NVIQ) abilities. Five factor scores, Fluid Reasoning (FR),

Knowledge (KN), Quantitative Reasoning (QR), Visual-Spatial Processing (VS) and Working

Memory (WM), as well as an Abbreviated IQ (ABIQ) score are also computed (Roid, 2003). During

the validation study of the SB-5, 83 children with autism (aged 2–17 years), were administered the

test, but unlike the WISC-IV, they were not included in the normative sample. There was little

information about their cognitive profile, making it difficult to determine if and to what degree their

results differed from the normative sample.


In addition to the work in the validation samples, cognitive profiles based on both the

Wechsler scales and the SB have been described in children with ASD. Studies using these measures

reveal that the most common cognitive profile in individuals with ASD is characterized by equal verbal and nonverbal abilities (Charman, et al., 2011; Coolican, Bryson, & Zwaigenbaum, 2008;

Siegel, et al., 1996). However, profiles with significantly higher nonverbal intelligence scores are more prevalent than those with significantly higher verbal intelligence scores on both measures

(Coolican, et al., 2008; Lincoln, Courchesne, Allen, Hanson, & Ene, 1998), although this is not reported in all studies (Mayes & Calhoun, 2004). To better understand the differences found between studies, several factors, including age, language ability, and diagnosis, have been evaluated

for their relationship with these cognitive profiles, and little consistency has been revealed. For

example, one study supports the profile characterized by nonverbal abilities greater than verbal

abilities across ASD diagnostic subgroups (Coolican, et al., 2008) while others do not (Klin,

Volkmar, Sparrow, Cicchetti, & Rourke, 1995). To evaluate the effect of age and language abilities

on cognitive profiles, one study used both SB and WISC to evaluate intelligence, although they did

not compare the two. Rather, children functioning at the two to five-year-old level were evaluated

with the SB and children functioning older than 6 years were administered the WISC (Mayes &

Calhoun, 2003a). This study reveals significant effects of age, IQ level, and early language

development on the verbal/nonverbal discrepancy (Mayes & Calhoun, 2003a). More specifically,

nonverbal IQ is significantly higher than verbal IQ during the preschool years in both low (IQ < 80)

and high (IQ > 80) IQ groups; however, by six years of age, this discrepancy is not significant in

either IQ group. Other studies show that individual child factors (i.e., age, language delay) do not

relate to intellectual profiles (Coolican, et al., 2008; de Bruin, Verheij, & Ferdinand, 2006). Given

that the conclusions of many studies are based on results from two different measures, it is


important to assess how the scores on the measures relate to one another, as well as to factors such as age, language, and diagnosis.

There has been some debate about which tools best measure intellectual ability in children

with ASD. Some investigators state that the WISC may be preferable because it provides valid

measures across a number of constructs and yields profiles that can be readily translated into intervention objectives (Klin, Saulnier, Tsatsanis, & Volkmar, 2005). The WISC is also the most

widely used measure for matching IQ levels in scientific settings (Mottron, 2004). Despite this support for the widespread use of the WISC in general pediatric evaluations, the Autism Treatment

Network, which is the nation's first network of hospitals and physicians dedicated to developing a model of comprehensive medical care for individuals with autism, requires the Stanford-Binet, 5th edition (Roid, 2003) for assessment of cognitive functioning (Autism Speaks, 2010). This choice is likely due to the expansive age range to which this measure can be administered, as well as the assessment of cognitive abilities at lower levels of functioning.

Other considerations with regard to test selection relate to the possible scores that can be derived from the two measures. The floor for a psychological assessment tool refers to the lowest possible standard score an individual can obtain. This floor differs between index/factor scores of the WISC-IV and SB-5, which are 45 and 40, respectively (Roid, 2003; Wechsler, 2003). Full-scale

IQ scores on both measures go as low as 40. However, raw scores in the extremely low range may fall below the cutoff for a standardized score on either test, forcing clinicians to focus on other results such as age equivalents or ratio IQs [i.e., mental age divided by chronological age (Bishop, et al., in press)]. Although there is only a five IQ point difference between the floors of the SB-5 and

WISC-IV index scores, the SB-5 contains items that more accurately assess functioning at the lower level, as it provides age-equivalents as low as two years (Roid, 2003) compared to the WISC-IV, which only goes as low as six years (Wechsler, 2003). From the existing research, conclusions cannot

yet be drawn about what these differences mean for children with ASD, especially those who might be cognitively functioning at or below a six-year-old level.
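The ratio IQ mentioned above (mental age divided by chronological age) can be illustrated with a minimal sketch. The multiplication by 100 follows the conventional definition of a ratio IQ and is an assumption here, since the text states only the division. The function name `ratio_iq` and the example ages are hypothetical and are not taken from either test manual.

```python
def ratio_iq(mental_age_months: float, chronological_age_months: float) -> float:
    """Ratio IQ: mental age divided by chronological age, scaled by 100.

    The x100 scaling is the conventional definition (an assumption here);
    both ages must be expressed in the same unit (months in this sketch).
    """
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_months / chronological_age_months


# Hypothetical child: age equivalent of 4 years 0 months (48 months)
# at a chronological age of 10 years 0 months (120 months).
example = ratio_iq(48, 120)
print(example)  # 40.0
```

Unlike a deviation-based standard score, this value is not bounded by the floor of the norm tables, which is why it may be reported when raw scores fall below the lowest available standard score.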

Although direct comparisons between the SB and the WISC have not been made in children with ASD, several studies, including those conducted as part of the validation of the measures, have compared scores on the WISC and SB. The technical manual for the most recent edition of the SB, the SB, 5th edition (SB-5; Roid, 2003) reports strong correlations with WISC-III (Wechsler, 1991) index scores in typically developing children, with full scale and verbal intelligence scores more highly correlated across tests than nonverbal scores. Specifically, the FSIQs on the SB-5 and WISC-

III were correlated at .84, the Verbal IQs at .85 and the nonverbal scores (SB-5 Nonverbal IQ and

WISC-III Performance IQ) at .66 (Roid, 2003). These comparisons have not been made with the most recent version of the child Wechsler scale, the WISC-IV, nor have comparisons been made in children with ASD.

Full-scale and index scores on the WISC and SB have been shown to be highly correlated in several pediatric populations, including those with mild ID, academic difficulties, and those who are typically-developing. Generally, full-scale scores on the measures are the most highly correlated

(range r = .73 - .92; M = .81, SD = .07), followed by verbal scores (range r = .70 - .82; M = .77, SD = .05), then nonverbal scores (range r = .55 - .79; M = .68, SD = .11). Despite these high correlations, significant discrepancies between the scores are reported (Lukens & Hurrell, 1996;

Prewett & Matavich, 1992; Prewett & Matavich, 1994; Rothlisberg, 1987). Specifically, scores on the

SB, including full-scale, verbal, and nonverbal scores, have been reported to be higher in children with ID and those with academic difficulties (Lukens & Hurrell, 1996; Prewett & Matavich, 1994).

In one study of children with mild ID, 97% obtained a higher FSIQ score on the SB-IV than on the

WISC-III (average = 8 points higher; range 1-20 points), with 62% of those children showing significantly higher SB scores (Lukens & Hurrell, 1996). Verbal and nonverbal scores on the SB were

also higher than corresponding scores on the WISC in this sample, with an average difference of about 10 and 5 points, respectively (Lukens & Hurrell, 1996). Results revealing higher SB than WISC scores were found in elementary school students referred for academic difficulties (Prewett &

Matavich, 1992; Prewett & Matavich, 1994); however, in one study, there was an effect of IQ level on the SB-WISC discrepancy (Prewett & Matavich, 1992). Prewett and Matavich (1992) compared the WISC-R and SB-4 and found that for children with low IQ (51-69), their SB FSIQ score was significantly higher (average = 9 points higher) than the WISC-R FSIQ score, but SB FSIQ scores were significantly lower in the group with average IQ (91-99), on average 6 points lower. Mean SB composite scores were also lower than mean full-scale scores on the WISC in a typically-developing population with an average of a seven-point discrepancy (Rothlisberg, 1987). In children with learning disabilities, SB composite scores were, on average about three points lower, although this was not significant (Phelps, Bell, & Scott, 1988). It is important to note that results in these studies are based on group mean differences and do not account for the WISC-SB discrepancy in individual children, as this was not reported in most studies.
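The point that two measures can be highly correlated and still yield systematically different scores for individual children can be illustrated with a minimal sketch. The six score pairs below are hypothetical values invented for illustration; they are not data from any of the cited studies. `pearson_r` is a helper written for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired full-scale scores (illustration only, not study data).
wisc_fsiq = [55, 60, 62, 66, 68, 70]
sb_fsiq = [62, 69, 70, 75, 75, 79]

r = pearson_r(wisc_fsiq, sb_fsiq)
differences = [sb - wisc for sb, wisc in zip(sb_fsiq, wisc_fsiq)]
mean_difference = sum(differences) / len(differences)

print(f"r = {r:.2f}")
print(f"mean SB minus WISC difference = {mean_difference:.1f}")
print(f"individual differences = {differences}")
```

In this made-up sample the rank ordering of children is nearly identical across tests, yet every child scores higher on one of them. A correlation coefficient alone therefore does not establish that the two scores are interchangeable for an individual.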

In adults with ID, Wechsler scales estimate overall intelligence to be about one standard deviation higher than that of the SB (Silverman et al., 2010). In published studies of children with

ASD that include both measures, SB data are used for younger children, three to six years old, and the

WISC for older children, six years and older; thus, the choice of which test is administered is confounded with maturation. One study that used this methodology concluded that a discrepancy between verbal and nonverbal abilities is present in younger children but not in older children

(Mayes & Calhoun, 2003a). Given that these conclusions are based on the verbal and nonverbal scores of two different measures, it is important to understand how these measures relate to one another in this population before we can draw such conclusions. Evaluating this relationship, as well


as the relationship between IQ scores and other functional domains, will guide both clinical and

research use of measures of intelligence in ASD.

The primary hypothesis of the current study is that symptom, cognitive and behavioral

factors associated with ASD may affect performance on the WISC-IV and the SB-5 differentially. It

may be that specific demands associated with the independent measures are affected differentially by

the core deficits and associated disruptive behaviors in ASD, resulting in different intelligence

scores. In addition, the tasks on one measure may be more engaging for these children. Garred and

Gilmore (2009) found that 70% of typically-developing preschoolers not only preferred the SB-5 to

the Wechsler Preschool and Primary Scale of Intelligence, 3rd edition, but they were more attentive

and more persistent during this measure. Given that there is currently no research on preference or

differential scores on the two measures in children with ASD, significant concerns should be raised

regarding the generalization of the existing reliability studies in typically-developing children

comparing scores on these tests (Lukens & Hurrell, 1996; Prewett & Matavich, 1994). In the absence of a direct comparison of these measures in children with ASD, it is not clear whether conclusions

about relative patterns of performance across subtests or about level of intellectual functioning are

reliable across IQ tests or valid in predicting outcome. It is understood that no psychological

assessment is a perfect reflection of a given construct, in this case, intelligence. However, if the

equivalency of IQ scores on commonly used measures is compromised in children with ASD to the

degree it is in children with other clinical diagnoses, such as ID or academic difficulties, the validity

of drawing inferences from these measures should be questioned.

Current Study

The current study: 1) investigated the convergent validity between the latest versions of two

commonly used measures of intelligence, the WISC-IV (Wechsler, 2003) and SB-5 (Roid, 2003), in

adolescents with ASD; 2) evaluated differences in full-scale, verbal, and nonverbal intelligence scores between the two measures, as well as the IQ classifications and reported verbal/nonverbal discrepancies; 3) determined whether neuropsychological (i.e., language, visual-motor functioning, and theory of mind), behavioral, or symptom variables critical in this population affected scores; and

4) investigated the differential association between intelligence scores on the two IQ measures and adaptive behavior. To assess discriminant validity, the relationship between the IQ tests was compared to their relationship with adaptive functioning scores, as well as cognitive variables such as language, visual-motor functioning, and theory of mind.

It was hypothesized that FSIQ and verbal abilities (VCI/VIQ) would have the strongest correlation coefficients between measures in an ASD group and would be significantly more closely associated than the inter-test nonverbal IQs (PRI/NVIQ) and verbal-nonverbal discrepancies.

Further, children’s intelligence scores, including FSIQ, verbal abilities, and nonverbal abilities, were hypothesized to be higher on the SB-5 than on the WISC-IV. It was anticipated that there would be differences in the verbal-nonverbal discrepancies on the WISC-IV and SB-5, with the magnitude of the discrepancy being larger on the SB-5. IQ is related to aspects of language, visual-motor functioning, and theory of mind (Joseph & Tager-Flusberg, 2004; Mayes & Calhoun, 2003a, 2003b); therefore, if significant differences were found between the IQ measures, it was anticipated that language, visual-motor functioning, and theory of mind would account for a portion of the difference in these scores. Finally, it was hypothesized that verbal, nonverbal, and overall intelligence scores from both measures would be significantly positively related to adaptive functioning scores, with the highest correlation being between verbal abilities and the communication subscale. It was further hypothesized that the two measures would be more highly correlated with each other than with adaptive behavior, language, theory of mind, and visual-motor scores, evidencing discriminant validity of the two intelligence measures.


CHAPTER TWO

Method

Participants

Participants were 40 children and adolescents between the ages of 10 years 0 months and 16 years 11 months (35 males, 5 females) with a diagnosis of autism spectrum disorder (ASD). All participants had received a previous diagnosis of ASD from clinicians in the Division of

Developmental and Behavioral Pediatrics (DDBP) at Cincinnati Children’s Hospital Medical Center

(CCHMC). Each child had undergone a multidisciplinary evaluation through DDBP that involved an initial visit with a developmental pediatrician, cognitive and behavioral testing by a clinical psychologist, and evaluation of communication skills by a speech-language pathologist (SLP).

However, at the time two of the 40 participants were diagnosed, the standard multidisciplinary evaluation did not include the Autism Diagnostic Observation Schedule (ADOS; Lord et al., 2000) as a standardized diagnostic measure; thus, the results of this measure were not incorporated into the diagnostic decision for these individuals. For the remaining 38 participants, the ADOS was administered during the SLP evaluation. The reliability of all ADOS examiners was assessed through regular trainings and reliability assessments with the authors of the instrument. Genetic testing, special education, and occupational and physical therapy assessments were also conducted as needed. A final clinical diagnosis was conferred by the developmental pediatrician, who synthesized the assessment data from all disciplines. Members of the multidisciplinary team have extensive training in ASD assessment and diagnosis. Specific diagnoses are outlined in Table 2. To determine whether participants still met diagnostic criteria, all parents completed the Social Responsiveness Scale (SRS; Constantino & Gruber, 2005). Raw SRS scores have historically been used in research; however, recent normative samples have resulted in T scores that can be used clinically and


interpreted across disciplines. These T scores were used to identify those with mild to severe deficiencies in social behavior (Constantino & Gruber, 2005). The majority of participants (92.7%) met the recommended cutoff of T > 60 for clinically significant symptoms of ASD; however, three

participants obtained T scores below this cutoff – 52, 54, and 57. For the primary analyses, these

individuals remained in the study sample given their prior documentation of an ASD diagnosis.

However, subsequent analyses were run with these individuals excluded and results were reported

when this had an impact on the outcome.

Participants for the present study were recruited through an Institutional Review Board-approved clinical database of all children and adolescents who received a final diagnosis of ASD between 1999 and 2009 from CCHMC (71%), as well as through community resources (12%), flyers (10%), and referrals from other studies (7%). Individuals who were not recruited through the CCHMC database

learned of the study through flyers in a DDBP clinic, an autism fair in the community, or through a

study conducted within DDBP. CCHMC serves a large percentage of the ASD community and is

the sole center that provides a comprehensive diagnostic evaluation; thus, many of the potential

families have sought services through DDBP at some point. Potential participants completed a telephone screening to determine eligibility, including whether or not they had received their

diagnosis through DDBP. To identify individuals with idiopathic autism, further exclusion criteria

were utilized. Children and adolescents with limited verbal communication (i.e., primarily nonverbal

or only single words) or whose parents believed they would be unable to meet the demands of the

testing situation (e.g., work at a table, follow basic commands) were excluded. Additional exclusion

criteria included the presence of a genetic or metabolic disorder, a history of neurological injury,

including traumatic brain injury, tumor, or stroke, a chronic physical disorder or neurological condition (e.g., epilepsy, Tourette’s syndrome), and a severe psychiatric diagnosis, such as schizophrenia or bipolar disorder. Individuals who spoke English as a second language, had a history

of substance abuse, had significant hearing, vision, or mobility impairments, or who had undergone intelligence testing in the preceding three months were also excluded. Of the 85 individuals whose parent or guardian completed the telephone screening for the study, 20 individuals did not meet eligibility criteria. They were excluded for the following reasons: 7 due to seizures, 5 because they were nonverbal or had only single words, 3 did not receive an ASD diagnosis through DDBP, 2 had a severe psychiatric diagnosis, 1 had Fragile X syndrome, 1 had a mitochondrial disorder, and 1 spoke English as a second language. Of the 65 individuals who met eligibility criteria, 5 indicated that they were not interested, 1 was unable to be scheduled before he turned 17 years old and no longer met age criteria, and 18 were never enrolled because the families could no longer be contacted or they failed to attend scheduled study appointments. Forty participants had complete data for both intelligence measures and the parent-reported adaptive behavior measure. One participant completed only the Stanford-Binet, 5th edition (SB-5) and was unable to complete the remainder of the test battery due to significant behavioral problems; this individual was excluded from the final dataset.
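The participant flow described above can be checked with a short tally. All counts are taken from the paragraph; the variable names and dictionary labels are shorthand introduced only for this sketch.

```python
# Counts reported in the text; labels are shorthand for this sketch.
screened = 85

ineligible_reasons = {
    "seizures": 7,
    "nonverbal or single words only": 5,
    "no ASD diagnosis through DDBP": 3,
    "severe psychiatric diagnosis": 2,
    "Fragile X syndrome": 1,
    "mitochondrial disorder": 1,
    "English as a second language": 1,
}
ineligible = sum(ineligible_reasons.values())
eligible = screened - ineligible

not_tested_reasons = {
    "not interested": 5,
    "turned 17 before scheduling": 1,
    "lost to contact or missed appointments": 18,
}
tested = eligible - sum(not_tested_reasons.values())

incomplete_battery = 1  # completed only the SB-5
final_sample = tested - incomplete_battery

print(ineligible, eligible, tested, final_sample)  # 20 65 41 40
```

The tally reproduces the reported figures: 20 ineligible, 65 eligible, 41 tested, and a final sample of 40.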

Parents or legal guardians of the participants reported a median annual income category of

$66,000-$80,000 on a background form completed during the study visit. The majority of parents

(88%) had completed at least some college. Additional demographic and diagnostic data are presented in Table 2.

Table 2

Child demographic and diagnostic data (n = 40)

                                                      N      Percent
Gender
  Male                                                35     87.5
  Female                                              5      12.5
Race
  White                                               34     85.0
  African American                                    2      5.0
  Native American                                     1      2.5
  Bi-Racial                                           3      7.5
Diagnosis
  Autism                                              17     42.5
  Asperger’s                                          8      20.0
  PDD-NOS                                             3      7.5
  ASD (some clinicians did not specify beyond ASD)    12     30.0

                                                      M      SD
Age (years:months)                                    12:8   1:11

Note: Autism Spectrum Disorder (ASD) and Pervasive Developmental Disorder, Not Otherwise Specified (PDD-NOS).

Procedures

Written informed consent was obtained from the primary caregiver of each participant, and written assent was obtained from each child who was over the age of 11 and deemed cognitively capable of meaningfully participating in the assent process. One participant, age 11, was considered unable to comprehend the nature of the assent process based on examiner’s evaluation and parental report. In this case, verbal assent was obtained. The study was approved by the Institutional Review Boards of both CCHMC and the University of Cincinnati. Participants completed a 2.5 to 5-hour neuropsychological test battery during an individual child assessment that included two standardized clinical measures of intelligence, as well as measures of language, visual-motor functioning, and theory of mind. The participants were randomized to either receive the WISC-IV first or the SB-5 first. Sheets indicating group assignment for a projected 50 participants were shuffled and one was selected prior to each participant’s testing session. While the child was completing cognitive testing, the parent or guardian, 95% of whom were biological mothers, completed several questionnaires, including behavioral measures and symptom inventories. The majority of participants completed the assessment in less than 3.5 hours. Testing was administered in a clinic setting by the first author (KB) who was trained on the test battery. Of the 40 participants, one participant completed the assessment over two consecutive days because he and his mother indicated he was fatigued after half of testing was completed, while the remaining 39 participants were administered all measures in a single testing session. The data of the child tested over two days were not outliers in terms of the IQ scores on the individual tests or the discrepancy between scores across the two tests. Thus, this individual was included in all analyses.
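The shuffled-sheet procedure used to assign test order can be sketched in code as follows. This is an illustration only. The study used physical sheets, and the equal split of the 50 projected sheets between the two orders is an assumption, as are the function name `make_order_sheets` and the seed value.

```python
import random


def make_order_sheets(n_projected=50, seed=None):
    """Build and shuffle one assignment sheet per projected participant.

    Assumes the sheets were split evenly between the two test orders;
    the text does not state the split.
    """
    rng = random.Random(seed)
    half = n_projected // 2
    sheets = ["WISC-IV first"] * half + ["SB-5 first"] * (n_projected - half)
    rng.shuffle(sheets)
    return sheets


# One sheet is drawn, without replacement, before each testing session.
sheets = make_order_sheets(seed=2011)  # seed is arbitrary, for reproducibility
assignments = [sheets.pop() for _ in range(40)]

print(assignments.count("WISC-IV first"), assignments.count("SB-5 first"))
```

Because only 40 of the projected 50 sheets would be drawn under this scheme, the two order groups need not be exactly equal in size, even if the full stack was balanced.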

Measures

Intellectual Functioning. The test battery included two standardized measures of intelligence, the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV; Wechsler, 2003) and the Stanford-Binet, 5th edition (SB-5; Roid, 2003), which were administered in a counterbalanced order. The WISC-IV is an individually administered assessment of cognitive abilities for children ages 6 years, 0 months to 16 years, 11 months. The measure yields one overall composite score, Full Scale IQ (FSIQ), and four factor-analytically derived index scores: Verbal Comprehension Index (VCI), Perceptual Reasoning

Index (PRI), Working Memory Index (WMI) and Processing Speed Index (PSI). The VCI is composed of verbal subtests requiring reasoning, comprehension, and conceptualization, and the


PRI is composed of subtests measuring perceptual reasoning and organization. The subtests of the

WMI measure attention, concentration, and working memory, and the PSI subtests measure speed of mental and graphomotor processing. Each index score is standardized to have a mean of 100 and standard deviation of 15 in the general population. The standardization sample consists of children ages 6 to 16 years 11 months and is representative of the U.S. population with regard to age, race/ethnicity, parental education, and geographic region. The reliability coefficients range from .88

(PSI) to .97 (FSIQ), indicating excellent reliability and consistency. The standardized administration procedures were followed during administration of the WISC-IV, and scores were converted to age-corrected standardized scores using the best available national norms.

The SB-5 is another measure of overall intellectual abilities for children and adults (ages 2 years, 0 months to 89 years, 11 months) that results in three composite scores: Full Scale (FSIQ),

Nonverbal (NVIQ) and Verbal IQ (VIQ); and five factor scores: Fluid Reasoning (FR), Knowledge

(KN), Quantitative Reasoning (QR), Visual-Spatial Processing (VS) and Working Memory (WM). All composite and factor indices have a mean of 100 and a standard deviation of 15. The FSIQ is derived from 10 subtests, five verbal and five nonverbal. The normative sample for the 5th edition consisted of 4,800 individuals, ages 2 to 85+ years who were representative of the United States with regard to age, sex, ethnicity, geographic region, and socioeconomic levels (Roid, 2003). Reliability is excellent for the FSIQ (.97 to .98) and index scores (average of .90) (Roid, 2003). The SB-5 was administered according to standardized procedures, and scores were converted to age-corrected standardized scores using the best available national norms. Table 3 outlines the index and factor scores of the WISC-IV and SB-5 and identifies those that correspond to similar constructs.

Table 3

Index Scores by IQ Measure

Construct                WISC-IV                             SB-5
Overall intelligence     Full-scale IQ (FSIQ)                Full-scale IQ (FSIQ)
Verbal intelligence      Verbal Comprehension Index (VCI)    Verbal IQ (VIQ)
Nonverbal intelligence   Perceptual Reasoning Index (PRI)    Nonverbal IQ (NVIQ)
Working Memory           Working Memory Index (WMI)          Working Memory (WM)
                         Processing Speed Index (PSI)        ---
                         ---                                 Knowledge
                         ---                                 Fluid Reasoning
                         ---                                 Visual Spatial
                         ---                                 Quantitative Reasoning

Language. The Clinical Evaluation of Language Fundamentals, 4th edition-Screener (CELF-4 Screener; Semel, Wiig, & Secord, 2004) is a 47-item language measure designed to screen individuals for language disorders. It was included in the battery to provide an estimate of general language abilities across multiple domains of language functioning. For the age

range of children and adolescents represented in the present study, the measure consists of five

different language tasks, which are similar to subtests on the CELF-4 (Semel, Wiig, & Secord, 2003):

Concepts and Following Directions, Word Structure, Recalling Sentences, Formulated Sentences,

and Word Classes. All items are scored (1) correct or (0) incorrect. The standardization sample for the

CELF-Screener was 1,200 students ages 5 years, 0 months to 21 years 11 months, consisting of both a clinical group (those diagnosed with a language disorder or learning disability) and a nonclinical group (those who did not have identified language or learning disabilities). Measures of internal consistency reveal a split-half reliability coefficient of .70 for children 5-8 years old and .72 for individuals 9-21 years. The full CELF battery has been used in previous studies of children with


ASD and been shown to relate to children’s everyday language (Condouris, Meyer, & Tager-

Flusberg, 2003); however, the CELF-Screener has not yet been administered in a study in individuals

with ASD. Scores on the CELF-Screener are highly correlated with the CELF-4 Core Language

Scores (r = .67 - .75; Semel, et al., 2004) and are thus also believed to be related to a child’s natural

language. The CELF-Screener was administered according to its standardized procedures and raw

scores were converted to z scores based on the age-grouped national norms provided by the

measure’s authors. These z scores were the primary units of analysis for the present study.

Visual-motor skills. A measure of graphomotor skills, the Beery-Buktenica Developmental Test of

Visual-Motor Integration, 5th edition (Beery VMI; Beery, Buktenica, & Beery, 2006), was administered to

provide an estimate of visual-motor abilities. The Beery VMI-Full Form is a 30-item measure that is

administered to children ages 2 to 18 years and has been used in research of children with ASD

(Mayes & Calhoun, 2003b). It consists of a developmental sequence of geometric forms that are to be copied with paper and pencil and is designed to assess the extent to which individuals can integrate their visual and motor abilities. This measure has been normed five times from 1964 to

2003 on a total of more than 11,000 children. Excellent reliability (.99) has been reported between various editions of the test (Beery, et al., 2006). This measure was administered according to the test’s standardized procedures and raw scores were converted to age-corrected standardized scores using the VMI’s standardized national norms. The resulting standard scores (M = 100, SD = 15) were used in this study’s analyses.

Theory of Mind. The NEPSY, 2nd edition (Korkman, Kirk, & Kemp, 2007) is a general

assessment tool that allows clinicians to formulate a tailored assessment across multiple domains of

neuropsychological functioning. The theory of mind subtest on this measure was included in the test

battery to provide an estimate of the participants’ abilities on theory of mind tasks. Although other

measures of theory of mind have been used in the literature with children with ASD, such as false


belief tasks (Happe, 1994; Joseph & Tager-Flusberg, 2004; Lind & Bowler, 2009), no studies to the

author’s knowledge have used this brief single subtest. This 21-item subtest was selected due to its

assessment of various aspects of theory of mind. It assesses the ability to understand mental

functions such as belief, intention, deception, emotion, imagination, and pretending, as well as the

ability to understand that others have their own , beliefs, and feelings that may be different

from one’s own (Korkman, et al., 2007). Six of the 21 items comprise the contextual task which assesses one’s ability to recognize the appropriate affect given various social contexts. The total score on this subtest is not converted to a standard score; the raw scores range from 0 to 26, with a higher raw score indicating greater capability on theory of mind tasks (Korkman, et al., 2007). There are no normative data available for this subtest; however, a Senior Analyst at Pearson provided the means and standard deviations for the theory of mind subtest for 487 children from the normative sample who were between the ages of 10 and 16 (personal communication, Ying Ming, August 31,

2011). These data were used to convert raw scores to the z scores that are the unit of analysis in the

present study.
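The conversion described above is a standard z transformation. A minimal sketch in Python follows. The normative mean and standard deviation supplied by Pearson are not reproduced in this section, so `norm_mean` and `norm_sd` below are hypothetical placeholder values, and `raw_to_z` is an illustrative helper name rather than part of any scoring software.

```python
def raw_to_z(raw_score: float, norm_mean: float, norm_sd: float) -> float:
    """Express a raw score as its distance from the normative mean in SD units."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw_score - norm_mean) / norm_sd


# Hypothetical normative values for illustration only (not the Pearson data).
norm_mean, norm_sd = 22.0, 3.0

# A raw score of 16 on the 0-26 scale, relative to the placeholder norms.
z = raw_to_z(16, norm_mean, norm_sd)
print(f"z = {z:.2f}")
```

A negative z score indicates performance below the mean of the reference group, which places scores from measures with different raw ranges on a common scale.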

Behavior Problems. The Child Behavior Checklist (CBCL; Achenbach, 1991; Achenbach &

Rescorla, 2001) is a factor-analytically derived behavior checklist completed by parents or guardians.

All parents in the study completed the 113-item measure, rating their child’s current behavior. Each

item on this scale is rated on a 3-point scale: (0) not true, (1) somewhat/sometimes true, and (2) very/often

true. Scores are computed for Total Behavior Problems, Internalizing Problems, and Externalizing

Problems. Items measuring anxious/depressed behavior, withdrawn/depressed behavior, and

somatic complaints together form the broad-band dimension of internalizing behavior, whereas

items measuring aggressive and rule-breaking behavior form the broad-band dimension of externalizing behavior. The Total Behavior Problems dimension consists of the internalizing and


externalizing subscales, as well as thought problems, attention problems, social problems, and other problems. For all three dimensions, a T score of 65 or above (> 93rd percentile) is considered

significantly elevated (Achenbach, 1991). Narrowband scales include: anxious/depressed behavior,

withdrawn/depressed behavior, somatic complaints, thought problems, attention problems, social

problems, aggressive behavior, and rule-breaking behavior. DSM-oriented scales are also derived, one of

which is a Pervasive Developmental Disorder Problems scale. The CBCL is not intended to be

diagnostic of ASD, but can be used to identify a range of behavioral issues including those

symptomatic of ASD. The CBCL has been shown to have high reliability and validity. It shows

stability over time and provides standardized scores for age and gender (Achenbach, 1991). This

behavioral measure has also been used in children with autism (e.g., Hartley, Sikora, & McCoy,

2008).

Autism Spectrum Disorder Symptoms. The Social Responsiveness Scale (SRS; Constantino &

Gruber, 2005) is a 65-item rating scale that measures the severity of symptoms of autism spectrum

disorder as they occur in natural social settings over six months prior to evaluation. Specifically, this

measure covers varying dimensions of interpersonal behavior, communication, and

repetitive/stereotypic behavior that are characteristics of ASD. The SRS was developed for children

ages 4 to 18 years and can be completed by a parent or teacher. This tool measures impairment on a

quantitative scale across a wide range of severity, which is consistent with recent research indicating that autism is best conceptualized as a spectrum condition rather than an all-or-nothing diagnosis

(Constantino, 2002). Each item is scored from: (0) never true to (3) almost always true. Results of the

SRS consist of a standard T score, Total Score, reflecting severity of social deficits in ASD, as well as

scores for five Treatment Subscales: Receptive, Cognitive, Expressive, and Motivational aspects of

social behavior, as well as Autistic Preoccupations. Total scores can range from 0 to 195, with


reliability estimates above .90 for both males and females. The subscales are not to be used

diagnostically, but can be helpful in evaluating severity of symptoms to design and evaluate

treatment programs, for example. The standardization sample for the SRS consisted of more than

1,600 children (4 to 18 years of age) from the general population. Norms correct for rater (parent or

teacher), as well as age and gender of the child rated. The SRS is increasingly used to measure ASD

symptoms that discriminate between children with and without ASD and has been shown to be

highly correlated with the ADI-R domain scores (r = 0.65 – 0.74; Constantino & Gruber, 2005). In

the present study, it was used to evaluate current severity of ASD symptoms. A T score of 60 has

been identified as a cutoff score to indicate clinically elevated levels of social impairment.

The Children’s Communication Checklist, 2nd edition (CCC-2; Bishop, 2006) is a measure

designed to assess language and communication skills in children ages 4 years to 16 years 11 months.

The CCC-2 quantifies degree of pragmatic language impairment. It is completed by an informant who

has regular contact with the child, typically a parent. It is a 70-item scale with possible responses based on the frequency of the communicative behavior: (0) less than once a week (or never) to (3) several

times (more than twice) a day (or always). Responses result in two composite scores, one of which, the

General Communication Composite (GCC), was used in the present study. The GCC is a

standardized score (M = 100, SD = 15) that can be used to identify children with any type of

clinically significant communication problem, with scores less than 80 indicating significant

impairment (Bishop, 2006). The GCC is a summation of the scaled scores of eight subscales:

Speech, Syntax, Semantics, and Coherence assess aspects of the child’s language structure,

articulation and phonology, vocabulary, and discourse, while the Initiation, Scripted Language,

Context, and Nonverbal Communication subscales target the pragmatic aspects of communication that are often not evaluated by standardized language assessment measures. Within each subscale, two of the items evaluate communicative strengths, while five items assess communication


difficulties. The standardization sample for the U.S. edition of the CCC-2 (Bishop, 2006) consisted

of 950 children (ages 4 years 0 months to 16 years 11 months). Internal consistency reliability

coefficients ranged from .66 to .85 across subscales. The reliabilities for the GCC range from .94 to

.96. Validity was assessed by calculating classification rates for a variety of matched clinical groups based on GCC scores at 1, 1.5, and 2.0 SDs below the mean. For the group with ASD, 89% of the

children were identified as such based on a GCC 1.0 SD below the mean. Based on these results, the

CCC–2 demonstrates good reliability and validity (Bishop, 2006).

The Repetitive Behavior Scale-Revised (RBS-R; Bodfish, Symons, & Lewis, 1999) is an

empirically-derived clinical rating scale that assesses the presence and severity of a number of

restricted, repetitive behaviors that are characteristic of individuals with ASD. This measure is

typically completed by the parent, who is asked to make ratings based on the interactions and

observations of the individual over the past month. Further, the parent is told to consider: 1) how

frequently the behavior occurs, 2) how difficult it is to interrupt the behavior, and 3) how much the

behavior interferes with ongoing events. The 43 items on this scale are rated on a four-point Likert-

type rating scale ranging from: (0) behavior does not occur to (3) behavior occurs and is a severe problem. The

RBS-R has six conceptually-derived subscales which include: (a) Stereotyped Behavior (i.e.,

movements with no obvious purpose that are demonstrated repeatedly); (b) Self-injurious Behavior

(i.e., actions that cause or have the potential to cause redness, bruising, or other injury to the body);

(c) Compulsive Behavior (i.e., behavior that is repeated and performed according to a rule or

involves things being done ‘‘just so’’); (d) Ritualistic Behavior (i.e., performing activities of daily

living in a similar manner); (e) Sameness Behavior (i.e., resistance to change, insisting that things stay

the same); and (f) Restricted Behavior (i.e., limited range of focus, interest, or activity). Total scores

are obtained for each subscale, as well as the number of items endorsed within each scale. An overall

score and overall number of items endorsed are also obtained. There are no normative data for this


measure; however, scores are negatively correlated with NVIQ, especially the Sameness Behavior

subscale.

Adaptive Functioning. The Vineland Adaptive Behavior Scales, 2nd edition (Vineland-II;

Sparrow, Cicchetti, & Balla, 2005) is a widely used and well-validated standardized measure of adaptive behavior. The Vineland-II Survey Interview Form, which was used in the present study, is a structured parent interview measure. Each item is rated on a Likert-type rating scale with responses of: (0) never true, (1) sometimes or partially true, or (2) often true. For this study’s age group (10 years to 16 years 11 months), this measure assesses adaptive functioning across the following domains:

Communication Skills, Daily Living Skills, and Socialization Skills. The Communication domain includes skills required for receptive, expressive, and written language, while the Daily Living Skills domain includes practical skills related to self-care and contributing to a household and community.

The Socialization domain pertains to skills needed to get along with others, engage in leisurely activities such as play, and regulate emotions and behavior. Responses result in standardized scores

(M = 100; SD = 15) for each domain, as well as an Adaptive Behavior Composite (ABC). For children younger than six years of age, a Motor Skills domain score is also obtained and is required to compute the ABC (Sparrow, Cicchetti, & Balla, 2005). Based on a nationally representative sample, reliabilities for each of four domains (Communication, Daily Living Skills, Socialization,

Motor Skills) range from .93-.99.

Statistical Analyses

Of the 40 participants who completed both intelligence measures, a WISC-IV FSIQ could not be calculated for four. For one of these four participants, the Symbol Search subtest, which is one of

two subtests that factor into the PSI, was not administered. Thus, the PSI could not be calculated,

nor could the FSIQ (Wechsler, 2003). The remaining three participants obtained scores of 0 on at


least two of the three verbal subtests. According to the WISC-IV Technical Manual (Wechsler,

2003), the VCI cannot be calculated if the child is unable to obtain a score on two or more of the

verbal subtests; thus, these three participants did not obtain VCI or FSIQ scores on the WISC-IV.

Results are reported accordingly. However, elimination of these three low-scoring participants from

the analyses decreases the ability to generalize findings to lower-functioning individuals. Thus, for

some of the analyses, these three individuals were assigned the lowest possible score on the VCI and

FSIQ, which are 45 and 40, respectively. When inclusion of these low scores changed results, the

statistical analyses are reported both with and without these individuals. As noted previously, three

participants scored below the cutoff on the SRS (T = 60), indicating social impairment symptoms below clinical threshold. These individuals had prior documentation of an ASD diagnosis and thus remained in the study sample. However, results are reported when removal of the data for these three individuals alters statistical outcomes.

A multitrait-multimethod matrix was used to assess convergent and discriminant validity between the aspects of the two intelligence tests that are designed to evaluate similar aspects of intelligence (e.g., verbal intelligence). The composite score, corresponding index scores (FSIQ,

‘Verbal’ = VCI/VIQ, ‘Nonverbal’ = PRI/NVIQ, and WMI/WM) and the verbal-nonverbal discrepancy were included. Verbal-nonverbal discrepancy scores were calculated by subtracting nonverbal scores (PRI and NVIQ) from verbal scores (VCI and VIQ, respectively); thus, a negative number indicates higher nonverbal scores, while a positive number indicates higher verbal scores.

Paired-samples t tests were used to examine the mean differences of all IQ scores across measures

(FSIQ, VCI/VIQ, PRI/NVIQ and the ‘Verbal’/‘Nonverbal’ discrepancy). Potential differences in

IQ classifications were assessed by comparing the frequency of descriptive IQ classifications (i.e.,

average, low average) between the two tests.
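The discrepancy scoring and the classification comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the scoring procedure of either test manual: the function names are hypothetical, the child's scores are invented for the example, and the cutoffs in `classify` are an assumed reading of the descriptive ranges used in the Results (superior 120 and above, average 80 to 119, borderline 70 to 79, ID below 70).

```python
def discrepancy(verbal: int, nonverbal: int) -> int:
    """Verbal minus nonverbal: negative means the nonverbal score is higher."""
    return verbal - nonverbal


def classify(iq: int) -> str:
    """Map a standard score to a descriptive range (assumed cutoffs, see lead-in)."""
    if iq >= 120:
        return "superior"
    if iq >= 80:
        return "average"
    if iq >= 70:
        return "borderline"
    return "ID"


# Hypothetical scores for one child, for illustration only.
child = {"VCI": 85, "PRI": 100, "VIQ": 78, "NVIQ": 88,
         "WISC_FSIQ": 78, "SB_FSIQ": 82}

wisc_gap = discrepancy(child["VCI"], child["PRI"])   # WISC-IV: VCI - PRI
sb_gap = discrepancy(child["VIQ"], child["NVIQ"])    # SB-5: VIQ - NVIQ

# Do the two tests place this child in the same descriptive category?
same_category = classify(child["WISC_FSIQ"]) == classify(child["SB_FSIQ"])

print(wisc_gap, sb_gap, same_category)
```

In this invented case both discrepancies are negative (nonverbal higher than verbal on each test), yet a four-point FSIQ difference is enough to move the child from one descriptive category to another.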


Correlations were used to determine whether or not scores on tasks of other neuropsychological domains (i.e., language, visual-motor functioning, and social perception), functional domains (i.e., behavior and adaptive functioning), or on measures of autism symptoms contributed to differences in IQ scores between measures. Specifically, all correlations between IQ difference scores (i.e., WISC-FSIQ – SB-FSIQ) and cognitive, functional, and diagnostic variables are reported. Test order groups, WISC-IV administered first and SB-5 administered first, were compared across all index scores to determine if test order impacted IQ performance.
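As a rough illustration of the difference-score correlations described above, the sketch below computes WISC-FSIQ minus SB-FSIQ for each participant and correlates it with one covariate. All values are invented for the example (they are not study data), `pearson_r` is a hypothetical helper written with the standard library only, and the choice of a language z score as the covariate is simply one of the variables named in the text.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented illustrative values for five participants (not study data).
wisc_fsiq = [70, 85, 92, 101, 64]
sb_fsiq = [75, 84, 99, 100, 72]
language_z = [-1.5, -0.2, 0.4, 0.9, -2.0]

# IQ difference score: negative values mean the SB-5 FSIQ was higher.
iq_diff = [w - s for w, s in zip(wisc_fsiq, sb_fsiq)]

r = pearson_r(iq_diff, language_z)
print(f"r = {r:.2f}")
```

A correlation near zero would suggest that the covariate does not track which test yields the higher score; a sizable correlation would suggest that it does.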

The convergent validity of the two IQ measures is a primary aim of this study, which necessitates the evaluation of discriminant validity. Thus, all IQ index scores were evaluated for their correlation with neuropsychological variables (CELF, VMI, and theory of mind) and adaptive functioning variables (Vineland Communication, Daily Living Skills, and Socialization). Statistical tests were always two-sided, with α = .05.


CHAPTER THREE

Results

Convergent Validity Between WISC-IV and SB-5

A multitrait-multimethod matrix (Table 4) was used to investigate the convergent validity between the WISC-IV (Wechsler, 2003) and SB-5 (Roid, 2003) in adolescents with ASD. To analyze the matrix, the average correlations were computed for the IQ and Index scores that have corresponding content across the two IQ tests for each of the following types of associations: the monotrait-monomethod (MTMM), the heterotrait-monomethod (HTMM), the monotrait-heteromethod (MTHM), and the heterotrait-heteromethod (HTHM). MTMM correlations were those between the same test and the same trait or score (e.g., WISC-FSIQ: WISC-FSIQ; SB-VIQ: SB-VIQ), based on published reliability coefficients from the test manuals. The HTMM correlations were those between the index scores within each test (e.g., WISC-FSIQ: WISC-VCI, WISC-PRI, and WISC-WMI). MTHM correlations included those between corresponding index scores of the two different tests (e.g., WISC-VCI: SB-VIQ; WISC-WMI: SB-WM). Finally, the HTHM correlations were those between index scores of tests that were not designed to measure the same construct (e.g., WISC-VCI: SB-NVIQ; WISC-WMI: SB-VIQ). The correlations between the remaining index scores from each IQ test that do not have a corresponding score on the other measure, such as the WISC-IV PSI or SB-5 Knowledge, and all other scores were reported as ‘Other.’

As anticipated, results revealed that the MTMM correlations were the highest, with an average of r = .92. The average MTHM correlation, which is the correlation between the two tests’ corresponding ‘traits’ (r = .84), was equal to the average HTMM correlation, which is the correlation between the index scores within each test (r = .84). It was hypothesized that verbal score correlations would be the strongest; however, the verbal correlations (r = .86) were not found


to differ significantly from nonverbal correlations (r = .82). The average correlation between the

index scores on the WISC-IV (r = .78) was significantly lower than the average correlation between

the index scores on the SB-5 (r = .91; z = 2.91, p < .01). These averages are similar to what is

reported in the normative samples, as the SB-5 VIQ, NVIQ, and FSIQ have an average correlation

of .92 (range .85 - .96) for the age group of this study, and the WISC-IV has averages of .75 to .80

across ages 10-16:11 (range .59 - .88). The average HTHM correlation (r = .77) was strong, but

lower than the average HTMM (r = .84) and MTHM (r = .84). Finally, the average of the ‘other’

correlations was the lowest of all correlations (r = .76). All correlations in Table 4 were significant at

p < .001.
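The category averages reported above can be checked against the Table 4 entries for the eight scores that have a counterpart on the other test. The sketch below is an illustration, not the author's analysis script: it assumes the averages are simple arithmetic means of the tabled correlations (no Fisher r-to-z transformation), and the variable names are hypothetical. Under that assumption, the results round to the reported values of .84 (HTMM), .84 (MTHM), and .77 (HTHM).

```python
from itertools import combinations
from statistics import mean

WISC = ["WISC.FSIQ", "WISC.VCI", "WISC.PRI", "WISC.WMI"]
SB = ["SB.FSIQ", "SB.VIQ", "SB.NVIQ", "SB.WM"]

# Off-diagonal correlations among the eight corresponding scores (from Table 4).
R = {
    # Within the WISC-IV
    ("WISC.FSIQ", "WISC.VCI"): 0.89, ("WISC.FSIQ", "WISC.PRI"): 0.87,
    ("WISC.FSIQ", "WISC.WMI"): 0.84, ("WISC.VCI", "WISC.PRI"): 0.66,
    ("WISC.VCI", "WISC.WMI"): 0.70, ("WISC.PRI", "WISC.WMI"): 0.69,
    # Within the SB-5
    ("SB.FSIQ", "SB.VIQ"): 0.97, ("SB.FSIQ", "SB.NVIQ"): 0.96,
    ("SB.FSIQ", "SB.WM"): 0.90, ("SB.VIQ", "SB.NVIQ"): 0.87,
    ("SB.VIQ", "SB.WM"): 0.91, ("SB.NVIQ", "SB.WM"): 0.83,
    # Across tests (WISC-IV score first, SB-5 score second)
    ("WISC.FSIQ", "SB.FSIQ"): 0.88, ("WISC.FSIQ", "SB.VIQ"): 0.85,
    ("WISC.FSIQ", "SB.NVIQ"): 0.83, ("WISC.FSIQ", "SB.WM"): 0.79,
    ("WISC.VCI", "SB.FSIQ"): 0.84, ("WISC.VCI", "SB.VIQ"): 0.86,
    ("WISC.VCI", "SB.NVIQ"): 0.73, ("WISC.VCI", "SB.WM"): 0.75,
    ("WISC.PRI", "SB.FSIQ"): 0.81, ("WISC.PRI", "SB.VIQ"): 0.74,
    ("WISC.PRI", "SB.NVIQ"): 0.82, ("WISC.PRI", "SB.WM"): 0.66,
    ("WISC.WMI", "SB.FSIQ"): 0.77, ("WISC.WMI", "SB.VIQ"): 0.76,
    ("WISC.WMI", "SB.NVIQ"): 0.72, ("WISC.WMI", "SB.WM"): 0.78,
}

# Heterotrait-monomethod: different scores within the same test.
htmm_wisc = mean([R[pair] for pair in combinations(WISC, 2)])
htmm_sb = mean([R[pair] for pair in combinations(SB, 2)])
htmm = mean([R[pair] for test in (WISC, SB) for pair in combinations(test, 2)])

# Monotrait-heteromethod: corresponding scores across the two tests.
mthm = mean([R[(w, s)] for w, s in zip(WISC, SB)])

# Heterotrait-heteromethod: non-corresponding scores across the two tests.
hthm = mean([R[(w, s)] for i, w in enumerate(WISC)
             for j, s in enumerate(SB) if i != j])

print(f"HTMM = {htmm:.3f} (WISC-IV {htmm_wisc:.3f}, SB-5 {htmm_sb:.3f})")
print(f"MTHM = {mthm:.3f}, HTHM = {hthm:.3f}")
```

Keeping the within-test averages separate also makes the WISC-IV versus SB-5 contrast visible, since the two within-test means differ noticeably while the pooled HTMM value sits between them.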


Table 4

Multi-trait Multi-method Evaluation of the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV) and Stanford Binet, 5th edition (SB-5)

Score              1     2     3     4     5     6     7     8     9    10    11    12    13
 1. WISC.FSIQ   0.97  0.89  0.87  0.84  0.70  0.88  0.85  0.83  0.79  0.79  0.83  0.80  0.77
 2. WISC.VCI          0.95  0.66  0.70  0.51  0.84  0.86  0.73  0.75  0.73  0.85  0.69  0.77
 3. WISC.PRI                0.93  0.69  0.55  0.81  0.74  0.82  0.66  0.78  0.74  0.78  0.77
 4. WISC.WMI                      0.93  0.54  0.77  0.76  0.72  0.78  0.72  0.71  0.71  0.64
 5. WISC.PSI*                           0.89  0.61  0.58  0.60  0.61  0.54  0.58  0.59  0.51
 6. SB.FSIQ                                   0.97  0.97  0.96  0.90  0.94  0.93  0.94  0.91
 7. SB.VIQ                                          0.94  0.87  0.91  0.90  0.93  0.89  0.91
 8. SB.NVIQ                                               0.94  0.83  0.92  0.87  0.93  0.90
 9. SB.WM                                                       0.88  0.79  0.82  0.78  0.78
10. SB.FR*                                                            0.88  0.85  0.88  0.82
11. SB.KN*                                                                  0.92  0.84  0.79
12. SB.QR*                                                                        0.92  0.82
13. SB.VS*                                                                              0.86

Note: All correlations are significant, p < .01; n = 36. Column numbers correspond to the numbered row scores.
FSIQ = Full scale IQ; VCI = Verbal Comprehension Index; PRI = Perceptual Reasoning Index; WMI = Working Memory Index; PSI = Processing Speed Index; VIQ = Verbal IQ; NVIQ = Nonverbal IQ; WM = Working Memory; FR = Fluid Reasoning; KN = Knowledge; QR = Quantitative Reasoning; VS = Visual Spatial
Boxes with underlined numbers represent monotrait-monomethod correlations and are derived from reliability statistics in the published test manuals
Lightly shaded boxes represent heterotrait-monomethod correlations
Boxes with borders represent monotrait-heteromethod correlations
Dark shaded boxes represent heterotrait-heteromethod correlations
Boxes that are not shaded represent ‘other’ correlations
* denotes rows and columns of correlations that factor into the 'Other' category


Descriptive statistics and paired t tests for scores on the two IQ tests are provided in Table 5.

Table 5

Child IQ Data

                          WISC-IV                            SB-5
Score                     n     M (SD)          Range        n     M (SD)          Range        t        d
FSIQ                      36a   78.81 (19.38)   40-121       36a   82.06 (19.03)   40-122       -2.05*   -0.17
VCI/VIQ                   37a   83.49 (20.41)   45-124       37a   79.59 (20.09)   43-119       2.29*    0.19
PRI/NVIQ                  40    87.55 (20.11)   51-129       40    84.73 (20.36)   43-132       1.47     0.14
WMI/WM                    40    76.38 (21.00)   50-123       40    78.78 (18.46)   48-109       -1.14    -0.12
PSI                       39a   77.79 (15.95)   50-112       ---   ---             ---          ---      ---
Fluid Reasoning           ---   ---             ---          40    85.68 (20.84)   47-132       ---      ---
Knowledge                 ---   ---             ---          40    79.53 (19.57)   49-111       ---      ---
Quantitative Reasoning    ---   ---             ---          40    82.40 (19.37)   50-130       ---      ---
Visual Spatial            ---   ---             ---          40    88.03 (19.57)   50-126       ---      ---

Note: FSIQ = Full scale IQ; VCI = Verbal Comprehension Index; PRI = Perceptual Reasoning Index; WMI = Working Memory Index; PSI = Processing Speed Index; VIQ = Verbal IQ; NVIQ = Nonverbal IQ; WM = Working Memory
a Four participants did not have a valid FSIQ.
*p < .05

Full-Scale IQ

As anticipated, FSIQ scores on the two measures were highly correlated, r = 0.88, p < .001

(see Figure 1). However, a paired samples t test revealed that FSIQ scores on the WISC-IV were

significantly lower than the FSIQ scores on the SB-5, t(35) = -2.05, p = .048, d = -0.17 (see Table 5).


Significant differences between FSIQ scores on the two measures remained when the lowest

possible scores (standard score = 40) were included for the three participants who were unable to

obtain a score on the VCI of the WISC-IV due to scores of 0 on two of the three verbal subtests,

t(38) = - 2.27, p = .03, d = -0.16. These differences also remained when the three participants who

scored less than 60 on the SRS, indicating social impairment symptoms below clinical threshold,

were removed, t(32) = - 2.10, p = .04, d = -0.20.

In the sample with complete data (n = 36), the average difference revealed that the SB-5

FSIQ scores were 3.25 (SD = 9.51) IQ points higher than the WISC-IV FSIQ, with a range of a 0 to

23-point difference. Although SB-5 scores were higher, on average, differences ranged from the SB-

5 being 23 points higher than the WISC-IV to the WISC-IV being 22 points higher than the SB-5.

Although IQ test scores are computed with some degree of error, significant differences in an individual’s scores raise some concern. Confidence intervals, which use margins of error to guide interpretation of individual IQ scores, provide assurance that a given score falls somewhere in the interval range provided within the test manual. In the SB-5 and WISC-IV, both the 90% and 95% confidence intervals have various ranges, but all fall within 12 IQ points of one another. Thus, a difference of a standard deviation (15 points) was used in the present analyses to indicate a significant difference.
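The paired t statistic and effect size reported above for the FSIQ comparison can be recovered from the summary statistics alone. The sketch below is a worked check, not the author's analysis code. The t formula is the standard paired-samples statistic; the effect size formula (mean difference divided by the root mean square of the two tests' standard deviations) is an assumption, chosen because it agrees with the reported d, and the section does not state which formula was used.

```python
from math import sqrt

# Summary statistics reported for the 36 participants with both FSIQ scores.
n = 36
mean_wisc, sd_wisc = 78.81, 19.38   # WISC-IV FSIQ
mean_sb, sd_sb = 82.06, 19.03       # SB-5 FSIQ
sd_diff = 9.51                      # SD of the paired differences

# Paired-samples t: mean difference over its standard error.
mean_diff = mean_wisc - mean_sb
t = mean_diff / (sd_diff / sqrt(n))

# Effect size (assumed formula): mean difference over the pooled SD of the two tests.
pooled_sd = sqrt((sd_wisc ** 2 + sd_sb ** 2) / 2)
d = mean_diff / pooled_sd

print(f"t({n - 1}) = {t:.2f}, d = {d:.2f}")  # prints "t(35) = -2.05, d = -0.17"
```

The same two lines of arithmetic apply to the verbal, nonverbal, and working memory comparisons by substituting the corresponding means and standard deviations.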

Of the 36 participants for whom an FSIQ was calculated on both the WISC-IV and SB-5, the majority (71.8%) scored higher on the SB-5, with 11.1% of participants scoring at least one standard deviation (15 points) higher. One participant had an FSIQ score on the SB-5 that was 1.5 standard deviations higher than the WISC-IV FSIQ score. Two participants achieved exactly the same FSIQ on the two measures, and 8 (22.2%) scored higher on the WISC-IV. Only one participant obtained an FSIQ on the WISC that was at least a standard deviation higher than that of the SB-5 and none of the participants scored 1.5 standard deviations higher on the WISC-IV. Both

the WISC-IV and SB-5 Full Scale IQ scores were significantly positively correlated with the SB-5

Abbreviated IQ (r = .95, p < .01 and r = .82, p < .01, respectively).


Figure 1. Correlation between Full-Scale IQ (FSIQ) scores on the WISC-IV and SB-5 (n = 36). The dotted line represents unity across the two measures. The horizontal and vertical lines indicate the cutoff for intellectual disability at IQ<70.

Verbal Intelligence Scores

Verbal intelligence scores on the WISC-IV (VCI) were highly correlated with verbal intelligence scores on the SB-5 (VIQ; r = 0.86, p < .001), as illustrated in Figure 2. However, VCI scores on the WISC-IV were significantly higher than VIQ scores on the SB-5, t(36) = 2.23, p = .03; d = 0.19. Significant differences between verbal scores on the two measures remained when the lowest possible scores (standard score = 45) were included for the three participants who were unable to obtain a score on the VCI of the WISC-IV due to scores of 0 on two of the three verbal subtests, t(39) = 2.25, p = .03; d = 0.17. When the individuals who scored below the clinical cutoff on the SRS were removed, however, the t test was not significant, t(33) = 1.95, p = .06; d = 0.19, which may be due to a loss of power.

As seen in Table 5, the WISC-IV VCI was, on average, 3.89 (SD = 10.62) IQ points higher than SB-5 VIQ for the 37 participants with complete data, ranging from a 0 to 33-point difference.

Differences occurred in both directions, with individual scores ranging from the VIQ on the SB-5 being 16 points higher than the WISC-IV’s VCI to the VCI being 33 points higher than the VIQ. In contrast to the pattern seen with the FSIQ, the majority of participants (62.2%) scored higher on the WISC-IV than the SB verbal scale, with 16.2% of participants scoring at least one standard deviation higher. One participant had a verbal IQ score on the WISC-IV more than two standard deviations higher than the SB-5 VIQ score. However, most (81.1%) obtained a verbal score on the SB-5 that was within a standard deviation of their score on the WISC-IV. Differences were also seen in the opposite direction with 14 (35.0%) participants scoring higher on the SB-5 VIQ than the WISC-IV VCI. Only one participant obtained a VIQ on the SB-5 that was at least a standard deviation higher than the VCI on the WISC-IV and no participants scored more than 1.5 standard deviations higher.

Figure 2. Correlation between Verbal Comprehension Index (VCI) of the WISC-IV and the Verbal IQ (VIQ) of the SB-5 (n = 37). The dotted line represents unity across the two measures. The horizontal and vertical lines indicate the cutoff for intellectual disability at IQ<70.


Nonverbal Intelligence Scores

Nonverbal intelligence scores on the WISC-IV (PRI) were highly correlated with the NVIQ of the SB-5 (r = 0.82, p < .01), as illustrated in Figure 3, and the two scores did not differ significantly from one another, t(39) = 1.47, p > .05, d = 0.14. Although not significant, the average difference revealed that the WISC-PRI was 2.83 (SD = 12.14) IQ points higher than the SB-5

NVIQ, with a range from 0 to 37-points. Similar to the other cognitive scores, differences occurred in both directions, with the SB-5 NVIQ score being as much as 26 points higher than the PRI for one child, and the PRI being as much as 37 points higher than the NVIQ for another child.

Approximately half of participants (55.0%) obtained a PRI score on the WISC-IV that was higher than their NVIQ score on the SB-5 and 35.0% obtained a higher NVIQ score. Only 5.0% of the sample scored at least a standard deviation higher on the SB-5 NVIQ and 12.5% scored at least a standard deviation higher on the WISC-IV PRI.

Figure 3. Correlation between Perceptual Reasoning Index (PRI) of the WISC-IV and the Nonverbal IQ (NVIQ) of the SB-5 (n = 40). The dotted line represents unity across the two measures. The horizontal and vertical lines indicate the cutoff for intellectual disability at IQ<70.


Working Memory Scores

Working memory scores on the WISC-IV (WMI) were highly correlated with those of the

SB-5 (WM) as illustrated in Figure 4 (r = 0.78, p < .01) and did not differ significantly from one another, t(39) = -1.14, p > .05, d = -0.12. The average difference between working memory scores on the two measures was 2.40 (SD = 13.32), in the direction of SB-5 WM being higher than WISC-

IV WMI, with discrepancies ranging from 0 to 37 points. The SB-5 WM score was a maximum of 27 points higher than the WISC-IV WMI for one child, and the WMI was a maximum of 37 points higher than the WM for another child.

Figure 4. Correlation between Working Memory Index (WMI) of the WISC-IV and the Working Memory factor score (WM) of the SB-5 (n = 40). The dotted line represents unity across the two measures.

The majority of participants (60.0%) obtained a WM score on the SB-5 that was higher than their WMI score on the WISC-IV, although most participants (67.5%) obtained working memory scores on the two measures that were within one SD of one another. Twenty percent of the sample (n = 8) received SB-5 WM scores that were between 1 and 2 SDs higher than their WISC-IV WMI score. Similar to other score differences between measures, discrepancies were also seen in the other direction, with 4 individuals (10.0%) scoring between 1 and 2 SDs higher on the WISC-IV WMI, and one child (2.5%) scoring more than 2 SDs higher on the WMI.

Classification of Intellectual Functioning Based on Index Scores Across Measures

The discrepancies between IQ scores on the two measures were also examined in terms of their implications for categorical classification. FSIQ scores are used to classify individuals into qualitative ranges of cognitive ability (i.e., superior: IQ ≥ 120; average: 80 ≤ IQ ≤ 119; borderline: 70 ≤ IQ ≤ 79; ID: IQ < 70). Most of the participants in the sample (83.3%) were classified into the same descriptive category by both IQ measures, although six participants were classified differently. Of these six individuals, two (5.6% of total sample) were classified as ID on the WISC-IV but borderline on the SB-5. Two participants (5.6% of total sample) were classified as borderline on the WISC-IV but average on the SB-5. Finally, two participants were classified as average on the WISC-IV; on the SB-5, one (2.8% of total sample) was classified as borderline and the other (2.8% of total sample) as superior. Independent t tests did not reveal any significant differences in cognitive, behavioral, or adaptive functioning measures between participants whose FSIQ scores on the two measures were in the same descriptive category and those who had different classifications, p > .05 for all comparisons.
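As a rough illustration of how the qualitative ranges described above turn a score into a category, and how agreement between two measures can then be checked, the following Python sketch may be useful. The function names `classify_iq` and `same_category` are hypothetical, the example scores are invented, and treating a score of exactly 120 as superior is an assumption about the boundary.

```python
def classify_iq(score: int) -> str:
    """Map a standard score to a qualitative range (cutoffs taken from the text)."""
    if score < 70:
        return "ID"
    if score <= 79:
        return "borderline"
    if score <= 119:
        return "average"
    return "superior"  # assumption: 120 and above


def same_category(wisc_score: int, sb_score: int) -> bool:
    """True when both measures place the individual in the same range."""
    return classify_iq(wisc_score) == classify_iq(sb_score)


# Hypothetical pair of scores, for illustration only (not study data)
print(classify_iq(72), classify_iq(81), same_category(72, 81))
```

A nine-point difference that straddles a cutoff, as in this invented pair, is enough to change the descriptive category even though the scores are well within one standard deviation of each other.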

The IQ classifications of cognitive ability were also compared based on VIQ and VCI scores. IQ classifications based on the two measures were consistent in 75.7% of participants. For the n = 9 cases in which different IQ classifications resulted, two (5.4% of the full sample) scored in the borderline range on the WISC-IV VCI but the average range on the SB-5 VIQ, and two (5.4%) scored in the superior range on the WISC-IV, but the average range on the SB-5; the remainder were classified in the average range on the WISC-IV but in the borderline (n = 3; 8.1%) or ID (n = 2; 5.4%) ranges on the SB-5.

IQ classification comparisons between the nonverbal scores (PRI and NVIQ) revealed the same IQ classifications in 75.0% of cases. Of the 10 participants whose scores were classified differently on the two tests, one child (2.5% of total sample) had received a score on the WISC-IV in the ID range (IQ < 70) and one (2.5% of the total sample) in the borderline range (IQ between 70 and 79), but fell one classification category higher on the SB-5. The largest discrepancies in categorization were observed for participants who scored in the average range on the WISC-IV PRI. Nineteen of the 25 participants in this category scored in the average range on the SB-5 NVIQ; however, two participants (5.0% of total sample) scored in the superior range, three participants (7.5% of total sample) scored in the borderline range, and one individual (2.5% of total sample) scored in the ID range on the SB-5 NVIQ. The one child whose scores differed by two cognitive ability ranges obtained a PRI of 84 and an NVIQ of 47. Finally, two of the three participants (5.0% of total sample) classified as superior based on the WISC-IV PRI scored in the average range on the SB-5.

Differences in qualitative classification were not examined for the working memory measures, because it would be unusual to use this index alone in clinical work or research in order to quantify intelligence.

Classification of IQ scores into categories (i.e., ID, borderline, superior) is useful both clinically and in research; however, studies that focus on samples of high- or low-functioning children and adolescents do not rely exclusively on these descriptive categories. For the purposes of sample selection in research studies of high-functioning individuals with ASD, various cutoff FSIQ scores are used to identify those in the ‘Average’ or ‘Unimpaired’ range. Cut-scores of 70 (e.g., Goldstein, et al., 2008; Mayes & Calhoun, 2008) and 85 (e.g., Ozonoff, South, & Miller, 2000) have been used in previous studies. Research that compares high-functioning individuals with ASD to other clinical or normative groups often uses only verbal IQ scores (Klin, et al., 2007) or both verbal and FSIQ scores (e.g., Goldstein, et al., 2008; Siegel, et al., 1996) to identify potential participants who are average or unimpaired. As noted, cutoff scores in these types of studies vary. Nonverbal IQ scores are also used in studies to identify individuals with high-functioning autism (e.g., Howlin, et al., 2004; Szatmari, et al., 2003). Table 6 details how individuals in this sample would be classified as falling below or above the two cutoff scores used to define high functioning (IQ < 70 or IQ < 85), based on the FSIQ, verbal, and nonverbal scores on the WISC-IV and SB-5.

Table 6

Classification of Cognitive Ability by FSIQ Scores on Wechsler Intelligence Scale for Children, 4th edition (WISC-IV) and Stanford Binet, 5th edition (SB-5)

Score           n     <70           70+           <85           85+
WISC-IV FSIQ    36    10 (27.8%)    26 (72.2%)    22 (61.1%)    14 (38.9%)
SB-5 FSIQ       36     8 (22.2%)    28 (77.8%)    19 (52.8%)    17 (47.2%)
WISC-IV VCI     37    12 (32.4%)    25 (67.6%)    18 (48.6%)    19 (51.4%)
SB-5 VIQ        37    15 (40.5%)    22 (59.5%)    20 (54.1%)    17 (45.9%)
WISC-IV PRI     40     9 (22.5%)    31 (77.5%)    17 (42.5%)    23 (57.5%)
SB-5 NVIQ       40     8 (20.0%)    32 (80.0%)    18 (45.0%)    22 (55.0%)

Note: Only the 36 individuals with FSIQ scores on both tests were included; VCI = Verbal Comprehension Index; VIQ = Verbal IQ; Only the 37 individuals with VCI scores on the WISC-IV were included; PRI = Perceptual Reasoning Index; NVIQ = Nonverbal IQ

Positive predictive values and negative predictive values (PPV and NPV) were computed to determine the proportion of individuals identified as ID (IQ < 70) on one measure who were classified similarly on the second measure. Because one test is not more accurate than the other, and thus there is no ‘correct’ classification, PPV and NPV were computed twice, once with the WISC-IV and once with the SB-5 being used as the ‘gold standard.’ Data are presented in Tables 7 and 8. Using the WISC-IV as the standard, the PPV indicated that 100% of those who scored less than 70 on the SB-5 were also classified as ID on the WISC-IV. However, when the SB-5 was used as the standard, the PPV indicated that 80% of individuals who were classified as ID on the WISC-IV were also classified as ID on the SB-5. The NPV using the WISC-IV as the standard was 93% and was 100% when the SB-5 was used as the standard. The sensitivity was 80% and specificity was 100% with the WISC-IV as the standard. Using the SB-5 as the standard, sensitivity was 100% and specificity was 93%.

Both the WISC-IV and SB-5 demonstrated good concordance in their identification of ID versus not ID given the high sensitivity and specificity. Further, both measures minimize false positive and false negative rates, as evidenced by the high positive and negative predictive values.

Table 7

Classification of Intellectual Disability (ID) based on Wechsler Intelligence Scale for Children, 4th edition (WISC-IV) and Stanford Binet, 5th edition (SB-5) FSIQ scores with WISC-IV as the Standard

                         WISC-IV Full-scale IQ
SB-5 Full-scale IQ       ID        Not ID
ID                       8         0
Not ID                   2         26

Note: Only the 36 individuals with FSIQ scores on both tests were included

Table 8

Classification of Intellectual Disability (ID) based on Wechsler Intelligence Scale for Children, 4th edition (WISC-IV) and Stanford Binet, 5th edition (SB-5) scores with SB-5 as the Standard

                           SB-5 Full-scale IQ
WISC-IV Full-scale IQ      ID        Not ID
ID                         8         2
Not ID                     0         26
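The sensitivity, specificity, PPV, and NPV values reported for the ID classification can be recomputed from the four cell counts in Tables 7 and 8. The sketch below uses the standard definitions of these quantities; the function name `diagnostic_metrics` is hypothetical, and the measure not serving as the standard is treated as the test being evaluated.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 agreement statistics.

    tp: ID on both measures
    fp: ID on the evaluated measure only
    fn: ID on the standard only
    tn: not ID on either measure
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# WISC-IV as the standard, SB-5 evaluated (counts from Table 7)
wisc_standard = diagnostic_metrics(tp=8, fp=0, fn=2, tn=26)

# SB-5 as the standard, WISC-IV evaluated (counts from Table 8)
sb_standard = diagnostic_metrics(tp=8, fp=2, fn=0, tn=26)

for label, result in (("WISC-IV standard", wisc_standard), ("SB-5 standard", sb_standard)):
    print(label, {name: round(value, 2) for name, value in result.items()})
```

Because the same 36 pairs of scores underlie both tables, switching the standard simply swaps sensitivity with PPV and specificity with NPV.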

Verbal-Nonverbal IQ Discrepancy Scores on the WISC-IV and SB-5

Cognitive profiles based on discrepancies between verbal and nonverbal intelligence scores have been used to corroborate specific neuropathologies, ASD phenotypes, and chromosomal abnormalities (Chapman, et al., 2011; Deutsch & Joseph, 2003; Joseph, et al., 2002). A correlation between these discrepancy scores across the two IQ measures revealed that they were significantly related, r = .48, p < .01. A paired samples t test showed that the average verbal-nonverbal discrepancy on the WISC-IV (M = 6.38, SD = 16.42), with nonverbal scores being higher than verbal scores, did not differ significantly from the average verbal-nonverbal difference on the SB-5, which also showed nonverbal scores to be generally higher than verbal scores, M = 8.00, SD = 11.21; t(36) = 0.67, p > .05. Verbal-nonverbal discrepancy scores on the WISC-IV ranged from 41 points, with the nonverbal score higher, to 32, with the verbal score higher. The range of verbal-nonverbal discrepancy scores on the SB-5 was somewhat more restricted, with a maximum difference of 31 in the direction of a higher nonverbal IQ and only 12 points in the direction of a higher verbal IQ.

Significant verbal-nonverbal discrepancies are classified in the ASD literature in a number of ways. Clinically, the statistically significant discrepancy based on the normative sample, which is obtained from the respective test manual, is used. The WISC-IV (but not the SB-5) provides separate base rates at each ability level as well as overall base rates for the normative sample, because the test developer's data analysis revealed that verbal-nonverbal discrepancies differed across levels of IQ (Wechsler, 2003). In the present study, only the age group base rates provided within both test manuals were used, for maximum consistency across measures. Some researchers, however, do not use test manuals to identify discrepancy, but instead rely on set discrepancies, ranging from a 12-point discrepancy (e.g., Ozonoff, et al., 2000; Siegel, et al., 1996) to a 15-point discrepancy (e.g., Chapman, et al., 2011). Given the implications of these verbal-nonverbal discrepancies, it is important to evaluate how various measures and discrepancy criteria classify children and adolescents. Table 9 illustrates the number of children and adolescents in the sample who were identified as having a significant verbal-nonverbal IQ discrepancy based on the respective test manuals (Roid, 2003; Wechsler, 2003), on a 12-point discrepancy, and on a 15-point discrepancy, as well as the direction (higher verbal versus nonverbal) of any significant discrepancy.
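To make the fixed-cutoff criteria concrete, the sketch below shows one way a verbal-nonverbal discrepancy could be classified and agreement between two tests tallied. It covers only the 12-point and 15-point rules, not the manual-based criteria, which depend on normative tables. The function names and the example scores are hypothetical, and counting a difference exactly equal to the cutoff as significant is an assumption.

```python
def classify_discrepancy(verbal: int, nonverbal: int, cutoff: int) -> str:
    """Label a verbal-nonverbal difference using a fixed point cutoff."""
    diff = verbal - nonverbal
    if diff >= cutoff:       # assumption: a difference equal to the cutoff counts
        return "V>NV"
    if diff <= -cutoff:
        return "V<NV"
    return "V=NV"


def percent_agreement(wisc_pairs, sb_pairs, cutoff: int) -> float:
    """Percentage of individuals given the same label by both tests."""
    same = sum(
        classify_discrepancy(*w, cutoff) == classify_discrepancy(*s, cutoff)
        for w, s in zip(wisc_pairs, sb_pairs)
    )
    return 100.0 * same / len(wisc_pairs)


# Hypothetical (verbal, nonverbal) scores for three children; not study data
wisc = [(80, 95), (100, 90), (70, 84)]
sb = [(82, 90), (98, 92), (68, 85)]

for cutoff in (12, 15):
    print(cutoff, round(percent_agreement(wisc, sb, cutoff), 1))
```

In this invented example the stricter 15-point rule lowers agreement, which shows only that agreement can move in either direction when the cutoff changes; it says nothing about the pattern in the study sample.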

In total, 18 (48.6%) individuals had significant verbal-nonverbal discrepancies on the WISC-IV and 16 individuals (43.2%) had significant verbal-nonverbal discrepancies on the SB-5 using the respective test manual cutoffs, which is comparable to what is seen in other ASD samples (Charman, et al., 2011). Although this may appear to indicate reliability in reporting across measures, at the individual level only 59.5% of the entire sample was classified similarly across the two tests. Eighteen individuals had a significant verbal-nonverbal discrepancy on the WISC-IV while 16 had a significant discrepancy on the SB-5. However, only 9 of the 18 who had discrepancies on either test were classified as having the same significant discrepancy classification (i.e., V<NV) on both tests. Of the 18 individuals classified as having a significant discrepancy on the WISC-IV, 50.0% were classified as having no such discrepancy on the SB-5. Similarly, 43.8% of the 16 individuals classified as having a significant verbal-nonverbal discrepancy on the SB-5 did not have significant discrepancies on the WISC-IV.

Table 9

Classification of Verbal-Nonverbal IQ Score Discrepancy Based on Various Criteria for Discrepancy, n = 37

IQ discrepancy based on test manuals

              WISC-IV
SB-5          V>NV        V=NV          V<NV         Total
V>NV          0 (0%)       2 (5.4%)     0 (0%)        2 (5.4%)
V=NV          3 (8.1%)    13 (35.2%)    5 (13.5%)    21 (56.8%)
V<NV

IQ discrepancy based on 12-pt difference

              WISC-IV
SB-5          V>NV        V=NV          V<NV         Total
V>NV          0 (0%)       1 (2.7%)     0 (0%)        1 (2.7%)
V=NV          3 (8.1%)    15 (40.6%)    6 (16.2%)    24 (64.9%)
V<NV

IQ discrepancy based on 15-pt difference

              WISC-IV
SB-5          V>NV        V=NV          V<NV         Total
V>NV          0 (0%)       0 (0%)       0 (0%)        0 (0%)
V=NV          3 (8.1%)    19 (51.4%)    5 (13.5%)    27 (73.0%)
V<NV

With a 12-point cutoff being used to indicate a significant discrepancy between verbal and nonverbal scores on a given test, there was 62.2% agreement across all verbal-nonverbal discrepancy classifications between the WISC-IV and SB-5. Although this agreement is slightly higher than what was revealed using the test manuals, analysis of the 18 individuals who had significant verbal-nonverbal differences on the WISC-IV revealed that only 44.4% met the cutoff for a significant verbal-nonverbal discrepancy on the SB-5. Of the 13 identified on the SB-5 as having a significant discrepancy, 61.5% were identified similarly on the WISC-IV. Using a 15-point discrepancy to identify significance resulted in further misclassification, as only 6 of the 14 (42.9%) individuals identified as having significant verbal-nonverbal discrepancies on the WISC-IV revealed significant discrepancies on the SB-5. However, there was generally more overall agreement between tests (67.6%) using this cutoff. Based on an independent samples t test, the group classified similarly on the two tests (i.e., both tests indicate significant or non-significant verbal-nonverbal discrepancy) did not differ on any cognitive, behavioral, or ASD-diagnosis specific measure from those classified differently by the two tests, p > .05 for all comparisons.

Variables Associated with Score Differences Between WISC-IV and SB-5

The second aspect of Aim 2 was to better understand factors that may explain the differences between FSIQ, verbal, nonverbal, and working memory scores on the two measures. Correlations were computed between the various difference scores (i.e., the WISC-IV minus SB-5 score, where a negative number indicates a higher SB-5 score and a positive number indicates a higher WISC-IV score) and demographic, cognitive, and behavioral variables. Specifically, the following variables were included: age, gender, FSIQ, verbal, nonverbal, and working memory scores on both measures, WISC-PSI, SB factor scores (FR, KN, QR, and VS), as well as verbal-nonverbal discrepancy scores within each IQ measure, CELF, VMI, and Theory of Mind scores, CBCL internalizing, externalizing, and total behavior problems, scaled score from the RBS-R, SRS total score, CCC General Communication Composite (GCC), and Vineland Communication, Daily Living Skills, Socialization, and Adaptive Behavior Composite scores. Table 10 displays all correlations.

Table 10

Correlations Between IQ Difference Scores and Other Cognitive, Behavioral, and Diagnostic Variables

Variable                                          FSIQ.diff   V.diff    NV.diff   WM.diff
FSIQ difference (WISC FSIQ – SB FSIQ)             1.
Verbal difference (WISC VCI – SB VIQ)             .63***      1.
Nonverbal difference (WISC PRI – SB NVIQ)         .70***      .19       1.
Working memory difference (WISC WMI – SB WM)      .56***      .26       .31       1.
Age (in years)                                    -.38**      -.05      -.32**    -.29
Sex                                               -.17        -.14      .03       -.21
WISC.Full-Scale IQ (FSIQ)                         .28         .03       .14       .27
WISC.Verbal Comprehension Index (VCI)             .14         .29       -.07      .11
WISC.Perceptual Reasoning Index (PRI)             .24         -.04      .28       .18
WISC.Working Memory Index (WMI)                   .27         -.03      -.07      .50**
WISC.Processing Speed Index                       .33**       -.05      -.07      .00
WISC.Verbal-nonverbal discrepancy                 -.13        .41**     -.54      -.14
SB.FSIQ                                           -.21        -.21      -.28      -.04
SB.Verbal IQ                                      -.19        -.23      -.22      -.06
SB.Nonverbal IQ                                   -.22        -.17      -.32**    -.01
SB.Fluid Reasoning                                -.23        -.24      -.26      .04
SB.Knowledge                                      -.13        -.09      -.23      -.01
SB.Quantitative Reasoning                         -.24        -.30      -.27      .01
SB.Visual-spatial                                 -.20        -.09      -.24      -.08
SB.Working Memory                                 -.14        -.24      -.30      -.16
SB.Verbal-nonverbal discrepancy                   .02         -.14      .16       -.11
Clinical Evaluation of Language Fundamentals      .11         .12       -.08      .18
Beery Visual-motor Integration                    .23         -.09      .25       .11
Theory of Mind (NEPSY)                            -.14        .05       -.32**    -.19
Child Behavior Checklist (CBCL)-Internalizing     -.24        -.17      .10       -.17
CBCL.Externalizing                                -.12        -.14      .05       -.14
CBCL.Total Behavior Problems                      -.23        -.25      .15       -.22
Repetitive Behavior Scale, Revised                -.05        -.05      .26       -.19
Social Responsiveness Scale                       -.28        -.17      .03       -.27
Child Communication Checklist                     .25         .27*      -.14      .29*
Vineland (VABS) Communication                     .27         .17       -.08      .35**
VABS.Daily Living Skills                          -.14        -.11      -.30      -.11
VABS.Socialization                                .18         .15       -.10      .13
VABS.Adaptive Behavior Composite                  .10         .05       -.18      .12

***p < .01 **p < .05 *p < .10

Full-scale, verbal, nonverbal, and working memory difference scores were not significantly related to gender, FSIQ on either measure, WISC-VCI or WISC-PRI, SB-VIQ, FR, KN, QR, VS, WM, SB-5 verbal-nonverbal discrepancy, CELF, VMI, externalizing or total behavior, RBS-R, SRS, or Vineland Adaptive Behavior Composite, Daily Living Skills, or Socialization scores, p > .05. Four independent t tests compared low versus high functioning individuals, where groups were defined using cutoff scores of FSIQ = 70 and FSIQ = 85 on both measures, and no significant differences were found, p > .05.

Differences between FSIQ scores. The difference in full-scale IQ scores across measures was negatively correlated with age (in months), r = -.38, p = .02. Thus, older participants were more likely than younger ones to have a higher SB-5 FSIQ score relative to the WISC-IV FSIQ (see Figure 5).

Figure 5. Correlation between differences on FSIQ between measures and age (r = -.38). Note: FSIQ score differences = WISC-IV - SB-5; a positive number indicates WISC-IV is higher.

FSIQ score differences were positively correlated with WISC-IV Processing Speed Index, r = .33, p = .046, indicating that better processing speed abilities, as measured by the WISC-IV PSI, were associated with relatively higher WISC-IV FSIQ scores as compared to SB-5 FSIQ. Given that PSI scores are incorporated into WISC-FSIQ, this finding is not surprising. However, the fact that this index score, and not other scores (VCI, PRI, or WMI), contributed to differences is of interest.

Differences between verbal IQ scores. Verbal score differences across the two measures were significantly and positively associated with WISC-IV verbal-nonverbal discrepancy scores, r = .41, p = .01. Thus, individuals with a verbal advantage on the WISC-IV were more likely to perform better on this measure than on the SB-5. This relationship is illustrated in Figure 6. There was a trend for the General Communication Composite (GCC) on the CCC-2 to be positively related to the verbal IQ difference scores between the measures, r = .27, p = .10. Thus, children with better communication abilities had a tendency to perform better on the verbal subtests of the WISC-IV as compared to the SB-5, although this was not statistically significant and the trend disappeared when the analysis was repeated without one participant who had a verbal score difference of 33 points.

Figure 6. Correlation between verbal score differences between measures and WISC-IV verbal-nonverbal discrepancy (r = .41). Note: Verbal score difference = WISC-IV VCI - SB-5 VIQ; a positive number indicates WISC-IV is higher.

Differences between nonverbal IQ scores. The difference between nonverbal IQ scores on the WISC-IV and SB-5 was negatively associated with age, r = -.32, p = .04, indicating that a higher SB-5 nonverbal score relative to the WISC-IV nonverbal score is associated with older ages (see Figure 7). Post-hoc analyses revealed that this difference was primarily due to the significant positive correlations between age and SB-5 nonverbal fluid reasoning scores, r = .33, p = .04, and age and SB-5 nonverbal knowledge scores, r = .35, p = .03, with higher nonverbal fluid reasoning and knowledge scores being associated with older ages. Age was not correlated with any WISC Index score or subtest score, p > .05.

Figure 7. Correlation between nonverbal score differences between measures and age (r = -.32). Note: Nonverbal score difference = WISC-IV PRI - SB-5 NVIQ; a positive number indicates WISC-IV is higher.

Differences between working memory scores. The difference between working memory scores on the two tests was positively associated with the Vineland Communication scores, r = .35, p = .03, indicating that better Vineland Communication was associated with an advantage on WISC-IV Working Memory relative to SB-5 Working Memory. However, when the outlier was removed, there was no significant correlation, p > .05. Similarly, communication ability, as measured by the General Communication Composite (GCC) on the CCC-2, trended toward being positively related to the working memory difference scores between the measures, r = .29, p = .08. Specifically, there was a trend for higher scores on the GCC, indicating greater communication abilities, to be associated with a better performance on the working memory subtests of the WISC-IV (WMI) as compared to working memory subtests on the SB-5 (WM). However, when the largest WM difference score of 37 was removed, there was no longer a trend.

Nonverbal score differences were negatively correlated with NEPSY Theory of Mind (ToM), r = -.32, p = .04, indicating that better theory of mind skills were associated with relatively higher SB-5 NVIQ scores as compared to WISC-IV PRI. As expected, the correlation between NEPSY ToM and SB-5 NVIQ was significantly higher (r = .61, p < .001) than the correlation between NEPSY ToM and WISC-IV PRI (r = .42, p = .001). This relationship is illustrated in Figure 8.

Figure 8. Correlation between nonverbal score differences between measures and NEPSY Theory of Mind z scores (r = -.32). Note: Nonverbal score difference = WISC-IV PRI - SB-5 NVIQ; a positive number indicates WISC-IV is higher.

Test Order Effects on Group Differences

The effect of test order was examined by comparing all IQ and Index scores for the test administered first to the test that was administered second. Paired samples t tests revealed no significant differences based on test order on FSIQ, t(35) = 1.20, p = .24, d = 0.72, verbal scores, t(36) = -0.48, p = .67, or nonverbal scores, t(39) = -1.10, p = .28. As noted previously, test order was counterbalanced, with some individuals being administered the WISC-IV first (n = 16) and others the SB-5 first (n = 24). The effect of test order was further analyzed by comparing all IQ and Index scores for those who were administered the WISC-IV first to those who were administered the SB-5 first. Sample sizes of the two groups differed across analyses due to the inability to obtain FSIQ scores for individuals who scored too low on the WISC-IV verbal subtests and for the one participant for whom a processing speed index score could not be calculated. FSIQ scores were compared for the 36 participants who obtained scores on both measures and verbal scores were compared for the 37 participants who obtained a VCI score on the WISC-IV. The scores for all 40 participants were included in the WISC-IV PRI and all SB-5 score comparisons.

Independent t tests revealed that children and adolescents who were administered the WISC-IV first scored significantly higher on WISC-IV FSIQ than those who were administered the SB-5 first, t(34) = 2.06, p = .047 (see Table 11). The average difference between the two groups on the WISC-IV FSIQ score was 13.06 (SE = 6.34).
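The test-order comparison on WISC-IV FSIQ can be approximately reproduced from the group sizes, means, and standard deviations listed in Table 11. The sketch below assumes an equal-variance (pooled) independent-samples t test, which the text does not state explicitly; the function name `pooled_t_from_summary` is hypothetical.

```python
import math


def pooled_t_from_summary(n1, m1, sd1, n2, m2, sd2):
    """Pooled-variance t test computed from summary statistics (assumed method)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the mean difference
    t = (m1 - m2) / se
    d = (m1 - m2) / math.sqrt(pooled_var)           # standardized difference using the pooled SD
    return t, df, se, d


# WISC-IV FSIQ: WISC-IV first (n = 14) versus SB-5 first (n = 22), values from Table 11
t, df, se, d = pooled_t_from_summary(14, 86.79, 15.72, 22, 73.73, 20.09)
print(f"t({df}) = {t:.2f}, SE = {se:.2f}, d = {d:.2f}")
```

Under this assumption the 6.34 reported alongside the 13.06-point mean difference corresponds to the standard error of that difference. The standardized difference from the pooled SD comes out slightly below the tabled d, which may reflect a different effect-size formula.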

These significant test order group differences remained when the three participants who fell below the threshold for clinically significant social impairments on the SRS were excluded, t(31) = 2.31, p = .03, d = 0.83. However, when the three individuals who scored too low on the verbal WISC-IV subtests to obtain a VCI or FSIQ were assigned the lowest possible score and added to the sample, no significant group differences on WISC-IV FSIQ score resulted, t(37) = 1.26, p = .22,

Table 11

IQ and Factor Scores on the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV) and Stanford Binet, 5th edition (SB-5) by Order of Test Administration

                 WISC-IV First            SB-5 First
Variable         n     M       SD         n     M       SD        t        d
WISC-IV FSIQ     14    86.79   15.72      22    73.73   20.09     2.06*    0.72
SB-5 FSIQ        16    82.81   23.48      24    77.83   19.60     0.73     0.23
WISC-IV VCI      14    89.36   19.66      23    79.91   20.45     1.38     0.47
SB-5 VIQ         16    80.25   23.04      24    74.75   20.59     0.79     0.25
WISC-IV PRI      16    93.88   18.33      24    83.33   20.50     1.66     0.54
SB-5 NVIQ        16    87.44   24.06      24    82.92   17.80     0.68     0.21
WISC-IV WMI      16    79.44   20.03      24    74.33   21.80     0.75     0.24
SB-5 WM          16    81.44   22.13      24    77.00   15.81     0.74     0.23
WISC-IV PSI      16    79.44   20.03      23    77.52   15.28     0.13     0.11
SB-5 FR          16    89.06   23.26      24    83.42   19.24     0.84     0.26
SB-5 KN          16    80.44   21.51      24    78.92   18.66     0.24     0.08
SB-5 QR          16    84.38   19.12      24    81.08   19.83     0.52     0.17
SB-5 VS          16    92.44   21.61      24    85.08   17.95     1.17     0.37

Note: FSIQ = Full scale IQ; VCI = Verbal Comprehension Index; VIQ = Verbal IQ; PRI = Perceptual Reasoning Index; NVIQ = Nonverbal IQ; WMI = Working Memory Index; WM = Working Memory
a Four participants did not obtain an FSIQ, one due to no score on Symbol Search, and therefore, no PSI or FSIQ on the WISC, and three due to inability to obtain a VCI score or FSIQ; thus, those scores were not included in the FSIQ or VCI/VIQ comparisons between tests.
*p < .05

d = 0.41. Two of these three individuals were in the WISC-IV first group and one was in the SB-5 first group.

To further understand the impact of those participants with low scores on the analysis of the order effect, children and adolescents with substantially low FSIQ on either measure (< 60) were excluded, leaving n = 26. The majority of the 10 individuals (80.0%) removed due to low IQ were in the SB-5 first group. There were no significant test order group differences within the remaining 26 participants (p > .05), suggesting that test order effects were a result of individuals with low IQ being randomly assigned to the SB-5 first.

Participants who were administered the WISC-IV first did not obtain scores that differed significantly from those who were administered the SB-5 first on any other index score (see Table 11), or in their verbal-nonverbal discrepancy scores, language ability on the CELF-Screener, visual-motor skills (VMI), theory of mind (NEPSY), or adaptive functioning (Vineland-II), p > .05. Groups did not differ on demographic variables such as age, sex, and race. However, the group that received the WISC-IV first did receive more impaired scores on measures of behavioral (CBCL) functioning. Group differences were significant for externalizing behavior problems, t(37) = 2.56, p = .02, d = 1.03, and total behavior problems, t(37) = 2.13, p = .04, d = 0.78. Descriptive statistics for these behavior scores are provided in Table 12. Correlations between IQ and behavior problems were explored to determine if the higher IQ in the WISC-IV first group accounted for this result.

There were no significant correlations between externalizing or total behavior problems and any of the IQ scores, p > .05. However, internalizing problems were positively correlated with WISC-IV PRI (r = .40, p < .05) and all SB-5 index scores (FSIQ, VIQ, NVIQ), with correlations ranging from .33 to .37. There were no significant differences between groups on ASD symptom variables (SRS, RBS-R, CCC-2 GCC). To summarize, individuals who were administered the WISC-IV first had scores on symptom, cognitive, and adaptive functioning measures that were similar to those who were administered the SB-5 first; however, they had significantly greater externalizing and total behavior problems.

Table 12

Behavior Problem Scores on the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV) and Stanford Binet, 5th edition (SB-5) by Order of Test Administration

                       WISC-IV First          SB-5 First
Variable               n     M       SD       n     M       SD      t        d
CBCL-Internalizing     15    63.20   8.30     24    60.29   9.37    0.98     0.33
CBCL-Externalizing     15    60.07   8.26     24    53.00   8.48    2.56*    0.84
CBCL-Total Behavior    15    66.07   7.92     24    60.71   7.47    2.13*    0.69

*p < .05 **p < .01

Fisher’s r to z transformation was used to compare the correlations between the WISC-IV IQ index scores and the SB-5 IQ index scores across the two test order groups. Results reveal that participants who were administered the SB-5 first had significantly higher correlations between their WISC-IV FSIQ score and their SB-5 FSIQ score than those administered the WISC-IV first, z = -2.17, p < .05. Figure 9 displays these correlations. There were no other significant differences between test order groups across the correlations between the other corresponding index scores, p > .05 (see Table 13).

Figure 9. Correlation between FSIQ scores on WISC-IV and SB-5 by test order group (WISC-IV first, r = 0.75; SB-5 first, r = 0.94); n = 37

Table 13

Comparisons Between Correlations of Parallel IQ Scores on the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV) and Stanford Binet, 5th edition (SB-5) Based on Order of Test Administration

                                            WISC-IV First    SB-5 First
WISC-IV & SB-5 Score Correlations           n     r          n     r        z
Full scale IQ (WISC-IV FSIQ, SB-5 FSIQ)     14    0.75       22    0.94     -2.02*
Verbal (VCI/VIQ)                            14    0.79       23    0.89     -0.93
Nonverbal (PRI/NVIQ)                        16    0.77       24    0.90     -1.28
Working Memory (WMI/WM)                     16    0.76       24    0.82     -0.46

NOTE: All correlations are significant at p < .01; * indicates p < .05. VCI = Verbal Comprehension Index; VIQ = Verbal IQ; PRI = Perceptual Reasoning Index; NVIQ = Nonverbal IQ; WMI = Working Memory Index; WM = Working Memory
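The z statistics in Table 13 follow from the standard Fisher r to z comparison of two correlations from independent groups. A minimal sketch, using only the n and r values in the table, is given below; the function name `compare_independent_correlations` is hypothetical.

```python
import math


def compare_independent_correlations(r1: float, n1: int, r2: float, n2: int) -> float:
    """z statistic for the difference between two independent correlations."""
    z1 = math.atanh(r1)  # Fisher transformation of the first correlation
    z2 = math.atanh(r2)  # Fisher transformation of the second correlation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se


# WISC-IV first group versus SB-5 first group, values from Table 13
z_fsiq = compare_independent_correlations(0.75, 14, 0.94, 22)
z_nonverbal = compare_independent_correlations(0.77, 16, 0.90, 24)
print(round(z_fsiq, 2), round(z_nonverbal, 2))
```

An absolute z above 1.96 corresponds to p < .05 on a two-tailed test, which is why only the full-scale comparison is flagged in the table.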

IQ and Adaptive Functioning

To assess discriminant validity between the measures, as outlined in Aim 3, the relationships between IQ index scores (FSIQ, verbal intelligence, nonverbal intelligence, and working memory) of the two tests and cognitive (CELF, VMI, and theory of mind) and adaptive functioning variables were assessed. The correlations between these scores are outlined in Table 14. All cognitive variables were strongly correlated with IQ scores, p < .05. The Vineland Communication domain was correlated with all eight IQ scores across tests, with a range of r = .35 to .54, with the exception of the WISC-IV PRI, for which the correlation just failed to reach significance, r = 0.32, p = .051. The Vineland Daily Living Skills domain was correlated with all IQ scores, with a range of r = .37 to .53. Finally, Vineland Socialization was moderately correlated with all IQ scores, with the exception of the WISC-IV PRI, which failed to reach significance, r = .31, p = .054.

To determine if the two tests differed significantly in their relationship to adaptive functioning, Fisher’s r to z transformations were used to compare the correlation coefficients between each of the three Vineland domain scores and the corresponding IQ scores (WISC-FSIQ, VCI, PRI, WMI and SB-FSIQ, VIQ, NVIQ, WM) from the two measures. None of the correlations differed significantly, p > .05 for all comparisons, indicating that both tests are associated comparably with Vineland scores (see Table 15).

Table 14

Correlation Between the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV) and Stanford Binet, 5th edition (SB-5) IQ Index Scores and Cognitive and Adaptive Functioning

              Cognitive Factors                 Adaptive Functioning
              CELF      VMI       ToM       V.Com     V.DL      V.Soc
WISC.FSIQ     0.78**    0.64**    0.51**    0.41*     0.40*     0.35*
WISC.VCI      0.85**    0.48**    0.66**    0.39*     0.37*     0.35*
WISC.PRI      0.63**    0.71**    0.42**    0.32      0.38*     0.31
WISC.WMI      0.73**    0.53**    0.43**    0.54**    0.45*     0.37*
SB.FSIQ       0.79**    0.60**    0.69**    0.38*     0.53**    0.38*
SB.VIQ        0.84**    0.60**    0.72**    0.38*     0.50**    0.39*
SB.NVIQ       0.67**    0.55**    0.61**    0.35*     0.52**    0.34*
SB.WM         0.70**    0.53**    0.65**    0.37*     0.51**    0.32*
CELF          1         0.57**    0.72**    0.45*     0.40*     0.50**
VMI                     1         0.40**    0.24      0.35*     0.33*
ToM                               1         0.34      0.44*     0.51**
V.Com                                       1         0.57**    0.75**
V.DL                                                  1         0.65**
V.Soc                                                           1

*p < .05 **p < .01
FSIQ = Full scale IQ; VCI = Verbal Comprehension Index; PRI = Perceptual Reasoning Index; WMI = Working Memory Index; VIQ = Verbal IQ; NVIQ = Nonverbal IQ; WM = Working Memory; CELF = Clinical Evaluation of Language Fundamentals; VMI = Visual Motor Integration; ToM = NEPSY Theory of Mind; V.Com = Vineland Communication; V.DL = Vineland Daily Living Skills; V.Soc = Vineland Socialization


Table 15

Comparisons of Correlations Between Parallel IQ Scores on the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV) and Stanford Binet, 5th edition (SB-5) and Vineland Domain Scores

               Communication                       Daily Living Skills                 Socialization
               WISC-IV       SB-5                  WISC-IV       SB-5                  WISC-IV       SB-5
               n    r        n    r        z       n    r        n    r        z       n    r        n    r        z
Full-scale IQ  35   0.41*    39   0.38*    0.15    35   0.40*    39   0.53**   -0.69   35   0.35*    39   0.38*    -0.14
VCI/VIQ        36   0.39*    39   0.38*    0.05    36   0.37*    39   0.50**   -0.67   36   0.35*    39   0.39**   -0.19
PRI/NVIQ       39   0.32     39   0.35*    -0.14   39   0.38*    39   0.52**   -0.75   39   0.31     39   0.34*    -0.14

* p < .05

** p < .01

VCI = Verbal Comprehension Index; VIQ = Verbal IQ; PRI = Perceptual Reasoning Index; NVIQ = Nonverbal IQ


CHAPTER FOUR

Discussion

Children and adolescents with ASD are routinely administered various tests of intelligence, as this aids in differential diagnosis and has major implications for clinical care and service eligibility, as well as public policy and research practices. The use of IQ test scores in individuals with ASD rests on the assumption that there is strong reliability and validity of the results across measures. However, it has been unclear whether the diagnostic symptoms and cognitive deficits inherent to ASD differentially impact an individual’s ability to perform on IQ measures.

The present study adds to our current understanding of the relationship between the WISC-IV and SB-5 in a group of adolescents with ASD. First, the results suggest that the good convergent validity between the WISC and SB that is documented in typically-developing children on previous versions of the tests (Prewett & Matavich, 1992; Prewett & Matavich, 1994; Roid, 2003) is also present in children and adolescents with ASD. Specifically, the correlations between the corresponding scores on the two measures (i.e., FSIQ, verbal IQ, and nonverbal IQ) were high. In previous studies, these group-level comparisons have been the only ones reported to determine whether two measures were reliable, with little or no attention given to the individual discrepancies between measures. Thus, excellent convergent validity between the WISC and SB is reported in studies of pediatric populations (Lukens & Hurrell, 1996; Phelps, et al., 1988; Prewett & Matavich, 1994) and between the WISC-III and SB-5 in the SB-5 validation sample (Roid, 2003).

Intelligence testing is a population-based science in which individual scores are compared to a larger normative group. Thus, analyzing and comparing IQ tests at the group level is necessary and appropriate for reliability and validity evaluations and for making predictions about how scores on one test compare to those on another. However, these group-level comparisons fail to account for the differences that may be present at an individual level. It is possible that in test validation and research study samples, other group-level comparisons and comparisons at the individual level are less optimistic than the high correlations between IQ scores would suggest, as was the case in the present study. The results of this study call into question the common method of describing reliability between measures by evaluating only correlations between test scores. Clinicians and researchers may be misled when they draw conclusions about individuals’ scores based solely on inter-test correlations.
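The limitation of relying on correlations alone can be illustrated with a small sketch. The scores below are hypothetical and are not data from the present study; the helper `pearson_r` is written out only for illustration. Because a Pearson correlation is unaffected by a constant shift, two tests can correlate perfectly while every individual receives a noticeably different score on each.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores (assumption for illustration only): the second test
# yields a score 12 points higher for every individual.
test_a = [70, 85, 92, 100, 115]
test_b = [score + 12 for score in test_a]

r = pearson_r(test_a, test_b)
differences = [b - a for a, b in zip(test_a, test_b)]
print(round(r, 3), differences)
```

In this constructed case the correlation is perfect, yet no individual obtains the same score on the two tests, which is why individual-level difference scores are examined alongside correlations.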

The hypothesis that the full-scale and verbal IQ scores would be more highly correlated than the nonverbal scores was not supported. As expected based on the number of tasks that contribute to each index, the highest correlation was for FSIQ and the lowest (albeit still quite strong at r = .78) for working memory. The correlation between the nonverbal indices (r = .82) is substantially higher than the .66 correlation between the WISC-III and SB-5 that is published in the SB-5 manual (Roid, 2003). The relationship between nonverbal scores on the SB-5 and the current edition of the WISC, the WISC-IV, has not yet been reported in normative samples; thus, it is possible that the higher correlation in this sample of individuals with ASD is a result of revisions to the current edition of the WISC rather than of developmental differences between the samples. For example, in comparison to the WISC-III, the tasks that make up the WISC-IV nonverbal index place fewer demands on processing speed, include more fluid reasoning, and include both easier and more difficult items in order to assess a broad range of abilities more accurately (Wechsler, 2003). Studies comparing previous WISC and SB versions in a typically-developing population (Rothlisberg, 1987) and in those with ADHD (Saklofske, Schwean, Yackulic, & Quinn, 1994) reveal lower nonverbal IQ score correlations than the present study. Although it cannot be determined from the data available, it may be that the two tests have become increasingly convergent with each new edition. An alternate explanation of the higher nonverbal correlation in the present study is that adolescents with ASD perform more similarly across nonverbal tasks than do typically-developing children. In the absence of a comparison group of typically-developing children, it is not possible from the present data to evaluate this latter possibility.

The majority of children in the present study obtained higher FSIQ scores on the SB-5 compared to the WISC-IV, with a mean difference between three and four points. This difference was statistically significant and in line with our hypothesis, but has little clinical relevance. Furthermore, the vast majority (81.1%) received scores on the two measures that were within one standard deviation of each other. Although this is the first study to compare the WISC-IV and SB-5 in any population, these rates of agreement between scores are considerably higher than what has been reported in other patient populations (e.g., ID and ADHD) on previous versions of the measures, in which the vast majority of participants scored higher on the Stanford Binet (Lukens & Hurrell, 1996; Saklofske, et al., 1994). These previous studies involved younger children or those with lower IQ scores. Therefore, it may be that younger children or those with lower IQs perform better on the SB-5 because of the decreased demands on verbal ability, as described in the manual. It may also be that they are more engaged with the SB-5 tasks compared to those on Wechsler scales, which results in higher scores, as has been noted in a sample of young children (Garred & Gilmore, 2009). Given the verbal level necessary to complete the WISC-IV and SB-5 and the floor scores of these measures, which are 45 and 40, respectively, children with extremely low intellectual functioning were not included in the present study. Furthermore, the study was restricted to individuals 10 years and older.

In contrast to the FSIQ results, where SB-5 scores were higher on average, the sample showed lower verbal scores on the SB-5 compared to the WISC-IV. However, the mean difference of 3.9 IQ points does not have major clinical implications. Although the small difference between the tests’ verbal intelligence scores does not raise clinical concern for the group as a whole, a substantial proportion of the sample (16%) had verbal scores on the WISC-IV that were more than 1 SD higher than verbal scores on the SB-5. Nonverbal and working memory scores showed no significant differences between tests at the group level, but 17.5% of the sample had nonverbal scores that were at least a standard deviation apart, and approximately one-third obtained a working memory score on the WISC-IV that was more than one standard deviation from their working memory score on the SB-5, more often in the direction of the SB-5 being higher. Despite reliability at the group level, individual differences in FSIQ scores were as much as 23 points, and as much as 33 and 37 points for verbal and nonverbal IQ scores, respectively. These differences could have serious implications for an individual in terms of diagnosis, conceptualization of his or her cognitive abilities, prognosis, and treatment recommendations. Significant differences between corresponding test scores are apparent for 12-33% of the sample; however, it is difficult to conclude whether the two measures are more different from one another than they would be in a typically-developing population or more different than they would be if the same measure were administered to the same child on the same day. Without data describing these differences, conclusions specific to ASD cannot be made.
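The individual-level summaries discussed above (mean difference, proportion of participants whose two scores differ by more than one standard deviation, and the largest individual difference) can be computed along the following lines. The function name `discrepancy_summary` and the score lists are hypothetical and are not data from the study; the sketch assumes the standard score metric of both tests, in which one standard deviation equals 15 points.

```python
def discrepancy_summary(scores_a, scores_b, sd=15):
    """Summarize paired differences (test B minus test A) for each individual.

    sd=15 assumes the conventional IQ standard deviation.
    """
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    beyond_one_sd = [d for d in diffs if abs(d) > sd]
    return {
        "n": len(diffs),
        "mean_diff": sum(diffs) / len(diffs),
        "pct_beyond_1sd": 100.0 * len(beyond_one_sd) / len(diffs),
        "max_abs_diff": max(abs(d) for d in diffs),
    }


# Hypothetical paired scores (illustration only).
wisc_scores = [72, 88, 95, 101, 110, 64, 83, 120]
sb_scores = [75, 90, 93, 118, 112, 70, 85, 101]

summary = discrepancy_summary(wisc_scores, sb_scores)
print(summary)
```

In this made-up example the mean difference is under two points, yet a quarter of the individuals differ by more than one standard deviation, which mirrors the contrast between group-level and individual-level results described in the text.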

Assessment of intelligence in the present study took place in an ideal controlled environment. The same children were administered both tests of intelligence on the same day and demonstrated that, overall, they were able to perform reliably in a single session. The absence of a test order effect is further indication of the sample’s consistency in performance across measures. However, it is important to note that this was a select sample of individuals with ASD. They were verbal adolescents with an ASD who had no significant genetic, medical, or psychiatric comorbidities, had English as their primary language, and had not had an intelligence assessment in the preceding three months. Although this may limit generalizability of the findings to higher functioning, idiopathic autism, it remains an important contribution to our understanding of how intelligence is measured in this population. Furthermore, other studies typically administered tests several months apart (Lukens & Hurrell, 1996; Prewett & Matavich, 1994), which may account for the significant differences seen in those samples relative to differences in the present study. The results of the present study showed higher SB-5 FSIQ scores compared to WISC-IV FSIQ, but higher WISC-VCI compared to SB-VIQ, although the difference was minimal (about 3 IQ points). This same pattern was observed in children with ID and those with academic difficulties, although the SB tended to be slightly higher (about 7 IQ points) on average (Lukens & Hurrell, 1996; Prewett & Matavich, 1994). Although the difference in results may be due to the interval between test administrations, it may also be the result of the diagnostic group assessed.

Classification of Intelligence

Reliable classification of ID using an IQ cutoff score of 70 is important for determining

service eligibility and making public policy decisions. Results of the present study demonstrate high

sensitivity and specificity between the WISC-IV and SB-5 in classification of ID in individuals with

ASD, with only 6% of the sample classified as ID on one test but not the other. Thus, service

eligibility would not be greatly impacted based on test selection. Previous work has raised concern

about the use of a cutoff score to diagnose ID and determine eligibility for services (American

Association of Mental Retardation, 2002). However, for individuals with ID and ASD, a cutoff score

of 70 results in good consistency between tests. In addition, policy decisions rely on IQ scores to

classify an intervention as an evidence-based practice (e.g., Dawson, et al., 2010). The reliability of

IQ results in this study indicates that efficacy of interventions at a group level would be reported

similarly, regardless of the test selected.
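The classification agreement described above can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis: the function name, the choice to treat one test as the reference, the rule that a score below 70 counts as ID, and the score lists are all hypothetical.

```python
def id_classification_agreement(reference, comparison, cutoff=70):
    """Cross-classify ID status (score below cutoff) on two tests.

    Treating one test as the reference is an assumption of this sketch.
    """
    tp = fp = tn = fn = 0
    for ref_score, comp_score in zip(reference, comparison):
        ref_id = ref_score < cutoff
        comp_id = comp_score < cutoff
        if ref_id and comp_id:
            tp += 1
        elif ref_id:
            fn += 1
        elif comp_id:
            fp += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "pct_discordant": 100.0 * (fp + fn) / total,
    }


# Hypothetical FSIQ scores (illustration only).
reference_fsiq = [62, 68, 69, 74, 85, 90, 101, 110]
comparison_fsiq = [65, 66, 72, 71, 88, 87, 99, 115]

agreement = id_classification_agreement(reference_fsiq, comparison_fsiq)
print(agreement)
```

The percentage of discordant classifications returned here corresponds to the kind of figure reported in the text (individuals classified as ID on one test but not the other).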

Classification consistency using standard descriptive categories (i.e., ID, borderline, average, superior) at the individual level further supported the good convergent validity between measures. For FSIQ, nearly 85% of the sample received scores on the two measures that placed them in the same cognitive ability categories. The six individuals who were classified differently by the two FSIQs had differences ranging from 14 to 23 points, with only one participant obtaining an FSIQ on the WISC-IV that was within 1 SD of the FSIQ score on the SB-5. For most of these individuals, the shifts between descriptive cognitive ability ranges involved being in the ID, borderline, or average range on the WISC-IV, but one category higher on the SB-5. Thus, for those participants who had discrepant classifications, the differences tended to be the result of significantly lower FSIQ scores on the WISC-IV. In contrast, there was a clear lack of consistency between the two measures in the classification of a single subject’s intelligence based on verbal and nonverbal IQ scores, with 25% of the sample obtaining scores that placed them in different descriptive categories across the two tests. Individuals tended to score lower on the SB-5 VIQ relative to the WISC-IV VCI, but with the nonverbal scores, differences occurred in both directions.

Research design and practices make assumptions about the accuracy and stability of IQ scores and patterns of IQ scores across tests, as these scores are used not only for sample selection and matching but are also often equated across measures in analyses (Mayes & Calhoun, 2003b). The high reliability between IQ scores across tests provides solid support for the comparison of WISC-IV and SB-5 scores within studies of individuals with ASD; however, the significant variability for individuals, across verbal and nonverbal scores in particular, supports the idea of maintaining a single consistent measure across sites and time. Researchers, as well as clinicians, should exercise caution when using and interpreting verbal and nonverbal IQ scores on the WISC-IV or SB-5 to classify individuals with ASD, as one in four children in the current study was classified differently by the two tests in each of these cognitive domains.


Factors Accounting for IQ Score Differences

Individuals with ASD present with specific difficulties in aspects of language, social skills, and behavior that could impact their ability to engage in the testing process. The magnitude and direction of the difference in FSIQ scores were significantly associated with age and processing speed. Older children tended to score higher on the SB-5 relative to the WISC-IV, which may be due in part to the finding that older children had a tendency to score higher on the SB-5 NVIQ than on the WISC-IV PRI. Within the SB-5 NVIQ, older children performed better on the nonverbal fluid reasoning subtest and the nonverbal knowledge subtest. Although these scores are corrected for typical developmental effects, this relationship may reflect the fact that children with ASD who are older may be more engaged in the matrices, object series, and visual aspects of these SB-5 subtests. This age-related effect on scores also implies that the normative data for the two measures, which are corrected for age, do not represent this sample of children with ASD in the same way as they do typically-developing children. This may be due, in part, to the fact that a small percentage of children with ASD are incorporated into the normative group of the WISC-IV (Wechsler, 2003), but not the SB-5. Although this may not account for the significant difference between the two measures and the age-related effects, the impact of incorporation of diagnostic groups into normative samples warrants further investigation.

Given that older children are more likely than younger children to obtain higher nonverbal and full-scale IQ scores on the SB-5 compared to the WISC-IV, younger children with ASD may score higher if they are evaluated with the WISC-IV than if they complete the SB-5. For young children, their observed social or communication impairments may be expected given a low FSIQ score on the SB-5, but may be greater than expected if a higher score on the WISC-IV would have been obtained, impacting the likelihood of receiving an ASD diagnosis. The SB-5 is administered as part of a national initiative to standardize the diagnostic evaluation and medical treatment for individuals with ASD; therefore, clinicians in this network should consider how age impacts SB-5 scores and how these scores inform their diagnostic practice. Regardless of age, 11% of children and adolescents in the present study obtained FSIQ scores on the SB-5 that were at least a standard deviation higher than their FSIQ on the WISC-IV. In terms of diagnoses, this considerable variability at the individual level for a fair percentage of the sample indicates that children and adolescents could be receiving diagnoses on the autism spectrum that require unimpaired cognitive abilities (i.e., Asperger’s disorder) or diagnoses in the range of ID based on the results of one measure but not the other.

Performance on nonverbal tests has been shown to be positively related to both theory of mind and age in a previous study of 31 children with ASD (Joseph & Tager-Flusberg, 2004). Consistent with this report, the theory of mind subtest was moderately to strongly correlated with nonverbal IQ scores on both tests. Furthermore, this study showed that an individual’s theory of mind contributes to the differential performance on two tests of nonverbal skills, with individuals with higher theory of mind scores performing better on the SB-5 NVIQ relative to the WISC-IV PRI. The SB-5 nonverbal subtests may place greater demands on an individual’s ability to take another’s perspective, as in the Picture Absurdities subtest, than do the subtests on the WISC-IV, which rely heavily on nonverbal abstract and visual-spatial reasoning. This suggests that participant age and theory of mind may have partially independent and additive effects on nonverbal intelligence in this population. The strong correlation between age and the theory of mind scores may explain why older children obtain higher scores on specific nonverbal factor scores. This effect of theory of mind on test performance may be specific to ASD given decades of research identifying this cognitive ability as being a core deficit in the cognitive processing of these individuals (Anderson, 2008; Joseph, 1999). Future research should assess the role that multiple aspects of theory of mind abilities have on IQ test performance in other patient populations as well as in individuals with ASD, as this would contribute to our understanding of this effect.

Children with better processing speed had an advantage on the WISC-IV FSIQ relative to the SB-5. These results are confounded by the fact that WISC-IV PSI scores are included in WISC-IV FSIQ; therefore, any increase in PSI will result in an increase in WISC-IV FSIQ. The correlations between the WISC-IV PSI and WISC-IV index scores are similar to the correlations between PSI and the SB-5 factor scores, indicating that this ability is implicated in all tasks on both measures. Although strong conclusions cannot be drawn about the role of processing speed across IQ tests, since no comparable index is available on the SB-5, it is important to consider how this cognitive ability factors into the understanding of intelligence.

An individual’s processing capabilities are an important component of theories of intelligence, particularly for ASD. The modular theory of cognitive deficits in ASD proposes that impaired intellectual functioning in ASD is a function of deficits in specific cognitive abilities or ‘modules,’ such as syntax and theory of mind, rather than deficits of a basic processing mechanism (Anderson, 1992). This theory is supported by research that indicates children with ASD and below-average IQ have processing speed abilities that are comparable to their typically-developing peers with average IQ (Scheuffgen, et al., 2000). Although other research failed to replicate the finding that children with ASD and low IQ have processing abilities equal to those of average IQ controls (Wallace, et al., 2009), it did provide support for the modular theory of intelligence. It showed that the relationship between basic processing and IQ in typically-developing samples, where faster processing was associated with higher IQ scores, is not present in ASD, where processing abilities are unrelated to IQ level. If processing speed does not account for overall cognitive ability in ASD, as it does in typically-developing populations, perhaps its incorporation as a primary component of an FSIQ score, as it is with the WISC-IV, should be considered when selecting a test. IQ tests are constructed to fit the structure of intelligence in typically-developing populations and may not fully capture the structure of cognitive abilities in ASD. Previous studies have evaluated this and shown that the factor structure of intelligence in high-functioning individuals with ASD is generally similar to that of typically-developing populations and includes aspects of processing (Goldstein, et al., 2008; Lincoln, Courchesne, Kilman, Elmasian, & Allen, 1988); however, this is not clear for low-functioning individuals with ASD. An external measurement of processing speed, perhaps through an inspection time task, as this is thought to be the purest measure of processing abilities, may elucidate the impact of speed of processing on IQ test performance for both low and high functioning children and adolescents.

Other ASD-related impairments in visual-motor, social, language, and behavioral functioning did not differentially impact performance on the WISC-IV compared to the SB-5. There were no external factors, such as age, cognitive abilities, behavioral functioning, or adaptive functioning, that accounted for the differences between tests on verbal and working memory scores. The trends linking better communication skills to better scores on the WISC-IV working memory and verbal indices were not significant when extreme difference scores were removed. It may be that universal deficits in pragmatic language (Tager-Flusberg, et al., 2005) and social communication, for example, and the presence of repetitive behaviors affect performance across measures equally.

Verbal-Nonverbal Discrepancies

There were no significant differences between verbal-nonverbal discrepancy scores at the group level; however, 32-40% of participants were classified differently on the two tests across the three different cutoff scores (cutoff points set forth in the test manuals of the WISC-IV and SB-5, a 12-point discrepancy, and a 15-point discrepancy). The correlations between SB-5 verbal and nonverbal scales were significantly higher than the correlations between the WISC verbal and nonverbal scales; thus, a high or low verbal score on the SB-5 was often associated with a similar nonverbal score. Participants’ verbal-nonverbal score discrepancies on the SB-5 were significantly smaller than their discrepancies on the WISC-IV. No cognitive, behavioral, or symptom variables differentiated the groups with consistent discrepancy classification across tests versus those with inconsistent classifications. Given the considerable clinical and research implications of this discrepancy and our inability to identify, based on other variables, the individuals who are at risk for differential classification, this lack of consistency is particularly alarming. Test manual cut-points were the most reliable method of identifying a nonverbal-verbal discrepancy. Using the test manual, 68.4% of the individuals who were classified as having a discrepancy on the WISC-IV were also classified as having a discrepancy on the SB-5, but this percentage declined to 50.0% and 42.9% when using the 12- and 15-point criteria. In a clinical setting, where only one IQ test is administered, decisions about test selection and the application of a 12- or 15-point cutoff would have led to differing recommendations and case conceptualizations for at least half of the individuals in this sample. Given the considerable repercussions this pattern of results has on an individual’s care and the implications of these discrepancy scores in ASD research, it is recommended that test manual criteria be used over other methods of identifying discrepancies; however, even this method is still associated with unacceptably high variability across measures.
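The fixed-point discrepancy criteria discussed above can be sketched as follows. This illustration covers only the 12- and 15-point rules; the test manual cut-points are not reproduced here. The function names, the rule that a difference equal to the cutoff counts as a discrepancy, and the score pairs are assumptions made for the example and are not data from the study.

```python
def has_discrepancy(verbal, nonverbal, cutoff):
    """Flag a verbal-nonverbal discrepancy of at least `cutoff` points."""
    return abs(verbal - nonverbal) >= cutoff


def percent_also_flagged(test_a, test_b, cutoff):
    """Of the individuals flagged on test A, the percentage also flagged on test B.

    Each argument is a list of (verbal, nonverbal) score pairs, in the same
    participant order. Returns None if nobody is flagged on test A.
    """
    flagged_a = [
        i for i, (verbal, nonverbal) in enumerate(test_a)
        if has_discrepancy(verbal, nonverbal, cutoff)
    ]
    if not flagged_a:
        return None
    flagged_both = [
        i for i in flagged_a
        if has_discrepancy(test_b[i][0], test_b[i][1], cutoff)
    ]
    return 100.0 * len(flagged_both) / len(flagged_a)


# Hypothetical (verbal, nonverbal) pairs for six individuals (illustration only).
wisc_pairs = [(100, 85), (90, 104), (80, 82), (110, 97), (75, 95), (95, 96)]
sb_pairs = [(98, 86), (92, 95), (79, 84), (104, 90), (78, 94), (97, 95)]

for cutoff in (12, 15):
    print(cutoff, percent_also_flagged(wisc_pairs, sb_pairs, cutoff))
```

With these made-up scores, agreement falls from 75% at the 12-point criterion to 50% at the 15-point criterion, showing how the choice of cutoff alone can change who is classified consistently across tests.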

Test Order Effects

No significant effect of test order was identified, as scores on the test that was administered first were not significantly different from scores obtained on the second test administered. However, further examination of test order effects revealed that children who completed the WISC-IV first earned higher WISC-IV FSIQ scores than those who completed the SB-5 first, and the 13-point average magnitude of this difference is both statistically significant and clinically meaningful. The effect of randomization only impacted WISC-IV FSIQ; therefore, it is not surprising that the group that completed the WISC-IV first, who obtained higher WISC-IV FSIQ scores, had a significantly lower correlation between WISC-IV and SB-5 FSIQ scores (r = .75) than the group that took the SB-5 first (r = .94), for whom the average FSIQ for both tests was in the Borderline range.

It is difficult to determine whether the WISC-IV first group derived a benefit from completing this measure first or whether the group was composed of more intelligent children by chance, as there is support for both arguments. The WISC-IV first group consistently scored higher on every IQ score on both tests compared to the SB-5 first group, although not all comparisons were significant. If receiving the WISC-IV first provided an advantage, higher scores on the second IQ test administered, with comparable scores on the first test, would be expected. However, the WISC-IV first group performed better on both the first and the second test administered. It is important to note that when the three participants who were unable to obtain WISC-IV FSIQ scores due to floor effects on WISC-IV verbal subtests were included, no significant group differences remained. There were also no significant differences when individuals with very low IQ scores (IQ < 60) were removed. Given that differences in IQ scores were not observed when low-functioning individuals were included, the observed differences between groups may be the result of lower functioning children being randomly assigned to the SB-5 first group.

On the other hand, if the random assignment of test order simply resulted in the children with high cognitive abilities being assigned to the WISC-IV first group, it might be expected that these individuals would have obtained higher scores on other measures of functioning, such as language, visual-motor skills, theory of mind, or adaptive behavior. However, there were no differences between groups. Previous research on children with ASD revealed that their cognitive abilities are less strongly associated with each other than the abilities of typically-developing individuals and, thus, there is less accuracy in estimating scores on other cognitive measures from a given IQ score (Goldstein, et al., 2008). Therefore, the absence of differences in other cognitive and functional domains between the higher IQ WISC-IV first group and the SB-5 first group may not be particularly surprising. Although test order effects were not significant across the first and second test administered to the sample, the differences observed for the groups administered the WISC-IV versus the SB-5 first could impact the results of this study. Group differences between tests on full-scale and verbal scores may not truly reflect differences in performance across the sample of individuals with ASD, but could be the result of higher scores on the SB-5 in the WISC-IV first group. A larger sample size and more systematic approach to random assignment of test order may more fully elucidate the nature of this relationship.

No other differences between the two IQ tests were significant across the test order groups, and there were no significant differences between the groups on other measures of cognitive and adaptive functioning or ASD symptom severity. However, the WISC-IV first group had significantly more externalizing and total behavior problems compared to individuals administered the SB-5 first. Across the entire sample, scores on internalizing behavior subscales were significantly related to WISC-IV PRI scores and all SB-5 index scores (FSIQ, VIQ, and NVIQ); however, there were no significant associations between externalizing or total behavior and IQ in the whole group or in the WISC-IV first group separately. Thus, the fact that this group of children performed better on IQ tests does not appear to be related to the significantly higher levels of externalizing and total behavior problems.

IQ Scores and Adaptive Functioning

The study’s third aim was to assess the discriminant validity of the two IQ measures by analyzing their relationships with other measures of functioning. Aside from the correlations between language scores and IQ scores, the correlations between IQ and cognitive variables (VMI and Theory of Mind) were significantly lower than the inter-test correlations between the two measures’ index scores, providing evidence that the two tests’ scores are more highly associated with one another than they are with other measures of functioning. Further support for discriminant validity was evidenced by the moderate correlations (r = .31 - .51) between adaptive functioning subscales (Communication, Daily Living Skills, and Socialization) and IQ scores, which were significantly lower than the correlations between the tests’ corresponding indices. One exception was with the WISC-PRI, which was only correlated with Daily Living Skills and not Communication or Socialization. The correlations between IQ and adaptive behavior are comparable to what has been reported in other studies of children with ASD (Klin, et al., 2007). Furthermore, there are no differences between IQ tests in the magnitude of the correlations between IQ scores and adaptive functioning, indicating that scores on WISC-IV and SB-5 relate similarly to measures of everyday functioning. However, the results of this study further illustrate the fact that the convergence between adaptive behavior scores on the Vineland and IQ scores is markedly different from what is seen in typically-developing populations. Therefore, the use of IQ and adaptive behavior scores to diagnose ID can be problematic in the population of children with ASD.

Sample Characteristics

The male to female ratio in the present study was 7.2 to 1, which is higher than the approximately 4:1 ratio seen in the general ASD population (Volkmar, et al., 1993; Yeargin-Allsopp et al., 2003). However, the male to female ratio in ASD varies greatly and is contingent upon a variety of factors. Early studies of sex differences in ASD revealed that male to female ratios varied greatly across IQ levels, with low-functioning individuals on the autism spectrum revealing male to female ratios of about 2.8:1 (Lord, Schopler, & Revicki, 1982; Volkmar, et al., 1993). More recently, Yeargin-Allsopp et al. (2003) reported ranges in male to female ratios from 7.3:1 among children with no cognitive impairment to 1.3:1 in children with profound cognitive impairment. The observed male to female ratio in the present study is likely a result of the sample consisting of few individuals in the moderate, severe, and profound ranges of ID. Given that the measures compared in the present study only provide scores as low as 40, assessment at the lower range of intellectual functioning was not possible.

Limitations and Future Directions

No reliability studies have been conducted on the current versions of the WISC and SB (WISC-IV and SB-5); thus, the present study provides a foundation from which this line of research can be built. In the absence of a comparative sample, no conclusions can be drawn about whether the obtained pattern of differences across IQ tests is specific to ASD or is also characteristic of other populations. Including samples of typically-developing children and adolescents and those with other developmental disabilities would make this specificity clearer. Although to the author’s knowledge this is the largest study to date that compares WISC and SB in any patient population, the sample size is still small. A larger sample of children and adolescents would provide more statistical power and allow for more complex comparisons between and within various IQ classification groups.

A comprehensive assessment of neuropsychological factors was not conducted due to a desire to limit the length of the testing session. However, it is possible that broader measures of language, visual-motor ability, or executive abilities, including theory of mind, would allow for stronger conclusions about the impact of neuropsychological factors on IQ score discrepancies. In addition, an assessment of processing speed that was independent of the IQ measures may have provided more compelling information about how this ability relates to performance on the two measures.

Although not directly assessed in the present study, higher scores on the SB-5 may also reflect adolescents’ greater interest in and attention to the SB-5 tasks relative to the WISC-IV, as has been reported in previous studies using the SB and Wechsler scales (Garred & Gilmore, 2009). Objectively obtaining information from participants about their preference for one measure or the other, and systematically recording information about engagement, persistence, attention, and motivation during the session, may shed light on why some children perform better on one test than on the other.

REFERENCES

American Association of Mental Retardation. (2002). Mental retardation: Definition, classification, and systems of supports. Annapolis, MD.

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders, 4th edition, Text Revision: DSM-IV-TR. Washington, DC.

Anderson, M. (1992). Intelligence and development: A cognitive theory. Oxford: Blackwell.

Anderson, M. (2008). What can autism and dyslexia tell us about intelligence? Q J Exp Psychol (Colchester), 61(1), 116-128.

Autism Society of America. (2006). What is autism: Facts and statistics. Retrieved from http://www.autism-society.org/site/PageServer?pagename=about_whatis_factsstats.

Autism Society of America. (2007). ASA guidelines. Retrieved December 15, 2009, from http://www.autism-society.org/site/PageServer?pagename=about_whatis_factsstats

Autism Speaks. (2010, 07/20/09). ATN Registry Instruments, from http://www.autismspeaks.org/docs/sciencedocs/atn/ATNRegistryInstruments072009.pdf

Baron-Cohen, S. (2005). Theory of mind and autism: A fifteen year review. In S. Baron-Cohen, H. Tager-Flusberg & D. J. Cohen (Eds.), Understanding other minds: Perspectives from developmental cognitive neuroscience (2nd ed., Vol. xix, pp. 3-20). New York: Oxford University Press.

Beery, K., Buktenica, N., & Beery, N. (2006). Beery-Buktenica Developmental Test of Visual-Motor Integration (5 ed.). San Antonio, TX: Pearson.

Begovac, I., Begovac, B., Majic, G., & Vidovic, V. (2009). Longitudinal studies of IQ stability in children with childhood autism - literature survey. Psychiatria Danubina, 21(3), 310-319.

Bishop, D. (2006). Children’s Communication Checklist-2 U.S. Edition. San Antonio, TX: Pearson.

Bölte, S., & Poustka, F. (2002). The relation between general cognitive level and adaptive behavior domains in individuals with autism with and without co-morbid mental retardation. Child Psychiatry and Human Development, 33(2), 165-172.

Chapman, N. H., Estes, A., Munson, J., Bernier, R., Webb, S. J., Rothstein, J. H., . . . Wijsman, E. M. (2011). Genome-scan for IQ discrepancy in autism: evidence for loci on chromosomes 10 and 16. Hum Genet, 129(1), 59-70. doi: 10.1007/s00439-010-0899-z

Charman, T., Pickles, A., Simonoff, E., Chandler, S., Loucas, T., & Baird, G. (2011). IQ in children with autism spectrum disorders: data from the Special Needs and Autism Project (SNAP). Psychol Med, 41(3), 619-627. doi: 10.1017/S0033291710000991

Condouris, K., Meyer, E., & Tager-Flusberg, H. (2003). The relationship between standardized measures of language and measures of spontaneous speech in children with autism. American Journal of Speech-Language Pathology, 12(3), 349-358.

Constantino, J., & Gruber, C. (2005). Social responsiveness scale (SRS). Los Angeles, CA: Western Psychological Services.

Coolican, J., Bryson, S. E., & Zwaigenbaum, L. (2008). Brief report: data on the Stanford-Binet Intelligence Scales (5th ed.) in children with autism spectrum disorder. J Autism Dev Disord, 38(1), 190-197.

Dawson, G., Rogers, S., Munson, J., Smith, M., Winter, J., Greenson, J., . . . Varley, J. (2010). Randomized, controlled trial of an intervention for toddlers with autism: the Early Start Denver Model. Pediatrics, 125(1), e17-23. doi: 10.1542/peds.2009-0958

de Bruin, E. I., Verheij, F., & Ferdinand, R. F. (2006). WISC-R subtest but no overall VIQ-PIQ difference in Dutch children with PDD-NOS. J Abnorm Child Psychol, 34(2), 263-271.

Deutsch, D. K., & Joseph, R. M. (2003). Brief Report: Cognitive correlates of enlarged head circumference in children with autism. Journal of Autism and Developmental Disorders, 33(2), 209–214.

Elliott, C. (1990). Differential Ability Scales. San Antonio, TX: Psychological Corporation.

Fombonne, E. (2005). Epidemiology of autistic disorder and other pervasive developmental disorders. Journal of Clinical Psychiatry, 66, 3-8.

Fuentes, C. T., Mostofsky, S. H., & Bastian, A. J. (2010). Perceptual reasoning predicts handwriting impairments in adolescents with autism. Neurology, 75(20), 1825-1829. doi: 10.1212/WNL.0b013e3181fd633d

Garred, M., & Gilmore, L. (2009). To WPPSI or To Binet, that is the question: A comparison of the WPPSI-III and SB5 with typically developing preschoolers. Australian Journal of Guidance & Counselling, 19(2), 104-115.

Gillberg, C., & Steffenburg, S. (1987). Outcome and prognostic factors in infantile autism and similar conditions: a population-based study of 46 cases followed through puberty. J Autism Dev Disord, 17(2), 273-287.

Goldstein, G., Allen, D. N., Minshew, N. J., Williams, D. L., Volkmar, F., Klin, A., & Schultz, R. T. (2008). The structure of intelligence in children and adults with high functioning autism. Neuropsychology, 22(3), 301-312.

Goldstein, G., Minshew, N. J., Allen, D. N., & Seaton, B. E. (2002). High-functioning autism and schizophrenia: A comparison of an early and late onset neurodevelopmental disorder. Archives of Clinical Neuropsychology, 17, 461-475.

Happe, F. G. (1994). Wechsler IQ profile and theory of mind in autism: A research note. Journal of Child Psychology and Psychiatry, 35(8), 1461-1471.

Hartley, S., Sikora, D., & McCoy, R. (2008). Prevalence and risk factors of maladaptive behaviour in young children with autistic disorder. Journal of Intellectual Disability Research, 52(10), 819-829.

Hill, E. L. (2004). Executive dysfunction in autism. Trends Cogn Sci, 8(1), 26-32. doi: S1364661303003152 [pii]

Howlin, P. (2003). Outcome in high-functioning adults with autism with and without early language delays: Implications for the differentiation between autism and Asperger syndrome. Journal of Autism and Developmental Disorders, 33(1), 3-13.

Howlin, P., Goode, S., Hutton, J., & Rutter, M. (2004). Adult outcome for children with autism. Journal of Child Psychology and Psychiatry, 45(2), 212-229.

Howlin, P., Magiati, I., & Charman, T. (2009). Systematic review of early intensive behavioral interventions for children with autism. Am J Intellect Dev Disabil, 114(1), 23-41. doi: 10.1352/2009.114:23-41

Individuals with Disabilities Education Act. Building the legacy: IDEA 2004. (2004).

Joseph, R. M. (1999). Neuropsychological frameworks for understanding autism. Int Rev Psychiatry, 11(4), 309-324.

Joseph, R. M., & Tager-Flusberg, H. (2004). The relationship of theory of mind and executive functions to symptom type and severity in children with autism. Dev Psychopathol, 16(1), 137-155.

Joseph, R. M., Tager-Flusberg, H., & Lord, C. (2002). Cognitive profiles and social-communicative functioning in children with autism spectrum disorder. J Child Psychol Psychiatry, 43(6), 807-821.

Kanaya, T., & Ceci, S. J. (2007). Are all IQ scores created equal? The differential costs of IQ cutoff scores for at-risk children. Child Development Perspectives, 1(1), 52-56.

Kenworthy, L., Case, L., Harms, M. B., Martin, A., & Wallace, G. L. (2010). Adaptive behavior ratings correlate with symptomatology and IQ among individuals with high-functioning autism spectrum disorders. J Autism Dev Disord, 40(4), 416-423.

Klin, A., Saulnier, C., Tsatsanis, K., & Volkmar, F. (2005). Clinical evaluation in autism spectrum disorders: Psychological assessment within a transdisciplinary framework. In F. Volkmar, P. Rhea, A. Klin & D. Cohen (Eds.), Handbook of autism and pervasive developmental disorders (pp. 772–798). New Jersey: John Wiley & Sons.

Klin, A., Saulnier, C. A., Sparrow, S. S., Cicchetti, D. V., Volkmar, F. R., & Lord, C. (2007). Social and communication abilities and disabilities in higher functioning individuals with autism spectrum disorders: the Vineland and the ADOS. Journal of Autism and Developmental Disorders, 37(4), 748-759.

Klin, A., Volkmar, F. R., Sparrow, S. S., Cicchetti, D. V., & Rourke, B. P. (1995). Validity and neuropsychological characterization of Asperger Syndrome: Convergence with nonverbal learning disabilities syndrome. Journal of Child Psychology and Psychiatry, 7, 1127-1140.

Korkman, M., Kirk, U., & Kemp, S. (2007). NEPSY - Second Edition. San Antonio, TX: Pearson.

Lincoln, A., Courchesne, E., Allen, M., Hanson, E., & Ene, M. (1998). Neurobiology of Asperger’s syndrome: Seven case studies and quantitative magnetic resonance imaging findings. In E. Schopler, G. B. Mesibov & L. Kunce (Eds.), Asperger’s syndrome or high-functioning autism: Current issues in autism (pp. 145-163). New York: Plenum.

Lincoln, A. J., Courchesne, E., Kilman, B. A., Elmasian, R., & Allen, M. (1988). A study of intellectual abilities in high-functioning people with autism. J Autism Dev Disord, 18(4), 505-524.

Lind, S. E., & Bowler, D. M. (2009). Recognition memory, self-other source memory, and theory-of-mind in children with autism spectrum disorder. Journal of Autism and Developmental Disorders, 39, 1231-1239.

Liss, M., Fein, D., Allen, D., Dunn, M., Feinstein, C., Morris, R., . . . Rapin, I. (2001). Executive functioning in high-functioning children with autism. J Child Psychol Psychiatry, 42(2), 261-270.

Liss, M., Harel, B., Fein, D., Allen, D., Dunn, M., Feinstein, C., . . . Rapin, I. (2001). Predictors and correlates of adaptive functioning in children with developmental disorders. J Autism Dev Disord, 31(2), 219-230.

Lopez, B. R., Lincoln, A. J., Ozonoff, S., & Lai, Z. (2005). Examining the relationship between executive functions and restricted, repetitive symptoms of autistic disorder. Journal of Autism and Developmental Disorders, 35(4), 445-460.

Lord, C., Risi, S., Lambrecht, L., Cook, E. H., Leventhal, B. L., DiLavore, P. C., . . . Rutter, M. (2000). The autism diagnostic observation schedule-generic: A standard measure of social and communication deficits associated with the spectrum of autism. Journal of Autism and Developmental Disorders, 30(3), 205-223.

Lord, C., Schopler, E., & Revicki, D. (1982). Sex differences in autism. J Autism Dev Disord, 12(4), 317-330.

Lukens, J., & Hurrell, R. (1996). A comparison of the Stanford-Binet IV and the WISC-III with mildly retarded children. Psychology in the Schools, 33, 24-27.

Magiati, I., & Howlin, P. (2001). Monitoring the progress of preschool children with autism enrolled in early intervention programmes. Autism, 5(4), 399-406.

Mayes, S. D., & Calhoun, S. L. (2003a). Ability profiles in children with autism: influence of age and IQ. Autism, 7(1), 65-80.

Mayes, S. D., & Calhoun, S. L. (2003b). Analysis of WISC-III, Stanford-Binet:IV, and academic achievement test scores in children with autism. J Autism Dev Disord, 33(3), 329-341.

Mayes, S. D., & Calhoun, S. L. (2004). Similarities and differences in Wechsler Intelligence Scale for Children-Third edition (WISC-III) profiles: Support for subtest analysis in clinical referrals. The Clinical Neuropsychologist, 18, 559-572.

Mayes, S. D., & Calhoun, S. L. (2008). WISC-IV and WIAT-II profiles in children with high-functioning autism. J Autism Dev Disord, 38(3), 428-439.

Ming, X., Brimacombe, M., Chaaban, J., Zimmerman-Bier, B., & Wagner, G. (2008). Autism spectrum disorders: Concurrent clinical disorders. Journal of Child Neurology, 23(1), 6-13.

Mottron, L. (2004). Matching strategies in cognitive research with individuals with high-functioning autism: current practices, instrument biases, and recommendations. J Autism Dev Disord, 34(1), 19-27.

Munson, J., Dawson, G., Sterling, L., Beauchaine, T., Zhou, A., Koehler, E., . . . Estes, A. (2008). Evidence for latent classes of IQ in young children with autism spectrum disorder. American Journal on Mental Retardation, 113(6), 439-452.

Ohio Revised Code - School districts to identify gifted students. (2001).

Ozonoff, S., South, M., & Miller, J. N. (2000). DSM-IV-defined Asperger syndrome: Cognitive, behavioral and early history differentiation from high-functioning autism. Autism, 4(1), 29-46.

Phelps, L., Bell, M., & Scott, M. (1988). Correlations between the Stanford-Binet: Fourth edition and the WISC-R with a learning disabled population. Psychology in the Schools, 25, 380-382.

Prewett, P., & Matavich, M. (1992). Mean-score differences between the WISC-R and the Stanford Binet Intelligence Scale: Fourth edition. Diagnostique, 17, 195-201.

Prewett, P., & Matavich, M. (1994). A comparison of referred students’ performance on the WISC-III and the Stanford-Binet Intelligence Scale: Fourth edition. Journal of Psychoeducational Assessment, 12, 42-48.

Roid, G. H. (2003). Stanford Binet intelligence scales (5th ed.). Itasca, IL: Riverside Publishing.

Rothlisberg, B. A. (1987). Comparing the Stanford-Binet, fourth edition to the WISC-R: A concurrent validity study. Journal of School Psychology, 25, 193-196.

Rumsey, J. M. (1985). Conceptual problem-solving in highly verbal, nonretarded autistic men. J Autism Dev Disord, 15(1), 23-36.

Saklofske, D., Schwean, V., Yackulic, R., & Quinn, D. (1994). WISC-III and SB:FE performance of children with attention deficit hyperactivity disorder. Canadian Journal of School Psychology, 10(2), 167-171.

Sattler, J. M. (2001). Assessment of Children: Cognitive Applications (4th ed.). La Mesa, CA: Jerome Sattler Publisher, Inc.

Scheuffgen, K., Happe, F., Anderson, M., & Frith, U. (2000). High "intelligence," low "IQ"? Speed of processing and measured IQ in children with autism. Dev Psychopathol, 12(1), 83-90.

Semel, E., Wiig, E. H., & Secord, W. A. (2003). Clinical evaluation of language fundamentals, 4th edition. San Antonio, TX: The Psychological Corporation.

Semel, E., Wiig, E. H., & Secord, W. A. (2004). Clinical Evaluation of Language Fundamentals, 4th edition, Screening Test. San Antonio, TX: Pearson.

Semrud-Clikeman, M., Walkowiak, J., Wilkinson, A., & Christopher, G. (2010). Neuropsychological differences among children with Asperger syndrome, nonverbal learning disabilities, attention deficit disorder, and controls. Developmental neuropsychology, 35(5), 582-600.

Siegel, D. J., Minshew, N. J., & Goldstein, G. (1996). Wechsler IQ profiles in diagnosis of high-functioning autism. J Autism Dev Disord, 26(4), 389-406.

Silverman, W., Miezejeski, C., Ryan, R., Zigman, W., Krinsky-McHale, S., & Urv, T. (2010). Stanford-Binet & WAIS IQ Differences and Their Implications for Adults with Intellectual Disability (aka Mental Retardation). Intelligence, 38(2), 242-248.

Stevens, M. C., Fein, D. A., Dunn, M., Allen, D., Waterhouse, L. H., Feinstein, C., & Rapin, I. (2000). Subgroups of children with autism by cluster analysis: a longitudinal examination. J Am Acad Child Adolesc Psychiatry, 39(3), 346-352. doi: 10.1097/00004583-200003000-00017

Szatmari, P., Bryson, S. E., Boyle, M. H., Streiner, D. L., & Duku, E. (2003). Predictors of outcome among high functioning children with autism and Asperger syndrome. J Child Psychol Psychiatry, 44(4), 520-528.

Tager-Flusberg, H., Paul, R., & Lord, C. (2005). Language and communication in autism. In R. P. F. Volkmar, & A. Klin (Ed.), Handbook on autism and pervasive developmental disorders (3rd ed., pp. 335-364). New York: Wiley.

Thompson, L., Thompson, M., & Reid, A. (2010). Neurofeedback outcomes in clients with Asperger's syndrome. Appl Psychophysiol Biofeedback, 35(1), 63-81. doi: 10.1007/s10484-009-9120-3

Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). Stanford-Binet intelligence scales (4th ed.). Chicago: Riverside Publishing.

Volkmar, F. R., Szatmari, P., & Sparrow, S. S. (1993). Sex differences in pervasive developmental disorders. J Autism Dev Disord, 23(4), 579-591.

Wallace, G. L., Anderson, M., & Happe, F. (2009). Brief report: information processing speed is intact in autism but not correlated with measured intelligence. J Autism Dev Disord, 39(5), 809-814.

Wechsler, D. (1991). Wechsler intelligence scale for children, 3rd edition. San Antonio, TX: The Psychological Corporation.

Wechsler, D. (2003). Wechsler intelligence scale for children, 4th edition. Technical and interpretive manual. San Antonio, TX: The Psychological Corporation.

White, S. W., Scahill, L., Klin, A., Koenig, K., & Volkmar, F. R. (2007). Educational placements and service use patterns of individuals with autism spectrum disorders. Journal of Autism and Developmental Disorders, 37(8), 1403-1412.

Yeargin-Allsopp, M., Rice, C., Karapurkar, T., Doernberg, N., Boyle, C., & Murphy, C. (2003). Prevalence of autism in a US metropolitan area. Journal of the American Medical Association, 289(1), 49-55.
