
Florida State University Libraries

Electronic Theses, Treatises and Dissertations
The Graduate School

2010
Informal Number-Related Skills: An Examination of the Structure of and Relations Between These Skills in Preschool
David J. Purpura

THE FLORIDA STATE UNIVERSITY

COLLEGE OF ARTS AND SCIENCES

INFORMAL NUMBER-RELATED MATHEMATICS SKILLS: AN EXAMINATION OF THE

STRUCTURE OF AND RELATIONS BETWEEN THESE SKILLS IN PRESCHOOL

By

DAVID J. PURPURA

A Dissertation submitted to the Department of in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Degree Awarded: Summer Semester, 2010

Copyright © 2009 David J. Purpura All Rights Reserved

The members of the committee approve the dissertation of David J. Purpura defended on October 30, 2009.

______
Christopher J. Lonigan
Professor Directing Dissertation

______
Barbara Foorman
University Representative

______
Janet Kistner
Committee Member

______
Christopher Schatschneider
Committee Member

______
Jesse Cougle
Committee Member

The Graduate School has verified and approved the above-named committee members.


I dedicate this dissertation to my amazing and beautiful wife Ashley.

Without your love and support, none of this would have been possible.

This degree is as much yours as it is mine.


ACKNOWLEDGEMENTS

I would like to thank Dr. Christopher Lonigan for his guidance and mentorship throughout my graduate career. You have pushed me to be a better student, writer, and researcher. For your continued efforts to challenge me, I am truly grateful. I would also like to thank the members of my committee, Drs. Foorman, Kistner, Schatschneider, and Cougle, for their valuable input and support throughout my work on this project. My thanks are also extended to the local preschool teachers and directors without whom this project would not have been possible. To the undergraduate testers who worked with me: your hard work has enabled me to complete this project successfully. To my fellow graduate students: your friendship and support have made graduate school an amazing experience. I could not imagine having gone through these years without you. To my parents, who pushed me to excel and did not allow me to settle for less than I could accomplish: I cannot thank you enough. To the rest of my family: thank you for the love and support you have shown me over the years. Most importantly, I thank my beautiful wife Ashley. I could not have done this without you. The endless hours you have spent over the last few years reading drafts of papers, encouraging me when I was buried in work, and helping me to stop and smell the roses have been a blessing to me.

This work was supported by a Predoctoral Interdisciplinary Research Training Grant (R305B04074) from the Institute of Education Sciences.


TABLE OF CONTENTS

List of Tables
List of Figures
Abstract

1. INTRODUCTION
   Formal and Informal Mathematics Skills
   Validity Concerns
   Overview of Current Study

2. PHASE 1: INITIAL MEASURE CONSTRUCTION
   Method
   Results

3. PHASE 2: FURTHER MEASURE REFINEMENT AND CONSTRUCT COMPARISON
   Method
   Results

4. DISCUSSION
   Measure Development
   Model Comparison
   Future Directions
   Early Numeracy and Third Variables
   Conclusion

APPENDICES
   A Informed Consent Form
   B Facility Consent Form
   C Child Assent Script
   D Phase 1 Counting Subtest
   E Phase 1 Numerical Relations Subtest
   F Phase 1 Reasoning Subtest
   G Phase 1 Item Psychometric Information
   H Phase 2 Counting Subtest
   I Phase 2 Numerical Relations Subtest
   J Phase 2 Arithmetic Reasoning Subtest
   K Phase 2 Item Psychometric Information

REFERENCES

BIOGRAPHICAL SKETCH


LIST OF TABLES

Table 1: Names, descriptions, and means of assessing Counting skills

Table 2: Names, descriptions, and means of assessing Numerical Relations skills

Table 3: Names, descriptions, and means of assessing Arithmetic Reasoning skills

Table 4: Listing of important Counting, Numerical Relations, and Arithmetic Reasoning skills on five measures of informal mathematics skills and the National Council of Teachers of Mathematics standards

Table 5: Demographic information of children participating in Phase 1 by group

Table 6: Means, standard deviations, range, skewness, and kurtosis of the sum scores of the final early numeracy tasks in preschool children

Table 7: Correlations between the sum scores of the final early numeracy skill tasks in preschool children

Table 8: Means, standard deviations, and F-values of the early numeracy skills for preschool children by sex

Table 9: Model fit for the Counting, Numerical Relations, and Arithmetic Reasoning factors

Table 10: Fit indices for the models of the structure of early numeracy skills in preschool children


LIST OF FIGURES

Figure 1: Test Information Functions for Counting tasks. High levels of information indicate high reliability of the task at a given point along the latent trait.

Figure 2: Standard Error Functions for Counting tasks. Standard error values below .316 indicate high reliability of the task at a given point along the latent trait.

Figure 3: Test Information Functions (top) and Standard Error Functions (bottom) for the Counting subtest. High levels of information and low standard error values indicate high reliability of the task at a given point along the latent trait.

Figure 4: Test Information Functions for Numerical Relations tasks. High levels of information indicate high reliability of the task at a given point along the latent trait.

Figure 5: Standard Error Functions for Numerical Relations tasks. Standard error values below .316 indicate high reliability of the task at a given point along the latent trait.

Figure 6: Test Information Functions (top) and Standard Error Functions (bottom) for the Numerical Relations subtest. High levels of information and low standard error values indicate high reliability of the task at a given point along the latent trait.

Figure 7: Test Information Functions for Arithmetic Reasoning tasks. High levels of information indicate high reliability of the task at a given point along the latent trait.

Figure 8: Standard Error Functions for Arithmetic Reasoning tasks. Standard error values below .316 indicate high reliability of the task at a given point along the latent trait.

Figure 9: Test Information Functions (top) and Standard Error Functions (bottom) for the Arithmetic Reasoning subtest. High levels of information and low standard error values indicate high reliability of the task at a given point along the latent trait.

Figure 10: The Confirmatory Factor Analysis model of Counting skills (N = 393). Ovals represent latent variables and rectangles represent observed variables. All values are standardized coefficients. All factor loadings are significant at p < .001.

Figure 11: The Confirmatory Factor Analysis model of Numerical Relations skills (N = 393). Ovals represent latent variables and rectangles represent observed variables. All values are standardized coefficients. All factor loadings are significant at p < .001.

Figure 12: The Confirmatory Factor Analysis model of Arithmetic Reasoning skills (N = 393). Ovals represent latent variables and rectangles represent observed variables. All values are standardized coefficients. All factor loadings are significant at p < .001.

Figure 13: The three-factor model of preschool early numeracy skills (N = 393). Ovals represent latent variables and rectangles represent observed variables. All values are standardized coefficients. All factor loadings are significant at p < .001.
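A note on the .316 criterion cited in Figures 2, 5, and 8: it follows from the standard item response theory relation between test information and the standard error of measurement at a given level of the latent trait,

```latex
SE(\theta) = \frac{1}{\sqrt{I(\theta)}}, \qquad
SE(\theta) \leq .316 \approx \frac{1}{\sqrt{10}}
\;\Longleftrightarrow\; I(\theta) \geq 10 .
```

Because reliability at a point on a standardized latent trait can be approximated as $1 - SE(\theta)^2$, an information value of 10 corresponds to a reliability of approximately .90 at that trait level.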


ABSTRACT

Validating the structure of early numeracy skills is critical to understanding the developmental trajectories of mathematics skills at early ages. However, little research has been devoted to construct evaluation of the Counting, Numerical Relations, and Arithmetic Reasoning domains. This study was designed to address this gap in the literature by (a) developing a broad and deep measure of Counting, Numerical Relations, and Arithmetic Reasoning skills, (b) examining the structure of these three skill domains, and (c) examining the relations between these three domains. In Phase 1 of this study, measure development was conducted using a sample of 306 children (50.5% girls; 46.9% White, 41.0% African American, and 12.1% other). In Phase 2, further measure development was conducted, as well as evaluation of the structure of and relations between the three domains using Confirmatory Factor Analysis. Three hundred ninety-three children from the Tallahassee and Panama City areas in Florida participated in Phase 2 (51.7% girls; 55.7% White, 33.8% African American, and 10.5% other). Results indicated that the relations between the early numeracy skills were best explained by a three-factor model of Counting, Numerical Relations, and Arithmetic Reasoning. The findings from this study provide a strong foundation upon which to build future research on early numeracy skills.


INTRODUCTION

Mathematics skills are foundational academic abilities upon which other academic skills, such as science and engineering, are built. Individuals who are successful in mathematics are likely to experience later success in these and other areas (Duncan et al., 2007; Geary, 1994). Unfortunately, achievement in mathematics by individuals in the United States is outpaced by achievement in mathematics by individuals in other countries. Results from several large international studies of mathematics have indicated that children in the United States, throughout their schooling, lag behind their international peers in mathematics skills (Stevenson, Lee, Chen, Lummis, et al., 1990; Gonzales, Calsyn, et al., 2000; Mullis, Martin, Gonzalez, & Chrostowski, 2004). These performance disparities are evident as early as the beginning of preschool (Stevenson et al., 1990).

Even within the United States, significant disparities among groups of children in their mathematics skills emerge prior to school entry. These disparities are delineated mainly along socioeconomic status (SES) lines (Starkey, Klein, & Wakeley, 2004). Children from low-SES families enter preschool with significantly lower mathematics skills than their middle- and high-SES peers (Starkey et al., 2004), and, without intervention, they develop their mathematics skills at a slower rate (Aunola, Leskinen, Lerkkanen, & Nurmi, 2004). Recognition of these achievement disparities, and the desire of educators, politicians, and researchers to reduce and eliminate the achievement gap between children from low-SES families and children from middle- to high-SES families, has spurred a significant increase in mathematics research over the last 15 years, especially research on mathematics skills at the preschool level (Gersten, Clarke, & Mazzocco, 2007).
A driving force in the increase of quality early mathematics research is the National Council of Teachers of Mathematics (NCTM) Principles and Standards for School Mathematics (PSSM; NCTM, 2000). The PSSM were developed by a committee of researchers and educators, under the direction of NCTM, by analyzing research findings, curriculum frameworks from the majority of states in this country, and the curriculum frameworks of several other countries, including Singapore, Japan, China, and Korea. The committee developed the principles and standards to provide a foundation to guide classroom instructional development and practices. NCTM included preschool mathematics principles and standards in the PSSM for the first time in the version released in 2000 (NCTM, 2000). Well-delineated standards of developmentally appropriate early mathematics skills, especially at the preschool level, are a


necessary and useful tool for mathematics educators to guide instructional development during these formative years. Equally important to the role the PSSM can play in instructional development is the impact they have on building a solid research foundation. Clearly defined and established standards enhance communication among researchers and enable them to develop a platform of knowledge that can be used to synthesize research findings in the field. The NCTM standards are intended to be that platform. However, these standards are not sufficient in their current state to serve as a complete foundational guide for researchers.

Although the standards are empirically based, they are not empirically supported. The subtle distinction between these terms indicates that although research findings were used to develop the standards, the overall standards model has not been empirically validated. One reason for the lack of validation is the non-specific nature of the standards. Standards such as "Develop a sense of whole numbers and represent them in flexible ways, including relating, composing, and decomposing numbers" are broad and do not necessarily elucidate a set of specific tasks that can be used to assess children's mathematical abilities. Early mathematics standards must be developed from research and be representative of age-appropriate skills (Neuman & Roskos, 2005). A clear, effective, and consistent means of assessing children's attainment of these standards must also be developed and empirically validated. Unfortunately, the NCTM standards, as well as the preschool mathematics standards developed by individual states, vary widely in their structure and content. As a result of the inconsistent content and structure of preschool mathematics standards, no clear definition of early mathematics skills, or sufficiently broad measure to assess these skills, exists.
Formal and Informal Mathematics Skills

To attain the breadth of knowledge necessary to understand and define the nature of early mathematics skills, it is necessary to examine the foundational evidence in this field. The current knowledge base of early mathematics skills is rooted in the concepts of formal and informal mathematics skills (Baroody, Gannon, Berent, & Ginsburg, 1984; Baroody & Ginsburg, 1982; Ginsburg, 1975; Greenes, Ginsburg, & Balfanz, 2004; Starkey, Klein, & Wakeley, 2004). Formal mathematics skills are those skills taught in school that require the use of abstract numerical notation, such as writing numerals, place-value tasks, knowledge of the base-ten mathematics system, and decimal knowledge (Baroody, Gannon, Berent, & Ginsburg, 1984). Informal


mathematics skills are the developmental precursors to formal mathematics skills and do not require specific instruction in abstract mathematical notation. In fact, some evidence suggests that children are born with a degree of informal mathematics skill, specifically the ability to recognize changes in magnitude (Starkey & Cooper, 1980; Wood & Spelke, 2005), and these skills develop as children explore their natural environment (Ginsburg, 1975).

Beyond the ability to determine magnitude changes, several other domains of skill are included under the umbrella of informal mathematics. The six primary domains are number knowledge, arithmetic operations, measurement, shapes, patterns/logical inferences, and spatial concepts (Greenes et al., 2004). Number knowledge and arithmetic operations are the two most studied aspects of informal mathematics skills (Aubrey & Godfrey, 2003; Baroody, 2004; Baroody et al., 1984; Ginsburg, Klein, & Starkey, 1998; Griffin, 2004; Jordan, Kaplan, Olah, & Locuniak, 2006; van de Rijt, van Luit, & Pennings, 1999). Within the domain of number knowledge, two general types of skills typically are discussed: counting skills and numerical relations. These skills are those most necessary for the development of basic formal skills such as addition and subtraction (Jordan, Kaplan, Ramineni, & Locuniak, 2009). These two areas, along with arithmetic operations, are best categorized as informal number-related skills. Although the other skills (measurement, shapes, patterns/logical inferences, and spatial concepts), often identified as informal geometric skills, are believed to be important for the development of formal skills, they appear to be wholly different from those skills related to number development and thus are outside the scope of this discussion. The examination of informal number-related skills has a history dating back more than half a century (Dantzig, 1954; Piaget, 1952).
The informal number-related skill most frequently recognized and discussed is counting. Often, young children's first major mathematical accomplishment is counting to 10. However, Piaget (1952) argued that without knowledge of the meaning behind counting, the ability to count is inherently meaningless. Others, such as Gelman and Gallistel (1978), proposed that advanced counting skills are the foundation of mathematical competence. They suggested that a child who can count a set of objects, recognize that the last number counted indicates total quantity (cardinality), and recognize that the order in which the objects are counted does not matter (order irrelevance) has well-developed early mathematical skills. Recent work indicates that several other aspects of counting are important to the development of mathematics


skills. These skills include counting objects (Clements & Sarama, 2007; Griffin, 2004; Jordan, Kaplan, Locuniak, & Ramineni, 2007; NCTM, 2000; Sophian, 2004; Starkey et al., 2004; van de Rijt, Godfrey, Aubrey, van Luit, Ghesquière, Torbeyns, et al., 2003), subitizing (Clements & Sarama, 2007; NCTM, 2000), estimation (Sophian, 2004), counting forward from a number other than one, counting backward (Clements & Sarama, 2007; van de Rijt, van Luit, & Pennings, 1999), and recognizing counting errors (Clements & Sarama, 2007). In Table 1, the counting-based skills that have been suggested are listed, along with brief descriptions of each skill and examples of how each skill typically is assessed. High levels of performance on these skills would indicate that an individual has well-developed competence in counting.

The second area of number-related skills is the understanding of numerical relations, or "number sense." Number sense has been described as a skill that is "easy to recognize [in children] but difficult to define" (Griffin, 2004). Many attempts have been made to define this ability, but the specific skills associated with number sense have varied widely over the years. Berch (2005) identified over 30 different aspects or skills of early mathematics that have been associated with number sense. Broadly, knowledge of these skills would indicate that a child with a developed understanding of number sense is able to utilize the concept of a number line to develop basic mathematical understanding and thus develop later skills such as addition and subtraction. For example, recognition that the number "four" is greater than the number "two" depends on the knowledge that numbers always follow a specific sequence and that numbers later in the sequence are always larger.
Specific skills such as magnitude discrimination (Berch, 2005; Jordan et al., 2007), relative size of numbers and quantities (Clements & Sarama, 2007; Jordan et al., 2007), set and number comparison (Clements & Sarama, 2007; Jordan et al., 2007; NCTM, 2000; Griffin, 2004; van de Rijt et al., 1999), knowledge of number order (Clements & Sarama, 2007; Griffin, 2004; Jordan et al., 2007; Starkey et al., 2004), number sequencing (Clements & Sarama, 2007; NCTM, 2000; Sophian, 2004; van de Rijt et al., 1999), number reproduction (Starkey et al., 2004), recognizing numerals (Jordan et al., 2007), and the recognition that numerals equate to quantities (Clements & Sarama, 2007; NCTM, 2000) are all often associated with this domain. In Table 2, the skills within this domain are listed with brief descriptions of associated tasks and examples of how each skill typically is assessed.


Table 1

Names, descriptions, and means of assessing counting skills

Rote counting (knowledge of the counting sequence):
REMA: "How high can you count? Start at 1 and tell me." Children earned 1 point for every specified number counted to (e.g., 1 point was given if they correctly counted to five).
CMA: Child is instructed to start counting at one and count as high as he/she can. Examiner is allowed to prompt the child once. After the second error or extended pause, the examiner discontinues the task and the last correctly counted number is recorded. The task is also discontinued if the child counts beyond 100.
NSC: Children were asked to count to ten. They were allowed to restart counting only once and could self-correct any numbers.
ENT: Children are asked to count to twenty.
NKT: Child is asked to count from 1 to 10.
TEMA-3: Child is asked to count to a specified number (20, 40, etc.). Counting is stopped when the child reaches another specified number.

Counting up from a number other than one (higher-order counting ability that indicates a child understands that the counting sequence does not have to start at one):
REMA: Children are asked to count to a number (e.g., 10) and start counting at another number (e.g., 4). No additional aid is provided.
ENT: Children are asked to count on from a number (e.g., count on from nine). The examiner starts counting three numbers before the specified number and then instructs the child to continue counting.
TEMA-3: Examiner asks the child to count together. The child then counts up to a specified number and the examiner asks, "and then comes?"


Counting backward (higher-order counting ability that indicates a child's knowledge that the counting sequence can be used for purposes other than simply counting a set):
REMA: Children are asked to start counting backward from a specified number (e.g., 10).
ENT: Children are presented with a specific number of blocks, told how many blocks there are, and instructed to point out the blocks and count backward.
TEMA-3: Children are asked to count backward from 10.

Counting error identification (knowledge of the principles of counting, e.g., a number is counted only once, numbers cannot be skipped, each object counted is assigned only one number, and the ability to recognize when counting principles have been violated):
CMA: Examiner counts a series of objects either correctly, incorrectly (repeats a number, skips an object, double-counts an object, etc.), or pseudo-incorrectly (counts every other object, then goes back through and counts the missing objects). The examiner then asks the child to indicate whether the objects were counted correctly.
NSC: Children were asked to identify whether the examiner counted "OK" or "not OK." The examiner then counted the set of objects either correctly, pseudo-incorrectly (e.g., counting every other dot then going back through and counting the missed dots), or incorrectly (e.g., skipping a picture, double-counting a picture).

Structured counting (the ability to utilize the counting sequence to enumerate a quantity):
REMA: Children are shown a set of pictures and asked to count them and tell the examiner how many there are.
CMA: Children were asked to count a set of objects.
NSC: Children were asked to count a set of pictures (e.g., count a set of 5).
ENT: A number of blocks are placed in front of the child. The child is instructed to count the blocks. Pointing, touching, and moving the blocks is allowed.
NKT: Children are asked to count a specific number of objects presented to them.
TEMA-3: Children are asked to count a specific number of pictures.


Cardinality (recognition that the last number counted means "how many"):
REMA: Children were asked to count a set of objects and then to specify how many there are.
NSC: Children were asked to count a set of pictures out loud. The pictures were then hidden from view and children were asked how many of the pictures were on the paper.
TEMA-3: Child is presented with a number of pictures. The pictures are then hidden and the child is asked, "How many [of the picture] did you count?"

Resultative counting (higher-order counting skill that indicates a child's ability to count without physical manipulation or touching of the set to be enumerated):
ENT: Children are presented with a specific number of blocks and asked to identify how many total objects there are. Pointing, touching, and moving the blocks is NOT allowed.

Counting a subset of objects (the understanding that quantities can be separated from a larger quantity):
REMA: Children are instructed to produce a set of a specific number out of a larger set of objects.
CMA: Child is asked to count all the chips of one color out of a set with two different colored objects.
ENT: Child is presented with a set of blocks (e.g., 15) and then asked to produce a smaller set of the blocks (e.g., 11).
NKT: Children are presented with a line of objects of two different colors. The child is then asked to count all the objects of one color.
TEMA-3: Children are given a set of blocks. They are then instructed to give the examiner a subset of the objects.

Subitizing (the ability to rapidly enumerate small quantities without counting the set item by item):
REMA: Children are shown a card with a specific number of the same object on it for exactly 2 seconds. The card is then hidden and the child is asked, "How many?"


Estimation (ability to quickly understand the approximate quantity in a moderate to large set without structured counting):
NSC: Children were presented with cards with a specific number of dots. Children were asked, "About how many dots do you see?" Responses were scored correct if the child provided a response within 25% of the answer.

Note. REMA = Research-based Early Math Assessment (Clements, Sarama, & Liu, 2008); CMA = Child Math Assessment; NSC = Number Sense Core battery (Jordan et al., 2007); ENT = Early Numeracy Test (van de Rijt, van Luit, & Pennings, 2003); NKT = Number Knowledge Test (Griffin & Case, 1997); TEMA-3 = Test of Early Mathematics Ability – Third Edition (Ginsburg & Baroody, 2003).


Table 2

Names, descriptions, and means of assessing Numerical Relations skills

Ordinal numbers (identification of ordinal positions such as 1st, 2nd, 3rd, and last):
REMA: "Three animals are in a line; which one is first in line?"
CMA: Child is asked to identify the object in a specific ordinal position (1st, 2nd, etc.).
ENT: "Point out the Nth object."

Relative size (knowledge of the proximity of one number to another number along the number line):
REMA: Children are asked which of two numbers is closer to another number (e.g., which of 500 and 520 is closer to 250).
NSC: Children were shown three numbers (e.g., 4, 7, 8) and were asked which number (e.g., 7 or 8) is closer to the other number (e.g., 4).
NKT: Children are asked which of two numbers is closer to another number. The visual array of numbers is then shown to the child.
TEMA-3: Children are presented with an array of three numbers and asked which of two of the numbers is closer to the other number.

Number comparison (the ability to differentiate the numerical quantity of two numbers):
REMA: Children were verbally asked which of two or three numbers is bigg[er/est] or small[er/est].
CMA: Children are asked to identify the larger or smaller of two numbers.
NSC: Children were verbally given two numbers (e.g., 4 and 5) and asked which number is bigger or which number is smaller.
NKT: Children are verbally asked which of two numbers is bigger or smaller.
TEMA-3: Children are verbally asked questions such as, "Which is more, 3 or 2?"


Set comparison (ability to differentiate the magnitude of sets of varying sizes):
REMA: Child is shown two cards with different quantities of dots and is asked to identify which card has more dots. Children are also shown cards with dots of different sizes (e.g., 9 big dots on one card and 11 small dots on another card) and asked which is more.
ENT: Children are asked which of four sets of objects has the fewest.
NKT: Children are presented with two sets of objects and asked to identify the pile with either more or fewer objects.
TEMA-3: Child is asked to identify which of two sets of dots has more dots.

Number order (knowledge of the ordering of numbers on the number line, shown by identifying the numbers before and after a given number):
REMA: Child is verbally asked to identify which number comes one or two before or after a specified number.
CMA: Child is verbally asked to identify the number that comes before or after another number (e.g., "What number comes after 4?").
NSC: Children were verbally given a number (e.g., 4) and asked what number comes after that number and what number comes two after that number.
NKT: Child is asked, "What number comes right after 7?" and "What number comes 2 after 7?"

Sequencing (knowledge of the order of numbers and quantities along the number line):
REMA: Children are asked to correctly order a set of six cards by the number of images on each card. Children are also asked to order a set of 1-5 numerals.
CMA: Children are asked to order a set of five objects in a series by size and to insert a sixth object. This task does not use numbers or quantities, but rather physical size.
ENT: Children are presented with several of the same object in varying quantities. They are then presented with another example of the same object and asked where it goes in the order based on quantity.


Set reproduction (ability to produce or recognize equivalent sets by using knowledge of one-to-one correspondence):
REMA: Children are shown a picture of a specific number of objects. They are then given objects and asked, "Make yours look just like mine."
CMA: Children are presented with a number of objects of one color and asked to produce the same number of that object in another color.
ENT: Child is visually presented with a number of objects. The child is then given a larger number of blocks and asked, "Can you lay down the same amount of blocks?" Children are also asked to match equal sets of different types of objects (e.g., draw lines between the pictures that have the same number of objects).
TEMA-3: Children are presented with a number of objects. The objects are then hidden and the child is asked to reproduce the same number with blocks.

Number ID (knowledge of the names of written numerals):
NSC: Children were shown a series of cards, each with a given numeral (e.g., 3), and asked to verbally name the number.
TEMA-3: Child is visually presented with a numeral and asked, "What number is this?"

Numerals (ability to connect quantity to written numbers):
REMA: Children were asked to match a numeral to a set of objects of the same number.
TEMA-3: Children are shown pictures of animals and asked to write down the number of animals on the page.

Note. REMA = Research-based Early Math Assessment (Clements, Sarama, & Liu, 2008); CMA = Child Math Assessment; NSC = Number Sense Core battery (Jordan et al., 2007); ENT = Early Numeracy Test (van de Rijt, van Luit, & Pennings, 2003); NKT = Number Knowledge Test (Griffin & Case, 1997); TEMA-3 = Test of Early Mathematics Ability – Third Edition (Ginsburg & Baroody, 2003).


The third domain, arithmetic operations, is also known as basic arithmetic reasoning. This domain is an understanding of the ways in which groups are composed and decomposed by differentiating sets and subsets (Greenes et al., 2004). Although most children in preschool are not expected to solve formal arithmetic problems, such as 2 + 2 = 4, many have already begun to develop an understanding of the rules and operations of addition and subtraction (Starkey et al., 2004). Children can recognize that when two groups of objects are initially equal and one more object is added to one of the groups, the group that had an object added now has more total objects. Many preschool children are even able to solve simple word problems (Clements & Sarama, 2007; Starkey et al., 2004). This domain can be measured in a number of ways, including addition and subtraction with or without objects through story problems (Clements & Sarama, 2007; Starkey, Klein, & Ramirez, 2002; Jordan et al., 2007; van de Rijt et al., 1999), two-set addition (Klein, Starkey, & Ramirez, 2002), and composition/decomposition of sets (Clements & Sarama, 2007). It is also important to assess number combinations (early-level formal arithmetic problems) to determine whether a child with advanced informal skills has begun the transition to formal mathematics skills. In Table 3, the arithmetic reasoning skills that have been suggested are presented along with brief descriptions of each skill and examples of how each skill typically is assessed.

Validity Concerns

An overarching problem within these three domains is the lack of consistency with which specific skills are identified and assessed across studies and measures. As a result of these inconsistencies, no two assessment measures identify the same core set of skills and, in some cases, when comparable skills are identified, they are labeled or assessed differently.
In Table 4, a chart of the skills assessed within several major informal mathematics assessment measures is presented.¹ As the table illustrates, the constructs of counting, numerical relations, and arithmetic reasoning are not well established.

¹ This table only indicates skills that are assessed by at least one item, not actual subtests. This table does not indicate that these measures assess the full range of ability within each skill. No empirical evaluations have been released publicly by the authors of these tests regarding the validity of these tasks as separate units of measurement.


Table 3

Names, descriptions, and means of assessing Arithmetic Reasoning skills

Skill: Addition/Subtraction with objects
Description: Ability to identify specific changes in the magnitude of a set of objects.
Assessment examples:
  REMA: Children were given a specific number of objects (e.g., 3) and told how many objects they were given. They were then given another quantity of the same object (e.g., 2) and told how many more of that object they received. They were then asked how many they were given all together.
  CMA: A number of objects are placed in front of the child. The child is then asked to determine the total number of objects on the table. The objects are put in a box, and the child is told that another number of objects is being added to the box. The child is asked to determine the total number of objects in the box. Similar procedures are used for subtraction.
  NSC: The experimenter placed X objects on the table and told the child how many objects were there. The objects were then covered, and objects were either added or subtracted one at a time. The child was told how many objects were being added or subtracted. The objects remained covered, and the child was asked how many total objects were now on the table.
  ENT: Children are shown a picture of a specific number of objects (e.g., 9 marbles). They are then told that a number of the objects (e.g., 3) are lost. They are then asked to identify the picture at the bottom of the page with the number of objects that are left.
  TEMA-3: The examiner places block(s) on the table, then covers the block(s). The examiner then places more blocks on the table and proceeds to hide them with the other blocks. The child is then instructed to “Make [theirs] the same as [the examiner's].” Children are also presented with story problems and instructed that they can use their fingers or blocks to help solve the problems.


Table 3 - continued

Skill: Addition/Subtraction without objects
Description: Ability to solve simple addition and subtraction story problems apart from the formal written structure.
Assessment examples:
  CMA: Story problems were verbally presented to children.
  NSC: Story problems were verbally presented to children, such as “Jill has two pennies. Jim gives her one more penny. How many pennies does Jill have now?”
  NKT: Story problems, such as “If you had 4 chocolates and I gave you 3 more chocolates, how many would you have?”
  TEMA-3: Children are presented verbally with story problems.

Skill: Equivalent Sets
Description: Ability to equally divide a given number of objects between a different number of groups and, if necessary, recognize that the set cannot be divided equally.
Assessment examples:
  CMA: Children are asked to divide a number of objects amongst a different number of individuals (e.g., equally divide these 6 objects among these 3 people).
  TEMA-3: Children are given a number of blocks and asked to share the blocks fairly between two people.

Skill: Two-set addition
Description: Ability to recognize that a change in quantity to one set of objects results in a change to that set’s quantitative relation to another set.
Assessment examples:
  CMA: Objects are added to two separate containers, one at a time (e.g., one in container one, one in container two, another in container one, etc.). The child is then asked to determine if the containers have the same number of objects or if one has more objects (the child cannot see inside the containers). One or two more objects are then either added to or subtracted from one of the containers, and the child is again asked to determine if the containers have the same number of objects or which has more.

Skill: Number composition/decomposition
Description: Ability to determine changes in initial quantities by knowing initial and final quantities.
Assessment examples:
  REMA: The examiner places a specific number of objects (e.g., 4) in front of the child and counts them out. The examiner then covers the objects and either adds or subtracts a specific number of objects. The child is then shown the total number of objects and asked to identify how many objects were either added or subtracted.


Table 3 - continued

Skill: Number Combinations
Description: Ability to solve basic addition and subtraction problems, such as 1 + 1 = 2, presented verbally and/or visually.
Assessment examples:
  REMA: Children were verbally asked questions such as “How much is 2 + 7?”
  NSC: Children were verbally asked addition and subtraction problems such as, “How much is 2 plus 1?” or, “How much is 2 minus 1?” The same problems were asked in this task as were asked in the verbal and nonverbal problem-solving groups.
  NKT: Children are verbally asked questions such as, “How much is 2 + 4?” or, “How much is 8 take away 6?”
  TEMA-3: Children are visually shown addition and subtraction problems. The problems are then read to the child, and the child is asked to solve the problem.

Note. REMA = Research-based Early Math Assessment (Clements, Sarama, & Liu, 2008), CMA = Child Math Assessment, NSC = Number Sense Core battery (Jordan et al., 2007), ENT = Early Numeracy Test (van de Rijt, van Luit, & Pennings, 2003), NKT = Number Knowledge Test (Griffin & Case, 1997), TEMA-3 = Test of Early Mathematics Ability – Third Edition (Ginsburg & Baroody, 2003).


Table 4

Listing of the important Counting, Numerical Relations, and Arithmetic Reasoning skills on six measures of informal mathematics skills and the National Council of Teachers of Mathematics standards

Measures: NCTM, REMA, CMA, NSC, ENT, NKT, TEMA-3

Counting
  Rote Counting X X X X X X
  Counting forward X X X
  Counting backward X X X
  Counting error identification X X X
  Structured Counting X X X X X X X
  Cardinality X X X
  Resultative Counting X
  Counting Subsets X X X X X
  Subitizing X X
  Estimation X

Numerical Relations
  Ordinal Numbers X X X
  Relative Size X X X X
  Number Comparison X X X X X X
  Set Comparison X X X X X
  Number Order X X X X
  Sequencing X X X X
  Set Reproduction X X X X
  Number Identification X X
  Numerals X X X

Arithmetic Reasoning
  Addition/Subtraction with objects X X X X X
  Addition/Subtraction without objects X X X X
  Two-set addition X
  Number Composition X
  Equivalent Sets X X
  Number Combinations X X X X

Note. NCTM = National Council of Teachers of Mathematics Standards, REMA = Research-based Early Math Assessment (Clements, Sarama, & Liu, 2008), CMA = Child Math Assessment, NSC = Number Sense Core battery (Jordan et al., 2007), ENT = Early Numeracy Test (van de Rijt, van Luit, & Pennings, 2003), NKT = Number Knowledge Test (Griffin & Case, 1997), TEMA-3 = Test of Early Mathematics Ability – Third Edition (Ginsburg & Baroody, 2003).


The inconsistent nature of the definition of informal number-related mathematics skills stems from two main problems: (a) a poorly synthesized research foundation and (b) reliance on methods of construct development that are inherently flawed. Several of the common measures of informal number-related mathematics skills were developed using the NCTM standards as a framework, which, as previously mentioned, were developed by expert consensus. This type of methodology, especially when used for early mathematics construct development, often yields a wide variety of expert opinions. Howell and Kemp (2005), in an attempt to synthesize expert knowledge on the definition of number sense, found a general lack of consensus amongst the experts in their study, highlighting the limitations of committee consensus. There is a clear need to synthesize the informal number-related mathematics knowledge base and to identify and support these constructs through empirical means. Well-defined constructs of informal number-related mathematics skills would provide a strong research foundation and would enhance and clarify communication within the field of early mathematics research.

The field of early mathematics research is not only lacking foundational research that identifies the key skills within each mathematics domain; there is also a dearth of evidence examining the relations between these domains. It is not clear whether these domains are, in fact, separate aspects of informal mathematics skills or different developmental stages of a general informal mathematics skill construct. Several research groups have presented different models of the relations between these domains, but there is little evidence supporting any of these models. For example, in the assessment measure from the Building Blocks project, the Research-based Early Math Assessment (REMA; Clements, Sarama, & Liu, 2008), a one-factor model of early mathematics skills was found to be acceptable.
However, no other potential models were examined, and the model also included a variety of early geometry tasks. The Test of Early Mathematics Ability – Third Edition (TEMA-3; Ginsburg & Baroody, 2003) is another measure that utilizes a general one-factor model of early mathematics skills. With this measure, it is possible to calculate individual scores on the numerous skill sets, but the measure is designed primarily as a general assessment of early number skills, and the scores on individual skill sets have not been validated empirically. Another measure, the Child Math Assessment (CMA; Starkey, Klein, & Ramirez, 2002), utilizes a two-factor model in which counting and quantity-relation skills form one factor and arithmetic reasoning forms another. A third measure, the Early Numeracy Test (ENT; van de Rijt et al., 1999), is one of the most widely used


measures of early mathematics skills outside of the United States. Through empirical validation, the ENT was found to have a one-dimensional structure that measures two related constructs: quantity-relation skills and counting skills (Aunio et al., 2006). However, only a limited assessment of arithmetic reasoning skills was included in this measure, and these arithmetic reasoning skills were subsumed within the domain of counting skills. Only one other published study has examined empirically the structural relation between any of these factors, and that study was conducted with kindergarten children rather than preschool children (Jordan, Kaplan, Olah, & Locuniak, 2006). Jordan and colleagues, utilizing an exploratory factor analysis, found that a two-factor model of number (including both counting and relational skills) and arithmetic reasoning best explained the relation between the two domains. Unpublished research by Okimato (2000, cited in Kalchman et al., 2001, p. 3) suggested that counting and numerical relations are separate factors, but it was not evident that any measures of arithmetic reasoning were included in the analysis. Overall, the current understanding of the skills that form each informal number-related domain, and of the relations between those domains, is underdeveloped and in some cases contradictory, and it must be evaluated formally utilizing a diverse sample of preschool children.

The current issues in developing a solid research foundation for early number-related skills have profoundly limited the ability of researchers in this field to connect empirically informal mathematics skills to formal mathematics skills. As of yet, no study has examined the developmental trajectories from specific informal number-related skills in preschool to later mathematics skills.
There is, however, some evidence linking specific early mathematics skills in kindergarten to specific mathematics skills in later grades (Bryant et al., 2006; Chard et al., 2005; Gersten, Jordan, & Flojo, 2005; Griffin & Case, 1997; Jordan et al., 2009). Measures of quantity discrimination, identification of missing numbers, and knowledge of magnitude comparison are all reasonable correlates of later mathematics skills (Bryant et al., 2006; Chard et al., 2005; Gersten et al., 2005; Griffin & Case, 1997). Although it is evident that counting-based skills and numerical-relations skills are strong predictors of later mathematics skills, earlier assessment of these skills, as well as of other early number-related skills, would be necessary to gauge the full impact of specific skills on the developmental trajectories of later mathematics skills. Longitudinal work from the preschool level cannot be conducted effectively until a broad and deep understanding of preschool number-related skills has been established. Once the foundational level of skills has been assessed and the domains of early mathematics skills have


been examined adequately, the refinement of the broad base of early mathematics skills to the core early mathematics skills can be conducted.

Overview of Current Study

This study was intended to test and refine measures of the constructs of Counting, Numerical Relations, and Arithmetic Reasoning, as well as to identify the model that best represents the relations between these domains in a sample of preschool children. Empirical validation of the construct of informal number-related skills is necessary to provide a solid foundation for the field of early mathematics research. Understanding the structure of, and relations between, mathematics skills at school entry will enable researchers to examine accurately the developmental sequencing of mathematics skills. Accurate identification of the developmental process will enable researchers and educators to identify children who are likely to struggle later in mathematics and to target effectively the areas of deficit. Without a comprehensive understanding of these early constructs, any attempts to build upon current knowledge of mathematics development may not be grounded firmly in an accurate understanding of early mathematics. Without defining and examining these foundational issues, intervention and curriculum research may be limited in effectiveness and application because researchers may be working from different knowledge bases. Effective communication and synthesized research efforts may also be impaired.

This study took place in two phases. First, because no existing measure captured both the depth and the breadth of informal number-related skills, it was necessary to develop a measure that assessed these skills accurately in preschool children. Initial item evaluation and reduction took place in the first phase. Further measure refinement took place during the second phase of the study.
Because the specific constructs of Counting, Numerical Relations, and Arithmetic Reasoning are not well defined in the available body of research, the primary goal of the second phase of the study was to define the skills that contributed to each construct and to examine the relations between these three constructs. In the latter phase, each construct was refined in the context of an Item Response Theory (IRT) analysis and was validated utilizing Confirmatory Factor Analysis (CFA). The evaluation of the relations between constructs was completed by comparing five models of informal number-related skills through a series of CFAs. The models that were compared were (a) a three-factor model of Counting, Numerical Relations, and Arithmetic Reasoning; (b) the two-factor model indicated by Starkey, Klein, and Ramirez (2002)


with Counting and Numerical Relations skills as one factor and Arithmetic Reasoning as another; (c) the two-factor model suggested by van de Rijt et al. (1999), with Counting and Arithmetic Reasoning as one factor and Numerical Relations as a separate factor; (d) another two-factor model in which Counting was a separate factor from Numerical Relations and Arithmetic Reasoning; and (e) the one-factor model indicated by Clements and Sarama (2007).
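Comparisons among competing CFA models of this kind are typically adjudicated with relative fit statistics. As an illustrative sketch only (the log-likelihood values, parameter counts, and sample size below are hypothetical placeholders, not results from this study), ranking the five candidate models by the Bayesian Information Criterion (BIC), where lower values indicate a better balance of fit and parsimony, might look like this:

```python
import math

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: -2*logL + k*ln(N). Lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical (log-likelihood, free-parameter) pairs for the five candidate
# models -- placeholder numbers, NOT estimates from this study.
models = {
    "three-factor: Counting / Numerical Relations / Arithmetic Reasoning": (-4210.0, 48),
    "two-factor: Counting+Relations vs. Reasoning": (-4232.0, 46),
    "two-factor: Counting+Reasoning vs. Relations": (-4251.0, 46),
    "two-factor: Counting vs. Relations+Reasoning": (-4240.0, 46),
    "one-factor: general informal mathematics skill": (-4280.0, 45),
}

n_obs = 300  # hypothetical number of children assessed
scores = {name: bic(ll, k, n_obs) for name, (ll, k) in models.items()}

# Rank the models from best (lowest BIC) to worst.
for name in sorted(scores, key=scores.get):
    print(f"{scores[name]:9.1f}  {name}")

best = min(scores, key=scores.get)
```

In practice, the log-likelihoods and parameter counts would come from fitting each CFA to the same item-level data, and the same comparison logic applies to other relative fit indices (e.g., AIC).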


PHASE 1: INITIAL MEASURE CONSTRUCTION

Method

Participants

In Phase 1 of this study, 323 children were recruited from 21 preschools in the Tallahassee area. Four of the preschools were Head Start centers (84 children), which primarily serve children from families at or below the Federal Poverty Guidelines. The remaining centers were private, fee-for-service preschools. Of the children recruited to participate in the study, 306 completed the assessment. Seventeen children did not complete the assessment for a variety of reasons, including that the child's family moved out of town, the child was sick during the testing period at his or her school, or the child refused to participate. The children who completed the assessment were evenly split by sex (50.5% female) and were 46.9% White, 41.0% African-American, and 12.1% other race/ethnicity. Children ranged in age from 35.03 months to 66.84 months (M = 51.11 months, SD = 7.63), were primarily English-speaking, and had no known developmental disorders. Parental consent was obtained for each participating child (Appendix A). Consent was also obtained from each preschool at which children were assessed (Appendix B). Child assent also was obtained prior to assessing each child (Appendix C).

Materials

Although there are several measures available that assess informal mathematics skills, none of these measures assessed the full breadth and depth of informal number-related skills. Thus, it was necessary to construct a measure that assessed all of the skills related to Counting, Numerical Relations, and Arithmetic Reasoning, as well as a broad range of ability within each skill. Three separate subtests, one for each of the three domains of informal number-related skills, were constructed (see Appendices D, E, and F).
The skills assessed by each subtest were a compilation of skills presented in the available body of literature and commonly used informal number-related skills assessment measures (see Tables 1, 2, and 3). When possible, individual items for each skill set were created using questions from other tests as a framework. When comparable questions for a specific skill were used across other assessment measures, similar items were created for this measure. When diverse means of assessing specific skills were used across other assessment measures, items reflecting that diversity were created for this measure. Further, multiple methods of representing quantities (e.g., objects, dots, or pictures)


were used to determine whether differences in performance resulted from the type of quantity representation. All items were worded in such a way that they conveyed the desired meaning and would be understandable to a preschool child. Additionally, items that use pictures or stories (e.g., story problems in the Arithmetic Reasoning tasks) were reviewed carefully to ensure that the pictures and stories were appealing to both boys and girls.

Counting Skills Subtest

The Counting skills subtest (see Appendix D) comprised 10 tasks: rote counting, counting up from a number other than one, counting backward, identifying counting errors, structured counting, cardinality, resultative counting, counting a subset, subitizing, and estimation. All of the tasks were designed to assess varying aspects of counting ability.

Rote counting. The rote counting task used in the Child Math Assessment (CMA) could be used to calculate rote-counting scores from all other assessment measures. In this task, children were asked to count as high as they could (Item 1). If a child paused for an extended period of time during the counting or made a mistake, the child was prompted with the next number by the examiner. Only one prompt was allowed. After a second mistake or extended pause, the task was stopped. The highest numbers counted to prior to the first and second mistakes/pauses were recorded.

Counting forward. In the first part of this task, children were asked to count to a specified number, starting at a number other than one (Items 2-5). No initial assistance was provided to the child, as per the methodology utilized in the REMA. In the second part of this task, the examiner started the count sequence and, at a specified number, instructed the child to continue counting up until told to stop, as per the methodology used in the ENT (Items 6-9).

Counting backward.
In the first part of this task, comparable to the counting backward task on the ENT, children were asked to count a set of dots backward to one from a specified number (Items 10-13). No initial counting assistance was provided for this task. In the second part of the task, similar to the REMA and TEMA-3, children were given a specific number and asked to count backward to one (Items 14-17). No assistance was provided by the examiner.

Identifying counting errors. This task required children to identify correct or incorrect sequences of counting. In this task, similar to the CMA and the Number Knowledge Test (NKT), the examiner counted a set of either five or 10 dots correctly or incorrectly (Items 18-33). Some of the count sequences were incorrect due to repetition of numbers, reversal of numbers in the


counting sequence, skipping numbers, or skipping dots. Children were asked if the counting sequence was correct. If the child stated that the sequence was incorrect, the examiner asked the child what was wrong with the counting sequence.

Structured counting. This task consisted of three parts: counting dots (Items 34-38), counting pictures (Items 39-43), and counting objects (Items 44-48). Similar to the structured counting tasks on all of the major assessment measures, in this task children were presented with a set of dots, pictures, or objects and asked to count the presented items. The three types of counting tasks (dots, pictures, and objects) all used the same quantities to be counted.

Cardinality. This task was comparable to the cardinality tasks on the REMA, NSC, and TEMA-3 and was assessed in the context of the structured counting task (Items 34-48). After the child had completed each structured counting item, he or she was asked to indicate how many objects/pictures there were in all. A correct response indicated that the child likely understood that the last number counted means “how many.” In the cardinality tasks on the NSC and TEMA-3, the examiner hides the counted objects/pictures prior to asking “how many?” However, removing the counted objects/pictures from view may force children to rely on potentially underdeveloped memory skills rather than on their mathematics skills. Therefore, in the current task, the counted items were not hidden after the initial counting phase. Examiners noted whether or not the child recounted the objects.

Resultative counting. This task was similar to the resultative counting task on the ENT. The child was told that it was time for another counting game, but that for this task, the child had to place his or her hands underneath his or her knees and not touch the items while counting (Items 49-53). The child was asked to count the presented items out loud.

Counting a subset.
In the first part of this task, similar to the counting a subset task on the REMA, ENT, and TEMA-3, children were presented with a specific quantity of objects (e.g., 15) and asked to count out a smaller set of objects (e.g., 5) from the larger set (Items 54-58). The second task, although similar to the task in the CMA and NKT, was modified to better fit the construct. In the CMA and NKT counting a subset tasks, children are shown a line of dots of two colors in a pattern (e.g., B R B R B B) and asked to count all of one color. Because it is possible that children may not be able to distinguish between colors or may not know color names, and because the physical differences between the presented subgroups were irrelevant to the task, the colored


dots were replaced with two different pictures (dogs and cars) in this task. These two types of pictures are easily distinguishable from one another for a preschool child. The child was instructed to count all of one object (Items 59-63).

Subitizing. In this task, children were briefly presented (2 seconds) with either a set of dots (Items 64-70) or a set of pictures (Items 71-77) and instructed to say how many dots or pictures were presented. This task is directly comparable to the subitizing task on the REMA, with the addition of subitizing pictures.

Estimation. Comparable to the estimation task in the NSC, in this task children were shown a set of dots or pictures (between 10 and 100) and asked to estimate the number of dots (Items 78-81) or pictures (Items 82-85) on the page. A response was considered correct if the child provided a response within 25 percent of the exact answer.

Numerical Relations Subtest

The Numerical Relations subtest (see Appendix E) was intended to assess a child's knowledge of numbers and quantity and of the associations between the numbers on the number line. Nine skill areas were used to assess this domain: ordinality, relative size, number comparison, set comparison, number order, sequencing, set reproduction, number identification, and numerals.

Ordinality. In this task, the child was presented with a line of pictures. The child was then asked to identify the nth picture in the line (Items 1-8). This task is comparable to the ordinality task on the REMA, CMA, and ENT.

Relative size. This skill was assessed in two ways. First, similar to the NSC and TEMA-3, children were shown an array of 3, 4, or 5 numbers (one number at the top of the page and two, three, or four at the bottom of the page). Children were then asked to identify which of the numbers at the bottom of the page was numerically closest to the number at the top of the page (Items 9-12).
The second method, modeled after the relative size tasks used in the REMA and NKT, was similar to the first task except that the questions were presented verbally and without the printed arrays (Items 13-16).

Number comparison. The number comparison tasks on the REMA, CMA, NSC, NKT, and TEMA-3 are all presented verbally. These tasks require children to identify which of two numbers is bigger or smaller. However, having only two response options, as was true for the number comparison tasks on other measures, results in a large amount of guessing. The task


in the current measure utilized similar problems (Items 25-32) but also included questions presented visually and/or with more response options (i.e., rather than asking which of two numbers is bigger, the question was which of three or four numbers is the biggest; Items 17-24).

Set comparison. In the set comparison tasks of the REMA, ENT, NKT, and TEMA-3, children are presented with sets of either dots or pictures. Response options for this task on other measures typically comprise either two or four sets of images. In the current measure, children were presented with two to four sets of dots (Items 33-40) or pictures (Items 41-48), and each set presented a different quantity. Children were asked to identify which set had the most or fewest dots or pictures.

Number order. In this task, as in the REMA, CMA, NSC, and NKT, children were asked verbally what number comes before or after another number (Items 49-56). On some of those measures, children also were asked what number comes two before or after another number. However, the latter task appeared to tap arithmetic skills rather than knowledge of number order; thus, this assessment was limited to asking what numbers come immediately before and immediately after the given number.

Sequencing. The REMA and ENT both used similar sequencing tasks. In the current measure, children were presented with three to five cards, each with a set of dots of varying quantity. Children were then instructed to put the sets of dots in order from fewest to most (Items 57-60). The sequencing task on the CMA is different from the sequencing tasks on the REMA and the ENT. On the CMA task, children are asked to order sets based on magnitude (i.e., physical size) rather than quantity. However, the second aspect of the CMA sequencing task, inserting another set into the order, was adapted for the current measure's sequencing task.
After each sequencing item, children were given an additional card with a different set of dots and instructed to insert it into the appropriate place in the order.

Set reproduction. In Part A of this task, similar to the set reproduction tasks on the REMA, CMA, ENT, and TEMA-3, children were presented with a set of dots of a specified number. They were then instructed to take blocks from a pile and make them look just like the set of dots (Items 61-64). In Part B, similar to the second set reproduction task in the ENT, children were shown a primary set of pictures and four other sets of pictures, each with a different quantity. One of the other sets of pictures had the same quantity of images as the


primary set of pictures. Children were directed to identify which sets had the same number of pictures (Items 65-68).

Number identification. In this task, as in the number identification tasks on the NSC and TEMA-3, children were presented with flashcards in a random order, each with a numeral from 1-15 (Items 69-83). They were shown the flashcards one at a time and asked, “What number is this?” If a child stated the name of a letter for a specific number (e.g., L instead of 1) or named the digits individually for two-digit numbers, the child was asked, “What is another name for this number?”

Numerals. In the first part of this task, which was comparable to the numerals task on the REMA, children were presented with a numeral at the top of the page and four sets of dots below. Children were instructed to identify which of the sets meant the same thing as the number at the top of the page (Items 84-87). The numerals task on the TEMA-3 was designed for older children (write the numeral of the quantity shown) and was not appropriate for preschool children. Thus, a modified version of this task was designed for the current assessment. In the second part of the task, children were presented with a set of dots at the top of the page and four numerals at the bottom. They were instructed to identify which of the numerals meant the same thing as the set of dots at the top of the page (Items 88-91).

Arithmetic Reasoning Subtest

The Arithmetic Reasoning subtest (see Appendix F) was intended to assess children's ability to understand changes in quantity and other basic arithmetic abilities. Six skill domains were used to assess this ability: addition and subtraction with objects, addition and subtraction without objects, two-set addition and subtraction, equivalent sets, number composition/decomposition, and number combinations.

Addition and subtraction with objects.
This task was similar to the addition and subtraction with objects tasks used in the REMA, CMA, NSC, ENT, and TEMA-3. The examiner placed an empty box on the table in front of the child and stated how many discs he or she was placing in the box (e.g. 2). The examiner then placed the quantity of discs into the box, one at a time. The examiner then informed the child that another quantity (e.g. 1) of discs was being added to the box. The examiner then placed the new quantity in the box. The child was asked to identify how many objects were now in the box (Items 1-4). Similar procedures were
used for subtraction tasks, but rather than a second set of objects being added to the box, a set of objects was removed, one object at a time (Items 5-8). Addition and subtraction without objects. In this task, similar to the CMA, NSC, NKT, and TEMA-3, children were presented verbally with story problems that did not contain distracters (e.g. irrelevant information). These story problems were simple addition (Items 9-12) or subtraction (Items 13-16) problems that were appealing to children. Two-set addition and subtraction. In this task, which was comparable to the two-set addition/subtraction task on the CMA, two empty boxes were placed on the table. The examiner instructed the child to pay attention as discs were added to the boxes. The examiner placed a series of discs into the boxes one at a time (alternating the box the discs were placed in). When the examiner had finished placing discs in each box, children were asked if the boxes contained the same or a different number of discs (Items 17-24). After the child responded correctly or was informed of the correct response, the examiner added objects to or subtracted objects from one of the boxes. The child was then asked if the boxes had the same or a different number of objects. Equivalent sets. In this task, which was similar to the CMA and TEMA-3, children were presented with a set of objects (e.g. six blocks) and a picture of a different quantity of objects (e.g. a picture of three tables). Children were instructed to divide the objects equally amongst the images so that all sets were equal (e.g. two blocks were placed on each picture of a table; Items 25-30). Number composition/decomposition. This task was modeled after the number composition/decomposition task in the REMA. Children were presented with a set of objects on the table and informed verbally of the quantity. The objects were hidden and more objects were either added to or subtracted from the initial set. 
The new set was then presented to the child, and the child was asked to identify how many objects were either added to or subtracted from the initial set (Items 31-38). Number combinations. In the REMA, NSC, and NKT, children are verbally presented with number combinations. However, that task was likely too difficult for many preschool children to accomplish. For the current measure, the items were presented in three different ways. Children were presented with four basic addition tasks verbally (Items 43-46), visually (Items 47-50), or both visually and verbally (Items 39-42). For example, a child was presented
with the problem “1 + 1 =” and asked, “How much is 1 + 1?”, or the child was asked that question without the visual aid, or the child was presented with the problem without verbal cues. Procedure Assessment Procedure After consent forms were signed by parents and collected by the researcher, children were assigned randomly to one of three groups (103 Counting, 103 Numerical Relations, 100 Arithmetic Reasoning). Children in each group were then assessed on one of the three domain- specific subtests. Assessments were conducted by individuals who had either completed or were working towards completion of a bachelor’s degree. Assessments took place in the local preschools during non-instructional time in a quiet room designated by the individual preschool directors. Each domain-specific subtest took approximately 30-45 minutes to complete. Assessments typically were conducted in one or two testing sessions. However, it was necessary, at times, to divide the testing session into two or three sessions depending on the individual needs of some children. Analytic Procedure Data entry. Data were entered twice into SPSS version 16.0 and the two files were compared. Any responses that did not match between the two files were reexamined and corrected. Test item reduction for this phase took place in a four-step procedure. Step 1. Internal consistency (Cronbach’s Alpha) of all tasks was calculated both before and after item reduction. On tasks found to have poor internal reliability (α < .70), individual items were examined to determine if the poor reliability was due to specific items or to the overall set of items. If a task was shown to have poor reliability at all stages of analysis, it either was removed from further examination or revised. Step 2. Items that were poor measures of their respective constructs, too easy for children, or too advanced for children in this age group were removed. 
To accomplish this step in item reduction, items that were answered correctly at very low rates (less than 10%) or at very high rates (more than 90%) and had poor item fit were removed from the measure. The bottom limit of 10% was increased to the chance percentage for multiple-choice tasks (e.g. 25% for a four-option multiple-choice item). To test for item fit, a confirmatory factor analysis was conducted on each task. Items with non-significant factor loadings were considered to have inadequate fit.
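The Step 1 and Step 2 criteria described above can be expressed compactly. The sketch below is purely illustrative: the actual analyses were conducted in SPSS and Mplus, the item names and rates shown are invented, and the factor-loading (item fit) criterion is omitted.

```python
# Illustrative sketch of the Step 1 and Step 2 screening criteria.
# Hypothetical data -- the dissertation's analyses were run in SPSS/Mplus.
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a (children x items) matrix of item scores."""
    k = len(scores[0])
    items = list(zip(*scores))                    # transpose to per-item columns
    item_vars = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_vars / total_var)

def flag_items(pass_rates, n_options=None):
    """Flag items outside the 10%-90% pass-rate band; for multiple-choice
    items the floor is raised to the chance rate (e.g., .25 for 4 options)."""
    flagged = []
    for item, rate in pass_rates.items():
        floor = 0.10
        if n_options and item in n_options:
            floor = max(floor, 1.0 / n_options[item])  # chance-level floor
        if rate < floor or rate > 0.90:
            flagged.append(item)
    return flagged

# Identical item columns -> perfectly consistent responses -> alpha = 1.0
perfect = [[0, 0, 0], [1, 1, 1], [1, 1, 1], [0, 0, 0], [1, 1, 1]]
print(round(cronbach_alpha(perfect), 2))  # 1.0

rates = {"item_a": 0.05, "item_b": 0.20, "item_c": 0.95, "item_d": 0.24}
print(flag_items(rates, {"item_d": 4}))   # ['item_a', 'item_c', 'item_d']
```

The chance-level floor matters for items like the hypothetical item_d above: its 24% pass rate clears the generic 10% floor but falls below the 25% rate expected from guessing on a four-option item.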
Additionally, any remaining items with low item-total correlations (< .20) within each task and low factor loadings were removed. Step 3. Items that appeared to function similarly within subtests were removed. Inter-item correlations for items on each task were calculated, and any item that correlated with another item at r > .90 and had a comparable mean was removed. Step 4. Tasks that assessed the same skill using different methods or presentations were compared, and the tasks were reduced in length by removing redundant items. The tasks presented in different ways (e.g. dots, pictures, objects; free response, multiple-choice) were compared based on response patterns and response rates. Responses were examined to determine whether the tasks assessed skills at comparable levels or if one type of task was more difficult to accomplish. Results Preliminary Analyses Demographic information for the three groups is presented in Table 5. Preliminary analyses indicated that the three groups of children did not differ by sex, χ2(2, N = 306) = 1.68, p = .432; ethnicity, χ2(4, N = 306) = 0.46, p = .977; or age, F(2, 305) = 0.96, p = .383.

Table 5
Demographic information of children participating in Phase 1 by group

                               Sex                Race/Ethnicity                Age
Group                    Male   Female   Caucasian   African American   Other   (Months)
Counting                   47       56          48                 41      14      51.97
Numerical Relations        56       47          49                 43      11      50.54
Arithmetic Reasoning       48       52          56                 42      12      50.97
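The sex-by-group comparison reported above can be reproduced directly from the counts in Table 5. The sketch below uses SciPy's chi-squared test of independence as a stand-in for the original SPSS analysis (SciPy availability is assumed).

```python
# Chi-squared test of independence on the sex-by-group counts from Table 5.
# SciPy is assumed available; the original analyses were conducted in SPSS.
from scipy.stats import chi2_contingency

counts = [
    [47, 56],  # Counting: male, female
    [56, 47],  # Numerical Relations: male, female
    [48, 52],  # Arithmetic Reasoning: male, female
]
chi2, p, df, expected = chi2_contingency(counts)
print(f"chi2({df}, N = 306) = {chi2:.2f}, p = {p:.3f}")
# Matches the reported chi2(2, N = 306) = 1.68, p = .432
```

Because the table is larger than 2 x 2, `chi2_contingency` applies no continuity correction, so the statistic agrees with the value reported in the text.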


Primary Analyses The four-step analytical process is discussed within the context of each task separately. Factor loadings were calculated for each task in Mplus version 5.1 (Muthén & Muthén, 2008a) using a Weighted Least Squares (WLS) estimator in a confirmatory factor analysis framework. Several of the tasks were analyzed using a Weighted Least Squares Mean and Variance Adjusted (WLSMV) estimator due to a non-positive definite full weight matrix, which prevented the WLS estimator from running properly. A non-positive definite matrix usually occurs when two or more items in a model are highly correlated, the sample size is small relative to the skewness of the included items, or both. Model parameters for the WLS estimator cannot be calculated if a non-positive definite matrix occurs because the full weight matrix cannot be inverted. The WLSMV estimator does not require an inversion of the matrix and thus can be used to compute model parameters in the presence of a non-positive definite full weight matrix. However, chi-square fit indices cannot be computed with the WLSMV estimator. Muthén and Muthén (2008b) recommend three possible solutions to this problem: (a) remove one (or more) of the highly correlated items, (b) increase sample size, or (c) use the WLSMV estimator as long as chi-square fit indices are not desired. The WLSMV estimator was used rather than removing highly correlated items because chi-square fit indices were not needed in the current analyses and retention of all items in this phase of the analysis was desired. Revisions to each task are discussed in the following section, and psychometric information for the items is presented in Appendix G. Additionally, no items within a given task correlated at r > .90. Therefore, no items were removed as part of Step 3, and this step will not be discussed in the following section. Counting Rote counting. 
The correlation between the highest number counted without an error (M = 18.38, SD = 16.53) and the highest number counted with only one error (M = 21.84, SD = 18.16) was calculated in SPSS version 16.0. The two scores correlated at r = .95, p < .001. This task was revised to stop after the child made the first error or counted to 100 without an error. Counting forward. The counting forward without assistance task and the counting forward with assistance task both exhibited good internal consistency (α = .83). The percentage
of children who received full credit for items, partial credit2, item-total correlations, and factor loadings are presented in Table G-1. All items had high item-total correlations (r = .59 to .70) and significant factor loadings (β = .82 to .96). These two tasks were highly correlated (r = .73, p < .001) and the scoring patterns between both tasks were identical (e.g., counting up from lower numbers was easiest). The counting forward with assistance task appeared to span a broader range of abilities than the counting forward without assistance task (e.g., only 54% of children answered the first counting forward without assistance question correctly versus 71% who answered the first counting forward with assistance question correctly). Because of these factors, the counting forward with assistance task was retained for the revised version of the measure, and the items from the counting forward without assistance task were removed. Counting backward. The counting backward with dots task and the counting backward without assistance task both exhibited acceptable internal consistency (α = .75 and .72, respectively). The percentage of children who received full credit for items, partial credit, item-total correlations, and factor loadings are presented in Table G-2. All items on both tasks had acceptable item-total correlations (r = .28 to .75) and significant factor loadings (β = .73 to .99). The two counting backward tasks were significantly correlated (r = .64, p < .001). One concern with this task was the overall difficulty of the items. No item was correctly answered by more than 50% of participants. Thus, the task was modified for the revised version of the assessment to include assistance comparable to that used in the counting forward task, which was retained. The assistance on this task was restructured to frame the instructions as “counting backward like you would when a rocket ship blasts off.” Identifying counting errors. 
The internal consistency of the items using 5 dots was good (α = .84), as was the internal consistency of the items using 10 dots (α = .81). The total scores for these two tasks were very highly correlated (r = .85, p < .001) and response patterns were identical. When all items were combined, the overall task had a high internal consistency (α = .91). The percentage of children who received full credit for items, partial credit, item-total correlations, and factor loadings are presented in Table G-3. All counting error items had high item-total correlations (r = .47 to .82) and acceptable factor loadings (β = .64 to .95). The four “correctly counted” items had low item-total correlations (r = .00 to .13) and extremely low factor loadings (β = -.13 to .19). Further, these items were correctly answered by over 90% of participants. Although these four items were poor measures of this construct, two of these items (one 5-dot and one 10-dot “correctly counted” item) were retained for the revised version of this measure for administration purposes. To reduce further the size of this task, the better item from each type of error was retained (i.e., of the two “counted a dot twice” items, the one with the higher item-total correlation was retained). Thus, three 5-dot items (Items 18, 23, and 25) and three 10-dot items (Items 26, 28, and 30) were removed from the revised version of the measure. The revised task consisted of Items 20, 22, 24, 29, 31, and 32 and had an internal consistency of α = .89.

2 Participants could receive partial credit (1 point) for a response with one error in the counting forward and counting backward tasks. Also, in the identifying counting errors task, participants who identified that an error was made, but could not identify what the error was, received 1 point rather than the full 2 points. All remaining tasks were scored 0 or 1; thus, partial credit was not awarded for any other task.

Structured counting. There were three different types of counting tasks: counting dots, counting pictures, and counting blocks. Each type of task had acceptable internal consistency (α = .70 to .73). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-4. All items had acceptable item-total correlations (r = .39 to .55) and significant factor loadings (β = .66 to .92). Correlations between the sums of the three tasks were also very high (r = .75 to .78). Comparable items (e.g., counting 3 dots and counting 3 blocks) had similar percent-correct scores and factor loadings. Overall, it appeared that the three types of counting tasks functioned similarly and were highly related, and thus were redundant. The counting pictures and counting objects tasks were removed from the revised measure. 
Additionally, because the counting dots task had an internal consistency that was only slightly above the minimally acceptable level, three new items (counting 6, 11, and 14 dots) were added to the revised measure. Cardinality. There were three different types of Cardinality tasks: Cardinality with dots, Cardinality with pictures, and Cardinality with blocks. Each type of task had good internal consistency (α = .82 to .83). The percentage of children who answered each item correctly, item- total correlations, and factor loadings are presented in Table G-5. All items had acceptable item- total correlations (r = .56 to .71) and significant factor loadings (β = .84 to .97). Correlations between the sums of the three tasks were also very high (r = .82 to .88, p < .001). Comparable items (e.g., Cardinality with 3 dots and Cardinality with 3 blocks) had similar percent correct scores and factor loadings. Overall, it appeared that the three types of cardinality tasks functioned similarly and were highly related, and thus were redundant. The Cardinality with
pictures and Cardinality with objects tasks were removed from the revised measure. Additionally, three new items (Cardinality with 6, 11, and 14 dots) were added to the revised measure. Resultative counting. This task had internal consistency comparable to that of the counting dots task (α = .72), as well as similar response patterns and similar mean scores for comparable items. The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-6. Item-total correlations were acceptable for all items (r = .45 to .57) and all factor loadings were significant (β = .72 to .89). Although this task was significantly correlated with the counting dots task (r = .54, p < .001), the correlation was not high enough to warrant removal of this task. Further, removal of any individual items resulted in internal consistency below α = .70. Thus, all items from this task were retained for the revised version of this measure. Counting a subset. Of the two counting a subset tasks, counting out a set of objects from a larger set had an acceptable internal consistency (α = .76) and counting a subset of pictures had below-acceptable internal consistency (α = .64). The two tasks also were highly correlated (r = .69, p < .001). When items from both tasks were combined, the internal consistency increased to .83. For further analyses, these tasks were combined into one task. The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-7. All items had acceptable item-total correlations (r = .27 to .62) and significant factor loadings (β = .65 to .93). All items were retained for the revised version of this measure. Subitizing. The subitizing dots task had a low internal consistency (α = .57) and the subitizing pictures task had a below-acceptable level of internal consistency (α = .65). 
The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-8. On the subitizing dots task, three items (Items 66, 67, and 70) had low item-total correlations (r = .18, .19, and .16, respectively) and factor loadings (β = .29, .32, and .34, respectively). Even with removal of these three items, the internal consistency did not reach acceptable levels (α = .52). Because of the low internal consistency, this task was removed from the revised version of the measure. Only one subitizing pictures item (Item 77; subitizing 10) had a low item-total correlation (r = .12) and a low factor loading (β = .20). When Item 77 was removed, the subitizing pictures task increased in internal consistency to α = .68.
This item was replaced with an item likely to function better (subitizing 7) for the revised measure. Estimation. The estimating pictures task had a low internal consistency (α = .24). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-9. Three of the items (Items 82, 84, and 85) had unacceptable item-total correlations (r = .14, -.14, and .16, respectively) and all four items had non-significant factor loadings. Due to the poor functioning of all these items, this task was removed from the revised version of the measure. The estimating dots task also had a low internal consistency (α = .38). One item (Item 80; estimate 50 dots) had both a low item-total correlation (r = .10) and factor loading (β = .32). When this item was removed, the internal consistency increased to α = .40. Rather than removing this task from the measure, four additional items were added. These four items were estimation multiple-choice items. On the multiple-choice items, children were presented with four sets of dots at the same time (10, 20, 50, and 100) and asked to identify which one was the stated number (e.g., “Which of these is 20?”) without counting. After each question, the sets of dots were rearranged into a different order. The resulting Estimation task was 7 questions long (3 free response and 4 multiple-choice). Numerical Relations Ordinality. The ordinality task had an internal consistency below acceptable levels (α = .60). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-10. Item 1 had a low item-total correlation (r = .15) and a low, but statistically significant, factor loading (β = .32). All other items had acceptable item-total correlations (r = .21 to .42) and acceptable, significant factor loadings (β = .46 to .77). 
During the testing process, it was observed that children frequently pointed to the first picture as the response to the first question, the second picture as the response to the second question, and so forth. The organization of this task may have contributed to the low factor loading of this item and the low reliability of the overall task. Thus, for the revised measure, items were rearranged so that the ordinal position of an item did not correspond to the order of the question (e.g., question number 1 was not “Which is the first duck in line?”). Relative size. The Relative Size task had low internal consistency (α = .47). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-11. Although most of the items, with the exception of Item 11,
had acceptable item-total correlations (r = .21 to .31) and factor loadings (β = .27 to .89), the set of items performed poorly as a task. Two potential reasons for the low internal consistency of this task were identified. First, the items were 2- and 3-option multiple-choice. The high probability of correctly guessing the answer to the problems likely increased the error associated with each item. Rather than remove these items from the revised measure, the 2- and 3-option multiple-choice questions were changed to 4-option multiple-choice to reduce the likelihood that a child would correctly guess the answer. Second, for two of the verbal questions (Items 13 and 14), the correct answer was the last option presented to the child. It was observed that many children simply repeated the last option presented for all questions. For the revised measure, the response options were reordered so the last option presented was not correct for the verbal items. Number comparison. The Number Comparison task had a low internal consistency (α = .54). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-12. This task was subject to the same concerns as the Relative Size task. The same changes were made to this task--all items were changed to 4-option multiple-choice and verbal response options were rearranged so that the correct answer was not the last answer presented to the child. Set comparison. The set comparison tasks for dots and for pictures both had low internal consistency (α = .41 and .40, respectively). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-13. The high chance of guessing correctly (due to 2- and 3-option multiple-choice questions) likely contributed to the low internal consistency. Thus, as was done for the previous two tasks, all items were changed to 4-option multiple-choice. 
Two additional items were added (one “most” and one “fewer”) to increase the internal consistency of the task. It was observed that many children appeared not to understand the concept of “fewest.” Thus, for the revised measure, a prompt was added prior to the first “fewest” question indicating to children that “fewest means the least, or the lowest number of dots.” Further, the set comparison tasks with dogs and dots performed similarly in all aspects. Comparable items had nearly identical mean response rates and the tasks had almost identical response patterns. Therefore, the set comparison for pictures (dogs) section was removed from the revised measure. Number order. The Number Order task had a good internal consistency (α = .87). The percentage of children who answered each item correctly, item-total correlations, and factor
loadings are presented in Table G-14. All items had high item-total correlations (r = .50 to .76) and high factor loadings (β = .83 to .95). Although fewer than 50% of the children correctly answered the easiest of these items, no changes were made to the task because there were no easier items that could be added. Sequencing. The sequencing task had an internal consistency below acceptable levels (α = .62). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-15. The items on the sequencing task all had acceptable item-total correlations (r = .37 to .62) and factor loadings (β = .79 to 1.00). Items 59 and 60 were answered correctly by fewer than 10% of the children; however, both of these items were retained because of their high factor loadings. No items were answered correctly by more than 30% of children. Two easier items were added (sequencing 1, 5, 9 and sequencing 1, 3, 5). Insertion. The insertion task had a low internal consistency (α = .38). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-15. Two of the items (Items 57b and 59b) had low item-total correlations and Item 57b also had a non-significant factor loading (β = .24). Although two of the items were answered correctly by 50% of the children, this was likely because a “correct response” for these items was to insert the card either at the beginning or the end of the sequence. It was observed that most children who did not understand this task simply placed the card at the beginning or end of the sequence for all items. Thus, for these two items, children who were just guessing had a 50% chance of answering correctly. The low internal consistency, combined with the difficulty of this task, warranted removal of this task from the revised version of the measure. Set reproduction. 
The equal set reproduction task with blocks had an acceptable internal consistency (α = .78). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-16. These items had high item-total correlations (r = .47 to .70) and high factor loadings (β = .85 to .94). No changes were made to this part of the task. The identifying equivalent sets (multiple choice) task had a low internal consistency (α = .48). The percentage of children who answered each item correctly, item-total correlations, and factor loadings also are presented in Table G-16. During test administration, it was observed that some children appeared confused by the layout of the items (the response options of cars were laid out differently than the equivalent set of planes). All items were
retained for the revised version of the measure, but the response options of cars were changed to visually match the equivalent set of planes. Number identification. The number identification task had a high internal consistency (α = .93). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-17. All items had high item-total correlations (r = .54 to .76) and factor loadings (β = .81 to 1.02). All items from this task were retained for the revised measure. Numerals. Neither the matching numbers with sets task nor the matching sets with numbers task alone had acceptable internal consistency (α = .53 and .62, respectively). Because these tasks are very similar in nature, they were combined, and the internal consistency of the combined task reached an acceptable level (α = .75). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-18. Only one item (Item 91) had a low item-total correlation (r = .08) and low factor loading (β = .33). This item was removed from the revised measure and replaced with an easier item. Two additional items of each type were added to the measure. Arithmetic Reasoning Addition/subtraction with objects. The addition and subtraction with objects task had good internal consistency (α = .81). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-19. All items had high item-total correlations (r = .48 to .59) and high factor loadings (β = .68 to .95). However, children performed quite well on this task, as evidenced by the nearly 50% correct response rate to the most difficult item. To increase the range of measurement for this task, four additional items (2 addition and 2 subtraction) were added. These items were intended to be more difficult than the original items. Addition/subtraction without objects. 
This task had an internal consistency below acceptable levels (α = .64). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-20. The range of skill that this task assessed was limited because the easiest item was answered correctly by 54% of children. All items had acceptable item-total correlations (r = .24 to .38) and factor loadings (β = .40 to .90). To increase the internal consistency and the range of measurement for this task, two items (1 addition and 1 subtraction) that were easier than the initial items were added.
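The decision to lengthen a task to raise its reliability can be motivated by the Spearman-Brown prophecy formula, which projects reliability after a test is lengthened, assuming the added items are parallel to the originals. The dissertation does not report this projection; the sketch below is illustrative only, using this task's figures (8 items at α = .64, extended to 10).

```python
# Spearman-Brown prophecy: projected reliability after lengthening a task
# by a given factor, assuming the new items are parallel to the old ones.
# Illustration only -- the dissertation does not report this projection.

def spearman_brown(alpha, lengthening_factor):
    """Projected alpha when a test is lengthened by `lengthening_factor`."""
    return (lengthening_factor * alpha) / (1 + (lengthening_factor - 1) * alpha)

# Extending an 8-item task with alpha = .64 to 10 items (factor = 1.25)
projected = spearman_brown(0.64, 10 / 8)
print(round(projected, 3))  # ~0.69
```

The projected gain (roughly .64 to .69) is modest, which is consistent with adding only two items; the parallel-items assumption means the actual gain could be smaller if the new items measure the construct less well.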


Two-set addition/subtraction. The two-set addition/subtraction task had a low internal consistency (α = .49). The initial equivalence part of the task also had a low internal consistency (α = .51). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-21. Few items on either task were answered correctly at a rate greater than chance. Four items on the latter part of the task (Items 18b, 20b, 22b, 24b) had low item-total correlations (r = .08, .19, -.04, and .08, respectively) and non-significant factor loadings (β = .00, .28, -.27, and .14, respectively). When these items were removed, the internal consistency of this part of the task increased to α = .68. Thus, the four poor-performing items were removed from the revised measure and two other items, similar to the four items that were retained, were added to increase the internal consistency of the task on the revised measure. Equivalent sets. This task had an acceptable internal consistency (α = .72). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-22. All items had acceptable item-total correlations (r = .24 to .54) and significant factor loadings (β = .51 to .91). The range of skills that the items from this task assessed was broad (21-88% correct). No changes were made to this task for the revised measure. Number composition/decomposition. The number composition task had a low internal consistency (α = .58). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-23. The correct response rate for these items was between 8-16%, which was extremely low. 
The low percent-correct for the items was likely because many of the children could not differentiate between “how many discs were added” and “how many discs are there total,” even after being prompted with the response, “That’s how many discs there are total. How many discs were added? How many new discs are there?” It was determined that these items could be improved by the addition of a sample question. Thus, no items were removed and a sample item was added. The number decomposition task had an internal consistency below acceptable levels (α = .61). The percentage of children who answered each item correctly, item-total correlations, and factor loadings also are presented in Table G-23. Items ranged in correct response rates between 54-76%. All items had acceptable item-total correlations (r = .31 to .48) and acceptable factor loadings (β = .49 to .87). Questions that were more difficult than the original questions were
necessary to assess the full breadth of skills on this task. Thus, two new items and a sample item were added to the revised version of this measure. Number combinations. The three number combination tasks (verbal/visual, visual only, and verbal only) all had internal consistency values below acceptable levels (α = .68, .64, and .65, respectively). The percentage of children who answered each item correctly, item-total correlations, and factor loadings are presented in Table G-24. Only one item (Item 46; the most difficult visual-only item) had a low item-total correlation (r = .17). All items had significant factor loadings (β = .53 to .90). Items presented verbally/visually had response rates and patterns similar to those of items presented only visually. Thus, the visual-only task was removed from the revised version of the measure. Items presented only verbally were restricted in the range of the percentage of children who answered the problem correctly (16 to 21%). Because the verbal/visual task appeared to assess a broader range of ability than the verbal-only task, the latter task was removed. All items from the verbal/visual task were retained, and two easier items (1 + 0 = 1 and 0 + 2 = 2) were added to this task to increase the range of ability that the task assessed and the internal consistency of the task. Summary Based on the analyses, it was possible to shorten all three subtests. The Counting subtest was reduced from 101 to 62 items (including both parts of two-part items) and administration time was reduced from 40-45 minutes to 20-25 minutes. The Numerical Relations subtest was reduced from 99 to 91 items and administration time was reduced from 30-35 minutes to 25-30 minutes. The Arithmetic Reasoning subtest was reduced from 58 to 56 items and administration time was reduced from 30-35 minutes to 25-30 minutes.


PHASE 2: FURTHER MEASURE REFINEMENT AND CONSTRUCT COMPARISON
Method
Participants
In Phase 2 of this study, 393 children were assessed from 45 preschools in the Tallahassee and Panama City areas in Florida. Of these children, 283 were children who were assessed during Phase 1. The additional 110 children were recruited from additional schools in the Tallahassee and Panama City areas. The 23 children who were assessed in Phase 1 but not assessed in Phase 2 left the study primarily because their families moved out of town or they no longer attended the same preschool. The children who completed Phase 2 of the assessment were evenly split by sex (51.7% female) and reasonably representative of the demographics of the Tallahassee and Panama City areas (55.7% Caucasian, 33.8% African-American, and 10.5% other race/ethnicity). Children ranged in age from 37.58 months to 71.84 months (M = 56.96 months, SD = 9.02 months), were primarily English-speaking, and had no known developmental disorders.
Measures
Children were assessed using the revised assessment measure developed in Phase 1 (see Appendix H for the revised Counting measure, Appendix I for the revised Numerical Relations measure, and Appendix J for the revised Arithmetic Reasoning measure). Assessments were conducted by individuals who either had completed or were working towards completion of a bachelor's degree. Assessments took place in the local preschools during non-instructional time in a quiet room designated by the individual preschool directors. Children were assessed on all three numeracy subtests; each subtest was administered in a separate session. At times, it was necessary to divide the testing session for a subtest into two or three sessions depending on the individual needs of some children.
Procedure
Step 1 - Differential Item Functioning
The primary purpose of this phase was to refine and evaluate the three constructs of Counting, Numerical Relations, and Arithmetic Reasoning and to examine the relations among them. Phase 2 was accomplished in four steps. In the first step of Phase 2, items were examined through a differential item functioning (DIF) test to determine whether items functioned differently based on sex or race/ethnicity. To test for DIF based on sex or race, a CFA-with-covariates model was conducted for each numeracy domain. The CFA-with-covariates model tested whether the direct relation between an item and the covariate was mediated by the factor. If the direct relation between the item and the covariate was significant, then the item had DIF. Significant DIF was determined by examining modification indices. Modification indices represent the improvement in χ² model fit if a parameter is included in the model (e.g., including the relation of the covariate to the item and factor). Items with significant DIF were more likely to be answered correctly by an individual of one subgroup (e.g., boys) with a given latent ability than by an individual of the other subgroup (e.g., girls) with the same latent ability level. Because such items would not be reasonable to include on an assessment measure targeted for general populations, items found to have significant DIF based on these variables were removed from the measure.
Step 2 - Item reduction
The second step of Phase 2 was to reduce further the number of items that contributed to each skill while maintaining the discriminating ability of each skill over the ability continuum. A two-parameter logistic (2-PL) Item Response Theory (IRT) analysis was conducted on each task using Mplus version 5.1 (Muthén & Muthén, 2008a). IRT is a model-based method of latent trait measurement that relates the amount of an individual's latent ability to the probability of correctly responding to an item (Hambleton, Swaminathan, & Rogers, 1981). IRT allows researchers to select items based on item-level characteristics. The item-level characteristics, or parameters, that describe item functioning in a 2-PL model are referred to as the difficulty parameter and the discrimination parameter. The difficulty parameter identifies the point on the ability continuum at which an individual has a 50% probability of answering the item correctly.
Items with high difficulty parameters require a greater amount of latent ability to be answered correctly; hence, such an item is more difficult to answer correctly than an item with a lower difficulty parameter. The discrimination parameter measures how well an item differentiates between individuals with latent abilities above and below the item's difficulty parameter. A third parameter, the guessing parameter, is included only when it is feasible that a child with low ability can correctly guess the answer to an item. For example, on a four-option multiple-choice question, a child who did not know the correct answer would have a 25% chance of guessing the answer correctly. As the probability of correctly guessing the answer increases, the amount of information provided by an item decreases. Although there are several


sets of items in this measure on which guessing could impact the parameter estimation, it was not feasible to include the guessing parameter in these analyses for two reasons. First, the number of participants necessary to conduct a 3-PL analysis on a measure of this size is very large. Second, the research on parameter estimation in 3-PL models is not sufficiently developed to warrant its use at this time. Psychometric information for items and tasks is presented in Appendix K. Within each task, difficulty parameters were compared to identify items that provided overlapping or comparable information. Retaining multiple items that provide overlapping information is undesirable because it would effectively double-count one region of the ability scale, which could inflate some children's total scores. For example, if the test had many items with identically low difficulty parameters (i.e., easy items), then test scores would be artificially inflated at the low end of the scale (e.g., the sum of correct answers on five easy items is not equivalent to the sum of correct answers on five difficult items). However, if the test were constructed with several items that spanned the range of mathematical ability, none of which overlapped in their information, the test would be a uniform measure of mathematics ability across the range of the latent trait. Identification and removal of redundant items would therefore yield such a uniform measure. To accomplish the item-refinement process, the item parameters from the IRT analysis were used to create Item Information Functions (IIFs) and Test Information Functions (TIFs). An IIF shows the ability of an item to provide information across the ability spectrum for the targeted skill. The sum of the IIFs for a measure is the Test Information Function (TIF), which shows the ability of the test as a whole to provide information across the ability spectrum for the targeted skill.
Utilizing IIFs and TIFs, item selection was conducted based on a four-step procedure set forth by Lord (1977): (a) select the shape of the desired TIF (e.g., a categorization test vs. a broad-abilities test); (b) select items that contribute unique information to the TIF; (c) calculate the TIF; and (d) compare the actual TIF to the target TIF, and if they are not comparable, repeat steps (b) and (c). In this study, the desired shape of the TIF was one that measured a broad range of abilities. The goal of this item-reduction phase was to reduce the number of total items for each skill while maintaining the ability to assess children over a broad range of latent abilities (i.e., retaining a test-level standard error of less than .316 over the broadest range possible of each

3 A standard error of .316 is equivalent to a classical test theory internal consistency of .90.


latent trait continuum). Items with overlapping difficulty parameters were reduced to only the item(s) that provided the greatest amount of information.
Step 3 – Construct Evaluation
The third step of Phase 2 was to examine the refined constructs of Counting, Numerical Relations, and Arithmetic Reasoning. The purpose of this step was to examine how well the purported skills within each domain actually represented the targeted construct. Three CFAs, one for each domain, were conducted. Tasks that did not adequately fit within their respective domain were removed from further analysis. Poorly fitting tasks were evaluated to determine whether they could be moved to one of the other two domain areas, but only if theoretical justification could be made for this change.
Step 4 - Model Comparison
When measure refinement was completed and the constructs were validated, the fourth step of Phase 2 was to compare the models explaining the relations between informal number-related skills. The five alternative models of children's informal numeracy skills were evaluated using CFA in Mplus version 5.1 (Muthén & Muthén, 2008a). One 3-factor model, three 2-factor models (Counting + Numerical Relations, Arithmetic Reasoning; Counting + Arithmetic Reasoning, Numerical Relations; Counting, Numerical Relations + Arithmetic Reasoning), and one 1-factor model were compared. Maximum likelihood estimation was used, and χ² difference tests along with standard fit indices were compared to determine the best-fitting model.
Results
Step 1 - Differential Item Functioning
To test for differential item functioning, three separate CFA-with-covariates analyses, one for each numeracy domain, were conducted in Mplus version 5.1 (Muthén & Muthén, 2008a). Modification indices for the relation between the covariate and the items were examined to determine if model fit would be improved by inclusion of the covariate in the model.
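The conventional 3.84 cutoff for judging a modification index is simply the .05 critical value of χ² with one degree of freedom, i.e., the square of the two-tailed z critical value. A quick sketch (illustrative only; `has_dif` is a hypothetical helper, not an Mplus function):

```python
from statistics import NormalDist

# Chi-square with 1 df is the square of a standard normal variable, so
# its .95 quantile is the square of the z critical value for alpha = .05.
z_crit = NormalDist().inv_cdf(0.975)     # about 1.96
MI_CUTOFF = z_crit ** 2                  # about 3.84

def has_dif(modification_index, cutoff=MI_CUTOFF):
    """Hypothetical helper: flag an item whose covariate-to-item path
    would significantly improve model fit (i.e., the item shows DIF)."""
    return modification_index > cutoff
```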
The CFA-with-covariates analyses for all three factors yielded no significant modification indices (a modification index of 3.84 is considered the minimum value to improve model fit; Muthén & Muthén, 2008b). Inclusion of the covariates in the model did not result in improved model fit for any of the analyses. Therefore, no DIF was detected for any items based on sex or race/ethnicity, and no items were removed in this step of the analyses. Although no DIF was identified for either sex or race/ethnicity, other variables not measured in this study could result in DIF. One possibility


is family SES. DIF could not be calculated for family SES in this study, however, because family demographic information was not collected. In future studies, SES-related DIF should be examined.
Step 2 - Item reduction
Counting
Rote counting. To score the rote counting task on the same metric as the other tasks, it was necessary to determine critical points at which children received credit for correctly counting to that number. To rescore this task, the task was divided into 20 separate items (e.g., correctly counted to 5, correctly counted to 10). These 20 items were then evaluated in a 1-PL IRT model in which the discrimination parameters were constrained to be equal. The results of the 1-PL IRT analysis are presented in Table K-1. Scoring cutoffs were selected so that no cutoffs had overlapping or similar difficulty parameters. The final rote counting task score was computed as follows: one point each for correctly counting to 5, 10, 15, 20, 25, 40, and 100. The Test Information Function (TIF) from this task is presented in Figure 1 and the Standard Error (SE) Function is presented in Figure 2. Overall, this task did not have a standard error of less than .316 at any theta value.
Counting forward and counting backward. In this phase of the analysis, counting forward and counting backward were examined together because the two tasks measured a similar construct (i.e., the ability to manipulate the counting sequence) and each task, by itself, had only four items. The percentage of children who received full credit for items, partial credit, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-2. These tasks were initially scored so that partial credit could be earned if the child made one mistake in his or her counting.
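The cutoff-based rescoring of the rote counting task can be sketched as follows; this is an illustration of the scoring rule, not the study's scoring code, with the retained cutoffs taken from the list above.

```python
# Final retained cutoffs for the rote counting task (one point each).
FINAL_CUTOFFS = (5, 10, 15, 20, 25, 40, 100)

def rote_counting_score(highest_correct_count, cutoffs=FINAL_CUTOFFS):
    """Convert the highest number a child counted to without error
    into a sum of pass/fail cutoff items."""
    return sum(1 for c in cutoffs if highest_correct_count >= c)
```

For example, a child who counts correctly to 23 passes the 5, 10, 15, and 20 cutoffs and scores 4 of a possible 7 points.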
However, initial IRT analyses suggested there was very little difference in the difficulty of receiving partial credit and full credit for each item (e.g., difficulty for partial credit for Item 2 was b = -.70 and the difficulty for full credit for that same item was b = -.59). The difficulties for partial credit and full credit were overlapping. Therefore, all partial credit scoring was removed.4 Item 2 and Item 3 had very similar difficulty parameters

4 It should be noted that when dichotomous data are obtained from polytomously scored data, the dichotomous difficulty parameters are equal to the difficulty parameter for the point at which the dichotomization occurred. For example, when the counting forward/backward items were revised to dichotomous items, the new difficulty parameter was equivalent to the original Difficulty 2 parameter. Therefore, the original Difficulty 2 parameters were used in the overlapping-item comparisons.


[Figure 1 image: nine panels (Rote Counting; Counting Forward/Backward; Counting Error Identification; Structured Counting; Cardinality; Resultative Counting; Counting a Subset; Subitizing; Estimation), each plotting Information (0 to 40) against Theta (-2.0 to 2.0).]

Figure 1. Test Information Functions for Counting tasks. High levels of information indicate high reliability of the task at a given point along the latent trait.


[Figure 2 image: nine panels for the same Counting tasks, each plotting Standard Error (0 to 1.0) against Theta (-2.0 to 2.0).]

Figure 2. Standard Error Functions for Counting tasks. Standard error values below .316 indicate high reliability of the task at a given point along the latent trait.


(b = -.62 and -.58) and both had very high discrimination parameters. Item 3 had the higher discrimination (a = 5.91), and thus, Item 2 was removed from the final measure. Similarly, Items 4, 5, and 7 all had comparable difficulties (b = .00, .10, and .21, respectively). All three items had high item discriminations (a = 2.24 to 2.49). To maintain a broad range of item difficulties, Item 5 was removed. The remaining Items 3, 4, 6, 7, 8, and 9 all had unique difficulty parameters and were used to construct the counting forward/backward scale. The final counting forward/backward scale had good internal consistency (α = .82). The TIF from this task is presented in Figure 1 and the SE Function is presented in Figure 2. This task had a standard error of less than .316 from the theta value of -.85 to .60.
Counting error identification. The percentage of children who received full credit for items, partial credit, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-3. The analyses indicated that, consistent with the Phase 1 analyses, none of the "correct" counting items had high factor loadings (β = .11 and .33). Although Item 11 loaded significantly on the factor, the difficulty of this item was b = -4.41, an extremely easy item. These items did not contribute significantly to the overall factor and thus were removed from the final analysis. Item 11 was retained on the final measure for administration purposes, but no points are awarded for correctly answering the question. Of the remaining six items, Items 12 and 13 had nearly identical item difficulties (b = .23 and .18), but Item 13 had a higher discrimination parameter (a = 2.38). Item 12 was removed from the final measure. Item 16 had a full-credit difficulty parameter very similar to that of Item 10 (b = .29 and .36) and a partial-credit difficulty parameter similar to those of Items 12 and 13 (b = -.14, -.14, and -.16, respectively).
Items 12 and 16 were removed from the final measure. All other items had unique difficulty parameters and thus were retained for the final measure. The items that were retained for this task were Items 10, 13, 14, and 15. The final counting error identification task had good internal consistency (α = .84). The TIF from this task is presented in Figure 1 and the SE Function is presented in Figure 2. This task had a standard error of less than .316 from the theta value of -.50 to .30.
Structured counting. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-4. Item 18 and Item 19 had identical item difficulty parameters (b = -1.26). Item 18 was removed because Item 19 had a higher discrimination parameter (a = 2.41 vs. 1.91).
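The selection rule applied throughout these analyses — when items have overlapping difficulty (b) parameters, retain only the one with the higher discrimination (a) — can be sketched as below. The 0.15 overlap tolerance and the a value supplied for the dropped item in the example are illustrative assumptions, not values from the study.

```python
def reduce_overlapping_items(items, tolerance=0.15):
    """items: (item_id, b, a) triples. Walk the items in order of
    difficulty; when neighbors fall within the tolerance, retain only
    the item with the higher discrimination parameter."""
    retained = []
    for item in sorted(items, key=lambda it: it[1]):
        if retained and abs(item[1] - retained[-1][1]) < tolerance:
            if item[2] > retained[-1][2]:   # overlap: keep better discriminator
                retained[-1] = item
        else:
            retained.append(item)
    return [it[0] for it in retained]
```

For example, with Item 13 (b = .18, a = 2.38) and Item 12 (b = .23, with an assumed a of 1.5), only Item 13 survives, matching the decision reported for the counting error identification task.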


Items 20 and 23 had similar item difficulty parameters (b = .11 and .19). However, Item 20 had a much higher discrimination parameter than did Item 23 (a = 6.50 vs. 3.29). Thus, Item 23 was removed from the final measure. Items 21 and 22 also had comparable difficulty parameters (b = -.63 and -.74). Item 22 had a higher discrimination parameter than Item 21 (a = 1.75 vs. 1.70), and thus, Item 21 was removed from the final measure. All other items had unique difficulty parameters and thus were retained for the final measure. The items retained for this task were Items 19, 20, 22, 24, and 25. The final structured counting task had acceptable internal consistency (α = .79). The TIF from this task is presented in Figure 1 and the SE Function is presented in Figure 2. This task had a standard error of less than .316 from the theta value of -.20 to .30.
Cardinality. This task was tied directly to the structured counting task. Thus, the ideal items for this task should not only span a broad range of difficulty parameters but also be items that were retained on the final structured counting task. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-5. Overall, this task was fairly easy (i.e., more than 50% of children answered the hardest question correctly) and the range of difficulty was very restricted (b = -.86 to -.22). Items 18b and 19b both had comparable difficulty parameters (b = -.77 and -.86), but Item 19b had a higher discrimination parameter (a = 3.97). Item 18b was removed from the final measure. Items 20b, 23b, and 25b had comparable difficulty parameters (b = -.30, -.22, and -.29, respectively); however, Item 20b had the highest discrimination parameter (a = 3.60). Items 23b and 25b were removed from the final measure.
Lastly, Items 21b, 22b, and 24b all had similar difficulty parameters (b = -.54, -.53, and -.48, respectively), but Item 22b had the highest discrimination parameter (a = 3.98). Items 21b and 24b were removed from the final measure. Only three items were retained: Items 19b, 20b, and 22b. All three of these items were also included on the structured counting task. The final cardinality task had acceptable internal consistency (α = .75). The TIF from this task is presented in Figure 1 and the SE Function is presented in Figure 2. This task had a standard error of less than .316 from the theta value of -1.00 to -.10.
Resultative counting. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-6. In this task, only two items had comparable difficulty parameters, Items


26 and 27 (b = -.74 and -.73). Item 27 had a higher discrimination parameter (a = 1.90), and therefore, Item 26 was removed from the final task. Items 27, 28, 29, and 30 were retained. The final resultative counting task had internal consistency below the acceptable level (α = .68). The TIF from this task is presented in Figure 1 and the SE Function is presented in Figure 2. This task did not have a standard error of less than .316 at any theta value.
Counting a subset. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-7. Items 33 and 35 had nearly identical difficulty parameters (b = .70 and .72), but Item 33 had a higher discrimination parameter (a = 1.82). Item 35 was removed from the final measure. Items 36 and 37 also had nearly identical difficulty parameters (b = -1.51 and -1.53), but Item 37 had a higher discrimination parameter (a = 1.40). Item 36 was removed from the final measure. All other items had unique difficulty parameters. Thus, Items 31, 32, 33, 34, 37, 38, 39, and 40 were retained for the final measure. The final counting a subset task had good internal consistency (α = .82). The TIF from this task is presented in Figure 1 and the SE Function is presented in Figure 2. This task had a standard error of less than .316 from the theta value of -.90 to -.20.
Subitizing. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-8. No items had overlapping difficulty parameters. All items were retained for the final measure. This task had internal consistency below the acceptable level (α = .69). The TIF from this task is presented in Figure 1 and the SE Function is presented in Figure 2. This task did not have a standard error of less than .316 at any theta value.
Estimation.
The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-9. All estimation items had significant factor loadings. However, the factor loadings for Items 51, 52, and 53 were low (β = .45, .32, and .37) compared to the remaining items (β = .63 to .79). Item 49 had an identical difficulty parameter to Item 51 (b = .72) and a similar difficulty parameter to Item 50 (b = .81); however, Item 49 had a much higher discrimination parameter (a = 1.27) than either of the other two items. Items 50 and 51 were removed from the final measure. Because no other items had similar difficulty parameters to Items 52 (b = 2.71) and 53 (b = 1.69) and their factor loadings were significant, albeit low (β = .32 and .37), these


items were retained. All other items had unique difficulty parameters and were retained for the final version of the measure. The final estimation task consisted of Items 48, 49, 52, 53, and 54 and had low internal consistency (α = .49). The TIF from this task is presented in Figure 1 and the SE Function is presented in Figure 2. This task did not have a standard error of less than .316 at any theta value.
Summary. The final Counting subtest was reduced from 62 items to 43 items. All final tasks except resultative counting, subitizing, and estimation had internal consistency scores above α = .70 (resultative counting and subitizing narrowly missed this standard, with internal consistency scores of α = .68 and .69, respectively). When all items from the Counting tasks were combined, the Counting subtest had an internal consistency of α = .94. The TIF for the entire Counting subtest and the SE function are presented in Figure 3. Although the individual tasks had limited ranges of theta at which the SE function was less than .316, the subtest as a whole had a standard error of less than .316 from a theta value of less than -2.00 through 1.75, which indicates that this test was a strong measure across the latent trait spectrum.
Numerical Relations
Ordinality. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-10. Items 1, 5, and 7 all had comparable difficulty parameters (b = .35, .40, and .39); however, Item 5 had the highest discrimination parameter (a = 1.95). Items 1 and 7 were removed from the final measure. Items 3 and 4 also had similar difficulty parameters (b = -.20 and -.28), but Item 3 had a slightly higher discrimination parameter (a = 1.74). Item 4 was removed from the final measure. Items 6 and 8 also had comparable difficulty parameters (b = .93 and .89), but Item 6 had the higher discrimination parameter (a = 1.51).
Item 8 was removed from the final measure. All other items provided unique difficulty parameters and were retained. The final ordinality task consisted of Items 2, 3, 5, and 6 and had an acceptable internal consistency (α = .73). The TIF from this task is presented in Figure 4 and the SE Function is presented in Figure 5. This task did not have a standard error of less than .316 at any theta value.
Relative size. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-11. Items 9, 10, and 11 all had similar difficulty parameters (b = -.19, -.13, and -.22), but Item 10 had the highest discrimination parameter (a = 1.06). Items 9 and 11 were removed


[Figure 3 image: two panels for the Counting subtest, plotting Information (0 to 150) and Standard Error (0 to 1.0) against Theta (-2.0 to 2.0).]

Figure 3. Test Information Functions for Counting subtest (top) and Standard error (bottom) of the subtest. High levels of information and low standard error values indicate high reliability of the task at a given point along the latent trait.


[Figure 4 image: nine panels (Ordinality; Relative Size; Number Comparison; Set Comparison; Number Order; Sequencing; Set Reproduction; Number Identification; Numerals), each plotting Information (0 to 50) against Theta (-2.0 to 2.0).]

Figure 4. Test Information Functions for Numerical Relations tasks. High levels of information indicate high reliability of the task at a given point along the latent trait.


[Figure 5 image: nine panels for the same Numerical Relations tasks, each plotting Standard Error (0 to 1.0) against Theta (-2.0 to 2.0).]

Figure 5. Standard Error Functions for the Numerical Relations tasks. Standard error values below .316 indicate high reliability of the task at a given point along the latent trait.


from the final measure. Items 12 and 15 had nearly identical difficulty parameters (b = .38 and .39), but Item 12 had a much higher discrimination parameter (a = 2.05). Item 15 was removed from the final measure. Items 13 and 14 had very similar difficulty parameters (b = .65 and .60), but Item 14 had a higher discrimination parameter (a = 1.19). Item 13 was removed from the final measure. All other items provided unique difficulty parameters and were retained. The final relative size task consisted of Items 10, 12, 14, and 16 and had internal consistency below the acceptable level (α = .67). The TIF from this task is presented in Figure 4 and the SE Function is presented in Figure 5. This task did not have a standard error of less than .316 at any theta value.
Number comparison. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-12. Items 17, 18, 20, 21, and 22 all had comparable difficulty parameters (b = -.06, -.05, -.04, and -.01, respectively). All of these items, except Item 22, were number comparison items in which the child was asked, "Which number means the most?" Item 22 was an item that asked, "Which number means the least?" Although Item 22 (a = 1.83) had a slightly higher discrimination parameter than Item 17 (a = 1.72), Item 17 was retained so that there would be a balanced number of "most" and "least" questions. Items 18, 20, 21, and 22 were removed from the final measure. Items 25, 26, 27, and 29 all had comparable difficulty parameters (b = .20, .25, .20, and .24, respectively), but Item 27 had the highest discrimination parameter (a = 2.40). Items 25, 26, and 29 were removed from the final measure. Items 28, 30, 31, and 32 all had comparable difficulty parameters (b = .45, .48, .46, and .45, respectively), but Item 30 had the highest discrimination parameter (a = 2.82).
Items 28, 31, and 32 were removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. The final number comparison task consisted of Items 17, 19, 23, 24, 27, and 30 and had an acceptable internal consistency (α = .74). The TIF from this task is presented in Figure 4 and the SE Function is presented in Figure 5. This task had a standard error of less than .316 from the theta value of -.10 to .70.
Set comparison. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-13. The set comparison task was a relatively easy task because most items were answered correctly by more than 50% of the children. Items 33 and 37 had comparable difficulty


parameters (b = -.54 and -.53), but Item 33 had a higher discrimination parameter (a = 2.01). Item 37 was removed from the final measure. Items 36 and 39 had comparable difficulty parameters (b = -.40 and -.39), but Item 39 had a higher discrimination parameter (a = 2.23). Item 36 was removed from the final measure. Items 38, 40, and 41 had similar difficulty parameters (b = -.31, -.35, and -.27, respectively), but Item 38 had the highest discrimination parameter (a = 1.82). Items 40 and 41 were removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. The final set comparison task consisted of Items 33, 34, 35, 38, 39, and 42 and had an acceptable internal consistency (α = .77). The TIF from this task is presented in Figure 4 and the SE Function is presented in Figure 5. This task had a standard error of less than .316 from the theta value of -.90 to -.25.
Number order. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-14. Items 45 and 46 had comparable difficulty parameters (b = -.15 and -.08), but Item 45 had a higher discrimination parameter (a = 4.08). Item 46 was removed from the final measure. Items 47 and 50 also had comparable difficulty parameters (b = .24 and .33); however, Item 50 had a higher discrimination parameter (a = 2.25). Item 47 was removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. The final version of the number order task consisted of Items 43, 44, 45, 48, 49, and 50 and had good internal consistency (α = .87). The TIF from this task is presented in Figure 4, and the SE Function is presented in Figure 5. This task had a standard error of less than .316 from the theta value of -.55 to .70.
Sequencing.
The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-15. Items 52 and 53 had very similar difficulty parameters (b = .27 and .30), but Item 53 had a higher discrimination parameter (a = 1.76). Item 52 was removed from the final measure. Items 55 and 56 also had comparable difficulty parameters (b = .79 and .90), but Item 56 had a higher discrimination parameter (a = 3.06). Item 55 was removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. The final sequencing task consisted of Items 51, 53, 54, and 56 and had an acceptable internal consistency (α = .79). The TIF from this task is presented in Figure 4, and the SE
Function is presented in Figure 5. This task had a standard error of less than .316 from the theta value of .50 to .95.

Set reproduction (blocks and pictures). The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-16. Items 57 and 58 had similar difficulty parameters (b = -1.15 and -1.22), but Item 58 had a higher discrimination parameter (a = 1.99). Item 57 was removed from the final measure. Items 61 and 63 had similar difficulty parameters (b = -.48 and -.58), but Item 63 had a higher discrimination parameter (a = 1.44). Item 61 was removed from the final measure. Although Item 64 had a low factor loading and discrimination parameter, the factor loading was significant and no other item had a comparable difficulty parameter. All other items had unique difficulty parameters and were retained for the final measure. Thus, the final task consisted of Items 58, 59, 60, 62, 63, and 64 and had a below acceptable level of internal consistency (α = .63). The TIF from this task is presented in Figure 4 and the SE Function is presented in Figure 5. This task had a standard error of less than .316 from the theta value of -.85 to -.60.

Number identification. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-17. Items 67, 68, and 69 had nearly identical difficulty parameters (b = -.92, -.92, and -.93, respectively), but Item 67 had the highest discrimination parameter (a = 2.27). Items 68 and 69 were removed from the final measure. Items 70 and 71 had very similar difficulty parameters (b = -.34 and -.39), but Item 71 had a higher discrimination parameter (a = 2.21). Item 70 was removed from the final measure.
Items 73 and 74 had comparable difficulty parameters (b = -.08 and -.19), but Item 74 had the higher discrimination parameter (a = 2.96). Item 73 was removed from the final measure. Items 75 and 78 had comparable difficulty parameters (b = .01 and .13). Although Item 75 had the higher discrimination parameter (a = 2.84), Item 78 was retained because it fit better in the range of difficulty parameters, and Item 75 was removed from the final measure. Items 76 and 77 had comparable difficulty parameters (b = .48 and .45). Even though Item 77 had a higher discrimination parameter (a = 2.19), Item 76 (a = 1.95) was retained to maximize the range of measurement because it had a slightly higher difficulty parameter. All other items had unique difficulty parameters and were retained for the final measure. The final number identification task consisted of Items 65, 66, 67, 71, 72, 74, 76, 78, and 79 and had a high internal consistency
(α = .90). The TIF from this task is presented in Figure 4, and the SE Function is presented in Figure 5. This task had a standard error of less than .316 from the theta value of -1.20 to .60.

Numerals. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-18. Items 80, 81, 83, 87, and 89 had comparable difficulty parameters (b = -.50, -.56, -.58, -.52, and -.55, respectively). Even though Item 89 had the highest discrimination parameter (a = 2.66), Item 81 was selected because it also had a high discrimination parameter and balanced out the types of items in the final task (Item 81 was one of the first types of items – match number to one of the sets). Items 80, 83, 87, and 89 were removed from the final measure. Items 84, 85, and 91 had comparable difficulty parameters (b = .23, .20, and .21, respectively), but Item 85 had the highest discrimination parameter (a = 1.57). Items 84 and 91 were removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. The final numerals task consisted of Items 81, 82, 85, 86, 88, and 90 and had a good internal consistency (α = .80). The TIF from this task is presented in Figure 4, and the SE Function is presented in Figure 5. This task had a standard error of less than .316 from the theta value of -.80 to .00.

Summary. The final Numerical Relations subtest was reduced from 91 items to 51 items. All final tasks, except relative size and set reproduction, had internal consistency scores above α = .70 (both tasks narrowly missed this standard, with internal consistency scores of α = .67 and .63, respectively). When all items from the Numerical Relations tasks were combined, the Numerical Relations subtest had an internal consistency of α = .96. The TIF for the entire Numerical Relations subtest and the SE Function are presented in Figure 6.
Although many of the individual tasks had limited ranges of theta at which the SE function was less than .316, the subtest as a whole had a standard error of less than .316 from a theta value of less than -1.60 through 1.50, which indicates that this test was a strong measure across the latent trait spectrum.

Arithmetic Reasoning

Addition/subtraction with objects. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-19. Items 2 and 7 had comparable difficulty parameters (b = -1.21 and -1.10). Item 2 had a much higher discrimination parameter (a = 1.08). Item 7 was removed from the final measure. Items 8 and 9 had comparable difficulty parameters (b = -.61


[Figure 6 panels: Information (top) and Standard Error (bottom) plotted against Theta for the Numerical Relations subtest.]

Figure 6. Test Information Functions for Numerical Relations subtest (top) and Standard error (bottom) of the subtest. High levels of information and low standard error values indicate high reliability of the task at a given point along the latent trait.
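
The .316 standard-error criterion used throughout is not arbitrary: under the standard IRT relation, the conditional standard error is the reciprocal square root of the test information, so an SE below .316 corresponds to more than 10 units of information and, when theta is scaled as a standard normal, a conditional reliability of roughly .90. The dissertation does not state this derivation explicitly, so the following is offered as a minimal sketch of standard IRT background:

```python
import math

def se_from_information(info):
    """Conditional standard error of measurement: SE(theta) = 1 / sqrt(I(theta))."""
    return 1.0 / math.sqrt(info)

def marginal_reliability(se):
    """Approximate conditional reliability when theta is scaled N(0, 1): rho = 1 - SE^2."""
    return 1.0 - se ** 2

# SE = .316 is the point at which test information reaches 10,
# i.e., a conditional reliability of about .90.
se = se_from_information(10.0)
print(round(se, 3), round(marginal_reliability(se), 2))  # 0.316 0.9
```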


and -.52), but Item 9 had a higher discrimination parameter (a = 1.18). Item 8 was removed from the final measure. Items 4 and 10 had identical difficulty parameters (b = -.28) and nearly identical discrimination parameters (a = 1.70 and 1.68). Although Item 4 had a slightly higher discrimination parameter, Item 10 was retained to maintain a balance of addition and subtraction items. Items 6 and 11 had comparable difficulty parameters (b = .42 and .50). Item 6 had a much higher discrimination parameter (a = 1.33). Item 11 was removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. The final task consisted of Items 1, 2, 3, 5, 6, 9, 10, and 12 and had an acceptable internal consistency (α = .72). The TIF from this task is presented in Figure 7 and the SE Function is presented in Figure 8. This task did not have a standard error of less than .316 at any theta value.

Addition/subtraction without objects. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-20. Items 13 and 20 had comparable difficulty parameters (b = -.33 and -.28), but Item 20 had a much higher discrimination parameter (a = .91). Item 13 was removed from the final measure. Items 15 and 21 had comparable difficulty parameters (b = .00 and .06), but Item 15 had a higher discrimination parameter (a = 1.00). Item 21 was removed from the final measure. Items 16 and 18 had nearly identical difficulty parameters (b = .28 and .27), but Item 18 had a much higher discrimination parameter (a = 1.41). Item 16 was removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. The final task consisted of Items 14, 15, 17, 18, 19, 20, and 22 and had an acceptable internal consistency (α = .71).
The TIF from this task is presented in Figure 7, and the SE Function is presented in Figure 8. This task did not have a standard error of less than .316 at any theta value.

Initial equivalence. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-21. No items had overlapping or similar difficulty parameters, thus all were retained. The internal consistency for this task was low (α = .47). The TIF from this task is presented in Figure 7, and the SE Function is presented in Figure 8. This task did not have a standard error of less than .316 at any theta value.
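
The item-selection rule applied throughout these sections (when two items have comparable difficulty, retain the one with the higher discrimination) follows from a standard IRT result: under a two-parameter logistic (2PL) model, assumed here for illustration since this section reports only a and b parameters, an item's information peaks at theta = b with height a²/4. A hypothetical sketch with invented parameter values:

```python
import math

def p_correct(theta, a, b):
    """Two-parameter logistic (2PL) item response function."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P); peaks at theta = b."""
    p = p_correct(theta, a, b)
    return a ** 2 * p * (1.0 - p)

# Two hypothetical items with the same difficulty but different discriminations:
# at theta = b the information is a^2 / 4, so the higher-a item dominates
# and the lower-a item adds little beyond it.
print(item_information(0.3, 1.0, 0.3))  # 0.25
print(item_information(0.3, 2.0, 0.3))  # 1.0
```

Because the two items target the same region of theta, keeping only the higher-a item shortens the task with little loss of precision, which is the rationale the text applies repeatedly.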


[Figure 7 panels: Information plotted against Theta for each Arithmetic Reasoning task: Addition/Subtraction with Objects, Addition/Subtraction without Objects, Initial Equivalence, Two-set Addition/Subtraction, Equivalent Sets, Number Composition/Decomposition, and Number Combinations.]

Figure 7. Test Information Functions for Arithmetic Reasoning tasks. High levels of information indicate high reliability of the task at a given point along the latent trait.


[Figure 8 panels: Standard Error plotted against Theta for each Arithmetic Reasoning task: Addition/Subtraction with Objects, Addition/Subtraction without Objects, Initial Equivalence, Two-set Addition/Subtraction, Equivalent Sets, Number Composition/Decomposition, and Number Combinations.]

Figure 8. Standard Error Functions for the Arithmetic Reasoning tasks. Standard error values below .316 indicate high reliability of the task at a given point along the latent trait.


Two-set addition and subtraction. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-22. Items 23b and 24b had comparable difficulty parameters (b = .31 and .35). However, Item 23b had a much higher discrimination parameter (a = 1.09). Item 24b was removed from the final measure. Items 25b and 28b had comparable difficulty parameters (b = -.75 and -.72), but Item 25b had a much higher discrimination parameter (a = .96). Item 28b was removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. Overall, Items 23b, 25b, 26b, and 27b were retained and had a low internal consistency (α = .41). The TIF from this task is presented in Figure 7, and the SE Function is presented in Figure 8. This task did not have a standard error of less than .316 at any theta value.

Equivalent sets. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-23. Items 31 and 32 had identical difficulty parameters (b = .13), but Item 32 had a higher discrimination parameter (a = 1.85). Item 31 was removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. The final task consisted of Items 29, 30, 32, 33, and 34 and had an acceptable internal consistency (α = .74). The TIF from this task is presented in Figure 7, and the SE Function is presented in Figure 8. This task did not have a standard error of less than .316 at any theta value.

Number composition/decomposition. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-24.
Items 35 and 39 had comparable difficulty parameters (b = .18 and .15), but Item 39 had a higher discrimination parameter (a = 1.61). Item 35 was removed from the final measure. Items 41 and 42 had comparable difficulty parameters (b = -.14 and -.16), but Item 41 had a much higher discrimination parameter (a = 1.49). Item 42 was removed from the final measure. Items 38, 43 and 44 had comparable difficulty parameters (b = .73, .84, and .67, respectively), but Item 38 had the highest discrimination parameter (a = 1.32). Items 43 and 44 were removed from the final measure. All other items had unique difficulty parameters and were retained for the final measure. The final task consisted of Items 36, 37, 38, 39, 40, and 41 and had an acceptable internal consistency (α = .74). The TIF from this task is
presented in Figure 7 and the SE Function is presented in Figure 8. This task had a standard error of less than .316 from the theta value of -.05 to .60.

Number combinations. The percentage of children who answered each item correctly, item-total correlations, factor loadings, discrimination parameters, and difficulty parameters are presented in Table K-25. Items 45 and 46 had identical difficulty parameters (b = .56), but Item 46 had a higher discrimination parameter (a = 1.26). Item 45 was removed from the final version of this task. All other items had unique difficulty parameters and were retained for the final measure. The final task consisted of Items 46, 47, 48, 49, and 50 and had an acceptable internal consistency (α = .77). The TIF from this task is presented in Figure 7, and the SE Function is presented in Figure 8. This task had a standard error of less than .316 from the theta value of .35 to 1.15.

Summary. The final Arithmetic Reasoning subtest was reduced from 56 items to 41 items. All final tasks except initial equivalence and two-set addition/subtraction had internal consistency scores above α = .70. When all items from the Arithmetic Reasoning tasks were combined, the Arithmetic Reasoning subtest had an internal consistency of α = .90. The TIF for the entire Arithmetic Reasoning subtest and the SE Function are presented in Figure 9. Although many of the individual tasks had limited ranges of theta at which the SE function was less than .316, the subtest as a whole had a standard error of less than .316 from a theta value of less than -1.10 through 1.50, which indicates that this test was a strong measure across the majority of the latent trait spectrum.

Step 3 - Construct Evaluation. Total scores for each task were computed by summing the final items from each task. Means, standard deviations, ranges, skew, and kurtosis for each task are presented in Table 6. No tasks exhibited significant skew or kurtosis.
Correlations between the tasks are presented in Table 7. All tasks were correlated significantly at p < .001. Means, standard deviations, and F-contrasts by sex are presented in Table 8. On one task, two-set addition/subtraction, boys performed significantly better than girls, F(1, 392) = 6.26, p = .013, d = .25 (a small effect size according to Cohen, 1988). There were no other statistically significant differences by sex.
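
The reported effect size can be checked against the Table 8 values: Cohen's (1988) d is the mean difference divided by the pooled standard deviation. A sketch using the two-set addition/subtraction values (boys: M = 1.90, SD = 1.23, n = 203; girls: M = 1.62, SD = 1.02, n = 190):

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)

# Two-set addition/subtraction, boys vs. girls (values from Table 8)
d = cohens_d(1.90, 1.23, 203, 1.62, 1.02, 190)
print(round(d, 2))  # 0.25
```

The computed value reproduces the d = .25 reported in the text.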


[Figure 9 panels: Information (top) and Standard Error (bottom) plotted against Theta for the Arithmetic Reasoning subtest.]

Figure 9. Test Information Functions for Arithmetic Reasoning subtest (top) and Standard error (bottom) of the subtest. High levels of information and low standard error values indicate high reliability of the task at a given point along the latent trait.
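
Both subtest summaries note that tasks with narrow low-error ranges combine into a subtest with a wide one. This follows from the additivity of information in IRT: test information at any theta is the sum of the item informations, so items with staggered difficulties jointly cover a broad span of the latent trait. A hypothetical illustration (invented item parameters, 2PL assumed):

```python
import math

def item_information(theta, a, b):
    """2PL item information: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a ** 2 * p * (1.0 - p)

# Three hypothetical items with staggered difficulties (a = 1.5 each).
items = [(1.5, -1.0), (1.5, 0.0), (1.5, 1.0)]

def se(theta):
    """Test SE: information adds over items, so SE = 1 / sqrt(sum)."""
    total = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total)

# Pooled items keep the SE roughly flat across theta, whereas any single
# item measures well only near its own difficulty.
print([round(se(t), 2) for t in (-1.0, 0.0, 1.0)])  # [1.0, 0.9, 1.0]
```

A single item centered at theta = 0 would have an SE of about 1.33 at its own difficulty point; pooling the three items lowers and flattens the error curve, which is the pattern seen in Figures 6 and 9.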


Table 6

Means, Standard Deviations, Range, Skewness, and Kurtosis of the sum scores of the final early numeracy tasks in preschool children

Task Mean SD Range Skew Kurtosis
Counting
Rote Counting 3.03 1.54 0 – 6 .07 -.65
Counting Forward and Backward 2.56 1.92 0 – 6 .06 -1.29
Error Identification 3.75 2.96 0 – 8 .03 -1.44
Structured Counting 3.25 1.65 0 – 5 -.47 -1.05
Cardinality 2.16 1.08 0 – 3 -.93 -.57
Resultative Counting 1.58 1.25 0 – 4 .50 -.73
Counting a Subset 4.31 2.40 0 – 8 -.18 -.11
Subitizing 3.85 1.71 0 – 7 -.22 -.18
Estimation 1.78 1.32 0 – 5 .50 -.39

Numerical Relations
Ordinality 1.55 1.39 0 – 4 .42 -1.13
Relative Size 1.49 1.35 0 – 4 .47 -1.03
Number Comparison 2.19 1.88 0 – 6 .53 -.86
Set Comparison 3.89 1.91 0 – 6 -.41 -1.08
Number Order 2.87 2.26 0 – 6 .03 -1.48
Sequencing 1.39 1.45 0 – 4 .63 -.99
Set Reproduction 3.61 1.56 0 – 6 -.49 -.38
Number Identification 5.45 3.08 0 – 9 -.41 -1.16
Numerals 3.45 2.00 0 – 6 -.26 -1.22

Arithmetic Reasoning
Addition and Subtraction with Objects 4.96 2.11 0 – 8 -.38 -.69
Addition and Subtraction without Objects 3.28 2.04 0 – 7 .15 -1.01
Initial Equivalence 3.32 1.53 0 – 6 -.01 -.61
Two-set Addition 1.75 1.13 0 – 4 .26 -.71
Equivalent Sets 2.82 1.56 0 – 5 -.10 -1.01
Number Composition and Decomposition 2.56 1.89 0 – 6 .39 -.98
Number Combinations 1.29 1.57 0 – 5 1.12 .10
Note. N = 393.


CFAs for each numeracy domain were conducted separately. Raw total scores were used in the analyses. All models were identified. Correlated residuals were included between specific skills that were measured within the same task to allow for common measurement variance due to the nature of the task. These model parameters included correlations between the residuals for the Structured Counting and Cardinality tasks as well as between the residuals for the Initial Equivalence and Two-Set Addition/Subtraction tasks. Inclusion of these parameters improved the model fit but did not change the structure of the model. Fit indices for the three domains are presented in Table 9. χ2/df ratios less than 2.0 are indicators of good fit and ratios greater than 2.0 but less than 3.0 are indicators of modest fit. Standardized Root Mean Square Residual (SRMR) values of less than .10 are indicators of a good fitting model. Comparative Fit Index (CFI) and Tucker-Lewis Index (TLI) values of greater than .90 are indicators of a good fitting model. A Root Mean Square Error of Approximation (RMSEA) value of less than or equal to .05 is an indicator of a good fitting model. An RMSEA value between .05 and .08 is an indicator of a moderate fitting model. All three models were a good fit to the data. All tasks significantly loaded on their respective factor, and thus, none were removed from further analyses. The Counting model with standardized regression weights is presented in Figure 10. The Numerical Relations model with standardized regression weights is presented in Figure 11. The Arithmetic Reasoning model with standardized regression weights is presented in Figure 12.

Step 4 - Model Comparison

The fit of the five possible one-, two-, and three-factor alternative models of Counting, Numerical Relations, and Arithmetic Reasoning was evaluated using CFA in Mplus version 5.1 (Muthén & Muthén, 2008a).
As was done in Part 3, all models contained correlated errors between the structured counting and cardinality tasks as well as between the initial equivalence and two-set addition/subtraction tasks. Inclusion of these parameters improved the fit of the models but did not alter the structure of these models. All analyses were conducted with the raw total scores of each task.5 Fit indices for the five models are shown in Table 10. Although all models fit the data well, the three-factor model fit the data the best. Further, the χ2 difference tests revealed that the one- and two-factor models yielded a significantly worse fit to the data than the three-factor

5 Modification indices for the inclusion of other correlated errors were examined. The inclusion of other correlated errors did not change the model fit or comparison results; thus, they are not reported.


Table 7

Correlations between the sum scores of all the final early numeracy skill tasks in preschool children

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1. Rote Counting -
2. Count Forward/Backward .68 -
3. Error Identification .53 .61 -
4. Structured Counting .69 .62 .52 -
5. Cardinality .62 .57 .52 .76 -
6. Resultative Counting .63 .62 .57 .64 .59 -
7. Counting a Subset .73 .70 .64 .76 .68 .72 -
8. Subitizing .52 .54 .44 .57 .49 .56 .63 -
9. Estimation .48 .48 .46 .41 .36 .43 .51 .41 -

10. Ordinality .54 .58 .63 .53 .49 .55 .65 .53 .49 -
11. Relative Size .55 .58 .58 .49 .44 .49 .60 .45 .41 .62 -
12. Number Comparison .50 .54 .55 .47 .41 .48 .58 .45 .37 .62 .56 -
13. Set Comparison .52 .57 .61 .56 .48 .51 .64 .46 .45 .61 .58 .63 -
14. Number Order .64 .66 .70 .64 .58 .63 .72 .53 .48 .66 .66 .58 .64 -
15. Sequencing .46 .48 .58 .44 .42 .45 .57 .42 .42 .59 .50 .52 .51 .56 -
16. Set Reproduction .51 .56 .49 .57 .49 .51 .64 .52 .41 .49 .49 .47 .56 .54 .48 -
17. Number Identification .63 .64 .61 .66 .60 .56 .70 .50 .39 .57 .60 .51 .58 .70 .42 .55 -
18. Numerals .58 .61 .61 .63 .57 .61 .67 .54 .40 .58 .60 .56 .62 .72 .49 .52 .69 -

19. Add/Sub with Objects .54 .59 .51 .54 .51 .54 .61 .49 .42 .46 .47 .45 .54 .55 .47 .50 .55 .56
20. Add/Sub without Objects .53 .56 .59 .51 .48 .53 .65 .53 .44 .61 .59 .59 .63 .65 .59 .52 .54 .55
21. Initial Equivalence .38 .37 .36 .38 .31 .41 .47 .29 .26 .38 .40 .41 .47 .44 .44 .39 .33 .37
22. Two-Set Add/Sub .33 .36 .44 .31 .26 .36 .42 .29 .26 .39 .38 .37 .40 .39 .41 .32 .35 .39
23. Equivalent Sets .45 .46 .44 .47 .40 .43 .51 .39 .33 .42 .40 .40 .46 .41 .44 .48 .39 .44
24. Composition/Decomposition .44 .52 .49 .43 .34 .44 .51 .42 .42 .53 .46 .51 .49 .48 .43 .38 .47 .45
25. Number Combinations .36 .40 .39 .31 .30 .37 .39 .32 .36 .44 .39 .44 .38 .44 .37 .36 .38 .36
Note. N = 393. All correlations were significant at p < .01.


Table 7 - continued
19 20 21 22 23 24 25
1. Rote Counting
2. Count Forward/Backward
3. Error Identification
4. Structured Counting
5. Cardinality
6. Resultative Counting
7. Counting a Subset
8. Subitizing
9. Estimation

10. Ordinality
11. Relative Size
12. Number Comparison
13. Set Comparison
14. Number Order
15. Sequencing
16. Set Reproduction
17. Number Identification
18. Numerals

19. Add/Sub with Objects -
20. Add/Sub without Objects .53 -
21. Initial Equivalence .41 .45 -
22. Two-Set Add/Sub .40 .42 .53 -
23. Equivalent Sets .38 .48 .40 .37 -
24. Composition/Decomposition .44 .55 .32 .33 .44 -
25. Number Combinations .34 .44 .28 .29 .33 .39 -
Note. N = 393. All correlations were significant at p < .01.


Table 8.

Means, standard deviations, and F-values of the Early Numeracy Skills for preschool children by sex

Task Male a (Mean, SD) Female b (Mean, SD) F
Counting
Rote Counting 2.98 1.59 3.06 1.51 .26
Counting Forward/Backward 2.55 1.99 2.57 1.85 .01
Error Identification 3.90 3.05 3.62 2.88 .91
Structured Counting 3.21 1.72 3.29 1.59 .23
Cardinality 2.11 1.12 2.20 1.04 .79
Resultative Counting 1.60 1.24 1.56 1.26 .09
Count a Subset 4.26 2.42 4.47 2.38 .72
Subitizing 3.84 1.67 3.86 1.74 .01
Estimation 1.82 1.33 1.75 1.32 .29
Numerical Relations
Ordinality 1.58 1.42 1.52 1.37 .16
Relative Size 1.52 1.37 1.46 1.33 .18
Number Comparison 2.20 1.91 2.17 1.85 .02
Set Comparison 3.82 1.92 3.97 1.91 .60
Number Order 2.99 2.28 2.76 2.24 .98
Sequencing 1.45 1.50 1.33 1.40 .64
Set Reproduction 3.58 1.54 3.63 1.59 .09
Number Identification 5.52 3.19 5.38 2.97 .19
Numerals 3.35 2.06 3.55 1.94 1.02
Arithmetic Reasoning
Addition and Subtraction with Objects 5.01 2.19 4.92 2.03 .16
Addition and Subtraction without Objects 3.28 2.11 3.27 1.97 .00
Initial Equivalence 3.43 1.53 3.21 1.54 1.93
Two-Set Addition/Subtraction 1.90 1.23 1.62 1.02 6.26*
Equivalent Sets 2.73 1.62 2.90 1.51 1.24
Number Composition and Decomposition 2.53 1.90 2.58 1.88 .07
Number Combinations 1.24 1.61 1.34 1.53 .46
Note. a n = 203. b n = 190.

* indicates the F-contrast significant at p < .05.


Table 9.

Model fit for the Counting, Numerical Relations, and Arithmetic Reasoning factors.

Model χ2 df CFI TLI AIC BIC RMSEA SRMR
Counting 53.70 26 .99 .98 11473.79 11585.06 .05 .02
Numerical Relations 100.43 27 .97 .95 12223.99 12331.28 .08 .03
Arithmetic Reasoning 14.02 13 1.00 1.00 9811.04 9898.47 .01 .02

Note. N = 393. CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; AIC = Akaike Information Criterion; BIC = Bayesian Information Criterion; RMSEA = Root Mean Squared Error of Approximation; SRMR = Standardized Root Mean Square Residual.
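
As a quick check, the χ2/df criterion stated in the text can be applied directly to the Table 9 values:

```python
# chi-square / df ratios for the three domain models (values from Table 9)
models = {
    "Counting": (53.70, 26),
    "Numerical Relations": (100.43, 27),
    "Arithmetic Reasoning": (14.02, 13),
}
ratios = {name: round(chi2 / df, 2) for name, (chi2, df) in models.items()}
print(ratios)  # {'Counting': 2.07, 'Numerical Relations': 3.72, 'Arithmetic Reasoning': 1.08}
```

By the ratio criterion alone, the Numerical Relations model (3.72) would fall outside the modest-fit band, which illustrates why fit is judged across multiple indices jointly: its CFI, TLI, and SRMR all meet the stated good-fit cutoffs, and its RMSEA (.08) sits at the edge of the moderate band.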


Figure 10. The Confirmatory Factor Analysis model of Counting skills (N = 393). Ovals represent latent variables and rectangles represent observed variables. All values represent standardized coefficients. All factor loadings are significant at p < .001.


Figure 11. The Confirmatory Factor Analysis model of Numerical Relations skills (N = 393). Ovals represent latent variables and rectangles represent observed variables. All values represent standardized coefficients. All factor loadings are significant at p < .001.


Figure 12. The Confirmatory Factor Analysis model of Arithmetic Reasoning skills (N = 393). Ovals represent latent variables and rectangles represent observed variables. All values represent standardized coefficients. All factor loadings are significant at p < .001.


Table 10

Fit indices for the models of the Structure of Early Numeracy Skills in Preschool Children

Model χ2 df CFI TLI AIC BIC RMSEA SRMR χ2 dif a
3-Factor 576.21 270 .95 .95 32464.70 32782.60 .05 .04 --
2-Factor NR+AR, CT 600.13 272 .95 .94 32484.61 32794.57 .06 .04 23.92*
2-Factor CT+NR, AR 643.08 272 .94 .94 32527.56 32837.52 .06 .04 66.87*
2-Factor CT+AR, NR 636.20 272 .94 .94 32520.68 32830.64 .06 .04 59.99*
1-Factor 673.71 273 .94 .93 32556.20 32862.18 .06 .04 97.50*

Note. N = 393. CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; AIC = Akaike Information Criterion; BIC = Bayesian Information Criterion; RMSEA = Root Mean Squared Error of Approximation; SRMR = Standardized Root Mean Square Residual.
a Chi-square difference tests involved comparison to the three-factor model.

*p < .01.
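
The χ2 difference values in Table 10 can be reproduced by subtracting the three-factor model's χ2 and df from each alternative's, and each difference can then be compared against the critical χ2 at the df difference (9.21 for 2 df and 11.34 for 3 df at p = .01, standard table values):

```python
base_chi2, base_df = 576.21, 270  # three-factor model (Table 10)
alternatives = {
    "2-Factor NR+AR, CT": (600.13, 272),
    "2-Factor CT+NR, AR": (643.08, 272),
    "2-Factor CT+AR, NR": (636.20, 272),
    "1-Factor": (673.71, 273),
}
# chi-square critical values at p = .01 (standard table)
critical = {2: 9.21, 3: 11.34}

results = {}
for name, (chi2, df) in alternatives.items():
    d_chi2, d_df = round(chi2 - base_chi2, 2), df - base_df
    results[name] = (d_chi2, d_df, d_chi2 > critical[d_df])

for name, (d_chi2, d_df, sig) in results.items():
    print(name, d_chi2, d_df, "p < .01" if sig else "n.s.")
```

All four differences exceed their critical values, matching the asterisks in Table 10 and the text's conclusion that the alternatives fit significantly worse than the three-factor model.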


Figure 13. The three-factor model of preschool early numeracy skills (N = 393). Ovals represent latent variables and rectangles represent observed variables. All values represent standardized coefficients. All factor loadings are significant at p < .001.


model. Parameter values for the three-factor model are shown in Figure 13. All tasks loaded significantly on their respective factor.


DISCUSSION

In this study, an early numeracy measure that assessed directly the skills of Counting, Numerical Relations, and Arithmetic Reasoning was developed because no existing measure captured both the depth and the breadth of these domains. Three separate subtests--one for each of the early numeracy domains--were constructed utilizing a compilation of modified tasks that were commonly used to assess their respective domains. The Counting subtest included tasks intended to measure a child’s knowledge of the counting sequence and ability to obtain quantity. The Numerical Relations subtest included tasks intended to measure a child’s knowledge of numbers and quantity and the association between the numbers on the number line. The Arithmetic Reasoning subtest included tasks intended to measure children’s ability to understand changes in quantity and other basic arithmetic abilities. Through the use of Item Response Theory analyses, overlapping and uninformative items were removed from the measures, resulting in tasks that assessed uniformly the range of their target skill. All the tasks significantly loaded on their respective factors and provided valuable information for the assessment of early numeracy skills. Utilizing this measure, five possible models of the relations between the three domains were compared. The results of this study indicate that early numeracy skills in preschool children are represented best by the three highly related but distinct factors of Counting, Numerical Relations, and Arithmetic Reasoning.

Measure Development

Issues in task functioning

Although the majority of the tasks developed for the measure had psychometric properties that met or exceeded standards, several tasks had psychometric properties that fell below these standards. In subsequent revisions of this measure, modifications to, or removal of, poorly functioning tasks may be necessary to improve the overall measurement of the three early numeracy domains.
The tasks that were just below acceptable reliability standards (e.g., resultative counting, relative size, and set reproduction) would likely meet the standards if items with difficulty parameters at the high or low end of the latent trait continuum were added. The subitizing task, which also was just below acceptable standards, could be improved through the addition of items that present the pictures in a different manner (e.g., dispersed across the page rather than in a straight line). It is not possible to add additional set sizes because subitizing is the


rapid quantification of small sets without counting, and the task already includes items that assess sets of one to seven. The three tasks with low levels of internal consistency (i.e., estimation, initial equivalence, and two-set addition/subtraction) may need to be removed because they may be inappropriate for children of this age. The estimation task had not been used previously with preschool children (Jordan et al., 2007), and many of the children had difficulty grasping this concept, even after the examiner provided a thorough explanation. The other two tasks (i.e., initial equivalence and two-set addition/subtraction) functioned poorly most likely because children could correctly guess the answer 50% of the time on some items. As was observed on many of the Numerical Relations tasks in Phase 1, multiple-choice items that had only two or three options were subject to significant amounts of error from guessing. The same concern likely affected the initial equivalence and two-set addition/subtraction tasks.

Issues in the assessment of informal mathematics in young children

Through the development of this measure, valuable insights regarding practical issues in the assessment of early numeracy skills were obtained. It was important not only to construct tasks that assessed the range of skills and the depth of ability but also to ensure that the tasks were easy to administer and not time-consuming. One specific way these issues were addressed was by examining whether the use of manipulatives, such as blocks, was necessary for the effective assessment of counting skills. Although recent evidence found that manipulatives and hands-on activities are important tools in teaching children abstract mathematics concepts (Nishida, 2009), no systematic evaluation of the importance of manipulatives in assessing children’s skills had previously been conducted. In this study, the tasks that assessed counting skills using blocks, dots, or pictures all performed comparably.
Therefore, it is evident that utilizing manipulatives is unnecessary for the assessment of these constructs. Similar findings were observed on tasks that used dots or pictures: the tasks with dots performed comparably to the tasks with pictures, indicating that the manner in which the items were presented did not affect the outcomes. These findings, coupled with the results of the item-reduction process, resulted in measures of Counting, Numerical Relations, and Arithmetic Reasoning that were age-appropriate, easy to administer, and psychometrically sound. Further, the development of these measures has provided a foundation upon which to assess the relations between these skills.
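As a concrete illustration of the internal-consistency criterion used in the item-reduction process, Cronbach's alpha can be computed directly from an item-response matrix. The sketch below uses NumPy and entirely hypothetical 0/1 scores; it illustrates only the statistic itself, not the study's data or analysis code.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_children, n_items) matrix of item scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()  # sum of per-item variances
    total_variance = items.sum(axis=1).var(ddof=1)    # variance of children's total scores
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical right/wrong (1/0) responses: six children by four items
scores = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
])
print(round(cronbach_alpha(scores), 3))
```

With real task data, an alpha below a conventional cutoff (often .70) would flag a task, such as the estimation task discussed above, as having low internal consistency.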


Model Comparison

Measure use

Although no previous studies have examined directly the three-factor model, these results support findings by Jordan et al. (2006), who found that Arithmetic Reasoning was a factor distinct from a combined measure of Counting and Numerical Relations skills, and by Okimato (2000, cited in Kalchman et al., 2001, p. 3), who found that Counting and Numerical Relations skills were distinct factors. This study's results expand upon these past findings in three central ways. First, this is the only study to examine the structure of early numeracy skills using a broad and deep measure of all three constructs. Past studies used assessment measures that either were not constructed to assess all three factors or automatically grouped two of the factors within one construct without testing that construct for unidimensionality. For example, Jordan and colleagues (2006) utilized a measure that combined counting skills and numerical relations skills; Clements et al. (2008) tested the goodness of fit for only the one-factor model (they also included informal geometry skills in the analysis); and Okimato (2000) did not include a distinct measure of Arithmetic Reasoning in the analyses.

Methods and statistics

The second way this study expanded upon past findings was by utilizing statistical and methodological techniques that allowed a three-factor model to be examined. Both Jordan et al. (2006) and Okimato (2000) utilized exploratory factor analysis. Although Jordan et al. identified possible two- and three-factor models, the three-factor model was rejected because it was subject to a Heywood case (i.e., a negative estimated residual variance). One common cause of Heywood cases is under-identification of a model (i.e., only one or two tasks load on a latent factor). In their study, there were only eight possible numeracy tasks that could load onto the three factors, which is a likely cause of the Heywood case. As for Clements et al.
(2008), their measure did not include direct measures of the different types of numeracy tasks (e.g., relative size, counting error identification); rather, it included individual items from a broad range of tasks. Thus, it was not possible to compare these models in that study.

Model identification

The third, but primary, way this study expanded upon past results is that it identified clearly a model of three distinct factors of early numeracy skills, utilizing a large and diverse sample of 3- to 5-year-old preschool children, and directly compared the three-factor model to other potential models. Support for the three-factor model fills an important gap in the current understanding of early numeracy skills because construct evaluation of early numeracy skills was lacking in the body of literature; the central focus of the majority of early numeracy research had been on curriculum development. Although several effective early mathematics curricula have been developed (Clements & Sarama, 2007; Greenes, Ginsburg, & Balfanz, 2004; Starkey, Klein, & Wakeley, 2004), these curricula were built upon the constructs set forth by the NCTM PSSM (2008), which were not empirically validated. These findings stand in contrast to the constructs presented in the NCTM PSSM and indicate that the commonly used standards of early numeracy skills in preschool may not adequately represent the relation between the factors of Counting, Numerical Relations, and Arithmetic Reasoning. The information garnered from this study can be used to delineate a revised framework for early numeracy skills and to operationally define the skills and tasks that contribute to each factor. Before significant modifications to the standards are made, however, further evaluation of and support for the relations between early numeracy skills are needed. Although the results of this study identified the three factors as separate and distinct, the factors were also very highly correlated (r > .90). Factors with such high correlations may functionally be one factor (i.e., the factors may develop at similar rates and times). It is unclear whether the high correlation is a true representation of the relation between these factors or whether it is artificially inflated because of common underlying aspects such as cognitive, behavioral, or language skills. Further, because mathematics skills are cumulative, the high relation could be a result of causal relations between the three factors.
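Whether two highly correlated factors are statistically distinguishable is commonly evaluated with a chi-square difference test between nested models (e.g., a model fixing a factor correlation to 1.0 versus one estimating it freely). The sketch below, using SciPy and made-up fit statistics, illustrates only the arithmetic of that comparison; the model labels and values are hypothetical and are not results from this study.

```python
from scipy.stats import chi2

def chi_square_difference(chisq_restricted: float, df_restricted: int,
                          chisq_full: float, df_full: int) -> float:
    """p-value for the difference test between two nested latent-variable models."""
    delta_chisq = chisq_restricted - chisq_full  # the restricted model fits no better
    delta_df = df_restricted - df_full
    return chi2.sf(delta_chisq, delta_df)

# Hypothetical fit statistics: one-factor (restricted) vs. three-factor (full) model
p = chi_square_difference(chisq_restricted=312.4, df_restricted=54,
                          chisq_full=260.1, df_full=51)
print(p < .05)  # a significant result favors keeping the factors distinct
```

A significant difference indicates that collapsing the factors worsens model fit, supporting their treatment as separate constructs despite high correlations.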
Unfortunately, the direct causal impacts of individual skills on other skills have not yet been examined.

Future Directions

The three factors of early numeracy skills that have been identified represent measurable constructs that may enable researchers and educators to identify children at risk for later mathematics difficulties and may be potential areas for intervention. A large body of research explaining the developmental underpinnings of reading achievement has identified a core phonological deficit as a cause of reading disabilities (Liberman & Shankweiler, 1985; Share, 1995; Stanovich, 1988). As of now, a core deficit in mathematics disabilities has not been identified (Chiappe, 2005). Although early numeracy skills have been implicated as important contributors to formal mathematics development (Geary, 1994; Ginsburg, 1989), the extent to which any one of these skills, compared to the other early numeracy skills, is central to the development of later mathematics skills is unclear. Utilizing the three-factor framework, it will be possible to examine the longitudinal interactions of these early skills to garner a greater understanding of their relation to the development of later formal mathematics skills. For example, it is necessary to understand how Counting and Numerical Relations skills contribute to the development of Arithmetic Reasoning. Currently, it is unknown whether both factors have a direct and significant relation to Arithmetic Reasoning or whether one factor mediates the relation between the other factor and Arithmetic Reasoning. By understanding the structure of preschool mathematics skills, and how these individual skills are related to the development of formal arithmetic processes, researchers and educators will be able to understand more fully the developmental trajectories of early numeracy skills, and it may be possible to identify earlier those children who are at risk for later mathematics learning disabilities. Once the developmental trajectories of early numeracy skills are better understood, it will be important to identify which of these skills are the most malleable with instruction and are central to the development of a broad range of mathematics skills. One means of identifying the core area or areas for intervention is through the evaluation of targeted aspects of existing preschool mathematics curricula, utilizing the three-factor model as a framework. Specifically, completing component analyses (i.e., targeted interventions on one skill domain) of existing curricula may help to determine which of these domains result in the greatest growth of children's numeracy skills.
By examining the effects of specific components of a curriculum, such as the Counting instruction, Numerical Relations instruction, or Arithmetic Reasoning instruction, causal attributions can be made regarding the effects one skill has on the development of other mathematics skills. These causal attributions are important to the overall understanding of early numeracy skills because they enable researchers to define clearly the developmental progression of early numeracy skills and how best to improve children's overall mathematics performance. Through this knowledge, refined and targeted curricula and interventions can be developed and utilized to narrow or even close the international and socioeconomic-status achievement gaps in early mathematics.


Early Numeracy and Third Variables

The high correlation between these three factors is likely due to underlying cognitive, behavioral, and language skills. Understanding the link between these domains and early numeracy skills may provide valuable information regarding the development of early numeracy skills and potential barriers to successful acquisition of these skills. It has been reported that the development of mathematics skills has many cognitive requirements. Research by Swanson and colleagues (Swanson, 2004; Swanson & Beebe-Frankenberger, 2004; Swanson & Kim, 2007) and by Fuchs and colleagues (Fuchs et al., 2005; 2006) indicates that working memory and inattention, respectively, are two of the most important factors in the development of mathematics skills. Future research should examine the impact these skills have on the development of mathematics skills and on the effectiveness of mathematics interventions, particularly with younger children who are just starting their formal education. If these cognitive and behavioral factors are found to account for significant variance in the development of the three early numeracy factors, it may be necessary to include them in an overall model of mathematics development and mathematics disability identification. Children's language skills are another factor that likely contributes to the overall development of early mathematics skills and accounts for some of the overlap between the three numeracy domains. Fuchs and colleagues (Fuchs et al., 2008) found that it was necessary to include language skills in the prediction of third-grade children's ability to learn mathematics tasks. Surprisingly, the relation between language skills and mathematics is under-studied in typical samples of preschool children.
Often, when these skills are examined together, they are examined either only in older children (Fuchs, Compton, Fuchs, Hollenbeck, Craddock, et al., 2008; Leong & Jerred, 2001) or in children with limited English language ability (Kerr, 2008; Kim, 2008; Naude, Pretorius, & Vandeyar, 2003; Wu, West, & Hughes, 2008). Examination of the relation between language and early numeracy skills may identify language skills as an important underlying factor in preschool children's successful acquisition of early numeracy skills. One reason for the potential link between language and early mathematics skills in preschool children is the inherent necessity of understanding multiple language terms to accomplish basic mathematical tasks. For example, "less," "fewer," "least," "lowest," and "smallest" all are words that describe the lesser of two or more quantities. However, a child with limited language abilities may not understand that these words have the same general meaning. A child's ability to understand a concept (e.g., which set has the fewest dots) may develop at a different time than his or her ability to comprehend the language necessary to understand the question (e.g., knowing the meaning of fewer; Lansdell, 1999). Further, certain mathematics tasks, such as basic arithmetic reasoning, often are couched in story problems, necessitating the ability to read or adequate language development to accomplish the mathematics problems. The potential overlap between language and the three early mathematics domains must be parsed out to gain a clearer picture of the relations between the three domains.

Conclusion

It is evident that a significant amount of work still is needed to understand fully the constructs of early numeracy skills, how these skills relate to one another, and the developmental trajectories of these skills. However, the identification of the three-factor structure of Counting, Numerical Relations, and Arithmetic Reasoning provides a platform of knowledge upon which future research on the development of early mathematics skills, including longitudinal and intervention studies, can be built. Ideally, utilizing this structure as a framework for future research will enable researchers to identify the central factors in the development of later mathematics skills and eventually narrow or eliminate the international and socioeconomic-status achievement gaps.


APPENDIX A

INFORMED CONSENT


Office of the Vice President For Research
Human Subjects Committee
Tallahassee, Florida 32306-2742
(850) 644-8673 · FAX (850) 644-4392

APPROVAL MEMORANDUM

Date: 9/12/2008

To: David Purpura

Address: 227 N. Bronough St., Suite 3111, Tallahassee, FL
Dept.: PSYCHOLOGY DEPARTMENT

From: Thomas L. Jacobson, Chair

Re: Use of Human Subjects in Research DEVELOPMENT OF EARLY MATHEMATICS SKILLS, COGNITIVE SKILLS, AND EXECUTIVE FUNCTIONING

The application that you submitted to this office in regard to the use of human subjects in the research proposal referenced above has been reviewed by the Human Subjects Committee at its meeting on 09/10/2008. Your project was approved by the Committee.

The Human Subjects Committee has not evaluated your proposal for scientific merit, except to weigh the risk to the human participants and the aspects of the proposal related to potential risk and benefit. This approval does not replace any departmental or other approvals, which may be required.

If you submitted a proposed consent form with your application, the approved stamped consent form is attached to this approval notice. Only the stamped version of the consent form may be used in recruiting research subjects.

If the project has not been completed by 9/9/2009 you must request a renewal of approval for continuation of the project. As a courtesy, a renewal notice will be sent to you prior to your expiration date; however, it is your responsibility as the Principal Investigator to timely request renewal of your approval from the Committee.

You are advised that any change in protocol for this project must be reviewed and approved by the Committee prior to implementation of the proposed change in the protocol. A protocol change/amendment form is required to be submitted for approval by the Committee. In addition, federal regulations require that the Principal Investigator promptly report, in writing any unanticipated problems or adverse events involving risks to research subjects or others.

By copy of this memorandum, the Chair of your department and/or your major professor is reminded that he/she is responsible for being informed concerning research projects involving human subjects in the department, and should review protocols as often as needed to insure that the project is being conducted in compliance with our institution and with DHHS regulations.

This institution has an Assurance on file with the Office for Human Research Protection. The Assurance Number is IRB00000446.

Cc: Christopher Lonigan, Advisor HSC No. 2008.1643


David J. Purpura

The Florida State University
Department of Psychology
Tallahassee, Florida 32306-1270
Phone: 850-645-6401
[email protected]
850/644-2040
850/644-7739 (fax)

Faculty Advisor: Christopher J. Lonigan, Ph.D.
Phone: 850-644-7241
[email protected]

Informed Consent Form

Title: DEVELOPMENT OF EARLY MATHEMATICS SKILLS, COGNITIVE SKILLS, AND EXECUTIVE FUNCTIONING
Investigator: DAVID J. PURPURA, M.S.
Telephone: (850) 645-6401

This is a project that is being conducted at your child’s preschool. The purpose of this research project is to examine the structure of children’s pre-mathematics skills, the development of these skills to later mathematics skills, the relation between these skills, cognitive abilities, and children’s behavior, as well as to examine young children’s executive functioning skills.

If you decide to participate in this phase of the project, your part in the project may include:

1. Completing questionnaires that ask for information about your child's developmental, academic, behavioral, and language history, characteristics of your family, such as number of children, habits, and amount of education completed, and general characteristics about your child, like his or her activity level, temperament, and ability to interact with others.

Should you decide to participate, your child's part in the project may include:

1. Participating in periodic assessments of pre-mathematics skills, mathematics skills, language skills, cognitive skills, and behavioral functioning. These assessments will be conducted by trained project personnel at your child's preschool or school, and they involve commonly used tests or developmental tests similar to other tests that involve your child being asked to count, compare groups, examine the relations between sets, add and subtract groups, circle similar sets of pictures, or respond to simple computer games (i.e., finding an object presented on the screen). Children typically enjoy these tests. We will not continue a test if your child indicates that he or she does not want to participate. We anticipate that each periodic assessment will take approximately 1-1/2 hours total time (spread over at least 2 shorter assessment periods). For this project, assessments will be conducted at the beginning and/or end of the preschool year, and at the end of the school year in kindergarten and first grade.

2. We will ask your child’s teachers to complete rating forms on behavior in the preschool or school settings, and we will have these same rating forms completed by members of our research team. For our research staff to be able to accurately fill out some of these forms, they may spend time observing your child in his or her classroom setting.

Your participation may result in the following benefits:

1. You will be informed of your child's assessment results if you request such a summary. These results will allow a broad determination of how your child's pre-mathematics skills are developing relative to age expectations. Children and parents in this project will participate in future yearly assessments of children’s academic skills, cognitive skills, and behavior. These assessments will allow you to monitor the development of your child’s academic/school-related skills.


2. The information that we gather from this project will be used to better understand the development of pre- mathematics skills and how cognitive and behavioral skills impact the development of these academic skills. This information may be used to help children who are lagging behind in the development of academic skills. Portions of the data from this project will be used to help parents and preschools cooperate to enhance children's development.

We know of no risks associated with your or your child's participation in this project.

Your and your child’s participation is completely voluntary. You do not have to participate if you do not want to. Your (or your child’s) decision whether to participate or not will have no effects on any other treatment or services to which you are entitled from Florida State University or any institution associated with the project. You may change your mind and withdraw from this project at any time without penalty. There are no risks associated with withdrawal from this project.

All information obtained as a result of this project will be kept confidential, to the extent allowed by law. Confidentiality will be ensured in the following ways: In public reports of the results of this project, we will only report results that have been averaged over large numbers of children. No individual child or family will ever be identified publicly. Assessments of your child's skills are solely for research purposes. These assessments and other information gathered on your family and child will be kept in a locked file storage area in research offices at the Department of Psychology at Florida State University (Lonigan Research Lab), identified only by a code, and will not be available to your child's school or to any other person or institution unless you ask us in writing to do so. All materials will be retained for a period of up to five years following completion of the project (up to September 12, 2015).

If at any time you have questions about this project, please contact David J. Purpura at the Department of Psychology, Florida State University, (850) 645-6401. A description of the group results of this project will be sent to you upon request. If you have questions about your rights as a participant in this project, or if you feel you have been placed at risk, you can contact the Chair of the Human Subjects Committee, Institutional Review Board, through the Office of the Vice President for Research, at (850) 644-8633.

If you agree to participate in this research project, please sign and print your name and the name of your child below. Your signature indicates that you have read the information provided above, or have had it read to you, and that you have decided to participate.

A copy of this consent form has been offered to you.

______Signature and printed name of parent or legal guardian Today’s Date

______Printed name of child Child's Date of Birth Please include the following information so that we can contact you regarding project results.

______Street Address City State Zip

Home Phone: ______Work Phone: ______


APPENDIX B

FACILITY CONSENT FORM


David J. Purpura

The Florida State University
Department of Psychology
Tallahassee, Florida 32306-1270
Phone: 850-645-6401
[email protected]
850/644-2040
850/644-7739 (fax)

Faculty Advisor: Christopher J. Lonigan, Ph.D.
Phone: 850-644-7241
[email protected]

Facility Consent Form Title: DEVELOPMENT OF EARLY MATHEMATICS SKILLS, COGNITIVE SKILLS, AND EXECUTIVE FUNCTIONING Investigator: David J. Purpura Telephone: (850) 645-6401

The purpose of this research project is to determine what kinds of screening measures are best suited to help determine if children are prepared for academic instruction when they reach kindergarten and first grade. The project will help preschool teachers identify areas where children are struggling and provide more appropriate and effective pre-academic activities in their preschool classrooms.

The project will include:

1. Children will participate in assessments of pre-mathematics skills, cognitive skills, and executive functioning. These assessments will be conducted by trained project personnel, and they involve commonly used tests or developmental tests similar to other tests that involve children being asked to count, compare groups, examine the relations between sets, add and subtract groups, circle similar sets of pictures, or respond to simple computer games (i.e., finding an object presented on the screen). Children typically enjoy these tests. We will not continue a test if a child indicates that he or she does not want to participate. We anticipate that each assessment will take approximately 90 minutes total time. For this project, assessments will be conducted at your school once, either in the fall or the spring.

2. Teachers may be asked to complete rating forms on children’s behavior in the preschool settings; we may have these same rating forms completed by members of our research team. For our research staff to be able to accurately fill out some of these forms, they may spend time observing children in the classroom setting.

3. Parents may be asked to complete questionnaires that ask for information about their child's developmental, academic, behavioral, and language history, characteristics of their family, such as number of children, habits, and amount of education completed, and general characteristics about their child, like his or her activity level, temperament, and ability to interact with others.

We know of no risks associated with children's participation in this project. All information obtained as a result of this project will be kept confidential, to the extent allowed by law. Confidentiality will be ensured in the following ways: In public reports of the results of this project, we will only report results that have been averaged over large numbers of children. No individual child or family will ever be identified publicly. Assessments of children's skills are solely for research purposes. These assessments and other information gathered in your facility will be kept in a locked file storage area in research offices at the Department of Psychology at Florida State University (Lonigan Research Lab), identified only by a code. All materials will be retained for a period of up to five years following completion of the project (up to September 12, 2015).

If you agree to participate in this research project, please sign and print your name and the name of your facility below. Your signature indicates that you have read the information provided above, or have had it read to you, and that you have agreed to participate.

A copy of this consent form has been offered to you.

______Printed name of facility director or designated point of contact

______Signature of facility director or designated point of contact Today’s Date


APPENDIX C

CHILD ASSENT FORM


David J. Purpura
Phone: 850-645-6401
[email protected]
The Florida State University
Department of Psychology
Tallahassee, Florida 32306-1270
850/644-2040
850/644-7739 (fax)

Faculty Advisor: Christopher J. Lonigan, Ph.D.
Phone: 850-644-7241
[email protected]

Child Assent Script

Hello ______(child’s name), my name is ______(examiner’s name). How are you doing today? I would like your help in a project I am doing. I would like to play some games with you. Sometimes I’ll ask you some questions about pictures or numbers, and sometimes we will play other games.

I am going to ask some of the other children in your class to help me answer some of the same questions and play the same games. Some of these games are for older kids too and you may not know all the answers, but that is okay because I just want you to try your best. And if you don’t want to play any more, you just let me know and we will stop. You won’t get in trouble.

Okay? So will you play some of these games with me?

If child says yes, add:

I am going to write down some of what you say so that I can remember it. Are you ready to play now?

If child says no, the child will be thanked and excused.


APPENDIX D

PHASE 1 COUNTING SUBTEST


Phase 1 Counting Subtest Fall 2008

Item number Description of item

Rote counting
Item 1a: Highest number counted to without an error
Item 1b: Highest number counted to with only 1 error

Counting forward without assistance
Item 2: Count forward from 4
Item 3: Count forward from 6
Item 4: Count forward from 11
Item 5: Count forward from 16

Counting forward with assistance
Item 6: Count forward from 4
Item 7: Count forward from 6
Item 8: Count forward from 11
Item 9: Count forward from 16

Counting backward with dots
Item 10: Count backward from 3
Item 11: Count backward from 6
Item 12: Count backward from 9
Item 13: Count backward from 14

Counting backward with assistance
Item 14: Count backward from 3
Item 15: Count backward from 6
Item 16: Count backward from 9
Item 17: Count backward from 14

Counting error identification
Item 18: Skipped counting a number
Item 19: Correct counting
Item 20: Reversed two numbers
Item 21: Correct counting
Item 22: Skipped counting a dot
Item 23: Incorrect cardinality
Item 24: Double counting of a dot
Item 25: Repeated a number
Item 26: Reversed two numbers
Item 27: Correct counting
Item 28: Double counting a dot
Item 29: Incorrect cardinality
Item 30: Skipped counting a dot
Item 31: Repeated a number
Item 32: Skipped counting a number
Item 33: Correct counting

Counting dots
Item 34: Count 4 dots
Item 35: Count 3 dots
Item 36: Count 16 dots
Item 37: Count 8 dots
Item 38: Count 20 dots
Item 39: Count 4 pictures
Item 40: Count 3 pictures
Item 41: Count 16 pictures
Item 42: Count 8 pictures
Item 43: Count 20 pictures
Item 44: Count 4 objects
Item 45: Count 3 objects
Item 46: Count 16 objects
Item 47: Count 8 objects
Item 48: Count 20 objects

Cardinality dots
Item 34b: 4 dots
Item 35b: 3 dots
Item 36b: 16 dots
Item 37b: 8 dots
Item 38b: 20 dots
Item 39b: 4 pictures
Item 40b: 3 pictures
Item 41b: 16 pictures
Item 42b: 8 pictures
Item 43b: 20 pictures
Item 44b: 4 objects
Item 45b: 3 objects
Item 46b: 16 objects
Item 47b: 8 objects
Item 48b: 20 objects

Resultative Counting
Item 49: Count 4
Item 50: Count 3
Item 51: Count 16
Item 52: Count 8
Item 53: Count 20

Counting a subset
Item 54: Count out 4 blocks from a set of 10
Item 55: Count out 3 blocks from a set of 10
Item 56: Count out 16 blocks from a set of 20
Item 57: Count out 8 blocks from a set of 20
Item 58: Count out 20 blocks from a set of 25

Count by category
Item 59: Count 4 cars
Item 60: Count 3 dogs
Item 61: Count 16 cars
Item 62: Count 8 dogs
Item 63: Count 20 cars

Subitizing
Item 64: Subitize 3 dots
Item 65: Subitize 2 dots
Item 66: Subitize 5 dots
Item 67: Subitize 1 dot
Item 68: Subitize 4 dots
Item 69: Subitize 7 dots
Item 70: Subitize 10 dots
Item 71: Subitize 3 pictures
Item 72: Subitize 2 pictures
Item 73: Subitize 5 pictures
Item 74: Subitize 1 picture
Item 75: Subitize 4 pictures
Item 76: Subitize 7 pictures
Item 77: Subitize 10 pictures

Estimation
Item 78: Estimate 10 dots
Item 79: Estimate 20 dots
Item 80: Estimate 50 dots
Item 81: Estimate 100 dots
Item 82: Estimate 10 pictures
Item 83: Estimate 20 pictures
Item 84: Estimate 50 pictures
Item 85: Estimate 100 pictures


APPENDIX E

PHASE 1 NUMERICAL RELATIONS SUBTEST


Phase 1 Numerical Relations Subtest Fall 2008

Item number Description of item

Ordinality
Item 1: 1st of 5
Item 2: 3rd of 5
Item 3: Last of 5
Item 4: 2nd of 5
Item 5: 2nd of 10
Item 6: 8th of 10
Item 7: 7th of 10
Item 8: 10th of 10

Relative size (visual)
Item 9: Closest to 2
Item 10: Closest to 5
Item 11: Closest to 7
Item 12: Closest to 10

Relative size (verbal)
Item 13: Closest to 3
Item 14: Closest to 6
Item 15: Closest to 8
Item 16: Closest to 11

Number comparison (visual)
Item 17: Most – 4, 1
Item 18: Most – 3, 2
Item 19: Most – 5, 3, 8
Item 20: Most – 12, 7, 14, 8
Item 21: Least – 3, 6
Item 22: Least – 1, 3
Item 23: Least – 3, 6, 2
Item 24: Least – 9, 7, 6, 12

Number comparison (verbal)
Item 25: Most – 3, 5
Item 26: Most – 6, 3
Item 27: Most – 5, 2, 7
Item 28: Most – 11, 6, 13, 7
Item 29: Least – 1, 4
Item 30: Least – 4, 3
Item 31: Least – 5, 7, 4
Item 32: Least – 8, 6, 5, 11


Set comparison (dots)
Item 33: Most – 4, 1
Item 34: Most – 3, 2
Item 35: Most – 5, 3, 8
Item 36: Most – 12, 7, 14, 8
Item 37: Least – 3, 6
Item 38: Least – 1, 3
Item 39: Least – 3, 6, 2
Item 40: Least – 9, 7, 6, 12

Set comparison (pictures)
Item 41: Most – 4, 1
Item 42: Most – 3, 2
Item 43: Most – 5, 3, 8
Item 44: Most – 12, 7, 14, 8
Item 45: Least – 3, 6
Item 46: Least – 1, 3
Item 47: Least – 3, 6, 2
Item 48: Least – 9, 7, 6, 12

Number order
Item 49: Number after 2
Item 50: Number before 2
Item 51: Number after 5
Item 52: Number before 5
Item 53: Number after 9
Item 54: Number before 9
Item 55: Number after 15
Item 56: Number before 15

Sequencing
Item 57: Sequence sets of 2, 3, 4
Item 58: Sequence sets of 1, 2, 3, 4
Item 59: Sequence sets of 2, 4, 5, 6
Item 60: Sequence sets of 2, 3, 4, 5, 7

Insertion Task
Item 57b: Insert a set of 1 into a sequence of 2, 3, 4
Item 58b: Insert a set of 5 into a sequence of 1, 2, 3, 4
Item 59b: Insert a set of 3 into a sequence of 2, 4, 5, 6
Item 60b: Insert a set of 6 into a sequence of 2, 3, 4, 5, 7


Set reproduction
Item 65: Match 5 blocks to 5 dots
Item 66: Match 3 blocks to 3 dots
Item 67: Match 8 blocks to 8 dots
Item 68: Match 18 blocks to 18 dots
Item 69: Match a set of 5 cars to a set of 5 planes
Item 70: Match a set of 3 cars to a set of 3 planes
Item 71: Match a set of 8 cars to a set of 8 planes
Item 72: Match a set of 18 cars to a set of 18 planes

Number Identification
  Item 73   Identify the number 1
  Item 74   Identify the number 2
  Item 75   Identify the number 3
  Item 76   Identify the number 4
  Item 77   Identify the number 5
  Item 78   Identify the number 6
  Item 79   Identify the number 7
  Item 80   Identify the number 8
  Item 81   Identify the number 9
  Item 82   Identify the number 10
  Item 83   Identify the number 11
  Item 84   Identify the number 12
  Item 85   Identify the number 13
  Item 86   Identify the number 14
  Item 87   Identify the number 15

Numerals
  Item 88   Match sets of dots to the number 5
  Item 89   Match sets of dots to the number 3
  Item 90   Match sets of dots to the number 8
  Item 91   Match sets of dots to the number 18
  Item 92   Match numbers to a set of 5 dots
  Item 93   Match numbers to a set of 3 dots
  Item 94   Match numbers to a set of 8 dots
  Item 95   Match numbers to a set of 18 dots

100

APPENDIX F

PHASE 1 ARITHMETIC REASONING SUBTEST

101

Phase 1 Arithmetic Reasoning Subtest Fall 2008

Item number   Description of item
Addition/Subtraction with objects
  Item 1   0+2=2
  Item 2   1+1=2
  Item 3   1+2=3
  Item 4   2+2=4
  Item 5   2-1=1
  Item 6   3-2=1
  Item 7   3-1=2
  Item 8   4-1=3

Addition/Subtraction without objects
  Item 9    0+2=2
  Item 10   1+1=2
  Item 11   1+2=3
  Item 12   2+2=4
  Item 13   2-1=1
  Item 14   3-2=1
  Item 15   3-1=2
  Item 16   4-1=3

Initial equivalence
  Item 17   4 discs in each box
  Item 18   5 discs in one box, 4 in the other box
  Item 19   4 discs in one box, 5 in the other box
  Item 20   6 discs in one box, 4 in the other box
  Item 21   4 discs in each box
  Item 22   5 discs in one box, 4 in the other box
  Item 23   4 discs in one box, 5 in the other box
  Item 24   6 discs in one box, 4 in the other box

Two-set addition/subtraction
  Item 17b   Add 1 to a box
  Item 18b   Add 1 to the side that has 1 fewer
  Item 19b   Add 1 to the side that has 1 more
  Item 20b   Add 1 to the side that has 1 fewer
  Item 21b   Subtract 1 from a box
  Item 22b   Subtract 1 from the side that has 1 more
  Item 23b   Subtract 1 from the box that has 1 fewer
  Item 24b   Subtract 1 from the box that has 2 more

102

Item number   Description of item
Equivalent sets
  Item 25   Divide 4 blocks into 2 sets
  Item 26   Divide 6 blocks into 3 sets
  Item 27   Divide 12 blocks into 4 sets
  Item 28   Divide 10 blocks into 2 sets
  Item 29   Divide 9 blocks into 2 sets with 1 remainder
  Item 30   Divide 13 blocks into 4 sets with 1 remainder

Number composition/decomposition
  Item 31   Add 2 discs
  Item 32   Add 1 disc
  Item 33   Add 1 disc
  Item 34   Add 3 discs
  Item 35   Take away 2 discs
  Item 36   Take away 1 disc
  Item 37   Take away 1 disc
  Item 38   Take away 2 discs

Number Combinations
  Item 39   1+1=2 presented visually and verbally
  Item 40   1+2=3 presented visually and verbally
  Item 41   2+2=4 presented visually and verbally
  Item 42   1+3=4 presented visually and verbally
  Item 43   1+1=2 presented verbally
  Item 44   1+2=3 presented verbally
  Item 45   2+2=4 presented verbally
  Item 46   1+3=4 presented verbally
  Item 47   1+1=2 presented visually
  Item 48   1+2=3 presented visually
  Item 49   2+2=4 presented visually
  Item 50   1+3=4 presented visually

103

APPENDIX G

PHASE 1 ITEM PSYCHOMETRIC INFORMATION

104

Table G-1

Percent correct, percent partial correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Counting Up tasks in Phase 1

Task        % Correct   % Partial   Item-Total r   Loading   Retained
Counting Forward – Without Help
  Item 2        54          7           .59           .82       No
  Item 3        48         10           .63           .82       No
  Item 4        29         14           .70           .92       No
  Item 5        26         13           .70           .95       No

Counting Forward – With Help
  Item 6        71          9           .62           .94       Yes
  Item 7        65          6           .66           .93       Yes
  Item 8        33         18           .69           .90       Yes
  Item 9        26         17           .66           .96       Yes

Note. N = 103. The item numbers in this table refer to the Counting item numbers from Appendix D.

All factor loadings were significant at p < .001.

105

Table G-2

Percent correct, percent partial correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Counting Backward tasks in Phase 1

Task        % Correct   % Partial   Item-Total r   Loading   Retained
Counting Backward – with pictures
  Item 10       37          7           .55           .91       No
  Item 11       17          5           .69           .94       No
  Item 12       12          5           .63           .83       No
  Item 13        4          1           .45           .88       No

Counting Backward – with assistance
  Item 14       47          5           .56           .77       Revised
  Item 15       24          8           .75           .99       Revised
  Item 16       10          3           .57           .94       Revised
  Item 17        2          1           .28           .73       Revised

Note. N = 103. The item numbers in this table refer to the Counting item numbers from Appendix D.

All factor loadings were significant at p < .001.

106

Table G-3

Percent correct, percent partial correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Error Identification task in Phase 1

Task        % Correct   % Partial   Item-Total r   Loading   Retained
  Item 18       32         28           .68           .87       No
  Item 19       91         --           .11           .11*      Yes
  Item 20       30         28           .81           .94       Yes
  Item 21       91         --           .13           .19*      No
  Item 22       29         15           .75           .89       Yes
  Item 23       18         11           .47           .64       No
  Item 24       26         12           .74           .88       Yes
  Item 25       40         14           .81           .95       No
  Item 26       27         15           .76           .91       No
  Item 27       91         --           .05          -.13*      Yes
  Item 28       22         12           .66           .80       No
  Item 29       18          8           .52           .71       Yes
  Item 30       25          9           .59           .77       No
  Item 31       39          8           .82           .96       Yes
  Item 32       27         14           .80           .93       Yes
  Item 33       83         --           .00          -.07*      No

Note. N = 103. The item numbers in this table refer to the Counting item numbers from Appendix D.

* Indicates non-significant factor loading.

107

Table G-4

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Structured Counting task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Counting Dots
  Item 34       79           .48          .83       Yes
  Item 35       85           .44          .76       Yes
  Item 36       26           .40          .73       Yes
  Item 37       59           .53          .85       Yes
  Item 38       21           .50          .79       Yes
Counting Pictures
  Item 39       86           .40          .66       No
  Item 40       85           .46          .70       No
  Item 41       23           .55          .92       No
  Item 42       62           .39          .66       No
  Item 43       23           .50          .89       No
Counting Objects
  Item 44       86           .50          .83       No
  Item 45       80           .55          .90       No
  Item 46       26           .48          .80       No
  Item 47       53           .48          .91       No
  Item 48       21           .44          .73       No

Note. N = 103. The item numbers in this table refer to the Counting item numbers from Appendix D.

All factor loadings were significant at p < .001.

108

Table G-5

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Cardinality task in Phase 1

Task         % Correct   Item-Total r   Loading   Retained
Dots
  Item 34b       67           .61          .96       Yes
  Item 35b       81           .56          .97       Yes
  Item 36b       41           .62          .86       Yes
  Item 37b       62           .71          .95       Yes
  Item 38b       39           .57          .84       Yes
Pictures
  Item 39b       74           .68          .97       No
  Item 40b       81           .52          .90       No
  Item 41b       40           .64          .93       No
  Item 42b       61           .64          .89       No
  Item 43b       37           .61          .85       No
Objects
  Item 44b       73           .61          .89       No
  Item 45b       77           .59          .92       No
  Item 46b       44           .62          .85       No
  Item 47b       55           .67          .88       No
  Item 48b       42           .61          .91       No

Note. N = 103. The item numbers in this table refer to the Counting item numbers from Appendix D.

All factor loadings were significant at p < .001.

109

Table G-6

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Resultative Counting task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
  Item 49       57           .54          .85       Yes
  Item 50       67           .53          .86       Yes
  Item 51       12           .45          .87       Yes
  Item 52       33           .57          .85       Yes
  Item 53       11           .37          .72       Yes

Note. N = 103. The item numbers in this table refer to the Counting item numbers from Appendix D.

All factor loadings were significant at p < .001.

110

Table G-7

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Counting Subsets task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Counting out a Subset
  Item 54       57           .61          .86       Yes
  Item 55       67           .58          .86       Yes
  Item 56       12           .51          .89       Yes
  Item 57       33           .62          .93       Yes
  Item 58       11           .41          .81       Yes
Counting by Category
  Item 59       77           .58          .81       Yes
  Item 60       83           .38          .71       Yes
  Item 61       17           .32          .75       Yes
  Item 62       52           .44          .69       Yes
  Item 63       14           .27          .65       Yes

Note. N = 103. The item numbers in this table refer to the Counting item numbers from Appendix D.

All factor loadings were significant at p < .001.

111

Table G-8

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Subitizing tasks in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Dots
  Item 64       76           .37          .61       No
  Item 65       76           .48          .93       No
  Item 66       34           .18          .29*      No
  Item 67       87           .37          .85       No
  Item 68       38           .35          .65       No
  Item 69       11           .19          .32*      No
  Item 70       14           .16          .34*      No
Pictures
  Item 71       74           .40          .67       Yes
  Item 72       80           .47          .98       Yes
  Item 73       40           .50          .78       Yes
  Item 74       89           .39          .85       Yes
  Item 75       47           .35          .53       Yes
  Item 76       12           .33          .50       Yes
  Item 77       17           .12          .20*      No

Note. N = 103. The item numbers in this table refer to the Counting item numbers from Appendix D.

* Indicates non-significant factor loading at p > .05. All other factor loadings were significant at p < .001.

112

Table G-9

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Estimation tasks in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Dots
  Item 78       41           .20          .45*      Yes
  Item 79       15           .24          .53       Yes
  Item 80        7           .10          .32*      No
  Item 81       13           .32          .91       Yes
Pictures
  Item 82       34           .14          .38*      No
  Item 83       16           .26          .97*      No
  Item 84        3          -.14         -.04*      No
  Item 85       14           .16          .43*      No

Note. N = 103. The item numbers in this table refer to the Counting item numbers from Appendix D.

* Indicates non-significant factor loading at p > .05. All other factor loadings were significant at p < .001.

113

Table G-10

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Ordinality task in Phase 1

Task       % Correct   Item-Total r   Loading   Retained
  Item 1       81           .15          .32       Yes
  Item 2       16           .37          .69       Yes
  Item 3       48           .38          .66       Yes
  Item 4       17           .21          .46       Yes
  Item 5       18           .42          .73       Yes
  Item 6        8           .38          .77       Yes
  Item 7       15           .23          .52       Yes
  Item 8       19           .32          .62       Yes

Note. N = 103. The item numbers in this table refer to the Numerical Relations item numbers from Appendix E.

All factor loadings were significant at p < .001.

114

Table G-11

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Relative Size task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
  Item 9        49           .23          .36       Revised
  Item 10       50           .30          .74       Revised
  Item 11       63          -.03         -.01*      Revised
  Item 12       38           .22          .85       Revised
  Item 13       74           .21          .27       Revised
  Item 14       68           .21          .89       Revised
  Item 15       53           .31          .73       Revised
  Item 16       16           .25          .77       Revised

Note. N = 103. The item numbers in this table refer to the Numerical Relations item numbers from Appendix E.

* Indicates non-significant factor loading at p > .05. All other factor loadings were significant at p < .001.

115

Table G-12

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Number Comparison task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Visual
  Item 17       61           .29          .37       Revised
  Item 18       53           .16          .35       Revised
  Item 19       51           .19          .45       Revised
  Item 20       33           .33          .53       Yes
  Item 21       48           .23          .01*      Revised
  Item 22       54           .16          .02*      Revised
  Item 23       45           .22          .58       Revised
  Item 24       24           .20          .33       Yes
Verbal
  Item 25       65           .21         -.07*      Revised
  Item 26       51           .11          .63       Revised
  Item 27       71           .17         -.03*      Revised
  Item 28       25           .07          .03*      Yes
  Item 29       31           .31          .60       Revised
  Item 30       82           .07         -.37       Revised
  Item 31       65           .12         -.24*      Revised
  Item 32       14           .23          .72       Yes

Note. N = 103. The item numbers in this table refer to the Numerical Relations item numbers from Appendix E.

* Indicates non-significant factor loading at p > .05. All other factor loadings were significant at p < .001.

116

Table G-13

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Set Comparison task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Dots
  Item 33       76           .10          .87       Revised
  Item 34       72           .25          .51       Revised
  Item 35       72           .30          .65       Revised
  Item 36       42           .17          .51       Yes
  Item 37       69           .11         -.17*      Revised
  Item 38       40           .12         -.24*      Revised
  Item 39       36           .08          .04*      Revised
  Item 40       36           .29          .20*      Yes
Dogs
  Item 41       76           .12          .70       Revised
  Item 42       68           .36          .53       Revised
  Item 43       72           .22          .73       Revised
  Item 44       45           .25          .43       Yes
  Item 45       63           .17         -.11*      Revised
  Item 46       52           .05         -.32       Revised
  Item 47       31           .01          .23*      Revised
  Item 48       35           .22          .05*      Yes

Note. N = 103. The item numbers in this table refer to the Numerical Relations item numbers from Appendix E.

* Indicates non-significant factor loading at p > .05. All other factor loadings were significant at p < .001.

117

Table G-14

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Number Order task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
  Item 49       47           .62          .83       Yes
  Item 50       45           .67          .88       Yes
  Item 51       35           .73          .91       Yes
  Item 52       33           .76          .95       Yes
  Item 53       24           .56          .81       Yes
  Item 54       26           .73          .95       Yes
  Item 55        7           .50          .88       Yes
  Item 56       15           .51          .83       Yes

Note. N = 103. The item numbers in this table refer to the Numerical Relations item numbers from Appendix E.

All factor loadings were significant at p < .001.

118

Table G-15

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Sequencing tasks in Phase 1

Task         % Correct   Item-Total r   Loading   Retained
Initial Sequencing
  Item 57        28           .37          .80       Yes
  Item 58        15           .41          .79       Yes
  Item 59         8           .62         1.00       Yes
  Item 60         3           .38          .88       Yes
Insertion Task
  Item 57b       50           .14          .24*      No
  Item 58b       50           .25          .43       No
  Item 59b       20           .18          .56       No
  Item 60b        9           .32         1.09       No

Note. N = 103. The item numbers in this table refer to the Numerical Relations item numbers from Appendix E.

* Indicates non-significant factor loading at p > .05. All other factor loadings were significant at p < .001.

119

Table G-16

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for Set Reproduction tasks in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Blocks
  Item 65       74           .69          .93       Yes
  Item 66       89           .47          .85       Yes
  Item 67       77           .70          .94       Yes
  Item 68       50           .53          .86       Yes
Pictures
  Item 69       39           .24          .48       Yes
  Item 70       44           .25          .47       Yes
  Item 71       50           .46          .88       Yes
  Item 72       22           .17          .50       Yes

Note. N = 103. The item numbers in this table refer to the Numerical Relations item numbers from Appendix E.

All factor loadings were significant at p < .001.

120

Table G-17

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Number Identification task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
  Item 73       66           .58          .86       Yes
  Item 74       60           .64          .87       Yes
  Item 75       64           .60          .82       Yes
  Item 76       60           .70          .93       Yes
  Item 77       60           .72          .93       Yes
  Item 78       44           .71          .87       Yes
  Item 79       48           .65          .81       Yes
  Item 80       49           .76          .94       Yes
  Item 81       33           .70          .88       Yes
  Item 82       31           .66          .85       Yes
  Item 83       27           .71          .93       Yes
  Item 84       19           .54          .85       Yes
  Item 85       18           .58          .90       Yes
  Item 86       26           .64          .86       Yes
  Item 87       16           .64         1.02       Yes

Note. N = 103. The item numbers in this table refer to the Numerical Relations item numbers from Appendix E.

All factor loadings were significant at p < .001.

121

Table G-18

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Numerals task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Set
  Item 88       46           .42          .774      Yes
  Item 89       44           .40          .841      Yes
  Item 90       39           .40          .833      Yes
  Item 91       17           .08          .328      No
Number
  Item 92       29           .49          .795      Yes
  Item 93       48           .50          .901      Yes
  Item 94       33           .31          .717      Yes
  Item 95       15           .31          .741      Yes

Note. N = 103. The item numbers in this table refer to the Numerical Relations item numbers from Appendix E.

All factor loadings were significant at p < .001.

122

Table G-19

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Addition and Subtraction with objects task in Phase 1

Task       % Correct   Item-Total r   Loading   Retained
  Item 1       86           .56          .95       Yes
  Item 2       77           .56          .83       Yes
  Item 3       70           .45          .68       Yes
  Item 4       54           .58          .81       Yes
  Item 5       73           .55          .79       Yes
  Item 6       66           .49          .72       Yes
  Item 7       65           .59          .82       Yes
  Item 8       48           .48          .68       Yes

Note. N = 100. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix F.

All factor loadings were significant at p < .001.

123

Table G-20

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Addition and Subtraction without objects task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
  Item 9        56           .35          .69       Yes
  Item 10       51           .28          .40       Yes
  Item 11       23           .33          .90       Yes
  Item 12       21           .24          .73       Yes
  Item 13       51           .38          .72       Yes
  Item 14       55           .38          .74       Yes
  Item 15       43           .35          .66       Yes
  Item 16       24           .36          .89       Yes

Note. N = 100. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix F.

All factor loadings were significant at p < .001.

124

Table G-21

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Initial Equivalence and Two-set Addition/Subtraction tasks in Phase 1

Task         % Correct   Item-Total r   Loading   Retained
Initial Equivalence
  Item 17        34           .07          .73       Yes
  Item 18        51           .27          .52       No
  Item 19        52           .35          .69       Yes
  Item 20        64           .37          .75       No
  Item 21        34           .03          .56       Yes
  Item 22        45           .14          .08*      No
  Item 23        55           .39          .69       Yes
  Item 24        54           .29          .44       No
Two-set Addition/Subtraction
  Item 17b       65           .38          .71       Yes
  Item 18b       32           .08          .00*      No
  Item 19b       67           .41          .81       Yes
  Item 20b       37           .19          .28*      No
  Item 21b       56           .32          .64       Yes
  Item 22b       18          -.04         -.27*      No
  Item 23b       51           .33          .75       Yes
  Item 24b       28           .08          .14*      No

Note. N = 100. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix F.

* Indicates non-significant factor loading at p > .05. All other factor loadings were significant at p < .001.

125

Table G-22

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Equivalent Sets task in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
  Item 25       88           .24          .51       Yes
  Item 26       74           .39          .66       Yes
  Item 27       52           .54          .76       Yes
  Item 28       54           .48          .75       Yes
  Item 29       27           .54          .91       Yes
  Item 30       21           .49          .88       Yes

Note. N = 100. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix F.

All factor loadings were significant at p < .001.

126

Table G-23

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Number Composition and Decomposition tasks in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Composition
  Item 31       12           .37          .59       Yes
  Item 32       14           .43          .72       Yes
  Item 33       16           .61         1.20       Yes
  Item 34        8           .08          .16*      Yes
Decomposition
  Item 35       54           .31          .49       Yes
  Item 36       76           .44          .80       Yes
  Item 37       70           .48          .89       Yes
  Item 38       68           .36          .56       Yes

Note. N = 100. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix F.

* Indicates non-significant factor loading at p > .05. All other factor loadings were significant at p < .001.

127

Table G-24

Percent correct, item-total correlations, factor loadings and whether or not the item was retained on the revised measure for the Number Combinations tasks in Phase 1

Task        % Correct   Item-Total r   Loading   Retained
Visual and Verbal
  Item 39       37           .52          .73       Yes
  Item 40       29           .60          .82       Yes
  Item 41       16           .37          .75       Yes
  Item 42       17           .37          .83       Yes
Visual
  Item 43       42           .59          .90       No
  Item 44       37           .54          .81       No
  Item 45       24           .45          .78       No
  Item 46       20           .17          .53       No
Verbal
  Item 47       19           .52          .83       No
  Item 48       21           .49          .75       No
  Item 49       20           .39          .69       No
  Item 50       16           .30          .73       No

Note. N = 100. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix F.

All factor loadings were significant at p < .001.

128

APPENDIX H

PHASE 2 COUNTING SUBTEST

129

Phase 2 Counting Subtest Spring 2009

Item Number   Description
Rote Counting
  Item 1   Highest number counted to without an error

Counting forward
  Item 2   Count forward from 4 with assistance
  Item 3   Count forward from 6 with assistance
  Item 4   Count forward from 11 with assistance
  Item 5   Count forward from 16 with assistance

Counting Backward
  Item 6   Count backward from 3 with assistance
  Item 7   Count backward from 6 with assistance
  Item 8   Count backward from 9 with assistance
  Item 9   Count backward from 14 with assistance

Counting error identification
  Item 10   Reversed two numbers
  Item 11   Correct counting
  Item 12   Skipped counting a dot
  Item 13   Double counted a dot
  Item 14   Incorrect cardinality
  Item 15   Double counted a number
  Item 16   Skipped counting a number
  Item 17   Correct counting

Structured counting
  Item 18   Count 4 dots
  Item 19   Count 3 dots
  Item 20   Count 16 dots
  Item 21   Count 8 dots
  Item 22   Count 6 dots
  Item 23   Count 20 dots
  Item 24   Count 11 dots
  Item 25   Count 14 dots

130

Item Number   Description
Cardinality
  Item 18b   4 dots
  Item 19b   3 dots
  Item 20b   16 dots
  Item 21b   8 dots
  Item 22b   6 dots
  Item 23b   20 dots
  Item 24b   11 dots
  Item 25b   14 dots

Resultative Counting
  Item 26   Count 4
  Item 27   Count 3
  Item 28   Count 16
  Item 29   Count 8
  Item 30   Count 20

Counting a subset
  Item 31   Count out 4 blocks from a set of 10
  Item 32   Count out 3 blocks from a set of 10
  Item 33   Count out 16 blocks from a set of 20
  Item 34   Count out 8 blocks from a set of 20
  Item 35   Count out 20 blocks from a set of 25

Count by category
  Item 36   Count 4 cars
  Item 37   Count 3 dogs
  Item 38   Count 16 cars
  Item 39   Count 8 dogs
  Item 40   Count 20 cars

Subitizing
  Item 41   Subitize 3 pictures
  Item 42   Subitize 2 pictures
  Item 43   Subitize 5 pictures
  Item 44   Subitize 1 picture
  Item 45   Subitize 4 pictures
  Item 46   Subitize 7 pictures
  Item 47   Subitize 6 pictures

131

Item Number   Description
Estimation
  Item 48   Estimate 10 dots
  Item 49   Estimate 20 dots
  Item 50   Estimate 100 dots
  Item 51   Identify which set is 10
  Item 52   Identify which set is 20
  Item 53   Identify which set is 50
  Item 54   Identify which set is 100

132

APPENDIX I

PHASE 2 NUMERICAL RELATIONS SUBTEST

133

Phase 2 Numerical Relations Subtest Spring 2009

Item Number   Description
Ordinality
  Item 1   3rd of 5
  Item 2   4th of 5
  Item 3   Last of 5
  Item 4   1st of 5
  Item 5   2nd of 10
  Item 6   8th of 10
  Item 7   10th of 10
  Item 8   7th of 10

Relative size (visual)
  Item 9    Closest to 2
  Item 10   Closest to 5
  Item 11   Closest to 7
  Item 12   Closest to 10
Relative size (verbal)
  Item 13   Closest to 3
  Item 14   Closest to 6
  Item 15   Closest to 8
  Item 16   Closest to 11

Number comparison (visual)
  Item 17   Most – 1, 4, 3, 2
  Item 18   Most – 3, 1, 2, 0
  Item 19   Most – 5, 3, 8, 1
  Item 20   Most – 12, 7, 14, 8
  Item 21   Least – 3, 6, 7, 5
  Item 22   Least – 1, 3, 5, 9
  Item 23   Least – 3, 6, 2, 8
  Item 24   Least – 9, 7, 6, 12
Number comparison (verbal)
  Item 25   Most – 3, 5, 1, 2
  Item 26   Most – 6, 3, 2, 4
  Item 27   Most – 5, 2, 7, 1
  Item 28   Most – 11, 6, 13, 7
  Item 29   Least – 1, 4, 9, 6
  Item 30   Least – 4, 3, 10, 13
  Item 31   Least – 5, 7, 4, 12
  Item 32   Least – 8, 6, 5, 11

134

Item Number   Description
Set comparison
  Item 33   Most – 1, 4, 3, 2
  Item 34   Most – 3, 1, 2, 0
  Item 35   Most – 5, 3, 8, 1
  Item 36   Most – 12, 7, 14, 8
  Item 37   Most – 3, 7, 10, 1
  Item 38   Least – 3, 6, 7, 5
  Item 39   Least – 9, 2, 11, 7
  Item 40   Least – 1, 3, 5, 9
  Item 41   Least – 3, 6, 2, 8
  Item 42   Least – 9, 7, 6, 12

Number order
  Item 43   Number after 2
  Item 44   Number before 2
  Item 45   Number after 5
  Item 46   Number before 5
  Item 47   Number after 9
  Item 48   Number before 9
  Item 49   Number after 15
  Item 50   Number before 15

Sequencing
  Item 51   Sequence sets of 1, 5, 9
  Item 52   Sequence sets of 1, 3, 5
  Item 53   Sequence sets of 2, 3, 4
  Item 54   Sequence sets of 1, 2, 3, 4
  Item 55   Sequence sets of 2, 4, 5, 6
  Item 56   Sequence sets of 2, 3, 4, 5, 7

Set Reproduction
  Item 57   Match 5 blocks to 5 dots
  Item 58   Match 3 blocks to 3 dots
  Item 59   Match 8 blocks to 8 dots
  Item 60   Match 18 blocks to 18 dots
  Item 61   Match a set of 5 cars to a set of 5 planes
  Item 62   Match a set of 3 cars to a set of 3 planes
  Item 63   Match a set of 8 cars to a set of 8 planes
  Item 64   Match a set of 18 cars to a set of 18 planes

135

Item Number   Description
Number identification
  Item 65   Identify the number 1
  Item 66   Identify the number 2
  Item 67   Identify the number 3
  Item 68   Identify the number 4
  Item 69   Identify the number 5
  Item 70   Identify the number 6
  Item 71   Identify the number 7
  Item 72   Identify the number 8
  Item 73   Identify the number 9
  Item 74   Identify the number 10
  Item 75   Identify the number 11
  Item 76   Identify the number 12
  Item 77   Identify the number 13
  Item 78   Identify the number 14
  Item 79   Identify the number 15

Numerals
  Item 80   Match sets of dots to the number 5
  Item 81   Match sets of dots to the number 3
  Item 82   Match sets of dots to the number 1
  Item 83   Match sets of dots to the number 12
  Item 84   Match sets of dots to the number 8
  Item 85   Match sets of dots to the number 7
  Item 86   Match numbers to a set of 5 dots
  Item 87   Match numbers to a set of 3 dots
  Item 88   Match numbers to a set of 8 dots
  Item 89   Match numbers to a set of 14 dots
  Item 90   Match numbers to a set of 8 dots
  Item 91   Match numbers to a set of 7 dots

136

APPENDIX J

PHASE 2 ARITHMETIC REASONING SUBTEST

137

Phase 2 Arithmetic Reasoning Subtest Spring 2009

Item Number   Description
Addition/subtraction with objects
  Item 1    0+2=2
  Item 2    1+1=2
  Item 3    1+2=3
  Item 4    2+2=4
  Item 5    2+3=5
  Item 6    3+3=6
  Item 7    2-1=1
  Item 8    3-2=1
  Item 9    3-1=2
  Item 10   4-1=3
  Item 11   5-2=3
  Item 12   5-4=1

Addition/subtraction without objects
  Item 13   1+0=1
  Item 14   0+2=2
  Item 15   1+1=2
  Item 16   1+2=3
  Item 17   2+2=4
  Item 18   1-1=0
  Item 19   2-1=1
  Item 20   3-2=1
  Item 21   3-1=2
  Item 22   4-1=3

Initial Equivalence
  Item 23   5 discs in one box, 4 in the other box
  Item 24   6 discs in one box, 4 in the other box
  Item 25   4 discs in each box
  Item 26   5 discs in one box, 4 in the other box
  Item 27   4 discs in one box, 6 in the other box
  Item 28   4 discs in each box

Two-set addition/subtraction
  Item 23b   Add 1 disc to the box with 1 fewer disc
  Item 24b   Add 1 disc to the box with 2 fewer discs
  Item 25b   Add 2 discs to one of the boxes
  Item 26b   Subtract 1 disc from the box with 1 more disc
  Item 27b   Subtract 1 disc from the box with 1 more disc
  Item 28b   Subtract 2 discs from one of the boxes

138

Item Number   Description
Equivalent sets
  Item 29   Divide 4 blocks into 2 sets
  Item 30   Divide 6 blocks into 3 sets
  Item 31   Divide 12 blocks into 4 sets
  Item 32   Divide 10 blocks into 2 sets
  Item 33   Divide 9 blocks into 2 sets with 1 remainder
  Item 34   Divide 13 blocks into 4 sets with 1 remainder

Number composition/decomposition
  Item 35   Add 2 discs
  Item 36   Add 1 disc
  Item 37   Add 1 disc
  Item 38   Add 3 discs
  Item 39   Take away 2 discs
  Item 40   Take away 1 disc
  Item 41   Take away 1 disc
  Item 42   Take away 2 discs
  Item 43   Take away 4 discs
  Item 44   Take away 2 discs

Number combinations
  Item 45   1+0=1 presented visually and verbally
  Item 46   0+2=2 presented visually and verbally
  Item 47   1+1=2 presented visually and verbally
  Item 48   1+2=3 presented visually and verbally
  Item 49   2+2=4 presented visually and verbally
  Item 50   1+3=4 presented visually and verbally

139

APPENDIX K

PHASE 2 ITEM PSYCHOMETRIC INFORMATION

140

Table K-1

Percent correct, difficulty parameters, and whether or not the item was retained on the final measure for the Rote Counting task in Phase 2

Task           % Correct   Difficulty   Retained
Count to 5         95         -1.64        Yes
Count to 10        85         -1.04        Yes
Count to 15        58          -.21        Yes
Count to 20        46           .09        Yes
Count to 25        41           .24        Yes
Count to 30        30           .51        No
Count to 35        27           .63        No
Count to 40        17           .94        Yes
Count to 45        15          1.03        No
Count to 50        12          1.18        No
Count to 55        12          1.19        No
Count to 60        11          1.24        No
Count to 65        10          1.26        No
Count to 70        10          1.27        No
Count to 75        10          1.27        No
Count to 80         9          1.36        No
Count to 85         9          1.36        No
Count to 90         8          1.41        No
Count to 95         8          1.41        No
Count to 100        7          1.51        Yes

Note. N = 393.

All factor loadings were significant at p < .001.

141

Table K-2

Percent correct, percent partial correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Counting Forward and Backward tasks in Phase 2

Task       % Correct   % Partial   Item-Total r   Loading   Discrimination   Difficulty 1   Difficulty 2   Retained
Count Forward
  Item 2       73          3           .76           .97          3.68            -.71           -.62          No
  Item 3       72          3           .78           .99          5.91            -.70           -.59          Yes
  Item 4       50         14           .75           .91          2.24            -.38            .00          Yes
  Item 5       46         14           .74           .93          2.45            -.28            .10          No
Count Backward
  Item 6       61          3           .60           .93          2.49            -.37           -.29          Yes
  Item 7       42          5           .74           .96          3.34             .07            .21          Yes
  Item 8       26          4           .66           .94          2.71             .57            .69          Yes
  Item 9        5          2           .34           .84          1.54            1.77           1.92          Yes

Note. N = 393. The item numbers in this table refer to the Counting item numbers from Appendix H.

All factor loadings were significant at p < .001.
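The discrimination and difficulty parameters reported in the Phase 2 tables come from an item response theory analysis. As an illustrative sketch only (the exact model and scaling used in the original analyses are not specified here, and the parameter values below are merely typical of the tables), a two-parameter logistic (2PL) model gives the probability that a child at ability level theta answers an item correctly:

```python
import math

def p_correct(theta, discrimination, difficulty):
    """2PL item characteristic curve: P(correct | theta).

    The probability is .5 when ability equals the item's difficulty,
    and the discrimination controls how steeply it rises around that point.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

# An easy, highly discriminating item (values chosen for illustration)
print(round(p_correct(0.0, 2.4, -1.26), 2))  # 0.95 for a child of average ability
```

Items with two difficulty parameters (e.g., Tables K-2 and K-3) were scored with a partial-credit category, so each threshold between adjacent score categories has its own difficulty.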

142

Table K-3

Percent correct, percent partial correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Counting Error Identification task in Phase 2

Task        % Correct   % Partial   Item-Total r   Loading   Discrimination   Difficulty 1   Difficulty 2   Retained
  Item 10       37         28           .76           .88          1.86            -.41            .36          Yes
  Item 11       92         --           .15           .33           .34           -4.41             --          Yes
  Item 12       42         14           .72           .90          2.10            -.14            .21          No
  Item 13       43         13           .74           .92          2.38            -.16            .18          Yes
  Item 14       27         10           .50           .67           .89             .45            .86          Yes
  Item 15       50         11           .84           .99          6.40            -.27           -.01          Yes
  Item 16       39         16           .79           .93          2.59            -.14            .29          No
  Item 17       91         --           .05           .11*          .11          -12.81             --          No

Note. N = 393. The item numbers in this table refer to the Counting item numbers from Appendix H.

* Indicates a non-significant factor loading at p > .05. All other factor loadings were significant at p < .001.

143

Table K-4

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Structured Counting task in Phase 2

Task        % Correct   Item-Total r   Loading   Discrimination   Difficulty   Retained
  Item 18       89           .47          .89          1.91           -1.26        No
  Item 19       90           .49          .92          2.41           -1.26        Yes
  Item 20       44           .69          .99          6.50             .11        Yes
  Item 21       71           .61          .86          1.70            -.63        No
  Item 22       75           .59          .87          1.75            -.74        Yes
  Item 23       41           .63          .96          3.29             .19        No
  Item 24       62           .63          .85          1.64            -.36        Yes
  Item 25       54           .68          .92          2.27            -.12        Yes

Note. N = 393. The item numbers in this table refer to the Counting item numbers from Appendix H.

All factor loadings were significant at p < .001.

144

Table K-5

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Cardinality task in Phase 2

Task         % Correct   Item-Total r   Loading   Discrimination   Difficulty   Retained
  Item 18b       80           .66          .97          3.72            -.77        No
  Item 19b       83           .64          .97          3.97            -.86        Yes
  Item 20b       60           .72          .96          3.60            -.30        Yes
  Item 21b       72           .73          .97          3.66            -.54        No
  Item 22b       72           .76          .97          3.98            -.53        Yes
  Item 23b       56           .68          .94          2.64            -.22        No
  Item 24b       69           .75          .96          3.34            -.48        No
  Item 25b       60           .69          .93          2.51            -.29        No

Note. N = 393. The item numbers in this table refer to the Counting item numbers from Appendix H.

All factor loadings were significant at p < .001.

145

Table K-6

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Resultative Counting task in Phase 2

Task        % Correct   Item-Total r   Loading   Discrimination   Difficulty   Retained
  Item 26       72           .47          .80          1.34            -.74        No
  Item 27       74           .49          .88          1.90            -.73        Yes
  Item 28       24           .49          .85          1.62             .81        Yes
  Item 29       38           .47          .70           .98             .40        Yes
  Item 30       21           .43          .74          1.09            1.08        Yes

Note. N = 393. The item numbers in this table refer to the Counting item numbers from Appendix H.

All factor loadings were significant at p < .001.

146

Table K-7

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Counting a Subset tasks in Phase 2

Task        % Correct   Item-Total r   Loading   Discrimination   Difficulty   Retained
Counting out a Subset
  Item 31       66           .67          .95          3.11            -.42        Yes
  Item 32       74           .62          .95          2.95            -.69        Yes
  Item 33       27           .58          .88          1.82             .70        Yes
  Item 34       55           .63          .82          1.44            -.14        Yes
  Item 35       27           .57          .86          1.65             .72        No
Counting by Category
  Item 36       88           .45          .80          1.35           -1.51        No
  Item 37       89           .43          .82          1.40           -1.53        Yes
  Item 38       36           .47          .75          1.12             .49        Yes
  Item 39       65           .49          .68           .93            -.59        Yes
  Item 40       25           .41          .72          1.04             .94        Yes

Note. N = 393. The item numbers in this table refer to the Counting item numbers from Appendix H.

All factor loadings were significant at p < .001.


Table K-8

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Subitizing task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 41          73                    .47                    .82               1.43             -.78         Yes
Item 42          86                    .44                    .86               1.65            -1.22         Yes
Item 43          40                    .36                    .56                .67              .48         Yes
Item 44          92                    .44                    .88               1.81            -1.59         Yes
Item 45          59                    .47                    .72               1.04             -.30         Yes
Item 46          16                    .31                    .60                .75             1.69         Yes
Item 47          21                    .39                    .67                .91             1.20         Yes

Note. N = 393. The item numbers in this table refer to the Counting item numbers from Appendix H.

All factor loadings were significant at p < .001.


Table K-9

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Estimation task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 48          46                    .34                    .63                .81              .23         Yes
Item 49          27                    .38                    .79               1.27              .72         Yes
Item 50          27                    .28                    .76               1.15              .81         No
Item 51          38                    .29                    .45                .51              .72         No
Item 52          21                    .19                    .32                .34             2.71         Yes
Item 53          28                    .25                    .37                .40             1.69         Yes
Item 54          57                    .36                    .69                .96             -.24         Yes

Note. N = 393. The item numbers in this table refer to the Counting item numbers from Appendix H.

All factor loadings were significant at p < .001.


Table K-10

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Ordinality task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 1           37                    .61                    .85               1.59              .35         No
Item 2           40                    .63                    .90               2.09              .25         Yes
Item 3           58                    .60                    .87               1.74             -.20         Yes
Item 4           62                    .58                    .86               1.72             -.28         No
Item 5           36                    .61                    .89               1.95              .40         Yes
Item 6           19                    .51                    .83               1.51              .93         Yes
Item 7           38                    .50                    .76               1.15              .39         No
Item 8           24                    .47                    .73               1.06              .89         No

Note. N = 393. The item numbers in this table refer to the Numerical Relation item numbers from Appendix I.

All factor loadings were significant at p < .001.


Table K-11

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Relative Size task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 9           55                    .43                    .70                .98             -.19         No
Item 10          53                    .42                    .73               1.06             -.13         Yes
Item 11          56                    .34                    .59                .74             -.22         No
Item 12          35                    .59                    .90               2.05              .38         Yes
Item 13          32                    .38                    .68                .93              .65         No
Item 14          31                    .41                    .77               1.19              .60         Yes
Item 15          37                    .49                    .82               1.44              .39         No
Item 16          29                    .40                    .72               1.02              .78         Yes

Note. N = 393. The item numbers in this table refer to the Numerical Relation item numbers from Appendix I.

All factor loadings were significant at p < .001.


Table K-12

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Number Comparison tasks in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 17          47                    .59                    .86               1.72             -.06         Yes
Item 18          48                    .48                    .77               1.19             -.05         No
Item 19          43                    .56                    .90               2.06              .10         Yes
Item 20          47                    .51                    .81               1.37             -.04         No
Item 21          47                    .54                    .86               1.66             -.01         No
Item 22          49                    .50                    .89               1.83             -.08         No
Item 23          37                    .37                    .75               1.15              .32         Yes
Item 24          30                    .40                    .69                .96              .60         Yes
Item 25          35                    .60                    .91               2.15              .20         No
Item 26          38                    .34                    .88               1.85              .25         No
Item 27          37                    .60                    .92               2.40              .20         Yes
Item 28          32                    .50                    .76               1.17              .45         No
Item 29          35                    .56                    .93               2.45              .24         No
Item 30          24                    .55                    .94               2.82              .48         Yes
Item 31          26                    .49                    .83               1.46              .46         No
Item 32          28                    .55                    .93               2.45              .45         No

Note. N = 393. The item numbers in this table refer to the Numerical Relation item numbers from Appendix I.

All factor loadings were significant at p < .001.


Table K-13

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Set Comparison tasks in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 33          67                    .57                    .90               2.01             -.54         Yes
Item 34          71                    .57                    .89               1.90             -.69         Yes
Item 35          81                    .51                    .88               1.88             -.95         Yes
Item 36          62                    .43                    .70                .99             -.40         No
Item 37          66                    .47                    .78               1.24             -.53         No
Item 38          56                    .58                    .88               1.82             -.31         Yes
Item 39          61                    .58                    .91               2.23             -.39         Yes
Item 40          59                    .54                    .86               1.66             -.35         No
Item 41          59                    .50                    .81               1.39             -.27         No
Item 42          53                    .44                    .66                .88             -.19         Yes

Note. N = 393. The item numbers in this table refer to the Numerical Relation item numbers from Appendix I.

All factor loadings were significant at p < .001.


Table K-14

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Number Order task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 43          63                    .64                    .88               1.90             -.32         Yes
Item 44          63                    .77                    .99               6.36             -.24         Yes
Item 45          57                    .75                    .97               4.08             -.15         Yes
Item 46          55                    .77                    .96               3.59             -.08         No
Item 47          40                    .68                    .91               2.15              .24         No
Item 48          45                    .77                    .97               3.69              .14         Yes
Item 49          26                    .60                    .95               2.99              .63         Yes
Item 50          36                    .64                    .91               2.25              .33         Yes

Note. N = 393. The item numbers in this table refer to the Numerical Relation item numbers from Appendix I.

All factor loadings were significant at p < .001.


Table K-15

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Sequencing task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 51          48                    .54                    .82               1.43              .08         Yes
Item 52          42                    .44                    .80               1.34              .27         No
Item 53          40                    .61                    .87               1.76              .30         Yes
Item 54          33                    .64                    .93               2.45              .47         Yes
Item 55          23                    .62                    .92               2.37              .79         No
Item 56          19                    .43                    .95               3.06              .90         Yes

Note. N = 393. The item numbers in this table refer to the Numerical Relation item numbers from Appendix I.

All factor loadings were significant at p < .001.


Table K-16

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Set Reproduction tasks in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Blocks
Item 57          83                    .56                    .84               1.57            -1.15         No
Item 58          88                    .50                    .89               1.99            -1.22         Yes
Item 59          75                    .64                    .94               2.78             -.73         Yes
Item 60          49                    .39                    .67                .91              .02         Yes
Pictures
Item 61          63                    .41                    .75               1.12             -.48         No
Item 62          62                    .46                    .84               1.54             -.37         Yes
Item 63          67                    .40                    .82               1.44             -.58         Yes
Item 64          19                    .14                    .42                .46             2.02         Yes

Note. N = 393. The item numbers in this table refer to the Numerical Relation item numbers from Appendix I.

All factor loadings were significant at p < .001.


Table K-17

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Number Identification task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 65          84                    .57                    .89               1.95            -1.09         Yes
Item 66          77                    .70                    .95               3.15             -.78         Yes
Item 67          80                    .64                    .92               2.27             -.92         Yes
Item 68          79                    .63                    .88               1.87             -.92         No
Item 69          80                    .64                    .91               2.21             -.93         No
Item 70          61                    .51                    .85               1.59             -.34         No
Item 71          64                    .74                    .91               2.21             -.39         Yes
Item 72          67                    .68                    .87               1.78             -.53         Yes
Item 73          53                    .55                    .91               2.24             -.08         No
Item 74          57                    .78                    .95               2.96             -.19         Yes
Item 75          50                    .77                    .94               2.84              .01         No
Item 76          33                    .63                    .89               1.95              .48         Yes
Item 77          34                    .66                    .91               2.19              .45         No
Item 78          45                    .72                    .93               2.46              .13         Yes
Item 79          37                    .49                    .93               2.47              .37         Yes

Note. N = 393. The item numbers in this table refer to the Numerical Relation item numbers from Appendix I.

All factor loadings were significant at p < .001.


Table K-18

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Numerals tasks in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 80          68                    .63                    .91               2.18             -.50         No
Item 81          69                    .44                    .92               2.35             -.56         Yes
Item 82          81                    .55                    .90               2.09             -.89         Yes
Item 83          68                    .48                    .80               1.35             -.58         No
Item 84          44                    .44                    .75               1.12              .23         No
Item 85          39                    .57                    .84               1.57              .20         Yes
Item 86          59                    .38                    .88               1.82             -.37         Yes
Item 87          70                    .71                   1.00               9.98             -.52         No
Item 88          51                    .54                    .90               2.08             -.14         Yes
Item 89          69                    .64                    .94               2.66             -.55         No
Item 90          45                    .50                    .86               1.69              .04         Yes
Item 91          37                    .52                    .83               1.47              .21         No

Note. N = 393. The item numbers in this table refer to the Numerical Relation item numbers from Appendix I.

All factor loadings were significant at p < .001.


Table K-19

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Addition and Subtraction with Objects task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 1           87                    .30                    .54                .64            -2.03         Yes
Item 2           82                    .41                    .73               1.08            -1.21         Yes
Item 3           78                    .53                    .89               1.97             -.83         Yes
Item 4           57                    .61                    .86               1.70             -.28         No
Item 5           52                    .52                    .80               1.34             -.17         Yes
Item 6           33                    .48                    .80               1.33              .42         Yes
Item 7           77                    .37                    .67                .91            -1.10         No
Item 8           69                    .49                    .74               1.11             -.61         No
Item 9           64                    .50                    .76               1.18             -.52         Yes
Item 10          56                    .51                    .86               1.68             -.28         Yes
Item 11          35                    .34                    .60                .75              .50         No
Item 12          45                    .27                    .52                .61              .19         Yes

Note. N = 393. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix J.

All factor loadings were significant at p < .001.


Table K-20

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Addition and Subtraction without Objects task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 13          55                    .22                    .41                .44             -.33         No
Item 14          66                    .39                    .71               1.00             -.59         Yes
Item 15          49                    .49                    .71               1.00              .00         Yes
Item 16          39                    .45                    .80               1.32              .28         No
Item 17          31                    .47                    .75               1.14              .54         Yes
Item 18          38                    .52                    .82               1.41              .27         Yes
Item 19          54                    .43                    .74               1.09             -.13         Yes
Item 20          56                    .41                    .67                .91             -.28         Yes
Item 21          45                    .40                    .67                .91              .06         No
Item 22          32                    .36                    .70                .99              .63         Yes

Note. N = 393. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix J.

All factor loadings were significant at p < .001.


Table K-21

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Initial Equivalence task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 23          56                    .16                    .39                .42             -.64         Yes
Item 24          70                    .26                    .63                .80             -.93         Yes
Item 25          43                    .23                    .72               1.03              .18         Yes
Item 26          50                    .19                    .47                .53             -.12         Yes
Item 27          64                    .29                    .72               1.05             -.39         Yes
Item 28          48                    .27                    .72               1.03              .02         Yes

Note. N = 393. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix J.

All factor loadings were significant at p < .001.


Table K-22

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Two-Set Addition task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 23b         39                    .33                    .74               1.09              .31         Yes
Item 24b         38                    .17                    .30                .32              .35         No
Item 25b         70                    .25                    .69                .96             -.75         Yes
Item 26b         34                    .24                    .67                .90              .45         Yes
Item 27b         32                    .18                    .48                .55              .73         Yes
Item 28b         65                    .26                    .56                .67             -.72         No

Note. N = 393. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix J.

All factor loadings were significant at p < .001.


Table K-23

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Equivalent Sets task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 29          86                    .39                    .77               1.21            -1.34         Yes
Item 30          76                    .50                    .87               1.76             -.73         Yes
Item 31          55                    .58                    .82               1.45             -.13         No
Item 32          54                    .62                    .88               1.85             -.13         Yes
Item 33          35                    .59                    .93               2.46              .34         Yes
Item 34          30                    .51                    .85               1.59              .57         Yes

Note. N = 393. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix J.

All factor loadings were significant at p < .001.


Table K-24

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Number Composition and Decomposition tasks in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Composition
Item 35          40                    .59                    .84               1.56              .18         No
Item 36          30                    .65                    .92               2.37              .44         Yes
Item 37          40                    .66                    .95               2.92              .26         Yes
Item 38          24                    .48                    .98               1.32              .73         Yes
Decomposition
Item 39          46                    .51                    .85               1.61              .15         Yes
Item 40          62                    .27                    .50                .57             -.54         Yes
Item 41          55                    .57                    .83               1.49             -.14         Yes
Item 42          58                    .43                    .70                .99             -.16         No
Item 43          23                    .39                    .76               1.18              .84         No
Item 44          33                    .41                    .67                .90              .67         No

Note. N = 393. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix J.

All factor loadings were significant at p < .001.


Table K-25

Percent correct, item-total correlations, factor loadings, discrimination parameters, difficulty parameters, and whether or not the item was retained on the final measure for the Number Combinations task in Phase 2

Item        Percent Correct    Item-Total Correlation    Factor Loading    Discrimination    Difficulty    Retained
Item 45          33                    .31                    .69                .94              .56         No
Item 46          32                    .38                    .78               1.26              .56         Yes
Item 47          30                    .58                    .88               1.86              .60         Yes
Item 48          28                    .57                    .91               2.15              .66         Yes
Item 49          22                    .59                    .95               2.99              .77         Yes
Item 50          18                    .57                    .91               2.16              .94         Yes

Note. N = 393. The item numbers in this table refer to the Arithmetic Reasoning item numbers from Appendix J.

All factor loadings were significant at p < .001.
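A simple internal consistency check on these tables: under a probit factor model with a standardized latent trait, an item's threshold is τ = difficulty × loading, and its expected sample-wide pass rate is Φ(−τ). The sketch below is illustrative (function names are mine, and tabled values are rounded, so reproduced percentages are approximate):

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def expected_percent_correct(difficulty, loading):
    """Expected percent correct implied by an item's difficulty and
    standardized factor loading: 100 * Phi(-difficulty * loading)."""
    return 100 * phi(-difficulty * loading)

# Item 49 above (difficulty .77, loading .95) implies roughly 23%
# correct, near the tabled 22.
print(round(expected_percent_correct(0.77, 0.95)))  # → 23
```

Small discrepancies between implied and tabled percent-correct values are expected, since both parameters are reported to two decimals.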



BIOGRAPHICAL SKETCH

David J. Purpura was born on April 15, 1982, in Bridgeport, Connecticut. He grew up in Westwood, Massachusetts, and is a die-hard Boston sports fan. He graduated in 2005 from Purdue University with a Bachelor's degree in Psychology. He is a Predoctoral Interdisciplinary Research Training Program fellow through the Florida Center for Reading Research and The Florida State University and is completing his Ph.D. in Clinical Psychology. His research interest is in the interrelation of children's early academic skills and dysregulated executive functioning. David is particularly interested in disentangling the overlap between early mathematics achievement and attention problems by further evaluating the constructs of these two domains. He is also interested in the application of advanced research methodology and statistics, such as Item Response Theory, Structural Equation Modeling, and Hierarchical Linear Modeling, to the broader fields of psychopathology and education.
