This may be the author’s version of a work that was submitted/accepted for publication in the following source:

English, Lyn & Watson, Jane M. (2018) Modelling with authentic data in sixth grade. ZDM - International Journal on Mathematics Education, 50(1-2), pp. 103-115.

This file was downloaded from: https://eprints.qut.edu.au/113835/

© Consult author(s) regarding copyright matters

This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to [email protected]

Notice: Please note that this document may not be the Version of Record (i.e. published version) of the work. Author manuscript versions (as Submitted for peer review or as Accepted for publication after peer review) can be identified by an absence of publisher branding and/or typeset appearance. If there is any doubt, please refer to the published source. https://doi.org/10.1007/s11858-017-0896-y

Title: Modelling with Authentic Data in Grade 6

First author: Lyn D. English
Faculty of Education
Queensland University of Technology
Victoria Park Road
Kelvin Grove
Brisbane, Queensland, 4059
Email: [email protected]
Phone: 617 31383329

Second author: Jane Watson
University of Tasmania
Faculty of Education
Email: [email protected]

Key words: modelling with data; data literacy; primary school; variation; uncertainty; informal inference; mathematisation; transnumeration


Modelling with Authentic Data in Sixth Grade

Abstract

This article explores 6th-grade students’ modelling with data in generating models for selecting an Australian team for the (then) forthcoming 2016 Olympics, using data on swimmers’ times at various previous events. We propose a modelling framework comprising four components: working in shared problem spaces between mathematics and statistics; interpreting and reinterpreting problem contexts and questions; interpreting, organising and operating on data in model construction; and drawing informal inferences. In studying students’ model generation, consideration is given to how they interpreted, organised, and operated on the problem data in constructing and documenting their models, and how they engaged in informal inferential reasoning. Students’ responses included applying mathematical and statistical operations and reasoning to selected variables, identifying how variation and trends in swimmers’ performances inform model construction, recognising limitations in using only one performance variable, and acknowledging uncertainty in model creation and model application due to chance variation.

1. Introduction and Background

Statistical literacy is increasingly important in today’s society where data inform nearly all aspects of our lives. An ability to deal intelligently with such data is essential for a fulfilling and productive life in the community, school, workplace, and family (Engel, 2017; Gal, 2005). The notion of statistical literacy has been controversial for some time, with various definitions proposed, incorporating different aspects of informal and formal knowledge bases, dispositions, and attitudes (e.g., Gal, 2004; Gould, 2017; Watson, 2006). There has, however, been limited attention paid to primary school students’ statistical literacy, especially with respect to how it might be developed in modelling with data. In this study, we consider statistical literacy to involve the ability to construct, interpret, reason, and communicate with data and data representations, make statistically sound decisions, and critically evaluate claims made in various contexts including the media.

By undertaking their own investigations and generating different conclusions, primary school students can learn to make critical decisions with data, where variation and uncertainty are ever present. Yet these school students often do not receive the appropriate or adequate experiences that set them on the road to this statistical literacy. Developing statistical literacy takes a long time and must begin in the earliest years of schooling (English, 2014; Lehrer & Schauble, 2002; Watson, 2006).

One approach to fostering this literacy with primary school students is through modelling with data. Although the term, modelling, has been used variously in the literature (e.g., Blum & Leiss, 2007; English, Arleback, & Mousoulides, 2016; Gravemeijer, 1999; Kaiser, 2007; Lesh & Doerr, 2003), the linking of modelling with the development of statistical literacy has been limited, especially in the primary school. As applied to the present study, modelling with data is a process of inquiry involving comprehensive statistical reasoning that draws upon mathematical and statistical concepts, contexts, and questions. The models produced should be supported by evidence and open to informal inferential thinking, with the latter including acknowledgement of uncertainty in model creations and applications arising from variation due to chance (Watson & English, 2015; Makar & Rubin, 2009).

The aim of this article is to examine 6th-grade students’ generation of models for selecting Australian swimming teams for the (then forthcoming) 2016 Olympic Games, using data on swimmers’ competing times at various previous events. In studying students’ model generation, consideration is given to how they interpreted, organised, and operated on the problem data in constructing their models, how they documented their models, and how they engaged in informal inferential reasoning, as previously defined.

The problem was the final component of an activity implemented at the end of a three-year longitudinal study designed to develop 4th – 6th grade students’ statistical literacy, with a focus on informal inferential reasoning (Makar & Rubin, 2009). In reporting on students’ responses to the problem, we consider the following research questions:

1. How did students interpret, organise, and operate on the problem data in constructing their models?

2. What was the nature of the informal inferences students drew from their models?

2. Modelling with Data

As used in this study, modelling with data differs from several other perspectives on modelling in that the focus is on developing both mathematical and statistical learning. As such, this modelling is an especially rich vehicle for developing foundational understandings that often are not addressed until the secondary school. An early introduction to modelling involving statistical learning is especially important in today’s society where studies of professional practice repeatedly highlight the complexity of the statistical world (e.g., Gould, 2017). Young learners need to be introduced to the rudiments of this complexity in ways that are pedagogically fruitful and tractable (Lehrer & English, in press; Lehrer & Romberg, 1996; Lehrer & Schauble, 2000). Modelling with data facilitates a balancing of this complexity and tractability but has been under-researched in the primary school years. The framework we address has four core components, namely, working in shared problem spaces (boundary interactions) between mathematics and statistics; interpreting and reinterpreting problem contexts and questions; interpreting, organising and operating on data in model construction; and drawing informal inferences.

2.1 Boundary Interactions in Modelling with Data

As noted, our approach to modelling with data links children’s mathematical and statistical learning. Specifically, features of the statistical problem-solving framework of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report (Franklin et al., 2007) are combined with aspects of research on the models and modelling approach (Lesh & Zawojewski, 2007; Doerr & English, 2003). A key feature of this combined perspective is the focus on both structure and variability (http://illuminations.nctm.org/Lesson.aspx?id=1189). As highlighted on the National Council of Teachers of Mathematics (NCTM) Illuminations website, “Statistical models extend mathematical models by describing variability around the structure” (p. 11). As such, our framework supports the importance of “productive boundary interactions” or shared problem spaces between the two communities of practice in mathematics and statistics (Groth, 2015, p. 5). Two neglected problem spaces are variability and context, both of which can have different meanings in mathematics and statistics. Modelling with data provides one avenue for incorporating aspects of both meanings, in particular, by drawing on horizontal and vertical mathematisation (Freudenthal, 1991) and transnumeration (Wild & Pfannkuch, 1999), as highlighted by Groth.

Although (horizontal) mathematisation has frequently been identified as a key process in working modelling problems (Niss, 2010), “vertical mathematisation” has received less attention. Vertical mathematisation is applied when students look beyond the specific problem context to consider the general mathematical structure/s generated, while transnumeration refers to the specific role that context plays throughout a statistical inquiry, where different approaches are sought to find and convey meaning in data. Modelling with data thus extends young students’ mathematical and statistical learning because it encourages them to search for meaning in the context housing the data, apply mathematical and statistical processes in problem solution, and recognise “the general structures” of the mathematics and statistics learned (Groth, 2015, p. 9). Such structures can be applied to the solution of new, related problems.

2.2 Interpreting and Reinterpreting Problematic Contexts and Questions

Problems involving modelling with data are inherently complex and are often ill-structured, with constraints that might not always be clearly defined. In other words, such problems within real-world contexts do not come “neatly packaged”; that is, they are neither organised nor “labeled for analysis” (Langman, Zawojewski, & Whitney, 2016, p. 71), in contrast to the usual tasks students meet in the primary years. Modelling with data thus presents challenges to young children that standard curricula often remove for them.

The importance of context should not be underestimated, yet context is often overlooked or given a secondary role (Pfannkuch, 2011). Data are “numbers within a context” (Franklin et al., 2007), with context furnishing the meaning of the data (Langrall, Nisbet, Mooney, & Jansem, 2011; Moore, 1990). Children need to appreciate this role of context while at the same time be able to abstract the data from the context (Konold & Higgins, 2003). As Moore (1990) emphasised, a problem involving data should engage students’ knowledge of context so that they can understand and interpret the data rather than just perform arithmetical procedures to solve the problem, which so often happens in the later grades when these early experiences are missing (Watson, 2006). The important role of transnumeration, as discussed previously, is underscored in interpreting and understanding data within a given context.

Interpreting the question to be investigated within a given context can also be problematic, especially when initial questions are often informal and broad and can take substantial interpretation and refining in both statistical inquiries and modelling (English, 2014; English, Watson, & Fitzallen, 2017; Whitin & Whitin, 2011). Given that data are generated within a context of inquiry (Moore, 1990), posing and refining questions need to be elevated to a higher status in primary curricula (Lavigne & Lajoie, 2007).

2.3 Interpreting, Organising and Operating on Data in Model Construction

As indicated previously, experiences in modelling with data present new opportunities and challenges for younger grades to link their learning in mathematics and statistics. These experiences require noticing, attending to, and operating on critical aspects of the modelling situation (Langman et al., 2016). Identifying and prioritising variables, considering “trade-offs” (where some variables might be discarded or assigned a lower status), recognising statistical features (e.g., distributions and variation), and applying both mathematical operations (e.g., calculating differences in times) and statistical operations (e.g., determining and plotting means; identifying trends) are not typically experienced in the primary school yet are core foundations for future learning. Sometimes, the variables being addressed are ill-defined or open to multiple interpretations, necessitating a negotiation of meanings and agreement on interpretations for the context at hand. Because modelling with data is designed for small group work, it enables negotiation and argumentation processes among group members that facilitate concept development and documentation of learning, as well as new perspectives on alternative approaches to model construction (English & Lesh, 2003; Zawojewski, 2010; Zawojewski, Lesh, & English, 2003). Such peer-mediated learning afforded by these modelling experiences needs emphasis in the younger grades.

As children organise and operate on the variables selected, they need to appreciate the ever-present variation in data, the underlying concept for all investigations involving data. Recognising inherent variability entails not simply perceiving the data in terms of single attributes (e.g., just noticing different colours). Rather, the different values of an attribute need to be considered, which Konold, Finzer, and Kreetong (2015) term a “case value” perspective. Moving beyond case values to an aggregate perspective subsequently enables students to view data in terms of a distribution of values as an object by attending to characteristics such as the shape and general trends in differences among case values (Ben-Zvi & Arcavi, 2001). Traditional instruction often treats measures of distribution as forms of computation, but research has shown how children can progress towards recognising statistics as ways of measuring characteristics of distribution (Lehrer & English, in press). Given that interpreting data through the lens of distribution (Bakker & Gravemeijer, 2004) is a cornerstone of modelling with data, more experiences are needed in the primary curriculum where students look for and identify data distributions.

2.4 Drawing Informal Inferences

Problems involving modelling with data do not end with model creation. Rather, drawing informal inferences (Makar & Rubin, 2009) from a generated model is foundational to these experiences. As highlighted in the NCTM Illuminations website, statisticians use models of data “to make inferences, including predictions, about the population under study.” (p. 11; http://illuminations.nctm.org/Lesson.aspx?id=1189). Inferences drawn about the question in context need to be made in light of the uncertainty that arises from chance variation (Lehrer, Kim, & Jones, 2011). Informal inferences are precursors to “formal” inferences, where a theoretical approach is usually the initial introduction older students receive. It is imperative that younger students engage in informal inference as a foundational component of statistical literacy, yet this understanding has not received the required attention (Makar, 2016).

Contemplating answers to questions that look beyond the given data and acknowledging the uncertainty in any generalisations formed reflect the vertical mathematisation discussed previously. Importantly, drawing such generalisations needs to take into consideration variation in the data; as noted, variation is the key to accepting a conclusion with some degree of certainty.

In sum, we argue that the proposed modelling with data approach provides rich opportunities not only for establishing underrepresented foundational statistical understandings in the primary grades, but also for linking the important transnumeration and mathematisation processes emphasised by Groth (2015).

3. Methodology

3.1 Participants

Eighty-nine Year 6 students (mean age 11 years, 10 months) in four classes at a government-run school in a middle-socioeconomic, Australian capital suburb participated. Most of the students had been involved in the study for the previous 2½ years. Only data from students whose parents had given written permission to participate in the study are reported. All students, however, took part in the activity as a regular component of their curriculum. Gender was evenly split with 45 girls and 44 boys. Forty-three percent of the students (38) were classified as having English as a second language (ESL). Pseudonyms are used throughout when referring to students.

3.2 Design

Incorporating both qualitative and quantitative aspects, the study adopted one form of design-based research (Cobb, Confrey, diSessa, Lehrer, & Schauble, 2003). This approach was considered the most appropriate for achieving our overall aims as it (a) enables intervention where new learning situations are created and refined through iterative cycles of design and analysis, together with a focus on developing students’ learning across time; (b) caters for complex classroom situations that contain many variables and real-world constraints; and (c) focuses on theory development and refinement that supports learning and informs future learning experiences.

The present investigation was the final of seven comprehensive multi-session activities implemented in a three-year longitudinal study across grades 4 to 6. The students’ previous experiences in the study, commencing in 4th grade, included comprehensive activities involving (a) problem posing through survey construction, (b) measurement involving different forms of variation, (c) probability experiments involving simulation, (d) investigations centred on the environment, and (e) investigations measuring reaction times.

Each activity was introduced to the teachers in a professional learning session including a detailed lesson plan printed in parallel with student workbooks. Teachers implemented the activities in their individual classrooms.

3.3 Activity Structure and Implementation

The present modelling problem (“Let the Selections Begin!”) was the third session in a set of activities investigating the performances of athletes, which were implemented across a whole school day. The first two sessions were concerned with students refining a general question about Olympic Games athletes, namely, “Are Athletes Getting Better over Time?” (Watson & English, in press). The question was raised by the teachers in introducing the activity. Students were to refine and subsequently investigate their question by exploring a range of data sets for particular Olympic events (e.g., running, long and high jump, and sprint). Having reached a conclusion on their question based on evidence from their data analysis, students were given additional information on technological developments that could contribute to the improvement of athletes’ performances. In light of this new information, students were asked to reconsider their conclusions and the degree of certainty associated with them.

Students were each provided with a workbook where they recorded their responses during each of the three sessions. Students’ learning from the first two sessions provided important groundwork for the present modelling problem, which was of approximately 1 hr 45 min duration and comprised whole class and small group work. Student groups (2-3 per class, hereafter referred to as “focus groups”) were selected in consultation with the teachers and were of mixed achievement levels. All focus groups and whole-class discussions were video- and audio-taped for subsequent analysis.

Following the previous sessions, in which students found that athletes are in general improving over time, students were asked, “What does this mean for Australia? Are our athletes improving over time as well?” Students quickly realised that the question was vague and needed refining. It was explained to the students that, to facilitate data management, their investigations would be restricted to Australia’s swimming teams’ chances of winning gold at the Rio 2016 Olympics swimming events. The broad question was then posed: Are Australian swimming teams improving over time? If so, is Australia likely to win gold in the pool at the 2016 Olympics?

Students quickly identified further questions that they needed to ask (e.g., “How many teams are we talking about? Are we referring to men's or women's teams? How can we answer, ‘likely to win?’”). The students were then to create and refine their own questions to explore data on women's or men's 100m freestyle events during the 2012-2014 period. Figure 1 presents an example of one of the data sets provided. Due to time constraints imposed by the timetable, students were presented with the data rather than sourcing the information themselves, even though the latter approach is more desirable. Data were presented to the students in both table and graphical formats, the latter using the software program TinkerPlots (Konold & Miller, 2011), with which the students were familiar from their previous activities.


| Swimmer | Current age | Personal best time (s) |
|---|---|---|
| Alexander Graham | 19 | 49.11 |
| Cameron McEvoy | 20 | 47.60 |
| James Magnussen | 23 | 47.10 |
| Kenneth To | 22 | 48.58 |
| Matt Targett | 28 | 48.58 |
| Matthew Abood | 28 | 48.23 |
| Ned McKendry | 22 | 49.13 |
| Samuel Young | 16 | 51.55 |
| Tommaso D’Orsogna | 23 | 48.72 |
| Vincent Dai | 16 | 50.94 |

| Event No. | Date | Competition | Times recorded (s), in swimmer column order, blank cells omitted |
|---|---|---|---|
| 1 | July 2012 | Olympic Games | 48.94, 47.22, 51.60 |
| 2 | April 2013 | Australia Swimming Championships | 49.15, 48.07, 47.53, 48.58, 48.58, 49.00, 49.78, 48.86 |
| 3 | June 2013 | Time Trials | 50.57, 48.84, 48.11, 49.25, 50.51, 50.91, 49.70 |
| 4 | July 2013 | FINA World Championships | 47.88, 47.71 |
| 5 | December 2013 | McDonald’s Qld Championships | 50.27, 49.33, 51.82, 51.58, 49.07 |
| 6 | Jan/Feb 2014 | NSW/Vic Open Championships | 48.28, 47.73, 49.29 |
| 7 | April 2014 | Australian Swimming Championships | 49.21, 47.65, 47.83, 49.00, 49.88, 52.58, 48.72 |
| 8 | June 2014 | Swimming Australia Grand Prix 3 | 48.45, 50.56, 51.76 |
| 9 | July 2014 | Commonwealth Games | 48.34, 48.11, 49.04 |
| 10 | August 2014 | Pan Pacs | 47.60, 47.68, 48.23, 49.13, 49.18 |

Source: www.swimming.org.au. Blank: Did not compete. *Best time across heat, semi final & final.

Figure 1. Men’s 100m Freestyle Results (inc. relay events) recorded by Australian Competitors (in seconds)*

The students were then presented with the problem task: You are to select Australia’s best 6 swimmers to compete in either the women’s or men’s 100m freestyle event at the Rio Olympics. Your selection should ensure Australia has the best chance of winning gold.

Students were to document all their working including their model creations in their workbooks. On creating their models, all student groups (including focus groups) were to complete a report in their workbooks to present before the class. The report was to comprise the question the group addressed, the data set chosen, the specific data that were analysed and documented, how the swimming team members were selected and why, and their certainty in their model selecting the best team for the 2016 Olympics as well as for selecting teams for other sports events. The students’ workbook responses were analysed and are reported in the Results section.

3.4 Data Analysis

Data sources from 46 student groups included: (a) group models and reports as documented in the students’ workbooks, (b) students’ annotations on the table of data provided (e.g., Fig. 1), (c) transcripts of all focus group discussions as students completed the activity, (d) transcripts of all whole class discussions, and (e) transcripts of the class presentations of all group models. Iterative refinement cycles for videotape analyses of conceptual change in students’ learning were undertaken by the first author in analysing the transcripts (Lesh & Lehrer, 2000). These analyses were undertaken in conjunction with the data from the students’ workbook responses and from their annotated tables of data (e.g., Fig. 1).

Content analysis (Patton, 2002) was used by both the first author and an experienced research assistant in identifying, coding, and categorising all the data recorded in the students’ workbooks. Inductive analysis was applied in initially identifying patterns and categories of responses. Several cycles of analyses followed, in which response categories were refined taking into consideration components of the theoretical framework. Specifically, the following components were analysed: how each of the student groups interpreted, organised, and operated on the data (including mathematical and statistical operations and reasoning), how they documented their models including TinkerPlots representations and written reports/explanations, and the informal inferences drawn (with respect to their certainty in team selections and their confidence in applying their model to other sports events). There was 90% coder agreement across the foregoing components analysed, with some lack of clarity when members within two of the groups offered conflicting statements on their confidence in team selection. Mutual agreement was reached through reanalysis.

4. Results

In reporting the results of the analyses of 46 student groups across the classes, we consider the research questions in turn.

4.1 Research Question 1: How did students interpret, organise, and operate on the problem data in constructing their models?

As seen in the data of Figure 1, variation occurs in the swimmers’ times including their personal bests (PBs), in the number and level of competitions they entered, in the differences between their PBs and their other recorded times, in their ages, and in the time lapses between a swimmer’s last competition and the 2016 Olympic Games. Students thus had to interpret these data in light of this variation. This necessitated keeping in mind the problem context such as taking into consideration the relative ranking of competitions, possible factors that could impact on a swimmer’s performance over time, and additional demands of a future Olympics. The consideration of both variability in data and problem context illustrates the “productive boundary interactions” between statistics and mathematics addressed in our framework. This interaction required students to interpret a complex set of data in organising and operating on selected variables in model construction.

In selecting and organising variables, students used data directly from the table and/or operated on data to create new variables. The former included a swimmer’s PB, his/her age, his/her fastest time across the competitions entered, and “recency”, that is, how recently a swimmer had competed. The latter approach, involving the generation of several new variables, was evident in students’ application of mathematical and/or statistical operations and reasoning. Specifically, the following operations were applied, illustrating the role of transnumeration where different approaches are sought to generate meaning in the data within the Olympic context:

Mathematical operations: These included calculating the difference between a swimmer’s PB and his/her fastest time recorded in the data table; determining the number of times a swimmer had swum in a particular event (e.g., Swimming Australia Time Trials); and identifying the total number of events in which a swimmer had competed.

Statistical operations/reasoning: Included here is calculating a swimmer’s mean time across the events swum, and determining “consistency” in performance. The term, consistency, was used by several groups in reference to the extent of variation in a swimmer’s times across events, again highlighting the boundary interactions in modelling with data.

A student in one class group report defined consistency as “Keeping the same speed all the time.” In another class, a student explained in reporting to the class that “…the main factors that affected our decision were the swimmers’ personal best and their consistency.” On being asked to explain her understanding of consistency, the student replied: “… like as in so they are not sometimes very fast and sometimes really slow, they are always at a good speed. They’re almost always swimming well.” The model comprising the variables of personal best time and consistency is displayed in Figure 2. Students’ reference to consistency reflected their awareness of inherent variability in the data, that is, they considered the swimmers’ times in terms of “case values” (Konold et al., 2015) rather than individual attributes.

Figure 2. Model displaying PBs and Consistency

Mathematical and statistical operations/reasoning: These applications can be seen in students’ calculation of time lapses before the onset of the 2016 Olympic Games and how this might affect a swimmer’s future times; and in ranking the “levels of competition” (such as an Olympics event being more competitive than a Commonwealth Games event).
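The operations listed above can be sketched in a few lines of Python. This is an illustrative sketch only, not a procedure the students or authors used: the function name `derive_variables`, the times, and the personal best are all hypothetical, and "consistency" is represented here by the range of a swimmer's times, which is just one possible reading of the students' term.

```python
from statistics import mean


def derive_variables(times, personal_best):
    """Derive new variables from one swimmer's recorded times (seconds).

    `times` holds only the events the swimmer actually swam.
    All names and the use of the range for "consistency" are assumptions.
    """
    fastest_listed = min(times)
    return {
        # mathematical operation: gap between fastest listed time and PB
        "pb_gap": round(fastest_listed - personal_best, 2),
        # mathematical operation: number of events swum
        "events_swum": len(times),
        # statistical operation: mean time across events swum
        "mean_time": round(mean(times), 2),
        # statistical reasoning: smaller spread = more "consistent"
        "spread": round(max(times) - min(times), 2),
    }


# Hypothetical swimmer; these values are NOT taken from Figure 1.
example = derive_variables([48.9, 48.1, 48.5, 49.3], personal_best=47.9)
print(example)
```

Other measures of spread (for example, a standard deviation) would serve equally well for the consistency variable; the range is used here only because it is closest to the informal comparisons described in the student reports.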

Table 1 displays the frequencies with which variables were chosen directly from the table of swimmers’ times (no operation) and those created through operating on the data. Many groups incorporated more than one variable in their model construction. The variables were identified through the analysis of the student workbooks and from any annotated copies of Figure 1.

Table 1. Frequencies of variables (no operation and operations applied) across all class groups (N = 46 groups)

| Variables | Percentage (%) of groups |
|---|---|
| No operation | |
| Swimmer’s personal best time (PB) | 87 |
| Age of swimmer | 28 |
| Each swimmer’s fastest time | 13 |
| Recency | 13 |
| Operations applied | |
| Calculating mean | 37 |
| Calculating difference between a swimmer’s PB and their fastest time | 4 |
| Time lapses before onset of 2016 Olympics | 13 |
| How many times a swimmer has competed in a particular event | 9 |
| No. of events swimmer has competed in | 28 |
| Levels of competition | 2 |
| Consistency in performance | 28 |

N (student groups) = 46

As displayed in Table 1, a swimmer’s personal best time was most frequently incorporated in model construction, either solely (by a few student groups) or in combination with other variables. Ranking swimmers’ personal best times on the table of data supplied (e.g., Fig. 1) was common, with 41% of the groups doing so. A focus solely on PBs, however, suggests these students were not taking into consideration the overall context of the activity, where factors other than personal best times are considered important in the Australian selections of swimming teams (e.g., performances at previous national and international swimming events, and at swimming trials for a particular event where times should equal or better the qualifying times).
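As a hedged illustration of the single-variable approach just described, the following Python sketch ranks the men's personal best times listed in Figure 1 and takes the six fastest. The function name `select_by_pb` is hypothetical, and the pairing of names with times assumes the personal best row of Figure 1 follows the swimmer column order.

```python
# Personal best times (seconds) as listed in Figure 1, assuming the
# PB row follows the same left-to-right order as the swimmer columns.
personal_bests = {
    "Alexander Graham": 49.11,
    "Cameron McEvoy": 47.60,
    "James Magnussen": 47.10,
    "Kenneth To": 48.58,
    "Matt Targett": 48.58,
    "Matthew Abood": 48.23,
    "Ned McKendry": 49.13,
    "Samuel Young": 51.55,
    "Tommaso D'Orsogna": 48.72,
    "Vincent Dai": 50.94,
}


def select_by_pb(pbs, team_size=6):
    """Return the `team_size` swimmers with the lowest (fastest) PBs."""
    ranked = sorted(pbs.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:team_size]]


team = select_by_pb(personal_bests)
print(team)
```

A model of this kind ignores recency, age, and variation across events, which is the limitation noted above for groups that relied on personal bests alone.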

Several groups were also cognizant of the variation in the swimmers’ ages (ranging from 16 to 28 years) and how age can impact on a swimmer’s times, especially given the time lapse to the 2016 Olympics, again reflecting the process of transnumeration. A swimmer’s age was thus considered a variable for inclusion in their models. Likewise, students’ appreciation of the variation in a swimmer’s performance across events was apparent in the number of groups who considered “consistency” in performance an important variable for inclusion, as discussed previously.

Calculating a swimmer’s mean time as a variable for team selection was frequent, again indicating an awareness of variation in the data and reflecting the mathematical and statistical links in the modelling problem. At the same time, however, some mathematical uncertainty appeared regarding comparing means. One group member explained to his class in their report: “I’ve just got another thing that I didn’t really agree on, um use of the average cause you know how you have to add up each person’s times and then divide it by how many races they’ve done, um not every single racer has done the same amount of races so I just thought that would be a little bit inaccurate.” A fellow class peer commented, “With that, it wouldn’t be inaccurate because if you were dividing it by the number of races they’d done then it would still be the same, for example, if someone did eight races you add them all together and divide it, them by eight; it would still be the same, just as accurate as if somebody had done three races and you divided it by three.” Such valuable student interactions illustrate the important opportunities for negotiation and argumentation afforded by modelling problems of the present type.
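The peer’s argument can be checked with a short calculation. The following is a minimal sketch in Python that assumes made-up race times for two hypothetical swimmers (these values are not drawn from the study’s data table). It shows that totals are not comparable when swimmers have raced a different number of times, whereas dividing each total by that swimmer’s own number of races yields comparable per-race means.

```python
from statistics import mean

# Hypothetical 100 m freestyle times in seconds; NOT taken from the study's data.
times = {
    "Swimmer A": [49.2, 48.9, 49.5, 49.0, 48.8, 49.4, 49.1, 48.7],  # eight races
    "Swimmer B": [48.6, 48.9, 48.4],                                # three races
}

for name, races in times.items():
    total = sum(races)
    # Dividing each total by that swimmer's own race count gives a per-race figure,
    # so the two means are on the same scale even though the totals are not.
    print(f"{name}: {len(races)} races, total {total:.1f} s, mean {mean(races):.2f} s")

means = {name: mean(races) for name, races in times.items()}
fastest_on_average = min(means, key=means.get)  # lowest mean time
print("Fastest on average:", fastest_on_average)
```

In this illustration the swimmer with three races has the lower mean even though the swimmer with eight races has by far the larger total, which is the distinction the second student was drawing.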

The students rarely took into account, however, the variation in the competitive levels of the swimming events, and any impact this might have. It is surprising that this variable received limited attention, given the previous examples of students’ awareness of the problem context as they considered variables for inclusion in their models. Several groups did, however, take into account the number of events in which a swimmer had competed and how this variable could affect their team selections.

Overall, 43% of the groups included two variables in their model construction, while 11% focused on only one variable. Models comprising three or more variables were less frequent, with 30% of groups including three, 4% four variables, and only 11% incorporating five or more. Figure 3 illustrates the Workbook documentation of one model in which four variables were considered, namely, swimmers’ ages, the numbers of races undertaken, average times, and personal best times.

Figure 3. A model comprising lists of four variables

Student groups frequently documented their models in more than one way, such as creating a TinkerPlots representation (Fig. 4) and recording directly on the supplied table of data (e.g., Fig. 1), as noted previously. Displaying their model with TinkerPlots was featured in 26% of the groups’ responses. Figure 4 illustrates a TinkerPlots representation accompanied by written documentation.

• Our task was to identify the best 6 male swimer (swimmers) to compete in the men’s 100m freestyle event at Rio de Januro (Rio de Janeiro) Olympics. We were to choose swimmers with the best chance of receiving a gold medal.
• The data we used was the Men’s 100m Freestyle Results (inc relay events) recorded by Aus (Australian) competitors [in seconds]. More specifically, their times in competition. We used graphs and averages to calculate the fastest swimmers.
• Our swimmers were Magnussen, McEvoy, Abood, To, Graham and Targett. We chose them because they had average times of 47.73, 48.21, 49.24, 49.33, 49.79 and 50.04 respetledy (respectively), which were better than the other average times of 50 seconds plus.

Figure 4. An example of a TinkerPlots representation and written documentation

Examples of Group Model Construction

To illustrate a few ways in which student groups interpreted, organised, and operated on the data in constructing their models, we consider transcripts from three groups who specifically took into account multiple variables. An indication of transnumeration and mathematisation processes can be seen as the students grapple with aspects of the problem context when anticipating the impact of age and other factors on a swimmer’s future performance. Difficulties in relying only on a swimmer’s PB when predicting future performance were apparent in the group discussions, prompting the consideration of other variables.

Nick and Milo

This group focused on the swimmers’ PBs, their ages, the recency of the events, and trends in swimmers’ times over the given period. Nick began by considering the swimmers’ PBs and plotting these on a TinkerPlots file “so we can see how they’re ranked on their PB.” On identifying several swimmers based on their PBs, Milo alerted Nick to the need to “consider their current age.” “Cause they get older and generally the older you get the weaker your um ...” Milo further explained that they could consider the July “cause that’s like a really recent large event.…” and that “You have to look at the recent 3 [events] at least, the Personal Best (PB) isn’t generally ... but the thing is you don’t look at PB time, cause that doesn’t mean … he could have got that years ago. You want to look at the three recent ones that he’s done.”

At this point, Milo revisited the problem goal, but was reminded by Nick that they needed to refine their initial question, “Who are the best six Australian swimmers?”, which subsequently became, “Who are the six best swimmers in the men’s 100m freestyle, the Australian Men’s 100m Freestyle?” Moving to the age variable, Milo commented, “I reckon the prime age is about 20, just the prime age.”

Again returning to their question, the students decided to list six swimmers according to their PBs on their data table, only to discover that two swimmers had the same PBs. On comparing the rate of change in these swimmers’ times, Milo suggested they discard the slower swimmer. Nick preferred to keep both swimmers but Milo alerted him to another variable to consider, namely, that one of the swimmers “will be 24 by the time (in 2016) and he, look his time is getting slower.” While deciding to keep one of the swimmers as a reserve, Milo noticed how recently the swimmers had competed and took this into account along with trends in the swimmers’ times, observing that one of the swimmers was actually becoming slower with “every new race he does.” On the other hand, selecting the alternative swimmer was problematic: “…it was a little hard to say because he hasn’t done any recent races but we can see from 2012 to 2013 that he’s actually improving.” This group’s model construction is quite sophisticated in that they were considering variables simultaneously, “trading off” what they deemed less important, and were taking into consideration trends in data across races.
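Nick and Milo compared trends informally. One way to make such a comparison explicit is to fit a least-squares line to each swimmer’s times; this is an assumption of the sketch below, not a method the students used. The years and times are hypothetical and simply mimic the situation described, in which two swimmers share the same PB but one is improving and the other is slowing.

```python
def slope(xs, ys):
    """Least-squares slope of ys against xs (here: change in seconds per year)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Hypothetical records; both swimmers have the same PB of 48.5 s.
years = [2012, 2013, 2014]
improving = [49.3, 48.9, 48.5]  # times falling year on year
slowing = [48.5, 48.8, 49.2]    # times rising year on year

improving_trend = slope(years, improving)
slowing_trend = slope(years, slowing)

# A negative slope means the swimmer is getting faster over the period.
print(f"Improving swimmer: {improving_trend:+.2f} s per year")
print(f"Slowing swimmer:   {slowing_trend:+.2f} s per year")
```

Under these assumed values the PBs alone cannot separate the two swimmers, whereas the sign of the slope does, which parallels the trade-off the group negotiated.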

The age variable was also considered important but its potential impact on a swimmer’s future performance was open to speculation.

Robyn and Pierre

Robyn and Pierre not only considered the swimmers’ PBs but also their mean times, their age, and the number of events in which they had competed. As Robyn reported to her class peers: “Our Australian 100m Freestyle choices were who swam a PB of 47.10 seconds and has an average time of 47.74 sec. We also chose Alexander Graham, who swam a PB of 49.11 seconds and has an average of 49.80 seconds, and Cameron McEvoy, who swam a PB of 47.60 and has an average of 48.25.” On documenting their final model of selected swimmers, the group explained their decision to the class with reference to the unreliability of personal best times: “We came to our decision by firstly looking at the PBs but had a disagreement that that person won’t get their PB again, which resulted in us finding out the average.”

Edward and Josh

These students not only considered the swimmers’ means and their PBs, but also calculated the mean of a swimmer’s PB and his overall mean. They further took into account the number of races a swimmer had swum. As the group explained in presenting their class reports: “… our three main criteria are competition average, and it’s basically the average speed in the which they raced in their competitions ... and the second one [criteria] was PB and how fast they can swim; so by that I mean once we found out their competition average we averaged that with the PB to find out their thing …” When the first author asked for further explanation, Edward explained that “We just averaged it, yeah so we add them.” Jack further explained: “We found out the average … we got the average, we got the PB, and we added them together and then we divided both of them by 2.” Jack considered their approach to model construction “… very good to use in this kind of situation …; cause you can’t just rely on a PB, you can’t; a swimmer can’t perform their PB consistently every time so we’ve thought that … we will average the competition and their PB…”

This group displayed an interesting instance of the application of both mathematical and statistical operations and reasoning. The students were aware of the inconsistency in the swimmers’ personal best times and the variation in their race times, and thus chose to calculate the average of the two variables. It would thus seem that the students had at least an implicit understanding of how the mean can reveal characteristics of data distribution.
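The composite measure this group described, the mean of a swimmer’s PB and his competition average, can be written out as a short worked example. The sketch below is illustrative only: Edward and Josh’s own figures are not reported, so it borrows the PB and average times that Robyn and Pierre’s group reported for two swimmers. The function and variable names are hypothetical.

```python
def composite(pb, competition_average):
    """Mean of a swimmer's personal best and competition average, in seconds."""
    return (pb + competition_average) / 2

# (PB, average) in seconds, as reported by Robyn and Pierre's group; used here
# only to illustrate the calculation Edward and Josh's group described.
swimmers = {
    "Graham": (49.11, 49.80),
    "McEvoy": (47.60, 48.25),
}

scores = {name: composite(pb, avg) for name, (pb, avg) in swimmers.items()}
ranked = sorted(scores, key=scores.get)  # lower composite time ranks first

for name in ranked:
    print(f"{name}: {scores[name]:.3f} s")
```

With these values the composite is (47.60 + 48.25) / 2 = 47.925 s for McEvoy and (49.11 + 49.80) / 2 = 49.455 s for Graham. The composite always lies midway between the PB and the competition average, so it gives the single best race as much weight as all races combined; whether that weighting is appropriate is a modelling choice rather than a statistical necessity.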

4.2 Research question 2: What was the nature of the informal inferences students drew from their models?

In addressing this second research question, we refer to our notion of informal inferential reasoning with a focus in this section on students’ certainty in their team selections and their confidence in applying their model to other sports events. In drawing on students’ workbook responses and their reports presented to their class peers, we identified four categories of response to certainty in team selections: (1) No response; (2) Yes, “completely” certain, “fairly” certain, or “comfortable” with the model, but no justification given or the justification was limited (e.g., “Why wouldn’t you pick the fastest swimmers to swim on your team?”); (3) As for category 2 but with an acceptable justification (e.g., “I am very certain about the team because they consistently show they’re some of the fastest women out there.”); and (4) No or not very certain, with a reasonable justification (e.g., “I am not extremely certain my method would be good for other races but it could work depending on their experience and previous racing times”).

The majority of student groups (78%, N=46) offered a category 3 type response with respect to the certainty in their team selection winning the 2016 Olympics. Reference to trends in the data was made by several groups, with explanations such as:

We have chosen the fastest swimmers to increase the chance of winning. Most of these swimmers have a decreasing trend in their recorded time (PBs). This means they are improving their recorded times as the time goes by. According to the swimmers’ PBs’ decreasing trend we predicted that by 2016 that the swimmers will beat their current PB record and will have a better chance of winning Gold in the Rio Olympics 2016. We are certain about our conclusion because the trend line shows a decreasing trend in the swimmers’ records and the fastest swimmers will have a better chance because we already are faster and our trends, and are trending better than other swimmers.

Many groups noted some reservation in their conclusions, however, citing an element of chance or limitations in their data, such as the small sample size and the lack of other race data. Students’ reference to chance suggested they were aware of inherent variability in sports times and a swimmer’s performance, for example, “… the other people, maybe they’re slower than you or maybe they’re faster than you,” and “We believe that our selections is [sic] best, based on the data provided but there is still an element of chance. Their chance of winning will rely on how well they perform on the day.” Awareness of data limitations was evident in explanations of the form, “I am very certain but I am not completely certain as this is only 11 out of thousands of swimmers so you know, it’s only a sample so it isn’t really big,” and “…we didn’t have enough information because, like, we should have had more times, like more times to be able to make the teams maybe better because some of the swimmers could have been faster than they are but maybe they just had some bad days.”

Thirteen percent of the groups expressed certainty in the swimmers they selected, but could not give a justification or at least an appropriate one (category 2). Only 6% of groups were not certain (with justification; category 4). Of those who expressed uncertainty, reference to a lack of contextual knowledge and statistical information was made, again illustrating the important role of transnumeration. For example, one student explained, “I am not certain because, I’m not certain of how they’ll succeed because I don’t know how many times they have succeeded in an event of swimming and to become first in a competition. I’m also not certain because I don’t have much knowledge about swimming and what speed they have to compete at in order to become successful.”

Students’ consideration of whether their models could be applied to related sports events involved vertical mathematisation, where the general structure of their models was the focus. The student groups were generally confident that they could extend their models to other sports events. Sixty-five percent of the student groups expressed this certainty and could justify their conclusions, with some noting that they might have to make an adjustment to one or more of their variables because of the different requirements of a new problem context. For example, “You could kind of do the same thing but say, instead of speed, you could say, if it was like long jump or something, you could do their distance, PB for distance and their consistency of their distance, and something like that, with all the different types, with a lot of different types of sports.”

Other groups (20%) were not certain or not completely certain of their model’s applicability to other sports teams, with justifications that referred to differences in variables across sports, such as:

Well I, I’m not so much [certain] because it really depends on what the sport is because, say it’s something like netball, they’re going to look at your passing or something as well, they’re not just looking at your speed. So, just looking, I think if it’s just purely like running or swimming it does work but if it’s a sport for something like netball, and you are looking at this, this could be like a plan B or something but you can’t really rely on it when looking at other sports.

The remaining groups either did not respond to this generalisability aspect (4%) or expressed certainty but couldn’t justify their decision (12%).

It is also interesting to note how students critically questioned some of the conclusions their peer groups had drawn, regarding limitations in the data and variables used. For example, several groups reported to the class their degree of certainty as a percentage, which elicited a peer question such as, “When you said you’re 75% sure that they’re going to win the race, how do you know that, cause you don’t really have the teams and their times [the group in question focused only on PBs]?” Although the group explained, “Yeah, that’s basically why we said we’re only 75% sure because we don’t really know how good the other teams are”, this did not satisfy their peer: “How can you be any percent sure? If you don’t even know the other teams?”

Critical peer questioning also arose when groups limited the variables considered in their model creation, especially when the PBs were the primary or only variable. Peer comments following group reports to the class included, “Do you think your team selection would be more accurate if you take more things into account than just the PB?” and “So do you think you could have made a better choice if you … considered the competitions and the age group?”

5. Discussion

In this article, we have examined one approach to developing primary school students’ statistical literacy, namely, through modelling with data. We proposed a framework comprising four components: working in shared problem spaces between mathematics and statistics; interpreting and reinterpreting problem contexts and questions; interpreting, organising, and operating on data in model construction; and drawing informal inferences.

Modelling with data provides an ideal vehicle for developing shared problem spaces, specifically, between mathematics and statistics. The multiple, real-world contexts that can be targeted enhance the contributions of this modelling, where various data must be interpreted, organised, and operationalised through applying mathematical and/or statistical procedures and reasoning, as was the case in the present problem.

The problem presented to the 6th-grade students required the production of models for selecting Australian swimming teams for the (then forthcoming) 2016 Olympic Games, using data on swimmers’ competing times at various previous events. The problem comprised data that displayed inherent variation, which necessitated a consideration of variables that would be most appropriate for inclusion in model generation. Students’ selection of variables involved using data directly from the given data table (e.g., Fig. 1) such as ages and personal best times, and applying mathematical and statistical operations in creating new variables.

Students’ responses indicated an appreciation of foundational statistical ideas and processes including calculating means to accommodate data variation, identifying how trends in swimmers’ performances inform model construction, recognising limitations in using only one performance variable (especially personal best times), acknowledging uncertainty in team selections, and recognising required structural adjustments (vertical mathematisation) in applying their models to other sporting events.

Students’ recognition of data variation and distribution suggests they were aware that a focus only on single attributes or their values (e.g., PBs or an athlete’s time in a given race) was inadequate in model creation for this problem. Instead, adoption of an aggregate perspective (Ben-Zvi & Arcavi, 2001; Konold et al., 2015), that is, viewing data in terms of a distribution of values, was frequently present. Assisting younger students in progressing from a single case perspective through to a consideration of multiple case values (Konold et al., 2015), and ultimately to an aggregate viewpoint needs greater attention in primary curricula. On the other hand, it is worth noting Konold et al.’s (2015) point that, “…the perspectives one takes on data should serve one’s questions rather than the other way around” (p. 323). Hence, case-value perspectives might not always be basic starting points to more sophisticated forms of analysing and representing data. Indeed, children often view case values as effective ways of representing variability, as Lehrer and Schauble (2004) indicated in their reference to “plateaus” in case-value plots indicating low variability in certain regions of a data display.

A consideration of context in connection with selecting and operating on variables was also apparent in students’ responses, reflecting interactions between transnumeration and mathematisation processes. Contextual issues were present throughout as students considered different factors that could impact on a swimmer’s future performances. Many such factors were produced through operating on given variables while keeping in mind the problem context and goal of selecting a swimming team with the best chances of winning at the 2016 Olympics. One important contextual factor, however, that was rarely taken into consideration was the competitive level of a swimming event. This was an unexpected omission given that transnumeration was apparent in several instances during model construction.

A recognition of uncertainty arising from chance variation (Lehrer et al., 2011) was evident in students’ informal inferences with respect to their confidence in their team selections and in applying their models to other sports events. Even when students were certain or reasonably confident that their model was an effective one for team selection, they nevertheless raised aspects that could impact on decisions made. Recognition that they could apply their models to other sports contexts but that some adaptations would be required (e.g., in some ball sports) suggests an awareness of data as “numbers within a context” (Franklin et al., 2007; Langrall et al., 2011; Moore, 1990), an essential understanding in students’ development of statistical literacy. The development of a model that could be applied and re-applied to other contexts is a core feature of modelling with data, promoting students’ use of vertical mathematisation including an appreciation of a model’s generalisable structure (Doerr et al., in press; Lehrer & Schauble, 2010). Ideally, our students would have been presented with a sequence of related problems housed within other contexts, not just sports scenarios, where they could have assessed the effectiveness of their models. The learning affordances of model development sequences, where students can apply and adapt the structures they have created, have been frequently documented (e.g., Doerr et al., in press; Doerr & English, 2003).

Of interest were some of the critical questions peers asked other groups on their model reporting. Such questions pertained to the degree of certainty with which some groups justified their models, as well as the limitations imposed by a narrow selection of variables in model construction. These foundational inferential reasoning processes, however, currently receive limited attention in the primary grades (Makar, 2016).

6. Limitations

Limitations of this study need noting. First, the study was confined to four classes in the one school situated in a middle-class suburb. More classes from schools in other demographic regions would have enhanced the findings. Second, due to time limitations, students were presented with data rather than given the opportunity to source these themselves. Ideally, students would have gathered their own data, possibly leading to further variable selection and model creation and representation. Third, as noted, students would have benefited from testing their predictions on related problems in different contexts. This would have likely strengthened their appreciation of the mathematical and statistical structures (Groth, 2015) comprising their models.

7. Concluding Points

Experiences in the primary grades do not usually afford children opportunities to link their mathematical and statistical learning in dealing with problems involving complex data, where the exact nature of a desired end-product is not known in advance, and where different approaches to solution and multiple solutions are possible (cf. English, 2013; Lesh & Zawojewski, 2007; Langman et al., 2016). In contrast to traditionally held perspectives, primary school students can deal with complex modelling problems in which they not only apply and enhance their understanding of mathematics and statistics within appealing contexts (English, 2014; Leavy & Hourigan, forthcoming), but also generate new learning beyond the curriculum.

Acknowledgements

This study was supported by research funding from the Australian Research Council (ARC) Discovery Grant DP20100158. Any opinions, findings, and conclusions or recommendations expressed are those of the authors and do not necessarily reflect the views of the ARC. We wish to acknowledge the enthusiastic participation of the students and teachers, together with the excellent support of our senior research assistant, Jo Macri, and research assistant Joanna Smeed.

References

Bakker, A., & Gravemeijer, K. P. E. (2004). Learning to reason about distribution. In D. Ben-Zvi & J. Garfield (Eds.), The challenge of developing statistical literacy, reasoning and thinking (pp. 147-168). Dordrecht, The Netherlands: Springer.

Ben-Zvi, D., & Arcavi, A. (2001). Junior high school students’ construction of global views of data and data representations. Educational Studies in Mathematics, 45, 35-65.

Blum, W., & Leiss, D. (2007). How do students and teachers deal with mathematical modelling problems? The example Sugarloaf and the DISUM project. In C. Haines, P. L. Galbraith, W. Blum, & S. Khan (Eds.), Mathematical modelling (ICTMA12): Education, engineering and economics (pp. 222-231). Chichester: Horwood.

Cobb, P., Confrey, J., diSessa, A., Lehrer, R., & Schauble, L. (2003). Design experiments in educational research. Educational Researcher, 32(1), 9-13.

Doerr, H. M., delMas, R., & Makar, K. (in press). A modeling approach to the development of students’ informal inferential reasoning. Statistics Education Research Journal.

Doerr, H. M., & English, L. D. (2003). A modelling perspective on students’ mathematical reasoning about data. Journal for Research in Mathematics Education, 34(2), 110-136.

Engel, J. (2017). Statistical literacy for active citizenship: A call for data science education. Statistics Education Research Journal, 16(1), 44-49. Retrieved from https://iase-web.org/documents/SERJ/SERJ16(1)_Engel.pdf

English, L. D. (2013). Reconceptualising statistical learning in the early years. In L. D. English & J. Mulligan (Eds.), Reconceptualising early mathematics learning (pp. 67-82). Dordrecht, The Netherlands: Springer.

English, L. D. (2014). Promoting statistical literacy through data modelling in the early school years. In E. Chernoff & B. Sriraman (Eds.), Probabilistic thinking: Presenting plural perspectives (pp. 441-458). Dordrecht, The Netherlands: Springer.

English, L. D., & Lesh, R. A. (2003). Ends-in-view problems. In R. A. Lesh & H. Doerr (Eds.), Beyond constructivism: A models and modelling perspective on mathematics problem solving, learning, and teaching (pp. 297-316). Mahwah, NJ: Lawrence Erlbaum.

English, L. D., Arleback, J. B., & Mousoulides, N. (2016). Reflections on progress in mathematical modelling research. In A. Gutierrez, G. Leder, & P. Boero (Eds.), The Second Handbook of Research on the Psychology of Mathematics Education (pp. 383-413). Rotterdam: Sense Publishers.

English, L., Watson, J., & Fitzallen, N. (2017). Fourth-graders’ meta-questioning in statistical investigations. In A. Downton, S. Livy, & J. Hall (Eds.), 40 years on: We are still learning! (Proceedings of the 40th annual conference of the Mathematics Education Research Group of Australasia, pp. 229-236). Melbourne: MERGA.

Franklin, C., Kader, G., Mewborn, D., Moreno, J., Peck, R., Perry, M., & Scheaffer, R. (2007, August). Guidelines for assessment and instruction in statistics education (GAISE) report. Alexandria, VA: American Statistical Association. Retrieved from http://www.amstat.org/education/gaise/GAISEPreK-12_Full.pdf

Freudenthal, H. (1991). Revisiting mathematics education: China lectures. Dordrecht, The Netherlands: Kluwer.

Gal, I. (2004). Adults’ statistical literacy: Meanings, components, responsibilities. In D. Ben-Zvi & J. Garfield (Eds.), The challenge of developing statistical literacy, reasoning and thinking (pp. 47–78). Dordrecht, The Netherlands: Kluwer.

Gould, R. (2017). Data literacy is statistical literacy. Statistics Education Research Journal, 16(1), 22-25. Retrieved from https://iase-web.org/documents/SERJ/SERJ16(1)_Gould.pdf

Gravemeijer, K. (1999). How emergent models may foster the constitution of formal mathematics. Mathematical Thinking and Learning, 1, 155-177.

Groth, R. (2015). Research commentary: Working at the boundaries of mathematics education and statistics education communities of practice. Journal for Research in Mathematics Education, 46(1), 4-16.

Kaiser, G. (2007). Modelling and modelling competencies in school. In C. Haines, P. L. Galbraith, W. Blum, & S. Khan (Eds.), Mathematical modelling (ICTMA 12): Education, engineering and economics (pp. 110-119). Chichester: Horwood.

Konold, C., Finzer, W., & Kreetong, K. (2015, July). Modeling as a core component of structuring data. Paper presented at the 9th International Research Forum on Statistical Reasoning, Thinking, and Literacy (SRTL9), Paderborn, Germany.

Konold, C., & Higgins, T. L. (2003). Reasoning about data. In J. Kilpatrick, W. G. Martin, & D. Schifter (Eds.), A research companion to Principles and Standards for School Mathematics (pp. 193-215). Reston, VA: National Council of Teachers of Mathematics.

Konold, C., & Miller, C. D. (2011). TinkerPlots: Dynamic data exploration [Computer software, Version 2.2]. Emeryville, CA: Key Curriculum Press.

Langman, C. N., Zawojewski, J. S., & Whitney, S. R. (2016). Five principles for supporting design activity. In L. A. Annetta & J. Minogue (Eds.), Connecting science and engineering education practices in meaningful ways: Building bridges (pp. 59-106). Dordrecht: Springer.

Langrall, C., Nisbet, S., Mooney, E., & Jansem, S. (2011). The role of context in developing reasoning about informal statistical inference. Mathematical Thinking and Learning, 13(1-2), 47-67.

Lavigne, N. C., & Lajoie, S. P. (2007). Statistical reasoning of middle school children engaged in survey inquiry. Contemporary Educational Psychology, 32, 630-666.

Leavy, A., & Hourigan, M. (forthcoming). Inscriptional capacities of young children engaged in statistical investigations. In A. Leavy, M. Meletiou-Mavrotheris, & E. Paparistodemou (Eds.), Statistics in early childhood and primary education: Supporting early statistical and probabilistic thinking. Dordrecht, The Netherlands: Springer.

Lehrer, R., & English, L. D. (in press). Introducing children to modeling variability. In D. Ben-Zvi, J. Garfield, & K. Makar (Eds.), International handbook of research in statistics education. Dordrecht, The Netherlands: Springer.

Lehrer, R., Kim, M. J., & Jones, R. S. (2011). Developing conceptions of statistics by designing measures of distribution. ZDM - The International Journal on Mathematics Education, 43(5), 723-736.

Lehrer, R., & Romberg, T. (1996). Exploring children’s data modeling. Cognition and Instruction, 14(1), 69–108.

Lehrer, R., & Schauble, L. (2000). Inventing data structures for representational purposes: Elementary grade students' classification models. Mathematical Thinking and Learning, 2, 49-72.

Lehrer, R., & Schauble, L. (Eds.). (2002). Investigating real data in the classroom: Expanding children's understanding of math and science. New York: Teachers College Press. (Spanish Translation, Publicaciones M.C.E.P., Sevilla, Spain).

Lehrer, R., & Schauble, L. (2010). What kind of explanation is a model? In M. K. Stein & L. Kucan (Eds.), Instructional explanation in the disciplines (pp. 9-22). New York: Springer.

Lesh, R. A., & Doerr, H. (Eds.). (2003). Beyond constructivism: A models and modelling perspective on mathematics problem solving, learning, and teaching. Mahwah, NJ: Lawrence Erlbaum.

Lesh, R., & Lehrer, R. (2000). Iterative refinement cycles for videotape analyses of conceptual change. In R. Lesh & A. Kelly (Eds.), Research design in mathematics and science education (pp. 665-708). Hillsdale, NJ: Erlbaum.

Lesh, R. A., & Zawojewski, J. (2007). Problem solving and modelling. In F. K. Lester Jr. (Ed.), Second handbook of research on mathematics teaching and learning (pp. 763–804). Charlotte, NC: Information Age.

Makar, K. (2016). Developing young children's emergent inferential practices in statistics. Mathematical Thinking and Learning, 18(1), 1-24.

Makar, K., Bakker, A., & Ben-Zvi, D. (2011). The role of context and evidence in informal

inferential reasoning. Mathematical Thinking and Learning, 13(1-2), 1-4.

Makar, K., & Rubin, A. (2009). A framework for thinking about informal statistical

inference. Statistics Education Research Journal, 8(1), 82-105. Retrieved from

http://iase-web.org/documents/SERJ/SERJ8(1)_Makar_Rubin.pdf

Moore, D. S. (1990). Uncertainty. In L. Steen (Ed.), On the shoulders of giants: New

approaches to numeracy (pp. 95-137). Washington, DC: National Academy Press.

Niss, M. (2010). Modeling a crucial aspect of students’ mathematical modeling. In R. Lesh,

P. L. Galbraith, C. R. Haines, & A. Hurford (Eds.), Modeling students’ mathematical

modeling competencies (pp. 43-60). New York: Springer.

Patton, M. (2002). Qualitative research & evaluation methods (3rd ed.). Thousand Oaks,

CA: Sage.

Pfannkuch, M. (2011). The role of context in developing informal statistical inferential

reasoning: A classroom study. Mathematical Thinking and Learning, 13(1-2), 27-46.

Watson, J. M. (2006). Statistical literacy at school: Growth and goals. Mahwah, NJ:

Lawrence Erlbaum.

Watson, J. M., & English, L. D. (2015). Introducing the practice of statistics: Are we

environmentally friendly? Mathematics Education Research Journal, 27(4), 585-613.

Watson, J. M., & English, L. D. (in press). Statistical problem posing, problem refining, and

further reflection in Grade 6. Canadian Journal of Science, Mathematics, and

Technology Education.

Whitin, D. J., & Whitin, P. E. (2011). Learning to read numbers: Integrating critical literacy

and critical numeracy in K-8 classrooms. New York, NY: Routledge.

Wild, C. J., & Pfannkuch, M. (1999). Statistical thinking in empirical enquiry. International

Statistical Review, 67(3), 223–248. doi:10.1111/j.1751-5823.1999.tb00442.x

Zawojewski, J. S. (2010). Problem solving versus modeling. In R. Lesh, P. L. Galbraith, C.

R. Haines, & A. Hurford (Eds.), Modeling students’ mathematical modeling competencies

(pp. 237-243). New York: Springer.

Zawojewski, J. S., Lesh, R., & English, L. D. (2003). A models and modelling perspective on

the role of small group learning. In R. A. Lesh & H. Doerr (Eds.), Beyond

constructivism: A models and modelling perspective on mathematics problem solving,

learning, and teaching (pp. 337-358). Mahwah, NJ: Lawrence Erlbaum.