MAKING A GROUPED-DATA FREQUENCY TABLE: DEVELOPMENT AND
EXAMINATION OF THE ITERATION ALGORITHM
A dissertation presented to
the faculty of
the College of Education of Ohio University
In partial fulfillment
of the requirements for the degree
Doctor of Philosophy
Hippolyte O. Lohaka
November 2007
This dissertation titled
Making a Grouped-data Frequency Table: Development and Examination of the Iteration
Algorithm
by
HIPPOLYTE O. LOHAKA
has been approved for
the Department of Educational Studies
and the College of Education by
Gordon P. Brooks
Associate Professor of Educational Studies
Renée A. Middleton
Dean, College of Education
ABSTRACT
LOHAKA, HIPPOLYTE O., Ph.D., November 2007, Educational Research and
Evaluation
Making a Grouped-data Frequency Table: Development and Examination of the Iteration
Algorithm (283 pp.)
Director of Dissertation: Gordon P. Brooks
This study focuses on the development and examination of a new method to construct frequency tables for grouped data. This method is called the iteration algorithm in that it proceeds by successive iterations to determine the four key elements that are essential in building a grouped-data frequency distribution. The algorithm also uses five formulas and
stops running as soon as the first solution is attained (for teaching purposes only). Two
major interests emerged. The first interest was to evaluate how accurate the iteration
algorithm is as a process. The second and main focus of this study was to assess the
effectiveness of the iteration algorithm as an instructional method. The findings of the
Monte Carlo simulations to address the first main interest showed that the results yielded
by the iteration algorithm are comparable to those produced by a well-known statistical
package. To address the second and main focus of this study, a multivariate analysis of covariance (MANCOVA) was conducted; its results indicated that the students expressed, on average, more positive attitudes towards the iteration algorithm than towards a traditional method when learning how to construct their own grouped-data frequency tables.
Approved:
Gordon P. Brooks
Associate Professor of Educational Studies
This dissertation is dedicated to the memories of my late father César Lohaka and my late daughter Sibyl Walo Lohaka.
ACKNOWLEDGMENTS
Throughout the entire process of my doctoral program, I have relied on the support of so many tremendous people. First of all, and most significantly, I would like to offer my heartfelt gratitude to the Almighty God who created me and gave me all the intellectual and physical abilities, with which I have been able to achieve whatever life and educational goals I set for myself.
Second, I wish to express my utmost gratitude and appreciation to my academic advisor and dissertation committee chair Dr. Gordon Brooks for his unremitting backing, feedback, insight, and guidance through my unending years of uncertainty and ambiguity.
I could not have completed this without his encouragement, challenge, and mentoring. I am also grateful to my other dissertation committee members: Dr. Teresa Franklin, Dr.
George Johanson, and Dr. Robert Klein for their help, particularly with the sampling plan and data collection procedures. The contributions they made to this research study are immeasurable.
In addition, my many thanks go to the Mathematics and Psychology department professors who kindly agreed to administer my questionnaires to their elementary statistics students. Without their cooperation, it would not have been possible to complete the last two chapters of this project as they are.
A special thank you goes to Dr. Timothy Kayworth, my current Department Chair, for providing continual support and reducing my teaching load during my first year.
Last but not least, special thanks are due to Helene Obonga Lohaka, my wife and life partner. Her emotional support, encouragement, and caring love were constant sources of strength and inspiration during the difficult times of this study.
TABLE OF CONTENTS
Abstract…………………………………………………………………………………....3
Dedication………………………………………………………………………………....4
Acknowledgements………………………………………………………………………..5
List of Tables.……………………………………………………………………………13
List of Figures……………………………………………………………………………16
Chapter 1: Introduction…………….....….……………..………………………………..17
Forms of Tabulation Arrangement…………...…..………….…………..……….18
Presenting Data in an Ordered Array.….….………....…..……………….20
Example of an Ordered Array……………..…..………………….22
Presenting Ordered Array Data in a Stem-and-Leaf Plot…………………24
Example of a Stem-and-Leaf Plot..…..……...…………………....25
Presenting Data in a Simple Frequency Table..…………..………………26
Example of a Simple Frequency Table..……..…..………………27
Presenting Data in a Grouped-data Frequency Table.…………..……..…32
Example of a Grouped-data Frequency Table……..……………...33
Presenting Data in a Histogram…………………………..………………37
Example of a Histogram…………………………..…………..….39
Rationale for the Study…………………………..………..………………………41
Statement of the Problem…..……………………………………..……………….44
Research Questions.…….………….……….…………………....…….………….47
Monte Carlo Simulations…………………….……….………..…………48
Survey of Attitudes towards the Method Used……….…………..………49
Research Hypotheses………………….……………………………….…………49
Monte Carlo Simulations……………………….……..….………………49
Survey of Attitudes towards the Method Used………..….………………50
Significance of the Study………………..….……………………….……………51
Difference between the Iteration and Traditional Methods…..….….……51
Benefits expected to be gained over the Traditional Methods.……..……52
Summary of Major Differences and Benefits……………………….……56
Limitations and Delimitations of the Study…………………………………...….58
Definitions of Terms…………………………………………………………..….60
Attitude……………………....…………………………………….……..60
Class Boundary……………....…………………………………….……..60
Class Frequency...…………….…………………………………….…….60
Class Interval…...………………………………………….……………..61
Class Limits....………………..…………………………………………..61
Class Midpoint…………….….…………………………….…………….61
Class Width…………………………………………………...…………..62
Coefficient of Variation .……..…………………………………………..62
Frequency Distribution………..…………...……………………………..63
Grouped Data………………….....………...……………………………..63
Grouped-Data Frequency Distribution..…...……………………………..63
Iteration Algorithm….…………..………………………………………..64
Range…….……………………..…………………………….…………..64
Raw Data..……..……………….……………….………………………..64
Simple Frequency Table…..…….………………………………………..65
Traditional Method…………..…..………………………………………..65
Organization of the Study……………………….………………………………..65
Chapter 2: Review of the Literature………………….…………………………..……....68
Introduction……………….…………….………………………….…………….68
Review of the Relevant Literature….…….…..…………………….…………….69
Vague Guidelines……………………….………………………………..70
Presenting Data in a Frequency Table…..……………………………..…71
Presenting Data in a Simple Frequency Table..………………….72
Presenting Data in a Frequency Table for Grouped Data..….……73
Advantages…………………………….…..………….….73
Disadvantages………………………………….…………75
Summary of Advantages and Disadvantages…………..…76
Presenting Data in a Graph…………..………………………………..….77
Histogram…………………………………………………………78
Traditional Methods………………………………………………………80
Inclusive Range…….…………………………………………….80
Number of Class Intervals………………………………………..81
Class Interval Size………………………………………………..93
Starting Point……………………………………………………..96
Tallies…………………………………………………………….98
Summary Statistics ..………………..…………………………..………..98
From the Teaching and Learning Statistics’ Perspectives…………....…..99
Definitions of Teaching, Learning, and Education………………99
Various Teaching Methods Documented…………….………....100
Teaching and Learning Statistics……………………………….103
Factors Affecting Attitudes towards Statistics.…………………103
Unique Solution Feature………….…………………..…………104
Structural and Organizational Issues…..………………………..105
Reasoning about Graphical Representation of Data….…………106
Attitudes towards Statistics………..……………..………….…..107
Summary and Conclusion…………..……………..………….…108
Chapter 3: Iteration Algorithm Development and Examination….………...……….…111
Development of a Traditional Method………..………..…………………….…111
Determining the Range…..…………………….…….………….111
Choosing a Number of Class Intervals………..……..………….111
Computing a Class Width…….……………….…….………….112
Picking up a Starting Point……….………………….………….113
Table Preparation and Construction……………………….……114
Development of the Iteration Algorithm….………..……..……………………119
Determination of the Inclusive Range of Grouped Data….…….119
Selecting a Number of Class Intervals………..…….……..…….120
Fixing on the Class Interval Size....………….…….……………122
Setting up the Starting Point………………………..……………124
Table Preparation and Construction…………………………..…124
Iteration Algorithm versus Traditional Method…….……………………..…….127
Monte Carlo Simulations…………………………………………..…………….128
Random Generation of Normal Data…………………………..……..…129
Main Outcomes………………………………………………..……..…130
Conclusion…………….……………..………………………………….135
Chapter 4: Methodology..………….….….…..………………..……………………….136
Research Design……………………….………………………………………..136
Operational Definitions of the Variables……………….…………..…………...137
Survey of Attitudes of the Method Used…………….………………….137
Measures to Represent the Student Perceived Ability in Statistics…..…138
Identification of Population……….….…………………….…………………...138
Sampling Plan……………………..…………………………………..………...138
Instrumentation……………………………………………………………..……139
Selection/Development of Instrument………………………….……….139
Pilot Studies…………….…………………………………………….…………144
Reliability Issues……………..…………………………………………144
Cronbach’s Alpha as an Index of Reliability…..…………………144
Pilot Studies Results………………………………………………145
Data Collection Procedures……………………………………………..147
Data Analysis Procedures……………………….…………….………..……….148
Statistical Methods…………………………………….………………..148
Power and Effect Size…………………………………………………..150
Chapter 5: Data Analysis…………………………..…………..……………………….154
Results………………………………….………………………………………..154
Data Analysis……….………………….……….…………..…………...155
Survey of Attitudes towards Method Used (SAMU)...……….…155
MANCOVA Results………………………….………………….162
Assumptions when Using MANCOVA.………………………....172
Analyzing and Interpreting MANCOVA Results……………..…178
Assumptions when Using ANCOVA.………………………...... 182
Univariate ANCOVA Follow-up Analyses……………………....185
Conclusion………….………………………..……………………………..……190
Chapter 6: Discussion, Conclusions, Recommendations…………………..…………...191
Summary of the Results and Findings…………………………………………..191
Discussion of the Results and Findings ….…………….…………..…………...194
Explanations of the Results………………..…..…………….……….…200
Implications of the Study…..……………………………..….………………….201
Implications for Students and Teachers..………………….…………....201
Implications for Researchers and Practitioners…………………...…….202
Summary of the Implications……………………………….…………...203
Limitations………………………………………………………………………204
Conclusion………….……….…….….…………………….…………………...205
Recommendations for Future Research…………………….…………………...209
References…....……………...………………………………………………………….211
Appendix A: Informed Consent form (Protocol #05E080)...... 241
Appendix B: Ohio University Consent Form Template………….…………….………242
Appendix C: Exercise I (with Integers)..……………………………………………….243
Appendix D: Construction of the Grouped-data Frequency Table (Traditional)………244
Appendix E: Construction of the Grouped-data Frequency Table (Iteration)….………250
Appendix F: Survey of Attitudes towards the Method Used (SAMU)…………………255
Appendix G: Exercise II (with Decimals)....…………………………………………...260
Appendix H: Construction of the Grouped-data Frequency Table (Iteration)….………261
Appendix I: Construction of the Grouped-data Frequency Table (SPSS)….….……….267
Appendix J: A Copy of the E-Mail Sent to the SPSS Corporate Headquarters………...271
Appendix K: List of Abbreviations………………………………………….………….273
Appendix L: Sample Set of Scatter Plots for Pairs of Dependent Variables..………….274
Appendix M: Additional Output of the Exploratory Factor Analysis…...... ……….282
LIST OF TABLES
Table 1. Grade of 80 Students in the Final Statistics Exam...…………………....………21
Table 2. Ordered Grades of 80 Students in the Final Statistics Exam..……………….…23
Table 3. Simple Frequency Distribution of the Grades of 80 Students..…………..……29
Table 4. Grouped-data Frequency of the Grades of 80 Students….………………….….35
Table 5. Differences and Benefits of Iteration Algorithm over Traditional Methods..….57
Table 6. Appropriate Number of Classes in Frequency Distributions as per Keller.....…86
Table 7. Value of k for Different n Numbers according to Each Formula………………90
Table 8. Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Traditional Method with First Option)….…………………………..….116
Table 9. Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Traditional Method with Third Option)……….…………………….….118
Table 10. General Rule for Determining the Number of Classes in a Histogram by
Sincich et al…….………………………………………………………..122
Table 11. Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Iteration Algorithm)……………………………………………………..126
Table 12. Overall Summary Table of Square Roots of Deviations..……………...... 131
Table 13. Summary Table by Sample Size of Square Roots of Deviations…….…...…133
Table 14. Reliability Statistics between Iteration and Traditional Algorithms………...145
Table 15. Inter-item Correlation Matrix for the Iteration Algorithm.…………………..146
Table 16. Inter-item Correlation Matrix for the Traditional Algorithm.…………...... 147
Table 17. Reliability Statistics for Iteration Algorithm and Traditional Method………157
Table 18. Partial outputs of the Initial Exploratory Factor Analysis Summary
(Principal Component Analysis)………………………………………..158
Table 19. Means and Standard Deviations of the SAMU……….….…..………...…….164
Table 20. Inter-correlations of the SAMU (Iteration – Mathematics)…...……...……...166
Table 21. Inter-correlations of the SAMU (Iteration – Psychology)……...... ………...168
Table 22. Inter-correlations of the SAMU (Tradition – Mathematics)….…..……….....169
Table 23. Inter-correlations of the SAMU (Traditional – Psychology)…....…..…….....171
Table 24. Selected Output for Multivariate Analysis of Covariance (PMA)...... ……...175
Table 25. Correlations between Dependent Variables and Covariate……………...... 177
Table 26. Adjusted Means of the SAMU for Method…………………..…………...... 180
Table 27. Adjusted Means of the SAMU for Department……………..…...... 181
Table 28. Adjusted Means of the SAMU for Method by Department…………….....…182
Table 29. Levene’s Test of Equality of Error Variances……..………………..……….183
Table 30. Selected Output for Univariate F-Tests……………………………….…..…187
Table 31. Elementary Statistics Final Exam Grades of 65 Students…………...………243
Table 32. Grouped-data frequency Table for the Grades (Traditional)………………...248
Table 33. Grouped-data Frequency Table for the Weights (Iteration)…..…….…..…...254
Table 34. Thickness (in mm) of 100 Mass-Produced Metal Blocks…...………………260
Table 35. Grouped-Data Frequency and Cumulative Frequency Table for the Thickness
of Metal Blocks Random Sample SAMPLEQ3 (Iteration Algorithm)…265
Table 36. Grouped-Data Frequency and Cumulative Frequency Table for the Thickness
of Metal Blocks Random Sample SAMPLEQ3 (SPSS Algorithm)…….268
Table 37. Comparison of the Basic Statistics from the Raw Data……………………...270
Table 38. Varimax Rotated Factors. Principal Component Extraction
(Iteration Algorithm)……………………………………………………282
Table 39. Varimax Rotated Factors. Principal Component Extraction
(Traditional Method)……………………………………………………283
LIST OF FIGURES
Figure 1. Grades stem-and-leaf plot……………………….….………...………………..26
Figure 2. Histogram produced by the SPSS default format.....………...... ……………....40
Figure 3. Histogram modified with a traditional method’s choices…...... 41
Figure 4. Histogram with the iteration algorithm’s choices..…………..………………127
Figure 5. SampleQ3…………….……………………………….…..…………….……267
Figure 6. Scatter Plot for Value versus Difficulty (Iteration – Mathematics)………….274
Figure 7. Scatter Plot for Cognitive versus Difficulty (Iteration – Mathematics)...... 275
Figure 8. Scatter Plot for Affect versus Difficulty (Iteration – Psychology)..………….276
Figure 9. Scatter Plot for Cognitive versus Difficulty (Iteration – Psychology)..…...…277
Figure 10. Scatter Plot for Affect versus Cognitive (Traditional – Mathematics)...……278
Figure 11. Scatter Plot for Value versus Cognitive (Traditional – Mathematics)…...... 279
Figure 12. Scatter Plot for Value versus Difficulty (Traditional – Psychology)……….280
Figure 13. Scatter Plot for Value versus Affect (Traditional – Psychology)..………….281
CHAPTER 1: INTRODUCTION
Once the data have been gathered from a population of interest, it is often difficult to get a sense of what the data indicate when they are presented in an unorganized fashion. Assembling the raw data into a meaningful form, such as a frequency distribution, makes the data easier to understand and interpret by a broad audience in a relatively short period of time. It is in this context that frequency distributions matter: they convey succinctly, to a general public, the numerical information contained in the data.
Communication is at the heart of statistics, and it involves finding significant patterns in a given dataset (Rangecroft, 2002). This view is shared by Chatfield (1995), who recognized communication as an important component of investigation that should be incorporated into statistics education. Brase and Brase (1999) also concurred by stressing “no matter what type of data is available, sample or population, it is important to be able to organize and present the data to other people” (p. 23). Brase and Brase went on to add “for this purpose, tables and graphs are indispensable” (p. 23). As Hulsizer
(2004) appositely expressed it, tables and graphs allow a researcher to “make sense of his/her own data, communicate his or her results to others who are interested in them, and actually see relationships in the data” (p. 12).
Unfortunately, large data sets are difficult either to communicate to an intended audience or to use for decision making without some logical procedure for organizing and summarizing the data into a meaningful representation. This can be done in a tabulated arrangement, graphically, or statistically (Aron & Aron, 2003; Aryee, 2003; Roscoe, 1975).
Tabulation arrangement transforms this mass of raw data from unmanageable detail into an easy-to-understand summarized form that is more clearly presentable and interpretable (Levin & Fox, 2004). As part of the mission of descriptive statistics, ways must be found to succinctly organize and neatly present the raw data in order to facilitate examination, analysis, and comprehension (Blommers & Forsyth, 1977).
This study focuses on the development and examination of a new method to construct frequency tables for grouped data. A grouped-data frequency distribution is a table listing scores grouped into non-overlapping classes of equal width, listed in one column, along with the number of cases falling in each class, recorded in the adjacent column. A grouped-data frequency distribution is constructed for 3 main reasons: (1) to facilitate the examination, analysis, and comprehension of data, (2) to estimate frequencies of the unknown population distribution from the distribution of sample data, and (3) to facilitate the computation of various statistical measures. Grouped-data frequency distributions can be structured either as a table or a graph; both show basic features that are of interest to a statistician.
In the following sections, 3 major forms of tabulation arrangements, together with two appropriate graphical displays, are gradually introduced using the same relatively large dataset. The goal is to show how, ultimately, the grouped-data frequency table very often brings more clarity and better understanding when summarizing a large set of raw data.
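To ground the definition, the following Python sketch builds a generic grouped-data frequency table from a subset of the grades used later in Table 1. It is only an illustration under assumptions of my choosing (Sturges' rule for the number of classes, and a class width obtained by rounding up the inclusive range divided by that number); it is neither the iteration algorithm developed in this study nor any particular traditional method.

```python
import math

def grouped_frequency_table(data, k=None):
    """Group integer scores into k non-overlapping classes of equal width.

    Illustrative sketch only: the choice of k (Sturges' rule) and of the
    class width (inclusive range divided by k, rounded up) are generic
    textbook conventions, not this study's iteration algorithm.
    """
    if k is None:
        k = round(1 + math.log2(len(data)))        # Sturges' rule
    low, high = min(data), max(data)
    width = math.ceil((high - low + 1) / k)        # inclusive range / k
    table = []
    for i in range(k):
        lower = low + i * width                    # lower class limit
        upper = lower + width - 1                  # upper class limit
        freq = sum(lower <= x <= upper for x in data)
        table.append((lower, upper, freq))
    return table

# First 16 grades from Table 1 of this chapter.
grades = [58, 55, 31, 61, 77, 66, 76, 58, 68, 38,
          44, 49, 54, 41, 29, 81]
for lower, upper, freq in grouped_frequency_table(grades):
    print(f"{lower}-{upper}: {freq}")   # e.g. 29-39: 3
```

Each resulting row pairs a class interval with the number of cases falling in it, mirroring the two-column layout described above.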
Forms of Tabulation Arrangement
As stated above, data in raw form are usually not easy to use for decision making.
Some type of organization is needed. Organizing and presenting tabulated data can be done in 3 different forms, according primarily to the size of the data collected (Aryee, 2003; Hopkins & Glass, 1978).
First, if the size of the data collected ranges from small to medium, the data collected may be arranged in an ordered array. It is an arrangement of data items in either an ascending (from lowest to highest value) or descending (from highest to lowest value) order of magnitude (refer, for instance, to Isaac & Michael, 1997, p. 166).
Second, if the size ranges from medium to large, the data can be presented in a frequency distribution with an interval size (or interval width) of one. Frequency distributions with an interval width of one are referred to as simple frequency distributions; they report each and every observed value of the variable that has been measured, traditionally starting with the smallest value and ending with the highest.
Third, if the sample size is large or very large, the “data are often summarized into a frequency distribution with an interval width greater than one, which reports the number of observations in a set of data within a given range of values of the variable that has been measured” (Raymondo, 1999, p. 48). Frequency distributions with an interval size greater than one will be referred to as grouped-data frequency distributions.
For comparison purposes, the same large dataset is carried throughout the first 3 chapters to exemplify the 3 different forms of tabulation arrangement (that is, ordered array, simple frequency table, and grouped-data frequency table). A frequency distribution is only one major method of data reduction; another is to present the data in some type of graphic display. Since the data used for illustration purposes in this study are quantitative, the two types of graphic display retained are the stem-and-leaf plot for the ordered array data and the histogram for the grouped data.
Presenting Data in an Ordered Array
The simplest way of organizing raw data in a systematic order is usually the ordered array. The ordered array is perhaps most useful when only a few measures are involved (usually 20 or fewer observations). The ordered array helps to see at once the maximum and minimum values. It also gives a rough idea of the distribution of the items over the range.
In contrast, when the number of data items involved is significantly large, the ordered array becomes a rather awkward data organization tool. “Listing the values individually for bigger datasets consumes too much space and makes it difficult to visualize the important patterns in the data” (Sanders, Murphy, & Eng, 1976, p. 45).
Table 1 lists the grades obtained by 80 students in the final exam of an elementary statistics class. The grades are scored out of 100 points and rounded to the nearest whole number.
Table 1
Grades of 80 Students Obtained in the Final Exam of an Elementary Statistics Class

58 55 31 61 77 66 76 58 68 38
44 49 54 41 29 81 90 45 70 40
61 51 48 67 88 67 36 69 71 57
52 58 51 63 92 57 83 86 71 47
64 43 31 78 57 61 58 72 72 41
51 66 62 93 62 84 76 28 58 46
50 46 32 65 85 75 47 57 42 68
37 51 58 63 56 53 79 74 80 73
The scores recorded in Table 1 are called raw data in that they appear in the way they were originally collected (according, here, to the official class roster). Therefore, these numerical values are not organized in any systematic order or processed
(manipulated or treated) in any logical sequence beyond their original form.
Though possible, it takes some time and effort to get a clear idea of these eighty scores by simply looking at the numbers individually in the foregoing form. More precisely, questions such as the following are difficult to answer in a short time.
1. What is the shape of the distribution? Do the observed scores tend to fall into
some recognizable pattern?
2. What is the location, or central tendency, of the random variable being observed?
3. How much variation is involved? Are the values widely dispersed or are they all
fairly close in value? (See Milton, McTeer, & Corbet, 1997, pp. 13-14).
4. Where do most of the scores tend to concentrate?
To hold so many scores in mind at once is obviously hard. To derive any generalized concepts of group performance from a brief inspection of these scores is also tricky. Certain characteristics of the group can, of course, be noted without lengthy scrutiny. For instance, a quick examination of these values shows that all of the data are two-digit numbers, and thus none of them is less than 10 or equals 100 (the possible maximum score a student can obtain).
It is difficult for even a trained eye to quickly determine much else from these data. Even identifying the lowest and the highest scores would take a bit of time. “It is only a hazy impression that can be formed, which may not always be correct”
(Vaidyanathan & Vaidyanathan, 1987, p. 8). Fortunately, with some re-ordering of the eighty scores, these and other characteristics may be noted more quickly and precisely.
Example of an Ordered Array
As a first step, the scores can be rearranged in an ordered array, from lowest to highest or from highest to lowest, as a way of organizing them in some sort of logical order. But this does not reduce the bulk of the data. An ascending-order array is shown in
Table 2 below.
Table 2
Ordered Array for Grades of 80 Students in the Final Exam of a Statistics Class

28 29 31 31 32 36 37 38 40 41
41 42 43 44 45 46 46 47 47 48
49 50 51 51 51 51 52 53 54 55
56 57 57 57 57 58 58 58 58 58
58 61 61 61 62 62 63 63 64 65
66 66 67 67 68 68 69 70 71 71
72 72 73 74 75 76 76 77 78 79
80 81 83 84 85 86 88 90 92 93
Table 2, as presented above, is called an ordered array since the raw data have
been only rearranged from lowest to highest. Now, it can be noticed at a glance that the
smallest score is 28 and the largest score is 93. This information is hidden in Table 1.
From the same information the (non-inclusive) range of the scores can be computed as
65, the difference between 93 and 28. In the context of a statistics class’ scores, a range
of 65 indicates that the scores obtained by these 80 students are widely dispersed. To
know where the data are centered, the mean is often computed. Yet, if the distribution of
the data is heavily skewed, then the median would be a better descriptive measure of the
center than the mean.
Calculating an average score for all the 80 values still consumes as much time as
before, but now the median and mode can be obtained a bit more rapidly. The ‘middle’ or
median score is 58 (the value that splits the ordered array into two equal halves). As to the score that occurs most frequently, or modal score, it is again 58, for this particular value appears six times in the dataset. The average score (or mean) is obtained by adding together all the scores observed and dividing their total by 80, which yields a value of 60.
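These summary values can be checked directly with a few lines of Python using only the standard library. The list below is the raw data of Table 1; sorting it reproduces the ordered array of Table 2.

```python
from statistics import mean, median, mode

# The 80 final-exam grades of Table 1, in their raw (roster) order.
grades = [
    58, 55, 31, 61, 77, 66, 76, 58, 68, 38,
    44, 49, 54, 41, 29, 81, 90, 45, 70, 40,
    61, 51, 48, 67, 88, 67, 36, 69, 71, 57,
    52, 58, 51, 63, 92, 57, 83, 86, 71, 47,
    64, 43, 31, 78, 57, 61, 58, 72, 72, 41,
    51, 66, 62, 93, 62, 84, 76, 28, 58, 46,
    50, 46, 32, 65, 85, 75, 47, 57, 42, 68,
    37, 51, 58, 63, 56, 53, 79, 74, 80, 73,
]

ordered = sorted(grades)                 # the ordered array of Table 2
print(ordered[0], ordered[-1])           # 28 93 (smallest and largest)
print(ordered[-1] - ordered[0])          # 65 (non-inclusive range)
print(median(ordered))                   # 58.0 (middle of the 80 values)
print(mode(ordered))                     # 58 (occurs six times)
print(mean(ordered))                     # 60 (sum of 4800 divided by 80)
```

A program recovers in an instant what takes lengthy scrutiny by eye, which is precisely the motivation for the tabulation arrangements introduced next.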
Presenting Ordered Array Data in a Stem-and-Leaf Plot
A first approach to presenting quantitative data graphically is to form a stem-and-leaf plot. A stem-and-leaf diagram, or stem plot, organizes and graphically represents quantitative data by using the first part of each value's digits as the stem and the remaining part as the leaf. A stem-and-leaf plot is similar to a histogram but is usually a more informative display for relatively small data sets (fewer than 100 data points). It provides a table as well as a picture of the data, and from it the data can readily be written in order of magnitude, which is useful for many statistical procedures. As with the ordered array, the only information lost is the order in which the data were originally collected.
To construct a stem plot, each quantitative measurement is broken into two parts:
The leading digit(s) become(s) the stem, and the trailing digit(s) become(s) the leaf. The stems are located along the main axis, and a leaf for each data value is located so as to display the entire distribution of the data. The stems represent the classes in a graph; the leaves reflect the number of measurements in each class.
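The stem/leaf split just described is mechanical enough to sketch in a few lines of Python. The function name and the one-row sample below are my own choices for illustration; `divmod` by 10 implements the stem width of 10 used in Figure 1.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Map each two-digit value to a tens-digit stem and a units-digit leaf
    (stem width 10); leaves are listed in ascending order beside each stem."""
    plot = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)   # e.g. 29 -> stem 2, leaf 9
        plot[stem].append(leaf)
    return dict(plot)

# First row of Table 1, for a compact display.
sample = [58, 55, 31, 61, 77, 66, 76, 58, 68, 38]
for stem, leaves in sorted(stem_and_leaf(sample).items()):
    print(stem, ".", "".join(str(leaf) for leaf in leaves))
```

Applied to all 80 grades, the same split reproduces the stems 2 through 9 shown in Figure 1.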
Advantages of a Stem-and-Leaf Plot
First, a stem-and-leaf diagram (or stem plot) is similar to a frequency histogram
but has the advantage that the graph still contains the original data values. Second, a
stem-and-leaf plot provides a simple way to sort data. Third, the stem plot might not always be easier to construct, but it shows more detail than the comparable frequency histogram.
Disadvantage of a Stem-and-Leaf Plot
There is one major disadvantage. A stem-and-leaf plot is not as useful as the comparable frequency histogram if there are a large number of data values or if the data values have a large number of digits.
Example of a Stem-and-Leaf Plot
Figure 1 displays a stem-and-leaf plot for the final statistics exam grades of 80 students. Since the grades recorded are two-digit numbers, the leading digit will be the stem and the trailing digit the leaf. Here, one has to use the 10’s digit for the stem unit:
29, for instance, is shown as “2 . 9”. In other terms, with the number 29, “2” represents the stem, while “9” serves as the leaf.
Frequency Stem & Leaf
2.00 2 . 89
6.00 3 . 112678
13.00 4 . 0112345667789
20.00 5 . 01111234567777888888
16.00 6 . 1112233456677889
13.00 7 . 0112234566789
7.00 8 . 0134568
3.00 9 . 023
Stem width: 10
Each leaf: 1 case(s)
Figure 1. Grades stem-and-leaf plot.
By observing the above stem-and-leaf plot, it can be seen that the mode is 58 and the distribution of the grades is roughly symmetrical around this modal value. There are no outliers and no gaps spotted, though it can be argued that the ordered array may make these characteristics appear even more clearly.
Presenting Data in a Simple Frequency Table
A small to moderate dataset can be more usefully represented in a simple (or ungrouped-data) frequency table. A simple frequency table is a listing (generally arranged in order of magnitude) of each observed score of a variable, listed in one column, together with the number of times that score occurred, recorded in an adjacent column (Aryee, 2003; Sprinthall, 2000). The recorded number of times each score occurred is referred to as its frequency; hence the term “simple frequency table” or,
equivalently, “simple frequency distribution” (Aryee, 2003; Sprinthall, 2000; Raymondo, 1999). This type of tabulation arrangement is usually used when there are fewer than 100 observations and/or when the range of unit score categories is moderate. Berenson and
Levine (1998), especially, recommended “a maximum of about 20 distinct values” (p.
28).
Some authors see the simple frequency distribution as an enhancement over the ordered array in terms of further simplifying inspection of the data and making more sense out of them (Elifson, Runyon, & Haber, 1990, p. 55; Glasnapp and Poggio, 1985, p.
58; Hamburg, 1989, p. 9; Hinkle, Wiersma, & Jurs, 2003, p. 22; Hole, 2000, p. 2; Spence,
Wiersma, & Jurs, 1990). In particular, Glasnapp and Poggio asserted that the simple frequency table “is very efficient and communicates maximum information” (pp. 58-59).
Additionally, with a simple frequency table, it is possible to start seeing certain patterns and regularities which might not otherwise be revealed if the data were not organized
(Aryee, 2003).
Example of a Simple Frequency Table
It can easily be verified that Table 3 contains more than the maximum of about 20 distinct values recommended by Berenson and Levine (1998). The dataset in Table 1 is evidently too large to be conveniently portrayed by a simple frequency table. However, the same dataset was kept to make comparisons easier.
An enhanced picture of the raw data displayed in Table 1 emerges by stacking together the identical scores. The obtained scores are listed in one column called ‘Score’
(from the lowest score observed to the highest score observed) and each of their
respective frequencies (number of occurrences) is indicated in the adjacent column labeled ‘Frequency’, as shown in Table 3 below.
Table 3 is called a simple frequency distribution or simple frequency table. It gives a better sense of the nature of the grades shown in Table 1 than does the ordered array presented in Table 2. In particular, an overall picture of how the scores are distributed (or scattered) now emerges. The scores in Table 3 occupy only 55 rows, down from 80 (had Table 2 been presented in a single column rather than in the space-saving arrangement used here).
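The mechanics described above — listing each distinct score with its frequency, relative frequency, and cumulative relative frequency in percent — can be sketched in a few lines of code. The following Python sketch is illustrative only and is not part of the dissertation; the sample scores shown are hypothetical, not the Table 1 data.

```python
from collections import Counter

def simple_frequency_table(scores):
    """Build a simple (ungrouped) frequency table: for each distinct
    score x_i, list its frequency n_i, relative frequency f_i = n_i / n,
    and cumulative relative frequency F_i in percent."""
    n = len(scores)
    counts = Counter(scores)
    table = []
    cumulative = 0.0
    for score in sorted(counts):          # lowest to highest observed score
        ni = counts[score]
        fi = ni / n
        cumulative += fi
        table.append((score, ni, round(fi, 4), round(100 * cumulative, 2)))
    return table

# Hypothetical mini-dataset for illustration
scores = [28, 29, 31, 31, 32, 58, 58, 93]
for row in simple_frequency_table(scores):
    print(row)
```

Each returned row mirrors one row of Table 3: (xi, ni, fi, Fi in percent).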
Table 3
Simple Frequency Table of the Grades of 80 Students in a Statistics Final Exam

Score, xi    Frequency, ni    Relative frequency, fi    Cumulative relative frequency in percent, Fi
28 1 0.0125 1.25
29 1 0.0125 2.50
31 2 0.0250 5.00
32 1 0.0125 6.25
36 1 0.0125 7.50
37 1 0.0125 8.75
38 1 0.0125 10.00
40 1 0.0125 11.25
41 2 0.0250 13.75
42 1 0.0125 15.00
43 1 0.0125 16.25
44 1 0.0125 17.50
45 1 0.0125 18.75
46 2 0.0250 21.25
47 2 0.0250 23.75
48 1 0.0125 25.00
49 1 0.0125 26.25
50 1 0.0125 27.50
51 4 0.0500 32.50
52 1 0.0125 33.75
53 1 0.0125 35.00
54 1 0.0125 36.25
55 1 0.0125 37.50
56 1 0.0125 38.75
57 4 0.0500 43.75
58 6 0.0750 51.25
61 3 0.0375 55.00
62 2 0.0250 57.50
63 2 0.0250 60.00
64 1 0.0125 61.25
65 1 0.0125 62.50
66 2 0.0250 65.00
69 1 0.0125 71.25
70 1 0.0125 72.50
71 2 0.0250 75.00
72 2 0.0250 77.50
73 1 0.0125 78.75
74 1 0.0125 80.00
75 1 0.0125 81.25
76 2 0.0250 83.75
77 1 0.0125 85.00
78 1 0.0125 86.25
79 1 0.0125 87.50
80 1 0.0125 88.75
81 1 0.0125 90.00
83 1 0.0125 91.25
84 1 0.0125 92.50
85 1 0.0125 93.75
86 1 0.0125 95.00
88 1 0.0125 96.25
90 1 0.0125 97.50
92 1 0.0125 98.75
93 1 0.0125 100.00
Total n = 80 1.0000
Although the scores are spread across the whole range, they tend to be concentrated (or to cluster) around a central value. For example, it is readily apparent that there are more scores from the mid-fifties to the mid-sixties (around the median value) than in any other region; only 5 scores are less than 35, and only 5 other scores are above 85. This tendency was not as apparent from the unorganized (raw) data contained in Table 1.
Note that in Table 3, xi at the top of the first column stands for the ith raw score (in this case, grade), ni over the second column represents the frequency or number of occurrences of that score, and n in the last row designates the total number of subjects or scores observed. The symbols fi and Fi correspond to the relative frequency and the cumulative relative frequency in percent, respectively.
While the data presentation has now become much clearer, Table 3 still contains too many details. Even with 55 rows and 2 columns (instead of 80 rows and 1 column), it does not adequately simplify and condense the presentation of the data.
The grouped-data frequency table presented in the next section again uses the same dataset for comparison purposes. A brief presentation of the approach precedes the illustration.
Presenting Data in a Grouped-Data Frequency Table
Interval-level scores are sometimes spread over a wide range, making the
resultant simple frequency distribution long and difficult to read. When such instances
occur, few cases may fall at each score value, and the group pattern becomes blurred
(Levin & Fox, 2004, pp. 26-27). When there are at least 50 observations (or at least 20
distinct values according to Schloss & Smith, 1999, p. 147), a simple frequency
distribution is long and bulky (Khazanie, 1986, p. 11; Spence et al., 1990; Shavelson,
Ruiz-Primo & Mitchell, 1996, p. 39). It is therefore not an efficient way to organize and display a vast dataset (Glasnapp & Poggio, 1985, p. 59; Raymondo, 1999, p. 50).
If the variable studied is a quantitative continuous variable, a more compact and more accurate form of data organization, called a grouped-data frequency table, may be developed without obscuring the essential information contained in the values (Gravetter & Wallnau, 1985, p. 37; Sanders et al., 1976, p. 47). Frequency distributions for grouped data are employed to reveal or emphasize a group pattern
(Levin & Fox, 2004, pp. 26-27) with greater convenience and clarity. This ancillary form of tabulation arrangement consists of combining adjacent score values into groups or classes and summing up their corresponding frequencies.
Grouped-data frequency distributions can be structured either as tables or as graphs; both show basic features that are of interest to a statistician: the overall shape of the data, symmetry, location and spread, the existence of outliers, and evidence of clusters or gaps (see, for instance, Arsham, 2001; Mays, 2005).
Example of a Grouped-Data Frequency Table
The same dataset as presented in Table 1 is used again here. This example aims to illustrate a few advantages a grouped-data frequency table might have over a simple frequency table for the same set of raw data.
Table 4 displays the same raw data contained in Table 1, but the data are now grouped into classes in what is called a grouped-data frequency distribution or a frequency table for grouped data. Here the grades are presented within broader categories along with the number or frequency of students whose grades fell into these categories. A
traditional method (Vaidyanathan & Vaidyanathan, 1987, pp. 9-11) served in determining the 4 key elements essential for building a frequency distribution of the grouped data as depicted in Table 4.
Specific details related to how Table 4 was constructed using this traditional procedure (Vaidyanathan & Vaidyanathan, 1987) are described in chapter 3 under the method development section for the traditional method. In the meantime, the 4 computed key elements are as follows: the inclusive range for the grouped data is equal to 72 (that is, the difference between 96 and 25 plus 1, or 96.5 minus 24.5), the number of classes is 8, the class width is 9, and the starting point is 25. Both the smallest observed score (i.e.,
28) and the largest observed score (i.e., 93) are included in the retained frequency distribution of grouped data.
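Given the four key elements reported above (starting point 25, class width 9, 8 classes, hence an inclusive range of 72), the grouping itself is mechanical. The following Python sketch is illustrative only (the function name and layout are this writer's, not the traditional method's), and the mini-dataset is hypothetical since the Table 1 data are not reproduced here.

```python
def grouped_frequency_table(scores, start, width, num_classes):
    """Group integer raw scores into classes given three of the key
    elements: start (lowest class limit), class width, and number of
    classes; the inclusive range covered is num_classes * width.
    Each row holds: (lower limit, upper limit, lower boundary, upper
    boundary, midpoint, frequency, cumulative frequency, relative
    cumulative frequency in percent)."""
    n = len(scores)
    rows = []
    cumulative = 0
    for k in range(num_classes):
        lower = start + k * width            # class limits, e.g. 25 to 33
        upper = lower + width - 1
        freq = sum(lower <= x <= upper for x in scores)
        cumulative += freq
        rows.append((lower, upper, lower - 0.5, upper + 0.5,
                     (lower + upper) / 2, freq, cumulative,
                     round(100 * cumulative / n, 2)))
    return rows

# Hypothetical mini-dataset; with start=25, width=9, num_classes=8 the
# class limits, boundaries, and midpoints match those of Table 4.
rows = grouped_frequency_table([28, 29, 40, 58, 93], 25, 9, 8)
```

With these arguments the first class is 25 to 33 (boundaries 24.5 to 33.5, midpoint 29) and the last is 88 to 96, exactly as in Table 4.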
Table 4
Grouped-data Frequency Table for Grades of 80 Students in the Final Statistics Exam
(Traditional Method)
Class limit    Class boundary    Frequency, ni    Class midpoint, xi    Cumulative frequency    Relative cumulative frequency in percent
25 to 33 24.5 to 33.5 5 29 5 6.25+
34 to 42 33.5 to 42.5 7 38 12 15.00
43 to 51 42.5 to 51.5 14 47 26 32.50
52 to 60 51.5 to 60.5 15 56 41 51.25
61 to 69 60.5 to 69.5 16 65 57 71.25
70 to 78 69.5 to 78.5 12 74 69 86.25
79 to 87 78.5 to 87.5 7 83 76 95.00
88 to 96 87.5 to 96.5 4 92 80 100.00
Total n = 80
Note: +6.25 = (5 * 100)/80.
A frequency table for grouped data, like the one of Table 4, appears to be a more
concise and efficient method for ordering the observations collected. In particular, the
reclassification makes the data more manageable and more comprehensible at a glance.
Besides, salient features or patterns in the data are revealed and become easier to detect
and interpret. For example, it is readily apparent that only 5 scores are less than 34, and only 4 scores are above 87; the modal interval ranges from 61 to 69, since it has the highest frequency, which is 16; the median class ranges from 52 to 60, because the 50% cumulative relative frequency value falls in that class; and the grade distribution is slightly skewed to the left.
A major inconvenience of using grouped data rather than the raw data is the loss of precision. For instance, the grouped data no longer have the same range as the original dataset (raw data): with the raw data the range was found to be 65, whereas with the grouped data it is 71 (that is, 96 - 25). Also, obtaining the mean or even the median becomes no easier.
The issue of grouping data into frequency tables is perhaps not as important as it once was. Dubois (1982), for example, pointed out that for “further statistical manipulations and where calculators and computers are available, the grouped-data frequency distribution has only tangential relevance, and the behavioral scientist is best advised to use the raw data” (p. 23).
Sokal and Rohlf (1995), for their part, noted that “Grouping data into frequency distributions was necessary when computations were done by pencil and paper or with mechanical calculations” (p. 28). They went on to add: “Nowadays, even thousands of variates can be processed efficiently by computer without prior grouping” (p. 28). As far back as 1964, Adkins acknowledged that “with the advent of computers, even large-scale analysis is performed with ungrouped data” (p. 38).
Even if the comments made by Adkins (1964), Dubois (1982), and Sokal and Rohlf (1995) are legitimate, “occasions for grouping data arise sufficiently often that the topic should not be ignored” (Adkins, 1964, p. 38). Also, Frankfort-Nachmias and Leon-Guerrero (2000) pointed out that “grouped frequency distributions are very useful when the raw data are not available” (p. 47).
Roscoe (1975) affirmed that grouping data into tables may be a useful technique in descriptive statistics, where the grouped-data frequency distribution is an “end product” rather than an intermediate step in a series of calculations of descriptive statistical measures. Healey (2002) contended that “the construction of frequency distributions is almost always the first step in any statistical analysis” (p. 28). He further added: “Clearly the data need to be organized in a format that allows the researcher (and his or her audience) to discern easily any patterns or trends of the distribution of the variable under study” (p. 28). More recently, Weiss (2004) noted: “Grouping data is one of the most common methods of organizing raw data” (p. 44).
To get rid of the cumbersome problem of dealing with too many raw scores, it is necessary to sacrifice some precision (May, Masson, & Hunter, 1990). Further, Dubois
(1979) emphasized that “this loss of precision is more than offset by the gains of increased compactness of presentation and new knowledge of the pattern of how the items distribute themselves” (p. 27).
Presenting Data in a Histogram
The most common graph of the distribution of one quantitative variable, and also the most popular method of summarizing numerical information, is the histogram (see, for example, Moore, 2000, p. 8; Triola, 2001, p. 12; Wand, 1997, p. 59). A histogram is a graphical technique applied to a set of interval data to describe how the data are distributed. The histogram clearly “portrays information on location, spread, and shape”
(Keller, 2001, p. 22). The histogram is only appropriate for variables whose values are
numerical and measured on an interval scale. It is generally used when dealing with large datasets (more than 100 observations), for which stem-and-leaf plots become tedious to construct. A histogram can also help to detect unusual observations (outliers) or gaps in the dataset.
Histograms group data like stem-and-leaf plots, but the categories are not constrained by the decimal structure of the numbers. Each category should be of the same size, the categories should be contiguous, and every data point must fall in exactly one category or class. The actual values of the original data are not retained, but the number of data points in each class is known. For the sake of learning by doing, a histogram can be constructed manually by drawing a bar above each class interval (or, alternatively, class midpoint) so that the width of the bar extends to the real limits of the class interval and the height of the bar corresponds to the frequency (denoted by ni in Table 4). The bars are usually depicted vertically as “columns”, which are shown on the ordinate. The classes are represented horizontally along the baseline axis, or abscissa (Gravetter & Wallnau, 2000; Hopkins & Glass, 1978).
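The manual construction just described — one bar per class whose length equals the class frequency ni — can be mimicked with a crude text rendering. This is an illustrative Python sketch only (a real histogram would of course be drawn graphically); the class labels and frequencies are taken from Table 4.

```python
def text_histogram(class_labels, frequencies, bar_char="#"):
    """Render a crude horizontal histogram: one bar per class interval,
    with bar length equal to the class frequency n_i."""
    label_width = max(len(lbl) for lbl in class_labels)
    lines = []
    for lbl, freq in zip(class_labels, frequencies):
        lines.append(f"{lbl:>{label_width}} | {bar_char * freq} ({freq})")
    return "\n".join(lines)

# Class intervals and frequencies from Table 4
labels = ["25-33", "34-42", "43-51", "52-60", "61-69", "70-78", "79-87", "88-96"]
freqs = [5, 7, 14, 15, 16, 12, 7, 4]
print(text_histogram(labels, freqs))
```

The longest bar (16 marks, class 61 to 69) identifies the modal interval at a glance.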
Although the current trend is to resort to statistical packages to generate histograms automatically, there are still many instances where computers are inaccessible or the histograms produced by their default settings are simply unsatisfactory.
Also, as put by Moore and Notz (2006), “statistics software will choose classes of a histogram for you, and may use slightly different rules from one computer software to another. The computer’s choice is usually a good one, but you can change it if you want”
(p. 207).
On one hand, in a situation where computers are unavailable, a grouped-data frequency table can be built by hand before being rendered graphically in the form of a histogram. On the other hand, a default histogram generated by computer software might be too skewed or have too many empty classes. In such circumstances, an appropriate algorithm might be used to decide on the number of class intervals the histogram should have, its class width, and the starting point. Using the same dataset, a histogram was generated initially by the default settings of SPSS. This histogram was later changed on the basis of choices obtained by a traditional method. How these choices were made will be fully explained in chapter 3.
Example of a Histogram
Figure 2 portrays a histogram as initially generated by SPSS in its default format. Replacing all the default grade values by the class boundaries yielded by a traditional method, SPSS produced Figure 3, a modified version of the histogram in Figure 2. A quick look at the two graphs reveals that Figure 2 contains 17 different classes or adjoining bars, whereas the transformed histogram in Figure 3 has only 8 classes.
Additionally, although both histograms are symmetrical around the center (grade 60), the histogram displayed in Figure 2 has 3 peaks, whereas the transformed histogram in Figure 3 has only one peak. Since the raw data as presented in Table 1 have one mode, the transformed histogram may be a relatively more satisfactory graphical representation of the Table 1 data than the original histogram.
[Figure: vertical-bar histogram titled “Histogram of the students' grades”; x-axis: Grades (20.00 to 100.00), y-axis: Frequency; Mean = 60.00, Std. Dev. = 15.96, N = 80]
Figure 2. Histogram of the students’ grades (default SPSS format).
[Figure: vertical-bar histogram titled “Histogram of the students' grades” with 8 classes (bar frequencies matching Table 4: 5, 7, 14, 15, 16, 12, 7, 4); x-axis: Grades (27.0 to 99.0), y-axis: Frequency; Mean = 60.00, Std. Dev. = 15.96, N = 80]
Figure 3. Histogram of the students’ grades (Traditional method’s choices).
Rationale for the Study
In our contemporary society, there is a constant need for all people to understand, interpret and analyze numerical information displayed in tabular or graphical forms.
Sullivan (1993) emphasized this point by asserting that a data “presentation, which includes charts, graphs, and summary statistics, is often the preferred mode in business publications, research reports, news weeklies, and daily newspapers” (p. 1). Statistics users in general, and students in particular, need to recognize how information may be exhibited in a misleading manner, which often leads to false conclusions. Sullivan further maintained that knowing how to construct a grouped-data frequency table or histogram might help “students in particular to detect some of the misleading tabular or graphical representation of data, or at least be able to read the professional literature and interpret charts and summaries contained in the texts” (p. 13).
As documented in chapter two, the review of the related literature identified 4 elements as essential for making frequency tables of grouped data. These 4 key elements are: the inclusive range, the number of classes, the class length or class width, and the starting point. At the moment, while various procedures exist for creating grouped-data frequency tables, none of them is known to determine all 4 of these key elements.
In addition, the rules, guidelines, or principles accompanying existing methods are highly subjective and thus arbitrary (Doane, 1976). This is because they are open to personal interpretations or choices and hence leave too much room for guessing. As a result, statistics users would not necessarily come up with the same grouped-data frequency table if provided with the same set of raw data (Davies, 1929;
Davies & Goldsmith, 1980; Devore & Peck, 2005; Groebner & Shannon, 1981; Hoaglin,
Mosteller, & Tukey, 1983; Jaeger, 1990; Scott, 1979; Shaw, Huffman, & Haviland,
1987).
According to Aron and Aron (1997, p. 7) and Wessel and Willett (1960, p. 80), the big questions in designing a grouped-data frequency table are, first, determining the number of intervals and, second, the interval size or interval width. However, under contemporary practice, the “number of classes is just arbitrary, and so also is the point at which to begin the lowest class and the point at which to end the last class” (Kerr, Hall, &
Kozub, 2002, p. 11).
It is in trying to address the above concerns that the iteration algorithm was conceived as a structured way of summarizing and displaying collected data in a frequency distribution of grouped data. It was developed in response to the lack of clear agreement about which existing formula works best and the lack of clarity in the instructions provided by the existing methods for creating grouped-data frequency tables.
Most existing methods leave too much room for subjectivity and arbitrariness, which in turn are sources of bias. In contrast, the iteration algorithm utilizes five formulas and works in a systematic step-by-step fashion that leaves no room for guessing by providing clear, unambiguous directions that yield unique grouped-data frequency tables; that is, all statistics users are expected to produce the same table for the same set of raw data. It also saves time by relieving the pain of the trial-and-error approaches associated with most of the traditional methods.
The iteration algorithm is original in a number of ways. First, the name stems from the fact that this protocol operates by successive iterations, though at times a single iteration may suffice. Second, the iteration algorithm works each time with a unique set of five formulas. These formulas serve in determining, en masse, the 4 aforementioned key elements. Third, all five formulas used are either completely new or adapted to take care of the iterations. Fourth, in the absence of computers (chiefly in developing countries), the production of a unique fitting table by many statistics students working on the same dataset would make the whole marking exercise less tedious for instructors teaching large statistics classes, because those instructors would be left with just a single grouped-data frequency table to examine and grade.
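The five formulas themselves are developed in chapter 3 and are not reproduced here. Purely to convey the flavor of a deterministic search for a “compatible trio” (inclusive range, number of classes, odd class width) that every user would reach identically, the following toy Python sketch is offered. Every rule in it (the candidate widths, the bounds on the number of classes, the compatibility condition) is an assumption of this sketch, not the dissertation's algorithm, and it does not reproduce the dissertation's results.

```python
import math

def first_compatible_trio(scores, k_min=5, k_max=20):
    """Toy illustration ONLY (not the dissertation's iteration algorithm):
    scan odd candidate class widths in increasing order and stop at the
    first width whose implied number of classes lies between k_min and
    k_max, returning the trio (inclusive range, number of classes, width).
    Because the scan order is fixed, every user obtains the same trio."""
    low, high = min(scores), max(scores)
    inclusive_range = high - low + 1      # integer data: one unit of measurement
    for width in range(1, inclusive_range + 1, 2):   # odd widths only
        num_classes = math.ceil(inclusive_range / width)
        if k_min <= num_classes <= k_max:
            return inclusive_range, num_classes, width
    return None
```

The point of the sketch is the determinism: given the same raw data and the same rules, the first solution found is unique, so all users produce the same grouped-data frequency table.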
Statement of the Problem
The general purpose of this study was twofold. First, it strove to establish evidence for the efficacy of the iteration algorithm as a process for statistics users constructing their own frequency tables of grouped data. Second, it assessed the effectiveness of the iteration algorithm as an instructional method in contrast to a commonly used method.
To start with, Monte Carlo simulations were used to demonstrate that the outcomes produced by the iteration algorithm are, based on 9 statistical measures obtained directly from the raw data, overall comparable to the corresponding outputs yielded by a leading statistical package, SPSS. More specifically, the absolute deviations between the raw data and the grouped data yielded by the iteration algorithm were compared to the corresponding deviations between the raw data and the grouped data produced by the SPSS algorithm, for each of the 9 statistical measures considered. The main outcomes of the Monte Carlo simulations are reported in chapter 3.
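The deviation measure can be illustrated for one of the 9 statistics, the mean: a grouped-data mean is estimated from class midpoints and frequencies, and its absolute deviation from the raw-data mean is recorded. This is a simplified Python sketch covering a single measure (the simulations in chapter 3 cover all 9); the midpoints and frequencies below are those of Table 4, and the raw-data mean of 60.00 is the value reported in Figure 2.

```python
def grouped_mean(midpoints, frequencies):
    """Estimate the mean from grouped data: sum(n_i * x_i) / n."""
    n = sum(frequencies)
    return sum(m * f for m, f in zip(midpoints, frequencies)) / n

def absolute_deviation(raw_stat, grouped_stat):
    """Deviation as defined in the text: |raw statistic - grouped statistic|."""
    return abs(raw_stat - grouped_stat)

# Table 4 midpoints and frequencies versus the raw-data mean of 60.00
midpoints = [29, 38, 47, 56, 65, 74, 83, 92]
freqs = [5, 7, 14, 15, 16, 12, 7, 4]
dev = absolute_deviation(60.00, grouped_mean(midpoints, freqs))
```

The smaller such deviations are across the 9 measures and across samples, the closer the grouped-data summary stays to the raw data.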
The iteration algorithm was compared only to the one used by SPSS because both algorithms provide unique solutions (the first plausible solution obtained by the iteration algorithm against the default generated by SPSS). Besides, the statistical software SPSS was chosen because, according to Willis and Seung (2006), this package “is the most commonly used statistical software package in the social sciences” (p. 12). Acock (2005) reported that “SPSS software package has emerged as one of the three dominant tools for data management and standard statistical analysis” (p. 1093).
Then, after the students who voluntarily participated in the survey had been exposed to the iteration algorithm and a commonly used approach found in many
statistics textbooks and related literature, an instrument was administered to each group of participants to evaluate not only their attitudes towards either instructional method, but also their ability to use either method.
Since the SPSS algorithm is unknown to most statistics users, the iteration algorithm is compared only to a textbook technique to assess its worth as an instructional device. Many electronic mail messages sent to the SPSS Corporate Headquarters requesting this type of information remained unanswered, and the site www.spss.com does not contain that information either. A letter requesting the SPSS algorithm is provided in Appendix J.
The second aspect of this study intended to assess the effectiveness of the iteration algorithm as a teaching and learning method. Students’ attitudes towards the iteration algorithm were compared to those towards a traditional method. While the
Monte Carlo simulations are necessary in showing the iteration algorithm’s relative accuracy, it is the second aspect that was the focus of the present work. Towards this end, an instrument was developed to explore the effects of the instructional method used
(either the iteration or the traditional method) on students’ attitudes and perceptions concerning the creation of grouped-data frequency tables.
To address these two major issues, 4 research questions were posed. The first research question aimed at assessing the worth of the iteration algorithm as a process. The other 3 research questions were designed to evaluate, among other things, the students’ attitudes towards the instructional methods used, after factoring out differences in the students’ perceived ability in statistics.
Given that the traditional method chosen leads to many plausible solutions, it was not possible to compare the two instructional methods based on students’ self-efficacy. Consequently, there were no specific research questions on self-efficacy.
There is a great deal of literature comparing instructional practices in teaching statistics in general (Bradstreet, 1996; Garfield, 1995; Hogg, 1991; Moore, 1997a; Moore, 1997b; Moore, 2001). In addition, the expected course grade and perceived mathematics ability have been cited by many authors (Bandura, 1986; Galagedera, Woodward, & Degamboda, 2000; Onwuegbuzie, 2003) among the factors contributing either directly or indirectly to students’ perceived ability in statistics.
For instance, in his path analysis model of statistical self-efficacy, Onwuegbuzie
(2003) retained students’ expectations of their performance in statistics examinations as one of the two cognitive variables (the other variable was study habits). The self-efficacy theory and the rationale for initially choosing the expected course grade are briefly discussed below.
The hypothesis that the expected statistics course grade is related to statistics self-efficacy stems from Bandura’s (1977, 1986) theory of social cognition. Bandura defined self-efficacy, a subset of cognition theory, as one’s perceived judgment of one’s ability to perform in certain situations, and went on to state that it is central to human behavior. Because of the cognitive capacity for self-referent thought, Bandura held that individuals are able to evaluate their skills and to report their confidence levels for performance on given tasks. Individuals’ beliefs about their ability to perform at a certain level (that is, self-efficacy) greatly affect their actions (Bandura, 1977).
Simply put, self-efficacy theory predicts that an individual’s belief system influences behavior choice, effort invested, persistence, and task success. According to this conceptualization, people tend to engage in activities that they believe they can undertake, control their efforts, persevere until the intended level of performance is accomplished, and then evaluate their performance against previous expectations.
Consequently, as noted by Onwuegbuzie (2002), students’ expectations of their performance are an important manifestation of their levels of self-efficacy. Thus, it was hypothesized that students’ expectations of success in a statistics examination would significantly predict their eventual performance levels.
Galagedera et al. (2000) further observed a significant positive association between perceived mathematics ability and the expected grade in an elementary statistics course. As a result, students who believe that they have a low level of mathematical ability are likely to put themselves down as poorly performing candidates in elementary statistics courses.
It was therefore the goal of this study to see whether the students’ perceived ability in statistics had any influence on their attitudes towards the instructional method taught when making a frequency table of grouped data. In addition, the study was interested in the relationship between the instructional method selected and the students’ attitudes and background (Geske, Mickelson, Jonson, & Smith, 2000) after removing the effects of their perceived ability in statistics.
Research Questions
For the Monte Carlo simulations aspect, the independent variable entertained is the algorithm used (iteration versus SPSS), examined over 9 statistical measures (mean, median, standard deviation, inclusive range, first and third quartiles, semi-interquartile range, coefficient of relative skewness, and coefficient of variation). The dependent variables are composed of deviations. A deviation is defined here as the absolute value of the difference either between each of the raw-data statistical measures and the corresponding iteration algorithm statistical measure, or between each of the raw-data statistical measures and the corresponding SPSS algorithm statistical measure.
To measure the students’ attitudes towards the instructional method taught, an instrument called Survey of Attitudes towards the Method Used (SAMU) was developed.
As found in the original Survey of Attitudes towards Statistics (SATS) instrument developed by Schau, Stevens, Dauphine, and Del Vecchio (1995), the SAMU instrument has 4 components (affect, cognitive competence, value, and difficulty). These 4 components are the 4 dependent variables (Dauphine, Schau, & Stevens, 1997; Mills,
2004). The two independent variables considered are the instructional method used
(iteration algorithm or traditional method) and the academic department (psychology or mathematics), while the students’ perceived ability in statistics serves as the covariate.
Monte Carlo Simulations
Research question 1: On average, are the square roots of deviations between the 9 statistical measures computed directly from the raw data and the corresponding 9 statistical measures obtained from the grouped data produced by the iteration algorithm less than those obtained from the grouped data yielded by the SPSS algorithm, for varying sample sizes?
This research question had to be answered before the investigations of the others could begin. An answer to this research question may be found in chapter 3.
Survey of Attitudes towards the Method Used
Research question 2: Is there an overall interaction between the academic department and the instructional method used, after adjusting for differences in the students’ perceived ability in statistics among groups?
Research question 3: Are there differences in the mean attitudes scored by students as a function of the academic department, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics?
Research question 4: Are there differences in the mean attitudes scored by
students as a function of the instructional method used, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’
perceived ability in statistics?
Research Hypotheses
This work was meant to examine the effects of the algorithm used and its
interaction with students’ attitudes and their respective academic department in actually
constructing a frequency table for grouped data. Four hypotheses were formulated.
Monte Carlo Simulations
Initial screening of the results suggested that, for each of the 9 statistical
measures, the grouped data produced by using the iteration algorithm were much closer to the corresponding values computed directly from the raw data than were the values yielded by the SPSS algorithm. Based on these findings, the following prediction was made.
Research hypothesis 1: For each of the 9 statistical measures, the square roots of deviations between the raw data and the grouped data yielded by the iteration algorithm are hypothesized to be, on average, less than those between the raw data and the grouped data produced by the SPSS algorithm, for varying sample sizes.
Survey of Attitudes towards the Method Used
Since the SAMU instrument is similar to the original instrument (SATS), it is hoped that the results from the two instruments will be comparable. Based on the results found by Schau, Stevens, Dauphine, and Del Vecchio (1995), Dauphine, Schau, and Stevens
(1997), Schultz, Drogosz, White, and Distefano (1998), and Mills (2004), among others, the following predictions were made.
Research hypothesis 2: It is assumed that there is no overall interaction between the academic department and the instructional method used, after adjusting for differences in the students’ perceived ability in statistics among groups.
Research hypothesis 3: It is conjectured that there are differences in the mean attitudes scored by students as a function of the academic department, as measured by the
4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics.
Research hypothesis 4: It is theorized that there are differences in the mean
attitudes scored by students as a function of the instructional method used, as measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics.
All hypotheses will be tested at the .01 significance level (that is, α = .01). Although it is conventional to conclude that the null hypothesis is false if the probability value is less than .05, more conservative researchers conclude that the null hypothesis is false only if the probability value is less than .01. A smaller α-level (say .01) demands a more extreme test statistic before the null hypothesis can be rejected than a larger α-level (say .05) does. Smaller α-levels therefore run a greater risk of failing to reject a false null hypothesis (a Type II error) and so have less statistical power. The selection of an α-level inevitably involves a compromise between significance and power, and consequently between the Type I error and the Type II error.
The .01 level was chosen for this study for one major reason: it was the significance level used in the article containing the original SATS instrument and in many other articles using that instrument. For comparison purposes, the .01 level of significance was therefore retained in the present study.
Significance of the Study
Differences between the Iteration Algorithm and the Traditional Method
As mentioned before, the iteration algorithm is different from existing methods because it not only works by iterations but also suffices by itself. It has five straightforward formulas that are applied to determine the 4 key elements necessary for constructing a frequency table of grouped data. In practical terms, it iteratively utilizes the five formulas to perform the following:
1. Calculate the inclusive range of grouped data. The formula for the inclusive range of integer data, which is the difference between the largest and smallest observations plus one unit of measurement, is modified here to take account of both the rank of the iteration and the precision of the raw data (both whole numbers and decimals).
2. Decide on the proper number of classes using two formulas. This requires first setting up the minimum number of class intervals and then finding the maximum number of class intervals. The suitable number of classes retained must lie between these two extreme values.
3. Work out the class width. This is done by reaching the first compatible trio
composed of the inclusive range for grouped data, the number of class intervals to
be retained in the distribution, and the corresponding class width. The class width
must be an odd number (having the same precision as the collected data) and be
equal for all class intervals in the distribution.
4. Compute the pertinent starting point (that is, the lowest class boundary). Two new formulas are applied according to whether the rank of the final iteration is an even or odd number.
5. Once the number of classes, each class interval width, and the starting point of a
frequency distribution for grouped data have been determined, they may be used
to create a histogram of the raw data (or to modify a histogram generated by
default by any statistical package). The frequency distribution for grouped data is
formed by listing all the different class boundaries in one column and writing
down their corresponding frequencies in an adjacent column of a frequency table
for grouped data. Other columns may be added as well, such as one for class
midpoints.
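As a rough illustration, the steps above can be sketched in code. The dissertation's actual five formulas appear only in chapter 3, so the class-count bounds (5 to 20 classes), the ceiling-division width rule, the odd-width test, and the half-unit starting point used below are illustrative assumptions, not the author's formulas.

```python
# Hypothetical sketch of an iteration-style grouped-frequency-table builder
# for integer data; the bounds and rules here are illustrative stand-ins.
def grouped_frequency_table(data, min_classes=5, max_classes=20):
    lo, hi = min(data), max(data)
    inclusive_range = hi - lo + 1            # integer data: range plus one unit
    # Step 3: find the first compatible trio (range, k, width) with an odd width.
    for k in range(min_classes, max_classes + 1):
        width = -(-inclusive_range // k)     # ceiling division
        if width % 2 == 1:
            break
    start = lo - 0.5                         # assumed lowest class boundary
    freqs = [0] * k                          # Step 5: tally the frequencies
    for x in data:
        idx = min(int((x - start) // width), k - 1)
        freqs[idx] += 1
    classes = [(start + i * width, start + (i + 1) * width) for i in range(k)]
    return list(zip(classes, freqs))

scores = [28, 35, 41, 47, 52, 58, 63, 69, 74, 80, 85, 93]
for (low_b, high_b), f in grouped_frequency_table(scores):
    print(f"{low_b:.1f} - {high_b:.1f}: {f}")
```

With these made-up scores the search settles on 6 classes of width 11 starting at the boundary 27.5; different bounds or a different width rule would of course produce a different table.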
Benefits Expected to be Gained over the Traditional Methods
The iteration algorithm may benefit statistics users, relative to existing methods, in that it helps eliminate the frustration that comes along with guesswork. It is intended to spare statistics users the many unsuccessful attempts otherwise made before obtaining a satisfactory table for grouped data. This can be achieved in a number of ways; the most important are listed below.
First, the iteration algorithm provides clear rules and relatively easy-to-apply formulas. The rules and formulas to be applied are used to compute, in one or more iterations, the 4 key elements necessary to build a frequency table of grouped data. The 4 key elements are the inclusive range for grouped data, the retained number of classes, the class width, and the starting point. Only five formulas need to be used. As a result, the iteration algorithm leaves no room for arbitrariness or subjectivity.
Second, by providing clear, unambiguous, yet simple directions, the iteration algorithm is posited to yield the same first grouped-data frequency table whenever an identical set of raw data is provided to statistics users. Put another way, statistics users are expected to produce the same frequency table from the same set of raw data by retaining only the first solution. Building a unique frequency table facilitates comparisons with other distributions, based on relative frequencies expressed as percentages.
More importantly, the first solution from the iteration algorithm has a potential time-saving feature not found in most traditional methods. In addition, if performed by hand, the first-and-unique-solution approach is much better than the multiple-solution approach at limiting the number of otherwise unavoidable computation errors.
Third, in the absence of computers, data may still be summarized down to single values by calculating (at least with hand calculators) statistical measures of central tendency, dispersion, and shape. Grouped-data frequency tables obtained from the iteration algorithm can reduce the computations involved in getting those descriptive statistics, such as the mean, median, mode, range, standard deviation, quartiles, semi-interquartile range, coefficient of variation, and skewness and kurtosis coefficients for numbers condensed in classes, whenever computers are not available.
Fourth, the iteration algorithm is complete in that it suffices by itself to determine the 4 key elements necessary for constructing a table for grouped data. No self-sufficient traditional method was found.
Fifth, by introducing a special tally in the grouped-data frequency table, the iteration algorithm retains all of the information contained in the raw data set.
Finally, statistics users participating in the construction process may get a sense of re-inventing the technique. It is imagined here that statistics users understand better what is going on because they are expected to actively participate in the step-by-step process of building their own grouped-data frequency table. Furthermore, from personal experience, when students learn by reinventing a technique, they gain a deeper understanding of the problem at hand, which, in turn, helps them recall the material learned more easily. By contrast, statistics users typically have no idea how the different algorithms utilized by statistical packages work.
Given the current invasion of graphics in the media and printed publications, students would make more sense of pictures such as histograms if they knew how to construct them themselves, or at least how to modify those generated by the default settings of statistical software packages. As Bakker (2004) put it, “students need, of course, not know exactly which operations the software does to make a histogram, but they should be able to understand that the data are categorized into certain intervals and that the bars’ areas are relative to the number of values in those classes” (p. 12). Bakker (2004) went on to observe: “In most statistical software packages such operations are hidden, which suggests that we need special software for education that minimizes this black-box character” (p. 12).
In particular, Shaughnessy (1995) stressed that “the importance of helping students learn to transform information from graphs, and to create their own graphs to display raw data, was one of the top priorities among the many issues raised with respect to graphicacy” (p. 217). In either case, the iteration algorithm may serve as a basis for developing such special educational software that minimizes the aforementioned black-box character, at a minimum by helping students create their own histograms.
Even if it turns out that the iteration algorithm is no better for statistics users than the traditional method, the method will still potentially have advantages for those who are intolerant of ambiguity. According to Dugas, Gosselin, and Ladouceur (2001), cited by DeRoma, Marin, and Kessler (2003), “intolerance for ambiguity has been associated with a number of anxiety-related problems, including worry, obsessions / compulsions, and panic sensations in response to uncertainties introduced in the classroom” (p. 104).
The expectation is that this systematic approach would, by reducing the frustration level, help statistics users learn the iteration algorithm better than the most commonly used methods. It is also anticipated to improve their attitudes towards the iteration algorithm. Finally, it may carry some sound learning and teaching benefits as well.
The iteration algorithm presents some disadvantages. For instance, it does not always yield statistical measures closer to those computed directly from the raw data than the traditional method does. Furthermore, the iteration algorithm may at times require numerous iterations, whereas the traditional method may reach a solution on the very first trial.
Summary of Major Differences and Benefits
Table 5 summarizes the major differences and benefits of the iteration algorithm relative to the traditional method.
Table 5
Major Differences and Benefits of the Iteration Algorithm over the Traditional Methods
Traditional methods: The proposed guidelines or rules are vague or ambiguous. As a result, they leave a lot of room for subjectivity, arbitrariness, or intuition.
Iteration algorithm: Clear step-by-step instructions are provided. As such, they are less confusing because no personal judgment is involved; only formulas need strictly to be applied.

Traditional methods: In particular, most of the key elements are determined in a trial-and-error manner. This turns out to be an open door to multiple (plausible) solutions since it favors guessing.
Iteration algorithm: All of the key elements, in particular the chosen number of classes, are determined in a systematic fashion. Therefore, no guesswork is involved, since only the first plausible solution is retained as a unique (plausible) solution (for teaching purposes).

Traditional methods: Potentially frustrating and time consuming because of possible multiple unsuccessful trials.
Iteration algorithm: Potentially less frustrating and less time consuming due to the use of a limited number of successive iterations.

Traditional methods: Various formulas are available, but there is no accepted consensus about which existing formula works best for any particular key element.
Iteration algorithm: Only one formula may appropriately be used at a time for each of the 4 key elements for creating a frequency distribution of grouped data.

Traditional methods: Rules proposed by most authors are not complete. These rules often need to be combined with other available ones to get the 4 key elements: inclusive range, number of classes, class width, and starting point.
Iteration algorithm: The iteration algorithm suffices by itself to determine the 4 key elements necessary for constructing a table for grouped data.

Traditional methods: Individual information is lost.
Iteration algorithm: Individual information is preserved thanks to the special class tally.

Traditional methods: Black-box character.
Iteration algorithm: No black-box character.
Limitations and Delimitations of the Study
Due to a number of constraints, such as money, time, availability of respondents, and resources, there are six major limitations, which follow.
First, all data were obtained via self-report. Students’ attitudes towards the instructional (traditional or iteration) method taught were collected through an instrument that is a slightly modified version of the SATS instrument.
Second, a limited range of subjects participated in this project. Only students enrolled in two elementary statistics courses at Ohio University, Athens campus, were surveyed. The surveyed students were drawn from the accessible academic departments only and might not have been homogeneous in terms of their background in mathematics, for instance. In particular, the pilot study dealt with graduate students from the College of
Education, whereas the actual study used undergraduates from the mathematics and psychology departments.
Third, the instrument was administered once. Students’ attitudes were measured only at the beginning of a one-trimester course. The non-response rate was 15%, and no follow-up assessments were conducted at any other time.
Fourth, very few variables were incorporated in this study. To reduce the level of sophistication involved, only 4 dependent variables, two independent variables, and one covariate were included in the research questions. Other aspects of interest are discussed, but not in detail.
Fifth, there was no possibility of random assignment to instructional methods received. All departmental class sections were considered as intact groups. As such, in each department, some class sections were given a traditional instructional method, while others were taught the iteration algorithm.
Sixth, from the review of the related literature, there is no precise way of generalizing from the convenience sample to any type of population (Wiersma, 2002).
This means that the generalizability of the findings would be limited to the characteristics of the subjects. This does not mean that the findings are not useful; it simply means that caution is needed in generalizing the results (McMillan & Schumacher, 2001).
Definitions of Terms
Attitude
An attitude is a hypothetical construct that represents an individual’s like or dislike for an item. Attitudes are relatively stable, intense feelings that are the consequence of positive, negative, or neutral experiences over time in learning a topic (in the present case, the instructional method taught).
Class Boundary
A class boundary (or real class limit or exact limit of the class interval, Hinkle et al., 2003, pp. 24-25) is a number that does not occur in the sample data, but is rather a value that falls midway between the upper limit of one class and the lower limit of the next-higher class. Bluman (2004) gives a basic rule of thumb to find class boundaries when given the class limits. “The class limits should have the same decimal place value as the data, but the class boundaries have an additional place value and end in a 5”
(Bluman, 2004, p. 34). For example, in Table 4, the first lower class boundary is 24.5 (25
- 0.5), the first upper class boundary is 33.5 (33 + 0.5), the second lower class boundary is 33.5, the second class upper boundary is 42.5, and the last upper class boundary is
96.5.
Class Frequency
The class frequency or simply frequency for a given class is “equal to the number of observations in the data set that fall within the class boundaries” (Madsen &
Moeschberger, 1983, p. 231). In Table 4, for instance, frequencies are represented under the ni column.
Class Interval
A class interval (or simply class) is a block of score values used to group scores
together (Shavelson et al., 1996, p. 39). For example, in Table 4, the first class interval is
(25 – 33) and the second one is (34 – 42). Because the class interval refers to actual data collected, the interval begins and ends with the class limits rather than the class boundaries.
Class Limits
The class limits for a given class are “the smallest and largest values that might
occur in the data and that will be put into that class” (Madsen & Moeschberger, 1983, p.
230). The class limits have the same measurement precision as the raw data collected.
Class Midpoint
A class midpoint (or class mark) is a numerical value that is exactly in the middle
of each class. It is halfway between the limits of the interval (Thorndike & Dinnel, 2001,
p. 32). It is found by adding the upper and lower class limits and dividing this total by
two. It can also be found by adding the upper and lower boundaries and dividing this total
by two.
For purposes of further mathematical analysis, all observations belonging to a
given class interval are assumed to coincide with the class mark. That is, “the class mark
is the score given to all values that fall within the interval” (Spiegel & Stephens, 1998, p.
37). In Table 4, for instance, the first class mark is 29, the second is 38, and the last one is
92.
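The two equivalent computations just described can be checked directly; the values come from the Table 4 example above.

```python
# Class midpoint: the average of the class limits, or, equivalently,
# the average of the class boundaries.
def midpoint(lower, upper):
    return (lower + upper) / 2

print(midpoint(25, 33))       # from the class limits: 29.0
print(midpoint(24.5, 33.5))   # from the class boundaries: 29.0
```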
Class Width
A class width, or class length or class interval size, for a given class is the difference between its upper class boundary and its lower class boundary; in most frequency distributions it is constant across the classes. In such a case, the class length is equal to the difference between two successive lower class limits or two successive upper class limits. For instance, the class limits (25 – 33) become the boundaries (24.5 – 33.5), that is, 25 - 0.5 = 24.5 and 33 + 0.5 = 33.5, respectively (0.5 being one-half of one unit of measurement).
Therefore, the class width is equal to 33.5 - 24.5 or 9 and not 33 - 25, which is equal to 8. In other words, a class interval size is not calculated as the difference between the upper and lower class limit of the same class.
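The distinction drawn above (boundaries of one class, not limits of the same class) is easy to verify numerically:

```python
# Class width from boundaries, and equivalently from successive lower limits.
lower_boundary, upper_boundary = 24.5, 33.5
width_from_boundaries = upper_boundary - lower_boundary   # 9.0
width_from_limits = 34 - 25        # next class's lower limit minus this one's: 9
wrong_width = 33 - 25              # 8: NOT the class width
print(width_from_boundaries, width_from_limits, wrong_width)
```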
Coefficient of Variation
In the words of Weiers (2008), “in probability theory and statistics, expressing the standard deviation as a percentage of the mean, the coefficient of variation indicates the relative amount of dispersion in the dataset” (p. 82). If s is the sample standard deviation of a set of data xi (i = 1, 2, …, n) and x̄ is its sample mean, then the sample coefficient of variation (cv) is expressed as the ratio of the sample standard deviation to the sample mean: cv = s / x̄.
There is an equivalent definition for the coefficient of variation of a population,
which is based on the expected value and the standard deviation of a random variable.
The population coefficient of variation is denoted by CV. It is defined as the ratio of the population standard deviation σ to the population mean μ: CV = σ / μ.
It is often reported on a scale of 0 to 100% by multiplying the above ratio by 100. The coefficient of variation is a dimensionless number that allows comparison of the variation of measurements from different populations. In other words, as expressed by Sullivan (2007), “it allows for the comparison of two or more variables measured on a different scale” (p. 155).
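A minimal sketch of both points (the cv formula and its use for comparing variables on different scales), using Python's standard library; the two small data sets are invented for illustration.

```python
import statistics

# Sample coefficient of variation: cv = s / x-bar, often reported as a percent.
def coefficient_of_variation(data):
    return statistics.stdev(data) / statistics.mean(data)

heights_cm = [170, 165, 180, 175, 160]   # made-up measurements
weights_kg = [70, 62, 88, 75, 58]
print(f"heights: {coefficient_of_variation(heights_cm):.1%}")
print(f"weights: {coefficient_of_variation(weights_kg):.1%}")
# Despite the different units, the cv shows which variable is relatively
# more dispersed.
```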
Frequency Distribution
A frequency distribution shows, according to Huck (2004), “how many people (or animals or objects) were similar in the sense that, measured on the dependent variable, they ended up in the same category or had the same score” (p. 18). Two kinds of frequency distributions are discussed in this study: simple (or ungrouped-data) frequency distribution and grouped-data frequency distribution.
Grouped Data
When a few adjacent score values are combined into classes, they are called grouped data. Data like those presented in the first column of Table 4 are called grouped data.
Grouped-data Frequency Distribution
A grouped-data frequency distribution (like the one of Table 4) is, as defined by
Shavelson et al. (2000), “a table listing scores grouped into non-overlapping class intervals of equal size or equal width (rather than individual scores) along with the frequency of scores falling into each class interval” (p. 39). A grouped-data frequency distribution is “almost exclusively used when the dependent measure is an interval-level or a ratio-level variable” (Harris, 1998, p. 74).
Iteration Algorithm
The iteration algorithm is a structured tactic with clear rules and five formulas applied in a limited number of successive iterations to create a frequency table for grouped data. The grouped-data frequency distribution is obtained without guessing, subjectivity or arbitrariness.
Range
Perhaps the simplest, and most obvious, measure of the dispersion in a set of numerical observations is the difference between the largest and the smallest values. This is known as the non-inclusive range, exclusive range, or, simply, range. From Table 2, it can be seen that the highest grade is 93, and the lowest is 28. The subtraction of 28 from
93 yields 65. The (non-inclusive) range therefore is 65 (i.e., 93 - 28).
For whole numbers, the inclusive range is the non-inclusive range plus 1 (one unit of the measurements observed). For the data in Table 2, the inclusive range is computed as: Inclusive range = (Non-Inclusive) Range + 1 = 65 + 1 or 66.
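The whole-number computation above is trivially reproduced in code (the decimal-data formula is deferred to chapter 3, so it is not attempted here):

```python
# Non-inclusive and inclusive range for the whole-number grades of Table 2.
highest, lowest = 93, 28
non_inclusive_range = highest - lowest       # 65
inclusive_range = non_inclusive_range + 1    # 66: one unit of measurement added
print(non_inclusive_range, inclusive_range)  # prints: 65 66
```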
For decimals, a formula for the inclusive range is provided by the iteration algorithm. This formula is reported in chapter 3, under the method development section of the iteration algorithm. An example dealing with decimals may be found in Appendix G.
Raw Data
Raw data or ungrouped data or raw scores are the observed or collected values in their original form. That is, data that have not been manipulated, summarized, organized, or treated numerically (Spiegel & Stephens, 1998) in any way beyond their original collection form. For instance, Table 1 lists raw data.
Simple Frequency Distribution
A simple (or ungrouped-data) frequency distribution simply lists, in ascending order (from the lowest to the highest), each possible or actual score, along with the number of times that score appeared or occurred (e.g., Table 3). The raw data are not grouped into classes or intervals. A simple frequency distribution is “frequently the most useful way to present data when the dependent measure is a nominal-level variable”
(Harris, 1998, p. 74).
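A simple frequency distribution of this kind is straightforward to produce with the standard library; the scores below are made up for illustration.

```python
from collections import Counter

# List each actual score in ascending order with the number of times it occurs.
scores = [3, 1, 2, 3, 3, 2, 1, 4]
for score, count in sorted(Counter(scores).items()):
    print(score, count)
```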
Traditional Method
A traditional method, or generate-and-test algorithm, is any existing technique with vague suggestions or guidelines and an unspecified number of formulas applied on the spur of the moment to build a frequency distribution of grouped data. It is an unstructured tactic in that the grouped-data frequency table is constructed by trial-and-error approaches based on guesses. The traditional method illustrated in this study is the algorithm proposed by Vaidyanathan and Vaidyanathan (1987, pp. 9-11) and Mason and Lind (1996, pp. 8-12). This method is presented in detail in chapter 3.
Organization of the Study
This study is organized in the following manner:
Chapter 1 introduces the background and objectives of the study. In this chapter, the rationale for developing the iteration algorithm, the statement of the problem, the research questions, the significance of the study, the limitations and delimitations of the study, the definitions of terms, and the organization of the study are covered.
Chapter 2 presents a review of the literature relevant to the 4 key elements necessary to build a grouped-data frequency table. Perspectives on teaching and learning statistics are also discussed.
Chapter 3 includes the development of the two instructional methods. Also included are the main results yielded by the Monte Carlo simulations.
Chapter 4 describes the methodology used for the study, which comprises the sampling plan, the instrument, the pilot study results, data collection and data analysis procedures, and an explanation of the analysis of the data.
Chapter 5 contains the data analysis and presentation of the findings of the study.
The results of the study are arranged into 3 sections.
In chapter 6 a summary and interpretations of the findings of the study are given.
Also, recommendations and suggestions for further research and practice are included.
Finally, 13 appendices are attached at the end. Appendix A contains an informed consent form. Appendix B includes the Ohio University consent form template.
Appendix C is the first exercise with integers. Appendix D and Appendix E show how the traditional method and the iteration algorithm, respectively, work in practice with these integers. Appendix F shows a questionnaire of the Survey of Attitudes towards the
Method Used (SAMU) instrument. An exercise dealing with decimals appears in
Appendix G. Appendix H and Appendix I illustrate how to construct a grouped-data frequency table using the iteration algorithm and the SPSS algorithm, respectively, for non-integers. A copy of the e-mail message sent to the SPSS Corporate Headquarters representatives is attached as Appendix J. A list of abbreviations used in the text is incorporated in Appendix K. In Appendix L, there is a sample set of 8 scatter diagrams of
data for pairs of dependent variables. Finally, additional output from the exploratory factor analysis of the final study data is reported in Appendix M.
CHAPTER 2: REVIEW OF THE LITERATURE
Introduction
This study initially focused on evidence for the efficacy of the iteration algorithm as a process. A review of the relevant literature in the area of data reduction into frequency tables is presented, moving quickly from simple (ungrouped-data) frequency distributions to frequency distributions for grouped data. Cases where frequency tables for grouped data show an edge over simple frequency tables are analyzed. At the grouped-data frequency distribution level, the review examines what is faulty with the existing algorithms and then explains how the iteration algorithm differs from them.
A second and main area of concentration was assessing the effectiveness of the iteration algorithm as a teaching method, chiefly based on its two key features: the algorithm provides a unique solution, and it is well structured, which leaves no room for subjectivity or arbitrariness. Ultimately, the question that needs an answer is: Will some students (or, in general, some statistics users) show more positive attitudes towards grouping quantitative continuous data with the iteration algorithm than with a traditional approach (found in many basic statistics textbooks)?
Toward this end, the learning and instruction literature on these educational issues was examined. More specifically, literature on unique-solution versus multiple-solution algorithms, and on structured versus spur-of-the-moment methods, was investigated from the teaching and learning perspectives.
As stated in chapter one, 4 elements, namely the inclusive range, the number of classes, the class interval width, and the starting point, are essential for creating a frequency table for grouped data. As such, the literature review contains a specific review of the literature on these 4 elements. A special section reviews references that support the teaching and learning merits of the iteration algorithm. The literature review ends with a short summary of the most relevant aspects of the topics presented. Following this review of the related literature, the development of both the iteration algorithm and the traditional method is traced.
Review of the Relevant Literature
The literature about how to create a grouped-data frequency table is abundant. As a matter of fact, virtually all basic statistics textbooks cited in this dissertation contain at least a section discussing the construction of frequency distributions for grouped data. Unfortunately, these textbooks give only suggestions or guidelines, which are vague because they leave an open door to guessing, subjectivity, or arbitrariness. The reference list found in the appendices enumerates more than 150 such books containing ambiguous principles.
In contrast, the available literature on how to construct a grouped-data frequency table from start to finish is scant. This is probably due to the advent of powerful statistical packages, which can perform major statistical analyses in no time, even with extremely large data sets.
This section starts by reviewing the aforementioned vague guidelines and states what is confusing about the existing algorithms. Then, it moves on to discuss how some statistics users might learn better to group continuous data with the iteration algorithm than with a traditional approach (found in many basic statistics textbooks). How the learning and instruction literature (both theoretical and empirical) treats these issues is then explored. Finally, the findings are summarized.
Vague Guidelines
There is still a dearth of studies in the literature addressing the issue of grouping data into frequency tables. The literature reviewed revealed no clear agreement about which existing formula works best. The instructions provided by the existing methods for creating grouped-data frequency tables are ambiguous in that there is no exclusive way to interpret the key words found in them. The advice proffered by statistics textbooks is vague because of liberal use of key terms such as “personal judgment” (Moore, 1997a, p. 16; Moore, 2000, p. 10), “common sense”, “good intuition”, “past experience” (Doane, 1976, p. 181), and “educated guesses” for selecting a desired and/or appropriate number of classes (e.g., Fox et al., 1993, p. 37; Gravetter & Wallnau, 1992, p. 31), or “nice, convenient or simple number” when deciding on the length or width of each class interval (e.g., Gravetter & Wallnau, 1992, p. 31; Levin & Fox, 2004, pp. 26-27).
Fox et al. (1993, p. 37) and Levin and Fox (2004, pp. 26-27), among others, deplored that the nonexistence of a standard set of objective and well structured guidelines poses a serious problem in particular to beginning statistics students, who generally find it challenging to construct class intervals on their own. They contended that this is a skill that develops only with practice. The lack of a clear-cut procedure leaves room for arbitrariness and subjectivity even among skilled behavioral researchers.
In fact, there is considerable flexibility of choice from the available options offered and
the decision taken draws heavily on each investigator’s judgment, intuition, or experience
(Doane, 1976).
The lack of a standard set of objective and well-structured guidelines turns out to be a source of bias, because even when the same guidelines are followed, they might be interpreted differently according to personal preferences or research needs.
Presenting Data in a Frequency Table
A number of other statistics authors recognize the importance of grouping data into frequency tables. Freund (1998), who is one of them, said that “the most common method of summarizing data is to present them in condensed form in tables or charts, and at one time this took up the better part of an elementary course in statistics” (p. 15). He went on to remark: “nowadays there is so much else to learn in statistics that very little time is devoted to this kind of work. In a way this is unfortunate, because one does not have to look far in newspapers, magazines, and even professional journals to find unintentionally or intentionally misleading statistical charts” (Freund, 1998, p. 15).
Graham (1987) encapsulated 4 fundamental purposes of data analysis in a statistical investigation: (1) describing data, (2) summarizing data, (3) comparing and contrasting two or more data sets, and (4) generalizing about a population or predicting the next case. He recommended that grouped-data frequency tables, as well as histograms, be used as representational tools for the first, third, and fourth purposes. He suggested that summary statistics related to center, spread, and shape are most helpful for the second purpose.
Presenting Data in a Simple Frequency Table
When examining a very large set of raw data, it is almost impossible to get a clear picture of the data without organizing and summarizing them in some sort of logical and orderly fashion. “Even identifying the smallest number and the largest number may require spending a few minutes, let alone learning about how the numbers are distributed between these two extremes” (Keller, 2001, p. 22).
According to Hinkle et al. (2003) and MacDonald (1982), one of the traditional and simplest procedures for organizing and summarizing data into a meaningful representation is the simple, or ungrouped, frequency table. Weiss (2004) noted that by “suitably organizing data, we can often make a large and complicated batch of data more compact, easier to work with and understand” (p. 44). Data grouped in a simple frequency table allow the investigator to “determine very quickly where most of the items are concentrated” (L’Esperance, 1971, p. 24), and, as Hamburg (1989) said, “if a frequency distribution of the individual figures is formed, many features of ungrouped data become readily discernible” (p. 9).
Spiegel (1961) observed that an “important advantage of a frequency distribution is gained in the clear ‘overall’ picture which is obtained and in vital relationships which are thereby made evident” (p. 27). Montcalm and Royse (2002) listed frequency tables as one of at least 4 ways in which data can be summarized, the other 3 being narration, graphs, and cross-tabulation tables.
For their part, Hopkins, Glass, and Hopkins (1987) and Minium (1978) argued that statistical information can be understood more easily and interpreted more accurately if it is organized into tables and displayed in graphs. As
explained in the previous chapter, as long as there are 20 or fewer scores, ordering all scores and/or listing each individual score and indicating its frequency may be sufficient.
One final advantage of a simple frequency distribution is that “we are able to achieve all of this convenience without any loss of precision” (Raymondo, 1999, p. 50).
Presenting Data in a Frequency Table for Grouped Data
With a large data set, a simple frequency table may be too long and impractical. A grouped-data frequency distribution can make information even more directly understandable than a simple frequency distribution can. This is because the values contained in a grouped-data frequency table are more readily grasped and comprehended (see, for instance, Aron & Aron, 2003, p. 9; Mansfield, 1980, p. 18), yet without obscuring the essential information contained in each individual value (Gravetter & Wallnau, 1985, p.
37; Sanders et al., 1976, p. 47).
Advantages of Presenting Data in a Grouped-data Frequency Table. Carlisle (1985) and Hinkle et al. (2003) advised presenting hefty datasets in the form of a grouped-data frequency table for ease of use: reading, presenting, interpreting, and/or handling. Ravid (1994), among others, contended that frequency tables for grouped data are useful for compressing a substantial amount of data into a relatively small space by combining a few adjacent values into class intervals, thereby avoiding a very long list of scores. Berenson and Levine (1998, pp. 22-23) and Vouch (1982, p. 28) further stressed that whenever a set of collected data contains about 20 or more observations, the best way to examine such a mass of data is to present it in summary form by constructing appropriate tables and charts. The important features of the data can
be extracted from these tables and charts, and then be neatly presented to a general audience.
Elifson, Runyon, and Haber (1990, p. 56) made clear that the reasons for grouping data into a frequency table are threefold. First, unless computers are available, it is uneconomical and unwieldy to deal with a large number of cases spread out over many scores. Second, some of the scores have such low frequency counts associated with them that maintaining these scores as separate and distinct entities is not justified. Third, categories or classes provide a concise and meaningful summary of the data.
For his part, MacDonald (1982, p. 1) noted three main advantages of presenting data in a grouped-data frequency table. They are as follows.
First, a grouped-data frequency table is a necessary preliminary to a graphical representation of the quantitative data. Second, the approximation of mean and standard deviation can be calculated by frequency weighting the class intervals. Third, the approximate median, quartiles, deciles, and percentile ranks can be calculated by linear interpolation of the class intervals. The exact determination of these latter statistics requires the ordering of all data points.
Bluman (2004) set out three other advantages. Grouped-data frequency tables are constructed: (a) to organize the data in a meaningful, intelligible display, (b) to enable the reader to determine the nature or shape of the distribution, and (c) to enable the researcher to draw charts and graphs for the presentation of data. There is also a surprising fact that runs contrary to general intuition. According to Châtillon,
Gélinas, Martin, and Laurencelle (1987), when it comes to estimating percentiles of a distribution from a sample, “the observed percentiles provide in many cases better
estimates when they are based on grouped data than when based on the raw data” (p.
395). They also established that “the estimations are even more accurate when the data set is relatively small and the population distribution is not too skewed” (p. 395).
Disadvantages of Presenting Data in a Grouped-data Frequency Table. There are also a number of rather obvious disadvantages to the use of the grouped-data frequency table. A major disadvantage is the loss of precision. By grouping scores into class intervals, the individual data values are no longer identified (Fraenkel & Wallen,
2003) or, equivalently, as Shavelson et al. (1996) expressed it, “the link between the identity of each subject and his data is lost” (p. 38). In other words, after grouping raw data into class intervals, it is no longer immediately evident, for instance, how many times a particular value occurs, or whether it occurs at all. In particular, there is no way to tell “what was the exact score that was obtained by each subject unless we go back to the original or raw data” (Ravid, 1994, p. 33). This trade-off between usefulness (more compactness) and precision (more detail), as noted by Healey (2002, p. 32), is common in data analysis. More compactness is achieved by a smaller number of wide categories, while more detail is obtained by a greater number of narrow categories.
Bluman (2004) furnished three main disadvantages. According to him, grouped-data frequency distributions can: “(a) Reveal little about the actual distribution, skew, and kurtosis of data, (b) Be easily manipulated to yield misleading results, and (c) De-emphasize ranges and extreme values, particularly when open classes are used” (p. 33).
MacDonald (1982) conceded that: “Since the original data are lost in the grouping process, exact calculations of the mean and standard deviation are impossible” (p. 1). He went on to add: “Precise determination of the median, mode, quartiles, deciles, and
percentile ranks are likewise impossible, as these statistical measures require the original data in ordered form” (p.1). Runyon and Haber (1984) listed an extra disadvantage:
“Some small errors in statistics based upon grouped scores are unavoidable” (p. 50).
Glasnapp and Poggio (1985) called this loss of information grouping error, which
“results from the presentation of information that is one step removed from the exact raw score values” (p. 63). The “information regarding counts of individual scores is therefore lost” (Shavelson, Ruiz-Primo, & Mitchell, 2000, p. 39). That is, the original observations cannot be reproduced exactly from the frequency distribution of grouped data (Ferguson,
1989, p. 21). This was long ago recognized by Davies (1929), who stated that “the grouping process may give rise to a considerable degree of distortion of the original data”
(p. 349).
Summary of Advantages and Disadvantages. In summary, when scores are grouped by class intervals, a trade-off is made in frequency distributions for grouped data between more detail (a greater number of narrow categories) and more compactness (a smaller number of wide categories) (Healey, 2002, p. 32). More detail comes at the expense of convenience (summary), and more compactness results in the loss of information (grouping error) (see, among other books, Healey, 2002; Kolstoe, 1969;
May, Masson, & Hunter, 1990).
Shavelson et al. (1996) emphasized this point by stating that although the grouping process generally destroys much of the original detail of the data, an important advantage is gained in “economy and clarity in the representation of the data” (p. 39).
When so much information is presented, Stockburger (2001) reasoned that “it is difficult to discern what the data are really like, or to ‘cognitively digest’ the data. That is why we
need to group them into classes. But by doing so we may lose information regarding counts of individual scores, while we gain understanding about distributions” (p. 2).
Spiegel and Stephens (1998) made a case as follows: “There is a gain in economy since the table is smaller than the original raw data distribution and it is easier to see how the scores are distributed” (p. 37). They went on to add that “a clear overall picture is obtained and the vital relationships are thereby made evident” (p. 37).
Additionally, Mosenthal and Kirsch’s works (1990a, 1990b) highlighted the links between well-structured tables and graphical representations. Friel, Curcio, and Bright
(2001) insisted that, “apparently attention to the use of grouped-data frequency tables as transition tools for organizing information to be represented graphically is needed” (p.
126) to promote understanding. Because one can move back and forth between frequency tables and graphs, having both types of information has inherent advantages. In conclusion, as Frankfort-Guerrero (2000) put it, “while the construction of a frequency distribution can be done by a computer, it is important to go through the process to understand how frequency distributions are actually put together” (p. 38).
Presenting Data in a Graph
Minium (1978) mentioned that “when a graph is to be made from the distribution, grouping the scores helps present a more nearly regular appearance” (p. 30). By the same token, Hopkins, Glass, and Hopkins (1987) averred that the interpretation of even massive amounts of data is facilitated by tables and graphs. Tables and graphs are statistical synopses of information.
A picture is worth a thousand words. When it comes to numbers, a not-yet-well-known saying is “a graph is better than a thousand numbers” (Spatz, 2005, p. 30). Keller
and Warrack (2003) and Triola (2001) explained that although the grouped-data frequency distribution provides information about how the numbers are distributed, the information is more easily understood and imparted by drawing a picture or graph.
Histogram
Many times, a graphical display says much about the distribution of data. A frequency distribution for grouped data can readily be displayed by a graph called a histogram (Rhodes, 2000, p. 11). A histogram is the most common graph of the distribution of one quantitative variable and also the most popular method of obtaining information (see, for instance, in Moore, 2000, p. 8; Triola, 2001, p. 12; Wand, 1997, p.
59). A histogram is created by drawing rectangles whose bases are the class intervals (or class midpoints), which are shown on the horizontal axis (the abscissa), and whose heights are the frequencies of those individual classes. The width of each rectangle should be the same, and the rectangles should touch each other. The frequencies of each class interval
(tallies) are shown on the vertical axis (the ordinate).
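The construction just described can be sketched in code. The following Python snippet, using hypothetical class limits and frequencies, prints a crude text rendering in which each class interval becomes a bar whose length equals its frequency; a real histogram would of course draw touching rectangles of equal width.

```python
# A minimal sketch of how a histogram mirrors a grouped-data frequency
# table: each class interval becomes one bar whose length is the class
# frequency. The intervals and counts below are hypothetical examples.

def text_histogram(intervals, frequencies):
    """Return one text row per class interval; '#' marks equal the frequency."""
    lines = []
    for (lo, hi), f in zip(intervals, frequencies):
        lines.append(f"{lo:>3}-{hi:<3} | {'#' * f} ({f})")
    return "\n".join(lines)

intervals = [(60, 69), (70, 79), (80, 89), (90, 99)]   # hypothetical classes
frequencies = [3, 7, 5, 2]                              # hypothetical tallies
print(text_histogram(intervals, frequencies))
```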
Milton, McTeer, and Corbet (1997) pointed out that “the histogram is especially useful for large data sets” (p. 22). Follettie (1980) and Friel, Curcio, and Bright (2001) suggested that a histogram generally portrays insights into the overall shape of a distribution if appropriate intervals are chosen for scaling. For some distributions, though, the features of a dataset (i.e., information on its location, spread, and shape) can be disguised or distorted by a particular choice of class width (Gravetter & Wallnau, 1985, p. 37; Keller, 2001, p. 22). Isaac and Michael (1997) implied that too narrow a class width in a histogram would “compress the data, concealing meaningful changes in
its shape” (p. 166). In contrast, they conceded that too broad a class width would “stretch out the data creating unnecessary gaps” (p. 166).
Arsham (2001), for example, summarized the general information that can be extracted from data through histograms. The six types of information follow.
1. Central location of the data: A value representing the middle, or the typical, or the
average value of a group.
2. Shape: The extent to which the distribution is skewed. That is, whether it is symmetrical, approximately symmetrical, left-skewed, or right-skewed around a central location (mean, median, or mode).
3. Dispersion: The degree of spread away from the central location.
4. The degree of peakedness (kurtosis): How steeply it rises and falls.
5. Gaps: Non-homogeneous groups.
6. Outliers: Observation values that fall far from the rest.
Being able to “see” what the quantitative data look like can be a real advantage of a histogram. However, histograms have their limitations just as grouped-data frequency distributions do. In fact, histograms may even be more limited if they are used alone and as substitutes for grouped-data frequency distributions rather than as accompaniments (Raymondo, 1999, p. 67). SPSS, for instance, does not provide grouped-data frequency tables. Therefore, a combination of histograms and grouped-data frequency distributions is ideal to provide a much clearer picture of the data.
Traditional Methods
To construct a frequency table for grouped data, most introductory statistics books agree on determining four crucial elements. These four elements, again, are the range, the number of class intervals, the class interval width, and the starting point. In the following, how existing (traditional) methods determine these elements, and why they are faulty, is examined.
Inclusive Range
There is a general consensus that the range constitutes the beginning point for constructing a frequency distribution for grouped data; see, for instance, Anderson,
Sweeney, and Williams (2004), Aron and Aron (2003), Bartz (1979 and 1999), Fox,
Levin, and Harkins (1993), Glasnapp and Poggio (1985), Hoaglin (1983), Ravid (1994),
Shavelson, Ruiz-Primo, and Mitchell (2000), Spence, Cotton, Underwood, and Duncan
(1990), Summers and Peters (1973). Unfortunately, two slightly different formulas have been proposed to compute it. On one hand, the range of a batch of scores is defined as the difference between the largest and the smallest scores. Some authors call this the ‘simple’ or
‘non-inclusive’ range.
On the other hand, the majority of elementary statistics textbooks recommend determining the range as the difference between the largest and the smallest scores plus one (for whole numbers). Some authors call the latter the ‘inclusive’ range. Why is 1 added to obtain the inclusive range? Schmidt (1975) explained that the 1 should be added so that the range covers both the maximum and minimum scores as well as the distance between them. In that way, the two extreme values are included in the range.
No specific formulas are provided to calculate the inclusive range for decimals.
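The two definitions can be sketched as follows; the scores are hypothetical, and the plus-one adjustment applies to whole-number data only, as noted above.

```python
# A minimal sketch of the two range definitions discussed above,
# assuming whole-number scores (the text gives no formula for decimals).

def simple_range(scores):
    """Non-inclusive ('simple') range: largest score minus smallest score."""
    return max(scores) - min(scores)

def inclusive_range(scores):
    """Inclusive range for whole numbers: adds 1 so that both extreme
    scores are counted as covered (Schmidt, 1975)."""
    return max(scores) - min(scores) + 1

scores = [12, 47, 23, 35, 19]        # hypothetical data
print(simple_range(scores))          # 35
print(inclusive_range(scores))       # 36
```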
Number of Class Intervals
Vague Guidelines. After determining the (inclusive or non-inclusive) range of the score distribution, the big question before a set of data can be converted into a grouped-data frequency table is determining the number of class intervals (Aron & Aron,
1997, p. 7; Spence, Cotton, Underwood, & Duncan, 1990; Wessel & Willett, 1960, p.
80). The number of class intervals to be used is primarily dependent on the number of observations in the data set, but also on the range of observed scores (Berenson &
Levine, 1998).
Anderson (1989), Aron and Aron (2003), Freund (1998), Jarrett and Kraft (1989),
Johnson and Bhattacharyya (1985), Khazanie (1986), Kooker and Robb (1982),
Mansfield (1986), Rees, (1989), Sanders (1990), Witte and Witte (1997), among other authors, cautioned that when setting up a grouped-data frequency table, it makes a big difference how many intervals are used. To mention just a few, Bartz (1999, p. 24) and
Kerr et al. (2002, p. 11), for example, specified that the choice of the number of intervals and of the size of the interval is quite arbitrary. It is even highly subjective (Spatz, 2005, p. 30), as it is a matter of personal judgment (Lapin, 1987, p. 16; Hamilton, 1990, p. 24).
Guidelines exist to help researchers with this, but they remain vague in that they are subject to personal interpretation, taste, and preference.
Many authors recommend using judgment and common sense, two ambiguous terms that lead, in turn, to an enormous number of plausible solutions. Among these authors are Anderson, Sweeney, and Williams (2004), Aron and Aron (2003), Fox,
Levin, and Harkins (1993), Glasnapp and Poggio (1985), Hoaglin (1983), and Jaeger
(1990). A few other authors are: Moore (2000), Ravid (1994), Spence, Cotton,
Underwood, and Duncan (1990), Shavelson, Ruiz-Primo, and Mitchell (2000), Summers and Peters (1973).
Additionally, Hertzberg (1991) underlined that it would be hard to make firm rules about the choice of intervals for a grouped table, but some general principles should be followed, if applicable. Nevertheless, the general principles involve vague words such as “nice, useful, fairly, easy, round, appropriate, and suitable” (Hertzberg, 1991, p. 11).
As Jaeger (1990) put it, “selecting the appropriate number of classes remains a subjective process and varies from one statistics book to another” (p. 9). Similarly,
Lapin (1987), for example, stressed that “since the interaction between analyst and data display is a personal one, we face considerations of taste and aesthetics, particularly in choosing the number of intervals or the interval width” (p. 16). The majority of proposed techniques go by choosing “the desired number of classes” (Hoaglin, 1983, p. 8). There is regrettably no obvious concurrence as to “the way to compute the desired number of class intervals” (Ravid, 1994, p. 33).
It is important to remember that grouped frequency distributions are employed to reveal or emphasize a group pattern. Either too many or too few class intervals may blur that pattern and thereby work against the researcher who seeks to add clarity to the analysis. In sum, then, the researcher generally makes a decision as to the number of intervals based on the set of data and personal objectives, factors that may vary considerably from one research situation to another (Levin & Fox, 2004, pp. 26-27).
Lapin (1980), for instance, said that “one way to decide how many class intervals to use would be to try several –plotting a histogram for each and selecting the one that provides the most logical explanation of the underlying population pattern” (p. 21). He
further specified “the histogram chosen for permanent use should be the one that achieves the best balance between summarizing the data and preserving detail” (Lapin, 1987, p.
18). Unfortunately, this approach is lengthy and subject to too much personal judgment, as well as taste and aesthetics, in spite of partial help from a computer. Even with full computer assistance, the whole procedure remains quite time consuming, as creating ‘several’ histograms cannot be done promptly.
Summing up, Sullivan (2005) reminded readers that the goal in constructing a frequency distribution is to reveal interesting features of the data. As such, “too few classes would cause a ‘bunching’ effect. Too many classes will spread the data out, thereby not revealing any pattern” (p. 48).
Recommended Numbers of Class Intervals. Many different numbers have been recommended. To start with, a number of textbooks recommend as a general rule that 10 class intervals be taken as optimal and 30 as the maximum (as seen in Gravetter &
Wallnau, 2000; Vaidyanathan & Vaidyanathan, 1987, p. 9). A few others urged between
10 and 20 intervals; for example Anderson, Sweeney, and Williams (2004), Aron and
Aron (2003), Glasnapp and Poggio (1985, p. 62), Hays (1967 & 1981), Kiess (1996),
Larson and Farber (2003), Levin and Rubin (1983), Thorndike (2001, p. 33), Schloss and
Smith (1999, p. 147), and Zeller and Carmines (1978, p. 28). For some other authors, the number of class intervals can fall either between 10 and 14 or between 5 and 15. Notably, Wyatt and Bridges (1967) favored the first option. Listed for the second option are Aron and Aron (2003, p. 9), Berenson, Levine, and Krehbiel
(2002), Brase and Brase (2001, p. 36), Craft (1990); Freund (1998), Korin (1975), Lapin
(1987), Levine (2001, p. 58), Mason, Lind, and Marchal (1994, p. 27), Sanders, Murph, and Eng (1976), Sandy (1990).
For their part, Bennett, Briggs, and Triola (2003), Devore and Peck (2005),
Hildebrand (1986), Groebner and Shannon (1981), Isaac and Michael (1997), Larson and
Farber (2006), for instance, stated that between 5 and 20 classes would be satisfactory for most data sets. Similarly, De Veaux, Velleman, and Bock (2006), Madsen and
Moeschberger (1983 and 1986), Mason et al. (1994), May et al. (1990), Mendenhall,
Reinmuth, Beaver, and Duhan (1986), Mendenhall and Sincich (1996), and Triola (2001 and 2002) retained 5 as the lowest number of class intervals and 20 as the largest number.
Many other authors concurred with the same two values or numbers in between. More specifically, Hinkle et al. (2003) wrote that “for small data sets (less than 100 observations), 6 to 12 works well, while for larger data set with a wide range of scores,
10 to 20 intervals should be okay” (p. 24). Mendenhall (1997) further specified that “a small number of classes (in the range choice of 5 to 20 classes) is to be used for a small data set (say, 10 to 15 measurements) and a larger number of classes should be the preference for a larger data set” (p. 42).
A few other plausible numbers are proposed by Harnett (1982), for example, who phrased that “the number of classes used should probably be fewer than 20 (for ease of handling and to ensure sufficient compacting of the information) and at least 6 (to avoid loss of information due to grouping together widely diverse data)” (p. 10). Another author, Roscoe (1975), recommended that “the number of intervals be between 10 and
25, with 15 to 20 being optimum” (p. 29), whereas Hopkins, Glass, and Hopkins (1987, p. 10) and Wonnacott and Wonnacott (1984, p. 24) and Zelditch (1959, p. 19) put
forward that between 10 and 15 class intervals would be appropriate, rarely as many as
20. Jaeger (1997) suggested that about “12 score classes will often do the trick” (p. 9), while Sokal and Rohlf (1995, p. 25) advocated a “range of 12 to 20 classes,” or a range of “10 to 20 classes” as indicated by Spatz (2005, p. 376).
In conclusion, Brase and Brase (1999, p. 47) and Mason and Lind (1996, p. 31), in particular, cautioned about the risk of either losing too much information by choosing fewer than 5 class intervals or sacrificing the clarity of the table for detail if more than 15 class intervals are used.
All these optional numbers are attempts to compromise between the desire to summarize data and the aspiration to retain as much detailed information as possible. The greater the number of score classes, the more detailed the information retained about the original data. The smaller the number of score classes, the more the data have been simplified. Generally, the larger the number of observations in a data set, the more classes should be used. As Sanders and Smidt (2000) put it, the key is to use the number of classes that would “give good view of the data pattern and provide insights into the information that is there” (p. 52).
Existing Formulas to Compute the Number of Class Intervals. Besides the above guidelines and suggestions, some formulas have been developed for choosing an appropriate number of class intervals, which is denoted by the letter k. One of the oldest formulas was designed by Sturges (1926) and is expressed as k = 1 + log₂(n), where k is the number of classes to choose, log₂ represents the logarithm base 2, and n is the number of observed data values.
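As a brief sketch, Sturges' rule can be computed as follows. Rounding k up to the next whole number is an assumption here (a common convention that the text does not fix); it reproduces the Sturges column of Table 7 below.

```python
import math

# A sketch of Sturges' (1926) rule, k = 1 + log2(n). Rounding up to the
# next whole number is an assumed convention, not specified in the text.

def sturges(n):
    """Number of classes per Sturges' rule for n observations."""
    return math.ceil(1 + math.log2(n))

print(sturges(100))    # 8
print(sturges(1000))   # 11
```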
According to Hyndman (1995), most statistical computer packages “still use
Sturges’ rule (or an extension of it) for selecting the number of classes when constructing a histogram. Sturges’ rule is also widely recommended by introductory statistics textbooks” (pp. 1-2). In addition, as maintained by Arsham (2001), Sturges’ rule still holds and produces the same results as the approximate number of classes in frequency distributions proposed by Keller (2001, p. 24) and Keller and Warrack (2003, p. 35) and displayed in Table 6 below.
Table 6
Appropriate Number of Classes in Frequency Distributions as per Keller
Number of observations    Number of classes
Less than 50              5 – 7
50 – 200                  7 – 9
200 – 500                 9 – 10
500 – 1,000               10 – 11
1,000 – 5,000             11 – 13
5,000 – 50,000            13 – 17
More than 50,000          17 – 20
On the word of Hyndman (1995, p. 2) and of Shvyrkov and Stout (1980, p. 471),
Sturges’ rule leads to oversmoothed histograms and no longer deserves a place in
statistics textbooks or as a default in statistical computer packages (mainly for sample
sizes that far exceed 200); since it assumes that frequency distributions follow only
distributions of binomial coefficients. Although the logarithm base 2 is not easy to apply and is available only on some advanced scientific hand calculators, Sturges’ formula
remains apparently popular and is still widely used (see Arsham, 2001; Hyndman,
1995, for instance).
The second formula examined is the square root of the total number of observations, denoted by √n, as recommended by Davies and Goldsmith (1980, p. 11),
Devore and Peck (2005, p. 19), Howell (1997, p. 20), Pelosi and Sandifer (2000, p. 66).
As stated by Johnson and Kuby (2007), “the square root of n formula is a reasonable
guideline for the number of classes with samples of fewer than 125 data values” (p. 57).
In contrast, for a very large data set (like one including 50,000 scores or more), using the
square root formula leads to a prohibitive number of classes. Compared to other proposed
formulas, the square root rule produces a higher number of classes than those obtained by the
other rules listed in this study.
Ishikawa (1986, p. 8) presented a small table that can be roughly fitted with the
following formula: k = 6 + n/50 (third formula). Regrettably, the second and third
formulas increase rapidly as the number of observations n goes up, particularly when n is
in the thousands. This also applies to the following three rules (the fourth through sixth formulas).
The fourth formula investigated is Rice’s rule, which sets the number of intervals to twice the cube root of the number of observations, k = 2·∛n (retrieved March 29, 2004, from http://cnx.rice.edu/content/m10160/latest/, written by David M. Lane).
The fifth formula explored is the fourth root of n rule: it sets the number of classes to two and a half times the fourth root of the number of scores collected, k = 2.5·⁴√n. It was proposed by an unknown (anonymous) author and is found in most statistics
textbooks used in Francophone African schools, such as those in Burundi, Rwanda, and Zaire.
The sixth formula reviewed is Terrell and Scott’s rule. It sets the approximate number of classes to the cube root of twice the number of observations, k = ∛(2n) (Terrell
& Scott, 1985, pp. 209-214).
Unfortunately, all six formulas above lead to overgrouping of the data when the number of observations is relatively small to moderate, and to too many classes for large batches of data (Hyndman, 1995). One additional formula was entertained.
The seventh formula uses the smallest integer k such that 2^k ≥ n, where k is the minimum number of classes and n is the number of observations. In other words, the number of classes k is the power to which 2 is raised so that the result is equal to or greater than the number of observations (Iman & Conover, 1982, p. 49; Mason & Lind, 1996, p. 31; Webster, 1992, pp. 19-20). A big advantage of this so-called rule is that it is the least affected by the sample size among all seven investigated so far.
This seventh formula, 2^k ≥ n (Iman & Conover, 1982, p. 49; Mason & Lind, 1996, p. 31; Webster, 1992, pp. 19-20), referred to as Exponential in Table 7 (continued) shown on page 67, provides satisfactory results. Unfortunately, like the first formula, the seventh formula is not easy to manipulate and might be confusing to introductory statistics students.
Two additional formulas help to find the optimal number of class intervals. The eighth and ninth formulas were conceived by Velleman (1976). Based on large-scale Monte Carlo simulations, Hoaglin (1983, pp. 7-32) corroborated
Velleman’s findings. The two new formulas give the maximum number of classes and follow:
(1) k = 2·√n for n ≤ 100, and (2) k = 10·log₁₀(n) for n ≥ 100. At n = 100, k is the same for both formulas. The number k is always rounded up to the next whole number.
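The nine rules surveyed above can be sketched together as follows (Velleman's two rules share one function). The rounding conventions are an assumption: the text specifies rounding up only for Velleman's rules, so the other results are rounded to the nearest whole number here, which reproduces most entries of Table 7 below.

```python
import math

# A sketch of the class-count rules surveyed above. Only the Velleman
# rules are rounded up, as the text states; the other rules are rounded
# to the nearest whole number (an assumed convention).

def sturges(n):       return round(1 + math.log2(n))        # k = 1 + log2(n)
def square_root(n):   return round(math.sqrt(n))            # k = sqrt(n)
def ishikawa(n):      return round(6 + n / 50)              # k = 6 + n/50
def rice(n):          return round(2 * n ** (1 / 3))        # k = 2 * cbrt(n)
def anonymous(n):     return round(2.5 * n ** (1 / 4))      # k = 2.5 * n^(1/4)
def terrell_scott(n): return round((2 * n) ** (1 / 3))      # k = cbrt(2n)

def exponential(n):
    """Smallest integer k such that 2**k >= n (the text's definition)."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k

def velleman(n):
    """Velleman's maximum: k = 2*sqrt(n) for n <= 100, 10*log10(n) above,
    always rounded up to the next whole number."""
    if n <= 100:
        return math.ceil(2 * math.sqrt(n))
    return math.ceil(10 * math.log10(n))

for rule in (sturges, square_root, ishikawa, rice,
             anonymous, terrell_scott, exponential, velleman):
    print(f"{rule.__name__:>13}: k = {rule(100)} for n = 100")
```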
The following Table 7 lists values of k for selected numbers of observed values, n, for each of the nine formulas proposed to compute an appropriate or suitable number of classes, k. A great diversity of k values can be observed.
Table 7 (Part I)

Value of k for Selected n Numbers (Sturges, Square Root, Ishikawa, Rice, and Terrell’s)

     n   Sturges           Square Root   Ishikawa       Rice       Terrell’s
         k = 1 + log₂(n)   k = √n        k = 6 + n/50   k = 2·∛n   k = ∛(2n)
    10         4            3            6         4          3
    20         5            4            6         5          3
    30         6            5            7         6          4
    40         6            6            7         7          4
    50         7            7            7         7          5
    60         7            8            7         8          5
    70         7            8            7         8          5
    80         7            9            8         9          5
    90         7            9            8         9          6
   100         8           10            8         9          6
   120         8           11            8        10          6
   150         8           12            9        11          7
   200         9           14           10        12          7
   250         9           16           11        13          8
   300         9           17           12        13          8
   400        10           20           14        15          9
   500        10           22           16        16         10
   600        10           24           18        17         11
(table continues)
Table 7 (Part I continued)

Value of k for Selected n Numbers (Sturges, Square Root, Ishikawa, Rice, and Terrell’s)

   700        10           26           20        18         11
   800        11           28           22        19         12
   900        11           30           24        19         12
 1,000        11           32           26        20         13
 5,000        13           71          106        34         22
 9,000        14           95          186        42         26
10,000        14          100          206        43         27
Table 7 (Part II)

Value of k for Selected n Numbers (Anonymous, Exponential, and Velleman Maximum’s)

     n   Anonymous      Exponential   Velleman Maximum (n ≤ 100)   Velleman Maximum (n ≥ 100)
         k = 2.5·⁴√n    2^k ≥ n       k = 2·√n                     k = 10·log₁₀(n)
    10         4             3                     7
    20         5             4                     9
    30         6             4                    11
    40         6             5                    13
    50         7             5                    15
    60         7             5                    16
    70         7             6                    17
(table continues)
Table 7 (Part II continued)

Value of k for Selected n Numbers (Anonymous, Exponential, and Velleman Maximum’s)

    80         7             6                    18
    90         8             6                    19
   100         8             6                    20                           20
   120         8             6                                                 21
   150         9             7                                                 22
   200         9             7                                                 24
   250        10             7                                                 24
   300        10             8                                                 25
   400        12             8                                                 27
   500        12             8                                                 27
   600        12             9                                                 28
   700        13             9                                                 29
   800        13             9                                                 30
   900        14             9                                                 30
 1,000        14             9                                                 30
 5,000        21            12                                                 37
 9,000        24            13                                                 40
10,000        25            13                                                 40
Class Interval Size
Once an appropriate number of classes has been calculated, the width of the class interval necessary to produce the computed number of intervals must be determined
(Spence et al., 1990). “The most difficult part (if there is one) in constructing a grouped-data frequency distribution is determining the class interval size” (Aron & Aron, 1997, p. 7; see also Glasnapp & Poggio, 1985, p. 62). Disappointingly, there is again no clear consensus on a generally accepted formula for choosing the proper length of a class interval (Summers & Peters, 1973, p. 30).
Vague Guidelines. As Groebner and Shannon (1981) said, “determining class size and also class limits is essentially arbitrary, and two individuals will often come up with different ways of arranging data into classes” (p. 36). In the words of Stockburger (2001), selecting the interval size is more art than science. For integer data, if the ratio of the range to the desired number of classes is not a whole number (as is most often the case), this number (the class width) is rounded up to the next integer value (Brase & Brase, 2001, p. 36). Some authors, such as Anderson (2004), Aron and Aron (2003), and Stockburger (2001), suggested rounding it up or down to the nearest odd integer, while others, like Frank and Althoen (1994, p. 9), recommended rounding it only up to the nearest odd integer. No specific guidelines for decimals were located.
Generally speaking, if the class interval is large in relation to the total range of the set of observations, the loss of information may be appreciable. The larger a class interval is, the greater the information loss. “If the class interval is small, the classification of data in the form of a frequency distribution may lead to very little gain in convenience over the utilization of the original observations” (Ferguson, 1989, pp. 27-28). In
summary, as a rule of thumb, Knoke and Bohrnstedt (1991) pushed for “an interval width being large enough to avoid too many categories, but not so large that significant differences between observations are concealed” (p. 41).
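For integer data, the round-up rule described above can be sketched as follows; the sample values are hypothetical, and the inclusive range is used as the numerator.

```python
import math

# A sketch of the common width rule for integer data: divide the
# (inclusive) range by the chosen number of classes k and round any
# fractional result up to the next integer (Brase & Brase, 2001).
# The low/high/k values below are hypothetical.

def class_width(low, high, k):
    """Class interval width for whole-number scores and k classes."""
    inclusive_range = high - low + 1
    return math.ceil(inclusive_range / k)

print(class_width(12, 47, 7))   # inclusive range 36, 36/7 = 5.14 -> width 6
```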
Recommended Class Interval Size. There is, again, a long list of class interval sizes being proposed. Unfortunately, almost all of them deal only with integer data.
Anderson et al. (2004), Aron and Aron (2003, p. 9), Hawkins and Weber (1980, p. 11), Mason et al. (1994, p. 30), McCall (2001, p. 36), and Spence et al. (1990), among other statistics authors, urged that class interval sizes should preferably be multiples of 5 or 10 to facilitate reading the distribution table. Witte and Witte (1997) cautioned: “The recommended interval lengths, multiples of 5 or 10, should be thought of as a rough rule of thumb to be applied with discretion rather than with slavish devotion” (p. 18).
When the width of the interval is a multiple of 5, the values chosen for either the lower limits or the upper limits of the classes are usually a multiple of 5 as well. For example, with a class length of 10, class intervals of 60-69, 70-79, …, or 61-70, 71-80, …, are more convenient to set up and interpret than, for instance, 62-71, 72-81, and so forth.
The problem with this popular approach is the likely high number of possible tables to be produced before reaching a most appropriate one.
Anderson et al. (2004), Aron and Aron (2003, p. 9), De Veaux, Velleman, and Bock (2006), Hawkins and Weber (1980, p. 11), Mason et al. (1994, p. 30), McCall (2001, p. 36), and Spence et al. (1990), for instance, advised that, whenever possible, class intervals should be of equal length. The statistical package Microsoft Excel, for instance, “cannot draw column charts with columns of unequal width” (Hunt, 2003, p. 43). The class intervals should be equally wide because, as Keller (2001) argued, “it makes the task of reading and interpreting the graph easier and it works well when the distribution of the data is symmetric” (p. 23).
According to Johnson and Bhattacharya (2001), McElroy (1979), and Sanders and
Smidt (2000) the rule requiring equal class intervals is inconvenient where large gaps exist in the data, that is, when the data are spread over a wide range, but are highly concentrated in a small part of the range with relatively few numbers elsewhere. Johnson and Bhattacharya (2001) suggested using “smaller intervals where the data are highly concentrated and larger intervals where the data are sparse in order to help reduce the loss of information due to grouping” (p. 34).
In general, using unequal class intervals may cause difficulties. For example, “if frequencies in a distribution with unequal intervals are compared, the observed variations may merely be related to interval sizes rather than to some underlying pattern” (Sanders
& Smidt, 2000, p. 53). Sanders and Smidt (2000) went on to stipulate: “Other difficulties of using unequal intervals can arise during the preparation of graphs” (p. 53). As a result, for simplicity purposes and practicability, making a frequency table with equal class size should be preferred wherever possible (Hunt, 2003; McElroy, 1979; Sanders & Smidt,
2000; Vaidyanathan & Vaidyanathan, 1987).
Odd whole-number interval lengths are recommended because even integer widths yield class midpoints that are not stated to the same level of precision as the original observations (see, for example, Anderson, 2004; Aron & Aron, 2003; Frank & Althoen, 1994; Kolstoe, 1969; Summers, Peters, & Armstrong, 1977; Thorndike & Dinnel, 2001). The same line of reasoning is given by Spiegel and Stephens (1998): “If the class intervals contain an odd number of score levels, the class midpoints would coincide with the actually observed data” (p. 38). They added: “This tends to lessen the so-called grouping error in further mathematical analysis” (Spiegel & Stephens, 1998, p. 38). For whole numbers, when the class width is an odd number, the middle score is an integer and may be conveniently used as an index value representative of all measures in the interval (Roscoe, 1975).
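This midpoint property is easy to verify with a short computation (an illustrative sketch, not from the cited sources; the helper function is hypothetical):

```python
# With integer data and inclusive integer class limits, an odd class
# width w gives an integer-valued midpoint, while an even width gives
# a midpoint ending in .5 (less precise than the raw observations).
def midpoint(lower, width):
    upper = lower + width - 1   # inclusive upper class limit
    return (lower + upper) / 2

print(midpoint(25, 9))    # 29.0 -- odd width, integer-valued midpoint
print(midpoint(25, 10))   # 29.5 -- even width, half-integer midpoint
```

For example, the interval 25-33 (width 9) has midpoint 29, a value stated on the same scale as the raw integer scores.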
Existing Formulas to Compute Class Interval Size. At least two formulas are proposed for the optimum class width. The first one was developed by Scott (1979) and extended by Wand (1997). Scott’s expression is written as w = 3.49 ∗ s ∗ n^(−1/3), where w is the class width or class length, s is the sample standard deviation, and n is the sample size.
The second formula was designed by Freedman and Diaconis (1981), cited by Hyndman (1995, p. 2), and is expressed as w = 2 ∗ IQ ∗ n^(−1/3), where w is the class width and IQ is the sample interquartile range. Both formulas are cumbersome to apply with a hand calculator because they require computing the sample standard deviation, s, and the sample interquartile range, IQ, respectively.
For both formulas, the rounding exercise could be done up or down to the nearest
integer. This rounded number could probably be either an odd whole number or a
multiple of 5, which is another source of arbitrariness.
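Both formulas can be sketched in a few lines of code (an illustration only; the function names are mine, and the quartile convention used here for IQ may differ slightly from the one in the original papers):

```python
import statistics

def scott_width(data):
    # Scott (1979): w = 3.49 * s * n^(-1/3)
    n = len(data)
    s = statistics.stdev(data)                    # sample standard deviation
    return 3.49 * s * n ** (-1 / 3)

def freedman_diaconis_width(data):
    # Freedman and Diaconis (1981): w = 2 * IQ * n^(-1/3)
    n = len(data)
    q1, _, q3 = statistics.quantiles(data, n=4)   # sample quartiles
    return 2 * (q3 - q1) * n ** (-1 / 3)
```

Either result would still need to be rounded to a usable width, so the arbitrariness noted above remains.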
Starting Point
Once the decision on interval width has been made, the specific class interval
limits can be determined. This is done by finding the starting point, that is, by
determining the lower real limit of the lowest class interval, or, equivalently, the upper
real limit of the highest class interval. Levin and Fox (2004, pp. 29-30) acknowledged that determining the actual limits of the class intervals is the most difficult part of all.
They went on to affirm that “satisfactory results come with a great deal of trial and error as well as practice” (pp. 29-30). Spatz (2005, p. 376), for his part, recommended that “each class interval begins with a multiple of the class interval size”. For example, with the raw data of Table 4, since the lowest measurement, Xlowest, is Xlowest = 28 and the class
interval width, w, is w = 9, Spatz (2005) urged that the first class interval be 27 – 35
because 27 is the highest multiple of 9 that is less than 28. He went on to specify that “an exception to this convention occurs when w = 5” (p. 376). He specified: “When the interval size is 5, it is usually better to use a multiple of 5 as the midpoint because multiples of 5 are easier to read on graphs” (Spatz, 2005, p. 376).
The choice of the starting points may affect the shape of a histogram in major and surprising ways. Farnsworth (2000) showed that “histograms may have the same number of class intervals (or even the same class widths), but give very different pictures of the distribution if their start points are different” (pp. 81-82).
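Farnsworth’s observation is straightforward to reproduce: the same scores tallied with the same class width but two different starting points can yield quite different frequency counts (a hypothetical illustration with made-up data):

```python
# Same data and same class width (9), but two different starting
# points: the resulting tallies -- and hence the histograms -- differ.
data = [28, 29, 34, 35, 36, 44, 45, 53, 61, 62]

def tallies(data, start, width, k):
    bins = [0] * k
    for x in data:
        bins[(x - start) // width] += 1
    return bins

print(tallies(data, start=25, width=9, k=5))   # [2, 3, 2, 1, 2]
print(tallies(data, start=28, width=9, k=5))   # [5, 2, 1, 2, 0]
```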
Vague Guidelines. As Kerr et al. (2002) put it, “the point at which to begin the lowest class boundary is just arbitrary and so also is the point at which to end the class boundary” (p. 11). Spatz and many other statistics authors, such as Stockburger (2001), Glasnapp and Poggio (1985), Hildebrand (1986), Hopkins et al. (1987), and May et al. (1990), proposed that the first class interval begin with a multiple of the interval size.
The same idea is shared by Champion (1970), who explained: “The reason to begin an interval with a multiple of the interval size is that this procedure can act as a check on the work done. If a mistake is made in the construction of the frequency distribution for grouped data, the error can be detected more quickly” (p. 21) and corrected. A problem with this method, again, is that it leaves room for many personal interpretations. In particular, Farnsworth (2000) warned: “Any statistical procedure in which the analyst can exercise a choice like those for histograms presents [the potential] for misleading results and erroneous decisions” (pp. 84-85).
Existing Formulas to Compute the Starting Point. Only the guidelines and recommendations mentioned above are available. No specific formulas were found to determine the starting point.
Tallies
An extra step, though not an integral part, in constructing a grouped frequency table is to count the number of observations falling into each class interval; that is, to find the class frequencies (tallies). This is best done by using a tally or score sheet (Spiegel & Stephens, 1998, p. 38). A final major step could consist of computing the class midpoints or class marks.
Summary Statistics
While tables and graphs are extremely useful in depicting a set of observations concisely, data may be described in even more condensed forms. A single number, or only a few numbers based on the observations, may be used to convey useful information concerning certain attributes possessed by the observed variables (Korin, 1975, p. 31;
Elzey 1967).
As Chou (1963) put it, “after data have been collected and tabulated, the next step in statistical analysis is to describe their properties” (p. 110). Chou (1963) went on to articulate that “to describe data involves computing a proportion, a mean, a standard deviation, or some other summary measure of the observations comprising a population or a sample” (p. 110).
From the Teaching and Learning Statistics’ Perspectives
There is a relative paucity of research dedicated to explaining this phenomenon.
The following is a summary of the limited related literature found.
Definitions of Teaching, Learning, and Education
Leamnson (1999), as cited by Miller (2000), defined teaching as “any activity that has the conscious intention of, and potential for, facilitating learning in another” (p. 3).
Later on, Leamnson (1999) warned that, as defined above, teaching did “not imply necessarily that any learning is going on” (p. 51). He went on to comment that, generally speaking, teaching “is something done by someone … not to someone” (p. 52), and concluded that good teaching, then, becomes “a matter of doing the right things under appropriate circumstances” (p. 52). He became more explicit about it by stating “telling students what they need to know is one thing; doing something that will inspire them to become motivated and actually learn is quite another” (p. 55). Leamnson (1999) finally reported he was “of the opinion that most good teaching does have elements of performance” (p. 69).
Coiro (2005) defined learning as “a process of real-life practice experiences that generates engagements and an understanding of how to think deeply, work collaboratively, and communicate effectively” (p. 3). Giorgis and Johnson (2001) complemented that definition by stating “true learning often happens when (and because) mistakes are made” (p. 87) and, even better, when they are fixed. To enhance this
“learning by mistake” definition, Taplin (2003) emphasized that “if students learn best from making their own mistakes, then a dominant role of educators is to place students in positions where they make these mistakes safely” (p. 146).
Combining the teaching and learning concepts, Leamnson (1999) claimed that
“learning is self-initiated and not externally caused” (p. 53) and, as so defined, “learning does not require teaching… But without teaching of any sort, learning is limited entirely to discovery through personal experience” (p. 53). For her part, Miller (2000) added that “teaching and learning are necessarily connected concepts, each defining the other”
(p. 21) because teaching and learning co-exist when teacher and student are partners in the learning process (Freire, 1970/1993). Bransford, Brown, and Cocking (2000) believed the primary goal of teaching is to foster “active learners who seek to understand complex subject matter and are better prepared to transfer what they have learned to new problems and settings” (p. 13).
Education, on the other hand, was delineated by Leamnson (1999) as “learning that has been facilitated by teaching” (p. 54). Compliant with the way teaching, learning, and education are defined here, Leamnson (1999) deduced that “teaching can be done with or without learning. Learning can be done with or without teaching. Education requires both teaching and learning” (p. 54).
Various Teaching Methods Documented
As per Conners, McCown, and Roskos-Ewoldsen (1998), the statistics instructor faces 4 major challenges: “To motivate students to value a topic they think is irrelevant to their life, to deal with the anxiety that is often associated with learning statistics, to effectively teach both high and low achievers, and to make the learning process memorable” (p. 40). Trying to meet these challenges, researchers have experimented with a variety of learning strategies (Beins, 1985; Conners, McCown, & Roskos-Ewoldsen, 1998; Derry, Levin, & Schauble, 1995; Dillbeck, 1983; Thompson, 1994). It has been documented and established by many researchers in education and psychology that students learn statistics in multiple ways.
One common way is learning by doing (Schank, 2005; Gelman, 2005). In particular, Conners et al. (1998) and Garfield (1995) proclaimed that learning by doing lends itself to active rather than passive learning. Zhu and Simon (1987) commented that learning by doing examples, including making mistakes (Schank, 2005; Gelman, 2005), has been shown to be effective and in some cases more so than traditional instruction.
Another approach is learning by personal involvement (Gelman, 2005). By getting students involved in their learning of statistics, statistics instructors promote student ownership of their learning (Miller, 2000). This view is supported by Johnson and John (2003), who maintained that “students learn statistical principles more effectively by active involvement in the learning process, by finding out for themselves as participation helps promote understanding and retention” (p. 94).
An additional approach to learning statistics is participating enthusiastically in classroom activities in pairs or through cooperative learning (Borresen, 1990; DeBord & Muhlig, 2004; Gelman, 2005; Gelman &
Nolan, 2002; Hsu, 2003; McLaren, 2004; Mills, 2003; Morris, Kellaway, & Smith, 1978;
Morris, Joiner, & Scanlon, 2002; Porter, Griffiths, & Hedberg, 2003; Rajecki, 2002;
Smith, 1998). As far as learning to actually use statistics is concerned, two heads are definitely better than one (Dunn, 2001) because collaborative efforts (preferably in small groups) provide a practical division of labor and necessarily promote recursive learning. In fact, collaborators must explain, justify, and defend their choices to one another.
Teaching to others has been recognized as one of the best ways to learn (Webb, 1992).
Students learn statistics also by actively building their own new knowledge and understanding on what they already know and believe (Lave, 1988). That is, students formulate new knowledge by modifying and refining their current concepts and by adding new concepts to their previous knowledge (Cobb, 1994; Greeno, Collins, &
Resnick, 1996; Lave, 1988; Mills, 2003). Besides, McKeachie (2002) concurred that learning occurs when students see a relationship between what they are hearing from the instructor and what they already know. This point inspired Mills (2003), who asserted that
“regardless of how clearly a statistics teacher in elementary and secondary schools explains a statistical concept, students will understand the material only after they have constructed their own meaning for the new concepts, which may require restructuring and reorganizing this new information and assimilating it to prior or previous knowledge” (p.
56). Eggen and Kauchak (2001) implied that constructivism also suggests that learning should be facilitated by teachers and that interaction and discussion are critical components of the learning process.
McKeachie (2003) cautioned that “teaching that works for some students may not work for others” (p. 41). A review of the literature reveals that there is no clear consensus among the authors about which teaching and learning approach should be adopted by everybody. Some authors like Dunn (2001), Giraud (1977), Keeler and Steinhorst (1995),
Potthast (1999), for instance, have advised students against learning statistics in isolation, whereas others such as Spence (2001) and Benedict and Anderton (2004) have had a diametrically opposite stance. The latter have noticed that people best learn individually
or in one-on-one relationships in which teacher and student are in constant communication with each other.
Teaching and Learning Statistics
One of the major difficulties students have in learning statistics is that “the course deals with ‘obscure stuff’ with which they have no prior experience and which they are unable to relate to the real world or to everyday experience” (Oathout, 1995, p. 29). Speer
(1979) asserted that, when given the opportunity, some students elected to work alone rather than with peers or in a large-group setting. The teaching and learning of statistics is remarkably complex because it involves not only new and difficult concepts, but also belief systems resistant to change (Metz, 1997).
Zeiner (1991) conceded that lack of previous exposure to statistics and limited mathematics experience are known to be associated with negative attitudes, which, in turn, inhibit learning. Mills (2004) emphasized that “student attitudes about statistics are important because they may influence the learning process” (p. 350). Negative student attitudes toward statistics may create a major obstacle to effective learning (Cashin &
Elmore, 1997; Fullerton & Umphrey, 2001; Schultz, Drogsz, White, & Distefano, 1998;
Walters, Mateli, Zakrajsek, & Popovich, 1988).
Factors Affecting Attitudes towards Statistics
Variables which continue to influence students’ attitudes, and ultimately statistics self-efficacy, have been investigated. Results have shown that attitudes are related to, among other variables, mathematics proficiency (Brown & Brown, 1995; Kottke, 2000), level of statistics course (Waters, Martelli, Zakrajsek, & Popovich, 1989), computer experience (Rhoads & Hubele, 2000; Gratz, Volpe, & Kind, 1993), prior exposure to
statistics (Kottke, 2000; Mills, 2004; Sutarso, 1992), and student confidence (Mills,
2004). In addition, research regarding anxiety has indicated a negative relationship; that is, the higher a student’s anxiety level, the lower the student’s performance (Ware &
Chastain, 1989).
As to computer use in classroom activities, Meletiou (2000) mentioned that the
“agreement is unanimous on the potential benefits of technology on student learning of statistics” (p. 50). However, researchers such as Behrens (1997) warned that “coupling the student with technology alone is generally insufficient to reach the desired effect” (p.
120) and Meletiou (2000) specified that “technological interventions might not work quite as well as one would like to think” (p. 50). Meletiou (2000) further added that
“despite the wide use of technology in many statistics classrooms, relatively little published research exists describing its actual impact on student learning, and curricula are often developed and implemented without the benefit of research on their effects in terms of students learning” (p. 50).
Unique Solution Feature
Learning styles (here one-solution versus multiple solution learning styles) have been implicated as representing antecedents of statistics anxiety (Onwuegbuzie, 1998;
Onwuegbuzie & Wilson, 2003). Since students’ beliefs towards statistics have generally not yet been explored, and given that many students identify statistics with mathematics, it is informative to first examine related research on beliefs about mathematics and mathematics problem solving (Mills, 2004). Schoenfeld (1992, p. 359) lists some of the typical student beliefs about the nature of mathematics and mathematical activity:
1. Mathematics problems have one and only one right answer.
2. There is only one correct way to solve any mathematics problem –usually the
rule the teacher has most recently demonstrated to the class.
3. Ordinary students cannot expect to understand mathematics; they expect simply
to memorize it and apply what they have learned mechanically and without
understanding.
4. Mathematics is a solitary activity, done by individuals in isolation.
These beliefs could be similar to students’ beliefs about statistics, but there may also be beliefs unique to statistics and statistics education that have not been recognized so far (Simon & Bruce, 1991). In her cited work, Oathout (1995) recalled the case of an MBA student with an engineering degree who was disappointed by his experience with statistics. Before taking a statistics course, he had thought that it was a math course. He maintained that engineering is a hard science and is always precise. One plus one is always two. Statistics is a soft science where one plus one is somewhere between 1.8 and
2.2. He deplored the fact that statistics always deals with error and tries to eliminate error, but engineering is always precise. He found the imprecision of statistics troublesome at first because he tried to apply the engineering approach and was disturbed by the conflict
(Oathout, 1995).
Structural and Organizational Issues
Although developmental psychologists and some other scholars claimed that human beings are alike in more ways than they are different, Fennema and Behr (1980) believed that “humans differ in relevant and significant ways on many variables that are of vital importance to the teaching-learning process” (p. 325). Allwood (1990) specified
that there are many roads to learning; students, like all human beings, learn in different ways and vary in their abilities to perform certain tasks. So, as Harris and
Schau (1999) put it, “there is no guaranteed teaching method to improve statistics learning, and it is likely that different methods are maximally effective with different students” (p. 208). Some students do their best in statistics courses only if the teaching method is well structured, that is, coherent and clearly organized (Moore, 1997a).
Conners et al. (1998) proclaimed: “A distinct organizational scheme of course material may make learning last, particularly when students are aware of the organization” (p. 41). Harris and Schau (1999) contended that a clear structure seems to reduce, among other things, anxiety for at least some students. Indeed, Bandalos, Finney, and Geske (2003) found that disorganization was a positive predictor of test anxiety, where disorganization was defined as “difficulty in establishing or maintaining a structured, organized approach to studying” (p. 607). Moreover, for Harris and Schau (1999), “step-by-step examples with every intermediate step described are useful for most students” (p. 203), chiefly for those lacking the appropriate quantitative knowledge and mental preparation.
In contrast, there are some other students who would prefer a teaching procedure with little structure. This may provide a kind of spontaneity and freedom to pursue independent ideas that can be exciting and motivating for them (Harris & Schau, 1999).
Reasoning about Graphical Representations of Data
Garfield and Chance (2000, p. 115) urged students to develop the following 4 strategic reasoning skills. First, students should understand the way in which a plot is meant to represent a data set. Second, they should understand how to read and interpret a
graph. Third, they should know how to modify a graph to better represent a data set.
Fourth, they should be able to identify the overall pattern, center, and spread in a distribution. In the statistical investigation process, Heaton and Mickelson (2002) encouraged students to figure things out for themselves by developing investigation questions, constructing data tables, and graphing data.
Attitudes towards Statistics
Beliefs and attitudes related to math may play a powerful role in affective responses to statistics since students often expect that the study of statistics will include a heavy dose of mathematics, including complex algebra and formulae (Simon & Bruce,
1991). The frequent appearance of statistics courses within mathematics departments (or as part of a high school math class) reinforces this perception.
Since practically all students have studied some high school level mathematics before starting a formal statistics class, their affective reactions to those math-learning experiences may affect how they relate to statistics learning. Students’ predispositions, beliefs, and expectations may interact with aspects of the learning environment created by the teacher in ways that would work against what the teacher is attempting to accomplish (Gal & Ginsburg, 1994).
An instrument that is frequently used to measure attitudes towards statistics is the Survey of Attitudes towards Statistics (SATS), which was developed by Schau, Dauphinee, and Stevens (1995). This instrument served as a model for the new instrument used in this study, called the Survey of Attitudes towards the Method Used (SAMU), as previously defined. Like the original SATS instrument, the SAMU instrument is broken into 4 components: affect, cognitive competence, value, and difficulty. The affect component assesses positive and negative feelings about statistics, and thus the instructional method used. The cognitive competence component assesses self-beliefs about ability in statistics. The value component assesses beliefs about the usefulness and importance of the instructional method. The difficulty component assesses beliefs about the difficulty of the instructional method used, based on beliefs about the nature of statistics. These 4 scales were also selected for analyses as dependent variables because they represent the students’ attitudes towards statistics that some research has shown to be important for statistics achievement (Elmore et al., 1993; Green, 1994; Roberts & Saxe, 1982; Schau, Stevens, Dauphinee, & Del Vecchio, 1993; Wise, 1995).
Summary and Conclusion
Statistics is a course that is nowadays not only gaining popularity all over the world, but also becoming part of diverse curricula. It is even required in virtually all academic fields where data collection is involved (Batanero, Godino, Green, Holmes, &
Vallecillos, 1994). As Batanero et al. (1994) expressed it “the teaching of statistics is currently increasing substantially in many countries, due to its widely recognized place in the general education of citizens” (p. 527). It is unfortunate, as mentioned mainly by
Freund (1998, p. 15), that statistics instructors no longer devote enough time (if any) to the construction of frequency distributions and graphs, most likely due to the widespread availability of personal computers. Also, the vagueness of the existing techniques, as well as the absence of consensus about which formula works best, forces various statistics users to delegate the task entirely to their computers (Aron & Aron, 2003, p. 9).
As recommended by Batanero et al. (1994) and Curzio (1987), the ability to critically read data (chiefly data for which no interpretation is needed, only facts explicitly expressed in the graph or frequency table) is a component of numerical literacy and a necessity in our technological society. This leaves room for a systematic approach like the one found in the iteration algorithm.
Based on these deficiencies detected in the commonly known and available algorithms, this study was designed to compensate for the lack of a standard procedure for constructing frequency tables of grouped data. Strong points of the iteration algorithm, such as yielding a unique, guess-free solution and being well structured, are backed by the teaching and learning literature.
The 5 formulas found in the iteration algorithm are well known, and their accompanying step-by-step instructions are unambiguous and relatively easy to follow and apply in a guess-free fashion. Choices yielded by the iteration algorithm can be used to modify the set-up of a default histogram generated by SPSS or any other statistical software. In chapter 3, it will be demonstrated that the majority of grouped-data statistics produced by the iteration algorithm are much closer to the corresponding statistics computed directly from the raw data than are the grouped-data statistics yielded by the SPSS algorithm. More broadly, the following chapter will describe in detail the development of the traditional algorithm along with the development of the iteration algorithm. It is shown how each of the two instructional methods was designed and how each of them works, using the same set of raw data for both cases.
From a teaching and learning perspective, the iteration algorithm favors learning by doing in that the latter lends itself to active rather than passive learning (Conners et al.,
1998; Garfield, 1995; Gelman, 2005; Schank, 2005). Results to be presented in chapter 5 and discussed in chapter 6 will suggest that learning how to construct a unique, guess-free grouped-data frequency table by using the iteration method could in many cases be more effective than using the trial-and-error approach found in most traditional methods.
CHAPTER 3: ITERATION ALGORITHM DEVELOPMENT AND
EXAMINATION
This chapter is devoted to the development and examination of each of the two instructional methods used. The two methods will be compared based on how each computes the 4 key elements that are essential for building a grouped-data frequency table.
Development of a Traditional Method with Integers
Based on the 4 key elements crucial for creating a frequency table for grouped data, a traditional method (Vaidyanathan & Vaidyanathan, 1987; Mason & Lind, 1996) was chosen to illustrate how vague the guidelines or rules provided are. The main reason for selecting this scheme was the widespread use of the formulas involved, which can be computed on most hand calculators (Howell, 1997).
Determining the Range
The formula to compute the (non-inclusive) range, R, of the observed scores is
R = Xhighest - Xlowest, where X is the observed score, Xhighest is the highest observed
value, and Xlowest is the lowest observed value.
Here, R = 93 - 28 = 65.
Choosing a Number of Class Intervals
Herbert Sturges (Sturges, 1926) considered an idealized frequency histogram with k bins (or class intervals) in which the j-th bin count is the binomial coefficient C(k − 1, j), for j = 0, 1, …, k − 1. As k increases, this ideal frequency histogram approaches the shape of a normal probability density function. The total sample size is

n = Σ_{j=0}^{k−1} C(k − 1, j) = (1 + 1)^{k−1} = 2^{k−1}
by the binomial expansion. Solving for k (the number of bins) in the above relation gives
log10(2^{k−1}) = log10(n)
⇒ (k − 1) ∗ log10(2) = log10(n)
⇒ k − 1 = log10(n) / log10(2)
⇒ k = 1 + log2(n).
This is Sturges’ rule. So the number of class intervals to choose when constructing a
histogram from normal data is k ≈ 1 + log2(n) = 1 + 3.322 ∗ log(n), where log(n) denotes the decimal logarithm of the sample size, n (Sturges, 1926, p. 65). Note that the value 3.322 comes from the relation log2(n) = log10(n) / log10(2) = (1 / log10(2)) ∗ log10(n), and 1 / log10(2) = 3.322.
Therefore, k ≈ 1 + 3.322 ∗ log(80) = 7.32. The answer could be 7 if it is decided to round down (no guidance on this matter is provided), 8 (if rounded up), or even 10 (the next higher multiple of 5, to facilitate the calculations).
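The computation above, and the rounding ambiguity it leaves open, can be reproduced directly (an illustrative sketch):

```python
import math

# Sturges' rule: k = 1 + log2(n); here n = 80 as in the example data.
def sturges_k(n):
    return 1 + math.log2(n)

k = sturges_k(80)
print(round(k, 2))    # 7.32
print(math.floor(k))  # 7 (rounded down)
print(math.ceil(k))   # 8 (rounded up)
```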
Computing a Class Width
A way of setting up the class width, w, for each class interval is given below. By
applying the formula R/k = w, the following is obtained:
First possibility with k = 7: R/k = 65/7 = 9.29, which could be rounded to either 9
or 10.
Second possibility with k = 8: R/k = 65/8 = 8.13, that is, either 8 or 9.
Third possibility with k = 10: R/k = 65/10 = 6.5, which could be rounded to either
6 or 7.
Hence, w may take on any of the 5 distinct values listed above (6, 7, 8, 9, and 10) and a unique choice among them is needed. Which unique value should be selected?
Again, no guidance is given, so an odd number can be retained such that the level of accuracy of the midpoints corresponds exactly to that of the data collected. There is still a problem here because there are 2 odd numbers: 7 and 9. Some statistics users might opt for 7, while others will choose 9. Either possibility seems plausible, but what if only one has to be retained? For the sake of illustration, however, both odd integers (i.e., 7 and 9) are examined in turn.
First option with w = 7. This selection would lead to a grouped-data frequency table with k = 10 class intervals, each of equal width 7 (Third possibility).
Second option with w = 9. This choice would force the grouped-data frequency table to have k = 7 class intervals, each of equal width 9 (First possibility).
Third option with w = 9. If this is the choice, the corresponding grouped-data frequency table will consist of k = 8 class intervals, each of equal width 9 (Second possibility).
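The enumeration of candidate widths above can be sketched as follows; rounding each ratio both down and up mirrors the two candidates mentioned for each k in the text:

```python
R = 65  # traditional inclusive range of the example data: 93 - 28 + 1
candidate_widths = {}
for k in (7, 8, 10):
    w = R / k
    # each ratio may be rounded down or up, giving two candidate widths
    candidate_widths[k] = (int(w), int(w) + 1)
print(candidate_widths)  # {7: (9, 10), 8: (8, 9), 10: (6, 7)}
```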
Picking up a Starting Point
The authors do not suggest anything or recommend any particular formula. So 25
(the closest multiple of 5, which is also less than the minimum observed score, 28) can be retained, or the minimum observed score itself, 28, which is a multiple of 7, or 27 as a
multiple of 9. However, to keep the number of trials to a minimum, only one possible and admissible starting point, 25, was retained.
Table Preparation and Construction
Prepare a list of the limits of each class interval, beginning with the starting point.
In constructing the grouped frequency table, list the lower class limit of the first class interval, that is, the lowest class limit obtained by applying Step 4. Add the class width minus one (w - 1) to this first lower class limit and write down the upper class limit. Then, list all the class limits by adding the class width to the limits of the previous class interval.
After that, determine the class midpoint of each interval by averaging its class limits.
That is, add the class lower limit to its class upper limit and divide the resulting sum by two to obtain the class midpoint. A further task is to tally the frequencies for each class. Finally, sum the frequency column and check against the total number of observations.
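The steps above can be sketched in Python (the helper name is illustrative), using inclusive integer class limits where the upper limit equals the lower limit plus w - 1:

```python
def class_limits(start, w, k):
    """Return (lower limit, upper limit, midpoint) for each of k classes."""
    rows, lower = [], start
    for _ in range(k):
        upper = lower + w - 1              # inclusive upper class limit
        rows.append((lower, upper, (lower + upper) / 2))
        lower = upper + 1                  # next class starts one unit higher
    return rows

# First option from the text: starting point 25, width 7, 10 classes
for lo, hi, mid in class_limits(25, 7, 10):
    print(lo, "to", hi, "midpoint", mid)
```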
Second Option: R = 65, k = 7, w = 9 and starting point is 25
Class 1: 25 to 33 [33 = 25 + (9 - 1) = 25 + 8];
Class 2: 34 to 42;
Class 3: 43 to 51;
Class 4: 52 to 60;
Class 5: 61 to 69;
Class 6: 70 to 78;
Class 7: 79 to 87.
Verdict: This is a wrong choice, since the last class interval, 79 to 87, fails to capture 93, the maximum score observed. Consequently, this selection is rejected as a doable option.
First Option: R = 65, k = 10, w = 7 and starting point is 25
Class 1: 25 to 31 [that is, 31 = 25 + (7 - 1) = 25 + 6];
Class 2: 32 to 38;
Class 3: 39 to 45;
Class 4: 46 to 52;
Class 5: 53 to 59;
Class 6: 60 to 66;
Class 7: 67 to 73;
Class 8: 74 to 80;
Class 9: 81 to 87;
Class 10: 88 to 94.
Verdict: This is a right choice, since the last class interval, that is, 88 to 94, captures the largest observed score, which happens to be 93. Hence, this selection is retained as a feasible option.
The resulting grouped-data frequency distribution is presented in Table 8 below.
Table 8 has two more class intervals than does Table 9.
Table 8
Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Traditional Method) [First Option: R = 65, k = 10, w = 7 and Starting Point is 25]
Class ID   Class limit   Tally*                         Frequency   Cumulative   Class midpoint
(i)                                                     (ni)        frequency    (xi)
 1         25 to 31      2 | 8 9                          2            2           28
 2         32 to 38      3 | 1 1 2 6 7 8                  6            8           35
 3         39 to 45      4 | 0 1 1 2 3 4 5                7           15           42
 4         46 to 52      4 | 6 6 7 7 8 9                 12           27           49
                         5 | 0 1 1 1 1 2
 5         53 to 59      5 | 3 4 5 6 7 7 7 7             14           41           56
                         5 | 8 8 8 8 8 8
 6         60 to 66      6 | 1 1 1 2 2 3 3 4 5 6 6       11           52           63
 7         67 to 73      6 | 7 7 8 8 9                   11           63           70
                         7 | 0 1 1 2 2 3
 8         74 to 80      7 | 4 5 6 6 7 8 9                8           71           77
                         8 | 0
 9         81 to 87      8 | 1 3 4 5 6                    5           76           84
10         88 to 94      8 | 8                            4           80           91
                         9 | 0 2 3
Total                                                   n = 80
Note: * The tally column is included here simply as an aid for determining the frequencies. It is not a necessary part of a grouped-data frequency table. The tally column here appears roughly as a stem-and-leaf plot.
Third Option: R = 65, k = 8, w = 9 and starting point is 25
Class 1: 25 to 33 [33 = 25 + (9 – 1) = 25 + 8];
Class 2: 34 to 42;
Class 3: 43 to 51;
Class 4: 52 to 60;
Class 5: 61 to 69;
Class 6: 70 to 78;
Class 7: 79 to 87;
Class 8: 88 to 96.
Verdict: This is a right choice, since the last class interval, that is, 88 to 96, captures the largest observed score, which happens to be 93. Hence, this selection is also retained as a viable option.
The resulting grouped-data frequency distribution is presented in Table 9 below.
Table 9 provides more useful information than does Table 4 (on page 35).
Table 9
Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Traditional Method) [Third Option: R = 65, k = 8, w = 9 and Starting Point is 25]
Class ID   Class limit   Tally*                         Frequency   Cumulative   Class midpoint
(i)                                                     (ni)        frequency    (xi)
 1         25 to 33      2 | 8 9                          5            5           29
                         3 | 1 1 2
 2         34 to 42      3 | 6 7 8                        7           12           38
                         4 | 0 1 1 2
 3         43 to 51      4 | 3 4 5 6 6 7 7 8 9           14           26           47
                         5 | 0 1 1 1 1
 4         52 to 60      5 | 2 3 4 5 6 7 7 7 7           15           41           56
                         5 | 8 8 8 8 8 8
 5         61 to 69      6 | 1 1 1 2 2 3 3 4 5 6 6       16           57           65
                         6 | 7 7 8 8 9
 6         70 to 78      7 | 0 1 1 2 2 3 4 5 6 6 7       12           69           74
                         7 | 8
 7         79 to 87      7 | 9                            7           76           83
                         8 | 0 1 3 4 5 6
 8         88 to 96      8 | 8                            4           80           92
                         9 | 0 2 3
Total                                                   n = 80
Note: * The tally column is included here simply as an aid for determining the
frequencies. It is not a necessary part of a grouped-data frequency table. The tally
column here appears like a special stem-and-leaf plot.
In closing, there are two acceptable tables. The final choice lies with the user.
Development of the Iteration Algorithm
Based on the assumption that a hands-on approach would enable students to better grasp the usefulness of the concepts presented (see, for instance, Fago, 1996), the iteration algorithm was conceived. Statistics instructors know that it is hard for learning to occur when emotions and attitudes in the classroom range from boredom for some students to frustration, anger, and fear for others (Schultz, Drogosz, White, & Distefano, 1998). The iteration algorithm was therefore designed to alleviate the frustration, anger, and fear statistics users may have experienced with existing methods.
As stated earlier, the iteration algorithm was developed in response to the absence of an accepted consensus about hard-and-fast rules for making decisions about the aforementioned 4 key elements: Range, number of classes, class width, and starting point. The formulas that govern the choice of groups in a frequency distribution with the iteration algorithm are succinctly depicted below. Altogether, there are six formulas, but only five of them need to be used for any given set of raw data.
Determination of the Inclusive Range of Grouped Data
The iteration algorithm introduces two new elements in the traditional formula:
The accuracy or precision of the collected data (either integers or decimals) and the number of iterations involved in the computations. The modified formula reads as
Rt = [Xhighest – Xlowest + u * t],
where Xhighest is the highest observed score, Xlowest the lowest observed score, u represents one unit of measurement (10^-1, 10^-2, 10^-3, etc., for 1, 2, 3, etc. digits after the decimal point), and t is the number of tries or iterations needed to reach the final range for grouped data. An advantage of the Rt formula over the traditional R = Xhighest - Xlowest + 1 is that Rt is applicable to both whole-number and decimal observed data.
The different figures used for computations came from the same data set given in the example found in chapter one.
Here, Rt = (Xhighest - Xlowest + u ∗ t)
= 93 - 28 + 10^0 ∗ t = 65 + 1 ∗ t = 65 + t.
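The Rt formula translates directly into code; the following is a small sketch (the function name is illustrative):

```python
def inclusive_range(x_max, x_min, t, decimals=0):
    """Rt = X_highest - X_lowest + u * t, where u = 10 ** -decimals
    is one unit of measurement of the collected data."""
    u = 10 ** (-decimals)
    return x_max - x_min + u * t

print(inclusive_range(93, 28, 1))  # 66: the range at the first iteration
print(inclusive_range(93, 28, 5))  # 70: the range at the fifth iteration
```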
Selecting a Number of Class Intervals
The iteration algorithm works with two bounds: a minimum and a maximum.
Fortunately, the literature search identified two formulas that help get the maximum number of class intervals (Hoaglin et al., 1983, Velleman, 1976). It remained to identify two other formulas to serve as possible minimum values.
1. Minimum number of classes, kmin: kmin = √(2n) when n is less than 100 and kmin = 7 ∗ log10(n) when n is at least 100 (two newly developed formulas).
2. Maximum number of classes, kmax: kmax = 2 ∗ √n when n is less than 100 and kmax = 10 ∗ log10(n) when n is at least 100 (Hoaglin et al., 1983; Velleman, 1976).
In the absence of a generally accepted formula about the number of groups to retain in a frequency distribution, seven formulas found in the literature were reviewed, and one was finally chosen because of its ease of use (the function is available in almost all calculators) and because it seemed to minimize the grouping error at least for small sample sizes (less than 100). For numbers of observations not exceeding 100, the iteration algorithm initially opted for the √n formula to get a minimum number of intervals, as proposed by
Davies and Goldsmith (1980), Devore and Peck (2005), Howell (1997), Pelosi and
Sandifer (2000).
The formula √n was selected because it also constitutes a useful starting point
(Nelson, 1988) and it is easy to apply even with a basic hand calculator. The goal was to get a formula or formulas that attempt to compromise between the desire to summarize data and the desire to retain as much detailed information as possible.
In practical terms, the iteration algorithm had to come up with two formulas that would bring the computed statistical measures much closer to those calculated directly from the raw data. After many trials and errors, the formula √(2n) for n ≤ 100 was developed along with its corresponding 7 ∗ log10(n) for n ≥ 100, where 7 stands for √50 = 10/√2. At n = 100, these two formulas are equivalent (except for rounding errors). The two formulas were obtained by dividing each of Hoaglin's formulas by √2.
When applied to a first set of 150 random samples of standard normal data, the two new formulas created to compute a minimum number of classes helped all grouped-data statistical measures, and in particular the grouped-data standard deviation, to be much closer to the ones generated directly from the raw data than applying the formula √n alone did.
Furthermore, the two new formulas produce values similar to those found by
Sincich, Levine, and Stephan (1999) and, also, McClave, Benson, and Sincich (2005).
Table 10 below summarizes their findings.
Table 10
General Rule for Determining the Number of Classes in a Histogram by Sincich et al.

Number of observations in a data set   Number of classes
Less than 25                           5 or 6
25 – 50                                7 – 14
More than 50                           15 – 20
Those were the major reasons that militated in favor of their inclusion in the iteration algorithm. Their results compare well with those yielded by SPSS. Here the following are obtained:
Determination of the Minimum Value of k, the Possible Number of Classes
Applying the formula kmin = √(2n) for n ≤ 100 produces
kmin = √(2 ∗ 80) = √160 = 12.65 ≈ 12 (rounded down). This signifies that the minimum number of possible class intervals is 12.
Setting up the Maximum Value of k
Applying the formula kmax = 2 ∗ √n for n ≤ 100 yields
kmax = 2 ∗ √80 = 2 ∗ 8.94 = 17.89 ≈ 18 (rounded up). This means that the maximum number of possible class intervals is 18.
(Note that the number of groups, k, always lies between these two bounds; that is, kmin = 12 ≤ k ≤ kmax = 18.)
Fixing on the Class Interval Size
As stated by Glasnapp and Poggio (1985), “The most difficult part (if there is one) in constructing a grouped-data frequency distribution is determining the interval
width” (p. 62). The traditional formula R/k = w is modified to Rt/k = w, with w required to be an odd integer, and it operates by iteration until a compatible trio composed of Rt, k, and w is found, where kmin ≤ k ≤ kmax. Here Rt, k, and w are determined at the same iteration step, whereas with a traditional method the 3 variables are computed at 3 different times and separately.
With the same data from Table 4, one gets Rt/k = w if and only if the ratio Rt over
k yields an odd whole number. One can proceed by successive iterations.
First iteration (t = 1): R1/k = [93 – (28) + 1 * 1]/k = (65 + 1)/k, or 66/k, where R1 is the initial inclusive range or the range at iteration 1 and k is any integer between 12 and 18
(inclusive). The inclusive range at iteration 1, R1, of 66 is divisible by 2, 3, 6, 11, 22, 33, and 66. Unfortunately, none of them belongs to the given range of values from 12 up to
18, limits included.
Second iteration (t = 2): R2/k = [93 – (28) + 1 * 2]/k = (65 + 2)/k, or 67/k. No chance as 67 is a prime number.
Third iteration (t = 3): R3/k = [93 – (28) + 1 * 3]/k = (65 + 3)/k, or 68/k. The number 68 is divisible only by 17. Unfortunately, the ratio 68 over 17 is 4, an even value.
A fourth iteration is needed.
Fourth iteration (t = 4): R4/k = [93 – (28) + 1 * 4]/k = (65 + 4)/k, or 69/k. Only 3 and 23 (values falling outside the considered interval 12-18) divide 69. An additional iteration is still needed.
Fifth iteration (t = 5): R5/k = [93 – (28) + 1 * 5]/k, or 70/k. The number 70 is a multiple of 14. Thus, w = R5/k = 70/14 = 5 (an odd value). In other words, one may have k
= 14 class intervals with an equal class interval width, w, of 5. The final inclusive range R
(R = R5) for our grouped-data frequency table will be 70, and the number of necessary tries or iterations, t, is 5 (an odd value).
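The five iterations above can be condensed into a single search loop. The following is a minimal Python sketch for whole-number data (u = 1); the function name is illustrative:

```python
def find_trio(x_max, x_min, k_min, k_max, u=1):
    """Iterate t = 1, 2, ... until Rt = x_max - x_min + u*t is divisible
    by some k in [k_min, k_max] with an odd quotient w = Rt // k."""
    t = 0
    while True:
        t += 1
        r_t = x_max - x_min + u * t
        for k in range(k_min, k_max + 1):
            if r_t % k == 0 and (r_t // k) % 2 == 1:
                # iterations, final range, number of classes, class width
                return t, r_t, k, r_t // k

print(find_trio(93, 28, 12, 18))  # (5, 70, 14, 5), as in the fifth iteration
```

With decimal data (u = 0.1, 0.01, ...) the divisibility test would need to be done on scaled integers to avoid floating-point error; this sketch covers only the integer case worked in the text.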
Setting up the Starting Point
No known formulas are found with traditional methods. The iteration algorithm
applies these two new formulas: (Xlowest - t*u/2) if t is an odd number, and (Xlowest - t*u/2
+ u/2) if t is an even number, where t is the final iteration used to find w, the class width.
These two new formulas help to determine the starting point without resorting to
many unsuccessful trials, which, in turn, save time and limit errors associated with the
latter. Another advantage is that the last two formulas give directly class boundaries
rather than class limits.
With whole numbers, the two formulas to be applied here follow:
(Xlowest - t * 0.5), if t is an odd number of tries or iterations; otherwise
(Xlowest - t * 0.5 + 0.5), when t is an even number of tries or iterations;
where Xlowest is the smallest score observed in the data set and t denotes the number of
tries or iterations necessary to reach the smallest odd whole ratio, w, which represents the
class interval length or class interval width.
Now plugging the lowest score and the value 5 (an odd integer), the number t of
iterations, into the first of the two formulas shown above yields:
Xlowest - t * 0.5 = 28 - 5 * 0.5, which reduces to 25.5. Therefore, the lowest class
boundary becomes 25.5.
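The two starting-point formulas can be sketched as one helper (the function name is illustrative):

```python
def starting_point(x_min, t, u=1):
    """Lowest class boundary: X_lowest - t*u/2 when t is odd,
    X_lowest - t*u/2 + u/2 when t is even."""
    boundary = x_min - t * u / 2
    return boundary if t % 2 == 1 else boundary + u / 2

print(starting_point(28, 5))  # 25.5, the lowest class boundary in the text
```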
Table Preparation and Construction
The table should begin with the starting point. In constructing the grouped
frequency table, list the lower class boundary of the first class interval, the lowest class
boundary obtained by applying Step 4. Add the class interval width (w) to this first lower class boundary. Write down the upper class boundary. Then, list all the class boundaries by adding the class width to the boundaries of the previous class interval. After that, determine the class marks of each interval by averaging the class boundaries. A further task is to tally the frequencies for each class. Finally, sum the frequency column and check against the total number of observations.
First class interval: Lower class boundary = 25.5.
Upper class boundary = 25.5 + 1 ∗ w = 25.5 + 5 = 30.5.
Second class interval: Lower class boundary = 30.5.
Upper class boundary = 30.5 + w = 30.5 + 5 = 35.5.
Third class interval: Lower class boundary = 35.5.
Upper class boundary = 35.5 + w = 35.5 + 5 = 40.5.
And the last (fourteenth) class interval: Lower class boundary =
25.5 + 13 ∗ w = 25.5 + 13 ∗ 5 = 25.5 + 65, or 90.5.
Upper class boundary = 25.5 + 14 ∗ w = 25.5 + 14 ∗ 5 = 25.5 + 70, or 95.5.
Check: Final Range, R = Last upper class boundary – First lower class boundary =
95.5 - 25.5 = 70 (same result as before). The grouped-data frequency distribution is presented in Table 11 below and the corresponding histogram is displayed by Figure 4. It looks nicer than the default histogram generated by SPSS (Figure 2).
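Listing the class boundaries and verifying the final range check can be sketched as:

```python
start, w, k = 25.5, 5, 14          # values found by the iteration algorithm
boundaries = [start + i * w for i in range(k + 1)]
midpoints = [(a + b) / 2 for a, b in zip(boundaries, boundaries[1:])]
print(boundaries[0], boundaries[-1])   # 25.5 95.5
print(boundaries[-1] - boundaries[0])  # 70.0, the final inclusive range R
print(midpoints[0], midpoints[-1])     # 28.0 93.0
```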
Table 11
Grouped-data Frequency Table for the Statistics Final Exam Grades of 80 Students
(Iteration Algorithm)
Class boundary   Frequency   Cumulative   Class      Relative frequency
                 (ni)        frequency    midpoint   in percent
                                          (xi)
25.5, 30.5        2            2           28         2.5*
30.5, 35.5        3            5           33         3.75*
35.5, 40.5        4            9           38         5
40.5, 45.5        6           15           43         7.5
45.5, 50.5        7           22           48         8.75
50.5, 55.5        8           30           53        10
55.5, 60.5       11           41           58        13.75
60.5, 65.5        9           50           63        11.25
65.5, 70.5        8           58           68        10
70.5, 75.5        7           65           73         8.75
75.5, 80.5        6           71           78         7.5
80.5, 85.5        4           75           83         5
85.5, 90.5        3           78           88         3.75
90.5, 95.5        2           80           93         2.5
Total            n = 80                              100
Note: * 2.5 = (2 ∗ 100)/80 and 3.75 = (3 ∗ 100)/80
[Histogram of the students' grades by the iteration method's choices; x-axis: grades obtained in the final statistics exam; y-axis: frequency; Std. Dev = 15.96, Mean = 60.0, N = 80.]
Figure 4. Histogram by the iteration algorithm.
Iteration Algorithm versus Traditional Method
The iteration algorithm was labeled a structured, near-error-free or near-errorless tactic, while the traditional instructional method was defined as a generate-and-test learning process. It was posited that students would learn more using the iteration and generate-and-test methods than using the traditional instructional method alone. At least 3 investigations, reported below, support this postulation. First, Moore and Goldiamond (1964) used a fading procedure to train form discrimination in nursery school children and concluded that “the errorless procedure was more efficient than trial-and-error procedures in producing the most perfect learning with the least practices” (p. 269).
Second, Heckler, Fuqua, and Pennypacker (1975) investigated the relative efficacy of an errorless, fading procedure for teaching complex verbal statements to 23 undergraduate psychology students. They discovered that “learning under the fading procedure was significantly better than learning by trial-and-error controls or by an experimental group using a non-fading method” (p. 103).
More recently, William, Ludlow, and Walls (1978) used prompting and fading techniques to compare teaching tasks under a few-or-no-errors condition with trial-and-error learning. Their findings indicate that “percentage of errors was less for errorless fading than trial-and-error in initial learning, but did not differ during transfer or retention” (William et al., 1978, p. 553).
Monte Carlo Simulations
Monte Carlo simulations consist of comparing the outcomes from the iteration algorithm against those generated by SPSS. For data analysis purposes, the values on each of the 9 statistical measures serve as the dependent variable; the independent variable is the algorithm used (iteration versus SPSS). The goal is to discover which algorithm produced statistical measures closer to those calculated directly from the raw data, for each of the 9 statistical measures and for varying sample sizes. For each statistical measure, the procedure consists of computing, on one hand, the differences between the values calculated directly from the raw data and the values obtained from the grouped-data frequency tables yielded by SPSS and, on the other hand, the differences between the values calculated directly from the raw data and the values obtained from the grouped-data frequency tables produced by the iteration algorithm. With this in mind, this preoccupation is investigated in 3 steps.
First, on average, are the computed values on each of the 9 statistical measures much closer between raw data and iteration algorithm than between raw data and SPSS?
Second, on average, do the computed values on each of the 9 statistical measures differ among the two computing methods used according to the sample size (small, medium, and large) or differ overall?
Third, does the SPSS algorithm or the iteration algorithm have one effect in one sample size, but a different effect in another sample size? For example: Does the iteration algorithm work better for large sample sizes than for any other sample sizes?
Random Number Generation of Normal Data
Before reporting the main outcomes of the Monte Carlo simulations, it is worth noting that the normal data used in this study were randomly generated by running the
Random Number Generation function of the Microsoft Excel (2003 edition). The actual random number generation process was completed in early 2004 (between January and
February, 2004). The recently reported problem of getting numerous negative normal data values was not observed during that period of time.
For replication purposes, the following steps were followed. First, open an Excel spreadsheet. This is done by clicking on the Start button, then choosing Programs, then
Microsoft Office, and Microsoft Office Excel 2003. Second, from the Excel main menu, select Tools, then Data Analysis, and Random Number Generation. Click OK. Third, in the slot in front of Number of Variables, enter 1. Fourth, in the slot in front of Number of
Random Numbers, enter 50 (to get a small sample size) or 100 (for a medium sample size) or 250 (for a large sample size).
Fifth, choose the Normal distribution from the list of Distributions. Leave intact the default format. That is, leave unchanged Mean = 0 (zero) and Standard Deviation = 1
(one). Sixth, the Random Seed is chosen in the following way: In the slot appearing in front of Random Seed, enter any number between 1 and 999, for instance. Keep track of the random seed entered each time you let Excel generate a set of random numbers. In this way, no random seed will be entered more than once for the entire process. Seventh, select Output Range and enter A1:A50 in the corresponding dialog box. Press OK.
Finally, repeat the above process 150 times (50 times for each particular sample size).
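For readers without Excel 2003, an equivalent sample can be drawn in Python. This is a stand-in sketch, not the generator actually used in the study, and the seed value is illustrative:

```python
import random

random.seed(123)  # record each seed so that none is reused, as in the text
sample = [random.gauss(0, 1) for _ in range(50)]  # one small-size sample
print(len(sample))  # 50 standard normal values (mean 0, std dev 1)
```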
Main Outcomes
For each of the 9 statistical measures for all 150 random samples, two sets are computed: Set I = (Raw Data Statistical Measure - SPSS Statistical Measure) and Set II = (Raw Data Statistical Measure - Iteration Statistical Measure). Then each difference set is squared. After that, the square roots of differences between the two sets for each individual measure, that is, √((Set I)² − (Set II)²), are compared: (1) for all 150 samples and (2) for each of the three groups of 50 samples according to their respective size (small size = 50 normal random numbers, medium size = 100, and large size = 250). The smaller the results obtained, the closer the results are to those produced directly from the raw data.
In Table 12 and Table 13, each third row (SPSS - Iteration) shows the difference of outcomes from the two algorithms. A positive difference means the result from the iteration algorithm is smaller. In other terms, a positive difference signifies that the iteration algorithm outcome is much closer to the corresponding value computed directly from the raw data than the comparable outcome produced by SPSS. Since there are more
positive differences than negative ones, it can be concluded that, overall, the algorithm used by SPSS almost always yielded much higher deviations (in absolute values) than did the iteration algorithm. This turns out to be to the advantage of the iteration algorithm.
In particular, the sample size plays an important role in the comparison between the SPSS and the iteration algorithm outputs, mainly for large sample sizes. In fact, SPSS produced here only one statistical measure of dispersion that is closer to the corresponding measure computed directly from the raw data than did the iteration algorithm; and this number of closer statistical measures goes up as the sample size decreases from 250 down to 50. Overall, the iteration algorithm produced 20 much closer results out of the 27 comparisons made. In other terms, the values from the iteration algorithm were smaller than those from the SPSS algorithm. Further details follow in the next two tables provided below.
Table 12
Overall Summary Table of Square Root of Deviations (SRD)
Method Used Mean Median STD Range First Quartile
SPSS 0.064995 0.099261 0.01392 0.365533 0.078972
Iteration 0.010200 0.042151 0.01436 0.061817 0.094111
SPSS – Iteration 0.054795 0.057110 -0.0004 0.303716 -0.015139
Which method yields
a smaller SRD? Iteration Iteration SPSS Iteration SPSS
(table continues)
Table 12 (continued)
Overall Summary Table of Square Root of Deviations (SRD)
Method Used        Third Quartile   Semi-IQR   Skewness   Coefficient of Variation
SPSS 0.074329 0.05152 0.306839 171.4759
Iteration 0.046900 0.05285 0.212013 141.3795
SPSS – Iteration 0.027429 -0.0013 0.094826 30.09640
Which method yields a
smaller SRD? Iteration SPSS Iteration Iteration
For the 150 random samples combined: Out of the nine (9) statistical measures, SPSS outperforms the iteration algorithm in just three (3) cases (all are statistical measures of dispersion).
The results of these 3 cases follow: STD (Standard Deviation) = -0.0004, First Quartile =
-0.0151, and Semi-IQR (Semi Interquartile Range) = -0.0013.
Although the number of class intervals produced by the SPSS and the iteration algorithms depended on both the sample size (the number of scores observed) and the range (the difference between the largest score and the smallest score), the two algorithms brought in quite varying class interval numbers. The SPSS algorithm produced numbers of class intervals extending from 9 to 29, while the iteration yielded numbers of class intervals ranging from 14 to 24. Overall, the SPSS algorithm tended to generate a larger number of class intervals than did the iteration algorithm. This could
explain, at least partially, why the SPSS generated much better STD, first quartile, and semi-IQR values than did the iteration algorithm.
Table 13
Summary Table by Sample Size of Square Roots of Deviations (SRD)
                   Mean       Median     STD        Range      First Quartile
Large Size 1 0.109628 0.156118 0.00639 0.284089 0.114089
2 0.005077 0.027832 0.00629 0.063891 0.052469
3 0.104551 0.128286 0.0001 0.220198 0.06162
4 Iteration Iteration Iteration Iteration Iteration
Small Size 1 0.016068 0.063369 0.01806 0.403356 0.066686
2 0.01513 0.055426 0.02116 0.038497 0.150813
3 0.000938 0.007943 -0.0031 0.364859 -0.084127
4 Iteration Iteration SPSS Iteration SPSS
Medium Size 1 0.019935 0.035604 0.01464 0.39751 0.036088
2 0.008423 0.038518 0.01167 0.077447 0.033089
3 0.019935 -0.00291 0.00297 0.320063 0.002999
4 Iteration SPSS Iteration Iteration Iteration
(table continues)
Note: 1 = SPSS 2 = Iteration algorithm 3 = Difference between the SPSS and the iteration algorithms 4 = Which method has the smaller SRD?
Table 13 (continued)
Summary Table by Sample Size of Square Root of Deviations (SRD)
                   Third Quartile   Semi-Interquartile   Skewness   Coefficient of Variation
Large Size 1 0.114197 0.02062 0.393863 134.6167
2 0.028106 0.03252 0.175954 101.4435
3 0.086091 -0.0119 0.217909 33.1732
4 Iteration SPSS Iteration Iteration
Small Size 1 0.048643 0.08271 0.290599 221.0278
2 0.058928 0.08062 0.244528 206.8633
3 -0.010285 0.00209 0.046071 14.16450
4 SPSS Iteration Iteration Iteration
Medium Size 1 0.034164 0.02666 0.221641 145.7293
2 0.048466 0.02896 0.217254 82.95367
3 -0.014302 -0.0023 0.004387 62.77563
4 SPSS SPSS Iteration Iteration
Note: 1 = SPSS 2 = Iteration algorithm 3 = Difference between the SPSS and the iteration algorithm 4 = Which method has the smaller SRD?
Large Size
The SPSS algorithm did relatively better than did the iteration algorithm in one
(1) case (a measure of dispersion), that is, SIQ (Semi Interquartile Range) = -0.0119.
Small Size
The SPSS algorithm outperforms the iteration algorithm in three (3) cases (all are statistical measures of dispersion): STD (Standard Deviation) = -0.0031, First Quartile = -0.0841, and Third Quartile = -0.0103.
Medium Size
The SPSS algorithm outperforms the iteration algorithm in three (3) cases (the first is a statistical measure of location and the other two are measures of dispersion).
Outcomes of the 3 cases are given below: Median = -0.0029, Third Quartile = -0.0143, and SIQ (Semi Interquartile Range) = -0.0023.
Conclusion
As hypothesized, for each of the 9 statistical measures, the square roots of deviations between the raw data and the grouped data yielded by the iteration algorithm are, on average, smaller than those between the raw data and the grouped data produced by the SPSS algorithm, for varying sample sizes. In light of the results obtained so far from the Monte Carlo simulations, the iteration algorithm holds promise for providing valid grouped-data frequency tables. Further development work is recommended, mainly in the areas of teaching and learning.
The next chapter presents in detail variables to be used in the survey and factors retained in the scale utilized. The instrument included in the survey helped collect the data that would adequately tackle the issues raised in chapter one. Chapter 4 also exhibits the research design consistent with the research questions along with statistical methods essential to answer them.
CHAPTER 4: METHODOLOGY
Introduction
The following sections introduce, in order, the goals, methods, and participants.
Next, brief scale outcomes from the surveys are presented. Finally, methods of data collection and data analysis are discussed.
Research Design
Intact class sections from two academic departments are used. Ohio University students were selected from 4 class sections of the mathematics department and also from
4 other class sections of the psychology department. All of these students were attending an elementary statistics class during the 2005 summer sessions and 2005 fall quarter at
Ohio University, Athens Campus.
In either department, students from two sections received the traditional method and students from the other two sections got the iteration algorithm. Students in each class section were taught by their regular instructor, either the iteration algorithm or the traditional method, but not both. The survey was administered only once to each departmental class section.
Advantages of the Intact Group Comparison Design
From the related literature, the intact or static group comparison design used here has a number of advantages. Some of the most important advantages are listed below.
First, the data collection process with this type of design is very feasible and relatively easier since it uses pre-existing samples (Wiersma, 2002). Second, it is economical.
Third, there is a rapid turn-around in data collection. Fourth, the participants are measured in their natural setting. Fifth, it can identify attributes of a population from a
small group of individuals, as presented in Fowler (1988). Sixth, there is a possibility of gathering data on many variables such as attitudes, perceptions, beliefs, and behaviors.
Seventh, it is good for exploratory research or pilot studies since it may provide a moderate basis for comparing treatments. An eighth advantage is that it generates hypotheses for future research and provides data that may be useful to many researchers.
Finally, the presence of a comparison group does control for history, because it is assumed that events occurring outside of the experimental setting will equally affect both groups (Gay & Airasian, 2000; Gravetter & Wallnau, 1985; Sanders et al., 1976).
Disadvantages of the Intact Group Comparison Design
There are two major problems with this research design. The first is the lack of random assignment of subjects. The second is that there are no pretest data to provide any basis for group equivalency. The participants may have had different skill levels prior to the survey and therefore would have performed independently of their respective academic department.
Operational Definitions of the Variables
Survey of Attitudes toward the Method Used
Six main variables are considered overall. On one hand, there are two independent variables: departmental class section, with two categories (MATH 250 and PSY 120), and the instructional method taught, also with two categories (the treatment, or iteration algorithm, and the control, or traditional method). On the other hand, under the attitude subscale of the Survey of Attitudes towards the Method Used (SAMU) scale, there are four quantitative continuous dependent variables: the total attitude scores obtained by students on each of the four subscales (affect, cognitive competence, value, and difficulty) found in the original Survey of Attitudes towards Statistics (SATS) instrument developed by Schau, Stevens, Dauphinee, and Del Vecchio (1995).
Measures to Represent the Student Perceived Ability in Statistics
Two additional independent variables were considered as measures of student perceived ability in statistics: expected course grade and perceived mathematics ability (Perney & Ravid, 1991; Schultz, Drogosz, White, & Distefano, 1998; Onwuegbuzie & Wilson, 2003). The expected course grade originally served as a covariate; ten levels (from grade F to grade A) are assigned to this variable. It turned out not to be an appropriate covariate for the present data. The variable student perceived mathematics ability, measured by the item "How good at mathematics are you?" with seven levels (from 1 = Poor up to 7 = Excellent), served as the alternate covariate.
Identification of Population
Ideally, our target population consists of all American students with some basic statistics pre-knowledge, or mathematics competence, or computer literacy, or a combination of some or all of these factors. In practice, the accessible population consisted of all Ohio University undergraduate students who had taken at least one statistics course during the 2005 summer sessions or the 2005 fall quarter.
Sampling Plan
A relatively large sample size was needed in order to increase statistical power. Efforts were made to reach out to Ohio University undergraduate students willing to participate in the experiment. The incentive used to attract more participants was a bonus of 2 extra points.
Four class sections were selected, two from each department: two sections of the MATH 250 course from the mathematics department and two sections of the PSY 120 class from the psychology department. In each department, the instructor retained was teaching two class sections per quarter.
Both the MATH 250 (Introduction to Probability and Statistics) and the PSY 120
(Elementary Statistics Reasoning) courses are offered during each fall quarter, spring quarter, and summer session. It was the instructor who decided which section would be the treatment group (the group taught the iteration algorithm) and which section would serve as the control group (the one taught the traditional method). Altogether, 200 questionnaires were distributed, and 170 students completed the surveys and turned in their questionnaires to their respective instructors. Hence, the response rate was 85%. Of these 170 students, 84 were from the math department and 83 were taught the iteration algorithm.
Instrumentation
Selection/Development of Instrument
Many efforts have been made to investigate affective factors that influence a student’s statistical performance. Much of this research involves surveys designed to quantify students’ attitudes toward statistics. In efforts to quantify and understand these attitudes, at least seven Likert-type instruments claiming to measure attitudes towards statistics have been developed (Hilton et al., 2004; Mills, 2004). These instruments include the Multifactorial Scale of Attitudes Toward Statistics (Auzmendi, 1991), Bad Attitude
Toward Statistics (Berk & Nanda, 1998), Statistics Attitude Survey (Roberts &
Bilderback, 1980), the Survey of Attitudes Toward Statistics (SATS; Schau, Stevens,
Dauphinee, & Del Vecchio, 1995), Students’ Attitude Toward Statistics (Sutarso, 1992a and 1992b), and Attitudes Toward Statistics (Wise, 1985). Additionally, at least two
Likert-type instruments claiming to characterize how statistics anxiety influences statistical performance have been designed (Mills, 2004). They include Statistical
Anxiety Rating Scale (Cruise, Cash, & Bolton, 1985) and Statistics Anxiety Inventory
(Zeidner, 1991).
Given that the objective was to use a validated, accepted tool solely to assess student attitudes and achievement skills, the SATS-Schau was retained in this work.
More precisely, the SATS-Schau was selected because it had been used in at least four large-scale investigations and two confirmatory factor analysis studies had already established its construct validity (Schau et al., 1995; Dauphinee et al., 1997; Mills, 2004). Also, the subscales on the survey represent important attitudes that are related to student achievement (Elmore, Lewis, & Bay, 1993; Schau et al., 1993), and, more importantly, it is the instrument closest to our research needs.
The SAMU instrument used here (shown in Appendix B) is an adapted and reduced version of the SATS (Schau et al., 1995) to better match our purposes.
Permission was sought several times, but unfortunately the principal author never responded to our repeated requests. This new instrument, the SAMU, is a questionnaire consisting of 26 initial statements. Using a Likert-type scale, all students responded to each of the first 20 statements (attitude items) by circling a number from 1 to 7, where 1 indicates “Strongly Disagree”, 2 represents “Moderately Disagree”, 3 signifies “Slightly Disagree”, 4 stands for “Neither Agree nor Disagree”, 5 symbolizes “Slightly Agree”, 6 designates “Moderately Agree”, and 7 denotes “Strongly Agree”. These first 20 items are designed to measure a student’s attitudes toward each of the two instructional methods taught, in terms of their respective applicability.
The scores of the next 5 statements (background characteristics items) were also recorded on a 7-point Likert-type scale. These five questions are related to background information (mathematics perceived competency, computer literacy, statistics pre- knowledge, length of high school mathematics exposure, and the cumulative number of college mathematics/statistics courses taken).
For the last question, the response scale was a 10-point scale, with 1 = F, 2 = D, 3 = C-, 4 = C, 5 = C+, 6 = B-, 7 = B, 8 = B+, 9 = A-, and 10 = A. This item reflects the letter grade a student expected to earn in either MATH 250 or PSY 120.
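The 10-point coding just described can be written as a simple lookup table; the sketch below is our own illustration, and its names are not part of the SAMU instrument itself:

```python
# Coding of the expected-course-grade item, as described for the SAMU
# questionnaire: 1 = F up to 10 = A (illustrative names, ours).
GRADE_SCALE = {
    "F": 1, "D": 2, "C-": 3, "C": 4, "C+": 5,
    "B-": 6, "B": 7, "B+": 8, "A-": 9, "A": 10,
}

def code_expected_grade(letter: str) -> int:
    """Map an expected letter grade to its 10-point numeric code."""
    return GRADE_SCALE[letter]
```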
An additional seven self-efficacy items are included in the SAMU instrument.
These are dichotomous items scored in a 2-point format, anchored at both ends (with Pass = 1 and Fail = 0, respectively).
Four subscales are derived from the first 20 statements; they are briefly described below. The main definitions are those provided by Schau et al. (1995) and Dauphinee et al. (1997). Our interpretations are inspired by examples found chiefly in
Cruise et al. (1985), but also in Bell (1998a and 1998b), Cherian & Glencross (1997),
Fullerton & Umphrey (2001), Kennedy & McCallister (2001), Sutarso (1992a and
1992b), Zanakis & Valenzi (1997), and Zeidner (1991).
Facet 1: Affect. This factor deals with examinees’ positive and negative feelings concerning the instructional method used. It is posited (based on past experience and a review of the related literature) that those who score higher on this factor are most likely to have more positive attitudes toward the instructional method presented to them. Since frequency tables are among the techniques taught in statistics courses, higher scores on this facet can be interpreted as an indication of more positive attitudes toward both statistics and the instructional method used for constructing grouped-data frequency distributions. Those with lower scores on this facet may exhibit more negative attitudes toward both statistics and the instructional algorithm used. This facet has 5 items (Items 1, 2, 10, 11, and 15).
Facet 2: Cognitive Competence. This factor concerns attitudes about intellectual knowledge and skills when applied to statistics. Those skills may be acquired in a number of ways, such as through heredity (e.g., parental influence: if both parents were strong in mathematics/statistics, some of their children would most likely be too) or through long, hard, and successful practice (e.g., computer literacy, mathematics background, statistics pre-knowledge). People with higher scores on this factor could be those who feel adequate when applying mathematical/statistical formulas. They also feel more at ease when dealing with large sets of data than participants with significantly lower scores. High (or low) scores would be interpreted as an indication of positive (or negative) feelings toward either or both instructional methods.
This component also comprises 5 items (Items 3, 8, 12, 14, and 18).
Facet 3: Value. This factor deals with attitudes about the usefulness, relevance, and worth of statistics in personal and professional life. In particular, it captures the interest that people show in the instructional method presented and its possible use in their future professional lives. This would arise, for example, from using the method in classroom activities or professional work, or even recommending it, in conjunction with technology, for classroom teaching purposes. On one hand, in this computer age, some students may feel that computers should do everything and relieve human beings of all sorts of complicated mathematical computations that used to be done by hand and/or with hand calculators. Such students, over-dependent on computers, frequently dislike doing computations by hand and/or calculator. On the other hand, there are some students who do not like using computers and do not like hand and/or calculator computations, either. As a result, a person scoring very low on this facet of the SAMU instrument sees little or no benefit or advantage in the instructional method used. An examinee scoring low on this factor also feels that the instructional algorithm used is almost worthless, now and in the future, thus indicating a negative attitude toward it. Again, this facet has 5 items (Items 5, 7, 9, 17, and 19).
Facet 4: Difficulty. This factor reveals the examinees’ attitudes about the difficulty of statistics as a learning domain. On the SAMU instrument, a very high score on this factor indicates a very positive overall attitude towards the method for building frequency tables for grouped data. Conversely, a student who scores low on this factor feels less confident using the method than the majority of his/her classmates. In addition, low scores would most likely mean that the scorers had problems understanding the instructional method used, because it was harder for them to learn and, probably, they did not like it at all. Consequently, students who score low on this factor are expected to exhibit, overall, more negative attitudes toward the method for constructing grouped-data frequency distributions. Like the other factors, this one includes 5 items (Items 4, 6, 13, 16, and 20).
Even though, strictly speaking, it is difficult to assess the quality of the data collected, it is easier to assess the accuracy of the instrument used to collect those data. This assessment consists primarily of examining the reliability and validity characteristics of the survey or measurement instruments (Litwin, 1995, p. 3).
Only the first of these two important issues is addressed in the Pilot Studies.
Pilot Studies
To check for reliability and item functioning, the instrument was tested during the 2002 fall quarter and the 2003 fall quarter. Two convenience samples were used for these two pilot studies. A total of 45 graduate OU students responded, including 20 females and 25 males; among them were 19 Master’s and 26 PhD students. On each administration of the test, the same scale was used twice: once with data based on the iteration algorithm, and once with the traditional algorithm as the data provider.
Reliability Issues
Cronbach’s Alpha as an Index of Reliability
Reliability is concerned with the consistency, stability, and dependability of the results (McMillan, 1997, p. 60). In particular, according to Litwin (1995), it is a
“statistical measure of how reproducible the survey instrument’s data are” (p. 6). For the
SAMU scale used, alpha reliability coefficients were computed and the results analyzed and compared. There is a wide variety of internal consistency measures that can be used, but Cronbach’s alpha was chosen because it is a popular method of measuring reliability (Aron & Aron, 1999, p. 527). Cronbach’s alpha was also used because the scale was administered only once. In addition, it was the coefficient Schau et al. (1995) used for their SATS instrument, which serves here as our criterion.
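For reference, Cronbach’s alpha for a k-item scale is alpha = k/(k - 1) x (1 - sum of the item variances / variance of the summed scale). The short sketch below is our own illustration of that formula; the study itself computed alpha in SPSS:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) matrix of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```

Applied to an (n respondents x 5 items) matrix of subscale responses, it produces coefficients of the kind reported in Table 14.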
Pilot Survey Results
Out of a total of 50 students, 45 of them returned their filled-out questionnaires.
This is a 90% response rate. Table 14 reports measures of internal consistency reliability among the dependent variables. There are 26 items overall: 20 items for the attitude subscales, 5 items for the background subscale, and 1 item that serves as the covariate. The 7 items for the self-efficacy subscale are not considered in the item analysis because they concern only the iteration algorithm.
Table 14
Reliability Statistics for the Iteration Algorithm and the Traditional Method
Dependent Variable        Iteration Algorithm          Traditional Method
                          Cronbach's Alpha   Items     Cronbach's Alpha   Items
Attitude Subscale
  Affect                  .857               5         .820               5
  Cognitive               .762               5         .743               5
  Value                   .805               5         .805               5
  Difficulty              .761               5         .723               5
  Overall                 .949               20        .941               20
Based on the 20 overall items, the Cronbach’s Alpha coefficients were .95 and .94 for the iteration algorithm and the traditional method, respectively. As both coefficients are in excess of .90, one can be satisfied that the Cronbach’s Alpha coefficients are sufficiently high to suggest scale reliability for respondents like those in the study.
Further, these results are in agreement with those found by Hilton et al. (2004) and
Finney and Schraw (2004) who reported separately an overall Cronbach’s coefficient, α, for the post-test version of .90 and .91, respectively. Hence, the scale seems to be reasonably reliable and was not markedly revised.
Table 14 also gives values for the attitude subscales, listed for the traditional method and the iteration algorithm, respectively: SAMU Affect (.82 and .86), SAMU Cognitive Competence (.74 and .76), SAMU Value (.81 and .81), and SAMU Difficulty (.72 and .76). Although these values might overestimate the population coefficient alphas, because the same sample was used both to conduct the item analyses and to compute the reliability estimates, they remain consistent with what Mills (2004) reported: SATS Affect (.81 and .85), SATS Cognitive Competence (.77 and .83), SATS Value (.80 and .85), and SATS Difficulty (.64 and .77).
Table 15
Subscale Correlation Matrix for the Iteration Algorithm
             Affect    Cognitive   Value     Difficulty
Affect       1.000     0.794       0.711     0.750
Cognitive    0.794     1.000       0.896     0.858
Value        0.711     0.896       1.000     0.816
Difficulty   0.750     0.858       0.816     1.000
Table 16
Subscale Correlation Matrix for the Traditional Method
             Affect    Cognitive   Value     Difficulty
Affect       1.000     0.719       0.678     0.688
Cognitive    0.719     1.000       0.868     0.798
Value        0.678     0.868       1.000     0.774
Difficulty   0.688     0.798       0.774     1.000
In closing, an analysis of the subscale correlation matrices indicated that the intercorrelations between components, or dimensions, were quite high, since all the correlations were at least roughly .70 for both teaching methods (refer to Table 15 and Table 16 above). More importantly, all the bivariate correlation coefficients were significant at the 0.01 level (two-tailed test). The analysis also indicated that the dependent variables are related dimensions, or components, of this 20-item scale. The dimensions did appear to assess attitudes toward the instructional method used.
Data Collection Procedures
Once the instructor had decided which instructional method each class section would be taught, he or she illustrated the chosen method with a fully worked-out example and the necessary explanations. The worked-out example was provided to each instructor by the researcher. The students were requested to fill out one scale. Per Ohio University’s requirements, students asked to participate in this survey did so willingly and with due informed consent. Those who did not want to participate still had the option to receive the scale and then either decline to fill it out and leave immediately, or stop their participation at any later time they felt like doing so.
Although the students were not required to write down their names, each questionnaire included a statement assuring the confidentiality of the respondents’ responses.
Data were collected with the 26-item scale. The first 20 items related to the students’ attitudes towards the method taught to them; the next 5 items concerned their background information. Because only the iteration algorithm has a unique-solution feature, students were tested only on it, using the 7 extra items, to evaluate their mastery of the material taught. These last seven items were fill-in questions and required students to carry out actual calculations using the formulas provided. The self-administered questionnaire took roughly fifty minutes to complete for each of the two instructional methods used. Students turned in the completed questionnaires to their instructor as soon as they were done.
Based on our two pilot studies, an overall response rate of about 87% was anticipated, which led to distributing around 200 copies (50 copies to each of the 4 class sections). Exactly 170 students (85% of the 200) turned in their filled-out questionnaires to their respective instructors. The scale was coded and used once; the data were then entered into SPSS, based on these codes, as they were received.
Data Analysis Procedures
Statistical Methods
For the attitudes survey, a 2 x 2 between-subjects factorial multivariate analysis of covariance (Two-Way MANCOVA) was conducted using the 4 SATS-Schau subscales
(affect, value, cognitive competence, and difficulty) as dependent variables. The independent variables are the instructional method (iteration or traditional) and the departmental class section (mathematics or psychology). The covariate is a variable that measures the student perceived ability in statistics.
As stated earlier, one of the hypotheses to be tested is that students’ attitude scores will vary according to the instructional method used and/or the departmental class section. To reduce the random error introduced by differences in student background experience (Tabachnick & Fidell, 2001, p. 275), the student perceived ability in statistics was chosen as a covariate in the analyses. Two variables were thought to measure this perceived ability: expected course grade and perceived mathematics ability.
The expected course grade was initially retained as a possible covariate here because of its known association with the students’ attitudes towards statistics (Perney &
Ravid, 1991; Onwuegbuzie, 2000). The expected grade in the course has ten categories, ranging from grade F to grade A. The review of the related literature has shown that attitudes are related to other variables, among which perceived mathematics ability often appears (Brown & Brown, 1995; Kottke, 2000). Galagedera et al. (2000) further observed a significant positive association between perceived mathematics ability and the expected grade in an elementary statistics course. Zeidner (1991), for his part, conceded that lack of previous exposure to statistics and limited mathematics proficiency are known to be associated with negative attitudes, which, in turn, inhibit learning.
Therefore, a second variable considered as a reliable alternate covariate was the student perceived mathematics ability. This information was gathered with the item “How good at mathematics are you?”
The multivariate analysis of covariance (MANCOVA) was chosen because of its ability to handle multiple dependent variables and covariate(s) and to rigorously test for differences while controlling for Type I error. Additionally, testing several dependent variables simultaneously may reveal complex interactions missed by univariate tests
(Scheiner, 2001; Stevens, 2002).
MANCOVA is similar to MANOVA, but in this case one interval-level independent variable was added as a covariate. This covariate serves as a control variable for the independent factors, acting to reduce the error term in the model. Like other control procedures,
MANCOVA can be seen as a form of “what if” analysis, asking what would happen if all cases scored equally on the covariate, so that the effect of the factors over and beyond the covariate can be isolated (Green et al., 2003; Scheiner, 2001; Stevens, 2002; Tabachnick
& Fidell, 2001).
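As an illustration only, the layout just described can be sketched in code. The data below are randomly simulated, every variable name is our own invention, and the study itself ran the analysis in SPSS’s MANOVA program, not in Python; the sketch simply shows how a 2 x 2 MANCOVA with one covariate can be specified with the statsmodels library:

```python
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Simulated stand-in for the study data (all names hypothetical)
rng = np.random.default_rng(42)
n = 160
df = pd.DataFrame({
    "method": rng.choice(["iteration", "traditional"], size=n),
    "department": rng.choice(["math", "psych"], size=n),
    "ability": rng.integers(1, 8, size=n),  # covariate: perceived ability, 1-7
})
for sub in ["affect", "cognitive", "value", "difficulty"]:
    # each subscale is a 5-item total, so it ranges from 5 to 35
    df[sub] = rng.integers(5, 36, size=n).astype(float)

# Four dependent variables; two crossed factors plus the covariate
mano = MANOVA.from_formula(
    "affect + cognitive + value + difficulty ~ method * department + ability",
    data=df,
)
print(mano.mv_test())  # Wilks' lambda, Pillai's trace, etc. for each term
```

With real data, the method-by-department interaction would be examined first, then the main effects, each evaluated with Wilks’ lambda as in the SPSS output.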
Power and Effect Size
The power of a test is a tool for assessing the sensitivity of a test or for comparing two competing test procedures. It is the probability of rejecting a null hypothesis when it is, in fact, false. The alpha level, the size of the treatment effect, and, especially, the sample size affect the power of a statistical test. The power of a test is directly proportional to sample size and effect size, and inversely related to p-value (Sharma, 1996, p. 350). Larger sample sizes are likely to result in high power because the closer the sample composition is to the actual population, the lower the unexplained, or error, variance. If the power of an experiment is low, then there is a good chance that the experiment will be inconclusive. That is why it is so important to consider a priori power in the design of experiments, before collecting the data. However, “power alone is not enough in determining an effect of a certain size” (Capraro, 2002, p. 12). That is why it is also necessary to determine an a priori effect size.
The effect size is defined in many ways by different authors. Cohen, for instance, defined the effect size as the amount of variability in the dependent variable that can be accounted for or attributed to the independent variable (Cohen, 1988). Hypothesis testing gives us an indication that some type of reliable effect occurred in our study, while effect size tells us the magnitude of that effect or the strength of the relationship among variables. That is, hypothesis testing answers a “yes-no” question, while an effect size calculation answers a “how much?” question (Goodwin, 2002).
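To make the “how much?” answer concrete under Cohen’s (1988) conventions: eta squared, the proportion of dependent-variable variance accounted for, converts to Cohen’s f by f = sqrt(eta^2 / (1 - eta^2)). A one-line sketch of the conversion (our own illustration):

```python
import math

def cohens_f(eta_squared: float) -> float:
    """Convert eta squared (variance accounted for) to Cohen's effect size f."""
    return math.sqrt(eta_squared / (1.0 - eta_squared))

# Cohen's "medium" f of .25 corresponds to an eta squared of about .059
print(round(cohens_f(0.0588), 3))  # -> 0.25
```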
Even if “statistical” significance is found (the null hypothesis was rejected), the implications of the results do not necessarily warrant “practical” significance, which can often be revealed by the effect size, or “clinical” significance (Thompson, 2002). It can be noted, however, that Nickerson (2000) also warns that a “large effect is not a guarantee of importance any more than a small p-value” (p. 257). In other words, a small p-value or a large effect size does not necessarily indicate that the results are important to
“real-world” application (Nickerson, 2000).
Furthermore, to avoid any confusion in this interpretation of the results, the phrase
“statistically significant” should be employed instead of simply “significant” (Carver,
1978; Nickerson, 2000; Thompson, 1994; Thompson, 1996). “Significant” implies
“important” and, again, the statistically significant results may not be necessarily important in reality (Kennedy, 2002, p. 6).
Estimates of a Priori Power and a Priori Effect Size from Sample-Power
The following is a text report from the Sample-Power software. The sample sizes computed are those required to obtain a priori power of at least .80 when the significance level, α, is set at .05 and the a priori effect size is “medium” (that is, 0.25).
This power analysis is for a one-way fixed effects analysis of covariance with 2 levels. The study will include 64 cases per cell for a total of 128 cases. The study will include a set of 1 covariate which account for .0% of the variance in the dependent variable.
The criterion for significance (alpha) has been set at 0.05. The analysis of variance is non-directional (i.e., two-tailed) which means that an effect in either direction will be interpreted.
As to the main effects, the factor “department” includes only 2 levels. With a projected .80 response rate, it was anticipated surveying at least 64 students from the mathematics department and at least 64 students from the psychology department. With the adjusted medium effect size (f) of .25, a sample size of 128 students in total was needed to yield a priori power of at least .80.
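The Sample-Power figures above can be checked from first principles: for a one-way fixed-effects ANOVA, power is the upper-tail probability of a noncentral F distribution with noncentrality parameter λ = f²N. The sketch below is our own check using SciPy, not the Sample-Power software:

```python
from scipy.stats import f as f_dist, ncf

def anova_power(f_effect: float, n_total: int, k_groups: int,
                alpha: float = 0.05) -> float:
    """A priori power of a one-way fixed-effects ANOVA, given Cohen's f."""
    dfn, dfd = k_groups - 1, n_total - k_groups
    lam = f_effect ** 2 * n_total              # noncentrality parameter
    f_crit = f_dist.ppf(1 - alpha, dfn, dfd)   # critical F at the alpha level
    return 1 - ncf.cdf(f_crit, dfn, dfd, lam)

# Medium effect (f = .25), 2 groups, 128 cases: power comes out near .80
print(round(anova_power(0.25, 128, 2), 2))
```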
Estimates of a Priori Power and a Priori Effect Size from the Stevens (2002) Table
Power of the test can be obtained from power tables using effect size, p-value, and sample size (Sharma, 1996, p. 350). Table 4.7 from Stevens (2002, p. 200) provides power values for two-group MANOVA for 2 through 7 variables, with group size varying from small (15) to large (100) and with effect size varying from small (D² = .25) to very large (D² = 2.25).
Based on this table, with 4 dependent variables, the overall effect D² set at the moderate level (D² = .64), and the significance level α at .05, at least 50 participants per group (equal group sizes assumed) are needed to reach a priori power of at least .80.
Our final sample size of 170 students (roughly 85 students per department) is therefore more than adequate to yield power values well above .80.
CHAPTER 5: DATA ANALYSIS
Results
The primary interests of this study were twofold: (a) how accurate the iteration algorithm is as a process for statistics users who want to construct their own frequency tables of grouped data, and (b) how effective the iteration algorithm is as a teaching and learning method compared to a traditional algorithm for constructing a grouped-data frequency distribution.
Since the outcomes of the first primary interest were already presented in chapter
3, this chapter offers only the results of the second primary interest, which was the focus of this study. These findings are presented below.
To assess the effectiveness of the iteration algorithm as an instructional device, a seven-point Likert-type instrument was developed. This instrument, called the Survey of
Attitudes towards the Method Used (SAMU), is a slightly revised version of the well-known Survey of Attitudes toward Statistics (SATS) instrument (Schau, Stevens, Dauphinee, & Del Vecchio, 1995). Like the original SATS instrument, the SAMU is a self-report instrument with 20 items that specifically measure 4 facets of attitudes, not towards statistics, as is the case with the SATS measure, but towards the instructional method taught. As explained by the authors of the original SATS instrument, a panel of instructors and introductory statistics students identified by consensus 4 facets of attitudes towards statistics. The 4 facets involved are: Affect, Cognitive Competence,
Value, and Difficulty. Each of these facets was already defined in chapter 4 under the instrument development section.
For comparison purposes, the 4 facets were retained as dependent variables in the current data analysis. In addition, there was only one version of the SAMU questionnaire that was administered to both groups of students who were taught either the iteration algorithm or the traditional method.
Data Analysis
The analyses of the SAMU instrument, which the remaining 3 research questions address, were produced by running MANCOVA using the SPSS MANOVA program.
Multivariate and univariate analyses were performed only after checking for possible violations of assumptions. Before that, the reliability analyses rooted in Cronbach’s coefficient α results for the 4 facets for both instructional methods were considered.
Because the Cronbach’s coefficient α results obtained in the pilot studies were considerably higher than those reported in the current chapter, exploratory factor analyses were performed to analyze interrelationships among the variables and to explain these variables in terms of their common underlying dimensions (factors). Specifically, a principal components analysis was used to assess dimensionality.
Survey of Attitudes towards the Method Used
Negatively worded items. The SAMU instrument measures the effects of the instructional method taught (either the iteration or traditional method) on students’ attitudes and perceptions concerning the construction of grouped-data frequency tables.
For all 20 items, a seven-point Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree) was used. Negatively worded items were reverse coded on each facet, such that a high score always represents a more positive attitude. As can be seen in Appendix F, all negatively worded items are followed by an upper-case N in parentheses.
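On a 1-to-7 scale, reverse coding simply maps a response s to 8 - s, so that 1 becomes 7, 4 stays 4, and 7 becomes 1. A minimal sketch of the rule (the function name is ours):

```python
def reverse_code(score: int, scale_max: int = 7) -> int:
    """Reverse-code a Likert response so high values mean a more positive attitude."""
    return scale_max + 1 - score
```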
Reliability Analyses. As can be seen in Table 17, the Cronbach’s Alpha coefficients based on the 20 overall items were .76 and .57 for the iteration algorithm and the traditional method, respectively (N = 170). Both coefficients are much lower than those found in the pilot studies. The pilot studies used graduate students in the College of
Education, whereas the final study used undergraduate students in the Mathematics and Psychology departments.
There were 4 domains of interest in this particular investigation. Attitudes towards the method taught were assessed by examining the dimensions of Affect, Cognitive
Competence, Value, and Difficulty, because the same components were used in the original SATS instrument. Cronbach’s Alpha coefficient was computed for each subscale and can be seen in Table 17.
Table 17
Reliability Statistics for the Iteration Algorithm and the Traditional Method
Dependent Variable        Iteration Algorithm          Traditional Method
                          Cronbach's Alpha   Items     Cronbach's Alpha   Items
Attitude Subscale
  Affect                  .730               5         .450               5
  Cognitive               .209               5         .297               5
  Value                   .285               5         .120               5
  Difficulty              .723               5         .666               5
  Overall                 .759               20        .567               20
Factor analyses. An exploratory factor analysis was carried out. The method of initial factor extraction was principal components analysis, a method that provides a unique solution, so that the original data can be reconstructed from the results. It considers the total variance among the variables, so the solution generated will include as many factors as there are variables, although it is unlikely that all of them will meet the criteria for retention. One common rule of thumb for determining the number of factors to extract is the “eigenvalue greater than 1” criterion (Kaiser’s rule): in essence, unless a factor extracts at least as much variance as the equivalent of one original variable, it should be dropped. Although this criterion is probably the most widely used, it is not the best way to determine the number of factors to retain in an exploratory factor analysis.
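In code, Kaiser’s rule amounts to counting how many eigenvalues of the inter-item correlation matrix exceed 1. The sketch below is our own illustration on simulated two-cluster data, not the study’s SPSS run:

```python
import numpy as np

def kaiser_retained(scores: np.ndarray) -> int:
    """Number of principal components whose correlation-matrix eigenvalue exceeds 1."""
    corr = np.corrcoef(scores, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)
    return int((eigvals > 1.0).sum())

# Simulated responses with two item clusters: two components should be retained
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 500))
noise = 0.3 * rng.normal(size=(4, 500))
items = np.column_stack([f1 + noise[0], f1 + noise[1], f2 + noise[2], f2 + noise[3]])
print(kaiser_retained(items))  # -> 2
```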
In this study, the principal component factor analysis was run imposing 4 factors, to see whether the items load correctly on the 4 factors identified by the original scale authors. The initial eigenvalues were obtained for only these 4 retained factors. The total variance and percent of variance explained by each factor are displayed in Table 18. The greater weight of the first factor and the similar relevance of the remaining factors are visible.
Table 18
Partial Outputs of the Initial Factor Analysis Summary (Principal Component Analysis)
                 Iteration Algorithm                    Traditional Method
                 Initial Eigenvalues                    Initial Eigenvalues
Factor   Total    % of Variance   Cumulative %   Total    % of Variance   Cumulative %
1        4.378    21.891          21.891         3.291    16.455          16.455
2        2.733    13.663          35.554         3.042    15.208          31.663
3        2.142    10.709          46.263         2.611    13.053          44.716
4        1.612    8.059           54.322         1.612    8.061           52.777
5        1.531    7.657           61.979         1.501    7.503           60.280
To facilitate the interpretation of the retained factors, the axes were rotated using the Varimax method. Varimax rotation is an orthogonal rotation of the factor axes that maximizes the variance of the squared loadings of a factor (column) across all the variables (rows) in a factor matrix, which has the effect of differentiating the original variables by extracted factor. Each factor will tend to have either large or small loadings for any particular variable, so a Varimax solution makes it as easy as possible to identify each variable with a single factor. This is the most common rotation option.
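Varimax rotation as described above can be sketched numerically. The function below is a generic Kaiser-style varimax written with numpy (a didactic sketch, not the statistical package routine used in this study), applied to a hypothetical 4-variable, 2-factor loading matrix. Because the rotation is orthogonal, each variable’s communality (row sum of squared loadings) is unchanged:

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a (variables x factors) loading matrix to
    maximize the variance of squared loadings per factor (Kaiser-style)."""
    L = np.asarray(loadings, dtype=float)
    n, k = L.shape
    R = np.eye(k)          # accumulated orthogonal rotation
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Gradient of the varimax criterion, projected back onto the
        # orthogonal group via the singular value decomposition.
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / n))
        R = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):
            break
        d = d_new
    return L @ R

# Hypothetical unrotated loadings (illustrative only, not study data).
A = np.array([[0.7, 0.3], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]])
B = varimax(A)
# Orthogonal rotation preserves each variable's communality.
print(np.allclose((A ** 2).sum(axis=1), (B ** 2).sum(axis=1)))  # → True
```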
The rotated factorial scores are included in Table 38 and Table 39 shown in
Appendix M. The factors are interpreted below with a name to describe the component to which the item belongs.
First factor: Affective and Difficulty components. For the iteration algorithm, this factor explains 21.89% of the total variance and includes items in the Affect and Difficulty components only. The two Affect items that contributed the most to the first factor (r = .98) are: I like constructing grouped-data frequency tables by the method presented, and I am scared by the algorithm brought to light in class. The Difficulty item that contributed the most to the first factor (r = .98) is: the algorithm of the instructional method taught is complicated.
Affective and Cognitive components. For the traditional method, this factor explains 16.46% of the total variance and includes items in the Affect and Cognitive Competence components. The two Affect items that contributed the most to the first factor (r = .98) are: I feel insecure when creating grouped-data frequency tables by the method presented, and I am under stress using the method presented in class. The Cognitive item that contributed the most to this first factor (r = .98) is: I make a lot of computational errors when applying the algorithm taught in class. In the study sample data these two components (Affect and Cognitive) were positively and significantly related, but their correlation was moderate.
The very strong weight found in the Difficulty item suggests that the students considered the iteration algorithm to be relatively difficult regardless of their perceived ability in statistics and their knowledge in constructing grouped-data frequency tables.
These findings also suggest the extent to which the students’ affect towards the instructional method taught might be conditioned by their understanding of the topic.
Positive small correlations in the other components (or factors) imply that attitudes in general tend to improve a little with increased knowledge.
Second factor: Difficulty component. For the iteration algorithm, this factor explains 13.66% of the total variance and includes items in the Difficulty component only. The 3 Difficulty items that contributed more than any other items to the second factor (r = .98) are: the algorithm’s formulas are easy to use; learning the algorithm as shown requires a great deal of discipline; and the algorithm exposed can be quickly learned by most people.
Difficulty component. For the traditional method, this factor explains 15.21% of the total variance and includes again items in the Difficulty component only. Difficulty items that contributed the most to this second factor (r = .98) are the same as those listed above under the iteration algorithm section.
The difficulty level observed might come from the fact that many students associate the construction of grouped-data frequency tables with the technical mathematical features of the formulas contained in the algorithms. The positive attitudes of the majority of them could have helped to reduce this difficulty level.
Third factor: Affective and Cognitive components. For the iteration algorithm, this factor explains 10.71% of the total variance and includes items in the affective and cognitive competence components. The only Affect item that contributed the most to this third factor (r = .89) is: I am under stress using the method presented in class. The only
Cognitive item that contributed the second most to this third factor (r = .88) is: I make a lot of computational errors when applying the algorithm taught in class.
Affective and Difficulty components. For the traditional method, this factor explains 13.05% of the total variance and includes items in the Affect and Difficulty components. The only Affect item that contributed the most to this third factor (r = .91) is: I like constructing grouped-data frequency tables by the method presented. The only
Difficulty item that contributed the second most to this third factor (r = .89) is: the algorithm of the method presented is complicated.
In the study sample data the Affect and Cognitive components are positively and significantly correlated with each other, as are the Affect and Difficulty components. The other two facets, Cognitive and Difficulty, are not significantly correlated with each other. The interpretation of these results is similar to that made under the first factor. The fourth factor explains less than 10% of the total variance observed. In light of these results, the SAMU items seem to load correctly on only 3 of the 4 factors identified by the original SATS scale authors.
MANCOVA Results
A 2 x 2 multivariate analysis of covariance test was conducted to determine how effective the iteration algorithm is as a teaching and learning method compared to a traditional algorithm when constructing a grouped-data frequency distribution. In the following sections, each part of the analyses will refer to a specific table.
Research question 2: Is there an overall interaction between the academic department and the instructional method used, adjusting for differences in the students’ perceived ability in statistics among groups?
Research question 3: Are there differences in the mean attitudes scored by students as a function of the academic department, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics?
Research question 4: Are there differences in the mean attitudes scored by students as a function of the instructional method used, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics?
Means and Standard Deviations. Means and standard deviations of the Survey of Attitudes toward the Method Used (SAMU) scale scores on both instructional methods taught to mathematics and psychology students are displayed in Table 19. The Expected
Course Grade (ECG) scores, for students who were taught the iteration algorithm, were on average higher for mathematics students (M = 9.77, SD = .72) than for psychology students (M = 8.91, SD = 1.54). Likewise, the Perceived Mathematics Ability (PMA) scores, for students who were taught the iteration algorithm, were on average higher for mathematics department students (M = 5.84, SD = 1.11) than for psychology department students (M = 5.18, SD = 1.06).
For those who were instructed with the traditional method, ECG scores were likewise higher for mathematics students (M = 8.20, SD = 1.72) than for psychology students (M = 7.69, SD = 1.81). Their PMA scores were again higher for mathematics students (M = 4.98, SD = .96) than for psychology students (M = 4.88, SD = 1.04). On every single dependent variable, mathematics students achieved higher average scores on the iteration algorithm than did their psychology counterparts. Except for the Value
facet (M = 27.46 for mathematics versus M = 27.67 for psychology), mathematics students scored, on average, higher on the traditional method than did psychology students.
Table 19
Means and Standard Deviations* of the Survey of Attitudes toward the Method Used

                              Instructional Method Taught
Dependent Variables     Iteration Algorithm          Traditional Method
and Covariate        Mathematics   Psychology     Mathematics   Psychology
                     (n = 43)      (n = 44)       (n = 41)      (n = 42)
Affect 29.81 (2.99) 25.80 (4.06) 26.76 (1.93) 25.52 (2.78)
Cognitive 29.37 (2.00) 28.86 (2.04) 29.24 (1.83) 28.88 (1.98)
Value 28.51 (2.44) 28.05 (2.30) 27.46 (2.30) 27.67 (2.18)
Difficulty 28.74 (3.09) 26.05 (4.08) 24.51 (3.09) 23.95 (3.28)
ECG 9.77 (.72) 8.91 (1.54) 8.20 (1.72) 7.69 (1.81)
PMA 5.84 (1.11) 5.18 (1.06) 4.98 (.96) 4.88 (1.04)
Note: * Standard deviations are reported within parentheses.
Intercorrelations. Pearson product-moment coefficients of the SAMU total and subscale scores are presented in Table 20 through Table 23. As suggested by Henson (2000) and Thompson and Vacha-Haase (2000), internal consistency reliability coefficients for the total (that is, the total score obtained on the 20 items) and the 4 subscales of the SAMU were reported for the current sample. The choice of these subscales instead of just one has already been explained at the beginning of this chapter.
Iteration Algorithm and Mathematics Department. For the iteration algorithm and mathematics department, inter-correlations among the 4 subscale scores, with standard deviations of the SAMU scores on the principal diagonal, are displayed in Table 20. The highest inter-correlation observed was between Total and Difficulty (r = .81, p < .01, n = 43), and the lowest was between Value and Cognitive (r = -.03, p > .01, n = 43), which were negatively and not significantly related.
Total is least related to Cognitive (r = .49, p < .01, n = 43), but their correlation coefficient is still statistically significant.
ECG scores for mathematics students who were taught the iteration algorithm ranged from 4 to 10 with a mean of 9.77 (SD = .72). The ECG variable is most related to
Difficulty (r = .61, p < .01, n = 43) and had the smallest and insignificant correlation coefficient with Value (r = .25, p > .01, n = 43).
Perceived Mathematics Ability (measured by the item ‘How good at mathematics are you?’) scores for mathematics students who were taught the iteration algorithm ranged from 1 to 7 with a mean of 5.84 (SD = 1.11). The PMA variable is most and significantly associated with Total (r = .57, p < .01, n = 43), then followed successively by Value (r = .46, p < .01, n = 43), Affect (r = .44, p < .01, n = 43), and ECG (r = .43, p < .01, n = 43). The PMA variable had the smallest and insignificant correlation coefficient with the Cognitive facet (r = .03, p > .01, n = 43).
Table 20
Inter-correlations and Standard Deviations* of the Survey of Attitudes toward the
Method Used Scores (Iteration algorithm - Mathematics Department)
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 2.99
Cognitive .29+ 2.00
Value .30+ -.03 2.44
Difficulty .40+ .24 .22 3.09
ECG .47+ .29+ .25 .61+ .72
PMA .44+ .03 .46+ .34+ .43+ 1.11
Total .747+ .491+ .494+ .809+ .696+ .565+ 9.834
Notes: * Standard deviations appear on the principal diagonal.
+ p < .01 for two-tailed test.
Iteration Algorithm and Psychology Department. For the iteration algorithm and psychology department, inter-correlations among the 4 subscale scores and standard deviations of the SAMU scores shown on the principal diagonal are presented in Table
21. The highest and statistically significant inter-correlation observed was between Total and ECG (r = .87, p < .01, n = 44) and the lowest inter-correlation observed was between Difficulty and Value (r = .00, p > .01, n = 44). Total is again least related to Cognitive (r = .43, p < .01, n = 44), but their correlation coefficient is still statistically significant.
ECG scores for psychology students who were taught the iteration algorithm ranged from 4 to 10 with a mean of 8.91 (SD = 1.54). The ECG variable is highly and most significantly related to Affect (r = 0.67, p < .01, n = 44) and had the least and still statistically significant correlation coefficient with Cognitive (r = 0.32, p < .05, n = 44).
PMA scores for psychology students who were taught the iteration algorithm ranged from 1 to 7 with a mean of 5.18 (SD = 1.06). The PMA variable is highly and most significantly associated with Value (r = .61, p < .01, n = 44), then followed in
decreasing order by Affect (r = .44, p < .01, n = 44), Total (r = .43, p < .01, n = 44), and
ECG (r = .40, p < .01, n = 44). The PMA variable had the smallest and insignificant correlation coefficient with Difficulty (r = .03, p > .01, n = 44).
Table 21
Inter-correlations and Standard Deviations of the Survey of Attitudes toward the Method
Used Scores (Iteration Algorithm - Psychology Department)
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 4.06
Cognitive .24* 2.04
Value .27* .20 2.30
Difficulty .53* .03 .00 4.08
ECG .67* .32* .35* .64* 1.54
PMA .44* .10 .61* .03 .40* 1.06
Total .840* .428* .442* .741* .868* .425* 12.018
Notes: Standard deviations appear on the principal diagonal. * p < .01 for two-tailed test.
Traditional Method and Mathematics Department. For the traditional method and mathematics department, inter-correlations among the 4 subscale scores, with standard deviations of the SAMU scores on the principal diagonal, are presented in Table 22. The highest and statistically significant inter-correlation was between Total and ECG (r = .85, p < .01, n = 41) and the lowest and also insignificant inter-correlation observed was between PMA and Cognitive (r = .02, p > .01, n = 41). Total had the lowest and also insignificant correlation coefficient with PMA (r = .17, p > .01, n = 41). ECG scores for mathematics students who were taught the traditional method ranged from 4 to 10 with a mean of 8.20 (SD = 1.72). The ECG variable is highly and most significantly related to Difficulty (r = .54, p < .01, n = 41) and is least and not significantly associated with the Value facet (r = .10, p > .01, n = 41).
PMA scores for mathematics students who were taught the traditional method ranged from 1 to 7 with a mean of 4.98 (SD = .96). The PMA variable is most, but not significantly, associated with ECG (r = .28, p > .01, n = 41), followed in decreasing order by Value (r = .19, p > .01, n = 41) and Total (r = .17, p > .01, n = 41). The PMA variable has a negative and also insignificant correlation first with Affect (r = -.12, p > .01, n = 41) and second with Difficulty (r = -.14, p > .01, n = 41).
Table 22
Inter-correlations and Standard Deviations of the Survey of Attitudes toward the Method Used Scores (Traditional Method - Mathematics Department)
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 1.93
Cognitive .41+ 1.83
Value .08 .11 2.30
Difficulty .31+ .26 .12 3.09
ECG .49+ .43+ .10 .54+ 1.72
PMA -.12 .02 .19 -.14 .28 .96
Total .594+ .598+ .387+ .741+ .848+ .165 8.652
Note: * Standard deviations appear on the principal diagonal.
+p < .01 for two-tailed test.
Traditional Method and Psychology Department. As to the traditional method and psychology department, inter-correlations among the subscale scores, with standard deviations of the SAMU on the principal diagonal, are presented in Table 23. The highest and statistically significant inter-correlation was observed between Total and ECG (r = .91, p < .01, n = 42) and the lowest and also insignificant inter-correlation (in absolute value) was between Difficulty and Cognitive (r = .02, p > .01, n = 42). Total is least related to Value (r = .38, p < .05, n = 42).
ECG scores for psychology students who were taught the traditional method ranged from 4 to 10 with a mean of 7.69 (SD = 1.81). The ECG variable had the highest and also most statistically significant correlation coefficient with Affect (r = .54, p < .01, n = 42) and had the lowest and also non significant coefficient of correlation with Value
(r = .22, p > .01, n = 42).
PMA scores for psychology department students who were taught the traditional method ranged from 1 to 7 with a mean of 4.88 (SD = 1.04). The PMA variable is highly and most significantly associated with Value (r = .62, p < .01, n = 42), then followed in decreasing order by ECG (r = .50, p < .01, n = 42) and Total (r = .42, p < .01, n = 42).
The PMA variable is negatively and not significantly related to Cognitive (r = -.10, p > .01, n = 42). The PMA variable had the smallest and also insignificant coefficient of correlation with Difficulty (r = .08, p > .01, n = 42).
Table 23
Inter-correlations and Standard Deviations* of the Survey of Attitudes toward the
Method Used Scores (Traditional Method - Psychology Department)
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 2.78
Cognitive .50+ 1.98
Value -.12 .13 2.18
Difficulty -.03 .02 .07 3.28
ECG .54+ .41+ .22 .40+ 1.81
PMA -.09 -.10 .62+ .08 .50+ 1.04
Total .572+ .556+ .378+ .545+ .912+ .419+ 8.966
Note: * Standard deviations appear on the principal diagonal.
+p < .01 for two-tailed test.
Running MANCOVA Analyses. As a way to assess the effectiveness of the iteration algorithm as a teaching and learning method, the current study investigated differences between attitude scores based on students’ academic department and the instructional method taught. Using the attitude scores of 170 Ohio University undergraduate students, a 2 × 2 between-subjects factorial multivariate analysis of covariance (MANCOVA) was conducted on the 4 dependent variables (Affect, Cognitive, Value, and Difficulty) that were hypothesized to be 4 dimensions of attitudes towards the instructional method taught. The two independent variables were the instructional method taught (iteration algorithm or traditional method) and the departmental section (mathematics department or psychology department). In addition, the ECG variable was initially entered as the covariate to control for important prior differences between groups. Since Schau et al. (1995) used it in their article containing the original SATS instrument, this study kept the same significance level of .01.
The SPSS syntax for the multivariate analysis of covariance was run using the MANOVA program. Type III sums of squares were used. Stevens (1996, p. 332) affirmed that “the covariate is not appropriate for the data collected if there is a covariate by group interaction”. This assumption, among many others, was examined.
ECG Covariate by Group Interaction Effect. The multivariate test shows a statistically significant F test value: there is an ECG covariate by Method by Department interaction effect (F(12, 420.97) = .82, p < .01). As a result, the ECG variable might not be a satisfactory covariate for the data collected. Another variable measuring the students’ perceived ability in statistics, PMA, was retained to serve as the covariate. The following assumptions were tested for this new (PMA) covariate in the analysis.
Assumptions when Using MANCOVA
Before using the multivariate analyses, data were screened for the main assumptions of parametric statistics (i.e., independence, normality, homogeneity of variance-covariance matrices, homogeneity of regression slopes, independent groups, and linear relationships between the dependent variables and the new PMA covariate).
Independence of the Observations Assumption. This assumption is met in that it is assumed that each student who participated in the SAMU survey filled out his or her questionnaire alone and independently of any other student.
Assumption of Multivariate Normality of the Dependent Variables within Each
Group. This assumption was examined first through visual inspection of scatter plots of data for pairs of dependent variables within each of the two departmental class sections. Altogether 24 scatter diagrams of data for pairs of dependent variables were plotted. A sample set of 8 of the 24 scatter plots is shown in Appendix L. The graphs were judged to be normal because each scatter diagram for every pair of variables was roughly oval-shaped. Second, for each dependent variable the Kolmogorov-Smirnov
and Shapiro-Wilk tests were run simultaneously. The results show that the distributions of both instructional methods’ scores for both departments, as well as the distributions of scores for each of the 4 dependent variables, were all normal (p > .01). Tabachnick and
Fidell (2001) asserted that “a sample size of at least 20 in each cell should ensure
‘robustness’ of this normality assumption” (p. 329). In the present study, each cell had at least 41 scores.
Tabachnick and Fidell also asserted that “homoscedasticity is related to the assumption of normality because when the assumption of multivariate normality is met, the relationships between variables are homoscedastic. The bivariate scatter plots between two variables are of roughly the same width all over with some bulging toward the middle” (p. 79). Since the assumption of multivariate normality is tenable, the homoscedasticity assumption is checked next to see whether this latter assumption is also satisfied.
Homogeneity of variance-covariance matrices assumption with the PMA covariate. The multivariate test for homogeneity of dispersion matrices, Box M test, evaluates whether the variances and covariances among the dependent variables are the same for all levels of a factor. If the probability value associated with the Box’s M test is less than .01, then the variance-covariance matrices are said to be significantly different.
Based on the multivariate test for homogeneity of dispersion matrices’ results, it can be established that the assumption of homogeneity of variance-covariance matrices is tenable at the .01 significance level in that the observed probability value associated with the F test was not statistically significant (Box’s M = 50.11, F (30, 75374) = 1.60,
p > .01). A closer investigation of the standard deviations for each instructional method taught (Table 19) revealed that cells with larger sample sizes, except for Value, produced larger standard deviations between the two academic departments. Stevens
(2002) asserted that this assumption is robust if group sizes are equal or approximately equal and the largest SD over the smallest SD is less than 1.5. Within each instructional method taught, the ratio of the larger standard deviation over the smaller standard deviation between the two departments is, except for ECG (iteration algorithm), less than 1.5.
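Stevens’ 1.5 rule just cited can be checked directly against the Table 19 standard deviations. A small Python sketch for the iteration-algorithm columns (the dictionary labels are merely illustrative tags for the Table 19 rows):

```python
# Standard deviations for the iteration algorithm from Table 19:
# (mathematics department, psychology department) per variable.
sds = {
    "Affect":     (2.99, 4.06),
    "Cognitive":  (2.00, 2.04),
    "Value":      (2.44, 2.30),
    "Difficulty": (3.09, 4.08),
    "ECG":        (0.72, 1.54),
    "PMA":        (1.11, 1.06),
}

# Ratio of the larger to the smaller SD between the two departments.
ratios = {name: max(pair) / min(pair) for name, pair in sds.items()}
flagged = [name for name, r in ratios.items() if r >= 1.5]
print(flagged)   # → ['ECG']  (only ECG exceeds Stevens' 1.5 guideline)
```

The output matches the text above: only the ECG covariate exceeds the 1.5 ratio under the iteration algorithm.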
Homogeneity of Regression Slopes Assumption. The multivariate test, at the bottom of Table 24, shows that there is not a PMA covariate by group interaction effect
(F (12, 420.97) = 2.01, p > .01). Because of this insignificant F value, the hypothesis of homogeneity of regression slopes at the .01 significance level could not be rejected. The results, therefore, indicated that the assumption of parallelism of the regression planes was tenable for this new covariate (that is, the PMA covariate).
Independent Groups. The group of students who were taught the iteration algorithm and the group of other students who received the traditional method were independently enrolled in each of their respective departmental class sections. This assumption is met in that all groups were made up of unique individuals (i.e., no student belonged to more than one departmental class section).
Table 24
Selected Output for Multivariate Analysis of Covariance Test* Run with PMA
Effect                      Value     F      Hypothesized D.F.   Error D.F.   Sig. of F   Eta Squared
Within + Residual           .74     14.23        4.00            162.00        .000         .26
Method x Department         .95      2.08        4.00            162.00        .086         .05
Department .85 7.12 4.00 162.00 .000 .15
Method .84 7.84 4.00 162.00 .000 .16
PMA .54 33.65 4.00 159.00 .000 .46
PMA x Method x Department   .86      2.01       12.00            420.97        .022         .14
Note: * The Test Name is Wilks’ Lambda.
Linear Relationships between the Dependent Variables and the Covariate Assumption. A covariate is used when it may have an impact on the dependent variables. Within each group, this assumption was inspected by plotting a scatter diagram between each of the 4 dependent variables and the covariate. Since all scatter diagrams were roughly oval-shaped, it was concluded that both variables (each dependent variable and the covariate) were normally distributed and linearly related (Tabachnick & Fidell, 2001).
A correlation analysis was also performed. As anticipated, statistically significant correlation results were found, on one hand, between each dependent variable and the covariate and, on the other hand, between each two dependent variables.
Pearson Product-moment Correlation Coefficients. Correlation analysis does not test linearity; it assumes linearity. Based on the results of the correlation analysis displayed in Table 25, 17 of the 21 correlation coefficients listed were statistically significant at the .01 level. Hence, it was assumed that overall the PMA variable and the dependent variables were linearly and positively associated.
Table 25
Correlations between the Dependent Variables and PMA+
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 1
Cognitive .336* 1
(.000)
Value .197* .105 1
(.010) (.172)
Difficulty .466* .140 .162 1
(.000) (.069) (.035)
ECG .586* .344* .258* .619* 1
(.000) (.000) (.001) (.000)
PMA .344* .037 .496* .232* .465* 1
(.000) (.629) (.000) (.000) (.000)
Total .773* .451* .436* .788* .843* .509* 1
(.000) (.000) (.000) (.000) (.000) (.000)
Notes: * p < .01 for two-tailed test; + probability values appear within parentheses.
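All of the coefficients in Tables 20 through 25 follow the same Pearson product-moment computation. A minimal pure-Python sketch with hypothetical scores (the data below are illustrative, not study data):

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for two subscales (illustrative only).
affect = [1, 2, 3, 4]
ecg    = [1, 3, 2, 4]
print(round(pearson_r(affect, ecg), 2))   # → 0.8
```

A statistical package additionally reports the two-tailed probability value for each coefficient, as in Table 25.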
With all major multivariate assumptions met, the results of the assumption tests permit use of multivariate analysis of covariance for this study. The PMA variable was consequently retained as the new covariate.
Analyzing and Interpreting the MANCOVA Results
The multivariate null hypothesis tested in covariance is that the adjusted population-mean vectors are equal. Due to an overall significant F value, the multivariate test at the top of Table 24 shows that the multivariate null hypothesis had to be rejected at the .01 significance level, F (4, 162) = 14.23, p < .01. Hence one can conclude that the groups differed significantly on the set of adjusted means.
Because of this statistically significant multivariate analysis of covariance result, but with a modest association between the combined 4 dependent variables and the covariate (multivariate eta squared η2 = .26), the next step was to examine each of the two main effects to determine where the significance arises. After adjusting for differences in the PMA, there was a significant main effect for department, F (4, 162) = 7.12, p < .01.
Association between the combined 4 dependent variables and the main effect of department was modest (multivariate eta squared η2 = .15).
Similarly, after adjusting for differences in the PMA, a significant main effect for method was found, F (4, 162) = 7.84, p < .001. The association between the main effect of method and the combined dependent variables was modest as well (multivariate eta squared η2 = .16).
These results show that both independent variables are contributing to the overall multivariate significance. Analyses did not detect a significant method by department interaction effect, F (4, 162) = 2.08, p > .01, which yielded a substantially small effect size (multivariate eta squared η2 = .05). Note that the values F and Λ (lambda) are taken from Table 24 and the multivariate eta squared η2 is calculated according to the formula:
η2 = 1 - Wilks’ Λ.
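The η² = 1 − Wilks’ Λ conversion just described can be verified against Table 24 in a few lines of Python:

```python
# Multivariate eta squared from Wilks' lambda: eta^2 = 1 - lambda.
# The lambda values below are taken from Table 24.
wilks = {
    "Within + Residual":   0.74,
    "Method x Department": 0.95,
    "Department":          0.85,
    "Method":              0.84,
    "PMA":                 0.54,
}

eta_squared = {effect: round(1 - lam, 2) for effect, lam in wilks.items()}
print(eta_squared["PMA"])         # → 0.46
print(eta_squared["Department"])  # → 0.15
```

Each computed value matches the Eta Squared column of Table 24.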
In contrast, the PMA covariate yielded a statistically significant F value, F (4,
159) = 33.65, p < .01. This PMA covariate produced a moderate effect size (multivariate eta squared η2 = .46).
The PMA covariate scores, for students who were taught the iteration algorithm, were on average greater for mathematics department students (M = 5.84, SD = 1.11) than for psychology department students (M = 5.18, SD = 1.06). For those who were
instructed with the traditional method, their PMA scores were higher as well for mathematics students (M = 4.98, SD = 0.96) than for their counterpart psychology students (M = 4.88, SD = 1.04).
The adjusted means for the two independent variables are displayed in Table 26 through Table 28. For the independent variable Method (Table 26), the students’ scores were higher on all dependent variables for the iteration algorithm than for the traditional method (e.g., M(Adj) = 27.61 versus M(Adj) = 26.34 for Affect).
Table 26
Adjusted Means of the Survey of Attitudes toward the Method Used Instrument for
Method
Instructional Method Received
Dependent Variable Iteration Algorithm Traditional Method
(n = 87) (n = 83)
Affect 27.61 26.34
Cognitive 29.11 29.07
Value 27.97 27.87
Difficulty 27.31 24.31
Table 27 presents the adjusted means of the Survey of Attitudes toward the Method Used Scale for Department. This time, the students’ scores were higher on all dependent variables for the mathematics department than for the psychology department, except for Value (M(Adj) = 27.79 versus M(Adj) = 28.05).
Table 27
Adjusted Means of the Survey of Attitudes toward the Method Used Instrument for
Department
Academic Department
Dependent Variable Mathematics Psychology
(n = 84) (n = 86)
Affect 28.16 25.79
Cognitive 29.30 28.88
Value 27.79 28.05
Difficulty 26.58 25.05
Table 28 presents the adjusted means of the SAMU instrument for Method by Department. Both mathematics and psychology department students exhibited higher attitude scores towards the iteration algorithm than towards the traditional method, except for Cognitive in psychology (M(Adj) = 28.86 versus M(Adj) = 28.89).
Table 28
Adjusted Means of the Survey of Attitudes toward the Method Used Instrument for
Method by Department
Instructional Method Received
Dependent Variable Iteration algorithm Traditional Method
Mathematics Psychology Mathematics Psychology
(n = 43) (n = 44) (n = 41) (n = 42)
Affect 29.40 25.82 26.92 25.75
Cognitive 29.36 28.86 29.25 28.89
Value 27.86 28.08 27.72 28.02
Difficulty 28.57 26.06 24.58 24.05
Assumptions when Using ANCOVA
As recommended by Tabachnick and Fidell (2001) and Stevens (2002), the following 9 main assumptions were checked for conformity.
The Observations are Independent. This assumption is met in that it is assumed that each student who participated in the survey filled out his or her questionnaire alone and independently of any other student.
The Observations are Normally Distributed on the Dependent Variable in Each
Group. Within each group, the Kolmogorov-Smirnov and Shapiro-Wilk tests were simultaneously run for each dependent variable. The 16 probability values associated with the Shapiro-Wilk test of normality were all not significant (p > .01). Only the Shapiro-Wilk test of normality results are reported here because, based on an extensive Monte Carlo study done in 1968 by Shapiro, Wilk, and Chen, Stevens (2002) ascertained that the Shapiro-Wilk test is more powerful than the Kolmogorov-Smirnov test in detecting departures from normality.
These results show that the distributions of both instructional methods’ scores for both departments as well as the distribution of each of the 4 dependent variables were all normal. This assumption is satisfied.
The Population Variances for the Groups are Equal (Homogeneity of Variance Assumption). As shown in Table 29, the probability value associated with Levene’s F test of the equality of error variances was significant for the Affect facet (F (3, 166) = 5.77, p < .01). In contrast, the probability values associated with Levene’s F test were not significant (p > .01) for each of the other 3 facets (Cognitive, Value, and Difficulty).
Table 29
Levene’s Test of Equality of Error Variances
Dependent Variable F df1 df2 Sig.
Affect 5.766 3 166 .001
Cognitive .666 3 166 .574
Value 1.105 3 166 .349
Difficulty 2.639 3 166 .051
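Levene’s test used above is a one-way ANOVA performed on the absolute deviations of each score from its own group mean. A didactic Python sketch of the mean-centered form (statistical packages additionally report the p value from the F distribution; the example groups are hypothetical):

```python
def levene_f(groups):
    """Levene's test statistic: a one-way ANOVA F computed on the
    absolute deviations of each score from its own group mean."""
    # Transform each score into its absolute deviation from the group mean.
    z = []
    for g in groups:
        m = sum(g) / len(g)
        z.append([abs(x - m) for x in g])
    k = len(z)
    n = sum(len(g) for g in z)
    grand = sum(sum(g) for g in z) / n
    means = [sum(g) / len(g) for g in z]
    # One-way ANOVA on the transformed scores.
    ss_between = sum(len(g) * (mg - grand) ** 2 for g, mg in zip(z, means))
    ss_within = sum((x - mg) ** 2 for g, mg in zip(z, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Identical spreads give F = 0; very different spreads give a large F.
same = levene_f([[1, 2, 1, 3], [1, 2, 1, 3]])
diff = levene_f([[1, 2, 1, 3], [-10, 10, -9, 11]])
print(same)        # → 0.0
print(diff > 10)   # → True
```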
Stevens (2002) asserted that this assumption is robust if group sizes are equal or approximately equal (largest/smallest < 1.5). In the present case, the ratio of the largest group size over the smallest group size is 87/83, or 1.05, which is less than 1.5. In view of the above results, it can be inferred that the populations from which the samples were drawn have equal group variances.
The Slope of the Regression Line is the Same in Each Group (Homogeneity of the Regression Slopes Assumption). The univariate test at the bottom of Table 30 indicates a PMA covariate by Method by Department interaction effect for the Affect facet only (F (1, 165) = 5.22, p < .01). Because there were no significant interaction effects between the covariate and each independent variable for the 3 remaining facets, the results suggest that the homogeneity of the regression slopes assumption is largely satisfied.
Outliers. The distribution of each dependent variable was examined for potential outliers by identifying scores that were greater than or equal to |3| standard deviations from their mean. This evaluation of the assumption of absence of outliers is satisfactory as no outliers were detected.
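The outlier screen just described, flagging any score at least 3 standard deviations from its variable's mean, can be sketched as a small helper. The data vector is illustrative; note that with very small samples the sample z-score can never reach 3, so a realistic n is used.

```python
# Sketch of the |z| >= 3 outlier screen described above. Data simulated.
import numpy as np

def find_outliers(scores, threshold=3.0):
    """Return indices of scores at least `threshold` SDs from the mean."""
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores.mean()) / scores.std(ddof=1)
    return np.flatnonzero(np.abs(z) >= threshold)

# 19 ordinary scores around 50, plus one extreme score of 95.
data = np.array([50, 49, 51, 48, 52, 50, 47, 53, 49, 51,
                 50, 48, 52, 49, 51, 50, 47, 52, 48, 95], float)
print(find_outliers(data))  # only the 95.0 (index 19) is flagged
```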
Independent Groups. The group of students who were taught the iteration algorithm and the group of other students who received the traditional method were independently selected. Like for the multivariate case, this assumption is tenable because all groups were made up of unique individuals (i.e., no student belonged to more than one departmental class section).
A Linear Relationship Exists between the Dependent Variables and the
Covariate. This assumption was inspected by plotting a scatter diagram between each of the 4 dependent variables and the covariate. Since each of the 4 scatter diagrams was roughly oval-shaped, it can be concluded that each dependent variable and the covariate are normally distributed and linearly related (Tabachnick & Fidell, 2001).
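As a numeric companion to the visual scatter-diagram inspection, one can also compute Pearson's r between the covariate and each dependent variable; a clearly non-zero r on roughly oval-shaped data is consistent with a linear relationship. The simulated covariate and dependent variable below are stand-ins for the actual PMA and SAMU scores.

```python
# Sketch: numeric linearity check between the PMA covariate and one
# dependent variable, using simulated, linearly related data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pma = rng.normal(5, 1.5, 170)                      # hypothetical covariate
affect = 4.0 + 0.6 * pma + rng.normal(0, 1, 170)   # linearly related DV

r, p = stats.pearsonr(pma, affect)
print(f"r = {r:.2f}, p = {p:.4f}")
```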
The Covariate is Reliable. Even though there is no way to ascertain that the covariate is perfectly reliable, ANCOVA assumes that the covariate is measured without error. According to Tabachnick and Fidell (2001), “with self-report of variables such as attitudes such an assumption is not easily made. The covariate may be reliable at the point of measurement, but fluctuate over short periods” (p. 283).
With all main univariate assumptions reasonably met, the results of the assumption tests permit use of the analysis of covariance for this study. The PMA variable serves as the covariate.
Univariate ANCOVA Follow-up Analyses
The univariate ANCOVA follow-up results appear in Table 30. The overall test results revealed that, after adjusting for differences in the PMA, the attitude mean scores differed significantly on two of the 4 dependent variables: Affect, F (1, 165) = 9.32, p < .01, partial η² = .28; and Value, F (1, 165) = 48.31, p < .01, partial η² = .25. The adjusted mean scores did not differ significantly on Cognitive, F (1, 165) = .03, p > .01, partial η² = .01; or on Difficulty, F (1, 165) = 1.18, p > .01, partial η² = .24.
The method by department interaction effect results suggest that the adjusted mean scores were not significantly different on any dependent variable, except possibly Affect, F (1, 165) = 6.80, p = .01, partial η² = .04. For the independent variable Department, significant adjusted mean score differences were detected only for Affect, F (1, 165) = 26.03, p < .01, partial η² = .14, and Difficulty, F (1, 165) = 8.19, p < .01, partial η² = .05. Similarly, for the independent variable Method, statistically significant adjusted mean score differences were again detected only for Affect, F (1, 165) = 7.17, p < .01, partial η² = .04, and Difficulty, F (1, 165) = 30.36, p < .01, partial η² = .16. Considering the PMA covariate alone, statistically significant adjusted mean score differences were detected for Affect, F (1, 165) = 7.59, p < .01, partial η² = .05, and Value, F (1, 165) = 45.35, p < .01, partial η² = .23. The covariate by method by department interaction effect results indicate that the adjusted mean scores differed significantly only for Affect, F (1, 165) = 5.22, p < .01.
Table 30

Selected Output for Univariate F-tests Run with (1, 165) Degrees of Freedom

Effect                   Subscale     Error SS  Hypothesized MS  Error MS      F   Sig. of F  Partial Eta Squared

Within + Residual        Affect        1466.88       82.83          8.89     9.32     .003          .28
Regression               Cognitive      641.09         .10          3.89      .03     .087          .01
                         Value          685.47      200.71          4.15    48.31     .000          .25
                         Difficulty    1928.43       13.82         11.69     1.18     .278          .24

Method x Department      Affect        1466.88       60.44          8.89     6.80     .010          .04
                         Cognitive      641.09         .18          3.89      .05     .829          .00
                         Value          685.47         .07          4.15      .02     .899          .00
                         Difficulty    1928.43       41.05         11.69     3.51     .061          .02

Department               Affect        1466.88      231.40          8.89    26.03     .000          .14
                         Cognitive      641.09        7.49          3.89     1.93     .167          .01
                         Value          685.47        2.84          4.15      .68     .410          .00
                         Difficulty    1928.43       95.74         11.69     8.19     .005          .05

Method                   Affect        1466.88       63.73          8.89     7.17     .008          .04
                         Cognitive      641.09         .07          3.89     .017     .895          .00
                         Value          685.47         .42          4.15     .100     .752          .00
                         Difficulty    1928.43      354.81         11.69    30.36     .000          .16

Covariate (with          Affect        1337.63       62.63          8.26     7.59     .007          .05
(1, 162) degrees         Cognitive      637.66         .07          3.94      .02     .896          .00
of freedom)              Value          665.45      186.27          4.11    45.35     .000          .23
                         Difficulty    1884.43        8.74         11.63      .75     .387          .01

Covariate x Method       Affect        1337.63       43.08          8.26     5.22     .002
x Department             Cognitive      637.66        1.15          3.94      .29     .832
                         Value          665.45        6.67          4.11     1.62     .186
                         Difficulty    1884.43       14.67         11.63     1.26     .290
In summary, the study has established that:
1. Overall, the grouped-data values produced by the iteration algorithm were much closer to the corresponding statistical measure values computed directly from the raw data than were the grouped-data values yielded by the SPSS algorithm, for each of the 9 statistical measures considered and across varying sample sizes (small, medium, and large). Because no actual statistical test was applied, however, the observed differences in absolute mean deviations could not be declared statistically significant.
Follow-up univariate tests found that two dependent variables, Affect and Value, contributed more than the other dependent variables to the statistically significant overall difference.
2. Significant adjusted mean score differences were found between the attitude scores on the dependent variables, adjusted for the PMA covariate, reported by students who were taught the iteration algorithm and those reported by students who were taught the traditional method. Analyses detected a significant method by department interaction effect at neither the multivariate nor the univariate level.
3. Significant adjusted mean score differences were found between the attitude scores reported by mathematics department students and those reported by psychology department students, after controlling for the PMA. The majority of students expressed very positive attitudes towards either instructional method (iteration or traditional), irrespective of the departmental section they came from. In other terms, the adjusted mean attitude scores reported by mathematics department students were, on average, higher than those reported by their psychology department counterparts.
4. The method variable yielded statistically significant results. The majority of students exhibited very positive attitudes no matter which instructional method was introduced to them, after controlling for the PMA. Put differently, the adjusted mean attitude scores reported by students who were taught the iteration algorithm were higher than those reported by students who were exposed to the traditional method.
Conclusion
The data analysis presented in chapter 5 provided a basis for the conclusions and recommendations that follow in chapter 6. All usable survey data were analyzed, and the resulting information was discussed in narrative form and presented in tables where appropriate. The two main sections of this chapter provided an in-depth look at the Monte Carlo simulation results and the MANCOVA and ANCOVA outcomes. The chapter presented specific findings for 3 of the study's 4 research questions, which will be summarized and discussed in chapter 6.
CHAPTER 6: DISCUSSION, SUMMARY, CONCLUSION, AND
RECOMMENDATIONS
This chapter begins with a brief summary of the study. A review of the research questions as described in preceding chapters is discussed in reference to current and
previous research findings. Next, the implications of the study for students, teachers, and researchers are outlined. Then, limitations of the study are identified along with their possible impact upon the results. The chapter ends with a conclusion and recommendations for further research that emerged from the current study's findings.
Summary of the Results and Findings
The purpose of a grouped-data frequency table is to summarize and organize a large set of data. Presenting data in a grouped-data frequency distribution makes inspection of the data set much more manageable than presenting the entire set of raw data. A grouped-data frequency distribution can be considered as a type of descriptive statistic. Grouped-data frequency distributions can be structured either as a frequency table or a graph; both show basic features that are of interest to a statistician.
As Friel, Curcio, and Bright (2001) insisted “apparently attention to the use of grouped-data frequency tables as transition tools for organizing information to be represented graphically is needed” (p. 126) to promote understanding. Because one can move back and forth between frequency tables and graphs, having both types of information has inherent advantages. Furthermore, “histograms may even be more limited if they are used alone and as substitutes for grouped-data frequency distributions rather than being used as accompaniments” (Raymondo, 1999, p. 67). A combination of histograms and grouped-data frequency distributions is ideal to provide a much clearer picture of the data.
Four elements emerged from the review of the related literature as essential for making frequency tables of grouped data. These 4 key elements are as follows: the inclusive range, the number of classes, the class length or class width, and the starting
point. At the moment, although various procedures exist for creating grouped-data frequency tables, none of them is known to be able to determine all 4 of the above key elements in a guess-free fashion.
Besides the computation of the inclusive range, there is no clearly identifiable consensus about how to determine the other 3 key elements. The way in which statistics users decide about these key elements is very often a matter of artistic taste.
After determining the inclusive range, the second step in designing a grouped-data frequency table is commonly fixing on the number of class intervals. Then the third step is figuring out the interval size or interval width (Aron & Aron, 1997; Wessel & Willett,
1960). Finally, the fourth step is determining the starting point (Groebner & Shannon,
1981; Scott, 1979; Shaw, Huffman, & Haviland, 1987). With contemporary practices, the
“number of class intervals is just arbitrary, and so also is the point at which to begin the lowest class and the point at which to end the last class” (Kerr, Hall, & Kozub, 2002, p.
11).
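The conventional four-step construction described above can be sketched in code. The choice of Sturges' rule for step 2 and "lowest score" for step 4 are common conventions used here only for illustration; they are not the dissertation's iteration algorithm, whose five formulas are not reproduced.

```python
# Sketch of the conventional four-step construction of a grouped-data
# frequency table: inclusive range, number of classes, class width,
# starting point. Sturges' rule is one common (arbitrary) convention.
import math

def grouped_table_elements(data):
    """Return (inclusive_range, n_classes, class_width, start) for integer data."""
    lo, hi = min(data), max(data)
    inclusive_range = hi - lo + 1                         # step 1: inclusive range
    n_classes = math.ceil(1 + math.log2(len(data)))       # step 2: number of classes
    class_width = math.ceil(inclusive_range / n_classes)  # step 3: class width
    start = lo                                            # step 4: starting point
    return inclusive_range, n_classes, class_width, start

scores = [52, 67, 71, 58, 63, 80, 45, 74, 66, 59, 61, 70, 55, 78, 49, 68]
print(grouped_table_elements(scores))  # -> (36, 5, 8, 45)
```

Note how every step after the inclusive range involves a convention that could reasonably be replaced by another, which is precisely the arbitrariness the passage above describes.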
In view of the above concerns, the iteration algorithm was invented as a first step to address this gap in knowledge by presenting a structured method for summarizing and displaying large data sets in a frequency distribution of grouped data. It was conceived chiefly in response to the lack of a clear consensus about which existing formula works best and the lack of clarity in the instructions provided by contemporary practices for creating grouped-data frequency tables.
Most existing methods leave too much room for subjectivity and arbitrariness in that they allow guessing. In contrast, the iteration algorithm utilizes five formulas and works in a systematic step-by-step fashion that leaves no room for guessing by providing
comprehensive, clear directions that are less prone to errors when constructing grouped-data frequency tables.
Once the iteration algorithm was developed, two questions emerged. First, as a process, how accurate is the iteration algorithm? Second, as an instructional method, how effective is the iteration algorithm compared to a traditional method when constructing a grouped-data frequency table? To address the first question, a Monte Carlo study was designed. The results of the Monte Carlo simulations helped to check whether the iteration algorithm yielded, on average, smaller deviations than those produced by the SPSS algorithm, based on 9 statistical measures.
To tackle the second and main focus of this study, the Survey of Attitudes towards Statistics (SATS) devised by Schau, Dauphine, and Stevens (1995) was adopted after being slightly revised and a MANCOVA test was thereafter performed using the students’ Survey of Attitudes towards the Method Used (SAMU) scores on 4 dimensions.
Self-reported attitude scores were used in this study because they have been recognized to have a potential role in influencing the learning process (Harris, 1999; Wooten, 1998).
The results and findings are briefly summarized below.
First, as a process, it was found that the iteration algorithm is relatively accurate.
For each of the 9 statistical measures retained, the grouped data yielded by the iteration algorithm were much closer to the corresponding statistical measure value computed directly from the raw data than were the grouped data produced by the SPSS algorithm.
As noted in chapter 4, more positive differences were observed by subtracting the iteration algorithm results from the corresponding SPSS outcomes. The varying sample sizes (small, medium, large, and overall) did not much alter these findings, although the large sample size seemed to yield slightly better results than the small and medium sample sizes.
Second, as an instructional method, after factoring out differences in the students’
Perceived Mathematics Ability (PMA), it was found that students showed more positive attitudes towards the iteration algorithm than towards the traditional method. The majority of psychology department students expressed positive attitudes towards the instructional method taught and reported positive perceived ability in statistics as measured by the PMA variable. Of the 4 facets studied (Affect, Cognitive, Value, and Difficulty), Affect and Value seemed to contribute more to the overall significant results than did the other two dependent variables; Cognitive contributed the least, if at all.
Discussion of the Results and Findings
This section provides a quick overview of the results as they answer the research questions set forth in chapter 1. References to current and previous research findings are made as appropriate.
Research question 1: On average, are the mean of square roots of deviations between the 9 statistical measures computed directly from the raw data and the corresponding 9 statistical measures obtained from the grouped data produced by the iteration algorithm less than those obtained from the grouped data yielded by the SPSS algorithm, for varying sample sizes?
In order to establish, as a process, how relatively accurate the iteration algorithm is, the results of the Monte Carlo study were examined by running and comparing the square roots of deviations between the 9 statistical measures computed directly from the
raw data and the corresponding 9 statistical measures obtained from the grouped data, first using the iteration algorithm and then using the SPSS algorithm. Simple comparisons detected satisfactory differences in favor of the iteration algorithm. Four sample-size conditions were considered (small, medium, large, and overall) across the 9 statistical measures, and the results at each sample-size level and for each measure were again generally satisfactory in support of the relative accuracy of the iteration algorithm.
Since no actual statistical test was performed, however, the differences observed between the two algorithms cannot be claimed to be statistically significant.
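The deviation comparison underlying the Monte Carlo design can be illustrated for a single statistic: compute it from the raw data, recompute it from a grouped-data table (class midpoints weighted by frequencies), and take the square root of the squared deviation. The data and the class scheme below are illustrative only.

```python
# Sketch of one accuracy comparison: raw-data mean vs. grouped-data mean
# approximated from class midpoints and frequencies, and the square root
# of their squared deviation, as in the Monte Carlo design.
import numpy as np

raw = np.array([45, 49, 52, 55, 58, 59, 61, 63,
                66, 67, 68, 70, 71, 74, 78, 80], float)
raw_mean = raw.mean()

# A hypothetical grouping: 5 classes of width 8 starting at 45.
edges = np.arange(45, 45 + 8 * 5 + 1, 8)          # 45, 53, 61, 69, 77, 85
freq, _ = np.histogram(raw, bins=edges)
midpoints = (edges[:-1] + edges[1:]) / 2
grouped_mean = (freq * midpoints).sum() / freq.sum()

deviation = np.sqrt((raw_mean - grouped_mean) ** 2)  # = |raw - grouped|
print(raw_mean, grouped_mean, deviation)             # 63.5 64.0 0.5
```

In the study, such deviations were computed for 9 statistical measures under both the iteration algorithm's grouping and the SPSS grouping, and the two sets of deviations were compared.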
To determine, as an instructional method, how effective the iteration algorithm is compared to a traditional method when constructing a grouped-data frequency table, the SAMU instrument was developed. The SATS devised by Schau,
Dauphine, and Stevens (1995) served as the model. Like the original SATS instrument, the SAMU instrument is broken into 4 components: Affect, Cognitive Competence, Value, and Difficulty.
The Affect component assesses positive and negative feelings about statistics, and thus about the instructional method used. The Cognitive Competence component assesses self-beliefs about ability in statistics, and thereby in the instructional method used. The
Value component assesses beliefs about the usefulness and importance of the instructional method taught. The Difficulty component assesses beliefs about the difficulty of the instructional method used based on beliefs about the nature of statistics.
196
These 4 subscales were also selected for analyses because they represent the students’ attitudes towards statistics that some research has shown to be important for statistics achievement (Elmore et al., 1993; Green, 1994; Roberts & Saxe, 1982; Schau et al., 1993; Wise, 1995). The survey uses a seven-point Likert scale that ranges from 1
(strongly disagree) to 7 (strongly agree). The instrument used in this study, SAMU, attempted to explore the effects of the instructional method used (either the iteration or traditional method) on students' attitudes and perceptions concerning the creation of grouped-data frequency tables. Negatively worded items were reverse coded so that higher scores always indicate more positive attitudes towards the instructional method used.
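The reverse-coding step just mentioned is a simple transformation on a seven-point Likert scale: a score s becomes 8 - s, so agreement with a negatively worded item maps to a low (negative-attitude) score. The item responses below are illustrative.

```python
# Sketch of reverse coding negatively worded items on a 1..7 Likert scale:
# score -> (scale_max + 1) - score, so higher always means more positive.
def reverse_code(score, scale_max=7):
    """Reverse a Likert score on a 1..scale_max scale."""
    return (scale_max + 1) - score

negatively_worded = [1, 2, 7, 4, 6]           # hypothetical raw responses
recoded = [reverse_code(s) for s in negatively_worded]
print(recoded)  # -> [7, 6, 1, 4, 2]
```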
The students' attitude scores on the SAMU instrument were subjected to a multivariate analysis of covariance, with PMA scores entered as a covariate. This PMA covariate replaced the initially planned Expected Course Grade (ECG) covariate, which, if included in the final data analyses, might have biased the MANCOVA test results.
Prior to using the multivariate analysis of covariance, screening procedures for conformity to the multivariate analyses of covariance assumptions were carried out.
Results of evaluation of assumptions of independence of observations, normality, linearity, homogeneity of variance-covariance matrices, among others, were satisfactory.
In particular, the non-violation of the homogeneity-of-slopes assumption indicated that the relationship between the covariate and the dependent variables did not differ significantly as a function of the instructional method used. The PMA covariate was also judged to be adequately reliable for covariance analysis.
197
Research question 2: Is there an overall interaction between the academic department and the instructional method used, adjusting for differences in the student perceived mathematics ability?
Because of the small probability value associated with the F test for Wilks' Lambda on the MANCOVA output, the null hypothesis that the population means on the 4 dependent variables, adjusted for differences in the PMA covariate, are equal across levels of the two factors was rejected at the .01 significance level. Significant differences among the adjusted mean scores for the two groups of students were found. In other terms, there were statistically significant multivariate differences between the attitude scores reported on the SAMU instrument by students who were taught the iteration algorithm and those reported by students who were taught the traditional method, after factoring out the effects of the PMA. A further look at the results revealed that the adjusted mean scores yielded by the iteration algorithm were, on average, greater than those produced by the traditional method.
No statistically significant interaction effect between the two independent variables was found. This means that the academic department in which a student was enrolled did not have a real impact on the student’s attitudes towards either instructional method taught. Also, a non-significant covariate by group interaction effect suggested that the differences on the dependent variables between the two groups did not vary significantly as a function of the PMA covariate.
198
Following the significant multivariate results, an ANCOVA test was performed to see where the differences arose. Before using the ANCOVA, assumptions were checked for conformity; the evaluation of the main required assumptions was satisfactory.
The ANCOVA results revealed that two facets, Value and Affect, contributed more than the other dependent variables to the overall significance, F (1, 165) = 48.31, p < .001, and F (1, 165) = 9.32, p < .01, respectively, after factoring out the effects of the PMA. Their respective effect sizes ranged from small to moderate.
Research question 3: Are there differences in the mean attitudes scored by students as a function of the academic department, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ Perceived
Mathematics Ability?
Statistically significant department main effects were found after adjustment for differences among the groups with respect to the PMA. This means that there were statistically significant differences in adjusted mean scores between the attitude scores reported by mathematics department students and those reported by psychology department students. Irrespective of the instructional method taught, mathematics department students achieved higher scores on the SAMU instrument than did psychology students, after controlling for the PMA.
The ANCOVA follow-up results showed that two facets contributed more than the other dependent variables to the department main effects: Affect and Difficulty, F (1, 165) = 26.03, p < .01 and F (1, 165) = 8.19, p < .01, respectively.
These results are similar to those found by Ford, Liwag-McLamb, and Foley (1998).
199
Research question 4: Are there differences in the mean attitudes scored by students as a function of the instructional method used, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’
Perceived Mathematics Ability?
Statistically significant method main effects were found, after adjustment for differences among the groups with respect to the PMA. This means that there were significant differences in adjusted mean scores between attitude scores reported on the dependent variables and the covariate by students who were taught the iteration algorithm and attitude scores reported by students who were exposed to the traditional method.
The method variable moderated the relationship between the academic department and the total attitude scores on the dependent variables and the covariate, such that the pattern of department score differences varied significantly between the iteration algorithm and the traditional method. Both mathematics and psychology department students scored higher on the iteration algorithm than they did on the traditional method. In other words, probably due to their background, mathematics department students showed more positive attitudes toward both instructional methods than did psychology students. For both groups, the majority of adjusted mean scores on the iteration algorithm exceeded those on the traditional method.
The above findings imply that the Affect facet contributed the most to the overall significance and to the main effect significance, followed by Value and Difficulty. All of these MANCOVA and ANCOVA results are similar to those found by Dauphine, Schau, and Stevens (1997), Ford, Liwag-McLamb, and Foley (1998), Mills (2004), and Schau, Dauphine, and Stevens (1995).
Explanations of the Results
A first explanation for these results may be that attitude scores reported by the students were independent of the academic department (mathematics or psychology).
Students enrolled in a MATH 250 class or in a PSY 120 class could have come from any academic department. They were not necessarily mathematics or psychology major students.
A second explanation may be that students enrolled in a MATH 250 course had higher PMA scores or a stronger mathematics background (more mathematics courses taken, for instance) than their counterparts taking a PSY 120 course. This explanation is based on the fact that a MATH 250 course is generally offered to sophomores, whereas a
PSY 120 is taught essentially to freshmen.
A third explanation could be that the instructors involved with the survey administration were completely new to the iteration algorithm. They had to master it first before teaching it to their respective students. It is obviously a difficult task for an instructor to learn a new algorithm in just a few hours before explaining it to the students.
This difficulty could have resulted in less effective presentation.
Additionally, it should be acknowledged that instructor characteristics are likely to have some impact on students' attitudes towards the instructional method taught. Comments made by some students revealed that at least one instructor did not feel confident and seemed unprepared when teaching the iteration algorithm; he changed his class interval choices at least 3 times.
Another reason could be that the mathematics department instructors were PhD students, whereas in the psychology department an Assistant Professor conducted the presentations. It is possible that the students enrolled in MATH 250 classes were more at ease with their instructors, Asian PhD students, than those registered in PSY 120 classes taught by a White American Assistant Professor.
Finally, psychology students may have developed more negative feelings towards statistics than did mathematics students. Negative student attitudes towards statistics may have created a major obstacle for effectively learning the iteration algorithm (Cashin &
Elmore, 1997; Fullerton & Umphrey, 2001; Perney & Ravid, 1991; Schultz, Drogosz,
White, & Distefano, 1988; Waters, Matelli, Zakrajsek, & Popovich, 1988).
As indicated in Table 24, all 4 facets of the SAMU instrument are positively associated with each other and with the covariate. More specifically, high levels of Affect are positively and significantly related to high levels of Difficulty, and a more favorable PMA is positively correlated with Value and Affect. Cognitive is not significantly associated with any of the 3 other facets or with PMA.
Implications of the Study
This section will offer recommendations to practitioners on how they can best use the results and findings found from this study. The results have implications for students, teachers, researchers, and practitioners. These implications are summarized below.
Implications for Students and Teachers
A unique solution approach. For learning and teaching purposes, the iteration algorithm stops when a first solution is attained. Although the review of the related literature established that the unique-solution feature was shown to work well only for some learners, the iteration algorithm is, in light of the results and findings of this study, the method most recommended here for constructing grouped-data frequency tables. It has further been demonstrated that a unique-solution approach works exceptionally well for children with autism, who face serious cognitive difficulties dealing with multiple-solution problems (Bernard-Opitz & Nakhoda-Sapuan, 2001).
The unique-solution approach may help to reduce boredom for some students and frustration, anger, and fear for others (Schultz, Drogosz, White, & Distefano, 1998), as well as vagueness. As such, it is proposed that the iteration algorithm (unique-solution approach) be taught predominantly to secondary school and undergraduate students.
Because of diverse learning styles (Allwood, 1990; Harris & Schau, 1999;
McKeachie, 2003) and the fact that significant differences could not be established from the Monte Carlo simulations' findings, the results of this study suggest that the iteration algorithm be used in combination with any traditional method, whenever applicable, to accommodate all kinds of learning styles. Onwuegbuzi (1998) acknowledged that learning styles have been implicated as antecedents of statistics anxiety. Saxe (1990, p. 122) added that “research on inquiry demonstrates that learners come to change their ideas when they discover alternatives that seem plausible and appear to be more useful”.
Implications for Researchers and Practitioners
A multiple solutions approach. As stated in chapter 2, there are some other statistics users in general and students in particular who are minimally bored by vagueness and who would prefer a teaching procedure with little structure. This may provide a kind of spontaneity and freedom to pursue independent ideas that can be exciting and motivating for them (Harris & Schau, 1999).
As put by Moore and Notz (2006), “statistics software will choose classes of a histogram for you, and may use slightly different rules from one computer software to another. The computer’s choice is usually a good one, but you can change it if you want”
(p. 207). Since a default histogram generated by computer software might be too skewed or have too many empty classes, the iteration algorithm can be used to decide how many class intervals the histogram should have, its class width, and its starting point.
For practitioners and researchers doing their own work, the iteration algorithm can be extended beyond the first solution achieved and run until it produces a first grouped-data frequency table with either an empty first class interval or an empty last class interval. Multiple acceptable solutions would be produced, and only the most appropriate solution, according to the needs of the statistical study and the researcher's personal taste, would be retained. Based on the review of the related literature and personal experience, it is urged that the iteration algorithm (multiple-solutions approach) be taught predominantly to graduate students and researchers doing their own work.
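The multiple-solutions idea can be sketched as a loop that builds a candidate table for each number of classes and stops at the first table with an empty end class, collecting every acceptable table produced along the way. This is only an illustration of the stopping rule: the dissertation's five formulas are not reproduced here, and the simple width and starting-point choices below are stand-ins.

```python
# Heavily simplified sketch of the multiple-solutions stopping rule:
# iterate over candidate class counts, build a table for each, and stop
# when a table's first or last class is empty. Width and start are
# simple stand-ins for the iteration algorithm's actual formulas.
import math

def candidate_tables(data):
    """Return a list of (n_classes, width, frequencies) solutions."""
    lo, hi = min(data), max(data)
    inclusive_range = hi - lo + 1
    solutions = []
    for k in range(2, len(data)):                    # candidate class counts
        width = math.ceil(inclusive_range / k)
        edges = [lo + i * width for i in range(k + 1)]
        freq = [sum(e <= x < e + width for x in data) for e in edges[:-1]]
        if freq[0] == 0 or freq[-1] == 0:            # empty end class: stop
            break
        solutions.append((k, width, freq))
    return solutions

scores = [45, 49, 52, 55, 58, 59, 61, 63, 66, 67, 68, 70, 71, 74, 78, 80]
for k, width, freq in candidate_tables(scores):
    print(f"{k} classes, width {width}: {freq}")
```

Each printed line is one acceptable table; the researcher would then pick the one that best fits the study's needs.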
Summary of the Implications
If it is true that human beings learn by doing (Schank, 2005) and by personal involvement (Gelman, 2005), then being able to modify a default histogram generated by SPSS, for instance, to better represent a dataset (Garfield & Chance, 2000) might be more beneficial to statistics users than simply producing the default histogram. The choices derived from applying the iteration algorithm (unique-solution approach) can be used to modify the default histogram. As a well-structured instructional method that yields a unique solution in a guess-free fashion, the iteration algorithm is what this study chiefly favors for statistics instructors and graduate students taking an elementary statistics course.
Limitations
The limitations listed in chapter 1 were written before the data were collected.
This section reviews the limitations of the study after the data were collected and analyzed.
A first limitation has to do with the scope of the Monte Carlo study. The Monte Carlo simulations were limited to only 150 cases because the algorithm SPSS uses to generate histograms could not be obtained. The grouped-data statistical measure differences between the two algorithms were computed in Microsoft Excel, but the data had to be entered manually into the spreadsheet. Had the SPSS algorithm been available, the grouped data yielded by the iteration algorithm could have been compared to those produced by the SPSS defaults on a much larger scale comprising thousands of cases.
A second limitation is that the subscale scores have relatively poor reliability, especially for Value and Cognitive. In particular, Value was identified as one of the two facets contributing most to the statistically significant overall difference observed between the two instructional methods, so one would have expected it to produce a strong reliability coefficient. Cognitive also had a low reliability coefficient, probably because of the difficulty some students experienced in correctly applying the five formulas of the iteration algorithm while constructing their own grouped-data frequency tables.
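Reliability of this kind is conventionally reported as Cronbach's coefficient alpha, computed from the item-score variances and the variance of the subscale totals. The following sketch uses invented scores, not the study's SAMU responses:

```python
def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of columns, one list of
    respondent scores per item (population variances throughout)."""
    def pvar(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent subscale totals
    return k / (k - 1) * (1 - sum(pvar(col) for col in items) / pvar(totals))

# Perfectly consistent items give alpha = 1.0; noisier items drive it down.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

A subscale such as Value or Cognitive would show a low alpha when its items fail to vary together across respondents.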
Third, the exploratory factor analysis results did not confirm the factor structure underlying the item responses reported by the authors of the original SATS instrument. The SAMU items loaded more cleanly on three factors rather than the four identified by the original scale's authors. In addition, the weak correlations between subscale factors (albeit mostly statistically significant, owing to a relatively large sample size) reflect a lack of equivalence in the measured constructs.
Conclusion
The conclusions of this study were based on the data collected from the surveys and are assumed to be characteristic of the students of the two Ohio University departments selected. The anticipated result was that students taught the iteration algorithm would hold more positive attitudes toward the instructional method used than those who received the traditional method when constructing their own grouped-data frequency tables. This expectation was borne out. Mathematics and psychology students also differed significantly on the set of two adjusted means.
The ECG variable was considered the covariate in the original plan for this study. After the original statistical analyses were conducted, the ECG variable was found to be statistically insignificant. One explanation of why ECG was a poor covariate, producing an interaction effect, could be that ECG used a 10-point scale (scores ranged from 1 to 10), whereas the other items ranged from 1 to 7. Another explanation could be that the two courses (MATH 250 and PSY 120) perhaps differ in content; mathematics and psychology students might face different difficulty levels and thus expect different grades.
In the final data analyses, the initial ECG covariate was replaced by PMA as the variable that adequately measures the students' perceived ability in statistics. With this covariate, PMA was found to be significantly related to all of the dependent variables other than Cognitive. This means that the group means on the dependent variables were significantly adjusted for differences on the PMA covariate.
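The adjustment follows the standard ANCOVA formula: each group's outcome mean is shifted along the pooled within-group regression slope by the distance between the group's covariate mean and the grand covariate mean. The sketch below uses invented (covariate, outcome) pairs, not the study's PMA scores:

```python
def adjusted_means(groups):
    """ANCOVA-style adjusted means; `groups` maps a group label to a
    list of (covariate, outcome) pairs.  Illustrative data only."""
    sxy = sxx = 0.0  # pooled within-group cross-products and sums of squares
    stats = {}
    for label, pairs in groups.items():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x in xs)
        stats[label] = (mx, my, len(pairs))
    b = sxy / sxx  # pooled within-group regression slope
    n = sum(size for _, _, size in stats.values())
    grand_x = sum(mx * size for mx, _, size in stats.values()) / n
    return {label: my - b * (mx - grand_x)
            for label, (mx, my, _) in stats.items()}
```

With a common within-group slope, the adjusted means isolate the group difference that remains after equating the groups on the covariate.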
The review of the related literature revealed that people learn in different ways. As Allwood (1990) suggested, there are many roads to learning; students, like anyone else, learn in different ways and vary in their abilities to perform certain tasks, so no single instructional method will work for everybody. In the same way that many authors suggested using histograms in combination with grouped-data frequency tables, the iteration algorithm is recommended for use alongside any traditional method, to accommodate both those who cannot stand ambiguity and might prefer a unique-solution approach and those who prefer a multiple-solution approach.
The iteration algorithm is particularly promising in that it offers a unique-solution approach primarily for teaching purposes (the only aspect of the iteration algorithm discussed in this study). It can also be extended to serve as a multiple-solution approach for researchers. This approach is supported by Lapin (1980), for instance, who stated that “one way to decide how many class intervals to use would be to try several – plotting a histogram for each and selecting the one that provides the most logical explanation of the underlying population pattern” (p. 21). He further specified that “the histogram chosen for permanent use should be the one that achieves the best balance between summarizing the data and preserving detail” (Lapin, 1987, p. 18). Although the idea of plotting several histograms and choosing the one that best describes the data at hand is shared by Garfield and Chance (2000) and Heaton and Mickelson (2002), this study prefers the unique-solution approach.
To summarize the answer to the first research question, the absolute mean deviations from the raw data yielded by the SPSS algorithm were indeed greater than those yielded by the iteration algorithm. The differences observed between the statistical measure values produced by the two algorithms could not be declared significant, however, as no formal statistical test was performed; only simple comparisons were made.
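That comparison can be reproduced for any single statistic by recomputing it from the grouped table, using class midpoints, and taking the absolute deviation from the raw-data value. A generic sketch, not the Excel workbooks used in the study:

```python
def grouped_mean(table):
    """Mean estimated from a grouped-data frequency table whose rows are
    (lower_edge, upper_edge, frequency), using class midpoints."""
    n = sum(f for _, _, f in table)
    return sum((lo + hi) / 2 * f for lo, hi, f in table) / n

def abs_deviation_from_raw(data, table):
    """Absolute deviation of the grouped-data mean from the raw mean."""
    return abs(grouped_mean(table) - sum(data) / len(data))
```

For the data 1 through 8 grouped into classes (0.5, 4.5) and (4.5, 8.5) with four cases each, the midpoints 2.5 and 6.5 give a grouped mean of 4.5, equal to the raw mean, so the deviation is zero; a grouping whose classes sit off-center shifts the estimate and produces a nonzero deviation.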
The conclusion for the second research question was that there were statistically significant overall differences in the adjusted mean attitude scores of students taught the iteration algorithm versus those who received the traditional method. The analyses did not, however, detect a significant method-by-department interaction effect at either the multivariate or the univariate level.
The conclusion for the third research question was that there were statistically significant differences in adjusted mean attitude scores between the two department sections: the majority of mathematics students reported more positive attitude scores than their counterparts in the psychology department. The conclusion for the fourth research question was that there were statistically significant differences in adjusted mean attitude scores between the two instructional methods: students taught the iteration algorithm held more positive attitudes in all four domains than those exposed to the traditional method. Holding positive attitudes towards the instructional method taught was recognized in the related literature as facilitating effective learning.
The overall conclusion of this study is that the iteration algorithm had a stronger positive impact on students' attitudes towards the instructional method used than did the traditional method. Based on the ANCOVA post hoc test results, most students found the iteration algorithm useful for their future professional and personal lives. Yet, as undergraduates, they might have experienced some difficulties in successfully applying the iteration algorithm to actually build their own grouped-data frequency tables. The iteration algorithm appears to be more appealing to graduate students than to undergraduates: the two pilot studies showed that graduate students reported more positive attitude scores towards the instructional method taught than did the undergraduate students.
Despite the limitations encountered, the findings of this study are promising and encouraging. Nevertheless, only some of them may generalize to the target population. Sprenkle and Moon (1996) suggested that a response rate between 60% and 80% might be needed to generalize at least some findings to the general population. Because the data collected were self-reported, the researcher assumes that participants' responses accurately reflected their feelings (Heppner, Kivlighan, & Wampold, 1992).
Recommendations for Future Research
The following recommendations for future research are based on the review of the literature, the analysis of the data, the results, the conclusions, and the discussion. First, the main focus of this study was an ex post facto analysis conducted without the benefit of a control group and without random assignment of participants. As such, it cannot establish a cause-and-effect relationship between exposure to the iteration algorithm and more positive attitudes toward its applicability. To establish causality, future research should include a control group and randomly assign subjects to the two instructional methods; internal validity requires showing that differences in the dependent variable were caused by the independent variables and not by something else (Stevens, 2002). It is also suggested that the SATS instrument be administered before the SAMU questionnaire.
Second, the current study used a cross-sectional, one-shot survey design. A similar study should be planned within the same school using a longitudinal design to determine whether changes over time become perceptible.
Third, gender may be included as a possible covariate in addition to a variable measuring the students' perceived ability in mathematics.
Fourth, white students were over-represented, which could skew the results. A more diverse accessible population, in terms of demographic characteristics and academic background, along with several additional variables, might provide further insight into the study's findings.
Additionally, this study conducted an exploratory factor analysis. Future studies should use confirmatory factor analysis to replicate the factor-structure findings in a very large sample of participants.
Although the results of this study may still generalize to Ohio University students taking an elementary statistics course, generalization to all American college students is questionable and should be done only with caution. Also, Wisenbaker, Scott, and Nasser (2000) reported that attitude components measured at the end of a course predicted final course achievement, whereas those measured at the beginning did not. Since the surveys in this study were conducted at the beginning of the course, replication is recommended, preferably at both the beginning and the end of the course, while accommodating the shortcomings addressed here.
REFERENCES
Acock, A. C. (2005, November). SAS, Stata, SPSS: A comparison. Journal of Marriage
& Family, 67(4), 1093-1095.
Adkins, D. C. (1964). Statistics: An introduction for students in the behavioral sciences.
Columbus, OH: Charles E. Merrill Books.
Allwood, C. M. (1990, June). On the relation between justification of solution method
and correctness of solution in statistical problem solving. Scandinavian Journal
of Psychology, 31(3), 181-190.
Anderson, A. J. B. (1989). Interpreting data: A first course in statistics. New York:
Chapman and Hall.
Anderson, D., Sweeney, D., & Williams, T. (2004). Essentials of modern business
statistics with Microsoft Excel (2nd ed.). Mason, OH: South-Western.
Arsham, H. (2001). Statistical data analysis: Prove it with statistics. Retrieved
October 10, 2004, from: http://ubmail.ubalt.edu/~harsham/stat-data/opre330.htm
Bakker, A. (2004). Design research in statistics education on symbolizing and computer
tools. (Doctoral dissertation, Freudenthal Instituut, Universiteit Utrecht, the
Netherlands, 2004). Available online at:
http://www.stat.auckland.ac.nz/~iase/publications/dissertations/dissertations.php
Bandalos, D. L., Finney, S. J., & Geske, J. A. (2003, September). A model of statistics
performance based on achievement goal theory. Journal of Educational
Psychology, 95(3), 604-616.
Bandura, A. (1977, March). Self-efficacy: Toward a unifying theory of behavioral
change. Psychological Review, 84, 191-215.
Bandura, A. (1986). Social foundations of thought and action: A social cognitive theory.
Englewood Cliffs, NJ: Prentice-Hall.
Bartz, A. E. (1979). Descriptive statistics for education and the behavioral sciences
(5th ed.). New York: Macmillan.
Bartz, A. E. (1999). Basic statistical concepts (4th ed.). Upper Saddle River, NJ: Merrill.
Batanero, C., Godino, J. D., Green, D. R., Holmes, P., & Vallecillos, A. (1994). Errors and difficulties in understanding elementary statistical concepts. International Journal of Mathematical Education in Science and Technology, 25(4), 527-547.
Behrens, J. T. (1997). Toward a theory and practice of using interactive graphics in
statistical education. In J. B. Garfield and G. Burrill (Eds.), Research on the role
of technology in teaching and learning statistics, 111-121. Voorburg, The
Netherlands: International Statistical Institute.
Beins, B. (1985, October). Teaching the relevance of statistics through consumer-oriented
research. Teaching of Psychology, 12(3), 168-169.
Bell, J. A. (1998, October). Success in statistics: gender, learning style, nationality, or
student classification. Proceedings of the Fifth Annual Southwestern Business
Administration Teaching Conference, 4, 27-29.
Benedict, J. O., & Anderton, J. B. (2004, Summer). Applying the just-in-time teaching
approach to teaching statistics. Teaching of Psychology, 31(3), 197-199.
Bluman, A. G. (2004). Elementary statistics: A step-by-step approach (6th ed.). Boston,
MA: McGraw Hill.
Borresen, C. R. (1990, Winter). Success in introductory statistics with small groups.
College Teaching, 38(1), 26-28.
Bradstreet, T. E. (1996, February). Teaching introductory statistics courses so that non-
statisticians experience statistical reasoning. The American Statistician, 50(1), 69-
78.
Brase, C. H., & Brase, C. P. (1999). Understanding basic statistics: Concepts and
methods. Boston, MA: Houghton Mifflin.
Brase, C. H., & Brase, C. P. (2001). Understanding basic statistics (2nd ed.).
Boston, MA: Houghton Mifflin.
Brown, J. D. (2000, Autumn). Statistics corner. Questions and answers about language testing statistics: What is construct validity? Shiken: JALT Testing & Evaluation SIG Newsletter, 4(2), 7-10. Retrieved October 12, 2002 from
http://www.jalt.org/test/bro_8.htm
Brown, T. S., & Brown, J. T. (1995, December). Prerequisite course grades and attitudes
toward statistics. College Student Journal, 29(4), 502-507.
Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs
for research. Boston, MA: Houghton Mifflin.
Carlisle, Y. (1985). Descriptive statistics. Supervision: Technical aspects of supervision.
The choice series # 43. A self-learning opportunity. Washington, England:
Northern Regional Management Centre (ERIC Document Reproduction Service
No. ED268248).
Carver, R. P. (1978, August). The case against statistical significance testing. Harvard
Educational Review, 48(3), 378-399.
Cashin, S. E., & Elmore, P. B. (1997). Instruments used to assess attitudes toward
statistics: A psychometric evaluation. Paper presented at the annual meeting of the
American Educational Research Association, Chicago, IL.
Champion, D. J. (1970). Basic statistics for social research. Scranton, PA: Chandler.
Chatfield, C. (1995). Problem solving: A statistician's guide (2nd ed.). New York:
Chapman & Hall.
Cherian, V. I., & Glencross, M. J. (1997, June). Sex, socioeconomic status, and
attitude toward applied statistics among postgraduate education students.
Psychological Reports, 80(3), 1385-1386.
Chou, Y. L. (1963). Applied business and economics statistics. New York: Holt, Rinehart
and Winston.
Cobb, P. (1994, October). Where is the mind? Constructivist and socio-cultural
perspectives on mathematical development. Educational Researcher, 23(7), 13-20.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.).
Hillsdale, NJ: Lawrence Erlbaum Associates.
Coiro, J. (2005, February). Philosophy of teaching and learning statement.
Comprehensive Examination for Cognition and Instruction. Storrs, CT:
University of Connecticut.
Conners, F. A., McCown, S. M., & Roskos-Ewoldsen, B. (1998, January). Unique challenges in teaching undergraduate statistics. Teaching of Psychology, 25(1), 40-42.
Craft, J. L. (1990). Statistics and data analysis for social workers. Itasca, IL: Peacock.
Cruise, R. J., Cash, R. W., & Bolton, D. L. (1985, October). Development and validation
of an instrument to measure statistical anxiety. American Statistical Association
Proceedings of the Section on Statistics Education, 4, 92-97.
Curcio, F. R. (1987, November). Comprehension of mathematical relationships expressed in graphs. Journal for Research in Mathematics Education, 18(5), 382-393.
Dauphinee, T. L., Schau, C., & Stevens, J. J. (1997, January). Survey of attitudes toward
statistics: Factor structure and factorial invariance for women and men. Structural
Equation Modeling, 4(2), 129-141.
Davies, G. R. (1929, December). The analysis of frequency distributions. Journal of the
American Statistical Association, 24(168), 349-359.
Davies, O. L., & Goldsmith, P. L. (1980). Statistical methods in research and production.
New York: Longman.
De Veaux, R. D., Velleman, P. F., & Bock, D. E. (2006). Intro Stats (2nd ed.). Boston,
MA: Pearson Addison Wesley.
DeBord, K., Aruguete, M., & Muhlig, J. (2004, February). Are computer assisted
teaching methods effective? Teaching of Psychology, 31(1), 65-69.
DeRoma, V. M., Martin, K. M., & Kessler, M. L. (2003, June). The relationship
between tolerance for ambiguity and need for course structure. Journal of
Instructional Psychology, 30(2), 104-109.
Devore, J. L., & Peck, R. (2005). The exploration and analysis of data (5th
ed.). Belmont, CA: Brooks/Cole.
Dillbeck, M. C. (1983, February). Teaching statistics in terms of the knower. Teaching of
Psychology, 10(1), 18-20.
Doane, D. P. (1976, November). Aesthetic frequency classification. American
Statistician, 30(4), 181-183.
Dubois, E. N. (1979). Essential statistical methods for business. New York: McGraw-
Hill.
Dugas, M. J., Gosselin, P., & Ladouceur, R. (2001, October). Intolerance of uncertainty
and worry: investigating specificity in a non clinical sample. Cognitive Therapy
and Research, 25(5), 551-558.
Dunn, D. S. (2001). Two heads are better than one: Learning statistics in common. Paper
presented at the Annual Meeting of the American Psychological Association
(109th, San Francisco, CA, August 24-28, 2001).
Eggen, P., & Kauchak, D. (2001). Educational psychology: Windows on classrooms (5th
ed.). Upper Saddle River, NJ: Prentice Hall.
Elifson, K. W., Runyon, R. P., & Haber, A. (1990). Fundamentals of social statistics
(2nd ed.). New York: McGraw-Hill.
Elmore, P. B., Lewis, E. L., & Bay, M. L. (1993). Statistics achievement: A function of
attitudes and related experience. Paper presented at the annual meeting of the
American Educational Research Association, Atlanta, GA.
Elzey, F. F. (1967). A first reader in statistics. Belmont, CA: Wadsworth.
Fago, G. C. (1996). Psychometric scales: Learning by doing. Proceeding of the
Annual Conference on Undergraduate Teaching of Psychology (10th, Ellenville,
NY, March 20-22, 1996).
Farnsworth, D. L. (2000, Autumn). The case against histograms. Teaching Statistics,
22(3), 81-85.
Fennema, E., & Behr, M. J. (1980). Individual differences and learning of mathematics
In: R. J. Shumway (ed.), Research in Mathematics Education, 356-387. Reston,
VA: National Council of Teachers of Mathematics.
Ferguson, G. A. (1981). Statistical analysis in psychology and education (5th ed.). New
York: McGraw-Hill Book.
Ferguson, G. A. (1989). Statistical analysis in psychology and education (6th ed.). New
York: McGraw-Hill Book.
Finney, S. J., & Schraw, G. (2003, April). Self-efficacy beliefs in college statistics courses. Contemporary Educational Psychology, 28(2), 161-186.
Fisch, R. (1971). Course evaluation, test anxiety, and final test results in a statistics course. Zeitschrift für Entwicklungspsychologie und Pädagogische Psychologie, 30, 361-366.
Follettie, J. F. (1980). Bar graph-using operations and response time (Technical Report).
Los Alamitos, CA: Southwest Regional Laboratory for Educational Research and
Development. (ERIC Document Reproduction Service No. ED250381).
Ford, T. M., Liwag-McLamb, M. G., & Foley, L. A. (1998, June). Perceptions of rape
based on sex and sexual orientation of victim. Journal of Social Behavior &
Personality, 13(2), 253-263.
Fowler, W. A. (1988). Survey research methods. Newbury Park, CA: Sage.
Fox, J. A., Levin, J., & Harkins, S. (1993). Elementary statistics in behavioral research.
New York: Harper Collins.
Fraenkel, J. R., & Wallen, N. E. (2003). How to design and evaluate research in
education (5th ed.). Boston, MA: McGraw-Hill.
Frank, H., & Althoen, S. C. (1994). Statistics: Concepts and applications. Cambridge,
England: Cambridge University.
Frankfort-Nachmias, C., & Leon-Guerrero, A. (2000). Social statistics for a diverse
society. Thousand Oaks, CA: Pine Forge.
Freedman, D., & Diaconis, P. (1981). On the histogram as a density estimator: L2 theory.
Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 57, 453-476.
Freire, P. (1993). Pedagogy of the oppressed (Myra Bergman Ramos, Trans.). New York:
Continuum. (Original work published 1970).
Freund, J. E. (1998). Modern elementary statistics (12th ed.). Englewood Cliffs, NJ:
Prentice-Hall.
Friel, S. N., Curcio, F. R., & Bright, G. W. (2001, March). Making sense of graphs:
Critical factors influencing comprehension and instructional implications. Journal
for Research in Mathematics Education, 32(2), 124-158.
Fullerton, J. A., & Umphrey, D. (2001). An analysis of attitudes toward statistics:
Gender differences among advertising majors. Paper presented at the annual
meeting of the Association for Education in Journalism and Mass Communication
(Washington, DC, August 5-8, 2001). (ERIC Document Reproduction Service
No. ED456479).
Galagedera, D., Woodward, G., & Degamboda, S. (2000, September). An investigation
of how perceptions of mathematics ability can affect elementary statistics
performance. International Journal of Mathematical Education in Science and Technology, 31(5), 679-689.
Garfield, J. (1995, April). How students learn statistics. International Statistical Review,
63(1), 25-34.
Garfield, J., & Chance, B. (2000). Assessment in statistics education: Issues and
challenges. Mathematical Thinking and Learning, 2(1&2), 99-125.
Gay, L. R., & Airasian, P. (2000). Educational research (6th ed.). Upper Saddle River,
NJ: Prentice-Hall.
Gelman, A. (2005, February). A course on teaching statistics at the university level.
Special section: Preparing graduate students to teach statistics. The American
Statistician, 59(1), 1-4.
Gelman, A. & Nolan, D. (2002). Teaching statistics: A bag of tricks. London, UK:
Oxford University.
Giorgis, C., & Johnson, N. J. (2001, Summer). The learning process. The Reading
Teacher, 55(1), 86-94.
Giraud, G. (1997, January). Cooperative learning and statistics instruction. Journal of
Statistics Education, 5(1), 1-14.
Glasnapp, D. R., & Poggio, J. P. (1985). Essentials of statistical analysis for the
behavioral sciences. Columbus, OH: Charles E. Merrill.
Goodwin, C. J. (2002). Research in psychology (3rd ed.). NY: John Wiley & Sons.
Graham, A. (1987). Statistical investigations in the secondary school. New York:
Cambridge University.
Gratz, Z. S., Volpe, G. D., & Kind, B. M. (1993). Attitudes and achievement in
introductory psychological statistics classes: Traditional versus computer-
supported instruction. Paper presented at the annual Conference of Teachers of
Psychology, Ellenville, New York, March 25, 1993, 17 p.
Gravetter, F. J., & Wallnau, L. B. (1985). Statistics for the behavioral sciences. St. Paul,
MN: West.
Gravetter, F. J., & Wallnau, L. B. (1992). Statistics for the behavioral sciences: A
first course for students of psychology and education (3rd ed.). St. Paul, MN:
West.
Gravetter, F. J., & Wallnau, L. B. (2000). Statistics for the behavioral sciences
(5th ed.). Belmont, CA: Wadsworth/Thomson Learning.
Green, S. B., & Salkind, N. J. (2003). Using SPSS for windows and
Macintosh: Analyzing and understanding data (3rd ed.). Upper Saddle River, NJ:
Prentice-Hall.
Greeno, J., Collins, A., & Resnick, L. (1996). Cognition and learning. In: D. Berliner and
R. Calfee (eds.), Handbook of Educational Psychology, pp. 15-46. New York:
Macmillan.
Groebner, D. F., & Shannon, P. W. (1981). Business statistics: A decision-making
approach. Columbus, OH: Charles E. Merrill.
Hamburg, M. (1989). Statistical analysis for decision making (5th ed.). San Diego,
CA: Harcourt Brace Jovanovich.
Hamilton, L. C. (1990). Modern data analysis: A first course in applied statistics (2nd
ed.). Belmont, CA: Wadsworth.
Harnett, D. L. (1982). Statistical methods (2nd ed.). Reading, MA: Addison-Wesley.
Harris, M. B. (1998). Basic statistics for behavioral science research (2nd ed.).
Boston, MA: Allyn and Bacon.
Harris, M. B., & Schau, C. (1999). Successful strategies for teaching statistics. In S. N.
Davis, M. Crawford, & J. Sebrechts (Eds.). Coming into Her Own: Educational
Success in Girls and Women (pp. 193-210). San Francisco, CA: Jossey-Bass.
Hawkins, C. A., & Weber, J. E. (1980). Statistical analysis: Applications to business and
economics. New York: Harper & Row.
Hays, W. L. (1967). Basic statistics: Basic concepts in psychology series. Belmont, CA:
Brooks.
Hays, W. L. (1981). Statistics (3rd ed.). New York: Holt, Rinehart and Winston.
Heaton, R. M., & Mickelson, W. T. (2002, March). The learning and teaching of
statistical investigation in teaching and teacher education. Journal of Mathematics Teacher Education, 5(1), 35-59.
Heckler, J. B., Fuqua, R. W., & Pennypacker, H. S. (1975, October). Errorless
differentiation of academic responses by college students. Teaching of Psychology,
2(3), 103-107.
Henson, R. K. (2000, November). A primer on coefficient alpha. Paper presented at the
annual meeting of the American Educational Research Association, San
Francisco, CA (ERIC Document Reproduction Service No. ED31448).
Heppner, P. P., Kivlighan, D. M., & Wampold, B. E. (1992). Major research designs.
In C. Verduin (Ed.), Research design in counseling (pp. 115-165). Pacific Grove,
CA: Brooks-Cole.
Hertzberg, P. A. (1991). Principles of statistics. New York: John Wiley & Sons.
Hildebrand, D. K. (1986). Statistical thinking for behavioral scientists. Boston, MA:
Duxbury.
Hilton, S. C., Schau, C., & Olsen, J. A. (2004, January). Survey of attitudes towards statistics: Factor structure invariance by gender and by administration time. Structural Equation Modeling, 11(1), 92-109.
Hinkle, D. E., Wiersma, W., & Jurs, S. G. (2003). Applied statistics for the behavioral
sciences (5th ed.). Boston, MA: Houghton Mifflin.
Hoaglin, D. C., Mosteller, F., & Tukey, J. W. (Eds.) (1983). Understanding robust and
exploratory data analysis. New York: Wiley.
Hogg, R. V. (1991, November). Statistical education: Improvements are badly needed.
The American Statistician, 45(4), 342-343.
Hopkins, K. D., Glass, G. V., & Hopkins, B. R. (1987). Basic statistics for the behavioral
sciences (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Hovermill, J. A. (2003). Technology supported inquiry learning in mathematics and
statistics with Fathom: A professional development project. (Doctoral
dissertation, The University of Colorado, January, 2004). Dissertation Abstracts
International, 64(7-A), 2416-2666.
Howell, D. C. (1997). Statistical methods for psychology. (4th ed.). Belmont, CA:
Duxbury.
Huck, S. W. (2004). Reading statistics and research (4th ed.). Boston, MA: Pearson
Education.
Hulsizer, M. R. (2004). Frequency distributions and graphs. Course notes: Introduction to
Measurement and statistics (PSYC 2750 02, chapter 2). Retrieved December 27,
2004 from http://www.wester.edu/~hulsizer/class/statsu01/freqdist.html
Hunt, N. (2003, Summer). Handling continuous data in Excel. Teaching Statistics, 25(2),
42-45.
Hyndman, R. J. (1995). The problem with Sturges’ rule for constructing histograms. A
short note (2 pages). Unpublished article.
Iman, R. L., & Conover, W. J. (1982). Modern business statistics. New York: John Wiley
& Sons.
Isaac, S., & Michael, W. B. (1997). Handbook in research and evaluation (4th ed.).
San Diego, CA: Educational and Industrial Testing Services.
Ishikawa, K. (1986). Guide to quality control. White Plains, NY: Unipub, Kraus
International.
Jaeger, R. M. (1990). Statistics: A spectator sport (2nd ed.). Newbury Park,
CA: Sage.
Jarrett, J., & Kraft, A. (1989). Statistical analysis for decision making. Boston, MA:
Allyn and Bacon.
Johnson, R. A., & Bhattacharya, G. K. (1985). Statistics: Principles and methods (2nd
ed.). New York: John Wiley & Sons.
Johnson, R. A., & Bhattacharya, G. K. (2001). Statistics: Principles and methods (4th
ed.). New York: John Wiley & Sons.
Johnson, D. G., & John, J. A. (2003, February). Use of demonstrations and experiments
in teaching business statistics. Journal of Applied Mathematics and Decision
Sciences, 7(2), 93-103.
Johnson, R., & Kuby, P. (2007). Elementary statistics (10th ed.). Belmont, CA:
Thompson Brooks/Cole.
Keeler, C. M. & Steinhorst, R. K. (1995, March). Using small groups to promote active
learning in the introductory statistics course: A report from the field. Journal of
Statistics Education [Online], 3(2).
(http://www.amstat.org/publications/jse/v3n2/keeler.html).
Keller, G. (2001). Applied statistics with Microsoft Excel. Pacific Grove, CA: Duxbury.
Keller, G., & Warrack, B. (2003). Statistics for management and economics (6th ed.).
Pacific Grove, CA: Brooks/Cole.
Kennedy, R. L., & McCallister, C. J. (2001, November). Attitudes toward advanced and
multivariate statistics when using computers. Paper presented at the annual
meeting of the Mid-South Educational Research Association, Little Rock, AR.
Kerr, A. W., Hall, H. K., & Kozub, S. A. (2002). Doing statistics with SPSS.
Thousand Oaks, CA: Sage.
Khazanie, R. (1986). Elementary statistics: In a world of applications (2nd ed.).
Glenview, IL: Scott, Foresman and Company.
Kiess, H. O. (1996). Statistical concepts for the behavioral sciences (2nd ed.).
Needham Heights, MA: Allyn & Bacon.
Knoke, D., & Bohrnstedt, G. W. (1991). Basic social statistics. New York: Peacock.
Kolstoe, R. H. (1969). Introduction to statistics for the behavioral sciences.
Homewood, IL: Dorsey.
Kooker, E. W., & Robb, G. P. (1982). Introduction to descriptive statistics. Springfield,
IL: Charles C. Thomas.
Korin, B. P. (1975). Statistical concepts for the social sciences. Cambridge, MA:
Winthrop.
Kottke, J. L. (2000, Summer). Mathematics proficiency, statistics knowledge, attitudes
toward statistics, and measurement course performance. College Student Journal,
34(3), 334-347.
L’Esperance, W. L. (1971). Modern statistics for business and economics. New York:
Macmillan.
Lapin, L. L. (1980). Statistics: Meaning and method (2nd ed.). New York: Harcourt
Brace Jovanovich.
Lapin, L. L. (1987). Statistics for modern business decisions (4th ed.). San Diego, CA:
Harcourt Brace Jovanovich.
Larson, R., & Farber, B. (2003). Elementary statistics: Picturing the world (2nd ed.).
Upper Saddle River, NJ: Prentice Hall.
Larson, R., & Farber, B. (2006). Elementary statistics: Picturing the world (3rd ed.).
Upper Saddle River, NJ: Prentice Hall.
Lave, J. (1988). Cognition in practice: Mind, mathematics, and culture in everyday life.
Cambridge, MA: Cambridge University.
Leamnson, R. (1999). Thinking about teaching and learning: Developing habits of
learning with first year college and university students. Sterling, VA: Stylus.
Levin, J., & Fox, J. A. (2004). The essentials: Elementary statistics in social research.
New York: Pearson Education.
Levin, Richard I., & Rubin, David S. (1983). Short course in business statistics.
Englewood Cliffs, N.J.: Prentice-Hall.
Levine, D. M. (2001). Applied statistics for engineers and scientists: Using Microsoft
Excel and MINITAB. Upper Saddle River, NJ: Prentice-Hall.
Litwin, M. S. (1995). How to measure survey reliability and validity. Thousands Oaks,
CA: Sage.
MacDonald, A. D. (1982, January). A stem-and-leaf plot: An approach to statistics. The
Mathematics Teacher 75(1), 25-28.
MacDonald-Ross, M. (1977, Summer). How numbers are shown: A review of research
on the presentation of quantitative data in texts. AV Communication Review,
25(3), 359-409.
Madsen, R. W., & Moeschberger, M. L. (1983). Introductory statistics for business and
economics. Englewood Cliffs, N.J.: Prentice-Hall.
Madsen, R. W., & Moeschberger, M. L. (1986). Statistical concepts with applications to
business and economics (2nd ed.). Englewood Cliffs, N.J.: Prentice-Hall.
Mansfield, E. (1980). Statistics for business and economics: Methods and
applications. New York: W. W. Norton & Company.
Mansfield, E. (1986). Basic statistics with applications. New York: W. W. Norton &
Company.
227
Mason, R. D., Lind, D. A. (1996). Statistical techniques in business
and economics (9th ed.). Boston, MA: Irwin-McGraw-Hill.
Mason, R. D., Lind, D. A., & Marchal, W. G. (1994). Statistics: An introduction (4th
ed.). Forth Worth, TX: Harcourt Brace.
Mason, R. D., & Lind, D. A., & Marchal, W. G. (1996). Statistical techniques in business
and economics (10th ed.). Boston, MA: Irwin-McGraw-Hill.
May, R. B., Masson, M. E. J., & Hunter, M. A. (1990). Application of statistics in
behavioral research. New York: Harper & Row.
Mays, D. A. (2005). Supplement to basic practice of statistics (9th ed.). Boston, MA:
McGraw Hill Custom.
McCall, R. (2001). Fundamental statistics for behavioral sciences (8th ed.).
Belmont, CA: Wadsworth/Thomson Learning.
McClave, J. T., Benson, P. G., & Sincich, T. (2005). Statistics for business and
economics (9th ed.). Upper Saddle River, N.J: Prentice-Hall.
McElroy, Elam E. (1979). Applied business statistics. (2nd ed.). San Francisco, CA:
Holden-Day.
McKeachie, W. J. (2002). Teaching tips: Strategies, research, and theory for college and
university teachers (11th ed.) Boston, MA: Houghton Mifflin.
McLaren, C. H. (2004, January). A comparison of student persistence and performance in
online and classroom business statistics experiences. Decision Sciences Journal of
Innovative Education, 2(1), 1-10.
McMillan, J. H. (1997). Classroom assessment: Principles and practice for effective
instruction. Needham Heights, MA: Allyn & Bacon.
228
McMillan, J. H., & Schumacher, S. (2001). Research in education (5th ed.). New
York: Addison Wesley Longman.
Meletiou, M. M. (2000). Developing students’ conceptions of variation: An untapped
well in statistical reasoning. (Doctoral dissertation, The University of Texas at
Austin, February, 2001). Dissertation Abstracts International, 61, (8-A), 3096-
3422.
Mendenhall, W. (1997). Beginning statistics: A to Z. (2nd ed.) Belmont, CA: Duxbury.
Mendenhall, W., Reinmuth, J. E., Beaver, R., & Duhan D. (1986). Statistics for
management and economics (5th ed.). Boston, CA: Duxbury.
Mendenhall, W., &, Sincich, T. (1996). A second course in statistics:
Regression analysis. Upper Saddle River, NJ: Prentice-Hall.
Metz, K. E, (1997). Dimensions in the assessment of students’ understanding and
application of chance. In I. Gal and J. B. Garfield (Eds.), The Assessment
Challenge in Statistics Education. Burke, VA: IOS.
Miller, J. B. (2000). The quest for the constructivist statistics classroom: Viewing
practice through constructivist theory. (Doctoral dissertation, The Ohio State
University, November, 2000). Dissertation Abstracts International, 61, (5-A),
1774-2081.
Mills, J. D. (2003, Summer). A theoretical framework for teaching statistics. Teaching
Statistics, 25(2), 56-59.
Mills, J. D. (2004, Summer). Students’ attitudes toward statistics: Implications for the
future. College Student Journal, 38(3), 349-361.
229
Milton, J. S., McTeer, P. M., & Corbet, J. J. (1997). Introduction to statistics. New York:
McGraw-Hill.
Minium, E. W. (1978). Statistical reasoning in psychology and education (2nd ed.). New
York: John Wiley & Sons.
Montcalm, D., & Royse, D. (2002). Data analysis for social workers. Boston, MA: Allyn
and Bacon.
Moore, D. S. (1997a, August). New pedagogy and new content: The case of statistics.
International Statistical Review, 65(2), 123-137.
Moore, D. S. (1997b). The active practice of statistics: A text for multimedia learning.
New York: W. H. Freeman and Company.
Moore, D. S. (2000). The basic practice of statistics (2nd ed.). New York: W. H.
Freeman and Company.
Moore, D. S. (2001). Statistics: Concepts and controversies (5th ed.). New York: W. H.
Freeman and Company.
Moore, D. S. & Notz, W. I. (2006). Statistics: Concepts and controversies (6th ed.). New
York: W. H. Freeman and Company.
Moore, R., & Goldiamond, I. (1964, Summer). Errorless establishment of visual
discriminations using fading procedures. Journal of the Experimental Analysis of
Behavior, 7(3), 269-272.
Morris, E. J., Joiner, R., & Scanlon, E. (2002, June). The contribution of computer-based
activities to understanding statistics. Journal of Computer Assisted Learning,
18(2), 114-124.
230
Morris, L. W., Kellaway, D. S., & Smith, D. H. (1978, August). Mathematics Anxiety
Rating Scale: Predicting anxiety experiences and academic performance in two
groups of students. Journal of Educational Psychology, 70 (4), 589-594.
Mosenthal, P. B., & Kirsch, I. S.. (1990a, Summer). Understanding graphs and charts,
Part I. Journal of Reading, 33(3), 371-373.
Mosenthal, P. B., & Kirsch, I. S.. (1990b, Fall). Understanding graphs and charts, Part II.
Journal of Reading, 33(4), 454-457.
Nelson, Lloyd S. (1988, Summer). Technical aid. Notes on the histogram: I. Equal class
intervals. Journal of Quality Technology, 20(3), 211-213.
Oathout, M. J. (1995, April). College students’ theory of learning introductory statistics:
Phase One. Paper presented at the annual meeting of the American Educational
Research Association, San Francisco, CA.
Onwuegbuzie, A. J. (1997, Spring). Writing a research proposal: The role of library
anxiety, statistics anxiety, and composition anxiety. Library and Information
Science, 19(1), 5-33.
Onwuegbuzie, A. J. (1998, Spring). Statistics anxiety: A function of learning style?
Research in the Schools, 5(1), 43-52.
Onwuegbuzie, A. J. (2000, Summer). Attitudes towards statistics assessments.
Assessment and Evaluation in Higher Education, 25(3), 321-339.
Onwuegbuzie, A. J. (2003, December). Modeling statistics achievement among graduate
students. Educational & Psychological Measurement, 63(6), 1020-1038. Sage.
231
Onwuegbuzie, A. J., Slate, J., Paterson, F., Watson, M., & Schwartz, R. (2000, Spring).
Factors associated with underachievement in educational research courses.
Research in the Schools, 7(1), 53-65.
Onwuegbuzie, A. J., Bailey, P., & Daley, C. E. (2002, Spring). The role of foreign
language anxiety and students’ expectations in foreign language learning.
Research in the Schools, 9(1), 33-50.
Onwuegbuzie, A. J., & Wilson, V. A. (2003, April). Statistics anxiety: nature,
antecedents, effects, and treatments – a comprehensive review of the literature.
Teaching in Higher Education, 8(2), 195-209.
Pelosi, M. K., & Sandifer, T. M. (2000). Doing statistics for business with Excel: Data,
inference and decision making. New York: John Wiley & Sons.
Perney, J., & Ravid, R. (1991). The relationship between attitudes towards statistics,
math self-concept, test anxiety and graduate students’ achievement in an
introductory statistics course. Paper presented at the annual meeting of the
American Educational Research Association, Boston, MA, April (ERIC
Document Reproduction Service No. ED318607).
Porter, A., Griffiths, D., &, Hedberg, J. (2003, Spring). From classroom to online
teaching: Experiences in improving statistics education. Journal of Applied
Mathematics and Decision Sciences, 7(2), 65-73.
Potthast, M. J. (1999, Fall). Outcomes of using small-group cooperative learning
experiences in introductory statistics courses. College Student Journal, 33(1), 34-
42.
232
Rajecki, D. W. (2002). Personal ad content analysis teaches statistical applications.
Teaching of Psychology, 29(2), 119-122.
Rangecroft, M. (2002, Summer). The language of statistics. Teaching Statistics, 24(2),
34-37.
Ravid, R. (1994). Practical statistics for educators. Lanham, MD: University Press of
America.
Raymondo, J. C. (1999). Statistical analysis in the behavioral sciences. Boston, MA:
McGraw-Hill College.
Rees, D. G. (1989). Essential statistics (2nd ed.). New York: Chapman and Hall.
Rhoads, T. R., Hubele, N. F. (2000, June). Student attitudes toward statistics before and
after a computer-integrated introductory statistics course. IEEE Transactions on
Education, 43(2), 182-187.
Rhodes, C. (2000). Using statistics in the social sciences: A computer integrated
approach. New York: General Hall.
Roberts, D. M., & Bilderback, E. W. (1980, Spring). Reliability and validity of a statistics
survey. Educational and Psychological Measurement, 40(1), 25-238.
Roberts, D. M., & Saxe, J. E. (1982, Fall). Validity of a statistics attitude survey: a
follow-up study. Educational and Psychological Measurement, 42(3), 907-912.
Roscoe, John T. (1975). Fundamental research statistics for the behavioral sciences (2nd
ed.). New York: Holt, Rinehart and Winston.
Runyon, R. P., & Haber, A. (1984). Fundamentals of behavioral statistics. Reading, MA:
Addison-Wesley.
233
Runyon, R. P., & Haber, A. (1990). Fundamentals of behavioral statistics (3rd ed.).
Reading, MA: Addison-Wesley.
Sanders, D. H., Murphy, A. F., & Eng, R. J. (1976). Statistics: A fresh approach. New
York, NY: McGraw-Hill Book.
Sanders, D. H. (1990). Statistics: A fresh approach (4th ed.). New York, NY: McGraw-
Hill.
Sanders, D. H., & Smidt, R. K. (2000). Statistics: A first course (6th ed.). Boston,
MA: McGraw-Hill.
Sandy, R. (1990). Statistics for business and economics. New York: McGraw-Hill.
Saxe, G. (1990). Culture and cognitive development: Studies in mathematical
understanding. Hillsdale, NJ: Erlbaum.
Schank, R. C. (2005). Lessons in learning, e-learning, and training: Perspectives and
guidance for the enlightened trainee. San Francisco, CA: Pfeiffer.
Schau, C., Stevens, J., Dauphinee, T. L., &, Del Vecchio, A. (1993). Evaluation of two
surveys measuring students’ attitudes toward statistics. Paper presented at the
annual meeting of the American Educational Association, Atlanta, GA.
Schau, C., Stevens, J., Dauphinee, T. L., &, Del Vecchio, A. (1995, October). The
development and validation of the survey of attitudes towards statistics.
Educational and Psychological Measurement, 55(5), 868-875.
Schloss, P. J., & Smith, M. A. (1999). Conducting research. Upper Saddle River:
Prentice-Hall.
234
Schoenfeld, A. H. (1992). Learning to think mathematically: Problem solving,
meta-cognition and sense making in mathematics. In D. A. Grouws (Ed.),
Handbook of Research on Mathematics Teaching and Learning (pp. 334-370).
NY: Macmillan.
Schmidt, M. J. (1975). Understanding and using statistics basic concepts. Lexington,
MA: D.C. Heath & Company.
Schultz, P. A., Drogosz, L. M., White, V. E., & Distefano, C. (1998, Fall). Prior
knowledge, attitude and strategy use in an introduction to statistics course.
Learning and Individual Differences, 10(4), 291-308.
Scott, D. W. (1979, September). On optimal data based histograms. Biometrika, 66(3),
605-610.
Shaughnessy, J. M. (1995). Discussion: Empirical research on technology and teaching
statistics. In J. Garfield and J. Truran (Eds.). Research Papers on Stochastic
Education (pp. 217-219).
Shavelson, R. J., Ruiz-Primo, M. A., &, Mitchell, M. (1996). Student guide for:
Statistical reasoning for the behavioral sciences. Needham Heights, MA: Allyn &
Bacon.
Shavelson, R. J., Ruiz-Primo, M. A., &, Mitchell, M. (2000). Student guide for:
Statistical reasoning for the behavioral sciences (2nd ed.). Needham Heights,
MA: Allyn & Bacon.
Shaw, D. G., Huffman, M. D., & Haviland, M. G. (1987, Summer). Grouping continuous
data in discrete intervals: Information loss and recovery. Journal of Educational
Measurement 24(2), 167-173.
235
Shvyrkov, Vladislav, & Stout, David E. (1980, December). On Sturges’ optimal class
interval. The American Statistical Association: Proceedings of the Business &
Economic Section, 5(5), 471-472.
Simon, J. L., & Bruce, P. (1991, Spring). Resampling: A tool for everyday statistical work, Chance, 4(1), 22-32.
Sincich, T., Levine, D. M., & Stephan, D. (1999). Practical statistics by example using
Microsoft Excel. Upper Saddle River, NJ: Prentice Hall.
Smith, G. (1998, June). Learning statistics by doing statistics. Journal of Statistics
Education [Online], 6(3).
(http://www.amstat.org/publications/jse/v6n3/smith.html).
Sokal, R. R., & Rohlf, F. J. (1995). Biometry (3rd ed.). New York: W. H. Freeman and
Company.
Spatz, C. (2005). Basic statistics: Tales of distributions (8th ed.). Belmont, CA:
Wadsworth/Thomson Learning.
Speer, W. R. (1979, November). “Do you see what I hear?” A look at individual learning
styles. Arithmetic Teacher, 27(3), 22-26.
Spence, J. T., Cotton, J. W., Underwood, B., J., & Duncan, C., P. (1990). Elementary
statistics (5th ed.). Englewood Cliffs, NJ: Prentice-Hall.
Spiegel, M. R. (1961). Theory and problems of statistics. Schaum’s Outline Series. New
York: Schaum.
Spiegel, M. R., & Stephens, L. J. (1998). Theory and problems of statistics (3rd ed.).
Schaum’s Outline Series. New York: McGraw-Hill.
Sprenkle, D. H., & Moon, S. M. (Eds.) (1996). Research Methods in Family Therapy.
New York : Guilford.
236
Sprinthall, R. C. (2000). Basic statistical analysis (6th ed.). Needham Heights, MA:
Allyn and Bacon.
Stevens, J. (1996). Applied multivariate statistics for the social science (3rd ed.).
Mahwah, NJ: Lawrence Erlbaum Associates.
Stevens, J. (2002). Applied multivariate statistics for the social science (4th ed.).
Mahwah, NJ: Lawrence Erlbaum Associates.
Stockburger, D. W. (2001). Introductory statistics: Concepts, models, and applications.
Available online at: http://www.psychstat.smsu.edu/introbooks/sbk09.htm.
Sturges, H. (1926, March). The choice of a class interval. Journal of the American
Statistical Association 21(153), 65-66.
Sullivan, M. M. (1993, November). Students learn statistics when they assume a
statistician’s role. Paper presented at the annual conference of the American
Mathematical Association of Two-Year Colleges, Boston, MA.
Sullivan, M. M. (2005). Fundamentals of statistics. Upper Saddle River, NJ: Pearson
Prentice Hall.
Sullivan, M. M. (2007). Statistics: Informed decisions using data. (2nd ed.) Upper Saddle
River, NJ: Pearson Prentice Hall.
Summers, G. W., & Peters, W. S. (1973). Basic statistics in business and economics.
Belmont, CA: Wadsworth.
Summers, G. W., Peters, W. S., & Armstrong, C. P. (1977). Basic statistics in business
and economics (2nd ed.). Belmont, CA: Wadsworth.
237
Sutarso, T. (1992a). Students’ attitudes toward statistics. Paper presented at the
Annual Meeting of the Mid-South Educational Research Association, Knoxville,
TN (ERIC Document Reproduction Service No. ED 353316).
Sutarso, T. (1992b). Some variables in relation to students’ anxiety in learning
statistics. Paper presented at the Annual Meeting of the Mid-South Educational
Research Association, Knoxville, TN. (ERIC Document Reproduction Service
No. ED353334).
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston,
MA: Allyn and Bacon.
Taplin, R. H. (2003, June). Teaching statistical consulting before methodology.
Australian New Zealand Journal of Statistics, 45(2), 141-152.
Terrell, G. R., & Scott, D. W. (1985, March). Over-smoothed nonparametric density
estimates. Journal of the American Statistical Association, 80(389), 209-214.
Thompson, B. (2002, Winter). “Statistical”, “practical”, and “clinical”: How many kinds
of significance do counselors need to consider? Journal of Counseling and
Development, 80(1), 64-71.
Thompson, W. B. (1994, February). Making data analysis realistic: Incorporating
research into statistics courses. Teaching of Psychology, 21(1), 41-43.
Thompson, B., & Vacha-Haase, T. (2000, April). Psychometrics is datametrics: the test is
not reliable. Educational and Psychological Measurement, 60(2), 174-195.
Thorndike, R. M., & Dinnel, D. L. (2001). Basic statistics for the behavioral sciences.
Upper Saddle River, NJ: Prentice-Hall.
238
Triola, M. F. (2001). Essential of statistics. Boston, MA: Addison Wesley
Longman.
Triola, M. F. (2002). Elementary statistics. (8th ed.). Boston, MA: Addison
Wesley Longman.
Vaidyanathan, R., & Vaidyanathan, G. (1987). College business statistics with Canadian
applications. Toronto, Ontario, Canada: Allyn and Bacon.
Velleman, P. F. (1976, Summer). Interactive computing for exploratory data analysis I:
Display algorithms. 1975 Proceedings of the Statistical Computing Section, 142-
147. Washington, DC: American Statistical Association.
Wand, M. P. (1997, March). Data-based choice of histogram bin-width. The American
Statistician, 51(1), 59-64.
Ware, M. E., & Chastain, J. D. (1989). Person variables contributing to success in
introductory statistics. (ERIC Document Reproduction Service No. ED309 927).
Waters, L. K., Martelli, T. A., Zakrajsek, T., & Popovich, P. M. (1988, Summer).
Attitudes toward statistics: An evaluation of multiple measures. Educational and
Psychological Measurement, 48(2), 513-516.
Webb, N. M. (1992). Testing a theoretical model of student interaction and learning in
small groups. In R. Hertz-Lazarowitz & N. Miller (Eds.), Interaction in
cooperative groups: The theoretical anatomy of group learning (pp. 102-119).
New York: Cambridge University.
Webster, A. (1992). Applied statistics for business and economics. Homewood, IL:
Richard D. Irwin.
239
Weiers, R. M. (2008). Introduction to business statistics (6th ed.). Mason, OH: Thomson
South-Western.
Weiss, Neil A. (2004). Elementary statistics (7th ed.). Boston, MA: Addison Wesley.
Wessel, R. H., & Willett, E. R. (1960). Statistics as applied to economics and business.
New York: Henry Holt and Company.
Wiersma, W. (2002). Research methods in education: An introduction (7th ed.).
Needham Heights, MA: Allyn & Bacon.
William, E., Ludlow, B., Walls, R. T. (1978, October). Learning, transfer, and retention
of errorless fading versus trial-and-error teaching. Psychological Reports,
43(2), 553-554.
Willis, T., & Seung, K. H. (2006). Using Computers in Educational and Psychological
Research: Using Information Technologies to Support the Research Process.
Springfield, IL: Charles C. Thomas.
Wise, S. L. (1985, Summer). The development and validation of a scale measuring
attitudes towards statistics. Educational and Psychological Measurement, 45(2),
401- 405.
Wisenbaker, J. M., Scott, J. S., and Nasser, F. (2000). Structural equation models relating
attitudes about and achievement in introductory statistics courses: A comparison
of results from the U.S. and Israel. Ninth International Congress on Mathematics
Education, Tokyo, Japan.
Witte, R. S., & Witte, J. S. (1997). Statistics (5th ed.). Forth Worth, TX: Harcourt Brace.
Wonnacott, T. H., & Wonnacott, R. J. (1984). Introductory statistics for business and
economics (3rd ed.). New York: John Wiley & Sons.
240
Wooten, C. T. (1998, May). Factors influencing student learning in introductory
accounting classes: A comparison of traditional and non-traditional students.
Issues Accounting Education, 13(2), 357-373.
Wyatt, W. W., & Bridges, C. M. (1967). Statistics for the behavioral sciences.
Englewood, IN: D.C. Heath and Company.
Zanakis, S. H., & Valenzi, E. R. (1997, September/October). Student anxiety and
attitudes in business statistics. Journal of Education for Business 73, 10-16.
Zeidner, M. (1991, November). Statistics and mathematics anxiety in social science
students: Some interesting parallels. British Journal of Educational Psychology,
61(3), 319-328.
Zelditch, M. (1959). A basic course in sociological statistics. New York: Henry Holt
and Company.
Zeller, R. A., & Carmines, E. G. (1978). Statistical analysis of social data. Chicago, IL:
Rand McNally.
Zhu, X., & Simon, H. A. (1987, Summer). Learning mathematics from examples and by
doing. Cognition and Instruction, 4(3), 137-166.
241
Appendix A
INFORMED CONSENT (Protocol # 05E080)
Dear Ohio University Student,
Purpose of the research
Greetings! Thank you for agreeing to participate in my research on the construction of grouped-data frequency tables using a new algorithm. As a component of my doctoral dissertation at Ohio University under the supervision of Dr. Gordon P. Brooks, I have designed a technique called the “iteration algorithm” because it proceeds by successive iterations and uses five formulas together with six rules. This structured approach is intended to reduce the biases observed with existing methods and, if applied correctly, should produce exactly one grouped-data frequency table for a given set of raw data: the algorithm works in a systematic, step-by-step way that leaves no room for guessing.
Procedures to be followed
Based on a set of raw data (called EXERCISE I in Appendix C), you will be shown how to construct a grouped-data frequency table using a traditional method (the kind taught in textbooks). At the end of the solution for EXERCISE I, a brief survey (Appendix F) follows to get your impressions of the method presented to you. After that, a second set of raw data (called EXERCISE II in Appendix G) is submitted to you. You are cordially invited to construct a grouped-data frequency table for this second data set by following the steps presented in the two methods. After completing the questionnaire (Appendix F), please return it directly to your instructor.
Duration of subject’s participation
The whole process will take you less than half an hour.
Confidentiality
Strict confidentiality will be maintained at all times in the study. Identification numbers will be used to protect the identity of the subjects.
Compensation
The free executive pens are small tokens of my appreciation for your taking your valuable time to complete this survey.
In advance, thank you very much for your participation in my research, and best of luck for the remainder of your academic year. Hippolyte O. LOHAKA
Appendix B
OHIO UNIVERSITY CONSENT FORM TEMPLATE
Title of Research: Making grouped-data frequency tables on your own with continuous data.
Principal Investigator: Hippolyte O. LOHAKA
Department: Educational Studies
Federal and university regulations require us to obtain signed consent for participation in research involving human subjects. After reading the statements below, please indicate your consent by signing this form.
Explanation of Study: Please refer to the above Informed Consent Note.
Risks and Discomforts: There are no known risks or discomforts for participating in this survey.
Benefits: The method proposed yields, overall, better results than those generated by SPSS. Also, it could very well be used for teaching purposes.
Contact Person: Hippolyte O. LOHAKA, 740 – 593 – 4526, [email protected]
If you have any questions regarding your rights as a research participant, please contact Jo Ellen Sherow, Director of Research Compliance, Ohio University, 740 – 593 – 0664.
I certify that I have read and understand this consent form and agree to participate as a subject in the research described. I agree that known risks to me have been explained to my satisfaction and I understand that no compensation is available from Ohio University and its employees for any injury resulting from my participation in this research. I certify that I am 18 years of age or older. My participation in this research is given voluntarily. I understand that I may discontinue participation at any time without penalty or loss of any benefits to which I may otherwise be entitled. I certify that I have been given a copy of this consent form to take with me.
Signature: Date:
Appendix C
EXERCISE I (with Integers)
The data shown in Table 31 represent the final grades (in percent) obtained by 65 students in a section of elementary statistics. The grades are recorded to the nearest whole number.
Table 31
Elementary statistics final grades of 65 students during the spring of 2002
44 46 49 49 50 51 54 54 55 55 56 57 59 59 60 60 61 62 62 64 64 65 65 66 66 67 68 69 69 70 70 71 71 71 72 72 73 73 74 74 75 75 76 76 77 77 78 79 79 80 80 81 82 84 84 85 85 86 87 89 90 90 91 94 97
Task: Construct a grouped-data frequency table.
SOLUTION
Appendix D
Construction of the Grouped-data Frequency Table by a Traditional Method
Rule 1: Computation of the inclusive range, R, of the scores.
R = (Xhighest – Xlowest), where X is the observed score, Xhighest is the maximum observed score, and Xlowest is the minimum observed score.
Here, R = 97 - 44 = 53.
Rule 2: Determination of a possible value for k, the number of class intervals. k ≈ 1 + 3.322* log (n), where log (n) denotes the decimal logarithm of the sample size, n.
⇒ k ≈ 1 + 3.322 * log (65) = 7.02. The answer could be 7 (if it is decided to round down), 8 (if rounded up), or even 10 (the next higher multiple of 5).
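Rules 1 and 2 are easy to automate. The following Python sketch (illustrative only; the function names are not from the dissertation) computes the inclusive range and Sturges' estimate of k:

```python
import math

def inclusive_range(scores):
    """Rule 1: inclusive range R = X_highest - X_lowest."""
    return max(scores) - min(scores)

def sturges_k(n):
    """Rule 2: Sturges' estimate k ~= 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

# For the 65 grades, with minimum 44 and maximum 97:
R = 97 - 44            # 53
k = sturges_k(65)      # approximately 7.02
```

As the text notes, the fractional estimate must still be rounded to a whole number of classes, and the rounding direction is left to the analyst.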
Rule 3: Determination of the class interval width, w, for each class interval.
The formula w = R/k is applied, which leads successively to three possibilities.
First option with k = 7: R/k = 53/7 = 7.57, which could be rounded to either w = 7 or
w = 8.
Second option with k = 8: R/k = 53/8 = 6.63, which reduces to either w = 6 or w = 7.
Third option with k = 10: R/k = 53/10 = 5.3, which could be rounded to either w = 5
or w = 6.
Hence, the class width w may take any of the four distinct values listed above (w = 5, w = 6, w = 7, and w = 8), and only one of them can be chosen. Which one should be selected? Choosing an odd number for the class interval width is recommended, so that the level of accuracy of the midpoints corresponds exactly to that of the observed data.
A problem remains because there are two odd candidates: w = 5 and w = 7. Some students might go for w = 5, while others will choose w = 7. Either possibility seems plausible, yet only one of the two can be retained. For the sake of illustration, both odd integers (i.e., 5 and 7) are examined.
The value w = 5 arises only from the third option (with k = 10); neither the first nor the second option produces it. The value w = 7, on the other hand, arises from both the first option (with k = 7) and the second option (with k = 8).
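The enumeration of candidate widths can be sketched in Python (an illustration; the variable names are assumptions):

```python
import math

R = 53
candidate_widths = {}
for k in (7, 8, 10):               # the three candidate class counts
    ratio = R / k
    # each ratio may be rounded down or up, giving two candidate widths
    candidate_widths[k] = (math.floor(ratio), math.ceil(ratio))

# candidate_widths -> {7: (7, 8), 8: (6, 7), 10: (5, 6)}
odd_widths = sorted({w for pair in candidate_widths.values()
                     for w in pair if w % 2 == 1})
# odd_widths -> [5, 7], the two plausible odd choices discussed above
```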
Rule 4: Determination of the lower real class limit (or lower class boundary) of the lowest class interval, which is the starting point. No specific formula is provided for doing this. As a result, an appropriate value must be guessed; one option is to select the minimum observed score. In the present case, it is 44.
Rule 5: Prepare a list of the limits of each class interval, beginning with the starting point.
In constructing the grouped frequency table, list the lower class boundary of the first class interval, which is the starting point obtained by applying Rule 4. Add the class interval width (w) to the starting point to obtain the first upper class boundary. Write down the upper class boundary. Then, list all the class boundaries by adding the class width to the boundaries of the immediately preceding class interval.
After that, determine the class midpoint of each interval by averaging the class boundaries: add the corresponding upper class boundary to its lower class boundary, and divide the resulting sum by two. A further task is to tally the frequencies for each class. Finally, sum the frequency column and check the total against the number of observations.
First Option: R = 53, k = 7, w = 7 and starting point is 44
First class: Lower class limit: 44 and
Upper class limit: 44 + w - 1 = 44 + 6 = 50
Second class: Lower class limit: 51 and
Upper class limit: 51 + w - 1 = 51 + 6 = 57
Third class: Lower class limit: 58 and
Upper class limit: 58 + w - 1 = 58 + 6 = 64
Fourth class: Lower limit: 65 and
Upper limit: 65 + w - 1 = 65 + 6 = 71
Fifth class: Lower limit: 72 and
Upper limit: 72 + w - 1 = 72 + 6 = 78
Sixth class: Lower limit: 79 and
Upper limit: 79 + w – 1 = 79 + 6 = 85
Seventh class: Lower limit: 86 and
Upper limit: 86 + w - 1= 86 + 6 = 92.
Verdict: This is a BAD selection, as the last (here the seventh) class interval of (86 to 92) fails to capture the maximum observed score, which is 97. Therefore, this first choice of class intervals is discarded.
Second Option: R = 53, k = 8, w = 7 and starting point is 44
First class: Lower class limit: 44 and
Upper class limit: 44 + w - 1 = 44 + 6 = 50
Second class: Lower class limit: 51 and
Upper class limit: 51 + w - 1 = 51 + 6 = 57
Third class: Lower class limit: 58 and
Upper class limit: 58 + w - 1 = 58 + 6 = 64
Fourth class: Lower limit: 65 and
Upper limit: 65 + w - 1 = 65 + 6 = 71
Fifth class: Lower limit: 72 and
Upper limit: 72 + w - 1 = 72 + 6 = 78
Sixth class: Lower limit: 79 and
Upper limit: 79 + w – 1 = 79 + 6 = 85
Seventh class: Lower limit: 86 and
Upper limit: 86 + w - 1= 86 + 6 = 92
Eighth class: Lower limit: 93 and
Upper limit: 93 + w -1 = 93 + 6 = 99.
Verdict: This is a GOOD choice, since the last (here the eighth) class interval of (93 to
99) contains the maximum observed score, which is 97.
The grouped-data frequency distribution obtained is presented in Table 32 below.
Table 32
Grouped-data frequency table for the final statistics grades of 65 students (Traditional Method)

Class ID   Class limits   Tally (stem | leaves)            Frequency   Class midpoint
i                                                          ni          xi
1          44, 50         4 | 4 6 9 9                       5          47
                          5 | 0
2          51, 57         5 | 1 4 4 5 5 6 7                 7          54
3          58, 64         5 | 9 9                           9          61
                          6 | 0 0 1 2 2 4 4
4          65, 71         6 | 5 5 6 6 7 8 9 9              13          68
                          7 | 0 0 1 1 1
5          72, 78         7 | 2 2 3 3 4 4 5 5 6 6 7 7 8    13          75
6          79, 85         7 | 9 9                          10          82
                          8 | 0 0 1 2 4 4 5 5
7          86, 92         8 | 6 7 9                         6          89
                          9 | 0 0 1
8          93, 99         9 | 4 7                           2          96
Total                                                  n = 65
Third Option: R = 53, k = 10, w = 5 and starting point is 44
First class: Lower limit: 44 and
Upper limit: 44 + w - 1 = 44 + 4 = 48
Second class: Lower limit: 49 and
Upper limit: 49 + w -1 = 49 + 4 = 53
Third class: Lower limit: 54 and
Upper limit: 54 + w – 1 = 54 + 4 = 58
Fourth class: Lower limit: 59 and
Upper limit: 59 + w - 1 = 59 + 4 = 63
Fifth class: Lower limit: 64 and
Upper limit: 64 + w - 1 = 64 + 4 = 68
Sixth class: Lower limit: 69 and
Upper limit: 69 + w - 1 = 69 + 4 = 73
Seventh class: Lower limit: 74 and
Upper limit: 74 + w - 1 = 74 + 4 = 78
Eighth class: Lower limit: 79 and
Upper limit: 79 + w - 1 = 79 + 4 = 83
Ninth class: Lower limit: 84 and
Upper limit: 84 + w - 1 = 84 + 4 = 88
Tenth class: Lower limit: 89 and Upper limit: 89 + w - 1 = 89 + 4 = 93
Verdict: This is a BAD choice, since the last (here the tenth) class interval of (89 to 93) does NOT contain the maximum observed score, which is 97. Consequently, this choice of class intervals is rejected.
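The three options worked out above can be verified mechanically. The following Python sketch (illustrative, not the dissertation's own code) builds the integer class limits for a given starting point, width, and number of classes, then applies the verdict test:

```python
def build_class_limits(start, width, k):
    """List the (lower, upper) integer class limits per Rule 5."""
    limits = []
    lower = start
    for _ in range(k):
        upper = lower + width - 1   # upper limit = lower + w - 1
        limits.append((lower, upper))
        lower = upper + 1
    return limits

def covers_maximum(limits, x_max):
    """Verdict test: the last interval must contain the maximum score."""
    return limits[-1][0] <= x_max <= limits[-1][1]

x_max = 97
first  = build_class_limits(44, 7, 7)    # last class (86, 92) -> BAD
second = build_class_limits(44, 7, 8)    # last class (93, 99) -> GOOD
third  = build_class_limits(44, 5, 10)   # last class (89, 93) -> BAD
```

Only the second option passes the test, in agreement with the verdicts above.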
Appendix E
Construction of the Grouped-data Frequency Table by the Iteration Algorithm
The rules (step-by-step instructions) for constructing the grouped-data frequency table by the iteration algorithm are shown below.
Rule 1: Computation of the inclusive range, Rt, of the grouped scores.
Rt = (Xhighest – Xlowest + dp ∗ t), where X is the observed score, dp represents the data precision (or the level of accuracy): 10^0, 10^-1, 10^-2, 10^-3, etc., for 0, 1, 2, 3, … digits after the decimal point, and t is the number of necessary iterations or tries to reach the final inclusive range of the grouped scores.
Here, Rt = (Xhighest − Xlowest + dp ∗ t)
= 97 − 44 + 10^0 ∗ t = 53 + 1 ∗ t
= 53 + t.
Rule 2: Setting the maximum value of k, the potential number of classes. kmax = 2*√(n) for n ≤ 100.
⇒ kmax ≈ 2 ∗ √(65) = 2 ∗ 8.0623 = 16.1245 ≈ 17 (rounded up). The frequency distribution may have up to 17 classes of grouped scores.
Rule 3: Determination of the minimum value of k, the number of possible class intervals. kmin = √(2 * n) = √(2 * 65) = √(130) = 11.402 ≈ 11 (rounded down).
⇒ kmin ≈ 11. The minimum number of possible class intervals is then 11.
(Note that the number of groups, k, is an integer that must lie between these two values, that is, kmin = 11 ≤ k ≤ kmax = 17).
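Rules 1 through 3 can be sketched in a few lines of Python for the example data (n = 65, scores 44 to 97, whole-number precision). The names below are mine; the rounding directions (up for kmax, down for kmin) follow the text:

```python
import math

# Illustrative sketch of Rules 1-3 for the example data (n = 65, scores 44-97).
n, x_min, x_max, dp = 65, 44, 97, 1   # dp = 10**0 for whole-number data

def inclusive_range(t):
    """Rule 1: R_t = X_highest - X_lowest + dp * t (here 53 + t)."""
    return x_max - x_min + dp * t

k_max = math.ceil(2 * math.sqrt(n))    # Rule 2: 2 * sqrt(n), rounded up
k_min = math.floor(math.sqrt(2 * n))   # Rule 3: sqrt(2n), rounded down
print(inclusive_range(1), k_min, k_max)   # 54 11 17
```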
Rule 4: Determination of the interval width that is equal for all classes.
Rt/k = w if and only if the ratio Rt over k yields an odd whole number. The problem will be solved in successive iterations.
First iteration (t = 1): R1/k = [97 – 44 + 1 * 1]/k = (53 + 1)/k, or 54/k, where R1 is the initial inclusive range of the grouped scores (the range at iteration 1) and k is any integer between 11 and 17, inclusive. The divisors of the inclusive range R1 = 54 are 1, 2, 3, 6, 9, 18, 27, and 54. Unfortunately, none of them belongs to the given range of values from 11 up to 17, limits included.
Second iteration (t = 2): R2/k = [97 – 44 + 1 * 2]/k = (53 + 2)/k, or 55/k. The number 55 is divisible by 5 and 11, but only 11 falls in the interval of values considered. Thus, w = R2/k = 55/11 = 5. In other words, there will be k = 11 class intervals with an equal class interval width, w, of 5; the final inclusive range R (R = R2) for the grouped-data frequency table will be 55; and the number of necessary tries or iterations, t, is 2 (an even value).
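The two iterations above are an instance of a simple search: grow the inclusive range by one precision unit per try until some k between kmin and kmax divides it into an odd whole-number width. A hedged sketch (function and variable names are mine, not the dissertation's):

```python
# Sketch of the Rule 4 search loop for whole-number data.
# Returns (t, k, w) for the first solution found, or None if none is
# found within max_tries (a safety cap I added for illustration).

def find_width(x_min, x_max, k_min, k_max, max_tries=100):
    for t in range(1, max_tries + 1):
        r = x_max - x_min + t              # inclusive range at iteration t
        for k in range(k_min, k_max + 1):
            if r % k == 0 and (r // k) % 2 == 1:   # odd whole-number width
                return t, k, r // k
    return None

print(find_width(44, 97, 11, 17))   # (2, 11, 5): t = 2, k = 11, w = 5
```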
Rule 5: Determination of the lower real class limit (or class boundary) of the lowest class interval boundary, which is the starting point. It is obtained by applying ONE of the two following formulas:
(Xlowest – t*0.5), if t is an odd number of tries or iterations; otherwise
(Xlowest – t*0.5 + 0.5), when t is an even number of tries or iterations; where Xlowest is the smallest score observed in the data set and t denotes the number of tries or iterations necessary to reach the smallest odd whole ratio, w, which represents the class interval length or class interval width.
Now substituting the lowest score and the value 2 (an even integer), the number t of iterations, into the second of the two formulas shown above yields:
Xlowest - t*0.5 + 0.5 = 44 - 2*0.5 + 0.5, or 43.5. Therefore, the lowest class boundary becomes 43.5.
Rule 6: Prepare a list of the limits of each class interval, beginning with the starting point.
In constructing the grouped frequency table, list the lower class boundary of the first class interval, which is the starting point obtained by applying Rule 5. Add the class interval width (w) to the starting point to obtain the first upper class boundary. Write down the upper class boundary. Then, list all the class boundaries by adding the class width to the boundaries of the immediately preceding class interval.
After that, determine the class marks of each interval by averaging the class boundaries. A further task is to tally the frequencies for each class. Finally, sum the frequency column and check it against the total number of observations.
First class interval: Lower class boundary = 43.5.
Upper class boundary = 43.5 + 1 ∗ w = 43.5 + 5 = 48.5.
Second class interval: Lower class boundary = 48.5.
Upper class boundary = 48.5 + w = 48.5 + 5 = 53.5.
Third class interval: Lower class boundary = 53.5.
Upper class boundary = 53.5 + w = 53.5 + 5 = 58.5.
And last (eleventh) class interval: Lower class boundary =
43.5 + 10 ∗ w = 43.5 + 10 ∗ 5 = 43.5 + 50, or 93.5.
Upper class boundary =
43.5 + 11 ∗ w = 43.5 + 11 ∗ 5 = 43.5 + 55, or 98.5.
Check: Final Range, R = Last upper class boundary - First lower class boundary
= 98.5 – 43.5 = 55 (same result as before).
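The boundary listing of Rule 6 for this example (starting point 43.5, k = 11, w = 5) can be regenerated and re-checked with a short sketch; the variable names are illustrative:

```python
# Regenerate the 11 class boundaries from the starting point and width,
# then re-check the final range, as in the "Check" step above.

w, k, start = 5, 11, 43.5
boundaries = [(start + i * w, start + (i + 1) * w) for i in range(k)]

print(boundaries[0])                           # (43.5, 48.5)
print(boundaries[-1])                          # (93.5, 98.5)
print(boundaries[-1][1] - boundaries[0][0])    # final range R = 55.0
```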
The Iteration algorithm grouped-data frequency distribution is presented in Table 33 below.
Table 33
Grouped-data Frequency Table for the Weight of 65 Male Students (The Iteration Algorithm)
Class ID   Class boundary   Tally                                    Frequency   Cumulative frequency   Class midpoint
   i                                                                    ni                                    xi
   1       43.5, 48.5       4 | 4 6                                      2               2                    46
   2       48.5, 53.5       4 | 9 9   5 | 0 1                            4               6                    51
   3       53.5, 58.5       5 | 4 4 5 5 6 7                              6              12                    56
   4       58.5, 63.5       5 | 9 9   6 | 0 0 1 2 2                      7              19                    61
   5       63.5, 68.5       6 | 4 4 5 5 6 6 7 8                          8              27                    66
   6       68.5, 73.5       6 | 9 9   7 | 0 0 1 1 1 2 2 3 3             11              38                    71
   7       73.5, 78.5       7 | 4 4 5 5 6 6 7 7                          8              46                    76
   8       78.5, 83.5       7 | 9 9 9   8 | 0 0 1 2                      7              53                    81
   9       83.5, 88.5       8 | 4 4 5 5 6 7                              6              59                    86
  10       88.5, 93.5       8 | 9   9 | 0 0 1                            4              63                    91
  11       93.5, 98.5       9 | 4 7                                      2              65                    96
Total                                                                n = 65
Appendix F
SURVEY OF ATTITUDES TOWARD THE METHOD USED (SAMU)
Dear Ohio University Students,
Thank you again for completing these statements about your attitude toward the traditional method. Your responses will be kept confidential. The feedback you provide will help us improve our new instrument.
SAMU QUESTIONNAIRE
DIRECTIONS: The questions below are designed to identify your attitudes about the method presented for creating frequency tables for grouped data. The item scale has 7 possible responses, ranging from 1 (STRONGLY DISAGREE) through 4 (NEITHER DISAGREE NOR AGREE) to 7 (STRONGLY AGREE). Please read each question and carefully circle the one response appearing below it that most clearly represents your agreement with that statement. Use the entire 7-point scale to indicate your degree of agreement or disagreement with our items. Try not to think too deeply about each response, and move quickly to the next item.

Each of items 1 through 20 is answered on the same 7-point scale:
1 = Strongly disagree, 2 = Moderately disagree, 3 = Slightly disagree, 4 = Neither disagree nor agree, 5 = Slightly agree, 6 = Moderately agree, 7 = Strongly agree.

1. I like constructing grouped-data frequency tables by the method presented. (P)
2. I feel insecure when creating grouped-data frequency tables by the method presented. (N)
3. I have trouble understanding the method presented because of how I think. (N)
4. The algorithm’s formulas are easy to use. (P)
5. The procedure as shown is worthless. (N)
6. The algorithm of the instructional method taught is complicated. (N)
7. The method exhibited should be a required part of my statistics course. (P)
8. I have no idea of what’s going on in the algorithm of the method presented. (N)
9. Even with the advent of computers, learning the algorithm as shown is still useful. (P)
10. I am under stress using the method presented in class. (N)
11. I enjoy creating frequency tables for grouped data by the algorithm shown. (P)
12. I can learn the algorithm presented. (P)
13. Learning the algorithm as shown requires a great deal of discipline. (N)
14. I make a lot of computational errors when applying the algorithm taught in class. (N)
15. I am scared by the algorithm brought to light in class. (N)
16. The algorithm exposed can be quickly learned by most people. (P)
17. The algorithm on hand will have no application in my job. (N)
18. I understand how to apply the formulas contained in the algorithm proposed. (P)
19. The algorithm offered will be relevant in my life. (P)
20. The algorithm of the method exhibited is highly technical. (N)
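The (P)/(N) tags mark positively and negatively worded items. One common scoring convention, assumed here rather than stated by the survey itself, is to reverse-score the negative items so that higher totals always indicate a more positive attitude:

```python
# Hedged sketch: reverse-scoring negatively worded SAMU items on the
# 7-point scale (8 - response). This convention is an assumption for
# illustration, not a rule taken from the dissertation.

NEGATIVE_ITEMS = {2, 3, 5, 6, 8, 10, 13, 14, 15, 17, 20}   # the (N) items above

def score_item(item_no, response):
    """Reverse-score a 1-7 response for negatively worded items."""
    return 8 - response if item_no in NEGATIVE_ITEMS else response

print(score_item(1, 6), score_item(2, 6))   # 6 2
```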
B. BACKGROUND INFORMATION

NOTICE that the labels for the scale on each of the following items differ from those used above.

21. How good at mathematics are you?
1 = Poor, 2 = Well below average, 3 = Worse than average, 4 = Average, 5 = Better than average, 6 = Well above average, 7 = Excellent

22. What is your level of computer literacy?
1 = Not at all literate, 2 = Inadequately literate, 3 = Intolerably literate, 4 = Neutral, 5 = Tolerably literate, 6 = Adequately literate, 7 = Completely literate

23. How much experience with statistics have you had (e.g., courses, research studies)?
1 = None, 2 = Very negligible, 3 = Negligible, 4 = Fair deal, 5 = Good deal, 6 = Very good deal, 7 = Great deal

24. How many years of high school mathematics did you take?
1 = One year, 2 = Two years, 3 = Three years, 4 = Four years, 5 = Five years, 6 = Six years, 7 = Seven years

25. How many college mathematics and/or statistics courses have you taken so far?
1 = One course, 2 = Two courses, 3 = Three courses, 4 = Four courses, 5 = Five courses, 6 = Six courses, 7 = Seven courses

26. What grade do you expect to receive in this course?
1 = F, 2 = D, 3 = C–, 4 = C, 5 = C+, 6 = B–, 7 = B, 8 = B+, 9 = A–, 10 = A
Appendix G
EXERCISE II (with Decimals)
Data in Table 34 are measurements of the thickness in millimeters of mass-produced metal blocks (Ishikawa, 1986, p. 7). To save space, the sample of 100 observations is listed as only 26 distinct values. These values have been ordered, with their frequencies of occurrence shown in parentheses.
Table 34
Thickness in Millimeters of 100 Mass-produced Metal Blocks with Frequencies of
Occurrences in Parentheses
3.30 (1) 3.31 (1) 3.32 (1) 3.34 (2) 3.37 (1)
3.38 (2) 3.40 (2) 3.41 (4) 3.42 (1) 3.43 (2)
3.44 (5) 3.45 (5) 3.46 (15) 3.47 (6) 3.48 (13)
3.49 (4) 3.50 (9) 3.51 (1) 3.52 (10) 3.54 (4)
3.55 (1) 3.56 (5) 3.59 (2) 3.60 (1) 3.63 (1)
3.68 (1)
Construct a grouped-data frequency table in your spare time. Again, thank you very much.
Appendix H
Construction of the Grouped-data Frequency Table by the Iteration Algorithm
Rule 1: Computation of an inclusive range, Rt, of the scores.
Rt = [(Xhighest – Xlowest) + u ∗ t], where X is the observed score, u is the data precision unit (here 10^-2 = 0.01, since the scores carry two decimal places), and t is the number of iterations or tries necessary to obtain the final inclusive range, R.
Here, Rt = 3.68 − 3.30 + 10^-2 ∗ t = 0.38 + 0.01t.
Rule 2: Determination of the minimum value of k, the number of possible classes. kmin = √(2 ∗ 100) = √200 = 14.142 ≈ 14 (rounded down).
⇒ kmin ≈ 14. The minimum number of possible class intervals is then 14.
Rule 3: Setting the maximum value of k.
kmax = 2*√(n) for n ≤ 100.
⇒ kmax ≈ 2 ∗ √(100) = 2 ∗ 10 = 20. Hence, the maximum number of possible class intervals is 20.
(Note that kmin = 14 ≤ k ≤ kmax = 20).
Rule 4: Determination of the interval width for each class.
Rt/k = w if and only if the ratio Rt over k yields an odd whole number. The problem will be solved in successive iterations.
First iteration (t = 1): R1/k = [(3.68 – 3.30) + 0.01]/k = (0.38 + 0.01)/k, or 0.39/k, where R1 is the initial inclusive range (the inclusive range at iteration 1) and k is any integer between 14 and 20, inclusive. Since no k in the range of 14 to 20 divides 0.39 into an acceptable odd width, a second iteration is needed by increasing the non-inclusive range (that is, the largest score minus the smallest score) by 0.01, and then the arithmetic is redone.
Second iteration (t = 2): R2/k = [(3.68 – 3.30) + 0.02]/k = (0.38 + 0.02)/k, or 0.40/k, where R2 is the second inclusive range (the inclusive range at iteration 2) and k is any integer between 14 and 20, inclusive. No k in this range divides 0.40 into an acceptable width (in fact, 0.40 is divisible by 16, but the resulting ratio of 0.025 has three decimals instead of just two), so a third iteration is required by increasing the non-inclusive range (that is, the maximum score minus the minimum score) by 0.01 again, and then the arithmetic is redone.
Third iteration (t = 3): R3/k = [(3.68 – 3.30) + 0.03]/k = (0.38 + 0.03)/k, or 0.41/k, where R3 is the third inclusive range (the inclusive range at iteration 3) and k is any integer between 14 and 20, inclusive. Since 41 is a prime number, 0.41/k cannot yield a suitable width for any k in this range, so a fourth iteration is necessitated by increasing the non-inclusive range (that is, the largest score minus the smallest score) by 0.01 again, and the arithmetic is done one more time.
Fourth iteration (t = 4): R4/k = (0.38 + 0.04)/k = 0.42/k. The new inclusive range (at iteration 4), R4, of 0.42 is divisible by 14, which lies between 14 and 20 inclusively, and w = R4/k = 0.42/14 = 0.03. In other words, there will be k = 14 class intervals with an equal class interval width, w, of 0.03 (an odd value in the last decimal place); the final inclusive range R (R = R4) for the grouped-data frequency table will be 0.42; and the number of necessary tries or iterations, t, is 4 (an even value).
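The four iterations above can be reproduced with a decimal-aware variant of the earlier search, sketched here in integer "precision units" (hundredths) so that floating-point error cannot spoil the divisibility tests; all names are mine:

```python
# Decimal variant of the Rule 4 search, working in integer units of 0.01.
# Returns (t, k, width-in-units); multiply the width by the unit to
# recover w (e.g., 3 units -> w = 0.03). Safety cap added for illustration.

def find_width_units(x_min, x_max, k_min, k_max, unit=0.01, max_tries=100):
    lo, hi = round(x_min / unit), round(x_max / unit)   # scores in units
    for t in range(1, max_tries + 1):
        r = hi - lo + t                    # inclusive range, in units
        for k in range(k_min, k_max + 1):
            if r % k == 0 and (r // k) % 2 == 1:   # odd whole-unit width
                return t, k, r // k
    return None

print(find_width_units(3.30, 3.68, 14, 20))   # (4, 14, 3), i.e. w = 0.03
```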
Rule 5: Determination of the lower real class limit (or class boundary) of the lowest class interval, which is the starting point. It is obtained by applying the following formula:
(Xlowest - t * 0.005 - 0.005), where t denotes an even number of tries or iterations necessary to reach the smallest odd ratio, w, which represents the class interval length.
Substituting the lowest score and the value 4, the number t of iterations, into the above formula yields:
Xlowest - t * 0.005 - 0.005 = 3.30 - 4 * 0.005 - 0.005, or 3.275. Therefore, the lowest class boundary becomes 3.275.
Rule 6: Prepare a list of the limits of each class interval, beginning with the starting point.
In constructing the grouped frequency table, list the lower class boundary of the first class interval, the lowest class boundary obtained by applying Rule 5. Add the class width (w) to this starting point to obtain the first upper class boundary. Write down the upper class boundary. Then, list all the class boundaries by adding the class width to the boundaries of the previous class interval.
After that, determine the class marks of each interval by averaging the class boundaries. Further task is to tally the frequencies for each class. Finally, sum the frequency column and check against the total number of observations.
First class interval: Lower class boundary = 3.275.
Upper class boundary = 3.275 + 1 ∗ w = 3.275 + 0.03 = 3.305.
Second class interval: Lower class boundary = 3.305.
Upper class boundary = 3.305 + w = 3.305 + 0.03 = 3.335.
Third class interval: Lower class boundary = 3.335.
Upper class boundary = 3.335 + w = 3.335 + 0.03 = 3.365.
And last (fourteenth) class interval: Lower class boundary =
3.275 + 13 ∗ w = 3.275 + 13 ∗ 0.03 = 3.275 + 0.39, or 3.665.
Upper class boundary =
3.275 + 14 ∗ w = 3.275 + 14 ∗ 0.03 = 3.275 + 0.42, or 3.695.
Check: Final Range, R = Last upper class boundary - First lower class boundary = 3.695 – 3.275 = 0.42 (same result as before).
Table 35 presents the grouped-data frequency distribution obtained.
Table 35
Grouped-data frequency and cumulative frequency table for the thickness in millimeters of 100 mass-produced metal blocks (using the Iteration algorithm)
 i    True class limits     ni    ↑ni     xi     ni·xi      ni·xi²
 1    3.275 – 3.305          1      1   3.29      3.29     10.8241
 2    3.305 – 3.335          2      3   3.32      6.64     22.0448
 3    3.335 – 3.365          2      5   3.35      6.70     22.4450
 4    3.365 – 3.395          3      8   3.38     10.14     34.2732
 5    3.395 – 3.425          7     15   3.41     23.87     81.3967
 6    3.425 – 3.455         12     27   3.44     41.28    142.0032
 7    3.455 – 3.485         34     61   3.47    117.98    409.3906
 8    3.485 – 3.515         14     75   3.50     49.00    171.5000
 9    3.515 – 3.545         14     89   3.53     49.42    174.4526
10    3.545 – 3.575          6     95   3.56     21.36     76.0416
11    3.575 – 3.605          3     98   3.59     10.77     38.6643
12    3.605 – 3.635          1     99   3.62      3.62     13.1044
13    3.635 – 3.665          0     99   3.65      0.00      0.0000
14    3.665 – 3.695          1    100   3.68      3.68     13.5424
Total                    n = 100               347.75    1209.683
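As a cross-check, the summary statistics computed in the steps that follow can be re-derived directly from the Table 35 class midpoints and frequencies; a minimal sketch (variable names are mine):

```python
import math

# Re-computing the grouped-data mean, variance, and standard deviation
# from the Table 35 midpoints and frequencies.
mids  = [3.29, 3.32, 3.35, 3.38, 3.41, 3.44, 3.47,
         3.50, 3.53, 3.56, 3.59, 3.62, 3.65, 3.68]
freqs = [1, 2, 2, 3, 7, 12, 34, 14, 14, 6, 3, 1, 0, 1]

n = sum(freqs)                                   # 100
mean = sum(f * x for f, x in zip(freqs, mids)) / n
var  = (sum(f * x * x for f, x in zip(freqs, mids)) - n * mean * mean) / (n - 1)
sd   = math.sqrt(var)
print(round(mean, 4), round(var, 3), round(sd, 3))   # 3.4775 0.004 0.062
```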
a) Computation of the sample mean:
Mean(X) = (1/n)*∑ni·xi = (1/100)*(347.75) = 3.4775.
Hence, the sample mean, Mean(X), is 3.478.
b) Computation of the sample variance:
s²X = [∑ni·xi² – n*Mean(X)²]/(n – 1)
= [1209.683 – 100 * 3.4775 * 3.4775]/99 = 0.004.
Therefore, s²X = 0.004.
c) Computation of the sample standard deviation:
sX = √0.004. Hence, the sample standard deviation, sX, is 0.062.
d) Computation of the median
Me = LMedian + w*[(n/2) – cumfreqMedian]/freqMedian.
= 3.455 + 0.03 * (50 – 27)/34 = 3.475.
Hence, the median, Me, is 3.475.
(New symbols used: LMedian is the lower boundary corresponding to the median
class, n is the total number of scores or subjects, cumfreqMedian denotes the
cumulative frequency corresponding to the lower limit, and freqMedian represents the
number of cases in the class interval containing the median).
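The median formula just applied can be wrapped as a small helper; the function name and arguments are mine, not the dissertation's notation:

```python
# Sketch of the grouped-data median formula used above:
# Me = L + w * ((n/2) - cum_below) / f_median.

def grouped_median(lower, width, n, cum_below, f_median):
    return lower + width * (n / 2 - cum_below) / f_median

# Median class 3.455-3.485: cumulative frequency below = 27, class frequency = 34.
me = grouped_median(3.455, 0.03, 100, 27, 34)
print(round(me, 3))   # 3.475
```

The same helper, with the percentile rank in place of n/2, reproduces the Q1 and Q3 values quoted below.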
[Similarly, the 25th percentile, or Q1, is obtained as 3.425 + 0.03 * (25 - 15)/12, or 3.450. The 75th percentile, or Q3, is calculated as 3.485 + 0.03 * (75 - 61)/14, or 3.515].
e) Computation of the mode
Mo = LMode + w*[Δ1/(Δ1 + Δ2)] = 3.455 + 0.03 * (34 - 12)/[(34 - 12) + (34 - 14)] = 3.470. Hence, the mode, Mo, is 3.47.
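The mode computation follows the same pattern; a sketch with illustrative names, where Δ1 and Δ2 are the frequency differences between the modal class and its neighbors:

```python
# Sketch of the grouped-data mode: Mo = L + w * d1 / (d1 + d2),
# with d1 = f_modal - f_previous and d2 = f_modal - f_following.

def grouped_mode(lower, width, f_modal, f_prev, f_next):
    d1, d2 = f_modal - f_prev, f_modal - f_next
    return lower + width * d1 / (d1 + d2)

# Modal class 3.455-3.485 (f = 34), neighboring frequencies 12 and 14.
mo = grouped_mode(3.455, 0.03, 34, 12, 14)
print(round(mo, 2))   # 3.47
```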
Appendix I
Construction of the Grouped-data Frequency Table by the SPSS Procedure
[Histogram omitted: SPSS histogram of SAMPLEQ3, with bars at class midpoints 3.30 through 3.70; Std. Dev = .06, Mean = 3.48, N = 100.00.]
Figure 5. SAMPLEQ3.
Table 36
Grouped-data Frequency and Cumulative Frequency Table for the Thickness of Metal
Blocks Random Sample SAMPLEQ3 (Using SPSS Algorithm)
 i    True class limits     ni    ↑ni     xi     ni·xi      ni·xi²
 1    3.275 – 3.325          3      3   3.30      9.90     32.6700
 2    3.325 – 3.375          3      6   3.35     10.05     33.6675
 3    3.375 – 3.425          9     15   3.40     30.60    104.0400
 4    3.425 – 3.475         33     48   3.45    113.85    392.7825
 5    3.475 – 3.525         37     85   3.50    129.50    453.2500
 6    3.525 – 3.575         10     95   3.55     35.50    126.0250
 7    3.575 – 3.625          3     98   3.60     10.80     38.8800
 8    3.625 – 3.675          1     99   3.65      3.65     13.3225
 9    3.675 – 3.725          1    100   3.70      3.70     13.6900
Total                    n = 100               347.55    1208.328
a) Computation of the sample mean:
Mean(X) = (1/n)*∑ni·xi = (1/100)*(347.55) = 3.4755.
Hence, the sample mean, Mean(X), is 3.476.
b) Computation of the sample variance:
s²X = [∑ni·xi² – n*Mean(X)²]/(n – 1)
= [1208.328 – 100*3.4755*3.4755]/99 = 0.00422.
Therefore, s²X = 0.00422.
c) Computation of the sample standard deviation:
sX = √0.00422. Hence, the sample standard deviation, sX, is 0.065.
d) Computation of the median
Me = LMedian + w*[(n/2) – cumfreqMedian]/freqMedian.
= 3.475 + 0.05 * (50 – 48)/37 = 3.4777.
Hence, the median, Me, is 3.478.
(New symbols used: LMedian is the lower boundary corresponding to the median class, n is the total number of scores or subjects, cumfreqMedian denotes the cumulative frequency corresponding to the lower limit, and freqMedian represents the number of cases in the class interval containing the median).
[Similarly, the 25th percentile, or Q1, is obtained as 3.425 + 0.05 * (25 – 15)/33, or 3.440. The 75th percentile, or Q3, is calculated as 3.475 + 0.05 * (75 – 48)/37, or 3.511].
e) Computation of the mode
Mo = LMode + w*[Δ1/(Δ1 + Δ2)]
= 3.475 + 0.05 * (37 – 33)/[(37 – 33) + (37 – 10)] = 3.48145.
Hence, the mode, Mo, is 3.481.
(New symbols used: LMode is the lower boundary of the class interval containing the highest frequency or the modal class, Δ1 is the difference between the frequency of the modal class and the frequency of the previous class, whereas Δ2 is the difference between the frequency of the modal class and the frequency of the following class).
Table 37
Comparison of the Statistical Measures from the Raw Data to the Grouped-data
Measures (Iteration Algorithm) and the Grouped-data Measures (SPSS Algorithm)
(SAMPLEQ3)
Statistical Measure                 Raw Data   Grouped-data (Iteration)   Grouped-data (SPSS)
Mean                                  3.476          3.478                     3.476*
Median                                3.480          3.475                     3.478*
Mode                                  3.460          3.470                     3.481
Standard deviation                    0.063          0.062                     0.065
Range                                 0.38           0.42                      0.45
Percentile 25 (Q1)                    3.450          3.450                     3.440
Percentile 75 (Q3)                    3.518          3.515                     3.511
Semi-interquartile range              0.034          0.033                     0.0355
Skewness                             -0.109          0.145                    -0.092*
Coefficient of variation (C.V.)       0.0181         0.0178                    0.0187
Note: Except for the sample mean, the median, and the sample skewness, each shown with an asterisk (∗) in Table 37, all of the Iteration algorithm results are much closer to the raw data’s results than those obtained with the SPSS procedure.
Appendix J
A Copy of the E-Mail Message sent to the SPSS Corporate Headquarters
From: Hippolyte Lohaka [mailto:[email protected]]
Sent: Sunday, September 29, 2002 8:57 PM
To: SPSS Corporate Headquarters
Subject: HUMBLE REQUEST FOR ASSISTANCE

Corporate Headquarters
SPSS Inc.
233 S. Wacker Drive, 11th Floor
Chicago, IL 60606-6307
Tel: +1.312.651.3000
Fax: +1.312.651.3668
Dear Sirs,
Permit me to humbly introduce myself to you. My name is Hippolyte OSANGO LOHAKA, and I am currently a fourth-year Ph.D. student in Educational Statistics at Ohio University in Athens.
Please be informed that as part of my doctoral dissertation work, I have designed a method of constructing grouped-data frequency tables by hand, which leaves NO room for guessing. Simple pairwise comparisons (using paired-samples t tests) have shown so far that the results yielded by my method are much closer to those obtained directly from raw data than those generated by the MINITAB software, for instance. As you perfectly know, grouping data always entails some loss of information.
What I wish to establish now is that my method leads to a MINIMUM loss of information, that is, my method "RETAINS" MORE information, overall, than most leading statistical packages such as SPSS, SAS, and MINITAB.
The purpose of this electronic message is therefore to most humbly request you to kindly provide me with your LATEST "ALGORITHMS" for constructing grouped-data frequency distributions or, alternatively, HISTOGRAMS so that I can include them in my dissertation and better defend my points.
I greatly appreciate how busy you must be at this particular time of the year, but I pray that you do not let me down. You are certainly the best people to give me a helpful hand with this crucial problem.
Thank you so much in anticipation of your inestimable assistance in these two crucial matters.
Very respectfully yours,
HIPPOLYTE O. LOHAKA Ohio University Ph.D. Student Educational Statistics Area Program
P.S. To give you a better idea of my work, a copy of the survey I administered to Ohio University students may be sent to you upon request. Thank you so much.
H. O. L.
Appendix K
List of Abbreviations
ANCOVA: Analysis of Covariance
ECG: Expected Course Grade
IQR: Interquartile Range
MANCOVA: Multivariate Analysis of Covariance
MATH: Mathematics
PMA: Perceived Mathematics Ability
PSY: Psychology
SAMU: Survey of Attitudes towards the Method Used
SATS: Survey of Attitudes towards Statistics
SPSS: Statistical Package for the Social Sciences
SRD: Square Root of Deviations
STD: Sample Standard Deviation
Appendix L
Sample Set of Scatter Diagrams of Data for Pairs of Dependent Variables
Scatter Plot Difficulty-Value (Method: 1.00, Department: 1.00)
[Scatter diagram omitted; the Difficulty axis runs from 20.00 to 34.00.]
Figure 6. Scatter diagram for Value versus Difficulty (Iteration – Mathematics).
Scatter Plot Cognitive-Difficulty (Method: 1.00, Department: 1.00)
[Scatter diagram omitted; the Cognitive axis runs from 27.50 to 32.50.]
Figure 7. Scatter diagram for Difficulty versus Cognitive (Iteration – Mathematics).
Scatter Plot Difficulty-Affect (Method: 1.00, Department: 2.00)
[Scatter diagram omitted; the Difficulty axis runs from 15.00 to 35.00.]
Figure 8. Scatter diagram for Affect versus Difficulty (Iteration – Psychology).
Scatter Plot Cognitive-Difficulty (Method: 1.00, Department: 2.00)
[Scatter diagram omitted; the Cognitive axis runs from 25.00 to 32.50.]
Figure 9. Scatter diagram for Difficulty versus Cognitive (Iteration – Psychology).
Scatter Plot Affect-Cognitive (Method: 2.00, Department: 1.00)
[Scatter diagram omitted; the Affect axis runs from 24.00 to 30.00.]
Figure 10. Scatter diagram for Cognitive versus Affect (Traditional – Mathematics).
Scatter Plot Cognitive-Value (Method: 2.00, Department: 1.00)
[Scatter diagram omitted; the Cognitive axis runs from 25.00 to 32.50.]
Figure 11. Scatter diagram for Value versus Cognitive (Traditional – Mathematics).
Scatter Plot Difficulty-Value (Method: 2.00, Department: 2.00)
[Scatter diagram omitted; the Difficulty axis runs from 18.00 to 30.00.]
Figure 12. Scatter diagram for Difficulty versus Value (Traditional – Psychology).
Scatter Plot Affect-Value (Method: 2.00, Department: 2.00)
[Scatter diagram omitted; the Affect axis runs from 18.00 to 32.00.]
Figure 13. Scatter diagram for Value versus Affect (Traditional – Psychology).
Appendix M
Additional Outputs of the Exploratory Factor Analysis for the SAMU Data
Table 38
Varimax Rotated Factors. Principal Component Extraction (Iteration Algorithm)
Rotated Component Matrix (a, b)

            Component
Item          1       2       3       4
Item01      .979    .062    .033    .121
Item02      .418    .104   -.124    .771
Item03     -.029    .120    .306    .538
Item04      .073    .979    .076    .116
Item05      .263    .013   -.205    .790
Item06      .979    .062    .033    .121
Item07      .089    .065   -.055    .129
Item08      .083   -.041   -.453    .015
Item09      .017    .086    .296   -.060
Item10      .230    .003    .889    .208
Item11     -.017   -.150    .072    .445
Item12     -.055   -.254    .362   -.026
Item13      .073    .979    .076    .116
Item14      .234    .044    .879    .245
Item15      .979    .062    .033    .121
Item16      .073    .979    .076    .116
Item17     -.001   -.230    .100    .133
Item18      .076    .064   -.196   -.002
Item19      .134    .010    .189   -.149
Item20     -.012    .074    .064    .538

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 5 iterations.
b. Method = 1.00
Table 39
Varimax Rotated Factors. Principal Component Extraction (Traditional Method)
Rotated Component Matrix (a, b)

            Component
Item          1       2       3       4
Item01     -.064   -.040    .912   -.094
Item02      .980   -.034    .068   -.053
Item03      .265   -.029    .509    .240
Item04      .023    .984    .010    .009
Item05     -.128   -.021    .126    .351
Item06     -.133    .024    .891    .026
Item07     -.027   -.123   -.052    .339
Item08      .278    .042   -.035   -.006
Item09      .118    .065   -.083    .398
Item10      .980   -.034    .068   -.053
Item11     -.044   -.145   -.108    .381
Item12      .195    .243    .330   -.061
Item13     -.042    .962   -.023   -.022
Item14      .980   -.034    .068   -.053
Item15     -.006   -.043    .748   -.287
Item16      .023    .984    .010    .009
Item17      .161   -.001   -.048    .632
Item18      .007    .180    .014    .536
Item19      .123    .087   -.047   -.199
Item20     -.181    .183   -.052    .515

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 5 iterations.
b. Method = 2.00