MAKING A GROUPED-DATA FREQUENCY TABLE: DEVELOPMENT AND
EXAMINATION OF THE ITERATION ALGORITHM
A dissertation presented to
the faculty of
the College of Education of Ohio University
In partial fulfillment
of the requirements for the degree
Doctor of Philosophy
Hippolyte O. Lohaka
November 2007
This dissertation titled
Making a Grouped-data Frequency Table: Development and Examination of the Iteration
Algorithm
by
HIPPOLYTE O. LOHAKA
has been approved for
the Department of Educational Studies
and the College of Education by
Gordon P. Brooks
Associate Professor of Educational Studies
Renée A. Middleton
Dean, College of Education
ABSTRACT
LOHAKA, HIPPOLYTE O., Ph.D., November 2007, Educational Research and
Evaluation
Making a Grouped-data Frequency Table: Development and Examination of the Iteration
Algorithm (283 pp.)
Director of Dissertation: Gordon P. Brooks
This study focuses on the development and examination of a new method to construct frequency tables for grouped data. This method is called the iteration algorithm in that it proceeds by successive iterations to determine the four key elements that are essential in building a grouped-data frequency distribution. The algorithm also uses five formulas and
stops running as soon as the first solution is attained (for teaching purposes only). Two
major interests emerged. The first interest was to evaluate how accurate the iteration
algorithm is as a process. The second and main focus of this study was to assess the
effectiveness of the iteration algorithm as an instructional method. The findings of the
Monte Carlo simulations to address the first main interest showed that the results yielded
by the iteration algorithm are comparable to those produced by a well-known statistical
package. To address the second and main focus of this study, a multivariate analysis of covariance (MANCOVA) was conducted; its results indicated that the students expressed, on average, more positive attitudes towards the iteration algorithm than towards a traditional method when learning how to construct their own grouped-data frequency tables.
Approved:
Gordon P. Brooks
Associate Professor of Educational Studies
This dissertation is dedicated to the memories of my late father César Lohaka and my late daughter Sibyl Walo Lohaka.
ACKNOWLEDGMENTS
Throughout the entire process of my doctoral program, I have relied on the support of so many tremendous people. First of all, and most significantly, I would like to offer my heartfelt gratitude to the Almighty God who created me and gave me all the intellectual and physical abilities, with which I have been able to achieve whatever life and educational goals I set for myself.
Second, I wish to express my utmost gratitude and appreciation to my academic advisor and dissertation committee chair Dr. Gordon Brooks for his unremitting backing, feedback, insight, and guidance through my unending years of uncertainty and ambiguity.
I could not have completed this without his encouragement, challenge, and mentoring. I am also grateful to my other dissertation committee members: Dr. Teresa Franklin, Dr.
George Johanson, and Dr. Robert Klein for their help, particularly with the sampling plan and data collection procedures. The contributions they made to this research study are immeasurable.
In addition, my many thanks go to the Mathematics and Psychology department professors who kindly agreed to administer my questionnaires to their elementary statistics students. Without their cooperation, it would not have been possible to complete the last two chapters of this project as they are.
A special thank you goes to Dr. Timothy Kayworth, my current Department Chair, for providing continual support and reducing my teaching load during my first year.
Last but not least, special thanks are due to Helene Obonga Lohaka, my wife and life partner. Her emotional support, encouragement, and caring love were constant sources of strength and inspiration during the difficult times of this study.
TABLE OF CONTENTS
Abstract…………………………………………………………………………………....3
Dedication………………………………………………………………………………....4
Acknowledgements………………………………………………………………………..5
List of Tables.……………………………………………………………………………13
List of Figures……………………………………………………………………………16
Chapter 1: Introduction…………….....….……………..………………………………..17
Forms of Tabulation Arrangement…………...…..………….…………..……….18
Presenting Data in an Ordered Array.….….………....…..……………….20
Example of an Ordered Array……………..…..………………….22
Presenting Ordered Array Data in a Stem-and-Leaf Plot…………………24
Example of a Stem-and-Leaf Plot..…..……...…………………....25
Presenting Data in a Simple Frequency Table..…………..………………26
Example of a Simple Frequency Table..……..…..………………27
Presenting Data in a Grouped-data Frequency Table.…………..……..…32
Example of a Grouped-data Frequency Table……..……………...33
Presenting Data in a Histogram…………………………..………………37
Example of a Histogram…………………………..…………..….39
Rationale for the Study…………………………..………..………………………41
Statement of the Problem…..……………………………………..……………….44
Research Questions.…….………….……….…………………....…….………….47
Monte Carlo Simulations…………………….……….………..…………48
Survey of Attitudes towards the Method Used……….…………..………49
Research Hypotheses………………….……………………………….…………49
Monte Carlo Simulations……………………….……..….………………49
Survey of Attitudes towards the Method Used………..….………………50
Significance of the Study………………..….……………………….……………51
Difference between the Iteration and Traditional Methods…..….….……51
Benefits expected to be gained over the Traditional Methods.……..……52
Summary of Major Differences and Benefits……………………….……56
Limitations and Delimitations of the Study…………………………………...….58
Definitions of Terms…………………………………………………………..….60
Attitude……………………....…………………………………….……..60
Class Boundary……………....…………………………………….……..60
Class Frequency...…………….…………………………………….…….60
Class Interval…...………………………………………….……………..61
Class Limits....………………..…………………………………………..61
Class Midpoint…………….….…………………………….…………….61
Class Width…………………………………………………...…………..62
Coefficient of Variation .……..…………………………………………..62
Frequency Distribution………..…………...……………………………..63
Grouped Data………………….....………...……………………………..63
Grouped-Data Frequency Distribution..…...……………………………..63
Iteration Algorithm….…………..………………………………………..64
Range…….……………………..…………………………….…………..64
Raw Data..……..……………….……………….………………………..64
Simple Frequency Table…..…….………………………………………..65
Traditional Method…………..…..………………………………………..65
Organization of the Study……………………….………………………………..65
Chapter 2: Review of the Literature………………….…………………………..……....68
Introduction……………….…………….………………………….…………….68
Review of the Relevant Literature….…….…..…………………….…………….69
Vague Guidelines……………………….………………………………..70
Presenting Data in a Frequency Table…..……………………………..…71
Presenting Data in a Simple Frequency Table..………………….72
Presenting Data in a Frequency Table for Grouped Data..….……73
Advantages…………………………….…..………….….73
Disadvantages………………………………….…………75
Summary of Advantages and Disadvantages…………..…76
Presenting Data in a Graph…………..………………………………..….77
Histogram…………………………………………………………78
Traditional Methods………………………………………………………80
Inclusive Range…….…………………………………………….80
Number of Class Intervals………………………………………..81
Class Interval Size………………………………………………..93
Starting Point……………………………………………………..96
Tallies…………………………………………………………….98
Summary Statistics ..………………..…………………………..………..98
From the Teaching and Learning Statistics’ Perspectives…………....…..99
Definitions of Teaching, Learning, and Education………………99
Various Teaching Methods Documented…………….………....100
Teaching and Learning Statistics……………………………….103
Factors Affecting Attitudes towards Statistics.…………………103
Unique Solution Feature………….…………………..…………104
Structural and Organizational Issues…..………………………..105
Reasoning about Graphical Representation of Data….…………106
Attitudes towards Statistics………..……………..………….…..107
Summary and Conclusion…………..……………..………….…108
Chapter 3: Iteration Algorithm Development and Examination….………...……….…111
Development of a Traditional Method………..………..…………………….…111
Determining the Range…..…………………….…….………….111
Choosing a Number of Class Intervals………..……..………….111
Computing a Class Width…….……………….…….………….112
Picking up a Starting Point……….………………….………….113
Table Preparation and Construction……………………….……114
Development of the Iteration Algorithm….………..……..……………………119
Determination of the Inclusive Range of Grouped Data….…….119
Selecting a Number of Class Intervals………..…….……..…….120
Fixing on the Class Interval Size....………….…….……………122
Setting up the Starting Point………………………..……………124
Table Preparation and Construction…………………………..…124
Iteration Algorithm versus Traditional Method…….……………………..…….127
Monte Carlo Simulations…………………………………………..…………….128
Random Generation of Normal Data…………………………..……..…129
Main Outcomes………………………………………………..……..…130
Conclusion…………….……………..………………………………….135
Chapter 4: Methodology..………….….….…..………………..……………………….136
Research Design……………………….………………………………………..136
Operational Definitions of the Variables……………….…………..…………...137
Survey of Attitudes of the Method Used…………….………………….137
Measures to Represent the Student Perceived Ability in Statistics…..…138
Identification of Population……….….…………………….…………………...138
Sampling Plan……………………..…………………………………..………...138
Instrumentation……………………………………………………………..……139
Selection/Development of Instrument………………………….……….139
Pilot Studies…………….…………………………………………….…………144
Reliability Issues……………..…………………………………………144
Cronbach’s Alpha as an Index of Reliability…..…………………144
Pilot Studies Results………………………………………………145
Data Collection Procedures……………………………………………..147
Data Analysis Procedures……………………….…………….………..……….148
Statistical Methods…………………………………….………………..148
Power and Effect Size…………………………………………………..150
Chapter 5: Data Analysis…………………………..…………..……………………….154
Results………………………………….………………………………………..154
Data Analysis……….………………….……….…………..…………...155
Survey of Attitudes towards Method Used (SAMU)...……….…155
MANCOVA Results………………………….………………….162
Assumptions when Using MANCOVA.………………………....172
Analyzing and Interpreting MANCOVA Results……………..…178
Assumptions when Using ANCOVA.………………………...... 182
Univariate ANCOVA Follow-up Analyses……………………....185
Conclusion………….………………………..……………………………..……190
Chapter 6: Discussion, Conclusions, Recommendations…………………..…………...191
Summary of the Results and Findings…………………………………………..191
Discussion of the Results and Findings ….…………….…………..…………...194
Explanations of the Results………………..…..…………….……….…200
Implications of the Study…..……………………………..….………………….201
Implications for Students and Teachers..………………….…………....201
Implications for Researchers and Practitioners…………………...…….202
Summary of the Implications……………………………….…………...203
Limitations………………………………………………………………………204
Conclusion………….……….…….….…………………….…………………...205
Recommendations for Future Research…………………….…………………...209
References…....……………...………………………………………………………….211
Appendix A: Informed Consent form (Protocol #05E080)...... 241
Appendix B: Ohio University Consent Form Template………….…………….………242
Appendix C: Exercise I (with Integers)..……………………………………………….243
Appendix D: Construction of the Grouped-data Frequency Table (Traditional)………244
Appendix E: Construction of the Grouped-data Frequency Table (Iteration)….………250
Appendix F: Survey of Attitudes towards the Method Used (SAMU)…………………255
Appendix G: Exercise II (with Decimals)....…………………………………………...260
Appendix H: Construction of the Grouped-data Frequency Table (Iteration)….………261
Appendix I: Construction of the Grouped-data Frequency Table (SPSS)….….……….267
Appendix J: A Copy of the E-Mail Sent to the SPSS Corporate Headquarters………...271
Appendix K: List of Abbreviations………………………………………….………….273
Appendix L: Sample Set of Scatter Plots for Pairs of Dependent Variables..………….274
Appendix M: Additional Output of the Exploratory Factor Analysis…...... ……….282
LIST OF TABLES
Table 1. Grade of 80 Students in the Final Statistics Exam...…………………....………21
Table 2. Ordered Grades of 80 Students in the Final Statistics Exam..……………….…23
Table 3. Simple Frequency Distribution of the Grades of 80 Students..…………..……29
Table 4. Grouped-data Frequency of the Grades of 80 Students….………………….….35
Table 5. Differences and Benefits of Iteration Algorithm over Traditional Methods..….57
Table 6. Appropriate Number of Classes in Frequency Distributions as per Keller.....…86
Table 7. Value of k for Different n Numbers according to Each Formula………………90
Table 8. Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Traditional Method with First Option)….…………………………..….116
Table 9. Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Traditional Method with Third Option)……….…………………….….118
Table 10. General Rule for Determining the Number of Classes in a Histogram by
Sincich et al…….………………………………………………………..122
Table 11. Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Iteration Algorithm)……………………………………………………..126
Table 12. Overall Summary Table of Square Roots of Deviations..……………...... 131
Table 13. Summary Table by Sample Size of Square Roots of Deviations…….…...…133
Table 14. Reliability Statistics between Iteration and Traditional Algorithms………...145
Table 15. Inter-item Correlation Matrix for the Iteration Algorithm.…………………..146
Table 16. Inter-item Correlation Matrix for the Traditional Algorithm.…………...... 147
Table 17. Reliability Statistics for Iteration Algorithm and Traditional Method………157
Table 18. Partial outputs of the Initial Exploratory Factor Analysis Summary
(Principal Component Analysis)………………………………………..158
Table 19. Means and Standard Deviations of the SAMU……….….…..………...…….164
Table 20. Inter-correlations of the SAMU (Iteration – Mathematics)…...……...……...166
Table 21. Inter-correlations of the SAMU (Iteration – Psychology)……...... ………...168
Table 22. Inter-correlations of the SAMU (Tradition – Mathematics)….…..……….....169
Table 23. Inter-correlations of the SAMU (Traditional – Psychology)…....…..…….....171
Table 24. Selected Output for Multivariate Analysis of Covariance (PMA)...... ……...175
Table 25. Correlations between Dependent Variables and Covariate……………...... 177
Table 26. Adjusted Means of the SAMU for Method…………………..…………...... 180
Table 27. Adjusted Means of the SAMU for Department……………..…...... 181
Table 28. Adjusted Means of the SAMU for Method by Department…………….....…182
Table 29. Levene’s Test of Equality of Error Variances……..………………..……….183
Table 30. Selected Output for Univariate F-Tests……………………………….…..…187
Table 31. Elementary Statistics Final Exam Grades of 65 Students…………...………243
Table 32. Grouped-data frequency Table for the Grades (Traditional)………………...248
Table 33. Grouped-data Frequency Table for the Weights (Iteration)…..…….…..…...254
Table 34. Thickness (in mm) of 100 Mass-Produced Metal Blocks…...………………260
Table 35. Grouped-Data Frequency and Cumulative Frequency Table for the Thickness
of Metal Blocks Random Sample SAMPLEQ3 (Iteration Algorithm)…265
Table 36. Grouped-Data Frequency and Cumulative Frequency Table for the Thickness
of Metal Blocks Random Sample SAMPLEQ3 (SPSS Algorithm)…….268
Table 37. Comparison of the Basic Statistics from the Raw Data……………………...270
Table 38. Varimax Rotated Factors. Principal Component Extraction
(Iteration Algorithm)……………………………………………………282
Table 39. Varimax Rotated Factors. Principal Component Extraction
(Traditional Method)……………………………………………………283
LIST OF FIGURES
Figure 1. Grades stem-and-leaf plot……………………….….………...………………..26
Figure 2. Histogram produced by the SPSS default format.....………...... ……………....40
Figure 3. Histogram modified with a traditional method’s choices…...... 41
Figure 4. Histogram with the iteration algorithm’s choices..…………..………………127
Figure 5. SampleQ3…………….……………………………….…..…………….……267
Figure 6. Scatter Plot for Value versus Difficulty (Iteration – Mathematics)………….274
Figure 7. Scatter Plot for Cognitive versus Difficulty (Iteration – Mathematics)...... 275
Figure 8. Scatter Plot for Affect versus Difficulty (Iteration – Psychology)..………….276
Figure 9. Scatter Plot for Cognitive versus Difficulty (Iteration – Psychology)..…...…277
Figure 10. Scatter Plot for Affect versus Cognitive (Traditional – Mathematics)...……278
Figure 11. Scatter Plot for Value versus Cognitive (Traditional – Mathematics)…...... 279
Figure 12. Scatter Plot for Value versus Difficulty (Traditional – Psychology)……….280
Figure 13. Scatter Plot for Value versus Affect (Traditional – Psychology)..………….281
CHAPTER 1: INTRODUCTION
Once the data have been gathered from a population of interest, it is often difficult to get a sense of what the data indicate when they are presented in an unorganized fashion. Assembling the raw data into a meaningful form, such as a frequency distribution, makes the data easier to understand and interpret by a broad audience in a relatively short period of time. It is in this context that frequency distributions matter: they convey succinctly, to a general public, the numerical information contained in the data.
Communication is at the heart of statistics, and it involves finding significant patterns in a given dataset (Rangecroft, 2002). This view is shared by Chatfield (1995), who recognized communication as an important component of investigation that should be incorporated into statistics education. Brase and Brase (1999) also concurred by stressing “no matter what type of data is available, sample or population, it is important to be able to organize and present the data to other people” (p. 23). Brase and Brase went on to add “for this purpose, tables and graphs are indispensable” (p. 23). As Hulsizer
(2004) appositely expressed it, tables and graphs allow a researcher to “make sense of his/her own data, communicate his or her results to others who are interested in them, and actually see relationships in the data” (p. 12).
Unfortunately, large data sets are difficult either to communicate to an intended audience or to use for decision making without some logical procedure for organizing and summarizing the data into a meaningful representation. This can be done in a tabulated arrangement, graphically, or statistically (Aron & Aron, 2003; Aryee, 2003; Roscoe, 1975).
Tabulation arrangement transforms this mass of raw data from unmanageable detail into an easy-to-understand summarized form that is more clearly presentable and interpretable (Levin & Fox, 2004). As part of the mission of descriptive statistics, ways must be found to succinctly organize and neatly present the raw data in order to facilitate examination, analysis, and comprehension (Blommers & Forsyth, 1977).
This study focuses on the development and examination of a new method to construct frequency tables for grouped data. A grouped-data frequency distribution is a table listing scores grouped into non-overlapping classes of equal width, listed in one column, along with the number of cases falling in each class, recorded in the adjacent column. A grouped-data frequency distribution is constructed for 3 main reasons: (1) to facilitate the examination, analysis, and comprehension of data, (2) to estimate frequencies of the unknown population distribution from the distribution of sample data, and (3) to facilitate the computation of various statistical measures. Grouped-data frequency distributions can be structured either as a table or a graph; both show basic features that are of interest to a statistician.
In the following sections, 3 major forms of tabulation arrangements, together with two appropriate graphical displays, are gradually introduced using the same relatively large dataset. The goal is to show how, ultimately, the grouped-data frequency table very often brings more clarity and better understanding when summarizing a large set of raw data.
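To ground the definition, the following Python sketch builds a generic grouped-data frequency table from a subset of the grades used later in Table 1. It is only an illustration under assumptions of my choosing (Sturges' rule for the number of classes, and a class width obtained by rounding up the inclusive range divided by that number); it is neither the iteration algorithm developed in this study nor any particular traditional method.

```python
import math

def grouped_frequency_table(data, k=None):
    """Group integer scores into k non-overlapping classes of equal width.

    Illustrative sketch only: the choice of k (Sturges' rule) and of the
    class width (inclusive range divided by k, rounded up) are generic
    textbook conventions, not this study's iteration algorithm.
    """
    if k is None:
        k = round(1 + math.log2(len(data)))        # Sturges' rule
    low, high = min(data), max(data)
    width = math.ceil((high - low + 1) / k)        # inclusive range / k
    table = []
    for i in range(k):
        lower = low + i * width                    # lower class limit
        upper = lower + width - 1                  # upper class limit
        freq = sum(lower <= x <= upper for x in data)
        table.append((lower, upper, freq))
    return table

# First 16 grades from Table 1 of this chapter.
grades = [58, 55, 31, 61, 77, 66, 76, 58, 68, 38,
          44, 49, 54, 41, 29, 81]
for lower, upper, freq in grouped_frequency_table(grades):
    print(f"{lower}-{upper}: {freq}")   # e.g. 29-39: 3
```

Each resulting row pairs a class interval with the number of cases falling in it, mirroring the two-column layout described above.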
Forms of Tabulation Arrangement
As stated above, data in raw form are usually not easy to use for decision making.
Some type of organization is needed. Organizing and presenting tabulated data can be done in 3 different forms, according primarily to the size of the data collected (Aryee, 2003; Hopkins & Glass, 1978).
First, if the size of the data collected ranges from small to medium, the data collected may be arranged in an ordered array. It is an arrangement of data items in either an ascending (from lowest to highest value) or descending (from highest to lowest value) order of magnitude (refer, for instance, to Isaac & Michael, 1997, p. 166).
Second, if the size ranges from medium to large, the data can be presented in a frequency distribution with an interval size (or interval width) of one. Frequency distributions with an interval width of one are referred to as simple frequency distributions; they report each and every observed value of the variable that has been measured, traditionally starting with the smallest value and ending with the highest.
Third, if the sample size is large or very large, the “data are often summarized into a frequency distribution with an interval width greater than one, which reports the number of observations in a set of data within a given range of values of the variable that has been measured” (Raymondo, 1999, p. 48). Frequency distributions with an interval size greater than one will be referred to as grouped-data frequency distributions.
For comparison purposes, the same large dataset is carried throughout the first 3 chapters to exemplify the 3 different forms of tabulation arrangement (that is, ordered array, simple frequency table, and grouped-data frequency table). A frequency distribution is only one major method of data reduction; another is to present the data in some type of graphic display. Since the data used for illustration purposes in this study are quantitative, the two types of graphic display retained are the stem-and-leaf plot for the ordered array data and the histogram for the grouped data.
Presenting Data in an Ordered Array
The simplest way of organizing raw data in a systematic order is usually the ordered array. The ordered array is perhaps most useful when only a few measures are involved (usually 20 or fewer observations). The ordered array helps to see at once the maximum and minimum values. It also gives a rough idea of the distribution of the items over the range.
In contrast, when the number of data items involved is significantly large, the ordered array becomes a rather awkward data organization tool. “Listing the values individually for bigger datasets consumes too much space and makes it difficult to visualize the important patterns in the data” (Sanders, Murphy, & Eng, 1976, p. 45).
Table 1 lists the grades obtained by 80 students in the final exam of an elementary statistics class. The grades are scored out of 100 points and rounded to the nearest whole number.
Table 1
Grades of 80 Students Obtained in the Final Exam of an Elementary Statistics Class

58 55 31 61 77 66 76 58 68 38
44 49 54 41 29 81 90 45 70 40
61 51 48 67 88 67 36 69 71 57
52 58 51 63 92 57 83 86 71 47
64 43 31 78 57 61 58 72 72 41
51 66 62 93 62 84 76 28 58 46
50 46 32 65 85 75 47 57 42 68
37 51 58 63 56 53 79 74 80 73
The scores recorded in Table 1 are called raw data in that they appear in the way they were originally collected (according, here, to the official class roster). Therefore, these numerical values are not organized in any systematic order or processed
(manipulated or treated) in any logical sequence beyond their original form.
Though possible, it takes some time and effort to get a clear idea of these eighty scores by simply looking at the numbers individually in the foregoing form. More precisely, questions such as the following are difficult to answer in a short time.
1. What is the shape of the distribution? Do the observed scores tend to fall into
some recognizable pattern?
2. What is the location, or central tendency, of the random variable being observed?
3. How much variation is involved? Are the values widely dispersed or are they all
fairly close in value? (See Milton, McTeer, & Corbet, 1997, pp. 13-14).
4. Where do most of the scores tend to concentrate?
To hold so many scores in mind at once is obviously hard. To derive any generalized concepts of group performance from a brief inspection of these scores is also tricky. Certain characteristics of the group can, of course, be noted without lengthy scrutiny. For instance, a quick examination of these values shows that all of the data are two-digit numbers, and thus none of them is less than 10 or equals 100 (the possible maximum score a student can obtain).
It is difficult for even a trained eye to quickly determine much else from these data. Even identifying the lowest and the highest scores would take a bit of time. “It is only a hazy impression that can be formed, which may not always be correct”
(Vaidyanathan & Vaidyanathan, 1987, p. 8). Fortunately, with some re-ordering of the eighty scores, these and other characteristics may be noted more quickly and precisely.
Example of an Ordered Array
As a first step, the scores can be rearranged in an ordered array, from lowest to highest or from highest to lowest, as a way of organizing them in some sort of logical order. But this does not reduce the bulk of the data. An ascending-order array is shown in
Table 2 below.
Table 2
Ordered Array for Grades of 80 Students in the Final Exam of a Statistics Class

28 29 31 31 32 36 37 38 40 41
41 42 43 44 45 46 46 47 47 48
49 50 51 51 51 51 52 53 54 55
56 57 57 57 57 58 58 58 58 58
58 61 61 61 62 62 63 63 64 65
66 66 67 67 68 68 69 70 71 71
72 72 73 74 75 76 76 77 78 79
80 81 83 84 85 86 88 90 92 93
Table 2, as presented above, is called an ordered array since the raw data have
been only rearranged from lowest to highest. Now, it can be noticed at a glance that the
smallest score is 28 and the largest score is 93. This information is hidden in Table 1.
From the same information the (non-inclusive) range of the scores can be computed as
65, the difference between 93 and 28. In the context of a statistics class’ scores, a range
of 65 indicates that the scores obtained by these 80 students are widely dispersed. To
know where the data are centered, the mean is often computed. Yet, if the distribution of
the data is heavily skewed, then the median would be a better descriptive measure of the
center than the mean.
Calculating an average score for all the 80 values still consumes as much time as
before, but now the median and mode can be obtained a bit more rapidly. The ‘middle’ or
median score is 58 (the value that splits the ordered array into two equal halves). As to the score that occurs most frequently, or modal score, it is again 58, for this particular value appears six times in the dataset. The average score (or mean) is obtained by adding together all the scores observed and dividing their total by 80, which yields a value of 60.
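These summary values can be checked directly with a few lines of Python using only the standard library. The list below is the raw data of Table 1; sorting it reproduces the ordered array of Table 2.

```python
from statistics import mean, median, mode

# The 80 final-exam grades of Table 1, in their raw (roster) order.
grades = [
    58, 55, 31, 61, 77, 66, 76, 58, 68, 38,
    44, 49, 54, 41, 29, 81, 90, 45, 70, 40,
    61, 51, 48, 67, 88, 67, 36, 69, 71, 57,
    52, 58, 51, 63, 92, 57, 83, 86, 71, 47,
    64, 43, 31, 78, 57, 61, 58, 72, 72, 41,
    51, 66, 62, 93, 62, 84, 76, 28, 58, 46,
    50, 46, 32, 65, 85, 75, 47, 57, 42, 68,
    37, 51, 58, 63, 56, 53, 79, 74, 80, 73,
]

ordered = sorted(grades)                 # the ordered array of Table 2
print(ordered[0], ordered[-1])           # 28 93 (smallest and largest)
print(ordered[-1] - ordered[0])          # 65 (non-inclusive range)
print(median(ordered))                   # 58.0 (middle of the 80 values)
print(mode(ordered))                     # 58 (occurs six times)
print(mean(ordered))                     # 60 (sum of 4800 divided by 80)
```

A program recovers in an instant what takes lengthy scrutiny by eye, which is precisely the motivation for the tabulation arrangements introduced next.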
Presenting Ordered Array Data in a Stem-and-Leaf Plot
A first approach to presenting quantitative data graphically is to form a stem-and-leaf plot. A stem-and-leaf diagram, or stem plot, organizes and graphically represents quantitative data by using the first part of each value's digits as the stem and the remaining part as the leaf. A stem-and-leaf plot is similar to a histogram but is usually a more informative display for relatively small data sets (fewer than 100 data points). It provides a table as well as a picture of the data, and from it the data can readily be written in order of magnitude, which is useful for many statistical procedures. As with the ordered array, the only information lost is the order in which the data were originally collected.
To construct a stem plot, each quantitative measurement is broken into two parts:
The leading digit(s) become(s) the stem, and the trailing digit(s) become(s) the leaf. The stems are located along the main axis, and a leaf for each data value is located so as to display the entire distribution of the data. The stems represent the classes in a graph; the leaves reflect the number of measurements in each class.
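The stem/leaf split just described is mechanical enough to sketch in a few lines of Python. The function name and the one-row sample below are my own choices for illustration; `divmod` by 10 implements the stem width of 10 used in Figure 1.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Map each two-digit value to a tens-digit stem and a units-digit leaf
    (stem width 10); leaves are listed in ascending order beside each stem."""
    plot = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)   # e.g. 29 -> stem 2, leaf 9
        plot[stem].append(leaf)
    return dict(plot)

# First row of Table 1, for a compact display.
sample = [58, 55, 31, 61, 77, 66, 76, 58, 68, 38]
for stem, leaves in sorted(stem_and_leaf(sample).items()):
    print(stem, ".", "".join(str(leaf) for leaf in leaves))
```

Applied to all 80 grades, the same split reproduces the stems 2 through 9 shown in Figure 1.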
Advantages of a Stem-and-Leaf Plot
First, a stem-and-leaf diagram (or stem plot) is similar to a frequency histogram
but has the advantage that the graph still contains the original data values. Second, a
stem-and-leaf plot provides a simple way to sort data. Third, the stem plot might not always be easier to construct, but it shows more detail than the comparable frequency histogram.
Disadvantage of a Stem-and-Leaf Plot
There is one major disadvantage. A stem-and-leaf plot is not as useful as the comparable frequency histogram if there are a large number of data values or if the data values have a large number of digits.
Example of a Stem-and-Leaf Plot
Figure 1 displays a stem-and-leaf plot for the final statistics exam grades of 80 students. Since the grades recorded are two-digit numbers, the leading digit will be the stem and the trailing digit the leaf. Here, one has to use the 10’s digit for the stem unit:
29, for instance, is shown as “2 . 9”. In other terms, with the number 29, “2” represents the stem, while “9” serves as the leaf.
Frequency Stem & Leaf
2.00 2 . 89
6.00 3 . 112678
13.00 4 . 0112345667789
20.00 5 . 01111234567777888888
16.00 6 . 1112233456677889
13.00 7 . 0112234566789
7.00 8 . 0134568
3.00 9 . 023
Stem width: 10
Each leaf: 1 case(s)
Figure 1. Grades stem-and-leaf plot.
By observing the above stem-and-leaf plot, it can be seen that the mode is 58 and the distribution of the grades is roughly symmetrical around this modal value. There are no outliers and no gaps spotted, though it can be argued that the ordered array may make these characteristics appear even more clearly.
Presenting Data in a Simple Frequency Table
A small to moderate dataset can be more usefully represented in a simple (or ungrouped-data) frequency table. A simple frequency table is a listing (generally arranged in order of magnitude) of each observed score of a variable, listed in one column, together with the number of times that score occurred, recorded in an adjacent column (Aryee, 2003; Sprinthall, 2000). The recorded number of times each score occurred is referred to as its frequency; hence the term “simple frequency table” or,
equivalently, “simple frequency distribution” (Aryee, 2003; Sprinthall, 2000; Raymondo, 1999). This type of tabulation arrangement is usually used when there are fewer than 100 observations and/or when the range of unit score categories is moderate. Berenson and
Levine (1998), especially, recommended “a maximum of about 20 distinct values” (p.
28).
Some authors see the simple frequency distribution as an enhancement over the ordered array in terms of further simplifying inspection of the data and making more sense out of them (Elifson, Runyon, & Haber, 1990, p. 55; Glasnapp and Poggio, 1985, p.
58; Hamburg, 1989, p. 9; Hinkle, Wiersma, & Jurs, 2003, p. 22; Hole, 2000, p. 2; Spence,
Wiersma, & Jurs, 1990). In particular, Glasnapp and Poggio asserted that the simple frequency table “is very efficient and communicates maximum information” (pp. 58-59).
Additionally, with a simple frequency table, it is possible to start seeing certain patterns and regularities which might not otherwise be revealed if the data were not organized
(Aryee, 2003).
Example of a Simple Frequency Table
It can easily be verified that Table 3 contains more than the maximum of about 20 distinct values recommended by Berenson and Levine (1998). The dataset in Table 1 is evidently too large to be conveniently portrayed by a simple frequency table. However, the same dataset was kept to make comparisons easier.
An enhanced picture of the raw data displayed in Table 1 emerges by stacking together the identical scores. The obtained scores are listed in one column called ‘Score’
(from the lowest score observed to the highest score observed) and each of their
respective frequencies (number of occurrences) is indicated in the adjacent column labeled ‘Frequency’, as shown in Table 3 below.
Table 3 is called a simple frequency distribution or simple frequency table. It gives a better sense of the nature of the grades shown in Table 1 than does the ordered array presented in Table 2. In particular, an overall picture of how the scores are distributed (or scattered) now emerges. The scores in Table 3 occupy only 55 rows, down from 80 (had Table 2 been presented in a single column rather than in the space-saving arrangement used here).
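The mechanics described above — listing each distinct score with its frequency, relative frequency, and cumulative relative frequency in percent — can be sketched in a few lines of code. The following Python sketch is illustrative only and is not part of the dissertation; the sample scores shown are hypothetical, not the Table 1 data.

```python
from collections import Counter

def simple_frequency_table(scores):
    """Build a simple (ungrouped) frequency table: for each distinct
    score x_i, list its frequency n_i, relative frequency f_i = n_i / n,
    and cumulative relative frequency F_i in percent."""
    n = len(scores)
    counts = Counter(scores)
    table = []
    cumulative = 0.0
    for score in sorted(counts):          # lowest to highest observed score
        ni = counts[score]
        fi = ni / n
        cumulative += fi
        table.append((score, ni, round(fi, 4), round(100 * cumulative, 2)))
    return table

# Hypothetical mini-dataset for illustration
scores = [28, 29, 31, 31, 32, 58, 58, 93]
for row in simple_frequency_table(scores):
    print(row)
```

Each returned row mirrors one row of Table 3: (xi, ni, fi, Fi in percent).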
Table 3
Simple Frequency Table of the Grades of 80 Students in a Statistics Final Exam

Score, xi    Frequency, ni    Relative frequency, fi    Cumulative relative frequency in percent, Fi
28 1 0.0125 1.25
29 1 0.0125 2.50
31 2 0.0250 5.00
32 1 0.0125 6.25
36 1 0.0125 7.50
37 1 0.0125 8.75
38 1 0.0125 10.00
40 1 0.0125 11.25
41 2 0.0250 13.75
42 1 0.0125 15.00
43 1 0.0125 16.25
44 1 0.0125 17.50
45 1 0.0125 18.75
46 2 0.0250 21.25
47 2 0.0250 23.75
48 1 0.0125 25.00
49 1 0.0125 26.25
50 1 0.0125 27.50
51 4 0.0500 32.50
52 1 0.0125 33.75
53 1 0.0125 35.00
54 1 0.0125 36.25
55 1 0.0125 37.50
56 1 0.0125 38.75
57 4 0.0500 43.75
58 6 0.0750 51.25
61 3 0.0375 55.00
62 2 0.0250 57.50
63 2 0.0250 60.00
64 1 0.0125 61.25
65 1 0.0125 62.50
66 2 0.0250 65.00
69 1 0.0125 71.25
70 1 0.0125 72.50
71 2 0.0250 75.00
72 2 0.0250 77.50
73 1 0.0125 78.75
74 1 0.0125 80.00
75 1 0.0125 81.25
76 2 0.0250 83.75
77 1 0.0125 85.00
78 1 0.0125 86.25
79 1 0.0125 87.50
80 1 0.0125 88.75
81 1 0.0125 90.00
83 1 0.0125 91.25
84 1 0.0125 92.50
85 1 0.0125 93.75
86 1 0.0125 95.00
88 1 0.0125 96.25
90 1 0.0125 97.50
92 1 0.0125 98.75
93 1 0.0125 100.00
Total n = 80 1.0000
Although the scores are spread across the whole range, they tend to be concentrated (or to cluster) around a central value. For example, it is readily apparent that there are more scores from the mid-fifties to the mid-sixties (around the median value) than in any other region; only 5 scores are less than 35, and only 5 other scores are above 85. This tendency was not as apparent from the unorganized (raw) data contained in Table 1.
Note that in Table 3, xi at the top of the first column stands for the ith raw score (in this case, grade), ni over the second column represents the frequency or number of occurrences of that score, and n in the last row designates the total number of subjects or scores observed. The symbols fi and Fi correspond to the relative frequency and the cumulative relative frequency in percent, respectively.
While the data presentation has now become much clearer, Table 3 still contains too many details. Even with 55 rows and 2 columns (instead of 80 rows and 1 column), it does not adequately simplify and condense the presentation of the data.
The grouped-data frequency table presented in the next section again uses the same dataset for comparison purposes. A brief presentation of the approach precedes the illustration.
Presenting Data in a Grouped-Data Frequency Table
Interval-level scores are sometimes spread over a wide range, making the
resultant simple frequency distribution long and difficult to read. When such instances
occur, few cases may fall at each score value, and the group pattern becomes blurred
(Levin & Fox, 2004, pp. 26-27). When there are at least 50 observations (or at least 20
distinct values according to Schloss & Smith, 1999, p. 147), a simple frequency
distribution is long and bulky (Khazanie, 1986, p. 11; Spence et al., 1990; Shavelson,
Ruiz-Primo & Mitchell, 1996, p. 39). It is therefore not an efficient way to organize and display a vast dataset (Glasnapp & Poggio, 1985, p. 59; Raymondo, 1999, p. 50).
If the variable studied is a quantitative continuous variable, a more compact and more accurate form of data organization, called a grouped-data frequency table, may be developed without obscuring the essential information contained in the values (Gravetter & Wallnau, 1985, p. 37; Sanders et al., 1976, p. 47). Frequency distributions for grouped data are employed to reveal or emphasize a group pattern
(Levin & Fox, 2004, pp. 26-27) with greater convenience and clarity. This ancillary form of tabulation arrangement consists of combining adjacent score values into groups or classes and summing up their corresponding frequencies.
Grouped-data frequency distributions can be structured either as tables or as graphs; both show basic features that are of interest to a statistician: the overall shape of the data, symmetry, location and spread, the existence of outliers, and evidence of clusters or gaps (see, for instance, Arsham, 2001; Mays, 2005).
Example of a Grouped-Data Frequency Table
The same dataset as presented in Table 1 is used again here. This example aims to illustrate a few advantages a grouped-data frequency table might have over a simple frequency table for the same set of raw data.
Table 4 displays the same raw data contained in Table 1, but the data are now grouped into classes in what is called a grouped-data frequency distribution or a frequency table for grouped data. Here the grades are presented within broader categories along with the number or frequency of students whose grades fell into these categories. A
traditional method (Vaidyanathan & Vaidyanathan, 1987, pp. 9-11) served in determining the 4 key elements essential for building a frequency distribution of the grouped data as depicted in Table 4.
Specific details related to how Table 4 was constructed using this traditional procedure (Vaidyanathan & Vaidyanathan, 1987) are described in chapter 3 under the method development section for the traditional method. In the meantime, the 4 computed key elements are as follows: the inclusive range for the grouped data is equal to 72 (that is, the difference between 96 and 25 plus 1, or 96.5 minus 24.5), the number of classes is 8, the class width is 9, and the starting point is 25. Both the smallest observed score (i.e.,
28) and the largest observed score (i.e., 93) are included in the retained frequency distribution of grouped data.
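Given the four key elements reported above (starting point 25, class width 9, 8 classes, hence an inclusive range of 72), the grouping itself is mechanical. The following Python sketch is illustrative only (the function name and layout are this writer's, not the traditional method's), and the mini-dataset is hypothetical since the Table 1 data are not reproduced here.

```python
def grouped_frequency_table(scores, start, width, num_classes):
    """Group integer raw scores into classes given three of the key
    elements: start (lowest class limit), class width, and number of
    classes; the inclusive range covered is num_classes * width.
    Each row holds: (lower limit, upper limit, lower boundary, upper
    boundary, midpoint, frequency, cumulative frequency, relative
    cumulative frequency in percent)."""
    n = len(scores)
    rows = []
    cumulative = 0
    for k in range(num_classes):
        lower = start + k * width            # class limits, e.g. 25 to 33
        upper = lower + width - 1
        freq = sum(lower <= x <= upper for x in scores)
        cumulative += freq
        rows.append((lower, upper, lower - 0.5, upper + 0.5,
                     (lower + upper) / 2, freq, cumulative,
                     round(100 * cumulative / n, 2)))
    return rows

# Hypothetical mini-dataset; with start=25, width=9, num_classes=8 the
# class limits, boundaries, and midpoints match those of Table 4.
rows = grouped_frequency_table([28, 29, 40, 58, 93], 25, 9, 8)
```

With these arguments the first class is 25 to 33 (boundaries 24.5 to 33.5, midpoint 29) and the last is 88 to 96, exactly as in Table 4.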
Table 4
Grouped-data Frequency Table for Grades of 80 Students in the Final Statistics Exam
(Traditional Method)
Class limit    Class boundary    Frequency, ni    Class midpoint, xi    Cumulative frequency    Relative cumulative frequency in percent
25 to 33 24.5 to 33.5 5 29 5 6.25+
34 to 42 33.5 to 42.5 7 38 12 15.00
43 to 51 42.5 to 51.5 14 47 26 32.50
52 to 60 51.5 to 60.5 15 56 41 51.25
61 to 69 60.5 to 69.5 16 65 57 71.25
70 to 78 69.5 to 78.5 12 74 69 86.25
79 to 87 78.5 to 87.5 7 83 76 95.00
88 to 96 87.5 to 96.5 4 92 80 100.00
Total n = 80
Note: +6.25 = (5 * 100)/80.
A frequency table for grouped data, like the one of Table 4, appears to be a more
concise and efficient method for ordering the observations collected. In particular, the
reclassification makes the data more manageable and more comprehensible at a glance.
Besides, salient features or patterns in the data are revealed and become easier to detect
and interpret. For example, it is readily apparent that only 5 scores are less than 34, and only 4 scores are above 87; the modal interval ranges from 61 to 69, since it has the highest frequency, which is 16; the median class ranges from 52 to 60, because the 50% cumulative relative frequency value falls in that class; and the grade distribution is slightly skewed to the left.
A major inconvenience of using grouped data rather than the raw data is the loss of precision. For instance, the grouped data no longer have the same range as the original dataset (raw data): with the raw data the range was found to be 65, whereas with the grouped data it is 71 (that is, 96 - 25). Also, obtaining the mean or even the median becomes no easier.
The issue of grouping data into frequency tables is perhaps not as important as it once was. Dubois (1982), for example, pointed out that for “further statistical manipulations and where calculators and computers are available, the grouped-data frequency distribution has only tangential relevance, and the behavioral scientist is best advised to use the raw data” (p. 23).
Sokal and Rohlf (1995), for their part, noted that “Grouping data into frequency distributions was necessary when computations were done by pencil and paper or with mechanical calculations” (p. 28). They went on to add: “Nowadays, even thousands of variates can be processed efficiently by computer without prior grouping” (p. 28). As far back as 1964, Adkins acknowledged that “with the advent of computers, even large-scale analysis is performed with ungrouped data” (p. 38).
Even if the comments made by Adkins (1964), Dubois (1982), and Sokal and Rohlf (1995) are legitimate, “occasions for grouping data arise sufficiently often that the topic should not be ignored” (Adkins, 1964, p. 38). Also, Frankfort-Nachmias and Leon-Guerrero (2000) pointed out that “grouped frequency distributions are very useful when the raw data are not available” (p. 47).
Roscoe (1975) affirmed that grouping data into tables may be a useful technique in descriptive statistics, where the grouped-data frequency distribution is an “end product” rather than an intermediate step in a series of calculations of descriptive statistical measures. Healey (2002) contended that “the construction of frequency distributions is almost always the first step in any statistical analysis” (p. 28). He further added: “Clearly the data need to be organized in a format that allows the researcher (and his or her audience) to discern easily any patterns or trends of the distribution of the variable under study” (p. 28). More recently, Weiss (2004) noted: “Grouping data is one of the most common methods of organizing raw data” (p. 44).
To get rid of the cumbersome problem of dealing with too many raw scores, it is necessary to sacrifice some precision (May, Masson, & Hunter, 1990). Further, Dubois
(1979) emphasized that “this loss of precision is more than offset by the gains of increased compactness of presentation and new knowledge of the pattern of how the items distribute themselves” (p. 27).
Presenting Data in a Histogram
The most common graph of the distribution of one quantitative variable, and also the most popular method of summarizing numerical information, is the histogram (see, for example, Moore, 2000, p. 8; Triola, 2001, p. 12; Wand, 1997, p. 59). A histogram is a graphical technique applied to a set of interval data to describe how the data are distributed. The histogram clearly “portrays information on location, spread, and shape”
(Keller, 2001, p. 22). The histogram is only appropriate for variables whose values are
numerical and measured on an interval scale. It is generally used when dealing with large datasets (more than 100 observations), for which stem-and-leaf plots become tedious to construct. A histogram can also help to detect unusual observations (outliers) or gaps in the dataset.
Histograms group data like stem-and-leaf plots, but the categories are not constrained by the decimal structure of the numbers. Each category should be of the same size, the categories should be contiguous, and every data point must fall in exactly one category or class. The actual values of the original data are not retained, but the number of data points in each class is known. For the sake of learning by doing, a histogram can be constructed manually by drawing a bar above each class interval (or, alternatively, class midpoint) so that the width of the bar extends to the real limits of the class interval and the height of the bar corresponds to the frequency (denoted by ni in Table 4). The bars are usually depicted vertically as “columns”, which are shown on the ordinate. The classes are represented horizontally along the baseline axis, or abscissa (Gravetter & Wallnau, 2000; Hopkins & Glass, 1978).
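The manual construction just described — one bar per class whose length equals the class frequency ni — can be mimicked with a crude text rendering. This is an illustrative Python sketch only (a real histogram would of course be drawn graphically); the class labels and frequencies are taken from Table 4.

```python
def text_histogram(class_labels, frequencies, bar_char="#"):
    """Render a crude horizontal histogram: one bar per class interval,
    with bar length equal to the class frequency n_i."""
    label_width = max(len(lbl) for lbl in class_labels)
    lines = []
    for lbl, freq in zip(class_labels, frequencies):
        lines.append(f"{lbl:>{label_width}} | {bar_char * freq} ({freq})")
    return "\n".join(lines)

# Class intervals and frequencies from Table 4
labels = ["25-33", "34-42", "43-51", "52-60", "61-69", "70-78", "79-87", "88-96"]
freqs = [5, 7, 14, 15, 16, 12, 7, 4]
print(text_histogram(labels, freqs))
```

The longest bar (16 marks, class 61 to 69) identifies the modal interval at a glance.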
Although the current trend is to resort to statistical packages to generate histograms automatically, there are still many instances where computers are inaccessible or the histograms produced by their default settings are simply unsatisfactory.
Also, as put by Moore and Notz (2006), “statistics software will choose classes of a histogram for you, and may use slightly different rules from one computer software to another. The computer’s choice is usually a good one, but you can change it if you want”
(p. 207).
On one hand, in a situation where computers are unavailable, a grouped-data frequency table can be built by hand before being rendered graphically in the form of a histogram. On the other hand, a default histogram generated by computer software might be too skewed or have too many empty classes. In such circumstances, an appropriate algorithm might be used to decide on the number of class intervals the histogram should have, its class width, and the starting point. Using the same dataset, a histogram was generated initially by the default settings of SPSS. This histogram was later changed on the basis of choices obtained by a traditional method. How these choices were made will be fully explained in chapter 3.
Example of a Histogram
Figure 2 portrays a histogram as initially generated by SPSS in its default format. Replacing all the default grade values by the class boundaries yielded by a traditional method, SPSS produced Figure 3, a modified version of the histogram in Figure 2. A quick look at the two graphs reveals that Figure 2 contains 17 different classes or adjoining bars, whereas the transformed histogram in Figure 3 has only 8 classes.
Additionally, although both histograms are symmetrical around the center (grade 60), the histogram displayed in Figure 2 has 3 peaks, whereas the transformed histogram in Figure 3 has only one peak. Since the raw data as presented in Table 1 have one mode, the transformed histogram may be a relatively more satisfactory graphical representation of the Table 1 data than the original histogram.
[Figure: vertical-bar histogram titled “Histogram of the students' grades”; x-axis: Grades (20.00 to 100.00), y-axis: Frequency; Mean = 60.00, Std. Dev. = 15.96, N = 80]
Figure 2. Histogram of the students’ grades (default SPSS format).
[Figure: vertical-bar histogram titled “Histogram of the students' grades” with 8 classes (bar frequencies matching Table 4: 5, 7, 14, 15, 16, 12, 7, 4); x-axis: Grades (27.0 to 99.0), y-axis: Frequency; Mean = 60.00, Std. Dev. = 15.96, N = 80]
Figure 3. Histogram of the students’ grades (Traditional method’s choices).
Rationale for the Study
In our contemporary society, there is a constant need for all people to understand, interpret and analyze numerical information displayed in tabular or graphical forms.
Sullivan (1993) emphasized this point by asserting that a data “presentation, which includes charts, graphs, and summary statistics, is often the preferred mode in business publications, research reports, news weeklies, and daily newspapers” (p. 1). Statistics users in general, and students in particular, need to recognize how information may be exhibited in a misleading manner, which often leads to false conclusions. Sullivan further maintained that knowing how to construct a grouped-data frequency table or histogram might help “students in particular to detect some of the misleading tabular or graphical representation of data, or at least be able to read the professional literature and interpret charts and summaries contained in the texts” (p. 13).
As documented in chapter two, the review of the related literature identified 4 elements as essential for making frequency tables of grouped data. These 4 key elements are: the inclusive range, the number of classes, the class length or class width, and the starting point. At the moment, while various procedures exist for creating grouped-data frequency tables, none of them is known to determine all 4 of these key elements.
In addition, the rules, guidelines, or principles accompanying existing methods are highly subjective and thus arbitrary (Doane, 1976). This is because they are open to personal interpretations or choices and hence leave too much room for guessing. As a result, statistics users would not necessarily come up with the same grouped-data frequency table if provided with the same set of raw data (Davies, 1929;
Davies & Goldsmith, 1980; Devore & Peck, 2005; Groebner & Shannon, 1981; Hoaglin,
Mosteller, & Tukey, 1983; Jaeger, 1990; Scott, 1979; Shaw, Huffman, & Haviland,
1987).
According to Aron and Aron (1997, p. 7) and Wessel and Willett (1960, p. 80), the big questions in designing a grouped-data frequency table are, first, determining the number of intervals and, second, the interval size or interval width. However, under contemporary practice, the “number of classes is just arbitrary, and so also is the point at which to begin the lowest class and the point at which to end the last class” (Kerr, Hall, &
Kozub, 2002, p. 11).
It is in trying to address the above concerns that the iteration algorithm was conceived as a structured way of summarizing and displaying collected data in a frequency distribution of grouped data. It was developed in response to the lack of clear agreement about which existing formula works best and the lack of clarity in the instructions provided by the existing methods for creating grouped-data frequency tables.
Most existing methods leave too much room for subjectivity and arbitrariness, which in turn are sources of bias. In contrast, the iteration algorithm utilizes five formulas and works in a systematic step-by-step fashion that leaves no room for guessing by providing clear, unambiguous directions that yield unique grouped-data frequency tables; that is, all statistics users are expected to produce the same table for the same set of raw data. It also saves time by relieving the pain of the trial-and-error approaches associated with most of the traditional methods.
The iteration algorithm is original in a number of ways. First, the name stems from the fact that this protocol operates by successive iterations, though at times a single iteration may suffice. Second, the iteration algorithm works each time with a unique set of five formulas. These formulas serve in determining, en masse, the 4 aforementioned key elements. Third, all five formulas used are either completely new or adapted to take care of the iterations. Fourth, in the absence of computers (chiefly in developing countries), the production of a unique fitting table by many statistics students working on the same dataset would make the whole marking exercise less tedious for instructors teaching large statistics classes, because those instructors would be left with just a single grouped-data frequency table to examine and grade.
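The five formulas themselves are developed in chapter 3 and are not reproduced here. Purely to convey the flavor of a deterministic search for a “compatible trio” (inclusive range, number of classes, odd class width) that every user would reach identically, the following toy Python sketch is offered. Every rule in it (the candidate widths, the bounds on the number of classes, the compatibility condition) is an assumption of this sketch, not the dissertation's algorithm, and it does not reproduce the dissertation's results.

```python
import math

def first_compatible_trio(scores, k_min=5, k_max=20):
    """Toy illustration ONLY (not the dissertation's iteration algorithm):
    scan odd candidate class widths in increasing order and stop at the
    first width whose implied number of classes lies between k_min and
    k_max, returning the trio (inclusive range, number of classes, width).
    Because the scan order is fixed, every user obtains the same trio."""
    low, high = min(scores), max(scores)
    inclusive_range = high - low + 1      # integer data: one unit of measurement
    for width in range(1, inclusive_range + 1, 2):   # odd widths only
        num_classes = math.ceil(inclusive_range / width)
        if k_min <= num_classes <= k_max:
            return inclusive_range, num_classes, width
    return None
```

The point of the sketch is the determinism: given the same raw data and the same rules, the first solution found is unique, so all users produce the same grouped-data frequency table.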
Statement of the Problem
The general purpose of this study was twofold. First, it strove to establish evidence for the efficacy of the iteration algorithm as a process for statistics users constructing their own frequency tables of grouped data. Second, it assessed the effectiveness of the iteration algorithm as an instructional method in contrast to a commonly used method.
To start with, Monte Carlo simulations were used to demonstrate that the outcomes produced by the iteration algorithm are, based on 9 statistical measures obtained directly from the raw data, overall comparable to the corresponding outputs yielded by a leading statistical package, SPSS. More specifically, the absolute deviations between the raw data and the grouped data yielded by the iteration algorithm were compared to the corresponding deviations between the raw data and the grouped data produced by the SPSS algorithm, for each of the 9 statistical measures considered. The main outcomes of the Monte Carlo simulations are reported in chapter 3.
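The deviation measure can be illustrated for one of the 9 statistics, the mean: a grouped-data mean is estimated from class midpoints and frequencies, and its absolute deviation from the raw-data mean is recorded. This is a simplified Python sketch covering a single measure (the simulations in chapter 3 cover all 9); the midpoints and frequencies below are those of Table 4, and the raw-data mean of 60.00 is the value reported in Figure 2.

```python
def grouped_mean(midpoints, frequencies):
    """Estimate the mean from grouped data: sum(n_i * x_i) / n."""
    n = sum(frequencies)
    return sum(m * f for m, f in zip(midpoints, frequencies)) / n

def absolute_deviation(raw_stat, grouped_stat):
    """Deviation as defined in the text: |raw statistic - grouped statistic|."""
    return abs(raw_stat - grouped_stat)

# Table 4 midpoints and frequencies versus the raw-data mean of 60.00
midpoints = [29, 38, 47, 56, 65, 74, 83, 92]
freqs = [5, 7, 14, 15, 16, 12, 7, 4]
dev = absolute_deviation(60.00, grouped_mean(midpoints, freqs))
```

The smaller such deviations are across the 9 measures and across samples, the closer the grouped-data summary stays to the raw data.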
The iteration algorithm was compared only to the one used by SPSS because both algorithms provide unique solutions (the first plausible solution obtained by the iteration algorithm against the default generated by SPSS). Besides, the statistical software SPSS was chosen because, according to Willis and Seung (2006), this package “is the most commonly used statistical software package in the social sciences” (p. 12). Acock (2005) reported that “SPSS software package has emerged as one of the three dominant tools for data management and standard statistical analysis” (p. 1093).
Then, after the students who voluntarily participated in the survey had been exposed to the iteration algorithm and a commonly used approach found in many
statistics textbooks and related literature, an instrument was administered to each group of participants to evaluate not only their attitudes towards either instructional method, but also their ability to use either method.
Since the SPSS algorithm is unknown to most statistics users, the iteration algorithm is compared only to a textbook technique to assess its worth as an instructional device. Many electronic mail messages sent to the SPSS Corporate Headquarters requesting this type of information remained unanswered, and the site www.spss.com does not contain that information either. A letter requesting the SPSS algorithm is provided in Appendix J.
The second aspect of this study intended to assess the effectiveness of the iteration algorithm as a teaching and learning method. Students’ attitudes towards the iteration algorithm were compared to those towards a traditional method. While the
Monte Carlo simulations are necessary in showing the iteration algorithm’s relative accuracy, it is the second aspect that was the focus of the present work. Towards this end, an instrument was developed to explore the effects of the instructional method used
(either the iteration or the traditional method) on students’ attitudes and perceptions concerning the creation of grouped-data frequency tables.
To address these two major issues, 4 research questions were posed. The first research question aimed at assessing the worth of the iteration algorithm as a process. The other 3 research questions were designed to evaluate, among other things, the students’ attitudes towards the instructional methods used, after factoring out differences in the students’ perceived ability in statistics.
Given that the traditional method chosen leads to many plausible solutions, it was not possible to compare the two instructional methods based on students’ self-efficacy. Consequently, there were no specific research questions on self-efficacy.
There is a great deal of literature comparing instructional practices in teaching statistics in general (Bradstreet, 1996; Garfield, 1995; Hogg, 1991; Moore, 1997a; Moore, 1997b; Moore, 2001). In addition, the expected course grade and perceived mathematics ability have been cited by many authors (Bandura, 1986; Galagedera, Woodward, & Degamboda, 2000; Onwuegbuzie, 2003) among the factors contributing either directly or indirectly to students’ perceived ability in statistics.
For instance, in his path analysis model of statistical self-efficacy, Onwuegbuzie
(2003) retained students’ expectations of their performance in statistics examinations as one of the two cognitive variables (the other variable was study habits). The self-efficacy theory and the rationale for initially choosing the expected course grade are briefly discussed below.
The hypothesis that the expected statistics course grade is related to statistics self-efficacy stems from Bandura’s (1977, 1986) theory of social cognition. Bandura defined self-efficacy, a subset of cognition theory, as one’s perceived judgment of one’s ability to perform in certain situations, and went on to state that it is central to human behavior. Because of the cognitive capacity for self-referent thought, Bandura held that individuals are able to evaluate their skills and to report their confidence levels for performance on given tasks. Individuals’ beliefs about their ability to perform at a certain level (that is, self-efficacy) greatly affect their actions (Bandura, 1977).
Simply put, self-efficacy theory predicts that an individual’s belief system influences behavior choice, effort invested, persistence, and task success. According to this conceptualization, people tend to engage in activities that they believe they can undertake, control their efforts, persevere until the intended level of performance is accomplished, and then evaluate their performance against previous expectations.
Consequently, as noted by Onwuegbuzie (2002), students’ expectations of their performance are an important manifestation of their levels of self-efficacy. Thus, it was hypothesized that students’ expectations of success in a statistics examination would significantly predict their eventual performance levels.
Galagedera et al. (2000) further observed a significant positive association between perceived mathematics ability and the expected grade in an elementary statistics course. As a result, students who believe that they have a low level of mathematical ability are likely to put themselves down as poorly performing candidates in elementary statistics courses.
It was therefore the goal of this study to see whether the students’ perceived ability in statistics had any influence on their attitudes towards the instructional method taught when making a frequency table of grouped data. In addition, the study was interested in the relationship between the instructional method selected and the students’ attitudes and background (Geske, Mickelson, Jonson, & Smith, 2000) after removing the effects of their perceived ability in statistics.
Research Questions
For the Monte Carlo simulations aspect, the independent variable entertained is the algorithm used (iteration versus SPSS), examined over 9 statistical measures (mean, median, standard deviation, inclusive range, first and third quartiles, semi-interquartile range, coefficient of relative skewness, and coefficient of variation). The dependent variables are composed of deviations. A deviation is defined here as the absolute value of the difference either between each of the raw-data statistical measures and the corresponding iteration algorithm statistical measure, or between each of the raw-data statistical measures and the corresponding SPSS algorithm statistical measure.
To measure the students’ attitudes towards the instructional method taught, an instrument called Survey of Attitudes towards the Method Used (SAMU) was developed.
As found in the original Survey of Attitudes towards Statistics (SATS) instrument developed by Schau, Stevens, Dauphine, and Del Vecchio (1995), the SAMU instrument has 4 components (affect, cognitive competence, value, and difficulty). These 4 components are the 4 dependent variables (Dauphine, Schau, & Stevens, 1997; Mills,
2004). The two independent variables considered are the instructional method used
(iteration algorithm or traditional method) and the academic department (psychology or mathematics), while the students’ perceived ability in statistics serves as the covariate.
Monte Carlo Simulations
Research question 1: On average, are the square roots of deviations between the 9 statistical measures computed directly from the raw data and the corresponding 9 statistical measures obtained from the grouped data produced by the iteration algorithm less than those obtained from the grouped data yielded by the SPSS algorithm, for varying sample sizes?
This research question had to be answered before the investigations of the others could begin. An answer to this research question may be found in chapter 3.
Survey of Attitudes towards the Method Used
Research question 2: Is there an overall interaction between the academic department and the instructional method used, after adjusting for differences in the students’ perceived ability in statistics among groups?
Research question 3: Are there differences in the mean attitudes scored by students as a function of the academic department, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics?
Research question 4: Are there differences in the mean attitudes scored by
students as a function of the instructional method used, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’
perceived ability in statistics?
Research Hypotheses
This work was meant to examine the effects of the algorithm used and its
interaction with students’ attitudes and their respective academic department in actually
constructing a frequency table for grouped data. Four hypotheses were formulated.
Monte Carlo Simulations
Initial screening of the results suggested that, for each of the 9 statistical
measures, the grouped data produced by using the iteration algorithm were much closer to the corresponding values computed directly from the raw data than were the values yielded by the SPSS algorithm. Based on these findings, the following prediction was made.
Research hypothesis 1: For each of the 9 statistical measures, the square roots of deviations between the raw data and the grouped data yielded by the iteration algorithm are hypothesized to be, on average, less than those between the raw data and the grouped data produced by the SPSS algorithm, for varying sample sizes.
Survey of Attitudes towards the Method Used
Since the SAMU instrument is similar to the original instrument (SATS), it is hoped that the results from the two instruments will be comparable. Based on the results found by Schau, Stevens, Dauphine, and Del Vecchio (1995), Dauphine, Schau, and Stevens
(1997), Schultz, Drogosz, White, and Distefano (1998), and Mills (2004), among others, the following predictions were made.
Research hypothesis 2: It is assumed that there is no overall interaction between the academic department and the instructional method used, after adjusting for differences in the students’ perceived ability in statistics among groups.
Research hypothesis 3: It is conjectured that there are differences in the mean attitudes scored by students as a function of the academic department, as measured by the
4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics.
Research hypothesis 4: It is theorized that there are differences in the mean
attitudes scored by students as a function of the instructional method used, as measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics.
All hypotheses will be tested at the .01 significance level (that is, α = .01). Although it is conventional to conclude that the null hypothesis is false if the probability value is less than .05, more conservative researchers conclude that the null hypothesis is false only if the probability value is less than .01. A smaller α-level (say .01) demands a more extreme test statistic before the null hypothesis can be rejected than a larger α-level (say .05) does. Smaller α-levels therefore run a greater risk of failing to reject a false null hypothesis (a Type II error) and so have less statistical power. The selection of an α-level inevitably involves a compromise between significance and power, and consequently between the Type I error and the Type II error.
The .01 level was chosen for this study for one major reason: it was the significance level used in the article containing the original SATS instrument and in many other articles using that instrument. For comparison purposes, the .01 level of significance was therefore retained in the present study.
Significance of the Study
Differences between the Iteration Algorithm and the Traditional Method
As mentioned before, the iteration algorithm is different from existing methods because it not only works by iterations but also suffices by itself. It has five straightforward formulas that are applied to determine the 4 key elements necessary for constructing a frequency table of grouped data. In practical terms, it iteratively utilizes the five formulas to perform the following:
1. Calculate the inclusive range of grouped data. The formula for the inclusive range of integer data, which is the difference between the largest and smallest observations plus one unit of measurement, is modified here to take account of both the rank of the iteration and the precision of the raw data (both whole numbers and decimals).
2. Decide on the proper number of classes using two formulas. This requires first setting up the minimum number of class intervals and then finding the maximum number of class intervals. The suitable number of classes retained must lie between these two extreme values.
3. Work out the class width. This is done by reaching the first compatible trio
composed of the inclusive range for grouped data, the number of class intervals to
be retained in the distribution, and the corresponding class width. The class width
must be an odd number (having the same precision as the collected data) and be
equal for all class intervals in the distribution.
4. Compute the pertinent starting point (that is, the lowest class boundary). Two new formulas are applied according to whether the rank of the final iteration is an even or odd number.
5. Once the number of classes, each class interval width, and the starting point of a
frequency distribution for grouped data have been determined, they may be used
to create a histogram of the raw data (or to modify a histogram generated by
default by any statistical package). The frequency distribution for grouped data is
formed by listing all the different class boundaries in one column and writing
down their corresponding frequencies in an adjacent column of a frequency table
for grouped data. Other columns may be added as well, such as one for class
midpoints.
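As a rough illustration, the steps above can be sketched in code. The dissertation's actual five formulas appear only in chapter 3, so the class-count bounds (5 to 20 classes), the ceiling-division width rule, the odd-width test, and the half-unit starting point used below are illustrative assumptions, not the author's formulas.

```python
# Hypothetical sketch of an iteration-style grouped-frequency-table builder
# for integer data; the bounds and rules here are illustrative stand-ins.
def grouped_frequency_table(data, min_classes=5, max_classes=20):
    lo, hi = min(data), max(data)
    inclusive_range = hi - lo + 1            # integer data: range plus one unit
    # Step 3: find the first compatible trio (range, k, width) with an odd width.
    for k in range(min_classes, max_classes + 1):
        width = -(-inclusive_range // k)     # ceiling division
        if width % 2 == 1:
            break
    start = lo - 0.5                         # assumed lowest class boundary
    freqs = [0] * k                          # Step 5: tally the frequencies
    for x in data:
        idx = min(int((x - start) // width), k - 1)
        freqs[idx] += 1
    classes = [(start + i * width, start + (i + 1) * width) for i in range(k)]
    return list(zip(classes, freqs))

scores = [28, 35, 41, 47, 52, 58, 63, 69, 74, 80, 85, 93]
for (low_b, high_b), f in grouped_frequency_table(scores):
    print(f"{low_b:.1f} - {high_b:.1f}: {f}")
```

With these made-up scores the search settles on 6 classes of width 11 starting at the boundary 27.5; different bounds or a different width rule would of course produce a different table.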
Benefits Expected to be Gained over the Traditional Methods
The iteration algorithm may benefit statistics users, relative to existing methods, in that it helps eliminate the frustration that comes along with guesswork. It is intended to spare statistics users the many unsuccessful attempts otherwise made before obtaining a satisfactory table for grouped data. This can be achieved in a number of ways; the most important are listed below.
First, the iteration algorithm provides clear rules and relatively easy-to-apply formulas. The rules and formulas to be applied are used to compute, in one or more iterations, the 4 key elements necessary to build a frequency table of grouped data. The 4 key elements are the inclusive range for grouped data, the retained number of classes, the class width, and the starting point. Only five formulas need to be used. As a result, the iteration algorithm leaves no room for arbitrariness or subjectivity.
Second, by providing clear, unambiguous, yet simple directions, the iteration algorithm is posited to yield the same first grouped-data frequency table whenever an identical set of raw data is provided to statistics users. Put another way, statistics users are expected to produce the same frequency table from the same set of raw data by retaining only the first solution. Building a unique frequency table facilitates comparisons with other distributions, based on relative frequencies expressed as percentages.
More importantly, the first solution from the iteration algorithm has a potential time-saving feature not found in most traditional methods. In addition, if performed by hand, the first-and-unique-solution approach is much better than the multiple-solution approach at limiting the number of otherwise unavoidable computation errors.
Third, in the absence of computers, data may still be summarized down to single values by calculating (at least with hand calculators) statistical measures of central tendency, dispersion, and shape. Grouped-data frequency tables obtained from the iteration algorithm can reduce the computations involved in getting those descriptive statistics, such as the mean, median, mode, range, standard deviation, quartiles, semi-interquartile range, coefficient of variation, and skewness and kurtosis coefficients for numbers condensed in classes, whenever computers are not available.
Fourth, the iteration algorithm is complete in that it suffices by itself to determine the 4 key elements necessary for constructing a table for grouped data. No self-sufficient traditional method was found.
Fifth, by introducing a special tally in the grouped-data frequency table, the iteration algorithm retains all of the information contained in the raw data set.
Finally, statistics users participating in the construction process may get a sense of re-inventing the technique. It is imagined here that statistics users understand better what is going on because they are expected to actively participate in the step-by-step process of building their own grouped-data frequency table. Furthermore, from personal experience, when students learn by reinventing a technique, they gain a deeper understanding of the problem at hand, which, in turn, helps them recall the material learned more easily. By contrast, statistics users typically have no idea how the different algorithms utilized by statistical packages work.
Given the current invasion of graphics in the media and printed publications, students would make more sense of pictures such as histograms if they knew how to construct them themselves, or at least how to modify those generated by the default settings of statistical software packages. As Bakker (2004) put it, “students need, of course, not know exactly which operations the software does to make a histogram, but they should be able to understand that the data are categorized into certain intervals and that the bars’ areas are relative to the number of values in those classes” (p. 12). Bakker (2004) went on to observe: “In most statistical software packages such operations are hidden, which suggests that we need special software for education that minimizes this black-box character” (p. 12).
In particular, Shaughnessy (1995) stressed that “the importance of helping students learn to transform information from graphs, and to create their own graphs to display raw data, was one of the top priorities among the many issues raised with respect to graphicacy” (p. 217). In either case, the iteration algorithm may serve as a basis for developing such special educational software that minimizes the aforementioned black-box character, at a minimum by helping students create their own histograms.
Even if it turns out that the iteration algorithm is no better for statistics users than the traditional method, the method will still potentially have advantages for those who are intolerant of ambiguity. According to Dugas, Gosselin, and Ladouceur (2001), cited by DeRoma, Marin, and Kessler (2003), “intolerance for ambiguity has been associated with a number of anxiety-related problems, including worry, obsessions / compulsions, and panic sensations in response to uncertainties introduced in the classroom” (p. 104).
The expectation is that this systematic approach would, by reducing the frustration level, help statistics users learn the iteration algorithm better than the most commonly used methods. It is also anticipated to improve their attitudes towards the iteration algorithm. Finally, it may carry some sound learning and teaching benefits as well.
The iteration algorithm presents some disadvantages. For instance, it does not always yield statistical measures closer to those computed directly from the raw data than the traditional method does. Furthermore, the iteration algorithm may at times require numerous iterations, whereas the traditional method may reach a solution on the very first trial.
Summary of Major Differences and Benefits
Table 5 summarizes the major differences and benefits of the iteration algorithm relative to the traditional method.
Table 5
Major Differences and Benefits of the Iteration Algorithm over the Traditional Methods
Traditional methods: The proposed guidelines or rules are vague or ambiguous. As a result, they leave a lot of room for subjectivity, arbitrariness, or intuition.
Iteration algorithm: Clear step-by-step instructions are provided. As such, they are less confusing because no personal judgment is involved; only formulas need strictly to be applied.

Traditional methods: In particular, most of the key elements are determined in a trial-and-error manner. This turns out to be an open door to multiple (plausible) solutions since it favors guessing.
Iteration algorithm: All of the key elements, in particular the chosen number of classes, are determined in a systematic fashion. Therefore, no guesswork is involved, since only the first plausible solution is retained as a unique (plausible) solution (for teaching purposes).

Traditional methods: Potentially frustrating and time consuming because of possible multiple unsuccessful trials.
Iteration algorithm: Potentially less frustrating and less time consuming due to the use of a limited number of successive iterations.

Traditional methods: Various formulas are available, but there is no accepted consensus about which existing formula works best for any particular key element.
Iteration algorithm: Only one formula may appropriately be used at a time for each of the 4 key elements for creating a frequency distribution of grouped data.

Traditional methods: Rules proposed by most authors are not complete. These rules often need to be combined with other available ones to get the 4 key elements: inclusive range, number of classes, class width, and starting point.
Iteration algorithm: The iteration algorithm suffices by itself to determine the 4 key elements necessary for constructing a table for grouped data.

Traditional methods: Individual information is lost.
Iteration algorithm: Individual information is preserved thanks to the special class tally.

Traditional methods: Black-box character.
Iteration algorithm: No black-box character.
Limitations and Delimitations of the Study
Due to a number of constraints, such as money, time, availability of respondents, and resources, there are six major limitations, which follow.
First, all data were obtained via self-report. Students’ attitudes towards the instructional (traditional or iteration) method taught were collected through an instrument that is a slightly modified version of the SATS instrument.
Second, a limited range of subjects participated in this project. Only students enrolled in two elementary statistics courses at Ohio University, Athens campus, were surveyed. The surveyed students were drawn from the accessible academic departments only and might not have been homogeneous in terms of their background in mathematics, for instance. In particular, the pilot study dealt with graduate students from the College of
Education, whereas the actual study used undergraduates from the mathematics and psychology departments.
Third, the instrument was administered once. Students’ attitudes were measured only at the beginning of a one-trimester course. The non-response rate was 15%, and no follow-up assessments were conducted at any other time.
Fourth, very few variables were incorporated in this study. To reduce the level of sophistication involved, only 4 dependent variables, two independent variables, and one covariate were included in the research questions. Other aspects of interest are discussed, but not in detail.
Fifth, there was no possibility of random assignment to instructional methods received. All departmental class sections were considered as intact groups. As such, in each department, some class sections were given a traditional instructional method, while others were taught the iteration algorithm.
Sixth, from the review of the related literature, there is no precise way of generalizing from the convenience sample to any type of population (Wiersma, 2002).
This means that the generalizability of the findings would be limited to the characteristics of the subjects. This does not mean that the findings are not useful; it simply means that caution is needed in generalizing the results (McMillan & Schumacher, 2001).
Definitions of Terms
Attitude
An attitude is a hypothetical construct that represents an individual’s like or dislike for an item. Attitudes are relatively stable, intense feelings that are the consequence of positive, negative, or neutral experiences over time in learning a topic (in the present case, the instructional method taught).
Class Boundary
A class boundary (or real class limit or exact limit of the class interval, Hinkle et al., 2003, pp. 24-25) is a number that does not occur in the sample data, but is rather a value that falls midway between the upper limit of one class and the lower limit of the next-higher class. Bluman (2004) gives a basic rule of thumb to find class boundaries when given the class limits. “The class limits should have the same decimal place value as the data, but the class boundaries have an additional place value and end in a 5”
(Bluman, 2004, p. 34). For example, in Table 4, the first lower class boundary is 24.5 (25
- 0.5), the first upper class boundary is 33.5 (33 + 0.5), the second lower class boundary is 33.5, the second class upper boundary is 42.5, and the last upper class boundary is
96.5.
Class Frequency
The class frequency or simply frequency for a given class is “equal to the number of observations in the data set that fall within the class boundaries” (Madsen &
Moeschberger, 1983, p. 231). In Table 4, for instance, frequencies are represented under the ni column.
Class Interval
A class interval (or simply class) is a block of score values used to group scores
together (Shavelson et al., 1996, p. 39). For example, in Table 4, the first class interval is
(25 – 33) and the second one is (34 – 42). Because the class interval refers to actual data collected, the interval begins and ends with the class limits rather than the class boundaries.
Class Limits
The class limits for a given class are “the smallest and largest values that might
occur in the data and that will be put into that class” (Madsen & Moeschberger, 1983, p.
230). The class limits have the same measurement precision as the raw data collected.
Class Midpoint
A class midpoint (or class mark) is a numerical value that is exactly in the middle
of each class. It is halfway between the limits of the interval (Thorndike & Dinnel, 2001,
p. 32). It is found by adding the upper and lower class limits and dividing this total by
two. It can also be found by adding the upper and lower boundaries and dividing this total
by two.
For purposes of further mathematical analysis, all observations belonging to a
given class interval are assumed to coincide with the class mark. That is, “the class mark
is the score given to all values that fall within the interval” (Spiegel & Stephens, 1998, p.
37). In Table 4, for instance, the first class mark is 29, the second is 38, and the last one is
92.
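The two equivalent computations just described can be checked directly; the values come from the Table 4 example above.

```python
# Class midpoint: the average of the class limits, or, equivalently,
# the average of the class boundaries.
def midpoint(lower, upper):
    return (lower + upper) / 2

print(midpoint(25, 33))       # from the class limits: 29.0
print(midpoint(24.5, 33.5))   # from the class boundaries: 29.0
```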
Class Width
A class width, or class length or class interval size, for a given class is the difference between its upper class boundary and its lower class boundary; in most frequency distributions it is constant across the classes. In such a case, the class length is equal to the difference between two successive lower class limits or two successive upper class limits. For instance, the class limits (25 – 33) become the boundaries (24.5 – 33.5), that is, 25 - 0.5 = 24.5 and 33 + 0.5 = 33.5, respectively (0.5 being one-half of one unit of measurement).
Therefore, the class width is equal to 33.5 - 24.5 or 9 and not 33 - 25, which is equal to 8. In other words, a class interval size is not calculated as the difference between the upper and lower class limit of the same class.
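The distinction drawn above (boundaries of one class, not limits of the same class) is easy to verify numerically:

```python
# Class width from boundaries, and equivalently from successive lower limits.
lower_boundary, upper_boundary = 24.5, 33.5
width_from_boundaries = upper_boundary - lower_boundary   # 9.0
width_from_limits = 34 - 25        # next class's lower limit minus this one's: 9
wrong_width = 33 - 25              # 8: NOT the class width
print(width_from_boundaries, width_from_limits, wrong_width)
```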
Coefficient of Variation
In the words of Weiers (2008), “in probability theory and statistics, expressing the standard deviation as a percentage of the mean, the coefficient of variation indicates the relative amount of dispersion in the dataset” (p. 82). If s is the sample standard deviation of a set of data xi (i = 1, 2, …, n) and x̄ is its sample mean, then the sample coefficient of variation (cv) is expressed as the ratio of the sample standard deviation to the sample mean: cv = s / x̄.
There is an equivalent definition for the coefficient of variation of a population,
which is based on the expected value and the standard deviation of a random variable.
The population coefficient of variation is denoted by CV. It is defined as the ratio of the population standard deviation σ to the population mean μ: CV = σ / μ.
It is often reported on a scale of 0 to 100% by multiplying the above ratio by 100. The coefficient of variation is a dimensionless number that allows comparison of the variation of measurements from different populations. In other words, as expressed by Sullivan (2007), “it allows for the comparison of two or more variables measured on a different scale” (p. 155).
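A minimal sketch of both points (the cv formula and its use for comparing variables on different scales), using Python's standard library; the two small data sets are invented for illustration.

```python
import statistics

# Sample coefficient of variation: cv = s / x-bar, often reported as a percent.
def coefficient_of_variation(data):
    return statistics.stdev(data) / statistics.mean(data)

heights_cm = [170, 165, 180, 175, 160]   # made-up measurements
weights_kg = [70, 62, 88, 75, 58]
print(f"heights: {coefficient_of_variation(heights_cm):.1%}")
print(f"weights: {coefficient_of_variation(weights_kg):.1%}")
# Despite the different units, the cv shows which variable is relatively
# more dispersed.
```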
Frequency Distribution
A frequency distribution shows, according to Huck (2004), “how many people (or animals or objects) were similar in the sense that, measured on the dependent variable, they ended up in the same category or had the same score” (p. 18). Two kinds of frequency distributions are discussed in this study: simple (or ungrouped-data) frequency distribution and grouped-data frequency distribution.
Grouped Data
When a few adjacent score values are combined into classes, they are called grouped data. Data like those presented in the first column of Table 4 are called grouped data.
Grouped-data Frequency Distribution
A grouped-data frequency distribution (like the one of Table 4) is, as defined by
Shavelson et al. (2000), “a table listing scores grouped into non-overlapping class intervals of equal size or equal width (rather than individual scores) along with the frequency of scores falling into each class interval” (p. 39). A grouped-data frequency distribution is “almost exclusively used when the dependent measure is an interval-level or a ratio-level variable” (Harris, 1998, p. 74).
Iteration Algorithm
The iteration algorithm is a structured tactic with clear rules and five formulas applied in a limited number of successive iterations to create a frequency table for grouped data. The grouped-data frequency distribution is obtained without guessing, subjectivity or arbitrariness.
Range
Perhaps the simplest, and most obvious, measure of the dispersion in a set of numerical observations is the difference between the largest and the smallest values. This is known as the non-inclusive range, exclusive range, or, simply, range. From Table 2, it can be seen that the highest grade is 93, and the lowest is 28. The subtraction of 28 from
93 yields 65. The (non-inclusive) range therefore is 65 (i.e., 93 - 28).
For whole numbers, the inclusive range is the non-inclusive range plus 1 (one unit of the measurements observed). For the data in Table 2, the inclusive range is computed as: Inclusive range = (Non-Inclusive) Range + 1 = 65 + 1 or 66.
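The whole-number computation above is trivially reproduced in code (the decimal-data formula is deferred to chapter 3, so it is not attempted here):

```python
# Non-inclusive and inclusive range for the whole-number grades of Table 2.
highest, lowest = 93, 28
non_inclusive_range = highest - lowest       # 65
inclusive_range = non_inclusive_range + 1    # 66: one unit of measurement added
print(non_inclusive_range, inclusive_range)  # prints: 65 66
```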
For decimals, a formula for the inclusive range is provided by the iteration algorithm. This formula is reported in chapter 3, under the method development section of the iteration algorithm. An example dealing with decimals may be found in Appendix G.
Raw Data
Raw data or ungrouped data or raw scores are the observed or collected values in their original form. That is, data that have not been manipulated, summarized, organized, or treated numerically (Spiegel & Stephens, 1998) in any way beyond their original collection form. For instance, Table 1 lists raw data.
Simple Frequency Distribution
A simple (or ungrouped-data) frequency distribution simply lists, in ascending order (from the lowest to the highest), each possible or actual score, along with the number of times that score appeared or occurred (e.g., Table 3). The raw data are not grouped into classes or intervals. A simple frequency distribution is “frequently the most useful way to present data when the dependent measure is a nominal-level variable”
(Harris, 1998, p. 74).
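A simple frequency distribution of this kind is straightforward to produce with the standard library; the scores below are made up for illustration.

```python
from collections import Counter

# List each actual score in ascending order with the number of times it occurs.
scores = [3, 1, 2, 3, 3, 2, 1, 4]
for score, count in sorted(Counter(scores).items()):
    print(score, count)
```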
Traditional Method
A traditional method, or generate-and-test algorithm, is any existing technique with vague suggestions or guidelines and an unspecified number of formulas applied on the spur of the moment to build a frequency distribution of grouped data. It is an unstructured tactic in that the grouped-data frequency table is constructed by trial-and-error approaches based on guesses. The traditional method illustrated in this study is the algorithm proposed by Vaidyanathan and Vaidyanathan (1987, pp. 9-11) and Mason and Lind (1996, pp. 8-12). This method is presented in detail in chapter 3.
Organization of the Study
This study is organized in the following manner:
Chapter 1 introduces the background and objectives of the study. In this chapter, the rationale for developing the iteration algorithm, the statement of the problem, the research questions, the significance of the study, the limitations and delimitations of the study, the definitions of terms, and the organization of the study are covered.
Chapter 2 presents a review of the literature relevant to the 4 key elements necessary to build a grouped-data frequency table. Perspectives on teaching and learning statistics are also discussed.
Chapter 3 includes the development of the two instructional methods. Also included are the main results yielded by the Monte Carlo simulations.
Chapter 4 describes the methodology used for the study, which comprises the sampling plan, the instrument, the pilot study results, data collection and data analysis procedures, and an explanation of the analysis of the data.
Chapter 5 contains the data analysis and presentation of the findings of the study.
The results of the study are arranged into 3 sections.
In chapter 6 a summary and interpretations of the findings of the study are given.
Also, recommendations and suggestions for further research and practice are included.
Finally, 13 appendices are attached at the end. Appendix A contains an informed consent form. Appendix B includes the Ohio University consent form template.
Appendix C is the first exercise with integers. Appendix D and Appendix E show how the traditional method and the iteration algorithm, respectively, work in practice with these integers. Appendix F shows a questionnaire of the Survey of Attitudes towards the
Method Used (SAMU) instrument. An exercise dealing with decimals appears in
Appendix G. Appendix H and Appendix I illustrate how to construct a grouped-data frequency table using the iteration algorithm and the SPSS algorithm, respectively, for non-integers. A copy of the e-mail message sent to the SPSS Corporate Headquarters representatives is attached as Appendix J. A list of abbreviations used in the text is incorporated in Appendix K. In Appendix L, there is a sample set of 8 scatter diagrams of
data for pairs of dependent variables. Finally, additional output from the exploratory factor analysis of the final study data is reported in Appendix M.
CHAPTER 2: REVIEW OF THE LITERATURE
Introduction
This study initially focused on evidence for the efficacy of the iteration algorithm as a process. A review of the relevant literature in the area of data reduction into frequency tables is presented, moving quickly from simple (ungrouped-data) frequency distributions to frequency distributions for grouped data. Cases where frequency tables for grouped data show an edge over simple frequency tables are analyzed. At the grouped-data frequency distribution level, the review examines what is faulty with the existing algorithms and then explains how the iteration algorithm differs from them.
A second and main area of concentration was assessing the effectiveness of the iteration algorithm as a teaching method, chiefly based on its two key features: the algorithm provides a unique solution, and it is well structured, which leaves no room for subjectivity or arbitrariness. Ultimately, the question that needs an answer is: Will some students (or, in general, some statistics users) show more positive attitudes towards grouping quantitative continuous data with the iteration algorithm than with a traditional approach (found in many basic statistics textbooks)?
Toward this end, the learning and instruction literature on these educational issues was examined. More specifically, literature on unique-solution versus multiple-solution algorithms, and on structured versus spur-of-the-moment methods, was investigated from the teaching and learning perspectives.
As stated in chapter one, 4 elements, namely the inclusive range, the number of classes, the class interval width, and the starting point, are essential for creating a frequency table for grouped data. As such, the literature review contains a specific review of the literature on these 4 elements. A special section reviews references that support the teaching and learning merits of the iteration algorithm. The literature review ends with a short summary of the most relevant aspects of the topics presented. Following this review of the related literature, the development of both the iteration algorithm and the traditional method is traced.
Review of the Relevant Literature
The literature about how to create a grouped-data frequency table is abundant. As a matter of fact, virtually all basic statistics textbooks cited in this dissertation contain at least a section discussing the construction of frequency distributions for grouped data. Unfortunately, these textbooks give only suggestions or guidelines, which are vague because they leave an open door to guessing, subjectivity, or arbitrariness. The reference list found in the appendices enumerates more than 150 such books containing ambiguous principles.
In contrast, the available literature on how to construct a grouped-data frequency table from start to finish is scant. This is probably due to the advent of powerful statistical packages, which can perform major statistical analyses in no time, even with extremely large data sets.
This section starts by reviewing the aforementioned vague guidelines and states what is confusing about the existing algorithms. Then, it moves on to discuss how some statistics users might learn better to group continuous data with the iteration algorithm than with a traditional approach (found in many basic statistics textbooks). How the learning and instruction literature (both theoretical and empirical) treats these issues is then explored. Finally, the findings are summarized.
Vague Guidelines
There is still a dearth of studies in the literature addressing the issue of grouping data into frequency tables. The literature reviewed revealed no clear agreement about which existing formula works best. The instructions provided by the existing methods for creating grouped-data frequency tables are ambiguous in that there is no exclusive way to interpret the key words found in them. The advice proffered by statistics textbooks is vague because of liberal use of key terms such as “personal judgment” (Moore, 1997a, p. 16; Moore, 2000, p. 10), “common sense”, “good intuition”, “past experience” (Doane, 1976, p. 181), and “educated guesses” for selecting a desired and/or appropriate number of classes (e.g., Fox et al., 1993, p. 37; Gravetter & Wallnau, 1992, p. 31), or “nice, convenient or simple number” when deciding on the length or width of each class interval (e.g., Gravetter & Wallnau, 1992, p. 31; Levin & Fox, 2004, pp. 26-27).
Fox et al. (1993, p. 37) and Levin and Fox (2004, pp. 26-27), among others, deplored that the nonexistence of a standard set of objective and well structured guidelines poses a serious problem in particular to beginning statistics students, who generally find it challenging to construct class intervals on their own. They contended that this is a skill that develops only with practice. The lack of a clear-cut procedure leaves room for arbitrariness and subjectivity even among skilled behavioral researchers.
In fact, there is considerable flexibility of choice from the available options offered and
the decision taken draws heavily on each investigator’s judgment, intuition, or experience
(Doane, 1976).
The lack of a standard set of objective and well-structured guidelines turns out to be a source of bias, because even when the same guidelines are followed, they might be interpreted differently according to personal preferences or research needs.
Presenting Data in a Frequency Table
A number of other statistics authors recognize the importance of grouping data into frequency tables. Freund (1998), who is one of them, said that “the most common method of summarizing data is to present them in condensed form in tables or charts, and at one time this took up the better part of an elementary course in statistics” (p. 15). He went on to remark: “nowadays there is so much else to learn in statistics that very little time is devoted to this kind of work. In a way this is unfortunate, because one does not have to look far in newspapers, magazines, and even professional journals to find unintentionally or intentionally misleading statistical charts” (Freund, 1998, p. 15).
Graham (1987) encapsulated 4 fundamental purposes of data analysis in a statistical investigation: (1) describing data, (2) summarizing data, (3) comparing and contrasting two or more data sets, and (4) generalizing about a population or predicting the next case. He recommended that grouped-data frequency tables, as well as histograms, be used as representational tools for the first, third, and fourth purposes. He suggested that summary statistics related to center, spread, and shape are most helpful for the second purpose.
Presenting Data in a Simple Frequency Table
When examining a very large set of raw data, it is almost impossible to get a clear picture of the data without organizing and summarizing them in some sort of logical and orderly fashion. “Even identifying the smallest number and the largest number may require spending a few minutes, let alone learning about how the numbers are distributed between these two extremes” (Keller, 2001, p. 22).
According to Hinkle et al. (2003) and MacDonald (1982), one of the traditional and simplest procedures for organizing and summarizing data into a meaningful representation is the simple, or ungrouped, frequency table. Weiss (2004) noted that by “suitably organizing data, we can often make a large and complicated batch of data more compact, easier to work with and understand” (p. 44). Data grouped in a simple frequency table allow the investigator to “determine very quickly where most of the items are concentrated” (L’Esperance, 1971, p. 24), and, as Hamburg (1989) said, “if a frequency distribution of the individual figures is formed, many features of ungrouped data become readily discernible” (p. 9).
Spiegel (1961) observed that an “important advantage of a frequency distribution is gained in the clear ‘overall’ picture which is obtained and in vital relationships which are thereby made evident” (p. 27). Montcalm and Royse (2002) listed frequency tables as one of at least 4 ways in which data can be summarized, the other 3 being narration, graphs, and cross-tabulation tables.
For their part, Hopkins, Glass, and Hopkins (1987) and Minium (1978) argued that statistical information can be understood more easily and interpreted more accurately if it is organized into tables and displayed in graphs. As
explained in the previous chapter, as long as there are 20 or fewer scores, ordering all scores and/or listing each individual score and indicating its frequency may be sufficient.
One final advantage of a simple frequency distribution is that “we are able to achieve all of this convenience without any loss of precision” (Raymondo, 1999, p. 50).
Presenting Data in a Frequency Table for Grouped Data
With a large data set, a simple frequency table may be too long and impractical. A grouped-data frequency distribution can make information even more directly understandable than a simple frequency distribution can. This is because the values contained in a grouped-data frequency table are more readily grasped and comprehended (see, for instance, Aron & Aron, 2003, p. 9; Mansfield, 1980, p. 18), yet without obscuring the essential information contained in each individual value (Gravetter & Wallnau, 1985, p.
37; Sanders et al., 1976, p. 47).
Advantages of Presenting Data in a Grouped-data Frequency Table. Carlisle (1985) and Hinkle et al. (2003) advised presenting hefty datasets in the form of a grouped-data frequency table for ease of use: reading, presenting, interpreting, and/or handling. Ravid (1994), among others, contended that frequency tables for grouped data are useful for compressing a substantial amount of data into a relatively small space by combining a few adjacent values into class intervals, thereby avoiding a very long list of scores. Berenson and Levine (1998, pp. 22-23) and Vouch (1982, p. 28) further stressed that whenever a set of collected data contains about 20 or more observations, the best way to examine such a mass of data is to present it in summary form by constructing appropriate tables and charts. The important features of the data can
be extracted from these tables and charts, and then be neatly presented to a general audience.
Elifson, Runyon, and Haber (1990, p. 56) made clear that the reasons for grouping data into a frequency table are threefold. First, unless computers are available, it is uneconomical and unwieldy to deal with a large number of cases spread out over many scores. Second, some of the scores have such low frequency counts associated with them that maintaining these scores as separate and distinct entities is not justified. Third, categories or classes provide a concise and meaningful summary of the data.
For his part, MacDonald (1982, p. 1) noted three main advantages of presenting data in a grouped-data frequency table. They are as follows.
First, a grouped-data frequency table is a necessary preliminary to a graphical representation of the quantitative data. Second, the approximation of mean and standard deviation can be calculated by frequency weighting the class intervals. Third, the approximate median, quartiles, deciles, and percentile ranks can be calculated by linear interpolation of the class intervals. The exact determination of these latter statistics requires the ordering of all data points.
Bluman (2004) set out three other advantages. Grouped-data frequency tables are constructed: (a) to organize the data in a meaningful, intelligible display, (b) to enable the reader to determine the nature or shape of the distribution, and (c) to enable the researcher to draw charts and graphs for the presentation of data. There is also a surprising fact that runs contrary to general intuition. According to Châtillon,
Gélinas, Martin, and Laurencelle (1987), when it comes to estimating percentiles of a distribution from a sample, “the observed percentiles provide in many cases better
estimates when they are based on grouped data than when based on the raw data” (p.
395). They also established that “the estimations are even more accurate when the data set is relatively small and the population distribution is not too skewed” (p. 395).
Disadvantages of Presenting Data in a Grouped-data Frequency Table. There are also a number of rather obvious disadvantages to the use of the grouped-data frequency table. A major disadvantage is the loss of precision. By grouping scores into class intervals, the individual data values are no longer identified (Fraenkel & Wallen,
2003) or, equivalently, as Shavelson et al. (1996) expressed it, “the link between the identity of each subject and his data is lost” (p. 38). In other words, after grouping raw data into class intervals, it is no longer immediately evident, for instance, how many times a particular value occurs, or whether it occurs at all. In particular, there is no way to tell “what was the exact score that was obtained by each subject unless we go back to the original or raw data” (Ravid, 1994, p. 33). This trade-off between usefulness (more compactness) and precision (more detail), as noted by Healey (2002, p. 32), is common in data analysis. More compactness is achieved by a smaller number of wide categories, while more detail is obtained by a greater number of narrow categories.
Bluman (2004) furnished three main disadvantages. According to him, grouped-data frequency distributions can: “(a) Reveal little about the actual distribution, skew, and kurtosis of data, (b) Be easily manipulated to yield misleading results, and (c) De-emphasize ranges and extreme values, particularly when open classes are used” (p. 33).
MacDonald (1982) conceded that: “Since the original data are lost in the grouping process, exact calculations of the mean and standard deviation are impossible” (p. 1). He went on to add: “Precise determination of the median, mode, quartiles, deciles, and
percentile ranks are likewise impossible, as these statistical measures require the original data in ordered form” (p.1). Runyon and Haber (1984) listed an extra disadvantage:
“Some small errors in statistics based upon grouped scores are unavoidable” (p. 50).
Glasnapp and Poggio (1985) called this loss of information grouping error, which
“results from the presentation of information that is one step removed from the exact raw score values” (p. 63). The “information regarding counts of individual scores is therefore lost” (Shavelson, Ruiz-Primo, & Mitchell, 2000, p. 39). That is, the original observations cannot be reproduced exactly from the frequency distribution of grouped data (Ferguson,
1989, p. 21). This was long ago recognized by Davies (1929), who stated that “the grouping process may give rise to a considerable degree of distortion of the original data”
(p. 349).
Summary of Advantages and Disadvantages. In summary, when scores are grouped by class intervals, a trade-off is made in frequency distributions for grouped data between more detail (a greater number of narrow categories) and more compactness (a smaller number of wide categories) (Healey, 2002, p. 32). More detail comes at the expense of convenience (summary), and more compactness results in the loss of information (grouping error) (see, among other books, Healey, 2002; Kolstoe, 1969;
May, Masson, & Hunter, 1990).
Shavelson et al. (1996) emphasized this point by stating that although the grouping process generally destroys much of the original detail of the data, an important advantage is gained in “economy and clarity in the representation of the data” (p. 39).
When so much information is presented, Stockburger (2001) reasoned that “it is difficult to discern what the data are really like, or to ‘cognitively digest’ the data. That is why we
need to group them into classes. But by doing so we may lose information regarding counts of individual scores, while we gain understanding about distributions” (p. 2).
Spiegel and Stephens (1998) made a case as follows: “There is a gain in economy since the table is smaller than the original raw data distribution and it is easier to see how the scores are distributed” (p. 37). They went on to add that “a clear overall picture is obtained and the vital relationships are thereby made evident” (p. 37).
Additionally, Mosenthal and Kirsch’s works (1990a, 1990b) highlighted the links between well-structured tables and graphical representations. Friel, Curcio, and Bright
(2001) insisted that, “apparently attention to the use of grouped-data frequency tables as transition tools for organizing information to be represented graphically is needed” (p.
126) to promote understanding. Because one can move back and forth between frequency tables and graphs, having both types of information has inherent advantages. In conclusion, as Frankfort-Guerrero (2000) put it, “while the construction of a frequency distribution can be done by a computer, it is important to go through the process to understand how frequency distributions are actually put together” (p. 38).
Presenting Data in a Graph
Minium (1978) mentioned that “when a graph is to be made from the distribution, grouping the scores helps present a more nearly regular appearance” (p. 30). By the same token, Hopkins, Glass, and Hopkins (1987) averred that the interpretation of even massive amounts of data is facilitated by tables and graphs. Tables and graphs are statistical synopses of information.
A picture is worth a thousand words. When it comes to numbers, a not-yet-well-known saying is “a graph is better than a thousand numbers” (Spatz, 2005, p. 30). Keller
and Warrack (2003) and Triola (2001) explained that although the grouped-data frequency distribution provides information about how the numbers are distributed, the information is more easily understood and imparted by drawing a picture or graph.
Histogram
Many times, a graphical display says much about the distribution of data. A frequency distribution for grouped data can readily be displayed by a graph called a histogram (Rhodes, 2000, p. 11). A histogram is the most common graph of the distribution of one quantitative variable and also the most popular method of obtaining information (see, for instance, in Moore, 2000, p. 8; Triola, 2001, p. 12; Wand, 1997, p.
59). A histogram is created by drawing rectangles whose bases are the class intervals (or class midpoints), which are shown on the horizontal axis (the abscissa), and whose heights are the frequencies of those individual classes. The width of each rectangle should be the same, and the rectangles should touch each other. The frequencies of each class interval
(tallies) are shown on the vertical axis (the ordinate).
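The construction just described can be sketched in code. The following Python snippet, using hypothetical class limits and frequencies, prints a crude text rendering in which each class interval becomes a bar whose length equals its frequency; a real histogram would of course draw touching rectangles of equal width.

```python
# A minimal sketch of how a histogram mirrors a grouped-data frequency
# table: each class interval becomes one bar whose length is the class
# frequency. The intervals and counts below are hypothetical examples.

def text_histogram(intervals, frequencies):
    """Return one text row per class interval; '#' marks equal the frequency."""
    lines = []
    for (lo, hi), f in zip(intervals, frequencies):
        lines.append(f"{lo:>3}-{hi:<3} | {'#' * f} ({f})")
    return "\n".join(lines)

intervals = [(60, 69), (70, 79), (80, 89), (90, 99)]   # hypothetical classes
frequencies = [3, 7, 5, 2]                              # hypothetical tallies
print(text_histogram(intervals, frequencies))
```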
Milton, McTeer, and Corbet (1997) pointed out that “the histogram is especially useful for large data sets” (p. 22). Follettie (1980) and Friel, Curcio, and Bright (2001) suggested that a histogram generally portrays insights into the overall shape of a distribution if appropriate intervals are chosen for scaling. For some distributions, though, the features of a dataset (i.e., information on its location, spread, and shape) can be disguised or distorted by a particular choice of class width (Gravetter & Wallnau, 1985, p. 37; Keller, 2001, p. 22). Isaac and Michael (1997) implied that too narrow a class width in a histogram would “compress the data, concealing meaningful changes in
its shape” (p. 166). In contrast, they conceded that too broad a class width would “stretch out the data creating unnecessary gaps” (p. 166).
Arsham (2001), for example, summarized the general information that can be extracted from data through histograms. The six types of information follow.
1. Central location of the data: A value representing the middle, or the typical, or the
average value of a group.
2. Shape: The extent to which the distribution is skewed. That is, whether it is symmetrical, approximately symmetrical, left-skewed, or right-skewed around a central location (mean, median, or mode).
3. Dispersion: The degree of spread away from the central location.
4. The degree of peakedness (kurtosis): How steeply it rises and falls.
5. Gaps: Non-homogeneous groups.
6. Outliers: Observation values that fall far from the rest.
Being able to “see” what the quantitative data look like can be a real advantage of a histogram. However, histograms have their limitations just as grouped-data frequency distributions do. In fact, histograms may even be more limited if they are used alone and as substitutes for grouped-data frequency distributions rather than as accompaniments (Raymondo, 1999, p. 67). SPSS, for instance, does not provide grouped-data frequency tables. Therefore, a combination of histograms and grouped-data frequency distributions is ideal to provide a much clearer picture of the data.
Traditional Methods
To construct a frequency table for grouped data, most introductory statistics books agree on determining four crucial elements. These four elements, again, are the range, the number of class intervals, the class interval width, and the starting point. In the following, how existing (traditional) methods determine these elements, and why they are faulty, is examined.
Inclusive Range
There is a general consensus that the range constitutes the beginning point for constructing a frequency distribution for grouped data; see, for instance, Anderson,
Sweeney, and Williams (2004), Aron and Aron (2003), Bartz (1979 and 1999), Fox,
Levin, and Harkins (1993), Glasnapp and Poggio (1985), Hoaglin (1983), Ravid (1994),
Shavelson, Ruiz-Primo, and Mitchell (2000), Spence, Cotton, Underwood, and Duncan
(1990), Summers and Peters (1973). Unfortunately, two slightly different formulas have been proposed to compute it. On one hand, the range of a batch of scores is defined as the difference between the largest and the smallest scores. Some authors call this the ‘simple’ or
‘non-inclusive’ range.
On the other hand, the majority of elementary statistics textbooks recommend determining the range as the difference between the largest and the smallest scores plus one (for whole numbers). Some authors call the latter the ‘inclusive’ range. Why is 1 added to obtain the inclusive range? Schmidt (1975) explained that the 1 should be added so that the range covers both the maximum and minimum scores as well as the distance between them. In that way, the two extreme values are included in the range.
No specific formulas are provided to calculate the inclusive range for decimals.
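The two definitions can be sketched as follows; the scores are hypothetical, and the plus-one adjustment applies to whole-number data only, as noted above.

```python
# A minimal sketch of the two range definitions discussed above,
# assuming whole-number scores (the text gives no formula for decimals).

def simple_range(scores):
    """Non-inclusive ('simple') range: largest score minus smallest score."""
    return max(scores) - min(scores)

def inclusive_range(scores):
    """Inclusive range for whole numbers: adds 1 so that both extreme
    scores are counted as covered (Schmidt, 1975)."""
    return max(scores) - min(scores) + 1

scores = [12, 47, 23, 35, 19]        # hypothetical data
print(simple_range(scores))          # 35
print(inclusive_range(scores))       # 36
```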
Number of Class Intervals
Vague Guidelines. After determining the (inclusive or non-inclusive) range of the score distribution, the big question before a set of data can be converted into a grouped-data frequency table is determining the number of class intervals (Aron & Aron,
1997, p. 7; Spence, Cotton, Underwood, & Duncan, 1990; Wessel & Willett, 1960, p.
80). The number of class intervals to be used is primarily dependent on the number of observations in the data set, but also on the range of observed scores (Berenson &
Levine, 1998).
Anderson (1989), Aron and Aron (2003), Freund (1998), Jarrett and Kraft (1989),
Johnson and Bhattacharyya (1985), Khazanie (1986), Kooker and Robb (1982),
Mansfield (1986), Rees, (1989), Sanders (1990), Witte and Witte (1997), among other authors, cautioned that when setting up a grouped-data frequency table, it makes a big difference how many intervals are used. To mention just a few, Bartz (1999, p. 24) and
Kerr et al. (2002, p. 11), for example, specified that the choice of the number of intervals and of the size of the interval is quite arbitrary. It is even highly subjective (Spatz, 2005, p. 30), as it is a matter of personal judgment (Lapin, 1987, p. 16; Hamilton, 1990, p. 24).
Guidelines exist to help researchers with this, but they remain vague in that they are subject to personal interpretation, taste, and preference.
Many authors recommend using judgment and common sense, two ambiguous terms that lead, in turn, to an enormous number of plausible solutions. Among these authors are Anderson, Sweeney, and Williams (2004), Aron and Aron (2003), Fox,
Levin, and Harkins (1993), Glasnapp and Poggio (1985), Hoaglin (1983), and Jaeger
(1990). A few other authors are: Moore (2000), Ravid (1994), Spence, Cotton,
Underwood, and Duncan (1990), Shavelson, Ruiz-Primo, and Mitchell (2000), Summers and Peters (1973).
Additionally, Hertzberg (1991) underlined that it would be hard to make firm rules about the choice of intervals for a grouped table, but some general principles should be followed, if applicable. Nevertheless, the general principles involve vague words such as “nice, useful, fairly, easy, round, appropriate, and suitable” (Hertzberg, 1991, p. 11).
As Jaeger (1990) put it, “selecting the appropriate number of classes remains a subjective process and varies from one statistics book to another” (p. 9). Similarly,
Lapin (1987), for example, stressed that “since the interaction between analyst and data display is a personal one, we face considerations of taste and aesthetics, particularly in choosing the number of intervals or the interval width” (p. 16). The majority of proposed techniques go by choosing “the desired number of classes” (Hoaglin, 1983, p. 8). There is regrettably no obvious concurrence as to “the way to compute the desired number of class intervals” (Ravid, 1994, p. 33).
It is important to remember that grouped frequency distributions are employed to reveal or emphasize a group pattern. Either too many or too few class intervals may blur that pattern and thereby work against the researcher who seeks to add clarity to the analysis. In sum, then, the researcher generally makes a decision as to the number of intervals based on the set of data and personal objectives, factors that may vary considerably from one research situation to another (Levin & Fox, 2004, pp. 26-27).
Lapin (1980), for instance, said that “one way to decide how many class intervals to use would be to try several –plotting a histogram for each and selecting the one that provides the most logical explanation of the underlying population pattern” (p. 21). He
further specified “the histogram chosen for permanent use should be the one that achieves the best balance between summarizing the data and preserving detail” (Lapin, 1987, p.
18). Unfortunately, this approach is lengthy and subject to too much personal judgment, as well as taste and aesthetics, in spite of partial help from a computer. Even with full computer assistance, the whole procedure remains quite time consuming, as creating ‘several’ histograms cannot be done promptly.
Summing up, Sullivan (2005) reminded readers that the goal in constructing a frequency distribution is to reveal interesting features of the data. As such, “too few classes would cause a ‘bunching’ effect. Too many classes will spread the data out, thereby not revealing any pattern” (p. 48).
Recommended Numbers of Class Intervals. Many different numbers have been recommended. To start with, a number of textbooks recommend as a general rule that 10 class intervals be taken as optimal and 30 as the maximum (as seen in Gravetter &
Wallnau, 2000; Vaidyanathan & Vaidyanathan, 1987, p. 9). A few others urged between
10 and 20 intervals; for example Anderson, Sweeney, and Williams (2004), Aron and
Aron (2003), Glasnapp and Poggio (1985, p. 62), Hays (1967 & 1981), Kiess (1996),
Larson and Farber (2003), Levin and Rubin (1983), Thorndike (2001, p. 33), Schloss and
Smith (1999, p. 147), and Zeller and Carmines (1978, p. 28). For some other authors, the number of class intervals can fall either between 10 and 14 or between 5 and 15. Notably, Wyatt and Bridges (1967) favored the first option. Listed for the second option are Aron and Aron (2003, p. 9), Berenson, Levine, and Krehbiel
(2002), Brase and Brase (2001, p. 36), Craft (1990); Freund (1998), Korin (1975), Lapin
(1987), Levine (2001, p. 58), Mason, Lind, and Marchal (1994, p. 27), Sanders, Murph, and Eng (1976), Sandy (1990).
For their part, Bennett, Briggs, and Triola (2003), Devore and Peck (2005),
Hildebrand (1986), Groebner and Shannon (1981), Isaac and Michael (1997), Larson and
Farber (2006), for instance, stated that between 5 and 20 classes would be satisfactory for most data sets. Similarly, De Veaux, Velleman, and Bock (2006), Madsen and
Moeschberger (1983 and 1986), Mason et al. (1994), May et al. (1990), Mendenhall,
Reinmuth, Beaver, and Duhan (1986), Mendenhall and Sincich (1996), and Triola (2001 and 2002) retained 5 as the lowest number of class intervals and 20 as the largest number.
Many other authors concurred with the same two values or numbers in between. More specifically, Hinkle et al. (2003) wrote that “for small data sets (less than 100 observations), 6 to 12 works well, while for larger data set with a wide range of scores,
10 to 20 intervals should be okay” (p. 24). Mendenhall (1997) further specified that “a small number of classes (in the range choice of 5 to 20 classes) is to be used for a small data set (say, 10 to 15 measurements) and a larger number of classes should be the preference for a larger data set” (p. 42).
A few other plausible numbers are proposed by Harnett (1982), for example, who phrased that “the number of classes used should probably be fewer than 20 (for ease of handling and to ensure sufficient compacting of the information) and at least 6 (to avoid loss of information due to grouping together widely diverse data)” (p. 10). Another author, Roscoe (1975), recommended that “the number of intervals be between 10 and
25, with 15 to 20 being optimum” (p. 29), whereas Hopkins, Glass, and Hopkins (1987, p. 10) and Wonnacott and Wonnacott (1984, p. 24) and Zelditch (1959, p. 19) put
forward that between 10 and 15 class intervals would be appropriate, rarely as many as
20. Jaeger (1997) suggested that about “12 score classes will often do the trick” (p. 9), while Sokal and Rohlf (1995, p. 25) advocated a “range of 12 to 20 classes,” or a range of “10 to 20 classes” as indicated by Spatz (2005, p. 376).
In conclusion, Brase and Brase (1999, p. 47) and Mason and Lind (1996, p. 31), in particular, cautioned about the risk of either losing too much information by choosing fewer than 5 class intervals or sacrificing the clarity of the table for detail if more than 15 class intervals are used.
All these optional numbers are attempts to compromise between the desire to summarize data and the aspiration to retain as much detailed information as possible. The greater the number of score classes, the more detailed the information retained about the original data. The smaller the number of score classes, the more the data have been simplified. Generally, the larger the number of observations in a data set, the more classes should be used. As Sanders and Smidt (2000) put it, the key is to use the number of classes that would “give good view of the data pattern and provide insights into the information that is there” (p. 52).
Existing Formulas to Compute the Number of Class Intervals. Besides the above guidelines and suggestions, some formulas have been developed for choosing an appropriate number of class intervals, which is denoted by the letter k. One of the oldest formulas was designed by Sturges (1926) and is expressed as k = 1 + log₂(n), where k is the number of classes to choose, log₂ represents the logarithm base 2, and n is the number of observed data values.
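As a brief sketch, Sturges' rule can be computed as follows. Rounding k up to the next whole number is an assumption here (a common convention that the text does not fix); it reproduces the Sturges column of Table 7 below.

```python
import math

# A sketch of Sturges' (1926) rule, k = 1 + log2(n). Rounding up to the
# next whole number is an assumed convention, not specified in the text.

def sturges(n):
    """Number of classes per Sturges' rule for n observations."""
    return math.ceil(1 + math.log2(n))

print(sturges(100))    # 8
print(sturges(1000))   # 11
```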
According to Hyndman (1995), most statistical computer packages “still use
Sturges’ rule (or an extension of it) for selecting the number of classes when constructing a histogram. Sturges’ rule is also widely recommended by introductory statistics textbooks” (pp. 1-2). In addition, as maintained by Arsham (2001), Sturges’ rule still holds and produces the same results as the approximate number of classes in frequency distributions proposed by Keller (2001, p. 24) and Keller and Warrack (2003, p. 35) and displayed in Table 6 below.
Table 6
Appropriate Number of Classes in Frequency Distributions as per Keller
Number of observations    Number of classes
Less than 50              5 – 7
50 – 200                  7 – 9
200 – 500                 9 – 10
500 – 1,000               10 – 11
1,000 – 5,000             11 – 13
5,000 – 50,000            13 – 17
More than 50,000          17 – 20
On the word of Hyndman (1995, p. 2) and of Shvyrkov and Stout (1980, p. 471),
Sturges’ rule leads to oversmoothed histograms and no longer deserves a place in
statistics textbooks or as a default in statistical computer packages (mainly for sample
sizes that far exceed 200); since it assumes that frequency distributions follow only
distributions of binomial coefficients. Although the logarithm base 2 is not easy to apply and is available only on some advanced scientific hand calculators, Sturges’ formula
remains apparently popular and is still widely used (see Arsham, 2001; Hyndman,
1995, for instance).
The second formula examined is the square root of the total number of observations, denoted by √n, as recommended by Davies and Goldsmith (1980, p. 11),
Devore and Peck (2005, p. 19), Howell (1997, p. 20), Pelosi and Sandifer (2000, p. 66).
As stated by Johnson and Kuby (2007), “the square root of n formula is a reasonable
guideline for the number of classes with samples of fewer than 125 data values” (p. 57).
In contrast, for a very large data set (like one including 50,000 scores or more), using the
square root formula leads to a prohibitive number of classes. Compared to other proposed
formulas, the square root rule produces a higher number of classes than those obtained by the
other rules listed in this study.
Ishikawa (1986, p. 8) presented a small table that can be roughly fitted with the
following formula: k = 6 + n/50 (third formula). Regrettably, the second and third
formulas increase rapidly as the number of observations n goes up, particularly when n is
in the thousands. This also applies to the following three rules (the fourth through sixth formulas).
The fourth formula investigated is Rice’s rule, which sets the number of intervals to twice the cube root of the number of observations, k = 2·∛n (retrieved March 29, 2004, from http://cnx.rice.edu/content/m10160/latest/, written by David M. Lane).
The fifth formula explored is the fourth root of n rule: it sets the number of classes to two and a half times the fourth root of the number of scores collected, k = 2.5·⁴√n. It was proposed by an unknown (anonymous) author and is found in most statistics
textbooks used in Francophone African schools, such as those in Burundi, Rwanda, and Zaire.
The sixth formula reviewed is Terrell and Scott’s rule. It sets the approximate number of classes to the cube root of twice the number of observations, k = ∛(2n) (Terrell
& Scott, 1985, pp. 209-214).
Unfortunately, all six formulas above lead to overgrouping of the data when the number of observations is relatively small to moderate, and to too many classes for large batches of data (Hyndman, 1995). One additional formula was entertained.
The seventh formula uses the smallest integer k such that 2^k ≥ n, where k is the minimum number of classes and n is the number of observations. In other words, the number of classes k is the power to which 2 is raised so that the result is equal to or greater than the number of observations (Iman & Conover, 1982, p. 49; Mason & Lind, 1996, p. 31; Webster, 1992, pp. 19-20). A big advantage of this so-called rule is that it is the least affected by the sample size among all seven investigated so far.
This seventh formula, 2^k ≥ n (Iman & Conover, 1982, p. 49; Mason & Lind, 1996, p. 31; Webster, 1992, pp. 19-20), referred to as Exponential in Table 7 (continued) shown on page 67, provides satisfactory results. Unfortunately, like the first formula, the seventh formula is not easy to manipulate and might be confusing to introductory statistics students.
Two additional formulas help to find the optimal number of class intervals. The eighth and ninth formulas were conceived by Velleman (1976). Based on large-scale Monte Carlo simulations, Hoaglin (1983, pp. 7-32) corroborated
Velleman’s findings. The two new formulas give the maximum number of classes and follow:
(1) k = 2·√n for n ≤ 100, and (2) k = 10·log₁₀(n) for n ≥ 100. At n = 100, k is the same for both formulas. The number k is always rounded up to the next whole number.
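The nine rules surveyed above can be sketched together as follows (Velleman's two rules share one function). The rounding conventions are an assumption: the text specifies rounding up only for Velleman's rules, so the other results are rounded to the nearest whole number here, which reproduces most entries of Table 7 below.

```python
import math

# A sketch of the class-count rules surveyed above. Only the Velleman
# rules are rounded up, as the text states; the other rules are rounded
# to the nearest whole number (an assumed convention).

def sturges(n):       return round(1 + math.log2(n))        # k = 1 + log2(n)
def square_root(n):   return round(math.sqrt(n))            # k = sqrt(n)
def ishikawa(n):      return round(6 + n / 50)              # k = 6 + n/50
def rice(n):          return round(2 * n ** (1 / 3))        # k = 2 * cbrt(n)
def anonymous(n):     return round(2.5 * n ** (1 / 4))      # k = 2.5 * n^(1/4)
def terrell_scott(n): return round((2 * n) ** (1 / 3))      # k = cbrt(2n)

def exponential(n):
    """Smallest integer k such that 2**k >= n (the text's definition)."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k

def velleman(n):
    """Velleman's maximum: k = 2*sqrt(n) for n <= 100, 10*log10(n) above,
    always rounded up to the next whole number."""
    if n <= 100:
        return math.ceil(2 * math.sqrt(n))
    return math.ceil(10 * math.log10(n))

for rule in (sturges, square_root, ishikawa, rice,
             anonymous, terrell_scott, exponential, velleman):
    print(f"{rule.__name__:>13}: k = {rule(100)} for n = 100")
```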
The following Table 7 lists values of k for selected numbers of observed values, n, for each of the nine formulas proposed to compute an appropriate or suitable number of classes, k. A great diversity of k values can be observed.
Table 7 (Part I)

Value of k for Selected n Numbers (Sturges, Square Root, Ishikawa, Rice, and Terrell’s)

     n   Sturges           Square Root   Ishikawa       Rice       Terrell’s
         k = 1 + log₂(n)   k = √n        k = 6 + n/50   k = 2·∛n   k = ∛(2n)
    10         4            3            6         4          3
    20         5            4            6         5          3
    30         6            5            7         6          4
    40         6            6            7         7          4
    50         7            7            7         7          5
    60         7            8            7         8          5
    70         7            8            7         8          5
    80         7            9            8         9          5
    90         7            9            8         9          6
   100         8           10            8         9          6
   120         8           11            8        10          6
   150         8           12            9        11          7
   200         9           14           10        12          7
   250         9           16           11        13          8
   300         9           17           12        13          8
   400        10           20           14        15          9
   500        10           22           16        16         10
   600        10           24           18        17         11
(table continues)
Table 7 (Part I continued)

Value of k for Selected n Numbers (Sturges, Square Root, Ishikawa, Rice, and Terrell’s)

   700        10           26           20        18         11
   800        11           28           22        19         12
   900        11           30           24        19         12
 1,000        11           32           26        20         13
 5,000        13           71          106        34         22
 9,000        14           95          186        42         26
10,000        14          100          206        43         27
Table 7 (Part II)

Value of k for Selected n Numbers (Anonymous, Exponential, and Velleman Maximum’s)

     n   Anonymous      Exponential   Velleman Maximum (n ≤ 100)   Velleman Maximum (n ≥ 100)
         k = 2.5·⁴√n    2^k ≥ n       k = 2·√n                     k = 10·log₁₀(n)
    10         4             3                     7
    20         5             4                     9
    30         6             4                    11
    40         6             5                    13
    50         7             5                    15
    60         7             5                    16
    70         7             6                    17
(table continues)
Table 7 (Part II continued)

Value of k for Selected n Numbers (Anonymous, Exponential, and Velleman Maximum’s)

    80         7             6                    18
    90         8             6                    19
   100         8             6                    20                           20
   120         8             6                                                 21
   150         9             7                                                 22
   200         9             7                                                 24
   250        10             7                                                 24
   300        10             8                                                 25
   400        12             8                                                 27
   500        12             8                                                 27
   600        12             9                                                 28
   700        13             9                                                 29
   800        13             9                                                 30
   900        14             9                                                 30
 1,000        14             9                                                 30
 5,000        21            12                                                 37
 9,000        24            13                                                 40
10,000        25            13                                                 40
Class Interval Size
Once an appropriate number of classes has been calculated, the width of the class interval necessary to produce the computed number of intervals must be determined
(Spence et al., 1990). “The most difficult part (if there is one) in constructing a grouped-data frequency distribution is determining the class interval size” (Aron & Aron, 1997, p. 7; see also Glasnapp & Poggio, 1985, p. 62). Disappointingly, there is again no clear consensus on a generally accepted formula for choosing the proper length of a class interval (Summers & Peters, 1973, p. 30).
Vague Guidelines. As Groebner and Shannon (1981) said, “determining class size and also class limits is essentially arbitrary, and two individuals will often come up with different ways of arranging data into classes” (p. 36). In the words of Stockburger (2001), selecting the interval size is more art than science. For integer data, if the ratio of the range to the desired number of classes is not a whole number (as is most often the case), this number (the class width) is rounded up to the next integer value (Brase & Brase, 2001, p. 36). Some authors, such as Anderson (2004), Aron and Aron (2003), and Stockburger (2001), suggested rounding it up or down to the nearest odd integer, while others, like Frank and Althoen (1994, p. 9), recommended rounding it only up to the nearest odd integer. No specific guidelines for decimals were located.
Generally speaking, if the class interval is large in relation to the total range of the set of observations, the loss of information may be appreciable. The larger a class interval is, the greater the information loss. “If the class interval is small, the classification of data in the form of a frequency distribution may lead to very little gain in convenience over the utilization of the original observations” (Ferguson, 1989, pp. 27-28). In
summary, as a rule of thumb, Knoke and Bohrnstedt (1991) pushed for “an interval width being large enough to avoid too many categories, but not so large that significant differences between observations are concealed” (p. 41).
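For integer data, the round-up rule described above can be sketched as follows; the sample values are hypothetical, and the inclusive range is used as the numerator.

```python
import math

# A sketch of the common width rule for integer data: divide the
# (inclusive) range by the chosen number of classes k and round any
# fractional result up to the next integer (Brase & Brase, 2001).
# The low/high/k values below are hypothetical.

def class_width(low, high, k):
    """Class interval width for whole-number scores and k classes."""
    inclusive_range = high - low + 1
    return math.ceil(inclusive_range / k)

print(class_width(12, 47, 7))   # inclusive range 36, 36/7 = 5.14 -> width 6
```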
Recommended Class Interval Size. There is, again, a long list of class interval sizes being proposed. Unfortunately, almost all of them deal only with integer data.
Anderson et al. (2004), Aron and Aron (2003, p. 9), Hawkins and Weber (1980, p. 11), Mason et al. (1994, p. 30), McCall (2001, p. 36), and Spence et al. (1990), among other statistics authors, urged that class interval sizes should preferably be multiples of 5 or 10 to facilitate reading the distribution table. Witte and Witte (1997) cautioned: “The recommended interval lengths, multiples of 5 or 10, should be thought of as a rough rule of thumb to be applied with discretion rather than with slavish devotion” (p. 18).
When the width of the interval is a multiple of 5, the values chosen for either the lower limits or the upper limits of the classes are usually a multiple of 5 as well. For example, with a class length of 10, class intervals of 60-69, 70-79, …, or 61-70, 71-80, …, are more convenient to set up and interpret than, for instance, 62-71, 72-81, and so forth.
The problem with this popular approach is the likely high number of possible tables to be produced before reaching a most appropriate one.
Anderson et al. (2004), Aron and Aron (2003, p. 9), De Veaux, Velleman, and Bock (2006), Hawkins and Weber (1980, p. 11), Mason et al. (1994, p. 30), McCall (2001, p. 36), and Spence et al. (1990), for instance, advised that, whenever possible, class intervals should be of equal length. The statistical package Microsoft Excel, for instance, “cannot draw column charts with columns of unequal width” (Hunt, 2003, p. 43). The class intervals should be equally wide because, as Keller (2001) argued, “it makes the task of reading and interpreting the graph easier and it works well when the distribution of the data is symmetric” (p. 23).
According to Johnson and Bhattacharya (2001), McElroy (1979), and Sanders and
Smidt (2000) the rule requiring equal class intervals is inconvenient where large gaps exist in the data, that is, when the data are spread over a wide range, but are highly concentrated in a small part of the range with relatively few numbers elsewhere. Johnson and Bhattacharya (2001) suggested using “smaller intervals where the data are highly concentrated and larger intervals where the data are sparse in order to help reduce the loss of information due to grouping” (p. 34).
In general, using unequal class intervals may cause difficulties. For example, “if frequencies in a distribution with unequal intervals are compared, the observed variations may merely be related to interval sizes rather than to some underlying pattern” (Sanders
& Smidt, 2000, p. 53). Sanders and Smidt (2000) went on to stipulate: “Other difficulties of using unequal intervals can arise during the preparation of graphs” (p. 53). As a result, for simplicity purposes and practicability, making a frequency table with equal class size should be preferred wherever possible (Hunt, 2003; McElroy, 1979; Sanders & Smidt,
2000; Vaidyanathan & Vaidyanathan, 1987).
Odd whole-number interval lengths are recommended because even integer widths yield class midpoints that are not stated to the same level of precision as the original observations (see, for example, Anderson, 2004; Aron & Aron, 2003; Frank & Althoen, 1994; Kolstoe, 1969; Summers, Peters, & Armstrong, 1977; Thorndike & Dinnel, 2001). The same line of reasoning is given by Spiegel and Stephens (1998): “If the class intervals contain an odd number of score levels, the class midpoints would coincide with the actually observed data” (p. 38). They added: “This tends to lessen the so-called grouping error in further mathematical analysis” (Spiegel & Stephens, 1998, p. 38). For whole numbers, when the class width is an odd number, the middle score is an integer and may be conveniently used as an index value representative of all measures in the interval (Roscoe, 1975).
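This midpoint property is easy to verify with a short computation (an illustrative sketch, not from the cited sources; the helper function is hypothetical):

```python
# With integer data and inclusive integer class limits, an odd class
# width w gives an integer-valued midpoint, while an even width gives
# a midpoint ending in .5 (less precise than the raw observations).
def midpoint(lower, width):
    upper = lower + width - 1   # inclusive upper class limit
    return (lower + upper) / 2

print(midpoint(25, 9))    # 29.0 -- odd width, integer-valued midpoint
print(midpoint(25, 10))   # 29.5 -- even width, half-integer midpoint
```

For example, the interval 25-33 (width 9) has midpoint 29, a value stated on the same scale as the raw integer scores.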
Existing Formulas to Compute Class Interval Size. At least two formulas are proposed for the optimum class width. The first one was developed by Scott (1979) and extended by Wand (1997). Scott’s expression is written as w = 3.49 ∗ s ∗ n^(−1/3), where w is the class width or class length, s is the sample standard deviation, and n is the sample size.
The second formula was designed by Freedman and Diaconis (1981), cited by Hyndman (1995, p. 2), and is expressed as w = 2 ∗ IQ ∗ n^(−1/3), where w is the class width and IQ is the sample interquartile range. Both formulas are cumbersome to apply with a hand calculator because they require computing the sample standard deviation, s, and the sample interquartile range, IQ, respectively.
For both formulas, the rounding exercise could be done up or down to the nearest
integer. This rounded number could probably be either an odd whole number or a
multiple of 5, which is another source of arbitrariness.
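Both formulas can be sketched in a few lines of code (an illustration only; the function names are mine, and the quartile convention used here for IQ may differ slightly from the one in the original papers):

```python
import statistics

def scott_width(data):
    # Scott (1979): w = 3.49 * s * n^(-1/3)
    n = len(data)
    s = statistics.stdev(data)                    # sample standard deviation
    return 3.49 * s * n ** (-1 / 3)

def freedman_diaconis_width(data):
    # Freedman and Diaconis (1981): w = 2 * IQ * n^(-1/3)
    n = len(data)
    q1, _, q3 = statistics.quantiles(data, n=4)   # sample quartiles
    return 2 * (q3 - q1) * n ** (-1 / 3)
```

Either result would still need to be rounded to a usable width, so the arbitrariness noted above remains.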
Starting Point
Once the decision on interval width has been made, the specific class interval
limits can be determined. This is done by finding the starting point, that is, by
determining the lower real limit of the lowest class interval, or, equivalently, the upper
real limit of the highest class interval. Levin and Fox (2004, pp. 29-30) acknowledged that determining the actual limits of the class intervals is the most difficult part of all.
They went on to affirm that “satisfactory results come with a great deal of trial and error as well as practice” (pp. 29-30). Spatz (2005, p. 376), for his part, recommended that “each class interval begins with a multiple of the class interval size”. For example, with the raw data of Table 4, since the lowest measurement, Xlowest, is Xlowest = 28 and the class
interval width, w, is w = 9, Spatz (2005) urged that the first class interval be 27 – 35
because 27 is the highest multiple of 9 that is less than 28. He went on to specify that “an exception to this convention occurs when w = 5” (p. 376). He specified: “When the interval size is 5, it is usually better to use a multiple of 5 as the midpoint because multiples of 5 are easier to read on graphs” (Spatz, 2005, p. 376).
The choice of the starting points may affect the shape of a histogram in major and surprising ways. Farnsworth (2000) showed that “histograms may have the same number of class intervals (or even the same class widths), but give very different pictures of the distribution if their start points are different” (pp. 81-82).
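Farnsworth’s observation is straightforward to reproduce: the same scores tallied with the same class width but two different starting points can yield quite different frequency counts (a hypothetical illustration with made-up data):

```python
# Same data and same class width (9), but two different starting
# points: the resulting tallies -- and hence the histograms -- differ.
data = [28, 29, 34, 35, 36, 44, 45, 53, 61, 62]

def tallies(data, start, width, k):
    bins = [0] * k
    for x in data:
        bins[(x - start) // width] += 1
    return bins

print(tallies(data, start=25, width=9, k=5))   # [2, 3, 2, 1, 2]
print(tallies(data, start=28, width=9, k=5))   # [5, 2, 1, 2, 0]
```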
Vague Guidelines. As Kerr et al. (2002) put it, “the point at which to begin the lowest class boundary is just arbitrary and so also is the point at which to end the class boundary” (p. 11). Spatz and many other statistics authors, such as Stockburger (2001), Glasnapp and Poggio (1985), Hildebrand (1986), Hopkins et al. (1987), and May et al. (1990), proposed that the first class interval begin with a multiple of the interval size.
The same idea is shared by Champion (1970), who explained: “The reason to begin an interval with a multiple of the interval size is that this procedure can act as a check on the work done. If a mistake is made in the construction of the frequency distribution for grouped data, the error can be detected more quickly” (p. 21) and corrected. A problem with this method, again, is that it leaves room for many personal interpretations. In particular, Farnsworth (2000) warned: “Any statistical procedure in which the analyst can exercise a choice like those for histograms presents [the potential] for misleading results and erroneous decisions” (pp. 84-85).
Existing Formulas to Compute the Starting Point. Only the guidelines and recommendations mentioned above are available. No specific formulas were found to determine the starting point.
Tallies
An extra step, though not an integral part, in constructing a grouped frequency table is to count the number of observations falling into each class interval; that is, to find the class frequencies (tallies). This is best done by using a tally or score sheet (Spiegel & Stephens, 1998, p. 38). A final major step could consist of computing the class midpoints or class marks.
Summary Statistics
While tables and graphs are extremely useful in depicting a set of observations concisely, data may be described in even more condensed forms. A single number, or only a few numbers based on the observations, may be used to convey useful information concerning certain attributes possessed by the observed variables (Korin, 1975, p. 31;
Elzey 1967).
As Chou (1963) put it, “after data have been collected and tabulated, the next step in statistical analysis is to describe their properties” (p. 110). Chou (1963) went on to articulate that “to describe data involves computing a proportion, a mean, a standard deviation, or some other summary measure of the observations comprising a population or a sample” (p. 110).
From the Teaching and Learning Statistics’ Perspectives
There is a relative paucity of research dedicated to explaining this phenomenon.
The following is a summary of the limited related literature found.
Definitions of Teaching, Learning, and Education
Leamnson (1999), as cited by Miller (2000), defined teaching as “any activity that has the conscious intention of, and potential for, facilitating learning in another” (p. 3).
Later on, Leamnson (1999) warned that, as defined above, teaching did “not imply necessarily that any learning is going on” (p. 51). He went on to comment that, generally speaking, teaching “is something done by someone … not to someone” (p. 52), and concluded that good teaching, then, becomes “a matter of doing the right things under appropriate circumstances” (p. 52). He became more explicit about it by stating “telling students what they need to know is one thing; doing something that will inspire them to become motivated and actually learn is quite another” (p. 55). Leamnson (1999) finally reported he was “of the opinion that most good teaching does have elements of performance” (p. 69).
Coiro (2005) defined learning as “a process of real-life practice experiences that generates engagements and an understanding of how to think deeply, work collaboratively, and communicate effectively” (p. 3). Giorgis and Johnson (2001) complemented that definition by stating “true learning often happens when (and because) mistakes are made” (p. 87) and, even better, when they are fixed. To enhance this
“learning by mistake” definition, Taplin (2003) emphasized that “if students learn best from making their own mistakes, then a dominant role of educators is to place students in positions where they make these mistakes safely” (p. 146).
Combining the teaching and learning concepts, Leamnson (1999) claimed that
“learning is self-initiated and not externally caused” (p. 53) and, as so defined, “learning does not require teaching… But without teaching of any sort, learning is limited entirely to discovery through personal experience” (p. 53). For her part, Miller (2000) added that “teaching and learning are necessarily connected concepts, each defining the other”
(p. 21) because teaching and learning co-exist when teacher and student are partners in the learning process (Freire, 1970/1993). Bransford, Brown, and Cocking (2000) believed the primary goal of teaching is to foster “active learners who seek to understand complex subject matter and are better prepared to transfer what they have learned to new problems and settings” (p. 13).
Education, on the other hand, was delineated by Leamnson (1999) as “learning that has been facilitated by teaching” (p. 54). Compliant with the way teaching, learning, and education are defined here, Leamnson (1999) deduced that “teaching can be done with or without learning. Learning can be done with or without teaching. Education requires both teaching and learning” (p. 54).
Various Teaching Methods Documented
As per Conners, McCown, and Roskos-Ewoldsen (1998), the statistics instructor faces 4 major challenges: “To motivate students to value a topic they think is irrelevant to their life, to deal with the anxiety that is often associated with learning statistics, to effectively teach both high and low achievers, and to make the learning process memorable” (p. 40). Trying to meet these challenges, researchers have experimented with a variety of learning strategies (Beins, 1985; Conners, McCown, & Roskos-Ewoldsen, 1998; Derry, Levin, & Schauble, 1995; Dillbeck, 1983; Thompson, 1994). It has been documented and established by many researchers in education and psychology that students learn statistics in multiple ways.
One common way is learning by doing (Schank, 2005; Gelman, 2005). In particular, Conners et al. (1998) and Garfield (1995) proclaimed that learning by doing lends itself to active rather than passive learning. Zhu and Simon (1987) commented that learning by doing examples, including making mistakes (Schank, 2005; Gelman, 2005), has been shown to be effective and in some cases more so than traditional instruction.
Another approach is learning by personal involvement (Gelman, 2005). By getting students involved in their learning of statistics, statistics instructors promote student ownership of their learning (Miller, 2000). This view is supported by Johnson and John (2003), who maintained that “students learn statistical principles more effectively by active involvement in the learning process, by finding out for themselves as participation helps promote understanding and retention” (p. 94).
An additional approach to learning statistics is participating enthusiastically in classroom activities in pairs or through cooperative learning (Borresen, 1990; DeBord & Muhlig, 2004; Gelman, 2005; Gelman &
Nolan, 2002; Hsu, 2003; McLaren, 2004; Mills, 2003; Morris, Kellaway, & Smith, 1978;
Morris, Joiner, & Scanlon, 2002; Porter, Griffiths, & Hedberg, 2003; Rajecki, 2002;
Smith, 1998). As far as learning to actually use statistics is concerned, two heads are definitely better than one (Dunn, 2001) because collaborative efforts (preferably in small groups) provide a practical division of labor and necessarily promote recursive learning. In fact, collaborators must explain, justify, and defend their choices to one another.
Teaching to others has been recognized as one of the best ways to learn (Webb, 1992).
Students learn statistics also by actively building their own new knowledge and understanding on what they already know and believe (Lave, 1988). That is, students formulate new knowledge by modifying and refining their current concepts and by adding new concepts to their previous knowledge (Cobb, 1994; Greeno, Collins, &
Resnick, 1996; Lave, 1988; Mills, 2003). Besides, McKeachie (2002) concurred that learning occurs when students see a relationship between what they are hearing from the instructor and what they already know. This point inspired Mills (2003), who asserted that
“regardless of how clearly a statistics teacher in elementary and secondary schools explains a statistical concept, students will understand the material only after they have constructed their own meaning for the new concepts, which may require restructuring and reorganizing this new information and assimilating it to prior or previous knowledge” (p.
56). Eggen and Kauchak (2001) implied that constructivism also suggests that learning should be facilitated by teachers and that interaction and discussion are critical components of the learning process.
McKeachie (2003) cautioned that “teaching that works for some students may not work for others” (p. 41). A review of the literature reveals that there is no clear consensus among the authors about which teaching and learning approach should be adopted by everybody. Some authors like Dunn (2001), Giraud (1977), Keeler and Steinhorst (1995),
Potthast (1999), for instance, have advised students against learning statistics in isolation, whereas others such as Spence (2001) and Benedict and Anderton (2004) have had a diametrically opposite stance. The latter have noticed that people best learn individually
or in one-on-one relationships in which teacher and student are in constant communication with each other.
Teaching and Learning Statistics
One of the major difficulties students have in learning statistics is that “the course deals with ‘obscure stuff’ with which they have no prior experience and which they are unable to relate to the real world or to everyday experience” (Oathout, 1995, p. 29). Speer
(1979) asserted that, when given the opportunity, some students elected to work alone rather than with peers or in a large-group setting. The teaching and learning of statistics is remarkably complex because it involves not only new and difficult concepts, but also belief systems resistant to change (Metz, 1997).
Zeiner (1991) conceded that lack of previous exposure to statistics and limited mathematics experience are known to be associated with negative attitudes, which, in turn, inhibit learning. Mills (2004) emphasized that “student attitudes about statistics are important because they may influence the learning process” (p. 350). Negative student attitudes toward statistics may create a major obstacle to effective learning (Cashin &
Elmore, 1997; Fullerton & Umphrey, 2001; Schultz, Drogsz, White, & Distefano, 1998;
Walters, Mateli, Zakrajsek, & Popovich, 1988).
Factors Affecting Attitudes towards Statistics
Variables which continue to influence students’ attitudes, and ultimately statistics self-efficacy, have been investigated. Results have shown that attitudes are related to, among other variables, mathematics proficiency (Brown & Brown, 1995; Kottke, 2000), level of statistics course (Waters, Martelli, Zakrajsek, & Popovich, 1989), computer experience (Rhoads & Hubele, 2000; Gratz, Volpe, & Kind, 1993), prior exposure to
statistics (Kottke, 2000; Mills, 2004; Sutarso, 1992), and student confidence (Mills,
2004). In addition, research regarding anxiety has indicated a negative relationship; that is, the higher a student’s anxiety level, the lower the student’s performance (Ware &
Chastain, 1989).
As to computer use in classroom activities, Meletiou (2000) mentioned that the
“agreement is unanimous on the potential benefits of technology on student learning of statistics” (p. 50). However, researchers such as Behrens (1997) warned that “coupling the student with technology alone is generally insufficient to reach the desired effect” (p.
120) and Meletiou (2000) specified that “technological interventions might not work quite as well as one would like to think” (p. 50). Meletiou (2000) further added that
“despite the wide use of technology in many statistics classrooms, relatively little published research exists describing its actual impact on student learning, and curricula are often developed and implemented without the benefit of research on their effects in terms of students learning” (p. 50).
Unique Solution Feature
Learning styles (here one-solution versus multiple solution learning styles) have been implicated as representing antecedents of statistics anxiety (Onwuegbuzie, 1998;
Onwuegbuzie & Wilson, 2003). Since students’ beliefs towards statistics have generally not yet been explored, and given that many students identify statistics with mathematics, it is informative to first examine related research on beliefs about mathematics and mathematics problem solving (Mills, 2004). Schoenfeld (1992, p. 359) lists some of the typical student beliefs about the nature of mathematics and mathematical activity:
1. Mathematics problems have one and only one right answer.
2. There is only one correct way to solve any mathematics problem –usually the
rule the teacher has most recently demonstrated to the class.
3. Ordinary students cannot expect to understand mathematics; they expect simply
to memorize it and apply what they have learned mechanically and without
understanding.
4. Mathematics is a solitary activity, done by individuals in isolation.
These beliefs could be similar to students’ beliefs about statistics, but there may also be beliefs unique to statistics and statistics education that have not been recognized so far (Simon & Bruce, 1991). In her cited work, Oathout (1995) recalled the case of an MBA student with an engineering degree who was disappointed by his experience with statistics. Before taking a statistics course, he had thought that it was a math course. He maintained that engineering is a hard science and is always precise. One plus one is always two. Statistics is a soft science where one plus one is somewhere between 1.8 and
2.2. He deplored the fact that statistics always deals with error and tries to eliminate error, but engineering is always precise. He found the imprecision of statistics troublesome at first because he tried to apply the engineering approach and was disturbed by the conflict
(Oathout, 1995).
Structural and Organizational Issues
Although developmental psychologists and some other scholars claimed that human beings are alike in more ways than they are different, Fennema and Behr (1980) believed that “humans differ in relevant and significant ways on many variables that are of vital importance to the teaching-learning process” (p. 325). Allwood (1990) specified
that there are many roads to learning; students, like all human beings, learn in different ways and vary in their abilities to perform certain tasks. So, as Harris and
Schau (1999) put it, “there is no guaranteed teaching method to improve statistics learning, and it is likely that different methods are maximally effective with different students” (p. 208). Some students do their best in statistics courses only if the teaching method is well structured, that is, coherent and clearly organized (Moore, 1997a).
Conners et al. (1998) proclaimed: “A distinct organizational scheme of course material may make learning last, particularly when students are aware of the organization” (p. 41). Harris and Schau (1999) contended that a clear structure seems to reduce, among other things, anxiety for at least some students. Indeed, Bandalos, Finney, and Geske (2003) found that disorganization was a positive predictor of test anxiety, where disorganization was defined as “difficulty in establishing or maintaining a structured, organized approach to studying” (p. 607). Moreover, for Harris and Schau (1999), “step-by-step examples with every intermediate step described are useful for most students” (p. 203), chiefly for those lacking the appropriate quantitative knowledge and mental preparation.
In contrast, there are some other students who would prefer a teaching procedure with little structure. This may provide a kind of spontaneity and freedom to pursue independent ideas that can be exciting and motivating for them (Harris & Schau, 1999).
Reasoning about Graphical Representations of Data
Garfield and Chance (2000, p. 115) urged students to develop the following 4 strategic reasoning skills. First, students should understand the way in which a plot is meant to represent a data set. Second, they should understand how to read and interpret a
graph. Third, they should know how to modify a graph to better represent a data set.
Fourth, they should be able to identify the overall pattern, center, and spread in a distribution. In the statistical investigation process, Heaton and Mickelson (2002) encouraged students to figure things out for themselves by developing investigation questions, constructing data tables, and graphing data.
Attitudes towards Statistics
Beliefs and attitudes related to math may play a powerful role in affective responses to statistics since students often expect that the study of statistics will include a heavy dose of mathematics, including complex algebra and formulae (Simon & Bruce,
1991). The frequent appearance of statistics courses within mathematics departments (or as part of a high school math class) reinforces this perception.
Since practically all students have studied some high school level mathematics before starting a formal statistics class, their affective reactions to those math-learning experiences may affect how they relate to statistics learning. Students’ predispositions, beliefs, and expectations may interact with aspects of the learning environment created by the teacher in ways that would work against what the teacher is attempting to accomplish (Gal & Ginsburg, 1994).
An instrument that is frequently used to measure attitudes towards statistics is the Survey of Attitudes towards Statistics (SATS), which was developed by Schau, Dauphinee, and Stevens (1995). This instrument served as a model for the new instrument used in this study, called the Survey of Attitudes towards the Method Used (SAMU), as previously defined. Like the original SATS instrument, the SAMU instrument is broken into 4 components: affect, cognitive competence, value, and difficulty. The affect component assesses positive and negative feelings about statistics, and thus the instructional method used. The cognitive competence component assesses self-beliefs about ability in statistics. The value component assesses beliefs about the usefulness and importance of the instructional method. The difficulty component assesses beliefs about the difficulty of the instructional method used, based on beliefs about the nature of statistics. These 4 scales were also selected for analyses as dependent variables because they represent the students’ attitudes towards statistics that some research has shown to be important for statistics achievement (Elmore et al., 1993; Green, 1994; Roberts & Saxe, 1982; Schau, Stevens, Dauphinee, & Del Vecchio, 1993; Wise, 1995).
Summary and Conclusion
Statistics is a course that is nowadays not only gaining popularity all over the world, but also becoming part of diverse curricula. It is even required in virtually all academic fields where data collection is involved (Batanero, Godino, Green, Holmes, &
Vallecillos, 1994). As Batanero et al. (1994) expressed it “the teaching of statistics is currently increasing substantially in many countries, due to its widely recognized place in the general education of citizens” (p. 527). It is unfortunate, as mentioned mainly by
Freund (1998, p. 15), that statistics instructors no longer devote enough time (if any) to the construction of frequency distributions and graphs, most likely due to the widespread availability of personal computers. Also, the vagueness of the existing techniques, as well as the absence of consensus about which formula works best, forces various statistics users to delegate the task entirely to their computers (Aron & Aron, 2003, p. 9).
As recommended by Batanero et al. (1994) and Curzio (1987), the ability to critically read data (chiefly data for which no interpretation is needed, only facts explicitly expressed in the graph or frequency table) is a component of numerical literacy and a necessity in our technological society. This leaves room for a systematic approach like the one found in the iteration algorithm.
Based on these deficiencies detected in the commonly known and available algorithms, this study was designed to compensate for the lack of a standard procedure for constructing frequency tables of grouped data. Strong points of the iteration algorithm, such as yielding a unique, guess-free solution and being well structured, are backed by the teaching and learning literature.
The 5 formulas found in the iteration algorithm are well known, and their accompanying step-by-step instructions are unambiguous and relatively easy to follow and apply in a guess-free fashion. Choices yielded by the iteration algorithm can be used to modify the set-up of a default histogram generated by SPSS or any other statistical software. In chapter 3, it will be demonstrated that the majority of grouped-data statistics produced by the iteration algorithm are much closer to the corresponding statistics computed directly from the raw data than are the grouped-data statistics yielded by the SPSS algorithm. More broadly, the following chapter will describe in detail the development of the traditional algorithm along with the development of the iteration algorithm. It is shown how each of the two instructional methods was designed and how each of them works, using the same set of raw data for both cases.
From a teaching and learning perspective, the iteration algorithm favors learning by doing in that the latter lends itself to active rather than passive learning (Conners et al.,
1998; Garfield, 1995; Gelman, 2005; Schank, 2005). Results to be presented in chapter 5 and discussed in chapter 6 will suggest that learning how to construct a unique, guess-free grouped-data frequency table by using the iteration method could in many cases be more effective than using the trial-and-error approach found in most traditional methods.
CHAPTER 3: ITERATION ALGORITHM DEVELOPMENT AND
EXAMINATION
This chapter is devoted to the development and examination of each of the two instructional methods used. The two methods will be compared based on how each computes the 4 key elements that are essential for building a grouped-data frequency table.
Development of a Traditional Method with Integers
Based on the 4 key elements crucial for creating a frequency table for grouped data, a traditional method (Vaidyanathan & Vaidyanathan, 1987; Mason & Lind, 1996) was chosen to illustrate how vague the guidelines or rules provided are. The main reason for selecting this scheme was the widespread use of the formulas involved, which can be computed on most hand calculators (Howell, 1997).
Determining the Range
The formula to compute the (non-inclusive) range, R, of the observed scores is
R = Xhighest - Xlowest, where X is the observed score, Xhighest is the highest observed
value, and Xlowest is the lowest observed value.
Here, R = 93 - 28 = 65.
Choosing a Number of Class Intervals
Herbert Sturges (Sturges, 1926) considered an idealized frequency histogram with k bins (or class intervals) in which the j-th bin count is the binomial coefficient C(k − 1, j), for j = 0, 1, …, k − 1. As k increases, this ideal frequency histogram approaches the shape of a normal probability density function. The total sample size is

n = Σ_{j=0}^{k−1} C(k − 1, j) = (1 + 1)^{k−1} = 2^{k−1}
by the binomial expansion. Solving for k (the number of bins) in the above relation gives
log10(2^{k−1}) = log10(n)
⇒ (k − 1) ∗ log10(2) = log10(n)
⇒ k − 1 = log10(n) / log10(2)
⇒ k = 1 + log2(n).
This is Sturges’ rule. So the number of class intervals to choose when constructing a
histogram from normal data is k ≈ 1 + log2(n) = 1 + 3.322 ∗ log(n), where log(n) denotes the decimal logarithm of the sample size, n (Sturges, 1926, p. 65). Note that the value 3.322 comes from the relation log2(n) = log10(n) / log10(2) = (1 / log10(2)) ∗ log10(n), and 1 / log10(2) = 3.322.
Therefore, k ≈ 1 + 3.322 ∗ log(80) = 7.32. The answer could be 7 if it is decided to round down (no guidance on this matter is provided), 8 (if rounded up), or even 10 (the next higher multiple of 5, to facilitate the calculations).
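The computation above, and the rounding ambiguity it leaves open, can be reproduced directly (an illustrative sketch):

```python
import math

# Sturges' rule: k = 1 + log2(n); here n = 80 as in the example data.
def sturges_k(n):
    return 1 + math.log2(n)

k = sturges_k(80)
print(round(k, 2))    # 7.32
print(math.floor(k))  # 7 (rounded down)
print(math.ceil(k))   # 8 (rounded up)
```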
Computing a Class Width
A way of setting up the class width, w, for each class interval is given below. By
applying the formula R/k = w, the following is obtained:
First possibility with k = 7: R/k = 65/7 = 9.29, which could be rounded to either 9
or 10.
Second possibility with k = 8: R/k = 65/8 = 8.13, that is, either 8 or 9.
Third possibility with k = 10: R/k = 65/10 = 6.5, which could be rounded to either
6 or 7.
Hence, w may take on any of the 5 distinct values listed above (6, 7, 8, 9, and 10) and a unique choice among them is needed. Which unique value should be selected?
Again, no guidance is given, so an odd number can be retained such that the level of accuracy of the midpoints corresponds exactly to that of the data collected. There is still a problem here because there are 2 odd numbers: 7 and 9. Some statistics users might opt for 7, while others will choose 9. Either possibility seems plausible, but what if only one has to be retained? For the sake of illustration, however, both odd integers (i.e., 7 and 9) are examined in turn.
First option with w = 7. This selection would lead to a grouped-data frequency table with k = 10 class intervals, each of equal width 7 (Third possibility).
Second option with w = 9. This choice would force the grouped-data frequency table to have k = 7 class intervals, each of equal width 9 (First possibility).
Third option with w = 9. If this is the choice, the corresponding grouped-data frequency table will consist of k = 8 class intervals, each of equal width 9 (Second possibility).
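The enumeration of candidate widths above can be sketched as follows; rounding each ratio both down and up mirrors the two candidates mentioned for each k in the text:

```python
R = 65  # traditional inclusive range of the example data: 93 - 28 + 1
candidate_widths = {}
for k in (7, 8, 10):
    w = R / k
    # each ratio may be rounded down or up, giving two candidate widths
    candidate_widths[k] = (int(w), int(w) + 1)
print(candidate_widths)  # {7: (9, 10), 8: (8, 9), 10: (6, 7)}
```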
Picking up a Starting Point
The authors do not suggest anything or recommend any particular formula. So 25
(the closest multiple of 5, which is also less than the minimum observed score, 28) can be retained, or the minimum observed score itself, 28, which is a multiple of 7, or 27 as a
multiple of 9. However, to keep the number of trials to a minimum, only one possible and admissible starting point, 25, was retained.
Table Preparation and Construction
Prepare a list of the limits of each class interval, beginning with the starting point.
In constructing the grouped frequency table, list the lower class limit of the first class interval, that is, the lowest class limit obtained by applying Step 4. Add the class width minus one (w - 1) to this first lower class limit and write down the upper class limit. Then, list all the class limits by adding the class width to the limits of the previous class interval.
After that, determine the class midpoint of each interval by averaging its class limits.
That is, add the class lower limit to its class upper limit and divide the resulting sum by two to obtain the class midpoint. A further task is to tally the frequencies for each class. Finally, sum the frequency column and check against the total number of observations.
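The steps above can be sketched in Python (the helper name is illustrative), using inclusive integer class limits where the upper limit equals the lower limit plus w - 1:

```python
def class_limits(start, w, k):
    """Return (lower limit, upper limit, midpoint) for each of k classes."""
    rows, lower = [], start
    for _ in range(k):
        upper = lower + w - 1              # inclusive upper class limit
        rows.append((lower, upper, (lower + upper) / 2))
        lower = upper + 1                  # next class starts one unit higher
    return rows

# First option from the text: starting point 25, width 7, 10 classes
for lo, hi, mid in class_limits(25, 7, 10):
    print(lo, "to", hi, "midpoint", mid)
```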
Second Option: R = 65, k = 7, w = 9 and starting point is 25
Class 1: 25 to 33 [33 = 25 + (9 - 1) = 25 + 8];
Class 2: 34 to 42;
Class 3: 43 to 51;
Class 4: 52 to 60;
Class 5: 61 to 69;
Class 6: 70 to 78;
Class 7: 79 to 87.
Verdict: This is a wrong choice, since the last class interval, 79 to 87, fails to capture 93, the maximum score observed. Consequently, this selection is rejected as a doable option.
First Option: R = 65, k = 10, w = 7 and starting point is 25
Class 1: 25 to 31 [that is, 31 = 25 + (7 - 1) = 25 + 6];
Class 2: 32 to 38;
Class 3: 39 to 45;
Class 4: 46 to 52;
Class 5: 53 to 59;
Class 6: 60 to 66;
Class 7: 67 to 73;
Class 8: 74 to 80;
Class 9: 81 to 87;
Class 10: 88 to 94.
Verdict: This is a right choice, since the last class interval, that is, 88 to 94, captures the largest observed score, which happens to be 93. Hence, this selection is retained as a feasible option.
The resulting grouped-data frequency distribution is presented in Table 8 below.
Table 8 has two more class intervals than does Table 9.
Table 8
Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Traditional Method) [First Option: R = 65, k = 10, w = 7 and Starting Point is 25]
Class ID   Class limit   Tally*                         Frequency   Cumulative   Class midpoint
(i)                                                     (ni)        frequency    (xi)
 1         25 to 31      2 | 8 9                          2            2           28
 2         32 to 38      3 | 1 1 2 6 7 8                  6            8           35
 3         39 to 45      4 | 0 1 1 2 3 4 5                7           15           42
 4         46 to 52      4 | 6 6 7 7 8 9                 12           27           49
                         5 | 0 1 1 1 1 2
 5         53 to 59      5 | 3 4 5 6 7 7 7 7             14           41           56
                         5 | 8 8 8 8 8 8
 6         60 to 66      6 | 1 1 1 2 2 3 3 4 5 6 6       11           52           63
 7         67 to 73      6 | 7 7 8 8 9                   11           63           70
                         7 | 0 1 1 2 2 3
 8         74 to 80      7 | 4 5 6 6 7 8 9                8           71           77
                         8 | 0
 9         81 to 87      8 | 1 3 4 5 6                    5           76           84
10         88 to 94      8 | 8                            4           80           91
                         9 | 0 2 3
Total                                                   n = 80
Note: * The tally column is included here simply as an aid for determining the frequencies. It is not a necessary part of a grouped-data frequency table. The tally column here appears roughly as a stem-and-leaf plot.
Third Option: R = 65, k = 8, w = 9 and starting point is 25
Class 1: 25 to 33 [33 = 25 + (9 – 1) = 25 + 8];
Class 2: 34 to 42;
Class 3: 43 to 51;
Class 4: 52 to 60;
Class 5: 61 to 69;
Class 6: 70 to 78;
Class 7: 79 to 87;
Class 8: 88 to 96.
Verdict: This is a right choice, since the last class interval, that is, 88 to 96, captures the largest observed score, which happens to be 93. Hence, this selection is also retained as a viable option.
The resulting grouped-data frequency distribution is presented in Table 9 below.
Table 9 provides more useful information than does Table 4 (on page 35).
Table 9
Grouped-data Frequency Table for the Final Statistics Grades of 80 Students
(Traditional Method) [Third Option: R = 65, k = 8, w = 9 and Starting Point is 25]
Class ID   Class limit   Tally*                         Frequency   Cumulative   Class midpoint
(i)                                                     (ni)        frequency    (xi)
 1         25 to 33      2 | 8 9                          5            5           29
                         3 | 1 1 2
 2         34 to 42      3 | 6 7 8                        7           12           38
                         4 | 0 1 1 2
 3         43 to 51      4 | 3 4 5 6 6 7 7 8 9           14           26           47
                         5 | 0 1 1 1 1
 4         52 to 60      5 | 2 3 4 5 6 7 7 7 7           15           41           56
                         5 | 8 8 8 8 8 8
 5         61 to 69      6 | 1 1 1 2 2 3 3 4 5 6 6       16           57           65
                         6 | 7 7 8 8 9
 6         70 to 78      7 | 0 1 1 2 2 3 4 5 6 6 7       12           69           74
                         7 | 8
 7         79 to 87      7 | 9                            7           76           83
                         8 | 0 1 3 4 5 6
 8         88 to 96      8 | 8                            4           80           92
                         9 | 0 2 3
Total                                                   n = 80
Note: * The tally column is included here simply as an aid for determining the
frequencies. It is not a necessary part of a grouped-data frequency table. The tally
column here appears like a special stem-and-leaf plot.
In closing, there are two acceptable tables. The final choice lies with the user.
Development of the Iteration Algorithm
Based on the assumption that a hands-on approach would enable students to better grasp the usefulness of the concepts presented (see, for instance, Fago, 1996), the iteration algorithm was conceived. Statistics instructors know that it is hard for learning to occur when emotions and attitudes in the classroom range from boredom for some students to frustration, anger, and fear for others (Schultz, Drogosz, White, & Distefano, 1998). The iteration algorithm was therefore designed to alleviate the frustration, anger, and fear statistics users may have experienced with existing methods.
As stated earlier, the iteration algorithm was developed in response to the absence of an accepted consensus about hard-and-fast rules for making decisions about the aforementioned 4 key elements: Range, number of classes, class width, and starting point. The formulas that govern the choice of groups in a frequency distribution with the iteration algorithm are succinctly depicted below. Altogether, there are six formulas, but only five of them need to be used for any given set of raw data.
Determination of the Inclusive Range of Grouped Data
The iteration algorithm introduces two new elements in the traditional formula:
The accuracy or precision of the collected data (either integers or decimals) and the number of iterations involved in the computations. The modified formula reads as
Rt = [Xhighest – Xlowest + u * t],
where Xhighest is the highest observed score, Xlowest the lowest observed score, u represents one unit of measurement (10^-1, 10^-2, 10^-3, etc., for 1, 2, 3, etc. digits after the decimal point), and t is the number of tries or iterations needed to reach the final range for grouped data. An advantage of the Rt formula over the traditional R = Xhighest - Xlowest + 1 is that Rt is applicable to both whole-number and decimal observed data.
The different figures used for computations came from the same data set given in the example found in chapter one.
Here, Rt = (Xhighest - Xlowest + u ∗ t)
= 93 - 28 + 10^0 ∗ t = 65 + 1 ∗ t = 65 + t.
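The Rt formula translates directly into code; the following is a small sketch (the function name is illustrative):

```python
def inclusive_range(x_max, x_min, t, decimals=0):
    """Rt = X_highest - X_lowest + u * t, where u = 10 ** -decimals
    is one unit of measurement of the collected data."""
    u = 10 ** (-decimals)
    return x_max - x_min + u * t

print(inclusive_range(93, 28, 1))  # 66: the range at the first iteration
print(inclusive_range(93, 28, 5))  # 70: the range at the fifth iteration
```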
Selecting a Number of Class Intervals
The iteration algorithm works with two bounds: a minimum and a maximum.
Fortunately, the literature search identified two formulas that help get the maximum number of class intervals (Hoaglin et al., 1983, Velleman, 1976). It remained to identify two other formulas to serve as possible minimum values.
1. Minimum number of classes, kmin: kmin = √(2n) when n is less than 100 and kmin = 7 ∗ log10(n) when n is at least 100 (two newly developed formulas).
2. Maximum number of classes, kmax: kmax = 2 ∗ √n when n is less than 100 and kmax = 10 ∗ log10(n) when n is at least 100 (Hoaglin et al., 1983; Velleman, 1976).
In the absence of a generally accepted formula about the number of groups to retain in a frequency distribution, seven formulas found in the literature were reviewed, and one was finally chosen because of its ease of use (the function is available in almost all calculators) and because it seemed to minimize the grouping error at least for small sample sizes (less than 100). For numbers of observations not exceeding 100, the iteration algorithm initially opted for the √n formula to get a minimum number of intervals, as proposed by
Davies and Goldsmith (1980), Devore and Peck (2005), Howell (1997), Pelosi and
Sandifer (2000).
The formula √n was selected because it also constitutes a useful starting point
(Nelson, 1988) and it is easy to apply even with a basic hand calculator. The goal was to get a formula or formulas that attempt to compromise between the desire to summarize data and the desire to retain as much detailed information as possible.
In practical terms, the iteration algorithm had to come up with two formulas that would bring the computed statistical measures much closer to those calculated directly from the raw data. After many trials and errors, the formula √(2n) for n ≤ 100 was developed along with its corresponding 7 ∗ log10(n) for n ≥ 100, where 7 stands for √50 = 10/√2. At n = 100, these two formulas are equivalent (except for rounding errors). The two formulas were obtained by dividing each of Hoaglin's formulas by √2.
When applied to a first set of 150 random samples of standard normal data, the two new formulas created to compute a minimum number of classes helped all grouped-data statistical measures, and in particular the grouped-data standard deviation, to be much closer to the ones generated directly from the raw data than applying the formula √n alone did.
Furthermore, the two new formulas produce values similar to those found by
Sincich, Levine, and Stephan (1999) and, also, McClave, Benson, and Sincich (2005).
Table 10 below summarizes their findings.
Table 10
General Rule for Determining the Number of Classes in a Histogram by Sincich et al.

Number of observations in a data set   Number of classes
Less than 25                           5 or 6
25 – 50                                7 – 14
More than 50                           15 – 20
Those were the major reasons that militated in favor of their inclusion in the iteration algorithm. Their results compare well with those yielded by SPSS. Here the following are obtained:
Determination of the Minimum Value of k, the Possible Number of Classes
Applying the formula kmin = √(2n) for n ≤ 100 produces
kmin = √(2 ∗ 80) = √160 = 12.65 ≈ 12 (rounded down). This signifies that the minimum number of possible class intervals is 12.
Setting up the Maximum Value of k
Applying the formula kmax = 2 ∗ √n for n ≤ 100 yields
kmax = 2 ∗ √80 = 2 ∗ 8.94 = 17.89 ≈ 18 (rounded up). This means that the maximum number of possible class intervals is 18.
(Note that the number of groups, k, always lies between these two bounds; that is, kmin = 12 ≤ k ≤ kmax = 18.)
Fixing on the Class Interval Size
As stated by Glasnapp and Poggio (1985), “The most difficult part (if there is one) in constructing a grouped-data frequency distribution is determining the interval
width” (p. 62). The traditional formula R/k = w is modified to Rt/k = w, with w required to be an odd integer, and it operates by iteration until a compatible trio composed of Rt, k, and w is found, where kmin ≤ k ≤ kmax. Here Rt, k, and w are determined at the same iteration step, whereas with a traditional method the 3 variables are computed at 3 different times and separately.
With the same data from Table 4, one gets Rt/k = w if and only if the ratio Rt over
k yields an odd whole number. One can proceed by successive iterations.
First iteration (t = 1): R1/k = [93 – (28) + 1 * 1]/k = (65 + 1)/k, or 66/k, where R1 is the initial inclusive range or the range at iteration 1 and k is any integer between 12 and 18
(inclusive). The inclusive range at iteration 1, R1, of 66 is divisible by 2, 3, 6, 11, 22, 33, and 66. Unfortunately, none of them belongs to the given range of values from 12 up to
18, limits included.
Second iteration (t = 2): R2/k = [93 – (28) + 1 * 2]/k = (65 + 2)/k, or 67/k. No chance as 67 is a prime number.
Third iteration (t = 3): R3/k = [93 – (28) + 1 * 3]/k = (65 + 3)/k, or 68/k. The number 68 is divisible only by 17. Unfortunately, the ratio 68 over 17 is 4, an even value.
A fourth iteration is needed.
Fourth iteration (t = 4): R4/k = [93 – (28) + 1 * 4]/k = (65 + 4)/k, or 69/k. Only 3 and 23 (values falling outside the considered interval 12-18) divide 69. An additional iteration is still needed.
Fifth iteration (t = 5): R5/k = [93 – (28) + 1 * 5]/k, or 70/k. The number 70 is a multiple of 14. Thus, w = R5/k = 70/14 = 5 (an odd value). In other words, one may have k
= 14 class intervals with an equal class interval width, w, of 5. The final inclusive range R
(R = R5) for our grouped-data frequency table will be 70, and the number of necessary tries or iterations, t, is 5 (an odd value).
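The five iterations above can be condensed into a single search loop. The following is a minimal Python sketch for whole-number data (u = 1); the function name is illustrative:

```python
def find_trio(x_max, x_min, k_min, k_max, u=1):
    """Iterate t = 1, 2, ... until Rt = x_max - x_min + u*t is divisible
    by some k in [k_min, k_max] with an odd quotient w = Rt // k."""
    t = 0
    while True:
        t += 1
        r_t = x_max - x_min + u * t
        for k in range(k_min, k_max + 1):
            if r_t % k == 0 and (r_t // k) % 2 == 1:
                # iterations, final range, number of classes, class width
                return t, r_t, k, r_t // k

print(find_trio(93, 28, 12, 18))  # (5, 70, 14, 5), as in the fifth iteration
```

With decimal data (u = 0.1, 0.01, ...) the divisibility test would need to be done on scaled integers to avoid floating-point error; this sketch covers only the integer case worked in the text.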
Setting up the Starting Point
No known formulas are found with traditional methods. The iteration algorithm
applies these two new formulas: (Xlowest - t*u/2) if t is an odd number, and (Xlowest - t*u/2
+ u/2) if t is an even number, where t is the final iteration used to find w, the class width.
These two new formulas help to determine the starting point without resorting to
many unsuccessful trials, which, in turn, save time and limit errors associated with the
latter. Another advantage is that the last two formulas give directly class boundaries
rather than class limits.
With whole numbers, the two formulas to be applied here follow:
(Xlowest - t * 0.5), if t is an odd number of tries or iterations; otherwise
(Xlowest - t * 0.5 + 0.5), when t is an even number of tries or iterations;
where Xlowest is the smallest score observed in the data set and t denotes the number of
tries or iterations necessary to reach the smallest odd whole ratio, w, which represents the
class interval length or class interval width.
Now plugging the lowest score and the value 5 (an odd integer), the number t of
iterations, into the first of the two formulas shown above yields:
Xlowest - t * 0.5 = 28 - 5 * 0.5, which reduces to 25.5. Therefore, the lowest class
boundary becomes 25.5.
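The two starting-point formulas can be sketched as one helper (the function name is illustrative):

```python
def starting_point(x_min, t, u=1):
    """Lowest class boundary: X_lowest - t*u/2 when t is odd,
    X_lowest - t*u/2 + u/2 when t is even."""
    boundary = x_min - t * u / 2
    return boundary if t % 2 == 1 else boundary + u / 2

print(starting_point(28, 5))  # 25.5, the lowest class boundary in the text
```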
Table Preparation and Construction
The table should begin with the starting point. In constructing the grouped
frequency table, list the lower class boundary of the first class interval, the lowest class
boundary obtained by applying Step 4. Add the class interval width (w) to this first lower class boundary. Write down the upper class boundary. Then, list all the class boundaries by adding the class width to the boundaries of the previous class interval. After that, determine the class marks of each interval by averaging the class boundaries. A further task is to tally the frequencies for each class. Finally, sum the frequency column and check against the total number of observations.
First class interval: Lower class boundary = 25.5.
Upper class boundary = 25.5 + 1 ∗ w = 25.5 + 5 = 30.5.
Second class interval: Lower class boundary = 30.5.
Upper class boundary = 30.5 + w = 30.5 + 5 = 35.5.
Third class interval: Lower class boundary = 35.5.
Upper class boundary = 35.5 + w = 35.5 + 5 = 40.5.
And the last (fourteenth) class interval: Lower class boundary =
25.5 + 13 ∗ w = 25.5 + 13 ∗ 5 = 25.5 + 65, or 90.5.
Upper class boundary = 25.5 + 14 ∗ w = 25.5 + 14 ∗ 5 = 25.5 + 70, or 95.5.
Check: Final Range, R = Last upper class boundary – First lower class boundary =
95.5 - 25.5 = 70 (same result as before). The grouped-data frequency distribution is presented in Table 11 below and the corresponding histogram is displayed by Figure 4. It looks nicer than the default histogram generated by SPSS (Figure 2).
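Listing the class boundaries and verifying the final range check can be sketched as:

```python
start, w, k = 25.5, 5, 14          # values found by the iteration algorithm
boundaries = [start + i * w for i in range(k + 1)]
midpoints = [(a + b) / 2 for a, b in zip(boundaries, boundaries[1:])]
print(boundaries[0], boundaries[-1])   # 25.5 95.5
print(boundaries[-1] - boundaries[0])  # 70.0, the final inclusive range R
print(midpoints[0], midpoints[-1])     # 28.0 93.0
```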
Table 11
Grouped-data Frequency Table for the Statistics Final Exam Grades of 80 Students
(Iteration Algorithm)
Class boundary   Frequency   Cumulative   Class      Relative frequency
                 (ni)        frequency    midpoint   in percent
                                          (xi)
25.5, 30.5        2            2           28         2.5*
30.5, 35.5        3            5           33         3.75*
35.5, 40.5        4            9           38         5
40.5, 45.5        6           15           43         7.5
45.5, 50.5        7           22           48         8.75
50.5, 55.5        8           30           53        10
55.5, 60.5       11           41           58        13.75
60.5, 65.5        9           50           63        11.25
65.5, 70.5        8           58           68        10
70.5, 75.5        7           65           73         8.75
75.5, 80.5        6           71           78         7.5
80.5, 85.5        4           75           83         5
85.5, 90.5        3           78           88         3.75
90.5, 95.5        2           80           93         2.5
Total            n = 80                              100
Note: * 2.5 = (2 ∗ 100)/80 and 3.75 = (3 ∗ 100)/80
[Histogram of the students' grades by the iteration method's choices; x-axis: grades obtained in the final statistics exam; y-axis: frequency; Std. Dev = 15.96, Mean = 60.0, N = 80.]
Figure 4. Histogram by the iteration algorithm.
Iteration Algorithm versus Traditional Method
The iteration algorithm was labeled a structured, near-error-free or near-errorless tactic, while the traditional instructional method was defined as a generate-and-test learning process. It was posited that students would learn more using the iteration and generate-and-test methods than using the traditional instructional method alone. At least 3 investigations, reported below, support this postulation. First, Moore and Goldiamond (1964) used a fading procedure to train form discrimination in nursery school children and concluded that “the errorless procedure was more efficient than trial-and-error procedures in producing the most perfect learning with the least practices” (p. 269).
Second, Heckler, Fuqua, and Pennypacker (1975) investigated the relative efficacy of an errorless, fading procedure for teaching complex verbal statements to 23 undergraduate psychology students. They discovered that “learning under the fading procedure was significantly better than learning by trial-and-error controls or by an experimental group using a non-fading method” (p. 103).
More recently, William, Ludlow, and Walls (1978) used prompting and fading techniques to compare teaching tasks under a few-or-no-errors condition with trial-and-error learning. Their findings indicate that “percentage of errors was less for errorless fading than trial-and-error in initial learning, but did not differ during transfer or retention” (William et al., 1978, p. 553).
Monte Carlo Simulations
Monte Carlo simulations consist of comparing the outcomes from the iteration algorithm against those generated by SPSS. For data analysis purposes, the values on each of the 9 statistical measures serve as the dependent variable; the independent variable is the algorithm used (iteration versus SPSS). The goal is to discover which algorithm produced statistical measures closer to those calculated directly from the raw data, for each of the 9 statistical measures and for varying sample sizes. For each statistical measure, the procedure consists of computing, on one hand, the differences between the values calculated directly from the raw data and the values obtained from the grouped-data frequency tables yielded by SPSS and, on the other hand, the differences between the values calculated directly from the raw data and the values obtained from the grouped-data frequency tables produced by the iteration algorithm. With this in mind, this preoccupation is investigated in 3 steps.
First, on average, are the computed values on each of the 9 statistical measures much closer between raw data and iteration algorithm than between raw data and SPSS?
Second, on average, do the computed values on each of the 9 statistical measures differ among the two computing methods used according to the sample size (small, medium, and large) or differ overall?
Third, does the SPSS algorithm or the iteration algorithm have one effect in one sample size, but a different effect in another sample size? For example: Does the iteration algorithm work better for large sample sizes than for any other sample sizes?
Random Number Generation of Normal Data
Before reporting the main outcomes of the Monte Carlo simulations, it is worth noting that the normal data used in this study were randomly generated by running the
Random Number Generation function of the Microsoft Excel (2003 edition). The actual random number generation process was completed in early 2004 (between January and
February, 2004). The recently reported problem of getting numerous negative normal data values was not observed during that period of time.
For replication purposes, the following steps were followed. First, open an Excel spreadsheet. This is done by clicking on the Start button, then choosing Programs, then
Microsoft Office, and Microsoft Office Excel 2003. Second, from the Excel main menu, select Tools, then Data Analysis, and Random Number Generation. Click OK. Third, in the slot in front of Number of Variables, enter 1. Fourth, in the slot in front of Number of
Random Numbers, enter 50 (to get a small sample size) or 100 (for a medium sample size) or 250 (for a large sample size).
Fifth, choose the Normal distribution from the list of Distributions. Leave intact the default format. That is, leave unchanged Mean = 0 (zero) and Standard Deviation = 1
(one). Sixth, the Random Seed is chosen in the following way: In the slot appearing in front of Random Seed, enter any number between 1 and 999, for instance. Keep track of the random seed entered each time you let Excel generate a set of random numbers. In this way, no random seed will be entered more than once for the entire process. Seventh, select Output Range and enter A1:A50 in the corresponding dialog box. Press OK.
Finally, repeat the above process 150 times (50 times for each particular sample size).
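For readers without Excel 2003, an equivalent sample can be drawn in Python. This is a stand-in sketch, not the generator actually used in the study, and the seed value is illustrative:

```python
import random

random.seed(123)  # record each seed so that none is reused, as in the text
sample = [random.gauss(0, 1) for _ in range(50)]  # one small-size sample
print(len(sample))  # 50 standard normal values (mean 0, std dev 1)
```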
Main Outcomes
For each of the 9 statistical measures for all 150 random samples, two sets are computed: Set I = (Raw Data Statistical Measure - SPSS Statistical Measure) and Set II = (Raw Data Statistical Measure - Iteration Statistical Measure). Then each difference set is squared. After that, the square roots of differences between the two sets for each individual measure, that is, √((Set I)² − (Set II)²), are compared: (1) for all 150 samples and (2) for each of the three groups of 50 samples according to their respective size (small size = 50 normal random numbers, medium size = 100, and large size = 250). The smaller the results obtained, the closer the results are to those produced directly from the raw data.
In Table 12 and Table 13, each third row (SPSS - Iteration) shows the difference of outcomes from the two algorithms. A positive difference means the result from the iteration algorithm is smaller. In other terms, a positive difference signifies that the iteration algorithm outcome is much closer to the corresponding value computed directly from the raw data than the comparable outcome produced by SPSS. Since there are more
positive differences than negative ones, it can be concluded that, overall, the algorithm used by SPSS almost always yielded much higher deviations (in absolute values) than did the iteration algorithm. This turns out to be to the advantage of the iteration algorithm.
In particular, the sample size plays an important role in the comparison between the SPSS and the iteration algorithm outputs, mainly for large sample sizes. In fact, SPSS produced here only one statistical measure of dispersion that is closer to the corresponding measure computed directly from the raw data than did the iteration algorithm; and this number of closer statistical measures goes up as the sample size decreases from 250 down to 50. Overall, the iteration algorithm produced 20 much closer results out of the 27 comparisons made. In other terms, the values from the iteration algorithm were smaller than those from the SPSS algorithm. Further details follow in the next two tables provided below.
Table 12
Overall Summary Table of Square Root of Deviations (SRD)
Method Used Mean Median STD Range First Quartile
SPSS 0.064995 0.099261 0.01392 0.365533 0.078972
Iteration 0.010200 0.042151 0.01436 0.061817 0.094111
SPSS – Iteration 0.054795 0.057110 -0.0004 0.303716 -0.015139
Which method yields
a smaller SRD? Iteration Iteration SPSS Iteration SPSS
(table continues)
Table 12 (continued)
Overall Summary Table of Square Root of Deviations (SRD)
Method Used        Third Quartile   Semi-IQR   Skewness   Coefficient of Variation
SPSS 0.074329 0.05152 0.306839 171.4759
Iteration 0.046900 0.05285 0.212013 141.3795
SPSS – Iteration 0.027429 -0.0013 0.094826 30.09640
Which method yields a
smaller SRD? Iteration SPSS Iteration Iteration
For the 150 random samples combined: Out of the nine (9) statistical measures, SPSS outperforms the iteration algorithm in just three (3) cases (all are statistical measures of dispersion).
The results of these 3 cases follow: STD (Standard Deviation) = -0.0004, First Quartile =
-0.0151, and Semi-IQR (Semi Interquartile Range) = -0.0013.
Although the number of class intervals produced by the SPSS and the iteration algorithms depended on both the sample size (the number of scores observed) and the range (the difference between the largest score and the smallest score), the two algorithms brought in quite varying class interval numbers. The SPSS algorithm produced numbers of class intervals extending from 9 to 29, while the iteration yielded numbers of class intervals ranging from 14 to 24. Overall, the SPSS algorithm tended to generate a larger number of class intervals than did the iteration algorithm. This could
explain, at least partially, why the SPSS generated much better STD, first quartile, and semi-IQR values than did the iteration algorithm.
Table 13
Summary Table by Sample Size of Square Roots of Deviations (SRD)
                   Mean       Median     STD        Range      First Quartile
Large Size 1 0.109628 0.156118 0.00639 0.284089 0.114089
2 0.005077 0.027832 0.00629 0.063891 0.052469
3 0.104551 0.128286 0.0001 0.220198 0.06162
4 Iteration Iteration Iteration Iteration Iteration
Small Size 1 0.016068 0.063369 0.01806 0.403356 0.066686
2 0.01513 0.055426 0.02116 0.038497 0.150813
3 0.000938 0.007943 -0.0031 0.364859 -0.084127
4 Iteration Iteration SPSS Iteration SPSS
Medium Size 1 0.019935 0.035604 0.01464 0.39751 0.036088
2 0.008423 0.038518 0.01167 0.077447 0.033089
3 0.019935 -0.00291 0.00297 0.320063 0.002999
4 Iteration SPSS Iteration Iteration Iteration
(table continues)
Note: 1 = SPSS 2 = Iteration algorithm 3 = Difference between the SPSS and the iteration algorithms 4 = Which method has the smaller SRD?
Table 13 (continued)
Summary Table by Sample Size of Square Root of Deviations (SRD)
                   Third Quartile   Semi-Interquartile   Skewness   Coefficient of Variation
Large Size 1 0.114197 0.02062 0.393863 134.6167
2 0.028106 0.03252 0.175954 101.4435
3 0.086091 -0.0119 0.217909 33.1732
4 Iteration SPSS Iteration Iteration
Small Size 1 0.048643 0.08271 0.290599 221.0278
2 0.058928 0.08062 0.244528 206.8633
3 -0.010285 0.00209 0.046071 14.16450
4 SPSS Iteration Iteration Iteration
Medium Size 1 0.034164 0.02666 0.221641 145.7293
2 0.048466 0.02896 0.217254 82.95367
3 -0.014302 -0.0023 0.004387 62.77563
4 SPSS SPSS Iteration Iteration
Note: 1 = SPSS 2 = Iteration algorithm 3 = Difference between the SPSS and the iteration algorithm 4 = Which method has the smaller SRD?
Large Size
The SPSS algorithm did relatively better than did the iteration algorithm in one
(1) case (a measure of dispersion), that is, SIQ (Semi Interquartile Range) = -0.0119.
Small Size
The SPSS algorithm outperforms the iteration algorithm in three (3) cases (all are statistical measures of dispersion): STD (Standard Deviation) = -0.0031, First Quartile = -0.0841, and Third Quartile = -0.0103.
Medium Size
The SPSS algorithm outperforms the iteration algorithm in three (3) cases (the first is a statistical measure of location and the other two are measures of dispersion).
Outcomes of the 3 cases are given below: Median = -0.0029, Third Quartile = -0.0143, and SIQ (Semi Interquartile Range) = -0.0023.
Conclusion
As hypothesized, for each of the 9 statistical measures, the square roots of deviations between the raw data and the grouped data yielded by the iteration algorithm are, on average, smaller than those between the raw data and the grouped data produced by the SPSS algorithm, for varying sample sizes. In light of the results obtained so far from the Monte Carlo simulations, the iteration algorithm holds promise for providing valid grouped-data frequency tables. Further development work is recommended, mainly in the areas of teaching and learning.
The next chapter presents in detail variables to be used in the survey and factors retained in the scale utilized. The instrument included in the survey helped collect the data that would adequately tackle the issues raised in chapter one. Chapter 4 also exhibits the research design consistent with the research questions along with statistical methods essential to answer them.
CHAPTER 4: METHODOLOGY
Introduction
The following sections introduce, in order, the goals, methods, and participants.
Next, brief scale outcomes from the surveys are presented. Finally, methods of data collection and data analysis are discussed.
Research Design
Intact class sections from two academic departments are used. Ohio University students were selected from 4 class sections of the mathematics department and also from
4 other class sections of the psychology department. All of these students were attending an elementary statistics class during the 2005 summer sessions and 2005 fall quarter at
Ohio University, Athens Campus.
In either department, students from two sections received the traditional method and students from the other two sections got the iteration algorithm. Students in each class section were taught by their regular instructor, either the iteration algorithm or the traditional method, but not both. The survey was administered only once to each departmental class section.
Advantages of the Intact Group Comparison Design
From the related literature, the intact or static group comparison design used here has a number of advantages. Some of the most important advantages are listed below.
First, the data collection process with this type of design is very feasible and relatively easier since it uses pre-existing samples (Wiersma, 2002). Second, it is economical.
Third, there is a rapid turn-around in data collection. Fourth, the participants are measured in their natural setting. Fifth, it can identify attributes of a population from a
small group of individuals, as presented in Fowler (1988). Sixth, there is a possibility of gathering data on many variables such as attitudes, perceptions, beliefs, and behaviors.
Seventh, it is good for exploratory research or pilot studies since it may provide a moderate basis for comparing treatments. An eighth advantage is that it generates hypotheses for future research and provides data that may be useful to many researchers.
Finally, the presence of a comparison group does control for history, because it is assumed that events occurring outside of the experimental setting will equally affect both groups (Gay & Airasian, 2000; Gravetter & Wallnau, 1985; Sanders et al., 1976).
Disadvantages of the Intact Group Comparison Design
There are two major problems with this research design. The first is the lack of random assignment of subjects. The second is that there are no pretest data to provide any basis for group equivalency. The participants may have had different skill levels prior to the survey and therefore would have performed independently of their respective academic department.
Operational Definitions of the Variables
Survey of Attitudes toward the Method Used
Six main variables are considered overall. On one hand, there are two independent variables: departmental class section, with two categories (MATH 250 and PSY 120), and the instructional method taught, also with two categories (the treatment, or iteration algorithm, and the control, or traditional method). On the other hand, under the attitude subscale of the Survey of Attitudes towards the Method Used (SAMU) scale, there are four quantitative continuous dependent variables: the total attitude scores obtained by students on each of the four subscales (affect, cognitive competence, value, and difficulty) found in the original Survey of Attitudes towards Statistics (SATS) instrument developed by Schau, Stevens, Dauphinee, and Del Vecchio (1995).
Measures to Represent the Student Perceived Ability in Statistics
Two additional independent variables were considered as measures of student perceived ability in statistics: expected course grade and perceived mathematics ability (Perney & Ravid, 1991; Schultz, Drogosz, White, & Distefano, 1998; Onwuegbuzie & Wilson, 2003). The expected course grade originally served as a covariate; ten levels (from grade F to grade A) are assigned to this variable. It turned out not to be an appropriate covariate for the present data. The variable student perceived mathematics ability, measured by the item "How good at mathematics are you?" with seven levels (from 1 = Poor up to 7 = Excellent), served as the alternate covariate.
Identification of Population
Ideally, our target population consists of all American students with some basic statistics pre-knowledge, or mathematics competence, or computer literacy, or a combination of some or all of these factors. In practice, the accessible population consisted of all Ohio University undergraduate students who had taken at least one statistics course during the 2005 summer sessions or the 2005 fall quarter.
Sampling Plan
A relatively large sample size was needed in order to increase statistical power. Efforts were made to reach out to Ohio University undergraduate students willing to participate in the experiment. The incentive used to attract more participants was a bonus of 2 extra points.
Four class sections were selected, two from each department: two sections of the MATH 250 course from the mathematics department and two sections of the PSY 120 class from the psychology department. In each department, the instructor retained was teaching two class sections per quarter.
Both the MATH 250 (Introduction to Probability and Statistics) and the PSY 120
(Elementary Statistics Reasoning) courses are offered during each fall quarter, spring quarter, and summer session. It was the instructor who decided which section would be the treatment group (the group taught the iteration algorithm) and which section would serve as the control group (the one taught the traditional method). Altogether, 200 questionnaires were distributed, and 170 students completed the surveys and turned in their questionnaires to their respective instructors. Hence, the response rate was 85%. Of these 170 students, 84 were from the math department and 83 were taught the iteration algorithm.
Instrumentation
Selection/Development of Instrument
Many efforts have been made to investigate affective factors that influence a student’s statistical performance. Much of this research involves surveys designed to quantify students’ attitudes toward statistics. In efforts to quantify and understand these attitudes, at least seven Likert-type instruments claiming to measure attitudes towards statistics have been developed (Hilton et al., 2004; Mills, 2004). These instruments include the Multifactorial Scale of Attitudes Toward Statistics (Auzmendi, 1991), Bad Attitude
Toward Statistics (Berk & Nanda, 1998), Statistics Attitude Survey (Roberts &
Bilderback, 1980), the Survey of Attitudes Toward Statistics (SATS; Schau, Stevens,
Dauphinee, & Del Vecchio, 1995), Students’ Attitude Toward Statistics (Sutarso, 1992a and 1992b), and Attitudes Toward Statistics (Wise, 1985). Additionally, at least two
Likert-type instruments claiming to characterize how statistics anxiety influences statistical performance have been designed (Mills, 2004). They include Statistical
Anxiety Rating Scale (Cruise, Cash, & Bolton, 1985) and Statistics Anxiety Inventory
(Zeidner, 1991).
Given that the objective was to use a validated, accepted tool solely to assess student attitudes and achievement skills, the SATS-Schau was retained in this work.
More precisely, the SATS-Schau was selected because it had been used in at least four large-scale investigations and two confirmatory factor analysis studies had already established its construct validity (Schau et al., 1995; Dauphinee et al., 1997; Mills, 2004). Also, the subscales on the survey represent important attitudes that are related to student achievement (Elmore, Lewis, & Bay, 1993; Schau et al., 1993), and, more importantly, it is the instrument closest to our research needs.
The SAMU instrument used here (shown in Appendix B) is an adapted and reduced version of the SATS (Schau et al., 1995) to better match our purposes.
Permission was sought several times, but unfortunately the principal author never responded to our repeated requests. This new instrument, the SAMU, is a questionnaire consisting of 26 initial statements. Using a Likert-type scale, all students responded to each of the first 20 statements (attitude items) by circling a number from 1 to 7, where 1 indicates “Strongly Disagree”, 2 represents “Moderately Disagree”, 3 signifies “Slightly Disagree”, 4 stands for “Neither Agree nor Disagree”, 5 symbolizes “Slightly Agree”, 6 designates “Moderately Agree”, and 7 denotes “Strongly Agree”. These first 20 items are designed to measure a student’s attitudes toward each of the two instructional methods taught, in terms of their respective applicability.
The scores of the next 5 statements (background characteristics items) were also recorded on a 7-point Likert-type scale. These five questions are related to background information (mathematics perceived competency, computer literacy, statistics pre- knowledge, length of high school mathematics exposure, and the cumulative number of college mathematics/statistics courses taken).
For the last question, the response scale was a 10-point scale, with 1 = F, 2 = D, 3 = C-, 4 = C, 5 = C+, 6 = B-, 7 = B, 8 = B+, 9 = A-, and 10 = A. This item reflects the letter grade a student expected to earn in either MATH 250 or PSY 120.
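The 10-point coding just described can be written as a simple lookup table; the sketch below is our own illustration, and its names are not part of the SAMU instrument itself:

```python
# Coding of the expected-course-grade item, as described for the SAMU
# questionnaire: 1 = F up to 10 = A (illustrative names, ours).
GRADE_SCALE = {
    "F": 1, "D": 2, "C-": 3, "C": 4, "C+": 5,
    "B-": 6, "B": 7, "B+": 8, "A-": 9, "A": 10,
}

def code_expected_grade(letter: str) -> int:
    """Map an expected letter grade to its 10-point numeric code."""
    return GRADE_SCALE[letter]
```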
An additional seven self-efficacy items are included in the SAMU instrument.
These are dichotomous items scored in a 2-point format, anchored at both ends (with Pass = 1 and Fail = 0, respectively).
Four subscales are derived from the first 20 statements; they are briefly described below. The main definitions are those provided by Schau et al. (1995) and Dauphinee et al. (1997). Our interpretations are inspired by examples found chiefly in
Cruise et al. (1985), but also in Bell (1998a and 1998b), Cherian & Glencross (1997),
Fullerton & Umphrey (2001), Kennedy & McCallister (2001), Sutarso (1992a and
1992b), Zanakis & Valenzi (1997), and Zeidner (1991).
Facet 1: Affect. This factor deals with examinees’ positive and negative feelings concerning the instructional method used. It is posited (based on past experience and a review of the related literature) that those who score higher on this factor are most likely to have more positive attitudes toward the instructional method presented to them. Since frequency tables are among the techniques taught in statistics courses, higher scores on this facet can be interpreted as an indication of more positive attitudes toward both statistics and the instructional method used for constructing grouped-data frequency distributions. Those with lower scores on this facet may exhibit more negative attitudes toward both statistics and the instructional algorithm used. This facet has 5 items (Items 1, 2, 10, 11, and 15).
Facet 2: Cognitive Competence. This factor concerns attitudes about intellectual knowledge and skills when applied to statistics. Those skills may be acquired in a number of ways, such as through heredity (e.g., parental influence: if both parents were strong in mathematics/statistics, some of their children would most likely be too) or through long, hard, and successful practice (e.g., computer literacy, mathematics background, statistics pre-knowledge). People with higher scores on this factor could be those who feel adequate when applying mathematical/statistical formulas. They also feel more at ease when dealing with large sets of data than participants with significantly lower scores. High (or low) scores would be interpreted as an indication of positive (or negative) feelings toward either or both instructional methods.
This component also comprises 5 items (Items 3, 8, 12, 14, and 18).
Facet 3: Value. This factor deals with attitudes about the usefulness, relevance, and worth of statistics in personal and professional life. In particular, it captures the interest that people show in the instructional method presented and its possible use in their future professional lives. This would arise, for example, from using the method in classroom activities or professional work, or even recommending it, in conjunction with technology, for classroom teaching purposes. On one hand, in this computer age, some students may feel that computers should do everything and relieve human beings of all sorts of complicated mathematical computations that used to be done by hand and/or with hand calculators. Such students, over-dependent on computers, frequently dislike doing computations by hand and/or calculator. On the other hand, there are some students who do not like using computers and do not like hand and/or calculator computations, either. As a result, a person scoring very low on this facet of the SAMU instrument sees little or no benefit or advantage in the instructional method used. An examinee scoring low on this factor also feels that the instructional algorithm used is almost worthless, now and in the future, thus indicating a negative attitude toward it. Again, this facet has 5 items (Items 5, 7, 9, 17, and 19).
Facet 4: Difficulty. This factor reveals the examinees’ attitudes about the difficulty of statistics as a learning domain. On the SAMU instrument, a very high score on this factor indicates a very positive overall attitude towards the method for building frequency tables for grouped data. Conversely, a student who scores low on this factor feels less confident using the method than the majority of his/her classmates. In addition, low scores would most likely mean that the scorers had problems understanding the instructional method used, because it was harder for them to learn and, probably, they did not like it at all. Consequently, students who score low on this factor are expected to exhibit, overall, more negative attitudes toward the method for constructing grouped-data frequency distributions. Like the other factors, this one includes 5 items (Items 4, 6, 13, 16, and 20).
Even though, strictly speaking, it is difficult to assess the quality of the data collected, it is easier to assess the accuracy of the instrument used to collect those data. This assessment consists primarily of examining the reliability and validity characteristics of the survey or measurement instruments (Litwin, 1995, p. 3).
Only the first of these two important issues is addressed in the Pilot Studies.
Pilot Studies
To check for reliability and item functioning, the instrument was tested during the 2002 fall quarter and the 2003 fall quarter. Two convenience samples were used for these two pilot studies. A total of 45 graduate OU students responded, including 20 females and 25 males; among them were 19 Master’s and 26 PhD students. On each administration of the test, the same scale was used twice: once with data based on the iteration algorithm, and once with the traditional algorithm as the data provider.
Reliability Issues
Cronbach’s Alpha as an Index of Reliability
Reliability is concerned with the consistency, stability, and dependability of the results (McMillan, 1997, p. 60). In particular, according to Litwin (1995), it is a
“statistical measure of how reproducible the survey instrument’s data are” (p. 6). For the
SAMU scale used, alpha reliability coefficients were computed and the results analyzed and compared. There is a wide variety of internal consistency measures that can be used, but Cronbach’s alpha was chosen because it is a popular method of measuring reliability (Aron & Aron, 1999, p. 527). Cronbach’s alpha was also used because the scale was administered only once. In addition, it was the coefficient Schau et al. (1995) used for their SATS instrument, which serves here as our criterion.
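For reference, Cronbach’s alpha for a k-item scale is alpha = k/(k - 1) x (1 - sum of the item variances / variance of the summed scale). The short sketch below is our own illustration of that formula; the study itself computed alpha in SPSS:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) matrix of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```

Applied to an (n respondents x 5 items) matrix of subscale responses, it produces coefficients of the kind reported in Table 14.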
Pilot Survey Results
Out of a total of 50 students, 45 of them returned their filled-out questionnaires.
This is a 90% response rate. Table 14 reports measures of internal consistency reliability among the dependent variables. There are 26 items overall: 20 items for the attitude subscales, 5 items for the background subscale, and 1 item that serves as the covariate. The 7 items for the self-efficacy subscale are not considered in the item analysis because they concern only the iteration algorithm.
Table 14
Reliability Statistics for the Iteration Algorithm and the Traditional Method
Dependent Variable        Iteration Algorithm          Traditional Method
                          Cronbach's Alpha   Items     Cronbach's Alpha   Items
Attitude Subscale
  Affect                  .857               5         .820               5
  Cognitive               .762               5         .743               5
  Value                   .805               5         .805               5
  Difficulty              .761               5         .723               5
  Overall                 .949               20        .941               20
Based on the 20 overall items, the Cronbach’s Alpha coefficients were .95 and .94 for the iteration algorithm and the traditional method, respectively. As both coefficients are in excess of .90, one can be satisfied that the Cronbach’s Alpha coefficients are sufficiently high to suggest scale reliability for respondents like those in the study.
Further, these results are in agreement with those found by Hilton et al. (2004) and
Finney and Schraw (2004) who reported separately an overall Cronbach’s coefficient, α, for the post-test version of .90 and .91, respectively. Hence, the scale seems to be reasonably reliable and was not markedly revised.
Table 14 also gives values for the attitude subscales, listed for the traditional method and the iteration algorithm, respectively: SAMU Affect (.82 and .86), SAMU Cognitive Competence (.74 and .76), SAMU Value (.81 and .81), and SAMU Difficulty (.72 and .76). Although these values might overestimate the population coefficient alphas, because the same sample was used both to conduct the item analyses and to compute the reliability estimates, they remain consistent with what Mills (2004) reported: SATS Affect (.81 and .85), SATS Cognitive Competence (.77 and .83), SATS Value (.80 and .85), and SATS Difficulty (.64 and .77).
Table 15
Subscale Correlation Matrix for the Iteration Algorithm
             Affect    Cognitive   Value     Difficulty
Affect       1.000     0.794       0.711     0.750
Cognitive    0.794     1.000       0.896     0.858
Value        0.711     0.896       1.000     0.816
Difficulty   0.750     0.858       0.816     1.000
Table 16
Subscale Correlation Matrix for the Traditional Method
             Affect    Cognitive   Value     Difficulty
Affect       1.000     0.719       0.678     0.688
Cognitive    0.719     1.000       0.868     0.798
Value        0.678     0.868       1.000     0.774
Difficulty   0.688     0.798       0.774     1.000
In closing, an analysis of the subscale correlation matrices indicated that the intercorrelations between components, or dimensions, were quite high, since all the correlations were at least roughly .70 for both teaching methods (refer to Table 15 and Table 16 above). More importantly, all the bivariate correlation coefficients were significant at the 0.01 level (two-tailed test). The analysis also indicated that the dependent variables are related dimensions, or components, of this 20-item scale. The dimensions did appear to assess attitudes toward the instructional method used.
Data Collection Procedures
Once the instructor had decided which instructional method each class section would be taught, he or she illustrated the chosen method with a fully worked-out example and the necessary explanations. The worked-out example was provided to each instructor by the researcher. The students were requested to fill out one scale. Per Ohio University’s requirements, students asked to participate in this survey did so willingly and with due informed consent. Those who did not want to participate still had the option to receive the scale and then either decline to fill it out and leave immediately, or stop their participation at any later time they felt like doing so.
Although the students were not required to write down their names, each questionnaire included a statement assuring the confidentiality of the respondents’ responses.
Data were collected with the 26-item scale. The first 20 items related to the students’ attitudes towards the method taught to them; the next 5 items concerned their background information. Because only the iteration algorithm has a unique-solution feature, students were tested only on it, using the 7 extra items, to evaluate their mastery of the material taught. These last seven items were fill-in questions and required students to carry out actual calculations using the formulas provided. The self-administered questionnaire took roughly fifty minutes to complete for each of the two instructional methods used. Students turned in the completed questionnaires to their instructor as soon as they were done.
Based on our two pilot studies, an overall response rate of about 87% was anticipated, which led to distributing around 200 copies (50 copies to each of the 4 class sections). Exactly 170 students (85% of the 200) turned in their filled-out questionnaires to their respective instructors. The scale was coded and used once; the data were then entered into SPSS, based on these codes, as they were received.
Data Analysis Procedures
Statistical Methods
For the attitudes survey, a 2 x 2 between-subjects factorial multivariate analysis of covariance (Two-Way MANCOVA) was conducted using the 4 SATS-Schau subscales
(affect, value, cognitive competence, and difficulty) as dependent variables. The independent variables are the instructional method (iteration or traditional) and the departmental class section (mathematics or psychology). The covariate is a variable that measures the student perceived ability in statistics.
As stated earlier, one of the hypotheses to be tested is that students’ attitude scores will vary according to the instructional method used and/or the departmental class section. To reduce the random error introduced by differences in student background experience (Tabachnick & Fidell, 2001, p. 275), the student perceived ability in statistics was chosen as a covariate in the analyses. Two variables were thought to measure this perceived ability: expected course grade and perceived mathematics ability.
The expected course grade was initially retained as a possible covariate here because of its known association with the students’ attitudes towards statistics (Perney &
Ravid, 1991; Onwuegbuzie, 2000). The expected grade in the course has ten categories, ranging from grade F to grade A. The review of the related literature has shown that attitudes are related to other variables, among which perceived mathematics ability often appears (Brown & Brown, 1995; Kottke, 2000). Galagedera et al. (2000) further observed a significant positive association between perceived mathematics ability and the expected grade in an elementary statistics course. Zeidner (1991), for his part, conceded that lack of previous exposure to statistics and limited mathematics proficiency are known to be associated with negative attitudes, which, in turn, inhibit learning.
Therefore, a second variable considered as a reliable alternate covariate was the student perceived mathematics ability. This information was gathered with the item “How good at mathematics are you?”
The multivariate analysis of covariance (MANCOVA) was chosen because of its ability to handle multiple dependent variables and covariate(s) and to rigorously test for differences while controlling for Type I error. Additionally, testing several dependent variables simultaneously may reveal complex interactions missed by univariate tests
(Scheiner, 2001; Stevens, 2002).
MANCOVA is similar to MANOVA, but in this case one interval-level independent variable was added as a covariate. This covariate serves as a control variable for the independent factors, acting to reduce the error term in the model. Like other control procedures,
MANCOVA can be seen as a form of “what if” analysis, asking what would happen if all cases scored equally on the covariate, so that the effect of the factors over and beyond the covariate can be isolated (Green et al., 2003; Scheiner, 2001; Stevens, 2002; Tabachnick
& Fidell, 2001).
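As an illustration only, the layout just described can be sketched in code. The data below are randomly simulated, every variable name is our own invention, and the study itself ran the analysis in SPSS’s MANOVA program, not in Python; the sketch simply shows how a 2 x 2 MANCOVA with one covariate can be specified with the statsmodels library:

```python
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Simulated stand-in for the study data (all names hypothetical)
rng = np.random.default_rng(42)
n = 160
df = pd.DataFrame({
    "method": rng.choice(["iteration", "traditional"], size=n),
    "department": rng.choice(["math", "psych"], size=n),
    "ability": rng.integers(1, 8, size=n),  # covariate: perceived ability, 1-7
})
for sub in ["affect", "cognitive", "value", "difficulty"]:
    # each subscale is a 5-item total, so it ranges from 5 to 35
    df[sub] = rng.integers(5, 36, size=n).astype(float)

# Four dependent variables; two crossed factors plus the covariate
mano = MANOVA.from_formula(
    "affect + cognitive + value + difficulty ~ method * department + ability",
    data=df,
)
print(mano.mv_test())  # Wilks' lambda, Pillai's trace, etc. for each term
```

With real data, the method-by-department interaction would be examined first, then the main effects, each evaluated with Wilks’ lambda as in the SPSS output.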
Power and Effect Size
The power of a test is a tool for assessing the sensitivity of a test or for comparing two competing test procedures. It is the probability of rejecting a null hypothesis when it is, in fact, false. The alpha level, the size of the treatment effect, and, especially, the sample size affect the power of a statistical test. The power of a test is directly proportional to sample size and effect size, and inversely related to p-value (Sharma, 1996, p. 350). Larger sample sizes are likely to result in high power because the closer the sample composition is to the actual population, the lower the unexplained, or error, variance. If the power of an experiment is low, then there is a good chance that the experiment will be inconclusive. That is why it is so important to consider a priori power in the design of experiments, before collecting the data. However, “power alone is not enough in determining an effect of a certain size” (Capraro, 2002, p. 12). That is why it is also necessary to determine an a priori effect size.
The effect size is defined in many ways by different authors. Cohen, for instance, defined the effect size as the amount of variability in the dependent variable that can be accounted for or attributed to the independent variable (Cohen, 1988). Hypothesis testing gives us an indication that some type of reliable effect occurred in our study, while effect size tells us the magnitude of that effect or the strength of the relationship among variables. That is, hypothesis testing answers a “yes-no” question, while an effect size calculation answers a “how much?” question (Goodwin, 2002).
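To make the “how much?” answer concrete under Cohen’s (1988) conventions: eta squared, the proportion of dependent-variable variance accounted for, converts to Cohen’s f by f = sqrt(eta^2 / (1 - eta^2)). A one-line sketch of the conversion (our own illustration):

```python
import math

def cohens_f(eta_squared: float) -> float:
    """Convert eta squared (variance accounted for) to Cohen's effect size f."""
    return math.sqrt(eta_squared / (1.0 - eta_squared))

# Cohen's "medium" f of .25 corresponds to an eta squared of about .059
print(round(cohens_f(0.0588), 3))  # -> 0.25
```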
Even if “statistical” significance is found (the null hypothesis was rejected), the implications of the results do not necessarily warrant “practical” significance, which can often be revealed by the effect size, or “clinical” significance (Thompson, 2002). It can be noted, however, that Nickerson (2000) also warns that a “large effect is not a guarantee of importance any more than a small p-value” (p. 257). In other words, a small p-value or a large effect size does not necessarily indicate that the results are important to
“real-world” application (Nickerson, 2000).
Furthermore, to avoid any confusion in this interpretation of the results, the phrase
“statistically significant” should be employed instead of simply “significant” (Carver,
1978; Nickerson, 2000; Thompson, 1994; Thompson, 1996). “Significant” implies
“important” and, again, the statistically significant results may not be necessarily important in reality (Kennedy, 2002, p. 6).
Estimates of a Priori Power and a Priori Effect Size from Sample-Power
The following is a text report from the Sample-Power software. The sample sizes computed are those required to obtain a priori power of at least .80 when the significance level, α, is set at .05 and the a priori effect size is “medium” (that is, 0.25).
This power analysis is for a one-way fixed effects analysis of covariance with 2 levels. The study will include 64 cases per cell for a total of 128 cases. The study will include a set of 1 covariate which account for .0% of the variance in the dependent variable.
The criterion for significance (alpha) has been set at 0.05. The analysis of variance is non-directional (i.e., two-tailed) which means that an effect in either direction will be interpreted.
As to the main effects, the factor “department” includes only 2 levels. With a projected .80 response rate, it was anticipated surveying at least 64 students from the mathematics department and at least 64 students from the psychology department. With the adjusted medium effect size (f) of .25, a sample size of 128 students in total was needed to yield a priori power of at least .80.
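The Sample-Power figures above can be checked from first principles: for a one-way fixed-effects ANOVA, power is the upper-tail probability of a noncentral F distribution with noncentrality parameter λ = f²N. The sketch below is our own check using SciPy, not the Sample-Power software:

```python
from scipy.stats import f as f_dist, ncf

def anova_power(f_effect: float, n_total: int, k_groups: int,
                alpha: float = 0.05) -> float:
    """A priori power of a one-way fixed-effects ANOVA, given Cohen's f."""
    dfn, dfd = k_groups - 1, n_total - k_groups
    lam = f_effect ** 2 * n_total              # noncentrality parameter
    f_crit = f_dist.ppf(1 - alpha, dfn, dfd)   # critical F at the alpha level
    return 1 - ncf.cdf(f_crit, dfn, dfd, lam)

# Medium effect (f = .25), 2 groups, 128 cases: power comes out near .80
print(round(anova_power(0.25, 128, 2), 2))
```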
Estimates of a Priori Power and a Priori Effect Size from the Stevens (2002) Table
Power of the test can be obtained from power tables using effect size, p-value, and sample size (Sharma, 1996, p. 350). Table 4.7 from Stevens (2002, p. 200) provides power values for two-group MANOVA for 2 through 7 variables, with group size varying from small (15) to large (100) and with effect size varying from small (D² = .25) to very large (D² = 2.25).
Based on this table, with 4 dependent variables, the overall effect D² set at the moderate level (D² = .64), and the significance level α at .05, at least 50 participants per group (equal group sizes assumed) are needed to reach a priori power of at least .80.
Our final sample size of 170 students (roughly 85 students per department) is therefore more than adequate to yield power values well above .80.
CHAPTER 5: DATA ANALYSIS
Results
The primary interests of this study were twofold: (a) how accurate the iteration algorithm is as a process for statistics users who want to construct their own frequency tables of grouped data, and (b) how effective the iteration algorithm is as a teaching and learning method compared to a traditional algorithm for constructing a grouped-data frequency distribution.
Since the outcomes of the first primary interest were already presented in chapter
3, this chapter offers only the results of the second primary interest, which was the focus of this study. These findings are presented below.
To assess the effectiveness of the iteration algorithm as an instructional device, a seven-point Likert-type instrument was developed. This instrument, called the Survey of
Attitudes towards the Method Used (SAMU), is a slightly revised version of the well-known Survey of Attitudes toward Statistics (SATS) instrument (Schau, Stevens, Dauphinee, & Del Vecchio, 1995). Like the original SATS instrument, the SAMU is a self-report instrument with 20 items that specifically measure 4 facets of attitudes, not towards statistics, as is the case with the SATS measure, but towards the instructional method taught. As explained by the authors of the original SATS instrument, a panel of instructors and introductory statistics students identified by consensus 4 facets of attitudes towards statistics. The 4 facets involved are: Affect, Cognitive Competence,
Value, and Difficulty. Each of these facets was already defined in chapter 4 under the instrument development section.
For comparison purposes, the 4 facets were retained as dependent variables in the current data analysis. In addition, there was only one version of the SAMU questionnaire that was administered to both groups of students who were taught either the iteration algorithm or the traditional method.
Data Analysis
The analyses of the SAMU instrument, which the remaining 3 research questions address, were produced by running MANCOVA using the SPSS MANOVA program.
Multivariate and univariate analyses were performed only after checking for possible violations of assumptions. Before that, the reliability analyses rooted in Cronbach’s coefficient α results for the 4 facets for both instructional methods were considered.
Because the Cronbach’s coefficient α results obtained in the pilot studies were considerably higher than those reported in the current chapter, exploratory factor analyses were performed to analyze interrelationships among the variables and to explain these variables in terms of their common underlying dimensions (factors). Specifically, a principal components analysis was used to assess dimensionality.
Survey of Attitudes towards the Method Used
Negatively worded items. The SAMU instrument measures the effects of the instructional method taught (either the iteration or traditional method) on students’ attitudes and perceptions concerning the construction of grouped-data frequency tables.
For all 20 items, a seven-point Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree) was used. Negatively worded items were reverse coded on each facet, such that a high score always represents a more positive attitude. As can be seen in Appendix F, all negatively worded items are followed by an upper-case N in parentheses.
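On a 1-to-7 scale, reverse coding simply maps a response s to 8 - s, so that 1 becomes 7, 4 stays 4, and 7 becomes 1. A minimal sketch of the rule (the function name is ours):

```python
def reverse_code(score: int, scale_max: int = 7) -> int:
    """Reverse-code a Likert response so high values mean a more positive attitude."""
    return scale_max + 1 - score
```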
Reliability Analyses. As can be seen in Table 17, the Cronbach’s Alpha coefficients based on the 20 overall items were .76 and .57 for the iteration algorithm and the traditional method, respectively (N = 170). Both coefficients are much lower than those found in the pilot studies. The pilot studies used graduate students in the College of
Education, whereas the final study used undergraduate students in the Mathematics and Psychology departments.
There were 4 domains of interest in this particular investigation. Attitudes towards the method taught were assessed by examining the dimensions of Affect, Cognitive
Competence, Value, and Difficulty, because the same components were used in the original SATS instrument. Cronbach’s Alpha coefficient was computed for each subscale and can be seen in Table 17.
Table 17
Reliability Statistics for the Iteration Algorithm and the Traditional Method
Dependent Variable        Iteration Algorithm          Traditional Method
                          Cronbach's Alpha   Items     Cronbach's Alpha   Items
Attitude Subscale
  Affect                  .730               5         .450               5
  Cognitive               .209               5         .297               5
  Value                   .285               5         .120               5
  Difficulty              .723               5         .666               5
  Overall                 .759               20        .567               20
Factor analyses. An exploratory factor analysis was carried out. The method of initial factor extraction was principal components analysis, a method that provides a unique solution, so that the original data can be reconstructed from the results. It considers the total variance among the variables, so the solution generated will include as many factors as there are variables, although it is unlikely that all of them will meet the criteria for retention. One common rule of thumb for determining the number of factors to extract is the “eigenvalue greater than 1” criterion (Kaiser’s rule): in essence, unless a factor extracts at least as much variance as the equivalent of one original variable, it should be dropped. Although this criterion is probably the most widely used, it is not the best way to determine the number of factors to retain in an exploratory factor analysis.
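In code, Kaiser’s rule amounts to counting how many eigenvalues of the inter-item correlation matrix exceed 1. The sketch below is our own illustration on simulated two-cluster data, not the study’s SPSS run:

```python
import numpy as np

def kaiser_retained(scores: np.ndarray) -> int:
    """Number of principal components whose correlation-matrix eigenvalue exceeds 1."""
    corr = np.corrcoef(scores, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)
    return int((eigvals > 1.0).sum())

# Simulated responses with two item clusters: two components should be retained
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 500))
noise = 0.3 * rng.normal(size=(4, 500))
items = np.column_stack([f1 + noise[0], f1 + noise[1], f2 + noise[2], f2 + noise[3]])
print(kaiser_retained(items))  # -> 2
```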
In this study, the principal component factor analysis was run imposing 4 factors, to see whether the items load correctly on the 4 factors identified by the original scale authors. The initial eigenvalues were obtained for only these 4 retained factors. The total variance and percent of variance explained by each factor are displayed in Table 18. The greater weight of the first factor and the similar relevance of the remaining factors are visible.
Table 18
Partial Outputs of the Initial Factor Analysis Summary (Principal Component Analysis)
                 Iteration Algorithm                    Traditional Method
                 Initial Eigenvalues                    Initial Eigenvalues
Factor   Total    % of Variance   Cumulative %   Total    % of Variance   Cumulative %
1        4.378    21.891          21.891         3.291    16.455          16.455
2        2.733    13.663          35.554         3.042    15.208          31.663
3        2.142    10.709          46.263         2.611    13.053          44.716
4        1.612    8.059           54.322         1.612    8.061           52.777
5        1.531    7.657           61.979         1.501    7.503           60.280
To facilitate the interpretation of the retained factors, the axes were rotated using the Varimax method. Varimax rotation is an orthogonal rotation of the factor axes that maximizes the variance of the squared loadings of a factor (column) across all the variables (rows) in a factor matrix, which has the effect of differentiating the original variables by extracted factor. Each factor will tend to have either large or small loadings for any particular variable, so a Varimax solution makes it as easy as possible to identify each variable with a single factor. This is the most common rotation option.
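Varimax rotation as described above can be sketched numerically. The function below is a generic Kaiser-style varimax written with numpy (a didactic sketch, not the statistical package routine used in this study), applied to a hypothetical 4-variable, 2-factor loading matrix. Because the rotation is orthogonal, each variable’s communality (row sum of squared loadings) is unchanged:

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a (variables x factors) loading matrix to
    maximize the variance of squared loadings per factor (Kaiser-style)."""
    L = np.asarray(loadings, dtype=float)
    n, k = L.shape
    R = np.eye(k)          # accumulated orthogonal rotation
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Gradient of the varimax criterion, projected back onto the
        # orthogonal group via the singular value decomposition.
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / n))
        R = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):
            break
        d = d_new
    return L @ R

# Hypothetical unrotated loadings (illustrative only, not study data).
A = np.array([[0.7, 0.3], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]])
B = varimax(A)
# Orthogonal rotation preserves each variable's communality.
print(np.allclose((A ** 2).sum(axis=1), (B ** 2).sum(axis=1)))  # → True
```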
The rotated factorial scores are included in Table 38 and Table 39 shown in
Appendix M. The factors are interpreted below with a name to describe the component to which the item belongs.
First factor: Affective and Difficulty components. For the iteration algorithm, this factor explains 21.89% of the total variance and includes items in the Affect and Difficulty components only. The two Affect items that contributed the most to the first factor (r = .98) are: I like constructing grouped-data frequency tables by the method presented, and I am scared by the algorithm brought to light in class. The Difficulty item that contributed the most to the first factor (r = .98) is: the algorithm of the instructional method taught is complicated.
Affective and Cognitive components. For the traditional method, this factor explains 16.46% of the total variance and includes items in the Affect and Cognitive Competence components. The two Affect items that contributed the most to the first factor (r = .98) are: I feel insecure when creating grouped-data frequency tables by the method presented, and I am under stress using the method presented in class. The Cognitive item that contributed the most to this first factor (r = .98) is: I make a lot of computational errors when applying the algorithm taught in class. In the study sample data these two components (Affect and Cognitive) were positively and significantly related, but their correlation was moderate.
The very strong weight found in the Difficulty item suggests that the students considered the iteration algorithm to be relatively difficult regardless of their perceived ability in statistics and their knowledge in constructing grouped-data frequency tables.
These findings also suggest the extent to which the students’ affect towards the instructional method taught might be conditioned by their understanding of the topic.
Positive small correlations in the other components (or factors) imply that attitudes in general tend to improve a little with increased knowledge.
Second factor: Difficulty component. For the iteration algorithm, this factor explains 13.66% of the total variance and includes items in the Difficulty component only. The 3 Difficulty items that contributed more than any other items to the second factor (r = .98) are: the algorithm’s formulas are easy to use; learning the algorithm as shown requires a great deal of discipline; and the algorithm exposed can be quickly learned by most people.
Difficulty component. For the traditional method, this factor explains 15.21% of the total variance and includes again items in the Difficulty component only. Difficulty items that contributed the most to this second factor (r = .98) are the same as those listed above under the iteration algorithm section.
The difficulty level observed might come from the fact that many students associate the construction of grouped-data frequency tables with the technical mathematical features of the formulas contained in the algorithms. The positive attitudes of the majority of them could have helped to reduce this difficulty level.
Third factor: Affective and Cognitive components. For the iteration algorithm, this factor explains 10.71% of the total variance and includes items in the affective and cognitive competence components. The only Affect item that contributed the most to this third factor (r = .89) is: I am under stress using the method presented in class. The only
Cognitive item that contributed the second most to this third factor (r = .88) is: I make a lot of computational errors when applying the algorithm taught in class.
Affective and Difficulty components. For the traditional method, this factor explains 13.05% of the total variance and includes items in the Affect and Difficulty components. The only Affect item that contributed the most to this third factor (r = .91) is: I like constructing grouped-data frequency tables by the method presented. The only
Difficulty item that contributed the second most to this third factor (r = .89) is: the algorithm of the method presented is complicated.
In the study sample data the Affect and Cognitive components are positively and significantly correlated with each other, as are the Affect and Difficulty components. The other two facets, Cognitive and Difficulty, are not significantly correlated with each other. The interpretation of these results is similar to that made under the first factor. The fourth factor explains less than 10% of the total variance observed. In light of these results, the SAMU items seem to load correctly on only 3 of the 4 factors identified by the original SATS scale authors.
MANCOVA Results
A 2 x 2 multivariate analysis of covariance test was conducted to determine how effective the iteration algorithm is as a teaching and learning method compared to a traditional algorithm when constructing a grouped-data frequency distribution. In the following sections, each part of the analyses will refer to a specific table.
Research question 2: Is there an overall interaction between the academic department and the instructional method used, adjusting for differences in the students’ perceived ability in statistics among groups?
Research question 3: Are there differences in the mean attitudes scored by students as a function of the academic department, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics?
Research question 4: Are there differences in the mean attitudes scored by students as a function of the instructional method used, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ perceived ability in statistics?
Means and Standard Deviations. Means and standard deviations of the Survey of Attitudes toward the Method Used (SAMU) scale scores on both instructional methods taught to mathematics and psychology students are displayed in Table 19. The Expected
Course Grade (ECG) scores, for students who were taught the iteration algorithm, were on average higher for mathematics students (M = 9.77, SD = .72) than for psychology students (M = 8.91, SD = 1.54). Likewise, the Perceived Mathematics Ability (PMA) scores, for students who were taught the iteration algorithm, were on average higher for mathematics department students (M = 5.84, SD = 1.11) than for psychology department students (M = 5.18, SD = 1.06).
For those who were instructed with the traditional method, ECG scores were likewise higher for mathematics students (M = 8.20, SD = 1.72) than for psychology students (M = 7.69, SD = 1.81). Their PMA scores were again higher for mathematics students (M = 4.98, SD = .96) than for psychology students (M = 4.88, SD = 1.04). On every single dependent variable, mathematics students achieved higher average scores on the iteration algorithm than did their psychology counterparts. Except for the Value
facet (M = 27.46 for mathematics versus M = 27.67 for psychology), mathematics students scored, on average, higher on the traditional method than did psychology students.
Table 19
Means and Standard Deviations* of the Survey of Attitudes toward the Method Used

                              Instructional Method Taught
Dependent Variables     Iteration Algorithm          Traditional Method
and Covariate        Mathematics   Psychology     Mathematics   Psychology
                     (n = 43)      (n = 44)       (n = 41)      (n = 42)
Affect 29.81 (2.99) 25.80 (4.06) 26.76 (1.93) 25.52 (2.78)
Cognitive 29.37 (2.00) 28.86 (2.04) 29.24 (1.83) 28.88 (1.98)
Value 28.51 (2.44) 28.05 (2.30) 27.46 (2.30) 27.67 (2.18)
Difficulty 28.74 (3.09) 26.05 (4.08) 24.51 (3.09) 23.95 (3.28)
ECG 9.77 (.72) 8.91 (1.54) 8.20 (1.72) 7.69 (1.81)
PMA 5.84 (1.11) 5.18 (1.06) 4.98 (.96) 4.88 (1.04)
Note: * Standard deviations are reported within parentheses.
Intercorrelations. Pearson product-moment coefficients of the SAMU total and subscale scores are presented in Table 20 through Table 23. As suggested by Henson (2000) and Thompson and Vacha-Haase (2000), internal consistency reliability coefficients for the total (that is, the total score obtained on the 20 items) and the 4 subscales of the SAMU were reported for the current sample. The choice of these subscales instead of just one has already been explained at the beginning of this chapter.
Iteration Algorithm and Mathematics Department. For the iteration algorithm and mathematics department, inter-correlations among the 4 subscale scores, with standard deviations of the SAMU scores on the principal diagonal, are displayed in Table 20. The highest inter-correlation observed was between Total and Difficulty (r = .81, p < .01, n = 43), and the lowest was between Value and Cognitive (r = -.03, p > .01, n = 43), which were negatively and not significantly related.
Total is least related to Cognitive (r = .49, p < .01, n = 43), but their correlation coefficient is still statistically significant.
ECG scores for mathematics students who were taught the iteration algorithm ranged from 4 to 10 with a mean of 9.77 (SD = .72). The ECG variable is most related to
Difficulty (r = .61, p < .01, n = 43) and had the smallest and insignificant correlation coefficient with Value (r = .25, p > .01, n = 43).
Perceived Mathematics Ability (measured by the item ‘How good at mathematics are you?’) scores for mathematics students who were taught the iteration algorithm ranged from 1 to 7 with a mean of 5.84 (SD = 1.11). The PMA variable is most and significantly associated with Total (r = .57, p < .01, n = 43), then followed successively by Value (r = .46, p < .01, n = 43), Affect (r = .44, p < .01, n = 43), and ECG (r = .43, p < .01, n = 43). The PMA variable had the smallest and insignificant correlation coefficient with the Cognitive facet (r = .03, p > .01, n = 43).
Table 20
Inter-correlations and Standard Deviations* of the Survey of Attitudes toward the
Method Used Scores (Iteration algorithm - Mathematics Department)
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 2.99
Cognitive .29+ 2.00
Value .30+ -.03 2.44
Difficulty .40+ .24 .22 3.09
ECG .47+ .29+ .25 .61+ .72
PMA .44+ .03 .46+ .34+ .43+ 1.11
Total .747+ .491+ .494+ .809+ .696+ .565+ 9.834
Notes: * Standard deviations appear on the principal diagonal.
+ p < .01 for two-tailed test.
Iteration Algorithm and Psychology Department. For the iteration algorithm and psychology department, inter-correlations among the 4 subscale scores and standard deviations of the SAMU scores shown on the principal diagonal are presented in Table
21. The highest and statistically significant inter-correlation observed was between Total and ECG (r = .87, p < .01, n = 44) and the lowest inter-correlation observed was between Difficulty and Value (r = .00, p > .01, n = 44). Total is again least related to Cognitive (r = .43, p < .01, n = 44), but their correlation coefficient is still statistically significant.
ECG scores for psychology students who were taught the iteration algorithm ranged from 4 to 10 with a mean of 8.91 (SD = 1.54). The ECG variable is highly and most significantly related to Affect (r = 0.67, p < .01, n = 44) and had the least and still statistically significant correlation coefficient with Cognitive (r = 0.32, p < .05, n = 44).
PMA scores for psychology students who were taught the iteration algorithm ranged from 1 to 7 with a mean of 5.18 (SD = 1.06). The PMA variable is highly and most significantly associated with Value (r = .61, p < .01, n = 44), then followed in
decreasing order by Affect (r = .44, p < .01, n = 44), Total (r = .43, p < .01, n = 44), and
ECG (r = .40, p < .01, n = 44). The PMA variable had the smallest and insignificant correlation coefficient with Difficulty (r = .03, p > .01, n = 44).
Table 21
Inter-correlations and Standard Deviations of the Survey of Attitudes toward the Method
Used Scores (Iteration Algorithm - Psychology Department)
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 4.06
Cognitive .24* 2.04
Value .27* .20 2.30
Difficulty .53* .03 .00 4.08
ECG .67* .32* .35* .64* 1.54
PMA .44* .10 .61* .03 .40* 1.06
Total .840* .428* .442* .741* .868* .425* 12.018
Notes: Standard deviations appear on the principal diagonal. * p < .01 for two-tailed test.
Traditional Method and Mathematics Department. For the traditional method and mathematics department, inter-correlations among the 4 subscale scores, with standard deviations of the SAMU scores on the principal diagonal, are presented in Table 22. The highest and statistically significant inter-correlation was between Total and ECG (r = .85, p < .01, n = 41) and the lowest and also insignificant inter-correlation observed was between PMA and Cognitive (r = .02, p > .01, n = 41). Total had the lowest and also insignificant correlation coefficient with PMA (r = .17, p > .01, n = 41). ECG scores for mathematics students who were taught the traditional method ranged from 4 to 10 with a mean of 8.20 (SD = 1.72). The ECG variable is highly and most significantly related to Difficulty (r = .54, p < .01, n = 41) and is least and not significantly associated with the Value facet (r = .10, p > .01, n = 41).
PMA scores for mathematics students who were taught the traditional method ranged from 1 to 7 with a mean of 4.98 (SD = .96). The PMA variable is most, but not significantly, associated with ECG (r = .28, p > .01, n = 41), followed in decreasing order by Value (r = .19, p > .01, n = 41) and Total (r = .17, p > .01, n = 41). The PMA variable has a negative and also insignificant correlation first with Affect (r = -.12, p > .01, n = 41) and second with Difficulty (r = -.14, p > .01, n = 41).
Table 22
Inter-correlations and Standard Deviations of the Survey of Attitudes toward the Method Used Scores (Traditional Method - Mathematics Department)
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 1.93
Cognitive .41+ 1.83
Value .08 .11 2.30
Difficulty .31+ .26 .12 3.09
ECG .49+ .43+ .10 .54+ 1.72
PMA -.12 .02 .19 -.14 .28 .96
Total .594+ .598+ .387+ .741+ .848+ .165 8.652
Note: * Standard deviations appear on the principal diagonal.
+p < .01 for two-tailed test.
Traditional Method and Psychology Department. As to the traditional method and psychology department, inter-correlations among the subscale scores, with standard deviations of the SAMU on the principal diagonal, are presented in Table 23. The highest and statistically significant inter-correlation was observed between Total and ECG (r = .91, p < .01, n = 42) and the lowest and also insignificant inter-correlation (in absolute value) was between Difficulty and Cognitive (r = .02, p > .01, n = 42). Total is least related to Value (r = .38, p < .05, n = 42).
ECG scores for psychology students who were taught the traditional method ranged from 4 to 10 with a mean of 7.69 (SD = 1.81). The ECG variable had the highest and also most statistically significant correlation coefficient with Affect (r = .54, p < .01, n = 42) and had the lowest and also non significant coefficient of correlation with Value
(r = .22, p > .01, n = 42).
PMA scores for psychology department students who were taught the traditional method ranged from 1 to 7 with a mean of 4.88 (SD = 1.04). The PMA variable is highly and most significantly associated with Value (r = .62, p < .01, n = 42), then followed in decreasing order by ECG (r = .50, p < .01, n = 42) and Total (r = .42, p < .01, n = 42).
The PMA variable is negatively and not significantly related to Cognitive (r = -.10, p > .01, n = 42). The PMA variable had the smallest and also insignificant coefficient of correlation with Difficulty (r = .08, p > .01, n = 42).
Table 23
Inter-correlations and Standard Deviations* of the Survey of Attitudes toward the
Method Used Scores (Traditional Method - Psychology Department)
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 2.78
Cognitive .50+ 1.98
Value -.12 .13 2.18
Difficulty -.03 .02 .07 3.28
ECG .54+ .41+ .22 .40+ 1.81
PMA -.09 -.10 .62+ .08 .50+ 1.04
Total .572+ .556+ .378+ .545+ .912+ .419+ 8.966
Note: * Standard deviations appear on the principal diagonal.
+p < .01 for two-tailed test.
Running MANCOVA Analyses. As a way to assess the effectiveness of the iteration algorithm as a teaching and learning method, the current study investigated differences between attitude scores based on students’ academic department and the instructional method taught. Using the attitude scores of 170 Ohio University undergraduate students, a 2 × 2 between-subjects factorial multivariate analysis of covariance (MANCOVA) was conducted on the 4 dependent variables (Affect, Cognitive, Value, and Difficulty) that were hypothesized to be 4 dimensions of attitudes towards the instructional method taught. The two independent variables were the instructional method taught (iteration algorithm or traditional method) and the departmental section (mathematics department or psychology department). In addition, the ECG variable was initially entered as the covariate to control for important prior differences between groups. Since Schau et al. (1995) used it in their article containing the original SATS instrument, this study kept the same significance level of .01.
The SPSS syntax for the multivariate analysis of covariance was run using the MANOVA program. Type III sums of squares were used. Stevens (1996, p. 332) affirmed that “the covariate is not appropriate for the data collected if there is a covariate by group interaction”. This assumption, among many others, was examined.
ECG Covariate by Group Interaction Effect. The multivariate test shows a statistically significant F test value: there is an ECG covariate by Method by Department interaction effect (F(12, 420.97) = .82, p < .01). As a result, the ECG variable might not be a satisfactory covariate for the data collected. Another variable measuring the students’ perceived ability in statistics, PMA, was retained to serve as the covariate. The following assumptions were tested for this new (PMA) covariate in the analysis.
Assumptions when Using MANCOVA
Before using the multivariate analyses, data were screened for the main assumptions of parametric statistics (i.e., independence, normality, homogeneity of variance-covariance matrices, homogeneity of regression slopes, independent groups, and linear relationships between the dependent variables and the new PMA covariate).
Independence of the Observations Assumption. This assumption is met in that it is assumed that each student who participated in the SAMU survey filled out his or her questionnaire alone and independently of any other student.
Assumption of Multivariate Normality of the Dependent Variables within Each
Group. This assumption was examined first through visual inspection of scatter plots of data for pairs of dependent variables within each of the two departmental class sections. Altogether 24 scatter diagrams of data for pairs of dependent variables were plotted. A sample set of 8 of the 24 scatter plots is shown in Appendix L. The graphs were judged to be normal because each scatter diagram for every pair of variables was roughly oval-shaped. Second, for each dependent variable the Kolmogorov-Smirnov
and Shapiro-Wilk tests were run simultaneously. The results show that the distributions of both instructional methods’ scores for both departments, as well as the distributions of scores for each of the 4 dependent variables, were all normal (p > .01). Tabachnick and
Fidell (2001) asserted that “a sample size of at least 20 in each cell should ensure
‘robustness’ of this normality assumption” (p. 329). In the present study, each cell had at least 41 scores.
Tabachnick and Fidell also asserted that “homoscedasticity is related to the assumption of normality because when the assumption of multivariate normality is met, the relationships between variables are homoscedastic. The bivariate scatter plots between two variables are of roughly the same width all over with some bulging toward the middle” (p. 79). Since the assumption of multivariate normality is tenable, the homoscedasticity assumption is checked next to see whether this latter assumption is also satisfied.
Homogeneity of variance-covariance matrices assumption with the PMA covariate. The multivariate test for homogeneity of dispersion matrices, Box M test, evaluates whether the variances and covariances among the dependent variables are the same for all levels of a factor. If the probability value associated with the Box’s M test is less than .01, then the variance-covariance matrices are said to be significantly different.
Based on the multivariate test for homogeneity of dispersion matrices’ results, it can be established that the assumption of homogeneity of variance-covariance matrices is tenable at the .01 significance level in that the observed probability value associated with the F test was not statistically significant (Box’s M = 50.11, F (30, 75374) = 1.60,
p > .01). A closer investigation of the standard deviations for each instructional method taught (Table 19) revealed that cells with larger sample sizes, except for Value, produced larger standard deviations between the two academic departments. Stevens
(2002) asserted that this assumption is robust if group sizes are equal or approximately equal and the largest SD over the smallest SD is less than 1.5. Within each instructional method taught, the ratio of the larger standard deviation over the smaller standard deviation between the two departments is, except for ECG (iteration algorithm), less than 1.5.
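Stevens’ 1.5 rule just cited can be checked directly against the Table 19 standard deviations. A small Python sketch for the iteration-algorithm columns (the dictionary labels are merely illustrative tags for the Table 19 rows):

```python
# Standard deviations for the iteration algorithm from Table 19:
# (mathematics department, psychology department) per variable.
sds = {
    "Affect":     (2.99, 4.06),
    "Cognitive":  (2.00, 2.04),
    "Value":      (2.44, 2.30),
    "Difficulty": (3.09, 4.08),
    "ECG":        (0.72, 1.54),
    "PMA":        (1.11, 1.06),
}

# Ratio of the larger to the smaller SD between the two departments.
ratios = {name: max(pair) / min(pair) for name, pair in sds.items()}
flagged = [name for name, r in ratios.items() if r >= 1.5]
print(flagged)   # → ['ECG']  (only ECG exceeds Stevens' 1.5 guideline)
```

The output matches the text above: only the ECG covariate exceeds the 1.5 ratio under the iteration algorithm.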
Homogeneity of Regression Slopes Assumption. The multivariate test, at the bottom of Table 24, shows that there is not a PMA covariate by group interaction effect
(F (12, 420.97) = 2.01, p > .01). Because of this insignificant F value, the hypothesis of homogeneity of regression slopes at the .01 significance level could not be rejected. The results, therefore, indicated that the assumption of parallelism of the regression planes was tenable for this new covariate (that is, the PMA covariate).
Independent Groups. The group of students who were taught the iteration algorithm and the group of other students who received the traditional method were independently enrolled in each of their respective departmental class sections. This assumption is met in that all groups were made up of unique individuals (i.e., no student belonged to more than one departmental class section).
Table 24
Selected Output for Multivariate Analysis of Covariance Test* Run with PMA
Effect                      Value     F      Hypothesized D.F.   Error D.F.   Sig. of F   Eta Squared
Within + Residual           .74     14.23        4.00            162.00        .000         .26
Method x Department         .95      2.08        4.00            162.00        .086         .05
Department .85 7.12 4.00 162.00 .000 .15
Method .84 7.84 4.00 162.00 .000 .16
PMA .54 33.65 4.00 159.00 .000 .46
PMA x Method x Department   .86      2.01       12.00            420.97        .022         .14
Note: * The Test Name is Wilks’ Lambda.
Linear Relationships between the Dependent Variables and the Covariate Assumption. A covariate is used when it may have an impact on the dependent variables. Within each group, this assumption was inspected by plotting a scatter diagram between each of the 4 dependent variables and the covariate. Since all scatter diagrams were roughly oval-shaped, it was concluded that both variables (each dependent variable and the covariate) were normally distributed and linearly related (Tabachnick & Fidell, 2001).
A correlation analysis was also performed. As anticipated, statistically significant correlation results were found, on one hand, between each dependent variable and the covariate and, on the other hand, between each two dependent variables.
Pearson Product-moment Correlation Coefficients. Correlation analysis does not test linearity; it assumes linearity. Based on the results of the correlation analysis displayed in Table 25, 17 of the 21 correlation coefficients listed were statistically significant at the .01 level. Hence, it was assumed that overall the PMA variable and the dependent variables were linearly and positively associated.
Table 25
Correlations between the Dependent Variables and PMA+
Variables Affect Cognitive Value Difficulty ECG PMA Total
Affect 1
Cognitive .336* 1
(.000)
Value .197* .105 1
(.010) (.172)
Difficulty .466* .140 .162 1
(.000) (.069) (.035)
ECG .586* .344* .258* .619* 1
(.000) (.000) (.001) (.000)
PMA .344* .037 .496* .232* .465* 1
(.000) (.629) (.000) (.000) (.000)
Total .773* .451* .436* .788* .843* .509* 1
(.000) (.000) (.000) (.000) (.000) (.000)
Notes: * p < .01 for two-tailed test; + probability values appear within parentheses.
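All of the coefficients in Tables 20 through 25 follow the same Pearson product-moment computation. A minimal pure-Python sketch with hypothetical scores (the data below are illustrative, not study data):

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for two subscales (illustrative only).
affect = [1, 2, 3, 4]
ecg    = [1, 3, 2, 4]
print(round(pearson_r(affect, ecg), 2))   # → 0.8
```

A statistical package additionally reports the two-tailed probability value for each coefficient, as in Table 25.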
With all major multivariate assumptions met, the results of the assumption tests permit use of multivariate analysis of covariance for this study. The PMA variable was consequently retained as the new covariate.
Analyzing and Interpreting the MANCOVA Results
The multivariate null hypothesis tested in covariance is that the adjusted population-mean vectors are equal. Due to an overall significant F value, the multivariate test at the top of Table 24 shows that the multivariate null hypothesis had to be rejected at the .01 significance level, F (4, 162) = 14.23, p < .01. Hence one can conclude that the groups differed significantly on the set of adjusted means.
Because of this statistically significant multivariate analysis of covariance result, but with a modest association between the combined 4 dependent variables and the covariate (multivariate eta squared η2 = .26), the next step was to examine each of the two main effects to determine where the significance arises. After adjusting for differences in the PMA, there was a significant main effect for department, F (4, 162) = 7.12, p < .01.
Association between the combined 4 dependent variables and the main effect of department was modest (multivariate eta squared η2 = .15).
Similarly, after adjusting for differences in the PMA, a significant main effect for method was found, F (4, 162) = 7.84, p < .001. The association between the main effect of method and the combined dependent variables was modest as well (multivariate eta squared η2 = .16).
These results show that both independent variables are contributing to the overall multivariate significance. Analyses did not detect a significant method by department interaction effect, F (4, 162) = 2.08, p > .01, which yielded a substantially small effect size (multivariate eta squared η2 = .05). Note that the values F and Λ (lambda) are taken from Table 24 and the multivariate eta squared η2 is calculated according to the formula:
η2 = 1 - Wilks’ Λ.
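The η² = 1 − Wilks’ Λ conversion just described can be verified against Table 24 in a few lines of Python:

```python
# Multivariate eta squared from Wilks' lambda: eta^2 = 1 - lambda.
# The lambda values below are taken from Table 24.
wilks = {
    "Within + Residual":   0.74,
    "Method x Department": 0.95,
    "Department":          0.85,
    "Method":              0.84,
    "PMA":                 0.54,
}

eta_squared = {effect: round(1 - lam, 2) for effect, lam in wilks.items()}
print(eta_squared["PMA"])         # → 0.46
print(eta_squared["Department"])  # → 0.15
```

Each computed value matches the Eta Squared column of Table 24.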
In contrast, the PMA covariate yielded a statistically significant F value, F (4,
159) = 33.65, p < .01. This PMA covariate produced a moderate effect size (multivariate eta squared η2 = .46).
The PMA covariate scores, for students who were taught the iteration algorithm, were on average greater for mathematics department students (M = 5.84, SD = 1.11) than for psychology department students (M = 5.18, SD = 1.06). For those who were
instructed with the traditional method, their PMA scores were higher as well for mathematics students (M = 4.98, SD = 0.96) than for their counterpart psychology students (M = 4.88, SD = 1.04).
The adjusted means for the two independent variables are displayed in Table 26 through Table 28. For the independent variable Method (Table 26), the students’ scores were higher on all dependent variables for the iteration algorithm than for the traditional method (e.g., M(Adj) = 27.61 versus M(Adj) = 26.34 for Affect).
Table 26
Adjusted Means of the Survey of Attitudes toward the Method Used Instrument for
Method
Instructional Method Received
Dependent Variable Iteration Algorithm Traditional Method
(n = 87) (n = 83)
Affect 27.61 26.34
Cognitive 29.11 29.07
Value 27.97 27.87
Difficulty 27.31 24.31
Table 27 presents the adjusted means of the Survey of Attitudes toward the Method Used Scale for Department. This time, the students’ scores were higher on all dependent variables for the mathematics department than for the psychology department, except for Value (M(Adj) = 27.79 versus M(Adj) = 28.05).
Table 27
Adjusted Means of the Survey of Attitudes toward the Method Used Instrument for
Department
Academic Department
Dependent Variable Mathematics Psychology
(n = 84) (n = 86)
Affect 28.16 25.79
Cognitive 29.30 28.88
Value 27.79 28.05
Difficulty 26.58 25.05
Table 28 presents the adjusted means of the SAMU instrument for Method by Department. Both mathematics and psychology department students exhibited higher attitude scores towards the iteration algorithm than towards the traditional method, except for Cognitive in psychology (M(Adj) = 28.86 versus M(Adj) = 28.89).
Table 28
Adjusted Means of the Survey of Attitudes toward the Method Used Instrument for
Method by Department
Instructional Method Received
Dependent Variable Iteration algorithm Traditional Method
Mathematics Psychology Mathematics Psychology
(n = 43) (n = 44) (n = 41) (n = 42)
Affect 29.40 25.82 26.92 25.75
Cognitive 29.36 28.86 29.25 28.89
Value 27.86 28.08 27.72 28.02
Difficulty 28.57 26.06 24.58 24.05
Assumptions when Using ANCOVA
As recommended by Tabachnick and Fidell (2001) and Stevens (2002), the following 9 main assumptions were checked for conformity.
The Observations are Independent. This assumption is met in that it is assumed that each student who participated in the survey filled out his or her questionnaire alone and independently of any other student.
The Observations are Normally Distributed on the Dependent Variable in Each
Group. Within each group, the Kolmogorov-Smirnov and Shapiro-Wilk tests were simultaneously run for each dependent variable. The 16 probability values associated with the Shapiro-Wilk test of normality were all not significant (p > .01). Only the Shapiro-Wilk test of normality results are reported here because, based on an extensive Monte Carlo study done in 1968 by Shapiro, Wilk, and Chen, Stevens (2002) ascertained that the Shapiro-Wilk test is more powerful than the Kolmogorov-Smirnov test in detecting departures from normality.
These results show that the distributions of both instructional methods’ scores for both departments as well as the distribution of each of the 4 dependent variables were all normal. This assumption is satisfied.
The Population Variances for the Groups are Equal (Homogeneity of Variance Assumption). As shown in Table 29, the probability value associated with Levene’s F test of the equality of error variances was significant for the Affect facet (F (3, 166) = 5.77, p < .01). In contrast, the probability values associated with Levene’s F test were not significant (p > .01) for each of the other 3 facets (Cognitive, Value, and Difficulty).
Table 29
Levene’s Test of Equality of Error Variances
Dependent Variable F df1 df2 Sig.
Affect 5.766 3 166 .001
Cognitive .666 3 166 .574
Value 1.105 3 166 .349
Difficulty 2.639 3 166 .051
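Levene’s test used above is a one-way ANOVA performed on the absolute deviations of each score from its own group mean. A didactic Python sketch of the mean-centered form (statistical packages additionally report the p value from the F distribution; the example groups are hypothetical):

```python
def levene_f(groups):
    """Levene's test statistic: a one-way ANOVA F computed on the
    absolute deviations of each score from its own group mean."""
    # Transform each score into its absolute deviation from the group mean.
    z = []
    for g in groups:
        m = sum(g) / len(g)
        z.append([abs(x - m) for x in g])
    k = len(z)
    n = sum(len(g) for g in z)
    grand = sum(sum(g) for g in z) / n
    means = [sum(g) / len(g) for g in z]
    # One-way ANOVA on the transformed scores.
    ss_between = sum(len(g) * (mg - grand) ** 2 for g, mg in zip(z, means))
    ss_within = sum((x - mg) ** 2 for g, mg in zip(z, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Identical spreads give F = 0; very different spreads give a large F.
same = levene_f([[1, 2, 1, 3], [1, 2, 1, 3]])
diff = levene_f([[1, 2, 1, 3], [-10, 10, -9, 11]])
print(same)        # → 0.0
print(diff > 10)   # → True
```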
Stevens (2002) asserted that this assumption is robust if group sizes are equal or approximately equal (largest/smallest < 1.5). In the present case, the ratio of the largest group size over the smallest group size is 87/83, or 1.05, which is less than 1.5. In view of the above results, it can be inferred that the populations from which the samples were drawn have equal group variances.
The Slope of the Regression Line is the Same in Each Group (Homogeneity of the Regression Slopes Assumption). The univariate test at the bottom of Table 30 indicates a PMA covariate by Method by Department interaction effect for the Affect facet only (F (1, 165) = 5.22, p < .01). Because there were no significant interaction effects between the covariate and each independent variable for the 3 remaining facets, the results suggest that the homogeneity of the regression slopes assumption is largely satisfied.
Outliers. The distribution of each dependent variable was examined for potential outliers by identifying scores that were greater than or equal to |3| standard deviations from their mean. This evaluation of the assumption of absence of outliers is satisfactory as no outliers were detected.
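The outlier screen just described, flagging any score at least 3 standard deviations from its variable's mean, can be sketched as a small helper. The data vector is illustrative; note that with very small samples the sample z-score can never reach 3, so a realistic n is used.

```python
# Sketch of the |z| >= 3 outlier screen described above. Data simulated.
import numpy as np

def find_outliers(scores, threshold=3.0):
    """Return indices of scores at least `threshold` SDs from the mean."""
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores.mean()) / scores.std(ddof=1)
    return np.flatnonzero(np.abs(z) >= threshold)

# 19 ordinary scores around 50, plus one extreme score of 95.
data = np.array([50, 49, 51, 48, 52, 50, 47, 53, 49, 51,
                 50, 48, 52, 49, 51, 50, 47, 52, 48, 95], float)
print(find_outliers(data))  # only the 95.0 (index 19) is flagged
```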
Independent Groups. The group of students who were taught the iteration algorithm and the group of other students who received the traditional method were independently selected. Like for the multivariate case, this assumption is tenable because all groups were made up of unique individuals (i.e., no student belonged to more than one departmental class section).
A Linear Relationship Exists between the Dependent Variables and the
Covariate. This assumption was inspected by plotting a scatter diagram between each of the 4 dependent variables and the covariate. Since each of the 4 scatter diagrams was roughly oval-shaped, it can be concluded that each dependent variable and the covariate are normally distributed and linearly related (Tabachnick & Fidell, 2001).
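As a numeric companion to the visual scatter-diagram inspection, one can also compute Pearson's r between the covariate and each dependent variable; a clearly non-zero r on roughly oval-shaped data is consistent with a linear relationship. The simulated covariate and dependent variable below are stand-ins for the actual PMA and SAMU scores.

```python
# Sketch: numeric linearity check between the PMA covariate and one
# dependent variable, using simulated, linearly related data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pma = rng.normal(5, 1.5, 170)                      # hypothetical covariate
affect = 4.0 + 0.6 * pma + rng.normal(0, 1, 170)   # linearly related DV

r, p = stats.pearsonr(pma, affect)
print(f"r = {r:.2f}, p = {p:.4f}")
```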
The Covariate is Reliable. Even though there is no way to ascertain that the covariate is perfectly reliable, ANCOVA assumes that the covariate is measured without error. According to Tabachnick and Fidell (2001), “with self-report of variables such as attitudes such an assumption is not easily made. The covariate may be reliable at the point of measurement, but fluctuate over short periods” (p. 283).
With all main univariate assumptions reasonably met, the results of the assumption tests permit use of the analysis of covariance for this study. The PMA variable serves as the covariate.
Univariate ANCOVA Follow-up Analyses
The univariate ANCOVA follow-up results appear in Table 30. The overall test results revealed that, after adjusting for differences in the PMA, the attitude mean scores differed significantly on two of the 4 dependent variables: Affect, F (1, 165) = 9.32, p < .01, partial η² = .28; and Value, F (1, 165) = 48.31, p < .01, partial η² = .25. The adjusted mean scores did not differ significantly on Cognitive, F (1, 165) = .03, p > .01, partial η² = .01; or on Difficulty, F (1, 165) = 1.18, p > .01, partial η² = .24.
The method by department interaction effect results suggest that the adjusted mean scores were not significantly different on any dependent variable, except possibly Affect, F (1, 165) = 6.80, p = .01, partial η² = .04. For the independent variable Department, significant adjusted mean score differences were detected only for Affect, F (1, 165) = 26.03, p < .01, partial η² = .14, and Difficulty, F (1, 165) = 8.19, p < .01, partial η² = .05. Similarly, for the independent variable Method, statistically significant adjusted mean score differences were again detected only for Affect, F (1, 165) = 7.17, p < .01, partial η² = .04, and Difficulty, F (1, 165) = 30.36, p < .01, partial η² = .16. Considering the PMA covariate alone, statistically significant adjusted mean score differences were detected for Affect, F (1, 165) = 7.59, p < .01, partial η² = .05, and Value, F (1, 165) = 45.35, p < .01, partial η² = .23. The covariate by method by department interaction effect results indicate that the adjusted mean scores differed significantly only for Affect, F (1, 165) = 5.22, p < .01.
Table 30

Selected Output for Univariate F-tests Run with (1, 165) Degrees of Freedom

Effect                   Subscale     Error SS  Hypothesized MS  Error MS      F   Sig. of F  Partial Eta Squared

Within + Residual        Affect        1466.88       82.83          8.89     9.32     .003          .28
Regression               Cognitive      641.09         .10          3.89      .03     .087          .01
                         Value          685.47      200.71          4.15    48.31     .000          .25
                         Difficulty    1928.43       13.82         11.69     1.18     .278          .24

Method x Department      Affect        1466.88       60.44          8.89     6.80     .010          .04
                         Cognitive      641.09         .18          3.89      .05     .829          .00
                         Value          685.47         .07          4.15      .02     .899          .00
                         Difficulty    1928.43       41.05         11.69     3.51     .061          .02

Department               Affect        1466.88      231.40          8.89    26.03     .000          .14
                         Cognitive      641.09        7.49          3.89     1.93     .167          .01
                         Value          685.47        2.84          4.15      .68     .410          .00
                         Difficulty    1928.43       95.74         11.69     8.19     .005          .05

Method                   Affect        1466.88       63.73          8.89     7.17     .008          .04
                         Cognitive      641.09         .07          3.89     .017     .895          .00
                         Value          685.47         .42          4.15     .100     .752          .00
                         Difficulty    1928.43      354.81         11.69    30.36     .000          .16

Covariate (with          Affect        1337.63       62.63          8.26     7.59     .007          .05
(1, 162) degrees         Cognitive      637.66         .07          3.94      .02     .896          .00
of freedom)              Value          665.45      186.27          4.11    45.35     .000          .23
                         Difficulty    1884.43        8.74         11.63      .75     .387          .01

Covariate x Method       Affect        1337.63       43.08          8.26     5.22     .002
x Department             Cognitive      637.66        1.15          3.94      .29     .832
                         Value          665.45        6.67          4.11     1.62     .186
                         Difficulty    1884.43       14.67         11.63     1.26     .290
In summary, the study has established that:
1. Overall, the grouped-data values produced by the iteration algorithm were much closer to the corresponding statistical measure values computed directly from the raw data than were the grouped-data values yielded by the SPSS algorithm, for each of the 9 statistical measures considered and across varying sample sizes (small, medium, and large). Because no actual statistical test was applied, however, the observed differences in absolute mean deviations could not be declared statistically significant.
Follow-up univariate tests found that two dependent variables, Affect and Value, contributed more than the other dependent variables to the statistically significant overall difference.
2. Significant adjusted mean score differences were found between the attitude scores on the dependent variables, adjusted for the PMA covariate, reported by students who were taught the iteration algorithm and those reported by students who were taught the traditional method. Analyses detected a significant method by department interaction effect at neither the multivariate nor the univariate level.
3. Significant adjusted mean score differences were found between the attitude scores reported by mathematics department students and those reported by psychology department students, after controlling for the PMA. The majority of students expressed very positive attitudes towards either instructional method (iteration or traditional), irrespective of the departmental section they came from. In other terms, the adjusted mean attitude scores reported by mathematics department students were, on average, higher than those reported by their psychology department counterparts.
4. The method variable yielded statistically significant results. The majority of students exhibited very positive attitudes no matter which instructional method was introduced to them, after controlling for the PMA. Put differently, the adjusted mean attitude scores reported by students who were taught the iteration algorithm were higher than those reported by students who were exposed to the traditional method.
Conclusion
The data analysis presented in chapter 5 provided a basis for the conclusions and recommendations that follow in chapter 6. All usable survey data were analyzed, and the resulting information was discussed in narrative form and presented in tables where appropriate. The two main sections of this chapter provided an in-depth look at the Monte Carlo simulation results and the MANCOVA and ANCOVA outcomes. The chapter presented specific findings for 3 of the study's 4 research questions, which will be summarized and discussed in chapter 6.
CHAPTER 6: DISCUSSION, SUMMARY, CONCLUSION, AND
RECOMMENDATIONS
This chapter begins with a brief summary of the study. A review of the research questions as described in preceding chapters is discussed in reference to current and
previous research findings. Next, the implications of the study for students, teachers, and researchers are outlined. Then, limitations of the study are identified along with their possible impact upon the results. The chapter ends with a conclusion and recommendations for further research that emerged from the current study's findings.
Summary of the Results and Findings
The purpose of a grouped-data frequency table is to summarize and organize a large set of data. Presenting data in a grouped-data frequency distribution makes inspection of the data set much more manageable than presenting the entire set of raw data. A grouped-data frequency distribution can be considered as a type of descriptive statistic. Grouped-data frequency distributions can be structured either as a frequency table or a graph; both show basic features that are of interest to a statistician.
As Friel, Curcio, and Bright (2001) insisted “apparently attention to the use of grouped-data frequency tables as transition tools for organizing information to be represented graphically is needed” (p. 126) to promote understanding. Because one can move back and forth between frequency tables and graphs, having both types of information has inherent advantages. Furthermore, “histograms may even be more limited if they are used alone and as substitutes for grouped-data frequency distributions rather than being used as accompaniments” (Raymondo, 1999, p. 67). A combination of histograms and grouped-data frequency distributions is ideal to provide a much clearer picture of the data.
Four elements emerged from the review of the related literature as essential for making frequency tables of grouped data. These 4 key elements are as follows: the inclusive range, the number of classes, the class length or class width, and the starting
point. At the moment, although various procedures exist for creating grouped-data frequency tables, none of them is known to be able to determine all 4 of the above key elements in a guess-free fashion.
Besides the computation of the inclusive range, there is no clearly identifiable consensus about how to determine the other 3 key elements. The way in which statistics users decide about these key elements is very often a matter of artistic taste.
After determining the inclusive range, the second step in designing a grouped-data frequency table is commonly fixing on the number of class intervals. Then the third step is figuring out the interval size or interval width (Aron & Aron, 1997; Wessel & Willett,
1960). Finally, the fourth step is determining the starting point (Groebner & Shannon,
1981; Scott, 1979; Shaw, Huffman, & Haviland, 1987). With contemporary practices, the
“number of class intervals is just arbitrary, and so also is the point at which to begin the lowest class and the point at which to end the last class” (Kerr, Hall, & Kozub, 2002, p.
11).
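The conventional four-step construction described above can be sketched in code. The choice of Sturges' rule for step 2 and "lowest score" for step 4 are common conventions used here only for illustration; they are not the dissertation's iteration algorithm, whose five formulas are not reproduced.

```python
# Sketch of the conventional four-step construction of a grouped-data
# frequency table: inclusive range, number of classes, class width,
# starting point. Sturges' rule is one common (arbitrary) convention.
import math

def grouped_table_elements(data):
    """Return (inclusive_range, n_classes, class_width, start) for integer data."""
    lo, hi = min(data), max(data)
    inclusive_range = hi - lo + 1                         # step 1: inclusive range
    n_classes = math.ceil(1 + math.log2(len(data)))       # step 2: number of classes
    class_width = math.ceil(inclusive_range / n_classes)  # step 3: class width
    start = lo                                            # step 4: starting point
    return inclusive_range, n_classes, class_width, start

scores = [52, 67, 71, 58, 63, 80, 45, 74, 66, 59, 61, 70, 55, 78, 49, 68]
print(grouped_table_elements(scores))  # -> (36, 5, 8, 45)
```

Note how every step after the inclusive range involves a convention that could reasonably be replaced by another, which is precisely the arbitrariness the passage above describes.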
In view of the above concerns, the iteration algorithm was invented as a first step to address this gap in knowledge by presenting a structured method for summarizing and displaying large data sets in a frequency distribution of grouped data. It was conceived chiefly in response to the lack of a clear consensus about which existing formula works best and the lack of clarity in the instructions provided by contemporary practices for creating grouped-data frequency tables.
Most existing methods leave too much room for subjectivity and arbitrariness in that they allow guessing. In contrast, the iteration algorithm utilizes five formulas and works in a systematic step-by-step fashion that leaves no room for guessing by providing
comprehensive, clear directions that are less prone to errors when constructing grouped-data frequency tables.
Once the iteration algorithm was developed, two questions emerged. First, as a process, how accurate is the iteration algorithm? Second, as an instructional method, how effective is the iteration algorithm compared to a traditional method when constructing a grouped-data frequency table? To address the first question, a Monte Carlo study was designed. The results of the Monte Carlo simulations helped to check whether the iteration algorithm yielded, on average, smaller deviations than those produced by the SPSS algorithm, based on 9 statistical measures.
To tackle the second and main focus of this study, the Survey of Attitudes towards Statistics (SATS) devised by Schau, Dauphine, and Stevens (1995) was adopted after being slightly revised and a MANCOVA test was thereafter performed using the students’ Survey of Attitudes towards the Method Used (SAMU) scores on 4 dimensions.
Self-reported attitude scores were used in this study because they have been recognized to have a potential role in influencing the learning process (Harris, 1999; Wooten, 1998).
The results and findings are briefly summarized below.
First, as a process, it was found that the iteration algorithm is relatively accurate.
For each of the 9 statistical measures retained, the grouped data yielded by the iteration algorithm were much closer to the corresponding statistical measure value computed directly from the raw data than were the grouped data produced by the SPSS algorithm.
As noted in chapter 4, more positive differences were observed by subtracting the iteration algorithm results from the corresponding SPSS outcomes. The varying sample sizes (small, medium, large, and overall) did not much alter these findings, although the large sample size seemed to yield slightly better results than the small and medium sample sizes.
Second, as an instructional method, after factoring out differences in the students’
Perceived Mathematics Ability (PMA), it was found that students showed more positive attitudes towards the iteration algorithm than towards the traditional method. The majority of psychology department students expressed positive attitudes towards the instructional method taught and reported positive perceived ability in statistics as measured by the PMA variable. Of the 4 facets studied (Affect, Cognitive, Value, and Difficulty), Affect and Value seemed to contribute more to the overall significant results than did the other two dependent variables; Cognitive contributed the least, if at all.
Discussion of the Results and Findings
This section provides a quick overview of the results as they answer the research questions set forth in chapter 1. References to current and previous research findings are made as appropriate.
Research question 1: On average, are the mean of square roots of deviations between the 9 statistical measures computed directly from the raw data and the corresponding 9 statistical measures obtained from the grouped data produced by the iteration algorithm less than those obtained from the grouped data yielded by the SPSS algorithm, for varying sample sizes?
In order to establish, as a process, how relatively accurate the iteration algorithm is, the results of the Monte Carlo study were examined by running and comparing the square roots of deviations between the 9 statistical measures computed directly from the
raw data and the corresponding 9 statistical measures obtained from the grouped data, first using the iteration algorithm and then using the SPSS algorithm. Simple comparisons detected satisfactory differences in favor of the iteration algorithm. Four sample-size conditions were considered (small, medium, large, and overall) across the 9 statistical measures, and the results at each sample-size level and for each measure were again generally satisfactory in support of the relative accuracy of the iteration algorithm.
Since no actual statistical test was performed, however, the differences observed between the two algorithms cannot be claimed to be statistically significant.
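The deviation comparison underlying the Monte Carlo design can be illustrated for a single statistic: compute it from the raw data, recompute it from a grouped-data table (class midpoints weighted by frequencies), and take the square root of the squared deviation. The data and the class scheme below are illustrative only.

```python
# Sketch of one accuracy comparison: raw-data mean vs. grouped-data mean
# approximated from class midpoints and frequencies, and the square root
# of their squared deviation, as in the Monte Carlo design.
import numpy as np

raw = np.array([45, 49, 52, 55, 58, 59, 61, 63,
                66, 67, 68, 70, 71, 74, 78, 80], float)
raw_mean = raw.mean()

# A hypothetical grouping: 5 classes of width 8 starting at 45.
edges = np.arange(45, 45 + 8 * 5 + 1, 8)          # 45, 53, 61, 69, 77, 85
freq, _ = np.histogram(raw, bins=edges)
midpoints = (edges[:-1] + edges[1:]) / 2
grouped_mean = (freq * midpoints).sum() / freq.sum()

deviation = np.sqrt((raw_mean - grouped_mean) ** 2)  # = |raw - grouped|
print(raw_mean, grouped_mean, deviation)             # 63.5 64.0 0.5
```

In the study, such deviations were computed for 9 statistical measures under both the iteration algorithm's grouping and the SPSS grouping, and the two sets of deviations were compared.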
To determine, as an instructional method, how effective the iteration algorithm is compared to a traditional method when constructing a grouped-data frequency table, the SAMU instrument was developed. The SATS devised by Schau,
Dauphine, and Stevens (1995) served as the model. Like the original SATS instrument, the SAMU instrument is broken into 4 components: Affect, Cognitive Competence, Value, and Difficulty.
The Affect component assesses positive and negative feelings about statistics, and thus about the instructional method used. The Cognitive Competence component assesses self-beliefs about ability in statistics, and thereby in the instructional method used. The
Value component assesses beliefs about the usefulness and importance of the instructional method taught. The Difficulty component assesses beliefs about the difficulty of the instructional method used based on beliefs about the nature of statistics.
196
These 4 subscales were also selected for analyses because they represent the students’ attitudes towards statistics that some research has shown to be important for statistics achievement (Elmore et al., 1993; Green, 1994; Roberts & Saxe, 1982; Schau et al., 1993; Wise, 1995). The survey uses a seven-point Likert scale that ranges from 1
(strongly disagree) to 7 (strongly agree). The instrument used in this study, SAMU, attempted to explore the effects of the instructional method used (either the iteration or traditional method) on students' attitudes and perceptions concerning the creation of grouped-data frequency tables. Negatively worded items were reverse coded so that higher scores always indicate more positive attitudes towards the instructional method used.
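The reverse-coding step just mentioned is a simple transformation on a seven-point Likert scale: a score s becomes 8 - s, so agreement with a negatively worded item maps to a low (negative-attitude) score. The item responses below are illustrative.

```python
# Sketch of reverse coding negatively worded items on a 1..7 Likert scale:
# score -> (scale_max + 1) - score, so higher always means more positive.
def reverse_code(score, scale_max=7):
    """Reverse a Likert score on a 1..scale_max scale."""
    return (scale_max + 1) - score

negatively_worded = [1, 2, 7, 4, 6]           # hypothetical raw responses
recoded = [reverse_code(s) for s in negatively_worded]
print(recoded)  # -> [7, 6, 1, 4, 2]
```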
The students' attitude scores on the SAMU instrument were subjected to a multivariate analysis of covariance, with PMA scores entered as a covariate. This PMA covariate replaced the initially planned Expected Course Grade (ECG) covariate, which, if included in the final data analyses, might have biased the MANCOVA test results.
Prior to using the multivariate analysis of covariance, screening procedures for conformity to the multivariate analyses of covariance assumptions were carried out.
Results of evaluation of assumptions of independence of observations, normality, linearity, homogeneity of variance-covariance matrices, among others, were satisfactory.
In particular, the non-violation of the homogeneity-of-slopes assumption indicated that the relationship between the covariate and the dependent variables did not differ significantly as a function of the instructional method used. The PMA covariate was also judged to be adequately reliable for covariance analysis.
197
Research question 2: Is there an overall interaction between the academic department and the instructional method used, adjusting for differences in the student perceived mathematics ability?
Because of the small probability value associated with the F test for Wilks' Lambda on the MANCOVA output, the null hypothesis that the population means on the 4 dependent variables, adjusted for differences in the PMA covariate, are equal across levels of the two factors was rejected at the .01 significance level. Significant differences among the adjusted mean scores for the two groups of students were found. In other terms, there were statistically significant multivariate differences between the attitude scores reported on the SAMU instrument by students who were taught the iteration algorithm and those reported by students who were taught the traditional method, after factoring out the effects of the PMA. A further look at the results revealed that the adjusted mean scores yielded by the iteration algorithm were, on average, greater than those produced by the traditional method.
No statistically significant interaction effect between the two independent variables was found. This means that the academic department in which a student was enrolled did not have a real impact on the student’s attitudes towards either instructional method taught. Also, a non-significant covariate by group interaction effect suggested that the differences on the dependent variables between the two groups did not vary significantly as a function of the PMA covariate.
198
Following the significant multivariate results, an ANCOVA test was performed to see where the differences arose. Before using the ANCOVA, assumptions were checked for conformity; the evaluation of the main required assumptions was satisfactory.
The ANCOVA results revealed that two facets, Value and Affect, contributed more than the other dependent variables to the overall significance, F (1, 165) = 48.31, p < .001, and F (1, 165) = 9.32, p < .01, respectively, after factoring out the effects of the PMA. Their respective effect sizes ranged from small to moderate.
Research question 3: Are there differences in the mean attitudes scored by students as a function of the academic department, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’ Perceived
Mathematics Ability?
Statistically significant department main effects were found after adjustment for differences among the groups with respect to the PMA. This means that there were statistically significant differences in adjusted mean scores between the attitude scores reported by mathematics department students and those reported by psychology department students. Irrespective of the instructional method taught, mathematics department students achieved higher scores on the SAMU instrument than did psychology students, after controlling for the PMA.
The ANCOVA follow-up results showed that two facets contributed more than the other dependent variables to the department main effects: Affect and Difficulty, F (1, 165) = 26.03, p < .01 and F (1, 165) = 8.19, p < .01, respectively.
These results are similar to those found by Ford, Liwag-McLamb, and Foley (1998).
199
Research question 4: Are there differences in the mean attitudes scored by students as a function of the instructional method used, measured by the 4 SAMU subscales, after adjustment for differences among the groups with respect to the students’
Perceived Mathematics Ability?
Statistically significant method main effects were found, after adjustment for differences among the groups with respect to the PMA. This means that there were significant differences in adjusted mean scores between attitude scores reported on the dependent variables and the covariate by students who were taught the iteration algorithm and attitude scores reported by students who were exposed to the traditional method.
The method variable moderated the relationship between the academic department and the total attitude scores on the dependent variables and the covariate, such that the pattern of department score differences varied significantly between the iteration algorithm and the traditional method. Both mathematics and psychology department students scored higher on the iteration algorithm than they did on the traditional method. In other words, probably due to their background, mathematics department students showed more positive attitudes toward both instructional methods than did psychology students. For both groups, the majority of adjusted mean scores on the iteration algorithm exceeded those on the traditional method.
The above findings imply that the Affect facet contributed the most to the overall significance and to the main effect significance, followed by Value and Difficulty. All of these MANCOVA and ANCOVA results are similar to those found by Dauphine, Schau, and Stevens (1997), Ford, Liwag-McLamb, and Foley (1998), Mills (2004), and Schau, Dauphine, and Stevens (1995).
Explanations of the Results
A first explanation for these results may be that attitude scores reported by the students were independent of the academic department (mathematics or psychology).
Students enrolled in a MATH 250 class or in a PSY 120 class could have come from any academic department. They were not necessarily mathematics or psychology major students.
A second explanation may be that students enrolled in a MATH 250 course had higher PMA scores or a stronger mathematics background (more mathematics courses taken, for instance) than their counterparts taking a PSY 120 course. This explanation is based on the fact that a MATH 250 course is generally offered to sophomores, whereas a
PSY 120 is taught essentially to freshmen.
A third explanation could be that the instructors involved with the survey administration were completely new to the iteration algorithm. They had to master it first before teaching it to their respective students. It is obviously a difficult task for an instructor to learn a new algorithm in just a few hours before explaining it to the students.
This difficulty could have resulted in less effective presentation.
Additionally, it should be acknowledged that instructor characteristics are likely to have some impact on students' attitudes towards the instructional method taught. Comments made by some students revealed that at least one instructor did not feel confident and seemed unprepared when teaching the iteration algorithm; he changed his class interval choices at least 3 times.
Another reason could be that the mathematics department instructors were PhD students, whereas in the psychology department an Assistant Professor conducted the presentations. It is possible that the students enrolled in MATH 250 classes were more at ease with their instructors, Asian PhD students, than those registered in PSY 120 classes taught by a White American Assistant Professor.
Finally, psychology students may have developed more negative feelings towards statistics than did mathematics students. Negative student attitudes towards statistics may have created a major obstacle for effectively learning the iteration algorithm (Cashin &
Elmore, 1997; Fullerton & Umphrey, 2001; Perney & Ravid, 1991; Schultz, Drogosz,
White, & Distefano, 1988; Waters, Matelli, Zakrajsek, & Popovich, 1988).
As indicated in Table 24, all 4 facets of the SAMU instrument are positively associated with each other and with the covariate. More specifically, high levels of Affect are positively and significantly related to high levels of Difficulty, and a more favorable PMA is positively correlated with Value and Affect. Cognitive is not significantly associated with any of the 3 other facets or with PMA.
Implications of the Study
This section will offer recommendations to practitioners on how they can best use the results and findings found from this study. The results have implications for students, teachers, researchers, and practitioners. These implications are summarized below.
Implications for Students and Teachers
A unique solution approach. For learning and teaching purposes, the iteration algorithm stops when a first solution is attained. Although the review of the related literature established that the unique-solution feature was shown to work well only for some learners, the iteration algorithm is, in light of the results and findings of this study, the method most recommended here for constructing grouped-data frequency tables. It has further been demonstrated that a unique-solution approach works exceptionally well for children with autism, who face serious cognitive difficulties dealing with multiple-solution problems (Bernard-Opitz & Nakhoda-Sapuan, 2001).
The unique-solution approach may help to reduce boredom for some students and frustration, anger, and fear for others (Schultz, Drogosz, White, & Distefano, 1998), as well as vagueness. As such, it is proposed that the iteration algorithm (unique-solution approach) be taught predominantly to secondary school and undergraduate students.
Because of diverse learning styles (Allwood, 1990; Harris & Schau, 1999;
McKeachie, 2003) and the fact that significant differences could not be established from the Monte Carlo simulations' findings, the results of this study suggest that the iteration algorithm be used in combination with any traditional method, whenever applicable, to accommodate all kinds of learning styles. Onwuegbuzi (1998) acknowledged that learning styles have been implicated as antecedents of statistics anxiety. Saxe (1990, p. 122) added that “research on inquiry demonstrates that learners come to change their ideas when they discover alternatives that seem plausible and appear to be more useful”.
Implications for Researchers and Practitioners
A multiple solutions approach. As stated in chapter 2, there are some other statistics users in general and students in particular who are minimally bored by vagueness and who would prefer a teaching procedure with little structure. This may provide a kind of spontaneity and freedom to pursue independent ideas that can be exciting and motivating for them (Harris & Schau, 1999).
As put by Moore and Notz (2006), “statistics software will choose classes of a histogram for you, and may use slightly different rules from one computer software to another. The computer’s choice is usually a good one, but you can change it if you want”
(p. 207). Since a default histogram generated by computer software might be too skewed or have too many empty classes, the iteration algorithm can be used to decide how many class intervals the histogram should have, its class width, and its starting point.
For practitioners and researchers doing their own work, the iteration algorithm can be extended beyond the first solution achieved and run until it produces a first grouped-data frequency table with either an empty first class interval or an empty last class interval. Multiple acceptable solutions would be produced, and only the most appropriate solution, according to the needs of the statistical study and the researcher's personal taste, would be retained. Based on the review of the related literature and personal experience, it is urged that the iteration algorithm (multiple-solutions approach) be taught predominantly to graduate students and researchers doing their own work.
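The multiple-solutions idea can be sketched as a loop that builds a candidate table for each number of classes and stops at the first table with an empty end class, collecting every acceptable table produced along the way. This is only an illustration of the stopping rule: the dissertation's five formulas are not reproduced here, and the simple width and starting-point choices below are stand-ins.

```python
# Heavily simplified sketch of the multiple-solutions stopping rule:
# iterate over candidate class counts, build a table for each, and stop
# when a table's first or last class is empty. Width and start are
# simple stand-ins for the iteration algorithm's actual formulas.
import math

def candidate_tables(data):
    """Return a list of (n_classes, width, frequencies) solutions."""
    lo, hi = min(data), max(data)
    inclusive_range = hi - lo + 1
    solutions = []
    for k in range(2, len(data)):                    # candidate class counts
        width = math.ceil(inclusive_range / k)
        edges = [lo + i * width for i in range(k + 1)]
        freq = [sum(e <= x < e + width for x in data) for e in edges[:-1]]
        if freq[0] == 0 or freq[-1] == 0:            # empty end class: stop
            break
        solutions.append((k, width, freq))
    return solutions

scores = [45, 49, 52, 55, 58, 59, 61, 63, 66, 67, 68, 70, 71, 74, 78, 80]
for k, width, freq in candidate_tables(scores):
    print(f"{k} classes, width {width}: {freq}")
```

Each printed line is one acceptable table; the researcher would then pick the one that best fits the study's needs.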
Summary of the Implications
If it is true that human beings learn by doing (Schank, 2005) and by personal involvement (Gelman, 2005), then being able to modify a default histogram generated by SPSS, for instance, to better represent a dataset (Garfield & Chance, 2000) might be more beneficial to statistics users than simply producing the default histogram. The choices derived from applying the iteration algorithm (unique-solution approach) can be used to modify the default histogram. As a well-structured instructional method that yields a unique solution in a guess-free fashion, the iteration algorithm is what this study chiefly favors for statistics instructors and graduate students taking an elementary statistics course.
Limitations
The limitations listed in chapter 1 were written before the data were collected.
This section reviews the limitations of the study after the data were collected and analyzed.
A first limitation has to do with the scope of the Monte Carlo study. The Monte Carlo simulations were limited to only 150 cases because the algorithm SPSS uses to generate histograms could not be obtained. The grouped-data statistical measure differences between the two algorithms were computed in Microsoft Excel, but the data had to be entered manually into the spreadsheet. Had the SPSS algorithm been available, the grouped data yielded by the iteration algorithm could have been compared to those produced by the SPSS defaults on a much larger scale comprising thousands of cases.
A second limitation is that the subscale scores have relatively poor reliability, especially for Value and Cognitive. In particular, Value was identified as one of the two facets contributing most to the statistically significant overall difference observed between the two instructional methods, so one would have expected it to produce a strong reliability coefficient. Cognitive also had a low reliability coefficient, probably because of the difficulty some students experienced in correctly applying the five formulas of the iteration algorithm while constructing their own grouped-data frequency tables.
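Reliability of this kind is conventionally reported as Cronbach's coefficient alpha, computed from the item-score variances and the variance of the subscale totals. The following sketch uses invented scores, not the study's SAMU responses:

```python
def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of columns, one list of
    respondent scores per item (population variances throughout)."""
    def pvar(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent subscale totals
    return k / (k - 1) * (1 - sum(pvar(col) for col in items) / pvar(totals))

# Perfectly consistent items give alpha = 1.0; noisier items drive it down.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

A subscale such as Value or Cognitive would show a low alpha when its items fail to vary together across respondents.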
Third, the exploratory factor analysis results did not confirm the factor structure underlying the item responses reported by the authors of the original SATS instrument. The SAMU items loaded more cleanly on three factors rather than the four identified by the original scale's authors. In addition, the weak correlations between subscale factors (albeit mostly statistically significant, owing to a relatively large sample size) reflect a lack of equivalence in the measured constructs.
Conclusion
The conclusions of this study were based on the data collected from the surveys and are assumed to be characteristic of the students of the two Ohio University departments selected. The anticipated result was that students taught the iteration algorithm would hold more positive attitudes toward the instructional method used than those who received the traditional method when constructing their own grouped-data frequency tables. This expectation was borne out. Mathematics and psychology students also differed significantly on the set of two adjusted means.
The ECG variable was considered the covariate in the original plan for this study. After the original statistical analyses were conducted, the ECG variable was found to be statistically insignificant. One explanation of why ECG was a poor covariate, producing an interaction effect, could be that ECG used a 10-point scale (scores ranged from 1 to 10), whereas the other items ranged from 1 to 7. Another explanation could be that the two courses (MATH 250 and PSY 120) perhaps differ in content; mathematics and psychology students might face different difficulty levels and thus expect different grades.
In the final data analyses, the initial ECG covariate was replaced by PMA as the variable that adequately measures the students' perceived ability in statistics. With this covariate, PMA was found to be significantly related to all of the dependent variables other than Cognitive. This means that the group means on the dependent variables were significantly adjusted for differences on the PMA covariate.
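The adjustment follows the standard ANCOVA formula: each group's outcome mean is shifted along the pooled within-group regression slope by the distance between the group's covariate mean and the grand covariate mean. The sketch below uses invented (covariate, outcome) pairs, not the study's PMA scores:

```python
def adjusted_means(groups):
    """ANCOVA-style adjusted means; `groups` maps a group label to a
    list of (covariate, outcome) pairs.  Illustrative data only."""
    sxy = sxx = 0.0  # pooled within-group cross-products and sums of squares
    stats = {}
    for label, pairs in groups.items():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x in xs)
        stats[label] = (mx, my, len(pairs))
    b = sxy / sxx  # pooled within-group regression slope
    n = sum(size for _, _, size in stats.values())
    grand_x = sum(mx * size for mx, _, size in stats.values()) / n
    return {label: my - b * (mx - grand_x)
            for label, (mx, my, _) in stats.items()}
```

With a common within-group slope, the adjusted means isolate the group difference that remains after equating the groups on the covariate.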
The review of the related literature revealed that people learn in different ways. As Allwood (1990) suggested, there are many roads to learning; students, like anyone else, learn in different ways and vary in their abilities to perform certain tasks, so no single instructional method will work for everybody. In the same way that many authors suggested using histograms in combination with grouped-data frequency tables, the iteration algorithm is recommended for use alongside any traditional method, to accommodate both those who cannot stand ambiguity and might prefer a unique-solution approach and those who prefer a multiple-solution approach.
The iteration algorithm is particularly promising in that it offers a unique-solution approach primarily for teaching purposes (the only aspect of the iteration algorithm discussed in this study). It can also be extended to serve as a multiple-solution approach for researchers. This approach is supported by Lapin (1980), for instance, who stated that “one way to decide how many class intervals to use would be to try several – plotting a histogram for each and selecting the one that provides the most logical explanation of the underlying population pattern” (p. 21). He further specified that “the histogram chosen for permanent use should be the one that achieves the best balance between summarizing the data and preserving detail” (Lapin, 1987, p. 18). Although the idea of plotting several histograms and choosing the one that best describes the data at hand is shared by Garfield and Chance (2000) and Heaton and Mickelson (2002), this study prefers the unique-solution approach.
To summarize the answer to the first research question, the absolute mean deviations from the raw data yielded by the SPSS algorithm were indeed greater than those yielded by the iteration algorithm. The differences observed between the statistical measure values produced by the two algorithms could not be declared significant, however, as no formal statistical test was performed; only simple comparisons were made.
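That comparison can be reproduced for any single statistic by recomputing it from the grouped table, using class midpoints, and taking the absolute deviation from the raw-data value. A generic sketch, not the Excel workbooks used in the study:

```python
def grouped_mean(table):
    """Mean estimated from a grouped-data frequency table whose rows are
    (lower_edge, upper_edge, frequency), using class midpoints."""
    n = sum(f for _, _, f in table)
    return sum((lo + hi) / 2 * f for lo, hi, f in table) / n

def abs_deviation_from_raw(data, table):
    """Absolute deviation of the grouped-data mean from the raw mean."""
    return abs(grouped_mean(table) - sum(data) / len(data))
```

For the data 1 through 8 grouped into classes (0.5, 4.5) and (4.5, 8.5) with four cases each, the midpoints 2.5 and 6.5 give a grouped mean of 4.5, equal to the raw mean, so the deviation is zero; a grouping whose classes sit off-center shifts the estimate and produces a nonzero deviation.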
The conclusion for the second research question was that there were statistically significant overall differences in the adjusted mean attitude scores of students taught the iteration algorithm versus those who received the traditional method. The analyses did not, however, detect a significant method-by-department interaction effect at either the multivariate or the univariate level.
The conclusion for the third research question was that there were statistically significant differences in adjusted mean attitude scores between the two department sections: the majority of mathematics students reported more positive attitude scores than their counterparts in the psychology department. The conclusion for the fourth research question was that there were statistically significant differences in adjusted mean attitude scores between the two instructional methods: students taught the iteration algorithm held more positive attitudes in all four domains than those exposed to the traditional method. Holding positive attitudes towards the instructional method taught was recognized in the related literature as facilitating effective learning.
The overall conclusion of this study is that the iteration algorithm had a stronger positive impact on students' attitudes towards the instructional method used than did the traditional method. Based on the ANCOVA post hoc test results, most students found the iteration algorithm useful for their future professional and personal lives. Yet, as undergraduates, they might have experienced some difficulties in successfully applying the iteration algorithm to actually build their own grouped-data frequency tables. The iteration algorithm appears to be more appealing to graduate students than to undergraduates: the two pilot studies showed that graduate students reported more positive attitude scores towards the instructional method taught than did the undergraduate students.
Despite the limitations encountered, the findings of this study are promising and encouraging. Nevertheless, only some of them may generalize to the target population. Sprenkle and Moon (1996) suggested that a response rate between 60% and 80% might be needed to generalize at least some findings to the general population. Because the data collected were self-reported, the researcher assumes that participants' responses accurately reflected their feelings (Heppner, Kivlighan, & Wampold, 1992).
Recommendations for Future Research
The following recommendations for future research are based on the review of the literature, the analysis of the data, the results, the conclusions, and the discussion. First, the main focus of this study was an ex post facto analysis conducted without the benefit of a control group and without random assignment of participants. As such, it cannot establish a cause-and-effect relationship between exposure to the iteration algorithm and more positive attitudes toward its applicability. To establish causality, future research should include a control group and randomly assign subjects to the two instructional methods; internal validity requires showing that differences in the dependent variable were caused by the independent variables and not by something else (Stevens, 2002). It is also suggested that the SATS instrument be administered before the SAMU questionnaire.
Second, the current study used a cross-sectional, one-shot survey design. A similar study should be planned within the same school using a longitudinal design to determine whether changes over time become perceptible.
Third, gender may be included as a possible covariate in addition to a variable measuring the students' perceived ability in mathematics.
Fourth, white students were over-represented, which could skew the results. A more diverse accessible population, in terms of demographic characteristics and academic background, along with several additional variables, might provide further insight into the study's findings.
Additionally, this study conducted an exploratory factor analysis. Future studies should use confirmatory factor analysis to replicate the factor-structure findings in a very large sample of participants.
Although the results of this study may still generalize to Ohio University students taking an elementary statistics course, generalization to all American college students is questionable and should be done only with caution. Also, Wisenbaker, Scott, and Nasser (2000) reported that attitude components measured at the end of a course predicted final course achievement, whereas those measured at the beginning did not. Since the surveys in this study were conducted at the beginning of the course, replication is recommended, preferably at both the beginning and the end of the course, while accommodating the shortcomings addressed here.
REFERENCES
Acock, A. C. (2005, November). SAS, Stata, SPSS: A comparison. Journal of Marriage
& Family, 67(4), 1093-1095.
Adkins, D. C. (1964). Statistics: An introduction for students in the behavioral sciences.
Columbus, OH: Charles E. Merrill Books.
Allwood, C. M. (1990, June). On the relation between justification of solution method
and correctness of solution in statistical problem solving. Scandinavian Journal
of Psychology, 31(3), 181-190.
Anderson, A. J. B. (1989). Interpreting data: A first course in statistics. New York:
Chapman and Hall.
Anderson, D., Sweeney, D., & Williams, T. (2004). Essentials of modern business
statistics with Microsoft Excel (2nd ed.). Mason, OH: South-Western.
Arsham, H. (2001). Statistical data analysis: Prove it with statistics. Retrieved
October 10, 2004, from: http://ubmail.ubalt.edu/~harsham/stat-data/opre330.htm
Bakker, A. (2004). Design research in statistics education on symbolizing and computer
tools. (Doctoral dissertation, Freudenthal Instituut, Universiteit Utrecht, the
Netherlands, 2004). Available online at:
http://www.stat.auckland.ac.nz/~iase/publications/dissertations/dissertations.php
Bandalos, D. L., Finney, S. J., & Geske, J. A. (2003, September). A model of statistics
performance based on achievement goal theory. Journal of Educational
Psychology, 95(3), 604-616.
Bandura, A. (1977, March). Self-efficacy: Toward a unifying theory of behavioral
change. Psychological Review, 84, 191-215.
Bandura, A. (1986). Social foundations of thought and action: A social cognitive theory.
Englewood Cliffs, NJ: Prentice-Hall.
Bartz, A. E. (1979). Descriptive statistics for education and the behavioral sciences
(5th ed.). New York: Macmillan.
Bartz, A. E. (1999). Basic statistical concepts (4th ed.). Upper Saddle River, NJ: Merrill.
Batanero, C., Godino, J. D., Green, D. R., Holmes, P., & Vallecillos, A. (1994). Errors and difficulties in understanding elementary statistical concepts. International Journal of Mathematical Education in Science and Technology, 25(4), 527-547.
Behrens, J. T. (1997). Toward a theory and practice of using interactive graphics in
statistical education. In J. B. Garfield and G. Burrill (Eds.), Research on the role
of technology in teaching and learning statistics, 111-121. Voorburg, The
Netherlands: International Statistical Institute.
Beins, B. (1985, October). Teaching the relevance of statistics through consumer-oriented
research. Teaching of Psychology, 12(3), 168-169.
Bell, J. A. (1998, October). Success in statistics: gender, learning style, nationality, or
student classification. Proceedings of the Fifth Annual Southwestern Business
Administration Teaching Conference, 4, 27-29.
Benedict, J. O., & Anderton, J. B. (2004, Summer). Applying the just-in-time teaching
approach to teaching statistics. Teaching of Psychology, 31(3), 197-199.
Bluman, A. G. (2004). Elementary statistics: A step-by-step approach (6th ed.). Boston,
MA: McGraw Hill.
Borresen, C. R. (1990, Winter). Success in introductory statistics with small groups.
College Teaching, 38(1), 26-28.
Bradstreet, T. E. (1996, February). Teaching introductory statistics courses so that non-
statisticians experience statistical reasoning. The American Statistician, 50(1), 69-
78.
Brase, C. H., & Brase, C. P. (1999). Understanding basic statistics: Concepts and
methods. Boston, MA: Houghton Mifflin.
Brase, C. H., & Brase, C. P. (2001). Understanding basic statistics (2nd ed.).
Boston, MA: Houghton Mifflin.
Brown, J. D. (2000, Autumn). Statistics corner. Questions and answers about language testing statistics: What is construct validity? Shiken: JALT Testing & Evaluation SIG Newsletter, 4(2), 7-10. Retrieved October 12, 2002 from
http://www.jalt.org/test/bro_8.htm
Brown, T. S., & Brown, J. T. (1995, December). Prerequisite course grades and attitudes
toward statistics. College Student Journal, 29(4), 502-507.
Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs
for research. Boston, MA: Houghton Mifflin.
Carlisle, Y. (1985). Descriptive statistics. Supervision: Technical aspects of supervision.
The choice series # 43. A self-learning opportunity. Washington, England:
Northern Regional Management Centre (ERIC Document Reproduction Service
No. ED268248).
Carver, R. P. (1978, August). The case against statistical significance testing. Harvard
Educational Review, 48(3), 378-399.
Cashin, S. E., & Elmore, P. B. (1997). Instruments used to assess attitudes toward
statistics: A psychometric evaluation. Paper presented at the annual meeting of the
American Educational Research Association, Chicago, IL.
Champion, D. J. (1970). Basic statistics for social research. Scranton, PA: Chandler.
Chatfield, C. (1995). Problem solving: A statistician's guide (2nd ed.). New York:
Chapman & Hall.
Cherian, V. I., & Glencross, M. J. (1997, June). Sex, socioeconomic status, and
attitude toward applied statistics among postgraduate education students.
Psychological Reports, 80(3), 1385-1386.
Chou, Y. L. (1963). Applied business and economics statistics. New York: Holt, Rinehart
and Winston.
Cobb, P. (1994, October). Where is the mind? Constructivist and socio-cultural
perspectives on mathematical development. Educational Researcher, 23(7), 13-20.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.).
Hillsdale, NJ: Lawrence Erlbaum Associates.
Coiro, J. (2005, February). Philosophy of teaching and learning statement.
Comprehensive Examination for Cognition and Instruction. Storrs, CT:
University of Connecticut.
Conners, F. A., McCown, S. M., & Roskos-Ewoldsen, B. (1998, January). Unique challenges in teaching undergraduate statistics. Teaching of Psychology, 25(1), 40-42.
Craft, J. L. (1990). Statistics and data analysis for social workers. Itasca, IL: Peacock.
Cruise, R. J., Cash, R. W., & Bolton, D. L. (1985, October). Development and validation
of an instrument to measure statistical anxiety. American Statistical Association
Proceedings of the Section on Statistics Education, 4, 92-97.
Curcio, F. R. (1987, November). Comprehension of mathematical relationships expressed in graphs. Journal for Research in Mathematics Education, 18(5), 382-393.
Dauphinee, T. L., Schau, C., & Stevens, J. J. (1997, January). Survey of attitudes toward
statistics: Factor structure and factorial invariance for women and men. Structural
Equation Modeling, 4(2), 129-141.
Davies, G. R. (1929, December). The analysis of frequency distributions. Journal of the
American Statistical Association, 24(168), 349-359.
Davies, O. L., & Goldsmith, P. L. (1980). Statistical methods in research and production.
New York: Longman.
De Veaux, R. D., Velleman, P. F., & Bock, D. E. (2006). Intro Stats (2nd ed.). Boston,
MA: Pearson Addison Wesley.
DeBord, K., Aruguete, M., & Muhlig, J. (2004, February). Are computer assisted
teaching methods effective? Teaching of Psychology, 31(1), 65-69.
DeRoma, V. M., Martin, K. M., & Kessler, M. L. (2003, June). The relationship
between tolerance for ambiguity and need for course structure. Journal of
Instructional Psychology, 30(2), 104-109.
Devore, J. L., & Peck, R. (2005). The exploration and analysis of data (5th
ed.). Belmont, CA: Brooks/Cole.
Dillbeck, M. C. (1983, February). Teaching statistics in terms of the knower. Teaching of
Psychology, 10(1), 18-20.
Doane, D. P. (1976, November). Aesthetic frequency classification. American
Statistician, 30(4), 181-183.
Dubois, E. N. (1979). Essential statistical methods for business. New York: McGraw-
Hill.
Dugas, M. J., Gosselin, P., & Ladouceur, R. (2001, October). Intolerance of uncertainty
and worry: investigating specificity in a non clinical sample. Cognitive Therapy
and Research, 25(5), 551-558.
Dunn, D. S. (2001). Two heads are better than one: Learning statistics in common. Paper
presented at the Annual Meeting of the American Psychological Association
(109th, San Francisco, CA, August 24-28, 2001).
Eggen, P., & Kauchak, D. (2001). Educational psychology: Windows on classrooms (5th
ed.). Upper Saddle River, NJ: Prentice Hall.
Elifson, K. W., Runyon, R. P., & Haber, A. (1990). Fundamentals of social statistics
(2nd ed.). New York: McGraw-Hill.
Elmore, P. B., Lewis, E. L., & Bay, M. L. (1993). Statistics achievement: A function of
attitudes and related experience. Paper presented at the annual meeting of the
American Educational Research Association, Atlanta, GA.
Elzey, F. F. (1967). A first reader in statistics. Belmont, CA: Wadsworth.
Fago, G. C. (1996). Psychometric scales: Learning by doing. Proceeding of the
Annual Conference on Undergraduate Teaching of Psychology (10th, Ellenville,
NY, March 20-22, 1996).
Farnsworth, D. L. (2000, Autumn). The case against histograms. Teaching Statistics,
22(3), 81-85.
Fennema, E., & Behr, M. J. (1980). Individual differences and learning of mathematics
In: R. J. Shumway (ed.), Research in Mathematics Education, 356-387. Reston,
VA: National Council of Teachers of Mathematics.
Ferguson, G. A. (1981). Statistical analysis in psychology and education (5th ed.). New
York: McGraw-Hill Book.
Ferguson, G. A. (1989). Statistical analysis in psychology and education (6th ed.). New
York: McGraw-Hill Book.
Finney, S. J., & Schraw, G. (2003, April). Self-efficacy beliefs in college statistics courses. Contemporary Educational Psychology, 28(2), 161-186.
Fisch, R. (1971). Course evaluation, test anxiety, and final test results in a statistics course. Zeitschrift für Entwicklungspsychologie und Pädagogische Psychologie, 30, 361-366.
Follettie, J. F. (1980). Bar graph-using operations and response time (Technical Report).
Los Alamitos, CA: Southwest Regional Laboratory for Educational Research and
Development. (ERIC Document Reproduction Service No. ED250381).
Ford, T. M., Liwag-McLamb, M. G., & Foley, L. A. (1998, June). Perceptions of rape
based on sex and sexual orientation of victim. Journal of Social Behavior &
Personality, 13(2), 253-263.
Fowler, W. A. (1988). Survey research methods. Newbury Park, CA: Sage.
Fox, J. A., Levin, J., & Harkins, S. (1993). Elementary statistics in behavioral research.
New York: Harper Collins.
Fraenkel, J. R., & Wallen, N. E. (2003). How to design and evaluate research in
education (5th ed.). Boston, MA: McGraw-Hill.
Frank, H., & Althoen, S. C. (1994). Statistics: Concepts and applications. Cambridge,
England: Cambridge University.
Frankfort-Nachmias, C., & Leon-Guerrero, A. (2000). Social statistics for a diverse
society. Thousand Oaks, CA: Pine Forge.
Freedman, D., & Diaconis, P. (1981). On the histogram as a density estimator: L2 theory.
Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 57, 453-476.
Freire, P. (1993). Pedagogy of the oppressed (Myra Bergman Ramos, Trans.). New York:
Continuum. (Original work published 1970).
Freund, J. E. (1998). Modern elementary statistics (12th ed.). Englewood Cliffs, NJ:
Prentice-Hall.
Friel, S. N., Curcio, F. R., & Bright, G. W. (2001, March). Making sense of graphs:
Critical factors influencing comprehension and instructional implications. Journal
for Research in Mathematics Education, 32(2), 124-158.
Fullerton, J. A., & Umphrey, D. (2001). An analysis of attitudes toward statistics:
Gender differences among advertising majors. Paper presented at the annual
meeting of the Association for Education in Journalism and Mass Communication
(Washington, DC, August 5-8, 2001). (ERIC Document Reproduction Service
No. ED456479).
Galagedera, D., Woodward, G., & Degamboda, S. (2000, September). An investigation
of how perceptions of mathematics ability can affect elementary statistics
performance. International Journal of Mathematical Education in Science and Technology, 31(5), 679-689.
Garfield, J. (1995, April). How students learn statistics. International Statistical Review,
63(1), 25-34.
Garfield, J., & Chance, B. (2000). Assessment in statistics education: Issues and
challenges. Mathematical Thinking and Learning, 2(1&2), 99-125.
Gay, L. R., & Airasian, P. (2000). Educational research (6th ed.). Upper Saddle River,
NJ: Prentice-Hall.
Gelman, A. (2005, February). A course on teaching statistics at the university level.
Special section: Preparing graduate students to teach statistics. The American
Statistician, 59(1), 1-4.
Gelman, A. & Nolan, D. (2002). Teaching statistics: A bag of tricks. London, UK:
Oxford University.
Giorgis, C., & Johnson, N. J. (2001, Summer). The learning process. The Reading
Teacher, 55(1), 86-94.
Giraud, G. (1997, January). Cooperative learning and statistics instruction. Journal of
Statistics Education, 5(1), 1-14.
Glasnapp, D. R., & Poggio, J. P. (1985). Essentials of statistical analysis for the
behavioral sciences. Columbus, OH: Charles E. Merrill.
Goodwin, C. J. (2002). Research in psychology (3rd ed.). NY: John Wiley & Sons.
Graham, A. (1987). Statistical investigations in the secondary school. New York:
Cambridge University.
Gratz, Z. S., Volpe, G. D., & Kind, B. M. (1993). Attitudes and achievement in
introductory psychological statistics classes: Traditional versus computer-
supported instruction. Paper presented at the annual Conference of Teachers of
Psychology, Ellenville, New York, March 25, 1993, 17 p.
Gravetter, F. J., & Wallnau, L. B. (1985). Statistics for the behavioral sciences. St. Paul,
MN: West.
Gravetter, F. J., & Wallnau, L. B. (1992). Statistics for the behavioral sciences: A
first course for students of psychology and education (3rd ed.). St. Paul, MN:
West.
Gravetter, F. J., & Wallnau, L. B. (2000). Statistics for the behavioral sciences
(5th ed.). Belmont, CA: Wadsworth/Thomson Learning.
Green, S. B., & Salkind, N. J. (2003). Using SPSS for windows and
Macintosh: Analyzing and understanding data (3rd ed.). Upper Saddle River, NJ:
Prentice-Hall.
Greeno, J., Collins, A., & Resnick, L. (1996). Cognition and learning. In: D. Berliner and
R. Calfee (eds.), Handbook of Educational Psychology, pp. 15-46. New York:
Macmillan.
Groebner, D. F., & Shannon, P. W. (1981). Business statistics: A decision-making
approach. Columbus, OH: Charles E. Merrill.
Hamburg, M. (1989). Statistical analysis for decision making (5th ed.). San Diego,
CA: Harcourt Brace Jovanovich.
Hamilton, L. C. (1990). Modern data analysis: A first course in applied statistics (2nd
ed.). Belmont, CA: Wadsworth.
Harnett, D. L. (1982). Statistical methods (2nd ed.). Reading, MA: Addison-Wesley.
Harris, M. B. (1998). Basic statistics for behavioral science research (2nd ed.).
Boston, MA: Allyn and Bacon.
Harris, M. B., & Schau, C. (1999). Successful strategies for teaching statistics. In S. N.
Davis, M. Crawford, & J. Sebrechts (Eds.). Coming into Her Own: Educational
Success in Girls and Women (pp. 193-210). San Francisco, CA: Jossey-Bass.
Hawkins, C. A., & Weber, J. E. (1980). Statistical analysis: Applications to business and
economics. New York: Harper & Row.
Hays, W. L. (1967). Basic statistics: Basic concepts in psychology series. Belmont, CA:
Brooks.
Hays, W. L. (1981). Statistics (3rd ed.). New York: Holt, Rinehart and Winston.
Heaton, R. M., & Mickelson, W. T. (2002, March). The learning and teaching of
statistical investigation in teaching and teacher education. Journal of Mathematics Teacher Education, 5(1), 35-59.
Heckler, J. B., Fuqua, R. W., & Pennypacker, H. S. (1975, October). Errorless
differentiation of academic responses by college students. Teaching of Psychology,
2(3), 103-107.
Henson, R. K. (2000, November). A primer on coefficient alpha. Paper presented at the
annual meeting of the American Educational Research Association, San
Francisco, CA (ERIC Document Reproduction Service No. ED31448).
Heppner, P. P., Kivlighan, D. M., & Wampold, B. E. (1992). Major research designs.
In C. Verduin (Ed.), Research design in counseling (pp. 115-165). Pacific Grove,
CA: Brooks-Cole.
Hertzberg, P. A. (1991). Principles of statistics. New York: John Wiley & Sons.
Hildebrand, D. K. (1986). Statistical thinking for behavioral scientists. Boston, MA:
Duxbury.
Hilton, S. C., Schau, C., & Olsen, J. A. (2004, January). Survey of attitudes towards statistics: Factor structure invariance by gender and by administration time. Structural Equation Modeling, 11(1), 92-109.
Hinkle, D. E., Wiersma, W., & Jurs, S. G. (2003). Applied statistics for the behavioral
sciences (5th ed.). Boston, MA: Houghton Mifflin.
Hoaglin, D. C., Mosteller, F., & Tukey, J. W. (Eds.) (1983). Understanding robust and
exploratory data analysis. New York: Wiley.
Hogg, R. V. (1991, November). Statistical education: Improvements are badly needed.
The American Statistician, 45(4), 342-343.
Hopkins, K. D., Glass, G. V., & Hopkins, B. R. (1987). Basic statistics for the behavioral
sciences (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Hovermill, J. A. (2003). Technology supported inquiry learning in mathematics and
statistics with Fathom: A professional development project. (Doctoral
dissertation, The University of Colorado, January, 2004). Dissertation Abstracts
International, 64(7-A), 2416-2666.
Howell, D. C. (1997). Statistical methods for psychology. (4th ed.). Belmont, CA:
Duxbury.
Huck, S. W. (2004). Reading statistics and research (4th ed.). Boston, MA: Pearson
Education.
Hulsizer, M. R. (2004). Frequency distributions and graphs. Course notes: Introduction to
Measurement and statistics (PSYC 2750 02, chapter 2). Retrieved December 27,
2004 from http://www.wester.edu/~hulsizer/class/statsu01/freqdist.html
Hunt, N. (2003, Summer). Handling continuous data in Excel. Teaching Statistics, 25(2),
42-45.
Hyndman, R. J. (1995). The problem with Sturges’ rule for constructing histograms. A
short note (2 pages). Unpublished article.
Iman, R. L., & Conover, W. J. (1982). Modern business statistics. New York: John Wiley
& Sons.
Isaac, S., & Michael, W. B. (1997). Handbook in research and evaluation (4th ed.).
San Diego, CA: Educational and Industrial Testing Services.
Ishikawa, K. (1986). Guide to quality control. White Plains, NY: Unipub, Kraus
International.
Jaeger, R. M. (1990). Statistics: A spectator sport (2nd ed.). Newbury Park,
CA: Sage.
Jarrett, J., & Kraft, A. (1989). Statistical analysis for decision making. Boston, MA:
Allyn and Bacon.
Johnson, R. A., & Bhattacharya, G. K. (1985). Statistics: Principles and methods (2nd
ed.). New York: John Wiley & Sons.
Johnson, R. A., & Bhattacharya, G. K. (2001). Statistics: Principles and methods (4th
ed.). New York: John Wiley & Sons.
Johnson, D. G., & John, J. A. (2003, February). Use of demonstrations and experiments
in teaching business statistics. Journal of Applied Mathematics and Decision
Sciences, 7(2), 93-103.
Johnson, R., & Kuby, P. (2007). Elementary statistics (10th ed.). Belmont, CA:
Thompson Brooks/Cole.
Keeler, C. M. & Steinhorst, R. K. (1995, March). Using small groups to promote active
learning in the introductory statistics course: A report from the field. Journal of
Statistics Education [Online], 3(2).
(http://www.amstat.org/publications/jse/v3n2/keeler.html).
Keller, G. (2001). Applied statistics with Microsoft Excel. Pacific Grove, CA: Duxbury.
Keller, G., & Warrack, B. (2003). Statistics for management and economics (6th ed.).
Pacific Grove, CA: Brooks/Cole.
Kennedy, R. L., & McCallister, C. J. (2001, November). Attitudes toward advanced and
multivariate statistics when using computers. Paper presented at the annual
meeting of the Mid-South Educational Research Association, Little Rock, AR.
Kerr, A. W., Hall, H. K., & Kozub, S. A. (2002). Doing statistics with SPSS.
Thousand Oaks, CA: Sage.
Khazanie, R. (1986). Elementary statistics: In a world of applications (2nd ed.).
Glenview, IL: Scott, Foresman and Company.
Kiess, H. O. (1996). Statistical concepts for the behavioral sciences (2nd ed.).
Needham Heights, MA: Allyn & Bacon.
Knoke, D., & Bohrnstedt, G. W. (1991). Basic social statistics. New York: Peacock.
Kolstoe, R. H. (1969). Introduction to statistics for the behavioral sciences.
Homewood, IL: Dorsey.
Kooker, E. W., & Robb, G. P. (1982). Introduction to descriptive statistics. Springfield,
IL: Charles C. Thomas.
Korin, B. P. (1975). Statistical concepts for the social sciences. Cambridge, MA:
Winthrop.
Kottke, J. L. (2000, Summer). Mathematics proficiency, statistics knowledge, attitudes
toward statistics, and measurement course performance. College Student Journal,
34(3), 334-347.
L’Esperance, W. L. (1971). Modern statistics for business and economics. New York:
Macmillan.
Lapin, L. L. (1980). Statistics: Meaning and method (2nd ed.). New York: Harcourt
Brace Jovanovich.
Lapin, L. L. (1987). Statistics for modern business decisions (4th ed.). San Diego, CA:
Harcourt Brace Jovanovich.
Larson, R., & Farber, B. (2003). Elementary statistics: Picturing the world (2nd ed.).
Upper Saddle River, NJ: Prentice Hall.
Larson, R., & Farber, B. (2006). Elementary statistics: Picturing the world (3rd ed.).
Upper Saddle River, NJ: Prentice Hall.
Lave, J. (1988). Cognition in practice: Mind, mathematics, and culture in everyday life.
Cambridge, MA: Cambridge University.
Leamnson, R. (1999). Thinking about teaching and learning: Developing habits of
learning with first year college and university students. Sterling, VA: Stylus.
Levin, J., & Fox, J. A. (2004). The essentials: Elementary statistics in social research.
New York: Pearson Education.
Levin, Richard I., & Rubin, David S. (1983). Short course in business statistics.
Englewood Cliffs, N.J.: Prentice-Hall.
Levine, D. M. (2001). Applied statistics for engineers and scientists: Using Microsoft
Excel and MINITAB. Upper Saddle River, NJ: Prentice-Hall.
Litwin, M. S. (1995). How to measure survey reliability and validity. Thousands Oaks,
CA: Sage.
MacDonald, A. D. (1982, January). A stem-and-leaf plot: An approach to statistics. The
Mathematics Teacher 75(1), 25-28.
MacDonald-Ross, M. (1977, Summer). How numbers are shown: A review of research
on the presentation of quantitative data in texts. AV Communication Review,
25(3), 359-409.
Madsen, R. W., & Moeschberger, M. L. (1983). Introductory statistics for business and
economics. Englewood Cliffs, N.J.: Prentice-Hall.
Madsen, R. W., & Moeschberger, M. L. (1986). Statistical concepts with applications to
business and economics (2nd ed.). Englewood Cliffs, N.J.: Prentice-Hall.
Mansfield, E. (1980). Statistics for business and economics: Methods and
applications. New York: W. W. Norton & Company.
Mansfield, E. (1986). Basic statistics with applications. New York: W. W. Norton &
Company.
227
Mason, R. D., Lind, D. A. (1996). Statistical techniques in business
and economics (9th ed.). Boston, MA: Irwin-McGraw-Hill.
Mason, R. D., Lind, D. A., & Marchal, W. G. (1994). Statistics: An introduction (4th
ed.). Forth Worth, TX: Harcourt Brace.
Mason, R. D., & Lind, D. A., & Marchal, W. G. (1996). Statistical techniques in business
and economics (10th ed.). Boston, MA: Irwin-McGraw-Hill.
May, R. B., Masson, M. E. J., & Hunter, M. A. (1990). Application of statistics in
behavioral research. New York: Harper & Row.
Mays, D. A. (2005). Supplement to basic practice of statistics (9th ed.). Boston, MA:
McGraw Hill Custom.
McCall, R. (2001). Fundamental statistics for behavioral sciences (8th ed.).
Belmont, CA: Wadsworth/Thomson Learning.
McClave, J. T., Benson, P. G., & Sincich, T. (2005). Statistics for business and
economics (9th ed.). Upper Saddle River, N.J: Prentice-Hall.
McElroy, Elam E. (1979). Applied business statistics. (2nd ed.). San Francisco, CA:
Holden-Day.
McKeachie, W. J. (2002). Teaching tips: Strategies, research, and theory for college and
university teachers (11th ed.) Boston, MA: Houghton Mifflin.
McLaren, C. H. (2004, January). A comparison of student persistence and performance in
online and classroom business statistics experiences. Decision Sciences Journal of
Innovative Education, 2(1), 1-10.
McMillan, J. H. (1997). Classroom assessment: Principles and practice for effective
instruction. Needham Heights, MA: Allyn & Bacon.
228
McMillan, J. H., & Schumacher, S. (2001). Research in education (5th ed.). New
York: Addison Wesley Longman.
Meletiou, M. M. (2000). Developing students’ conceptions of variation: An untapped
well in statistical reasoning. (Doctoral dissertation, The University of Texas at
Austin, February, 2001). Dissertation Abstracts International, 61, (8-A), 3096-
3422.
Mendenhall, W. (1997). Beginning statistics: A to Z. (2nd ed.) Belmont, CA: Duxbury.
Mendenhall, W., Reinmuth, J. E., Beaver, R., & Duhan D. (1986). Statistics for
management and economics (5th ed.). Boston, CA: Duxbury.
Mendenhall, W., &, Sincich, T. (1996). A second course in statistics:
Regression analysis. Upper Saddle River, NJ: Prentice-Hall.
Metz, K. E, (1997). Dimensions in the assessment of students’ understanding and
application of chance. In I. Gal and J. B. Garfield (Eds.), The Assessment
Challenge in Statistics Education. Burke, VA: IOS.
Miller, J. B. (2000). The quest for the constructivist statistics classroom: Viewing
practice through constructivist theory. (Doctoral dissertation, The Ohio State
University, November, 2000). Dissertation Abstracts International, 61, (5-A),
1774-2081.
Mills, J. D. (2003, Summer). A theoretical framework for teaching statistics. Teaching
Statistics, 25(2), 56-59.
Mills, J. D. (2004, Summer). Students’ attitudes toward statistics: Implications for the
future. College Student Journal, 38(3), 349-361.
229
Milton, J. S., McTeer, P. M., & Corbet, J. J. (1997). Introduction to statistics. New York:
McGraw-Hill.
Minium, E. W. (1978). Statistical reasoning in psychology and education (2nd ed.). New
York: John Wiley & Sons.
Montcalm, D., & Royse, D. (2002). Data analysis for social workers. Boston, MA: Allyn
and Bacon.
Moore, D. S. (1997a, August). New pedagogy and new content: The case of statistics.
International Statistical Review, 65(2), 123-137.
Moore, D. S. (1997b). The active practice of statistics: A text for multimedia learning.
New York: W. H. Freeman and Company.
Moore, D. S. (2000). The basic practice of statistics (2nd ed.). New York: W. H.
Freeman and Company.
Moore, D. S. (2001). Statistics: Concepts and controversies (5th ed.). New York: W. H.
Freeman and Company.
Moore, D. S. & Notz, W. I. (2006). Statistics: Concepts and controversies (6th ed.). New
York: W. H. Freeman and Company.
Moore, R., & Goldiamond, I. (1964, Summer). Errorless establishment of visual
discriminations using fading procedures. Journal of the Experimental Analysis of
Behavior, 7(3), 269-272.
Morris, E. J., Joiner, R., & Scanlon, E. (2002, June). The contribution of computer-based
activities to understanding statistics. Journal of Computer Assisted Learning,
18(2), 114-124.
230
Morris, L. W., Kellaway, D. S., & Smith, D. H. (1978, August). Mathematics Anxiety
Rating Scale: Predicting anxiety experiences and academic performance in two
groups of students. Journal of Educational Psychology, 70 (4), 589-594.
Mosenthal, P. B., & Kirsch, I. S.. (1990a, Summer). Understanding graphs and charts,
Part I. Journal of Reading, 33(3), 371-373.
Mosenthal, P. B., & Kirsch, I. S.. (1990b, Fall). Understanding graphs and charts, Part II.
Journal of Reading, 33(4), 454-457.
Nelson, Lloyd S. (1988, Summer). Technical aid. Notes on the histogram: I. Equal class
intervals. Journal of Quality Technology, 20(3), 211-213.
Oathout, M. J. (1995, April). College students’ theory of learning introductory statistics:
Phase One. Paper presented at the annual meeting of the American Educational
Research Association, San Francisco, CA.
Onwuegbuzie, A. J. (1997, Spring). Writing a research proposal: The role of library
anxiety, statistics anxiety, and composition anxiety. Library and Information
Science, 19(1), 5-33.
Onwuegbuzie, A. J. (1998, Spring). Statistics anxiety: A function of learning style?
Research in the Schools, 5(1), 43-52.
Onwuegbuzie, A. J. (2000, Summer). Attitudes towards statistics assessments.
Assessment and Evaluation in Higher Education, 25(3), 321-339.
Onwuegbuzie, A. J. (2003, December). Modeling statistics achievement among graduate
students. Educational & Psychological Measurement, 63(6), 1020-1038. Sage.
231
Onwuegbuzie, A. J., Slate, J., Paterson, F., Watson, M., & Schwartz, R. (2000, Spring).
Factors associated with underachievement in educational research courses.
Research in the Schools, 7(1), 53-65.
Onwuegbuzie, A. J., Bailey, P., & Daley, C. E. (2002, Spring). The role of foreign
language anxiety and students’ expectations in foreign language learning.
Research in the Schools, 9(1), 33-50.
Onwuegbuzie, A. J., & Wilson, V. A. (2003, April). Statistics anxiety: nature,
antecedents, effects, and treatments – a comprehensive review of the literature.
Teaching in Higher Education, 8(2), 195-209.
Pelosi, M. K., & Sandifer, T. M. (2000). Doing statistics for business with Excel: Data,
inference and decision making. New York: John Wiley & Sons.
Perney, J., & Ravid, R. (1991). The relationship between attitudes towards statistics,
math self-concept, test anxiety and graduate students’ achievement in an
introductory statistics course. Paper presented at the annual meeting of the
American Educational Research Association, Boston, MA, April (ERIC
Document Reproduction Service No. ED318607).
Porter, A., Griffiths, D., &, Hedberg, J. (2003, Spring). From classroom to online
teaching: Experiences in improving statistics education. Journal of Applied
Mathematics and Decision Sciences, 7(2), 65-73.
Potthast, M. J. (1999, Fall). Outcomes of using small-group cooperative learning
experiences in introductory statistics courses. College Student Journal, 33(1), 34-
42.
232
Rajecki, D. W. (2002). Personal ad content analysis teaches statistical applications.
Teaching of Psychology, 29(2), 119-122.
Rangecroft, M. (2002, Summer). The language of statistics. Teaching Statistics, 24(2),
34-37.
Ravid, R. (1994). Practical statistics for educators. Lanham, MD: University Press of
America.
Raymondo, J. C. (1999). Statistical analysis in the behavioral sciences. Boston, MA:
McGraw-Hill College.
Rees, D. G. (1989). Essential statistics (2nd ed.). New York: Chapman and Hall.
Rhoads, T. R., Hubele, N. F. (2000, June). Student attitudes toward statistics before and
after a computer-integrated introductory statistics course. IEEE Transactions on
Education, 43(2), 182-187.
Rhodes, C. (2000). Using statistics in the social sciences: A computer integrated
approach. New York: General Hall.
Roberts, D. M., & Bilderback, E. W. (1980, Spring). Reliability and validity of a statistics
survey. Educational and Psychological Measurement, 40(1), 25-238.
Roberts, D. M., & Saxe, J. E. (1982, Fall). Validity of a statistics attitude survey: a
follow-up study. Educational and Psychological Measurement, 42(3), 907-912.
Roscoe, John T. (1975). Fundamental research statistics for the behavioral sciences (2nd
ed.). New York: Holt, Rinehart and Winston.
Runyon, R. P., & Haber, A. (1984). Fundamentals of behavioral statistics. Reading, MA:
Addison-Wesley.
233
Runyon, R. P., & Haber, A. (1990). Fundamentals of behavioral statistics (3rd ed.).
Reading, MA: Addison-Wesley.
Sanders, D. H., Murphy, A. F., & Eng, R. J. (1976). Statistics: A fresh approach. New
York, NY: McGraw-Hill Book.
Sanders, D. H. (1990). Statistics: A fresh approach (4th ed.). New York, NY: McGraw-
Hill.
Sanders, D. H., & Smidt, R. K. (2000). Statistics: A first course (6th ed.). Boston,
MA: McGraw-Hill.
Sandy, R. (1990). Statistics for business and economics. New York: McGraw-Hill.
Saxe, G. (1990). Culture and cognitive development: Studies in mathematical
understanding. Hillsdale, NJ: Erlbaum.
Schank, R. C. (2005). Lessons in learning, e-learning, and training: Perspectives and
guidance for the enlightened trainee. San Francisco, CA: Pfeiffer.
Schau, C., Stevens, J., Dauphinee, T. L., &, Del Vecchio, A. (1993). Evaluation of two
surveys measuring students’ attitudes toward statistics. Paper presented at the
annual meeting of the American Educational Association, Atlanta, GA.
Schau, C., Stevens, J., Dauphinee, T. L., &, Del Vecchio, A. (1995, October). The
development and validation of the survey of attitudes towards statistics.
Educational and Psychological Measurement, 55(5), 868-875.
Schloss, P. J., & Smith, M. A. (1999). Conducting research. Upper Saddle River:
Prentice-Hall.
234
Schoenfeld, A. H. (1992). Learning to think mathematically: Problem solving,
meta-cognition and sense making in mathematics. In D. A. Grouws (Ed.),
Handbook of Research on Mathematics Teaching and Learning (pp. 334-370).
NY: Macmillan.
Schmidt, M. J. (1975). Understanding and using statistics basic concepts. Lexington,
MA: D.C. Heath & Company.
Schultz, P. A., Drogosz, L. M., White, V. E., & Distefano, C. (1998, Fall). Prior
knowledge, attitude and strategy use in an introduction to statistics course.
Learning and Individual Differences, 10(4), 291-308.
Scott, D. W. (1979, September). On optimal data based histograms. Biometrika, 66(3),
605-610.
Shaughnessy, J. M. (1995). Discussion: Empirical research on technology and teaching
statistics. In J. Garfield and J. Truran (Eds.). Research Papers on Stochastic
Education (pp. 217-219).
Shavelson, R. J., Ruiz-Primo, M. A., &, Mitchell, M. (1996). Student guide for:
Statistical reasoning for the behavioral sciences. Needham Heights, MA: Allyn &
Bacon.
Shavelson, R. J., Ruiz-Primo, M. A., &, Mitchell, M. (2000). Student guide for:
Statistical reasoning for the behavioral sciences (2nd ed.). Needham Heights,
MA: Allyn & Bacon.
Shaw, D. G., Huffman, M. D., & Haviland, M. G. (1987, Summer). Grouping continuous
data in discrete intervals: Information loss and recovery. Journal of Educational
Measurement 24(2), 167-173.
235
Shvyrkov, Vladislav, & Stout, David E. (1980, December). On Sturges’ optimal class
interval. The American Statistical Association: Proceedings of the Business &
Economic Section, 5(5), 471-472.
Simon, J. L., & Bruce, P. (1991, Spring). Resampling: A tool for everyday statistical work, Chance, 4(1), 22-32.
Sincich, T., Levine, D. M., & Stephan, D. (1999). Practical statistics by example using
Microsoft Excel. Upper Saddle River, NJ: Prentice Hall.
Smith, G. (1998, June). Learning statistics by doing statistics. Journal of Statistics
Education [Online], 6(3).
(http://www.amstat.org/publications/jse/v6n3/smith.html).
Sokal, R. R., & Rohlf, F. J. (1995). Biometry (3rd ed.). New York: W. H. Freeman and
Company.
Spatz, C. (2005). Basic statistics: Tales of distributions (8th ed.). Belmont, CA:
Wadsworth/Thomson Learning.
Speer, W. R. (1979, November). “Do you see what I hear?” A look at individual learning
styles. Arithmetic Teacher, 27(3), 22-26.
Spence, J. T., Cotton, J. W., Underwood, B., J., & Duncan, C., P. (1990). Elementary
statistics (5th ed.). Englewood Cliffs, NJ: Prentice-Hall.
Spiegel, M. R. (1961). Theory and problems of statistics. Schaum’s Outline Series. New
York: Schaum.
Spiegel, M. R., & Stephens, L. J. (1998). Theory and problems of statistics (3rd ed.).
Schaum’s Outline Series. New York: McGraw-Hill.
Sprenkle, D. H., & Moon, S. M. (Eds.) (1996). Research Methods in Family Therapy.
New York : Guilford.
236
Sprinthall, R. C. (2000). Basic statistical analysis (6th ed.). Needham Heights, MA:
Allyn and Bacon.
Stevens, J. (1996). Applied multivariate statistics for the social science (3rd ed.).
Mahwah, NJ: Lawrence Erlbaum Associates.
Stevens, J. (2002). Applied multivariate statistics for the social science (4th ed.).
Mahwah, NJ: Lawrence Erlbaum Associates.
Stockburger, D. W. (2001). Introductory statistics: Concepts, models, and applications.
Available online at: http://www.psychstat.smsu.edu/introbooks/sbk09.htm.
Sturges, H. (1926, March). The choice of a class interval. Journal of the American
Statistical Association 21(153), 65-66.
Sullivan, M. M. (1993, November). Students learn statistics when they assume a
statistician’s role. Paper presented at the annual conference of the American
Mathematical Association of Two-Year Colleges, Boston, MA.
Sullivan, M. M. (2005). Fundamentals of statistics. Upper Saddle River, NJ: Pearson
Prentice Hall.
Sullivan, M. M. (2007). Statistics: Informed decisions using data. (2nd ed.) Upper Saddle
River, NJ: Pearson Prentice Hall.
Summers, G. W., & Peters, W. S. (1973). Basic statistics in business and economics.
Belmont, CA: Wadsworth.
Summers, G. W., Peters, W. S., & Armstrong, C. P. (1977). Basic statistics in business
and economics (2nd ed.). Belmont, CA: Wadsworth.
237
Sutarso, T. (1992a). Students’ attitudes toward statistics. Paper presented at the
Annual Meeting of the Mid-South Educational Research Association, Knoxville,
TN (ERIC Document Reproduction Service No. ED 353316).
Sutarso, T. (1992b). Some variables in relation to students’ anxiety in learning
statistics. Paper presented at the Annual Meeting of the Mid-South Educational
Research Association, Knoxville, TN. (ERIC Document Reproduction Service
No. ED353334).
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston,
MA: Allyn and Bacon.
Taplin, R. H. (2003, June). Teaching statistical consulting before methodology.
Australian New Zealand Journal of Statistics, 45(2), 141-152.
Terrell, G. R., & Scott, D. W. (1985, March). Over-smoothed nonparametric density
estimates. Journal of the American Statistical Association, 80(389), 209-214.
Thompson, B. (2002, Winter). “Statistical”, “practical”, and “clinical”: How many kinds
of significance do counselors need to consider? Journal of Counseling and
Development, 80(1), 64-71.
Thompson, W. B. (1994, February). Making data analysis realistic: Incorporating
research into statistics courses. Teaching of Psychology, 21(1), 41-43.
Thompson, B., & Vacha-Haase, T. (2000, April). Psychometrics is datametrics: the test is
not reliable. Educational and Psychological Measurement, 60(2), 174-195.
Thorndike, R. M., & Dinnel, D. L. (2001). Basic statistics for the behavioral sciences.
Upper Saddle River, NJ: Prentice-Hall.
238
Triola, M. F. (2001). Essential of statistics. Boston, MA: Addison Wesley
Longman.
Triola, M. F. (2002). Elementary statistics. (8th ed.). Boston, MA: Addison
Wesley Longman.
Vaidyanathan, R., & Vaidyanathan, G. (1987). College business statistics with Canadian
applications. Toronto, Ontario, Canada: Allyn and Bacon.
Velleman, P. F. (1976, Summer). Interactive computing for exploratory data analysis I:
Display algorithms. 1975 Proceedings of the Statistical Computing Section, 142-
147. Washington, DC: American Statistical Association.
Wand, M. P. (1997, March). Data-based choice of histogram bin-width. The American
Statistician, 51(1), 59-64.
Ware, M. E., & Chastain, J. D. (1989). Person variables contributing to success in
introductory statistics. (ERIC Document Reproduction Service No. ED309 927).
Waters, L. K., Martelli, T. A., Zakrajsek, T., & Popovich, P. M. (1988, Summer).
Attitudes toward statistics: An evaluation of multiple measures. Educational and
Psychological Measurement, 48(2), 513-516.
Webb, N. M. (1992). Testing a theoretical model of student interaction and learning in
small groups. In R. Hertz-Lazarowitz & N. Miller (Eds.), Interaction in
cooperative groups: The theoretical anatomy of group learning (pp. 102-119).
New York: Cambridge University.
Webster, A. (1992). Applied statistics for business and economics. Homewood, IL:
Richard D. Irwin.
239
Weiers, R. M. (2008). Introduction to business statistics (6th ed.). Mason, OH: Thomson
South-Western.
Weiss, Neil A. (2004). Elementary statistics (7th ed.). Boston, MA: Addison Wesley.
Wessel, R. H., & Willett, E. R. (1960). Statistics as applied to economics and business.
New York: Henry Holt and Company.
Wiersma, W. (2002). Research methods in education: An introduction (7th ed.).
Needham Heights, MA: Allyn & Bacon.
William, E., Ludlow, B., Walls, R. T. (1978, October). Learning, transfer, and retention
of errorless fading versus trial-and-error teaching. Psychological Reports,
43(2), 553-554.
Willis, T., & Seung, K. H. (2006). Using Computers in Educational and Psychological
Research: Using Information Technologies to Support the Research Process.
Springfield, IL: Charles C. Thomas.
Wise, S. L. (1985, Summer). The development and validation of a scale measuring
attitudes towards statistics. Educational and Psychological Measurement, 45(2),
401- 405.
Wisenbaker, J. M., Scott, J. S., and Nasser, F. (2000). Structural equation models relating
attitudes about and achievement in introductory statistics courses: A comparison
of results from the U.S. and Israel. Ninth International Congress on Mathematics
Education, Tokyo, Japan.
Witte, R. S., & Witte, J. S. (1997). Statistics (5th ed.). Forth Worth, TX: Harcourt Brace.
Wonnacott, T. H., & Wonnacott, R. J. (1984). Introductory statistics for business and
economics (3rd ed.). New York: John Wiley & Sons.
240
Wooten, C. T. (1998, May). Factors influencing student learning in introductory
accounting classes: A comparison of traditional and non-traditional students.
Issues Accounting Education, 13(2), 357-373.
Wyatt, W. W., & Bridges, C. M. (1967). Statistics for the behavioral sciences.
Englewood, IN: D.C. Heath and Company.
Zanakis, S. H., & Valenzi, E. R. (1997, September/October). Student anxiety and
attitudes in business statistics. Journal of Education for Business 73, 10-16.
Zeidner, M. (1991, November). Statistics and mathematics anxiety in social science
students: Some interesting parallels. British Journal of Educational Psychology,
61(3), 319-328.
Zelditch, M. (1959). A basic course in sociological statistics. New York: Henry Holt
and Company.
Zeller, R. A., & Carmines, E. G. (1978). Statistical analysis of social data. Chicago, IL:
Rand McNally.
Zhu, X., & Simon, H. A. (1987, Summer). Learning mathematics from examples and by
doing. Cognition and Instruction, 4(3), 137-166.
241
Appendix A
INFORMED CONSENT (Protocol # 05E080)
Dear Ohio University Student,
Purpose of the research
Greetings! Thank you for agreeing to participate in my research on the construction of grouped-data frequency tables using a new algorithm. As a component of my doctoral dissertation at Ohio University under the supervision of Dr. Gordon P. Brooks, I have designed a technique called the “iteration algorithm” because it proceeds by successive iterations and uses five formulas together with six rules. This structured approach is intended to reduce the biases observed with existing methods and, if applied correctly, should produce exactly one grouped-data frequency table for a given set of raw data: the algorithm works in a systematic, step-by-step way that leaves no room for guessing.
Procedures to be followed
Based on a set of raw data (called EXERCISE I in Appendix C), you will be shown how to construct a grouped-data frequency table using a traditional method (the kind taught in textbooks). At the end of the solution for EXERCISE I, a brief survey (Appendix F) follows to get your impressions of the method presented to you. After that, a second set of raw data (called EXERCISE II in Appendix G) is submitted to you. You are cordially invited to construct a grouped-data frequency table for this second data set by following the steps presented in the two methods. After completing the questionnaire (Appendix F), please return it directly to your instructor.
Duration of subject’s participation
The whole process will take you less than half an hour.
Confidentiality
Strict confidentiality will be maintained at all times in the study. Identification numbers will be used to protect the identity of the subjects.
Compensation
The free executive pens are small tokens of my appreciation for your taking your valuable time to complete this survey.
In advance, thank you very much for your participation in my research, and best of luck for the remainder of your academic year. Hippolyte O. LOHAKA
Appendix B
OHIO UNIVERSITY CONSENT FORM TEMPLATE
Title of Research: Making grouped-data frequency tables on your own with continuous data.
Principal Investigator: Hippolyte O. LOHAKA
Department: Educational Studies
Federal and university regulations require us to obtain signed consent for participation in research involving human subjects. After reading the statements below, please indicate your consent by signing this form.
Explanation of Study: Please refer to the above Informed Consent Note.
Risks and Discomforts: There are no known risks or discomforts for participating in this survey.
Benefits: The method proposed yields, overall, better results than those generated by SPSS. Also, it could very well be used for teaching purposes.
Contact Person: Hippolyte O. LOHAKA, 740 – 593 – 4526, [email protected]
If you have any questions regarding your rights as a research participant, please contact Jo Ellen Sherow, Director of Research Compliance, Ohio University, 740 – 593 – 0664.
I certify that I have read and understand this consent form and agree to participate as a subject in the research described. I agree that known risks to me have been explained to my satisfaction and I understand that no compensation is available from Ohio University and its employees for any injury resulting from my participation in this research. I certify that I am 18 years of age or older. My participation in this research is given voluntarily. I understand that I may discontinue participation at any time without penalty or loss of any benefits to which I may otherwise be entitled. I certify that I have been given a copy of this consent form to take with me.
Signature: Date:
Appendix C
EXERCISE I (with Integers)
The data shown in Table 31 represent the final grades (in percent) obtained by 65 students in a section of elementary statistics. The grades are recorded to the nearest whole number.
Table 31
Elementary statistics final grades of 65 students during the spring of 2002
44 46 49 49 50 51 54 54 55 55 56 57 59 59 60 60 61 62 62 64 64 65 65 66 66 67 68 69 69 70 70 71 71 71 72 72 73 73 74 74 75 75 76 76 77 77 78 79 79 80 80 81 82 84 84 85 85 86 87 89 90 90 91 94 97
Task: Construct a grouped-data frequency table.
SOLUTION
Appendix D
Construction of the Grouped-data Frequency Table by a Traditional Method
Rule 1: Computation of the inclusive range, R, of the scores.
R = (Xhighest – Xlowest), where X is the observed score, Xhighest is the maximum observed score, and Xlowest is the minimum observed score.
Here, R = 97 - 44 = 53.
Rule 2: Determination of a possible value for k, the number of class intervals. k ≈ 1 + 3.322* log (n), where log (n) denotes the decimal logarithm of the sample size, n.
⇒ k ≈ 1 + 3.322 * log (65) = 7.02. The answer could be 7 (if it is decided to round down), 8 (if rounded up), or even 10 (the next higher multiple of 5).
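Rules 1 and 2 are easy to automate. The following Python sketch (illustrative only; the function names are not from the dissertation) computes the inclusive range and Sturges' estimate of k:

```python
import math

def inclusive_range(scores):
    """Rule 1: inclusive range R = X_highest - X_lowest."""
    return max(scores) - min(scores)

def sturges_k(n):
    """Rule 2: Sturges' estimate k ~= 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

# For the 65 grades, with minimum 44 and maximum 97:
R = 97 - 44            # 53
k = sturges_k(65)      # approximately 7.02
```

As the text notes, the fractional estimate must still be rounded to a whole number of classes, and the rounding direction is left to the analyst.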
Rule 3: Determination of the class interval width, w, for each class interval.
The formula w = R/k is applied, which leads successively to three possibilities.
First option with k = 7: R/k = 53/7 = 7.57, which could be rounded to either w = 7 or
w = 8.
Second option with k = 8: R/k = 53/8 = 6.63, which reduces to either w = 6 or w = 7.
Third option with k = 10: R/k = 53/10 = 5.3, which could be rounded to either w = 5
or w = 6.
Hence, the class width w may take any of the four distinct values listed above (w = 5, w = 6, w = 7, and w = 8), and only one of them can be chosen. Which one should be selected? Choosing an odd number for the class interval width is recommended, so that the level of accuracy of the midpoints corresponds exactly to that of the observed data.
A problem remains because there are two odd candidates: w = 5 and w = 7. Some students might go for w = 5, while others will choose w = 7. Either possibility seems plausible, yet only one of the two can be retained. For the sake of illustration, both odd integers (i.e., 5 and 7) are examined.
The value w = 5 arises only from the third option (with k = 10); neither the first nor the second option produces it. The value w = 7, on the other hand, arises from both the first option (with k = 7) and the second option (with k = 8).
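The enumeration of candidate widths can be sketched in Python (an illustration; the variable names are assumptions):

```python
import math

R = 53
candidate_widths = {}
for k in (7, 8, 10):               # the three candidate class counts
    ratio = R / k
    # each ratio may be rounded down or up, giving two candidate widths
    candidate_widths[k] = (math.floor(ratio), math.ceil(ratio))

# candidate_widths -> {7: (7, 8), 8: (6, 7), 10: (5, 6)}
odd_widths = sorted({w for pair in candidate_widths.values()
                     for w in pair if w % 2 == 1})
# odd_widths -> [5, 7], the two plausible odd choices discussed above
```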
Rule 4: Determination of the lower real class limit (or lower class boundary) of the lowest class interval, which is the starting point. No specific formula is provided for doing this. As a result, an appropriate value must be guessed; one option is to select the minimum observed score. In the present case, it is 44.
Rule 5: Prepare a list of the limits of each class interval, beginning with the starting point.
In constructing the grouped frequency table, list the lower class boundary of the first class interval, which is the starting point obtained by applying Rule 4. Add the class interval width (w) to the starting point to obtain the first upper class boundary. Write down the upper class boundary. Then, list all the class boundaries by adding the class width to the boundaries of the immediately preceding class interval.
After that, determine the class midpoint of each interval by averaging the class boundaries: add the corresponding upper class boundary to its lower class boundary, and divide the resulting sum by two. A further task is to tally the frequencies for each class. Finally, sum the frequency column and check the total against the number of observations.
First Option: R = 53, k = 7, w = 7 and starting point is 44
First class: Lower class limit: 44 and
Upper class limit: 44 + w - 1 = 44 + 6 = 50
Second class: Lower class limit: 51 and
Upper class limit: 51 + w - 1 = 51 + 6 = 57
Third class: Lower class limit: 58 and
Upper class limit: 58 + w - 1 = 58 + 6 = 64
Fourth class: Lower limit: 65 and
Upper limit: 65 + w - 1 = 65 + 6 = 71
Fifth class: Lower limit: 72 and
Upper limit: 72 + w - 1 = 72 + 6 = 78
Sixth class: Lower limit: 79 and
Upper limit: 79 + w – 1 = 79 + 6 = 85
Seventh class: Lower limit: 86 and
Upper limit: 86 + w - 1= 86 + 6 = 92.
Verdict: This is a BAD selection, as the last (here the seventh) class interval of (86 to 92) fails to capture the maximum observed score, which is 97. Therefore, this first choice of class intervals is discarded.
Second Option: R = 53, k = 8, w = 7 and starting point is 44
First class: Lower class limit: 44 and
Upper class limit: 44 + w - 1 = 44 + 6 = 50
Second class: Lower class limit: 51 and
Upper class limit: 51 + w - 1 = 51 + 6 = 57
Third class: Lower class limit: 58 and
Upper class limit: 58 + w - 1 = 58 + 6 = 64
Fourth class: Lower limit: 65 and
Upper limit: 65 + w - 1 = 65 + 6 = 71
Fifth class: Lower limit: 72 and
Upper limit: 72 + w - 1 = 72 + 6 = 78
Sixth class: Lower limit: 79 and
Upper limit: 79 + w – 1 = 79 + 6 = 85
Seventh class: Lower limit: 86 and
Upper limit: 86 + w - 1= 86 + 6 = 92
Eighth class: Lower limit: 93 and
Upper limit: 93 + w -1 = 93 + 6 = 99.
Verdict: This is a GOOD choice, since the last (here the eighth) class interval of (93 to
99) contains the maximum observed score, which is 97.
The grouped-data frequency distribution obtained is presented in Table 32 below.
Table 32
Grouped-data frequency table for the final statistics grades of 65 students (Traditional Method)

Class ID   Class limits   Tally (stem | leaves)            Frequency   Class midpoint
i                                                          ni          xi
1          44, 50         4 | 4 6 9 9                       5          47
                          5 | 0
2          51, 57         5 | 1 4 4 5 5 6 7                 7          54
3          58, 64         5 | 9 9                           9          61
                          6 | 0 0 1 2 2 4 4
4          65, 71         6 | 5 5 6 6 7 8 9 9              13          68
                          7 | 0 0 1 1 1
5          72, 78         7 | 2 2 3 3 4 4 5 5 6 6 7 7 8    13          75
6          79, 85         7 | 9 9                          10          82
                          8 | 0 0 1 2 4 4 5 5
7          86, 92         8 | 6 7 9                         6          89
                          9 | 0 0 1
8          93, 99         9 | 4 7                           2          96
Total                                                  n = 65
Third Option: R = 53, k = 10, w = 5 and starting point is 44
First class: Lower limit: 44 and
Upper limit: 44 + w - 1 = 44 + 4 = 48
Second class: Lower limit: 49 and
Upper limit: 49 + w -1 = 49 + 4 = 53
Third class: Lower limit: 54 and
Upper limit: 54 + w – 1 = 54 + 4 = 58
Fourth class: Lower limit: 59 and
Upper limit: 59 + w - 1 = 59 + 4 = 63
Fifth class: Lower limit: 64 and
Upper limit: 64 + w - 1 = 64 + 4 = 68
Sixth class: Lower limit: 69 and
Upper limit: 69 + w - 1 = 69 + 4 = 73
Seventh class: Lower limit: 74 and
Upper limit: 74 + w - 1 = 74 + 4 = 78
Eighth class: Lower limit: 79 and
Upper limit: 79 + w - 1 = 79 + 4 = 83
Ninth class: Lower limit: 84 and
Upper limit: 84 + w - 1 = 84 + 4 = 88
Tenth class: Lower limit: 89 and Upper limit: 89 + w - 1 = 89 + 4 = 93
Verdict: This is a BAD choice, since the last (here the tenth) class interval of (89 to 93) does NOT contain the maximum observed score, which is 97. Consequently, this choice of class intervals is rejected.
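The three options worked out above can be verified mechanically. The following Python sketch (illustrative, not the dissertation's own code) builds the integer class limits for a given starting point, width, and number of classes, then applies the verdict test:

```python
def build_class_limits(start, width, k):
    """List the (lower, upper) integer class limits per Rule 5."""
    limits = []
    lower = start
    for _ in range(k):
        upper = lower + width - 1   # upper limit = lower + w - 1
        limits.append((lower, upper))
        lower = upper + 1
    return limits

def covers_maximum(limits, x_max):
    """Verdict test: the last interval must contain the maximum score."""
    return limits[-1][0] <= x_max <= limits[-1][1]

x_max = 97
first  = build_class_limits(44, 7, 7)    # last class (86, 92) -> BAD
second = build_class_limits(44, 7, 8)    # last class (93, 99) -> GOOD
third  = build_class_limits(44, 5, 10)   # last class (89, 93) -> BAD
```

Only the second option passes the test, in agreement with the verdicts above.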
Appendix E
Construction of the Grouped-data Frequency Table by the Iteration Algorithm
The rules (step-by-step instructions) for constructing the grouped-data frequency table by the iteration algorithm are shown below.
Rule 1: Computation of the inclusive range, Rt, of the grouped scores.
Rt = (Xhighest – Xlowest + dp ∗ t), where X is the observed score, dp represents the data precision (or the level of accuracy): 10^0, 10^-1, 10^-2, 10^-3, etc., for 0, 1, 2, 3, … digits after the decimal point, and t is the number of necessary iterations or tries to reach the final inclusive range of the grouped scores.
Here, Rt = (Xhighest − Xlowest + dp ∗ t)
= 97 − 44 + 10^0 ∗ t = 53 + 1 ∗ t
= 53 + t.
Rule 2: Setting the maximum value of k, the potential number of classes. kmax = 2*√(n) for n ≤ 100.
⇒ kmax ≈ 2 ∗ √(65) = 2 ∗ 8.0623 = 16.1245 ≈ 17 (rounded up). The frequency distribution may have up to 17 classes of grouped scores.
Rule 3: Determination of the minimum value of k, the number of possible class intervals. kmin = √(2 * n) = √(2 * 65) = √(130) = 11.402 ≈ 11 (rounded down).
⇒ kmin ≈ 11. The minimum number of possible class intervals is then 11.
(Note that the number of groups, k, is an integer that must lie between these two values, that is, kmin = 11 ≤ k ≤ kmax = 17).
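Rules 1 through 3 can be sketched in a few lines of Python for the example data (n = 65, scores 44 to 97, whole-number precision). The names below are mine; the rounding directions (up for kmax, down for kmin) follow the text:

```python
import math

# Illustrative sketch of Rules 1-3 for the example data (n = 65, scores 44-97).
n, x_min, x_max, dp = 65, 44, 97, 1   # dp = 10**0 for whole-number data

def inclusive_range(t):
    """Rule 1: R_t = X_highest - X_lowest + dp * t (here 53 + t)."""
    return x_max - x_min + dp * t

k_max = math.ceil(2 * math.sqrt(n))    # Rule 2: 2 * sqrt(n), rounded up
k_min = math.floor(math.sqrt(2 * n))   # Rule 3: sqrt(2n), rounded down
print(inclusive_range(1), k_min, k_max)   # 54 11 17
```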
Rule 4: Determination of the interval width that is equal for all classes.
Rt/k = w if and only if the ratio Rt over k yields an odd whole number. The problem will be solved in successive iterations.
First iteration (t = 1): R1/k = [97 – 44 + 1 * 1]/k = (53 + 1)/k, or 54/k, where R1 is the initial inclusive range of the grouped scores (the range at iteration 1) and k is any integer between 11 and 17, inclusive. The divisors of the inclusive range R1 = 54 are 1, 2, 3, 6, 9, 18, 27, and 54. Unfortunately, none of them belongs to the given range of values from 11 up to 17, limits included.
Second iteration (t = 2): R2/k = [97 – 44 + 1 * 2]/k = (53 + 2)/k, or 55/k. The number 55 is divisible by 5 and 11, but only 11 falls in the interval of values considered. Thus, w = R2/k = 55/11 = 5. In other words, there will be k = 11 class intervals with an equal class interval width, w, of 5; the final inclusive range R (R = R2) for the grouped-data frequency table will be 55; and the number of necessary tries or iterations, t, is 2 (an even value).
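The two iterations above are an instance of a simple search: grow the inclusive range by one precision unit per try until some k between kmin and kmax divides it into an odd whole-number width. A hedged sketch (function and variable names are mine, not the dissertation's):

```python
# Sketch of the Rule 4 search loop for whole-number data.
# Returns (t, k, w) for the first solution found, or None if none is
# found within max_tries (a safety cap I added for illustration).

def find_width(x_min, x_max, k_min, k_max, max_tries=100):
    for t in range(1, max_tries + 1):
        r = x_max - x_min + t              # inclusive range at iteration t
        for k in range(k_min, k_max + 1):
            if r % k == 0 and (r // k) % 2 == 1:   # odd whole-number width
                return t, k, r // k
    return None

print(find_width(44, 97, 11, 17))   # (2, 11, 5): t = 2, k = 11, w = 5
```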
Rule 5: Determination of the lower real class limit (or class boundary) of the lowest class interval boundary, which is the starting point. It is obtained by applying ONE of the two following formulas:
(Xlowest – t*0.5), if t is an odd number of tries or iterations; otherwise
(Xlowest – t*0.5 + 0.5), when t is an even number of tries or iterations; where Xlowest is the smallest score observed in the data set and t denotes the number of tries or iterations necessary to reach the smallest odd whole ratio, w, which represents the class interval length or class interval width.
Now substituting the lowest score and the value 2 (an even integer), the number t of iterations, into the second of the two formulas shown above yields:
Xlowest - t*0.5 + 0.5 = 44 - 2*0.5 + 0.5, or 43.5. Therefore, the lowest class boundary becomes 43.5.
Rule 6: Prepare a list of the limits of each class interval, beginning with the starting point.
In constructing the grouped frequency table, list the lower class boundary of the first class interval, which is the starting point obtained by applying Rule 5. Add the class interval width (w) to the starting point to obtain the first upper class boundary. Write down the upper class boundary. Then, list all the class boundaries by adding the class width to the boundaries of the immediately preceding class interval.
After that, determine the class marks of each interval by averaging the class boundaries. A further task is to tally the frequencies for each class. Finally, sum the frequency column and check it against the total number of observations.
First class interval: Lower class boundary = 43.5.
Upper class boundary = 43.5 + 1 ∗ w = 43.5 + 5 = 48.5.
Second class interval: Lower class boundary = 48.5.
Upper class boundary = 48.5 + w = 48.5 + 5 = 53.5.
Third class interval: Lower class boundary = 53.5.
Upper class boundary = 53.5 + w = 53.5 + 5 = 58.5.
And last (eleventh) class interval: Lower class boundary =
43.5 + 10 ∗ w = 43.5 + 10 ∗ 5 = 43.5 + 50, or 93.5.
Upper class boundary =
43.5 + 11 ∗ w = 43.5 + 11 ∗ 5 = 43.5 + 55, or 98.5.
Check: Final Range, R = Last upper class boundary - First lower class boundary
= 98.5 – 43.5 = 55 (same result as before).
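The boundary listing of Rule 6 for this example (starting point 43.5, k = 11, w = 5) can be regenerated and re-checked with a short sketch; the variable names are illustrative:

```python
# Regenerate the 11 class boundaries from the starting point and width,
# then re-check the final range, as in the "Check" step above.

w, k, start = 5, 11, 43.5
boundaries = [(start + i * w, start + (i + 1) * w) for i in range(k)]

print(boundaries[0])                           # (43.5, 48.5)
print(boundaries[-1])                          # (93.5, 98.5)
print(boundaries[-1][1] - boundaries[0][0])    # final range R = 55.0
```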
The Iteration algorithm grouped-data frequency distribution is presented in Table 33 below.
Table 33
Grouped-data Frequency Table for the Weight of 65 Male Students (The Iteration Algorithm)
Class ID   Class boundary   Tally                                    Frequency   Cumulative frequency   Class midpoint
   i                                                                    ni                                    xi
   1       43.5, 48.5       4 | 4 6                                      2               2                    46
   2       48.5, 53.5       4 | 9 9   5 | 0 1                            4               6                    51
   3       53.5, 58.5       5 | 4 4 5 5 6 7                              6              12                    56
   4       58.5, 63.5       5 | 9 9   6 | 0 0 1 2 2                      7              19                    61
   5       63.5, 68.5       6 | 4 4 5 5 6 6 7 8                          8              27                    66
   6       68.5, 73.5       6 | 9 9   7 | 0 0 1 1 1 2 2 3 3             11              38                    71
   7       73.5, 78.5       7 | 4 4 5 5 6 6 7 7                          8              46                    76
   8       78.5, 83.5       7 | 9 9 9   8 | 0 0 1 2                      7              53                    81
   9       83.5, 88.5       8 | 4 4 5 5 6 7                              6              59                    86
  10       88.5, 93.5       8 | 9   9 | 0 0 1                            4              63                    91
  11       93.5, 98.5       9 | 4 7                                      2              65                    96
Total                                                                n = 65
Appendix F
SURVEY OF ATTITUDES TOWARD THE METHOD USED (SAMU)
Dear Ohio University Students,
Thank you again for completing these statements about your attitude toward the traditional method. Your responses will be kept confidential. The feedback you provide will help us improve our new instrument.
SAMU QUESTIONNAIRE
DIRECTIONS: The questions below are designed to identify your attitudes about the method presented for creating frequency tables for grouped data. The item scale has 7 possible responses, ranging from 1 (STRONGLY DISAGREE) through 4 (NEITHER DISAGREE NOR AGREE) to 7 (STRONGLY AGREE). Please read each question and carefully circle the one response appearing below it that most clearly represents your agreement with that statement. Use the entire 7-point scale to indicate your degree of agreement or disagreement with our items. Try not to think too deeply about each response, and move quickly to the next item.

Each of items 1 through 20 is answered on the same 7-point scale:
1 = Strongly disagree, 2 = Moderately disagree, 3 = Slightly disagree, 4 = Neither disagree nor agree, 5 = Slightly agree, 6 = Moderately agree, 7 = Strongly agree.

1. I like constructing grouped-data frequency tables by the method presented. (P)
2. I feel insecure when creating grouped-data frequency tables by the method presented. (N)
3. I have trouble understanding the method presented because of how I think. (N)
4. The algorithm’s formulas are easy to use. (P)
5. The procedure as shown is worthless. (N)
6. The algorithm of the instructional method taught is complicated. (N)
7. The method exhibited should be a required part of my statistics course. (P)
8. I have no idea of what’s going on in the algorithm of the method presented. (N)
9. Even with the advent of computers, learning the algorithm as shown is still useful. (P)
10. I am under stress using the method presented in class. (N)
11. I enjoy creating frequency tables for grouped data by the algorithm shown. (P)
12. I can learn the algorithm presented. (P)
13. Learning the algorithm as shown requires a great deal of discipline. (N)
14. I make a lot of computational errors when applying the algorithm taught in class. (N)
15. I am scared by the algorithm brought to light in class. (N)
16. The algorithm exposed can be quickly learned by most people. (P)
17. The algorithm on hand will have no application in my job. (N)
18. I understand how to apply the formulas contained in the algorithm proposed. (P)
19. The algorithm offered will be relevant in my life. (P)
20. The algorithm of the method exhibited is highly technical. (N)
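The (P)/(N) tags mark positively and negatively worded items. One common scoring convention, assumed here rather than stated by the survey itself, is to reverse-score the negative items so that higher totals always indicate a more positive attitude:

```python
# Hedged sketch: reverse-scoring negatively worded SAMU items on the
# 7-point scale (8 - response). This convention is an assumption for
# illustration, not a rule taken from the dissertation.

NEGATIVE_ITEMS = {2, 3, 5, 6, 8, 10, 13, 14, 15, 17, 20}   # the (N) items above

def score_item(item_no, response):
    """Reverse-score a 1-7 response for negatively worded items."""
    return 8 - response if item_no in NEGATIVE_ITEMS else response

print(score_item(1, 6), score_item(2, 6))   # 6 2
```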
B. BACKGROUND INFORMATION

NOTICE that the labels for the scale on each of the following items differ from those used above.

21. How good at mathematics are you?
1 = Poor, 2 = Well below average, 3 = Worse than average, 4 = Average, 5 = Better than average, 6 = Well above average, 7 = Excellent

22. What is your level of computer literacy?
1 = Not at all literate, 2 = Inadequately literate, 3 = Intolerably literate, 4 = Neutral, 5 = Tolerably literate, 6 = Adequately literate, 7 = Completely literate

23. How much experience with statistics have you had (e.g., courses, research studies)?
1 = None, 2 = Very negligible, 3 = Negligible, 4 = Fair deal, 5 = Good deal, 6 = Very good deal, 7 = Great deal

24. How many years of high school mathematics did you take?
1 = One year, 2 = Two years, 3 = Three years, 4 = Four years, 5 = Five years, 6 = Six years, 7 = Seven years

25. How many college mathematics and/or statistics courses have you taken so far?
1 = One course, 2 = Two courses, 3 = Three courses, 4 = Four courses, 5 = Five courses, 6 = Six courses, 7 = Seven courses

26. What grade do you expect to receive in this course?
1 = F, 2 = D, 3 = C–, 4 = C, 5 = C+, 6 = B–, 7 = B, 8 = B+, 9 = A–, 10 = A
Appendix G
EXERCISE II (with Decimals)
Data in Table 34 are measurements of the thickness in millimeters of mass-produced metal blocks (Ishikawa, 1986, p. 7). To save space, the sample of 100 observations is listed as only 26 distinct values. These values have been ordered, with their frequencies of occurrence shown in parentheses.
Table 34
Thickness in Millimeters of 100 Mass-produced Metal Blocks with Frequencies of
Occurrences in Parentheses
3.30 (1) 3.31 (1) 3.32 (1) 3.34 (2) 3.37 (1)
3.38 (2) 3.40 (2) 3.41 (4) 3.42 (1) 3.43 (2)
3.44 (5) 3.45 (5) 3.46 (15) 3.47 (6) 3.48 (13)
3.49 (4) 3.50 (9) 3.51 (1) 3.52 (10) 3.54 (4)
3.55 (1) 3.56 (5) 3.59 (2) 3.60 (1) 3.63 (1)
3.68 (1)
Construct a grouped-data frequency table in your spare time. Again, thank you very much.
Appendix H
Construction of the Grouped-data Frequency Table by the Iteration Algorithm
Rule 1: Computation of an inclusive range, Rt, of the scores.
Rt = [(Xhighest – Xlowest) + u ∗ t], where X is the observed score, u is the data precision unit (here 10^-2 = 0.01, since the scores carry two decimal places), and t is the number of iterations or tries necessary to obtain the final inclusive range, R.
Here, Rt = 3.68 − 3.30 + 10^-2 ∗ t = 0.38 + 0.01t.
Rule 2: Determination of the minimum value of k, the number of possible classes. kmin = √(2 ∗ 100) = √200 = 14.142 ≈ 14 (rounded down).
⇒ kmin ≈ 14. The minimum number of possible class intervals is then 14.
Rule 3: Setting the maximum value of k.
kmax = 2*√(n) for n ≤ 100.
⇒ kmax ≈ 2 ∗ √(100) = 2 ∗ 10 = 20. Hence, the maximum number of possible class intervals is 20.
(Note that kmin = 14 ≤ k ≤ kmax = 20).
Rule 4: Determination of the interval width for each class.
Rt/k = w if and only if the ratio Rt over k yields an odd whole number. The problem will be solved in successive iterations.
First iteration (t = 1): R1/k = [(3.68 – 3.30) + 0.01]/k = (0.38 + 0.01)/k, or 0.39/k, where R1 is the initial inclusive range (the inclusive range at iteration 1) and k is any integer between 14 and 20, inclusive. Since no k in the range of 14 to 20 divides 0.39 into an acceptable odd width, a second iteration is needed by increasing the non-inclusive range (that is, the largest score minus the smallest score) by 0.01, and then the arithmetic is redone.
Second iteration (t = 2): R2/k = [(3.68 – 3.30) + 0.02]/k = (0.38 + 0.02)/k, or 0.40/k, where R2 is the second inclusive range (the inclusive range at iteration 2) and k is any integer between 14 and 20, inclusive. No k in this range divides 0.40 into an acceptable width (in fact, 0.40 is divisible by 16, but the resulting ratio of 0.025 has three decimals instead of just two), so a third iteration is required by increasing the non-inclusive range (that is, the maximum score minus the minimum score) by 0.01 again, and then the arithmetic is redone.
Third iteration (t = 3): R3/k = [(3.68 – 3.30) + 0.03]/k = (0.38 + 0.03)/k, or 0.41/k, where R3 is the third inclusive range (the inclusive range at iteration 3) and k is any integer between 14 and 20, inclusive. Since 41 is a prime number, 0.41/k cannot yield a suitable width for any k in this range, so a fourth iteration is necessitated by increasing the non-inclusive range (that is, the largest score minus the smallest score) by 0.01 again, and the arithmetic is done one more time.
Fourth iteration (t = 4): R4/k = (0.38 + 0.04)/k = 0.42/k. The new inclusive range (at iteration 4), R4, of 0.42 is divisible by 14, which lies between 14 and 20 inclusively, and w = R4/k = 0.42/14 = 0.03. In other words, there will be k = 14 class intervals with an equal class interval width, w, of 0.03 (an odd value in the last decimal place); the final inclusive range R (R = R4) for the grouped-data frequency table will be 0.42; and the number of necessary tries or iterations, t, is 4 (an even value).
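The four iterations above can be reproduced with a decimal-aware variant of the earlier search, sketched here in integer "precision units" (hundredths) so that floating-point error cannot spoil the divisibility tests; all names are mine:

```python
# Decimal variant of the Rule 4 search, working in integer units of 0.01.
# Returns (t, k, width-in-units); multiply the width by the unit to
# recover w (e.g., 3 units -> w = 0.03). Safety cap added for illustration.

def find_width_units(x_min, x_max, k_min, k_max, unit=0.01, max_tries=100):
    lo, hi = round(x_min / unit), round(x_max / unit)   # scores in units
    for t in range(1, max_tries + 1):
        r = hi - lo + t                    # inclusive range, in units
        for k in range(k_min, k_max + 1):
            if r % k == 0 and (r // k) % 2 == 1:   # odd whole-unit width
                return t, k, r // k
    return None

print(find_width_units(3.30, 3.68, 14, 20))   # (4, 14, 3), i.e. w = 0.03
```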
Rule 5: Determination of the lower real class limit (or class boundary) of the lowest class interval, which is the starting point. It is obtained by applying the following formula:
(Xlowest - t * 0.005 - 0.005), where t denotes an even number of tries or iterations necessary to reach the smallest odd ratio, w, which represents the class interval length.
Substituting the lowest score and the value 4, the number t of iterations, into the above formula yields:
Xlowest - t * 0.005 - 0.005 = 3.30 - 4 * 0.005 - 0.005, or 3.275. Therefore, the lowest class boundary becomes 3.275.
Rule 6: Prepare a list of the limits of each class interval, beginning with the starting point.
In constructing the grouped frequency table, list the lower class boundary of the first class interval, the lowest class boundary obtained by applying Rule 5. Add the class width (w) to this starting point to obtain the first upper class boundary. Write down the upper class boundary. Then, list all the class boundaries by adding the class width to the boundaries of the previous class interval.
After that, determine the class marks of each interval by averaging the class boundaries. Further task is to tally the frequencies for each class. Finally, sum the frequency column and check against the total number of observations.
First class interval: Lower class boundary = 3.275.
Upper class boundary = 3.275 + 1 ∗ w = 3.275 + 0.03 = 3.305.
Second class interval: Lower class boundary = 3.305.
Upper class boundary = 3.305 + w = 3.305 + 0.03 = 3.335.
Third class interval: Lower class boundary = 3.335.
Upper class boundary = 3.335 + w = 3.335 + 0.03 = 3.365.
And last (fourteenth) class interval: Lower class boundary =
3.275 + 13 ∗ w = 3.275 + 13 ∗ 0.03 = 3.275 + 0.39, or 3.665.
Upper class boundary =
3.275 + 14 ∗ w = 3.275 + 14 ∗ 0.03 = 3.275 + 0.42, or 3.695.
Check: Final Range, R = Last upper class boundary - First lower class boundary = 3.695 – 3.275 = 0.42 (same result as before).
Table 35 presents the grouped-data frequency distribution obtained.
Table 35
Grouped-data frequency and cumulative frequency table for the thickness in millimeters of 100 mass-produced metal blocks (using the Iteration algorithm)
 i    True class limits     ni    ↑ni     xi     ni·xi      ni·xi²
 1    3.275 – 3.305          1      1   3.29      3.29     10.8241
 2    3.305 – 3.335          2      3   3.32      6.64     22.0448
 3    3.335 – 3.365          2      5   3.35      6.70     22.4450
 4    3.365 – 3.395          3      8   3.38     10.14     34.2732
 5    3.395 – 3.425          7     15   3.41     23.87     81.3967
 6    3.425 – 3.455         12     27   3.44     41.28    142.0032
 7    3.455 – 3.485         34     61   3.47    117.98    409.3906
 8    3.485 – 3.515         14     75   3.50     49.00    171.5000
 9    3.515 – 3.545         14     89   3.53     49.42    174.4526
10    3.545 – 3.575          6     95   3.56     21.36     76.0416
11    3.575 – 3.605          3     98   3.59     10.77     38.6643
12    3.605 – 3.635          1     99   3.62      3.62     13.1044
13    3.635 – 3.665          0     99   3.65      0.00      0.0000
14    3.665 – 3.695          1    100   3.68      3.68     13.5424
Total                    n = 100               347.75    1209.683
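As a cross-check, the summary statistics computed in the steps that follow can be re-derived directly from the Table 35 class midpoints and frequencies; a minimal sketch (variable names are mine):

```python
import math

# Re-computing the grouped-data mean, variance, and standard deviation
# from the Table 35 midpoints and frequencies.
mids  = [3.29, 3.32, 3.35, 3.38, 3.41, 3.44, 3.47,
         3.50, 3.53, 3.56, 3.59, 3.62, 3.65, 3.68]
freqs = [1, 2, 2, 3, 7, 12, 34, 14, 14, 6, 3, 1, 0, 1]

n = sum(freqs)                                   # 100
mean = sum(f * x for f, x in zip(freqs, mids)) / n
var  = (sum(f * x * x for f, x in zip(freqs, mids)) - n * mean * mean) / (n - 1)
sd   = math.sqrt(var)
print(round(mean, 4), round(var, 3), round(sd, 3))   # 3.4775 0.004 0.062
```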
a) Computation of the sample mean:
Mean(X) = (1/n)*∑ni·xi = (1/100)*(347.75) = 3.4775.
Hence, the sample mean, Mean(X), is 3.478.
b) Computation of the sample variance:
s²X = [∑ni·xi² – n*Mean(X)²]/(n – 1)
= [1209.683 – 100 * 3.4775 * 3.4775]/99 = 0.004.
Therefore, s²X = 0.004.
c) Computation of the sample standard deviation:
sX = √0.004. Hence, the sample standard deviation, sX, is 0.062.
d) Computation of the median
Me = LMedian + w*[(n/2) – cumfreqMedian]/freqMedian.
= 3.455 + 0.03 * (50 – 27)/34 = 3.475.
Hence, the median, Me, is 3.475.
(New symbols used: LMedian is the lower boundary corresponding to the median
class, n is the total number of scores or subjects, cumfreqMedian denotes the
cumulative frequency corresponding to the lower limit, and freqMedian represents the
number of cases in the class interval containing the median).
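The median formula just applied can be wrapped as a small helper; the function name and arguments are mine, not the dissertation's notation:

```python
# Sketch of the grouped-data median formula used above:
# Me = L + w * ((n/2) - cum_below) / f_median.

def grouped_median(lower, width, n, cum_below, f_median):
    return lower + width * (n / 2 - cum_below) / f_median

# Median class 3.455-3.485: cumulative frequency below = 27, class frequency = 34.
me = grouped_median(3.455, 0.03, 100, 27, 34)
print(round(me, 3))   # 3.475
```

The same helper, with the percentile rank in place of n/2, reproduces the Q1 and Q3 values quoted below.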
[Similarly, the 25th percentile, or Q1, is obtained as 3.425 + 0.03 * (25 - 15)/12, or 3.450. The 75th percentile, or Q3, is calculated as 3.485 + 0.03 * (75 - 61)/14, or 3.515].
e) Computation of the mode
Mo = LMode + w*[Δ1/(Δ1 + Δ2)] = 3.455 + 0.03 * (34 - 12)/[(34 - 12) + (34 - 14)] = 3.470. Hence, the mode, Mo, is 3.47.
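The mode computation follows the same pattern; a sketch with illustrative names, where Δ1 and Δ2 are the frequency differences between the modal class and its neighbors:

```python
# Sketch of the grouped-data mode: Mo = L + w * d1 / (d1 + d2),
# with d1 = f_modal - f_previous and d2 = f_modal - f_following.

def grouped_mode(lower, width, f_modal, f_prev, f_next):
    d1, d2 = f_modal - f_prev, f_modal - f_next
    return lower + width * d1 / (d1 + d2)

# Modal class 3.455-3.485 (f = 34), neighboring frequencies 12 and 14.
mo = grouped_mode(3.455, 0.03, 34, 12, 14)
print(round(mo, 2))   # 3.47
```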
Appendix I
Construction of the Grouped-data Frequency Table by the SPSS Procedure
[Histogram omitted: SPSS histogram of SAMPLEQ3, with bars at class midpoints 3.30 through 3.70; Std. Dev = .06, Mean = 3.48, N = 100.00.]
Figure 5. SAMPLEQ3.
Table 36
Grouped-data Frequency and Cumulative Frequency Table for the Thickness of Metal
Blocks Random Sample SAMPLEQ3 (Using SPSS Algorithm)
 i    True class limits     ni    ↑ni     xi     ni·xi      ni·xi²
 1    3.275 – 3.325          3      3   3.30      9.90     32.6700
 2    3.325 – 3.375          3      6   3.35     10.05     33.6675
 3    3.375 – 3.425          9     15   3.40     30.60    104.0400
 4    3.425 – 3.475         33     48   3.45    113.85    392.7825
 5    3.475 – 3.525         37     85   3.50    129.50    453.2500
 6    3.525 – 3.575         10     95   3.55     35.50    126.0250
 7    3.575 – 3.625          3     98   3.60     10.80     38.8800
 8    3.625 – 3.675          1     99   3.65      3.65     13.3225
 9    3.675 – 3.725          1    100   3.70      3.70     13.6900
Total                    n = 100               347.55    1208.328
a) Computation of the sample mean:
Mean(X) = (1/n)*∑ni·xi = (1/100)*(347.55) = 3.4755.
Hence, the sample mean, Mean(X), is 3.476.
b) Computation of the sample variance:
s²X = [∑ni·xi² – n*Mean(X)²]/(n – 1)
= [1208.328 – 100*3.4755*3.4755]/99 = 0.00422.
Therefore, s²X = 0.00422.
c) Computation of the sample standard deviation:
sX = √0.00422. Hence, the sample standard deviation, sX, is 0.065.
d) Computation of the median
Me = LMedian + w*[(n/2) – cumfreqMedian]/freqMedian.
= 3.475 + 0.05 * (50 – 48)/37 = 3.4777.
Hence, the median, Me, is 3.478.
(New symbols used: LMedian is the lower boundary corresponding to the median class, n is the total number of scores or subjects, cumfreqMedian denotes the cumulative frequency corresponding to the lower limit, and freqMedian represents the number of cases in the class interval containing the median).
[Similarly, the 25th percentile, or Q1, is obtained as 3.425 + 0.05 * (25 – 15)/33, or 3.440. The 75th percentile, or Q3, is calculated as 3.475 + 0.05 * (75 – 48)/37, or 3.511].
e) Computation of the mode
Mo = LMode + w*[Δ1/(Δ1 + Δ2)]
= 3.475 + 0.05 * (37 – 33)/[(37 – 33) + (37 – 10)] = 3.48145.
Hence, the mode, Mo, is 3.481.
(New symbols used: LMode is the lower boundary of the class interval containing the highest frequency or the modal class, Δ1 is the difference between the frequency of the modal class and the frequency of the previous class, whereas Δ2 is the difference between the frequency of the modal class and the frequency of the following class).
Table 37
Comparison of the Statistical Measures from the Raw Data to the Grouped-data
Measures (Iteration Algorithm) and the Grouped-data Measures (SPSS Algorithm)
(SAMPLEQ3)
Statistical Measure                 Raw Data   Grouped-data (Iteration)   Grouped-data (SPSS)
Mean                                  3.476          3.478                     3.476*
Median                                3.480          3.475                     3.478*
Mode                                  3.460          3.470                     3.481
Standard deviation                    0.063          0.062                     0.065
Range                                 0.38           0.42                      0.45
Percentile 25 (Q1)                    3.450          3.450                     3.440
Percentile 75 (Q3)                    3.518          3.515                     3.511
Semi-interquartile range              0.034          0.033                     0.0355
Skewness                             -0.109          0.145                    -0.092*
Coefficient of variation (C.V.)       0.0181         0.0178                    0.0187
Note: Except for the sample mean, the median, and the sample skewness, each shown with an asterisk (∗) in Table 37, all of the Iteration algorithm results are much closer to the raw data’s results than those obtained with the SPSS procedure.
Appendix J
A Copy of the E-Mail Message sent to the SPSS Corporate Headquarters
From: Hippolyte Lohaka [mailto:[email protected]]
Sent: Sunday, September 29, 2002 8:57 PM
To: SPSS Corporate Headquarters
Subject: HUMBLE REQUEST FOR ASSISTANCE

Corporate Headquarters
SPSS Inc.
233 S. Wacker Drive, 11th Floor
Chicago, IL 60606-6307
Tel: +1.312.651.3000
Fax: +1.312.651.3668
Dear Sirs,
Permit me to humbly introduce myself to you. My name is Hippolyte OSANGO LOHAKA, and I am currently a fourth-year Ph.D. student in Educational Statistics at Ohio University in Athens.
Please be informed that as part of my doctoral dissertation work, I have designed a method of constructing grouped-data frequency tables by hand, which leaves NO room for guessing. Simple pairwise comparisons (using paired-samples t tests) have shown so far that the results yielded by my method are much closer to those obtained directly from raw data than those generated by the MINITAB software, for instance. As you perfectly know, grouping data always entails some loss of information.
What I wish to establish now is that my method leads to a MINIMUM loss of information, that is, my method "RETAINS" MORE information, overall, than most leading statistical packages such as SPSS, SAS, and MINITAB.
The purpose of this electronic message is therefore to most humbly request you to kindly provide me with your LATEST "ALGORITHMS" for constructing grouped-data frequency distributions or, alternatively, HISTOGRAMS so that I can include them in my dissertation and better defend my points.
I greatly appreciate how busy you must be at this particular time of the year, but I pray that you do not let me down. You are certainly the best people to give me a helpful hand with this crucial problem.
Thank you so much in anticipation of your inestimable assistance in these two crucial matters.
Very respectfully yours,
HIPPOLYTE O. LOHAKA Ohio University Ph.D. Student Educational Statistics Area Program
P.S. To give you a better idea of my work, a copy of the survey I administered to Ohio University students may be sent to you upon request. Thank you so much.
H. O. L.
Appendix K
List of Abbreviations
ANCOVA: Analysis of Covariance
ECG: Expected Course Grade
IQR: Interquartile Range
MANCOVA: Multivariate Analysis of Covariance
MATH: Mathematics
PMA: Perceived Mathematics Ability
PSY: Psychology
SAMU: Survey of Attitudes towards the Method Used
SATS: Survey of Attitudes towards Statistics
SPSS: Statistical Package for the Social Sciences
SRD: Square Root of Deviations
STD: Sample Standard Deviation
Appendix L
Sample Set of Scatter Diagrams of Data for Pairs of Dependent Variables
Scatter Plot Difficulty-Value (Method: 1.00, Department: 1.00)
[Scatter diagram omitted; the Difficulty axis runs from 20.00 to 34.00.]
Figure 6. Scatter diagram for Value versus Difficulty (Iteration – Mathematics).
Scatter Plot Cognitive-Difficulty (Method: 1.00, Department: 1.00)
[Scatter diagram omitted; the Cognitive axis runs from 27.50 to 32.50.]
Figure 7. Scatter diagram for Difficulty versus Cognitive (Iteration – Mathematics).
Scatter Plot Difficulty-Affect (Method: 1.00, Department: 2.00)
[Scatter diagram omitted; the Difficulty axis runs from 15.00 to 35.00.]
Figure 8. Scatter diagram for Affect versus Difficulty (Iteration – Psychology).
Scatter Plot Cognitive-Difficulty (Method: 1.00, Department: 2.00)
[Scatter diagram omitted; the Cognitive axis runs from 25.00 to 32.50.]
Figure 9. Scatter diagram for Difficulty versus Cognitive (Iteration – Psychology).
Scatter Plot Affect-Cognitive (Method: 2.00, Department: 1.00)
[Scatter diagram omitted; the Affect axis runs from 24.00 to 30.00.]
Figure 10. Scatter diagram for Cognitive versus Affect (Traditional – Mathematics).
Scatter Plot Cognitive-Value (Method: 2.00, Department: 1.00)
[Scatter diagram omitted; the Cognitive axis runs from 25.00 to 32.50.]
Figure 11. Scatter diagram for Value versus Cognitive (Traditional – Mathematics).
Scatter Plot Difficulty-Value (Method: 2.00, Department: 2.00)
[Scatter diagram omitted; the Difficulty axis runs from 18.00 to 30.00.]
Figure 12. Scatter diagram for Difficulty versus Value (Traditional – Psychology).
Scatter Plot Affect-Value (Method: 2.00, Department: 2.00)
[Scatter diagram omitted; the Affect axis runs from 18.00 to 32.00.]
Figure 13. Scatter diagram for Value versus Affect (Traditional – Psychology).
Appendix M
Additional Outputs of the Exploratory Factor Analysis for the SAMU Data
Table 38
Varimax Rotated Factors. Principal Component Extraction (Iteration Algorithm)
Rotated Component Matrix (a, b)

            Component
Item          1       2       3       4
Item01      .979    .062    .033    .121
Item02      .418    .104   -.124    .771
Item03     -.029    .120    .306    .538
Item04      .073    .979    .076    .116
Item05      .263    .013   -.205    .790
Item06      .979    .062    .033    .121
Item07      .089    .065   -.055    .129
Item08      .083   -.041   -.453    .015
Item09      .017    .086    .296   -.060
Item10      .230    .003    .889    .208
Item11     -.017   -.150    .072    .445
Item12     -.055   -.254    .362   -.026
Item13      .073    .979    .076    .116
Item14      .234    .044    .879    .245
Item15      .979    .062    .033    .121
Item16      .073    .979    .076    .116
Item17     -.001   -.230    .100    .133
Item18      .076    .064   -.196   -.002
Item19      .134    .010    .189   -.149
Item20     -.012    .074    .064    .538

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 5 iterations.
b. Method = 1.00
Table 39
Varimax Rotated Factors. Principal Component Extraction (Traditional Method)
Rotated Component Matrix (a, b)

            Component
Item          1       2       3       4
Item01     -.064   -.040    .912   -.094
Item02      .980   -.034    .068   -.053
Item03      .265   -.029    .509    .240
Item04      .023    .984    .010    .009
Item05     -.128   -.021    .126    .351
Item06     -.133    .024    .891    .026
Item07     -.027   -.123   -.052    .339
Item08      .278    .042   -.035   -.006
Item09      .118    .065   -.083    .398
Item10      .980   -.034    .068   -.053
Item11     -.044   -.145   -.108    .381
Item12      .195    .243    .330   -.061
Item13     -.042    .962   -.023   -.022
Item14      .980   -.034    .068   -.053
Item15     -.006   -.043    .748   -.287
Item16      .023    .984    .010    .009
Item17      .161   -.001   -.048    .632
Item18      .007    .180    .014    .536
Item19      .123    .087   -.047   -.199
Item20     -.181    .183   -.052    .515

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 5 iterations.
b. Method = 2.00