
The development of modern statistics
Author(s): Dale E. Varberg
Source: The Mathematics Teacher, Vol. 56, No. 4 (April 1963), pp. 252-257
Published by: National Council of Teachers of Mathematics
Stable URL: http://www.jstor.org/stable/27956805


HISTORICALLY SPEAKING

Edited by Howard Eves, University of Maine, Orono, Maine

The development of modern statistics*

by Dale E. Varberg, Hamline University, St. Paul, Minnesota

* This is the first of two lectures on statistics given by Professor Varberg at a National Science Foundation Summer Institute for High School Mathematics Teachers held at Bowdoin College during the summer of 1962. Notes were taken by Alvin K. Funderburg.

** Numerals in brackets refer to the notes at the end of this article.

That area of study which we now call statistics has only recently come of age. While its origins may be traced back to the eighteenth century, or perhaps earlier, the first really significant developments in the theory of statistics did not occur until the late nineteenth and early twentieth centuries, and it is only during the last thirty years or so that it has reached a full measure of respectability. It was antedated by the theory of probability and has its roots embedded in this subject. In fact, any serious study of statistics must of necessity be preceded by a study of probability theory, for it is in the latter subject that the theory of statistics finds its foundation and fountainhead.

The word Statistik was first used by Gottfried Achenwall (1719-72), a lecturer at the University of Göttingen [1].** He is sometimes referred to as the "Father of Statistics," perhaps mistakenly, since he was mainly concerned with the description of interesting facts about his country.

Our English word "statistics" means different things to different people. To the man on the street, statistics is the mass of figures that the expert on any subject uses to support his contentions; it's "what you use to prove anything by." To the more sophisticated person, the word may evoke some notion of the procedures which are used to condense and interpret a collection of data, such as the computing of means and standard deviations. But to the practitioner of the craft, statistics is the art of making inferences from a body of data, or, more generally, the science of making decisions in the face of uncertainty.

Statisticians concern themselves with answering such questions as: Is this particular lot of manufactured items defective? Is there a connection between smoking and cancer? Will Kennedy win the next election? In answering these questions, it is necessary to reason from the specific to the general, from the sample to the population. Therefore, any conclusions reached by the statistician are not to be accepted as absolute certainties. It is, in fact, one of the jobs of the statistician to give some measure of the certainty of the conclusions he has drawn.

It should not be inferred from this lack of certainty that the mathematics of statistics is nonrigorous. The mathematics that forms the basis of statistics stems from probability theory and has a firm axiomatic foundation and rigorously proved theorems.

If we conceive of statistics as the science of drawing inferences and of making decisions, it is appropriate to date its beginnings with the work of Sir Francis Galton (1822-1911) and Karl Pearson (1857-1936) in the late nineteenth century. Starting here, modern statistics has developed in four great waves of ideas, in four periods, each of which was introduced by a pioneering work of a great statistician [2].

The first period was inaugurated by the publication of Galton's Natural Inheritance in 1889. If for no other reason, this book is justly famous because it sparked the interest of Karl Pearson in statistics. Until this time, Pearson had been an obscure mathematician teaching at University College in London. Now the idea that all knowledge is based on statistical foundations captivated his mind. Moving to Gresham College in 1890 with the chance to lecture on any subject that he wished, Pearson chose the topic: "the scope and concepts of modern science." In his lectures he placed increasingly stronger emphasis on the statistical foundation of scientific laws and soon was devoting most of his energy to promoting the study of statistical theory. Before long, his laboratory became a center in which men from all over the world studied and went back home to light statistical fires. Largely through his enthusiasm, the scientific world was moved from a state of disinterest in statistical studies to a situation where large numbers of people were eagerly at work developing new theory and gathering and studying data from all fields of knowledge. The conviction grew that the analysis of statistical data could provide answers to a host of important questions.

An anecdote, related by Helen Walker [3], of Pearson's childhood illustrates in a vivid way the characteristics which marked his adult career. Pearson was asked what was the first thing he could remember. "Well," he said, "I do not know how old I was, but I was sitting in a high chair and I was sucking my thumb. Someone told me to stop sucking it and said that unless I do so the thumb would wither away. I put my two thumbs together and looked at them a long time. 'They look alike to me,' I said to myself, 'I can't see that the thumb I suck is any smaller than the other. I wonder if she could be lying to me.'"

We have here in this simple story, as Helen Walker points out, "rejection of constituted authority, appeal to empirical evidence, faith in his own interpretation of the meaning of observed data, and finally imputation of moral obliquity to a person whose judgment differed from his own." These were to be prominent characteristics throughout Pearson's whole life.

This first period, then, was marked by a change in attitude toward statistics, a recognition of its importance by the scientific world. But, in addition to this, many advances were made in statistical technique. Among the technical tools invented and studied by Galton, Pearson, and their followers were the standard deviation, the correlation coefficient, and the chi-square test.

About 1915, a new name appeared on the statistical horizon, R. A. Fisher (1890- ). His paper of that year on the exact distribution of the sample correlation coefficients ushered in the second period of statistical history and was followed by a whole series of papers and books which gave a new impetus to statistical inquiry. One author has gone so far as to credit Fisher with half of the statistical theory that we use today. Among the significant contributions of Fisher and his associates were the development of methods appropriate for small samples, the discovery of the exact distributions of many sample statistics, the formulation of logical principles for testing hypotheses, the invention of the technique known as analysis of variance, and the introduction of criteria for choice among various possible estimators for a population parameter.

The third period began about 1928 with the publication of certain joint papers by Jerzy Neyman and Egon Pearson, the latter a son of Karl Pearson. These papers introduced and emphasized such concepts as "Type II" error, power, and confidence intervals. It was during this period that industry began to make widespread application of statistical techniques, especially in connection with quality control. There was increasing interest in the taking of surveys, with consequent attention to the theory and technique of taking samples.

We date the beginning of the fourth period with the first paper of Abraham Wald (1902-50) on the now often used statistical procedure, sequential analysis. This paper of 1939 initiated a deluge of papers by Wald, ended only by his untimely death in an airplane crash when at the height of his powers. Perhaps Wald's most significant contribution was his introduction of a new way of looking at statistical problems, what is known as statistical decision theory. From this point of view, statistics is regarded as the art of playing a game with nature as the opponent. This is a very general theory, and, while it does lead to formidable mathematical complications, it is fair to say that a large share of present-day research statisticians have found it advantageous to adopt this new approach.

Having given this brief bird's-eye view of statistical history, we move to a discussion of some of the most basic of statistical concepts. For this purpose it will be convenient to refer to a table showing the heights and weights of twelve people (Fig. 1). The height X is shown in inches; the weight Y is shown in pounds.

Figure 1. Table of heights and weights

    Individual    X (inches)    Y (pounds)
         1            60           110
         2            60           135
         3            60           120
         4            62           120
         5            62           140
         6            62           130
         7            62           135
         8            64           150
         9            64           145
        10            70           170
        11            70           185
        12            70           160

To get some feeling for such a collection of data, it is clearly desirable to display the data pictorially. William Playfair (1759-1823) of England is usually given credit for introducing the idea of graphical representation into statistics. His writings, mostly on economics, were illustrated with extremely good graphs, histograms, bar diagrams, etc. In our problem, the data is most simply represented by means of what is called a frequency diagram. We have shown such a diagram for the height X (Fig. 2). A similar diagram for Y would be easy to construct. While such pictures do help our intuition, we need more than this if we are to treat the data mathematically. We need mathematical measures which describe the data precisely.

[Figure 2. Frequency diagram of the heights X]

Among the most important of such measures are the measures of central tendency. The earliest of these, actually dating back to the Greeks, is the arithmetic mean, which for a discrete variable X, such as we have in our example, is defined by

$$\mu_x = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

Here $x_i$ denotes a value of the variable X, and $n$ is the size of the population. In our example, the mean of the heights is 63.83; the mean of the weights is 141.67. To understand the significance of this concept, we rewrite the definition in the form

$$\mu_x = \frac{1}{n} \sum_{j} x_j f_j.$$

254 The Mathematics Teacher | April, 1963

Here $f_j$ stands for the frequency of occurrence of the value $x_j$, and the summation extends over the distinct values of the variable X. Consider now a weightless rod on which there is a scale running through the range of the variable X, and suppose that at $x_j$ is attached a mass of size $f_j/n$. This gives a system of total mass 1, which will have $\mu_x$ as its center of mass; that is, the system will balance on a fulcrum placed at $\mu_x$. In the case of the heights, the system would look as in Figure 3. This interpretation of the mean will be helpful later when we consider the notion of a continuously distributed variable.

[Figure 3. The heights represented as masses at 60, 62, 64, and 70 on a weightless rod]

While the concept is probably quite old, it was not until 1883 that the median was introduced into statistics by Galton as a second measure of central tendency [4]. The median is simply the middle value of the distribution in the case of an odd number of values and is the average of the two middle values otherwise. The median in our height example is 62.

Another measure of central tendency is the mode, introduced by Karl Pearson around 1894. The mode is the most frequently occurring value, if there is one. In the case where two or more values occur with equal frequency, the mode is not well defined. In the example, the mode of heights is again 62.

If the distribution of a variable X is exactly symmetrical, i.e., if its frequency diagram is exactly symmetrical about a vertical line, then the mean, median, and mode (if there is a mode) will agree. The reader should be able to convince himself that the converse is false by constructing a nonsymmetrical distribution for which the mean, median, and mode agree.

For most purposes, certainly for theoretical purposes, the mean is the most useful measure of central tendency, although admittedly it may take much calculation to get it. The median does, however, have a property which is sometimes advantageous. It is not as subject to distortion due to a few extreme values. For example, if in the table of heights of twelve persons, one of the 70-inch persons were exchanged for a 90-inch person, the mean would be changed considerably while the median would be unaffected.

We next consider measures of dispersion, i.e., measures of how the data spreads out about the mean. Perhaps the first such measure was the probable error introduced by Bessel in 1815 in connection with problems in astronomy. Most commonly used today is the standard deviation, this terminology due to Karl Pearson (1894). It is defined for a discrete variable X by

$$\sigma_x = \left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_x)^2 \right]^{1/2}.$$

Inspection of this formula reveals that $\sigma_x$ tends to be large when the data is widely dispersed, small when the data clusters about the mean.

To introduce the next notion, which is correlation, we refer back to the table of heights and weights (Fig. 1). Inspection of the data reveals that these two variables are somehow related. Even common sense tells us that tall people should generally weigh more than short people. Graphically, this relationship can be portrayed by means of what is called a scatter diagram, this being merely a plot of the data in the Cartesian plane (see Fig. 4). The relationship, if linear, will be indicated by a tendency of the points to simulate a straight line.

In the late nineteenth century, Sir Francis Galton asked whether such a relationship between two sets of data could be measured, and he introduced the notion of correlation. It was Karl Pearson, however, who gave us our present coefficient of correlation defined by

$$\rho = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \mu_x}{\sigma_x} \right) \left( \frac{y_i - \mu_y}{\sigma_y} \right).$$
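The measures discussed so far can be checked directly against the table of heights and weights in Figure 1. The following is a minimal sketch, not part of the original lecture: the function names (`mean`, `mean_from_frequencies`, `std_dev`, `correlation`) are illustrative choices, and the standard deviation divides by n because the article treats the twelve people as the whole population.

```python
from collections import Counter
from math import sqrt
from statistics import median, mode

# Data from the table of heights and weights (Figure 1).
heights = [60, 60, 60, 62, 62, 62, 62, 64, 64, 70, 70, 70]
weights = [110, 135, 120, 120, 140, 130, 135, 150, 145, 170, 185, 160]


def mean(values):
    """Arithmetic mean: (1/n) * sum of x_i."""
    return sum(values) / len(values)


def mean_from_frequencies(values):
    """Same mean, written as (1/n) * sum of x_j * f_j over distinct values."""
    frequencies = Counter(values)  # maps each distinct x_j to its frequency f_j
    return sum(x * f for x, f in frequencies.items()) / len(values)


def std_dev(values):
    """Population standard deviation: sqrt((1/n) * sum of (x_i - mu)^2)."""
    mu = mean(values)
    return sqrt(sum((x - mu) ** 2 for x in values) / len(values))


def correlation(xs, ys):
    """Average product of the standardized deviations of xs and ys."""
    mu_x, mu_y = mean(xs), mean(ys)
    sd_x, sd_y = std_dev(xs), std_dev(ys)
    products = (((x - mu_x) / sd_x) * ((y - mu_y) / sd_y) for x, y in zip(xs, ys))
    return sum(products) / len(xs)


mu_x = mean(heights)   # about 63.83, as stated in the article
mu_y = mean(weights)   # about 141.67, as stated in the article
sd_x = std_dev(heights)
rho = correlation(heights, weights)

# Exchange one 70-inch person for a 90-inch person:
# the mean moves, the median does not.
with_outlier = heights[:-1] + [90]

print(round(mu_x, 2), round(mu_y, 2))
print(median(heights), mode(heights))
print(round(sd_x, 3), round(rho, 3))
print(mean(with_outlier), median(with_outlier))
```

Writing the correlation as an average of products of standardized deviations keeps the code term-by-term parallel to the formula above; dividing the averaged products of raw deviations by the two standard deviations would give the same number.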


[Figure 4. Scatter diagram of the heights and weights]

It is a matter of simple algebra to show that $\rho$ ranges between $-1$ and $+1$. A value of zero indicates no linear relationship; $+1$ indicates that the data lies on a straight line of positive slope; $-1$ means that the data lies on a straight line of negative slope. [...]

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right].$$

Here $\mu$ and $\sigma$ are parameters which turn out to be the mean and standard deviation. The normal curve is often thought of roughly as any "bell shaped" curve. However, this is inaccurate, for other functions, such as $g(x) = [\pi(1 + x^2)]^{-1}$, also have graphs which are bell shaped and yet lack completely the qualities which make the normal curve so useful. While the definition of the normal curve given above may appear complicated, from the point of view of the mathematician it is one of the simplest and best behaved of all curves. Figure 5 pictures a special normal curve.

If the area under the normal curve from $-\infty$ to $+\infty$ were to be calculated by integration, it would be found to be 1. Approximately two-thirds of this area lies between points one standard deviation to the left and one standard deviation to the right of the mean. The probability that a normal variable assumes values on any interval a [...]
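The two area claims about the normal curve can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's computation: the parameter values `mu = 64.0` and `sigma = 3.0` are hypothetical, the helper `integrate` is a plain midpoint rule, and the infinite range is replaced by ten standard deviations on either side of the mean, beyond which the curve is negligibly small.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Normal curve f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sqrt(2 pi) sigma)."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)


def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    width = (b - a) / steps
    return width * sum(func(a + (k + 0.5) * width) for k in range(steps))


# Hypothetical parameters, chosen only for illustration.
mu, sigma = 64.0, 3.0


def curve(x):
    return normal_pdf(x, mu, sigma)


# Total area: the tails beyond ten standard deviations are ignored.
total_area = integrate(curve, mu - 10 * sigma, mu + 10 * sigma)

# Area between one standard deviation left and right of the mean.
within_one_sd = integrate(curve, mu - sigma, mu + sigma)

# The same area expressed through the error function, for comparison.
exact_within_one_sd = erf(1 / sqrt(2))

print(round(total_area, 6))
print(round(within_one_sd, 4), round(exact_within_one_sd, 4))
```

The area within one standard deviation comes out near 0.68, which is the "approximately two-thirds" of the text, and it does not depend on the particular values chosen for `mu` and `sigma`.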


[...] idealized continuous rod of mass 1 running indefinitely far in both directions with density varying according to the function $f$ which determines the normal curve. From calculus, the center of mass of such a rod would be given by

$$\int_{-\infty}^{\infty} x f(x)\,dx.$$

This is, in fact, the formula which we use to define the mean of a continuous distribution. Perhaps surprisingly, not every continuous distribution has a mean, for the above integral may fail to converge. This is the case, for example, for the Cauchy distribution, determined by the equation $g(x) = [\pi(1 + x^2)]^{-1}$, as the reader may verify.

Recalling the formula in the discrete case, it is natural to define the standard deviation for a continuous distribution by

$$\left[ \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx \right]^{1/2}.$$

It is a matter of fairly simple integration to check that if these formulas are used to calculate the mean and standard deviation for the normal distribution, they turn out to be the two parameters $\mu$ and $\sigma$ respectively.

Some important uses of the normal curve in connection with statistical problems will be described in the second of these lectures, which is to be reprinted in the next issue of The Mathematics Teacher.

Notes

1. Walker, Helen M., Studies in the History of Statistical Method (Baltimore: The Williams and Wilkins Company, 1929). This book has been of great value in the preparation of this lecture, especially in connection with statistical history before 1900.

2. In dividing the history of statistics into four periods, we are following Helen M. Walker, "The Contributions of Karl Pearson," Journal of the American Statistical Association, LIII (1958), 11-22.

3. Ibid., p. 13.

4. Actually, Gustav Fechner had employed this measure under the name der Centralwerth in 1874 and had given a description of its properties. Galton's use of the concept appears to go back as early as 1869, but the name median is first used by him in 1883.
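The integral definitions of the mean and standard deviation given in the lecture, and the remark that the Cauchy distribution has no mean, can be explored numerically. This is a rough sketch under stated assumptions: `mu = 64.0` and `sigma = 3.0` are hypothetical parameter values, `integrate` is a simple midpoint rule, and the infinite range for the normal curve is cut off at twelve standard deviations. For the Cauchy curve the sketch integrates $x g(x)$ over $[0, T]$ only; the closed form $\ln(1 + T^2)/(2\pi)$ used for comparison follows from elementary calculus and grows without bound as $T$ increases.

```python
from math import exp, log, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Normal curve with parameters mu and sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)


def cauchy_pdf(x):
    """Cauchy curve g(x) = 1 / (pi * (1 + x^2))."""
    return 1.0 / (pi * (1.0 + x * x))


def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    width = (b - a) / steps
    return width * sum(func(a + (k + 0.5) * width) for k in range(steps))


# Hypothetical parameters, chosen only for illustration.
mu, sigma = 64.0, 3.0
low, high = mu - 12 * sigma, mu + 12 * sigma

# Mean as the center of mass: integral of x f(x) dx.
mean_estimate = integrate(lambda x: x * normal_pdf(x, mu, sigma), low, high)

# Standard deviation: square root of the integral of (x - mean)^2 f(x) dx.
sd_estimate = sqrt(
    integrate(lambda x: (x - mean_estimate) ** 2 * normal_pdf(x, mu, sigma), low, high)
)

# For the Cauchy curve, the half-line integral of x g(x) keeps growing with T,
# so the integral that would define the mean does not converge.
cutoffs = (10.0, 100.0, 1000.0)
half_line = [integrate(lambda x: x * cauchy_pdf(x), 0.0, T) for T in cutoffs]
closed_form = [log(1.0 + T * T) / (2.0 * pi) for T in cutoffs]

print(round(mean_estimate, 4), round(sd_estimate, 4))
for T, value in zip(cutoffs, half_line):
    print(T, round(value, 4))
```

The normal estimates recover the two parameters, while each tenfold increase in the cutoff adds roughly the same amount to the Cauchy integral, which is the logarithmic growth that prevents convergence.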

The origin of L'Hôpital's rule

by D. J. Struik, Massachusetts Institute of Technology, Cambridge, Massachusetts

* Numerals in brackets refer to the notes at the end of this article.

The so-called rule of L'Hôpital, which states that

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}$$

when $f(a) = g(a) = 0$, $g'(a) \neq 0$, was published for the first time by the French mathematician G. F. A. de l'Hôpital (or De Lhospital) in his Analyse des infiniment petits (Paris, 1696) [1].* The Marquis de l'Hôpital was an amateur mathematician who had become deeply interested in the new calculus presented to the learned world by Leibniz in two short papers, one of 1684 and the other of 1686. Not quite convinced that he could master the new and exciting branch of mathematics all by himself, L'Hôpital engaged, during some months of 1691-92, the services of the brilliant young Swiss physician and mathematician, Johann Bernoulli, first at his Paris home and later at his château in the
