<<

Graphical Methods in Author(s): Stephen E. Fienberg Source: The American , Vol. 33, No. 4 (Nov., 1979), pp. 165-178 Published by: American Statistical Association Stable URL: http://www.jstor.org/stable/2683729 Accessed: 16/09/2010 12:29

Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use.

Please contact the publisher regarding any further use of this work. Publisher contact may be obtained at http://www.jstor.org/action/showPublisher?publisherCode=astata.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

American Statistical Association is collaborating with JSTOR to digitize, preserve and extend access to The American Statistician.

http://www.jstor.org Graphical Methodsin Statistics

STEPHEN E. FIENBERG* do we have a systematicbody of experimentalresults to use as a guide. We have seen considerable in- Graphical methodshave played a centralrole in the developmentof novation in during the past 20 years, but statisticaltheory and practice.This presentationbriefly reviews some the advances in statistical methodologyhave made of the highlightsin the historical developmentof statisticalgraphics room for even greaterinnovation in the future.These and gives a simple taxonomythat can be used to characterize the currentuse of graphicalmethods. This taxonomyis used to describe are the themes of this article. the evolution of the use of graphics in some major statisticaland The qualities and values of and graphs as related scientificjournals. compared withtextual and tabular formsof presenta- Some recent advances in the use of graphicalmethods for statis- tion have been summarizedby Calvin Schmid (1954) tical analysis are reviewed, and several graphical methods for the in his Handbook of Graphic Prese4t-ation: statisticalpresentation of are illustrated,including the use of multicolormaps. 1. In comparison with other types of presentation, KEY WORDS: Diagnostic plots; Graphical methods; History of well-designed charts are more effectivein creating statistics; , statistical; Standards for graphics; Statistical interestand in appealingto the attentionof the reader. graphics. 2. Visual relationships,as portrayedby charts and graphs, are more clearly grasped and more easily 1. INTRODUCTION remembered. 3. The use of charts and graphs saves time, since For no study is less alluring or more dry and tedious than the essential meaning of large masses of statistical statistics,unless the mind and imaginationare set to work or data can be visualized at a glance. thatthe person studyingis particularlyinterested in the subject; 4. Charts and graphs can provide a comprehensive which is seldom the case with young men in any rank in life. pictureof a problemthat makes possible a more com- These words were written178 years ago by William plete and better-balancedunderstandihg than could be Playfair,one of the fathersof statisticalgraphics, in derived fromtabular or textualforms of presentation. his The Statistical Breviary(1801). Playfair'spurpose 5. Charts and graphscan bringout hiddenfacts and in developinghis graphicalrepresentations of statistical relationshipsand can stimulate,as well as aid, analyti- data was to make the statisticsa littlemore palatable. cal thinkingand investigation. We have come a long way since Charts and 1801. This is, of course, what Playfair'swork was all about. graphsnow play an importantrole in data presentation. He said, 'I have succeeded in proposing and putting They are used in our textbooks and classrooms; they in practice a new and useful of stating ac- summarize data in our technical journals; they are counts, . . . as much informationmay be obtained in playingan increasingrole in governmentreports; they five minutesas would require whole days to imprint appear daily in our newspapers and popular maga- on the memory,in a lasting manner, by a of zines. In the fieldof statistics,graphs and charts are figures." used not only to summarizedata but also as diagnostic aids in analysis, to organize Monte Carlo results,and, of course, to display theoreticalrelations. We have come far since the timeof Playfair,but we 2. LANDMARKS IN THE HISTORY OF have farto go. Actual practice in statisticalgraphics is highly varied, good graphics being overwhelmed by distorteddata presentation,cumbersome charts, and A paper on graphic methods in statisticswould be perplexingpictures. Although advice on how and when incomplete without some attentionto historical de- to draw graphs is available, we have no theory of velopment.This briefaccotint owes much to the work statisticalgraphics, nor, as Kruskal (1977) has noted, of Beniger and Robyn (1978) and the social graphics project at the Bureau of Social Science Research * Stephen E. Fienbergis Professor,Department of Applied Statis- (also see Feinberg and Franklin 1975) led by Albert tics, School of Statistics, Universityof Minnesota, St. Paul, MN Biderman. 55108. This article is based on a portionof a General Methodology AlthoughBeniger and Robyn (1978) trace attempts Lecture deliveredat the 137thannual meetingof the AmericanStatis- at graphic depictionof empiricaldata back to at least tical Association, Chicago, Illinois,August 18, 1977,and its prepara- the 10th or 11th centuryA.D;, it was only afterthe tion was supported in part by grant NIE-G-76-0096 from the National Instituteof Education, U.S. Departmentof Health, Educa- work of Crome and Playfair,in the late 18thand early tion and Welfare. The opinions do not necessarily reflect the 19th centuries,that the use of graphs and charts for positions or policies of the National Instituteof Education. The data displaybecame accepted practice.Playfair gave us U.S. Bureau of the has provided funds for the reproduc- the bar chartin his Commercial and Political Atlas tion of the color maps in Section 7. The authorwishes to acknowl- (1786) and the pie and circle graph in his The edge the assistance of Phillip Chapman and Stanley Wasserman in the preparationof this articleand the helpfulcomments and sugges- StatisticalBreviary (1801). His workprovides excellent tions of Albert Biderman, David Hoaglin, Kinley Larntz, William examples of good graphics;it conveys informationand Kruskal, , the editor,and two anonymousreferees. is pleasing to the eye. (Also see the discussion of

\? The American Statistician,November 1979, Vol. 33, No. 4 165 Playfair's work in Funkhouser 1937 and Funkhouser curve, published in JASA (Lorenz 1905), which com- and Walker 1935.) pares percentilesof cumulativedistributions. Such a An examination of Playfair's An Inquiry Into the comparisonof two cumulativedistributions is the first Permanent Causes of the Decline and Fall of Power- example of what Wilk and Gnanadesikan (1968) have ful and Wealthy Nations (1805) illustrateshow far labeled as P-P plots. ahead of his time Playfair really was. Particularly noteworthyin that volume is Playfair's Figure 4, which is a circle chart givingthe extent, population, 3. PUBLISHED STANDARDS FOR GRAPHICS and revenue of the principal nations in Europe in 1804. (This figure was not in a form suitable for Withthe rapid growthof graphicpresentation came reproductionhere.) The circles are proportionalto a professionalconcern forthe need of standards.This the areas of the countries or territories(the figures concern was reflected in the proceedings of the are on the chart as well). The red line on the leftis InternationalStatistical Congresses, held in Europe the number of inhabitants; the yellow line on the from 1853 to 1876, and in abortive attemptsto de- rightis the revenue in pounds. The scales for these velop rules and standardsfor graphics at the sessions lines are the same. of the InternationalStatistical Institute at the beginning The dotted lines, to connect the extremitiesof the lines of of the 20th century. populationand revenue, serve by theirdescent fromright to Then, in 1914, as a result of invitationsextended left,or fromleft to right,to show how revenue and popu- by the American Society of Mechanical Engineers, a lation are proportionalto each other. The impressionmade by this chart is such that it is impossible not to see by what numberof nationalassociations formeda joint commit- Sweden and Denmark are of littleimportance, as to tee on standards for graphic presentation.The com- wealth or power; for,though population and territoryare the mittee'spreliminary report, published inJASA in 1915, original foundation of power, finances are the means of consisted of 17 basic rules of elementarygraphic pre- exertingit. (Playfair 1805, p. 190) sentation,each illustratedby one or more figures.The This figureand the related one in Playfair's The rules are simple and direct, and several of them are Statistical Breviary representone of the earliest at- just as relevanttoday as in 1915. tempts at the graphical depiction of multivariate Ten of these rules pertain to the portrayalof time data, a topic discussed in greaterdetail in Section 6. series data (with time on the horizontalaxis), using Playfairrecognized that, although charts save time, arithmeticscales. We can thus see the kinds of charts the idea of Schmid, that they can allow large masses and figuresthat dominated publications during the early of data to be visualized at a glance, needs some part of this century. qualification.Playfair notes: "Opposite to each Chart The AmericanSociety of Mechanical Engineershas are descriptions and explanations. The reader will continued this effortat standards and has published find,five minutesattention to the principleon which various updates over the years. The emphasis, how- they are constructed,a saving of much labour and ever, has remained on charts, and there time;but, withoutthat trifling attention, he may as well have been few published lists of standards for other look at a blank sheet of paper as at one of the types of charts and graphs. It is of interestto note Charts" (1805, p. xvi). thatthe ASA has just revivedits participationin these Subsequent developments involved such famous activitiesafter many years of benignneglect. names as Bessel (graphic table), Fourier (cumulative A rationalset of graphicstandards should be based curve), and Quetelet (empirical mortality on a theoryfor graphic presentation. Alas, we have no curves, graphs of frequency curves, plots of histo- such theory,and the currentprospects for its develop- grams withlimiting normal curves). ment remain dim. Yet it is easy to come up with a In 1849, Fletcher published the first statistical simpleset of suggestionsthat would improvethe clarity (with tone wash) in a statisticaljournal, al- of most graphs. For example, Cox (1978, p. 6) pro- though such maps had appeared elsewhere as early vides the followinglist of six items: as 1819. Then, in 1857, (1857, 1. The axes should be clearly labeled with the names of the 1858) introducedher Coxcomb chart to describe, by variables and the unitsof measurement. month, the causes of mortalityin the British Army 2. Scale breaks should be used forfalse origins. duringthe Crimean War. The Coxcomb is the fore- 3. Comparison of relateddiagrams should be made easy, for runnerof the modern-dayRose Chartand othergraphs example, by using identical scales of measurementand placing diagramsside by side. used to show cyclic phenomena. 4. Scales should be arrangedso thatsystematic and approxi- The Statistical Atlas of the Based on mately linear relations are plotted at roughly45? to the the Results of the Ninth Census (Walker 1874) con- x axis. tained the firstexamples of population pyramidsand 5. Legends should make diagramsas nearlyself-explanatory, bilateralfrequency polygons. The descendentsof these that is, independentof the text, as is feasible. 6. Interpretationshould not be prejudiced by the technique graphicalelders are among the most effectiveforms of of presentation,for example, by superimposing thick graphicaldisplay. smooth curves on scatter of points faintly Moving into the 20th century,we findthe Lorenz reproduced.

166 ? The American Statistician,November 1979, Vol. 33, No. 4 Even these are not hard and fast rules; for example, for tables, for example, Fox charts, nomograms, and Tukey and otherslike to reorganizeplots so thatrefer- especially charts with small, detailed grid lines. 3. Organizationalgraphs and charts,for ence lines (as just mentioned in item (4)) run example, maps, cer- tain skull diagrams, Venn diagrams, flow charts. These horizontally. graphs do not contain numericalinformation per se, but are usually used to enhance our understandingof a prob- lem or to organize informationin a tidyway.

4. CLASSIFICATION OF GRAPHICS For graphs involvingdata, we focused on the dis- tinction between communication or summary, and To illustratehow the use of graphics has changed analysis. For our purposes, a graphdisplaying data or in our professionaljournals duringthe past 50 years, I summarizinganalyses was intended for communica- required a means of dividinggraphs and charts into tion, even when, in addition to data summaries, it groups, reflectingvarious purposes the graphs serve. contained a fittedtheoretical curve. We interpreted Differentauthors have proposed differentclassification analytical graphs to be ones actually involved in the schemesover the years (e.g., see MaGill 1930),but most analysis, and we required these to include, at a of these schemes were of littleuse for my needs. minimum,something beyond a straightforwardex- Schmid (1954) suggested that there are basically aminationof the traditional modes ofdata presentation. three purposes for charts and graphs: (a) , The purpose of a graphinvolving data is not always (b) analysis,and (c) computation.These categoriesare apparent from the graph itself, and we spent many similarto those suggested by Tukey (1972), although hours reading the accompanyingtext. Even then we his descriptionsare more elaborate and his labels a had a large numberof instances in which the purpose littlemore colorfulthan Schmid's. Tukey's categories was either both communicationand analysis or dif- are as follows: ficult to determine precisely. Thus, we finallyde- 1. Graphs intendedto show what has already been learned cided to breakthe communications-analysis continuum by some othertechnique (propaganda graphs), into three categories (the finalthree of our six cate- 2. Graphs to let us see what may be happening over and gories of purpose): above what has already been described (analytical graphs), and 4. Graphs intendedto display data and results of analysis, 3. Graphs from which numbers are to be read off (sub- for example, time series charts, ,results of stitutesfor tables). Monte Carlo studies, and scatterplots(even those withan accompanyingregression line). Tufte (1976) added anotherpurpose to this list: 5. Plots and graphs with elements of both data display and 4. Graphs fordecoration (graphs are pretty). analysis, forexample, charts fromolder papers involving primitive forms of analysis, and graphs of posterior I have alreadydiscussed theartistic beauty of Playfair's distributions. charts, and to describe them as decoration would be 6. Analytical graphs, for example, residual plots; half- almost demeaning. Thus I chose not to include this normaland otherprobability plots, where conclusions are drawndirectly from graph; graphic methods of performing purpose in my study. calculations, and spectrumestimates from time series.

We tried to make our classificationof graphs con- 5. USE OF GRAPHICS IN JASA AND BIOMETRIKA, sistentover time but as the descriptionof category5 1920-75 demonstrates,this was difficult.It was clear to us that the placement of graphs into categories was a func- Phillip Chapman, a graduate studentat Minnesota, tionof our currentperspective and ourpersonal biases. and I examined the evolution of the use of graphics Our preliminarystudy involved examining all graphs in statisticaljournals subsequent to the 1915 standards published in JASA and Biometrikaduring six 5-year report.Our purpose was not to assess the adherence spans, beginningwith 1921-25 and movingin 10-year to standards,but ratherto determinewhether the rela- increments up through 1971-75. We chose these tive volume of statisticalgraphics used has changed particularjournals because they represent the two over time and whetherthere has been a shiftin the major English-speaking countries with substantial purposes to whichthe publishedgraphs are being put, statisticalprofessional groups and because they have in particulara shiftfrom illustration (and communica- been published throughoutthe 20th century.We had tion) to analysis (and exploration). neither the time nor the resources to consider the Aftersome initialexplorations, we increased the list entire50 years of thejournals. Thus, we chose 5-year of three purposes fromthe preceding section to six. spans because of the distortionsthat possibly could Three of these purposes were relevantfor graphs not resultfrom idiosyncratic volumes or issues ofjournals. involvingdata: For example, one of the early issues of Biometrika thatwe examined containedprimarily articles on skull 1. Graphs depictingtheoretical relationships, such as prob- measurementsand only very specialized graphs. ability density functions,contours of multivariateden- sities, values of risk functionsfor differentestimators, Tables 1 and 2 contain simple summaries of the and theoreticaldescriptions of graphicalmethods. relative volume and distributionof graphs and charts 2. Computational graphs and charts, used as substitutes for both journals. These tables give two related

?? The American Statistician,November 1979, Vol. 33, No. 4 167 1. Numberof Chartsper 100 pages in JASAand BIOMETRIKA

Purpose of Graph

(a) JASA Nondata Data Years 1a 2 3 Subtotals 4 5 6 Subtotals Totals

1921-25 .89 .10 .40 1.39 7.77 2.62 .99 11.38 12.77 1931-35 1.19 .15 .58 1.92 6.74 1.08 .96 8.78 10.70 1941-45 .43 .14 .33 .90 6.29 .99 .47 7.75 8.65 1951 -55 2.46 1.42 .41 4.29 2.62 .96 .19 3.77 8.06 1961-65 b 2.84 .34 .32 3.50 1.88 .64 1.03 3.55 7.05 1971-75c 5.37 .05 .91 6.33 4.70 .76 1.92 7.38 13.71

(b) Biometrika( Nondata Data Years 1 2 3 Subtotals 4 5 6 Subtotals Totals

1921-25 1.75 0 .71 2.46 8.25 .66 .09 9.00 11.46 1931-35 4.37 .08 1.13 5.58 9.84 .11 .30 10.25 15.83 1941 _45e 2.94 .33 .33 3.60 2.45 .33 1.14 3.93 7.52 1951 -55 2.62 .44 .53 3.59 1.41 .78 .53 2.22 5.81 1961 -65 3.88 .65 .18 4.71 .80 1.16 .36 2.32 7.03 1971 -75 3.55 .06 .35 3.96 1.32 .41 .53 2.26 6.22

a 1 = Theoretical curves; 2 graphs for computation:3 = nonnumericalcharts and diagrams; 4 = data display and summary:5 = graphs with mixtureof display and analysis, 6 = analytical graphs b Slightchange in page size; no adjustmentsmade I Change page size and format;no adjustmentsmade d Biometrikahas a differentpage size and formatfrom those used by JASA e Slight increase in amountof textper page measures of relative volume: the number of graphs graphsplay an importantrole, mainlyin the 1950's and per 100 pages and the actual space taken up by the 1960's, and the changes in the volume of nonnumeri- graphicsas a percentageof total space. Special care is cal graphs over time are not especially interesting. needed when interpretingthese data because of A reader of an earlier version of this article noted changes in journal page size and format.The major thatchanging technology may have created a tendency change to be wary of is theJASA shiftfrom 6-by-9-in. towardsmaller graphs over time.We did notfind strong single-columnpages to 8?/2-by-11-in.double-column evidence to supportthis suggestion, except whenJASA pages in 1971. The second measure seems to handle shiftedto the double-columnformat in 1971. this shiftin a reasonable way. Figures A and B give a graphicaldisplay of values In Biometrika there has been roughly a constant fromTables 1 and 2, contrastingthe two journals in volume of theoreticalgraphs over time, whereas in terms of graphs of type 4, 5, and 6 (communication, JASA there is a noticeable increase from 1941-45 mixed; and analysis), all of whichare graphsinvolving to 1951-55. In both journals, type 2 (computational) data. These figuresclearly show the decline in the use

2. Percentageof Space Devoted to Chartsand Graphsin JASAand Biometrika

(a) JASA Nondata Data Years 1 2 3 Subtotals 4 5 6 Subtotals Totals

1921 -25 .38 .05 .25 .68 4.00 1.06 .62 5.68 6.36 1931 -35 .48 .14 .23 .85 3.48 .61 .37 4.46 4.69 1941 -45 .20 .19 .28 .67 3.33 .47 .22 4.02 4.69 1951 -55 1.39 1.22 .23 2.84 1.71 .70 .11 2.52 5.36 1961 -65 1.63 .26 .19 2.08 1.00 .50 .55 2.05 4.13 1971-75 1.03 .02 .20 1.25 1.02 .18 .37 1.57 2.82

(b) Biometrika Nondata Data Years 1 2 3 Subtotals 4 5 6 Subtotals Totals

1921-25 63 0 .44 1.07 5.62 .48 .04 6.14 7.21 1931-35 2.26 .05 .79 3.10 5.44 .06 .22 5.72 8.82 1941-45 1.37 .23 .08 1.68 2.45 .20 .52 3.17 4.85 1951 -55 1.12 .41 .23 1.76 .48 .32 .26 1.06 2.82 1961 -65 1.74 .43 .04 2.21 .32 .49 .16 .97 3.18 1971 -75 1.67 .05 .15 1.87 .65 .19 .19 1.03 2.90

168 ?) The American Statistician,November 1979, Vol. 33, No. 4 of statisticalgraphs during this century, at least within JASA B I OMETRI KA two of our major statisticaljournals. What is the explanationfor this decline? Has there really been a shift away from graphical display? 21-25 The answers to these questions may really be that, ratherthan indicatinga decline in the use of graphs, these data only reflectthe relative increase in statis- 31-35 tical theoryand nongraphicalmethodology. After all, R. A. Fisher's pathbreakingwork on statisticaltheory was published in the 1920's and 1930's, as were the 41-45 contributionsof Hotelling, Neyman, Egon Pearson, and others. The rise of mathematicalstatistics and the focus on it in the statisticalliterature coincide with the relativedecline in the use of graphics. 51-55 LI IV

I 6. RECENT INNOVATIONS IN 61-65 V STATISTICAL GRAPHICS Despite what may appear to be a prolongeddecline 71-75 in the relative use of graphics in statisticaljournals, the past 20 years has seen an almost astonishing 4 3 2 1 00 1 2 3 4 5 6 increase in innovativegraphical ideas fordata display and analysis. The statistical groups at Princeton B. Percentage of Pages Devoted to Charts and Universityand at Bell Telephone Laboratories have Graphs in JASA and Biometrika, 1921 -75 provided much of the leadership for the develop- ment of what might be called the "new statistical to various proposals forrepresenting multidimensional graphics." I would like to review quickly some of data in only two dimensions. these innovations. Anderson (1957) developed his method of using glyphs and metroglyphs,which are circles of fixed 6.1 Graphs forDisplaying Multidimensional Data radius with rays of various lengthsrepresenting the values of differentvariables. When the glyphs are The rapid spread of the use of computersfor statis- plottedas points in a two-dimensionalscatterplot, we tical analysis in the 1960's early led to an upsurge in get a representationof (K + 2)-dimensional data, work involvingmultivariate in analysis. This, turn,led where K is the number of rays. There are many variantsof the glyphtechnique, involving the plotting JASA BIOMETRIKA of triangles(Pickett and White 1966),k-sided polygons (Siegel, Goldwyn, and Friedman 1971), and weather- vanes (Cleveland and Kleiner 1974), as well as much 21-25 more elaborate devises such as constellations(Waki- motoand Taguri 1978). FigureC shows a set of STARS (a version of the k-sided polygons)produced by using 31-35 TROLL, a computersystem developed by theNational Bureau of Economic Research (NBER) ComputerRe- search Center for Economics and Management Sci- 41-45 ence. Welsch (1976) describes the standard TROLL graphic capabilities, as well as a series of experi- mentalgraphic devices, includingSTARS. 51-55 The data in Table 3 are taken fromAshton, Healy, and Lipton (1957), who used graphicaltechniques to compare measurementson the teeth of fossils and 61-65 a, V different"races" of humans and apes. Andrews - l _~~~~~~VI (1972) also used an excerpt of these data to produce a that is considered later in this article. Here, we 71-75 use the same data set as Andrews, involving eight measurementson the permanentfirst lower premolar. I l l l I l I l lI l l l l I The values displayed are not the original measure-

A. Graphsand Chartsper 100 Pages in JASAand NOTE: FiguresI, J,and K, whichappear on thefollowing two pages, Biometrika,1921 -75 are discussed in Section 7 of this article.

? The American Statistician,November 1979, Vol. 33, No. 4 169 CARDIOVASCULARDISEASE MORTALITY U.S. COUNTIES1968-1971 MALESAGED 35-74

DEATH RATE (PER 100 000 PERSONS)

938.52-2309.96 'Z7ts, Ai82779 93851

COUNTQ A A1O AILABLE

P.pqwd 4. 1 ~ ~ ~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~Dpoeto Ho44h Edumon. and Weffw* ?h~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~NwaA aertIL\ng ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~P.44.y4.4hhanisod S.,... InIM

by L __ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~weof R*@ o~Coww7/76 I. Death Rate From CardiovascularDisease Among Males Aged 35-74, 1968-71 (U.S. Bureau of the Census 1976)

PERCENTWITH 1.01 OR MOREPERSONS PER ROOM.BY COUNTY,1970

PERCENT

e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~c) d * l J. Percentage of Housing Units With 1.01 or More Persons per PBoom, 1970 (U.S. Bureau of the | Census~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~4,C ^ i"" dAIIAOI Al A0w F 1976)~ __- ~ ~ ~ ~ ~ ~ ~ ~ ~ ~~~~~~~~~~~~~~~~ S

170 @) The American Statistician, November 1979, Vol. 33, No. c; -B

0~~~~~~~~ M ~~~~~~~~~~~~~~~~~~tO<8D

> _!r I~~~~~~~~~~~~~~~~~~~~~~SNOSId3d OOOOl &id)

0O~~~~~~~~~~~~~~~~~~~~~~~~7 _O SCOD E 0

00 0~~~~~~~~~~~~~~~~~~~~~O0C

L mAc wO 0W~~~~~~~~~~~~~~~?~HaLV 0S

000

Z~o C)cohe American Statistician, November 1979, Vol. 33, No. 4 171

u 0~~~~~~~~~~~~~~~~~~~~~~~~~~~CQ

00.~~~~~~~~~~~~~~~~~~~~~~ CO)

Ow 1oW3000000t11 t IUE

I-IC~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~C z . Co

U.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.,

a. cts~~~~~~~~~~~~~~~~~~~acu rc ~ ~ ~ ~ ~ ~~~?TeAercnSaitca,Nvmer17,Vl 3 o 7 and G (the gorillas and orangutans)form a second; and H and I (the chimpanzees)form a third.Note how most of the separationinto groups is based on the val- ues of the firsttwo canonical variates (these are the ones withthe largesteigenvalues). Andrews (1972) suggested representinga k tuple, x = (x1, x2, . . . Xk), by the finiteFourier series

fx(t) = x1l/V2+ x2 sint + X3 COS t + x4 sin 2t + x5 cos 2t + ....

He then plottedfx(t)over the range -iT ? t ? vT for each point x, for the nine points in Table 3, and pro- duced the graph in Figure D. The graph distinguishes

G H I differentvalues for humans (A,B,C), the gorillas and orangutans (D,E,F,G), and the chimpanzees (H,I). C. STARS for Measurements on Permanent First These groupsare the same as those we arrivedat using Lower Premolar of Various Groups of Humans and the STARS. Note thatthe humans have been separated Apes from the apes and that at t2 and t4 the humans have a commonvalue, whereas the apes convergeinto ments, but rather are the eight canonical variables their two groups at t1. At t3 the group members produced fromthe data on the humans and apes in have theirwidest separation. In his article, Andrews order to maximize the between sum of squares rela- goes on to develop significancetests and confidence tive to the within.Table 3 contains the group means intervalsto make comparisonson the plots. of the values of the canonical variables for the Notingthat people grow up studyingand reactingto humans and the apes. faces, Chernoff(1973) proposed representinga pointin The STARS correspondingto the nine "observa- 18-dimensionalspace by drawing a face whose 18 tions" in Table 3 are given in Figure C. Each characteristics(such as lengthof nose, shape of face, canonical variable is located along one of the eight curvature of mouth, size of eyes, etc.) are deter- rays, beginning with variable one located at three minedby the coordinatesor positionof the point. o'clock and runningcounterclockwise. Thus in Figure Both Chernoff'sfaces and Andrews's Fourier plots C(A), the ray at three o'clock corresponds to the are affected by interchangingcoordinates. Thus a value in row A, column 1 of Table 2, and the other varietyof displays may need to be tried before one seven rays runningcounterclockwise correspond to can arriveat the best one fora givendata set. Chernoff columns 2 through8, respectively.The lengthof any and Rizvi (1975) reportedon an experimentinvolving given ray correspondsto the value of the correspond- randompermutations in the assignmentof coordinates ing variable. This makes STARS especially useful for to the 18 facial features, in a problem involving36 nonnegativevariates. When negative values are pos- observations from two multivariatenormal popula- sible, as in this example, TROLL scales the cor- tions, with approximately18 observations fromeach respondingrays to have their minimumvalue at the population(the actual numbersvaried between 16 and origin. Thus the displays in Figure C suppress all 20). They concluded thatrandom permutations tend to informationreflected in the sign of the observations. affectthe errorrate in a classificationtask by a factor The polygon links the actual values of the coor- of about 25 percent. Their study did not, however, dinates for the observation,the circle is included for evaluate the efficacyof specificfeatures, for example, referencepurposes, and the barely visible tick marks the eyes or the mouth. indicate the means for the nine observations. The To check on the implicationsof the Chernoff-Rizvi rays,circle, and ticksare thusthe same in each STAR. study and to see how Chernoff's faces work in Examinationof the firstnine STARS suggests that practice, I used the FACES programin TROLL with A, B, and C (the humans) formone group; D, E, F, the same data as for the STARS and for Andrews's

3. Permanent First Lower Premolar Coefficients of Canonical Variates for Means of Eight Groups (Andrews 1972)

A. West African -8.09 +.49 +.18 +.75 -.06 -.04 +.04 +.03 B. British -9.37 -.68 -.44 -.37 +.37 +.02 -.01 +.05 C. Australianaboriginal -8.87 +11.44 +.36 -.34 -.29 -.02 -.01 -.05 D. Gorilla: male +6.28 +2.89 +.43 -.03 +.10 -.14 +.07 +.08 E. female +4.82 +1.52 +.71 -.06 +.25 +.15 -.07 -.10 F. Orangutan:male +5.11 +1.61 -.72 +.04 -.17 +.13 +.03 +.05 G. female +3.60 +.28 -1.05 +.01 -.03 -.1 1 -.1 1 -.08 H. Chimpanzee: male +3.46 -3.37 +?33 -.32 -.19 -.04 +.09 +.09 I. female +3.05 -4.21 +.17 +.28 +.04 +.02 -.06 -.06

172 (? The American Statistician,November 1979, Vol. 33, No. 4 i 0~~~~~~~~~~~

4~~~~~~~~~~~~~~~

D.AdessFuirPo-2 eaueet nPraetFrtLwe rmlro aiu ruso HuasadAe-4des17)(erne it emsio fteBoerc oit.

Fourierplot-the nine observationson eightvariables fromTable 3. The use of the FACES programin TROLL turns out to be somewhatcomplicated when thereare fewer A B C than 18 variables. Unless one explicitlyinstructs the programto the contrary,it assigns multiplefeatures to each variable even thoughthe programmerassigns I only one. In two abortive attemptsto draw FACES I~~~4 for the threehuman and six ape groups, the program D E F internallyassigned 13 characteristicsto the eightvari- ates, even though I specifiedthe assignmentof only eightcharacteristics. Several additional attemptsat the constructionof faces with only eight characteristicsled to groupings G H I of faces quite differentfrom those I expected. A par- E. Chernoff'sFACES forMeasurements on Per- ticular one was strongly influenced by the eighth manentFirst Lower Premolar of VariousGroups of canonical variate and led to two groups of apes, the Humans and Apes

(? The American Statistician,November 1979, Vol. 33, No. 4 173 100I I I I I ics has provided the impetusfor innovativedevelop- ments is diagnostic plots. These plots typically in- O d Odf some formof data transformationand rescaling 90 ocdf volve Ocd so that comparisons and deviations can be measured Ode Ocde * cdef Odef froma straightline. Some examples ofthese diagnostic 80 - plots are 1. Cp plots for choosing subset regressions, as 70 _ O bd O bdf suggestedby Colin Mallows. For the standardnormal multipleregression model withk possible independent of size 60 bcd O bdef variables, let P be a subset of the regressors Obdo O bcdf p-i whose residual sum of squares is denoted by 0 bcd *bcdef RSSp. The quantityCp is definedas (50 Cp = RSSp/&2 - n + 2p,

6&2 40 when n is the number of observations and is an estimate of the constant error oU2, usually based on using all p-i predictors.A Cp plot is con- 30 structedby plottingthe values of Cp versusp forall 2k ce possible regressionequations. ae go" _- ~~~~~O eeocd,aoe odefSodeocde 20 0 a oa act adf, acdf In Figure F, I reproducea Cp plot of a six-variable 8 abc * abce example given by Gorman and Toman (1966) and a AND b ob O abe *abcd *abcde (1971). The data involve observa- IN O abd *abde abcdef Daniel and Wood 10 Sobcef 2 Oabcf,abef 4 tions fromlaboratory performance tests on 31 asphalt

Obf 5adf aSobdef pavements(i.e., n = 31). The straightline corresponds I - 0 I 1 to Cp = p, and points near the line correspond to 0 1 2 3 4 5 6 7 p reasonable regressionequations. The two most par- simoniousequations singledout are based on (X,,Xb, F. Cp Plot for Six-Variable Multiple Regression Xf) and (Xal,Xb,Xd,X4,. Example (Gorman and Toman 1966) 2. Residual plots of various sorts, such as half- normal plots (Daniel 1959) and other plots, following females and the males! The finalset of FACES I pro- the suggestionsof Anscombe and Tukey (1963). For duced is included here as Figure E. The eightcanoni- example, in multipleregression analysis, cal variables are representedby the followingfacial oftenplot residuals (or residuals divided by estimates characteristics:(a) face shape, (b) jaw shape, (c) eye of theirstandard errors) against (a) time, (b) omitted size, (d) eye position, (e) pupil position, (f) forehead variables,(c) normal-orderstatistics or ,(d) pre- shape, (g) eyebrow slant,and (h) mouthshape. Figure dicted values, and so on. E does a moderatelygood job of producingthe same In FigureG, I give a residualplot foran intermediate three groups I identifiedwith the othergraphic meth- regressionequation fitto the asphalt-pavementdata ods. WhetherI would have stumbledacross thisgroup- describedin 1. In this plot, the standardizedresiduals ing had I not been explicitlylooking for it is another matter.This one experiencewith FACES suggeststhat 1.2 its use requires considerable skill and experience. Which of these formsof multivariatedata display is and cannot be deter- the best? The answer is unclear 1.0 mined by displayingdata fromone or two examples. 0 The data in Table 3 involve eight canonical variables 0 0 0.8~~~~~~~~ and thus suggestthe appropriatenessof the Andrews X~~~~~~~~ ? cc plot for this example, since the Fourier functionhas *D 0.-0 - an implicitorder of importancefor the variables not 41, 0 ~ ~~~0 se )s associated withSTARS and FACES. Other examples , 0.6- - I have seen suggestthe superiorityof the latter. 0-4~~~~~~ 6.2 Graphical Aids to Analysis-DiagnosticPlots 0.4-I,, .I , ,l ' l' l I 3.5 4.0 4.5 5.0 5.5 6.0 The multidimensionaldata plots in Section 6.1 are Xf examples of computer-generatedgraphics that would have been either impracticalor totallyimpossible to G. ResidualPlot forRegression Based ofl Xa, Xb, draw withoutthe aid of the computer. Another area and Xc Using Gorman-TomanExample: Standard- in whichthe availabilityof computer-generatedgraph- ized Residuals vs. Xf

174 C) The American Statistician,November 1979, Vol. 33, No. 4 4 4

2 2 l ST -K 0 ~ ~ ~ ~ - P ~~~~~~~~~~~~~~~~~~~~~~~~~~~H

0 ED S

2-2 2 - Iu=

-4- l _ r ll T l r l I I I I 04 0.6 0.8 1.0 1.2 1.4 16 18 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 h h

H(a,b). ResidualPlot for on Xay b, X4, and XfUsing Gorman-Toman Example: Residuals vs. TrimmingParameter (h)

(i.e., residualsdivided by estimatedstandard error) for 4. There is yet anotherdiagnostic display for ridge theregression based on (Xa,Xb,Xc) have been plotted regressionmodels knownas the ridge trace (see Mar- against the values of Xf. Note the linear trend,which quardt and Snee 1975 for details). The method in- arguesfor the inclusion of Xf in the equation. A similar volves plottingthe residual sum of squares and the plot of the standardizedresiduals for the regression regressioncoefficients, based on (Xa,Xb,Xf) versusthe values of X41or Xc showed littleevidence of linear trends. These three f3= (X'X + aI) -X'Y, with information plots along other supportthe choice as a functionof 0 c a c 1. The limitof a = 0 cor- of the regressionbased on(Xa,Xb,Xf), suggestedin 1. responds to the estimator 3. Diagnosticdisplays forrobust regression (Denby f3= (X'X)-1X'Y. and Mallows 1977), showingthe effectsof varyingthe trimmingparameter (in a model suggestedby Huber) The ridge trace for the asphalt-pavementdata is not on the adjustmentof residuals and on the values of especially helpful,and thus we have not included it regressioncoefficients. Specifically, I plotthe residuals here. p 5. Q-Q (Quantile-Quantile) and P-P yi- I xj31(h) (Percent-Per- i=0 cent) plots for comparing two distributionfunctions Wilk versus values of h, the Huber trimmingparameter, (see and Gnanadesikan 1968 and Gnanadesikan where fo(h)is chosen to minimize 1977) as well as hybridplots, forexample, Q-P plots. n p 6. Three-dimensionalisometric plots of changing

i=1 j=0 periodograms(Blackstone and Bingham 1974). wheres is a scale estimatefor the residuals, and Diagnostic plots such as these are direct aids in ontheraljueffet onfh residuals ofdusng the Haubero . In the course of analysis of a given Ph[t] = ?2t2 It h whenrssonetrmseffcits. opcficly thostte residualsruhyls data set one typicallyneeds to look at several differ- 2 = - ?t12 t t > h. ent diagnostic plots. The computer makes such examinationa reasonable task. thanuIn Figure1.5lties H, oh,theiI giveestmaed a pair of plotsstandardg of theearrors.thus 31 residuals (theythere haveseem been s thosbento split intoextremie two groupsosrain for ease ofhs dis- play) for the regression based on (Xa,Xb,XI,Xf) fit 6.3 SemigraphicDisplays to the asphalt-pavementdata, as functionsof h. At weffetsnee tobcae testmptereinr the restimuationofnd each value of h, those residuals outside the two 450 Not all recent innovations in graphics require the referencelines have been trimmed,while those inside availabilityof sophisticatedcomputers. Indeed, Tukey are beinggiven full weight. In thisexample, we see that (1972, 1977)proposes several semigraphicdisplays that attemptto blurthe distinctionbetween table and graph and thatare easily preparedby hand at home or on the commutertrain. The best knownof these are the stem- and-leaf display, which is an alternativeto tallying values into frequency distributions,and box-and- whiskerplots.

? The American Statistician,November 1979, Vol. 33, No. 4 175 7. STATISTICAL MAPS variate map, over and above that which they can extract fromthe two univariatemaps put side by side? Color statisticalmaps have been in widespread use The real problemwith these bivariatecolor maps is since the mid-l9thcentury, but therehave been some thatnot enougheffort has gone intomaking them show recentadvances and innovations.I willdescribe one of informationabout bivariate relationships. Wainer these, pioneeredby the U.S. Bureau of theCensus (see (1978) suggests that the way to begin correctingthe Meyer, Broome, and Schweitzer 1975), the two- maps is to have each of the univariate maps use variable color "cross" map. This type of map is in- only one color, using white instead of yellow as the tended to convey the spatial distributionof two vari- low end in both. This clearly has some advantages, ables and the geographicconcentration of their rela- but still ignores a key issue. The color scheme of the tionship.Only an example does the methodjustice (or bivariate grid in the corner of Figure J focuses at- injustice, depending on your point of view). Figures tentionon the four corners, whereas a color scheme I, J, and K are fromthe August 1976 issue of STATUS designedto measurethe relationshipbetween two vari- (a now-defunctmonthly chartbook of social and eco- ables would focus attentionon or near the diagonal, nomic trends produced by the U.S. Bureau of the running from lower right to upper left. Wainer's Census). (Figures I, J, and K appear on pp. 170-171 variantis an improvement,but does not go farenough of this article.) in the appropriate direction. It is unclear whether We begin by examiningthe two variables of study one can simultaneouslyachieve the aim of highlight- separately. First, in Figure I, we have the death rate ing bivariaterelationships and the preservationof uni- fromcardiovascular disease among males age 35 to 74 variate information. (1968-71)- darkblue is high,yellow low. The data are Wainer and Biderman (1977) provide some other displayed by county. The low death rates are concen- suggestions on empiricallyevaluating the efficacyof trated in the western half of the country. Next, in map displays. Tukey (1979) suggests: "So called Figure J, we have a measure of overcrowded hous- 'Statistical maps' do not deserve so honored a name. ing, the percentageof unitswith 1.01 or more persons 'Patch maps' is more appropriate.We can, and must, per room (1970)-dark red is high,and, again, yellow do better by assigningvalues to centers ratherthan is low. The bivariate map is created by an overlay areas, by learning to adjust for area compositions, process, and thereare 16 resultingcolors representing by bringingin spatial smoothing." Tukey goes on to the combinationsof the variables,as in Figure K. How give detailed suggestionsfor improvements. The idea does one interpretthis map? The instructionsin of spatial smoothingpatch maps has drawn attention STATUS note: in the literatureof statistical as well (e.g., see Tobler 1975). Ifthe geographic relationships were random,the resulting map would show no particulartendency toward an areal concen- trationof similarcolors, but instead would exhibit a patch- workof small contrasting color blocks throughoutthe country. 8. STATISTICAL WITH Examinationof the map shows thatthere is, indeed, a geo- STATISTICAL GRAPHICS graphic variation in the distributionof male cardiovascular mortalityand overcrowdedhousing. The 16 individualcolors In the late 1920's and early 1930's, F. E. Croxton which make up the map appear to be concentratedin sizable and others published a series of papers in JASA re- groups of contiguouscounties. (p. 42) portingon studies comparing the relative merits of circles, bars, squares, and cubes for certain types of This statementis, of course, a half-truth.Just as in- displays (e.g., see Eells 1926, Croxton and Stryker dependence in an R x C contingencytable does not 1927, Huhn 1927, and Croxton and Stein 1932). The lead to expected cell values of the same size, because conclusions fromthese early attemptsat experimen- of marginalstructure, so too here the marginaluni- tationwere inconclusiveand contradictory.I have been variate structureleads to nonrandompatterns. to locate any furtherwork on experimentation Do not feel dismayedif you are havingtrouble figur- unable on untilrecent years. ing out what is going on in Figure K. It takes con- (except maps) The recent literature on experimentation with siderable practice to learn to discriminate among is quite fascinating.Earlier I mentionedthe the 16 colors and to organize the spatial bivariate graphics Chernoff-Rizvi(1975) experimentwith faces. William relationship,even for those of us not afflictedwith Kruskal has drawn my attentionto a carefullydone color blindness. There are a variety of issues and study of the use of dot area symbols in cartography questions associated with the use of such maps that Castner and Robinson (1969). They thoroughlyde- need to be resolved: by scribe the characteristicsof dot patternsand theirper- 1. Choice of class intervals. ception, focusingon featuressuch as form,size spac- 2. Choice of colors-note that the colors do have to be ing, arrangement,orientation, and reflectancedensity. matched. From their study of these characteristics,Castner 3. The numberof classes to be used. 4. Is the two-colorsystem superior to a single-colorsystem and Robinson devise a series of tests to evaluate the and geometricpatterning? effectsof varyingsome of these characteristics.The 5. Can individualsextract additional information from the bi- actual experimentsare not fancy in an experimental

176 ?DThe American Statistician,November 1979, Vol. 33, No. 4 design sense, but the careful and almost systematic siderable experimentation.For a professionthat gave approach to the problem is worthyof study by any- rise to the designand analysis of experiments,we have one contemplatingan experimentwith graphical forms. done surprisinglylittle to foster careful, controlled A morerecent study of empirical experiments by Craw- experimentationwith graphical forms to aid us in arriv- ford(1976) underscoresthe fact that the cartography ing at an informedjudgment on what constitutesgood literaturehas better examples of experiments with graphicalpresentation. graphicalforms than does the statisticalliterature. Finally,I note a series of papers by Howard Wainer [Received December 1977. Revised May 1979.] and various coauthors reportingon experimentswith statisticalgraphics. For example, Wainer and Reiser (1976) and Lono and Wainer (1978) have studied the REFERENCES response timeof subjects to questions about different Anderson,Edgar (1957), "A SemigraphicalMethod forthe Analysis graphicaland tabulardisplays of the same set of data. of Complex Problems," Proceedings of the National Academy Wainer and Biderman (1977) have followed up on of Science, 13, 923-927 (reprinted,with an appended note, in Crawford'swork on maps, and Wainer(1978) has been Technometrics(1960), 2, 387-391. carryingout experimentswith two-variable maps. Andrews, D.F. (1972), "Plots of High Dimensional Data," Bio- What is clear to me is that the design of good metrics,28, 125-136. Anscombe, F.J., and Tukey, John W. (1963), "The Examination experimentsin this area will tax the mindsof the best and Analysis of Residuals," Technometrics,5, 141-160. statisticians,as well as those well versed in the psy- Ashton, E.H., Healy, M.J.R., and Lipton, S. (1957), "The De- chology of perception. scriptive Use of Discriminant Functions in Physical Anthro- pology," Proceedings of the Royal Society, B, 146, 552-572. Beniger, James R., and Robyn, Dorothy L. (1978), "Quantitative Graphics in Statistics: A Brief History," The American Statis- 9. WHERE DO WE GO FROM HERE? tician, 32, 1-11. Blackstone, Eugene H., and Bingham,C. (1974), "Cleaning Time We have come far since the timeof Playfair,but we Series by Regressionon Noise," Eason' 74, 586-591. still have far to go. We know how to prepare some Bradu, Dan, and Gabriel, K. Ruben (1978), "The As a formsof statisticalgraphics well; yet in otherareas we DiagnosticTool forModels of Two-Way Tables," Technometrics, have much to learn. Where do we go fromhere? 20, 47-68. Clearly, one of the thingswe need in the area of Castner, Henry W., and Robinson, ArthurH. (1969), Dot Area statisticalgraphics is more. We need to educate our Symbols in Cartography: The Influence of Pattern on Their Perception, American Congress on Surveying and Mapping, studentsand ourselves to make more and betteruse of CartographyDivision, Technical MonographNo. CA-4. known graphicdevices. We also need more attempts Chernoff,Herman (1973), "Using Faces to RepresentPoints in K- at innovation;the examples I have shown do not suf- DimensionalSpace Graphically,"Journal of the AmericanStatis- fice. Finally, we need more attempts at synthesis. tical Association, 68, 361-368. Let me elaborate. , and Rizvi, M. Haseeb (1975), "Effect on Classification Error or Random Permutationsof Features in Representing I have suggested that despite the recent flurryof MultivariateData by Faces," Journal of the American Statis- graphic innovation, many of our statisticaljournals tical Association, 70, 548-554. publishfewer graphs and chartsthan ever before.This Cleveland, William S., and Kleiner, Beat (1974), "The Analysis of must change. First, we must teach statisticiansand Air Pollution Data From New Jersey and ," paper others how and when to draw good graphic displays presented at the annual meeting of the American Statistical Association, St. Louis, Missouri, August 1974. of data. Second, we must encourage them to use Cox, D.R. (1978), "Some Remarks on the Role in Statistics of graphicalmethods in theirwork and inthe materialthey Graphical Methods," Applied Statistics, 27, 4-9. prepare for publication. Third, we must change the Crawford, P.V. (1976), "Optimal Spatial Design for Thematic policies in our professionaljournals so that graphics Maps," CartographicJournal, 13, 145-155. are encouraged, not discouraged. Croxton, Frederick E., and Stein, Harold (1932), "Graphic Com- parisons by Bars, Squares, Circles and Cubes," Journal of the Many areas of statisticalmethodology and analysis American Statistical Association, 27, 54-60. could benefitfrom graphic innovations: , and Stryker, R.E. (1927), "Bar Charts Versus Circle Diagrams," Journalof the American Statistical Association, 22, 1. We furtherwork on data. need displayingmultidimensional 473-482. Some fascinatingsuggestions are foundin Tukey and Tukey (1977). Daniel, Cuthbert(1959), "The Use of Half-NormalPlots in Inter- 2. We have few effectivedisplay devices for aiding in the fitting pretingFactorial Two Level Experiments," Technometrics,1, of ANOVA models to measurement and models to data, loglinear 311-341. categoricaldata, except forthose dealing withtwo-way arrays (e.g., , and Wood, Fred S. (1971), FittingEquations to Data, New see Tukey 1977, and Bradu and Gabriel 1978). Special attention York: JohnWiley & Sons. must be paid to devices that utilize the hierarchicalstructure of Denby, L., and Mallows, C.L. (1977), "Two Diagnostic Displays the parametersin these models. for Robust ," Technometrics,19, 1-13. 3. As Tukey and othershave indicated,much more can be done Eells, Walter Crosby (1926), "The Relative Merits of Circles and with statisticalmaps to make themworthy of the name. Bars forRepresenting Component Parts," Journalof theAmerican Before we can arrive at a theory for statistical Statistical Association, 21, 119- 132. Feinberg,Barry M., and Franklin,Carolyn A. (1975), Social Graphics graphs, we need more attemptsat synthesis;but be- Bibliography, Washington, D.C.: Bureau of Social Science fore we can expect effectivesynthesis, we need con- Research.

?C)The American Statistician,November 1979, Vol. 33, No. 4 177 Fletcher, Joseph (1849), "Moral and Educational Statistics of (1805), An Inquiryin the PermanentCauses of the Decline England and Wales," Journalof theStatistical Society of London, and Fall of Powerfuland WealthyNations, London. 12, 151-176; 189-335. Schmid, Calvin F. (1954), Handbook of Graphic Presentation, Funkhouser, H. Gray (1937), "Historical Development of the New York: Ronald Press. GraphicalRepresentation of StatisticalData," Osiris, 3, 269-405. Siegel, J.H., Goldwyn,R.M., and Friedman,H.P. (1971), "Pattern , and Walker, Helen M. (1935), "Playfair and His Charts," and Process of the Evolution of Human Septic Shock," Surgery, Economic History,3, 103-109. 70, 232-245. Gnanadesikan, R. (1977), Methodsfor Statistical Data Analysis of Tobler, W.R. (1975), "Linear Operators Applied to Areal Data," MultivariateObservations, New York: JohnWiley & Sons. in Display and Analysis of Spatial Data, ed. J.C. Davis and Gorman, J.W., and Toman, R.J. (1966), "Selection of Variables M.J. McCullagh, New York: JohnWiley & Sons, 14-37. for FittingEquations to Data," Technometrics,8, 27-51. Tukey, JohnW. (1972), "Some Graphicand SemigraphicDisplays," Huhn, R. von (1927), "Further Studies in the Graphic Use of in Statistical Papers in Honor of George W. Snedecor, ed. Circles and Bars I: A Discussion of the Eell's ," T.A. Bancroft,Ames: Iowa State UniversityPress, 293-316. Journalof the American StatisticalAssociation, 22, 31-36. (1977), ExploratoryData Analysis,Reading, Mass.: Addison- Joint Committee on Standards for Graphic Presentation (1915), Wesley. "Preliminary Report," Journal of the American Statistical (1979), "Methodology, and the Statistician's Response for Association, 14, 790-797. BOTH Accuracy AND Relevance," Journal of the American Kruskal, William (1977), "Visions of Maps and Graphs," Pro- Statistical Association, 74 (in press). ceedings of the International Symposium on Computer- Tukey, Paul A., and Tukey, John W. (1977), "Methods for Assisted Cartography,1975, 27-36. Direct and IndirectGraphic Display for Data Sets in 3 and More Lono, Marianne, and Wainer, Howard (1978), "Experiments in Dimensions," unpublishedmanuscript, Bell Laboratories. Graphical Comprehension,"First General Conferenceon Social Tufte, Edward R. (1976), unpublishednotes. PrincetonUniversity. Graphics, Leesburg, Virginia. U.S. Bureau of the Census (1976), STATUS, A Monthly Chart- Lorenz, M.O. (1905), "Methods of Measuringthe Concentrationof book of Social and Economic Trends (August issue). Wealth," Journal of the American Statistical Association, Wainer, Howard (1978), "An Empirical InquiryInto Two-Variable 9, 209-219. Color Maps," First General Conference on Social Graphics, MaGill, Walter H. (1930), "The Determinationof the Graphical Leesburg, Virginia. Forms and the Frequencies of the Forms Employed in the , and Biderman, Albert D. (1977), "Some Methodological CurrentReading Matterof the Non-Specialist," unpublishedPhD Commentson EvaluatingMaps," CartographicJournal, 14, 109- thesis, Universityof Pennsylvania. 114. Marquardt, Donald W., and Snee, Ronald D. (1975), "Ridge Re- ,and Reiser, Mark (1976), "Assessing the Efficacyof Visual gressionin Practice," The American Statistician,29, 3-20. Displays," Proceedings of the American Statistical Association, Meyer, MortonA., Broome, FrederickR., and Schweitzer,Richard Social StatisticsSection, (Vol. I), 89-92. H., Jr.(1975), "Color StatisticalMapping by the U.S. Bureau of Wakimoto,Kazumasa, and Taguri, Masaaki (1978), "Constellation the Census," American Cartographer,2, 100-117. Graphical Method for Representing MultidimensionalData," Nightingale,Florence (1857), Mortalityofthe British Army, London. Annals of the Instituteof Statistical Mathematics, 30, 97- 104. (1858), Notes on Matters Affectingthe Health, Walker, Francis A. (1874), Statistical Atlas of the United States and Hospital Administrationof the BritishArmy, London. Based on the Results of the Ninth Census, Washington,D.C.: Pickett,R., and White,B.W. (1966), "ConstructingData Pictures," U.S. Bureau of the Census. Proceedings of the 7th National Symposium of Information Welsch, Roy E. (1976), "Graphics for Data Analysis," Computers Display, 75-81. and Graphics, 2, 31-37. Playfair, William (1786), The Commercial and Political Atlas, Wilk, Martin B., and Gnanadesikan, Ram (1968), "Probability London. Plotting Methods for the Analysis of Data," Biometrika, 55, (1801), The Statistical Breviary,London. 1-17.

178 ?CThe AmericanStatistician, November 1979, Vol. 33, No. 4