The Past, Present and Future of Statistical Graphics

Part 1: Visions from the history of data visualization
(An Ideo-Graphic and Idiosyncratic View)

“The only new thing in the world is the history you don’t know.” Harry S. Truman

  The Milestones Project
  The Golden Age of Statistical Graphics
  Questions of Statistical Historiography

[Figure: display of admission (Admit?: Yes / No) by Sex (Male / Female); cell labels 1198, 1493, 557, 1278]

Michael Friendly http://www.math.yorku.ca/SCS/friendly.html

VIEWS, London, Nov, 2004 Color version of these slides: http://www.math.yorku.ca/SCS/Papers/views/

VIEWS, London, 2004 2 c Michael Friendly


Outline and Plan for Today

Part 1: Visions from the history of data visualization
  The Milestones Project
  The Golden Age of Statistical Graphics
  Problems of Statistical Historiography
Part 2: Tables & graphs: Some principles of graphical displays
  Graphical failures and successes
  Graphical comparisons
  Corrgrams: rendering and variable order
  Effect ordering for data display
Part 3: Graphical methods for categorical data
  Overview: Categorical Data and Graphics
  Methods for two-way frequency tables
  Mosaic displays and loglinear models for n-way tables
Part 4: Whither thou goest? Visions of the future
  SAS graphics: The power to grow?
  Statistical graphics: Models for growth?
  Wider visions
  Conclusions

Milestones Project: Roots of Data Visualization

Cartography
  early map-making → geo-measurement → thematic cartography
  GIS, geo-visualization
Statistics, statistical thinking
  probability theory → distributions → estimation
  statistical models → diagnostic plots → interactive graphics
Data collection
  early recording devices
  “statistics” (numbers of the state): population, mortality → census, surveys
  economic, social, moral, medical, ... statistics
Visual thinking
  geometry, functions, mechanical ..., EDA
Technology
  paper, printing, lithography, computing, displays, ...


Milestones Project: Goals

Comprehensive catalog of historical developments in all fields related to data visualization

→ collect detailed bibliography, images, cross-references, web links, etc.
→ enable researchers to study themes, antecedents, influences, trends, etc.

Current holdings:
  220 milestone items (6200 BC – present)
  240 images, portraits
  140 web links (biographies, commentary)
  250 references

Web version: http://www.math.yorku.ca/SCS/Gallery/milestone/
  Present form: hyperlinked, chronological listing (HTML, PDF)
  Searchable by subject, content, author, country, etc. (LaTeX → XML)
  GFKL Paper: Friendly (2004)


Beginning of Modern Data Graphics: 1800–1849

  Playfair’s linear arithmetic (1780–1800): line graph, pie chart, etc.
  Quetelet (1835): “average man” as central tendency in a normal curve.
  Moral, social and medical statistics collected systematically (1820–)
  Dupin: distributions of years of schooling; prostitutes in Paris.

The Golden Age of Statistical Graphics

  Snow: map of cholera cases (Aug 31–Sep 8, 1854) → Broad Street pump.

[Figure: Snow’s map of cholera cases, with the Broad Street pump labeled]


cf. Water in Walkerton: Outbreak of E. coli contamination (May 16–22, 2000) → 6 died, > 2000 ill.

Source: undetermined until Jan. 2001
No one thought to make a map!


“The Best Statistical Graphic Ever Produced”

  E-J Marey (1878): “defies the pen of the historian by its brutal eloquence”.
  Funkhouser (1937): Minard, the Playfair of France.
  Tufte (1983): “multivariate complexity integrated so gently that viewers are hardly aware that they are looking into a world of six dimensions ... the best statistical graphic ever produced.”

Flow maps as visual tools

  Movement of people and goods was a consistent theme of most of Minard’s work
  Data represented both visually and numerically
  Extensive legends, describing how the information should be understood and interpreted
  Visual engineer for France: the dawn of globalization, emergence of the modern French state.


The March Re-visited

  March on Moscow was part of a pair, along with Hannibal’s campaign
  Aug. 1870: Prussian army invades, Minard flees to Bordeaux
  Personal meaning: horrors of war, the human cost of thirst for military glory.

[Figure: Carte figurative et approximative du mouvement des voyageurs sur les principaux chemins de fer de l’Europe en 1862 (1865) [ENPC: 5862/C351]]


Minard’s graphic inventions

  Population represented by squares, area ∼ population
  ⇒ Visual center of gravity used to choose location for new post office

Why the Golden Age?

Statistics as a discipline:
  1st International Statistics Congress (1853) [Quetelet]
  3rd ISC: Expo. & standardization of graphical methods (Vienna, 1857)
  la Société de statistique de Paris (1860)
  Royal Statistical Society (1860)

Expansion of industrialization, trade, transport → government initiatives in data collection and analysis.

Statistics: Numbers of the State
  Ministry of Public Works (France): Statistical Bureau (Émile Cheysson)
  Similar efforts in Germany, Switzerland, etc.
  U.S. Census Bureau (Edward Walker): first US census (1860)
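
Minard’s “visual center of gravity”, with each district drawn as a square whose area is proportional to its population, amounts to a population-weighted centroid. A minimal sketch in Python, assuming planar (x, y) coordinates. The function names, district coordinates and populations are made up for illustration and are not Minard’s data or method:

```python
from typing import Iterable, Tuple


def weighted_centroid(points: Iterable[Tuple[float, float, float]]) -> Tuple[float, float]:
    """Return the population-weighted mean position.

    Each item is (x, y, population); coordinates are treated as planar.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for x, y, pop in points:
        if pop < 0:
            raise ValueError("population must be non-negative")
        total += pop
        sum_x += pop * x
        sum_y += pop * y
    if total == 0:
        raise ValueError("total population is zero")
    return sum_x / total, sum_y / total


def square_side(population: float, area_per_person: float = 1.0) -> float:
    """Side length of a square whose area is proportional to population."""
    return (population * area_per_person) ** 0.5


# Hypothetical districts: (x, y, population). Illustrative values only.
districts = [(0.0, 0.0, 1000.0), (4.0, 0.0, 3000.0), (0.0, 2.0, 1000.0)]
cx, cy = weighted_centroid(districts)
```

The centroid is pulled toward the most populous district, which is the property that makes it a candidate location for a shared facility.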


L’Album de Statistique Graphique

  The pinnacle of the Golden Age of Graphics
  18 volumes published 1879–1899
  Les Chevaliers des Album

[Figure: 1889: Gross receipts in theaters in Paris, 1848–1889]

Stigler’s Law of Eponymy

“No scientific discovery is named after its original discoverer” (after Merton, 1973)

  Laplace first published the Fourier transform
  Poisson first discovered the Cauchy distribution
  de Moivre and Laplace have priority for the Gaussian distribution

Eponyms are conferred by the community of scholars, not historians

⇒ All milestone items are attributed to one or more individuals, but it is hard to maintain a claim for a “first.”


Problems of Statistical Historiography

  Who gets credit? Stigler’s Law of Eponymy
  What counts as a milestone?
  What is milestone “data?”
  Understanding through reproduction
  How to display, visualize, search?

What counts as a “milestone?”

Any history, particularly one of “milestones,” must address the question of inclusion. We include innovations and developments in:

Graphic forms
  Statistical graphics: bar chart, line plot, scatterplot, boxplot, mosaic plot
  Cartography: isoline, choropleth
Graphic content: data collection, recording
  Bills of Mortality, vital statistics, census
  Measurement, recording devices
Technology and enablement
  Reproduction: printing press, lithography
  photography, motion picture
  Rendering: computing, video display


What counts as a “milestone?”

Theory and practice
  Probability theory
  Summarization: estimation and modeling
  Exposure: EDA
Awareness and use
  Theory and data on visual display
  Principles of graphics (Bertin, Tufte, Wilkinson, ...)
  Empirical studies: what works?
Implementation and dissemination
  Techniques available and accessible
  Printing, publication, web
  Software


What is milestones “data?”

Some meta questions

  How to export advances in data visualization to an historical realm?
  How might a graphically-minded statistician look at history?
  EDA → EBA? (Exploratory bibliographic analysis)
  What kinds of tools are needed?


Analyzing milestones data?

[Figure: Milestones: Time course of developments. Density of milestone items by Year, 1500–2000. Periods labeled: Early maps; Measurement & Theory; New graphic forms; Begin modern period; Golden age; Modern dark ages; High-D Vis]
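
The time-course figure shows a density of milestone dates over the years. A minimal sketch of one way to produce such a curve, using a Gaussian kernel density estimate in plain Python. The years and the bandwidth are placeholder assumptions, not the Milestones data, and the original figure may have used a different estimator or software:

```python
import math
from typing import Callable, Sequence


def gaussian_kde(data: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a function that estimates the density of `data` at a point x."""
    if not data:
        raise ValueError("data must be non-empty")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(data)
    # Normalizing constant of the averaged Gaussian kernels.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x: float) -> float:
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)

    return density


# Hypothetical event years, for illustration only.
years = [1569, 1644, 1686, 1765, 1786, 1801, 1826, 1855, 1869, 1884, 1957, 1975]
density = gaussian_kde(years, bandwidth=25.0)

# Evaluate the curve on a grid matching the figure's axis range.
grid = list(range(1500, 2001, 50))
curve = [density(y) for y in grid]
```

A smaller bandwidth would resolve individual clusters of dates; a larger one would show only the broad rise and fall across periods.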


Analyzing milestones data?

[Figure: Milestones: Places of development. Relative density of milestone items by Year, 1500–2000, with the same period labels. Legend entries: Europe, N. America; sample sizes shown: n=131 and n=76]


Understanding through reproduction

  Historical graphs were created using available data, methods and technology
  We can often come to better understandings of significant questions (intellectual, scientific, graphical) by re-analysis from a modern perspective.
  “What were they thinking?”
  Examples (Friendly and Denis, 2004):
    Playfair’s graph
    Galton’s correlation

Playfair re-visited

  Plotting the ratio of prices to wages conveys Playfair’s intention

[Figure: Labour cost of wheat (Weeks/Quarter) by Year, 1560–1820]
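
The re-analysis of Playfair’s graph plots the ratio of the price of wheat to wages, that is, the labour cost of wheat in weeks of work per quarter. A minimal sketch of that computation, assuming prices in shillings per quarter and wages in shillings per week. The function name and all numbers are invented for illustration and are not Playfair’s series:

```python
from typing import Dict, Tuple


def labour_cost_of_wheat(price_per_quarter: float, weekly_wage: float) -> float:
    """Weeks of work needed to buy one quarter of wheat."""
    if weekly_wage <= 0:
        raise ValueError("weekly wage must be positive")
    return price_per_quarter / weekly_wage


# Hypothetical (price, wage) pairs by year, both in shillings.
series: Dict[int, Tuple[float, float]] = {
    1600: (40.0, 5.0),
    1700: (36.0, 9.0),
    1800: (60.0, 30.0),
}

# One ratio per year; this is the quantity on the vertical axis.
ratio = {year: labour_cost_of_wheat(p, w) for year, (p, w) in series.items()}
```

Plotting the single derived series, rather than prices and wages on separate scales, lets the reader see the trend in purchasing power directly.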


Galton’s correlation diagram

  A semi-graphical display (retains data values)
  Averaged frequencies in 4 adjacent cells
  Isolines, connecting equal average values
  → formed concentric ellipses whose conjugate diameters had interpretations as lines of regression of y on x and x on y.
  Pearson (1901): “that Galton should have evolved all this ... is one of the most noteworthy discoveries arising from pure observation.”
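
The step of averaging frequencies in 4 adjacent cells can be sketched as a 2×2 moving average over the frequency table, with each average assigned to the corner shared by the four cells. This is a minimal illustration under that assumption; the function name and the small table are hypothetical and are not Galton’s counts:

```python
from typing import List


def average_adjacent_cells(table: List[List[float]]) -> List[List[float]]:
    """Average each 2x2 block of adjacent cells.

    For an r x c table the result is (r-1) x (c-1); entry [i][j] is the
    mean of table[i][j], table[i][j+1], table[i+1][j], table[i+1][j+1].
    """
    rows = len(table)
    cols = len(table[0]) if rows else 0
    if rows < 2 or cols < 2:
        raise ValueError("need at least a 2x2 table")
    if any(len(row) != cols for row in table):
        raise ValueError("all rows must have the same length")
    return [
        [
            (table[i][j] + table[i][j + 1] + table[i + 1][j] + table[i + 1][j + 1]) / 4.0
            for j in range(cols - 1)
        ]
        for i in range(rows - 1)
    ]


# Hypothetical frequency table, for illustration only.
freq = [
    [1, 3, 2],
    [4, 8, 5],
    [2, 6, 3],
]
smoothed = average_adjacent_cells(freq)
```

Isolines would then be drawn through points of equal smoothed value, by eye or by interpolation.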


Retracing Galton’s steps: Initial interpolation

[Figure: Galton’s frequency table with interpolated values; Mid-parent height (61–75) vs. Child height (61–75)]


Retracing Galton’s steps: Initial smoothing

[Figure: smoothed frequencies; Mid-parent height (61–75) vs. Child height (61–75)]

Galton’s correlation diagram: conclusions

  Galton did not slavishly interpolate iso-frequency curves as one might do with modern software.
  Rather, he drew contours to the smoothed data by eye and brain (as he had done with earlier maps of weather patterns)
  Reasonable to suppose he had some notions that the contours should be elliptical.


PROC KDE to the rescue

[Figure: Galton’s data: Kernel density estimate; density contours over Mid-parent height (61–75) vs. Child height (61–75)]

How to visualize a history?

Timeline is obvious, but:
  8000+ years, but most milestones in last 300–400
  problems of display, resolution, access
  linear representation, little room for content

Lessons from the past?
  Dubourg’s Scroll of History
  Priestley’s Chart of Biography
  Marey’s life spans of British monarchs

Lessons from graphic artists
Lessons from the web
Lessons from data visualization
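
One way to address the timeline problem (8000+ years, with most milestones in the last 300–400) is a non-linear time axis that gives recent centuries more room. A minimal sketch, assuming a logarithmic scale in years before a reference year. The reference year 2004, the start at roughly 6200 BC, and the function names are illustrative choices, not a description of the Milestones site:

```python
import math


def log_time_position(year: float, now: float = 2004.0, oldest: float = -6200.0) -> float:
    """Map a year to [0, 1]; 0 is `oldest`, 1 is `now`.

    Position is linear in log(1 + years before `now`), so recent years
    get more of the axis than distant ones. Negative years stand in for
    BC dates (approximately).
    """
    if not oldest <= year <= now:
        raise ValueError("year outside the timeline")
    span = math.log1p(now - oldest)
    return 1.0 - math.log1p(now - year) / span


def linear_time_position(year: float, now: float = 2004.0, oldest: float = -6200.0) -> float:
    """Linear counterpart, for comparison."""
    if not oldest <= year <= now:
        raise ValueError("year outside the timeline")
    return (year - oldest) / (now - oldest)


# Share of the axis given to the last 400 years under each scale.
recent_linear = 1.0 - linear_time_position(1604.0)
recent_log = 1.0 - log_time_position(1604.0)
```

On the linear axis the last 400 years occupy under a twentieth of the width; on this logarithmic axis they occupy well over half, at the cost of compressing early history.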


Priestley’s Chart of Biography

  Lifespans of famous people, 1200 BC – 1750
  Divided into two groups: 30 “men of learning” and 29 “statesmen”
  Linear time scale

Hammond’s Graphic History of Mankind (1933)

  1 sheet, vertical format 15in x 90in
  Varying-resolution time scale
  Separate timelines for nations/ethnic groups
  Rise and fall of empires
  Emergence of new cultures
  Influence → width of lines
  Shading/stripes ← conquest, outside influence


Lessons from graphic artists

  Hammond’s Graphic History of Mankind (1933)
  Geschichtsbaum Europa (2002)


David Rumsey map collection

  Online collection of over 7000 historical maps (http://www.davidrumsey.com)
  Extensively indexed
  Search by: author, country, keywords, data fields, etc.
  Provides its own Java client

Geschichtsbaum Europa

  Tree representation → resolution increases with time
  Branches for countries, topics (religion, arts, science)
  Detailed descriptions of events


Lessons from data visualization

  Zoom, focus and resolution
  Non-linear scales for space and time
  lens: context, sorting, focus
  Dynamic, interactive graphics
  Network and tree representations

Lessons from the web

Digital image libraries
  AP Photo Archive http://archivepix.ap.org
  David Rumsey map collection http://www.davidrumsey.com
  Library of Congress, American Memory http://memory.loc.gov

Provide:
  Image metadata
  Search
  Zoom (varying image resolution, e.g., MrSID)


Interactive fisheye view: browsing a web site

Zoom, focus and resolution

  Like an interactive map viewer, a time-based viewer can reveal more or less detail


Summary and conclusions

“The only new thing ... is the history you don’t know.” Harry S. Truman

Modern data visualization has deep roots
  Cartography
  Statistics
  Data collection
  Visual thinking
  Technology
Images from the past have both beauty and truth
They still have lessons from which we can learn
Milestones Project attempts to document them all, comprehensively and for future study


Summary and conclusions

This leads to interesting problems in statistical historiography
  What counts as a “milestone?”
  How to organize, represent, and analyze historical data? What tools are needed?
  What were they thinking? Understanding through reproduction
  How to visualize a history?
