The Past, Present and Future of Statistical Graphics
Part 1: Visions from the history of data visualization
(An Ideo-Graphic and Idiosyncratic View)

“The only new thing in the world is the history you don’t know.” (Harry S. Truman)

The Milestones Project
The Golden Age of Statistical Graphics
Questions of Statistical Historiography

[Figure: mosaic display of a two-way table, Sex (Male, Female) by Admit? (No, Yes); counts shown: Male 1198, 1493; Female 557, 1278]
Michael Friendly York University http://www.math.yorku.ca/SCS/friendly.html
VIEWS, London, Nov, 2004 Color version of these slides: http://www.math.yorku.ca/SCS/Papers/views/
VIEWS, London, 2004 2 c Michael Friendly
Outline and Plan for Today
Part 1: Visions from the history of data visualization
  The Milestones Project
  The Golden Age of Statistical Graphics
  Problems of Statistical Historiography
Part 2: Tables & graphs: Some principles of graphical displays
  Graphical failures and successes
  Graphical comparisons
  Corrgrams: rendering and variable order
  Effect ordering for data display
Part 3: Graphical methods for categorical data
  Overview: Categorical Data and Graphics
  Methods for two-way frequency tables
  Mosaic displays and loglinear models for n-way tables
Part 4: Whither thou goest? Visions of the future
  SAS graphics: The power to grow?
  Statistical graphics: Models for growth?
  Wider visions
  Conclusions

Milestones Project: Roots of Data Visualization
Cartography
  early map-making → geo-measurement → thematic cartography
  GIS, geo-visualization
Statistics, statistical thinking
  probability theory → distributions → estimation
  statistical models → diagnostic plots → interactive graphics
Data collection
  early recording devices
  “statistics” (numbers of the state): population, mortality → census, surveys
  economic, social, moral, medical, ... statistics
Visual thinking
  geometry, functions, mechanical diagrams, EDA
Technology
  paper, printing, lithography, computing, displays, ...
Milestones Project: Goals
Comprehensive catalog of historical developments in all fields related to data visualization
→ collect detailed bibliography, images, cross-references, web links, etc.
  220 milestone items (6200 BC – present)
  240 images, portraits
  140 web links (biographies, commentary)
  250 references
→ enable researchers to study themes, antecedents, influences, trends, etc.
Web version: http://www.math.yorku.ca/SCS/Gallery/milestone/
  Present form: hyperlinked, chronological listing (HTML, PDF)
  Searchable by subject, content, author, country, etc. (LaTeX → XML)
GFKL Paper: Friendly (2004)
Beginning of Modern Data Graphics: 1800–1849
Playfair’s linear arithmetic (1780–1800): line plot, pie chart, etc.
Adolphe Quetelet (1835): “average man” as central tendency in a normal curve.
Moral, social and medical statistics collected systematically (1820–)
Dupin: distributions of years of schooling; prostitutes in Paris.

The Golden Age of Statistical Graphics
Snow: map of cholera cases (Aug 31–Sep 8, 1854) → Broad Street pump.
[Figure: Snow’s cholera map, with the Broad Street Pump labeled]
cf. Water in Walkerton: Outbreak of E. coli contamination (May 16–22, 2000) → 6 died, > 2000 ill.
Source: undetermined until Jan. 2001
No one thought to make a map!
“The Best Statistical Graphic Ever Produced”
E-J Marey (1878): “defies the pen of the historian by its brutal eloquence”.
Funkhouser (1937): Minard, the Playfair of France.
Tufte (1983): “multivariate complexity integrated so gently that viewers are hardly aware that they are looking into a world of six dimensions ... the best statistical graphic ever produced.”

Flow maps as visual tools
Movement of people and goods was a consistent theme of most of Minard’s work
Data represented both visually and numerically
Extensive legends, describing how the information should be understood and interpreted
Visual engineer for France: the dawn of globalization, emergence of the modern French state.
The March Re-visited
March on Moscow was part of a pair, along with Hannibal’s campaign
Aug. 1870: Prussian army invades, Minard flees to Bordeaux
Personal meaning: horrors of war, the human cost of thirst for military glory.

[Figure: Carte figurative et approximative du mouvement des voyageurs sur les principaux chemins de fer de l’Europe en 1862 (1865), ENPC: 5862/C351]
Minard’s graphic inventions
Population represented by squares, area ∼ population
⇒ Visual center of gravity used to choose location for new post office

Why the Golden Age?
Statistics as a discipline:
  1st International Statistics Congress (1853) [Quetelet]
  3rd ISC: Expo. & standardization of graphical methods (Vienna, 1857)
  la Société de statistique de Paris (1860)
  Royal Statistical Society (1860)
Expansion of industrialization, trade, transport → government initiatives in data collection and analysis.
Statistics: Numbers of the State
  Ministry of Public Works (France): Statistical Bureau (Émile Cheysson)
  Similar efforts in Germany, Switzerland, etc.
  U.S. Census Bureau (Edward Walker): first US census (1860)
L’Album de Statistique Graphique
The pinnacle of the Golden Age of Graphics
18 volumes published 1879–1899
Les Chevaliers des Album
1889: Gross receipts in theaters in Paris, 1848-1889

Stigler’s Law of Eponymy
“No scientific discovery is named after its original discoverer” (after Merton, 1973)
  Laplace first published the Fourier transform
  Poisson first discovered the Cauchy distribution
  de Moivre and Laplace have priority for the Gaussian distribution
Eponyms are conveyed by the community of scholars, not historians
⇒ All milestone items are attributed to one or more individuals, but it is hard to maintain a claim for a “first.”
Problems of Statistical Historiography
Who gets credit? Stigler’s Law of Eponymy
What counts as a milestone?
What is milestone “data?”
Understanding through reproduction
How to display, visualize, search?

What counts as a “milestone?”
Any history, particularly one of “milestones,” must address the question of inclusion.
We include innovations and developments in:
Graphic forms
  Statistical graphics: bar chart, line plot, scatterplot, boxplot, mosaic plot
  Cartography: isoline, choropleth
Graphic content: data collection, recording
  Bills of Mortality, vital statistics, census
  Measurement, recording devices
Technology and enablement
  Reproduction: printing press, lithography
  Imaging: photography, motion picture
  Rendering: computing, video display
What counts as a “milestone?”
Theory and practice
Probability theory
Summarization: estimation and modeling
Exposure: EDA
Awareness and use
Theory and data on visual display
Principles of graphics (Bertin, Tufte, Wilkinson, ...)
Empirical studies: what works?
Implementation and dissemination
Techniques available and accessible
Printing, publication, web
Software
What is milestones “data?”
Some meta questions
How to export advances in data visualization to an historical realm?
How might a graphically-minded statistician look at history?
EDA → EBA? (Exploratory bibliographic analysis)
What kinds of tools are needed?
Analyzing milestones data?
[Figure: Milestones: Time course of developments. Density of milestone items by Year (1500–2000), with periods labeled: Early maps; Measurement & Theory; New graphic forms; Begin modern period; Golden age; Modern dark ages; High-D Vis]
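A density curve of this kind can be approximated with a one-dimensional kernel density estimate over the milestone years. The following is a minimal Python sketch, assuming a Gaussian kernel and a hand-picked bandwidth; the `years` list is hypothetical illustration data (not the Milestones database), and the function name `kde` is our own.

```python
import math

def kde(points, grid, bandwidth):
    """Gaussian kernel density estimate of `points`, evaluated at each grid value."""
    n = len(points)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in points)
        for g in grid
    ]

# Hypothetical milestone years (illustration only, not the real data)
years = [1569, 1644, 1686, 1765, 1786, 1801, 1826, 1855, 1869, 1884, 1957, 1975]
grid = list(range(1500, 2001, 10))
density = kde(years, grid, bandwidth=25.0)

# Crude text plot: one row per 50 years, bar length proportional to density
for g, d in zip(grid, density):
    if g % 50 == 0:
        print(g, "#" * round(d * 10000))
```

The bandwidth controls how much the curve is smoothed; a smaller value shows more local bumps, a larger one emphasizes the broad periods.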
Analyzing milestones data?
[Figure: Milestones: Places of development. Relative density of milestone items by Year (1500–2000), shown separately for Europe and N. America (group sizes n=131 and n=76), with periods labeled: Early maps; Measurement & Theory; New graphic forms; Begin modern period; Golden age; Modern dark ages; High-D Vis]
Understanding through reproduction
Historical graphs were created using available data, methods and technology
We can often come to better understandings of significant questions,
  intellectual
  scientific
  graphical
by re-analysis from a modern perspective. “What were they thinking?”
Examples (Friendly and Denis, 2004):
  Playfair’s graph
  Galton’s correlation diagram

Playfair re-visited
Plotting the ratio of prices to wages conveys Playfair’s intention
[Figure: Labour cost of wheat (Weeks/Quarter), scale 1–10, by Year, 1560–1820]
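The quantity re-plotted for Playfair's chart is a simple ratio: the price of a quarter of wheat divided by the weekly wage gives the number of weeks of work needed to buy one quarter. A minimal Python sketch, assuming both series are expressed in shillings; the numbers are hypothetical placeholders, not Playfair's data, and the function name is our own.

```python
def labour_cost_of_wheat(price_per_quarter, weekly_wage):
    """Weeks of work needed to buy one quarter of wheat."""
    if weekly_wage <= 0:
        raise ValueError("weekly wage must be positive")
    return price_per_quarter / weekly_wage

# (year, price in shillings per quarter, wage in shillings per week)
# Hypothetical values for illustration only
series = [
    (1600, 40.0, 5.0),
    (1700, 45.0, 9.0),
    (1800, 60.0, 25.0),
]

for year, price, wage in series:
    print(year, round(labour_cost_of_wheat(price, wage), 2))
```

Plotting this single derived series against year shows the trend in purchasing power directly, rather than leaving the reader to compare two curves on different scales.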
Galton’s correlation diagram
A semi-graphical display (retains data values)
Averaged frequencies in 4 adjacent cells
Isolines, connecting equal average values → formed concentric ellipses whose conjugate diameters had interpretations as lines of regression of y on x and x on y.
Pearson (1901): “that Galton should have evolved all this ... is one of the most noteworthy discoveries arising from pure observation.”
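The smoothing step described above (averaging the frequencies in four adjacent cells) can be sketched as a 2×2 moving average, which yields one smoothed value at each interior corner where four cells meet. This is a minimal Python sketch under that reading; the small table is hypothetical, not Galton's data, and `smooth_2x2` is our own name.

```python
def smooth_2x2(table):
    """Average each 2x2 block of adjacent cells.

    For an r x c table, returns an (r-1) x (c-1) grid: one value per
    interior corner shared by four cells.
    """
    rows, cols = len(table), len(table[0])
    return [
        [
            (table[i][j] + table[i][j + 1]
             + table[i + 1][j] + table[i + 1][j + 1]) / 4.0
            for j in range(cols - 1)
        ]
        for i in range(rows - 1)
    ]

# Hypothetical frequency table (illustration only)
counts = [
    [1, 3, 2],
    [4, 10, 6],
    [2, 5, 3],
]

for row in smooth_2x2(counts):
    print(row)
```

Contours of equal value can then be drawn through the smoothed grid, by hand as Galton did or with a contouring routine.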
[Figure: Galton’s correlation diagram]

Retracing Galton’s steps: Initial interpolation
[Figure: Galton’s table of frequencies with interpolated values; Mid-parent height (61–75) by Child height (61–75)]
Retracing Galton’s steps: Initial smoothing
[Figure: contours of the smoothed frequencies; Mid-parent height (61–75) by Child height (61–75)]

Galton’s correlation diagram: conclusions
Galton did not slavishly interpolate iso-frequency curves as one might do with modern software.
Rather, he drew contours to the smoothed data by eye and brain (as he had done with earlier maps of weather patterns)
Reasonable to suppose he had some notions that the contours should be elliptical.
PROC KDE to the rescue
[Figure: Galton’s data: Kernel density estimate. Density contours; Mid Parent height (61–75) by Child height (61–75)]

How to visualize a history?
Timeline is obvious, but:
  8000+ years, but most milestones in last 300–400
  problems of display, resolution, access
  linear representation, little room for content
Lessons from the past?
  Dubourg’s Scroll of History
  Priestley’s Chart of Biography
  Marey’s life spans of British monarchs
Lessons from graphic artists
Lessons from the web
Lessons from data visualization
Priestley’s Chart of Biography
Lifespans of famous people, 1200 BC – 1750
Divided into two groups: 30 “men of learning” and 29 “statesmen”
Linear time scale

Hammond’s Graphic History of Mankind (1933)
1 sheet, vertical format 15in x 90in
Varying-resolution time scale
Separate timelines for nations/ethnic groups
Rise and fall of empires
Emergence of new cultures
Influence → width of lines
Shading/stripes ← conquest, outside influence
Lessons from graphic artists
Hammond’s Graphic History of Mankind (1933)
Geschichtsbaum Europa (2002)
David Rumsey map collection
Online collection of over 7000 historical maps (http://www.davidrumsey.com)
Extensively indexed
Search by: author, country, keywords, data fields, etc.
Provides its own Java client

Geschichtsbaum Europa
Tree representation → resolution increases with time
Branches for countries, topics (religion, arts, science)
Detailed descriptions of events
Lessons from data visualization
Zoom, focus and resolution
Non-linear scales for space and time
Table lens: context, sorting, focus
Dynamic, interactive graphics
Network and tree representations

Lessons from the web
Digital image libraries
  AP Photo Archive http://archivepix.ap.org
  David Rumsey map collection http://www.davidrumsey.com
  Library of Congress, American Memory http://memory.loc.gov
Provide:
  Image metadata
  Search
  Zoom (varying image resolution, e.g., MrSID)
Interactive fisheye view: browsing a web site

Zoom, focus and resolution
Like an interactive map viewer, a time-based viewer can reveal more or less detail
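One way to get this focus-plus-context behaviour on a timeline is a fisheye distortion of the time axis, which magnifies the years near a chosen focus and compresses the rest. The following is a minimal Python sketch, assuming the distortion function g(x) = (d+1)x / (dx+1) often used for graphical fisheye views; the function names, the focus year, and the distortion factor `d` are illustrative assumptions, not part of the Milestones Project.

```python
def fisheye(x, d):
    """Distort x in [0, 1] (normalized distance from the focus); d=0 is the identity."""
    return (d + 1.0) * x / (d * x + 1.0)

def fisheye_year(year, start, end, focus, d=3.0):
    """Position in [0, 1] of `year` on a timeline [start, end] magnified around `focus`."""
    focus_pos = (focus - start) / (end - start)
    if year >= focus:
        span = end - focus          # years between focus and right edge
        side = 1.0 - focus_pos      # screen distance from focus to right edge
        sign = 1.0
    else:
        span = focus - start        # years between left edge and focus
        side = focus_pos            # screen distance from left edge to focus
        sign = -1.0
    if span == 0:
        return focus_pos
    x = abs(year - focus) / span
    return focus_pos + sign * side * fisheye(x, d)

# Hypothetical timeline 1500-2000, focused on 1850
for year in (1500, 1800, 1840, 1850, 1860, 1900, 2000):
    print(year, round(fisheye_year(year, 1500, 2000, focus=1850), 3))
```

The endpoints and the focus keep their linear positions; only the spacing in between changes, so moving the focus interactively acts like a sliding magnifier over the timeline.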
Summary and conclusions
The only new thing ... is the history you don’t know. (Harry S. Truman)
Modern data visualization has deep roots
  Cartography
  Statistics
  Data collection
  Visual thinking
  Technology
Images from the past have both beauty and truth
They still have lessons from which we can learn
Milestones Project attempts to document them all, comprehensively and for future study
Summary and conclusions
This leads to interesting problems in statistical historiography
  What counts as a “milestone?”
  How to organize, represent, and analyze historical data? What tools are needed?
  What were they thinking? Understanding through reproduction
  How to visualize a history?