Milestones in the History of Data
A case study in statistical historiography
{flea bites man, bites flea, bites man}-wise
York University
CARME 2003

Outline
• Introduction
  – Milestones Project: overview
  – Background
  – Data and Stories
• Milestones tour
• Problems of statistical historiography
  – What counts as a milestone?
  – What is “data”?
  – How to visualize?

Milestones: Project Goals
• Comprehensive catalog of historical developments in all fields related to data visualization.
• → Collect representative bibliography, images, cross-references, web links, etc.
• → Enable researchers to find/study themes, antecedents, influences, patterns, trends, etc.
• Web: http://www.math.yorku.ca/SCS/Gallery/milestone/

Milestones: Conceptual Overview
• Roots of data visualization
  – Cartography: map-making, geo-measurement, thematic cartography, GIS, geo-visualization
  – Statistics: probability theory, distributions, estimation, models, stat-graphics, stat-vis
  – Data: population, economic, social, moral, medical, …
  – Visual thinking: geometry, functions, mechanical , EDA, …
  – Technology: printing, lithography, computing…

Milestones: Content Overview
Every picture has a story – Rod Stewart
c. 550 BC: The first world map? (Anaximander of Miletus)
1669: First graph of a continuous distribution function (Graunt's life table)– Christiaan Huygens.
1782: First topographical map- M. du Carla-Boniface
1801: Pie chart, circle graph - William Playfair
1896: Area rectangles to display two variables and their product- Jacques Bertillon
1924: Museum of Social Statistical Graphics
1991-1996: Interactive data visualization systems (Xgobi, ViSta)

Background: Les Albums
• Album de Statistique Graphique, 1879-99
• Les Chevaliers des Albums
• Milestones session, 5th Intl. Conf. Social Science Methodology (Cologne)

Background: C. J. Minard
• “The best statistical graphic ever produced… defies the pen of the historian”
Civil Engineer for ENPC (1810-1842)
Visual Engineer for France (1843-1869)

Why Minard?
• Study breadth and depth of his work
  – How related to work in his time?
  – How related to modern ?
  – How related to his personal history?

Background: C. J. Minard
• Bibliographic problem: Catalog & categorize his graphic work to make sense of patterns, trends, indications
  – The Graphic Works of (http://www.math.yorku.ca/SCS/Gallery/minbib.html)
  – All known graphic works, catalog entries (ENPC & BNF), keywords, cross-references
  – Online, searchable

Meta Questions
• How to export advances in data visualization to an historical realm?
  – How might a graphically-minded statistician look at history of data visualization?
  – EDA → EBA (Exploratory bibliographic analysis)?
  – What kinds of tools are needed?

The Ebb and Flow of Minard’s Graphic Output
(Figure: output over time; curve label: Non-parametric smooth)
• Graphical insight: smoothing helps the mind’s eye
• Life skills insight: retirement may not be a bad thing.

Minard’s themes: Goods vs. Other
(Figure: curve label: Linear logistic model)
• Graphical insight: discrete data are hard to show effectively
• Statistical insight: models are often crude approximations

Where to build a new post office? (1867)
(Figure label: Center of gravity of pop. density)

The March Re-Visited (1869)
(Figure labels: Hannibal’s retreat; Napoleon’s 1812 campaign)

Milestones Tour
(Figure: timeline of epochs: Early maps & diagrams; Measurement & Theory; New graphic forms; Beginning of modern graphics; Golden Age of data graphics; Modern Dark Ages; Re-birth; High-D data vis)

Pre 17th C.: Early maps & diagrams
c. 550 BC: The first world map? (Anaximander of Miletus)
c. 1280: Diagrams of paired comparisons for electoral systems- Ramon Llull, Spain
1350: Bar graph of theoretical function- N. Oresme, France
1375: Catalan Atlas, an exquisitely beautiful visual cosmography, perpetual calendar, and thematic representation of the known world- Abraham Cresques, Spain

c. 1280: Diagrams of paired comparisons for electoral systems- Ramon Llull, Spain
red = winner
Paired comparisons among candidates b to i:
bc cd de ef fg gh hi
bd ce df eg fh gi
be cf dg eh fi
bf cg dh ei
bg ch di
bh ci
bi

candidate   f  c  h  i  d  b  e
wins        7  6  5  4  3  2  0

1305: Mechanical of knowledge- Ramon Llull, Spain
Tree of porphyry: Aristotle’s categories of knowledge (center)
• Left: questions
• Right: rotating disks → answers
(Main hall, Univ. of Barcelona)
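The paired-comparison diagram on the c. 1280 Llull slide can be sketched in code. This is a minimal illustration, assuming that the winner is the candidate with the most pairwise wins, as the slide's "wins" row suggests. The function names and the outcomes used here are hypothetical and are not the slide's data.

```python
from itertools import combinations


def all_pairs(candidates):
    """Return every unordered pair of candidates, as in the triangular diagram."""
    return list(combinations(candidates, 2))


def tally_wins(candidates, pair_winner):
    """Count pairwise wins; pair_winner maps each (a, b) pair to its winner."""
    wins = {c: 0 for c in candidates}
    for pair in all_pairs(candidates):
        winner = pair_winner[pair]
        if winner not in pair:
            raise ValueError(f"winner {winner!r} is not in pair {pair!r}")
        wins[winner] += 1
    return wins


candidates = list("bcdefghi")   # the eight letters used on the slide
pairs = all_pairs(candidates)   # 8 * 7 / 2 = 28 comparisons

# HYPOTHETICAL outcomes, not the slide's data: the earlier letter always wins.
pair_winner = {(a, b): a for a, b in pairs}

wins = tally_wins(candidates, pair_winner)
overall = max(wins, key=wins.get)   # candidate with the most pairwise wins
print(len(pairs), wins, overall)
```

With eight candidates the tally always distributes 28 wins in total, which matches the 28 letter pairs in the diagram.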

1375: Catalan Atlas, an exquisitely beautiful visual cosmography, perpetual calendar, and thematic representation of the known world- Abraham Cresques, Majorca, Spain [BNF: ESP 30]

1600-1699: Measurement and Theory
1626: Visual representations used to chart the changes in sunspots over time- Christopher Scheiner
1644: First visual representation of statistical data- M.F. van Langren, Spain
1669: First graph of a continuous distribution function (Graunt's life table)– Christiaan Huygens.
1693: First use of areas of rectangles to display probabilities of independent binary events- Edmund Halley, England

1644: First visual representation of statistical data: determination of longitude between Toledo and Rome- M. F. van Langren, Spain
(Figure labels: Actual distance = 16°30′; Estimated distance)

1700-1799: New graphic forms
1701: Isobar map, lines of equal magnetic declination – Edmund Halley
1765: Historical time line (life spans of famous people)
1782: First topographical map- Marcellin du Carla-Boniface
1786: Bar chart, line graphs of economic data- William Playfair

1786: Bar chart and line graph showing three time series: Price of wheat, weekly wages and reigning monarch over a 250+ year span- William Playfair
(Figure labels: monarch, Elizabeth to George IV; price of wheat; wages; time axis 1565 to 1820)

1800-1849: Beginning of modern data graphics
1801: Pie chart, circle graph invented- William Playfair
1819: First modern statistical map (illiteracy in France)- Charles Dupin
1843: Wind-rose (polar coordinates)- L. Lalanne
1844: Tableau-graphique: variable-width, divided bars, area ~ cost of transport- C. J. Minard

1844: Tableau-graphique: variable-width, divided bars, area ~ cost of transport- Charles Joseph Minard

1801: Pie chart, circle graph invented- William Playfair
(Figure labels: pop; taxes)

1850-1899: Golden Age
1855: Dot map of disease data (cholera)- John Snow
(Figure label: Broad St. pump)
1879: Stereogram (3D population pyramid)- Luigi Perozzo
1884: Recursive multi-mosaic on a map- Emile Cheysson
1896: Area rectangles on a map to display two variables and their product- Jacques Bertillon

1896: Area rectangles on a map to display two variables and their product- Jacques Bertillon
(Figure labels: % foreigners; population)

Album de Statistique Graphique, 1879-1899
1884: Changes in French population by department, 1801-1881
(Figure legend: loss; gain)

Under the Cover
• One set of sources to produce various hyper-linked versions
• Items: 216
• Text links: 120
• Images: 222
• References: 243

Problems of statistical historiography
• What counts as a milestone?
• Who gets credit? – Stigler’s Law of Eponymy
• What is “data”?
• How to display, visualize, search?

What counts as a “milestone”?
Innovations and/or developments in:
• Graphic forms:
  – Statistical graphics: bar chart, line graph, scatterplot, boxplot, mosaic plot
  – Cartography: isoline, choropleth
• Graphic content: data collection, recording
  – Bills of Mortality, vital statistics, census
  – Measurement, recording devices

What counts as a “milestone”?
• Technology and enablement
  – Reproduction: printing press, lithography
  – : photography, motion picture
  – Rendering: computing, video display
• Theory and practice
  – Probability theory
  – Summarization: estimation & modeling
  – Exposure: EDA

What counts as a “milestone”?
• Theory of, and data on visual display
  – Principles of graphics (Bertin, Tufte, …)
  – Empirical studies- what works?
• Implementation/dissemination
  – Techniques available & accessible
  – Printing, publication, web
  – Software
  – Awareness & use

What is milestone “data”?
• History-item data base
  – When, who, what, where, notes, …
• Bibliographic data base
  – Book, article, in-collection, map, misc, …
• Multi-media database
  – Image, portrait, web link, audio, movie, …

What is milestone “data”?
Milestones data as a relational data base

Analyzing milestones data?
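One way to picture the three data bases as a single relational data base, and to start analyzing it, is the following sketch. It assumes SQLite as the store; the table names, column names, and query helpers are all hypothetical and are not the project's actual schema. The four sample rows are taken from milestones listed in this talk, and the bibliographic and multi-media tables are deliberately left empty.

```python
import sqlite3

# Hypothetical schema: one table per data base named on the slide.
SCHEMA = """
CREATE TABLE history_item (
    id    INTEGER PRIMARY KEY,
    year  INTEGER NOT NULL,   -- "when"
    who   TEXT,
    what  TEXT NOT NULL,
    place TEXT,               -- "where"
    notes TEXT
);
CREATE TABLE bib_entry (
    id       INTEGER PRIMARY KEY,
    item_id  INTEGER REFERENCES history_item(id),
    kind     TEXT,            -- book, article, in-collection, map, misc
    citation TEXT
);
CREATE TABLE media (
    id       INTEGER PRIMARY KEY,
    item_id  INTEGER REFERENCES history_item(id),
    kind     TEXT,            -- image, portrait, web link, audio, movie
    location TEXT
);
"""

# Sample rows drawn from milestones mentioned in the talk.
ITEMS = [
    (1644, "M. F. van Langren", "First visual representation of statistical data", "Spain"),
    (1786, "William Playfair", "Bar chart, line graphs of economic data", None),
    (1801, "William Playfair", "Pie chart, circle graph", None),
    (1855, "John Snow", "Dot map of disease data (cholera)", None),
]


def build_db():
    """Create an in-memory database and load the sample history items."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany(
        "INSERT INTO history_item (year, who, what, place) VALUES (?, ?, ?, ?)",
        ITEMS,
    )
    con.commit()
    return con


def items_between(con, start, end):
    """Return (year, who, what) for items dated within [start, end]."""
    query = (
        "SELECT year, who, what FROM history_item "
        "WHERE year BETWEEN ? AND ? ORDER BY year"
    )
    return con.execute(query, (start, end)).fetchall()


def items_per_person(con):
    """Return (who, count) pairs, ordered by name."""
    query = "SELECT who, COUNT(*) FROM history_item GROUP BY who ORDER BY who"
    return con.execute(query).fetchall()


con = build_db()
print(items_between(con, 1800, 1849))
print(items_per_person(con))
```

Keeping bibliography and media in their own tables, keyed to a history item, lets one item carry any number of references, images, or links.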

How to visualize a history?
• Timeline: obvious, but:
  – 8000+ years, but most in last 300-400
  – Problems of display, resolution, access
  – Linear: no representation of content
• Lessons from the past?
  – Dubourg’s Scroll of History
  – Priestley’s Chart of Biography
  – Marey’s life spans of British monarchs

Jacques Barbeu-Dubourg’s Scroll of History
• 54’ scroll, spanning 6,480 years (Creation → 1753)
• Grouped vertically by theme or country
• Symbols for character & profession
• History = Geography + Chronology

Priestley’s Chart of Biography
• Life spans of famous people, 1200 BC to 1750

Marey’s life spans of British monarchs (1878)
(Figure labels: reign; life span; Periods of peace and war)

Lessons from the present

• Hammond’s Graphic History of Mankind
  – Varying-resolution time scale
  – Separate time lines for nations/ethnic groups
    • Rise and fall of empires
    • Emergence of new cultures
  – Influence indicated by width of lines
  – Shading/stripes show conquest or outside influence

Lessons from the web

• Digital image libraries
  – AP Photo Archive: http://archivepix.ap.org/
  – David Rumsey Map Collection: http://www.davidrumsey.com/
  – American Memory: http://memory.loc.gov
• Provide:
  – Search
  – Zoom
  – Image data

Searchable image catalogs
AP Photo Archive: http://archivepix.ap.org/
Search by: what, when, where, …

Lessons from data visualization
• Zoom, focus & resolution
  – Non-linear scales for space & time
  – Table lens
• Network representations
• Tree representations

Zoom, Focus & Resolution
(Figure: milestones timeline, 1800 to 1900, covering “Begin modern graphics” and “Golden Age”, with Zoom and Unzoom; the zoomed view starting at 1875 labels movie, 1st star plot, stereogram, nomogram, Albums)

Non-linear Scales for Space
Fisheye view of central Washington D.C.
Dynamic: move the cursor to change the focal point

Non-linear Scales for Space
Hand with Sphere, M. C. Escher

Non-linear Scales for Time
• Hyperbolic viewer: increased resolution at focus
A hyperbolic viewer varies resolution smoothly, trading off against span of the view:
Resolution * Span = constant

Table Lens
• http://www.tablelens.com
(Figure labels: Context; Focus; Sorting)
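The idea of a non-linear time scale with increased resolution at a focus can be illustrated with a small sketch. It swaps in a Sarkar-Brown style fisheye function, g(x) = (d + 1)x / (dx + 1), in place of the hyperbolic viewer named on the slide, so it should be read as an analogous technique and not as that viewer's formula. The function names and the distortion parameter `d` are assumptions made for this example.

```python
def fisheye(x, d):
    """Fisheye distortion of a normalized distance x in [0, 1] from the focus.

    d >= 0 is the distortion factor; d = 0 leaves the scale linear.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if d < 0:
        raise ValueError("d must be non-negative")
    return (d + 1.0) * x / (d * x + 1.0)


def year_to_position(year, focus, start, end, d=3.0):
    """Map a year to a display position in [0, 1], magnified around `focus`."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    if not (start <= focus <= end and start <= year <= end):
        raise ValueError("focus and year must lie within [start, end]")
    focus_pos = (focus - start) / (end - start)   # the focus stays where it was
    if year == focus:
        return focus_pos
    if year > focus:
        x = (year - focus) / (end - focus)
        return focus_pos + fisheye(x, d) * (1.0 - focus_pos)
    x = (focus - year) / (focus - start)
    return focus_pos - fisheye(x, d) * focus_pos


# Ticks for 1800 to 1900 with the focus at 1850.
for tick in range(1800, 1901, 25):
    print(tick, round(year_to_position(tick, 1850, 1800, 1900), 3))
```

Years near the focus are spread over more of the display while the endpoints stay fixed, so the whole span remains visible as context.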

Semantic Network
Visual Thesaurus - http://www.visualthesaurus.com/
(Figure labels: Part of speech controls; Zoom, animate controls)

Star Tree
http://www.inxight.com/map/

Dali TimeScape

Conclusions

The only new thing… is the history you don’t know – Harry Truman

• Modern data visualization has deep roots:
  – Cartography
  – Statistics
  – Data collection
  – Visual thinking
  – Technology
• Milestones Project attempts to document them all comprehensively.

Conclusions

• This leads to interesting problems in (statistical) historiography:
  – What counts as a Milestone?
  – How to organize and represent historical data?
  – What tools are needed?
  – How to visualize a history?