
Visualisation and its importance in the analysis of Big Data

Stefano De Francisci

THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION

Eurostat Outline

Introduction Visual cognition process Graphic processing Big Data Storytelling


Eurostat To be learned #1: Data-ink ratio

“Above all else show the data” (Edward Tufte)

http://www.infovis-wiki.net/index.php/Data-Ink_Ratio

Eurostat To be learned #2: Visual information-seeking mantra

“Overview first, zoom and filter, then details-on-demand” (B. Shneiderman)

1. Overview
2. Zoom
3. Filter
4. Details-on-demand
5. Relate
6. History
7. Extracts

Eurostat To be learned #3: Storytelling, or the weaving of a narrative

“Don't just get to the point – start with it” (David Marder)

«We all have important stories to tell. Some stories involve numbers»

«Data are not intrinsically boring. Neither are intrinsically interesting.» (S. Few)

«Behind the scene stands the storyteller, but behind the storyteller stands a community of memory» (H. Arendt)

Examples: Gapminder (Hans Rosling); NoiItalia (Istat); Young people and the jobs crisis in numbers (OECD)

Eurostat Why do we need to visualize information?

Not to get lost in the data

«By visualizing information, we turn it into a landscape that you can explore with your eyes, a sort of information map. And when you’re lost in information, an information map is kind of useful.» (D. McCandless)

http://www.brainyquote.com/quotes/quotes/d/davidmccan630550.html

Visualization of [Big] Data gives you a way «to find things that you had no theory about and no statistical models to identify, but with visualization it jumps right out at you and says, ‘This is bizarre’» (B. Stensrud)

So… we can quote Baudelaire: «Le beau est toujours bizarre» (“The beautiful is always strange”)

http://bollier.org/sites/default/files/aspen_reports/InfoTech09.pdf

Eurostat Reality and representation

In Plato’s allegory of the cave the shadows of real objects are all the spectator sees…

…in the same way, the data is all the analyst sees, and the visualization of that data is all the viewer sees.

[Diagram labels: Real world, Data, Visual representation]

But shadows are not reality

But which kind of visualization could better represent the data?

Eurostat http://www.worldmapper.org/animations/wm01to02.html

Eurostat Importance of visualization: Anscombe's quartet
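A minimal sketch of why the quartet matters: the four datasets below (values as published by F. J. Anscombe in 1973) share almost identical summary statistics, so only a plot reveals how different they are. The helper names `mean` and `pearson` are illustrative and do not come from the slides.

```python
from math import sqrt

# Anscombe's quartet; datasets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Same means and correlation (to two decimals) for four very different shapes.
for name, (xs, ys) in quartet.items():
    print(f"{name:<4} mean x = {mean(xs):.2f}  mean y = {mean(ys):.2f}  r = {pearson(xs, ys):.2f}")
```

Plotting the four (x, y) sets side by side is what exposes the linear, curved, outlier-driven and single-leverage-point structures that these numbers hide.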

http://en.wikipedia.org/wiki/Anscombe's_quartet

Eurostat Some definitions

Visualization
The act or process of interpreting in visual terms or of putting into visible form
Merriam-Webster Dictionary

Visualization is both a process of presentation and discovery
J. C. Roberts, Display Models - ways to classify visual representations, International Journal of Computer Integrated Design and Construction, 2000

Information visualization
The use of computer-supported, interactive, visual representations of abstract data to amplify cognition
S.K. Card, J.D. Mackinlay, B. Shneiderman, Readings in Information Visualization: Using Vision to Think, Morgan Kaufmann, 1999

Visual analytics
The science of analytical reasoning facilitated by interactive visual interfaces
J.J. Thomas, K.A. Cook, eds. Illuminating the path: The research and development agenda for visual analytics. IEEE Computer Society Press, 2005

Systems
Systems for visual analysis of large, complex data integrate computational knowledge discovery with interactive visualization
T. Schreck, D. Keim, Visual Analysis of Social Media Data, Computer, vol. 46, no. 5, 2013

Eurostat Outline

Introduction Visual cognition process Graphic information processing Big Data visualization Storytelling

Eurostat Visual cognition process. A general overview

The visual cognition process can help people to understand the sense of data, solve problems and make decisions

Phases
• Representation: a mapping from raw data to a visible representation
• Interaction: changing what is immediately viewable
• Presentation: organizing this visible representation into the space available

Aims
• Communicate and explain (known ideas)
• Analyse and discover (unknown reality)
• Evaluate and decide (on the basis of graphical evidence)

Eurostat Visual cognition building blocks

[Diagram: from Raw Data to Visual Patterns through visual representation and visual interaction. Activities: visual synthesis, visual presentation, visual exploration, visual analysis. Solutions: infographic, GAV, storytelling, dashboard. Outcomes: sense-making, understanding, problem-solving, making decisions. Knowledge crystallization and knowledge discovery.]

Eurostat Visual cognition pipeline

[Diagram: the same building blocks arranged as a pipeline from Raw Data to Visual Patterns (visual representation, visual interaction, visual synthesis, visual presentation, visual exploration, visual analysis; exploring systems, dashboard, storytelling), leading to sense-making, understanding, problem-solving and making decisions. Roles: visualization practitioner, user, analyst.]

Eurostat Visual cognition process. Conceptual models

[Diagrams: conceptual models relating DATA, ABSTRACTION, IMAGE and KNOWLEDGE, attributed on the slide to Mi. Chen et al., M.S.T. Carpendale, D. Keim et al. and C. Ware]

Eurostat Outline

Introduction Visual cognition process Graphic information processing Big Data visualization Storytelling

Eurostat The essence of visual representation

A few words are enough to explain the act of visualization

“Visualization is about seeing the unseen and gaining an understanding of the underlying information” (J.C.Roberts)

https://kar.kent.ac.uk/21931/1/display_models-ways_roberts.pdf

Eurostat The generic dataflow model of visual representation
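The four stages of this dataflow model (preprocess, filter, map, render) can be sketched as a chain of small functions. This is only an illustration under stated assumptions: the function names, the text-based "image" and the sample records are hypothetical and are not taken from the slides or from any specific visualization library.

```python
def preprocess(raw_rows):
    """Parse raw 'label, value' strings into typed pairs; skip unparsable rows."""
    cleaned = []
    for row in raw_rows:
        label, _, value = row.partition(",")
        try:
            cleaned.append((label.strip(), float(value)))
        except ValueError:
            continue
    return cleaned


def filter_rows(rows, minimum):
    """Select only the required information (here: values above a threshold)."""
    return [(label, value) for label, value in rows if value >= minimum]


def map_to_avo(rows, width=16):
    """Map data onto an Abstract Visualization Object: one bar spec per row.

    Assumes at least one row with a positive value.
    """
    top = max(value for _, value in rows)
    return [{"label": label, "length": round(width * value / top)}
            for label, value in rows]


def render(avo):
    """Render the abstract object into an image (here: a text bar chart)."""
    return "\n".join(f"{bar['label']:<8}{'#' * bar['length']}" for bar in avo)


# Hypothetical raw records, including one that cannot be parsed.
raw = ["A, 40", "B, 10", "C, 25", "D, n/a"]
image = render(map_to_avo(filter_rows(preprocess(raw), minimum=20)))
print(image)
```

Keeping each stage as a separate function mirrors the model: the same filtered data can be mapped to a different abstract object, or the same object rendered differently, without touching the other stages.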

Raw data are preprocessed…
…filtered… to select the required information
…mapped… into an Abstract Visualization Object
…rendered to generate the image

Eurostat The essence of visual analytics

The (visual) analytics process can be seen as the combination of two human activities that make it possible to discover knowledge in the complex information space of the world: to see and to think

“Discovery consists of seeing what everybody has seen and thinking what nobody has thought.” (A. von Szent-Gyorgyi )

Static, rational, passive
Dynamic, instinctive, interactive

Eurostat Goals underlying visual analytics

The goal of visual analytics is the creation of tools and techniques to enable people to:
• Synthesize information and derive insight from massive, dynamic, ambiguous and often conflicting data.
• Detect the expected and discover the unexpected.
• Provide assessments that can be timely, defensible, and understandable.
• Communicate assessment effectively for action

J.J. Thomas, K.A. Cook, eds. Illuminating the path: The research and development agenda for visual analytics. IEEE Computer Society Press, 2005

Eurostat The essence of visual

Our sense of pattern is an extremely important part of all intellectual activity, and the externalization and representation of pattern in is... at the core of many advances. (D. Sless )

“Graphic communication involves transcribing and telling others what you have discovered. Its aim: rapid perception and, potentially, memorization of the overall information. Its imperative: simplicity” (J.Bertin)

Eurostat Information visualization uses

Scenarios, with their core, roots and maxim:

Communication
Core: Presentation. Organizing as well as possible the visible representation into the space available
Roots: Computer + Information Design
Maxim: “Above all else show the data” (E. Tufte)

Analysis
Core: Interaction. Dialog between the user and the system as the user explores the data set to uncover insights
Roots: HCI + KDD
Maxim: “Overview first, zoom and filter, then details-on-demand” (B. Shneiderman)

Decision Support
Core: Synthesis/Evaluation. Measuring and monitoring the achievement of fixed objectives
Roots: Dashboard + KPI + Balance Scorecard
Maxim: “Eloquence through simplicity” (S. Few)

Eurostat Information visualization solutions

Infographic
The art and science of preparing and presenting information so that it can be used by humans in an efficient and effective way

Geo-analytics visualization systems
Graphic techniques to analyze and make sense of the data

Dashboard
Graphic techniques to measure and monitor the relevant data of an organization, in order to achieve its fixed objectives

Eurostat Infographic

Information graphics or infographics are graphic visual representations of information, data or knowledge intended to present complex information quickly and clearly. They can improve cognition by utilizing graphics to enhance the human visual system’s ability to see patterns and trends.
http://en.wikipedia.org/wiki/Infographic

Infographics examples
• Tables
• Flow
• Conceptual
• Histograms
• Graphs
• Maps
• Topographic maps
• Schemas
• Signage Systems

Eurostat Infographic. Two styles

All in one screenshot: infographics as snapshot
http://www.coolinfographics.com/blog/2015/4/17/your-life-in-weeks.html

Scrolling down: infographics as storytelling
http://www.dailyinfographic.com/100-years-of-change

Eurostat Visual analytics

Visual analytics integrates scientific disciplines to improve the division of labor between human and machine.

Eurostat Visual analytics

• Techniques that take advantage of the human eye’s broad bandwidth pathway into the mind to allow users to see, explore, and understand large amounts of information at once
• Techniques that enable users to obtain deep insights that directly support assessment, planning, and decision making
• Techniques to present the results of an analysis, to communicate information in the appropriate context to a variety of audiences
• Techniques that convert all types of conflicting and dynamic data in ways that support visualization and analysis

http://vis.pnnl.gov/pdf/RD_Agenda_VisualAnalytics.pdf

Eurostat Decision support visualization: dashboard

A dashboard is a visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance.

S. Few, “Dashboard Confusion”, Intelligent Enterprise magazine, 2004

Eurostat The secret art of information visualization

To increase human cognitive resources, visualization makes it possible to:
• extend the working memory
• enhance the recognition of patterns
• utilize the powerful image processing capabilities of the human brain
• reduce the search for information

http://www2.vtt.fi/inf/pdf/workingpapers/2009/W117.pdf

Eurostat Working memory extension

These two elements occupy the same memory space in the human mind

Stephen Few - Now You See It: Simple Visualization Techniques for Quantitative Analysis - Analytics Press, 2009

Eurostat Visual patterns useful for data understanding

High – Low - In between Up – Down - Plateau Steep - Gradual

Steady - fluctuating Random - repeating Leading - lagging

Stephen Few, Now You See It: Simple Visualization Techniques for Quantitative Analysis, Analytics Press, 2009

Eurostat Visual patterns useful for data understanding

Non-intersecting - intersecting
Symmetrical - skewed
Normal - abnormal

Tightly - loosely distributed
Wide - narrow
Clusters - gaps

Stephen Few, Now You See It: Simple Visualization Techniques for Quantitative Analysis, Analytics Press, 2009

Eurostat About

Stephen Few, Now You See It: Simple Visualization Techniques for Quantitative Analysis, Analytics Press, 2009

Eurostat About visual perception

Same data can provide different visual messages:
=> Up
=> Fluctuating
=> Random - repeating

Stephen Few, Now You See It: Simple Visualization Techniques for Quantitative Analysis, Analytics Press, 2009

Eurostat Optical illusions

Red: convex. Blue: concave

Stephen Few, Now You See It: Simple Visualization Techniques for Quantitative Analysis, Analytics Press, 2009

Eurostat Pre-attentive patterns

Eurostat Semiotic elements in visualization

[Figure: semiotic elements, graphical elements, graphical properties, perception accuracy]

Eurostat Complexity in simplicity

“If the statistics are boring, then you've got the wrong numbers.” (Edward Tufte)

Principles of graphical excellence according to Edward Tufte
1. Graphical excellence is the well-designed presentation of interesting data – a matter of substance, of statistics, and of design.
2. Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency.
3. Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.
4. Graphical excellence is nearly always multivariate.
5. And graphical excellence requires telling the truth about the data.

Eurostat Same data, two graphs

Revenue/expenditure by months (in €)

Month       Expenditure   Revenue
January          55.000    50.000
February         58.000    53.000
March            63.000    60.000
April            56.000    55.000
May              61.000    60.000
June             65.000    66.000
July             58.000    60.000
August           62.000    65.000
September        69.000    70.000
October          62.000    65.000
November         66.000    72.000
December         73.000    78.000

[Charts: the same data drawn twice, as Expenditure and Revenue series by month on an axis from 0 to 80.000, and as Balance]
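A short worked example of the derived Balance series, assuming (this is an interpretation, not stated on the slide) that Balance means Revenue minus Expenditure for each month. Plotting this single series makes the turning point visible, which two nearly overlapping lines tend to hide.

```python
months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]
# Values in euro, copied from the table above.
expenditure = [55000, 58000, 63000, 56000, 61000, 65000,
               58000, 62000, 69000, 62000, 66000, 73000]
revenue = [50000, 53000, 60000, 55000, 60000, 66000,
           60000, 65000, 70000, 65000, 72000, 78000]

# Assumption: balance = revenue - expenditure.
balance = [r - e for r, e in zip(revenue, expenditure)]

for month, value in zip(months, balance):
    print(f"{month:<10}{value:>+8}")

# First month in which revenue exceeds expenditure.
first_surplus = next(m for m, b in zip(months, balance) if b > 0)
print("First month in surplus:", first_surplus)
```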

Eurostat , some examples

[Figure: six pie charts, numbered 1 to 6, whose slices (dark blue, red, green, purple, light blue, orange) are each 10% or 20%]

Is the purple slice greater than the dark blue one?
Is the green slice greater than the yellow one?

Eurostat Chartjunk, some examples

III Olympic Games Medal Table

Country     Bronze  Silver  Gold
Galatia          1       3     4
Moesia          10       3     1
Phrygia          5       9     0
Bithinya         6       5     4
Pamphilya       10       2    10
Achaia          10      10     6
Thrace          20       0     9
Rhodes           8       7     8
Illyria          3      10     8
Crete            2       3    10
Macedonia        3      10     1
Lycia            3       1     5

[Chart: Gold, Silver and Bronze medals by country, axis from 0 to 35]


Country     Total  Gold  Silver  Bronze
Pamphilya      22    10       2      10
Crete          15    10       3       2
Thrace         29     9       0      20
Rhodes         23     8       7       8
Illyria        21     8      10       3
Achaia         26     6      10      10
Lycia           9     5       1       3
Bithinya       15     4       5       6
Galatia         8     4       3       1
Moesia         14     1       3      10
Macedonia      14     1      10       3
Phrygia        14     0       9       5

[Charts: bars for Total, Gold, Silver and Bronze by country, axes from 0 to 30]
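The reordered table above can be reproduced from the original medal counts with a total column and a sort. The sort key used here (gold descending, then total descending, original order on remaining ties) is an assumption inferred from the row order; the slides do not state the rule.

```python
# country: (gold, silver, bronze), in the order of the original table.
medals = {
    "Galatia": (4, 3, 1), "Moesia": (1, 3, 10), "Phrygia": (0, 9, 5),
    "Bithinya": (4, 5, 6), "Pamphilya": (10, 2, 10), "Achaia": (6, 10, 10),
    "Thrace": (9, 0, 20), "Rhodes": (8, 7, 8), "Illyria": (8, 10, 3),
    "Crete": (10, 3, 2), "Macedonia": (1, 10, 3), "Lycia": (5, 1, 3),
}

# Assumed rule: most gold first, ties broken by total medals.
# sorted() is stable, so remaining ties keep the original order.
ranked = sorted(medals.items(), key=lambda item: (-item[1][0], -sum(item[1])))

print(f"{'Country':<11}{'Total':>6}{'Gold':>6}{'Silver':>8}{'Bronze':>8}")
for country, (gold, silver, bronze) in ranked:
    total = gold + silver + bronze
    print(f"{country:<11}{total:>6}{gold:>6}{silver:>8}{bronze:>8}")
```

Sorting before drawing is the design point: an ordered table or bar chart lets the reader compare neighbours directly instead of searching for them.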

Eurostat Make-Up and graphic surgeries

Product   Units sold
PC        1230
Tablet    1200
TV        1210

[Chart: Excel default graph of sold units per month for PC, Tablet and TV, axis from 1180 to 1240]
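A small sketch of what a truncated axis does to the units-sold figures above. It compares how much taller the PC bar looks than the Tablet bar when the axis starts at 1180 (the lowest tick of the default graph) and when it starts at 0. The function names are illustrative.

```python
units_sold = {"PC": 1230, "Tablet": 1200, "TV": 1210}


def bar_heights(values, baseline):
    """Drawn height of each bar when the axis starts at `baseline`."""
    return {name: value - baseline for name, value in values.items()}


def apparent_ratio(values, baseline, first, second):
    """How many times taller the `first` bar looks than the `second`."""
    heights = bar_heights(values, baseline)
    return heights[first] / heights[second]


truncated = apparent_ratio(units_sold, 1180, "PC", "Tablet")
zero_based = apparent_ratio(units_sold, 0, "PC", "Tablet")

print(f"Axis from 1180: PC bar looks {truncated:.2f} times the Tablet bar")
print(f"Axis from 0:    PC bar looks {zero_based:.3f} times the Tablet bar")
```

With the truncated axis a 2.5% difference in sales is drawn as a bar two and a half times taller, which is the "make-up" the slide title refers to.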

[Charts (A) and (B): the same units sold for PC, Tablet and TV, drawn on an axis from 1200 to 1240 and on an axis from 0 to 1300]

Eurostat How to make an effective

Sales by geographical area and quarter
Sales by Quarter (# of units sold)

Groups of regions   Product       Q1    Q2    Q3    Q4
Northwest Italy     Smartphone   100   200   300   400
                    Tablet       400   300   300   500
                    Notebook     900   600   700  1000
                    Mobile       300   600   700   400
Northeast Italy     Smartphone   200   600   200   500
                    Tablet       500   400   400   800
                    Notebook     900   700   600   800
                    Mobile       400   500   700   500
Central Italy       Smartphone   300   600   700   800
                    Tablet       800   500   200   900
                    Notebook     500   300   400   600
                    Mobile       200   800  1000   600
South Italy         Smartphone   700   400   700   600
                    Tablet      1000   400   300  1000
                    Notebook     500   200   100   400
                    Mobile       200   900  1000   400
Insular Italy       Smartphone   600   200   400   900
                    Tablet       900   300   200   900
                    Notebook     700   100   200   600
                    Mobile       300   600   900   300

Eurostat Bad and good solutions

(A) (B) (C)

(D) (E) (F)

[Figure: alternative charts of the sales table above, by product (Smartphone, Tablet, Notebook, Mobile) and quarter (1 to 4)]

Eurostat Getting the Picture Right

What does «right» mean?

Point of view:
1. Conceptual?
2. Perceptual?
3. Graphic?
4. Political?

Eurostat Conceptual POV

Correlation Does Not Mean Causation

Prices and weights of foreign cars sold in Austria in 1956

Car model   Weight (kg)   Price ($)
 1               675        1227
 2               495        1085
 3               585        1096
 4               490         958
 5               760        1338
 6               585        1096
 7               670        1327
 8              1020        2115
 9               825        1485
10               811        1377
11               825        1769
12               930        1804
13               950        1758
14               890        1646
15               950        1381
16               730        1408
17              1130        1827
18              1070        1885
19               865        2019
20              1050        2000
21               895        1758
22              1120        2269
23              1070        2154
24              1210        2250
25              1270        2269
26              1325        2885
27              1155        2058
28              1210        2750
29              1220        2527
30              1140        2132

[Scatter plot: Price (0 to 3500) against Weight (0 to 1400), r = 0,926]

Weight ⇒ Price ?
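The correlation coefficient quoted on this slide can be recomputed from the 30 weight and price pairs. The sketch below is a plain Pearson calculation; the function name `pearson` is illustrative. A high r only says the two columns move together. It does not say that weight drives price.

```python
from math import sqrt

# (weight, price) pairs for car models 1 to 30, copied from the slide's table.
cars = [
    (675, 1227), (495, 1085), (585, 1096), (490, 958), (760, 1338),
    (585, 1096), (670, 1327), (1020, 2115), (825, 1485), (811, 1377),
    (825, 1769), (930, 1804), (950, 1758), (890, 1646), (950, 1381),
    (730, 1408), (1130, 1827), (1070, 1885), (865, 2019), (1050, 2000),
    (895, 1758), (1120, 2269), (1070, 2154), (1210, 2250), (1270, 2269),
    (1325, 2885), (1155, 2058), (1210, 2750), (1220, 2527), (1140, 2132),
]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


r = pearson(cars)
print(f"r = {r:.3f}")  # compare with the r = 0,926 reported on the slide
```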

Eurostat Perceptual POV

[Figure: sales by quarter and product (Tablet, Mobile, Notebook, Smartphone) for the five areas: Northwest, Northeast, Central, South, Insular]

Eurostat Graphical POV

Lie factor = (size of effect shown in graphic) / (size of effect in data)

Lie factor = 1: truth
Lie factor ≠ 1: lie

where size of effect = |second value − first value| / |first value|

From: http://www.infovis-wiki.net/index.php?title=Lie_Factor
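A worked example of the lie factor, using the fuel-economy graphic that Tufte discusses in The Visual Display of Quantitative Information: the data go from 18 to 27.5 miles per gallon, while the line drawn for them grows from 0.6 to 5.3 inches. The function names are illustrative.

```python
def effect_size(first, second):
    """Relative change between two values: |second - first| / |first|."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return (effect_size(graphic_first, graphic_second)
            / effect_size(data_first, data_second))


# Tufte's fuel-economy example: line lengths in inches, data in miles per gallon.
factor = lie_factor(0.6, 5.3, 18.0, 27.5)
print(f"Lie factor: {factor:.1f}")

# A faithful graphic: the drawn size doubles when the data doubles.
honest = lie_factor(1.0, 2.0, 10.0, 20.0)
print(f"Lie factor of a proportional graphic: {honest:.1f}")
```

The data change by about 53% while the graphic changes by about 783%, so the graphic overstates the effect by a factor of roughly 14.8.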

Eurostat Political POV

Eurostat Outline

Introduction Visual cognition process Graphic information processing Big Data visualization Storytelling

Eurostat Data visualization and Big Data

Implementing effective data visualization solutions for Big Data has to take into account, apart from the volume of the data, other intrinsic constraints generated by the typical characteristics of Big Data:

• real-time changes
• extreme variety of the sources
• different levels of data structuring

Moreover, it is advisable to use several visualization techniques simultaneously, to better illustrate relationships among a large amount of data.

Eurostat When do Data become Big?

Data in motion: analysis of streaming data to enable decisions within fractions of a second
Data at scale: Petabyte (10^15) to Exabyte (10^18)
Data in many forms: structured, unstructured, text, multimedia

Extreme-scale
Complex Information Spaces: (a) the data items being difficult to compare based on raw data, (b) data composed of several base data types

Three critical elements in applying visual analytics to extreme-scale data and complex Information Spaces:
• Size
• Inclusion of visual and analytical
• Active involvement of a human

Eurostat Complexity and flatness

“The world is complex, dynamic, multidimensional; the paper is static, flat. How are we to represent the rich visual world of experience and measurement on mere flatland?” E. Tufte

Eurostat Big Data building blocks

Big data analytics processes: generic process model, based on building blocks [Chau]

• Collection
• Cleaning
• Integration
• Visualization
• Analysis
• Presentation
• Dissemination

Some building blocks can be skipped, depending on the operating context, and going back (two-way street) is allowed

Eurostat Role of data visualization in Big Data Life Cycle

• Data visualization can play a specific role in several phases of the Big Data Life Cycle
• Data types can affect visualization design
• Visualization methods can inform data cleaning and the choice of analysis algorithms

Along the Big Data life cycle, visualization methods can be properly incorporated in three phases:
• Pre-processing, staging, handling
• Exploratory
• Presentation of analytical results

Eurostat Three Styles of Big Data Visualization

Emphasis on…, methodology and author:

Data reduction
Methodology: Big Data → Medium Data → Small Data + R, with filtering at each step
Author: Wickham

Visual interaction
Methodology: new representation pattern + user interaction (StarGlyphs + Parallel coordinates, interaction)
Author: Carpendale

HCP
Methodology: divide and conquer + parallel computation
Author: Bowei Xi

Remco Chang – Fields Institute 15
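A minimal sketch of the data reduction style, in which a large dataset is condensed into a small summary before anything is drawn. It bins 200,000 synthetic values into 20 counts and draws a text histogram from the counts alone. The data, the bin edges and the function name `bin_counts` are assumptions made for illustration; this is not the implementation of any of the authors cited above.

```python
import random
from collections import Counter


def bin_counts(values, lo, hi, n_bins):
    """Count values falling in each of n_bins equal-width bins on [lo, hi].

    Values outside [lo, hi] are ignored; hi itself goes into the last bin.
    """
    width = (hi - lo) / n_bins
    counts = Counter()
    for value in values:
        if lo <= value <= hi:
            index = min(int((value - lo) / width), n_bins - 1)
            counts[index] += 1
    return [counts[i] for i in range(n_bins)]


# Synthetic "big" data: 200,000 draws from a normal distribution.
random.seed(0)
big = [random.gauss(50, 15) for _ in range(200_000)]

# "Small" data: 20 numbers are enough to draw the shape of the distribution.
small = bin_counts(big, 0, 100, 20)
peak = max(small)
for i, count in enumerate(small):
    print(f"{i * 5:>3}-{i * 5 + 5:<3} {'#' * round(40 * count / peak)}")
```

The rendering cost now depends on the number of bins, not on the number of records, which is the point of reducing before visualizing.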

Eurostat Visualizing Big Data in Official Statistics

Although there are already many experiences and success stories in applying data visualization technologies on Big Data, the most interesting proposals are aimed at future challenges. The main issues to deal with are focused on the combination of some basic opportunities like:

Automated analysis tools
New advanced data visualization technologies
Analytic platforms

Traditional Interactive visual analytics Presentation visual methods approaches tools

Eurostat Automated analysis and interactive visual methods

In order to support the entire life cycle of Big Data, a good visual analytics system has to combine the advantages of automatic analysis with interactive techniques to explore data. Behind this desired technical feature there is the deeper aim of integrating the analytic capability of a computer with the abilities of human analysis.

volume, velocity, variety: mapping complex data into simpler visual forms of knowledge

Appropriate definition, in the design and implementation phase, of the specific weight and right balance of the two components

Eurostat Automated analysis

Reorganization of the structure of the visual analytics functionalities

Macro phase: Processes
Data management: Selection & data loading; Data integration; Export
Data handling: Pre-processing, cleaning & transformation; Calculations & querying
Data modelling: Statistics functions (univariate, bivariate and multivariate analysis); Clustering, classification, network modelling, predictive analysis; Data projection (Principal Components, Multidimensional scaling, Self organizing map, Bayesian Network); Pattern recognition & visual query analysis (both automated and interactive)
Data visualization: Visual representation; Interpretation, evaluation

Eurostat Automated analysis

Automated analysis of Big Data is concerned with the “development of methods and techniques for making sense of data” [Fayyad]

Extreme characteristics of Big Data: huge, at low level
• Synthetic: simple reports
• Clear: descriptive, more abstract approximation or model of the process that generated the data
• Useful: predictive model for estimating the value of future cases

Specific data-mining methods for pattern discovery and extraction

Eurostat Interactive Visual Analytics techniques with Big Data

Data preprocessing through visual approaches
• Data mining
• Machine learning
• Statistical methods
Bring out meaningful:
• patterns
• outliers
• clusters
• gaps

Interactive visualization
• Discover the most interesting relationships among data
• Browse
• Search
• Investigate what-if scenarios
• Monitor
• Verify the presence of biases
• Simulate the impact of changes

Dissemination tools
• Show the data
• Enlighten the sense of data
• Tell stories about them

Eurostat Interactive visualization

In the context of Big Data, some categories can be adopted as a basis for reasoning [Yi-etal-2007]:
• Select (mark something as interesting)
• Explore (show me something else)
• Reconfigure (show me a different arrangement)
• Encode (show me a different representation)
• Abstract/elaborate (show me more or less detail)
• Filter (show me something conditionally)
• Connect (show me related items)

http://www.cs.tufts.edu/comp/250VA/papers/yi2007toward.pdf

Eurostat

Select (mark something as interesting)
Abstract/elaborate (show me more or less detail)
Explore (show me something else)
Filter (show me something conditionally)

Eurostat Reconfigure (show me a different arrangement)

Eurostat Connect (show me related items)

Eurostat Interactive visualization

Select: ability to mark data items of interest to highlight them. Example: outlier values
Explore: enabling users to examine the different subsets in which the data can be divided. Example: panning across the data
Reconfigure: provide users with different data perspectives. Examples: revelation of hidden patterns; visual rearrangements of a series
Encode: capability of a visualization system to handle and transform the basic elements of human vision. Examples: pre-attentive processing, colours, shapes, dimensions
Abstract/elaborate: capability to reduce or increase the details of the visualization
Filter: highlight some visual elements that are compliant with specific conditions defined by users
Connect: enables users to better emphasize relationships and associations already known or discover the hidden patterns of the data

Eurostat Traditional vs. New techniques

Traditional Visual Analytics tools and techniques don’t properly fit big data.

Computational problems for VA with Big Data

Main causes and main effects:

Human perception: when the number of visualized objects becomes large, humans often have difficulty extracting meaningful information

Limited screen space: risk of significant visual clutter when a visualization displays too many data

Eurostat Traditional vs. New techniques

Working with new data sources brings about a number of analytical challenges

(1) getting the picture right, i.e. summarising the data
(2) interpreting, or making sense of the data through inferences
(3) defining and detecting anomalies.

Eurostat Visual scalability

Computational methods provide compact, meaningful information about the raw data:
• Dimension reduction
• Clustering
• Methods to exploit machine learning
• Methods to exploit data mining


Eurostat

1. Social Networks (human-sourced information)
   1100. Social Networks
   1200. Blogs and comments
   1300. Personal documents
   1400. Pictures: Instagram, Flickr, Picasa
   1500. Videos: Youtube etc.
   1600. Internet searches
   1700. Mobile data content: text messages
   1800. User-generated maps
   1900. E-Mail

2. Traditional Business systems (process-mediated data)
   21. Data produced by Public Agencies
      2110. Medical records
   22. Data produced by businesses
      2210. Commercial transactions
      2220. Banking/stock records
      2230. E-commerce
      2240. Credit cards

3. Internet of Things (machine-generated data)
   31. Data from sensors
      311. Fixed sensors
         3111. Home automation
         3112. Weather/pollution sensors
         3113. Traffic sensors/webcam
         3114. Scientific sensors
         3115. Security videos/images
      312. Mobile sensors (tracking)
         3121. Mobile phone location
         3122. Cars
         3123. Satellite images
   32. Data from computer systems
      3210. Logs
      3220. Web logs

Eurostat 1200. Blogs and comments Blogopole

«The Blogopole (a contraction of “political blogosphere”) is the set of citizens' sites and blogs that feed the political debate in France, that is, politicians, activists and sympathisers as well as commentators and analysts»

http://blogopole.observatoire-presidentielle.fr/

Eurostat 1400. Pictures TagGalaxy

Eurostat 1300. Personal documents The Bible

«The bar graph that runs along the bottom represents all of the chapters in the Bible. Books alternate in between white and light gray. The length of each bar denotes the number of verses in the chapter. Each of the 63,779 cross references found in the Bible is depicted by a single arc - the color corresponds to the distance between the two chapters, creating a rainbow-like effect»

http://www.chrisharrison.net/index.php/Visualizations/BibleViz

Eurostat 1100. Social Networks Human emotion

«This video shows the mood in the U.S., as inferred using over 300 million tweets, over the course of the day. The maps are represented using density-preserving cartograms»

https://www.youtube.com/watch?v=ujcrJZRSGkg

Eurostat 1100. Social Networks Tweetcatcha

«TweetCatcha seeks to uncover the organic nature of news as it travels through Twitter over time, by examining the movement of NY Times articles through Twitter»

Eurostat 1. Human-sourced information WikiMindMap

Eurostat 1. Human-sourced information 100 seconds of History

For a sort of evolution of the world at a glance, all geotagged Wikipedia articles have been scraped, with time attached to them, providing a total of 14,238 events.

http://flowingdata.com/2011/03/21/history-of-the-world-in-100-seconds-according-to-wikipedia/

Eurostat 2110. Medical records Human disease network

«The diseasome website is a disease/disorder relationships explorer and a sample of an innovative map-oriented scientific work. Built by a team of researchers and engineers, it uses the Human Disease Network dataset and allows intuitive knowledge discovery by mapping its complexity»

Eurostat 1700. Mobile data content: text messages Digital City Portraits (launch of 4G by EE)

«…Digital portrait for each city, formed from millions of bits of data as people talked and interacted about the biggest events of the day.»

«…time explodes outwards from the centre with each point representing one minute giving a possible 4320 points – the number of minutes in three days – to cover the day before, during and after the launch of 4G.»

«In the London image you can clearly see when Hurricane Sandy hit in New York, and even when Obama visited the city to inspect the damage.»

«It's also evident that only a day later hardly anybody was talking about the hurricane, showing the transient nature of social media, even for large global events.»

http://brendandawes.com/projects/ee

Eurostat 3121. Mobile phone location Urban Mobs

«This visualisation shows the number of SMS messages sent on the evening of the Fête de la Musique (21 June 2008). From 5 pm onwards there is strong activity around the Parc des Princes, which can be related to the Tokio Hotel concert that evening. Another hotspot of activity then appears at the Auteuil racecourse, corresponding to the concert organised by France 2»

http://www.urbanmobs.fr/fr/france/

Eurostat 31. Data from sensors LIVE Singapore!

«Making decisions in sync with the environment

LIVE Singapore! provides people with access to a range of useful real-time information about their city by developing an open platform for the collection, elaboration and distribution of real-time data that reflect urban activity. Giving people visual and tangible access to real-time information about their city enables them to take their decisions more in sync with their environment, with what is actually happening around them.»

https://www.youtube.com/watch?feature=player_embedded&v=2aEPkyOBtRo

Eurostat 312. Mobile sensors (tracking) San Francisco Transportation

«…data from the Muni (San Francisco Municipal Transportation Agency) showing the geographic coordinates of their vehicles to create this map showing average transit speeds over a 24-hour period. […]

Black lines represent very slow movement under 7 mph. Red are less than 19 mph. Blue are less than 43 mph. Green lines depict faster speeds above 43 mph.»

https://www.flickr.com/photos/walkingsf/4521616274/in/photostream/
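The colour encoding described in the caption above is a simple threshold mapping from speed to colour. A minimal sketch follows. The function name `speed_colour` is hypothetical, and the handling of exactly 43 mph is an assumption, since the caption only covers "less than" and "above" that value.

```python
def speed_colour(mph):
    """Map an average transit speed (mph) to the line colour used on the map."""
    if mph < 7:
        return "black"   # very slow movement
    if mph < 19:
        return "red"
    if mph < 43:
        return "blue"
    return "green"       # assumption: exactly 43 mph is treated as fast


for speed in (3, 12, 30, 50):
    print(f"{speed:>2} mph -> {speed_colour(speed)}")
```

Encoding a continuous variable into four colour classes trades precision for pre-attentive readability: slow corridors stand out without the reader decoding a scale.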

Eurostat Examples

http://blog.profitbricks.com/39-data-visualization-tools-for-big-data/
http://www.dailyinfographic.com/
http://www.visualisingdata.com/
http://www.visualcomplexity.com/vc/
http://exploringdata.github.io/

Eurostat Outline

Introduction Visual cognition process Graphic information processing Big Data visualization Storytelling

Eurostat Hints about Storytelling

“Narrative or recital of an event, or a series of events whether real or fictitious” New International Webster’s Comprehensive Dictionary (2013 edition)

“Programme to make the results of official statistics accessible and understandable to people and – in fulfilment of an information mandate – to make "evidence based decision making" possible.” Armin Grossenbacher, Federal Statistical Office, Storytelling revisited, 2010

Eurostat Storytelling principles

1) Gricean Maxims (P. Grice)

2) Pyramid principle (B. Minto)

3) Seven steps to storytelling (J. Lambert)

4) Scenario for combining data, model and stories (J. Koomey)

5) Five golden rules for statistics storytellers (D. Marder)

Eurostat Gricean Maxims

“Make your conversational contribution what is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged.” (P. Grice)

Grice’s conversational maxims

1. Make your contribution to the conversation as informative as necessary.
2. Do not make your contribution to the conversation more informative than necessary.

1. Do not say what you believe to be false.
2. Do not say that for which you lack adequate evidence.

Be relevant (that is, say things related to the current topic of conversation).

1. Avoid obscurity of expression.
2. Avoid ambiguity.
3. Be brief (avoid unnecessary wordiness).
4. Be orderly.

Eurostat Barbara Minto’s pyramid principle

The Situation is simply the state of affairs in your particular area. For example, your current growth rate or your product offering.

The Complication is what is changing in your field to make things more challenging: it’s the proverbial thorn in your side that you have to remove in order to make things run smoothly. This might be your new competition, or a lack of fresh prospects.

The Question states what the situation and complication are asking. For instance, how do I achieve double-digit growth with increased competition? Or another question: how do I reach out to the particular audience that I’ve targeted and get them to buy my product?

The Answer is your particularly inspired way of solving the problem you are presenting.

http://blog.kurtosys.com/storytelling-pyramid-principle/

Eurostat Seven steps to storytelling

Step 1: Owning Your Insights (Insights)
Step 2: Owning Your Emotions (Emotions)
Step 3: Finding The Moment (Decisive Moments)
Step 4: Seeing Your Story (Vision)
Step 5: Hearing Your Story (Narrative)
Step 6: Assembling Your Story (Editing)
Step 7: Sharing Your Story (Sharing)

Joe Lambert, DIGITAL STORYTELLING COOKBOOK – 2010, Digital Diner Press

Eurostat Scenario for combining data, model and stories

Turning Numbers Into Knowledge: Mastering the Art of Problem Solving - Jon Koomey

Eurostat Five golden rules for statistics storytellers

… five golden rules that statistical story writers often lose sight of:
• Write as people speak;
• Don’t just get to the point – start with it;
• Make every sentence relevant to the audience – what’s in it for them;
• Stay simple, but don’t patronise;
• Use only one idea per sentence.

David Marder, Office for National Statistics. The Holistic Approach to Statistical Story-Telling, 16 UNECE Work Session on Dissemination of Statistical Commentary (Geneva, 4-5 Dec. 2003).

Eurostat Storytelling with infographics (i.e.: 8 ways to build an effective infographic)

newspaper, flowchart, timeline, bait, comparison, numbers, photos, vision

http://www.howtostory.be/killer