
Complexity, Big Data Science, and Happiness

Discrete Days, St. Michael’s College, 2011

Peter Dodds
Department of Mathematics & Statistics
Center for Complex Systems
Vermont Advanced Computing Center
University of Vermont

Outline

Complexity
- Introduction
- Emergence
- Universality
- Symmetry Breaking
- The Big Theory

Revolution: Big Data & Complex Networks
- Nutshell

Measuring Happiness
- Tweetage
- Mechanical Turk

References

Definitions

A meaningful definition of a Complex System:

- A distributed, possibly networked system of many interrelated parts with no centralized control, exhibiting emergent behavior: ‘More is Different’ [2]

A few optional features:

- Nonlinear relationships
- Presence of feedback loops
- Being open or driven
- Presence of memory
- Modular (nested)/multiscale structure
- Opaque boundaries

Examples of Complex Systems:

- human societies
- cells
- organisms
- power systems
- weather systems
- ecosystems
- animal societies
- disease ecologies
- brains
- social insects
- geophysical systems
- the

- i.e., everything that’s interesting...

Relevant fields:

- Physics
- Economics
- Sociology
- Psychology
- Information Sciences
- Cognitive Sciences
- Biology
- Ecology
- Geosciences
- Geography
- Medical Sciences
- Systems Engineering
- Computer Science
- ...

- i.e., everything that’s interesting...

Complexity Manifesto:

1. Systems are ubiquitous and systems matter.
2. Consequently, much of science is about understanding how pieces dynamically fit together.
3. 1700 to 2000 = Golden Age of Reductionism.
   - Atoms!, sub-atomic particles, DNA, genes, people, ...
4. Understanding and creating systems (including new ‘atoms’) is the greater part of science and engineering.
5. Universality: systems with quantitatively different micro details exhibit qualitatively similar macro behavior.
6. Computing advances make the Science of Complexity possible:
   6.1 We can measure and record enormous amounts of data; research areas continue to transition from data scarce to data rich.
   6.2 We can simulate, model, and create complex systems in extraordinary detail.

Data, Data, Everywhere (the Economist, Feb 25, 2010)

Big Data Science:

- 2013: year traffic on the Internet is estimated to reach 2/3 Zettabytes (1 ZB = 10^3 EB = 10^6 PB = 10^9 TB)
- Large Hadron Collider: 40 TB/second.
- 2016, Large Synoptic Survey Telescope: 140 TB every 5 days.
- Facebook: ∼ 100 billion photos
- Twitter: ∼ 5 billion tweets

Exponential growth: ∼ 60% per year.
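To make the growth figure concrete, here is a minimal arithmetic sketch. It assumes the ∼ 60% figure is a steady, compounding annual rate (the slide does not say so), and the function names are hypothetical helpers, not anything from the talk.

```python
import math

# Unit ladder from the slide: 1 ZB = 10^3 EB = 10^6 PB = 10^9 TB.
TB_PER_ZB = 10**9


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a fixed fractional growth per year."""
    return math.log(2) / math.log(1 + annual_growth)


def growth_factor(annual_growth: float, years: float) -> float:
    """Total multiplicative growth after compounding for `years` years."""
    return (1 + annual_growth) ** years


if __name__ == "__main__":
    print(f"doubling time at 60%/yr: {doubling_time_years(0.60):.2f} years")
    print(f"growth over 10 years: x{growth_factor(0.60, 10):.0f}")
    # Assumption: the LHC rate is sustained for a full day (86,400 s).
    lhc_tb_per_day = 40 * 86_400
    print(f"LHC at 40 TB/s for one day: {lhc_tb_per_day / TB_PER_ZB:.4f} ZB")
```

Under that assumption, the volume of data doubles roughly every year and a half.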

No really, that’s a lot of data

Big Data: Culturomics

“Quantitative analysis of culture using millions of digitized books” by Michel et al., Science, 2011 [11]

[Figure: pages from Michel et al., Science, vol. 331, 14 January 2011. Fig. 3: “Cultural turnover is accelerating.” Fig. 4: “Culturomics can be used to detect censorship.”]

- http://www.culturomics.org/
- Google Books ngram viewer

Homo narrativus:

- Mechanisms = Evolution equations, algorithms, stories, ...
- Rollover zing: “Also, all financial analysis. And, more directly, D&D.”

http://xkcd.com/904/

Basic Science ≃ Describe + Explain:

Lord Kelvin (possibly):

- “To measure is to know.”
- “If you cannot measure it, you cannot improve it.”

Bonus:

- “X-rays will prove to be a hoax.”
- “There is nothing new to be discovered in physics now. All that remains is more and more precise measurement.”

Emergence:

Tornadoes, financial collapses, human emotion aren’t found in water molecules, dollar bills, or carbon atoms.

Examples:

- Fundamental particles → Life, the Universe, and Everything
- Genes → Organisms
- Brains → Thoughts
- People → The Web
- People → Religion
- People → Language, and rules in language (e.g., -ed, -s).
- ? → time; ? → gravity; ? → reality.

“The whole is more than the sum of its parts” (Aristotle)

Toast + Capers + Almonds = Something Different:


Emergence: Mechanism

Thomas Schelling (Economist/Nobelist):

- “Micromotives and Macrobehavior” [14]
  - Segregation
  - Wearing hockey helmets
  - Seating choices

[youtube]
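The segregation example can be illustrated with a toy simulation. This is a simplified grid variant written for illustration, not Schelling's original formulation: the wrap-around grid, the 8-cell neighbourhood, the random relocation rule, and every name and parameter below (`run`, `threshold`, `empty_frac`, ...) are assumptions.

```python
import random

EMPTY = 0


def make_grid(size, empty_frac, rng):
    """Random square grid of two agent types (1, 2) and empty cells (0)."""
    cells = []
    for _ in range(size * size):
        if rng.random() < empty_frac:
            cells.append(EMPTY)
        else:
            cells.append(rng.choice((1, 2)))
    return [cells[r * size:(r + 1) * size] for r in range(size)]


def like_fraction(grid, r, c):
    """Fraction of occupied neighbours (8 cells, wrapping) of the same type."""
    size = len(grid)
    me = grid[r][c]
    same = occupied = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            other = grid[(r + dr) % size][(c + dc) % size]
            if other != EMPTY:
                occupied += 1
                same += other == me
    return same / occupied if occupied else 1.0


def mean_similarity(grid):
    """Average like-neighbour fraction over all agents (the macro observable)."""
    size = len(grid)
    vals = [like_fraction(grid, r, c)
            for r in range(size) for c in range(size) if grid[r][c] != EMPTY]
    return sum(vals) / len(vals)


def sweep(grid, threshold, rng):
    """Move each unhappy agent to a random empty cell; return number moved."""
    size = len(grid)
    cells = [(r, c) for r in range(size) for c in range(size)]
    empties = [rc for rc in cells if grid[rc[0]][rc[1]] == EMPTY]
    agents = [rc for rc in cells if grid[rc[0]][rc[1]] != EMPTY]
    if not empties:
        return 0
    rng.shuffle(agents)
    moved = 0
    for r, c in agents:
        if like_fraction(grid, r, c) < threshold:
            i = rng.randrange(len(empties))
            er, ec = empties[i]
            grid[er][ec], grid[r][c] = grid[r][c], EMPTY
            empties[i] = (r, c)  # the vacated cell becomes available
            moved += 1
    return moved


def run(size=20, empty_frac=0.1, threshold=0.5, sweeps=50, seed=0):
    """Return mean similarity before and after the relocation dynamics."""
    rng = random.Random(seed)
    grid = make_grid(size, empty_frac, rng)
    before = mean_similarity(grid)
    for _ in range(sweeps):
        if sweep(grid, threshold, rng) == 0:
            break
    return before, mean_similarity(grid)


if __name__ == "__main__":
    before, after = run()
    print(f"mean like-neighbour fraction: {before:.2f} -> {after:.2f}")
```

The micromotive is mild (each agent only wants at least half of its neighbours to be alike), yet the macrobehavior is a grid noticeably more sorted than the random starting state.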

Reductionism

- Complex Systems enthusiasts often decry reductionist approaches . . .
- But reductionism seems to be misunderstood.
- Reductionist techniques can explain weak emergence (e.g., phase transitions).
- ‘A Miracle Occurs’ explains strong emergence.
- But: maybe miracle should be interpreted as an inscrutable yet real mechanism that cannot be simply described. Gulp.
- Listen to Steve Strogatz and Hod Lipson (Cornell) in the last piece on Radiolab’s show ‘Limits’ (51:40): http://blogs.wnyc.org/radiolab/2010/04/05/limits/

The emergence of taste:

- Molecules → Ingredients → Taste/Nutrition/Health
- See Michael Pollan’s article on nutritionism in the New York Times, January 28, 2007.

[Image: nytimes.com]

- See also: bumblebees.

Limits to what is possible:

Universality:

- The property that the macroscopic aspects of a system do not depend sensitively on the system’s details.
- Key figure: Leo Kadanoff.

Examples:

- The Central Limit Theorem:

$$ P(x; \mu, \sigma)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/2\sigma^2}\,dx. $$

- Nature of phase transitions in statistical mechanics.
- Navier-Stokes equations for fluids.
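The Central Limit Theorem example can be checked numerically. The sketch below assumes uniform random terms as the ‘micro detail’ (any distribution with finite variance would do); the function names are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mu, sigma):
    """The density P(x; mu, sigma) from the Central Limit Theorem slide."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def standardized_sums(n_terms, n_samples, rng):
    """Sums of n_terms uniform(0, 1) draws, shifted and scaled to mean 0, variance 1."""
    mu = n_terms * 0.5                  # mean of the sum
    sigma = math.sqrt(n_terms / 12.0)   # standard deviation of the sum
    return [(sum(rng.random() for _ in range(n_terms)) - mu) / sigma
            for _ in range(n_samples)]


def fraction_within(samples, half_width):
    """Fraction of samples with |x| <= half_width."""
    return sum(abs(s) <= half_width for s in samples) / len(samples)


if __name__ == "__main__":
    rng = random.Random(1)
    samples = standardized_sums(n_terms=30, n_samples=20_000, rng=rng)
    # For a standard Gaussian, about 68.27% of the mass lies within one sigma.
    print(f"fraction within 1 sigma: {fraction_within(samples, 1.0):.3f}")
    print(f"Gaussian density at the mean: {gaussian_pdf(0.0, 0.0, 1.0):.4f}")
```

The flat distribution of the individual terms leaves no visible trace in the shape of the standardized sum, which is the sense of universality used above.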

Fluid mechanics

- Fluid mechanics = One of the great successes of understanding complex systems.
- Navier-Stokes equations: micro-macro system evolution.
- The big three: Experiment + Theory + Simulations.
- Works for many very different ‘fluids’:
  - the atmosphere,
  - oceans,
  - blood,
  - galaxies,
  - the earth’s mantle...
  - and ball bearings on lattices...?

Lattice gas models

Collision rules in 2-d on a hexagonal lattice:

- Lattice matters... Only hexagonal lattice works in 2-d.
- No ‘good’ lattice in 3-d.
- Upshot: play with ‘particles’ of a system to obtain new or specific macro behaviours.

Hexagons: Honeycomb

- Orchestrated? Or an accident of bees working hard?
- See “On Growth and Form” by D’Arcy Wentworth Thompson. [16, 17]

Hexagons: Giant’s Causeway

[Images: http://newdesktopwallpapers.info, http://www.physics.utoronto.ca/]

Hexagons run amok:

- Graphene: single layer of carbon atoms in a perfect hexagonal lattice (super strong).
- Chicken wire ...

Whimsical but great example of real science:

“How Cats Lap: Water Uptake by Felis catus”, Reis et al., Science, 2010.

Amusing interview here

Symmetry Breaking

Philip Anderson, “More is Different,” Science, 1972 [2]

- Argues against idea that the only real scientists are those working on the fundamental laws.
- Symmetry breaking → different laws/rules at different scales...

(2006 study → “most creative physicist in the world”)

Symmetry Breaking

“Elementary entities of science X obey the laws of science Y”

X                                   Y
solid state or many-body physics    elementary particle physics
chemistry                           solid state many-body physics
molecular biology                   chemistry
cell biology                        molecular biology
...                                 ...
psychology                          physiology
social sciences                     psychology

Symmetry Breaking

Anderson:

[the more we know about] “fundamental laws, the less relevance they seem to have to the very real problems of the rest of science.”

Scale and complexity thwart the constructionist hypothesis.

Accidents of history and path dependence matter.

More is different:

http://xkcd.com/435/

A real science of complexity:

A real theory of everything anything:

1. Is not just about the ridiculously small stuff...
2. It’s about the increase of complexity

Symmetry breaking/Accidents of history vs. Universality

- Second law of thermodynamics: we’re toast in the long run.
- So how likely is the local complexification of structure we enjoy?
- How likely are the Big Transitions?

Complexification: the Big Transitions

- Big Bang.
- Big Randomness.
- Big Replicate.
- Big Life.
- Big Evolve.
- Big Word.
- Big Story.
- Big Number.
- Big God.
- Big Make.
- Big Science.
- Big Data.
- Big Information.
- Big Algorithm.
- Big Connection.
- Big Social.
- Big Awareness.


Ancestry:

From Keith Briggs’s excellent etymological investigation:

- Opus reticulatum:
- A Latin origin?

[http://serialconsign.com/2007/11/we-put-net-network]

Key Observation:

- Many complex systems can be viewed as complex networks of physical or abstract interactions.
- Opens door to mathematical and numerical analysis.
- Dominant approach of last decade of a theoretical-physics/stat-mechish/combinatorics flavor.
- Mindboggling amount of work published on complex networks since 1998...
- ... largely due to your typical theoretical physicist:
  - Piranha physicus
  - Hunt in packs.
  - Feast on new and interesting ideas (see chaos, cellular automata, ...)

More observations

- But surely networks aren’t new...
- Graph theory is well established...
- Study of social networks started in the 1930’s...
- So why all this ‘new’ research on networks?
- Answer (to repeat): Oodles of Easily Accessible Data.
- We can now inform (alas) our theories with a much more measurable reality.*
- Crucial observation: Real networks occupy a tiny, low entropy part of all network space and require specific attention.
- A central goal: establish mechanistic explanations.
- What kinds of dynamics lead to these real networks?

*If this is upsetting, maybe string theory is for you...

Popularity (according to ISI)

“Collective dynamics of ‘small-world’ networks” [20]

- Watts and Strogatz, Nature, 1998
- ≈ 4677 citations (as of January 18, 2011)
- Over 1100 citations in 2008 alone.

“Emergence of scaling in random networks” [3]

- Barabási and Albert, Science, 1999
- ≈ 5270 citations (as of January 18, 2011)
- Over 1100 citations in 2008 alone.

Models

1. generalized random networks:

- Arbitrary degree distribution $P_k$.
- Wire nodes together randomly.
- Create ensemble to test deviations from randomness.
- Interesting, applicable, rich mathematically, very important.
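One common way to ‘wire nodes together randomly’ for a given degree sequence is stub matching, often called the configuration model. The slides do not name a construction, so treat this as an assumed illustration; the function names are hypothetical, and self-loops and repeated edges are allowed for simplicity.

```python
import random


def configuration_model(degrees, rng):
    """Pair up edge 'stubs' uniformly at random for a given degree sequence.

    Node i contributes degrees[i] stubs. Self-loops and multi-edges can occur.
    """
    if sum(degrees) % 2:
        raise ValueError("degree sum must be even")
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    # Consecutive entries of the shuffled stub list form the edges.
    return list(zip(stubs[0::2], stubs[1::2]))


def degree_counts(edges, n_nodes):
    """Degree of each node implied by an edge list (a self-loop counts twice)."""
    deg = [0] * n_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


if __name__ == "__main__":
    degrees = [3, 2, 2, 1, 1, 1]
    # An ensemble: many random wirings that all share one degree sequence.
    ensemble = [configuration_model(degrees, random.Random(seed)) for seed in range(5)]
    for edges in ensemble:
        print(edges)
```

Comparing a measured network against such an ensemble shows which of its features go beyond what the degree distribution alone would produce.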

2. ‘scale-free networks’:

- Introduced by Barabási and Albert [3]
- Generative, mechanistic model
- Ancestry: Herbert Simon’s model for Zipf’s law [15]
- Preferential attachment model with growth:
- $P[\text{attachment to node } i] \propto k_i^{\alpha}$.
- Produces $P_k \sim k^{-\gamma}$ when $\alpha = 1$.
- Trickiness: other models generate skewed degree distributions.

[Figure: example network with $\gamma = 2.5$, $\langle k \rangle = 1.8$, $N = 150$]
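The growth rule above can be sketched in a few lines. This is a minimal version in which each new node adds exactly one edge (an assumption, so the result is a tree); the function name and the single-edge seed network are hypothetical choices.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, alpha, rng):
    """Grow a network in which each new node links to one existing node i,
    chosen with probability proportional to k_i ** alpha."""
    degree = [1, 1]        # seed network: a single edge between nodes 0 and 1
    edges = [(0, 1)]
    for new in range(2, n_nodes):
        weights = [k ** alpha for k in degree]
        target = rng.choices(range(new), weights=weights)[0]
        edges.append((new, target))
        degree[target] += 1
        degree.append(1)   # the newcomer arrives with degree 1
    return degree, edges


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (0.0, 1.0):
        degree, _ = preferential_attachment(2000, alpha, rng)
        histogram = Counter(degree)
        print(f"alpha={alpha}: max degree {max(degree)}, "
              f"nodes with degree 1: {histogram[1]}")
```

Setting `alpha = 0` gives uniform attachment, which is a convenient baseline for seeing how much of the skew in the degree distribution comes from the preferential rule.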

3. small-world networks

- Introduced by Watts and Strogatz [20]

Two scales:

- local regularity (an individual’s friends know each other)
- global randomness (shortcuts).

- Shortcuts allow disease to jump
- Number of infectives increases exponentially in time
- Facilitates synchronization
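The effect of shortcuts on distances can be sketched as follows. The ring-plus-rewiring construction follows the general idea of the Watts and Strogatz model, but the details here (how a new endpoint is chosen, averaging only over reachable pairs) are simplifying assumptions, and the function names are hypothetical.

```python
import random
from collections import deque


def watts_strogatz(n, k, p, rng):
    """Ring of n nodes, each linked to its k nearest neighbours;
    each edge is rewired to a random new endpoint with probability p."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    for i in range(n):
        for j in range(1, k // 2 + 1):
            old = (i + j) % n
            if old in adj[i] and rng.random() < p:
                candidates = [v for v in range(n) if v != i and v not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def mean_path_length(adj):
    """Average breadth-first-search distance over all reachable ordered pairs."""
    total = pairs = 0
    for source in range(len(adj)):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    for p in (0.0, 0.01, 0.1):
        graph = watts_strogatz(n=100, k=4, p=p, rng=random.Random(0))
        print(f"p={p}: mean path length {mean_path_length(graph):.2f}")
```

Even a small rewiring probability leaves most local links intact while shortening typical distances, which is why shortcuts let a disease jump between otherwise distant neighbourhoods.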

Popularity according to books:

- Linked: How Everything Is Connected to Everything Else and What It Means, by Albert-Laszlo Barabási
- Six Degrees: The Science of a Connected Age, by Duncan Watts [19]

More observations

- Web-scale data sets can be overly exciting.

Witness:

- The End of Theory: The Data Deluge Makes the Scientific Theory Obsolete (Anderson, Wired)
- “The Unreasonable Effectiveness of Data,” Halevy et al. [9]
- cf. Wigner’s “The Unreasonable Effectiveness of Mathematics in the Natural Sciences” [21]

But:

- For scientists, description is only part of the battle.
- We still need to understand.

Examples

What passes for a complex network?

- Complex networks are large (in node number)
- Complex networks are sparse (low edge to node ratio)
- Complex networks are usually dynamic and evolving
- Complex networks can be social, economic, natural, informational, abstract, ...

Examples

Physical networks

I River networks
I The Internet
I Neural networks
I Road networks
I Trees and leaves
I Power grids
I Blood networks

I Distribution (branching) versus redistribution (cyclical)

Examples

Interaction networks

I The
I Biochemical networks
I Gene-protein networks
I Food webs: who eats whom
I The World Wide Web (?)
I Airline networks
I Call networks (AT&T)
I The Media

datamining.typepad.com

Dynamic networks: Server security

Serving one html page with an image:

I Map of system calls made by a Linux server running Apache and a Windows server running IIS. Which is which?

Taken from http://www.visualcomplexity.com

Examples

Interaction networks: social networks

I Snogging
I Friendships
I Acquaintances
I Boards and directors
I Organizations
I twitter.com, facebook.com

(Bearman et al., 2004)

I ‘Remotely sensed’ by: tweets (open), Facebook posts, emails, phone logs (*cough*).

Examples

Relational networks

I Consumer purchases (Wal-Mart: ≈ 2.5 petabytes = 2.5 × 10^15 bytes)
I Thesauri: Networks of words generated by meanings
I Knowledge/Databases/Ideas
I Metadata (tagging): delicious, flickr

[Figure: screenshot of the del.icio.us history page for http://en.wikipedia.org/wiki/Main_Page, dated 11/14/2006, showing the common tags cloud and user notes; the URL had been saved by 16283 people.]

Clickworthy Science:

[Figure: map of science from Bollen et al. [5].]

A notable feature of large-scale networks:

I Graphical renderings are often just a big mess.

⇐ Typical hairball

I number of nodes N = 500
I number of edges m = 1000
I average degree ⟨k⟩ = 4

I And even when renderings somehow look good: “That is a very graphic analogy which aids understanding wonderfully while being, strictly speaking, wrong in every possible way” said Ponder [Stibbons] —Making Money, T. Pratchett.

I We need to extract digestible, meaningful aspects.

Properties

Some key aspects of real complex networks:

I degree distribution P_k ∗
I assortativity
I homophily
I clustering
I motifs
I modularity
I concurrency
I hierarchical scaling
I network distances
I centrality
I efficiency
I robustness

I Plus coevolution of network structure and processes on networks.

∗ Degree distribution is the elephant in the room that we are now all very aware of...
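The hairball numbers follow from ⟨k⟩ = 2m/N, since every edge adds one to the degree of two nodes: 2 × 1000 / 500 = 4. Here is a minimal sketch of the first property in the list, the degree distribution P_k. It assumes edges placed uniformly at random, because the talk does not say how the hairball was generated; `random_graph` and `degree_distribution` are illustrative names.

```python
import random
from collections import Counter


def random_graph(n, m, rng):
    """Pick m distinct undirected edges uniformly at random (assumed model)."""
    edges = set()
    while len(edges) < m:
        i, j = rng.sample(range(n), 2)
        edges.add((min(i, j), max(i, j)))
    return edges


def degree_distribution(n, edges):
    """Return (average degree, P_k), with P_k the fraction of nodes of degree k."""
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    counts = Counter(degree)
    p_k = {k: counts[k] / n for k in sorted(counts)}
    return sum(degree) / n, p_k


edges = random_graph(500, 1000, random.Random(0))
avg_k, p_k = degree_distribution(500, edges)
print(avg_k)  # 4.0, since 2 * 1000 / 500 = 4
for k, frac in p_k.items():
    print(k, frac)
```

The average is fixed by N and m alone; the shape of P_k is what distinguishes one kind of network from another.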

Nutshell:

Overview Key Points:

I The field of complex networks came into existence in the late 1990s.
I Explosion of papers and interest since 1998/99.
I Hardened up much thinking about complex systems.
I Specific focus on networks that are large-scale, sparse, natural or man-made, evolving and dynamic, and (crucially) measurable.

I Three main (blurred) categories: 1. Physical (e.g., river networks), 2. Interactional (e.g., social networks), 3. Abstract (e.g., thesauri).

Nutshell:

Overview Key Points (cont.):

I Obvious connections with the vast extant field of graph theory.
I But focus on dynamics is more of a physics/stat-mech/comp-sci flavor.
I Two main areas of focus:
1. Description: Characterizing very large networks
2. Explanation: Micro story → Macro features

I Some essential structural aspects are understood: degree distribution, clustering, assortativity, group structure, overall structure,...

I Still much work to be done, especially with respect to dynamics... exciting!

Bonus materials:

Graduate Course Websites:

I Principles of Complex Systems, University of Vermont
I Complex Networks, University of Vermont

Textbooks:

I David Easley and Jon Kleinberg (Economics and Computer Science, Cornell), “Networks, Crowds, and Markets: Reasoning About a Highly Connected World”
I Mark Newman (Physics, Michigan), “Networks: An Introduction”

Bonus materials:

Review articles:

I S. Boccaletti et al., “Complex networks: structure and dynamics” [4]. Times cited: 1,028 (as of June 7, 2010)
I M. Newman, “The structure and function of complex networks” [12]. Times cited: 2,559 (as of June 7, 2010)
I R. Albert and A.-L. Barabási, “Statistical mechanics of complex networks” [1]. Times cited: 3,995 (as of June 7, 2010)

Bonus materials:

I Complex Social Networks, by F. Vega-Redondo [18]
I Fractal River Basins: Chance and Self-Organization, by I. Rodríguez-Iturbe and A. Rinaldo [13]
I Random Graph Dynamics, by R. Durrett
I Scale-Free Networks, by Guido Caldarelli
I Evolution and Structure of the Internet: A Statistical Physics Approach, by Romualdo Pastor-Satorras and Alessandro Vespignani
I Complex Graphs and Networks, by Fan Chung
I Social Network Analysis, by Stanley Wasserman and Kathleen Faust
I Handbook of Graphs and Networks, Eds: Stefan Bornholdt and H. G. Schuster [6]
I Evolution of Networks, by S. N. Dorogovtsev and J. F. F. Mendes [8]

The Team:

1. People: Thanks to ...

Kameron Harris, Isabel Kloumann, Catherine Bliss, Chris Danforth

Jonathan Harris & Sep Kamvar, wefeelfine.org

2. Machines:

I 3000 processors + storage at the Vermont Advanced Computing Center
I 40 TB of storage in Danforth’s office.

3. Support:

NSF and NASA.

Happiness:

I Bentham: hedonistic calculus
I Jefferson: . . . the pursuit of happiness
I Socrates et al.: eudaimonia [10]

Early drafts:

Twitter: living in the now

[Figure: count fraction versus hour of day (local time) for the words “breakfast”, “lunch”, and “dinner”.]

[Figure: count fraction versus hour of day (local time) for the words “hungry”, “starving”, “food”, and “eat”.]

[Figure: count (%) versus hour of day (local time) for a few words you can’t say on television.]

Twitter: overall time series

[Figure, three panels versus date (September 2008 to May 2011). A: average happiness havg, roughly 5.9 to 6.4, with points marked by day of week and outlying dates labeled (for example 12/25, 12/24, 12/31, 01/01, 02/14, 07/04, 10/31). B: Simpson lexical size N_S. C: word count (×10^7).]

[Figure: word shift for Tcomp: Monday, 2008/09/29 (havg=5.95) against Tref: 7 days before and after (havg=6.00). Top-ranked word: “bailout” (−↑). Balance: −165 : +65. Horizontal axis: per word average happiness shift δhavg,r (%); vertical axis: word rank r.]

[Figure: word shift for Tcomp: Friday, 2011/04/29 (havg=6.04) against Tref: 7 days before and after (havg=5.98). Top-ranked word: “wedding” (+↑). Balance: −68 : +168. Same axes.]

[Figure: word shift for Tcomp: Saturdays (havg=6.06) against Tref: Tuesdays (havg=6.03). Top-ranked word: “love” (+↑). Balance: −87 : +187. Same axes. Insets: havg by day of week, 2009−05−21 to 2010−12−31.]
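The word shift figures rank words by how much each one moves average happiness between a reference text Tref and a comparison text Tcomp. The sketch below assumes the definition used in the Hedonometrics and Twitter paper listed under “For more...”: word i contributes 100 × (h(w_i) − havg,ref) × (p_i,comp − p_i,ref) / |havg,comp − havg,ref| percent. The +/− mark then says whether the word is happier or sadder than the reference average, and ↑/↓ whether it is used more or less often in Tcomp. Valence values are taken from the word tables in this talk; the counts and function names are hypothetical.

```python
def normalize(counts, valence):
    """Turn raw counts into frequencies, keeping only words with a valence score."""
    scored = {w: c for w, c in counts.items() if w in valence}
    total = sum(scored.values())
    return {w: c / total for w, c in scored.items()}


def average_happiness(freqs, valence):
    """Frequency-weighted mean valence of a text."""
    return sum(valence[w] * p for w, p in freqs.items())


def word_shift(ref_counts, comp_counts, valence):
    """Return (h_ref, h_comp, per-word shifts in percent, largest magnitude first)."""
    p_ref = normalize(ref_counts, valence)
    p_comp = normalize(comp_counts, valence)
    h_ref = average_happiness(p_ref, valence)
    h_comp = average_happiness(p_comp, valence)
    gap = abs(h_comp - h_ref)
    if gap == 0:
        raise ValueError("texts have identical average happiness")
    shifts = {}
    for w in set(p_ref) | set(p_comp):
        dp = p_comp.get(w, 0.0) - p_ref.get(w, 0.0)
        shifts[w] = 100.0 * (valence[w] - h_ref) * dp / gap
    ranked = sorted(shifts.items(), key=lambda item: abs(item[1]), reverse=True)
    return h_ref, h_comp, ranked


# Valence values from the tables in this talk; counts are made up for illustration.
valence = {"love": 8.42, "happy": 8.30, "weekend": 8.00, "war": 1.80, "killed": 1.56}
ref_counts = {"love": 10, "happy": 10, "war": 5, "killed": 5}
comp_counts = {"love": 15, "happy": 10, "weekend": 5, "war": 3, "killed": 2}

h_ref, h_comp, ranked = word_shift(ref_counts, comp_counts, valence)
for word, shift in ranked:
    print(word, round(shift, 1))
```

Because both frequency vectors sum to one, the per-word shifts add up to +100 when Tcomp is happier than Tref and to −100 when it is sadder, which matches the balance totals printed on the figures.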

Words with the highest valence:

valence rank  word             valence  std dev  twitter rank  g-books rank  nyt rank  lyrics rank
1             laughter         8.50     0.93     3600          –             –         1728
2             happiness        8.44     0.97     1853          2458          –         1230
3             love             8.42     1.11     25            317           328       23
4             happy            8.30     0.99     65            1372          1313      375
5             laughed          8.26     1.16     3334          3542          –         2332
6             laugh            8.22     1.37     1002          3998          4488      647
7             laughing         8.20     1.11     1579          –             –         1122
8             excellent        8.18     1.10     1496          1756          3155      –
9             laughs           8.18     1.16     3554          –             –         2856
10            joy              8.16     1.06     988           2336          2723      809
11            successful       8.16     1.08     2176          1198          1565      –
12            win              8.12     1.08     154           3031          776       694
13            rainbow          8.10     0.99     2726          –             –         1723
14            smile            8.10     1.02     925           2666          2898      349
15            won              8.10     1.22     810           1167          439       1493
16            pleasure         8.08     0.97     1497          1526          4253      1398
17            smiled           8.08     1.07     –             3537          –         2248
18            rainbows         8.06     1.36     –             –             –         4216
19            winning          8.04     1.05     1876          –             1426      3646
20            celebration      8.02     1.53     3306          –             2762      4070
21            enjoyed          8.02     1.53     1530          2908          3502      –
22            healthy          8.02     1.06     1393          3200          3292      4619
23            music            8.02     1.12     132           875           167       374
24            celebrating      8.00     1.14     2550          –             –         –
25            congratulations  8.00     1.63     2246          –             –         –
26            weekend          8.00     1.29     317           –             833       2256
27            celebrate        7.98     1.15     1606          –             3574      2108
28            comedy           7.98     1.15     1444          –             2566      –
29            jokes            7.98     0.98     2812          –             –         3808
30            rich             7.98     1.32     1625          1221          1469      890
...

Words with the lowest valence:

valence rank  word             valence  std dev  twitter rank  g-books rank  nyt rank  lyrics rank
...
10193         violence         1.86     1.05     4299          1724          1238      2016
10194         cruel            1.84     1.15     2963          –             –         1447
10195         cry              1.84     1.28     1028          3075          –         226
10196         failed           1.84     1.00     2645          1618          1276      2920
10197         sickness         1.84     1.18     4735          –             –         3782
10198         abused           1.83     1.31     –             –             –         4589
10199         tortured         1.82     1.42     –             –             –         4693
10200         fatal            1.80     1.53     –             4089          –         3724
10201         killings         1.80     1.54     –             –             4914      –
10202         murdered         1.80     1.63     –             –             –         4796
10203         war              1.80     1.41     468           175           291       462
10204         kills            1.78     1.23     2459          –             –         2857
10205         jail             1.76     1.02     1642          –             2573      1619
10206         terror           1.76     1.00     4625          4117          4048      2370
10207         die              1.74     1.19     418           730           2605      143
10208         killing          1.70     1.36     1507          4428          1672      998
10209         arrested         1.64     1.01     2435          4474          1435      –
10210         deaths           1.64     1.14     –             –             2974      –
10211         raped            1.64     1.43     –             –             –         4528
10212         torture          1.58     1.05     3175          –             –         3126
10213         died             1.56     1.20     1223          866           208       826
10214         kill             1.56     1.05     798           2727          2572      430
10215         killed           1.56     1.23     1137          1603          814       1273
10216         cancer           1.54     1.07     946           1884          796       3802
10217         death            1.54     1.28     509           307           373       433
10218         murder           1.48     1.01     2762          3110          1541      1059
10219         terrorism        1.48     0.91     –             –             3192      –
10220         rape             1.44     0.79     3133          –             4115      2977
10221         suicide          1.30     0.84     2124          4707          3319      2107
10222         terrorist        1.30     0.91     3576          –             3026      –

Words with the largest standard deviation:

std dev rank  word             valence  std dev  twitter rank  g-books rank  nyt rank  lyrics rank
1             fE@king          4.64     2.93     448           –             –         620
2             f&&kin           3.86     2.74     1077          –             –         688
3             f&&ked           3.56     2.71     1840          –             –         904
4             pussy            4.80     2.66     2019          –             –         949
5             whiskey          5.72     2.64     –             –             –         2208
6             slut             3.57     2.63     –             –             –         4071
7             cigarettes       3.31     2.60     –             –             –         3279
8             f&&k             4.14     2.58     322           –             –         185
9             mortality        4.38     2.55     –             3960          –         –
10            cigarette        3.09     2.52     –             –             –         2678
11            motherf&&kers    2.51     2.47     –             –             –         1466
12            churches         5.70     2.46     –             2281          –         –
13            motherf&&king    2.64     2.46     –             –             –         2910
14            capitalism       5.16     2.45     –             4648          –         –
15            porn             4.18     2.43     1801          –             –         –
16            summer           6.40     2.39     896           1226          721       590
17            beer             5.92     2.39     839           4924          3960      1413
18            execution        3.10     2.39     –             2975          –         –
19            wines            6.28     2.37     –             –             3316      –
20            zombies          4.00     2.37     4708          –             –         –
21            aids             4.28     2.35     2983          3996          1197      –
22            capitalist       4.84     2.34     –             4694          –         –
23            revenge          3.71     2.34     –             –             –         2766
24            mcdonalds        5.98     2.33     3831          –             –         –
25            beatles          6.44     2.33     3797          –             –         –
26            islam            4.68     2.33     –             4514          –         –
27            pay              5.30     2.32     627           769           460       499
28            alcohol          5.20     2.32     2787          2617          3752      3600
29            muthaf&&kin      3.00     2.31     –             –             –         4107
30            christ           6.16     2.31     2509          909           4238      1526
...

Positive bias in the English language:

[Figure: distribution N (vertical axis, 0 to 0.15) of word average happiness havg (horizontal axis, 1 to 9).]
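Given per-word valence scores like those tabulated above, the happiness of a text can be estimated as the frequency-weighted mean valence of the scored words it contains. The following is a minimal sketch under that assumption; the tiny `VALENCE` dictionary copies a few rows from the tables, while the tokenizer and the function name are illustrative rather than the authors' pipeline.

```python
import re
from collections import Counter

# A handful of rows from the valence tables; the full list has 10222 words.
VALENCE = {
    "laughter": 8.50, "love": 8.42, "happy": 8.30, "music": 8.02,
    "weekend": 8.00, "war": 1.80, "die": 1.74, "cancer": 1.54, "death": 1.54,
}


def text_happiness(text, valence=VALENCE):
    """Mean valence over scored words, weighted by how often each occurs."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in valence)
    total = sum(counts.values())
    if total == 0:
        return None  # no scored words, so no estimate
    return sum(valence[w] * c for w, c in counts.items()) / total


print(text_happiness("Love and music all weekend"))  # (8.42 + 8.02 + 8.00) / 3
print(text_happiness("war and love"))                # (1.80 + 8.42) / 2
```

Words without a score are simply skipped, so the estimate is only meaningful for texts large enough to contain many scored words.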

For more...

I PSD, KDH, IMK, CAB, and CMD, “Temporal patterns of happiness and information in a global social network: Hedonometrics and Twitter.” http://arxiv.org/abs/1101.5120
I P. S. Dodds and C. M. Danforth, “Measuring the Happiness of Large-Scale Written Expression: Songs, Blogs, and Presidents.” [7] Journal of Happiness Studies, 2009.
I http://www.uvm.edu/~pdodds/research/
I http://www.onehappybird.com
I “Does a Nation’s Mood Lurk in Its Songs and Blogs?” by Benedict Carey, New York Times, August 2009.

References

[1] R. Albert and A.-L. Barabási. Statistical mechanics of complex networks. Rev. Mod. Phys., 74:47–97, 2002.

[2] P. W. Anderson. More is different. Science, 177(4047):393–396, 1972.

[3] A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509–511, 1999.

[4] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang. Complex networks: Structure and dynamics. Physics Reports, 424:175–308, 2006.

[5] J. Bollen, H. Van de Sompel, A. Hagberg, L. Bettencourt, R. Chute, M. A. Rodriguez, and L. Balakireva. Clickstream data yields high-resolution maps of science. PLoS ONE, 4:e4803, 2009.

[6] S. Bornholdt and H. G. Schuster, editors. Handbook of Graphs and Networks. Wiley-VCH, Berlin, 2003.

[7] P. S. Dodds and C. M. Danforth. Measuring the happiness of large-scale written expression: Songs, blogs, and presidents. Journal of Happiness Studies, 2009. doi:10.1007/s10902-009-9150-9.

[8] S. N. Dorogovtsev and J. F. F. Mendes. Evolution of Networks. Oxford University Press, Oxford, UK, 2003.

[9] A. Halevy, P. Norvig, and F. Pereira. The unreasonable effectiveness of data. IEEE Intelligent Systems, 24:8–12, 2009.

[10] W. T. Jones. The Classical Mind. Harcourt, Brace, Jovanovich, New York, 1970.

[11] J.-B. Michel, Y. K. Shen, A. P. Aiden, A. Veres, M. K. Gray, The Google Books Team, J. P. Pickett, D. Hoiberg, D. Clancy, P. Norvig, J. Orwant, S. Pinker, M. A. Nowak, and E. A. Lieberman. Quantitative analysis of culture using millions of digitized books. Science, 331:176–182, 2011.

[12] M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45(2):167–256, 2003.

[13] I. Rodríguez-Iturbe and A. Rinaldo. Fractal River Basins: Chance and Self-Organization. Cambridge University Press, Cambridge, UK, 1997.

[14] T. C. Schelling. Micromotives and Macrobehavior. Norton, New York, 1978.

[15] H. A. Simon. On a class of skew distribution functions. Biometrika, 42:425–440, 1955.

[16] D. W. Thompson. On Growth and Form. Cambridge University Press, Great Britain, 2nd edition, 1952.

[17] D. W. Thompson. On Growth and Form, Abridged Edition. Cambridge University Press, Great Britain, 1961.

[18] F. Vega-Redondo. Complex Social Networks. Cambridge University Press, 2007.

[19] D. J. Watts. Six Degrees. Norton, New York, 2003.

[20] D. J. Watts and S. J. Strogatz. Collective dynamics of ‘small-world’ networks. Nature, 393:440–442, 1998.

[21] E. Wigner. The unreasonable effectiveness of mathematics in the natural sciences. Communications on Pure and Applied Mathematics, 13:1–14, 1960.