Complexity, Big Data Science, and Happiness
Discrete Days, St. Michael’s College, 2011

Peter Dodds
Department of Mathematics & Statistics
Center for Complex Systems
Vermont Advanced Computing Center
University of Vermont

Outline
- Complexity
- Introduction
- Emergence
- Universality
- Symmetry Breaking
- The Big Theory
- Revolution: Big Data & Complex Networks
- Nutshell
- Measuring Happiness
- Tweetage
- Mechanical Turk
- References

Definitions

A meaningful definition of a Complex System:
- Distributed, possibly networked system of many interrelated parts with no centralized control, exhibiting emergent behavior: ‘More is Different’ [2]

A few optional features:
- Nonlinear relationships
- Presence of feedback loops
- Being open or driven
- Presence of memory
- Modular (nested)/multiscale structure
- Opaque boundaries

Examples of Complex Systems:
- human societies
- cells
- organisms
- power systems
- weather systems
- ecosystems
- animal societies
- disease ecologies
- brains
- social insects
- geophysical systems
- the world wide web
- i.e., everything that’s interesting...

Relevant fields:
- Physics
- Economics
- Sociology
- Psychology
- Information Sciences
- Cognitive Sciences
- Biology
- Ecology
- Geosciences
- Geography
- Medical Sciences
- Systems Engineering
- Computer Science
- ...
- i.e., everything that’s interesting...

Complexity Manifesto:
1. Systems are ubiquitous and systems matter.
2. Consequently, much of science is about understanding how pieces dynamically fit together.
3. 1700 to 2000 = Golden Age of Reductionism.
   - Atoms!, sub-atomic particles, DNA, genes, people, ...
4. Understanding and creating systems (including new ‘atoms’) is the greater part of science and engineering.
5. Universality: systems with quantitatively different micro details exhibit qualitatively similar macro behavior.
6. Computing advances make the Science of Complexity possible:
   6.1 We can measure and record enormous amounts of data; research areas continue to transition from data scarce to data rich.
   6.2 We can simulate, model, and create complex systems in extraordinary detail.

Big Data Science:

"Data, Data, Everywhere", the Economist, Feb 25, 2010

- 2013: year traffic on the Internet is estimated to reach 2/3 Zettabytes (1 ZB = 10^3 EB = 10^6 PB = 10^9 TB)
- Large Hadron Collider: 40 TB/second.
- 2016: Large Synoptic Survey Telescope: 140 TB every 5 days.
- Facebook: ~100 billion photos
- Twitter: ~5 billion tweets

Exponential growth: ~60% per year.

No really, that’s a lot of data

Big Data: Culturomics

[Slide shows pages of a RESEARCH ARTICLE (www.sciencemag.org). Figure panels have axes labeled Frequency, Median frequency (% of peak value), and Year of invention. The recoverable text of the excerpt follows.]

... enter a regime marked by slower forgetting: Collective memory has both a short-term and a long-term component.

But there have been changes. The amplitude of the plots is rising every year: Precise dates are increasingly common. There is also a greater focus on the present. For instance, “1880” declined to half its peak value in 1912, a lag of 32 years. In contrast, “1973” declined to half its peak by 1983, a lag of only 10 years. We are forgetting our past faster with each passing year (Fig. 3A).

We were curious whether our increasing tendency to forget the old was accompanied by more rapid assimilation of the new (21). We divided a list of 147 inventions into time-resolved cohorts based on the 40-year interval in which they were first invented (1800–1840, 1840–1880, and 1880–1920) (7). We tracked the frequency of each invention in the nth year after it was invented as compared to its maximum value and plotted the median of these rescaled trajectories for each cohort.

The inventions from the earliest cohort (1800–1840) took over 66 years from invention to widespread impact (frequency >25% of peak). Since then, the cultural adoption of technology has become more rapid. The 1840–1880 invention cohort was widely adopted within 50 years; the 1880–1920 cohort within 27 (Fig. 3B and fig. S7).

“In the future, everyone will be famous for 7.5 minutes” - Whatshisname. People, too, rise to prominence, only to be forgotten (22). Fame can be tracked by measuring the frequency of a person’s name (Fig. 3C). We compared the rise to fame of the most famous people of different eras. We took all 740,000 people with entries in Wikipedia, removed cases where several famous individuals share a name, and sorted the rest by birth date and frequency (23). For every year from 1800 to 1950, we constructed a cohort consisting of the 50 most famous people born in that year. For example, the 1882 cohort includes “Virginia Woolf” and “Felix Frankfurter”; the 1946 cohort includes “Bill Clinton” and “Steven Spielberg”. We plotted the median frequency for the names in each cohort over time (Fig. 3, D and E). The resulting trajectories were all similar. Each cohort had a pre-celebrity period (median frequency <10^−9), followed by a rapid rise to prominence, a peak, and a slow decline. We therefore characterized each cohort using four parameters: (i) the age of initial celebrity, (ii) the doubling time of the initial rise, (iii) the age of peak celebrity, and (iv) the half-life of the decline (Fig. 3E). The age of peak celebrity has been consistent over time: about 75 years after birth. But the other parameters have been changing (fig. S8).

Fame comes sooner and rises faster. Between the early 19th century and the mid-20th century, the age of initial celebrity declined from 43 to 29 years, and the doubling time fell from 8.1 to 3.3 years. As a result, the most famous people alive today are more famous, in books, than their predecessors. Yet this fame is increasingly short-lived: The post-peak half-life dropped from 120 to 71 years during the 19th century.

We repeated this analysis with all 42,358 people in the databases of the Encyclopaedia Britannica (24), which reflect a process of expert curation that began in 1768. The results were similar (7) (fig. S9). Thus, people are getting more famous than ever before but are being forgotten more rapidly than ever.

Fig. 4. Culturomics can be used to detect censorship. (A) Usage frequency of “Marc Chagall” in German (red) as compared to English (blue). (B) Suppression of Leon Trotsky (blue), Grigory Zinoviev (green), and Lev Kamenev (red) in Russian texts, with noteworthy events indicated: Trotsky’s assassination (blue arrow), Zinoviev and Kamenev executed (red arrow), the Great Purge (red highlight), and perestroika (gray arrow).