Interpreting Power Laws
Lecture 10: Interpreting Power Laws: Deep Insight or Exaggerated Analogy?

Power laws highlight the importance of rare, hard-to-predict events: the 2008-2009 financial collapse; the surprise Brexit vote in the UK; the Louisiana Flood of 2016, when a "1,000-year" rain fell in 2 days; etc.

1) Nassim Taleb's black swans carry a message about what can happen if you assume life proceeds in the same manner as in the past. Each day humans feed and take good care of the turkey ... until Thanksgiving, when there is a big surprise: 46 million turkeys are eaten; another 22 million are eaten at Christmas and 19 million at Easter. Not much turkey is eaten at other times ---> power law. But in fact turkeys are raised, killed, and frozen in preparation for Thanksgiving, so farmers/firms actually adjust because they know when demand for turkeys is highest.

2) IndyMac, the seventh-largest mortgage originator in the US, until it collapsed on July 11, 2008.

3) Variation in prices of a derivatives portfolio of UK interest rates.

4) The 20th century's largest disasters, from the EM-DAT International Disaster Database of the Centre for Research on the Epidemiology of Disasters, which currently lists disasters from 1900 to 2015:

Country        Year  Type      Region      Continent  Deaths
NA             1917  Epidemic  NA          All        20,000,000
Soviet Union   1932  Famine    Russia.Fed  Europe      5,000,000
China, P Rep   1931  Flood     E.Asia      Asia        3,700,000
China, P Rep   1928  Drought   E.Asia      Asia        3,000,000
NA             1914  Epidemic  Rest.Europ  Europe      3,000,000
Soviet Union   1917  Epidemic  Russia.Fed  Europe      2,500,000
China, P Rep   1959  Flood     E.Asia      Asia        2,000,000
India          1920  Epidemic  S.Asia      Asia        2,000,000
Bangladesh     1943  Famine    S.Asia      Asia        1,900,000
China, P Rep   1909  Epidemic  E.Asia      Asia        1,500,000

Extreme Tail Events

The heavy tail of power-law distributions shows up in the moments of the distribution. Assume Y (frequency) = C S^(-a), so ln Y = ln C - a ln S. Smaller a's put greater weight on tail events. When a < 3 the 2nd moment does not exist -- there is no variance.
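The linear form ln Y = ln C - a ln S is what makes a power law easy to spot and estimate: on log-log axes it is a straight line with slope -a. A minimal Python sketch on exact power-law data (the values a = 2 and C = 1000 are illustrative choices, not from the lecture):

```python
import numpy as np

# Illustrative power law: Y = C * S^(-a), with a = 2 and C = 1000
a_true, C = 2.0, 1000.0
S = np.arange(1, 101, dtype=float)
Y = C * S ** (-a_true)

# On log-log axes the relation is linear: ln Y = ln C - a ln S.
# A degree-1 least-squares fit recovers -a as the slope and ln C as the intercept.
slope, intercept = np.polyfit(np.log(S), np.log(Y), 1)
print(-slope, np.exp(intercept))  # recovers a = 2 and C = 1000
```

With noisy real data the same regression gives an estimate of a, though OLS on binned log-log data is known to be a crude estimator compared to maximum likelihood.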
When a < 2 the first moment -- the mean -- also does not exist. You may calculate those statistics from finite data and will get a finite variance from ∑(X - mean X)²/N, but the true variance is infinite.

Why does a < 2 not give a mean? The mean is ∑ S · S^(-a) = ∑ S^(1-a). If a = 2 this is the harmonic series ∑ 1/S, which diverges. If a = 1 it is ∑ 1, which is infinite.

Since we are never at infinity, maybe this is not a problem. But infinite variance means high sensitivity of the empirical variance to the presence/absence of a small number of big events. Most studies find a to be between 2 and 3, but with a large SD. A small difference in a has a huge impact on the probability of an extreme outcome.

Consider the Pareto distribution. If 0 < k < 2 the variance is infinite, and if 0 < k < 1 the mean is infinite as well (you can do the calculus or see http://en.wikipedia.org/wiki/Pareto_distribution). In power-law form a = k + 1, where a is the power-law coefficient, so: if 1 < a < 3 the second moment is infinite, and if 1 < a < 2 the first moment is infinite.

What Generates a Power Law?

1) As the outcome of a statistical process, via a generalization of the Central Limit Theorem -- if the mean/variance exist and errors are iid, then as N → infinity the distribution is NORMAL -- to distributions that include infinite variance: stable distributions.

2) Through some model in which the interrelation among parts creates thick tails: STRUCTURE with feedback loops that react to random shocks in ways that produce a power law. The hope is to find simple and deep principles that underlie the regularities and obviate the need for details to understand economics.

3) Through optimizing behavior that brings the system to the "brink" of large changes.

Two camps: people who seek/sometimes find broad, robust, detail-free laws, and those who believe details matter.

1) STATISTICAL -- properties of power-law distributions as a stable distribution: a linear combination of two independently drawn copies of the variable has the same distribution.
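The stability property is easy to check numerically. For a standard Cauchy variable the average of two independent draws is again standard Cauchy, whereas averaging two normals shrinks the spread by √2. A sketch (sample size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x1 = rng.standard_cauchy(n)
x2 = rng.standard_cauchy(n)
avg = (x1 + x2) / 2       # for a stable law this has the SAME distribution as x1

def iqr(a):
    """Interquartile range -- a spread measure that exists even when the variance does not."""
    q1, q3 = np.percentile(a, [25, 75])
    return q3 - q1

# Standard Cauchy quartiles are -1 and +1, so the IQR is 2 -- both for a single
# draw and for the average of two draws. Averaging normals would cut it by sqrt(2).
print(iqr(x1), iqr(avg))
```

The IQR is used instead of the sample variance because, for the Cauchy, the sample variance never settles down, which is exactly the point of the previous paragraphs.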
Three closed-form representatives of stable distributions: the Normal, the Cauchy distribution, and the Levy distribution used in finance. The Cauchy is symmetric with such a thick tail that it has neither mean nor variance; the Levy is a non-negative variable -- not symmetric. Being STABLE distributions, all three are "attractors": if lots of random "stuff" happens, you end up with one of these distributions.

What it means to have a thick tail in a distribution: compare the Cauchy and the Levy with the Normal.

A stable distribution has four parameters: the key one is a (the tail index); a skewness parameter; a scale parameter; and a location parameter. A stable distribution with infinite variance is likely to show jumps, which fits the observation that many time series appear to exhibit "discontinuities (e.g., large jumps)" (Knight, 1997). This evidence plus the Generalized Central Limit Theorem justifies the use of stable models in finance and economics, where data are poorly described by the Gaussian model but well described by a stable distribution -- stock prices, for instance (Journal of Business & Economic Statistics, Vol. 8, No. 2 (Apr., 1990)).

2) Stochastic/proportionate growth plus some barrier/bound (% growth + bounds) generates a power law (associated with Herb Simon, who had a nasty debate with Mandelbrot over it).

Without barriers/bounds, stochastic growth gives the log-normal: random ln/% growth --> log-normal with variance σ². If the rate of growth is independent of initial size and the variance of growth is the same for all units -- Gibrat's law in economics -- this yields an equation for the growth of the firm (http://docentes.fe.unl.pt/~jmata/gibrat.pdf):

SIZE(t) = (1 + ε(t)) SIZE(t-1), where ε(t) is a random growth rate with variance σ², so ln SIZE(t) = ln SIZE(t-1) + ln(1 + ε(t)).

Need something to fatten the tails of the distribution and go beyond the log-normal: some lower bound/friction. Gibrat + lower bound --> Zipf. This is the STEADY-STATE distribution. The bound produces "reflected Brownian motion" -- originally shown by Champernowne for the income distribution.
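The Gibrat-plus-barrier mechanism can be sketched directly: every unit draws the same random percentage growth each period, but is not allowed to fall below a floor. A minimal simulation (the floor, volatility, horizon, and seed are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, s_min = 10_000, 500, 0.5    # units, periods, reflecting lower barrier

sizes = np.ones(n)
for _ in range(T):
    growth = rng.normal(loc=0.0, scale=0.1, size=n)  # same % growth law for every size (Gibrat)
    sizes = sizes * np.exp(growth)                   # multiplicative (proportionate) growth
    sizes = np.maximum(sizes, s_min)                 # reflect at the lower barrier

# Without the np.maximum line, log sizes follow a random walk and the
# cross-section is exactly log-normal; with the barrier, mass reflected off the
# floor feeds the upper tail, pushing the steady state toward a Zipf-like law.
print(sizes.min(), sizes.max())
```

Plotting log rank against log size for the barrier and no-barrier runs makes the contrast visible: the barrier run straightens out in the upper tail.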
The lower barrier creates "extra mass" that can add to the distribution at the tail.

Gabaix model for cities: cities of different sizes have the same growth rate with a constant variance. The positions of cities can change, but the distribution replicates itself. LA surpasses Chicago, and the number 2 city is proportionate (½ the size of the largest city) under Zipf's law with coefficient 1. You follow the average growth rate + a random component unless you are very small. If you are very small, you grow at 0 or at some positive value that depends on the average growth and the random shock. By moving density up from the bottom, you push the distribution toward fatter tails. Do all but the smallest cities really have the same growth rate with constant variance regardless of "policies"?

To see the mechanism, consider a fixed total population that distributes itself among cities. With the same % growth, larger cities have greater absolute growth. Thus, there must be more small cities to maintain the fixed population.

Example: p = % of cities that double every period; (1 - p) = % of cities that halve. This does not allow for variance in growth rates, but it shows how the rule produces the distribution. Scale the fixed population to 1. Then no NET growth requires:

2p + ½(1 - p) = 1

Solving, we get p = 1/3 → one-third of cities have size 2 (holding two-thirds of the population); two-thirds of cities have size ½ (holding one-third). There are twice as many small cities as large cities.

What about the next period, with the same process, when some cities halve their population and others double?
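The zero-net-growth condition is a one-line linear equation; a quick check of the arithmetic:

```python
# Conservation of (scaled) population: doubling cities (share p) and halving
# cities (share 1 - p) must leave the total unchanged: 2*p + 0.5*(1 - p) = 1.
# Rearranging: 1.5*p = 0.5, so p = 1/3.
p = (1.0 - 0.5) / (2.0 - 0.5)
pop_in_large = 2 * p          # population held by the size-2 cities
print(p, pop_in_large)        # 1/3 of cities are large; they hold 2/3 of the population
```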
City A of size 2 becomes 4, city B of size 2 becomes 1, etc.:

Rank by size   Size   Frequency
1              4      1/9
2              1      2/9
3              1      2/9
4              1/4    4/9

And so on for the next period. A stable distribution by size class after a long period of doubling/halving needs the same absolute changes in each class, which holds only if the size classes have the same total population -- this fits Zipf with bins:

Size class   # of cities   City size   Population in class
small        40            1           40
larger       20            2           40
big          10            4           40
biggest      5             8           40

3) PREFERENTIAL ATTACHMENT MODEL (http://ccl.northwestern.edu/netlogo/models/PreferentialAttachment)

The power-law story for web pages is that growth rates differ across web sites with differential attachment: new pages are more likely to attach to older/larger sites. A small number of older/larger sites will grow more rapidly than smaller/newer sites → power law. https://en.wikipedia.org/wiki/Preferential_attachment

This can also explain why paper citations show a power law. But one also needs to model the rates of entry and exit: a new site/paper enters and gathers some citations according to attachment, while some sites die off. The power law is presumed to hold for an internet with a mixture of older and younger firms. But the lifetime of sites at a moment in time is often exponential, so there are a few long-lived sites and many short-lived sites, though the difference is smaller than under a power law. The mixture of the exponential and the log-normal gives a power law.

Other ways to get a power law: as the inverse of a function that follows a power law; as combinations of exponentials; via the random-walk distribution of lifetime until ruin → lots of short lives, few long ones.

The different variants of preferential attachment suggest different processes, which would direct attention at different ways to affect the power-law distribution if, say, society viewed it as "too weighted" at the tails for some reason.
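The preferential-attachment story is easy to simulate: each new node links to an existing node with probability proportional to that node's current degree. A sketch of the standard Barabási-Albert mechanism with one link per new node (the network size and seed are arbitrary choices):

```python
import random
from collections import Counter

random.seed(42)
n = 5000

# endpoints holds every edge endpoint ever created; drawing uniformly from it
# selects a node with probability proportional to its degree -- the
# preferential-attachment rule, with no explicit probability computation.
endpoints = [0, 1]                 # start from a single edge between nodes 0 and 1
for new_node in range(2, n):
    target = random.choice(endpoints)
    endpoints += [new_node, target]

degree = Counter(endpoints)
degrees = sorted(degree.values(), reverse=True)
# A few early nodes accumulate most of the links: heavy-tailed degree distribution.
print(degrees[:5], degrees[len(degrees) // 2])
```

The typical node ends up with one or two links while the largest hubs collect on the order of √n, which is the qualitative signature of the power-law tail described above.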
4) LOCAL INTERACTIONS AND OPTIMIZATION --> SOC (self-organized criticality)

The system has a birth/death process that moves it to a border area where it is subject to the risk of major disruptions, producing a power law.