So, You Think You Have a Power Law, Do You? Well Isn't That Special?
Total Page:16
File Type:pdf, Size:1020Kb
So, You Think You Have a Power Law, Do You? Well Isn't That Special? Cosma Shalizi Statistics Department, Carnegie Mellon University Santa Fe Institute 18 October 2010, NY Machine Learning Meetup 1 Everything good in the talk I owe to my co-authors, Aaron Clauset and Mark Newman 2 Power laws, p(x) / x−α, are cool, but not that cool 3 Most of the studies claiming to nd them use unreliable 19th century methods, and have no value as evidence either way 4 Reliable methods exist, and need only very straightforward mid-20th century statistics 5 Using reliable methods, lots of the claimed power laws disappear, or are at best not proven You are now free to tune me out and turn on social media Power Laws: What? So What? Bad Practices Better Practices No Really, So What? References Summary Cosma Shalizi So, You Think You Have a Power Law? 2 Power laws, p(x) / x−α, are cool, but not that cool 3 Most of the studies claiming to nd them use unreliable 19th century methods, and have no value as evidence either way 4 Reliable methods exist, and need only very straightforward mid-20th century statistics 5 Using reliable methods, lots of the claimed power laws disappear, or are at best not proven You are now free to tune me out and turn on social media Power Laws: What? So What? Bad Practices Better Practices No Really, So What? References Summary 1 Everything good in the talk I owe to my co-authors, Aaron Clauset and Mark Newman Cosma Shalizi So, You Think You Have a Power Law? 3 Most of the studies claiming to nd them use unreliable 19th century methods, and have no value as evidence either way 4 Reliable methods exist, and need only very straightforward mid-20th century statistics 5 Using reliable methods, lots of the claimed power laws disappear, or are at best not proven You are now free to tune me out and turn on social media Power Laws: What? So What? Bad Practices Better Practices No Really, So What? References Summary 1 Everything good in the talk I owe to my co-authors, Aaron Clauset and Mark Newman 2 Power laws, p(x) / x−α, are cool, but not that cool Cosma Shalizi So, You Think You Have a Power Law? 4 Reliable methods exist, and need only very straightforward mid-20th century statistics 5 Using reliable methods, lots of the claimed power laws disappear, or are at best not proven You are now free to tune me out and turn on social media Power Laws: What? So What? Bad Practices Better Practices No Really, So What? References Summary 1 Everything good in the talk I owe to my co-authors, Aaron Clauset and Mark Newman 2 Power laws, p(x) / x−α, are cool, but not that cool 3 Most of the studies claiming to nd them use unreliable 19th century methods, and have no value as evidence either way Cosma Shalizi So, You Think You Have a Power Law? 5 Using reliable methods, lots of the claimed power laws disappear, or are at best not proven You are now free to tune me out and turn on social media Power Laws: What? So What? Bad Practices Better Practices No Really, So What? References Summary 1 Everything good in the talk I owe to my co-authors, Aaron Clauset and Mark Newman 2 Power laws, p(x) / x−α, are cool, but not that cool 3 Most of the studies claiming to nd them use unreliable 19th century methods, and have no value as evidence either way 4 Reliable methods exist, and need only very straightforward mid-20th century statistics Cosma Shalizi So, You Think You Have a Power Law? You are now free to tune me out and turn on social media Power Laws: What? So What? Bad Practices Better Practices No Really, So What? References Summary 1 Everything good in the talk I owe to my co-authors, Aaron Clauset and Mark Newman 2 Power laws, p(x) / x−α, are cool, but not that cool 3 Most of the studies claiming to nd them use unreliable 19th century methods, and have no value as evidence either way 4 Reliable methods exist, and need only very straightforward mid-20th century statistics 5 Using reliable methods, lots of the claimed power laws disappear, or are at best not proven Cosma Shalizi So, You Think You Have a Power Law? Power Laws: What? So What? Bad Practices Better Practices No Really, So What? References Summary 1 Everything good in the talk I owe to my co-authors, Aaron Clauset and Mark Newman 2 Power laws, p(x) / x−α, are cool, but not that cool 3 Most of the studies claiming to nd them use unreliable 19th century methods, and have no value as evidence either way 4 Reliable methods exist, and need only very straightforward mid-20th century statistics 5 Using reliable methods, lots of the claimed power laws disappear, or are at best not proven You are now free to tune me out and turn on social media Cosma Shalizi So, You Think You Have a Power Law? Pareto (continuous), Zipf or zeta (discrete) Explicitly: α − 1 x −α p(x) = xmin xmin (discrete version involves the Hurwitz zeta function) Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References What Are Power Law Distributions? Why Care? p(x) / x−α (continuous) P (X = x) / x−α (discrete) −(α−1) ) P (X ≥ x) / x and log p(x) = log C − α log x Cosma Shalizi So, You Think You Have a Power Law? Explicitly: α − 1 x −α p(x) = xmin xmin (discrete version involves the Hurwitz zeta function) Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References What Are Power Law Distributions? Why Care? p(x) / x−α (continuous) P (X = x) / x−α (discrete) −(α−1) ) P (X ≥ x) / x and log p(x) = log C − α log x Pareto (continuous), Zipf or zeta (discrete) Cosma Shalizi So, You Think You Have a Power Law? Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References What Are Power Law Distributions? Why Care? p(x) / x−α (continuous) P (X = x) / x−α (discrete) −(α−1) ) P (X ≥ x) / x and log p(x) = log C − α log x Pareto (continuous), Zipf or zeta (discrete) Explicitly: α − 1 x −α p(x) = xmin xmin (discrete version involves the Hurwitz zeta function) Cosma Shalizi So, You Think You Have a Power Law? Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References Money, Words, Cities The three classic power law distributions Pareto's law: wealth (richest 400 in US, 2003) Wealth 1.000 0.200 0.050 Survival function 0.010 0.002 1e+09 5e+09 2e+10 5e+10 Net worth (US$) Cosma Shalizi So, You Think You Have a Power Law? Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References Money, Words, Cities The three classic power law distributions Zipf's law: word frequencies (Moby Dick) Word Frequencies 1e+00 1e−01 1e−02 Survival function 1e−03 1e−04 1 10 100 1000 10000 Number of occurences Cosma Shalizi So, You Think You Have a Power Law? Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References Money, Words, Cities The three classic power law distributions Zipf's law: city populations City Sizes 1e+00 1e−01 1e−02 Survival function 1e−03 1e−04 1e+00 1e+02 1e+04 1e+06 Population US Census (2000) Cosma Shalizi So, You Think You Have a Power Law? Heavy (fat, long, . ) tails: sub-exponential decay of p(x) Extreme inequality (80/20): high proportion of summed values comes from small fraction of samples/population Scale-free: 1x −α p x X s α − ( j ≥ ) = s s i.e., another power law, same α ) no typical scale though xmin is the typical value Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References Properties Highly right skewed Cosma Shalizi So, You Think You Have a Power Law? Extreme inequality (80/20): high proportion of summed values comes from small fraction of samples/population Scale-free: 1x −α p x X s α − ( j ≥ ) = s s i.e., another power law, same α ) no typical scale though xmin is the typical value Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References Properties Highly right skewed Heavy (fat, long, . ) tails: sub-exponential decay of p(x) Cosma Shalizi So, You Think You Have a Power Law? Scale-free: 1x −α p x X s α − ( j ≥ ) = s s i.e., another power law, same α ) no typical scale though xmin is the typical value Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References Properties Highly right skewed Heavy (fat, long, . ) tails: sub-exponential decay of p(x) Extreme inequality (80/20): high proportion of summed values comes from small fraction of samples/population Cosma Shalizi So, You Think You Have a Power Law? ) no typical scale though xmin is the typical value Power Laws: What? So What? Bad Practices Better Practices Denitions and Examples No Really, So What? References Properties Highly right skewed Heavy (fat, long, .