More Mechanisms for Generating Power-Law Distributions
Total Page:16
File Type:pdf, Size:1020Kb
More Power-Law More Power-Law Mechanisms Outline Mechanisms More Mechanisms for Generating Optimization Optimization Power-Law Distributions Minimal Cost Optimization Minimal Cost Mandelbrot vs. Simon Mandelbrot vs. Simon Principles of Complex Systems Assumptions Minimal Cost Assumptions Model Model Course CSYS/MATH 300, Fall, 2009 Analysis Mandelbrot vs. Simon Analysis Extra Assumptions Extra Robustness Robustness HOT theory Model HOT theory Self-Organized Criticality Self-Organized Criticality Prof. Peter Dodds COLD theory Analysis COLD theory Network robustness Extra Network robustness Dept. of Mathematics & Statistics References References Center for Complex Systems :: Vermont Advanced Computing Center University of Vermont Robustness HOT theory Self-Organized Criticality COLD theory Network robustness References Frame 1/60 Frame 2/60 Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. More Power-Law More Power-Law Another approach Mechanisms Not everyone is happy... Mechanisms Optimization Optimization Minimal Cost Minimal Cost Benoit Mandelbrot Mandelbrot vs. Simon Mandelbrot vs. Simon Assumptions Assumptions Model Model I Mandelbrot = father of fractals Analysis Analysis Extra Extra I Mandelbrot = almond bread Robustness Robustness [11] HOT theory HOT theory I Derived Zipf’s law through optimization Self-Organized Criticality Self-Organized Criticality COLD theory COLD theory Network robustness Mandelbrot vs. Simon: Network robustness I Idea: Language is efficient References References I Mandelbrot (1953): “An Informational Theory of the I Communicate as much information as possible for as [11] little cost Statistical Structure of Languages” I Simon (1955): “On a class of skew distribution I Need measures of information (H) and cost (C)... functions” [14] I Minimize C/H by varying word frequency I Mandelbrot (1959): “A note on a class of skew I Recurring theme: what role does optimization play in distribution function: analysis and critique of a paper complex systems? by H.A. Simon” Frame 4/60 I Simon (1960): “Some further notes on a class of Frame 6/60 skew distribution functions” More Power-Law More Power-Law Not everyone is happy... (cont.) Mechanisms Not everyone is happy... (cont.) Mechanisms Optimization Optimization Minimal Cost Mandelbrot: Minimal Cost Mandelbrot vs. Simon Mandelbrot vs. Simon Assumptions “We shall restate in detail our 1959 objections to Simon’s Assumptions Model Model Mandelbrot vs. Simon: Analysis 1955 model for the Pareto-Yule-Zipf distribution. Our Analysis Extra objections are valid quite irrespectively of the sign of p-1, Extra I Mandelbrot (1961): “Final note on a class of skew Robustness Robustness HOT theory so that most of Simon’s (1960) reply was irrelevant.” HOT theory distribution functions: analysis and critique of a Self-Organized Criticality Self-Organized Criticality model due to H.A. Simon” COLD theory COLD theory Network robustness Simon: Network robustness I Simon (1961): “Reply to ‘final note’ by Benoit References “Dr. Mandelbrot has proposed a new set of objections to References Mandelbrot” my 1955 models of the Yule distribution. Like his earlier I Mandelbrot (1961): “Post scriptum to ‘final note”’ objections, these are invalid.” I Simon (1961): “Reply to Dr. Mandelbrot’s post Plankton: scriptum” “You can’t do this to me, I WENT TO COLLEGE!” “You weak minded fool!” “That’s it Mister! You just lost your brain privileges,” etc. Frame 7/60 Frame 8/60 More Power-Law More Power-Law Zipfarama via Optimization Mechanisms Zipfarama via Optimization Mechanisms Optimization Optimization Minimal Cost Minimal Cost Mandelbrot vs. Simon Word Cost Mandelbrot vs. Simon Assumptions Assumptions Model Model Mandelbrot’s Assumptions Analysis I Length of word (plus a space) Analysis Extra Extra I Word length was irrelevant for Simon’s method I Language contains n words: w , w ,..., wn. Robustness Robustness 1 2 HOT theory HOT theory Self-Organized Criticality Self-Organized Criticality I ith word appears with probability pi COLD theory COLD theory Network robustness Objection Network robustness Words appear randomly according to this distribution I References References (obviously not true...) I Real words don’t use all letter sequences I Words = composition of letters is important Objections to Objection I Alphabet contains m letters I Words are ordered by length (shortest first) I Maybe real words roughly follow this pattern (?) I Words can be encoded this way I Na na na-na naaaaa... Frame 10/60 Frame 11/60 More Power-Law More Power-Law Zipfarama via Optimization Mechanisms Zipfarama via Optimization Mechanisms Optimization Total Cost C Optimization Minimal Cost Minimal Cost Mandelbrot vs. Simon Mandelbrot vs. Simon Assumptions I Cost of the ith word: Ci ' 1 + logm i Assumptions Binary alphabet plus a space symbol Model Model Analysis Analysis Extra I Cost of the ith word plus space: Ci ' 1 + logm(i + 1) Extra i 1 2 3 4 5 6 7 8 0 Robustness I Subtract fixed cost: C = Ci − 1 ' log (i + 1) Robustness word 1 10 11 100 101 110 111 1000 HOT theory i m HOT theory Self-Organized Criticality Self-Organized Criticality length 1 2 2 3 3 3 3 4 COLD theory I Simplify base of logarithm: COLD theory Network robustness Network robustness 1 + ln2 i 1 2 2.58 3 3.32 3.58 3.81 4 References References 0 loge(i + 1) Ci ' logm(i + 1) = ∝ ln(i + 1) k k log m I Word length of 2 th word: = k + 1 = 1 + log2 2 e I Word length of ith word ' 1 + log2 i Total Cost: I For an alphabet with m letters, I word length of ith word ' 1 + logm i. n n X 0 X C ∼ pi Ci ∝ pi ln(i + 1) i=1 i=1 Frame 12/60 Frame 14/60 More Power-Law More Power-Law Zipfarama via Optimization Mechanisms Zipfarama via Optimization Mechanisms Optimization Optimization Information Measure Minimal Cost Minimal Cost Mandelbrot vs. Simon Mandelbrot vs. Simon I Use Shannon’s Entropy (or Uncertainty): Assumptions Assumptions Model Model Analysis Analysis n Extra Extra X Information Measure H = − p log p Robustness Robustness i 2 i HOT theory HOT theory i=1 Self-Organized Criticality I Use a slightly simpler form: Self-Organized Criticality COLD theory COLD theory Network robustness n n Network robustness I (allegedly) von Neumann suggested ‘entropy’... References X X References H = − pi log pi / log 2 = −g pi ln pi Proportional to average number of bits needed to e e I i=1 i=1 encode each ‘word’ based on frequency of occurrence where g = 1/ ln 2 I − log2 pi = log2 1/pi = minimum number of bits needed to distinguish event i from all others I If pi = 1/2, need only 1 bit (log21/pi = 1) I If pi = 1/64, need 6 bits (log21/pi = 6) Frame 15/60 Frame 16/60 More Power-Law More Power-Law Zipfarama via Optimization Mechanisms Zipfarama via Optimization Mechanisms Optimization Time for Lagrange Multipliers: Optimization Minimal Cost Minimal Cost Mandelbrot vs. Simon Mandelbrot vs. Simon I Minimize Assumptions I Minimize Assumptions Model Model F(p1, p2,..., pn) = C/H Analysis Ψ(p1, p2,..., pn) = Analysis Extra Extra subject to constraint Robustness F(p1, p2,..., pn) + λG(p1, p2,..., pn) Robustness HOT theory HOT theory Self-Organized Criticality Self-Organized Criticality n COLD theory where COLD theory X Network robustness Network robustness pi = 1 n References C P p ln(i + 1) References i=1 i=1 i F(p1, p2,..., pn) = = Pn H −g i=1 pi ln pi I Tension: (1) Shorter words are cheaper and the constraint function is (2) Longer words are more informative (rarer) n X I (Good) question: how much does choice of C/H as G(p1, p2,..., pn) = pi − 1 = 0 function to minimize affect things? i=1 Frame 17/60 Frame 19/60 Insert question 4, assignment 2 () More Power-Law More Power-Law Zipfarama via Optimization Mechanisms Zipfarama via Optimization Mechanisms Optimization Optimization Minimal Cost Minimal Cost Mandelbrot vs. Simon Finding the exponent Mandelbrot vs. Simon Assumptions Assumptions Some mild suffering leads to: Model Model Analysis I Now use the normalization constraint: Analysis Extra Extra I n n n −1−λH2/gC −H/gC −H/gC Robustness X X X Robustness pj = e (j + 1) ∝ (j + 1) HOT theory −H/gC −α HOT theory Self-Organized Criticality 1 = pj = (j + 1) = (j + 1) Self-Organized Criticality COLD theory = = = COLD theory Network robustness j 1 j 1 j 1 Network robustness References References I A power law appears [applause]: α = H/gC As n → ∞, we end up with ζ(H/gC) = 2 Next: sneakily deduce λ in terms of g, C, and H. I I where ζ is the Riemann Zeta Function I Find −H/gC I Gives α ' 1.73 (> 1, too high) pj = (j + 1) I If cost function changes (j + 1 → j + a) then exponent is tunable I Increase a, decrease α Frame 20/60 Frame 21/60 More Power-Law More Power-Law Zipfarama via Optimization Mechanisms More Mechanisms Optimization Optimization Minimal Cost Minimal Cost Mandelbrot vs. Simon Mandelbrot vs. Simon Assumptions Reconciling Mandelbrot and Simon Assumptions Model Model All told: Analysis Analysis Extra I Mixture of local optimization and randomness Extra Reasonable approach: Optimization is at work in Robustness Robustness I HOT theory I Numerous efforts... HOT theory Self-Organized Criticality Self-Organized Criticality evolutionary processes COLD theory COLD theory Network robustness Network robustness I But optimization can involve many incommensurate 1. Carlson and Doyle, 1999: References References elements: monetary cost, robustness, happiness,... Highly Optimized Tolerance (HOT)—Evolved/Engineered Robustness [5] I Mandelbrot’s argument is not super convincing 2. Ferrer i Cancho and Solé, 2002: I Exponent depends too much on a loose definition of [8] cost Zipf’s Principle of Least Effort 3. D’Souza et al., 2007: Scale-free networks [7] Frame 22/60 Frame 23/60 More Power-Law More Power-Law More Mechanisms Others are also not happy Mechanisms Optimization Optimization Minimal Cost Minimal Cost Mandelbrot vs.