Algorithms, Combinatorics, Information, and Beyond∗
Wojciech Szpankowski Purdue University W. Lafayette, IN 47907
August 2, 2011
NSF STC Center for Science of Information
Plenary ISIT, St. Petersburg, 2011 Dedicated to PHILIPPE FLAJOLET
∗Research supported by NSF Science & Technology Center, and Humboldt Foundation. Outline
1. Shannon Legacy
2. Analytic Combinatorics + IT = Analytic Information Theory
3. The Redundancy Rate Problem (a) Universal Memoryless Sources (b) Universal Renewal Sources (c) Universal Markov Sources 4. Post-Shannon Challenges: Beyond Shannon
5. Science of Information: NSF Science and Technology Center
Algorithms are at the heart of virtually all computing technologies; Combinatorics provides indispensable tools for finding patterns and structures; Information permeates every corner of our lives and shapes our universe. Shannon Legacy: Three Theorems of Shannon
Theorem 1 & 3. [Shannon 1948; Lossless & Lossy Data Compression]
compression bit rate ≥ source entropy H(X); for distortion level D: lossy bit rate ≥ rate distortion function R(D)
Theorem 2. [Shannon 1948; Channel Coding ] In Shannon’s words: It is possible to send information at the capacity through the channel with as small a frequency of errors as desired by proper (long) encoding. This statement is not true for any rate greater than the capacity. Post-Shannon Challenges
1. Back off from infinity (Ziv'97): Extend Shannon's findings to finite-size data structures (e.g., sequences, graphs); that is, develop information theory of various data structures beyond first-order asymptotics.
Claim: Many interesting information-theoretic phenomena appear in the second-order terms.
2. Science of Information: Information Theory needs to meet new challenges of current applications in
biology, communication, knowledge extraction, economics, ... to understand new aspects of information in:
structure, time, space, and semantics, and dynamic information, limited resources, complexity, representation-invariant information, and cooperation & dependency. Outline Update
1. Shannon Legacy 2. Analytic Information Theory 3. Source Coding: The Redundancy Rate Problem 4. Post-Shannon Information 5. NSF Science and Technology Center Analytic Combinatorics+Information Theory=Analytic Information Theory
• In the 1997 Shannon Lecture Jacob Ziv presented compelling arguments for "backing off" from first-order asymptotics in order to predict the behavior of real systems with finite length description.
• To overcome these difficulties, one may replace first-order analyses by non-asymptotic analysis; however, we propose to develop full asymptotic expansions and more precise analysis (e.g., large deviations, CLT).
• Following Hadamard's precept¹, we study information theory problems using techniques of complex analysis such as generating functions, combinatorial calculus, Rice's formula, Mellin transform, Fourier series, sequences distributed modulo 1, saddle point methods, analytic poissonization and depoissonization, and singularity analysis.
• This program, which applies complex-analytic tools to information theory, constitutes analytic information theory².
¹ The shortest path between two truths on the real line passes through the complex plane.
² Andrew Odlyzko: "Analytic methods are extremely powerful and when they apply, they often yield estimates of unparalleled precision."
Some Successes of Analytic Information Theory
• Wyner–Ziv Conjecture concerning the longest match in the WZ'89 compression scheme (W.S., 1993).
• Ziv's Conjecture on the distribution of the number of phrases in LZ'78 (Jacquet & W.S., 1995, 2011).
• Redundancy of LZ'78 (Savari, 1997; Louchard & W.S., 1997).
• Steinberg–Gutman Conjecture regarding lossy pattern matching compression (Luczak & W.S., 1997; Kieffer & Yang, 1998; Kontoyiannis, 2003).
• Precise redundancy of Huffman's code (W.S., 2000) and redundancy of a fixed-to-variable code without the prefix constraint (W.S. & Verdú, 2010).
• Minimax redundancy for memoryless sources (Xie & Barron, 1997; W.S., 1998; W.S. & Weinberger, 2010), Markov sources (Rissanen, 1998; Jacquet & W.S., 2004), and renewal sources (Flajolet & W.S., 2002; Drmota & W.S., 2004).
• Analysis of variable-to-fixed codes such as Tunstall and Khodak codes (Drmota, Reznik, Savari, & W.S., 2006, 2008, 2010).
• Entropy of hidden Markov processes and the noisy constrained capacity (Jacquet, Seroussi, & W.S., 2004, 2007, 2010; Han & Marcus, 2007).
• ...
Outline Update
1. Shannon Legacy 2. Analytic Information Theory 3. Source Coding: The Redundancy Rate Problem (a) Universal Memoryless Sources (b) Universal Markov Sources (c) Universal Renewal Sources Source Coding and Redundancy
Source coding aims at finding codes $C: \mathcal{A}^* \to \{0,1\}^*$ of the shortest length $L(C, x)$, either on average or for individual sequences.
Known Source P : The pointwise and maximal redundancy are:
$$R_n(C_n, P; x_1^n) = L(C_n, x_1^n) + \log P(x_1^n)$$
$$R_n^*(C_n, P) = \max_{x_1^n}\,[L(C_n, x_1^n) + \log P(x_1^n)]$$
where $P(x_1^n)$ is the probability of $x_1^n = x_1\cdots x_n$.
Unknown Source $P$: Following Davisson, the maximal minimax redundancy $R_n^*(\mathcal{S})$ for a family of sources $\mathcal{S}$ is:
$$R_n^*(\mathcal{S}) = \min_{C_n}\,\sup_{P\in\mathcal{S}}\,\max_{x_1^n}\,[L(C_n, x_1^n) + \log P(x_1^n)].$$
Shtarkov’s Bound:
$$d_n(\mathcal{S}) := \log \sum_{x_1^n\in\mathcal{A}^n}\sup_{P\in\mathcal{S}} P(x_1^n) \;\le\; R_n^*(\mathcal{S}) \;\le\; \log \underbrace{\sum_{x_1^n\in\mathcal{A}^n}\sup_{P\in\mathcal{S}} P(x_1^n)}_{D_n(\mathcal{S})} \;+\; 1$$
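For concreteness, here is a small brute-force sketch (my own illustration, not part of the talk) that evaluates the Shtarkov sum $D_n(\mathcal{M}_0)$ for a binary memoryless family; $d_n = \log_2 D_n$ is then the lower bound above.

```python
# Brute-force the Shtarkov sum D_n(M_0) for a binary i.i.d. family (illustrative sketch).
from math import log2, comb

def shtarkov_binary(n: int) -> float:
    """D_n = sum over types k of C(n,k) * sup_p p^k (1-p)^(n-k); the sup is at p = k/n."""
    total = 0.0
    for k in range(n + 1):
        ml = (k / n) ** k * ((n - k) / n) ** (n - k)   # maximized likelihood of a type-k string (0^0 = 1)
        total += comb(n, k) * ml                        # comb(n, k) strings share this type
    return total

for n in (10, 100, 1000):
    print(n, log2(shtarkov_binary(n)))                  # d_n grows like (1/2) log2 n, as derived below
```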
Learnable Information and Redundancy
1. $\mathcal{S} := \mathcal{M}^k = \{P_\theta : \theta\in\Theta\}$ is a set of $k$-dimensional parameterized distributions. Let $\hat\theta(x^n) = \arg\max_{\theta\in\Theta} P_\theta(x^n)$ be the ML estimator.
2. Two models, say $P_\theta(x^n)$ and $P_{\theta'}(x^n)$, are indistinguishable if the ML estimator $\hat\theta$ with high probability declares both models to be the same.
3. The number of distinguishable distributions (i.e., distinguishable $\theta$'s), $C_n(\Theta)$, summarizes the learnable information: $I(\Theta) = \log_2 C_n(\Theta)$.
[Figure: $\Theta$ covered by $C_n(\Theta)$ balls of radius $\sim 1/\sqrt{n}$ ($\log C_n(\Theta)$ useful bits); models $\theta$, $\theta'$ falling in the same ball as $\hat\theta$ are indistinguishable.]
4. Consider the following expansion of the Kullback-Leibler (KL) divergence
$$D(P_{\hat\theta}\,\|\,P_\theta) := \mathbf{E}[\log P_{\hat\theta}(X^n)] - \mathbf{E}[\log P_\theta(X^n)] \sim \tfrac12\,(\theta-\hat\theta)^T I(\hat\theta)(\theta-\hat\theta) \asymp d_I^2(\theta,\hat\theta)$$
where $I(\theta) = \{I_{ij}(\theta)\}_{ij}$ is the Fisher information matrix and $d_I(\theta,\hat\theta)$ is a rescaled Euclidean distance known as the Mahalanobis distance.
5. Balasubramanian proved that the number of distinguishable balls $C_n(\Theta)$ of radius $O(1/\sqrt{n})$ is asymptotically equal to the minimax redundancy:
$$\text{Learnable Information} = \log C_n(\Theta) = \inf_{\theta\in\Theta}\max_{x^n}\log\frac{P_{\hat\theta}}{P_\theta} = R_n^*(\mathcal{M}^k)$$
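A quick numeric illustration of item 4 (my example, not from the talk): for Bernoulli models the KL divergence is well approximated by the Fisher-weighted quadratic when $\theta$ is close to $\hat\theta$.

```python
# Compare D(P_theta_hat || P_theta) with (1/2)(theta - theta_hat)^2 I(theta_hat) for Bernoulli models.
from math import log2, e

def kl_bern(a: float, b: float) -> float:               # D(Bern(a) || Bern(b)) in bits
    return a * log2(a / b) + (1 - a) * log2((1 - a) / (1 - b))

theta_hat = 0.3
fisher = 1.0 / (theta_hat * (1 - theta_hat))             # Fisher information I(theta_hat)
for theta in (0.31, 0.35, 0.40):
    quad = 0.5 * (theta - theta_hat) ** 2 * fisher * log2(e)   # quadratic term, converted to bits
    print(theta, round(kl_bern(theta_hat, theta), 6), round(quad, 6))
```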
Outline Update
1. Shannon Legacy
2. Analytic Information Theory
3. Source Coding: The Redundancy Rate Problem (a) Universal Memoryless Sources (b) Universal Renewal Sources (c) Universal Markov Sources Maximal Minimax for Memoryless Sources
For a memoryless source over the alphabet $\mathcal{A} = \{1, 2, \ldots, m\}$ we have
$$P(x_1^n) = p_1^{k_1}\cdots p_m^{k_m}, \qquad k_1 + \cdots + k_m = n.$$
Then
$$D_n(\mathcal{M}_0) := \sum_{x_1^n}\sup_P P(x_1^n) = \sum_{x_1^n}\sup_{p_1,\ldots,p_m} p_1^{k_1}\cdots p_m^{k_m} = \sum_{k_1+\cdots+k_m=n}\binom{n}{k_1,\ldots,k_m}\sup_{p_1,\ldots,p_m} p_1^{k_1}\cdots p_m^{k_m} = \sum_{k_1+\cdots+k_m=n}\binom{n}{k_1,\ldots,k_m}\left(\frac{k_1}{n}\right)^{k_1}\!\!\cdots\left(\frac{k_m}{n}\right)^{k_m}$$
since the (unnormalized) likelihood distribution is
$$\sup_{P} P(x_1^n) = \sup_{p_1,\ldots,p_m} p_1^{k_1}\cdots p_m^{k_m} = \left(\frac{k_1}{n}\right)^{k_1}\!\!\cdots\left(\frac{k_m}{n}\right)^{k_m}$$
Generating Function for $D_n(\mathcal{M}_0)$
We write
$$D_n(\mathcal{M}_0) = \sum_{k_1+\cdots+k_m=n}\binom{n}{k_1,\ldots,k_m}\left(\frac{k_1}{n}\right)^{k_1}\!\!\cdots\left(\frac{k_m}{n}\right)^{k_m} = \frac{n!}{n^n}\sum_{k_1+\cdots+k_m=n}\frac{k_1^{k_1}}{k_1!}\cdots\frac{k_m^{k_m}}{k_m!}$$
Let us introduce a tree-generating function
$$B(z) = \sum_{k=0}^{\infty}\frac{k^k}{k!}\, z^k = \frac{1}{1-T(z)}, \qquad T(z) = \sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}\, z^k,$$
where $T(z) = z e^{T(z)}$ ($= -W(-z)$, Lambert's $W$-function) enumerates all rooted labeled trees. Let now
$$D_m(z) = \sum_{n=0}^{\infty} z^n\, \frac{n^n}{n!}\, D_n(\mathcal{M}_0).$$
Then by the convolution formula
$$D_m(z) = [B(z)]^m.$$
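The identity can be checked numerically; the following sketch (an illustration I added, not from the talk) truncates $B(z)$, raises it to the $m$-th power by convolution, and compares $\frac{n!}{n^n}[z^n][B(z)]^m$ with the direct multinomial sum.

```python
# Verify D_n(M_0) = (n!/n^n) [z^n] B(z)^m against the direct multinomial sum (small n, m).
from math import factorial
from itertools import product

N, m = 12, 3
b = [k ** k / factorial(k) if k else 1.0 for k in range(N + 1)]   # B(z) truncated to degree N

def poly_mul(p, q):
    """Truncated power-series product up to degree N."""
    r = [0.0] * (N + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j <= N:
                r[i + j] += pi * qj
    return r

Bm = [1.0] + [0.0] * N
for _ in range(m):
    Bm = poly_mul(Bm, b)                 # coefficients of [B(z)]^m

def D_direct(n):
    """Direct sum over compositions k_1 + ... + k_m = n of the multinomial ML terms."""
    total = 0.0
    for ks in product(range(n + 1), repeat=m - 1):
        last = n - sum(ks)
        if last < 0:
            continue
        parts = ks + (last,)
        term = float(factorial(n))
        for k in parts:
            term *= (k / n) ** k / factorial(k)
        total += term
    return total

for n in (1, 5, 12):
    print(n, factorial(n) / n ** n * Bm[n], D_direct(n))   # the two columns agree
```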
Asymptotics for FINITE m
The function $B(z)$ has an algebraic singularity at $z = e^{-1}$, and
$$\beta(z) = B(z/e) = \frac{1}{\sqrt{2(1-z)}} + \frac{1}{3} + O(\sqrt{1-z}).$$
By Cauchy's coefficient formula
$$D_n(\mathcal{M}_0) = \frac{n!}{n^n}\,[z^n]\,[B(z)]^m = \frac{\sqrt{2\pi n}\,(1+O(1/n))}{2\pi i}\oint \frac{[\beta(z)]^m}{z^{n+1}}\, dz.$$
For finite m, the singularity analysis of Flajolet and Odlyzko implies
$$[z^n](1-z)^{-\alpha} \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}, \qquad \alpha\notin\{0, -1, -2, \ldots\}$$
that finally yields (cf. Clarke & Barron, 1990, W.S., 1998)
$$R_n^*(\mathcal{M}_0) = \frac{m-1}{2}\log\frac{n}{2} + \log\frac{\sqrt{\pi}}{\Gamma(\frac{m}{2})} + \frac{\Gamma(\frac{m}{2})\, m}{3\,\Gamma(\frac{m}{2}-\frac12)}\cdot\frac{\sqrt{2}}{\sqrt{n}} + \left(\frac{3 + m(m-2)(2m+1)}{36} - \frac{\Gamma^2(\frac{m}{2})\, m^2}{9\,\Gamma^2(\frac{m}{2}-\frac12)}\right)\cdot\frac{1}{n} + \cdots$$
Redundancy for LARGE m
Now assume that m is unbounded and may vary with n. Then
$$D_{n,m}(\mathcal{M}_0) = \frac{\sqrt{2\pi n}}{2\pi i}\oint \frac{[\beta(z)]^m}{z^{n+1}}\, dz = \frac{\sqrt{2\pi n}}{2\pi i}\oint e^{g(z)}\, dz,$$
where $g(z) = m\ln\beta(z) - (n+1)\ln z$.
The saddle point $z_0$ is a solution of $g'(z_0) = 0$, so that
$$g(z) = g(z_0) + \frac12 (z - z_0)^2 g''(z_0) + O\big(g'''(z_0)(z - z_0)^3\big).$$
Under mild conditions satisfied by our g(z) (e.g., z0 is real and unique), the saddle point method leads to:
$$D_{n,m}(\mathcal{M}_0) = \frac{e^{g(z_0)}}{\sqrt{2\pi |g''(z_0)|}}\left(1 + O\!\left(\frac{g'''(z_0)}{(g''(z_0))^{\rho}}\right)\right)$$
for some $\rho < 3/2$. Three regimes arise: $m = o(n)$, $m = \Theta(n)$, and $n = o(m)$.
Theorem 1 (Orlitsky and Santhanam, 2004, and W.S. and Weinberger, 2010). (i) For m = o(n)
$$R_{n,m}^*(\mathcal{M}_0) = \frac{m-1}{2}\log\frac{n}{m} + \frac{m}{2}\log e + \frac{m\log e}{3}\sqrt{\frac{m}{n}} - \frac12 + O\!\left(\sqrt{\frac{m}{n}}\right)$$
(ii) For m = αn + ℓ(n), where α is a positive constant and ℓ(n) = o(n),
$$R_{n,m}^*(\mathcal{M}_0) = n\log B_\alpha + \ell(n)\log C_\alpha - \log\sqrt{A_\alpha} + O(\ell(n)^2/n)$$
where $C_\alpha := \frac12 + \frac12\sqrt{1 + 4/\alpha}$, $A_\alpha := C_\alpha + 2/\alpha$, and $B_\alpha := \alpha\, C_\alpha^{\alpha+2}\, e^{-\frac{1}{C_\alpha}}$.
(iii) For $n = o(m)$
$$R_{n,m}^*(\mathcal{M}_0) = n\log\frac{m}{n} + \frac{3}{2}\,\frac{n^2}{m}\log e - \frac{3}{2}\,\frac{n}{m}\log e + O\!\left(\frac{1}{\sqrt{n}} + \frac{n^3}{m^2}\right).$$
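A small evaluation sketch (added for concreteness; the value $\alpha = 1$, i.e. $m = n$, is my assumption) makes the constants of part (ii) explicit.

```python
# Evaluate the constants C_alpha, A_alpha, B_alpha of Theorem 1(ii) for a sample alpha.
from math import sqrt, exp, log2

alpha = 1.0                                    # assumed example: m = n
C = 0.5 + 0.5 * sqrt(1 + 4 / alpha)            # C_alpha
A = C + 2 / alpha                              # A_alpha
B = alpha * C ** (alpha + 2) * exp(-1 / C)     # B_alpha = alpha * C_alpha^(alpha+2) * e^(-1/C_alpha)
print(f"C={C:.4f}  A={A:.4f}  B={B:.4f}  leading redundancy per symbol = {log2(B):.4f} bits")
```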
Outline Update
1. Shannon Legacy
2. Analytic Information Theory
3. Source Coding: The Redundancy Rate Problem (a) Universal Memoryless Sources (b) Universal Renewal Sources (c) Universal Markov Sources Renewal Sources (Virtual Large Alphabet)
The renewal process $\mathcal{R}_0$ (introduced in 1996 by Csiszár and Shields) is defined as follows:
• Let $T_1, T_2, \ldots$ be a sequence of i.i.d. positive-valued random variables with distribution $Q(j) = \Pr\{T_i = j\}$.
• In a binary renewal sequence the positions of the 1's are at the renewal epochs $T_0, T_0 + T_1, \ldots$ with runs of zeros of lengths $T_1 - 1, T_2 - 1, \ldots$.
For a sequence
$$x_1^n = 1 0^{\alpha_1} 1 0^{\alpha_2} 1 \cdots 1 0^{\alpha_n} 1 \underbrace{0\cdots 0}_{k^*}$$
define $k_m$ as the number of $i$ such that $\alpha_i = m$. Then
$$P(x_1^n) = [Q(0)]^{k_0}[Q(1)]^{k_1}\cdots[Q(n-1)]^{k_{n-1}}\,\Pr\{T_1 > k^*\}.$$
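An illustrative sketch (my own, with an assumed geometric run-length law $Q$) that evaluates this product formula for a concrete binary sequence.

```python
# Evaluate P(x_1^n) for a binary renewal sequence from its zero-run lengths (illustrative sketch).
def renewal_probability(x: str, Q, tail) -> float:
    runs = x.split('1')              # zero-runs; runs[0] is whatever precedes the first '1'
    assert runs[0] == '', "sketch assumes the sequence starts with a renewal (a '1')"
    *between, last = runs            # completed runs between 1's, plus the trailing run of length k*
    p = 1.0
    for r in between:                # each completed run of length alpha contributes Q(alpha)
        p *= Q(len(r))
    return p * tail(len(last))       # times Pr{T_1 > k*} for the unfinished last run

q = 0.3
Q = lambda j: (1 - q) * q ** j       # assumed geometric run-length law
tail = lambda k: q ** (k + 1)        # Pr{run length > k} under the same law
print(renewal_probability("1001010001100", Q, tail))
```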
Theorem 2 (Flajolet and W.S., 1998). Consider the class $\mathcal{R}_0$ of renewal processes. Then
$$R_n^*(\mathcal{R}_0) = \frac{2}{\log 2}\sqrt{c\,n} + O(\log n),$$
where $c = \frac{\pi^2}{6} - 1 \approx 0.645$.
Maximal Minimax Redundancy
It can be proved that
$$r_{n+1} - 1 \;\le\; D_n(\mathcal{R}_0) \;\le\; \sum_{m=0}^{n} r_m$$
$$r_n = \sum_{k=0}^{n} r_{n,k}, \qquad r_{n,k} = \sum_{\mathcal{I}(n,k)}\binom{k}{k_0,\, k_1,\, \ldots,\, k_{n-1}}\left(\frac{k_0}{k}\right)^{k_0}\!\left(\frac{k_1}{k}\right)^{k_1}\!\!\cdots\left(\frac{k_{n-1}}{k}\right)^{k_{n-1}}$$
where $\mathcal{I}(n,k)$ is the set of integer partitions of $n$ into $k$ terms, i.e.,
$$n = k_0 + 2k_1 + \cdots + n k_{n-1}, \qquad k = k_0 + \cdots + k_{n-1}.$$
But we shall study $s_n = \sum_{k=0}^{n} s_{n,k}$ where
$$s_{n,k} = \sum_{\mathcal{I}(n,k)} e^{-k}\,\frac{k_0^{k_0}}{k_0!}\cdots\frac{k_{n-1}^{k_{n-1}}}{k_{n-1}!}$$
since $S(z, u) = \sum_{k,n} s_{n,k}\, u^k z^n = \prod_{i=1}^{\infty}\beta(z^i u)$.
$$s_n = [z^n]\, S(z, 1) \approx [z^n]\exp\left(\frac{c}{1-z} + a\log\frac{1}{1-z}\right)$$
Theorem 3 (Flajolet and W.S., 1998). We have the following asymptotics:
$$s_n \sim \exp\left(2\sqrt{c\,n} - \frac{7}{8}\log n + O(1)\right), \qquad \log r_n = \frac{2}{\log 2}\sqrt{c\,n} - \frac{5}{8}\log n + \frac12\log\log n + O(1).$$
Outline Update
1. Shannon Legacy
2. Analytic Information Theory
3. Source Coding: The Redundancy Rate Problem (a) Universal Memoryless Sources (b) Universal Renewal Sources (c) Universal Markov Sources Maximal Minimax for Markov Sources
Markov source over $|\mathcal{A}| = m$ (FINITE) with transition matrix $P = \{p_{ij}\}_{i,j=1}^{m}$:
$$D_n(\mathcal{M}_1) = \sum_{x_1^n}\sup_P p_{11}^{k_{11}}\cdots p_{mm}^{k_{mm}} = \sum_{\mathbf{k}\in\mathcal{P}_n}|\mathcal{T}_n(\mathbf{k})|\left(\frac{k_{11}}{k_1}\right)^{k_{11}}\!\!\cdots\left(\frac{k_{mm}}{k_m}\right)^{k_{mm}},$$
where $\mathcal{P}_n(m)$ denotes the set of Markov types, $|\mathcal{T}_n(\mathbf{k})| := |\mathcal{T}_n(x_1^n)|$ denotes the number of sequences of type $\mathbf{k} = [k_{ij}]$, $k_{ij}$ represents the number of pairs $(ij)$ in $x_1^n$, and $k_i = \sum_j k_{ij}$.
For circular strings (i.e., after the $n$th symbol we revisit the first symbol of $x_1^n$), the matrix $\mathbf{k} = [k_{ij}]$ satisfies the following constraints, denoted $\mathcal{F}_n(m)$:
$$\sum_{1\le i,j\le m} k_{ij} = n, \qquad \sum_{j=1}^{m} k_{ij} = \sum_{j=1}^{m} k_{ji} \quad (1\le i\le m).$$
For example, for m = 3:
$$k_{11}+k_{12}+k_{13}+k_{21}+k_{22}+k_{23}+k_{31}+k_{32}+k_{33} = n$$
$$k_{12}+k_{13} = k_{21}+k_{31}, \qquad k_{12}+k_{32} = k_{21}+k_{23}, \qquad k_{13}+k_{23} = k_{31}+k_{32}$$
Matrices satisfying $\mathcal{F}_n(m)$ are called balanced matrices, and $|\mathcal{F}_n(m)|$ denotes the number of balanced matrices over $\mathcal{A}$. A brute-force check for small m is sketched below.
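The sketch below (mine, not from the talk) enumerates balanced matrices for $m = 2$ by brute force and compares the count with $n^2/4$, the order suggested by Theorem 4 below (with $d(2) = 1/2$, my own evaluation of its integral).

```python
# Count balanced 2x2 frequency matrices, i.e. |F_n(2)|, by brute force (illustrative sketch).
from itertools import product

def count_balanced(n: int, m: int = 2) -> int:
    count = 0
    # enumerate all m x m nonnegative integer matrices with entries summing to n
    for cells in product(range(n + 1), repeat=m * m):
        if sum(cells) != n:
            continue
        k = [cells[i * m:(i + 1) * m] for i in range(m)]
        # balance: every state has equal out-flow (row sum) and in-flow (column sum)
        if all(sum(k[i]) == sum(k[j][i] for j in range(m)) for i in range(m)):
            count += 1
    return count

for n in (10, 20, 30):
    print(n, count_balanced(n), n * n / 4)   # the exact count approaches n^2/4 from above
```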
Markov Types and Eulerian Cycles
Example: Let $\mathcal{A} = \{0, 1\}$ and
$$\mathbf{k} = \begin{pmatrix} 1 & 2 \\ 2 & 2 \end{pmatrix}$$
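A tiny enumeration sketch (my addition, not from the talk) counts the circular binary strings of length $n = 7$ whose pair counts match this example matrix, i.e., it computes $|\mathcal{T}_n(\mathbf{k})|$ by brute force.

```python
# Count circular binary strings whose transition counts equal the example type matrix k.
from itertools import product

target = {('0', '0'): 1, ('0', '1'): 2, ('1', '0'): 2, ('1', '1'): 2}   # k = [[1,2],[2,2]], n = 7

count = 0
for s in product('01', repeat=7):
    pairs = {}
    for i in range(7):                        # circular string: the last symbol wraps to the first
        p = (s[i], s[(i + 1) % 7])
        pairs[p] = pairs.get(p, 0) + 1
    count += (pairs == target)
print(count)                                   # |T_n(k)|: number of sequences of this type
```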
$\mathcal{P}_n(m)$ – Markov types, but also the set of all connected Eulerian digraphs $G = (V(G), E(G))$ such that $V(G)\subseteq\mathcal{A}$ and $|E(G)| = n$.
$\mathcal{E}_n(m)$ – the set of connected Eulerian digraphs on $\mathcal{A}$.
$\mathcal{F}_n(m)$ – balanced matrices, but also the set of (not necessarily connected) Eulerian digraphs on $\mathcal{A}$.
Asymptotic equivalence: $|\mathcal{P}_n(m)| = |\mathcal{F}_n(m)| + O(n^{m^2-3m+3}) \sim |\mathcal{E}_n(m)|$.
Markov Types – Main Results
Theorem 4 (Knessl, Jacquet, and W.S., 2010). (i) For fixed $m$ and $n\to\infty$,
$$|\mathcal{P}_n(m)| = d(m)\,\frac{n^{m^2-m}}{(m^2-m)!} + O(n^{m^2-m-1})$$
$$d(m) = \frac{1}{(2\pi)^{m-1}}\underbrace{\int_{-\infty}^{\infty}\!\!\cdots\int_{-\infty}^{\infty}}_{(m-1)\text{-fold}}\ \prod_{j=1}^{m-1}\frac{1}{1+\varphi_j^2}\cdot\prod_{k\ne\ell}\frac{1}{1+(\varphi_k-\varphi_\ell)^2}\ d\varphi_1\, d\varphi_2\cdots d\varphi_{m-1}.$$
(ii) When $m\to\infty$ we find that, provided $m^4 = o(n)$,
$$|\mathcal{P}_n(m)| \sim \frac{\sqrt{2}\, m^{3m/2}\, e^{m^2}}{m^{2m^2}\, 2^m\, \pi^{m/2}}\cdot n^{m^2-m}$$
Some numerics: $|\mathcal{P}_n(3)| \sim \frac{1}{12}\frac{n^6}{6!}$, $\quad |\mathcal{P}_n(4)| \sim \frac{1}{96}\frac{n^{12}}{12!}$, $\quad |\mathcal{P}_n(5)| \sim \frac{37}{34560}\frac{n^{20}}{20!}$.
Open Question:
Counting types and sequences of a given type in a Markov field. (Answer: $|\mathcal{P}_n(2)| \sim \frac{1}{12}\frac{n^5}{5!}$?)
Markov Redundancy: Main Results
Theorem 5 (Rissanen, 1996; Jacquet & W.S., 2004). Let $\mathcal{M}_1$ be a Markov source over an $m$-ary alphabet. Then
$$R_n^*(\mathcal{M}_1) = \frac{m(m-1)}{2}\log\frac{n}{2\pi} + \log A_m + O(1/n)$$
with
$$A_m = \int_{\mathcal{K}(1)} \sqrt{m\, F_m(y_{ij})}\ \prod_i \frac{\sum_j y_{ij}}{\prod_j \sqrt{y_{ij}}}\ d[y_{ij}]$$
where $\mathcal{K}(1) = \{y_{ij}: \sum_{ij} y_{ij} = 1\}$ and $F_m(\cdot)$ is a polynomial.
In particular, for $m = 2$, $A_2 = 16\times\mathrm{Catalan}$, where $\mathrm{Catalan} = \sum_i \frac{(-1)^i}{(2i+1)^2} \approx 0.915965594$ is Catalan's constant.
Theorem 6. Let $\mathcal{M}_r$ be a Markov source of order $r$. Then
$$R_n^*(\mathcal{M}_r) = \frac{m^r(m-1)}{2}\log\frac{n}{2\pi} + \log A_m^r + O(1/n)$$
where $A_m^r$ is a constant defined in a similar fashion to $A_m$ above.
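For concreteness (my own numeric sketch, not from the talk): Catalan's constant via its series, $A_2 = 16\times\mathrm{Catalan}$, and the resulting leading redundancy of a binary first-order Markov source at an assumed block length $n = 10^4$.

```python
# Evaluate Catalan's constant, A_2 = 16 * Catalan, and the m = 2 leading redundancy term.
from math import log2, pi

catalan = sum((-1) ** i / (2 * i + 1) ** 2 for i in range(200_000))   # Catalan's constant (series)
A2 = 16 * catalan                                                      # A_2 from Theorem 5
n = 10 ** 4                                                            # assumed block length
R = (2 * 1 / 2) * log2(n / (2 * pi)) + log2(A2)    # m(m-1)/2 * log2(n / 2pi) + log2(A_2), with m = 2
print(round(catalan, 6), round(A2, 4), round(R, 2))
```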
Outline Update
1. Shannon Information Theory 2. Source Coding 3. The Redundancy Rate Problem 4. Post-Shannon Information (Science of Information) 5. NSF Science and Technology Center
Post-Shannon Challenges: Time & Space
We need to extend information theory to include structure, time & space, and ....
1. Coding Rate for Finite Blocklength: Polyanskiy, Poor & Verdu, 2010
$$\frac{1}{n}\log M^*(n, \varepsilon) \approx C - \sqrt{\frac{V}{n}}\, Q^{-1}(\varepsilon)$$
where $C$ is the capacity, $V$ is the channel dispersion, $n$ is the block length, $\varepsilon$ the error probability, and $Q$ is the complementary Gaussian distribution.
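A normal-approximation sketch for a BSC (my example; the talk states only the general formula). For a BSC with crossover probability $\delta$, $C = 1 - h(\delta)$ and the dispersion is $V = \delta(1-\delta)\log_2^2\frac{1-\delta}{\delta}$.

```python
# Finite-blocklength normal approximation for a BSC(delta) at error probability eps.
from math import log2
from statistics import NormalDist

def h(p):                                       # binary entropy in bits
    return -p * log2(p) - (1 - p) * log2(1 - p)

delta, eps = 0.11, 1e-3                          # assumed channel and target error probability
C = 1 - h(delta)                                 # capacity
V = delta * (1 - delta) * log2((1 - delta) / delta) ** 2   # channel dispersion
Qinv = NormalDist().inv_cdf(1 - eps)             # Q^{-1}(eps) for the Gaussian tail
for n in (100, 1000, 10000):
    rate = C - (V / n) ** 0.5 * Qinv
    print(n, round(rate, 4), "vs capacity", round(C, 4))
```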
Time & Space: Classical Information Theory is at its weakest in dealing with problems of delay (e.g., information arriving late may be useless or may have less value).
2. Speed of Information in DTN: Jacquet et al., 2008.
3. Real-Time Communication over Unreliable Wireless Channels with Hard Delay Constraints: P.R. Kumar & Hou, 2011.
4. The Quest for Fundamental Limits in MANET, Goldsmith, et al., 2011. Structure
Structure: Measures are needed for quantifying information embodied in structures (e.g., material structures, nanostructures, biomolecules, gene regulatory networks, protein interaction networks, social networks, financial transactions).
Information Content of Unlabeled Graphs: A random structure model $\mathcal{S}$ of a graph $G$ is defined for an unlabeled version. Some labeled graphs have the same structure.
[Figure: eight labeled graphs $G_1, \ldots, G_8$ on vertices $\{1, 2, 3\}$ collapse into four unlabeled structures $S_1, \ldots, S_4$.]
$$H_G = \mathbf{E}[-\log P(G)] = -\sum_{G\in\mathcal{G}} P(G)\log P(G), \qquad H_S = \mathbf{E}[-\log P(S)] = -\sum_{S\in\mathcal{S}} P(S)\log P(S).$$
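A toy verification sketch (my addition, with assumed $n = 3$, $p = 0.3$): enumerate all labeled graphs of $\mathcal{G}(3, p)$, group them into structures (this mirrors the $G_1,\ldots,G_8 \to S_1,\ldots,S_4$ picture above), and check the identity $H_S = H_G - \log n! + \sum_S P(S)\log|\mathrm{Aut}(S)|$ used in the proof sketch later.

```python
# Compute H_G and H_S for G(3, p) by enumeration and verify the automorphism identity.
from itertools import combinations, permutations
from math import log2, factorial

n, p = 3, 0.3
edges = list(combinations(range(n), 2))

def prob(g):                                  # Erdos-Renyi probability of a labeled graph g
    return p ** len(g) * (1 - p) ** (len(edges) - len(g))

def canon(g):                                 # canonical form: lexicographically smallest relabeling
    best = None
    for perm in permutations(range(n)):
        relab = frozenset(frozenset((perm[u], perm[v])) for u, v in g)
        key = tuple(sorted(tuple(sorted(e)) for e in relab))
        best = key if best is None or key < best else best
    return best

graphs = [frozenset(s) for r in range(len(edges) + 1) for s in combinations(edges, r)]
H_G = -sum(prob(g) * log2(prob(g)) for g in graphs)

struct = {}                                   # structure -> total probability P(S)
for g in graphs:
    c = canon(g)
    struct[c] = struct.get(c, 0.0) + prob(g)
aut = {c: factorial(n) // sum(1 for g in graphs if canon(g) == c) for c in struct}   # |Aut(S)| = n!/#labelings

H_S = -sum(P * log2(P) for P in struct.values())
check = H_G - log2(factorial(n)) + sum(struct[c] * log2(aut[c]) for c in struct)
print(round(H_S, 6), round(check, 6))         # the two numbers agree
```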
Automorphism and Erdős–Rényi Graph Model
Graph Automorphism: For a graph $G$, an automorphism is an adjacency-preserving permutation of the vertices of $G$.
[Figure: a five-vertex graph on $\{a, b, c, d, e\}$ illustrating an automorphism.]
Erdős–Rényi model: $\mathcal{G}(n, p)$ generates graphs with $n$ vertices, where edges are chosen independently with probability $p$. If $G$ has $k$ edges, then
$$P(G) = p^k (1-p)^{\binom{n}{2} - k}.$$
Theorem 7 (Y. Choi and W.S., 2008). For large $n$ and all $p$ satisfying $p \gg \frac{\ln n}{n}$ and $1 - p \gg \frac{\ln n}{n}$ (i.e., the graph is connected w.h.p.),
$$H_S = \binom{n}{2} h(p) - \log n! + o(1) = \binom{n}{2} h(p) - n\log n + n\log e - \frac12\log n + O(1),$$
where $h(p) = -p\log p - (1-p)\log(1-p)$ is the entropy rate.
AEP for structures: $\ 2^{-\binom{n}{2}(h(p)+\varepsilon) + \log n!} \le P(S) \le 2^{-\binom{n}{2}(h(p)-\varepsilon) + \log n!}$.
Sketch of Proof: 1. $H_S = H_G - \log n! + \sum_{S\in\mathcal{S}} P(S)\log|\mathrm{Aut}(S)|$. 2. $\sum_{S\in\mathcal{S}} P(S)\log|\mathrm{Aut}(S)| = o(1)$ by asymmetry of $\mathcal{G}(n,p)$.
Structural Zip (SZIP) Algorithm
Asymptotic Optimality of SZIP
Theorem 8 (Choi, W.S., 2008). Let L(S) be the length of the code.
(i) For large $n$,
$$\mathbf{E}[L(S)] \le \binom{n}{2} h(p) - n\log n + n\,(c + \Phi(\log n)) + o(n),$$
where $h(p) = -p\log p - (1-p)\log(1-p)$, $c$ is an explicitly computable constant, and $\Phi(x)$ is a fluctuating function with a small amplitude or zero.
(ii) Furthermore, for any ε > 0,
$$P\big(L(S) - \mathbf{E}[L(S)] \le \varepsilon\, n\log n\big) \ge 1 - o(1).$$
(iii) The algorithm runs in $O(n + e)$ time on average, where $e$ is the number of edges.
Table 1: The length of encodings (in bits)

Network                        | # of nodes | # of edges | our algorithm | adjacency matrix | adjacency list | arithmetic coding
US Airports                    |        332 |      2,126 |         8,118 |           54,946 |         38,268 |            12,991
Protein interaction (Yeast)    |      2,361 |      6,646 |        46,912 |        2,785,980 |        159,504 |            67,488
Collaboration (Geometry)       |      6,167 |     21,535 |       115,365 |       19,012,861 |        559,910 |           241,811
Collaboration (Erdős)          |      6,935 |     11,857 |        62,617 |       24,043,645 |        308,282 |           147,377
Genetic interaction (Human)    |      8,605 |     26,066 |       221,199 |       37,018,710 |        729,848 |           310,569
Real-world Internet (AS level) |     25,881 |     52,407 |       301,148 |      334,900,140 |      1,572,210 |           396,060

Outline Update
1. Shannon Information Theory 2. Source Coding 3. The Redundancy Rate Problem 4. Post-Shannon Information 5. NSF Science and Technology Center on Science of Information NSF Center for Science of Information
In 2010 the National Science Foundation established the $25M
Science and Technology Center for Science of Information (http://soihub.org) to advance science and technology through a new quantitative understanding of the representation, communication and processing of information in biological, physical, social and engineering systems.
The center is located at Purdue University and partner institutions include: Berkeley, MIT, Princeton, Stanford, UIUC, UCSD, and Bryn Mawr & Howard U.
Specific Center’s Goals:
• define core theoretical principles governing transfer of information.
• develop meters and methods for information.
• apply to problems in physical and social sciences, and engineering.
• offer a venue for multi-disciplinary long-term collaborations.
• transfer advances in research to education and industry.
Acknowledgments
My French Connection:
Philippe Flajolet (1948-2011), Analytic Combinatorics
Philippe Jacquet, my copain and accomplice for the last thirty-something years in the analytic information theory adventure.
My Palo Alto Connection:
Finally: The Master! ... and ALL my co-authors. That's IT!