Combinatorial Reasoning in Information Theory
Noga Alon, Tel Aviv U., ISIT 2009

Combinatorial Reasoning is crucial in Information Theory

Google lists 245,000 sites with the words “Information Theory” and “Combinatorics”.

The Shannon Capacity of Graphs

The (AND) product G × H of two graphs G = (V,E) and H = (V',E') is the graph on V × V', in which two distinct vertices (v,v') and (u,u') are adjacent iff (u = v or uv ∈ E) and (u' = v' or u'v' ∈ E'). The n-th power G^n of G is the product of n copies of G.

Shannon Capacity

Let α(G^n) denote the independence number of G^n. The Shannon capacity of G is

c(G) = lim_{n→∞} [α(G^n)]^{1/n} ( = sup_n [α(G^n)]^{1/n} ).

Motivation

[diagram: a channel mapping inputs to outputs]

A channel has an input set X, an output set Y, and a fan-out set S_x ⊂ Y for each x ∈ X. The graph of the channel is G = (X,E), where xx' ∈ E iff x and x' can be confused, that is, iff S_x ∩ S_{x'} ≠ ∅.

α(G) = the maximum number of distinct messages the channel can communicate in a single use (with no errors).
α(G^n) = the maximum number of distinct messages the channel can communicate in n uses.
c(G) = the maximum number of messages per use the channel can communicate (with long messages).

There are several upper bounds for the Shannon capacity:
Combinatorial [Shannon (56)]
Geometric [Lovász (79), Schrijver (80)]
Algebraic [Haemers (79), A (98)]

Theorem (A-98): For every k there are graphs G and H so that c(G), c(H) ≤ k and yet

c(G + H) ≥ k^{Ω(log k / log log k)},

where G + H is the disjoint union of G and H. This answers a problem of Shannon.

[illustration: messages over two channels with different alphabets, abcd… and אבגד…]

Multiple channels and privileged users (A + Lubetzky 07): For any fixed t and any family F of subsets of {1, 2, …, t}, there are graphs G_1, G_2, …, G_t so that c(∑_{i∈I} G_i) is “large” if I contains a member of F, and is “small” otherwise. For example, the capacity can be large iff I is of size at least k.

Not much is known about the Shannon capacity of graphs:

What is the maximum possible capacity of a disjoint union of two graphs, each having capacity at most k?
Is the maximum possible capacity of a graph G with independence number 2 bounded by an absolute constant?
Is the problem of deciding whether the Shannon capacity of a given input graph is at least a given real x decidable?
What is the expected value of c(G(n,½))?

Broadcasting with side information

A sender holds a word x = x_1 x_2 … x_n of n blocks x_i, each consisting of t bits, which he wants to broadcast to m receivers. Each receiver R_i is interested in x_{f(i)} and has prior side information consisting of some other blocks x_j.

β_t = the minimum number of bits that have to be transmitted to allow each R_i to recover x_{f(i)}, and

β = lim_{t→∞} β_t/t ( = inf_t β_t/t ).

Motivation [Birk and Kol (98)], [Bar-Yossef, Birk, Jayram and Kol (06)]: applications such as Video on Demand.

The representing directed hypergraph: the vertices are [n] = {1, 2, …, n} (= the indices of the blocks); for each receiver R_j there is a directed edge (f(j), N(j)), where N(j) is the set of indices of all blocks known to R_j. Thus β_t = β_t(H), β_{s+t}(H) ≤ β_s(H) + β_t(H), and β(H) = inf_t β_t(H)/t. In words: β is the asymptotic average number of encoding bits needed per bit in each input block.

Example: H = ([5], { (1,{2}), (2,{1}), (3,{4,5}), (4,{3,5}), (5,{3,4}) }).

[figure: the representing hypergraph on blocks 1..5, with 1, 2 paired and 3, 4, 5 forming a triangle]

β_t(H) ≥ 2t, as receivers R_1 and R_3 have to recover x_1 and x_3.
β_t(H) ≤ 2t: it is enough to transmit x_1 ⊕ x_2 and x_3 ⊕ x_4 ⊕ x_5 (verified in the sketch below).

A related parameter: β_t* = β_1(tH), where tH is the disjoint union of t copies of H. In words: β_t*(H) is the minimum number of bits required if the network topology is replicated t independent times, and

β*(H) = lim_{t→∞} β_t*(H)/t = inf_t β_t*(H)/t.
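The example above is small enough to check exhaustively. The following sketch is my own addition, not from the slides (the names edges, broadcast and confusable are mine, and blocks 1..5 are indexed 0..4): for t = 1 it verifies that the two transmitted XOR bits let every receiver decode, and that four pairwise confusable inputs force at least two transmitted bits.

```python
from itertools import combinations, product

# The example network: receiver i wants block x_i and knows the listed blocks.
edges = [(0, (1,)), (1, (0,)), (2, (3, 4)), (3, (2, 4)), (4, (2, 3))]

def broadcast(x):
    # The slide's 2-bit scheme: x_1 + x_2 and x_3 + x_4 + x_5 over GF(2).
    return (x[0] ^ x[1], x[2] ^ x[3] ^ x[4])

words = list(product((0, 1), repeat=5))

# beta_1 <= 2: if two inputs share the broadcast and a receiver's side
# information, they agree on that receiver's wanted block, so decoding works.
for x, y in combinations(words, 2):
    if broadcast(x) == broadcast(y):
        for want, known in edges:
            if all(x[j] == y[j] for j in known):
                assert x[want] == y[want]

def confusable(x, y):
    # Some receiver sees identical side information but a different wanted block.
    return any(x[w] != y[w] and all(x[j] == y[j] for j in known)
               for w, known in edges)

# beta_1 >= 2: these four inputs are pairwise confusable, so any zero-error
# code must give them four distinct codewords, hence at least 2 bits.
clique = [(0, 0, 0, 0, 0), (1, 0, 0, 0, 0), (0, 0, 1, 0, 0), (1, 0, 1, 0, 0)]
assert all(confusable(x, y) for x, y in combinations(clique, 2))
print("beta_1(H) = 2 for the example network")
```

The lower-bound check anticipates the confusion-graph view below: pairwise confusable inputs must receive distinct codewords.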
Facts:

α(H) ≤ β(H) ≤ β*(H) ≤ β_1(H)

α(H) = max{ |S| : S ⊂ V, ∀ v ∈ S ∃ e = (v,J) ∈ E with J ∩ S = ∅ }

β_1(H) ≤ min{ r : V = V_1 ∪ ⋯ ∪ V_r, ∀ e = (v,J) ∈ E ∃ i with v ∈ V_i and V_i − v ⊂ J }

Computing β*(H)

Given H = ([n],E) and t = 1, two input strings x, y ∈ {0,1}^n are confusable if there exists a receiver (= directed edge) (i,J) in E such that x_j = y_j for all j in J, and yet x_i ≠ y_i.

Let λ (= λ_n) denote the maximum cardinality of a set of input strings that are pairwise non-confusable.

Theorem 1 [A, Lubetzky, Stav (08)]: β*(H) = lim_{t→∞} β_1(tH)/t = n − log_2 λ.

Corollary [ALS]: For H = C_5 = (Z_5, E), where E = { (i, {i−1, i+1}) : i ∈ Z_5 },

β_1(H) = 3, β*(H) = 5 − log_2 5 = 2.68…

This implies that in Network Coding (introduced by Ahlswede, Cai, Li and Yeung), nonlinear capacity may exceed linear capacity. [Earlier examples were given by Dougherty, Freiling and Zeger (05), and by Chan and Grant (07).]

Theorem 2 [A, Weinstein (08)]: For every constant C there is an (explicit) hypergraph H for which β*(H) < 3 and β_1(H) > C.

[figure: input words such as 1001..0, 0001..1, 0011..0, 1100..0 broadcast to four groups of receivers R_1^(r),…,R_m^(r); R_1^(b),…,R_m^(b); R_1^(g),…,R_m^(g); R_1^(l),…,R_m^(l)]

Theorem 3 [A, Hassidim, Lubetzky, Stav, Weinstein (08)]: For every constant C there is an (explicit) hypergraph H for which β(H) = 2 and yet β*(H) > C.

Codes for disjoint unions of hypergraphs

Definition: The confusion graph C(H) of a directed hypergraph H = ([n],E) describing a broadcast network is the undirected graph on {0,1}^n, where x, y are adjacent iff for some e = (i,J) in E, x_i ≠ y_i and yet x_j = y_j for all j ∈ J. Here each block is of length 1, the vertices are all possible input words, and two are adjacent iff they are confusable.

Observation 1: The number of words in the optimal code for H is χ(C(H)). That is, β_1(H) = ⌈log_2 χ(C(H))⌉.

Definition: The OR graph product of G_1 and G_2 is the graph on V(G_1) × V(G_2), where (u,v) and (u',v') are adjacent if either uu' ∈ E(G_1) or vv' ∈ E(G_2) (or both). Let G^{∨k} denote the k-fold OR product of G.

Observation 2: For any pair H_1 and H_2 of directed hypergraphs, the confusion graph of their disjoint union is the OR product of C(H_1) and C(H_2).

Theorem [McEliece + Posner (71), Berge + Simonovits (74)]: For every graph G,

lim_{t→∞} [χ(G^{∨t})]^{1/t} = χ_f(G),

where χ_f(G) is the fractional chromatic number of G.

For a directed hypergraph H = ([n],E), G = C(H) is a Cayley graph and thus χ_f(G) = |V(G)|/α(G). Hence χ_f(C(H)) = 2^n/λ(H), where λ(H) is the maximum cardinality of a set of pairwise non-confusable words. Therefore

β*(H) = log_2 [2^n/λ(H)] = n − log_2 λ(H),

proving Theorem 1.

Confusion graphs with χ much bigger than χ_f:

An auxiliary graph: put n = 2^k, and let G be the Cayley graph on (Z_2)^k in which i, j are adjacent iff the Hamming distance |i ⊕ j| between i and j is at least k − √k/100.

Fact 1: χ(G) > √k/100 and χ_f(G) < 2.05.

Proof: The Kneser graph K(k, k/2 − √k/200) is a subgraph of G, and hence by Lovász [78] the chromatic number is at least k − 2(k/2 − √k/200) + 2 > √k/100. The fractional chromatic number is small since G is a Cayley graph, so χ_f(G) = 2^k/α(G), and there is an independent set of size ∑_{i < k/2 − √k/200} (k choose i) > 2^k/2.05.

Let H = ([n],E) be the directed hypergraph on [n], where n = 2^k, and for each i, j satisfying |i ⊕ j| ≥ k − √k/100, (i, V−{i,j}) and (j, V−{i,j}) are directed edges. Let C = C(H) be the corresponding confusion graph. This is the Cayley graph on (Z_2)^n whose generators are all vectors e_i and all e_i ⊕ e_j with |i ⊕ j| ≥ k − √k/100.
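Theorem 1 makes the C_5 corollary above checkable by brute force. The sketch below is my own addition, not from the slides: it computes λ for the C_5 network directly from the definition of confusability and recovers β*(H) = 5 − log_2 5 ≈ 2.68.

```python
from itertools import combinations, product
from math import log2

# C_5 network: receiver i wants block x_i and knows x_{i-1}, x_{i+1} (mod 5).
n = 5
edges = [(i, ((i - 1) % n, (i + 1) % n)) for i in range(n)]

def confusable(x, y):
    # The definition from the "Computing beta*(H)" slide, for t = 1.
    return any(x[w] != y[w] and all(x[j] == y[j] for j in known)
               for w, known in edges)

words = list(product((0, 1), repeat=n))
conf = {(x, y) for x, y in combinations(words, 2) if confusable(x, y)}

# lambda(H): maximum cardinality of a set of pairwise non-confusable words,
# found by growing the target size until no such set exists.
lam = 1
for size in range(2, 2 ** n + 1):
    if any(all(pair not in conf for pair in combinations(s, 2))
           for s in combinations(words, size)):
        lam = size
    else:
        break

print(lam, n - log2(lam))   # expected: 5  2.678... (Theorem 1)
```

Consistently with Observation 1 above, χ_f(C(H)) = 2^5/λ = 6.4, so χ(C(H)) ≥ 7 and β_1(H) ≥ ⌈log_2 7⌉ = 3.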
Fact 2: χ(C) ≥ χ(G) ≥ √(log n)/100.

Proof: The induced subgraph of C on the vectors e_i is G.

Fact 3: χ_f(C) ≤ 2χ_f(G) < 4.1.

Proof (sketch): Let I be a maximum independent set in G. Then for all j ∈ Z_2^k and all ε ∈ {0,1}, the set

I_{j,ε} = { u ∈ Z_2^n : |u| ≡ ε (mod 2), j ⊕ (⊕_{i : u_i = 1} i) ∈ I }

is an independent set of C, and its expected size for random j, ε is 2^n/(2χ_f(G)).

Thus, there is a directed hypergraph H on [n] so that

β_1(H) ≥ log_2(√k/100) = Ω(log log n), β*(H) ≤ log_2 4.1 < 3.
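The proof of Fact 3 can also be verified mechanically on a toy instance. The sketch below is my own addition and shrinks the construction: k = 4 with adjacency threshold d = 3 standing in for k − √k/100, and the weight ≤ 1 independent set standing in for a maximum one (the lifting argument needs only independence). It builds I_{j,ε} for a random choice of j and ε and confirms that it is an independent set of C.

```python
from itertools import combinations
from random import randrange

k, d = 4, 3          # toy parameters; the slides use d = k - sqrt(k)/100
n = 1 << k           # n = 2^k = 16

def gdist(i, j):
    # Hamming distance between i and j viewed as k-bit vectors.
    return bin(i ^ j).count("1")

# An independent set I of the auxiliary graph G on (Z_2)^k: all vectors of
# weight <= 1 (pairwise distance <= 2 < d, so no two are adjacent in G).
I = {0} | {1 << b for b in range(k)}
assert all(gdist(i, j) < d for i, j in combinations(I, 2))

# Generators of the confusion graph C on (Z_2)^n: all e_i, and e_i ^ e_j
# whenever |i ^ j| >= d.  u ~ v in C iff u ^ v is a generator.
gens = {1 << i for i in range(n)}
gens |= {(1 << i) ^ (1 << j) for i in range(n) for j in range(i + 1, n)
         if gdist(i, j) >= d}

def sigma(u):
    # XOR of the indices i (as k-bit vectors) at which u_i = 1.
    s = 0
    for i in range(n):
        if u >> i & 1:
            s ^= i
    return s

# The lifted set I_{j,eps} from the proof of Fact 3; any j, eps would do.
j, eps = randrange(n), randrange(2)
lifted = {u for u in range(1 << n)
          if bin(u).count("1") % 2 == eps and (j ^ sigma(u)) in I}

# Independence in C: no element of the set has a neighbour in the set.
assert all((u ^ g) not in lifted for u in lifted for g in gens)
print(f"|I_(j,eps)| = {len(lifted)} of 2^{n} vertices, independent in C")
```

The parity constraint rules out neighbours across an e_i generator, and independence of I in G rules out neighbours across an e_i ⊕ e_j generator, exactly as in the proof sketched above.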