This Week's Topics
(Madhu Sudan, Fall 2004: Essential Coding Theory: MIT 6.895)

• Shannon's Work.
• Mathematical/Probabilistic Model of Communication.
• Definitions of Information, Entropy, Randomness.
• Noiseless Channel & Coding Theorem.
• Noisy Channel & Coding Theorem.
• Converses.
• Algorithmic challenges.
• Detour from Error-correcting codes?

Shannon's Framework (1948)

Three entities: Source, Channel, and Receiver.
• Source: Generates a "message" - a sequence of bits/symbols - according to some "stochastic" process S.
• Communication Channel: The means of passing information from source to receiver. It may introduce errors, where the error sequence is another stochastic process E.
• Receiver: Knows the processes S and E, and would like to know which sequence was generated by the source.

Goals/Options

• Noiseless case: The channel is a precious commodity; we would like to optimize its usage.
• Noisy case: We would like to recover the message despite errors.
• The source can "Encode" information.
• The receiver can "Decode" information.
The theories are very general; we will describe only very specific cases.

Noiseless case: Example

• Channel transmits bits: 0 → 0, 1 → 1, at 1 bit per unit of time.
• Source produces a sequence of independent bits: 0 with probability 1 − p and 1 with probability p.
• Question: What is the expected time to transmit n bits generated by this source?

Noiseless Coding Theorem (for the Example)

Let H2(p) = −(p log2 p + (1 − p) log2(1 − p)).
Noiseless Coding Theorem: Informally, the expected transmission time tends to H(p) · n as n → ∞.
Formally, for every ε > 0 there exists n0 such that for every n ≥ n0 there exist E : {0,1}^n → {0,1}^* and D : {0,1}^* → {0,1}^n satisfying:
• For all x ∈ {0,1}^n, D(E(x)) = x.
• E_x[|E(x)|] ≤ (H(p) + ε) · n, where x is generated by the source.
Proof: Exercise.

Entropy of a source

• A distribution D on a finite set S is a map D : S → [0,1] with Σ_{x∈S} D(x) = 1.
• Entropy: H(D) = Σ_{x∈S} −D(x) log2 D(x).
• Entropy of a p-biased bit: H2(p).
• Entropy quantifies the randomness in a distribution.
• Coding theorem: the entropy is the number of bits (amortized, in expectation) that suffice to specify a point of the probability space.
• A fundamental notion in probability/information theory.

Binary Entropy Function H2(p)

• Plot of H2(p): zero at p = 0 and p = 1, with maximum value 1 at p = 1/2.
• Main significance:
  − Let B2(y, r) = {x ∈ {0,1}^n | Δ(x, y) ≤ r} (the Hamming ball of radius r around y; n implied).
  − Let Vol2(r, n) = |B2(0, r)|.
  − Then Vol2(pn, n) = 2^{(H2(p)+o(1))n} (for p ≤ 1/2).
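The ball-volume estimate above is easy to check numerically. Here is a small sketch (the function names are ours, not the lecture's) that computes H2(p) and compares (1/n) · log2 Vol2(pn, n) against H2(p); the gap between the two is the o(1) term.

```python
from math import comb, log2

def H2(p: float) -> float:
    """Binary entropy function H_2(p) = -(p log2 p + (1-p) log2(1-p))."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

def vol2(r: int, n: int) -> int:
    """Vol_2(r, n) = |B_2(0, r)| = number of x in {0,1}^n of Hamming weight <= r."""
    return sum(comb(n, w) for w in range(r + 1))

# Compare (1/n) * log2 Vol_2(pn, n) with H_2(p); the difference is the o(1) term.
p = 0.1
for n in (100, 1000, 5000):
    exponent = log2(vol2(int(p * n), n)) / n
    print(f"n={n:5d}   log2(Vol)/n = {exponent:.4f}   H2(p) = {H2(p):.4f}")
```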
Noisy Case: Example

• Source produces 0/1, each with probability 1/2.
• Error channel: the Binary Symmetric Channel with parameter p (BSC_p) transmits 1 bit per unit of time, faithfully with probability 1 − p, flipped with probability p.
• Goal: How many source bits can be transmitted in n time units?
  − We may permit some error in recovery.
  − The error probability during recovery should be close to zero.
• Prevailing belief (before Shannon): one can only transmit o(n) bits.

Noisy Coding Theorem (for the Example)

Theorem: (Informally) One can transmit (1 − H(p)) · n bits, with error probability going to zero exponentially fast.
(Formally) For every ε > 0 there exists δ > 0 such that for all n the following holds. Let k = (1 − H(p + ε))n. Then there exist E : {0,1}^k → {0,1}^n and D : {0,1}^n → {0,1}^k such that
  Pr_{η,x}[D(E(x) + η) ≠ x] ≤ exp(−δn),
where x is chosen according to the source and η independently according to BSC_p.

The Encoding and Decoding Functions

• E is chosen at random from all functions mapping {0,1}^k → {0,1}^n.
• D is the brute-force algorithm: for every y, D(y) is the vector x that minimizes Δ(E(x), y).
• Far from constructive!
• But it is a proof of concept!
• Main lemma: For E, D as above, the probability of decoding failure is exponentially small, for any fixed message x.
• The power of the probabilistic method!

Proof of Lemma

• Fix x ∈ {0,1}^k and E(x) first, pick η next, and pick the rest of E last!
• Call η Bad if it has weight more than (p + ε)n. By Chernoff bounds, Pr_η[η is Bad] ≤ 2^{−Ω(n)} for any fixed ε > 0.
• Call x' Bad for (x, η) if E(x') ∈ B2(E(x) + η, (p + ε)n). Over the random choice of E(x'), Pr[x' is Bad for (x, η)] ≤ 2^{H(p+ε)n} / 2^n.
• By a union bound, Pr_E[∃ x' ≠ x that is Bad for (x, η)] ≤ 2^{k + H(p+ε)n − n}.
• If η is not Bad and no x' ≠ x is Bad for (x, η), then D(E(x) + η) = x.
• Conclude that decoding fails with probability at most e^{−Ω(n)} over the random choice of E and η (for every fixed x, and so also when x is chosen at random), provided k + H(p + ε) · n is smaller than n by Ω(n); taking k slightly below (1 − H(p + ε))n provides this slack.
• Conclude that there exists a fixed E for which encoding and decoding lead to exponentially small error probability.

Converse to the Coding Theorems

• Shannon also showed his results to be tight.
• For the noisy case, 1 − H(p) is the best possible rate ...
• ... no matter what E and D are!
• How to prove this?
• Intuition: Say we transmit E(x). With high probability the number of erroneous bits is ≈ pn. In that case, symmetry implies that no single received vector occurs with probability more than about (n choose pn)^{−1} ≈ 2^{−H(p)n}. To have error probability close to zero, roughly 2^{H(p)n} received vectors must decode to x. But then we need 2^k ≤ 2^n / 2^{H(p)n}.

Formal proof of the Converse

• Call η Easy if its weight is at most (p − ε)n, and Hard otherwise; then Pr_η[η is Easy] ≤ exp(−Ω(n)). For any y of weight at least (p − ε)n, Pr[η = y] ≤ 2^{−H(p−ε)n}.
• For x ∈ {0,1}^k let S_x ⊆ {0,1}^n be {y | D(y) = x}. The sets S_x partition {0,1}^n, so Σ_x |S_x| = 2^n.
• Pr[decoding is correct]
    = 2^{−k} Σ_{x∈{0,1}^k} Σ_{y∈S_x} Pr[η = y − E(x)]
    ≤ Pr[η is Easy] + 2^{−k} Σ_x Σ_{y∈S_x} Pr[η = y − E(x), η is Hard]
    ≤ exp(−Ω(n)) + 2^{−k} · 2^n · 2^{−H(p−ε)n},
  which is exponentially small whenever k exceeds (1 − H(p − ε))n by Ω(n). Since ε > 0 was arbitrary, transmission at any rate above 1 − H(p) must fail.
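Both the coding theorem and its converse can be seen in miniature by simulation: choose E at random, decode by brute-force nearest-codeword search, and estimate P_err at rates below and above 1 − H(p). The sketch below is ours (all names and parameters are illustrative, not from the lecture) and uses toy block lengths, so it shows only the trend, not the asymptotic statement.

```python
import random
from math import log2

def random_code(k: int, n: int) -> dict:
    """E chosen at random: each message in {0,1}^k gets an independent uniform codeword in {0,1}^n."""
    return {x: tuple(random.randint(0, 1) for _ in range(n)) for x in range(2 ** k)}

def bsc(word, p: float):
    """Pass a word through BSC_p: flip each bit independently with probability p."""
    return tuple(b ^ (random.random() < p) for b in word)

def decode(E: dict, y) -> int:
    """Brute-force decoder D: the message whose codeword is closest to y in Hamming distance."""
    return min(E, key=lambda x: sum(a != b for a, b in zip(E[x], y)))

def estimate_perr(k: int, n: int, p: float, trials: int = 300) -> float:
    """Monte Carlo estimate of P_err = Pr_{x, eta}[D(E(x) + eta) != x] for one random code E."""
    E = random_code(k, n)
    errors = sum(decode(E, bsc(E[x], p)) != x
                 for x in (random.randrange(2 ** k) for _ in range(trials)))
    return errors / trials

p, n = 0.1, 15
h = -(p * log2(p) + (1 - p) * log2(1 - p))
print(f"BSC_{p}: capacity 1 - H2(p) = {1 - h:.2f}")
for k in (3, 6, 9, 12):          # rates k/n straddling the capacity
    print(f"rate {k/n:.2f}   estimated P_err ~ {estimate_perr(k, n, p):.2f}")
```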
Importance of Shannon's Framework

• The examples considered so far are the baby examples!
• The theory is wide and general.
• But it is essentially probabilistic + "information-theoretic", not computational.
• For example: give an explicit E! Give an efficient D! Shannon's work does not.

More general source

• The framework allows for Markovian sources.
• A source is described by a finite collection of states with a probability transition matrix.
• Each state corresponds to a fixed symbol of the output.
• Interesting example in the original paper: a Markovian model of English, used to compute the rate (entropy) of English!

More general models of error

• The i.i.d. case is, in general, given by a transition matrix from an input alphabet Σ to an output alphabet Γ. (Σ and Γ need not be finite - e.g., the Additive White Gaussian Noise channel - yet the capacity may still be finite.)
• Markovian error models are also allowed: they are captured by a state diagram, with each state having its own transition matrix from Σ to Γ.

General theorem

• Every source has a Rate (based on the entropy of the distribution it generates).
• Every channel has a Capacity.
Theorem: If Rate < Capacity, information transmission is feasible, with error probability decreasing exponentially in the length of the transmission. If Rate > Capacity, information transmission is not feasible.

Contrast with Hamming

• Main goals of Shannon theory:
  − Constructive (polynomial-time/linear-time/etc.) E and D.
  − Maximize the rate k/n, where E : {0,1}^k → {0,1}^n.
  − While minimizing P_err = Pr_{x,η}[D(E(x) + η) ≠ x].
• Hamming theory:
  − Explicit description of the code {E(x)}_x.
  − No focus on E and D themselves.
  − Maximize k/n and d/n, where d = min_{x1 ≠ x2} Δ(E(x1), E(x2)).
• Interpretation: Shannon theory deals with probabilistic error, Hamming theory with adversarial error. The engineering need is closer to Shannon's setting; nevertheless Hamming theory provided the solutions, since minimum distance proved easier to analyze than P_err.
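For contrast, the same kind of random code can be scored by Hamming's figure of merit instead: its minimum distance d. A small self-contained sketch (names ours, parameters illustrative):

```python
import random
from itertools import combinations

def random_code(k: int, n: int):
    """A random code: 2^k independent, uniformly random codewords in {0,1}^n."""
    return [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(2 ** k)]

def min_distance(code) -> int:
    """Hamming's figure of merit: d = minimum Hamming distance over all pairs of distinct codewords."""
    return min(sum(a != b for a, b in zip(c1, c2)) for c1, c2 in combinations(code, 2))

# Rate k/n versus relative distance d/n for a few random codes.
n = 20
for k in (3, 5, 7):
    code = random_code(k, n)
    print(f"k/n = {k/n:.2f}   d/n = {min_distance(code)/n:.2f}")
```

Note that even for a random code the distance is measured by checking all pairs of codewords; Hamming-style theory instead seeks explicit codes whose distance can be proved directly, which is why it focuses on the code {E(x)}_x rather than on E and D.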
