Achievability for Discrete Memoryless Systems IT Membership and Chapters Committee Abbas El Gamal
Total Page:16
File Type:pdf, Size:1020Kb
Acknowledgments Roberto Padovani for generous gift to the IT society Achievability for Discrete Memoryless Systems IT Membership and Chapters committee Abbas El Gamal Stanford University Summer School of IT general chairs Aylin Yener and Gerhard Kramer and the other organizers Padovani Lecture 2009 The lecture is based on Lecture Notes on Network Information Theory jointly developed with Young-Han Kim, UC San Diego A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 1 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 2 / 151 Network Information Flow Problem Consider a general network information processing system: Each node observes a subset of the sources and wishes to obtain descriptions of some or all the sources, or to compute a replacements function/make a decision based on these sources Source To achieve these goals, the nodes communicate over the network Information flow questions: ◮ Communication Network Node What are the necessary and sufficient conditions on information flow in the network under which the desired information processing goals can be achieved? ◮ What are the optimal coding/computing techniques/protocols needed? The difficulty of answering these questions depends on the information processing goals, the source and communication network models, and the computational and storage capabilities of the nodes ◮ Sources may be data, speech, music, images, video, sensor data, ... ◮ Nodes may be handsets, base stations, servers, sensor nodes, ... ◮ The communication network may be wired, wireless, or a hybrid A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 3 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 4 / 151 For example, if the sources are commodities with demands (rates in bits/sec), the nodes are connected by noiseless rate-constrained links, each intermediate node simply forwards the bits it receives, and the goal is to send the commodities to desired destination nodes, the This simple information processing system model, however, does not problem reduces to the well-studied multi-commodity flow with capture many important aspects of real-world systems: known necessary and sufficient conditions on optimal flow [1] ◮ Real world information sources have redundancies, time and space In the case of a single commodity, these necessary and sufficient correlations, and time variations ◮ Real world networks may suffer from noise, interference, node failures, conditions reduce to the celebrated max-flow min-cut theorem [2, 3] delays, and time variations M3 ◮ Real world networks may allow for broadcasting 2 j ◮ M2 Real world communication nodes may allow for more complex node operations than forwarding C12 ◮ The goal in many information processing systems is not to merely 1 N C14 communicate source information M1 M1 4 C 13 k M3 3 M2 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 5 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 6 / 151 Network Information Theory Network information theory attempts to answer the above information flow questions while capturing some of these aspects of real world Lots of research activities in the 70s and early 80s with many new networks in the framework of Shannon information theory results and techniques developed, but ◮ It involves several new challenges beyond point-to-point Many basic problems remained open ◮ Little interest from information theory community and absolutely no communication, including: multi-accessing; broadcasting; interest from communication practitioners interference; relaying; cooperation; competition; distributed source Wireless communication and the Internet (and advances in coding; use of side information; multiple descriptions; joint technology) have revived the interest in this area source–channel coding; coding for computing; . ◮ Lots of research going on since the mid 90s First paper on multiple user information theory: ◮ Some work on old open problems but some new problems as well Claude E. Shannon, “Two-way communication channels” Proc. 4th Berkeley Symp. Math. Stat. Prob., pp. 611–644, 1961 ◮ He didn’t find the optimal rates ◮ Problem remains open A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 7 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 8 / 151 State of the Theory Capacity/rate regions known only for few settings, including: ◮ Focus has been on compression and communication Point-to-point communication ◮ Multiple access channel Most work assumes discrete memoryless (DM) and Gaussian source ◮ Classes of broadcast, interference, and relay channels and channel models ◮ Classes of channels with state Most results are for separate source–channel settings ◮ Classes of multiple descriptions and distributed source coding ◮ Classes of channels with feedback For such a setting, the goal is to establish a coding theorem that ◮ Classes of deterministic networks determines the set of achievable rate tuples, i.e., ◮ Classes of wiretap channels and distributed key generation settings ◮ the capacity region when sending independent messages over a noisy channel, or Several of the coding techniques developed, such as superposition ◮ the rate/rate-distortion region when sending source descriptions over a coding, successive cancellation decoding, Slepian–Wolf, Wyner–Ziv, noiseless channel successive refinement, writing on dirty paper, network coding, Establishing a coding theorem involves proving: decode–forward, compress–forward, and interference alignment, are ◮ Achievability: There exists a sequence of codes that achieve any rate beginning to have an impact on real world networks tuple in the capacity/rate region But many basic problems remain open and a complete theory is yet to ◮ (Weak) converse: If a rate tuple is achievable, then it must be in the be developed capacity/rate region A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 9 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 10 / 151 The Lecture Outline Will present a simple (yet rigorous) and unified approach to 1 Typical Sequences achievability for DM systems through examples of basic channel and 2 DMC source models: 3 DM-MAC ◮ Strong typicality ◮ Small number of coding techniques and performance analysis tools 4 DM-BC with Degraded Message Sets The approach has its origin in Shannon’s original work, which was 5 DM-IC later made rigorous by Forney [7] and Cover [8] among others 6 DM-RC: Decode–Forward Why focus on achievability? 7 DMS ◮ Achievability is useful—it often leads to practical coding techniques 8 DMS with Side Information at Decoder ◮ We have recently developed a more unified approach based on a small set of techniques and tools 9 DM-RC: Compress–Forward Why discrete instead of Gaussian? 10 DMC with State Available at Encoder ◮ “End-to-end” model for many real world networks 11 DM-BC with Private Messages ◮ Many real world sources are discrete 12 ◮ Achievability for most Gaussian models follows from their discrete DM-WTC counterparts via scalar quantizations and standard limit theorems 13 References and Appendices A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 11 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 12 / 151 Typical Sequences Typical Sequences Typical Sequences Properties of Typical Sequences n n n n n Let x ∈X . Define the empirical pmf (or type) of x as Let X ∼ pX (xi) Qi=1 {i : xi = x} n π(x|xn) := for x ∈X X n (n) Tǫ (X) Let X1, X2,... be a sequence of i.i.d. random variables each drawn n (n) n . −nH(X) according to p(x). By the law of large numbers (LLN), for every P{X ∈ Tǫ }≥ 1 − ǫ p(x ) =2 x ∈X , π(x|Xn) → p(x) in probability (n) . nH(X) For X ∼ p(x), the set of ǫ-typical n-sequences [4] is defined as |Tǫ | =2 T (n)(X) := xn : π(x|xn) − p(x) ≤ ǫ · p(x) for all x ∈X ǫ Typical Average Lemma n (n) Let x ∈Tǫ (X) and g(x) be a nonnegative function, then n H(X) := − E(log p(X)) 1 n . −nH(X) −n(H(X)+δ(ǫ)) n −n(H(X)−δ(ǫ)) (1 − ǫ) E(g(X)) ≤ g(xi) ≤ (1 + ǫ) E(g(X)) p(x ) = 2 means 2 ≤ p(x ) ≤ 2 , n X i=1 where δ(ǫ) → 0 as ǫ → 0 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 13 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 14 / 151 Typical Sequences Typical Sequences Jointly Typical Sequences Properties of Jointly Typical Sequences n n n n (n) . nH(X) The joint type of (x ,y ) ∈X ×Y is defined as Tǫ (X) | · | =2 {i : (xi,yi) = (x,y)} n π(x,y|xn,yn) := for (x,y) ∈X×Y x n yn For (X,Y ) ∼ p(x,y), the set of jointly ǫ-typical n-sequences (xn,yn) is defined as before T (n)(X, Y ) ǫ . T (n)(Y ) | · | =2nH(X,Y ) T (n)(X,Y ) := {(xn,yn) : |π(x,y|xn,yn) − p(x,y)|≤ ǫ · p(x,y) . ǫ ǫ | · | =2nH(Y ) for all (x,y) ∈X ×Y} n n (n) n (n) n (n) If (x ,y ) ∈Tǫ (X,Y ), then x ∈Tǫ (X), y ∈Tǫ (Y ) n (n) n For x ∈Tǫ , the set of conditionally ǫ-typical y sequences is T (n)(Y |xn) T (n)(X|yn) ǫ . ǫ . (n) n n n n | · | =2nH(Y |X) | · | =2nH(X|Y ) Tǫ (Y |x ) := {y : |π(x,y|x ,y ) − p(x,y)|≤ ǫ · p(x,y) for all (x,y) ∈X ×Y} H(X|Y ) := − EX,Y (log p(X|Y )) A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 15 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 16 / 151 Typical Sequences Typical Sequences Another Useful Picture Other Properties n (n) n n n n Let x ∈Tǫ′ (X) and Y |{X = x }∼ i=1 pY |X (yi|xi). Then by the LLN, for every ǫ>ǫ′, Q n X Yn (n) (n) Tǫ (Y ) Tǫ (X) n n (n) P{(x ,Y ) ∈Tǫ (X,Y )}→ 1 as n →∞ Joint Typicality Lemma xn (n) n ˜ n n Given (X,Y ) ∼ p(x,y), let x ∈Tǫ (X) and Y ∼ i=1 pY (˜yi) (instead n Q (n) n of pY |X (˜yi|xi)).