Acknowledgments

Achievability for Discrete Memoryless Systems
Padovani Lecture 2009

Acknowledgments:
- Roberto Padovani, for his generous gift to the IT Society
- The IT Membership and Chapters committee
- The Stanford University Summer School of IT general chairs Aylin Yener and Gerhard Kramer, and the other organizers
- The lecture is based on Lecture Notes on Network Information Theory, jointly developed with Young-Han Kim, UC San Diego

A. El Gamal (Stanford University), Achievability, Padovani Lecture 2009

Network Information Flow Problem

Consider a general network information processing system:
- Each node observes a subset of the sources and wishes to obtain descriptions of some or all the sources, or to compute a function/make a decision based on these sources
- To achieve these goals, the nodes communicate over the network
[Figure: source nodes communicating over a communication network]
- Information flow questions:
  - What are the necessary and sufficient conditions on information flow in the network under which the desired information processing goals can be achieved?
  - What are the optimal coding/computing techniques/protocols needed?
- The difficulty of answering these questions depends on the information processing goals, the source and communication network models, and the computational and storage capabilities of the nodes
  - Sources may be data, speech, music, images, video, sensor data, ...
  - Nodes may be handsets, base stations, servers, sensor nodes, ...
  - The communication network may be wired, wireless, or a hybrid

- For example, if the sources are commodities with demands (rates in bits/sec), the nodes are connected by noiseless rate-constrained links, each intermediate node simply forwards the bits it receives, and the goal is to send the commodities to desired destination nodes, the problem reduces to the well-studied multi-commodity flow problem, with known necessary and sufficient conditions on optimal flow [1]
- In the case of a single commodity, these necessary and sufficient conditions reduce to the celebrated max-flow min-cut theorem [2, 3]
[Figure: a network graph with link capacities C_12, C_13, C_14 and commodities M_1, M_2, M_3 routed between numbered nodes]

This simple information processing system model, however, does not capture many important aspects of real-world systems:
- Real world information sources have redundancies, time and space correlations, and time variations
- Real world networks may suffer from noise, interference, node failures, delays, and time variations
- Real world networks may allow for broadcasting
- Real world communication nodes may allow for more complex node operations than forwarding
- The goal in many information processing systems is not to merely communicate source information


Network Information Theory

Network information theory attempts to answer the above information flow questions, while capturing some of these aspects of real world networks, in the framework of Shannon information theory
- It involves several new challenges beyond point-to-point communication, including: multi-accessing; broadcasting; interference; relaying; cooperation; competition; distributed source coding; use of side information; multiple descriptions; joint source–channel coding; coding for computing; ...
- First paper on multiple user information theory: Claude E. Shannon, "Two-way communication channels," Proc. 4th Berkeley Symp. Math. Stat. Prob., pp. 611–644, 1961
  - He didn't find the optimal rates
  - The problem remains open
- Lots of research activity in the 70s and early 80s, with many new results and techniques developed, but
  - Many basic problems remained open
  - There was little interest from the information theory community and absolutely no interest from communication practitioners
- Wireless communication and the Internet (and advances in technology) have revived interest in this area
  - Lots of research going on since the mid 90s
  - Some work on old open problems, but some new problems as well

State of the Theory
- Focus has been on compression and communication
- Most work assumes discrete memoryless (DM) and Gaussian source and channel models
- Most results are for separate source–channel settings
- For such a setting, the goal is to establish a coding theorem that determines the set of achievable rate tuples, i.e.,
  - the capacity region when sending independent messages over a noisy channel, or
  - the rate/rate-distortion region when sending source descriptions over a noiseless channel
- Establishing a coding theorem involves proving:
  - Achievability: there exists a sequence of codes that achieves any rate tuple in the capacity/rate region
  - (Weak) converse: if a rate tuple is achievable, then it must be in the capacity/rate region
- Capacity/rate regions are known only for a few settings, including:
  - Point-to-point communication
  - The multiple access channel
  - Classes of broadcast, interference, and relay channels
  - Classes of channels with state
  - Classes of multiple descriptions and distributed source coding
  - Classes of channels with feedback
  - Classes of deterministic networks
  - Classes of wiretap channels and distributed key generation settings
- Several of the coding techniques developed, such as superposition coding, successive cancellation decoding, Slepian–Wolf, Wyner–Ziv, successive refinement, writing on dirty paper, network coding, decode–forward, compress–forward, and interference alignment, are beginning to have an impact on real world networks
- But many basic problems remain open, and a complete theory is yet to be developed


The Lecture
- Will present a simple (yet rigorous) and unified approach to achievability for DM systems through examples of basic channel and source models:
  - Strong typicality
  - A small number of coding techniques and performance analysis tools
- The approach has its origin in Shannon's original work, which was later made rigorous by Forney [7] and Cover [8] among others
- Why focus on achievability?
  - Achievability is useful: it often leads to practical coding techniques
  - We have recently developed a more unified approach based on a small set of techniques and tools
- Why discrete instead of Gaussian?
  - "End-to-end" model for many real world networks
  - Many real world sources are discrete
  - Achievability for most Gaussian models follows from their discrete counterparts via scalar quantizations and standard limit theorems

Outline
1. Typical Sequences
2. DMC
3. DM-MAC
4. DM-BC with Degraded Message Sets
5. DM-IC
6. DM-RC: Decode–Forward
7. DMS
8. DMS with Side Information at Decoder
9. DM-RC: Compress–Forward
10. DMC with State Available at Encoder
11. DM-BC with Private Messages
12. DM-WTC
13. References and Appendices

Typical Sequences
- Let x^n ∈ X^n. Define the empirical pmf (or type) of x^n as
  π(x | x^n) := |{i : x_i = x}| / n for x ∈ X
- Let X_1, X_2, ... be a sequence of i.i.d. random variables each drawn according to p(x). By the law of large numbers (LLN), for every x ∈ X, π(x | X^n) → p(x) in probability
- For X ∼ p(x), the set of ε-typical n-sequences [4] is defined as
  T_ε^(n)(X) := {x^n : |π(x | x^n) − p(x)| ≤ ε · p(x) for all x ∈ X}
- H(X) := −E(log p(X))

Properties of Typical Sequences
- Let X^n ∼ ∏_{i=1}^n p_X(x_i). Then:
  - p(x^n) ≐ 2^{−nH(X)} for each x^n ∈ T_ε^(n)(X), where p(x^n) ≐ 2^{−nH(X)} means 2^{−n(H(X)+δ(ε))} ≤ p(x^n) ≤ 2^{−n(H(X)−δ(ε))}, with δ(ε) → 0 as ε → 0
  - P{X^n ∈ T_ε^(n)} ≥ 1 − ε for n sufficiently large
  - |T_ε^(n)| ≐ 2^{nH(X)}

Typical Average Lemma
- Let x^n ∈ T_ε^(n)(X) and let g(x) be a nonnegative function. Then
  (1 − ε) E(g(X)) ≤ (1/n) Σ_{i=1}^n g(x_i) ≤ (1 + ε) E(g(X))
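The definitions above are easy to check numerically. Below is a small illustrative sketch (not from the lecture; all names are my own): it draws an i.i.d. Bern(0.2) sequence, tests membership in T_ε^(n)(X), and checks that −(1/n) log p(x^n) is close to H(X), as the first property predicts.

```python
import math
import random
from collections import Counter

def empirical_pmf(xs, alphabet):
    """Empirical pmf (type) pi(x | x^n) of a sequence."""
    counts = Counter(xs)
    n = len(xs)
    return {a: counts.get(a, 0) / n for a in alphabet}

def is_typical(xs, p, eps):
    """Membership test for the eps-typical set T_eps^(n)(X)."""
    pi = empirical_pmf(xs, p.keys())
    return all(abs(pi[a] - p[a]) <= eps * p[a] for a in p)

def entropy(p):
    """H(X) in bits."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

random.seed(0)
p = {0: 0.8, 1: 0.2}                  # Bern(0.2) source
n, eps = 10_000, 0.1
xs = random.choices([0, 1], weights=[p[0], p[1]], k=n)

# By the LLN the empirical pmf is close to p, so x^n is typical whp
print(is_typical(xs, p, eps))         # True with high probability
# and its probability is roughly 2^{-n H(X)}
log_prob = sum(math.log2(p[x]) for x in xs)
print(abs(-log_prob / n - entropy(p)) < 0.05)
```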


Jointly Typical Sequences
- The joint type of (x^n, y^n) ∈ X^n × Y^n is defined as
  π(x, y | x^n, y^n) := |{i : (x_i, y_i) = (x, y)}| / n for (x, y) ∈ X × Y
- For (X, Y) ∼ p(x, y), the set of jointly ε-typical n-sequences is defined as before:
  T_ε^(n)(X, Y) := {(x^n, y^n) : |π(x, y | x^n, y^n) − p(x, y)| ≤ ε · p(x, y) for all (x, y) ∈ X × Y}
- For x^n ∈ T_ε^(n)(X), the set of conditionally ε-typical y^n sequences is
  T_ε^(n)(Y | x^n) := {y^n : |π(x, y | x^n, y^n) − p(x, y)| ≤ ε · p(x, y) for all (x, y) ∈ X × Y}
- H(X | Y) := −E_{X,Y}(log p(X | Y))

Properties of Jointly Typical Sequences
- |T_ε^(n)(X)| ≐ 2^{nH(X)}, |T_ε^(n)(Y)| ≐ 2^{nH(Y)}, and |T_ε^(n)(X, Y)| ≐ 2^{nH(X,Y)}
- |T_ε^(n)(Y | x^n)| ≐ 2^{nH(Y|X)} and |T_ε^(n)(X | y^n)| ≐ 2^{nH(X|Y)}
- If (x^n, y^n) ∈ T_ε^(n)(X, Y), then x^n ∈ T_ε^(n)(X) and y^n ∈ T_ε^(n)(Y)

Other Properties
- Let x^n ∈ T_{ε'}^(n)(X) and Y^n | {X^n = x^n} ∼ ∏_{i=1}^n p_{Y|X}(y_i | x_i). Then by the LLN, for every ε > ε',
  P{(x^n, Y^n) ∈ T_ε^(n)(X, Y)} → 1 as n → ∞

Joint Typicality Lemma
- Given (X, Y) ∼ p(x, y), let x^n ∈ T_ε^(n)(X) and let Ỹ^n ∼ ∏_{i=1}^n p_Y(ỹ_i) (instead of ∏_{i=1}^n p_{Y|X}(ỹ_i | x_i)). Then
  P{(x^n, Ỹ^n) ∈ T_ε^(n)(X, Y)} ≐ 2^{−nI(X;Y)}
- I(X; Y) := H(X) + H(Y) − H(X, Y)

[Figure ("Another Useful Picture"): for a fixed x^n ∈ T_ε^(n)(X), the set T_ε^(n)(Y | x^n), of size ≐ 2^{nH(Y|X)}, sits inside T_ε^(n)(Y), of size ≐ 2^{nH(Y)}]
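The identity I(X; Y) = H(X) + H(Y) − H(X, Y) can be sanity-checked numerically. A minimal sketch (illustrative, not from the lecture), using the joint pmf induced by a BSC with crossover 0.1 and a uniform input:

```python
import math

def H(probs):
    """Entropy in bits of a collection of probabilities."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

# Joint pmf of (X, Y): X ~ Bern(1/2) sent through a BSC(0.1)
pxy = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
px = {0: 0.5, 1: 0.5}
py = {0: 0.5, 1: 0.5}

I1 = H(px.values()) + H(py.values()) - H(pxy.values())
# Direct definition as a relative-entropy sum, for comparison
I2 = sum(p * math.log2(p / (px[x] * py[y]))
         for (x, y), p in pxy.items())
print(abs(I1 - I2) < 1e-9)    # both forms agree
print(round(I1, 3))           # 1 - H2(0.1) ~= 0.531 bits
```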


Multivariate Typical Sequences
- The set of ε-typical n-sequences (x_1^n, x_2^n, x_3^n) is defined as
  T_ε^(n)(X_1, X_2, X_3) := {(x_1^n, x_2^n, x_3^n) : |π(x_1, x_2, x_3 | x_1^n, x_2^n, x_3^n) − p(x_1, x_2, x_3)| ≤ ε · p(x_1, x_2, x_3) for all (x_1, x_2, x_3) ∈ X_1 × X_2 × X_3}
- Properties of jointly typical sets continue to hold by considering subsets of the random variables as single random variables

Markov Lemma [5]
- Suppose X → Y → Z form a Markov chain. Let (x^n, y^n) ∈ T_{ε'}^(n)(X, Y). If Z^n | {Y^n = y^n} ∼ ∏_{i=1}^n p_{Z|Y}(z_i | y_i), then for every ε > ε',
  P{(x^n, y^n, Z^n) ∈ T_ε^(n)(X, Y, Z)} → 1 as n → ∞
- The proof follows by the LLN

Conditional Joint Typicality Lemma
- Let (X_1, X_2, X_3) ∼ p(x_1, x_2, x_3). Given (x_1^n, x_2^n) ∈ T_ε^(n)(X_1, X_2), let X̃_3^n be drawn according to ∏_{i=1}^n p_{X_3|X_1}(x̃_3i | x_1i) (instead of p_{X_3|X_1,X_2}(x̃_3i | x_1i, x_2i)). Then
  P{(x_1^n, x_2^n, X̃_3^n) ∈ T_ε^(n)(X_1, X_2, X_3)} ≐ 2^{−nI(X_2;X_3|X_1)}


Discrete Memoryless Channel
- Consider a DMC (X, p(y|x), Y) consisting of two finite sets X, Y and a collection of conditional pmfs p(y|x) over Y
- The sender X wishes to send a message reliably to the receiver Y
[Figure: M → Encoder → X^n → p(y|x) → Y^n → Decoder → M̂]
- A (2^{nR}, n) code for the DMC consists of:
  1. A message set [1 : 2^{nR}] := {1, 2, ..., 2^{⌈nR⌉}}
  2. An encoder that assigns a codeword x^n(m) to each message m ∈ [1 : 2^{nR}]
  3. A decoder that assigns to each received sequence y^n a message estimate m̂(y^n) ∈ [1 : 2^{nR}] or an error e
- Assume that the message M ∼ Unif[1 : 2^{nR}]
- The average probability of error for a (2^{nR}, n) code is P_e^(n) = P{M̂ ≠ M}
- A rate R is achievable if there exists a sequence of (2^{nR}, n) codes with probability of error P_e^(n) → 0 as n → ∞
- The capacity of the DMC is the supremum of all achievable rates

Theorem (Shannon [6])
The capacity of the DMC (X, p(y|x), Y) is
  C = max_{p(x)} I(X; Y)

- We prove achievability using random codebook generation, joint typicality decoding, and properties of jointly typical sequences
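The maximization C = max_{p(x)} I(X; Y) can be carried out numerically for any DMC. A minimal sketch of the standard Blahut–Arimoto algorithm (not covered in the lecture; the function name and stopping rule below are my own choices):

```python
import math

def blahut_arimoto(P, tol=1e-9, max_iter=100_000):
    """C = max_{p(x)} I(X;Y) in bits, for channel matrix P[x][y] = p(y|x)."""
    nx, ny = len(P), len(P[0])
    p = [1.0 / nx] * nx                       # start from the uniform input
    low = 0.0
    for _ in range(max_iter):
        # output pmf induced by the current input pmf
        q = [sum(p[x] * P[x][y] for x in range(nx)) for y in range(ny)]
        # c_x = 2^{D(p(y|x) || q(y))}
        c = [2 ** sum(P[x][y] * math.log2(P[x][y] / q[y])
                      for y in range(ny) if P[x][y] > 0)
             for x in range(nx)]
        s = sum(p[x] * c[x] for x in range(nx))
        low, high = math.log2(s), math.log2(max(c))   # capacity sandwich
        if high - low < tol:
            return low
        p = [p[x] * c[x] / s for x in range(nx)]      # multiplicative update
    return low

# BSC with crossover 0.1: C = 1 - H2(0.1) ~= 0.531 bits
bsc = [[0.9, 0.1], [0.1, 0.9]]
print(round(blahut_arimoto(bsc), 3))   # -> 0.531
```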


Proof of Achievability
- Codebook generation: Fix p(x) that achieves C. Randomly and independently generate x^n(m), m ∈ [1 : 2^{nR}], sequences each according to p(x^n) = ∏_{i=1}^n p_X(x_i). The generated set of sequences constitutes the codebook C, which is revealed to both the sender and receiver before transmission commences
- Encoding: To send a message m ∈ [1 : 2^{nR}], transmit x^n(m)
- Decoding: We use joint-typicality decoding. Upon receiving y^n, the decoder declares that m̂ ∈ [1 : 2^{nR}] is sent if it is the unique message such that (x^n(m̂), y^n) ∈ T_ε^(n); otherwise an error e is declared

Analysis of the probability of error:
- We bound the probability of error P(E) averaged over codebooks and M. By the symmetry of codebook generation, P(E) = P(E | M = 1). Thus, we assume wlog that M = 1 is sent
- A decoding error E occurs iff
  E_1 := {(X^n(1), Y^n) ∉ T_ε^(n)}, or
  E_2 := {(X^n(m), Y^n) ∈ T_ε^(n) for some m ∈ [2 : 2^{nR}]}
- Then, P(E) = P(E_1 ∪ E_2) ≤ P(E_1) + P(E_2)
- Since (X^n(1), Y^n) ∼ ∏_{i=1}^n p_{X,Y}(x_i, y_i), P(E_1) → 0 as n → ∞ by the LLN
- Next, consider P(E_2). Since for m ≠ 1, (X^n(m), Y^n) ∼ ∏_{i=1}^n p_X(x_i) p_Y(y_i), by the joint typicality lemma,
  P{(X^n(m), Y^n) ∈ T_ε^(n)} ≤ 2^{−n(I(X;Y)−δ(ε))} = 2^{−n(C−δ(ε))}


- By the union of events bound,
  P(E_2) ≤ Σ_{m=2}^{2^{nR}} P{(X^n(m), Y^n) ∈ T_ε^(n)} ≤ Σ_{m=2}^{2^{nR}} 2^{−n(C−δ(ε))} ≤ 2^{−n(C−R−δ(ε))},
  which → 0 as n → ∞ if R < C − δ(ε). This completes the proof of achievability

Packing Lemma
- Let (U, X, Y) ∼ p(u, x, y), ε > 0, and U^n ∼ ∏_{i=1}^n p_U(u_i)
- Let X^n(m), m ∈ A, where |A| ≤ 2^{nR}, be random sequences each drawn according to ∏_{i=1}^n p_{X|U}(x_i | u_i)
- Let Y^n be a random sequence drawn according to an arbitrary pmf p(y^n | u^n), conditionally independent of each X^n(m), m ∈ A, given U^n
- Then there exists δ(ε) → 0 as ε → 0 such that if R < I(X; Y | U) − δ(ε), then
  P{(U^n, X^n(m), Y^n) ∈ T_ε^(n) for some m ∈ A} → 0 as n → ∞



DM Multiple Access Channel
- Consider a 2-sender DM-MAC (X_1 × X_2, p(y | x_1, x_2), Y)
- Senders X_1, X_2 wish to send independent messages to the receiver Y
[Figure: M_1 → Encoder 1 → X_1^n and M_2 → Encoder 2 → X_2^n, into p(y | x_1, x_2) → Y^n → Decoder → (M̂_1, M̂_2)]
- A (2^{nR_1}, 2^{nR_2}, n) code for the DM-MAC consists of two message sets [1 : 2^{nR_1}] and [1 : 2^{nR_2}], two encoders that assign codewords x_1^n(m_1) and x_2^n(m_2) to the messages, and a decoder that assigns to each received sequence y^n an estimate (m̂_1, m̂_2)(y^n) or an error e


- Assume that (M_1, M_2) ∼ Unif([1 : 2^{nR_1}] × [1 : 2^{nR_2}]). Thus, X_1^n(M_1) and X_2^n(M_2) are independent
- The average probability of error is
  P_e^(n) = P{(M̂_1, M̂_2) ≠ (M_1, M_2)}
- A rate pair (R_1, R_2) is achievable for the DM-MAC if there exists a sequence of (2^{nR_1}, 2^{nR_2}, n) codes with P_e^(n) → 0 as n → ∞
- The capacity region of the DM-MAC is the closure of the set of achievable rate pairs (R_1, R_2)

Theorem (Ahlswede [9], Liao [10])
Let (X_1, X_2) ∼ p(x_1)p(x_2) and let R(X_1, X_2) be the set of rate pairs (R_1, R_2) such that
  R_1 ≤ I(X_1; Y | X_2),
  R_2 ≤ I(X_2; Y | X_1),
  R_1 + R_2 ≤ I(X_1, X_2; Y).
The capacity region for the DM-MAC is the convex closure of the union of the regions R(X_1, X_2) over all p(x_1)p(x_2)
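As a concrete sanity check (a standard textbook example, not taken from these slides): for the binary adder MAC Y = X_1 + X_2 with i.i.d. Bern(1/2) inputs, the pentagon has individual-rate bound I(X_1; Y | X_2) = 1 and sum-rate bound I(X_1, X_2; Y) = 1.5 bits. The sketch below computes both directly from the joint pmf:

```python
import math
from itertools import product

def H(pmf):
    """Entropy in bits of a pmf given as a dict of probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

# Binary adder MAC: Y = X1 + X2, with X1, X2 ~ iid Bern(1/2)
pxy = {}
for x1, x2 in product([0, 1], repeat=2):
    y = x1 + x2
    pxy[(x1, x2, y)] = pxy.get((x1, x2, y), 0) + 0.25

def marginal(keys):
    """Marginal pmf over a subset of the coordinates x1, x2, y."""
    out = {}
    for (x1, x2, y), p in pxy.items():
        k = tuple({'x1': x1, 'x2': x2, 'y': y}[v] for v in keys)
        out[k] = out.get(k, 0) + p
    return out

# Y is a deterministic function of (X1, X2), so H(Y | X1, X2) = 0
R1_max = H(marginal(['x2', 'y'])) - H(marginal(['x2']))  # I(X1;Y|X2) = H(Y|X2)
Rsum_max = H(marginal(['y']))                            # I(X1,X2;Y) = H(Y)
print(R1_max, Rsum_max)   # -> 1.0 1.5
```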


- R(X_1, X_2) is in general a pentagon with a 45° side (why?)
[Figure: the pentagon R(X_1, X_2) in the (R_1, R_2) plane, with corner values I(X_1; Y), I(X_1; Y | X_2) on the R_1 axis and I(X_2; Y), I(X_2; Y | X_1) on the R_2 axis]
- The rest of the capacity region is achieved using time sharing between points in different R(X_1, X_2) regions: if rate pairs (R_1, R_2) and (R_1', R_2') are achievable, then using the codes of the first pair a fraction α of the time and the codes of the second pair the rest of the time, we can achieve (αR_1 + ᾱR_1', αR_2 + ᾱR_2'), where ᾱ := 1 − α
- Note that time sharing is more general than time/frequency division, where we time-share between (R_1, 0) and (0, R_2)

Achievability
- We first fix (X_1, X_2) ∼ p(x_1)p(x_2) and show that every rate pair (R_1, R_2) in the interior of R(X_1, X_2) is achievable
- The new achievability ideas are successive cancellation decoding, simultaneous decoding, and time sharing, in addition to applying the packing lemma
- Codebook generation: Fix p(x_1)p(x_2). Randomly and independently generate x_1^n(m_1), m_1 ∈ [1 : 2^{nR_1}], sequences each according to ∏_{i=1}^n p_{X_1}(x_1i). Similarly generate x_2^n(m_2), m_2 ∈ [1 : 2^{nR_2}], sequences each according to ∏_{i=1}^n p_{X_2}(x_2i)
- Encoding: To send the message m_1, encoder 1 transmits x_1^n(m_1). Similarly, to send m_2, encoder 2 transmits x_2^n(m_2)


Successive Cancellation Decoding
- This decoding rule aims to achieve the two corner points of the pentagon, e.g., (R_1, R_2) = (I(X_1; Y), I(X_2; Y | X_1))
- Decoding is performed in two steps: upon receiving y^n, the decoder first finds the unique m̂_1 such that (x_1^n(m̂_1), y^n) ∈ T_ε^(n); it then finds the unique m̂_2 such that (x_1^n(m̂_1), x_2^n(m̂_2), y^n) ∈ T_ε^(n)
- Since for m_1 ≠ 1, (X_1^n(m_1), Y^n) ∼ ∏_{i=1}^n p_{X_1}(x_1i) p_Y(y_i), by the packing lemma (with |A| = 2^{nR_1} − 1, U = ∅), P(E_12) → 0 as n → ∞ if R_1 < I(X_1; Y) − δ(ε)


- Similarly, by the packing lemma (with U = X_1), the second decoding step fails with vanishing probability if R_2 < I(X_2; Y | X_1) − δ(ε)
- Thus the total average probability of decoding error P(E) ≤ P(E_1) + P(E_2) → 0 as n → ∞ if R_1 < I(X_1; Y) − δ(ε) and R_2 < I(X_2; Y | X_1) − δ(ε)

Simultaneous Joint Typicality Decoding
- We can prove achievability of any rate pair in the interior of R(X_1, X_2) without time sharing
- The decoder declares that (m̂_1, m̂_2) is sent if it is the unique pair such that (x_1^n(m̂_1), x_2^n(m̂_2), y^n) ∈ T_ε^(n); otherwise an error is declared. Assume (m_1, m_2) = (1, 1) is sent

- An error E occurs iff
  E_1 := {(X_1^n(1), X_2^n(1), Y^n) ∉ T_ε^(n)}, or
  E_2 := {(X_1^n(m_1), X_2^n(1), Y^n) ∈ T_ε^(n) for some m_1 ≠ 1}, or
  E_3 := {(X_1^n(1), X_2^n(m_2), Y^n) ∈ T_ε^(n) for some m_2 ≠ 1}, or
  E_4 := {(X_1^n(m_1), X_2^n(m_2), Y^n) ∈ T_ε^(n) for some m_1 ≠ 1, m_2 ≠ 1}
- Thus, P(E) ≤ Σ_{j=1}^4 P(E_j)
- By the LLN, P(E_1) → 0 as n → ∞
- By the packing lemma, P(E_2) → 0 as n → ∞ if R_1 < I(X_1; Y, X_2) − δ(ε) = I(X_1; Y | X_2) − δ(ε); similarly, P(E_3) → 0 if R_2 < I(X_2; Y | X_1) − δ(ε), and P(E_4) → 0 if R_1 + R_2 < I(X_1, X_2; Y) − δ(ε)

Alternative Characterization
- The proof of the weak converse gives the alternative characterization:

Theorem
The capacity region of the DM-MAC (X_1 × X_2, p(y | x_1, x_2), Y) is the set of (R_1, R_2) pairs satisfying
  R_1 ≤ I(X_1; Y | X_2, Q),
  R_2 ≤ I(X_2; Y | X_1, Q),
  R_1 + R_2 ≤ I(X_1, X_2; Y | Q)
for some p(q)p(x_1 | q)p(x_2 | q)


Coded Time-Sharing
- Fix p(q)p(x_1 | q)p(x_2 | q)
- Codebook generation: Randomly generate a time-sharing sequence q^n ∼ ∏_{i=1}^n p_Q(q_i). Randomly and conditionally independently generate x_1^n(m_1), m_1 ∈ [1 : 2^{nR_1}], sequences each according to ∏_{i=1}^n p_{X_1|Q}(x_1i | q_i). Similarly generate x_2^n(m_2), m_2 ∈ [1 : 2^{nR_2}], sequences each according to ∏_{i=1}^n p_{X_2|Q}(x_2i | q_i). The chosen codebook, including q^n, is revealed to the encoders and decoder
- Encoding: To send (m_1, m_2), transmit x_1^n(m_1) and x_2^n(m_2)
- Decoding: We find the unique (m̂_1, m̂_2) such that (q^n, x_1^n(m̂_1), x_2^n(m̂_2), y^n) ∈ T_ε^(n)
- The rest of the proof follows as in simultaneous decoding, with U in the packing lemma replaced by Q

Summary of Achievability Ideas
- Coding techniques:
  - Random codebook generation: makes analysis of P_e^(n) tractable; has become practical with LDPC codes/Fountain codes
  - Joint typicality decoding (packing): not optimal but easy to analyze and achieves optimal rates; does not provide good error bounds
  - Successive cancellation decoding
  - Simultaneous joint typicality decoding: more powerful than successive cancellation decoding
  - Time sharing and coded time sharing
- Performance analysis tools:
  - Division of the error event
  - LLN
  - Properties of typicality
  - Packing lemma

DM Broadcast Channel with Degraded Message Sets

- Consider a 2-receiver DM-BC (X, p(y_1, y_2 | x), Y_1 × Y_2)
- The sender X wishes to send a common message to both receivers and a private message to receiver Y_1
[Figure: (M_0, M_1) → Encoder → X^n → p(y_1, y_2 | x); Y_1^n → Decoder 1 → (M̂_01, M̂_1); Y_2^n → Decoder 2 → M̂_02]
- A (2^{nR_0}, 2^{nR_1}, n) code for the DM-BC consists of:
  1. Two message sets [1 : 2^{nR_0}] and [1 : 2^{nR_1}]
  2. An encoder that assigns a codeword x^n(m_0, m_1) to each message pair (m_0, m_1) ∈ [1 : 2^{nR_0}] × [1 : 2^{nR_1}]
  3. Two decoders: Decoder 1 assigns to each y_1^n ∈ Y_1^n an estimate (m̂_01, m̂_1)(y_1^n) ∈ [1 : 2^{nR_0}] × [1 : 2^{nR_1}] or an error e; Decoder 2 assigns to each y_2^n ∈ Y_2^n an estimate m̂_02(y_2^n) ∈ [1 : 2^{nR_0}] or an error e
- Assume that (M_0, M_1) ∼ Unif([1 : 2^{nR_0}] × [1 : 2^{nR_1}])
- The average probability of error is
  P_e^(n) = P{(M̂_01, M̂_1) ≠ (M_0, M_1) or M̂_02 ≠ M_0}
- A rate pair (R_0, R_1) is achievable for the DM-BC if there exists a sequence of (2^{nR_0}, 2^{nR_1}, n) codes with P_e^(n) → 0 as n → ∞
- The capacity region of the DM-BC with degraded message sets is the closure of the set of achievable rate pairs


Theorem (Körner–Marton [11])
The capacity region of the DM-BC with degraded message sets is the set of rate pairs (R_0, R_1) such that
  R_0 ≤ I(U; Y_2),
  R_1 ≤ I(X; Y_1 | U),
  R_0 + R_1 ≤ I(X; Y_1)
for some p(u, x)
- U is an auxiliary random variable
- The new achievability idea is superposition coding

Outline of Achievability
- Fix p(u)p(x | u). Generate 2^{nR_0} independent u^n(m_0) sequences (cloud centers). For each u^n(m_0), generate 2^{nR_1} conditionally independent x^n(m_0, m_1) sequences (satellite codewords)
[Figure: cloud centers in T_ε^(n)(U), each surrounded by its satellite codewords in T_ε^(n)(X)]
- Receiver Y_2 decodes the cloud center u^n(m_0)
- Receiver Y_1 decodes the satellite codeword x^n(m_0, m_1)

Proof of Achievability
- We show that any rate pair that satisfies the conditions for any given p(u)p(x | u) is achievable
- Codebook generation: Fix p(u)p(x | u). Randomly and independently generate u^n(m_0), m_0 ∈ [1 : 2^{nR_0}], sequences each according to ∏_{i=1}^n p_U(u_i). For each sequence u^n(m_0), randomly and conditionally independently generate x^n(m_0, m_1), m_1 ∈ [1 : 2^{nR_1}], sequences each according to ∏_{i=1}^n p_{X|U}(x_i | u_i(m_0))
- Encoding: To send the message pair (m_0, m_1), transmit x^n(m_0, m_1)
- Decoding: Decoder 2 declares that a message m̂_02 is sent if it is the unique message such that (u^n(m̂_02), y_2^n) ∈ T_ε^(n); otherwise an error is declared. Decoder 1 uses simultaneous decoding: it declares that the message pair (m̂_01, m̂_1) is sent if it is the unique message pair such that (u^n(m̂_01), x^n(m̂_01, m̂_1), y_1^n) ∈ T_ε^(n); otherwise an error is declared

Analysis of the Probability of Error
- Assume wlog that the message pair (1, 1) is sent
- First consider the average probability of error for decoder 2. Define the events
  E_21 := {(U^n(1), Y_2^n) ∉ T_ε^(n)},
  E_22 := {(U^n(m_0), Y_2^n) ∈ T_ε^(n) for some m_0 ≠ 1}
- The probability of error for decoder 2 is then bounded by P(E_2) ≤ P(E_21) + P(E_22)
- By the LLN, P(E_21) → 0 as n → ∞
- Since for m_0 ≠ 1, U^n(m_0) is independent of Y_2^n, by the packing lemma, P(E_22) → 0 as n → ∞ if R_0 < I(U; Y_2) − δ(ε)



- Next consider the average probability of error for decoder 1. Let's look at the pmfs for the triple (U^n(m_0), X^n(m_0, m_1), Y_1^n), where × denotes a message index ≠ 1:

  m_0 | m_1 | Joint pmf
   1  |  1  | p(u^n, x^n) p(y_1^n | x^n)
   1  |  ×  | p(u^n, x^n) p(y_1^n | u^n)
   ×  |  ×  | p(u^n, x^n) p(y_1^n)
   ×  |  1  | p(u^n, x^n) p(y_1^n)

- Accordingly, define E_11 as the event that the transmitted triple is not typical, E_12 := {(U^n(1), X^n(1, m_1), Y_1^n) ∈ T_ε^(n) for some m_1 ≠ 1}, and E_13 := {(U^n(m_0), X^n(m_0, m_1), Y_1^n) ∈ T_ε^(n) for some m_0 ≠ 1 and some m_1}
- The probability of error for decoder 1 is then bounded by P(E_1) ≤ P(E_11) + P(E_12) + P(E_13)
- By the LLN, P(E_11) → 0 as n → ∞
- Since for m_1 ≠ 1, (U^n(1), X^n(1, m_1), Y_1^n) ∼ ∏_{i=1}^n p_U(u_i) p_{X|U}(x_i | u_i) p_{Y_1|U}(y_i | u_i), by the packing lemma, P(E_12) → 0 as n → ∞ if R_1 < I(X; Y_1 | U) − δ(ε)
- Similarly, by the packing lemma (with U = ∅), P(E_13) → 0 as n → ∞ if R_0 + R_1 < I(X; Y_1) − δ(ε)

Alternative Characterization
- We can prove the converse for the region consisting of rate pairs (R_0, R_1) such that
  R_0 ≤ I(U; Y_2),
  R_0 + R_1 ≤ I(U; Y_2) + I(X; Y_1 | U),
  R_0 + R_1 ≤ I(X; Y_1)
  for some p(u)p(x | u)
- To show that this region is achievable, we use rate splitting: split the private message M_1 into two independent parts M_10 and M_11 with R_1 = R_10 + R_11, and send (M_0, M_10) via the cloud center U and M_11 via the satellite codeword X
- Following similar steps to the previous proof of achievability, we can show that (R_0, R_10, R_11) is achievable if
  R_0 + R_10 < I(U; Y_2),
  R_11 < I(X; Y_1 | U),
  R_0 + R_10 + R_11 < I(X; Y_1)


- Now, we use the Fourier–Motzkin procedure (e.g., see [12], Appendix II) to find the projection of the above (R_0, R_10, R_11) region on the (R_0, R_1) plane by eliminating R_10 and R_11:
  - The first inequality gives R_0 ≤ I(U; Y_2)
  - The first and second inequalities give R_0 + R_1 ≤ I(U; Y_2) + I(X; Y_1 | U)
  - The third inequality gives R_0 + R_1 ≤ I(X; Y_1)
- This establishes the achievability of the outer bound
- The above rate splitting idea turns out to be crucial for the 3-receiver DM-BC

Multi-level DM-BC with Degraded Message Sets
- A 3-receiver multi-level DM-BC [13] (X, p(y_1, y_3 | x)p(y_2 | y_1), Y_1, Y_2, Y_3) is a 3-receiver DM-BC where receiver Y_2 is a degraded version of receiver Y_1
[Figure: X → p(y_1, y_3 | x) → Y_1, Y_3, and Y_1 → p(y_2 | y_1) → Y_2]
- Consider the two degraded message sets scenario:
  - A common message M_0 ∈ [1 : 2^{nR_0}] is to be reliably sent to all receivers
  - A private message M_1 ∈ [1 : 2^{nR_1}] is to be sent only to receiver Y_1
- What is the capacity region?
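The Fourier–Motzkin elimination described above can be automated. A minimal illustrative sketch (my own toy setup, not from the lecture): the three mutual-information terms are encoded as symbolic constants a = I(U; Y_2), b = I(X; Y_1|U), c = I(X; Y_1), and R_11 = R_1 − R_10 has been substituted so that only R_10 needs to be eliminated.

```python
def fm_eliminate(ineqs, j):
    """Fourier-Motzkin elimination of variable j.
    Each inequality is (coeffs, rhs), meaning sum_i coeffs[i]*x_i <= rhs,
    where rhs is a coefficient vector over symbolic constants (a, b, c)."""
    pos = [q for q in ineqs if q[0][j] > 0]
    neg = [q for q in ineqs if q[0][j] < 0]
    keep = [q for q in ineqs if q[0][j] == 0]
    out = list(keep)
    for cp, bp in pos:
        for cn, bn in neg:
            lam, mu = -cn[j], cp[j]     # positive multipliers cancel x_j
            c = tuple(lam * u + mu * v for u, v in zip(cp, cn))
            b = tuple(lam * u + mu * v for u, v in zip(bp, bn))
            out.append((c, b))
    return out

# Variables x = (R0, R1, R10); symbolic rhs basis (a, b, c)
region = [
    ((1, 0, 1),  (1, 0, 0)),    # R0 + R10  <= a = I(U;Y2)
    ((0, 1, -1), (0, 1, 0)),    # R1 - R10  <= b = I(X;Y1|U)
    ((1, 1, 0),  (0, 0, 1)),    # R0 + R1   <= c = I(X;Y1)
    ((0, 0, -1), (0, 0, 0)),    # -R10      <= 0
    ((0, -1, 1), (0, 0, 0)),    # R10 - R1  <= 0
]
projected = fm_eliminate(region, 2)    # eliminate R10
for coeffs, rhs in projected:
    print(coeffs, '<=', rhs)
```

Dropping the trivial rows, the output recovers exactly the three inequalities of the alternative characterization: R_0 ≤ a, R_0 + R_1 ≤ a + b, and R_0 + R_1 ≤ c.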


A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 53 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 54 / 151

DM-BC with Degraded Message Sets DM-BC with Degraded Message Sets

Decoding and Analysis of the Probability of Error If receiver Y3 decodes m0 directly by finding the unique mˆ 03 such n n (n) that (u (m ˆ 03),y3 ) ∈Tǫ , we obtain R0

R0 + R10 + R11

A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 55 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 56 / 151 DM-BC with Degraded Message Sets DM-BC with Degraded Message Sets Indirect Decoding Interpretation Thus, we are left with only two error events

n n n (n) Suppose that R0 >I(U; Y3) E1 := {(U (1), V (1, 1),Y3 ) ∈T/ ǫ }, n Y3 cannot directly decode the cloud center u (1) n n n (n) E2 := {(U (m0), V (m0,m10),Y3 ) ∈Tǫ for some m0 6= 1, m10} U n Vn Then, the probability of error for decoder 3 averaged over codebooks P(E3) ≤ P(E31)+ P(E32) n u (1) n By the LLN, P(E31) → 0 as n →∞ y3 By the packing lemma (with |A| = 2n(R0+R10) − 1, X → (U, V ), U = ∅), P(E32) → 0 as n →∞ if R0 + R10

n Combining the bounds, substituting R10 + R11 = R1, and using the v (1,m10), nR10 Fourier–Motzkin procedure to eliminate R10 and R11 complete the m10 ∈ [1, 2 ] proof of achievability

A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 57 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 58 / 151

DM-BC with Degraded Message Sets DM-IC Indirect Decoding Interpretation DM Interference Channel

Consider a 2-user pair DM-IC (X1 ×X2,p(y1,y2|x1,x2), Y1 ×Y2) Assume R0 >I(U; Y3) but R0 + R10

p(y1,y2|x1, x2) n u (1) n y3 n n X2 Y2 ˆ M2 Encoder 2 Decoder 2 M2

A (2nR1 , 2nR2 ,n) code for the interference channel consists of: n n (n) (v (m ,m ),y ) ∈ T nR nR 0 10 3 ǫ 1 Two message sets [1:2 1 ] and [1:2 2 ] 2 n Two encoders: Encoder j =1, 2 assigns a codeword xj (mj ) to each nRj The above condition suffices in general (even when R0

A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 59 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 60 / 151 DM-IC DM-IC Han–Kobayashi Inner Bound This is the best known inner bound on the capacity region of the

nR1 nR2 2-user pair DM-IC Assume that (M1, M2) ∼ Unif([1 : 2 ] × [1 : 2 ]) The following is a recent characterization [15] of this inner bound The average probability of error is Theorem (Han–Kobayashi [16]) (n) ˆ ˆ Pe = P{M1 6= M1 or M2 6= M2} A rate pair (R1, R2) is achievable for a DM-IC if it satisfies the inequalities

A rate pair (R1, R2) is achievable for the DM-IC if there exists a R1

The capacity region of the DM-IC is the closure of the set of R1 + R2

2R1 + R2

R1 + 2R2

for some p(q)p(u1,x1|q)p(u2,x2|q)

A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 61 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 62 / 151

DM-IC DM-IC

Proof of Achievability Codebook generation: Fix p(q)p(u1,x1|q)p(u2,x2|q) n n Generate a sequence q ∼ pQ(qi) Qi=1 Split message Mj, j = 1, 2, into independent “public” part of rate For j = 1, 2, randomly and conditionally independently generate R and “private” part of rate R . Thus, R = R + R n nRj0 j0 jj j j0 jj uj (mj0), mj0 ∈ [1 : 2 ], sequences each according to n We first show that (R10, R20, R11, R22) is achievable if i=1 pUj |Q(uji|qi) Q n For each u (mj ), randomly and conditionally independently generate R

A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 63 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 64 / 151 DM-IC DM-IC Analysis of the Probability of Error We are left with only 5 error events: n n n n n (n) Assume message pair ((1, 1), (1, 1)) is sent. We bound the average E10 := {(Q , U1 (1), U2 (1), X1 (1, 1),Y1 ) ∈T/ ǫ }, probability of error for each decoder. First consider decoder 1 n n n n n (n) E11 := {(Q , U1 (1), U2 (1), X1 (1,m11),Y1 ) ∈Tǫ n We have 8 cases to consider (conditioning on q suppressed) for some m11 6= 1}, m10 m20 m11 Joint pmf n n n n n (n) E12 := {(Q , U (m10), U (1), X (m10,m11),Y ) ∈Tǫ 1 1 1 1 p(un,xn)p(un)p(yn|xn, un) 1 2 1 1 1 1 2 1 1 2 for some m 6= 1,m }, n n n n n n 10 11 2 1 1 × p(u1 ,x1 )p(u2 )p(y1 |u1 , u2 ) n n n n n (n) E13 := {(Q , U (1), U (m20), X (1,m11),Y ) ∈Tǫ 3 × 1 × p(un,xn)p(un)p(yn|un) 1 2 1 1 1 1 2 1 2 for some m 6= 1,m 6= 1}, n n n n n 20 11 4 × 1 1 p(u1 ,x1 )p(u2 )p(y1 |u2 ) n n n n n (n) E14 := {(Q , U (m10), U (m20), X (m10,m11),Y ) ∈Tǫ 5 1 × × p(un,xn)p(un)p(yn|un) 1 2 1 1 1 1 2 1 1 for some m 6= 1,m 6= 1,m } n n n n 10 20 11 6 × × 1 p(u1 ,x1 )p(u2 )p(y1 ) n n n n Then, the average probability of error for decoder 1 is 7 × × × p(u1 ,x1 )p(u2 )p(y1 ) 4 n n n n n 8 1 × 1 p(u1 ,x1 )p(u2 )p(y1 |x1 ) P(E ) ≤ P(E ) 1 X 1j Cases 3,4 and 6,7 share same pmf, and case 8 does not cause an error j=0


DM-IC DM-IC

Now, we bound each probability of error term Summary of Achievability Ideas

By the LLN, P(E10) → 0 as n →∞ Coding techniques: By the packing lemma, P(E11) → 0 as n →∞ if ◮ Random codebook generation R11

A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 67 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 68 / 151 DM-RC: Decode–Forward DM-RC: Decode–Forward DM Relay Channel Assume that M ∼ Unif[1 : 2nR] Consider a DM-RC (X ×X ,p(y,y |x,x ), Y×Y ) n 1 1 1 1 The average probability of error is P ( ) = P{Mˆ 6= M} The sender X wishes to send a message to the receiver Y with the e A rate R is achievable for the DM-RC if there exists a sequence of help of the relay (X1,Y1) (2nR,n) codes with P (n) → 0 Relay e Encoder The capacity C of the DM-RC is the supremum of all achievable rates

n n The capacity of the DM-RC is not known in general Y1 X1 We know an upper bound (cutset bound) and several lower bounds Xn Y n M Encoderp(y,y1|x, x1) Decoder Mˆ on capacity that are tight only for some special cases decode–forward nR We present the coding scheme, which involves the A (2 ,n) code for a DM-RC consists of: new ideas of block Markov coding and backward decoding 1 A message set [1:2nR] 2 An encoder that assigns a codeword xn(m) to each message For reference nR m ∈ [1:2 ] Cutset Upper Bound [29] 3 i−1 A relay encoder that assigns a symbol x1i(y1 ) to every received i−1 sequence y1 for i ∈ [1 : n] C ≤ maxp(x1,x2) min{I(X, X1; Y ),I(X; Y,Y1|X1)} 4 A decoder that assigns to each received sequence yn an estimate mˆ (yn) ∈ [1:2nR] or an error e A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 69 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 70 / 151

DM-RC: Decode–Forward — Multi-hop Lower Bound

In this scheme, the relay decodes the message received from the sender in each block and retransmits it in the following block.

Proposition (Multi-hop Lower Bound)

C ≥ max_{p(x)p(x1)} min{I(X1; Y), I(X; Y1 | X1)}

Achievability uses the following multi-block scheme: consider b blocks, each consisting of n transmissions. A sequence of b − 1 i.i.d. messages mj ∈ [1 : 2^{nR}], j ∈ [1 : b − 1], is to be sent over the channel in nb transmissions.

Proof of Achievability

Codebook generation: Fix p(x)p(x1). Randomly and independently generate x^n(m), m ∈ [1 : 2^{nR}], each according to ∏_{i=1}^n p_X(xi), and x1^n(m), m ∈ [1 : 2^{nR}], each according to ∏_{i=1}^n p_{X1}(x1i).

Encoding: To send mj in block j, the encoder transmits x^n(mj). The relay has an estimate m̃_{j−1} of m_{j−1}; it transmits x1^n(m̃_{j−1}).

Decoding and analysis of the probability of error: Assume that at the end of block j − 1 the relay knows (m1, m2, ..., m_{j−1}). The decoding procedures at the end of block j are as follows: upon receiving y1^n(j), the relay declares that m̃_j is sent if it is the unique message such that (x^n(m̃_j), x1^n(m_{j−1}), y1^n(j)) ∈ T_ε^(n). By the LLN and the packing lemma, the probability of error → 0 as n → ∞ if R < I(X; Y1 | X1) − δ(ε).


DM-RC: Decode–Forward — Decoding and Analysis of the Probability of Error

Encoding and decoding are described in the following table:

Block  1           2            3            ...  b − 1                    b
X      x^n(m1|1)   x^n(m2|m1)   x^n(m3|m2)   ...  x^n(m_{b−1}|m_{b−2})     x^n(1|m_{b−1})
Y1     m̃1          m̃2           m̃3           ...  m̃_{b−1}                  ∅
X1     x1^n(1)     x1^n(m̃1)     x1^n(m̃2)     ...  x1^n(m̃_{b−2})            x1^n(m̃_{b−1})
Y      ∅           m̂1           m̂2           ...  m̂_{b−2}                  m̂_{b−1}

Encoding: To send mj in block j, the encoder sends x^n(mj | m_{j−1}). The relay has an estimate m̃_{j−1} of the previous message m_{j−1}; it sends x1^n(m̃_{j−1}) in block j.

Decoding: Assume that at the end of block j − 1 the relay knows (m1, m2, ..., m_{j−1}). Upon receiving y1^n(j), the relay declares that m̃_j is sent if it is the unique message such that (x^n(m̃_j | m_{j−1}), x1^n(m_{j−1}), y1^n(j)) ∈ T_ε^(n); otherwise an error is declared. By the LLN and the packing lemma, the probability of error → 0 as n → ∞ if R < I(X; Y1 | X1) − δ(ε).

The receiver declares that m̂_{j−1} is sent if it is the unique message such that (x1^n(m̂_{j−1}), y^n(j)) ∈ T_ε^(n); otherwise an error is declared. By the LLN and the packing lemma, the probability of error → 0 as n → ∞ if R < I(X1; Y) − δ(ε).
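The block schedule above pipelines b − 1 messages over b blocks, with the receiver trailing the sender by one block. As a toy sketch (not from the lecture), the following code simulates that schedule assuming every typicality test succeeds, so only the bookkeeping of the pipeline is shown; the function name and the dummy message `1` in the last block are illustrative conventions.

```python
# Idealized block Markov decode-forward schedule: in block j the sender
# transmits m_j, the relay retransmits its estimate of m_{j-1}, and the
# receiver recovers m_{j-1} -- a one-block decoding delay.
def block_markov_schedule(messages):
    """messages: the b-1 messages m_1..m_{b-1}; returns the decoded list."""
    b = len(messages) + 1          # total number of blocks
    relay_estimate = None          # relay's estimate of the previous message
    decoded = []
    for j in range(1, b + 1):
        m_j = messages[j - 1] if j <= b - 1 else 1  # dummy message in block b
        if relay_estimate is not None:
            decoded.append(relay_estimate)          # receiver decodes m_{j-1}
        relay_estimate = m_j       # relay decodes m_j at the end of block j
    return decoded

msgs = [7, 3, 5, 2]                     # b - 1 = 4 messages over b = 5 blocks
print(block_markov_schedule(msgs))      # → [7, 3, 5, 2]
```

Since only b − 1 messages are carried in b blocks, the effective rate is R(b − 1)/b, which approaches R as b → ∞.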

DM-RC: Decode–Forward — Decode–Forward Lower Bound

The cooperative multi-hop scheme can be improved by using both the information received through the direct path and that from the relay. This leads to the following decode–forward lower bound on the capacity of the DM-RC.

Theorem (Cover–El Gamal [29])

C ≥ max_{p(x, x1)} min{I(X, X1; Y), I(X; Y1 | X1)}

Remark: The decode–forward lower bound is tight when the DM-RC is physically degraded, i.e.,

p(y, y1 | x, x1) = p(y1 | x, x1) p(y | y1, x1)

Proof of Achievability

Again we consider b transmission blocks, each consisting of n transmissions, and use a block Markov coding scheme. A sequence of b − 1 i.i.d. messages mj ∈ [1 : 2^{nR}], j ∈ [1 : b − 1], is to be sent over the channel in the nb transmissions.

Codebook generation: We use the same codebook generation as in cooperative multi-hop. Fix p(x1)p(x|x1) that achieves the lower bound. Randomly and independently generate x1^n(m′), m′ ∈ [1 : 2^{nR}], sequences each according to ∏_{i=1}^n p_{X1}(x1i). For each x1^n(m′), randomly and conditionally independently generate x^n(m|m′), m ∈ [1 : 2^{nR}], sequences each according to ∏_{i=1}^n p_{X|X1}(xi | x1i(m′)).

Encoding: Encoding is again the same as in cooperative multi-hop. To send mj in block j, the encoder sends x^n(mj | m_{j−1}). The relay has an estimate m̃_{j−1} of the previous message m_{j−1}; it sends x1^n(m̃_{j−1}) in block j.


DM-RC: Decode–Forward — Backward Decoding

Decoding at the relay is done as in cooperative multi-hop. Suppose the relay knows (m1, m2, ..., m_{j−1}) at the end of block j − 1. Upon receiving y1^n(j), the relay declares m̃_j is sent if it is the unique message such that (x^n(m̃_j | m_{j−1}), x1^n(m_{j−1}), y1^n(j)) ∈ T_ε^(n); otherwise an error is declared. By the LLN and the packing lemma, the probability of error → 0 as n → ∞ if R < I(X; Y1 | X1) − δ(ε).

Decoding at the receiver is done backwards, after all bn transmissions are received [30]: the receiver first decodes m̂_{b−1} from block b, then m̂_{b−2} from block b − 1, and so on, at each step using the already-decoded later message; this succeeds if R < I(X, X1; Y) − δ(ε).

DM-RC: Decode–Forward — Summary of Achievability Ideas

Coding techniques:
◮ Random codebook generation
◮ Joint typicality decoding (packing)
◮ Successive cancellation decoding
◮ Simultaneous joint typicality decoding
◮ Time sharing and coded time sharing
◮ Superposition coding
◮ Rate splitting
◮ Indirect decoding
◮ Block Markov coding
◮ Backward decoding

Performance analysis tools:
◮ Division of error event
◮ LLN
◮ Properties of typicality
◮ Packing lemma
◮ Fourier–Motzkin elimination

DMS: Discrete Memoryless Source

Let X be a discrete memoryless source (DMS) (i.e., an i.i.d. finite-alphabet source) (X, p(x)), and let d(x, x̂), x̂ ∈ X̂, be a distortion measure. The decoder wishes to obtain a description of X with distortion D.

[Diagram: X^n → Encoder → M → Decoder → X̂^n]

The average per-letter distortion between x^n and x̂^n is

d(x^n, x̂^n) := (1/n) Σ_{i=1}^n d(xi, x̂i)

A (2^{nR}, n) lossy source code consists of:
1 An encoder that assigns to each sequence x^n ∈ X^n an index m(x^n) ∈ [1 : 2^{nR}) := {1, 2, ..., 2^{⌊nR⌋}}, and
2 A decoder that assigns to each index m ∈ [1 : 2^{nR}) an estimate x̂^n(m) ∈ X̂^n


DMS

The distortion associated with the (2^{nR}, n) code is defined as

D = E[d(X^n, X̂^n)] = Σ_{x^n} p(x^n) d(x^n, x̂^n(m(x^n)))

A rate–distortion pair (R, D) is achievable if there exists a sequence of (2^{nR}, n) codes with lim sup_{n→∞} E[d(X^n, X̂^n)] ≤ D. The rate–distortion function R(D) is the infimum of rates R such that (R, D) is achievable. Note that R(D) is nonincreasing and convex (thus continuous).

Theorem (Shannon [17])

The rate–distortion function for a DMS (X, p(x)) and distortion measure d(x, x̂), for D ≥ D_min := E(min_{x̂} d(X, x̂)), is

R(D) = min_{p(x̂|x): E(d(X, X̂)) ≤ D} I(X; X̂)

The new achievability ideas are joint typicality encoding (covering) and the covering lemma.

Proof of Achievability

Codebook generation: Fix p(x̂|x) that achieves R(D/(1 + ε)), and compute p(x̂) = Σ_x p(x) p(x̂|x). Randomly and independently generate x̂^n(m), m ∈ [1 : 2^{nR}), sequences each according to ∏_{i=1}^n p_{X̂}(x̂i). The codebook is revealed to both the encoder and the decoder.

Encoding: We use joint typicality encoding. Given a sequence x^n, choose an index m such that (x^n, x̂^n(m)) ∈ T_ε^(n). If there is more than one such m, choose the smallest index. If there is no such m, choose m = 1.

Decoding: Upon receiving the index m, the decoder chooses the reproduction sequence x̂^n(m).
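Shannon's minimization has a well-known closed form for a Bernoulli(p) source with Hamming distortion: R(D) = H(p) − H(D) for 0 ≤ D ≤ min(p, 1 − p), and 0 otherwise. This closed form is a standard fact not stated on the slide; the sketch below simply evaluates it.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def rate_distortion_bernoulli(p, D):
    """R(D) = H(p) - H(D) for 0 <= D <= min(p, 1-p); zero beyond that."""
    if D >= min(p, 1 - p):
        return 0.0
    return h2(p) - h2(D)

print(rate_distortion_bernoulli(0.5, 0.0))   # lossless case: R(0) = H(X) = 1 bit
print(rate_distortion_bernoulli(0.5, 0.11))  # ≈ 0.5 bits per symbol
```

Note how D = 0 recovers the lossless rate H(X), previewing the reduction of lossless source coding to the lossy theorem shown later in the lecture.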

DMS — Covering Lemma

Let (U, X, X̂) ∼ p(u, x, x̂), ε > 0, and (U^n, X^n) ∈ T_ε^(n). Let X̂^n(m), m ∈ A, |A| ≥ 2^{nR}, be random sequences, conditionally independent of each other and of X^n given U^n, each distributed according to ∏_{i=1}^n p_{X̂|U}(x̂i|ui). There exists δ(ε) → 0 as ε → 0 such that if R > I(X; X̂ | U) + δ(ε), then

P{(U^n, X^n, X̂^n(m)) ∉ T_ε^(n) for all m ∈ A} → 0 as n → ∞

The proof is in Appendix III.

Analysis of Average Distortion

We bound the distortion averaged over X^n and the codebook. Define the "error" event

E := {(X^n, X̂^n(m)) ∉ T_ε^(n) for all m ∈ [1 : 2^{nR})},

and partition it into the events

E1 := {X^n ∉ T_ε^(n)},
E2 := {X^n ∈ T_ε^(n), (X^n, X̂^n(m)) ∉ T_ε^(n) for all m ∈ [1 : 2^{nR})}

Then P(E) = P(E1) + P(E2). By the LLN, P(E1) → 0 as n → ∞. By the covering lemma (with U = ∅), P(E2) → 0 as n → ∞ if R > I(X; X̂) + δ(ε).

DMS

Now denote the reproduction sequence for X^n by X̂^n. By the law of total expectation and the typical average lemma,

E[d(X^n, X̂^n)] = P(E) E[d(X^n, X̂^n) | E] + P(E^c) E[d(X^n, X̂^n) | E^c]
              ≤ P(E) d_max + P(E^c)(1 + ε) E[d(X, X̂)],

where d_max := max_{(x, x̂) ∈ X × X̂} d(x, x̂). Since by assumption E[d(X, X̂)] ≤ D/(1 + ε),

lim sup_{n→∞} E[d(X^n, X̂^n)] ≤ D

if R > I(X; X̂) + δ(ε) = R(D/(1 + ε)) + δ(ε). Since the average per-letter distortion is asymptotically ≤ D, there must exist a sequence of codes with asymptotic distortion ≤ D, which proves the achievability of (R(D/(1 + ε)) + δ(ε), D). Finally, since R(D) is continuous in D, R(D/(1 + ε)) + δ(ε) → R(D) as ε → 0.

Lossless Source Coding

In lossless source coding, we wish to find the minimum rate R* such that if R > R*, then there exists a sequence of (2^{nR}, n) codes with P_e^(n) = P{X̂^n ≠ X^n} → 0 as n → ∞.

Theorem (Shannon [6])

The optimal lossless source coding rate for a DMS X is R* = H(X).

We show that this lossless source coding theorem is a special case of the lossy source coding theorem. Consider the lossy source coding problem for a DMS X, reproduction alphabet X̂ = X, and Hamming distortion measure (d(x, x̂) = 0 if x = x̂, and d(x, x̂) = 1 otherwise). If we let D = 0, we can show that R(0) = H(X) (why?). We want to show that R* = R(0).


A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 89 / 151 A. El Gamal (Stanford University) Achievability Padovani Lecture 2009 90 / 151

DMS with Side Information at Decoder — Outline of Achievability

The rate–distortion function with side information available at the decoder, R_{SI−D}(D), is the infimum of rates R such that there exists a sequence of (2^{nR}, n) codes with lim sup_{n→∞} E[d(X^n, X̂^n)] ≤ D. Note that R_{SI−D}(D) is nonincreasing and convex (thus continuous).

Theorem (Wyner–Ziv [18])

Let (X, Y) be a 2-DMS and d(x, x̂) be a distortion measure. The rate–distortion function for X with side information Y available only at the decoder is

R_{SI−D}(D) = min (I(X; U) − I(Y; U)) = min I(X; U | Y),

where the minimization is over all conditional pmfs p(u|x) and functions x̂ : Y × U → X̂ such that E_{X,Y,U}[d(X, X̂)] ≤ D.

Outline: Fix p(u|x) and x̂(u, y). Generate about 2^{nI(U;X)} sequences u^n(l) and partition their indices into 2^{nR} bins B(1), ..., B(2^{nR}). Given x^n, find a jointly typical u^n(l) and send the bin index m of u^n(l). The decoder finds the unique l ∈ B(m) such that (u^n(l), y^n) ∈ T_ε^(n), then constructs the estimate (x̂(u1, y1), ..., x̂(un, yn)).

The new achievability ideas are binning and the Markov lemma.
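The binning bookkeeping above can be made concrete. The sketch below (not from the lecture) implements the index partition B(m) = [(m − 1)s + 1 : ms] and the decoder's search inside a bin; the joint-typicality test is abstracted as a caller-supplied predicate `typical_with_y`, a hypothetical stand-in.

```python
# Binning: many indices, few bins; the encoder sends only the bin index,
# and the side information singles out the right index inside the bin.
def bin_index(l, bins_total, indices_per_bin):
    """Bin m containing index l, with B(m) = [(m-1)*s+1 : m*s]."""
    return (l - 1) // indices_per_bin + 1

def decode(m, indices_per_bin, typical_with_y):
    """Return the unique index in bin m passing the test, else None."""
    lo = (m - 1) * indices_per_bin + 1
    candidates = [l for l in range(lo, lo + indices_per_bin)
                  if typical_with_y(l)]
    return candidates[0] if len(candidates) == 1 else None

# 16 indices split into 4 bins of size 4; suppose the encoder chose l = 7
true_l = 7
m = bin_index(true_l, 4, 4)                    # bin 2 holds indices 5..8
print(m, decode(m, 4, lambda l: l == true_l))  # → 2 7
```

The packing lemma is what guarantees that, with high probability, only the true index in the bin passes the typicality test, so the `None` branch (a non-unique candidate) becomes rare.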

DMS with Side Information at Decoder — Proof of Achievability

Codebook generation: Fix p(u|x) and x̂(u, y) that achieve the rate–distortion function R_{SI−D}(D/(1 + ε)). Randomly and independently generate u^n(l), l ∈ [1 : 2^{nR̃}), sequences each according to ∏_{i=1}^n p_U(ui). Partition the set of indices l ∈ [1 : 2^{nR̃}) into equal-size bins B(m) := [(m − 1)2^{n(R̃−R)} + 1 : m 2^{n(R̃−R)}], m ∈ [1 : 2^{nR}).

Encoding: Given x^n, the encoder finds an l such that (x^n, u^n(l)) ∈ T_{ε′}^(n). If there is more than one such index, it selects one uniformly at random; if there is none, it selects an arbitrary l. The encoder sends the index m such that l ∈ B(m).

Decoding: Let ε > ε′. The decoder finds the unique l̂ ∈ B(m) such that (u^n(l̂), y^n) ∈ T_ε^(n). If there is such an index l̂, the reproduction x̂^n is computed as x̂i = x̂(ui(l̂), yi), i ∈ [1 : n]; otherwise an arbitrary x̂^n is chosen.

Analysis of Average Distortion

Let L be the index of the chosen U^n sequence and M the bin index of L. Define the "error" events:

E0 := {(X^n, Y^n) ∉ T_{ε′}^(n)},
E1 := {(U^n(l), X^n) ∉ T_{ε′}^(n) for all l ∈ [1 : 2^{nR̃})},
E2 := {(U^n(L), Y^n) ∉ T_ε^(n)},
E3 := {(U^n(l̃), Y^n) ∈ T_ε^(n) for some l̃ ∈ B(M), l̃ ≠ L}

The total probability of "error" is bounded by

P(E) ≤ P(E0) + P(E0^c ∩ E1) + P(E1^c ∩ E2) + P(E3)

By the LLN, P(E0) → 0 as n → ∞. By the covering lemma, P(E0^c ∩ E1) → 0 as n → ∞ if R̃ > I(X; U) + δ(ε′).


DMS with Side Information at Decoder

Since E1^c = {(U^n(L), X^n) ∈ T_{ε′}^(n)}, Y^n | {X^n = x^n} ∼ ∏_{i=1}^n p_{Y|X}(yi|xi), and ε > ε′, by the Markov lemma, P(E1^c ∩ E2) → 0 as n → ∞.

For P(E3), we can show via a sequence of careful conditioning steps, using the symmetry of the code generation and simple probability bounds (see Appendix IV), that

P(E3) ≤ P{(U^n(l̃), Y^n) ∈ T_ε^(n) for some l̃ ∈ B(1)}

Since each U^n ∼ ∏_{i=1}^n p(ui) and is independent of Y^n, by the packing lemma, P(E3) → 0 as n → ∞ if R̃ − R < I(Y; U) − δ(ε).

When there is no error, (U^n(L), X^n, Y^n) ∈ T_ε^(n). Thus by the typical average lemma, the asymptotic distortion averaged over the random codebook is bounded as

E[d(X^n, X̂^n)] ≤ d_max P(E) + (1 + ε) E[d(X, X̂)] ≤ D

if R > I(X; U) − I(Y; U) + δ(ε) + δ(ε′) = R_{SI−D}(D/(1 + ε)) + δ(ε) + δ(ε′).

Finally, by the continuity of R_{SI−D}(D) in D, taking ε → 0 shows that any (R, D) pair with R > R_{SI−D}(D) is achievable, which completes the proof.

DMS with Side Information at Decoder — Lossless Source Coding with Side Information

In lossless source coding with side information at the decoder, we wish to find the minimum rate R*_{SI−D} needed to recover X with P_e^(n) → 0.

Theorem (Slepian–Wolf [19])

R*_{SI−D} = H(X | Y)

We show that Slepian–Wolf is a special case of Wyner–Ziv. Let d be the Hamming distortion measure and consider the case of D = 0. It is not difficult to show that in this case R_{SI−D}(0) = H(X | Y). One can then show that R*_{SI−D} = R_{SI−D}(0) = H(X | Y): the converse follows from the converse for the Wyner–Ziv theorem, and achievability follows by covering, binning, and packing with U = X.

DM-RC: Compress–Forward — DM Relay Channel

Again, consider a DM-RC (X × X1, p(y, y1 | x, x1), Y × Y1). The sender X wishes to send a message to the receiver Y with the help of the relay (X1, Y1). The code and achievability are defined as before. Here, we present the compress–forward coding scheme.

For reference:

Cutset Upper Bound [29]

C ≤ max_{p(x, x1)} min{I(X, X1; Y), I(X; Y, Y1 | X1)}
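To see what the Slepian–Wolf rate H(X|Y) buys in a concrete case, the hedged sketch below (not from the lecture) computes H(X|Y) from a joint pmf for a doubly symmetric binary source where Y is X passed through a BSC(p); there H(X|Y) = H(p), a standard fact. The function name is illustrative.

```python
import math

def cond_entropy(joint):
    """H(X|Y) in bits from a joint pmf given as {(x, y): probability}."""
    py = {}
    for (x, y), p in joint.items():
        py[y] = py.get(y, 0.0) + p          # marginal p(y)
    h = 0.0
    for (x, y), p in joint.items():
        if p > 0:
            h -= p * math.log2(p / py[y])   # -sum p(x,y) log p(x|y)
    return h

# X ~ Bern(1/2), Y = X flipped with probability p = 0.1
p = 0.1
joint = {(0, 0): 0.5 * (1 - p), (0, 1): 0.5 * p,
         (1, 0): 0.5 * p, (1, 1): 0.5 * (1 - p)}
print(cond_entropy(joint))   # ≈ 0.469 bits = H(0.1)
```

Without side information the encoder would need H(X) = 1 bit per symbol, so the decoder's observation of Y saves about 0.53 bits per symbol in this example.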


DM-RC: Compress–Forward — Compress–Forward Lower Bound

In the decode–forward scheme, the relay decodes the entire message. If the channel from the sender to the relay is worse than the direct channel to the receiver, this requirement can make the transmission rate lower than when the relay is not used at all.

In the compress–forward scheme, the relay helps communication by sending a compressed version of its received signal to the receiver. Because this compressed version is correlated with the receiver's own received sequence, Wyner–Ziv coding is used to reduce its rate. This scheme achieves the following lower bound.

Theorem (Cover–El Gamal [29], El Gamal–Mohseni–Zahedi [32])

C ≥ max_{p(x)p(x1)p(ŷ1|y1, x1)} min{I(X, X1; Y) − I(Y1; Ŷ1 | X, X1, Y), I(X; Y, Ŷ1 | X1)}

This characterization is equivalent to that in Cover–El Gamal [29].

Outline of Achievability

A block Markov coding scheme is used to send b − 1 i.i.d. messages in b blocks. In block j, a "compressed" version ŷ1^n(j − 1) of y1^n(j − 1), conditioned on x1^n(j − 1) (which is known to both the relay and the receiver), is constructed by the relay. As in Wyner–Ziv coding, ŷ1^n(j − 1) is encoded with side information y^n(j − 1) at the receiver and sent to the receiver via x1^n(j). At the end of block j, the receiver decodes x1^n(j) and finds ŷ1^n(j − 1). The receiver then combines ŷ1^n(j − 1) with y^n(j − 1) to decode m_{j−1}.

DM-RC: Compress–Forward — Proof of Achievability

Codebook generation: Fix p(x)p(x1)p(ŷ1|y1, x1) that achieves the lower bound. Randomly and independently generate x^n(m), m ∈ [1 : 2^{nR}], sequences each according to ∏_{i=1}^n p_X(xi). Randomly and independently generate x1^n(l), l ∈ [1 : 2^{nR1}], sequences each according to ∏_{i=1}^n p_{X1}(x1i). For each x1^n(l), l ∈ [1 : 2^{nR1}], randomly and conditionally independently generate 2^{nR̃1} sequences ŷ1^n(k|l), k ∈ [1 : 2^{nR̃1}], each according to ∏_{i=1}^n p_{Ŷ1|X1}(ŷ1i | x1i(l)). Partition the set [1 : 2^{nR̃1}] into 2^{nR1} equal-size bins B(l), l ∈ [1 : 2^{nR1}].

Encoding and decoding are described in the following table:

Block  1              2               3               ...  b − 1                       b
X      x^n(m1)        x^n(m2)         x^n(m3)         ...  x^n(m_{b−1})                x^n(1)
Y1     ŷ1^n(k1|1),    ŷ1^n(k2|l1),    ŷ1^n(k3|l2),    ...  ŷ1^n(k_{b−1}|l_{b−2}),      ∅
       l1             l2              l3                   l_{b−1}
X1     x1^n(1)        x1^n(l1)        x1^n(l2)        ...  x1^n(l_{b−2})               x1^n(l_{b−1})
Y      ∅              l̂1, k̂1, m̂1      l̂2, k̂2, m̂2      ...  l̂_{b−2}, k̂_{b−2}, m̂_{b−2}   l̂_{b−1}, k̂_{b−1}, m̂_{b−1}


DM-RC: Compress–Forward

Encoding: Let mj be the message to be sent in block j, and assume that (ŷ1^n(k_{j−1}|l_{j−2}), y1^n(j − 1), x1^n(l_{j−2})) ∈ T_{ε′}^(n) and k_{j−1} ∈ B(l_{j−1}). Then the codeword pair (x^n(mj), x1^n(l_{j−1})) is transmitted in block j.

Decoding and analysis of the probability of error: Assume that at the end of block j − 1 the receiver knows (m1, ..., m_{j−2}) and (l1, ..., l_{j−2}). The decoding procedure at the end of block j is as follows:

The relay, upon receiving y1^n(j), finds an index k such that (ŷ1^n(k|l_{j−1}), y1^n(j), x1^n(l_{j−1})) ∈ T_{ε′}^(n). By the LLN and the covering lemma, the probability of error → 0 as n → ∞ if R̃1 > I(Ŷ1; Y1 | X1) + δ(ε′).

Upon receiving y^n(j), the receiver finds the unique l̂_{j−1} such that (x1^n(l̂_{j−1}), y^n(j)) ∈ T_ε^(n). By the LLN and the packing lemma, the probability of error → 0 as n → ∞ if R1 < I(X1; Y) − δ(ε).

Let ε > ε′. Assuming the receiver correctly decodes l_{j−1}, it finds the unique message m̂_{j−1} such that (x^n(m̂_{j−1}), x1^n(l_{j−2}), ŷ1^n(k̃_{j−1}|l_{j−2}), y^n(j − 1)) ∈ T_ε^(n) for some k̃_{j−1} ∈ B(l_{j−1}).

To bound the probability of error, assume that (M_{j−1}, L_{j−2}) = (1, 1), let K_{j−1} be the index of the covering Ŷ1^n, and let L_{j−1} be the bin index of K_{j−1}. Define the events

E1 := {(X^n(1), X1^n(1), Ŷ1^n(K_{j−1}|1), Y^n(j − 1)) ∉ T_ε^(n)},
E2 := {(X^n(m), X1^n(1), Ŷ1^n(K_{j−1}|1), Y^n(j − 1)) ∈ T_ε^(n) for some m ≠ 1},
E3 := {(X^n(m), X1^n(1), Ŷ1^n(k̃|1), Y^n(j − 1)) ∈ T_ε^(n) for some k̃ ∈ B(L_{j−1}), k̃ ≠ K_{j−1}, m ≠ 1}

Then the probability of error is bounded by P(E) ≤ P(E1) + P(E2) + P(E3).

By the Markov lemma, P(E1) → 0 as n → ∞. By the packing lemma, P(E2) → 0 as n → ∞ if R < I(X; Y, Ŷ1 | X1) − δ(ε); P(E3) → 0 as n → ∞ under an analogous joint condition on R + R̃1 − R1. Combining all the rate conditions and eliminating R1 and R̃1 by Fourier–Motzkin elimination yields the compress–forward lower bound.

Summary of Achievability Ideas


Performance analysis tools:
◮ Division of error event
◮ LLN
◮ Properties of typicality
◮ Packing lemma
◮ Fourier–Motzkin elimination
◮ Covering lemma

DMC with State Available at Encoder

Consider a DM channel with state (X × S, p(y|x, s), Y), where the state S is a DMS (S, p(s)). The sender X, who knows the state noncausally, wishes to send a message to the receiver Y.

[Diagram: S^n ∼ p(s) feeds the encoder; M → Encoder → X^n → p(y|x, s) → Y^n → Decoder → M̂]

A (2^{nR}, n) code for the channel with state noncausally known at the encoder is defined by an encoder x^n(m, s^n) and a decoder m̂(y^n). We wish to find the channel capacity C_{SI−E} for this setting.

DMC with State Available at Encoder — Outline of Achievability

Theorem (Gelfand–Pinsker [20])

The capacity of a DM channel with state available noncausally at the encoder is

C_{SI−E} = max_{p(u, x|s)} (I(U; Y) − I(U; S))

The new achievability idea is subcode generation, which, as we will see, is "dual" to binning.

Fix p(u, x|s). For each message m ∈ [1 : 2^{nR}], generate a subcode C(m) of 2^{n(R̃−R)} sequences u^n(l). To send m given s^n, find u^n(l) ∈ C(m) that is jointly typical with s^n, and transmit X^n ∼ ∏_{i=1}^n p_{X|U,S}(xi|ui, si). The receiver finds a u^n(l) jointly typical with y^n and declares the subcode index of u^n(l) to be the message sent.
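The subcode structure mirrors binning with the roles reversed: each message owns a range of indices, and the encoder spends the extra indices to match the state. The toy sketch below (not from the lecture) shows the index arithmetic; the state-matching typicality test is abstracted as a hypothetical predicate `typical_with_state`.

```python
# Subcode generation: message m owns C(m) = [(m-1)*s+1 : m*s]; the encoder
# picks any index in C(m) that fits the observed state, and the decoder
# recovers m from the index alone -- the dual of binning.
def subcode_of(l, indices_per_msg):
    """Message m whose subcode C(m) contains index l."""
    return (l - 1) // indices_per_msg + 1

def encode(m, indices_per_msg, typical_with_state):
    """Pick an index in C(m) that fits the observed state, if any."""
    lo = (m - 1) * indices_per_msg + 1
    for l in range(lo, lo + indices_per_msg):
        if typical_with_state(l):
            return l
    return lo  # fallback: arbitrary codeword in C(m), as in the proof

l = encode(3, 8, lambda l: l % 5 == 0)  # 8 codewords per message
print(l, subcode_of(l, 8))              # → 20 3
```

The covering lemma ensures the subcode is large enough (R̃ − R > I(U; S)) that the state-matching search almost never hits the fallback branch.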


DMC with State Available at Encoder — Proof of Achievability [21]

Codebook generation: Fix p(u|s)p(x|u, s) that achieves capacity. For each message m ∈ [1 : 2^{nR}], generate a subcode C(m) consisting of 2^{n(R̃−R)} randomly and independently generated sequences u^n(l), l ∈ [(m − 1)2^{n(R̃−R)} + 1 : m 2^{n(R̃−R)}], each according to ∏_{i=1}^n p_U(ui).

Encoding: To send the message m ∈ [1 : 2^{nR}] with the state sequence s^n observed, the sender chooses a u^n(l) ∈ C(m) such that (u^n(l), s^n) ∈ T_{ε′}^(n). If s^n ∉ T_{ε′}^(n) or no such u^n(l) exists, it picks an arbitrary sequence u^n(l) ∈ C(m). The sender then generates a sequence x^n according to ∏_{i=1}^n p_{X|U,S}(xi | ui(l), si) and transmits it.

Decoding: Let ε > ε′. Upon receiving y^n, the decoder declares that a message m̂ is sent if it is the unique message such that (u^n(l), y^n) ∈ T_ε^(n) for some u^n(l) ∈ C(m̂); otherwise an error is declared.

Analysis of the Probability of Error

Assume M = 1, and let L be the index of the chosen U^n codeword for M = 1 and S^n. Define the events

E0 := {S^n ∉ T_{ε′}^(n)},
E1 := {(U^n(l), S^n) ∉ T_{ε′}^(n) for all U^n(l) ∈ C(1)},
E2 := {(U^n(L), Y^n) ∉ T_ε^(n)},
E3 := {(U^n(l̃), Y^n) ∈ T_ε^(n) for some U^n(l̃) ∉ C(1)}

The probability of error is bounded by

P(E) ≤ P(E0) + P(E0^c ∩ E1) + P(E1^c ∩ E2) + P(E3)


By the LLN, P(E0) → 0 as n → ∞. By the covering lemma, P(E0^c ∩ E1) → 0 as n → ∞ if R̃ − R > I(U; S) + δ(ε′). By the Markov lemma, since ε > ε′, P(E1^c ∩ E2) → 0 as n → ∞. Since each U^n(l̃) ∉ C(1) is distributed according to ∏_{i=1}^n p(ui) and is independent of Y^n, by the packing lemma, P(E3) → 0 as n → ∞ if R̃ < I(U; Y) − δ(ε). Combining the two conditions gives R < I(U; Y) − I(U; S) − δ(ε) − δ(ε′), as claimed.

Subcode Generation is "Dual" to Binning

Subcode generation is a channel coding technique: we have a set of messages and we generate many codewords for each message. To send a message, we send one of the codewords in its subcode. Binning is a source coding technique: we have many indices/sequences and we map them into a smaller number of bin indices. To send the index of a sequence, we send its bin index.


DMC with State Available at Encoder — Wyner–Ziv versus Gelfand–Pinsker

The Wyner–Ziv theorem establishes the rate–distortion function for a source X with side information Y noncausally known to the decoder:

R_{SI−D}(D) = min (I(U; X) − I(U; Y))

We proved achievability using binning, covering, and packing. The Gelfand–Pinsker theorem establishes the capacity of a DMC with state noncausally known to the encoder:

C_{SI−E} = max (I(U; Y) − I(U; S))

We proved achievability using subcode generation, covering, and packing. Note the following "dualities":
◮ min is dual to max
◮ Binning is dual to subcode generation
◮ R_{SI−D} is the difference between the covering rate I(U; X) and the packing rate I(U; Y), while C_{SI−E} is the difference between the packing rate I(U; Y) and the covering rate I(U; S)

DM-BC with Private Messages

Consider a 2-receiver DM-BC (X, p(y1, y2|x), Y1 × Y2). The sender X wishes to send a private message to each receiver.

[Diagram: (M1, M2) → Encoder → X^n → p(y1, y2|x); Y1^n → Decoder 1 → M̂1, Y2^n → Decoder 2 → M̂2]

A (2^{nR1}, 2^{nR2}, n) code for the DM-BC consists of:
1 Two message sets [1 : 2^{nR1}] and [1 : 2^{nR2}]
2 An encoder that assigns a codeword x^n(m1, m2) to each message pair (m1, m2) ∈ [1 : 2^{nR1}] × [1 : 2^{nR2}]
3 Two decoders: decoder 1 assigns to each y1^n ∈ Y1^n an estimate m̂1(y1^n) ∈ [1 : 2^{nR1}] or an error e, and decoder 2 assigns to each y2^n ∈ Y2^n an estimate m̂2(y2^n) ∈ [1 : 2^{nR2}] or an error e


nR1 nR2 Assume that (M1, M2) ∼ Unif([1 : 2 ] × [1 : 2 ]) Theorem (Cover–van der Meulen [22, 23]) The average probability of error is A rate pair (R1, R2) is achievable for a DM-BC (X ,p(y1,y2|x), Y1 ×Y2) if (n) ˆ ˆ Pe = P{M1 6= M1 or M2 6= M2} R1

A rate pair (R1, R2) is achievable for the DM-BC if there exists a for some p(u1, u2,x)= p(u1)p(u2)p(x|u1, u2) nR nR (n) sequence of (2 1 , 2 2 ,n) codes with Pe → 0 as n →∞ The capacity region of the DM-BC with private messages is the Codebook generation: Fix p(u1)p(u2)p(x|u1, u2). Randomly and closure of the set of achievable rate pairs n nR1 independently generate u1 (m1), m1 ∈ [1 : 2 ], sequences each n n nR2 Capacity region not known in general according to i=1 pU1 (u1i), and u2 (m2), m2 ∈ [1 : 2 ], sequences Q n We consider inner bounds each according to i=1 pU2 (u2i) n Qn For each (u1 (m1), u2 (m2)) pair, generate the sequence n n x (m1,m2) ∼ pX U ,U (xi|u1i(m1), u2i(m2)) Qi=1 | 1 2


DM-BC with Private Messages — Marton Inner Bound

Encoding (Cover–van der Meulen): To send (m1, m2), transmit x^n(m1, m2).

Decoding and analysis of the probability of error: Decoder j = 1, 2 declares that message m̂j is sent if it is the unique message such that (Uj^n(m̂j), Yj^n) ∈ T_ε^(n).

Marton's inner bound allows U1, U2 in the Cover–van der Meulen bound to be dependent (even though the messages are independent). This comes with a penalty term in the sum rate.

Theorem (Marton [24, 25])

A rate pair (R1, R2) is achievable for a DM-BC (X, p(y1, y2|x), Y1 × Y2) if

R1 < I(U1; Y1),
R2 < I(U2; Y2),
R1 + R2 < I(U1; Y1) + I(U2; Y2) − I(U1; U2)

for some p(u1, u2, x) = p(u1, u2)p(x|u1, u2).

Remark: This inner bound is tight for the semi-deterministic BC and the MIMO Gaussian BC. The new achievability ideas are simultaneous joint typicality encoding and the mutual covering lemma.

Proof of Achievability

Codebook generation: Fix p(u1, u2)p(x|u1, u2). Let R̃j ≥ Rj, j = 1, 2. For each message m1 ∈ [1 : 2^{nR1}], generate a subcode C1(m1) consisting of randomly and independently generated u1^n(l1), l1 ∈ [(m1 − 1)2^{n(R̃1−R1)} + 1 : m1 2^{n(R̃1−R1)}], sequences each according to ∏_{i=1}^n p_{U1}(u1i). Similarly, for each message m2 ∈ [1 : 2^{nR2}], generate a subcode C2(m2) consisting of randomly and independently generated u2^n(l2), l2 ∈ [(m2 − 1)2^{n(R̃2−R2)} + 1 : m2 2^{n(R̃2−R2)}], sequences each according to ∏_{i=1}^n p_{U2}(u2i).

For m1 ∈ [1 : 2^{nR1}] and m2 ∈ [1 : 2^{nR2}], define

C(m1, m2) := {(u1^n(l1), u2^n(l2)) ∈ C1(m1) × C2(m2) : (u1^n(l1), u2^n(l2)) ∈ T_{ε′}^(n)}

For each message pair (m1, m2), pick one index pair (l1, l2) such that (u1^n(l1), u2^n(l2)) ∈ C(m1, m2). If no such pair exists, pick an arbitrary pair (l1, l2) such that (u1^n(l1), u2^n(l2)) ∈ C1(m1) × C2(m2).


DM-BC with Private Messages DM-BC with Private Messages Mutual Covering Lemma [25]

n n nR1 Generate a sequence x (m1,m2) according to Let (U1, U2) ∼ p(u1, u2), ǫ> 0. Let U (m1),m1 ∈ [1 : 2 ], be pairwise n 1 i=1 pX|U1,U2 (xi|u1i(l1), u2i(l2)) independent random sequences, each distributed according to Q n n n nR2 Encoding: To send message pair (m ,m ), transmit x (m ,m ) pU (u1i). Similarly, let U (m2),m2 ∈ [1 : 2 ], be pairwise 1 2 1 2 Qi=1 1 2 Decoding: Let ǫ>ǫ′. Decoder 1 declares that mˆ is sent if it is the independent random sequences, each distributed according to 1 n n nR (n) 1 n n n i=1 pU2 (u2i). Assume {U1 (m1) : m1 ∈ [1 : 2 ]} and unique index such that (u1 (l1),y1 ) ∈Tǫ for some u1 (l1) ∈C1(m ˆ 1); Q {U n(m ) : m ∈ [1 : 2nR2 ]} are independent. There exists δ(ǫ) → 0 as otherwise an error is declared 2 2 2 ǫ → 0 such that if R1 + R2 >I(U1; U2)+ δ(ǫ), then Similarly, decoder 2 finds the unique index mˆ 2 such that (n) n n (n) nR1 nR2 n n n P (U (m1), U (m2)) ∈T/ for all m1 ∈ [1 : 2 ],m2 ∈ [1 : 2 ] → 0 (u2 (l2),y2 ) ∈Tǫ for some u2 (l2) ∈C2(m ˆ 2); otherwise an error is  1 2 ǫ declared A crucial requirement for this coding scheme to work is that we have The proof of the lemma uses Chebychev inequality (see Appendix V) n n at least one sequence pair (u1 (l1), u2 (l2)) ∈C(m1,m2) This lemma extends the covering lemma in two ways: n The constraint on the subcode sizes to guarantee this is provided by By considering a single U1 sequence (R1 = 0), we obtain the same the following lemma rate requirement R2 >I(U1; U2)+ δ(ǫ) as for the covering lemma n We assumed only pairwise independence among the U2 (m2) sequences, which strengthens the covering lemma


Analysis of the Probability of Error

Assume that the message pair $(1,1)$ is sent. Let $(L_1,L_2)$ be the index pair of the chosen sequence pair in $\mathcal{C}(1,1)$.

Consider the average probability of error for decoder 1. Define the error events
$\mathcal{E}_0 := \{ |\mathcal{C}(1,1)| = 0 \}$,
$\mathcal{E}_{11} := \{ (U_1^n(L_1), Y_1^n) \notin T_\epsilon^{(n)} \}$,
$\mathcal{E}_{12} := \{ (U_1^n(l_1), Y_1^n) \in T_\epsilon^{(n)} \text{ for some } l_1 \notin [1:2^{n(\tilde{R}_1-R_1)}] \}$

Then the probability of error for decoder 1 is bounded by
$P(\mathcal{E}_1) \le P(\mathcal{E}_0) + P(\mathcal{E}_0^c \cap \mathcal{E}_{11}) + P(\mathcal{E}_{12})$

To bound $P(\mathcal{E}_0)$, note that subcode $\mathcal{C}_1(1)$ corresponds to $2^{n(\tilde{R}_1-R_1)}$ $U_1^n(l_1)$ sequences and subcode $\mathcal{C}_2(1)$ corresponds to $2^{n(\tilde{R}_2-R_2)}$ $U_2^n(l_2)$ sequences. Hence, by the mutual covering lemma, $P\{|\mathcal{C}(1,1)| = 0\} \to 0$ as $n \to \infty$ if $(\tilde{R}_1 - R_1) + (\tilde{R}_2 - R_2) > I(U_1;U_2) + \delta(\epsilon)$.

Given that $(U_1^n(L_1), U_2^n(L_2)) \in T_{\epsilon'}^{(n)}$ and $\epsilon > \epsilon'$,
$P\{ (U_1^n(L_1), U_2^n(L_2), X^n, Y_1^n) \notin T_\epsilon^{(n)} \} \to 0$ as $n \to \infty$ by the LLN. Hence, $P(\mathcal{E}_0^c \cap \mathcal{E}_{11}) \to 0$ as $n \to \infty$.

Since $Y_1^n$ is independent of every $U_1^n(l_1) \notin \mathcal{C}_1(1)$, by the packing lemma $P(\mathcal{E}_{12}) \to 0$ as $n \to \infty$ if $\tilde{R}_1 < I(U_1;Y_1) - \delta(\epsilon)$.

Similarly, the average probability of error for decoder 2 tends to zero as $n \to \infty$ if $\tilde{R}_2 < I(U_2;Y_2) - \delta(\epsilon)$ and $(\tilde{R}_1 - R_1) + (\tilde{R}_2 - R_2) > I(U_1;U_2) + \delta(\epsilon)$.

Eliminating $\tilde{R}_1, \tilde{R}_2$ gives the inequalities in the theorem.

Remark: Simultaneous encoding is "dual" to simultaneous decoding. Alternatively, we can achieve each corner point of the region using covering and then use time sharing to achieve the rest of the region (as we did in successive cancellation decoding). Simultaneous encoding, however, is more powerful because it can achieve every point in the region.


Relationship to Gelfand–Pinsker

Fix $p(u_1,u_2)p(x|u_1,u_2)$ and consider the achievable rate pair
$(R_1, R_2) = (I(U_1;Y_1) - \delta(\epsilon),\; I(U_2;Y_2) - I(U_1;U_2) - \delta(\epsilon))$

[Figure: $U_1^n \sim p(u_1)$ acts as a state sequence; the encoder maps $M_2$ (and $U_1^n$) into $X^n$ via $p(x|u_1,u_2)$, the channel $p(y_2|x)$ outputs $Y_2^n$, and the decoder forms $\hat{M}_2$.]

So the coding scheme for sending $M_2$ to $Y_2$ is identical to that of sending $M_2$ over the channel
$p(y_2|u_2,u_1) = \sum_{x} p(x|u_1,u_2)\, p(y_2|x)$
with "state" $U_1$ available noncausally at the encoder.

Since we do not know if the Marton inner bound is optimal in general, this relationship may or may not be fundamental.

This relationship, however, turned out to be crucial in establishing the capacity of the MIMO Gaussian BC.

Multivariate Mutual Covering Lemma

The mutual covering lemma can be generalized to more than two variables.

Let $(U_1,U_2,U_3) \sim p(u_1,u_2,u_3)$ and $\epsilon > 0$. For each $j = 1,2,3$, let $U_j^n(m_j)$, $m_j \in [1:2^{nR_j}]$, be pairwise independent random sequences, each distributed according to $\prod_{i=1}^n p_{U_j}(u_{ji})$. Assume that $\{U_1^n(m_1) : m_1 \in [1:2^{nR_1}]\}$, $\{U_2^n(m_2) : m_2 \in [1:2^{nR_2}]\}$, and $\{U_3^n(m_3) : m_3 \in [1:2^{nR_3}]\}$ are independent. Then there exists $\delta(\epsilon) \to 0$ as $\epsilon \to 0$ such that if
$R_1 + R_2 > I(U_1;U_2) + \delta(\epsilon)$,
$R_1 + R_3 > I(U_1;U_3) + \delta(\epsilon)$,
$R_2 + R_3 > I(U_2;U_3) + \delta(\epsilon)$,
$R_1 + R_2 + R_3 > I(U_1;U_2) + I(U_1,U_2;U_3) + \delta(\epsilon)$,
then
$P\{ (U_1^n(m_1), U_2^n(m_2), U_3^n(m_3)) \notin T_\epsilon^{(n)} \text{ for all } (m_1,m_2,m_3) \} \to 0$ as $n \to \infty$

This extension is used to extend the Marton region to more than 2 receivers and in the proof of achievability for multiple description coding [26].

DM Wiretap Channel

Consider a DM-WTC, which is a 2-receiver DM-BC $(\mathcal{X}, p(y,z|x), \mathcal{Y} \times \mathcal{Z})$.

We wish to send a message $M$ to the legitimate receiver $Y$ and keep it secret from the eavesdropper $Z$.

[Figure: $M$ enters the encoder, which transmits $X^n$ over $p(y,z|x)$; the decoder observes $Y^n$ and outputs $\hat{M}$, while the eavesdropper observes $Z^n$.]

A $(2^{nR}, n)$ secrecy code for the DM-WTC consists of:
1. A message set $[1:2^{nR}]$
2. An encoder that generates a codeword $X^n(m)$, $m \in [1:2^{nR}]$, according to $p(x^n|m)$, i.e., a random assignment that depends on $m$
3. A decoder that assigns a message estimate $\hat{m}(y^n)$ to each received sequence $y^n \in \mathcal{Y}^n$

Assume that $M \sim \mathrm{Unif}[1:2^{nR}]$.

The average probability of error for the secrecy code is defined as $P_e^{(n)} = P\{\hat{M} \ne M\}$.

The information leakage rate is defined as $E^{(n)} := (1/n) I(M;Z^n)$.

A rate $R$ is achievable if there exists a sequence of $(2^{nR}, n)$ codes with $P_e^{(n)} \to 0$ and $E^{(n)} \to 0$ as $n \to \infty$.

The secrecy capacity $C_S$ of the DM-WTC is the supremum of all achievable rates.

Theorem (Wyner [27], Csiszár–Körner [28]): The secrecy capacity of the DM-WTC $(\mathcal{X}, p(y,z|x), \mathcal{Y} \times \mathcal{Z})$ is
$C_S = \max_{p(u)p(x|u)} \left( I(U;Y) - I(U;Z) \right)$
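As a concrete instance of the theorem, consider a degraded binary symmetric wiretap channel in which $X \to Y$ is a BSC($p_1$) and $X \to Z$ is a BSC($p_2$) with $p_2 > p_1$; in this classical example the uniform input with $U = X$ is optimal and $C_S = h(p_2) - h(p_1)$, where $h$ is the binary entropy function. The following numeric sketch (crossover values assumed for illustration) evaluates this:

```python
from math import log2

# Numeric sketch (illustrative crossover probabilities, not from the
# lecture): for a degraded binary symmetric wiretap channel, X -> Y a
# BSC(p1) and X -> Z a BSC(p2) with p2 > p1, the uniform input is optimal
# and C_S = h(p2) - h(p1), where h is the binary entropy function.

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

p1, p2 = 0.1, 0.2
C_S = h(p2) - h(p1)    # = I(X;Y) - I(X;Z) under a uniform input
print(round(C_S, 3))   # 0.253
```

So about a quarter bit per channel use can be sent secretly here even though the main channel alone supports $1 - h(p_1) \approx 0.53$ bits.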


The wiretap channel was first introduced by Wyner [27], who assumed that $Z$ is a degraded version of $Y$, i.e., $p(y,z|x) = p(y|x)p(z|y)$. Under this assumption,
$I(U;Y) - I(U;Z) = I(U;Y|Z) \le I(X;Y|Z) = I(X;Y) - I(X;Z)$

Thus, the secrecy capacity is
$C_S = \max_{p(x)} \left( I(X;Y) - I(X;Z) \right)$

The above result also holds if $Y$ is more capable than $Z$, i.e., $I(X;Y) \ge I(X;Z)$ for all $p(x)$.

Proof of Achievability

Codebook generation: Assume $C_S > 0$ and fix $p(u)p(x|u)$ that achieves it. Thus $I(U;Y) - I(U;Z) > 0$.

For each message $m \in [1:2^{nR}]$, generate a subcode $\mathcal{C}(m)$ consisting of $2^{n(\tilde{R}-R)}$ randomly and independently generated sequences $u^n(l)$, $l \in [(m-1)2^{n(\tilde{R}-R)}+1 : m 2^{n(\tilde{R}-R)}]$, each according to $\prod_{i=1}^n p(u_i)$.

The chosen set of subcodes is revealed to the encoder, decoder, and eavesdropper.

Encoding: To send message $m \in [1:2^{nR}]$, the encoder picks an index $l \in [(m-1)2^{n(\tilde{R}-R)}+1 : m 2^{n(\tilde{R}-R)}]$ uniformly at random. It then generates $X^n \sim \prod_{i=1}^n p_{X|U}(x_i|u_i(l))$ and transmits it.
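The random index selection above amounts to simple bin arithmetic. Here is a minimal sketch (function names and parameters are illustrative, not from the lecture):

```python
import random

# Minimal sketch of the random binning encoder above (illustrative names
# and parameters). The 2^{nR~} u-sequences are split into 2^{nR} subcodes
# of 2^{n(R~-R)} indices each; to send m the encoder draws an index
# uniformly from C(m), and the subcode index itself recovers the message.

random.seed(1)
n, R, R_tilde = 8, 0.25, 0.75
bin_size = 2 ** round(n * (R_tilde - R))   # 2^{n(R~-R)} = 16 indices per subcode

def encode(m):
    """Pick l uniformly at random from subcode C(m) (1-based indices)."""
    return (m - 1) * bin_size + random.randint(1, bin_size)

def message_of(l):
    """The message is the index of the subcode containing l."""
    return (l - 1) // bin_size + 1

m = 3
l = encode(m)
print(message_of(l) == m)  # True: the subcode index identifies the message
```

The randomness in choosing $l$ within the subcode is exactly what "burns" rate $\tilde{R} - R > I(U;Z)$ to confuse the eavesdropper.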

Decoding and analysis of the probability of error: The decoder declares that $\hat{l}$ is sent if it is the unique index such that $(u^n(\hat{l}), y^n) \in T_\epsilon^{(n)}$, and declares that the message sent is the subcode index $\hat{m}$ of $u^n(\hat{l})$.

[Figure: the $2^{n\tilde{R}}$ indices $l = 1, \ldots, 2^{n\tilde{R}}$ arranged into $2^{nR}$ subcodes of $2^{n(\tilde{R}-R)}$ indices each.]

By the LLN and the packing lemma, this decoding procedure succeeds with average probability of error $P(\mathcal{E}) \to 0$ as $n \to \infty$ if $\tilde{R} < I(U;Y) - \delta(\epsilon)$.

Analysis of the Information Leakage Rate

Outline: For each subcode $\mathcal{C}(m)$, the eavesdropper (on average) has $\doteq 2^{n(\tilde{R}-R-I(U;Z))}$ $u^n(l)$ sequences that are jointly typical with $z^n$.

Thus if we take $\tilde{R} - R > I(U;Z)$, the eavesdropper would have a roughly equal number of indices (in the exponent) in each subcode, providing it with almost no information about the message sent.


Now consider the eavesdropper message rate averaged over the randomly chosen codebook $\mathcal{C}$:
$I(M;Z^n|\mathcal{C}) = I(M,L;Z^n|\mathcal{C}) - I(L;Z^n|M,\mathcal{C})$
$= I(L;Z^n|\mathcal{C}) - H(L|M,\mathcal{C}) + H(L|Z^n,M,\mathcal{C})$
$\le nI(U;Z) - n(\tilde{R}-R) + H(L|Z^n,M,\mathcal{C})$,
where the second step uses the facts that $M$ is a function of $L$ and $H(L|M,\mathcal{C}) = n(\tilde{R}-R)$, and the last step uses $I(L;Z^n|\mathcal{C}) \le nI(U;Z)$.

The equivocation term in the last step is bounded as follows.

Equivocation Bound Lemma: If $R < \tilde{R} - I(U;Z) - 3\delta(\epsilon)$, then
$\lim_{n\to\infty} H(L|Z^n, m, \mathcal{C})/n \le \tilde{R} - R - I(U;Z) + \delta(\epsilon)$
(see Appendix VI). Combining the lemma with the bound above shows that the leakage rate $(1/n) I(M;Z^n|\mathcal{C})$ is at most $\delta(\epsilon)$.

Remark: In the codebook generation, for each $u^n(l)$ we effectively have $\doteq 2^{nI(X;Z|U)}$ $x^n$ sequences, each generated according to $\prod_{i=1}^n p_{X|U}(x_i|u_i(l))$; the encoder randomly selects one of them for transmission.

Summary of Achievability Ideas

Coding techniques:
◮ Random codebook generation
◮ Joint typicality decoding (packing)
◮ Successive cancellation decoding
◮ Simultaneous joint typicality decoding
◮ Time sharing and coded time sharing
◮ Superposition coding
◮ Rate splitting
◮ Indirect decoding
◮ Block Markov coding
◮ Backward decoding
◮ Joint typicality encoding (covering)
◮ Binning
◮ Subcode generation: one-to-many coding; dual to binning
◮ Simultaneous joint typicality encoding (mutual covering): generalization of covering, dual to simultaneous decoding
◮ Random encoding (for secrecy)

Performance analysis tools:
◮ Division of error events
◮ LLN
◮ Properties of typicality
◮ Packing lemma
◮ Fourier–Motzkin elimination
◮ Covering lemma
◮ Mutual covering lemma
◮ Chebyshev inequality
◮ Entropy and mutual information bounds

There are other coding techniques that we didn't discuss, e.g., linear coding, network coding, iterative refinement for channels with feedback, and corresponding analysis tools.


Conclusion

Presented a simple and unified approach to achievability for DM systems:
◮ Strong typicality
◮ Small number of coding techniques and performance analysis tools

We observed some interesting "dualities":
◮ Covering ↔ Packing
◮ Binning ↔ Subcode generation
◮ Simultaneous covering ↔ Simultaneous packing

Proofs can be extended to Gaussian sources/channels via appropriate scalar quantizations and standard limit theorems.

As we have seen, the same region can often be achieved using different coding schemes, and there can be several equivalent characterizations of the same region. It would be interesting to develop a deeper understanding of these intriguing observations.

References and Appendices

[1] T. C. Hu. Multi-commodity network flows. Operations Research, 11(3):344–360, 1963.
[2] L. R. Ford, Jr. and D. R. Fulkerson. Maximal flow through a network. Canad. J. Math., 8:399–404, 1956.
[3] Peter Elias, Amiel Feinstein, and Claude E. Shannon. A note on the maximum flow through a network. IRE Trans. Inf. Theory, 2(4):117–119, December 1956.
[4] Alon Orlitsky and James R. Roche. Coding for computing. IEEE Trans. Inf. Theory, 47(3):903–917, 2001.
[5] Toby Berger. Multiterminal source coding. In Giuseppe Longo, editor, The Information Theory Approach to Communications. Springer-Verlag, New York, 1978.

[6] Claude E. Shannon. A mathematical theory of communication. Bell System Tech. J., 27:379–423, 623–656, 1948.
[7] G. David Forney, Jr. Information theory. Unpublished course notes, Stanford University, 1972.
[8] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. Wiley, New York, second edition, 2006.
[9] Rudolf Ahlswede. Multi-way communication channels. In Proc. 2nd Int. Symp. Inf. Theory, pages 23–52, Tsahkadsor, Armenian S.S.R., 1971.
[10] Henry H. J. Liao. Multiple Access Channels. Ph.D. thesis, University of Hawaii, Honolulu, September 1972.
[11] János Körner and Katalin Marton. General broadcast channels with degraded message sets. IEEE Trans. Inf. Theory, 23(1):60–64, 1977.
[12] Günter M. Ziegler. Lectures on Polytopes. Springer-Verlag, New York, 1995.
[13] Shashi Borade, Lizhong Zheng, and Mitchell Trott. Multilevel broadcast networks. In Proc. IEEE International Symposium on Information Theory, pages 1151–1155, Nice, France, June 2007.
[14] Chandra Nair and Abbas El Gamal. The capacity region of a class of 3-receiver broadcast channels with degraded message sets. 2008.
[15] Hon-Fah Chong, M. Motani, H. K. Garg, and H. El Gamal. On the Han–Kobayashi region for the interference channel. IEEE Trans. Inf. Theory, 54(7):3188–3195, July 2008.
[16] Te Sun Han and Kingo Kobayashi. A new achievable rate region for the interference channel. IEEE Trans. Inf. Theory, 27(1):49–60, 1981.
[17] Claude E. Shannon.

Coding theorems for a discrete source with a fidelity criterion. In IRE Int. Conv. Rec., part 4, volume 7, pages 142–163, 1959. Reprinted with changes in Information and Decision Processes, R. E. Machol, Ed. New York: McGraw-Hill, 1960, pp. 93–126.
[18] Aaron D. Wyner and Jacob Ziv. The rate-distortion function for source coding with side information at the decoder. IEEE Trans. Inf. Theory, 22(1):1–10, 1976.
[19] David Slepian and Jack Keil Wolf. A coding theorem for multiple access channels with correlated sources. Bell System Tech. J., 52:1037–1076, 1973.
[20] S. I. Gelfand and M. S. Pinsker. Coding for channel with random parameters. Probl. Control Inf. Theory, 9(1):19–31, 1980.
[21] Chris Heegard and Abbas El Gamal. On the capacity of computer memories with defects. IEEE Trans. Inf. Theory, 29(5):731–739, 1983.
[22] Thomas M. Cover. An achievable rate region for the broadcast channel. IEEE Trans. Inf. Theory, 21:399–404, 1975.
[23] Edward C. van der Meulen. Random coding theorems for the general discrete memoryless broadcast channel. IEEE Trans. Inf. Theory, 21:180–190, 1975.
[24] Katalin Marton. A coding theorem for the discrete memoryless broadcast channel. IEEE Trans. Inf. Theory, 25(3):306–311, 1979.
[25] Abbas El Gamal and Edward C. van der Meulen. A proof of Marton's coding theorem for the discrete memoryless broadcast channel. IEEE Trans. Inf. Theory, 27(1):120–122, January 1981.
[26] Abbas El Gamal and Thomas M. Cover. Achievable rates for multiple descriptions. IEEE Trans. Inf. Theory, 28(6):851–857, 1982.
[27] A. D. Wyner. The wire-tap channel. Bell System Tech. J., 54(8):1355–1387, 1975.
[28] Imre Csiszár and János Körner. Broadcast channels with confidential messages. IEEE Trans. Inf. Theory, 24(3):339–348, 1978.
[29] Thomas M. Cover and Abbas El Gamal. Capacity theorems for the relay channel. IEEE Trans. Inf. Theory, 25(5):572–584, September 1979.
[30] Frans M. J. Willems and Edward C. van der Meulen. The discrete memoryless multiple-access channel with cribbing encoders. IEEE Trans. Inf. Theory, 31(3):313–327, 1985.
[31] Liang-Liang Xie and P. R. Kumar. An achievable rate for the multiple-level relay channel. IEEE Trans. Inf. Theory, 51(4):1348–1358, 2005.
[32] Abbas El Gamal, Mehdi Mohseni, and Sina Zahedi. Bounds on capacity and minimum energy-per-bit for AWGN relay channels. IEEE Trans. Inf. Theory, 52(4):1545–1561, 2006.

Appendix I: Proof of the Packing Lemma

Define the events $\mathcal{E}_m := \{ (U^n, X^n(m), Y^n) \in T_\epsilon^{(n)} \}$, $m \in \mathcal{A}$.

The probability of the event of interest can be bounded as
$P\left( \bigcup_{m \in \mathcal{A}} \mathcal{E}_m \right) \le \sum_{m \in \mathcal{A}} P(\mathcal{E}_m)$

Now, consider
$P(\mathcal{E}_m) = P\{ (U^n, X^n(m), Y^n) \in T_\epsilon^{(n)}(U,X,Y) \}$
$= \sum_{(u^n,y^n) \in T_\epsilon^{(n)}} p(u^n,y^n)\, P\{ (u^n, X^n(m), y^n) \in T_\epsilon^{(n)}(U,X,Y) \mid U^n = u^n \}$
$\stackrel{(a)}{\le} \sum_{(u^n,y^n) \in T_\epsilon^{(n)}} p(u^n,y^n)\, 2^{-n(I(X;Y|U)-\delta(\epsilon))} \le 2^{-n(I(X;Y|U)-\delta(\epsilon))}$,
where (a) follows by the conditional joint typicality lemma, since $(u^n,y^n) \in T_\epsilon^{(n)}$, $X^n(m) \sim \prod_{i=1}^n p_{X|U}(x_i|u_i)$, and $\delta(\epsilon) \to 0$ as $\epsilon \to 0$. Hence
$\sum_{m \in \mathcal{A}} P(\mathcal{E}_m) \le |\mathcal{A}|\, 2^{-n(I(X;Y|U)-\delta(\epsilon))} \le 2^{-n(I(X;Y|U)-R-\delta(\epsilon))}$,
which tends to 0 as $n \to \infty$ if $R < I(X;Y|U) - \delta(\epsilon)$.
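The final bound decays exponentially in $n$. The following numeric sketch (with an assumed value of $I(X;Y|U)$, not from the lecture) simply evaluates $2^{-n(I(X;Y|U)-R-\delta(\epsilon))}$ to show the decay when $R < I(X;Y|U) - \delta(\epsilon)$:

```python
# Numeric sketch (assumed example values, not from the lecture): the union
# bound 2^{-n(I(X;Y|U) - R - delta)} at the end of the proof decays
# geometrically in n whenever R < I(X;Y|U) - delta.

I_cond = 0.5          # assumed value of I(X;Y|U) in bits
R, delta = 0.3, 0.05  # R < I_cond - delta, so the exponent is negative

bounds = [2 ** (-n * (I_cond - R - delta)) for n in (10, 100, 1000)]
print(all(b2 < b1 for b1, b2 in zip(bounds, bounds[1:])))  # True
```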


Appendix II: Fourier–Motzkin for Han–Kobayashi

Substituting $R_{jj} = R_j - R_{j0}$ for $j = 1, 2$, we obtain the system of inequalities:

$R_1 - R_{10} \le I_1$,
$R_1 \le I_2$,
$R_1 - R_{10} + R_{20} \le I_3$,
$R_1 + R_{20} \le I_4$,
$R_2 - R_{20} \le I_5$,
$R_2 \le I_6$,
$R_2 - R_{20} + R_{10} \le I_7$,
$R_2 + R_{10} \le I_8$

Next we eliminate $R_{j0}$ for $j = 1, 2$.

Elimination of $R_{10}$: We have two upper bounds on $R_{10}$:

$R_{10} \le I_7 - R_2 + R_{20}$,
$R_{10} \le I_8 - R_2$

and two lower bounds on $R_{10}$:

$R_{10} \ge R_1 - I_1$,
$R_{10} \ge R_1 + R_{20} - I_3$

Comparing upper and lower bounds, and copying four inequalities in the original system that do not involve $R_{10}$, we obtain a new system of inequalities in $(R_{20}, R_1, R_2)$ given by

$R_1 \le I_2$,
$R_1 + R_{20} \le I_4$,
$R_2 - R_{20} \le I_5$,
$R_2 \le I_6$,
$R_1 + R_2 - R_{20} \le I_1 + I_7$,
$R_1 + R_2 \le I_1 + I_8$,
$R_1 + R_2 \le I_3 + I_7$,
$R_1 + R_2 + R_{20} \le I_3 + I_8$

Noting that none of these 8 inequalities is redundant, we move on to eliminate $R_{20}$ from them.

Elimination of $R_{20}$: Comparing upper bounds on $R_{20}$

$R_{20} \le I_4 - R_1$,
$R_{20} \le I_3 + I_8 - R_1 - R_2$

with lower bounds

$R_{20} \ge R_2 - I_5$,
$R_{20} \ge R_1 + R_2 - I_1 - I_7$,

and copying inequalities that do not involve $R_{20}$, we obtain

$R_1 \le I_2$,
$R_2 \le I_6$,
$R_1 + R_2 \le I_1 + I_8$,
$R_1 + R_2 \le I_3 + I_7$,
$R_1 + R_2 \le I_4 + I_5$,
$2R_1 + R_2 \le I_1 + I_4 + I_7$,
$R_1 + 2R_2 \le I_3 + I_5 + I_8$,
$2R_1 + 2R_2 \le I_1 + I_3 + I_7 + I_8$

Finally, we note that the last inequality is redundant since it is implied by the third and fourth inequalities; thus we have 7 inequalities on $(R_1, R_2)$.

Appendix III: Proof of the Covering Lemma

From the joint typicality lemma,
$P\{ (u^n, x^n, \hat{X}^n(m)) \in T_\epsilon^{(n)} \} \ge 2^{-n(I(X;\hat{X}|U)+\delta(\epsilon))}$
for each $m \in \mathcal{A}$, where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$.

Hence, the probability of the event of interest is bounded by
$P\{ (u^n, x^n, \hat{X}^n(m)) \notin T_\epsilon^{(n)} \text{ for all } m \} = \prod_{m \in \mathcal{A}} P\{ (u^n, x^n, \hat{X}^n(m)) \notin T_\epsilon^{(n)} \}$
$\le \left[ 1 - 2^{-n(I(X;\hat{X}|U)+\delta(\epsilon))} \right]^{|\mathcal{A}|}$
$\le e^{-|\mathcal{A}| \cdot 2^{-n(I(X;\hat{X}|U)+\delta(\epsilon))}}$
$\le e^{-2^{n(R-I(X;\hat{X}|U)-\delta(\epsilon))}}$,
which goes to zero as $n \to \infty$, provided that $R > I(X;\hat{X}|U) + \delta(\epsilon)$.

Taking expectation over $(U^n, X^n)$ completes the proof.
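The elimination steps in Appendix II can be checked mechanically. The sketch below (data layout and helper names are our own) encodes each inequality as integer coefficient vectors over $(R_1,R_2,R_{10},R_{20})$ and $(I_1,\ldots,I_8)$ and performs Fourier–Motzkin elimination; since every eliminated coefficient here is $\pm 1$, combining a positive and a negative inequality is a plain sum.

```python
from itertools import product

def rhs(*ks):
    """Right-hand side as counts of I_1..I_8 (e.g. rhs(1, 7) means I1 + I7)."""
    c = [0] * 8
    for k in ks:
        c[k - 1] += 1
    return tuple(c)

# Variables ordered (R1, R2, R10, R20); each inequality
# a . (R1,R2,R10,R20) <= c . (I1,...,I8) is stored as (a, c).
system = [
    ((1, 0, -1, 0), rhs(1)),   # R1 - R10             <= I1
    ((1, 0, 0, 0),  rhs(2)),   # R1                   <= I2
    ((1, 0, -1, 1), rhs(3)),   # R1 - R10 + R20       <= I3
    ((1, 0, 0, 1),  rhs(4)),   # R1 + R20             <= I4
    ((0, 1, 0, -1), rhs(5)),   # R2 - R20             <= I5
    ((0, 1, 0, 0),  rhs(6)),   # R2                   <= I6
    ((0, 1, 1, -1), rhs(7)),   # R2 - R20 + R10       <= I7
    ((0, 1, 1, 0),  rhs(8)),   # R2 + R10             <= I8
]

def eliminate(ineqs, var):
    """Fourier-Motzkin elimination of variable index `var` (coefs are +-1)."""
    pos = [q for q in ineqs if q[0][var] > 0]
    neg = [q for q in ineqs if q[0][var] < 0]
    zero = [q for q in ineqs if q[0][var] == 0]
    combined = [(tuple(x + y for x, y in zip(p[0], q[0])),
                 tuple(x + y for x, y in zip(p[1], q[1])))
                for p, q in product(pos, neg)]
    return zero + combined

after = eliminate(eliminate(system, 2), 3)   # eliminate R10, then R20
print(len(after))                             # 8 inequalities in (R1, R2)
print(((2, 1, 0, 0), rhs(1, 4, 7)) in after)  # True: 2R1 + R2 <= I1 + I4 + I7
print(((2, 2, 0, 0), rhs(1, 3, 7, 8)) in after)  # True: the redundant one
```

The 8 resulting inequalities match the list above, including the redundant $2R_1 + 2R_2$ bound that is then dropped by hand.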



Appendix IV: Bounding P(E3) in Wyner–Ziv Achievability

First we show that
$P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(m), \tilde{l} \ne L \mid M = m \}$
$\le P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(1) \mid M = m \}$

The claim holds trivially for $m = 1$. For $m \ne 1$, consider
$P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(m), \tilde{l} \ne L \mid M = m \}$
$= \sum_{l \in \mathcal{B}(m)} p(l|m)\, P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(m), \tilde{l} \ne l \mid L = l, M = m \}$
$\stackrel{(a)}{=} \sum_{l \in \mathcal{B}(m)} p(l|m)\, P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(m), \tilde{l} \ne l \mid L = l \}$
$\stackrel{(b)}{=} \sum_{l \in \mathcal{B}(m)} p(l|m)\, P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in [1 : 2^{n(\tilde{R}-R)} - 1] \mid L = l \}$
$\le \sum_{l \in \mathcal{B}(m)} p(l|m)\, P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(1) \mid L = l \}$
$\stackrel{(c)}{=} \sum_{l \in \mathcal{B}(m)} p(l|m)\, P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(1) \mid L = l, M = m \}$
$= P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(1) \mid M = m \}$,
where (a) and (c) follow since $M$ is a function of $L$, and (b) follows since, by the code generation, given $L = l$, any collection of $2^{n(\tilde{R}-R)} - 1$ $U^n(\tilde{l})$ sequences with $\tilde{l} \ne l$ has the same distribution.

Hence
$P(\mathcal{E}_3) = \sum_m p(m)\, P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(m), \tilde{l} \ne L \mid M = m \}$
$\le \sum_m p(m)\, P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(1) \mid M = m \}$
$= P\{ (U^n(\tilde{l}), Y^n) \in T_\epsilon^{(n)} \text{ for some } \tilde{l} \in \mathcal{B}(1) \}$

Appendix V: Proof of the Mutual Covering Lemma

Let
$\mathcal{B} = \{ (m_1,m_2) \in [1:2^{nR_1}] \times [1:2^{nR_2}] : (U_1^n(m_1), U_2^n(m_2)) \in T_\epsilon^{(n)}(U_1,U_2) \}$

Then the probability of the event of interest can be bounded as
$P\{ |\mathcal{B}| = 0 \} \le P\{ (|\mathcal{B}| - \mathrm{E}|\mathcal{B}|)^2 \ge (\mathrm{E}|\mathcal{B}|)^2 \} \le \mathrm{Var}(|\mathcal{B}|) / (\mathrm{E}|\mathcal{B}|)^2$
by the Chebyshev inequality.

Using indicator random variables, we can express $|\mathcal{B}|$ as
$|\mathcal{B}| = \sum_{m_1=1}^{2^{nR_1}} \sum_{m_2=1}^{2^{nR_2}} E(m_1,m_2)$,
where
$E(m_1,m_2) := \begin{cases} 1 & \text{if } (U_1^n(m_1), U_2^n(m_2)) \in T_\epsilon^{(n)}, \\ 0 & \text{otherwise} \end{cases}$
for each $(m_1,m_2) \in [1:2^{nR_1}] \times [1:2^{nR_2}]$.

Let
$p_1 := P\{ (U_1^n(1), U_2^n(1)) \in T_\epsilon^{(n)} \}$,
$p_2 := P\{ (U_1^n(1), U_2^n(1)) \in T_\epsilon^{(n)}, (U_1^n(1), U_2^n(2)) \in T_\epsilon^{(n)} \}$,
$p_3 := P\{ (U_1^n(1), U_2^n(1)) \in T_\epsilon^{(n)}, (U_1^n(2), U_2^n(1)) \in T_\epsilon^{(n)} \}$,
$p_4 := P\{ (U_1^n(1), U_2^n(1)) \in T_\epsilon^{(n)}, (U_1^n(2), U_2^n(2)) \in T_\epsilon^{(n)} \} = p_1^2$

Then
$\mathrm{E}(|\mathcal{B}|) = \sum_{m_1,m_2} P\{ (U_1^n(m_1), U_2^n(m_2)) \in T_\epsilon^{(n)} \} = 2^{n(R_1+R_2)} p_1$
and
$\mathrm{E}(|\mathcal{B}|^2) = \sum_{m_1,m_2} P\{ (U_1^n(m_1), U_2^n(m_2)) \in T_\epsilon^{(n)} \}$
$\; + \sum_{m_1,m_2,m_2' \ne m_2} P\{ (U_1^n(m_1), U_2^n(m_2)) \in T_\epsilon^{(n)}, (U_1^n(m_1), U_2^n(m_2')) \in T_\epsilon^{(n)} \}$
$\; + \sum_{m_1,m_2,m_1' \ne m_1} P\{ (U_1^n(m_1), U_2^n(m_2)) \in T_\epsilon^{(n)}, (U_1^n(m_1'), U_2^n(m_2)) \in T_\epsilon^{(n)} \}$
$\; + \sum_{m_1,m_2,m_1' \ne m_1,m_2' \ne m_2} P\{ (U_1^n(m_1), U_2^n(m_2)) \in T_\epsilon^{(n)}, (U_1^n(m_1'), U_2^n(m_2')) \in T_\epsilon^{(n)} \}$
$\le 2^{n(R_1+R_2)} p_1 + 2^{n(R_1+2R_2)} p_2 + 2^{n(2R_1+R_2)} p_3 + 2^{2n(R_1+R_2)} p_4$
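For intuition, the resulting three-term bound on $\mathrm{Var}(|\mathcal{B}|)/(\mathrm{E}|\mathcal{B}|)^2$ can be evaluated numerically (rates and the value of $I(U_1;U_2)$ below are assumed example values, not from the lecture):

```python
# Numeric sketch (assumed example values, not from the lecture): evaluating
# the three-term bound on Var(|B|)/(E|B|)^2 shows that the Chebyshev bound
# on P{|B| = 0} decays once R1 > 4*delta, R2 > 4*delta, and
# R1 + R2 > I(U1;U2) + delta.

R1, R2, I12, delta = 0.2, 0.2, 0.3, 0.01   # assumed I(U1;U2) = 0.3 bits

def var_over_mean_sq(n):
    return (2 ** (-n * (R1 + R2 - I12 - delta))
            + 2 ** (-n * (R1 - 4 * delta))
            + 2 ** (-n * (R2 - 4 * delta)))

values = [var_over_mean_sq(n) for n in (50, 100, 200)]
print(all(b2 < b1 for b1, b2 in zip(values, values[1:])))  # True: the bound decays
```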


Hence
$\mathrm{Var}(|\mathcal{B}|) \le 2^{n(R_1+R_2)} p_1 + 2^{n(R_1+2R_2)} p_2 + 2^{n(2R_1+R_2)} p_3$

Now by the joint typicality lemma, we have
$p_1 \ge 2^{-n(I(U_1;U_2)+\delta(\epsilon))}$,
$p_2 \le 2^{-2n(I(U_1;U_2)-\delta(\epsilon))}$,
$p_3 \le 2^{-2n(I(U_1;U_2)-\delta(\epsilon))}$,
hence
$p_2/p_1^2 \le 2^{4n\delta(\epsilon)}, \quad p_3/p_1^2 \le 2^{4n\delta(\epsilon)}$

Therefore,
$\frac{\mathrm{Var}(|\mathcal{B}|)}{(\mathrm{E}|\mathcal{B}|)^2} \le 2^{-n(R_1+R_2-I(U_1;U_2)-\delta(\epsilon))} + 2^{-n(R_1-4\delta(\epsilon))} + 2^{-n(R_2-4\delta(\epsilon))}$,
so $P\{|\mathcal{B}| = 0\} \to 0$ as $n \to \infty$ if $R_1 > 4\delta(\epsilon)$, $R_2 > 4\delta(\epsilon)$, and $R_1 + R_2 > I(U_1;U_2) + \delta(\epsilon)$.

Similarly, $P\{|\mathcal{B}| = 0\} \to 0$ as $n \to \infty$ if $R_1 = 0$ and $R_2 > I(U_1;U_2) + \delta(\epsilon)$, or if $R_1 > I(U_1;U_2) + \delta(\epsilon)$ and $R_2 = 0$.

Combining the three sets of inequalities, we have shown that $P\{|\mathcal{B}| = 0\} \to 0$ as $n \to \infty$ if $R_1 + R_2 > I(U_1;U_2) + 5\delta(\epsilon)$.

Appendix VI: Proof of the Equivocation Bound Lemma

We bound $H(L|Z^n, m, \mathcal{C})/n$ for every $m$.

We first bound the average probability of the following events. For every $m$ and codebook $\mathcal{C}$, let $N(m, \mathcal{C}) := |\{ l \in \mathcal{C}(m) : (U^n(l), Z^n) \in T_\epsilon^{(n)} \}|$. It is not difficult to show that
$\mathrm{E}(N(m, \mathcal{C})) \doteq 2^{n(\tilde{R}-R-I(U;Z))}$,
$\mathrm{Var}(N(m, \mathcal{C})) \le 2^{n(\tilde{R}-R-I(U;Z)+\delta(\epsilon))}$

Define the random event $\mathcal{E}_2(m, \mathcal{C}) := \{ N(m, \mathcal{C}) \ge 2\, \mathrm{E}(N(m, \mathcal{C})) \}$. Using the Chebyshev inequality, the probability of $\mathcal{E}_2$ averaged over codebooks is
$P(\mathcal{E}_2(m, \mathcal{C})) \le \frac{\mathrm{Var}(N(m, \mathcal{C}))}{(\mathrm{E}(N(m, \mathcal{C})))^2} \le 2^{-n(\tilde{R}-R-I(U;Z)-3\delta(\epsilon))}$

Thus if $\tilde{R} - R - I(U;Z) - 3\delta(\epsilon) > 0$, i.e., $R < \tilde{R} - I(U;Z) - 3\delta(\epsilon)$, then $P(\mathcal{E}_2(m, \mathcal{C})) \to 0$ as $n \to \infty$ for every $m$.

For each code $\mathcal{C}$ and message $m$, define the indicator random variable $I(m, \mathcal{C}) := 0$ if $(U^n(L), Z^n) \in T_\epsilon^{(n)}$ and the event $\mathcal{E}_2^c$ occurs, and $I(m, \mathcal{C}) := 1$ otherwise.

Clearly, $\mathrm{E}(I(m, \mathcal{C})) = P\{ I(m, \mathcal{C}) = 1 \} \le P\{ (U^n(L), Z^n) \notin T_\epsilon^{(n)} \} + P(\mathcal{E}_2(m, \mathcal{C}))$.

The first term $\to 0$ as $n \to \infty$ by the LLN, and the second term $\to 0$ as $n \to \infty$ if $R < \tilde{R} - I(U;Z) - 3\delta(\epsilon)$.


We are now ready to bound the remaining term in the eavesdropper message rate bound. Consider

n n H(L|Z , m, C) ≤ 1+ p(C) P{I(m, C) = 1|C = C}H(L|Z ,m,I(m, C) = 1, C) XC + H(L|Zn,m,I(m, C) = 0, C) ≤ 1+ n(R˜ − R) P{I(m, C) = 1} + H(L|Zn,m,I(m, C) = 0, C) ≤ 1+ n(R˜ − R) P{I(m, C) = 1} + n(R˜ − R − I(U; Z)+ δ(ǫ))

Now since $P\{ I(m, \mathcal{C}) = 1 \} \to 0$ as $n \to \infty$ if $R < \tilde{R} - I(U;Z) - 3\delta(\epsilon)$, we have

$\lim_{n \to \infty} H(L|Z^n, m, \mathcal{C})/n \le \tilde{R} - R - I(U;Z) + \delta(\epsilon)$

This completes the proof of the lemma
