
LECTURE 17

Last time:
- Differential entropy
- Entropy rate and Burg's theorem
- AEP

Lecture outline:
- White Gaussian noise
- Bandlimited WGN
- Additive White Gaussian Noise (AWGN) channel
- Capacity of the AWGN channel
- Application: DS-CDMA systems
- Spreading
- Coding theorem

Reading: Sections 10.1-10.3.

White Gaussian Noise (WGN)

WGN is a good model for a variety of noise processes:

- ambient noise

- dark current

- noise from amplifiers

A sample N(t) at any time t is Gaussian

It is defined by its autocorrelation function: R(τ) = N0 δ(τ)

Its power spectral density is N0 over the whole spectrum (flat PSD)

What is its bandwidth?

What is its energy?

Do we ever actually have WGN?

Bandlimited WGN

WGN is always bandlimited to some bandwidth W, thus the noise is passed through a bandpass filter

The power density spectrum is now nonzero only over an interval W of frequency

The energy of the noise is then W N0

The Nyquist rate is then W and samples are 1/W apart in time

Discrete-time samples of the bandlimited noise, which by abuse of notation we still denote N, are IID

N_i ∼ N(0, σ_N^2)

AWGN Channel

We operate in sampled time, hence already in a bandlimited world

We ignore the impossibility of bandlimiting and time-limiting simultaneously (one can obtain bounds for the limiting in time, and we are inherently considering very long times, as per the coding theorem)

Y_i = X_i + N_i

The Nyquist rate of the output is the Nyquist rate of the input

The input and the noise are independent

The input has a constraint on its average energy, or average variance
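A minimal discrete-time sketch of this channel model (the variances below are illustrative values, not from the notes; the Gaussian input anticipates the capacity-achieving choice derived next):

    import numpy as np

    rng = np.random.default_rng(0)
    n, sigma_x2, sigma_n2 = 100_000, 1.0, 0.5    # assumed input and noise variances

    X = rng.normal(0, np.sqrt(sigma_x2), n)      # input samples, independent of the noise
    N = rng.normal(0, np.sqrt(sigma_n2), n)      # bandlimited WGN samples, IID N(0, sigma_N^2)
    Y = X + N                                    # AWGN channel output Y_i = X_i + N_i

    print(np.var(Y), sigma_x2 + sigma_n2)        # empirical output variance vs. sigma_X^2 + sigma_N^2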

Capacity of AWGN Channel

We have a memoryless channel

The same arguments as for the DMC imply that the X_i's should be IID, yielding IID Y_i's

The variance of Y_i is σ_X^2 + σ_N^2

Hence we omit the i subscript in the following

I(X; Y) = h(Y) − h(Y|X)
        = h(Y) − E_X[h(Y − x | X = x)]
        = h(Y) − E_X[h(N | X = x)]
        = h(Y) − h(N)
        ≤ (1/2) ln(2πe(σ_X^2 + σ_N^2)) − (1/2) ln(2πe σ_N^2)
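Combining the two logarithms, the bound can be rewritten as

I(X; Y) ≤ (1/2) ln((σ_X^2 + σ_N^2)/σ_N^2) = (1/2) ln(1 + σ_X^2/σ_N^2)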

To achieve the inequality with equality, select the X_i's to be IID ∼ N(0, σ_X^2) (they could have any mean other than 0, but this would affect the second moment without changing the variance, so it is not useful in this case); hence the X_i's are themselves samples of bandlimited WGN

Capacity of AWGN Channel

Another view of capacity

I(X; Y) = h(X) − h(X|Y)
        = h(X) − E_Y[h(X − αy | Y = y)]

In particular, we can select α so that αY is the LLSE estimate of X from Y

For X Gaussian, this estimate is also the MMSE estimate of X from Y

The error of the estimate is X̃ = X − αy

The error is independent of the value y (recall that for any jointly Gaussian random variables, the first rv can be expressed as a weighted version of the second plus an independent Gaussian rv)

The error is Gaussian
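Explicitly, in the Gaussian case the LLSE coefficient and error variance are

α = σ_X^2/(σ_X^2 + σ_N^2),   Var(X̃) = σ_X^2 σ_N^2/(σ_X^2 + σ_N^2)

so that

h(X) − h(X̃) = (1/2) ln(2πe σ_X^2) − (1/2) ln(2πe σ_X^2 σ_N^2/(σ_X^2 + σ_N^2)) = (1/2) ln(1 + σ_X^2/σ_N^2)

recovering the same capacity expression as in the first derivation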

Capacity of AWGN Channel

Note that this maximizes E_Y[h(X − αy | Y = y)]

The X_i's are still IID

E_Y[h(X − αy | Y = y)] = E_Y[h(X̃ | Y = y)] ≤ h(X̃) ≤ (1/2) ln(2πe σ_{X̃}^2)

We can clearly see that adding a constant to the input would not help

Note that, if the channel is acting like a jammer to the signal, then the jammer cannot control the h(X) term; it can only control the h(X|Y) term

All of the above inequalities are reached with equality if the channel is an AWGN channel

Capacity of AWGN Channel

Thus, the channel is acting as an optimal jammer to the signal

Saddle point:

- if the channel acts as an AWGN channel, then the sender should send bandlimited WGN under a specific variance constraint (energy constraint)

- if the input is bandlimited WGN, then under a variance constraint on the noise, the channel will act as an AWGN channel

The capacity is thus

C = (1/2) ln(σ_Y^2/σ_N^2) = (1/2) ln(1 + σ_X^2/σ_N^2)

For small SNR, the capacity is roughly proportional to the SNR
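A quick numerical illustration (a minimal sketch; the function name and SNR values below are illustrative, not from the notes):

    import numpy as np

    def awgn_capacity_nats(snr):
        """Capacity of the discrete-time AWGN channel in nats per sample."""
        return 0.5 * np.log(1.0 + snr)

    for snr in [0.01, 0.1, 1.0, 10.0]:
        c = awgn_capacity_nats(snr)
        print(f"SNR = {snr:5.2f}: C = {c:.4f} nats/sample "
              f"(~ {c / np.log(2):.4f} bits), small-SNR approx SNR/2 = {snr / 2:.4f}")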

Application: DS-CDMA systems

Consider a direct-sequence code division multiple access system

Set U of users sharing the bandwidth

Every user acts as a jammer to every other user

Y_i = X_i^j + Σ_{k∈U\{j}} X_i^k + N_i

The capacity for user j is

C_j = (1/2) ln(1 + σ_{X_j}^2 / (Σ_{k∈U\{j}} σ_{X_k}^2 + σ_N^2))
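A small numerical sketch of this per-user expression (the user powers and noise variance below are made-up values for illustration):

    import numpy as np

    def cdma_user_capacity(powers, noise_var, j):
        """Per-sample capacity (nats) for user j, treating all other users as Gaussian noise."""
        interference = sum(p for k, p in enumerate(powers) if k != j)
        return 0.5 * np.log(1.0 + powers[j] / (interference + noise_var))

    powers = [1.0, 1.0, 0.5]   # assumed per-user signal variances sigma_{X_k}^2
    noise_var = 0.1            # assumed sigma_N^2
    for j in range(len(powers)):
        print(f"user {j}: C_j = {cdma_user_capacity(powers, noise_var, j):.4f} nats/sample")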

Spreading

What happens when we use more bandwidth (spreading)?

Let us revisit the expression for capacity:

C = (1/2) ln(1 + σ_X^2/σ_N^2)

Problem: when we change the bandwidth, we also change the number of samples

Think in terms of number of degrees of freedom

For time T and bandwidth W

The total energy of the input stays the same, but the energy of the noise is proportional to W

Or, alternatively, per degree of freedom, we have the same energy per degree of freedom for the noise, but the energy for the input decreases proportionally to 1/W

Spreading

For a given T with an energy constraint E on the input over that time, the maximum mutual information in nats per second is:

(W/2) ln(1 + E/(T W N0))

The limit as W → ∞ is E/(2 T N0)

Spreading is always beneficial, but its effect is bounded
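A quick numerical check of this limit (a minimal sketch; the values of E, T, and N0 are illustrative):

    import numpy as np

    E, T, N0 = 1.0, 1.0, 0.1   # assumed energy, time, and noise spectral density

    def nats_per_second(W):
        """Maximum mutual information in nats per second for bandwidth W."""
        return 0.5 * W * np.log(1.0 + E / (T * W * N0))

    for W in [1, 10, 100, 1000]:
        print(f"W = {W:5d}: {nats_per_second(W):.4f} nats/s")
    print(f"wideband limit E/(2 T N0) = {E / (2 * T * N0):.4f} nats/s")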

Geometric interpretation in terms of concavity

Coding theorem

We no longer have a discrete input alphabet, but a continuous input alphabet

We now have transition pdfs f_{Y|X}(y|x)

For any block length n, let

f_{Y^n|X^n}(y^n | x^n) = ∏_{i=1}^n f_{Y|X}(y_i | x_i)

f_{X^n}(x^n) = ∏_{i=1}^n f_X(x_i)

f_{Y^n}(y^n) = ∏_{i=1}^n f_Y(y_i)

Let R be an arbitrary rate R < C

For each n, consider choosing a code of M = ⌈e^{nR}⌉ codewords, where each codeword is chosen independently with pdf assignment f_{X^n}(x^n)

Coding theorem

Let ε = (C − R)/2 and let T_ε^n be the set of pairs (x^n, y^n) such that

|(1/n) i(x^n; y^n) − C| ≤ ε

where i is the sample natural mutual information

i(x^n; y^n) = ln [ f_{Y^n|X^n}(y^n | x^n) / f_{Y^n}(y^n) ]

For every n and each code in the ensemble, the decoder, given y^n, selects the message m for which (x^n(m), y^n) ∈ T_ε^n.

We declare an error if there is no such codeword or if there is more than one.

Upper bound on probability

Let λ_m be the indicator of the event that, given message m enters the system, an error occurs.

The mean probability of error over the ensemble of codes is

E[λ_m] = P(λ_m = 1)

(indicator function)

An error occurs when

(x^n(m), y^n) ∉ T_ε^n

or

(x^n(m′), y^n) ∈ T_ε^n for some m′ ≠ m

Upper bound on probability

Hence, through the union bound,

E[λ_m] = P( {(x^n(m), y^n) ∉ T_ε^n} ∪ ⋃_{m′≠m} {(x^n(m′), y^n) ∈ T_ε^n} | m )
       ≤ P((x^n(m), y^n) ∉ T_ε^n | m) + Σ_{m′≠m} P((x^n(m′), y^n) ∈ T_ε^n | m)

The probability of the pair not being typical approaches 0 as n → ∞:

(1/n) i(x^n(m); y^n) = (1/n) Σ_{i=1}^n ln [ f_{Y|X}(y_i | x_i) / f_Y(y_i) ]

Through the WLLN, the above converges in probability to

C = ∫_{x∈X, y∈Y} f_X(x) f_{Y|X}(y|x) ln [ f_{Y|X}(y|x) / f_Y(y) ] dx dy
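A minimal simulation sketch of this concentration for the AWGN channel (the variances are illustrative values; the closed-form Gaussian densities are used for f_{Y|X} and f_Y):

    import numpy as np

    rng = np.random.default_rng(0)
    sigma_x2, sigma_n2 = 1.0, 0.5        # assumed input and noise variances
    C = 0.5 * np.log(1 + sigma_x2 / sigma_n2)

    def log_gauss(y, mean, var):
        """Log of the N(mean, var) density evaluated at y."""
        return -0.5 * np.log(2 * np.pi * var) - (y - mean) ** 2 / (2 * var)

    def sample_mutual_info_rate(n):
        """(1/n) i(x^n; y^n) for one randomly drawn codeword/output pair."""
        x = rng.normal(0, np.sqrt(sigma_x2), n)
        y = x + rng.normal(0, np.sqrt(sigma_n2), n)
        # average of ln f_{Y|X}(y_i|x_i) - ln f_Y(y_i) over i
        return np.mean(log_gauss(y, x, sigma_n2) - log_gauss(y, 0.0, sigma_x2 + sigma_n2))

    for n in [10, 100, 1000, 10000]:
        print(f"n = {n:6d}: (1/n) i = {sample_mutual_info_rate(n):.4f}   (C = {C:.4f})")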

Hence, for any ε,

P((x^n(m), y^n) ∉ T_ε^n | m) → 0 as n → ∞

Upper bound to probability of several typical sequences

Consider m′ ≠ m

Given how the codebook was chosen, the variables X^n(m′) and Y^n are independent conditioned on m having been transmitted

Hence

P((X^n(m′), Y^n) ∈ T_ε^n | m)

= ∫_{(x^n(m′), y^n) ∈ T_ε^n} f_{X^n}(x^n(m′)) f_{Y^n}(y^n) dx^n(m′) dy^n

Because of the definition of T_ε^n, for all pairs in the set

f_{Y^n}(y^n) ≤ f_{Y^n|X^n}(y^n | x^n(m′)) e^{−n(C−ε)}

Upper bound to probability of several typical sequences

P((X^n(m′), Y^n) ∈ T_ε^n | m)
≤ ∫_{(x^n(m′), y^n) ∈ T_ε^n} f_{Y^n|X^n}(y^n | x^n(m′)) e^{−n(C−ε)} f_{X^n}(x^n(m′)) dx^n(m′) dy^n
≤ e^{−n(C−ε)}
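Summing over the M − 1 incorrect codewords, with M = ⌈e^{nR}⌉ and ε = (C − R)/2 as above,

Σ_{m′≠m} P((X^n(m′), Y^n) ∈ T_ε^n | m) ≤ (M − 1) e^{−n(C−ε)} ≤ e^{nR} e^{−n(C−ε)} = e^{−n(C−R−ε)} = e^{−nε} → 0 as n → ∞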

E[λ_m] is upper bounded by two terms that go to 0 as n → ∞, thus the average probability of error given that m was transmitted goes to 0 as n → ∞

This is the average probability of error averaged over the ensemble of codes; therefore, ∀δ > 0 and for any rate R < C there must exist a code of length n with average probability of error less than δ

Thus, we can create a sequence of codes with maximal probability of error converging to 0 as n → ∞

We can expurgate codebooks as before (discarding the worst half of the codewords, which reduces the rate only negligibly) to go from average to maximal probability of error

MIT OpenCourseWare
http://ocw.mit.edu

6.441 Spring 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.