Chapter 3 Source Codes, Line Codes & Error Control
3.1 Primary Communication

• Information theory deals with the mathematical representation and analysis of a communication system rather than with physical sources and channels.
• It is based on the application of probability theory, i.e., the calculation of the probability of error.

Discrete message
The information source is said to be discrete if it emits only one symbol at a time from a finite set of symbols or messages.
• Let the source generate a set of M alphabets:

    X = \{x_1, x_2, x_3, \ldots, x_M\}

• The information source generates any one alphabet from this set. The probabilities of the various symbols in X can be written as

    P(X = x_k) = P_k, \quad k = 1, 2, \ldots, M
    \sum_{k=1}^{M} P_k = 1    (1)

Discrete memoryless source (DMS)
A discrete source is memoryless when the symbol emitted at any time is independent of all previously emitted symbols.

• Letter or symbol or character
  – Any individual member of the alphabet set.
• Message or word
  – A finite sequence of letters of the alphabet set.
• Length of a word
  – Number of letters in the word.
• Encoding
  – Process of converting a word of the finite set into another (encoded) format.
• Decoding
  – Inverse process of converting a given encoded word back to its original format.

3.2 Block Diagram of Digital Communication System

Fig. 3.1 Typical digital communication system: Information source → Formatter → Source encoder → Channel encoder → Baseband modulator → Channel → Baseband demodulator → Channel decoder → Source decoder → Deformatter → Destination

Information source
It may be analog or digital. Example: voice, video.

Formatter
It converts an analog signal into a digital signal.

Source encoder
• Used for efficient representation of the data generated by the source.
• It represents the digital signal in as few digits as possible, depending on the information content of the message, i.e., it minimizes the number of digits required.

Channel encoder
• Some redundancy is introduced into the message to combat noise in the channel.

Baseband modulator
• The encoded signal is modulated here by a suitable modulation technique.

Channel
• The transmitted signal gets corrupted by random noise: thermal noise, shot noise, atmospheric noise.

Channel decoder
• It removes the redundant bits using a channel decoding algorithm.

Deformatter
It converts the digital data back into discrete or analog form.

• The above communication system is used to carry an information-bearing baseband signal from one place to another over a communication channel.
• The performance of a communication system is measured by the probability of error (P_e).
• The condition for error-free communication is: entropy of the source < capacity of the channel.
• Capacity of a channel: the ability of a channel to convey information.
• Entropy: average information per symbol.

3.3 Amount of Information

• The amount of information is defined in terms of probability, i.e.,

    I = f(1/P)

• If the probability of occurrence of an event is high, it conveys very little information; if the probability of occurrence of an event is low, it conveys more information.
• Example: "A dog bites a man" has a high probability of occurrence, so it carries little information; "a man bites a dog" has a low probability of occurrence and hence carries more information.

    I(x_j) = f\left(\frac{1}{P(x_j)}\right)    (1)

where x_j → event, P(x_j) → probability of the event, I(x_j) → amount of information.

Equation (1) can be rewritten as

    I(x_j) = \log_2 \frac{1}{P(x_j)} \ \text{bits} \quad \text{or} \quad I_k = \log_2 \frac{1}{P_k} \ \text{bits}    (2)

Definition
The amount of information I(x_j) is given by the logarithm of the inverse of the probability of occurrence of the event, P(x_j).
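As a quick numerical illustration of equation (2), the following Python sketch computes I = log2(1/P) for the two events in the example above; the probability values 0.9 and 0.001 are assumed purely for illustration and are not from the original text.

```python
import math

def self_information(p: float) -> float:
    """Amount of information I = log2(1/p), in bits, for an event of probability p."""
    return math.log2(1.0 / p)

# Assumed probabilities: a likely event carries little information,
# an unlikely event carries much more.
print(self_information(0.9))    # ~0.15 bits  ("a dog bites a man")
print(self_information(0.001))  # ~9.97 bits  ("a man bites a dog")
```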
3.4 Average Information or Entropy

Definition
The entropy of a source is defined as the average information per message (or symbol) produced by the source in a particular interval.

Let m_1, m_2, m_3, \ldots, m_k be 'k' different messages with p_1, p_2, p_3, \ldots, p_k the corresponding probabilities of occurrence.

Example: the message sequence generated by the source is "ABCACBCABCAABC", with A, B, C → m_1, m_2, m_3, so k = 3.

Then the number of occurrences of message m_1 is

    N_1 = P_1 L

where L is the total number of messages generated by the source. For m_1 → A with L = 15,

    N_1 = 15 P_1

Similarly, for m_2 → B with L = 15,

    N_2 = 15 P_2

The amount of information in message m_1 is

    I_1 = \log_2 \frac{1}{P_1}

The total amount of information due to the m_1 messages is

    I_{t1} = P_1 L \log_2 \frac{1}{P_1}

Similarly, the total amount of information due to the m_2 messages is

    I_{t2} = P_2 L \log_2 \frac{1}{P_2}

Thus the total amount of information due to all L messages is

    I_t = I_{t1} + I_{t2} + \cdots + I_{tk}
    I_t = P_1 L \log_2 \frac{1}{P_1} + P_2 L \log_2 \frac{1}{P_2} + \cdots + P_k L \log_2 \frac{1}{P_k}

    \text{Average information} = \frac{\text{Total information}}{\text{Number of messages}} = \frac{I_t}{L}

The average information per message is nothing but the entropy H(X), or simply H:

    H = \frac{I_t}{L}
      = \frac{P_1 L \log_2 \frac{1}{P_1} + P_2 L \log_2 \frac{1}{P_2} + \cdots + P_k L \log_2 \frac{1}{P_k}}{L}

    H = \sum_{k=1}^{M} P_k \log_2 \frac{1}{P_k} \ \text{bits/symbol}    (3)

The entropy H of a discrete memoryless source is bounded as

    0 \le H \le \log_2 M

3.4.1 Properties of entropy

Property 1: H = 0 if P_k = 0 or 1
Entropy is zero when the event is either impossible or certain.

When P_k = 0:

    H = \sum_{k=1}^{M} P_k \log_2 \frac{1}{P_k} = 0 \cdot \log_2 \frac{1}{0} = 0

When P_k = 1:

    H = 1 \cdot \log_2 \frac{1}{1} = 0

Property 2: All symbols equiprobable

    H = \sum_{k} P_k \log_2 \frac{1}{P_k}
      = P_1 \log_2 \frac{1}{P_1} + P_2 \log_2 \frac{1}{P_2} + \cdots + P_M \log_2 \frac{1}{P_M}

For M equally likely messages the probability of each is

    P_1 = P_2 = P_3 = \cdots = P_M = \frac{1}{M}

    H = \frac{1}{M}\log_2 M + \frac{1}{M}\log_2 M + \cdots + \frac{1}{M}\log_2 M
      = M \cdot \frac{1}{M}\log_2 M
    H = \log_2 M

Property 3: Upper bound on entropy, H \le \log_2 M

Consider any two probability distributions (P_1, P_2, \ldots, P_M) and (q_1, q_2, \ldots, q_M) on the alphabet X = \{x_1, x_2, \ldots, x_M\} of a DMS. Then

    \sum_{k=1}^{M} P_k \log_2 \frac{q_k}{P_k} = \frac{1}{\ln 2}\sum_{k=1}^{M} P_k \ln \frac{q_k}{P_k}
    \qquad \left(\because \log_2 x = \frac{\ln x}{\ln 2}\right)

By a property of the natural logarithm, \ln x \le x - 1 for x > 0, so

    \frac{1}{\ln 2}\sum_{k=1}^{M} P_k \ln \frac{q_k}{P_k}
    \le \frac{1}{\ln 2}\sum_{k=1}^{M} P_k \left(\frac{q_k}{P_k} - 1\right)
    = \frac{1}{\ln 2}\sum_{k=1}^{M} (q_k - P_k)
    = \frac{1}{\ln 2}\left(\sum_{k=1}^{M} q_k - \sum_{k=1}^{M} P_k\right)

We know that

    \sum_{k=1}^{M} P_k = \sum_{k=1}^{M} q_k = 1

Therefore

    \sum_{k=1}^{M} P_k \log_2 \frac{q_k}{P_k} \le 0

    \sum_{k=1}^{M} P_k \log_2 q_k + \sum_{k=1}^{M} P_k \log_2 \frac{1}{P_k} \le 0
    \sum_{k=1}^{M} P_k \log_2 \frac{1}{P_k} \le -\sum_{k=1}^{M} P_k \log_2 q_k = \sum_{k=1}^{M} P_k \log_2 \frac{1}{q_k}

Substituting q_k = 1/M:

    \sum_{k=1}^{M} P_k \log_2 \frac{1}{P_k} \le \sum_{k=1}^{M} P_k \log_2 M = \log_2 M \sum_{k=1}^{M} P_k = \log_2 M

    H \le \log_2 M

The upper bound \log_2 M is attained when all the symbols are equiprobable.
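The following Python sketch (not part of the original text) evaluates equation (3) and checks the upper bound of Property 3 for an assumed source with M = 4 symbols: a non-uniform distribution gives H < log2 M, while the equiprobable distribution attains H = log2 M.

```python
import math

def entropy(probabilities) -> float:
    """Entropy H = sum_k P_k * log2(1/P_k) in bits/symbol; terms with P_k = 0 contribute zero."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

# Assumed example: a source with M = 4 symbols.
M = 4
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol, below the bound
print(entropy([1.0 / M] * M))              # 2.0 bits/symbol, equals log2(M)
print(math.log2(M))                        # 2.0
```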
3.4.2 Entropy of a binary memoryless source (BMS)

• Assume that the source is memoryless, so that successive symbols emitted by the source are statistically independent.
• Consider symbol '0' occurring with probability P_0 and symbol '1' with probability P_1 = 1 - P_0.

Entropy of the BMS:

    H = \sum_{k=1}^{2} P_k \log_2 \frac{1}{P_k}
      = P_0 \log_2 \frac{1}{P_0} + (1 - P_0) \log_2 \frac{1}{1 - P_0}
    H = -P_0 \log_2 P_0 - (1 - P_0) \log_2 (1 - P_0)

1. When P_0 = 0, H = 0.
2. When P_0 = 1, H = 0.
3. When P_0 = P_1 = 1/2, i.e., symbols 0 and 1 are equally probable, H = 1.

Fig. 3.2 Plot of entropy H versus symbol probability P_0

3.4.3 Extension of a discrete memoryless source

• Consider blocks of n successive symbols rather than individual symbols. Each block is produced by an extended source alphabet X^n that has k^n distinct blocks, where k is the number of distinct symbols in the source alphabet X of the original source.

Extended entropy:

    H(X^n) = n H(X)

3.4.4 Differential entropy

Consider a continuous random variable X having probability density function f_X(x).

    H = \int_{-\infty}^{\infty} f_X(x) \log_2 \frac{1}{f_X(x)} \, dx

3.4.5 Information rate (R)

The information rate R is defined as the average number of bits of information transmitted per second:

    R = rH \ \text{bits/sec}

where r is the rate at which symbols are generated (symbols per second) and H is the entropy in bits/symbol.

The channel types are classified as:
1. Discrete Memoryless Channel (DMC)
2. Binary Communication Channel (BCC)
3. Binary Symmetric Channel (BSC)
4. Binary Erasable Channel (BEC)
5. Lossless Channel
6. Deterministic Channel

3.4.6 Discrete Memoryless Channel (DMC)

Discrete: the channel is said to be discrete when the alphabet X is finite.
Memoryless: the channel is said to be memoryless when the current output symbol depends only on the current input symbol and not on any of the previous symbols.
• It is a statistical model with an input X and an output Y that is a noisy version of X.
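To tie sections 3.4.2 and 3.4.5 together, here is a minimal Python sketch of the binary entropy H = -P0 log2 P0 - (1 - P0) log2(1 - P0) and the information rate R = rH; the symbol rate r = 1000 symbols/sec is an assumed value used only for illustration.

```python
import math

def binary_entropy(p0: float) -> float:
    """Entropy of a binary memoryless source: H = -p0*log2(p0) - (1 - p0)*log2(1 - p0)."""
    if p0 in (0.0, 1.0):   # limiting cases: a certain symbol carries no information
        return 0.0
    p1 = 1.0 - p0
    return -p0 * math.log2(p0) - p1 * math.log2(p1)

print(binary_entropy(0.0))   # 0.0
print(binary_entropy(0.5))   # 1.0 bit/symbol (maximum, equiprobable symbols)
print(binary_entropy(1.0))   # 0.0

# Information rate R = r * H for an assumed symbol rate r = 1000 symbols/sec.
r = 1000
print(r * binary_entropy(0.5))  # 1000.0 bits/sec
```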