10/2/2019
Introduction to Information Theory
Part 4
A General Communication System
CHANNEL
• Information Source • Transmitter • Channel • Receiver • Destination 10/2/2019 2
1 10/2/2019
Information Channel
Input Output Channel X Y
10/2/2019 3
Perfect Communication (Discrete Noiseless Channel)
0 0 X Y Transmitted Received Symbol Symbol 1 1
10/2/2019 4
2 10/2/2019
NOISE
10/2/2019 5
Motivating Noise…
0 0 X Y Transmitted Received Symbol Symbol 1 1
10/2/2019 6
3 10/2/2019
Motivating Noise…
f = 0.1, n = ~10,000
1 - f 0 0 f
f 1 1 1 - f
10/2/2019 7
Motivating Noise…
Message: $5213.75 Received: $5293.75
1. Detect that an error has occurred.
2. Correct the error.
3. Watch out for the overhead.
10/2/2019 8
4 10/2/2019
Error Detection by Repetition
In the presence of 20% noise…
Message : $ 5 2 1 3 . 7 5 Transmission 1: $ 5 2 9 3 . 7 5 Transmission 2: $ 5 2 1 3 . 7 5 Transmission 3: $ 5 2 1 3 . 1 1 Transmission 4: $ 5 4 4 3 . 7 5 Transmission 5: $ 7 2 1 8 . 7 5
There is no way of knowing where the errors are.
10/2/2019 9
Error Detection by Repetition
In the presence of 20% noise… Message : $ 5 2 1 3 . 7 5 Transmission 1: $ 5 2 9 3 . 7 5 Transmission 2: $ 5 2 1 3 . 7 5 Transmission 3: $ 5 2 1 3 . 1 1 Transmission 4: $ 5 4 4 3 . 7 5 Transmission 5: $ 7 2 1 8 . 7 5 Most common: $ 5 2 1 3 . 7 5 1. Guesswork is involved. 2. There is overhead.
10/2/2019 10
5 10/2/2019
Error Detection by Repetition
In the presence of 50% noise… Message : $ 5 2 1 3 . 7 5 … Repeat 1000 times!
1. Guesswork is involved. But it will almost never be wrong! 2. There is overhead. A LOT of it!
10/2/2019 11
Binary Symmetric Channel (BSC) (Discrete Memoryless Channel)
1 − 푝 0 0 푝 푋 푌 Transmitted Received Symbol 푝 Symbol 1 1 1 − 푝
10/2/2019 12
6 10/2/2019
Binary Symmetric Channel (BSC) (Discrete Memoryless Channel)
1 − 푝 0 0 푝 푋 푌 Transmitted Received Symbol 푝 Symbol 1 1 1 − 푝
Defined by a set of conditional probabilities (aka transitional probabilities)
푝 푦 푥 푓표푟 푎푙푙 푥 ∈ 푋 푎푛푑 푦 ∈ 푌 The probability of 푦 occurring at the output when 푥 is the input to the channel.
10/2/2019 13
A General Discrete Channel
p(푦 |푥 ) 푥1 1 1
p(푦2|푥1) 푦1 p(푦푟|푥1) p(푦3|푥1) 푦2 푥2 푦3
푥푠 푦푟 푠 × 푟 transition probabilities 푠 input symbols 푟 output symbols
10/2/2019 14
7 10/2/2019
Channel With Internal Structure
1 − 푝 1 − 푞 0 0 0 0 1 − 푝 1 − 푞 + 푝푞 0 푝 푞
1 − 푝 푞 + (1 − 푞)푝 푝 푞 1 1 1 1 1 1 − 푝 1 − 푞 + 푝푞
10/2/2019 15
Channel
• A channel can be modeled using a probabilistic model of source and what was received.
Input Output Channel X Y
10/2/2019 16
8 10/2/2019
Conditional & Joint Entropy
• 푋 ,푌 random variables with entropy 퐻(푋) and 퐻 푌
• Conditional Entropy: Average entropy in 푌, given knowledge of 푋.
ퟏ 푯 풀 푿 = 풑(풙풊, 풚풋) 풍풐품 풑(풚풋|풙풊) 풙풊∈푿 풚풋∈풀
where 푝 푥푖, 푦푗 = 푝 푦푗 푥푖 푝 푥푖
• Joint Entropy: 푯 푿, 풀 = 푯 풀 푿 + 푯(푿) Entropy of the pair (푋, 푌)
10/2/2019 17
Mutual Information
• The mutual information of a random variable 푋 given the random variable 푌 is
퐼 푋; 푌 = 퐻 푋 − 퐻 푋 푌 It is the information about 푋 transmitted by 푌.
10/2/2019 18
9 10/2/2019
Mutual Information: Properties
• 퐼(푋; 푌) = 퐻(푋, 푌) – 퐻(푋|푌) − 퐻(푌|푋) • 퐼(푋; 푌) = 퐻(푋) − 퐻(푋|푌) • 퐼(푋; 푌) = 퐻(푌) − 퐻(푌|푋) • 퐼(푋; 푌) is symmetric in 푋 and 푌 푝(푥,푦) • 퐼 푋; 푌 = σ σ 푝 푥, 푦 푙표푔 푥 푦 푝 푥 푝(푦) • 퐼(푋; 푌) >= 0 • 퐼(푋; 푋) = 퐻(푋)
10/2/2019 19
Entropy Concepts
• 퐼(푋; 푌) = 퐻(푋, 푌) – 퐻(푋|푌) − 퐻(푌|푋) • 퐼(푋; 푌) = 퐻(푋) − 퐻(푋|푌) • 퐼(푋; 푌) = 퐻(푌) − 퐻(푌|푋) • 퐼(푋; 푌) is symmetric in 푋 and 푌 푝(푥,푦) • 퐼 푋; 푌 = σ σ 푝 푥, 푦 푙표푔 푥 푦 푝 푥 푝(푦) • 퐼(푋; 푌) >= 0 • 퐼(푋; 푋) = 퐻(푋) 10/2/2019 20
10 10/2/2019
Channel Capacity
• What is the capacity of the channel?
• What is the reliability of communication across the channel?
10/2/2019 21
Channel Capacity
• What is the capacity of the channel? • What is the reliability of communication across the channel? • Defined Channel Capacity as the rate at which reliable communication is possible. • Capacity is defined in terms of Mutual Information I(X, Y) • Mutual Information is defined in terms of entropy of source H(X) and the joint entropy H(X, Y) • Joint entropy is defined in terms of source entropy H(X) and the conditional entropy H(Y|X)
10/2/2019 22
11 10/2/2019
Channel Capacity
• The capacity of a channel is the maximum possible mutual information that can be achieved between input and output by varying the probabilities of the input symbols.
If X is the input channel and Y is the output, the capacity C is
풄 = 퐦퐚퐱 푰(푿; 풀) 풊풏풑풖풕 풑풓풐풃풂풃풊풍풊풕풊풆풔
10/2/2019 23
Channel Capacity
풄 = 퐦퐚퐱 푰(푿; 풀) 풊풏풑풖풕 풑풓풐풃풂풃풊풍풊풕풊풆풔
Mutual information about X given Y is the information transmitted by the channel and depends on the probability structure
– Input probabilities – Transition probabilities – Output probabilities
10/2/2019 24
12 10/2/2019
Channel Capacity
풄 = 퐦퐚퐱 푰(푿; 풀) 풊풏풑풖풕 풑풓풐풃풂풃풊풍풊풕풊풆풔
Mutual information about X given Y is the information transmitted by the channel and depends on the probability structure
– Input probabilities – Transition probabilities: fixed by properties of channel – Output probabilities: determined by input and transition probabilities
10/2/2019 25
Channel Capacity
풄 = 퐦퐚퐱 푰(푿; 풀) 풊풏풑풖풕 풑풓풐풃풂풃풊풍풊풕풊풆풔
Mutual information about X given Y is the information transmitted by the channel and depends on the probability structure
– Input probabilities: can be adjusted by suitable coding – Transition probabilities: fixed by properties of channel – Output probabilities: determined by input and transition probabilities
10/2/2019 26
13 10/2/2019
Channel Capacity
풄 = 퐦퐚퐱 푰(푿; 풀) 풊풏풑풖풕 풑풓풐풃풂풃풊풍풊풕풊풆풔
Mutual information about X given Y is the information transmitted by the channel and depends on the probability structure
– Input probabilities: can be adjusted by suitable coding – Transition probabilities: fixed by properties of channel – Output probabilities: determined by input and transition probabilities
That is, input probabilities determine mutual information and can be varied by coding. The maximum mutual information with respect to these input probabilities is the channel capacity.
10/2/2019 27
Shannon’s Second Theorem
• Suppose a discrete channel has capacity C and the source has entropy H
If H < C there is a coding scheme such that the source can be transmitted over the channel with an arbitrarily small frequency of error.
If H > C, it is not possible to achieve arbitrarily small error frequency.
10/2/2019 28
14 10/2/2019
Detailed Communication Model
DISCRETE TRANSMITTER CHANNEL
data source channel information encipherment modulation source reduction coding coding
distortion (noise)
RECEIVER
data source channel destination decipherment demodulation reconstruction coding decoding
10/2/2019 29
Detailed Communication Model
DISCRETE TRANSMITTER CHANNEL
data source channel information encipherment modulation source reduction coding coding
Standard compression used to reduce inherent redundancy. distortion resulting bitstream is (noise) nearly random.
RECEIVER
data source channel destination decipherment demodulation reconstruction coding decoding
10/2/2019 30
15 10/2/2019
Detailed Communication Model
DISCRETE TRANSMITTER CHANNEL
data source channel information encipherment modulation source reduction coding coding
Next, resulting bitstream is transformed by an error correcting code- puts distortion redundancy back in but in (noise) a systematic way.
RECEIVER
data source channel destination decipherment demodulation reconstruction coding decoding
10/2/2019 31
Detailed Communication Model
DISCRETE TRANSMITTER CHANNEL
data source channel information encipherment modulation source reduction coding coding
distortion (noise)
RECEIVER
data source channel destination decipherment demodulation reconstruction coding decoding
10/2/2019 32
16 10/2/2019
Error Correcting Codes
• Hamming Codes (1950)
• Linear Codes
• Low Density Parity Codes (1960)
• Convolutional Codes
• Turbo Codes (1993)
10/2/2019 33
Applications of Error Correcting Codes
• Satellite communications • Spacecraft/space probe communications: Mariner 4 (1965), Voyager (1977-), Cassini (1990s), Mars Rovers (1997-)
10/2/2019 34
17 10/2/2019
Applications of Error Correcting Codes
• Satellite communications • Spacecraft/space probe communications: Mariner 4 (1965), Voyager (1977-), Cassini (1990s), Mars Rovers (1997-) • CD and DVD Players, etc. • Internet & Web communications (Ethernet, IP, IPv6, TCP, UDP, etc) • Data storage • ECC Memory (DRAM) • Etc. etc. etc.
10/2/2019 35
Error Correcting Codes: Checksum
• ISBN: 0-691-12418-3
• 1*0+2*6+3*9+4*1+5*1+6*2+7*4+8*1+9*8 = 168 mod 11 = 3
• This is a staircase checksum
10/2/2019 36
18 10/2/2019
ISBN Checksum Exercise
• For the 13-digit ISBN (e.g. 978-0-252-72546-3, for Shannon & Weaver’s The Mathematical Theory of Communication)
Each digit, from left to right, is alternately multiplied by 1 or 3, then those products are summed modulo 10 to give a value, R ranging from 0 to 9. Checksum digit is 10 – R.
Thus, 978-0-19-955137-? (Floridi’s Information: A Very Short Introduction)
10/2/2019 37
Information Channel is a General Model
Input Output Channel X Y Cholesterol Levels Condition of Arteries
10/2/2019 38
19 10/2/2019
Information Channel is a General Model
Input Output Channel X Y Symptoms Diagnosis or Test results
10/2/2019 39
Information Channel is a General Model
Input Output Channel X Y Geological Structure Presence of oil deposits
10/2/2019 40
20 10/2/2019
Information Channel is a General Model
Input Output Channel X Y Opinion Poll Next President
10/2/2019 41
MTC: Summary
• Information • Channel • Entropy • Conditional Entropy • Source Coding • Joint Entropy Theorem • Mutual Information • Redundancy • Channel Capacity • Compression • Shannon’s Second • Huffman Encoding Theorem • • Lempel-Ziv Coding Error Correction Codes
10/2/2019 42
21 10/2/2019
References
• Eugene Chiu, Jocelyn Lin, Brok Mcferron, Noshirwan Petigara, Satwiksai Seshasai: Mathematical Theory of Claude Shannon: A study of the style and context of his work up to the genesis of information theory. MIT 6.933J / STS.420J The Structure of Engineering Revolutions • Luciano Floridi, 2010: Information: A Very Short Introduction, Oxford University Press, 2011. • Luciano Floridi, 2011: The Philosophy of Information, Oxford University Press, 2011. • James Gleick, 2011: The Information: A History, A Theory, A Flood, Pantheon Books, 2011. • Zhandong Liu , Santosh S Venkatesh and Carlo C Maley, 2008: Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples, BMC Genomics 2008, 9:509 • David Luenberger, 2006: Information Science, Princeton University Press, 2006. • David J.C. MacKay, 2003: Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 2003. • Claude Shannon & Warren Weaver, 1949: The Mathematical Theory of Communication, University of Illinois Press, 1949. • W. N. Francis and H. Kucera: Brown University Standard Corpus of Present-Day American English, Brown University, 1967. • Edward L. Glaeser: A Tale of Many Cities, New York Times, April 10, 2010. Available at: http://economix.blogs.nytimes.com/2010/04/20/a-tale-of-many-cities/ • Alan Rimm-Kaufman, The Long Tail of Search. Search Engine Land Website, September 18, 2007. Available at: http://searchengineland.com/the-long-tail-of-search-12198
43
22