15-853: Algorithms in the Real World

Error Correcting Codes I
– Overview
– Hamming Codes
– Linear Codes

General Model

message (m) → coder → codeword (c) → noisy channel → codeword' (c') → decoder → message or error

Errors introduced by the noisy channel:
• changed fields in the codeword (e.g. a flipped bit)
• missing fields in the codeword (e.g. a lost byte). Called erasures.

How the decoder deals with errors:
• error detection vs.
• error correction

Applications

• Storage: CDs, DVDs, "hard drives"
• Wireless: Cell phones, wireless links (WiMax)
• Satellite and Space: TV, Mars rover, …
• Digital Television: DVD, MPEG2 layover
• High Speed Networks: 10Gbase-T

Reed-Solomon codes are the most used in practice, but various graph-based codes have recently become as important.
Algorithms for decoding are quite sophisticated.

Block Codes

Each message and codeword is of fixed size.
Σ = codeword alphabet
$k = |m|$, $n = |c|$, $q = |\Sigma|$
$C \subseteq \Sigma^n$ (codewords)
$\Delta(x,y)$ = number of positions $i$ s.t. $x_i \ne y_i$
$d = \min\{\Delta(x,y) : x,y \in C,\ x \ne y\}$
$s = \max\{\Delta(c,c')\}$ that the code can correct
Code described as: $(n,k,d)_q$

Binary Codes

Today we will mostly be considering Σ = {0,1} and will sometimes use $(n,k,d)$ as shorthand for $(n,k,d)_2$.
In binary, $\Delta(x,y)$ is often called the Hamming distance.

Hypercube Interpretation

Consider codewords as vertices on a hypercube.

[Figure: 3-dimensional hypercube with vertices 000 through 111, codewords marked. d = 2 = min distance; n = 3 = dimensionality; $2^n$ = 8 = number of nodes.]

The distance between nodes on the hypercube is the Hamming distance Δ. The minimum distance is d.
001 is equidistant from 000, 011 and 101.
For s-bit error detection: $d \ge s + 1$
For s-bit error correction: $d \ge 2s + 1$

Error Detection with Parity Bit

A $(k+1,k,2)_2$ systematic code.
Encoding:
$m_1 m_2 \ldots m_k \Rightarrow m_1 m_2 \ldots m_k\, p_{k+1}$
where $p_{k+1} = m_1 \oplus m_2 \oplus \cdots \oplus m_k$
d = 2 since the parity is always even (it takes two bit changes to go from one codeword to another).
Detects a one-bit error since this gives odd parity.
Cannot be used to correct a 1-bit error since any odd-parity word is at equal distance Δ from k+1 valid codewords.

Error Correcting One Bit Messages

How many bits do we need to correct a one-bit error on a one-bit message?

[Figure: 2-bit square (00, 01, 10, 11) and 3-bit cube (000 through 111).]

2 bits: 0 -> 00, 1 -> 11 (n=2, k=1, d=2)
3 bits: 0 -> 000, 1 -> 111 (n=3, k=1, d=3)

Need $d \ge 3$ to correct one error. Why?

Example of $(6,3,3)_2$ systematic code

Definition: A systematic code is one in which the message appears in the codeword.

message   codeword
000       000000
001       001011
010       010101
011       011110
100       100110
101       101101
110       110011
111       111000

Error Correcting Multibit Messages

We will first discuss Hamming Codes.
Detect and correct 1-bit errors.
Codes are of form: $(2^r - 1,\ 2^r - 1 - r,\ 3)$ for any r > 1
e.g. (3,1,3), (7,4,3), (15,11,3), (31,26,3), …
which correspond to 2, 3, 4, 5, … "parity bits" (i.e. n−k)

The high-level idea is to "localize" the error.
Any specific ideas?

Hamming Codes: Encoding

Bit layout (positions 15 down to 0):
m15 m14 m13 m12 m11 m10 m9 p8 m7 m6 m5 p4 m3 p2 p1 p0

Localizing error to top or bottom half (1xxx or 0xxx):
$p_8 = m_{15} \oplus m_{14} \oplus m_{13} \oplus m_{12} \oplus m_{11} \oplus m_{10} \oplus m_9$
Localizing error to x1xx or x0xx:
$p_4 = m_{15} \oplus m_{14} \oplus m_{13} \oplus m_{12} \oplus m_7 \oplus m_6 \oplus m_5$
Localizing error to xx1x or xx0x:
$p_2 = m_{15} \oplus m_{14} \oplus m_{11} \oplus m_{10} \oplus m_7 \oplus m_6 \oplus m_3$
Localizing error to xxx1 or xxx0:
$p_1 = m_{15} \oplus m_{13} \oplus m_{11} \oplus m_9 \oplus m_7 \oplus m_5 \oplus m_3$

Hamming Codes: Decoding

m15 m14 m13 m12 m11 m10 m9 p8 m7 m6 m5 p4 m3 p2 p1 p0

We don't need p0, so we have a (15,11,?) code.
After transmission, we generate:
$b_8 = p_8 \oplus m_{15} \oplus m_{14} \oplus m_{13} \oplus m_{12} \oplus m_{11} \oplus m_{10} \oplus m_9$
$b_4 = p_4 \oplus m_{15} \oplus m_{14} \oplus m_{13} \oplus m_{12} \oplus m_7 \oplus m_6 \oplus m_5$
$b_2 = p_2 \oplus m_{15} \oplus m_{14} \oplus m_{11} \oplus m_{10} \oplus m_7 \oplus m_6 \oplus m_3$
$b_1 = p_1 \oplus m_{15} \oplus m_{13} \oplus m_{11} \oplus m_9 \oplus m_7 \oplus m_5 \oplus m_3$
With no errors, these will all be zero.
With one error, $b_8 b_4 b_2 b_1$ gives us the error location.
e.g. 0100 would tell us that p4 is wrong, and 1100 would tell us that m12 is wrong.

Hamming Codes

Can be generalized to any power of 2:
– $n = 2^r - 1$ (15 in the example)
– $(n-k) = r$ (4 in the example)
– d = 3 (discuss later)
– Can correct one error, but can't tell the difference between one and two!
– Gives a $(2^r - 1,\ 2^r - 1 - r,\ 3)$ code

Extended Hamming code:
– Add back the parity bit at the end
– Gives a $(2^r,\ 2^r - 1 - r,\ 4)$ code
– Can correct one error and detect 2 (not so obvious)

Lower bound on parity bits

How many nodes in the hypercube do we need so that d = 3?
Each of the $2^k$ codewords eliminates n neighbors plus itself, i.e. n+1 nodes.
$2^n \ge (n+1)\,2^k$
$n \ge k + \log_2(n+1)$
$n \ge k + \lceil \log_2(n+1) \rceil$
In the previous Hamming code: $15 \ge 11 + \lceil \log_2(15+1) \rceil = 15$
Hamming Codes are called perfect codes since they match the lower bound exactly.

Lower bound on parity bits

What about fixing 2 errors (i.e. d = 5)?
Each of the $2^k$ codewords eliminates itself, its neighbors and its neighbors' neighbors, giving: $1 + \binom{n}{1} + \binom{n}{2}$
$2^n \ge (1 + n + n(n-1)/2)\,2^k$
$n \ge k + \log_2(1 + n + n(n-1)/2) \ge k + 2\log_2 n - 1$
Generally, to correct s errors:
$n \ge k + \log_2\left(1 + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{s}\right)$

Lower Bounds: a side note

The lower bounds assume random placement of bit errors.
In practice errors are likely to be less than random, e.g. evenly spaced or clustered.

[Figure: two rows of error marks (x), one evenly spaced and one clustered.]

Can we do better if we assume regular errors?
We will come back to this later when we talk about Reed-Solomon codes. In fact, this is the main reason why Reed-Solomon codes are used much more than Hamming codes.

Linear Codes

If Σ is a field, then $\Sigma^n$ is a vector space.
Definition: C is a linear code if it is a linear subspace of $\Sigma^n$ of dimension k.
This means that there is a set of k basis vectors $v_i \in \Sigma^n$ ($1 \le i \le k$) that span the subspace,
i.e. every codeword can be written as:
$c = a_1 v_1 + \cdots + a_k v_k$, with $a_i \in \Sigma$
The sum of two codewords is a codeword.

Linear Codes

Basis vectors for the $(7,4,3)_2$ Hamming code:

       m7 m6 m5 p4 m3 p2 p1
v1 =    1  0  0  1  0  1  1
v2 =    0  1  0  1  0  1  0
v3 =    0  0  1  1  0  0  1
v4 =    0  0  0  0  1  1  1

How can we see that d = 3?

Generator and Parity Check Matrices

Generator Matrix:
A k x n matrix G such that: $C = \{xG \mid x \in \Sigma^k\}$
Made from stacking the basis vectors.
Parity Check Matrix:
An (n − k) x n matrix H such that: $C = \{y \in \Sigma^n \mid Hy^T = 0\}$
Codewords are the nullspace of H.
These always exist for linear codes.

Advantages of Linear Codes

• Encoding is efficient (vector-matrix multiply)
• Error detection is efficient (vector-matrix multiply)
• Syndrome ($Hy^T$) has error information
• Gives a $q^{n-k}$ sized table for decoding. Useful if n−k is small.

Example and "Standard Form"

For the Hamming (7,4,3) code:
$$G = \begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{bmatrix}$$
By swapping columns 4 and 5 it is in the form $[I_k, A]$. A code with a matrix in this form is systematic, and G is in "standard form":
$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix}$$

Relationship of G and H

If G is in standard form $[I_k, A]$ then $H = [A^T, I_{n-k}]$.
Example of the (7,4,3) Hamming code (H is obtained by transposing A):
$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix} \qquad H = \begin{bmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}$$

Proof that H is a Parity Check Matrix

Suppose that x is a message. Then
$H(xG)^T = H(G^T x^T) = (HG^T)x^T = (A^T I_k + I_{n-k} A^T)x^T = (A^T + A^T)x^T = 0$
where the last step uses $A^T + A^T = 0$ over GF(2).

The d of linear codes

Theorem: A linear code has distance d if every set of (d−1) columns of H is linearly independent, but there is a set of d columns that is linearly dependent (i.e., some nontrivial combination of them sums to 0).
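The generator and parity check matrices of the (7,4,3) Hamming code can be exercised directly. The following is a minimal Python sketch, not part of the original slides: the matrices G and H are copied from the standard-form example above, while the function names (`encode`, `syndrome`, `correct`, `min_distance`) are hypothetical helpers chosen for illustration. It assumes binary arithmetic (GF(2)) and at most one flipped bit per received word.

```python
from itertools import product

# Standard-form generator matrix G = [I_4 | A], copied from the text.
G = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]
# Parity check matrix H = [A^T | I_3], copied from the text.
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]

def encode(msg):
    """Return the codeword msg * G, with arithmetic mod 2."""
    return [sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)]

def syndrome(word):
    """Return H * word^T mod 2; all zeros exactly when word is a codeword."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]

def correct(word):
    """Repair at most one flipped bit.

    A single error at position j makes the syndrome equal to column j of H.
    Every nonzero 3-bit vector appears as a column of this H, so the lookup
    always succeeds for a nonzero syndrome.
    """
    s = syndrome(word)
    fixed = list(word)
    if any(s):
        columns = [list(col) for col in zip(*H)]
        fixed[columns.index(s)] ^= 1
    return fixed

def min_distance():
    """For a linear code, d equals the minimum weight of a nonzero codeword."""
    return min(sum(encode(m)) for m in product([0, 1], repeat=4) if any(m))

codeword = encode([1, 0, 1, 1])
received = list(codeword)
received[2] ^= 1  # flip one bit in transit
print(correct(received) == codeword)
print(min_distance())
```

Because the code is systematic, the message is simply the first four bits of the corrected codeword. With two flipped bits the syndrome is still nonzero, but `correct` would "repair" the wrong position, which matches the earlier remark that the code cannot tell one error from two.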
