Algorithms in the Real World
Error Correcting Codes II
– Reed-Solomon Codes
– Concatenated Codes
– Overview of some topics in coding
– Low Density Parity Check Codes (aka Expander Codes)
– Network Coding
– Compressive Sensing
– List Decoding

Terminology (each message and codeword is of fixed size):
message (m) → coder → codeword (c) → noisy channel → codeword' (c') → decoder → message or error
Σ = codeword alphabet, k = |m|, n = |c|, q = |Σ|
C ⊆ Σ^n (codewords)
∆(x,y) = number of positions i s.t. x_i ≠ y_i
d = min{∆(x,y) : x,y ∈ C, x ≠ y}
s = max{∆(c,c')} that the code can correct
Code described as: (n,k,d)_q


Linear Codes
If Σ is a field, then Σ^n is a vector space.
Definition: C is a linear code if it is a linear subspace of Σ^n of dimension k.
This means that there is a set of k independent vectors v_i ∈ Σ^n (1 ≤ i ≤ k) that span the subspace, i.e., every codeword can be written as:
   c = a_1 v_1 + a_2 v_2 + … + a_k v_k,  where a_i ∈ Σ
The sum of two codewords is a codeword.
Minimum distance = weight of the least-weight nonzero codeword.

Generator and Parity Check Matrices
Generator Matrix: a k x n matrix G such that C = { xG | x ∈ Σ^k }. Made from stacking the spanning vectors.
Parity Check Matrix: an (n – k) x n matrix H such that C = { y ∈ Σ^n | H y^T = 0 }. (Codewords are the null space of H.)
These always exist for linear codes.
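Both slides are easy to check numerically for a small code. A minimal Python sketch, for illustration only, that computes the minimum distance of the (7,4,3)_2 code appearing on the next slide by enumerating all nonzero messages:

    import itertools
    import numpy as np

    # Generator matrix of the (7,4,3)_2 code from the next slide, in standard form [I_k A].
    G = np.array([[1,0,0,0, 1,1,1],
                  [0,1,0,0, 1,1,0],
                  [0,0,1,0, 1,0,1],
                  [0,0,0,1, 0,1,1]])

    def min_distance(G):
        """Minimum distance of a binary linear code = weight of the
        least-weight nonzero codeword (enumerates all 2^k - 1 messages)."""
        k, n = G.shape
        best = n
        for msg in itertools.product([0, 1], repeat=k):
            if any(msg):
                best = min(best, int((np.array(msg) @ G % 2).sum()))
        return best

    print(min_distance(G))   # -> 3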


Relationship of G and H
Encoding: mesg (length k) x G (k x n) = codeword (length n).
For linear codes, if G is in standard form [I_k A], then H = [A^T I_{n-k}].
Example of a (7,4,3)_2 code:
   G = 1 0 0 0 1 1 1       H = 1 1 1 0 1 0 0
       0 1 0 0 1 1 0           1 1 0 1 0 1 0
       0 0 1 0 1 0 1           1 0 1 1 0 0 1
       0 0 0 1 0 1 1
Decoding: H x (recv'd word)^T = syndrome (length n-k).
If the syndrome = 0, the received word is a codeword; else we have to use the syndrome to get back the codeword.
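A minimal sketch of this syndrome decoding for the (7,4,3)_2 example above; for a single error the syndrome equals the column of H at the error position, which locates the bit to flip:

    import numpy as np

    H = np.array([[1,1,1,0, 1,0,0],      # H = [A^T I_{n-k}] for the G above
                  [1,1,0,1, 0,1,0],
                  [1,0,1,1, 0,0,1]])

    def syndrome_decode(received):
        """Correct up to one error: a zero syndrome means a valid codeword;
        otherwise the syndrome matches the column of H at the error position."""
        s = H @ received % 2
        if not s.any():
            return received
        for j in range(H.shape[1]):
            if np.array_equal(H[:, j], s):
                fixed = received.copy()
                fixed[j] ^= 1             # flip the offending bit
                return fixed
        raise ValueError("more than one error")

    r = np.array([1,1,1,1, 0,0,1])        # codeword 1011001 with bit 1 flipped
    print(syndrome_decode(r))             # -> [1 0 1 1 0 0 1]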

Two Codes
Hamming codes are binary (2^r – 1, 2^r – 1 – r, 3) codes; basically (n, n – log n, 3).
Hadamard codes are binary (2^r, r, 2^(r-1)) codes; basically (n, log n, n/2).
The first has great rate, small distance. The second has poor rate, great distance.
Can we get Ω(n) rate and Ω(n) distance? Yes. One way is to use a random linear code. Let's see some direct, intuitive ways.

Reed-Solomon Codes
[photos: Irving S. Reed and Gustave Solomon]

Reed-Solomon Codes in the Real World
(204,188,17)_256: ITU J.83(A)
(128,122,7)_256: ITU J.83(B)
(255,223,33)_256: common in practice
– Note that they are all byte-based (i.e., symbols are from GF(2^8)).
Decoding rate on a 1.8GHz Pentium 4:
– (255,251) = 89 Mbps
– (255,223) = 18 Mbps
Dozens of companies sell hardware cores that operate 10x faster (or more):
– (204,188) = 320 Mbps (Altera decoder)

Two-dimensional Reed-Solomon bar codes: PDF417, QR code, Aztec code, DataMatrix code. [images: Wikipedia]


Applications of Reed-Solomon Codes
• Storage: CDs, DVDs, "hard drives"
• Wireless: cell phones, wireless links
• Satellite and Space: TV, Mars rover, Voyager
• Digital Television: DVD, MPEG-2 layover
• High Speed Modems: ADSL, DSL, ...
Good at handling burst errors. Other codes are better for random errors
– e.g., Gallager codes, Turbo codes.

Viewing Messages as Polynomials
An (n, k, n-k+1) code:
Consider the polynomial of degree k-1
   p(x) = a_(k-1) x^(k-1) + … + a_1 x + a_0
Message: (a_(k-1), …, a_1, a_0)
Codeword: (p(1), p(2), …, p(n))
To keep the p(i) fixed size, we use a_i ∈ GF(q^r).
To make the i distinct, n ≤ q^r. For simplicity, imagine that n = q^r.
So we have an (n, k, n-k+1) code with log n bits per symbol.
An alphabet size increasing with the codeword length is a little awkward. (But can be fixed.)


Viewing Messages as Polynomials (continued)
Unisolvence Theorem: any subset of size k of (p(1), p(2), …, p(n)) is enough to (uniquely) reconstruct p(x) using polynomial interpolation, e.g., the Lagrange interpolation formula.

Polynomial-Based Code
An (n, k, 2s+1) code, with the codeword of length n = k + 2s:
– Can detect 2s errors
– Can correct s errors
– Generally can correct α erasures and β errors if α + 2β ≤ 2s
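A minimal sketch of this encoding, over the prime field GF(101) for readability (an assumption made for illustration; the codes above use GF(2^8), but the algebra is identical). Any k received symbols determine the polynomial, per the Unisolvence Theorem:

    p = 101                                # prime field stands in for GF(2^8)

    def poly_eval(coeffs, x):
        """Horner's rule for a_(k-1) x^(k-1) + ... + a_1 x + a_0 (mod p)."""
        y = 0
        for a in coeffs:
            y = (y * x + a) % p
        return y

    def rs_encode(msg, n):
        """Codeword = (p(1), p(2), ..., p(n)); points stay distinct if n < p."""
        return [poly_eval(msg, x) for x in range(1, n + 1)]

    def lagrange_eval(points, x0):
        """Evaluate, at x0, the unique degree < k polynomial through the
        k given (x, y) pairs -- the Lagrange interpolation formula mod p."""
        total = 0
        for xi, yi in points:
            num = den = 1
            for xj, _ in points:
                if xj != xi:
                    num = num * (x0 - xj) % p
                    den = den * (xi - xj) % p
            total = (total + yi * num * pow(den, -1, p)) % p
        return total

    msg = [3, 1, 4]                        # (a_2, a_1, a_0): p(x) = 3x^2 + x + 4
    code = rs_encode(msg, 7)               # a (7, 3, 5) code over GF(101)
    survivors = [(x, code[x - 1]) for x in (2, 5, 6)]
    print(lagrange_eval(survivors, 1) == code[0])   # True: any 3 symbols suffice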


Correcting Errors
Correcting s errors, a brute-force approach:
1. Find k + s symbols that agree on a polynomial p(x).
   These must exist since originally k + 2s symbols agreed and only s are in error.
2. There are no k + s symbols that agree on a wrong polynomial p'(x):
   Any subset of k symbols will define p'(x); since at most s of the k + s symbols are in error, p'(x) = p(x).
Better algorithms exist (maybe next lecture).

RS and "burst" errors
Let's compare to Hamming codes (which are "optimal").
                                            code bits   check bits
   RS (255, 253, 3)_256                       2040          16
   Hamming (2^11 - 1, 2^11 - 11 - 1, 3)_2     2047          11
They can both correct 1 error, but not 2 random errors.
– The Hamming code does this with fewer check bits.
However, RS can fix 8 contiguous bit errors in one byte.
– Much better than the lower bound for 8 arbitrary errors:
   log(1 + C(n,1) + … + C(n,8)) > 8 log(n - 7) ≈ 88 check bits
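The brute-force approach can be written directly; this sketch reuses p, rs_encode, and lagrange_eval from the previous example (k = 3, s = 2, so n = 7):

    from itertools import combinations

    def brute_force_decode(received, k, s):
        """Find a degree < k polynomial agreeing with >= k + s of the n = k + 2s
        received symbols; by the argument above it must be the sent polynomial."""
        points = list(enumerate(received, start=1))        # (x, y) pairs
        for subset in combinations(points, k):
            fit = [lagrange_eval(list(subset), x) for x, _ in points]
            if sum(f == y for f, (_, y) in zip(fit, points)) >= k + s:
                return fit                                 # the corrected codeword
        return None                                        # more than s errors

    bad = list(code)
    bad[3] = (bad[3] + 1) % p                              # corrupt one symbol
    print(brute_force_decode(bad, k=3, s=2) == code)       # True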

Concatenated Codes
[photo: David Forney, Wikipedia]
Take an RS code, i.e., an (n, k, n-k+1)_q code with q = n.
Can encode each alphabet symbol, of k' = log q = log n bits, using another code.
E.g., use a ((k' + log k'), k', 3)_2 Hamming code.
Now we can correct one error per alphabet symbol with little rate loss. (Good for sparse periodic errors.)

Concatenated Codes (continued)
Or use a (2^k', k', 2^(k'-1))_2 Hadamard code. (Say k = n/2.)
Then we get an (n^2, (n/2) log n, n^2/4)_2 code.
Much better than a plain Hadamard code in rate; distance is worse only by a factor of 2.
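A toy sketch of just the concatenation step (not the full construction): each symbol of a hypothetical outer codeword, assumed to fit in k' = 4 bits, is re-encoded with the (7,4,3)_2 Hamming code from the earlier slide, after which one bit error per 7-bit block is correctable:

    import numpy as np

    G_inner = np.array([[1,0,0,0, 1,1,1],     # (7,4,3)_2 Hamming generator
                        [0,1,0,0, 1,1,0],
                        [0,0,1,0, 1,0,1],
                        [0,0,0,1, 0,1,1]])

    def concatenate(outer_codeword):
        """Inner-encode each outer symbol (assumed to fit in k' = 4 bits)."""
        bits = []
        for symbol in outer_codeword:
            nibble = [(symbol >> i) & 1 for i in (3, 2, 1, 0)]
            bits.extend(np.array(nibble) @ G_inner % 2)
        return bits

    print(concatenate([9, 2, 14]))   # a toy outer codeword over a 16-ary alphabet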

Concatenated Codes (continued)
Take an RS code, i.e., an (n, k, n-k+1)_q code with q = n.
Can encode each alphabet symbol, of k' = log q = log n bits, using another code.
Or, since k' is O(log n), could choose an inner code that requires exhaustive search to find but is good.
Random linear codes give ((1+f(δ))k', k', δk')_2 codes.
Composing with RS (with k = n/2), we get a
   ((1+f(δ)) n log n, (n/2) log n, δ(n/2) log n)_2 code.
Gives constant rate and constant distance! And polytime encoding and decoding.

Error Correcting Codes Outline
Introduction
Linear codes
Reed-Solomon Codes
Expander Based Codes
– Expander Graphs
– Low Density Parity Check (LDPC) codes
– Tornado Codes

Why Expander Based Codes?
These are linear codes like RS & random linear codes.
RS/random linear codes give good rates but are slow:

   Code            Encoding          Decoding
   Random Linear   O(n^2)            O(n^3)
   RS              O(n log n)        O(n^2)
   LDPC            O(n^2) or better  O(n)
   Tornado         O(n log 1/ε)      O(n log 1/ε)

(Assuming an (n, (1-p)n, (1-ε)pn+1)_2 tornado code.)

(α, β) Expander Graphs (non-bipartite)
[figure: a set of k ≤ αn vertices in a graph G with ≥ βk neighbors]
Properties:
– Expansion: every small subset (k ≤ αn) has many (≥ βk) neighbors
– Low degree – not technically part of the definition, but typically assumed


(α, β) Expander Graphs (bipartite)
[figure: k left vertices (k ≤ αn) with at least βk neighbors on the right]
Properties:
– Expansion: every small subset (k ≤ αn) on the left has many (≥ βk) neighbors on the right
– Low degree – not technically part of the definition, but typically assumed

Expander Graphs
Useful properties:
– Every set of vertices has many neighbors
– Every balanced cut has many edges crossing it
– A random walk will quickly converge to the stationary distribution (rapid mixing)
– Expansion is related to the eigenvalues of the adjacency matrix


Expander Graphs: Applications
Pseudo-randomness: implement randomized algorithms with few random bits
Cryptography: strong one-way functions from weak ones
Hashing: efficient n-wise independent hash functions
Random walks: quickly spreading probability as you walk through a graph
Error Correcting Codes: several constructions
Communication networks: fault tolerance, gossip-based protocols, peer-to-peer networks

d-regular graphs
An undirected graph is d-regular if every vertex has d neighbors.
A bipartite graph is d-regular if every vertex on the left has d neighbors on the right.
We consider only d-regular constructions.


Expander Graphs: Constructions
Important parameters: size (n), degree (d), expansion (β).
Randomized constructions:
– A random d-regular graph is an expander with high probability
– Construct by choosing d random perfect matchings
– Time consuming and cannot be stored compactly
Explicit constructions:
– Cayley graphs, Ramanujan graphs, etc.
– Typical technique: start with a small expander, apply operations to increase its size

Expander Graphs: Constructions (continued)
Theorem: for every constant 0 < c < 1, we can construct bipartite graphs with n nodes on the left, cn on the right, d-regular, that are (α, 3d/4) expanders, for constants α and d that are functions of c alone.
That is: any set containing at most an α fraction of the left has (3d/4) times as many neighbors on the right.
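The randomized construction is easy to try out. A hedged sketch (parameters are arbitrary, and sampling random sets only gives a crude estimate of expansion; certifying it is much harder):

    import random

    def random_bipartite_expander(n, d, seed=0):
        """Union of d random perfect matchings: a d-regular bipartite graph
        (n nodes per side) that is an expander with high probability."""
        rng = random.Random(seed)
        neighbors = [set() for _ in range(n)]      # left vertex -> right vertices
        for _ in range(d):
            matching = list(range(n))
            rng.shuffle(matching)                  # one random perfect matching
            for left, right in enumerate(matching):
                neighbors[left].add(right)
        return neighbors

    def sampled_expansion(neighbors, set_size, trials=1000, seed=1):
        """Crude estimate of min |N(S)|/|S| over random left sets S of a fixed size."""
        rng = random.Random(seed)
        n = len(neighbors)
        worst = float("inf")
        for _ in range(trials):
            S = rng.sample(range(n), set_size)
            reached = set().union(*(neighbors[v] for v in S))
            worst = min(worst, len(reached) / set_size)
        return worst

    g = random_bipartite_expander(n=1000, d=8)
    print(sampled_expansion(g, set_size=10))       # typically close to d = 8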


Error Correcting Codes Outline
Introduction
Linear codes
Reed-Solomon Codes
Expander Based Codes
– Expander Graphs
– Low Density Parity Check (LDPC) codes
– Tornado Codes

Low Density Parity Check (LDPC) Codes
       1 0 0 0 1 0 0 0 1
       0 1 0 0 0 0 1 1 0
   H = 0 1 1 0 1 0 0 0 0
       0 0 0 1 0 0 1 0 1
       1 0 1 0 0 1 0 0 0
       0 0 0 1 0 1 0 1 0
View H as a bipartite graph: each of the n columns is a vertex on the left (a code bit), each of the n-k rows is a vertex on the right (a parity check), with an edge wherever H has a 1.
A word on the left is a valid codeword if each right "parity check" vertex has parity 0.
The graph has O(n) edges (low density).


Applications in the real world
10Gbase-T (IEEE 802.3an, 2006)
– Standard for 10 Gbits/sec over copper wire
WiMax (IEEE 802.16e, 2006)
– Standard for medium-distance wireless. Approx 10 Mbits/sec over 10 kilometers.
NASA
– Proposed for all their space data systems

History
Invented by Gallager in 1963 (his PhD thesis).
Generalized by Tanner in 1981 (instead of using parity and binary codes, use other codes for "check" nodes).
Mostly forgotten by the community at large until the mid 90s, when revisited by Spielman, MacKay, and others.


Distance of LDPC codes
Consider a d-regular LDPC code with (α, 3d/4) expansion.
Theorem: the distance of the code is greater than αn.
Proof (by contradiction). For a linear code, distance = minimum weight of a nonzero codeword. Assume a codeword of weight w ≤ αn, and let W be the set of its 1 bits. By expansion, W has > 3dw/4 neighbors on the right; since W has only dw outgoing edges, the average number of 1s seen per such neighbor is < dw/(3dw/4) = 4/3. So at least one neighbor sees a single 1 bit, and its parity check would fail!

Correcting Errors in LDPC codes
We say a check vertex is unsatisfied if its parity ≠ 0.
Algorithm:
While there are unsatisfied check bits:
1. Find a code bit on the left for which more than d/2 neighbors are unsatisfied
2. Flip that bit
Converges since every step reduces the number of unsatisfied checks by at least 1. Runs in linear time.
But why must there be a node with more than d/2 unsatisfied neighbors?
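A hedged sketch of this flipping algorithm, written against the small H from the LDPC slide above (that toy matrix is not an expander, so it illustrates the mechanics rather than the guarantees):

    import numpy as np

    def flip_decode(H, word, max_rounds=100):
        """Bit-flipping: while some check is unsatisfied, flip a code bit with
        more than half of its checks unsatisfied; give up if none exists."""
        word = word.copy()
        degrees = H.sum(axis=0)                    # left-vertex degrees
        for _ in range(max_rounds):
            unsat = H @ word % 2                   # 1 = unsatisfied check vertex
            if not unsat.any():
                return word                        # all parity checks pass
            scores = H.T @ unsat                   # unsatisfied checks per code bit
            j = int(np.argmax(scores / degrees))
            if 2 * scores[j] <= degrees[j]:
                return None                        # stuck: no bit exceeds d/2
            word[j] ^= 1                           # flip it and repeat
        return None

    noisy = np.zeros(9, dtype=int)
    noisy[4] = 1                                   # one corrupted bit
    print(flip_decode(H, noisy))                   # recovers the all-zero codeword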

Proof continued: converges to the closest codeword
Theorem: assume (α, 3d/4) expansion. If the number of error bits is less than αn/4, then the simple decoding algorithm converges to the closest codeword.
Proof: let
   u_i = # of unsatisfied check bits on step i
   r_i = # of corrupt code bits on step i
   s_i = # of satisfied check bits with corrupt neighbors on step i
We know that u_i decrements on each step, but what about r_i?
   u_i + s_i > (3/4) d r_i   (by expansion)
   2 s_i + u_i ≤ d r_i       (by counting edges)
   (1/2) d r_i < u_i         (by substitution)
   u_i < u_0                 (steps decrease u)
   u_0 ≤ d r_0               (by counting edges)
Therefore: r_i < 2 r_0, i.e., the number of corrupt bits can never more than double.
If we start with at most αn/4 corrupt bits, we will never reach αn/2 corrupt bits — but the distance is αn, so we stay closest to the original codeword.

More on decoding LDPC
The simple algorithm is only guaranteed to fix half as many errors as could be fixed in principle, but in practice it can do better.
Fixing (d-1)/2 errors (d here being the code distance) is NP-hard in general.
Soft "decoding", as originally specified by Gallager, is based on belief propagation: determine the probability of each code bit being 1 or 0 and propagate the probabilities back and forth to the check bits.

Encoding LDPC
Encoding can be done by generating G from H and using a matrix multiply.
What is the problem with this? (G is generally dense even when H is sparse.)
Various more efficient methods have been studied.
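A minimal sketch of that baseline, assuming H has full row rank (column swaps are tracked, so the result is the same code up to a coordinate permutation). This also makes the problem concrete: the derived G is dense, so encoding costs quadratic time even for a sparse H:

    import numpy as np

    def generator_from_H(H):
        """Row-reduce H over GF(2) to [A | I_{n-k}] (swapping columns if
        needed), then G = [I_k | A^T] generates the same code up to the
        returned column permutation.  Assumes H has full row rank."""
        H = H.copy() % 2
        m, n = H.shape
        k = n - m
        perm = list(range(n))                        # records any column swaps
        for r in range(m):
            c = k + r                                # target pivot position
            if not H[r:, c].any():                   # need a column swap
                j = int(np.nonzero(H[r:, :].any(axis=0))[0][0])
                H[:, [c, j]] = H[:, [j, c]]
                perm[c], perm[j] = perm[j], perm[c]
            i = r + int(np.nonzero(H[r:, c])[0][0])
            H[[r, i]] = H[[i, r]]                    # move a pivot row up
            for r2 in range(m):
                if r2 != r and H[r2, c]:
                    H[r2] ^= H[r]                    # clear the rest of column c
        G = np.hstack([np.eye(k, dtype=int), H[:, :k].T])
        return G, perm

    H7 = np.array([[1,1,1,0, 1,0,0],                 # the (7,4) Hamming H again,
                   [1,1,0,1, 0,1,0],                 # a tiny full-rank test case
                   [1,0,1,1, 0,0,1]])
    G7, _ = generator_from_H(H7)
    codeword = np.array([1,0,1,1]) @ G7 % 2          # encoding = dense multiply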


Error Correcting Codes Outline
Introduction
Linear codes
Reed-Solomon Codes
Expander Based Codes
– Expander Graphs
– Low Density Parity Check (LDPC) codes
– Tornado Codes

The loss model
Random Erasure Model:
– Each bit is lost independently with some probability p
– We know the positions of the lost bits
For a rate of (1-p) we can correct a (1-ε)p fraction of the errors.
Seems to imply an (n, (1-p)n, (1-ε)pn+1)_2 code, but not quite, because of the random-errors assumption.
We will assume p = .5.
Error correction (not just erasures) can be done with some more effort.


Tornado codes
Will use d-regular bipartite graphs with n nodes on the left and pn on the right (the notes assume p = .5).
Will need β > d/2 expansion.
[figure: message bits m_1, …, m_k on the left (degree d), check bits c_1, …, c_pk on the right (degree 2d); e.g., c_6 = m_3 ⊕ m_7]
Similar to LDPC codes, but the check bits are not required to equal zero (i.e., the graph does not represent H).
k = # of message bits (the notes use n).

Tornado codes: Encoding
Each check bit computes the sum modulo 2 of its neighbors.
Why is it linear time? The graph is d-regular on the left, so it has only O(k) edges, and each edge is used once.

Tornado codes: Decoding
Assume that all the check bits are intact.
Find a check bit such that only one of its neighbors is erased (an unshared neighbor).
Fix the erased bit, and repeat.
   e.g., if m_3 is erased and c_1 = m_1 ⊕ m_2 ⊕ m_3, then m_3 = m_1 ⊕ m_2 ⊕ c_1
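A hedged sketch of this peeling decoder (the graph here is random rather than a carefully constructed expander, so success is only typical, not guaranteed; names and parameters are illustrative):

    import random

    def make_graph(k, pk, d, seed=0):
        rng = random.Random(seed)
        return [rng.sample(range(pk), d) for _ in range(k)]  # msg bit -> its checks

    def make_checks(graph, msg, pk):
        c = [0] * pk
        for i, bit in enumerate(msg):
            for j in graph[i]:
                c[j] ^= bit                       # c_j = XOR of its msg neighbors
        return c

    def peel(graph, received, checks, pk):
        received = received[:]                    # None marks an erased message bit
        members = [[] for _ in range(pk)]         # check -> its message-bit members
        for i, nbrs in enumerate(graph):
            for j in nbrs:
                members[j].append(i)
        progress = True
        while progress and None in received:
            progress = False
            for j in range(pk):                   # look for an unshared neighbor
                erased = [i for i in members[j] if received[i] is None]
                if len(erased) == 1:
                    val = checks[j]
                    for i in members[j]:
                        if received[i] is not None:
                            val ^= received[i]
                    received[erased[0]] = val     # fix the erased bit; repeat
                    progress = True
        return received

    rng = random.Random(1)
    k, pk, d = 16, 8, 3
    graph = make_graph(k, pk, d)
    msg = [rng.randrange(2) for _ in range(k)]
    checks = make_checks(graph, msg, pk)
    lost = list(msg)
    lost[2] = lost[9] = None                      # two erasures
    print(peel(graph, lost, checks, pk) == msg)   # True unless the erased bits
                                                  # share all their checks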

Tornado codes: Decoding (continued)
Need to ensure that we can always find such a check bit. Can we always find unshared neighbors?
[figure: corrupted message bits and a check bit c_1 that is an unshared neighbor of m_1]
The "unshared neighbors" property: consider the set of corrupted (erased) message bits and their neighbors, and suppose this set is small. Expander graphs give us this property if β > d/2 ⇒ at least one message bit has an unshared neighbor (similar argument to the one above).
Also, [Luby et al] show that if we construct the graph from a specific kind of degree sequence, then we can always find unshared neighbors.

What if check bits are lost? Cascading
– Use another bipartite graph to construct another level of check bits for the check bits
– The final level is encoded using RS or some other code
Level sizes: k, pk, p^2 k, …, p^l k ≤ √k
Total bits n ≤ k (1 + p + p^2 + …) = k/(1-p), so rate = k/n = (1-p).
Can fix kp(1-ε) random erasures.

Cascading: encoding and decoding time
Encoding time:
– for the first l stages: |E| = d x |V| = O(k)
– for the last stage: √k x √k = O(k)
Decoding time:
– start from the last stage and move left; again proportional to |E|
– also proportional to d, which must be at least 1/ε to make the decoding work
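The geometric series is easy to sanity-check (a toy calculation, using √k as the cutoff for the final RS-coded level):

    import math

    def cascade_levels(k, p=0.5):
        """Level sizes k, pk, p^2 k, ..., stopping once a level is <= sqrt(k)."""
        levels, size = [], k
        while size > math.sqrt(k):
            size = int(size * p)
            levels.append(size)
        return levels

    k = 1 << 16
    levels = cascade_levels(k)                 # [32768, 16384, ..., 256]
    print(k + sum(levels))                     # total bits ~ k/(1-p) = 2k = 131072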


Some extra slides

Expander Graphs: Constructions
Start with a small expander, and apply operations to make it bigger while preserving expansion.
Squaring
– G^2 contains edge (u,w) if G contains edges (u,v) and (v,w) for some node v
– A' = A^2 – (1/d) I
– λ' = λ^2 – 1/d
– d' ≤ d^2 – d
Size ≡, Degree ↑, Expansion ↑


Expander Graphs: Constructions (continued)
Tensor Product (Kronecker product)
– G = A x B: nodes are (a,b) ∀ a ∈ A and b ∈ B
– edge between (a,b) and (a',b') if A contains (a,a') and B contains (b,b')
– n' = n_1 n_2
– λ' = max(λ_1, λ_2)
– d' = d_1 d_2
Size ↑, Degree ↑, Expansion ↓

ZigZag product
– "Multiply" a big graph with a small graph, with n_2 = d_1 and d_2 = √d_1
Size ↑, Degree ↓, Expansion ↓ (slightly)


Combination: square and zigzag
For a graph with size n, degree d, and eigenvalue λ, define G = (n, d, λ). We would like to increase n while holding d and λ the same.
Squaring and zigzag have the following effects:
   (n, d, λ)^2 = (n, d^2, λ^2)
   (n_1, d_1, λ_1) zz (d_1, d_2, λ_2) = (n_1 d_1, d_2^2, λ_1 + λ_2 + λ_2^2)
Now given a graph H = (d^4, d, 1/5) and G_1 = (d^4, d^2, 2/5):
   G_i = G_(i-1)^2 zz H   (square, then zigzag)
Giving: G_i = (n_i, d^2, 2/5) where n_i = d^(4i)   (as desired)

