Error Detection/Correction

· Messages are a sequence of bits, 0's and 1's
· Bit corruption: the channel changes the value of some bits (0 → 1, 1 → 0)
· Corruption may be random for each bit (thermal noise, for example).
· Corruption may also occur in bursts
· Corruption burst: smallest subsequence containing all corrupted bits

msg sent:         01101110011011
msg recv:         01111110111001
corrupted bits:      X    X   X
corruption burst:    |--------|
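As a minimal sketch of these definitions, the corrupted positions and the burst size can be found by comparing the sent and received bit strings. The helper name `corruption_burst` is hypothetical and not part of the notes.

```python
def corruption_burst(sent: str, recv: str):
    """Return (corrupted positions, burst size) for two equal-length bit strings."""
    if len(sent) != len(recv):
        raise ValueError("messages must have equal length")
    # 0-based indices at which the channel flipped a bit
    bad = [i for i, (a, b) in enumerate(zip(sent, recv)) if a != b]
    # The burst runs from the first corrupted bit to the last one, inclusive
    size = bad[-1] - bad[0] + 1 if bad else 0
    return bad, size


positions, size = corruption_burst("01101110011011", "01111110111001")
print(positions, size)  # [3, 8, 12] 10
```

Note that the burst counts every bit between the first and last corrupted bit, whether or not it was corrupted itself.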

1 Definitions

· error detection: detect if a message is corrupted (NOT detect which individual bits were corrupted). The ability to perform error detection is usually measured in two ways:

1. We say that a protocol performs x-bit detection iff, for ANY message and for ANY number y of corrupted bits in the message, where y ≤ x, the protocol will detect the corruption.

2. We say that a protocol performs x-burst detection iff, for ANY message and for ANY corruption burst of size y in the message, where y ≤ x, the protocol will detect the corruption.

· error correction: correct a corrupted message (restore its original value).

We say a protocol performs x-bit correction iff, in ANY message, if ANY number y of bits are corrupted, where y ≤ x, the protocol can restore the original contents of the message.

Note that error bursts are not considered.

2 Detection and correction are accomplished by adding redundant check bits

3 Code Words

· code word: a sequence M of data bits, |M| = m, followed by a sequence R of check bits, |R| = r (usually m >> r)

    |<----- m data bits ----->|<- r check bits ->|
    |            M            |        R         |

· the R check bits are a function of the M data bits

· I.e., there is a standard function f, known by both the sender and receiver.

· A code word is valid iff R = f(M). I often use “legal” and “valid” interchangeably.

4 Detecting Corruption

· Sender computes R = f(M), and sends M;R (; = concatenation)

· Channel transforms (corrupts) M;R into M';R', and receiver receives M';R'

· Receiver computes X = f(M')

· If X = R', the receiver accepts the msg.; if X ≠ R', the receiver rejects the msg.

· Note: X ≠ R' ⇒ corruption, but X = R' does NOT imply no corruption, since the channel may turn a valid code word into another valid code word.
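The steps above can be sketched as follows. The notes do not fix a particular f at this point, so this sketch assumes f is a single even-parity bit; the names `send`, `accepts`, and `flip` are hypothetical helpers for illustration.

```python
def f(data: str) -> str:
    """Assumed check function: a single even-parity bit over the data bits."""
    return str(data.count("1") % 2)


def send(data: str) -> str:
    """Sender: transmit M;R where R = f(M)."""
    return data + f(data)


def accepts(word: str, r: int = 1) -> bool:
    """Receiver: split M';R', recompute X = f(M'), accept iff X == R'."""
    data, check = word[:-r], word[-r:]
    return f(data) == check


def flip(word: str, *positions: int) -> str:
    """Model the channel flipping the bits at the given 0-based positions."""
    bits = list(word)
    for i in positions:
        bits[i] = "1" if bits[i] == "0" else "0"
    return "".join(bits)


word = send("0110")
print(word)                        # 01100
print(accepts(word))               # True: no corruption
print(accepts(flip(word, 1)))      # False: corruption detected
print(accepts(flip(word, 1, 2)))   # True: corrupted, yet accepted
```

The last line shows the channel turning one valid code word into another, so acceptance does not prove the message is intact.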

5 Abundance of Illegal Code Words

· code = a collection of legal code words

· Assume for simplicity that you always send m data bits.

· m data bits ⇒ 2^m legal code words, out of 2^(m+r) possible words of m + r bits.

· I.e., 2^(m+r) - 2^m = number of illegal code words

· E.g., if m = 8 and r = 2: 256 valid code words, 1024 - 256 = 768 invalid code words.

· The hope is that if corruption occurs, the channel turns a valid code word into an invalid code word (not another valid code word), since there are more invalid code words than valid ones.

· The larger the number of illegal code words (i.e. larger r) the better.
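A small sketch of this counting argument, with a hypothetical helper `count_code_words`. It only counts words; it does not model how likely a real channel is to produce any particular corruption.

```python
def count_code_words(m: int, r: int):
    """Return (legal, illegal) word counts for m data bits and r check bits."""
    total = 2 ** (m + r)   # every possible (m + r)-bit word
    legal = 2 ** m         # one legal code word per data value
    return legal, total - legal


print(count_code_words(8, 2))  # (256, 768)

# The fraction of words that are legal shrinks as r grows (it equals 2**-r)
for r in (1, 2, 4, 8):
    legal, illegal = count_code_words(8, r)
    print(r, legal, illegal, legal / (legal + illegal))
```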

6 Hamming distance

· Hamming distance of two equal-length code words: the number of bit positions in which the two code words differ.

· c1  = 1000100
  c2  = 1011001
  xor = 0011101
  Hamm(c1, c2) = 4 (the number of 1's in the xor)

· Hamming distance of a code = minimum Hamming distance over all pairs of distinct equal-length code words in the code.
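Both definitions translate directly into code. The function names `hamming` and `code_distance` are illustrative, and the three-bit code used at the end is a small assumed example (two data bits plus an even-parity bit), not one taken from the notes.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions in which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("code words must have equal length")
    return sum(x != y for x, y in zip(a, b))


def code_distance(code) -> int:
    """Minimum Hamming distance over all pairs of distinct code words."""
    return min(hamming(a, b) for a, b in combinations(code, 2))


print(hamming("1000100", "1011001"))              # 4
print(code_distance(["000", "011", "101", "110"]))  # 2
```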

7 Detecting Corruption

· To perform x-bit detection, Hamm(code) ≥ x+1

Why? Assume Hamm(code) ≥ x+1; then all valid code words are separated by at least x+1 bit changes.

       ≥ x+1         ≥ x+1
    |<-------->|<-------->|
    c2---|-----c1-----|---c3
         e            e'

If the sender sends c1 and the channel corrupts x bits or fewer (resulting in words e or e'), you end up with an invalid word, because the other valid words (e.g. c2 and c3) are at least x+1 bits away from c1.

· E.g. byte parity bit (the "even" parity bit of a sequence of bits is the XOR of these bits): m = 8 data bits and r = 1 parity bit

    byte parity (even) code word:  10111010 1
    single bit error:              11111010 1  = illegal code word
    double bit error:              11011010 1  = another valid code word

Hamm dist of the byte parity bit code = 2, and it performs 1-bit detection.
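The distance of the byte parity code can be checked by brute force over all 256 code words. This is an illustrative sketch; `parity_word` is a hypothetical helper name.

```python
from itertools import combinations, product


def parity_word(data):
    """Append the even-parity bit (XOR of the data bits) to a tuple of bits."""
    p = 0
    for bit in data:
        p ^= bit
    return data + (p,)


# One code word per possible byte: 8 data bits followed by 1 parity bit
code = [parity_word(bits) for bits in product((0, 1), repeat=8)]

# Minimum Hamming distance over all pairs of distinct code words
d = min(sum(x != y for x, y in zip(a, b)) for a, b in combinations(code, 2))

print(len(code), d, d - 1)  # 256 2 1  (code words, distance, detectable bit errors)
```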

8 Error Correction

To perform x-bit correction, you replace the received invalid code word by the closest legal code word (in terms of Hamming distance).

E.g. valid code words:

    00000 00000
    00000 11111
    11111 00000
    11111 11111

code Hamm dist = 5

Sender sends:          00000 00000
Receiver receives:     00000 00011
Receiver corrects to:  00000 00000 (closest legal code word to the received word)

9 Hamming Distance for Correction

To perform x-bit correction, Hamm dist of the code ≥ 2x + 1

Why? If at most x bits are corrupted, there is only one valid code word within a distance of x bits of the invalid code word.

Assume Hamm(code) ≥ 2x + 1, and the sender sends code word c1.

    |<---- ≥ 2x+1 ---->|<---- ≥ 2x+1 ---->|
    c2-----------------c1----e------------c3
                       |<--->|<---------->|
                        ≤ x      ≥ x+1

e = c1 corrupted by x bits or fewer. c2 and c3 are at least x+1 away from e, while c1 is only x or less away from e (thus correct to c1).

Assuming only x bits or fewer can get corrupted, the closest valid word to e is c1, and the receiver correctly corrects e to c1.

10 Example

In the earlier code example, we can always correct ≤ 2 bit errors, since Hamm(code) = 5 ≥ 2·2 + 1.

E.g.
Sender sends:          00000 11111
Receiver receives:     00000 00111
Receiver corrects to:  00000 11111

This is assuming ≤ 2 bits are corrupted.

If the sender sends 00000 11111
and the receiver receives 00000 00011 (3 bit errors),
the receiver incorrectly corrects the code word to 00000 00000.
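Both cases can be reproduced with a small nearest-code-word decoder. This is a sketch under the assumption that the receiver knows the full list of valid code words; the names `CODE` and `correct` are illustrative, and the code words are written without the space used in the text.

```python
CODE = ["0000000000", "0000011111", "1111100000", "1111111111"]


def hamming(a: str, b: str) -> int:
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def correct(received: str) -> str:
    """Replace the received word by the closest valid code word.

    On a tie, min() returns the first closest word in CODE.
    """
    return min(CODE, key=lambda c: hamming(c, received))


# Sender sent 0000011111 in both cases
print(correct("0000000111"))  # 0000011111  (2 bit errors: corrected properly)
print(correct("0000000011"))  # 0000000000  (3 bit errors: corrected wrongly)
```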

11 Parity Detection

· Assume each message is of size m*n (m is usually a constant)

· Below, b = bit.

· Visualize the message (which is just a bit string) as a 2-dimensional array with n rows and m columns, followed by a row of parity bits:

    b[0,0],   b[0,1],   . . .  b[0,m-1]
    b[1,0],   b[1,1],   . . .  b[1,m-1]
    . . .
    b[n-1,0], b[n-1,1], . . .  b[n-1,m-1]
    p[0],     p[1],     . . .  p[m-1]

p[i] = XOR of b[0,i], b[1,i], … b[n-1,i]
i.e., p[i] = XOR of column i (this yields even parity)

· Receiver receives data bits and parity bits

· checks if p[i] = XOR of b[0,i], b[1,i], … b[n-1,i] for every column i

· if not, the message is thrown away.
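A minimal sketch of the column parity computation and the receiver's check. The array sizes and bit values are made up for illustration, and `column_parity` and `check` are hypothetical names.

```python
def column_parity(rows):
    """rows: n lists of m bits each. Return p, where p[i] is the XOR of column i."""
    p = [0] * len(rows[0])
    for row in rows:
        for i, bit in enumerate(row):
            p[i] ^= bit
    return p


def check(rows, p):
    """Receiver side: accept iff the recomputed column parities match p."""
    return column_parity(rows) == p


rows = [[1, 0, 1, 1],
        [0, 1, 1, 0],
        [1, 1, 0, 0]]   # n = 3 rows, m = 4 columns
p = column_parity(rows)
print(p)                 # [0, 0, 0, 1]
print(check(rows, p))    # True

rows[1][2] ^= 1          # the channel flips one data bit
print(check(rows, p))    # False: the message is thrown away
```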

12 Transmission Order

· How to send the bits (in what order)? Row by row or column by column?

· We transmit row by row:

b[0,0], … b[0,m-1], b[1,0], … b[1,m-1], … , b[n-1,0], … b[n-1,m-1], p[0], p[1], … p[m-1]

Why? Any burst of size ≤ m will be detected (the line in the figure represents a corruption burst).

· burst size ≤ m: at most 1 bit per column may be corrupted, so the parity of every corrupted column changes.

· burst size > m: 2 bits per column may be corrupted, i.e., the error may not be detected.
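The burst argument can be checked exhaustively for a small array. This sketch assumes n = 3 and m = 4 with made-up data bits, and treats a burst as any set of flipped bits whose first and last flipped positions span the burst; all names are hypothetical.

```python
from itertools import product

N, M = 3, 4  # n rows, m columns (small sizes chosen for illustration)


def encode(rows):
    """Transmit row by row, followed by the row of column parity bits."""
    p = [0] * M
    for row in rows:
        for i, bit in enumerate(row):
            p[i] ^= bit
    return [bit for row in rows for bit in row] + p


def accepted(stream):
    """Valid iff every column, parity bit included, has even parity."""
    return all(sum(stream[r * M + i] for r in range(N + 1)) % 2 == 0
               for i in range(M))


def flip(stream, positions):
    out = list(stream)
    for pos in positions:
        out[pos] ^= 1
    return out


def burst_patterns(size, length):
    """Yield every set of flipped positions whose burst has exactly `size` bits."""
    for start in range(length - size + 1):
        end = start + size - 1
        inner = range(start + 1, end)  # positions strictly inside the burst
        for mask in product((0, 1), repeat=len(inner)):
            chosen = [pos for pos, bit in zip(inner, mask) if bit]
            yield sorted({start, end, *chosen})


sent = encode([[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]])

all_detected = all(not accepted(flip(sent, pattern))
                   for size in range(1, M + 1)
                   for pattern in burst_patterns(size, len(sent)))
print(all_detected)                  # True: every burst of size <= m is caught

# A burst of size m + 1 can hit the same column twice and slip through
print(accepted(flip(sent, [0, M])))  # True: corrupted, yet accepted
```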

13 Row and column parity bits

    b[0,0],   b[0,1],   . . .  b[0,m-1]      q[0]
    b[1,0],   b[1,1],   . . .  b[1,m-1]      q[1]
    . . .
    b[n-1,0], b[n-1,1], . . .  b[n-1,m-1]    q[n-1]
    p[0],     p[1],     . . .  p[m-1]

· q[i] = XOR of b[i,0], … b[i,m-1] (parity of row i)
· p[i] as before
· property: any two-bit error is detected (bursts of size ≤ m+1 are detected).
· What if we add a bit r in the bottom right corner, where r = parity on the q's?
· property with r: any three-bit error is detected
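These two properties can be checked by brute force on a small array. This sketch assumes n = 3 and m = 4 with made-up data bits; because every check bit is an XOR of data bits, whether an error pattern is detected does not depend on the data, so one sample message is enough. All names are hypothetical.

```python
from itertools import combinations

N, M = 3, 4  # n rows, m columns (small sizes chosen for illustration)


def encode(rows, corner=False):
    """Append q[i] to each data row, then add the row of p bits (and r if corner)."""
    grid = [row + [sum(row) % 2] for row in rows]
    p = [sum(row[i] for row in rows) % 2 for i in range(M)]
    if corner:
        p = p + [sum(g[M] for g in grid) % 2]  # r = parity on the q's
    return grid + [p]


def accepted(grid, corner=False):
    rows_ok = all(sum(row) % 2 == 0 for row in grid[:N])  # data row plus q[i]
    cols_ok = all(sum(grid[r][i] for r in range(N + 1)) % 2 == 0
                  for i in range(M))                      # data column plus p[i]
    corner_ok = (not corner) or sum(grid[r][M] for r in range(N + 1)) % 2 == 0
    return rows_ok and cols_ok and corner_ok


def flip(grid, positions):
    out = [list(row) for row in grid]
    for r, c in positions:
        out[r][c] ^= 1
    return out


def all_detected(rows, k, corner):
    """True iff every error that flips exactly k transmitted bits is rejected."""
    grid = encode(rows, corner)
    cells = [(r, c) for r, row in enumerate(grid) for c in range(len(row))]
    return all(not accepted(flip(grid, pos), corner)
               for pos in combinations(cells, k))


rows = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
print(all_detected(rows, 2, corner=False))  # True
print(all_detected(rows, 3, corner=False))  # False: e.g. b[0,0], q[0], p[0]
print(all_detected(rows, 3, corner=True))   # True
```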
