Error-Correcting Linear Codes Motivation

Error-correcting linear codes Motivation • CD’s get scratched. • Unreliable or noisy communication channels (Wi-Fi, Ethernet…) • Redundancy measures the number of extra bits 푟 = 푛 − 푘. + [0] [1] [0] [0] [1] [1] [1] [0] Hamming (7,4) – A practical approach • Given a stream of bits, split it into 4 bit blocks and for each of those form a 7 bit block (codeword), e.g. 01010001011101101001… −> 0101 0101 0001 0111 0110 1001… • For each block 푑1푑2푑3푑4, suffix 3 redundant bits 푝1푝2푝3 given by: 푝1 = 푑1 + 푑2 + 푑4 푝2 = 푑1 + 푑3 + 푑4 푝3 = 푑2 + 푑3 + 푑4 • The receiver then tries to detect and correct errors with the help of the redundant bits. Hamming (7,4) – Examples 푑1푑2푑3푑4 푝1푝2푝3 • 0100 101 • 1000 010 1000 110 • 1000 011 1100 011 • p1 = 0 + 1 + 0 = 1 • p1 = 1 + 0 + 0 = 1 • p1 = 1 + 0 + 0 = 1 • p2 = 0 + 0 + 0 = 0 • p2 = 1 + 0 + 0 = 1 • p2 = 1 + 0 + 0 = 1 • p3 = 1 + 0 + 0 = 1 • p3 = 0 + 0 + 0 = 0 • p3 = 0 + 0 + 0 = 0 • 0010 100 0011 100 • 0110 001 0111 001 Wrong! • p1 = 0 + 0 + 0 = 0 • p1 = 0 + 1 + 0 = 1 • p2 = 0 + 1 + 0 = 1 • p2 = 0 + 1 + 0 = 1 • p3 = 0+ 1 + 0 = 1 • p3 = 1 + 1 + 0 = 0 Hamming (7,4) - Remarks • With 3 extra bits we are able to communicate 23 = 8 “states”. • In the previous exemples, we saw that we can’t correct 2 errors. • In fact, the Hamming(7,4) code is only capable of correcting 1 error per 7 bits. • Since we have 8 possible ways of communicating extra information, the Hamming(7,4) code is able to detect the no-errors state and the remaning 7 states where only 1 bit got flipped. • For each block of 4 bits, we send a total of 7 bits, so the efficiency of this code is 4/7. Challenges • How can we ensure that this procedure is correct without explictly check all possibilities? Does this set of equations make sense? • What if we wanted a Hamming(푥, 푦)? What are the consequences? • How can we bound the number of errors that a code can correct? • What about correcting more than one error? Error-correcting (linear) code • Definition: An error correcting code of length 푛 over a finite alphabet Σ is a subset of Σn. The elements of 풞 are called the valid codewords. If Σ = q we say that 풞 is a 푞-ary code. When q = 2, we say that 풞 is a binary code. • Definition: If Σ is a field and 풞 ⊂ Σ푛 is a subspace of Σ푛 then 풞 is said to be a linear code. More informally, a linear code is an error- correcting code for which any linear combination of codewords is also a codeword. 푛 • Definition: A linear [푛, 푘]-code 풞 is a 푘-dimensional subspace of 픽 2 푛 (풞 ⊂ 픽 2). Generator Matrix 푮 푛 • As a linear subspace of 픽 q , the entire code 풞 may be represented as the span of a set of k codewords (the order of 풞). • The generator matrix 퐺 is a k x n matrix whose rows form a basis for a linear code 풞, i. e. , the linear code is spanned by the rows of 퐺. 푘 푛 • We can think of 퐺 as a linear map 푇: 픽 q → 픽 q , (푛 > 푘). • When 퐺 has the form 퐺 = [퐼푘 | 푃] where 퐼푘 denotes the k x k identity matrix and P is some k x (n-k) matrix, then we say G is in standard form. • The generator matrix encodes our messages by projecting them into a higher dimensional space (intuitively, this corresponds to redundancy). Generator Matrix 푮 - Encoding 푘 • Let 푏 ∈ 픽 q be a block message and 퐺 = [퐼푘 | 푃] a generator matrix, where 푏 is assumed to be in row format. 푏. 퐺 = (푏 | 푟푒푑푢푛푑푎푛푡 푏푖푡푠) Hamming weight, Hamming distance • Definition: The Hamming weight 푤 푥 is the number of non-zero values in 푥. In the binary case, it is the number of 1s. 푛 푛 • Definition: For 푥 ∈ 픽 q and 푦 ∈ 픽 q the Hamming distance is given by 푑(푥, 푦) = 푖|푥 ≠ 푦 푖 푖 i.e., it is the number of coordinate positions in which they differ. 푤(0) = 0 Let 푥 = 01011 and 푦 = 10110, then 푑 푥, 푦 = 4 푛 푤 0010 = 1 Let 푦 ∈ 픽 2, then 푑 0, 푦 = 푤(푦) 푛 • Symmetry: 푑 푥, 푦 = 푑 푦, 푥 ∀푥, 푦 ∈ 픽 2 푛 • Positive definiteness: 푑 푥, 푦 ≥ 0, 푑 푥, 푦 = 0 iff 푥 = 푦 ∀푥, 푦 ∈ 픽 2 푛 • Triangle inequality: 푑 푥, 푦 ≤ 푑 푥, 푧 + 푑 푧, 푦 ∀푥, 푦, 푧 ∈ 픽 2 • For length 푛 = 1: • If 푥 = 푦, then 퐿퐻푆 = 0 and 퐿퐻푆 ≤ 푅퐻푆. If 푥 ≠ 푦, then 퐿퐻푆 is 1 and either 푥 ≠ 푧 or 푦 ≠ 푧, so 푅퐻푆 is at least 1 ֜ 퐿퐻푆 ≤ 푅퐻푆 • • For general n we use the fact that 푑 푥푖, 푦푖 ≤ 푑 푥푖, 푧푖 + 푑(푧푖, 푦푖) 푛 푛 푛 푛 • 푑 푥, 푦 = σ푖=1 푑 푥푖, 푦푖 ≤ σ푖=1 푑 푥푖, 푦푖 + 푑 푧푖, 푦푖 = σ푖=1 푑 푥푖, 푧푖 + σ푖=1 푑 푧푖, 푦푖 = = 푑 푥, 푧 + 푑(푧, 푦) Minimum distance 풅 • The minimum distance 푑 is the smallest Hamming distance between all distinct valid codewords: 푑 = m푖푛 푑 푥, 푦 푥,푦∈C 푥≠푦 • Computing the minimum distance of the code requires calculating |풞| |풞|2 ≈ 2 2 Hamming distances. • Remark: Sometimes codes are described as [푛, 푘, 푑] codes. Computing 풅 • Observation: the sum of any two messages is a message. • Let 푚1 and 푚2 be messages to be encoded by G and 푚3 the message 푚1 + 푚2 = 푚3. The following is true: 푚1 + 푚2 퐺 = 푚1퐺 + 푚2퐺 = 푚3퐺 • Important: We conclude that the sum of 2 valid codewords is a codeword Computing 풅 • 푑 = m푖푛 푑 푐1, 푐2 = m푖푛 푑 푐1 − 푐1, 푐2 − 푐1 푐1,푐2∈C 푐1,푐2∈C = m푖푛 푑 0, 푐2 − 푐1 = m푖푛 푑 0, 푐2 + 푐1 = m푖푛 푑 0, 푐 푐1,푐2∈C 푐1,푐2∈C 푐∈C = m푖푛 푤(푐) where 푐 ≠ 0. 푐∈C Parity-Check Matrix 푯 푛 푛_푘 • An (n-k) x n matrix 퐻 representing a linear function 휙: 픽 2 → 픽 2 whose kernel is 풞 is called a check matrix for 풞 (or a parity check matrix). 퐻푐푇 = 0 iff 푐 is a valid codeword • If 풞 is a code with generating matrix 퐺 in standard form, 퐺 = [퐼푘 | 푃], then 퐻 = −푃푇 퐼 ] is a check matrix for 풞. 푛 − 푘 • It gives us a method of detecting errors when they occur. • Remark: 퐻 has all the information about 풞, so as 퐺. 퐻푐푇 = 0 iff 푐 is a valid codeword. • Let 푐 be a valid codeword and 푒 an error vector, then we can write 퐻 푐 + 푒 푇 = 퐻푐푇 + 퐻푒푇 = 퐻푒푇 • Let 푞 be a valid codeword. If we introduce the same error vector 푒 퐻 푞 + 푒 푇 = 퐻푞푇 + 퐻푒푇 = 퐻푒푇 • The result 퐻푒푇 is called a syndrome. • Remark: syndromes depend only on the error and never on the message. Syndrome decoding 푛 푛 • Given any valid codeword 푐 ∈ 픽 2 and an error vector 푒 ∈ 픽 2 , we saw that 퐻푒푇 is called the syndrome vector. Let 푧 = 푐 + 푒, then 퐻푧푇 = 퐻푐푇 + 퐻푒푇 = 퐻푒푇. • Build a map associating all possible syndromes 퐻푒푇 to their corresponding 푛 errors 푒 ∈ 픽 2. • Under the assumption that no more than 푡 errors were made during 푛 transmission, the receiver can look up the value 퐻푒푇 in a table of size σ푡 . 푖=0 푖 • After finding 푒 in the table, it is easy to correct 푧, i.e., find 푐: 푐 = 푧 − 푒 Standard array • Standard arrays are used to decode linear codes. _ • A standard array for an [푛, 푘]-code is a 푞푛 푘 x 푞푘 array that lists all 푛 elements of a particular 픽 q vector space, where : 1. The first row lists all codewords (with the 0 codeword on the extreme left). 2. Each row is a coset with the coset leader (representative) in the first column. 3. The entry in the 푖-th row and 푗-th column is the sum of the 푖-th coset leader and the 푗-th codeword. • Recall Lagrange’s Theorem: If 퐺 is a finite group and 퐻 ⊂ 퐺 a subgroup, then 퐺 = 퐻 퐺/퐻 = 퐻 퐻\퐺 . Example 2 4 • Let 휙: 픽 2 → 픽 2 be the encoder map for a [4,2] code, given by the generator matrix 1 0 1 1 퐺 = 0 1 0 1 • Therefore, 풞 = {0000, 1011, 0101, 1110} • By Lagrange’s Theorem, the standard array will have dimesions 4x4 = since the order of 풞 is 2푘 4 and 푛 = 4, so we have 2푛/2푘 cosets. Example Valid 1011 0101 1110 codewords 0000 1000 0011 1101 0110 0100 1111 0001 1010 0010 1001 0111 1100 Example • Suppose that the word 0110 was received as a message. How do we 1000 + 0110 = 1110 ensure that it does not contain errors? 0000 1011 0101 1110 1000 0011 1101 0110 0100 1111 0001 1010 0010 1001 0111 1100 Standard array • The quotient group, with respect to the coset addition, satisfies: 푛 푛_푘 픽 q / 풞 ≅ 픽 q • The standard array is the correspondence between the syndromes 푛_푘 푛 푠 ∈ 픽 and the cosets of 풞 in 픽 q , induced by this isomorphism. • All row yield the same syndrome. _ • A [푛, 푘] linear code can correct 푞푛 푘 error patterns.

Error-Correcting Linear Codes Motivation

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support