Lecture 15:

Recall defn. of row operations, row equivalence, and row space.

Theorem 1: If B is row equivalent to A, then R(B) = R(A).

Proof of Theorem 1: Check that, after each elementary row operation, the row space does not change.

   0  R1 R1  R   0   2   R2  A =   ,B =    ...   ...  R 0 m Rm 0 0 1. Ri = Rj, Rj = Ri. – An exchange of rows does not change the set of rows. 0 −1 0 2. Ri := aRi; Ri := a Ri. – So, any of rows of B is a linear combination of rows of A, and vice versa. 0 0 0 0 3. Rj := aRi + Rj; Rj = Rj − aRi = Rj − aRi – So, any linear combination of rows of B is a linear combination of rows of A, and vice versa. Recall defn. of REF. Theorem 2: Every is row equivalent to an REF. Proof of Theorem 2: Step 1: Exchange rows so that a row with the left-most leading entry is at top. Then the leading entry in the top row is in position (1, j1)

Step 2: Subtract appropriate multiples of $R_1$ from all other rows in order to “zero out” all entries of column $j_1$ except for position $(1, j_1)$.

Continue Steps 1 and 2 inductively until only zero rows remain:

– Exchange remaining rows so that (among all but row 1) $R_2$ has the left-most leading entry. Then the leading entry in $R_2$ is in position $(2, j_2)$, with $j_2 > j_1$.

– Subtract appropriate multiples of $R_2$ from the rows below it in order to “zero out” all entries of column $j_2$ below $(2, j_2)$.

– etc.
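The two steps of the proof can be sketched in code. This is a minimal illustration, not part of the lecture: it assumes the field is $\mathbb{Z}_p$ for a prime p (so it does not cover GF(4)), and the function name `ref_mod_p` is hypothetical. It relies on Python 3.8+ for `pow(x, -1, p)`.

```python
def ref_mod_p(rows, p):
    """Return a row echelon form of `rows` over Z_p (p prime). Hypothetical helper."""
    m = [[x % p for x in r] for r in rows]    # work on a reduced copy
    n_rows, n_cols = len(m), len(m[0])
    top = 0                                    # index of the next row to fix
    for col in range(n_cols):
        # Step 1: find a row at or below `top` with a nonzero entry in this column.
        pivot = next((r for r in range(top, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[top], m[pivot] = m[pivot], m[top]
        # Step 2: subtract multiples of the pivot row to zero out the entries below it.
        inv = pow(m[top][col], -1, p)          # inverse of the leading entry mod p
        for r in range(top + 1, n_rows):
            factor = (m[r][col] * inv) % p
            m[r] = [(x - factor * y) % p for x, y in zip(m[r], m[top])]
        top += 1
        if top == n_rows:
            break
    return m

# The matrix A of Example 1 (over Z_3).
B = ref_mod_p([[0, 2, 2], [2, 1, 0], [1, 1, 2]], 3)
print(B)
```

An REF is not unique, so another valid implementation could return a different matrix with the same row space.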

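Example 1 over $GF(3) = \mathbb{Z}_3$: Let

\[ A = \begin{pmatrix} 0 & 2 & 2 \\ 2 & 1 & 0 \\ 1 & 1 & 2 \end{pmatrix} \]

Construct an REF B for A:

$R_1 \leftrightarrow R_2$:

\[ \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 2 \\ 1 & 1 & 2 \end{pmatrix} \]

$R_1 + R_3 \to R_3$ (i.e. $R_3 - 2R_1$, since $-2 = 1$ in $\mathbb{Z}_3$):

\[ \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 2 \\ 0 & 2 & 2 \end{pmatrix} \]

$2R_2 + R_3 \to R_3$ (i.e. $R_3 - R_2$):

\[ \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 0 \end{pmatrix} = B \]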

Example 2 over $GF(4) = \{0, 1, a, b\}$, with special addition/multiplication tables (on p. 36). Find REF of:

\[ A = \begin{pmatrix} a & b & 1 \\ 1 & b & 0 \\ b & 0 & 1 \end{pmatrix} \]

$bR_1 + R_2 \to R_2$:

\[ \begin{pmatrix} a & b & 1 \\ 0 & 1 & b \\ b & 0 & 1 \end{pmatrix} \]

$aR_1 + R_3 \to R_3$:

\[ \begin{pmatrix} a & b & 1 \\ 0 & 1 & b \\ 0 & 1 & b \end{pmatrix} \]

$R_2 + R_3 \to R_3$:

\[ \begin{pmatrix} a & b & 1 \\ 0 & 1 & b \\ 0 & 0 & 0 \end{pmatrix} \]

Theorem 3: Let $W = \langle S \rangle$ where S is a subset of V(n, q). Let A be a matrix whose rows are the elements of S. Let B be an REF of A. Then the nonzero rows of B form a basis for W.

Proof of Theorem 3: By Theorem 2, an REF B for A exists. We must show that i) the nonzero rows of B span W and ii) the nonzero rows are linearly independent.

i: By Theorem 1, the row space of B is W; thus, the nonzero rows of B span W.

ii: Write

\[ B = \begin{pmatrix} R_1' \\ R_2' \\ \vdots \\ R_m' \end{pmatrix} \]

Let $R_1', \ldots, R_k'$, $k \le m$, be the non-zero rows. Suppose

\[ \sum_{i=1}^{k} a_i R_i' = 0. \]

For $i = 1, \ldots, k$, let $j_i$ be the column of the leading entry of $R_i'$. We show that each $a_i = 0$, $1 \le i \le k$, by induction on $i$.

For $i = 1$: Since all entries in column $j_1$, except for $(1, j_1)$, are 0,

\[ a_1 R'_{1,j_1} = \Big( \sum_{i=1}^{k} a_i R_i' \Big)_{j_1} = 0. \]

Since $R'_{1,j_1} \neq 0$ and F is a field, $a_1 = 0$.

Inductive step: Assume $a_i = 0$ for $i = 1, \ldots, I-1 < k$, for some $I$. Since $R'_{\ell,j_I} = 0$ for all $\ell > I$,

\[ a_I R'_{I,j_I} = \Big( \sum_{i=1}^{k} a_i R_i' \Big)_{j_I} = 0. \]

Again, since $R'_{I,j_I} \neq 0$ and F is a field, $a_I = 0$. □

By Example 1 above (GF(3)), {210, 022} is a basis for $\langle 210, 112, 022 \rangle$.

By Example 2 above (GF(4)), {ab1, 01b} is a basis for $\langle ab1, 1b0, b01 \rangle$.
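The GF(4) case can be checked mechanically. The sketch below is only an illustration: it assumes the elements 0, 1, a, b are encoded as the integers 0, 1, 2, 3, with addition as bitwise XOR and a multiplication table satisfying $a^2 = b$, $ab = 1$, $b^2 = a$ (consistent with the computations in these notes, but the tables on p. 36 are not reproduced here). All names (`MUL`, `INV`, `ref_gf4`, `parse`, `show`) are hypothetical.

```python
# Encoding (an assumption of this sketch): 0 -> 0, 1 -> 1, a -> 2, b -> 3.
NAMES = "01ab"
MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],   # a*a = b, a*b = 1
    [0, 3, 1, 2],   # b*a = 1, b*b = a
]
INV = {1: 1, 2: 3, 3: 2}    # a and b are inverses of each other

def add(x, y):
    return x ^ y             # characteristic 2: addition is XOR under this encoding

def ref_gf4(rows):
    """Row echelon form over GF(4), using the encoding above."""
    m = [list(r) for r in rows]
    top = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(top, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[top], m[pivot] = m[pivot], m[top]
        inv = INV[m[top][col]]
        for r in range(top + 1, len(m)):
            factor = MUL[m[r][col]][inv]
            # In characteristic 2, subtracting a multiple equals adding it.
            m[r] = [add(x, MUL[factor][y]) for x, y in zip(m[r], m[top])]
        top += 1
        if top == len(m):
            break
    return m

def parse(word):
    return [NAMES.index(ch) for ch in word]

def show(row):
    return "".join(NAMES[x] for x in row)

B = ref_gf4([parse("ab1"), parse("1b0"), parse("b01")])
basis = [show(r) for r in B if any(r)]   # nonzero rows of the REF
print(basis)
```

By Theorem 3, the nonzero rows printed here form a basis for the span of the three input words.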

Corollary: Let S be a finite subset of V(n, q). Let A be a matrix whose rows are the elements of S. Let B be an REF of A. Then S is linearly independent iff B has no zero rows.

Proof of Corollary: One can show that row equivalence preserves linear independence (i.e., if A and B are row equivalent, then the rows of A are linearly independent iff the rows of B are linearly independent); this is left as an exercise. Thus, S is linearly independent iff the rows of the REF B are linearly independent. But the latter can happen iff B has no zero rows.

Another proof: If B has no zero rows, then the rows of B form a basis for R(A). Thus, the dimension of R(A) is the number of rows of A. The rows of A span R(A); by Theorem 4.2, some subset of a spanning set is a basis, and all bases for the same subspace have the same size. It follows that the rows of A form a basis for R(A) and thus are linearly independent.

Conversely, if S is linearly independent, then the rows of A form a basis for R(A). Since the nonzero rows of B also form a basis, and any two bases have the same size, it follows that B has no zero rows. □

Defn: A matrix is in Reduced Row Echelon Form (RREF) if
1. The leading entry in each non-zero row is strictly to the right of the leading entry in the preceding row.
2. All the zero rows (if any) are at the bottom.
3. All leading entries are 1.
4. A leading entry is the only nonzero entry of its column.

Note that the difference between REF and RREF is conditions 3 and 4.

REF: L = a leading entry (the different appearances of L may be different nonzero elements)

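\[
\begin{pmatrix}
0 \cdots 0 & L & * \cdots * & a & * \cdots * & b & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & L & * \cdots * & c & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & L & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0
\end{pmatrix}
\]

RREF: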

  0 ... 0 1 ∗ ... ∗ 0 ∗ ... ∗ 0 ∗ ... ∗  0 ... 0 0 0 ... 0 1 ∗ ... ∗ 0 ∗ ... ∗       0 ... 0 0 0 ... 0 0 0 ... 0 1 ∗ ... ∗     0 ... 0 0 0 ... 0 0 0 ... 0 0 0 ... 0     ......  0 ... 0 0 0 ... 0 0 0 ... 0 0 0 ... 0 Theorem: Every matrix is row equivalent to a unique RREF. Proof: In an REF, all entries below a leading entry are 0. From RRE, multiply each leading entry a by a−1 to make the leading entry = 1.

For each nonzero row $i$, let $j_i$ denote the column of its leading entry; so $a_{i,j_i} = 1$.

For each nonzero entry $a = a_{r,j_i} \neq 0$ with $r < i$ (above a leading entry), replace row $R_r$ by $R_r - aR_i$; this “zeros out” the $(r, j_i)$ entry.

After making all leading entries of REF equal to 1:

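\[
\begin{pmatrix}
0 \cdots 0 & 1 & * \cdots * & a & * \cdots * & b & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 1 & * \cdots * & c & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 1 & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0
\end{pmatrix}
\]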

Replace R1 by R1 − aR2:

  0 ... 0 1 ∗ ... ∗ 0 ∗ ... ∗ d ∗ ... ∗  0 ... 0 0 0 ... 0 1 ∗ ... ∗ c ∗ ... ∗       0 ... 0 0 0 ... 0 0 0 ... 0 1 ∗ ... ∗     0 ... 0 0 0 ... 0 0 0 ... 0 0 0 ... 0     ......  0 ... 0 0 0 ... 0 0 0 ... 0 0 0 ... 0

Replace R1 by R1 − dR3:

  0 ... 0 1 ∗ ... ∗ 0 ∗ ... ∗ 0 ∗ ... ∗  0 ... 0 0 0 ... 0 1 ∗ ... ∗ e ∗ ... ∗       0 ... 0 0 0 ... 0 0 0 ... 0 1 ∗ ... ∗     0 ... 0 0 0 ... 0 0 0 ... 0 0 0 ... 0     ......  0 ... 0 0 0 ... 0 0 0 ... 0 0 0 ... 0

Replace R2 by R2 − eR3:

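\[
\begin{pmatrix}
0 \cdots 0 & 1 & * \cdots * & 0 & * \cdots * & 0 & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 1 & * \cdots * & 0 & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 1 & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0
\end{pmatrix}
\]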
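The reduction from REF to RREF described in the proof can be sketched as follows. This is an illustration under stated assumptions: the field is $\mathbb{Z}_p$ for a prime p, the input is already an REF, and the function name `rref_from_ref` is hypothetical. Unlike the proof, it scales and clears one row at a time (top to bottom) rather than scaling all rows first; the resulting RREF is the same.

```python
def rref_from_ref(ref, p):
    """Turn a row echelon form over Z_p (p prime) into reduced row echelon form."""
    m = [[x % p for x in r] for r in ref]
    for i in range(len(m)):
        row = m[i]
        lead = next((j for j, x in enumerate(row) if x != 0), None)
        if lead is None:
            continue                           # zero row: nothing to do
        inv = pow(row[lead], -1, p)            # Python 3.8+: modular inverse
        m[i] = [(inv * x) % p for x in row]    # make the leading entry equal to 1
        for r in range(i):                     # zero out the entries above the leading 1
            a = m[r][lead]
            if a != 0:
                m[r] = [(x - a * y) % p for x, y in zip(m[r], m[i])]
    return m

# An REF over Z_3 (the matrix B of Example 1).
R = rref_from_ref([[2, 1, 0], [0, 2, 2], [0, 0, 0]], 3)
print(R)
```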

Lecture 16:

HW4 will be posted later today and due on Friday, March 11.

Recall REF and RREF.

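Example 1 over $GF(3) = \mathbb{Z}_3$. An REF for

\[ A = \begin{pmatrix} 0 & 2 & 2 \\ 2 & 1 & 0 \\ 1 & 1 & 2 \end{pmatrix} \]

is:

\[ B = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 0 \end{pmatrix} \]

To find RREF:

$2R_1 \to R_1$, $2R_2 \to R_2$:

\[ \begin{pmatrix} 1 & 2 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix} \]

$R_1 + R_2 \to R_1$:

\[ \text{RREF} = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix} \]

Example 2 over $GF(4) = \{0, 1, a, b\}$. An REF of:

\[ A = \begin{pmatrix} a & b & 1 \\ 1 & b & 0 \\ b & 0 & 1 \end{pmatrix} \]

is

\[ B = \begin{pmatrix} a & b & 1 \\ 0 & 1 & b \\ 0 & 0 & 0 \end{pmatrix} \]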

Find RREF:

$bR_1 \to R_1$:

\[ \begin{pmatrix} 1 & a & b \\ 0 & 1 & b \\ 0 & 0 & 0 \end{pmatrix} \]

$aR_2 + R_1 \to R_1$:

\[ \text{RREF} = \begin{pmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 0 \end{pmatrix} \]

Proposition: Let H be a nonempty subset of V(n, 2). The following are equivalent:
1. H is closed under vector addition.
2. H is a subgroup of V(n, 2).
3. H is a subspace of V(n, 2).

Proof:

1 ⇒ 2: It suffices to show that H is closed under vector addition and inverses. But in V(n, 2), −x = x, so closure under inverses is automatic.

2 ⇒ 3: It suffices to show that H is closed under vector addition and scalar multiplication. Since H is a subgroup, it is closed under vector addition. The only scalars are 0 and 1. For any u ∈ H, 0u = 0 ∈ H since H is a subgroup, and 1u = u.

3 ⇒ 1: Obvious, since any subspace is a vector space and is thus closed under vector addition. □

The Proposition is false for V(n, 4). Example: {00, 11} is closed under addition but is not a subspace because a(11) = aa is not in the subset.

The marriage of algebra and coding theory:

Defn: A linear code C over GF(q) is a subspace of V(n, q). If C has dimension k, we say that C is an [n, k]-code over GF(q) or an $[n, k]_q$-code.

Note: An $[n, k]_q$-code is also an $(n, q^k)$-code. Do not confuse the notations: (·, ·) for general codes and [·, ·] for linear codes.

Note: we always have k ≤ n since dim(V(n, q)) = n.

General example: the span of any subset of V(n, q) is a linear code (and conversely).

Examples:

V(n, 2):

– The 3-repetition code {000, 111} is closed under vector addition (check the addition table) and is therefore a linear code. And {111} is a basis, so it is a [3, 1] binary code.

– The code $C_3$ = {00000, 01101, 10110, 11011} is closed under vector addition (check the addition table) and is therefore a linear code. And {01101, 10110} is a basis, so it is a [5, 2] binary code.

V(n, 3):

– C = {000, 111, 222} is closed under vector addition and scalar multiplication and so is a linear code. And {111} is a basis, so it is a $[3, 1]_3$-code.

– C = $\langle 022, 210, 112 \rangle$ is a linear code. By REF, we found that {210, 022} is a basis, so it is a $[3, 2]_3$-code and $|C| = 3^2 = 9$.

V(n, 4):

– C = $\langle ab1, 1b0, b01 \rangle$ is a linear code. By REF, we found that {ab1, 01b} is a basis, so it is a $[3, 2]_4$-code and $|C| = 4^2 = 16$.
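As a sanity check on the count $|C| = q^k$, the following sketch lists every linear combination of a basis over $\mathbb{Z}_p$. It is an illustration only; the helper name `span_mod_p` is hypothetical, and it assumes p is prime (so it applies to the GF(3) example, not the GF(4) one).

```python
from itertools import product

def span_mod_p(basis, p):
    """All linear combinations of the basis vectors over Z_p, as a set of tuples."""
    n = len(basis[0])
    code = set()
    for coeffs in product(range(p), repeat=len(basis)):
        word = tuple(
            sum(c * v[i] for c, v in zip(coeffs, basis)) % p for i in range(n)
        )
        code.add(word)
    return code

# The [3, 2]_3-code with basis {210, 022}.
C = span_mod_p([(2, 1, 0), (0, 2, 2)], 3)
print(len(C))            # number of codewords
print((1, 1, 2) in C)    # is the third spanning vector 112 in the span?
```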

Advantage 1 of a linear code: we can define an $[n, k]_q$-code by specifying a basis of the code, i.e., k codewords, instead of listing all $q^k$ codewords of the code.

– Preceding example: The $[3, 2]_4$-code is specified by 2 codewords of length 3, but there are $4^2 = 16$ codewords.

– Reed-Solomon code: a $[255, 223]_{256}$-code is specified by 223 codewords instead of $256^{223}$ codewords.

Defn: The weight of a word $x = x_1 \ldots x_n$, denoted wt(x), is the number of nonzero symbols in x.

Example: over GF (3) = Z3, wt(02120) = 3. Note: we defined this earlier only for binary words. Proposition: For x, y ∈ V (n, q), d(x, y) = wt(x − y) (using subtraction in V (n, q))

Proof: For each $i$, $(x - y)_i \neq 0$ iff $x_i \neq y_i$. Thus, the number of positions in which x and y differ equals the number of nonzero symbols in x − y. □

Recall that for binary words d(x, y) = wt(x + y). This is consistent since in V(n, 2), x = −x.

Defn: for a linear code C, the weight of C is

\[ wt(C) = \min_{x \in C,\; x \neq 0} wt(x) \]

Proposition: For a linear code C, d(C) = wt(C).

Proof: Let x, y ∈ C, x ≠ y, achieve d(C), i.e., d(C) = d(x, y). Then

d(C) = d(x, y) = wt(x − y) ≥ wt(C)

since x − y ∈ C and x − y ≠ 0.

Let x ∈ C s.t. wt(C) = wt(x). So, x ≠ 0. Then

wt(C) = wt(x) = d(x, 0) ≥ d(C)

since x, 0 are distinct elements of C. □

Advantage 2 of a linear code: it is easier to compute d(C): you need only consider $|C|$ words instead of $\binom{|C|}{2}$ pairs of words. Later we will find an even better way of computing d(C) for linear codes.
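The proposition d(C) = wt(C) can be checked by brute force on a small code. The sketch below is illustrative only; the helper names `wt` and `dist` are chosen here, and the code is the binary $C_3$ from the examples above.

```python
from itertools import combinations

def wt(x):
    """Number of nonzero symbols in the word x."""
    return sum(1 for s in x if s != 0)

def dist(x, y):
    """Hamming distance: number of positions where x and y differ."""
    return sum(1 for s, t in zip(x, y) if s != t)

C3 = [(0, 0, 0, 0, 0), (0, 1, 1, 0, 1), (1, 0, 1, 1, 0), (1, 1, 0, 1, 1)]

# wt(C): scan the nonzero codewords (|C| - 1 of them).
min_weight = min(wt(x) for x in C3 if any(x))
# d(C): scan all unordered pairs of distinct codewords.
min_distance = min(dist(x, y) for x, y in combinations(C3, 2))
print(min_weight, min_distance)
```

The weight computation touches 3 words, while the distance computation touches 6 pairs; the gap grows quadratically with |C|.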

Notation: An $[n, k, d]_q$-code is an $[n, k]_q$-code with minimum distance = d.

Defn: A generator matrix for a linear code C is a matrix whose rows form a basis for C.

How to find a generator matrix for a linear code, defined as the span of a set of vectors: find an REF for a matrix whose rows are these vectors, and then delete the zero rows of the REF.

Note: A generator matrix for an [n, k]-code is a k × n matrix. Since k ≤ n, G has the form:

\[ G = \begin{pmatrix} * & * & * & * & * & * \\ * & * & * & * & * & * \\ * & * & * & * & * & * \end{pmatrix} \]

Use a generator matrix to encode arbitrary user messages to codewords:

Given an $[n, k]_q$-code C, with generator matrix G, define the encoder

\[ E : V(k, q) \to V(n, q), \qquad E(u) = uG \]

Proposition: The range of the encoder is C, and the encoder is a 1-1 function.

Proof: Write

\[ G = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_k \end{pmatrix} \]

For all $u \in V(k, q)$,

\[ E(u) = \sum_{i=1}^{k} u_i R_i \]

Thus the range of the encoder is the set of all linear combinations of rows of G, which is the span of the rows of G, which is C since the rows of G form a basis for C.

Suppose that E(u) = E(v). Then uG = vG and so (u − v)G = 0. So,

\[ \sum_{i=1}^{k} (u_i - v_i) R_i = 0. \]

Since the rows of G form a basis, they are linearly independent and so each $u_i - v_i = 0$, and so u = v. So, the encoder is 1-1. □

Note: Since k ≤ n, we are expanding information messages in order to achieve error correction (protection). But as discussed earlier in the course, the error protection can be traded for increased user information rate.

In the case q = 2, we encode arbitrary k-bit messages to n-bit codewords. This gives a rate k : n encoder. The higher the ratio k/n, the more efficient the encoding is.

Example: 3-repetition code, a [3, 1] binary code. Gives a rate 1:3 encoder.

\[ G = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \]

E(0) = 0[111] = 000, E(1) = 1[111] = 111

Example: $C_3$, a [5, 2] binary code. Gives a rate 2:5 encoder.

\[ G = \begin{pmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 \end{pmatrix} \]

E(00) = (00)G = 00000
E(01) = (01)G = 01101
E(10) = (10)G = 10110
E(11) = (11)G = 11011

Example (over $GF(3) = \mathbb{Z}_3$): C = $\langle 022, 210, 112 \rangle = \langle 210, 022 \rangle$.

\[ G = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 2 \end{pmatrix} \]

E : V(2, 3) → V(3, 3)

E(21) = (21)G = 2(210) + (022) = (420) + (022) = (120) + (022) = 112

Example (over GF(4)): C = $\langle ab1, 1b0, b01 \rangle = \langle ab1, 01b \rangle$, and {ab1, 01b} is a basis.

\[ G = \begin{pmatrix} a & b & 1 \\ 0 & 1 & b \end{pmatrix} \]

E : V(2, 4) → V(3, 4)

E(ab) = (ab)G = a(ab1) + b(01b) = $(a^2, ab, a) + (0, b, b^2)$ = (b, 1, a) + (0, b, a) = (b, a, 0)
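The encoder E(u) = uG is a single vector-matrix product. Below is a minimal sketch over $\mathbb{Z}_p$ (p prime), so it covers the binary and GF(3) examples but not GF(4); the function name `encode` is hypothetical.

```python
from itertools import product

def encode(u, G, p):
    """E(u) = uG over Z_p: the linear combination sum_i u_i * (row i of G)."""
    n = len(G[0])
    return tuple(sum(ui * row[j] for ui, row in zip(u, G)) % p for j in range(n))

# GF(3) example: G has rows 210 and 022.
G3 = [(2, 1, 0), (0, 2, 2)]
print(encode((2, 1), G3, 3))

# Binary code C3: G has rows 10110 and 01101.
G2 = [(1, 0, 1, 1, 0), (0, 1, 1, 0, 1)]
print(encode((1, 1), G2, 2))

# The encoder is 1-1: q^k messages give q^k distinct codewords.
codewords = {encode(u, G3, 3) for u in product(range(3), repeat=2)}
print(len(codewords))
```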

Defn: Generator matrix in standard form: $G = [I_k \mid A]$, where $I_k$ is the k × k identity matrix and A is a k × (n − k) matrix.

There are linear codes that do not have generator matrices in standard form. However,

Prop: For every linear code C, there is a linear code C′, equivalent to C by a permutation of codeword positions, s.t. C′ has a generator matrix in standard form.

Proof:

– Find RREF.

– Delete zero rows.

– Permute the columns so that the columns containing the leading entries come first; then $I_k$ appears in the leftmost k columns of the matrix. □

The encoder corresponding to a generator matrix in standard form is a systematic encoder:

\[ E(u) = u[I_k \mid A] = [u, uA]. \]

So, the user message is transparent in the codeword. We are essentially encoding by appending “parities” to the user message.

Advantage 3 of a linear code: easy to encode and to “uncode.”
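A systematic encoder can be sketched directly from the formula E(u) = [u, uA]. This is an illustration over $\mathbb{Z}_p$ (p prime) with hypothetical function names `encode_systematic` and `uncode`. It uses the observation that the generator matrix of $C_3$ given earlier, with rows 10110 and 01101, is already in standard form with A having rows 110 and 101.

```python
def encode_systematic(u, A, p):
    """E(u) = [u, uA] over Z_p, for a generator matrix G = [I_k | A]."""
    parities = tuple(
        sum(ui * row[j] for ui, row in zip(u, A)) % p for j in range(len(A[0]))
    )
    return tuple(u) + parities       # message symbols first, then the parities

def uncode(codeword, k):
    """Recover the user message: it is the first k symbols of the codeword."""
    return tuple(codeword[:k])

A = [(1, 1, 0), (1, 0, 1)]           # right-hand block of C3's generator matrix
c = encode_systematic((1, 1), A, 2)
print(c)
print(uncode(c, 2))
```

Note that `uncode` only strips the parities from an error-free codeword; it does not perform any error correction.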
