A STUDY OF SECRECY CODES AND THEIR REAL-WORLD PERFORMANCE

BY

JAYADEV VASANTH NAIR

A THESIS SUBMITTED TO THE GRADUATE FACULTY OF THE

UNIVERSITY OF COLORADO COLORADO SPRINGS

IN PARTIAL FULFILLMENT OF THE

REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING

2017

THIS THESIS FOR THE MASTER OF SCIENCE DEGREE BY

JAYADEV VASANTH NAIR

HAS BEEN APPROVED FOR THE

DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING

BY

WILLIE K. HARRISON, CHAIR

MARK A. WICKERT

M. SCOTT TRIMBOLI

Date: 05/04/2017

Vasanth Nair, Jayadev (M.S. Electrical Engineering)

A Study of Secrecy Codes and Their Real-World Performance

Thesis directed by Assistant Professor Willie K. Harrison

ABSTRACT

This paper presents established works in the physical-layer security and information theory domains, discussing concepts and definitions relevant to the experimental setup. We discuss the wiretap channel model, its relevance, and the secrecy coding methods employed. These methods are then applied to specific code constructions, such as LDPC and Reed-Muller codes, and their performance over binary erasure channels (BEC) and binary symmetric channels (BSC) is studied using error rate curves. The study is then extended to a Gaussian channel with the help of radios, and the transmission performance is examined. With this information, we seek to arrive at a ranking of codes, to determine whether there is a code better suited to secrecy applications.

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my advisor Dr. Willie Harrison, for his guidance, support, and enthusiasm. Dr. Harrison's willingness to help out with every roadblock I've hit through this research has been immense, and it has gone a long way in helping me achieve my goals. I would also like to thank Sam Schmidt, for his endless and patient support during my Master's course, and for always being only a phone call away for help. Above all, I would like to thank my family: my better half Lekshmi Prathap, my parents Jayashree Narayanan and Vasanth Kumar Nair, and my brother Jayanth Nair, for all that they have blessed me with.

TABLE OF CONTENTS

CHAPTER

I INTRODUCTION

II PHYSICAL LAYER SECURITY
    Linear Block Codes
    Wiretap Channel
    Secrecy and Secrecy Coding Fundamentals

III CODING STRATEGY
    Syndrome Coding
    Linear Codes to be Used in Secrecy Designs
    Low-Density Parity Check Codes
    Reed-Muller Codes
    Other Codes
    Bit Error Rate (BER) Curves

IV EXPERIMENTS AND RESULTS
    System Description
    LDPC Construction
    Reed-Muller Code Construction
    Random and Worst Code Constructions
    Simulation Setup
    Radio Transmission Setup
    Results
    Simulation Results
    Radio Transmission Results

V CONCLUSION AND FUTURE WORK
    Conclusion
    Future Work

BIBLIOGRAPHY

LIST OF FIGURES

FIGURE

2.1 A standard array table [21].
2.2 The wiretap channel model.
2.3 The BSC wiretap channel model.

3.1 The BEC channel model.
3.2 Syndrome table [21] with M codewords per coset, and N cosets.
3.3 Tanner graph for the parity check matrix in (3.7). The nodes on the right-hand side are the check nodes, while the nodes on the left are the variable nodes.
3.4 The Reed-Muller code is found to have the highest equivocation rates amongst all (8, 4) codes.

4.1 GNU Radio transmitter flowgraph.
4.2 GNU Radio receiver flowgraph.
4.3 Error performance of all (8, 4) codes over the binary erasure channel.
4.4 Error performance of all (128, 64) codes over the binary erasure channel.
4.5 Error performance of all (8, 4) codes over the binary symmetric channel.
4.6 Error performance of all (128, 64) codes over the binary symmetric channel.

CHAPTER I

INTRODUCTION

Cryptography, in general, has taken on several different methods of implementation given the varying requirements for privacy and confidentiality today. With the Internet continually growing in complexity, calls for measures that guarantee data security are more prominent. One of the earliest forms of encryption was the mono-alphabetic substitution cipher, and encryption has since evolved into much larger and more complex forms with the advent of military strategies, and technology in general. Regardless of complexity, when attempting to transmit data, the two criteria relevant to this discussion are 'reliability' and 'security': how reliably can data be sent across a channel, such that the intended receiver is able to correctly decipher and decode the data, and how well does the encryption scheme guarantee that the data is not compromised to a third party?

Unlike other implementations of cryptography, physical layer security is a technique that specifically targets the physical layer of a communication system, exploiting characteristics like thermal noise, interference, and the time-varying nature of fading channels [3]. This technique also envelopes the addition of coding strategies to the transmitted signal which further exploit the channel properties to ensure that a third party, also termed an eavesdropper, is unable to detect or decode the message. Such a method allows the physical location of the eavesdropper to be detrimental to its attempt to decipher the message.

The wiretap channel model introduced by Wyner in [24] is an example of a system that exploits channel conditions to ensure that the eavesdropper is not able to read any transmitted message. This model works on the assumption that the channel between the sender and the eavesdropper is 'noisier' than the channel between the sender and the legitimate receiver.
For discrete memoryless channels, a channel property that is often related to its 'noisiness' is its crossover/erasure probability, i.e., the probability with which a transmitted bit is flipped or lost. Therefore, if the sender-eavesdropper channel has a greater chance of a bit-flip or a bit-loss than the sender-receiver channel, then that channel is said to be 'noisier'.

As [23] describes, there are two major reasons why a wiretap channel setting is of relevance: (1) no assumptions are made regarding the eavesdropper's computational ability, and (2) there is no key distribution between the sender and receiver. Reason (1) essentially implies information-theoretic security, where the eavesdropper, even with unlimited computational power, can never have enough information to retrieve the message. The eavesdropper could also potentially have access to the encoding/decoding scheme that is followed. Therefore, the intention behind using a wiretap model is solely to disarm the eavesdropper's ability to decipher the correct message, by using channel noise to distort the message.

Secrecy codes are generally designed for such a channel model, and this is where the two criteria that we introduced earlier, reliability and security, come into play. The aim of any secrecy code is not only to meet the reliability criterion, but also to ensure that in doing so, security is not compromised. The channel capacity, as Shannon describes in [18], is the maximum rate at which the reliability criterion can be met. The maximum achievable rate that also meets the security criterion, in a wiretap channel model, was found to be the difference between the receiver channel capacity and the eavesdropper's (or wiretapper's) channel capacity, under certain specific conditions [23]. This rate is labeled the secrecy capacity. This capacity was shown in [8] to be zero unless the eavesdropper's channel is noisier than the main channel.
Wyner, in [24], explained how secrecy capacity is achieved by means of an encoder that separates codewords into cosets, allowing one message to be mapped to one of several random codewords within a large coset. This allows data to be reliably decoded by the legitimate receiver, while guaranteeing secrecy through the randomness that each sub-code provides. This forms the basis for the coding scheme that is relevant to our discussion, coset coding.

In this paper, we attempt to study different code constructions, implemented through coset coding, and their performance over certain channels, in an effort to discern whether there is a better or best code, under fixed block lengths, for secrecy applications. The paper discusses the expected performances, determined by simulated error curves, and attempts to corroborate these with results from a real-world environment test using Universal Software Radio Peripheral (USRP) boards.

CHAPTER II

PHYSICAL LAYER SECURITY

Physical layer security, as introduced earlier, focuses on the physical layer of a communication system to achieve security. This technique exploits characteristics like noise and interference, such that the channel statistics of a receiver allow system designers to make information-theoretic security guarantees, implying that the eavesdropper can never have enough information to successfully decode the message. This chapter will provide a background into how this can be achieved in a communication system, and how the addition of certain coding methods can improve secrecy. Before we begin describing the systems and code constructions we employ in this paper, it is perhaps vital to know and understand what comprises a code.

2.1 Linear Block Codes

In our experiments, and in general, the transmitter aims to encode a certain message into a codeword, transmit it over a channel, and subsequently have the receiver decode the message without error, even when the channel is noisy. Linear codes determine how a message is mapped to a codeword. Throughout this paper we use only binary alphabets in {0, 1} to represent our messages and codewords.

For k ≤ n, an (n, k) code C is a code for which a k-bit message is mapped to an n-bit codeword. So for a code that takes k-bit inputs, the total number of possible codewords is

N = 2^k,    (2.1)

each of which is n bits long. This code is linear if and only if any linear combination of codewords results in another codeword. In other words, the (binary) addition of any two codewords in C should give another codeword that is also in C. All linear block codes are therefore closed under binary addition.

Error-correction codes are codes that allow for error correction and detection, by adding redundancy to the data being transmitted. Such codes often employ parity bits, which can be used in detecting and correcting errors by checking for even or odd parity. An example of a code that does not use parity bits is the repetition code. In this scheme, every bit, or block of bits, that is to be transmitted is repeated several times. For example, to transmit a message 101, this scheme produces a codeword by repeating each bit, say 3 times, such that the transmitted codeword is 111 000 111. If the received codeword is 110 001 101, the decoding procedure is to simply use majority logic to determine the value of each information bit. In this case, the majorities in the repetition for each bit allow for correct decoding. Repetition codes generally show very poor error-correcting performance and are overlooked in favor of other, more effective error-correction codes.

All codewords for a particular code C can be obtained from a set of linearly independent vectors, known as a basis. These vectors, when represented as rows of a matrix, form a generator matrix G. An (n, k) code will therefore have a k × n generator matrix, so that every codeword can be obtained by the relation

c = mG,    (2.2)

where m is the k-bit message, and c is the n-bit codeword. If the generator matrix is composed of rows {g_0, g_1, ..., g_{k−1}}, and the message m is represented by bits {m_0, m_1, ..., m_{k−1}}, then the codeword is

c = mG = m_0 g_0 + m_1 g_1 + ... + m_{k−1} g_{k−1}.    (2.3)
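To make (2.2) and (2.3) concrete, the sketch below encodes a message with a small generator matrix. The (6, 3) matrix G here is a hypothetical example chosen for illustration; it is not claimed to be the code of Figure 2.1.

```python
import numpy as np

# Hypothetical (6, 3) systematic generator matrix G = [I_3 | P];
# any full-rank binary matrix would do -- this one is illustrative only.
G = np.array([[1, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]])

def encode(m, G):
    """Map a k-bit message m to an n-bit codeword via c = mG (mod 2)."""
    return np.mod(m @ G, 2)

m = np.array([1, 0, 1])
c = encode(m, G)
print(c)  # [1 0 1 0 1 1]

# Linearity check: the sum of two codewords encodes the sum of the messages.
m2 = np.array([0, 1, 1])
assert np.array_equal(np.mod(encode(m, G) + encode(m2, G), 2),
                      encode(np.mod(m + m2, 2), G))
```

The final assertion spells out the closure property: binary addition of any two codewords yields the codeword of the summed messages.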

The rate of the code R is then obtained by the relation R = k/n. The weight of a codeword is the number of non-zero elements in that codeword, while the distance of one codeword from another is the number of elements in which the two codewords differ. An important parameter, that we'll touch on later in this paper, is the minimum distance, which for a linear code is the minimum weight of its non-zero codewords. For every generator matrix G for a code C, there exists an (n − k) × n matrix H, known as the parity-check matrix, whose rows span the nullspace of G, which can be mathematically shown as

GH^T = 0,    (2.4)

where H^T is the transpose of the matrix H. This implies that a length-n word c is a codeword if and only if

cH^T = 0.    (2.5)

When a parity-check matrix H for a code C is used as the generator matrix for a different code, that code is known as the dual code of C, denoted C⊥. Therefore, if G is the generator matrix for an (n, k) code C, then H is the generator matrix for its (n, n − k) dual code C⊥.

Every generator matrix G has a systematic representation, where the matrix is of the form

G = [ I_k | P ],    (2.6)

where I_k is the k × k identity matrix, and P is a k × (n − k) matrix of parity bits. If G was not designed with this form, it can always be reduced to it by performing row reduction and column reordering until the identity matrix is obtained. Similarly, the parity-check matrix in its systematic representation is of the form

H = [ −P^T | I_{n−k} ],    (2.7)

where −P^T is the transpose of the k × (n − k) matrix P (in the binary case, −P^T = P^T). Suppose we transmit a codeword c ∈ C. Let the received vector r be defined as

r = c + e,    (2.8)

where e is the error vector. We compute a vector s such that

s = rH^T.    (2.9)

This can be further simplified as

s = (c + e)H^T = cH^T + eH^T.    (2.10)

Using (2.5), (2.10) reduces to

s = eH^T.    (2.11)

The vector s thus obtained is the syndrome for the received vector r. The syndrome allows us to identify the error pattern in the received vector and thereby correct the codeword. All codewords of C have a syndrome equal to zero. Multiple vectors can have the same error pattern, and therefore the same syndrome. This can be visualized by constructing a standard array table, as shown in Figure 2.1, constructed for a (6, 3) code.

FIGURE 2.1: A standard array table [21].

This table is constructed by listing all the codewords in the first row of the table. The vectors in the first column are the possible error patterns, and they are generally chosen to be of minimum weight. Each row forms a coset, and the vectors in the first column of each row are called the coset leaders. The code constructions that we will follow in this paper seek to use such a standard array table to achieve security; this concept will become clearer as we go through the next few sections.
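The syndrome relation (2.11) and the coset-leader idea behind the standard array can be sketched as follows. The (6, 3) parity-check matrix here is a hypothetical example of the form H = [P^T | I_3], not the code of Figure 2.1.

```python
import numpy as np

# Hypothetical (6, 3) parity-check matrix, H = [P^T | I_3].
H = np.array([[1, 0, 1, 1, 0, 0],
              [1, 1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0, 1]])

def syndrome(r, H):
    """s = r H^T (mod 2); zero if and only if r is a codeword."""
    return tuple(np.mod(r @ H.T, 2))

# Coset-leader table: map each syndrome to a minimum-weight error pattern,
# exactly the first column of the standard array.
leaders = {syndrome(np.zeros(6, dtype=int), H): np.zeros(6, dtype=int)}
for i in range(6):  # all weight-one error patterns
    e = np.zeros(6, dtype=int)
    e[i] = 1
    leaders.setdefault(syndrome(e, H), e)

c = np.array([1, 0, 1, 0, 1, 1])                  # a codeword of this code
r = np.mod(c + np.array([0, 0, 0, 1, 0, 0]), 2)   # one bit flipped in transit
corrected = np.mod(r + leaders[syndrome(r, H)], 2)
print(corrected)  # [1 0 1 0 1 1] -- the flipped bit is repaired
```

Since all six columns of this H are distinct and non-zero, every single-bit error pattern lands in its own coset and is corrected by its leader.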

2.2 Wiretap Channel

The wiretap channel model, as shown in Figure 2.2, was introduced by Wyner in 1975. This model employs the definitions of physical layer security to achieve secrecy, by exploiting channel characteristics that would impact a wiretapper's ability to retrieve information. The wiretapper (eavesdropper) is assumed to have unlimited computational power, and full knowledge of the codebooks used by the legitimate transmitter and receiver. The sender and the intended receiver communicate over the channel that we label the main channel, while the wiretapper attempts to tap the transmission through a separate channel that we call the wiretapper's channel, or wiretap channel.

FIGURE 2.2: The wiretap channel model.

Eliminating the use of keys and key distribution, this model was introduced with the full intention of achieving secrecy purely by exploiting the channel properties of the eavesdropper. The sender attempts to encode a binary message M, by means of an encoding scheme which we'll discuss in detail in a later section, into a codeword X^n of length n bits. This codeword, subject to noise or any error in the main channel, is received by the intended receiver as Y^n. The eavesdropper attempts to intercept this message over a separate, noisier channel, and we label the eavesdropper's observation Z^n.

The goal, therefore, is to design a system where the intended receiver is able to decode the transmitted codeword with near-zero probability of error, and at the same time ensure that the eavesdropper can never recover it. The secrecy codes we define should ideally satisfy both conditions under certain channel conditions: the code must provide enough ambiguity that the eavesdropper cannot recover the message, yet still allow the legitimate receiver to decode it. The transmitter will, therefore, attempt to use code transmission rates that maximize the eavesdropper's uncertainty about the message. If we consider the main channel and the wiretapper's channel to be binary symmetric channels (BSC), as we see in Figure 2.3, a 'noisier' channel for the eavesdropper implies that the crossover probability of the wiretapper's BSC is greater than that of the main channel BSC. Realizing the model through

FIGURE 2.3: The BSC wiretap channel model.

such BSCs can help us simplify the results of [8], from which we understand that the maximum achievable rate that meets reliability and security requirements over the wiretap channel, referred to as the secrecy capacity, is the difference in capacities of the two channels. This relation is true only if both channels are weakly symmetric [2]. The channel capacity, as Shannon discovered in [18], for the main channel is

1 − H_2(p_R),    (2.12)

where H_2(p_R) is the binary entropy function with probability p_R, defined as

H_2(p_R) = −p_R log_2(p_R) − (1 − p_R) log_2(1 − p_R).    (2.13)

Similarly, for the wiretapper's channel, the capacity is 1 − H_2(p_A). The secrecy capacity C_s, therefore, is the difference in capacities of these two channels, and is

C_s = C_main − C_wiretap = H_2(p_A) − H_2(p_R).    (2.14)
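The capacity expressions (2.12)-(2.14) are straightforward to evaluate numerically. The crossover probabilities below are illustrative values of our own choosing, not parameters used in the thesis.

```python
import math

def H2(p):
    """Binary entropy function (2.13), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Illustrative crossover probabilities (assumed, not from the thesis):
p_R = 0.05   # main-channel BSC
p_A = 0.20   # wiretapper-channel BSC (noisier, as the model requires)

C_main = 1 - H2(p_R)
C_wiretap = 1 - H2(p_A)
C_s = C_main - C_wiretap   # (2.14): equals H2(p_A) - H2(p_R)
print(round(C_s, 4))
```

Note that C_s > 0 here only because p_A > p_R; swapping the two probabilities drives the secrecy capacity negative, consistent with the result of [8] that it is zero unless the eavesdropper's channel is noisier.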

Code constructions that have since been defined aim to achieve this capacity by providing reliability and security at rates approaching C_s. In its simplest form, the main channel can be noiseless, so that the probability of error is zero, whereas the wiretapper's channel can be a BSC. Wyner addresses this specific model in his paper and proves that for such a system, as the codeword length increases, the uncertainty of the message associated with the eavesdropper increases and approaches the source entropy, implying that despite observing Z^n, the uncertainty is too large for any information about the message to be leaked to the eavesdropper. This uncertainty, also referred to as the equivocation, is defined as

H(M|Z^n) = Σ_{M,Z^n} p(M, Z^n) log [ p(Z^n) / p(Z^n, M) ],    (2.15)

which essentially quantifies the information needed to describe M, given Z^n. H(M|Z^n) = 0 implies that knowledge of Z^n can completely determine M, whereas H(M|Z^n) = H(M) implies that M cannot be determined even with full knowledge of Z^n. To understand how secrecy codes provide information-theoretic security, it is important to understand the established security constraints for achieving information-theoretic security over the wiretap channel.

2.3 Secrecy and Secrecy Coding Fundamentals

Shannon in [19] defines perfect secrecy as the secrecy achieved when the eavesdropper attains no information about the message, even if the eavesdropper knows the codeword precisely. If X^n is the transmitted codeword for a message M, then the condition for perfect secrecy is

H(M|X^n) = H(M).    (2.16)

This implies that the mutual information between the codeword and the mes- sage, I(M; Xn)= H(M) − H(M|Xn), (2.17) Chapter II. PHYSICAL LAYER SECURITY 12 must be zero. In such a situation, the only way the attacker can decode the message is to guess it. It is when we attempt to describe what is required to attain and maintain perfect secrecy as described above, that we run into the reasons why such a system cannot be realized. To build this, there must exist at least the same number of keys as number of messages, and the equivocation associated with the key H(K) should be equal to or greater than the equivocation associated with the message [19], i.e., H(K) ≥ H(M). (2.18)

The one-time pad is an example of a scheme that can achieve perfect secrecy yet, practically speaking, cannot be built. This scheme employs XOR operations to convert the message to its respective codeword, such that for a message M and a key K,

X = M ⊕ K. (2.19)

For example, let the message to be transmitted be

M = (101100101), and the key uniquely generated for this message be

K = (011011001), then the transmitted codeword, as per (2.19), is

X = (110111100).
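The one-time pad example above can be reproduced directly. The bit strings are those from the text; `xor_bits` and the key-generation step are illustrative helpers of our own.

```python
import secrets

def xor_bits(a, b):
    """Bitwise XOR of two equal-length bit strings, as in (2.19)."""
    return ''.join(str(int(x) ^ int(y)) for x, y in zip(a, b))

M = '101100101'            # message from the example above
K = '011011001'            # the pre-shared, perfectly random key
X = xor_bits(M, K)
print(X)                   # '110111100', matching the text
assert xor_bits(X, K) == M # the receiver recovers M with the same key

# In practice a fresh key must be drawn for every message:
K_new = ''.join(str(secrets.randbelow(2)) for _ in range(len(M)))
```

The round trip works because XOR is its own inverse: X ⊕ K = (M ⊕ K) ⊕ K = M.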

As discussed earlier, for this scheme to achieve perfect secrecy, K must be perfectly random, at least as long as M, and must satisfy (2.18). In other words, to attain perfect secrecy, the transmitter and the receiver must first securely share a key that is as long as the message. Thus, the one-time pad does not solve the fundamental problem of communicating a message with perfect secrecy.

Weak secrecy, defined by Wyner in [24], requires the rate of mutual information between the message and the noisy codeword that the eavesdropper observes to go to zero as the codeword length increases, i.e.,

lim_{n→∞} (1/n) I(M; Z^n) = 0.    (2.20)

The notion of strong secrecy was then suggested, requiring the total mutual information between the message and the eavesdropper's observation to reduce to zero as the codeword length increases, denoted as

lim_{n→∞} I(M; Z^n) = 0.    (2.21)

It is also worth noting that, while these are different metrics, the maximum rates for achieving both reliability and secrecy are theoretically the same for both; this was studied and proven in [13]. To construct secrecy codes, it is necessary that any (or all) of the security constraints mentioned above are met. As discussed in [3], using standard error-correcting codes designed to achieve reliability is not sufficient to ensure information-theoretic security. Therefore, to achieve secrecy, we adopt the technique Wyner introduced in [24], where we encode the message with one of several possible codewords. Wyner states and proves that when the encoder output is a randomly chosen member of the coset related to the message, it is possible to reliably transmit information at rates approaching the secrecy capacity C_s. Such a system is characterized by an error probability

P_e, a transmission rate R, and an equivocation ∆, so that for a small positive number ε and an equivocation rate d,

kH(M)/n ≥ R − ε,   (rate specification)
∆ ≥ d − ε,         (security constraint)    (2.22)
P_e ≤ ε,           (reliability constraint)

where k is the message length, n is the codeword length, and H(M) is the equivocation associated with the message.

Cohen and Zemor further studied and discussed the method of coset coding in [7] and [6]. These papers aimed to use linear codes to achieve the secrecy capacity, and invoke syndrome coding to minimize information leakage to the eavesdropper. In [6], the authors arrive at the conclusion that, even with possession of Z^n and its syndrome, the eavesdropper has no advantage in discerning the message. We will discuss syndrome coding in detail, with examples, in the next chapter, and thereby get a better understanding of how the results of [6] hold. This approach has since been extensively used with the help of LDPC codes [14, 4] and polar codes. While there have been efforts to achieve a higher grade of secrecy with different mechanisms, as [23] states, these were found to be unsuitable or had unrealizable decoding.

With the understanding that we can attain only a certain grade of secrecy, can we determine if there is a 'best' code? Is there a code that can perform better, in terms of secrecy, than any other? We will introduce and analyze a simple secrecy code construction in Section 3.1, where we will also delve into a deeper analysis of the coset and syndrome coding techniques discussed above, and set the premise for our experiments and simulations. We will attempt to answer these questions by looking at error rates to determine if a certain code or a certain code construction is better suited for secrecy in a given environment. We proceed with the understanding that a good secrecy code, in a noisy channel, should distort the codeword such that the eavesdropper cannot retrieve the actual message.

CHAPTER III

CODING STRATEGY

In this chapter, we will extensively discuss the literature for the coding strategies introduced in the previous chapter, and how these strategies are applied to the wiretap setting. We will also introduce and cover the background for some of the codes that are relevant to our experiments.

3.1 Syndrome Coding

Coset coding is an approach that was introduced by Wyner in [24], where the code design provides security against an eavesdropper in a wiretap setting. The fundamental idea behind coset coding is that the secret message to be sent can be transmitted as one of several codewords from a coset mapped to that message. This aims to raise the uncertainty of the eavesdropper, as it receives a corrupted version of the transmitted codeword.

One way to visualize this coding technique is to consider the wiretapper's channel as a binary erasure channel (BEC). This channel, as shown in Figure 3.1, takes a binary input from {0, 1}, and produces an output in {0, 1, ?}. The ? symbol represents an erasure, and occurs independently with a probability ε. Any BEC with an erasure probability ε, therefore, can be represented as BEC(ε). The eavesdropper would thus observe a codeword with possible erasures. On receiving an erasure, the eavesdropper recognizes that a bit was sent, but has no information regarding its value.

FIGURE 3.1: The BEC channel model.

Consider the collection C′ of all four-bit binary sequences. Let C be a linear block code that forms a subcode of C′. As explained in [11], for any c ∈ C′, c + C forms a coset of C in C′. Let coset C0 be defined as

C0 = C = {0000, 0101, 1010, 1111}.

With linear block codes closed under bitwise XOR addition, as discussed earlier, we can form cosets C1, C2 and C3 by the XOR addition of a codeword c ∈ C′ with C. For this construction, it is important that c not be in C, and that it has not been used in a previously formed coset. For example, C1 is formed by the XOR addition of the codeword 0001, which is not in C, to each codeword in C. Similarly, C2 and C3 can be formed by XOR-adding the codewords 1000 and 1001, respectively, to each codeword in C.

C1 = C + 0001 = {0001, 0100, 1011, 1110}

C2 = C + 1000 = {1000, 1101, 0010, 0111}

C3 = C + 1001 = {1001, 1100, 0011, 0110}

The coset coding technique, for this particular code, uses these cosets to encode messages by randomly picking a codeword from the coset corresponding to a message. If we assign each coset to a message from {0, 1, 2, 3} such

that, for m = i, we use the coset C_i, then any one of the four codewords can be randomly chosen.

Consider the case when the eavesdropper views the observation [?, 1, 1, ?] as the codeword passes through the BEC(ε). We notice that this particular combination of bits and erasures fits codewords in all 4 cosets; in other words, the codeword could represent any of the 4 messages. Hence, no information is leaked to the eavesdropper. For this particular value of n and k, if we were to keep experimenting with different erasure patterns, we would notice that the most the observation Z^n could leak is one bit, as long as we have at least one erasure. This is the mechanism we follow in constructing secrecy codes.

The technique described in the coset coding scheme splits the code into subcodes, or cosets. This is realized with syndromes, by creating a syndrome table for the code, as in Figure 3.2, just as we introduced in Section 2.1.

FIGURE 3.2: Syndrome table [21] with M number of codewords per coset, and N cosets.
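The erasure-ambiguity argument above can be checked mechanically. The sketch below enumerates, for a given erased observation, the messages whose cosets remain consistent with it; the cosets are those listed in the text, while the function names are ours.

```python
import random

# The (4, 2) subcode C and its cosets from the text.
cosets = {
    0: ['0000', '0101', '1010', '1111'],   # C0 = C
    1: ['0001', '0100', '1011', '1110'],   # C1 = C + 0001
    2: ['1000', '1101', '0010', '0111'],   # C2 = C + 1000
    3: ['1001', '1100', '0011', '0110'],   # C3 = C + 1001
}

def encode(m):
    """Coset coding: transmit a uniformly random codeword from coset C_m."""
    return random.choice(cosets[m])

def consistent_messages(observation):
    """Messages whose coset contains a codeword matching the erased view."""
    def fits(cw):
        return all(o == '?' or o == b for o, b in zip(observation, cw))
    return [m for m, cws in cosets.items() if any(fits(cw) for cw in cws)]

# The eavesdropper's view from the text: only the middle two bits survive.
print(consistent_messages('?11?'))   # [0, 1, 2, 3] -- nothing is leaked
```

With no erasures (e.g. observation '0101') the message is pinned down uniquely, which is why the wiretapper's channel must be strictly noisier for this construction to hide anything.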

Every row in the table is a coset. As discussed earlier, the codeword corresponding to a message e_i could be a randomly chosen codeword from the i-th row of the syndrome table. For an (n, n − k) binary linear block code, the number of codewords per coset, therefore, is M = 2^{n−k}, while the number of cosets is N = 2^k.

As mentioned previously, the transmitted codeword X^n is obtained by using an (n − k) × n generator matrix G. In [14], the authors detail the procedure for generating this matrix. As explained, out of the possible 2^k cosets for a code C, for l ≤ k, 2^l cosets are chosen. If G is the (n − k) × n generator

matrix for C, then we choose a matrix G*, an l × n matrix, with l linearly independent rows from {0, 1}^n \ C. Let the rows of G be g_1, g_2, g_3, ..., g_k, and the rows of G* be h_1, h_2, h_3, ..., h_l. We also define a random k-bit vector v. For an l-bit message, denoted s, as [14] explains, the encoding operation is represented by the relation

x = s_1 h_1 + s_2 h_2 + ... + s_l h_l + v_1 g_1 + v_2 g_2 + ... + v_k g_k.    (3.1)

This can also be represented in matrix form as

x = [s v] [G* ; G],    (3.2)

where [G* ; G] denotes the matrix with the rows of G* stacked above the rows of G. The transmitted codeword, therefore, can be represented as

x = sG* + vG.    (3.3)

The syndrome can then be retrieved using the k × n parity check matrix H corresponding to G, as

s = xH^T.    (3.4)

Using (3.3), (3.4) can be simplified as

xH^T = (sG* + vG)H^T = sG*H^T + vGH^T = sG*H^T + 0 = s.    (3.5)

The syndrome s can be directly obtained by the above relation if we construct G* such that

G*H^T = I,    (3.6)

where I is the identity matrix. To summarize, the coset coding scheme has the transmitter and receiver agreeing on a parity check matrix H for a code C. To transmit a k-bit message, the transmitter uses a codeword x chosen randomly from the coset of the corresponding syndrome. The receiver is then able to decode the message from the received codeword by obtaining the syndrome from the relation xH^T.
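A minimal sketch of the encoder (3.3) and decoder (3.4) on a toy (4, 2) instance of our own choosing (not a construction taken from [14]): here C = {0000, 0101, 1010, 1111}, and G* is picked so that (3.6) holds.

```python
import numpy as np

# Hypothetical toy instance of the scheme above, for illustration only.
G      = np.array([[1, 0, 1, 0], [0, 1, 0, 1]])   # generator of C
H      = np.array([[1, 0, 1, 0], [0, 1, 0, 1]])   # parity-check of C
G_star = np.array([[1, 0, 0, 0], [0, 1, 0, 0]])   # rows outside C, G* H^T = I

rng = np.random.default_rng()

def encode(s):
    """x = s G* + v G (mod 2), with v drawn uniformly at random, per (3.3)."""
    v = rng.integers(0, 2, size=G.shape[0])
    return np.mod(s @ G_star + v @ G, 2)

def decode(x):
    """Recover the message as the syndrome s = x H^T (mod 2), per (3.4)."""
    return np.mod(x @ H.T, 2)

# Condition (3.6) holds for this choice of G*:
assert np.array_equal(np.mod(G_star @ H.T, 2), np.eye(2, dtype=int))

s = np.array([1, 0])
print(decode(encode(s)))  # [1 0] regardless of the random v
```

The random vector v moves x around inside the coset without changing its syndrome, which is exactly the randomness that frustrates the eavesdropper.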

3.2 Linear Codes to be Used in Secrecy Designs

In the following subsections, we will discuss the different linear codes available for our use, explain their relevance to our research, and discuss previous works that apply these codes to wiretap models or, broadly, to secrecy applications. The descriptions and definitions for these codes were studied and referenced from [12].

3.2.1 Low-Density Parity Check Codes

Low-density parity check (LDPC) codes, discovered by Gallager in [9], are a family of linear error-correction codes characterized by a sparse parity check matrix. Sparse implies that the number of ones in the parity check matrix is considerably lower than the number of zeros. This work, from the 1960s, initially went unnoticed, as the decoding process involved was considered too complicated for that era. Tanner, in 1981, revisited Gallager's work and reinterpreted it from a graphical perspective [22]. It was not until the late 1990s, though, that these codes rose in prominence, when it was discovered that they achieve error performance close to the Shannon limit, closer than turbo codes, another capacity-achieving code.

LDPC codes are generally specified in terms of their parity check matrices. As [12] defines, the parity check matrix H for an LDPC code has the following properties: (1) each row consists of ρ ones; (2) each column consists of γ ones; (3) the number of ones in common between any two rows, denoted by λ, is no greater than one; and (4) both ρ and γ are small compared to the length of the code and the number of rows in H. If the parity check matrix meets all the above properties, and has the same number of ones in each of its rows and columns, then the LDPC code it generates is said to be regular. If the number of ones in the rows and columns is not uniform, then the code is said to be irregular. In such cases these codes are represented by their degree-distribution polynomials, as shown in (3.8) and (3.9).

Consider the parity-check matrix

$$H = \begin{pmatrix} 1 & 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \end{pmatrix}. \qquad (3.7)$$

The Tanner graph, as seen in Figure 3.3, is a bipartite graph that uses the columns of this parity-check matrix as variable nodes, and the rows as check nodes. Check node i and variable node j are connected only if $H_{ij} = 1$, or in other words, if the corresponding row and column in the parity-check matrix share a one. The connection between a variable node and a check node is called an edge.

FIGURE 3.3: Tanner graph for the parity-check matrix in (3.7). The nodes on the right-hand side are the check nodes, while the nodes on the left are the variable nodes.
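The Tanner-graph construction just described can be sketched in a few lines (a Python sketch, since the thesis's simulations are in MATLAB): each one in H contributes one edge between a check node (row) and a variable node (column).

```python
import numpy as np

# Parity-check matrix from (3.7)
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

# One edge for every H[i][j] = 1: check node i <-> variable node j
edges = [(i, j) for i in range(H.shape[0])
                for j in range(H.shape[1]) if H[i, j] == 1]

print(len(edges))  # 12 edges: each of the three rows contains four ones
```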

These LDPC codes are often characterized by a degree-distribution pair (λ, ρ), with degree distribution polynomials

$$\lambda(x) = \sum_{d=1}^{d_v} \lambda_d x^{d-1}, \qquad (3.8)$$

and

$$\rho(x) = \sum_{d=1}^{d_c} \rho_d x^{d-1}, \qquad (3.9)$$

where $\lambda_d$ and $\rho_d$ denote the fraction of edges connected to degree-d variable nodes and check nodes, respectively. The maximum variable and check node degrees are denoted by $d_v$ and $d_c$, respectively. It is worth noting that different parity-check matrices can have the same degree-distribution pair.

The construction of LDPC codes therefore largely depends on constructing parity-check matrices that satisfy some of the properties mentioned earlier. Some of the known construction techniques target efficient encoding and decoding, or near-capacity performance. A technique for constructing large parity-check matrices, for block lengths on the order of 100,000, is covered in [17].

Given their excellent bit error performance, LDPC codes have found wide-ranging applications within communication systems, attributed to their reliability. Our goal is now to estimate how well they can achieve security. This was covered extensively in [14] and [4]. In [14], the authors utilize the LDPC code constructions suggested in [17] for efficient encoding, to achieve secrecy in special cases where the wiretap channel is a binary erasure channel. LDPC codes were also shown to achieve weak and strong secrecy on a binary erasure wiretap channel in [14] and [4], respectively.
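To make (3.8) and (3.9) concrete, the following Python sketch (a hypothetical helper, not from the thesis) recovers the edge-perspective degree distributions directly from a parity-check matrix. Applied to the matrix in (3.7), it reports, for example, that half of the twelve edges touch degree-2 variable nodes.

```python
import numpy as np

def degree_distributions(H):
    """lambda_d / rho_d: fraction of edges incident to variable / check
    nodes of degree d, i.e. the coefficients in (3.8) and (3.9)."""
    col_deg = H.sum(axis=0)   # variable-node (column) degrees
    row_deg = H.sum(axis=1)   # check-node (row) degrees
    num_edges = H.sum()       # one edge per 1 in H
    lam = {int(d): int((col_deg == d).sum()) * int(d) / num_edges
           for d in set(col_deg.tolist())}
    rho = {int(d): int((row_deg == d).sum()) * int(d) / num_edges
           for d in set(row_deg.tolist())}
    return lam, rho

H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
lam, rho = degree_distributions(H)
# lam == {1: 0.25, 2: 0.5, 3: 0.25}, rho == {4: 1.0}
```

Since every row of this H has degree 4, its ρ(x) is simply $x^3$ in the polynomial notation of (3.9).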

3.2.2 Reed-Muller Codes

As [12] describes, a Reed-Muller code is a prominent linear block code, characterized by multiple-error-correction capability, easy construction, and structural properties that allow it to be decoded with both hard- and soft-decision algorithms. An r-th order Reed-Muller code, denoted by RM(r, m), is defined by:

• Code length $n = 2^m$,

• Dimension $k(r, m) = 1 + \binom{m}{1} + \binom{m}{2} + \cdots + \binom{m}{r}$,

• Minimum distance $d_{\min} = 2^{m-r}$.

For example, for m = 4 and r = 2, the code length n = 16, k(2, 4) = 11, and dmin = 4. The encoder (generator matrix) is constructed using a vector vi of the form

$$v_i = (\underbrace{0 \cdots 0}_{2^{i-1}}, \underbrace{1 \cdots 1}_{2^{i-1}}, \underbrace{0 \cdots 0}_{2^{i-1}}, \ldots, \underbrace{1 \cdots 1}_{2^{i-1}}), \qquad (3.10)$$

where there are $2^{m-i+1}$ alternating all-zero and all-one $2^{i-1}$-tuples. So for our example, we need the four 16-tuples

v4 = (0000000011111111),

v3 = (0000111100001111),

v2 = (0011001100110011),

v1 = (0101010101010101).

Let $v_0$ be the all-one $2^m$-tuple, $v_0 = (1, 1, \ldots, 1)$. For this particular example, these vectors form five rows of the generator matrix. The rest of the rows of the matrix are formed by product vectors,

$$v_{i_1 \cdots i_l} = v_{i_1} v_{i_2} \cdots v_{i_l}, \qquad (3.11)$$

for $1 \le i_1 < i_2 < \cdots < i_l \le m$. Therefore the r-th order RM code, RM(r, m), of length $2^m$ is generated by the set of vectors

$$\{v_0, v_1, v_2, \ldots, v_m, v_1 v_2, v_1 v_3, \ldots, v_{m-1} v_m, \ldots, \text{up to products of degree } r\}. \qquad (3.12)$$

As mentioned earlier, the vectors in (3.12), when arranged as rows of a matrix, form the generator matrix for that RM code. For an RM code RM(r, m),

the generator matrix $G_{RM}(r, m)$ has $\sum_{l=0}^{r} \binom{m}{l}$ rows. An interesting structural property that we then come across is the inclusion chain

RM(0,m) ⊂ RM(1,m) ⊂ ... ⊂ RM(r, m). (3.13)
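The construction above translates almost directly into code. The sketch below is in Python with hypothetical helper names (the thesis itself works in MATLAB); it builds the generator matrix of RM(r, m) from $v_0$ and the products of up to r distinct basis vectors, following (3.10)-(3.12).

```python
import numpy as np
from itertools import combinations

def rm_generator(r, m):
    """Generator matrix of RM(r, m): rows are v0 followed by all products
    v_{i1}...v_{il} of up to r distinct basis vectors (Section 3.2.2)."""
    n = 2 ** m
    # v_i has 2**(m-i+1) alternating all-zero / all-one 2**(i-1)-tuples
    v = {i: np.array(([0] * 2**(i-1) + [1] * 2**(i-1)) * 2**(m-i))
         for i in range(1, m + 1)}
    rows = [np.ones(n, dtype=int)]                    # v0, the all-one tuple
    for l in range(1, r + 1):
        for idx in combinations(range(1, m + 1), l):  # products of degree l
            prod = np.ones(n, dtype=int)
            for i in idx:
                prod = prod * v[i]                    # componentwise product
            rows.append(prod)
    return np.array(rows)

G = rm_generator(2, 4)
print(G.shape)  # (11, 16): k(2, 4) = 11 and n = 2**4 = 16, as in the example
```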

Another structural property of RM codes is that the code RM(m − r − 1, m) is the dual code of RM(r, m). We also note that the zeroth-order RM code is a repetition code, and the (m − 1)th-order RM code is a single-parity-check code [12].

RM decoding is often implemented through a majority-logic algorithm that uses check sums for specific bit positions. Based on the minimum distance of the code, majority logic can be used reliably for a certain number of errors in the codeword. Soft-decision decoding is often preferred over this algorithm, though, as it achieves good error performance with reduced decoding complexity. Decoding methods are explained at length in [12].

In terms of error performance, [16] discusses how Reed-Muller codes outperform polar codes under maximum-a-posteriori (MAP) decoding, a decoding algorithm that determines the codeword by computing the a posteriori probability associated with each bit. Reed-Muller codes have also been found to achieve capacity over the BEC [16].

Our interest in Reed-Muller codes for this study also stems from an interesting result based on the discussions and results in [5]. This paper discusses the need to maximize the eavesdropper's uncertainty regarding the message, given the codeword it has received, and arrives at the equation

$$H(M \mid Z^n = z^n) = k - \mu + \mathrm{rank}(G_\mu), \qquad (3.14)$$

for a BEC wiretap channel, where $z^n \in \{0, 1, ?\}^n$, µ is the number of unerased positions, and $G_\mu$ is the binary matrix of dimension $(n - k) \times \mu$ whose columns are the columns of G corresponding to the unerased bit locations in $Z^n$. This helps us calculate the equivocation for a code over the erasure wiretap channel. The paper goes on to state that, using this calculation of equivocation rates, it is possible to compare all linear block codes of block length n, and estimate which code is better suited for secrecy.
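Equation (3.14) is straightforward to evaluate numerically. The sketch below (Python, with hypothetical helper names; GF(2) rank is computed by Gaussian elimination) applies it to a toy (4, 2) base code. The two checks at the end follow directly from the formula: full erasure gives µ = 0 and rank 0, so all k bits remain unknown, while no erasure gives $k - n + (n - k) = 0$.

```python
import numpy as np

def gf2_rank(A):
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    A = A.copy() % 2
    rank = 0
    for col in range(A.shape[1]):
        pivot = next((r for r in range(rank, A.shape[0]) if A[r, col]), None)
        if pivot is None:
            continue
        A[[rank, pivot]] = A[[pivot, rank]]   # move pivot row up
        for r in range(A.shape[0]):
            if r != rank and A[r, col]:
                A[r] ^= A[rank]               # clear the rest of the column
        rank += 1
    return rank

def equivocation(G, k, erased):
    """(3.14): H(M | Z^n = z^n) = k - mu + rank(G_mu), with G the
    (n - k) x n generator of the base code and `erased` an erasure mask."""
    unerased = [j for j in range(G.shape[1]) if not erased[j]]
    return k - len(unerased) + gf2_rank(G[:, unerased])

# Toy (4, 2) base code (hypothetical): full erasure leaves all k = 2
# message bits unknown; no erasure leaves none.
G = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1]])
assert equivocation(G, 2, [True, True, True, True]) == 2
assert equivocation(G, 2, [False, False, False, False]) == 0
```

Sweeping the erasure masks (weighted by the erasure probability ε) and averaging yields the equivocation curves of Figure 3.4.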

FIGURE 3.4: Normalized equivocation $\frac{1}{n} H(M \mid Z^n)$ versus probability of erasure (ε). The Reed-Muller code is found to have the highest equivocation rates amongst all (8, 4) codes.

For all n = 8, k = 4 codes, the RM code was found to give the highest equivocation rates for any erasure probability over an erasure wiretap channel. This was achieved by using the results of [5], as seen in Figure 3.4, which was obtained by running all (8, 4) linear block codes through the equivocation rate equation defined by (3.14). A high equivocation rate implies that the eavesdropper has greater uncertainty regarding the message, which is favorable in achieving secrecy.

3.2.3 Other Codes

Our experiments also use random codes and a ‘worst’ code; the latter, especially, serves as a reference as we try to estimate whether a code is sufficient for secrecy. A random code is defined by an (n − k) × n generator matrix G, where each entry is drawn from {0, 1}, taking the value 1 with probability p, which in most cases is p = 0.5. A ‘worst’ code is characterized by insufficient parity checks in its generator matrix. We will discuss the construction of these codes in detail in Section 4.1.3.

3.3 Bit Error Rate (BER) Curves

In our discussion, we will aim to use bit error rates as the comparison metric to characterize codes and rank them in terms of their effectiveness for secrecy applications. BER is the ratio of the number of bit errors to the total number of transferred bits. These error rates, when plotted over varying probabilities of erasure or noise, create the error rate curves. While the equivocation rates (as seen in the previous subsections) provide a proven metric for information-theoretic security, they are hard to implement and calculate over a Gaussian wiretap channel. BER performance comparisons, on the other hand, are easier to construct and analyze. As [1] states, while BER analysis can still be dubbed physical-layer security, it cannot be characterized as information-theoretic security.

If $P_e^R$ is the average BER associated with a legitimate receiver's estimate of the transmitted message, and $P_e^E$ is the average BER associated with the eavesdropper's estimate, then it is desired that $P_e^R$ be as low as possible for reliability, and that $P_e^E$ be as high as possible for security. If $P_e^E$ is close to 0.5, it could be high enough to prevent the eavesdropper from gleaning any information about the message.

CHAPTER IV

EXPERIMENTS AND RESULTS

In this chapter, we will describe and explain the setup for our experiments, show how the background in the previous chapters is applied, and present our results. As introduced earlier, we aim with these experiments to rank codes according to their suitability for secrecy applications. This chapter covers results from the simulations run in MATLAB, as well as the results from experiments with transmissions using USRP radios.

4.1 System Description

In the experiments that will be described in the coming sections, we will seek to maintain and achieve the following:

• a rate, R = k/n =1/2, for all code constructions,

• a ‘short’ block length, n =8, and a ‘large’ block length, n = 128,

• simulations over binary erasure channels (BEC) and binary symmetric channels (BSC),

• transmit and receive messages using USRP radios, and relate the performance to simulated performance,

• obtain, study, and relate the error rate performances to the theory that has been discussed.

For all codes, we start with a generator matrix G or a parity-check matrix H for an (n, n − k) binary linear block code C, based on the code construction followed. For example, for LDPC codes, we start with the parity-check matrix H that satisfies the criteria mentioned in Section 3.2.1. Once a G or H is generated, we generate the other by reducing the matrix to its systematic form, using the relations seen in (2.6) and (2.7).

A secrecy generator $G_s$ is then formed by adding k linearly independent rows $\notin C$, such that $G_s$ is a full-rank n × n matrix:

G∗ Gs =   . (4.1) G     The encoding procedure for all codes would include a k-bit binary mes- sage s, padded with an (n − k)-bit random vector v, so that the n-bit encoded codeword x is obtained as

G∗ x = s v   . (4.2)   G    

4.1.1 LDPC Construction

For an (8, 4) LDPC code, we design a parity-check matrix H with the number of ones in every row ρ = 2, and the number of ones in every column γ = 1. The number of ones in common between any two rows is λ = 0. This code, therefore, is a (1, 2)-regular LDPC code. We obtain the generator matrix G after reducing H to its systematic form.

For a (128, 64) LDPC code, we consider both regular and irregular code constructions. The regular LDPC code can be constructed as was done for the (8, 4) code. An irregular LDPC code is also tested, with degree-distribution polynomials

$$\rho(x) = 0.03125x^3 + 0.2344x^4 + 0.5x^5 + 0.15625x^6 + 0.078125x^7, \qquad (4.3)$$

$$\lambda(x) = x^2. \qquad (4.4)$$

4.1.2 Reed-Muller Code Construction

Reed-Muller codes are generated by setting m = 3 (or 7) and order r = 1 (or 3) to get a block length n = 8 (or 128) and dimension k = 4 (or 64). The construction procedure discussed in Section 3.2.2 gives the generator matrix G for these parameters. We obtain the parity-check matrix by reducing the generator matrix to its systematic form using (2.6) and (2.7).

4.1.3 Random and Worst Code Constructions

We construct random codes by generating (n − k) different n-bit uniformly random vectors, where the probability of selecting a 0 or a 1 is 0.5, and these vectors form the rows of a generator matrix. For the ‘worst’ code, we aim to use a matrix that provides poor parity-check coverage. This normally results in poor error-correction capabilities, as even incorrect codewords can be interpreted as correct. Viewing the (8, 4) ‘worst’ code matrix that we use in our experiments will provide a better understanding of why such a code might not be suitable. The generator matrix is

$$G = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \end{pmatrix}, \qquad (4.5)$$

while the parity-check matrix, as a result, will be

$$H = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \end{pmatrix}. \qquad (4.6)$$

Clearly, the parity-check coverage the matrix provides is not sufficient to correct errors. The simulations and experiments with this code provide an important benchmark against which to compare the less trivial constructions.
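A quick Python check (a sketch, not from the thesis) confirms that the pair in (4.5) and (4.6) is consistent, i.e. $GH^T = 0$, and makes the poor coverage visible: the first three columns of H are all zero, so a bit flip in any of the first three positions never changes the syndrome.

```python
import numpy as np

G = np.array([[1, 0, 0, 0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 0, 0, 0, 0, 0],
              [0, 0, 0, 1, 1, 1, 1, 1]])
H = np.array([[0, 0, 0, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 0, 1, 0, 0],
              [0, 0, 0, 1, 0, 0, 1, 0],
              [0, 0, 0, 1, 0, 0, 0, 1]])

assert not (G @ H.T % 2).any()   # (4.5) and (4.6) are a valid pair

# Flipping bit j shifts the syndrome by column j of H; an all-zero
# column therefore makes that bit position invisible to every check.
uncovered = [j for j in range(H.shape[1]) if not H[:, j].any()]
print(uncovered)  # [0, 1, 2]
```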

4.1.4 Simulation Setup

The simulations, run using MATLAB, aim to estimate the BER performance of all codes over binary erasure and binary symmetric channels. This is done by randomly selecting a message s and a random vector v, to be encoded into an n-bit codeword x. We calculate the expected syndrome ŝ by simple matrix multiplication of the encoded message with the parity-check matrix. For the simulated binary symmetric channel, the probability of a bit flip p is determined by the current $E_b/N_0$ value at which the code is being tested, as we are using BPSK as the modulation technique. The error probability p is therefore determined using the Q-function by the relation

$$p = Q\left(\sqrt{\frac{2E_c}{N_0}}\right). \qquad (4.7)$$

The energy per coded symbol Ec can be defined as

$$E_c = R E_b, \qquad (4.8)$$

where R is the rate of the code. Since we maintain a rate of $\frac{1}{2}$ for all codes,

$$E_c = \frac{1}{2} E_b. \qquad (4.9)$$

Therefore (4.7) can be further simplified to

$$p = Q\left(\sqrt{\frac{2RE_b}{N_0}}\right) = Q\left(\sqrt{\frac{E_b}{N_0}}\right), \qquad (4.10)$$

where the function Q(x) is defined as the probability that a normal (Gaussian) random variable will obtain a value larger than x standard deviations above the mean [15], $E_c$ is the energy per coded symbol, $E_b$ is the energy per bit, and $N_0$ is the noise density.

For a BEC, we estimate the performance over the range of the erasure probability ε. These simulations subject every bit passing through the channel to the possibility of a bit flip or erasure. In the event of an erasure, the erased bits are replaced with random bits to simulate an eavesdropper's guess. As our code assigns multiple codewords to each coset, we expect the codeword received with errors either to maintain the same coset or, preferably, to be distorted enough to result in a different coset. If the observed syndrome is not the same as the expected syndrome, we count that as a single error. The error rate, or error performance, is therefore estimated as the ratio of the number of errors to the number of syndromes tested.
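The mapping from $E_b/N_0$ to the BSC crossover probability in (4.10) can be written directly; the following is a Python sketch using the complementary error function (the thesis's simulations use MATLAB).

```python
from math import sqrt, erfc

def qfunc(x):
    """Q(x): probability that a standard normal exceeds x."""
    return 0.5 * erfc(x / sqrt(2))

def bsc_crossover(ebno_db, rate=0.5):
    """Crossover probability p = Q(sqrt(2 R Eb/N0)) for BPSK, from (4.10)."""
    ebno = 10 ** (ebno_db / 10)   # dB to linear scale
    return qfunc(sqrt(2 * rate * ebno))

# At rate 1/2, (4.10) reduces to Q(sqrt(Eb/N0)); at 0 dB this is Q(1) ~ 0.159
```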

We compute this error rate for different $E_b/N_0$ values, and thereby plot our error performance curves. This simulation pattern is followed for all four codes, and we group and plot performance curves for the n = 8 and n = 128 block lengths.

4.1.5 Radio Transmission Setup

Given how modern physical-layer security schemes are currently trying to bridge the gap from theory to practice, it is worth understanding and observing the performance of these codes when tested on a real-world test bench. We seek to achieve the same results as those of the simulation, with the transmission of data using USRP radios over a real-world Gaussian channel. The USRP radios used in this experiment are USRP B200 boards designed by Ettus Research. These boards are single-board software radios that allow for system prototyping using software like GNU Radio Companion [10] and MATLAB.

In our experiments, we interface the radios with the GNU Radio software, allowing us to transmit and receive codewords over the channel. The received bits of data are stored in a text file, thereby allowing us to revisit and reuse the transmitted data. This is then processed in MATLAB to measure the error performance.

The transmitter, as mentioned earlier, is developed in the GNU Radio software, as shown in Figure 4.1. We use a Vector Source block to constantly generate the 8- or 128-bit codeword. This block generates each value in the codeword as a byte, and the Unpacked to Packed block lets us use these values as pure bits and therefore maintain the block length. The vector to be transmitted is then modulated using a DPSK Mod block, which employs the differential binary phase shift keying (DBPSK) modulation scheme. We set the sampling rate at 250,000 samples per second. A UHD: USRP Sink block lets us communicate with the USRP board. This block sets the center frequency for our signal and the antenna over which the signal is transmitted.

The receiver flowgraph is shown in Figure 4.2. A UHD: USRP Source block interfaces the software with the radio, and helps in receiving the transmitted

FIGURE 4.1: GNU Radio transmitter flowgraph.

vectors when tuned to the right center frequency. A DPSK Demod block demodulates the DBPSK-modulated signal, and the output, in the form of bytes, is then converted to bits using an Unpacked to Packed block that retains the vector length.

FIGURE 4.2: GNU Radio receiver flowgraph.

A File Sink block is then used to store the received codeword, which can then be processed in MATLAB to determine the code's error performance. The MATLAB code determines the syndrome from every 8 or 128 bits received, and measures the error rate. The GNU Radio Companion flowgraphs, in unison with the MATLAB code, implement the system constructed in our simulations, and aim to achieve similar results.

4.2 Results

In this section, we will present the results of all simulations and radio transmission tests.

4.2.1 Simulation Results

As discussed earlier, for all four codes being tested, we experiment with (8, 4) and (128, 64) codes. We can group these tests based on the channel they are tested on: BECs and BSCs. For BSCs, the simulations are run over multiple $E_b/N_0$ values (in dB), with 10 iterations at each $E_b/N_0$ value so as to average out the performance. The iterations are set with limiting conditions that cap the maximum number of errors at $10^4$, or the maximum number of syndromes to be compared at $10^7$, thereby providing thorough simulations. For BECs, the simulations are run at erasure probabilities ε in the range of 0 to 1, in steps of 0.05. At each value of ε, we average the block error rate values over 10 iterations, with each iteration limited by either 10,000 syndrome errors or 100,000 syndrome tests.
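The iteration limits described above amount to a standard Monte-Carlo stopping rule. A Python skeleton of one iteration might look as follows; the encode, channel, and decode arguments are hypothetical stand-ins for the MATLAB routines used in the actual experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

def syndrome_error_rate(encode, channel, decode, k,
                        max_errors=10_000, max_tests=100_000):
    """Count a trial as one error whenever the observed syndrome differs
    from the transmitted message; stop at max_errors or max_tests."""
    errors = tests = 0
    while errors < max_errors and tests < max_tests:
        s = rng.integers(0, 2, k)   # random k-bit message
        if not np.array_equal(decode(channel(encode(s))), s):
            errors += 1
        tests += 1
    return errors / tests
```

With a noiseless channel this returns 0, and with a channel that flips every bit it returns 1; between these extremes it estimates the syndrome error rate plotted in the curves below.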

FIGURE 4.3: Error performance of all (8, 4) codes over the binary erasure channel.

From Figure 4.3, we see that the LDPC, Reed-Muller, and random codes exhibit almost identical error performance, with error rates that tend to be higher than those of the ‘worst’ code. Reed-Muller codes exhibit marginally higher error rates than the LDPC and random codes, but the difference does not appear significant, unlike with the ‘worst’ code.

FIGURE 4.4: Error performance of all (128, 64) codes over the binary erasure channel.

Almost identical conclusions can be obtained from Figure 4.4 for the (128, 64) codes, where the LDPC, RM, and random codes exhibit much higher error rates than the ‘worst’ code, albeit over a small range of erasure probabilities. In contrast to the shorter block length graph, we notice that the error rates have shot up significantly, and the disparity between the ‘worst’ code and all other codes is negligible at ε > 0.2.

Over a binary symmetric channel, the (8, 4) codes replicate the behavior observed over a BEC, with the LDPC, RM, and random codes once again exhibiting almost identical, but higher, error rates than the ‘worst’ code. Random codes exhibit marginally lower error rates, but once again the difference is not significant relative to the ‘worst’ code and its performance.

As was evident with the (128, 64) code performance over the BEC, the error performances over the BSC for the LDPC, RM, and random codes are very

FIGURE 4.5: Error performance of all (8, 4) codes over the binary symmetric channel.

FIGURE 4.6: Error performance of all (128, 64) codes over the binary symmetric channel.

similar, and significantly higher than that of the ‘worst’ code. This behavior is only seen at higher values of $E_b/N_0$; in contrast, there is no obvious difference in performance at low $E_b/N_0$. This result also corroborates the proposition that error rates rise as block length increases. It is worth noting that the ‘Uncoded BPSK’ curves seen in both graphs serve only as a reference for these comparisons.

4.2.2 Radio Transmission Results

In the simulations, regardless of block length, we noticed a trend of the LDPC, RM, and random codes exhibiting higher error rates than the ‘worst’ code. We aim to at least replicate this performance by testing the codes over a transmitter-receiver system built using the USRP radios. The measurements provide one reading each of the error performance over the Gaussian (‘real-world’) channel, at a near-field distance and a far-field distance, for each code construction. These error rates are averaged over more than $5 \times 10^5$ syndromes, thereby providing a thorough study of the codes' performance.

TABLE IV.1: Error rate/performance of all (8, 4) codes.

(8, 4) Code Error Rates

Code Construction     Near-Field (≈ 30 cm)   Far-Field (≈ 5 m)
LDPC                  0.001136               0.001390
Reed-Muller Code      0.001136               0.001390
Random Code           0.001136               0.001382
Worst Code            0.000753               0.000935

TABLE IV.2: Error rate/performance of all (128, 64) codes.

(128, 64) Code Error Rates

Code Construction     Near-Field (≈ 30 cm)   Far-Field (≈ 5 m)
Irregular LDPC        0.016156               0.018965
Regular LDPC          0.016156               0.018856
Reed-Muller Code      0.016156               0.018965
Random Code           0.016156               0.018965
Worst Code            0.008331               0.009620

We notice that the error rates obtained from these tests are consistent with the simulation results from the previous section. The LDPC, Reed-Muller, and random codes exhibit very similar error performance, consistently higher than that of the ‘worst’ code. All (128, 64) codes, as seen in the simulations, have higher error rates compared to the (8, 4) codes. In the next section, we will assess the simulation and radio transmission results, and discuss the relevance of the error rate curves that we have observed.

CHAPTER V

CONCLUSION AND FUTURE WORK

5.1 Conclusion

The results that we have obtained from the simulations and the radio transmission tests put forth several interesting assessments. As previously discussed in the coding strategy background, error rate curves provide an interesting alternative look at physical-layer security. We extend this discussion to include the coset coding scheme, and proceed with the idea that a higher probability of error at the eavesdropper implies a smaller amount of information leaked to the eavesdropper. This implies that a code with a higher probability of error will be better suited for secrecy. This understanding is key to all of the assessments that will be introduced and discussed in this section.

In the background section of this paper regarding the choice of Reed-Muller codes for our experiments, we discussed how (8, 4) RM codes consistently gave the highest equivocation rates for all erasure probabilities. Therefore, it might be safe to assume that, for that configuration, Reed-Muller codes are better suited for secrecy than other codes. Our results with error rate curves for (8, 4) codes, on the other hand, exhibit a pattern where the LDPC, Reed-Muller, and random codes all give almost identical error rates. It is also seen that the ‘worst’ code consistently records lower error rates over the entire range of $E_b/N_0$ values. The identical performances among the three better codes raise the question of whether we can actually pick a code better suited for secrecy. Given the performance disparity between the ‘worst’ code and the rest, a good assessment is that unless the transmitter deliberately chooses the worst code, any code construction could suitably fit his/her security requirement for this configuration. This assessment, we then notice, further extends to the (128, 64) codes.
At higher ranges of SNR, there appears to be no obvious disparity between the performances of the LDPC, RM, and random codes with respect to each other, but there is a wider disparity with the ‘worst’ code. The radio transmission results corroborate this assessment, by providing rates over a real-world channel consistent with our simulations. Therefore, our first conclusion is that unless the system designer deliberately chooses the worst possible code, any randomly chosen code, as the evidence suggests, is essentially as good as a structured one for medium block lengths. This could likely be extended to larger block lengths as well.

On comparing the short block length results with those of the large block length, we notice that error rates rise as the block length increases. The error rates for the large block length codes appear to shift closer to unity for low SNRs. It therefore appears that with longer block lengths, better security (through higher error rates) is inherent. The disparity between the ‘worst’ code and the rest is almost negligible for low SNRs, which implies that any code construction, even the worst possible code, could meet the security requirements for a system when the eavesdropper's channel (wiretap channel) suffers from high levels of noise. While it is worth mentioning that BER performance comparisons may not provide a comprehensive measure of secrecy, these results present an interesting insight into how these codes compare to each other. Another way of describing this assessment is that careful selection of codes, for security, might only be necessary for short block length codes, when the eavesdropper is in a very noisy channel.

To summarize, with the evidence we have obtained from the simulations and the radio transmission tests, we find that the codes do not exhibit disparities in their error performance large enough to allow a ranking of their suitability for secrecy applications. The results indicate that randomly chosen codes could work as well as any structured code. For short block length codes, the system designer can afford to choose, for security, any code construction that is not the worst possible code. While it is apparent that the three better codes (LDPC, RM, and random codes) consistently produce higher error rates than the ‘worst’ code, at low SNR for large block lengths the low-to-negligible disparity allows any code construction to be chosen to meet secrecy.

5.2 Future Work

While the results we have obtained are consistent across all channels, and thereby support the conclusions that we have arrived at for physical-layer security, information-theoretic security cannot be addressed with these error performance curves. This calls for an in-depth look into equivocation rates for these codes, to determine if it is at all possible to consistently rank codes for secrecy. A larger collection of codes to compare and contrast would provide a much broader perspective for ranking codes, if that is at all possible. This can therefore be extended to include other coding strategies that achieve a stronger grade of secrecy, like polar codes [3], lattice codes [20], and a few others. Error performance curves that consider the decoding process for these different code constructions, which has been largely ignored in this paper, might also provide viable results. Another area worth exploring, which could potentially benefit the cause of ranking codes, is a real-time assessment of code performance with transmissions on USRP radios.

BIBLIOGRAPHY

[1] D. Klinc, J. Ha, S. W. McLaughlin, J. Barros, and Byung-Jae Kwak. “LDPC Codes for the Gaussian Wiretap Channel”. In: IEEE Transactions on Information Forensics and Security 6.3 (Sept. 2011).

[2] M. Bloch, J. Barros. “Physical-Layer Security: From Information Theory to Security Engineering”. In: (2011).

[3] W. Harrison, J. Almeida, M. R. Bloch, S. W. McLaughlin, J. Barros. “Cod- ing for Secrecy: An overview of error-control coding techniques for physical-layer security”. In: IEEE Signal Processing Magazine 30.5 (Sept. 2013), pp. 41–50.

[4] A. Suresh, A. Subramanian, A. Thangaraj, M. Bloch, and S. McLaugh- lin. “Strong secrecy for erasure wiretap channels”. In: Proc. of the IEEE Information Theory Workshop (ITW 2010) (2010).

[5] J. Pfister, Marco A. C. Gomes, J. P. Vilela, M. R. Bloch, and W. K. Harrison. “Quantifying Equivocation for Finite Blocklength Wiretap Codes”. In: arXiv/1701.05484 (2017). URL: https://arxiv.org/pdf/1701.05484.pdf.

[6] G. Cohen and G. Zémor. “Syndrome coding for the wire-tap channel revisited”. In: Proc. of the IEEE Information Theory Workshop (ITW ’06) (2006), pp. 33–36.

[7] G. Cohen and G. Zémor. “The wiretap channel applied to biometrics”. In: Proc. of the International Symposium on Information Theory and Applications (2004).

[8] I. Csiszár and J. Körner. “Broadcast Channels with confidential mes- sages”. In: IEEE Transactions on Information Theory 24.3 (May 1978), pp. 339– 348.

[9] R. G. Gallager. “Low-Density Parity-Check Codes”. In: (1963). URL: http://www.rle.mit.edu/rgallager/.

[10] “GNU Radio Wiki”. In: (2017). URL: https://wiki.gnuradio.org/index.php/Main_Page.

[11] W. Harrison. “Coset Codes in a Multi-hop Network”. In: Globecom 2013 Workshop - Trusted Communications with Physical Layer Security (2013).

[12] S. Lin, and D. Costello Jr. Error-Control Coding. 2nd ed. 2004.

[13] U. Maurer, and S. Wolf. “Information-Theoretic Key Agreement: From Weak to Strong Secrecy for Free”. In: IEEE Transactions on Information Theory 45.2 (Mar. 1999), pp. 499–514.

[14] A. Thangaraj, S. Dihidar, A. Calderbank, S. McLaughlin, and J. Merolla. “Applications of LDPC codes to the wiretap channel”. In: IEEE Trans- actions on Information Theory 53.8 (2007), pp. 2933–2945.

[15] “Q-Function Wiki”. In: (2017). URL: https://en.wikipedia.org/wiki/Q-function.

[16] S. Kudekar, S. Kumar, M. Mondelli, H. D. Pfister, E. Şaşoğlu, and R. Urbanke. “Reed-Muller Codes Achieve Capacity on Erasure Channels”. In: arXiv/1505.05123 (2015). URL: https://arxiv.org/pdf/1505.05123.pdf.

[17] T. J. Richardson, and R. L. Urbanke. “Efficient Encoding of Low-Density Parity-Check Codes”. In: IEEE Transactions on Information Theory 47.2 (Feb. 2001).

[18] C. E. Shannon. “A Mathematical Theory of Communication”. In: Bell System Technical Journal 27 (Oct. 1948), pp. 379–423, 623–656.

[19] C. E. Shannon. “Communication Theory of Secrecy Systems”. In: Bell System Technical Journal 28.4 (May 1949), pp. 656–715.

[20] F. Oggier, P. Solé, and Jean-Claude Belfiore. “Lattice Codes for the Wire- tap Gaussian Channel: Construction and Analysis”. In: IEEE Transac- tions on Information Theory 62.10 (Oct. 2015).

[21] “Syndrome Decoding”. In: Stanford EE Notes (Oct. 2015). URL: https://web.stanford.edu/class/ee387/handouts/notes10.pdf.

[22] R.M. Tanner. “A recursive approach to low complexity codes”. In: IEEE Trans. on Information Theory (Sept. 1981).

[23] M. Bellare, S. Tessaro, and A. Vardy. “Semantic Security for the Wiretap Channel”. In: Advances in Cryptology- CRYPTO 2012 (2012), pp. 294– 311.

[24] A.D. Wyner. “The wire-tap channel”. In: Bell System Technical Journal 54.8 (Oct. 1975), pp. 1355–1387.