DISTINGUISHABILITY OF PUBLIC KEYS AND EXPERIMENTAL VALIDATION: THE MCELIECE PUBLIC-KEY CRYPTOSYSTEM

by Hai Pham

A Thesis Submitted to the Faculty of The Charles E. Schmidt College of Science in Partial Fulfillment of the Requirements for the Degree of Master of Science

Florida Atlantic University Boca Raton, FL December 2015 Copyright 2015 by Hai Pham


ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my master's thesis advisor, Dr. Steinwandt. I have learned many things since I became Dr. Steinwandt's student. I admire his personality, his vast knowledge, and his passion for his work. I am grateful for the time that he spent guiding me, providing feedback, and sometimes giving tough love. I also would like to thank my small family (my dad, my mom, and my sister) for everything they have done for me. In addition, I owe many thanks to my big family (my grandparents, my aunts and uncles) for all their love.

ABSTRACT

Author: Hai Pham
Title: Distinguishability of Public Keys and Experimental Validation: The McEliece Public-key Cryptosystem
Institution: Florida Atlantic University
Thesis Advisor: Dr. Rainer Steinwandt
Degree: Master of Science
Year: 2015

As quantum computers continue to develop, they pose a threat to cryptography, since many popular public-key cryptosystems will be rendered vulnerable. This is because the security of most currently used asymmetric systems requires the computational hardness of the integer factorization problem, the discrete logarithm problem, or the elliptic curve discrete logarithm problem. However, there are still some cryptosystems that resist quantum computers. We will look at code-based cryptography in general and the McEliece cryptosystem specifically. Our goal is to understand the structure behind the McEliece scheme, including the encryption and decryption processes, and what advantages and disadvantages the system has to offer. In addition, using the results from Courtois, Finiasz, and Sendrier's paper in 2001 [12], we will discuss a digital signature scheme based on the McEliece cryptosystem. We analyze one classical algebraic attack on the security of the system, based on the distinguishing problem of whether the public key of the McEliece scheme is generated from a generator matrix of a binary Goppa code or from a random binary matrix. The idea of the attack involves solving an algebraic system of equations, and we examine the dimension of the solution space of the linearized system of equations. With the assistance of a 2010 paper by Faugère, Gauthier-Umaña, Otmani, Perret, and Tillich [14], we will see the parameters needed for the intractability of the distinguishing problem.

To my paternal grandfather

List of Tables

List of Figures

1 Cryptography
  1.1 Introduction and Motivation
  1.2 Cryptosystem
  1.3 The McEliece Cryptosystem

2 Background in Coding Theory
  2.1 Terminology
  2.2 Generator and Parity-Check Matrices
  2.3 Error Detection and Error Correction
  2.4 Goppa Codes and Patterson's Algorithm

3 The McEliece Cryptosystem
  3.1 Encryption and Decryption in McEliece's Scheme
  3.2 Advantages and Disadvantages of the McEliece Cryptosystem
  3.3 McEliece-based Digital Signature Scheme

4 A Distinguisher for Public Keys in McEliece Cryptosystems
  4.1 Motivation
  4.2 The Goppa Code Distinguishing Problem
  4.3 Building A Distinguisher

5 Conclusion

Bibliography

LIST OF TABLES

4.1 Maximal Degree for Distinguishability
4.2 Experimental Validation
4.3 Observations for the Second Experiment

LIST OF FIGURES

4.1 The Matrix G in the Form of [I_k | P]

CHAPTER 1
CRYPTOGRAPHY

1.1 INTRODUCTION AND MOTIVATION

Quantum computing explores systems that make use of quantum mechanical phenomena to perform operations on data. The development of actual quantum computers is still in an early stage, but we can already determine the effects that they will have on current cryptographic systems. Large-scale quantum computers will be able to solve certain problems much more quickly than any classical computer running even the best currently known algorithms. Private-key cryptography will be weakened, but it seems that with an increase in key size, one may restore the necessary security. For example, AES-128 uses a key of 128 bits, and so around 2^127 operations are expected to be required to recover the key. However, quantum computers can run algorithms that require only approximately 2^64 operations to recover an AES-128 key [1]. The solution in mind here is to switch to AES-256. The effect is more devastating for public-key cryptography, because quantum computers can run algorithms that break popular public-key cryptosystems in a relatively short amount of time. This is because the current systems' security requires the hardness of the integer factorization problem, the discrete logarithm problem or the elliptic curve discrete logarithm problem. For instance, Shor's algorithm [3] can recover an RSA key in polynomial time. These reasons bring us to an important question: what can we do about this? Cryptographers have started to study post-quantum cryptography. There are currently six prominent classes of cryptosystems that resist quantum computers. They are: lattice-based cryptography, multivariate cryptography, hash-based cryptography, code-based

cryptography, supersingular elliptic curve isogeny cryptography, and symmetric-key based cryptography. In code-based cryptography, one of the classical examples is the McEliece cryptosystem. Before explaining the McEliece cryptosystem in more detail, let us review some basic notions about cryptography. One definition of cryptography is the practice and study of techniques for secure communication in the presence of third parties (called adversaries) [2]. To briefly describe the idea behind encryption: the sender chooses a message that he/she would like to send, applies some encryption process, and sends this encrypted message over a network. Upon receiving the encrypted message, the receiver uses a known decryption process to recover the original message. Although the adversary may intercept the encrypted message, he or she will be unable to recover any partial information (other than the length) of the original message without knowledge of a piece of secret information, known as a private key.

1.2 CRYPTOSYSTEM

A cryptosystem is an implementation of one or more cryptographic algorithms [10]. It is designed to fulfill particular specifications while providing certain security properties. Cryptosystems come in two types: private-key or public-key (also known as symmetric or asymmetric, respectively). In private-key cryptography, the parties share a single piece of information, which is called the private key, and use it to perform the encryption and decryption processes. Here we assume that the private key is established confidentially between the parties. The adversary, in this case, has no knowledge of this private key. One example of private-key cryptography is the one-time pad. Suppose that Alice wants to send a message to Bob. Alice would generate a large sequence of uniformly random numbers, chosen in {0, ..., 25} for example (which will serve as the private key). After that, she has to communicate this sequence of random numbers (the pad) to

Bob confidentially, so that Eve, the adversary, has no knowledge of it. The idea behind the encryption process is that each bit or character of the message that Alice wants to send to Bob is encrypted by combining it with the corresponding bit or character from the pad using modular addition. When Bob receives the encrypted message, he takes out his copy of the pad and undoes this combination component-wise (by modular subtraction, which for bits coincides with modular addition) to recover the original message. If the pad is random, at least as long as the message, used only once, and kept secret, then this scheme guarantees that an eavesdropper will not be able to recover any partial information (other than the length) of the original message [5]. One disadvantage of private-key cryptography relates to the difficulty of keeping the private key synchronized. As in the one-time pad scheme above, we see that it is essential to have the sequence of numbers truly random and to use it only once. Since everyone has to agree on a secret key, in larger networks it becomes difficult to establish the secret key confidentially. This issue does not show up in public-key cryptography. In public-key cryptography, each individual manages a distinct private key and a distinct public key. All parties can have access to an individual's public key, but each individual keeps his or her private key secret. An example of public-key cryptography is RSA [7]. Basic RSA can be summarized as follows. Bob chooses two distinct prime numbers p and q and computes their product n = pq. He then picks an integer e such that e is coprime to φ(n) = (p − 1)(q − 1) and 0 < e < φ(n). Also, he computes d as d = e^{-1} mod φ(n). Bob publishes the pair (n, e) as his public key and keeps d secret. If Alice wants to send him a message, she first turns her message into an integer m modulo n. Using Bob's public key (n, e), Alice encrypts her message m as c = m^e mod n and sends it to Bob.
Bob then computes c^d mod n to undo the exponentiation with e and recover the original message. Overall, we notice that one of the advantages of public-key cryptosystems is the

simpler key management. There is no need to meet to exchange the secret key, which increases both security and convenience. Public-key cryptosystems work well in a multi-user setting, since new members can be added without impacting existing users. In addition, another outstanding feature is the ability to provide signatures for messages to ensure the authenticity of the sender.
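The RSA flow described above can be illustrated with a small numeric example. This is a hedged sketch with toy primes chosen only for readability; real RSA uses primes of over a thousand bits together with a padding scheme.

```python
# Toy RSA walkthrough (illustration only; insecure parameters).
p, q = 61, 53
n = p * q                      # 3233, published as part of the public key
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent: e^-1 mod phi (Python 3.8+)
m = 65                         # message encoded as an integer mod n
c = pow(m, e, n)               # Alice encrypts: c = m^e mod n
recovered = pow(c, d, n)       # Bob decrypts: m = c^d mod n
print(recovered)               # 65
```

Decryption works because m^(ed) ≡ m (mod n) whenever ed ≡ 1 (mod φ(n)).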

1.3 THE MCELIECE CRYPTOSYSTEM

The main object that we would like to pay attention to is the McEliece cryptosystem. It is an asymmetric cryptosystem that relies heavily on error-correcting codes as a tool for encryption. It was proposed by Robert McEliece in 1978 [8]. It offers randomization in the encryption process. One of the advantages of the McEliece cryptosystem is immunity to attacks using Shor's algorithm. Unlike RSA, McEliece cryptosystems are considered secure in the presence of quantum computers [10]. Before we can discuss the McEliece cryptosystem in more detail, we must understand what an error-correcting code is. For this reason, we want to look at some basic terminology from coding theory.

CHAPTER 2
BACKGROUND IN CODING THEORY

2.1 TERMINOLOGY

Coding theory is the study of the properties of codes and their suitability for a specific application [9]. Codes are commonly used for data compression, cryptography, error correction, and more recently also for network coding. We will only go over the topics that assist us in understanding the implementation of the McEliece cryptosystem. We start with the definition of a code and then work our way down to the notions of linear codes, Hamming weight and distance, error detection, and error correction.

Definition 2.1.1. A fixed-length code C over a finite field A is a subset of A^n := A × A × ... × A (n copies). The length of C is defined as n. The field A is referred to as the alphabet of C. (For our topic, we will only consider the case A = F_2.)

Example 2.1.1. The set Γ = {000, 101, 011} is a code over the alphabet {0, 1} with length 3.

Definition 2.1.2. A word is an element of the set A × A × ... × A (n copies).

Definition 2.1.3. A codeword c in a code C is a word such that c ∈ C.

At first glance, one could think that both of these two concepts are the same. The subtle difference is that a codeword is an element of the code whereas a word may or may not be an element of the code. In other words, a word can be any element that has the right length and alphabet, but to be a codeword, it must belong to the code.

Example 2.1.2. Following the previous example with the code Γ, neither 014 (wrong alphabet) nor 0001 (wrong length) is a word over {0, 1} of length 3. The string 010 is a word of the right length and alphabet but not a codeword of Γ. The string 011 is both a word and a codeword of Γ.

Next we will introduce what a linear code is and how to classify the codes.

Definition 2.1.4. If a code C is a vector subspace of An, C is called a linear code with alphabet A.

Definition 2.1.5. The dimension k of a linear code C is defined as the dimension of C as a vector space over A.

Example 2.1.3. The code Γ = {110, 011, 101, 000} is a linear code with dimension k = 2 because as a vector space, the code has basis {110, 011}.

Definition 2.1.6. Let x = (x_1, x_2, ..., x_n) and y = (y_1, y_2, ..., y_n) be elements in A^n. The Hamming distance from x to y is defined as

    d_ham(x, y) := |{ i | x_i ≠ y_i }|.    (2.1)

Definition 2.1.7. The Hamming weight of a word w is defined as d_ham(w, 0^n).

Definition 2.1.8. The minimum distance d of a linear code C is the minimum of the distances between distinct codewords:

    d := min{ d_ham(x, y) | x, y ∈ C, x ≠ y }.    (2.2)

Definition 2.1.9. An [n, k, d] code C is a linear code with length n, dimension k, and minimum distance d. (Notice: when appropriate, one may drop the minimum distance information).
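As a quick illustration of Definitions 2.1.6 through 2.1.9, the following Python sketch (an informal aid; the thesis's own code samples use Magma) computes the Hamming distance, the Hamming weight, and the minimum distance of the code from Example 2.1.3:

```python
from itertools import combinations

def hamming_distance(x, y):
    # |{ i | x_i != y_i }|, as in Definition 2.1.6
    return sum(a != b for a, b in zip(x, y))

def hamming_weight(w):
    # distance to the all-zero word (Definition 2.1.7)
    return hamming_distance(w, "0" * len(w))

# The linear code from Example 2.1.3
code = ["110", "011", "101", "000"]

# Minimum distance: minimum over distinct pairs of codewords (Definition 2.1.8)
d = min(hamming_distance(x, y) for x, y in combinations(code, 2))
print(d)  # 2
```

So this Γ is a [3, 2, 2] code in the notation of Definition 2.1.9: length 3, dimension 2, minimum distance 2.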

2.2 GENERATOR AND PARITY-CHECK MATRICES

Recall that a linear code is a k-dimensional vector subspace of A^n, and the codewords in C are linear combinations of k basis vectors. Therefore, one way to define a linear code is through what is called a generator matrix (usually denoted by G). In coding theory, a generator matrix is a matrix whose rows form a basis for a linear code. In our case, this is a k × n matrix. The codewords are linear combinations of the rows of this matrix. In other notation, a linear code C over the finite field F_q of q elements, defined by a k × n matrix G whose entries belong to F_q, is the vector space spanned by G's rows:

    C = { uG | u ∈ F_q^k }.    (2.3)

The standard form for a generator matrix is

    G = [ I_k | P ]    (2.4)

where I_k is the k × k identity matrix, P is a k × l matrix, and l = n − k. A generator matrix can be used to derive what is called a parity-check matrix. A parity-check matrix can be used to define a code in the following way:

    C = { c ∈ F_q^n | Hc^T = 0 }.    (2.5)

We usually let H denote the parity-check matrix. The standard form of a parity-check matrix is

    H = [ −P^T | I_l ]    (2.6)

where P^T is the transpose of the matrix P.

Example 2.2.1. The matrix

    G = [ 1 1 0 0 1 ]
        [ 0 1 1 0 1 ]
        [ 0 0 1 0 1 ]

is a generator matrix for a [5, 3] linear code C with A = F_2. The matrix

    H = [ 1 0 1 0 1 ]
        [ 0 0 0 1 0 ]

is a parity-check matrix for this code.
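Example 2.2.1 can be sanity-checked in a few lines of Python (an illustrative sketch): since every codeword is a linear combination of the rows of G, it suffices to verify that each row of G is orthogonal mod 2 to each row of H, i.e., that GH^T = 0.

```python
G = [[1, 1, 0, 0, 1],
     [0, 1, 1, 0, 1],
     [0, 0, 1, 0, 1]]
H = [[1, 0, 1, 0, 1],
     [0, 0, 0, 1, 0]]

# Every codeword c = uG must satisfy Hc^T = 0; by linearity it suffices
# to check each row of G against each row of H.
ok = all(sum(g[i] * h[i] for i in range(5)) % 2 == 0 for g in G for h in H)
print(ok)  # True
```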

2.3 ERROR DETECTION AND ERROR CORRECTION

Some interesting properties of a linear code include its capability to determine whether an error has occurred during a transmission and the ability to correct that error (if possible). Suppose we have an [n, k, d] linear code C and a codeword c ∈ C, and we transmit c over a noisy channel such that 0 < t < d errors occur in the transmission (i.e., t of the n bits were flipped from 1 to 0 or 0 to 1). We know that the minimum distance between any two codewords is d, but the received word c′ is at distance 0 < t < d from c, so it cannot be a codeword. Hence, a linear code can detect any t < d bit errors. Knowing that some errors may have occurred, the next task is to find a way to correct the errors and identify the original codeword. Suppose that t ≤ ⌊(d−1)/2⌋ errors occurred. The basic idea is that we can define a ball of radius ⌊(d−1)/2⌋ around each codeword. These balls do not intersect, because the minimum distance between each pair of codewords is d > 2⌊(d−1)/2⌋. Therefore, each received word that lies in one of these balls can be assigned to a unique ball and the corresponding codeword. In this manner, we are not only able to detect errors but can also correct the errors that occurred during a transmission.

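The ball-decoding idea can be made concrete with a brute-force nearest-codeword decoder. The sketch below is a hypothetical illustration (not from the text) using the standard [7, 4] Hamming code, which has d = 3 and hence corrects t = ⌊(3−1)/2⌋ = 1 error. Brute force is only feasible for tiny codes, which is why practical systems rely on algebraic decoders such as Patterson's algorithm, discussed next.

```python
from itertools import product

G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]

def encode(u):
    # row vector u times G over GF(2)
    return tuple(sum(u[i] & G[i][j] for i in range(4)) % 2 for j in range(7))

codewords = [encode(u) for u in product([0, 1], repeat=4)]

def nearest_codeword(r):
    # exhaustive minimum-distance decoding
    return min(codewords, key=lambda c: sum(a != b for a, b in zip(c, r)))

c = encode((1, 0, 1, 1))
r = list(c); r[2] ^= 1          # one bit flipped in transmission
print(nearest_codeword(tuple(r)) == c)   # True
```

Because d = 3, the received word lies in the radius-1 ball around exactly one codeword, so decoding succeeds for any single bit flip.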
2.4 GOPPA CODES AND PATTERSON’S ALGORITHM

In his original proposal, McEliece suggested using Goppa codes because they have resisted cryptanalysis [10]. Also, these codes can be decoded efficiently using Patterson's algorithm [10]. So let us look at how to define a Goppa code and explore the idea behind Patterson's algorithm. The following are the steps to construct a Goppa code Γ:

1. Select an integer m ≥ 3 and an integer t such that 2 ≤ t ≤ (2^m − 1)/m. Note that t determines the number of errors which the Goppa code can correct.

2. Fix an integer n such that mt + 1 ≤ n ≤ 2^m. The natural choice for n is 2^m, but varying the value of n can yield a better security/efficiency trade-off [10].

3. Select distinct elements a_1, a_2, ..., a_n in the finite field F_{2^m}.

4. Choose a degree-t polynomial g that is irreducible in F_{2^m}[x]. This g is known as a Goppa polynomial.

5. Define h = ∏_i (x − a_i) ∈ F_{2^m}[x]. Then the set

    Γ = { c ∈ F_2^n | Σ_i c_i · h/(x − a_i) ≡ 0 mod g }    (2.7)

is an [n, ≥ n − mt] Goppa code. This code is capable of correcting up to t errors [10].

In practice, the following is a Magma implementation for creating a random Goppa code [6]. Note that in this case we use m = 14 and r = 3 as an example.

function RandomGoppaCode()
    q := 2^14;
    K := FiniteField(q);
    Kext := ext<K | 3>;
    repeat
        g := MinimalPolynomial(Random(Kext), K);
    until Degree(g) eq 3;
    L := [x : x in K];
    return GoppaCode(L, g);
end function;

Now we need an efficient decoding algorithm: Patterson's algorithm [11]. We will look at an overview of it first and then go into certain details of how the algorithm works.

Input: Syndrome s(x) ∈ F_{2^m}[x] (we will define it in the syndrome calculation below)

Output: Polynomial σ(x) ∈ F_{2^m}[x]. (Note: the roots of σ(x) correspond to the locations of the errors in a given word.)

Set T(x) = s^{−1}(x) mod g(x)
if T(x) = x then
    σ(x) = x
else
    Set R(x) = √(T(x) + x) mod g(x)
    Apply lattice-basis reduction to the lattice generated by the vectors (R(x), 1) and (g(x), 0) to obtain a minimum vector (α, β) with respect to the norm defined below.
    Set σ(x) = α²(x) + xβ²(x)
end if

Patterson's algorithm involves an efficient way of calculating the inverse and the square root of a polynomial in a quotient ring, as well as a lattice-basis reduction step. We will go through each stage. First, we need to discuss how the syndrome is defined and calculated. A syndrome s(x) ∈ F_{2^m}[x]/g(x), for a vector w ∈ F_2^n, is defined as

    s(x) := Σ_{i=1}^{n} w_i / (x − a_i)    (2.8)

in the field F_{2^m}[x]/g(x). Upon receiving the transmitted word w (not necessarily a codeword), we know that w is of the form

    w = w_1 w_2 w_3 ... w_{n−1} w_n,    w_i ∈ {0, 1}.    (2.9)

Since we have access to all of the a_i, we can form the terms w_i/(x − a_i) and compute their sum. The next step is to find the inverse of the syndrome mod g(x). The idea is to use the extended Euclidean algorithm. Since g(x) is irreducible in F_{2^m}[x], we can say that

    gcd(g(x), p(x)) = 1    (2.10)

as long as p(x) is not a multiple of g(x). So we follow the same procedure as for finding the gcd of two integers. Consequently, there exist polynomials u(x), v(x) ∈ F_{2^m}[x] such that

    u(x)g(x) + v(x)p(x) = 1, and hence v(x)p(x) ≡ 1 mod g(x).    (2.11)

Therefore, v(x) is the inverse of p(x) mod g(x). In our case, we want to be able to determine the square root of a polynomial of the form p(x) = T(x) + x. We first look into how to compute the square root of x in F_{2^m}[x]/g(x). To do this, we first split g(x) into even and odd parts (even and odd according to the degree of each term) such that

    g(x) = g_0²(x) + x g_1²(x)    (2.12)

where g_0(x) and g_1(x) are the term-by-term square roots of the even and odd terms of g(x), respectively. This is possible because F_{2^m}[x] has characteristic 2. This can be shown by following these steps. Using the equation above, we have

    g(x) − g_0²(x) ≡ x g_1²(x) mod g(x)
    g_0²(x) ≡ x g_1²(x) mod g(x)
    g_0²(x) g_1^{−2}(x) ≡ x mod g(x)
    (g_0(x) g_1^{−1}(x))² ≡ x mod g(x)

Therefore, the square root of x in F_{2^m}[x]/g(x) is g_0(x) g_1^{−1}(x). With this knowledge, we can compute √(T(x) + x) in F_{2^m}[x]. Similarly to the previous case, we split T(x) + x into even and odd parts, T_0(x) and T_1(x), such that

    T(x) + x = T_0²(x) + x T_1²(x).    (2.13)

The claim now is that if R(x) = T_0(x) + g_0(x) g_1^{−1}(x) T_1(x), then R(x) ≡ √(T(x) + x) mod g(x). This is true because when we square it, we get

    R²(x) = T_0²(x) + (g_0(x) g_1^{−1}(x))² T_1²(x) ≡ T_0²(x) + x T_1²(x) ≡ T(x) + x mod g(x).    (2.14)

The last step is the lattice-basis reduction. To stay within scope, we will just discuss what is necessary to implement Patterson's decoding algorithm.

Definition: A lattice L is a finitely generated Abelian group.

Definition: The norm of a polynomial p(x) ∈ F_{2^m}[x] is given by |p(x)| = 2^{deg p(x)} if p(x) ≠ 0, and 0 if p(x) = 0. The length of a vector (α, β) ∈ F_{2^m}[x] × F_{2^m}[x] is defined as |α² + xβ²|, where α, β are polynomials.

An overview of lattice-basis reduction for Patterson's algorithm [10] is as follows:

Input: Parameter t and two vectors (a, b) and (c, d) that form a basis for the lattice. Without loss of generality, suppose |(a, b)| ≥ |(c, d)|.
Output: A vector (α, β) with |(α, β)| ≤ 2^t.

Set (α_0, β_0) = (a, b) − ⌊a/c⌋ · (c, d)
if |(α_0, β_0)| > 2^t then
    (α_1, β_1) = ( c mod α_0, 1 − ⌊c/α_0⌋ · β_0 )
else
    Set (α, β) = (α_0, β_0) and terminate.
end if
Set i = 1
while |(α_i, β_i)| > 2^t do
    Set α_{i+1} = α_{i−1} mod α_i
    Set β_{i+1} = β_{i−1} − ⌊α_{i−1}/α_i⌋ · β_i
    i = i + 1
end while
Set (α, β) = (α_i, β_i)

Here ⌊p/q⌋ denotes the quotient in the polynomial division of p by q.

The idea behind lattice-basis reduction is fairly similar to the Euclidean algorithm. We subtract multiples of the shorter vector from the longer one until the resulting vector is of minimal length.
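To make the inverse step s^{−1}(x) mod g(x) concrete, here is a Python sketch of the extended Euclidean algorithm for polynomials. As a simplification, the coefficients are restricted to F_2 (Patterson's algorithm actually works over F_{2^m}[x]), and polynomials are encoded as integer bitmasks; pmul, pdivmod, and pinv are illustrative helper names, not a library API.

```python
def pmul(a, b):
    # carry-less product: multiplication in GF(2)[x], polynomials as bitmasks
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pdivmod(a, b):
    # polynomial division with remainder in GF(2)[x]
    db = b.bit_length() - 1
    q = 0
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a

def pinv(p, g):
    # extended Euclidean algorithm: p^-1 mod g, for g irreducible
    r0, r1, t0, t1 = g, p, 0, 1
    while r1:
        q, r = pdivmod(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, t0 ^ pmul(q, t1)
    assert r0 == 1, "p shares a factor with g"
    return t0

g = 0b1011            # g(x) = x^3 + x + 1, irreducible over GF(2)
p = 0b0010            # p(x) = x
v = pinv(p, g)
print(bin(v), pdivmod(pmul(p, v), g)[1])   # 0b101 1, i.e. x*(x^2+1) = 1 mod g
```

The same loop structure works over F_{2^m}[x]; only the coefficient arithmetic changes.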

CHAPTER 3
THE MCELIECE CRYPTOSYSTEM

3.1 ENCRYPTION AND DECRYPTION IN MCELIECE’S SCHEME

The McEliece cryptosystem consists of three algorithms: a key generation algorithm, which produces a public key and a private key, an encryption algorithm, and a decryption algorithm. First, we will look at all three algorithms in a general sense. To generate the keys, we first choose a linear code C (in this case, a binary Goppa code) of length n and dimension k with the capability of correcting up to t errors. This code C should come with an efficient decoding algorithm and is given by a k × n generator matrix G. The interesting idea here is that we are going to disguise this generator matrix G by the following method. We randomly select two matrices S and P, where S is a k × k binary non-singular matrix and P is an n × n permutation matrix. We call S the scrambler matrix and P the permutation matrix. We compute Ĝ = SGP, a k × n matrix, and we publish this Ĝ together with the error-correcting capability t. The private key then consists of S, G, and P. A Magma implementation of the key generation is included here.

C := RandomGoppaCode();
n := Length(C);
k := Dimension(C);
M := MatrixRing(GF(2), k);
repeat
    S := Random(M);
until IsUnit(S);
perm := Random(Sym(n));
G := S * GeneratorMatrix(C);
I_n := MatrixRing(GF(2), n) ! 1;
P := Parent(I_n) ! [I_n[j]^perm : j in [1..n]];
G := G * P;
G := EchelonForm(G);

To encrypt a message, the sender looks up the receiver's public key first. The sender converts the message m into a binary row vector of length k and computes mĜ. After that, he/she computes mĜ + e, where e is a random {0, 1} row vector of weight t that is freshly generated each time the encryption process is run. Hence, the sender transmits the encrypted message mĜ + e over the public channel. Upon receiving the encrypted message, the receiver can perform the following steps to recover the original message. By using the private key information, the receiver can multiply the encrypted message by P^{−1} to get

    (mĜ + e)P^{−1} = (mSGP + e)P^{−1} = mSG + eP^{−1}.    (3.1)

Then he/she uses the decoding algorithm for the code C to get rid of the error term eP^{−1}:

    D(mSG + eP^{−1}) = mS.    (3.2)

We solve for m by multiplying mS on the right by S^{−1}. Here are the three algorithms from a more mathematical view.

I Key generation algorithm for McEliece

Input: m, t
Output: K_pub = (Ĝ, t), K_sec = (S, G, P)

1. n ← 2^m, k ← n − mt

2. C ← random binary (n, k) linear code capable of correcting t errors and having an efficient decoding algorithm

3. G ← k × n generator matrix for the code C

4. S ← random k × k binary non-singular matrix

5. P ← random n × n permutation matrix

6. Ĝ ← k × n matrix S · G · P

As we briefly mentioned above, we use the scrambler and permutation matrices to disguise the structure of G. This is a necessary step because the generator matrix G alone can reveal crucial information to the attacker.

II Encryption algorithm for McEliece

Input: m, Kpub

Output: encrypted message c

1. Convert the message m into a binary string of length k

2. Compute mĜ

3. Generate a random n-bit error vector e containing t ones

4. c = mĜ + e

III Decryption algorithm for McEliece scheme

Input: c, Ksec

Output: original message m

1. Compute cP^{−1}

2. Use the efficient decoding algorithm for the code C to decode cP^{−1} to

get mS

3. Compute (mS)S^{−1}

Note: when we apply the decoding algorithm, it can be used to remove the error eP^{−1}. This is because P is a permutation matrix, and so eP^{−1} has weight t. The Goppa code we chose has the ability to correct up to t errors, and cP^{−1} = mSG + eP^{−1} is at distance t from the codeword mSG. We also can see the difficulty when an adversary intercepts the message. Without knowledge of the secret key, the adversary would have to find the nearest codeword to the encrypted message. This would seemingly involve calculating the syndrome of the encrypted message and comparing it to the syndromes of all the error vectors of weight t. With a good choice of n and t, this is an infeasible computation.
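The whole round trip can be sketched in Python. This is a hedged toy model: a [7, 4] Hamming code (t = 1) stands in for the binary Goppa code and single-error syndrome decoding stands in for Patterson's algorithm, but the key generation Ĝ = SGP, the encryption mĜ + e, and the three decryption steps follow the scheme above.

```python
import random

k, n = 4, 7
G = [[1, 0, 0, 0, 1, 1, 0],      # systematic [7,4] Hamming generator matrix
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
H = [[1, 1, 0, 1, 1, 0, 0],      # matching parity-check matrix
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]

def matmul(A, B):
    return [[sum(A[i][l] & B[l][j] for l in range(len(B))) % 2
             for j in range(len(B[0]))] for i in range(len(A))]

def gf2_inv(M):
    # Gaussian elimination on [M | I]; returns M^-1 or None if singular
    m = len(M)
    A = [row[:] + [int(i == j) for j in range(m)] for i, row in enumerate(M)]
    for c in range(m):
        piv = next((r for r in range(c, m) if A[r][c]), None)
        if piv is None:
            return None
        A[c], A[piv] = A[piv], A[c]
        for r in range(m):
            if r != c and A[r][c]:
                A[r] = [x ^ y for x, y in zip(A[r], A[c])]
    return [row[m:] for row in A]

def decode(r):
    # single-error syndrome decoding (stand-in for Patterson's algorithm)
    s = [sum(H[i][j] & r[j] for j in range(n)) % 2 for i in range(3)]
    if any(s):
        j = [[H[i][j] for i in range(3)] for j in range(n)].index(s)
        r = r[:]; r[j] ^= 1
    return r[:k]                 # G is systematic: first k bits = message

# Key generation: Ghat = S * G * P (S scrambler, P a column permutation)
while True:
    S = [[random.randint(0, 1) for _ in range(k)] for _ in range(k)]
    Sinv = gf2_inv(S)
    if Sinv is not None:
        break
perm = list(range(n)); random.shuffle(perm)
SG = matmul(S, G)
Ghat = [[SG[i][perm[j]] for j in range(n)] for i in range(k)]

# Encryption: c = m * Ghat + e, with e a random vector of weight t = 1
m = [1, 0, 1, 1]
c = [sum(m[i] & Ghat[i][j] for i in range(k)) % 2 for j in range(n)]
c[random.randrange(n)] ^= 1

# Decryption: undo P, decode to obtain mS, then multiply by S^-1
d = [0] * n
for j in range(n):
    d[perm[j]] = c[j]
mS = decode(d)
recovered = [sum(mS[i] & Sinv[i][j] for i in range(k)) % 2 for j in range(k)]
print(recovered == m)   # True
```

Note how the unpermuted word d = mSG + eP^{−1} still contains exactly one error, which is why the stand-in decoder (like the Goppa decoder in the real scheme) can remove it.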

3.2 ADVANTAGES AND DISADVANTAGES OF THE MCELIECE CRYPTOSYSTEM

Let us look at some disadvantages first:

1. The size of the public key. It is a k × n matrix which can be expensive to store.

2. The encrypted message is much longer than the original message. For example, if we use the parameters suggested by McEliece, the public key is a 524 × 1024 matrix over GF(2). Therefore, an original message of 524 bits yields an encrypted message of 1024 bits.

However, the McEliece cryptosystem offers a few distinguishing advantages:

1. It includes the element of randomness into the encryption process to improve security.

2. Augmenting it to indistinguishability under chosen-plaintext attack (IND-CPA) is easy [4].

3. It is seemingly hard to break even with quantum computing.

4. Originally, it was thought that McEliece does not allow practical digital signatures. However, it has been shown how to build a practical signature scheme with McEliece [12].

3.3 MCELIECE-BASED DIGITAL SIGNATURE SCHEME

Originally, it was believed that code-based cryptosystems like McEliece do not allow practical digital signatures. However, [12] shows how to construct a signature scheme based on the McEliece cryptosystem. The idea of an efficient digital signature scheme includes an algorithm to generate a signature for any message and a fast, public verification algorithm. These two algorithms can be described as follows:

Signing algorithm

1. For any message D, compute s = h(D), where h is a hash function returning a binary word of length n − k.

2. Compute s_i = h([ s | i ]) for i = 0, 1, 2, ..., where [ s | i ] denotes the concatenation of s and i.

3. Find i_0, the smallest value of i such that s_i is decodable.

4. Decrypt this hash value by using the private key (the scrambler and permutation matrices) to compute z such that Hz^T = s_{i_0}, where H is the parity-check matrix of the binary Goppa code.

5. Compute the index I_z of z in the space of words of weight 9 (which will be discussed in the upcoming paragraph):

    I_z = 1 + C(i_1, 1) + C(i_2, 2) + ··· + C(i_9, 9)    (3.3)

where C(i, j) denotes the binomial coefficient.

6. Use [ I_z | i_0 ] as the signature for the message D.

Verification algorithm

1. Recover z from its index I_z.

2. Compute s_1 = Hz^T with the public key H.

3. Compute s_2 = h([ h(D) | i_0 ]) with the public hash function.

4. Compare s_1 and s_2: if they are equal, then the signature is valid.

The details on how to determine the proper parameters are discussed in Section 3 of the paper [12]. Basically, with the original McEliece parameters, the probability for successfully decoding each syndrome is far too small. This probability is the ratio between the number of decodable syndromes and the total number of syndromes,

which turns out to be about 1/t!. Hence, the number of decoding attempts needed to obtain one signature will be around t!. That means that to get a reasonable signature scheme, t should not be more than 10. In [12], it is suggested that for the signature scheme, we want the code to be of length at least 2^15 (or 2^16) and capable of correcting up to 10 (respectively 9) errors. Choosing the parameters (2^16, 9) helps the signature scheme run faster. For this reason, in step 5, we compute the index I_z of z in the space of words of weight 9. The signature scheme is not hard to follow, but there is a small note on the

first step of the verification algorithm, i.e., recovering z from its index I_z. The idea is that i_1 < ··· < i_9 denote the positions of the non-zero bits of z. If we figure out these i_1, i_2, ..., i_9, then we will be able to recover z. After subtracting 1 from I_z, we make use of the combinatorial number system [13]. We start with the 9-combination at position I_z − 1 to figure out i_9. The remaining elements form the 8-combination at position I_z − 1 − C(i_9, 9), and we repeat the process until we recover z. The security of this signature scheme relies on the syndrome decoding problem and the indistinguishability of permuted binary Goppa codes from a random code [12]. We will look in detail at the distinguishability of Goppa codes in the next chapter.
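The index computation of step 5 and the recovery of z in the verification algorithm can be sketched as follows. This is a Python illustration of the combinatorial number system; for brevity it assumes 0-indexed bit positions and uses weight 3 on 8 bits rather than weight 9.

```python
from math import comb

def index_of(positions):
    # Equation (3.3): I_z = 1 + C(i1,1) + C(i2,2) + ... + C(iw,w),
    # where i1 < ... < iw are the positions of the nonzero bits of z.
    pos = sorted(positions)
    return 1 + sum(comb(p, j + 1) for j, p in enumerate(pos))

def positions_from_index(I, w):
    # Invert the map: peel off the largest binomial coefficient at each step,
    # starting with the w-combination at position I - 1.
    I -= 1
    pos = []
    for j in range(w, 0, -1):
        i = j - 1
        while comb(i + 1, j) <= I:
            i += 1
        pos.append(i)
        I -= comb(i, j)
    return sorted(pos)

z = [2, 5, 7]                           # nonzero positions of a weight-3 word
I = index_of(z)
print(I, positions_from_index(I, 3))    # 48 [2, 5, 7]
```

Because the combinatorial number system is a bijection between weight-w words and indices, the round trip always recovers z exactly.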

CHAPTER 4
A DISTINGUISHER FOR PUBLIC KEYS IN MCELIECE CRYPTOSYSTEMS

4.1 MOTIVATION

Previously, we mentioned that the Goppa Code Distinguishing (GD) problem plays an important role in the security of the signature scheme. The GD problem consists of distinguishing the generator matrix of a binary Goppa code from a random binary matrix. A random binary matrix is a matrix whose entries are chosen uniformly at random from the set {0, 1}. The main ingredient for tackling this problem is the dimension of the solution space of a linear system of equations, obtained through linearization of a particular polynomial system. The observation is that this dimension depends on the type of code being used. By exploiting this fact, we can estimate how large the degree of the Goppa polynomial has to be in order to make the McEliece public key indistinguishable. To illustrate and add to the motivation, we can look at the following table taken from [14]. For a binary Goppa code of length n = 2^m, we can get a bound on the degree of the Goppa polynomial that makes the McEliece public key distinguishable.

Table 4.1: Maximal Degree for Distinguishability

m        13   15   17   19    21    22    23
r_max    20   34   62   114   213   290   400

(according to [14])

For example, a binary Goppa code obtained with m = 13 and r = 19, corresponding to a 90-bit-security McEliece public key [14], is distinguishable. If we want to increase the security, we should choose a Goppa polynomial with degree larger than 20 to make our public key indistinguishable [14].

4.2 THE GOPPA CODE DISTINGUISHING PROBLEM

Recall that the security of the signature scheme relies partly on the hardness of distinguishing Goppa codes from random codes. From [12], we have

Theorem 1. Under the random oracle assumption, a T-time algorithm that is able to compute a valid message/signature pair for CFS (the signature scheme of [12]) with probability ≥ 1/2 satisfies:

    T ≥ min(T_Goppa, T_BD)

Let us look further into the Goppa Distinguishing problem. For integers n and k

with k ≤ n, we denote by G_{n,k} the set of k × n generator matrices of binary Goppa codes. Similarly, R_{n,k} is the set of random binary k × n matrices. Let D be an algorithm that takes as input a k × n matrix G and outputs a bit (0 or 1). It solves the GD problem if it wins the following game with probability non-negligibly away from 1/2.

1. Let b be chosen uniformly at random from {0, 1}.

2. If b is 1, the challenger selects G ∈ G_{n,k}; if b is 0, the challenger selects G ∈ R_{n,k}. The challenger sends G to D.

3. If D(G) = b then D wins the game. Otherwise, D loses.
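To make the game concrete, here is a toy Python sketch (not from the thesis; all names are hypothetical). Since sampling actual Goppa generator matrices takes more than a few lines, both branches below fall back to the same uniformly random sampler, so any distinguisher necessarily wins with probability close to 1/2:

```python
import random

def random_matrix(k, n):
    # uniformly random k x n binary matrix
    return [[random.randrange(2) for _ in range(n)] for _ in range(k)]

def gd_game(distinguisher, k, n):
    # one round of the distinguishing game; the "Goppa" sampler is replaced
    # by a second random sampler here, so the distinguisher has zero advantage
    b = random.randrange(2)
    G = random_matrix(k, n)  # stand-in for both G_{n,k} (b = 1) and R_{n,k} (b = 0)
    return distinguisher(G) == b

random.seed(0)
trials = 10000
wins = sum(gd_game(lambda G: 1, 4, 8) for _ in range(trials))
print(wins / trials)  # close to 1/2
```

A real distinguisher would receive a genuine Goppa generator matrix when b = 1 and inspect G instead of ignoring it; winning non-negligibly more often than 1/2 is exactly what "solving GD" means above.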

4.3 BUILDING A DISTINGUISHER

Goppa codes belong to the class of alternant codes, and one well-known feature of an alternant code is that it can be decoded in polynomial time. In addition, for any linear alternant code of length n ≤ q^m, there exists a parity-check matrix of the special form

V_r(x, y) = [ y_1            y_2            ...   y_n
              y_1 x_1        y_2 x_2        ...   y_n x_n
              ...            ...            ...   ...
              y_1 x_1^{r-1}  y_2 x_2^{r-1}  ...   y_n x_n^{r-1} ]   [14].

Assume that the public key is a k × n generator matrix G with k = n − rm that defines an alternant code of degree r. From the discussion above, this alternant code can be described through a parity-check matrix of the special form V_r(x, y). Knowing the matrix V_r(x, y) for some vectors x and y allows one to efficiently decode the public code defined by G [14]. Here, we describe a technique for partially solving a system of equations, which will lead us to the construction of our distinguisher. To start, from the definition of G we get that

V_r(x, y) G^T = 0    (4.1)
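The shape of V_r(x, y) is easy to generate programmatically. The following Python sketch is illustrative only: a small prime field stands in for the extension field GF(2^m), and the sample supports x and multipliers y are arbitrary values we made up:

```python
def vandermonde_like(x, y, r, p):
    # row e holds the entries y_j * x_j^e (mod p), for e = 0, ..., r-1
    return [[(y[j] * pow(x[j], e, p)) % p for j in range(len(x))] for e in range(r)]

V = vandermonde_like([1, 2, 3, 4], [1, 1, 2, 3], 3, 7)
print(V)  # first row is y itself; each later row multiplies entry j by x_j
```

The point of the special form is that every row is the previous row scaled componentwise by the support x, which is what the polynomial system below exploits.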

From there, we get the following polynomial system:

⋃_{e=0}^{r−1} { ∑_{j=1}^{n} g_{i,j} Y_j X_j^e = 0 | 1 ≤ i ≤ k }    (4.2)

The crucial point here is that we want to consider the dimension of the solution space of a linear system deduced from equation (4.2), which will lead us to the bound on the degree of the Goppa polynomial up to which the public key is distinguishable. To (partially) solve this system of equations, the strategy is as follows. We assume, without loss of generality, that G = (I_k | P). This form is easily obtained by Gaussian elimination and a permutation of columns. Next, for any i ∈ {1, ..., k} and e ∈ {0, ..., r − 1}, we rewrite the system (4.2) as

Figure 4.1: The Matrix G in the Form of (I_k | P) [15]

Y_i X_i^e = ∑_{j=k+1}^{n} p_{i,j} Y_j X_j^e. This is possible because, once we have G in row echelon form, performing V_r(x, y)G^T and solving for Y_i X_i^e gives the result we need. Then, using the identity Y_i · (Y_i X_i^2) = (Y_i X_i)^2 for all i in {1, ..., k}, we get

( ∑_{j=k+1}^{n} p_{i,j} Y_j ) ( ∑_{j=k+1}^{n} p_{i,j} Y_j X_j^2 ) = ( ∑_{j=k+1}^{n} p_{i,j} Y_j X_j )^2    (4.3)

After reordering things a little, we have

∑_{j=k+1}^{n} ∑_{j'>j} p_{i,j} p_{i,j'} ( Y_j Y_{j'} X_{j'}^2 + Y_{j'} Y_j X_j^2 ) = 0.    (4.4)

Now we linearize the system of equations by letting Z_{jj'} = Y_j Y_{j'} X_{j'}^2 + Y_{j'} Y_j X_j^2. We obtain k linear equations involving (mr choose 2) variables Z_{jj'}. So the system becomes

{ ∑_{j=k+1}^{n−1} ∑_{j'>j} p_{i,j} p_{i,j'} Z_{jj'} = 0 , i = 1, ..., k }    (4.5)

To obtain a linear system of k equations involving (mr choose 2) variables Z_{jj'}, as in (4.5), we first create a polynomial ring F_2[z] with (mr choose 2) variables. Since k = n − rm, we can simplify (mr choose 2) to (n−k)(n−k−1)/2. An easier start is to implement (4.2),

dealing with two vectors of variables X and Y. After getting an idea of how this implementation works, we can move to the linearization step and introduce the single variable Z. If we look at (4.5), this system has the form

 n−1 n  P P  p1,jpi,j0 Zjj0 = 0 j=k+1 j0>j   . . .   n−1 n  P P  pk,jpi,j0 Zjj0 = 0 j=k+1 j0>j 0 Manually, we can keep track of the two subscripts j and j for Zjj0 . However, if we look carefully at this, we observe that as long as the coefficients are set up correctly, we do not care about the variables (they become dummy variables). For implementation purposes, instead of using variables with double subscripts, we can get away with a single subscript. We would like to represent these variables in an increasing order based on the subscripts.

We count the subscripts lexicographically. For instance, Z_{k+1,k+2} corresponds to Z_1; Z_{k+1,k+3} corresponds to Z_2; ...; Z_{k+1,n} corresponds to Z_{n−k−1}; Z_{k+2,k+3} corresponds to Z_{n−k}; and so on. To do this, we introduce the function SubscriptConverter(j, jprime), which converts the double subscript into a single subscript in the way just described.

function SubscriptConverter(j, jprime)
    p := 0;
    found := false;
    a := k+1;
    while a le n-1 and not found do
        b := a+1;
        while b le n and not found do
            p := p+1;
            if (a eq j) and (b eq jprime) then
                found := true;
            end if;
            b := b+1;
        end while;
        a := a+1;
    end while;
    return p;
end function;

The subscript converter function takes two inputs, the double subscript j and jprime of Z. Initially, we set the output p to zero and the flag found to false (we have not found the matching single subscript yet). We also set a to k + 1, because j starts at k + 1. The counting process then begins with the outer while loop, which runs as long as a ≤ n − 1 (the boundary of the outer summation) and no match has been found. We set b to a + 1 and enter the inner while loop. While b ≤ n (the boundary of the inner summation) and no match has been found, we increase the count p by one. After that, we increase b by one and close the inner loop; likewise, we increase a by one and close the outer loop. The important step inside the loops is that as soon as a and b equal the inputs j and jprime respectively, we set the flag to true to indicate a match, which also terminates both loops. The final output is the current count p, which is the desired single subscript. The next objective is the computation of the dimension of the solution space once the linear system of equations has been set up. We approach this in two different ways, which we call the original way and the optimized way. The original way resembles the process of partial solving described in [14]. We create a linear system of equations using entries from the generator matrix G. After that, we form a coefficient matrix by pulling the coefficients from each equation in the system and compute the rank of this matrix. Recall that the dimension of the solution space of a linear system equals the difference between the number of variables in the system and its rank; hence computing the rank of this matrix yields the dimension of the solution space.
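The Magma function above can be cross-checked with a short Python sketch. The closed-form expression is our own shortcut (not part of the thesis code): it counts the pairs directly instead of looping:

```python
def subscript_converter(j, jprime, k, n):
    # mirrors the Magma loop: count pairs (a, b) with k+1 <= a < b <= n
    # in lexicographic order until (j, jprime) is reached
    p = 0
    for a in range(k + 1, n):
        for b in range(a + 1, n + 1):
            p += 1
            if (a, b) == (j, jprime):
                return p
    return 0

def subscript_closed_form(j, jprime, k, n):
    # pairs whose first index t is smaller than j contribute n - t each
    return sum(n - t for t in range(k + 1, j)) + (jprime - j)

print(subscript_converter(4, 5, 2, 6), subscript_closed_form(4, 5, 2, 6))  # 4 4
```

The closed form avoids the quadratic scan, which matters when the converter is called once per nonzero coefficient of a large system.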
The following code describes the process of creating the linear system of equations, extracting the coefficients from each equation, forming the coefficient matrix, and computing its rank.

R<[z]> := PolynomialRing(GF(2), (n-k)*(n-k-1) div 2);
X := [];
for i in [1..k] do
    // equation i of (4.5): sum over k+1 <= j < b <= n of G[i,j]*G[i,b]*Z_{j,b}
    F := &+[ &+[ G[i,j]*G[i,b]*z[SubscriptConverter(j,b)] : b in [j+1..n] ] : j in [k+1..n-1] ];
    X := Append(X, F);
end for;
Q := [];
for j in [1..k] do
    for i in [1..((n-k)*(n-k-1) div 2)] do
        Q := Append(Q, Coefficient(X[j], z[i], 1));
    end for;
end for;
CoeffMat := Matrix(k, (n-k)*(n-k-1) div 2, Q);
Rank(CoeffMat);

The disadvantage of the original way is the running time. For instance, an experiment with n = 2^14 and a degree-3 extension took approximately 19 days to finish, and the part that consumed the most time was the creation of the coefficient matrix. Hence, we seek a faster way to compute the number of solutions, which leads us to the optimized way. In the optimized way, instead of deriving the full system of equations and then creating the coefficient matrix, we start by forming a zero matrix of the appropriate dimensions. As we form the system of equations, we update each entry of this matrix with the corresponding coefficient from the equations.

ZM := ZeroMatrix(IntegerRing(), k, (n-k)*(n-k-1) div 2);
A := [];
for j in [k+1..n-1] do
    A[j] := [];
    for b in [j+1..n] do
        A[j][b] := SubscriptConverter(j, b);
    end for;
end for;

for i in [1..k] do
    for j in [k+1..n-1] do
        if G[i,j] eq 1 then
            // coefficient of Z_{j,b} in equation i is G[i,j]*G[i,b]
            for b in [j+1..n] do
                ZM[i, A[j][b]] := G[i,b];
            end for;
        end if;
    end for;
end for;

Rank(ZM);
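For readers without Magma, the optimized construction can be mirrored in Python. The sketch below uses toy parameters and a random P in place of a real public key (all names are ours); each row of the coefficient matrix is stored as an integer bitmask, and the GF(2) rank is computed by Gaussian elimination:

```python
import random

def pair_index(j, b, k, n):
    # 1-based single subscript of Z_{j,b}, counted as in SubscriptConverter
    return sum(n - t for t in range(k + 1, j)) + (b - j)

def gf2_rank(rows):
    # Gaussian elimination over GF(2) on integer bitmasks
    rank = 0
    rows = list(rows)
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def coefficient_matrix(P, k, n):
    # row i gets a 1 in column pair_index(j, b) iff P[i][j] and P[i][b] are
    # both 1, matching the coefficient p_{i,j} * p_{i,j'} of equation (4.5)
    rows = []
    for i in range(k):
        row = 0
        for j in range(k + 1, n):            # 1-based column indices of the P-part
            if P[i][j - k - 1]:
                for b in range(j + 1, n + 1):
                    if P[i][b - k - 1]:
                        row |= 1 << (pair_index(j, b, k, n) - 1)
        rows.append(row)
    return rows

random.seed(1)
k, n = 6, 12
P = [[random.randrange(2) for _ in range(n - k)] for _ in range(k)]
rank = gf2_rank(coefficient_matrix(P, k, n))
print(rank)  # dimension of the solution space is (n-k)*(n-k-1)//2 - rank
```

Skipping the symbolic polynomial ring entirely is the same idea as the optimized Magma code: only the positions where the coefficient is 1 are ever touched.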

As observed, in the optimized way we still create the system of equations, but not to its full extent. If a coefficient in an equation is zero, the corresponding entry of the matrix stays the same; therefore, we only need to act when the coefficient is 1. In the end, the updated matrix from the optimized way and the coefficient matrix from the original way are identical, so computing the rank of either matrix yields the same result. For example, using n = 2^14 with a degree-3 extension again, we obtain a rank of 41. The advantage of the optimized way is the running time: where the original way took about 19 days to finish the experiment, the optimized way completed the job in around 4 seconds. Since the optimized way yields the same result as the original way, we can try to replicate the table shown in [14], in particular for q = 2 and m = 14. According to the paper, when r = 26 the dimension of the solution space becomes the same whether the generator matrix comes from a binary Goppa code or is a random binary matrix, i.e., the two cases become indistinguishable. Hence, the next task is to verify this result using the optimized way. We vary the degree r of the Goppa polynomial and run the experiment for r = 3, ..., 31, with generator matrices of binary Goppa codes and random binary matrices side by side. However, our results did not match this expectation. The following table shows the data from the experiment with q = 2 and m = 14; the length of the Goppa code is n = 16384 in all cases. The first column gives the degree of the Goppa polynomial and the second the dimension k. The third and fourth columns give the rank when G comes from a binary Goppa code and when it is a random binary matrix, respectively. The next two columns give the running times in seconds. Finally, the last column shows how many times we ran each case with different randomly generated seeds.

Table 4.2: Experimental Validation

Degree   k   Goppa   Random   Time(Goppa)   Time(Random)   Number of Trials

3 16342 39 40 3.700 4.140 100

4 16328 53 54 10.790 11.940 100

5 16314 68 68 14.610 23.360 100

6 16300 81 82 33.150 43.120 100

7 16286 95 96 43.680 51.180 100

8 16272 109 110 79.760 68.230 100

9 16258 124 124 111.220 119.340 100

10 16244 137 138 163.510 144.110 10

11 16230 151 152 195.710 221.600 10

12 16216 165 166 256.110 259.690 10

13 16202 180 181 420.880 389.940 1

14 16188 194 193 320.850 532.830 5

15 16174 207 208 506.820 652.530 5

16 16160 220 222 584.820 652.530 5

17 16146 236 235 775.580 908.250 5

18 16132 250 251 8032.250 1072.040 5

19 16118 264 263 7843.070 8645.280 5

20 16104 278 278 9606.520 10159.460 5

21 16090 291 293 12302.319 12264.750 5

22 16076 307 306 12624.180 13822.130 5

23 16062 319 318 15845.110 14047.950 5

24 16048 335 332 19625.260 19877.270 1

25 16034 347 349 24084.880 27360.620 1

26 16020 362 360 30849.530 34097.790 1


27 16006 377 374 43087.740 38239.220 1

28 15992 390 391 50379.389 46684.830 1

29 15978 403 404 54046.030 50377.580 1

30 15964 418 418 57232.370 61207.270 1

31 15950 433 432 64905.430 62478.660 1

In this experiment, we changed the degree of the Goppa polynomial expecting that once the degree exceeds 26, the generator matrix would no longer be distinguishable. The table above fails to support this claim, though this does not show that the claim is false. We see that at degree 31 the ranks still differ, but only slightly. Hence, if we continue to increase the degree of the polynomial, the ranks may well coincide for both cases. Indeed, when we try r = 40 and r = 41, the ranks are 557 and 572, respectively, and each value agrees for both cases; so we can say that the matrix eventually becomes indistinguishable. Nonetheless, a question remains: given a matrix, how can one tell whether it is a binary random matrix or a binary Goppa matrix? One idea is to compute the rank of the given matrix and see whether it falls into the category of random matrix, Goppa matrix, or indistinguishable. To investigate, we design another experiment with n = 2^14 and extension degree 3. We run the experiment 100 times, each with a different seed from 1, ..., 100; for the generator matrix we use both a binary Goppa matrix and a random matrix, and compute the rank. The rank values fluctuate between 35 and 41 in both cases. One interesting result is that rank 41 occurred substantially more often when the generator matrix was a binary random one. The following table captures these occurrences. The rank of 41 deserves our attention here because it brings us to the next important question.

Table 4.3: Observations for the Second Experiment

Rank value   Goppa   Random

35 2 2

36 3 4

37 6 3

38 10 5

39 21 10

40 26 21

41 25 50

If we consider the occurrences as ordered pairs (Goppa, Random), what is the probability of seeing a 41 in the Random coordinate but not the Goppa coordinate, given that at least one coordinate is 41? It turns out to be 56.92%, which is a very interesting indication. In addition, the probability of a 41 in the Goppa coordinate but not the Random coordinate is 21.54%; the remaining 21.54% falls into the indistinguishable category (both coordinates equal 41). Furthermore, if we increase the number of experiments beyond 100, say to 1000, we get a 56.77% chance of a 41 on Random and not Goppa given that one coordinate is 41, and a 22.92% chance of a 41 on both. Hence, in this particular situation, a given matrix of rank 41 is a random matrix with probability greater than one half. This experiment makes it clear that a distinguisher exists, and one may repeat the process for other values of r.
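The percentages above are plain frequency counts over the per-trial ordered pairs. The sketch below shows the computation on a short hypothetical pair list (illustrative data we invented, not the thesis measurements):

```python
def rank41_breakdown(pairs):
    # pairs: list of (goppa_rank, random_rank), one per trial; returns the
    # (random-only, goppa-only, both) frequencies among trials in which
    # at least one coordinate equals 41
    hits = [(g, r) for g, r in pairs if g == 41 or r == 41]
    n = len(hits)
    random_only = sum(1 for g, r in hits if r == 41 and g != 41) / n
    goppa_only = sum(1 for g, r in hits if g == 41 and r != 41) / n
    both = sum(1 for g, r in hits if g == 41 and r == 41) / n
    return random_only, goppa_only, both

sample = [(40, 41), (41, 40), (41, 41), (39, 39), (38, 41)]
print(rank41_breakdown(sample))  # (0.5, 0.25, 0.25)
```

Conditioning on "at least one coordinate is 41" is what makes the first frequency a usable decision rule: it estimates how often a rank-41 observation points to the random case.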

CHAPTER 5
CONCLUSION

We have looked at the structure of the McEliece scheme to see how it works, as well as its advantages and disadvantages. In addition, we presented a digital signature scheme, countering the belief that code-based cryptosystems do not allow practical digital signatures. The most important question revolves around the distinguishability of binary Goppa codes from random codes: given a matrix, we would like to decide whether it is a binary random matrix or a binary Goppa matrix. In [14], it is suggested that it is worthwhile to look at the dimension of the solution space of a linear system of equations, because the dimensions become equal once the degree of the Goppa polynomial passes a certain value r_max. Therefore, there is a distinguisher for those instances where the degree of the Goppa polynomial has not passed r_max. After conducting an experiment with q = 2 and m = 14, our results do not match those in [14]. However, the obtained results suggest a small question for further investigation: when does the distinguisher fail in this case? Further interesting outcomes arose when we examined whether we have a good chance of telling whether a given matrix is binary random, binary Goppa, or indistinguishable. For the case of n = 2^14 with a degree-3 extension, we concluded that if the rank of the matrix is 41, it is, with good probability, a binary random matrix. This strengthens our goal of finding a distinguisher. A future project may involve changing the extension degree and n.

BIBLIOGRAPHY

[1] Grover, L. “A Fast Quantum Mechanical Algorithm for Database Search.” Proceedings of the Twenty-eighth Annual ACM Symposium on Theory of Computing - STOC ’96, 1996, 212–219. doi:10.1145/237814.237866.

[2] Rivest, R. (1990). “Cryptology”. In J. Van Leeuwen. Handbook of Theoretical Computer Science. Elsevier.

[3] Shor, P. “Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer.” SIAM Review 41, no. 2 (1999): 303–332. doi:10.1137/S0036144598347011.

[4] Vaudenay, S. “Back to the Encryption Security Assumptions.” In A Classical Introduction to Cryptography Applications for Communications Security, 244– 245. New York: Springer, 2005.

[5] Shannon, C. “Communication Theory of Secrecy Systems.” Bell System Technical Journal 28, no. 4 (1949): 656–715. doi:10.1002/j.1538-7305.1949.tb00928.x.

[6] Cayrel, Pierre-Louis. “McEliece in Magma.” Pierre-Louis Cayrel. November 25, 2007.

[7] Rivest, R., Shamir, A., Adleman, L. “A Method for Obtaining Digital Signatures and Public-key Cryptosystems.” Communications of the ACM 21, no. 2 (1978): 120–126. doi:10.1145/359340.359342.

[8] Menezes, A., Oorschot, P., Vanstone, S. “Public-Key Encryption.” Handbook of Applied Cryptography. 298–300. CRC Press, 1997.

[9] Pless, Vera. Introduction to the Theory of Error-correcting Codes. New York: Wiley, 1982.

[10] Roering, C. “Coding-Theory Based Cryptography: McEliece Cryptosystems in Sage.” Honors Thesis, 2013. Paper 17.

[11] Patterson, N. “The Algebraic Decoding of Goppa Codes.” IEEE Transactions on Information Theory 21, no. 2 (1975): 203–207. doi:10.1109/TIT.1975.1055350.

[12] Courtois, N., Finiasz, M., Sendrier, N. “How to Achieve a McEliece-based Digital Signature Scheme.” Advances in Cryptology - ASIACRYPT 2001, Lecture Notes in Computer Science, 2001, 157–174. doi:10.1007/3-540-45682-1_10.

[13] Beckenbach, E. (ed.). Applied Combinatorial Mathematics. New York: J. Wiley, 1964. 27–30.

[14] Faugère, J., Gauthier-Umaña, V., Otmani, A., Perret, L., Tillich, J. “A Distinguisher for High Rate McEliece Cryptosystems.” IEEE Transactions on Information Theory 59, no. 10 (2013): 6830–6844. doi:10.1109/TIT.2013.2272036.

[15] Faugère, J., Otmani, A., Perret, L., Tillich, J. “A Distinguisher for High Rate McEliece Cryptosystems.” Cryptology ePrint Archive, Report 2010/331 (2010).
