CS 413, Computer and Data Security


Some Proofs and Fermat’s Little Theorem

This set of notes falls into four sections: Proofs or demonstrations of various preliminary things concerning modular fields; a proof of an intermediate result which is needed to prove Fermat’s theorem; a presentation of Fermat’s theorem; and a discussion of the extended Euclidean algorithm for finding inverses in modular fields.

I. Preliminary Things Concerning Modular Fields

The statement was made in the previous set of notes that if n is prime, then modular addition and multiplication form an algebraic field. As mentioned there, most of the needed properties are inherited from the integers. For example, commutativity for a specific set of values can be shown as follows:

(2 * 3) mod 5 = (3 * 2) mod 5

Because:

2 * 3 = 3 * 2

The most important property of a field from the cryptographic point of view is the existence of multiplicative inverses for all nonzero elements of the field. This property does not obviously stem from the properties of integer arithmetic. It is somewhat more daunting to establish, and that topic will be pursued now.

1. If n is not Prime, not all Elements Are Invertible:

If you refer back to the multiplication table in the previous set of notes with n = 4, not prime, not every integer greater than 1 and less than 4 had an inverse. It can be shown that this is generally the case if n is not prime.

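In particular, the nontrivial factors of the composite number n do not have inverses. (More generally, no element that shares a factor greater than 1 with n has an inverse.) The claim for factors can be proven by contradiction: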

If n is composite, then there exist a and c less than n and not equal to 0 or 1 such that:

ac = n

Now assume that a has an inverse:

$(a \cdot a^{-1}) \bmod n = 1$

This can be rewritten without modulus by applying the definition of that operation. There exists some k not equal to 0 such that:

$a a^{-1} = kn + 1$

Since ac = n, you can substitute ac for n in this expression:

$a a^{-1} = kac + 1$

Collect the terms involving a on one side:

$a a^{-1} - kac = 1$

Then factor:

$a(a^{-1} - kc) = 1$

This is no longer modular arithmetic. It is simply integer arithmetic. The only positive integer a that can be multiplied by another integer to give 1 is 1 itself. This is a contradiction, because it was assumed that a was not 1. Therefore, a does not have an inverse mod n.
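The claim can be checked by brute force for a small composite modulus. The following is only an illustrative sketch; the function name `invertible_elements` is an assumption made for this example and does not come from the notes.

```python
def invertible_elements(n):
    """Return the values a, 0 < a < n, that have a multiplicative inverse mod n."""
    result = []
    for a in range(1, n):
        # a is invertible if some b in the same range gives (a * b) mod n == 1
        if any((a * b) % n == 1 for b in range(1, n)):
            result.append(a)
    return result

print(invertible_elements(4))   # [1, 3]: the factor 2 is missing
print(invertible_elements(12))  # [1, 5, 7, 11]: the factors 2, 3, 4 and 6 are missing
```

For n = 12 the values 8, 9 and 10 are also missing. They are not factors of 12, but each shares a factor greater than 1 with 12.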

2. If n is Prime, all Elements Are Invertible

A proposition of greater interest is that for n prime, every a, 0 < a < n, has an inverse. The contents of the example multiplication table give a hint at how to show this. Except for the row for 0, every row of the table is a permutation of the values 0 through n – 1, the only possible values in the field. If you can show that for an arbitrary a, there can be no duplicates in a row, then one of the row elements has to be 1. Thus, a has an inverse.

Given:

n is prime and 0 < a < n

Assume that there are duplicate entries in the row of the multiplication table for value a. This can be expressed using b, c, and d such that 0 < b, c, d < n and b is not equal to c:

(ab) mod n = d and (ac) mod n = d

This means that there exist some p and q such that:

ab = pn + d

ac = qn + d

Without loss of generality, assume that p > q and subtract the second equation above from the first:

ab – ac = n(p – q)

a(b – c) = n(p – q)

There is a contradiction lurking in this statement. The right hand side is a positive multiple of n, so n divides the product on the left hand side. Since n is prime, it must then divide either a or (b – c). A positive multiple of n is at least n, so this would imply that either a or (b – c) is greater than or equal to n. However, both a and (b – c) are positive and less than n, so this is a contradiction.

Conclusion:

The contradiction implies that there can be no duplicate entries in the row for a in the modular multiplication table. Since the valid values in a row range from 0 to n – 1 and there are n entries in a row, this means that there is a 1 in each row. Thus, every element must have an inverse.
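The permutation property is easy to observe for a small prime. The sketch below is illustrative only: the helper name `row` and the choice n = 7 are assumptions made for this example.

```python
def row(a, n):
    """Row for a in the mod-n multiplication table: (a * b) mod n for b = 0..n-1."""
    return [(a * b) % n for b in range(n)]

n = 7  # a small prime chosen for illustration
for a in range(1, n):
    r = row(a, n)
    # No duplicates means the row is a permutation of 0..n-1, so it contains a 1.
    assert sorted(r) == list(range(n))
    # The column where the 1 appears is the inverse of a.
    print(a, r, "inverse:", r.index(1))
```

For a composite modulus the check fails. For example, `row(2, 4)` is `[0, 2, 0, 2]`, which contains duplicates and no 1.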

3. Is r! Invertible?

The next question is, for some r, 0 < r < n, n prime, does the expression r! have an inverse in the field? This serves both as a reminder of what the factorial function is and as a review of proof by induction. Proof by induction will come up again in the proof of Fermat’s theorem.

First we need a definition of factorial. The full definition starts with 0! = 1. Because this question involves finding multiplicative inverses and 0 doesn’t have one, the smallest factorial that needs to be defined is 1!. Here is a recursive definition starting with 1!:

1! = 1

r! = r(r – 1)!

A recursive definition has a base case, and the general expression for the function is defined in terms of the function applied to a value 1 less. Mathematical induction to prove that something holds true uses the same idea. You establish that the result holds for some base value, and then show that if it holds for some arbitrary value, it holds for the value one greater.

The task is to show that for some r, r < n, the expression r! has an inverse in the field. An inductive proof follows.

Base case:

1! = 1. 1 has an inverse in the field, namely itself. Therefore, 1! has an inverse.

Inductive step:

Assume that for r < (n – 1), r! has an inverse in the field and show that (r + 1)! has an inverse as a result. (r for this step is chosen less than n – 1 so that r + 1 in the induction is less than n.)

If r! has an inverse, let it be represented by p. Using equivalence notation you can write:

r!p ≡n 1

By definition:

(r + 1)! = (r + 1)r!

But (r + 1) < n. Since n is prime, according to the result of the previous section, (r + 1) has an inverse in the field. Let it be represented by q. Using equivalence notation you can write:

(r + 1)q ≡n 1

The product (r + 1)!pq is simplified below. With each step of the simplification a justification based on a property or result is shown. The final result, 1, shows that pq is the inverse of (r + 1)!.

(r + 1)!pq

≡n (r + 1)r!pq definition of factorial

≡n (r + 1)(r!p)q associativity

≡n (r + 1)1q r! and p are inverses

≡n (r + 1)q definition of multiplicative identity

≡n 1 (r + 1) and q are inverses

This shows that (r + 1)! has an inverse, namely pq. This completes the induction. Given that 1! has an inverse, and that (r + 1)! has an inverse whenever r! does, r! has an inverse for all r < n. In general, this result tells you that the inverse of a factorial is the product of the inverses of its factors.
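The induction translates directly into a loop that builds the inverse of r! one factor at a time. This is a minimal sketch under stated assumptions: the names `inverse` and `factorial_inverse` are hypothetical, and the inverse of each factor is found by brute-force search.

```python
import math

def inverse(a, n):
    """Brute-force inverse of a mod n (assumes n prime and 0 < a < n)."""
    for b in range(1, n):
        if (a * b) % n == 1:
            return b
    raise ValueError("no inverse")

def factorial_inverse(r, n):
    """Inverse of r! mod n, built as the product of the inverses of 1, 2, ..., r."""
    p = 1  # base case: the inverse of 1! is 1
    for k in range(2, r + 1):
        q = inverse(k, n)   # inverse of the new factor k
        p = (p * q) % n     # inverse of k! is (inverse of (k-1)!) times q
    return p

n, r = 7, 4
p = factorial_inverse(r, n)
print(p, (math.factorial(r) * p) % n)  # 5 1
```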

II. Fermat’s Little Theorem

Cryptography makes use of a theorem by Fermat, known as Fermat’s little theorem. It has this name to distinguish it from another theorem of Fermat’s that is known as Fermat’s last theorem, or simply Fermat’s theorem. In these notes, after this preliminary section, a reference to Fermat’s theorem means the little theorem, not the last theorem.

Before pursuing Fermat’s little theorem, it might be worthwhile to give some information on his last theorem. Here is its statement:

An equation of the form $x^n + y^n = z^n$ does not have non-zero integer solutions for x, y and z when n > 2.

For n = 2, you can find integral solution sets to equations of this form. They are known as Pythagorean triples. Examples are {3, 4, 5}, {5, 12, 13}, and {9, 12, 15}. The theorem says that you can’t find such triples for any power higher than 2.

Pierre de Fermat died in 1665 and a marginal note he had written in one of his books stated that he had found a proof of this theorem. A mathematician named Andrew Wiles, born and educated in England, who now lives in the United States, published the first (correct) proof in 1995. It is said that he devoted 7 years of his professional life full time to solving the problem. Many other people collectively spent many years unsuccessfully trying to prove or disprove it in the 300+ years since it was originally stated.

Statement of Fermat’s Little Theorem

For n prime and a < n:

$a^n \equiv_n a$

In words: a to the nth power is equivalent mod n to a. Stating this in another way, there exists some value p such that:

$a^n = pn + a$

Before moving on to trying to demonstrate this, it is worthwhile to see why this result might be of interest. It gives a way of finding $a^{-1}$. Recall that a does have an inverse. Multiplying both sides of the equivalence stated in the theorem by $a^{-1}$ two times in a row gives the following:

$a^n a^{-1} \equiv_n a a^{-1}$

$a^{n-1} \equiv_n 1$

$a^{n-1} a^{-1} \equiv_n 1 \cdot a^{-1}$

$a^{n-2} \equiv_n a^{-1}$

The practical use of this result can be shown with a concrete example. Referring to the mod 5 multiplication table, it was seen that 2 and 3 are inverses mod 5. “Looking in the multiplication table” is the equivalent of a brute force search of all possibilities. The inverses can be found instead by computing the result using Fermat’s theorem:

$2^{5-2} \bmod 5 = 2^3 \bmod 5 = 8 \bmod 5 = 3$

$3^{5-2} \bmod 5 = 3^3 \bmod 5 = 27 \bmod 5 = 2$
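The same computation can be written as a short function. This is a sketch rather than a definitive implementation; the name `fermat_inverse` is an assumption, and the function relies on Python's built-in three-argument `pow`, which computes a power modulo a given value.

```python
def fermat_inverse(a, n):
    """Inverse of a mod n via a^(n-2) mod n; assumes n is prime and 0 < a < n."""
    return pow(a, n - 2, n)  # built-in modular exponentiation

print(fermat_inverse(2, 5))  # 3
print(fermat_inverse(3, 5))  # 2
```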

Binomial Coefficients

The binomial coefficients turn out to be useful in proving Fermat’s theorem. First of all, recall what they are. The notation looks like this:

$\binom{n}{r}$

In English, this is read “n choose r”. This means, given a set of n elements, how many different ways are there to choose a subset of r elements from it. The order of the elements in a set is immaterial, so subsets containing the same elements in a different order are not considered different. The mathematical definition of the binomial coefficient looks like this:

n n!     r  r!(n  r)! 7

A concrete example looks like this:

5 5! 5! 1      3 3!(5  3)! (5  3)! 3!

You can interpret the first factor on the right as the number of different ways of choosing 3 elements out of 5 where the order of the chosen 3 does make a difference. The second factor divides by the number of different ways of ordering 3 elements. Thus, the result is the number of different ways of choosing 3 where the order doesn’t make a difference.

You may also be familiar with Pascal’s triangle, a nice mnemonic device for coming up with the binomial coefficients without calculations:

        1
      1   1
    1   2   1
  1   3   3   1
1   4   6   4   1
        …

The top of the pyramid, the 0th line in the pyramid, represents n = 0. There is only one coefficient in this case. The next line down, the 1st line in the pyramid, represents n = 1. There are 2 coefficients in this case:

1 1   ,   0 1

The general pattern of the coefficients, then, is:

n n n n  , , ,...,  0 1 2 n

And to refresh your memory, the reason they are called binomial coefficients is that the following formula describes the expansion of a binomial raised to an arbitrary integral power:

n n n nr r (a  b)    a b r0  r  8

In part of the following argument it will turn out that we’d like to deal with the cases where r = 0 and r = n separately. You can observe from Pascal’s triangle that they always give 1. This is demonstrated here for r = 0. The result comes to the same thing if r = n. Remember that in the full definition of factorial, 0! = 1:

n n! n!      1 0 0!(n  0)! 1n!

A Result Needed in Order to Prove Fermat’s Little Theorem

We are interested now in whether a binomial coefficient in general is evenly divisible by n if n is prime. In other words, for n prime, does the binomial coefficient equal 0 mod n. If this is true, a useful result follows from it. Write an expression where n is factored out of the expression for the binomial coefficient:

n n! (n 1)!     n  r  r!(n  r)! r!(n  r)!

In the cases where r = 0 and r = n, this would actually be a false step, assuming we want to work only with integers. Since the value of the binomial coefficient is 1, it would have to be the case that the rest of the expression has the value 1/n, a fraction. Therefore, we will consider the first and last coefficients separately.

What about the situation where 0 < r < n? Is it valid to factor n out of the formula for the coefficient and expect that the other factor, shown by itself below, to always be a whole number?

$\frac{(n - 1)!}{r!(n - r)!}$

This raises an interesting antecedent question. Is a binomial coefficient, in general, a whole number? In other words, is the following expression a whole number?

n n!     r  r!(n  r)!

It is not immediately clear how you might prove this just using the properties of numbers. However, by appealing to Pascal’s triangle it seems that it is. Likewise, using the constructive definition of the binomial coefficient, it seems clear that the sum of the integral coefficients of the like terms of a binomial expansion could only be a whole number.

Still, there is no guarantee in general that factoring out n gives another integral factor. In other words, the question is whether or not the second factor, shown as a fraction in the expression below, is an integer:

$\frac{n!}{r!(n - r)!} = n \cdot \frac{(n - 1)!}{r!(n - r)!}$

The key to the argument is that we are only considering the case where n is prime. The fraction on the left is a binomial coefficient, so it reduces to a whole number; that is, the denominator goes evenly into the numerator. Because 0 < r < n, every factor of r! and of (n – r)! is less than n, and since n is prime, none of them shares a factor with n. The denominator must therefore cancel entirely against (n – 1)!. Thus, if n is factored out, the remaining expression must still reduce to a whole number.

This means that for 0 < r < n and n prime, the binomial coefficient is evenly divisible by n, or the binomial coefficient is equivalent to 0 mod n:

$\binom{n}{r} \bmod n = 0$

Or:

$\binom{n}{r} \equiv_n 0$
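This divisibility result can be observed numerically. The sketch below is illustrative only; the function name `binomial_row_mod` is an assumption, and `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_row_mod(n):
    """Binomial coefficients C(n, r) mod n for r = 0..n."""
    return [math.comb(n, r) % n for r in range(n + 1)]

print(binomial_row_mod(7))  # [1, 0, 0, 0, 0, 0, 0, 1]
print(binomial_row_mod(6))  # [1, 0, 3, 2, 3, 0, 1]
```

For the prime 7 every interior coefficient is 0 mod n. For the composite 6 some interior coefficients are not, which shows that the primality of n is essential to the argument.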

Now go back to the binomial expansion and see what this means. The expansion can be rewritten to isolate the terms where r = 0 and r = n:

$(a + b)^n = a^n + b^n + \sum_{r=1}^{n-1} \binom{n}{r} a^{n-r} b^r$

The terms with coefficients of 1 are separated out, and every term of the summation consists of a product including a binomial coefficient where 0 < r < n. It was just shown that such binomial coefficients are equivalent to 0 mod n. The reducibility properties say that the mod of a sum is the sum of the mod, and so the whole summation is equivalent to 0 mod n. Likewise, the mod of the whole right hand side reduces simply to the mod of the first two terms, those with a coefficient of 1. Applying modulus to both sides of the equation given above and simplifying step by step leads to the desired result:

$(a + b)^n \bmod n = \left(a^n + b^n + \sum_{r=1}^{n-1} \binom{n}{r} a^{n-r} b^r\right) \bmod n$

$(a + b)^n \bmod n = (a^n + b^n) \bmod n + \sum_{r=1}^{n-1} \binom{n}{r} a^{n-r} b^r \bmod n$

$(a + b)^n \bmod n = (a^n + b^n) \bmod n + 0$

$(a + b)^n \bmod n = (a^n + b^n) \bmod n$

Or in the most concise notation:

$(a + b)^n \equiv_n a^n + b^n$

This is a memorable result in its own right. Everyone knows that in the reals the power of a sum is not generally the sum of the powers of the terms. However, in a modular field with n prime, this holds for a power of n. This equivalence is used in the proof of Fermat’s Little Theorem.
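The equivalence can be spot-checked exhaustively for small moduli. This is a minimal sketch; the function name `power_of_sum_matches` is an assumption made for this example.

```python
def power_of_sum_matches(n):
    """True if (a + b)^n is equivalent to a^n + b^n mod n for all 0 <= a, b < n."""
    return all(
        pow(a + b, n, n) == (pow(a, n, n) + pow(b, n, n)) % n
        for a in range(n)
        for b in range(n)
    )

print(power_of_sum_matches(5))  # True
print(power_of_sum_matches(7))  # True
print(power_of_sum_matches(4))  # False: with a = b = 1, 2^4 mod 4 = 0 but 1 + 1 = 2
```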

III. The Proof of Fermat’s Little Theorem

For reference purposes, this is the statement of Fermat’s theorem again:

For n prime and a < n:

$a^n \equiv_n a$, or $a^n \bmod n = a$

The theorem can be proven inductively. You need a base step and an induction step.

Base step: 0 to any power is 0 and anything goes into 0 zero times with a remainder of 0. The base case is a = 0. This is summarized using the notation as follows:

$0^n \equiv_n 0$

Then for a = 0:

$a^n \equiv_n a$, or

$a^n \bmod n = a$

Induction step: Given $a^n \bmod n = a$ for n prime, show that $(a + 1)^n \bmod n = a + 1$. With each step of the demonstration a justification based on a property or result is shown.

$(a + 1)^n \bmod n$

$= (a^n + 1^n) \bmod n$ by the result of the previous section

$= (a^n + 1) \bmod n$ by definition of the identity and multiplication

$= a^n \bmod n + 1 \bmod n$ by reducibility

$= a^n \bmod n + 1$ by simple arithmetic

$= a + 1$ by the inductive assumption

This completes the induction, giving:

$(a + 1)^n \bmod n = a + 1$

Or:

$(a + 1)^n \equiv_n a + 1$
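The statement of the theorem can be checked directly for a given modulus. The sketch below is illustrative; the function name `fermat_holds` is an assumption made for this example.

```python
def fermat_holds(n):
    """True if a^n mod n == a for every a with 0 <= a < n."""
    return all(pow(a, n, n) == a for a in range(n))

print(fermat_holds(11))  # True
print(fermat_holds(13))  # True
print(fermat_holds(4))   # False: 2^4 mod 4 = 0, not 2
```

A True result for some n does not by itself establish that n is prime. The theorem only guarantees the equivalence when n is prime, not the converse.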

Restating Fermat’s Little Theorem and Why It’s Important

The theorem says for n prime and a < n:

$a^n \equiv_n a$

The reason it’s important is that it gives a computational formula for finding inverses in a modular field. If you multiply the left hand side and the right hand side of the equivalence by $a^{-1}$ twice, you end up with a computable expression on the left equal to $a^{-1}$ on the right.

$a^n a^{-1} \equiv_n a a^{-1}$

$a^{n-1} \equiv_n 1$

$a^{n-1} a^{-1} \equiv_n 1 \cdot a^{-1}$

$a^{n-2} \equiv_n a^{-1}$

Exponentiation in a modular field is just repeated multiplication as usual. Using Fermat’s theorem to find an inverse requires evaluating $a^{n-2}$. Computed by repeated multiplication, this involves n – 3 modular multiplications. For large values of a and n the power will tend to get very large. Reducibility helps with that problem, but each application of reduction requires a division by n. Because of the cost involved, Fermat’s theorem applied in this way is not the most practical way of finding an inverse.

You can also find the inverse by searching. You can multiply a by every other value in the field until you get a result which is the identity. There are n – 1 candidate inverses. In this way, on average you will find the inverse after (n – 1) / 2 modular multiplications. Searching is also obviously not an ideal solution. The order of complexity of using Fermat’s theorem and of searching is the same: both are linear in n.
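The two counts can be compared with a short sketch. Both function names below are assumptions made for this example, and the power is deliberately computed by plain repeated multiplication, as described in the text, with a reduction after every step.

```python
def inverse_by_power(a, n):
    """a^(n-2) mod n by repeated multiplication, reducing after each step.
    Returns (inverse, number of modular multiplications). Assumes n prime, n >= 3."""
    result = a % n
    count = 0
    for _ in range(n - 3):  # starting from a, n - 3 multiplications reach a^(n-2)
        result = (result * a) % n
        count += 1
    return result, count

def inverse_by_search(a, n):
    """Try candidates 1, 2, ... until (a * b) mod n == 1.
    Returns (inverse, number of modular multiplications)."""
    for count, b in enumerate(range(1, n), start=1):
        if (a * b) % n == 1:
            return b, count
    raise ValueError("no inverse found")

print(inverse_by_power(8, 23))   # (3, 20)
print(inverse_by_search(8, 23))  # (3, 3)
```

The search count depends on where the inverse happens to fall among the candidates, while the repeated-multiplication count is always n – 3.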

IV. The Extended Euclidean Algorithm

Because Fermat’s theorem isn’t very efficient computationally, other algorithms have been developed for finding the inverse in a modular field. One depends on the Euclidean algorithm for finding the gcd. This is the general statement of this algorithm as it was given previously:

Given: a, b, x = gcd(a, b), a = mb + r, gcd(a, b) = gcd(b, r)

a and b are the knowns. x is the unknown. Given a and b, using integer division and modulus, it is easy to find m and r. Then the same result can be applied to b and r. The steps would go as follows:

$a = m b + r_0$, so $\gcd(a, b) = \gcd(b, r_0)$

$b = m_1 r_0 + r_1$, so $\gcd(b, r_0) = \gcd(r_0, r_1)$

$r_0 = m_2 r_1 + r_2$, so $\gcd(r_0, r_1) = \gcd(r_1, r_2)$

…

The algorithm terminates when you reach $r_n = 0$.
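The steps listed above can be sketched as a loop that records each division. The function name `euclid_steps` and the sample values 23 and 8 are assumptions made for this illustration.

```python
def euclid_steps(a, b):
    """Return the rows (dividend, quotient, divisor, remainder) of the
    Euclidean algorithm, ending with the row whose remainder is 0."""
    steps = []
    while b != 0:
        m, r = divmod(a, b)       # a = m*b + r
        steps.append((a, m, b, r))
        a, b = b, r               # gcd(a, b) = gcd(b, r)
    return steps

for a, m, b, r in euclid_steps(23, 8):
    print(f"{a} = {m}*{b} + {r}")
```

The gcd is the divisor in the last row, that is, the last nonzero remainder.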

To be more specific, the terminating step looks like this:

$r_{n-2} = m_n r_{n-1} + 0$

If a is prime and 0 < b < a, then gcd(a, b) = 1. That means that the next to last step will take this form:

$r_{n-3} = m_{n-1} r_{n-2} + 1$

A simple rearrangement gives an expression for 1 in terms of $r_i$ and $m_j$:

$1 = r_{n-3} - m_{n-1} r_{n-2}$

This is useful for the following reason: Each of these $r_i$, $m_j$ is defined in the previous step. The components of that step are defined in the step before that one, all the way back to the 0th step. That means that it is possible to substitute all the way back to the top until 1 is expressed in terms of a and b. Before explaining further why this is useful, back substitution will be illustrated with a very simple example:

Let a = 23 and b = 8. The steps of the Euclidean algorithm are:

23 = 2·8 + 7

8 = 1·7 + 1

7 = 7·1 + 0

Rewriting the next to last step, you have:

1 = 8 - 1·7

Now consider the first step. It can be solved for 7:

7 = 23 - 2·8

If you substitute this expression for 7 into the equation solved for 1, you get:

1 = 8 – 1· (23 – 2·8)

1 = 3·8 – 23

The critical point is that 1 has been expressed as a linear combination of a = 23 and b = 8. If you rewrite to isolate the term involving b on one side, you get:

1 + 23 = 3·8

If you take the modulus base a = 23 of both sides, you get:

(1 + 23) mod 23 = 3·8 mod 23

1 = 3·8 mod 23

This tells you that the inverse of b = 8 in the field with a = 23 is 3.

Before going on to more examples, the desirability of getting 1 as a linear combination of a and b can be explained in general terms. Consider this expression:

1 = ma + nb

Then you can isolate the term containing b:

1 – ma = nb

Taking the modulus base a of both sides gives:

(1 – ma) mod a = nb mod a

1 = nb mod a

So n is the inverse of b in the field.

Back substitution is not terribly hard, but the first example did not show how lengthy it can become even for a relatively small set of numbers. Let a = 31 and b = 13. The Euclidean algorithm gives:

31 = 2·13 + 5

13 = 2·5 + 3

5 = 1·3 + 2

3 = 1·2 + 1

2 = 2·1 + 0

Then you write:

1 = 3 – 1·2

Solving the previous step for 2 gives:

2 = 5 – 1·3

Back substituting gives:

1 = 3 – 1· (5 – 1·3)

1 = 2·3 – 1·5

3 = 13 – 2·5

1 = 2· (13 – 2·5) – 1·5

1 = 2·13 – 5·5

5 = 31 – 2·13

1 = 2·13 – 5· (31 – 2·13)

1 = 12·13 – 5·31

Again you can rearrange and take the modulus:

(1 + 5·31) mod 31 = (12·13) mod 31

1 = (12·13) mod 31

This means that 12 is the inverse of 13. This is easy to verify. 12·13 = 156. 31 goes into 156 5 times with a remainder of 1.
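The forward pass followed by back substitution can be mechanized. This is a sketch under stated assumptions, not a definitive implementation: the function name `back_substitute` is hypothetical, and the returned pair (m, n) follows the form 1 = ma + nb used earlier.

```python
def back_substitute(a, b):
    """Run Euclid forward on (a, b), then substitute backward.
    Returns (m, n) with m*a + n*b == gcd(a, b)."""
    quotients = []
    x, y = a, b
    while y != 0:                 # forward pass: x = q*y + r
        q, r = divmod(x, y)
        quotients.append(q)
        x, y = y, r
    # In the last row the remainder is 0, so gcd = 0*x + 1*y for that row.
    s, t = 0, 1
    # Walk back up: replace each remainder by (x - q*y) from the row above.
    for q in reversed(quotients[:-1]):
        s, t = t, s - t * q
    return s, t

m, n = back_substitute(31, 13)
print(m, n, m * 31 + n * 13)  # -5 12 1
```

The coefficient 12 on b = 13 is the inverse found by hand above.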

There is one other thing that can happen that can be illustrated with an example. Let a = 157 and b = 30. The Euclidean algorithm gives:

157 = 5·30 + 7

30 = 4·7 + 2

7 = 3·2 + 1

2 = 2·1 + 0

1 = 7 – 3·2

1 = 7 – 3(30 – 4·7)

1 = -3·30 + 13·7

1 = -3·30 + 13(157 – 5·30)

1 = -68·30 + 13·157

Rearranging gives:

1 – 13·157 = -68·30

You have something of this form:

1 = -nb + ma

1 – ma = -nb

It may not be clear how to think about negative values and modulus. Doing negative division and trying to find the remainder doesn’t work so well. It is helpful to think in terms of equivalence classes instead. The elements of equivalence classes are separated by multiples of a, ka. This means that even though you have –m, the modulus of the left hand side is still 1:

(1 – ma) mod a = 1

On the right hand side you are interested in what –n is mod a. Suppose –a < –n < 0. If you add a to –n, a – n is in the same equivalence class as –n, and 0 < a – n < a. They are in the same equivalence class because their values are separated by a. Putting this result for the right hand side together with the result for the left hand side gives:

1 = (1 – ma) mod a = (–nb) mod a = ((a – n)b) mod a

In other words, the back substitution gives –n as the raw result, and a – n is the inverse. For the concrete example, the outcome was:

1 = -68·30 + 13·157

This means that the inverse of 30 in the modular field base 157 is 157 – 68 = 89.
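In code, the shift into the range 0 through a – 1 is a single operation. The sketch below assumes Python, whose `%` operator returns a non-negative result when the modulus is positive; the function name `to_field_element` is hypothetical.

```python
def to_field_element(coefficient, modulus):
    """Map any integer coefficient to its representative in 0..modulus-1."""
    # For a positive modulus, Python's % never returns a negative value,
    # so a negative coefficient is shifted by the needed multiple of the modulus.
    return coefficient % modulus

inverse = to_field_element(-68, 157)
print(inverse, (inverse * 30) % 157)  # 89 1
```

Not every language defines the remainder of a negative number this way, so in other languages an explicit addition of the modulus may be needed.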

Back substituting works, but it is not ideal to run forwards through the Euclidean algorithm and then back again. It is possible to devise an algorithm where all needed quantities are computed on the forward pass. The claim is that this approach is better than Fermat’s theorem or linear search. The order of complexity depends on how many steps Euclid’s algorithm takes for a given set of values. That number grows only logarithmically with the size of the inputs, whereas the two approaches described earlier are linear in n.
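One common way to arrange such a forward-only computation is sketched below. It is offered as an illustration under stated assumptions, not as the specific algorithm these notes have in mind; the function name `mod_inverse` and its variable names are hypothetical. Alongside each remainder it carries a coefficient t such that the remainder is equivalent to t·b mod a.

```python
def mod_inverse(b, a):
    """Inverse of b modulo a, computed in a single forward pass of Euclid's algorithm."""
    old_r, r = a, b
    old_t, t = 0, 1   # invariant: old_r ≡ old_t * b and r ≡ t * b (mod a)
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r   # ordinary Euclidean step
        old_t, t = t, old_t - q * t   # same update applied to the coefficients
    if old_r != 1:
        raise ValueError("b has no inverse modulo a")
    return old_t % a  # shift a negative coefficient into 0..a-1

print(mod_inverse(8, 23))    # 3
print(mod_inverse(13, 31))   # 12
print(mod_inverse(30, 157))  # 89
```

The three calls reproduce the results of the three back-substitution examples worked by hand.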