Revisiting Elliptic Curve Cryptography with Applications to Post-quantum SIDH Ciphers

by Wesam Nabil Eid

Master of Science Electrical and Computer Engineering University of New Haven 2013

A dissertation submitted to the Department of Computer Engineering and Sciences of Florida Institute of Technology in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Computer Engineering

Melbourne, Florida May, 2020

© Copyright 2020 Wesam Nabil Eid. All Rights Reserved.

The author grants permission to make single copies.

We the undersigned committee hereby recommend that the attached document be accepted as fulfilling in part the requirements for the degree of Ph.D. in Computer Engineering.

“Revisiting Elliptic Curve Cryptography with Applications to Post-quantum SIDH Ciphers” a dissertation by Wesam Nabil Eid

Marius C. Silaghi, Ph.D. Associate Professor, Department of Computer Engineering and Sciences Major Advisor

Carlos Otero, Ph.D. Associate Professor, Department of Computer Engineering and Sciences Committee Member

Susan Earles, Ph.D. Associate Professor, Department of Computer Engineering and Sciences Committee Member

Eugene Dshalalow, Dr.rer.nat. Professor, Department of Mathematical Sciences Committee Member

Philip Bernhard, Ph.D. Associate Professor and Department Head, Department of Computer Engineering and Sciences

ABSTRACT

Revisiting Elliptic Curve Cryptography with Applications to Post-quantum SIDH Ciphers

by

Wesam Nabil Eid

Thesis Advisor: Marius C. Silaghi, Ph.D.

Elliptic Curve Cryptography (ECC) has positioned itself as one of the most promising candidates for various applications since its introduction by Miller and Koblitz in 1985 [53, 44]. The core operation for ECC is the scalar multiplication [k]P, and many efforts have addressed its computation speed. Here we introduce an efficient approach for calculating elliptic curve operations by a novel regrouping of terms, creating new projective representation operators, and increasing parallelism. These operators and the corresponding projective coordinate representations are shown to lead to adjusted versions of scalar multiplication algorithms, which are evaluated. These techniques enable more opportunities for optimizing computations, leading to an important speed-up for every application based on elliptic curves, such as encryption, cryptanalysis, digital signatures, and pseudo-random generators. Also benefiting from our work is the post-quantum cryptosystem Supersingular Isogeny Diffie-Hellman (SIDH). Its main weakness is the complexity of elliptic curve computation, which we improve, while the complexity of its main quantum attacks is maintained. For other elliptic curve schemes, the computation speed-up also favors attacks, which can however be compensated by increasing the size of the key. In addition, we simulate the modeled design as a hardware arithmetic circuit, to further quantify the improvements that can be obtained.

Table of Contents

Abstract
List of Figures
List of Tables
List of Symbols
Acknowledgments
Dedication

Chapter 1 Introduction
1.1 Motivation
1.2 Summary of Ideas and Contributions
1.3 Results
1.4 Structure of Dissertation

Chapter 2 Background

Chapter 3 Fast 2^nP
3.1 Fast nP + mQ
3.1.1 Fast 2^2P
3.1.2 Fast 2^3P
3.1.3 Fast 2^4P
3.1.4 Fast 3P (Point Tripling)
3.1.5 Fast 2^nQ + P
3.1.6 Fast 2^nP + 2Q
3.1.7 2^nP + mQ
3.1.8 Generalizing 2^nP + Q and 2^nP + mQ Forms
3.1.9 Another Implementation of 6Q
3.1.10 Another Implementation of 10Q
3.2 Results
3.3 Further optimization

Chapter 4 Direct Doubling
4.1 Direct Doubling with Labeling
4.2 Comparing Fast 2^nP algorithm vs Direct Doubling

Chapter 5 Other Coordinate Systems
5.1 Projective Coordinates
5.2 Jacobian Coordinates
5.3 Montgomery Coordinates

Chapter 6 EiSi Coordinates

Chapter 7 Sample Applications
7.1 Algorithms Overview
7.2 Fast Multiplication with Mixed Base Multiplicands
7.2.1 Right-to-left Extensions
7.2.2 Double and Add Extensions
7.2.3 NAF Extensions
7.3 Fast Multiplication with Base 16 Multiplicands
7.4 Fast Multiplication with Base 32 Multiplicands
7.5 Fast Multiplication with Base 1024 Multiplicands
7.6 Inverting Multiplications based on Curve Order

Chapter 8 Supersingular Isogeny Diffie-Hellman SIDH
8.1 Supersingular Curve and Elliptic Curve
8.2 Isogenies
8.3 Isomorphisms
8.4 J-Invariant
8.5 Computing Isogenies over Finite
8.5.1 Finite Fields and Frobenius Isogeny
8.6 SIDH and Key Exchange
8.7 SIDH and Post-Quantum Cryptosystem

Chapter 9 Results and Experiments
9.1 Functions Description and Properties
9.2 Double-and-Add vs NAF vs Right-to-left
9.3 Base 16 vs 32 vs 1024 Multiplicands
9.4 Our Work vs Original
9.5 EiSi Coordinates vs Others
9.6 Number of Multipliers Comparison

Chapter 10 Conclusion

References

List of Figures

2.1 Elliptic curve arithmetic hierarchy [36].
2.2 Cracking the secret key. (a) Standard cells and regular routing using 15K measurements, key byte found. (b) Standard cells and regular routing using 15K measurements, key byte found. (c) WDDL and differential routing using 1.5 M measurements, key byte found. (d) WDDL and differential routing using 1.5 M measurements, key byte not found [38].
2.3 Generation of Signed Payment [31].
2.4 Validation of Signed Purchase REQ and Signed Purchase Invoice [31].
3.1 Cyclic of the elliptic curve E [63].
5.1 Montgomery Doubler Implementation Flowchart.
7.1 Three point ladder (left-to-right) [40, 29].
7.2 Right-to-left algorithm [29].
7.3 Data-dependency graph for calculating a single double merged with another one (Parallelization characteristic).
7.4 First Proposed Algorithm.
7.5 Left-to-Right Proposed Algorithm.
7.6 Left-to-Right Double-Add-and-Subtract Algorithm.
8.1 j-invariants in F_{431^2} [21].
8.2 SIDH Key Exchange Protocol [26].
9.1 Computing the point 24P by using remi func and remi point functions.
9.2 Computing the point 29P by using remi func and remi point functions.
9.3 DA vs NAF vs RL in terms of Number of Multiplications.
9.4 DA vs NAF vs RL in terms of Number of Maximum Levels.
9.5 Base 16 vs 32 vs 1024 Multiplicands in terms of Number of Multiplications.
9.6 Base 16 vs 32 vs 1024 Multiplicands in terms of Number of Maximum Levels.
9.7 Base 32 vs Original algorithms in terms of Number of Multiplications.
9.8 Our Work vs Other Coordinates Algorithms in terms of Number of Multiplications.
9.9 Our Work vs Other Coordinates Algorithms in terms of Number of Maximum Levels.
9.10 Maximum levels for different number of multipliers.

List of Tables

3.1 Algorithms Preliminary Measurements.
3.2 List of Labels for 4P, 8P and 16P.
4.1 Number of Mult. in Fast 2^n vs Direct Doubling.
5.1 List of Labels for Projective Algorithms.
5.2 List of Labels for Jacobian Algorithms.
8.1 Instantiations of Diffie-Hellman.
9.1 DA, NAF and RL Algorithms Measurements.
9.2 Base 16, 32 and 1024 Algorithms Measurements.
9.3 Base 32 vs Original Algorithms Measurements.
9.4 EiSi vs Other Coordinates Measurements.
9.5 List of Algorithms Linear Equations.
9.6 Expected number of Mults and MaxLs with key of sizes 751 and 1013.
9.7 ∆ Values for Base 32 vs Jacobian and Base 1024 vs Montgomery.
9.8 The number of multipliers appropriate to achieve the highest level of parallelism.
9.9 MaxL at Different Number of Multipliers.

List of Symbols

ECC                Elliptic Curve Cryptography
Nxn                The abscissa in EiSi coordinates
Nyn                The ordinate in EiSi coordinates
Un                 The scale in EiSi coordinates
a mod n            The remainder of dividing a by n
a (mod n)          The residue class of a modulo n, cf. Gauss
a ≡ b (mod n)      The residue classes of a and b modulo n are the same
a^{-1} mod n       The inverse of a modulo n, 0 ≤ a^{-1} < n s.t. a ∗ a^{-1} mod n = 1
a^{-1} (mod n)     The residue class of integers a^{-1}, s.t. a ∗ a^{-1} ≡ 1 (mod n)
a/b (mod n)        a ∗ b^{-1} (mod n)

Acknowledgements

The Prophet Muhammad, peace and blessings be upon him, said, “He has not thanked Allah who has not thanked people.”

Graduate studies at the Florida Institute of Technology have been both challenging and fun. The classes have broadened my perspective as a computer engineering graduate and provided me with the knowledge and skills needed to become an accomplished engineering instructor and researcher. I also found the weekly research meetings with my advisor, Dr. Marius Silaghi, where we shared updates, worked out new contributions, and I learned from his experience, very beneficial. This interaction has contributed in many ways to the successful completion of this dissertation, which presents the major milestones in my graduation journey. This work would not have been possible without the blessing of Allah, the support of my family, the patience and support of my academic advisor, and the financial support that I received from the Saudi Arabian Cultural Mission Scholarship (SACM). I also want to express my gratitude to Umm Al-Qura University, Makkah, KSA, which has shown confidence in me by granting me this opportunity to finish my graduate studies.

At the outset, I want to thank my Creator and my Lord, Allah, who blessed me and bestowed upon me a myriad of blessings. Praise be to Allah, who gave me knowledge, strength, and patience until I passed this important stage of my life, which opened my eyes to being a part, even if a very small one, of the system of science and development in our present world. When I was about to lose hope, He was with me, strengthening me and blessing me. Glory to Allah the Great, All-knowing, Merciful and Sustainer.

I would like to thank my parents, who are to be credited with creating the man I am now. Their support and endless love are beyond description, and neither my words nor my actions can deliver what they deserve. In addition, I do not lose sight of my thanks to my brothers and sisters, who were an important source of support any time I needed them. I hope that the knowledge I received at the graduate level will inspire me to be a better person so that I can make my family happy and support them properly. Nobody has been more important to me in the pursuit of this work than the members of my family.

I would like to thank Dr. Marius Silaghi, my advisor, teacher, and most importantly my biggest supporter, for his guidance and patience during my journey. When obstacles came my way, he was always available with advice and a solution to see me through. He helped and taught me more than I could ever give him credit for here. He applied the ideal method that an academic advisor can use to guide his student. Dr. Silaghi taught me many things, including dedication to work, sincerity, caring for students, humility and cooperation. The lessons I learned from him are not limited to words, but will remain engraved in my memory and be part of my personality in the future as a teacher and researcher and, above all, as a human being. I was very lucky to work with him and, as many of my friends think, I am fortunate to have him as my advisor. Thanks also to his generous family, for whom I have all respect and admiration, for receiving me frequently and sharing part of my journey. I am grateful to all of those with whom I have had the pleasure to work during this and other related projects. Each of the members of my Dissertation Committee has provided me extensive personal and professional guidance and taught me a great deal about both scientific research and life in general. Dr. Dshalalow was an example of humility despite his high standing in mathematics. His kind words and valuable advice were an inspiration to me and a major motivation in my journey to provide the best I can. Dr. Otero has always motivated me with words and deeds. I do not forget the words he addressed to me when I was in the early stages of my graduate studies: that I should be the best in my field of specialization and repeat my attempts until I reach an idea of genius appropriate for the doctorate level. Thank you, Doctor Otero, for all that you have given me; I will not forget your words. Dr. Earles was kind and willing to receive me and help me anytime. She taught me in the early stages before I began the research phase, and she was a distinguished teacher who always gave me advice and showed her interest, which motivated me to always provide the best. I also acknowledge the help of Mr. Thamer Alrouqi in programming the implementation. His effort was really helpful to me as a friend and to my research generally.

I do not lose sight of my gratitude to my first inspiration and the main reason, after Allah Almighty, for my involvement in the field of cryptography. Without knowing me beforehand and without any personal interest, he did not hesitate to help me and share his extensive knowledge in this field. All thanks and appreciation go to Prof. Turki Al-Somani; I will never forget the favor he did me. People like him are rare, and I was very lucky to have met him and had the honor to work with him.

Dedication

“To Solomon We inspired the (right) understanding of the matter: to each (of them) We gave Judgment and Knowledge; it was Our power that made the hills and the birds celebrate Our praises, with David: it was We Who did (all these things).”

Al-Anbiya, Verse 79.

The Prophet Muhammad, peace be upon him, said: “Allah loves that whenever any of you does something, he should excel in it.”

There is a vast difference between a man who teaches you how to dig and plant and then watches you until he makes sure that you are doing it properly, and a man who teaches you and then continues to help you, digs with you to the end, and shares with you in sowing the seeds and waiting until he sees the crop grow and flourish. My academic parent is the second man.

I dedicate this work to my parents and my academic parent, Dr. Marius Silaghi.

Chapter 1

Introduction

Secure Internet-based communications rely on public-key cryptography, which allows entities to communicate without the need to share confidential material in advance. Elliptic Curve Cryptography (ECC) is still a predominant type of public-key cryptography [60]. ECC is traced to 1985, when Victor Miller and Neal Koblitz proposed it as an alternative to the factorization-based cryptosystem RSA. ECC has also been a predominant cryptographic primitive for encrypted email, online banking, secure e-commerce websites, digital signatures and other data transfer platforms. Breaching these would have significant effects on digital security and electronic privacy. The adoption of ECC has also been accelerated by recommendations from an array of standardization entities, including NIST, IETF and ANSI, among others (NIST, 2016). Compared to RSA, ECC reaches a given security level with smaller keys, which makes it one of the most efficient public key cryptosystems (PKC). While there are known quantum and classical attacks that breach cryptographic protocols grounded on supersingular isogeny graphs (SIGs), Supersingular Isogeny Diffie-Hellman (SIDH) can handle quantum-based attacks, but the subject is still understudied.

1.1 Motivation

Today, the uses of ECC have become multiple and varied, whether in the areas of communications, banking applications, digital signatures or even operating systems. ECC has many applications because of its smaller keys and increased theoretical robustness. Especially in light of the developments in the field of quantum computing and the threats that this technology may pose to existing cryptosystems, it has become necessary to speed up elliptic curve computations. Quantum computers effectively break the elliptic curve, factoring and finite field problems adopted for public key cryptography.

Supersingular Isogeny Diffie-Hellman (SIDH) key exchange is one of the post-quantum cryptographic algorithms adopted to create a secure key exchange between communicating entities over insecure communication channels [5, 23]. The core operations for SIDH are computing the isogeny and its kernel. Basically, Vélu's formula is used to compute the isogeny, and the P + [k]Q formula is used to compute the kernel, where P and Q are points on the curve and k is the secret key that is generated by both parties [40]. The complexity of SIDH stems from the difficulty of finding isogenies, in addition to the cost of computing the scalar multiplication in the kernel formula. Thus, speeding up EC computations will not only benefit the applications that rely on ECC, but will also have an effective impact on the post-quantum cryptosystem SIDH. Moreover, attacks do not necessarily rely on independent calculations; they may also analyze the behavior of the hardware, as in a Side Channel Attack (SCA). Attackers analyze the power consumption, which differs between performing a point addition and a point doubling, and thereby reveal the secret key. Therefore, the development of the EC system includes aspects other than speed. For instance, the Montgomery coordinate system is resistant to such attacks. The pitfall of the Montgomery coordinate system is that it is slower than other coordinate systems such as Projective and Jacobian. We provide effective algorithms to develop the EC system generally, in several ways that make the process more complicated for attackers, whether through software or hardware.

1.2 Summary of Ideas and Contributions

Unlike other coordinate systems, the affine Weierstrass form requires computing an inverse each time we perform a point doubling or addition, i.e., at every iteration. In general, finding inverses is much slower than multiplication. Thus, the main idea of our work is to eliminate inverses. Our first contribution is to substitute the previous x and y coordinate equations into the equations of the higher doubling orders and to find a common factor between all the slopes' denominators, so that each algorithm requires a single inverse. In addition, we apply labeling and register renaming methods alongside the common factor technique to minimize the number of multiplications. Using these methods, we contribute an efficient way to compute higher-order doublings in one algorithm with a single inverse, instead of the original cumulative algorithm that requires finding an inverse for each iteration. Moreover, we follow the same steps to find the intermediate algorithms up to 34P. To compete with other coordinate systems, we contribute a new coordinate system that uses only x and y coordinates and performs a single inversion over the whole length of the secret key. Subsequently, we illustrate 6 different competing general algorithms and compare them using multiple coordinate systems.

1.3 Results

Experiments have shown that, compared to our fastest algorithm in terms of number of multiplications, Base 32 Multiplicands, at the key size of 521 bits, the competing Montgomery, Projective and Jacobian coordinate systems are 100%, 88% and 68% slower, respectively. Our algorithm is at least 61% faster than Jacobian, which is clearly the most efficient among the other systems, at the smallest key size we used, 224 bits. Moreover, our results indicate that, in terms of maximum parallelization when implemented in hardware or on an FPGA, our Base 1024 Multiplicands algorithm can be implemented with 55%, 59% and 68% fewer multiplier levels than Montgomery, Jacobian and Projective, respectively, at the key size of 521 bits, with a very slight reduction in these ratios for smaller keys. Generally, we notice that the improvement scales with the size of the input.

1.4 Structure of Dissertation

This document is divided into six parts. Part I recalls the affine coordinate equations of the elliptic curve and then describes our methodology for direct repeated point doubling of high order at the lowest cost, using only a single inverse, in two different ways. In Chapter 3, we illustrate our Fast 2^n algorithm for computing point doubling and multiple forms of point addition. Moreover, we show the results of the algorithms' preliminary measurements compared to the original algorithm and then apply a Labeling technique to our work. In Chapter 4, a new methodology, Direct Doubling, leads to even lower-cost algorithms. Similarly, Labeling is applied to the new algorithms, and a comparison between our two mathematical techniques is provided at the end of Chapter 4.

Part II recalls the algorithms of other coordinate systems and presents new algorithms for direct repeated point doubling of high order up to 25P. In addition, it reviews the development of new equations after applying Labeling and Register renaming techniques. In Chapter 5, we present the Projective, Jacobian and Montgomery coordinate algorithms and apply some of the optimization techniques from the previous chapters.

Part III introduces our new coordinate system, EiSi. In Chapter 6, we describe point doubling and addition for our new coordinate system. In addition, we illustrate in detail how EiSi was derived from the original affine representation. At the end of Chapter 6, we add some numerical examples that clarify and validate the equations.

Part IV introduces in detail some algorithms that we developed and some that we invented. In Chapter 7, we present a brief introduction to some of the ECC applications. Furthermore, we introduce 6 different general scalar multiplication algorithms that are implemented and compared later in Chapter 9.

Part V gives an overview of the post-quantum cryptosystem SIDH. In Chapter 8, we present an introduction to SIDH and some important related topics, and explain how our work helps speed up the system. At the end of Chapter 8, we discuss the importance of this system in resisting quantum attacks.

Part VI shows some experiments, comparisons and the final results of our implementation. In Chapter 9, we list some of the important functions used in our implementation. In addition, we compare our own algorithms with the original algorithms and with the algorithms of other coordinate systems.

Chapter 2

Background

The most popular forms of public-key cryptography for current applications have increasingly been based on Elliptic Curves (ECs) [53, 44]. With Elliptic Curve Cryptography (ECC), messages and secrets are mapped to points on an elliptic curve, and specific point doubling and point addition operations define transitions between points. Scalar point multiplication uses a sequence of point doublings and point additions to efficiently evaluate point multiplications:

$$Q = [k]P = \underbrace{P + P + \cdots + P}_{k \text{ times}}.$$

Cryptosystems based on ECs rely on the difficulty of solving the Elliptic Curve Discrete Log (ECDL) problem. Namely, given the points Q and P in the previous equation, it is hard to determine the scalar multiple k for elliptic curves with points P of large order and large values of k. However, with the expected emergence of quantum computers [15] in the near future, cryptosystems that rely on the ECDL problem will no longer be safe, as the scalar multiple can be easily recovered using Shor's algorithm [73]. Other, quantum-resilient schemes have been proposed. However, post-quantum cryptosystems such as Supersingular Isogeny Diffie-Hellman (SIDH) are slow, and speeding up their elliptic curve computations is frequently mentioned as a significant goal.

The core operation for ECC is the scalar multiplication [k]P, whose computation speed is seen as key to improving ciphers. For instance, in [28] Eisenträger et al. proposed a method for computing S = 2P + Q. Their improved procedure saves a field multiplication when compared to the original algorithm. Later, Ciet et al. [17] introduced a faster method for computing the same formula when a field inversion costs more than six field multiplications. Furthermore, they introduced an efficient method for computing point tripling. A mixed-powers system of point doubling and tripling for computing the scalar multiplication was presented later by Dimitrov et al. [27]. In [54] Mishra et al. presented an efficient quintupling formula (5P) and introduced a mixed-base algorithm with doubling and tripling. Further development was introduced by Longa and Miri [51], who devised an efficient method for tripling and quintupling mixed with differential addition. They proposed an efficient multibase non-adjacent form (mbNAF) to reduce the cost. In [51] the same authors present further optimization in terms of cost for computing the form dP + Q. They succeeded in implementing the previous forms of the mixed double-and-add algorithm with a single inversion by applying a new precomputation scheme. More recently, Purohit and Rawat [66] used a multibase representation to propose an efficient scalar multiplication algorithm of doubling, tripling, and septupling, restricted to non-supersingular elliptic curves defined over the field $F_{2^m}$. In addition, they compared their work with other existing algorithms to achieve a better representation in terms of cost. Therefore, speeding up the scalar multiplication computation while reducing the cost is a critical task. We present a new methodology to compute elliptic curve operations with more general forms of the type mP + nQ, where m and n are small integers, aiming for a faster implementation with the lowest cost among currently known algorithms using only one inversion.

Among all applications based on EC, the highest benefit from our work concerns the post-quantum cryptosystem Supersingular Isogeny Diffie-Hellman (SIDH). Its main weakness is the slow elliptic curve computation speed. For other elliptic curve schemes, the computation speed-up also favors attacks, which can however be compensated by increasing the size of the key. Isogeny-based cryptography also utilizes points on an elliptic curve, but its security is instead based on the difficulty of computing isogenies between elliptic curves. An isogeny can be thought of as an algebraic mapping between two elliptic curves that preserves the group law. A quantum algorithm for computing isogenies on ordinary curves in sub-exponential time was presented by Childs et al. [16], rendering the use of cryptosystems based on isogenies on ordinary curves unsafe in the presence of quantum computers. However, there is no known algorithm for computing isogenies on supersingular curves in sub-exponential time.

In [40], Jao and De Feo proposed a key exchange based on isogenies of supersingular elliptic curves. The proposed scheme resembles the standard Elliptic Curve Diffie-Hellman (ECDH), but goes a step further by computing isogenies of large degree. In the scenario where Alice and Bob want to exchange a secret key over an insecure channel, they pick a smooth isogeny prime p of the form $p = l_A^a \, l_B^b \cdot f \pm 1$, where $l_A$ and $l_B$ are small primes, a and b are positive integers, and f is a small cofactor chosen to make the number prime. These define a supersingular elliptic curve $E_0(F_q)$, where $q = p^2$. Lastly, they choose four points on the curve that form the bases $\{P_A, Q_A\}$ and $\{P_B, Q_B\}$, which act as generators for $E_0[l_A^a]$ and $E_0[l_B^b]$, respectively.

In a graph of supersingular isogenies where the vertices represent isomorphism classes of curves and the edges represent isogenies of degree l, the infeasibility of discovering a path that connects two particular vertices provides the security for this protocol. This led to the Supersingular Isogeny-based Diffie-Hellman key exchange protocol (SIDH) [40]. As of today, the best-known algorithms against the SIDH protocol have exponential time complexity for both classical and quantum attackers.

Although the SIDH public key size for achieving a 128-bit security level in the quantum setting was already reported to be as small as 564 bytes in [23], this SIDH public key size was recently further reduced in [22] to just 330 bytes. However impressive, these key sizes have to be contrasted against SIDH's relatively slow runtime performance. Indeed, the SIDH key exchange protocol has a latency on the order of milliseconds when implemented on high-end Intel processors. This timing is significantly higher than the one achieved by several other quantum-resistant cryptosystem proposals. Consequently, some recent works have focused on devising strategies to reduce the runtime cost of the SIDH protocol.

Reportedly, Couveignes made the first suggestions towards the usage of isogenies for cryptographic purposes in a seminar held in 1997, later reported in [24]. The first published work on a concrete isogeny-based cryptographic primitive was presented by Charles, Lauter and Goren in [12, 13], where the authors introduced the hardness of path-finding in supersingular isogeny graphs and its application to the design of hash functions. It has since been used as an assumption for other cryptographic applications such as key-exchange and digital signature protocols. Stolbunov studied in [77] the hardness of finding isogenies between two ordinary elliptic curves defined over a finite field $F_q$, with q a prime power. The author proposed to use this setting as the underlying hard problem for a Diffie-Hellman-like key exchange protocol. Nevertheless, Childs, Jao, and Soukharev discovered in [16] a quantum attack of subexponential complexity against Stolbunov's scheme.

In 2011, Jao and De Feo proposed the problem of finding the isogeny map between two supersingular elliptic curves, a setting where the attack in [16] no longer applies. This proposal led to the Supersingular Isogeny-based Diffie-Hellman key exchange protocol (SIDH) [40]. As of today, the best-known algorithms against the SIDH protocol have exponential time complexity for both classical and quantum attackers.

On the other hand, some recent works have focused on devising strategies to reduce the runtime cost of the SIDH protocol. For example, Koziel et al. presented a parallel evaluation of isogenies implemented on an FPGA architecture [48, 49], reporting important speedups for this protocol. These developments show the increasing research interest in developing techniques able to accelerate software and hardware implementations of the SIDH protocol. More recently, in [29], several algorithmic optimizations targeting both elliptic-curve and field arithmetic operations were presented in order to accelerate the runtime performance of SIDH by enhancing the calculation of the elliptic curve operation P + [k]Q.

The core operations of SIDH are computing the isogeny and its kernel. Basically, Vélu's formula is used to compute the isogeny, and the operation P + [k]Q is used to compute the kernel, where P and Q are points on the curve and k is the secret key generated by each party [40]. This operation must be performed in both phases of SIDH. First, in the key generation phase, the point Q is known in advance. In this case, one can construct a lookup table that contains all doubles of the point Q and reuse any of them when it is needed. Second, in the key exchange phase, where the point Q is variable, we apply our mixed-base representation (up to 32) in order to speed up the calculations, noting that all mixed-base formulas were implemented with a single inversion.
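The precomputation idea for a fixed point Q can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this work: the group operations are passed in as the hypothetical callables `add` and `double`, and in the usage lines integers modulo 19 stand in for curve points so that the result is easy to check by hand.

```python
def build_doubles_table(Q, nbits, double):
    """Return [Q, 2Q, 4Q, ..., 2^(nbits-1) Q] for a fixed point Q."""
    table = [Q]
    for _ in range(nbits - 1):
        table.append(double(table[-1]))
    return table


def p_plus_kq(P, k, table, add):
    """Compute P + [k]Q using only additions of precomputed doubles of Q."""
    if k < 0 or k >> len(table):
        raise ValueError("k must be non-negative and fit in len(table) bits")
    acc = P
    i = 0
    while k:
        if k & 1:                      # bit i of k is set: add 2^i * Q
            acc = add(acc, table[i])
        k >>= 1
        i += 1
    return acc


# Stand-in group (an assumption for illustration only):
# integers modulo 19 under addition play the role of curve points.
N = 19
add = lambda a, b: (a + b) % N
double = lambda a: (2 * a) % N

table = build_doubles_table(Q=7, nbits=5, double=double)
result = p_plus_kq(P=3, k=13, table=table, add=add)
```

With the table built once, each later evaluation costs only one addition per set bit of k and no doublings, which is the saving that a fixed Q makes possible.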

According to Gutub, there are various ways to apply elliptic curves in cryptographic applications [36]. Gutub noted that the algorithm used for calculating nP from P is based on the binary representation of n, because this is the efficient and practical way to implement it in hardware systems [36]. That is, the binary algorithm scans the k bits of n and doubles the point Q k times [36]. Gutub further explained that an extra point addition (Q + P) must be performed every time a set bit of n is found [36].
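As a small worked illustration of the binary method (assuming a left-to-right scan of the bits that starts from the leading 1, which may differ in detail from the hardware formulation in [36]), take $n = 13 = (1101)_2$:

$$
\begin{aligned}
\text{leading bit } 1: &\quad Q = P \\
\text{bit } 1: &\quad Q = 2P, \quad Q = 2P + P = 3P \\
\text{bit } 0: &\quad Q = 2(3P) = 6P \\
\text{bit } 1: &\quad Q = 2(6P) = 12P, \quad Q = 12P + P = 13P
\end{aligned}
$$

Three doublings and two additions replace the twelve additions that computing 13P by repeated addition would need.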

As depicted in Figure 2.1 below, every point addition or point doubling necessitates the three operations of multiplication, inversion, and addition/subtraction:

Figure 2.1: Elliptic curve arithmetic hierarchy [36].

Weierstrass Elliptic Curve. This section presents the equations of the original work against which we compare our algorithm. We consider an elliptic curve over $Z_p$, where p > 3. Such a curve, in the short Weierstrass form in the affine plane, is the set of all pairs $(x, y) \in Z_p \times Z_p$ which fulfill:

$$y^2 \equiv x^3 + a x + b \pmod{p} \tag{2.1}$$

For $P = (x_P, y_P)$ and $Q = (x_Q, y_Q)$, one can compute $R = P + Q$ by using the following equations, where λ takes two different forms [63].

In case of addition where $P \neq Q$:

$$\lambda = \frac{y_Q - y_P}{x_Q - x_P} \bmod p \tag{2.2}$$

$$x_R = \lambda^2 - x_P - x_Q \bmod p$$

$$y_R = \lambda (x_P - x_R) - y_P \bmod p$$

In case of computing 2P (doubling of order one), where P has coordinates $(x_1, y_1)$:

$$\lambda = \frac{3 x_1^2 + a}{2 y_1} \bmod p \tag{2.3}$$

$$x_2 = \lambda^2 - 2 x_1 \bmod p \tag{2.4}$$

$$y_2 = \lambda (x_1 - x_2) - y_1 \bmod p \tag{2.5}$$

where λ is the slope of the tangent at P, and $x_2$ and $y_2$ are the affine coordinates after doubling P once. While a two-dimensional projective space can also be used for computations in Weierstrass form, here we focus on computations in the affine plane.
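The affine formulas above can be exercised on a toy curve. The sketch below is an illustration only: the curve $y^2 = x^3 + 2x + 2$ over $Z_{17}$ and the point G = (5, 1) are assumptions chosen for readability, the function names are hypothetical, and the y-coordinate of the sum uses the standard chord formula $y_R = \lambda(x_P - x_R) - y_P$. Exceptional cases (point at infinity, P = -Q, $y_1 = 0$) are deliberately not handled.

```python
def inv_mod(x, p):
    """Modular inverse in Z_p for prime p (pow with exponent -1, Python 3.8+)."""
    return pow(x % p, -1, p)


def on_curve(P, a, b, p):
    """Check equation (2.1) for an affine point."""
    x, y = P
    return (y * y - (x ** 3 + a * x + b)) % p == 0


def affine_add(P, Q, p):
    """Add two affine points with x_P != x_Q, following equation (2.2)."""
    (xP, yP), (xQ, yQ) = P, Q
    lam = (yQ - yP) * inv_mod(xQ - xP, p) % p
    xR = (lam * lam - xP - xQ) % p
    yR = (lam * (xP - xR) - yP) % p
    return xR, yR


def affine_double(P, a, p):
    """Double an affine point with y_1 != 0, following (2.3) to (2.5)."""
    x1, y1 = P
    lam = (3 * x1 * x1 + a) * inv_mod(2 * y1, p) % p
    x2 = (lam * lam - 2 * x1) % p
    y2 = (lam * (x1 - x2) - y1) % p
    return x2, y2


# Toy parameters (assumed for illustration, far too small for real use).
a, b, p = 2, 2, 17
G = (5, 1)
two_g = affine_double(G, a, p)       # 2G
three_g = affine_add(G, two_g, p)    # 3G = G + 2G
```

Each call performs exactly one modular inversion, which is the cost that the projective representations discussed later are designed to avoid.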

According to several scholars, the complexities of the modern world are rising as the Internet becomes available to everyone [83, 72]. As such, data and information security plays a vital role on the web [72]. According to Toradmalle, Muthukuru, and Sathyanarayana, confidentiality, integrity, availability, and non-repudiation are the principal pillars of security [80]. With these key security requirements, digital signatures play a pivotal role in authentication [80]. This is why the elliptic curve signature application has become increasingly important in the past decade [85, 80].

Xu, Leonardi, Teh, Jao, Wang, Yu, and Azarderakhsh argued that with modern advancements in technology, security requirements are becoming more significant and rigorous in every industry, such as health care [85, 6]. As such, elliptic curve cryptography has been widely suggested for designing efficient RFID authentication protocols [85, 6].

Benssalah, Djeddou, and Drouiche noted that with the elliptic curve signature application, a secure RFID authentication protocol can be developed and implemented in various industries such as health care [6]. Given the effectiveness of the elliptic curve signature application in providing mobility, privacy, and scalability to users, in 2017 Galbraith, Petit, and Silva proposed a digital signature scheme based on the problem of computing the endomorphism ring of a supersingular elliptic curve [85, 6]. Furthermore, in 2019, Xu et al. proposed an improvement of digital signatures based on elliptic curve endomorphism rings [85].

Users also need security in resource-constrained environments. In such environments, security experts are tasked with designing secure algorithms. These algorithms need to ensure that information is verified whenever data is exchanged [72, 80]. In such cases, Toradmalle et al. further underscored the effectiveness of elliptic curves, which are the strongest contenders for providing provably secure digital signatures [80]. As such, Toradmalle et al. developed and proposed a secured and improved Elliptic Curve Digital Signature Algorithm (ECDSA) [80].

Shah and Shah concurred and similarly proposed the use of algorithms based on elliptic curves, enabling the use and implementation of digital signatures [72]. In their study, Shah and Shah proposed a digital signature algorithm that uses a single coordinate system of elliptic curve cryptography [72]. Like Toradmalle et al., Shah and Shah justified the use of a novel elliptic curve algorithm for securing communication in the Internet of Things in preference to other public key cryptographic algorithms [72, 80]. Shah and Shah noted that with elliptic curves, one can minimize or eliminate the longer keys, heavy calculations, and battery consumption on 16-bit processors that are typically encountered in public key algorithms [72].

Projective. Projective coordinates are another way of representing an elliptic curve. The elliptic curve Γ can be described by another polynomial, in the projective space $P^2$. That is, the polynomial defines a curve in the projective space $P^2$, which is also known as a Weierstrass equation [74]:

$$\Gamma : Y^2 Z + a_1 X Y Z + a_3 Y Z^2 = X^3 + a_2 X^2 Z + a_4 X Z^2 + a_6 Z^3$$

According to Smart [74], a projective n-dimensional space $P^n$ over a field F is defined as the set of (n + 1)-tuples $(x_0, \ldots, x_n) \in F^{n+1}$, where at least one $x_i$ is not equal to 0, and where two elements are identified if they are scalar multiples of each other.

The equivalence class $\{\lambda (x_0, \ldots, x_n) : \lambda \in F, \lambda \neq 0\}$ is denoted by $[x_0, \ldots, x_n]$, where $x_0, \ldots, x_n$ are known as the homogeneous coordinates of that point [74].

Gutub (2007) further noted that projective coordinates are used in cases where lengthy inversions need to be eliminated, as in crypto processors [36]. It is also possible to define the Jacobi form of an elliptic curve as the intersection of two quadrics. Gutub outlined that different forms of formula are available for an elliptic curve in $P^3(K)$, which can be represented as the intersection of two quadric surfaces [36]:

$$Q : \{Q_1(X_0, X_1, X_2, X_3) = 0\} \cap \{Q_2(X_0, X_1, X_2, X_3) = 0\}$$

Higuchi and Takagi [37] and Okeya et al. [58] noted how randomized projective coordinates on a Montgomery-form elliptic curve are effective in securing systems against side channel attacks. For example, Okeya et al. [58] recommended a scalar multiplication method that does not incur a higher computational cost for randomized projective coordinates of the Montgomery form of elliptic curves. Explaining this concept further, Okeya et al. [58] noted how a randomized projective coordinates method is an effective countermeasure against side channel attacks on an elliptic curve cryptosystem. Goubin [35] added to this, indicating how projective coordinates are convenient and effective in avoiding costly inversions. Goubin [35] described two types of projective coordinates: homogeneous and Jacobian projective coordinates.

According to Goubin [35], homogeneous projective coordinates are obtained by setting x = X/Z and y = Y/Z, so that the general Weierstrass equation becomes:

$$E : Y^2 Z + a_1 X Y Z + a_3 Y Z^2 = X^3 + a_2 X^2 Z + a_4 X Z^2 + a_6 Z^3.$$

Additionally, according to Goubin [35], Jacobian projective coordinates are obtained by setting $x = X/Z^2$ and $y = Y/Z^3$, so that the general Weierstrass equation becomes:

$$E : Y^2 + a_1 X Y Z + a_3 Y Z^3 = X^3 + a_2 X^2 Z^2 + a_4 X Z^4 + a_6 Z^6.$$
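The two coordinate systems can be illustrated with a short conversion sketch. It is a hedged example rather than code from any cited work: the function names are hypothetical, the modulus 17 and the affine point (5, 1) are toy assumptions, and the scaling factor `lam` is fixed here although a randomized representation would draw it at random. The point of the sketch is that different representatives of one equivalence class map back to the same affine point with a single inversion.

```python
def homogeneous_to_affine(X, Y, Z, p):
    """(X : Y : Z) -> (X/Z, Y/Z) in Z_p, for Z != 0, with one inversion."""
    z_inv = pow(Z % p, -1, p)
    return X * z_inv % p, Y * z_inv % p


def jacobian_to_affine(X, Y, Z, p):
    """(X : Y : Z) -> (X/Z^2, Y/Z^3) in Z_p, for Z != 0, with one inversion."""
    z_inv = pow(Z % p, -1, p)
    z_inv2 = z_inv * z_inv % p
    return X * z_inv2 % p, Y * z_inv2 * z_inv % p


def homogeneous_scale(X, Y, Z, lam, p):
    """Another representative of the same homogeneous point (lam != 0)."""
    return X * lam % p, Y * lam % p, Z * lam % p


def jacobian_scale(X, Y, Z, lam, p):
    """Another representative of the same Jacobian point (lam != 0)."""
    return X * lam ** 2 % p, Y * lam ** 3 % p, Z * lam % p


# Toy values (assumptions for illustration only).
p = 17
x, y = 5, 1
hom_rep = homogeneous_scale(x, y, 1, lam=3, p=p)
jac_rep = jacobian_scale(x, y, 1, lam=3, p=p)
hom_back = homogeneous_to_affine(*hom_rep, p)
jac_back = jacobian_to_affine(*jac_rep, p)
```

Both `hom_back` and `jac_back` recover the original affine pair, even though the stored triples differ from (5, 1, 1).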

With a projective coordinates approach, the attacker is unable to predict the appearance of a specific value because the projective coordinates are randomized [58, 37]. Okeya et al. [58] demonstrated this in their study, in which they tested the effectiveness of Montgomery-form elliptic curves as a countermeasure against side channel attacks. In their empirical study, Okeya et al. [58] justified the cryptographic use of Montgomery-form elliptic curves in constrained environments such as mobile devices and smart cards.

Similarly, Higuchi and Takagi [37] proposed the use of projective coordinates in securing systems against side channel attacks. The authors explained that with projective coordinates, inversions are removed from the elliptic addition, making projective coordinates a cost-effective solution [37]. As such, Higuchi and Takagi [37] proposed an algorithm that uses the same projective coordinates as López and Dahab's 1998 study [52] but requires one multiplication fewer than [52]. Specifically, Higuchi and Takagi [37] proposed a fast addition algorithm on an elliptic curve over $GF(2^n)$ using the projective coordinates:

$$x = X/Z, \quad y = Y/Z^2$$

According to Higuchi and Takagi [37], addition in the above projective coordinates requires fewer multiplications than the fastest previously known algorithm [52]. These findings [37, 52] are vital contributions to the efficient implementation of elliptic curve cryptosystems.

Jacobian. The Jacobian form of an elliptic curve differs from the Weierstrass form. It is often used in cryptography because it provides a defense against simple and differential power analysis (SPA and DPA) attacks [82]. Compared to the Weierstrass form, the Jacobian form also offers faster arithmetic. This form of elliptic curve is one of the following: Jacobi intersection or Jacobi quartic [9].

According to Wang, Wang, and Zhang, the Jacobian elliptic curve differs from Weierstrass-form curves, whose group law is defined by the chord and tangent rule [82]. Wang et al. noted how a Jacobi quartic curve is defined. That is, a Jacobi quartic elliptic curve over a field K with char(K) not equal to 2 is defined by:

$$E_{d,a} : y^2 = d x^4 + 2 a x^2 + 1$$

where $a, d \in K$ and the discriminant $\Delta = 256 (a^2 - d)^2$ is not equal to 0.

Each elliptic curve over K with an even number of K-rational points can be transformed to Jacobi quartic form [82].

Jacobian elliptic curves have been used in security applications such as biometrics and information security. Several authors underscored the relevance of Jacobian elliptic curves in addressing identity theft, which is an increasing problem in today's interconnected world [82, 10]. For example, Brumnik, Kovtun, Kavun, and Podbregar proposed the use of Jacobian genus 2 hyperelliptic curves in biometric encryption and authentication systems [10].

Brumnik et al. noted how Jacobian curve applications enable systems to have multiple secure points that can be used within the optimization process [10]. The Jacobian curve thus enables the development of strong biometric authentication that provides enhanced security through the integration of strong cryptographic algorithms [82, 10].

Montgomery. Another form of elliptic curve, different from the short Weierstrass form, is the Montgomery curve, introduced by Peter L. Montgomery in 1987 [55]. Such a curve is the set of all pairs $(x, y) \in Z_q \times Z_q$ which fulfill:

$$E/F_q : B y^2 = x^3 + A x^2 + x$$

such that $A, B \in Z_q$, with $A^2 \neq 4$ and $B \neq 0$.

A point on the curve with affine coordinates (x, y) can be represented in Montgomery form using projective coordinates, where x = X/Z and y = Y/Z for $Z \neq 0$; the formulas below use only the pair P = (X : Z).

For $P = (X_1 : Z_1)$ and $Q = (X_2 : Z_2)$ with $P \neq Q$, one can compute $P + Q = (X_3 : Z_3)$ by using the following equations, where $(X_{Q-P} : Z_{Q-P})$ denotes the coordinates of the known difference Q − P:

$$X_3 = Z_{Q-P} \big( (X_2 - Z_2)(X_1 + Z_1) + (X_2 + Z_2)(X_1 - Z_1) \big)^2$$

$$Z_3 = X_{Q-P} \big( (X_2 - Z_2)(X_1 + Z_1) - (X_2 + Z_2)(X_1 - Z_1) \big)^2$$

In case of doubling, where P = Q and $2P = (X_3 : Z_3)$:

$$4 X_1 Z_1 = (X_1 + Z_1)^2 - (X_1 - Z_1)^2$$

$$X_3 = (X_1 + Z_1)^2 (X_1 - Z_1)^2$$

$$Z_3 = (4 X_1 Z_1) \big( (X_1 - Z_1)^2 + ((A + 2)/4)(4 X_1 Z_1) \big)$$
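The x-only Montgomery arithmetic can be sketched in a few lines. The code below is an illustration under stated assumptions, not an optimized or constant-time implementation: it uses the standard form of the formulas, in which the addition takes the (X : Z) coordinates of the difference Q − P as input and the doubling uses a plus sign before the (A + 2)/4 term. The function names and the toy curve $y^2 = x^3 + 3x^2 + x$ over $Z_{13}$ with the point of x-coordinate 2 are hypothetical choices made so the arithmetic can be followed by hand.

```python
def x_dbl(X1, Z1, A, p):
    """x-only doubling on B*y^2 = x^3 + A*x^2 + x; returns (X3, Z3)."""
    s = (X1 + Z1) ** 2 % p
    d = (X1 - Z1) ** 2 % p
    four_xz = (s - d) % p                      # equals 4*X1*Z1
    a24 = (A + 2) * pow(4, -1, p) % p          # the constant (A + 2)/4
    X3 = s * d % p
    Z3 = four_xz * (d + a24 * four_xz) % p
    return X3, Z3


def x_add(X1, Z1, X2, Z2, Xd, Zd, p):
    """x-only differential addition; (Xd : Zd) is the x-line of Q - P."""
    u = (X2 - Z2) * (X1 + Z1) % p
    v = (X2 + Z2) * (X1 - Z1) % p
    X3 = Zd * (u + v) ** 2 % p
    Z3 = Xd * (u - v) ** 2 % p
    return X3, Z3


def to_affine_x(X, Z, p):
    """Recover x = X/Z with a single inversion (Z != 0)."""
    return X * pow(Z, -1, p) % p


# Toy parameters (assumptions for illustration only).
p, A = 13, 3
xP = (2, 1)                            # (X : Z) of a point P with x = 2
x2P = x_dbl(*xP, A, p)                 # 2P
x3P = x_add(*xP, *x2P, *xP, p)         # P + 2P, with difference 2P - P = P
x_of_2P = to_affine_x(*x2P, p)
x_of_3P = to_affine_x(*x3P, p)
```

Neither routine uses the y-coordinate or the constant B, and no inversion is needed until the final conversion back to an affine x.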

In 2000, Okeya, Kurumatani, and Sakurai proposed the use of elliptic curve cryptosystems based on the Montgomery form $E_M : BY^2 = X^3 + AX^2 + X$ [57]. According to the authors, the use of elliptic curves in Montgomery form allows one to defend against timing attacks [57]. Okeya et al. referenced how the Montgomery curve was introduced by Peter L. Montgomery [55]. Using the Montgomery form, Okeya et al. proposed an algorithm for generating Montgomery-form elliptic curves whose cofactor is exactly 4, given that the order of any elliptic curve in Montgomery form is divisible by 4 [57]. Goubin [35] noted that Montgomery's method also proves useful in avoiding side channel attacks, giving the user a natural way of avoiding both timing and simple power analysis attacks. As a countermeasure against attacks, Goubin [35] presented an algorithm, based on the original proposal of [55]:

Algorithm using Montgomery's method [55, 35]:

Require: d = (d_{n−1}, ..., d_0)_2 with d_{n−1} = 1, P    // d denotes the scalar
Ensure: Q_0 = d·P
Q_0 := P
Q_1 := 2P
for i = n − 2 down to 0 do
    Q_{1−d_i} := Q_0 + Q_1
    Q_{d_i} := 2·Q_{d_i}
end for
Return Q_0
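The ladder above can be transcribed almost line by line. The sketch below is illustrative only: the group operations are passed in as the hypothetical callables `add` and `double`, and in the usage lines integers modulo 1009 stand in for curve points so that the result can be compared against plain multiplication. Indexing a list by the secret bit is done for readability and is not a constant-time technique.

```python
def montgomery_ladder(d, P, add, double):
    """Return d*P for an integer d >= 1, doing one add and one double per bit."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    bits = bin(d)[2:]              # d_{n-1} ... d_0, with d_{n-1} = 1
    Q = [P, double(P)]             # Q[0] = P, Q[1] = 2P
    for bit in bits[1:]:           # i = n-2 down to 0
        di = int(bit)
        Q[1 - di] = add(Q[0], Q[1])
        Q[di] = double(Q[di])
    return Q[0]


# Stand-in group (an assumption for illustration only):
# integers modulo 1009 under addition play the role of curve points.
N = 1009
add = lambda a, b: (a + b) % N
double = lambda a: (2 * a) % N

result = montgomery_ladder(45, 5, add, double)
```

Throughout the loop the two registers differ by exactly P, which is why the sum Q_0 + Q_1 can be computed with a differential addition on a Montgomery curve, and why the same sequence of operations runs whatever the value of each bit.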

Further justifying the effective use of the Montgomery-form elliptic curve, Wang, Wan, Guo, Cheung, and Yuen (2017) noted the effectiveness of Montgomery domains for privacy-preserving biometric identification [83]. Wang et al. noted that this is vital to take into account given that, in biometric applications, several vulnerability studies have indicated security risks related to adversarial machine learning, which could compromise biometric recognition systems through the exploitation of biometric similarity information [83]. Wang et al. noted that in such cases, the Montgomery form allows one to preserve the privacy of biometric information [83]. Wang et al. explained that using Montgomery multiplication to generate search indexes makes it possible to withstand the exploitation of biometric similarity information and adversarial similarity analysis [83]. As such, this body of findings could further justify and explain the benefits of using the Montgomery form, which could protect against information leakage in various security systems [57, 6, 83].

Side channel attack resistance. In the context of computational performance and of establishing security and privacy protocols, embedded devices are exposed to physical attacks such as side channel attacks [41]. According to several authors, side channel attack resistance has become increasingly popular due to its security advantages [4, 81, 41]. Goubin [35] noted that, with the aim of eliminating vulnerabilities to side channel attacks, Elliptic Curve Cryptosystems are becoming more and more popular and are included in many standards and applications. Aranha, Azarderakhsh, and Karabina further noted that, in addition to efficiency requirements, resistance to side-channel attacks is receiving increasing attention in research and commercial applications [4]. Aranha et al. demonstrated this in their study by exploring software implementations of laddering algorithms over binary elliptic curves.

In 2017, Wang and Schaumont conducted a similar study, proposing an automated approach to comprehensive side-channel resistance [81]. The authors proposed the use of formal verification and program synthesis to detect side-channel leaks, including leaks from software code running on portable devices [81]. Wang and Schaumont noted that by detecting power side-channel leaks in cryptographic software, the associated security threats and risks can be eliminated [81]. These techniques are applicable to other types of side channels and software systems as well, not just cryptographic software [4, 81].

According to Genkin, Valenta, and Yarom, stronger security measures are vital as countermeasures against side channel attacks [32]. In 2006, Nikova et al. [56] proposed an effective, secure countermeasure against first-order side-channel attacks. Poschmann et al. proposed a further countermeasure building on Nikova et al.'s initial recommendation [56, 65]. In their study, Poschmann et al. decomposed the S-box and split it into three parts, fulfilling the properties indicated by Nikova et al. [56]. Poschmann et al. reported positive and significant experimental results regarding the effectiveness of their countermeasure [65]. They assessed its resistance to side-channel attacks by evaluating real power traces obtained from a field programmable gate array (FPGA)-based side-channel standard platform [65]. These findings provide more empirical detail on countermeasures against side channel attacks [32]. This body of information also underscores the various ways of implementing a countermeasure and assessing its resistance to side-channel attacks, such as by evaluating real power traces [56, 65].

As a form of countermeasure, several studies have proposed the use of the Montgomery-form elliptic curve against side channel attacks [32, 8]. For example, Libgcrypt's implementation of ECDH encryption with Curve25519 was designed to prevent side channel attacks. It employs the Montgomery ladder for scalar-by-point multiplication, uses the unified, branchless Montgomery double-and-add formula, and implements a constant-time argument swap within the ladder [32]. According to Genkin et al., side channel resistance can be further improved with the use of unified addition formulas, which eliminate operand-dependent branches from point addition and point multiplication routines [32].

Side-channel attacks can be traced and measured, which is essential to their prevention. According to Goodwill, Jun, Jaffe, and Rohatgi [33], side-channel attacks exploit the presence of information about sensitive algorithmic intermediates within the side-channel traces collected from a device. As such, the factors that influence the side channel expose systems to vulnerabilities [33]. Several authors therefore underscored the need to track and measure side-channel attacks in order to develop and implement programs for their prevention [71, 46, 33].

Seibert, Okhravi, and Söderström [71] noted that most side channel attacks on cryptographic implementations cannot be performed remotely, as measurements can only be made if the attacker has physical access to the machine. That is, power, electromagnetic (EM) emanation, and acoustic analyses require physical access to the machine [71]. The attacker needs to be able to physically measure the power consumption, the EM emanation, or the sound produced, respectively [71]. Furthermore, during the side channel attack, an attacker needs to be able to receive feedback from a victim [71].

In such cases, the attacker needs to be able to interact with the program via a networked or scripting setting [71]. In a networked setting, the attacker needs to be able to interact directly with the program, as well as receive information through the network [71]. In a scripting setting, the attacker needs to be able to send a script that initiates the side channel attack [71, 76].

Köpf and Basin [46] proposed a model to assess and measure side-channel attacks, which integrates information-theoretic metrics to quantify the information revealed to an attacker. Köpf and Basin [46] noted that with a model of adaptive side-channel attacks, coupled with algorithms and approximation techniques for computing this measure, side-channel attack resistance can be evaluated and approximated to an extent. Specifically, the model expresses an attacker's remaining uncertainty about a secret as a function of the number of side-channel measurements made [46]. Such algorithms and approximation techniques can be used to analyze the resistance of hardware implementations of cryptographic functions to both timing and power attacks [46].

In line with side-channel resistance, Goodwill et al. [33] aimed to develop a testing methodology for side-channel resistance validation. Goodwill et al. and Snow et al. underscored the vital role of having a side-channel resistance validation program in place [76, 33]. Goodwill et al. proposed a methodology with respect to this goal, evaluating whether a cryptographic module that uses side-channel analysis countermeasures can provide resistance to side-channel attacks [33]. According to Goodwill et al., the following requirements need to be met in order to have an effective side-channel resistance validation program:

Effectiveness of tests: results should be reproducible and be reasonable indicators of resistance achieved.

Ease and cost-effectiveness of testing: validating a moderate level of resistance (e.g., FIPS 140 Level 2/3) should not require an excessive amount of testing time per algorithm or exceptional test operator skills [33].

Hwang et al. defined measurements to disclosure (MTD) as the crossover point between the correlation coefficient of the correct key and the maximum correlation coefficient of all the wrong key guesses [38]. In the Correlation versus Number of Measurements graphs, MTD is the point at which the black line (correct key) crosses the gray envelope (wrong keys), as shown in Figure 2.2.

In general, according to Hwang et al., 2000 measurements are required to disclose a secret key byte for the insecure coprocessor [38]. As indicated in the graph, and as previously outlined, MTD is a metric that can be used when the correct key is known prior to the attack [38]. Specifically, as shown in Figure 2(c) and Figure 2(d), the measurements indicate that out of sixteen key bytes, WDDL effectively protects five key bytes. That is, after 1.5 million measurements, five key bytes cannot be broken [38].
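The crossover definition of MTD can be made concrete with a short sketch. This is a hedged illustration, not the evaluation procedure of [38]: the function name is hypothetical, the correlation values below are synthetic numbers invented for the example, and the rule that the correct key must stay above the wrong-key envelope for all later measurement counts is an assumption about how the crossover is read off the curves.

```python
def measurements_to_disclosure(counts, corr_correct, corr_wrong_max):
    """Return the smallest measurement count from which the correct key's
    correlation stays above the wrong-key envelope, or None if it never does."""
    mtd = None
    for n, c_ok, c_bad in zip(counts, corr_correct, corr_wrong_max):
        if c_ok > c_bad:
            if mtd is None:
                mtd = n            # candidate crossover point
        else:
            mtd = None             # the envelope caught up again; reset
    return mtd


# Synthetic data (invented for illustration; not taken from [38]).
counts         = [500, 1000, 1500, 2000, 2500, 3000]
corr_correct   = [0.10, 0.09, 0.11, 0.12, 0.12, 0.12]
corr_wrong_max = [0.20, 0.08, 0.12, 0.11, 0.08, 0.06]

mtd = measurements_to_disclosure(counts, corr_correct, corr_wrong_max)
```

In this synthetic trace the correct key briefly leads at 1000 measurements, falls back under the envelope at 1500, and leads for good from 2000 onward, so the reported crossover is the later point. A protected implementation would return `None` over the measured range.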

Similar to Hwang et al., Kladko and Polulyakh [43] aimed to improve the resistance of a computing device to side-channel attacks. Kladko and Polulyakh [43], however, proposed a different method or system for side-channel testing of a computing device. In their method, one or several physical characteristics observed during the execution of an operation are measured, and the results of the measurements are denoted as the signature of the operation [43].

Furthermore, in assessing the resistance of a computing device to side-channel attacks, Kladko and Polulyakh [43] noted the need to compare and contrast signatures for different operations and values of parameters. If the compared signatures are found to be identical or similar, then the computing device has a strong capacity to resist side-channel attacks [43]. In such cases, there is no significant correlation between the measured set of physical characteristics and the particular operation executed by the device or the particular parameters processed by the device, according to Kladko and Polulyakh [43].

Conversely, a leakage or vulnerability to side channel attacks is indicated when signatures show significant dependence on the type of operation or the parameters of the operation [43]. In these instances, information regarding the type of operation and its parameters may leak through the side channel [43].

According to Kladko and Polulyakh [43], the comparison of signatures may be done visually by the tester. It may also be performed by defining mathematical measures that quantify the degree of resistance of the device to side-channel attacks in numerical terms [43].

Resistance against simple power analysis (SPA) and side-channel attacks can be measured through power consumption. According to Coron [20], power consumption attacks rely on the fact that the power consumed at a given time during a cryptographic process is related to the instruction being executed and to the data being manipulated. Coron [20] explored resistance against differential power analysis for elliptic curve cryptosystems. The study showed that power consumption analysis allows an attacker to distinguish between the instructions being executed. As such, an implementation is resistant against SPA attacks when the instructions performed during a cryptographic algorithm do not depend on the processed data [20]. Coron [20] explained how measuring and monitoring power consumption makes it possible to visually identify large features, distinguishing levels of resistance against SPA attacks. This body of literature underscores the various ways to measure resistance to side-channel attacks, such as assessing levels of power consumption and comparing mathematical measurements [38, 20].

NIST standard for elliptic curve. The National Institute of Standards and Technology (NIST) issues guidelines for the U.S. federal government. According to Hankerson and Menezes, the NIST elliptic curves are a set of curves from the FIPS 186-3 standard that are recommended for U.S. federal government use [1]. There are three types of recommended curves: random elliptic curves over a prime field, random elliptic curves over a binary (characteristic 2) field, and Koblitz [45] elliptic curves over a binary field [1]. Further, NIST has recommended standards regarding elliptic curve cryptography for digital signature algorithms in FIPS 186, as well as for key establishment schemes in SP 800-56A [3]. NIST further outlines that FIPS 186-4 contains fifteen recommended elliptic curves, which differ in terms of security levels. These standardized curves are intended for use in these elliptic curve cryptographic standards [1, 45, 3].

Applications of elliptic curve cryptography. There are numerous ways to apply elliptic curve cryptography. Several authors underlined the effectiveness of elliptic curves in real-life applications [2, 70, 11]. One of the ways to apply elliptic curve cryptography is in voting systems [2, 70, 11]. Ahmad, Hu, and Han delved into this topic and developed an efficient mobile voting system security scheme based on elliptic curve cryptography [2]. Ahmad et al. noted that with elliptic curve cryptography, voting systems are made more transparent and efficient than traditional voting systems [2]. For example, in the traditional process, votes are counted manually; therefore, there is an increased risk of counting errors [2, 70]. Furthermore, in traditional voting, voters could look for ways to cast their votes more than once [70]. Sable and Bombale added that the use of elliptic curve cryptography in voting systems allows a fully automated, computerized online election process, which eliminates the increased risk of counting errors [70]. An election system based on elliptic curve cryptography also allows the votes to be counted in real time, so that by the end of election day the automatic results can be viewed by all stakeholders with transparency [70].

The mechanism of voting systems based on elliptic curve cryptography is mostly implemented through an algorithm. That is, Ahmad et al. proposed that users' votes are better secured by using the elliptic curve cryptography algorithm [70]. This is all the more relevant given that elliptic curve cryptography has a smaller key size than other public key cryptosystems [2, 34]. Furthermore, elliptic curve cryptography has an essential homomorphic encryption property, which allows users' anonymity to be preserved [2, 34]. This characteristic is vital in voting systems, where transparency is key [2, 70, 11].

Ahmad et al. justified the use of elliptic curve cryptography in voting systems by evaluating it in comparison with conventional and/or manual methods [2]. The results of their study demonstrated that their proposed elliptic curve cryptography-based method outperformed the traditional hybrid symmetric and asymmetric cryptographic method, specifically in the voting context [2].

Olaniyi, Tayo, Olusayo, and Olusola added to the findings of Ahmad et al. [2], noting the use of cryptographic and stegano-cryptographic models for secure electronic voting systems [59]. Their study noted that an electronic voting system in electronic decision making has higher success rates than traditional voting systems [59]. This success rate increases when the security, authenticity, and integrity of the pre-electoral, electoral, and post-electoral phases of the electioneering process are ensured [59]. As such, the authors concluded that the use and implementation of elliptic curve cryptography in voting systems is significantly beneficial in delivering fair, transparent, more participatory, and credible elections [59]. This body of literature highlights the effectiveness of elliptic curve cryptography in voting systems, which underscores the need to further examine the implementation process of elliptic curve cryptography-based voting systems.

The use of elliptic curve cryptography in voting systems is also more cost-effective and secure for users. Budurushi, Neumann, and Volkamer noted this in their findings, underscoring how voting systems based on elliptic curve cryptography are more effective from a security, usability, and cost perspective [11]. This is also true in online or eVoting contexts [70, 11, 34]. Girase concurred, proposing the use of a secure smartphone-based voting system with a modified Electronic Voting Machine (EVM) using elliptic curve cryptography [34]. The author noted that with the widespread use of smartphones, there is a significant opportunity to develop an eVoting system that could be used through secure smartphones [34].

According to Girase, a smartphone-based voting system using elliptic curve cryptography significantly reduces the risk of invalid votes, and its use results in a reduction of polling time and an increase in voting percentage, which is similar to the findings of Ahmad et al. and Sable and Bombale [2, 70, 34]. Girase explained the concept further, outlining two ways of voting based on elliptic curve cryptography [34]. In the proposed system, voting can be done through the following: Booth Voting using a modified EVM, and e-Voting using smartphones (Smartphone Based Voting System) [34]. Both proposed methods are reported to be effective, safe, efficient, and secure [34]. Girase further explained that booth voting can be done with modified EVMs, while e-Voting through smartphones can be done with an Android application [34]. In these systems, there is a constant and stable connection to a common server that contains the secured list of voters and the database [34].

The use of elliptic curve cryptography is also effective in electronic payment schemes. Several authors have examined this topic, noting how elliptic curve cryptography can be implemented in mobile payment environments [86, 39, 69]. Rui noted how the use of elliptic curve cryptography, on the basis of public key cryptosystems, authentication, and digital signature technology, provides a safe and efficient payment method [69]. That is, this payment scheme performs the functions of the conventional check while also providing high levels of security and efficiency [69]. Rui's analysis showed that with the use of elliptic curve cryptography, there is an overall decrease in the risk of attacks, a decrease in computation time, and an increase in online transaction efficiency [69]. Yang, Chang, and Chen added to this and noted that through the use of elliptic curve cryptography, electronic payments can be made with an efficient authenticated encryption scheme [86]. The authors explored this further in their study, noting that the recommended payment scheme does not require any digital signature [86]. Therefore, the computation costs are further reduced [86]. The authors acknowledged that the electronic payment system based on elliptic curve cryptography provides the necessary security requirements of confidentiality, authenticity, integrity, privacy protection, and double-spending prevention [86]. Thus the use of elliptic curve cryptography provides a new way of securing payments in electronic commerce contexts [86, 39, 69].

According to Yang et al., the proposed authenticated encryption scheme is also applicable in other contexts such as electronic auctions, online meetings, and electronic voting [86]. The advantages of this scheme include message confidentiality, authenticity, integrity, privacy protection, and double-spending prevention [86, 39]. Issac and Sherali added to these findings, noting that the use of elliptic curve cryptography deserves more exploration given its advantages, especially in the current age of online or electronic commerce [39]. They noted that with elliptic curve cryptography, more secure mobile payment systems could be developed, enabling users to execute transactions on the move with maximum security [39]. The authors further noted that this is essential, given that security remains a paramount concern in the implementation of mobile payment schemes [39].

Several authors explored the use of cryptography in online payment systems [31, 14]. For example, Chaudhry, Farash, Naqvi, and Sher proposed a secure and efficient authenticated encryption scheme for electronic payment systems using elliptic curve cryptography [14]. The authors noted that using elliptic curve cryptography addresses issues of authentication, confidentiality, integrity, and non-repudiation in online payment systems [14]. In the model proposed by Chaudhry et al., using elliptic curve cryptography as part of electronic payment processes eliminates the risk of attacks on both the authenticated encryption scheme and the e-payment system [14]. They supported this claim in their study with both a security analysis and a performance analysis [14].

Gao, Kulkarni, Ranavat, Chang, and Mei explored this further, examining a 2D barcode-based mobile payment system built on cryptographic models [31]. The authors proposed a mobile payment server scheme based on elliptic curve cryptography [31]. The proposed system of Gao et al. offered the following functions:

• Certification generation, management and validation for mobile client and merchant server.

• Mobile user registration for merchant users and end-users.

• Use of the barcode-based framework to process and generate 2D barcode-based messages between mobile clients and the payment server.

• Mobile client authentication and e-wallet management.

• Secure session creation, management, and validation.

• Mobile payment processing based on secured message validation (using the ECC-based key pair) and data integrity checking with digital signatures [31].

Figure 2.3 shows the basic process for generating a signed payment request, while Figure 2.4 shows the basic steps for validating a signed purchase invoice, both of which are essential for ensuring secure mobile payment systems [39, 31].

Figure 2.2: Cracking the secret key. (a) Standard cells and regular routing using 15K measurements: key byte found. (b) Standard cells and regular routing using 15K measurements: key byte found. (c) WDDL and differential routing using 1.5M measurements: key byte found. (d) WDDL and differential routing using 1.5M measurements: key byte not found [38].

Figure 2.3: Generation of Signed Payment [31].

Figure 2.4: Validation of Signed Purchase REQ and Signed Purchase Invoice [31].

Chapter 3

Fast $2^n P$

3.1 Fast nP + mQ

In this section we introduce, step by step, the proposed affine computations of the form $nP + mQ$. We start with a fast one-inversion formula for $2^n P$ (referred to as doubling of order $n$, or the $n$th double) before addressing more complex equations.

3.1.1 Fast $2^2 P$

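By substituting $x_3$, $y_3$ for $x_1$, $y_1$ in Equations 2.3, 2.4 and 2.5, and then expanding $x_3$ and $y_3$ in terms of $x_1$, $y_1$ and $\lambda$, we find the new slope and the $(x_4, y_4)$ coordinates for the second double.

$$\lambda_4 = \frac{3x_3^2 + a}{2y_3} \bmod p$$

$$\lambda_4 = \frac{3(\lambda^2 - 2x_1)^2 + a}{2(\lambda(x_1 - (\lambda^2 - 2x_1)) - y_1)} \bmod p$$

$$\lambda_4 = \frac{3\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right)^2 + a}{2(\lambda(x_1 - (\lambda^2 - 2x_1)) - y_1)} \bmod p$$

$$\lambda_4 = \frac{3\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right)^2 + a}{2x_1\left(\frac{3x_1^2+a}{2y_1}\right) - 2\left(\frac{3x_1^2+a}{2y_1}\right)^3 + 4x_1\left(\frac{3x_1^2+a}{2y_1}\right) - 2y_1} \bmod p \tag{3.1}$$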

In order to get rid of all inverses in Equation 3.1, we multiply $\lambda_4$ by $\frac{(2y_1)^4}{(2y_1)^4}$ to eliminate all the denominators of the slope $\lambda$, which gives

$$\lambda_4 = \frac{3((3x_1^2+a)^2 - 2x_1(2y_1)^2)^2 + a(2y_1)^4}{2x_1(2y_1)^3(3x_1^2+a) - 4y_1(3x_1^2+a)^3 + 4x_1(2y_1)^3(3x_1^2+a) - (2y_1)^5} \bmod p \tag{3.2}$$

The denominator is denoted $U$:

$$U = 2x_1(2y_1)^3(3x_1^2+a) - 4y_1(3x_1^2+a)^3 + 4x_1(2y_1)^3(3x_1^2+a) - (2y_1)^5$$

Rewritten as:

$$U = 2y_1\left(2x_1(2y_1)^2(3x_1^2+a) - 2(3x_1^2+a)^3 + 4x_1(2y_1)^2(3x_1^2+a) - (2y_1)^4\right)$$

Further rewritten as:

$$U = 2y_1 q \tag{3.3}$$

where

$$q = 2x_1(2y_1)^2(3x_1^2+a) - 2(3x_1^2+a)^3 + 4x_1(2y_1)^2(3x_1^2+a) - (2y_1)^4$$

Eliminating inverses speeds up the calculation and increases the efficiency of parallelizing the slope and coordinate equations.

For simplicity, let $W$ denote the numerator of Equation 3.2, so that

$$\lambda_4 = \frac{W}{U} \bmod p \tag{3.4}$$

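Then we substitute Equation 3.4 into the $x_4$ and $y_4$ equations:

$$x_4 = \lambda_4^2 - 2x_3 \bmod p$$

$$x_4 = \lambda_4^2 - 2(\lambda^2 - 2x_1) \bmod p$$

$$x_4 = \lambda_4^2 - 2\lambda^2 + 4x_1 \bmod p \tag{3.5}$$

$$x_4 = \left(\frac{W}{U}\right)^2 - 2\left(\frac{3x_1^2+a}{2y_1}\right)^2 + 4x_1 \bmod p \tag{3.6}$$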

We eliminate the inverses in Equation 3.6 by multiplying by $U^2$, recalling from Equation 3.3 that

$$U = (2y_1)\,q$$

Then we get

$$U^2 x_4 = W^2 - 2q^2(3x_1^2+a)^2 + 4x_1U^2 \bmod p$$

$$x_4 = \frac{W^2 - 2q^2(3x_1^2+a)^2 + 4x_1U^2}{U^2} \bmod p \tag{3.7}$$

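The same steps are applied in order to find and simplify $y_4$:

$$y_4 = \lambda_4(x_3 - x_4) - y_3 \bmod p$$

$$y_4 = \lambda_4((\lambda^2 - 2x_1) - x_4) - (\lambda(x_1 - (\lambda^2 - 2x_1)) - y_1) \bmod p$$

$$y_4 = \lambda_4\lambda^2 - 2x_1\lambda_4 - x_4\lambda_4 - x_1\lambda + \lambda^3 - 2x_1\lambda + y_1 \bmod p$$

Considering

$$x_4 = \frac{N_x}{U^2} \bmod p \tag{3.8}$$

we have

$$y_4 = \frac{W}{U}\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\frac{W}{U} - x_4\frac{W}{U} - x_1\left(\frac{3x_1^2+a}{2y_1}\right) + \left(\frac{3x_1^2+a}{2y_1}\right)^3 - 2x_1\left(\frac{3x_1^2+a}{2y_1}\right) + y_1$$

Then we multiply $y_4$ by $U^3$:

$$U^3 y_4 = Wq^2(3x_1^2+a)^2 - 2x_1WU^2 - N_xW - x_1U^2q(3x_1^2+a) + q^3(3x_1^2+a)^3 - 2x_1U^2q(3x_1^2+a) + y_1U^3$$

Denoting the right-hand side by $N_y$,

$$y_4 = \frac{N_y}{U^3} \bmod p \tag{3.9}$$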

As can be seen from Equations 3.4, 3.8, and 3.9, the denominators of $x_4$ and $y_4$ are powers of the denominator $U$ of $\lambda_4$. Therefore, the second double can be implemented with only one inverse, unlike the original equations, which require two inverses to compute the second double of a point on a curve. Furthermore, computing the value of $\lambda_4$ itself is no longer required.
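The single-inversion second double can be sketched in Python as follows. This is a minimal illustration of Equations 3.2, 3.3, 3.7 and 3.9 under stated assumptions, not the implementation evaluated in this work: the function names `double_affine` and `fast_4p` are hypothetical, the sketch assumes that neither $P$ nor $2P$ has a zero y-coordinate (so that $U$ is invertible), and it relies on Python 3.8 or later for `pow(x, -1, p)`.

```python
def double_affine(x, y, a, p):
    """Textbook affine doubling (Equations 2.3-2.5): one inversion per call."""
    lam = (3 * x * x + a) * pow(2 * y, -1, p) % p
    x3 = (lam * lam - 2 * x) % p
    return x3, (lam * (x - x3) - y) % p


def fast_4p(x1, y1, a, p):
    """Second double 4P with a single inversion (Equations 3.2, 3.3, 3.7, 3.9)."""
    A = (3 * x1 * x1 + a) % p          # numerator of the slope lambda
    B = (2 * y1) % p                   # denominator of the slope lambda
    W = (3 * (A * A - 2 * x1 * B * B) ** 2 + a * B ** 4) % p
    # q of Equation 3.3 with its two x1 terms merged (2x1.. + 4x1.. = 6x1..).
    q = (6 * x1 * B * B * A - 2 * A ** 3 - B ** 4) % p
    U = B * q % p
    Nx = (W * W - 2 * q * q * A * A + 4 * x1 * U * U) % p
    # Numerator of Equation 3.9, again with the two x1*U^2*q terms merged.
    Ny = (W * q * q * A * A - 2 * x1 * W * U * U - Nx * W
          - 3 * x1 * U * U * q * A + q ** 3 * A ** 3 + y1 * U ** 3) % p
    inv_u = pow(U, -1, p)              # the only modular inversion
    inv_u2 = inv_u * inv_u % p
    return Nx * inv_u2 % p, Ny * inv_u2 * inv_u % p


if __name__ == "__main__":
    a, p = 2, 17                       # curve of Equation 3.10
    for point in [(5, 1), (13, 10)]:
        reference = double_affine(*double_affine(*point, a, p), a, p)
        print(point, fast_4p(*point, a, p), reference)
```

The reference path calls `double_affine` twice and therefore inverts twice, while `fast_4p` inverts once and pays for it with additional multiplications.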

Numerical Examples

In this section, we use the cyclic group of points on the elliptic curve $E$ in Figure 3.1, where the order of $E$ is 19 [63]:

$$E : y^2 \equiv x^3 + 2x + 2 \bmod 17 \tag{3.10}$$

Figure 3.1: Cyclic group of the elliptic curve E [63].

As shown in Figure 3.1, the group starts from the primitive element $P = (5,1)$ and runs up to $19P$, which is the identity element, and then wraps around to $P$ again, as is characteristic of any cyclic group. We use this curve in all the numerical example sections that follow each algorithm.

Note: Each section contains two numerical examples. The first examples of all sections share the same base point, and the same holds for the second examples.
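The multiples of $P$ used in the examples can be reproduced with textbook affine arithmetic. The following Python sketch is only an illustration: the helper names `ec_add` and `multiples` are hypothetical, the identity element is represented by `None`, and Python 3.8 or later is assumed for `pow(x, -1, p)`.

```python
def ec_add(P, Q, a, p):
    """Affine addition on y^2 = x^3 + a*x + b over F_p; None is the identity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                               # P + (-P) = identity
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return x3, (lam * (x1 - x3) - y1) % p


def multiples(P, a, p):
    """Return [P, 2P, 3P, ...] up to and including the identity (None)."""
    out, R = [P], P
    while R is not None:
        R = ec_add(R, P, a, p)
        out.append(R)
    return out


if __name__ == "__main__":
    table = multiples((5, 1), a=2, p=17)          # curve of Equation 3.10
    for k, point in enumerate(table, start=1):
        print(f"{k}P = {point}")
```

The list has 19 entries and ends with the identity, in agreement with the stated group order; entry $k$ is the reference value for $kP$ in the examples of this chapter.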

Example 1: Let $P = (5,1)$ and apply our fast $2^2 P$ algorithm. We substitute the values of $x_1$ and $y_1$ into Equations 3.7 and 3.9 in order to find the coordinates of the point $4P$. First we compute $W$, $q$ and $U$:

$$W = 3((3x_1^2+a)^2 - 2x_1(2y_1)^2)^2 + a(2y_1)^4 \bmod p$$

$$W = 3(5929 - 40)^2 + 32 \bmod 17$$

$$W = 104040995 \bmod 17$$

$$W = 9$$

Note that the intermediate values grow large unless they are reduced. In our implementation, the modulo operation is applied after each arithmetic operation, so every value stays below $p$. Thus, from now on, the numerical examples show all values, including intermediate ones, after the modulo operator has been applied.
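The effect of reducing after each operation can be checked directly on the value of $W$ from Example 1. The Python snippet below is an illustrative sketch with hypothetical variable names; it computes $W$ once over the integers and once with a reduction after every step.

```python
p, a, x1, y1 = 17, 2, 5, 1

A = 3 * x1 ** 2 + a                      # 77
B = 2 * y1                               # 2

# Reduce only at the very end: the intermediate value is large.
W_full = 3 * (A ** 2 - 2 * x1 * B ** 2) ** 2 + a * B ** 4

# Reduce after each operation: every intermediate value stays below p.
A_r = A % p
inner = (A_r * A_r - 2 * x1 * B * B) % p
W_step = (3 * inner * inner + a * pow(B, 4, p)) % p

print(W_full, W_full % p, W_step)
```

Both routes give the same residue because reduction modulo $p$ commutes with addition and multiplication.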

Now we compute $q$ and $U$ and then find the values of the new coordinates:

$$q = 2x_1(2y_1)^2(3x_1^2+a) - 2(3x_1^2+a)^3 + 4x_1(2y_1)^2(3x_1^2+a) - (2y_1)^4 \bmod p$$

$$q = 2(5)(2(1))^2(3(5)^2+2) - 2(3(5)^2+2)^3 + 4(5)(2(1))^2(3(5)^2+2) - (2(1))^4 \bmod 17$$

$$q = 3 - 13 + 6 - 16 \bmod 17$$

$$q = 14$$

$$U = (2y_1)\,q \bmod p$$

$$U = 2(14) \bmod 17$$

$$U = 11$$

Now we substitute the values of $W$, $q$, $U$, $x_1$ and $y_1$ into the new coordinate equations and we get

$$x_4 = \frac{(9)^2 - 2(14)^2(3(5)^2+2)^2 + 4(5)(11)^2}{(11)^2} \bmod 17$$

$$x_4 = \frac{13 - 13 + 6}{2} \bmod 17$$

$$x_4 = 6(2^{-1}) \bmod 17$$

$$x_4 = 3$$

where the inverse of 2 mod 17 is equal to 9 and $N_x = 6$.

$$U^3 y_4 = Wq^2(3x_1^2+a)^2 - 2x_1WU^2 - N_xW - x_1U^2q(3x_1^2+a) + q^3(3x_1^2+a)^3 - 2x_1U^2q(3x_1^2+a) + y_1U^3 \bmod p$$

$$(11)^3 y_4 = 9(14)^2(3(5)^2+2)^2 - 2(5)(9)(11)^2 - 6(9) - 5(11)^2(14)(3(5)^2+2) + 14^3(3(5)^2+2)^3 - 2(5)(11)^2(14)(3(5)^2+2) + 1(11)^3 \bmod 17$$

$$y_4 = \frac{16 - 10 - 3 - 2 + 3 - 4 + 5}{5} \bmod 17$$

$$y_4 = 5(5^{-1}) \bmod 17$$

$$y_4 = 1$$

where the inverse of 5 mod 17 is equal to 7 and $N_y = 5$.

Example 2: In this example, we take the point $11P = (13,10)$ as the base point and apply our fast $2^2 P$ algorithm. We expect to obtain the point $4(11P) = 44P$, which, since the order of the curve is $n = 19$ and $44 \bmod 19 = 6$, is the point $6P = (16,13)$. As illustrated in Example 1, we first compute $W$, $q$ and $U$:

$$W = 3((3x_1^2+a)^2 - 2x_1(2y_1)^2)^2 + a(2y_1)^4 \bmod p$$

$$W = 3(1 - 13)^2 + 9 \bmod 17$$

$$W = 16$$

Now we compute $q$ and $U$ and then find the values of the new coordinates:

$$q = 2x_1(2y_1)^2(3x_1^2+a) - 2(3x_1^2+a)^3 + 4x_1(2y_1)^2(3x_1^2+a) - (2y_1)^4 \bmod p$$

$$q = 2(13)(2(10))^2(3(13)^2+2) - 2(3(13)^2+2)^3 + 4(13)(2(10))^2(3(13)^2+2) - (2(10))^4 \bmod 17$$

$$q = 4 - 15 + 8 - 13 \bmod 17$$

$$q = 1$$

$$U = (2y_1)\,q \bmod p$$

$$U = 20(1) \bmod 17$$

$$U = 3$$

Now we substitute the values of $W$, $q$, $U$, $x_1$ and $y_1$ into the new coordinate equations and we get

$$x_4 = \frac{(16)^2 - 2(1)^2(3(13)^2+2)^2 + 4(13)(3)^2}{(3)^2} \bmod 17$$

$$x_4 = \frac{1 - 2 + 9}{9} \bmod 17$$

$$x_4 = 8(9^{-1}) \bmod 17$$

$$x_4 = 16$$

where the inverse of 9 mod 17 is equal to 2 and $N_x = 8$.

$$U^3 y_4 = Wq^2(3x_1^2+a)^2 - 2x_1WU^2 - N_xW - x_1U^2q(3x_1^2+a) + q^3(3x_1^2+a)^3 - 2x_1U^2q(3x_1^2+a) + y_1U^3 \bmod p$$

$$(3)^3 y_4 = 16(1)^2(3(13)^2+2)^2 - 2(13)(16)(3)^2 - 8(16) - 13(3)^2(1)(3(13)^2+2) + 1^3(3(13)^2+2)^3 - 2(13)(3)^2(1)(3(13)^2+2) + 10(3)^3 \bmod 17$$

$$y_4 = \frac{16 - 4 - 9 - 2 + 16 - 4 + 15}{10} \bmod 17$$

$$y_4 = 11(10^{-1}) \bmod 17$$

$$y_4 = 13$$

where the inverse of 10 mod 17 is equal to 12 and $N_y = 11$.

3.1.2 Fast $2^3 P$

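By applying the same steps that were followed in finding the second double, one can find the third double. First, we employ $x_4$, $y_4$, and $\lambda_4$ in Equations 2.3, 2.4, and 2.5 respectively.

$$\lambda_8 = \frac{3x_4^2 + a}{2y_4} \bmod p$$

$$\lambda_8 = \frac{3\left(\lambda_4^2 - 2\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right)\right)^2 + a}{2\left(\lambda_4\left(\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right) - x_4\right) - \left(\left(\frac{3x_1^2+a}{2y_1}\right)\left(x_1 - \left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right)\right) - y_1\right)\right)} \bmod p$$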

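For simplification we rewrite this as:

$$\lambda_8 = \frac{W_8}{U_8} \bmod p \tag{3.11}$$

$$W_8 = 3\left(\left(\frac{W}{U}\right)^2 - 2\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right)\right)^2 + a$$

$$U_8 = 2\left(\lambda_4\left(\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right) - x_4\right) - \left(\left(\frac{3x_1^2+a}{2y_1}\right)\left(x_1 - \left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right)\right) - y_1\right)\right)$$

$$U_8 = 2\left(\lambda_4\left(\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right) - \left(\lambda_4^2 - 2\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right)\right)\right) - \left(\left(\frac{3x_1^2+a}{2y_1}\right)\left(x_1 - \left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right)\right) - y_1\right)\right)$$

$$U_8 = 2\lambda_4\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 4x_1\lambda_4 - 2\lambda_4^3 + 4\lambda_4\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 8x_1\lambda_4 - 2x_1\left(\frac{3x_1^2+a}{2y_1}\right) + 2\left(\frac{3x_1^2+a}{2y_1}\right)^3 - 4x_1\left(\frac{3x_1^2+a}{2y_1}\right) + 2y_1$$

$$U_8 = 6\left(\frac{W}{U}\right)\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 12x_1\left(\frac{W}{U}\right) - 6x_1\left(\frac{3x_1^2+a}{2y_1}\right) - 2\left(\frac{W}{U}\right)^3 + 2\left(\frac{3x_1^2+a}{2y_1}\right)^3 + 2y_1$$

$$\lambda_8 = \frac{3\left(\left(\frac{W}{U}\right)^2 - 2\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right)\right)^2 + a}{6\left(\frac{W}{U}\right)\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 12x_1\left(\frac{W}{U}\right) - 6x_1\left(\frac{3x_1^2+a}{2y_1}\right) - 2\left(\frac{W}{U}\right)^3 + 2\left(\frac{3x_1^2+a}{2y_1}\right)^3 + 2y_1} \bmod p$$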

In order to eliminate all the inverses in the equation for $\lambda_8$, which come from $\lambda$ and $\lambda_4$, we multiply $\lambda_8$ by $\frac{U^4}{U^4}$, where

$$U_8 = (2y_1)\,U q_8 \bmod p \tag{3.12}$$

$$W_8 = 3(W^2 - 2q^2(3x_1^2+a)^2 + 4x_1U^2)^2 + aU^4 \bmod p$$

$$U_8 = 6W(2y_1)q^3(3x_1^2+a)^2 - 12x_1WU^3 - 6x_1(2y_1)^3q^4(3x_1^2+a) - 2W^3U + 2(2y_1)q^4(3x_1^2+a)^3 + (2y_1)U^4 \bmod p$$

Now, we reformulate the $U_8$ equation to maintain the form of Equation 3.12. Thus, we multiply $\lambda_8$ (i.e., $\frac{W_8}{U_8}$) by $\frac{2y_1}{2y_1}$ and then take out $(2y_1)U$ as a common factor of $U_8$. This way we can easily eliminate all inverses in $x_8$ and $y_8$.

$$W_8 = (2y_1)\left(3(W^2 - 2q^2(3x_1^2+a)^2 + 4x_1U^2)^2 + aU^4\right) \bmod p$$

$$U_8 = (2y_1)\left(6W(2y_1)q^3(3x_1^2+a)^2 - 12x_1WU^3 - 6x_1(2y_1)^3q^4(3x_1^2+a) - 2W^3U + 2(2y_1)q^4(3x_1^2+a)^3 + (2y_1)U^4\right) \bmod p$$

$$U_8 = (2y_1)U\left(6Wq^2(3x_1^2+a)^2 - 12x_1WU^2 - 6x_1(2y_1)^2q^3(3x_1^2+a) - 2W^3 + 2q^3(3x_1^2+a)^3 + (2y_1)U^3\right) \bmod p$$

where

$$q_8 = 6Wq^2(3x_1^2+a)^2 - 12x_1WU^2 - 6x_1(2y_1)^2q^3(3x_1^2+a) - 2W^3 + 2q^3(3x_1^2+a)^3 + (2y_1)U^3 \bmod p$$

To compute $x_8$ and $y_8$, we substitute Equation 3.11, with the new values of $W_8$ and $U_8$, into the $x_8$ and $y_8$ equations:

$$x_8 = \lambda_8^2 - 2x_4 \bmod p$$

$$x_8 = \lambda_8^2 - 2(\lambda_4^2 - 2(\lambda^2 - 2x_1)) \bmod p$$

$$x_8 = \lambda_8^2 - 2(\lambda_4^2 - 2\lambda^2 + 4x_1) \bmod p$$

$$x_8 = \lambda_8^2 - 2\lambda_4^2 + 4\lambda^2 - 8x_1 \bmod p \tag{3.13}$$

$$x_8 = \left(\frac{W_8}{U_8}\right)^2 - 2\left(\frac{W}{U}\right)^2 + 4\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 8x_1 \bmod p$$

Multiplying both sides by $U_8^2$, using the value of $U_8$ from Equation 3.12 to eliminate all inverses,

$$U_8^2 x_8 = W_8^2 - 2W^2(2y_1)^2q_8^2 + 4(3x_1^2+a)^2U^2q_8^2 - 8x_1U_8^2 \bmod p$$

$$x_8 = \frac{W_8^2 - 2W^2(2y_1)^2q_8^2 + 4(3x_1^2+a)^2U^2q_8^2 - 8x_1U_8^2}{U_8^2} \bmod p \tag{3.14}$$

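The same steps are applied in order to find and simplify $y_8$:

$$y_8 = \lambda_8(x_4 - x_8) - y_4 \bmod p$$

$$y_8 = \lambda_8((\lambda_4^2 - 2(\lambda^2 - 2x_1)) - x_8) - \lambda_4((\lambda^2 - 2x_1) - (\lambda_4^2 - 2(\lambda^2 - 2x_1))) + (\lambda(x_1 - (\lambda^2 - 2x_1)) - y_1) \bmod p$$

$$y_8 = \lambda_8(\lambda_4^2 - 2\lambda^2 + 4x_1 - x_8) - (\lambda_4\lambda^2 - 2x_1\lambda_4 - x_4\lambda_4) + (\lambda x_1 - \lambda^3 + 2x_1\lambda - y_1) \bmod p$$

$$y_8 = \lambda_8\lambda_4^2 - 2\lambda_8\lambda^2 + 4x_1\lambda_8 - x_8\lambda_8 - \lambda_4\lambda^2 + 2x_1\lambda_4 + x_4\lambda_4 + \lambda x_1 - \lambda^3 + 2x_1\lambda - y_1 \bmod p \tag{3.15}$$

Similarly to the decomposition $x_4 = \frac{N_x}{U^2}$, we consider $x_8 = \frac{N_{x8}}{U_8^2}$:

$$y_8 = \frac{W_8}{U_8}\left(\frac{W}{U}\right)^2 - 2\frac{W_8}{U_8}\left(\frac{3x_1^2+a}{2y_1}\right)^2 + 4x_1\frac{W_8}{U_8} - \frac{N_{x8}}{U_8^2}\frac{W_8}{U_8} - \frac{W}{U}\left(\frac{3x_1^2+a}{2y_1}\right)^2 + 2x_1\frac{W}{U} + \frac{N_x}{U^2}\frac{W}{U} + x_1\left(\frac{3x_1^2+a}{2y_1}\right) - \left(\frac{3x_1^2+a}{2y_1}\right)^3 + 2x_1\left(\frac{3x_1^2+a}{2y_1}\right) - y_1 \bmod p$$

Multiplying both sides by $U_8^3$,

$$U_8^3 y_8 = W_8W^2(2y_1)^2q_8^2 - 2W_8(3x_1^2+a)^2U^2q_8^2 + 4x_1W_8U_8^2 - N_{x8}W_8 - W(3x_1^2+a)^2U_8Uq_8^2 + 2x_1WU_8^2(2y_1)q_8 + N_xW(2y_1)^3q_8^3 + x_1U_8^2Uq_8(3x_1^2+a) - U^3q_8^3(3x_1^2+a)^3 + 2x_1U_8^2Uq_8(3x_1^2+a) - y_1U_8^3 \bmod p$$

Denoting the right-hand side by $N_{y8}$,

$$y_8 = \frac{N_{y8}}{U_8^3} \bmod p \tag{3.16}$$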

As can be observed, this shows that the third double of a point on an elliptic curve can be computed with only one inverse, as was done for the second double.
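The third double can be sketched in the same way. The Python function below is a hypothetical illustration of Equations 3.12, 3.14 and 3.16 (the name `fast_8p` is not from this work). It assumes that none of $P$, $2P$, $4P$ has a zero y-coordinate and that Python 3.8 or later is available. It also uses the observation that the bracketed term of $W_8$ coincides with $N_x$, and it merges like terms in $q$ and $N_{y8}$.

```python
def fast_8p(x1, y1, a, p):
    """Third double 8P with a single inversion (Equations 3.12, 3.14, 3.16)."""
    A, B = (3 * x1 * x1 + a) % p, (2 * y1) % p
    # Quantities of the second double (Section 3.1.1).
    W = (3 * (A * A - 2 * x1 * B * B) ** 2 + a * B ** 4) % p
    q = (6 * x1 * B * B * A - 2 * A ** 3 - B ** 4) % p
    U = B * q % p
    Nx = (W * W - 2 * q * q * A * A + 4 * x1 * U * U) % p
    # Quantities of the third double; W8's bracket equals Nx.
    W8 = B * (3 * Nx * Nx + a * U ** 4) % p
    q8 = (6 * W * q * q * A * A - 12 * x1 * W * U * U
          - 6 * x1 * B * B * q ** 3 * A - 2 * W ** 3
          + 2 * q ** 3 * A ** 3 + B * U ** 3) % p
    U8 = B * U * q8 % p
    Nx8 = (W8 * W8 - 2 * W * W * B * B * q8 * q8
           + 4 * A * A * U * U * q8 * q8 - 8 * x1 * U8 * U8) % p
    Ny8 = (W8 * W * W * B * B * q8 * q8 - 2 * W8 * A * A * U * U * q8 * q8
           + 4 * x1 * W8 * U8 * U8 - Nx8 * W8 - W * A * A * U8 * U * q8 * q8
           + 2 * x1 * W * U8 * U8 * B * q8 + Nx * W * B ** 3 * q8 ** 3
           + 3 * x1 * U8 * U8 * U * q8 * A      # two x1 terms merged
           - U ** 3 * q8 ** 3 * A ** 3 - y1 * U8 ** 3) % p
    inv = pow(U8, -1, p)                        # the only modular inversion
    return Nx8 * inv * inv % p, Ny8 * inv ** 3 % p


if __name__ == "__main__":
    print(fast_8p(5, 1, 2, 17), fast_8p(13, 10, 2, 17))
```

On the curve of Equation 3.10 the two calls correspond to $8P$ and $8(11P) = 12P$.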

Numerical Examples

Example 1: Let $P = (5,1)$ and apply our fast $2^3 P$ algorithm. We substitute the values of $x_1$ and $y_1$ into Equations 3.14 and 3.16 in order to find the coordinates of the point $8P$. First, we find $W_8$, $q_8$ and $U_8$, reusing the values of $N_x$, $W$, $q$ and $U$ from Example 1 at the end of Section 3.1.1.

$$W_8 = (2y_1)\left(3(W^2 - 2q^2(3x_1^2+a)^2 + 4x_1U^2)^2 + aU^4\right) \bmod p$$

$$W_8 = (2(1))\left(3(9^2 - 2(14)^2(3(5)^2+2)^2 + 4(5)11^2)^2 + 2(11)^4\right) \bmod 17$$

$$W_8 = 2(3(13 - 13 + 6)^2 + 8) \bmod 17$$

$$W_8 = 11$$

$$q_8 = 6Wq^2(3x_1^2+a)^2 - 12x_1WU^2 - 6x_1(2y_1)^2q^3(3x_1^2+a) - 2W^3 + 2q^3(3x_1^2+a)^3 + (2y_1)U^3 \bmod p$$

$$q_8 = 6(9)(14)^2(3(5)^2+2)^2 - 12(5)(9)(11)^2 - 6(5)(2(1))^2(14)^3(3(5)^2+2) - 2(9)^3 + 2(14)^3(3(5)^2+2)^3 + (2(1))(11)^3 \bmod 17$$

$$q_8 = 11 - 9 - 12 - 13 + 6 + 10 \bmod 17$$

$$q_8 = 10$$

$$U_8 = (2y_1)Uq_8 \bmod p$$

$$U_8 = (2(1))(11)(10) \bmod 17$$

$$U_8 = 16$$

Now we substitute the previously computed variables into the new x and y coordinate equations in order to find the third double, and we get

$$x_8 = \frac{W_8^2 - 2W^2(2y_1)^2q_8^2 + 4(3x_1^2+a)^2U^2q_8^2 - 8x_1U_8^2}{U_8^2} \bmod p$$

$$x_8 = \frac{(11)^2 - 2(9)^2(2(1))^2(10)^2 + 4(3(5)^2+2)^2(11)^2(10)^2 - 8(5)(16)^2}{(16)^2} \bmod 17$$

$$x_8 = \frac{2 - 13 + 13 - 6}{1} \bmod 17$$

$$x_8 = 13$$

where the inverse of 1 is 1 and $N_{x8} = 13$.

$$U_8^3 y_8 = W_8W^2(2y_1)^2q_8^2 - 2W_8(3x_1^2+a)^2U^2q_8^2 + 4x_1W_8U_8^2 - N_{x8}W_8 - W(3x_1^2+a)^2U_8Uq_8^2 + 2x_1WU_8^2(2y_1)q_8 + N_xW(2y_1)^3q_8^3 + x_1U_8^2Uq_8(3x_1^2+a) - U^3q_8^3(3x_1^2+a)^3 + 2x_1U_8^2Uq_8(3x_1^2+a) - y_1U_8^3 \bmod p$$

$$(16)^3 y_8 = (11)(9)^2(2(1))^2(10)^2 - 2(11)(3(5)^2+2)^2(11)^2(10)^2 + 4(5)(11)(16)^2 - 13(11) - 9(3(5)^2+2)^2(16)(11)(10)^2 + 2(5)(9)(16)^2(2(1))(10) + (6)(9)(2(1))^3(10)^3 + 5(16)^2(11)(10)(3(5)^2+2) - (11)^3(10)^3(3(5)^2+2)^3 + 2(5)(16)^2(11)(10)(3(5)^2+2) - 1(16)^3 \bmod 17$$

$$16\,y_8 = 12 - 12 + 16 - 7 - 7 + 15 + 13 + 3 - 13 + 6 - 16 \bmod 17$$

$$y_8 = 10(16^{-1}) \bmod 17$$

$$y_8 = 7$$

where the inverse of 16 is 16 and $N_{y8} = 10$.

Example 2: As specified in Example 2 of Section 3.1.1, the base point is $11P = (13,10)$. Applying our fast $2^3 P$ algorithm, we expect to obtain the point $8(11P) = 88P$, which, since $88 \bmod 19 = 12$, is the point $12P = (0,11)$. As illustrated in Example 1, we compute $W_8$, $q_8$ and $U_8$ using the previously computed values of $W$, $q$, $U$ and $N_x$. Then we get

$$W_8 = (2y_1)\left(3(W^2 - 2q^2(3x_1^2+a)^2 + 4x_1U^2)^2 + aU^4\right) \bmod p$$

$$W_8 = (2(10))\left(3(16^2 - 2(1)^2(3(13)^2+2)^2 + 4(13)3^2)^2 + 2(3)^4\right) \bmod 17$$

$$W_8 = 3(3(1 - 2 + 9)^2 + 9) \bmod 17$$

$$W_8 = 8$$

$$q_8 = 6Wq^2(3x_1^2+a)^2 - 12x_1WU^2 - 6x_1(2y_1)^2q^3(3x_1^2+a) - 2W^3 + 2q^3(3x_1^2+a)^3 + (2y_1)U^3 \bmod p$$

$$q_8 = 6(16)(1)^2(3(13)^2+2)^2 - 12(13)(16)(3)^2 - 6(13)(2(10))^2(1)^3(3(13)^2+2) - 2(16)^3 + 2(1)^3(3(13)^2+2)^3 + (2(10))(3)^3 \bmod 17$$

$$q_8 = 11 - 7 - 12 - 15 + 15 + 13 \bmod 17$$

$$q_8 = 5$$

$$U_8 = (2y_1)Uq_8 \bmod p$$

$$U_8 = (2(10))(3)(5) \bmod 17$$

$$U_8 = 11$$

Now we substitute the previously computed variables into the new x and y coordinate equations in order to find the third double, and we get

$$x_8 = \frac{W_8^2 - 2W^2(2y_1)^2q_8^2 + 4(3x_1^2+a)^2U^2q_8^2 - 8x_1U_8^2}{U_8^2} \bmod p$$

$$x_8 = \frac{(8)^2 - 2(16)^2(2(10))^2(5)^2 + 4(3(13)^2+2)^2(3)^2(5)^2 - 8(13)(11)^2}{(11)^2} \bmod 17$$

$$x_8 = \frac{13 - 8 + 16 - 4}{2} \bmod 17$$

$$x_8 = 0$$

where the inverse of 2 is 9 and $N_{x8} = 0$.

$$U_8^3 y_8 = W_8W^2(2y_1)^2q_8^2 - 2W_8(3x_1^2+a)^2U^2q_8^2 + 4x_1W_8U_8^2 - N_{x8}W_8 - W(3x_1^2+a)^2U_8Uq_8^2 + 2x_1WU_8^2(2y_1)q_8 + N_xW(2y_1)^3q_8^3 + x_1U_8^2Uq_8(3x_1^2+a) - U^3q_8^3(3x_1^2+a)^3 + 2x_1U_8^2Uq_8(3x_1^2+a) - y_1U_8^3 \bmod p$$

$$(11)^3 y_8 = (8)(16)^2(2(10))^2(5)^2 - 2(8)(3(13)^2+2)^2(3)^2(5)^2 + 4(13)(8)(11)^2 - 0(8) - 16(3(13)^2+2)^2(11)(3)(5)^2 + 2(13)(16)(11)^2(2(10))(5) + (8)(16)(2(10))^3(5)^3 + 13(11)^2(3)(5)(3(13)^2+2) - (3)^3(5)^3(3(13)^2+2)^3 + 2(13)(11)^2(3)(5)(3(13)^2+2) - 10(11)^3 \bmod 17$$

$$5\,y_8 = 15 - 13 + 16 - 0 - 8 + 2 + 13 + 1 - 8 + 2 - 16 \bmod 17$$

$$y_8 = 4(5^{-1}) \bmod 17$$

$$y_8 = 11$$

where the inverse of 5 is 7 and $N_{y8} = 4$.

3.1.3 Fast $2^4 P$

By expanding the values of $x_8$, $y_8$, and $\lambda_8$ in the $\lambda_{16}$ equation, we find the new slope for the fourth double. Then, we apply the same steps that were followed in the previous sections.

$$\lambda_{16} = \frac{3x_8^2 + a}{2y_8} \bmod p$$

Considering

$$\lambda_{16} = \frac{W_{16}}{U_{16}} \bmod p \tag{3.17}$$

where

$$U_{16} = (2y_1)U_8Uq_{16} \bmod p \tag{3.18}$$

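Referring to Equation 3.13 for the value of $x_8$,

$$W_{16} = 3(\lambda_8^2 - 2\lambda_4^2 + 4\lambda^2 - 8x_1)^2 + a \bmod p$$

$$W_{16} = 3\left(\left(\frac{W_8}{U_8}\right)^2 - 2\left(\frac{W}{U}\right)^2 + 4\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 8x_1\right)^2 + a \bmod p$$

Referring to Equations 3.5, 3.13, and 3.15 for the values of $x_4$, $x_8$, and $y_8$ respectively, we get

$$U_{16} = 2(\lambda_8\lambda_4^2 - 2\lambda_8\lambda^2 + 4x_1\lambda_8 - x_8\lambda_8 - \lambda_4\lambda^2 + 2x_1\lambda_4 + x_4\lambda_4 + \lambda x_1 - \lambda^3 + 2x_1\lambda - y_1) \bmod p$$

$$U_{16} = 2(\lambda_8\lambda_4^2 - 2\lambda_8\lambda^2 + 4x_1\lambda_8 - (\lambda_8^2 - 2\lambda_4^2 + 4\lambda^2 - 8x_1)\lambda_8 - \lambda_4\lambda^2 + 2x_1\lambda_4 + (\lambda_4^2 - 2\lambda^2 + 4x_1)\lambda_4 + \lambda x_1 - \lambda^3 + 2x_1\lambda - y_1) \bmod p$$

$$U_{16} = 6\lambda_8\lambda_4^2 - 12\lambda^2\lambda_8 + 24x_1\lambda_8 - 2\lambda_8^3 - 6\lambda_4\lambda^2 + 12x_1\lambda_4 + 2\lambda_4^3 + 6x_1\lambda - 2\lambda^3 - 2y_1 \bmod p$$

$$U_{16} = 6\left(\frac{W_8}{U_8}\right)\left(\frac{W}{U}\right)^2 - 12\left(\frac{3x_1^2+a}{2y_1}\right)^2\left(\frac{W_8}{U_8}\right) + 24x_1\left(\frac{W_8}{U_8}\right) - 2\left(\frac{W_8}{U_8}\right)^3 - 6\left(\frac{W}{U}\right)\left(\frac{3x_1^2+a}{2y_1}\right)^2 + 12x_1\left(\frac{W}{U}\right) + 2\left(\frac{W}{U}\right)^3 + 6x_1\left(\frac{3x_1^2+a}{2y_1}\right) - 2\left(\frac{3x_1^2+a}{2y_1}\right)^3 - 2y_1 \bmod p$$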

In order to eliminate all the inverses in the equation for $\lambda_{16}$ that come from $\lambda$, $\lambda_4$, and $\lambda_8$, we multiply $\lambda_{16}$ by $\frac{U_8^4}{U_8^4}$, using the value of $U_8$ from Equation 3.12.

$$W_{16} = 3(W_8^2 - 2W^2(2y_1)^2q_8^2 + 4U^2q_8^2(3x_1^2+a)^2 - 8x_1U_8^2)^2 + aU_8^4 \bmod p$$

$$U_{16} = 6W_8W^2(2y_1)^2q_8^2U_8 - 12(3x_1^2+a)^2W_8q_8^2U^2U_8 + 24x_1W_8U_8^3 - 2W_8^3U_8 - 6W(3x_1^2+a)^2q_8^2UU_8^2 + 12x_1W(2y_1)q_8U_8^3 + 2W^3(2y_1)^3q_8^3U_8 + 6x_1(3x_1^2+a)q_8UU_8^3 - 2(3x_1^2+a)^3q_8^3U^3U_8 - 2y_1U_8^4 \bmod p$$

Now, we reformulate the equation of $U_{16}$ to obtain the form of Equation 3.18, in the same way as was done for Equation 3.12. Thus, we multiply $\lambda_{16}$ (that is, $\frac{W_{16}}{U_{16}}$) by $\frac{(2y_1)U}{(2y_1)U}$ and then take out $(2y_1)UU_8$ as a common factor of $U_{16}$. This way we can eliminate all inverses from the equations for computing $x_{16}$ and $y_{16}$.

$$W_{16} = (2y_1)U\left(3(W_8^2 - 2W^2(2y_1)^2q_8^2 + 4U^2q_8^2(3x_1^2+a)^2 - 8x_1U_8^2)^2 + aU_8^4\right) \bmod p$$

$$U_{16} = (2y_1)U\left(6W_8W^2(2y_1)^2q_8^2U_8 - 12(3x_1^2+a)^2W_8q_8^2U^2U_8 + 24x_1W_8U_8^3 - 2W_8^3U_8 - 6W(3x_1^2+a)^2q_8^2UU_8^2 + 12x_1W(2y_1)q_8U_8^3 + 2W^3(2y_1)^3q_8^3U_8 + 6x_1(3x_1^2+a)q_8UU_8^3 - 2(3x_1^2+a)^3q_8^3U^3U_8 - 2y_1U_8^4\right) \bmod p$$

$$U_{16} = (2y_1)UU_8\left(6W_8W^2(2y_1)^2q_8^2 - 12(3x_1^2+a)^2W_8q_8^2U^2 + 24x_1W_8U_8^2 - 2W_8^3 - 6W(3x_1^2+a)^2q_8^2UU_8 + 12x_1W(2y_1)q_8U_8^2 + 2W^3(2y_1)^3q_8^3 + 6x_1(3x_1^2+a)q_8UU_8^2 - 2(3x_1^2+a)^3q_8^3U^3 - 2y_1U_8^3\right) \bmod p$$

The expression in the last pair of parentheses is $q_{16}$.

Thus, $\lambda_{16}$ is found with a single inverse, unlike the original method, which requires 4 inverses to compute the slope of the fourth double. Likewise, the following equations show that the denominators of $x_{16}$ and $y_{16}$ are powers of the denominator of $\lambda_{16}$. As a result, no separate inverses need to be computed for them, as was shown before for the lower-order doubles.

$$x_{16} = \lambda_{16}^2 - 2x_8 \bmod p$$

$$x_{16} = \lambda_{16}^2 - 2(\lambda_8^2 - 2(\lambda_4^2 - 2(\lambda^2 - 2x_1))) \bmod p$$

$$x_{16} = \lambda_{16}^2 - 2(\lambda_8^2 - 2(\lambda_4^2 - 2\lambda^2 + 4x_1)) \bmod p$$

$$x_{16} = \lambda_{16}^2 - 2(\lambda_8^2 - 2\lambda_4^2 + 4\lambda^2 - 8x_1) \bmod p$$

$$x_{16} = \lambda_{16}^2 - 2\lambda_8^2 + 4\lambda_4^2 - 8\lambda^2 + 16x_1 \bmod p$$

$$x_{16} = \left(\frac{W_{16}}{U_{16}}\right)^2 - 2\left(\frac{W_8}{U_8}\right)^2 + 4\left(\frac{W}{U}\right)^2 - 8\left(\frac{3x_1^2+a}{2y_1}\right)^2 + 16x_1 \bmod p$$

In order to eliminate the inverses in the equation of $x_{16}$, we multiply both sides by $U_{16}^2$, using the value of $U_{16}$ from Equation 3.18:

$$U_{16}^2 x_{16} = W_{16}^2 - 2W_8^2(2y_1)^2U^2q_{16}^2 + 4W^2(2y_1)^2U_8^2q_{16}^2 - 8(3x_1^2+a)^2U_8^2U^2q_{16}^2 + 16x_1U_{16}^2 \bmod p$$

$$x_{16} = \frac{W_{16}^2 - 2W_8^2(2y_1)^2U^2q_{16}^2 + 4W^2(2y_1)^2U_8^2q_{16}^2 - 8(3x_1^2+a)^2U_8^2U^2q_{16}^2 + 16x_1U_{16}^2}{U_{16}^2} \bmod p \tag{3.19}$$

Likewise, we find $y_{16}$:

$$y_{16} = \lambda_{16}(x_8 - x_{16}) - y_8 \bmod p$$

Based on the decomposition of $x_8$ as $\frac{N_{x8}}{U_8^2}$, given that $y_8 = \frac{N_{y8}}{U_8^3}$, and rewriting $x_{16}$ as $\frac{N_{x16}}{U_{16}^2}$, we substitute these values into the previous equation to find $y_{16}$.

$$y_{16} = \frac{W_{16}}{U_{16}}\left(\frac{N_{x8}}{U_8^2} - \frac{N_{x16}}{U_{16}^2}\right) - \frac{N_{y8}}{U_8^3} \bmod p$$

Multiplying both sides by $U_{16}^3$ to eliminate all inverses, as done previously, and using the value of $U_{16}$ from Equation 3.18,

$$U_{16}^3 y_{16} = W_{16}N_{x8}(2y_1)^2U^2q_{16}^2 - W_{16}N_{x16} - N_{y8}(2y_1)^3U^3q_{16}^3 \bmod p$$

$$y_{16} = \frac{W_{16}N_{x8}(2y_1)^2U^2q_{16}^2 - W_{16}N_{x16} - N_{y8}(2y_1)^3U^3q_{16}^3}{U_{16}^3} \bmod p \tag{3.20}$$
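The complete chain up to the fourth double can be sketched as follows. The Python function `fast_16p` is a hypothetical illustration of Equations 3.18 to 3.20 built on the quantities of Sections 3.1.1 and 3.1.2, not the implementation evaluated in this work. It assumes Python 3.8 or later and that $U_{16}$ is invertible, merges like terms in $q$ and $N_{y8}$, and uses the observation that the bracketed terms of $W_8$ and $W_{16}$ coincide with $N_x$ and $N_{x8}$.

```python
def fast_16p(x1, y1, a, p):
    """Fourth double 16P with a single inversion (Equations 3.18-3.20)."""
    A, B = (3 * x1 * x1 + a) % p, (2 * y1) % p
    # Second double (Section 3.1.1).
    W = (3 * (A * A - 2 * x1 * B * B) ** 2 + a * B ** 4) % p
    q = (6 * x1 * B * B * A - 2 * A ** 3 - B ** 4) % p
    U = B * q % p
    Nx = (W * W - 2 * q * q * A * A + 4 * x1 * U * U) % p
    # Third double (Section 3.1.2).
    W8 = B * (3 * Nx * Nx + a * U ** 4) % p
    q8 = (6 * W * q * q * A * A - 12 * x1 * W * U * U
          - 6 * x1 * B * B * q ** 3 * A - 2 * W ** 3
          + 2 * q ** 3 * A ** 3 + B * U ** 3) % p
    U8 = B * U * q8 % p
    Nx8 = (W8 * W8 - 2 * W * W * B * B * q8 * q8
           + 4 * A * A * U * U * q8 * q8 - 8 * x1 * U8 * U8) % p
    Ny8 = (W8 * W * W * B * B * q8 * q8 - 2 * W8 * A * A * U * U * q8 * q8
           + 4 * x1 * W8 * U8 * U8 - Nx8 * W8 - W * A * A * U8 * U * q8 * q8
           + 2 * x1 * W * U8 * U8 * B * q8 + Nx * W * B ** 3 * q8 ** 3
           + 3 * x1 * U8 * U8 * U * q8 * A
           - U ** 3 * q8 ** 3 * A ** 3 - y1 * U8 ** 3) % p
    # Fourth double (this section).
    W16 = B * U * (3 * Nx8 * Nx8 + a * U8 ** 4) % p
    q16 = (6 * W8 * W * W * B * B * q8 * q8 - 12 * A * A * W8 * q8 * q8 * U * U
           + 24 * x1 * W8 * U8 * U8 - 2 * W8 ** 3
           - 6 * W * A * A * q8 * q8 * U * U8 + 12 * x1 * W * B * q8 * U8 * U8
           + 2 * W ** 3 * B ** 3 * q8 ** 3 + 6 * x1 * A * q8 * U * U8 * U8
           - 2 * A ** 3 * q8 ** 3 * U ** 3 - 2 * y1 * U8 ** 3) % p
    U16 = B * U * U8 * q16 % p
    s = B * U * q16 % p                         # s = U16 / U8
    Nx16 = (W16 * W16 - 2 * W8 * W8 * s * s
            + 4 * W * W * B * B * U8 * U8 * q16 * q16
            - 8 * A * A * U8 * U8 * U * U * q16 * q16
            + 16 * x1 * U16 * U16) % p
    Ny16 = (W16 * Nx8 * s * s - W16 * Nx16 - Ny8 * s ** 3) % p
    inv = pow(U16, -1, p)                       # the only modular inversion
    return Nx16 * inv * inv % p, Ny16 * inv ** 3 % p


if __name__ == "__main__":
    print(fast_16p(5, 1, 2, 17), fast_16p(13, 10, 2, 17))
```

On the curve of Equation 3.10 the two calls correspond to $16P$ and $16(11P) = 5P$; four successive textbook doublings would need four inversions for the same result.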

Numerical Examples

Example 1: Let $P = (5,1)$ and apply our fast $2^4 P$ algorithm. We substitute the values of $x_1$ and $y_1$ into Equations 3.19 and 3.20 in order to find the coordinates of the point $16P$. First, we find $W_{16}$, $q_{16}$ and $U_{16}$. Referring to the series of Example 1 at the end of Section 3.1.1 and Section 3.1.2, we use the following values:

$N_x = 6$

$W = 9$

$q = 14$

$U = 11$

$N_{y8} = 10$

$N_{x8} = 13$

$W_8 = 11$

$q_8 = 10$

$U_8 = 16$

$$W_{16} = (2y_1)U\left(3(W_8^2 - 2W^2(2y_1)^2q_8^2 + 4U^2q_8^2(3x_1^2+a)^2 - 8x_1U_8^2)^2 + aU_8^4\right) \bmod p$$

$$W_{16} = (2(1))(11)\left(3(11^2 - 2(9)^2(2(1))^2(10)^2 + 4(11)^2(10)^2(3(5)^2+2)^2 - 8(5)(16)^2)^2 + 2(16)^4\right) \bmod 17$$

$$W_{16} = 5(3(2 - 13 + 13 - 6)^2 + 2) \bmod 17$$

$$W_{16} = 12$$

$$q_{16} = 6W_8W^2(2y_1)^2q_8^2 - 12(3x_1^2+a)^2W_8q_8^2U^2 + 24x_1W_8U_8^2 - 2W_8^3 - 6W(3x_1^2+a)^2q_8^2UU_8 + 12x_1W(2y_1)q_8U_8^2 + 2W^3(2y_1)^3q_8^3 + 6x_1(3x_1^2+a)q_8UU_8^2 - 2(3x_1^2+a)^3q_8^3U^3 - 2y_1U_8^3 \bmod p$$

$$q_{16} = 6(11)(9)^2(2(1))^2(10)^2 - 12(3(5)^2+2)^2(11)(10)^2(11)^2 + 24(5)(11)(16)^2 - 2(11)^3 - 6(9)(3(5)^2+2)^2(10)^2(11)(16) + 12(5)(9)(2(1))(10)(16)^2 + 2(9)^3(2(1))^3(10)^3 + 6(5)(3(5)^2+2)(10)(11)(16)^2 - 2(3(5)^2+2)^3(10)^3(11)^3 - 2(1)(16)^3 \bmod 17$$

$$q_{16} = 4 - 4 + 11 - 10 - 8 + 5 + 11 + 1 - 9 - 15 \bmod 17$$

$$q_{16} = 3$$

$$U_{16} = (2y_1)UU_8q_{16} \bmod p$$

$$U_{16} = (2)(11)(16)(3) \bmod 17$$

$$U_{16} = 2$$

Now we substitute the previously computed variables into the new x and y coordinate equations in order to find the fourth double, and we get

$$x_{16} = \frac{W_{16}^2 - 2W_8^2(2y_1)^2U^2q_{16}^2 + 4W^2(2y_1)^2U_8^2q_{16}^2 - 8(3x_1^2+a)^2U_8^2U^2q_{16}^2 + 16x_1U_{16}^2}{U_{16}^2} \bmod p$$

$$x_{16} = \frac{(12)^2 - 2(11)^2(2)^2(11)^2(3)^2 + 4(9)^2(2)^2(16)^2(3)^2 - 8(3(5)^2+2)^2(16)^2(11)^2(3)^2 + 16(5)(2)^2}{2^2} \bmod 17$$

$$x_{16} = (8 - 16 + 2 - 2 + 14)(4^{-1}) \bmod 17$$

$$x_{16} = 10$$

where the inverse of 4 is 13 and $N_{x16} = 6$.

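$$y_{16} = \frac{W_{16}N_{x8}(2y_1)^2U^2q_{16}^2 - W_{16}N_{x16} - N_{y8}(2y_1)^3U^3q_{16}^3}{U_{16}^3} \bmod p$$

$$y_{16} = \frac{12(13)(2)^2(11)^2(3)^2 - (12)(6) - 10(2)^3(11)^3(3)^3}{2^3} \bmod 17$$

$$y_{16} = \frac{12 - 4 - 5}{2^3} \bmod 17$$

$$y_{16} = 3(8^{-1}) \bmod 17$$

$$y_{16} = 11$$

where the inverse of 8 is 15 and $N_{y16} = 3$. The result $16P = (10, 11)$ satisfies Equation 3.10, since $11^2 \equiv 10^3 + 2(10) + 2 \equiv 2 \bmod 17$.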

Example 2: As specified in Example 2 of Section 3.1.1, the base point is $11P = (13,10)$. Applying our fast $2^4 P$ algorithm, we expect to obtain the point $16(11P) = 176P$, which, since $176 \bmod 19 = 5$, is the point $5P = (9,16)$. As illustrated in Example 1, we compute $W_{16}$, $q_{16}$ and $U_{16}$ using the previously computed values:

$N_x = 8$

$W = 16$

$q = 1$

$U = 3$

$N_{y8} = 4$

$N_{x8} = 0$

$W_8 = 8$

$q_8 = 5$

$U_8 = 11$

$$W_{16} = (2y_1)U\left(3(W_8^2 - 2W^2(2y_1)^2q_8^2 + 4U^2q_8^2(3x_1^2+a)^2 - 8x_1U_8^2)^2 + aU_8^4\right) \bmod p$$

$$W_{16} = (2(10))(3)\left(3(8^2 - 2(16)^2(2(10))^2(5)^2 + 4(3)^2(5)^2(3(13)^2+2)^2 - 8(13)(11)^2)^2 + 2(11)^4\right) \bmod 17$$

$$W_{16} = 9(3(13 - 8 + 16 - 4)^2 + 8) \bmod 17$$

$$W_{16} = 4$$

$$q_{16} = 6W_8W^2(2y_1)^2q_8^2 - 12(3x_1^2+a)^2W_8q_8^2U^2 + 24x_1W_8U_8^2 - 2W_8^3 - 6W(3x_1^2+a)^2q_8^2UU_8 + 12x_1W(2y_1)q_8U_8^2 + 2W^3(2y_1)^3q_8^3 + 6x_1(3x_1^2+a)q_8UU_8^2 - 2(3x_1^2+a)^3q_8^3U^3 - 2y_1U_8^3 \bmod p$$

$$q_{16} = 6(8)(16)^2(2(10))^2(5)^2 - 12(3(13)^2+2)^2(8)(5)^2(3)^2 + 24(13)(8)(11)^2 - 2(8)^3 - 6(16)(3(13)^2+2)^2(5)^2(3)(11) + 12(13)(16)(2(10))(5)(11)^2 + 2(16)^3(2(10))^3(5)^3 + 6(13)(3(13)^2+2)(5)(3)(11)^2 - 2(3(13)^2+2)^3(5)^3(3)^3 - 2(10)(11)^3 \bmod 17$$

$$q_{16} = 5 - 10 + 11 - 4 - 14 + 12 + 16 + 6 - 16 - 15 \bmod 17$$

$$q_{16} = 8$$

$$U_{16} = (2y_1)UU_8q_{16} \bmod p$$

$$U_{16} = (2(10))(3)(11)(8) \bmod 17$$

$$U_{16} = 10$$

Now we substitute the previously computed variables into the new x and y coordinate equations in order to find the fourth double, and we get

$$x_{16} = \frac{W_{16}^2 - 2W_8^2(2y_1)^2U^2q_{16}^2 + 4W^2(2y_1)^2U_8^2q_{16}^2 - 8(3x_1^2+a)^2U_8^2U^2q_{16}^2 + 16x_1U_{16}^2}{U_{16}^2} \bmod p$$

$$x_{16} = \frac{(4)^2 - 2(8)^2(2(10))^2(3)^2(8)^2 + 4(16)^2(2(10))^2(11)^2(8)^2 - 8(3(13)^2+2)^2(11)^2(3)^2(8)^2 + 16(13)(10)^2}{10^2} \bmod 17$$

$$x_{16} = (16 - 8 + 1 - 2 + 9)(15^{-1}) \bmod 17$$

$$x_{16} = 9$$

where the inverse of 15 is 8 and $N_{x16} = 16$.

$$y_{16} = \frac{W_{16}N_{x8}(2y_1)^2U^2q_{16}^2 - W_{16}N_{x16} - N_{y8}(2y_1)^3U^3q_{16}^3}{U_{16}^3} \bmod p$$

$$y_{16} = \frac{4(0)(2(10))^2(3)^2(8)^2 - (4)(16) - 4(2(10))^3(3)^3(8)^3}{10^3} \bmod 17$$

$$y_{16} = \frac{0 - 13 - 1}{14} \bmod 17$$

$$y_{16} = 16$$

where the inverse of 14 is 11 and $N_{y16} = 3$.

3.1.4 Fast 3P (Point Tripling)

Since computing a large-degree isogeny requires the power-of-two multiples $2^n Q$ of points $Q$, we enhance the algorithm by also finding intermediate multiples such as $3Q$, $5Q$, and $7Q$.

In [67], Subramanya Rao worked on Montgomery curves and found an efficient technique for point tripling. Here, we simply optimize the application of a single double to some point $Q$ followed by a point addition. This technique can be applied to all intermediate steps as follows.

We substitute the value of $\lambda$ from Equation 2.3 into Equations 2.4 and 2.5:

$$x_2 = \left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1 \bmod p$$

$$y_2 = \left(\frac{3x_1^2+a}{2y_1}\right)(x_1 - x_2) - y_1 \bmod p$$

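Then, we substitute the values of $x_2$ and $y_2$, the coordinates of $2Q$, into the slope of the point addition,

$$\lambda_3' = \frac{y_2' - y_1'}{x_2' - x_1'} \bmod p$$

getting

$$\lambda_3' = \frac{\left(\left(\frac{3x_1^2+a}{2y_1}\right)(x_1 - x_2) - y_1\right) - y_1}{\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right) - x_1} \bmod p$$

$$\lambda_3' = \frac{\left(\frac{3x_1^2+a}{2y_1}\right)\left(x_1 - \left(\frac{3x_1^2+a}{2y_1}\right)^2 + 2x_1\right) - 2y_1}{\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1\right) - x_1} \bmod p$$

$$\lambda_3' = \frac{\left(\frac{3x_1^2+a}{2y_1}\right)\left(3x_1 - \left(\frac{3x_1^2+a}{2y_1}\right)^2\right) - 2y_1}{\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 3x_1} \bmod p$$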

Multiplying $\lambda_3'$ by $\frac{(2y_1)^3}{(2y_1)^3}$ eliminates all inverses and allows $2y_1$ to be taken out as a common factor of the denominator, which reduces the $x_2$ fraction, that is, the x-coordinate of the point doubling.

$$\lambda_3' = \frac{(3x_1^2+a)(3x_1(2y_1)^2 - (3x_1^2+a)^2) - (2y_1)^4}{(2y_1)((3x_1^2+a)^2 - 3x_1(2y_1)^2)} \bmod p \tag{3.21}$$

Rewriting,

$$\lambda_3' = \frac{W_3}{U_3} \bmod p \tag{3.22}$$

where $W_3$ is the numerator of Equation 3.21, $q_3 = (3x_1^2+a)^2 - 3x_1(2y_1)^2$, and

$$U_3 = (2y_1)\,q_3 \bmod p \tag{3.23}$$

Now we substitute the value of the new slope from Equation 3.22 to compute $3Q$.

$$x_3 = \lambda_3'^2 - x_1 - x_2 \bmod p$$

$$x_3 = \left(\frac{W_3}{U_3}\right)^2 - x_1 - \left(\frac{3x_1^2+a}{2y_1}\right)^2 + 2x_1 \bmod p$$

Multiplying the equation by $U_3^2$,

$$U_3^2 x_3 = W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2 \bmod p$$

$$x_3 = \frac{W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2}{U_3^2} \bmod p \tag{3.24}$$

Considering

$$x_3 = \frac{N_{x3}}{U_3^2} \bmod p \tag{3.25}$$

Now we find $y_3$:

$$y_3 = \lambda_3'(x_1 - x_3) - y_1 \bmod p$$

$$y_3 = \frac{W_3}{U_3}\left(x_1 - \frac{N_{x3}}{U_3^2}\right) - y_1 \bmod p$$

Multiplying by $U_3^3$ to eliminate the inverses,

$$U_3^3 y_3 = W_3(x_1U_3^2 - N_{x3}) - y_1U_3^3 \bmod p$$

$$y_3 = \frac{W_3(x_1U_3^2 - N_{x3}) - y_1U_3^3}{U_3^3} \bmod p \tag{3.26}$$
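The tripling formulas translate directly into code. The following Python sketch of Equations 3.21 to 3.26 is illustrative only: the name `fast_3p` is hypothetical, Python 3.8 or later is assumed for `pow(x, -1, p)`, and $U_3$ is assumed to be invertible (which fails, for example, when $y_1 = 0$ or $3P$ is the identity).

```python
def fast_3p(x1, y1, a, p):
    """Point tripling 3P with a single inversion (Equations 3.21-3.26)."""
    A = (3 * x1 * x1 + a) % p                    # numerator of the doubling slope
    B = (2 * y1) % p                             # denominator of the doubling slope
    q3 = (A * A - 3 * x1 * B * B) % p
    W3 = (A * (3 * x1 * B * B - A * A) - B ** 4) % p   # numerator of Eq. 3.21
    U3 = B * q3 % p                              # Equation 3.23
    Nx3 = (W3 * W3 + x1 * U3 * U3 - q3 * q3 * A * A) % p
    Ny3 = (W3 * (x1 * U3 * U3 - Nx3) - y1 * U3 ** 3) % p
    inv = pow(U3, -1, p)                         # the only modular inversion
    return Nx3 * inv * inv % p, Ny3 * inv ** 3 % p


if __name__ == "__main__":
    a, p = 2, 17                                 # curve of Equation 3.10
    print(fast_3p(5, 1, a, p), fast_3p(13, 10, a, p))
```

A textbook double followed by an addition would need two inversions for the same result.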

Numerical Examples

Following the previous series of numerical examples for the $2^n$ algorithms, we continue with examples for some of the intermediate-step algorithms, starting with the point $3P$.

Example 1: Let $P = (5,1)$ and apply our $3P$ algorithm to compute the new x and y coordinates.

$$W_3 = (3x_1^2+a)(3x_1(2y_1)^2 - (3x_1^2+a)^2) - (2y_1)^4 \bmod p$$

$$W_3 = (3(5)^2+2)(3(5)(2)^2 - (3(5)^2+2)^2) - (2)^4 \bmod 17$$

$$W_3 = 9(9 - 13) - 16 \bmod 17$$

$$W_3 = 16$$

$$q_3 = (3x_1^2+a)^2 - 3x_1(2y_1)^2 \bmod p$$

$$q_3 = (3(5)^2+2)^2 - 3(5)(2)^2 \bmod 17$$

$$q_3 = 4$$

$$U_3 = (2y_1)\,q_3 \bmod p$$

$$U_3 = (2)(4) \bmod 17$$

$$U_3 = 8$$

Now we substitute these values into the x and y coordinate equations of the point $3P$, and we get

$$x_3 = \frac{W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2}{U_3^2} \bmod p$$

$$x_3 = \frac{16^2 + 5(8)^2 - 4^2(3(5)^2+2)^2}{8^2} \bmod 17$$

$$x_3 = \frac{1 + 14 - 4}{13} \bmod 17$$

$$x_3 = 10$$

where the inverse of 13 is 4 and $N_{x3} = 11$.

$$y_3 = \frac{W_3(x_1U_3^2 - N_{x3}) - y_1U_3^3}{U_3^3} \bmod p$$

$$y_3 = \frac{16(5(8)^2 - 11) - 8^3}{8^3} \bmod 17$$

$$y_3 = \frac{12}{2} \bmod 17$$

$$y_3 = 6$$

where the inverse of 2 is 9 and $N_{y3} = 12$.

Example 2: Let the base point be $11P = (13,10)$ and apply our $3P$ algorithm to compute the new x and y coordinates. The expected result is $3(11P) = 33P$, which, since $33 \bmod 19 = 14$, is the point $14P = (9,1)$.

$$W_3 = (3x_1^2+a)(3x_1(2y_1)^2 - (3x_1^2+a)^2) - (2y_1)^4 \bmod p$$

$$W_3 = (3(13)^2+2)(3(13)(2(10))^2 - (3(13)^2+2)^2) - (2(10))^4 \bmod 17$$

$$W_3 = 16(11 - 1) - 13 \bmod 17$$

$$W_3 = 11$$

$$q_3 = (3x_1^2+a)^2 - 3x_1(2y_1)^2 \bmod p$$

$$q_3 = (3(13)^2+2)^2 - 3(13)(2(10))^2 \bmod 17$$

$$q_3 = 7$$

$$U_3 = (2y_1)\,q_3 \bmod p$$

$$U_3 = (2(10))(7) \bmod 17$$

$$U_3 = 4$$

Now we substitute these values into the x and y coordinate equations of the point $3P$, and we get

$$x_3 = \frac{W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2}{U_3^2} \bmod p$$

$$x_3 = \frac{11^2 + 13(4)^2 - 7^2(3(13)^2+2)^2}{4^2} \bmod 17$$

$$x_3 = \frac{2 + 4 - 15}{16} \bmod 17$$

$$x_3 = 9$$

where the inverse of 16 is 16 and $N_{x3} = 8$.

$$y_3 = \frac{W_3(x_1U_3^2 - N_{x3}) - y_1U_3^3}{U_3^3} \bmod p$$

$$y_3 = \frac{11(13(4)^2 - 8) - 10(4)^3}{4^3} \bmod 17$$

$$y_3 = \frac{13}{13} \bmod 17$$

$$y_3 = 1$$

where the inverse of 13 is 4 and $N_{y3} = 13$.

3.1.5 Fast $2^n Q + P$

As we have mentioned earlier, the complexity of the SIDH cryptosystem relies on computing isogenies between points on the elliptic curve. Thus, we have performed a further optimization for the kernel equation $P + [k]Q$. Although we have succeeded in computing a higher-order double of a point on a curve with a single inverse, an extra inverse would still be needed for the subsequent differential point addition.

Therefore, in this section we introduce an optimization that combines our higher-order doubling equations with the addition and performs both with a single inverse.

In the point tripling section we computed $3P$ as an intermediate step. Here, we provide general equations that can be applied to any of our equations for $4Q$, $8Q$, and $16Q$ and their extensions.

The following equations contain variables such as $N_x$, $N_y$, and $U$ that have to be replaced with the variables of the corresponding double. We present here the equation for $P + 4Q$.

We substitute the x and y coordinates of the second double of the point $Q$, from Equations 3.8 and 3.9 respectively, into the addition slope equation 2.2.

$$\lambda_5 = \frac{\frac{N_y}{U^3} - y_1}{\frac{N_x}{U^2} - x_1} \bmod p$$

Multiplying the numerator and the denominator by $U^3$ to eliminate the inverses,

$$\lambda_5 = \frac{N_y - y_1U^3}{N_xU - x_1U^3} \bmod p \tag{3.27}$$

We substitute $\lambda_5$ into the equations for $x_5$ and $y_5$:

$$x_5 = \left(\frac{N_y - y_1U^3}{N_xU - x_1U^3}\right)^2 - x_1 - \frac{N_x}{U^2} \bmod p$$

Taking out the common factor $U$ from the denominator of $\lambda_5$ and multiplying the equation by $U^2(N_x - x_1U^2)^2$,

$$U^2(N_x - x_1U^2)^2 x_5 = (N_y - y_1U^3)^2 - x_1U^2(N_x - x_1U^2)^2 - N_x(N_x - x_1U^2)^2 \bmod p$$

$$x_5 = \frac{(N_y - y_1U^3)^2 - x_1U^2(N_x - x_1U^2)^2 - N_x(N_x - x_1U^2)^2}{U^2(N_x - x_1U^2)^2} \bmod p \tag{3.28}$$

Now we find y3,

3 3 Ny − y1U Ny − y1U y5 = 2 x1 − 2 x3 − y1 mod p U(Nx − x1U ) U(Nx − x1U )

2 2 2 Multiplying with U (Nx − x1U ) one eliminates all inverses and adjusts the y3

81 denominator to be matched to x5 for simplification,

2 2 2 2 3 3 2 U (Nx − x1U ) y3 = U(Nx − x1U )((Ny − y1U )x1 − (Ny − y1U )x5 − U(Nx − x1U )y1)

2 3 3 2 U(Nx − x1U )((Ny − y1U )x1 − (Ny − y1U )x5 − U(Nx − x1U )y1) y5 = 2 2 2 mod p U (Nx − x1U ) (3.29)

Numerical Examples

Example 1: Let P = (5,1), then we apply our $2^nP + P$ algorithm to compute the new x and y coordinates. In this example we will apply $2^2P + P$ in order to find the point 5P. We consider the values previously computed in Example 1 in the numerical example of Section 3.1.1 where,

Nx = 6

Ny = 5

U = 11

We substitute these values in the Equations 3.28 and 3.29 then we get,

$$x_5 = \frac{(N_y - y_1U^3)^2 - x_1U^2(N_x - x_1U^2)^2 - N_x(N_x - x_1U^2)^2}{U^2(N_x - x_1U^2)^2} \bmod p$$

$$x_5 = \frac{(5 - (1)11^3)^2 - 5(11)^2(6 - 5(11)^2)^2 - 6(6 - 5(11)^2)^2}{11^2(6 - 5(11)^2)^2} \bmod 17$$

$$x_5 = \frac{0 - 10(16) - 6(16)}{2(16)} \bmod 17$$

$$x_5 = \frac{16}{15} \bmod 17$$

$$x_5 = 9$$

Where the inverse of 15 is 8 and $N_{x_5} = 16$.

$$y_5 = \frac{U(N_x - x_1U^2)\left((N_y - y_1U^3)x_1 - (N_y - y_1U^3)x_5 - U(N_x - x_1U^2)y_1\right)}{U^2(N_x - x_1U^2)^2} \bmod p$$

$$y_5 = \frac{11(6 - 5(11)^2)\left((5 - 1(11)^3)(5) - (5 - 1(11)^3)(9) - 11(6 - 5(11)^2)(1)\right)}{11^2(6 - 5(11)^2)^2} \bmod 17$$

$$y_5 = \frac{11(13)\left(0(5) - 0(9) - 11(13)(1)\right)}{15} \bmod 17$$

$$y_5 = \frac{2}{15} \bmod 17$$

$$y_5 = 16$$

Where $N_{y_5} = 2$.

Example 2: Let P = (5,1), then we apply our $2^nP + P$ algorithm to compute the new x and y coordinates. In this example we will apply $2^3P + P$ in order to find the point 9P. We consider the values previously computed in Example 1 in the numerical example of Section 3.1.2 where,

Nx8 = 13

Ny8 = 10

U8 = 16

We substitute these values in the Equations 3.28 and 3.29 then we get,

$$x_9 = \frac{(N_{y_8} - y_1U_8^3)^2 - x_1U_8^2(N_{x_8} - x_1U_8^2)^2 - N_{x_8}(N_{x_8} - x_1U_8^2)^2}{U_8^2(N_{x_8} - x_1U_8^2)^2} \bmod p$$

$$x_9 = \frac{(10 - (1)16^3)^2 - 5(16)^2(13 - 5(16)^2)^2 - 13(13 - 5(16)^2)^2}{16^2(13 - 5(16)^2)^2} \bmod 17$$

$$x_9 = \frac{2 - 5(13) - 13(13)}{1(13)} \bmod 17$$

$$x_9 = \frac{6}{13} \bmod 17$$

$$x_9 = 7$$

Where the inverse of 13 is 4 and $N_{x_9} = 6$.

$$y_9 = \frac{U_8(N_{x_8} - x_1U_8^2)\left((N_{y_8} - y_1U_8^3)x_1 - (N_{y_8} - y_1U_8^3)x_9 - U_8(N_{x_8} - x_1U_8^2)y_1\right)}{U_8^2(N_{x_8} - x_1U_8^2)^2} \bmod p$$

$$y_9 = \frac{16(13 - 5(16)^2)\left((10 - 1(16)^3)(5) - (10 - 1(16)^3)(7) - 16(13 - 5(16)^2)(1)\right)}{16^2(13 - 5(16)^2)^2} \bmod 17$$

$$y_9 = \frac{16(8)\left(11(5) - 11(7) - 16(8)(1)\right)}{13} \bmod 17$$

$$y_9 = \frac{10}{13} \bmod 17$$

$$y_9 = 6$$

Where $N_{y_9} = 10$.
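The two examples above can be reproduced with a few lines of code. The following is a minimal Python sketch of Equations 3.28 and 3.29, assuming the toy field p = 17 used in the numerical examples; the function name `add_scaled` and its argument order are illustrative choices and are not taken from the original implementation.

```python
P_MOD = 17  # toy prime field of the numerical examples (assumption for this sketch)

def add_scaled(nx, ny, u, x1, y1, p=P_MOD):
    """Add the affine point (x1, y1) to the point (nx/u^2, ny/u^3).

    Follows Equations 3.28 and 3.29; only one modular inversion is used.
    """
    d = (nx - x1 * u * u) % p        # Nx - x1*U^2
    t = (ny - y1 * u ** 3) % p       # Ny - y1*U^3
    den = (u * u * d * d) % p        # shared denominator U^2 (Nx - x1*U^2)^2
    inv = pow(den, -1, p)            # the single inversion (Python 3.8+)
    x = ((t * t - x1 * u * u * d * d - nx * d * d) * inv) % p
    y = (u * d * (t * x1 - t * x - u * d * y1) * inv) % p
    return x, y

print(add_scaled(6, 5, 11, 5, 1))    # 4P + P -> (9, 16)
print(add_scaled(13, 10, 16, 5, 1))  # 8P + P -> (7, 6)
```

The sketch does not handle the degenerate case where the shared denominator is zero (the two points have equal x coordinates).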

3.1.6 Fast $2^nP + 2Q$

In Section 3.1.4 we illustrated the importance of computing the intermediate equations for the overall speed-up of our algorithms. The $2^nP + 2Q$ form, where 1 < n < 5, is efficient for computing the points 6Q, 10Q, and 18Q. Since subtraction on an elliptic curve can be represented by adding the inverse of a point, 14Q can be computed efficiently with this algorithm as well. In this section, we exemplify the form $2^2P + 2Q = 6Q$, where P = Q.

We substitute the x and y coordinates of the 4Q algorithm from Equations 3.8 and 3.9 respectively, together with the 2Q coordinates from Equations 2.4 and 2.5, into the addition slope Equation 2.2.

$$\lambda_6 = \frac{\frac{N_{y_n}}{U^3} - \left(\frac{3x_1^2+a}{2y_1}\right)\left(x_1 - \left(\frac{3x_1^2+a}{2y_1}\right)^2 + 2x_1\right) + y_1}{\frac{N_{x_n}}{U^2} - \left(\frac{3x_1^2+a}{2y_1}\right)^2 + 2x_1} \bmod p$$

Multiplying with $U^3$ to eliminate the inverses, considering the value of $U$ in Equation 3.3,

$$\lambda_6 = \frac{N_{y_n} - q^3(3x_1^2+a)\left(x_1(2y_1)^2 - (3x_1^2+a)^2 + 2x_1(2y_1)^2\right) + y_1U^3}{N_{x_n}U - 2y_1q^3(3x_1^2+a)^2 + 2x_1(2y_1)(2y_1)^2q^3} \bmod p$$

Since $U = 2y_1q$, we take a common factor $U$ out of the denominator and then multiply both the numerator and the denominator with $2y_1$, in order to be able to eliminate all inverses in the coordinate equations,

$$\lambda_6 = \frac{2y_1\left(N_y - q^3(3x_1^2+a)\left(x_1(2y_1)^2 - (3x_1^2+a)^2 + 2x_1(2y_1)^2\right) + y_1U^3\right)}{2y_1U\left(N_x - q^2(3x_1^2+a)^2 + 2x_1U^2\right)} \bmod p$$

Considering,

$$\lambda_6 = \frac{W_6}{U_6} \bmod p \tag{3.30}$$

Where,

$$U_6 = (2y_1)Uq_6 \bmod p \tag{3.31}$$

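Now we substitute the value of the new slope into the x and y coordinate equations to compute 6Q (here $x_3$ and $y_3$ denote the coordinates of 2Q given by Equations 2.4 and 2.5),

$$x_6 = \lambda_6^2 - x_3 - x_4 \bmod p$$

$$x_6 = \left(\frac{W_6}{U_6}\right)^2 - \left(\frac{3x_1^2+a}{2y_1}\right)^2 + 2x_1 - \frac{N_{x_n}}{U^2} \bmod p$$

Multiplying the equation with $U_6^2$,

$$U_6^2x_6 = W_6^2 - U^2q_6^2(3x_1^2+a)^2 + 2x_1U_6^2 - (2y_1)^2q_6^2N_{x_n} \bmod p$$

$$x_6 = \frac{W_6^2 - U^2q_6^2(3x_1^2+a)^2 + 2x_1U_6^2 - (2y_1)^2q_6^2N_{x_n}}{U_6^2} \bmod p \tag{3.32}$$

Now we find $y_6$,

$$y_6 = \lambda_6(x_3 - x_6) - y_3 \bmod p$$

Substituting the values of $y_3$, $x_3$ and $x_6$ by considering,

$$x_6 = \frac{N_{x_6}}{U_6^2}$$

$$y_6 = \frac{W_6}{U_6}\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1 - \frac{N_{x_6}}{U_6^2}\right) - \left(\frac{3x_1^2+a}{2y_1}\right)\left(x_1 - \left(\frac{3x_1^2+a}{2y_1}\right)^2 + 2x_1\right) + y_1 \bmod p$$

$$y_6 = \frac{W_6}{U_6}\left(\left(\frac{3x_1^2+a}{2y_1}\right)^2 - 2x_1 - \frac{N_{x_6}}{U_6^2}\right) - \left(\frac{3x_1^2+a}{2y_1}\right)\left(3x_1 - \left(\frac{3x_1^2+a}{2y_1}\right)^2\right) + y_1 \bmod p$$

Multiplying the equation with $U_6^3$ to eliminate the inverses,

$$U_6^3y_6 = W_6\left(U^2q_6^2(3x_1^2+a)^2 - 2x_1U_6^2 - N_{x_6}\right) - Uq_6(3x_1^2+a)\left(3x_1U_6^2 - U^2q_6^2(3x_1^2+a)^2\right) + y_1U_6^3 \bmod p$$

$$y_6 = \frac{N_{y_6}}{U_6^3} \bmod p \tag{3.33}$$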

Note: In case of n = 3 or 4, $N_{x_n}$, $N_{y_n}$ and $U_n$ are replaced with the values related to each algorithm. The variable $q$ is replaced with $Uq_8$ and $UU_8q_{16}$ when computing 8P + 2Q and 16P + 2Q respectively.

Numerical Example Let P = (5,1), then we apply our $2^nP + 2P$ algorithm to compute the new x and y coordinates. In this example we will apply $2^2P + 2P$ in order to find the point 6P. We consider the values previously computed in Example 1 in the numerical example of Section 3.1.1 where,

Nx = 6

Ny = 5

q = 14

U = 11

First, we find $W_6$, $q_6$ and $U_6$,

$$W_6 = 2y_1\left(N_y - q^3(3x_1^2+a)\left(x_1(2y_1)^2 - (3x_1^2+a)^2 + 2x_1(2y_1)^2\right) + y_1U^3\right) \bmod p$$

$$W_6 = 2(1)\left(5 - (14)^3(3(5)^2+2)\left(5(2(1))^2 - (3(5)^2+2)^2 + 2(5)(2(1))^2\right) + 1(11)^3\right) \bmod 17$$

$$W_6 = 2(5 - 12(3 - 13 + 6) + 5) \bmod 17$$

$$W_6 = 14$$

$$q_6 = N_x - q^2(3x_1^2+a)^2 + 2x_1U^2 \bmod p$$

$$q_6 = 6 - 14^2(3(5)^2+2)^2 + 2(5)(11)^2 \bmod 17$$

$$q_6 = 6 - 15 + 3 \bmod 17$$

$$q_6 = 11$$

$$U_6 = (2y_1)Uq_6 \bmod p$$

$$U_6 = (2(1))11(11) \bmod 17$$

$$U_6 = 4$$

Now, we substitute these values in the x and y coordinate equations and get,

$$x_6 = \frac{W_6^2 - U^2q_6^2(3x_1^2+a)^2 + 2x_1U_6^2 - (2y_1)^2q_6^2N_x}{U_6^2} \bmod p$$

$$x_6 = \frac{14^2 - \left(11^2(11)^2(3(5)^2+2)^2 - 2(5)4^2\right) - (2(1))^2(11)^2(6)}{4^2} \bmod 17$$

$$x_6 = \frac{9 - 11 - 14}{16} \bmod 17$$

$$x_6 = \frac{1}{16} \bmod 17$$

$$x_6 = 16$$

Where the inverse of 16 is 16 and $N_{x_6} = 1$.

$$U_6^3y_6 = W_6\left(U^2q_6^2(3x_1^2+a)^2 - 2x_1U_6^2 - N_{x_6}\right) - Uq_6(3x_1^2+a)\left(3x_1U_6^2 - U^2q_6^2(3x_1^2+a)^2\right) + y_1U_6^3 \bmod p$$

$$4^3y_6 = 14\left(11^2 \cdot 11^2(3(5)^2+2)^2 - 2(5)4^2 - 1\right) - 11(11)(3(5)^2+2)\left(3(5)4^2 - 11^2 \cdot 11^2(3(5)^2+2)^2\right) + (1)4^3 \bmod 17$$

$$13y_6 = 14(1 - 7 - 1) - 1(2 - 1) + 13 \bmod 17$$

$$13y_6 = 16 \bmod 17$$

$$y_6 = 13$$

Where the inverse of 13 is 4 and $N_{y_6} = 16$.
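The numerical example can be replayed in code. The following is a minimal Python sketch of Equations 3.30 to 3.33 for the case n = 2, assuming the toy parameters a = 2 and p = 17 of the example; the function name `six_q` is hypothetical, and the expressions for $W_6$ and $q_6$ are the ones used in the numerical example above.

```python
def six_q(nx, ny, q, u, x1, y1, a=2, p=17):
    """Compute 6Q = 4Q + 2Q from the 4Q values (Nx, Ny, q, U), one inversion."""
    m = (3 * x1 * x1 + a) % p   # 3*x1^2 + a
    s = (2 * y1) % p            # 2*y1
    # W6 and q6 as in the numerical example
    w6 = s * (ny - q**3 * m * (x1 * s * s - m * m + 2 * x1 * s * s) + y1 * u**3) % p
    q6 = (nx - q * q * m * m + 2 * x1 * u * u) % p
    u6 = (s * u * q6) % p       # Eq. 3.31
    # Numerators of Eq. 3.32 and Eq. 3.33
    nx6 = (w6 * w6 - u * u * q6 * q6 * m * m + 2 * x1 * u6 * u6
           - s * s * q6 * q6 * nx) % p
    ny6 = (w6 * (u * u * q6 * q6 * m * m - 2 * x1 * u6 * u6 - nx6)
           - u * q6 * m * (3 * x1 * u6 * u6 - u * u * q6 * q6 * m * m)
           + y1 * u6**3) % p
    inv = pow(u6, -1, p)        # the single inversion
    return nx6 * inv**2 % p, ny6 * inv**3 % p

print(six_q(6, 5, 14, 11, 5, 1))  # -> (16, 13), i.e. 6P
```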

3.1.7 $2^nP + mQ$

We have succeeded in representing this algorithm, for some integers n and m, in one implementation with the replaceable variables $N_{y_n}$, $N_{x_n}$, and $U_n$ that depend on the value of n, where (1 < n < 5) and (2 < m < 17). As described in the previous sections, each algorithm computes these variables differently. Thus, generalizing the equations benefits hardware optimization and flexibility, and allows parallelization to be applied intensively. In this section, we illustrate the case $2^3P + 3Q = 11Q$, where P = Q.

We substitute the x and y coordinates produced by the 8P and 3P algorithms for the point P, from Sections 3.1.2 and 3.1.4 respectively, into the addition slope Equation 2.2.

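$$\lambda_{11} = \frac{\frac{N_{y_8}}{U_8^3} - \frac{N_{y_3}}{U_3^3}}{\frac{N_{x_8}}{U_8^2} - \frac{N_{x_3}}{U_3^2}} \bmod p$$

Multiplying the numerator and denominator with $U_8^3U_3^3$ to eliminate the inverses,

$$\lambda_{11} = \frac{N_{y_8}U_3^3 - N_{y_3}U_8^3}{N_{x_8}U_3^3U_8 - N_{x_3}U_8^3U_3} \bmod p$$

Take out a common factor $U_8U_3$ of the denominator,

$$\lambda_{11} = \frac{N_{y_8}U_3^3 - N_{y_3}U_8^3}{U_3U_8(N_{x_8}U_3^2 - N_{x_3}U_8^2)} \bmod p$$

Considering,

$$\lambda_{11} = \frac{W_{11}}{U_{11}} \bmod p \tag{3.34}$$

where,

$$U_{11} = U_3U_8q_{11} \bmod p \tag{3.35}$$

Now we substitute the value of the new slope into the x and y coordinate equations to compute 11Q,

$$x_{11} = \lambda_{11}^2 - x_3 - x_8 \bmod p$$

$$x_{11} = \left(\frac{W_{11}}{U_{11}}\right)^2 - \frac{N_{x_3}}{U_3^2} - \frac{N_{x_8}}{U_8^2} \bmod p$$

Multiplying the equation with $U_{11}^2$,

$$U_{11}^2x_{11} = W_{11}^2 - N_{x_3}U_8^2q_{11}^2 - N_{x_8}U_3^2q_{11}^2 \bmod p$$

$$x_{11} = \frac{W_{11}^2 - N_{x_3}U_8^2q_{11}^2 - N_{x_8}U_3^2q_{11}^2}{U_{11}^2} \bmod p \tag{3.36}$$

where,

$$x_{11} = \frac{N_{x_{11}}}{U_{11}^2} \bmod p \tag{3.37}$$

Now we find $y_{11}$,

$$y_{11} = \lambda_{11}(x_3 - x_{11}) - y_3 \bmod p$$

Substitute the values of $y_3$, $x_3$ and $x_{11}$,

$$y_{11} = \frac{W_{11}}{U_{11}}\left(\frac{N_{x_3}}{U_3^2} - \frac{N_{x_{11}}}{U_{11}^2}\right) - \frac{N_{y_3}}{U_3^3} \bmod p$$

Multiplying the equation with $U_{11}^3$,

$$U_{11}^3y_{11} = W_{11}(N_{x_3}U_8^2q_{11}^2 - N_{x_{11}}) - N_{y_3}U_8^3q_{11}^3 \bmod p$$

$$y_{11} = \frac{W_{11}(N_{x_3}U_8^2q_{11}^2 - N_{x_{11}}) - N_{y_3}U_8^3q_{11}^3}{U_{11}^3} \bmod p \tag{3.38}$$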

Numerical Example Let P = (5,1), then we apply our $2^nP + mP$ algorithm to compute the new x and y coordinates. In this example we will apply $2^3P + 3P$ in order to find the point 11P. We consider the values previously computed in Example 1 in the numerical examples of Section 3.1.2 and Section 3.1.4 where,

Nx3 = 11

Ny3 = 12

U3 = 8

Nx8 = 13

Ny8 = 10

U8 = 16

First, we find $W_{11}$, $q_{11}$ and $U_{11}$,

$$W_{11} = N_{y_8}U_3^3 - N_{y_3}U_8^3 \bmod p$$

$$W_{11} = 10(8)^3 - 12(16)^3 \bmod 17$$

$$W_{11} = 15$$

$$q_{11} = N_{x_8}U_3^2 - N_{x_3}U_8^2 \bmod p$$

$$q_{11} = 13(8)^2 - 11(16)^2 \bmod 17$$

$$q_{11} = 5$$

$$U_{11} = U_3U_8q_{11} \bmod p$$

$$U_{11} = 8(16)(5) \bmod 17$$

$$U_{11} = 11$$

Now, we substitute these values in the x and y coordinate equations and get,

$$x_{11} = \frac{W_{11}^2 - N_{x_3}U_8^2q_{11}^2 - N_{x_8}U_3^2q_{11}^2}{U_{11}^2} \bmod p$$

$$x_{11} = \frac{15^2 - 11(16)^2(5)^2 - 13(8)^2(5)^2}{11^2} \bmod 17$$

$$x_{11} = \frac{4 - 3 - 9}{2} \bmod 17$$

$$x_{11} = 13$$

Where the inverse of 2 is 9 and $N_{x_{11}} = 9$.

$$y_{11} = \frac{W_{11}(N_{x_3}U_8^2q_{11}^2 - N_{x_{11}}) - N_{y_3}U_8^3q_{11}^3}{U_{11}^3} \bmod p$$

$$y_{11} = \frac{15(11(16)^2(5)^2 - 9) - 12(16)^3(5)^3}{11^3} \bmod 17$$

$$y_{11} = \frac{12 - 13}{5} \bmod 17$$

$$y_{11} = 10$$

Where the inverse of 5 is 7 and $N_{y_{11}} = 16$.

3.1.8 Generalizing $2^nP + Q$ and $2^nP + mQ$ Forms

We notice that by reforming the equations in Section 3.1.5, we can generalize the two forms $2^nP + Q$ and $2^nP + mQ$.

As done in Section 3.1.5, we substitute the x and y coordinates of the second double of the point Q, given in Equations 3.8 and 3.9 respectively, into the addition slope Equation 2.2.

$$\lambda_g = \frac{\frac{N_y}{U^3} - y_1}{\frac{N_x}{U^2} - x_1} \bmod p$$

Multiplying with $U^3$ to eliminate the inverses yields

$$\lambda_g = \frac{N_y - y_1U^3}{N_xU - x_1U^3} \bmod p \tag{3.39}$$

Considering,

$$\lambda_g = \frac{W_g}{U_g} \bmod p \tag{3.40}$$

Substituting $\lambda_g$ in the equations for $x_g$ and $y_g$,

$$x_g = \left(\frac{W_g}{U_g}\right)^2 - x_1 - \frac{N_x}{U^2} \bmod p$$

Multiplying this equation with $U_g^2$ where,

$$U_g = Uq_g \bmod p \tag{3.41}$$

$$U_g^2x_g = W_g^2 - x_1U_g^2 - N_xq_g^2 \bmod p$$

$$x_g = \frac{W_g^2 - x_1U_g^2 - N_xq_g^2}{U_g^2} \bmod p \tag{3.42}$$

Considering,

$$x_g = \frac{N_{x_g}}{U_g^2} \bmod p \tag{3.43}$$

Now we find $y_g$,

$$y_g = \frac{W_g}{U_g}\left(x_1 - \frac{N_{x_g}}{U_g^2}\right) - y_1 \bmod p$$

Multiplying the equation with $U_g^3$ in order to eliminate all inverses,

$$U_g^3y_g = W_g(x_1U_g^2 - N_{x_g}) - y_1U_g^3 \bmod p$$

$$y_g = \frac{W_g(x_1U_g^2 - N_{x_g}) - y_1U_g^3}{U_g^3} \bmod p \tag{3.44}$$

The previous equations contain the variables $N_{x_g}$, $N_{y_g}$, $U_g$ and $W_g$, which have to be replaced with the variables related to the corresponding double. Additionally, as we have seen in Equations 3.28 and 3.36 for the x coordinate, the terms $x_1U_g^2$ and $N_xq_g^2$ have to be replaced with $N_{x_i}U_j^2q_{i+j}^2$ and $N_{x_j}U_i^2q_{i+j}^2$ respectively when the $2^nP + mQ$ form is used. Likewise, in the equation for the y coordinate, we replace the terms $x_1U_g^2$ and $y_1U_g^3$ with $N_{x_i}U_j^2q_{i+j}^2$ and $N_{y_i}U_j^3q_{i+j}^3$ respectively. In this way, we have succeeded in generalizing these two forms by replacing some variables and terms, leading to a potential optimization in terms of hardware.
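One way to see the generalization is that an affine point $(x_1, y_1)$ behaves like a scaled point with numerators $x_1$, $y_1$ and $U = 1$. The following is a minimal Python sketch under that reading, assuming p = 17 as in the numerical examples; the function name `add_two_scaled` and the index names `i`, `j` are illustrative and not part of the original text.

```python
def add_two_scaled(nx_i, ny_i, u_i, nx_j, ny_j, u_j, p=17):
    """Add (nx_i/u_i^2, ny_i/u_i^3) and (nx_j/u_j^2, ny_j/u_j^3).

    Mirrors Equations 3.34-3.38 (i = 3, j = 8); with (nx_i, ny_i, u_i) =
    (x1, y1, 1) it reduces to Equations 3.40-3.44.
    """
    w = (ny_j * u_i**3 - ny_i * u_j**3) % p       # slope numerator W
    q = (nx_j * u_i**2 - nx_i * u_j**2) % p       # q_{i+j}
    u = (u_i * u_j * q) % p                       # slope denominator U
    nx = (w * w - nx_i * u_j**2 * q * q - nx_j * u_i**2 * q * q) % p
    ny = (w * (nx_i * u_j**2 * q * q - nx) - ny_i * u_j**3 * q**3) % p
    inv = pow(u, -1, p)                           # the single inversion
    return nx * inv**2 % p, ny * inv**3 % p

print(add_two_scaled(11, 12, 8, 13, 10, 16))  # 3P + 8P -> (13, 10)
print(add_two_scaled(5, 1, 1, 6, 5, 11))      # P + 4P  -> (9, 16)
```

The first call reproduces the 11P example of Section 3.1.7 and the second reproduces the 5P example of Section 3.1.5.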

3.1.9 Another Implementation of 6Q

In Section 3.1.6, we illustrated that 6Q can be computed through the form $2^2P + 2Q$. In this section, we provide a different algorithm for computing 6Q by doubling the point 3Q, which was discussed in Section 3.1.4. Diversity in implementations gives us a greater chance of comparison in terms of hardware versus speed.

Since this is a doubling operation, we substitute the x and y coordinates computed in Section 3.1.4 into the slope Equation 2.3,

$$\lambda_6 = \frac{3x_3^2 + a}{2y_3} \bmod p$$

$$\lambda_6 = \frac{3\left(\frac{W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2}{U_3^2}\right)^2 + a}{2\left(\frac{U_3W_3(x_1 - x_3) - y_1U_3^2}{U_3^2}\right)} \bmod p$$

Substituting the $x_3$ value in $y_3$,

$$\lambda_6 = \frac{3\left(\frac{W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2}{U_3^2}\right)^2 + a}{2\left(\frac{U_3W_3x_1 - W_3\left(\frac{W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2}{U_3}\right) - y_1U_3^2}{U_3^2}\right)} \bmod p$$

Multiplying with $\frac{U_3^4}{U_3^4}$,

$$\lambda_6 = \frac{3\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right)^2 + aU_3^4}{2U_3^3W_3x_1 - 2U_3W_3\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right) - 2y_1U_3^4} \bmod p$$

Take out a common factor $U_3$ of the denominator in order to eliminate the x and y inverses in the new coordinate equations,

$$\lambda_6 = \frac{3\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right)^2 + aU_3^4}{U_3\left(2U_3^2W_3x_1 - 2W_3\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right) - 2y_1U_3^3\right)} \bmod p$$

Considering,

$$\lambda_6 = \frac{W_6}{U_6} \bmod p \tag{3.45}$$

where,

$$U_6 = U_3q_6 \bmod p \tag{3.46}$$

Now we substitute the value of $\lambda_6$ from Equation 3.45, using the value of $U_6$ from Equation 3.46 when we eliminate the inverses,

$$x_6 = \lambda_6^2 - 2x_3 \bmod p$$

$$x_6 = \left(\frac{W_6}{U_6}\right)^2 - 2\left(\frac{W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2}{U_3^2}\right) \bmod p$$

Multiplying the equation with $U_6^2$,

$$U_6^2x_6 = W_6^2 - 2q_6^2\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right) \bmod p$$

$$x_6 = \frac{W_6^2 - 2q_6^2\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right)}{U_6^2} \bmod p \tag{3.47}$$

Now we find $y_6$,

$$y_6 = \lambda_6(x_3 - x_6) - y_3 \bmod p$$

Substituting the values of $y_3$, $x_3$ and $x_6$,

$$y_6 = \frac{W_6}{U_6}\left(\frac{W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2}{U_3^2} - \frac{W_6^2 - 2q_6^2\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right)}{U_6^2}\right) - \frac{U_3W_3(x_1 - x_3) - y_1U_3^2}{U_3^2} \bmod p$$

Multiplying the equation with $U_6^4$,

$$U_6^4y_6 = W_6U_6\left(q_6^2\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right) - \left(W_6^2 - 2q_6^2\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right)\right)\right) - U_3^2q_6^4\left(U_3W_3(x_1 - x_3) - y_1U_3^2\right) \bmod p$$

$$U_6^4y_6 = W_6U_6\left(q_6^2W_3^2 + x_1U_3^2q_6^2 - q_6^2q_3^2(3x_1^2+a)^2 - W_6^2 + 2q_6^2W_3^2 + 2q_6^2x_1U_3^2 - 2q_6^2q_3^2(3x_1^2+a)^2\right) - U_3^2q_6^4\left(U_3W_3(x_1 - x_3) - y_1U_3^2\right) \bmod p$$

$$U_6^4y_6 = W_6U_6\left(3q_6^2W_3^2 + 3x_1U_3^2q_6^2 - 3q_6^2q_3^2(3x_1^2+a)^2 - W_6^2\right) - U_3^2q_6^4\left(U_3W_3(x_1 - x_3) - y_1U_3^2\right) \bmod p$$

Substituting the $x_3$ value in equation 3.24,

$$U_6^4y_6 = W_6U_6\left(3q_6^2W_3^2 + 3x_1U_3^2q_6^2 - 3q_6^2q_3^2(3x_1^2+a)^2 - W_6^2\right) - U_3^2q_6^4\left(U_3W_3\left(x_1 - \frac{W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2}{U_3^2}\right) - y_1U_3^2\right) \bmod p$$

$$U_6^4y_6 = W_6U_6\left(3q_6^2W_3^2 + 3x_1U_3^2q_6^2 - 3q_6^2q_3^2(3x_1^2+a)^2 - W_6^2\right) - U_3q_6^4\left(W_3\left(U_3^2x_1 - W_3^2 - x_1U_3^2 + q_3^2(3x_1^2+a)^2\right) - y_1U_3^3\right) \bmod p$$

$$U_6^4y_6 = W_6U_6\left(3q_6^2W_3^2 + 3x_1U_3^2q_6^2 - 3q_6^2q_3^2(3x_1^2+a)^2 - W_6^2\right) - U_3q_6^4\left(W_3q_3^2(3x_1^2+a)^2 - W_3^3 - y_1U_3^3\right) \bmod p$$

$$y_6 = \frac{N_{y_6}}{U_6^4} \bmod p \tag{3.48}$$

Numerical Examples

Example 1: Let P = (5,1), then we apply our 2(3P) algorithm to compute the new x and y coordinates. We consider the values previously computed in Example 1 in the numerical example of Section 3.1.4 where,

Nx3 = 11

Ny3 = 12

U3 = 8

W3 = 16

q3 = 4

First, we find $W_6$, $q_6$ and $U_6$,

$$W_6 = 3\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right)^2 + aU_3^4 \bmod p$$

$$W_6 = 3\left(16^2 + 5(8)^2 - 4^2(3(5)^2+2)^2\right)^2 + 2(8)^4 \bmod 17$$

$$W_6 = 3(1 + 14 - 4)^2 + 15 \bmod 17$$

$$W_6 = 4$$

$$q_6 = 2U_3^2W_3x_1 - 2W_3\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right) - 2y_1U_3^3 \bmod p$$

$$q_6 = 2(8)^2(16)(5) - 2(16)\left((16)^2 + 5(8)^2 - (4)^2(3(5)^2+2)^2\right) - 2(1)(8)^3 \bmod 17$$

$$q_6 = 6 - 15(1 + 14 - 4) - 4 \bmod 17$$

$$q_6 = 7$$

$$U_6 = U_3q_6 \bmod p$$

$$U_6 = 8(7) \bmod 17$$

$$U_6 = 5$$

Now, we substitute these values in the x and y coordinate equations and get,

$$x_6 = \frac{W_6^2 - 2q_6^2\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right)}{U_6^2} \bmod p$$

$$x_6 = \frac{4^2 - 2(7)^2\left((16)^2 + 5(8)^2 - (4)^2(3(5)^2+2)^2\right)}{5^2} \bmod 17$$

$$x_6 = \frac{16 - 13(1 + 14 - 4)}{8} \bmod 17$$

$$x_6 = \frac{9}{8} \bmod 17$$

$$x_6 = 16$$

Where the inverse of 8 is 15 and $N_{x_6} = 9$.

$$U_6^4y_6 = W_6U_6\left(3q_6^2W_3^2 + 3x_1U_3^2q_6^2 - 3q_6^2q_3^2(3x_1^2+a)^2 - W_6^2\right) - U_3q_6^4\left(W_3q_3^2(3x_1^2+a)^2 - W_3^3 - y_1U_3^3\right) \bmod p$$

$$(5)^4y_6 = 4(5)\left(3(7)^2(16)^2 + 3(5)(8)^2(7)^2 - 3(7)^2(4)^2(3(5)^2+2)^2 - (4)^2\right) - (8)(7)^4\left((16)(4)^2(3(5)^2+2)^2 - (16)^3 - (1)(8)^3\right) \bmod 17$$

$$13y_6 = 3(11 + 1 - 10 - 16) - 15(13 - 16 - 2) \bmod 17$$

$$13y_6 = 9 - 10 \bmod 17$$

$$y_6 = 16(13^{-1}) \bmod 17$$

$$y_6 = 13$$

Where the inverse of 13 is 4 and $N_{y_6} = 16$.

Example 2: Let the base point be 11P = (13,10), then we apply our 2(3(11P)) algorithm to compute the new x and y coordinates, which is equivalent to the point 9P = (7,6). We consider the values previously computed in Example 2 in the numerical example of Section 3.1.4 where,

Nx3 = 8

Ny3 = 13

U3 = 4

W3 = 11

q3 = 7

First, we find $W_9$, $q_9$ and $U_9$,

$$W_9 = 3\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right)^2 + aU_3^4 \bmod p$$

$$W_9 = 3\left(11^2 + 13(4)^2 - 7^2(3(13)^2+2)^2\right)^2 + 2(4)^4 \bmod 17$$

$$W_9 = 3(2 + 4 - 15)^2 + 2 \bmod 17$$

$$W_9 = 7$$

$$q_9 = 2U_3^2W_3x_1 - 2W_3\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right) - 2y_1U_3^3 \bmod p$$

$$q_9 = 2(4)^2(11)(13) - 2(11)\left((11)^2 + 13(4)^2 - (7)^2(3(13)^2+2)^2\right) - 2(10)(4)^3 \bmod 17$$

$$q_9 = 3 - 5(2 + 4 - 15) - 5 \bmod 17$$

$$q_9 = 9$$

$$U_9 = U_3q_9 \bmod p$$

$$U_9 = 4(9) \bmod 17$$

$$U_9 = 2$$

Now, we substitute these values in the x and y coordinate equations and get,

$$x_9 = \frac{W_9^2 - 2q_9^2\left(W_3^2 + x_1U_3^2 - q_3^2(3x_1^2+a)^2\right)}{U_9^2} \bmod p$$

$$x_9 = \frac{7^2 - 2(9)^2\left((11)^2 + 13(4)^2 - (7)^2(3(13)^2+2)^2\right)}{2^2} \bmod 17$$

$$x_9 = \frac{15 - 9(2 + 4 - 15)}{4} \bmod 17$$

$$x_9 = \frac{11}{4} \bmod 17$$

$$x_9 = 7$$

Where the inverse of 4 is 13 and $N_{x_9} = 11$.

$$U_9^4y_9 = W_9U_9\left(3q_9^2W_3^2 + 3x_1U_3^2q_9^2 - 3q_9^2q_3^2(3x_1^2+a)^2 - W_9^2\right) - U_3q_9^4\left(W_3q_3^2(3x_1^2+a)^2 - W_3^3 - y_1U_3^3\right) \bmod p$$

$$(2)^4y_9 = 7(2)\left(3(9)^2(11)^2 + 3(13)(4)^2(9)^2 - 3(9)^2(7)^2(3(13)^2+2)^2 - (7)^2\right) - (4)(9)^4\left((11)(7)^2(3(13)^2+2)^2 - (11)^3 - (10)(4)^3\right) \bmod 17$$

$$16y_9 = 14(10 + 3 - 7 - 15) - 13(12 - 5 - 11) \bmod 17$$

$$16y_9 = 10 - 16 \bmod 17$$

$$y_9 = 11(16^{-1}) \bmod 17$$

$$y_9 = 6$$

Where the inverse of 16 is 16 and $N_{y_9} = 11$.
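Both examples can be reproduced with the following minimal Python sketch of Equations 3.45 to 3.48, assuming a = 2 and p = 17 as in the examples. The function name `double_of_triple` is hypothetical, and the inputs are the tripling values $W_3$, $q_3$, $U_3$ together with the base point.

```python
def double_of_triple(w3, q3, u3, x1, y1, a=2, p=17):
    """Double the point 3Q given its tripling values, using one inversion."""
    m = (3 * x1 * x1 + a) % p
    nx3 = (w3 * w3 + x1 * u3 * u3 - q3 * q3 * m * m) % p    # numerator of x3
    w6 = (3 * nx3 * nx3 + a * u3**4) % p                    # slope numerator
    q6 = (2 * u3 * u3 * w3 * x1 - 2 * w3 * nx3 - 2 * y1 * u3**3) % p
    u6 = (u3 * q6) % p                                      # Eq. 3.46
    nx6 = (w6 * w6 - 2 * q6 * q6 * nx3) % p                 # numerator of Eq. 3.47
    ny6 = (w6 * u6 * (3 * q6 * q6 * nx3 - w6 * w6)
           - u3 * q6**4 * (w3 * q3 * q3 * m * m - w3**3 - y1 * u3**3)) % p
    inv = pow(u6, -1, p)                                    # the single inversion
    return nx6 * inv**2 % p, ny6 * inv**4 % p               # y uses U6^4 (Eq. 3.48)

print(double_of_triple(16, 4, 8, 5, 1))     # 2(3P)      -> (16, 13)
print(double_of_triple(11, 7, 4, 13, 10))   # 2(3(11P))  -> (7, 6)
```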

3.1.10 Another Implementation of 10Q

Here is another example of a different implementation of 10Q, which was computed previously in Section 3.1.6 through the $2^3Q + 2Q$ form. In this section, we implement 10Q by doubling the point 5Q, which was computed previously by applying a differential addition to the 4Q algorithm of Section 3.1.1. We have succeeded in showing that 10Q can also be computed with a single inverse.

As we have done in the previous section, we substitute the $x_5$ and $y_5$ coordinates into the doubling slope in Equation 2.3,

$$\lambda_{10} = \frac{3x_5^2 + a}{2y_5} \bmod p$$

$$\lambda_{10} = \frac{3\left(\frac{(N_y - y_1U^3)^2 - x_1U^2(N_x - x_1U^2)^2 - N_x(N_x - x_1U^2)^2}{U^2(N_x - x_1U^2)^2}\right)^2 + a}{2\left(\frac{U(N_x - x_1U^2)\left((N_y - y_1U^3)x_1 - (N_y - y_1U^3)x_5 - U(N_x - x_1U^2)y_1\right)}{U^2(N_x - x_1U^2)^2}\right)} \bmod p$$

Multiplying with $\frac{U^4(N_x - x_1U^2)^4}{U^4(N_x - x_1U^2)^4}$,

$$\lambda_{10} = \frac{3\left((N_y - y_1U^3)^2 - x_1U^2(N_x - x_1U^2)^2 - N_x(N_x - x_1U^2)^2\right)^2 + aU^4(N_x - x_1U^2)^4}{2U^3(N_x - x_1U^2)^3\left((N_y - y_1U^3)x_1 - (N_y - y_1U^3)x_5 - U(N_x - x_1U^2)y_1\right)} \bmod p$$

Reforming $x_5$ as,

$$x_5 = \frac{N_{x_5}}{U^2(N_x - x_1U^2)^2} \bmod p$$

Then, substitute $x_5$ in the slope equation,

$$\lambda_{10} = \frac{3\left((N_y - y_1U^3)^2 - x_1U^2(N_x - x_1U^2)^2 - N_x(N_x - x_1U^2)^2\right)^2 + aU^4(N_x - x_1U^2)^4}{2x_1U^3(N_x - x_1U^2)^3(N_y - y_1U^3) - 2U(N_x - x_1U^2)(N_y - y_1U^3)N_{x_5} - 2U^4(N_x - x_1U^2)^4y_1} \bmod p$$

Take out a common factor $U(N_x - x_1U^2)$ of the denominator,

$$\lambda_{10} = \frac{3\left((N_y - y_1U^3)^2 - x_1U^2(N_x - x_1U^2)^2 - N_x(N_x - x_1U^2)^2\right)^2 + aU^4(N_x - x_1U^2)^4}{U(N_x - x_1U^2)\left(2x_1U^2(N_x - x_1U^2)^2(N_y - y_1U^3) - 2N_{x_5}(N_y - y_1U^3) - 2y_1U^3(N_x - x_1U^2)^3\right)} \bmod p$$

Considering,

$$\lambda_{10} = \frac{W_{10}}{U_{10}} \bmod p \tag{3.49}$$

Where,

$$U_{10} = U(N_x - x_1U^2)q_{10} \bmod p \tag{3.50}$$

Now we substitute $\lambda_{10}$ in the x and y coordinate equations,

$$x_{10} = \lambda_{10}^2 - 2x_5 \bmod p$$

$$x_{10} = \left(\frac{W_{10}}{U_{10}}\right)^2 - 2\left(\frac{N_{x_5}}{U^2(N_x - x_1U^2)^2}\right) \bmod p$$

Multiplying the equation with $U_{10}^2$,

$$U_{10}^2x_{10} = W_{10}^2 - 2q_{10}^2N_{x_5} \bmod p$$

$$x_{10} = \frac{W_{10}^2 - 2q_{10}^2N_{x_5}}{U_{10}^2} \bmod p \tag{3.51}$$

Now we find $y_{10}$,

$$y_{10} = \lambda_{10}(x_5 - x_{10}) - y_5 \bmod p$$

$$y_{10} = \frac{W_{10}}{U_{10}}\left(\frac{N_{x_5}}{U^2(N_x - x_1U^2)^2} - \frac{W_{10}^2 - 2q_{10}^2N_{x_5}}{U_{10}^2}\right) - \frac{N_{y_5}}{U^2(N_x - x_1U^2)^2} \bmod p$$

Multiplying the equation with $U_{10}^3$,

$$U_{10}^3y_{10} = W_{10}(N_{x_5}q_{10}^2 - W_{10}^2 + 2q_{10}^2N_{x_5}) - N_{y_5}q_{10}^2U_{10} \bmod p$$

$$y_{10} = \frac{W_{10}(N_{x_5}q_{10}^2 - W_{10}^2 + 2q_{10}^2N_{x_5}) - N_{y_5}q_{10}^2U_{10}}{U_{10}^3} \bmod p \tag{3.52}$$
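This section gives no numerical example, so the following minimal Python sketch evaluates Equations 3.49 to 3.52 on the toy curve (a = 2, p = 17) using the 4P and 5P values from the earlier examples. It assumes that $W_{10}$ is the numerator of the final slope expression and that $q_{10}$ is the bracketed factor of its denominator; the function name `ten_q` is hypothetical.

```python
def ten_q(nx, ny, u, nx5, ny5, x1, y1, a=2, p=17):
    """Double 5Q using the 4Q values (Nx, Ny, U) and the 5Q numerators."""
    d = (nx - x1 * u * u) % p          # Nx - x1*U^2
    t = (ny - y1 * u**3) % p           # Ny - y1*U^3
    # Assumed definitions of W10 and q10, read off the slope expression
    w10 = (3 * nx5 * nx5 + a * u**4 * d**4) % p
    q10 = (2 * x1 * u * u * d * d * t - 2 * nx5 * t - 2 * y1 * u**3 * d**3) % p
    u10 = (u * d * q10) % p            # Eq. 3.50
    nx10 = (w10 * w10 - 2 * q10 * q10 * nx5) % p                       # Eq. 3.51
    ny10 = (w10 * (nx5 * q10 * q10 - nx10) - ny5 * q10 * q10 * u10) % p  # Eq. 3.52
    inv = pow(u10, -1, p)              # the single inversion of this step
    return nx10 * inv**2 % p, ny10 * inv**3 % p

# 4P values Nx = 6, Ny = 5, U = 11 and 5P numerators Nx5 = 16, Ny5 = 2
print(ten_q(6, 5, 11, 16, 2, 5, 1))    # -> (7, 11), i.e. 10P
```

The result agrees with 10P = -9P = (7, 11) on this curve, since the example group has order 19 and 9P = (7, 6).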

3.2 Results

Simulation experiments were performed with a Java implementation of the proposed equations. We applied the equations to the large parameters defined for the standard curve P-521 from the National Institute of Standards and Technology (NIST). Table 3.1 shows the differences in time and operation count between the new method and the original equations.

As noted in Table 3.1, our algorithms are significantly faster than the original ones. The improvement achievable with parallelization can also be observed, together with its scalability to higher-order multiplication scalars.

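Table 3.1: Algorithms Preliminary Measurements. The columns Mults, Divs, ALUs and Parallel levels are grouped under "Number of Operations".

| Routine | Mults | Divs | ALUs | Parallel levels | Time (ms) |
|---|---|---|---|---|---|
| 4P Eq. | 350/642 | 291/628 | 309/634 | 310/646 | 0.5/0.9 |
| 8P Eq. | 405/930 | 287/909 | 320/921 | 313/939 | 0.6/1.1 |
| 16P Eq. | 518/1248 | 289/1220 | 347/1236 | 327/1260 | 0.9/1.6 |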

3.3 Further optimization

In this part we show a further optimization that uses labeling and the reuse of intermediate results. We avoid recomputing mathematical operations whose results are needed again in later steps; instead, we label the results of these operations and reuse them. Table 3.2 lists all labels used in this phase of optimization to reduce the number of operations, thus speeding up the elliptic curve computations.

As noted in Table 3.2, these labels are used for the equations $2^nP$ where n ≤ 4.

Now we rewrite the equations for each algorithm.

For 4P, one can compute $W$, $U$, $x_4$ and $y_4$ by using the following equations:

$$W = 3(L_2 - L_5)^2 + aL_6 \bmod p \tag{3.53}$$

$$U = L_3(3L_5L_1 - 2L_1L_2 - L_6) \bmod p \tag{3.54}$$

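Table 3.2: List of Labels for equations 4P, 8P and 16P.

| Label | Operation | Label | Operation | Label | Operation |
|---|---|---|---|---|---|
| $L_1$ | $3x_1^2 + a$ | $L_{21}$ | $(3x_1^2 + a)U$ | $L_{41}$ | $(2y_1)Uq_{16}$ |
| $L_2$ | $L_1^2 = (3x_1^2 + a)^2$ | $L_{22}$ | $(3x_1^2 + a)^2U^2$ | $L_{42}$ | $(2y_1)^2U^2q_{16}^2$ |
| $L_3$ | $2y_1$ | $L_{23}$ | $U_8^2$ | $L_{43}$ | $U_{16}^2$ |
| $L_4$ | $L_3^2 = (2y_1)^2$ | $L_{24}$ | $x_1U_8^2$ | $L_{44}$ | $N_{x_8}$ |
| $L_5$ | $2x_1L_4 = 2x_1(2y_1)^2$ | $L_{25}$ | $4x_1U_8^2$ | $L_{45}$ | $N_{x_{16}}$ |
| $L_6$ | $L_4^2 = (2y_1)^4$ | $L_{26}$ | $q_8^2$ | $L_{46}$ | $x_1(2y_1)^2$ |
| $L_7$ | $qL_1 = q(3x_1^2 + a)$ | $L_{27}$ | $N_{x_8}$ | $L_{47}$ | $N_{x_1}$ |
| $L_8$ | $L_7^2 = q^2(3x_1^2 + a)^2$ | $L_{28}$ | $U_8^{-1}$ | $L_{48}$ | $y_1^2$ |
| $L_9$ | $U^2$ | $L_{29}$ | $(U_8^{-1})^2$ | $L_{49}$ | $2y_1^2$ |
| $L_{10}$ | $x_1L_9 = x_1U^2$ | $L_{30}$ | $q_8(3x_1^2 + a)$ | $L_{50}$ | $N_{y_1}$ |
| $L_{11}$ | $2L_{10} = 2x_1U^2$ | $L_{31}$ | $y_1U_8^3$ | $L_{51}$ | $2N_{y_1}$ |
| $L_{12}$ | $N_x$ | $L_{32}$ | $q_8^2(3x_1^2 + a)^2$ | $L_{52}$ | $q^2$ |
| $L_{13}$ | $U^{-1}$ | $L_{33}$ | $Uq_8(3x_1^2 + a)$ | $L_{53}$ | $N_{x_1}q^2$ |
| $L_{14}$ | $L_{13}^2 = (U^{-1})^2$ | $L_{34}$ | $U^2q_8^2(3x_1^2 + a)^2$ | $L_{54}$ | $N_y$ |
| $L_{15}$ | $W^2$ | $L_{35}$ | $3x_1U_8^2$ | $L_{55}$ | $q_8^2$ |
| $L_{16}$ | $2L_{11} = 4x_1U^2$ | $L_{36}$ | $6x_1U_8^2$ | $L_{56}$ | $N_xq_8^2$ |
| $L_{17}$ | $L_3U = 2y_1U$ | $L_{37}$ | $q_8W(2y_1)$ | $L_{57}$ | $N_{x_8}$ |
| $L_{18}$ | $W_8^2$ | $L_{38}$ | $q_8^2W^2(2y_1)^2$ | $L_{58}$ | $N_{y_8}$ |
| $L_{19}$ | $L_3W = 2y_1W$ | $L_{39}$ | $W_{16}^2$ | $L_{59}$ | $q_{16}^2$ |
| $L_{20}$ | $L_{19}^2 = (2y_1)^2W^2$ | $L_{40}$ | $q_{16}^2$ | $L_{60}$ | $N_{x_8}q_{16}^2$ |
| $L_{61}$ | $N_{x_{16}}$ | $L_{62}$ | $q$ | $L_{63}$ | $U_{16}^{-1}$ |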

Where,

$$U = L_3q$$

$$x_4 = \frac{W^2 - 2L_8 + L_{16}}{L_9} \bmod p \tag{3.55}$$

Where,

$$x_4 = N_xL_{14}$$

$$y_4 = \frac{L_7\left(L_7(W + L_7) - (L_{11} + L_{10})\right) - L_9(2x_1W - Uy_1) - L_{12}W}{U^3} \bmod p \tag{3.56}$$

Where,

$$y_4 = N_yL_{13}L_{14}$$

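In case of computing 8P, one can compute $W_8$, $U_8$, $x_8$ and $y_8$ by using the following equations:

$$W_8 = L_3\left(3(L_{15} - 2L_8 + L_{16})^2 + aL_9^2\right) \bmod p \tag{3.57}$$

$$U_8 = L_{17}\left(L_7\left(2L_{62}L_7(3WL_3 + L_1) - (L_{16} + L_{11})\right) - W(3UL_{16} + 2L_{15}) + L_{17}L_9\right) \bmod p \tag{3.58}$$

Where,

$$U_8 = L_{17}q_8$$

$$x_8 = \frac{L_{18} - L_{26}(2L_{20} - 4L_{22}) - 2L_{25}}{L_{23}} \bmod p \tag{3.59}$$

Where,

$$x_8 = N_{x_8}L_{29}$$

$$y_8 = N_{y_8}L_{28}L_{29} \bmod p \tag{3.60}$$

Where,

$$N_{y_8} = W_8L_{26}(L_{20} - 2L_{22}) + W_8(L_{25} - L_{27}) + L_{24}q_8(2L_{19} + 3L_{21}) - L_{21}L_{26}L_1(WU_8 + L_{21}Uq_8) + y_1(2L_{12}L_{19}L_3L_{26}q_8 - L_{23}U_8)$$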

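In case of computing 16P, one can compute $W_{16}$, $U_{16}$, $x_{16}$ and $y_{16}$ by using the following equations:

$$W_{16} = L_{17}\left(3\left(L_{18} - 2L_{26}(L_{20} - 2L_{22}) - 2L_{25}\right)^2 + aL_{23}^2\right) \bmod p \tag{3.61}$$

$$U_{16} = L_{17}U_8\left(6W_8L_{26}(L_{20} - 2L_{22}) + 2W_8(3L_{25} - L_8) - 2L_{33}(3WU_8L_{30} - L_{35} + L_{34}) + 2L_{37}(L_{16} + L_{38}) - 2L_{31}\right) \bmod p \tag{3.62}$$

Where,

$$U_{16} = L_{17}U_8q_{16}$$

$$x_{16} = \frac{L_{39} - 2L_{18}L_{42} + 4L_{23}L_{40}(L_{20} - 2L_{22}) + 16x_1L_{43}}{L_{43}} \bmod p \tag{3.63}$$

Where,

$$x_{16} = N_{x_{16}}L_{63}^2$$

$$y_{16} = \frac{L_{42}(W_{16}L_{27} - L_{44}L_{41}) - W_{16}L_{45}}{U_{16}^3} \bmod p \tag{3.64}$$

Where,

$$y_{16} = N_{y_{16}}L_{63}^3$$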

Chapter 4

Direct Doubling

Earlier, in Section 3.1, we illustrated how to find a higher-order double independently, without going through all the steps required by the original algorithm. Our previous technique was based on substituting the previous coordinate equations into the variables of the desired double and then simplifying the terms of the equations to reach a simpler form with a single inversion. Likewise, in this phase of optimization we follow the same steps, except that we substitute directly the assumption made in Sections 3.1.1, 3.1.2, and 3.1.3, that $x_n = \frac{N_{x_n}}{U_n^2}$ and $y_n = \frac{N_{y_n}}{U_n^3}$, where n is the order of the desired double.

To compute 4P, we first find $N_{x_1}$ and $N_{y_1}$, which are the numerators of the x and y coordinates of the first double (2P) respectively. We substitute the value of $\lambda$ from Equation (2.3) into both the x and y coordinate equations, then multiply Equations (2.4) and (2.5) by $(2y_1)^2$ and $(2y_1)^3$ respectively.

Then, one can compute $N_{x_1}$ and $N_{y_1}$ with the following formulas,

$$N_{x_1} = (3x_1^2 + a)^2 - 2x_1(2y_1)^2 \bmod p \tag{4.1}$$

$$N_{y_1} = (3x_1^2 + a)\left(x_1(2y_1)^2 - N_{x_1}\right) - 2y_1^2(2y_1)^2 \bmod p \tag{4.2}$$

Then, we replace the variables $x_3$ and $y_3$ in the second double slope with our assumption and get,

$$\lambda_4 = \frac{3x_3^2 + a}{2y_3} \bmod p$$

$$\lambda_4 = \frac{3\left(\frac{N_{x_1}}{u^2}\right)^2 + a}{2\frac{N_{y_1}}{u^3}} \bmod p$$

Where $u$ is the denominator of the (2P) slope.

Now, we eliminate the inverses by multiplying with $\frac{u^4}{u^4}$,

$$\lambda_4 = \frac{3N_{x_1}^2 + au^4}{2N_{y_1}u} \bmod p \tag{4.3}$$

For simplicity, as in Equation (3.4), we consider,

$$W = 3N_{x_1}^2 + au^4 \bmod p \tag{4.4}$$

$$q = 2N_{y_1} \bmod p$$

$$U = qu \bmod p \tag{4.5}$$

Then we substitute the new slope equation in the $x_4$ and $y_4$ equations,

$$x_4 = \lambda_4^2 - 2x_3 \bmod p$$

$$x_4 = \left(\frac{W}{U}\right)^2 - 2\frac{N_{x_1}}{u^2} \bmod p$$

We eliminate the inverses in the $x_4$ equation by multiplying with $U^2$, recalling from Equation 3.3 that,

$$U = (2y_1)q$$

Then we get,

$$U^2x_4 = W^2 - 2N_{x_1}q^2 \bmod p$$

$$x_4 = \frac{W^2 - 2N_{x_1}q^2}{U^2} \bmod p \tag{4.6}$$

The same steps are applied in order to find and simplify $y_4$,

$$y_4 = \lambda_4(x_3 - x_4) - y_3 \bmod p$$

$$y_4 = \frac{W}{U}\left(\frac{N_{x_1}}{u^2} - \frac{N_x}{U^2}\right) - \frac{N_{y_1}}{u^3} \bmod p$$

Then we multiply $y_4$ by $U^3$,

$$U^3y_4 = W(N_{x_1}q^2 - N_x) - N_{y_1}q^3 \bmod p$$

$$y_4 = \frac{W(N_{x_1}q^2 - N_x) - N_{y_1}q^3}{U^3} \bmod p \tag{4.7}$$

Furthermore, these equations can be generalized for any doubling order. By using this form, one can compute $N_{x_1}$ and $N_{y_1}$ and then replace all the variables in the equations that are related to the order of the desired double in order to perform any advanced double directly (Direct Doubling). Computing the previous W's, U's, $N_x$'s and $N_y$'s is required, but with the $N_{x_1}$ and $N_{y_1}$ formulas available, the computations can be done smoothly.

Here is the general form that performs any double:

$$W_n = 3N_{x_{n-1}}^2 + aU_{n-1}^4 \bmod p \tag{4.8}$$

$$q_n = 2N_{y_{n-1}} \bmod p$$

$$U_n = q_nU_{n-1} \bmod p \tag{4.9}$$

$$x_n = \frac{W_n^2 - 2N_{x_{n-1}}q_n^2}{U_n^2} \bmod p \tag{4.10}$$

$$y_n = \frac{W_n(N_{x_{n-1}}q_n^2 - N_{x_n}) - N_{y_{n-1}}q_n^3}{U_n^3} \bmod p \tag{4.11}$$

Where n is the order of the double and n-1 refers to the previous one.
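The general form lends itself to a simple loop in which only the numerators and the denominator $U$ are carried from one double to the next. The following is a minimal Python sketch of Equations 4.1, 4.2 and 4.8 to 4.11, assuming the toy curve of the numerical examples (a = 2, p = 17); the function name `direct_double` and the loop structure are illustrative, and degenerate inputs (such as a point with $y = 0$) are not handled.

```python
def direct_double(x1, y1, n, a=2, p=17):
    """Return 2^n * (x1, y1) for n >= 1 using a single inversion at the end."""
    m = 3 * x1 * x1 + a
    u = 2 * y1 % p                                          # denominator of the 2P slope
    nx = (m * m - 2 * x1 * u * u) % p                       # Eq. 4.1
    ny = (m * (x1 * u * u - nx) - 2 * y1 * y1 * u * u) % p  # Eq. 4.2
    for _ in range(n - 1):
        w = (3 * nx * nx + a * u**4) % p                    # Eq. 4.8
        q = 2 * ny % p
        u_next = q * u % p                                  # Eq. 4.9
        nx_next = (w * w - 2 * nx * q * q) % p              # numerator of Eq. 4.10
        ny = (w * (nx * q * q - nx_next) - ny * q**3) % p   # numerator of Eq. 4.11
        nx, u = nx_next, u_next
    inv = pow(u, -1, p)                                     # the only inversion
    return nx * inv**2 % p, ny * inv**3 % p

print(direct_double(5, 1, 2))    # 4P      -> (3, 1)
print(direct_double(5, 1, 3))    # 8P      -> (13, 7)
print(direct_double(13, 10, 3))  # 8(11P)  -> (0, 11)
```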

Numerical Examples In this section we use the same cyclic group that we used in Section 3.1.1. As illustrated in this chapter, the Direct Doubling algorithm can find any advanced double by repeating the same algorithm i times, where i is the number of iterations.

Example 1: Let P = (5,1), then we apply our Direct Doubling algorithm in order to find the new x and y coordinates for the next double. First, we compute $N_{x_1}$ and $N_{y_1}$, which are related to the point 2P = (6,3), then we apply two iterations in order to compute the point 8P = (13,7).

$$N_{x_1} = (3x_1^2 + a)^2 - 2x_1(2y_1)^2 \bmod p$$

$$N_{x_1} = (3(5)^2 + 2)^2 - 2(5)(2(1))^2 \bmod 17$$

$$N_{x_1} = 13 - 6 \bmod 17$$

$$N_{x_1} = 7$$

$$N_{y_1} = (3x_1^2 + a)\left(x_1(2y_1)^2 - N_{x_1}\right) - 2y_1^2(2y_1)^2 \bmod p$$

$$N_{y_1} = (3(5)^2 + 2)\left((5)(2(1))^2 - 7\right) - 2(1)^2(2(1))^2 \bmod 17$$

$$N_{y_1} = 9(3 - 7) - 8 \bmod 17$$

$$N_{y_1} = 7$$

Where $u = 2y_1 = 2$.

Now we start the first iteration to find the variables $N_x$, $N_y$, $W$, $q$ and $U$ that are related to the point 4P.

$$W = 3N_{x_1}^2 + au^4 \bmod p$$

$$W = 3(7)^2 + 2(2)^4 \bmod 17$$

$$W = 9$$

$$q = 2N_{y_1} \bmod p$$

$$q = 2(7) \bmod 17$$

$$q = 14$$

$$U = qu \bmod p$$

$$U = 14(2) \bmod 17$$

$$U = 11$$

Then we substitute these values in the $x_4$ and $y_4$ equations and get,

$$x_4 = \frac{W^2 - 2N_{x_1}q^2}{U^2} \bmod p$$

$$x_4 = \frac{9^2 - 2(7)(14)^2}{11^2} \bmod 17$$

$$x_4 = \frac{6}{2} \bmod 17$$

$$x_4 = 3$$

Where the inverse of 2 is 9 and $N_x = 6$.

$$y_4 = \frac{W(N_{x_1}q^2 - N_x) - N_{y_1}q^3}{U^3} \bmod p$$

$$y_4 = \frac{9(7(14)^2 - 6) - 7(14)^3}{11^3} \bmod 17$$

$$y_4 = \frac{5}{5} \bmod 17$$

$$y_4 = 1$$

Where the inverse of 5 is 7 and $N_y = 5$.

Now the inputs for the next iteration are ready in order to compute the point 8P.

$$W_8 = 3N_x^2 + aU^4 \bmod p$$

$$W_8 = 3(6)^2 + 2(11)^4 \bmod 17$$

$$W_8 = 14$$

$$q_8 = 2N_y \bmod p$$

$$q_8 = 2(5) \bmod 17$$

$$q_8 = 10$$

$$U_8 = q_8U \bmod p$$

$$U_8 = 10(11) \bmod 17$$

$$U_8 = 8$$

Then we substitute these values in the $x_8$ and $y_8$ equations and get,

$$x_8 = \frac{W_8^2 - 2N_xq_8^2}{U_8^2} \bmod p$$

$$x_8 = \frac{14^2 - 2(6)(10)^2}{8^2} \bmod 17$$

$$x_8 = \frac{16}{13} \bmod 17$$

$$x_8 = 13$$

Where the inverse of 13 is 4 and $N_{x_8} = 16$.

$$y_8 = \frac{W_8(N_xq_8^2 - N_{x_8}) - N_yq_8^3}{U_8^3} \bmod p$$

$$y_8 = \frac{14(6(10)^2 - 16) - 5(10)^3}{8^3} \bmod 17$$

$$y_8 = \frac{14}{2} \bmod 17$$

$$y_8 = 7$$

Where the inverse of 2 is 9 and $N_{y_8} = 14$.

Example 2: Consider the base point to be the point 11P = (13,10), then we apply our Direct Doubling algorithm in order to find the new x and y coordinates for the next double. First, we compute $N_{x_1}$ and $N_{y_1}$, which are related to the first double 2(11P) = 3P = (10,6), then we apply two iterations in order to compute the point 8(11P), which is equivalent to the point 12P = (0,11).

$$N_{x_1} = (3x_1^2 + a)^2 - 2x_1(2y_1)^2 \bmod p$$

$$N_{x_1} = (3(13)^2 + 2)^2 - 2(13)(2(10))^2 \bmod 17$$

$$N_{x_1} = 1 - 13 \bmod 17$$

$$N_{x_1} = 5$$

$$N_{y_1} = (3x_1^2 + a)\left(x_1(2y_1)^2 - N_{x_1}\right) - 2y_1^2(2y_1)^2 \bmod p$$

$$N_{y_1} = (3(13)^2 + 2)\left((13)(2(10))^2 - 5\right) - 2(10)^2(2(10))^2 \bmod 17$$

$$N_{y_1} = 16(15 - 5) - 15 \bmod 17$$

$$N_{y_1} = 9$$

Where $u = 2y_1 = 3$.

Now we start the first iteration to find the variables $N_x$, $N_y$, $W$, $q$ and $U$ that are related to the point 4P.

$$W = 3N_{x_1}^2 + au^4 \bmod p$$

$$W = 3(5)^2 + 2(3)^4 \bmod 17$$

$$W = 16$$

$$q = 2N_{y_1} \bmod p$$

$$q = 2(9) \bmod 17$$

$$q = 1$$

$$U = qu \bmod p$$

$$U = 1(3) \bmod 17$$

$$U = 3$$

Then we substitute these values in the $x_4$ and $y_4$ equations and get,

$$x_4 = \frac{W^2 - 2N_{x_1}q^2}{U^2} \bmod p$$

$$x_4 = \frac{16^2 - 2(5)(1)^2}{3^2} \bmod 17$$

$$x_4 = \frac{8}{9} \bmod 17$$

$$x_4 = 16$$

Where the inverse of 9 is 2 and $N_x = 8$.

$$y_4 = \frac{W(N_{x_1}q^2 - N_x) - N_{y_1}q^3}{U^3} \bmod p$$

$$y_4 = \frac{16(5(1)^2 - 8) - 9(1)^3}{3^3} \bmod 17$$

$$y_4 = \frac{11}{10} \bmod 17$$

$$y_4 = 13$$

Where the inverse of 10 is 12 and $N_y = 11$.

Now the inputs for the next iteration are ready in order to compute the point 8P.

$$W_8 = 3N_x^2 + aU^4 \bmod p$$

$$W_8 = 3(8)^2 + 2(3)^4 \bmod 17$$

$$W_8 = 14$$

$$q_8 = 2N_y \bmod p$$

$$q_8 = 2(11) \bmod 17$$

$$q_8 = 5$$

$$U_8 = q_8U \bmod p$$

$$U_8 = 5(3) \bmod 17$$

$$U_8 = 15$$

Then we substitute these values in the $x_8$ and $y_8$ equations and get,

$$x_8 = \frac{W_8^2 - 2N_xq_8^2}{U_8^2} \bmod p$$

$$x_8 = \frac{14^2 - 2(8)(5)^2}{15^2} \bmod 17$$

$$x_8 = \frac{0}{4} \bmod 17$$

$$x_8 = 0$$

Where the inverse of 4 is 13 and $N_{x_8} = 0$.

$$y_8 = \frac{W_8(N_xq_8^2 - N_{x_8}) - N_yq_8^3}{U_8^3} \bmod p$$

$$y_8 = \frac{14(8(5)^2 - 0) - 11(5)^3}{15^3} \bmod 17$$

$$y_8 = \frac{14}{9} \bmod 17$$

$$y_8 = 11$$

Where the inverse of 9 is 2 and $N_{y_8} = 14$.

4.1 Direct Doubling with Labeling

In this section we apply the labeling and register reusing technique to the 4P, 8P and 16P direct doubling algorithms explained in the previous section. Table 3.2 contains all the mathematical operations that appear in the three algorithms.

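In case of computing 4P:

$$N_{x_1} = L_2 - L_5 \bmod p \tag{4.12}$$

$$N_{y_1} = L_1(L_{46} - L_{47}) - L_{49}L_4 \bmod p \tag{4.13}$$

$$W = 3L_{47}^2 + aL_6 \bmod p \tag{4.14}$$

$$q = 2L_{50} \bmod p$$

$$U = L_3L_{51} \bmod p \tag{4.15}$$

$$x_4 = \frac{L_{15} - 2L_{53}}{U^2} \bmod p \tag{4.16}$$

$$y_4 = \frac{W(L_{53} - L_{12}) - L_{50}L_{52}q}{U^3} \bmod p \tag{4.17}$$

In case of computing 8P:

$$W_8 = 3L_{12}^2 + aL_9^2 \bmod p \tag{4.18}$$

$$q_8 = 2L_{54} \bmod p$$

$$U_8 = q_8U \bmod p \tag{4.19}$$

$$x_8 = \frac{L_{18} - 2L_{56}}{U_8^2} \bmod p \tag{4.20}$$

$$y_8 = \frac{W_8(L_{56} - L_{57}) - L_{54}L_{55}q_8}{U_8^3} \bmod p \tag{4.21}$$

In case of computing 16P:

$$W_{16} = 3L_{27}^2 + aL_{23}^2 \bmod p \tag{4.22}$$

$$q_{16} = 2L_{58} \bmod p$$

$$U_{16} = q_{16}U_8 \bmod p \tag{4.23}$$

$$x_{16} = \frac{L_{39} - 2L_{60}}{U_{16}^2} \bmod p \tag{4.24}$$

$$y_{16} = \frac{W_{16}(L_{60} - L_{61}) - L_{58}L_{59}q_{16}}{U_{16}^3} \bmod p \tag{4.25}$$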
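To illustrate how the labels avoid repeated work, the following minimal Python sketch evaluates the 4P case (Equations 4.12 to 4.17) with each label computed once and reused. It assumes the label definitions of Table 3.2 and the toy parameters a = 2 and p = 17; the function name `labeled_4p` and the dictionary `L` are illustrative only.

```python
def labeled_4p(x1, y1, a=2, p=17):
    """Compute 4P with the labels of Table 3.2, reusing every intermediate value."""
    L = {}
    L[1] = (3 * x1 * x1 + a) % p
    L[2] = L[1] * L[1] % p
    L[3] = 2 * y1 % p
    L[4] = L[3] * L[3] % p
    L[5] = 2 * x1 * L[4] % p
    L[6] = L[4] * L[4] % p
    L[46] = x1 * L[4] % p
    L[47] = (L[2] - L[5]) % p                             # Nx1, Eq. 4.12
    L[49] = 2 * y1 * y1 % p
    L[50] = (L[1] * (L[46] - L[47]) - L[49] * L[4]) % p   # Ny1, Eq. 4.13
    W = (3 * L[47] * L[47] + a * L[6]) % p                # Eq. 4.14
    L[51] = 2 * L[50] % p                                 # q
    U = L[3] * L[51] % p                                  # Eq. 4.15
    L[15] = W * W % p
    L[52] = L[51] * L[51] % p                             # q^2
    L[53] = L[47] * L[52] % p                             # Nx1 * q^2
    L[12] = (L[15] - 2 * L[53]) % p                       # Nx, numerator of Eq. 4.16
    ny = (W * (L[53] - L[12]) - L[50] * L[52] * L[51]) % p  # numerator of Eq. 4.17
    inv = pow(U, -1, p)                                   # the single inversion
    return L[12] * inv**2 % p, ny * inv**3 % p

print(labeled_4p(5, 1))  # -> (3, 1), i.e. 4P
```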

4.2 Comparing Fast $2^nP$ algorithm vs Direct Doubling

In terms of multiplication count, we compare our Fast $2^nP$ algorithm with Direct Doubling, both with the labeling and register reusing technique applied. We have counted the number of multiplications in 4P, 8P and 16P for both algorithms. Table 4.1 shows the differences in multiplication count between the Fast $2^nP$ algorithm and Direct Doubling.

Table 4.1: Number of Mult. in Fast $2^n$ vs Direct Doubling.

| Algorithm | 4P | 8P | 16P |
|---|---|---|---|
| Fast $2^n$ | 25 | 58 | 90 |
| Direct Doubling | 15 | 25 | 35 |

As noted in Table 4.1, there is a significant improvement in the number of multiplications for the Direct Doubling algorithm compared to our previous work, Fast $2^nP$. The improvement also scales to higher-order multiplication scalars: in Direct Doubling, only 10 more multiplications are needed each time we compute the next higher-order double. For instance, we can compute the scalar multiple 512P using 85 multiplications, which is fewer than the 90 multiplications that the Fast $2^nP$ algorithm needs for 16P.
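The counts quoted above follow a simple linear pattern. The following minimal Python sketch assumes that the observed increment of 10 multiplications per additional double continues to hold beyond 16P; the function name `direct_doubling_mults` is hypothetical.

```python
def direct_doubling_mults(n):
    """Multiplications for 2^n P with Direct Doubling, n >= 2.

    Assumes 15 multiplications for 4P and 10 more per extra double,
    as observed in Table 4.1.
    """
    return 15 + 10 * (n - 2)

for n in (2, 3, 4, 9):
    print(2**n, direct_doubling_mults(n))  # 4 15, 8 25, 16 35, 512 85
```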

Chapter 5

Other Coordinate Systems

Previously, in Chapter 2, we worked on minimizing the number of inversions in an affine coordinate system. An inversion takes between 9 and 40 times as long as a multiplication [18], [84]. Therefore, executing fewer inversions makes the system more efficient. As introduced in Chapter 2, we generalized our Direct Doubling algorithm to compute an arbitrary number of point doublings with a single inversion, and the addition of a point can be included as well. Nevertheless, an inversion still has to be computed in every iteration. On the other hand, coordinate systems such as Projective, Jacobian and Montgomery are designed to recover the affine x and y coordinates only at the end. Thus, the total cost of inversions can be reduced to a single inversion for the entire key size.

5.1 Projective Coordinates

Basically, all points in affine coordinate (xn, yn) are written as (Xn : Yn : Zn) in projective space where xn = Xn/Zn and yn = Yn/Zn. Let P1 and P2 be points on an elliptic curve then,

(X1 : Y1 : Z1) + (X2 : Y2 : Z2) = (X3 : Y3 : Z3)

Where X3 and Y3 are computed as follows:

When P1 ≠ ±P2 (Addition),

$$u = Y_2Z_1 - Y_1Z_2 \bmod p$$

$$v = X_2Z_1 - X_1Z_2 \bmod p$$

$$w = u^2Z_1Z_2 - v^3 - 2v^2X_1Z_2 \bmod p$$

$$X_3 = vw \bmod p$$

$$Y_3 = u(v^2X_1Z_2 - w) - v^3Y_1Z_2 \bmod p$$

$$Z_3 = v^3Z_1Z_2 \bmod p$$

When P1 = P2 (Doubling),

$$t = aZ_1^2 + 3X_1^2 \bmod p$$

$$u = Y_1Z_1 \bmod p$$

$$v = uX_1Y_1 \bmod p$$

$$w = t^2 - 8v \bmod p$$

$$X_3 = 2uw \bmod p$$

$$Y_3 = t(4v - w) - 8Y_1^2u^2 \bmod p$$

$$Z_3 = 8u^3 \bmod p$$
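The projective formulas above can be tried on the toy curve used in the numerical examples (p = 17, a = 2, base point P = (5, 1)). The following Python sketch is an illustration only: the function names are hypothetical, and it implements the original projective formulas, not the labeled and optimized versions developed in this chapter.

```python
P_MOD = 17  # toy prime from the numerical examples
A = 2       # curve coefficient a from the same examples


def proj_add(P1, P2, p=P_MOD):
    """Add two points (X : Y : Z) with P1 != +-P2."""
    X1, Y1, Z1 = P1
    X2, Y2, Z2 = P2
    u = (Y2 * Z1 - Y1 * Z2) % p
    v = (X2 * Z1 - X1 * Z2) % p
    w = (u * u * Z1 * Z2 - v ** 3 - 2 * v * v * X1 * Z2) % p
    X3 = (v * w) % p
    Y3 = (u * (v * v * X1 * Z2 - w) - v ** 3 * Y1 * Z2) % p
    Z3 = (v ** 3 * Z1 * Z2) % p
    return X3, Y3, Z3


def proj_double(P1, a=A, p=P_MOD):
    """Double a point (X : Y : Z)."""
    X1, Y1, Z1 = P1
    t = (a * Z1 * Z1 + 3 * X1 * X1) % p
    u = (Y1 * Z1) % p
    v = (u * X1 * Y1) % p
    w = (t * t - 8 * v) % p
    X3 = (2 * u * w) % p
    Y3 = (t * (4 * v - w) - 8 * Y1 * Y1 * u * u) % p
    Z3 = (8 * u ** 3) % p
    return X3, Y3, Z3


def proj_to_affine(P1, p=P_MOD):
    """Recover (x, y) = (X/Z, Y/Z); this is the only modular inversion."""
    X, Y, Z = P1
    z_inv = pow(Z, -1, p)  # requires Python 3.8+
    return (X * z_inv) % p, (Y * z_inv) % p


base = (5, 1, 1)
two_p = proj_double(base)
three_p = proj_add(base, two_p)
four_p = proj_double(two_p)
print(proj_to_affine(two_p), proj_to_affine(three_p), proj_to_affine(four_p))
```

All intermediate points stay in projective form, so the inversion is paid once per conversion back to affine coordinates.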

In the equations above (the original algorithm), point addition costs 14 multiplications and point doubling costs 12 multiplications, where we count every squaring as a multiplication in our analysis. We now apply our method to the original algorithm to find higher-order doublings, using the Labeling and Register Reusing technique. Table 5.1 shows all labels that are used in our algorithms.

The Z value at the first iteration equals one, which reduces the cost to 9 multiplications. We have succeeded in optimizing the original algorithm further, bringing the multiplication cost down to 8, which leads to an accumulated improvement for the higher-order doublings. Here we illustrate our work starting from a single double up to 32P.

In case of computing 2P:

X3 = L1L10 mod p (5.1)

Y3 = L15 mod p (5.2)

Table 5.1: List of Labels for Projective Algorithms.

Label  Operation     Label  Operation     Label  Operation
L1     2y1           L31    4v2           L61    t4^2
L2     X1^2          L32    L31 − w2      L62    8v4
L3     3L2           L33    t2L32         L63    u4w4
L4     a + L3        L34    L15^2         L64    2L63
L5     L4^2          L35    L28L34        L65    u4^2
L6     L1^2          L36    u2L28         L66    8L64
L7     X1L6          L37    L36^2         L67    4v4
L8     L1L6          L38    aL37          L68    L67 − w4
L9     2L7           L39    L30^2         L69    t4L68
L10    L5 − L9       L40    3L39          L70    L54^2
L11    3L7           L41    L33 − L35     L71    L66L70
L12    L11 − L5      L42    u3L30         L72    L66u4
L13    L12L4         L43    t3^2          L73    L72^2
L14    Y1L8          L44    8v3           L74    aL73
L15    L13 − L14     L45    u3w3          L75    L64^2
L16    2L7           L46    2L45          L76    3L75
L17    L5 − L16      L47    u3^2          L77    L69 − L71
L18    L8^2          L48    8L47          L78    u5L64
L19    aL18          L49    4v3           L79    t5^2
L20    L17^2         L50    L49 − w3      L80    8v5
L21    3L6           L51    t3 − L50      L81    u5w5
L22    L20L21        L52    L41^2         L82    2L81
L23    u2L15         L53    L48L52        L83    u5^2
L24    L1L17         L54    L51 − L53     L84    8L83
L25    t2^2          L55    u3L48         L85    4v5
L26    8v2           L56    L55^2         L86    L85 − w5
L27    u2^2          L57    aL56          L87    t5L86
L28    8L27          L58    L46^2         L88    L77^2
L29    u2w2          L59    3L58          L89    L88L48
L30    2L29          L60    u4L46

Z3 = L8 mod p (5.3)

In case of computing 4P:

t2 = L19 + L22 mod p

u2 = L8L17 mod p

v2 = L23L24 mod p

w2 = L25 − L26 mod p

X4 = 2L29 mod p (5.4)

Y4 = L33 − L35 mod p (5.5)

Z4 = u2L28 mod p (5.6)

In case of computing 8P:

t3 = L38 + L40 mod p

u3 = L41L36 mod p

v3 = L42L41 mod p

w3 = L43 − L44 mod p

X8 = 2L45 mod p (5.7)

Y8 = L51 − L53 mod p (5.8)

Z8 = u3L48 mod p (5.9)

In case of computing 16P:

t4 = L57 + L59 mod p

u4 = L54L55 mod p

v4 = L60L54 mod p

w4 = L61 − L62 mod p

X16 = L64 mod p (5.10)

Y16 = L69 − L71 mod p (5.11)

Z16 = u4L66 mod p (5.12)

In case of computing 32P:

t5 = L74 + L76 mod p

u5 = L77L72 mod p

v5 = L78L77 mod p

w5 = L79 − L80 mod p

X32 = L82 mod p (5.13)

Y32 = L87 − L89 mod p (5.14)

Z32 = u5L84 mod p (5.15)

5.2 Jacobian Coordinates

The Jacobian coordinate system is a modified version of Projective in which a point $P_n = (X_n : Y_n : Z_n)$ is represented in affine coordinates as $(X_n/Z_n^2,\ Y_n/Z_n^3)$. As we will see later, the Jacobian system is faster than Projective: the original point doubling algorithm takes 9 multiplications, and only 6 multiplications in the first iteration, where Z = 1. We follow the same steps that were applied to the Projective system in the previous section to find doubling algorithms up to 32P. Table 5.2 shows all the labels that are used for the Jacobian coordinate equations.

First, we list all equations for point addition and doubling for the original Jacobian coordinates,

In case P1 ≠ ±P2 (Addition),

$$r = X_1Z_2^2 \bmod p$$

$$s = X_2Z_1^2 \bmod p$$

$$t = Y_1Z_2^3 \bmod p$$

$$u = Y_2Z_1^3 \bmod p$$

$$v = s - r \bmod p$$

$$w = u - t \bmod p$$

$$X_3 = -v^3 - 2rv^2 + w^2 \bmod p$$

$$Y_3 = -tv^3 + (rv^2 - X_3)w \bmod p$$

$$Z_3 = vZ_1Z_2 \bmod p$$

In case P1 = P2 (Doubling),

$$v = 4X_1Y_1^2 \bmod p$$

$$w = 3X_1^2 + aZ_1^4 \bmod p$$

$$X_3 = -2v + w^2 \bmod p$$

$$Y_3 = -8Y_1^4 + (v - X_3)w \bmod p$$

$$Z_3 = 2Y_1Z_1 \bmod p$$
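As a small illustration of repeated doubling with a single inversion, the Jacobian doubling formulas can be chained on the toy curve of the numerical examples (p = 17, a = 2, P = (5, 1)). This Python sketch uses hypothetical function names and the original formulas, not the labeled versions. The constant b = 2 is an assumption inferred from P lying on the curve and is used only for the on-curve check.

```python
P_MOD, A, B = 17, 2, 2  # B is inferred from P = (5, 1); used only for checking


def jac_double(P1, a=A, p=P_MOD):
    """Double a Jacobian point (X : Y : Z), where x = X/Z^2 and y = Y/Z^3."""
    X1, Y1, Z1 = P1
    v = (4 * X1 * Y1 * Y1) % p
    w = (3 * X1 * X1 + a * Z1 ** 4) % p
    X3 = (-2 * v + w * w) % p
    Y3 = (-8 * Y1 ** 4 + (v - X3) * w) % p
    Z3 = (2 * Y1 * Z1) % p
    return X3, Y3, Z3


def jac_to_affine(P1, p=P_MOD):
    """Recover (x, y) with one modular inversion."""
    X, Y, Z = P1
    z_inv = pow(Z, -1, p)  # requires Python 3.8+
    return (X * z_inv ** 2) % p, (Y * z_inv ** 3) % p


def repeated_double(P1, n):
    """Compute (2^n)P by n successive doublings, with no inversion inside."""
    Q = P1
    for _ in range(n):
        Q = jac_double(Q)
    return Q


def on_curve(x, y, p=P_MOD):
    return (y * y - (x ** 3 + A * x + B)) % p == 0


x32, y32 = jac_to_affine(repeated_double((5, 1, 1), 5))  # 32P
print(x32, y32, on_curve(x32, y32))
```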

Now, we illustrate our work starting from a single double up to 32P,

In case of computing 2P:

X3 = L7 + L8 mod p (5.16)

Y3 = L74 + L75 mod p (5.17)

Z3 = 2Y1 mod p (5.18)

In case of computing 4P:

Table 5.2: List of Labels for Jacobian Algorithms.

Label  Operation   Label  Operation   Label  Operation
L1     y1^2        L14    Y3^2        L35    L27^2
L2     X1L1        L15    4L14        L36    −8L35
L3     4L2         L16    3L13        L37    v3 − X8
L4     X1^2        L17    16L9        L38    L37w3
L5     3L4         L18    aL17        L39    L10L11
L6     L5 + a      L19    −2v2        L40    2Y4
L7     −2L3        L20    w2^2        L41    L33 + L34
L8     L6L6        L21    L14^2       L42    L41^2
L9     L1L1        L22    −8L21       L43    Y8^2
L73    L3 − X3     L23    v2 − X4     L44    4L43
L74    L73L6       L24    L23w2       L45    3L42
L75    −8L9        L25    L14^2       L46    L39L40
invx   inv^2       L26    L25^2       L47    L46^2
invy   invx·inv    L27    Y4^2        L48    L47^2
L10    2Y1         L28    4L27        L49    aL48
L11    2Y3         L29    3L26        L50    −2v4
L12    L7 + L8     L30    256L9       L51    w4^2
L13    L12^2       L31    L30L21      L52    L43^2
L76    L3L12       L32    aL31        L53    −8L52
L71    L70w5       L33    −2v3        L54    v4 − X16
L72    2Y16        L34    w3^2        L55    w4L54
L56    2Y8         L57    L50 + L51   L58    L57^2
L59    Y16^2       L60    4L59        L61    3L58
L62    L56L46      L63    L62^2       L64    L63^2
L65    aL64        L66    −2v5        L67    w5^2
L68    L59^2       L69    −8L68       L70    v5 − X32

v2 = L15L12 mod p

w2 = L16 + L18 mod p

X4 = L19 + L20 mod p (5.19)

Y4 = L22 + L24 mod p (5.20)

Z4 = L10L11 mod p (5.21)

In case of computing 8P:

v3 = L28L25 mod p

w3 = L29 + L32 mod p

X8 = L33 + L34 mod p (5.22)

Y8 = L36 + L38 mod p (5.23)

Z8 = L39L40 mod p (5.24)

In case of computing 16P:

v4 = L44L41 mod p

w4 = L45 + L49 mod p

X16 = L50 + L51 mod p (5.25)

Y16 = L53 + L55 mod p (5.26)

Z16 = L56L46 mod p (5.27)

In case of computing 32P:

v5 = L60L57 mod p

w5 = L61 + L65 mod p

X32 = L66 + L67 mod p (5.28)

Y32 = L69 + L71 mod p (5.29)

Z32 = L72L62 mod p (5.30)

5.3 Montgomery Coordinates

The Montgomery curve was first introduced by Peter L. Montgomery in 1987 [55]. It is an elliptic curve that differs from the Weierstrass curve. Let $P_n = (X_n : Y_n : Z_n)$ be a point on the Montgomery curve $E_m$. Then one can compute the x and y coordinates as follows:

$$E_m : y^2 = x^3 + a_4x + a_6.$$

In case P1 ≠ ±P2 (Addition),

$$X_{m+n} = Z_{m-n}\left(-4a_6Z_mZ_n(X_mZ_n + X_nZ_m) + (X_mX_n - a_4Z_mZ_n)^2\right) \bmod p \tag{5.31}$$

$$Z_{m+n} = X_{m-n}(X_mZ_n - X_nZ_m)^2 \bmod p \tag{5.32}$$

In case P1 = P2 (Doubling),

$$X_{2n} = (X_n^2 - a_4Z_n^2)^2 - 8a_6X_nZ_n^3 \bmod p \tag{5.33}$$

$$Z_{2n} = 4Z_n\left(X_n(X_n^2 + a_4Z_n^2) + a_6Z_n^3\right) \bmod p \tag{5.34}$$

As we see, Montgomery arithmetic relies on x-coordinate computations, and the y-coordinate can always be recovered by applying the following formula:

$$y_n = \frac{2a_6 + (x_1x_n + a_4)(x_1 + x_n) - (x_1 - x_n)^2x_{n+1}}{2y_1} \bmod p \tag{5.35}$$

Basically, the Montgomery arithmetic requires computing the next x-coordinate point alongside the point doubling in order to recover the y-coordinate. In particular, unlike the Projective and Jacobian systems, the Montgomery point computation costs 3 inverse operations at the end, since the formula for recovering the y-coordinate uses only affine coordinates. Figure 5.1 shows a graphical representation of the Montgomery Doubler implementation.

Figure 5.1: Montgomery Doubler Implementation Flowchart.

As Figure 5.1 shows, the Montgomery system executes a doubling and an addition in every iteration regardless of the value of the secret key bits. This makes the iterations indistinguishable and therefore resistant to side-channel attacks such as Simple Power Analysis (SPA) [50]. Thus, this form of ECC might not be the fastest, but it remains competitive because of its other features.
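The ladder structure and the y-recovery step can be sketched in Python using equations (5.31) to (5.35) on the toy curve of the numerical examples (p = 17, a4 = 2, P = (5, 1)). The function names are hypothetical, a6 = 2 is an assumption inferred from P lying on the curve, and the sketch ignores the point at infinity, so it fails when kP or (k+1)P is the identity.

```python
P_MOD = 17
A4, A6 = 2, 2  # A6 is an assumption inferred from P = (5, 1) being on the curve


def xz_double(Pn, p=P_MOD):
    """Equations (5.33)-(5.34): x-only doubling of (X : Z)."""
    X, Z = Pn
    X2 = ((X * X - A4 * Z * Z) ** 2 - 8 * A6 * X * Z ** 3) % p
    Z2 = (4 * Z * (X * (X * X + A4 * Z * Z) + A6 * Z ** 3)) % p
    return X2, Z2


def xz_diff_add(Pm, Pn, Pdiff, p=P_MOD):
    """Equations (5.31)-(5.32): x-only addition, given Pdiff = Pm - Pn."""
    Xm, Zm = Pm
    Xn, Zn = Pn
    Xd, Zd = Pdiff
    X = (Zd * (-4 * A6 * Zm * Zn * (Xm * Zn + Xn * Zm)
               + (Xm * Xn - A4 * Zm * Zn) ** 2)) % p
    Z = (Xd * (Xm * Zn - Xn * Zm) ** 2) % p
    return X, Z


def ladder(k, x1, y1, p=P_MOD):
    """Return affine kP for k >= 1, doubling and adding on every key bit."""
    base = (x1 % p, 1)
    r0, r1 = base, xz_double(base, p)  # invariant: r1 - r0 = P
    for bit in bin(k)[3:]:             # bits after the leading 1
        if bit == "1":
            r0, r1 = xz_diff_add(r0, r1, base, p), xz_double(r1, p)
        else:
            r0, r1 = xz_double(r0, p), xz_diff_add(r0, r1, base, p)
    # Three inversions at the end: Z of kP, Z of (k+1)P, and 2*y1.
    xn = r0[0] * pow(r0[1], -1, p) % p
    xn1 = r1[0] * pow(r1[1], -1, p) % p
    num = 2 * A6 + (x1 * xn + A4) * (x1 + xn) - (x1 - xn) ** 2 * xn1
    yn = num * pow(2 * y1, -1, p) % p  # equation (5.35)
    return xn, yn


print(ladder(14, 5, 1))
```

Both branches perform one doubling and one differential addition, which is the key-independent pattern described above.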

Chapter 6

EiSi Coordinates

The EiSi coordinate system is a modified version of affine in which a point $P_n = (X_n : Y_n)$ is represented in affine coordinates as $(N_{x_n}/U_n^2,\ N_{y_n}/U_n^3)$. Compared to the Weierstrass form, the EiSi elliptic curve also offers faster arithmetic. Similarly to Projective, this form requires only a single inversion, at the last iteration.

Let P1 and P2 be points on an elliptic curve then,

(X1 : Y1) + (X2 : Y2) = (X3 : Y3)

At the first iteration we consider Un = 1, then we get,

$$(N_{x_1} : N_{y_1}) + (N_{x_2} : N_{y_2}) = (N_{x_3}/U_3^2 : N_{y_3}/U_3^3)$$

Where (in case of doubling),

$$N_{x_3} = (3x_1^2 + a)^2 - 2x_1(2y_1)^2 \bmod p \tag{6.1}$$

$$N_{y_3} = (3x_1^2 + a)\left(x_1(2y_1)^2 - N_{x_3}\right) - 2y_1^2(2y_1)^2 \bmod p \tag{6.2}$$

$$U_3 = 2y_1 \bmod p \tag{6.3}$$

Additionally, when two points are added after the first iteration, where the base point will have changed, we use a modified version of the point addition algorithm.

Let P1 and P2 be points on the elliptic curve, where $(N_{x_1} : N_{y_1} : U_1)$ and $(N_{x_2} : N_{y_2} : U_2)$ are their respective EiSi representations. Then we get,

(Nx1 : Ny1 : U1) + (Nx2 : Ny2 : U2) = (Nx3 : Ny3 : U3)

In case P1 ≠ ±P2 (Addition),

$$W_3 = N_{y_2}U_1^3 - N_{y_1}U_2^3 \bmod p \tag{6.4}$$

$$q_3 = N_{x_2}U_1^2 - N_{x_1}U_2^2 \bmod p \tag{6.5}$$

$$U_3 = U_1U_2q_3 \bmod p \tag{6.6}$$

$$N_{x_3} = W_3^2 - N_{x_1}U_2^2q_3^2 - N_{x_2}U_1^2q_3^2 \bmod p \tag{6.7}$$

$$N_{y_3} = W_3(N_{x_1}U_2^2q_3^2 - N_{x_3}) - N_{y_1}U_2^3q_3^3 \bmod p \tag{6.8}$$

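In case P1 = P2 (Doubling),

$$W_n = 3N_{x_{n-1}}^2 + aU_{n-1}^4 \bmod p \tag{6.9}$$

$$q_n = 2N_{y_{n-1}} \bmod p \tag{6.10}$$

$$U_n = q_nU_{n-1} \bmod p \tag{6.11}$$

$$N_{x_n} = W_n^2 - 2N_{x_{n-1}}q_n^2 \bmod p \tag{6.12}$$

$$N_{y_n} = W_n(N_{x_{n-1}}q_n^2 - N_{x_n}) - N_{y_{n-1}}q_n^3 \bmod p \tag{6.13}$$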
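Equations (6.4) to (6.13) translate almost line by line into code. The Python sketch below uses hypothetical function names and the toy parameters of the numerical examples (p = 17, a = 2, P = (5, 1)). It drives the two operations with a plain left-to-right double-and-add loop and performs the only inversion at the very end. It assumes k ≥ 1 and that no intermediate addition hits the case P1 = ±P2.

```python
P_MOD, A = 17, 2  # toy parameters from the numerical examples


def eisi_double(P1, a=A, p=P_MOD):
    """Equations (6.9)-(6.13) on a point (Nx : Ny : U)."""
    Nx, Ny, U = P1
    W = (3 * Nx * Nx + a * U ** 4) % p
    q = (2 * Ny) % p
    U_new = (q * U) % p
    Nx_new = (W * W - 2 * Nx * q * q) % p
    Ny_new = (W * (Nx * q * q - Nx_new) - Ny * q ** 3) % p
    return Nx_new, Ny_new, U_new


def eisi_add(P1, P2, p=P_MOD):
    """Equations (6.4)-(6.8), valid for P1 != +-P2."""
    Nx1, Ny1, U1 = P1
    Nx2, Ny2, U2 = P2
    W = (Ny2 * U1 ** 3 - Ny1 * U2 ** 3) % p
    q = (Nx2 * U1 ** 2 - Nx1 * U2 ** 2) % p
    U3 = (U1 * U2 * q) % p
    Nx3 = (W * W - Nx1 * U2 ** 2 * q * q - Nx2 * U1 ** 2 * q * q) % p
    Ny3 = (W * (Nx1 * U2 ** 2 * q * q - Nx3) - Ny1 * U2 ** 3 * q ** 3) % p
    return Nx3, Ny3, U3


def eisi_to_affine(P1, p=P_MOD):
    """(x, y) = (Nx / U^2, Ny / U^3): the single inversion."""
    Nx, Ny, U = P1
    u_inv = pow(U, -1, p)  # requires Python 3.8+
    return (Nx * u_inv ** 2) % p, (Ny * u_inv ** 3) % p


def eisi_multiply(k, base):
    """Left-to-right double-and-add for k >= 1, inverting only once."""
    R = base
    for bit in bin(k)[3:]:  # bits after the leading 1
        R = eisi_double(R)
        if bit == "1":
            R = eisi_add(base, R)
    return eisi_to_affine(R)


print(eisi_multiply(14, (5, 1, 1)))
```

Because the representation of a point is not unique, intermediate triples produced by this loop may differ from triples obtained with other formulas while still mapping to the same affine point.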

In Section 4.1, we applied register renaming and labeling to the previous equations that we used in our implementation.

We have modified our algorithms to receive $N_{x_n}$, $N_{y_n}$ and $U_n$ instead of the $x_n$ and $y_n$ values. With this method, we no longer need to compute the inverse at each iteration.

Since all algorithms start by finding the $N_{x_1}$ and $N_{y_1}$ values that are related to the point 2P, we adjust the inputs of these algorithms, and we get:

$$N_{x_1} = (3N_{x_{in}}^2 + aU_{in}^4)^2 - 8N_{x_{in}}N_{y_{in}}^2 \bmod p \tag{6.14}$$

$$N_{y_1} = (3N_{x_{in}}^2 + aU_{in}^4)(4N_{x_{in}}N_{y_{in}}^2 - N_{x_1}) - 8N_{y_{in}}^4 \bmod p \tag{6.15}$$

$$u = 2N_{y_{in}}U_{in} \bmod p \tag{6.16}$$

Where $N_{x_{in}}$, $N_{y_{in}}$ and $U_{in}$ are the inputs that represent the point (X1 : Y1), with $U_{in}$ equal to one at the first iteration, and $N_{x_1}$, $N_{y_1}$ and u are the outputs that represent the point 2P.

Numerical Example In this section we use the same cyclic group that was introduced in Section 3.1.1 and reuse some of the values computed in the previous numerical examples, in order to illustrate how our new coordinate system computes point doubling and addition correctly with a single inverse for the whole key.

Assume we have a 4-bit key that represents the number $14_{10} = (1110)_2$. We apply the Left-to-Right algorithm in order to compute the x and y coordinates of the point 14P = (9,1).

First, as we scan from left to right, we process the second 1-bit. Each 1-bit translates into a doubling and an addition, while each 0-bit is only a doubling. Thus, we double and add in order to find the point 3P.

As in Section 3.1.4, we consider the values,

U3 = 8

Nx3 = 11

Ny3 = 12

Note: Nx3 and Ny3 were computed with no inversion operation.

Then we scan the next bit from the left, which is 1. Another double and add is applied, and we get 2(3P) + P = 7P.

KEY: 1 1 1 ....

Operations: 3P 7P ....

As in Section 3.1.6, we consider the values for 6P,

U6 = 4

Nx6 = 1

Ny6 = 16

Then we apply point addition to 6P and the base point P = (5,1) in order to find 7P. Since we add the base point, in this case $N_{x_1} = x_1 = 5$, $N_{y_1} = y_1 = 1$ and $U_1 = 1$.

$$W_7 = N_{y_6}U_1^3 - N_{y_1}U_6^3 \bmod p = 16(1)^3 - 1(4)^3 \bmod 17 = 3$$

$$q_7 = N_{x_6}U_1^2 - N_{x_1}U_6^2 \bmod p = 1(1)^2 - 5(4)^2 \bmod 17 = 6$$

$$U_7 = U_1U_6q_7 \bmod p = 1(4)(6) \bmod 17 = 7$$

$$N_{x_7} = W_7^2 - N_{x_1}U_6^2q_7^2 - N_{x_6}U_1^2q_7^2 \bmod p = 3^2 - 5(4)^2(6)^2 - 1(1)^2(6)^2 \bmod 17 = 9 - 7 - 2 \bmod 17 = 0$$

$$N_{y_7} = W_7(N_{x_1}U_6^2q_7^2 - N_{x_7}) - N_{y_1}U_6^3q_7^3 \bmod p = 3(5(4)^2(6)^2 - 0) - 1(4)^3(6)^3 \bmod 17 = 1$$

Note: To check that the variables representing the point 7P = (0,6) are correct in EiSi coordinates, we find the inverses.

$$x_7 = \frac{N_{x_7}}{U_7^2} = \frac{0}{15} = 0$$

$$y_7 = \frac{N_{y_7}}{U_7^3} = \frac{1}{3} = 6$$

Now we scan the last bit, which is 0. A doubling operation is applied, and we get 2(7P) = 14P = (9,1).

KEY: 1 1 1 0

Operations: 3P 7P 14P

$$W_{14} = 3N_{x_7}^2 + aU_7^4 \bmod p = 3(0)^2 + 2(7)^4 \bmod 17 = 8$$

$$q_{14} = 2N_{y_7} \bmod p = 2(1) \bmod 17 = 2$$

$$U_{14} = q_{14}U_7 \bmod p = 2(7) \bmod 17 = 14$$

$$N_{x_{14}} = W_{14}^2 - 2N_{x_7}q_{14}^2 \bmod p = 8^2 - 2(0)(2)^2 \bmod 17 = 13$$

$$N_{y_{14}} = W_{14}(N_{x_7}q_{14}^2 - N_{x_{14}}) - N_{y_7}q_{14}^3 \bmod p = 8((0)(2)^2 - 13) - (1)(2)^3 \bmod 17 = 7$$

At the end of the last iteration, the inverse function is applied in order to find the affine coordinates of the point 14P.

$$x_{14} = \frac{N_{x_{14}}}{U_{14}^2} = \frac{13}{9} = 9$$

$$y_{14} = \frac{N_{y_{14}}}{U_{14}^3} = \frac{7}{7} = 1$$

Example 2: In this example we apply an EiSi point addition to the points 14P = (13 : 7 : 14) and 3P = (11 : 12 : 8), which were computed in the previous example and in Example 1 of Section 3.1.4 respectively, in order to find the point 17P, which is represented in affine space as (6,14).

$$W_{17} = N_{y_{14}}U_3^3 - N_{y_3}U_{14}^3 \bmod p = 7(8)^3 - 12(14)^3 \bmod 17 = 15$$

$$q_{17} = N_{x_{14}}U_3^2 - N_{x_3}U_{14}^2 \bmod p = 13(8)^2 - 11(14)^2 \bmod 17 = 2$$

$$U_{17} = U_3U_{14}q_{17} \bmod p = 8(14)(2) \bmod 17 = 3$$

$$N_{x_{17}} = W_{17}^2 - N_{x_3}U_{14}^2q_{17}^2 - N_{x_{14}}U_3^2q_{17}^2 \bmod p = 15^2 - 11(14)^2(2)^2 - 13(8)^2(2)^2 \bmod 17 = 4 - 5 - 13 \bmod 17 = 3$$

$$N_{y_{17}} = W_{17}(N_{x_3}U_{14}^2q_{17}^2 - N_{x_{17}}) - N_{y_3}U_{14}^3q_{17}^3 \bmod p = 15(11(14)^2(2)^2 - 3) - 12(14)^3(2)^3 \bmod 17 = 15(5 - 3) - 9 \bmod 17 = 4$$

Now we apply the inverse to convert the point 17P = (3 : 4 : 3) into affine coordinates.

$$x_{17} = \frac{N_{x_{17}}}{U_{17}^2} = \frac{3}{9} = 6$$

$$y_{17} = \frac{N_{y_{17}}}{U_{17}^3} = \frac{4}{10} = 14$$

Example 3: In this example we apply an EiSi point doubling to the point 17P = (3 : 4 : 3) that was computed in the previous example in order to find the point 2(17P). Since 34 mod 19 = 15, this is the point 15P, which is represented in affine space as (3,16).

$$W_{15} = 3N_{x_{17}}^2 + aU_{17}^4 \bmod p = 3(3)^2 + 2(3)^4 \bmod 17 = 2$$

$$q_{15} = 2N_{y_{17}} \bmod p = 2(4) \bmod 17 = 8$$

$$U_{15} = q_{15}U_{17} \bmod p = 8(3) \bmod 17 = 7$$

$$N_{x_{15}} = W_{15}^2 - 2N_{x_{17}}q_{15}^2 \bmod p = 2^2 - 2(3)(8)^2 \bmod 17 = 11$$

$$N_{y_{15}} = W_{15}(N_{x_{17}}q_{15}^2 - N_{x_{15}}) - N_{y_{17}}q_{15}^3 \bmod p = 2(3(8)^2 - 11) - 4(8)^3 \bmod 17 = 14$$

Now we apply the inverse to convert the point 15P = (11 : 14 : 7) into affine coordinates.

$$x_{15} = \frac{N_{x_{15}}}{U_{15}^2} = \frac{11}{15} = 3$$

$$y_{15} = \frac{N_{y_{15}}}{U_{15}^3} = \frac{14}{3} = 16$$

Chapter 7

Sample Applications

In Section 3.1, we have seen how to compute a scalar multiple [k]P of an elliptic curve point with a single inverse. Furthermore, we succeeded in performing a differential addition between two points, where one of them can be doubled up to 4 times in one equation, again with a single inverse. This optimization can therefore speed up the computation of many applications and algorithms that are based on elliptic curves. In addition, it has the potential to play an important role in developing post-quantum cryptosystems such as SIDH. In this chapter we apply some of these optimizations to SIDH.

At each stage of the Supersingular Isogeny Diffie-Hellman (SIDH) protocol, the kernel of an isogeny has to be computed by both Alice and Bob by evaluating P + [k]Q, where P and Q are points on the curve and k is the secret key that is generated by both of them [40]. This operation must be performed in both phases of SIDH. In the first phase, key generation, the point is known in advance. In this case, one can construct a lookup table that contains all doubles of the point Q and reuse any of them as needed.

As shown in Section 3.1, the direct equations for finding a higher-order doubling avoid the original steps. Furthermore, our implementation has shown that the new equations are faster than the original ones, as described in Section 3.2. The introduced optimization speeds up the elliptic curve computation and is beneficial for SIDH’s quantum security margin. The new equations can be applied to the right-to-left fast exponentiation algorithm for binary elliptic curves [42, 61], which is also used for Montgomery curves [62, 29]. In addition, they can be applied to the left-to-right fast Double-and-Add or Double-Add-&-Subtract algorithms. We modify these algorithms in order to exploit the new equations.

7.1 Algorithms Overview

This section introduces new algorithms for computing P + [k]Q. The Three-point ladder algorithm (left-to-right, or Double-and-Add) [40] is shown in Figure 7.1. The example uses the same scalar as the original author, which is 12. In order to apply the new equations, one can first compute the tripling equations, then feed the result to the 2nd double equation. On the other hand, with the right-to-left fast multiplication algorithm, one computes the 2nd double, then applies a single double to the result, and sums the two results. Thus, the right-to-left algorithm is slower than the left-to-right one in this case, since it needs three steps while the left-to-right algorithm needs two.

In addition, both algorithms use two accumulators to have intermediary results added to the third accumulator, which stores the expected output. In contrast, our equations can shorten most of the steps the processor performs in calculating the final result by using a single accumulator.

Figure 7.1: Three point ladder (left-to-right) [40, 29].

The right-to-left algorithm [29, 42, 61] in Figure 7.2 lets the processor calculate the double at each step, regardless of the value of the secret key bit. With the new equations to compute repeated doubles, one no longer needs to double at every step, enabling the development of the more efficient algorithms introduced in the next section.

Figure 7.2: Right-to-left algorithm [29].

7.2 Fast Multiplication with Mixed Base Multiplicands

Here follows the description of a few algorithms that can integrate the fast repeated doubling techniques mentioned so far by applying mixed base multiplicands. With the algorithm mP + nQ one can compute multiplications with scalars up to 31. One can divide m’s binary representation into blocks of five bits. In case an obtained block represents one of the unimplemented scalar multiplications, such blocks may be reduced in length.

Figure 7.3: Data-dependency graph for calculating a single double merged with another one (Parallelization characteristic).

7.2.1 Right-to-left Extensions

The right-to-left algorithm computes [k]Q by scanning the bits of the scalar k from right to left. It accelerates the computation in the SIDH key generation phase, while the left-to-right algorithm increases efficiency in the key exchange phase [29].

Here we present an improved right-to-left algorithm that computes the kernel of isogenies P + [k]Q more efficiently in both phases. Unlike the Three-point ladder and right-to-left algorithms, this algorithm scans until it finds the first 1 bit of the secret key k, then starts its main loop. Once a 1 bit is located, it applies the improved equations to compute the double that matches the binary order of that bit. For example, if the first 1 bit is found at the third position from the right, one applies the 2nd double equations.

Figure 7.4: First Proposed Algorithm.

One can take advantage of the parallelization characteristic offered by the new equations. Namely, assuming that the x coordinate is always computed before y, one can start steps of the next doubling operation before the end of the current one. This saves at least M + 1 clock cycles (see Figure 7.3). Moreover, mixed addition provides more efficiency to elliptic curve addition and doubling algorithms, especially for large parameters [7]. Figure 7.4 shows the steps for calculating P + [k]Q with our algorithm when the scalar is the 5-bit number $(12)_{10} = (01100)_2$. This parallelization characteristic can be applied to all algorithms described further.

In Algorithm 2 the Right-to-Left extension is further generalized to support dynamic implementation of the doubling to high orders, as a combination of available implementations of lower order that provide the optimal combined performance (at Line 2.2).

procedure MultiplyR2L(k, P) do
    R := O; H := P;
    if |k| > 0 and k_0 == 1 then R := H;
    for (int i := 1; i < |k|;) do
1.1     l := 0;
        do
            l++; i++;
        while (k_i == 0 and l < 4);
1.2     H := 2^l H;
1.3     if (k_i) then ∥parallel(R := R + H)∥;
1.4 return R;
Algorithm 1: Right-to-left Extensions

procedure MultiplyR2LKnapsack(k, P) do
    R := O; H := P;
    if |k| > 0 and k_0 == 1 then R := H;
    for (i := 1; i < |k|;) do
2.1     l := 0;
        do
            l++; i++;
        while (k_i == 0);
2.2     H := DoubleKnapsack(H, l);
2.3     ∥parallel(R := R + H)∥;
2.4 return R;
Algorithm 2: Right-to-Left Knapsack

In Line 2.1 of the pseudocode, the counter l is initialized to zero. We scan for the next 1-bit to detect whether there is a following double to be launched in parallel once the computations of the x and y coordinates start. In Line 2.2 the repeated doubling is applied to the point, to the order specified by the counter l.

In Algorithm 1, at Line 1.2, this is performed directly with one of the new doubling equations, while in Algorithm 2, at Line 2.2, one uses a knapsack optimization to select the combination of available repeated doubling techniques that together best produce the multiplication by $2^l$ (this being the main difference between the two algorithms).

As we see in Line 2.3, the algorithm provides the opportunity to parallelize the aforementioned independent computations. Finally, in Line 2.4, the register R holds the final result.
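The control flow of the right-to-left extension can be sketched in runnable form. The Python below is an illustration under stated assumptions, not the dissertation's implementation: the group operations are passed in as callables (`add`, `repeated_double`), the bit indexing is chosen so that H always equals (2^i)P for the bit currently examined, and the demonstration uses integers modulo 19 as stand-ins for points of the toy cyclic group, so that the "point" n means [n]P. The `max_order` parameter mirrors the `l < 4` bound of Algorithm 1.

```python
def multiply_r2l(k, P, add, repeated_double, identity, max_order=4):
    """Right-to-left scalar multiplication with direct 2^l doublings.

    add(A, B) and repeated_double(H, l) are caller-supplied group operations
    (hypothetical interface); repeated_double(H, l) must return (2^l)H.
    """
    R = identity
    H = P                       # H holds (2^i)P for the current bit index i
    i, nbits = 0, k.bit_length()
    if k & 1:
        R = add(R, H)
    while i < nbits - 1:
        l = 0
        # Skip over a run of zero bits, at most max_order positions at once.
        while True:
            l += 1
            i += 1
            if (k >> i) & 1 or l == max_order or i == nbits - 1:
                break
        H = repeated_double(H, l)   # one direct doubling of order l
        if (k >> i) & 1:
            R = add(R, H)           # independent of the next doubling
    return R


# Stand-in group: integers modulo the group order 19 of the toy curve.
ORDER = 19
int_add = lambda a, b: (a + b) % ORDER
int_repeated_double = lambda h, l: (h << l) % ORDER

print(multiply_r2l(12, 1, int_add, int_repeated_double, 0))
```

For k = 12 the loop performs one 2nd-order doubling (4P), one single doubling (8P) and sums the two, matching the three steps described for the right-to-left algorithm in Section 7.1.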

7.2.2 Double and Add Extensions

In Section 3.1 it is shown how to compute all intermediate exponents and mix doubling with a differential addition using a single inverse. The left-to-right algorithm scans from the left for the next 1 bit, considering that the most significant bit is one. Then it decides whether to apply doubling, or doubling and addition, depending on the data being read. For instance, if the first two 1 bits read form the binary value $(101)_2$, which is $5_{10}$, it multiplies the base by 4 because it was shifted to the left by two bits. Since the last bit scanned is a 1, it also applies a differential addition of the base point to the point being doubled. Thus, the implementation computes 4Q + Q. Figure 7.5 shows a practical example for calculating [47]Q. This technique computes [47]Q with only four inverses, instead of the eight inverses needed when using the original equations.

procedure MultiplyL2RKnapsack(k, P) do
    if |k| ≤ 0 or k_{|k|−1} == 0 then return O;
    D := P;
    for (int i := |k| − 2; i ≥ 0;) do
        l := 0;
        do
            i--; l++;
        while (k_i == 0 and i ≥ 0);
        if k_i == 1 then
3.1         D := DoubleAndAddKnapsack(D, l, P);
        else
3.2         D := DoubleKnapsack(D, l);
    return D;
Algorithm 3: Double and Add Extensions

In Line 3.1 of the pseudocode of the Double-and-Add extensions, we apply the DoubleAndAddKnapsack function, which takes as parameters the counter l that specifies the current bit location and the base point P to be added at the end. Otherwise, we apply the DoubleKnapsack function, which computes the shift to the left by multiplying the D value by $2^l$; this is the same function used in Algorithm 2.
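A runnable sketch of the left-to-right extension follows. It is an illustration only: `add` and `repeated_double` are hypothetical caller-supplied group operations, the combined double-and-add of Line 3.1 is modeled as a repeated doubling followed by an addition, and integers modulo 19 stand in for points of the toy group. The returned step counter counts passes through the outer loop, each of which corresponds to one combined evaluation.

```python
def multiply_l2r(k, P, add, repeated_double, identity):
    """Left-to-right scalar multiplication using runs of doublings.

    Returns (kP, steps), where steps is the number of combined
    "double l times, then optionally add P" evaluations.
    """
    if k <= 0:
        return identity, 0
    bits = bin(k)[2:]       # most significant bit first; bits[0] == "1"
    D = P
    steps = 0
    i = 1
    while i < len(bits):
        l = 0
        # Shift left up to and including the next 1 bit, or to the end.
        while i < len(bits):
            l += 1
            i += 1
            if bits[i - 1] == "1":
                break
        D = repeated_double(D, l)
        if bits[i - 1] == "1":
            D = add(D, P)
        steps += 1
    return D, steps


ORDER = 19  # order of the toy cyclic group
int_add = lambda a, b: (a + b) % ORDER
int_repeated_double = lambda d, l: (d << l) % ORDER

print(multiply_l2r(47, 1, int_add, int_repeated_double, 0))
```

For k = 47 = (101111)_2 the loop runs four times (4P + P, then three double-and-add steps), which agrees with the four inverses quoted for [47]Q. For k = 12 it runs twice: a tripling followed by a 2nd-order doubling.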

7.2.3 NAF Extensions

Figure 7.5: Left-to-Right Proposed Algorithm.

The last proposed algorithm is a version of the Left-to-Right algorithm called Double-Add-and-Subtract. The original Double-Add-and-Subtract minimizes the number of 1s in the binary representation of the scalar by repeatedly detecting sequences of k 1-bits at positions i to i + k − 1 and replacing them with $2^{i+k} - 2^i$. This technique minimizes the Hamming weight of the scalar by using the non-adjacent form (NAF). With a random scalar multiplier, about half of the bits in its original binary representation are expected to be non-zero. With NAF’s special representation, however, this drops to one third. For example, $15_{10} = 1111_2$. This is rewritten as $(16 - 1)_{10} = \langle 1, 0, 0, 0, -1 \rangle$, where sequences $\langle b_k, ..., b_0 \rangle$ are interpreted as $\sum_{i=0}^{k} 2^i b_i$ for $b_i \in \mathbb{Z}$, most commonly used with $b_i \in \{-1, 0, 1\}$. The Left-to-Right Double-Add-and-Subtract Algorithm in the past could not improve upon three leading 1s, but with the new efficient 8P algorithm it can improve speed for sequences of 3 leading 1s. One can predict automatically the fastest combination of repeated doubling operations. Figure 7.6 shows the same example, calculating [47]Q by applying the Left-to-Right Double-Add-and-Subtract Algorithm. This computes [47]Q with only two inverses, unlike the Left-to-Right Double-and-Add Algorithm, which needed 4.
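The NAF recoding itself is short. The following Python sketch uses the standard textbook recoding (not code from the dissertation) and returns the signed digits least significant first.

```python
def naf(k):
    """Non-adjacent form of k >= 0, least significant digit first.

    Every digit is in {-1, 0, 1} and no two adjacent digits are non-zero.
    """
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)   # +1 if k = 1 (mod 4), -1 if k = 3 (mod 4)
            k -= d
        else:
            d = 0
        digits.append(d)
        k //= 2
    return digits


print(naf(15)[::-1])   # most significant digit first
print(naf(47)[::-1])
```

For 15 the digits read ⟨1, 0, 0, 0, −1⟩, as in the example above. For 47 they read ⟨1, 0, −1, 0, 0, 0, −1⟩, that is 64 − 16 − 1, so only two signed additions follow the leading digit.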

procedure MultiplyL2RMix(k, P) do
    (m, B) := mixed NAF representation Knapsack(k);
    D := multiply(P, m_{|m|−1});
    for (int i := |m| − 2; i ≥ 0;) do
        // D := D ∗ B_i + P ∗ m_i
        D := RadixPromoteAndAdd(D, m_i, B_i, P);
    return D;
Algorithm 4: NAF Extensions

In general, for a number k, a mixed base representation $m = m_N ... m_0$, with bases $B = \langle B_N, ..., B_0 \rangle$, must have the property that

$$k = \sum_{i=0}^{N} m_i \prod_{j=0}^{i-1} B_j$$

A canonical mixed base representation expects that $\forall i,\ 0 \le m_i < B_i$, but with NAF this additional condition is not enforced.
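The defining property of a mixed base representation can be checked numerically. In this Python sketch both helper names are hypothetical; digits and bases are stored least significant first, and the conversion produces canonical (non-negative) digits, while the evaluation also accepts signed digits as used with NAF.

```python
from math import prod


def mixed_base_value(m, B):
    """Evaluate k = sum_i m[i] * prod_{j < i} B[j] (index 0 = least significant)."""
    return sum(mi * prod(B[:i]) for i, mi in enumerate(m))


def to_mixed_base(k, B):
    """Canonical digits with 0 <= m[i] < B[i]; the remainder becomes the top digit."""
    m = []
    for base in B:
        k, r = divmod(k, base)
        m.append(r)
    m.append(k)
    return m


digits = to_mixed_base(10150, [32, 32])
print(digits, mixed_base_value(digits, [32, 32]))
print(mixed_base_value([-1, 1], [16]))   # signed digits: 16 - 1
```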

Algorithm 4 starts by recasting k in a mixed base B with NAF representation m, selected such that $D * b_i + P * m_i$ can be computed efficiently given the available software libraries (implementations of direct repeated doubling operations). Then the Multiply Left to Right Mix (MultiplyL2RMix) procedure proceeds one digit at a time, promoting the current value based on the radix of the corresponding position and adding or subtracting the digit’s multiple of the base point P. The RadixPromoteAndAdd procedure can use memoization (dynamic programming) to store encountered multiples of the base P and reuse them on subsequent calls.

The simplest implementation of mixed NAF representation Knapsack is base 16, namely where $\forall i,\ b_i = 16$. Another alternative, where base 32 can be used at times and base 16 at any time, is given in Algorithm 5.

procedure mixed NAF representation Knapsack(k) do
    n := 0; carry := 0;
    for (int i := 0; i ≤ |k|;) do
        j = min(|k − 1|, i + 4);
        if (optimizable(k_j...k_i + carry)) then
            b_n := 2^5;
        else
            j--;
            b_n := 2^4;
        (m_n, carry) := NAF(b_n, k_i...k_j + carry);
        n++;
        i := j − 1;
    if carry > 0 then m_n := carry;
    return (m, b)
Algorithm 5: Mix NAF Representation bases 16 and 32

Here procedure NAF reduces the current digit to an optimizable positive or negative multiplier, and a carry value.

Figure 7.6: Left-to-Right Double-Add-and-Subtract Algorithm.

7.3 Fast Multiplication with Base 16 Multiplicands

Here we mention a simple special-case algorithm based on base 16 representations of the multiplicands. For example, $35 = 23_{16}$. Then, for a multiplier of the form $qr_{16}$, the computation is implemented as,

16(qP ) + rP mod p

For the scalar $10150 = 27A6_{16}$, the obtained algorithm is equivalent to:

((16(2P ) + 7P )(16) + 10P )(16) + 6P mod p

As the above equation shows, and similarly to the Montgomery curve, the key is indistinguishable and cannot be recognized by a side-channel attack, since the algorithm applies point doubling and addition in each iteration regardless of the key bit value.

In addition, by applying the Fast 2^n or Direct Doubling algorithms we reduce the number of point additions, which in fact cost more than point doublings. Thus, we consider the series of Base 2^n Multiplicands algorithms to be the most efficient algorithms we introduce in this dissertation.

7.4 Fast Multiplication with Base 32 Multiplicands

This is another version of a special-case algorithm, based on base 32 representations of the multiplicands. Similarly to Base 16 Multiplicands, the computation is implemented as,

32(qP ) + rP mod p

Considering the same example as in the previous section, the scalar $10150 = 27A6_{16}$, the obtained algorithm is equivalent to:

(32(9P ) + 29P )(32) + 6P mod p

This algorithm has the same advantages as Base 16 Multiplicands, but it excels in reducing point addition operations: because it computes a higher-order double, the key is divided into fewer blocks, leading to fewer point additions. Therefore, we consider Base 32 Multiplicands the fastest among all the algorithms we introduced.
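The Base 16 and Base 32 evaluations follow the same pattern and differ only in the window width. The Python sketch below is an illustration with hypothetical names: `small_multiple(d)` stands for a direct algorithm returning dP, `repeated_double(acc, w)` for a direct doubling of order w, and plain integers are used as stand-ins for points so that the result can be compared with the scalar itself.

```python
def radix_digits(k, w):
    """Digits of k in base 2**w, most significant first."""
    digits = []
    while k:
        digits.append(k & ((1 << w) - 1))
        k >>= w
    return digits[::-1] or [0]


def radix_multiply(k, w, small_multiple, repeated_double, add):
    """Evaluate kP as (...((d0 P) 2^w + d1 P) 2^w + ...) and count additions."""
    digits = radix_digits(k, w)
    acc = small_multiple(digits[0])
    additions = 0
    for d in digits[1:]:
        acc = repeated_double(acc, w)      # one direct doubling of order w
        acc = add(acc, small_multiple(d))  # one addition per block
        additions += 1
    return acc, additions


# Integer stand-ins: the "point" n means [n]P.
int_small = lambda d: d
int_repeated_double = lambda a, w: a << w
int_add = lambda a, b: a + b

print(radix_digits(10150, 4), radix_multiply(10150, 4, int_small, int_repeated_double, int_add))
print(radix_digits(10150, 5), radix_multiply(10150, 5, int_small, int_repeated_double, int_add))
```

For 10150 the base 16 digits are 2, 7, 10, 6 (three additions) and the base 32 digits are 9, 29, 6 (two additions), which reflects the reduction in point additions discussed above.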

7.5 Fast Multiplication with Base 1024 Multiplicands

Here we mention another version of a special-case algorithm, based on base 1024 representations of the multiplicands. This algorithm differs from the others of the same family in that the processing of the key does not end after a single division, as we do not have algorithms to compute all the intermediate points up to 1023. Thus, if the remainder exceeds the largest scalar for which we have an algorithm, which is 31, we process it again by dividing by 32. Similarly to Base 16 and 32 Multiplicands, the computation is implemented as,

1024(qP ) + rP mod p

When r > 31,

rP = 32(q′P ) + r′P mod p

Representing the same previous example, the scalar $10150 = 27A6_{16}$, the obtained algorithm is equivalent to:

1024(9P ) + (32(29P ) + 6P ) mod p

Like the rest of the same family, this algorithm has the same advantages, but it may take more time and cost than Base 16 and 32 Multiplicands when it is implemented serially, because of the reprocessing that takes place on the remainder.

The main reason we introduced this algorithm is to test what happens when we apply the highest doubling order algorithm that fits the key size without having the direct intermediate point algorithms, and to compare the result with the other Multiplicands algorithms.

7.6 Inverting Multiplications based on Curve Order

Inverting multiplications can be very useful when the binary representation of the inverted scalar has fewer 1 bits than that of the original scalar k. Basically, instead of using the point P as a base, one traverses the group backward, starting from the inverse of the base point, −P.

[k]P = [ordE(P ) − k](−P ) = [#E − k](−P )

If #E = 61 and k = 49, the binary representation of $12_{10} = (1100)_2$ has one fewer 1-bit than that of $49_{10} = (110001)_2$, leading to a smaller number of differential additions.
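The choice between k and #E − k can be automated by comparing Hamming weights. The helper below is a hypothetical Python sketch of that decision only; it does not perform the scalar multiplication itself.

```python
def cheaper_scalar(k, order):
    """Return (k_prime, negate_base).

    If negate_base is True, compute [k_prime](-P) instead of [k]P.
    The scalar with fewer 1 bits is chosen; ties keep the original k.
    """
    alt = (order - k) % order
    if bin(alt).count("1") < bin(k).count("1"):
        return alt, True
    return k, False


print(cheaper_scalar(49, 61))   # 12 has two 1 bits, 49 has three
print(cheaper_scalar(12, 61))
```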

According to Hasse’s theorem [63]:

$$p + 1 - 2\sqrt{p} \le \#E \le p + 1 + 2\sqrt{p}$$

This is also known as Hasse’s bound, which states that the number of points on an elliptic curve is roughly in the range of the prime p.
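As a quick numerical illustration, Hasse's bound can be turned into an integer interval. The function name is hypothetical. The check uses the toy curve of the numerical examples, whose cyclic group has order 19 over p = 17.

```python
from math import isqrt


def hasse_interval(p):
    """Integer interval [lo, hi] that contains every possible #E over F_p."""
    # |#E - (p + 1)| <= 2*sqrt(p)  is equivalent to  (#E - p - 1)^2 <= 4p
    t = isqrt(4 * p)
    return p + 1 - t, p + 1 + t


lo, hi = hasse_interval(17)
print(lo, hi, lo <= 19 <= hi)
```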

Chapter 8

Supersingular Isogeny Diffie-Hellman SIDH

The topical nature of research on quantum-resistant cryptosystems is reflected in the growing call for the standardization of quantum-resistant public-key cryptographic algorithms by entities such as NIST and IEEE. Quantum computers effectively break the elliptic curve, factoring and finite field problems adopted for public key cryptography. Supersingular isogeny Diffie-Hellman (SIDH) key exchange is one of the post-quantum cryptographic algorithms adopted to create a secure key between communicating entities over insecure communication channels [5, 23]. SIDH is analogous to the conventional Diffie-Hellman (DHE) protocol, but grounded on the characteristics of the supersingular isogeny graph. SIDH is also primarily designed as a post-quantum protocol for resisting cryptanalytic attacks supported by quantum computers. The other property that makes SIDH a subject of interest is that its perfect forward secrecy enables the protocol to prevent the confidentiality of communication sessions from being compromised [48]. These properties make SIDH the ideal replacement for elliptic curve Diffie-Hellman (ECDH) and DHE.

8.1 Supersingular Curve and Elliptic Curve

This section provides an overview of elliptic curves and supersingular curves and then reviews their relation. Elliptic curves (ECs) take the form:

$$y^2 = x^3 + ax^2 + bx + c.$$

Using substitutions that preserve rational points, an EC takes the Weierstrass form:

$$y^2 = x^3 + ax + b.$$

The elliptic curve E must be non-singular, implying that there are no cusps or self-intersections. Consistent with Koziel, Azarderakhsh, Kermani and Jao in [48], ECC is characterized by the definition of points on an EC together with formulas for point addition and point doubling, which move from one point to another. Additionally, ECC-based cryptosystems depend on the difficulty of the Elliptic Curve Discrete Log (ECDL) problem: given P and Q with Q = kP = P + P + ... + P, it is infeasible to establish the scalar multiple k for ECs with large-order points. However, by Shor's algorithm [48], such cryptosystems are no longer safe once quantum computers are adopted. For that reason, standard ECC is not applicable for most of the proposed quantum-resilient systems. Elliptic curves are smooth, projective algebraic curves of genus 1 with a fixed point, denoted $O_E$ [5]. ECs are rich objects for various applications because of their group law. Over fields of finite characteristic p, the curves exist in two forms: supersingular and ordinary. Supersingular curves are elliptic curves E/K over a field of finite characteristic p meeting the following conditions [48]:

i. The multiplication-by-p map on E is purely inseparable.

ii. Its endomorphism ring is isomorphic to a maximal order in a quaternion algebra.

iii. The j-invariant of E lies in $\mathbb{F}_{p^2}$.

Consistent with Costello in [21], supersingular curves have j-invariants in $\mathbb{F}_{p^2}$. Elliptic curves that do not meet these conditions are referred to as ordinary curves. Supersingular isogeny graphs (SIGs) are increasingly the subject of post-quantum cryptosystem studies because of their vital role in the proposed PQC protocols. Prior to De Feo, Jao and Plût's proposal of SIDH in [26], Charles, Goren and Lauter had proposed a scheme built around a hash function grounded on the complexity of routing or finding paths in SIGs [5]. Isogeny-based key exchange protocols are similar to standard ECC, but they also exploit isogenies to move from one elliptic curve to another. According to Galbraith in [30], a morphism between ECs is the composition of a translation and a group homomorphism. That is to say, geometric maps between ECs have a group-theoretic interpretation.

8.2 Isogenies

The authors in [48] defined an isogeny as an algebraic mapping between ECs that respects the elliptic curve group law. In line with the Weierstrass equations in [30], an EC over a field k is defined by the equation $E : y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$, where $a_1, \ldots, a_6 \in k$ and there is a unique point $O_E$ that does not lie on the affine curve. Two elliptic curves are said to be isogenous if there is an isogeny, of some degree, between them. In that regard, an isogeny can also be defined as a rational map φ from E to a second curve E′ that satisfies $\varphi(O_E) = O_{E'}$. This relation implies that φ induces a group homomorphism [30]. Consistent with the group law, for every P ∈ E(k) we have $P + O_E = O_E + P = P$. When the two points are affine, that is, $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$, the sum $P_1 + P_2$ is computable using the following formula, where λ is the slope of the line through $P_1$ and $P_2$ (the tangent line when $P_1 = P_2$).

$$\lambda = \frac{x_1^2 + x_1x_2 + x_2^2 + a_2(x_1 + x_2) + a_4 - a_1y_1}{y_1 + y_2 + a_1x_2 + a_3}$$

An isogeny enables one to map from one EC to another. For instance, the elliptic curve $E_1 : y^2 = x^3 + x$ can be transformed to $E_2 : y^2 = x^3 - 4x$ using the mapping below:

$$\phi(x, y) = \left(\frac{x^2 + 1}{x},\; y\,\frac{x^2 - 1}{x^2}\right)$$

One of the key applications of isogenies in cryptography is SIDH, a post-quantum cryptography (PQC) method that uses isogenies to establish a shared secret key.
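The example map from y^2 = x^3 + x to y^2 = x^3 − 4x given in this section can be checked numerically. The sketch below is an illustration under assumptions: it works over the small prime field F_431, which is chosen here only for convenience, and the names on_curve and phi are hypothetical. It enumerates the affine points of the first curve with x ≠ 0 (where the map is defined) and applies the map to each of them.

```python
P = 431  # small prime, chosen only for illustration


def on_curve(x, y, a):
    """Check whether (x, y) satisfies y^2 = x^3 + a*x over F_P."""
    return (y * y - (x**3 + a * x)) % P == 0


def phi(x, y):
    """phi(x, y) = ((x^2 + 1)/x, y*(x^2 - 1)/x^2); undefined at x = 0."""
    inv_x = pow(x, -1, P)
    new_x = (x * x + 1) * inv_x % P
    new_y = y * (x * x - 1) * inv_x * inv_x % P
    return (new_x, new_y)


# Affine points of E1: y^2 = x^3 + x with x != 0.
points_e1 = [(x, y) for x in range(1, P) for y in range(P) if on_curve(x, y, 1)]

# Their images should all lie on E2: y^2 = x^3 - 4x.
images = [phi(x, y) for x, y in points_e1]
all_on_e2 = all(on_curve(x, y, -4) for x, y in images)
```

The check only confirms that image points satisfy the second curve equation; it does not test the group-homomorphism property.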

8.3 Isomorphisms

From a mathematical perspective, an isomorphism of ECs φ : E → E′ over k is an isomorphism of curves such that $\varphi(O_E) = O_{E'}$. Like isogenies, isomorphisms are group homomorphisms [30]. From the group theory perspective, isomorphic groups have the same properties. Consequently, it is possible to transform from one elliptic curve to another. In that regard, isogeny-based cryptography still employs points on ECs, but its security is instead grounded on the complexity of computing isogenies between ECs. Two elliptic curves are isomorphic over the algebraic closure of k exactly when they have the same j-invariant. SIDH utilizes Vélu's formulas to compute isogenies between ECs [75]. Vélu's formulas are deterministic [68]; hence, the choice of formula results in the same isogenous EC within the resulting isomorphism class.

8.4 J-Invariant

The Supersingular Isogeny Diffie-Hellman (SIDH) protocol also works in quadratic extensions of large prime fields [23], whose elements take the form u + vi with $u, v \in \mathbb{F}_p$. Within $\mathbb{F}_{p^2}$, the primary object of interest is a subset whose size grows with p. According to [21], this subset corresponds to the supersingular j-invariants and has size $\lfloor p/12 \rfloor + z$, where $z \in \{0, 1, 2\}$. For example, for p := 431, the supersingular j-invariants in $\mathbb{F}_{p^2}$ are illustrated in Figure 8.1 [21].

Figure 8.1: j-invariants in $\mathbb{F}_{431^2}$ [21].
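The size of the set of supersingular j-invariants can be computed directly from p. The short sketch below assumes the standard rule for the correction term z, which depends on p mod 12 and is not spelled out in this section; the function name supersingular_j_count is hypothetical.

```python
def supersingular_j_count(p):
    """Number of supersingular j-invariants in characteristic p (prime, p >= 5).

    Assumed standard rule: z = 0, 1, 1, 2 for p = 1, 5, 7, 11 (mod 12).
    """
    z = {1: 0, 5: 1, 7: 1, 11: 2}[p % 12]
    return p // 12 + z


count_431 = supersingular_j_count(431)  # 431 = 35 * 12 + 11, so 35 + 2
```

For the prime used in Figure 8.1 this gives 37 supersingular j-invariants.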

All the j-invariants in Figure 8.1 are supersingular. Each elliptic curve has a unique j-invariant. When an elliptic curve E over k is represented in Weierstrass form, $y^2 = x^3 + ax + b$, the j-invariant is given by the formula:

$$j(E) = 1728\,\frac{4a^3}{4a^3 + 27b^2}$$

In contrast, an EC represented in Montgomery form:

$$E : y^2 = x^3 + ax^2 + x$$

has j-invariant:

$$j(E) = 256\,\frac{(a^2 - 3)^3}{a^2 - 4}$$
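The two j-invariant formulas can be cross-checked on a curve that has both forms. The sketch below is an illustration under assumptions: it evaluates both formulas over the small prime field F_431, and the function names j_weierstrass and j_montgomery are hypothetical. The curve y^2 = x^3 + x is a Weierstrass curve with a = 1, b = 0 and also a Montgomery curve with a = 0, so both formulas must return the same value.

```python
def j_weierstrass(a, b, p):
    """j-invariant of y^2 = x^3 + a*x + b over F_p."""
    four_a3 = 4 * pow(a, 3, p)
    den = (four_a3 + 27 * b * b) % p  # must be non-zero (non-singular curve)
    return 1728 * four_a3 * pow(den, -1, p) % p


def j_montgomery(a, p):
    """j-invariant of y^2 = x^3 + a*x^2 + x over F_p."""
    num = 256 * pow(a * a - 3, 3, p)
    den = (a * a - 4) % p  # must be non-zero (a != 2, -2)
    return num * pow(den, -1, p) % p


p = 431
j_w = j_weierstrass(1, 0, p)  # y^2 = x^3 + x in Weierstrass form
j_m = j_montgomery(0, p)      # the same curve in Montgomery form
```

Both calls return 1728 reduced modulo 431, as expected for this curve.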

In particular, the Montgomery form of an EC improves performance; it was first used to speed up factoring. Even though optimizations for scalar multiplication continue to progress, work on explicit computations with Montgomery curves is still limited [68].

Elliptic curves are either ordinary or supersingular. In practice, the supersingular case is advantageous in a number of ways. For instance, supersingular curves make it easier to instantiate constructions [21]. Additionally, the best known quantum and classical attacks against the supersingular case are computationally expensive, which supports its security.

8.5 Computing Isogenies over Finite Field

The complexity of computing the trace of ECs over a finite field continues to attract significant interest for a number of reasons. The question is compelling in itself because it is the natural problem of counting points on a projective variety over a finite field. Additionally, it figures in an array of contexts, from arithmetic algebraic geometry to primality proving and to applications in secure communication systems.

8.5.1 Finite Fields and Frobenius Isogeny

Let K be a finite field (FF) of characteristic p > 0; it must contain Z/pZ as its prime subfield [25]. This subfield is referred to as the prime field of K and is denoted by $\mathbb{F}_p$. Given that K is a vector space over its prime field, its cardinality is $q = p^n$ for some n; thus, its multiplicative group $K^*$ has order q − 1. As a consequence, the elements of $K^*$ must be the roots of the polynomial $X^{q-1} - 1$. Following [25], $\mathbb{F}_p[\zeta]$ is isomorphic to K because q − 1 is not divisible by p, where ζ is a primitive (q − 1)-th root of unity. The underlying implication is that, up to isomorphism, there is a unique FF with q elements, denoted by $\mathbb{F}_q$. Consequently, for m ≥ 1, the finite field $\mathbb{F}_{q^m}$ contains a subfield isomorphic to $\mathbb{F}_q$. The map $\varphi_q : \mathbb{F}_{q^m} \to \mathbb{F}_{q^m}$ sending $x \mapsto x^q$ is a field morphism that fixes $\mathbb{F}_q$; it is referred to as the Frobenius automorphism of $\mathbb{F}_{q^m}/\mathbb{F}_q$ [25]. In summary, an elliptic curve defined over a finite field of characteristic p ≠ 0 has an isogeny called the Frobenius isogeny:

$$\phi : (x, y) \mapsto (x^p, y^p)$$

If an elliptic curve E is defined over a finite field $\mathbb{F}_q$ with $q = p^d$, then d iterations of the Frobenius map give the isogeny:

$$\Phi_q := \phi^d : (x, y) \mapsto (x^{p^d}, y^{p^d}) = (x^q, y^q)$$

In line with [25], the points of $E(\mathbb{F}_q)$ are fixed by the map $\Phi_q$, which is an endomorphism of the curve E. The map $\Phi_q$ is also referred to as the Frobenius endomorphism of E, and it plays a vital role in determining the cardinality of $E(\mathbb{F}_q)$. The minimal polynomial of $\Phi_q$ in the endomorphism ring of E takes the form:

$$\Phi_q^2 - c\,\Phi_q + q = 0, \quad \text{with } |c| \le 2\sqrt{q}$$

Given that $\Phi_q$ acts as the identity on $E(\mathbb{F}_q)$, it follows that:

$$\#E(\mathbb{F}_q) = q + 1 - c, \quad \text{with } |c| \le 2\sqrt{q}$$
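The relation between the point count and the trace c can be illustrated by brute force on small fields. The sketch below is a naive illustration, not a practical point-counting algorithm; the function names count_points and frobenius_trace and the two sample curves are assumptions chosen for this example. It counts the points of y^2 = x^3 + ax + b over a prime field with Euler's criterion and then recovers c = p + 1 − #E.

```python
import math


def count_points(a, b, p):
    """#E(F_p) for y^2 = x^3 + a*x + b, p an odd prime, by counting square roots."""
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x**3 + a * x + b) % p
        if rhs == 0:
            total += 1  # single point with y = 0
        elif pow(rhs, (p - 1) // 2, p) == 1:  # Euler's criterion: rhs is a square
            total += 2  # two points, (x, y) and (x, -y)
    return total


def frobenius_trace(a, b, p):
    """Trace c of the Frobenius endomorphism, from #E(F_p) = p + 1 - c."""
    return p + 1 - count_points(a, b, p)


c_small = frobenius_trace(2, 2, 17)   # toy curve y^2 = x^3 + 2x + 2 over F_17
c_super = frobenius_trace(1, 0, 431)  # y^2 = x^3 + x over F_431
hasse_ok = abs(c_small) <= 2 * math.sqrt(17) and abs(c_super) <= 2 * math.sqrt(431)
```

The second curve has trace 0, that is, p + 1 points, which is the behaviour expected of a supersingular curve over a prime field.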

Azarderakhsh, Kermani and Koziel in [47] defined an isogeny over the finite field $\mathbb{F}_q$, φ : E → E′, as a non-constant rational map defined over $\mathbb{F}_q$ that is also a group homomorphism from $E(\mathbb{F}_q)$ to $E'(\mathbb{F}_q)$. In conclusion, two elliptic curves E and E′ defined over the finite field $\mathbb{F}_q$ are isogenous over $\mathbb{F}_q$ if and only if $\#E(\mathbb{F}_q) = \#E'(\mathbb{F}_q)$.

8.6 SIDH and Key Exchange

Ubiquitous and pervasive computer systems are characteristically heterogeneous platforms marked by the massive connection of devices with amplified storage and computation power. Therefore, the design of any heterogeneous communication system demands a clear assignment of roles relating to security properties for all processes or tasks of the communicating entities. Moreover, resource-constrained devices such as sensors in the IoT domain should receive less computationally taxing tasks. In the same context, long-term secrets should be kept off such devices due to their limited tamper-resistance protections. In agreement with Oliveira et al. in [60], a cautious trade-off between security features, cryptographic primitives and functionality must be addressed for each class of devices. It is against this backdrop that the properties of SIDH make it a stronger candidate for key generation and exchange.

SIDH key exchange is an isogeny-based cryptographic protocol [48]. Figure 8.2 illustrates the SIDH key exchange protocol, which exploits isogenies on supersingular curves.

Consistent with Figure 8.2, the communicating entities Alice (A) and Bob (B) take random walks on distinct isogeny graphs. The key modification in this scheme is that extra information is communicated alongside the protocol, ensuring that both entities arrive at a matching common value. In other words, a common j-invariant is used to generate a common secret key [26].

Figure 8.2: SIDH Key Exchange Protocol [26].

The quantum-resistant SIDH key exchange is not only a promising protocol in NIST's ongoing post-quantum standardization process, but also shows potential for application on constrained devices. Compared to other extant post-quantum algorithms, SIDH key exchange still emerges as one of the most effective key exchange algorithms because of its small keys. In terms of the security of communication, components using SIDH minimize the amount of communication due to its non-interactive approach to key distribution. The underlying key exchange scheme resembles that of Elliptic Curve Diffie-Hellman (ECDH), but with the added feature of computing isogenies of large degree [48]. The authors in [26] showed that the usage of supersingular curves with smooth order improves performance. The underlying logic is that smooth curve orders give an array of isogenies that are fast to compute. Instantiations of the SIDH key scheme also enjoy high implementation simplicity, making them suitable for embedded systems. Consistent with the review and discussions above, Table 8.1 gives a summary of some of the key properties of SIDH compared to other extant key exchange protocols.

Table 8.1: Instantiations of Diffie-Hellman.

                   DHE                        ECDH                       SIDH
Elements           Integers g modulo prime    Points P in curve group    Curves E in isogeny class
Computations       g, x → g^x                 k, P → [k]P                ϕ, E → ϕ(E)
Secrets            Exponents x                Scalars k                  Isogenies ϕ
Complex problem    Compute x                  Compute k                  Compute ϕ
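The first two columns of Table 8.1 can be illustrated with toy parameters. The sketch below is an illustration only and is far too small to be secure; the modulus 431, the base 7, the curve y^2 = x^3 + 2x + 2 over F_17 with base point (5, 1), and all function names are assumptions made for this example. The SIDH column is not covered, since it requires isogeny arithmetic over a quadratic extension field.

```python
# Toy parameters, chosen for illustration only.
P_DH, G = 431, 7       # DHE: integers g modulo a prime
P_EC, A_EC = 17, 2     # ECDH: y^2 = x^3 + 2x + 2 over F_17
BASE = (5, 1)          # a point P on that curve
O = None               # point at infinity


def ec_add(p1, p2):
    """Affine point addition on the toy curve."""
    if p1 is O:
        return p2
    if p2 is O:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_EC == 0:
        return O
    if p1 == p2:
        lam = (3 * x1 * x1 + A_EC) * pow((2 * y1) % P_EC, -1, P_EC) % P_EC
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_EC, -1, P_EC) % P_EC
    x3 = (lam * lam - x1 - x2) % P_EC
    return (x3, (lam * (x1 - x3) - y1) % P_EC)


def ec_mul(k, pt):
    """Computation k, P -> [k]P by double-and-add."""
    r = O
    for bit in bin(k)[2:]:
        r = ec_add(r, r)
        if bit == "1":
            r = ec_add(r, pt)
    return r


def dhe_shared(x_own, x_peer):
    """Raise the peer's public value g^x_peer to our own secret exponent."""
    return pow(pow(G, x_peer, P_DH), x_own, P_DH)


def ecdh_shared(k_own, k_peer):
    """Multiply the peer's public point [k_peer]P by our own secret scalar."""
    return ec_mul(k_own, ec_mul(k_peer, BASE))
```

In both cases the two parties reach the same value regardless of the order in which the secrets are applied, which is the property that SIDH recovers by exchanging extra information.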

8.7 SIDH and Post-Quantum Cryptosystem

In the past decade, concepts regarding ubiquitous and pervasive computing have materialized into the Internet of Things (IoT), Wireless Sensor Networks (WSNs) and wearables, among others. Their applications range from autonomous vehicles to energy usage monitoring and patient care. However, besides these benefits, the advent of these systems has also induced a number of challenges. Quantum computing comes with new challenges compared to traditional computing, some of which relate to privacy and security. As these computing and communication systems pervade virtually all aspects of human life, the demand for secure communication schemes also grows. Guaranteeing long-term security also demands improvement and modification of cryptographic techniques. Quantum computing also breaks conventional cryptosystems that rest on the intractability of factorization and discrete logarithms, as illustrated by Shor's algorithm. Therefore, in the post-quantum era, it is essential to ascertain the effectiveness of classical cryptographic systems against quantum adversaries. In that regard, standardization organizations such as NIST and IEEE have initiated processes to standardize PK cryptographic algorithms against growing quantum adversaries [47, 48].

Post-quantum key generation and exchange proposals tend to fall under three major umbrellas: code-based, lattice-based and isogeny-based [78]. Lattice-based key exchange protocols are largely grounded on Regev's Learning with Errors (LWE) problem. Building on the work of Couveignes and the later work of Rostovtsev and Stolbunov, Jao and De Feo proposed the SIDH key exchange [23, 26]. Since then, a number of proposals for optimizing or improving the SIDH protocol have emerged and been adopted [78]. There was no clear frontrunner among the three umbrellas of post-quantum key exchange protocols. Regarding functionality, all the public implementations grounded on the lattice-based, isogeny-based and code-based approaches required progressive improvements to achieve effective communication security. However, there are a number of performance versus bandwidth trade-offs that should be considered when exploring the three proposals. SIDH remains the most appealing proposal because it affords the smallest public keys compared to lattice- and code-based proposals.

Computing a sequence of elliptic curve isogenies is a growing cryptographic function for most data transfer applications. The SIDH protocol is marked by shorter public keys compared to other extant or proposed post-quantum protocols and is thus continuously gaining popularity as a replacement for extant Internet protocols [79]. However, according to [78], SIDH was slower than the other implementations. The reason behind the slow speed is that the other implementations adopt comparatively small and simple vector or matrix operations. Furthermore, SIDH inherits a number of complex operations from conventional curve-based cryptography, such as pairings and scalar multiplications. SIDH also involves a new form of isogeny arithmetic. Most importantly, SIDH requires significant computing power. A recently proposed mathematical attack on SIDH's security also showed that its security is not affected if appropriate public parameters are adopted [64].

Chapter 9

Results and Experiments

Simulation experiments are performed with a Java implementation of the proposed algorithms. We have applied the algorithms to large parameters defined in the standard curves P-521, P-384, P-256 and P-224 from the National Institute of Standards and Technology (NIST). In addition, we have picked 10 different keys that were randomly generated with an appropriate size for the x and y coordinates of each curve.

Each algorithm has been executed multiple times, and we then computed the average time taken in order to increase the accuracy of the measurements. Our software implementation was tested on a BeagleBone Black (BBB) system kit [19]. The BBB is equipped with a minimum set of features to allow the user to experience the power of the processor [19]. This system is equipped with a processor from the ARM Cortex-A8 family, the AM3358/9 [19].

9.1 Functions Description and Properties

In this section, we list the important functions that were used in our software implementation, together with their properties. Since the EiSi curve receives and returns two different forms of inputs and outputs, (x,y) or (Nx:Ny:U), we clarify which form each function uses.

1. EC Func: This function receives and returns affine points. Basically, it receives the number of doublings of a point and then builds the equation for implementing this doubling. For example, if one wants to compute the point 6P, it requires finding the point 2P and then 4P in order to fulfill the constructional equation for 6P, which is 2P + 4P.

2. doublingn: This function receives and returns an affine point. Simply, it performs any number of doublings up to 34P, where each one of them has a unique algorithm.

3. addp2p and subp2p: These functions receive and return affine points. Briefly, they perform point addition and subtraction between two affine points.

4. np func: This function receives an affine point and returns an EiSi point. Basically, it functions as EC Func, but instead of returning an affine point, it returns EiSi coordinates without performing any inversion. In general, this special function is used in cases where the base point remains unchanged.

5. doubling2nN: This special function was designed to receive and return an EiSi point. In addition, it works identically to the EC Func and np func functions, but with only EiSi coordinates.

6. adv addN2N N and adv subN2N N: These functions receive and return EiSi points. Briefly, they perform point addition and subtraction between two EiSi points.

7. remi point: This function receives an affine point and returns an EiSi point. In addition, it receives the number of doublings of a point and then builds the equation for implementing this doubling. Mainly, it is used in the Base 2n Multiplicands algorithms, specifically for computing the remainders, all of which are based on the same base point. It differs from the EC Func and np func functions in that all the operators and labels of the doubling algorithms are dependent: the point 4P can't be computed without finding the point 2P, and the point 8P can't be computed without finding the point 2P and then 4P. For example, suppose one wants to compute the third doubling of the base point 8P = (13,7), which is represented in EiSi coordinates as (16:14:8), as in Example 1 in Section 4. The remi point function will compute the Nx, Ny and U values for the point 2P and then 4P, and then return the EiSi point of 8(8P) = 7P mod 19, which is represented as (0:14:2).

8. remi func: This function works as a controller for the remi point function. The remi func contains the architectures of all the doubling algorithms and how they are implemented. Essentially, it has flags that are checked to avoid repeating any previously computed operations. Figures 9.1 and 9.2 show practical examples that were printed from our implementation's debug page.

Figure 9.1: Computing the point 24P by using remi func and remi point functions.

As we notice in Figure 9.1, the 2P parameters were not recomputed at the second block, and neither were those of 8P at the third block. Also, in Figure 9.2, we notice at the last block that 8P and 3P were previously computed, so we only had to perform the point addition.

Figure 9.2: Computing the point 29P by using remi func and remi point functions.

9. Inverse Func: This function has multiple versions, one for each coordinate system. It is mainly used at the very end, when all the key bits have been scanned and we want to compute the point in affine form. We applied the Extended Euclidean algorithm in order to perform the inversion.
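The final inversion step described for Inverse Func can be sketched as follows. This is a generic Python illustration of the Extended Euclidean algorithm, not the Java code used in the experiments, and the names ext_gcd, mod_inverse and to_affine are hypothetical. The conversion x = Nx/U^2, y = Ny/U^3 and the field modulus 17 are assumptions inferred from the two example points in item 7; they are not definitions taken from this section.

```python
def ext_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def mod_inverse(u, p):
    """Inverse of u modulo p via the Extended Euclidean algorithm."""
    g, s, _ = ext_gcd(u % p, p)
    if g != 1:
        raise ValueError("value is not invertible modulo p")
    return s % p


def to_affine(point, p):
    """Convert (Nx:Ny:U) to (x, y), assuming x = Nx/U^2 and y = Ny/U^3."""
    nx, ny, u = point
    u_inv = mod_inverse(u, p)  # the single inversion, done at the very end
    u_inv2 = u_inv * u_inv % p
    return (nx * u_inv2 % p, ny * u_inv2 * u_inv % p)


base_affine = to_affine((16, 14, 8), 17)   # example point 8P from item 7
result_affine = to_affine((0, 14, 2), 17)  # example result from item 7
```

Under these assumptions, (16:14:8) maps back to (13, 7), and (0:14:2) maps to (0, 6).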

9.2 Double-and-Add vs NAF vs Right-to-left

In this section, we compare three of our algorithms in terms of speed and total number of operations. Table 9.1 shows a comparison of these algorithms in terms of the number of additions, subtractions, multiplications, divisions, modulos, maximum levels of parallelization and elapsed time for implementing them on the NIST standard curves P-521, P-384, P-256 and P-224.

Table 9.1: DA, NAF and RL Algorithms Measurements.

NIST Curve  Algorithm     Number of Operations                     Time (ms)
                             Mults    Divs    ALUs    Mods   MaxLs
P-521       DA                9162     301    7848   15105    9924        968
            NAF              12078     308   10514   20044    7136       1343
            RL               18480     305   13140   28726    6085       1867
P-384       DA                6714     222    5746   11149    7267        515
            NAF               8782     220    7647   14587    5215        724
            RL               13460     219    9602   20953    4470        992
P-256       DA                4466     147    3820    7475    4833        279
            NAF               5862     153    5097    9714    3470        379
            RL                9235     152    6507   14259    2985        531
P-224       DA                3921     127    3354    6581    4243        222
            NAF               5120     135    4444    8473    3051        317
            RL                7848     132    5577   12171    2626        403

As we notice in Table 9.1, the Double and Add algorithm is the optimal one in terms of the number of multiplications and speed when we implement all algorithms serially. However, when implementing them on hardware or an FPGA with the maximum number of multipliers and ALUs, we find the Right-to-Left algorithm more efficient. Figures 9.3 and 9.4 contain graphs that show the comparisons between algorithms in terms of the number of multiplications and maximum level of parallelization with different key sizes.
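For readers unfamiliar with the NAF recoding compared above, the following is a minimal sketch of the standard non-adjacent form computation. It is a generic textbook recoding, not the EiSi-based NAF algorithm measured in Table 9.1, and the function name naf is hypothetical. It only shows how the signed digits are derived from a scalar; it says nothing about the operation counts reported in the table.

```python
def naf(k):
    """Non-adjacent form of k: digits in {-1, 0, 1}, least significant first."""
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)  # 1 if k = 1 (mod 4), -1 if k = 3 (mod 4)
            k -= d
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits


def naf_value(digits):
    """Recombine signed digits into the integer they represent."""
    return sum(d << i for i, d in enumerate(digits))


digits_239 = naf(239)  # 239 = (11101111)_2 has seven 1 bits
nonzero_239 = sum(1 for d in digits_239 if d != 0)
```

For 239 the recoding is 256 − 16 − 1, so only three non-zero digits remain, each of which costs one point addition or subtraction.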

Figure 9.3: DA vs NAF vs RL in terms of Number of Multiplications.

Figure 9.4: DA vs NAF vs RL in terms of Number of Maximum Levels.

9.3 Base 16 vs 32 vs 1024 Multiplicands

As in Section 9.2, we compare our Multiplicands family of algorithms in terms of the same factors that we used earlier, on the four NIST standard curves. Table 9.2 lists all the numbers and specifies the optimal algorithm in terms of number of multiplications and maximum level of parallelization.

Table 9.2: Base 16, 32 and 1024 Algorithms Measurements.

NIST Curve  Algorithm     Number of Operations                     Time (ms)
                             Mults    Divs    ALUs    Mods   MaxLs
P-521       Base 16           8127     298    6183   12730    7351        840
            Base 32           7921     296    6518   12469    7123        807
            Base 1024        10383     301    8975   14433    5580       1049
P-384       Base 16           5969     212    4836    9404    5376        465
            Base 32           5890     227    4833    9212    5224        445
            Base 1024         7690     223    6637   10574    4093        562
P-256       Base 16           4018     144    3243    6323    3571        251
            Base 32           4007     150    3277    6245    3463        244
            Base 1024         5199     149    4468    7323    2717        301
P-224       Base 16           3551     130    2862    5547    3138        216
            Base 32           3529     132    2886    5459    3041        209
            Base 1024         4563     127    3924    6224    2387        248

As we notice in Table 9.2, the Base 32 Multiplicands algorithm appears to be the optimal one in terms of the number of multiplications and speed when we implement all algorithms serially. However, when implementing them on hardware or an FPGA with the maximum number of multipliers and ALUs, we find the Base 1024 Multiplicands algorithm more efficient. Figures 9.5 and 9.6 contain graphs that show the comparisons between algorithms in terms of the number of multiplications and maximum level of parallelization with different key sizes.

Figure 9.5: Base 16 vs 32 vs 1024 Multiplicands in terms of Number of Multiplications.

Figure 9.6: Base 16 vs 32 vs 1024 Multiplicands in terms of Number of Maximum Levels.

9.4 Our Work vs Original

In this section, we compare the fastest algorithm of our work in terms of number of multiplications, Base 32 Multiplicands, with the original affine algorithm. We have implemented the original affine equations with two different algorithms, Right-to-Left and Left-to-Right. Table 9.3 shows the huge differences in the number of multiplications and inversions between these algorithms and our work.

Table 9.3: Base 32 vs Original Algorithms Measurements.

NIST Curve  Algorithm        Number of Operations
                               Mults    Invs
P-521       RL (Original)     658514    1059
            LR (Original)     478056     778
            Base 32             7921       1
P-384       RL (Original)     355788     775
            LR (Original)     259177     569
            Base 32             5890       1
P-256       RL (Original)     162131     519
            LR (Original)     115340     378
            Base 32             4007       1
P-224       RL (Original)     123461     450
            LR (Original)      88921     332
            Base 32             3529       1

As we notice in Table 9.3, there is a great difference in the number of multiplications and inversions between our work and the original. Measured by the number of multiplications, our work is approximately 35 times faster at a key size of 224 bits and up to 83 times faster at 521 bits when compared with RL, and 25 up to 60 times faster when compared with LR. This difference is caused by the inverse operation that the original algorithm requires for each point doubling or addition. Figure 9.7 shows this difference as a chart, in which our work appears as a straight line along the x-axis.
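The quoted factors follow directly from the multiplication counts in Table 9.3. The short sketch below recomputes them for the smallest and largest key sizes; the dictionary layout and the function name mult_ratio are hypothetical, and the figures are ratios of multiplication counts rather than measured running times.

```python
# Multiplication counts copied from Table 9.3 (key size in bits -> algorithm).
MULTS = {
    521: {"RL": 658514, "LR": 478056, "Base 32": 7921},
    224: {"RL": 123461, "LR": 88921, "Base 32": 3529},
}


def mult_ratio(bits, original):
    """How many times more multiplications the original algorithm needs."""
    return MULTS[bits][original] / MULTS[bits]["Base 32"]


ratios = {
    (bits, alg): round(mult_ratio(bits, alg), 1)
    for bits in MULTS
    for alg in ("RL", "LR")
}
```

The resulting ratios are roughly 83 and 60 at 521 bits and roughly 35 and 25 at 224 bits, for RL and LR respectively.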

Figure 9.7: Base 32 vs Original algorithms in terms of Number of Multiplications.

9.5 EiSi Coordinates vs Others

Here, we compare our work with the other coordinate systems we have covered in Sections 5.1, 5.2 and 5.3. We have chosen our most efficient algorithms in terms of number of multiplications and maximum levels of parallelization, Base 32 and Base 1024 Multiplicands respectively. In addition, we have added the Double and Add algorithm to our comparison, as it represents the original EiSi coordinates system. Table 9.4 shows the differences between these coordinate algorithms and our work in the same factors we considered previously in Sections 9.2 and 9.3.

Table 9.4: EiSi vs Other Coordinates Measurements.

NIST Curve  Algorithm     Number of Operations                     Time (ms)
                             Mults    Divs    ALUs    Mods   MaxLs
P-521       Projective       14900     315    4211   17899    9390       1035
            Jacobian         13312     301    4197   15821    8862        884
            Montgomery       15866     923    7415   29867    8637       1316
            Base 32           7921     296    6518   12469    7123        807
            DA                9162     301    7848   15105    9924        968
            Base 1024        10383     301    8975   14433    5580       1049
P-384       Projective       10901     226    3078   13107    6871        554
            Jacobian          9752     222    3075   11587    6491        480
            Montgomery       11647     690    5444   21870    6329        707
            Base 32           5890     227    4833    9212    5224        445
            DA                6714     222    5746   11149    7267        515
            Base 1024         7690     223    6637   10574    4093        562
P-256       Projective        7236     145    2043    8714    4564        286
            Jacobian          6488     147    2046    7708    4316        261
            Montgomery        7730     454    3614   14527    4216        365
            Base 32           4007     150    3277    6245    3463        244
            DA                4466     147    3820    7475    4833        279
            Base 1024         5199     149    4468    7323    2717        301
P-224       Projective        6356     127    1796    7656    4010        234
            Jacobian          5697     127    1796    6773    3790        215
            Montgomery        6789     398    3174   12759    3706        308
            Base 32           3529     132    2886    5459    3041        193
            DA                3921     127    3354    6581    4243        222
            Base 1024         4563     127    3924    6224    2387        248

As we notice in Table 9.4, our work, which is represented by the last three algorithms of each group, is more efficient when it comes to the number of multiplications. Clearly, Base 32 Multiplicands is the optimal algorithm in this case. Moreover, when we compare by the maximum level of parallelization, we find that our work outperforms the other coordinate algorithms as well, through Base 1024 Multiplicands, which is the optimal, and Base 32 Multiplicands, which comes second. Nevertheless, the Double and Add algorithm, which represents the original EiSi coordinates, appears to be the least efficient in terms of maximum levels; together with our direct doubling technique, however, we outperform all other algorithms in all aspects. Figures 9.8 and 9.9 contain graphs that show the comparisons between our work and the other coordinate algorithms in terms of number of multiplications and maximum level of parallelization with different key sizes.

Figure 9.8: Our Work vs Other Coordinates Algorithms in terms of Number of Multiplications.

As can be seen in Figures 9.8 and 9.9, all algorithms are graphed as straight lines of varying slopes, which gives us the opportunity to apply the straight-line equation of any algorithm to predict the expected values, whether the number of multiplications or the maximum levels of parallelization, at a larger key size. Table 9.5 lists the equations for each algorithm.

For prediction, we apply the equations in Table 9.5 to two larger key sizes, with primes of 751 and 1013 bits. Table 9.6 lists the expected number of multiplications and maximum levels.

Figure 9.9: Our Work vs Other Coordinates Algorithms in terms of Number of Maximum Levels.

Table 9.5: List of Algorithms Linear Equations.

Algorithm     Number of Mults Eq.       Number of MaxLs Eq.
Base 32       y = 14.777x + 220.37      y = 13.762x − 52.445
DA            y = 17.658x − 48.287      y = 19.139x − 60.183
Base 1024     y = 19.576x + 180.7       y = 10.764x − 32.835
Jacobian      y = 25.656x − 70.992      y = 17.089x − 52.485
Projective    y = 28.795x − 121.85      y = 18.131x − 69.095
Montgomery    y = 30.601x − 87.601      y = 16.615x − 30.828

Table 9.6: Expected number of Mults and MaxLs with key of sizes 751 and 1013.

Algorithm     Expected Number of Mults    Expected Number of MaxLs
                   751       1013              751       1013
Base 32          11317      15189            10282      13888
DA               13212      17839            14313      19327
Base 1024        14882      20011             8050      10871
Jacobian         19196      25918            12781      17258
Projective       21503      29047            13547      18297
Montgomery       22893      30911            12447      16800

As can be seen in Table 9.6, our work, represented by Base 32 and Base 1024 Multiplicands, maintains its place as optimal in terms of number of multiplications and maximum levels of parallelization respectively. Likewise, we find that the Jacobian and Montgomery algorithms outperform Projective in terms of the same factors, which leads us to another comparison between our work and these other coordinates, to monitor whether the difference in performance shrinks or continues to increase with the size of the key. Although the slope values in the straight-line equations already show the differences, we compute the delta value, ∆, which is the difference between the y-axis values at a given key size. Table 9.7 shows two comparisons, the first between the Base 32 Multiplicands and Jacobian algorithms in terms of number of multiplications and the other between the Base 1024 Multiplicands and Montgomery algorithms in terms of number of maximum levels, with ∆ defined as

$$\Delta_i = y_n - y_m \qquad (9.1)$$

where i represents the key size and n and m represent the algorithm labels.

Table 9.7: ∆ Values for Base 32 vs Jacobian and Base 1024 vs Montgomery.

Key Size    Number of Mults           Number of MaxLs
            Base 32 vs Jacobian       Base 1024 vs Montgomery
224                 2168                      1319
256                 2481                      1499
384                 3862                      2236
521                 5391                      3057
751                 7879                      4397
1013               10729                      5929

The ∆ values in Table 9.7 show that, in both cases, the improvement scales with the size of the input.
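The predictions for key sizes 751 and 1013 and the corresponding ∆ values can be recomputed from the linear equations of Table 9.5. The sketch below is an illustration; the dictionary and function names are hypothetical, and truncating the result to an integer is an assumption made here because it reproduces the tabulated values (the rounding rule is not stated in the text). Following Table 9.7, ∆ is taken as the value for the other coordinate system minus the value for our algorithm.

```python
# (slope, intercept) pairs copied from Table 9.5.
MULTS_EQ = {
    "Base 32": (14.777, 220.37),
    "DA": (17.658, -48.287),
    "Base 1024": (19.576, 180.7),
    "Jacobian": (25.656, -70.992),
    "Projective": (28.795, -121.85),
    "Montgomery": (30.601, -87.601),
}
MAXLS_EQ = {
    "Base 32": (13.762, -52.445),
    "DA": (19.139, -60.183),
    "Base 1024": (10.764, -32.835),
    "Jacobian": (17.089, -52.485),
    "Projective": (18.131, -69.095),
    "Montgomery": (16.615, -30.828),
}


def predict(equations, algorithm, key_size):
    """Evaluate y = slope * x + intercept, truncated to an integer (assumed)."""
    slope, intercept = equations[algorithm]
    return int(slope * key_size + intercept)


def delta(equations, other, ours, key_size):
    """Delta at a key size: other coordinate system minus our algorithm."""
    return predict(equations, other, key_size) - predict(equations, ours, key_size)


delta_mults_751 = delta(MULTS_EQ, "Jacobian", "Base 32", 751)
delta_maxls_1013 = delta(MAXLS_EQ, "Montgomery", "Base 1024", 1013)
```

With these inputs, the two sample ∆ values are 7879 multiplications at key size 751 and 5929 maximum levels at key size 1013.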

9.6 Number of Multipliers Comparison

Now that tests and comparisons have demonstrated the efficiency of our algorithms over the algorithms of other coordinate systems, in this section we specify the number of multiplication units each algorithm requires to achieve the maximum level of parallelism. Table 9.8 shows the number of multipliers per algorithm for a key size of 521.

Table 9.8: The number of multipliers appropriate to achieve the highest level of parallelism.

Algorithm     Number of MaxLs    Number of Multipliers
DA                  9920                  3
NAF                 7164                  3
RL                  6072                 20
Base 16             7385                  2
Base 32             7133                  3
Base 1024           5569                  6
Projective          9370                  6
Jacobian            8862                  4
Montgomery          8636                  4

As can be seen in Table 9.8, the appropriate number of multipliers to achieve the highest level of parallelism varies between algorithms. In addition, we note that if we reduce the number of multipliers a little, we may get a very close result in terms of maximum levels of parallelization. This leads us to another close comparison, in which we monitor the behavior of each algorithm in comparison with the others in multiple cases where the number of multipliers is uniform. Table 9.9 shows a comparison between our two optimal algorithms, specified in the previous section, and Jacobian and Montgomery, in terms of MaxLs at specific numbers of multipliers.

Table 9.9: MaxL at Different Number of Multipliers.

Number of Multipliers    Algorithm     Maximum Levels
1                        Base 32       7982
1                        Base 1024     10028
1                        Jacobian      13011
1                        Montgomery    15655
2                        Base 32       7134
2                        Base 1024     5648
2                        Jacobian      8864
2                        Montgomery    9138
3                        Base 32       7133
3                        Base 1024     5573
3                        Jacobian      8863
3                        Montgomery    8816
4                        Base 32       7133
4                        Base 1024     5571
4                        Jacobian      8862
4                        Montgomery    8636
5                        Base 32       7133
5                        Base 1024     5571
5                        Jacobian      8862
5                        Montgomery    8636
6                        Base 32       7133
6                        Base 1024     5569
6                        Jacobian      8862
6                        Montgomery    8636

As can be seen in Table 9.9, our algorithms outperform the algorithms of the other coordinate systems at every number of multipliers, from a single multiplier, where the Base 32 algorithm is optimal, up to 6 multipliers, where the Base 1024 algorithm reaches its peak. With 2 multipliers, the performance of the Base 1024 algorithm improves significantly: it becomes almost 80% faster than with a single multiplier and remains the optimal choice from there on. We also note that the Base 32 and Jacobian algorithms gain very little from additional multiplication units, improving by only one parallel level with each added unit until they reach their peak, while the Montgomery algorithm continues to improve until the number of multipliers reaches 4. Overall, whether fewer or more multipliers are used, our algorithms are clearly more efficient than the algorithms of the other coordinate systems. Figure 9.10 shows graphically the relationship between the number of multipliers and the maximum levels of parallelism.

Figure 9.10: Maximum levels for different number of multipliers.
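The saturation behavior read off Table 9.9 can also be checked mechanically. The sketch below is a minimal illustration, not part of the original evaluation: the helper names `saturation_point` and `within`, and the 2% tolerance, are assumptions chosen for the example, while the MaxL values are copied from Table 9.9.

```python
# Maximum levels (MaxLs) from Table 9.9, indexed by number of multipliers 1..6.
maxl = {
    "Base 32":    [7982, 7134, 7133, 7133, 7133, 7133],
    "Base 1024":  [10028, 5648, 5573, 5571, 5571, 5569],
    "Jacobian":   [13011, 8864, 8863, 8862, 8862, 8862],
    "Montgomery": [15655, 9138, 8816, 8636, 8636, 8636],
}

def saturation_point(levels):
    """Smallest multiplier count (1-based) at which the minimum MaxL is reached."""
    return levels.index(min(levels)) + 1

def within(levels, tolerance):
    """Smallest multiplier count whose MaxL is within `tolerance` of the minimum."""
    best = min(levels)
    for count, value in enumerate(levels, start=1):
        if value <= best * (1 + tolerance):
            return count

saturation = {name: saturation_point(v) for name, v in maxl.items()}
# The 2% tolerance is an arbitrary illustrative threshold.
near_optimal = {name: within(v, 0.02) for name, v in maxl.items()}

print(saturation)
print(near_optimal)
```

The saturation points agree with the multiplier counts listed for these four algorithms in Table 9.8. With the 2% tolerance, Base 32, Base 1024 and Jacobian are already near their peak with 2 multipliers, whereas Montgomery needs 4, which illustrates the remark that slightly fewer multipliers can give a very close result.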

Chapter 10

Conclusion

This thesis has proposed optimization methods for computing scalar multiplication on elliptic curves over a prime field, in the short Weierstrass form in the affine plane. It describes a methodology for direct repeated point doubling of high order, as well as point addition of the form nP + mQ, using a single inversion. These new algorithms are shown to be significantly faster than the original equations. In addition, we have developed optimized equations for repeated doubling of higher order (up to 31) than is available with comparable existing algorithms.

The second part of this work introduces a new coordinate system, EiSi, with fast algorithms shown to offer the lowest cost when using only a single inversion. In fact, EiSi shares the same space as the Jacobian coordinates but uses different operators. Parallelization opportunities are also highlighted. Our implementation indicates that the proposed equations outperform the other coordinate systems and also provide a significant speed-up when implemented in hardware. Moreover, EiSi combined with the direct repeated doubling technique has proven more efficient in all aspects: number of multiplications, maximum levels of parallelization, and estimated time.

This dissertation presented six algorithms. All of them have been implemented, tested, and compared to the other important coordinate systems. The comparison considered several factors, the most important of which are the number of multiplications, the maximum levels of parallelism, and the number of multipliers needed to achieve the highest parallel level in a hardware implementation.

The first algorithm introduced is Right-to-Left. We have developed this algorithm so that it scans and processes five bits at a time instead of one. In general, this algorithm is the least efficient in terms of the number of multiplications and time. Although it comes second in terms of the number of parallel levels, achieving this costs 20 multipliers for a key size of 521 bits, which is a large number compared to the rest of the algorithms, which require at most 6 multipliers. However, Right-to-Left is faster than the Montgomery-based operations by about 42% in terms of the number of parallel levels for the same key size.

The Double and Add algorithm is our original EiSi algorithm. It is superior to the other coordinate systems in terms of the number of multiplications, but slightly slower in terms of the number of parallel levels. It demonstrates the strength of our work, namely finding the higher-order double directly, which, if applied fully to this algorithm, would have overcome the other systems. For a 521-bit key, DA is 45%, 62% and 73% faster than the Jacobian, Projective, and Montgomery algorithms respectively in terms of the number of multiplications. On the other hand, it is slower in terms of the number of parallel levels by 5%, 11% and 14% compared to the Projective, Jacobian and Montgomery algorithms respectively. However, we note that this algorithm requires only 3 multipliers to achieve its highest level of parallelism.

Third presented is the NAF algorithm, which is moderate in terms of both the number of multiplications and the maximum levels of parallelism. Although it is not the best of our algorithms, it outperforms the algorithms of the other coordinate systems in both factors. For a 521-bit key, NAF is 10% faster than Jacobian, the most efficient of the other coordinate algorithms, in terms of the number of multiplications. In addition, it outperforms the Montgomery algorithm by 21% in terms of parallel levels, with 3 multipliers instead of the 4 that Montgomery requires (Table 9.8).

Fourth, the Base 16 Multiplicands algorithm is the first member of the Multiplicands family of algorithms that we have invented. This algorithm is characterized by its speed and by the low number of parallel levels it requires, which it achieves with just 2 multipliers. In addition, the operations of this algorithm do not depend on the values of the key bits, since it continues the point doubling and addition process according to the Montgomery procedure, which makes it similarly resistant to side-channel attacks. The Base 16 Multiplicands algorithm outperforms the algorithms of the other coordinate systems in all aspects: it is 63% faster than Jacobian in terms of the number of multiplications and 17% faster than Montgomery in terms of the number of parallel levels.

Fifth, Base 32 Multiplicands is presented, which improves on its predecessor by processing 5 bits at a time instead of 4, reducing the number of addition operations over the entire key size. This algorithm is superior to Base 16 by about 3% in both factors, which makes it superior to all algorithms of the other coordinate systems, and it is considered the optimal choice in terms of the number of multiplications.
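The effect of the window width on the number of additions can be illustrated with a small sketch. It assumes a plain fixed-window recoding, in which a key is split into base-2^w digits and each digit costs at most one point addition. This is an illustrative simplification rather than the exact Multiplicands procedure, and the helper names `window_digits` and `max_additions` are hypothetical.

```python
import math

def window_digits(scalar, width):
    """Split a non-negative scalar into base-2**width digits, least significant first."""
    mask = (1 << width) - 1
    digits = []
    while scalar:
        digits.append(scalar & mask)
        scalar >>= width
    return digits

def max_additions(key_bits, width):
    """Upper bound on additions under the assumption of one addition per window."""
    return math.ceil(key_bits / width)

# Worst-case window counts for a 521-bit key: Base 16, Base 32, Base 1024.
for width in (4, 5, 10):
    print(2 ** width, max_additions(521, width))

# Round trip on a small example scalar (arbitrary 16-bit value).
k = 0b1011011011100101
digits = window_digits(k, 5)
recombined = sum(d << (5 * i) for i, d in enumerate(digits))
print(digits, recombined == k)
```

Under this assumption a 521-bit key needs at most 131 windows in base 16 and 105 in base 32, which is the kind of reduction in additions described above. The price is a larger set of multiplicands to handle per window.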

Last, the Base 1024 Multiplicands algorithm was presented, which has the lowest number of parallelization levels of all the algorithms: in terms of parallel levels it is about 55% faster than the Montgomery algorithm (5569 versus 8636 levels for a 521-bit key, Table 9.8). The Base 1024 algorithm also inspired a further development, namely a new member of the Multiplicands family in which we find the highest doubling order appropriate for the key size and then treat the remainder in the same way as in this algorithm.
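The parallel-level percentages quoted in this chapter can be re-derived from Table 9.8. The sketch below assumes that "X% faster" is read as the ratio of the two MaxL values minus one. This reading is our assumption, chosen because it reproduces the quoted figures, and the helper name `percent_faster` is hypothetical.

```python
# MaxLs for a 521-bit key, copied from Table 9.8.
maxls_521 = {
    "DA": 9920, "NAF": 7164, "RL": 6072,
    "Base 16": 7385, "Base 32": 7133, "Base 1024": 5569,
    "Projective": 9370, "Jacobian": 8862, "Montgomery": 8636,
}

def percent_faster(fast, slow, table=maxls_521):
    """Relative advantage of `fast` over `slow`: (slow / fast - 1) * 100."""
    return (table[slow] / table[fast] - 1) * 100

# Our algorithms compared with Montgomery.
for ours in ("RL", "NAF", "Base 16", "Base 1024"):
    print(f"{ours} vs Montgomery: {percent_faster(ours, 'Montgomery'):.1f}%")

# DA needs more levels than the other systems, so the roles are swapped.
for other in ("Projective", "Jacobian", "Montgomery"):
    print(f"{other} vs DA: {percent_faster(other, 'DA'):.1f}%")

# Base 32 compared with its predecessor.
print(f"Base 32 vs Base 16: {percent_faster('Base 32', 'Base 16'):.1f}%")
```

Under this convention the computed values land within about one percentage point of the figures quoted in the text. Note that a ratio-based figure such as 55% faster corresponds to a smaller relative reduction in the number of levels, since the two measures use different baselines.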
